// Research / Trust, PII & Safety
On-prem LLM deployment that finishes when the network is unplugged.
Most LLM stacks assume outbound HTTPS at install time. Yours can't have it. We deliver signed install bundles, an offline model registry, FIPS-validated crypto, and runbooks that don't depend on a DNS lookup - so the deployment finishes the same day the network is sealed.
// What we see
The stack works in dev. Then the network gets cut.
01
The install assumes outbound HTTPS
The Helm chart pulls images from Docker Hub. Python wheels resolve from PyPI. The model loader hits Hugging Face at startup. Everything works on the dev cluster. On the air-gapped network the install sits at 0% for an hour and then errors out. The fix is rearchitecting the bundle, not toggling a flag.
02
Updates become a quarterly fire drill
Pulling a new model means coordinating sneakernet drives, re-signing bundles, and a four-hour change window. Teams default to never updating - and then the security team finds a CVE in a transformer dependency and the timeline collapses.
03
Audit logs go to a vendor cloud
The observability stack ships traces to Datadog. The LLM telemetry goes to a SaaS dashboard. Both are off-network at install time, both are unacceptable at audit time, and retrofitting them onto your SIEM is a two-month project that should have been week one.
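The first failure mode above (an install that assumes outbound HTTPS) can often be caught before the bundle crosses the gap. A minimal sketch, assuming the image references have already been rendered out of the Helm chart into a list and that the local mirror lives at a hypothetical host, registry.internal:5000. It uses Docker's rule that the first path component is a registry host only if it contains a dot or a colon, or is "localhost"; everything else resolves to Docker Hub.

```python
def registry_of(image_ref: str) -> str:
    """Return the registry host an image reference would be pulled from."""
    first, sep, _ = image_ref.partition("/")
    # Docker treats the first component as a host only if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"  # implicit default registry


def external_images(image_refs: list[str], allowed_registry: str) -> list[str]:
    """List every reference that would leave the local mirror."""
    return [ref for ref in image_refs if registry_of(ref) != allowed_registry]


# Hypothetical example values, not taken from a real chart.
refs = [
    "registry.internal:5000/llm/server:1.4.2",
    "nginx:1.25",
    "ghcr.io/example/exporter:0.9",
]
offenders = external_images(refs, allowed_registry="registry.internal:5000")
```

Here `offenders` holds the two references that would need mirroring before the network is sealed. The same idea extends to Python indexes and model loader URLs, which this sketch does not cover.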
// Case Study
We trained EasyDocs' invoice extraction model
EasyDocs is the platform provider - they ship document management software to their own customers. We trained the fine-tuned NLP model that runs inside it, auto-extracting VAT numbers, totals, and addresses from invoices and learning from every user correction. Deployed on their servers, no external dependencies.
98%
field-level extraction accuracy
<300ms
inference time per invoice
On-prem
deployment with no external dependencies

// What we do
Three things that decide whether the air-gapped install ships.
Most on-prem failures aren't model-quality failures. They're install-bundle failures, update-cadence failures, and audit-trail failures. We design from the assumption that nothing reaches the internet.
Signed install bundles, zero egress
One artifact, fully signed (cosign + TUF root), with every dependency pinned by hash. OS packages, Python wheels, OCI images, and model weights are all mirrored locally. The install verifies provenance at every boundary and finishes with zero outbound packets.
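To make "pinned by hash" concrete, here is a minimal sketch of the hash check alone. It assumes the manifest's own signature has already been verified by other tooling (the cosign and TUF steps are out of scope here), and the manifest shape, a mapping of relative path to SHA-256 hex digest, is a hypothetical format chosen for illustration.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weights never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the bundle matches."""
    problems = []
    for rel_path, expected in manifest.items():
        artifact = bundle_dir / rel_path
        if not artifact.is_file():
            problems.append(f"missing: {rel_path}")
        elif sha256_file(artifact) != expected:
            problems.append(f"hash mismatch: {rel_path}")
    # Files that are present but not pinned are treated as failures too.
    for path in sorted(bundle_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(bundle_dir).as_posix()
            if rel not in manifest:
                problems.append(f"unlisted: {rel}")
    return problems
```

Rejecting unlisted files is a design choice in this sketch: a bundle that contains anything the manifest does not name is treated as untrusted, not merely incomplete.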
Updates without internet
Signed delta bundles for models, code, and CVE patches. Operators carry the diff across the gap on whatever media your policy allows; the local registry verifies the signature, applies the delta, and rolls back automatically on health-check failure. Update cadence becomes routine, not a quarterly fire drill.
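The verify, apply, and roll-back sequence can be sketched as a small piece of control flow. This is an illustration under stated assumptions, not the actual registry: `Registry`, `apply_update`, and the two callables are hypothetical names, and the signature check and health check are injected so the sketch stays independent of any particular signing tool or probe.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Registry:
    """Hypothetical stand-in for the local offline registry."""
    active_version: str
    history: list[str] = field(default_factory=list)


def apply_update(
    registry: Registry,
    new_version: str,
    signature_ok: Callable[[], bool],
    health_check: Callable[[], bool],
) -> str:
    """Apply a delta only if it verifies; revert if the service is unhealthy."""
    if not signature_ok():
        registry.history.append(f"rejected {new_version}: bad signature")
        return "rejected"
    previous = registry.active_version
    registry.active_version = new_version
    registry.history.append(f"activated {new_version}")
    if health_check():
        return "applied"
    # Health check failed: restore the last known-good version.
    registry.active_version = previous
    registry.history.append(f"rolled back to {previous}")
    return "rolled_back"
```

The point of the ordering is that an unverified bundle never touches the active version, and a verified but unhealthy one never stays active.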
Audit logs into your SIEM, not a vendor cloud
Every inference, model load, and key access lands in Splunk, Elastic, QRadar, or whatever your responders already parse. Append-only with hash chaining, tamper detection wired to your runbook, no SaaS dashboard between the incident and the audit.
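Hash chaining is what makes an append-only log tamper-evident: each record's hash covers the previous record's hash, so editing any entry invalidates every entry after it. A minimal sketch, assuming SHA-256 and a hypothetical JSON record shape; forwarding the records to a SIEM is left out.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _entry_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(chain: list[dict], event: dict) -> dict:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"event": event, "prev_hash": prev_hash,
              "hash": _entry_hash(prev_hash, event)}
    chain.append(record)
    return record


def first_tampered_index(chain: list[dict]) -> Optional[int]:
    """Return the index of the first record that fails verification, or None."""
    prev_hash = GENESIS
    for i, record in enumerate(chain):
        expected = _entry_hash(prev_hash, record["event"])
        if record["prev_hash"] != prev_hash or record["hash"] != expected:
            return i
        prev_hash = record["hash"]
    return None
```

A verifier like `first_tampered_index` is the kind of check a tamper-detection alert could run on a schedule. In practice the latest hash would also need to be anchored somewhere the log writer cannot rewrite, which this sketch does not address.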
// Method fit
Air-gapped isn't the right deployment for every privacy concern.
skip it if
Your data can stay in a private VPC
If your perimeter is a VPC with no public ingress and your security team accepts AWS/GCP/Azure as in-scope, a managed cloud LLM (Bedrock, Vertex, Azure OpenAI) gives you the compliance posture without the on-prem ops. Air-gapped is for when even private-cloud isn't an option.
Your concern is third-party API exfil
If the actual problem is 'we don't want customer text going to OpenAI,' an egress-side PII redactor in front of the API call addresses that without standing up your own GPUs. Cheaper, faster, and doesn't require a hardware accreditation.
See: PII Redaction & LLM Data Privacy
Your model needs change weekly
Air-gapped update cadence is days-to-weeks, not minutes. If your engineering team is iterating on the underlying model on a tight loop, the friction will dominate the project. Stabilize the model selection first, deploy on-prem second.
use it if
On-prem fits when data sovereignty or accreditation is non-negotiable, the network can't reach the internet, your team needs FIPS-validated crypto and SIEM-native audit, and the models you depend on are open-weight or your own weights.
// How we work
BoM first. Bundle next. Hand off the runbooks.
Every air-gapped engagement starts with the constraints that are expensive to undo - hardware envelope, accreditation target, network topology. The model and the install plan come after those are written down.
01
BoM and accreditation review (week one)
We sit with your team, size the hardware against the target throughput, and align the bundle to your accreditation paperwork (FedRAMP, IL4, ISO 27001, sectoral). The output is a signed BoM and a deployment design your CISO approves before any image is built.
02
Build and verify in your environment
All work runs in your network, on your hardware, with your operators paired in. Every artifact is signed before it crosses the gap; every install step is reproducible from the bundle. Your team owns the runbooks at the end of week three, not after a knowledge-transfer call.
03
Hand off the operating model
We hand off the install bundle, the offline registry, the SIEM forwarders, the runbooks for incident / patch / model rotation, and a quarterly support cadence with documented response SLAs. We stay available on Slack for 30 days after delivery for the questions that come up after we leave.
// Expert insight
“Air-gapped is not a checkbox - it's an architecture choice that affects every dependency you pull. Most teams discover at install time that their stack assumes outbound HTTPS. We start from the assumption that nothing reaches the internet, and the deployment finishes the same day.”
Karol Gawron
Head of R&D @ bards.ai
// Why bards.ai
Why us, instead of the two senior platform engineers you'd otherwise hire.
Air-gapped projects punish ramp-up time - there's no Stack Overflow tab on the secured network. We've already learned the constraints, on production deployments at EasyDocs, in defense-adjacent environments, and in regulated finance.
Production on-prem deployments
EasyDocs runs our document-AI stack fully on-prem at 98% extraction accuracy with no third-party API in the path. The signing, the registry, and the audit pipeline are the same artifacts we ship to every air-gapped customer.
16+ open-source models we own end-to-end
bardsai/eu-pii-anonymization-multilang and 15 more on Hugging Face. We can run on-prem because we don't depend on someone else's hosted API for the critical path.
Senior engineers only, no juniors
Every person on your engagement has shipped to environments with badges at the door. No learning the change-control board on your dollar.
// FAQ
Common questions about air-gapped deployment
How long does an air-gapped deployment take?
Technical work usually runs 4-6 weeks (bundle hardening, BoM signoff, runbook authoring, paired install). Total wall-clock is dominated by your accreditation timeline. Environments with ATO precedent for similar stacks land closer to 4 weeks; greenfield accreditations stretch to 10.
What hardware do we need?
Depends on the model and throughput. Llama 3 70B at moderate concurrency: 4-8 H100s or equivalent A100/MI300X per replica. Smaller fine-tuned models or PII workloads: a single L40S or A10 per service. We deliver a sized BoM after a discovery call - vendor-neutral, two or three options at different price points.
How do updates reach an air-gapped network?
Signed delta bundles. We publish and sign a new version on our side, and your operator brings it across the gap on whatever media your policy allows. The local registry verifies the signature, applies the delta, and rolls back automatically on health-check failure. Cadence is typically quarterly, with out-of-band patches for CVEs.
Do you support FIPS and IL4/IL5 environments?
FIPS 140-2 validated crypto across the stack, by default. IL4/IL5 is an environment classification, not a software certification - we've deployed components into IL4-aligned networks and have BoM documentation that fits FedRAMP-style paperwork. For IL5 specifically we work with the customer's accreditor on the gaps.
What does an engagement cost?
Engagements start at $40K. Most on-prem deployments land between $40K and $150K depending on hardware envelope, multi-tenancy, accreditation depth, and whether SIEM/HSM integrations are greenfield. Hardware is separate. Fixed-fee proposal after the first scoping call - no time-and-materials surprise.
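On the hardware sizing answer above, the floor for any BoM is simple arithmetic: the weights have to fit in GPU memory before throughput is even considered. A back-of-envelope sketch, assuming a nominal 70 billion parameters, 2 bytes per parameter (FP16/BF16), an 80 GB card, and a hypothetical 90% usable-memory fraction. It gives a lower bound only; KV cache, activations, and concurrency targets push the real count higher.

```python
import math


def min_gpus_for_weights(
    params_billion: float,
    bytes_per_param: float,
    gpu_memory_gb: float,
    usable_fraction: float = 0.9,  # assumption: headroom reserved for runtime
) -> int:
    """Lower bound on GPUs needed just to hold the model weights."""
    weights_gb = params_billion * bytes_per_param  # 1e9 params x bytes = GB
    usable_gb = gpu_memory_gb * usable_fraction
    return math.ceil(weights_gb / usable_gb)


# 70B parameters at 2 bytes each is 140 GB of weights.
floor_70b = min_gpus_for_weights(70, 2, 80)  # -> 2
```

Two 80 GB cards is the minimum to load the weights at all under these assumptions, which is why a sized BoM for real concurrency starts from the throughput target and not from this floor.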
// Let's ship it
Send us your accreditation target. We'll send back a BoM.
Tell us about the network constraints, the hardware envelope, the models you need to run, and the audit posture you have to maintain. We'll come back with a sized BoM and a deployment plan - usually within a business day. Engagements from $40K, typically 4-8 weeks.
Karol Gawron
Head of R&D @ bards.ai