// Research / Trust, PII & Safety

On-prem LLM deployment that finishes when the network is unplugged.

Most LLM stacks assume outbound HTTPS at install time. Yours can't have it. We deliver signed install bundles, an offline model registry, FIPS-validated crypto, and runbooks that don't depend on a DNS lookup - so the deployment finishes the same day the network is sealed.

// What we see

The stack works in dev. Then the network gets cut.

01

The install assumes outbound HTTPS

Helm chart pulls images from Docker Hub. Python wheels resolve from PyPI. The model loader hits Hugging Face at startup. Everything works on the dev cluster. On the air-gapped network it sits at 0% for an hour and then errors. The fix is rearchitecting the bundle, not toggling a flag.
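A fail-fast preflight (our illustrative suggestion, not part of any vendor chart) turns that silent hour at 0% into an immediate, actionable error before the install starts:

```python
import socket

def has_egress(host: str = "pypi.org", port: int = 443,
               timeout: float = 3.0) -> bool:
    """Return True only if the network can actually reach the outside world.

    On an air-gapped network this fails within `timeout` seconds instead of
    letting an installer hang while it retries DNS and HTTPS forever.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failure, refused connection, and timeout
        return False
```

Run this once at install start: if it returns True on a network that is supposed to be sealed, that's a finding too.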

02

Updates become a quarterly fire drill

Pulling a new model means coordinating sneakernet drives, re-signing bundles, and a four-hour change window. Teams default to never updating - and then the security team finds a CVE in a transformer dependency and the timeline collapses.

03

Audit logs go to a vendor cloud

The observability stack ships traces to Datadog. The LLM telemetry goes to a SaaS dashboard. Both are off-network at install time, both are unacceptable at audit time, and retrofitting them onto your SIEM is a two-month project that should have been week one.

// Case Study

We trained EasyDocs' invoice extraction model

EasyDocs is the platform provider - they ship document management software to their own customers. We trained the fine-tuned NLP model that runs inside it, auto-extracting VAT numbers, totals, and addresses from invoices and learning from every user correction. Deployed on their servers, no external dependencies.

  • 98%

    field-level extraction accuracy

  • <300ms

    inference time per invoice

  • On-prem

    deployment with no external dependencies

Read the case study

// What we do

Three things that decide whether the air-gapped install ships.

Most on-prem failures aren't model-quality failures. They're install-bundle failures, update-cadence failures, and audit-trail failures. We design from the assumption that nothing reaches the internet.

Signed install bundles, zero egress

One artifact, fully signed (cosign + TUF root), with every dependency pinned by hash. OS packages, Python wheels, OCI images, and model weights all mirrored locally. The install verifies provenance on every boundary and finishes with zero outbound packets.

Updates without internet

Signed delta bundles for models, code, and CVE patches. Operators carry the diff across the gap on whatever media your policy allows; the local registry verifies the signature, applies the delta, and rolls back automatically on health-check failure. Update cadence becomes routine, not a quarterly fire drill.

Audit logs into your SIEM, not a vendor cloud

Every inference, model load, and key access lands in Splunk, Elastic, QRadar, or whatever your responders already parse. Append-only with hash chaining, tamper detection wired to your runbook, no SaaS dashboard between the incident and the audit.
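Hash chaining is simple to sketch: each entry commits to the hash of the one before it, so editing or deleting any record breaks every later link. This is an illustrative stand-in, not the shipped SIEM forwarder:

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel "previous hash" for the first entry

def append_entry(log: list[dict], event: dict) -> dict:
    """Append an event, chaining it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    payload = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    entry = {"prev": prev, "event": event,
             "hash": hashlib.sha256(payload.encode()).hexdigest()}
    log.append(entry)
    return entry

def verify_chain(log: list[dict]) -> bool:
    """Recompute every link; a single edited or dropped entry fails verification."""
    prev = GENESIS
    for entry in log:
        payload = json.dumps({"prev": prev, "event": entry["event"]},
                             sort_keys=True)
        if entry["prev"] != prev or \
           hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

In production the chain head is periodically anchored somewhere the attacker can't rewrite (an HSM, or a printed ledger), which is what turns a broken link into a detected tamper rather than a quiet one.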

// Method fit

Air-gapped isn't the right deployment for every privacy concern.

skip it if

  • Your data can stay in a private VPC

    If your perimeter is a VPC with no public ingress and your security team accepts AWS/GCP/Azure as in-scope, a managed cloud LLM (Bedrock, Vertex, Azure OpenAI) gives you the compliance posture without the on-prem ops. Air-gapped is for when even private-cloud isn't an option.

  • Your concern is third-party API exfil

    If the actual problem is 'we don't want customer text going to OpenAI,' an egress-side PII redactor in front of the API call addresses that without standing up your own GPUs. Cheaper, faster, and doesn't require a hardware accreditation.

    PII Redaction & LLM Data Privacy
  • Your model needs change weekly

    Air-gapped update cadence is days-to-weeks, not minutes. If your engineering team is iterating on the underlying model on a tight loop, the friction will dominate the project. Stabilize the model selection first, deploy on-prem second.
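For scale: the egress-side redactor mentioned above can start as small as this regex sketch (patterns are illustrative only; production redaction needs a trained NER model, not regexes):

```python
import re

# Illustrative patterns; real PII coverage requires NER, locale-aware formats,
# and checksum validation (e.g. IBAN mod-97), none of which a regex gives you.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "IBAN":  re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s()-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace PII spans with typed placeholders before the text leaves the network."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

The point is architectural: the redactor sits on the egress path in front of the API call, so the third-party provider only ever sees placeholders.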

use it if

On-prem fits when data sovereignty or accreditation is non-negotiable, the network can't reach the internet, your team needs FIPS-validated crypto and SIEM-native audit, and the models you depend on are open-weight or weights you own.

// How we work

BoM first. Bundle next. Hand off the runbooks.

Every air-gapped engagement starts with the constraints that are expensive to undo - hardware envelope, accreditation target, network topology. The model and the install plan come after those are written down.

01

BoM and accreditation review (week one)

We sit with your team, size the hardware against the target throughput, and align the bundle to your accreditation paperwork (FedRAMP, IL4, ISO 27001, sectoral). The output is a signed BoM and a deployment design your CISO approves before any image is built.

02

Build and verify in your environment

All work runs in your network, on your hardware, with your operators paired in. Every artifact is signed before it crosses the gap; every install step is reproducible from the bundle. Your team owns the runbooks at the end of week three, not after a knowledge-transfer call.

03

Hand off the operating model

We hand off the install bundle, the offline registry, the SIEM forwarders, the runbooks for incident / patch / model rotation, and a quarterly support cadence with documented response SLAs. We stay on Slack for 30 days after delivery for the questions that come up after we leave.

// Expert insight

Air-gapped is not a checkbox - it's an architecture choice that affects every dependency you pull. Most teams discover at install time that their stack assumes outbound HTTPS. We start from the assumption that nothing reaches the internet, and the deployment finishes the same day.

Karol Gawron

Head of R&D @ bards.ai

See our case studies

// Why bards.ai

Why us, instead of two senior platform engineers you'd hire.

Air-gapped projects punish ramp-up time - there's no Stack Overflow tab on the secured network. We've already learned the constraints, on production deployments at EasyDocs, in defense-adjacent environments, and in regulated finance.

Production on-prem deployments

EasyDocs runs our document-AI stack fully on-prem at 98% extraction accuracy with no third-party API in the path. The signing, the registry, and the audit pipeline are the same artifacts we ship to every air-gapped customer.

16+ open-source models we own end-to-end

bardsai/eu-pii-anonymization-multilang and 15 more on Hugging Face. We can run on-prem because we don't depend on someone else's hosted API for the critical path.

Senior engineers only, no juniors

Every person on your engagement has shipped to environments with badges at the door. No learning the change-control board on your dollar.

// FAQ

Common questions about air-gapped deployment

How long does an air-gapped deployment take?

Technical work usually runs 4-6 weeks (bundle hardening, BoM signoff, runbook authoring, paired install). Total wall-clock is dominated by your accreditation timeline. Environments with ATO precedent for similar stacks land closer to 4 weeks; greenfield accreditations stretch to 10.

What hardware do we need?

Depends on the model and throughput. Llama 3 70B at moderate concurrency: 4-8 H100s or equivalent A100/MI300X per replica. Smaller fine-tuned models or PII workloads: a single L40S or A10 per service. We deliver a sized BoM after a discovery call - vendor-neutral, two or three options at different price points.

How do updates work without internet access?

Signed delta bundles. We publish a new version on our side, sign it, and your operator brings it across the gap on whatever media policy allows. The local registry verifies the signature, applies the delta, and rolls back automatically on health-check failure. Cadence is typically quarterly with out-of-band patches for CVEs.

Are you FIPS-validated and IL4/IL5 compliant?

FIPS 140-2 validated crypto across the stack, by default. IL4/IL5 is an environment classification, not a software certification - we've deployed components into IL4-aligned networks and have BoM documentation that fits FedRAMP-style paperwork. For IL5 specifically we work with the customer's accreditor on the gaps.

What does an engagement cost?

Engagements start at $40K. Most on-prem deployments land between $40K and $150K depending on hardware envelope, multi-tenancy, accreditation depth, and whether SIEM/HSM integrations are greenfield. Hardware is separate. Fixed-fee proposal after the first scoping call - no time-and-materials surprise.

// Let's ship it

Send us your accreditation target. We'll send back a BoM.

Tell us about the network constraints, the hardware envelope, the models you need to run, and the audit posture you have to maintain. We'll come back with a sized BoM and a deployment plan - usually within a business day. Engagements from $40K, typically 4-8 weeks.

Karol Gawron

Head of R&D @ bards.ai