Architecture
One box, eight layers, no magic
Everything running on the gbox is documented, open-source and auditable. Here is the stack in detail, along with the data flow and the security stance.
Layered stack
Eight layers, from host OS to admin
Layer 08 — Control plane
FastAPI admin dashboard and connector orchestrator. Service health, audit search, GDPR erasure, multi-tenancy. One screen to drive the box.
gbox-admin · orchestrator
MIT
Layer 07 — Observability
Prometheus for metrics, Grafana for dashboards, Loki for logs, Vector for the pipeline. SIEM-ready audit logs (Splunk, Elastic, Wazuh).
Prometheus · Grafana · Loki · Vector
Apache 2.0 (Prometheus) · AGPL-3.0 (Grafana, Loki) · MPL-2.0 (Vector)
Layer 06 — User interface
Conversational UI inspired by ChatGPT. Sessions, history, citations, business-module hooks (Contract Review, Helpdesk, etc.).
Open WebUI
BSD-3
Layer 05 — Model gateway
Automatic model selection per task (chat, code, transcription). Group quotas, metrics, centralised audit.
LiteLLM
MIT
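As an illustration of what "model selection per task" with group quotas can look like, here is a minimal Python sketch. It is not LiteLLM's API or the box's actual configuration: the `ROUTES` table, the placeholder model names, `GroupQuota` and `pick_model` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical task-to-model table. The model names are placeholders,
# not the box's real defaults.
ROUTES = {
    "chat": "general-chat-model",
    "code": "code-model",
    "transcription": "speech-to-text-model",
}


@dataclass
class GroupQuota:
    """Per-group request budget (assumed to be counted per day)."""
    daily_limit: int
    used: int = 0


def pick_model(task: str, quota: GroupQuota) -> str:
    """Return the model for a task, enforcing the group's request quota."""
    if quota.used >= quota.daily_limit:
        raise PermissionError("group quota exhausted")
    quota.used += 1
    # Unknown tasks fall back to the general chat model.
    return ROUTES.get(task, ROUTES["chat"])
```

Keeping the routing table and the quota check in one place is what makes a single audit point possible: every request passes through the same function before it reaches a model.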
Layer 04 — Inference runtime
Local execution of open-source models. Configurable 4-bit or 8-bit quantisation. Optimised for Apple Silicon (Metal) or NVIDIA CUDA.
Ollama · llama.cpp
MIT
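To give a feel for why the quantisation setting matters, the back-of-the-envelope calculation below estimates the memory taken by model weights at different precisions. It is a rough sketch: the 7-billion-parameter figure is purely illustrative, and the estimate ignores the KV cache, activations and the small per-block overhead that real quantisation formats add.

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Illustrative 7-billion-parameter model at three precisions.
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gib(7e9, bits):.1f} GiB")
# 16-bit: 13.0 GiB
# 8-bit: 6.5 GiB
# 4-bit: 3.3 GiB
```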
Layer 03 — Document index
PostgreSQL relational database with the pgvector extension for hybrid search (BM25 keyword ranking combined with cosine similarity over embeddings). Embeddings are computed locally.
Postgres + pgvector
PostgreSQL
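Hybrid search means combining a keyword score with a vector-similarity score. The sketch below shows one common way to do that in plain Python, a weighted sum of normalised BM25 and cosine scores. The fusion formula, the `alpha` weight and the function names are assumptions for illustration. The page does not specify how the box fuses the two signals, and in production the scoring would run inside Postgres rather than in application code.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(
    bm25: dict[str, float],
    cosine: dict[str, float],
    alpha: float = 0.5,
) -> list[str]:
    """Rank document ids by alpha * normalised BM25 + (1 - alpha) * cosine."""
    top = max(bm25.values(), default=0.0) or 1.0  # avoid dividing by zero
    ids = set(bm25) | set(cosine)
    score = {
        d: alpha * bm25.get(d, 0.0) / top + (1 - alpha) * cosine.get(d, 0.0)
        for d in ids
    }
    # Highest score first; ties broken by id so the order is deterministic.
    return sorted(ids, key=lambda d: (-score[d], d))
```

With `alpha=0.5`, a document that matches moderately on both keywords and meaning can outrank one that matches strongly on keywords alone, which is the point of the hybrid approach.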
Layer 02 — Authentication
Enterprise SSO front end (OIDC, SAML). Group-based access policy via LDAP / Azure AD. Full audit log of every user action.
Authentik
MIT
Layer 01 — Host system
Hardened macOS Sonoma or later, or Ubuntu 24.04 LTS. Restrictive firewall. Minimal services. Signed updates. FileVault / LUKS encryption at rest.
macOS · Ubuntu LTS
Data flow
A request, step by step
1. The user types a question
Pre-authenticated via enterprise SSO (Azure AD, Okta, Google Workspace). Their session is logged.
2. The gateway picks a model
Based on the detected task (chat, code, translation) and any active module (Helpdesk, Contract Review).
3. The document index is queried
Hybrid search across Postgres + pgvector. Filtered by source ACLs: only what the user is allowed to see is sent to the model.
4. The model generates the answer
Local inference on Apple Silicon or an NVIDIA GPU, streamed token by token. The context never leaves the box.
5. The answer arrives sourced
With clickable citations to internal documents. The user can verify in one click. The exchange is logged.
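The ACL filtering in step 3 is what keeps restricted documents out of the model's context. A minimal sketch of the idea, assuming each indexed chunk carries the set of groups allowed to read it (the `Chunk` type, its fields and `visible_chunks` are hypothetical names, not the box's schema):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: frozenset[str]  # groups permitted to read the source


def visible_chunks(hits: list[Chunk], user_groups: set[str]) -> list[Chunk]:
    """Keep only search hits whose ACL shares at least one group with the user."""
    return [c for c in hits if c.allowed_groups & user_groups]


hits = [
    Chunk("hr-1", "salary grid", frozenset({"hr"})),
    Chunk("kb-7", "VPN setup guide", frozenset({"it", "staff"})),
]
# A user in "staff" only sees the knowledge-base chunk.
print([c.doc_id for c in visible_chunks(hits, {"staff"})])
# ['kb-7']
```

Filtering before the prompt is built, rather than after the answer is generated, means a document the user cannot read never influences the response.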
✕ No outbound connection
No calls to OpenAI, Anthropic, Google or Azure. No telemetry. The box can run on an isolated LAN with no internet access.
✓ Full audit log
Every request, every consulted source, every model invocation is logged with a signed timestamp. Configurable retention.
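To make "logged with a signed timestamp" concrete, here is a hedged sketch of a tamper-evident audit line. It assumes an HMAC-SHA256 tag computed with a secret key over a canonical JSON record. The actual signing scheme used by the box is not described on this page and may well be asymmetric, and `sign_event` / `verify_line` are hypothetical helper names.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def _canonical(record: dict) -> bytes:
    """Serialise a record deterministically so the tag is reproducible."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def sign_event(event: dict, key: bytes) -> str:
    """Return one JSONL line: the event plus a UTC timestamp and an HMAC tag."""
    record = dict(event, ts=datetime.now(timezone.utc).isoformat())
    record["sig"] = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return json.dumps(record, sort_keys=True)


def verify_line(line: str, key: bytes) -> bool:
    """Recompute the tag over everything except 'sig' and compare."""
    record = json.loads(line)
    sig = record.pop("sig", "")
    if not isinstance(sig, str):
        return False
    expected = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

Because the timestamp is part of the signed payload, changing either the event or the time it was recorded invalidates the tag.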
Security by design
Six pillars, all on by default
Encryption at rest and in transit
FileVault / LUKS on disk, internal TLS for component-to-component traffic, and AES-256-encrypted Restic backups.
Air-gap mode
The box can run with no outbound connection. Updates delivered on signed USB sticks. Compliant with critical-infrastructure / defence requirements.
Signed updates
Every firmware, component or model update is signed with our GPG key. The signature is verified automatically before installation.
Detailed audit logs
Every user action is traced and signed. SIEM-compatible format. Splunk, Elastic, Wazuh exports available.
Secrets vault (age)
Connector secrets (API tokens, certificates) are encrypted at rest with age. The private key lives off-box (USB / HSM / YubiKey). Run `gbox vault rotate` for periodic rotation.
SIEM-ready pipeline
Vector → Loki, plus signed JSONL for audit events. Normalised, export-ready format for Splunk, Elastic, Wazuh and IBM QRadar. Plug in whatever your auditors run.
Update lifecycle
Three update types, three cadences
Monthly · auto
Security patches
OS and critical-component fixes. Applied automatically off-hours, or pushed manually by your IT team.
Quarterly
Product evolutions
New features, new modules, optimisations. Detailed release notes. 30-day rollback window.
Twice a year
Model refresh
New generations of open-source models (Gemma, Mistral, Llama). Pre-deployment validation by our team.
Air-gap mode: updates by signed USB
For environments without internet access, we ship signed USB sticks every quarter. The SHA-256 checksum and GPG signature are both verified before installation.
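The checksum half of that verification can be sketched in a few lines of Python. This is an illustration under assumptions, not the box's installer: `sha256_of` and `checksum_matches` are hypothetical helpers, and the GPG signature check, which would normally be delegated to the `gpg` tool, is not shown.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large update bundles fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def checksum_matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's SHA-256 with the published value."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A checksum alone only detects corruption. It is the signature over the checksum (or the bundle) that proves the update came from the vendor, which is why both checks are run.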
Bill of materials
Every open-source component, listed
Because you have the right to know what's running in your rack. Exact versions are auditable from the box admin page.
| Component | Role | Licence |
|---|---|---|
| Open WebUI | Conversational user interface | BSD-3-Clause |
| LiteLLM | Gateway / model routing | MIT |
| Ollama | Model orchestration and downloads | MIT |
| llama.cpp | Optimised local inference engine | MIT |
| PostgreSQL | Relational DB (sessions, audit, metadata) | PostgreSQL |
| pgvector | Vector extension for semantic search | PostgreSQL |
| Authentik | Enterprise SSO and RBAC | MIT |
| Apache Tika | Text extraction from PDF, Office, etc. | Apache 2.0 |
| Whisper | Audio transcription (meetings, dictation) | MIT |
| Caddy | Internal reverse-proxy with auto TLS | Apache 2.0 |
| Restic | Encrypted incremental backups | BSD-2-Clause |
All listed licences allow commercial use. The box admin page continuously tracks the exact versions running in production and any active CVEs affecting them.
A question for your CISO or IT director?
Our team can provide a detailed architecture dossier (50+ pages) or schedule a call with your security team.