A production-grade appliance running local LLMs at 300+ tokens per second. No cloud tokens. No per-seat bills. No data leaving your walls. Deploy in 3 to 10 days. Operate at a fixed cost. Own your AI stack.
Per-seat pricing. Per-GB ingestion. Per-token inference. Variable bills that scale with usage — usually in the wrong direction.
Meanwhile: your data leaves your perimeter. Your compliance team writes memos. Your CFO writes checks.
AI Box is the counter-narrative.
Every spec is chosen to run real workloads — not a chat demo. Local inference at production scale.
| Spec | Detail |
|---|---|
| Compute | AMD AI MAX+ 395 · Radeon 8060 Graphics |
| Memory | 128GB LPDDR5X 8000 · on-board unified |
| Storage | 2TB NVMe PCIe 4.0 · expandable to 16TB |
| Throughput | 300+ tokens / sec · local inference |
| Model support | Up to 70B parameter models · 8K+ context |
| Runtime stack | Ollama · LLaMA · Qwen · Mistral · GPT-OSS |
| RAG infrastructure | MongoDB + Qdrant · vector DB per tenant |
| Connectivity | 2.5 GbE · Wi-Fi 7 · Bluetooth 5.4 · Tailscale VPN |
| Form factor | 19.3 × 18.6 × 7.7 cm · VESA-mountable |
| Power | DC 19V / 11.8A / 230W |
| Deployment time | 3 to 10 days · plug-and-play |
| Warranty | 2 years · with optional yForce support retainer |
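The runtime stack above serves inference through Ollama's local HTTP API (port 11434 by default), so every prompt stays on the box. A minimal sketch using only the standard library — the model name and prompt are illustrative, not a fixed configuration:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return {
        "model": model,    # any model pulled locally, e.g. a Qwen or Mistral tag
        "prompt": prompt,
        "stream": False,   # return one JSON object instead of a token stream
    }

def generate(model: str, prompt: str) -> str:
    """Send the prompt to the local Ollama daemon; no tokens leave the network."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires a running Ollama daemon with the model pulled):
# generate("qwen2.5:72b", "Summarize this quarter's incident reports.")
```

Swapping models is a one-line change to the `model` field — the same API fronts every family in the table.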
From single-site pilots to multi-tenant production clusters — pick the architecture that matches your scale.
Single-site deployment for pilots, small clinics, field offices. Encrypted tunnel back to HQ data center. Full tenant isolation, no inbound ports exposed.
Rack-mount at VITRO, STT Manila, or your preferred PH data center. Multi-tenant VLAN segmentation. Site-to-site VPN to client data centers.
Edge boxes for latency-sensitive inference. Colocated cluster for heavy workloads. Cloud burst only where it makes sense. You decide where each workload lives.
A 500-endpoint client. Two cost models. The math is not subtle.
| Dimension | ◆ AI Box | Cloud-Native AI |
|---|---|---|
| Cost model | Fixed CAPEX | Variable OPEX |
| LLM costs | Zero · local inference | $30–50 per user / month |
| Ingestion / storage | Unlimited · local NVMe | $2.76–5.22 per GB / day |
| Data sovereignty | 100% on-premise | Vendor cloud regions |
| Latency | Local · 300+ tok/sec | Internet-dependent · variable |
| Customization | Full control · open stack | Vendor APIs & guardrails |
| Lock-in | None | High · proprietary |
| Time to deploy | 3–10 days | Weeks to months |
Cloud SOC: ~$15K to $50K per month, ongoing. AI Box: one-time investment. Payback in 6 to 10 months. Every month after that is margin.
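The payback claim is simple division. A sketch with the figures from the comparison table — 500 endpoints at $30–50 per user per month — where the appliance CAPEX is a hypothetical placeholder, not a quoted price:

```python
def payback_months(capex_usd: float, monthly_cloud_cost_usd: float) -> float:
    """Months until a one-time appliance purchase beats a recurring cloud bill."""
    return capex_usd / monthly_cloud_cost_usd

# 500 endpoints at the table's $30-50 per user per month:
low = 500 * 30    # $15,000 / month cloud spend
high = 500 * 50   # $25,000 / month cloud spend

CAPEX = 150_000   # hypothetical all-in appliance + deployment cost

print(payback_months(CAPEX, high))  # 6.0 months at the high end
print(payback_months(CAPEX, low))   # 10.0 months at the low end
```

After the crossover month, the cloud line keeps billing and the appliance line is margin.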
Every Phoenix agent, every RAG pipeline, every sandboxed Docker container — runs locally on AI Box. Zero dependency on cloud inference. Zero tokens sent outside your network.
Learn about Phoenix →

A walkthrough of the architecture, the ROI model, and a live deployment scenario tailored to your infrastructure — with our solution architects, not a sales rep reading slides.