Every Module. Every Port.
All 17 loadable modules and capabilities documented in full. Each module is self-contained, free, and runs entirely on your machine.
Central orchestration loop for chat turns, tool calls, and streamed responses. Takes client requests, prepares model calls, and coordinates tool execution round-trips through the tools channel. Expects an OpenAI-compatible backend upstream and tracks connectivity state for that endpoint.
- ›Multi-round tool call loop with result correlation before generation continues
- ›Connectivity state tracking for the upstream OpenAI-compatible endpoint
- ›Structured agent event emission (start, delta, tool_call, tool_result, done)
- ›Drop-in orchestration path for terminals requiring full agent behavior
Local inference engine that loads GGUF models through llama.cpp and generates token streams. Process-isolated so model runtime crashes do not take down the main Hypervisor process. Core local model runtime used behind server-facing modules such as gpt_server.
- ›GGUF model loading via llama-cpp-python with configurable GPU layer offload
- ›Isolated child process — native code crashes cannot bring down the Hypervisor
- ›CUDA runtime auto-discovery: app-local → CUDA Toolkit → nvidia pip packages
- ›Streaming token output with per-chunk hardware telemetry (t/s, VRAM, load %)
- ›Separate LLM Inference Setup and RAG Embeddings Setup install flows
In-process LiteLLM gateway for cloud providers that acts as a drop-in OpenAI-compatible API source for the same downstream wiring used by local stacks.
- ›In-process LiteLLM library mode — no CLI subprocess
- ›Providers: OpenAI, Anthropic, Gemini, Grok, Ollama, custom LiteLLM string
- ›API key prefix auto-detection pre-fills provider and model
- ›Vault-stored API keys — serialised as __vault__, never logged or emitted
- ›Bearer auth middleware when vault is active; /v1/health exempt
- ›Mutually exclusive with GPT Server on the same LOCAL_IP_IN jack
Primary interactive web terminal for chat, tools, storage, and RAG-assisted prompting. Hosts a local web service for the frontend and coordinates multi-channel rack inputs. Operates in direct API mode or agent-engine mode depending on wiring and payload routing.
- ›Pre-built React/Vite UI — no Node.js or npm required at runtime
- ›SSE streaming chat (/api/chat) and WebSocket push hub (/ws)
- ›RAG gate: QUERY_PAYLOAD → CONTEXT_PAYLOAD injection before every LLM request
- ›MCP tool registry with tool_call_id correlation and 30 s configurable timeout
- ›Direct DB access via aiosqlite (SQLCipher when vault is present)
- ›Bearer auth middleware on all protected routes when vault is active
Discord bot terminal that bridges Discord messages into the Miniloader chat flow. Mirrors key wiring patterns from gpt_terminal, including context, tools, and storage links. Supports direct API mode or the full agent-engine path depending on rack wiring.
- ›Slash command and message event handling with CHAT_REQUEST translation
- ›Streaming responses posted back to the originating Discord channel
- ›Dual wiring modes: direct API (LOCAL_IP_IN) or full agent-engine path (AGENT_IN)
- ›Bot token and OAuth credentials vault-stored; never logged or emitted
- ›Web/tunnel output for participation in routed companion workflows
LiveKit Voice handles realtime STT/TTS by turning speech into agent turns and streaming responses back as synthesized audio with browser join configuration.
- ›Realtime speech-to-text and text-to-speech worker flows
- ›Transcribed speech injected directly into the agent turn pipeline
- ›VOICE_CONFIG_PAYLOAD emission for browser client room join
- ›STT and TTS model readiness tracked separately in module parameters
- ›Web helper port auto-shifts if default is occupied
Local OpenAI-compatible HTTP API facade. Translates rack CHAT_REQUEST payloads into /v1/chat/completions and /v1/models endpoints. Streams model output back as SSE/JSON by consuming brain stream payloads from the linked inference module.
- ›OpenAI-compatible /v1/chat/completions and /v1/models
- ›Loopback binding (127.0.0.1) with vault-backed bearer auth
- ›Auto-generated endpoint password stored in vault on first account creation
- ›SERVER_CONFIG_PAYLOAD handshake consumed by downstream terminals
- ›Configurable port and CORS policy
Persistence layer for SQL query and transaction payloads. Supports encrypted SQLite and PostgreSQL backends with one unified transport contract. Requests and responses share the same bidirectional storage channel.
- ›SQLite and PostgreSQL backends selectable via db_type parameter
- ›SQLCipher encryption at rest — key derived per-user from vault via HKDF
- ›Postgres via asyncpg pool; pg_password vault-stored
- ›Schema: system_state, threads, messages, templates, settings
- ›Read-only mode blocks all transaction payloads
- ›Sensitive params serialised as __vault__ in rack snapshots
Document chunking, embedding generation, and semantic retrieval for context injection. Ingests document payloads from file_access and persists vectors in a local ChromaDB store. Query payloads answered with ranked context chunks.
- ›SentenceTransformer embeddings — model loaded locally, zero network calls
- ›ChromaDB vector persistence (collection: miniloader_docs)
- ›Sliding-window character chunking with configurable size and overlap
- ›Per-user vector store scoping when vault is active
- ›Cosine similarity scores normalised to [0, 1]
- ›Planned: pgvector backend — identical payload contract
Local document extraction and file tool execution. Emits document payloads for downstream RAG indexing and separately exposes file tools for agents. Bridge between local files and AI-facing tool and RAG pipelines.
- ›Text extraction: .txt, .md, .csv, .json, .py, .yaml, .yml, .pdf, .docx
- ›MCP tools: file_list, file_read_text, file_write_text
- ›read_only mode enforced — file_write_text blocked before execution
- ›SEND button triggers active-file emission to RAG Engine
- ›Configurable root_path and active file selection via UI checkbox map
Tool aggregation hub and backplane. Up to 8 cartridge slots merge tool schemas from multiple providers, route execution, and deliver a single merged schema to consumers.
- ›8-slot backplane UI — each slot maps to a cartridge module with its own rack card
- ›provider_tool namespace validation — non-conforming tool names rejected
- ›Schema collision detection — first provider to register a name owns it
- ›TTL eviction for silent providers; merged schema rebroadcast on eviction
- ›30 s call timeout with TOOL_TIMEOUT error payload
- ›Hybrid cartridges: pg_cartridge (Postgres), sqlite_cartridge (planned)
PostgreSQL hybrid cartridge for MCP Bus. Manages an asyncpg connection pool in-process and exposes SQL tools to the AI agent through the standard MCP tool channel.
- ›In-process asyncpg connection pool — no subprocess overhead
- ›Tools: pg_query, pg_execute, pg_list_tables, pg_describe_table, pg_list_schemas
- ›pg_password vault-stored; never logged or emitted in payloads
- ›Wires to MCP Bus via backplane slot — appears in Add Cartridge context menu
- ›Full Hypervisor lifecycle: PWR, INIT, CFG per cartridge instance
Exposes Gmail, Calendar, and Contacts capabilities as agent tools. Publishes tool schemas and results over a bidirectional tools channel. Acts as a productivity tool provider rather than a model or server endpoint.
- ›Tools: Gmail read/send, Calendar event management, Contacts lookup
- ›OAuth token state managed in module parameters with periodic refresh
- ›Tool schemas discoverable through standard MCP Bus backplane slot
- ›Authentication and credential state vault-stored
Scaffolded tool-provider module for Obsidian-style workflows. Emits tool schemas and accepts execution requests through the standard tools channel contract. Current implementation preserves interface compatibility while the concrete tool set is expanded.
- ›Full MCP tool channel interface — wiring-stable across future releases
- ›Placeholder execution returns not-implemented without crashing the pipeline
- ›Schema published on init so consumers enumerate the tool set correctly
- ›Planned: vault/graph read, note creation, tag-based search
Playwright-backed browser automation tools for the agent system. Emits tool schemas and executes browser actions through the standard bidirectional tools channel. Focused on web interaction rather than chat serving.
- ›Playwright-backed browser automation (Chromium, Firefox, WebKit)
- ›Tools: navigate, click, fill, screenshot, extract text, evaluate script
- ›Browser readiness and navigation state tracked in module parameters
- ›Wires into MCP Bus as a standard tool-provider cartridge
- ›Playwright and browser binaries installed separately from core rack
Ngrok Tunnel publishes local services to the internet by consuming routing config and emitting tunnel status with a public URL on the same channel.
- ›One tunnel route per instance — use multiple instances for multiple services
- ›Free tier: dynamic *.ngrok-free.app URL
- ›Paid tier: custom domain support
- ›Managed subprocess with PID registered in Hypervisor process registry
- ›Auth token vault-stored; serialised as __vault__ in snapshots
Decorative rack spacer with no functional I/O. Occupies visual space and keeps rack layouts clean. Initializes directly into running state without transport behavior.
- ›No I/O ports — purely decorative
- ›Initializes directly to running state
- ›Zero runtime overhead
