Architecture

Odysseus uses a hub-and-spoke architecture. The control plane (hub) is fully managed by Delta Telematics. You connect your own servers by installing a lightweight agent (spoke) with a single command. There is nothing to host, patch, or maintain on the control plane side.

Overview

The platform separates concerns into two layers:

  • Control Plane (managed) — Hosted by Delta Telematics. Handles orchestration decisions, scheduling, metrics aggregation, the web dashboard, and Athena AI. You interact with it through the dashboard.
  • Agent (your infrastructure) — A lightweight process that runs on each of your servers. It receives instructions from the control plane, executes container operations via Docker, and reports health, metrics, and logs back.

Key principle: The control plane makes decisions; the agent executes them. If the control plane is temporarily unreachable, your running containers continue operating without interruption.

[Figure: Odysseus platform architecture overview showing control plane and agent layers]
[Figure: Odysseus dashboard, the central control plane interface showing cluster health, deployments, and resource utilization]

Components

Control Plane

The managed control plane consists of several coordinated services:

| Component | Purpose |
| --- | --- |
| API Server | REST API gateway that handles all client requests, enforces authentication and RBAC, and routes operations to the appropriate backend service |
| Orchestration Engine | Decides where and how to place containers across your nodes, manages rolling updates, canary deployments, and rollbacks |
| Scheduler | Assigns workloads to nodes based on resource availability, affinity rules, and health status |
| State Store | Maintains the desired state of all deployments and reconciles it with actual state reported by agents |
| Metrics Aggregation | Collects and aggregates metrics from all agents, powering the dashboard, alerting, and autoscaling decisions |
| CVE Scanner | Scans container images using dual backends and enforces deployment-gating policies based on vulnerability severity |
| SRE Engine | Detects anomalies, creates incidents automatically, and can execute approved remediation actions |
[Figure: Component interaction diagram showing how control plane services communicate]
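
The State Store row above describes reconciling desired state with the actual state reported by agents. As a rough illustration of that idea only, the sketch below compares replica counts and emits the actions needed to close the gap. The `Action` type, the `reconcile` function, and the replica-count representation are assumptions made for this example; the actual State Store data model is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    op: str          # "start" or "stop" (hypothetical operation names)
    deployment: str
    count: int


def reconcile(desired: dict[str, int], actual: dict[str, int]) -> list[Action]:
    """Return the actions that move actual replica counts toward desired ones."""
    actions = []
    # Visit every deployment known to either side, in a stable order.
    for name in sorted(desired.keys() | actual.keys()):
        diff = desired.get(name, 0) - actual.get(name, 0)
        if diff > 0:
            actions.append(Action("start", name, diff))
        elif diff < 0:
            actions.append(Action("stop", name, -diff))
    return actions
```

A deployment that appears only in the actual state is treated as having a desired count of zero, so its containers are stopped.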

Agent

[Figure: Nodes management view showing health status, VPN connectivity, scheduling state, and resource usage per node]

The agent is a single binary that runs on each of your servers. Install it with one command:

curl -sSL https://get.odysseus.delta-telematics.ca | sh -s -- --token YOUR_ENROLLMENT_TOKEN

The agent:

  • Connects to the control plane over an encrypted WireGuard VPN tunnel
  • Executes container lifecycle operations (create, start, stop, remove) via the local Docker daemon
  • Streams container logs and resource metrics to the control plane
  • Reports node health (CPU, memory, disk, network) at configurable intervals
  • Self-updates automatically with health-check-gated rollback
  • Continues operating independently if the control plane is temporarily unreachable
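
To make the periodic health reporting above more concrete, here is a minimal sketch that collects a node snapshot and sends it at a configurable interval. It is not the agent's implementation: the field names, the `node_health` and `report_loop` functions, and the `send` callback are all hypothetical, and only CPU count, load average, and disk usage are shown because they are available from the Python standard library.

```python
import os
import shutil
import time
from typing import Callable, Optional


def node_health(path: str = "/") -> dict:
    """Collect a small health snapshot using only the standard library."""
    disk = shutil.disk_usage(path)
    disk_pct = round(100 * disk.used / disk.total, 1) if disk.total else 0.0
    load_1m: Optional[float] = None
    # os.getloadavg is missing on some platforms and may raise OSError.
    load_fn = getattr(os, "getloadavg", None)
    if load_fn is not None:
        try:
            load_1m = load_fn()[0]
        except OSError:
            load_1m = None
    return {
        "timestamp": time.time(),
        "cpu_count": os.cpu_count(),
        "load_1m": load_1m,
        "disk_used_pct": disk_pct,
    }


def report_loop(send: Callable[[dict], None], interval_seconds: float = 30.0,
                iterations: int = 1, sleep: Callable[[float], None] = time.sleep) -> None:
    """Send a snapshot every interval_seconds, for a fixed number of iterations."""
    for i in range(iterations):
        send(node_health())
        if i < iterations - 1:
            sleep(interval_seconds)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.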

Dashboard

The web dashboard provides a real-time view of your entire infrastructure:

  • Deployment management with one-click scaling, restarts, and rollbacks
  • Live container logs with search and filtering
  • Node health and resource utilization maps
  • CVE scan results and security policy management
  • Incident timeline with remediation controls
  • Athena AI chat for natural-language infrastructure queries
  • Audit log viewer with export capabilities

Athena AI

[Figure: Athena MCP integration showing how natural language queries connect to orchestration tools]

Athena is an AI-powered operations assistant integrated into the dashboard. Powered by Claude via the Model Context Protocol (MCP), Athena has access to 61 tools that let it:

  • Query your deployment state, metrics, and logs
  • Diagnose performance issues and suggest optimizations
  • Execute operations (with your approval) such as scaling, restarting, or rolling back
  • Explain incidents and recommend remediation steps
  • Answer questions about your infrastructure in natural language
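
The "with your approval" condition above can be pictured as a gate in front of any tool that changes state. The sketch below is illustrative only: the tool names, the read-only versus mutating split, and the `approve` callback are assumptions for this example, not Athena's actual tool catalogue or MCP interface.

```python
from typing import Callable

# Hypothetical tool names, grouped by whether they change infrastructure state.
READ_ONLY_TOOLS = {"get_logs", "get_metrics", "describe_deployment"}
MUTATING_TOOLS = {"scale", "restart", "rollback"}


def run_tool(name: str, args: dict, approve: Callable[[str, dict], bool]) -> str:
    """Run read-only tools directly; ask for approval before mutating ones."""
    if name in READ_ONLY_TOOLS:
        return f"ran {name}"
    if name in MUTATING_TOOLS:
        if approve(name, args):
            return f"ran {name}"
        return f"skipped {name}: not approved"
    raise ValueError(f"unknown tool: {name}")
```

In a dashboard, `approve` would correspond to a confirmation prompt shown to the user.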

Data Flow

[Figure: Deployment flow showing how requests travel from user through control plane to agents]

User Operations

  1. You issue a command via the Dashboard
  2. The API Server authenticates the request, checks RBAC permissions, and validates input
  3. The Orchestration Engine determines how to fulfill the request (which nodes, what order, rollout strategy)
  4. Instructions are sent to the Agent(s) on the target node(s) over encrypted WireGuard tunnels
  5. Each Agent executes the container operations via the local Docker daemon
  6. Results flow back through the same path to confirm success or report errors
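
Steps 2 and 3 above can be sketched as a single function that rejects unauthorized or invalid requests and then produces a placement plan. Everything here is a simplified assumption: the role table, the `Request` fields, and the round-robin placement stand in for the real RBAC model and the Orchestration Engine's strategy, which are not specified in this section.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "admin": {"scale", "restart", "rollback"},
    "viewer": set(),
}


@dataclass
class Request:
    role: str
    action: str
    deployment: str
    replicas: int = 1


def plan_request(request: Request, nodes: list[str]) -> list[tuple[str, str]]:
    """Check permissions and input, then assign each replica to a node."""
    # Step 2: RBAC check and input validation.
    if request.action not in ROLE_PERMISSIONS.get(request.role, set()):
        raise PermissionError(f"{request.role} may not {request.action}")
    if request.replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not nodes:
        raise RuntimeError("no nodes available")
    # Step 3: simple round-robin placement across the given nodes.
    return [(nodes[i % len(nodes)], request.deployment)
            for i in range(request.replicas)]
```

Each `(node, deployment)` pair in the returned plan would then be sent to the agent on that node (step 4).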

Monitoring and Metrics

  1. Agents continuously collect container and node metrics
  2. Metrics are streamed to the Control Plane over the encrypted tunnel
  3. The Metrics Aggregation service processes, stores, and indexes the data
  4. The Dashboard renders real-time charts, the SRE Engine watches for anomalies, and the Scheduler uses metrics for placement decisions
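
This section does not say how the SRE Engine detects anomalies. One common and simple approach, shown here purely as an illustration, is a z-score test: flag a new sample when it lies more than a chosen number of standard deviations from the recent mean. The `is_anomalous` function and its `threshold` default are assumptions for this sketch.

```python
from statistics import mean, pstdev


def is_anomalous(history: list[float], latest: float, threshold: float = 3.0) -> bool:
    """Flag latest if it deviates from history by more than threshold sigmas."""
    if len(history) < 2:
        return False  # not enough data to judge
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return latest != history[0]
    return abs(latest - mean(history)) / sigma > threshold
```

For example, with recent CPU readings of 48 to 52 percent, a sample of 95 percent is flagged while a sample of 51 percent is not.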

Log Streaming

  1. Container stdout/stderr is captured by the Agent
  2. Logs are forwarded to the Control Plane in real time
  3. Logs are accessible via the Dashboard log viewer, with search and filtering

Key Design Principles

Tenant Isolation

Every layer of the platform enforces strict tenant boundaries:

  • Network: Each tenant's containers run on isolated Docker networks with no cross-tenant connectivity
  • Data: All queries are scoped to the authenticated tenant at the database layer
  • API: Every request is validated against the tenant context in the JWT token
  • Resources: CPU, memory, and container count quotas are enforced per tenant
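
As an illustration of the per-tenant resource quotas above, the sketch below checks whether a new request fits within a tenant's remaining allowance. The `Usage` fields and the `within_quota` function are hypothetical names chosen for this example; how the platform actually represents and enforces quotas is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    cpu_cores: float
    memory_mb: int
    containers: int


def within_quota(used: Usage, requested: Usage, limit: Usage) -> bool:
    """Return True only if every resource stays at or under the tenant limit."""
    return (
        used.cpu_cores + requested.cpu_cores <= limit.cpu_cores
        and used.memory_mb + requested.memory_mb <= limit.memory_mb
        and used.containers + requested.containers <= limit.containers
    )
```

A request is rejected if it would exceed any single limit, even when the other resources have headroom.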

Encrypted Communication

  • External traffic: TLS with auto-renewed Let's Encrypt certificates for all public endpoints
  • Internal traffic: WireGuard VPN tunnels between the control plane and every agent node
  • No plaintext: There is no unencrypted path between any two components

Agent Independence

Agents are designed to operate autonomously:

  • Running containers are unaffected by control plane downtime
  • Agents queue operations locally if the tunnel is temporarily disrupted
  • State is fully reconciled when connectivity is restored
  • No container data is stored on the control plane (agents manage their own Docker state); the control plane keeps the desired state, metrics, and logs it receives from agents
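
The queue-then-reconcile behaviour above can be sketched as a first-in, first-out buffer that is drained once the link is usable again. This is a minimal in-memory illustration: the `OperationQueue` class and the `send` callback (which returns `True` on successful delivery) are assumptions, and a real agent would presumably persist the queue to disk.

```python
from collections import deque
from typing import Callable


class OperationQueue:
    """Buffer outgoing items while the tunnel is down and replay them in order."""

    def __init__(self, send: Callable[[str], bool]):
        self._send = send          # returns True when delivery succeeds
        self._pending = deque()

    def submit(self, op: str) -> None:
        """Queue an item and try to deliver everything that is pending."""
        self._pending.append(op)
        self.flush()

    def flush(self) -> int:
        """Deliver pending items in order; stop at the first failure."""
        delivered = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()
            delivered += 1
        return delivered

    @property
    def pending(self) -> list[str]:
        return list(self._pending)
```

An item is removed from the queue only after `send` reports success, so nothing is dropped if the tunnel fails mid-flush.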

Automatic Agent Upgrades

Agents self-update to the latest compatible version:

  • New versions are pulled automatically when available
  • Upgrades are gated by health checks: if the new version fails its health check, it automatically rolls back
  • Zero manual intervention required
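
The health-check-gated upgrade above follows a simple pattern: install the candidate, probe it, and reinstall the previous version if the probe fails. The sketch below shows that pattern only; `upgrade_agent`, the `install` and `healthy` callbacks, and the version strings in the usage are hypothetical stand-ins for whatever the agent actually does.

```python
from typing import Callable


def upgrade_agent(current: str, candidate: str,
                  install: Callable[[str], None],
                  healthy: Callable[[], bool]) -> str:
    """Install candidate; keep it if healthy, otherwise roll back to current."""
    install(candidate)
    if healthy():
        return candidate
    # Health check failed: restore the previously running version.
    install(current)
    return current
```

The return value is the version left running, which makes the outcome easy to report back to the control plane.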

Integrations

Odysseus integrates with proven infrastructure tools, all managed for you as part of the control plane:

| Tool | Role |
| --- | --- |
| Consul | Service discovery and distributed state coordination across nodes |
| Vault | Secrets management with automatic rotation and least-privilege policies |
| Prometheus | Metrics collection, alerting rules, and autoscaling signal source |
| Traefik | Ingress routing, TLS termination, and weighted traffic splitting for canary deployments |
| Trivy + Grype | Dual-backend CVE scanning for comprehensive vulnerability detection |
| WireGuard | Encrypted VPN tunnels for all control-plane-to-agent communication |
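
To illustrate how dual-backend scanning could feed a deployment gate, the sketch below merges findings from two scanners and blocks a deployment at or above a chosen severity. It is a simplification under stated assumptions: findings are modelled as plain `{cve_id: severity}` dictionaries rather than the scanners' real report formats, the identifiers in the usage are placeholders, and `merge_findings` and `allow_deploy` are hypothetical names.

```python
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def merge_findings(*scans: dict[str, str]) -> dict[str, str]:
    """Union the findings, keeping the highest severity reported per CVE."""
    merged: dict[str, str] = {}
    for scan in scans:
        for cve_id, severity in scan.items():
            known = merged.get(cve_id)
            if known is None or SEVERITY_RANK[severity] > SEVERITY_RANK[known]:
                merged[cve_id] = severity
    return merged


def allow_deploy(findings: dict[str, str], block_at: str = "HIGH") -> bool:
    """Allow deployment only if every finding is below the blocking severity."""
    limit = SEVERITY_RANK[block_at]
    return all(SEVERITY_RANK[sev] < limit for sev in findings.values())
```

Taking the higher severity when the two backends disagree is a conservative choice made for this sketch; a real policy might weigh the sources differently.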

Resilience

Odysseus is designed to handle failures gracefully at every level:

| Failure Scenario | Behavior |
| --- | --- |
| Control plane downtime | Running containers continue operating. Agents queue any pending operations and reconcile when the connection is restored. |
| Network partition | WireGuard maintains persistent tunnels and automatically reconnects. Agents continue local operations during the partition. |
| Agent crash | The agent process is managed by systemd and restarts automatically. Running containers are not affected by agent restarts. |
| Bad agent update | Health-check-gated rollback reverts to the previous agent version within seconds. |
| Node failure | The orchestration engine detects the node as unhealthy and reschedules containers to other available nodes. |
| Container crash | Restart policies and health checks trigger automatic container recovery. Persistent incidents are escalated to the SRE engine. |
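
The node-failure row above can be illustrated with a small rescheduling sketch: containers on unhealthy nodes are moved to the least-loaded healthy node, and everything else stays where it is. The `reschedule` function and the container-count load measure are assumptions for this example; the real scheduler also considers resource availability and affinity rules, which are omitted here.

```python
def reschedule(placements: dict[str, str], healthy_nodes: list[str]) -> dict[str, str]:
    """Move containers off unhealthy nodes onto the least-loaded healthy node."""
    if not healthy_nodes:
        raise RuntimeError("no healthy nodes available")
    # Count the containers already running on each healthy node.
    load = {node: 0 for node in healthy_nodes}
    for node in placements.values():
        if node in load:
            load[node] += 1
    result = {}
    for container, node in sorted(placements.items()):
        if node in load:
            result[container] = node  # node is healthy, leave it in place
        else:
            # Pick the healthy node with the fewest containers (ties by name).
            target = min(sorted(load), key=load.get)
            load[target] += 1
            result[container] = target
    return result
```

Updating `load` after each move spreads displaced containers across the remaining nodes instead of piling them onto one.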