A dedicated AI workspace per company. Claude connects to your CRM, email, calendar, and 40+ services through a 21-container distributed system with 3-layer self-healing and per-tenant VM isolation.
Users connect via Claude Desktop (SSH), Claude.ai (Remote MCP), or Telegram. Every request hits a dedicated server with full access to the company's tools.
The control plane runs as a multi-compose distributed system on a single VM. Core services, observability, and infrastructure are deployed independently.
All 21 containers share one Docker bridge network. Only Caddy is exposed to the internet. APIs, Redis, and metrics are bound to 127.0.0.1.
The system recovers from failures automatically. Docker restarts containers, cron scripts watch Docker, and the application watches everything.
The autoheal container monitors all services every 60 seconds. If a container's health check fails 3-5 times, the container is restarted automatically. Every service has a custom health check: Redis uses redis-cli ping, the APIs use HTTP GET /health, and the worker uses /health/readiness.
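The restart rule can be pictured as a small failure counter per container. This is a minimal sketch, not the autoheal implementation: the HealthTracker class, its retries field, and the choice to reset the counter on any successful check are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Counts failed health checks in a row for one container (hypothetical helper)."""

    retries: int = 3   # assumed per-service threshold; the text gives a 3-5 range
    failures: int = 0  # current streak of failed checks

    def record(self, healthy: bool) -> bool:
        """Record one check result and return True when a restart is due."""
        if healthy:
            self.failures = 0  # assumption: a passing check clears the streak
            return False
        self.failures += 1
        if self.failures >= self.retries:
            self.failures = 0  # start counting afresh after the restart
            return True
        return False
```

With retries=3, three failed checks in a row return False, False, True, while a passing check in between starts the count over.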
worker-guardian.sh runs every minute and checks worker health over HTTP. services-guardian.sh runs every 5 minutes and watches n8n, Prometheus, Loki, Grafana, and Promtail. After 3 failed restarts, the failure is escalated to Telegram with log excerpts. A 15-minute cooldown prevents alert spam.
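The escalation and cooldown logic can be sketched as follows. The guardians themselves are shell scripts; this Python version only illustrates the decision rule. The Guardian class and its method name are hypothetical, and resetting the failure count after a successful restart is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

MAX_FAILED_RESTARTS = 3     # from the text: escalate after 3 failed restarts
COOLDOWN_SECONDS = 15 * 60  # from the text: 15-minute alert cooldown


@dataclass
class Guardian:
    """Tracks restart attempts for one service (hypothetical helper)."""

    failed_restarts: int = 0
    last_alert_at: Optional[float] = None

    def on_restart_result(self, succeeded: bool, now: float) -> bool:
        """Return True when an alert should be sent for this attempt."""
        if succeeded:
            self.failed_restarts = 0  # assumption: recovery clears the count
            return False
        self.failed_restarts += 1
        if self.failed_restarts < MAX_FAILED_RESTARTS:
            return False  # keep retrying quietly
        in_cooldown = (
            self.last_alert_at is not None
            and now - self.last_alert_at < COOLDOWN_SECONDS
        )
        if in_cooldown:
            return False  # suppress repeat alerts inside the cooldown window
        self.last_alert_at = now
        return True
```

Here now is a timestamp in seconds, for example from time.time(). Passing it in rather than reading the clock inside the method keeps the rule easy to test.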
The application-level layer lives in packages/self-heal/. It ships 14 preconfigured alerts (redis-memory, queue-backlog, disk-space, workflow-dlq-growth, caddy-cert-expiry, and others) and 8 automated remediation actions (dlq-retry, redis-purge, container-health, disk-cleanup, and others). A proactive health loop runs every 5 minutes and dispatches self-heal alerts after 3 consecutive failures.
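A rough sketch of how a health loop might pair alerts with remediation actions is shown below. The alert and action names come from the lists above, but the pairing in REMEDIATIONS, the HealthLoop class, and the decision to fire only once per failure streak are all assumptions; the actual package may wire these together differently.

```python
from typing import Callable, Dict, List, Optional, Tuple

FAILURES_BEFORE_DISPATCH = 3  # from the text: 3 consecutive failures

# Hypothetical alert -> action pairing; the text names both lists
# but does not say which action handles which alert.
REMEDIATIONS: Dict[str, str] = {
    "redis-memory": "redis-purge",
    "workflow-dlq-growth": "dlq-retry",
    "disk-space": "disk-cleanup",
}


class HealthLoop:
    """Runs named checks and reports which alerts to dispatch (illustrative)."""

    def __init__(self, checks: Dict[str, Callable[[], bool]]) -> None:
        self.checks = checks
        self.streaks: Dict[str, int] = {name: 0 for name in checks}

    def tick(self) -> List[Tuple[str, Optional[str]]]:
        """Run every check once; return (alert, action) pairs to dispatch."""
        dispatched: List[Tuple[str, Optional[str]]] = []
        for name, check in self.checks.items():
            if check():
                self.streaks[name] = 0  # a passing check ends the streak
                continue
            self.streaks[name] += 1
            # Assumption: fire exactly once, when the streak reaches the threshold.
            if self.streaks[name] == FAILURES_BEFORE_DISPATCH:
                dispatched.append((name, REMEDIATIONS.get(name)))
        return dispatched
```

In this sketch tick() would be called on the 5-minute schedule. An alert with no mapped action is returned with None so the caller can decide whether to notify a human instead.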