# Current Architecture
This file describes the current code-supported architecture only. Historical
paths and partial migrations are intentionally excluded unless brief historical
context is needed to explain the current shape.
Reference material:
- visual diagram: [current-architecture.excalidraw](./current-architecture.excalidraw)
- next-step roadmap: [development-roadmap.md](./development-roadmap.md)
- legacy inventory: [legacy-inventory.md](./legacy-inventory.md)
- terminology guide: [terminology.md](./terminology.md)
## Runtime Mode
The only supported runtime model is the split-service development architecture:
a dedicated runtime API surface (`runtime_service`) and a separate Gateway
process.
```bash
./start-dev.sh # Starts all services including runtime_service and Gateway
```
**Characteristics:**
- `runtime_service` (:8003) provides runtime lifecycle APIs
- The checked-in `start-dev.sh` starts the split services and lets `runtime_service` spawn Gateway
- Manual split-service flows can also let `runtime_service` spawn Gateway
- Clear separation between Control Plane (`runtime_service`) and Data Plane (Gateway)
- Service discovery via environment variables
- Independent scaling and deployment of each service
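
As a rough illustration of environment-based service discovery, the sketch below resolves a base URL for each HTTP service. The default ports are the ones listed under "Default Runtime Shape"; the `<NAME>_URL` variable naming and the `service_url` helper are assumptions made for this example, not names taken from the repository.

```python
import os

# Default local ports, as listed under "Default Runtime Shape".
DEFAULT_PORTS = {
    "agent_service": 8000,
    "trading_service": 8001,
    "news_service": 8002,
    "runtime_service": 8003,
}


def service_url(name, env=None):
    """Return the base URL for a service, preferring an environment override.

    The `<NAME>_URL` naming scheme is hypothetical; check the repository's
    own configuration for the real variable names.
    """
    env = os.environ if env is None else env
    override = env.get(f"{name.upper()}_URL")
    if override:
        # Normalize so callers can always append "/path".
        return override.rstrip("/")
    return f"http://localhost:{DEFAULT_PORTS[name]}"
```

Passing `env` explicitly keeps the helper easy to test; in normal use it falls back to `os.environ`.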
**Use cases:**
- Local development with hot-reload
- Multi-node deployments
- Production environments requiring service isolation
## Default Runtime Shape
The active runtime path is:
`frontend -> frontend_service proxy or direct split-service calls -> runtime_service/control APIs -> gateway subprocess -> market/pipeline/storage`
Current service surfaces:
- `backend.apps.agent_service` on `:8000`
  - control plane for workspaces, agents, skills, approvals
- `backend.apps.trading_service` on `:8001`
  - read-only trading data APIs
- `backend.apps.news_service` on `:8002`
  - read-only explain/news APIs
- `backend.apps.runtime_service` on `:8003`
  - runtime lifecycle and gateway process management
- gateway WebSocket on `:8765`
  - live feed/event transport and pipeline coordination
### Control Plane vs Data Plane
**Control Plane (runtime_service :8003):**
- Gateway lifecycle management (start/stop/restart)
- Runtime configuration and bootstrap
- Process health monitoring
- Run history and state snapshots
**Data Plane (Gateway :8765):**
- WebSocket event streaming
- Market data ingestion
- Pipeline execution (analysis -> decision -> execution)
- Real-time trading operations
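
The control-plane responsibilities above (start/stop/restart plus health monitoring of a child process) can be sketched with the standard library alone. This is a minimal illustration, not the `runtime_service` implementation: the `GatewayProcess` class is hypothetical, and `STAND_IN_COMMAND` is a placeholder because the real Gateway launch command is not shown in this document.

```python
import subprocess
import sys

# Placeholder child process; the real Gateway command is not documented here.
STAND_IN_COMMAND = [sys.executable, "-c", "import time; time.sleep(30)"]


class GatewayProcess:
    """Owns one child process and exposes start/stop/restart and a health check."""

    def __init__(self, command):
        self.command = list(command)
        self._proc = None

    def is_running(self):
        # poll() returns None while the child is still alive.
        return self._proc is not None and self._proc.poll() is None

    def start(self):
        # Starting an already-running process is a no-op.
        if not self.is_running():
            self._proc = subprocess.Popen(self.command)

    def stop(self, timeout=5.0):
        if self._proc is None:
            return
        if self._proc.poll() is None:
            # Ask politely first, then force-kill if the child ignores it.
            self._proc.terminate()
            try:
                self._proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                self._proc.kill()
                self._proc.wait()
        self._proc = None

    def restart(self):
        self.stop()
        self.start()
```

Keeping the process handle inside one object means the health check and the lifecycle calls always agree on which child they refer to.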
## Runtime Data Layout
The canonical runtime data root is:
- `runs/<run_id>/`
Important files under each run:
- `runs/<run_id>/BOOTSTRAP.md`
  - machine-readable front matter plus run-scoped prompt body
- `runs/<run_id>/agents/<agent_id>/`
  - run-scoped agent workspace files and active/local skills
- `runs/<run_id>/state/runtime_state.json`
  - runtime snapshot
- `runs/<run_id>/state/server_state.json`
  - server-side state (portfolio, trades, market data)
- `runs/<run_id>/team_dashboard/*.json`
  - compatibility/export layer for dashboard consumers
  - can be disabled in controlled environments via `ENABLE_DASHBOARD_COMPAT_EXPORTS=false`
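
The layout above maps naturally onto a small path helper. The sketch below is illustrative only: `RunLayout` and `dashboard_exports_enabled` are hypothetical names, and treating an unset `ENABLE_DASHBOARD_COMPAT_EXPORTS` as enabled is an assumption inferred from the wording "can be disabled".

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RunLayout:
    """Paths under runs/<run_id>/ as described in this section."""

    root: Path

    @classmethod
    def for_run(cls, run_id, runs_dir="runs"):
        return cls(Path(runs_dir) / run_id)

    @property
    def bootstrap(self):
        return self.root / "BOOTSTRAP.md"

    def agent_dir(self, agent_id):
        return self.root / "agents" / agent_id

    @property
    def runtime_state(self):
        return self.root / "state" / "runtime_state.json"

    @property
    def server_state(self):
        return self.root / "state" / "server_state.json"

    @property
    def team_dashboard(self):
        return self.root / "team_dashboard"


def dashboard_exports_enabled(env=None):
    # Assumption: exports stay on unless the flag is explicitly "false".
    env = os.environ if env is None else env
    value = env.get("ENABLE_DASHBOARD_COMPAT_EXPORTS", "true")
    return value.strip().lower() != "false"
```

For example, `RunLayout.for_run("demo").runtime_state` yields `runs/demo/state/runtime_state.json` relative to the working directory.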
## Workspace Terms
Two similarly named concepts still exist in the repository:
- `workspaces/`
  - design-time registry and CRUD surface exposed by `agent_service`
- `runs/<run_id>/`
  - actual runtime state, agent assets, skills, bootstrap config, and logs
When reading current runtime code, prefer `runs/<run_id>/` as the source of
truth. The `workspaces/` registry is not the default execution path.
## Skill Sandbox Execution
Skill scripts (analysis tools, valuation reports) can be executed in multiple
sandbox modes via `backend/tools/sandboxed_executor.py`:
| Mode | Backend Class | Description |
|------|---------------|-------------|
| `none` | `NoSandboxBackend` | Direct module import and execution (default, development only) |
| `docker` | `DockerSandboxBackend` | Docker container isolation with resource limits |
| `kubernetes` | `KubernetesSandboxBackend` | Kubernetes Pod isolation (reserved interface) |
Environment configuration:
```bash
SKILL_SANDBOX_MODE=none # none | docker | kubernetes
SKILL_SANDBOX_IMAGE=python:3.11-slim
SKILL_SANDBOX_MEMORY_LIMIT=512m
SKILL_SANDBOX_CPU_LIMIT=1.0
SKILL_SANDBOX_NETWORK=none
SKILL_SANDBOX_TIMEOUT=60
```
The default `none` mode displays a runtime security warning on first execution
as a reminder that scripts run without isolation. Production deployments should
use `docker` mode with appropriate resource limits.
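
To make the environment settings concrete, here is a minimal sketch of how they could be read and turned into a `docker run` command line. It is not the code in `backend/tools/sandboxed_executor.py`: `SandboxConfig` and `docker_argv` are hypothetical names, and the read-only `/skill` mount is an assumption made so the example is complete. The `--memory`, `--cpus`, and `--network` flags are standard `docker run` options.

```python
import os
from dataclasses import dataclass
from pathlib import Path

VALID_MODES = ("none", "docker", "kubernetes")


@dataclass(frozen=True)
class SandboxConfig:
    mode: str = "none"
    image: str = "python:3.11-slim"
    memory_limit: str = "512m"
    cpu_limit: str = "1.0"
    network: str = "none"
    timeout: int = 60  # seconds

    @classmethod
    def from_env(cls, env=None):
        """Read the SKILL_SANDBOX_* variables, using the documented defaults."""
        env = os.environ if env is None else env
        mode = env.get("SKILL_SANDBOX_MODE", "none")
        if mode not in VALID_MODES:
            raise ValueError(f"unsupported SKILL_SANDBOX_MODE: {mode!r}")
        return cls(
            mode=mode,
            image=env.get("SKILL_SANDBOX_IMAGE", "python:3.11-slim"),
            memory_limit=env.get("SKILL_SANDBOX_MEMORY_LIMIT", "512m"),
            cpu_limit=env.get("SKILL_SANDBOX_CPU_LIMIT", "1.0"),
            network=env.get("SKILL_SANDBOX_NETWORK", "none"),
            timeout=int(env.get("SKILL_SANDBOX_TIMEOUT", "60")),
        )


def docker_argv(config, script_path):
    """Build a `docker run` argument list for one skill script.

    Assumption: the script's directory is mounted read-only at /skill.
    """
    script = Path(script_path).resolve()
    return [
        "docker", "run", "--rm",
        "--memory", config.memory_limit,
        "--cpus", config.cpu_limit,
        "--network", config.network,
        "-v", f"{script.parent}:/skill:ro",
        config.image,
        "python", f"/skill/{script.name}",
    ]
```

The timeout is not a Docker flag in this sketch; a caller would enforce it on the host side, for example with `subprocess.run(argv, timeout=config.timeout)`.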
## Migration Roadmap
### Current State
The system is under active development:
1. **Microservice infrastructure is operational** - `runtime_service` can start and stop Gateway as a subprocess
2. **Pipeline logic remains in Gateway** - full pipeline execution still happens within the Gateway process
3. **Direct gateway startup has been removed** - the repository now exposes a single supported startup model
### Future Direction
Phase 1: Documentation and startup convergence (active)
- Clarify runtime modes and their use cases
- Unify documentation across all entry points
Phase 2: Runtime model consolidation
- Ensure all runtime state lives under `runs/<run_id>/`
- Remove dependencies on root-level legacy directories
Phase 3: Pipeline decomposition (planned)
- Extract Pipeline stages into independent services
- Gateway becomes a thin event router
- runtime_service evolves into full orchestrator
Phase 4: Deployment convergence (future)
- Remove or rewrite historical deployment artifacts
- Keep all documented startup paths aligned with `runtime_service`
## Legacy Compatibility
These items still exist, but they are not the recommended source of truth for
new development:
- root-level runtime data directories such as `live/`, `production/`, `backtest/`
- historical documentation gaps that have not yet been fully rewritten
Legacy fallback agent paths still exist in compatibility-oriented creation
flows, but the default `TradingPipeline` runtime now prefers `EvoAgent` for the
supported roles unless rollout settings explicitly reduce that set.