evotraders/docs/current-architecture.md
cillin 16b54d5ccc feat(agent): complete EvoAgent integration for all 6 agent roles
Migrate all agent roles from Legacy to EvoAgent architecture:
- fundamentals_analyst, technical_analyst, sentiment_analyst, valuation_analyst
- risk_manager, portfolio_manager

Key changes:
- EvoAgent now supports Portfolio Manager compatibility methods (_make_decision,
  get_decisions, get_portfolio_state, load_portfolio_state, update_portfolio)
- Add UnifiedAgentFactory for centralized agent creation
- ToolGuard with batch approval API and WebSocket broadcast
- Legacy agents marked deprecated (AnalystAgent, RiskAgent, PMAgent)
- Remove backend/agents/compat.py migration shim
- Add run_id alongside workspace_id for semantic clarity
- Complete integration test coverage (13 tests)
- All smoke tests passing for 6 agent roles

Constraint: Must maintain backward compatibility with existing run configs
Constraint: Memory support must work with EvoAgent (no fallback to Legacy)
Rejected: Separate PM implementation for EvoAgent | unified approach cleaner
Confidence: high
Scope-risk: broad
Directive: EVO_AGENT_IDS env var still respected but defaults to all roles
Not-tested: Kubernetes sandbox mode for skill execution
2026-04-02 00:55:08 +08:00


Current Architecture

This file describes the current code-supported architecture only. Historical paths and partial migrations are intentionally excluded unless called out as legacy compatibility.

Runtime Modes

The system supports two distinct runtime modes:

Standalone Mode (Legacy Compatibility)

Direct Gateway startup via backend.main as a monolithic entrypoint.

python -m backend.main --mode live --port 8765

Characteristics:

  • Single process runs Gateway, Pipeline, Market Service, and Scheduler
  • No service discovery or process management
  • Suitable for single-node deployments and quick testing
  • All components share the same memory space

Use cases:

  • Quick local testing without service orchestration
  • Single-node production deployments
  • Backward compatibility with legacy startup scripts

Microservice Mode (Default for Development)

Split-service architecture with dedicated runtime_service managing the Gateway lifecycle.

./start-dev.sh  # Starts all services including runtime_service and Gateway

Characteristics:

  • runtime_service (:8003) acts as the Gateway process manager
  • Gateway runs as a subprocess managed by runtime_service
  • Clear separation between Control Plane (runtime_service) and Data Plane (Gateway)
  • Service discovery via environment variables
  • Independent scaling and deployment of each service

Use cases:

  • Local development with hot-reload
  • Multi-node deployments
  • Production environments requiring service isolation
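
The process-manager role described above can be illustrated with a minimal sketch. It assumes nothing about the real runtime_service code: the class name GatewayProcessManager, its methods, and the placeholder child command are all hypothetical, and only the standard library subprocess module is used.

```python
import subprocess
import sys
from typing import Optional

# Placeholder child process for demonstration; a real manager would launch
# the Gateway command instead (assumption, not the project's actual command).
PLACEHOLDER_CMD = [sys.executable, "-c", "import time; time.sleep(30)"]


class GatewayProcessManager:
    """Hypothetical stand-in for the lifecycle role runtime_service plays."""

    def __init__(self, command: list[str]) -> None:
        self.command = command
        self._proc: Optional[subprocess.Popen] = None

    def is_running(self) -> bool:
        # poll() returns None while the child is still alive.
        return self._proc is not None and self._proc.poll() is None

    def start(self) -> int:
        # Starting twice is a no-op: return the PID of the live child.
        if self.is_running():
            return self._proc.pid
        self._proc = subprocess.Popen(self.command)
        return self._proc.pid

    def stop(self, timeout: float = 5.0) -> None:
        if self._proc is None:
            return
        if self._proc.poll() is None:
            self._proc.terminate()  # polite shutdown first
            try:
                self._proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                self._proc.kill()  # force-kill if the child ignores terminate
                self._proc.wait()
        self._proc = None

    def restart(self) -> int:
        self.stop()
        return self.start()
```

Calling GatewayProcessManager(PLACEHOLDER_CMD).start() launches the placeholder child, and stop() terminates it. Keeping the child handle inside one object is what lets a control plane expose start/stop/restart without the data plane knowing about it.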

Mode Comparison

| Aspect | Standalone Mode | Microservice Mode |
| --- | --- | --- |
| Entry point | python -m backend.main | ./start-dev.sh or individual services |
| Process model | Single monolithic process | Multiple specialized processes |
| Gateway management | Self-contained | Managed by runtime_service |
| Service discovery | None (in-process) | Environment variable based |
| Hot reload | Full restart required | Per-service reload |
| Scaling | Vertical only | Horizontal possible |
| Complexity | Lower | Higher |
| Use case | Testing, simple deployments | Development, production |

Default Runtime Shape (Microservice Mode)

The active runtime path is:

frontend -> frontend_service proxy or direct split-service calls -> runtime_service/control APIs -> gateway subprocess -> market/pipeline/storage

Current service surfaces:

  • backend.apps.agent_service on :8000
    • control plane for workspaces, agents, skills, approvals
  • backend.apps.trading_service on :8001
    • read-only trading data APIs
  • backend.apps.news_service on :8002
    • read-only explain/news APIs
  • backend.apps.runtime_service on :8003
    • runtime lifecycle and gateway process management
  • backend.apps.openclaw_service on :8004
    • optional OpenClaw REST facade
  • gateway WebSocket on :8765
    • live feed/event transport and pipeline coordination
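
As an illustration of environment-variable-based discovery across these surfaces, the following sketch resolves a service URL from a port table. The ports come from the list above; the function name service_url, the `<NAME>_URL` variable naming, and the localhost defaults are assumptions made for this example, not the project's actual configuration keys.

```python
import os

# Ports as listed above. The env var naming scheme below is hypothetical.
DEFAULT_PORTS = {
    "agent_service": 8000,
    "trading_service": 8001,
    "news_service": 8002,
    "runtime_service": 8003,
    "openclaw_service": 8004,
    "gateway": 8765,
}


def service_url(name: str, env=None) -> str:
    """Return the base URL for a service, preferring an env override."""
    env = os.environ if env is None else env
    if name not in DEFAULT_PORTS:
        raise KeyError(f"unknown service: {name}")
    # Assumed convention: AGENT_SERVICE_URL, GATEWAY_URL, and so on.
    override = env.get(f"{name.upper()}_URL")
    if override:
        return override.rstrip("/")
    # The gateway speaks WebSocket; the other surfaces are HTTP APIs.
    scheme = "ws" if name == "gateway" else "http"
    return f"{scheme}://localhost:{DEFAULT_PORTS[name]}"
```

With no overrides set, service_url("runtime_service", env={}) yields http://localhost:8003; setting the corresponding variable redirects callers without code changes.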

Control Plane vs Data Plane

Control Plane (runtime_service :8003):

  • Gateway lifecycle management (start/stop/restart)
  • Runtime configuration and bootstrap
  • Process health monitoring
  • Run history and state snapshots

Data Plane (Gateway :8765):

  • WebSocket event streaming
  • Market data ingestion
  • Pipeline execution (analysis -> decision -> execution)
  • Real-time trading operations
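
The analysis -> decision -> execution ordering on the data plane can be sketched as a chain of stage functions. Everything here is illustrative: PipelineContext, the stage bodies, and the 0.5 threshold are invented placeholders, not the Gateway's real pipeline logic.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineContext:
    """Hypothetical state object passed from stage to stage."""
    ticker: str
    signals: dict = field(default_factory=dict)
    decision: str = ""
    executed: bool = False


Stage = Callable[[PipelineContext], PipelineContext]


def analysis(ctx: PipelineContext) -> PipelineContext:
    # Placeholder signal; a real stage would call the analyst agents.
    ctx.signals["score"] = 0.7
    return ctx


def decision(ctx: PipelineContext) -> PipelineContext:
    # Illustrative rule only: act when the placeholder score is high enough.
    ctx.decision = "buy" if ctx.signals.get("score", 0.0) > 0.5 else "hold"
    return ctx


def execution(ctx: PipelineContext) -> PipelineContext:
    # Only non-hold decisions result in an (imaginary) order.
    ctx.executed = ctx.decision != "hold"
    return ctx


def run_pipeline(ctx: PipelineContext, stages: list[Stage]) -> PipelineContext:
    # Stages run strictly in order; each one sees the previous stage's output.
    for stage in stages:
        ctx = stage(ctx)
    return ctx
```

Passing the stages as a list keeps the ordering explicit, which is the property that matters if the stages are later moved out of the Gateway process.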

Runtime Data Layout

The canonical runtime data root is:

  • runs/<run_id>/

Important files under each run:

  • runs/<run_id>/BOOTSTRAP.md
    • machine-readable front matter plus run-scoped prompt body
  • runs/<run_id>/agents/<agent_id>/
    • run-scoped agent workspace files and active/local skills
  • runs/<run_id>/state/runtime_state.json
    • runtime snapshot
  • runs/<run_id>/state/server_state.json
    • server-side state (portfolio, trades, market data)
  • runs/<run_id>/team_dashboard/*.json
    • compatibility/export layer for dashboard consumers
    • can be disabled in controlled environments via ENABLE_DASHBOARD_COMPAT_EXPORTS=false
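
A small path helper makes the layout above concrete. The paths and the ENABLE_DASHBOARD_COMPAT_EXPORTS variable come from this section; the RunLayout class, its method names, and the treatment of an unset variable as "enabled" are assumptions for illustration.

```python
import os
from pathlib import Path


class RunLayout:
    """Hypothetical helper that maps a run_id to the files listed above."""

    def __init__(self, run_id: str, root: Path = Path("runs")) -> None:
        self.run_dir = root / run_id

    @property
    def bootstrap(self) -> Path:
        return self.run_dir / "BOOTSTRAP.md"

    def agent_dir(self, agent_id: str) -> Path:
        return self.run_dir / "agents" / agent_id

    @property
    def runtime_state(self) -> Path:
        return self.run_dir / "state" / "runtime_state.json"

    @property
    def server_state(self) -> Path:
        return self.run_dir / "state" / "server_state.json"

    @property
    def team_dashboard(self) -> Path:
        return self.run_dir / "team_dashboard"


def dashboard_exports_enabled(env=None) -> bool:
    # Assumption: exports stay on unless the variable is explicitly falsy.
    env = os.environ if env is None else env
    value = env.get("ENABLE_DASHBOARD_COMPAT_EXPORTS", "true")
    return value.strip().lower() not in {"false", "0", "no", "off"}
```

Centralizing the paths in one object means callers never assemble runs/<run_id>/ strings by hand, which helps when legacy root-level directories are being phased out.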

Workspace Terms

Two similarly named concepts still exist in the repository:

  • workspaces/
    • design-time registry and CRUD surface exposed by agent_service
  • runs/<run_id>/
    • actual runtime state, agent assets, skills, bootstrap config, and logs

When reading current runtime code, prefer runs/<run_id>/ as the source of truth. The workspaces/ registry is not the default execution path.

Skill Sandbox Execution

Skill scripts (analysis tools, valuation reports) can be executed in multiple sandbox modes via backend/tools/sandboxed_executor.py:

| Mode | Backend Class | Description |
| --- | --- | --- |
| none | NoSandboxBackend | Direct module import and execution (default, development only) |
| docker | DockerSandboxBackend | Docker container isolation with resource limits |
| kubernetes | KubernetesSandboxBackend | Kubernetes Pod isolation (reserved interface) |

Environment configuration:

SKILL_SANDBOX_MODE=none              # none | docker | kubernetes
SKILL_SANDBOX_IMAGE=python:3.11-slim
SKILL_SANDBOX_MEMORY_LIMIT=512m
SKILL_SANDBOX_CPU_LIMIT=1.0
SKILL_SANDBOX_NETWORK=none
SKILL_SANDBOX_TIMEOUT=60

The default none mode displays a runtime security warning on first execution as a reminder that scripts run without isolation. Production deployments should use docker mode with appropriate resource limits.
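
One way to read these variables into a validated configuration is sketched below. The variable names, mode names, and backend class names are taken from this section; the SandboxConfig dataclass, the load_sandbox_config function, and the use of the values shown above as fallbacks are assumptions, and the sketch does not reproduce backend/tools/sandboxed_executor.py.

```python
import os
from dataclasses import dataclass

# Mode -> backend class name, as listed in the table above.
BACKEND_CLASSES = {
    "none": "NoSandboxBackend",
    "docker": "DockerSandboxBackend",
    "kubernetes": "KubernetesSandboxBackend",
}


@dataclass(frozen=True)
class SandboxConfig:
    """Hypothetical typed view over the SKILL_SANDBOX_* variables."""
    mode: str = "none"
    image: str = "python:3.11-slim"
    memory_limit: str = "512m"
    cpu_limit: float = 1.0
    network: str = "none"
    timeout: int = 60

    @property
    def backend_class(self) -> str:
        return BACKEND_CLASSES[self.mode]


def load_sandbox_config(env=None) -> SandboxConfig:
    env = os.environ if env is None else env
    mode = env.get("SKILL_SANDBOX_MODE", "none").strip().lower()
    # Reject unknown modes early instead of silently running unsandboxed.
    if mode not in BACKEND_CLASSES:
        raise ValueError(f"unsupported SKILL_SANDBOX_MODE: {mode!r}")
    return SandboxConfig(
        mode=mode,
        image=env.get("SKILL_SANDBOX_IMAGE", "python:3.11-slim"),
        memory_limit=env.get("SKILL_SANDBOX_MEMORY_LIMIT", "512m"),
        cpu_limit=float(env.get("SKILL_SANDBOX_CPU_LIMIT", "1.0")),
        network=env.get("SKILL_SANDBOX_NETWORK", "none"),
        timeout=int(env.get("SKILL_SANDBOX_TIMEOUT", "60")),
    )
```

Failing fast on an unrecognized mode is a deliberate choice in this sketch: a typo such as SKILL_SANDBOX_MODE=dokcer should not fall back to unisolated execution.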

Migration Roadmap

Current State

The system is in a transitional state:

  1. Microservice infrastructure is operational: runtime_service can start and stop the Gateway as a subprocess
  2. Pipeline logic remains in the Gateway: full Pipeline execution still happens within the Gateway process
  3. Standalone mode is preserved: direct backend.main startup is kept for compatibility

Future Direction

Phase 1: Documentation and startup convergence (active)

  • Clarify runtime modes and their use cases
  • Unify documentation across all entry points

Phase 2: Runtime model consolidation

  • Ensure all runtime state lives under runs/<run_id>/
  • Remove dependencies on root-level legacy directories

Phase 3: Pipeline decomposition (planned)

  • Extract Pipeline stages into independent services
  • Gateway becomes a thin event router
  • runtime_service evolves into full orchestrator

Phase 4: Standalone mode deprecation (future)

  • Remove direct backend.main entry point
  • All deployments use microservice mode

Legacy Compatibility

These items still exist, but they are not the recommended source of truth for new development:

  • root-level runtime data directories such as live/, production/, backtest/
  • direct backend.main startup as the primary development path

The current runtime still creates legacy AnalystAgent / RiskAgent / PMAgent instances directly. EvoAgent remains an in-progress migration target, not the default execution path.