cillin 16b54d5ccc feat(agent): complete EvoAgent integration for all 6 agent roles

Migrate all agent roles from Legacy to EvoAgent architecture:
- fundamentals_analyst, technical_analyst, sentiment_analyst, valuation_analyst
- risk_manager, portfolio_manager

Key changes:
- EvoAgent now supports Portfolio Manager compatibility methods (_make_decision,
  get_decisions, get_portfolio_state, load_portfolio_state, update_portfolio)
- Add UnifiedAgentFactory for centralized agent creation
- ToolGuard with batch approval API and WebSocket broadcast
- Legacy agents marked deprecated (AnalystAgent, RiskAgent, PMAgent)
- Remove backend/agents/compat.py migration shim
- Add run_id alongside workspace_id for semantic clarity
- Complete integration test coverage (13 tests)
- All smoke tests passing for 6 agent roles

Constraint: Must maintain backward compatibility with existing run configs
Constraint: Memory support must work with EvoAgent (no fallback to Legacy)
Rejected: Separate PM implementation for EvoAgent | unified approach cleaner
Confidence: high
Scope-risk: broad
Directive: EVO_AGENT_IDS env var still respected but defaults to all roles
Not-tested: Kubernetes sandbox mode for skill execution

2026-04-02 00:55:08 +08:00


Current Architecture

This file describes the current code-supported architecture only. Historical paths and partial migrations are intentionally excluded unless called out as legacy compatibility.

Reference material:

visual diagram: current-architecture.excalidraw
next-step roadmap: development-roadmap.md
legacy inventory: legacy-inventory.md
terminology guide: terminology.md

Runtime Modes

The system supports two distinct runtime modes:

Standalone Mode (Legacy Compatibility)

Direct Gateway startup via backend.main as a monolithic entrypoint.

python -m backend.main --mode live --port 8765

Characteristics:

Single process runs Gateway, Pipeline, Market Service, and Scheduler
No service discovery or process management
Suitable for single-node deployments and quick testing
All components share the same memory space

Use cases:

Quick local testing without service orchestration
Single-node production deployments
Backward compatibility with legacy startup scripts

Microservice Mode (Default for Development)

Split-service architecture with dedicated runtime_service managing the Gateway lifecycle.

./start-dev.sh  # Starts all services including runtime_service and Gateway

Characteristics:

runtime_service (:8003) acts as Gateway Process Manager
Gateway runs as a subprocess managed by runtime_service
Clear separation between Control Plane (runtime_service) and Data Plane (Gateway)
Service discovery via environment variables
Independent scaling and deployment of each service

Use cases:

Local development with hot-reload
Multi-node deployments
Production environments requiring service isolation

Mode Comparison

| Aspect | Standalone Mode | Microservice Mode |
| --- | --- | --- |
| Entry point | `python -m backend.main` | `./start-dev.sh` or individual services |
| Process model | Single monolithic process | Multiple specialized processes |
| Gateway management | Self-contained | Managed by runtime_service |
| Service discovery | None (in-process) | Environment variable based |
| Hot reload | Full restart required | Per-service reload |
| Scaling | Vertical only | Horizontal possible |
| Complexity | Lower | Higher |
| Use case | Testing, simple deployments | Development, production |

Default Runtime Shape (Microservice Mode)

The active runtime path is:

frontend -> frontend_service proxy or direct split-service calls -> runtime_service/control APIs -> gateway subprocess -> market/pipeline/storage

Current service surfaces:

backend.apps.agent_service on :8000
- control plane for workspaces, agents, skills, approvals
backend.apps.trading_service on :8001
- read-only trading data APIs
backend.apps.news_service on :8002
- read-only explain/news APIs
backend.apps.runtime_service on :8003
- runtime lifecycle and gateway process management
backend.apps.openclaw_service on :8004
- optional OpenClaw REST facade
gateway WebSocket on :8765
- live feed/event transport and pipeline coordination
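
As a rough illustration of environment-variable service discovery over the ports listed above, a client could resolve base URLs as in the following sketch. The variable naming scheme (`<SERVICE>_URL`), the `localhost` default, and the `service_url` helper are assumptions made for this example; the project's actual variable names may differ.

```python
import os
from typing import Mapping

# Default ports come from the service list above. The host and the
# environment variable naming scheme are assumptions for this sketch.
DEFAULT_PORTS = {
    "agent_service": 8000,
    "trading_service": 8001,
    "news_service": 8002,
    "runtime_service": 8003,
    "openclaw_service": 8004,
}


def service_url(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the base URL for a service, preferring an env override."""
    if name not in DEFAULT_PORTS:
        raise KeyError(f"unknown service: {name}")
    override = env.get(f"{name.upper()}_URL")  # hypothetical variable name
    if override:
        return override.rstrip("/")
    return f"http://localhost:{DEFAULT_PORTS[name]}"
```

Passing the environment as a mapping keeps the lookup easy to test without touching the real process environment.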

Control Plane vs Data Plane

Control Plane (runtime_service :8003):

Gateway lifecycle management (start/stop/restart)
Runtime configuration and bootstrap
Process health monitoring
Run history and state snapshots
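
The lifecycle responsibilities above (start/stop/restart plus health checks) can be sketched with the standard library's `subprocess` module. `GatewayProcessManager` is a hypothetical name, not the class used by runtime_service, and whether runtime_service launches the Gateway with the standalone command line shown here is an assumption.

```python
import subprocess
import sys
from typing import List, Optional


class GatewayProcessManager:
    """Start, stop, and restart one child process (hypothetical helper)."""

    def __init__(self, command: List[str]) -> None:
        self.command = command
        self._proc: Optional[subprocess.Popen] = None

    def is_running(self) -> bool:
        # poll() returns None while the child process is still alive.
        return self._proc is not None and self._proc.poll() is None

    def start(self) -> None:
        # Starting an already running process is a no-op.
        if not self.is_running():
            self._proc = subprocess.Popen(self.command)

    def stop(self, timeout: float = 5.0) -> None:
        if self._proc is None:
            return
        if self._proc.poll() is None:
            self._proc.terminate()  # ask politely first
            try:
                self._proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                self._proc.kill()  # force-stop if it does not exit in time
                self._proc.wait()
        self._proc = None

    def restart(self) -> None:
        self.stop()
        self.start()


# Command line copied from the standalone example in this document; using it
# as the subprocess command is an assumption of this sketch.
GATEWAY_COMMAND = [sys.executable, "-m", "backend.main", "--mode", "live", "--port", "8765"]
```

A health monitor in this style would call `is_running()` periodically and `restart()` when it returns false.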

Data Plane (Gateway :8765):

WebSocket event streaming
Market data ingestion
Pipeline execution (analysis -> decision -> execution)
Real-time trading operations

Runtime Data Layout

The canonical runtime data root is:

runs/<run_id>/

Important files under each run:

runs/<run_id>/BOOTSTRAP.md
- machine-readable front matter plus run-scoped prompt body
runs/<run_id>/agents/<agent_id>/
- run-scoped agent workspace files and active/local skills
runs/<run_id>/state/runtime_state.json
- runtime snapshot
runs/<run_id>/state/server_state.json
- server-side state (portfolio, trades, market data)
runs/<run_id>/team_dashboard/*.json
- compatibility/export layer for dashboard consumers
- can be disabled in controlled environments via ENABLE_DASHBOARD_COMPAT_EXPORTS=false
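
The layout above maps naturally onto a small path helper. The sketch below is illustrative only: `RunPaths` and `dashboard_exports_enabled` are hypothetical names, and treating the export flag as enabled unless it is explicitly set to `false` is an assumption.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class RunPaths:
    """Path helper for one run directory (hypothetical, not a project class)."""

    root: Path
    run_id: str

    @property
    def run_dir(self) -> Path:
        return self.root / "runs" / self.run_id

    @property
    def bootstrap(self) -> Path:
        return self.run_dir / "BOOTSTRAP.md"

    def agent_dir(self, agent_id: str) -> Path:
        return self.run_dir / "agents" / agent_id

    @property
    def runtime_state(self) -> Path:
        return self.run_dir / "state" / "runtime_state.json"

    @property
    def server_state(self) -> Path:
        return self.run_dir / "state" / "server_state.json"

    @property
    def dashboard_dir(self) -> Path:
        return self.run_dir / "team_dashboard"


def dashboard_exports_enabled(env: Mapping[str, str] = os.environ) -> bool:
    # Assumption: exports stay on unless the flag is explicitly "false".
    value = env.get("ENABLE_DASHBOARD_COMPAT_EXPORTS", "true")
    return value.strip().lower() != "false"
```

For example, `RunPaths(Path("."), "run-001").agent_dir("risk_manager")` resolves to `runs/run-001/agents/risk_manager` under the current directory.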

Workspace Terms

Two similarly named concepts still exist in the repository:

workspaces/
- design-time registry and CRUD surface exposed by agent_service
runs/<run_id>/
- actual runtime state, agent assets, skills, bootstrap config, and logs

When reading current runtime code, prefer runs/<run_id>/ as the source of truth. The workspaces/ registry is not the default execution path.

Skill Sandbox Execution

Skill scripts (analysis tools, valuation reports) run in one of three sandbox modes, selected via backend/tools/sandboxed_executor.py:

| Mode | Backend Class | Description |
| --- | --- | --- |
| `none` | `NoSandboxBackend` | Direct module import and execution (default, development only) |
| `docker` | `DockerSandboxBackend` | Docker container isolation with resource limits |
| `kubernetes` | `KubernetesSandboxBackend` | Kubernetes Pod isolation (reserved interface) |

Environment configuration:

SKILL_SANDBOX_MODE=none              # none | docker | kubernetes
SKILL_SANDBOX_IMAGE=python:3.11-slim
SKILL_SANDBOX_MEMORY_LIMIT=512m
SKILL_SANDBOX_CPU_LIMIT=1.0
SKILL_SANDBOX_NETWORK=none
SKILL_SANDBOX_TIMEOUT=60
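
One way to read the variables above is to load them into a typed config and translate the limits into `docker run` flags. This is a sketch, not the code in sandboxed_executor.py: `SandboxConfig` and `docker_argv` are hypothetical names, the timeout is assumed to be in seconds, and the volume mount that would make the script visible inside the container is omitted.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping

VALID_MODES = ("none", "docker", "kubernetes")


@dataclass(frozen=True)
class SandboxConfig:
    """Typed view of the SKILL_SANDBOX_* variables (hypothetical class)."""

    mode: str = "none"
    image: str = "python:3.11-slim"
    memory_limit: str = "512m"
    cpu_limit: float = 1.0
    network: str = "none"
    timeout: int = 60  # assumed to be seconds

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "SandboxConfig":
        d = cls()  # defaults mirror the values listed above
        mode = env.get("SKILL_SANDBOX_MODE", d.mode)
        if mode not in VALID_MODES:
            raise ValueError(f"unsupported SKILL_SANDBOX_MODE: {mode!r}")
        return cls(
            mode=mode,
            image=env.get("SKILL_SANDBOX_IMAGE", d.image),
            memory_limit=env.get("SKILL_SANDBOX_MEMORY_LIMIT", d.memory_limit),
            cpu_limit=float(env.get("SKILL_SANDBOX_CPU_LIMIT", d.cpu_limit)),
            network=env.get("SKILL_SANDBOX_NETWORK", d.network),
            timeout=int(env.get("SKILL_SANDBOX_TIMEOUT", d.timeout)),
        )


def docker_argv(cfg: SandboxConfig, script: str) -> List[str]:
    """Build a `docker run` command line applying the configured limits."""
    return [
        "docker", "run", "--rm",
        "--memory", cfg.memory_limit,
        "--cpus", str(cfg.cpu_limit),
        "--network", cfg.network,
        cfg.image,
        "python", script,  # script path inside the container (mount omitted)
    ]
```

A caller could enforce the time limit by passing `cfg.timeout` to `subprocess.run(..., timeout=cfg.timeout)` when executing the resulting command.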

The default none mode displays a runtime security warning on first execution as a reminder that scripts run without isolation. Production deployments should use docker mode with appropriate resource limits.
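
The first-execution warning described above could be implemented with a small stateful guard. The class name, the message text, and the use of `RuntimeWarning` are assumptions for this sketch; the project's actual warning mechanism may differ.

```python
import warnings


class UnsandboxedWarning:
    """Emit a security warning the first time an unsandboxed run is seen."""

    def __init__(self) -> None:
        self._shown = False

    def check(self, mode: str) -> bool:
        """Warn once for mode 'none'; return True if a warning was emitted."""
        if mode == "none" and not self._shown:
            self._shown = True
            warnings.warn(
                "Skill scripts are running without sandbox isolation; "
                "set SKILL_SANDBOX_MODE=docker outside development.",
                RuntimeWarning,
                stacklevel=2,
            )
            return True
        return False
```

Keeping the flag on an instance rather than in a module-level global makes the once-only behavior easy to reset in tests.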

Migration Roadmap

Current State

The system is in a transitional state:

Microservice infrastructure is operational: runtime_service can start and stop the Gateway as a subprocess
Pipeline logic remains in the Gateway: full Pipeline execution still happens within the Gateway process
Standalone mode is preserved: direct backend.main startup is kept for compatibility

Future Direction

Phase 1: Documentation and startup convergence (active)

Clarify runtime modes and their use cases
Unify documentation across all entry points

Phase 2: Runtime model consolidation

Ensure all runtime state lives under runs/<run_id>/
Remove dependencies on root-level legacy directories

Phase 3: Pipeline decomposition (planned)

Extract Pipeline stages into independent services
Gateway becomes a thin event router
runtime_service evolves into full orchestrator

Phase 4: Standalone mode deprecation (future)

Remove direct backend.main entry point
All deployments use microservice mode

Legacy Compatibility

These items still exist, but they are not the recommended source of truth for new development:

root-level runtime data directories such as live/, production/, backtest/
direct backend.main startup as the primary development path

The current runtime still creates legacy AnalystAgent / RiskAgent / PMAgent instances directly. EvoAgent remains an in-progress migration target, not the default execution path.
