feat(agent): complete EvoAgent integration for all 6 agent roles

Migrate all agent roles from Legacy to EvoAgent architecture:
- fundamentals_analyst, technical_analyst, sentiment_analyst, valuation_analyst
- risk_manager, portfolio_manager

Key changes:
- EvoAgent now supports Portfolio Manager compatibility methods
  (_make_decision, get_decisions, get_portfolio_state,
  load_portfolio_state, update_portfolio)
- Add UnifiedAgentFactory for centralized agent creation
- ToolGuard with batch approval API and WebSocket broadcast
- Legacy agents marked deprecated (AnalystAgent, RiskAgent, PMAgent)
- Remove backend/agents/compat.py migration shim
- Add run_id alongside workspace_id for semantic clarity
- Complete integration test coverage (13 tests)
- All smoke tests passing for 6 agent roles

Constraint: Must maintain backward compatibility with existing run configs
Constraint: Memory support must work with EvoAgent (no fallback to Legacy)
Rejected: Separate PM implementation for EvoAgent | unified approach cleaner
Confidence: high
Scope-risk: broad
Directive: EVO_AGENT_IDS env var still respected but defaults to all roles
Not-tested: Kubernetes sandbox mode for skill execution
2026-04-02 00:55:08 +08:00
parent 0fa413380c
commit 16b54d5ccc
73 changed files with 9454 additions and 904 deletions
--- a/docs/current-architecture.md
+++ b/docs/current-architecture.md
@@ -0,0 +1,202 @@
+# Current Architecture
+
+This file describes the current code-supported architecture only. Historical
+paths and partial migrations are intentionally excluded unless called out as
+legacy compatibility.
+
+Reference material:
+
+- visual diagram: [current-architecture.excalidraw](./current-architecture.excalidraw)
+- next-step roadmap: [development-roadmap.md](./development-roadmap.md)
+- legacy inventory: [legacy-inventory.md](./legacy-inventory.md)
+- terminology guide: [terminology.md](./terminology.md)
+
+## Runtime Modes
+
+The system supports two distinct runtime modes:
+
+### Standalone Mode (Legacy Compatibility)
+
+Direct Gateway startup via `backend.main` as a monolithic entry point.
+
+```bash
+python -m backend.main --mode live --port 8765
+```
+
+**Characteristics:**
+- Single process runs Gateway, Pipeline, Market Service, and Scheduler
+- No service discovery or process management
+- Suitable for single-node deployments and quick testing
+- All components share the same memory space
+
+**Use cases:**
+- Quick local testing without service orchestration
+- Single-node production deployments
+- Backward compatibility with legacy startup scripts
+
+### Microservice Mode (Default for Development)
+
+Split-service architecture with a dedicated `runtime_service` managing the Gateway lifecycle.
+
+```bash
+./start-dev.sh  # Starts all services including runtime_service and Gateway
+```
+
+**Characteristics:**
+- `runtime_service` (:8003) acts as Gateway Process Manager
+- Gateway runs as a subprocess managed by `runtime_service`
+- Clear separation between Control Plane (`runtime_service`) and Data Plane (Gateway)
+- Service discovery via environment variables
+- Independent scaling and deployment of each service
+
+**Use cases:**
+- Local development with hot-reload
+- Multi-node deployments
+- Production environments requiring service isolation
+
+## Mode Comparison
+
+| Aspect | Standalone Mode | Microservice Mode |
+|--------|-----------------|-------------------|
+| **Entry point** | `python -m backend.main` | `./start-dev.sh` or individual services |
+| **Process model** | Single monolithic process | Multiple specialized processes |
+| **Gateway management** | Self-contained | Managed by runtime_service |
+| **Service discovery** | None (in-process) | Environment variable based |
+| **Hot reload** | Full restart required | Per-service reload |
+| **Scaling** | Vertical only | Horizontal possible |
+| **Complexity** | Lower | Higher |
+| **Use case** | Testing, simple deployments | Development, production |
+
+## Default Runtime Shape (Microservice Mode)
+
+The active runtime path is:
+
+`frontend -> frontend_service proxy or direct split-service calls -> runtime_service/control APIs -> gateway subprocess -> market/pipeline/storage`
+
+Current service surfaces:
+
+- `backend.apps.agent_service` on `:8000`
+  - control plane for workspaces, agents, skills, approvals
+- `backend.apps.trading_service` on `:8001`
+  - read-only trading data APIs
+- `backend.apps.news_service` on `:8002`
+  - read-only explain/news APIs
+- `backend.apps.runtime_service` on `:8003`
+  - runtime lifecycle and gateway process management
+- `backend.apps.openclaw_service` on `:8004`
+  - optional OpenClaw REST facade
+- gateway WebSocket on `:8765`
+  - live feed/event transport and pipeline coordination
+
+### Control Plane vs Data Plane
+
+**Control Plane (runtime_service :8003):**
+- Gateway lifecycle management (start/stop/restart)
+- Runtime configuration and bootstrap
+- Process health monitoring
+- Run history and state snapshots
+
+**Data Plane (Gateway :8765):**
+- WebSocket event streaming
+- Market data ingestion
+- Pipeline execution (analysis -> decision -> execution)
+- Real-time trading operations
+
+## Runtime Data Layout
+
+The canonical runtime data root is:
+
+- `runs/<run_id>/`
+
+Important files under each run:
+
+- `runs/<run_id>/BOOTSTRAP.md`
+  - machine-readable front matter plus run-scoped prompt body
+- `runs/<run_id>/agents/<agent_id>/`
+  - run-scoped agent workspace files and active/local skills
+- `runs/<run_id>/state/runtime_state.json`
+  - runtime snapshot
+- `runs/<run_id>/state/server_state.json`
+  - server-side state (portfolio, trades, market data)
+- `runs/<run_id>/team_dashboard/*.json`
+  - compatibility/export layer for dashboard consumers
+  - can be disabled in controlled environments via `ENABLE_DASHBOARD_COMPAT_EXPORTS=false`
+
+## Workspace Terms
+
+Two similarly named concepts still exist in the repository:
+
+- `workspaces/`
+  - design-time registry and CRUD surface exposed by `agent_service`
+- `runs/<run_id>/`
+  - actual runtime state, agent assets, skills, bootstrap config, and logs
+
+When reading current runtime code, prefer `runs/<run_id>/` as the source of
+truth. The `workspaces/` registry is not the default execution path.
+
+## Skill Sandbox Execution
+
+Skill scripts (analysis tools, valuation reports) can be executed in multiple
+sandbox modes via `backend/tools/sandboxed_executor.py`:
+
+| Mode | Backend Class | Description |
+|------|---------------|-------------|
+| `none` | `NoSandboxBackend` | Direct module import and execution (default, development only) |
+| `docker` | `DockerSandboxBackend` | Docker container isolation with resource limits |
+| `kubernetes` | `KubernetesSandboxBackend` | Kubernetes Pod isolation (reserved interface, not yet tested) |
+
+Environment configuration:
+
+```bash
+SKILL_SANDBOX_MODE=none              # none | docker | kubernetes
+SKILL_SANDBOX_IMAGE=python:3.11-slim
+SKILL_SANDBOX_MEMORY_LIMIT=512m
+SKILL_SANDBOX_CPU_LIMIT=1.0
+SKILL_SANDBOX_NETWORK=none
+SKILL_SANDBOX_TIMEOUT=60
+```
+
+The default `none` mode displays a runtime security warning on first execution
+as a reminder that scripts run without isolation. Production deployments should
+use `docker` mode with appropriate resource limits.
+
+## Migration Roadmap
+
+### Current State
+
+The system is in a transitional state:
+
+1. **Microservice infrastructure is operational** - `runtime_service` can start/stop the Gateway as a subprocess
+2. **Pipeline logic remains in Gateway** - full Pipeline execution still happens within the Gateway process
+3. **Standalone mode is preserved** - direct `backend.main` startup is kept for compatibility
+
+### Future Direction
+
+Phase 1: Documentation and startup convergence (active)
+- Clarify runtime modes and their use cases
+- Unify documentation across all entry points
+
+Phase 2: Runtime model consolidation
+- Ensure all runtime state lives under `runs/<run_id>/`
+- Remove dependencies on root-level legacy directories
+
+Phase 3: Pipeline decomposition (planned)
+- Extract Pipeline stages into independent services
+- Gateway becomes a thin event router
+- `runtime_service` evolves into a full orchestrator
+
+Phase 4: Standalone mode deprecation (future)
+- Remove direct `backend.main` entry point
+- All deployments use microservice mode
+
+## Legacy Compatibility
+
+These items still exist, but they are not the recommended source of truth for
+new development:
+
+- root-level runtime data directories such as `live/`, `production/`, `backtest/`
+- direct `backend.main` startup as the primary development path
+
+The current runtime still creates legacy `AnalystAgent` / `RiskAgent` /
+`PMAgent` instances directly. EvoAgent remains an in-progress migration target,
+not the default execution path.