feat(agent): complete EvoAgent integration for all 6 agent roles

Migrate all agent roles from Legacy to EvoAgent architecture:
- fundamentals_analyst, technical_analyst, sentiment_analyst, valuation_analyst
- risk_manager, portfolio_manager

Key changes:
- EvoAgent now supports Portfolio Manager compatibility methods (_make_decision,
  get_decisions, get_portfolio_state, load_portfolio_state, update_portfolio)
- Add UnifiedAgentFactory for centralized agent creation
- ToolGuard with batch approval API and WebSocket broadcast
- Legacy agents marked deprecated (AnalystAgent, RiskAgent, PMAgent)
- Remove backend/agents/compat.py migration shim
- Add run_id alongside workspace_id for semantic clarity
- Complete integration test coverage (13 tests)
- All smoke tests passing for 6 agent roles

Constraint: Must maintain backward compatibility with existing run configs
Constraint: Memory support must work with EvoAgent (no fallback to Legacy)
Rejected: Separate PM implementation for EvoAgent | unified approach cleaner
Confidence: high
Scope-risk: broad
Directive: EVO_AGENT_IDS env var still respected but defaults to all roles
Not-tested: Kubernetes sandbox mode for skill execution
This commit is contained in:
2026-04-02 00:55:08 +08:00
parent 0fa413380c
commit 16b54d5ccc
73 changed files with 9454 additions and 904 deletions

View File

@@ -0,0 +1,202 @@
# Current Architecture
This file describes the current code-supported architecture only. Historical
paths and partial migrations are intentionally excluded unless called out as
legacy compatibility.
Reference material:
- visual diagram: [current-architecture.excalidraw](./current-architecture.excalidraw)
- next-step roadmap: [development-roadmap.md](./development-roadmap.md)
- legacy inventory: [legacy-inventory.md](./legacy-inventory.md)
- terminology guide: [terminology.md](./terminology.md)
## Runtime Modes
The system supports two distinct runtime modes:
### Standalone Mode (Legacy Compatibility)
Direct Gateway startup via `backend.main` as a monolithic entrypoint.
```bash
python -m backend.main --mode live --port 8765
```
**Characteristics:**
- Single process runs Gateway, Pipeline, Market Service, and Scheduler
- No service discovery or process management
- Suitable for single-node deployments and quick testing
- All components share the same memory space
**Use cases:**
- Quick local testing without service orchestration
- Single-node production deployments
- Backward compatibility with legacy startup scripts
### Microservice Mode (Default for Development)
Split-service architecture with dedicated runtime_service managing the Gateway lifecycle.
```bash
./start-dev.sh # Starts all services including runtime_service and Gateway
```
**Characteristics:**
- `runtime_service` (:8003) acts as Gateway Process Manager
- Gateway runs as a subprocess managed by runtime_service
- Clear separation between Control Plane (runtime_service) and Data Plane (Gateway)
- Service discovery via environment variables
- Independent scaling and deployment of each service
**Use cases:**
- Local development with hot-reload
- Multi-node deployments
- Production environments requiring service isolation
## Mode Comparison
| Aspect | Standalone Mode | Microservice Mode |
|--------|-----------------|-------------------|
| **Entry point** | `python -m backend.main` | `./start-dev.sh` or individual services |
| **Process model** | Single monolithic process | Multiple specialized processes |
| **Gateway management** | Self-contained | Managed by runtime_service |
| **Service discovery** | None (in-process) | Environment variable based |
| **Hot reload** | Full restart required | Per-service reload |
| **Scaling** | Vertical only | Horizontal possible |
| **Complexity** | Lower | Higher |
| **Use case** | Testing, simple deployments | Development, production |
## Default Runtime Shape (Microservice Mode)
The active runtime path is:
`frontend -> frontend_service proxy or direct split-service calls -> runtime_service/control APIs -> gateway subprocess -> market/pipeline/storage`
Current service surfaces:
- `backend.apps.agent_service` on `:8000`
- control plane for workspaces, agents, skills, approvals
- `backend.apps.trading_service` on `:8001`
- read-only trading data APIs
- `backend.apps.news_service` on `:8002`
- read-only explain/news APIs
- `backend.apps.runtime_service` on `:8003`
- runtime lifecycle and gateway process management
- `backend.apps.openclaw_service` on `:8004`
- optional OpenClaw REST facade
- gateway WebSocket on `:8765`
- live feed/event transport and pipeline coordination
### Control Plane vs Data Plane
**Control Plane (runtime_service :8003):**
- Gateway lifecycle management (start/stop/restart)
- Runtime configuration and bootstrap
- Process health monitoring
- Run history and state snapshots
**Data Plane (Gateway :8765):**
- WebSocket event streaming
- Market data ingestion
- Pipeline execution (analysis -> decision -> execution)
- Real-time trading operations
## Runtime Data Layout
The canonical runtime data root is:
- `runs/<run_id>/`
Important files under each run:
- `runs/<run_id>/BOOTSTRAP.md`
- machine-readable front matter plus run-scoped prompt body
- `runs/<run_id>/agents/<agent_id>/`
- run-scoped agent workspace files and active/local skills
- `runs/<run_id>/state/runtime_state.json`
- runtime snapshot
- `runs/<run_id>/state/server_state.json`
- server-side state (portfolio, trades, market data)
- `runs/<run_id>/team_dashboard/*.json`
- compatibility/export layer for dashboard consumers
- can be disabled in controlled environments via `ENABLE_DASHBOARD_COMPAT_EXPORTS=false`
## Workspace Terms
Two similarly named concepts still exist in the repository:
- `workspaces/`
- design-time registry and CRUD surface exposed by `agent_service`
- `runs/<run_id>/`
- actual runtime state, agent assets, skills, bootstrap config, and logs
When reading current runtime code, prefer `runs/<run_id>/` as the source of
truth. The `workspaces/` registry is not the default execution path.
## Skill Sandbox Execution
Skill scripts (analysis tools, valuation reports) can be executed in multiple
sandbox modes via `backend/tools/sandboxed_executor.py`:
| Mode | Backend Class | Description |
|------|---------------|-------------|
| `none` | `NoSandboxBackend` | Direct module import and execution (default, development only) |
| `docker` | `DockerSandboxBackend` | Docker container isolation with resource limits |
| `kubernetes` | `KubernetesSandboxBackend` | Kubernetes Pod isolation (reserved interface) |
Environment configuration:
```bash
SKILL_SANDBOX_MODE=none # none | docker | kubernetes
SKILL_SANDBOX_IMAGE=python:3.11-slim
SKILL_SANDBOX_MEMORY_LIMIT=512m
SKILL_SANDBOX_CPU_LIMIT=1.0
SKILL_SANDBOX_NETWORK=none
SKILL_SANDBOX_TIMEOUT=60
```
The default `none` mode displays a runtime security warning on first execution
as a reminder that scripts run without isolation. Production deployments should
use `docker` mode with appropriate resource limits.
## Migration Roadmap
### Current State
The system is in a transitional state:
1. **Microservice infrastructure is operational** - runtime_service can start/stop Gateway as subprocess
2. **Pipeline logic remains in Gateway** - full Pipeline execution still happens within Gateway process
3. **Standalone mode is preserved** - direct `backend.main` startup for compatibility
### Future Direction
Phase 1: Documentation and startup convergence (active)
- Clarify runtime modes and their use cases
- Unify documentation across all entry points
Phase 2: Runtime model consolidation
- Ensure all runtime state lives under `runs/<run_id>/`
- Remove dependencies on root-level legacy directories
Phase 3: Pipeline decomposition (planned)
- Extract Pipeline stages into independent services
- Gateway becomes a thin event router
- runtime_service evolves into full orchestrator
Phase 4: Standalone mode deprecation (future)
- Remove direct `backend.main` entry point
- All deployments use microservice mode
## Legacy Compatibility
These items still exist, but they are not the recommended source of truth for
new development:
- root-level runtime data directories such as `live/`, `production/`, `backtest/`
- direct `backend.main` startup as the primary development path
The current runtime still creates legacy `AnalystAgent` / `RiskAgent` /
`PMAgent` instances directly. EvoAgent remains an in-progress migration target,
not the default execution path.