feat(agent): complete EvoAgent integration for all 6 agent roles

Migrate all agent roles from Legacy to EvoAgent architecture:
- fundamentals_analyst, technical_analyst, sentiment_analyst, valuation_analyst
- risk_manager, portfolio_manager

Key changes:
- EvoAgent now supports Portfolio Manager compatibility methods (_make_decision,
  get_decisions, get_portfolio_state, load_portfolio_state, update_portfolio)
- Add UnifiedAgentFactory for centralized agent creation
- ToolGuard with batch approval API and WebSocket broadcast
- Legacy agents marked deprecated (AnalystAgent, RiskAgent, PMAgent)
- Remove backend/agents/compat.py migration shim
- Add run_id alongside workspace_id for semantic clarity
- Complete integration test coverage (13 tests)
- All smoke tests passing for 6 agent roles

Constraint: Must maintain backward compatibility with existing run configs
Constraint: Memory support must work with EvoAgent (no fallback to Legacy)
Rejected: Separate PM implementation for EvoAgent | unified approach cleaner
Confidence: high
Scope-risk: broad
Directive: EVO_AGENT_IDS env var still respected but defaults to all roles
Not-tested: Kubernetes sandbox mode for skill execution
2026-04-02 00:55:08 +08:00
parent 0fa413380c
commit 16b54d5ccc
73 changed files with 9454 additions and 904 deletions

docs/runtime-api-changes.md Normal file

@@ -0,0 +1,329 @@
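# Runtime Service API Changes
## Overview
This document describes improvements to the `runtime_service` API: new endpoints, extended response fields, and improved error handling.
## New Endpoints
### 1. GET /api/runtime/mode
Returns the current run mode (live or backtest) and related configuration.
**Response model**: `RuntimeModeResponse`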
```json
{
  "mode": "live",
  "is_backtest": false,
  "run_id": "20250401_120000",
  "schedule_mode": "daily",
  "is_running": true
}
```
**Fields**:
- `mode`: run mode, `"live"` or `"backtest"`; `"stopped"` when the runtime is stopped
- `is_backtest`: whether the runtime is in backtest mode
- `run_id`: ID of the current run
- `schedule_mode`: schedule mode, `"daily"` or `"intraday"`
- `is_running`: whether the Gateway is running
---
### 2. GET /api/runtime/gateway/health
Comprehensive Gateway health check covering process state, port connectivity, and configuration state.
**Response model**: `GatewayHealthResponse`
```json
{
  "status": "healthy",
  "checks": {
    "process": {
      "status": "healthy",
      "details": {
        "pid": 12345,
        "status": "running",
        "returncode": null
      }
    },
    "port": {
      "status": "healthy",
      "details": {
        "port": 8765,
        "accessible": true
      }
    },
    "configuration": {
      "status": "healthy",
      "details": {
        "has_runtime_manager": true
      }
    }
  },
  "timestamp": "2025-04-01T12:00:00.000000"
}
```
**Status fields**:
- `status`: overall health, one of `"healthy"`, `"degraded"`, or `"unhealthy"`
- `checks.process.status`: process state
- `checks.port.status`: port connectivity
- `checks.configuration.status`: configuration state
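The source does not say how the three per-check statuses are combined into the top-level `status`. The sketch below shows one plausible rule, purely as an assumption: all checks healthy gives `"healthy"`, a failing process check gives `"unhealthy"`, and anything else gives `"degraded"`. The helper names `port_accessible` and `overall_status` are hypothetical and are not part of `runtime_service`.

```python
import socket
from typing import Any, Dict


def port_accessible(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def overall_status(checks: Dict[str, Dict[str, Any]]) -> str:
    """Combine per-check statuses into one value (assumed rule, not from the source)."""
    statuses = [check.get("status") for check in checks.values()]
    if statuses and all(s == "healthy" for s in statuses):
        return "healthy"
    # Assumption: a missing or failing process check makes the Gateway unhealthy.
    if checks.get("process", {}).get("status") != "healthy":
        return "unhealthy"
    # Process is fine, but the port or configuration check is not.
    return "degraded"
```

Under this rule a Gateway whose process is alive but whose port is unreachable would report `"degraded"` rather than `"unhealthy"`; the actual service may draw the line differently.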
---
### 3. GET /health/gateway
Service-level Gateway health check endpoint.
**Example response**:
```json
{
  "status": "healthy",
  "checks": {
    "process": {
      "status": "healthy",
      "details": {
        "pid": 12345,
        "status": "running",
        "returncode": null
      }
    },
    "port": {
      "status": "healthy",
      "details": {
        "port": 8765,
        "accessible": true
      }
    },
    "configuration": {
      "status": "healthy",
      "details": {
        "has_runtime_manager": true
      }
    }
  },
  "timestamp": "2025-04-01T12:00:00.000000"
}
```
---
## Improved Endpoints
### GET /api/runtime/gateway/status
**New fields**:
- `process_status`: process state (`"running"`, `"exited"`, or `"not_running"`)
- `pid`: process ID
**Example response**:
```json
{
  "is_running": true,
  "port": 8765,
  "run_id": "20250401_120000",
  "process_status": "running",
  "pid": 12345
}
```
---
### GET /health
**Improved response structure**:
```json
{
  "status": "healthy",
  "service": "runtime-service",
  "gateway": {
    "running": true,
    "port": 8765,
    "pid": 12345,
    "process_status": "running",
    "returncode": null
  }
}
```
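**Fields**:
- `status`: overall service status (takes the Gateway process state into account)
- `gateway.running`: whether the Gateway is running
- `gateway.pid`: Gateway process ID
- `gateway.process_status`: detailed process state
- `gateway.returncode`: process exit code (if the process has exited)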
---
### GET /api/status
**New fields**:
- `runtime.gateway_pid`: Gateway process ID
- `runtime.gateway_process_status`: process state
**Example response**:
```json
{
  "status": "operational",
  "service": "runtime-service",
  "runtime": {
    "gateway_running": true,
    "gateway_port": 8765,
    "gateway_pid": 12345,
    "gateway_process_status": "running",
    "has_runtime_manager": true
  }
}
```
---
### POST /api/runtime/start
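**Improved error messages**:
When startup fails, the response includes detailed error information:
- The process exit code
- Recent log output (up to 4000 characters)
- Detected configuration issues
**Example error response**: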
```json
{
  "detail": "Gateway process exited unexpectedly\nExit code: 1\nRecent log output:\n[ERROR] FINNHUB_API_KEY not set...\nConfiguration issues detected: FINNHUB_API_KEY environment variable is required for live mode"
}
```
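The three parts of the startup error can be assembled as in the sketch below. It assumes the message layout shown in the example response and assumes that the most recent 4000 characters (the tail of the log) are the ones kept. `build_start_error` is a hypothetical name, not a function of `runtime_service`.

```python
from typing import List, Optional

# The document states that at most 4000 characters of log output are returned.
MAX_LOG_CHARS = 4000


def build_start_error(returncode: Optional[int], log_text: str, issues: List[str]) -> str:
    """Assemble a startup-failure detail string from exit code, log tail, and config issues."""
    parts = ["Gateway process exited unexpectedly"]
    if returncode is not None:
        parts.append(f"Exit code: {returncode}")
    if log_text:
        # Assumption: keep the tail of the log, since the newest lines matter most.
        parts.append("Recent log output:\n" + log_text[-MAX_LOG_CHARS:])
    if issues:
        parts.append("Configuration issues detected: " + "; ".join(issues))
    return "\n".join(parts)
```

Because everything lands in a single `detail` string, clients that already display `detail` verbatim need no changes.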
---
### POST /api/runtime/stop
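**Improved error messages**:
- If the Gateway process has already exited, the response includes its exit code and PID
- If stopping fails, the response states the specific reason
**Example error response (process already exited)**: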
```json
{
  "detail": "No runtime is currently running. Previous Gateway process exited with code 1. PID: 12345"
}
```
**Success response**:
```json
{
  "status": "stopped",
  "message": "Runtime stopped successfully (PID: 12345)"
}
```
---
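## Configuration Validation
### Validation at Startup
Before the Gateway starts, the following configuration is validated automatically:
1. **Mode**
   - `mode` must be `"live"` or `"backtest"`
2. **Environment variables**
   - Live mode requires `FINNHUB_API_KEY`
   - `MODEL_NAME` and `OPENAI_API_KEY` are required
3. **Ticker universe**
   - `tickers` must be a non-empty list
4. **Numeric values**
   - `initial_cash` must be greater than 0
   - `margin_requirement` must be between 0 and 1
5. **Backtest dates**
   - `start_date` and `end_date` must use the `YYYY-MM-DD` format
   - `start_date` must be earlier than `end_date`
6. **Schedule mode**
   - `schedule_mode` must be `"daily"` or `"intraday"`
**Validation failure response**: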
```json
{
  "detail": "Gateway configuration validation failed: FINNHUB_API_KEY environment variable is required for live mode; initial_cash must be greater than 0"
}
```
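The six validation rules can be sketched as a function that collects error strings instead of raising on the first failure, which matches the semicolon-joined `detail` in the example. This is a minimal sketch under stated assumptions: `bootstrap` is treated as a plain dict, both `MODEL_NAME` and `OPENAI_API_KEY` are assumed to be required, the date rules are assumed to apply only in backtest mode, and every message other than the two in the example is invented for illustration. `validate_gateway_config` is a hypothetical stand-in, not the real `_validate_gateway_config`.

```python
import os
from datetime import datetime
from typing import Any, Dict, List, Mapping, Optional


def validate_gateway_config(
    bootstrap: Dict[str, Any], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return a list of validation errors; an empty list means the config passed."""
    env = os.environ if env is None else env
    errors: List[str] = []

    mode = bootstrap.get("mode")
    if mode not in ("live", "backtest"):
        errors.append('mode must be "live" or "backtest"')
    if mode == "live" and not env.get("FINNHUB_API_KEY"):
        errors.append("FINNHUB_API_KEY environment variable is required for live mode")
    for name in ("MODEL_NAME", "OPENAI_API_KEY"):  # assumption: both are required
        if not env.get(name):
            errors.append(f"{name} environment variable is required")

    tickers = bootstrap.get("tickers")
    if not isinstance(tickers, list) or not tickers:
        errors.append("tickers must be a non-empty list")

    initial_cash = bootstrap.get("initial_cash", 0)
    if not isinstance(initial_cash, (int, float)) or initial_cash <= 0:
        errors.append("initial_cash must be greater than 0")
    margin = bootstrap.get("margin_requirement", 0)
    if not isinstance(margin, (int, float)) or not 0 <= margin <= 1:
        errors.append("margin_requirement must be between 0 and 1")

    if mode == "backtest":  # assumption: dates only matter for backtests
        try:
            start = datetime.strptime(str(bootstrap.get("start_date")), "%Y-%m-%d")
            end = datetime.strptime(str(bootstrap.get("end_date")), "%Y-%m-%d")
        except ValueError:
            errors.append("start_date and end_date must use the YYYY-MM-DD format")
        else:
            if start >= end:
                errors.append("start_date must be earlier than end_date")

    if bootstrap.get("schedule_mode") not in ("daily", "intraday"):
        errors.append('schedule_mode must be "daily" or "intraday"')
    return errors
```

A caller could then build the response with `"Gateway configuration validation failed: " + "; ".join(errors)` whenever the list is non-empty, so one request reports every problem at once.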
---
## Data Models
### GatewayStatusResponse
```python
from typing import Optional
from pydantic import BaseModel


class GatewayStatusResponse(BaseModel):
    is_running: bool
    port: int
    run_id: Optional[str] = None
    process_status: Optional[str] = None  # new
    pid: Optional[int] = None  # new
```
### GatewayHealthResponse
```python
from typing import Any, Dict
from pydantic import BaseModel


class GatewayHealthResponse(BaseModel):
    status: str
    checks: Dict[str, Any]
    timestamp: str
```
### RuntimeModeResponse
```python
from typing import Optional
from pydantic import BaseModel


class RuntimeModeResponse(BaseModel):
    mode: str
    is_backtest: bool
    run_id: Optional[str] = None
    schedule_mode: Optional[str] = None
    is_running: bool
```
---
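## Architecture Improvements
### New Helper Functions
1. **`_validate_gateway_config(bootstrap)`**
   - Validates the Gateway startup configuration
   - Returns a list of validation errors
2. **`_get_gateway_process_details()`**
   - Retrieves detailed Gateway process information
   - Includes PID, status, and exit code
3. **`_check_gateway_health()`**
   - Runs the comprehensive health check
   - Checks process, port, and configuration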
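As an illustration of the second helper, the sketch below derives the three documented states (`"running"`, `"exited"`, `"not_running"`) from a `subprocess.Popen` handle. It assumes the Gateway is launched as a child process and tracked through such a handle, which the source does not confirm. `get_process_details` is a hypothetical name and takes the handle as an argument instead of reading module state.

```python
import subprocess
import sys
from typing import Any, Dict, Optional


def get_process_details(proc: Optional[subprocess.Popen]) -> Dict[str, Any]:
    """Report PID, status, and exit code for a child process handle."""
    if proc is None:
        # No process was ever started (or the handle was cleared).
        return {"pid": None, "status": "not_running", "returncode": None}
    returncode = proc.poll()  # None while the child is still alive
    status = "running" if returncode is None else "exited"
    return {"pid": proc.pid, "status": status, "returncode": returncode}


if __name__ == "__main__":
    # Demo: a child that exits immediately with code 1.
    child = subprocess.Popen([sys.executable, "-c", "raise SystemExit(1)"])
    child.wait()
    print(get_process_details(child))
```

`Popen.poll()` does not block, so a helper like this is cheap enough to call from every status or health request.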
---
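## Backward Compatibility
All improvements are backward compatible:
- Existing endpoints continue to work
- New fields are optional
- The error response format is unchanged (only the `detail` field carries more information)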
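Because the new fields are optional, a client can accept responses with or without them. The sketch below mirrors the fields of `GatewayStatusResponse` with a standard-library dataclass so that it runs without third-party packages. `GatewayStatus` and `parse_gateway_status` are hypothetical client-side names, not part of the service.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class GatewayStatus:
    is_running: bool
    port: int
    run_id: Optional[str] = None
    process_status: Optional[str] = None  # absent in older responses
    pid: Optional[int] = None  # absent in older responses


def parse_gateway_status(payload: Dict[str, Any]) -> GatewayStatus:
    """Build a GatewayStatus, tolerating payloads that lack the new fields."""
    return GatewayStatus(
        is_running=bool(payload["is_running"]),
        port=int(payload["port"]),
        run_id=payload.get("run_id"),
        process_status=payload.get("process_status"),
        pid=payload.get("pid"),
    )
```

Reading the new fields with `dict.get` keeps one client working against both older and newer versions of the service.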