cillin 16b54d5ccc feat(agent): complete EvoAgent integration for all 6 agent roles

Migrate all agent roles from Legacy to EvoAgent architecture:
- fundamentals_analyst, technical_analyst, sentiment_analyst, valuation_analyst
- risk_manager, portfolio_manager

Key changes:
- EvoAgent now supports Portfolio Manager compatibility methods (_make_decision,
  get_decisions, get_portfolio_state, load_portfolio_state, update_portfolio)
- Add UnifiedAgentFactory for centralized agent creation
- ToolGuard with batch approval API and WebSocket broadcast
- Legacy agents marked deprecated (AnalystAgent, RiskAgent, PMAgent)
- Remove backend/agents/compat.py migration shim
- Add run_id alongside workspace_id for semantic clarity
- Complete integration test coverage (13 tests)
- All smoke tests passing for 6 agent roles

Constraint: Must maintain backward compatibility with existing run configs
Constraint: Memory support must work with EvoAgent (no fallback to Legacy)
Rejected: Separate PM implementation for EvoAgent | unified approach cleaner
Confidence: high
Scope-risk: broad
Directive: EVO_AGENT_IDS env var still respected but defaults to all roles
Not-tested: Kubernetes sandbox mode for skill execution

2026-04-02 00:55:08 +08:00

5.7 KiB

Skill Template (Anthropic + AgentScope Aligned)

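This template defines a specification for skills that are executable, routable, and evaluable. Every SKILL.md should cover at least the six sections below.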

Frontmatter Spec

All SKILL.md files should begin with a YAML frontmatter block:

---
name: skill_name          # Required. Unique identifier for the skill.
description: ...          # Required. One-line description of the skill.
version: "1.0.0"          # Optional. Semantic version string.
tools: [...]              # Optional. Tools provided or used by this skill.
allowed_tools: [...]      # Optional. List of tool names permitted when this skill is active.
denied_tools: [...]       # Optional. List of tool names denied when this skill is active.
---

Frontmatter Fields

Field	Type	Description
`name`	string	Unique skill identifier (kebab-case recommended).
`description`	string	Human-readable one-line description.
`version`	string	Semantic version (e.g., `"1.0.0"`).
`tools`	list[string]	Tools provided by or associated with this skill.
`allowed_tools`	list[string]	Enumerates which tools are permitted when this skill is active. If set, only these tools may be used.
`denied_tools`	list[string]	Enumerates which tools are forbidden when this skill is active. Denied tools take precedence over `allowed_tools`.

Tool Restriction Rules

If only allowed_tools is set: only those tools are accessible.
If only denied_tools is set: all tools except those are accessible.
If both are set: allowed_tools defines the initial set, then denied_tools removes from it.
Denial takes precedence: a tool in denied_tools is always blocked even if also in allowed_tools.
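
The four rules above can be sketched as a single set computation. This is a minimal illustration, not the project's actual resolver: the function name `resolve_tools`, the convention that `None` means "field not set", and the tool names in the usage note are all assumptions.

```python
from typing import Iterable, Optional, Set


def resolve_tools(
    available: Iterable[str],
    allowed_tools: Optional[Iterable[str]] = None,
    denied_tools: Optional[Iterable[str]] = None,
) -> Set[str]:
    """Return the tool names usable while a skill is active (hypothetical helper)."""
    available_set = set(available)
    # Start from the allow-list when one is given, otherwise from every tool.
    if allowed_tools is not None:
        effective = available_set & set(allowed_tools)
    else:
        effective = set(available_set)
    # Denial always wins, even for tools that were explicitly allowed.
    if denied_tools is not None:
        effective -= set(denied_tools)
    return effective
```

For example, with hypothetical tools `get_prices` and `place_order`, listing `place_order` in both fields leaves only `get_prices` accessible. Under the assumption made here, an empty `allowed_tools` list means no tools at all, which differs from leaving the field unset.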

1) When to use

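State the trigger conditions explicitly (task type, keywords, scenario).
State the boundaries where the skill should not be used, to avoid false triggers.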

2) Required inputs

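List the minimum required inputs (e.g., tickers, prices, portfolio state, risk constraints).
Declare how missing inputs are handled (abort / degrade / request the missing input).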
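
One way to make the missing-input rule explicit is to attach a policy to each required input and let the strictest policy decide. The sketch below is only an illustration: `MissingInputPolicy`, `REQUIRED_INPUTS`, `check_inputs`, and the policy assigned to each input are hypothetical, not taken from the codebase.

```python
from enum import Enum
from typing import Any, Dict, List, Mapping, Tuple


class MissingInputPolicy(Enum):
    ABORT = "abort"      # stop the skill
    DEGRADE = "degrade"  # continue with reduced functionality
    REQUEST = "request"  # ask the caller to supply the input


# Hypothetical declaration: required input name -> policy when it is missing.
REQUIRED_INPUTS: Dict[str, MissingInputPolicy] = {
    "tickers": MissingInputPolicy.ABORT,
    "prices": MissingInputPolicy.REQUEST,
    "risk_constraints": MissingInputPolicy.DEGRADE,
}


def check_inputs(inputs: Mapping[str, Any]) -> Tuple[str, List[str]]:
    """Return (status, missing input names); the strictest policy hit wins."""
    missing = [name for name in REQUIRED_INPUTS if inputs.get(name) is None]
    policies = {REQUIRED_INPUTS[name] for name in missing}
    if MissingInputPolicy.ABORT in policies:
        return "abort", missing
    if MissingInputPolicy.REQUEST in policies:
        return "request", missing
    if MissingInputPolicy.DEGRADE in policies:
        return "degrade", missing
    return "ok", missing
```

The ordering abort > request > degrade is a design choice made for this sketch; a skill may reasonably rank them differently as long as its SKILL.md says so.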

3) Decision procedure

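Use a fixed sequence of steps so that results are reproducible.
For each step, state its goal, decision criteria, and output (e.g., an intermediate conclusion).
Specify how conflicts are resolved (conflicting signals, conflicting data, conflicting confidence levels).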

4) Tool call policy

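State which tool groups and tools to prefer.
Specify when a conclusion may be reached without tools, and when tool evidence must be gathered before concluding.
Specify the alternative action when a tool fails, times out, or returns an abnormal result.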
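
The alternative-action rule can be sketched as trying tools in priority order and moving on whenever one fails or returns an unusable result. The helper `call_with_alternatives` and its `is_valid` check are assumptions made for illustration; how the real agents invoke tools is not described here.

```python
from typing import Any, Callable, Optional, Sequence


def call_with_alternatives(
    tools: Sequence[Callable[[], Any]],
    is_valid: Callable[[Any], bool] = lambda result: result is not None,
) -> Optional[Any]:
    """Try each tool in priority order; return the first valid result."""
    for tool in tools:
        try:
            result = tool()
        except Exception:
            # Failure or timeout (e.g. TimeoutError): move on to the next tool.
            continue
        if is_valid(result):
            return result
    # Every tool failed: the caller should apply the skill's failure fallback.
    return None
```

Returning `None` instead of raising keeps the decision about what to do next with the skill, which matches the idea that the fallback behavior is declared per skill.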

5) Output schema

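Define standard output fields so that downstream agents can consume and evaluate them.
Recommended fields: signal, confidence, reasons, risks, invalidation, next_action.
A portfolio decision skill must include an action and a quantity for each ticker.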
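
The recommended fields could be captured in a small typed structure. The following is a sketch under assumptions: the class names, the field types, the `decisions` field name, and the [0, 1] range for `confidence` are illustrative choices, not a schema defined by the project.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class TickerDecision:
    action: str    # assumed values such as "buy", "sell", "hold"
    quantity: int


@dataclass
class SkillOutput:
    signal: str
    confidence: float
    reasons: List[str] = field(default_factory=list)
    risks: List[str] = field(default_factory=list)
    invalidation: str = ""
    next_action: str = ""
    # Filled in only by portfolio decision skills: ticker -> decision.
    decisions: Dict[str, TickerDecision] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumed convention: confidence is a probability-like score.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0, 1]")
```

Calling `asdict()` on a `SkillOutput` yields a plain nested dictionary, which is convenient when the output has to be serialized for a downstream agent.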

6) Failure fallback

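Specify the degradation strategy for insufficient data, conflicting signals, breached risk limits, and unavailable tools.
By default, prefer output that is conservative, explainable, and actionable.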
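
A conservative default might look like the sketch below: no position change, zero confidence, and the cause recorded so the output stays explainable. The function name, the `"hold"` value, and the `decisions` key are assumptions for illustration only.

```python
from typing import Any, Dict, List


def conservative_fallback(cause: str, tickers: List[str]) -> Dict[str, Any]:
    """Build a do-nothing output that still follows the recommended fields."""
    return {
        "signal": "hold",                             # conservative: no new exposure
        "confidence": 0.0,
        "reasons": [f"fallback triggered: {cause}"],  # explainable
        "risks": [cause],
        "invalidation": "",
        "next_action": "retry once the cause is resolved",
        # Actionable: an explicit zero-quantity hold for every ticker.
        "decisions": {t: {"action": "hold", "quantity": 0} for t in tickers},
    }
```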

Optional: Evaluation hooks

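Define the skill's measurable metrics, which the later memory/reflection stage writes into long-term experience.

Supported metric types

Metric type	Description	Applicable skills
`hit_rate`	Signal hit rate: how closely decision signals match actual outcomes	sentiment_review, technical_review
`risk_violation`	Risk violation rate: number of times risk-control rules are triggered	risk_review, portfolio_decisioning
`position_deviation`	Position deviation rate: gap between the recommended position and the executed position	portfolio_decisioning
`pnl_attribution`	P&L attribution consistency: how well attributed returns match actual returns	fundamental_review, valuation_review
`signal_consistency`	Signal consistency: degree of agreement among signals from multiple sources	sentiment_review
`decision_latency`	Decision latency: elapsed time from input to decision	portfolio_decisioning
`tool_usage`	Tool usage: number of tool calls and their success rate	All skills
`custom`	Custom metric	Specific business scenarios

Usage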

from backend.agents.base.evaluation_hook import EvaluationHook, MetricType

# `evaluation_hook` is assumed to be an existing EvaluationHook instance.
# At the start of skill execution
evaluation_hook.start_evaluation(
    skill_name="technical_review",
    inputs={"tickers": ["AAPL"], "prices": {...}}
)

# Add metrics while the skill is running
evaluation_hook.add_metric(
    name="signal_confidence",
    metric_type=MetricType.HIT_RATE,
    value=0.85,
    metadata={"method": "rsi", "threshold": 30}
)

# Record the results when the skill completes
evaluation_hook.record_outputs({"signal": "buy", "confidence": 0.8})
evaluation_hook.complete_evaluation(success=True)

Evaluation result storage

Evaluation results are saved automatically to runs/{run_id}/evaluations/{agent_id}/{skill_name}_{timestamp}.json
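
To make the path pattern concrete, here is a small sketch that assembles it. The helper `evaluation_path` is hypothetical, and the timestamp format is an assumption, since the format is not specified here.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import Optional


def evaluation_path(
    run_id: str,
    agent_id: str,
    skill_name: str,
    now: Optional[datetime] = None,
) -> PurePosixPath:
    """Build runs/{run_id}/evaluations/{agent_id}/{skill_name}_{timestamp}.json."""
    now = now or datetime.now(timezone.utc)
    # Assumed timestamp format; the real implementation may use another one.
    timestamp = now.strftime("%Y%m%dT%H%M%S")
    return (
        PurePosixPath("runs")
        / run_id
        / "evaluations"
        / agent_id
        / f"{skill_name}_{timestamp}.json"
    )
```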

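Skill Sandbox Execution

Skill scripts (such as valuation report generation) run through a sandbox executor, which supports three isolation modes:

Mode	Description	Use case
`none`	Direct execution, no isolation	Development (default)
`docker`	Docker container isolation	Production
`kubernetes`	Kubernetes Pod isolation	Enterprise (reserved)

Sandbox configuration

Environment variables control sandbox behavior: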

SKILL_SANDBOX_MODE=none              # none | docker | kubernetes
SKILL_SANDBOX_IMAGE=python:3.11-slim
SKILL_SANDBOX_MEMORY_LIMIT=512m
SKILL_SANDBOX_CPU_LIMIT=1.0
SKILL_SANDBOX_NETWORK=none
SKILL_SANDBOX_TIMEOUT=60
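
As an illustration of how these variables might be consumed, the sketch below reads them into a typed configuration object. `SandboxConfig` and `load_sandbox_config` are hypothetical names. Only `none` is stated to be a default; treating the other listed values as defaults, and the timeout as seconds, are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_MODES = ("none", "docker", "kubernetes")


@dataclass(frozen=True)
class SandboxConfig:
    mode: str
    image: str
    memory_limit: str
    cpu_limit: float
    network: str
    timeout: int  # unit assumed to be seconds


def load_sandbox_config(env: Optional[Mapping[str, str]] = None) -> SandboxConfig:
    """Read SKILL_SANDBOX_* variables, falling back to the values listed above."""
    env = os.environ if env is None else env
    mode = env.get("SKILL_SANDBOX_MODE", "none")
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported SKILL_SANDBOX_MODE: {mode!r}")
    return SandboxConfig(
        mode=mode,
        image=env.get("SKILL_SANDBOX_IMAGE", "python:3.11-slim"),
        memory_limit=env.get("SKILL_SANDBOX_MEMORY_LIMIT", "512m"),
        cpu_limit=float(env.get("SKILL_SANDBOX_CPU_LIMIT", "1.0")),
        network=env.get("SKILL_SANDBOX_NETWORK", "none"),
        timeout=int(env.get("SKILL_SANDBOX_TIMEOUT", "60")),
    )
```

Accepting an explicit mapping makes the loader easy to exercise in tests without mutating the process environment.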

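Development notes

The default none mode displays a security warning on first execution.
Production environments must set SKILL_SANDBOX_MODE=docker.
Skill scripts should be free of side effects; inputs and outputs pass through function parameters and return values.
The mapping between function names and script file names is handled by FUNCTION_TO_SCRIPT_MAP (e.g., build_ev_ebitda_report lives in multiple_valuation_report.py).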
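
The function-to-script lookup can be pictured roughly as follows. This is an illustrative stand-in rather than the real map: it contains only the single entry mentioned above, and both the `script_for` helper and its `<function_name>.py` default are assumptions.

```python
from typing import Dict

# Illustrative stand-in; the single entry mirrors the example in the notes.
FUNCTION_TO_SCRIPT_MAP: Dict[str, str] = {
    "build_ev_ebitda_report": "multiple_valuation_report.py",
}


def script_for(function_name: str) -> str:
    """Resolve the script file that defines a skill function."""
    # Assumed convention: without an explicit mapping, the script is
    # named after the function itself.
    return FUNCTION_TO_SCRIPT_MAP.get(function_name, f"{function_name}.py")
```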
