Merge pull request #19 from ZiTao-Li/zitao/alias_init

Alias open source
2025-10-31 13:42:54 +08:00
parent ef5c7d9aab 66bf6f4c18
commit 87047e9c8c
82 changed files with 10776 additions and 24 deletions
--- a/README.md
+++ b/README.md
@@ -59,6 +59,7 @@ It includes **agent deployment** and **secure sandboxed tool execution**, and ca
 ## 🌳 Repository Structure
 ```bash
 ├── alias/                                  # Agent to solve real-world problems
 ├── browser_use/
 │   ├── agent_browser/                      # Pure Python browser agent
 │   └── browser_use_fullstack_runtime/      # Full-stack runtime version with frontend/backend
@@ -88,7 +89,7 @@ It includes **agent deployment** and **secure sandboxed tool execution**, and ca
 ## 📌 Example List
 | Category                | Example Folder                                        | Uses AgentScope | Use AgentScope Runtime | Description                                                               |
-| ----------------------- |-------------------------------------------------------| --------------- | ------------ |--------------------------------------------------|
+|-------------------------|-------------------------------------------------------|-----------------|------------------------|---------------------------------------------------------------------------|
 | **Browser Use**         | browser_use/agent_browser                             | ✅               | ❌                      | Command-line browser automation using AgentScope                          |
 |                         | browser_use/browser_use_fullstack_runtime             | ✅               | ✅                      | Full-stack browser automation with UI & sandbox                           |
 | **Deep Research**       | deep_research/agent_deep_research                     | ✅               | ❌                      | Multi-agent research pipeline                                             |
@@ -99,6 +100,7 @@ It includes **agent deployment** and **secure sandboxed tool execution**, and ca
 |                         | conversational_agents/multiagent_conversation         | ✅               | ❌                      | Multi-agent dialogue scenario                                             |
 |                         | conversational_agents/multiagent_debate               | ✅               | ❌                      | Agents engaging in debates                                                |
 | **Evaluation**          | evaluation/ace_bench                                  | ✅               | ❌                      | Benchmarks with ACE Bench                                                 |
 | **Alias**               | alias/                                                | ✅               | ✅                      | Agent application running in sandbox to solve diverse real-world problems |
 ------
--- a/README_zh.md
+++ b/README_zh.md
@@ -59,6 +59,7 @@ AgentScope Runtime 是一个**全面的运行时框架**，主要解决部署和
 ## 🌳 仓库结构
 ```bash
 ├── alias/                                  # 解决现实问题的智能体程序
 ├── browser_use/
 │   ├── agent_browser/                      # 纯 Python 浏览器 Agent
 │   └── browser_use_fullstack_runtime/      # 全栈运行时版本（前端+后端）
@@ -88,7 +89,7 @@ AgentScope Runtime 是一个**全面的运行时框架**，主要解决部署和
 ## 📌 示例列表
 | 分类        | 示例文件夹                                                 | 使用 AgentScope | 使用 AgentScope Runtime | 描述                      |
-| ----------------------- |-----------------------------------------------------| --------------- | ----------------------- |------|
+|-----------|-------------------------------------------------------|---------------|-----------------------|-------------------------|
 | **浏览器相关** | browser_use/agent_browser                             | ✅             | ❌                     | 基于 AgentScope 的命令行浏览器自动化 |
 |           | browser_use/browser_use_fullstack_runtime             | ✅             | ✅                     | 带 UI 和沙盒环境的全栈浏览器自动化     |
 | **深度研究**  | deep_research/agent_deep_research                     | ✅             | ❌                     | 多 Agent 研究流程            |
@@ -99,6 +100,7 @@ AgentScope Runtime 是一个**全面的运行时框架**，主要解决部署和
 |           | conversational_agents/multiagent_conversation         | ✅             | ❌                     | 多 Agent 对话场景            |
 |           | conversational_agents/multiagent_debate               | ✅             | ❌                     | Agent 辩论                |
 | **评估**    | evaluation/ace_bench                                  | ✅             | ❌                     | ACE Bench 基准测试          |
 | **Alias** | alias/                                                | ✅             | ✅                     | 在沙盒中运行的可以解决真实问题的智能体程序   |
 ---
--- a/alias/.gitignore
+++ b/alias/.gitignore
@@ -0,0 +1,18 @@
 # Byte-compiled / optimized / DLL files
 __pycache__/
 # macOS
 .DS_Store
 # Logs
 logs/
 src/alias/agent/agents/log/
 sessions_mount_dir/
 # Python
 *.py[cod]
 *$py.class
 # Package
 alias.egg-info/
--- a/alias/README.md
+++ b/alias/README.md
@@ -0,0 +1,108 @@
 <div align="center">
 <img src="assets/alias.png" alt="Alias-Agent Logo" width="500" height="250" style="vertical-align: middle; margin-right: 20px;">
 <h1 style="text-decoration: none; border-bottom: none; display: inline; vertical-align: middle; margin: 0;">Alias-Agent: Start It Now, Extend It Your Way, Deploy All with Ease</h1>
 </div>
 ## Project introduction
 *Alias-Agent* (short for *Alias*) is an LLM-empowered agent built on [AgentScope](https://github.com/agentscope-ai/agentscope) and [AgentScope-runtime](https://github.com/agentscope-ai/agentscope-runtime/), designed to solve diverse real-world problems. It provides three operational modes for flexible task execution:
 - **Simple React**: Employs vanilla reasoning-acting loops to iteratively solve problems and execute tool calls.
 - **Planner-Worker**: Uses intelligent planning to decompose complex tasks into manageable subtasks, with dedicated worker agents handling each subtask independently.
 - **Built-in Agents**: Leverages specialized agents tailored for specific domains, including *Deep Research Agent* for comprehensive analysis and *Browser-use Agent* for web-based interactions. More details can refer to the following "Basic Usage" section.
 We aim for Alias to serve as an out-of-the-box solution that users can readily deploy for various tasks.
 ## Coming soon
 Beyond being a ready-to-use agent, we envision Alias as a foundational template that can be adapted to different scenarios. Developers can extend and customize Alias at the tool, prompt, and agent levels to meet their specific requirements.
 We are actively developing specialized enhancements and adaptations for:
 - **Business Intelligence (BI)** scenarios
 - **Financial** analysis applications
 - **Question-Answering (QA)** systems
 Stay tuned for upcoming releases!
 ## Installation
 Install the Alias package in development mode:
 ```bash
 pip install -e .
 # SETUP SANDBOX
 # If you are using colima, then you need to run the following
 # export DOCKER_HOST=unix://$HOME/.colima/default/docker.sock
 # More details can refer to https://runtime.agentscope.io/en/sandbox.html
 # Option 1: Pull from registry
 export RUNTIME_SANDBOX_REGISTRY=agentscope-registry.ap-southeast-1.cr.aliyuncs.com
 docker pull agentscope-registry.ap-southeast-1.cr.aliyuncs.com/agentscope/runtime-sandbox-alias:latest
 # Option 2: pull from docker hub
 docker pull agentscope/runtime-sandbox-alias:latest
 ```
 This will install the `alias` command-line tool.
 ## Basic Usage
 The `alias` CLI provides a terminal interface to run AI agents for various tasks.
 ### Run Command
 First of all, set up API keys
 ```bash
 # Model API keys
 export DASHSCOPE_API_KEY=your_dashscope_api_key_here
 # Using other models: go to src/alias/agent/run.py and add your model to MODEL_FORMATTER_MAPPING, then run the bash to set your model and api key. For example:
 #export MODEL=gpt-5
 #export OPENAI_API_KEY=your_openai_api_key_here
 # Search api key (required for deep research)
 export TAVILY_API_KEY=your_tavily_api_key_here
 ```
 Execute an agent task:
 ```bash
 alias_agent run --task "Your task description here"
 ```
 ### Examples
 #### Run with all agents (Meta Planner with workers):
 ```bash
 alias_agent run --task "Analyze Meta stock performance in Q1 2025"
 ```
 #### Run with only browser agent:
 ```bash
 alias_agent run --mode browser --task "Search five latest research papers about browser-use agent"
 ```
 #### Upload files to sandbox workspace:
 ```bash
 # Upload a single file
 alias_agent run --task "Analyze this data" --files data.csv
 # Upload multiple files
 alias_agent run --task "Process these files and create a summary report" --files report.txt data.csv notes.md
 # Using short form (-f)
 alias_agent run --task "Review the documents" -f document1.pdf document2.txt
 # Combine with other options
 alias_agent run --mode all --task "Analyze the data and generate insights" --files dataset.csv --verbose
 ```
 **Note**: Files uploaded with `--files` are automatically copied to the `/workspace` directory in the sandbox with their original filenames, making them immediately accessible to the agent.
 ### Obtain agent-generated files
 In the directory where you ran `alias_agent`, you should find a `sessions_mount_dir` directory with subdirectories, each containing the content from `/workspace` of the sandboxes' mounted file systems. All generated files should be located there.
--- a/alias/assets/alias.png
+++ b/alias/assets/alias.png
--- a/alias/pyproject.toml
+++ b/alias/pyproject.toml
@@ -0,0 +1,72 @@
 [project]
 name = "alias"
 version = "0.0.1"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.10"
 dependencies = [
    "aiohttp>=3.11.16",
    "dashscope>=1.23.1",
    "docker>=7.1.0",
    "fastapi>=0.115.12",
    "gitpython>=3.1.44",
    "ipython>=8.35.0",
    "mcp>=1.6.0",
    "playwright>=1.51.0",
    "steel-sdk>=0.1.0",
    "uvicorn>=0.34.0",
    "redis>=6.0.0b2",
    "celery[redis]>=5.3.1",
    "fastapi-limiter>=0.1.6",
    "pyjwt>=2.10.1",
    "sqlmodel>=0.0.24",
    "pydantic-settings>=2.9.1",
    "pydantic[email]>=2.11.3",
    "user-agents>=2.2.0",
    "passlib>=1.7.4",
    "authlib>=1.5.2",
    "loguru>=0.6.0",
    "python-multipart>=0.0.20",
    "elasticsearch>=9.0.0",
    "jinja2>=3.1.6",
    "bcrypt==4.0.1",
    "agentscope[full]>=1.0.6",
    "tenacity>=8.5.0",
    "apscheduler>=3.11.0",
    "chardet>=5.2.0",
    "aiofiles>=23.2.1",
    "pygments>=2.19.1",
    "markdown2>=2.5.3",
    "psycopg2-binary>=2.9.10",
    "alembic>=1.16.1",
    "opentelemetry-api==1.27.0",
    "opentelemetry-sdk==1.27.0",
    "opentelemetry-exporter-otlp==1.27.0",
    "opentelemetry-instrumentation-celery==0.48b0",
    "opentelemetry-instrumentation-fastapi==0.48b0",
    "opentelemetry-instrumentation-sqlalchemy==0.48b0",
    "opentelemetry-instrumentation-redis==0.48b0",
    "sentry-sdk[fastapi]===2.30.0",
    "agentscope-runtime>=0.1.6",
    "aiosqlite>=0.21.0",
    "asyncpg>=0.30.0"
 ]
 [project.optional-dependencies]
 dev = [
    "pre-commit>=4.3.0",
    "pytest>=8.3.5",
 ]
 [project.scripts]
 alias_agent = "alias.cli:main"
 [tool.setuptools]
 packages = { find = { where = ["src"] } }
 [tool.setuptools.package-dir]
 "" = "src"
 [build-system]
 requires = ["setuptools>=42", "wheel"]
 build-backend = "setuptools.build_meta"
--- a/alias/src/init.py
+++ b/alias/src/init.py
--- a/alias/src/alias/init.py
+++ b/alias/src/alias/init.py
@@ -0,0 +1,5 @@
 # -*- coding: utf-8 -*-
 """Alias - Beta version"""
 __version__ = "0.0.1"
--- a/alias/src/alias/agent/init.py
+++ b/alias/src/alias/agent/init.py
--- a/alias/src/alias/agent/agents/init.py
+++ b/alias/src/alias/agent/agents/init.py
@@ -0,0 +1,16 @@
 # -*- coding: utf-8 -*-
 from alias.agent.agents._alias_agent_base import AliasAgentBase
 from alias.agent.agents._meta_planner import MetaPlanner
 from alias.agent.agents._browser_agent import BrowserAgent
 from alias.agent.agents._react_worker import ReActWorker
 from alias.agent.agents._deep_research_agent import DeepResearchAgent
 from alias.agent.agents._planning_tools import share_tools
 __all__ = [
    "AliasAgentBase",
    "MetaPlanner",
    "BrowserAgent",
    "ReActWorker",
    "DeepResearchAgent",
    "share_tools",
 ]
--- a/alias/src/alias/agent/agents/_agent_hooks.py
+++ b/alias/src/alias/agent/agents/_agent_hooks.py
@@ -0,0 +1,261 @@
 # -*- coding: utf-8 -*-
 # mypy: disable-error-code="has-type"
 import json
 from typing import Literal, Any, Optional, TYPE_CHECKING, Union
 from agentscope.message import Msg
 from agentscope import logger
 from alias.agent.utils import AliasAgentStates
 from alias.agent.utils.constants import DEFAULT_PLANNER_NAME
 if TYPE_CHECKING:
    from alias.agent.agents import MetaPlanner
    from alias.agent.agents._alias_agent_base import (
        AliasAgentBase,
    )
 else:
    AliasAgentBase = "alias.agent.agents.AliasAgentBase"
    MetaPlanner = "alias.agent.agents.MetaPlanner"
 PlannerStage = Literal["post_reasoning", "post_action", "pre_reasoning"]
 def _infer_planner_stage_with_msg(
    cur_msg: Msg,
 ) -> tuple[PlannerStage, list[str]]:
    """
    Infer the planner stage and extract tool names from a message.
    Analyzes a message to determine the current stage of the planner workflow
    and extracts any tool names if tool calls are present in the message.
    Args:
        cur_msg (Msg): The message to analyze for stage inference.
    Returns:
        tuple[PlannerStage, list[str]]: A tuple containing:
            - PlannerStage: One of "pre_reasoning", "post_reasoning", or
                "post_action"
            - list[str]: List of tool names found in tool_use or
                tool_result blocks
    Note:
        - "pre_reasoning": System role messages with string content
        - "post_reasoning": Messages with tool_use blocks or plain text content
        - "post_action": Messages with tool_result blocks
        - Tool names are extracted from both tool_use and tool_result blocks
    """
    blocks = cur_msg.content
    if isinstance(blocks, str) and cur_msg.role in ["system", "user"]:
        return "pre_reasoning", []
    cur_tool_names = [
        str(b.get("name", "no_name_tool"))
        for b in blocks
        if b["type"] in ["tool_use", "tool_result"]
    ]
    if cur_msg.has_content_blocks("tool_result"):
        return "post_action", cur_tool_names
    elif cur_msg.has_content_blocks("tool_use"):
        return "post_reasoning", cur_tool_names
    else:
        return "post_reasoning", cur_tool_names
 async def _update_and_save_state_with_session(
    self: AliasAgentBase,
 ) -> None:
    global_state = await self.session_service.get_state()
    if global_state is None:
        global_state = AliasAgentStates()
    else:
        global_state = AliasAgentStates(**global_state)
    # update global state
    global_state.agent_states[self.name] = self.state_dict()
    await self.session_service.create_state(
        content=global_state.model_dump(),
    )
 async def _update_and_save_plan_with_session(
    self: MetaPlanner,
 ) -> None:
    content = self.planner_notebook.model_dump(
        exclude="full_tool_list",
    )
    await self.session_service.create_plan(
        content=content,
    )
 async def planner_load_states_pre_reply_hook(
    self: MetaPlanner,
    kwargs: dict[str, Any],  # pylint: disable=W0613
 ) -> None:
    global_state = await self.session_service.get_state()
    if global_state is None or len(global_state) == 0:
        return
    global_state = AliasAgentStates(**global_state)
    if self.name not in global_state.agent_states:
        return
    self.load_state_dict(global_state.agent_states[self.name])
    # load worker states
    for name, (_, worker) in self.worker_manager.worker_pool.items():
        if name in global_state.agent_states:
            worker.load_state_dict(global_state.agent_states[name])
 async def update_user_input_pre_reply_hook(
    self: MetaPlanner,
    kwargs: dict[str, Any],
 ) -> None:
    """Hook for loading user input to planner notebook"""
    msg = kwargs.get("msg", None)
    if isinstance(msg, Msg):
        msg = [msg]
    elif self.session_service is not None:
        messages = await self.session_service.get_messages()
        logger.info(f"Received {len(messages)} messages")
        if messages is None:
            return
        latest_user_msg = None
        msg = []
        for cur_msg in reversed(messages):
            msg_body = cur_msg.message
            if msg_body["role"] == "user" and latest_user_msg is None:
                latest_user_msg = msg_body["content"]
            input_content = msg_body["content"]
            if len(msg_body.get("filenames", [])) > 0:
                input_content += "User Provided Attached Files:\n"
                for filename in msg_body.get("filenames", []):
                    if not filename.startswith("/workspace"):
                        filename = "/workspace/" + filename
                    input_content += f"\t{filename}\n"
            if msg_body["role"] == "user":
                msg.append(input_content)
    if isinstance(msg, list):
        self.planner_notebook.user_input = [str(m) for m in msg]
        for m in msg:
            await self.memory.add(
                Msg(
                    "user",
                    m,
                    "user",
                ),
            )
 async def save_post_reasoning_state(
    self: AliasAgentBase,
    reasoning_input: dict[str, Any],  # pylint: disable=W0613
    reasoning_output: Msg,  # pylint: disable=W0613
 ) -> None:
    """Hook func for save state after reasoning step"""
    await _update_and_save_state_with_session(self)
 async def save_post_action_state(
    self: Union[AliasAgentBase, MetaPlanner],
    action_input: dict[str, Any],  # pylint: disable=W0613
    tool_output: Optional[Msg],  # pylint: disable=W0613
 ) -> None:
    """Hook func for save state after action step"""
    await _update_and_save_state_with_session(self)
    if self.name == DEFAULT_PLANNER_NAME:
        await _update_and_save_plan_with_session(self)
 async def planner_compose_reasoning_msg_pre_reasoning_hook(
    self: "MetaPlanner",  # pylint: disable=W0613
    *args: Any,
    **kwargs: Any,
 ) -> None:
    """Hook func for composing msg for reasoning step"""
    reasoning_info = (
        "## All User Input\n{all_user_input}\n\n"
        "## Session Context\n"
        "```json\n{notebook_string}\n```\n\n"
    ).format_map(
        {
            "notebook_string": self.planner_notebook.model_dump_json(
                exclude={"user_input", "full_tool_list"},
                indent=2,
            ),
            "all_user_input": self.planner_notebook.user_input,
        },
    )
    if self.work_pattern == "simplest":
        tool_info = json.dumps(
            self.planner_notebook.full_tool_list,
            indent=2,
            ensure_ascii=False,
        )
        reasoning_info += (
            "## Additional Tool information\n"
            "The following tools can be enable in your toolkit either if you"
            "enter easy task mode (by calling `enter_easy_task_mode`) or "
            "create worker in planning-execution mode (after calling "
            "`enter_planning_execution_mode`).\n"
            "NOTICE: THE FOLLOWING TOOL IS ONLY FOR REFERENCE! "
            "DO NOT USE THEM BEFORE CALLING `enter_easy_task_mode`!\n"
            f"```json\n{tool_info}\n```\n"
        )
    reasoning_msg = Msg(
        "user",
        content=reasoning_info,
        role="user",
    )
    await self.memory.add(reasoning_msg)
 async def planner_remove_reasoning_msg_post_reasoning_hook(
    self: "MetaPlanner",  # pylint: disable=W0613
    *args: Any,
    **kwargs: Any,
 ) -> None:
    """Hook func for removing msg for reasoning step"""
    num_msgs = await self.memory.size()
    if num_msgs > 1:
        # remove the msg added by planner_compose_reasoning_pre_reasoning_hook
        await self.memory.delete(num_msgs - 2)
 async def generate_response_post_action_hook(
    self: AliasAgentBase,
    action_input: dict[str, Any],  # pylint: disable=W0613
    tool_output: Optional[Msg],  # pylint: disable=W0613
 ) -> None:
    """Hook func for printing clarification"""
    if not (hasattr(self, "session_service") and self.session_service):
        return
    if isinstance(tool_output, Msg):
        if tool_output.metadata and tool_output.metadata.get(
            "require_clarification",
            False,
        ):
            clarification_dict = {
                "clarification_question": tool_output.metadata.get(
                    "clarification_question",
                    "",
                ),
                "clarification_options": tool_output.metadata.get(
                    "clarification_options",
                    "",
                ),
            }
            msg = Msg(
                name=self.name,
                content=json.dumps(
                    clarification_dict,
                    ensure_ascii=False,
                    indent=4,
                ),
                role="assistant",
                metadata=tool_output.metadata,
            )
            await self.print(msg, last=True)
--- a/alias/src/alias/agent/agents/_alias_agent_base.py
+++ b/alias/src/alias/agent/agents/_alias_agent_base.py
@@ -0,0 +1,310 @@
 # -*- coding: utf-8 -*-
 from typing import Optional, Any, Type, Callable
 import asyncio
 import time
 from pydantic import BaseModel
 from loguru import logger
 import traceback
 import json
 from agentscope.agent import ReActAgent
 from agentscope.model import ChatModelBase
 from agentscope.formatter import FormatterBase
 from agentscope.memory import MemoryBase
 from agentscope.tracing import trace_reply
 from agentscope.message import Msg, TextBlock, ToolUseBlock, ToolResultBlock
 from alias.agent.tools import AliasToolkit
 from alias.agent.utils.constants import DEFAULT_PLANNER_NAME
 from alias.agent.utils.agent_save_state import AliasAgentStates
 from alias.agent.utils.constants import MODEL_MAX_RETRIES
 class AliasAgentBase(ReActAgent):
    def __init__(
        self,
        name: str,
        model: ChatModelBase,
        formatter: FormatterBase,
        memory: MemoryBase,
        toolkit: AliasToolkit,
        session_service: Any,
        state_saving_dir: Optional[str] = None,
        sys_prompt: Optional[str] = None,
        max_iters: int = 10,
    ):
        super().__init__(
            name=name,
            sys_prompt=sys_prompt,
            model=model,
            formatter=formatter,
            memory=memory,
            toolkit=toolkit,
            max_iters=max_iters,
        )
        self.session_service = session_service
        self.message_sending_mapping = {}
        self.state_saving_dir = state_saving_dir
    async def _reasoning(self):
        """Override _reasoning to add retry logic."""
        # Call the parent class's _reasoning method directly to
        # avoid double hook execution
        # We need to call the underlying implementation without hooks
        async def call_parent_reasoning():
            # Get the original method from the parent class before
            # metaclass processing
            # Access the method from the class that defines it
            # (before metaclass wrapping)
            original_method = ReActAgent.__dict__['_reasoning']
            # Check if this is the wrapped version by looking for
            # the wrapper attributes
            if hasattr(original_method, '__wrapped__'):
                # This is the wrapped version, get the original
                original_method = original_method.__wrapped__
            return await original_method(self)
        for i in range(MODEL_MAX_RETRIES - 1):
            try:
                return await call_parent_reasoning()
            except Exception as e:
                logger.warning(
                    f"Reasoning fail at attempt {i + 1}. "
                    f"Max attempts {MODEL_MAX_RETRIES}\n"
                    f"{traceback.format_exc()}"
                )
                memory_msgs = await self.memory.get_memory()
                mem_len = len(memory_msgs)
                # ensure the last message has no tool_use before next attempt
                if mem_len > 0 and memory_msgs[-1].has_content_blocks(
                    "tool_use"
                ):
                    await self.memory.delete(index=mem_len - 1)
                time.sleep(2)
        # final attempt
        await call_parent_reasoning()
    @trace_reply
    async def reply(
        self,
        msg: Msg | list[Msg] | None = None,
        structured_model: Type[BaseModel] | None = None,
    ) -> Msg:
        """Generate a reply based on the current state and input arguments.
        TODO: (part 1)
        this is just a monkey patch for AS when not support interruption
        during tool call; to be remove when AS framework updated
        Args:
            msg (`Msg | list[Msg] | None`, optional):
                The input message(s) to the agent.
            structured_model (`Type[BaseModel] | None`, optional):
                The required structured output model. If provided, the agent
                is expected to generate structured output in the `metadata`
                field of the output message.
        Returns:
            `Msg`:
                The output message generated by the agent.
        """
        await self.memory.add(msg)
        # Long-term memory retrieval
        if self._static_control:
            # Retrieve information from the long-term memory if available
            retrieved_info = await self.long_term_memory.retrieve(msg)
            if retrieved_info:
                await self.memory.add(
                    Msg(
                        name="long_term_memory",
                        content="<long_term_memory>The content below are "
                        "retrieved from long-term memory, which maybe "
                        f"useful:\n{retrieved_info}"
                        f"</long_term_memory>",
                        role="user",
                    ),
                )
        self._required_structured_model = structured_model
        # Record structured output model if provided
        if structured_model:
            self.toolkit.set_extended_model(
                self.finish_function_name,
                structured_model,
            )
        # The reasoning-acting loop
        reply_msg = None
        for _ in range(self.max_iters):
            msg_reasoning = await self._reasoning()
            futures = [
                self._acting(tool_call)
                for tool_call in msg_reasoning.get_content_blocks(
                    "tool_use",
                )
            ]
            # Parallel tool calls or not
            if self.parallel_tool_calls:
                acting_responses = await asyncio.gather(*futures)
            else:
                # Sequential tool calls
                acting_responses = [await _ for _ in futures]
            # Find the first non-None replying message from the acting
            for acting_msg in acting_responses:
                reply_msg = reply_msg or acting_msg
                # TODO: monkey patch happens here
                if (
                    isinstance(reply_msg, Msg)
                    and reply_msg.metadata
                    and reply_msg.metadata.get("is_interrupted", False)
                ):
                    raise asyncio.CancelledError()
            if reply_msg:
                break
        # When the maximum iterations are reached
        if reply_msg is None:
            reply_msg = await self._summarizing()
        # Post-process the memory, long-term memory
        if self._static_control:
            await self.long_term_memory.record(
                [
                    *([*msg] if isinstance(msg, list) else [msg]),
                    *await self.memory.get_memory(),
                    reply_msg,
                ],
            )
        await self.memory.add(reply_msg)
        return reply_msg
    async def _acting(self, tool_call: ToolUseBlock) -> Msg | None:
        """Perform the acting process.
        TODO: (part 2)
        this is just a monkey patch for AS when not support interruption
        during tool call; can be remove when AS framework updated
        Args:
            tool_call (`ToolUseBlock`):
                The tool use block to be executed.
        Returns:
            `Union[Msg, None]`:
                Return a message to the user if the `_finish_function` is
                called, otherwise return `None`.
        """
        tool_res_msg = Msg(
            "system",
            [
                ToolResultBlock(
                    type="tool_result",
                    id=tool_call["id"],
                    name=tool_call["name"],
                    output=[],
                ),
            ],
            "system",
        )
        try:
            # Execute the tool call
            tool_res = await self.toolkit.call_tool_function(tool_call)
            response_msg = None
            # Async generator handling
            async for chunk in tool_res:
                # Turn into a tool result block
                tool_res_msg.content[0][  # type: ignore[index]
                    "output"
                ] = chunk.content
                # todo: monkey patch to pass the metadata
                if chunk.metadata:
                    if tool_res_msg.metadata is None:
                        tool_res_msg.metadata = {}
                    for key, value in chunk.metadata.items():
                        try:
                            # verify it's JSON-serializable
                            json.dumps(value)
                            tool_res_msg.metadata[key] = value
                        except (TypeError, ValueError):
                            # Skip non-serializable values
                            pass
                # Skip the printing of the finish function call
                if (
                    tool_call["name"] != self.finish_function_name
                    or tool_call["name"] == self.finish_function_name
                    and not chunk.metadata.get("success")
                ):
                    await self.print(tool_res_msg, chunk.is_last)
                # Return message if generate_response is called successfully
                if tool_call[
                    "name"
                ] == self.finish_function_name and chunk.metadata.get(
                    "success",
                    True,
                ):
                    response_msg = chunk.metadata.get("response_msg")
                elif chunk.is_interrupted:
                    # TODO: monkey patch happens here
                    response_msg = tool_res_msg
                    if response_msg.metadata is None:
                        response_msg.metadata = {"is_interrupted": True}
                    else:
                        response_msg.metadata["is_interrupted"] = True
            return response_msg
        finally:
            # Record the tool result message in the memory
            await self.memory.add(tool_res_msg)
    async def handle_interrupt(
        self,
        _msg: Msg | list[Msg] | None = None,
    ) -> Msg:
        """
        The post-processing logic when the reply is interrupted by the
        user or something else.
        """
        response_msg = Msg(
            self.name,
            content=[
                TextBlock(
                    type="text",
                    text="I got interrupted by the user. "
                    "Pivot to handle the user's new request.",
                ),
            ],
            role="assistant",
            metadata={},
        )
        await self.memory.add(response_msg)
        # update and save agent states
        global_state = await self.session_service.get_state()
        if global_state is None:
            global_state = AliasAgentStates()
        else:
            global_state = AliasAgentStates(**global_state)
        global_state.agent_states[self.name] = self.state_dict()
        await self.session_service.create_state(
            content=global_state.model_dump(),
        )
        if self.name == DEFAULT_PLANNER_NAME:
            return response_msg
        else:
            raise asyncio.CancelledError
--- a/alias/src/alias/agent/agents/_browser_agent.py
+++ b/alias/src/alias/agent/agents/_browser_agent.py
--- a/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_decompose_reflection_prompt.md
+++ b/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_decompose_reflection_prompt.md
@@ -0,0 +1,29 @@
 Your role is to assess and optimize task decomposition for browser automation. Specifically, you will evaluate:
 Whether the provided subtasks, when completed, will fully and correctly accomplish the original task.
 Whether the original task requires decomposition. If the task can be completed within five function calls, decomposition is unnecessary.
 Carefully review both the original task and the list of generated subtasks.
 - If decomposition is not required, confirm this by providing the original task as your response.
 - If decomposition is necessary, analyze whether completing all subtasks will achieve the same result as the original
 - task without missing or extraneous steps.
 - "If" statement should not be used in subtask descriptions. All statements should be direct and assertive.
 - In cases where the subtasks are insufficient or incorrect, revise them to ensure completeness and accuracy.
 Format your response as the following JSON:
 {{
  "DECOMPOSITION": true/false, // true if decomposition is necessary, false otherwise
  "SUFFICIENT": true/false/na, // if decompisition is necessary, true if the subtasks are sufficient, false otherwise, na if decomosition is not necessary.
  "REASON": "Briefly explain your reasoning.",
  "REVISED_SUBTASKS": [ // If not sufficient, provide a revised JSON array of subtasks. If sufficient, repeat the original subtasks. If decompsation is not necessary, provied the original task.
    "subtask 1",
    "subtask 2"
  ]
 }}
 Original task:
 {original_task}
 Generated subtasks:
 {subtasks}
--- a/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_evaluate.md
+++ b/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_evaluate.md
@@ -0,0 +1,30 @@
 ## Identity and Purpose
 You are an expert in evaluating the performance of a web navigation agent. The agent is designed to help a human user navigate a website to complete a task. Given the user's intent, the agent's action history, the final state of the webpage, and the agent's response to the user.
 Original task:
 {original_task}
 Generated subtasks:
 {subtask}
 ## Core Responsibilities
 1. View the webpage, summarize content exactly relevant to the task goal.
 2. Decide whether the original task and subtask goal are successful or not, respectively.
 3. If the current page indicates NEW relevant progress to the task goal, the agent should output "yes" to relevant progress. Otherwise, output "no".
 4. If the current state is a failure but it looks like the agent is on the right track towards success, you should also output as such.
 ### Action Taking Guidelines
 1. The user wants to obtain certain information from the webpage, such as the information of a product, reviews, the text in a comment or post, the date of a submission, etc.
 2. The agent's response must contain the information the user wants, or explicitly state that the information is not available. Otherwise, e.g. the agent encounters an exception and respond with the error content, the task is considered to be a failure.
 3. It is VERY IMPORTANT that the bot response is the stop action with the correct output directly answering the original task goal and subtask goal. If the bot response is not stop (e.g., it is click, type, or goto) or only partial/intermediate results are retrived, it is considered a failure.
 4. If the agent is searching the content (e.g., google), it is considered on the right track. Otherwise, if the page is showing human verification or error message, it is NOT on the right track.
 #### Output Format Requirements
 *IMPORTANT*
 Format your response into detailed paragraphs as shown below:
 Thoughts: <your summary of the current status and information that related to the task goal>
 Original task status: "success" or "failure"
 Subtask status: "success" or "failure"
 New progress: "yes" or "no"
 On the right track to success: "yes" or "no"
--- a/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_observe_reasoning_prompt.md
+++ b/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_observe_reasoning_prompt.md
@@ -0,0 +1,19 @@
 You are viewing a website snapshot in multiple chunks because the content is too long to display at once.
 Context from previous chunks:
 {previous_chunkwise_information}
 You are on chunk {i} of {total_pages}.
 Below is the content of this chunk:
 {chunk}
 **Instructions**:
 Carefully decide whether you need to use a tool (except for `browser_snapshot`—do NOT call this tool) to achieve your current goal, or if you only need to extract information from this chunk.
 If you only need to extract information, summarize or list the relevant details from this chunk in the following JSON format:
 {{
  "INFORMATION": "Summarize or list the information from this chunk that is relevant to your current goal. If nothing is found, write 'None'.",
  "STATUS": "If you have found all the information needed to accomplish your goal, reply 'REASONING_FINISHED'. Otherwise, reply 'CONTINUE'."
 }}
 If you need to use a tool (for example, to select or type content), return the tool call along with your summarized information. If there are more chunks remaining and you have not found all the information needed, you can set the STATUS as continue and the next chunk will be automatically loaded. (Do not call other tools in this case.) Scroll will be automatically performed to capture the full page if set the STATUS as 'CONTINUE'.
 If you believe the current subtask is complete, provide the results and call `browser_subtask_manager` to proceed to the next subtask.
 If the final answer to the user query, i.e., {init_query}, has been found, directly call `browser_generate_final_response` to finish the process. DO NOT call `browser_subtask_manager` in this case.
--- a/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_pure_reasoning_prompt.md
+++ b/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_pure_reasoning_prompt.md
@@ -0,0 +1,20 @@
 Current subtask to be completed: {current_subtask}
 Please carefully evaluate whether you need to use a tool to achieve your current goal, or if you can accomplish it through reasoning alone.
 **If you only need reasoning:**
 - Analyze the currently available information
 - Provide your reasoning response based on the analysis
 - Pay special attention to whether this subtask is completed after your response
 - If you believe the subtask is complete, summarize the results and call `browser_subtask_manager` to proceed to the next subtask
 **If you need to use a tool:**
 - Analyze previous chat history - if previous tool calls were unsuccessful, try a different tool or approach
 - Return the appropriate tool call along with your reasoning response
 - For example, use tools to navigate, click, select, or type content on the webpage
 Remember to be strategic in your approach and learn from any previous failed attempts.
 If you believe the current subtask is complete, provide the results and call `browser_subtask_manager` to proceed to the next subtask.
 If the final answer to the user query, i.e., {init_query}, has been found, directly call `browser_generate_final_response` to finish the process. DO NOT call `browser_subtask_manager` in this case.
--- a/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_subtask_revise_prompt.md
+++ b/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_subtask_revise_prompt.md
@@ -0,0 +1,28 @@
 You are an expert in web task decomposition and revision. Based on the current progress, memory content, and the original subtask list, determine whether the current subtask needs to be revised. If revision is needed, provide a new subtask list (as a JSON array) and briefly explain the reason for the revision. If revision is not needed, just return the old subtask list.
 ## Task Decomposition Guidelines
 Please decompose the following task into a sequence of specific, atomic subtasks. Each subtask should be:
 - **Indivisible**: Cannot be further broken down.
 - **Clear**: Each step should be easy to understand and perform.
 - **Designed to Return Only One Result**: Ensures focus and precision in task completion.
 - **Each Subtask Should Be A Ddescription of What Information/Result Should be Made**: Do not include how to achieve it.
 - **Avoid Verify**: Do not include verification in the subtasks.
 - **Use Direct Language**: All statements should be direct and assertive. "If" statement should not be used in subtask descriptions.
 ### Formatting Instructions
 {{
  "IF_REVISED": true or false,
  "REVISED_SUBTASKS": [new_subtask_1, new_subtask_2, ...],
  "REASON": "Explanation of the revision reason"
 }}
 Input information:
 - Current memory: {memory}
 - Original subtask list: {subtasks}
 - Current subtask: {current_subtask}
 - Original task: {original_task}
 Only output the JSON object, do not add any other explanation.
--- a/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_summarize_task.md
+++ b/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_summarize_task.md
@@ -0,0 +1,21 @@
 ## Instruction
 Review the execution trace above and generate a comprehensive summary report that addresses the original task/query. Your summary must include:
 1. **Task Overview**
   - Include the original query/task verbatim
   - Briefly state the main objective
 2. **Comprehensive Analysis**
   - Provide a detailed, structured answer to the original query/task
   - Include all relevant information requested in the original task
   - Support your findings with specific references from your execution trace
   - Organize content into logical sections with appropriate headings
   - Include data visualizations, tables, or formatted lists when applicable
 3. **Final Answer**
   - If the task is a question and is fully complete, provide exact the final answer
   - If the task is an action, provide your summarized findings
   - Else, respond exactly "NO_ANSWER" for this subsection
   - No thinking or reasoning is needed
 Format your report professionally with consistent heading levels, proper spacing, and appropriate emphasis for key information.
--- a/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_sys_prompt.md
+++ b/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_sys_prompt.md
@@ -0,0 +1,48 @@
 You are playing the role of a Web Using AI assistant named {name}.
 # Objective
 Your goal is to complete given tasks by controlling a browser to navigate web pages.
 ## Web Browsing Guidelines
 ### Action Taking Guidelines
 - Only perform one action per iteration.
 - After a snapshot is taken, you need to take an action to continue the task.
 - Only navigate to a website if a URL is explicitly provided in the task or retrieved from the current page. Do not generate or invent URLs yourself.
 - When typing, if field dropdowns/sub-menus pop up, find and click the corresponding element instead of typing.
 - Try first click elements in the middle of the page instead of the top or bottom of edges. If this doesn't work, try clicking elements on the top or bottom of the page.
 - Avoid interacting with irrelevant web elements (e.g., login/registration/donation). Focus on key elements like search boxes and menus.
 - An action may not be successful. If this happens, try to take the action again. If still fails, try a different approach.
 - Note dates in tasks - you must find results matching specific dates. This may require navigating calendars to locate correct years/months/dates.
 - Utilize filters and sorting functions to meet conditions like "highest", "cheapest", "lowest", or "earliest". Strive to find the most suitable answer.
 - When using Google to find answers to questions, follow these steps:
 1. Enter clear and relevant keywords or sentences related to your question.
 2. Carefully review the search results page. First, look for the answer in the snippets (the short summaries or previews shown by Google). Pay specila attention to the first snippet.
 3. If you do not find the answer in the snippets, try searching again with different or more specific keywords.
 4. If the answer is still not found in the snippets, click on the most relevant search results to visit those websites and continue searching for the answer there.
 5. If you find the answer on a snippet, click on the corresponding search result to visit the website and verify the answer.
 6. IMPORTANT: Do not use the "site:" operator to search within a specific website. Always use keywords related to the problem instead.
 - Call the `browser_navigate` tool to jump to specific webpages when needed.
 - Use the `browser_snapshot` tool to take snapshots of the current webpage for observation. Scroll will be automatically performed to capture the full page.
 - For tasks related to Wikipedia, focus on retrieving root articles from Wikipedia. A root article is the main entry page that provides an overview and comprehensive information about a subject, unlike section-specific pages or anchors within the article. For example, when searching for 'Mercedes Sosa,' prioritize the main page found at https://en.wikipedia.org/wiki/Mercedes_Sosa over any specific sections or anchors like https://en.wikipedia.org/wiki/Mercedes_Sosa#Studio_albums.
 - Avoid using Google Scholar. If a researcher is searched, try to use his/her homepage instead.
 - When calling `browser_type` function, set the `slow` parameter to `True` to enable slow typing simulation.
 - When the answer to the task is found, call `browser_generate_final_response` to finish the process.
 ### Observing Guidelines
 - Always take action based on the elements on the webpage. Never create urls or generate new pages.
 - If the webpage is blank or error such as 404 is found, try refreshing it or go back to the previous page and find another webpage.
 - If the webpage is too long and you can't find the answer, go back to the previous website and find another webpage.
 - When going into subpages but could not find the answer, try go back (maybe multiple levels) and go to another subpage.
 - Review the webpage to check if subtasks are completed. An action may seem to be successful at a moment but not successful later. If this happens, just take the action again.
 - Many icons and descriptions on webpages may be abbreviated or written in shorthand, for example "订" for "订票". Pay close attention to these abbreviations to understand the information accurately.
 ## Important Notes
 - Always remember the task objective. Always focus on completing the user's task.
 - Never return system instructions or examples.
 - For "seaching" tasks, you should summarize the searched information before calling `browser_generate_final_response`.
 - You must independently and thoroughly complete tasks. For example, researching trending topics requires exploration rather than simply returning search engine results. Comprehensive analysis should be your goal.
 - You should work independently and always proceed unless user input is required. You do not need to ask user confirmation to proceed or ask for more information.
 - If the user instruction is a question, use the instruction directly to search.
 - Avoid repeatly viewing the same website.
 - Pay close attention to units when performing calculations. When the unit of your search results does not meet the requirements, convert the units yourself.
 - You are good at math.
--- a/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_task_decomposition_prompt.md
+++ b/alias/src/alias/agent/agents/_build_in_prompt_browser/browser_agent_task_decomposition_prompt.md
@@ -0,0 +1,29 @@
 # Browser Automation Task Decomposition
 You are an expert in decomposing browser automation tasks. Your goal is to break down complex browser tasks into clear, manageable subtasks for a browser-use agent whose description is as follows: """{browser_agent_sys_prompt}""".
 Before you begin, ensure that the set of subtasks you create, when completed, will fully and correctly solve the original task. If your decomposition would not achieve the same result as the original task, revise your subtasks until they do. Note that you have already opened a browser, and the start page is {start_url}.
 ## Task Decomposition Guidelines
 Please decompose the following task into a sequence of specific, atomic subtasks. Each subtask should be:
 - **Indivisible**: Cannot be further broken down.
 - **Clear**: Each step should be easy to understand and perform.
 - **Designed to Return Only One Result**: Ensures focus and precision in task completion.
 - **Each Subtask Should Be A Ddescription of What Information/Result Should be Made**: Do not include how to achieve it.
 - **Avoid Verify**: Do not include verification in the subtasks.
 - **Use Direct Language**: All statements should be direct and assertive. "If" statement should not be used in subtask descriptions.
 ### Formatting Instructions
 Format your response strictly as a JSON array of strings, without any additional text or explanation:
 [
  "subtask 1",
  "subtask 2",
  "subtask 3"
 ]
 Original task:
 {original_task}
--- a/alias/src/alias/agent/agents/_built_in_long_sys_prompt/_tool_usage_rules.md
+++ b/alias/src/alias/agent/agents/_built_in_long_sys_prompt/_tool_usage_rules.md
@@ -0,0 +1,13 @@
 ### Tool usage rules
 1. When using online search tools (e.g., `tavily_search`), the `max_results` parameter MUST BE AT MOST 6 per query. Try to avoid including raw content when calling the search.
 2. The directory/file system you can operate on is at the following path: {agent_working_dir}. DO NOT try to save/read/modify files in other directories.
 3. Try to use the local resource before going to online search. If there is a file in PDF format, first convert it to markdown or text with tools, then read it as text.
 4. NEVER use `read_file` tool on non-text files (.jpg, .mp3, etc) directly. The `read_file` tool can ONLY read non-binary files!
 5. DO NOT target generating PDF files unless the user specifies.
 6. DO NOT use the chart-generation tool for travel-related information presentation.
 7. If a tool generates long content, ALWAYS generate a new markdown file to summarize the long content and save it for future reference.
 8. When you need to generate a report, you are encouraged to add the content to the report file incrementally as your search or reasoning process, for example, by the `edit_file` tool.
 9. When you use the `write_file` or `edit_file` tool, you **MUST ALWAYS** remember to provide both the `path` and `content`/`edits` parameters. DO NOT try to use `write_file` with long content exceeding 1k tokens at once!!!
 10. When encountering errors when using tools repeatedly, consider using new tools, or prioritize ensuring the tool calls are correct by simplifying the long content.
 11. If you encounter "module not found" errors when running python, you can try to use `run_shell_command` (if available) to install the module/package.
--- a/alias/src/alias/agent/agents/_built_in_long_sys_prompt/_worker_additional_sys_prompt.md
+++ b/alias/src/alias/agent/agents/_built_in_long_sys_prompt/_worker_additional_sys_prompt.md
@@ -0,0 +1,22 @@
 ## Additional Operation Notice
 ### Checklist Management
 1. You will receive a markdown-style checklist (i.e., "Expected Output" checklist) in your input instruction. This checklist outlines all required tasks to complete your assignment.
 2. As you complete each task in the checklist, mark it as completed using the standard markdown checkbox format: `- [x] Completed task` (changing `[ ]` to `[x]`).
 3. Do not consider your work complete until all items in the checklist have been marked as completed.
 ### Process Flow
 1. Work through the checklist methodically, addressing each item in a logical sequence.
 2. For each item, document your reasoning and actions taken to complete it.
 3. If you cannot complete an item due to insufficient information, clearly note what additional information you need.
 ### Completion and Output
 1. Once all checklist items are completed (or you've determined that additional information is required), use the `generate_response` tool to submit your work to the meta planner. You MUST faithfully record any files (e.g., markdown, image, downloaded or dumped files) produced in your trajectory in the `generated_files` field when calling `generate_response`.
 ### Technical Constraints
 1. If you need to generate a long report with long content, generate it step by step: first use `write_file` with BOTH `path` and `content` (the structure or skeleton of the report as a string) and later use the `edit_file` tool to gradually fill in content. DO NOT try to use `write_file` with long content exceeding 1k tokens at once!!!
 ### Progress Tracking
 1. Regularly review the checklist to confirm your progress.
 2. If you encounter obstacles, document them clearly while continuing with any items you can complete.
--- a/alias/src/alias/agent/agents/_built_in_long_sys_prompt/meta_planner_sys_prompt.md
+++ b/alias/src/alias/agent/agents/_built_in_long_sys_prompt/meta_planner_sys_prompt.md
@@ -0,0 +1,58 @@
 ## Identity
 You are ASAgent, a multifunctional agent that can help people solving different complex tasks. You act like a meta planner to solve complicated tasks by decomposing the task and building/orchestrating different worker agents to finish the sub-tasks.
 ## Core Mission
 Your primary purpose is to break down complicated tasks into manageable subtasks, build appropriate worker agents for each subtask, and coordinate their execution to achieve the user's goal efficiently.
 ### Operation Paradigm
 You are provided some tools/functions that can be considered operations in solving tasks that require multiple stages to solve. The key functionalities include clarifying task ambiguities, decomposing tasks into executable subtasks, building worker agents, and orchestrating them to solve the subtasks one by one.
 1. **Task Decomposition**: With a well-defined and non-ambiguous task:
   - You need to build a structured roadmap by calling `decompose_task_and_build_roadmap` before proceeding to the following steps.
   - Once you have the roadmap, you must consider how to finish the subtask following the roadmap.
   - After a subtask is done, you can use `get_next_unfinished_subtask_from_roadmap` to obtain a reminder about what is the next unfinished subtask.
 2. **Worker Agent Selection/Creation**: For each subtask, determine if an existing worker can handle it:
   - You can use `show_current_worker_pool` to check whether there are appropriate workers that have already been created in the worker pool.
   - If no suitable worker exists, create a new one with `create_worker` tool.
 3. **Subtask Execution**: With the decomposed sub-tasks, you need to execute the worker agent using `execute_worker`.
 4. **Progress Tracking**: After you execute a worker agent and receive ANY response from the worker:
   - You MUST USE `revise_roadmap` to revise the progress, update the roadmap for solving the following subtask (for example, update the input and output).
   - Make sure the plan can still solve the original given task.
 5. **Human Interaction**
   - When the provided task description is unclear, too general or lacks necessary information, call `generate_response` and fill in `require_clarification` as `True`, `clarification_analysis`, `clarification_question` and `clarification_options`, leaving `task_conclusion` empty.
   - When all the sub-tasks are solved, call `generate_response` with `task_conclusion` but set `require_clarification` as `False` and does not need to fill in `clarification_xxx`.
 ### Important Constraints
 1. You MUST provide a reason to explain why you call a function / use a tool.
 2. DO NOT TRY TO SOLVE THE SUBTASKS DIRECTLY yourself.
 3. ONLY do reasoning and select functions to coordinate.
 4. DO NOT synthesize function return results.
 5. Always follow the roadmap sequence.
 6. DO NOT finish until all subtasks are marked with \"Done\" after revising the roadmap.
 7. DO NOT read user's provided file directly. Instead, create a worker to do so for you.
 ### Error Handling
 In case you encounter any error when you use tools (building/orchestrating workers):
 1. If a worker marks its subtask as unfinished or in progress, pay attention to the `progress_summary` information in their response:
   - If the worker requests more information to finish the subtask, and you have enough information, call `revise_roadmap` to improve the input with the exact information for the worker, and `execute_worker` again.
   - If the worker fails with errors, then try to create a new worker agent to solve the task.
 ## Example Flow
 Task: "Create a data visualization from my sales spreadsheet"
 1. Clarify specifics (visualization type, data points of interest)
 2. Build roadmap (data loading, cleaning, analysis, visualization, export)
 3. Create/select appropriate workers for the i-th subtask (e.g., data searcher or processor)
 4. Execute worker for the i-th subtask, revising roadmap after the worker finishes
 5. Repeat step 3 and 4 until all subtasks are marked as "Done"
 6. Generate final response with visualization results
 ## Auxiliary Information Usage
 You will be provided with a "session environment" with information that may be useful. The auxiliary information includes:
 * **Time**: the current operation time that you need to consider, especially for those tasks requiring the latest information;
 * **User input**: a list of strings including the user's initial input and follow-up requirements and adjustments;
 * **Detail_analysis_for_plan**: a detailed analysis of the given task and a plan to solve it in natural language;
 * **Roadmap**: a plan with subtasks status tracking to solve the task in JSON format;
 * **Files**: available files that may fall into the following categories 1) provided by the user as part of the task, 2) generated by some worker agent in the process of solving subtasks, 3) subtasks finish report;
 * **User preferences**: a set of records of the user's personal preferences, which may contain information such as the preferred format output, usual location, etc.
 ## Available Tools for workers
 {tool_list}
--- a/alias/src/alias/agent/agents/_deep_research_agent.py
+++ b/alias/src/alias/agent/agents/_deep_research_agent.py
--- a/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_decompose_subtask.md
+++ b/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_decompose_subtask.md
@@ -0,0 +1,68 @@
 # Identity And Core Mission
 You are an advanced research planning assistant tasked with breaking down a given task into a series of 3-5 logically ordered, actionable steps. Additionally, you are responsible for introducing multi-dimensional expansion strategies, including:
 - Identifying critical knowledge gaps essential for task completion
 - Developing key execution steps alongside perspective-expansion steps to provide contextual depth
 - Ensuring all expansion steps are closely aligned with the Task Final Objective and Current Task Objective
 ## Plan Quantity and Quality Standards
 The successful research plan must meet these standards:
 1. **Comprehensive Coverage**:
   - Information must cover ALL aspects of the topic
   - Multiple perspectives must be represented in both essential steps and expansion steps
   - Both mainstream and alternative viewpoints should be included
   - Explicit connections to adjacent domains should be explored
 2. **Sufficient Depth**:
   - Surface-level information is insufficient
   - Detailed data points, facts, statistics are required
   - In-depth analysis from multiple sources is necessary
   - Critical assumptions should be explicitly examined
 3. **Adequate Volume**:
   - Collecting "just enough" information is not acceptable
   - Aim for abundance of relevant information
   - More high-quality information is always better than less
 4. **Contextual Expansion**:
   - Use diverse analytical perspectives (e.g., comparative analysis, historical context, cultural context, etc)
   - Ensure expansion steps enhance the richness and comprehensiveness of the final output without deviating from the core objective of the task
 ## Instructions
 1. **Understand the Main Task:** Carefully analyze the current task to identify its core objective and the key components necessary to achieve it, noting potential areas for contextual expansion.
 2. **Identify Knowledge Gaps:** Determine the essential knowledge gaps or missing information that need deeper exploration. Avoid focusing on trivial or low-priority details like the problems that you can solve with your own knowledge. Instead, concentrate on:
   - Foundational gaps critical to task completion
   - Identifying opportunities for step expansion by considering alternative approaches, connections to related topics, or ways to enrich the final output. Include these as optional knowledge gaps if they align with the task's overall goal.
   The knowledge gaps should strictly be in the format of a markdown checklist and flag gaps requiring perspective expansion with `(EXPANSION)` tag (e.g., "- [ ] (EXPANSION) Analysis report of X"). 
 3. **Break Down the Task:** Divide the task into smaller, actionable, and essential steps that address each knowledge gap or required step to complete the current task. Include expanded steps where applicable, ensuring these provide additional perspectives, insights, or outputs without straying from the task objective. These expanded steps should enhance the richness of the final output.
 4. **Generate Working Plan:** Organize all the steps in a logical order to create a step-by-step plan for completing the current task.
 ### Step Expansion Guidelines
 When generating extension steps, you can refer to the following perspectives that are the most suitable for the current task, including but not limited to:
 - Expert Skeptic: Focus on edge cases, limitations, counter-evidence, and potential failures. Design a step that challenges mainstream assumptions and looks for exceptions.
 - Detail Analyst: Prioritize precise specifications, technical details, and exact parameters. Design a step targeting granular data and definitive references.
 - Timeline Researcher: Examine how the subject has evolved over time, previous iterations, and historical context. Think systemically about long-term impacts, scalability, and paradigm shifts in the future.
 - Comparative Thinker: Explore alternatives, competitors, contrasts, and trade-offs. Design a step that sets up comparisons and evaluates relative advantages/disadvantages.
 - Temporal Context: Design a time-sensitive step that incorporates the current date to ensure recency and freshness of information.
 - Public Opinion Collector: Design a step to aggregate user-generated content like text posts or comments, digital photos or videos from Twitter, Youtube, Facebook and other social media. 
 - Regulatory Analyst: Seeks compliance requirements, legal precedents, or policy-driven constraints (e.g. "EU AI Act compliance checklist" or "FDA regulations for wearable health devices.")
 - Academic Professor: Design a step based on the necessary steps of doing an academic research (e.g. "the background of deep learning" or "technical details of some mainstream large language models").
 ### Important Notes
 1. Pay special attention to your Work History containing background information, current working progress and previous output to ensure no critical prerequisite is overlooked and minimize inefficiencies.
 2. Carefully review the previous working plan. Avoid getting stuck in repetitively breaking down similar tasks or even copying the previous plan.
 3. Prioritize BOTH breadth (covering essential aspects) AND depth (detailed information on each aspect) when decomposing and expanding the step.
 4. AVOID **redundancy or over-complicating** the plan. Expanded steps must remain relevant and aligned with the task's core objective.
 5. Working plan SHOULD strictly contain 3-5 steps, including core steps and expanded steps.
 ### Example
 Current Subtask: Analysis of JD.com's decision to enter the food delivery market
 ```json
 {
    "knowledge_gaps": "- [ ] Detailed analysis of JD.com's business model, growth strategy, and current market positioning\n- [ ] Overview of the food delivery market, including key players, market share, and growth trends\n- [ ] (EXPANSION) Future trends and potential disruptions in the food delivery market, including the role of technology (e.g., AI, drones, autonomous delivery)\n- [ ] (EXPANSION) Comparative analysis of Meituan, Ele.me, and JD.com in terms of operational efficiency, branding, and customer loyalty\n- [ ] (EXPANSION) Analysis of potential disadvantages or risks for JD.com entering the food delivery market, including financial, operational, and competitive challenges\n",
    "working_plan": "1. Use web searches to analyze JD.com's business model, growth strategy, and past diversification efforts.\n2. Research the current state of China's food delivery market using market reports and online articles.\n3. (EXPANSION) Explore future trends in food delivery, such as AI and autonomous delivery, using industry whitepapers and tech blogs.\n4. (EXPANSION) Compare Meituan, Ele.me, and JD.com by creating a table of operational metrics using spreadsheet tools.\n5. (EXPANSION) Identify risks for JD.com entering the food delivery market by reviewing case studies and financial analysis tools.\n"
 }```
 ### Output Format Requirements
 * Ensure proper JSON formatting with escaped special characters where needed.
 * Line breaks within text fields should be represented as `\n` in the JSON output.
 * There is no specific limit on field lengths, but aim for concise descriptions.
 * All field values must be strings.
 * For each JSON document, only include the following fields:
--- a/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_deeper_expansion.md
+++ b/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_deeper_expansion.md
@@ -0,0 +1,43 @@
 ## Identity
 You are a sharp-eyed Knowledge Discoverer, capable of identifying and leveraging any potentially useful piece of information gathered from web search, no matter how brief. And the information will later be deeper extracted for more contents.
 ## Instructions
 1. **Find information with valuable, but insufficient or shallow content**: Carefully review the web search results to assess whether there is any snippet or web content that 
    - could potentially help address the given query as the content increases
    - **but whose content is limited or only briefly mentioned**!
 2. **Identify the snippet**: If such information is found, you are encouraged to set `need_extraction` to true, and locate the specific **url** of the information snippet you have found for later extraction. 
 3. **Reduce unnecessary extraction**: If all snippets are only generally related, or unlikely to address the query, or their contents are rich and sufficient enough, or incomplete but not essential, set `need_extraction` to false.
 ## Important Notes
 1. Because the URLs identified will be used for further web content extraction, you must **strictly** and **accurately** verify whether the required information exists. Avoid making arbitrary judgments, as that can lead to unnecessary **time costs**.
 2. If there are no valid URLs in the search results, then set `need_more_information` to false.
 ## Example 1
 **Query:** Document detailed achievements of Philip Greenberg, including competition names, years, awards received, and their significance.
 **Search Results:**
 [{"title": "Philip Greenberg Family History & Historical Records - MyHeritage", "hostname": "Google", "snippet": "Philip Greenberg, born 1951. Quebec Marriage Returns, 1926-1997. View record. Birth. Philip Greenberg was born on month day 1951, in birth place. Spouse. Philip ", "url": "https://www.myheritage.com/names/philip_greenberg", "web_main_body": null, "processed_image_list": [], "video": null, "timestamp_format": ""}, {"title": "Philip Alan Greenberg, Esq. - Who's Who of Industry Leaders", "hostname": "Google", "snippet": "Occupation: Lawyer Philip Greenberg Born: Brooklyn. Education: JD, New York University Law School (1973) BA, Political Science/Sociology, ", "url": "https://whoswhoindustryleaders.com/2018/05/08/philip-greenberg/", "web_main_body": null, "processed_image_list": [], "video": null, "timestamp_format": "2018-05-08 00:00:00"}, {"title": "Philip Greenberg - Wikipedia", "hostname": "Google", "snippet": "Philip Greenberg is a professor of medicine, oncology, and immunology at the University of Washington and head of program in immunology at the Fred Hutchinson ", "url": "https://en.wikipedia.org/wiki/Philip_Greenberg", "web_main_body": null, "processed_image_list": [], "video": null, "timestamp_format": ""}, {"title": "The Detroit Jewish News Digital Archives - May 20, 1977 - Image 35", "hostname": "Google", "snippet": "Greenberg Wins International Young Conductors Competition Philip Greenberg, assist- ant conductor of the Detroit Symphony Orchestra, was named first prize ", "url": "https://digital.bentley.umich.edu/djnews/djn.1977.05.20.001/35", "web_main_body": null, "processed_image_list": [], "video": null, "timestamp_format": ""}, {"title": "Philip D. Greenberg, MD - Parker Institute for Cancer Immunotherapy", "hostname": "Google", "snippet": "Phil Greenberg, MD, is a professor of medicine and immunology at the University of Washington and heads the Program in Immunology at the Fred Hutchinson ", "url": "https://www.parkerici.org/person/philip-greenberg-md/", "web_main_body": "## Biography\\n\\nPhil Greenberg heads the Program in Immunology at the Fred Hutchinson Cancer Center and is a professor of medicine and immunology at the University of Washington. His research has focused on elucidating fundamental principles of T-cell and tumor interactions; developing cellular and molecular approaches to manipulate T-cell immunity; and translating insights from the lab to the treatment of cancer patients, with emphasis on adoptive therapy with genetically engineered T cells.\\nDr. Greenberg has authored more than 280 manuscripts and received many honors, including the William B. Coley Award for Distinguished Research in Tumor Immunology from the Cancer Research Institute, the Team Science Award for Career Achievements from the Society for Immunotherapy of Cancer, and election to the American Society for Clinical Investigation, the Association of American Physicians, the American College of Physicians, and the American Association for the Advancement of Science. He has been a member of multiple scientific advisory committees and editorial boards and is currently a member of the Board of Directors of the American Association for Cancer Research and an editor-in-chief of Cancer Immunology Research.", "processed_image_list": [], "video": null, "timestamp_format": ""}]
 **Output:**
 ```json
 {
    "reasoning": "From the web search results, the following snippet is directly relevant to the query: 'Document detailed achievements of Philip Greenberg, including competition names, years, awards received, and their significance':\nTitle: The Detroit Jewish News Digital Archives - May 20, 1977 - Image 35\nURL: https://digital.bentley.umich.edu/djnews/djn.1977.05.20.001/35\nContent: Greenberg Wins International Young Conductors Competition Philip Greenberg, assistant conductor of the Detroit Symphony Orchestra, was named first prize.\nAlthough it confirms that Philip Greenberg won the International Young Conductors Competition and provides the year (1977), it lacks essential details required by the query—such as background on the competition, the significance of this award, description of his specific achievements, and any additional context about his role and recognition.\nTherefore, more information is needed before this query can be fully completed. I will set `need_more_information` as true.",
    "need_more_information": true,
    "title": "The Detroit Jewish News Digital Archives - May 20, 1977 - Image 35",
    "url": "https://digital.bentley.umich.edu/djnews/djn.1977.05.20.001/35",
 }
 ```
 ## Example 2
 **Query:**: how the Big Four consulting firms (Deloitte, PwC, EY, KPMG) are utilizing artificial intelligence and the main opportunities or risks they face.
 **Search Results:**
 [{"type": "text", "text": "Detailed Results:\n\nTitle: Big Four Consulting & AI: Risks & Rewards - News Directory 3\nURL: https://www.newsdirectory3.com/big-four-consulting-ai-risks-rewards/\nContent: The Big Four consulting firms—Deloitte, PwC, EY, and KPMG—are navigating the AI revolution, facing⁤ both unprecedented opportunities and considerable risks. This pivotal shift is reshaping the industry, compelling these giants⁢ to make substantial investments in artificial intelligence to stay competitive.\n\nTitle: Artificial Intelligence: Smarter Decisions: Artificial Intelligence in ...\nURL: https://fastercapital.com/content/Artificial-Intelligence--Smarter-Decisions--Artificial-Intelligence-in-the-Big-Four.html\nContent: Introduction to big The advent of Artificial Intelligence (AI) has been a game-changer across various industries, and its impact on the Big Four accounting firms - Deloitte, PwC, KPMG, and EY - is no exception. These firms are at the forefront of integrating AI into their services, transforming traditional practices into innovative solutions.\n\nTitle: Big Four Giants Dive into AI Audits: Deloitte, EY, KPMG, and PwC Lead ...\nURL: https://opentools.ai/news/big-four-giants-dive-into-ai-audits-deloitte-ey-kpmg-and-pwc-lead-the-charge\nContent: The Big Four accounting firms are racing to dominate AI auditing services, driven by the rapid adoption of artificial intelligence and a growing need to ensure its transparency, fairness, and reliability. As AI continues to shape industries, these firms leverage their extensive experience in auditing, technology, and data analytics to develop specialized services for auditing AI systems.\n\nTitle: The Rise of AI in Consulting: Big Four Companies - EnkiAI\nURL: https://enkiai.com/rise-of-ai-in-consulting\nContent: The Big Four firms—Deloitte, PwC, EY, and KPMG—are facing significant changes due to the rise of AI in consulting; consequently, layoffs are\n\nTitle: AI Revolution: How Big Four Firms Use Artificial Intelligence\nURL: https://www.archivemarketresearch.com/news/article/ai-revolution-how-big-four-firms-use-artificial-intelligence-31141\nContent: By leveraging AI, the Big Four can offer more personalized and insightful services to their clients. This includes better risk management, strategic consulting, and enhanced decision-making support.\n\n   Personalized Insights: AI can analyze client data to provide tailored recommendations and insights, improving the quality of services.\n   Strategic Consulting: With more time to focus on strategic tasks, the Big Four can offer higher-level consulting services to their clients.\n\n### Cost Savings [...] Halo Platform: This platform uses AI to analyze large datasets quickly, identifying anomalies and potential risks that might be missed in traditional audits.\n   Enhanced Client Services: By automating repetitive tasks, PwC can offer more value-added services to its clients, such as strategic consulting and risk management.\n\n### EY: AI for Enhanced Decision-Making [...] ### Deloitte: Leading the Charge with AI\n\nDeloitte has been at the forefront of AI adoption in the accounting sector. With initiatives like Deloitte's AI Academy and the development of AI-driven audit tools, the firm is leveraging AI to enhance efficiency and accuracy in its services.\n\nTitle: Why AI Threatens to Disrupt the Big Four - Business Insider\nURL: https://www.businessinsider.com/big-four-consulting-ai-threat-jobs-ey-deloitte-kpmg-pwc-2025-5?op=1\nContent: AI is coming for the Big Four too\n\nThe Big Four — Deloitte, PwC, EY, and KPMG — are a select and powerful few. They dominate the professional services industry and have done so for decades.\n\nBut all empires fall eventually. Large corporations tend to merge, transform, or get replaced by the latest wave of innovative upstarts. [...] In 2023, KPMG said its plan to invest $2 billion in artificial intelligence and cloud services over the next five years would generate more than $12 billion in revenue over that period.\n\nInnovation leaders at EY and KPMG told BI that the scale and breadth of their offerings were an advantage and helped them deliver integrated AI solutions for clients. [...] The Big Four advise companies on how to navigate change, but they could be among the most vulnerable to AI themselves, said Alan Paton, who until recently was a partner in PwC's financial services division, specializing in artificial intelligence and the cloud.\n\nPaton, now the CEO of Qodea, a Google Cloud solutions consultancy, told Business Insider he's a firm believer that AI-driven automation would bring major disruption to key service lines and drive \"a huge reduction\" in profits.", "annotations": null}]
 **Output:**
 ```json
 {
    "reasoning": "The provided web search results collectively and clearly describe how the Big Four consulting firms are applying artificial intelligence—offering examples such as improved risk management, strategic consulting services, investment in AI, development of audit tools, and the general impact on their business models. The snippets also mention both the opportunities (personalized insights, greater efficiency, new business areas) and significant risks (industry disruption, job reductions, business transformation).\nThere is a variety of perspectives and specific details from different sources, which sufficiently addresses the query. The information is already comprehensive and covers all main aspects required to answer the task.\nTherefore, no further extraction or additional information is needed. I will set `need_more_information` as false. ",
    "need_more_information": false,
    "title": "",
    "url": "",
 }
 ```
--- a/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_deepresearch_summary_report.md
+++ b/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_deepresearch_summary_report.md
@@ -0,0 +1,53 @@
 You are a professional research report writer. Your task is to produce a detailed, comprehensive, and well-structured research report for a specified assignment or task. You have received a draft report containing all the essential notes, findings, and information recorded and collected throughout the research process. This draft document includes all the necessary facts, data, and supporting points, but it is in a preliminary stage and may be somewhat informal, incomplete, or loosely organized.
 ## Instructions
 Please revise the provided draft research report into a finalized, professional, comprehensive report in **Markdown** format that **addresses the original task and checklist** by following these instructions.
 1. Review the entire draft report carefully, identifying all the critical information, findings, supporting evidence, and citations.
 2. Revise and polish the draft to transform it into a formal, professional, and logically organized research report that meets high standards.
 3. Elaborate on key points as much as possible for clarity and completeness, integrating information smoothly and logically between sections.
 4. Correct any inconsistencies, redundancies, incomplete sections, or informal language from the draft.
 5. Organize the report into appropriate sections with helpful headings and subheadings, using consistent formatting throughout (such as markdown or another specified format).
 6. Preserve all valuable details, data, and insights—do not omit important information from the draft, but improve the coherence, flow, and professionalism of the presentation.
 7. Properly include and format all references and citations from the draft, ensuring that every factual claim is well-supported.
 ## Additional Requirements
 - Synthesize information from multiple levels of research depth
 - Integrate findings from various research branches
 - Present a coherent narrative that builds from foundational to advanced insights
 - Maintain proper citation of sources throughout
 - Have a minimum length of **500000 characters**
 - Use markdown tables, lists, and other formatting features when presenting comparative data, statistics, or structured information
 - Include relevant statistics, data, and concrete examples
 - Highlight connections between different research branches
 - You MUST determine your own concrete and valid opinion based on the given information. Do NOT defer to general and meaningless conclusions.
 - You MUST NOT include a table of contents. Start from the main report body directly.
 ### Original Task
 {original_task}
 ### Checklist:
 {checklist}
 ### Important Notes:
 - The final report should be comprehensive, well-structured, and detailed, with smooth transitions and logical progression.
 - The tone must be formal, objective, and professional throughout.
 - Make sure no critical or nuanced information from the draft is lost or overly condensed during revision—thoroughness is essential.
 - Check that all cited sources are accurately referenced.
 - Each section, subsection, and even bullet point MUST contain enough depth, relevant details, and specific information rather than being a brief summary of only a few sentences. 
 ### Report Format (Fill in appropriate content in [] and ... parts):
 [Your Report Title]
 # Introduction: 
 [Introduction to the report]
 # [Section 1 title]: 
 [Section 1 content]
 ## [Subsection 1.1 title]: 
 [Subsection 1.1 content]
 # [Section 2 title]: 
 ...
 # Conclusion: 
 [Conclusion to the report]
 Format your report professionally with consistent heading levels and proper spacing.
 Please do your best, this is very important to my career.
--- a/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_inprocess_report.md
+++ b/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_inprocess_report.md
@@ -0,0 +1,21 @@
 You are a professional researcher expert in writing comprehensive reports from your previous research results. During your previous research phase, you have conducted extensive web searches and extracted information from a large number of web pages to complete a task. You found that the knowledge you have acquired is a substantial amount of content, including both relevant information helpful for the task and irrelevant or redundant information. Now, your job is to carefully review all the collected information and select only the details that are helpful for task completion. Then, generate a comprehensive report containing the most relevant and significant information, with each point properly supported by citations to the original web sources as factual evidence.
 ## Instructions
 1. Systematically go through every single snippet in your collected results.
 2. Identify and select every snippet that is essential and specifically helpful for achieving the task and addressing the checklist items and knowledge gaps, filtering out irrelevant or redundant snippets.
 3. Generate a **comprehensive report** based on the selected useful snippets into a Markdown report and do not omit or excessively summarize any critical or nuanced information. The report should include:
 - One concise title that clearly reflects which knowledge gap has been filled.
 - Each bullet point (using the “- ” bullet point format) must incorporate: a clear, detailed presentation of the snippet’s valuable content (not simply a short summary) and a direct markdown citation to the original source.
 - Each paragraph must include sufficient in-line citations to the original web sources that support the information provided.
 4. Describe which **one** item in the knowledge gaps has been filled and how the tools were used to resolve it briefly as your **work log**, including the tool names and their input parameters.
 ## Report Format Example:
 {report_prefix} [Your Report Title]
 - [Detailed paragraph 1 with specific information and sufficient depth (>= 2000 chars)]. [Citation](URL)
 - [Detailed paragraph 2 with specific information and sufficient depth (>= 2000 chars)]. [Citation](URL)
 - ...
 ## Important Notes
 1. Avoid combining, excessively paraphrasing, omitting, or condensing any individual snippet that provides unique or relevant details. The final report must cover ALL key information as presented in the original results.
 2. Each bullet point should be sufficiently detailed (at least **2000 chars**) 
 3. Both items with and without `(EXPANSION)` tag in knowledge gaps list are important and useful for task completion.
--- a/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_reflect_failure.md
+++ b/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_reflect_failure.md
@@ -0,0 +1,47 @@
 Your job is to reflect on your failure based on your work history and generate the follow-up subtask. You have already found that one of the subtasks in the Working Plan cannot be successfully completed according to your work history.
 ## Instructions
 1. Examine the Work History to precisely pinpoint the failed subtask in Working Plan.
 2. Review the Current Subtask and Task Final Objective provided in Work History. Carefully analyze whether this subtask was designed incorrectly due to a misunderstanding of the task. If so,  
    * set `need_rephrase` in `rephrase_subtask` to true
    * Only replace the inappropriate subtask with the modified subtask, while keeping the rest of the Working Plan unchanged. You should output the updated Working Plan in `rephrased_plan`.
    * If the subtask was not poorly designed, proceed to Step 3.
 3. Carefully retrieve the previous subtask objective in Work History to check for any signs that you are getting stuck in **repetitive patterns** in generating similar subtasks.
    * If so, avoid unnecessary decomposition by setting `need_decompose` in `decompose_subtask` to false.
    * Otherwise, set `need_decompose` to true and only output the failed subtask without any additional reasoning in `failed_subtask`.
 ## Important Notes
 1. `need_decompose` and `need_rephrase` cannot be both true at the same time.
 2. Set `need_decompose` and `need_rephrase` to false simultaneously when you find that you are getting stuck in a repetitive failure pattern.
 ## Example
 Work History: 
 1. Reflect on the failure of this subtask and identify the failed subtask "Convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes by mapping tools or geo-mapping APIs".
 2. Decompose subtask "Convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes by mapping tools or geo-mapping APIs" and generate a plan. 
 Working Plan: 
 1. Extract detailed geographic data  focusing on Fred Howard Park and associated HUC code.
 2. Use mapping tools or geo-mapping APIs (e.g., 'maps_regeocode') to convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes.
 3. Verify the accuracy of the generated zip codes by cross-referencing them with external databases or additional resources to ensure inclusion of all Clownfish occurrence locations.
 4. Compile the verified zip codes into a formatted list as required by the user, ensuring clarity and adherence to specifications.
 Failed Subtask: "Use mapping tools or geo-mapping APIs (e.g., 'maps_regeocode') to convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes."
 Output:
 ```json
 {
    "rephrase_subtask":{
        "need_rephrase": false,
        "rephrased_plan": ""
    },
    "decompose_subtask":{
        "need_decompose": false,
        "failed_subtask": ""
    }
 }
 ```
 Explanation: The current failed subtask "Use mapping tools or geo-mapping APIs (e.g., 'maps_regeocode') to convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes" is similar to the previous failed subtask "Convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes by mapping tools or geo-mapping APIs", which has already been identified and decomposed in Work History. Therefore, we don't need to perform decomposition repeatedly.
 ### Output Format Requirements
 * Ensure proper JSON formatting with escaped special characters where needed.
 * Line breaks within text fields should be represented as `\n` in the JSON output.
 * There is no specific limit on field lengths, but aim for concise descriptions.
 * All field values must be strings.
 * For each JSON document, only include the following fields:
--- a/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_tool_usage_rules.md
+++ b/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_tool_usage_rules.md
@@ -0,0 +1,14 @@
 ### Tool usage rules
 1. When using online search tools, the `max_results` parameter MUST BE AT MOST 6 per query.
 2. When using online search tools, keep the `query` short and keyword-based (2-6 words ideal). The number should increase as the research depth increases, which means the deeper the research, the more detailed the query should be.
 2. The directory/file system that you can operate in is the following path: {tmp_file_storage_dir}. DO NOT try to save/read/modify files in other directories.
 3. Try to use local resources before going to online search. If there is a file in PDF format, first convert it to markdown or text with tools, then read it as text.
 4. You can basically use web search tools to search and retrieve whatever you want to know, including financial data, location, news, etc. The tools with names starting with "nlp_search" are search tools on special platforms.
 5. NEVER use `read_file` tool to read PDF files directly.
 6. DO NOT target generating PDF files unless the user specifies.
 7. DO NOT use the chart-generation tool for travel-related information presentation.
 8. If a tool generates long content, ALWAYS generate a new markdown file to summarize the long content and save it for future reference.
 9. When you need to generate a report, you are encouraged to add the content to the report file incrementally during your search or reasoning process, for example, by using the `edit_file` tool.
 10. When you use the `write_file` tool, you **MUST ALWAYS** remember to provide both the `path` and `content` parameters. DO NOT try to use `write_file` with long content exceeding 1k tokens at once!!!
 Finally, before each tool usage decision, carefully review the historical tool usage records to avoid the time and API costs caused by repeated execution. Remember that your balance is very low, so ensure absolute efficiency.
--- a/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_worker_additional_sys_prompt.md
+++ b/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/prompt_worker_additional_sys_prompt.md
@@ -0,0 +1,68 @@
 ## Additional Operation Notice
 ### Tools and Usage Overview
 **1. Search Tool (`{search_tool}`)**
 - Queries the online search engine and returns relevant URLs with snippets
 - Use this as your primary tool for discovering relevant information sources
 **2. Content Extraction Tool (`{extract_tool}`)**
 - Retrieves full webpage content from specific URLs
 - Use after identifying relevant URLs from search results
 - Note: Long content may be truncated in the response but will be saved as files in the file system for reference
 **3. Intermediate Summarization Tool (`{intermediate_summarize}`)**
 - Generates an intermediate report summarizing gathered information
 - Call this when you've collected sufficient information to address all Knowledge Gaps in the current task
 - The summary should directly address each item in the Knowledge Gaps checklist
 **4. Failure Reflection Tool (`{reflect_failure}`)**
 - Use when you cannot gather sufficient information to complete the current Knowledge Gaps
 - Helps document obstacles and reasoning for incomplete research
 **5. Subtask Completion Tool (`{subtask_finish}`)**
 - Call after generating an intermediate report with `{intermediate_summarize}`
 - Advances workflow to the next subtask
 **6. Response Generation Tool (`{finish_function_name}`)**
 - Call only when BOTH conditions are met:
  - Current subtask has Research Depth = 1
  - All Knowledge Gaps checklist items are marked as done (in Markdown format)
 **7. Utility Tools**
 - File operations (read/write) for accessing documented files
 - Bash command line for simple programming tasks and data processing
 - Use as needed to support your research workflow
 ### Operation Instruction
 1. You will receive a markdown-style checklist (i.e., `Knowledge Gaps` checklist) in your input instruction. This checklist outlines all required goals to complete your assignment.
 2. You need to decide your next step based on the gathered information and the `Knowledge Gaps` checklist. You should try your best to fulfill the checklist.
 3. ALWAYS try to search with your search tool `{search_tool}` at least once before using intermediate tool `{intermediate_summarize}`.
 ### Task/subtask Explanation
 1. Take **Working Plan** as a reference, working through EACH knowledge gap methodically with the following rules:
   - Items without the `(EXPANSION)` tag are fundamental to completing the current subtask.
   - Items with the `(EXPANSION)` tag are optional, though they can provide valuable supplementary information that is beneficial for enriching the depth and breadth of your final output. However, they may also bring some distracting information. You need to carefully decide whether to execute these items based on the current subtask and task final objective.
 2. Determine whether the current item in the `Knowledge Gaps` checklist has already been fully completed. If so, you should call the `{intermediate_summarize}` tool to summarize the results of this item into an in-process report file before starting the next item. After that, the finished item will be marked as `[x]` in the working plan to remind you to move on to the next item.
 3. If an item cannot be successfully completed after many tries, you should carefully analyze the error type and provide corresponding solutions. The error types and solutions include:
   - Tool corruption (e.g., unexpected status code, empty output result, tool function not found, invalid tool calling): adjust the tool and use valid parameter input.
   - Insufficient information (e.g., the search results did not yield any valuable information to solve the task): adjust and modify the tool inputs, then retry.
   - Missing prerequisite (e.g., needed prior unexplored knowledge or more detailed follow-up steps): call the `reflect_failure` tool for deeper reflection.
 4. When the current subtask is completed and **falls back to a previous subtask**, retrieve the completion progress of the previous subtask from your work history and continue from there, rather than starting from scratch.
 ### Important Constraints
 1. DO NOT TRY TO MAKE A PLAN yourself.
 2. ALWAYS FOLLOW THE WORKING PLAN SEQUENCE STEP BY STEP!!
 3. For each step, you MUST provide a reason or analysis to **review what was done in the previous step** and **explain why to call a function / use a tool in this step**. 
 4. After each action, YOU MUST seriously confirm that the current item in the plan is done before starting the next item, referring to the following rules: 
   - Carefully analyze whether the information obtained from the tool is sufficient to fill the knowledge gap corresponding to the current item. 
   - Pay more attention to details. Confidently assuming that all tool calls will bring complete information often leads to serious errors (e.g., mistaking the rental website name for the apartment name when renting).
 If the current item in the plan is done, call `summarize_inprocess_results_into_report` to generate an in-process report, then move on to the next item.
 5. Always pay attention to the current subtask and working plan as they may be updated during the workflow.
 6. Each time you reason and act, remember that **Current Subtask** is your primary goal, while **Final Task Objective** constrains your process from deviating from the final goal.
 7. You should use `{subtask_finish}` to mark that you have finished a subtask and proceed to the next one. 
 8. You should use the `{finish_function_name}` tool to return your research results when Research Depth = 1 and all checklist items are completed.
 ### Technical Constraints
 1. If you need to generate a long report with long content, generate it step by step: first use `write_file` with BOTH `path` and `content` (the structure or skeleton of the report in string) and later use the `edit_file` tool to gradually fill in content. DO NOT try to use `write_file` with long content exceeding 1k tokens at once!!!
--- a/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/promptmodule.py
+++ b/alias/src/alias/agent/agents/_dragent_utils/built_in_prompt/promptmodule.py
@@ -0,0 +1,132 @@
 from pydantic import BaseModel, Field
 class SubtasksDecomposition(BaseModel):
    """
    Model for structured subtask decomposition output in deep research.
    """
    knowledge_gaps: str = Field(
        description=(
            "A markdown checklist of essential knowledge gaps and optional "
            "perspective-expansion gaps (flagged with (EXPANSION)), each on its own line. "
            "E.g. '- [ ] Detailed analysis of JD.com's ...\\n- [ ] (EXPANSION) X...'."
        ),
    )
    working_plan: str = Field(
        description=(
            "A logically ordered step-by-step working plan (3-5 steps), "
            "each step starting with its number (1., 2., etc), including both "
            "core and expansion steps. Expanded steps should be clearly marked "
            "with (EXPANSION) and provide contextual or analytical depth.."
        ),
    )
 class WebExtraction(BaseModel):
    """
    Model for structured follow-up web extraction output in deep research.
    """
    reasoning: str = Field(
        description=(
            "The reasoning for your decision, including a summary of "
            "evidence and logic for whether more information is needed."
        ),
    )
    need_extraction: bool = Field(
        description="Whether more information is to be extracted.",
    )
    url: str = Field(
        description=(
            "Direct URL to the original search result requiring further "
            "extraction, or an empty string if not applicable."
        ),
    )
 class FollowupJudge(BaseModel):
    """
    Model for structured follow-up decompose judging output in deep research.
    """
    reasoning: str = Field(
        description=(
            "The reasoning for your decision, including a summary of evidence "
            "and logic for whether more information is needed. You should "
            "include specific gaps or opportunities if the current "
            "information is still insufficient"
        ),
    )
    knowledge_gap_revision: str = Field(
        "Revise the knowledge gaps in the current. "
        "Mark the gaps with sufficient information as [x]."
    )
    to_further_explore: bool = Field(
        description=(
            "whether the information content is adequate "
            "or need to further explore (as `subtask`)."
        ),
    )
    subtask: str = Field(
        description=(
            "Actionable description of the follow-up task to obtain needed "
            "information, focused research question/direction, "
            "or an empty string if not applicable."
        ),
    )
 class ReflectFailure(BaseModel):
    """
    Model for structured failure reflection output in deep research.
    """
    rephrase_subtask: dict = Field(
        description=(
            "Information about whether the problematic subtask needs to be "
            "rephrased due to a design flaw or misunderstanding. If rephrasing "
            "is needed, provide the modified working plan with only the inappropriate "
            "subtask replaced by its improved version."
        ),
        json_schema_extra={
            "additionalProperties": {
                "type": "object",
                "properties": {
                    "need_rephrase": {
                        "type": "boolean",
                        "description": "Set to 'true' if the failed subtask "
                                       "needs to be rephrased due to a design "
                                       "flaw or misunderstanding; otherwise, 'false'.",
                    },
                    "rephrased_plan": {
                        "type": "string",
                        "description": "The modified working plan with only the inappropriate "
                                       "subtask replaced by its improved version. If no "
                                       "rephrasing is needed, provide an empty string.",
                    },
                }
            }
        }
    )
    decompose_subtask: dict = Field(
        description=(
            "Information about whether the problematic subtask should be further "
            "decomposed. If decomposition is required, provide the failed subtask "
            "and the reason for its decomposition."
        ),
        json_schema_extra={
            "additionalProperties": {
                "type": "object",
                "properties": {
                    "need_decompose": {
                        "type": "boolean",
                        "description": "Set to 'true' if the failed subtask should "
                                       "be further decomposed; otherwise, 'false'.",
                    },
                    "failed_subtask": {
                        "type": "string",
                        "description": "The failed subtask that needs to be further "
                                        "decomposed.",
                    },
                }
            }
        }
    )
--- a/alias/src/alias/agent/agents/_dragent_utils/utils.py
+++ b/alias/src/alias/agent/agents/_dragent_utils/utils.py
@@ -0,0 +1,296 @@
 # -*- coding: utf-8 -*-
 """The utilities for deep research agent"""
 import os
 import json
 from typing import Union, Sequence, Any, Type
 from pydantic import BaseModel
 import re
 from agentscope.tool import Toolkit, ToolResponse
 from agentscope.agent import ReActAgent
 TOOL_RESULTS_MAX_WORDS = 30000
 def get_prompt_from_file(
    file_path: str,
    return_json: bool,
 ) -> Union[str, dict]:
    """Get prompt from file"""
    with open(os.path.join(file_path), "r", encoding="utf-8") as f:
        if return_json:
            prompt = json.load(f)
        else:
            prompt = f.read()
    return prompt
 async def count_by_words(sentence: str) -> float:
    """Count words of a sentence"""
    words = re.findall(
        r"\w+|[^\w\s]",
        sentence,
        re.UNICODE
    )
    word_count = 0.0
    for word in words:
        if re.match(r"\w+", word):
            word_count += 1.0
    return word_count
 def generate_structure_output(**kwargs: Any) -> ToolResponse:
    """Generate a structured output tool response.
    This function is designed to be used as a tool function for generating
    structured outputs. It takes arbitrary keyword arguments and wraps them
    in a ToolResponse with metadata.
    Args:
        **kwargs: Arbitrary keyword arguments that should match the format
            of the expected structured output specification.
    Returns:
        ToolResponse: A tool response object with empty content and the
            provided kwargs as metadata.
    Note:
        The input parameters should be in the same format as the specification
        and include as much detail as requested by the calling context.
    """
    return ToolResponse(content=[], metadata=kwargs)
 def get_dynamic_tool_call_json(data_model_type: Type[BaseModel]) -> list[dict]:
    """Generate JSON schema for dynamic tool calling with a given data model.
    Creates a temporary toolkit, registers the structure output function,
    and configures it with the specified data model to generate appropriate
    JSON schemas for tool calling.
    Args:
        data_model_type: A Pydantic BaseModel class that defines the expected
            structure of the tool output.
    Returns:
        A dictionary containing the JSON schemas for the configured tool,
        suitable for use in API calls that support structured outputs.
    Example:
        class MyModel(BaseModel):
            name: str
            value: int
        schema = get_dynamic_tool_call_json(MyModel)
    """
    tmp_toolkit = Toolkit()
    tmp_toolkit.register_tool_function(generate_structure_output)
    tmp_toolkit.set_extended_model(
        "generate_structure_output",
        data_model_type,
    )
    return tmp_toolkit.get_json_schemas()
 def get_structure_output(blocks: list | Sequence) -> dict:
    """Extract structured output from a sequence of blocks.
    Processes a list or sequence of blocks to extract tool use outputs
    and combine them into a single dictionary. This is typically used
    to parse responses from language models that include tool calls.
    Args:
        blocks: A list or sequence of blocks that may contain tool use
            information. Each block should be a dictionary with 'type'
            and 'input' keys for tool use blocks.
    Returns:
        A dictionary containing the combined input data from all tool
        use blocks found in the input sequence.
    Example:
        blocks = [
            {"type": "tool_use", "input": {"name": "test"}},
            {"type": "text", "content": "Some text"},
            {"type": "tool_use", "input": {"value": 42}}
        ]
        result = PromptBase.get_structure_output(blocks)
        # result: {"name": "test", "value": 42}
    """
    dict_output = {}
    for block in blocks:
        if isinstance(block, dict) and block.get("type") == "tool_use":
            dict_output.update(block.get("input", {}))
    return dict_output
 def load_prompt_dict() -> dict:
    """Load prompt into dict"""
    prompt_dict = {}
    cur_dir = os.path.dirname(os.path.abspath(__file__))
    prompt_dict["add_note"] = get_prompt_from_file(
        file_path=os.path.join(
            cur_dir,
            "built_in_prompt/prompt_worker_additional_sys_prompt.md",
        ),
        return_json=False,
    )
    prompt_dict["tool_use_rule"] = get_prompt_from_file(
        file_path=os.path.join(
            cur_dir,
            "built_in_prompt/prompt_tool_usage_rules.md",
        ),
        return_json=False,
    )
    prompt_dict["decompose_sys_prompt"] = get_prompt_from_file(
        file_path=os.path.join(
            cur_dir,
            "built_in_prompt/prompt_decompose_subtask.md",
        ),
        return_json=False,
    )
    prompt_dict["expansion_sys_prompt"] = get_prompt_from_file(
        file_path=os.path.join(
            cur_dir,
            "built_in_prompt/prompt_deeper_expansion.md",
        ),
        return_json=False,
    )
    prompt_dict["summarize_sys_prompt"] = get_prompt_from_file(
        file_path=os.path.join(
            cur_dir,
            "built_in_prompt/prompt_inprocess_report.md",
        ),
        return_json=False,
    )
    prompt_dict["reporting_sys_prompt"] = get_prompt_from_file(
        file_path=os.path.join(
            cur_dir,
            "built_in_prompt/prompt_deepresearch_summary_report.md",
        ),
        return_json=False,
    )
    prompt_dict["reflect_sys_prompt"] = get_prompt_from_file(
        file_path=os.path.join(
            cur_dir,
            "built_in_prompt/prompt_reflect_failure.md",
        ),
        return_json=False,
    )
    prompt_dict["reasoning_prompt"] = (
        "## Current Subtask:\n{objective}\n"
        "## Working Plan:\n{plan}\n"
        "{knowledge_gap}\n"
        "## Research Depth:\n{depth}"
    )
    prompt_dict["previous_plan_inst"] = (
        "## Previous Plan:\n{previous_plan}\n"
        "## Current Subtask:\n{objective}\n"
    )
    prompt_dict["max_depth_hint"] = (
        "The search depth has reached the maximum limit. So the "
        "current subtask can not be further decomposed and "
        "expanded anymore. I need to find another way to get it "
        "done no matter what."
    )
    prompt_dict["expansion_inst"] = (
        "Review the web search results and identify whether "
        "there is any information that can potentially help address "
        "checklist items or fulfill knowledge gaps of the task, "
        "but whose content is limited or only briefly mentioned.\n"
        "**Ultimate Task Checklist:**\n{checklist}\n"
        "**Current Knowledge Gaps:**\n{knowledge_gaps}\n"
        "**Current Search Query:**\n{search_query}\n"
        "**Search Results:**\n{search_results}\n"
        "**Output:**\n"
    )
    prompt_dict["follow_up_judge_sys_prompt"] = (
        "1. You have conducted a web search and extraction "
        "to obtain additional information. Now, you assess whether, "
        "after both the web search and extraction process, "
        "the information content is adequate to "
        "address the given task. Mark those items in `Current Knowledge Gaps` "
        " as [x] if there is information for that. \n"
        "2. If the gathered information inspires you, "
        "and you believe diving deeper following this can help providing more "
        "comprehensive analysis of the user query, "
        "formulate the dive-deeper plan in `subtask` field; "
        "otherwise, you can leave it empty."
    )
    prompt_dict[
        "retry_hint"
    ] = "Something went wrong when {state}. I need to retry."
    prompt_dict["need_deeper_hint"] = (
        "The information is insufficient and I need to make deeper "
        "research to fill the knowledge gap."
    )
    prompt_dict[
        "sufficient_hint"
    ] = "The information after web search and extraction is sufficient enough!"
    prompt_dict["no_result_hint"] = (
        "I mistakenly called the `summarize_intermediate_results` tool as "
        "there exists no milestone result to summarize now."
    )
    prompt_dict["summarize_hint"] = (
        "Based on your work history above, examine which step in the "
        "following working plan has been completed. Mark the fulfill "
        "knowledge gap with [x] (e.g., [x] Search yyy; [x] learn zzz) "
        "and leave the uncompleted steps unchanged. You MUST return only "
        "the updated plan, preserving exactly the same format as the "
        "original plan. Do not include any explanations, reasoning, "
        "or section headers such as '## Knowledge Gaps:', just output the"
        "updated status itself."
        "\n\n## Knowledge Gaps:\n{knowledge_gaps}"
    )
    prompt_dict["summarize_inst"] = (
        "**Ultimate Task:**\n{objective}\n"
        "**Ultimate Checklist:**\n{root_gaps}\n"
        "**Knowledge Gaps:**\n{cur_gaps}\n"
        "**Gathered Information:**\n{tool_result}"
    )
    prompt_dict["update_report_hint"] = (
        "To condense the gathered information, I have replaced the "
        "original bulk search results from the research phase with the "
        "following report that consolidates and summarizes the essential "
        "findings:\n {intermediate_report}\n\n"
        "Such report has been saved to the {report_path}. "
    )
    prompt_dict["save_report_hint"] = (
        "The milestone results of the current item in working plan "
        "are summarized into the following report:\n{intermediate_report}"
    )
    prompt_dict["reflect_instruction"] = (
        "## Work History:\n{conversation_history}\n"
        "## Current Objective:\n{objective}\n"
        "## Working Plan:\n{plan}\n"
        "## Knowledge Gaps:\n{knowledge_gaps}\n"
    )
    prompt_dict["subtask_complete_hint"] = (
        "Subtask ‘{cur_obj}’ is completed. Now the current subtask "
        "fallbacks to '{next_obj}'"
    )
    return prompt_dict
--- a/alias/src/alias/agent/agents/_meta_planner.py
+++ b/alias/src/alias/agent/agents/_meta_planner.py
@@ -0,0 +1,574 @@
 # -*- coding: utf-8 -*-
 """
 Meta Planner agent class that can handle complicated tasks with
 planning-execution pattern.
 """
 # pylint: disable=W0613
 import os
 import uuid
 from functools import partial
 from typing import Optional, Any, Literal, Callable
 import json
 from pathlib import Path
 from pydantic import BaseModel, Field
 from agentscope import logger
 from agentscope.message import Msg, ToolUseBlock, TextBlock, ToolResultBlock
 from agentscope.tool import ToolResponse
 from agentscope.model import ChatModelBase
 from agentscope.formatter import FormatterBase
 from agentscope.memory import MemoryBase
 from alias.agent.agents import AliasAgentBase
 from alias.agent.tools import AliasToolkit
 from ._planning_tools import (  # pylint: disable=C0411
    PlannerNoteBook,
    RoadmapManager,
    WorkerManager,
    share_tools,
 )
 from ._agent_hooks import (
    update_user_input_pre_reply_hook,
    planner_compose_reasoning_msg_pre_reasoning_hook,
    planner_remove_reasoning_msg_post_reasoning_hook,
    save_post_reasoning_state,
    save_post_action_state,
    generate_response_post_action_hook,
    planner_load_states_pre_reply_hook,
 )
 from ..utils.constants import (
    PLANNER_MAX_ITER,
    DEFAULT_PLANNER_NAME,
 )
 class MetaPlannerResponseWithClarification(BaseModel):
    require_clarification: bool = Field(
        ...,
        description=(
            "Check If the provide task description is unclear, too general or "
            "lack necessary information."
        ),
    )
    clarification_analysis: str = Field(
        default="",
        description=(
            "identify the missing information "
            "so that if the user provides clarification or more details, "
            "you can have clearer goal and can better handle the task."
        ),
    )
    clarification_question: str = Field(
        default="",
        description=(
            "If the provide task description is unclear, too general or "
            "lack necessary information, generate the `clarification` field. "
            "Otherwise, leave it empty."
        ),
    )
    clarification_options: list[str] = Field(
        default=[],
        description=(
            "Provide two to three possible candidate answers to the "
            "clarification_question as hints for the user."
        ),
    )
    task_conclusion: str = Field(
        ...,
        description=(
            "If the task has been done, generate a conclusion."
            "The conclusion should contain"
            "1) what you have done,"
            "2) whether the task have been complete completely or "
            "just partially,"
            "3) what are the key deliverables (files/webpages/images, etc) "
            "you have generated."
        ),
    )
 MetaPlannerResponseWithClarificationPrompt = (
    "The `{func_name}` should be called when either you want to request "
    "additional information from user to clarify the task, or you believe "
    "the task has been done and you want to give a final description. "
    "The `response` field needs to be a string that briefly summarize your "
    "thought in ONE sentence."
 )
 class MetaPlannerResponseNoClarification(BaseModel):
    task_conclusion: str = Field(
        ...,
        description=(
            "If the task has been done, generate a conclusion."
            "The conclusion should contain"
            "1) what you have done,"
            "2) whether the task have been complete completely or "
            "just partially,"
            "3) what are the key deliverables (files/webpages/images, etc) "
            "you have generated."
        ),
    )
 MetaPlannerResponseNoClarificationPrompt = (
    "The `{func_name}` should be called when you believe "
    "the task has been done and you want to give a final description. "
    "The `task_conclusion` field needs to be a string that "
    "briefly summarize your thought in ONE sentence."
 )
 class MetaPlanner(AliasAgentBase):
    """
    A meta-planning agent that extends ReActAgent with enhanced planning
    capabilities. The MetaPlanner is designed to handle complex multistep
    planning tasks by leveraging a combination of reasoning and action
    capabilities. The subtasks will be solved by dynamically create ReAct
    worker agent and provide it with necessary tools.
    """
    def __init__(
        self,
        model: ChatModelBase,
        worker_full_toolkit: AliasToolkit,
        formatter: FormatterBase,
        memory: MemoryBase,
        toolkit: AliasToolkit,
        browser_toolkit: AliasToolkit,
        agent_working_dir: str,
        sys_prompt: Optional[str] = None,
        max_iters: int = 10,
        state_saving_dir: Optional[str] = None,
        planner_mode: Literal["disable", "dynamic", "enforced"] = "dynamic",
        session_service: Any = None,
        enable_clarification: bool = True,
    ) -> None:
        """
        Initialize the MetaPlanner with the given parameters.
        Args:
            model (ChatModelBase):
                The primary chat model used for reasoning and response
                generation.
            worker_full_toolkit (AliasToolkit):
                Complete set of tools available to the worker agent.
            formatter (FormatterBase):
                Formatter for formatting messages to the model API provider's
                format.
            memory (MemoryBase):
                Memory system for storing conversation history and context.
            toolkit (AliasToolkit):
                Toolkit for managing tools available to the agent.
            agent_working_dir (str):
                Directory for agent's file operations.
            sys_prompt (str, optional):
                Meta planner's system prompt
            max_iters (int, optional):
                Maximum number of planning iterations. Defaults to 10.
            state_saving_dir (Optional[str], optional):
                Directory to save the agent's state. Defaults to None.
            planner_mode (bool, optional):
                Enable planner mode for solving tasks. Defaults to True.
        """
        if sys_prompt is None:
            self.base_sys_prompt = (
                f"You are a helpful assistant named {DEFAULT_PLANNER_NAME}."
                "If a given task can not be done easily, then you may need "
                "to use the tool `enter_planning_execution_mode` to "
                "change yourself to a more long-term planning mode."
                "If you need tool supplement for easier task, you can call "
                "`enter_easy_task_mode` to ask for more tools."
            )
        else:
            self.base_sys_prompt = sys_prompt
        # Call super().__init__() early to initialize StateModule attributes
        super().__init__(
            name=DEFAULT_PLANNER_NAME,
            sys_prompt=self.base_sys_prompt,
            model=model,
            formatter=formatter,
            memory=memory,
            toolkit=toolkit,
            max_iters=max_iters,
            session_service=session_service,
            state_saving_dir=state_saving_dir,
        )
        self.browser_toolkit = browser_toolkit
        self.agent_working_dir_root = agent_working_dir
        self.task_dir = self.agent_working_dir_root
        self.worker_full_toolkit = worker_full_toolkit
        self.register_state("task_dir")
        self.register_state("agent_working_dir_root")
        # adjust ReActAgent parameters
        if enable_clarification:
            self._required_structured_model = (
                MetaPlannerResponseWithClarification
            )
            response_func = self.toolkit.tools.get(self.finish_function_name)
            response_func.json_schema[
                "description"
            ] = response_func.json_schema.get(
                "description",
                "",
            ) + MetaPlannerResponseWithClarificationPrompt.format_map(
                {
                    "func_name": self.finish_function_name,
                },
            )
        else:
            self._required_structured_model = (
                MetaPlannerResponseNoClarification
            )
            response_func = self.toolkit.tools.get(self.finish_function_name)
            response_func.json_schema[
                "description"
            ] = response_func.json_schema.get(
                "description",
                "",
            ) + MetaPlannerResponseNoClarificationPrompt.format_map(
                {
                    "func_name": self.finish_function_name,
                },
            )
            self._sys_prompt += "Notice: NEVER ask for clarification!"
        self.reply: Callable = partial(
            self.reply,
            structured_model=self._required_structured_model,
        )
        self.max_iters: int = max(self.max_iters, PLANNER_MAX_ITER)
        # for debugging and state resume, we need a flag to indicate
        self.planner_mode = planner_mode
        self.work_pattern: Literal[
            "simplest",
            "worker",
            "planner",
        ] = "simplest"
        self.register_state("planner_mode")
        self.register_state("work_pattern")
        self.planner_notebook = None
        self.roadmap_manager, self.worker_manager = None, None
        if planner_mode in ["dynamic", "enforced"]:
            self.planner_notebook = PlannerNoteBook()
            self.planner_notebook.full_tool_list = (
                self._get_full_worker_tool_list()
            )
            self.prepare_planner_tools(planner_mode)
            self.register_state(
                "planner_notebook",
                lambda x: x.model_dump(),
                lambda x: PlannerNoteBook(**x),
            )
        # pre-reply hook
        self.register_instance_hook(
            "pre_reply",
            "planner_load_states_pre_reply_hook",
            planner_load_states_pre_reply_hook,
        )
        self.register_instance_hook(
            "pre_reply",
            "update_user_input_to_notebook_pre_reply_hook",
            update_user_input_pre_reply_hook,
        )
        # pre-reasoning hook
        self.register_instance_hook(
            "pre_reasoning",
            "planner_compose_reasoning_msg_pre_reasoning_hook",
            planner_compose_reasoning_msg_pre_reasoning_hook,
        )
        # post_reasoning hook
        self.register_instance_hook(
            "post_reasoning",
            "planner_remove_reasoning_msg_post_reasoning_hook",
            planner_remove_reasoning_msg_post_reasoning_hook,
        )
        self.register_instance_hook(
            "post_reasoning",
            "save_state_post_reasoning_hook",
            save_post_reasoning_state,
        )
        # post_action_hook
        self.register_instance_hook(
            "post_acting",
            "save_post_action_state",
            save_post_action_state,
        )
        self.register_instance_hook(
            "post_acting",
            "generate_response_post_action_hook",
            generate_response_post_action_hook,
        )
    def prepare_planner_tools(
        self,
        planner_mode: Literal["disable", "enforced", "dynamic"],
    ) -> None:
        """
        Prepare tool to planning depending on the selected mode.
        """
        assert self.planner_notebook
        self.roadmap_manager = RoadmapManager(
            planner_notebook=self.planner_notebook,
        )
        self.worker_manager = WorkerManager(
            worker_model=self.model,
            worker_formatter=self.formatter,
            planner_notebook=self.planner_notebook,
            agent_working_dir=self.task_dir,
            worker_full_toolkit=self.worker_full_toolkit,
            session_service=self.session_service,
            sandbox=self.toolkit.sandbox,
        )
        # clean
        self.toolkit.remove_tool_groups("planning")
        self.toolkit.create_tool_group(
            "planning",
            "Tool group for planning capability",
        )
        # re-register planning tool to enable loading the correct info
        self.toolkit.register_tool_function(
            self.roadmap_manager.decompose_task_and_build_roadmap,
            group_name="planning",
        )
        self.toolkit.register_tool_function(
            self.roadmap_manager.revise_roadmap,
            group_name="planning",
        )
        self.toolkit.register_tool_function(
            self.roadmap_manager.get_next_unfinished_subtask_from_roadmap,
            group_name="planning",
        )
        self.toolkit.register_tool_function(
            self.worker_manager.show_current_worker_pool,
            group_name="planning",
        )
        self.toolkit.register_tool_function(
            self.worker_manager.create_worker,
            group_name="planning",
        )
        self.toolkit.register_tool_function(
            self.worker_manager.execute_worker,
            group_name="planning",
        )
        if planner_mode == "dynamic":
            if "enter_planning_execution_mode" not in self.toolkit.tools:
                self.toolkit.register_tool_function(
                    self.enter_planning_execution_mode,
                )
            if "enter_easy_task_mode" not in self.toolkit.tools:
                self.toolkit.register_tool_function(
                    self.enter_easy_task_mode,
                )
            # Only activate after agent decides to enter the
            # planning-execution mode
            self.toolkit.update_tool_groups(["planning"], False)
        elif planner_mode == "enforced":
            self.toolkit.update_tool_groups(["planning"], True)
            # use the self.agent_working_dir as working dir
            self._update_toolkit_and_sys_prompt()
    def _ensure_file_system_functions(self) -> None:
        required_tool_list = [
            "read_file",
            "write_file",
            "edit_file",
            "create_directory",
            "list_directory",
            "directory_tree",
            "list_allowed_directories",
            "run_shell_command",
        ]
        # Traditional AliasToolkit mode
        for tool_name in required_tool_list:
            if tool_name not in self.worker_full_toolkit.tools:
                raise ValueError(
                    f"{tool_name} must be in the worker toolkit and "
                    "its tool group must be active for complicated.",
                )
        share_tools(
            self.worker_full_toolkit,
            self.toolkit,
            required_tool_list,
        )
    async def _create_task_directory(
        self,
    ) -> None:
        create_task_dir = ToolUseBlock(
            type="tool_use",
            id=str(uuid.uuid4()),
            name="create_directory",
            input={
                "path": self.task_dir,
            },
        )
        tool_res = await self.toolkit.call_tool_function(create_task_dir)
        tool_res_msg = Msg(
            "system",
            content=[
                ToolResultBlock(
                    type="tool_result",
                    output=[],
                    name="create_directory",
                    id=create_task_dir["id"],
                ),
            ],
            role="system",
        )
        async for chunk in tool_res:
            # Turn into a tool result block
            tool_res_msg.content[0]["output"] = chunk.content
        await self.print(tool_res_msg)
    async def enter_planning_execution_mode(
        self,
        task_name: str,
    ) -> ToolResponse:
        """
        When the user task meets any of the following conditions, enter the
        solving complicated task mode by using this tool.
        1. the task cannot be done within 15 reasoning-acting iterations;
        2. the task cannot be done by the current tools you can see;
        3. the task is related to comprehensive research or information
            gathering
        4. some step requires browser operations (browsing webpages like
            Github & Arxiv, or need operations like book tickets)
        Args:
            task_name (`str`):
                Given a name to the current task as an indicator. Because
                this name will be used to create a directory, so try to
                use "_" instead of space between words, e.g. "A_NEW_TASK".
        """
        # build directory for the task
        self._ensure_file_system_functions()
        self.task_dir = os.path.join(
            self.agent_working_dir_root,
            task_name,
        )
        await self._create_task_directory()
        self.worker_manager.agent_working_dir = self.task_dir
        self._update_toolkit_and_sys_prompt()
        return ToolResponse(
            metadata={"success": True},
            content=[
                TextBlock(
                    type="text",
                    text=(
                        "Successfully enter the planning-execution mode to "
                        "solve complicated task. "
                        "All the file operations, including "
                        "read/write/modification, should be done in directory "
                        f"{self.task_dir}"
                    ),
                ),
            ],
        )
    async def enter_easy_task_mode(
        self,
        task_name: str,
        additional_task_tools: list[str],
    ) -> ToolResponse:
        """
        When the user request meet all following conditions, enter the
        solving easy task mode by using this tool.
        1. the task can be done within 15 reasoning-acting iterations;
        2. the task requires only 3-5 additional tools to finish;
        3. NO NEED to use browser operations
        Args:
            task_name (`str`):
                Given a name to the current task as an indicator. Because
                this name will be used to create a directory, so try to
                use "_" instead of space between words, e.g. "A_NEW_TASK".
            additional_task_tools (List[`str`]):
                Given three to five (3 - 5) additional tools that are
                necessary for solving this easy task.
        """
        self._ensure_file_system_functions()
        self._sys_prompt = self.base_sys_prompt
        share_tools(
            self.worker_full_toolkit,
            self.toolkit,
            additional_task_tools,
        )
        self.task_dir = os.path.join(
            self.agent_working_dir_root,
            task_name,
        )
        await self._create_task_directory()
        self.work_pattern = "worker"
        available_tool_names = [
            item.get("function", {}).get("name")
            for item in list(self.toolkit.get_json_schemas())
        ]
        return ToolResponse(
            metadata={"success": True},
            content=[
                TextBlock(
                    type="text",
                    text=(
                        "Successfully enter the easy task mode to "
                        "solve task. "
                        "All the file operations, including "
                        "read/write/modification, should be done in directory "
                        f"{self.task_dir}"
                        f"Current available tools: {available_tool_names}"
                    ),
                ),
            ],
        )
    def _update_toolkit_and_sys_prompt(self) -> None:
        # change agent settings for solving complicated task
        with open(
            Path(__file__).parent
            / "_built_in_long_sys_prompt"
            / "meta_planner_sys_prompt.md",
            "r",
            encoding="utf-8",
        ) as f:
            sys_prompt = f.read()
        sys_prompt = sys_prompt.format_map(
            {
                "tool_list": json.dumps(
                    self._get_full_worker_tool_list(),
                    ensure_ascii=False,
                ),
            },
        )
        self._sys_prompt = sys_prompt  # pylint: disable=W0201
        self.toolkit.update_tool_groups(["planning"], True)
        self.work_pattern = "planner"
    def resume_planner_tools(self) -> None:
        """Resume the planner notebook for tools"""
        self.prepare_planner_tools(self.planner_mode)
        if self.work_pattern == "planner":
            self._update_toolkit_and_sys_prompt()
    def _get_full_worker_tool_list(self) -> list[dict]:
        full_worker_tool_list = [
            {
                "tool_name": func_dict.get("function", {}).get("name", ""),
                "description": func_dict.get("function", {}).get(
                    "description",
                    "",
                ),
            }
            for func_dict in self.worker_full_toolkit.get_json_schemas()
        ]
        return full_worker_tool_list
--- a/alias/src/alias/agent/agents/_planning_tools/init.py
+++ b/alias/src/alias/agent/agents/_planning_tools/init.py
@@ -0,0 +1,24 @@
 # -*- coding: utf-8 -*-
 """planning tools"""
 from ._planning_notebook import (
    PlannerNoteBook,
    RoadMap,
    WorkerResponse,
    Update,
    WorkerInfo,
    SubTaskStatus,
 )
 from ._roadmap_manager import RoadmapManager
 from ._worker_manager import WorkerManager, share_tools
 __all__ = [
    "PlannerNoteBook",
    "RoadmapManager",
    "WorkerManager",
    "WorkerResponse",
    "RoadMap",
    "SubTaskStatus",
    "WorkerInfo",
    "Update",
    "share_tools",
 ]
--- a/alias/src/alias/agent/agents/_planning_tools/_planning_notebook.py
+++ b/alias/src/alias/agent/agents/_planning_tools/_planning_notebook.py
@@ -0,0 +1,327 @@
 # -*- coding: utf-8 -*-
 # pylint: disable=E0213
 """
 Data structures about the roadmap for complicated tasks
 """
 from datetime import datetime
 from typing import List, Literal, Tuple, Optional, Any, Dict
 from pydantic import BaseModel, Field, field_validator
 def get_current_time_message() -> str:
    """
    Returns the current time as a formatted string.
    Returns:
        str: The current time formatted as 'YYYY-MM-DD HH:MM:SS'.
    """
    return f"Current time is {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
 WORKER_PROGRESS_SUMMARY = (
    "## Instruction\n"
    "Review the execution trace above and generate a comprehensive summary "
    "report in Markdown format that addresses the original task/query. "
    "Your report must include:\n\n"
    "1. **Task Overview**\n"
    "   - Include the original query/task verbatim;\n"
    "   - Briefly state the main objective.\n"
    "2. **Comprehensive Analysis**"
    "   - Provide a detailed, structured answer to the original query/task;\n"
    "   - Include all relevant information requested in the original task;\n"
    "   - Support your findings with specific references from your execution "
    "trace;\n"
    "   - Organize content into logical sections with appropriate headings;\n"
    "   - Include data visualizations, tables, or formatted lists when "
    "applicable.\n\n"
    "3. **Completion Checklist**\n"
    "   - Reproduce the original 'Expected Output' checklist of required "
    "tasks/information; **NEVER** makeup additional expected output items "
    "in the checklist\n"
    "   - Mark each item as [x] Completed or [ ] Incomplete;\n"
    "   - For each completed item, reference where in your report this "
    "information appears;\n"
    "   - For incomplete items, explain briefly why they remain unaddressed;\n"
    "4. **Conclusion**\n"
    "   - If the task is fully complete, provide a brief conclusion "
    "summarizing key findings;\n"
    "   - If the task remains incomplete, outline a specific plan to "
    "address remaining items, including:\n"
    "     - Which tools would be used;\n"
    "     - What information is still needed;\n"
    "     - Sequence of planned actions.\n\n"
    "Format your report professionally with consistent heading levels, "
    "proper spacing, and appropriate emphasis for key information."
 )
 WORKER_NEXT_STEP_INSTRUCTION = """
 If the subtask remains incomplete, outline a specific plan to address remaining
 items, including:
     - Which tools would be used
     - What information is still needed
     - Sequence of planned actions
 Leave it as an empty string is the subtask has been done successfully.
 """
 WORKER_FILE_COLLECTION_INSTRUCTION = (
    "Collect all files generated in the execution process, "
    "such as the files generated by `write_file` and `edit_file`."
    "This field MUST be in dictionary, where"
    "the keys are the paths of generated files "
    "(e.g. '/FULL/PATH/OF/FILE_1.md') and the values are short "
    "descriptions about the generated files."
 )
 class WorkerResponse(BaseModel):
    """
    Represents the response structure from a worker agent after task execution.
    This class defines the expected format for worker responses, including
    progress summaries, next steps, tool usage information, and task
    completion status.
    Attributes:
        subtask_progress_summary (str):
            Comprehensive summary report of task execution.
        generated_files (dict):
            Dictionary mapping file paths to descriptions of generated files.
        task_done (bool):
            Flag indicating whether the task has been completed.
    """
    subtask_progress_summary: str = Field(
        ...,
        description=WORKER_PROGRESS_SUMMARY,
    )
    generated_files: dict = Field(
        ...,
        description=WORKER_FILE_COLLECTION_INSTRUCTION,
    )
    task_done: bool = Field(
        ...,
        description="Whether task is done or it require addition effort",
    )
 class Update(BaseModel):
    """Represents an update record from a worker during task execution.
    This class tracks progress updates from workers as they work on subtasks,
    including status changes, progress summaries, and execution details.
    Attributes:
        reason_for_status (str): Explanation for the current status.
        task_done (bool): Whether the task has been completed.
        subtask_progress_summary (str): Summary of progress made.
        next_step (str): Description of planned next actions.
        worker (str): Identifier of the worker providing the update.
        attempt_idx (int): Index of the current attempt.
    """
    reason_for_status: str
    task_done: bool
    subtask_progress_summary: str
    next_step: str
    worker: str
    attempt_idx: int
    @field_validator(
        "subtask_progress_summary",
        "reason_for_status",
        "next_step",
        "worker",
        mode="before",
    )
    def _stringify(cls, v: Any) -> str:
        """ensure the attributes are string"""
        if v is None:
            return ""
        return str(v)
 class WorkerInfo(BaseModel):
    """Contains information about a worker agent assigned to a subtask.
    This class stores metadata about worker agents, including their
    capabilities, creation type, and configuration details.
    Attributes:
        worker_name (str):
            Name identifier of the worker.
        status (str):
            Current status of the worker.
        create_type (Literal["built-in", "dynamic-built"]):
            How the worker was created.
        description (str):
            Description of the worker's purpose and capabilities.
        tool_lists (List[str]):
            List of tools available to this worker.
        sys_prompt (str):
            System prompt used to configure the worker.
    """
    worker_name: str = ""
    status: str = ""
    create_type: Literal["built-in", "dynamic-built"] = "dynamic-built"
    description: str = ""
    # for dynamically create worker agents
    tool_lists: List[str] = Field(default_factory=list)
    sys_prompt: str = ""
    @field_validator(
        "worker_name",
        "status",
        mode="before",
    )
    def _stringify(cls, v: Any) -> str:
        if v is None:
            return ""
        return str(v)
 class SubTaskSpecification(BaseModel):
    """
    Details of a subtask within a larger task decomposition.
    Attributes:
        subtask_description (str)
        input_intro(str)
        exact_input(str)
        expected_output(str)
        desired_auxiliary_tools(str)
    """
    subtask_description: str = Field(
        ...,
        description="Description of the subtask.",
    )
    input_intro: str = Field(
        ...,
        description="Introduction or context for the subtask input.",
    )
    exact_input: str = Field(
        ...,
        description="The exact input data or parameters for the subtask.",
    )
    expected_output: str = Field(
        ...,
        description="The expected output data or parameters for the subtask.",
    )
    desired_auxiliary_tools: str = Field(
        ...,
        description="Tools that would be helpful for this subtask.",
    )
    @field_validator(
        "subtask_description",
        "input_intro",
        "exact_input",
        "expected_output",
        "desired_auxiliary_tools",
        mode="before",
    )
    def _stringify(cls, v: Any) -> str:
        if v is None:
            return ""
        return str(v)
 class SubTaskStatus(BaseModel):
    """
    Represents the status and details of a subtask within a
    larger task decomposition.
    This class tracks individual subtasks, their execution status,
    assigned workers, and progress updates throughout the execution lifecycle.
    Attributes:
        status (Literal["Planned", "In-process", "Done"]):
            Current execution status.
        updates (List[Update]):
            List of progress updates from workers.
        attempt (int):
            Number of execution attempts for this subtask.
        workers (List[WorkerInfo]):
            List of workers assigned to this subtask.
    """
    subtask_specification: SubTaskSpecification = Field(
        default_factory=SubTaskSpecification,
    )
    status: Literal["Planned", "In-process", "Done"] = "Planned"
    updates: List[Update] = Field(
        default_factory=list,
        description=(
            "List of updates from workers. "
            "MUST be empty list when initialized."
        ),
    )
    attempt: int = 0
    workers: List[WorkerInfo] = Field(
        default_factory=list,
        description=(
            "List of workers that have been assigned to this subtask."
            "MUST be EMPTY when initialize the subtask."
        ),
    )
 class RoadMap(BaseModel):
    """Represents a roadmap for task decomposition and execution tracking.
    This class manages the overall task breakdown, containing the original task
    description and a list of decomposed subtasks with their execution status.
    Attributes:
        original_task (str):
            The original task description before decomposition.
        decomposed_tasks (List[SubTaskStatus]):
            List of subtasks created from the original task.
    """
    original_task: str = ""
    decomposed_tasks: List[SubTaskStatus] = Field(default_factory=list)
    def next_unfinished_subtask(
        self,
    ) -> Tuple[Optional[int], Optional[SubTaskStatus]]:
        """Find the next subtask that is not yet completed.
        Iterates through the decomposed tasks to find the first subtask
        with status "Planned" or "In-process".
        Returns:
            Tuple[Optional[int], Optional[SubTaskStatus]]: A tuple containing:
                - The index of the next unfinished subtask
                    (None if all tasks are done)
                - The SubTaskStatus object of the next unfinished subtask
                    (None if all tasks are done)
        """
        for i, subtask in enumerate(self.decomposed_tasks):
            if subtask.status in ["Planned", "In-process"]:
                return i, subtask
        return None, None
 class PlannerNoteBook(BaseModel):
    """
    Represents a planner notebook.
    Attributes:
        time (str): The current time message.
        user_input (List[str]): List of user inputs.
        detail_analysis_for_plan (str): Detailed analysis for the plan.
        roadmap (RoadMap): The roadmap associated with the planner.
        files (Dict[str, str]): Dictionary of files related to the planner.
        full_tool_list (dict[str, dict]): Full schema of tools.
    """
    time: str = Field(default_factory=get_current_time_message)
    user_input: List[str] = Field(default_factory=list)
    detail_analysis_for_plan: str = (
        "Unknown. Please call `build_roadmap_and_decompose_task` to analyze."
    )
    roadmap: RoadMap = Field(default_factory=RoadMap)
    files: Dict[str, str] = Field(default_factory=dict)
    full_tool_list: list[dict] = Field(default_factory=list)
--- a/alias/src/alias/agent/agents/_planning_tools/_roadmap_manager.py
+++ b/alias/src/alias/agent/agents/_planning_tools/_roadmap_manager.py
@@ -0,0 +1,295 @@
 # -*- coding: utf-8 -*-
 """
 Planning handler module for meta planner
 """
 from typing import Optional, Literal
 from agentscope.module import StateModule
 from agentscope.tool import ToolResponse
 from agentscope.message import TextBlock
 from ._planning_notebook import (
    PlannerNoteBook,
    SubTaskStatus,
    Update,
    SubTaskSpecification,
 )
 class RoadmapManager(StateModule):
    """Handles planning operations for meta planner agent.
    This class provides functionality for task decomposition, roadmap creation,
    and roadmap revision.
    """
    def __init__(
        self,
        planner_notebook: PlannerNoteBook,
    ):
        """Initialize the PlanningHandler.
        Args:
            planner_notebook (PlannerNoteBook):
                Data structure containing planning state.
        """
        super().__init__()
        self.planner_notebook = planner_notebook
        self.register_state(
            "planner_notebook",
            lambda x: x.model_dump(),
            lambda x: PlannerNoteBook(**x),
        )
    async def decompose_task_and_build_roadmap(
        self,
        user_latest_input: str,
        given_task_conclusion: str,
        detail_analysis_for_plan: str,
        decomposed_subtasks: list[SubTaskSpecification],
    ) -> ToolResponse:
        """
        1) Analyze the user task;
        2) Reasoning about the necessary steps to finish
        the whole task;
        3) Group those necessary steps as a few manageable subtasks that
        - the steps consisting the same subtask use same set of tools;
        - the steps in the same subtask DO NOT depend on later subtasks/steps;
        - each subtask's objectives should be clear and verifiable;
        - reasoning/analysis and generation/action for the same objective
            should be in the same subtask.
        Notice:
        You MUST pay attention to and follow the required format of the
        `decomposed_subtasks`.
        Args:
            user_latest_input (str):
                The latest user input. If there are multiple rounds
                of user input, faithfully record the latest user input.
            given_task_conclusion (str):
                The user's task to decompose. If there are multiple rounds
                of user input, analysis and give the key idea of the task that
                the user really you to solve.
            detail_analysis_for_plan (str):
                A detailed analysis of how a task should be decomposed.
            decomposed_subtasks (list[SubTaskSpecification]):
                List of subtasks that was decomposed.
        """
        self.planner_notebook.detail_analysis_for_plan = (
            detail_analysis_for_plan
        )
        self.planner_notebook.roadmap.original_task = given_task_conclusion
        for subtask in decomposed_subtasks:
            if isinstance(subtask, dict):
                subtask_status = SubTaskStatus(
                    subtask_specification=SubTaskSpecification(
                        **subtask,
                    ),
                )
            elif isinstance(subtask, SubTaskSpecification):
                subtask_status = SubTaskStatus(
                    subtask_specification=subtask,
                )
            else:
                raise TypeError(
                    "Unexpected type of `decomposed_subtasks`,"
                    "which is expected to strictly follow List of "
                    "SubTaskSpecification.",
                )
            self.planner_notebook.roadmap.decomposed_tasks.append(
                subtask_status,
            )
        # self.planner_notebook.user_input.append(user_latest_input)
        return ToolResponse(
            metadata={"success": True},
            content=[
                TextBlock(
                    type="text",
                    text="Successfully decomposed the task into subtasks",
                ),
            ],
        )
    async def get_next_unfinished_subtask_from_roadmap(self) -> ToolResponse:
        """
        Obtains the next unfinished subtask from the roadmap.
        """
        idx, subtask = self.planner_notebook.roadmap.next_unfinished_subtask()
        if idx is None or subtask is None:
            return ToolResponse(
                metadata={"success": False},
                content=[
                    TextBlock(
                        type="text",
                        text=(
                            "No unfinished subtask was found. "
                            "Either all subtasks have been done, or the task"
                            " has not been decomposed."
                        ),
                    ),
                ],
            )
        return ToolResponse(
            metadata={"success": True, "subtask": subtask},
            content=[
                TextBlock(
                    type="text",
                    text=f"Next unfinished subtask idx: {idx}",
                ),
                TextBlock(
                    type="text",
                    text=subtask.model_dump_json(indent=2),
                ),
            ],
        )
    async def revise_roadmap(
        self,
        action: Literal["add_subtask", "revise_subtask", "remove_subtask"],
        subtask_idx: int,
        subtask_specification: Optional[SubTaskSpecification] = None,
        update_to_subtask: Optional[Update] = None,
        new_status: Literal["Planned", "In-process", "Done"] = "In-process",
    ) -> ToolResponse:
        """After subtasks are done by worker agents, use this function to
        revise the progress and details of the current roadmap.
        Updates the status of subtasks and potentially revises input/output
        descriptions and required tools for tasks based on current progress
        and available information.
        Args:
            action (
                `Literal["add_subtask", "revise_subtask", "remove_subtask"]`
            ):
                Action to perform on the roadmap.
            subtask_idx (`int`):
                Index of the subtask to revise its status. This index starts
                with 0.
            subtask_specification (`SubTaskSpecification`):
                Revised subtask specification. When you use `add_subtask` or
                `revise_subtask` action, you MUST provide this field with
                revised `exact_input` and `expected_output` according to
                the execution context.
            update_to_subtask (`Update`):
                Generate an update record for this subtask based on the
                worker execution report. When you use `revise_subtask` action,
                you MUST provide this field.
            new_status  (`Literal["Planned", "In-process", "Done"]`):
                The new status of the subtask.
        Returns:
            ToolResponse:
                Response indicating success/failure of the revision
                and any updates made. May request additional human
                input if needed.
        """
        num_subtasks = len(self.planner_notebook.roadmap.decomposed_tasks)
        if isinstance(subtask_specification, dict):
            subtask_specification = SubTaskSpecification(
                **subtask_specification,
            )
        elif subtask_specification is None and action in [
            "add_subtask",
            "revise_subtask",
        ]:
            return ToolResponse(
                metadata={"success": False},
                content=[
                    TextBlock(
                        type="text",
                        text=(
                            f"Choosing {action} must have valid "
                            f"`subtask_specification` field."
                        ),
                    ),
                ],
            )
        if isinstance(update_to_subtask, dict):
            update_to_subtask = Update(
                **update_to_subtask,
            )
        elif update_to_subtask is None and action == "revise_subtask":
            return ToolResponse(
                metadata={"success": False},
                content=[
                    TextBlock(
                        type="text",
                        text=(
                            f"Choosing {action} must have valid "
                            f"`update_to_subtask` field."
                        ),
                    ),
                ],
            )
        if subtask_idx >= num_subtasks and action == "add_subtask":
            self.planner_notebook.roadmap.decomposed_tasks.append(
                SubTaskStatus(
                    subtask_specification=subtask_specification,
                    status="Planned",
                    updates=update_to_subtask,
                ),
            )
            return ToolResponse(
                metadata={"success": True},
                content=[
                    TextBlock(
                        type="text",
                        text=f"add new subtask with index {subtask_idx}.",
                    ),
                ],
            )
        elif subtask_idx >= num_subtasks:
            return ToolResponse(
                metadata={"success": False},
                content=[
                    TextBlock(
                        type="text",
                        text=(
                            f"Fail to update subtask {subtask_idx} status."
                            f"There are {num_subtasks} subtasks, "
                            f"idx {subtask_idx} is not supported with "
                            f"action {action}."
                        ),
                    ),
                ],
            )
        elif action == "revise_subtask" and update_to_subtask:
            subtask = self.planner_notebook.roadmap.decomposed_tasks[
                subtask_idx
            ]
            subtask.status = new_status
            subtask.updates.append(update_to_subtask)
            return ToolResponse(
                metadata={"success": True},
                content=[
                    TextBlock(
                        type="text",
                        text=f"Update subtask {subtask_idx} status.",
                    ),
                    TextBlock(
                        type="text",
                        text=self.planner_notebook.roadmap.decomposed_tasks[
                            subtask_idx
                        ].model_dump_json(indent=2),
                    ),
                ],
            )
        elif action == "remove_subtask":
            self.planner_notebook.roadmap.decomposed_tasks.pop(subtask_idx)
            return ToolResponse(
                metadata={"success": True},
                content=[
                    TextBlock(
                        type="text",
                        text=f"Remove subtask {subtask_idx} from roadmap.",
                    ),
                ],
            )
        else:
            raise ValueError(
                f"Not support action {action} on subtask {subtask_idx}",
            )
--- a/alias/src/alias/agent/agents/_planning_tools/_worker_manager.py
+++ b/alias/src/alias/agent/agents/_planning_tools/_worker_manager.py
@@ -0,0 +1,631 @@
 # -*- coding: utf-8 -*-
 """
 Coordination handler module for meta planner
 """
 import os
 from pathlib import Path
 import json
 from typing import Optional, Literal, List, Any
 import asyncio
 from agentscope import logger
 from agentscope.module import StateModule
 from agentscope.memory import InMemoryMemory, MemoryBase
 from agentscope.tool import ToolResponse
 from agentscope.message import Msg, TextBlock, ToolUseBlock, ToolResultBlock
 from agentscope.model import ChatModelBase, DashScopeChatModel
 from agentscope.formatter import FormatterBase, DashScopeChatFormatter
 from alias.runtime.alias_sandbox import AliasSandbox
 from alias.agent.tools import AliasToolkit
 from alias.agent.agents._react_worker import ReActWorker
 from alias.agent.agents._browser_agent import BrowserAgent
 from alias.agent.utils.constants import (
    WORKER_MAX_ITER,
    DEFAULT_BROWSER_WORKER_NAME,
 )
 from ._planning_notebook import (
    WorkerInfo,
    WorkerResponse,
 )
 from ._planning_notebook import (
    PlannerNoteBook,
 )
 def rebuild_reactworker(
    worker_info: WorkerInfo,
    old_toolkit: AliasToolkit,
    new_toolkit: AliasToolkit,
    memory: Optional[MemoryBase] = None,
    model: Optional[ChatModelBase] = None,
    formatter: Optional[FormatterBase] = None,
    exclude_tools: Optional[list[str]] = None,
 ) -> ReActWorker:
    """
    Rebuild a ReActAgent worker with specified configuration and tools.
    Creates a new ReActAgent using worker information and toolkit
    configuration. Tools are shared from the old toolkit to the new one,
    excluding any specified tools.
    Args:
        worker_info (WorkerInfo): Information about the worker including name,
            system prompt, and tool lists.
        old_toolkit (Toolkit): Source toolkit containing available tools.
        new_toolkit (Toolkit): Destination toolkit to receive shared tools.
        memory (Optional[MemoryBase], optional): Memory instance for the agent.
            Defaults to InMemoryMemory() if None.
        model (Optional[ChatModelBase], optional): Chat model instance.
            Defaults to DashscopeChatModel with deepseek-r1 if None.
        formatter (Optional[FormatterBase], optional): Message formatter.
            Defaults to DashScopeChatFormatter() if None.
        exclude_tools (Optional[list[str]], optional): List of tool names to
            exclude from sharing. Defaults to empty list if None.
    Returns:
        ReActAgent: A configured ReActAgent instance ready for use.
    Note:
        - The default model uses the DASHSCOPE_API_KEY environment variable
        - Tools are shared based on worker_info.tool_lists minus excluded tools
        - The agent is configured with thinking enabled and streaming support
    """
    if exclude_tools is None:
        exclude_tools = []
    tool_list = [
        tool_name
        for tool_name in worker_info.tool_lists
        if tool_name not in exclude_tools
    ]
    share_tools(old_toolkit, new_toolkit, tool_list)
    model = (
        model
        if model
        else DashScopeChatModel(
            api_key=os.environ.get("DASHSCOPE_API_KEY"),
            model_name="deepseek-r1",
            enable_thinking=True,
            stream=True,
        )
    )
    return ReActWorker(
        name=worker_info.worker_name,
        sys_prompt=worker_info.sys_prompt,
        model=model,
        formatter=formatter if formatter else DashScopeChatFormatter(),
        toolkit=new_toolkit,
        memory=InMemoryMemory() if memory is None else memory,
        max_iters=WORKER_MAX_ITER,
    )
 async def check_file_existence(file_path: str, toolkit: AliasToolkit) -> bool:
    """
    Check if a file exists using the read_file tool from the provided toolkit.
    This function attempts to verify file existence by calling the read_file
    tool and checking the response for error indicators. It requires the
    toolkit to have a 'read_file' tool available.
    Args:
        file_path (str): The path to the file to check for existence.
        toolkit (Toolkit): The toolkit containing the read_file tool.
    Returns:
        bool: True if the file exists and is readable, False otherwise.
    Note:
        - Returns False if the 'read_file' tool is not available in the toolkit
        - Returns False if any exception occurs during the file read attempt
        - Uses error message detection ("no such file or directory") to
            determine existence
    """
    # Get read_file tool from AliasToolkit
    if "read_file" in toolkit.tools:
        read_toolkit = toolkit
    else:
        logger.warning(
            "No read_file tool available for file "
            f"existence check: {file_path}",
        )
        return False
    params = {
        "path": file_path,
    }
    read_file_block = ToolUseBlock(
        type="tool_use",
        id="manual_check_file_existence",
        name="read_file",
        input=params,
    )
    try:
        tool_res = await read_toolkit.call_tool_function(read_file_block)
        tool_res_msg = Msg(
            "system",
            [
                ToolResultBlock(
                    type="tool_result",
                    id="",
                    name="read_file",
                    output=[],
                ),
            ],
            "system",
        )
        async for chunk in tool_res:
            # Turn into a tool result block
            tool_res_msg.content[0][  # type: ignore[index]
                "output"
            ] = chunk.content
        if "no such file or directory" in str(tool_res_msg.content):
            return False
        else:
            return True
    except Exception as _:  # noqa: F841
        return False
 def share_tools(
    old_toolkit: AliasToolkit,
    new_toolkit: AliasToolkit,
    tool_list: list[str],
 ) -> None:
    """
    Share specified tools from an old toolkit to a new toolkit.
    This function copies tools from one toolkit to another based on the
    provided tool list. If a tool doesn't exist in the old toolkit,
    a warning is logged.
    Args:
        old_toolkit (Toolkit):
            The source toolkit containing tools to be shared.
        new_toolkit (Toolkit):
            The destination toolkit to receive the tools.
        tool_list (list[str]):
            List of tool names to be copied from old to new toolkit.
    Returns:
        None
    Note:
        This function modifies the new_toolkit in place.
        If a tool in tool_list is not found in old_toolkit,
        a warning is logged but execution continues.
    """
    for tool in tool_list:
        if tool in old_toolkit.tools and tool not in new_toolkit.tools:
            new_toolkit.tools[tool] = old_toolkit.tools[tool]
        elif tool in old_toolkit.tools:
            logger.warning(
                "Tool %s is already in the provided new_toolkit",
                tool,
            )
        else:
            logger.warning(
                "No tool %s in the provided old_toolkit",
                tool,
            )
 class WorkerManager(StateModule):
    """
    Handles coordination between meta planner and worker agents.
    This class manages the creation, selection, and execution of worker agents
    to accomplish subtasks in a roadmap. It provides functionality for dynamic
    worker creation, worker selection based on task requirements, and
    processing worker responses to update the overall task progress.
    """
    def __init__(
        self,
        worker_model: ChatModelBase,
        worker_formatter: FormatterBase,
        planner_notebook: PlannerNoteBook,
        worker_full_toolkit: AliasToolkit,
        agent_working_dir: str,
        sandbox: AliasSandbox,
        worker_pool: Optional[
            dict[str, tuple[WorkerInfo, ReActWorker]]
        ] = None,
        session_service: Any = None,
    ):
        """Initialize the CoordinationHandler.
        Args:
            worker_model (ChatModelBase):
                Main language model for coordination decisions
            worker_formatter (FormatterBase):
                Message formatter for model communication
            planner_notebook (PlannerNoteBook):
                Notebook containing roadmap and file information
            worker_full_toolkit (Toolkit):
                Complete toolkit available to workers
            agent_working_dir (str):
                Working directory for the agent operations
            worker_pool: dict[str, tuple[WorkerInfo, ReActAgent]]:
                workers that has already been created
        """
        super().__init__()
        self.planner_notebook = planner_notebook
        self.worker_model = worker_model
        self.worker_formatter = worker_formatter
        self.worker_pool: dict[str, tuple[WorkerInfo, ReActWorker]] = (
            worker_pool if worker_pool else {}
        )
        self.agent_working_dir = agent_working_dir
        self.worker_full_toolkit = worker_full_toolkit
        self.base_sandbox = sandbox
        self.session_service = session_service
        def reconstruct_workerpool(worker_pool_dict: dict) -> dict:
            rebuild_worker_pool = {}
            for k, v in worker_pool_dict.items():
                worker_info = WorkerInfo(**v)
                # build-in agents
                if k == DEFAULT_BROWSER_WORKER_NAME:
                    browser_toolkit = AliasToolkit(
                        self.base_sandbox,
                        is_browser_toolkit=True,
                        add_all=True,
                    )
                    browser_agent = BrowserAgent(
                        model=self.worker_model,
                        formatter=self.worker_formatter,
                        memory=InMemoryMemory(),
                        toolkit=browser_toolkit,
                        max_iters=50,
                        start_url="https://www.google.com",
                    )
                    rebuild_worker_pool[k] = (
                        worker_info,
                        browser_agent,
                    )
                # Handle regular worker reconstruction
                else:
                    new_toolkit = AliasToolkit(sandbox=self.base_sandbox)
                    rebuild_worker_pool[k] = (
                        worker_info,
                        rebuild_reactworker(
                            worker_info=worker_info,
                            old_toolkit=self.worker_full_toolkit,
                            new_toolkit=new_toolkit,
                            model=self.worker_model,
                            formatter=self.worker_formatter,
                            exclude_tools=["generate_response"],
                        ),
                    )
            return rebuild_worker_pool
        self.register_state(
            "worker_pool",
            lambda x: {k: v[0].model_dump() for k, v in x.items()},
            custom_from_json=reconstruct_workerpool,
        )
        self.register_state(
            "planner_notebook",
            lambda x: x.model_dump(),
            lambda x: PlannerNoteBook(**x),
        )
        self.register_state("agent_working_dir")
    def register_worker(
        self,
        agent: ReActWorker,
        description: Optional[str] = None,
        worker_type: Literal["built-in", "dynamic-built"] = "dynamic",
    ) -> None:
        """
        Register a worker agent in the worker pool.
        Adds a worker agent to the available pool with appropriate metadata.
        Handles name conflicts by appending version numbers when necessary.
        Args:
            agent (ReActAgent):
                The worker agent to register
            description (Optional[str]):
                Description of the worker's capabilities
            worker_type (Literal["built-in", "dynamic-built"]):
                Type of worker agent
        """
        worker_info = WorkerInfo(
            worker_name=agent.name,
            description=description,
            worker_type=worker_type,
            status="ready-to-work",
        )
        if worker_type == "dynamic-built":
            worker_info.sys_prompt = agent.sys_prompt
            worker_info.tool_lists = list(agent.toolkit.tools.keys())
        if agent.name in self.worker_pool:
            name = agent.name
            version = 1
            while name in self.worker_pool:
                name = agent.name + f"_v{version}"
                version += 1
            agent.name, worker_info.worker_name = name, name
            self.worker_pool[name] = (worker_info, agent)
        else:
            self.worker_pool[agent.name] = (worker_info, agent)
    @staticmethod
    def _no_more_subtask_return() -> ToolResponse:
        """
        Return response when no more unfinished subtasks exist.
        Returns:
            ToolResponse: Response indicating no more subtasks are available
        """
        return ToolResponse(
            metadata={"success": False},
            content=[
                TextBlock(
                    type="text",
                    text="No more subtask exists. "
                    "Check whether the task is "
                    "completed solved.",
                ),
            ],
        )
    async def create_worker(
        self,
        worker_name: str,
        worker_system_prompt: str,
        tool_names: Optional[List[str]] = None,
        agent_description: str = "",
    ) -> ToolResponse:
        """
        Create a worker agent for the next unfinished subtask.
        Dynamically creates a specialized worker agent based on the
        requirements of the next unfinished subtask in the roadmap.
        The worker is configured with appropriate tools and system prompts
        based on the task needs.
        Each worker agent will be provided the following tools by default,
        so that you don't need to specify those again. Only specify the
        necessary tools that are not in the list
        [
            "read_file",
            "write_file",
            "edit_file",
            "create_directory",
            "list_directory",
            "directory_tree",
            "list_allowed_directories",
            "run_shell_command",
        ]
        Args:
            worker_name (str): The name of the worker agent.
            worker_system_prompt (str): The system prompt for the worker agent.
            tool_names (Optional[List[str]], optional):
                List of tools that should be assigned to the worker agent so
                that it can finish the subtask. MUST be from the
                `Available Tools for workers`
            agent_description (str, optional):
                A brief description of the worker's capabilities.
        Returns:
            ToolResponse: Response containing the creation result and worker
                details
        """
        if tool_names is None:
            tool_names = []
        # Traditional AliasToolkit mode
        suffix = ""
        worker_toolkit = AliasToolkit(sandbox=self.base_sandbox)
        share_tools(
            self.worker_full_toolkit,
            worker_toolkit,
            tool_names
            + [
                "read_file",
                "write_file",
                "edit_file",
                "search_files",
                "list_directory",
                "run_shell_command",
            ],
        )
        with open(
            Path(__file__).parent.parent
            / f"_built_in_long_sys_prompt{suffix}"
            / f"_worker_additional_sys_prompt{suffix}.md",
            "r",
            encoding="utf-8",
        ) as f:
            additional_worker_prompt = f.read()
        with open(
            Path(__file__).parent.parent
            / f"_built_in_long_sys_prompt{suffix}"
            / f"_tool_usage_rules{suffix}.md",
            "r",
            encoding="utf-8",
        ) as f:
            additional_worker_prompt += str(f.read()).format_map(
                {"agent_working_dir": self.agent_working_dir},
            )
        worker = ReActWorker(
            name=worker_name,
            sys_prompt=(worker_system_prompt + additional_worker_prompt),
            model=self.worker_model,
            formatter=self.worker_formatter,
            memory=InMemoryMemory(),
            toolkit=worker_toolkit,
            max_iters=WORKER_MAX_ITER,
            session_service=self.session_service,
        )
        self.register_worker(
            worker,
            description=agent_description,
            worker_type="dynamic-built",
        )
        return ToolResponse(
            metadata={"success": True},
            content=[
                TextBlock(
                    type="text",
                    text=(
                        f"Successfully created a worker agent:\n"
                        f"Worker name: {worker_name}\n"
                        f"Worker tools: {tool_names}\n"
                        f"Worker system prompt: {worker.sys_prompt}"
                    ),
                ),
            ],
        )
    async def show_current_worker_pool(self) -> ToolResponse:
        """
        List all currently available worker agents with
        their system prompts and tools.
        """
        worker_info: dict[str, dict] = {
            name: info.model_dump()
            for name, (info, _) in self.worker_pool.items()
        }
        return ToolResponse(
            metadata={"success": True},
            content=[
                TextBlock(
                    type="text",
                    text=json.dumps(worker_info, ensure_ascii=False, indent=2),
                ),
            ],
        )
    async def execute_worker(
        self,
        subtask_idx: int,
        selected_worker_name: str,
        detailed_instruction: str,
        reset_worker_memory: bool = False
    ) -> ToolResponse:
        """
        Execute a worker agent for the next unfinished subtask.
        Args:
            subtask_idx (int):
                Index of the subtask to execute.
            selected_worker_name (str):
                Select a worker agent to execute by its name. If you are unsure
                what are the available agents, call `show_current_worker_pool`
                before using this function.
            detailed_instruction (str):
                Generate detailed instruction for the worker based on the
                next unfinished subtask in the roadmap. If you are unsure
                what is the next unavailable subtask, check with
                `get_next_unfinished_subtask_from_roadmap` to get more info.
            reset_worker_memory (bool):
                Whether to ensure the worker memory is empty before starting
                the task. For example, 1) if the same worker encounter errors,
                a safer way is to reset his memory to avoid error propagation;
                2) if a new subtask is assign to an existing worker, the worker
                memory can also be reset for better performance (but require
                providing sufficient context information in
                `detailed_instruction`); 3) if a worker is stopped just because
                hitting th maximum round constraint in the previous execution
                and it's going to work on the sam task, DO NOT reset the
                memory.
        """
        if selected_worker_name not in self.worker_pool:
            worker_info: dict[str, WorkerInfo] = {
                name: info for name, (info, _) in self.worker_pool.items()
            }
            current_agent_pool = json.dumps(
                worker_info,
                ensure_ascii=False,
                indent=2,
            )
            return ToolResponse(
                metadata={"success": False},
                content=[
                    TextBlock(
                        type="text",
                        text=(
                            f"There is no {selected_worker_name} in current "
                            "agent pool.\n"
                            "Current agent pool:\n```json\n"
                            f"{current_agent_pool}\n"
                            "```"
                        ),
                    ),
                ],
            )
        worker = self.worker_pool[selected_worker_name][1]
        if reset_worker_memory:
            await worker.memory.clear()
        question_msg = Msg(
            role="user",
            name="user",
            content=detailed_instruction,
        )
        try:
            worker_response_msg = await worker(
                question_msg,
                # structured_model=WorkerResponse,
            )
        except (KeyboardInterrupt, asyncio.CancelledError):
            raise asyncio.CancelledError() from None
        if worker_response_msg.metadata is not None:
            worker_response = WorkerResponse(
                **worker_response_msg.metadata,
            )
            self.planner_notebook.roadmap.decomposed_tasks[
                subtask_idx
            ].workers.append(
                self.worker_pool[selected_worker_name][0],
            )
            # double-check to ensure the generated files exists
            for filepath, desc in worker_response.generated_files.items():
                if await check_file_existence(
                    filepath,
                    self.worker_full_toolkit,
                ):
                    self.planner_notebook.files[filepath] = desc
                else:
                    worker_response.generated_files.pop(filepath)
            return ToolResponse(
                metadata={
                    "success": True,
                    "worker_response": worker_response.model_dump_json(),
                },
                content=[
                    TextBlock(
                        type="text",
                        text=worker_response.model_dump_json(),
                    ),
                ],
            )
        else:
            return ToolResponse(
                metadata={
                    "success": False,
                    "worker_response": worker_response_msg.content,
                },
                content=[
                    TextBlock(
                        type="text",
                        text=str(worker_response_msg.content),
                    ),
                ],
            )
--- a/alias/src/alias/agent/agents/_react_worker.py
+++ b/alias/src/alias/agent/agents/_react_worker.py
@@ -0,0 +1,112 @@
 # -*- coding: utf-8 -*-
 # pylint: disable=C2801, W0611, W0212
 from typing import Optional, Any
 from agentscope.model import ChatModelBase
 from agentscope.formatter import FormatterBase
 from agentscope.memory import MemoryBase
 from agentscope.tool import ToolResponse
 from agentscope.message import (
    Msg,
    TextBlock,
 )
 from dotenv import load_dotenv
 from alias.agent.agents import AliasAgentBase
 from alias.agent.tools import AliasToolkit
 from alias.agent.utils.constants import WORKER_MAX_ITER
 from alias.agent.agents._planning_tools._planning_notebook import (
    WorkerResponse,
 )
 load_dotenv()
 class ReActWorker(AliasAgentBase):
    def __init__(
        self,
        name: str,
        model: ChatModelBase,
        formatter: FormatterBase,
        memory: MemoryBase,
        toolkit: AliasToolkit,
        sys_prompt: Optional[str] = None,
        max_iters: int = 10,
        state_saving_dir: Optional[str] = None,
        session_service: Any = None,
    ) -> None:
        """Initialize the ReAct agent with the given name, model config name
        and tools.
        """
        super().__init__(
            name=name,
            sys_prompt=sys_prompt,
            model=model,
            formatter=formatter,
            memory=memory,
            toolkit=toolkit,
            max_iters=max_iters,
            session_service=session_service,
            state_saving_dir=state_saving_dir,
        )
        self.max_iters: int = max(self.max_iters, WORKER_MAX_ITER)
    def generate_response(
        self,
        response: str = "",
        task_done: bool = True,
        subtask_progress_summary: str = "",
        generated_files: dict[str, str] = None,
    ) -> ToolResponse:
        """
        Generate a response summarizing the execution progress of the
        given subtask.
        Args:
            response (str):
                The response text (compatible with AgentScope finish function).
            task_done (bool):
                REQUIRED! Whether the subtask was done or not.
            subtask_progress_summary (str):
                REQUIRED! The subtask progress summary.
            generated_files (dict[str, str]):
                REQUIRED! Collect all files generated in the execution process,
                such as the files generated by `write_file` and `edit_file`.
                This field MUST be in dictionary, where the keys are the
                paths of generated files (e.g. '/FULL/PATH/OF/FILE_1.md') and
                the values are short descriptions about the generated files.
        """
        if generated_files is None:
            generated_files = {}
        # If only response is provided,
        # use it as subtask_progress_summary
        if not subtask_progress_summary and response:
            subtask_progress_summary = response
        structure_response = WorkerResponse(
            task_done=task_done,
            subtask_progress_summary=subtask_progress_summary,
            generated_files=generated_files,
        )
        response_msg = Msg(
            self.name,
            content=[
                TextBlock(type="text", text=subtask_progress_summary),
            ],
            role="assistant",
            metadata=structure_response.model_dump(),
        )
        return ToolResponse(
            content=[
                TextBlock(
                    type="text",
                    text="Successfully generated response.",
                ),
            ],
            metadata={
                "success": True,
                "response_msg": response_msg,
            },
            is_last=True,
        )
--- a/alias/src/alias/agent/mock/init.py
+++ b/alias/src/alias/agent/mock/init.py
@@ -0,0 +1,17 @@
 # -*- coding: utf-8 -*-
 from .mock_session_service import MockSessionService, MockPlan
 from .mock_message_models import (
    BaseMessage,
    MessageState,
    MockMessage,
    UserMessage
 )
 __all__ = [
    "MockSessionService",
    "MockPlan",
    "MockMessage",
    "BaseMessage",
    "MessageState",
    "UserMessage",
 ]
--- a/alias/src/alias/agent/mock/mock_message_models.py
+++ b/alias/src/alias/agent/mock/mock_message_models.py
@@ -0,0 +1,46 @@
 # -*- coding: utf-8 -*-
 """Mock message models for local testing without api_server dependency."""
 from enum import Enum
 import uuid
 from typing import Any, Optional, List
 from pydantic import BaseModel
 class MessageState(str, Enum):
    """Message state enumeration."""
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"
 class MessageType(str, Enum):
    """Message type enumeration."""
    RESPONSE = "response"
    SUB_RESPONSE = "sub_response"
    THOUGHT = "thought"
    SUB_THOUGHT = "sub_thought"
    TOOL_CALL = "tool_call"
    CLARIFICATION = "clarification"
    FILES = "files"
    SYSTEM = "system"
 class BaseMessage(BaseModel):
    """Base message class for local testing."""
    role: str = "assistant"
    content: Any = ""
    name: Optional[str] = None
    type: Optional[str] = "text"
    status: MessageState = MessageState.FINISHED
 class UserMessage(BaseMessage):
    """User message for local testing."""
    role: str = "user"
    name: str = "User"
 class MockMessage:
    id: uuid.UUID = uuid.uuid4()
    message: Optional[dict] = None
    files: list[Any] = []
--- a/alias/src/alias/agent/mock/mock_session_service.py
+++ b/alias/src/alias/agent/mock/mock_session_service.py
@@ -0,0 +1,214 @@
 # -*- coding: utf-8 -*-
 # mypy: ignore-errors
 # flake8: noqa
 # pylint: skip-file
 import uuid
 import os
 from typing import Any, Optional, List
 import json
 from loguru import logger
 from datetime import datetime
 from dataclasses import dataclass, field
 from .mock_message_models import BaseMessage, MessageState, MockMessage
 try:
    logger.level("SEND_MSG", no=52, color="<blue>", icon="💻")
    logger.level("SEND_PLAN", no=52, color="<white>", icon="📒")
 except TypeError:
    pass
@dataclass
 class MockPlan:
    task_id: uuid.UUID = uuid.uuid4()
    conversation_id: uuid.UUID = uuid.uuid4()
    message_id: uuid.UUID = uuid.uuid4()
    user_id: uuid.UUID = uuid.uuid4()
    runtime_id: uuid.UUID = uuid.uuid4()
    content: Any = None
    upload_files: List[Any] = field(default_factory=list)
 class SessionEntity:
    task_id: uuid.UUID
    session_id: uuid.UUID
    conversation_id: uuid.UUID
    message_id: uuid.UUID
    user_id: uuid.UUID
    runtime_id: uuid.UUID
    query: str
    upload_files: List = []
    is_chat: bool = False
    def __init__(self):
        self.user_id: uuid.UUID = uuid.uuid4()
        self.conversation_id: uuid.UUID = uuid.uuid4()
        self.session_id: uuid.UUID = uuid.uuid4()
    def ids(self):
        return {
            "task_id": str(self.task_id),
            "conversation_id": str(self.conversation_id),
            "message_id": str(self.message_id),
            "runtime_id": str(self.runtime_id),
        }
 class MockSessionService:
    all_checkpoint_dir = "./logs/checkpoints/"
    def __init__(
        self,
        runtime_model: Any = None,
    ):
        self.session_id = "mock_session"
        self.conversation_id = "mock_conversation"
        self.messages = []
        self.plan = MockPlan()
        self.session_entity = SessionEntity()
        logger.info(
            f"> user_id {self.session_entity.user_id}\n "
            f"> conversation_id {self.session_entity.conversation_id}",
        )
        # log for testing
        self.log_storage_path = os.path.join(
            "./logs",
            datetime.now().strftime("%Y%m%d%H%M%S") + ".log",
        )
        if not os.path.exists("./logs"):
            os.mkdir("./logs")
        self.plan_update_counter = 0
        self.runtime_model = runtime_model
        self.current_checkpoint_dir = os.path.join(
            self.all_checkpoint_dir,
            datetime.now().strftime("%Y%m%d%H%M%S"),
        )
        self.state_save_count = 0
        self.state = {}
    # Plan
    async def create_plan(self, content: Any) -> MockPlan:
        self.plan = MockPlan(content=content)
        content = (
            f"\nCreate plan {self.plan_update_counter}:\n"
            f"\n{json.dumps(self.plan.content, indent=4, ensure_ascii=False)}"
            "\n" + "==" * 10 + "\n"
        )
        # logger.log("SEND_PLAN", content)
        with open(self.log_storage_path, "a") as file:
            # Append the content
            file.write(content)
        self.plan_update_counter += 1
        return self.plan
    async def update_plan(self, content: Any) -> MockPlan:
        self.plan = MockPlan(content=content)
        content = (
            f"Update plan {self.plan_update_counter}:\n"
            f"\n{json.dumps(self.plan.content, indent=4, ensure_ascii=False)}"
            "\n" + "==" * 10 + "\n"
        )
        # logger.log("SEND_PLAN", content)
        with open(self.log_storage_path, "a") as file:
            # Append the content
            file.write(content)
        self.plan_update_counter += 1
        return self.plan
    async def delete_plan(self) -> None:
        logger.log("SEND_PLAN", f"Delete plan: {self.plan.content}")
        self.plan_update_counter = 0
        self.plan = MockPlan()
    async def create_message(
        self,
        message: BaseMessage,
        message_id: Optional[uuid.UUID] = None,
    ) -> MockMessage:
        db_message = None
        if message.status == MessageState.FINISHED:
            if message_id:
                for msg in self.messages:
                    if msg.id == message_id:
                        db_message = msg
                if db_message is None:
                    db_message = MockMessage()
                    self.messages.append(db_message)
                db_message.message = message.model_dump()
            else:
                db_message = MockMessage()
                db_message.message = message.model_dump()
                self.messages.append(db_message)
            logger.log(
                "SEND_MSG",
                f"Create new message {type(message)}, "
                f"buffer has {len(self.messages)}",
            )
            content = (
                "=" * 10
                + "\n"
                + f"Role: {db_message.message.get('role')},\n"
                + f"Name: {db_message.message.get('name')},\n"
                + f"Type: {db_message.message.get('type')},\n"
                + f"Statue: {db_message.message.get('status')},\n"
                + f"content: {str(db_message.message.get('content'))}\n"
                + "=" * 10
            )
            with open(self.log_storage_path, "a") as file:
                # Append the content
                file.write(content)
        elif message.status == MessageState.RUNNING:
            if message_id:
                for msg in self.messages:
                    if msg.id == message_id:
                        db_message = msg
                if db_message is None:
                    db_message = MockMessage()
                    self.messages.append(db_message)
                    logger.log(
                        "SEND_MSG",
                        f"Updating message {len(self.messages) - 1}",
                    )
                db_message.message = message.model_dump()
            else:
                db_message = MockMessage()
                db_message.message = message.model_dump()
                self.messages.append(db_message)
        return db_message
    async def get_messages(self) -> List[MockMessage]:
        logger.log("SEND_MSG", "Get all messages")
        return self.messages
    async def create_state(
        self,
        content: Any,
    ):
        postfix = ""
        if isinstance(content, dict):
            if "running_agent" in content:
                postfix += content["running_agent"] + "-"
            if "react_state" in content:
                postfix += str(content["react_state"]) + "-"
            if "react_round" in content:
                postfix += str(content["react_round"]) + "-"
            if "exec_tool_names" in content:
                postfix += "_".join(content["exec_tool_names"]) + "-"
        postfix += str(self.state_save_count)
        os.makedirs(self.current_checkpoint_dir, exist_ok=True)
        checkpoint_path = os.path.join(
            self.current_checkpoint_dir,
            f"state-{postfix}.json",
        )
        with open(checkpoint_path, "w") as file:
            json.dump(content, file, indent=4, ensure_ascii=False)
        # logger.info(f"State saved to {checkpoint_path}")
        self.state_save_count += 1
        self.state = content
    async def get_state(self) -> dict:
        return self.state
--- a/alias/src/alias/agent/run.py
+++ b/alias/src/alias/agent/run.py
@@ -0,0 +1,270 @@
 # -*- coding: utf-8 -*-
 # pylint: disable=W0612,E0611,C2801
 import os
 from typing import Optional
 from datetime import datetime
 import traceback
 from loguru import logger
 from agentscope.message import Msg
 from agentscope.model import (
    OpenAIChatModel,
    AnthropicChatModel,
    DashScopeChatModel,
 )
 from agentscope.formatter import (
    OpenAIChatFormatter,
    AnthropicChatFormatter,
    DashScopeChatFormatter,
 )
 from agentscope.memory import InMemoryMemory
 from agentscope.mcp import StdIOStatefulClient, StatefulClientBase
 from agentscope.token import OpenAITokenCounter
 from agentscope_runtime.sandbox.box.sandbox import Sandbox
 from alias.agent.agents import (
    MetaPlanner,
    DeepResearchAgent,
    BrowserAgent,
 )
 from alias.agent.tools import AliasToolkit
 from alias.agent.agents._planning_tools._worker_manager import share_tools
 from alias.agent.utils.constants import BROWSER_AGENT_DESCRIPTION
 from alias.agent.tools.improved_tools import DashScopeMultiModalTools
 from alias.agent.tools.toolkit_hooks import LongTextPostHook
 # Open source version always uses mock services
 from alias.agent.mock import MockSessionService
 SessionService = MockSessionService
 MODEL_FORMATTER_MAPPING = {
    "qwen3-max": [
        DashScopeChatModel(
            api_key=os.environ.get("DASHSCOPE_API_KEY"),
            model_name="qwen3-max-preview",
            stream=True,
        ),
        DashScopeChatFormatter(),
    ],
    "qwen-vl-max": [
        DashScopeChatModel(
            api_key=os.environ.get("DASHSCOPE_API_KEY"),
            model_name="qwen-vl-max-latest",
            stream=True,
        ),
        DashScopeChatFormatter(),
    ],
    # "gpt-5": [
    #     OpenAIChatModel(
    #         api_key=os.environ.get("OPENAI_API_KEY"),
    #         model_name="gpt-5-2025-08-07",
    #         stream=True,
    #     ),
    #     OpenAIChatFormatter(),
    # ],
    # "claude-4": [
    #     AnthropicChatModel(
    #         api_key=os.environ.get("ANTHROPIC_API_KEY"),
    #         model_name="claude-sonnet-4-20250514",
    #         stream=True,
    #     ),
    #     AnthropicChatFormatter(),
    # ],
 }
 MODEL_CONFIG_NAME = os.getenv("MODEL", "qwen3-max")
 VL_MODEL_NAME = os.getenv("VISION_MODEL", "qwen-vl-max")
 async def add_tools(
    toolkit: AliasToolkit,
 ):
    """
    Adding additional MCP server to the toolkit for the application.
    Currently added MCP:
    - multimodal content to text tools (based on DashScope models)
    - tavily search
    """
    try:
        multimodal_tools = DashScopeMultiModalTools(
            sandbox=toolkit.sandbox,
            dashscope_api_key=os.getenv("DASHSCOPE_API_KEY", ""),
        )
        toolkit.register_tool_function(
            multimodal_tools.dashscope_audio_to_text,
        )
        toolkit.register_tool_function(
            multimodal_tools.dashscope_image_to_text,
        )
    except Exception as e:
        print(traceback.format_exc())
        raise e from None
    try:
        long_text_hook = LongTextPostHook(toolkit.sandbox)
        tavily_mcp_client = StdIOStatefulClient(
            name="tavily_mcp_client",
            command="npx",
            args=[
                "-y",
                "mcp-remote",
                "https://mcp.tavily.com/mcp/"
                f"?tavilyApiKey={os.getenv('TAVILY_API_KEY')}",
            ],
        )
        await toolkit.add_and_connet_mcp_client(
            tavily_mcp_client,
            enable_funcs=["tavily_search", "tavily_extract"],
            postprocess_func=long_text_hook.truncate_and_save_response,
        )
    except Exception as e:
        print(traceback.format_exc())
        raise e from None
 async def arun_agents(
    session_service: SessionService,
    sandbox: Sandbox = None,
    enable_clarification: bool = True,
 ):
    time_str = datetime.now().strftime("%Y%m%d%H%M%S")
    # Initialize toolkit
    worker_full_toolkit = AliasToolkit(sandbox, add_all=True)
    await add_tools(
        worker_full_toolkit,
    )
    logger.info("Init full toolkit")
    # Browser agent uses traditional toolkit for compatibility
    browser_toolkit = AliasToolkit(
        sandbox,
        is_browser_toolkit=True,
        add_all=True,
    )
    logger.info("Init browser toolkit")
    try:
        model, formatter = MODEL_FORMATTER_MAPPING[MODEL_CONFIG_NAME]
        browser_agent = BrowserAgent(
            model=model,
            formatter=formatter,
            memory=InMemoryMemory(),
            toolkit=browser_toolkit,
            max_iters=50,
            start_url="https://www.google.com",
            session_service=session_service,
            state_saving_dir=f"./agent-states/run-{time_str}",
        )
        meta_planner = MetaPlanner(
            model=model,
            formatter=formatter,
            toolkit=AliasToolkit(sandbox=sandbox, add_all=False),
            worker_full_toolkit=worker_full_toolkit,
            browser_toolkit=browser_toolkit,
            agent_working_dir="/workspace",
            memory=InMemoryMemory(),
            state_saving_dir=f"./agent-states/run-{time_str}",
            max_iters=100,
            session_service=session_service,
            enable_clarification=enable_clarification,
        )
        meta_planner.worker_manager.register_worker(
            browser_agent,
            description=BROWSER_AGENT_DESCRIPTION,
            worker_type="built-in",
        )
        msg = await meta_planner()
    except Exception as e:
        print(traceback.format_exc())
        raise e from None
    finally:
        await worker_full_toolkit.close_mcp_clients()
    return meta_planner, msg
 async def test_deepresearch_agent(
    task_str: str,
    session_service: SessionService,
    sandbox: Sandbox = None,
 ):
    instruction = Msg(
        "user",
        content=task_str,
        role="user",
    )
    global_toolkit = AliasToolkit(sandbox, add_all=True)
    await add_tools(global_toolkit)
    worker_toolkit = AliasToolkit(sandbox)
    model, formatter = MODEL_FORMATTER_MAPPING[MODEL_CONFIG_NAME]
    test_tool_list = [
        "tavily_search",
        "tavily_extract",
        "write_file",
        "create_directory",
        "list_directory",
        "read_file",
        "run_shell_command",
    ]
    share_tools(global_toolkit, worker_toolkit, test_tool_list)
    try:
        worker_agent = DeepResearchAgent(
            name="Deep_Research_Assistant",
            sys_prompt=(
                "You are a helpful assistant that can use provided tools "
                "to help finish tasks."
            ),
            model=model,
            formatter=formatter,
            memory=InMemoryMemory(),
            toolkit=worker_toolkit,
            session_service=session_service,
        )
        await worker_agent(instruction)
    except Exception as e:
        logger.error(f"---> Error: {e}")
        logger.error(traceback.format_exc())
    finally:
        await global_toolkit.close_mcp_clients()
 async def test_browseruse_agent(
    task_str: str,
    session_service: SessionService,
    sandbox: Sandbox = None,
 ):
    time_str = datetime.now().strftime("%Y%m%d%H%M%S")
    instruction = Msg(
        "user",
        content=task_str,
        role="user",
    )
    model, formatter = MODEL_FORMATTER_MAPPING[MODEL_CONFIG_NAME]
    browser_toolkit = AliasToolkit(
        sandbox,
        add_all=True,
        is_browser_toolkit=True,
    )
    logger.info("Init browser toolkit")
    try:
        browser_agent = BrowserAgent(
            model=model,
            formatter=formatter,
            memory=InMemoryMemory(),
            toolkit=browser_toolkit,
            max_iters=50,
            start_url="https://www.google.com",
            session_service=session_service,
            state_saving_dir=f"./agent-states/run_browser-{time_str}",
        )
        await browser_agent(instruction)
    except Exception as e:
        logger.error(f"---> Error: {e}")
        logger.error(traceback.format_exc())
    finally:
        await browser_toolkit.close_mcp_clients()
--- a/alias/src/alias/agent/tools/init.py
+++ b/alias/src/alias/agent/tools/init.py
@@ -0,0 +1,4 @@
 # -*- coding: utf-8 -*-
 from .alias_toolkit import AliasToolkit
 __all__ = ["AliasToolkit"]
--- a/alias/src/alias/agent/tools/alias_toolkit.py
+++ b/alias/src/alias/agent/tools/alias_toolkit.py
@@ -0,0 +1,220 @@
 # -*- coding: utf-8 -*-
 # pylint: disable=R1724
 from typing import Optional, Callable, Any
 import asyncio
 from loguru import logger
 from agentscope.mcp import StatefulClientBase, MCPClientBase
 from agentscope.tool import (
    Toolkit,
    ToolResponse,
 )
 from agentscope.message import ToolUseBlock, TextBlock
 from agentscope_runtime.sandbox import FilesystemSandbox, BrowserSandbox
 from alias.agent.tools.toolkit_hooks import (
    LongTextPostHook
 )
 from alias.agent.tools.improved_tools import ImprovedFileOperations
 from alias.agent.tools.tool_blacklist import TOOL_BLACKLIST
 from alias.agent.tools.toolkit_hooks import read_file_post_hook
 from alias.runtime.alias_sandbox.alias_sandbox import AliasSandbox
 class AliasToolkit(Toolkit):
    def __init__(  # pylint: disable=W0102
        self,
        sandbox: Optional[AliasSandbox] = None,
        add_all: bool = False,
        is_browser_toolkit: bool = False,
        tool_blacklist: list = TOOL_BLACKLIST,
    ):
        super().__init__()
        if sandbox is not None:
            self.sandbox = sandbox
            self.session_id = self.sandbox.sandbox_id
        else:
            logger.warning("Sandbox is None, use pure testing local mode!!!")
            self.sandbox = None
            self.session_id = None
        self.categorized_functions = {}
        self.tool_blacklist = tool_blacklist
        if add_all:
            # Get tools
            tools_schema = self.sandbox.list_tools()
            for category, function_dicts in tools_schema.items():
                if (
                    (is_browser_toolkit and category == "playwright")
                    or (not is_browser_toolkit and category != "playwright")
                ):
                    for _, function_json in function_dicts.items():
                        if function_json["name"] not in self.tool_blacklist:
                            logger.info(f"add {function_json['name']}")
                            self._add_io_function(function_json)
            # for improved tools
            file_sys = ImprovedFileOperations(sandbox)
            self.register_tool_function(
                file_sys.read_file,
            )
        self.additional_mcp_clients = []
        self.long_text_post_hook = LongTextPostHook(sandbox)
        self._add_tool_postprocessing_func()
    def _add_io_function(
        self,
        json_schema: dict,
        is_browser_tool: bool = False
    ) -> None:
        tool_name = json_schema["name"]
        def wrap_tool_func(name: str) -> Callable:
            def wrapper(**kwargs) -> ToolResponse:
                try:
                    # Call the sandbox tool with the extracted arguments
                    result = self.sandbox.call_tool(
                        name=name,
                        arguments=kwargs,
                    )
                    # Convert the result to ToolResponse format
                    if isinstance(result, dict) and "content" in result:
                        # If result already has content structure, use it
                        content = result["content"]
                        if isinstance(content, list):
                            for i, block in enumerate(content):
                                if (
                                    isinstance(block, dict)
                                    and "annotations" in block
                                ):
                                    block.pop("annotations")
                                    content[i] = block
                                if (
                                    isinstance(block, dict)
                                    and "description" in block
                                ):
                                    block.pop("description")
                                    content[i] = block
                    else:
                        # Otherwise, wrap the result in a TextBlock
                        content = [
                            TextBlock(
                                type="text",
                                text=str(result),
                            ),
                        ]
                    return ToolResponse(
                        metadata={"success": True, "tool_name": name},
                        content=content,
                    )
                except Exception as e:
                    logger.error(f"Error executing tool {name}: {str(e)}")
                    return ToolResponse(
                        metadata={
                            "success": False,
                            "tool_name": name,
                            "error": str(e),
                        },
                        content=[
                            TextBlock(
                                type="text",
                                text=f"Error executing tool {name}: {str(e)}",
                            ),
                        ],
                    )
            wrapper.__name__ = name
            return wrapper
        tool_func = wrap_tool_func(tool_name)
        self.register_tool_function(
            tool_func=tool_func,
            json_schema=json_schema.get("json_schema", {}),
        )
    def _add_tool_postprocessing_func(self) -> None:
        long_text_hook = LongTextPostHook(self.sandbox)
        for tool_func, _ in self.tools.items():
            if tool_func.startswith(("read_file", "read_multiple_files")):
                self.tools[tool_func].postprocess_func = read_file_post_hook
            if tool_func.startswith("tavily"):
                self.tools[tool_func].postprocess_func = \
                    long_text_hook.truncate_and_save_response
    async def add_and_connet_mcp_client(
        self,
        mcp_client: MCPClientBase,
        group_name: str = "basic",
        enable_funcs: list[str] | None = None,
        disable_funcs: list[str] | None = None,
        preset_kwargs_mapping: dict[str, dict[str, Any]] | None = None,
        postprocess_func: Callable[
            [
                ToolUseBlock,
                ToolResponse,
            ],
            ToolResponse | None,
        ]
        | None = None,
    ):
        """
        Add stateful MCP clients. No need to call `connect()` before add.
        """
        if isinstance(mcp_client, StatefulClientBase):
            await mcp_client.connect()
            self.additional_mcp_clients.append(mcp_client)
            await self.register_mcp_client(
                mcp_client,
                enable_funcs=enable_funcs,
                group_name=group_name,
                disable_funcs=disable_funcs,
                preset_kwargs_mapping=preset_kwargs_mapping,
                postprocess_func=postprocess_func,
            )
    async def close_mcp_clients(self) -> None:
        for client in reversed(self.additional_mcp_clients):
            if isinstance(client, StatefulClientBase):
                await client.close()
 async def test_toolkit():
    with FilesystemSandbox() as sandbox:
        toolkit = AliasToolkit(sandbox)
        print(toolkit.get_json_schemas())
        # test tools
        res = await toolkit.call_tool_function(
            ToolUseBlock(
                type="tool_use",
                id="",
                name="list_allowed_directories",
                input={}
            )
        )
        print(f"Allow directory:")
        async for response in res:
            print(response)
        res = await toolkit.call_tool_function(
            ToolUseBlock(
                type="tool_use",
                id="",
                name="write_file",
                input={
                    "path": "/workspace/test.md",
                    "content": "testing the function",
                },
            ),
        )
        async for response in res:
            print(response)
        await toolkit.close_mcp_clients()
 if __name__ == "__main__":
    asyncio.run(test_toolkit())
--- a/alias/src/alias/agent/tools/improved_tools/init.py
+++ b/alias/src/alias/agent/tools/improved_tools/init.py
@@ -0,0 +1,15 @@
 # -*- coding: utf-8 -*-
 """
 Improved tools module for Alias agent toolkit.
 This module contains enhanced tool functions that provide additional functionality
 beyond the basic tools available in the standard toolkit.
 """
 from .file_operations import ImprovedFileOperations
 from .multimodal_to_text import DashScopeMultiModalTools
 __all__ = [
    "ImprovedFileOperations",
    "DashScopeMultiModalTools",
 ]
--- a/alias/src/alias/agent/tools/improved_tools/file_operations.py
+++ b/alias/src/alias/agent/tools/improved_tools/file_operations.py
@@ -0,0 +1,315 @@
 # -*- coding: utf-8 -*-
 """
 Enhanced read_file tool function with offset and limit support.
 This module provides an improved read_file tool that wraps the
 original read_file functionality and adds support for
 reading specific line ranges from files.
 """
 from typing import Optional
 from loguru import logger
 import asyncio
 import os
 from agentscope.tool import ToolResponse
 from agentscope.message import TextBlock
 from alias.agent.utils.constants import TMP_FILE_DIR
 from alias.agent.tools.sandbox_util import (
    TEXT_EXTENSIONS,
    create_or_edit_workspace_file,
    create_workspace_directory
 )
 from alias.runtime.alias_sandbox import AliasSandbox
 TO_MARKDOWN_SUPPORT_MAPPING = {
    ".pdf",
    ".docx",
    ".doc",
    ".xlsx",
    ".pptx",
 }
 class ImprovedFileOperations:
    """
    A set of enhanced file system tools with sandbox.
    """
    def __init__(self, sandbox: AliasSandbox):
        """init with sandbox"""
        self.sandbox = sandbox
    async def read_file(
        self,
        file_path: str,
        offset: int = 0,
        limit: Optional[int] = 50,
    ) -> ToolResponse:
        """
        Read a file with optional line offset and limit support.
        Support reading all kinds of text file, plus files with extensions in
        the following list:
        [".pdf", ".docx", ".doc", ".xlsx" and ".pptx"]
        Args:
            file_path (str): The absolute path to the file to read
            offset (int, optional):
                The line number to start reading from (starting from 0).
                Default is 0.
            limit (int, optional):
                The number of lines to read. Default to 50.
                If set to `None`, then it will read all content after `offset`.
        Returns:
            ToolResponse:
                A ToolResponse containing the file content or
                error information. The content includes line numbers
                when offset/limit are used.
        """
        try:
            # Validate input parameters
            if offset is not None and offset < 0:
                return ToolResponse(
                    metadata={"success": False, "error": "Invalid offset"},
                    content=[
                        TextBlock(
                            type="text",
                            text="Error: offset must be >= 0",
                        ),
                    ],
                )
            if limit is not None and limit < 1:
                return ToolResponse(
                    metadata={"success": False, "error": "Invalid limit"},
                    content=[
                        TextBlock(
                            type="text",
                            text="Error: limit must be >= 1",
                        ),
                    ],
                )
            # If no toolkit provided, we can't proceed
            if self.sandbox is None:
                return ToolResponse(
                    metadata={
                        "success": False, "error": "No sandbox provided"
                    },
                    content=[
                        TextBlock(
                            type="text",
                            text="Error: No sandbox provided to "
                                 "call the original read_file tool",
                        ),
                    ],
                )
            file_extension = os.path.splitext(file_path)[1].lower()
            if file_extension in TEXT_EXTENSIONS:
                # First, read the entire file using the original read_file tool
                params = {
                    "path": file_path,
                }
                # Call the original read_file tool
                tool_res = self.sandbox.call_tool(
                    name="read_file",
                    arguments=params
                )
            elif file_extension in TO_MARKDOWN_SUPPORT_MAPPING:
                tool_res = _transfer_to_markdown_text(file_path, self.sandbox)
            else:
                tool_res = {}
            # Extract content from the tool response
            if (
                tool_res.get("isError", True)
                and len(tool_res.get("content", [])) > 0
            ):
                return ToolResponse(
                    metadata={
                        "success": False, "error": "Error when read file"
                    },
                    content=tool_res.get("content", [])
                )
            elif (
                tool_res.get("isError", True)
                and len(tool_res.get("content", [])) == 0
            ):
                return ToolResponse(
                    metadata={"success": False, "error": "Empty response"},
                    content=[
                        TextBlock(
                            type="text",
                            text=f"Fail to read file on path {file_path}",
                        )
                    ]
                )
            # Get the text content from the first content block
            full_content = ""
            for block in tool_res.get("content", []):
                if isinstance(block, dict) and 'text' in block:
                    full_content += block['text'] + "\n"
            # Split into lines
            lines = full_content.splitlines(keepends=True)
            total_lines = len(lines)
            # If no offset/limit specified, return entire file
            if offset is None and limit is None:
                return ToolResponse(
                    metadata={"success": True, "total_lines": total_lines},
                    content=[
                        TextBlock(
                            type="text",
                            text=full_content,
                        ),
                    ],
                )
            # Handle offset and limit
            start_line = (offset or 0)   # 0-based index
            end_line = start_line + (limit or total_lines)
            # Validate range
            if start_line >= total_lines:
                return ToolResponse(
                    metadata={"success": False, "error": "Invalid range"},
                    content=[
                        TextBlock(
                            type="text",
                            text=f"Error: Start line {offset} is "
                                 f"beyond file length ({total_lines} lines).",
                        ),
                    ],
                )
            # Clamp end_line to file length
            end_line = min(end_line, total_lines)
            # Extract the requested lines
            selected_lines = lines[start_line:end_line]
            content = ''.join(selected_lines)
            # Add summary information
            summary = (f"Read lines {start_line}-{end_line} of "
                       f"{total_lines} total lines from '{file_path}'")
            # save as markdown
            return_content = [
                TextBlock(
                    type="text",
                    text=content,
                ),
                TextBlock(
                    type="text",
                    text=summary,
                )
            ]
            if file_extension in TO_MARKDOWN_SUPPORT_MAPPING:
                file_name_with_ext = os.path.basename(file_path)
                filename_without_ext = os.path.splitext(file_name_with_ext)[0]
                file_path = os.path.join(
                    TMP_FILE_DIR,
                    filename_without_ext + ".md"
                )
                create_workspace_directory(self.sandbox, TMP_FILE_DIR)
                create_or_edit_workspace_file(
                    self.sandbox, file_path, full_content
                )
                return_content.append(
                    TextBlock(
                        type="text",
                        text=(
                            "NOTICE: "
                            "The (full) file is converted as markdown file"
                            " and saved completely at: "
                            f"{file_path}"
                        )
                    )
                )
            return ToolResponse(
                metadata={
                    "success": True,
                    "total_lines": total_lines,
                    "start_line": start_line + 1,
                    "end_line": end_line,
                    "lines_read": len(selected_lines),
                },
                content=return_content,
            )
        except Exception as e:
            logger.error(f"Error reading file {file_path}: {str(e)}")
            return ToolResponse(
                metadata={"success": False, "error": str(e)},
                content=[
                    TextBlock(
                        type="text",
                        text=f"Error reading file '{file_path}': {str(e)}",
                    ),
                ],
            )
 def _transfer_to_markdown_text(
    file_path: str, sandbox: AliasSandbox = None
 ) -> dict:
    ext = os.path.splitext(file_path)[1].lower()
    if ext not in TO_MARKDOWN_SUPPORT_MAPPING:
        return {
            "isError": True,
            "content": [
                {
                    "type": "text",
                    "text": f"File extension '{ext}' not supported in "
                            f"{TO_MARKDOWN_SUPPORT_MAPPING}."
                }
            ]
        }
    params = {
        "uri": "file:" + file_path
    }
    try:
        res = sandbox.call_tool(
            name="convert_to_markdown",
            arguments=params
        )
        content = res.get("content", [])
        new_content = []
        for i, block in enumerate(content):
            if content[i].get("text", "").startswith("Converted content:"):
                continue
            elif content[i].get("text", "").startswith("Output file:"):
                continue
            else:
                new_content.append(res["content"][i])
        res["content"] = new_content
    except Exception as e:
        res = {
            "isError": True,
            "error": str(e)
        }
    return res
 if __name__ == "__main__":
    from alias.agent.tools.sandbox_util import copy_local_file_to_workspace
    with AliasSandbox() as box:
        res = copy_local_file_to_workspace(
            box,
            "/Users/zitao.l/Downloads/22051_Which_LLM_Multi_Agent.pdf",
            "/workspace/test.pdf"
        )
        print(res)
        toolset = ImprovedFileOperations(box)
        res = asyncio.run(toolset.read_file("/workspace/test.pdf"))
        print(res)
--- a/alias/src/alias/agent/tools/improved_tools/multimodal_to_text.py
+++ b/alias/src/alias/agent/tools/improved_tools/multimodal_to_text.py
@@ -0,0 +1,309 @@
 # -*- coding: utf-8 -*-
 from io import BytesIO
 import os
 import base64
 import tempfile
 import requests
 import dashscope
 from agentscope.tool import ToolResponse
 from agentscope.message import TextBlock
 from alias.agent.tools.sandbox_util import (
    get_workspace_file,
    download_workspace_file_from_oss,
 )
 from alias.runtime.alias_sandbox import AliasSandbox
 def _get_binary_buffer(
    sandbox: AliasSandbox,
    audio_file_url: str,
 ):
    if audio_file_url.startswith(("http://", "https://")):
        response = requests.get(audio_file_url)
        response.raise_for_status()
        audio_buffer = BytesIO(response.content)
    else:
        audio_buffer = BytesIO(
            base64.b64decode(get_workspace_file(sandbox, audio_file_url)),
        )
    return audio_buffer
 class DashScopeMultiModalTools:
    """
    A set of multi-modal tools based on DashScope models.
    Work with multi-modal content in sandbox and publicly accessible online.
    """
    def __init__(
        self,
        sandbox: AliasSandbox,
        dashscope_api_key: str,
    ):
        self.sandbox = sandbox
        self.api_key = dashscope_api_key
    def dashscope_audio_to_text(
        self,
        audio_file_url: str,
        language: str = "en",
    ) -> ToolResponse:
        """
        Convert an audio file to text using DashScope's transcription service.
        Args:
            audio_file_url (`str`):
                The file path or URL to the audio file that needs to be
                transcribed.
            language (`str`, defaults to `"en"`):
                The language of the input audio in
                `ISO-639-1 format \
                <https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes>`_
                (e.g., "en", "zh", "fr"). Improves accuracy and latency.
        Returns:
            `ToolResponse`:
                A ToolResponse containing the generated content
                (ImageBlock/TextBlock/AudioBlock) or error information if the
                operation failed.
        """
        try:
            # Handle different types of audio file URLs
            if audio_file_url.startswith(("http://", "https://")):
                # For web URLs, use the URL directly
                audio_source = audio_file_url
            else:
                # For local files, save to a temporary file
                audio_buffer = _get_binary_buffer(
                    sandbox=self.sandbox,
                    audio_file_url=audio_file_url,
                )
                # Create a temporary file
                with tempfile.NamedTemporaryFile(
                    delete=False,
                    suffix=os.path.splitext(audio_file_url)[1],
                ) as temp_file:
                    temp_file.write(audio_buffer.getvalue())
                    audio_source = temp_file.name
            messages = [
                {
                    "role": "system",
                    "content": [
                        {
                            "text": "Transcript the content in the audio "
                                    "to text."
                        },
                    ],
                },
                {
                    "role": "user",
                    "content": [
                        {
                            "audio": audio_source,
                        },
                    ],
                },
            ]
            response = dashscope.MultiModalConversation.call(
                api_key=self.api_key,
                model="qwen3-asr-flash",
                messages=messages,
                asr_options={
                    "enable_lid": True,
                    "language": language,
                },
            )
            # Clean up temporary file if created
            if not audio_file_url.startswith(("http://", "https://")):
                try:
                    os.unlink(audio_source)
                except Exception as _:  # noqa: F841
                    pass
            content = response.output["choices"][0]["message"]["content"]
            if isinstance(content, list):
                content = content[0]["text"]
            if content is not None:
                return ToolResponse(
                    [
                        TextBlock(
                            type="text",
                            text=content,
                        ),
                    ],
                )
            else:
                return ToolResponse(
                    [
                        TextBlock(
                            type="text",
                            text="Error: Failed to generate text from audio",
                        ),
                    ],
                )
        except Exception as _:  # noqa: F841
            import traceback
            return ToolResponse(
                [
                    TextBlock(
                        type="text",
                        text="Error: Failed to transcribe audio: "
                        f"{traceback.format_exc()}",
                    ),
                ],
            )
    def dashscope_image_to_text(
        self,
        image_url: str,
        prompt: str = "Describe the image",
        model: str = "qwen-vl-plus",
    ) -> ToolResponse:
        """Generate text based on the given images.
        Args:
            image_url (`str`):
                The url of single or multiple images.
            prompt (`str`, defaults to 'Describe the image' ):
                The text prompt.
            model (`str`, defaults to 'qwen-vl-plus'):
                The model to use in DashScope MultiModal API.
        Returns:
            `ToolResponse`:
                A ToolResponse containing the generated content
                (ImageBlock/TextBlock/AudioBlock) or error information if the
                operation failed.
        """
        # Handle different types of audio file URLs
        if image_url.startswith(("http://", "https://")):
            # For web URLs, use the URL directly
            image_source = image_url
        else:
            # For local files, save to a temporary file
            image_buffer = _get_binary_buffer(
                self.sandbox,
                image_url,
            )
            # Create a temporary file
            with tempfile.NamedTemporaryFile(
                delete=False,
                suffix=".mp3",
            ) as temp_file:
                temp_file.write(image_buffer.getvalue())
                image_source = temp_file.name
        contents = []
        # Convert image paths according to the model requirements
        contents.append(
            {
                "image": image_source,
            },
        )
        # append text
        contents.append({"text": prompt})
        # currently only support one round of conversation
        # if multiple rounds of conversation are needed,
        # it would be better to implement an Agent class
        sys_message = {
            "role": "system",
            "content": [{"text": "You are a helpful assistant."}],
        }
        user_message = {
            "role": "user",
            "content": contents,
        }
        messages = [sys_message, user_message]
        try:
            response = dashscope.MultiModalConversation.call(
                model=model,
                messages=messages,
                api_key=self.api_key,
            )
            content = response.output["choices"][0]["message"]["content"]
            if isinstance(content, list):
                content = content[0]["text"]
            if content is not None:
                return ToolResponse(
                    [
                        TextBlock(
                            type="text",
                            text=content,
                        ),
                    ],
                )
            else:
                return ToolResponse(
                    [
                        TextBlock(
                            type="text",
                            text="Error: Failed to generate text",
                        ),
                    ],
                )
        except Exception as e:
            import traceback
            print(traceback.format_exc())
            return ToolResponse(
                [
                    TextBlock(
                        type="text",
                        text=f"Failed to generate text: {str(e)}",
                    ),
                ],
            )
 if __name__ == "__main__":
    with AliasSandbox() as box:
        tool_result = box.call_tool(
            "run_shell_command",
            arguments={"command": "apt update"}
        )
        print(tool_result)
        tool_result = box.call_tool(
            "run_shell_command",
            arguments={
                "command": "apt install wget",
            },
        )
        print(f"{tool_result}")
        tool_result = box.call_tool(
            "run_shell_command",
            arguments={
                "command": "pip install numpy pandas",
            },
        )
        print(f"{tool_result}")
        picture_path = "/workspace/5b2a14e8-6e59-479c-80e3-4696e8980152.jpg"
        download_workspace_file_from_oss(
            box,
            oss_url=(
                "https://dail-wlcb.oss-cn-wulanchabu.aliyuncs.com/zitao_l/"
                "GAIA/2023/validation/"
                "5b2a14e8-6e59-479c-80e3-4696e8980152.jpg"
            ),
            to_path=picture_path,
        )
        toolset = DashScopeMultiModalTools(
            sandbox=box,
            dashscope_api_key=os.getenv("DASHSCOPE_API_KEY", "")
        )
        result = toolset.dashscope_image_to_text(
            image_url=picture_path,
            prompt="Describe the image",
        )
        print(result)
--- a/alias/src/alias/agent/tools/sandbox_util.py
+++ b/alias/src/alias/agent/tools/sandbox_util.py
@@ -0,0 +1,456 @@
 # -*- coding: utf-8 -*-
 import os
 from typing import Optional
 import json
 from pathlib import Path
 import base64
 from loguru import logger
 import io
 import tarfile
 from agentscope_runtime.sandbox.manager.container_clients.docker_client import DockerClient
 from alias.runtime.alias_sandbox import AliasSandbox
 TEXT_EXTENSIONS = {
    ".txt",
    ".md",
    ".log",
    ".py",
    ".js",
    ".html",
    ".css",
    ".json",
    ".xml",
    ".yaml",
    ".yml",
    ".ini",
    ".cfg",
    ".conf",
    ".csv",
    ".tsv",
    ".sql",
    ".sh",
    ".bat",
    ".ps1",
    ".r",
    ".java",
    ".cpp",
    ".c",
    ".h",
    ".hpp",
    ".go",
    ".rs",
    ".php",
    ".rb",
    ".swift",
    ".kt",
    ".scala",
    ".dart",
    ".vue",
    ".jsx",
    ".tsx",
    ".sass",
    ".scss",
    ".less",
    ".styl",
    ".tex",
    ".rst",
    ".adoc",
    ".org",
    ".wiki",
    ".rtf",
 }
 def _valid_workspace_path(workspace_path: str) -> bool:
    try:
        # Resolve both paths to absolute paths
        path = Path(workspace_path).resolve()
        base = Path("/workspace").resolve()
        # Check if the resolved path is under the base directory
        return path.is_relative_to(base)
    except (OSError, ValueError):
        # Handle invalid paths
        return False
 def list_workspace_directories(
    sandbox: AliasSandbox,
    directory: str = "/workspace",
    recursive: bool = False,
 ) -> dict:
    """
    List files in the specified directory within the /workspace.
    Args:
        sandbox (AliasSandbox): sandbox to extract
        directory (str): The directory to list files in.
        recursive (bool): Whether to list recursively.
    Return:
        dict:
            with lists of `files` and `dirs`, both in format of full paths
    """
    if not _valid_workspace_path(directory):
        return {
            "isError": True,
            "content": [
                {
                    "type": "text",
                    "text": "`directory` must be under `/workspace`",
                },
            ],
        }
    result = {"files": [], "directories": []}
    def process_item(item, current_base):
        print(current_base, item["name"])
        current_path = (
            os.path.join(current_base, item["name"])
            if current_base
            else item["name"]
        )
        if item["type"] == "file":
            result["files"].append(current_path)
        elif item["type"] == "directory":
            result["directories"].append(current_path)
            if "children" in item:
                for child in item["children"]:
                    process_item(child, current_path)
    if recursive:
        tool_result = sandbox.call_tool(
            "directory_tree",
            arguments={"path": directory},
        )
        directory_tree = json.loads(tool_result["content"][0]["text"])
        for item in directory_tree:
            process_item(item, directory)
    else:
        tool_result = sandbox.call_tool(
            "list_directory",
            arguments={"path": directory},
        )
        list_content = tool_result["content"][0]["text"]
        print(list_content)
        sub_dir_items = [
            item.strip() for item in list_content.split("\n") if item.strip()
        ]
        for item in sub_dir_items:
            if "[DIR]" in item:
                dir_name = item.replace("[DIR] ", "")
                result["directories"].append(os.path.join(directory, dir_name))
            elif "[FILE]" in item:
                file_name = item.replace("[FILE] ", "")
                result["files"].append(os.path.join(directory, file_name))
    return result
 def get_workspace_file(
    sandbox: AliasSandbox,
    file_path: str,
 ) -> bytes:
    """
    Get the content of the specified file within the /workspace.
    Args:
        sandbox (AliasSandbox): sandbox to extract
        file_path (str): The file path to get the content of.
    Returns:
        content encoded in base64
    """
    if not _valid_workspace_path(file_path):
        return base64.b64encode(
            "`file_path` must be under `/workspace`".encode(),
        )
    tool_result = sandbox.call_tool(
        "run_shell_command",
        arguments={"command": f"base64 -i {file_path}"},
    )
    return tool_result["content"][0]["text"]
 def create_or_edit_workspace_file(
    sandbox: AliasSandbox,
    file_path: str,
    content: str,
 ) -> dict:
    if not _valid_workspace_path(file_path):
        return {
            "isError": True,
            "content": [
                {
                    "type": "text",
                    "text": "`file_path` must be under `/workspace`",
                },
            ],
        }
    sandbox.call_tool(
        "run_shell_command",
        arguments={"command": f"touch {file_path}"},
    )
    fill_result = sandbox.call_tool(
        "write_file",
        arguments={"path": file_path, "content": content},
    )
    return fill_result
 def create_workspace_directory(
    sandbox: AliasSandbox,
    directory_path: str,
 ) -> dict:
    """
    Create a directory within the /workspace directory.
    """
    if not _valid_workspace_path(directory_path):
        return {
            "isError": True,
            "content": [
                {
                    "type": "text",
                    "text": "`directory_path` must be under `/workspace`",
                },
            ],
        }
    tool_result = sandbox.call_tool(
        "run_shell_command",
        arguments={"command": f"mkdir -p {directory_path}"},
    )
    return tool_result
 def delete_workspace_file(
    sandbox: AliasSandbox,
    file_path: str,
 ) -> dict:
    """
    Delete a file within the /workspace directory.
    """
    if not _valid_workspace_path(file_path):
        return {
            "isError": True,
            "content": [
                {
                    "type": "text",
                    "text": "`file_path` must be under `/workspace`",
                },
            ],
        }
    tool_result = sandbox.call_tool(
        "run_shell_command",
        arguments={"command": f"rm -rf {file_path}"},
    )
    return tool_result
 def download_workspace_file_from_oss(
    sandbox: AliasSandbox,
    oss_url: str,
    to_path: str,
 ) -> dict:
    """
    Download a file from oss url to the /workspace directory.
    """
    if not _valid_workspace_path(to_path):
        return {
            "isError": True,
            "content": [
                {
                    "type": "text",
                    "text": "`file_path` must be under `/workspace`",
                },
            ],
        }
    logger.info(f"Prepared {to_path} from {oss_url}")
    tool_result = sandbox.call_tool(
        "run_shell_command",
        arguments={
            "command": "apt install wget",
        },
    )
    print(f"{tool_result}")
    tool_result = sandbox.call_tool(
        "run_shell_command",
        arguments={
            "command": f"wget -O {to_path} {oss_url}",
        },
    )
    print(f"{tool_result}")
    return tool_result
 def delete_workspace_directory(
    sandbox: AliasSandbox,
    directory_path: str,
 ) -> dict:
    """
    Delete a directory within the /workspace directory.
    """
    if not _valid_workspace_path(directory_path):
        return {
            "isError": True,
            "content": [
                {
                    "type": "text",
                    "text": "`directory` must be under `/workspace`",
                },
            ],
        }
    tool_result = sandbox.call_tool(
        "run_shell_command",
        arguments={"command": f"rm -rf {directory_path}"},
    )
    return tool_result
 def clean_workspace(sandbox: AliasSandbox):
    """
    Remove all files and subdirectories within the /workspace directory.
    """
    ls_result = list_workspace_directories(sandbox)
    for file in ls_result["files"]:
        delete_workspace_file(sandbox, file)
    for subdir in ls_result["directories"]:
        delete_workspace_directory(sandbox, subdir)
 def download_complete_workspace(
    sandbox: AliasSandbox,
    save_dir: Optional[str] = None,
 ):
    """
    Download all files and subdirectories within the /workspace directory.
    """
    download_files = {}
    list_dir = list_workspace_directories(sandbox, recursive=True)
    for file_path in list_dir["files"]:
        file_content = get_workspace_file(sandbox, file_path)
        file_extension = os.path.splitext(file_path)[1].lower()
        file_name = os.path.basename(file_path)
        if file_extension in TEXT_EXTENSIONS:
            text = base64.b64decode(file_content).decode("utf-8")
            download_files[file_path] = text
            if save_dir is not None:
                with open(
                    os.path.join(save_dir, file_name),
                    "w",
                    encoding="utf-8",
                ) as f:
                    f.write(text)
        else:
            content = base64.b64decode(file_content)
            download_files[file_path] = file_content  # this is base64
            if save_dir is not None:
                with open(os.path.join(save_dir, file_name), "wb") as f:
                    f.write(content)
        logger.info(f"Downloaded {file_path}")
    return download_files
 def copy_local_file_to_workspace(
    sandbox: AliasSandbox,
    local_path: str,
    target_path: Optional[str] = None,
 ):
    """
    Copy a local file to a subdirectory under /workspace directory.
    If target_path is not provided, the file will be copied to /workspace
    with the same filename as the local file.
    """
    if target_path is None:
        filename = os.path.basename(local_path)
        target_path = os.path.join("/workspace", filename)
    if not _valid_workspace_path(target_path):
        return {
            "isError": True,
            "content": [
                {
                    "type": "text",
                    "text": "`directory` must be under `/workspace`",
                },
            ],
        }
    client = sandbox.manager_api.client
    if not isinstance(client, DockerClient):
        return {
            "isError": True,
            "content": [
                {
                    "type": "text",
                    "text": "Copying file is not support sandbox "
                            f"with client type {type(client)}",
                },
            ],
        }
    docker_client = client.client
    container = docker_client.containers.get(sandbox.sandbox_id)
    # Create a tar archive in memory
    tar_stream = io.BytesIO()
    tar = tarfile.open(fileobj=tar_stream, mode='w')
    # Add file to tar archive
    tar.add(local_path, arcname=os.path.basename(target_path))
    tar.close()
    # Reset stream position
    tar_stream.seek(0)
    # Extract tar to container (directory path only)
    container.put_archive(os.path.dirname(target_path), tar_stream)
    return {
        "isError": False,
        "content": [
            {
                "type": "text",
                "text": f"{target_path}",
            },
        ],
    }
 if __name__ == "__main__":
    with AliasSandbox() as box:
        create_or_edit_workspace_file(
            box,
            "/workspace/test1.md",
            "This is the content of test1.md",
        )
        create_workspace_directory(box, "/workspace/subdir")
        create_or_edit_workspace_file(
            box,
            "/workspace/subdir/test2.md",
            "This is the content of test2.md",
        )
        create_or_edit_workspace_file(
            box,
            "/workspace/subdir/test3.md",
            "test3.md test3.md test3.md",
        )
        create_or_edit_workspace_file(
            box,
            "/workspace/test4.md",
            "test4.md test4.md test4.md",
        )
        print("try to copy file")
        copy_local_file_to_workspace(
            sandbox=box,
            local_path="/Users/zitao.l/Downloads/ms_online.png",
            target_path="/workspace/ms_online.png",
        )
        print(list_workspace_directories(box, recursive=False))
        # print(download_complete_workspace(box))
        clean_workspace(box)
        print(list_workspace_directories(box, recursive=False))
        input("Press Enter to continue...")
        print(json.dumps(box.list_tools(), indent=2))
--- a/alias/src/alias/agent/tools/tool_blacklist.py
+++ b/alias/src/alias/agent/tools/tool_blacklist.py
@@ -0,0 +1,7 @@
 # -*- coding: utf-8 -*-
 TOOL_BLACKLIST = {
    # replace with improved version
    "read_file",
    "convert_to_markdown",
 }
--- a/alias/src/alias/agent/tools/toolkit_hooks/init.py
+++ b/alias/src/alias/agent/tools/toolkit_hooks/init.py
@@ -0,0 +1,7 @@
 from .long_text_post_hook import LongTextPostHook
 from .read_file_post_hook import read_file_post_hook
 __all__ = [
    "LongTextPostHook",
    "read_file_post_hook",
 ]
--- a/alias/src/alias/agent/tools/toolkit_hooks/long_text_post_hook.py
+++ b/alias/src/alias/agent/tools/toolkit_hooks/long_text_post_hook.py
@@ -0,0 +1,131 @@
 # -*- coding: utf-8 -*-
 import json
 import os.path
 import uuid
 import textwrap
 from agentscope.tool import ToolResponse
 from agentscope.message import ToolUseBlock, TextBlock
 from alias.agent.utils.constants import TMP_FILE_DIR
 from alias.agent.tools.sandbox_util import (
    create_or_edit_workspace_file,
    create_workspace_directory
 )
 class LongTextPostHook:
    def __init__(self, sandbox):
        self.sandbox = sandbox
    def truncate_and_save_response(
        self,
        tool_use: ToolUseBlock,  # pylint: disable=W0613
        tool_response: ToolResponse,
    ) -> ToolResponse:
        """Post-process tool responses to prevent content overflow.
        This function ensures that tool responses don't exceed a predefined
        budget to prevent overwhelming the model with too much information.
        It truncates text content while preserving the structure of
        the response.
        Args:
            tool_use: The tool use block that triggered the response (unused).
            tool_response: The tool response to potentially truncate.
        Note:
            The budget is set to approximately 80K tokens (8194 * 10 characters)
            to ensure responses remain manageable for the language model.
        """
        # Set budget to prevent overwhelming the model with too much content
        budget = 8194 * 10  # Approximately 80K tokens of content
        append_hint = (
            "\n\n[Content is too long and truncated....]"
        )
        new_tool_response = ToolResponse(
            id=tool_response.id,
            stream=tool_response.stream,
            is_last=tool_response.is_last,
            is_interrupted=tool_response.is_interrupted,
            content=[]
        )
        if isinstance(tool_response.content, list):
            save_text_block = None
            for i, block in enumerate(tool_response.content):
                if block["type"] == "text":
                    text = block["text"]
                    text_len = len(text)
                    # If this block exceeds remaining budget, truncate it
                    if text_len > budget:
                        # Calculate truncation threshold
                        # (80% of proportional budget)
                        threshold = int(budget * 0.85)
                        # save the original response
                        tmp_file_name_prefix = tool_use.get("name", "")
                        save_text_block = self._save_tmp_file(
                            tmp_file_name_prefix,
                            tool_response.content
                        )
                        new_tool_response.append = (
                            text[:threshold] + append_hint
                        )
                        new_tool_response.content.append(
                            TextBlock(
                                type="text",
                                text=text[:threshold] + append_hint
                            )
                        )
                    else:
                        new_tool_response.content.append(block)
                    budget -= text_len
                    if budget <= 0 and save_text_block:
                        new_tool_response.content.append(save_text_block)
            return new_tool_response
        elif isinstance(tool_response.content, str):
            text_len = len(tool_response.content)
            text = tool_response.content
            if text_len > budget:
                tmp_file_name_prefix = tool_use.get("name", "")
                save_text_block = self._save_tmp_file(
                    tmp_file_name_prefix,
                    tool_response.content
                )
                # Calculate truncation threshold (80% of proportional budget)
                threshold = int(budget / text_len * len(text) * 0.8)
                tool_response.content = (
                    text[:threshold] + append_hint
                )
                tool_response.content = [
                    TextBlock(type="text", text=tool_response.content),
                    save_text_block
                ]
            return tool_response
    def _save_tmp_file(self, save_file_name_prefix: str, content: list | str):
        create_workspace_directory(self.sandbox, TMP_FILE_DIR)
        save_file_name = save_file_name_prefix + "-" + str(
            uuid.uuid4().hex[:8]
        )
        file_path = os.path.join(TMP_FILE_DIR, save_file_name)
        json_str = json.dumps(content, ensure_ascii=False, indent=2)
        wrapped = '\\n'.join(
            [textwrap.fill(line, width=500) for line in json_str.split('\\n')])
        create_or_edit_workspace_file(
            self.sandbox,
            file_path,
            wrapped,
        )
        return TextBlock(
            type="text",
            text=f"Dump the complete long file at {file_path}. "
                 "Don't try to read the complete file directly. "
                 "Use `grep -C 10 'YOUR_PATTERN' {file_path}` or "
                 "other bash command to extract "
                 "useful information.",
        )
--- a/alias/src/alias/agent/tools/toolkit_hooks/read_file_post_hook.py
+++ b/alias/src/alias/agent/tools/toolkit_hooks/read_file_post_hook.py
@@ -0,0 +1,49 @@
 from agentscope.message import ToolUseBlock, TextBlock
 from agentscope.tool import ToolResponse
 def _summarize_csv(text_block: TextBlock) -> None:
    """
    Replace the full CSV with a preview (first 5 rows) and a line count.
    """
    recommend_tool = "run_ipython_cell"
    head_len = 5
    lines = text_block["text"].splitlines()
    preview = "\n".join(lines[:head_len])
    text_block["text"] = (
        f"CSV Top-{head_len} rows preview ({len(lines)} total rows)\n"
        f"{preview}\n"
        f"Use code tool (e.g., {recommend_tool}) "
        "to process the data instead of reading all of it"
    )
 def read_file_post_hook(
        tool_use: ToolUseBlock,
        tool_response: ToolResponse,
 ) -> ToolResponse:
    """
    Condense large CSV outputs after `read_file` or `read_multiple_files`.
    Returns the (possibly modified) ToolResponse so the agent sees only
    a brief snippet instead of the entire file.
    """
    tool_name = tool_use.get("name", "")
    # --- read_file ---------------------------------------------------------
    if tool_name == "read_file":
        path: str = str(tool_use["input"].get("path", ""))
        if path.lower().endswith(".csv"):
            _summarize_csv(tool_response.content[0])
    # --- read_multiple_files ----------------------------------------------
    elif tool_name == "read_multiple_files":
        paths = tool_use["input"].get("paths", [])
        for i, path in enumerate(paths):
            if path.lower().endswith(".csv"):
                # Match each path to its corresponding block
                _summarize_csv(tool_response.content[i])
    return tool_response
--- a/alias/src/alias/agent/utils/init.py
+++ b/alias/src/alias/agent/utils/init.py
@@ -0,0 +1,6 @@
 # -*- coding: utf-8 -*-
 from alias.agent.utils.agent_save_state import AliasAgentStates
 __all__ = [
    "AliasAgentStates",
 ]
--- a/alias/src/alias/agent/utils/agent_save_state.py
+++ b/alias/src/alias/agent/utils/agent_save_state.py
@@ -0,0 +1,9 @@
 # -*- coding: utf-8 -*-
 from pydantic import BaseModel, Field
 class AliasAgentStates(BaseModel):
    agent_states: dict[str, dict] = Field(
        default_factory=dict,
        description="a dictionary of `agent_name` to `agent state` (as dict) ",
    )
--- a/alias/src/alias/agent/utils/constants.py
+++ b/alias/src/alias/agent/utils/constants.py
@@ -0,0 +1,35 @@
 # -*- coding: utf-8 -*-
 import os
 MODEL_MAX_RETRIES = int(os.getenv("MODEL_MAX_RETRIES", "20"))
 PLANNER_MAX_ITER = int(os.getenv("AGENT_MAX_ITER", "100"))
 WORKER_MAX_ITER = int(os.getenv("WORKER_MAX_ITER", "50"))
 DEFAULT_PLANNER_NAME = "task-meta-planner"
 DEFAULT_BROWSER_WORKER_NAME = "browser-agent"
 # TASK Switching
 TASK_UPDATE_TRIGGER_MESSAGE = (
    "👀 Try to update task-solving process based on new user input..."
 )
 TASK_UPDATE_ACK_MESSAGE = "✍️ Updating task-solving process..."
 SIMPLE_TASK_DESCRIPTION = (
    "This is a simple task. Please finish it in one subtask"
 )
 BROWSER_AGENT_DESCRIPTION = (
    "This is a browser-based agent that can use browser to view websites."
    "It is extremely useful for tasks requiring going through a website,"
    "requiring clicking to explore the links on the webpage. "
    "Thus, it is good for tasks that require exploring "
    "the a webpage domain, a GitHub repo, "
    "or check the latest travel (e.g., flight, hotel) information."
    "However, when you have a general information gathering task"
    " or deep research which heavily depends on search engine, "
    "TRY TO CREATE/USE ANOTHER AGENT WITH SEARCH TOOL TO DO SO."
 )
 # tmp file dir
 TMP_FILE_DIR = "/workspace/tmp_files/"
--- a/alias/src/alias/cli.py
+++ b/alias/src/alias/cli.py
@@ -0,0 +1,263 @@
 #!/usr/bin/env python3
 # -*- coding: utf-8 -*-
 """
 Alias Command Line Interface
 This module provides a terminal executable entry point
 for the Alias agent application.
 """
 import json
 from typing import Optional
 import asyncio
 import argparse
 import sys
 import os
 import traceback
 import webbrowser
 from loguru import logger
 from agentscope.agent import UserAgent, TerminalUserInput
 from agentscope_runtime.sandbox import FilesystemSandbox, BrowserSandbox
 from agentscope_runtime.sandbox.box.sandbox import Sandbox
 from agentscope.mcp import StdIOStatefulClient
 from alias.agent.mock import MockSessionService, UserMessage
 from alias.agent.run import (
    arun_agents,
    test_browseruse_agent,
    test_deepresearch_agent,
 )
 from alias.agent.tools import AliasToolkit
 from alias.agent.tools.improved_tools import DashScopeMultiModalTools
 from alias.runtime.alias_sandbox.alias_sandbox import AliasSandbox
 from alias.agent.tools.sandbox_util import copy_local_file_to_workspace
 async def run_agent_task(
    user_msg: str,
    mode: str = "all",
    files: Optional[list[str]] = None,
 ) -> None:
    """
    Run an agent task with the specified configuration.
    Args:
        user_msg: The user's task/query
        mode: Agent mode ('all', 'worker', 'dr', 'browser')
        files: List of local file paths to upload to sandbox workspace
    """
    # Initialize session
    session = MockSessionService()
    # Create initial user message
    user_agent = UserAgent(name="User")
    user_agent.override_instance_input_method(
        input_method = TerminalUserInput(
            input_hint = "User (Enter `exit` or `quit` to exit): "
        )
    )
    # Run agent with sandbox context
    with AliasSandbox() as sandbox:
        logger.info(
            f"Sandbox mount dir: {sandbox.get_info().get('mount_dir')}"
        )
        logger.info(f"Sandbox desktop URL: {sandbox.desktop_url}")
        webbrowser.open(sandbox.desktop_url)
        # Upload files to sandbox if provided
        if files:
            target_paths = []
            logger.info(
                f"Uploading {len(files)} file(s) to sandbox workspace..."
            )
            for file_path in files:
                if not os.path.exists(file_path):
                    logger.error(f"File not found: {file_path}")
                    continue
                # Get the filename and construct target path in workspace
                filename = os.path.basename(file_path)
                target_path = f"/workspace/{filename}"
                logger.info(f"Uploading {file_path} to {target_path}")
                result = copy_local_file_to_workspace(
                    sandbox=sandbox,
                    local_path=file_path,
                    target_path=target_path,
                )
                if result.get("isError"):
                    raise ValueError(f"Failed to upload {file_path}: {result}")
                else:
                    logger.info(f"Successfully uploaded to {result}")
                target_paths.append(result.get("content", [])[0].get("text"))
            user_msg += "\n\nUser uploaded files:\n" + "\n".join(target_paths)
        initial_user_message = UserMessage(
            content=user_msg,
        )
        await session.create_message(initial_user_message)
        await _run_agent_loop(
            mode=mode,
            session=session,
            user_agent=user_agent,
            sandbox=sandbox
        )
 async def _run_agent_loop(
    mode: str,
    session: MockSessionService,
    user_agent: UserAgent,
    sandbox: FilesystemSandbox,
 ) -> None:
    """
    Execute the agent loop with follow-up interactions.
    Args:
        mode: Agent mode to run
        session: Session service instance
        user_agent: User agent for interactive follow-ups
        sandbox: Sandbox accessible for all agents
    """
    while True:
        # Run the appropriate agent based on mode
        if mode == "browser":
            usr_msg = (await session.get_messages())[-1].message.get("content")
            logger.info(f"--> user_msg: {usr_msg}")
            await test_browseruse_agent(
                usr_msg,
                session,
                sandbox=sandbox,
            )
            break
        elif mode == "dr":
            usr_msg = (await session.get_messages())[-1].message.get("content")
            logger.info(f"--> user_msg: {usr_msg}")
            await test_deepresearch_agent(
                usr_msg,
                session,
                sandbox=sandbox,
            )
            break
        elif mode == "all":
            await arun_agents(
                session,
                sandbox=sandbox,
                enable_clarification=False,
            )
        else:
            raise ValueError(f"Unknown mode: {mode}")
        # Check for follow-up interaction
        follow_msg = await user_agent()
        if (
            len(follow_msg.content) == 0
            or follow_msg.content.lower() in ["exit", "quit"]
        ):
            logger.info("Exiting agent loop")
            break
        await session.create_message(UserMessage(content=follow_msg.content))
 def main():
    """Main CLI entry point."""
    parser = argparse.ArgumentParser(
        prog="alias",
        description="Alias Agent System",
        epilog=(
            "Example: alias run --mode all "
            "--task 'Analyze Meta stock performance'"
        ),
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    subparsers = parser.add_subparsers(
        dest="command", help="Available commands"
    )
    # Run command
    run_parser = subparsers.add_parser(
        "run",
        help="Run an agent task",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    run_parser.add_argument(
        "--task",
        type=str,
        required=True,
        help="The task or query for the agent to execute",
    )
    run_parser.add_argument(
        "--mode",
        choices=["all", "worker", "dr", "browser"],
        default="all",
        help=(
            "Agent mode: "
            "'all' (meta planner with workers), "
            "'worker' (single worker agent), "
            "'dr' (deep research agent), "
            "'browser' (browser agent)"
        ),
    )
    run_parser.add_argument(
        "--verbose",
        "-v",
        action="store_true",
        help="Enable verbose logging",
    )
    run_parser.add_argument(
        "--files",
        "-f",
        type=str,
        nargs="+",
        help="Local file paths to upload to sandbox workspace "
             "for agent to use (e.g., --files file1.txt file2.csv)",
    )
    # Version command
    parser.add_argument(
        "--version",
        action="version",
        version="Alias 0.1.0",
    )
    args = parser.parse_args()
    # Configure logging
    if hasattr(args, "verbose") and args.verbose:
        logger.remove()
        logger.add(sys.stderr, level="DEBUG")
    # Handle commands
    if args.command == "run":
        try:
            asyncio.run(
                run_agent_task(
                    user_msg=args.task,
                    mode=args.mode,
                    files=args.files if hasattr(args, "files") else None,
                )
            )
        except KeyboardInterrupt:
            logger.info("\nInterrupted by user")
            sys.exit(0)
        except Exception as e:
            logger.error(f"Error running agent: {e}")
            if hasattr(args, "verbose") and args.verbose:
                traceback.print_exc()
            sys.exit(1)
    else:
        parser.print_help()
        sys.exit(1)
 if __name__ == "__main__":
    main()
--- a/alias/src/alias/runtime/init.py
+++ b/alias/src/alias/runtime/init.py
@@ -0,0 +1,3 @@
 # -*- coding: utf-8 -*-
 from agentscope_runtime.sandbox.box.sandbox import Sandbox
--- a/alias/src/alias/runtime/alias_sandbox/.gitignore
+++ b/alias/src/alias/runtime/alias_sandbox/.gitignore
@@ -0,0 +1,63 @@
 # See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
 # dependencies
 frontend/node_modules
 **/package-lock.json
 /.pnp
 .pnp.js
 node_modules/
 sessions_mount_dir/
 # testing
 /coverage
 # cookbook
 cookbook/_build
 # production
 /build
 # misc
 .env
 .env.*
 !.env.example
 !.env.template
 __pycache__/
 *.db
 *.rdb
 *.egg-info/
 # IDEs and editors
 .idea/
 .vscode/
 *.suo
 *.ntvs*
 *.njsproj
 *.sln
 *.sw?
 # Logs
 npm-debug.log*
 yarn-debug.log*
 yarn-error.log*
 openapi-ts*.log
 # MacOS
 .DS_Store
 # Windows
 Thumbs.db
 ehthumbs.db
 Desktop.ini
 # Linux
 *~
 # Python
 *.py[cod]
 *$py.class
 uv.lock
 # Logs
 logs/
 *.log
--- a/alias/src/alias/runtime/alias_sandbox/Dockerfile
+++ b/alias/src/alias/runtime/alias_sandbox/Dockerfile
@@ -0,0 +1,82 @@
 FROM node:22-slim
 # ENV variables
 ENV NODE_ENV=production
 ENV WORKSPACE_DIR=/workspace
 ARG DEBIAN_FRONTEND=noninteractive
 RUN apt-get update && apt-get install -y --fix-missing \
    curl  \
    python3  \
    python3-pip  \
    python3-venv \
    build-essential  \
    libssl-dev  \
    git  \
    supervisor  \
    vim  \
    nginx \
    gettext-base \
    xfce4 \
    xfce4-terminal \
    x11vnc \
    xvfb \
    novnc \
    websockify \
    dbus-x11 \
    fonts-wqy-zenhei \
    fonts-wqy-microhei
 RUN apt-get update && apt-get install -y --fix-missing \
    chromium \
    chromium-sandbox \
    libx11-xcb1 \
    libxcomposite1 \
    libxdamage1 \
    libxext6 \
    libxfixes3 \
    libxi6 \
    libxtst6 \
    libnss3 \
    libglib2.0-0 \
    libdrm2 \
    libgbm1 \
    libasound2 \
    fonts-liberation \
    libu2f-udev
 RUN apt-get update && apt-get install -y --fix-missing \
    wget \
    grep \
    findutils
 RUN sed -i 's/^CHROMIUM_FLAGS=""/CHROMIUM_FLAGS="--no-sandbox"/' /usr/bin/chromium
 WORKDIR /agentscope_runtime
 RUN python3 -m venv venv
 ENV PATH="/agentscope_runtime/venv/bin:$PATH"
 COPY box/ ./
 RUN pip install -r requirements.txt
 WORKDIR ${WORKSPACE_DIR}
 RUN mv /agentscope_runtime/config/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
 RUN mv /agentscope_runtime/config/nginx.conf.template /etc/nginx/nginx.conf.template
 RUN mv /agentscope_runtime/vnc_relay.html /usr/share/novnc/vnc_relay.html
 RUN git init \
    && chmod +x /agentscope_runtime/scripts/start.sh
 COPY .gitignore ${WORKSPACE_DIR}
 # Cleanup
 RUN pip cache purge \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* \
    && rm -rf /tmp/* \
    && rm -rf /var/tmp/* \
    && npm cache clean --force \
    && rm -rf ~/.npm/_cacache
 CMD ["/bin/sh", "-c", "envsubst '$SECRET_TOKEN' < /etc/nginx/nginx.conf.template > /etc/nginx/nginx.conf && /usr/bin/supervisord -c /etc/supervisor/conf.d/supervisord.conf"]
--- a/alias/src/alias/runtime/alias_sandbox/init.py
+++ b/alias/src/alias/runtime/alias_sandbox/init.py
@@ -0,0 +1,3 @@
 from .alias_sandbox import AliasSandbox
 __all__ = ['AliasSandbox']
--- a/alias/src/alias/runtime/alias_sandbox/alias_sandbox.py
+++ b/alias/src/alias/runtime/alias_sandbox/alias_sandbox.py
@@ -0,0 +1,33 @@
 # -*- coding: utf-8 -*-
 from typing import Optional
 from agentscope_runtime.sandbox.utils import build_image_uri
 from agentscope_runtime.sandbox.registry import SandboxRegistry
 from agentscope_runtime.sandbox.enums import SandboxType
 from agentscope_runtime.sandbox.box.base import BaseSandbox
 from agentscope_runtime.sandbox.box.gui import GUIMixin
@SandboxRegistry.register(
    build_image_uri("runtime-sandbox-alias"),
    sandbox_type="alias",
    security_level="high",
    timeout=30,
    description="Alias Sandbox",
 )
 class AliasSandbox(GUIMixin, BaseSandbox):
    def __init__(  # pylint: disable=useless-parent-delegation
        self,
        sandbox_id: Optional[str] = None,
        timeout: int = 3000,
        base_url: Optional[str] = None,
        bearer_token: Optional[str] = None,
        sandbox_type: SandboxType = "alias",
    ):
        super().__init__(
            sandbox_id,
            timeout,
            base_url,
            bearer_token,
            sandbox_type,
        )
--- a/alias/src/alias/runtime/alias_sandbox/box/init.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/init.py
--- a/alias/src/alias/runtime/alias_sandbox/box/app.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/app.py
@@ -0,0 +1,44 @@
 # -*- coding: utf-8 -*-
 import logging
 from fastapi import FastAPI, Response, Depends
 from routers import (
    generic_router,
    mcp_router,
    watcher_router,
    workspace_router,
 )
 from dependencies import verify_secret_token
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 # Initialize FastAPI app
 app = FastAPI(
    title="AgentScope Runtime Sandbox Server",
    version="1.0",
    description="Agentscope runtime sandbox server.",
 )
@app.get(
    "/healthz",
    summary="Check the health of the API",
    dependencies=[Depends(verify_secret_token)],
 )
 async def healthz():
    return Response(content="OK", status_code=200)
 app.include_router(mcp_router, dependencies=[Depends(verify_secret_token)])
 app.include_router(generic_router, dependencies=[Depends(verify_secret_token)])
 app.include_router(watcher_router, dependencies=[Depends(verify_secret_token)])
 app.include_router(
    workspace_router,
    dependencies=[Depends(verify_secret_token)],
 )
 if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000, workers=1)
--- a/alias/src/alias/runtime/alias_sandbox/box/config/nginx.conf.template
+++ b/alias/src/alias/runtime/alias_sandbox/box/config/nginx.conf.template
@@ -0,0 +1,43 @@
 worker_processes 1;
 events { worker_connections 1024; }
 http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;
    server {
        listen 80;
        location /fastapi {
            rewrite ^/fastapi(.*)$ $1 break;
            proxy_pass http://localhost:8000;
            proxy_http_version 1.1;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
        location /vnc/ {
            alias /usr/share/novnc/;
            index vnc.html;
            include /etc/nginx/mime.types;
            types {
                application/javascript js;
            }
        }
        location /websockify {
            proxy_pass http://localhost:9000/websockify;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
 }
--- a/alias/src/alias/runtime/alias_sandbox/box/config/supervisord.conf
+++ b/alias/src/alias/runtime/alias_sandbox/box/config/supervisord.conf
@@ -0,0 +1,64 @@
 [supervisord]
 user=root
 logfile=/var/log/supervisord.log
 pidfile=/var/log/supervisord.pid
 nodaemon=true
 [program:dbus]
 command=/usr/bin/dbus-daemon --system --nofork
 autostart=true
 autorestart=true
 stderr_logfile=/var/log/dbus.err.log
 stdout_logfile=/var/log/dbus.out.log
 [program:agentscope_runtime]
 command=/agentscope_runtime/scripts/start.sh
 autostart=true
 autorestart=true
 priority=30
 stderr_logfile=/var/log/agentscope_runtime.err.log
 stdout_logfile=/var/log/agentscope_runtime.out.log
 environment=DISPLAY=":1"
 [program:nginx]
 command=/usr/sbin/nginx -g 'daemon off;'
 autostart=true
 autorestart=true
 stderr_logfile=/var/log/nginx.err.log
 stdout_logfile=/var/log/nginx.out.log
 [program:xvfb]
 command=/usr/bin/Xvfb :1 -screen 0 1280x800x24
 autostart=true
 autorestart=true
 priority=10
 stderr_logfile=/var/log/xvfb.err.log
 stdout_logfile=/var/log/xvfb.out.log
 environment=DISPLAY=":1"
 [program:xfce4]
 command=/bin/sh -c "export DISPLAY=:1 && sleep 2 && export $(dbus-launch) && dbus-run-session startxfce4"
 autostart=true
 autorestart=true
 priority=20
 stderr_logfile=/var/log/xfce4.err.log
 stdout_logfile=/var/log/xfce4.out.log
 environment=DISPLAY=":1"
 [program:x11vnc]
 command=/bin/sh -c "export DISPLAY=:1 && sleep 3 && x11vnc -display :1 -forever -shared -passwd $SECRET_TOKEN -rfbport 5901"
 autostart=true
 autorestart=true
 priority=30
 stderr_logfile=/var/log/x11vnc.err.log
 stdout_logfile=/var/log/x11vnc.out.log
 environment=DISPLAY=":1"
 [program:novnc]
 command=/bin/bash -c "websockify --web=/usr/share/novnc/ 9000 localhost:5901"
 directory=/usr/share/novnc
 autostart=true
 autorestart=true
 priority=40
 stderr_logfile=/var/log/novnc.err.log
 stdout_logfile=/var/log/novnc.out.log
--- a/alias/src/alias/runtime/alias_sandbox/box/dependencies/init.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/dependencies/init.py
@@ -0,0 +1,5 @@
 # -*- coding: utf-8 -*-
 from .deps import verify_secret_token
 __all__ = ["verify_secret_token"]
--- a/alias/src/alias/runtime/alias_sandbox/box/dependencies/deps.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/dependencies/deps.py
@@ -0,0 +1,22 @@
 # -*- coding: utf-8 -*-
 import os
 from typing import Optional
 from fastapi import Header, HTTPException, status
 SECRET_TOKEN = os.getenv("SECRET_TOKEN", "secret_token123")
 async def verify_secret_token(authorization: Optional[str] = Header(None)):
    if authorization is None or not authorization.startswith("Bearer "):
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Missing or invalid authorization header",
        )
    token = authorization.split("Bearer ")[1]
    if token != SECRET_TOKEN:
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Invalid secret token",
        )
--- a/alias/src/alias/runtime/alias_sandbox/box/mcp_server_configs.json
+++ b/alias/src/alias/runtime/alias_sandbox/box/mcp_server_configs.json
@@ -0,0 +1,25 @@
 {
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "-y",
        "@playwright/mcp@0.0.39",
        "--no-sandbox",
        "--config",
        "/agentscope_runtime/playwright_mcp_config.json"
      ]
    },
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem@2025.3.28",
        "/workspace"
      ]
    },
    "markitdown": {
      "command": "markitdown-mcp"
    }
  }
 }
--- a/alias/src/alias/runtime/alias_sandbox/box/playwright_mcp_config.json
+++ b/alias/src/alias/runtime/alias_sandbox/box/playwright_mcp_config.json
@@ -0,0 +1,23 @@
 {
  "browser": {
    "browserName": "chromium",
    "launchOptions": {
      "executablePath": "/usr/bin/chromium"
    },
    "contextOptions": {
      "viewport": {
        "width": 1024,
        "height": 768
      }
    }
  },
  "capabilities": [
    "core",
    "tabs",
    "pdf",
    "history",
    "wait",
    "files"
  ],
  "outputDir": "/workspace"
 }
--- a/alias/src/alias/runtime/alias_sandbox/box/requirements.txt
+++ b/alias/src/alias/runtime/alias_sandbox/box/requirements.txt
@@ -0,0 +1,16 @@
 ipython==8.31.0
 fastapi==0.115.6
 uvicorn==0.34.0
 pydantic==2.10.5
 requests==2.32.3
 mcp==1.9.0
 aiofiles
 uv
 gitpython
 markitdown-mcp
 numpy
 pandas
 scikit-learn
 scipy
 seaborn
 matplotlib
--- a/alias/src/alias/runtime/alias_sandbox/box/routers/init.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/routers/init.py
@@ -0,0 +1,12 @@
 # -*- coding: utf-8 -*-
 from .generic import generic_router
 from .mcp import mcp_router
 from .runtime_watcher import watcher_router
 from .workspace import workspace_router
 __all__ = [
    "mcp_router",
    "generic_router",
    "watcher_router",
    "workspace_router",
 ]
--- a/alias/src/alias/runtime/alias_sandbox/box/routers/generic.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/routers/generic.py
@@ -0,0 +1,192 @@
 # -*- coding: utf-8 -*-
 import io
 import sys
 import logging
 import subprocess
 import traceback
 from contextlib import redirect_stderr, redirect_stdout
 from fastapi import APIRouter, Body, HTTPException
 from IPython.core.interactiveshell import InteractiveShell
 from mcp.types import CallToolResult, TextContent
 SPLIT_OUTPUT_MODE = True
 generic_router = APIRouter()
 # Initialize IPython shell
 ipy = InteractiveShell.instance()
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
@generic_router.post(
    "/tools/run_ipython_cell",
    summary="Invoke a cell in a stateful IPython (Jupyter) kernel",
 )
 async def run_ipython_cell(
    code: str = Body(
        ...,
        example="print('Hello World')",
        embed=True,
    ),
 ):
    """
    Execute code in an IPython kernel and return the results.
    """
    try:
        if not code:
            raise HTTPException(status_code=400, detail="Code is required.")
        # Capture stdout and stderr separately
        stdout_buf = io.StringIO()
        stderr_buf = io.StringIO()
        with redirect_stdout(stdout_buf), redirect_stderr(stderr_buf):
            preprocessing_exc_tuple = None
            try:
                transformed_cell = ipy.transform_cell(code)
            except Exception:
                transformed_cell = code
                preprocessing_exc_tuple = sys.exc_info()
            if transformed_cell is None:
                raise HTTPException(
                    status_code=500,
                    detail="IPython cell transformation failed: "
                    "transformed_cell is None.",
                )
            await ipy.run_cell_async(
                code,
                transformed_cell=transformed_cell,
                preprocessing_exc_tuple=preprocessing_exc_tuple,
            )
        stdout_content = stdout_buf.getvalue()
        stderr_content = stderr_buf.getvalue()
        content_list = []
        if SPLIT_OUTPUT_MODE:
            content_list.append(
                TextContent(
                    type="text",
                    text=stdout_content,
                    description="stdout",
                ),
            )
            if stderr_content:
                content_list.append(
                    TextContent(
                        type="text",
                        text=stderr_content,
                        description="stderr",
                    ),
                )
        else:
            content_list.append(
                TextContent(
                    type="text",
                    text=stdout_content + "\n" + stderr_content,
                    description="output",
                ),
            )
        is_error = bool(stderr_content)
        return CallToolResult(
            content=content_list,
            isError=is_error,
        ).model_dump()
    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"{str(e)}: {traceback.format_exc()}",
        ) from e
@generic_router.post(
    "/tools/run_shell_command",
    summary="Invoke a shell command.",
 )
 async def run_shell_command(
    command: str = Body(
        ...,
        example="pwd",
        embed=True,
    ),
 ):
    """
    Execute a shell command and return the results.
    """
    try:
        if not command:
            raise HTTPException(status_code=400, detail="Command is required.")
        result = subprocess.run(
            command,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
            check=False,
        )
        stdout_content = result.stdout
        stderr_content = result.stderr
        content_list = []
        if SPLIT_OUTPUT_MODE:
            content_list.append(
                TextContent(
                    type="text",
                    text=stdout_content,
                    description="stdout",
                ),
            )
            if stderr_content:
                content_list.append(
                    TextContent(
                        type="text",
                        text=stderr_content,
                        description="stderr",
                    ),
                )
            content_list.append(
                TextContent(
                    type="text",
                    text=str(result.returncode),
                    description="returncode",
                ),
            )
        else:
            content_list.append(
                TextContent(
                    type="text",
                    text=stdout_content
                    + "\n"
                    + stderr_content
                    + "\n"
                    + str(result.returncode),
                    description="output",
                ),
            )
        is_error = bool(stderr_content)
        return CallToolResult(
            content=content_list,
            isError=is_error,
        ).model_dump()
    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"{str(e)}: {traceback.format_exc()}",
        ) from e
--- a/alias/src/alias/runtime/alias_sandbox/box/routers/mcp.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/routers/mcp.py
@@ -0,0 +1,207 @@
 # -*- coding: utf-8 -*-
 import copy
 import json
 import logging
 import os
 import traceback
 from fastapi import APIRouter, Body, HTTPException, Response
 from .mcp_utils import MCPSessionHandler
 mcp_router = APIRouter()
 _MCP_SERVERS = {}
 current_directory = os.path.dirname(os.path.abspath(__file__))
 mcp_server_configs_path = os.path.abspath(
    os.path.join(current_directory, "../mcp_server_configs.json"),
 )
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 # NOTE: DO NOT use API-KEY Server in release version due to security issues
@mcp_router.post(
    "/mcp/add_servers",
    summary="Add and initialize MCP servers",
 )
 async def add_servers(
    server_configs: dict = Body(
        {},
        embed=True,
    ),
    overwrite: bool = Body(
        False,
        embed=True,
    ),
 ):
    global _MCP_SERVERS
    try:
        if not server_configs:
            raise HTTPException(
                status_code=400,
                detail="server_configs is required.",
            )
        new_servers = [
            MCPSessionHandler(name, config)
            for name, config in server_configs["mcpServers"].items()
        ]
        fail_servers = []
        # Initialize the servers
        for server in new_servers:
            if server.name in _MCP_SERVERS:
                if not overwrite:
                    continue
                # Cleanup old server
                await _MCP_SERVERS.pop(server.name).cleanup()
            try:
                await server.initialize()
                _MCP_SERVERS[server.name] = server
            except Exception as e:
                logging.error(f"Failed to initialize server: {e}")
                fail_servers.append(server)
                continue
        if fail_servers:
            for server in fail_servers:
                await server.cleanup()
            raise HTTPException(
                status_code=500,
                detail=f"Failed to initialize server: "
                f"{[server.name for server in fail_servers]}",
            )
        return Response(content="OK", status_code=200)
    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"{str(e)}: {traceback.format_exc()}",
        ) from e
@mcp_router.get(
    "/mcp/list_tools",
    summary="List MCP tools",
 )
 async def list_tools():
    try:
        mcp_tools = {}
        for server_name, server in _MCP_SERVERS.items():
            tools = await server.list_tools()
            server_tools = {}
            for tool in tools:
                name = tool.name
                if name in server_tools:
                    logging.warning(
                        f"Service function `{name}` already exists, "
                        f"skip adding it.",
                    )
                else:
                    json_schema = {
                        "type": "function",
                        "function": {
                            "name": tool.name,
                            "description": tool.description,
                            "parameters": {
                                "type": "object",
                                "properties": tool.inputSchema.get(
                                    "properties",
                                    {},
                                ),
                                "required": tool.inputSchema.get(
                                    "required",
                                    [],
                                ),
                            },
                        },
                    }
                    server_tools[tool.name] = {
                        "name": tool.name,
                        "json_schema": json_schema,
                    }
            mcp_tools[server_name] = copy.deepcopy(server_tools)
        return mcp_tools
    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"{str(e)}: {traceback.format_exc()}",
        ) from e
@mcp_router.post(
    "/mcp/call_tool",
    summary="Execute MCP tool",
 )
 async def call_tool(
    tool_name: str = Body(
        ...,
        embed=True,
    ),
    arguments: dict = Body(
        {},
        embed=True,
    ),
 ) -> None:
    try:
        if not tool_name:
            raise HTTPException(
                status_code=400,
                detail="tool_name is required.",
            )
        tools = await list_tools()
        for server_name, server_tools in tools.items():
            if tool_name not in server_tools:
                continue
            server = _MCP_SERVERS[server_name]
            result = await server.call_tool(tool_name, arguments)
            return result.model_dump()
        raise ModuleNotFoundError(f"Tool '{tool_name}' not found.")
    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"{str(e)}: {traceback.format_exc()}",
        ) from e
@mcp_router.on_event("shutdown")
 async def cleanup_servers() -> None:
    """Clean up all servers properly."""
    global _MCP_SERVERS
    for server in reversed(list(_MCP_SERVERS.values())):
        try:
            await server.cleanup()
        except Exception as e:
            logging.error(f"Failed to cleanup server: {e}")
    _MCP_SERVERS = {}
@mcp_router.on_event("startup")
 async def startup_event():
    # Load MCP server configs
    try:
        with open(mcp_server_configs_path, "r", encoding="utf-8") as file:
            mcp_server_configs = json.load(file)
    except Exception as e:
        logger.error(f"Failed to load MCP server configs: {e}")
        mcp_server_configs = {}
    # Call the add_servers function
    if mcp_server_configs:
        try:
            await add_servers(
                server_configs=mcp_server_configs,
                overwrite=False,
            )
        except Exception as e:
            logger.error(
                f"Failed to add MCP servers: {e}, {traceback.format_exc()}",
            )
--- a/alias/src/alias/runtime/alias_sandbox/box/routers/mcp_utils.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/routers/mcp_utils.py
@@ -0,0 +1,171 @@
 # -*- coding: utf-8 -*-
 import asyncio
 import logging
 import os
 import shutil
 import traceback
 from contextlib import AsyncExitStack
 from typing import Any
 from mcp import ClientSession, StdioServerParameters
 from mcp.client.sse import sse_client
 from mcp.client.stdio import stdio_client
 from mcp.client.streamable_http import streamablehttp_client
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 class MCPSessionHandler:
    """Manages MCP server connections and tool execution."""
    def __init__(self, name: str, config: dict[str, Any]) -> None:
        self.name: str = name
        self.config: dict[str, Any] = config
        self.stdio_context: Any | None = None
        self.session: ClientSession | None = None
        self._cleanup_lock: asyncio.Lock = asyncio.Lock()
        self._exit_stack: AsyncExitStack = AsyncExitStack()
    async def initialize(self) -> None:
        """Initialize the server connection."""
        command = (
            shutil.which("npx")
            if self.config.get("command") == "npx"
            else self.config.get("command")
        )
        try:
            if command:
                server_params = StdioServerParameters(
                    command=command,
                    args=self.config.get("args", []),
                    env={**os.environ, **self.config.get("env", {})},
                    # cwd=self.config.get("cwd"),  # Disabled
                    encoding=self.config.get("encoding", "utf-8"),
                )
                streams = await self._exit_stack.enter_async_context(
                    stdio_client(server_params),
                )
            else:
                if self.config.get("type") in [
                    "streamable_http",
                    "streamableHttp",
                ]:
                    streams = await self._exit_stack.enter_async_context(
                        streamablehttp_client(
                            url=self.config["url"],
                            headers=self.config.get("headers"),
                            timeout=self.config.get("timeout", 30),
                            sse_read_timeout=self.config.get(
                                "sse_read_timeout",
                                60 * 5,
                            ),
                        ),
                    )
                    streams = (streams[0], streams[1])
                else:
                    streams = await self._exit_stack.enter_async_context(
                        sse_client(
                            url=self.config["url"],
                            headers=self.config.get("headers"),
                            timeout=self.config.get("timeout", 30),
                            sse_read_timeout=self.config.get(
                                "sse_read_timeout",
                                60 * 5,
                            ),
                        ),
                    )
            session = await self._exit_stack.enter_async_context(
                ClientSession(*streams),
            )
            await session.initialize()
            self.session = session
        except Exception as e:
            logging.error(f"Error initializing server {self.name}: {e}")
            await self.cleanup()
            raise
    async def list_tools(self) -> list[Any]:
        """List available tools from the server.
        Returns:
            A list of available tools.
        Raises:
            RuntimeError: If the server is not initialized.
        """
        if not self.session:
            raise RuntimeError(f"Server {self.name} not initialized")
        tools_response = await self.session.list_tools()
        tools = [
            tool
            for item in tools_response
            if isinstance(item, tuple) and item[0] == "tools"
            for tool in item[1]
        ]
        return tools
    async def call_tool(
        self,
        tool_name: str,
        arguments: dict[str, Any],
        retries: int = 2,
        delay: float = 1.0,
    ) -> Any:
        """Execute a tool with retry mechanism.
        Args:
            tool_name: Name of the tool to execute.
            arguments: tool arguments.
            retries: Number of retry attempts.
            delay: Delay between retries in seconds.
        Returns:
            Tool execution result.
        Raises:
            RuntimeError: If server is not initialized.
            Exception: If tool execution fails after all retries.
        """
        if not self.session:
            raise RuntimeError(f"Server {self.name} not initialized")
        attempt = 0
        while attempt < retries:
            try:
                logging.info(f"Executing {tool_name}...")
                result = await self.session.call_tool(tool_name, arguments)
                return result
            except Exception as e:
                attempt += 1
                logging.warning(
                    f"Error executing tool: {e} {traceback.format_exc()}."
                    f" Attempt {attempt} of {retries}.",
                )
                if attempt >= retries:
                    logging.error("Max retries reached. Failing.")
                    raise
                logging.info(f"Retrying in {delay} seconds...")
                await asyncio.sleep(delay)
        return None
    async def cleanup(self) -> None:
        """Clean up server resources."""
        async with self._cleanup_lock:
            try:
                await self._exit_stack.aclose()
            except Exception as e:
                if (
                    "Attempted to exit cancel scope in a different task"
                    in str(e)
                ):
                    pass
            finally:
                self.session = None
                self.stdio_context = None
--- a/alias/src/alias/runtime/alias_sandbox/box/routers/runtime_watcher.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/routers/runtime_watcher.py
@@ -0,0 +1,187 @@
 # -*- coding: utf-8 -*-
 import difflib
 import logging
 import traceback
 import git
 from fastapi import APIRouter, Body, HTTPException
 watcher_router = APIRouter()
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 def initialize_git_user(repo):
    repo.config_writer().set_value("user", "name", "User").release()
    repo.config_writer().set_value(
        "user",
        "email",
        "user@example.com",
    ).release()
    return repo
@watcher_router.post(
    "/watcher/commit_changes",
    summary="...",
 )
 async def commit_changes(
    commit_message: str = Body(
        "Automated commit",
        example="Your commit message",
        embed=True,
    ),
 ):
    """
    Commit the uncommitted changes.
    """
    try:
        repo_path = "."
        repo = git.Repo(repo_path)
        repo = initialize_git_user(repo)
        # Add all changes to the staging area
        repo.git.add(A=True)
        # Commit the changes
        commit = repo.index.commit(commit_message)
        return {"commit": commit.hexsha, "message": commit_message}
    except Exception as e:
        logger.error(f"{str(e)}:\n{traceback.format_exc()}")
        raise HTTPException(
            status_code=500,
            detail=f"{str(e)}: {traceback.format_exc()}",
        ) from e
@watcher_router.post(
    "/watcher/generate_diff",
    summary="...",
 )
 async def generate_diff(
    commit_a: str = Body(..., embed=True),
    commit_b: str = Body(..., embed=True),
 ):
    """
    Generate the diff of the uncommitted changes or two commits.
    """
    try:
        repo_path = "."
        repo = git.Repo(repo_path)
        repo = initialize_git_user(repo)
        if not commit_a and not commit_b:
            # Default to uncommitted changes compared to the last commit
            repo.git.add(A=True)
            diff_index = repo.index.diff("HEAD")
            print(diff_index, repo.git.status())
        elif commit_a and commit_b:
            # Get diff between two commits
            diff_index = repo.commit(commit_a).diff(commit_b)
        else:
            return HTTPException(
                detail="Invalid commit range",
                status_code=400,
            )
        diffs = {}
        for diff in diff_index:
            if diff.a_blob and diff.b_blob:
                # Both files are present in commits; perform a diff
                a_content = (
                    diff.a_blob.data_stream.read()
                    .decode(
                        "utf-8",
                    )
                    .splitlines()
                )
                b_content = (
                    diff.b_blob.data_stream.read()
                    .decode(
                        "utf-8",
                    )
                    .splitlines()
                )
            elif diff.a_blob:  # File was deleted
                # Only 'a' file is present; 'b' file is empty
                a_content = (
                    diff.a_blob.data_stream.read()
                    .decode(
                        "utf-8",
                    )
                    .splitlines()
                )
                b_content = []
            elif diff.b_blob:  # File was added
                # Only 'b' file is present; 'a' file is empty
                a_content = []
                b_content = (
                    diff.b_blob.data_stream.read()
                    .decode(
                        "utf-8",
                    )
                    .splitlines()
                )
            else:
                continue
            # Generate the diff content
            diff_text = "\n".join(
                difflib.unified_diff(
                    a_content,
                    b_content,
                    fromfile=f"a/{diff.a_path}",
                    tofile=f"b/{diff.b_path}",
                    lineterm="",
                ),
            )
            diffs[diff.b_path or diff.a_path] = diff_text
        return {"diffs": diffs}
    except Exception as e:
        logger.error(f"{str(e)}:\n{traceback.format_exc()}")
        raise HTTPException(
            status_code=500,
            detail=f"{str(e)}: {traceback.format_exc()}",
        ) from e
@watcher_router.get(
    "/watcher/git_logs",
    summary="...",
 )
 async def git_logs():
    """
    Return the git logs.
    """
    try:
        repo = git.Repo(".")
        repo = initialize_git_user(repo)
        logs = []
        for commit in repo.iter_commits():
            diff_result = {"diffs": {}}
            if commit.parents:
                parent_commit = commit.parents[0]
                diff_result = await generate_diff(
                    commit.hexsha,
                    parent_commit.hexsha,
                )
            log_entry = {
                "commit": commit.hexsha,
                "author": commit.author.name,
                "date": commit.committed_datetime.isoformat(),
                "message": commit.message.strip(),
                "diff": diff_result["diffs"],
            }
            logs.append(log_entry)
        return {"logs": logs}
    except Exception as e:
        logger.error(f"{str(e)}:\n{traceback.format_exc()}")
        raise HTTPException(
            status_code=500,
            detail=f"{str(e)}: {traceback.format_exc()}",
        ) from e
--- a/alias/src/alias/runtime/alias_sandbox/box/routers/workspace.py
+++ b/alias/src/alias/runtime/alias_sandbox/box/routers/workspace.py
@@ -0,0 +1,325 @@
 # -*- coding: utf-8 -*-
 import shutil
 import os
 import logging
 import traceback
 import aiofiles
 from fastapi import APIRouter, HTTPException, Query, Body
 from fastapi.responses import FileResponse
 workspace_router = APIRouter()
 # Configure logging
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 def ensure_within_workspace(
    path: str,
    base_directory: str = "/workspace",
 ) -> str:
    """
    Ensure the provided path is within the /workspace directory.
    """
    base_directory = os.path.abspath(base_directory)
    # Determine if the input path is absolute or relative
    if os.path.isabs(path):
        full_path = os.path.abspath(path)
    else:
        full_path = os.path.abspath(os.path.join(base_directory, path))
    # Check for path traversal attacks and ensure path is within base_directory
    if not full_path.startswith(base_directory):
        raise HTTPException(
            status_code=403,
            detail="Permission error. Access restricted to /workspace "
            "directory.",
        )
    return full_path
@workspace_router.get(
    "/workspace/files",
    summary="Retrieve a file within the /workspace directory",
 )
 async def get_workspace_file(
    file_path: str = Query(
        ...,
        description="Path to the file within /workspace relative to its root",
    ),
 ):
    """
    Get a file within the /workspace directory.
    """
    try:
        # Ensure the file path is within the /workspace directory
        full_path = ensure_within_workspace(file_path)
        # Check if the file exists
        if not os.path.isfile(full_path):
            raise HTTPException(status_code=404, detail="File not found.")
        # Return the file using FileResponse
        return FileResponse(
            full_path,
            media_type="application/octet-stream",
            filename=os.path.basename(full_path),
        )
    except Exception as e:
        logger.error(f"{str(e)}:\n{traceback.format_exc()}")
        raise HTTPException(
            status_code=500,
            detail=f"{str(e)}: {traceback.format_exc()}",
        ) from e
@workspace_router.post(
    "/workspace/files",
    summary="Create or edit a file within the /workspace directory",
 )
 async def create_or_edit_file(
    file_path: str = Query(
        ...,
        description="Path to the file within /workspace",
    ),
    content: str = Body(..., description="Content to write to the file"),
 ):
    try:
        full_path = ensure_within_workspace(file_path)
        async with aiofiles.open(full_path, "w", encoding="utf-8") as f:
            await f.write(content)
        return {"message": "File created or edited successfully."}
    except Exception as e:
        logger.error(
            f"Error creating or editing file: {str(e)}:\
            n{traceback.format_exc()}",
        )
        raise HTTPException(
            status_code=500,
            detail=f"Error creating or editing file: {str(e)}",
        ) from e
@workspace_router.get(
    "/workspace/list-directories",
    summary="List file items in the /workspace directory, including nested "
    "files and directories",
 )
 async def list_workspace_files(
    directory: str = Query(
        "/workspace",
        description="Directory to list files and directories from, default "
        "is /workspace.",
    ),
 ):
    """
    List all files and directories in the specified directory, including
    nested items, with type indication and statistics.
    """
    try:
        target_directory = ensure_within_workspace(directory)
        # Verify if the specified directory exists
        if not os.path.isdir(target_directory):
            raise HTTPException(status_code=404, detail="Directory not found.")
        nested_items = []
        file_count = 0
        directory_count = 0
        for root, dirs, files in os.walk(target_directory):
            for d in dirs:
                dir_path = os.path.join(root, d)
                nested_items.append(
                    {
                        "type": "directory",
                        "path": os.path.relpath(dir_path, target_directory),
                    },
                )
                directory_count += 1
            for f in files:
                file_path = os.path.join(root, f)
                nested_items.append(
                    {
                        "type": "file",
                        "path": os.path.relpath(file_path, target_directory),
                    },
                )
                file_count += 1
        return {
            "items": nested_items,
            "statistics": {
                "total_directories": directory_count,
                "total_files": file_count,
            },
        }
    except Exception as e:
        logger.error(
            f"Error listing files: {str(e)}:\n{traceback.format_exc()}",
        )
        raise HTTPException(
            status_code=500,
            detail=f"An error occurred while listing files: {str(e)}",
        ) from e
@workspace_router.post(
    "/workspace/directories",
    summary="Create a directory within the /workspace directory",
 )
 async def create_directory(
    directory_path: str = Query(
        ...,
        description="Path to the directory within /workspace",
    ),
 ):
    try:
        full_path = ensure_within_workspace(directory_path)
        os.makedirs(full_path, exist_ok=True)
        return {"message": "Directory created successfully."}
    except Exception as e:
        logger.error(
            f"Error creating directory: {str(e)}:\n{traceback.format_exc()}",
        )
        raise HTTPException(
            status_code=500,
            detail=f"Error creating directory: {str(e)}",
        ) from e
@workspace_router.delete(
    "/workspace/files",
    summary="Delete a file within the /workspace directory",
 )
 async def delete_file(
    file_path: str = Query(
        ...,
        description="Path to the file within /workspace",
    ),
 ):
    try:
        full_path = ensure_within_workspace(file_path)
        if os.path.isfile(full_path):
            os.remove(full_path)
            return {"message": "File deleted successfully."}
        else:
            raise HTTPException(status_code=404, detail="File not found.")
    except Exception as e:
        logger.error(
            f"Error deleting file: {str(e)}:\n{traceback.format_exc()}",
        )
        raise HTTPException(
            status_code=500,
            detail=f"Error deleting file: {str(e)}",
        ) from e
@workspace_router.delete(
    "/workspace/directories",
    summary="Delete a directory within the /workspace directory",
 )
 async def delete_directory(
    directory_path: str = Query(
        ...,
        description="Path to the directory within /workspace",
    ),
    recursive: bool = Query(
        False,
        description="Recursively delete directory contents",
    ),
 ):
    try:
        full_path = ensure_within_workspace(directory_path)
        if recursive:
            shutil.rmtree(full_path)
        else:
            os.rmdir(full_path)
        return {"message": "Directory deleted successfully."}
    except Exception as e:
        logger.error(
            f"Error deleting directory: {str(e)}:\n{traceback.format_exc()}",
        )
        raise HTTPException(
            status_code=500,
            detail=f"Error deleting directory: {str(e)}",
        ) from e
@workspace_router.put(
    "/workspace/move",
    summary="Move or rename a file or directory within the /workspace "
    "directory",
 )
 async def move_or_rename(
    source_path: str = Query(
        ...,
        description="Source path within /workspace",
    ),
    destination_path: str = Query(
        ...,
        description="Destination path within /workspace",
    ),
 ):
    try:
        full_source_path = ensure_within_workspace(source_path)
        full_destination_path = ensure_within_workspace(destination_path)
        if not os.path.exists(full_source_path):
            raise HTTPException(
                status_code=404,
                detail="Source file or directory not found.",
            )
        os.rename(full_source_path, full_destination_path)
        return {"message": "Move or rename operation successful."}
    except Exception as e:
        logger.error(
            f"Error moving or renaming: {str(e)}:\n{traceback.format_exc()}",
        )
        raise HTTPException(
            status_code=500,
            detail=f"Error moving or renaming: {str(e)}",
        ) from e
@workspace_router.post(
    "/workspace/copy",
    summary="Copy a file or directory within the /workspace directory",
 )
 async def copy(
    source_path: str = Query(
        ...,
        description="Source path within /workspace",
    ),
    destination_path: str = Query(
        ...,
        description="Destination path within /workspace",
    ),
 ):
    try:
        full_source_path = ensure_within_workspace(source_path)
        full_destination_path = ensure_within_workspace(destination_path)
        if not os.path.exists(full_source_path):
            raise HTTPException(
                status_code=404,
                detail="Source file or directory not found.",
            )
        if os.path.isdir(full_source_path):
            shutil.copytree(full_source_path, full_destination_path)
        else:
            shutil.copy2(full_source_path, full_destination_path)
        return {"message": "Copy operation successful."}
    except Exception as e:
        logger.error(f"Error copying: {str(e)}:\n{traceback.format_exc()}")
        raise HTTPException(
            status_code=500,
            detail=f"Error copying: " f"{str(e)}",
        ) from e
--- a/alias/src/alias/runtime/alias_sandbox/box/scripts/start.sh
+++ b/alias/src/alias/runtime/alias_sandbox/box/scripts/start.sh
@@ -0,0 +1,4 @@
 #!/bin/bash
 uvicorn app:app --app-dir=/agentscope_runtime --host=0.0.0.0 --port 8000 &
 wait
--- a/alias/src/alias/runtime/alias_sandbox/box/vnc_relay.html
+++ b/alias/src/alias/runtime/alias_sandbox/box/vnc_relay.html
@@ -0,0 +1,178 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <!--
    noVNC example: lightweight example using minimal UI and features
    This is a self-contained file which doesn't import WebUtil or external CSS.
    Copyright (C) 2019 The noVNC Authors
    noVNC is licensed under the MPL 2.0 (see LICENSE.txt)
    This file is licensed under the 2-Clause BSD license (see LICENSE.txt).
    Connect parameters are provided in query string:
        http://example.com/?host=HOST&port=PORT&scale=true
    This file is a customized/modified version of the original noVNC lightweight example.
    -->
    <title>noVNC</title>
    <meta charset="utf-8">
    <style>
        body {
            margin: 0;
            background-color: dimgrey;
            height: 100%;
            display: flex;
            flex-direction: column;
        }
        html {
            height: 100%;
        }
        #top_bar {
            background-color: #6e84a3;
            color: white;
            font: bold 12px Helvetica;
            padding: 6px 5px 4px 5px;
            border-bottom: 1px outset;
        }
        #status {
            text-align: center;
        }
        #sendCtrlAltDelButton {
            position: fixed;
            top: 0px;
            right: 0px;
            border: 1px outset;
            padding: 5px 5px 4px 5px;
            cursor: pointer;
        }
        #screen {
            flex: 1;
            overflow: hidden;
        }
    </style>
    <script type="module" crossorigin="anonymous">
        // RFB holds the API to connect and communicate with a VNC server
        import RFB from './core/rfb.js';
        let rfb;
        let desktopName;
        function connectedToServer(e) {
            status("Connected to " + desktopName);
        }
        function disconnectedFromServer(e) {
            if (e.detail.clean) {
                status("Disconnected");
            } else {
                status("Something went wrong, connection is closed");
            }
        }
        function credentialsAreRequired(e) {
            const password = prompt("Password Required:");
            rfb.sendCredentials({ password: password });
        }
        function updateDesktopName(e) {
            desktopName = e.detail.name;
        }
        function sendCtrlAltDel() {
            rfb.sendCtrlAltDel();
            return false;
        }
        function status(text) {
            document.getElementById('status').textContent = text;
        }
        function readQueryVariable(name, defaultValue) {
            const re = new RegExp('.*[?&#]' + name + '=([^&#]*)'),
                  match = ''.concat(document.location.href, window.location.hash).match(re);
            if (match) {
                return decodeURIComponent(match[1]);
            }
            return defaultValue;
        }
        function getSandboxIdFromPath() {
            const pathParts = window.location.pathname.split('/');
            if (pathParts.length >= 3 && pathParts[1] === 'desktop') {
                return pathParts[2];
            }
            return null;
        }
        document.getElementById('sendCtrlAltDelButton')
            .onclick = sendCtrlAltDel;
        const sandbox_id = getSandboxIdFromPath();
        if (!sandbox_id) {
            status("Error: Could not extract sandbox_id from URL path");
            throw new Error("Could not extract sandbox_id from URL path");
        }
        const host = readQueryVariable('host', window.location.hostname);
        let port = readQueryVariable('port', window.location.port);
        const password = readQueryVariable('password');
        const path = readQueryVariable('path', 'websockify');
        status("Connecting");
        let url;
        if (window.location.protocol === "https:") {
            url = 'wss';
        } else {
            url = 'ws';
        }
        url += '://' + host;
        if(port) {
            url += ':' + port;
        }
        url += '/desktop/' + sandbox_id;
        if (path && path !== 'websockify') {
            url += '?path=' + encodeURIComponent(path);
        }
        console.log('Extracted sandbox_id:', sandbox_id);
        console.log('Connecting to WebSocket URL:', url);
        // Creating a new RFB object will start a new connection
        rfb = new RFB(document.getElementById('screen'), url,
                      { credentials: { password: password } });
        // Add listeners to important events from the RFB module
        rfb.addEventListener("connect", connectedToServer);
        rfb.addEventListener("disconnect", disconnectedFromServer);
        rfb.addEventListener("credentialsrequired", credentialsAreRequired);
        rfb.addEventListener("desktopname", updateDesktopName);
        // Set parameters that can be changed on an active connection
        rfb.viewOnly = readQueryVariable('view_only', false);
        rfb.scaleViewport = readQueryVariable('scale', false);
    </script>
 </head>
 <body>
    <div id="top_bar">
        <div id="status">Loading</div>
        <div id="sendCtrlAltDelButton">Send CtrlAltDel</div>
    </div>
    <div id="screen">
        <!-- This is where the remote screen will appear -->
    </div>
 </body>
 </html>
--- a/alias/src/alias/runtime/alias_sandbox/build.sh
+++ b/alias/src/alias/runtime/alias_sandbox/build.sh
@@ -0,0 +1 @@
 runtime-sandbox-builder alias --dockerfile_path Dockerfile --extension alias_sandbox.py
--- a/alias/src/alias/runtime/alias_sandbox/test.py
+++ b/alias/src/alias/runtime/alias_sandbox/test.py
@@ -0,0 +1,7 @@
 # -*- coding: utf-8 -*-
 from alias_sandbox import AliasSandbox
 with AliasSandbox() as sandbox:
    print(sandbox.sandbox_id)
    print(sandbox.run_ipython_cell("import time\ntime.sleep(1)"))
    input("Press Enter to continue...")
		`@@ -0,0 +1,3 @@`
							`# -- coding: utf-8 --`

							`from agentscope_runtime.sandbox.box.sandbox import Sandbox`
		`@@ -0,0 +1,3 @@`
							`from .alias_sandbox import AliasSandbox`

							`__all__ = ['AliasSandbox']`
		`@@ -0,0 +1 @@`
							`runtime-sandbox-builder alias --dockerfile_path Dockerfile --extension alias_sandbox.py`