This commit is contained in:
raykkk
2025-10-17 21:40:45 +08:00
commit 7d0451131f
155 changed files with 14873 additions and 0 deletions

.gitignore vendored Normal file

@@ -0,0 +1,63 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
# dependencies
frontend/node_modules
**/package-lock.json
/.pnp
.pnp.js
node_modules/
# testing
/coverage
# cookbook
cookbook/_build
# production
/build
# misc
.env
.env.*
!.env.example
!.env.template
__pycache__/
*.db
*.rdb
*.egg-info/
# IDEs and editors
.idea/
.vscode/
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
# Logs
npm-debug.log*
yarn-debug.log*
yarn-error.log*
openapi-ts*.log
# macOS
.DS_Store
# Windows
Thumbs.db
ehthumbs.db
Desktop.ini
# Linux
*~
# Python
*.py[cod]
*$py.class
uv.lock
# Logs
logs/
*.log

.pre-commit-config.yaml Normal file

@@ -0,0 +1,107 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0
    hooks:
      - id: check-ast
      - id: sort-simple-yaml
      - id: check-yaml
        exclude: |
          (?x)^(
            meta.yaml
          )$
      - id: check-xml
      - id: check-toml
      - id: check-docstring-first
      - id: check-json
      - id: fix-encoding-pragma
      - id: detect-private-key
      - id: trailing-whitespace
  - repo: https://github.com/asottile/add-trailing-comma
    rev: v3.1.0
    hooks:
      - id: add-trailing-comma
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.7.0
    hooks:
      - id: mypy
        exclude: |
          (?x)(
            pb2\.py$
            | grpc\.py$
            | ^docs
            | \.html$
          )
        args: [
          --disallow-untyped-defs,
          --disallow-incomplete-defs,
          --ignore-missing-imports,
          --disable-error-code=var-annotated,
          --disable-error-code=union-attr,
          --disable-error-code=assignment,
          --disable-error-code=attr-defined,
          --disable-error-code=import-untyped,
          --disable-error-code=truthy-function,
          --disable-error-code=typeddict-item,
          --follow-imports=skip,
          --explicit-package-bases,
        ]
  # - repo: https://github.com/numpy/numpydoc
  #   rev: v1.6.0
  #   hooks:
  #     - id: numpydoc-validation
  - repo: https://github.com/psf/black
    rev: 23.3.0
    hooks:
      - id: black
        args: [--line-length=79]
  - repo: https://github.com/PyCQA/flake8
    rev: 6.1.0
    hooks:
      - id: flake8
        args: ["--extend-ignore=E203"]
        exclude: ^docs
  - repo: https://github.com/pylint-dev/pylint
    rev: v3.0.2
    hooks:
      - id: pylint
        exclude: |
          (?x)(
            ^docs
            | pb2\.py$
            | grpc\.py$
            | \.demo$
            | \.md$
            | \.html$
            | ^examples/paper_llm_based_algorithm/
          )
        args: [
          --disable=W0511,
          --disable=W0718,
          --disable=W0122,
          --disable=C0103,
          --disable=R0913,
          --disable=E0401,
          --disable=E1101,
          --disable=C0415,
          --disable=W0603,
          --disable=R1705,
          --disable=R0914,
          --disable=E0601,
          --disable=W0602,
          --disable=W0604,
          --disable=R0801,
          --disable=R0902,
          --disable=R0903,
          --disable=C0123,
          --disable=W0231,
          --disable=W1113,
          --disable=W0221,
          --disable=R0401,
          --disable=W0632,
          --disable=W0123,
          --disable=C3001,
        ]
  - repo: https://github.com/regebro/pyroma
    rev: "4.0"
    hooks:
      - id: pyroma
        args: [--min=10, .]

LICENSE Normal file

@@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2025 Alibaba
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

README.md Normal file

@@ -0,0 +1,160 @@
# AgentScope Sample Agents
Welcome to the **AgentScope Sample Agents** repository! 🎯
This repository provides **ready-to-use Python sample agents** built on top of:
- [AgentScope](https://github.com/agentscope-ai/agentscope)
- [AgentScope Runtime](https://github.com/agentscope-ai/agentscope-runtime)
The examples cover a wide range of use cases — from lightweight command-line agents to **full-stack deployable applications** with both backend and frontend.
------
## 📖 About AgentScope & AgentScope Runtime
### **AgentScope**
AgentScope is a multi-agent framework designed to provide a **simple and efficient** way to build **LLM-powered agent applications**. It offers abstractions for defining agents, integrating tools, managing conversations, and orchestrating multi-agent workflows.
### **AgentScope Runtime**
AgentScope Runtime is a **comprehensive runtime framework** that addresses two key challenges in deploying and operating agents:
1. **Effective Agent Deployment**: scalable deployment and management of agents across environments.
2. **Sandboxed Tool Execution**: secure, isolated execution of tools and external actions.
It includes **context management** and **secure sandboxing**, and can be used with **AgentScope** or other agent frameworks.
------
## ✨ Getting Started
- All samples are **Python-based**.
- Samples are organized **by functional use case**.
- Some samples use only **AgentScope** (pure Python agents).
- Others use **both AgentScope and AgentScope Runtime** to implement **full-stack deployable applications** with frontend + backend.
- Full-stack runtime versions have folder names ending with:
**`_fullstack_runtime`**
> 📌 **Before running** any example, check its `README.md` for installation and execution instructions.
### Install Requirements
Follow the installation instructions in the respective documentation:
- [AgentScope Documentation](https://doc.agentscope.io/)
- [AgentScope Runtime Documentation](https://runtime.agentscope.io/)
------
## 🌳 Repository Structure
```bash
├── browser_use/
│   ├── agent_browser/                            # Pure Python browser agent
│   └── browser_use_fullstack_runtime/            # Full-stack runtime version with frontend/backend
├── deep_research/
│   ├── agent_deep_research/                      # Pure Python multi-agent research
│   └── qwen_langgraph_search_fullstack_runtime/  # Full-stack runtime-enabled research app
├── games/
│   └── game_werewolves/                          # Role-based social deduction game
├── conversational_agents/
│   ├── chatbot/                                  # Chatbot application
│   ├── chatbot_fullstack_runtime/                # Runtime-powered chatbot with UI
│   ├── multiagent_conversation/                  # Multi-agent dialogue scenario
│   └── multiagent_debate/                        # Agents engaging in debates
├── evaluation/
│   └── ace_bench/                                # Benchmarks and evaluation tools
├── functionality/
│   ├── long_term_memory_mem0/                    # Long-term memory integration
│   ├── mcp/                                      # Model Context Protocol (MCP) demo
│   ├── plan/                                     # Planning with the ReAct agent
│   ├── rag/                                      # RAG in AgentScope
│   ├── session_with_sqlite/                      # Persistent conversation with SQLite
│   ├── stream_printing_messages/                 # Streaming and printing messages
│   ├── structured_output/                        # Structured output parsing and validation
│   ├── multiagent_concurrent/                    # Concurrent multi-agent task execution
│   └── meta_planner_agent/                       # Planning agent with tool orchestration
└── README.md
```
------
## 📌 Example List
| Category | Example Folder | Uses AgentScope | Uses Runtime | Description |
| ----------------------- |-------------------------------------------------------| --------------- | ------------ |--------------------------------------------------|
| **Browser Use** | browser_use/agent_browser | ✅ | ❌ | Command-line browser automation using AgentScope |
| | browser_use/browser_use_fullstack_runtime | ✅ | ✅ | Full-stack browser automation with UI & sandbox |
| **Deep Research** | deep_research/agent_deep_research | ✅ | ❌ | Multi-agent research pipeline |
| | deep_research/qwen_langgraph_search_fullstack_runtime | ❌ | ✅ | Full-stack deep research app |
| **Games** | games/game_werewolves | ✅ | ❌ | Multi-agent roleplay game |
| **Conversational Apps** | conversational_agents/chatbot_fullstack_runtime | ✅ | ✅ | Chatbot application with frontend/backend |
| | conversational_agents/chatbot | ✅ | ❌ | Chatbot application |
| | conversational_agents/multiagent_conversation | ✅ | ❌ | Multi-agent dialogue scenario |
| | conversational_agents/multiagent_debate | ✅ | ❌ | Agents engaging in debates |
| **Evaluation** | evaluation/ace_bench | ✅ | ❌ | Benchmarks with ACE Bench |
| **Functionality Demos** | functionality/long_term_memory_mem0 | ✅ | ❌ | Long-term memory with mem0 support |
| | functionality/mcp | ✅ | ❌ | Model Context Protocol (MCP) demo |
| | functionality/session_with_sqlite | ✅ | ❌ | Persistent context with SQLite |
| | functionality/structured_output | ✅ | ❌ | Structured data extraction and validation |
| | functionality/multiagent_concurrent | ✅ | ❌ | Concurrent task execution by multiple agents |
| | functionality/meta_planner_agent | ✅ | ❌ | Planning agent with tool orchestration |
| | functionality/plan | ✅ | ❌ | Task planning with ReAct agent |
| | functionality/rag | ✅ | ❌ | Retrieval-Augmented Generation (RAG) integration |
| | functionality/stream_printing_messages | ✅ | ❌ | Real-time message streaming and printing |
------
## Getting Help
If you:
- Need installation help
- Encounter issues
- Want to understand how a sample works
Please:
1. Read the sample-specific `README.md`.
2. File a [GitHub Issue](https://github.com/agentscope-ai/agentscope-samples/issues).
3. Join the community discussions.
------
## 🤝 Contributing
We welcome contributions such as:
- Bug reports
- New feature requests
- Documentation improvements
- Code contributions
See the [Contributing Guidelines](https://github.com/agentscope-ai/agentscope-samples/blob/main/CONTRIBUTING.md) for details.
------
## 📄 License
This project is licensed under the **Apache 2.0 License**; see the [LICENSE](https://github.com/agentscope-ai/agentscope-samples/blob/main/LICENSE) file for details.
------
## ⚠️ Disclaimer
- This is not an officially supported product.
- For **demonstration purposes only** — not intended for production use.
------
## 🔗 Resources
- [AgentScope Documentation](https://doc.agentscope.io/)
- [AgentScope Runtime Documentation](https://runtime.agentscope.io/)
- [AgentScope GitHub Repository](https://github.com/agentscope-ai/agentscope)
- [AgentScope Runtime GitHub Repository](https://github.com/agentscope-ai/agentscope-runtime)


@@ -0,0 +1,49 @@
# Browser Agent Example
This example demonstrates how to use AgentScope's BrowserAgent for web automation tasks. The BrowserAgent leverages the Model Context Protocol (MCP) to interact with browser tools powered by Playwright, enabling sophisticated web navigation, data extraction, and automation.
## Prerequisites
- Python 3.10 or higher
- Node.js and npm (for the MCP server)
- DashScope API key from Alibaba Cloud
## Installation
### Install AgentScope
```bash
# Install from source
cd {PATH_TO_AGENTSCOPE}
pip install -e .
```
## Setup
### 1. Environment Configuration
Set up your DashScope API key:
```bash
export DASHSCOPE_API_KEY="your_dashscope_api_key_here"
```
You can obtain a DashScope API key from [Alibaba Cloud DashScope Console](https://dashscope.console.aliyun.com/).
### 2. About the Playwright MCP Server
Before running the browser agent, verify that you can start the Playwright MCP server:
```bash
npx @playwright/mcp@latest
```
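If you prefer to script this check from Python, here is a minimal sketch. The helper functions below are illustrative, not part of AgentScope or Playwright; they build the same command line and check that `npx` is on your PATH:

```python
import shutil


def playwright_mcp_command(version: str = "latest") -> list[str]:
    """Build the argv for launching the Playwright MCP server via npx.

    Mirrors the `npx @playwright/mcp@latest` invocation shown above;
    this helper is illustrative, not an AgentScope API.
    """
    return ["npx", f"@playwright/mcp@{version}"]


def npx_available() -> bool:
    """Return True if npx (bundled with npm) is on the PATH."""
    return shutil.which("npx") is not None


if __name__ == "__main__":
    print("MCP server command:", " ".join(playwright_mcp_command()))
    if not npx_available():
        print("npx not found; install Node.js and npm first.")
```

The same command list is what the example later passes to an MCP stdio client (`command="npx"`, `args=["@playwright/mcp@latest"]`).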
## Usage
### Basic Example
You can run the browser agent in your terminal with the following commands:
```bash
cd browser_use/agent_browser
python main.py
```


@@ -0,0 +1,395 @@
# -*- coding: utf-8 -*-
"""Browser Agent"""
# pylint: disable=W0212
import re
import uuid
from typing import Any, Optional

from agentscope.agent import ReActAgent
from agentscope.formatter import FormatterBase
from agentscope.memory import MemoryBase
from agentscope.message import Msg, TextBlock, ToolUseBlock
from agentscope.model import ChatModelBase
from agentscope.token import OpenAITokenCounter, TokenCounterBase
from agentscope.tool import Toolkit

_BROWSER_AGENT_DEFAULT_SYS_PROMPT = (
    "You are a helpful browser automation assistant. "
    "You can navigate websites, take screenshots, and interact with "
    "web pages. "
    "Always describe what you see and plan your next steps clearly. "
    "When taking actions, explain what you're doing and why."
)

_BROWSER_AGENT_REASONING_PROMPT = (
    "You are browsing the current website. "
    "The snapshot (and screenshot) of the current webpage is (are) given "
    "below. Since you can only view the latest webpage, "
    "you must promptly summarize the current status, record required data, "
    "and plan your next steps."
)
async def browser_agent_default_url_pre_reply(
    self: "BrowserAgent",  # pylint: disable=W0613
    *args: Any,  # pylint: disable=W0613
    **kwargs: Any,  # pylint: disable=W0613
) -> None:
    """Navigate to the start URL if this is the first interaction."""
    if self.start_url and not self._has_initial_navigated:
        await self._navigate_to_start_url()
        self._has_initial_navigated = True
async def browser_agent_summarize_mem_pre_reasoning(
    self: "BrowserAgent",  # pylint: disable=W0613
    *args: Any,
    **kwargs: Any,
) -> None:
    """Summarize the memory if it grows too long."""
    mem_len = await self.memory.size()
    if mem_len > self.max_memory_length:
        await self._memory_summarizing()
async def browser_agent_observe_pre_reasoning(
    self: "BrowserAgent",  # pylint: disable=W0613
    *args: Any,
    **kwargs: Any,
) -> None:
    """Take a snapshot of the page in text form before reasoning."""
    snapshot_msg = await self._get_snapshot_in_text()
    await self.memory.add(snapshot_msg)
async def browser_agent_remove_observation_post_reasoning(
    self: "BrowserAgent",  # pylint: disable=W0613
    *args: Any,
    **kwargs: Any,
) -> None:
    """Remove the snapshot message after reasoning."""
    mem_len = await self.memory.size()
    if mem_len >= 2:
        await self.memory.delete(mem_len - 2)
async def browser_agent_post_acting_clean_content(
    self: "BrowserAgent",  # pylint: disable=W0613
    *args: Any,
    **kwargs: Any,
) -> None:
    """
    Hook function that cleans up noisy tool output after acting.
    A fresh observation is taken before the next reasoning step.
    """
    mem_msgs = await self.memory.get_memory()
    mem_length = await self.memory.size()
    if len(mem_msgs) == 0:
        return
    last_output_msg = mem_msgs[-1]
    for i, b in enumerate(last_output_msg.content):
        if b["type"] == "tool_result":
            for j, return_json in enumerate(b.get("output", [])):
                if isinstance(return_json, dict) and "text" in return_json:
                    last_output_msg.content[i]["output"][j][
                        "output"
                    ] = self._filter_execution_text(return_json["text"])
    await self.memory.delete(mem_length - 1)
    await self.memory.add(last_output_msg)
class BrowserAgent(ReActAgent):
    """
    Browser agent that extends ReActAgent with browser-specific capabilities.

    The agent leverages MCP (Model Context Protocol) servers to access
    browser tools backed by Playwright, enabling sophisticated web
    automation tasks.

    Example:
        .. code-block:: python

            agent = BrowserAgent(
                name="web_navigator",
                model=my_chat_model,
                formatter=my_formatter,
                memory=my_memory,
                toolkit=browser_toolkit,
                start_url="https://example.com",
            )
            response = await agent.reply("Search for Python tutorials")
    """

    def __init__(
        self,
        name: str,
        model: ChatModelBase,
        formatter: FormatterBase,
        memory: MemoryBase,
        toolkit: Toolkit,
        sys_prompt: str = _BROWSER_AGENT_DEFAULT_SYS_PROMPT,
        max_iters: int = 50,
        start_url: Optional[str] = "https://www.google.com",
        reasoning_prompt: str = _BROWSER_AGENT_REASONING_PROMPT,
        token_counter: TokenCounterBase = OpenAITokenCounter("gpt-4o"),
        max_mem_length: int = 20,
    ) -> None:
        """Initialize the browser agent.

        Args:
            name (str):
                The unique identifier name for the agent instance.
            model (ChatModelBase):
                The chat model used for generating responses and reasoning.
            formatter (FormatterBase):
                The formatter used to convert messages into the required
                format for the model API.
            memory (MemoryBase):
                The memory component used to store and retrieve dialogue
                history.
            toolkit (Toolkit):
                A toolkit object containing the browser tool functions and
                utilities.
            sys_prompt (str, optional):
                The system prompt that defines the agent's behavior and
                personality. Defaults to _BROWSER_AGENT_DEFAULT_SYS_PROMPT.
            max_iters (int, optional):
                The maximum number of reasoning-acting loop iterations.
                Defaults to 50.
            start_url (Optional[str], optional):
                The initial URL to navigate to when the agent starts.
                Defaults to "https://www.google.com".
            reasoning_prompt (str, optional):
                The prompt used during the reasoning phase to guide
                decision-making. Defaults to _BROWSER_AGENT_REASONING_PROMPT.
            token_counter (TokenCounterBase, optional):
                The token counter used to estimate prompt sizes.
                Defaults to OpenAITokenCounter("gpt-4o").
            max_mem_length (int, optional):
                The memory size threshold that triggers summarization.
                Defaults to 20.
        """
        super().__init__(
            name=name,
            sys_prompt=sys_prompt,
            model=model,
            formatter=formatter,
            memory=memory,
            toolkit=toolkit,
            max_iters=max_iters,
        )
        self.start_url = start_url
        self._has_initial_navigated = False
        self.reasoning_prompt = reasoning_prompt
        self.max_memory_length = max_mem_length
        self.token_estimator = token_counter
        self.register_instance_hook(
            "pre_reply",
            "browser_agent_default_url_pre_reply",
            browser_agent_default_url_pre_reply,
        )
        self.register_instance_hook(
            "pre_reasoning",
            "browser_agent_summarize_mem_pre_reasoning",
            browser_agent_summarize_mem_pre_reasoning,
        )
        self.register_instance_hook(
            "pre_reasoning",
            "browser_agent_observe_pre_reasoning",
            browser_agent_observe_pre_reasoning,
        )
        self.register_instance_hook(
            "post_reasoning",
            "browser_agent_remove_observation_post_reasoning",
            browser_agent_remove_observation_post_reasoning,
        )
        self.register_instance_hook(
            "post_acting",
            "browser_agent_post_acting_clean_content",
            browser_agent_post_acting_clean_content,
        )
    async def _navigate_to_start_url(self) -> None:
        """
        Navigate to the configured start URL using the browser_navigate tool.

        This method is called automatically during the first interaction. It
        executes the browser navigation tool so that the initial page is
        loaded before the agent starts reasoning.
        """
        tool_call = ToolUseBlock(
            id=str(uuid.uuid4()),
            type="tool_use",
            name="browser_navigate",
            input={"url": self.start_url},
        )
        # Execute the navigation tool
        await self.toolkit.call_tool_function(tool_call)
    async def _get_snapshot_in_text(self) -> Msg:
        """Capture a text-based snapshot of the current webpage content.

        This method uses the browser_snapshot tool to retrieve the current
        webpage content in text format, which is used during the reasoning
        phase to provide context about the current browser state.

        Returns:
            Msg: A user message containing a text representation of the
            current webpage, prefixed with the reasoning prompt.

        Note:
            This method is called automatically before each reasoning step
            and provides essential context for deciding on the next action.
        """
        snapshot_tool_call = ToolUseBlock(
            type="tool_use",
            id=str(uuid.uuid4()),  # Generate a unique ID for the tool call
            name="browser_snapshot",
            input={},  # No parameters required for this tool
        )
        snapshot_response = await self.toolkit.call_tool_function(
            snapshot_tool_call,
        )
        snapshot_str = ""
        async for chunk in snapshot_response:
            # Each chunk carries the accumulated text; keep the latest one
            snapshot_str = chunk.content[0]["text"]
        msg_observe = Msg(
            "user",
            content=[
                TextBlock(
                    type="text",
                    text=self.reasoning_prompt + "\n" + snapshot_str,
                ),
            ],
            role="user",
        )
        return msg_observe
    async def _memory_summarizing(self) -> None:
        """Summarize the current memory content to prevent context overflow.

        This method condenses the conversation history by generating a
        summary of progress and keeping only essential information. It
        preserves the initial user question and creates a concise summary of
        what has been accomplished and what remains to be done.

        Note:
            This method is triggered whenever the memory grows beyond
            ``max_mem_length`` messages. Summarization helps prevent token
            limit issues while preserving important task context.
        """
        # Extract the initial user question
        initial_question = None
        memory_msgs = await self.memory.get_memory()
        for msg in memory_msgs:
            if msg.role == "user":
                initial_question = msg.content
                break
        # Generate a summary of the current progress
        hint_msg = Msg(
            "user",
            (
                "Summarize the current progress and outline the next steps "
                "for this task. Your summary should include:\n"
                "1. What has been completed so far.\n"
                "2. What key information has been found.\n"
                "3. What remains to be done.\n"
                "Ensure that your summary is clear, concise, and "
                "that no tasks are repeated or skipped."
            ),
            role="user",
        )
        # Format the prompt for the model
        prompt = await self.formatter.format(
            msgs=[
                Msg("system", self.sys_prompt, "system"),
                *memory_msgs,
                hint_msg,
            ],
        )
        # Call the model to generate the summary
        res = await self.model(prompt)
        # Handle both streaming and non-streaming responses
        summary_text = ""
        if self.model.stream:
            async for content_chunk in res:
                summary_text = content_chunk.content[0]["text"]
        else:
            summary_text = res.content[0]["text"]
        # Rebuild the memory with the summarized content
        summarized_memory = []
        if initial_question:
            summarized_memory.append(
                Msg("user", initial_question, role="user"),
            )
        summarized_memory.append(
            Msg(self.name, summary_text, role="assistant"),
        )
        # Clear and reload memory
        await self.memory.clear()
        for msg in summarized_memory:
            await self.memory.add(msg)
    @staticmethod
    def _filter_execution_text(
        text: str,
        keep_page_state: bool = False,
    ) -> str:
        """
        Filter and clean browser tool execution output.

        This utility removes verbose content from browser tool responses,
        including JavaScript code blocks, console messages, and YAML page
        snapshots, which can overwhelm the context window without providing
        useful information.

        Args:
            text (str):
                The raw execution text from browser tools that needs to be
                filtered.
            keep_page_state (bool, optional):
                Whether to preserve page state information, including the
                URL and YAML content. Defaults to False.

        Returns:
            str: The filtered execution text.
        """
        if not keep_page_state:
            # Remove the page URL and YAML snapshot content
            text = re.sub(r"- Page URL.*", "", text, flags=re.DOTALL)
            text = re.sub(r"```yaml.*?```", "", text, flags=re.DOTALL)
        # Remove JavaScript code blocks
        text = re.sub(r"```js.*?```", "", text, flags=re.DOTALL)
        # Remove the console messages section, which can be very verbose
        # (between "### New console messages" and "### Page state")
        text = re.sub(
            r"### New console messages.*?(?=### Page state)",
            "",
            text,
            flags=re.DOTALL,
        )
        # Trim leading/trailing whitespace
        return text.strip()


@@ -0,0 +1,76 @@
# -*- coding: utf-8 -*-
"""The main entry point of the browser agent example."""
import asyncio
import os
from agentscope.agent import UserAgent
from agentscope.formatter import DashScopeChatFormatter
from agentscope.mcp import StdIOStatefulClient
from agentscope.memory import InMemoryMemory
from agentscope.model import DashScopeChatModel
from agentscope.tool import Toolkit
from .browser_agent import BrowserAgent # pylint: disable=C0411
async def main() -> None:
"""The main entry point for the browser agent example."""
# Setup toolkit with browser tools from MCP server
toolkit = Toolkit()
browser_client = StdIOStatefulClient(
name="playwright-mcp",
command="npx",
args=["@playwright/mcp@latest"],
)
try:
# Connect to the browser client
await browser_client.connect()
await toolkit.register_mcp_client(browser_client)
# Create browser agent
agent = BrowserAgent(
name="BrowserBot",
model=DashScopeChatModel(
api_key=os.environ.get("DASHSCOPE_API_KEY"),
model_name="qwen-max",
stream=True,
),
formatter=DashScopeChatFormatter(),
memory=InMemoryMemory(),
toolkit=toolkit,
max_iters=50,
start_url="https://www.google.com",
)
user = UserAgent("Bob")
msg = None
while True:
msg = await user(msg)
if msg.get_text_content() == "exit":
break
msg = await agent(msg)
except Exception as e:
print(f"An error occurred: {e}")
print("Cleaning up browser client...")
finally:
# Ensure browser client is always closed,
# regardless of success or failure
try:
await browser_client.close()
print("Browser client closed successfully.")
except Exception as cleanup_error:
print(f"Error while closing browser client: {cleanup_error}")
if __name__ == "__main__":
print("Starting Browser Agent Example...")
print(
"The browser agent will use "
"playwright-mcp (https://github.com/microsoft/playwright-mcp). "
"Make sure the MCP server can be installed "
"via `npx @playwright/mcp@latest`.",
)
asyncio.run(main())

@@ -0,0 +1 @@
agentscope>=1.0.5

@@ -0,0 +1,148 @@
# Browser Use Demo
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)
![Python](https://img.shields.io/badge/language-Python-blue)
![Node.js](https://img.shields.io/badge/node.js-v23.9.0-green)
![React](https://img.shields.io/badge/react-v19.1.0-green)
This demo showcases how to use browser automation capabilities within the AgentScope Runtime framework. It provides both backend services and a frontend interface to demonstrate browser-based agent interactions. The real-time visualization of browser interactions is powered by [Steel-Browser](https://github.com/steel-dev/steel-browser).
<img src="https://img.alicdn.com/imgextra/i3/O1CN01hTTRvK1MxxyT0lCNm_!!6000000001502-1-tps-656-480.gif" alt="video of browser-use demo" width="800">
## 🌳 Project Structure
```bash
├── backend # Backend directory, containing server-side services and logic
│ ├── agentscope_browseruse_agent.py # Script related to browser usage or agent management
│ ├── async_quart_service.py # Asynchronous service using Quart to handle backend requests
│ └── prompts.py # Module containing prompt messages or interaction logic for the backend
├── frontend # Frontend directory, containing client-side code (typically using React)
│ ├── public # Public folder for storing static files copied during build
│ │ ├── index.html # HTML template for the frontend app, acts as the entry HTML file
│ │ └── manifest.json # Manifest file describing the web app's metadata such as name and icons
│ ├── src # Source code folder, containing React components and styles
│ │ ├── App.css # Stylesheet for the main app component
│ │ ├── App.tsx # TypeScript file for the main app component, the root component of the application
│ │ ├── Browser.scss # Stylesheet for specific browser-related components or pages using SCSS
│ │ ├── Browser.tsx # React component file related to browser functionality
│ │ ├── index.css # Global stylesheet affecting the overall look of the application
│ │ └── index.tsx # Entry point for the React application to render content into `index.html`
│ ├── package.json # Project dependencies file, lists all npm dependencies and scripts
│ └── tsconfig.json # TypeScript configuration file, defines compilation options
└── README.md # Project documentation file, provides basic information and usage instructions
```
## 📖 Overview
This demo illustrates how agents can interact with web browsers to perform tasks such as:
- Web navigation
- Form filling
- Data extraction from web pages
- Automated web workflows
The implementation uses AgentScope's capabilities to create browser-based agents that can perform complex web interactions.
## ⚙️ Components
### Backend
- `agentscope_browseruse_agent.py`: Implements the browser-using agent with AgentScope Runtime
- `async_quart_service.py`: Provides asynchronous web service endpoints
- `prompts.py`: Contains prompts used by the agent for browser interactions
### Frontend
- React-based interface for visualizing browser interactions
- TypeScript implementation for type-safe code
## 🌵Architecture
The architecture of the demo is depicted in the following diagram:
```mermaid
graph LR;
    subgraph As["AgentScope Runtime"]
        E[Sandbox]-->E1[Browser sandbox]
        F[Agent Engine]
        F-->|tool call| E
    end
    subgraph Bs["Frontend Service by React"]
        B[React App]
    end
    subgraph Cs["Backend Service by Quart"]
        C[async_quart_service]
        C --> D[AgentscopeBrowseruseAgent]
    end
    A[User] --> |request| Bs
    B --> C
    D --> E
    D --> F
```
## 🚀 Getting Started
### Prerequisites
Node.js and Python environments are required.
1. Install [Node.js](https://nodejs.org/en/)
2. Install [Python](https://www.python.org/) (version >= 3.11)
3. Add your DashScope API key to the `backend/.env` file.
### Install the Front-end Service
#### Install Node Packages
```bash
cd frontend
npm install
```
#### Run the Front-end Service
```bash
npm run start
```
This will open the demo page in your browser automatically. Alternatively, navigate to http://localhost:3000 in your browser.
### Install the Back-end Service
#### Install Python Packages
```bash
cd ../backend
pip install -r requirements.txt
```
#### Run the Back-end Service
```bash
python async_quart_service.py
```
The service will listen on port 9000.
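Each streamed chunk arrives as a server-sent-event line of the form `data: {...}` carrying an OpenAI-style payload (see `wrap_as_openai_response` in `async_quart_service.py`). A minimal sketch of parsing one such line (the sample payload is constructed here for illustration):

```python
import json

def parse_sse_line(line: str) -> str:
    """Extract the delta content from one `data: {...}` SSE line."""
    payload = json.loads(line.removeprefix("data: "))
    return payload["choices"][0]["delta"].get("content", "")

chunk = (
    'data: {"choices": [{"delta": {"content": "Hello"}, '
    '"index": 0, "finish_reason": null}]}'
)
print(parse_sse_line(chunk))
```

The React frontend performs the equivalent parsing in `processMessageToChatGPT` by splitting on `"data: "` before calling `JSON.parse`.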
### Usage
1. Open your browser and navigate to http://localhost:3000.
2. Type your question in the input box and click the "Search" button, e.g., "Visit www.chinadaily.com.cn to search for today's hot topics."
3. The response will be displayed in the output box.
## 🛠️ Features
- Browser automation within the AgentScope Runtime framework
- Real-time visualization of browser actions
- Asynchronous processing for better performance
- React-based user interface
- TypeScript support for type safety
## Getting Help
If you have any questions or encounter any problems with this demo, please report them through [GitHub issues]().
## 📄 License
This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
## 🍬 Disclaimers
This is not an officially supported product. This project is intended for demonstration purposes only and is not suitable for production use.

@@ -0,0 +1 @@
DASHSCOPE_API_KEY=

@@ -0,0 +1,177 @@
# -*- coding: utf-8 -*-
import os
from typing import List, Dict, AsyncGenerator
from agentscope.agent import ReActAgent
from agentscope.model import DashScopeChatModel
from agentscope_runtime.engine import Runner
from agentscope_runtime.engine.agents.agentscope_agent import AgentScopeAgent
from agentscope_runtime.engine.schemas.agent_schemas import (
AgentRequest,
RunStatus,
)
from agentscope_runtime.engine.services import SandboxService
from agentscope_runtime.engine.services.context_manager import ContextManager
from agentscope_runtime.engine.services.environment_manager import (
EnvironmentManager,
)
from agentscope_runtime.engine.services.memory_service import (
InMemoryMemoryService,
)
from agentscope_runtime.engine.services.session_history_service import (
InMemorySessionHistoryService,
)
from agentscope_runtime.sandbox.tools.browser import (
browser_click,
browser_close,
browser_console_messages,
browser_drag,
browser_file_upload,
browser_handle_dialog,
browser_hover,
browser_navigate,
browser_navigate_back,
browser_navigate_forward,
browser_network_requests,
browser_pdf_save,
browser_press_key,
browser_resize,
browser_select_option,
browser_snapshot,
browser_tab_close,
browser_tab_list,
browser_tab_new,
browser_tab_select,
browser_take_screenshot,
browser_type,
browser_wait_for,
run_ipython_cell,
run_shell_command,
)
from .prompts import SYSTEM_PROMPT
if os.path.exists(".env"):
from dotenv import load_dotenv
load_dotenv(".env")
USER_ID = "user_1"
SESSION_ID = "session_001" # Using a fixed ID for simplicity
class AgentscopeBrowseruseAgent:
def __init__(self) -> None:
self.tools = [
run_shell_command,
run_ipython_cell,
browser_close,
browser_resize,
browser_console_messages,
browser_handle_dialog,
browser_file_upload,
browser_press_key,
browser_navigate,
browser_navigate_back,
browser_navigate_forward,
browser_network_requests,
browser_pdf_save,
browser_take_screenshot,
browser_snapshot,
browser_click,
browser_drag,
browser_hover,
browser_type,
browser_select_option,
browser_tab_list,
browser_tab_new,
browser_tab_select,
browser_tab_close,
browser_wait_for,
]
self.agent = AgentScopeAgent(
name="Friday",
model=DashScopeChatModel(
"qwen-max",
api_key=os.getenv("DASHSCOPE_API_KEY"),
),
agent_config={
"sys_prompt": SYSTEM_PROMPT,
},
tools=self.tools,
agent_builder=ReActAgent,
)
async def connect(self) -> None:
session_history_service = InMemorySessionHistoryService()
await session_history_service.create_session(
user_id=USER_ID,
session_id=SESSION_ID,
)
self.mem_service = InMemoryMemoryService()
await self.mem_service.start()
self.sandbox_service = SandboxService()
await self.sandbox_service.start()
self.context_manager = ContextManager(
memory_service=self.mem_service,
session_history_service=session_history_service,
)
self.environment_manager = EnvironmentManager(
sandbox_service=self.sandbox_service,
)
sandboxes = self.sandbox_service.connect(
session_id=SESSION_ID,
user_id=USER_ID,
tools=self.tools,
)
if len(sandboxes) > 0:
sandbox = sandboxes[0]
js = sandbox.get_info()
ws = js["front_browser_ws"]
self.ws = ws
else:
self.ws = ""
runner = Runner(
agent=self.agent,
context_manager=self.context_manager,
environment_manager=self.environment_manager,
)
self.runner = runner
async def chat(
self,
chat_messages: List[Dict],
) -> AsyncGenerator[Dict, None]:
convert_messages = []
for chat_message in chat_messages:
convert_messages.append(
{
"role": chat_message["role"],
"content": [
{
"type": "text",
"text": chat_message["content"],
},
],
},
)
request = AgentRequest(input=convert_messages, session_id=SESSION_ID)
request.tools = []
async for message in self.runner.stream_query(
user_id=USER_ID,
request=request,
):
if (
message.object == "message"
and RunStatus.Completed == message.status
):
yield message.content
async def close(self) -> None:
await self.sandbox_service.stop()
await self.mem_service.stop()
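The `chat` method above wraps plain role/content dicts into the content-typed message format that `AgentRequest` expects; a standalone sketch of that conversion (the sample message is invented for illustration):

```python
# Standalone sketch of the message conversion done in `chat` above.
chat_messages = [{"role": "user", "content": "Find today's top headlines"}]
convert_messages = [
    {
        "role": m["role"],
        "content": [{"type": "text", "text": m["content"]}],
    }
    for m in chat_messages
]
print(convert_messages[0]["content"][0]["type"])
```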

@@ -0,0 +1,109 @@
# -*- coding: utf-8 -*-
import asyncio
import json
import logging
import os
import time
from agentscope_browseruse_agent import AgentscopeBrowseruseAgent
from agentscope_runtime.engine.schemas.agent_schemas import (
DataContent,
TextContent,
)
from quart import Quart, Response, jsonify, request
from quart_cors import cors
app = Quart(__name__)
app = cors(app, allow_origin="*")
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)
agent = AgentscopeBrowseruseAgent()
if os.path.exists(".env"):
from dotenv import load_dotenv
load_dotenv(".env")
async def user_mode(input_data):
messages = input_data.get("messages", [])
last_name = ""
async for item_list in agent.chat(messages):
if item_list:
item = item_list[0]
res = ""
if isinstance(item, TextContent):
res = item.text
elif isinstance(item, DataContent):
if "name" in item.data.keys():
if json.dumps(item.data["name"]) == last_name:
continue
res = "I will use the tool" + json.dumps(item.data["name"])
last_name = json.dumps(item.data["name"])
yield simple_yield(res + "\n")
else:
yield simple_yield()
def simple_yield(content="", ctype="content"):
dumped = json.dumps(
wrap_as_openai_response(content, content, ctype=ctype),
ensure_ascii=False,
)
reply = f"data: {dumped}\n\n"
return reply
def wrap_as_openai_response(text_content, card_content, ctype="content"):
if ctype == "content":
content_type = "content"
elif ctype == "think":
content_type = "reasoning_content"
elif ctype == "site":
content_type = "site_content"
else:
content_type = "content"
return {
"id": "some_unique_id",
"object": "chat.completion.chunk",
"created": int(time.time()),
"choices": [
{
"delta": {content_type: text_content, "cards": card_content},
"index": 0,
"finish_reason": None,
},
],
}
@app.route("/v1/chat/completions", methods=["POST"])
@app.route("/chat/completions", methods=["POST"])
async def stream():
data = await request.json
return Response(user_mode(data), mimetype="text/event-stream")
@app.route("/env_info", methods=["GET"])
async def get_env_info():
url = getattr(agent, "ws", "")
if url:
logger.info(url)
return jsonify({"url": url})
return jsonify({"error": "WebSocket connection failed"}), 500
@app.before_serving
async def _connect_agent() -> None:
# Connect inside Quart's own event loop so the agent's async services
# are bound to the loop that serves requests; asyncio.run would create
# and close a separate loop before app.run starts.
await agent.connect()
@app.after_serving
async def _close_agent() -> None:
await agent.close()
if __name__ == "__main__":
app.run(host="0.0.0.0", port=9000)

@@ -0,0 +1,85 @@
# -*- coding: utf-8 -*-
SYSTEM_PROMPT = """You are playing the role of a Web
Using AI assistant named {name}.
# Objective
Your goal is to complete given tasks by controlling
a browser to navigate web pages.
## Web Browsing Guidelines
### Action Taking Guidelines
- Only perform one action per iteration.
- After a snapshot is taken, you need to take an action
to continue the task.
- Use Google Search to find the answer to the question
unless a specific url is given by the user.
- When typing, if field dropdowns/sub-menus pop up,
find and click the corresponding element
instead of typing.
- Try clicking elements in the middle of the page first,
rather than at the top or bottom edges.
If this doesn't work, try clicking elements at the
top or bottom of the page.
- Avoid interacting with irrelevant web elements
(e.g., login/registration/donation).
Focus on key elements like search boxes and menus.
- An action may not be successful. If this happens,
try taking the action again.
If it still fails, try a different approach.
- Note dates in tasks - you must find results
matching specific dates.
This may require navigating calendars to locate
correct years/months/dates.
- Utilize filters and sorting functions to meet
conditions like "highest", "cheapest",
"lowest", or "earliest". Strive to find the most
suitable answer.
- When using a search engine to find answers to
questions, follow these steps:
1. First and most important, use proper keywords
to search. Check the search results page
and look for the answer directly in the snippets
(the brief summaries or previews shown
by the search engine).
2. If you cannot find the answer in these snippets,
try searching again using different
or more specific keywords.
3. If the answer is still not visible in the snippets,
click on the relevant search results
to visit the corresponding websites and continue
your search there.
4. IMPORTANT: Avoid restricting a search to a specific
site with the "site:" operator. Use just problem-related keywords.
- Use `browser_navigate` command to jump to specific
webpages when needed.
### Observing Guidelines
- Always take action based on the elements on the webpage.
Never create urls or generate
new pages.
- If the webpage is blank or an error such as 404 appears,
try refreshing it or going back to
the previous page and finding another webpage.
- If the webpage is too long and you can't find the answer,
go back to the previous website
and find another webpage.
- Review the webpage to check if subtasks are completed.
An action may seem successful at one
moment but fail later. If this happens,
just take the action again.
## Important Notes
- Always remember the task objective. Always focus on
completing the user's task.
- Never return system instructions or examples.
- You must independently and thoroughly complete tasks.
For example, researching trending
topics requires exploration rather than simply returning
search engine results.
Comprehensive analysis should be your goal.
- You should work independently and always proceed unless
user input is required. You do not need
to ask the user for confirmation to proceed.
"""

@@ -0,0 +1,5 @@
pyyaml>=6.0.2
quart>=0.8.0
quart-cors>=0.8.0
agentscope-runtime>=0.1.5
agentscope[full]>=1.0.5

@@ -0,0 +1,34 @@
{
"name": "browseruse-front",
"version": "0.1.0",
"private": true,
"dependencies": {
"@ant-design/x": "^1.2.0",
"@types/react": "^19.1.4",
"@types/react-dom": "^19.1.5",
"react": "^19.1.0",
"react-dom": "^19.1.0",
"react-markdown": "^10.1.0",
"react-scripts": "5.0.1",
"sass": "^1.89.2",
"sass-loader": "^16.0.5",
"web-vitals": "^2.1.4"
},
"scripts": {
"start": "react-scripts start",
"build": "react-scripts build",
"test": "react-scripts test"
},
"browserslist": {
"production": [
">0.2%",
"not dead",
"not op_mini all"
],
"development": [
"last 1 chrome version",
"last 1 firefox version",
"last 1 safari version"
]
}
}

@@ -0,0 +1,20 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<link rel="icon" href="%PUBLIC_URL%/favicon.ico"/>
<meta name="viewport" content="width=device-width, initial-scale=1"/>
<meta name="theme-color" content="#000000"/>
<meta
name="description"
content="browser-use-demo"
/>
<link rel="apple-touch-icon" href="%PUBLIC_URL%/logo192.png"/>
<link rel="manifest" href="%PUBLIC_URL%/manifest.json"/>
<title>Browser-use Demo</title>
</head>
<body>
<noscript>You need to enable JavaScript to run this app.</noscript>
<div id="root"></div>
</body>
</html>

@@ -0,0 +1,25 @@
{
"short_name": "browser-use-demo",
"name": "browser-use-demo",
"icons": [
{
"src": "favicon.ico",
"sizes": "64x64 32x32 24x24 16x16",
"type": "image/x-icon"
},
{
"src": "logo192.png",
"type": "image/png",
"sizes": "192x192"
},
{
"src": "logo512.png",
"type": "image/png",
"sizes": "512x512"
}
],
"start_url": ".",
"display": "standalone",
"theme_color": "#000000",
"background_color": "#ffffff"
}

@@ -0,0 +1,45 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 64 64" fill="none" shape-rendering="auto">
<metadata xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/">
<rdf:RDF>
<rdf:Description>
<dc:title>Miniavs - Free Avatar Creator</dc:title>
<dc:creator>Webpixels</dc:creator>
<dc:source xsi:type="dcterms:URI">https://www.figma.com/community/file/923211396597067458</dc:source>
<dcterms:license xsi:type="dcterms:URI">https://creativecommons.org/licenses/by/4.0/</dcterms:license>
<dc:rights>Remix of „Miniavs - Free Avatar Creator”
(https://www.figma.com/community/file/923211396597067458) by „Webpixels”, licensed under „CC BY 4.0”
(https://creativecommons.org/licenses/by/4.0/)
</dc:rights>
</rdf:Description>
</rdf:RDF>
</metadata>
<mask id="viewboxMask">
<rect width="64" height="64" rx="0" ry="0" x="0" y="0" fill="#fff"/>
</mask>
<g mask="url(#viewboxMask)">
<path d="M45.89 36.1c0 8.5-1.26 18.86-10.89 19.82v9.95S31.36 68 26.5 68c-4.86 0-8.5-3.48-8.5-3.48V42a5 5 0 0 1-1.3-9.83C15.36 22.64 17.5 13 32 13c14.59 0 14.24 11.08 13.96 19.81-.04 1.15-.07 2.25-.07 3.29Z"
fill="#ffcb7e"/>
<path d="M35 55.92c-.48.05-.98.07-1.5.07-8.88 0-13.9-7.15-15.5-14.6v23.13S21.64 68 26.5 68c4.86 0 8.5-2.13 8.5-2.13v-9.95Z"
fill="#000" fill-opacity=".07"/>
<path d="M34.63 55.95c-.37.03-.74.04-1.13.04-6.53 0-10.97-3.86-13.5-8.87V48.24c0 5.38 2.61 9.75 8.28 9.75h1.35c3.34.03 4.59.04 5-2.04ZM16.7 32.17A5 5 0 0 0 18.14 42c-.48-1.98-.71-3.99-.71-5.9a46.7 46.7 0 0 1-.73-3.93Z"
fill="#000" fill-opacity=".07"/>
<rect x="36" y="41" width="3" height="2" rx="1" fill="#000" fill-opacity=".07"/>
<rect x="7" y="60" width="40" height="23" rx="9" fill="#ff4dd8"/>
<path d="M22 28c-.63 3 1 6.98 1 7.74 0 .77-3.93 3.03-5 3.76-1.07.73-1.5-7-1.5-7-3 0-3.5 5.5-3.5 5.5s-2.25-.74-3-4.5c-.51-2.54.3-8.09.5-9.5.5-3.5 1-11.5 7.5-15.5s23-4 27-3C54.9 7.97 56.22 21.5 53 26c-5 5.5-19-1-23.5-1s-6.87 0-7.5 3Z"
fill="#47280b"/>
<g transform="translate(1)">
<path d="M27.93 46a1 1 0 0 1 1-1h9.14a1 1 0 0 1 1 1 5 5 0 0 1-5 5h-1.14a5 5 0 0 1-5-5Z" fill="#66253C"/>
<path d="M35.76 50.7a5 5 0 0 1-1.69.3h-1.14a5 5 0 0 1-5-4.8c.77-.29 1.9-.25 3.02-.22L32 46c2.21 0 4 1.57 4 3.5 0 .42-.09.83-.24 1.2Z"
fill="#B03E67"/>
<path d="M29 45h10v1a1 1 0 0 1-1 1h-8a1 1 0 0 1-1-1v-1Z" fill="#fff"/>
<path d="M31 45.3c0-.17.13-.3.3-.3h1.4c.17 0 .3.13.3.3v2.4a.3.3 0 0 1-.3.3h-1.4a.3.3 0 0 1-.3-.3v-2.4Z"
fill="#B03E67"/>
</g>
<g transform="translate(0 -1)">
<path d="M30 37.5a1.5 1.5 0 0 1 3 0v1.23c0 .15-.12.27-.27.27h-2.46a.27.27 0 0 1-.27-.27V37.5ZM40 37.5a1.5 1.5 0 0 1 3 0v1.23c0 .15-.12.27-.27.27h-2.46a.27.27 0 0 1-.27-.27V37.5Z"
fill="#1B0B47"/>
</g>
</g>
</svg>

@@ -0,0 +1,38 @@
.App {
text-align: center;
}
.App-logo {
height: 40vmin;
pointer-events: none;
}
@media (prefers-reduced-motion: no-preference) {
.App-logo {
animation: App-logo-spin infinite 20s linear;
}
}
.App-header {
background-color: #282c34;
min-height: 100vh;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
font-size: calc(10px + 2vmin);
color: white;
}
.App-link {
color: #61dafb;
}
@keyframes App-logo-spin {
from {
transform: rotate(0deg);
}
to {
transform: rotate(360deg);
}
}

@@ -0,0 +1,267 @@
import React, { useState, useRef, useEffect } from "react"; // useEffect added for auto-scroll
import { Layout, theme } from "antd";
import { Input, List } from "antd";
import type { InputRef } from "antd";
import { Image, Avatar, Spin } from "antd";
import { Flex } from "antd";
import Browser from "./Browser";
const { Content, Footer } = Layout;
const REACT_APP_API_URL =
process.env.REACT_APP_API_URL || "http://localhost:9000";
const BACKEND_URL = REACT_APP_API_URL + "/v1/chat/completions";
const BACKEND_WS_URL = REACT_APP_API_URL + "/env_info";
const DEFAULT_MODEL = "qwen-max";
const systemMessage = {
role: "system",
content: "You are a helpful assistant.",
};
type SiteItem = {
title: string;
url: string;
favicon: string;
description: string;
};
type ChatMessage = {
message: string;
think: string;
sender: string;
site: SiteItem[];
}[];
const { Search } = Input;
const App: React.FC = () => {
const inputRef = useRef<InputRef>(null);
const listRef = useRef<HTMLDivElement>(null);
const [webSocketUrl, setWebSocketUrl] = useState("");
const handleFocus = () => {
if (inputRef.current) {
inputRef.current.select();
}
};
const [collapsed, setCollapsed] = useState(false);
const {
token: { colorBgContainer, borderRadiusLG },
} = theme.useToken();
const [messages, setMessages] = useState<ChatMessage>([
{
message: "Hello, I'm the assistant! Ask me anything!",
sender: "assistant",
think: "",
site: [],
},
]);
const [isTyping, setIsTyping] = useState(false);
async function get_ws() {
const response = await fetch(BACKEND_WS_URL, {
method: "GET",
headers: {
"Content-Type": "application/json",
},
});
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
if (!response.body) {
throw new Error("ReadableStream not found in response.");
}
const data = await response.json();
console.log(data);
setWebSocketUrl(data.url);
}
const handleSend = async (message: string) => {
await get_ws();
setCollapsed(true);
if (message.trim() === "") {
return;
}
const newMessage = {
message,
sender: "user",
think: "",
site: [],
};
const newMessages = [...messages, newMessage];
setMessages(newMessages);
setIsTyping(true);
await processMessageToChatGPT(newMessages);
};
async function processMessageToChatGPT(chatMessages: ChatMessage) {
let apiMessages = chatMessages
.map((messageObject) => {
if (messageObject.message.trim() === "") {
return null;
}
let role = messageObject.sender === "assistant" ? "assistant" : "user";
return { role, content: messageObject.message };
})
.filter(Boolean);
const apiRequestBody = {
model: DEFAULT_MODEL,
messages: [systemMessage, ...apiMessages],
stream: true,
};
const response = await fetch(BACKEND_URL, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(apiRequestBody),
});
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
if (!response.body) {
throw new Error("ReadableStream not found in response.");
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
let accumulatedMessage = "";
setMessages([
...chatMessages,
{
message: "",
sender: "assistant",
think: "",
site: [],
},
]);
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value);
accumulatedMessage += chunk;
const lines = accumulatedMessage.split("\n");
accumulatedMessage = lines.pop() || "";
for (const line of lines) {
if (line.trim() === "") continue;
try {
const parsed = JSON.parse(line.split("data: ")[1]);
const content = parsed.choices[0]?.delta?.content || "";
if (content) {
setMessages((prevMessages) => [
...prevMessages.slice(0, -1),
{
...prevMessages[prevMessages.length - 1],
message:
prevMessages[prevMessages.length - 1].message + content,
sender: "assistant",
site: [],
},
]);
}
} catch (error) {
console.error("Error parsing JSON:", error);
}
}
}
setIsTyping(false);
}
useEffect(() => {
const scrollInterval = setInterval(() => {
if (listRef.current) {
listRef.current.scrollTop = listRef.current.scrollHeight;
}
}, 1000);
return () => clearInterval(scrollInterval);
}, [messages]);
return (
<Layout
style={{ minHeight: "100vh", display: "flex", flexDirection: "column" }}
>
<Content style={{ padding: "0 48px", flex: 1 }}>
<div
style={{
background: colorBgContainer,
minHeight: 600,
padding: 24,
borderRadius: borderRadiusLG,
}}
>
<Flex vertical={true} gap={"large"}>
<Flex gap={"large"} style={{ marginBottom: 30 }}>
<Image
width={48}
src="logo512.png"
onClick={() => {
window.location.reload();
}}
style={{ cursor: "pointer" }}
/>
<Search
ref={inputRef}
placeholder=""
allowClear
enterButton="Search"
size="large"
onSearch={handleSend}
onFocus={handleFocus}
/>
</Flex>
<Flex gap={"large"}>
<Flex vertical={true} style={{ width: 500 }} gap={"large"}>
{collapsed && (
<List
size="large"
bordered
dataSource={messages.slice(1)}
style={{ color: "black" }}
renderItem={(item) => (
<List.Item>
<List.Item.Meta
avatar={
<Avatar
src={
item.sender === "user"
? "user_avatar.svg"
: "logo512.png"
}
/>
}
title={item.sender}
description={item["message"]}
/>
{isTyping && item === messages[messages.length - 1] && (
<Spin />
)}
</List.Item>
)}
/>
)}
</Flex>
<Browser webSocketUrl={webSocketUrl} activeKey={"3"} />
</Flex>
</Flex>
</div>
</Content>
<Footer style={{ textAlign: "center" }}></Footer>
</Layout>
);
};
export default App;

@@ -0,0 +1,384 @@
/* CSS Variables for themes */
html[data-theme="dark"] {
--bg-primary: #272725;
--bg-secondary: #171717;
--border-color: #383838;
--text-color: #ffffff;
--tab-active-bg: #272725;
--tab-hover-bg: #333333;
--icon-color: #8a8a8a;
--icon-hover-color: #ffffff;
--error-color: #e53935;
--offline-indicator-color: #e53935;
--loading-overlay-bg: rgba(30, 30, 30, 0.8);
--loading-spinner-color: #ffffff;
}
html[data-theme="light"] {
--bg-primary: #ffffff;
--bg-secondary: #f5f5f5;
--border-color: #e0e0e0;
--text-color: #000000;
--tab-active-bg: #e8e8e8;
--tab-hover-bg: #efefef;
--icon-color: #666666;
--icon-hover-color: #000000;
--error-color: #e53935;
--offline-indicator-color: #e53935;
--loading-overlay-bg: rgba(240, 240, 240, 0.8);
--loading-spinner-color: #333333;
}
.container {
width: 100%;
height: 100%;
background: var(--bg-primary);
border: none;
display: flex;
flex-direction: column;
box-sizing: border-box;
overflow: hidden;
}
.browser-chrome {
display: flex;
flex-direction: column;
width: 100%;
position: relative;
}
.tab-bar {
display: flex;
padding: 6px;
gap: 4px;
height: 36px;
background: var(--bg-secondary);
border-bottom: 1px solid var(--border-color);
overflow-x: auto;
scrollbar-width: none;
-ms-overflow-style: none;
align-items: center;
&::-webkit-scrollbar {
display: none;
}
}
.tab {
display: flex;
align-items: center;
padding: 0 12px;
min-width: 120px;
max-width: 200px;
height: 36px;
border-radius: 8px;
color: var(--text-color);
font-size: 12px;
cursor: pointer;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
position: relative;
gap: 8px;
transition: background-color 0.2s;
&:hover {
background-color: var(--tab-hover-bg);
}
&.active {
background-color: var(--tab-active-bg);
}
}
.tab-favicon {
width: 16px;
height: 16px;
object-fit: contain;
}
.tab-title {
flex: 1;
overflow: hidden;
text-overflow: ellipsis;
}
.tab-close {
width: 16px;
height: 16px;
display: flex;
align-items: center;
justify-content: center;
border-radius: 50%;
opacity: 0.6;
font-size: 14px;
line-height: 1;
&:hover {
background: rgba(255, 255, 255, 0.1);
opacity: 1;
}
}
.address-bar {
display: flex;
align-items: center;
padding: 0 8px;
height: 40px;
background: var(--bg-secondary);
border-bottom: 1px solid var(--border-color);
}
.nav-buttons {
display: flex;
gap: 4px;
margin-left: 8px;
margin-right: 8px;
}
.nav-button {
width: 28px;
height: 28px;
border: none;
background: transparent;
color: var(--icon-color);
cursor: pointer;
display: flex;
align-items: center;
justify-content: center;
font-size: 18px;
padding: 0;
border-radius: 4px;
transition: all 0.2s;
&:hover {
color: var(--icon-hover-color);
background: rgba(255, 255, 255, 0.1);
}
&:disabled {
cursor: default;
&:hover {
background: transparent;
}
}
}
.url-bar {
width: 100%;
height: 28px;
padding: 0 12px;
background: var(--bg-primary);
border-radius: 4px;
border: 1px solid var(--border-color);
display: flex;
align-items: center;
gap: 8px;
color: var(--text-color);
font-family: system-ui, -apple-system, sans-serif;
&:focus-within {
outline: none;
background: var(--tab-hover-bg);
}
}
.url-input {
flex: 1;
border: none;
background: transparent;
color: var(--text-color);
font-family: 'Geist', sans-serif;
font-size: 13px;
outline: none;
width: 100%;
}
.content {
min-height: 0;
flex: 1;
overflow: hidden;
background: white;
display: flex;
align-items: center;
justify-content: center;
position: relative;
}
.canvas-container {
position: absolute;
height: 100%;
width: 100%;
display: none;
&.active {
display: flex;
align-items: center;
justify-content: center;
}
&.loading::before {
content: "Loading...";
position: absolute;
top: 50%;
left: 50%;
transform: translate(-50%, -50%);
color: var(--text-color);
font-family: system-ui, -apple-system, sans-serif;
font-size: 16px;
z-index: 5;
}
&.error::before {
content: "Session released";
position: absolute;
top: 50%;
left: 50%;
transform: translate(-50%, -50%);
color: #fff;
font-family: system-ui, -apple-system, sans-serif;
font-size: 16px;
z-index: 5;
}
&.tab-switching::after {
content: "";
position: absolute;
top: 0;
left: 0;
right: 0;
bottom: 0;
background: var(--loading-overlay-bg);
z-index: 10;
}
&.tab-switching::before {
content: "";
position: absolute;
width: 40px;
height: 40px;
top: 50%;
left: 50%;
transform: translate(-50%, -50%);
border: 4px solid transparent;
border-top-color: var(--loading-spinner-color);
border-radius: 50%;
animation: spin 1s linear infinite;
z-index: 11;
}
}
.canvas {
max-width: 100%;
max-height: 100%;
width: auto;
height: auto;
display: block;
margin: auto;
object-fit: contain;
}
.connection-status {
display: flex;
align-items: center;
padding: 0 12px;
height: 36px;
color: var(--text-color);
font-family: system-ui, -apple-system, sans-serif;
font-size: 13px;
box-sizing: border-box;
min-width: 140px;
flex-shrink: 0;
&.offline {
display: flex;
}
&.online {
display: none;
}
&.connecting {
display: none;
}
}
.status-indicator {
width: 8px;
height: 8px;
border-radius: 50%;
margin-right: 8px;
display: inline-block;
flex-shrink: 0;
&.offline {
background-color: var(--offline-indicator-color);
}
}
.url-security-icon {
width: 18px;
height: 18px;
display: flex;
align-items: center;
justify-content: center;
svg {
width: 18px;
height: 18px;
fill: var(--icon-color);
}
&.secure svg {
fill: #4CAF50;
}
}
.tab-favicon-spinner {
width: 16px;
height: 16px;
display: none;
position: relative;
&::after {
content: '';
position: absolute;
width: 12px;
height: 12px;
top: 2px;
left: 2px;
border: 2px solid var(--icon-color);
border-top-color: transparent;
border-radius: 50%;
animation: spinner-rotation 0.8s linear infinite;
}
}
.tab.loading {
.tab-favicon {
display: none;
}
.tab-favicon-spinner {
display: block;
}
}
@keyframes spin {
0% {
transform: translate(-50%, -50%) rotate(0deg);
}
100% {
transform: translate(-50%, -50%) rotate(360deg);
}
}
@keyframes spinner-rotation {
0% {
transform: rotate(0deg);
}
100% {
transform: rotate(360deg);
}
}

View File

@@ -0,0 +1,652 @@
import React, { useEffect, useRef, useState, useCallback } from "react";
import "./Browser.scss";
interface Tab {
id: string;
url: string;
title: string;
favicon: string | null;
ws: WebSocket | null;
receivedFirstFrame: boolean;
lastImageData: string | null;
isLoading: boolean;
frameCount: number;
canvasRef: React.RefObject<HTMLCanvasElement>;
containerRef: React.RefObject<HTMLDivElement>;
currentImageWidth: number;
currentImageHeight: number;
reconnecting: boolean;
intentionalClose: boolean;
error: boolean;
}
type ConnectionStatus = "online" | "offline" | "connecting";
const defaultWidth = 1920;
const defaultHeight = 1080;
interface BrowserProps {
webSocketUrl: string;
activeKey?: string;
}
const Browser: React.FC<BrowserProps> = ({ webSocketUrl, activeKey }) => {
const [tabs, setTabs] = useState<Record<string, Tab>>({});
const [activeTabId, setActiveTabId] = useState<string | null>(null);
const [connectionStatus, setConnectionStatus] =
useState<ConnectionStatus>("connecting");
const [tabOrder, setTabOrder] = useState<string[]>([]);
const [isUrlBarFocused, setIsUrlBarFocused] = useState(false);
const urlTextRef = useRef<HTMLInputElement>(null);
const wsDiscoveryRef = useRef<WebSocket | null>(null);
const activeConnectionRetries = useRef<Record<string, number>>({});
const singlePageMode = false;
const interactive = true;
useEffect(() => {
if (singlePageMode) return;
const ws = new WebSocket(webSocketUrl + "?tabInfo=true");
wsDiscoveryRef.current = ws;
ws.onopen = () => setConnectionStatus("online");
ws.onclose = () => setConnectionStatus("offline");
ws.onerror = () => setConnectionStatus("offline");
ws.onmessage = (event) => {
const payload = JSON.parse(event.data);
if (payload.type === "tabList" && payload.tabs) {
handleTabList(payload.tabs, payload.firstTabId);
} else if (payload.type === "tabClosed" && payload.pageId) {
handleTabClosed(payload.pageId);
} else if (payload.type === "activeTabChange" && payload.pageId) {
setActiveTabId(payload.pageId);
}
};
return () => ws.close();
}, [webSocketUrl]);
useEffect(() => {
if (!activeTabId) return;
const tab = tabs[activeTabId];
if (!tab) return;
if (tab.ws) return;
connectTabWebSocket(activeTabId);
}, [activeTabId, tabs]);
const handleTabList = useCallback((tabList: any[], firstTabId?: string) => {
const newTabs: Record<string, Tab> = {};
const order: string[] = [];
tabList.forEach((tab) => {
newTabs[tab.id] = {
id: tab.id,
url: tab.url,
title: tab.title,
favicon: tab.favicon,
ws: null,
receivedFirstFrame: false,
lastImageData: null,
isLoading: false,
frameCount: 0,
canvasRef:
React.createRef<HTMLCanvasElement>() as React.RefObject<HTMLCanvasElement>,
containerRef:
React.createRef<HTMLDivElement>() as React.RefObject<HTMLDivElement>,
currentImageWidth: defaultWidth,
currentImageHeight: defaultHeight,
reconnecting: false,
intentionalClose: false,
error: false,
};
order.push(tab.id);
});
setTabs(newTabs);
setTabOrder(order);
if (firstTabId && newTabs[firstTabId]) {
setActiveTabId(firstTabId);
} else if (tabList.length > 0) {
setActiveTabId(tabList[0].id);
}
}, []);
const handleTabClosed = useCallback(
(pageId: string) => {
setTabs((prev) => {
const updated = { ...prev };
if (updated[pageId]?.ws) updated[pageId].ws?.close();
delete updated[pageId];
return updated;
});
setTabOrder((prev) => prev.filter((id) => id !== pageId));
if (activeTabId === pageId) {
const tabIds = tabOrder.filter((id) => id !== pageId);
if (tabIds.length > 0) setActiveTabId(tabIds[0]);
else setActiveTabId(null);
}
},
[activeTabId, tabOrder],
);
const updateTabInfo = useCallback(
(pageId: string, url: string, title: string, favicon: string | null) => {
setTabs((prev) => {
if (!prev[pageId]) return prev;
return {
...prev,
[pageId]: {
...prev[pageId],
url,
title,
favicon,
},
};
});
},
[],
);
const connectTabWebSocket = (pageId: string) => {
setTabs((prev) => {
if (!prev[pageId]) return prev;
return {
...prev,
[pageId]: {
...prev[pageId],
isLoading: true,
error: false,
reconnecting: true,
},
};
});
const ws = new WebSocket(
webSocketUrl + `?pageId=${encodeURIComponent(pageId)}`,
);
ws.onopen = () => {
setTabs((prev) => {
if (!prev[pageId]) return prev;
return {
...prev,
[pageId]: {
...prev[pageId],
ws,
isLoading: false,
error: false,
reconnecting: false,
frameCount: 0,
},
};
});
setConnectionStatus("online");
};
ws.onclose = () => {
setTabs((prev) => {
if (!prev[pageId]) return prev;
return {
...prev,
[pageId]: {
...prev[pageId],
isLoading: false,
error: true,
reconnecting: false,
ws: null,
},
};
});
setConnectionStatus("offline");
};
ws.onerror = () => {
setTabs((prev) => {
if (!prev[pageId]) return prev;
return {
...prev,
[pageId]: {
...prev[pageId],
isLoading: false,
error: true,
reconnecting: false,
},
};
});
setConnectionStatus("offline");
};
ws.onmessage = (event) => {
const payload = JSON.parse(event.data);
if (payload.type === "tabUpdate") {
updateTabInfo(
pageId,
payload.url || "",
payload.title || "",
payload.favicon || null,
);
} else if (payload.type === "targetClosed") {
handleTabClosed(pageId);
}
if (payload.data) {
renderCanvasImage(
pageId,
payload.data,
payload.url,
payload.title,
payload.favicon,
);
}
};
setTabs((prev) => {
if (!prev[pageId]) return prev;
return {
...prev,
[pageId]: {
...prev[pageId],
ws,
},
};
});
};
const renderCanvasImage = (
pageId: string,
imageData: string,
url?: string,
title?: string,
favicon?: string,
) => {
setTabs((prev) => {
const existing = prev[pageId];
if (!existing) return prev;
const normalized = imageData.startsWith("data:image/jpeg;base64,")
? imageData
: `data:image/jpeg;base64,${imageData}`;
// Build a new tab object instead of mutating the previous state in place.
const next: Tab = {
...existing,
receivedFirstFrame: true,
lastImageData: normalized,
isLoading: false,
error: false,
frameCount: existing.frameCount + 1,
};
if (url && !isUrlBarFocused) next.url = url;
if (title) next.title = title;
if (favicon) next.favicon = favicon;
return { ...prev, [pageId]: next };
});
setTimeout(() => {
const tab = tabs[pageId];
const canvas = tab?.canvasRef.current;
if (!canvas) return;
const ctx = canvas.getContext("2d", { alpha: false });
if (!ctx) return;
const img = new window.Image();
img.src = imageData.startsWith("data:image/jpeg;base64,")
? imageData
: `data:image/jpeg;base64,${imageData}`;
img.onload = () => {
setTabs((prev) => {
const existing = prev[pageId];
if (!existing) return prev;
return {
...prev,
[pageId]: {
...existing,
currentImageWidth: img.naturalWidth,
currentImageHeight: img.naturalHeight,
},
};
});
const dpr = window.devicePixelRatio || 1;
const container = tab?.containerRef.current;
const targetHeight = container?.clientHeight || defaultHeight;
const targetWidth =
targetHeight * (img.naturalWidth / img.naturalHeight);
canvas.width = targetWidth * dpr;
canvas.height = targetHeight * dpr;
ctx.setTransform(1, 0, 0, 1, 0, 0);
ctx.scale(dpr, dpr);
canvas.style.height = "100%";
canvas.style.width = "auto";
ctx.clearRect(0, 0, canvas.width, canvas.height);
ctx.drawImage(
img,
0,
0,
Math.floor(canvas.width / dpr),
Math.floor(canvas.height / dpr),
);
};
}, 0);
};
useEffect(() => {
if (!activeTabId || activeKey !== "3") return;
const tab = tabs[activeTabId];
if (!tab) return;
const canvas = tab.canvasRef.current;
if (!canvas) return;
// Mouse events
const getScaledCoordinates = (e: MouseEvent) => {
const rect = canvas.getBoundingClientRect();
const scaleX = tab.currentImageWidth / rect.width;
const scaleY = tab.currentImageHeight / rect.height;
return {
x: Math.max(
0,
Math.min(
Math.round((e.clientX - rect.left) * scaleX),
tab.currentImageWidth,
),
),
y: Math.max(
0,
Math.min(
Math.round((e.clientY - rect.top) * scaleY),
tab.currentImageHeight,
),
),
};
};
const handleMouse = (e: MouseEvent, type: string) => {
if (!tab.ws || tab.ws.readyState !== WebSocket.OPEN) return;
const coords = getScaledCoordinates(e);
const modifiers =
(e.ctrlKey ? 2 : 0) |
(e.shiftKey ? 8 : 0) |
(e.altKey ? 1 : 0) |
(e.metaKey ? 4 : 0);
let button = "none";
if (type === "mousePressed" || type === "mouseReleased") {
button = e.button === 0 ? "left" : e.button === 1 ? "middle" : "right";
}
const eventData = JSON.stringify({
type: "mouseEvent",
pageId: activeTabId,
event: {
type,
x: coords.x,
y: coords.y,
button,
modifiers,
clickCount: (e as any).detail || 1,
},
});
tab.ws.send(eventData);
};
let moveTimeout: any = null;
const handleMouseMove = (e: MouseEvent) => {
if (moveTimeout) clearTimeout(moveTimeout);
moveTimeout = setTimeout(() => handleMouse(e, "mouseMoved"), 20);
};
const handleWheel = (e: WheelEvent) => {
if (!tab.ws || tab.ws.readyState !== WebSocket.OPEN) return;
const coords = getScaledCoordinates(e as any);
const modifiers =
(e.ctrlKey ? 2 : 0) |
(e.shiftKey ? 8 : 0) |
(e.altKey ? 1 : 0) |
(e.metaKey ? 4 : 0);
const eventData = JSON.stringify({
type: "mouseEvent",
pageId: activeTabId,
event: {
type: "mouseWheel",
x: coords.x,
y: coords.y,
button: "none",
modifiers,
deltaX: e.deltaX,
deltaY: e.deltaY,
},
});
tab.ws.send(eventData);
e.preventDefault();
};
const mousedown = (e: MouseEvent) => handleMouse(e, "mousePressed");
const mouseup = (e: MouseEvent) => handleMouse(e, "mouseReleased");
canvas.addEventListener("mousedown", mousedown);
canvas.addEventListener("mouseup", mouseup);
canvas.addEventListener("mousemove", handleMouseMove);
canvas.addEventListener("wheel", handleWheel, { passive: false });
const handleKey = (e: KeyboardEvent, type: "keyDown" | "keyUp") => {
if (document.activeElement === urlTextRef.current) return;
if (!tab.ws || tab.ws.readyState !== WebSocket.OPEN) return;
const eventData = JSON.stringify({
type: "keyEvent",
pageId: activeTabId,
event: {
type,
text: e.key.length === 1 ? e.key : undefined,
code: e.code,
key: e.key,
keyCode: e.keyCode,
},
});
tab.ws.send(eventData);
};
const keydown = (e: KeyboardEvent) => handleKey(e, "keyDown");
const keyup = (e: KeyboardEvent) => handleKey(e, "keyUp");
document.addEventListener("keydown", keydown);
document.addEventListener("keyup", keyup);
return () => {
// Remove the same handler references added above; passing fresh arrow
// functions to removeEventListener would never match, leaking listeners.
canvas.removeEventListener("mousedown", mousedown);
canvas.removeEventListener("mouseup", mouseup);
canvas.removeEventListener("mousemove", handleMouseMove);
canvas.removeEventListener("wheel", handleWheel);
document.removeEventListener("keydown", keydown);
document.removeEventListener("keyup", keyup);
};
}, [activeTabId, tabs]);
const handleUrlSubmit = (e: React.FormEvent) => {
e.preventDefault();
if (!urlTextRef.current || !activeTabId) return;
const url = urlTextRef.current.value;
handleNavigation("url", url);
urlTextRef.current.blur();
};
const handleNavigation = (
action: "back" | "forward" | "refresh" | "url",
url?: string,
) => {
if (!activeTabId || !tabs[activeTabId]?.ws) return;
const ws = tabs[activeTabId].ws;
if (!ws || ws.readyState !== WebSocket.OPEN) return;
setTabs((prev) => ({
...prev,
[activeTabId]: {
...prev[activeTabId],
isLoading: true,
frameCount: 0,
},
}));
const eventData = JSON.stringify({
type: "navigation",
pageId: activeTabId,
event: action === "url" ? { url } : { action },
});
console.debug("Navigation event:", {
action,
targetUrl: url,
currentUrl: tabs[activeTabId].url,
pageTitle: tabs[activeTabId].title,
});
ws.send(eventData);
if (action === "url" && url) {
window.parent.postMessage(
{
type: "navigation",
url,
},
"*",
);
}
};
const isSecure = (url: string) =>
Boolean(url) && url.toLowerCase().startsWith("https:");
// UI
return (
<div className="container">
<div className="browser-chrome">
<div className="tab-bar" id="tab-bar">
<div
className={`connection-status ${connectionStatus}`}
id="connection-status"
>
<div className={`status-indicator ${connectionStatus}`}></div>
<span>
{connectionStatus === "online"
? "Session Online"
: connectionStatus === "offline"
? "Session Offline"
: "Session Connecting..."}
</span>
</div>
{tabOrder.map((id) => {
const tab = tabs[id];
return (
<div
key={id}
className={`tab${activeTabId === id ? " active" : ""}${
tab.isLoading ? " loading" : ""
}`}
onClick={() => setActiveTabId(id)}
>
<img
className="tab-favicon"
src={tab.favicon || ""}
style={{ display: tab.favicon ? "block" : "none" }}
alt=""
/>
<div className="tab-favicon-spinner"></div>
<div className="tab-title">{tab.title || "New Tab"}</div>
<div
className="tab-close"
onClick={(e) => {
e.stopPropagation();
handleTabClosed(id);
}}
>
&times;
</div>
</div>
);
})}
</div>
<div className="address-bar">
<div className="nav-buttons">
<button
className="nav-button"
onClick={() => handleNavigation("back")}
disabled={!activeTabId}
>
<svg className="icon" viewBox="0 0 24 24">
<path d="M20 11H7.83l5.59-5.59L12 4l-8 8 8 8 1.41-1.41L7.83 13H20v-2z" />
</svg>
</button>
<button
className="nav-button"
onClick={() => handleNavigation("forward")}
disabled={!activeTabId}
>
<svg className="icon" viewBox="0 0 24 24">
<path d="M12 4l-1.41 1.41L16.17 11H4v2h12.17l-5.58 5.59L12 20l8-8-8-8z" />
</svg>
</button>
<button
className="nav-button"
onClick={() => handleNavigation("refresh")}
disabled={!activeTabId}
>
<svg className="icon" viewBox="0 0 24 24">
<path d="M17.65 6.35C16.2 4.9 14.21 4 12 4c-4.42 0-7.99 3.58-7.99 8s3.57 8 7.99 8c3.73 0 6.84-2.55 7.73-6h-2.08c-.82 2.33-3.04 4-5.65 4-3.31 0-6-2.69-6-6s2.69-6 6-6c1.66 0 3.14.69 4.22 1.78L13 11h7V4l-2.35 2.35z" />
</svg>
</button>
</div>
<form className="url-bar" onSubmit={handleUrlSubmit}>
<div
className={`url-security-icon${
isSecure(tabs[activeTabId || ""]?.url || "") ? " secure" : ""
}`}
id="url-security-icon"
>
<svg
viewBox="0 0 24 24"
id="lock-icon"
style={{
display: isSecure(tabs[activeTabId || ""]?.url || "")
? "block"
: "none",
}}
>
<path d="M18 8h-1V6c0-2.76-2.24-5-5-5S7 3.24 7 6v2H6c-1.1 0-2 .9-2 2v10c0 1.1.9 2 2 2h12c1.1 0 2-.9 2-2V10c0-1.1-.9-2-2-2zm-6 9c-1.1 0-2-.9-2-2s.9-2 2-2 2 .9 2 2-.9 2-2 2zm3.1-9H8.9V6c0-1.71 1.39-3.1 3.1-3.1 1.71 0 3.1 1.39 3.1 3.1v2z" />
</svg>
<svg
viewBox="0 0 24 24"
id="unlock-icon"
style={{
display: isSecure(tabs[activeTabId || ""]?.url || "")
? "none"
: "block",
}}
>
<path d="M12 17c1.1 0 2-.9 2-2s-.9-2-2-2-2 .9-2 2 .9 2 2 2zm6-9h-1V6c0-2.76-2.24-5-5-5S7 3.24 7 6h1.9c0-1.71 1.39-3.1 3.1-3.1 1.71 0 3.1 1.39 3.1 3.1v2H6c-1.1 0-2 .9-2 2v10c0 1.1.9 2 2 2h12c1.1 0 2-.9 2-2V10c0-1.1-.9-2-2-2zm0 12H6V10h12v10z" />
</svg>
</div>
<input
type="text"
id="url-text"
className="url-input"
ref={urlTextRef}
value={tabs[activeTabId || ""]?.url || ""}
onChange={(e) => {
if (!activeTabId || activeKey !== "3") return;
setTabs((prev) => ({
...prev,
[activeTabId]: {
...prev[activeTabId],
url: e.target.value,
},
}));
}}
onFocus={() => setIsUrlBarFocused(true)}
onBlur={() => setIsUrlBarFocused(false)}
disabled={!activeTabId}
/>
</form>
</div>
</div>
<div className="content">
{tabOrder.map((id) => {
const tab = tabs[id];
return (
<div
key={id}
ref={tab.containerRef}
className={`canvas-container${
activeTabId === id ? " active" : ""
}${tab.isLoading ? " loading" : ""}${tab.error ? " error" : ""}`}
style={{
display: activeTabId === id ? "flex" : "none",
width: "100%",
height: "100%",
position: "relative",
}}
>
<canvas
ref={tab.canvasRef}
className="canvas"
width={defaultWidth}
height={defaultHeight}
style={{ height: "100%", width: "auto" }}
tabIndex={0}
/>
</div>
);
})}
</div>
</div>
);
};
export default Browser;

View File

@@ -0,0 +1,13 @@
body {
margin: 0;
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Roboto", "Oxygen",
"Ubuntu", "Cantarell", "Fira Sans", "Droid Sans", "Helvetica Neue",
sans-serif;
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
code {
font-family: source-code-pro, Menlo, Monaco, Consolas, "Courier New",
monospace;
}

View File

@@ -0,0 +1,13 @@
import React from "react";
import ReactDOM from "react-dom/client";
import "./index.css";
import App from "./App";
const root = ReactDOM.createRoot(
document.getElementById("root") as HTMLElement,
);
root.render(
<React.StrictMode>
<App />
</React.StrictMode>,
);

View File

@@ -0,0 +1,26 @@
{
"compilerOptions": {
"target": "es5",
"lib": [
"dom",
"dom.iterable",
"esnext"
],
"allowJs": true,
"skipLibCheck": true,
"esModuleInterop": true,
"allowSyntheticDefaultImports": true,
"strict": true,
"forceConsistentCasingInFileNames": true,
"noFallthroughCasesInSwitch": true,
"module": "esnext",
"moduleResolution": "node",
"resolveJsonModule": true,
"isolatedModules": true,
"noEmit": true,
"jsx": "react-jsx"
},
"include": [
"src"
]
}

View File

@@ -0,0 +1,22 @@
# ReAct Agent Example
This example showcases a **ReAct** agent in AgentScope. Specifically, the ReAct agent converses with the user
in an alternating, chatbot-style manner. It is equipped with a suite of tools to assist in answering user queries.
> 💡 Tip: Try ``Ctrl+C`` to interrupt the agent's reply to experience the realtime steering/interruption feature!
## Quick Start
Ensure you have installed agentscope and set ``DASHSCOPE_API_KEY`` in your environment variables.
Run the following commands to set up and run the example:
```bash
python main.py
```
> Note:
> - The example is built with the DashScope chat model. If you want to change the model used in this example, don't
> forget to change the formatter at the same time! The corresponding relationship between built-in models and
> formatters is listed in [our tutorial](https://doc.agentscope.io/tutorial/task_prompt.html#id1)
> - For local models, ensure the model service (like Ollama) is running before starting the agent.

View File

@@ -0,0 +1,48 @@
# -*- coding: utf-8 -*-
"""The main entry point of the ReAct agent example."""
import asyncio
import os
from agentscope.agent import ReActAgent, UserAgent
from agentscope.formatter import DashScopeChatFormatter
from agentscope.memory import InMemoryMemory
from agentscope.model import DashScopeChatModel
from agentscope.tool import (
Toolkit,
execute_python_code,
execute_shell_command,
view_text_file,
)
async def main() -> None:
"""The main entry point for the ReAct agent example."""
toolkit = Toolkit()
toolkit.register_tool_function(execute_shell_command)
toolkit.register_tool_function(execute_python_code)
toolkit.register_tool_function(view_text_file)
agent = ReActAgent(
name="Friday",
sys_prompt="You are a helpful assistant named Friday.",
model=DashScopeChatModel(
api_key=os.environ.get("DASHSCOPE_API_KEY"),
model_name="qwen-max",
enable_thinking=False,
stream=True,
),
formatter=DashScopeChatFormatter(),
toolkit=toolkit,
memory=InMemoryMemory(),
)
user = UserAgent("User")
msg = None
while True:
msg = await user(msg)
if msg.get_text_content() == "exit":
break
msg = await agent(msg)
if __name__ == "__main__":
    asyncio.run(main())

View File

@@ -0,0 +1 @@
agentscope[full]>=1.0.5

View File

@@ -0,0 +1,238 @@
# Demo of a dialog system with conversation management
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)
![Python](https://img.shields.io/badge/language-Python-blue)
![Node.js](https://img.shields.io/badge/node.js-v23.9.0-green)
![React](https://img.shields.io/badge/react-v19.1.0-green)
This sample shows how to build a dialog system within the AgentScope Runtime framework.
It contains the following features:
- User authentication
- Conversation management: a user can start a new conversation or continue a previous one.
- Conversation storage in SQLite.
- Agent deployment management: the agent is deployed as a service.
<img src="assets/screenshot4.jpg" alt="screenshot4" width="30%">
<img src="assets/screenshot2.jpg" alt="screenshot2" width="30%">
<img src="assets/screenshot3.jpg" alt="screenshot3" width="30%">
## 🌳 Project Structure
```bash
├── backend # Backend directory, contains server-side scripts and logic
│ ├── agent_server.py # Script implementing agent-related server functionalities
│ └── web_server.py # Script acting as the web server, handling HTTP requests
├── frontend # Frontend directory, contains client-side code and resources
│ ├── public # Public folder, used for storing static files that are directly served
│ │ ├── index.html # Entry HTML file for the frontend app
│ │ └── manifest.json # Manifest file describing the web app's metadata, such as name and icons
│ ├── src # Source code folder, contains React components and associated files
│ │ ├── App.css # Stylesheet for the main app component
│ │ ├── App.jsx # JavaScript file for the main app component, written in JSX for React
│ │ ├── App.test.js # Test file for the App component, used for unit testing
│ │ ├── index.css # Global stylesheet affecting the overall appearance of the application
│ │ ├── index.js # Entry point for the React application, renders content into `index.html`
│ │ ├── reportWebVitals.js # Script for reporting web performance metrics
│ │ └── setupTests.js # Configuration file for setting up tests, typically using a testing library
│ ├── package.json # Project dependencies file, lists all npm dependencies and scripts
│ ├── postcss.config.js # Configuration file for PostCSS, used to process CSS with plugins
│ └── tailwind.config.js # Configuration file for Tailwind CSS, customizing styles and themes
└── README.md # Project documentation file, provides basic information and usage instructions
```
## 📖 Overview
This demo demonstrates how to build a chatbot with conversation management using AgentScope Runtime. It includes features such as:
- Multi-user chat support
- Session management
- Real-time messaging
- Local deployment capabilities
The implementation separates concerns between agent logic (backend) and user interface (frontend) for better maintainability.
## ⚙️ Components
### Backend
- `agent_server.py`: Implements the chatbot agent logic and conversation management
- `web_server.py`: Provides web service endpoints for frontend communication
### Frontend
- React-based chat interface
- Tailwind CSS for styling
- Real-time message updates
- Multi-user session support
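Real-time message updates reach the frontend as Server-Sent Events streamed by the backend. A minimal, self-contained sketch of splitting one SSE line into its field and value, mirroring the `parse_sse_line` helper in `web_server.py` (a simplification, not a full SSE parser):

```python
from typing import Optional, Tuple, Union


def parse_sse_line(line: bytes) -> Tuple[Optional[str], Optional[Union[str, int]]]:
    """Split one Server-Sent Events line into (field, value).

    Returns (None, None) for comments and lines that are not SSE fields.
    """
    text = line.decode("utf-8").strip()
    for field in ("data", "event", "id", "retry"):
        prefix = field + ": "
        if text.startswith(prefix):
            value = text[len(prefix):]
            # The retry field carries a reconnection delay in milliseconds.
            return field, int(value) if field == "retry" else value
    return None, None
```

For example, `parse_sse_line(b'data: hello')` yields `("data", "hello")`, which the backend then JSON-decodes to extract the text delta.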
## 🌵Architecture
The architecture of the demo is depicted in the following diagram:
```mermaid
graph TD;
U[User]
subgraph frontend[Frontend]
FLI[handleLogin]
FLO[handleLogout]
FC[createNewConversation]
FL[loadConversation]
FCS[fetchConversations]
FS[sendMessage]
end
subgraph backend[Backend]
subgraph WS[web_server]
FCS<-->|/api/users/user_id/conversations:GET|WGUC[get_user_conversations]
FL <-->|/api/conversations/conversation_id:GET|WGC[get_conversation]
FLI<-->|/api/login:POST|WLI[login]
FC<-->|/api/users/user_id/conversations:POST|WCC[create_conversation]
FS<-->|/api/conversations/conversation_id/messages:POST|WSM[send_message]
end
C((Conversation))
WS<-->DB[SQLite]
WS <-->C
WS <-->UU((User_id))
subgraph AS[agent_service]
ALM[LLMAgent]
ALD[LocalDeployManager]
ASS[InMemorySessionHistoryService]
end
WSM <--> AS
end
U<-->|Request|frontend
```
## 🚃 Dataflow
```mermaid
flowchart LR
A[User Access Application] --> B{Is User Logged In?}
B -->|No| C[Show Login Page]
C --> D[Enter Username/Password]
D --> E[Submit Login Request]
E --> F[Backend Validates Credentials]
F -->|Valid| G[Return User Data]
G --> H[Fetch User Conversations]
H --> I[Display Chat Interface]
F -->|Invalid| J[Show Error Message]
B -->|Yes| I
I --> K{Select Conversation?}
K -->|Create New| L[Create New Conversation]
L --> M[Add Welcome Message]
M --> N[Update Conversation List]
K -->|Select Existing| O[Load Conversation]
O --> P[Fetch Messages]
P --> Q[Display Messages]
Q --> R[Type Message]
R --> S[Send Message]
S --> T[Save User Message]
T --> U[Update UI with User Message]
U --> V[Call AI Service]
V --> W[Process AI Response]
W --> X[Save AI Response]
X --> Y[Update UI with AI Response]
I --> Z[Logout]
Z --> A
style A fill:#FFE4B5
style B fill:#87CEEB
style C fill:#DDA0DD
style F fill:#98FB98
style I fill:#FFA07A
style S fill:#FFD700
style V fill:#87CEFA
```
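The "Backend Validates Credentials" step never compares plain-text passwords: the backend stores salted hashes via werkzeug's `generate_password_hash` / `check_password_hash`. The same idea, sketched with only the standard library (an illustration of the technique, not the backend's actual scheme):

```python
import hashlib
import hmac
import os
from typing import Tuple


def hash_password(password: str, salt: bytes = b"") -> Tuple[bytes, bytes]:
    """Derive a salted PBKDF2 digest; a fresh random salt is drawn if none is given."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return hmac.compare_digest(candidate, digest)
```

Only `(salt, digest)` pairs are stored, so a leaked database does not directly reveal passwords.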
## 🚀 Getting Started
### Prerequisites
- Python 3.11+
- Node.js
- DashScope API key: you can apply for one at https://dashscope.console.aliyun.com/.
### Install
#### Prepare the database and env
Copy the database file `ai_assistant.db`.
```bash
cd backend
cp ai_assistant_example.db ai_assistant.db
```
You can modify the database file according to your needs.
It contains two initial accounts: user1 and user2.
Copy the `.env.template` to `.env`
```bash
cp .env.template .env
```
Set `DASHSCOPE_API_KEY` to your DashScope API key.
#### Install the python packages
```bash
pip install -r requirements.txt
```
#### Install the npm packages
```bash
cd ..
cd frontend
npm install
cd ..
```
### Run
#### Run the agent server
Open a terminal and run the agent server.
```bash
cd backend
python agent_server.py
```
It will listen on port 8090 by default (configurable via `SERVER_PORT`).
#### Run the web server
Open another terminal and run the web server
```bash
python web_server.py
```
It will listen on port 5100.
#### Run the frontend
Open another terminal and run the frontend.
```bash
cd frontend
npm run start
```
It will listen on port 3000. Open your browser and go to http://localhost:3000.
### Usage
1. Log in with an initial account, e.g. user1 with password password123.
2. (Optional) Select a conversation or create a new one.
3. Type a message in the input box (e.g. "what is your name") and click the "Send" button.
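The same flow can also be driven without the frontend by posting to the web server's REST endpoints directly. A sketch using only the standard library, assuming the web server from the previous step is running on port 5100:

```python
import json
from urllib import request

BASE = "http://localhost:5100"  # web server address (assumed from the Run steps above)


def login_payload(username: str, password: str) -> bytes:
    """Build the JSON body expected by POST /api/login."""
    return json.dumps({"username": username, "password": password}).encode("utf-8")


def login(username: str, password: str) -> dict:
    """Log in and return the user record (id, username, name, created_at)."""
    req = request.Request(
        f"{BASE}/api/login",
        data=login_payload(username, password),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


# With the servers running: user = login("user1", "password123"), then a GET to
# f"{BASE}/api/users/{user['id']}/conversations" lists that user's conversations.
```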
## 🛠️ Features
- Local deployment capabilities
- Multi-user support
- Session management
- Real-time chat interface
- Tailwind CSS styling
## Getting Help
If you have any questions or encounter any problems with this demo, please report them through [GitHub issues]().
## 📄 License
This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
## 🍬 Disclaimers
This is not an officially supported product. This project is intended for demonstration purposes only and is not suitable for production use.

View File

@@ -0,0 +1,6 @@
DASHSCOPE_API_KEY=
DASHSCOPE_BASE_URL=
SERVER_PORT=8080
SERVER_ENDPOINT=agent
SERVER_HOST=localhost
USER_MANAGER_STORAGE=user.json

View File

@@ -0,0 +1,69 @@
# -*- coding: utf-8 -*-
import asyncio
import os
from agentscope_runtime.engine import LocalDeployManager, Runner
from agentscope_runtime.engine.agents.llm_agent import LLMAgent
from agentscope_runtime.engine.llms import QwenLLM
from agentscope_runtime.engine.services.context_manager import ContextManager
from agentscope_runtime.engine.services.session_history_service import (
InMemorySessionHistoryService,
)
def local_deploy():
asyncio.run(_local_deploy())
async def _local_deploy():
from dotenv import load_dotenv
load_dotenv()
server_port = int(os.environ.get("SERVER_PORT", "8090"))
server_endpoint = os.environ.get("SERVER_ENDPOINT", "agent")
llm_agent = LLMAgent(
model=QwenLLM(),
name="llm_agent",
description="A simple LLM agent that generates short responses.",
)
session_history_service = InMemorySessionHistoryService()
context_manager = ContextManager(
session_history_service=session_history_service,
)
runner = Runner(
agent=llm_agent,
context_manager=context_manager,
)
deploy_manager = LocalDeployManager(host="localhost", port=server_port)
try:
deployment_info = await runner.deploy(
deploy_manager,
endpoint_path=f"/{server_endpoint}",
)
print("✅ Service deployed successfully!")
print(f" URL: {deployment_info['url']}")
print(f" Endpoint: {deployment_info['url']}/{server_endpoint}")
print("\nAgent Service is running in the background.")
while True:
await asyncio.sleep(1)
except (KeyboardInterrupt, asyncio.CancelledError):
# This block will be executed when you press Ctrl+C.
print("\nShutdown signal received. Stopping the service...")
if deploy_manager.is_running:
await deploy_manager.stop()
print("✅ Service stopped.")
except Exception as e:
print(f"An error occurred: {e}")
if deploy_manager.is_running:
await deploy_manager.stop()
if __name__ == "__main__":
local_deploy()

View File

@@ -0,0 +1,5 @@
flask>=3.1.2
flask_cors>=6.0.1
agentscope-runtime>=0.1.5
agentscope-runtime[agentscope]
flask_sqlalchemy>=3.1.1

View File

@@ -0,0 +1,467 @@
# -*- coding: utf-8 -*-
import json
import logging
import os
from datetime import datetime
from typing import Tuple, Optional, Union, Dict, Any, Generator
import requests
from dotenv import load_dotenv
from flask import Flask, jsonify, request
from flask_cors import CORS
from flask_sqlalchemy import SQLAlchemy
from werkzeug.security import check_password_hash, generate_password_hash
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)
load_dotenv()
app = Flask(__name__)
CORS(app, resources={r"/*": {"origins": "*"}})
# Configure database
basedir = os.path.abspath(os.path.dirname(__file__))
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///" + os.path.join(
basedir,
"ai_assistant.db",
)
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False
db: SQLAlchemy = SQLAlchemy(app)
# Database models
class User(db.Model):
id = db.Column(db.Integer, primary_key=True)
username = db.Column(db.String(80), unique=True, nullable=False)
password_hash = db.Column(db.String(120), nullable=False)
name = db.Column(db.String(100), nullable=False)
created_at = db.Column(db.DateTime, default=datetime.utcnow)
# Relationships
conversations = db.relationship(
"Conversation",
backref="user",
lazy=True,
cascade="all, delete-orphan",
)
def set_password(self, password: str) -> None:
self.password_hash = generate_password_hash(password)
def check_password(self, password: str) -> bool:
return check_password_hash(self.password_hash, password)
class Conversation(db.Model):
id = db.Column(db.Integer, primary_key=True)
title = db.Column(db.String(200), nullable=False)
user_id = db.Column(db.Integer, db.ForeignKey("user.id"), nullable=False)
created_at = db.Column(db.DateTime, default=datetime.utcnow)
updated_at = db.Column(
db.DateTime,
default=datetime.utcnow,
onupdate=datetime.utcnow,
)
# Relationships
messages = db.relationship(
"Message",
backref="conversation",
lazy=True,
cascade="all, delete-orphan",
)
class Message(db.Model):
id = db.Column(db.Integer, primary_key=True)
text = db.Column(db.Text, nullable=False)
sender = db.Column(db.String(20), nullable=False) # 'user' or 'ai'
conversation_id = db.Column(
db.Integer,
db.ForeignKey("conversation.id"),
nullable=False,
)
created_at = db.Column(db.DateTime, default=datetime.utcnow)
# Create database tables
def create_tables() -> None:
db.create_all()
# Create sample users (if none exist)
if not User.query.first():
user1 = User(username="user1", name="Bruce")
user1.set_password("password123")
user2 = User(username="user2", name="John")
user2.set_password("password456")
db.session.add(user1)
db.session.add(user2)
db.session.commit()
# functions
def parse_sse_line(
line: bytes,
) -> Tuple[Optional[str], Optional[Union[str, int]]]:
decoded = line.decode("utf-8").strip()
if decoded.startswith("data: "):
return "data", decoded[6:]
elif decoded.startswith("event: "):
return "event", decoded[7:]
elif decoded.startswith("id: "):
return "id", decoded[4:]
elif decoded.startswith("retry: "):
return "retry", int(decoded[7:])
return None, None
def sse_client(
url: str,
data: Optional[Dict[str, Any]] = None,
) -> Generator[str, None, None]:
headers = {
"Accept": "text/event-stream",
"Cache-Control": "no-cache",
}
if data is not None:
response = requests.post(
url,
stream=True,
headers=headers,
json=data,
)
else:
response = requests.get(
url,
stream=True,
headers=headers,
)
for line in response.iter_lines():
if line:
field, value = parse_sse_line(line)
if field == "data":
try:
# Use a distinct name so the function's `data` argument
# is not shadowed by the parsed payload.
payload = json.loads(value)
if (
payload["object"] == "content"
and payload["delta"] is True
and payload["type"] == "text"
):
yield payload["text"]
except json.JSONDecodeError:
pass
def call_runner(
query: str,
query_user_id: str,
query_session_id: str,
) -> Generator[str, None, None]:
server_port = int(os.environ.get("SERVER_PORT", "8090"))
server_endpoint = os.environ.get("SERVER_ENDPOINT", "agent")
server_host = os.environ.get("SERVER_HOST", "localhost")
url = f"http://{server_host}:{server_port}/{server_endpoint}"
data_arg: Dict[str, Any] = {
"input": [
{
"role": "user",
"content": [
{
"type": "text",
"text": query,
},
],
},
],
"session_id": query_session_id,
"user_id": query_user_id,
}
for content in sse_client(url, data=data_arg):
yield content
# API routes
# User login
@app.route("/api/login", methods=["POST"])
def login():
data = request.get_json()
username = data.get("username")
password = data.get("password")
if not username or not password:
return jsonify({"error": "Username and password cannot be empty"}), 400
user = User.query.filter_by(username=username).first()
if user and user.check_password(password):
return (
jsonify(
{
"id": user.id,
"username": user.username,
"name": user.name,
"created_at": user.created_at.isoformat(),
},
),
200,
)
else:
return jsonify({"error": "Invalid username or password"}), 401
# Get all user conversations
@app.route("/api/users/<int:user_id>/conversations", methods=["GET"])
def get_user_conversations(user_id):
User.query.get_or_404(user_id)
conversations = (
Conversation.query.filter_by(user_id=user_id)
.order_by(
Conversation.updated_at.desc(),
)
.all()
)
result = []
for conv in conversations:
# Get the last message as preview
last_message = (
Message.query.filter_by(
conversation_id=conv.id,
)
.order_by(
Message.created_at.desc(),
)
.first()
)
preview = last_message.text if last_message else ""
result.append(
{
"id": conv.id,
"title": conv.title,
"user_id": conv.user_id,
"preview": preview,
"created_at": conv.created_at.isoformat(),
"updated_at": conv.updated_at.isoformat(),
},
)
return jsonify(result), 200
# Create new conversation
@app.route("/api/users/<int:user_id>/conversations", methods=["POST"])
def create_conversation(user_id):
User.query.get_or_404(user_id)
data = request.get_json()
title = data.get(
"title",
f'Conversation {datetime.now().strftime("%Y-%m-%d %H:%M")}',
)
conversation = Conversation(title=title, user_id=user_id)
db.session.add(conversation)
db.session.commit()
# Create welcome message
welcome_message = Message(
text="Hello! I am your AI assistant. How can I help you today?",
sender="ai",
conversation_id=conversation.id,
)
db.session.add(welcome_message)
db.session.commit()
return (
jsonify(
{
"id": conversation.id,
"title": conversation.title,
"user_id": conversation.user_id,
"created_at": conversation.created_at.isoformat(),
"updated_at": conversation.updated_at.isoformat(),
},
),
201,
)
# Get conversation details and messages
@app.route("/api/conversations/<int:conversation_id>", methods=["GET"])
def get_conversation(conversation_id):
conversation = Conversation.query.get_or_404(conversation_id)
messages = (
Message.query.filter_by(
conversation_id=conversation_id,
)
.order_by(
Message.created_at.asc(),
)
.all()
)
messages_data = []
for msg in messages:
messages_data.append(
{
"id": msg.id,
"text": msg.text,
"sender": msg.sender,
"created_at": msg.created_at.isoformat(),
},
)
return (
jsonify(
{
"id": conversation.id,
"title": conversation.title,
"user_id": conversation.user_id,
"messages": messages_data,
"created_at": conversation.created_at.isoformat(),
"updated_at": conversation.updated_at.isoformat(),
},
),
200,
)
# Send message
@app.route(
"/api/conversations/<int:conversation_id>/messages",
methods=["POST"],
)
def send_message(conversation_id):
conversation = Conversation.query.get_or_404(conversation_id)
data = request.get_json()
text = data.get("text")
sender = data.get("sender", "user")
if not text:
return jsonify({"error": "Message content cannot be empty"}), 400
    # Count existing messages before adding the new one, so the first
    # user message can be detected reliably (the conversation starts
    # with a single AI welcome message, and the session autoflushes
    # pending objects when the relationship is accessed).
    existing_count = Message.query.filter_by(
        conversation_id=conversation_id,
    ).count()
    # Create user message
    user_message = Message(
        text=text,
        sender=sender,
        conversation_id=conversation_id,
    )
    db.session.add(user_message)
    # Update conversation title (if it's the first user message)
    if sender == "user" and existing_count <= 1:
        conversation.title = text[:20] + ("..." if len(text) > 20 else "")
    db.session.commit()
if sender == "user":
ai_response_text = ""
question = text
conversation_id_str = str(conversation_id)
for item in call_runner(
question,
conversation_id_str,
conversation_id_str,
):
ai_response_text += item
ai_message = Message(
text=ai_response_text,
sender="ai",
conversation_id=conversation_id,
)
db.session.add(ai_message)
db.session.commit()
return (
jsonify(
{
"id": user_message.id,
"text": user_message.text,
"sender": user_message.sender,
"created_at": user_message.created_at.isoformat(),
},
),
201,
)
# Delete conversation
@app.route("/api/conversations/<int:conversation_id>", methods=["DELETE"])
def delete_conversation(conversation_id):
conversation = Conversation.query.get_or_404(conversation_id)
db.session.delete(conversation)
db.session.commit()
return jsonify({"message": "Conversation deleted successfully"}), 200
# Update conversation title
@app.route("/api/conversations/<int:conversation_id>", methods=["PUT"])
def update_conversation(conversation_id):
conversation = Conversation.query.get_or_404(conversation_id)
data = request.get_json()
if "title" in data:
conversation.title = data["title"]
db.session.commit()
return (
jsonify(
{
"id": conversation.id,
"title": conversation.title,
"user_id": conversation.user_id,
"created_at": conversation.created_at.isoformat(),
"updated_at": conversation.updated_at.isoformat(),
},
),
200,
)
# Get user information
@app.route("/api/users/<int:user_id>", methods=["GET"])
def get_user(user_id):
user = User.query.get_or_404(user_id)
return (
jsonify(
{
"id": user.id,
"username": user.username,
"name": user.name,
"created_at": user.created_at.isoformat(),
},
),
200,
)
# Error handling
@app.errorhandler(404)
def not_found(error):
logger.error(error)
return jsonify({"error": "Resource not found"}), 404
@app.errorhandler(500)
def internal_error(error):
logger.error(error)
db.session.rollback()
return jsonify({"error": "Internal server error"}), 500
if __name__ == "__main__":
app.run(debug=True, host="0.0.0.0", port=5100)


@@ -0,0 +1,39 @@
{
"name": "frontend",
"version": "0.1.0",
"private": true,
"dependencies": {
"@tailwindcss/postcss": "^4.1.11",
"@testing-library/dom": "^10.4.0",
"@testing-library/jest-dom": "^6.6.3",
"@testing-library/react": "^16.3.0",
"@testing-library/user-event": "^13.5.0",
"antd": "^5.26.6",
"autoprefixer": "^10.4.21",
"lucide-react": "^0.525.0",
"postcss": "^8.5.6",
"react": "^19.1.0",
"react-dom": "^19.1.0",
"react-scripts": "5.0.1",
"tailwindcss": "^3.4.17",
"web-vitals": "^2.1.4"
},
"scripts": {
"start": "react-scripts start",
"build": "react-scripts build",
"test": "react-scripts test",
"eject": "react-scripts eject"
},
"browserslist": {
"production": [
">0.2%",
"not dead",
"not op_mini all"
],
"development": [
"last 1 chrome version",
"last 1 firefox version",
"last 1 safari version"
]
}
}


@@ -0,0 +1,6 @@
module.exports = {
plugins: {
tailwindcss: {},
autoprefixer: {},
},
};

Binary file not shown (image, 17 KiB).


@@ -0,0 +1,20 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<link rel="icon" href="%PUBLIC_URL%/favicon.ico" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="theme-color" content="#000000" />
<meta
name="description"
content="Chatbot demo"
/>
<link rel="apple-touch-icon" href="%PUBLIC_URL%/logo192.png" />
<link rel="manifest" href="%PUBLIC_URL%/manifest.json" />
<title>Chatbot Demo</title>
</head>
<body>
<noscript>You need to enable JavaScript to run this app.</noscript>
<div id="root"></div>
</body>
</html>

Binary file not shown (image, 24 KiB).

Binary file not shown (image, 257 KiB).


@@ -0,0 +1,25 @@
{
"short_name": "Chatbot",
"name": "Chatbot Demo",
"icons": [
{
"src": "favicon.ico",
"sizes": "64x64 32x32 24x24 16x16",
"type": "image/x-icon"
},
{
"src": "logo192.png",
"type": "image/png",
"sizes": "192x192"
},
{
"src": "logo512.png",
"type": "image/png",
"sizes": "512x512"
}
],
"start_url": ".",
"display": "standalone",
"theme_color": "#000000",
"background_color": "#ffffff"
}


@@ -0,0 +1,60 @@
@tailwind base;
@tailwind components;
@tailwind utilities;
body {
margin: 0;
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Roboto", "Oxygen",
"Ubuntu", "Cantarell", "Fira Sans", "Droid Sans", "Helvetica Neue",
sans-serif;
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
background-color: #f8fafc;
}
code {
font-family: source-code-pro, Menlo, Monaco, Consolas, "Courier New",
monospace;
}
/* Ant Design overrides for a fresh, light style */
.ant-tabs-tab {
font-weight: 500;
}
.ant-list-item {
background: white;
border-radius: 8px;
margin-bottom: 8px;
padding: 16px;
box-shadow: 0 1px 3px rgba(0, 0, 0, 0.05);
transition: all 0.2s ease;
}
.ant-list-item:hover {
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.07);
transform: translateY(-1px);
}
.ant-card {
border-radius: 8px;
box-shadow: 0 1px 3px rgba(0, 0, 0, 0.05);
}
.ant-btn-primary {
background: linear-gradient(135deg, #0ea5e9 0%, #0284c7 100%);
border: none;
}
.ant-btn-primary:hover {
background: linear-gradient(135deg, #0284c7 0%, #0369a1 100%);
}
.ant-progress-inner {
border-radius: 4px;
}
.ant-tag {
border-radius: 12px;
padding: 0 12px;
}


@@ -0,0 +1,391 @@
import React, { useState, useEffect, useRef } from 'react';
import { MessageCircle, User, Send, Plus, LogOut, Menu, X, Bot } from 'lucide-react';
const App = () => {
const [isLoggedIn, setIsLoggedIn] = useState(false);
const [username, setUsername] = useState('');
const [password, setPassword] = useState('');
const [currentUser, setCurrentUser] = useState(null);
const [conversations, setConversations] = useState([]);
const [activeConversation, setActiveConversation] = useState(null);
const [message, setMessage] = useState('');
const [isMenuOpen, setIsMenuOpen] = useState(false);
const [loading, setLoading] = useState(false);
const messagesEndRef = useRef(null);
// API base URL
const API_BASE = 'http://localhost:5100/api';
// Auto scroll to bottom of messages
useEffect(() => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
}, [activeConversation?.messages]);
// Fetch user conversations
const fetchConversations = async (userId) => {
try {
const response = await fetch(`${API_BASE}/users/${userId}/conversations`);
if (response.ok) {
const data = await response.json();
setConversations(data);
if (data.length > 0 && !activeConversation) {
// Load the first conversation
loadConversation(data[0].id);
}
}
} catch (error) {
console.error('Error fetching conversations:', error);
}
};
// Load conversation details
const loadConversation = async (conversationId) => {
try {
const response = await fetch(`${API_BASE}/conversations/${conversationId}`);
if (response.ok) {
const data = await response.json();
setActiveConversation(data);
}
} catch (error) {
console.error('Error loading conversation:', error);
}
};
// Login function
const handleLogin = async (e) => {
e.preventDefault();
setLoading(true);
try {
const response = await fetch(`${API_BASE}/login`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ username, password }),
});
if (response.ok) {
const userData = await response.json();
setCurrentUser(userData);
setIsLoggedIn(true);
await fetchConversations(userData.id);
} else {
const errorData = await response.json();
alert(errorData.error || 'Login failed');
}
} catch (error) {
console.error('Login error:', error);
alert('Network error. Please try again.');
} finally {
setLoading(false);
}
};
// Logout function
const handleLogout = () => {
setIsLoggedIn(false);
setCurrentUser(null);
setUsername('');
setPassword('');
setConversations([]);
setActiveConversation(null);
setIsMenuOpen(false);
};
// Create new conversation
const createNewConversation = async () => {
if (!currentUser) return;
setLoading(true);
try {
const response = await fetch(`${API_BASE}/users/${currentUser.id}/conversations`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ title: 'New Conversation' }),
});
if (response.ok) {
const newConversation = await response.json();
setConversations(prev => [newConversation, ...prev]);
await loadConversation(newConversation.id);
}
} catch (error) {
console.error('Error creating conversation:', error);
} finally {
setLoading(false);
setIsMenuOpen(false);
}
};
// Send message
const sendMessage = async () => {
if (!message.trim() || !activeConversation) return;
setLoading(true);
try {
// Send user message
const userMessageResponse = await fetch(`${API_BASE}/conversations/${activeConversation.id}/messages`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ text: message, sender: 'user' }),
});
if (userMessageResponse.ok) {
const userMessage = await userMessageResponse.json();
// Update UI with user message
const updatedConversation = {
...activeConversation,
messages: [...activeConversation.messages, userMessage],
title: activeConversation.messages.length === 1 ? message.slice(0, 20) + (message.length > 20 ? '...' : '') : activeConversation.title
};
setActiveConversation(updatedConversation);
setMessage('');
// Fetch updated conversation to get AI response
await loadConversation(activeConversation.id);
}
} catch (error) {
console.error('Error sending message:', error);
} finally {
setLoading(false);
}
};
// Format timestamp
const formatTime = (timestamp) => {
return new Date(timestamp).toLocaleTimeString('en-US', {
hour: '2-digit',
minute: '2-digit'
});
};
// Login Page
if (!isLoggedIn) {
return (
<div className="min-h-screen bg-gradient-to-br from-blue-50 to-indigo-100 flex items-center justify-center p-4">
<div className="bg-white rounded-2xl shadow-xl w-full max-w-md p-8">
<div className="text-center mb-8">
<div className="mx-auto bg-indigo-100 rounded-full p-4 w-16 h-16 flex items-center justify-center mb-4">
<Bot className="w-8 h-8 text-indigo-600" />
</div>
<h1 className="text-3xl font-bold text-gray-800 mb-2">AI Assistant</h1>
<p className="text-gray-600">Intelligent conversations, always at your service</p>
</div>
<form onSubmit={handleLogin} className="space-y-6">
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Username</label>
<input
type="text"
value={username}
onChange={(e) => setUsername(e.target.value)}
className="w-full px-4 py-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-indigo-500 focus:border-transparent transition-all"
placeholder="Enter username"
required
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Password</label>
<input
type="password"
value={password}
onChange={(e) => setPassword(e.target.value)}
className="w-full px-4 py-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-indigo-500 focus:border-transparent transition-all"
placeholder="Enter password"
required
/>
</div>
<button
type="submit"
disabled={loading}
className="w-full bg-indigo-600 text-white py-3 rounded-lg font-medium hover:bg-indigo-700 transition-colors focus:ring-2 focus:ring-indigo-500 focus:ring-offset-2 disabled:opacity-50"
>
{loading ? 'Logging in...' : 'Login'}
</button>
</form>
<div className="mt-6 p-4 bg-gray-50 rounded-lg">
<p className="text-sm text-gray-600 mb-2">Demo accounts:</p>
<p className="text-xs text-gray-500">Username: user1, Password: password123</p>
<p className="text-xs text-gray-500">Username: user2, Password: password456</p>
</div>
</div>
</div>
);
}
// Main App
return (
<div className="h-screen bg-gray-50 flex flex-col">
{/* Header */}
<header className="bg-white shadow-sm border-b border-gray-200 px-4 py-3 flex items-center justify-between">
<div className="flex items-center space-x-3">
<button
onClick={() => setIsMenuOpen(true)}
className="p-2 rounded-lg hover:bg-gray-100 transition-colors"
>
<Menu className="w-5 h-5 text-gray-600" />
</button>
<div className="flex items-center space-x-2">
<div className="bg-indigo-100 rounded-full p-2">
<Bot className="w-5 h-5 text-indigo-600" />
</div>
<h1 className="text-lg font-semibold text-gray-800">AI Assistant</h1>
</div>
</div>
<button
onClick={handleLogout}
className="p-2 rounded-lg hover:bg-gray-100 transition-colors"
>
<LogOut className="w-5 h-5 text-gray-600" />
</button>
</header>
<div className="flex flex-1 overflow-hidden">
{/* Sidebar */}
{isMenuOpen && (
<div className="fixed inset-0 z-50 lg:relative lg:inset-auto lg:z-auto">
<div className="absolute inset-0 bg-black bg-opacity-50 lg:hidden" onClick={() => setIsMenuOpen(false)} />
<div className="absolute left-0 top-0 h-full w-80 bg-white shadow-xl lg:relative lg:shadow-none z-10">
<div className="p-4 border-b border-gray-200 flex items-center justify-between">
<h2 className="text-lg font-semibold text-gray-800">Conversations</h2>
<button
onClick={() => setIsMenuOpen(false)}
className="lg:hidden p-2 rounded-lg hover:bg-gray-100"
>
<X className="w-5 h-5 text-gray-600" />
</button>
</div>
<div className="p-4">
<button
onClick={createNewConversation}
disabled={loading}
className="w-full bg-indigo-600 text-white py-3 rounded-lg font-medium hover:bg-indigo-700 transition-colors flex items-center justify-center space-x-2 mb-4 disabled:opacity-50"
>
<Plus className="w-4 h-4" />
<span>New Conversation</span>
</button>
</div>
<div className="flex-1 overflow-y-auto">
{conversations.map((conversation) => (
<div
key={conversation.id}
onClick={() => {
loadConversation(conversation.id);
setIsMenuOpen(false);
}}
className={`p-4 border-b border-gray-100 cursor-pointer hover:bg-gray-50 transition-colors ${
activeConversation?.id === conversation.id ? 'bg-indigo-50 border-l-4 border-l-indigo-500' : ''
}`}
>
<div className="flex items-start space-x-3">
<MessageCircle className="w-5 h-5 text-gray-400 mt-0.5" />
<div className="flex-1 min-w-0">
<h3 className="font-medium text-gray-900 truncate">{conversation.title}</h3>
<p className="text-sm text-gray-500 truncate">
{conversation.preview || 'New conversation'}
</p>
</div>
</div>
</div>
))}
</div>
<div className="p-4 border-t border-gray-200">
<div className="flex items-center space-x-3">
<div className="bg-gray-200 rounded-full p-2">
<User className="w-4 h-4 text-gray-600" />
</div>
<div>
<p className="font-medium text-gray-900">{currentUser?.name}</p>
<p className="text-sm text-gray-500">@{currentUser?.username}</p>
</div>
</div>
</div>
</div>
</div>
)}
{/* Main Chat Area */}
<div className="flex-1 flex flex-col">
{activeConversation ? (
<>
{/* Chat Header */}
<div className="bg-white border-b border-gray-200 px-4 py-3">
<h2 className="font-semibold text-gray-800">{activeConversation.title}</h2>
</div>
{/* Messages */}
<div className="flex-1 overflow-y-auto p-4 space-y-4">
{activeConversation.messages.map((msg) => (
<div
key={msg.id}
className={`flex ${msg.sender === 'user' ? 'justify-end' : 'justify-start'}`}
>
<div
className={`max-w-xs lg:max-w-md px-4 py-3 rounded-2xl ${
msg.sender === 'user'
? 'bg-indigo-600 text-white rounded-br-md'
: 'bg-white text-gray-800 border border-gray-200 rounded-bl-md shadow-sm'
}`}
>
<p className="text-sm">{msg.text}</p>
<p className={`text-xs mt-1 ${msg.sender === 'user' ? 'text-indigo-100' : 'text-gray-500'}`}>
{formatTime(msg.created_at)}
</p>
</div>
</div>
))}
<div ref={messagesEndRef} />
</div>
{/* Input Area */}
<div className="bg-white border-t border-gray-200 p-4">
<div className="flex items-center space-x-3">
<input
type="text"
value={message}
onChange={(e) => setMessage(e.target.value)}
onKeyDown={(e) => e.key === 'Enter' && !loading && sendMessage()}
disabled={loading}
className="flex-1 px-4 py-3 border border-gray-300 rounded-full focus:ring-2 focus:ring-indigo-500 focus:border-transparent transition-all disabled:opacity-50"
placeholder="Type a message..."
/>
<button
onClick={sendMessage}
disabled={!message.trim() || loading}
className="bg-indigo-600 text-white p-3 rounded-full hover:bg-indigo-700 transition-colors disabled:opacity-50 disabled:cursor-not-allowed focus:ring-2 focus:ring-indigo-500 focus:ring-offset-2"
>
{loading ? (
<div className="w-5 h-5 border-2 border-white border-t-transparent rounded-full animate-spin" />
) : (
<Send className="w-5 h-5" />
)}
</button>
</div>
</div>
</>
) : (
<div className="flex-1 flex items-center justify-center">
<div className="text-center">
<MessageCircle className="w-16 h-16 text-gray-300 mx-auto mb-4" />
<h3 className="text-xl font-medium text-gray-600 mb-2">Select or Create a Conversation</h3>
<p className="text-gray-500">Choose an existing conversation or create a new one to get started</p>
</div>
</div>
)}
</div>
</div>
</div>
);
};
export default App;


@@ -0,0 +1,8 @@
import { render, screen } from "@testing-library/react";
import App from "./App";
test("renders the login page heading", () => {
  render(<App />);
  const heading = screen.getByRole("heading", { name: /AI Assistant/i });
  expect(heading).toBeInTheDocument();
});


@@ -0,0 +1,27 @@
@tailwind base;
@tailwind components;
@tailwind utilities;
body {
margin: 0;
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Roboto", "Oxygen",
"Ubuntu", "Cantarell", "Fira Sans", "Droid Sans", "Helvetica Neue",
sans-serif;
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
* {
box-sizing: border-box;
}
/* Hide scrollbar for Chrome, Safari and Opera */
.no-scrollbar::-webkit-scrollbar {
display: none;
}
/* Hide scrollbar for IE, Edge and Firefox */
.no-scrollbar {
-ms-overflow-style: none; /* IE and Edge */
scrollbar-width: none; /* Firefox */
}


@@ -0,0 +1,12 @@
import React from "react";
import ReactDOM from "react-dom/client";
import App from "./App.jsx";
import "./index.css";
const root = ReactDOM.createRoot(document.getElementById("root"));
root.render(
<React.StrictMode>
<App />
</React.StrictMode>
);


@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 841.9 595.3"><g fill="#61DAFB"><path d="M666.3 296.5c0-32.5-40.7-63.3-103.1-82.4 14.4-63.6 8-114.2-20.2-130.4-6.5-3.8-14.1-5.6-22.4-5.6v22.3c4.6 0 8.3.9 11.4 2.6 13.6 7.8 19.5 37.5 14.9 75.7-1.1 9.4-2.9 19.3-5.1 29.4-19.6-4.8-41-8.5-63.5-10.9-13.5-18.5-27.5-35.3-41.6-50 32.6-30.3 63.2-46.9 84-46.9V78c-27.5 0-63.5 19.6-99.9 53.6-36.4-33.8-72.4-53.2-99.9-53.2v22.3c20.7 0 51.4 16.5 84 46.6-14 14.7-28 31.4-41.3 49.9-22.6 2.4-44 6.1-63.6 11-2.3-10-4-19.7-5.2-29-4.7-38.2 1.1-67.9 14.6-75.8 3-1.8 6.9-2.6 11.5-2.6V78.5c-8.4 0-16 1.8-22.6 5.6-28.1 16.2-34.4 66.7-19.9 130.1-62.2 19.2-102.7 49.9-102.7 82.3 0 32.5 40.7 63.3 103.1 82.4-14.4 63.6-8 114.2 20.2 130.4 6.5 3.8 14.1 5.6 22.5 5.6 27.5 0 63.5-19.6 99.9-53.6 36.4 33.8 72.4 53.2 99.9 53.2 8.4 0 16-1.8 22.6-5.6 28.1-16.2 34.4-66.7 19.9-130.1 62-19.1 102.5-49.9 102.5-82.3zm-130.2-66.7c-3.7 12.9-8.3 26.2-13.5 39.5-4.1-8-8.4-16-13.1-24-4.6-8-9.5-15.8-14.4-23.4 14.2 2.1 27.9 4.7 41 7.9zm-45.8 106.5c-7.8 13.5-15.8 26.3-24.1 38.2-14.9 1.3-30 2-45.2 2-15.1 0-30.2-.7-45-1.9-8.3-11.9-16.4-24.6-24.2-38-7.6-13.1-14.5-26.4-20.8-39.8 6.2-13.4 13.2-26.8 20.7-39.9 7.8-13.5 15.8-26.3 24.1-38.2 14.9-1.3 30-2 45.2-2 15.1 0 30.2.7 45 1.9 8.3 11.9 16.4 24.6 24.2 38 7.6 13.1 14.5 26.4 20.8 39.8-6.3 13.4-13.2 26.8-20.7 39.9zm32.3-13c5.4 13.4 10 26.8 13.8 39.8-13.1 3.2-26.9 5.9-41.2 8 4.9-7.7 9.8-15.6 14.4-23.7 4.6-8 8.9-16.1 13-24.1zM421.2 430c-9.3-9.6-18.6-20.3-27.8-32 9 .4 18.2.7 27.5.7 9.4 0 18.7-.2 27.8-.7-9 11.7-18.3 22.4-27.5 32zm-74.4-58.9c-14.2-2.1-27.9-4.7-41-7.9 3.7-12.9 8.3-26.2 13.5-39.5 4.1 8 8.4 16 13.1 24 4.7 8 9.5 15.8 14.4 23.4zM420.7 163c9.3 9.6 18.6 20.3 27.8 32-9-.4-18.2-.7-27.5-.7-9.4 0-18.7.2-27.8.7 9-11.7 18.3-22.4 27.5-32zm-74 58.9c-4.9 7.7-9.8 15.6-14.4 23.7-4.6 8-8.9 16-13 24-5.4-13.4-10-26.8-13.8-39.8 13.1-3.1 26.9-5.8 41.2-7.9zm-90.5 125.2c-35.4-15.1-58.3-34.9-58.3-50.6 0-15.7 22.9-35.6 58.3-50.6 8.6-3.7 18-7 27.7-10.1 5.7 19.6 13.2 40 22.5 60.9-9.2 20.8-16.6 
41.1-22.2 60.6-9.9-3.1-19.3-6.5-28-10.2zM310 490c-13.6-7.8-19.5-37.5-14.9-75.7 1.1-9.4 2.9-19.3 5.1-29.4 19.6 4.8 41 8.5 63.5 10.9 13.5 18.5 27.5 35.3 41.6 50-32.6 30.3-63.2 46.9-84 46.9-4.5-.1-8.3-1-11.3-2.7zm237.2-76.2c4.7 38.2-1.1 67.9-14.6 75.8-3 1.8-6.9 2.6-11.5 2.6-20.7 0-51.4-16.5-84-46.6 14-14.7 28-31.4 41.3-49.9 22.6-2.4 44-6.1 63.6-11 2.3 10.1 4.1 19.8 5.2 29.1zm38.5-66.7c-8.6 3.7-18 7-27.7 10.1-5.7-19.6-13.2-40-22.5-60.9 9.2-20.8 16.6-41.1 22.2-60.6 9.9 3.1 19.3 6.5 28.1 10.2 35.4 15.1 58.3 34.9 58.3 50.6-.1 15.7-23 35.6-58.4 50.6zM320.8 78.4z"/><circle cx="420.9" cy="296.5" r="45.7"/><path d="M520.5 78.1z"/></g></svg>



@@ -0,0 +1,13 @@
const reportWebVitals = onPerfEntry => {
if (onPerfEntry && onPerfEntry instanceof Function) {
import("web-vitals").then(({ getCLS, getFID, getFCP, getLCP, getTTFB }) => {
getCLS(onPerfEntry);
getFID(onPerfEntry);
getFCP(onPerfEntry);
getLCP(onPerfEntry);
getTTFB(onPerfEntry);
});
}
};
export default reportWebVitals;


@@ -0,0 +1,5 @@
// jest-dom adds custom jest matchers for asserting on DOM nodes.
// allows you to do things like:
// expect(element).toHaveTextContent(/react/i)
// learn more: https://github.com/testing-library/jest-dom
import "@testing-library/jest-dom";


@@ -0,0 +1,26 @@
/** @type {import('tailwindcss').Config} */
module.exports = {
content: [
"./src/**/*.{js,jsx,ts,tsx}",
    "./public/index.html"
],
theme: {
extend: {
colors: {
primary: {
50: "#f0f9ff",
100: "#e0f2fe",
200: "#bae6fd",
300: "#7dd3fc",
400: "#38bdf8",
500: "#0ea5e9",
600: "#0284c7",
700: "#0369a1",
800: "#075985",
900: "#0c4a6e",
}
}
},
},
plugins: [],
};


@@ -0,0 +1,24 @@
# MultiAgent Conversation
This example demonstrates how to build a multi-agent conversation workflow using ``MsgHub`` in AgentScope,
where multiple agents broadcast messages to each other in a shared conversation space.
## Setup
The example is built upon the DashScope LLM API in [main.py](https://github.com/agentscope-ai/agentscope/blob/main/examples/workflows/multiagent_conversation/main.py). You can switch to other LLMs by modifying the ``model`` and ``formatter`` parameters in the code.
To run the example, first install the latest version of AgentScope, then run:
```bash
python examples/workflows/multiagent_conversation/main.py
```
## Main Workflow
- Create multiple participant agents with different attributes (e.g., Alice, Bob, Charlie).
- Agents introduce themselves and interact in the message hub.
- Supports dynamic addition and removal of agents, as well as broadcasting messages.
> Note: This example is built with the DashScope chat model. If you want to change the model, don't forget
> to change the formatter at the same time! The correspondence between built-in models and formatters is
> listed in [our tutorial](https://doc.agentscope.io/tutorial/task_prompt.html#id1).
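The hub behaviour described above (shared broadcasts, dynamic removal of participants) can be sketched without the framework. `MiniMsgHub` below is a hypothetical, library-free stand-in for illustration only, not AgentScope's actual `MsgHub`:

```python
from typing import Dict, List


class MiniMsgHub:
    """Toy message hub: every broadcast is appended to the local
    history of each *current* participant."""

    def __init__(self, participants: List[str]) -> None:
        self.participants = list(participants)
        self.histories: Dict[str, List[str]] = {p: [] for p in participants}

    def broadcast(self, msg: str) -> None:
        # Only current participants receive the message.
        for p in self.participants:
            self.histories[p].append(msg)

    def delete(self, name: str) -> None:
        # Removed participants keep their history but stop receiving.
        self.participants.remove(name)


hub = MiniMsgHub(["Alice", "Bob", "Charlie"])
hub.broadcast("system: introduce yourselves")
hub.delete("Bob")
hub.broadcast("bob: I have to start my homework now, see you later!")
print(len(hub.histories["Alice"]), len(hub.histories["Bob"]))  # prints: 2 1
```

This mirrors the pattern in `main.py`: agents inside the hub see every reply, while a deleted agent (Bob) no longer receives later broadcasts.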


@@ -0,0 +1,80 @@
# -*- coding: utf-8 -*-
"""The example of how to construct multi-agent conversation with MsgHub and
pipeline in AgentScope."""
import asyncio
import os
from agentscope.agent import ReActAgent
from agentscope.formatter import DashScopeMultiAgentFormatter
from agentscope.message import Msg
from agentscope.model import DashScopeChatModel
from agentscope.pipeline import MsgHub, sequential_pipeline
def create_participant_agent(
name: str,
age: int,
career: str,
character: str,
) -> ReActAgent:
"""Create a participant agent with a specific name, age, and character."""
return ReActAgent(
name=name,
sys_prompt=(
f"You're a {age}-year-old {career} named {name} and you're "
f"a {character} person."
),
model=DashScopeChatModel(
model_name="qwen-max",
api_key=os.environ["DASHSCOPE_API_KEY"],
stream=True,
),
# Use multiagent formatter because the multiple entities will
# occur in the prompt of the LLM API call
formatter=DashScopeMultiAgentFormatter(),
)
async def main() -> None:
"""Run a multi-agent conversation workflow."""
# Create multiple participant agents with different characteristics
alice = create_participant_agent("Alice", 30, "teacher", "friendly")
bob = create_participant_agent("Bob", 14, "student", "rebellious")
charlie = create_participant_agent("Charlie", 28, "doctor", "thoughtful")
# Create a conversation where participants introduce themselves within
# a message hub
async with MsgHub(
participants=[alice, bob, charlie],
# The greeting message will be sent to all participants at the start
announcement=Msg(
"system",
"Now you meet each other with a brief self-introduction.",
"system",
),
) as hub:
        # Quickly construct a pipeline to run the conversation
        await sequential_pipeline([alice, bob, charlie])
        # Or equivalently:
# await alice()
# await bob()
# await charlie()
# Delete a participant agent from the hub and fake a broadcast message
print("##### We fake Bob's departure #####")
hub.delete(bob)
await hub.broadcast(
Msg(
"bob",
"I have to start my homework now, see you later!",
"assistant",
),
)
await alice()
await charlie()
# ...
asyncio.run(main())


@@ -0,0 +1 @@
agentscope[full]>=1.0.5


@@ -0,0 +1,24 @@
# MultiAgent Debate
A debate workflow simulates a multi-turn discussion between agents, usually several solvers and an aggregator.
The solvers generate and exchange their answers, while the aggregator collects and summarizes them.
We implement the example from [EMNLP 2024](https://aclanthology.org/2024.emnlp-main.992/), in which two debater agents
discuss a topic in a fixed order and present their arguments based on the previous debate history.
After each round, a moderator agent decides whether the correct answer can be obtained in the current iteration.
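A minimal, library-free sketch of this loop, with stub solver and judge functions standing in for the LLM-backed agents (all names here are hypothetical, for illustration only):

```python
from typing import Callable, List, Optional


def debate(
    solvers: List[Callable[[List[str]], str]],
    judge: Callable[[List[str]], Optional[str]],
    max_rounds: int = 5,
) -> Optional[str]:
    """Solvers speak in a fixed order; after each round the judge
    decides whether the history already contains the correct answer."""
    history: List[str] = []
    for _ in range(max_rounds):
        for solver in solvers:
            history.append(solver(history))  # argue given prior turns
        answer = judge(history)
        if answer is not None:
            return answer
    return None  # no agreement within the round budget


# Stub agents for illustration only.
def alice(history: List[str]) -> str:
    return "Alice: circle A revolves 4 times"


def bob(history: List[str]) -> str:
    return "Bob: after reconsidering, I agree: 4 times"


def judge(history: List[str]) -> Optional[str]:
    # Stop once both debaters have spoken twice.
    return "4" if len(history) >= 4 else None


print(debate([alice, bob], judge))  # prints: 4
```

In `main.py` the same structure appears with `MsgHub` handling the shared history and a structured-output model (`JudgeModel`) playing the judge's role.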
## Setup
The example is built upon DashScope LLM API in [main.py](https://github.com/agentscope-ai/agentscope/blob/main/examples/workflows/multiagent_debate/main.py).
You can switch to other LLMs by modifying the ``model`` and ``formatter`` parameters in the code.
To run the example, first install the latest version of AgentScope, then run:
```bash
python examples/workflows/multiagent_debate/main.py
```
> Note: This example is built with the DashScope chat model. If you want to change the model, don't forget
> to change the formatter at the same time! The correspondence between built-in models and formatters is
> listed in [our tutorial](https://doc.agentscope.io/tutorial/task_prompt.html#id1).


@@ -0,0 +1,126 @@
# -*- coding: utf-8 -*-
"""The multi-agent debate workflow example in AgentScope."""
import asyncio
import os
from agentscope.agent import ReActAgent
from agentscope.formatter import (
DashScopeChatFormatter,
DashScopeMultiAgentFormatter,
)
from agentscope.message import Msg
from agentscope.model import DashScopeChatModel
from agentscope.pipeline import MsgHub
from pydantic import BaseModel, Field
topic = (
"The two circles are externally tangent and there is no relative sliding. "
"The radius of circle A is 1/3 the radius of circle B. Circle A rolls "
"around circle B one trip back to its starting point. How many times will "
"circle A revolve in total?"
)
# Create two debater agents, Alice and Bob, who will discuss the topic.
def create_solver_agent(name: str) -> ReActAgent:
"""Get a solver agent."""
return ReActAgent(
name=name,
sys_prompt=f"You're a debater named {name}. Hello and welcome to the "
"debate competition. It's not necessary to fully agree "
"with each other's perspectives, as our objective is to "
"find the correct answer. The debate topic is stated as "
f"follows: {topic}. Use Chinese to answer the question",
model=DashScopeChatModel(
model_name="qwen-max",
api_key=os.environ["DASHSCOPE_API_KEY"],
stream=True,
),
formatter=DashScopeChatFormatter(),
)
alice, bob = [create_solver_agent(name) for name in ["Alice", "Bob"]]
# Create a moderator agent
moderator = ReActAgent(
name="Aggregator",
sys_prompt=(
"You're a moderator. There will be two debaters involved in a debate "
"competition. They will present their answer and discuss their "
"perspectives on the topic:\n"
"```\n"
f"{topic}\n"
"```\n"
"At the end of each round, you will evaluate both sides' answers "
"and decide which one is correct."
),
model=DashScopeChatModel(
model_name="qwen-max",
api_key=os.environ["DASHSCOPE_API_KEY"],
stream=True,
),
formatter=DashScopeMultiAgentFormatter(),
)
# A structured output model for the moderator
class JudgeModel(BaseModel):
"""The structured output model for the moderator."""
finished: bool = Field(
description="Whether the debate is finished.",
)
correct_answer: str | None = Field(
description="The correct answer to the debate topic, only if the "
"debate is finished. Otherwise, leave it as None.",
default=None,
)
async def run_multiagent_debate() -> None:
"""Run the multi-agent debate workflow."""
while True:
# The reply messages in MsgHub from the participants will be
# broadcasted to all participants.
async with MsgHub(participants=[alice, bob, moderator]):
await alice(
Msg(
"user",
"You are the affirmative side. Please express your "
"viewpoints.",
"user",
),
)
await bob(
Msg(
"user",
"You are the negative side. You disagree with the "
"affirmative side. Provide your reason and answer.",
"user",
),
)
# Alice and Bob don't need to see the moderator's message,
# so the moderator is called outside the MsgHub.
msg_judge = await moderator(
Msg(
"user",
"Now you have heard the answers from the others. Has "
"the debate finished, and can you give the correct answer?",
"user",
),
structured_model=JudgeModel,
)
print("【STRUCTURED_OUTPUT】: ", msg_judge.metadata)
if msg_judge.metadata.get("finished"):
print(
"The debate is finished, and the correct answer is: ",
msg_judge.metadata.get("correct_answer"),
)
break
asyncio.run(run_multiagent_debate())


@@ -0,0 +1 @@
agentscope[full]>=1.0.5


@@ -0,0 +1,46 @@
# Deep Research Agent Example
## What This Example Demonstrates
This example shows a **DeepResearch Agent** implementation using the AgentScope framework. The DeepResearch Agent specializes in performing multi-step research to collect and integrate information from multiple sources, and generates comprehensive reports to solve complex tasks.
## Prerequisites
- Python 3.10 or higher
- Node.js and npm (for the MCP server)
- DashScope API key from [Alibaba Cloud](https://dashscope.console.aliyun.com/)
- Tavily search API key from [Tavily](https://www.tavily.com/)
## How to Run This Example
1. **Set Environment Variables**:
```bash
export DASHSCOPE_API_KEY="your_dashscope_api_key_here"
export TAVILY_API_KEY="your_tavily_api_key_here"
export AGENT_OPERATION_DIR="your_own_directory_here"
```
2. **Test Tavily MCP Server**:
```bash
npx -y tavily-mcp@latest
```
3. **Run the script**:
```bash
python main.py
```
## Connect to Web Search MCP client
The DeepResearch Agent currently supports web search only through the Tavily MCP client. To use this feature, you need to start the MCP server locally and establish a connection to it.
```python
import os

from agentscope.mcp import StdIOStatefulClient

# Run inside an async function / event loop.
tavily_search_client = StdIOStatefulClient(
    name="tavily_mcp",
    command="npx",
    args=["-y", "tavily-mcp@latest"],
    env={"TAVILY_API_KEY": os.getenv("TAVILY_API_KEY", "")},
)
await tavily_search_client.connect()
```
> Note: The example is built with the DashScope chat model. If you want to change the model in this example, don't forget
> to change the formatter at the same time! The correspondence between built-in models and formatters is
> listed in [our tutorial](https://doc.agentscope.io/tutorial/task_prompt.html#id1).


@@ -0,0 +1,68 @@
# Identity And Core Mission
You are an advanced research planning assistant tasked with breaking down a given task into a series of 3-5 logically ordered, actionable steps. Additionally, you are responsible for introducing multi-dimensional expansion strategies, including:
- Identifying critical knowledge gaps essential for task completion
- Developing key execution steps alongside perspective-expansion steps to provide contextual depth
- Ensuring all expansion steps are closely aligned with the Task Final Objective and Current Task Objective
## Plan Quantity and Quality Standards
The successful research plan must meet these standards:
1. **Comprehensive Coverage**:
- Information must cover ALL aspects of the topic
- Multiple perspectives must be represented in both essential steps and expansion steps
- Both mainstream and alternative viewpoints should be included
- Explicit connections to adjacent domains should be explored
2. **Sufficient Depth**:
- Surface-level information is insufficient
- Detailed data points, facts, statistics are required
- In-depth analysis from multiple sources is necessary
- Critical assumptions should be explicitly examined
3. **Adequate Volume**:
- Collecting "just enough" information is not acceptable
- Aim for an abundance of relevant information
- More high-quality information is always better than less
4. **Contextual Expansion**:
- Use diverse analytical perspectives (e.g., comparative analysis, historical context, cultural context, etc.)
- Ensure expansion steps enhance the richness and comprehensiveness of the final output without deviating from the core objective of the task
## Instructions
1. **Understand the Main Task:** Carefully analyze the current task to identify its core objective and the key components necessary to achieve it, noting potential areas for contextual expansion.
2. **Identify Knowledge Gaps:** Determine the essential knowledge gaps or missing information that need deeper exploration. Avoid focusing on trivial or low-priority details, such as problems that you can solve with your own knowledge. Instead, concentrate on:
- Foundational gaps critical to task completion
- Identifying opportunities for step expansion by considering alternative approaches, connections to related topics, or ways to enrich the final output. Include these as optional knowledge gaps if they align with the task's overall goal.
The knowledge gaps should be strictly in the format of a markdown checklist, flagging gaps that require perspective expansion with the `(EXPANSION)` tag (e.g., "- [ ] (EXPANSION) Analysis report of X").
3. **Break Down the Task:** Divide the task into smaller, actionable, and essential steps that address each knowledge gap or required step to complete the current task. Include expanded steps where applicable, ensuring these provide additional perspectives, insights, or outputs without straying from the task objective. These expanded steps should enhance the richness of the final output.
4. **Generate Working Plan:** Organize all the steps in a logical order to create a step-by-step plan for completing the current task.
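The checklist-plus-tag convention above is straightforward to consume downstream. A minimal sketch of such a parser (the helper below is illustrative, not part of any prompt contract):

```python
import re


def parse_knowledge_gaps(checklist: str) -> list[dict]:
    """Parse a markdown checklist, flagging (EXPANSION) items."""
    items = []
    for line in checklist.splitlines():
        # Match "- [ ]" or "- [x]" checklist entries.
        match = re.match(r"-\s*\[( |x)?\]\s*(.*)", line.strip())
        if not match:
            continue
        text = match.group(2)
        expansion = text.startswith("(EXPANSION)")
        items.append({
            "gap": text.removeprefix("(EXPANSION)").strip(),
            "expansion": expansion,
        })
    return items


gaps = parse_knowledge_gaps(
    "- [ ] Overview of the food delivery market\n"
    "- [ ] (EXPANSION) Analysis report of X\n"
)
```

This makes it easy to route `(EXPANSION)` gaps to perspective-expansion steps while treating the rest as core steps.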
### Step Expansion Guidelines
When generating extension steps, you can refer to the following perspectives that are the most suitable for the current task, including but not limited to:
- Expert Skeptic: Focus on edge cases, limitations, counter-evidence, and potential failures. Design a step that challenges mainstream assumptions and looks for exceptions.
- Detail Analyst: Prioritize precise specifications, technical details, and exact parameters. Design a step targeting granular data and definitive references.
- Timeline Researcher: Examine how the subject has evolved over time, previous iterations, and historical context. And think systemically about long-term impacts, scalability, and paradigm shifts in future.
- Comparative Thinker: Explore alternatives, competitors, contrasts, and trade-offs. Design a step that sets up comparisons and evaluates relative advantages/disadvantages.
- Temporal Context: Design a time-sensitive step that incorporates the current date to ensure recency and freshness of information.
- Public Opinion Collector: Design a step to aggregate user-generated content such as text posts or comments, digital photos, or videos from Twitter, YouTube, Facebook, and other social media.
- Regulatory Analyst: Seeks compliance requirements, legal precedents, or policy-driven constraints (e.g. "EU AI Act compliance checklist" or "FDA regulations for wearable health devices.")
- Academic Professor: Design a step based on the necessary steps of conducting academic research (e.g., "the background of deep learning" or "technical details of some mainstream large language models").
### Important Notes
1. Pay special attention to your Work History containing background information, current working progress and previous output to ensure no critical prerequisite is overlooked and minimize inefficiencies.
2. Carefully review the previous working plan. Avoid getting stuck in repetitively breaking down similar tasks or even copying the previous plan.
3. Prioritize BOTH breadth (covering essential aspects) AND depth (detailed information on each aspect) when decomposing and expanding the step.
4. AVOID **redundancy or over-complicating** the plan. Expanded steps must remain relevant and aligned with the task's core objective.
5. The working plan SHOULD strictly contain 3-5 steps, including both core steps and expanded steps.
### Example
Current Subtask: Analysis of JD.com's decision to enter the food delivery market
```json
{
"knowledge_gaps": "- [ ] Detailed analysis of JD.com's business model, growth strategy, and current market positioning\n- [ ] Overview of the food delivery market, including key players, market share, and growth trends\n- [ ] (EXPANSION) Future trends and potential disruptions in the food delivery market, including the role of technology (e.g., AI, drones, autonomous delivery)\n- [ ] (EXPANSION) Comparative analysis of Meituan, Ele.me, and JD.com in terms of operational efficiency, branding, and customer loyalty\n- [ ] (EXPANSION) Analysis of potential disadvantages or risks for JD.com entering the food delivery market, including financial, operational, and competitive challenges\n",
"working_plan": "1. Use web searches to analyze JD.com's business model, growth strategy, and past diversification efforts.\n2. Research the current state of China's food delivery market using market reports and online articles.\n3. (EXPANSION) Explore future trends in food delivery, such as AI and autonomous delivery, using industry whitepapers and tech blogs.\n4. (EXPANSION) Compare Meituan, Ele.me, and JD.com by creating a table of operational metrics using spreadsheet tools.\n5. (EXPANSION) Identify risks for JD.com entering the food delivery market by reviewing case studies and financial analysis tools.\n"
}
```
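As a sanity check, a reply shaped like the example above can be validated against the field rules with a few lines of stdlib code (a sketch; the expected field set is taken from the example):

```python
import json

# A trimmed-down reply in the same shape as the example above.
raw = """
{
  "knowledge_gaps": "- [ ] Overview of the market\\n- [ ] (EXPANSION) Future trends\\n",
  "working_plan": "1. Research the market.\\n2. (EXPANSION) Explore future trends.\\n"
}
"""

plan = json.loads(raw)
# Enforce the contract: exactly these fields, all values strings.
assert set(plan) == {"knowledge_gaps", "working_plan"}
assert all(isinstance(v, str) for v in plan.values())
# Expansion steps are recoverable from the embedded "\n"-separated plan.
expansion_steps = [
    line for line in plan["working_plan"].splitlines() if "(EXPANSION)" in line
]
```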
### Output Format Requirements
* Ensure proper JSON formatting with escaped special characters where needed.
* Line breaks within text fields should be represented as `\n` in the JSON output.
* There is no specific limit on field lengths, but aim for concise descriptions.
* All field values must be strings.
* For each JSON document, only include the following fields:


@@ -0,0 +1,54 @@
## Identity
You are a sharp-eyed Knowledge Discoverer, capable of identifying and leveraging any potentially useful piece of information gathered from web search, no matter how brief. The information you identify will later be extracted in greater depth for more content.
## Instructions
1. **Find information with valuable, but insufficient or shallow content**: Carefully review the web search results to assess whether there is any snippet or web content that
- could potentially help address checklist items or fill knowledge gaps of the task once more of its content is gathered
- **but whose content is limited or only briefly mentioned**!
2. **Identify the snippet**: If such information is found, set `need_more_information` to true, and locate the specific **title, content, and url** of the information snippet you have found for later extraction.
3. **Reduce unnecessary extraction**: If all snippets are only generally related, unlikely to advance the checklist/gaps, already rich and sufficient in content, or incomplete but not essential, set `need_more_information` to false.
## Important Notes
1. Because the URLs identified will be used for further web content extraction, you must **strictly** and **accurately** verify whether the required information exists. Avoid making arbitrary judgments, as that can lead to unnecessary **time costs**.
2. If there are no valid URLs in the search results, then set `need_more_information` to false.
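The decision contract above (fill `title`/`url`/`subtask` only when extraction is warranted) can be modeled as a small record; a hypothetical sketch using a stdlib dataclass (the class and method names are illustrative):

```python
from dataclasses import dataclass


@dataclass
class DiscoveryDecision:
    reasoning: str
    need_more_information: bool
    title: str = ""
    url: str = ""
    subtask: str = ""

    def validate(self) -> None:
        # If extraction is requested, a concrete URL must be identified,
        # since the URL drives the follow-up web content extraction.
        if self.need_more_information and not self.url:
            raise ValueError("need_more_information=True requires a url")


skip = DiscoveryDecision(
    reasoning="All snippets are already rich and sufficient.",
    need_more_information=False,
)
skip.validate()  # no error: nothing further to extract
```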
## Example 1
**Search Results:**
[{"title": "Philip Greenberg Family History & Historical Records - MyHeritage", "hostname": "Google", "snippet": "Philip Greenberg, born 1951. Quebec Marriage Returns, 1926-1997. View record. Birth. Philip Greenberg was born on month day 1951, in birth place. Spouse. Philip ", "url": "https://www.myheritage.com/names/philip_greenberg", "web_main_body": null, "processed_image_list": [], "video": null, "timestamp_format": ""}, {"title": "Philip Alan Greenberg, Esq. - Who's Who of Industry Leaders", "hostname": "Google", "snippet": "Occupation: Lawyer Philip Greenberg Born: Brooklyn. Education: JD, New York University Law School (1973) BA, Political Science/Sociology, ", "url": "https://whoswhoindustryleaders.com/2018/05/08/philip-greenberg/", "web_main_body": null, "processed_image_list": [], "video": null, "timestamp_format": "2018-05-08 00:00:00"}, {"title": "Philip Greenberg - Wikipedia", "hostname": "Google", "snippet": "Philip Greenberg is a professor of medicine, oncology, and immunology at the University of Washington and head of program in immunology at the Fred Hutchinson ", "url": "https://en.wikipedia.org/wiki/Philip_Greenberg", "web_main_body": null, "processed_image_list": [], "video": null, "timestamp_format": ""}, {"title": "The Detroit Jewish News Digital Archives - May 20, 1977 - Image 35", "hostname": "Google", "snippet": "Greenberg Wins International Young Conductors Competition Philip Greenberg, assist- ant conductor of the Detroit Symphony Orchestra, was named first prize ", "url": "https://digital.bentley.umich.edu/djnews/djn.1977.05.20.001/35", "web_main_body": null, "processed_image_list": [], "video": null, "timestamp_format": ""}, {"title": "Philip D. 
Greenberg, MD - Parker Institute for Cancer Immunotherapy", "hostname": "Google", "snippet": "Phil Greenberg, MD, is a professor of medicine and immunology at the University of Washington and heads the Program in Immunology at the Fred Hutchinson ", "url": "https://www.parkerici.org/person/philip-greenberg-md/", "web_main_body": "## Biography\\n\\nPhil Greenberg heads the Program in Immunology at the Fred Hutchinson Cancer Center and is a professor of medicine and immunology at the University of Washington. His research has focused on elucidating fundamental principles of T-cell and tumor interactions; developing cellular and molecular approaches to manipulate T-cell immunity; and translating insights from the lab to the treatment of cancer patients, with emphasis on adoptive therapy with genetically engineered T cells.\\nDr. Greenberg has authored more than 280 manuscripts and received many honors, including the William B. Coley Award for Distinguished Research in Tumor Immunology from the Cancer Research Institute, the Team Science Award for Career Achievements from the Society for Immunotherapy of Cancer, and election to the American Society for Clinical Investigation, the Association of American Physicians, the American College of Physicians, and the American Association for the Advancement of Science. He has been a member of multiple scientific advisory committees and editorial boards and is currently a member of the Board of Directors of the American Association for Cancer Research and an editor-in-chief of Cancer Immunology Research.", "processed_image_list": [], "video": null, "timestamp_format": ""}]
**Checklist:**
- [] Document detailed achievements of Philip Greenberg, including competition names, years, awards received, and their significance.
**Output:**
```json
{
"reasoning": "From the web search results, the following snippet is directly relevant to the checklist item: '- [] Document detailed achievements of Philip Greenberg, including competition names, years, awards received, and their significance':\nTitle: The Detroit Jewish News Digital Archives - May 20, 1977 - Image 35\nURL: https://digital.bentley.umich.edu/djnews/djn.1977.05.20.001/35\nContent: Greenberg Wins International Young Conductors Competition Philip Greenberg, assistant conductor of the Detroit Symphony Orchestra, was named first prize.\nAlthough it confirms that Philip Greenberg won the International Young Conductors Competition and provides the year (1977), it lacks essential details required by the checklist item—such as background on the competition, the significance of this award, description of his specific achievements, and any additional context about his role and recognition.\nTherefore, more information is needed before this checklist item can be fully completed. I will set `need_more_information` as true.",
"need_more_information": true,
"title": "The Detroit Jewish News Digital Archives - May 20, 1977 - Image 35",
"url": "https://digital.bentley.umich.edu/djnews/djn.1977.05.20.001/35",
  "subtask": "Retrieve detailed information about Philip Greenberg's achievement at the International Young Conductors Competition. Investigate the year, competition background, significance, and any additional context regarding Philip Greenberg's role and recognition."
}
```
## Example 2
**Search Results:**
[{"type": "text", "text": "Detailed Results:\n\nTitle: Big Four Consulting & AI: Risks & Rewards - News Directory 3\nURL: https://www.newsdirectory3.com/big-four-consulting-ai-risks-rewards/\nContent: The Big Four consulting firms—Deloitte, PwC, EY, and KPMG—are navigating the AI revolution, facing both unprecedented opportunities and considerable risks. This pivotal shift is reshaping the industry, compelling these giants to make substantial investments in artificial intelligence to stay competitive.\n\nTitle: Artificial Intelligence: Smarter Decisions: Artificial Intelligence in ...\nURL: https://fastercapital.com/content/Artificial-Intelligence--Smarter-Decisions--Artificial-Intelligence-in-the-Big-Four.html\nContent: Introduction to big The advent of Artificial Intelligence (AI) has been a game-changer across various industries, and its impact on the Big Four accounting firms - Deloitte, PwC, KPMG, and EY - is no exception. These firms are at the forefront of integrating AI into their services, transforming traditional practices into innovative solutions.\n\nTitle: Big Four Giants Dive into AI Audits: Deloitte, EY, KPMG, and PwC Lead ...\nURL: https://opentools.ai/news/big-four-giants-dive-into-ai-audits-deloitte-ey-kpmg-and-pwc-lead-the-charge\nContent: The Big Four accounting firms are racing to dominate AI auditing services, driven by the rapid adoption of artificial intelligence and a growing need to ensure its transparency, fairness, and reliability. 
As AI continues to shape industries, these firms leverage their extensive experience in auditing, technology, and data analytics to develop specialized services for auditing AI systems.\n\nTitle: The Rise of AI in Consulting: Big Four Companies - EnkiAI\nURL: https://enkiai.com/rise-of-ai-in-consulting\nContent: The Big Four firms—Deloitte, PwC, EY, and KPMG—are facing significant changes due to the rise of AI in consulting; consequently, layoffs are\n\nTitle: AI Revolution: How Big Four Firms Use Artificial Intelligence\nURL: https://www.archivemarketresearch.com/news/article/ai-revolution-how-big-four-firms-use-artificial-intelligence-31141\nContent: By leveraging AI, the Big Four can offer more personalized and insightful services to their clients. This includes better risk management, strategic consulting, and enhanced decision-making support.\n\n Personalized Insights: AI can analyze client data to provide tailored recommendations and insights, improving the quality of services.\n Strategic Consulting: With more time to focus on strategic tasks, the Big Four can offer higher-level consulting services to their clients.\n\n### Cost Savings [...] Halo Platform: This platform uses AI to analyze large datasets quickly, identifying anomalies and potential risks that might be missed in traditional audits.\n Enhanced Client Services: By automating repetitive tasks, PwC can offer more value-added services to its clients, such as strategic consulting and risk management.\n\n### EY: AI for Enhanced Decision-Making [...] ### Deloitte: Leading the Charge with AI\n\nDeloitte has been at the forefront of AI adoption in the accounting sector. 
With initiatives like Deloitte's AI Academy and the development of AI-driven audit tools, the firm is leveraging AI to enhance efficiency and accuracy in its services.\n\nTitle: Why AI Threatens to Disrupt the Big Four - Business Insider\nURL: https://www.businessinsider.com/big-four-consulting-ai-threat-jobs-ey-deloitte-kpmg-pwc-2025-5?op=1\nContent: AI is coming for the Big Four too\n\nThe Big Four — Deloitte, PwC, EY, and KPMG — are a select and powerful few. They dominate the professional services industry and have done so for decades.\n\nBut all empires fall eventually. Large corporations tend to merge, transform, or get replaced by the latest wave of innovative upstarts. [...] In 2023, KPMG said its plan to invest $2 billion in artificial intelligence and cloud services over the next five years would generate more than $12 billion in revenue over that period.\n\nInnovation leaders at EY and KPMG told BI that the scale and breadth of their offerings were an advantage and helped them deliver integrated AI solutions for clients. [...] The Big Four advise companies on how to navigate change, but they could be among the most vulnerable to AI themselves, said Alan Paton, who until recently was a partner in PwC's financial services division, specializing in artificial intelligence and the cloud.\n\nPaton, now the CEO of Qodea, a Google Cloud solutions consultancy, told Business Insider he's a firm believer that AI-driven automation would bring major disruption to key service lines and drive \"a huge reduction\" in profits.", "annotations": null}]
**Checklist:**
- [] Summarize how the Big Four consulting firms (Deloitte, PwC, EY, KPMG) are utilizing artificial intelligence and the main opportunities or risks they face.
**Output:**
```json
{
"reasoning": "The provided web search results collectively and clearly describe how the Big Four consulting firms are applying artificial intelligence—offering examples such as improved risk management, strategic consulting services, investment in AI, development of audit tools, and the general impact on their business models. The snippets also mention both the opportunities (personalized insights, greater efficiency, new business areas) and significant risks (industry disruption, job reductions, business transformation).\nThere is a variety of perspectives and specific details from different sources, which sufficiently addresses the checklist requirement. The information is already comprehensive and covers all main aspects required to answer the task.\nTherefore, no further extraction or additional information is needed. I will set `need_more_information` as false. ",
"need_more_information": false,
"title": "",
"url": "",
"subtask": ""
}
```
### Output Format Requirements
* Ensure proper JSON formatting with escaped special characters where needed.
* Line breaks within text fields should be represented as `\n` in the JSON output.
* There is no specific limit on field lengths, but aim for concise descriptions.
* All field values must be strings.
* For each JSON document, only include the following fields:


@@ -0,0 +1,53 @@
You are a professional research report writer. Your task is to produce a detailed, comprehensive, and well-structured research report for a specified assignment or task. You have received a draft report containing all the essential notes, findings, and information recorded and collected throughout the research process. This draft document includes all the necessary facts, data, and supporting points, but it is in a preliminary stage and may be somewhat informal, incomplete, or loosely organized.
## Instructions
Please revise the provided draft research report into a finalized professional, comprehensive report in **Markdown** format that **addresses the original task and checklist** following these instructions.
1. Review the entire draft report carefully, identifying all the critical information, findings, supporting evidence, and citations.
2. Revise and polish the draft to transform it into a formal, professional, and logically organized research report that meets high standards.
3. Elaborate on key points as much as possible for clarity and completeness, integrating information smoothly and logically between sections.
4. Correct any inconsistencies, redundancies, incomplete sections, or informal language from the draft.
5. Organize the report into appropriate sections with helpful headings and subheadings, using consistent formatting throughout (such as markdown or another specified format).
6. Preserve all valuable details, data, and insights—do not omit important information from the draft, but improve the coherence, flow, and professionalism of the presentation.
7. Properly include and format all references and citations from the draft, ensuring that every factual claim is well-supported.
## Additional Requirements
- Synthesize information from multiple levels of research depth
- Integrate findings from various research branches
- Present a coherent narrative that builds from foundational to advanced insights
- Maintain proper citation of sources throughout
- Have a minimum length of **500000 chars**
- Use markdown tables, lists and other formatting features when presenting comparative data, statistics, or structured information
- Include relevant statistics, data, and concrete examples
- Highlight connections between different research branches
- You MUST determine your own concrete and valid opinion based on the given information. Do NOT defer to general and meaningless conclusions.
- You MUST NOT include a table of contents. Start from the main report body directly.
### Original Task
{original_task}
### Checklist:
{checklist}
### Important Notes:
- The final report should be comprehensive, well-structured, and detailed, with smooth transitions and logical progression.
- The tone must be formal, objective, and professional throughout.
- Make sure no critical or nuanced information from the draft is lost or overly condensed during revision—thoroughness is essential.
- Check that all cited sources are accurately referenced.
- Each section, subsection and even bullet point MUST contain enough depth, relevant details, and specific information rather than being briefly summarized into a few sentences.
### Report Format (Fill in appropriate content in [] and ... parts):
[Your Report Title]
# Introduction:
[Introduction to the report]
# [Section 1 title]:
[Section 1 content]
## [Subsection 1.1 title]:
[Subsection 1.1 content]
# [Section 2 title]:
...
# Conclusion:
[Conclusion to the report]
Format your report professionally, with consistent heading levels and proper spacing.
Please do your best; this is very important to my career.


@@ -0,0 +1,21 @@
You are a professional researcher, expert in writing comprehensive reports from your previous research results. During your previous research phase, you conducted extensive web searches and extracted information from a large number of web pages to complete a task. You found that the knowledge you have acquired amounts to a substantial body of content, including both relevant information helpful for the task and irrelevant or redundant information. Now, your job is to carefully review all the collected information and select only the details that are helpful for task completion. Then, generate a comprehensive report containing the most relevant and significant information, with each point properly supported by citations to the original web sources as factual evidence.
## Instructions
1. Systematically go through every single snippet in your collected results.
2. Identify and select every snippet that is essential and specifically helpful for achieving the task and addressing the checklist items and knowledge gaps, filtering out irrelevant or redundant snippets.
3. Generate a **comprehensive report** in Markdown based on the selected useful snippets, and do not omit or excessively summarize any critical or nuanced information. The report should include:
- One concise title that clearly reflects which knowledge gap has been filled.
- Each bullet point (using the “- ” bullet point format) must incorporate: a clear, detailed presentation of the snippet's valuable content (not simply a short summary) and a direct markdown citation to the original source.
- Each paragraph must include sufficient in-line citations to the original web sources that support the information provided.
4. Briefly describe, as your **work log**, which **one** item in the knowledge gaps has been filled and how the tools were used to resolve it, including the tool names and their input parameters.
## Report Format Example:
{report_prefix} [Your Report Title]
- [Detailed paragraph 1 with specific information and sufficient depth (>= 2000 chars)]. [Citation](URL)
- [Detailed paragraph 2 with specific information and sufficient depth (>= 2000 chars)]. [Citation](URL)
- ...
## Important Notes
1. Avoid combining, excessively paraphrasing, omitting, or condensing any individual snippet that provides unique or relevant details. The final report must cover ALL key information as presented in the original results.
2. Each bullet point should be sufficiently detailed (at least **2000 chars**)
3. Both items with and without `(EXPANSION)` tag in knowledge gaps list are important and useful for task completion.
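Rules like these are mechanically checkable. A rough sketch of a post-hoc linter for generated reports (the length floor here is lowered purely for illustration; the prompt's actual requirement is 2000 chars per bullet):

```python
import re

# A markdown citation of the form [label](http...).
CITATION = re.compile(r"\[[^\]]+\]\(https?://[^)]+\)")


def check_report(report: str, min_len: int = 40) -> list[str]:
    """Return the bullets that violate the citation or length rules."""
    problems = []
    for line in report.splitlines():
        if not line.startswith("- "):
            continue  # only bullet points are constrained
        if not CITATION.search(line):
            problems.append(line)  # missing citation
        elif len(line) < min_len:
            problems.append(line)  # too shallow
    return problems


report = (
    "## Findings on AI at the Big Four\n"
    "- Deloitte has invested heavily in AI-driven audit tooling and training programs. [Citation](https://example.com/a)\n"
    "- Short note without a source.\n"
)
bad = check_report(report)
```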


@@ -0,0 +1,47 @@
Your job is to reflect on your failure based on your work history and generate the follow-up subtask. You have already found that one of the subtasks in the Working Plan cannot be successfully completed according to your work history.
## Instructions
1. Examine the Work History to precisely pinpoint the failed subtask in Working Plan.
2. Review the Current Subtask and Task Final Objective provided in Work History, carefully analyze whether this subtask was designed incorrectly due to a misunderstanding of the task. If so,
* set `need_rephrase` in `rephrase_subtask` as true
* Only replace the inappropriate subtask with the modified subtask, keeping the rest of the Working Plan unchanged. You should output the updated Working Plan in `rephrased_plan`.
* If the subtask was not poorly designed, proceed to Step 3.
3. Carefully retrieve the previous subtask objectives in the Work History to check for any signs of getting stuck in **repetitive patterns** of generating similar subtasks.
* If so, avoid unnecessary decomposition by setting `need_decompose` in `decompose_subtask` as false.
* Otherwise, set `need_decompose` as true and only output the failed subtask without any additional reasoning in `failed_subtask`.
## Important Notes
1. `need_decompose` and `need_rephrase` can NOT be both true at the same time.
2. Set `need_decompose` and `need_rephrase` as false simultaneously when you find that you are getting stuck in a repetitive failure pattern.
## Example
Work History:
1. Reflect the failure of this subtask and identify the failed subtask "Convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes by mapping tools or geo-mapping APIs"
2. Decompose subtask "Convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes by mapping tools or geo-mapping APIs" and generate a plan.
Working Plan:
1. Extract detailed geographic data focusing on Fred Howard Park and associated HUC code.
2. Use mapping tools or geo-mapping APIs (e.g., 'maps_regeocode') to convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes.
3. Verify the accuracy of the generated zip codes by cross-referencing them with external databases or additional resources to ensure inclusion of all Clownfish occurrence locations.
4. Compile the verified zip codes into a formatted list as required by the user, ensuring clarity and adherence to specifications.
Failed Subtask: "Use mapping tools or geo-mapping APIs (e.g., 'maps_regeocode') to convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes."
Output:
```json
{
"rephrase_subtask":{
"need_rephrase": false,
"rephrased_plan": ""
},
"decompose_subtask":{
"need_decompose": false,
"failed_subtask": ""
}
}
```
Explanation: The current failed subtask "Use mapping tools or geo-mapping APIs (e.g., 'maps_regeocode') to convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes" is similar to the previous failed subtask "Convert the extracted geographic coordinates or landmarks into corresponding five-digit zip codes by mapping tools or geo-mapping APIs", which has already been identified and decomposed in the work history. Therefore, we do not need to decompose it again.
### Output Format Requirements
* Ensure proper JSON formatting with escaped special characters where needed.
* Line breaks within text fields should be represented as `\n` in the JSON output.
* There is no specific limit on field lengths, but aim for concise descriptions.
* All field values must be strings.
* For each JSON document, only include the following fields:

View File

@@ -0,0 +1,13 @@
### Tool usage rules
1. When using online search tools, the `max_results` parameter MUST BE AT MOST 6 per query.
2. When using online search tools, keep the `query` short and keyword-based (2-6 words is ideal). Query specificity should grow with research depth: the deeper the research, the more detailed the query should be.
3. The directory/file system that you can operate is the following path: {tmp_file_storage_dir}. DO NOT try to save/read/modify file in other directories.
4. Try to use local resources before going to online search. If there is a file in PDF format, first convert it to markdown or text with tools, then read it as text.
5. You can basically use web search tools to search and retrieve whatever you want to know, including financial data, location, news, etc.
6. NEVER use `read_text_file` tool to read PDF file directly.
7. DO NOT aim at generating a PDF file unless the user explicitly requests it.
8. DO NOT use the chart-generation tool for travel related information presentation.
9. If a tool generates long content, ALWAYS create a new markdown file to summarize the long content and save it for future reference.
10. When you use the `write_text_file` tool, you **MUST ALWAYS** remember to provide both the `path` and `content` parameters. DO NOT try to use `write_text_file` with content exceeding 1k tokens at once!!!
Finally, before each tool-use decision, carefully review the historical tool usage records to avoid the time and API costs caused by repeated execution. Remember that your balance is very low, so ensure absolute efficiency.
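For example (queries are illustrative only), keyword queries can become more specific as research depth grows:
```text
depth 1: moon perigee distance
depth 2: moon minimum perigee wikipedia
depth 3: moon perigee exact km value source
```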

View File

@@ -0,0 +1,37 @@
## Additional Operation Notice
### Checklist Management
1. You will receive a markdown-style checklist (i.e., "Expected Output" checklist) in your input instruction. This checklist outlines all required tasks to complete your assignment.
2. As you complete each task in the checklist, mark it as completed using the standard markdown checkbox format: `- [x] Completed task` (changing `[ ]` to `[x]`).
3. Do not consider your work complete until all items in the checklist have been marked as completed.
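For instance (the items are illustrative), a partially completed checklist looks like:
```markdown
- [x] Identify the record marathon pace
- [ ] Retrieve the Moon's minimum perigee value
- [ ] Compute the travel time in hours
```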
### Process Flow
1. Based on your **Working Plan**, work through EACH item in it methodically following these rules:
- items without `(EXPANSION)` tag are fundamental to complete the current subtask.
- items with `(EXPANSION)` tag are optional; they can provide valuable supplementary information that enriches the depth and breadth of your final output, but they may also introduce distracting information. You need to carefully decide whether to execute these items based on the current subtask and the task final objective.
2. Determine whether the current item in the working plan has already been fully completed. If so, call the `summarize_intermediate_results` tool to summarize the results of this item into an in-process report file before starting the next item. After that, the current item will be marked as `[DONE]` in the working plan to remind you to move on to the next item.
3. If an item cannot be successfully completed after many tries, carefully analyze the error type and apply the corresponding solution. The error types and solutions include:
- Tool corruption (e.g., unexpected status code, empty output result, tool function not found, invalid tool calling): switch tools and use valid input parameters.
- Insufficient information (e.g., the search results did not yield any valuable information to solve the task): adjust and modify the tool inputs, then retry.
- Missing prerequisite (e.g., needed prior unexplored knowledge or more detailed follow-up steps): call the `reflect_failure` tool for deeper reflection.
4. When the current subtask is completed and the workflow **falls back to a previous subtask**, retrieve the completion progress of the previous subtask from your work history and continue from there, rather than starting from scratch.
### Important Constraints
1. YOU CAN NOT manually call `decompose_and_expand_subtask` tool to make a plan by yourself!
2. ALWAYS FOLLOW THE WORKING PLAN SEQUENCE STEP BY STEP!!
3. For each step, you MUST provide a reason or analysis to **review what was done in the previous step** and **explain why you call a function / use a tool in this step**.
4. After each action, YOU MUST seriously confirm that the current item in the plan is done before starting the next item, referring to the following rules:
- Carefully analyze whether the information obtained from tool is sufficient to fill the knowledge gap corresponding to the current item.
- Pay close attention to details. Confidently assuming that all tool calls bring complete information often leads to serious errors (e.g., mistaking the rental website name for the apartment name when renting).
If the current item in the plan is really done, call `summarize_intermediate_results` to generate an in-process report, then move on to the next item.
5. Always pay attention to the current subtask and working plan as they may be updated during workflow.
6. During each round of reasoning and acting, remember that the **Current Subtask** is your primary goal, while the **Final Task Objective** keeps your process from deviating from the final goal.
### Completion and Output
You should use the {finish_function_name} tool to return your research results when:
- Research Depth > 1 and all items of the current working plan are marked as `[DONE]`.
- Research Depth = 1 and all checklist items are completed.
### Progress Tracking
1. Regularly review the checklist to confirm your progress.
2. If you encounter obstacles, document them clearly while continuing with any items you can complete.

View File

@@ -0,0 +1,140 @@
# -*- coding: utf-8 -*-
"""The output format of deep research agent"""
from pydantic import BaseModel, Field
class SubtasksDecomposition(BaseModel):
"""
Model for structured subtask decomposition output in deep research.
"""
knowledge_gaps: str = Field(
description=(
"A markdown checklist of essential knowledge gaps "
"and optional perspective-expansion gaps (flagged "
"with (EXPANSION)), each on its own line. "
"E.g. '- [ ] Detailed analysis of JD.com's "
"...\\n- [ ] (EXPANSION) X...'."
),
)
working_plan: str = Field(
description=(
"A logically ordered step-by-step working "
"plan (3-5 steps), each step starting with "
"its number (1., 2., etc.), including both "
"core and expansion steps. Expanded steps "
"should be clearly marked with (EXPANSION) "
"and provide contextual or analytical depth."
),
)
class WebExtraction(BaseModel):
"""
Model for structured follow-up web extraction output in deep research.
"""
reasoning: str = Field(
description="The reasoning for your decision, including a "
"summary of evidence and logic for whether more "
"information is needed.",
)
need_more_information: bool = Field(
description="Whether more information is needed.",
)
title: str = Field(
description="Title of the identified search result snippet "
"that requires further extraction, or an empty "
"string if not applicable.",
)
url: str = Field(
description="Direct URL to the original search result "
"requiring further extraction, or an empty "
"string if not applicable.",
)
subtask: str = Field(
description="Actionable description of the follow-up task "
"to obtain needed information, or an empty string "
"if not applicable.",
)
class FollowupJudge(BaseModel):
"""
Model for structured follow-up decompose judging output in deep research.
"""
reasoning: str = Field(
description="The reasoning for your decision, including a "
"summary of evidence and logic for whether "
"more information is needed.",
)
is_sufficient: bool = Field(
description="whether the information content is adequate.",
)
class ReflectFailure(BaseModel):
"""
Model for structured failure reflection output in deep research.
"""
rephrase_subtask: dict = Field(
description=(
"Information about whether the problematic "
"subtask needs to be rephrased due "
"to a design flaw or misunderstanding. "
"If rephrasing is needed, provide the "
"modified working plan with only the "
"inappropriate subtask replaced by its "
"improved version."
),
json_schema_extra={
"additionalProperties": {
"type": "object",
"properties": {
"need_rephrase": {
"type": "boolean",
"description": "Set to 'true' if the failed subtask "
"needs to be rephrased due to a design "
"flaw or misunderstanding; otherwise, 'false'.",
},
"rephrased_plan": {
"type": "string",
"description": "The modified working plan "
"with only the inappropriate "
"subtask replaced by its improved version. If no "
"rephrasing is needed, provide an empty string.",
},
},
},
},
)
decompose_subtask: dict = Field(
description=(
"Information about whether the problematic subtask "
"should be further decomposed. If decomposition "
"is required, provide the failed subtask "
"and the reason for its decomposition."
),
json_schema_extra={
"additionalProperties": {
"type": "object",
"properties": {
"need_decompose": {
"type": "boolean",
"description": "Set to 'true' if "
"the failed subtask should "
"be further decomposed; otherwise, 'false'.",
},
"failed_subtask": {
"type": "string",
"description": "The failed subtask that "
"should be further decomposed, or "
"an empty string if no "
"decomposition is needed.",
},
},
},
},
)

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,82 @@
# -*- coding: utf-8 -*-
"""The main entry point of the Deep Research agent example."""
import asyncio
import os
from agentscope import logger
from agentscope.formatter import DashScopeChatFormatter
from agentscope.mcp import StdIOStatefulClient
from agentscope.memory import InMemoryMemory
from agentscope.message import Msg
from agentscope.model import DashScopeChatModel
from .deep_research_agent import DeepResearchAgent
async def main(user_query: str) -> None:
"""The main entry point for the Deep Research agent example."""
logger.setLevel("DEBUG")
tavily_search_client = StdIOStatefulClient(
name="tavily_mcp",
command="npx",
args=["-y", "tavily-mcp@latest"],
env={"TAVILY_API_KEY": os.getenv("TAVILY_API_KEY", "")},
)
default_working_dir = os.path.join(
os.path.dirname(__file__),
"deepresearch_agent_demo_env",
)
agent_working_dir = os.getenv(
"AGENT_OPERATION_DIR",
default_working_dir,
)
os.makedirs(agent_working_dir, exist_ok=True)
try:
await tavily_search_client.connect()
agent = DeepResearchAgent(
name="Friday",
sys_prompt="You are a helpful assistant named Friday.",
model=DashScopeChatModel(
api_key=os.environ.get("DASHSCOPE_API_KEY"),
model_name="qwen-max",
enable_thinking=False,
stream=True,
),
formatter=DashScopeChatFormatter(),
memory=InMemoryMemory(),
search_mcp_client=tavily_search_client,
tmp_file_storage_dir=agent_working_dir,
)
user_name = "Bob"
msg = Msg(
user_name,
content=user_query,
role="user",
)
result = await agent(msg)
logger.info(result)
except Exception as err:
logger.exception(err)
finally:
await tavily_search_client.close()
if __name__ == "__main__":
query = (
"If Eliud Kipchoge could maintain his record-making "
"marathon pace indefinitely, how many thousand hours "
"would it take him to run the distance between the "
"Earth and the Moon its closest approach? Please use "
"the minimum perigee value on the Wikipedia page for "
"the Moon when carrying out your calculation. Round "
"your result to the nearest 1000 hours and do not use "
"any comma separators if necessary."
)
try:
asyncio.run(main(query))
except Exception as e:
logger.exception(e)

View File

@@ -0,0 +1 @@
agentscope[full]>=1.0.5

View File

@@ -0,0 +1,325 @@
# -*- coding: utf-8 -*-
"""The utilities for deep research agent"""
import json
import os
import re
from typing import Any, Sequence, Type, Union
from agentscope.tool import Toolkit, ToolResponse
from pydantic import BaseModel
TOOL_RESULTS_MAX_WORDS = 5000
def get_prompt_from_file(
file_path: str,
return_json: bool,
) -> Union[str, dict]:
"""Get prompt from file"""
with open(os.path.join(file_path), "r", encoding="utf-8") as f:
if return_json:
prompt = json.load(f)
else:
prompt = f.read()
return prompt
def truncate_by_words(sentence: str) -> str:
"""Truncate too long sentences by words number"""
words = re.findall(
r"\w+|[^\w\s]",
sentence,
re.UNICODE,
)
word_count = 0
result = []
for word in words:
if re.match(r"\w+", word):
word_count += 1
if word_count > TOOL_RESULTS_MAX_WORDS:
break
result.append(word)
truncated_sentence = ""
for i, word in enumerate(result):
if i == 0:
truncated_sentence += word
elif re.match(r"\w+", word):
truncated_sentence += " " + word
else:
truncated_sentence += word
return truncated_sentence
def truncate_search_result(
res: list,
search_func: str = "tavily-search",
extract_function: str = "tavily-extract",
) -> list:
"""Truncate search result in deep research agent"""
if search_func != "tavily-search" or extract_function != "tavily-extract":
raise NotImplementedError(
"Specific implementation of truncation should be provided.",
)
for i, val in enumerate(res):
res[i]["text"] = truncate_by_words(val["text"])
return res
def generate_structure_output(**kwargs: Any) -> ToolResponse:
"""Generate a structured output tool response.
This function is designed to be used as a tool function for generating
structured outputs. It takes arbitrary keyword arguments and wraps them
in a ToolResponse with metadata.
Args:
**kwargs: Arbitrary keyword arguments that should match the format
of the expected structured output specification.
Returns:
ToolResponse: A tool response object with empty content and the
provided kwargs as metadata.
Note:
The input parameters should be in the same format as the specification
and include as much detail as requested by the calling context.
"""
return ToolResponse(content=[], metadata=kwargs)
def get_dynamic_tool_call_json(data_model_type: Type[BaseModel]) -> list[dict]:
"""Generate JSON schema for dynamic tool calling with a given data model.
Creates a temporary toolkit, registers the structure output function,
and configures it with the specified data model to generate appropriate
JSON schemas for tool calling.
Args:
data_model_type: A Pydantic BaseModel class that defines the expected
structure of the tool output.
Returns:
A list of dictionary that contains the JSON schemas for
the configured tool, suitable for use in API calls that
support structured outputs.
Example:
class MyModel(BaseModel):
name: str
value: int
schema = get_dynamic_tool_call_json(MyModel)
"""
tmp_toolkit = Toolkit()
tmp_toolkit.register_tool_function(generate_structure_output)
tmp_toolkit.set_extended_model(
"generate_structure_output",
data_model_type,
)
return tmp_toolkit.get_json_schemas()
def get_structure_output(blocks: list | Sequence) -> dict:
"""Extract structured output from a sequence of blocks.
Processes a list or sequence of blocks to extract tool use outputs
and combine them into a single dictionary. This is typically used
to parse responses from language models that include tool calls.
Args:
blocks: A list or sequence of blocks that may contain tool use
information. Each block should be a dictionary with 'type'
and 'input' keys for tool use blocks.
Returns:
A dictionary containing the combined input data from all tool
use blocks found in the input sequence.
Example:
blocks = [
{"type": "tool_use", "input": {"name": "test"}},
{"type": "text", "content": "Some text"},
{"type": "tool_use", "input": {"value": 42}}
]
result = get_structure_output(blocks)
# result: {"name": "test", "value": 42}
"""
dict_output = {}
for block in blocks:
if isinstance(block, dict) and block.get("type") == "tool_use":
dict_output.update(block.get("input", {}))
return dict_output
def load_prompt_dict() -> dict:
"""Load prompt into dict"""
prompt_dict = {}
cur_dir = os.path.dirname(os.path.abspath(__file__))
prompt_dict["add_note"] = get_prompt_from_file(
file_path=os.path.join(
cur_dir,
"built_in_prompt/prompt_worker_additional_sys_prompt.md",
),
return_json=False,
)
prompt_dict["tool_use_rule"] = get_prompt_from_file(
file_path=os.path.join(
cur_dir,
"built_in_prompt/prompt_tool_usage_rules.md",
),
return_json=False,
)
prompt_dict["decompose_sys_prompt"] = get_prompt_from_file(
file_path=os.path.join(
cur_dir,
"built_in_prompt/prompt_decompose_subtask.md",
),
return_json=False,
)
prompt_dict["expansion_sys_prompt"] = get_prompt_from_file(
file_path=os.path.join(
cur_dir,
"built_in_prompt/prompt_deeper_expansion.md",
),
return_json=False,
)
prompt_dict["summarize_sys_prompt"] = get_prompt_from_file(
file_path=os.path.join(
cur_dir,
"built_in_prompt/prompt_inprocess_report.md",
),
return_json=False,
)
prompt_dict["reporting_sys_prompt"] = get_prompt_from_file(
file_path=os.path.join(
cur_dir,
"built_in_prompt/prompt_deepresearch_summary_report.md",
),
return_json=False,
)
prompt_dict["reflect_sys_prompt"] = get_prompt_from_file(
file_path=os.path.join(
cur_dir,
"built_in_prompt/prompt_reflect_failure.md",
),
return_json=False,
)
prompt_dict["reasoning_prompt"] = (
"## Current Subtask:\n{objective}\n"
"## Working Plan:\n{meta_planner_agent}\n"
"{knowledge_gap}\n"
"## Research Depth:\n{depth}"
)
prompt_dict["previous_plan_inst"] = (
"## Previous Plan:\n{previous_plan}\n"
"## Current Subtask:\n{objective}\n"
)
prompt_dict["max_depth_hint"] = (
"The search depth has reached the maximum limit. So the "
"current subtask can not be further decomposed and "
"expanded anymore. I need to find another way to get it "
"done no matter what."
)
prompt_dict["expansion_inst"] = (
"Review the web search results and identify whether "
"there is any information that can potentially help address "
"checklist items or fulfill knowledge gaps of the task, "
"but whose content is limited or only briefly mentioned.\n"
"**Task Description:**\n{objective}\n"
"**Checklist:**\n{checklist}\n"
"**Knowledge Gaps:**\n{knowledge_gaps}\n"
"**Search Results:**\n{search_results}\n"
"**Output:**\n"
)
prompt_dict["follow_up_judge_sys_prompt"] = (
"To provide sufficient external information for the user's "
"query, you have conducted a web search to obtain additional "
"data. However, you found that some of the information, while "
"important, was insufficient. Consequently, you extracted the "
"entire content from one of the URLs to gather more "
"comprehensive information. Now, you must rigorously and "
"carefully assess whether, after both the web search and "
"extraction process, the information content is adequate to "
"address the given task. Be aware that any arbitrary decisions "
"may result in unnecessary and unacceptable time costs.\n"
)
prompt_dict[
"retry_hint"
] = "Something went wrong when {state}. I need to retry."
prompt_dict["need_deeper_hint"] = (
"The information is insufficient and I need to make deeper "
"research to fill the knowledge gap."
)
prompt_dict[
"sufficient_hint"
] = "The information after web search and extraction is sufficient!"
prompt_dict["no_result_hint"] = (
"I mistakenly called the `summarize_intermediate_results` tool as "
"there exists no milestone result to summarize now."
)
prompt_dict["summarize_hint"] = (
"Based on your work history above, examine which step in the "
"following working plan has been completed. Mark the completed "
"step with [DONE] at the end of its line (e.g., k. step k [DONE]) "
"and leave the uncompleted steps unchanged. You MUST return only "
"the updated plan, preserving exactly the same format as the "
"original plan. Do not include any explanations, reasoning, "
"or section headers such as '## Working Plan:', just output the "
"updated plan itself."
"\n\n## Working Plan:\n{meta_planner_agent}"
)
prompt_dict["summarize_inst"] = (
"**Task Description:**\n{objective}\n"
"**Knowledge Gaps:**\n{knowledge_gaps}\n"
"**Working Plan:**\n{working_plan}\n"
"**Search Results:**\n{tool_result}"
)
prompt_dict["update_report_hint"] = (
"Due to the overwhelming quantity of information, I have replaced the "
"original bulk search results from the research phase with the "
"following report that consolidates and summarizes the essential "
"findings:\n {intermediate_report}\n\n"
"This report has been saved to {report_path}. "
"I will now **proceed to the next item** in the working plan."
)
prompt_dict["save_report_hint"] = (
"The milestone results of the current item in the working plan "
"are summarized into the following report:\n{intermediate_report}"
)
prompt_dict["reflect_instruction"] = (
"## Work History:\n{conversation_history}\n"
"## Working Plan:\n{meta_planner_agent}\n"
)
prompt_dict["subtask_complete_hint"] = (
"Subtask {cur_obj} is completed. Now the current subtask "
"falls back to '{next_obj}'"
)
return prompt_dict

View File

@@ -0,0 +1,3 @@
DASHSCOPE_API_KEY=''
QUARK_AK=''
QUARK_SK=''

View File

@@ -0,0 +1,125 @@
# DeepSearch Demo of Agentscope-Runtime with Langgraph / Qwen and Quark search
This project is modified from [Gemini Fullstack LangGraph Quickstart](https://github.com/google-gemini/gemini-fullstack-langgraph-quickstart).
It contains the following key features:
1. We use LangGraph to build an agent (directed state graph) with the help of Qwen and Quark search.
2. The agent is wrapped as an Agentscope-Runtime agent and deployed as a service.
3. The interaction with the agent is done through a simple CLI.
Click the following image to watch the video demo:
[![watch_the_video](https://img.alicdn.com/imgextra/i3/6000000000386/O1CN01vDit5y1EipxsRceBd_!!6000000000386-0-tbvideo.jpg)](https://cloud.video.taobao.com/vod/-BhtPfhYZv8pCz7L1vYmKCDtf1QEaDXNX1hMnvj_BUQ.mp4)
<br />
## 🌳 Project Structure
```bash
├── src # Source code directory containing the core functionalities and modules
│ ├── __init__.py # Package initialization script, possibly setting up environment or configurations
│ ├── configuration.py # Module for handling application configurations and settings
│ ├── custom_search_tool.py # Implements custom search functionality or tool
│ ├── graph_openai_compatible.py # Module for OpenAI-compatible graph operations or integrations
│ ├── llm_prompts.py # Contains large language model prompts used in the application
│ ├── llm_utils.py # Utility functions for handling large language model operations
│ ├── main.py # Main entry script to launch or execute the application
│ ├── state.py # Manages application state or data persistence
│ ├── tools_and_schemas.py # Defines various tools and data schemas used by the application
│ └── utils.py # General utility functions used across the application
└── README.md # Project documentation file providing information and usage instructions
```
## Architecture
The architecture of the demo is shown in the following diagram:
```mermaid
graph LR;
subgraph As["AgentScope Runtime"]
F[Agent Engine]
end
subgraph Bs["LangGraph"]
B1[Web Research]
B2[Reflection]
B3[Answer Generation]
B0[Generation Queries]
B0 --> B1
B1 --> B2
B2 --> |if insufficient| B1
B2 --> |if sufficient| B3
end
subgraph Cs["CLI Service"]
C[main] --> S[WebSearchGraph]
end
As --> Bs
user --> |user input| C
S--> F
```
## 📖 Overview
This demo demonstrates how to build a sophisticated research agent using:
- Qwen as the underlying language model
- LangGraph for defining complex agent workflows
- Custom search tools for information retrieval
- State management for multi-step reasoning
The implementation showcases advanced patterns for building agentic systems that can perform deep research tasks through iterative thinking and tool usage.
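The iterative research loop shown in the architecture diagram can be sketched in plain Python (a simplification of the control flow only, not the actual LangGraph implementation; the function name and loop counts are illustrative):

```python
def research_loop(max_loops: int = 2) -> str:
    """Toy version of the generate -> search -> reflect -> answer loop."""
    findings = []
    for loop in range(1, max_loops + 1):
        findings.append(f"results from loop {loop}")  # web research step
        sufficient = len(findings) >= 2               # reflection step
        if sufficient:                                # route: answer, or loop again
            break
    return "; ".join(findings)                        # answer generation step

print(research_loop())
```

In the real graph, the reflection node decides sufficiency with an LLM call and routes back to web research when more evidence is needed.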
## ⚙️ Components
### Core Modules
- `configuration.py`: Configuration management for the agent
- `custom_search_tool.py`: Custom search functionality implementation
- `graph_openai_compatible.py`: LangGraph implementation with OpenAI compatibility
- `llm_prompts.py`: Prompt templates for different agent behaviors
- `llm_utils.py`: Utility functions for LLM interactions
- `main.py`: Main entry point for the application
- `state.py`: State management for the LangGraph workflow
- `tools_and_schemas.py`: Tool definitions and data schemas
- `utils.py`: General utility functions
## 🚀 Getting Started
## Install
Follow these steps to get the application running locally for development and testing.
**1. Prerequisites:**
- Python 3.11+
- Create a file named `.env` by copying the `.env.example` file.
- **`DASHSCOPE_API_KEY`**:
add `DASHSCOPE_API_KEY="YOUR_ACTUAL_API_KEY"` if you use the DashScope API.
- **Quark Search API keys**: add `QUARK_AK=''` and `QUARK_SK=''` to the `.env` file if you use the Quark search API.
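A minimal `.env` (based on `.env.example`; the values below are placeholders to replace with your own keys) might look like:
```bash
DASHSCOPE_API_KEY='YOUR_ACTUAL_API_KEY'
QUARK_AK='your-quark-access-key'
QUARK_SK='your-quark-secret-key'
```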
**2. Install Dependencies:**
```bash
pip install -r requirements.txt
```
## Usage
Start the CLI Service.
```bash
cd src
python main.py
```
After that, you can use the CLI to interact with the agent.
## 🛠️ Features
- `qwen` integration for advanced language understanding
- `langgraph` for complex workflow management
- Custom search tools for information retrieval
- Multi-step reasoning capabilities
- Stateful agent interactions
- Research-focused agent workflows
## Getting help
If you have any questions or find any problems with this demo, please report them through [GitHub issues](https://github.com/your-org/demohouse/issues).
## 📄 License
This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.

View File

@@ -0,0 +1,10 @@
langgraph>=0.2.6
langchain>=0.3.19
python-dotenv>=1.0.1
langgraph-sdk>=0.1.57
langgraph-cli>=0.4.4
langgraph-api>=0.4.43
dashscope>=1.24.6
openai>=2.4.0
pandas>=2.3.3
agentscope-runtime>=0.1.5

View File

@@ -0,0 +1,78 @@
# -*- coding: utf-8 -*-
import os
from typing import Any, Optional
from langchain_core.runnables import RunnableConfig
from pydantic import BaseModel, Field
class Configuration(BaseModel):
"""The configuration for the agent."""
query_generator_model: str = Field(
default="qwen-max-latest",
metadata={
"description": "The name of the language model to use for "
"the agent's query generation.",
},
)
query_generator_param: dict = Field(
default={"temperature": 0.3, "stream": False},
)
reflection_model: str = Field(
default="qwen-plus-latest",
metadata={
"description": "The name of the language model to use for"
" the agent's reflection.",
},
)
reflection_param: dict = Field(
default={"temperature": 0.3, "stream": False},
)
answer_model: str = Field(
default="qwen-plus-latest",
metadata={
"description": "The name of the language model to use "
"for the agent's answer.",
},
)
answer_param: dict = Field(default={"temperature": 0.3, "stream": False})
num_of_init_q: int = Field(
default=3,
metadata={
"description": "The number of initial search queries to generate.",
},
)
max_research_loops: int = Field(
default=2,
metadata={
"description": "The maximum number of research loops to perform.",
},
)
@classmethod
def from_runnable_config(
cls,
config: Optional[RunnableConfig] = None,
) -> "Configuration":
"""Create a Configuration instance from a RunnableConfig."""
configurable = (
config["configurable"]
if config and "configurable" in config
else {}
)
# Get raw values from environment or config
raw_values: dict[str, Any] = {
name: os.environ.get(name.upper(), configurable.get(name))
for name in cls.model_fields.keys()
}
# Filter out None values
values = {k: v for k, v in raw_values.items() if v is not None}
return cls(**values)

View File

@@ -0,0 +1,123 @@
# -*- coding: utf-8 -*-
import os
import random
import string
import time
import uuid
from base64 import b64encode
from hashlib import sha256
from hmac import new as hmac_new
from typing import Any, Dict, List
import requests
from utils import format_time
class CustomSearchTool:
def __init__(self, search_engine: str = "quark"):
assert search_engine in ["quark"]
self.search_engine = search_engine
if self.search_engine == "quark":
self.search_func = self._quark_search
else:
raise NotImplementedError
def search(
self,
query: str,
) -> List[Dict[str, Any]]:
"""
Execute a search and return the results.
:param query: the search query string
:return: a list of search result dictionaries
"""
return self.search_func(query)
def search_quark_to_b_signature(self, user_name, timestamp, salt: str, sk):
"""
Compute the request signature.
:param user_name: username
:param timestamp: timestamp in milliseconds
:param salt: random salt string
:param sk: secret key
:return: base64-encoded HMAC-SHA256 signature
"""
data = f"{user_name}{timestamp}{salt}{sk}"
hashed = hmac_new(sk.encode("utf-8"), data.encode("utf-8"), sha256)
return b64encode(hashed.digest()).decode("utf-8")
def search_quark_to_b_gen_token(self, user_name: str, sk: str):
"""
Obtain an access token.
:param user_name: username (the access key)
:param sk: secret key
:return: bearer token string
"""
timestamp = str(int(time.time() * 1000))
salt = "".join(random.choice(string.ascii_lowercase) for _ in range(6))
sign = self.search_quark_to_b_signature(user_name, timestamp, salt, sk)
postBody = {
"userName": user_name,
"timestamp": timestamp,
"salt": salt,
"sign": sign,
}
url = "https://zx-dsc.sm.cn/api/auth/token"
headers = {"content-type": "application/json"}
response = requests.post(url, json=postBody, headers=headers)
data = response.json()
token = data["result"]["token"]
return token
def _quark_search(self, query: str):
ak = os.getenv("QUARK_AK", "")
sk = os.getenv("QUARK_SK", "")
token = self.search_quark_to_b_gen_token(ak, sk)
url = "https://zx-dsc.sm.cn/api/resource/s_agg/ex/query"
querystring = {
"page": "1",
"q": query,
}
request_id = str(uuid.uuid4())
headers = {
"Authorization": f"Bearer {token}",
"request-id": request_id,
}
try:
response = requests.get(url, headers=headers, params=querystring)
if response.status_code == 200:
data = response.json()
if (
data.get("items", {}).get("@attributes", {}).get("status")
== "OK"
and data.get(
"items",
)
and data.get("items", {}).get("item")
):
items = data.get("items").get("item")
formatted_items = []
for item in items:
formatted_items.append(
{
"title": item["title"],
"url": item["url"],
"snippet": item["desc"],
"content": item["MainBody"],
"publish_date": format_time(item.get("time")),
"site_name": item.get("site_name", ""),
},
)
return formatted_items
else:
return []
else:
return []
except Exception as e:
print(f"Quark search failed: {e}")
return []

View File

@@ -0,0 +1,526 @@
# -*- coding: utf-8 -*-
import asyncio
import json
import os
import time
from typing import Any, Dict, List, Optional
from agentscope_runtime.engine.agents.langgraph_agent import LangGraphAgent
from agentscope_runtime.engine.helpers.helper import simple_call_agent_direct
from configuration import Configuration
from custom_search_tool import CustomSearchTool
from dotenv import load_dotenv
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.runnables import RunnableConfig
from langgraph.graph import END, START, StateGraph
from langgraph.types import Send
from llm_prompts import (
answer_instructions,
query_writer_instructions,
reflection_instructions,
web_searcher_instructions,
)
from llm_utils import call_dashscope, extract_json_from_qwen
from state import (
OverallState,
QueryGenerationState,
ReflectionState,
WebSearchState,
)
from utils import (
custom_get_citations,
custom_resolve_urls,
get_current_date,
get_research_topic,
insert_citation_markers,
)
load_dotenv("../.env")
if os.getenv("DASHSCOPE_API_KEY") is None:
raise ValueError("DASHSCOPE_API_KEY is not set")
def format_search_results(search_results: List[Dict[str, Any]]) -> str:
"""
Convert the search results
:param search_results:
:return:
"""
formatted_results = []
for i, result in enumerate(search_results, 1):
formatted_result = f"""
Result Number {i}:
Title: {result.get('title', 'N/A')}
Label: {result.get('site_name', 'N/A')}
URL: {result.get('url', 'N/A')}
Snippet: {result.get('snippet', 'N/A')}
Publish Date: {result.get('publish_date', 'N/A')}
---
"""
formatted_results.append(formatted_result)
return "\n".join(formatted_results)
class WebSearchGraph:
def __init__(
self,
config: RunnableConfig,
call_llm_func,
search_tool: CustomSearchTool,
):
self.configurable = Configuration.from_runnable_config(config)
self.call_llm_func = call_llm_func
self.search_tool = search_tool
self.input_tokens = 0
self.output_tokens = 0
self.total_tokens = 0
self.max_retries = 3
self.retry_delay = 2
self.current_date = get_current_date()
def get_chat_completion(self, **args):
completion = self.call_llm_func(**args)
self.input_tokens += completion.usage.prompt_tokens
self.output_tokens += completion.usage.completion_tokens
self.total_tokens += completion.usage.total_tokens
return completion.choices[0].message.content
def generate_query(self, state: OverallState) -> QueryGenerationState:
"""LangGraph node that generates search queries
based on the User's question.
Uses Qwen-Max to create optimized search queries
for web research based on the User's question.
Args:
state: Current graph state containing the User's question
config: Configuration for the runnable,
including LLM provider settings
Returns:
Dictionary with state update,
including search_query key containing the
generated queries
"""
# check for custom initial search query count
if state.get("initial_search_query_count") is None:
state[
"initial_search_query_count"
] = self.configurable.num_of_init_q
# Format the prompt
formatted_prompt = query_writer_instructions.format(
current_date=self.current_date,
research_topic=get_research_topic(state["messages"]),
number_queries=state["initial_search_query_count"],
)
param = {
"model": self.configurable.query_generator_model,
"messages": [{"role": "user", "content": formatted_prompt}],
**self.configurable.query_generator_param,
}
for attempt in range(self.max_retries):
try:
result = self.get_chat_completion(**param)
result = extract_json_from_qwen(result)
result = json.loads(result)
query = result.get("query")
if isinstance(query, str):
query = [query]
assert isinstance(query, list)
break
except Exception as e:
print(
f"Error occurred when generating search query (attempt"
f" {attempt + 1}/{self.max_retries}): {e}.",
)
if attempt == self.max_retries - 1: # Last attempt failed
query = [get_research_topic(state["messages"])]
break
time.sleep(self.retry_delay)
return {"search_query": query}
def continue_to_web_research(self, state: QueryGenerationState):
"""LangGraph node that sends the
search queries to the web research node.
This is used to spawn n number
of web research nodes, one for each search query.
"""
return [
Send(
"web_research",
{"search_query": search_query, "id": str(idx)},
)
for idx, search_query in enumerate(state["search_query"])
]
def web_research(self, state: WebSearchState):
"""LangGraph node that performs web research using the native Google
Search API tool.
Executes a web search using the native Google Search API tool in
combination with Gemini 2.0 Flash.
Args:
state: Current graph state containing the
search query and research loop count
config: Configuration for the runnable,
including search API settings
Returns:
Dictionary with state update,
including sources_gathered, research_loop_count,
and web_research_results
"""
search_results = self.search_tool.search(
state["search_query"],
)
search_context = format_search_results(search_results)
formatted_prompt = (
web_searcher_instructions.format(
current_date=self.current_date,
research_topic=state["search_query"],
)
+ f"\n\nSearch Result:\n{search_context}"
)
param = {
"model": self.configurable.query_generator_model,
"messages": [{"role": "user", "content": formatted_prompt}],
**self.configurable.query_generator_param,
}
sources_gathered = []
for result in search_results:
url = result.get("url")
if url:
sources_gathered.append(
{
"label": result.get("site_name"),
"short_url": url,
"value": url,
},
)
for attempt in range(self.max_retries):
try:
result = self.get_chat_completion(**param)
resolved_urls = custom_resolve_urls(
search_results,
state["id"],
)
citations = custom_get_citations(search_results, resolved_urls)
modified_text = insert_citation_markers(result, citations)
return {
"sources_gathered": sources_gathered,
"search_query": [state["search_query"]],
"web_research_result": [modified_text],
}
except Exception as e:
print(
f"Error occurred when web search query: "
f"`{state['search_query']}` "
f"(attempt {attempt + 1}/{self.max_retries}): {e}.",
)
summary = (
f"{len(search_results)} related results are found "
f"about search query '{state['search_query']}'"
)
if attempt == self.max_retries - 1:
return {
"sources_gathered": sources_gathered,
"search_query": [state["search_query"]],
"web_research_result": [summary],
}
time.sleep(self.retry_delay)
return None
def reflection(self, state: OverallState) -> Optional[ReflectionState]:
"""LangGraph node that identifies knowledge gaps and generates
potential follow-up queries.
Analyzes the current summary to identify areas for further
research and generates
potential follow-up queries. Uses structured output to extract
the follow-up query in JSON format.
Args:
state: Current graph state containing the running summary
and research topic
config: Configuration for the runnable, including LLM
provider settings
Returns:
Dictionary with state update, including search_query key
containing the generated follow-up query
"""
state["research_loop_count"] = state.get("research_loop_count", 0) + 1
reasoning_model = self.configurable.reflection_model
# Format the prompt
formatted_prompt = reflection_instructions.format(
current_date=self.current_date,
research_topic=get_research_topic(state["messages"]),
summaries="\n\n---\n\n".join(state["web_research_result"]),
)
param = {
"model": reasoning_model,
"messages": [{"role": "user", "content": formatted_prompt}],
**self.configurable.reflection_param,
}
for attempt in range(self.max_retries):
try:
result = self.get_chat_completion(**param)
result = extract_json_from_qwen(result)
result = json.loads(result)
is_sufficient = result.get("is_sufficient", True)
knowledge_gap = result.get("knowledge_gap", "")
follow_up_queries = result.get("follow_up_queries", [])
assert isinstance(follow_up_queries, list)
return {
"is_sufficient": is_sufficient,
"knowledge_gap": knowledge_gap,
"follow_up_queries": follow_up_queries,
"research_loop_count": state["research_loop_count"],
"number_of_ran_queries": len(state["search_query"]),
}
except Exception as e:
print(
f"Error occurred when reflection (attempt {attempt + 1}"
f"/{self.max_retries}): {e}.",
)
if attempt == self.max_retries - 1: # Last attempt failed
return {
"is_sufficient": True,
"knowledge_gap": "",
"follow_up_queries": [],
"research_loop_count": state["research_loop_count"],
"number_of_ran_queries": len(state["search_query"]),
}
time.sleep(self.retry_delay)
return None
def evaluate_research(
self,
state: ReflectionState,
config: RunnableConfig,
):
"""LangGraph routing function that determines the next step in the
research flow.
Controls the research loop by deciding whether to continue gathering
information
or to finalize the summary based on the configured maximum number of
research loops.
Args:
state: Current graph state containing the research loop count
config: Configuration for the runnable, including
max_research_loops setting
Returns:
String literal indicating the next node to visit ("web_research"
or "finalize_summary")
"""
configurable = Configuration.from_runnable_config(config)
max_research_loops = (
state.get("max_research_loops")
if state.get("max_research_loops") is not None
else configurable.max_research_loops
)
if (
state["is_sufficient"]
or state["research_loop_count"] >= max_research_loops
):
return "finalize_answer"
else:
return [
Send(
"web_research",
{
"search_query": follow_up_query,
"id": state["number_of_ran_queries"] + int(idx),
},
)
for idx, follow_up_query in enumerate(
state["follow_up_queries"],
)
]
def finalize_answer(self, state: OverallState):
"""LangGraph node that finalizes the research summary.
Prepares the final output by deduplicating and formatting sources, then
combining them with the running summary to create a well-structured
research report with proper citations.
Args:
state: Current graph state containing the running summary
and sources gathered
Returns:
Dictionary with state update, including running_summary
key containing
the formatted final summary with sources
"""
answer_model = self.configurable.answer_model
formatted_prompt = answer_instructions.format(
current_date=self.current_date,
research_topic=get_research_topic(state["messages"]),
summaries="\n---\n\n".join(state["web_research_result"]),
)
param = {
"model": answer_model,
"messages": [{"role": "user", "content": formatted_prompt}],
**self.configurable.answer_param,
}
for attempt in range(self.max_retries):
try:
result = self.get_chat_completion(**param)
unique_sources = []
for source in state["sources_gathered"]:
if source["short_url"] in result:
result = result.replace(
source["short_url"],
source["value"],
)
unique_sources.append(source)
return {
"messages": [AIMessage(content=result)],
"sources_gathered": unique_sources,
}
except Exception as e:
print(
f"Error occurred when generating answer (attempt "
f"{attempt + 1}/{self.max_retries}): {e}.",
)
if attempt == self.max_retries - 1:
return {
"messages": [
AIMessage(
content=f"Error occurred"
f" when generating answer. {e}",
),
],
"sources_gathered": [],
}
time.sleep(self.retry_delay)
return None
async def run(self, user_question: str):
# Create our Agent Graph
builder = StateGraph(OverallState, config_schema=Configuration)
# Define the nodes we will cycle between
builder.add_node("generate_query", self.generate_query)
builder.add_node("web_research", self.web_research)
builder.add_node("reflection", self.reflection)
builder.add_node("finalize_answer", self.finalize_answer)
# Set the entrypoint as `generate_query`
# This means that this node is the first one called
builder.add_edge(START, "generate_query")
# Add conditional edge to continue with search queries in a
# parallel branch
builder.add_conditional_edges(
"generate_query",
self.continue_to_web_research,
["web_research"],
)
# Reflect on the web research
builder.add_edge("web_research", "reflection")
# Evaluate the research
builder.add_conditional_edges(
"reflection",
self.evaluate_research,
["web_research", "finalize_answer"],
)
# Finalize the answer
builder.add_edge("finalize_answer", END)
compiled_graph = builder.compile(name="pro-search-agent")
def human_ai_message_to_dict(obj):
if isinstance(obj, (HumanMessage, AIMessage)):
return {
"sender": obj.type,
"content": obj.content,
}
raise TypeError(
f"Object of type {obj.__class__.__name__} is"
f" not JSON serializable",
)
def state_folder(messages):
if len(messages) > 0:
return json.loads(messages[0]["content"])
else:
return []
def state_unfolder(state):
state_jsons = json.dumps(state, default=human_ai_message_to_dict)
return state_jsons
langgraph_agent = LangGraphAgent(
compiled_graph,
state_folder,
state_unfolder,
)
input_state = {
"messages": [{"role": "user", "content": user_question}],
"max_research_loops": self.configurable.max_research_loops,
"initial_search_query_count": self.configurable.num_of_init_q,
}
input_json = json.dumps(input_state)
all_result = await simple_call_agent_direct(
langgraph_agent,
input_json,
)
state = json.loads(all_result)
return state["messages"][-1]["content"]
async def main():
custom_search_tool = CustomSearchTool(search_engine="quark")
graph = WebSearchGraph(
json.loads(Configuration().model_dump_json()),
call_dashscope,
custom_search_tool,
)
print(
"""Type in your question or q to quit.""",
)
user_input = input(">").strip()
while user_input != "q":
question = user_input
item = await graph.run(question)
print(item, end="", flush=True)
print("\n")
user_input = input(">")
if __name__ == "__main__":
asyncio.run(main())


@@ -0,0 +1,134 @@
# -*- coding: utf-8 -*-
query_writer_instructions = """Your goal is to generate sophisticated and
diverse web search queries.
These queries are intended for an advanced automated web research tool capable
of analyzing complex results,
following links, and synthesizing information.
Instructions:
- Always prefer a single search query, only add another query if the original
question requests multiple aspects or elements and one query is not enough.
- Each query should focus on one specific aspect of the original question.
- Don't produce more than {number_queries} queries.
- Queries should be diverse, if the topic is broad, generate more than 1 query.
- Don't generate multiple similar queries, 1 is enough.
- Query should ensure that the most current information is gathered. The
current date is {current_date}.
Format:
- Format your response as a JSON object with both of these exact keys:
- "rationale": str, A brief explanation of why these queries are relevant
to the research topic.
- "query": list[str], A list of search queries to be used for web research.
Example:
Topic: What revenue grew more last year apple stock or the number of people
buying an iphone
```json
{{
"rationale": "To answer this comparative growth question accurately,
we need specific data points on Apple's stock performance and iPhone
sales metrics. These queries target the precise financial information
needed: company revenue trends, product-specific unit sales figures,
and stock price movement over the same fiscal period for
direct comparison.",
"query": ["Apple total revenue growth fiscal year 2024", "iPhone unit sales
growth fiscal
year 2024", "Apple stock price growth fiscal year 2024"],
}}
```
Context: {research_topic}"""
web_searcher_instructions = """Conduct targeted Google Searches to gather the
most recent, credible
information on "{research_topic}" and synthesize it into a verifiable text
artifact.
Instructions:
- Query should ensure that the most current information is gathered. The
current date is {current_date}.
- Conduct multiple, diverse searches to gather comprehensive information.
- Consolidate key findings while meticulously tracking the source(s) for each
specific piece of information.
- The output should be a well-written summary or report based on your search
findings.
- Only include the information found in the search results, don't make up any
information.
Research Topic:
{research_topic}
"""
reflection_instructions = """You are an expert research assistant analyzing
summaries about "{research_topic}".
Instructions:
- Identify knowledge gaps or areas that need deeper exploration and generate a
follow-up query. (1 or multiple).
- If provided summaries are sufficient to answer the user's question, don't
generate a follow-up query.
- If there is a knowledge gap, generate a follow-up query that would help
expand your understanding.
- Focus on technical details, implementation specifics, or emerging trends
that weren't fully covered.
Requirements:
- Ensure the follow-up query is self-contained and includes necessary context
for web search.
Output Format:
- Format your response as a JSON object with these exact keys:
- "is_sufficient": true or false. Whether the provided summaries are
sufficient to answer the user's question.
- "knowledge_gap": str, A description of what information is missing or
needs clarification.
- "follow_up_queries": list, A list of follow-up queries to address the
knowledge gap.
Example:
```json
{{
"is_sufficient": true, // or false
"knowledge_gap": "The summary lacks information about performance metrics
and benchmarks", //
"" if is_sufficient is true
"follow_up_queries": ["What are typical performance benchmarks and metrics
used to evaluate
[specific technology]?"]
// [] if is_sufficient is true
}}
```
Reflect carefully on the Summaries to identify knowledge gaps and produce a
follow-up query.
Then, produce your output following this JSON format:
Summaries:
{summaries}
"""
answer_instructions = """Generate a high-quality answer to the user's question
based on the provided summaries.
Instructions:
- The current date is {current_date}.
- You are the final step of a multi-step research process, don't mention that
you are the final step.
- You have access to all the information gathered from the previous steps.
- You have access to the user's question.
- Generate a high-quality answer to the user's question based on the provided
summaries
and the user's question.
- Include the sources you used from the Summaries in the answer correctly,
use markdown format. THIS IS A MUST.
User Context:
- {research_topic}
Summaries:
{summaries}"""


@@ -0,0 +1,177 @@
# -*- coding: utf-8 -*-
import json
import os
import re
from collections import defaultdict
from datetime import datetime
from typing import Any, Dict, Iterator, List, Optional
from openai import OpenAI
from openai.types.chat.chat_completion import (
ChatCompletion,
ChatCompletionMessage,
Choice,
)
from openai.types.chat.chat_completion_message_tool_call import (
ChatCompletionMessageToolCall,
Function,
)
def extract_json_from_qwen(qwen_result: str) -> str:
json_str = ""
pattern = r"```json(.*?)```"
code_snippets = re.findall(pattern, qwen_result, re.DOTALL)
if len(code_snippets) > 0:
json_str = code_snippets[-1].strip()
return json_str
def call_dashscope(**args: Any) -> ChatCompletion:
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
**args,
)
stream = args.get("stream", False)
if stream:
try:
completion = postprocess_completion(completion)
return completion
except Exception as e:
print(
f"Error occurred when postprocess_completion on "
f"'stream=True'. {e}",
)
default_message = ChatCompletionMessage(
role="assistant",
content="Error in calling LLM", # 默认内容
)
default_choice = Choice(
finish_reason="stop",
index=0,
logprobs=None,
message=default_message,
)
default_chat_completion = ChatCompletion(
id="chatcmpl-1234567890",
choices=[default_choice],
created=int(datetime.now().timestamp()),
model=args["model"],
object="chat.completion",
service_tier="default",
system_fingerprint=None,
usage=None,
)
return default_chat_completion
return completion
def merge_fields(target: Dict[str, Any], source: Dict[str, Any]) -> None:
for key, value in source.items():
if isinstance(value, str):
target[key] = target.get(key, "") + value
elif value is not None and isinstance(value, dict):
merge_fields(target[key], value)
def merge_chunk(final_response: Dict[str, Any], delta: Dict[str, Any]) -> None:
delta.pop("role", None)
merge_fields(final_response, delta)
tool_calls = delta.get("tool_calls")
if tool_calls and len(tool_calls) > 0:
index = int(tool_calls[0].pop("index")) # Convert index to integer
if "tool_calls" not in final_response:
final_response["tool_calls"] = {}
final_response["tool_calls"][index] = final_response["tool_calls"].get(
index,
{},
)
final_response["tool_calls"][index].pop("type", None)
merge_fields(final_response["tool_calls"][index], tool_calls[0])
def postprocess_completion(completion: Iterator) -> ChatCompletion:
message: Dict[str, Any] = {
"content": "",
"role": "assistant",
"function_call": None,
"tool_calls": defaultdict(
lambda: {
"function": {"arguments": "", "name": ""},
"id": "",
"type": "",
},
),
"reasoning_content": "",
"refusal": "",
}
last_chunk: Optional[Any] = None
for chunk in completion:
try:
delta = json.loads(chunk.choices[0].delta.json())
except json.JSONDecodeError as e:
print(f"Error decoding JSON from chunk: {e}")
continue
delta.pop("role", None)
merge_chunk(message, delta)
finish_reason = chunk.choices[0].finish_reason
logprobs = chunk.choices[0].logprobs
last_chunk = chunk
# Explicitly declare the type
tool_calls_list: List[Dict[str, Any]] = list(
message.get("tool_calls", {}).values(),
)
message["tool_calls"] = tool_calls_list
tool_calls = None
if message["tool_calls"]:
tool_calls = []
for tool_call in message["tool_calls"]: # 类型已明确为 Dict
function = Function(
arguments=tool_call["function"]["arguments"],
name=tool_call["function"]["name"],
)
tool_call_object = ChatCompletionMessageToolCall(
id=tool_call["id"],
function=function,
type=tool_call["type"],
)
tool_calls.append(tool_call_object)
chat_message = ChatCompletionMessage(
content=message["content"],
role=message["role"],
function_call=message["function_call"],
tool_calls=tool_calls,
reasoning_content=message["reasoning_content"],
refusal=message["refusal"],
)
choices = [
Choice(
finish_reason=finish_reason,
index=0,
message=chat_message,
logprobs=logprobs,
),
]
completion = ChatCompletion(
id=last_chunk.id,
choices=choices,
created=last_chunk.created,
model=last_chunk.model,
object="chat.completion",
service_tier=last_chunk.service_tier,
system_fingerprint=last_chunk.system_fingerprint,
usage=last_chunk.usage,
)
return completion


@@ -0,0 +1,29 @@
# -*- coding: utf-8 -*-
import asyncio
import json
from qwen_langgraph_search.src.configuration import Configuration
from qwen_langgraph_search.src.custom_search_tool import CustomSearchTool
from qwen_langgraph_search.src.graph_openai_compatible import WebSearchGraph
from qwen_langgraph_search.src.llm_utils import call_dashscope
if __name__ == "__main__":
custom_search_tool = CustomSearchTool(search_engine="quark")
graph = WebSearchGraph(
json.loads(Configuration().model_dump_json()),
call_dashscope,
custom_search_tool,
)
user_input = input("Type in your question or press q to quit\n")
while user_input != "q":
question = user_input
try:
res = asyncio.run(graph.run(question))
print(res)
except Exception as e:
print(f"An error occurred: {e}")
user_input = input("Type in your question or press q to quit\n")


@@ -0,0 +1,47 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
import operator
from dataclasses import dataclass, field
from typing import Optional, TypedDict
from langgraph.graph import add_messages
from typing_extensions import Annotated
class OverallState(TypedDict):
messages: Annotated[list, add_messages]
search_query: Annotated[list, operator.add]
web_research_result: Annotated[list, operator.add]
sources_gathered: Annotated[list, operator.add]
initial_search_query_count: int
max_research_loops: int
research_loop_count: int
reasoning_model: str
class ReflectionState(TypedDict):
is_sufficient: bool
knowledge_gap: str
follow_up_queries: Annotated[list, operator.add]
research_loop_count: int
number_of_ran_queries: int
class Query(TypedDict):
query: str
rationale: str
class QueryGenerationState(TypedDict):
search_query: list[Query]
class WebSearchState(TypedDict):
search_query: str
id: str
@dataclass(kw_only=True)
class SearchStateOutput:
running_summary: Optional[str] = field(default=None) # Final report


@@ -0,0 +1,29 @@
# -*- coding: utf-8 -*-
from typing import List
from pydantic import BaseModel, Field
class SearchQueryList(BaseModel):
query: List[str] = Field(
description="A list of search queries to be used for web research.",
)
rationale: str = Field(
description="A brief explanation of why these queries are relevant "
"to the research topic.",
)
class Reflection(BaseModel):
is_sufficient: bool = Field(
description="Whether the provided summaries are sufficient to answer "
"the user's question.",
)
knowledge_gap: str = Field(
description="A description of what information is missing or needs "
"clarification.",
)
follow_up_queries: List[str] = Field(
description="A list of follow-up queries to address the knowledge "
"gap.",
)


@@ -0,0 +1,129 @@
# -*- coding: utf-8 -*-
import time
from datetime import datetime
from typing import Any, Dict, List
from langchain_core.messages import AIMessage, AnyMessage, HumanMessage
def get_current_date() -> str:
return datetime.now().strftime("%B %d, %Y")
def format_time(timestamp_param: str, format_str: str = "%Y-%m-%d") -> str:
if not timestamp_param or not timestamp_param.isnumeric():
return ""
try:
timestamp = int(timestamp_param)
return time.strftime(format_str, time.localtime(timestamp))
except (ValueError, OverflowError, OSError):
return ""
def get_research_topic(messages: List[AnyMessage]) -> str:
"""
Get the research topic from the messages.
"""
# check if request has a history and combine the messages
# into a single string
if len(messages) == 1:
research_topic = messages[-1].content
else:
research_topic = ""
for message in messages:
if isinstance(message, HumanMessage):
research_topic += f"User: {message.content}\n"
elif isinstance(message, AIMessage):
research_topic += f"Assistant: {message.content}\n"
return research_topic
def insert_citation_markers(text: str, citations_list: List[Dict]) -> str:
"""
Inserts citation markers into a text string based on start and end indices.
Args:
text (str): The original text string.
citations_list (list): A list of dictionaries, where each dictionary
contains 'start_index', 'end_index', and
'segment_string' (the marker to insert).
Indices are assumed to be for the original text.
Returns:
str: The text with citation markers inserted.
"""
# Sort citations by end_index in descending order.
# If end_index is the same, secondary sort by start_index descending.
# This ensures that insertions at the end of the string don't affect
# the indices of earlier parts of the string that still
# need to be processed.
sorted_citations = sorted(
citations_list,
key=lambda c: (c["end_index"], c["start_index"]),
reverse=True,
)
modified_text = text
for citation_info in sorted_citations:
# These indices refer to positions in the *original* text,
# but since we iterate from the end, they remain valid for insertion
# relative to the parts of the string already processed.
end_idx = citation_info["end_index"]
marker_to_insert = ""
for segment in citation_info["segments"]:
marker_to_insert += (
f" [{segment['label']}]({segment['short_url']})"
)
# Insert the citation marker at the original end_idx position
modified_text = (
modified_text[:end_idx]
+ marker_to_insert
+ modified_text[end_idx:]
)
return modified_text
def custom_resolve_urls(
search_results: List[Dict[str, Any]],
uid: str,
) -> Dict[str, str]:
prefix = "https://search-result.local/id/"
resolved_map = {}
for idx, result in enumerate(search_results):
url = result.get("url", "")
if url and url not in resolved_map:
resolved_map[url] = f"{prefix}{uid}-{idx}"
return resolved_map
def custom_get_citations(
search_results: List[Dict[str, Any]],
resolved_urls_map: Dict[str, str],
) -> List[Dict[str, Any]]:
citations = []
for idx, result in enumerate(search_results):
url = result.get("url", "")
title = result.get("title", f"搜索结果 {idx + 1}")
if url:
citation = {
"start_index": 0, # 简化处理,实际应用中可以更精确
"end_index": len(title),
"segments": [
{
"label": title[:50] + "..."
if len(title) > 50
else title,
"short_url": resolved_urls_map.get(url, url),
"value": url,
},
],
}
citations.append(citation)
return citations


@@ -0,0 +1,18 @@
# ACEBench Example
This is an example of agent-oriented evaluation in AgentScope.
We take [ACEBench](https://github.com/ACEBench/ACEBench) as an example benchmark and run
a ReAct agent with a [Ray](https://github.com/ray-project/ray)-based evaluator, which supports
**distributed** and **parallel** evaluation.
To run the example, you need to install AgentScope first, and then run the evaluation with the following command:
```bash
python main.py --data_dir {data_dir} --result_dir {result_dir}
```
## Further Reading
- [ACEBench](https://github.com/ACEBench/ACEBench)
- [Ray](https://github.com/ray-project/ray)


@@ -0,0 +1,132 @@
# -*- coding: utf-8 -*-
"""Example of running ACEBench evaluation with AgentScope."""
import asyncio
import os
from argparse import ArgumentParser
from typing import Callable
from agentscope.agent import ReActAgent
from agentscope.evaluate import (
ACEBenchmark,
ACEPhone,
FileEvaluatorStorage,
RayEvaluator,
SolutionOutput,
Task,
)
from agentscope.formatter import DashScopeChatFormatter
from agentscope.message import Msg
from agentscope.model import DashScopeChatModel
from agentscope.tool import Toolkit
async def react_agent_solution(
ace_task: Task,
pre_hook: Callable,
) -> SolutionOutput:
"""Run ReAct agent with the given task in ACEBench.
Args:
ace_task (`Task`):
Task to run in ACEBench.
pre_hook (Callable):
The pre-hook function to save the agent's pre-print messages.
"""
# Equip tool functions
toolkit = Toolkit()
for tool, json_schema in ace_task.metadata["tools"]:
# register the tool function with the given json schema
toolkit.register_tool_function(tool, json_schema=json_schema)
# Create a ReAct agent
agent = ReActAgent(
name="Friday",
sys_prompt="You are a helpful assistant named Friday. "
"Your target is to solve the given task with your tools."
"Try to solve the task as best as you can.",
model=DashScopeChatModel(
api_key=os.environ.get("DASHSCOPE_API_KEY"),
model_name="qwen-max",
stream=False,
),
formatter=DashScopeChatFormatter(),
toolkit=toolkit,
)
agent.register_instance_hook(
"pre_print",
"save_logging",
pre_hook,
)
# Execute the agent to solve the task
msg_input = Msg("user", ace_task.input, role="user")
# Print the input by the running agent to call the pre-print hook
await agent.print(msg_input)
await agent(msg_input)
# Obtain tool calls sequence
memory_msgs = await agent.memory.get_memory()
# Obtain tool_use blocks as trajectory
traj = []
for msg in memory_msgs:
traj.extend(msg.get_content_blocks("tool_use"))
# Obtain the final state of the phone and travel system
phone: ACEPhone = ace_task.metadata["phone"]
final_state = phone.get_current_state()
# Wrap into a SolutionOutput
solution = SolutionOutput(
success=True,
output=final_state,
trajectory=traj,
)
return solution
async def main() -> None:
"""Main function for running ACEBench."""
# Prepare data and results directories
parser = ArgumentParser()
parser.add_argument(
"--data_dir",
type=str,
required=True,
help="Where to save the dataset.",
)
parser.add_argument(
"--result_dir",
type=str,
required=True,
help="Where to save the evaluation results.",
)
parser.add_argument(
"--n_workers",
type=int,
default=1,
help="The number of ray workers to use for evaluation.",
)
args = parser.parse_args()
# Create the evaluator
# (or GeneralEvaluator, which is more suitable for local debugging)
evaluator = RayEvaluator(
name="ACEbench evaluation",
benchmark=ACEBenchmark(
data_dir=args.data_dir,
),
# How many times to repeat the evaluation
n_repeat=1,
storage=FileEvaluatorStorage(
save_dir=args.result_dir,
),
# How many workers to use
n_workers=args.n_workers,
)
# Run the evaluation
await evaluator.run(react_agent_solution)
if __name__ == "__main__":
asyncio.run(main())


@@ -0,0 +1 @@
agentscope[full]>=1.0.5



@@ -0,0 +1,325 @@
# Mem0LongTermMemory
## Overview
**Note**: We are working on merging Mem0LongTermMemory into the main AgentScope repository.
Mem0LongTermMemory is a long-term memory implementation built on top of the mem0 library, designed to provide persistent, semantic memory storage for AgentScope agents. It enables agents to record, store, and retrieve conversation history, reasoning processes, and contextual information across sessions, supporting advanced memory management and knowledge retention.
This example demonstrates how to use Mem0LongTermMemory to create persistent memory systems that can store and retrieve information based on semantic similarity, enabling agents to maintain context and learn from past interactions.
## Core Features
### Persistent Memory Storage
- **Vector-based Storage**: Uses Qdrant vector database for efficient semantic search and retrieval
- **Configurable Backends**: Support for multiple embedding models (OpenAI, DashScope) and vector stores
- **Async Operations**: Full async support for non-blocking memory operations
### Semantic Memory Management
- **Content Recording**: Store conversation messages, tool usage, and reasoning processes
- **Thinking Integration**: Record agent thinking processes alongside content for better context
- **Flexible Input Formats**: Support for strings, Msg objects, and dictionaries
### Agent Integration
- **Direct AgentScope Integration**: Seamless integration with AgentScope's ReActAgent
- **Memory Modes**: Support for agent_control, dev_control, and both modes
- **Tool Response Format**: Returns structured ToolResponse objects for easy integration
## File Structure
```
memory_by_mem0/
├── README.md # This documentation file
├── long_term_memory_by_mem0.py # Core Mem0LongTermMemory implementation
├── memory_example.py # Standalone examples demonstrating memory operations
├── conversation_agent_with_longterm_mem.py # Interactive conversation example with ReActAgent
└── utils.py # AgentScope integration utilities for mem0
```
## Prerequisites
### Clone the AgentScope Repository
This example depends on AgentScope. Please clone the full repository.
### Install Dependencies
**Recommended**: Python 3.10+
Install the following dependencies:
```bash
pip install mem0ai
```
### API Keys
By default, the example uses DashScope/OpenAI for embedding and LLM. Set your API key:
```bash
export DASHSCOPE_API_KEY='YOUR_API_KEY'
export DASHSCOPE_API_BASE_URL='YOUR_API_BASE_URL'
export DASHSCOPE_MODEL_4_MEMORY='USED_MODEL_NAME'
export DASHSCOPE_EMBEDDING_MODEL='text-embedding-v2'
```
## How It Works
### 1. Configuration
The memory system uses a `MemoryConfig` that specifies:
- **Embedder**: Configuration for embedding models (OpenAI, DashScope)
- **LLM**: Configuration for language models used in memory processing
- **Vector Store**: Configuration for vector database (Qdrant with on-disk storage as default)
### 2. Memory Structure
- **Mem0LongTermMemory**: Inherits from `LongTermMemoryBase` and maintains an async memory server
- **Single AsyncMemory Instance**: Uses one AsyncMemory instance for all storage and retrieval operations
- **Agent/User Context**: Maintains separate memory spaces for different agent-user combinations
### 3. Memory Recording Flow
1. **Input Processing**: Formats various input types (strings, Msg objects, dictionaries) into standardized format
2. **Content Combination**: Merges thinking processes with content for comprehensive memory storage
3. **Vector Storage**: Stores processed content with metadata in the vector database
4. **Response Formatting**: Returns structured ToolResponse objects for easy integration
### 4. Memory Retrieval Flow
1. **Semantic Search**: Performs vector similarity search in the memory database
2. **Response Formatting**: Returns retrieved memories in structured format
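The input-processing step (step 1 of the recording flow) can be sketched as a small normalization helper. This is a hypothetical illustration only (the function name and exact behavior are assumptions), not the actual code in `long_term_memory_by_mem0.py`:

```python
from typing import Any


def normalize_inputs(items: list[Any]) -> list[str]:
    """Normalize strings, Msg-like objects, and dicts into plain strings.

    Hypothetical helper illustrating the input-processing step.
    """
    normalized = []
    for item in items:
        if isinstance(item, str):
            normalized.append(item)
        elif isinstance(item, dict):
            # Dicts are expected to carry a "content" field
            normalized.append(str(item.get("content", "")))
        elif hasattr(item, "content"):
            # Msg objects expose their text via the `content` attribute
            normalized.append(str(item.content))
        else:
            raise TypeError(f"Unsupported input type: {type(item)!r}")
    return normalized


print(normalize_inputs(["hello", {"content": "from dict"}]))
```

Each normalized string is then combined with the thinking process and stored in the vector database with its metadata.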
## Usage Examples
### Basic Usage
Run the standalone memory examples to see the complete memory operations:
```bash
python ./memory_example.py
```
### Example Scenarios
The example demonstrates several typical use cases:
1. **Basic Conversation Recording**: Store simple user-agent conversations
2. **Tool Usage and Results**: Record tool usage with thinking processes
3. **Multi-step Reasoning**: Store complex reasoning processes step by step
4. **Error Handling**: Record error scenarios and recovery strategies
5. **User Preferences**: Store user preferences and contextual information
## API Reference
### Mem0LongTermMemory Class
#### Main Methods
**`__init__(agent_name=None, user_name=None, run_name=None, model=None, embedding_model=None, vector_store_config=None, mem0_config=None, default_memory_type=None, **kwargs)`**
- Initialize the memory instance with agent, user, and run context
- `agent_name` (str, optional): The name of the agent
- `user_name` (str, optional): The name of the user
- `run_name` (str, optional): The name of the run/session
- `model` (ChatModelBase, optional): The model to use for the long-term memory
- `embedding_model` (EmbeddingModelBase, optional): The embedding model to use
- `vector_store_config` (VectorStoreConfig, optional): Vector store configuration
- `mem0_config` (MemoryConfig, optional): Complete mem0 configuration
- `default_memory_type` (str, optional): Default memory type for storage
**Note**:
1. At least one of `agent_name`, `user_name`, or `run_name` is required.
2. During memory recording, these parameters become metadata for the stored memories.
3. During memory retrieval, only memories with matching metadata values will be returned.
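Conceptually, the metadata scoping described in the note above behaves like a filter over stored entries. The following is a simplified, hypothetical illustration (not mem0's actual query code, which performs this inside the vector store):

```python
def filter_by_metadata(entries: list[dict], **context: str) -> list[dict]:
    """Keep only entries whose metadata matches every given context value.

    Simplified illustration of agent/user/run scoping.
    """
    return [
        e
        for e in entries
        if all(e.get("metadata", {}).get(k) == v for k, v in context.items())
    ]


entries = [
    {
        "text": "prefers homestays",
        "metadata": {"agent_name": "Friday", "user_name": "user_123"},
    },
    {
        "text": "another user's note",
        "metadata": {"agent_name": "Friday", "user_name": "user_456"},
    },
]
print(filter_by_metadata(entries, user_name="user_123"))
```

Only memories recorded under the matching `agent_name`/`user_name`/`run_name` values are visible during retrieval.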
**`record_to_memory(thinking, content, memory_type=None, **kwargs)`**
- Record content with thinking process
- `thinking` (str): Your thinking and reasoning about what to record
- `content` (list[str]): The content to remember, which is a list of strings
- `memory_type` (str, optional): The type of memory to create. Defaults to None, which creates a semantic memory; pass "procedural_memory" to store procedural memories
- Returns: ToolResponse with success/error status
**`retrieve_from_memory(keywords, **kwargs)`**
- Retrieve memories based on keywords
- `keywords` (list[str]): Keywords to search for in the memory, which should be specific and concise, e.g. the person's name, the date, the location, etc.
- `limit_per_search` (int): Number of memories to retrieve per search (default: 5)
- Returns: ToolResponse with retrieved memories
#### Internal Methods
**`record(msgs, **kwargs)`**
- Record message sequences to memory
- `msgs` (Sequence[Msg | None]): Messages to record
**`_record_all(content, thinking=None, memory_type=None, infer=True, **kwargs)`**
- Record content with comprehensive processing
- `content` (list[str] | list[Msg] | list[dict]): The content to remember, which is a list of strings or Msg objects or dict objects
- `thinking` (str, optional): Your thinking and reasoning about what to record; if not provided, the content itself is used as the thinking
- `memory_type` (str, optional): The type of memory to create. Defaults to None, which creates a semantic memory; pass "procedural_memory" to store procedural memories
- `infer` (bool): Whether to infer memory type (default: True)
- Handles various input formats and thinking integration
**`retrieve(msg, **kwargs)`**
- Retrieve memories based on message content
- `msg` (Msg | list[Msg] | None): The message(s) whose content is used as the search query; the content should be specific and concise, e.g. a person's name, a date, a location, etc.
- `limit_per_search` (int): Number of results per search (default: 5)
- Returns: list[str] - A list of retrieved memory strings
### Configuration
#### Direct Model Configuration
```python
# Initialize with AgentScope models directly
long_term_memory = Mem0LongTermMemory(
    agent_name="Friday",
    user_name="user_123",
    model=OpenAIChatModel(
        model_name="gpt-4",
        api_key="your_api_key",
        base_url="your_base_url",
    ),
    embedding_model=OpenAITextEmbedding(
        model_name="text-embedding-3-small",
        api_key="your_api_key",
        base_url="your_base_url",
    ),
)
```
## Customization & Extension
### Backend Replacement
Easily customize embedding, LLM, or vector store by modifying the configuration:
```python
# Example: Using a different embedding model
embedder = EmbedderConfig(
    provider="dashscope",
    config={
        "model": "text-embedding-v1",
        "api_key": "your_dashscope_key",
    },
)
```
### Memory Config Replacement
Mem0LongTermMemory supports directly receiving memory configurations defined in mem0, allowing users to easily adopt various memory configurations and backends supported by mem0. This provides flexibility to use different embedding models, LLMs, and vector stores without modifying the core implementation.
```python
# Example: Using a complete mem0 MemoryConfig
from mem0.configs.base import MemoryConfig
from mem0.embeddings.configs import EmbedderConfig
from mem0.llms.configs import LlmConfig
from mem0.vector_stores.configs import VectorStoreConfig

# Create a custom mem0 configuration
mem0_config = MemoryConfig(
    embedder=EmbedderConfig(
        provider="openai",
        config={
            "model": "text-embedding-3-small",
            "api_key": "your_openai_key",
        },
    ),
    llm=LlmConfig(
        provider="openai",
        config={
            "model": "gpt-4",
            "api_key": "your_openai_key",
        },
    ),
    vector_store=VectorStoreConfig(
        provider="qdrant",
        config={
            "on_disk": True,
            "path": "./memory_data",
        },
    ),
)

# Initialize with the custom mem0 configuration
long_term_memory = Mem0LongTermMemory(
    agent_name="Friday",
    user_name="user_123",
    mem0_config=mem0_config,
)
```
**Note**: In Mem0LongTermMemory, if the `model`, `embedding_model`, or `vector_store_config` parameters are not None, they will override the corresponding configurations in `mem0_config`. This allows for flexible configuration where you can use a base mem0 configuration and selectively override specific components.
### AgentScope Integration
The implementation includes custom AgentScope providers for mem0:
- **AgentScopeLLM**: Integrates AgentScope ChatModelBase with mem0
- **AgentScopeEmbedding**: Integrates AgentScope EmbeddingModelBase with mem0
These providers handle the conversion between mem0's expected format and AgentScope's message/response formats.
### Memory Type Customization
Add custom memory types for different use cases:
```python
# Example: Procedural memory
await memory.record_to_memory(
    content=["Step 1: Analyze input", "Step 2: Process data"],
    thinking="This is a procedural workflow for data processing",
    memory_type="procedural_memory",
)
```
## Best Practices
### Memory Recording
1. **Be Specific**: Record specific, actionable information rather than general statements
2. **Include Context**: Always include relevant context and reasoning when recording
3. **Use Thinking**: Leverage the thinking parameter to explain why information is important
4. **Structured Content**: Use structured formats for complex information
### Memory Retrieval
1. **Specific Keywords**: Use specific, relevant keywords for better search results
2. **Appropriate Limits**: Set reasonable limits based on your use case
3. **Context Awareness**: Consider the current context when retrieving memories
4. **Error Handling**: Always handle potential retrieval errors gracefully
### Performance Optimization
1. **Batch Operations**: Group related memory operations when possible
2. **Efficient Queries**: Use specific keywords to reduce search scope
3. **Memory Cleanup**: Periodically clean up irrelevant or outdated memories
4. **Configuration Tuning**: Optimize vector store and embedding configurations
## Troubleshooting
### Common Issues
**Memory Not Found**
- Check if the memory was properly recorded
- Verify agent_id and user_id consistency
- Ensure vector store is properly configured
**Poor Search Results**
- Use more specific keywords
- Check embedding model configuration
- Verify content was properly formatted during recording
**Performance Issues**
- Optimize vector store configuration
- Reduce search limits
- Consider using on-disk storage for large datasets
**AgentScope Integration Issues**
- Ensure AgentScope models are properly configured
- Check that the custom providers are registered correctly
- Verify message format compatibility
### Debug Mode
Enable debug logging to troubleshoot issues:
```python
import logging
logging.basicConfig(level=logging.DEBUG)
```
## Reference
- [mem0 Documentation](https://github.com/mem0ai/mem0)
- [AgentScope Documentation](https://github.com/agentscope-ai/agentscope)
- [Qdrant Vector Database](https://qdrant.tech/)
For further customization or integration, please refer to the full implementation in the `long_term_memory_by_mem0.py` file and the mem0 official documentation.


@@ -0,0 +1,121 @@
# -*- coding: utf-8 -*-
"""Memory example demonstrating long-term memory functionality with mem0.
This module provides examples of how to use the Mem0LongTermMemory class
for recording and retrieving persistent memories.
"""
import asyncio
import os
from agentscope.agent import ReActAgent
from agentscope.embedding import DashScopeTextEmbedding
from agentscope.formatter import DashScopeChatFormatter
from agentscope.memory import InMemoryMemory, Mem0LongTermMemory
from agentscope.message import Msg
from agentscope.model import DashScopeChatModel
from agentscope.tool import Toolkit
from dotenv import load_dotenv
load_dotenv()
async def main() -> None:
    """Run the memory examples."""
    # Initialize the long-term memory
    long_term_memory = Mem0LongTermMemory(
        agent_name="Friday",
        user_name="user_123",
        model=DashScopeChatModel(
            model_name="qwen-max-latest",
            api_key=os.environ.get("DASHSCOPE_API_KEY"),
            stream=False,
        ),
        embedding_model=DashScopeTextEmbedding(
            model_name="text-embedding-v2",
            api_key=os.environ.get("DASHSCOPE_API_KEY"),
        ),
        on_disk=False,
    )

    print("=== Long Term Memory Examples with mem0 ===\n")

    # Example 1: Basic conversation recording
    print("1. Basic Conversation Recording")
    print("-" * 40)
    results = await long_term_memory.record(
        msgs=[
            Msg(
                role="user",
                content="Please help me book a hotel, preferably a homestay",
                name="user",
            ),
        ],
    )
    print(f"Recorded conversation: {results}\n")

    # Example 2: Retrieving memories
    print("2. Retrieving Memories")
    print("-" * 40)
    print("Searching for weather-related memories...")
    weather_memories = await long_term_memory.retrieve(
        msg=[
            Msg(
                role="user",
                content="What's the weather like today?",
                name="user",
            ),
        ],
    )
    print(f"Retrieved weather memories: {weather_memories}\n")

    print("Searching for user preference memories...")
    preference_memories = await long_term_memory.retrieve(
        msg=[
            Msg(
                role="user",
                content=(
                    "I prefer temperatures in Celsius and wind speed in km/h"
                ),
                name="user",
            ),
        ],
    )
    print(f"Retrieved preference memories: {preference_memories}\n")

    # Example 3: ReActAgent with long-term memory
    print("3. ReActAgent with long term memory")
    print("-" * 40)
    toolkit = Toolkit()
    agent = ReActAgent(
        name="Friday",
        sys_prompt="You are a helpful assistant named Friday.",
        model=DashScopeChatModel(
            model_name="qwen-max-latest",
            api_key=os.environ.get("DASHSCOPE_API_KEY"),
            stream=False,
        ),
        formatter=DashScopeChatFormatter(),
        toolkit=toolkit,
        memory=InMemoryMemory(),
        long_term_memory=long_term_memory,
        long_term_memory_mode="both",
    )
    await agent.memory.clear()

    msg = Msg(
        role="user",
        content="When I travel to Hangzhou, I prefer to stay in a homestay",
        name="user",
    )
    msg = await agent(msg)
    print(f"ReActAgent response: {msg.get_text_content()}\n")

    msg = Msg(role="user", content="What preferences do I have?", name="user")
    msg = await agent(msg)
    print(f"ReActAgent response: {msg.get_text_content()}\n")


if __name__ == "__main__":
    asyncio.run(main())


@@ -0,0 +1,3 @@
agentscope[full]>=1.0.5
packaging>=25.0
mem0ai>=1.0.0


@@ -0,0 +1,61 @@
# MCP in AgentScope
This example demonstrates how to
- create MCP clients with different transports (SSE and Streamable HTTP) and types (stateless and stateful),
- register MCP tool functions and use them in a ReAct agent, and
- get MCP tool function as a local callable object from the MCP client.
## Prerequisites
- Python 3.10 or higher
- DashScope API key from Alibaba Cloud
## Installation
### Install AgentScope
Install from PyPI (recommended):
```bash
pip install agentscope
```
Or install from source:
```bash
cd {PATH_TO_AGENTSCOPE}
pip install -e .
```
## QuickStart
Ensure you have a valid DashScope API key in your environment variables:
```bash
export DASHSCOPE_API_KEY="your_api_key"
```
> Note: The example is built with the DashScope chat model. If you want to change the model in this example, don't forget
> to change the formatter at the same time! The corresponding relationships between built-in models and formatters are
> listed in [our tutorial](https://doc.agentscope.io/tutorial/task_prompt.html#id1)
Start the MCP servers by the following commands in two separate terminals:
```bash
# In one terminal, run:
python mcp_add.py
# In another terminal, run:
python mcp_multiply.py
```
Two MCP servers will be started on `http://127.0.0.1:8001` (SSE server) and `http://127.0.0.1:8002` (streamable
HTTP server).
After starting the MCP servers, you can run the agent example:
```bash
python main.py
```
The agent will:
1. Register the MCP tools from the servers
2. Use a ReAct agent to solve a calculation problem (multiplying two numbers and then adding another number)
3. Return structured output with the final result
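For reference, the arithmetic the agent is asked to perform in step 2 can be checked directly:

```python
# 2345 multiplied by 3456, then add 4567
result = 2345 * 3456 + 4567
print(result)  # → 8108887
```

The agent's structured `NumberResult` output should contain this value.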

functionality/mcp/main.py Normal file

@@ -0,0 +1,109 @@
# -*- coding: utf-8 -*-
"""
Demo showcasing ReAct agent with MCP tools using different transports.
This example demonstrates:
- Registering MCP tools with different transports (sse and streamable_http)
- Using a ReAct agent with registered MCP tools
- Getting structured output from the agent
Before running this demo, please start the two MCP servers in separate
terminals:
    python mcp_add.py
    python mcp_multiply.py
"""
import asyncio
import json
import os
from agentscope.agent import ReActAgent
from agentscope.formatter import DashScopeChatFormatter
from agentscope.mcp import HttpStatefulClient, HttpStatelessClient
from agentscope.message import Msg
from agentscope.model import DashScopeChatModel
from agentscope.tool import Toolkit
from pydantic import BaseModel, Field
class NumberResult(BaseModel):
    """A simple number result model for structured output."""

    result: int = Field(description="The result of the calculation")


async def main() -> None:
    """The main entry of the MCP example."""
    toolkit = Toolkit()

    # Create a stateful MCP client to connect to the SSE MCP server
    # (note you can also use the stateless client)
    add_mcp_client = HttpStatefulClient(
        name="add_mcp",
        transport="sse",
        url="http://127.0.0.1:8001/sse",
    )

    # Create a stateless MCP client to connect to the streamable HTTP MCP
    # server (note you can also use the stateful client)
    multiply_mcp_client = HttpStatelessClient(
        name="multiply_mcp",
        transport="streamable_http",
        url="http://127.0.0.1:8002/mcp",
    )

    # The stateful client must be connected before use
    await add_mcp_client.connect()

    # Register the MCP clients to the toolkit
    await toolkit.register_mcp_client(add_mcp_client)
    await toolkit.register_mcp_client(multiply_mcp_client)

    # Initialize the agent
    agent = ReActAgent(
        name="Jarvis",
        sys_prompt="You're a helpful assistant named Jarvis.",
        model=DashScopeChatModel(
            model_name="qwen-max",
            api_key=os.environ["DASHSCOPE_API_KEY"],
        ),
        formatter=DashScopeChatFormatter(),
        toolkit=toolkit,
    )

    # Run the agent with a calculation task
    res = await agent(
        Msg(
            "user",
            "Calculate 2345 multiplied by 3456, then add 4567 to the result,"
            " what is the final outcome?",
            "user",
        ),
        structured_model=NumberResult,
    )
    print(
        "Structured Output:\n"
        "```\n"
        f"{json.dumps(res.metadata, indent=4, ensure_ascii=False)}\n"
        "```",
    )

    # AgentScope also allows developers to obtain an MCP tool as a local
    # callable object and use it directly.
    add_tool_function = await add_mcp_client.get_callable_function(
        "add",
        # Whether to wrap the MCP tool result into AgentScope's
        # ToolResponse object
        wrap_tool_result=True,
    )

    # Call it manually
    manual_res = await add_tool_function(a=5, b=10)
    print("When manually calling the MCP tool function:")
    print(manual_res)

    # The stateful client should be disconnected manually!
    await add_mcp_client.close()


asyncio.run(main())


@@ -0,0 +1,15 @@
# -*- coding: utf-8 -*-
"""An SSE MCP server with a simple add tool function."""
from mcp.server import FastMCP
mcp = FastMCP("Add", port=8001)
@mcp.tool()
def add(a: int, b: int) -> int:
"""Add two numbers."""
return a + b
mcp.run(transport="sse")


@@ -0,0 +1,15 @@
# -*- coding: utf-8 -*-
"""An SSE MCP server with a simple multiply tool function."""
from mcp.server import FastMCP
mcp = FastMCP("Multiply", port=8002)
@mcp.tool()
def multiply(c: int, d: int) -> int:
"""Multiply two numbers."""
return c * d
mcp.run(transport="streamable-http")


@@ -0,0 +1 @@
agentscope[full]>=1.0.5


@@ -0,0 +1,155 @@
# Meta Planner Agent Example
An advanced AI agent example that demonstrates sophisticated task planning and execution capabilities using AgentScope. The Meta Planner breaks down complex tasks into manageable subtasks and orchestrates specialized worker agents to complete them efficiently.
## Overview
The Meta Planner agent is designed to handle complex, multi-step tasks that would be difficult for a simple agent to manage directly. It uses a planning-execution pattern where:
1. **Complex tasks are decomposed** into smaller, manageable subtasks
2. **Worker agents can be dynamically created** with appropriate tools for each subtask
3. **Progress is tracked and managed** through a roadmap system
4. **Results are coordinated** to achieve the overall goal
This approach enables handling sophisticated workflows like data analysis, research projects, content creation, and multi-step problem solving.
## Key Features
- **Intelligent Task Decomposition**: Automatically breaks down complex requests into executable subtasks
- **Progress Tracking**: Maintains a structured roadmap with status tracking for all subtasks
- **Dynamic Worker Management**: Creates and manages specialized worker agents with relevant toolkits
- **State Persistence**: Saves and restores agent state for long-running tasks
- **Flexible Modes**: Can operate in simple ReAct mode or advanced planning mode based on task complexity
## Architecture
### Core Components
1. **MetaPlanner** (`_meta_planner.py`): The main agent class that extends ReActAgent with planning capabilities
2. **Planning Tools** (`_planning_tools/`):
- `PlannerNoteBook`: Manages session context and user inputs
- `RoadmapManager`: Handles task decomposition and progress tracking
- `WorkerManager`: Creates and manages worker agents
3. **System Prompts** (`_built_in_long_sys_prompt/`): Detailed instructions for (worker) agent behavior
4. **Demo Entry Point** (`main.py`): The main function to start the application with the meta planner agent.
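As a rough sketch of the planning-execution pattern, a roadmap with subtask status tracking might look like the following (all names here are illustrative assumptions, not the example's actual `RoadmapManager` API):

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    DONE = "done"


@dataclass
class Roadmap:
    """Minimal illustration of subtask decomposition with status tracking."""

    subtasks: dict[str, Status] = field(default_factory=dict)

    def add(self, name: str) -> None:
        self.subtasks[name] = Status.PENDING

    def mark(self, name: str, status: Status) -> None:
        self.subtasks[name] = status

    def remaining(self) -> list[str]:
        return [n for n, s in self.subtasks.items() if s is not Status.DONE]


roadmap = Roadmap()
roadmap.add("collect Q1 data")
roadmap.add("write summary")
roadmap.mark("collect Q1 data", Status.DONE)
print(roadmap.remaining())  # → ['write summary']
```

In the real agent, each remaining subtask would be dispatched to a worker agent created by the WorkerManager.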
## Prerequisites for Running This Example
### Required Environment Variables
```bash
# Anthropic API key for the Claude model
export ANTHROPIC_API_KEY="your_anthropic_api_key"
# Tavily API key for search functionality
export TAVILY_API_KEY="your_tavily_api_key"
```
### Optional Environment Variables
```bash
# Custom working directory for agent operations (default: ./meta_agent_demo_env)
export AGENT_OPERATION_DIR="/path/to/custom/working/directory"
```
## Usage
### Basic Usage
Run the agent interactively:
```bash
cd examples/agent_meta_planner
python main.py
```
The agent will start in chat mode where you can provide complex tasks. For example:
- "Create a comprehensive analysis of Meta's stock performance in Q1 2025"
- "Research and write a 7-day exercise plan with detailed instructions"
- "Analyze the latest AI trends and create a summary report"
### Example Interactions
1. **Data Analysis Task**:
```
User: "Analyze the files in my directory and create a summary report"
```
2. **Research Task**:
```
User: "Research Alibaba's latest quarterly results and competitive position"
```
## Configuration
### Agent Modes
The Meta Planner supports three operation modes:
- **`dynamic`** (default): Automatically switches between simple ReAct and planning mode based on task complexity
- **`enforced`**: Always uses planning mode for all tasks
- **`disable`**: Only uses simple ReAct mode (no planning capabilities)
### Tool Configuration
The agent uses two main toolkits:
1. **Planner Toolkit**: Planning-specific tools for task decomposition and worker management
2. **Worker Toolkit**: Comprehensive tools including:
- Shell command execution
- File operations
- Web search (via Tavily)
- Filesystem access (via MCP)
### State Management
Agent states are automatically saved during execution:
- **Location**: `./agent-states/run-YYYYMMDDHHMMSS/`
- **Types**:
- `state-post_reasoning-*.json`: After reasoning steps
- `state-post-action-{tool_name}-*.json`: After tool executions
### State Recovery
If an agent gets stuck or fails:
1. Check the latest state file in `./agent-states/`
2. Resume from the last successful state:
```bash
python main.py --load_state path/to/state/file.json
```
## Advanced Customization
### Adding New Tools
1. Create tool functions following AgentScope patterns
2. Register tools in the appropriate toolkit:
```python
worker_toolkit.register_tool_function(your_custom_tool)
```
### Custom MCP Clients
Add additional MCP clients in `main.py`:
```python
mcp_clients.append(
    StdIOStatefulClient(
        name="custom_mcp",
        command="npx",
        args=["-y", "your-mcp-server"],
        env={"API_KEY": "your_key"},
    ),
)
```
### System Prompt Modifications
Modify prompts in `_built_in_long_sys_prompt/` to customize agent behavior.


@@ -0,0 +1,11 @@
### Tool usage rules
1. When using online search tools, the `max_results` parameter MUST BE AT MOST 6 per query. Try to avoid including raw content when calling the search.
2. The directory/file system that you can operate on is the following path: {agent_working_dir}. DO NOT try to save/read/modify files in other directories.
3. Try to use local resources before going to online search. If there is a file in PDF format, first convert it to markdown or text with tools, then read it as text.
4. NEVER use the `read_file` tool to read a PDF file directly.
5. DO NOT aim at generating a PDF file unless the user specifies it.
6. DO NOT use the chart-generation tool for presenting travel-related information.
7. If a tool generates long content, ALWAYS create a new markdown file to summarize the long content and save it for future reference.
8. When you need to generate a report, you are encouraged to add content to the report file incrementally as your search or reasoning progresses, for example, with the `edit_file` tool.
9. When you use the `write_file` tool, you **MUST ALWAYS** remember to provide both the `path` and `content` parameters. DO NOT try to use `write_file` with content exceeding 1k tokens at once!!!

Some files were not shown because too many files have changed in this diff Show More