Browser Use Demo
This demo showcases how to use browser automation capabilities within the AgentScope Runtime framework. It provides both backend services and a frontend interface to demonstrate browser-based agent interactions. The real-time visualization of browser interactions is powered by Steel-Browser.
🌳 Project Structure
├── backend # Backend directory, containing server-side services and logic
│ ├── agentscope_browseruse_agent.py # Script related to browser usage or agent management
│ ├── async_quart_service.py # Asynchronous service using Quart to handle backend requests
│ └── prompts.py # Module containing prompt messages or interaction logic for the backend
├── frontend # Frontend directory, containing client-side code (typically using React)
│ ├── public # Public folder for storing static files copied during build
│ │ ├── index.html # HTML template for the frontend app, acts as the entry HTML file
│ │ └── manifest.json # Manifest file describing the web app's metadata such as name and icons
│ ├── src # Source code folder, containing React components and styles
│ │ ├── App.css # Stylesheet for the main app component
│ │ ├── App.tsx # TypeScript file for the main app component, the root component of the application
│ │ ├── Browser.scss # Stylesheet for specific browser-related components or pages using SCSS
│ │ ├── Browser.tsx # React component file related to browser functionality
│ │ ├── index.css # Global stylesheet affecting the overall look of the application
│ │ └── index.tsx # Entry point for the React application to render content into `index.html`
│ ├── package.json # Project dependencies file, lists all npm dependencies and scripts
│ └── tsconfig.json # TypeScript configuration file, defines compilation options
└── README.md # Project documentation file, provides basic information and usage instructions
📖 Overview
This demo illustrates how agents can interact with web browsers to perform tasks such as:
- Web navigation
- Form filling
- Data extraction from web pages
- Automated web workflows
The implementation uses AgentScope's capabilities to create browser-based agents that can perform complex web interactions.
⚙️ Components
Backend
agentscope_browseruse_agent.py: Implements the browser-using agent with AgentScope Runtimeasync_quart_service.py: Provides asynchronous web service endpointsprompts.py: Contains prompts used by the agent for browser interactions
Frontend
- React-based interface for visualizing browser interactions
- TypeScript implementation for type-safe code
🌵Architecture
The architecture of the demo is depicted in the following diagram:
graph LR;
subgraph As["AgentScope Runtime"]
E[Sandbox]-->E1[Browser sandbox]
F[Agent Engine]
F-->|tool call| E
end
subgraph Bs["Frontend Service by React"]
B['React App']
end
subgraph Cs["Backend Service by Quart"]
C['async_quart_service']
C --> D[AgentscopeBrowseruseAgent]
end
A[User] --> |request| Bs
B --> C[Backend Service by Quart]
D --> E
D --> F
🚀 Getting Started
Preinstall
Node and Python environments are required.
- Install Node.js
- Install Python (version >= 3.11)
- Apply a DashScope API key to the
backend/.envfile.
Install the Front-end Service
Install Node Packages
cd frontend
npm install
Run the Front-end Service
npm run start
This will open your browser and display the demo page. Alternatively, you can also open it in your browser at http://localhost:3000:
Install the Back-end Service
Install Python Packages
cd ../backend
pip install -r requirements.txt
Run the Backend Service
python async_quart_service.py
The service will listen on port 9000.
Usage
- Open your browser and navigate to http://localhost:3000.
- Type your question in the input box and click the "Search" button, e.g., "Visit www.chinadaily.com.cn to search for today's hot topics."
- The response will be displayed in the output box.
🛠️ Features
- Browser automation within the AgentScope Runtime framework
- Real-time visualization of browser actions
- Asynchronous processing for better performance
- React-based user interface
- TypeScript support for type safety
ℹ️ Getting Help
If you have any questions or encounter any problems with this demo, please report them through GitHub issues.
📄 License
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
🍬 Disclaimers
This is not an officially supported product. This project is intended for demonstration purposes only and is not suitable for production use.