ACEBench Example
This is an example of agent-oriented evaluation in AgentScope.
We use ACEBench as the example benchmark and run a ReAct agent with a Ray-based evaluator, which supports distributed and parallel evaluation.
To run the example, first install AgentScope, then start the evaluation with the following command:
python main.py --data_dir {data_dir} --result_dir {result_dir}
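The command above passes two flags, `--data_dir` and `--result_dir`. As a rough illustration of how a script could accept them, here is a minimal sketch using Python's standard `argparse` module. It is not the actual `main.py` from this example: the function names `build_parser` and `prepare_dirs` are hypothetical, and the meaning of each flag (input data location and output location) is an assumption inferred from the flag names.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two flags used in the command above.

    Hypothetical helper: the real main.py may define its arguments differently.
    """
    parser = argparse.ArgumentParser(
        description="Sketch of a command-line entry point for the evaluation.",
    )
    parser.add_argument(
        "--data_dir",
        type=Path,
        required=True,
        help="Directory holding the benchmark data (assumed meaning).",
    )
    parser.add_argument(
        "--result_dir",
        type=Path,
        required=True,
        help="Directory where evaluation results are written (assumed meaning).",
    )
    return parser


def prepare_dirs(args: argparse.Namespace) -> tuple[Path, Path]:
    """Create the result directory if it is missing and return both paths."""
    args.result_dir.mkdir(parents=True, exist_ok=True)
    return args.data_dir, args.result_dir


# Parse an explicit argument list instead of sys.argv so the sketch can be
# tried without a shell; the paths here are placeholders.
example_args = build_parser().parse_args(
    ["--data_dir", "./data", "--result_dir", "./results"]
)
print(example_args.data_dir, example_args.result_dir)
```

In a real script, `build_parser().parse_args()` would be called without a list so that the values come from the command line, and `prepare_dirs` would run before the evaluation starts.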