Upgrade Alias-Agent to 0.2.0 --------- Co-authored-by: ZiTao-Li <zitao.l@alibaba-inc.com> Co-authored-by: xieyxclack <yuexiang.xyx@alibaba-inc.com> Co-authored-by: Zexi Li <tomleeze@qq.com> Co-authored-by: SSSuperDan <dlaura2218@gmail.com> Co-authored-by: lalaliat <78087788+lalaliat@users.noreply.github.com> Co-authored-by: jinli.yl <jinli.yl@alibaba-inc.com> Co-authored-by: Dengjiaji <dengjiaji.djj@alibaba-inc.com> Co-authored-by: 于南 <zengtianjing.ztj@alibaba-inc.com> Co-authored-by: JustinDing <166603159+sleepy-bird-world@users.noreply.github.com> Co-authored-by: y1y5 <269557841@qq.com> Co-authored-by: 柳佚 <yly287738@alibaba-inc.com> Co-authored-by: LiangguiWeng <347185100@qq.com> Co-authored-by: 潜星 <zhijian.mzj@alibaba-inc.com> Co-authored-by: StCarmen <1106135234@qq.com> Co-authored-by: LuYi <yilu_2000@outlook.com> Co-authored-by: 刺葳 <ciwei.cy@alibaba-inc.com>
Alias for Data Science
An autonomous agent that runs your entire data science workflow.
Overview
Alias-DataScience is an autonomous, ready-to-use intelligent assistant for real-world data science workflows. It turns high-level analytical questions into executable plans that cover data acquisition, cleaning, modeling, visualization, and narrative reporting, all with minimal human intervention.
✨ Key Features
🔍 Scalable File Filtering
To handle the massive data files commonly found in enterprise data lakes, Alias-DataScience combines parallelized grep operations with Retrieval-Augmented Generation (RAG) to build a low-latency, high-throughput file filtering pipeline. This preprocessing step identifies the relevant files before analysis begins, which broadens the range of data sources the agent can work with.
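A minimal sketch of the grep stage of such a pipeline, assuming plain-text files and a keyword list derived from the user's question. The function names `count_matches` and `filter_files` are hypothetical and not part of Alias-DataScience, and ranking by hit count is only a stand-in for the retrieval step, which would normally re-rank the shortlisted files with embeddings.

```python
from concurrent.futures import ThreadPoolExecutor


def count_matches(path, keywords):
    """Grep-like scan: count the lines in one file that mention any keyword."""
    hits = 0
    with open(path, "r", encoding="utf-8", errors="ignore") as handle:
        for line in handle:
            lowered = line.lower()
            if any(keyword in lowered for keyword in keywords):
                hits += 1
    return hits


def filter_files(paths, keywords, top_k=5, max_workers=8):
    """Scan files in parallel and return up to top_k paths with at least one hit."""
    keywords = [keyword.lower() for keyword in keywords]
    # Threads suit this step because the work is dominated by file I/O.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        counts = list(pool.map(lambda p: count_matches(p, keywords), paths))
    scored = [(count, str(path)) for count, path in zip(counts, paths) if count > 0]
    # Highest hit count first; ties are broken by path so the output is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:top_k]]
```

Calling `filter_files(paths, ["revenue"])` returns only the files that mention the keyword, so the more expensive retrieval and analysis steps see a short candidate list instead of the whole data lake.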
🧠 Context-Aware Prompt Engineering
Rather than relying on generic instructions, Alias-DataScience uses three specialized prompt templates, each tailored to one of the dominant data science workflows:
- Exploratory Data Analysis (EDA): Surfaces trends, anomalies, and relationships to answer "what's happening?" and "why?"
- Predictive Modeling: Automates feature engineering, model selection, and optimization.
- Exact Data Computation: Delivers precise, auditable answers to quantitative queries (e.g., "What was the YoY revenue growth in Q3?").
An intelligent prompt selector routes tasks to the best template based on user intent.
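The routing idea can be illustrated with a small keyword-based selector. This is a hedged sketch only: the template texts, the cue lists, and the `select_template` function are hypothetical, and the selector in Alias-DataScience may well use a language model rather than keyword matching.

```python
# Hypothetical template texts; the real prompts are not shown in this document.
TEMPLATES = {
    "eda": "Explore the data and surface trends, anomalies, and relationships.",
    "modeling": "Engineer features, select a model, and optimize it.",
    "computation": "Compute the exact answer and show every step.",
}

# Hypothetical cue phrases that hint at each workflow.
CUES = {
    "eda": ("trend", "anomal", "explore", "why", "relationship"),
    "modeling": ("predict", "forecast", "classify", "train"),
    "computation": ("what was", "how many", "total", "average", "growth"),
}


def select_template(task, default="eda"):
    """Return the template name whose cues appear most often in the task text."""
    text = task.lower()
    scores = {name: sum(cue in text for cue in cues) for name, cues in CUES.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default workflow when no cue matches at all.
    return best if scores[best] > 0 else default
```

With these cue lists, `select_template("What was the YoY revenue growth in Q3?")` picks the exact-computation template, and a request with no recognizable cue falls back to EDA.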
📊 Handling of Messy Tabular Data
Alias-DataScience parses irregular spreadsheets (merged cells, embedded notes, multi-level headers) and converts them into structured tables. For large files, it outputs a semantic-preserving JSON representation, enabling reliable analysis of human-crafted inputs.
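As an illustration of the kind of normalization described above, the sketch below flattens a two-level header with horizontally merged cells into flat column names and emits one record per data row. It assumes the sheet has already been read into a list of rows in which merged or empty cells appear as `None`; `forward_fill` and `normalize_sheet` are hypothetical helpers, not the project's parser.

```python
import json


def forward_fill(cells):
    """Repeat the last non-empty value to undo horizontally merged cells."""
    filled, last = [], ""
    for cell in cells:
        if cell not in (None, ""):
            last = str(cell).strip()
        filled.append(last)
    return filled


def normalize_sheet(rows, header_rows=2):
    """Flatten multi-level headers and return one dict per non-empty data row."""
    # Upper header levels are usually merged across columns, so fill them forward.
    upper = [forward_fill(row) for row in rows[: header_rows - 1]]
    lowest = ["" if c is None else str(c).strip() for c in rows[header_rows - 1]]
    columns = []
    for index, leaf in enumerate(lowest):
        parts = [level[index] for level in upper] + [leaf]
        columns.append("_".join(part for part in parts if part))
    records = []
    for row in rows[header_rows:]:
        # Skip spacer rows that contain no values at all.
        if all(cell in (None, "") for cell in row):
            continue
        records.append(dict(zip(columns, row)))
    return records


def to_json(records):
    """Serialize the structured records as JSON text."""
    return json.dumps(records, ensure_ascii=False)
```

For a header such as `["Region", "2023", None, "2024", None]` over `[None, "Q1", "Q2", "Q1", "Q2"]`, the resulting columns are `Region`, `2023_Q1`, `2023_Q2`, `2024_Q1`, and `2024_Q2`. Forward filling assumes every blank in an upper header row comes from a merged cell, which will not hold for every spreadsheet.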
👁️ Multimodal Understanding of Visual Content
- Image Understanding: Interprets charts, diagrams, and general images to extract numerical data, trends, and domain-specific entities.
- Visual QA: Answers natural-language questions about visual elements (e.g., "What was the peak value in Q3?").
📑 Automated Reporting
For EDA tasks, Alias-DataScience generates an interactive HTML report featuring:
- Actionable insights backed by statistics and visuals,
- Executable code snippets for transparency and reuse.
This bridges the gap between data scientists and stakeholders like business users or auditors.
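A report of this shape can be sketched with the standard library alone. The example below pairs each insight with the code that produced it and hides the code behind a collapsible `<details>` element; the `render_report` function and the page layout are assumptions made for illustration, not the generator used by Alias-DataScience.

```python
from html import escape


def render_report(title, findings):
    """Build a standalone HTML page from (insight, code) pairs."""
    sections = []
    for insight, code in findings:
        sections.append(
            "<section>\n"
            f"  <p>{escape(insight)}</p>\n"
            "  <details>\n"
            "    <summary>Show code</summary>\n"
            f"    <pre><code>{escape(code)}</code></pre>\n"
            "  </details>\n"
            "</section>"
        )
    body = "\n".join(sections)
    # All user-supplied text is escaped so it cannot break the markup.
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n<h1>{escape(title)}</h1>\n{body}\n</body></html>\n"
    )
```

Writing the returned string to a `.html` file gives a page that business users can read top to bottom, while reviewers can expand each code block to check how an insight was derived.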
📈 Benchmark Performance
Alias-DataScience achieves state-of-the-art (SOTA) results across major data science agent benchmarks.
DSBench
Realistic tasks from ModelOff & Kaggle; includes multimodal inputs, multi-source data, and large-scale modeling.
| Task Category | Framework | Model | Score |
|---|---|---|---|
| Data Analysis | Alias-DataScience | Qwen3-max-Preview | 55.58% 🏆 |
| Data Analysis | AutoGen | GPT-4 | 30.69% |
| Data Analysis | AutoGen | GPT-4o | 34.12% |
| Data Analysis | CodeInterpreter | GPT-4 | 26.39% |
| Data Analysis | CodeInterpreter | GPT-4o | 23.82% |
| Data Modeling | Alias-DataScience | Qwen3-max-Preview | 49.70% 🏆 |
| Data Modeling | AutoGen | GPT-4 | 45.52% |
| Data Modeling | AutoGen | GPT-4o | 34.74% |
| Data Modeling | CodeInterpreter | GPT-4 | 26.14% |
| Data Modeling | CodeInterpreter | GPT-4o | 16.90% |
InsightBench
Open-ended comprehensive analytical tasks.
| Framework | Model | Score |
|---|---|---|
| Alias-DataScience | Qwen3-max-Preview | 43.29% 🏆 |
| AgentPoirot | Qwen3-max-Preview | 39.30% |
DABench
End-to-end data analysis from real-world CSVs.
| Framework | Model | Score |
|---|---|---|
| Alias-DataScience | Qwen3-max-Preview | 95.20% 🏆 |
| AutoGen | GPT-4 | 71.49% |
| Data Interpreter | GPT-4 | 73.55% |
| Data Interpreter | GPT-4o | 94.93% |
Some tables include data from published sources, used with gratitude to the original authors and cited in good faith. For accuracy, please refer to the original publications.



