# Pixel - AI Video Creation Platform [中文文档](README_zh-CN.md) Pixel is an intelligent platform for creating comics and videos from scripts using AI. It streamlines the workflow from scriptwriting to asset management, storyboard generation, and final video production, leveraging the power of Multi-Agent Systems and advanced generative models. ## ✨ Features - **Intelligent Script Analysis**: - Automatically parse scripts to identify characters, scenes, and props using LLMs. - **Agent-based Workflow**: Utilizes specialized agents (powered by AgentScope) for deep script understanding and breakdown. - **Multi-Provider AI Support**: - **LLM**: DashScope (Qwen), Google (Gemini), VolcEngine (Doubao). - **Image Generation**: Flux 1.1 Pro, Wanx, Kolors (ModelScope). - **Video Generation**: Kling 1.5, Hailuo (MiniMax), CogVideoX, Wan 2.1. - **Asset Management**: Centralized "Material Center" to manage and edit creative assets. - **AI Storyboarding**: Generate visual storyboards from script descriptions using AI image generation with style consistency control. - **Video Generation & Editing**: - Transform static storyboards into dynamic videos. - **Fine-grained Control**: First/Last frame control, Camera Motion (Zoom, Pan, Tilt), and Motion Bucket settings. - **Infinite Canvas**: A node-based free creation workspace (powered by React Flow) supporting multi-selection, smooth zooming, and flexible node connections. - **Project Management**: Organize your creative works in a structured workspace with support for Episodes and Scenes. ## 🏗 Architecture The project is structured as a monorepo with a clean three-layer architecture: - **`frontend/`**: A **Next.js 15** (App Router) application using **React 19**, Tailwind CSS, and `@xyflow/react` for the canvas interface. - **`backend/`**: A **FastAPI** service with three-layer architecture (API Layer, Service Layer, Repository Layer). It uses **AgentScope** for agent orchestration and supports multiple model providers via a plugin system. ## 📚 Documentation Comprehensive documentation is available: - **[docs/API.md](docs/API.md)**: Complete API reference with examples - **[docs/development-guide.md](docs/development-guide.md)**: Development best practices and guidelines - **[docs/FRONTEND_OPTIMIZATION.md](docs/FRONTEND_OPTIMIZATION.md)**: Frontend optimization strategies ## 🚀 Quick Start ### Prerequisites - **Node.js**: v20 or higher (Required for Next.js 15) - **Python**: v3.12 or higher - **Package Manager**: `pnpm` (Frontend), `uv` (Backend - Recommended) - **Redis**: v7+ (Optional but recommended for caching) - **PostgreSQL**: v15+ (For production, SQLite for development) - **API Keys**: Aliyun DashScope, VolcEngine, or Google AI Studio keys depending on models used ### Installation #### 1. Clone the Repository ```bash git clone cd pixel ``` #### 2. Backend Setup ```bash cd backend # Install uv if you haven't already pip install uv # Install dependencies uv sync # Configure environment variables cp .env.example .env # Edit .env with your API keys # Run database migrations alembic upgrade head # Start the backend server ./start.sh # Or manually: uv run uvicorn src.main:app --reload --port 8000 ``` The backend will be available at http://localhost:8000 #### 3. Frontend Setup ```bash cd frontend # Install dependencies pnpm install # Start the development server pnpm dev ``` The frontend will be available at http://localhost:3000 ### Environment Configuration Create a `.env` file in the `backend/` directory: ```env # AI Providers DASHSCOPE_API_KEY=your_dashscope_key VOLCENGINE_ACCESS_KEY=your_volcengine_key VOLCENGINE_SECRET_KEY=your_volcengine_secret GOOGLE_API_KEY=your_google_key KLING_API_KEY=your_kling_key # Storage OSS_ACCESS_KEY_ID=your_oss_key OSS_ACCESS_KEY_SECRET=your_oss_secret # Database (Optional - defaults to SQLite) DATABASE_URL=postgresql://user:pass@localhost/pixel # Redis (Optional but recommended) REDIS_URL=redis://localhost:6379 REDIS_ENABLED=1 # CORS # Production: set explicit comma-separated origins CORS_ALLOWED_ORIGINS=https://your-app.example.com # Development fallback origins CORS_DEV_ALLOWED_ORIGINS=http://localhost:3000,http://127.0.0.1:3000 # Monitoring ENABLE_METRICS=true LOG_LEVEL=INFO ``` ### Docker Deployment (Optional) ```bash # Build and start all services docker-compose up -d # View logs docker-compose logs -f # Stop services docker-compose down ``` ## 🔌 API Documentation The backend provides RESTful APIs with comprehensive documentation: - **[API Documentation](docs/API.md)**: Complete API reference with examples - **Interactive Docs**: http://localhost:8000/docs (Swagger UI) - **ReDoc**: http://localhost:8000/redoc - **OpenAPI Spec**: http://localhost:8000/openapi.json ### Core Endpoints * **Image Generation**: `POST /api/v1/generations/image` * **Video Generation**: `POST /api/v1/generations/video` * **Script Analysis**: `POST /api/v1/script/analyze` * **Task Status**: `GET /api/v1/tasks/{task_id}` * **Project Management**: `/api/v1/projects/*` * **Canvas Operations**: `/api/v1/canvas/*` ### Interactive Documentation - **Swagger UI**: http://localhost:8000/docs - **ReDoc**: http://localhost:8000/redoc - **OpenAPI Spec**: http://localhost:8000/openapi.json All generation endpoints support an `extra_params` field to pass model-specific arguments directly to the underlying SDK. ## 🧪 Development ### Running Tests **Backend:** ```bash cd backend pytest # Run all tests pytest tests/test_integration.py # Run integration tests pytest --cov=src # Run with coverage ``` **Frontend:** ```bash cd frontend pnpm test # Run all tests pnpm test:watch # Run in watch mode pnpm test:coverage # Run with coverage ``` ### Code Quality **Backend:** ```bash # Format code black src/ isort src/ # Lint flake8 src/ mypy src/ ``` **Frontend:** ```bash # Lint pnpm lint # Type check pnpm type-check # Format pnpm format ``` ### Database Migrations ```bash cd backend # Create a new migration alembic revision --autogenerate -m "description" # Apply migrations alembic upgrade head # Rollback one migration alembic downgrade -1 ``` ### API Type Generation When backend API changes, regenerate frontend types: ```bash cd frontend pnpm gen:api ``` ## 📊 Monitoring ### Health Check ```bash curl http://localhost:8000/health ``` ### Metrics (Prometheus) ```bash curl http://localhost:8000/metrics ``` ### Logs Backend logs are in JSON format for easy parsing: ```bash tail -f backend/logs/app.log | jq ``` ## 🤝 Contributing 1. Fork the repository 2. Create a feature branch (`git checkout -b feature/amazing-feature`) 3. Commit your changes (`git commit -m 'Add amazing feature'`) 4. Push to the branch (`git push origin feature/amazing-feature`) 5. Open a Pull Request ### Development Guidelines - Follow the three-layer architecture (API, Service, Repository) - Write tests for new features - Update documentation - Follow code style guidelines - Use conventional commits ## 🐛 Troubleshooting ### Backend Issues **Redis Connection Error:** ```bash # Check if Redis is running redis-cli ping # Should return: PONG ``` **Database Connection Error:** ```bash # Check database connection psql -h localhost -U user -d pixel ``` **Import Errors:** ```bash # Reinstall dependencies uv sync --reinstall ``` ### Frontend Issues **Module Not Found:** ```bash # Clear cache and reinstall rm -rf node_modules .next pnpm install ``` **API Type Mismatch:** ```bash # Regenerate API types pnpm gen:api ``` ## 📦 Project Structure ``` pixel/ ├── backend/ # FastAPI backend │ ├── src/ │ │ ├── controllers/ # API Layer (HTTP handlers) │ │ ├── services/ # Service Layer (business logic) │ │ ├── repositories/ # Repository Layer (data access) │ │ ├── models/ # Data models (entities & schemas) │ │ ├── middlewares/ # API middlewares │ │ └── utils/ # Utilities │ ├── tests/ # Test suite │ └── alembic/ # Database migrations ├── frontend/ # Next.js frontend │ ├── src/ │ │ ├── app/ # Next.js pages (App Router) │ │ ├── components/ # React components │ │ ├── lib/ # Utilities and services │ │ └── store/ # State management (Zustand) │ └── tests/ # Test suite ├── docs/ # Documentation ├── ARCHITECTURE.md # Architecture documentation └── docker-compose.yml # Docker configuration ``` ## 📄 License MIT