# Paper2Slides

**Repository Path**: liusha/Paper2Slides

## Basic Information

- **Project Name**: Paper2Slides
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-15
- **Last Updated**: 2025-12-15

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README
# Paper2Slides: From Paper to Presentation in One Click

[![Python](https://img.shields.io/badge/Python-3.12+-FCE7D6.svg)](https://www.python.org/) [![License](https://img.shields.io/badge/License-MIT-C1E5F5.svg)](https://opensource.org/licenses/MIT/) [![Feishu](https://img.shields.io/badge/Feishu-Group-E9DBFC?style=flat&logo=wechat&logoColor=white)](./COMMUNICATION.md) [![WeChat](https://img.shields.io/badge/WeChat-Group-C5EAB4?style=flat&logo=wechat&logoColor=white)](./COMMUNICATION.md)

✨ **Never Build Slides from Scratch Again** ✨

| 📄 **Universal File Support** | 🎯 **RAG-Powered Precision** | 🎨 **Custom Styling** | ⚡ **Lightning Speed** |
---

## 🎯 What is Paper2Slides?

Turns your **research papers**, **reports**, and **documents** into **professional slides & posters** in **minutes**.

### ✨ Key Features

- 📄 **Universal Document Support**
  Seamlessly process PDF, Word, Excel, PowerPoint, Markdown, and multiple file formats simultaneously.
- 🎯 **Comprehensive Content Extraction**
  RAG-powered mechanism ensures every critical insight, figure, and data point is captured with precision.
- 🔗 **Source-Linked Accuracy**
  Maintains direct traceability between generated content and original sources, eliminating information drift.
- 🎨 **Custom Styling Freedom**
  Choose from professional built-in themes or describe your vision in natural language for custom styling.
- ⚡ **Lightning-Fast Generation**
  Instant preview mode enables rapid experimentation and real-time refinements.
- 💾 **Seamless Session Management**
  Advanced checkpoint system preserves all progress: pause, resume, or switch themes instantly without loss.
- ✨ **Professional-Grade Visuals**
  Deliver polished, presentation-ready slides and posters with publication-quality design standards.

### ⚡ Easy as One Command

```bash
# One command to generate slides from a paper
python -m paper2slides --input paper.pdf --output slides --style doraemon --length medium --fast --parallel 2
```

---

## 🔥 News

- **[2025.12.09]** Added parallel slide generation (`--parallel`) for faster processing
- **[2025.12.08]** Paper2Slides is now open source!

---

## 🎨 Custom Styling Showcase

*[Image gallery: two rows of example outputs in the `doraemon`, `academic`, and `custom` styles]*

✨ Multiple styles available: simply modify the `--style` parameter.

Examples from *DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models*.
💡 Custom Style Example: Totoro Theme

```
--style "Studio Ghibli anime style with warm whimsical aesthetic. Use soft watercolor Morandi tones with light cream background, muted sage green and dusty pink accents. Totoro character can appear as a friendly guide relating to the content, with nature elements like soft clouds or leaves."
```
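A long free-form style description contains spaces, commas, and periods, so it has to reach the CLI as a single argument. The sketch below shows one way to do that from Python, using only the `--input`, `--output`, and `--style` flags documented in this README. The helper name `build_style_command` and the shortened style text are illustrative assumptions, not part of Paper2Slides.

```python
import shlex
import sys


def build_style_command(input_path: str, style: str, output: str = "slides") -> list[str]:
    """Return an argv list for the Paper2Slides CLI with a free-form style.

    Hypothetical helper: passing the style as one list element keeps it
    intact as a single argument, with no manual shell quoting needed.
    """
    return [
        sys.executable, "-m", "paper2slides",
        "--input", input_path,
        "--output", output,
        "--style", style,
    ]


# Shortened version of the Totoro description, for illustration only.
totoro_style = (
    "Studio Ghibli anime style with warm whimsical aesthetic. "
    "Use soft watercolor Morandi tones with light cream background."
)

cmd = build_style_command("paper.pdf", totoro_style)

# Print a copy-pasteable shell form in which the style is safely quoted.
print(shlex.join(cmd))

# To actually launch the run (requires a working Paper2Slides install):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the list directly to `subprocess.run` bypasses the shell, which is why no quoting appears in the list itself; `shlex.join` is only used to display an equivalent shell command.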
---

### 🌐 Paper2Slides Web Interface
---

## 📋 Table of Contents

- [🎯 Quick Start](#-quick-start)
- [🏗️ Paper2Slides Framework](#%EF%B8%8F-paper2slides-framework)
- [🔧 Configuration](#%EF%B8%8F-configuration)
- [📁 Code Structure](#-code-structure)

---

## 🏃 Quick Start

### 1. Environment Setup

```bash
# Clone repository
git clone https://github.com/HKUDS/Paper2Slides.git
cd Paper2Slides

# Create and activate conda environment
conda create -n paper2slides python=3.12 -y
conda activate paper2slides

# Install dependencies
pip install -r requirements.txt
```

> [!NOTE]
> Create a `.env` file in the `paper2slides/` directory with your API keys. Refer to `paper2slides/.env.example` for the required variables.

### 2. Command Line Usage

```bash
# Basic usage - generate slides from a paper
python -m paper2slides --input paper.pdf --output slides --length medium

# Generate poster with custom style
python -m paper2slides --input paper.pdf --output poster --style "minimalist with blue theme" --density medium

# Fast mode
python -m paper2slides --input paper.pdf --output slides --fast

# Enable parallel generation (2 workers by default)
python -m paper2slides --input paper.pdf --output slides --parallel 2

# List all processed outputs
python -m paper2slides --list
```

**CLI Options**:

| Option | Description | Default |
|--------|-------------|---------|
| `--input, -i` | Input file(s) or directory | Required |
| `--output` | Output type: `slides` or `poster` | `poster` |
| `--content` | Content type: `paper` or `general` | `paper` |
| `--style` | Style: `academic`, `doraemon`, or custom | `doraemon` |
| `--length` | Slides length: `short`, `medium`, `long` | `short` |
| `--density` | Poster density: `sparse`, `medium`, `dense` | `medium` |
| `--fast` | Fast mode: skip RAG indexing | `false` |
| `--parallel` | Enable parallel slide generation: `--parallel` uses 2 workers, `--parallel N` uses N workers | `1` (sequential without this option) |
| `--from-stage` | Force restart from stage: `rag`, `summary`, `plan`, `generate` | Auto-detect |
| `--debug` | Enable debug logging | `false` |

**💾 Checkpoint & Resume**:

Paper2Slides intelligently saves your progress at every key stage, allowing you to:

| Scenario | Command |
|----------|---------|
| **Resume after interruption** | Just run the same command again; it auto-detects and continues |
| **Change style only** | Add `--from-stage plan` to skip re-parsing |
| **Regenerate images** | Add `--from-stage generate` to keep the same plan |
| **Full restart** | Add `--from-stage rag` to start from scratch |

> [!TIP]
> Checkpoints are auto-saved. Just run the same command to resume. Use `--from-stage` only to **force** restart from a specific stage.

### 3. Web Interface

Launch both backend and frontend services:

```bash
./scripts/start.sh
```

Or start services independently:

```bash
# Terminal 1: Start backend API
./scripts/start_backend.sh

# Terminal 2: Start frontend
./scripts/start_frontend.sh
```

Access the web interface at `http://localhost:5173` (default).
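For scripted or batch use instead of the web interface, the CLI options table and the checkpoint scenarios above can be encoded in a small wrapper. The following is a minimal sketch: `RunOptions` and `to_argv` are hypothetical names, the allowed values and defaults are copied from the table, and the real CLI may validate differently.

```python
from dataclasses import dataclass
from typing import List, Optional

# Allowed values copied from the CLI options table; the real CLI may accept more.
OUTPUTS = {"slides", "poster"}
LENGTHS = {"short", "medium", "long"}
DENSITIES = {"sparse", "medium", "dense"}
STAGES = {"rag", "summary", "plan", "generate"}


@dataclass
class RunOptions:
    """Hypothetical container mirroring the documented flags and defaults."""
    input_path: str
    output: str = "poster"
    style: str = "doraemon"
    length: str = "short"        # only emitted for slides
    density: str = "medium"      # only emitted for posters
    fast: bool = False
    parallel: int = 1            # 1 means sequential, so the flag is omitted
    from_stage: Optional[str] = None


def to_argv(opts: RunOptions) -> List[str]:
    """Validate the options against the table and build an argv list."""
    if opts.output not in OUTPUTS:
        raise ValueError(f"unknown output type: {opts.output!r}")
    if opts.parallel < 1:
        raise ValueError("parallel must be at least 1")
    if opts.from_stage is not None and opts.from_stage not in STAGES:
        raise ValueError(f"unknown stage: {opts.from_stage!r}")

    argv = ["python", "-m", "paper2slides",
            "--input", opts.input_path,
            "--output", opts.output,
            "--style", opts.style]
    if opts.output == "slides":
        if opts.length not in LENGTHS:
            raise ValueError(f"unknown length: {opts.length!r}")
        argv += ["--length", opts.length]
    else:
        if opts.density not in DENSITIES:
            raise ValueError(f"unknown density: {opts.density!r}")
        argv += ["--density", opts.density]
    if opts.fast:
        argv.append("--fast")
    if opts.parallel > 1:
        argv += ["--parallel", str(opts.parallel)]
    if opts.from_stage is not None:
        argv += ["--from-stage", opts.from_stage]
    return argv


# "Change style only" scenario: restart from the planning stage.
restyle = RunOptions("paper.pdf", output="slides", style="academic",
                     length="medium", from_stage="plan")
print(" ".join(to_argv(restyle)))
```

The resulting list can be handed to `subprocess.run(..., check=True)`; keeping validation in one place makes typos in stage or density names fail before a long run starts.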
---

## 🏗️ Paper2Slides Framework

Paper2Slides transforms documents through a 4-stage pipeline designed for **reliability** and **efficiency**:

| Stage | Description | Checkpoint | Output |
|-------|-------------|------------|--------|
| **🔍 RAG** | Parse documents and construct intelligent retrieval index using RAG | `checkpoint_rag.json` | Searchable knowledge base |
| **📊 Analysis** | Extract document structure, identify key figures, tables, and content hierarchy | `checkpoint_summary.json` | Structured content map |
| **📋 Planning** | Generate optimized content layout and slide/poster organization strategy | `checkpoint_plan.json` | Presentation blueprint |
| **🎨 Creation** | Render final high-quality slides and poster visuals | Output directory | Polished presentation materials |

### 💾 Smart Recovery System

Each stage automatically saves progress checkpoints, enabling seamless resumption from any point if the process is interrupted, with no need to start over.

### Fast Mode vs Normal Mode

| Mode | Processing Pipeline | Use Cases |
|------|---------------------|-----------|
| **Normal** | Complete RAG indexing with deep document analysis | Complex research papers, lengthy documents, multi-section content |
| **Fast** | Skip RAG indexing, direct LLM query | Short documents, instant previews, quick revisions |

Use `--fast` when:

- Document (text + figures) is short enough to fit in LLM context
- Quick preview/iteration needed
- Don't want to wait for RAG indexing

Use normal mode (default) when:

- Document is long or has many figures
- Multiple files to process together
- Need retrieval for better context selection

---

## ⚙️ Configuration

### Output Directory Structure

```
outputs/
├── /
│   ├── /                                # paper or general
│   │   ├── /                            # fast or normal
│   │   │   ├── checkpoint_rag.json      # RAG query results & parsed file paths
│   │   │   ├── checkpoint_summary.json  # Extracted content, figures, tables
│   │   │   ├── summary.md               # Human-readable summary
│   │   │   └── /                        # e.g., slides_doraemon_medium
│   │   │       ├── state.json           # Current pipeline state
│   │   │       ├── checkpoint_plan.json # Content plan for slides/poster
│   │   │       └── /                    # Generated outputs
│   │   │           ├── slide_01.png
│   │   │           ├── slide_02.png
│   │   │           ├── ...
│   │   │           └── slides.pdf       # Final PDF output
│   │   └── rag_output/                  # RAG index storage
│   └── ...
└── ...
```

**Checkpoint Files**:

| File | Description | Reusable When |
|------|-------------|---------------|
| `checkpoint_rag.json` | Parsed document content | Same input files |
| `checkpoint_summary.json` | Figures, tables, structure | Same input files |
| `checkpoint_plan.json` | Content layout plan | Same style & length/density |

### Style Configuration

| Style | Description |
|-------|-------------|
| `academic` | Clean, professional academic presentation style |
| `doraemon` | Colorful, friendly style with illustrations |
| `custom` | Any text description for LLM-generated style |

### Image Generation Notes

> [!TIP]
> Paper2Slides uses `gemini-3-pro-image-preview` (Nano Banana Pro Preview) for image generation. Key findings:
>
> - **Mood Keywords**: Words like "warm", "elegant", "vibrant" strongly influence the overall color palette
> - **Layout vs Style**: Fine-grained *layout* instructions ground well; fine-grained *element styling* does not
> - **Prompt Length**: Simple prompts generally outperform detailed ones
> - **Multi-slide Generation**: Native multi-image output is story-like; for consistent slides, we use iterative single-image generation

---

## 📁 Code Structure

| Module | Description |
|--------|-------------|
| `paper2slides/core/` | Pipeline orchestration, 4-stage execution |
| `paper2slides/raganything/` | Document parsing & RAG indexing |
| `paper2slides/summary/` | Content extraction: figures, tables, paper structure |
| `paper2slides/generator/` | Content planning & image generation |
| `api/` | FastAPI backend for web interface |
| `frontend/` | React frontend (Vite + TailwindCSS) |
Full project structure:

```
Paper2Slides/
├── paper2slides/                 # Core library
│   ├── main.py                   # CLI entry point
│   ├── core/
│   │   ├── pipeline.py           # Main pipeline orchestration
│   │   ├── state.py              # Checkpoint state management
│   │   └── stages/
│   │       ├── rag_stage.py      # Stage 1: Parse & index
│   │       ├── summary_stage.py  # Stage 2: Extract content
│   │       ├── plan_stage.py     # Stage 3: Plan layout
│   │       └── generate_stage.py # Stage 4: Generate images
│   │
│   ├── raganything/
│   │   ├── raganything.py        # RAG processor
│   │   └── parser.py             # Document parser
│   │
│   ├── summary/
│   │   ├── paper.py              # Paper structure extraction
│   │   └── extractors/           # Figure/table extractors
│   │
│   ├── generator/
│   │   ├── content_planner.py    # Slide/poster planning
│   │   └── image_generator.py    # Image generation
│   │
│   ├── prompts/                  # LLM prompt templates
│   └── utils/                    # Utilities
│
├── api/server.py                 # FastAPI backend
├── frontend/src/                 # React frontend
└── scripts/                      # Shell scripts (start/stop)
```
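The checkpoint files listed under Output Directory Structure make it possible to inspect, from outside the pipeline, how far a run has progressed. The sketch below guesses the first stage that has no checkpoint on disk. It is an illustration only: `next_stage` is a hypothetical helper, the two directory arguments stand for the mode-level and configuration-level folders in the output tree, and the actual resume logic in `paper2slides/core/state.py` may use different rules (for example, `state.json`).

```python
import tempfile
from pathlib import Path


def next_stage(mode_dir: Path, config_dir: Path) -> str:
    """Guess the first pipeline stage that lacks a checkpoint file.

    Assumption: checkpoint_rag.json and checkpoint_summary.json live in the
    mode-level directory, and checkpoint_plan.json in the per-configuration
    directory, as shown in the output tree. Not the project's own logic.
    """
    if not (mode_dir / "checkpoint_rag.json").is_file():
        return "rag"
    if not (mode_dir / "checkpoint_summary.json").is_file():
        return "summary"
    if not (config_dir / "checkpoint_plan.json").is_file():
        return "plan"
    return "generate"


# Demonstration on a throwaway directory tree (folder names are examples).
with tempfile.TemporaryDirectory() as tmp:
    mode_dir = Path(tmp) / "normal"
    config_dir = mode_dir / "slides_doraemon_medium"
    config_dir.mkdir(parents=True)

    print(next_stage(mode_dir, config_dir))  # prints "rag": nothing saved yet

    (mode_dir / "checkpoint_rag.json").write_text("{}")
    (mode_dir / "checkpoint_summary.json").write_text("{}")
    print(next_stage(mode_dir, config_dir))  # prints "plan": only the plan is missing
```

The returned name matches the values accepted by `--from-stage`, so the result could be used to decide whether forcing a restart is needed at all.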
---

## 🙏 Related Open-Sourced Projects

- **[LightRAG](https://github.com/HKUDS/LightRAG)**: Graph-Empowered RAG
- **[RAG-Anything](https://github.com/HKUDS/RAG-Anything)**: Multi-Modal RAG
- **[VideoRAG](https://github.com/HKUDS/VideoRAG)**: RAG with Extremely-Long Videos

---
**🌟 Found Paper2Slides helpful? Star us on GitHub!**

**🚀 Turn any document into professional presentations in minutes!**
---

❤️ Thanks for visiting ✨ Paper2Slides!
