# Paper2Slides
**Repository Path**: liusha/Paper2Slides
## Basic Information
- **Project Name**: Paper2Slides
- **License**: MIT
- **Default Branch**: main
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-15
- **Last Updated**: 2025-12-15
## README

# Paper2Slides: From Paper to Presentation in One Click
✨ **Never Build Slides from Scratch Again** ✨

📄 **Universal File Support** | 🎯 **RAG-Powered Precision** | 🎨 **Custom Styling** | ⚡ **Lightning Speed**

---
## 🎯 What is Paper2Slides?
Paper2Slides turns your **research papers**, **reports**, and **documents** into **professional slides & posters** in **minutes**.
### ✨ Key Features
- 📄 **Universal Document Support**
  Process PDF, Word, Excel, PowerPoint, Markdown, and other formats, including multiple files at once.
- 🎯 **Comprehensive Content Extraction**
  A RAG-based retrieval step pulls key insights, figures, and data points from the source documents.
- 🔗 **Source-Linked Accuracy**
  Keeps generated content traceable to the original sources, which reduces information drift.
- 🎨 **Custom Styling Freedom**
  Choose a built-in theme or describe the style you want in natural language.
- ⚡ **Lightning-Fast Generation**
  Fast mode (`--fast`) skips RAG indexing for quick previews and rapid iteration.
- 💾 **Seamless Session Management**
  The checkpoint system preserves all progress: pause, resume, or switch themes without losing work.
- ✨ **Professional-Grade Visuals**
  Produces polished, presentation-ready slides and posters.
### ⚡ Easy as One Command
```bash
# One command to generate slides from a paper
python -m paper2slides --input paper.pdf --output slides --style doraemon --length medium --fast --parallel 2
```
---
## 🔥 News
- **[2025.12.09]** Added parallel slide generation (`--parallel`) for faster processing
- **[2025.12.08]** Paper2Slides is now open source!
---
## 🎨 Custom Styling Showcase
💡 Custom Style Example: Totoro Theme
```
--style "Studio Ghibli anime style with warm whimsical aesthetic. Use soft watercolor Morandi tones with light cream background, muted sage green and dusty pink accents. Totoro character can appear as a friendly guide relating to the content, with nature elements like soft clouds or leaves."
```
---
## 📋 Table of Contents
- [🎯 Quick Start](#-quick-start)
- [🏗️ Paper2Slides Framework](#%EF%B8%8F-paper2slides-framework)
- [🔧 Configuration](#%EF%B8%8F-configuration)
- [📁 Code Structure](#-code-structure)
---
## 🚀 Quick Start
### 1. Environment Setup
```bash
# Clone repository
git clone https://github.com/HKUDS/Paper2Slides.git
cd Paper2Slides
# Create and activate conda environment
conda create -n paper2slides python=3.12 -y
conda activate paper2slides
# Install dependencies
pip install -r requirements.txt
```
> [!NOTE]
> Create a `.env` file in the `paper2slides/` directory with your API keys. Refer to `paper2slides/.env.example` for the required variables.
### 2. Command Line Usage
```bash
# Basic usage - generate slides from a paper
python -m paper2slides --input paper.pdf --output slides --length medium
# Generate poster with custom style
python -m paper2slides --input paper.pdf --output poster --style "minimalist with blue theme" --density medium
# Fast mode
python -m paper2slides --input paper.pdf --output slides --fast
# Enable parallel generation (2 workers by default)
python -m paper2slides --input paper.pdf --output slides --parallel 2
# List all processed outputs
python -m paper2slides --list
```
**CLI Options**:
| Option | Description | Default |
|--------|-------------|---------|
| `--input, -i` | Input file(s) or directory | Required |
| `--output` | Output type: `slides` or `poster` | `poster` |
| `--content` | Content type: `paper` or `general` | `paper` |
| `--style` | Style: `academic`, `doraemon`, or custom | `doraemon` |
| `--length` | Slides length: `short`, `medium`, `long` | `short` |
| `--density` | Poster density: `sparse`, `medium`, `dense` | `medium` |
| `--fast` | Fast mode: skip RAG indexing | `false` |
| `--parallel` | Enable parallel slide generation: `--parallel` uses 2 workers, `--parallel N` uses N workers | `1` (sequential without this option) |
| `--from-stage` | Force restart from stage: `rag`, `summary`, `plan`, `generate` | Auto-detect |
| `--debug` | Enable debug logging | `false` |
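
The three behaviours of `--parallel` in the table (absent, bare flag, flag with a value) map onto a standard `argparse` pattern. The sketch below is a hypothetical parser that mirrors the options table; it is not the project's actual CLI code, and `build_parser` is a name chosen for illustration. It does not mark `--input` as required because `--list` is shown running without it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options table (not the project's code)."""
    p = argparse.ArgumentParser(prog="paper2slides")
    p.add_argument("--input", "-i", nargs="+", help="Input file(s) or directory")
    p.add_argument("--output", choices=["slides", "poster"], default="poster")
    p.add_argument("--content", choices=["paper", "general"], default="paper")
    # --style takes a built-in name or any free-text description
    p.add_argument("--style", default="doraemon")
    p.add_argument("--length", choices=["short", "medium", "long"], default="short")
    p.add_argument("--density", choices=["sparse", "medium", "dense"], default="medium")
    p.add_argument("--fast", action="store_true")
    # nargs="?" covers three cases: flag absent -> default (1),
    # bare flag -> const (2), flag with a value -> that value
    p.add_argument("--parallel", nargs="?", type=int, const=2, default=1)
    p.add_argument("--from-stage", choices=["rag", "summary", "plan", "generate"])
    p.add_argument("--debug", action="store_true")
    p.add_argument("--list", action="store_true")
    return p


parser = build_parser()
print(parser.parse_args(["-i", "paper.pdf"]).parallel)                     # 1
print(parser.parse_args(["-i", "paper.pdf", "--parallel"]).parallel)       # 2
print(parser.parse_args(["-i", "paper.pdf", "--parallel", "4"]).parallel)  # 4
```
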
**💾 Checkpoint & Resume**:
Paper2Slides saves progress after each key stage, so you can resume or partially rerun a job:
| Scenario | Command |
|----------|---------|
| **Resume after interruption** | Run the same command again; it detects existing checkpoints and continues |
| **Change style only** | Add `--from-stage plan` to skip re-parsing |
| **Regenerate images** | Add `--from-stage generate` to keep the same plan |
| **Full restart** | Add `--from-stage rag` to start from scratch |
> [!TIP]
> Checkpoints are auto-saved. Just run the same command to resume. Use `--from-stage` only to **force** restart from a specific stage.
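
One way to picture the auto-detect behaviour is a function that walks the stages in order and starts at the first one whose checkpoint is missing, unless `--from-stage` forces a choice. This is a minimal sketch under that assumption; `first_stage_to_run` is a hypothetical helper, and the real tool may use additional state (for example `state.json`) to decide.

```python
from typing import Optional

STAGES = ["rag", "summary", "plan", "generate"]

# Checkpoint file written by each stage. The final stage writes images to the
# output directory instead of a checkpoint file.
CHECKPOINTS = {
    "rag": "checkpoint_rag.json",
    "summary": "checkpoint_summary.json",
    "plan": "checkpoint_plan.json",
}


def first_stage_to_run(existing_files: set[str], from_stage: Optional[str] = None) -> str:
    """Pick the stage to start from (hypothetical helper, not the project's code)."""
    if from_stage is not None:
        if from_stage not in STAGES:
            raise ValueError(f"unknown stage: {from_stage}")
        return from_stage  # a forced restart wins over auto-detection
    for stage in STAGES[:-1]:
        if CHECKPOINTS[stage] not in existing_files:
            return stage  # earliest stage without a saved checkpoint
    return "generate"


# An interrupted run that finished parsing but not the summary
print(first_stage_to_run({"checkpoint_rag.json"}))
```

Passing `from_stage="rag"` returns `"rag"` regardless of which files exist, which corresponds to the full-restart row in the table.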
### 3. Web Interface
Launch both backend and frontend services:
```bash
./scripts/start.sh
```
Or start services independently:
```bash
# Terminal 1: Start backend API
./scripts/start_backend.sh
# Terminal 2: Start frontend
./scripts/start_frontend.sh
```
Access the web interface at `http://localhost:5173` (default).

---
## 🏗️ Paper2Slides Framework
Paper2Slides transforms documents through a 4-stage pipeline designed for **reliability** and **efficiency**:
| Stage | Description | Checkpoint | Output |
|-------|-------------|------------|--------|
| **📚 RAG** | Parse documents and build a retrieval index | `checkpoint_rag.json` | Searchable knowledge base |
| **📊 Analysis** | Extract document structure, identify key figures, tables, and content hierarchy | `checkpoint_summary.json` | Structured content map |
| **📋 Planning** | Generate the content layout and slide/poster organization | `checkpoint_plan.json` | Presentation blueprint |
| **🎨 Creation** | Render the final slides and poster visuals | Output directory | Polished presentation materials |
### 💾 Smart Recovery System
Each stage saves a checkpoint when it finishes, so an interrupted run resumes from the last completed stage instead of starting over.
### Fast Mode vs Normal Mode
| Mode | Processing Pipeline | Use Cases |
|------|---------------------|-----------|
| **Normal** | Complete RAG indexing with deep document analysis | Complex research papers, lengthy documents, multi-section content|
| **Fast** | Skip RAG indexing, direct LLM query | Short documents, instant previews, quick revisions |
Use `--fast` when:
- The document (text + figures) is short enough to fit in the LLM context
- You need a quick preview or fast iteration
- You don't want to wait for RAG indexing

Use normal mode (default) when:
- The document is long or has many figures
- Multiple files are processed together
- Retrieval is needed for better context selection
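
The guidance above can be read as a simple decision rule. The sketch below is illustrative only: `suggest_mode` is a hypothetical helper, and both the token estimate and the context window size are values the caller must supply, since the README does not state any thresholds.

```python
def suggest_mode(num_files: int, approx_tokens: int, context_window_tokens: int) -> str:
    """Return "fast" or "normal" following the guidance above (illustrative only)."""
    if num_files > 1:
        return "normal"  # multiple files are processed together
    if approx_tokens > context_window_tokens:
        return "normal"  # too long for a single direct LLM query
    return "fast"        # short single document: RAG indexing can be skipped


# A single short paper against an assumed 128k-token context window
print(suggest_mode(num_files=1, approx_tokens=8_000, context_window_tokens=128_000))
```
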
---
## ⚙️ Configuration
### Output Directory Structure
```
outputs/
├── /
│   ├── /                                   # paper or general
│   │   ├── /                               # fast or normal
│   │   │   ├── checkpoint_rag.json         # RAG query results & parsed file paths
│   │   │   ├── checkpoint_summary.json     # Extracted content, figures, tables
│   │   │   ├── summary.md                  # Human-readable summary
│   │   │   └── /                           # e.g., slides_doraemon_medium
│   │   │       ├── state.json              # Current pipeline state
│   │   │       ├── checkpoint_plan.json    # Content plan for slides/poster
│   │   │       └── /                       # Generated outputs
│   │   │           ├── slide_01.png
│   │   │           ├── slide_02.png
│   │   │           ├── ...
│   │   │           └── slides.pdf          # Final PDF output
│   │   └── rag_output/                     # RAG index storage
│   └── ...
└── ...
```
**Checkpoint Files**:
| File | Description | Reusable When |
|------|-------------|---------------|
| `checkpoint_rag.json` | Parsed document content | Same input files |
| `checkpoint_summary.json` | Figures, tables, structure | Same input files |
| `checkpoint_plan.json` | Content layout plan | Same style & length/density |
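
The "Reusable When" column can be expressed as a small comparison between the previous and current run settings. This is a sketch of that rule only, with hypothetical names (`RunConfig`, `reusable_checkpoints`); it assumes a plan is only reusable when the summary it was built from is also reusable, which the table implies but does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Settings that affect checkpoint reuse (hypothetical structure)."""
    inputs: tuple[str, ...]  # input file paths
    style: str               # built-in name or free-text description
    size: str                # --length for slides, --density for posters


def reusable_checkpoints(previous: RunConfig, current: RunConfig) -> list[str]:
    """List checkpoint files that remain valid for the current run."""
    reusable: list[str] = []
    if previous.inputs == current.inputs:
        # Parsing and extraction depend only on the input files
        reusable += ["checkpoint_rag.json", "checkpoint_summary.json"]
        if (previous.style, previous.size) == (current.style, current.size):
            # The plan additionally depends on style and length/density
            reusable.append("checkpoint_plan.json")
    return reusable


old = RunConfig(("paper.pdf",), "doraemon", "medium")
new = RunConfig(("paper.pdf",), "academic", "medium")
print(reusable_checkpoints(old, new))
```

Changing only the style keeps the first two checkpoints, which matches the `--from-stage plan` scenario in the Quick Start section.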
### Style Configuration
| Style | Description |
|-------|-------------|
| `academic` | Clean, professional academic presentation style |
| `doraemon` | Colorful, friendly style with illustrations |
| custom | Any free-text description passed to `--style`; the LLM generates the style from it |
### Image Generation Notes
> [!TIP]
> Paper2Slides uses `gemini-3-pro-image-preview` (Nano Banana Pro Preview) for image generation. Key findings:
>
> - **Mood Keywords**: Words like "warm", "elegant", "vibrant" strongly influence the overall color palette
> - **Layout vs Style**: Fine-grained *layout* instructions ground well; fine-grained *element styling* does not
> - **Prompt Length**: Simple prompts generally outperform detailed ones
> - **Multi-slide Generation**: Native multi-image output is story-like; for consistent slides, we use iterative single-image generation
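
The last note, one image request per slide with a shared style description, can be sketched as follows. The `render` callable is a hypothetical stand-in for the image model call, and the prompt wording is invented for illustration; neither reflects the project's actual prompts or API usage. The optional worker pool is an assumption about how a `--parallel`-style setting could be applied to independent per-slide requests.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def generate_slides(
    style: str,
    slide_texts: list[str],
    render: Callable[[str], bytes],
    workers: int = 1,
) -> list[bytes]:
    """Issue one image request per slide, repeating the same style text each time."""
    total = len(slide_texts)
    prompts = [
        f"Style: {style}\nSlide {i} of {total}:\n{text}"
        for i, text in enumerate(slide_texts, start=1)
    ]
    if workers <= 1:
        return [render(p) for p in prompts]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so slide order is preserved
        return list(pool.map(render, prompts))


def fake_render(prompt: str) -> bytes:
    """Stand-in for a real image model call; returns the prompt as bytes."""
    return prompt.encode("utf-8")


images = generate_slides("warm, elegant", ["Intro", "Method", "Results"], fake_render, workers=2)
print(len(images))  # 3
```
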
---
## 📁 Code Structure
| Module | Description |
|--------|-------------|
| `paper2slides/core/` | Pipeline orchestration, 4-stage execution |
| `paper2slides/raganything/` | Document parsing & RAG indexing |
| `paper2slides/summary/` | Content extraction: figures, tables, paper structure |
| `paper2slides/generator/` | Content planning & image generation |
| `api/` | FastAPI backend for web interface |
| `frontend/` | React frontend (Vite + TailwindCSS) |

**Full project structure**:
```
Paper2Slides/
├── paper2slides/                 # Core library
│   ├── main.py                   # CLI entry point
│   ├── core/
│   │   ├── pipeline.py           # Main pipeline orchestration
│   │   ├── state.py              # Checkpoint state management
│   │   └── stages/
│   │       ├── rag_stage.py      # Stage 1: Parse & index
│   │       ├── summary_stage.py  # Stage 2: Extract content
│   │       ├── plan_stage.py     # Stage 3: Plan layout
│   │       └── generate_stage.py # Stage 4: Generate images
│   │
│   ├── raganything/
│   │   ├── raganything.py        # RAG processor
│   │   └── parser.py             # Document parser
│   │
│   ├── summary/
│   │   ├── paper.py              # Paper structure extraction
│   │   └── extractors/           # Figure/table extractors
│   │
│   ├── generator/
│   │   ├── content_planner.py    # Slide/poster planning
│   │   └── image_generator.py    # Image generation
│   │
│   ├── prompts/                  # LLM prompt templates
│   └── utils/                    # Utilities
│
├── api/server.py                 # FastAPI backend
├── frontend/src/                 # React frontend
└── scripts/                      # Shell scripts (start/stop)
```
---
## 🙏 Related Open-Source Projects
- **[LightRAG](https://github.com/HKUDS/LightRAG)**: Graph-Empowered RAG
- **[RAG-Anything](https://github.com/HKUDS/RAG-Anything)**: Multi-Modal RAG
- **[VideoRAG](https://github.com/HKUDS/VideoRAG)**: RAG with Extremely-Long Videos
---
**🌟 Found Paper2Slides helpful? Star us on GitHub!**

**🚀 Turn any document into professional presentations in minutes!**

---
❤️ Thanks for visiting ✨ Paper2Slides!