# dropseqRunner

**Repository Path**: fan_chuiqin/dropseqRunner

## Basic Information

- **Project Name**: dropseqRunner
- **Description**: A lightweight upstream program for Drop-seq
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-03-30
- **Last Updated**: 2026-03-30

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

## TLDR

```
git clone git@github.com:aselewa/dropseqRunner.git
cd dropseqRunner
conda env create -f environment.yaml
conda activate dropRunner

STAR --runThreadN 4 --runMode genomeGenerate --genomeDir $OUTDIR --genomeFastaFiles $FASTA --sjdbGTFfile $ANNOTATION_GTF

# --protocol takes one of: drop, 10x-v2, 10x-v3
python dropRunner.py  --R1 path/to/{}.R1.fastq.gz \
                      --R2 path/to/{}.R2.fastq.gz \
                      --indices $OUTDIR \
                      --protocol 10x-v3 \
                      --sample pbmc_v3
```

If the above gives you any trouble, run the demo to ensure everything is installed properly:
```
make run_test_workflow
```
Look for a message at the end that tells you whether the demo ran properly or not.

## Getting started

dropRunner is a Snakemake-based pipeline for processing single-cell RNA-seq data from the Drop-seq and 10x platforms. It uses STARsolo for alignment and for constructing the digital expression matrix. It also produces a detailed HTML report showing the sequencing statistics and the read distribution across the genome. The pipeline only works on Linux systems (excluding the Windows Subsystem for Linux).

**This pipeline is still under active development. If you have issues, please report them via GitHub.** 

### Setting up conda

You may **skip** this if you already have conda installed and configured.
 
`miniconda3` is a minimal version of `Anaconda`. To install it on 64-bit Linux, do the following:

```
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
```
Once it is done, initialize conda so it is in your path:

```
conda init bash
source ~/.bashrc
```

Check conda works:

```
conda --version
```

### 0. Set up and activate environment

Use the provided `environment.yaml` file to set up the conda environment.

```
git clone git@github.com:aselewa/dropseqRunner.git
cd dropseqRunner
conda env create -f environment.yaml
```
This may take some time depending on your environment. A fresh conda installation should take about 10-15 minutes. 

**Activate the environment before running the next steps**

```
conda activate dropRunner
```
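
Once the environment is active, a quick check that the key executables are on `PATH` can save a failed run later. The following is a minimal sketch: `check_tools` is a hypothetical helper, not part of this repository, and the tool names in the usage note are an assumption based on the pipeline description rather than on the contents of `environment.yaml`.

```shell
# Hypothetical helper (not part of dropseqRunner): report which of the
# given executables can or cannot be found on PATH.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero exit status if at least one tool was not found.
    return "$missing"
}
```

For example, `check_tools STAR snakemake python` prints one line per tool and returns a non-zero status if any of them is missing.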

### 1. Make reference genome indices

Use `STAR` to create indices for your reference genome of interest. You will need two things:

* fasta file of reference genome
* reference genome GTF annotations

You can get both of these from [GENCODE](https://www.gencodegenes.org/human/) for humans.

```
STAR --runThreadN 4 --runMode genomeGenerate --genomeDir $OUTDIR --genomeFastaFiles  $FASTA --sjdbGTFfile $ANNOTATION_GTF 
```
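
GENCODE distributes its FASTA and GTF files gzip-compressed, while STAR's `genomeGenerate` mode expects plain-text input, so the files need to be decompressed first. A minimal sketch of a pre-flight check follows; `check_star_inputs` is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper (not part of dropseqRunner): verify that the FASTA and
# GTF files exist and are not gzip-compressed before building the index.
check_star_inputs() {
    local fasta=$1
    local gtf=$2
    local f
    for f in "$fasta" "$gtf"; do
        if [ ! -f "$f" ]; then
            echo "error: $f not found" >&2
            return 1
        fi
        case "$f" in
            *.gz)
                echo "error: $f is gzip-compressed; run gunzip first" >&2
                return 1
                ;;
        esac
    done
    echo "inputs ok"
}
```

It could be chained in front of the index build, for example `check_star_inputs "$FASTA" "$ANNOTATION_GTF" && mkdir -p "$OUTDIR"`, followed by the `STAR` command above.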

### 2. Run the pipeline

Use `dropRunner.py` on your FASTQ files to generate count matrices. Use the `--protocol` parameter to specify `drop`, `10x-v2`, or `10x-v3`; the latter two are versions 2 and 3 of the 10x platform.

```
python dropRunner.py  --R1 path/to/{}.R1.fastq.gz \
                      --R2 path/to/{}.R2.fastq.gz \
                      --indices $OUTDIR \
                      --protocol drop \
                      --cluster \
                      --sample my_example_project
```

**Note**: You can supply multiple R1 and R2 files by passing comma-delimited lists. This bash command is useful for building one:

```
R1=$(ls *.R1.fastq.gz | paste -sd,)
```
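
Because R1 and R2 are passed as separate lists, the two lists presumably need to name the mates in the same order. The sketch below builds either list from a directory and offers a check that both lists have the same number of entries. `fastq_list` and `same_length` are hypothetical helpers, and the `.R1.fastq.gz` / `.R2.fastq.gz` suffixes simply follow the naming used in the examples above.

```shell
# Hypothetical helper: print a comma-delimited list of the
# <dir>/*.<mate>.fastq.gz files, where <mate> is R1 or R2.
fastq_list() {
    local dir=$1
    local mate=$2
    # Glob expansion is sorted, so the R1 and R2 lists come out in matching
    # order as long as mates share a filename prefix.
    printf '%s\n' "$dir"/*."$mate".fastq.gz | paste -sd, -
}

# Hypothetical helper: succeed only if both comma-delimited lists have the
# same number of entries.
same_length() {
    local n1 n2
    n1=$(printf '%s\n' "$1" | tr ',' '\n' | wc -l)
    n2=$(printf '%s\n' "$2" | tr ',' '\n' | wc -l)
    [ "$n1" -eq "$n2" ]
}
```

A possible use is `R1=$(fastq_list path/to R1)` and `R2=$(fastq_list path/to R2)`, then `same_length "$R1" "$R2"` before passing both to `dropRunner.py`. Note that if no file matches, the unexpanded glob pattern is printed as-is, so it is worth inspecting the lists once by eye.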

### 3. Output

There are two pieces of information that most users will need:

* html reports
* count matrices

The HTML report is in `reports/`. The count matrices are in `output/{project_name}_Solo.out`. There are two types of count matrix: `filtered` and `raw`. The raw matrix contains all valid barcodes, while the filtered matrix contains only barcodes with at least a certain number of UMIs. This threshold is determined by `STARsolo` using a heuristic approach.
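
To see how many barcodes the filter kept, one can compare the barcode lists of the two matrices. This sketch assumes STARsolo's usual layout, in which `raw/` and `filtered/` sit under a `Gene/` subdirectory and each contains an uncompressed `barcodes.tsv`; the exact layout produced by this pipeline may differ. `count_barcodes` is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper: print the number of barcodes in the raw and filtered
# matrices under a STARsolo output directory (assumed layout: Gene/<kind>/).
count_barcodes() {
    local solo_dir=$1
    local kind n
    for kind in raw filtered; do
        # One barcode per line, so the line count is the number of barcodes.
        n=$(wc -l < "$solo_dir/Gene/$kind/barcodes.tsv") || return 1
        # Arithmetic expansion strips any padding that wc adds.
        echo "$kind: $((n))"
    done
}
```

For example, `count_barcodes output/my_example_project_Solo.out` would print one line for each matrix type.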

Please report any issues you run into via GitHub.