# MegReader

**Repository Path**: SearchSource/MegReader

## Basic Information

- **Project Name**: MegReader
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-10-26
- **Last Updated**: 2021-10-26

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# MegReader

A project for research in text detection and recognition using PyTorch 1.2.

This project originated from a research repo of the CSG-Algorithm team at Megvii (https://megvii.com); that repo relies heavily on closed-source libraries. We are gradually porting models into this repo, and the implementations released so far are listed in [Progress](#progress).

## Highlights

- Implementations of representative text detection and recognition methods.
- An effective framework for conducting experiments: experiments are configured with yaml files, which makes them convenient to set up and run.
- Thorough logging features that make it easy to follow and analyze experimental results.
- CPU/GPU compatible for training and inference.
- Distributed training support.

## Install

### Requirements

`pip install -r requirements.txt`

- Python 3.7
- PyTorch 1.2 and CUDA 10.0
- gcc 5.5 (important for compiling)

### Compile CUDA ops (if needed)

```
cd PATH_TO_OPS
python setup.py build_ext --inplace
```

Ops that may be used:

- DeformableConvV2 `assets/ops/dcn`
- CTC2DLoss `ops/ctc_2d`

### Configuration (optional)

Edit configurations in `config.py`.

## Training

See detailed options: `python3 train.py --help`

## Datasets

For a quick start, we provide a data loading implementation that reads annotations packed in JSON. Data in LMDB format is also supported; see the [demo](experiments/recognition/crnn-lmdb.yaml) for usage.
Datasets used in our recognition experiments can be downloaded from [onedrive](https://megvii-my.sharepoint.cn/:f:/g/personal/wanzhaoyi_megvii_com/EjkcrpmiW6hJrUKY-0fEBRABvNMtYniUPfWLVptMmy9-6w?e=bJaYFo). A conversion [script](scripts/json_to_lmdb.py) is provided to convert JSON-format data to LMDB.

### Non-distributed

`python3 train.py PATH_TO_EXPERIMENT.yaml --validate --visualize --name NAME_OF_EXPERIMENT`

Configurations for some of the released recognition models:

- CRNN: `experiments/recognition/crnn.yaml`
- 2D CTC: `experiments/recognition/res50-ppm-2d-ctc.yaml`
- Attention Decoder: `experiments/recognition/fpn50-attention-decoder.yaml`

### Distributed (recommended for multi-GPU training)

`python3 -m torch.distributed.launch --nproc_per_node=NUM_GPUS train.py PATH_TO_EXPERIMENT.yaml -d --validate`

## Evaluating

See detailed options: `python3 eval.py --help`.

Testing with the aspect ratio kept is recommended:

`python3 eval.py PATH_TO_EXPERIMENT.yaml --resize_mode keep_ratio`

### Model zoo

Trained models are coming soon.

## Progress

### Recognition Methods

- [x] 2D CTC
- [x] CRNN
- [x] Attention Decoder
- [ ] Rectification

### Detection Methods

- [x] Text Snake
- [x] EAST

### End-to-end

- [ ] Mask Text Spotter

## Contributing

[Contributing.md](CONTRIBUTING.md)
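
## Appendix: scripting the commands

When several experiments are launched from a script, the training and evaluation commands listed in this README can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper names `build_train_command` and `build_eval_command` are hypothetical, and only flags that appear in the commands above are used. Whether `--visualize` or `--name` may be combined with `-d` is an assumption; check `python3 train.py --help` before relying on it.

```python
import shlex
from typing import List, Optional


def build_train_command(
    experiment: str,
    name: Optional[str] = None,
    num_gpus: int = 1,
    validate: bool = True,
    visualize: bool = False,
) -> List[str]:
    """Assemble an argv list for train.py (hypothetical helper)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    if num_gpus > 1:
        # Distributed form from the README: torch.distributed.launch starts
        # one process per GPU and train.py is passed -d.
        cmd = [
            "python3", "-m", "torch.distributed.launch",
            "--nproc_per_node={}".format(num_gpus),
            "train.py", experiment, "-d",
        ]
    else:
        # Non-distributed form from the README.
        cmd = ["python3", "train.py", experiment]

    if validate:
        cmd.append("--validate")
    if visualize:
        # Assumption: only shown in the non-distributed command.
        cmd.append("--visualize")
    if name is not None:
        # Assumption: only shown in the non-distributed command.
        cmd += ["--name", name]
    return cmd


def build_eval_command(experiment: str, keep_ratio: bool = True) -> List[str]:
    """Assemble an argv list for eval.py (hypothetical helper)."""
    cmd = ["python3", "eval.py", experiment]
    if keep_ratio:
        # The README recommends testing with the aspect ratio kept.
        cmd += ["--resize_mode", "keep_ratio"]
    return cmd


def to_shell(cmd: List[str]) -> str:
    """Render an argv list as a shell-safe string for logging."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    crnn = "experiments/recognition/crnn.yaml"
    print(to_shell(build_train_command(crnn, name="crnn-baseline", visualize=True)))
    print(to_shell(build_train_command(crnn, num_gpus=4)))
    print(to_shell(build_eval_command(crnn)))
```

The returned lists can be passed directly to `subprocess.run(cmd, check=True)` from the repository root. Building argv lists rather than shell strings avoids quoting problems with experiment paths that contain spaces.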