# DSOD

**Repository Path**: MySpaceZJP/DSOD

## Basic Information

- **Project Name**: DSOD
- **Description**: DSOD: Learning Deeply Supervised Object Detectors from Scratch. In ICCV 2017.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-09
- **Last Updated**: 2024-10-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# DSOD: Learning Deeply Supervised Object Detectors from Scratch

## Update (02/26/2019)

We observe that simply increasing the batch size (bs) on each GPU from 4 (Titan X) to 12 (P40) for training BN layers lets our DSOD300 achieve much better performance (from 77.7 to 78.9 mAP on the VOC 2007 test set) without any other modifications (see the comparison below). We think that with a better way to tune the BN layers' parameters, e.g., Sync BN [1] or Group Norm [2], when training detectors from scratch **with limited batch size**, the accuracy may be higher still. This is also consistent with [3].

*We have also provided some preliminary results on exploring the factors of training two-stage detectors from scratch in our extended [paper](https://arxiv.org/abs/1809.09294) (v2) [4].*

New results on PASCAL VOC test set:

| Method | VOC 2007 test *mAP* | # parameters | Models |
|:-------|:-----:|:-------:|:-------:|
| DSOD300 (07+12) bs=4 on each GPU | 77.7 | 14.8M | [Download (59.2M)](https://drive.google.com/open?id=0B4cvsEOB5eUCaGU3MkRkOENRWWc) |
| DSOD300 (07+12) bs=12 on each GPU | 78.9 | 14.8M | [Download (59.2M)](https://drive.google.com/open?id=1_ur6TYiLPUGsHoZQM1yxAZ2AXgSe-Qxm) |

[1] Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, and Jian Sun. "MegDet: A large mini-batch object detector." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6181-6189. 2018.

[2] Yuxin Wu and Kaiming He. "Group normalization."
In Proceedings of the European Conference on Computer Vision (ECCV), pp. 3-19. 2018.

[3] Kaiming He, Ross Girshick, and Piotr Dollár. "Rethinking ImageNet pre-training." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4918-4927. 2019.

[4] Zhiqiang Shen, Zhuang Liu, Jianguo Li, Yu-Gang Jiang, Yurong Chen, and Xiangyang Xue. "Object detection from scratch with deep supervision." IEEE Transactions on Pattern Analysis and Machine Intelligence (2019).

-------------------------------------------------------------------------------------

This repository contains the code for the following paper:

[DSOD: Learning Deeply Supervised Object Detectors from Scratch](http://openaccess.thecvf.com/content_ICCV_2017/papers/Shen_DSOD_Learning_Deeply_ICCV_2017_paper.pdf) (ICCV 2017).

[Zhiqiang Shen](http://www.zhiqiangshen.com)\*, [Zhuang Liu](https://liuzhuang13.github.io/)\*, [Jianguo Li](https://sites.google.com/site/leeplus/), [Yu-Gang Jiang](http://www.yugangjiang.info/), [Yurong Chen](https://scholar.google.com/citations?user=MKRyHXsAAAAJ&hl=en), [Xiangyang Xue](https://scholar.google.com/citations?user=DTbhX6oAAAAJ&hl=en). (\*Equal Contribution)

The code is based on the [SSD](https://github.com/weiliu89/caffe/tree/ssd) framework.

Other Implementations: [[PyTorch]](https://github.com/chenyuntc/dsod.pytorch) by Yun Chen, [[PyTorch]](https://github.com/uoip/SSD-variants) by uoip, [[PyTorch]](https://github.com/qqadssp/DSOD-Pytorch) by qqadssp, [[PyTorch]](https://github.com/Ellinier/DSOD-Pytorch-Implementation) by Ellinier, [[MXNet]](https://github.com/leocvml/DSOD-gluon-mxnet) by Leo Cheng, [[MXNet]](https://github.com/eureka7mt/mxnet-dsod) by eureka7mt, [[TensorFlow]](https://github.com/Windaway/DSOD-Tensorflow) by Windaway.


If you find this helps your research, please cite:

    @inproceedings{Shen2017DSOD,
      title = {DSOD: Learning Deeply Supervised Object Detectors from Scratch},
      author = {Shen, Zhiqiang and Liu, Zhuang and Li, Jianguo and Jiang, Yu-Gang and Chen, Yurong and Xue, Xiangyang},
      booktitle = {ICCV},
      year = {2017}
    }

    @article{shen2018object,
      title = {Object Detection from Scratch with Deep Supervision},
      author = {Shen, Zhiqiang and Liu, Zhuang and Li, Jianguo and Jiang, Yu-Gang and Chen, Yurong and Xue, Xiangyang},
      journal = {arXiv preprint arXiv:1809.09294},
      year = {2018}
    }

## Introduction

DSOD focuses on the problem of training object detectors from scratch (without models pretrained on ImageNet). To the best of our knowledge, this is the first work that trains neural object detectors from scratch with state-of-the-art performance. In this work, we contribute a set of design principles for this purpose. One of the key findings is that the deeply supervised structure, enabled by [dense layer-wise connections](https://github.com/liuzhuang13/DenseNet), plays a critical role in learning a good detection model. Please see our paper for more details.
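
As a rough illustration of the dense layer-wise connections mentioned above, the sketch below builds a toy dense block in plain Python. It is not the DSOD network or its Caffe definition: `toy_layer`, `dense_block`, and the layer count and growth rate used here are hypothetical names and values, chosen only to show the connectivity pattern in which each layer receives the concatenation of all earlier feature maps.

```python
from typing import List

Feature = List[float]  # toy stand-in: one float per feature-map channel


def toy_layer(inputs: Feature, growth_rate: int) -> Feature:
    """Hypothetical layer: derive `growth_rate` new channels from all inputs."""
    mean = sum(inputs) / len(inputs)
    return [mean + i for i in range(growth_rate)]


def dense_block(x: Feature, num_layers: int, growth_rate: int) -> Feature:
    """Toy dense block: every layer sees the concatenation of all earlier outputs."""
    features = list(x)
    for _ in range(num_layers):
        new_channels = toy_layer(features, growth_rate)
        # Concatenate rather than add, so earlier features pass through unchanged.
        features = features + new_channels
    return features


if __name__ == "__main__":
    out = dense_block([1.0, 2.0], num_layers=3, growth_rate=2)
    print(len(out))  # 8 = 2 input channels + 3 layers * growth rate 2
    print(out[:2])   # [1.0, 2.0], the input channels are carried through
```

Because the block concatenates instead of overwriting, the channel count grows linearly with depth (input channels plus layers times growth rate), and the output still contains every earlier layer's features. That short path from early layers to the output is the property the deep-supervision argument relies on; see the paper for the actual architecture.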