# EPNet

**Repository Path**: daxiongpro/EPNet

## Basic Information

- **Project Name**: EPNet
- **Description**: EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection(ECCV 2020)
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-06-04
- **Last Updated**: 2024-05-29

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# EPNet
EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection (ECCV 2020).
Paper is now available in [EPNet](https://arxiv.org/pdf/2007.08856.pdf), and the code is based on [PointRCNN](https://github.com/sshaoshuai/PointRCNN).

## Highlights
0. Without extra image annotations, e.g. 2D bounding box, Semantic labels and so on. 
2. A more accurate multi-scale point-wise fusion for Image and Point Cloud. 
3. The proposed CE loss can improve the performance of 3D Detection greatly.
3. Without GT AUG.

## Contributions
This is Pytorch implementation for EPNet on KITTI dataset, which  is mainly achieved by  [Liu Zhe](https://github.com/happinesslz) and [Huang Tengteng](https://github.com/tengteng95). Some parts also benefit from [Chen Xiwu](https://github.com/XiwuChen).

## Abstract
In this paper, we aim at addressing two critical issues in the 3D detection task, including the exploitation of multiple sensors~(namely LiDAR point cloud and camera image), as well as the inconsistency between the localization and classification confidence. To this end, we propose a novel fusion module to enhance the point features with semantic image features in a point-wise manner without any image annotations. Besides, a consistency enforcing loss is employed to explicitly encourage the consistency of both the localization and classification confidence. We design an end-to-end learnable framework named EPNet to integrate these two components. Extensive experiments on the KITTI and SUN-RGBD datasets demonstrate the superiority of EPNet over the state-of-the-art methods. 

![image](img/1.jpg)

## Network
The architecture of our two-stream RPN is shown in the below.
![image](img/2.jpg)


The architecture of our LI-Fusion module in the two-stream RPN.
![image](img/3.jpg)


## Install(Same with [PointRCNN](https://github.com/sshaoshuai/PointRCNN))

The Environment：
* Linux (tested on Ubuntu 16.04)
* Python 3.6+
* PyTorch 1.0+

a. Clone the PointRCNN repository.
```shell
git clone https://github.com/happinesslz/EPNet.git
```

b. Install the dependent python libraries like `easydict`,`tqdm`, `tensorboardX ` etc.

c. Build and install the `pointnet2_lib`, `iou3d`, `roipool3d` libraries by executing the following command:
```shell
sh build_and_install.sh
```

## Dataset preparation
Please download the official [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset and organize the downloaded files as follows: 
```
EPNet
├── data
│   ├── KITTI
│   │   ├── ImageSets
│   │   ├── object
│   │   │   ├──training
│   │   │      ├──calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   │   ├──testing
│   │   │      ├──calib & velodyne & image_2
├── lib
├── pointnet2_lib
├── tools
```


## Trained model
The results of Car on Recall 40:

|LI Fusion| CE loss|   Easy | Moderate |   Hard |  mAP  |models|
|  ----  | ----   |  ----  |   ----  |   ----  |  ---- |---- |
|    No  |  No    |  88.76 |  78.03  |   76.20 |  80.99| [Google](https://drive.google.com/drive/folders/1MT-gArGnu-4qrThjHtEKjWZ_8999nwWx?usp=sharing), [Baidu](https://pan.baidu.com/s/1ZBr9oFzOi2OZCutjYkpM6Q) (a43t)|
|    Yes |  No    |  89.93 |  80.77  |   77.25 |  82.65| [Google](https://drive.google.com/drive/folders/1mvmZY-XXmt059IODlJgKXtuFH0Wmp3MT?usp=sharing), [Baidu](https://pan.baidu.com/s/1MgTynfR2MfVspV8HMyzgGw) (dbxy)|
|    No  |  Yes   |  92.12 |  81.48  |   79.34 |  84.31| [Google](https://drive.google.com/drive/folders/1Up2siHcBOIIGrHKok7nVu9YoBPQS9rcR?usp=sharing), [Baidu](https://pan.baidu.com/s/19-CVgQT_lQ6iyb-PVZ0sDQ) (hrkv)|
|    Yes | Yes    |  92.17 |  82.68  |   80.10 |  84.99| [Google](https://drive.google.com/drive/folders/1tON7-ooxcEMeB7wEfH914SRPlai7npU9?usp=sharing), [Baidu](https://pan.baidu.com/s/1rMnodG0a5uuJtCSX-bzr1Q) (nasm)|

Besides, adding iou branch to EPNet (the last line in  the above table) can bring a minor improvement and the results are more stable. The result is 92.50(Easy), 82.45(Moderate), 80.29(Hard), 85.08(mAP), and the model checkpoint can be obtained from [Google](https://drive.google.com/drive/folders/13Vs2E8oDD53mrI6Q6GcknpfbzbRxcgbD?usp=sharing), [Baidu](https://pan.baidu.com/s/1y-TuenVzpltsCGNcBxMO0w) (8sir).

To evaluate all these models, please download the above  models. Unzip these models  and place them to "./log/Car/models"
```shell
cd ./tools
mkdir -p log/Car/models
bash run_eval_model.sh
```
## Implementation
### Training
Run EPNet for single gpu:
```shell
CUDA_VISIBLE_DEVICES=0 python train_rcnn.py --cfg_file cfgs/LI_Fusion_with_attention_use_ce_loss.yaml --batch_size 2 --train_mode rcnn_online --epochs 50 --ckpt_save_interval 1 --output_dir ./log/Car/full_epnet_without_iou_branch/   --set LI_FUSION.ENABLED True LI_FUSION.ADD_Image_Attention True RCNN.POOL_EXTRA_WIDTH 0.2 RPN.SCORE_THRESH 0.2 RCNN.SCORE_THRESH 0.2  USE_IOU_BRANCH False TRAIN.CE_WEIGHT 5.0
```
Run EPNet for two gpu:
```shell
CUDA_VISIBLE_DEVICES=0,1 python train_rcnn.py --cfg_file cfgs/LI_Fusion_with_attention_use_ce_loss.yaml --batch_size 6 --train_mode rcnn_online --epochs 50 --mgpus --ckpt_save_interval 1 --output_dir ./log/Car/full_epnet_without_iou_branch/   --set LI_FUSION.ENABLED True LI_FUSION.ADD_Image_Attention True RCNN.POOL_EXTRA_WIDTH 0.2 RPN.SCORE_THRESH 0.2 RCNN.SCORE_THRESH 0.2  USE_IOU_BRANCH False TRAIN.CE_WEIGHT 5.0
```
### Testing
```shell
CUDA_VISIBLE_DEVICES=2 python eval_rcnn.py --cfg_file cfgs/LI_Fusion_with_attention_use_ce_loss.yaml --eval_mode rcnn_online  --eval_all  --output_dir ./log/Car/full_epnet_without_iou_branch/eval_results/  --ckpt_dir ./log/Car/full_epnet_without_iou_branch/ckpt --set  LI_FUSION.ENABLED True LI_FUSION.ADD_Image_Attention True RCNN.POOL_EXTRA_WIDTH 0.2  RPN.SCORE_THRESH 0.2 RCNN.SCORE_THRESH 0.2  USE_IOU_BRANCH False
```

## Acknowledgement
The code is based on [PointRCNN](https://github.com/sshaoshuai/PointRCNN). 

## Citation
If you find this work useful in your research, please consider cite:

```
@article{Huang2020EPNetEP,
  title={EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection},
  author={Tengteng Huang and Zhe Liu and Xiwu Chen and Xiang Bai},
  booktitle ={ECCV},
  month = {July},
  year={2020}
}
```
```
@InProceedings{Shi_2019_CVPR,
    author = {Shi, Shaoshuai and Wang, Xiaogang and Li, Hongsheng},
    title = {PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2019}
}
```