# PLBART

**Repository Path**: ecust-dp/plbart

## Basic Information

- **Project Name**: PLBART
- **Description**: Code replication for the Vulnerability Identification experiments in the NAACL 2021 paper *Unified Pre-training for Program Understanding and Generation*
- **License**: MIT
- **Default Branch**: master
- **Created**: 2024-06-12
- **Last Updated**: 2024-06-19

## README

#### Project Introduction

This repository replicates the Vulnerability Identification experiments from the NAACL 2021 paper *Unified Pre-training for Program Understanding and Generation*.

#### System Environment

| Machine | HPC platform application name | CPU | RAM | GPU | OS |
|---------|-------------------------------|----------|--------|--------------------------|--------------|
| P40-1 | gym | 10 cores | 100 GB | NVIDIA Tesla P40 24 GB | Ubuntu 18.04 |
| P40-2 | Tensorflow_GPU | 10 cores | 64 GB | NVIDIA Tesla P40 24 GB | CentOS 7.7 |
| 3090 | Desktop_GPU | 10 cores | 20 GB | NVIDIA GeForce RTX 3090 | CentOS 7.8 |

#### Environment Setup

Clone the repository and create the conda environment:

```
git clone https://gitee.com/ecust-dp/plbart.git
cd plbart
conda create --name PLBART python=3.6.10
conda activate PLBART
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2
pip install submitit
```

Create a folder for third-party dependencies:

```
mkdir third_party
cd third_party
```

Clone fairseq from either of the following mirrors:

```
git clone https://gitee.com/ecust-dp/fairseq
```

or

```
git clone https://github.com/pytorch/fairseq
```

Check out the pinned commit, install fairseq, and return to the repository root:

```
cd fairseq
git checkout 698e3b91ffa832c286c48035bdff78238b0de8ae
pip install .
cd ../..
```

Install the remaining Python packages:

```
pip install sacrebleu==1.2.11
pip install tree_sitter==0.2.1
pip install sentencepiece
pip install scikit-learn
```

**Note:** If you encounter errors while installing the packages above, refer to **requirements_notes.txt**.

**Prepare pretrained PLBART**

Download **plbart_base.pt** from the high-performance computing (HPC) platform (Path: Ecust-SE/DP/PLBART-JIT/pretrain/) and put it under the **pretrain/** folder.
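Before moving on, it can help to confirm that the packages installed above are importable and that the checkpoint is where the scripts expect it. The following is a minimal sketch, not part of the repository: the function names `missing_modules` and `checkpoint_ready` are hypothetical, and the import names are assumptions mapped from the package names above (for example, `scikit-learn` is imported as `sklearn` and `pytorch` as `torch`).

```python
import importlib.util
from pathlib import Path

# Import names assumed to correspond to the packages installed above.
REQUIRED_MODULES = [
    "torch", "torchvision", "submitit", "fairseq",
    "sacrebleu", "tree_sitter", "sentencepiece", "sklearn",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def checkpoint_ready(repo_root="."):
    """Return True if pretrain/plbart_base.pt exists under repo_root."""
    return (Path(repo_root) / "pretrain" / "plbart_base.pt").is_file()


if __name__ == "__main__":
    # Run from the repository root inside the PLBART conda environment.
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
    print("Checkpoint present:", checkpoint_ready("."))
```

An empty list of missing modules and `Checkpoint present: True` suggest the setup steps so far completed; this check does not verify package versions or GPU availability.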
**Prepare data files for fine-tuning PLBART**

Download **code-to-code.zip** from the high-performance computing (HPC) platform (Path: Ecust-SE/DP/PLBART-JIT/data/codeXglue) and upload it to **data/codeXglue**. Then extract the **code-to-code** folder, which contains the **defect_prediction** folder:

```
unzip code-to-code.zip
```

#### Usage

**Preprocess the data and run fine-tuning**

```
cd scripts/code_to_code/defect_prediction/
bash prepare.sh
bash run.sh 0
```

#### Results Comparison

Table 9: Results on the vulnerable code detection task (accuracy). **Acc-paper** is the accuracy reported in the paper; the remaining columns are the accuracies obtained in this replication on each machine.

| Methods      | Acc-paper | Acc-4090 | Acc-3090 | Acc-P40-1 | Acc-P40-2 |
|--------------|-----------|----------|----------|-----------|-----------|
| Transformer  | 61.64     |          | 54.06    | 54.06     | 54.06     |
| CodeBERT     | 62.08     |          | 63.91    | 63.54     | 63.54     |
| PLBART       | 63.18     | 61.20    | 61.75    | 61.79     | 61.79     |

**Notes:**

1. Other metrics, such as precision, recall, and F1-score, can be found in **results_GPU_Type.png**.
2. A detailed log of the environment installation and PLBART fine-tuning is recorded in **prepare_environments_running_P40-2.log**.
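To see at a glance how far each replicated run lands from the paper's numbers, the accuracies in the table above can be differenced per method and machine. The sketch below is plain arithmetic on the values reported in the table; the function name `accuracy_gaps` is hypothetical, and empty table cells are represented as `None` and skipped.

```python
# Accuracy values copied from the results table; None marks an empty cell.
PAPER = {"Transformer": 61.64, "CodeBERT": 62.08, "PLBART": 63.18}
REPLICATED = {
    "Transformer": {"4090": None, "3090": 54.06, "P40-1": 54.06, "P40-2": 54.06},
    "CodeBERT": {"4090": None, "3090": 63.91, "P40-1": 63.54, "P40-2": 63.54},
    "PLBART": {"4090": 61.20, "3090": 61.75, "P40-1": 61.79, "P40-2": 61.79},
}


def accuracy_gaps(paper, replicated):
    """Return replicated-minus-paper accuracy per method and machine."""
    gaps = {}
    for method, runs in replicated.items():
        gaps[method] = {
            machine: round(acc - paper[method], 2)
            for machine, acc in runs.items()
            if acc is not None  # skip machines with no reported result
        }
    return gaps


if __name__ == "__main__":
    for method, per_machine in accuracy_gaps(PAPER, REPLICATED).items():
        print(method, per_machine)
```

A negative value means the replicated accuracy is below the paper's figure; a positive value means it is above.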