# L3DAS22

**Repository Path**: linan2/L3DAS22

## Basic Information

- **Project Name**: L3DAS22
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-11-28
- **Last Updated**: 2022-01-07

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# L3DAS22 challenge supporting API
This repository supports the L3DAS22 IEEE ICASSP Grand Challenge and it is aimed at downloading the dataset, pre-processing the sound files and the metadata, training and evaluating the baseline models and validating the final results.
We provide easy-to-use instruction to produce the results included in our paper.
Moreover, we extensively commented our code for easy customization.

For further information please refer to the challenge [website](https://www.l3das.com/icassp2022/index.html) and to the challenge [documentation](https://www.l3das.com/assets/file/L3DAS22_ICASSP_documentation.pdf).

## Installation
Our code is based on Python 3.7.

To install all required dependencies run:
```bash
pip install -r requirements.txt
```

Follow [these](https://www.kaggle.com/docs/api) instructions to properly create and place the kaggle.json file.

## Dataset download
It is possible to download the entire dataset through the script **download_dataset.py**. This script downloads the data, extracts the archives, merges the 2 parts of task1 train360 files and prepares all folders for the preprocessing stage.

To download run this command:
```bash
python download_dataset.py --output_path ./DATASETS --unzip True
```

This script may take long, especially the unzipping stage.

Alternatively, it is possible to manually download the dataset from [Kaggle](https://www.kaggle.com/l3dasteam/l3das22).

The *train360* section of task 1 is split in 2 downloadable files. If you manually download the dataset, you should manually merge the content of the 2 folders. You can use the function download_dataset.merge_train360().
Example:

```python
import download_dataset

train360_path = "path_that_contains_both_train360_parts"
download_dataset.merge_train360(train360_path)
```

## Pre-processing
The file **preprocessing.py** provides automated routines that load the raw audio waveforms and their correspondent metadata, apply custom pre-processing functions and save numpy arrays (.pkl files) containing the separate predictors and target matrices.

Run these commands to obtain the matrices needed for our baseline models:
```bash
python preprocessing.py --task 1 --input_path DATASETS/Task1 --training_set train100 --num_mics 1
python preprocessing.py --task 2 --input_path DATASETS/Task2 --num_mics 1 --frame_len 100
```
The two tasks of the challenge require different pre-processing.

For **Task1** the function returns 2 numpy arrays contatining:
* Input multichannel audio waveforms (3d noise+speech scenarios) - Shape: [n_data, n_channels, n_samples].
* Output monoaural audio waveforms (clean speech) - Shape [n_data, 1, n_samples].

For **Task2** the function returns 2 numpy arrays contatining:
* Input multichannel audio spectra (3d acoustic scenarios): Shape: [n_data, n_channels, n_fft_bins, n_time_frames].
* Output seld matrices containing the class ids of all sounds present in each 100-milliseconds frame alongside with their location coordinates - Shape: [n_data, n_frames, ((n_classes * n_class_overlaps) + (n_classes * n_class_overlaps * n_coordinates))], where n_class_overlaps is the maximum amount of possible simultaneous sounds of the same class (3) and n_coordinates refers to the spatial dimensions (3).


## Baseline models
We provide baseline models for both tasks, implemented in PyTorch. For task 1 we use a Beamforming U-Net and for task 2 an augmented variant of the SELDNet architecture. Both models are based on the single-microphone dataset configuration. Moreover, for task 1 we used only Train100 as training set.

To train our baseline models with the default parameters run:
```bash
python train_baseline_task1.py
python train_baseline_task2.py
```
These models will produce the baseline results mentioned in the paper.

GPU is strongly recommended to avoid very long training times.

Alternatively, it is possible to download our pre-trained models with these commands:
```bash
python download_baseline_models.py --task 1 --output_path RESULTS/Task1/pretrained
python download_baseline_models.py --task 2 --output_path RESULTS/Task2/pretrained
```
These models are also available for manual download [here](https://drive.google.com/drive/u/1/folders/1rTvlzoQM6ZxVTZe6PSJ_-yHx-uHa5z4z).

We also provide a Replicate [interactive demo](https://replicate.ai/l3das/l3das22_challenge) of both baseline models.


## Evaluaton metrics
Our evaluation metrics for both tasks are included in the **metrics.py** script.
The functions **task1_metric** and **location_sensitive_detection** compute the evaluation metrics for task 1 and task 2, respectively. The default arguments reflect the challenge requirements. Please refer to the above-linked challenge paper for additional information about the metrics and how to format the prediction and target vectors.

Example:
```python
import metrics

task1_metric = metrics.task1_metric(prediction_vector, target_vector)
_,_,_,task2_metric = metrics.location_sensitive_detection(prediction_vector, target_vector)
```

To compute the challenge metrics for our basiline models run:
```bash
python evaluate_baseline_task1.py
python evaluate_baseline_task2.py
```

In case you want to evaluate our pre-trained models, please add
`
--model_path path/to/model
`
to the above commands.

## Submission shape validation
The script **validate_submission.py** can be used to assess the validity of the submission files shape. Instructions about how to format the submission can be found in the L3das [website](https://www.l3das.com/icassp2022/submission.html)
Use these commands to validate your submissions:
```bash
python validate_submission.py --task 1 --submission_path path/to/task1_submission_folder --test_path path/to/task1_test_dataset_folder

python validate_submission.py --task 2 --submission_path path/to/task2_submission_folder --test_path path/to/task2_test_dataset_folder
```

For each task, this script asserts if:
* The number of submitted files is correct
* The naming of the submitted files is correct
* Only the files to be submitted are present in the folder
* The shape of each submission file is as expected

Once you have valid submission folders, please follow the instructions on the link above to proceed with the submission.