Acoustic detection of a nocturnal bird with deep learning: the challenge of low signal-to-noise ratio
This repository contains the Python scripts needed to reproduce the experiments and figures of a research project on the influence of the signal-to-noise ratio (SNR) on the training and inference of a CNN applied to soundscape recordings for the detection of Boreal Owl (*Aegolius funereus*) vocalizations in the Risoux forest.
Clone the repository:

```shell
git clone https://github.com/ear-team/MICHAUD_CNN_SNR_BOREAL_OWL.git
```

The dataset needed to reproduce the results will be made freely available on Zenodo once the paper is accepted for publication.
This Python code tests the influence of the SNR on the convergence of a CNN. A custom-built CNN with 5,900 parameters can be trained on three training subsets of 384 Boreal Owl vocalizations each, with three different SNR distributions:

1) low+medium+high SNR, corresponding to the complete SNR distribution;
2) medium+high SNR, corresponding to the middle and last tertiles of the SNR distribution;
3) high SNR only, corresponding to the last tertile of the SNR distribution.

Four input representations can be tested: Mel-spectrogram, Mel-spectrogram with data augmentation, PCEN, and PCEN with data augmentation. Each of the three SNR distributions is trained 10 times, with random seeds ranging from 0 to 9, to test the effect of the random initialisation. A total of 30 trainings is therefore conducted.
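The tertile split described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions (the SNR values and variable names are hypothetical, not the repository's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical SNR values (in dB) for 3 x 384 = 1152 vocalizations
snr = rng.uniform(-10, 30, size=1152)

# Tertile boundaries of the SNR distribution
t1, t2 = np.quantile(snr, [1 / 3, 2 / 3])

low = snr[snr < t1]
medium = snr[(snr >= t1) & (snr < t2)]
high = snr[snr >= t2]

# The three training subsets of the experiment
subsets = {
    "low+medium+high": snr,            # complete distribution
    "medium+high": snr[snr >= t1],     # middle and last tertiles
    "high": high,                      # last tertile only
}
```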
The command line to run the code is:

```shell
python TERTILE_main_process.py -trainingPath "fill with the path to the training repo" -noisePath "fill with the path to the noise repo" --nameExp "fill with the name of the experiment"
```

It is also possible to change the sample rate, to choose whether data augmentation is applied, to select the input format (Mel or PCEN), and to set the batch size and the maximal number of epochs. The parameters used in the paper are the default parameters.
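The difference between the two input formats can be illustrated with a minimal NumPy sketch of per-channel energy normalization (PCEN) applied to a spectrogram. This is not the repository's actual implementation; the parameter values are illustrative defaults, and the input here is random data standing in for a mel-spectrogram:

```python
import numpy as np

def pcen(S, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalization of a non-negative spectrogram S (freq x time)."""
    M = np.empty_like(S)
    M[:, 0] = S[:, 0]
    # First-order IIR smoother of the energy along the time axis
    for t in range(1, S.shape[1]):
        M[:, t] = (1 - s) * M[:, t - 1] + s * S[:, t]
    # Adaptive gain control followed by dynamic range compression
    return (S / (eps + M) ** alpha + delta) ** r - delta ** r

# Hypothetical mel-spectrogram: 64 mel bands x 100 time frames
rng = np.random.default_rng(1)
S = rng.random((64, 100))
P = pcen(S)
```

Because the gain is normalized by a running estimate of the per-band energy, PCEN tends to suppress stationary background noise relative to a plain Mel-spectrogram, which is one motivation for testing both representations at low SNR.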
At the end of each training, the model is tested and the test results are saved as a .csv file in the save_experiments directory, in a subdirectory named after the --nameExp argument. The .csv files of each experiment are already available in the save_experiments directory.
The program results_analysis/Results_per_SNR.py produces Figure 2 of the paper, which represents the performance of each configuration trained with each SNR distribution (low+medium+high, medium+high, high) and tested on each tertile of the initial SNR distribution (low, medium, high). The .csv files are already available in the Tertiles_experiments/save_experiments/ directory or can be recomputed as explained previously.
This code trains the custom-built CNN on the whole training set 10 times, with random seeds ranging from 0 to 9, to test the effect of the random initialisation. The model with the performance closest to the median value of all trainings was then applied to the 6 years of audio recordings of the Risoux forest.
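Selecting the run closest to the median can be sketched as follows (a minimal illustration assuming the test performance of each seed has been collected into an array; the metric and its values are hypothetical):

```python
import numpy as np

# Hypothetical test scores (e.g., F1) of the 10 trainings, one per random seed 0-9
scores = np.array([0.81, 0.84, 0.79, 0.86, 0.83, 0.80, 0.85, 0.82, 0.84, 0.78])

median = np.median(scores)
# Seed whose test performance is closest to the median of all trainings
best_seed = int(np.argmin(np.abs(scores - median)))
```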
The command line to run the code is:

```shell
python TR_INFERENCE_main_process.py -trainingPath "fill with the path to the training repo" -noisePath "fill with the path to the noise repo" --nameExp "fill with the name of the experiment"
```

At the end of each training, the model is tested and the test results are saved as a .csv file in the save_inference_experiments/inference_results/ directory, in a subdirectory named after the --nameExp argument. The .csv files of each experiment are already available in the save_inference_experiments/inference_results/ directory. The model used for inference on the Risoux forest recordings, together with its parameters, is stored in the save_inference_experiments/inference_model/ directory.
The program results_analysis/SNR_n_prediction_analysis.py produces Figure 3 of the paper, which compares the confidence scores of positive audio segments from the test set with their corresponding SNR, for both the custom-built model and BirdNET v2.4. The confidence scores of BirdNET are available in the results_analysis/Birdnet_results/ directory.
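The kind of comparison made in that figure can be sketched as a simple correlation between per-segment SNR and confidence score. The data below is purely illustrative, and the analysis script itself may use a different statistic:

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical SNR (dB) of 200 positive segments and a noisy, SNR-dependent confidence
snr = rng.uniform(0, 30, size=200)
confidence = np.clip(0.3 + 0.02 * snr + rng.normal(0, 0.05, size=200), 0.0, 1.0)

# Pearson correlation between segment SNR and model confidence score
r = np.corrcoef(snr, confidence)[0, 1]
```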
This work was conducted at the National Museum of Natural History of Paris by Félix Michaud, Jérôme Sueur, Frédéric Sebe, Maxime Le Cesne and Sylvain Haupert. Visit the Ear-team page to learn more about its other projects.