audio_time_series = audio_time_series.reshape(-1)
Hi, I noticed a potential issue with this function. Assuming a stereo input file of 10 s duration at 44.1 kHz, torchaudio.load() returns a tensor of shape (2, 441000), and the reshaped output has length 882000, with one channel's samples followed by the other's. If the subsequent operations work on the .reshape(-1) tensor, the repetition or trimming (selection of a random subset of the time series) is applied unevenly across channels. Would .mean(0), which averages the signal across channels, be a better choice? That would avoid the unevenness in the subsequent operations.
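To illustrate the concern, here is a minimal sketch that uses a synthetic tensor in place of the torchaudio.load() output. The variable names and the constant channel values are made up for illustration and are not taken from wrapper.py; only .reshape(-1) and .mean(0) correspond to the code under discussion.

```python
import torch

sample_rate = 44100                      # assumed 44.1 kHz
duration_s = 10                          # assumed 10 s clip
num_samples = sample_rate * duration_s   # 441000 samples per channel

# Hypothetical stand-in for the (channels, samples) tensor from torchaudio.load():
# the left channel is all zeros and the right channel is all ones, so the two
# channels are easy to tell apart after each operation.
left = torch.zeros(num_samples)
right = torch.ones(num_samples)
stereo = torch.stack([left, right])      # shape (2, 441000)

# Current behaviour: flattening places the channels back to back.
flattened = stereo.reshape(-1)           # shape (882000,)

# Proposed alternative: average across the channel dimension.
mono = stereo.mean(0)                    # shape (441000,)

print(flattened.shape, mono.shape)
# The first half of the flattened tensor is entirely the left channel,
# the second half entirely the right channel.
print(torch.equal(flattened[:num_samples], left))
print(torch.equal(flattened[num_samples:], right))
```

With the flattened tensor, any crop shorter than 441000 samples comes from a single channel (or straddles the boundary between them), whereas the averaged tensor keeps the original duration and draws on both channels at every sample.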
Please let me know if I am missing something here.
Regards,
Aashish
Pengi/wrapper.py
Line 167 in 31d5e37