Input in example.py is not scaled up to [-256, 256] and remains in [-1, 1] #2

Hi,

Thanks a lot for the port.

It seems that `example.py` is missing the step that scales the waveform into the [-256, 256] range that SoundNet expects:
https://github.com/cvondrick/soundnet/blob/fc1b3dd588f53e2bd27207b542aff420404e94c7/demo.lua#L23-L30

Currently, `torchaudio.load(path)` loads a waveform in [-1, 1] because of the default normalization in:
https://github.com/smallflyingpig/SoundNet_Pytorch/blob/f2d9ce01e5d8467a11169fb841ca3f16d1e5da99/example.py#L10-L14

Also, here is a [Google Colab notebook](https://colab.research.google.com/drive/1EdicBPCzeFzWPOSayC-moTtFae6FB0FU?usp=sharing).
The issue can be reproduced by printing the min/max values of `wav` right after this line:
https://github.com/smallflyingpig/SoundNet_Pytorch/blob/f2d9ce01e5d8467a11169fb841ca3f16d1e5da99/example.py#L10

	wav, sr = torchaudio.load(audio_path)
	print(wav.shape)

	wav = wav.unsqueeze(1).unsqueeze(-1).repeat(1,1,8,1) # errors occur when the wav is too short
	feats = model.extract_feat(wav)
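
A possible fix, as a minimal sketch rather than a definitive patch: print the min/max of the loaded waveform, then multiply it by 256 before it is reshaped and passed to the model. The helper name `scale_for_soundnet` and the constant `SOUNDNET_SCALE` are hypothetical, the factor of 256 is an assumption derived from the [-1, 1] and [-256, 256] ranges mentioned above, and a synthetic tensor stands in for the output of `torchaudio.load` so the snippet runs without an audio file.

```python
import torch

# Assumed scale factor: maps the [-1, 1] range produced by the default
# torchaudio normalization onto the [-256, 256] range named in this issue.
SOUNDNET_SCALE = 256.0


def scale_for_soundnet(wav: torch.Tensor, scale: float = SOUNDNET_SCALE) -> torch.Tensor:
    """Hypothetical helper: rescale a waveform from [-1, 1] to [-scale, scale]."""
    return wav * scale


# Synthetic stand-in for `wav, sr = torchaudio.load(audio_path)`:
# one channel, 16000 samples, values spanning [-1, 1].
wav = torch.linspace(-1.0, 1.0, steps=16000).unsqueeze(0)
print("before scaling:", wav.min().item(), wav.max().item())

scaled = scale_for_soundnet(wav)
print("after scaling: ", scaled.min().item(), scaled.max().item())

# Same reshaping as in example.py, applied to the rescaled waveform.
batch = scaled.unsqueeze(1).unsqueeze(-1).repeat(1, 1, 8, 1)
print(batch.shape)
```

In `example.py` this would amount to calling `wav = scale_for_soundnet(wav)` between `torchaudio.load(audio_path)` and the `unsqueeze` line.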
