Inquiry on Semantic Richness and Acoustic Fidelity Variation with n_q in XCodec, and Challenges in Scaling to 44kHz #15

Great work! 
Are there any results on how semantic richness and acoustic fidelity vary with n_q in XCodec? Specifically, I would like to understand how each of the two behaves as n_q is increased or decreased.
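To make the x-axis of such a comparison concrete, here is a minimal sketch of how the nominal bitrate scales with n_q. The codebook size (1024) and frame rate (50 Hz) are illustrative assumptions on my part, not values confirmed from the XCodec configuration, and `bitrate_bps` is a hypothetical helper name.

```python
import math

def bitrate_bps(n_q: int, codebook_size: int, frame_rate_hz: float) -> float:
    """Nominal bitrate of a residual-VQ codec: each of the n_q codebooks
    emits log2(codebook_size) bits per frame."""
    return n_q * math.log2(codebook_size) * frame_rate_hz

# Assumed values for illustration only (not taken from the XCodec repo).
CODEBOOK_SIZE = 1024
FRAME_RATE_HZ = 50.0

for n_q in (1, 2, 4, 8):
    print(f"n_q={n_q}: {bitrate_bps(n_q, CODEBOOK_SIZE, FRAME_RATE_HZ):.0f} bps")
```

Semantic and acoustic metrics reported at each of these operating points would answer the question directly.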

Additionally, XCodec operates at a 16 kHz sampling rate, so the reconstructed WAV files lose many acoustic details compared with the raw 44.1 kHz audio. What challenges would arise when applying XCodec's approach to a 44.1 kHz codec? For instance, would it be feasible to enhance DAC by integrating HuBERT-based representations?
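One challenge I can already see is a frame-rate mismatch between the semantic and acoustic streams. The sketch below assumes HuBERT consumes 16 kHz audio with a total stride of 320 samples and that the 44.1 kHz DAC encoder uses a hop of 512 samples; both are my assumptions about typical configurations, and the helper names are hypothetical.

```python
from fractions import Fraction

def frame_rate(sample_rate: int, hop: int) -> Fraction:
    """Frames per second produced by an encoder with the given hop size."""
    return Fraction(sample_rate, hop)

def nearest_semantic_index(acoustic_frame: int,
                           acoustic_rate: Fraction,
                           semantic_rate: Fraction,
                           n_semantic: int) -> int:
    """Map an acoustic frame to the semantic frame covering its start time,
    clamped to the last available semantic frame."""
    start_time = Fraction(acoustic_frame) / acoustic_rate  # seconds
    idx = int(start_time * semantic_rate)                  # floor
    return min(idx, n_semantic - 1)

# Assumed configurations (not confirmed against either repository).
HUBERT_SR, HUBERT_HOP = 16_000, 320
DAC_SR, DAC_HOP = 44_100, 512

semantic_rate = frame_rate(HUBERT_SR, HUBERT_HOP)  # 50 frames/s
acoustic_rate = frame_rate(DAC_SR, DAC_HOP)        # 11025/128 frames/s

print(float(semantic_rate), float(acoustic_rate))
print(nearest_semantic_index(86, acoustic_rate, semantic_rate, n_semantic=50))
```

Under these assumptions the two rates are not integer multiples of each other, so the semantic features would need resampling or interpolation along the time axis before they could be fused with the acoustic features. The audio itself would also have to be downsampled to 16 kHz before being fed to HuBERT.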

Any insights or experiences would be greatly appreciated.

Thank you!
