Inquiry on Semantic Richness and Acoustic Fidelity Variation with n_q in XCodec, and Challenges in Scaling to 44kHz #15

@LiuZH-19

Great work!
Are there any results showing how semantic richness and acoustic fidelity vary with n_q in XCodec? Specifically, I would like to understand how each of the two behaves as n_q is increased or decreased.
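
To make the question concrete, here is a toy sketch of the kind of sweep I have in mind. It is not XCodec's code: it assumes n_q counts residual vector quantization stages, uses random vectors in place of encoder frames, and fits each codebook with plain k-means. The helper names (`train_codebook`, `rvq_mse_per_nq`) are hypothetical. It only tracks reconstruction error, a rough stand-in for acoustic fidelity. Measuring semantic richness would need a separate downstream probe, which is not shown.

```python
import numpy as np

def train_codebook(x, k, iters, rng):
    """Fit a k-entry codebook to the rows of x with plain k-means."""
    codebook = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every vector to every codebook entry.
        d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        idx = d.argmin(1)
        for j in range(k):
            members = x[idx == j]
            if len(members):
                codebook[j] = members.mean(0)
    return codebook

def rvq_mse_per_nq(x, max_nq=8, k=16, iters=10, seed=0):
    """Return the reconstruction MSE after using 1..max_nq residual stages."""
    rng = np.random.default_rng(seed)
    residual = x.copy()
    mses = []
    for _ in range(max_nq):
        cb = train_codebook(residual, k, iters, rng)
        d = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        # Each stage quantizes what the previous stages left unexplained.
        residual = residual - cb[d.argmin(1)]
        mses.append(float((residual ** 2).mean()))
    return mses

# Toy stand-in for encoder frames (assumption: 512 frames of dimension 8).
frames = np.random.default_rng(0).normal(size=(512, 8))
mses = rvq_mse_per_nq(frames)
for nq, mse in enumerate(mses, start=1):
    print(f"n_q={nq}: MSE={mse:.4f}")
```

In this toy setup the error can only go down as more stages are added. What I am asking is whether the semantic side of XCodec follows the same trend or saturates after the first few codebooks.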

Additionally, XCodec operates at a 16kHz sampling rate, so the reconstructed WAV files lose many acoustic details compared with raw 44.1kHz audio. What challenges would arise in applying XCodec's approach to a 44kHz codec? For instance, would it be feasible to enhance DAC by integrating HuBERT-based representations?
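
As a rough illustration of the gap, the arithmetic below compares the representable bandwidth (Nyquist frequency) and the token frame rate at the two sampling rates. The hop sizes are assumptions chosen for illustration (320 samples at 16kHz, 512 samples at 44.1kHz), not values confirmed from either codebase, and the function names are hypothetical.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2

def frame_rate_hz(sample_rate_hz, hop_samples):
    """Number of latent frames per second for a given encoder hop size."""
    return sample_rate_hz / hop_samples

# Bandwidth: everything above 8 kHz is absent from a 16 kHz reconstruction.
band_16k = nyquist_hz(16_000)
band_44k = nyquist_hz(44_100)
print(f"16 kHz codec covers up to {band_16k:.0f} Hz")
print(f"44.1 kHz audio covers up to {band_44k:.0f} Hz")
print(f"missing band: {band_44k - band_16k:.0f} Hz")

# Frame rates under the assumed hop sizes (illustrative values only).
semantic_fps = frame_rate_hz(16_000, 320)
acoustic_fps = frame_rate_hz(44_100, 512)
print(f"semantic frames/s: {semantic_fps:.2f}")
print(f"acoustic frames/s: {acoustic_fps:.2f}")
```

If these assumptions hold, a semantic encoder running on 16kHz input and an acoustic encoder running on 44.1kHz input would produce frames at different rates, so some resampling or alignment step would be needed before the two streams could be combined.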

Any insights or experiences would be greatly appreciated.

Thank you!
