Could not get correct response on quantized Qwen3-8b #2

@jieguochiafan

Hello, I greatly appreciate your work and am currently trying to use your quantization framework with Qwen3-8B, but the model's responses are completely incorrect. Could you please tell me how to debug this?

I used scripts/run_matgptq.sh to quantize the Qwen3-8B model and save the quantized weights, then passed those weights to inference_ib/scripts/run_inreference_transformers.sh and ran it in Kernel Quantized mode. The result is the same whether I use mode 1 or mode 2.

However, the generated output consists entirely of '!' characters.

Input: Please introduce Large Language Model!
generated: Please introduce Large Language Model!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

To rule out problems in my own quantization run, I also downloaded the model you released at https://huggingface.co/ISTA-DASLab/Qwen3-8B-MatGPTQ, converted it to the data.pt format, and ran inference_ib/scripts/run_inreference_transformers.sh on it, but the problem persists.

The conversion code (safetensors to data.pt):

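import json
import os

import torch
from safetensors import safe_open

# model_dir (downloaded checkpoint) and out_dir (output root) are set elsewhere.
with open(os.path.join(model_dir, "model.safetensors.index.json")) as fp:
    index = json.load(fp)

# Visit each shard once and write one data.pt per quantized layer.
for shard in sorted(set(index["weight_map"].values())):
    with safe_open(os.path.join(model_dir, shard), framework="pt", device="cpu") as f:
        keys = set(f.keys())
        for k in keys:
            if not k.endswith(".qweight"):
                continue
            layer = k[:-8]  # strip the ".qweight" suffix
            sk = f"{layer}.scales"
            # Layers whose scales are not in this same shard are skipped.
            if sk not in keys:
                continue
            layer_dir = os.path.join(out_dir, layer)
            os.makedirs(layer_dir, exist_ok=True)
            torch.save(
                {"qweight": f.get_tensor(k), "scale": f.get_tensor(sk)},
                os.path.join(layer_dir, "data.pt"),
            )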
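One thing that may be worth checking first is whether the conversion loop above silently skips or drops anything. The sketch below uses only the `weight_map` from `model.safetensors.index.json`, so it needs no tensors loaded. `audit_weight_map` is a hypothetical helper name, not part of the repository. It relies only on what the loop itself does: it pairs `.qweight` with `.scales` inside the same shard and saves nothing else.

```python
from collections import defaultdict


def audit_weight_map(weight_map):
    """Given {tensor_name: shard_file}, report what the conversion loop keeps.

    Returns (converted_layers, skipped_layers, dropped_tensors).
    """
    by_shard = defaultdict(set)
    for name, shard in weight_map.items():
        by_shard[shard].add(name)

    converted, skipped = [], []
    for keys in by_shard.values():
        for k in keys:
            if not k.endswith(".qweight"):
                continue
            layer = k[: -len(".qweight")]
            # The loop only looks for the scales tensor in the same shard.
            if f"{layer}.scales" in keys:
                converted.append(layer)
            else:
                skipped.append(layer)

    # Tensors stored under a converted layer other than qweight/scales
    # never make it into data.pt.
    kept = {f"{layer}{suffix}" for layer in converted
            for suffix in (".qweight", ".scales")}
    prefixes = tuple(f"{layer}." for layer in converted)
    dropped = sorted(
        name for name in weight_map
        if name.startswith(prefixes) and name not in kept
    )
    return sorted(converted), sorted(skipped), dropped
```

Calling `audit_weight_map(index["weight_map"])` and printing the three lists would show whether any layer was skipped because its scales live in another shard, and whether any extra per-layer tensors were left out of data.pt. Whether either case explains the '!' output is an open question. This only narrows down where to look.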

Could you tell me where I might have gone wrong? Thank you very much!
