Description
Hello, I greatly appreciate your work. I am currently trying to use your quantization framework for Qwen3-8B, but the model responses I receive are completely incorrect. Could you please advise on how to debug this?
I used scripts/run_matgptq.sh to quantize Qwen3-8B, saved the resulting quantized weights, and passed them to inference_ib/scripts/run_inreference_transformers.sh in Kernel Quantized mode (the result is the same whether I use mode 1 or mode 2).
However, the final output is always '!':
Input: Please introduce Large Language Model!
generated: Please introduce Large Language Model!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
To rule out any issues in my own quantization process, I also downloaded the model you released directly (https://huggingface.co/ISTA-DASLab/Qwen3-8B-MatGPTQ), converted it to the data.pt format, and ran inference_ib/scripts/run_inreference_transformers.sh, but I still hit the same problem.
The conversion (safetensors to data.pt) code:

```python
import json
import os

import torch
from safetensors import safe_open

index = json.load(open(os.path.join(model_dir, "model.safetensors.index.json")))
for shard in sorted(set(index["weight_map"].values())):
    with safe_open(os.path.join(model_dir, shard), framework="pt", device="cpu") as f:
        keys = set(f.keys())
        for k in keys:
            if not k.endswith(".qweight"):
                continue
            layer = k[:-8]  # strip the ".qweight" suffix
            sk = f"{layer}.scales"
            if sk not in keys:
                continue
            layer_dir = os.path.join(out_dir, layer)
            os.makedirs(layer_dir, exist_ok=True)
            torch.save(
                {"qweight": f.get_tensor(k), "scale": f.get_tensor(sk)},
                os.path.join(layer_dir, "data.pt"),
            )
```
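To sanity-check the converted files, I also round-tripped one layer's data.pt and confirmed it contains the keys the inference script expects. This is only a minimal sketch with a dummy tensor pair standing in for a real quantized layer; the expected dtypes here are my assumption, not something taken from your code:

```python
import os
import tempfile

import torch

def check_data_pt(path):
    # Load one converted layer and confirm it holds exactly
    # the "qweight" and "scale" entries written by the converter.
    data = torch.load(path, map_location="cpu")
    assert set(data.keys()) == {"qweight", "scale"}, data.keys()
    return data["qweight"].shape, data["qweight"].dtype, data["scale"].dtype

# Demo: write a dummy layer in the same dict layout, then inspect it.
tmp = os.path.join(tempfile.mkdtemp(), "data.pt")
torch.save(
    {
        "qweight": torch.zeros(8, 4, dtype=torch.int32),   # packed weights (dummy)
        "scale": torch.ones(8, 1, dtype=torch.float16),    # per-group scales (dummy)
    },
    tmp,
)
shape, qdtype, sdtype = check_data_pt(tmp)
print(shape, qdtype, sdtype)
```

The real files load without errors this way, so I believe the dict layout itself is intact.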
May I ask where the problem might be? Thank you very much!