I successfully compressed Cosmos-Predict2 and Chroma, but when I tried to compress the T5 text encoder used by Flux, I got the following error instead:
Traceback (most recent call last):
  File "F:\AI setups\Diffusers\models\compress t5.py", line 42, in <module>
    compress_model(
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\dfloat11\dfloat11.py", line 622, in compress_model
    save_file(model.state_dict(), os.path.join(save_path, 'model.safetensors'))
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\safetensors\torch.py", line 352, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
                   ^^^^^^^^^^^^^^^^^
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\safetensors\torch.py", line 577, in _flatten
    raise RuntimeError(
RuntimeError:
    Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'shared.weight', 'encoder.embed_tokens.weight'}].
    A potential way to correctly save your model is to use `save_model`.
    More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
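If I am reading the error right, the problem is T5's weight tying: shared.weight and encoder.embed_tokens.weight are the same tensor, and safetensors' save_file refuses to write aliased storage. The following standalone toy module (hypothetical names, nothing from dfloat11) reproduces the behavior and shows the save_model escape hatch that the error message points to:

import torch
from safetensors.torch import save_file, save_model

class TiedToy(torch.nn.Module):
    # Stand-in for T5's weight tying: both attributes hold the same
    # Embedding module, so state_dict() has two keys sharing one storage.
    def __init__(self):
        super().__init__()
        self.shared = torch.nn.Embedding(8, 4)
        self.embed_tokens = self.shared

toy = TiedToy()
# save_file(toy.state_dict(), "toy.safetensors")  # raises the same RuntimeError
save_model(toy, "toy.safetensors")  # de-duplicates shared tensors before writing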
Is this due to an error in my compression code, or is what I am trying to do not supported? The complete code I used, including the pattern_dict, is below:
import torch
from dfloat11 import compress_model
from transformers import T5EncoderModel
save_path = r".\t5-v1_1-xxl-DF11"
save_single_file = True
check_correctness = True
block_range = (0, 100)
text_encoder_2 = T5EncoderModel.from_pretrained(
    r"..\models\FLUX.1-dev",
    subfolder="text_encoder_2",
    torch_dtype=torch.bfloat16,
    local_files_only=True,
)

pattern_dict = {
    # Raw string so the regex escapes survive; per the edit below, this
    # should be r"encoder\.block\.\d+" for T5EncoderModel.
    r"block\.\d+": (
        "layer.0.SelfAttention.q",
        "layer.0.SelfAttention.k",
        "layer.0.SelfAttention.v",
        "layer.0.SelfAttention.o",
        "layer.1.DenseReluDense.wi_0",
        "layer.1.DenseReluDense.wi_1",
        "layer.1.DenseReluDense.wo",
    )
}

# Compress the model using DFloat11 compression
compress_model(
    model=text_encoder_2,
    pattern_dict=pattern_dict,
    save_path=save_path,
    save_single_file=save_single_file,
    check_correctness=check_correctness,
    block_range=block_range,
)
Edit: I found an issue with the pattern_dict: block should be replaced with encoder.block (see the comment in the code above). However, the shared-tensors error still stops the file from being saved after the compression process finishes.
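For anyone who hits the same wall, a workaround worth trying is to break the tie before calling compress_model, so that save_file sees two independent tensors. This is an untested sketch; it costs a duplicate copy of the embedding on disk and assumes dfloat11 does not need the tie intact:

import copy

# Hypothetical workaround: give the encoder its own copy of the embedding so
# 'shared.weight' and 'encoder.embed_tokens.weight' no longer alias the same
# storage. Run this before compress_model; the values stay identical, so the
# encoder's outputs should not change.
text_encoder_2.encoder.embed_tokens = copy.deepcopy(text_encoder_2.shared)

The cleaner fix suggested by the error message would be for dfloat11 to save with safetensors.torch.save_model, which de-duplicates shared tensors, instead of save_file.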