Currently, the decompression hook is hardcoded to only handle `nn.Linear` and `nn.Embedding` tensors when they are compressed "alone"; when there is a list of `weight_injection_modules`, only `nn.Linear` submodules are supported.
However, some model architectures like SDXL (even though most SDXL checkpoints are FP16, some like Laxhar/noobai-XL-1.1 and Laxhar/noobai-XL-Vpred-1.0 are actually distributed in BF16) rely heavily on other types of submodules like `nn.Conv2d`. I tried compressing these models: compressing the `nn.Conv2d` tensors actually works, but they fail to load due to the aforementioned problem. This is the error I obtained:
```
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\dfloat11\dfloat11.py", line 166, in decode_hook
    sub_module.weight = weight.view(sub_module.out_features, sub_module.in_features)
                                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\AI setups\Diffusers\diffusers-venv\Lib\site-packages\torch\nn\modules\module.py", line 1940, in __getattr__
    raise AttributeError(
AttributeError: 'Conv2d' object has no attribute 'out_features'
```
I can confirm that the following patch seems to work, and the output is identical, as expected:
```python
for sub_module, weight in zip(module.weight_injection_modules, weights):
    if isinstance(sub_module, nn.Linear):
        sub_module.weight = weight.view(sub_module.out_features, sub_module.in_features)
    elif isinstance(sub_module, nn.Conv2d):
        sub_module.weight = weight.view(sub_module.out_channels, sub_module.in_channels // sub_module.groups, sub_module.kernel_size[0], sub_module.kernel_size[1])
    else:
        raise Exception(f"Unimplemented type: {type(sub_module)}")
```
But this might not be a practical way to account for all submodule types, since `decode_hook` is extremely performance-critical and a long `if`-`elif` chain might cause slowdowns. I suggest having the `load_and_replace_tensors()` function store, alongside `weight_injection_modules`, another attribute holding the positional arguments for `weight.view()` as a tuple of integers, so that the loading code (rather than the hook) is responsible for determining the correct arguments.
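As a rough sketch of that idea: shapes are recorded once at load time, so the hot path becomes a single `view(shape)` with no `isinstance` branching. The attribute name `weight_view_shapes` and both helper functions below are hypothetical, not part of the dfloat11 API:

```python
import torch
import torch.nn as nn

def record_view_shapes(module):
    """At load time, store each injected submodule's target weight shape.
    `tensor.weight.shape` is used here for brevity; in practice the loader
    would derive the shapes from checkpoint metadata before the original
    parameters are freed. (`weight_view_shapes` is a hypothetical name.)"""
    module.weight_view_shapes = tuple(
        tuple(sub.weight.shape)  # works uniformly for Linear, Conv2d, Embedding, ...
        for sub in module.weight_injection_modules
    )

def inject_weights(module, weights):
    """Performance-critical path: no per-type branching, just a zip over
    the precomputed shapes. `Tensor.view` accepts a shape tuple directly."""
    for sub, weight, shape in zip(
        module.weight_injection_modules, weights, module.weight_view_shapes
    ):
        sub.weight = weight.view(shape)

# Minimal demonstration with a Linear and a Conv2d submodule.
lin = nn.Linear(4, 3, bias=False)
conv = nn.Conv2d(2, 2, kernel_size=3, bias=False)
parent = nn.Module()
parent.weight_injection_modules = [lin, conv]

record_view_shapes(parent)
flat = [lin.weight.detach().flatten(), conv.weight.detach().flatten()]
del lin.weight  # mimic decompression: the Parameters are gone,
del conv.weight  # and flat decoded tensors are injected back
inject_weights(parent, flat)
print(lin.weight.shape, conv.weight.shape)
```

This keeps the hook itself type-agnostic: supporting a new submodule type only requires that the loader can produce its weight shape, with no change to `decode_hook`.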