python: 3.11
mergekit: latest
Hi Mergekit!
I am getting the following error:
Executing graph: 0%|▎ | 4/1457 [00:00<00:00, 4011.77it/s]
Traceback (most recent call last):
File "/home/aclifton/.local/bin/mergekit-yaml", line 7, in <module>
sys.exit(main())
^^^^^^
File "/home/aclifton/.local/lib/python3.11/site-packages/click/core.py", line 1442, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/aclifton/.local/lib/python3.11/site-packages/click/core.py", line 1363, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/home/aclifton/.local/lib/python3.11/site-packages/click/core.py", line 1226, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/aclifton/.local/lib/python3.11/site-packages/click/core.py", line 794, in invoke
return callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/aclifton/mergekit/mergekit/options.py", line 166, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/home/aclifton/mergekit/mergekit/scripts/run_yaml.py", line 30, in main
run_merge(
File "/home/aclifton/mergekit/mergekit/merge.py", line 85, in run_merge
for _task, value in exec.run(quiet=options.quiet):
File "/home/aclifton/mergekit/mergekit/graph.py", line 518, in run
for handle, value in self._run(quiet=quiet, desc=desc):
File "/home/aclifton/mergekit/mergekit/graph.py", line 484, in _run
res = task.execute(**arguments)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/aclifton/mergekit/mergekit/merge_methods/slerp.py", line 39, in execute
raise RuntimeError("Base model not in input tensors")
RuntimeError: Base model not in input tensors
aclifton@dark:~$
Using the following yaml file:
slices:
  - sources:
      - model: alignment-handbook/mistral-7b-sft-constitutional-ai
        layer_range: [0, 32]
      - model: vwxyzjn/mistral-7b-dpo-constitutional-ai
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-v0.1
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
With the following command:
mergekit-yaml my_runs/myruns/models/merged/slerp.yml mscrw_runs/mscrw/models/merged/slerp --copy-tokenizer --allow-crimes --out-shard-size 1B --lazy-unpickle
I am not sure how to resolve it. It appears that a lot of files were downloaded before the failure, so I'm not exactly sure what to do next. Any help is much appreciated. Thanks in advance!
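For reference, here is a small check I put together for the config above. It is only a sketch and rests on an assumption drawn from the error message: that the slerp method expects `base_model` to be one of the models listed under `sources`. The helper names `source_models` and `base_model_in_sources` are hypothetical and not part of mergekit, and the config is written as a plain dict rather than loaded from the YAML file.

```python
# Hypothetical diagnostic, not part of mergekit.
# Assumption: slerp raises "Base model not in input tensors" when
# base_model is not one of the models listed under slices[*].sources.

# The relevant parts of the YAML config above, written as a dict.
config = {
    "slices": [
        {
            "sources": [
                {
                    "model": "alignment-handbook/mistral-7b-sft-constitutional-ai",
                    "layer_range": [0, 32],
                },
                {
                    "model": "vwxyzjn/mistral-7b-dpo-constitutional-ai",
                    "layer_range": [0, 32],
                },
            ]
        }
    ],
    "merge_method": "slerp",
    "base_model": "mistralai/Mistral-7B-v0.1",
}


def source_models(cfg):
    """Collect every model name listed under slices[*].sources."""
    return {
        src["model"]
        for sl in cfg.get("slices", [])
        for src in sl.get("sources", [])
    }


def base_model_in_sources(cfg):
    """Return True if base_model is one of the models being merged."""
    return cfg.get("base_model") in source_models(cfg)


print(sorted(source_models(config)))
print(base_model_in_sources(config))  # False for the config above
```

With my config this prints `False`, since `mistralai/Mistral-7B-v0.1` is not one of the two source models. If the assumption holds, that mismatch may be what triggers the error.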