BaseModel not in input tensors #643

@aclifton314

python: 3.11
mergekit: latest

Hi Mergekit!

I am getting the following error:

Executing graph:   0%|| 4/1457 [00:00<00:00, 4011.77it/s]
Traceback (most recent call last):
  File "/home/aclifton/.local/bin/mergekit-yaml", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/aclifton/.local/lib/python3.11/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/.local/lib/python3.11/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/aclifton/.local/lib/python3.11/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/.local/lib/python3.11/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/mergekit/mergekit/options.py", line 166, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/mergekit/mergekit/scripts/run_yaml.py", line 30, in main
    run_merge(
  File "/home/aclifton/mergekit/mergekit/merge.py", line 85, in run_merge
    for _task, value in exec.run(quiet=options.quiet):
  File "/home/aclifton/mergekit/mergekit/graph.py", line 518, in run
    for handle, value in self._run(quiet=quiet, desc=desc):
  File "/home/aclifton/mergekit/mergekit/graph.py", line 484, in _run
    res = task.execute(**arguments)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aclifton/mergekit/mergekit/merge_methods/slerp.py", line 39, in execute
    raise RuntimeError("Base model not in input tensors")
RuntimeError: Base model not in input tensors
aclifton@dark:~$ 

Using the following yaml file:

slices:
  - sources:
      - model: alignment-handbook/mistral-7b-sft-constitutional-ai
        layer_range: [0, 32]
      - model: vwxyzjn/mistral-7b-dpo-constitutional-ai
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-v0.1
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

With the following command:

mergekit-yaml my_runs/myruns/models/merged/slerp.yml mscrw_runs/mscrw/models/merged/slerp --copy-tokenizer --allow-crimes --out-shard-size 1B --lazy-unpickle

I am not sure how to resolve it. A lot of data appears to have been downloaded, so I'm not sure what to do next. Any help is much appreciated. Thanks in advance!
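
One thing that may be worth checking, judging only from the traceback: the error is raised in slerp.py's execute, which suggests that slerp expects base_model to be one of the models listed under sources. That reading is an assumption and has not been confirmed against mergekit's documentation. The sketch below mirrors the YAML above as a plain Python dict and tests that condition. The helper names source_models and base_model_in_sources are hypothetical and are not part of mergekit.

```python
# Hypothetical sanity check, NOT part of mergekit.
# Assumption: slerp needs base_model to be one of the models under "sources".

# The config from the YAML above, written as a plain dict.
config = {
    "slices": [
        {
            "sources": [
                {
                    "model": "alignment-handbook/mistral-7b-sft-constitutional-ai",
                    "layer_range": [0, 32],
                },
                {
                    "model": "vwxyzjn/mistral-7b-dpo-constitutional-ai",
                    "layer_range": [0, 32],
                },
            ]
        }
    ],
    "merge_method": "slerp",
    "base_model": "mistralai/Mistral-7B-v0.1",
}


def source_models(cfg):
    """Collect every model name listed under slices[*].sources[*].model."""
    return {
        src["model"]
        for sl in cfg.get("slices", [])
        for src in sl.get("sources", [])
    }


def base_model_in_sources(cfg):
    """Return True if base_model is one of the source models."""
    return cfg.get("base_model") in source_models(cfg)


if __name__ == "__main__":
    print("source models:", sorted(source_models(config)))
    print("base_model listed in sources:", base_model_in_sources(config))
```

For this config the check reports that mistralai/Mistral-7B-v0.1 is not among the two source models. If the assumption holds, setting base_model to one of the two models under sources would be the first thing to try.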
