… wire lrp_rule dispatch, pass sources for pre/post weights
…, fix reshape crash, guard lrp_scores to lrp method only
Cursor Bugbot has reviewed your changes and found 2 potential issues.
…erError in ConfiguredModuleArchitecture
Author
Hi! This is my recent work; there is no paper yet. It is a new merge method inspired by Linear and SLERP. I would appreciate it if you could merge this PR, @cg123.

This PR adds a new merge method called lrp (Layer-wise Relevance Propagation Merge) that uses gradient-based importance scores to determine which weights to preserve when combining fine-tuned models.
How it works:
For each fine-tuned model, a task vector is computed as delta = fine_tuned - base. Each weight is then scored using pre-computed LRP importance scores, falling back to magnitude if no scores are provided. The top-k weights are kept based on the density parameter, and the sparse deltas are weighted-averaged across all models before being added back to the base.
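The procedure above can be sketched in a few lines of NumPy. This is a hypothetical illustration of the described algorithm, not the PR's actual implementation; the function name `lrp_merge` and its signature are invented for this sketch.

```python
import numpy as np

def lrp_merge(base, fine_tuned, scores=None, density=0.7, weights=None):
    """Sketch of the lrp merge described above (illustrative, not the PR's code).

    base:       dict of name -> np.ndarray (base model weights)
    fine_tuned: list of dicts with the same keys (fine-tuned models)
    scores:     optional list of dicts of importance scores; if omitted,
                falls back to magnitude-based importance |delta|
    weights:    per-model weights for the weighted average (default 1.0 each)
    """
    weights = weights or [1.0] * len(fine_tuned)
    merged = {}
    for name, base_w in base.items():
        acc = np.zeros_like(base_w, dtype=np.float64)
        total = 0.0
        for i, model in enumerate(fine_tuned):
            delta = model[name] - base_w  # task vector: fine_tuned - base
            imp = scores[i][name] if scores else np.abs(delta)
            # keep the top-k entries by importance (ties may keep a few extra)
            k = max(1, int(round(density * delta.size)))
            thresh = np.sort(imp.ravel())[-k]
            mask = imp >= thresh
            acc += weights[i] * (delta * mask)
            total += weights[i]
        # weighted average of the sparse deltas, added back to the base
        merged[name] = base_w + acc / total
    return merged
```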
Parameters:
- density (float, default 0.7): fraction of weights to retain, selected by importance.
- weight (float, default 1.0): per-model weight for the weighted average.
LRP scores can be pre-computed using the included lrp_computer module, which supports three propagation rules: epsilon, gamma, and alpha_beta. If no scores are provided, the method falls back to magnitude-based importance, making it usable without any pre-computation.
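A config for this method might look like the following. This is a hypothetical example: the model names and score paths are placeholders, and the exact `lrp_path` placement is assumed from the description of the `config.py` change below.

```yaml
merge_method: lrp
base_model: org/base-model          # placeholder base model
models:
  - model: org/finetune-a           # placeholder fine-tuned model
    lrp_path: scores/finetune-a     # pre-computed LRP scores (from lrp_computer)
    parameters:
      weight: 1.0
  - model: org/finetune-b           # no lrp_path: falls back to magnitude
parameters:
  density: 0.7
```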
Files changed:
- mergekit/merge_methods/lrp.py: core merge method implementation.
- mergekit/lrp_computer.py: standalone tool to compute LRP scores for a model.
- mergekit/merge_methods/registry.py: registers lrp as an available merge method.
- mergekit/config.py: adds the lrp_path field to model definitions.
- mergekit/plan.py: passes LRP score paths through to the merge task.
- docs/lrp_merge.md: documentation with examples and rule descriptions.
- tests/test_lrp_merge.py: test coverage for basic usage, density edge cases, and validation.
My GitHub repository: https://github.com/Tusm11/Layer-wise-Relevance-Propagation-LRP.git