… wire lrp_rule dispatch, pass sources for pre/post weights
…, fix reshape crash, guard lrp_scores to lrp method only
Cursor Bugbot has reviewed your changes and found 2 potential issues.
…erError in ConfiguredModuleArchitecture
Author
Hi! This is my recent work; there is no paper yet. It is a new merge method inspired by Linear and SLERP. I would appreciate it if you could merge this PR, @cg123.

This PR adds a new merge method called lrp (Layer-wise Relevance Propagation Merge) that uses gradient-based importance scores to determine which weights to preserve when combining fine-tuned models.
How it works:
For each fine-tuned model, a task vector is computed as delta = fine_tuned - base. Each weight is then scored using pre-computed LRP importance scores, falling back to magnitude if no scores are provided. The top-k weights are kept based on the density parameter, and the sparse deltas are weighted-averaged across all models before being added back to the base.
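The procedure above can be sketched in a few lines of NumPy. This is a hypothetical illustration of the described algorithm, not the PR's actual implementation; the function name `lrp_merge` and its signature are invented for this sketch.

```python
import numpy as np

def lrp_merge(base, fine_tuned, scores=None, density=0.7, weights=None):
    """Sketch of the lrp merge described above (illustrative, not the PR's code).

    base:       dict of name -> np.ndarray (base model weights)
    fine_tuned: list of dicts with the same keys (fine-tuned models)
    scores:     optional list of dicts of importance scores; if omitted,
                falls back to magnitude-based importance |delta|
    weights:    per-model weights for the weighted average (default 1.0 each)
    """
    weights = weights or [1.0] * len(fine_tuned)
    merged = {}
    for name, base_w in base.items():
        acc = np.zeros_like(base_w, dtype=np.float64)
        total = 0.0
        for i, model in enumerate(fine_tuned):
            delta = model[name] - base_w  # task vector: fine_tuned - base
            imp = scores[i][name] if scores else np.abs(delta)
            # keep the top-k entries by importance (ties may keep a few extra)
            k = max(1, int(round(density * delta.size)))
            thresh = np.sort(imp.ravel())[-k]
            mask = imp >= thresh
            acc += weights[i] * (delta * mask)
            total += weights[i]
        # weighted average of the sparse deltas, added back to the base
        merged[name] = base_w + acc / total
    return merged
```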
Parameters:
- density (float, default 0.7): fraction of weights to retain, selected by importance.
- weight (float, default 1.0): per-model weight for the weighted average.
LRP scores can be pre-computed using the included lrp_computer module, which supports three propagation rules: epsilon, gamma, and alpha_beta. If no scores are provided, the method falls back to magnitude-based importance, making it usable without any pre-computation.
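A config for this method might look like the following. This is a hypothetical example: the model names and score paths are placeholders, and the exact `lrp_path` placement is assumed from the description of the `config.py` change below.

```yaml
merge_method: lrp
base_model: org/base-model          # placeholder base model
models:
  - model: org/finetune-a           # placeholder fine-tuned model
    lrp_path: scores/finetune-a     # pre-computed LRP scores (from lrp_computer)
    parameters:
      weight: 1.0
  - model: org/finetune-b           # no lrp_path: falls back to magnitude
parameters:
  density: 0.7
```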
Files changed:
- mergekit/merge_methods/lrp.py: core merge method implementation.
- mergekit/lrp_computer.py: standalone tool to compute LRP scores for a model.
- mergekit/merge_methods/registry.py: registers lrp as an available merge method.
- mergekit/config.py: adds the lrp_path field to model definitions.
- mergekit/plan.py: passes LRP score paths through to the merge task.
- docs/lrp_merge.md: documentation with examples and rule descriptions.
- tests/test_lrp_merge.py: test coverage for basic usage, density edge cases, and validation.
My GitHub repository: https://github.com/Tusm11/Layer-wise-Relevance-Propagation-LRP.git