Conversation
… to plot (simple) interactive graph
…yers. heatmaps for other metrics.
This file should probably be in mergekit/scripts.
Also would be good to use click to turn the hardcoded values into arguments.
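A minimal sketch of what the click suggestion could look like; the option names, defaults, and the `plot-metrics` command name are illustrative, not the actual script's interface:

```python
import click


@click.command("plot-metrics")
@click.option(
    "--results-path",
    type=click.Path(exists=True),
    required=True,
    help="Path to the saved metrics file.",
)
@click.option(
    "--metric",
    "-m",
    default="cossim",
    show_default=True,
    help="Which metric to plot.",
)
def main(results_path: str, metric: str) -> None:
    """Plot per-layer metric heatmaps (illustrative stub)."""
    click.echo(f"plotting {metric} from {results_path}")


if __name__ == "__main__":
    main()
```

Each previously hardcoded value becomes one `@click.option`, so the script stays runnable with no arguments changed while new values can be passed on the command line.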
| from typing import List, Dict, Optional, Any, Tuple
| from mergekit.graph import Task
| import networkx as nx
| import plotly.graph_objects as go
We should capture these new dependencies in pyproject.toml. Probably under a feature, so headless installs don't need to bring them in.
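One way this could look in pyproject.toml, assuming a PEP 621-style `[project]` table; the extra name and the unpinned versions are illustrative:

```toml
[project.optional-dependencies]
interactive_plot = [
    "networkx",
    "plotly",
]
```

Users who want the plots would then install with `pip install mergekit[interactive_plot]`, while headless installs skip both packages.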
| ) | ||
| tensor_task | ||
| ) | ||
| finalize = FinalizeModel( |
Totally fine to skip the finalize task when we're computing metrics, but it's needed for merges - as written, I think this stops merges from being written out correctly.
| **_kwargs,
| ) -> Task:
|
| if 'self_attn' in output_weight.name:
Down the line we probably want this split to be done based on new fields in ArchitectureInfo but this is good for now!
| res = {}
|
| scale_diff = torch.abs(norm_0 - norm_1) / ((norm_0 + norm_1) / 2)
Should we be doing something here to guard against dividing by zero?
yep - norms are non-negative, so adding a small epsilon to the denominator will be fine
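A sketch of the epsilon guard being discussed; the function name and epsilon value are illustrative:

```python
import torch


def scale_diff(
    norm_0: torch.Tensor, norm_1: torch.Tensor, eps: float = 1e-8
) -> torch.Tensor:
    # norms are non-negative, so the denominator can only vanish when both
    # norms are exactly zero; a small epsilon keeps the division finite there
    return torch.abs(norm_0 - norm_1) / ((norm_0 + norm_1) / 2 + eps)
```

For any nonzero pair of norms the epsilon perturbs the result negligibly; for the all-zero case it turns 0/0 into 0.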
| aliases: Optional[Tuple[str, ...]] = None
| force_dtype: Optional[str] = None
|
| GQA_groups: Optional[int] = None # None if not GQA, 1 if MQA, >1 if GQA
| num_heads=32 # hard-coded for now
| )
| self.block_count += 1
| return AttnTask(weights=weights, weight_infos=infos, weight_info=weight_info)
Does this end up creating N AttnTasks for each block? I don't think it's actually a problem as the tasks will be deduplicated downstream - should be fine
Should only be one AttnTask for each block - the if statement on line 351 is only satisfied once all the tensors (K,Q,V,O) have been collected. Then self.attn_weight_dict is reset to {} and the (one) AttnTask is created. I might also add individual tensor metrics for comparing just the Qs, Vs etc, which would be simpler.
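A simplified sketch of the collect-then-emit pattern described above - accumulate the four attention projections for a block, and only create the one task once all are present. The class and field names here are illustrative stand-ins, not the actual mergekit implementation:

```python
from typing import Dict, Optional

# the four attention projections collected per block
ATTN_PARTS = {"k_proj", "q_proj", "v_proj", "o_proj"}


class AttnCollector:
    """Accumulates attention tensors per block; emits one task per block."""

    def __init__(self) -> None:
        self.attn_weight_dict: Dict[str, str] = {}
        self.block_count = 0

    def add(self, part: str, tensor_name: str) -> Optional[dict]:
        self.attn_weight_dict[part] = tensor_name
        # the task is only built once all four projections have arrived,
        # so each block yields exactly one task
        if set(self.attn_weight_dict) == ATTN_PARTS:
            task = {
                "block": self.block_count,
                "weights": dict(self.attn_weight_dict),
            }
            self.attn_weight_dict = {}  # reset for the next block
            self.block_count += 1
            return task
        return None
```

Because the dict is reset after each emit, three of the four `add` calls per block return `None` and exactly one returns a task.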
| self._method = merge_methods.get(config.merge_method)
| if getattr(config, "merge_method", None):
|     self._method = merge_methods.get(config.merge_method)
| elif getattr(config, "metric_method", None):
Would be good to add a validator to MergeConfig that checks that exactly one of these fields is set.
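A sketch of what that validator could look like, assuming Pydantic v2's `model_validator`; the cut-down `MergeConfig` here shows only the two fields in question:

```python
from typing import Optional

from pydantic import BaseModel, model_validator


class MergeConfig(BaseModel):
    # simplified stand-in for the real config; only the method fields shown
    merge_method: Optional[str] = None
    metric_method: Optional[str] = None

    @model_validator(mode="after")
    def exactly_one_method(self) -> "MergeConfig":
        # both set or both unset are equally invalid
        if bool(self.merge_method) == bool(self.metric_method):
            raise ValueError(
                "exactly one of merge_method or metric_method must be set"
            )
        return self
```

This fails fast at config parse time instead of letting the `if`/`elif` dispatch silently fall through when neither (or both) fields are set.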
| ) | ||
|
|
||
| res = [] | ||
| for _task, value in exec.run(quiet=options.quiet): |
Looking this over, I kinda think we might not need a separate file here - maybe it should just early out in merge.py if there's a metric_method set instead of merge_method?
| Abstract base class representing a task in a computational graph.
|
| This class should be extended to define specific tasks. Each task can have arguments (dependencies) and a defined execution strategy.
| Note that PyDantic BaseModel requires that all attributes are defined in the class initialisation, and cannot be changed after.
Super nitpick here: I think the official capitalization is Pydantic, not PyDantic.
…rge OR Metri, not both.
… to separate case
…eralised substitute function in architecture
Implemented:
Not Implemented: