Fixes #6 - perf: optimize python backend for task graph-analytics #35
base: main

    @@ -4,6 +4,7 @@
     from typing import Any

     from ...common import TaskSpec, round6
    +import numpy as np


     def generate(node_count: int, seed: int) -> dict[str, list[str]]:


    @@ -19,28 +20,36 @@ def generate(node_count: int, seed: int) -> dict[str, list[str]]:
         return graph


    -def solve(graph: dict[str, list[str]], iterations: int = 16, damping: float = 0.85) -> dict[str, Any]:
    -    nodes = sorted(graph)
    -    if not nodes:
    +def solve(graph: dict[str, list[str]] , iterations: int = 16, damping: float = 0.85) -> dict[str, Any]:
    +    if not graph:
             return {"node_count": 0, "top_node": "", "top_score": 0.0, "checksum": 0.0}

    -    rank = {node: 1.0 / len(nodes) for node in nodes}
    -    outgoing = {node: graph[node] if graph[node] else nodes for node in nodes}
    -    base = (1.0 - damping) / len(nodes)
    +    nodes = sorted(graph)
    +    N = len(nodes)
    +    idx_map = {node: i for i, node in enumerate(nodes)}
    +
    +    rows = np.array([idx_map[src] for src, targets in graph.items() for _ in targets], dtype=np.int64)
    +    cols = np.array([idx_map[tgt] for _, targets in graph.items() for tgt in targets], dtype=np.int64)
    +    out_degree = np.bincount(rows, minlength=N).astype(np.float64)
    +
    +    rank = np.full((N,), 1.0/N, dtype=np.float64)
    +    base = (1.0 - damping) / N
    +    trans_wt = damping / out_degree


Comment on lines +33 to +37

Potential division by zero for graphs with sink nodes (dangling nodes). If a node has no outgoing edges, its `out_degree` entry is 0, so `damping / out_degree` divides by zero. This won't affect the current generator (which always creates 2–4 edges per node), but arbitrary input graphs could yield incorrect results without raising an error.

🛡️ Possible defensive fix:

     out_degree = np.bincount(rows, minlength=N).astype(np.float64)
    +# Handle dangling nodes: set out_degree to 1 to avoid inf; their rank contribution is zero anyway
    +out_degree[out_degree == 0] = 1.0
     rank = np.full((N,), 1.0/N, dtype=np.float64)

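
The pre-change loop treated a node with no outgoing edges as linking to every node (`graph[node] if graph[node] else nodes`), so the guard suggested above avoids the division by zero but drops the rank held by sink nodes instead of redistributing it. Below is a minimal sketch of one way to keep the earlier behaviour in vectorized form. The function name `pagerank_with_sinks` is illustrative and not part of the PR, and the sketch assumes every edge target is also a key of `graph`.

```python
import numpy as np


def pagerank_with_sinks(graph, iterations=16, damping=0.85):
    """Vectorized PageRank that spreads sink-node rank uniformly over all nodes.

    Hypothetical helper for illustration; it mirrors the pre-change loop,
    which treated a node with no outgoing edges as linking to every node.
    """
    nodes = sorted(graph)
    n = len(nodes)
    if n == 0:
        return {}
    idx = {node: i for i, node in enumerate(nodes)}

    # Edge list as parallel source/target index arrays.
    rows = np.array([idx[s] for s, ts in graph.items() for _ in ts], dtype=np.int64)
    cols = np.array([idx[t] for ts in graph.values() for t in ts], dtype=np.int64)

    out_degree = np.bincount(rows, minlength=n).astype(np.float64)
    is_sink = out_degree == 0
    # Avoid dividing by zero; sink entries are never selected through `rows`.
    safe_degree = np.where(is_sink, 1.0, out_degree)

    rank = np.full(n, 1.0 / n)
    base = (1.0 - damping) / n
    for _ in range(iterations):
        msgs = (rank * damping / safe_degree)[rows]
        received = np.bincount(cols, weights=msgs, minlength=n)
        # Each sink shares its damped rank equally among all n nodes.
        sink_share = damping * rank[is_sink].sum() / n
        rank = received + base + sink_share
    return dict(zip(nodes, rank.tolist()))
```

With this handling the ranks keep summing to 1 on graphs that contain sinks, which is the same property the dictionary-based loop had.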

    +
         for _ in range(iterations):
    -        new_rank = {node: base for node in nodes}
    -        for node in nodes:
    -            share = rank[node] / len(outgoing[node])
    -            for target in outgoing[node]:
    -                new_rank[target] += damping * share
    -        rank = new_rank
    -    top_node = max(nodes, key=lambda node: (rank[node], node))
    -    checksum = sum((index + 1) * rank[node] for index, node in enumerate(nodes))
    +        msgs = (rank * trans_wt)[rows]
    +        recieved = np.bincount(cols, weights=msgs, minlength=N)
    +        rank = recieved + base


Comment on lines +41 to +42

Typo: `recieved` should be `received`.

✏️ Proposed fix:

    -recieved = np.bincount(cols, weights=msgs, minlength=N)
    -rank = recieved + base
    +received = np.bincount(cols, weights=msgs, minlength=N)
    +rank = received + base

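
The loop body in this hunk uses `np.bincount` with `weights` as a scatter-add: each edge carries one message from its source, and messages that share a target index are summed. Here is a small standalone sketch on a three-node graph. The edges and rank values are made up for illustration, and damping is left out to keep the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical 3-node graph: edges 0->1, 0->2, 1->2, 2->0.
rows = np.array([0, 0, 1, 2])  # source index of each edge
cols = np.array([1, 2, 2, 0])  # target index of each edge
n = 3

rank = np.array([0.5, 0.25, 0.25])
out_degree = np.bincount(rows, minlength=n).astype(np.float64)  # [2.0, 1.0, 1.0]

# One message per edge: the source's rank split evenly over its out-edges.
msgs = (rank / out_degree)[rows]

# Sum the messages arriving at each target index.
received = np.bincount(cols, weights=msgs, minlength=n)
print(received.tolist())  # [0.25, 0.25, 0.5]
```

Node 2 receives two messages (from nodes 0 and 1), which is why its entry is the sum of both.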

    +
    +    top_node = int(np.argmax(rank))
    +    mult = np.arange(1, N+1)
    +    checksum = float(np.dot(mult, rank))
    +
         return {
    -        "node_count": len(nodes),
    -        "top_node": top_node,
    -        "top_score": round6(rank[top_node]),
    -        "checksum": round6(checksum),
    +        "node_count": N,
    +        "top_node": f"n{top_node:04d}",
    +        "top_score": round6(float(rank[top_node])),
    +        "checksum" : round6(checksum),
         }

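
The removed code picked the top node by name with `max(nodes, key=lambda node: (rank[node], node))`, while the added code formats the `np.argmax` index as `n{top_node:04d}`. The two agree only if the sorted node names follow exactly that pattern, and they differ on ties: `max` with a `(rank, name)` key prefers the largest name, whereas `np.argmax` returns the first maximal index. The following is a hedged sketch of a lookup that keeps the earlier semantics. `top_node_name` is an illustrative name, not something in the PR.

```python
import numpy as np


def top_node_name(nodes, rank):
    """Return the name of the highest-ranked node, largest name winning ties.

    `nodes` is the sorted, non-empty list of node names and `rank` the
    matching array of scores. Hypothetical helper, not part of the PR.
    """
    rank = np.asarray(rank, dtype=np.float64)
    # All indices that attain the maximum; in a sorted name list the last
    # one carries the lexicographically largest name.
    best = np.flatnonzero(rank == rank.max())
    return nodes[int(best[-1])]


print(top_node_name(["n0000", "n0001", "n0002"], [0.4, 0.4, 0.2]))  # n0001
```

Indexing into `nodes` also avoids depending on the `n0000`-style naming scheme, which the diff shown here does not confirm for arbitrary inputs.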

Remove extraneous space before comma.

There's an extra space in the function signature: `dict[str, list[str]] ,` should be `dict[str, list[str]],`.