Conversation
Pull Request Overview
Introduces enhancements to profiling and annotation APIs, including improved kernel-event association, a new breakdown function for annotation timings, and extended flat profiling with custom grouping and parallelism controls.
- Expanded the `flat_profile` method with `mapper`, `parallelism_level`, `ascending`, and `idle_time` options
- Added `time_breakdown` to compute CPU and launched-kernel time per annotation
- Updated the SQLite reader to disable unused CUPTI events and switch from `_children` to `_kernel_launch`
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `pipit/trace.py` | Enhanced `flat_profile`, added `time_breakdown`, removed `dropna=False`, and a debug print |
| `pipit/readers/nsight_sqlite_reader.py` | Commented out MEMCPY/MEMSET/SYNCH SQL, and replaced `_children` assignments with `_kernel_launch` |
Comments suppressed due to low confidence (4)
pipit/readers/nsight_sqlite_reader.py:293
- Swapping `_children` for `_kernel_launch` breaks consumers (e.g., `time_breakdown` and `filter_by_label`) that expect `_children`. Ensure both fields are populated or update all references.

```python
trace_df.loc[calls_that_launch["index_x"].to_numpy(), "_kernel_launch"] = (
    calls_that_launch["index_y"].to_numpy()
)
```
pipit/trace.py:1040
- [nitpick] Using `DataFrame.apply` with Python loops for each kernel may be slow on large traces. Consider vectorized operations or grouping strategies to improve performance.

```python
kernels.apply(_calc_kernel_time, axis=1,)
```
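To illustrate the nitpick, here is a minimal, self-contained pandas sketch of the vectorized alternative; the frame, column names, and annotation labels are hypothetical, not pipit's actual schema:

```python
import pandas as pd

# Hypothetical kernels frame: one row per launched kernel, with start/end
# timestamps and the annotation it belongs to.
kernels = pd.DataFrame({
    "annotation": ["fwd", "fwd", "bwd"],
    "start": [0.0, 5.0, 12.0],
    "end": [4.0, 9.0, 20.0],
})

# Row-wise apply (what the review flags as slow on large traces):
slow = kernels.apply(lambda row: row["end"] - row["start"], axis=1)

# Vectorized equivalent: a single column subtraction, then a grouped sum
# to get total kernel time per annotation.
kernels["duration"] = kernels["end"] - kernels["start"]
per_annotation = kernels.groupby("annotation")["duration"].sum()
```

Both paths compute the same durations, but the vectorized version runs the arithmetic in one NumPy operation instead of one Python call per row.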
pipit/readers/nsight_sqlite_reader.py:71
- [nitpick] Large commented-out SQL queries add clutter. If these events are unused long-term, remove the dead code or move it to documentation.
# """ ... large commented SQL block ... """
pipit/trace.py:528
- New parameters and logic in `flat_profile` and `time_breakdown` lack dedicated unit tests. Add tests covering a custom `mapper`, parallelism levels, `idle_time`, sorting, and the new breakdown method.

```python
def flat_profile(..., mapper=None, parallelism_level=None, ascending=None, idle_time=False):
```
```python
pd_grouper[label] = group
res = (
    res.set_index("Name")
    .groupby([pd_grouper] + parallelism_level)[["time.exc"]]
```
The `mapper` branch hard-codes `time.exc` instead of using the `metrics` parameter; this will ignore any other metrics list provided. Use the `metrics` variable here.
```diff
-    .groupby([pd_grouper] + parallelism_level)[["time.exc"]]
+    .groupby([pd_grouper] + parallelism_level)[metrics]
```
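The suggested fix can be sketched with a small, self-contained pandas example showing a dict grouper combined with a parallelism column and a caller-supplied metrics list; the column names, mapper dict, and values are hypothetical:

```python
import pandas as pd

# Hypothetical flat-profile frame with two metric columns; `metrics` is the
# list the caller passed in (the review's point: don't hard-code "time.exc").
res = pd.DataFrame({
    "Name": ["MPI_Send", "MPI_Recv", "MPI_Send"],
    "Process": [0, 0, 1],
    "time.exc": [1.0, 2.0, 3.0],
    "time.inc": [1.5, 2.5, 3.5],
})
metrics = ["time.exc", "time.inc"]
parallelism_level = ["Process"]
pd_grouper = {"MPI_Send": "send", "MPI_Recv": "recv"}  # label -> group mapping

# pandas applies the dict to the index and the string to a column, so both
# metric columns are aggregated, not just "time.exc".
out = (
    res.set_index("Name")
    .groupby([pd_grouper] + parallelism_level)[metrics]
    .sum()
)
```

Selecting `[metrics]` instead of the literal `[["time.exc"]]` is the only change needed for the grouped result to carry every requested metric column.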
I'll agree with this; it's worth making sure this function works when metrics other than `time.exc` are passed in.
jhdavis8
left a comment
Features and documentation look good overall. I added a few comments.
One high-level question: are there new unit tests checking the added features? It would be good to add these here or in a later PR.
```python
# Use explode to expand every child in children list to a row
# This can include duplicates (e.g. for nested annotations) that we should drop
kernels = events.loc[host_events["_children"].dropna().explode().to_numpy()]
```
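A minimal, self-contained sketch of this explode-then-dedupe pattern, with hypothetical event names and child indices (and an explicit `drop_duplicates` for the duplicates the comment mentions):

```python
import pandas as pd

# Hypothetical host-side events, each holding a list of child kernel indices.
# Nested annotations can list the same kernel twice, so the exploded index
# contains duplicates that should be dropped before the lookup.
host_events = pd.DataFrame({
    "_children": [[2, 3], [3], None],
})
events = pd.DataFrame(
    {"Name": ["annA", "annB", "kernelX", "kernelY"]}, index=[0, 1, 2, 3]
)

# explode turns each list element into its own row: 2, 3, 3 here.
child_idx = host_events["_children"].dropna().explode()

# Dedupe, then look up the kernel rows by index.
kernels = events.loc[child_idx.drop_duplicates().astype(int).to_numpy()]
```

Without the dedupe step, `events.loc[...]` would return the row for kernel index 3 twice.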
```python
# Postprocessing using mapper
if mapper is not None:
    # pandas expects label->group
    labels = res["Name"]
```
Should this be `groupby_column`? I'm not sure, but wanted to double-check. I think we should decide whether we want the hardcoded `"Name"` column or `groupby_column` on every line where `"Name"` is used.
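The distinction can be sketched with a tiny, self-contained pandas example; `groupby_column` here stands in for whatever column the caller grouped on, and the frame contents are hypothetical:

```python
import pandas as pd

# Hypothetical result frame from the profiling step.
res = pd.DataFrame({
    "Name": ["foo", "bar", "foo"],
    "time.exc": [1.0, 2.0, 3.0],
})

# If the grouping column is configurable, referencing the variable instead
# of the literal "Name" keeps the mapper branch correct for any choice.
groupby_column = "Name"
labels = res[groupby_column]
totals = res.groupby(groupby_column)["time.exc"].sum()
```

With the literal `"Name"` hard-coded, the code silently breaks the moment `groupby_column` is anything else.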
Changes