Profiling kernels with additional function in psyclone tools #339
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,20 +13,32 @@ | |
|
|
||
| from psyclone.domain.lfric import LFRicConstants | ||
| from psyclone.psyGen import InvokeSchedule | ||
| from psyclone.psyir.nodes import Loop, Routine, Directive | ||
| from psyclone.psyir.nodes import ( | ||
| Loop, | ||
| Routine, | ||
| Directive, | ||
| Container, | ||
| OMPParallelDirective, | ||
| OMPParallelDoDirective, | ||
| OMPDoDirective, | ||
| FileContainer, | ||
| ProfileNode | ||
| ) | ||
| from psyclone.transformations import ( | ||
| Dynamo0p3ColourTrans, | ||
| Dynamo0p3OMPLoopTrans, | ||
| Dynamo0p3RedundantComputationTrans, | ||
| OMPParallelTrans, | ||
| TransformationError | ||
| ) | ||
| from psyclone.psyir.transformations import ProfileTrans | ||
|
|
||
| # List of allowed 'setval_*' built-ins for redundant computation transformation | ||
| SETVAL_BUILTINS = ["setval_c"] | ||
|
|
||
|
|
||
| # ----------------------------------------------------------------------------- | ||
| def redundant_computation_setval(psyir): | ||
| def redundant_computation_setval(psyir: FileContainer): | ||
| """ | ||
| Applies the redundant computation transformation to loops over DoFs | ||
| for the initialisation built-ins, 'setval_*'. | ||
|
|
@@ -68,13 +80,14 @@ def redundant_computation_setval(psyir): | |
|
|
||
|
|
||
| # ----------------------------------------------------------------------------- | ||
| def colour_loops(psyir, enable_tiling=False): | ||
| def colour_loops(psyir: FileContainer, enable_tiling=False): | ||
| """ | ||
| Applies the colouring transformation to all applicable loops and optionally | ||
| enables tiling. | ||
| It creates the instance of `Dynamo0p3ColourTrans` only once. | ||
|
|
||
| :param psyir: the PSyIR of the PSy-layer. | ||
| :param enable_tiling: a bool to enable tiling. Default False. | ||
| :type psyir: :py:class:`psyclone.psyir.nodes.FileContainer` | ||
|
|
||
| """ | ||
|
|
@@ -86,6 +99,12 @@ def colour_loops(psyir, enable_tiling=False): | |
| # Colour loops over cells unless they are on discontinuous | ||
| # spaces or over DoFs | ||
| for child in subroutine.children: | ||
| # Check if the profiling calipers have been added before the | ||
| # colouring. | ||
| if isinstance(child, ProfileNode): | ||
| raise TransformationError( | ||
| "Must apply colour_loops BEFORE profile_loops function " | ||
| "in optimisation script.") | ||
| if ( | ||
| isinstance(child, Loop) | ||
| and child.iteration_space.endswith("cell_column") | ||
|
|
@@ -94,9 +113,59 @@ def colour_loops(psyir, enable_tiling=False): | |
| ): | ||
| ctrans.apply(child, options={"tiling": enable_tiling}) | ||
|
|
||
| # ----------------------------------------------------------------------------- | ||
| def profile_loops(psyir: FileContainer, colours_only=True): | ||
| """ | ||
| Applies timing calipers to kernels during the psyclone build. The default | ||
| is to only profile coloured loops but colours_only can be set to False to | ||
| profile every instance of a coded kernel. | ||
|
|
||
| :param psyir: the PSyIR of the PSy-layer. | ||
| :param colours_only: profile only the coloured kernels. Default True. | ||
| :type psyir: :py:class:`psyclone.psyir.nodes.FileContainer` | ||
|
Collaborator
You can use type hinting to include this information in a syntactically significant fashion. e.g. This is more succinct than the sphinx form and can be used by tools such as Note that there is no need to specify
Author
Thanks for this tip Matthew. I've added the type hint to this function and the other functions in psyclone_tools. |
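The reviewer's original example did not survive on this page, so the following is only an illustrative sketch of the idea, not the snippet that was posted. It assumes a hypothetical stand-in `FileContainer` class (the real one lives in `psyclone.psyir.nodes`) and adds a `bool` hint on `colours_only` that is not in the diff. It shows that a hint in the signature is machine-readable, whereas a `:type psyir:` docstring field is plain text.

```python
from typing import get_type_hints


class FileContainer:
    """Hypothetical stand-in for psyclone.psyir.nodes.FileContainer."""


def profile_loops(psyir: FileContainer, colours_only: bool = True) -> None:
    """
    Applies timing calipers to kernels (body omitted in this sketch).

    :param psyir: the PSyIR of the PSy-layer.
    :param colours_only: profile only the coloured kernels. Default True.
    """


# The annotations can be inspected at run time; a ':type:' docstring field
# cannot be read this way without parsing the docstring text.
hints = get_type_hints(profile_loops)
print(hints["psyir"].__name__)
print(hints["colours_only"].__name__)
```

With the hint in place, the docstring only needs the `:param:` description.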
||
|
|
||
| """ | ||
| profile_trans = ProfileTrans() | ||
| leave_loops = ["cells_in_colour", | ||
| "tiles_in_colour", | ||
| "cells_in_tile"] | ||
|
|
||
| # Loop over all the InvokeSchedule in the PSyIR object | ||
| for subroutine in psyir.walk(InvokeSchedule): | ||
| # Add timing calipers to coloured loops. This should be done | ||
| # before the application of the openmp transformation. | ||
|
Comment on lines +135 to +136
Collaborator
Is there a means to enforce this ordering?
Author
Thank you for this comment Matthew, it's been really good to check. I hadn't thought about how strict this condition should be so I tested it and it turns out that a check for this ordering was definitely warranted as well as another (making sure the function isn't called before colour_loops as well). I've added a few
Author
Hacka Fett (@christophermaynard) Joerg Henrichs (@hiker): one thing that came up here was that PSyclone raises an error when a Profile node is placed between an OMPParallelDoDirective node and a Loop node. An idea for the future could be setting PSyclone up so that it allows it when you want to profile different threads?
Contributor
I don't think that's valid Fortran: after an |
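The ordering check discussed in this thread can be sketched in isolation. The classes below are hypothetical stand-ins for the PSyIR node types and for `TransformationError`. Only the `ancestor` lookup and the error message mirror the diff; everything else is an assumption made to keep the example self-contained.

```python
class TransformationError(Exception):
    """Stand-in for psyclone.transformations.TransformationError."""


class Node:
    """Minimal tree node with a parent link (stand-in for a PSyIR node)."""

    def __init__(self, children=None):
        self.parent = None
        self.children = list(children or [])
        for child in self.children:
            child.parent = self

    def ancestor(self, node_type):
        """Return the nearest enclosing node of node_type, or None."""
        node = self.parent
        while node is not None:
            if isinstance(node, node_type):
                return node
            node = node.parent
        return None


class Loop(Node):
    """Stand-in for a PSyIR Loop."""


class OMPParallelDoDirective(Node):
    """Stand-in for the OpenMP parallel-do directive node."""


def check_not_in_omp_region(loop):
    """Refuse to profile a loop that is already inside an OpenMP region."""
    if loop.ancestor(OMPParallelDoDirective) is not None:
        raise TransformationError(
            "Must apply profile_loops BEFORE openmp_parallelise_loops "
            "function in optimisation script.")


# A loop that has not been parallelised yet passes the check.
plain_loop = Loop()
Node(children=[plain_loop])
check_not_in_omp_region(plain_loop)

# A loop already wrapped in an OpenMP directive is rejected.
omp_loop = Loop()
Node(children=[OMPParallelDoDirective(children=[omp_loop])])
try:
    check_not_in_omp_region(omp_loop)
except TransformationError as err:
    print(f"Rejected: {err}")
```

Raising an error at the point of misuse lets an optimisation script fail early with a message that names the required order, instead of producing an invalid schedule.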
||
| count = 0 | ||
| for loop in subroutine.loops(): | ||
| if not loop.coded_kernels(): | ||
| continue | ||
| # Insert profiler calls before loop over colours | ||
| if ((loop.loop_type == "colours") or | ||
| (colours_only is False and loop.loop_type not in leave_loops)): | ||
| # First check that the transformation is not being made inside | ||
| # an OMP region. | ||
| if (loop.ancestor(OMPParallelDirective) | ||
| or loop.ancestor(OMPParallelDoDirective) | ||
| or loop.ancestor(OMPDoDirective)): | ||
| raise TransformationError( | ||
| "Must apply profile_loops BEFORE " | ||
| "openmp_parallelise_loops function in optimisation " | ||
| "script.") | ||
| # Constructing unique caliper name based on kernel name, | ||
| # invoke name and kernel count | ||
| k_object = loop.ancestor(InvokeSchedule).coded_kernels()[count] | ||
| k_name = k_object.name | ||
| invoke_name = loop.ancestor(InvokeSchedule).invoke.name | ||
| file_name = loop.ancestor(Container).name | ||
| # Make region name | ||
| region_name = invoke_name + ":" + k_name + "_k" + str(count) | ||
| options = {"region_name": (file_name, region_name)} | ||
| profile_trans.apply(loop, options=options) | ||
| # Count here is to distinguish kernels of the same name | ||
| # in the same invoke. | ||
| count += 1 | ||
|
Contributor
A comment about why we are using a counter would go a long way here. Something like "Allows invokes of the same name to be profiled individually" |
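To illustrate why the counter matters, here is a small sketch of the region-name scheme used in the diff (`invoke_name + ":" + k_name + "_k" + str(count)`). The helper `make_region_names` and the invoke and kernel names are hypothetical and chosen only for illustration.

```python
def make_region_names(invoke_name, kernel_names):
    """Build one profiling region name per kernel call in an invoke.

    The running index keeps names unique even when the same kernel
    is called more than once within the same invoke.
    """
    return [
        f"{invoke_name}:{k_name}_k{count}"
        for count, k_name in enumerate(kernel_names)
    ]


# Hypothetical invoke containing the same kernel twice.
names = make_region_names(
    "invoke_0", ["matrix_vector_code", "matrix_vector_code"])
for name in names:
    print(name)
```

Without the `_k<count>` suffix, both calls would share one region name, so their timings could not be told apart.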
||
|
|
||
| # ----------------------------------------------------------------------------- | ||
| def openmp_parallelise_loops(psyir): | ||
| def openmp_parallelise_loops(psyir: FileContainer): | ||
| """ | ||
| Applies OpenMP Loop transformation to each applicable loop. | ||
|
|
||
|
|
@@ -120,7 +189,7 @@ def openmp_parallelise_loops(psyir): | |
|
|
||
|
|
||
| # ----------------------------------------------------------------------------- | ||
| def view_transformed_schedule(psyir): | ||
| def view_transformed_schedule(psyir: FileContainer): | ||
| """ | ||
| Provides view of transformed Invoke schedule in the PSy-layer. | ||
|
|
||
|
|
||
Unless it states otherwise in the developer documentation, it is probably best to insert your name in alphabetical order by family name. This may be difficult if others have not been doing so.
Sadly, there is no clear structure or order to this file, alphabetical or otherwise.