Skip to content

Refactoring of stride handling#3271

Open
Lestropie wants to merge 19 commits intodevfrom
strides_refactor
Open

Refactoring of stride handling#3271
Lestropie wants to merge 19 commits intodevfrom
strides_refactor

Conversation

@Lestropie
Copy link
Copy Markdown
Member

@Lestropie Lestropie commented Mar 4, 2026

Draft PR. Still quite a lot left to do. Posting to open the floor to discussion.

#3108 struck me as odd, in that it is proposed that in addition to providing an array of integers as "desired strides", it may be necessary to additionally provide a sidecar boolean specifying "please interpret this as signed strides as opposed to just a desired order of axes". While it may be possible to store both types of information using the same eg. std::vector<int> class, what they are encoding, and therefore how their data should be interpreted, is fundamentally different (and I don't find "nearest match" to be the right description). As I dug further into this I realised that the current stride handling code was closer to C than C++: it's all verbose function names, relying on developers to invoke the right function at the right time. In the end I couldn't find a simple fix to achieve what I wanted; basically all of the handling needed to be OOP'd.

The ultimate aim of the PR is to replace #3108's:

Image with_direct_io(Stride::List with_strides = Stride::List(),
                     StrideInterpretation strideInterpretation = StrideInterpretation::NearestMatch);

with:

Image with_direct_io(const Stride::Permutation &with_permutation);
Image with_direct_io(const Eigen::Index contiguous_axis);
Image with_direct_io(const Stride::Symbolic &with_symbolic);

If you want the axes to appear in a particular order, and care neither about their signs nor the relative ordering of axes within groups, then you are ultimately specifying a permutation, not "strides". If however you want the data to have a very specific set of symbolic strides, such as in the GPU processing context, then what is being specified is "strides", and those data should not undergo any further "interpretation".


Minor points:

  • I did a little bit of std::ssize_t audit #2972. I was encountering inconsistencies in types used for the same purpose in different locations. I also had a bug very annoying to diagnose wherein MR::Adapter::Base::ndim() was being called instead of MR::Adapter::PermuteAxes::ndim() because the integer return type differed. I've made much of MR::Adapter::Base virtual even though it's not used in that way, it's to show intent rather than for function.

  • There's an interesting interaction between the stride handling and multi-threading. The multi-threading wants the indices of some number of axes to run in the inner loop, and the indices of all other axes to run in the outer loop, based on their relative strides. This turns out to arguably be better suited to MR::Stride then MR::Thread. It corresponds to transforming Stride::Symbolic into a Stride::Permutation, then to a Stride::Order, and then separating the head and tail of the list.

  • Because of the HeaderType template interface concept, when reading strides from a template class input, it is impossible to know whether that class has a symbolic vs. actual representation of strides. It's therefore necessary to treat anything that comes from a template class as actualised, and then manipulate as necessary.

  • There is some concern of borking use cases that aren't covered by the tests. There was a range of esoteric fixes required at the end of the porting process. Many of these I believe to be bad existing code combined with the refactored code being a bit less lenient; eg. calling .with_direct_io(Stride::contiguous_along_axis(3)) on a 3D image.


TODO:

  • Move the stride data member out of the Header::Axis class, and instead give Header a member variable Stride::Symbolic.
    A class Header::Axis would be expected to store information that is specific to one axis and independent of all other axes. For strides however, almost the opposite is true: it is explicitly about its relationship to other axes. The HeaderType::stride(axis) interface can possibly be preserved to minimise porting, though in all likelihood a lot of code that modifies strides of individual axes would actually make more sense (and be less bug-prone) to instead be specifying that some permutation is to be applied to the existing symbolic strides.

  • Vary integer types of Stride::Base class.
    This could help mitigate bugs arising from casting these classes back to std::vector and then just re-interpreting the data as something they are not.
    For a list of integers that encodes a subset of axes, or alternatively one that encodes an order of permutation of all axes, an unsigned integer makes sense (though I'd probably use Eigen::Index anyway).
    For symbolic strides, int8_t would be perfectly fine as long as you don't mind limiting to images with fewer than 128 axes.
    For actualised strides, a wide signed integer type is required (we have erroneously used ssize_t for this, which works in practise but is not guaranteed).

  • Add tests.
    I will try to create tests that directly contrast functionality of the prior code (which remains in place under the MR::Stride::Legacy namespace) with their equivalent in the new code.

  • Address outstanding TODOs.
    As stated, this is the first functioning version, and I leave comments to myself in the code as I'm trying to figure things out. Will need to audit all of these.

  • Confirm finalisation of interactions with multi-threading code.
    Remove duplicate classes if suitable; move all code attributing axes to inner / outer loops to MR::Stride is suitable.

  • Discuss integer types & naming.
    As raised in std::ssize_t audit #2972 the current integer type choices leave a bit to be desired.
    Here I have begun to use Eigen::Index for anything wherein a value of -1 could potentially be considered a unique error flag, and there's an assumption of indexing from zero.
    For representing the position along an axis within a voxel grid, I have defined Axes::index_type (not sold on the name) as an int64_t. It is entirely possible for some computation to result in a voxel index of less than -1, and so support for such should be guaranteed.
    For computations involving differences in memory locations I have adopted explicitly std::ptrdiff_t.

  • Investigate frequency of invocation of stride handling functionality.
    Having added debugging outputs to check my stride handling, there is a greater volume of messages generated than what I would have expected. So I want to check for any redundant logic.

    • Reduce verbosity / volume of debugging outputs once completed.
  • Run code against DWI_metadata test battery.

  • Run code against private DICOM test suite.

Utilise OOP to distinguish between different forms of data relating to the arrangement of N-dimensional data. These can be: a list of axis indices; a list of the relative orders of axes, which effectively serve as a permutation for data contiguity priority, and are indexed from zero; symbolic strides, which are indexed from one and are signed; actual strides, which represent the number (and sign) of elements that must be traversed in the serialised data to achieve an increment of the position index along each axis.
@Lestropie Lestropie self-assigned this Mar 4, 2026
@Lestropie
Copy link
Copy Markdown
Member Author

Strange feature of existing implementation found, highlighting issues with failure to disambiguate forms of strides. Only triggers INFO() so not consequential, but highlights the difficulty of getting this stuff right.

Header::create() invokes ::check_strides_match(), to compare strides between the initial Header contents and any influence of the invoked format. In the former it is clear that both sets of strides are symbolic. However within the latter there are checks as to whether any values present in one list and not the other are greater than 1. This however makes zero sense: I can see a rationale for checking whether actual strides are non-unity and therefore non-trivial, but for symbolic strides that check doesn't make any sense.

(Currently I think that the suitable check is actually: truncate both symbolic strides to only the set of image dimensions present in both, sanitise each individually, then check whether those are equivalent; ie. only report if the relative strides of axes neither inserted nor deleted remain consistent)

- Partial adoption of some integer typedefs.
- New class CuboidExtent for encoding a triplet of odd-valued sizes of a cuboid kernel centred at a voxel, removing redundancy in sanity-checking code.
- Modify Stride classes to use different integer types related to expectation of negative values and magnitudes of values, which may further protect against unintended casts from one form of such information to another.
- When constructing a Header class from a template HeaderType class, explicitly cast strides to symbolic.
- Changed implementation of Adapter::PermuteAxes so as to not depend on a std::vector providing array offsets of less than -1.
- Change criteria for reporting when an output image format automatically changing strides results in an INFO() level message.
@daljit46
Copy link
Copy Markdown
Member

One comment I have is that previously we had Stride::actualise(header), now it seems that we need to do something like Stride::Symbolic(header).actualise(header).to(header) (?), since Stride::actualise now returns an instance of Actual. This IMO is a little deceiving because "actualise" is no longer being used as an imperative and so the user of the function might think that the header is being changed. For exampel, in stride.h, you have:

template <class HeaderType>
inline void set_from_command_line(HeaderType &header, const Permutation &default_order = Permutation()) {
  auto cmdline_strides = __from_command_line(Symbolic(header));
  if (static_cast<bool>(cmdline_strides))
    cmdline_strides->actualise(header); // what does this do?
  else if (!default_order.empty())
    Stride::Symbolic(header).reordered(default_order).actualise(header);
}

where cmdline_strides->actualise(header) does nothing. I think we should mark Symbolic::actualise as [[nodiscard]] and also change its name to something like actual or get_actual.

- Only provide read-only access to individual symbolic strides in the Header class; modifications to symbolic strides can only be done by operating on the Stride::Symbolic instance as a whole.
- Remove multiple instances of strides being actualised and written to the Header class, which does not require actualised strides.
- Remove Stride::Symbolic::actualise(); rely instead on constructors for the Actual class.
- Refactor Registration::multi_resolution_lmax() functions in order to prevent instantiation of an Adapter::Extract1D class along axis 3 for a 3D input image.
@Lestropie
Copy link
Copy Markdown
Member Author

One comment I have is that previously we had Stride::actualise(header), now it seems that we need to do something like Stride::Symbolic(header).actualise(header).to(header)

This is already changed in my local work progress:

  • Actual Symbolic::actualise(const HeaderType &) is instead Actual::Actual(const HeaderType &), which invokes Actual::Actual(const Symbolic &symbolic, const std::vector<size_t> &sizes).
  • Changes to Header (/ HeaderType template):
    • Individual axis strides are read-only. Actualised strides can only be realised at class instantiation, and symbolic strides in Header can only be modified as atomic operations to the Stride::Symbolic class, not element-wise. .to() can no longer work under that paradigm.
    • The Header class is now explicitly coded as having symbolic strides, not actualised strides. These syntax changes have exposed places where strides were being actualised and then written back to the Header class, which doesn't really make much sense.

Symbolic::actualised() would have been better than Symbolic::actualise() given the naming of other functions, but irrelevant since the whole function has disappeared.

Promit: "Generate file testing/unit_tests/strides.cpp to test the functionalities present in files cpp/core/stride.h and cpp/core/strides.cpp. It should utilise the Google Code API, and follow the same code structure as other files in directory testing/unit_tests/. It should test both the active implemented features in the MR::Stride namespace, and the corresponding superseded functionalities in the MR::Stride::Legacy namespace. For each feature, tests of new and legacy implementations should appear sequentially in the tests. Include tests where invalid values are present. Include tests where the number of image dimensions (the numbers of elements in the vectors of integers) differ."

Generated-by: Claude <noreply@anthropic.com>
- Refine code based on tests initially added by Claude.
- Added some extra constructors to MR::Stride classes.
- Re-implemented Stride::conform() function to better mimic the behaviour of MR::Stride::Legacy::get_nearest_match().
- Manual refinement of tests automatically generated by Claude, and manual addition of further tests.
@Lestropie
Copy link
Copy Markdown
Member Author

Relevant notes on latest changes:

  • Header::stride(axis) will yield a Stride::Symbolic::value_type (int8_t), whereas template HeaderType::stride(axis) could yield anything, up to and including a Stride::Actual::value_type (std::ptrdiff_t). So template code that could be dealing with either needs to instantiate an Actual then convert to Symbolic if required.

  • header::stride(axis) no longer has a writeable version. Instead use Header::strides() to perform atomic operations on the complete set of strides.

  • See current proposed integer typedefs:
    4c3655f#diff-89087662cb1c5fc640d93e9f9047fdf02e6d587da95cfc321d361e97e1f6d818
    Have not gone through the whole repo to enforce, so there's likely implicit conversions all over the place, but that was already the case. These are just something that I'll probably start using consistently myself; helps to signal what a variable is for.
    Wouldn't mind having another typedef for "this is an integer counter", but don't know if that will result in a fight over size_t versus ssize_t or whether contextual control over the support for -1 needs to be preserved.

While these values are always positive, the most common use of this information is comparison to voxel position indices, and so it makes sense for these to already be of the same type.
Reduces redundancy in parsing and sanity-checking such information, and enforces type conformity for code readability.
- Remove obsolete TODO comments.
- Image::with_direct_io(): Perform preload if image intensity scaling is non-trivial.
- Change behaviour of is_sanitised() and valid() functions of Stride::Order.
- Fix valid() and is_canonical() functions of Stride::Actual.
- Group stride handling tests relating to status functions in order to double as demonstration of the differences in behaviour between these functions.
@Lestropie
Copy link
Copy Markdown
Member Author

Another minor issue found with existing implementation:

// remove duplicates
for (size_t i = 0; i < header.ndim()-1; ++i) {
  if (header.size (i) == 1) header.stride (i) = 0;

While the code comment says that it's looking for duplicates, it's also serving a second purpose. If the size along a particular axis is one, then it sets the stride to zero. This is intended to preclude the actualised strides of two image axes from being identical to one another. However because the for loop is designed around the detection of duplicates, it excludes the final image axis. So if the last of the image axes has a size of one, there will be degeneracy in the actualised strides.

- Remove unnecessary inlining of fixel helper functions.
- Drastically simplify smoothing filter looping over three spatial axes using new Stride::Permutation::conform() function.
- Embed data offset to first image voxel within Stride::Actual class.
- Embeds bugfix in display of intermediate images in #3285.
- Make construction of a Stride::Order from a Stride::Permutation robust to degeneracies in input.
- Make stride behaviour in the presence of unity-sized image axes consistent with legacy stride handling code.
- Includes fix to Stride::Legacy::sanitise() for when the size of the final image axis is 1.
- Move get_sizes() function out of Stride::actual class and into Stride namespace.
Conflicts:
	cpp/cmd/mrdegibbs.cpp
	cpp/cmd/mrfilter.cpp
	cpp/core/filter/smooth.h
	cpp/core/math/constrained_least_squares.h
- check_syntax: Don't return std::string_view.
- Fix compilation of SIFT commands with SH image export enabled.
- Fix implicit integer conversion on MSYS2 when dealing with intermal memory vs. filesystem size types.
- Add some SFINAE to stride handling to help hunt down compilation failure on MacOS.
- Ensure that copying an instance of Stride::Actual using the = operator appropriately propagates the offset variable.
- Fix compilation warning RE implicit integer conversion on MSYS2.
- Fix operator order in new CuboidExtent class.
- Fix operation of multiple stride functions being broken when terminal outputs were disabled as some data manipulations were collateral damage.
- Change behaviour of construction of Symbolic from a combination of actual strides and axis sizes.
- Change code structure of Stride::Actual constructor to remove conflict between templated and non-templated versions, which MacOS failed to resolve.
Lestropie added a commit that referenced this pull request Mar 27, 2026
- mrgrid: Utilise new integer typedefs to prevent mismatched integer widths in std::min() and std::max() calls on MacOS.
- Change some sanity checks in stride handling from assert() to MR::Exception to ensure that tests pass when compiled in release mode without assertions.
- mrgrid: Utilise new integer typedefs to prevent mismatched integer widths in std::min() and std::max() calls on MacOS.
- Change some sanity checks in stride handling from assert() to MR::Exception to ensure that tests pass when compiled in release mode without assertions.
- Add static integer cast to NPY read test for MSYS2 compatibility.
github-actions[bot]

This comment was marked as outdated.

Lestropie added a commit that referenced this pull request Mar 27, 2026
@Lestropie Lestropie marked this pull request as ready for review March 27, 2026 05:59
@Lestropie
Copy link
Copy Markdown
Member Author

This now passes all tests, including for regression against legacy code, so has been promoted from draft to reviewable PR.

Would appreciate:

  • @jdtournier running private DICOM tests
  • Any thoughts on integer typedefs
    (these aren't to be immediately enforced, but I think they will be useful long-term)
  • Opinions on whether legacy stride handling code should remain in code in MR::Stride::Legacy namespace or whether it should just be removed now that regression tests have been publicly demonstrated to pass.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

There were too many comments to post at once. Showing the first 25 out of 428. Check the log or trigger a new build to see more.

Comment thread cpp/cmd/mrinfo.cpp
if (i)
buffer += " ";
buffer += header.stride(i) ? str(strides[i]) : "?";
buffer += header.stride(i) == Stride::Symbolic::invalid ? "?" : str(static_cast<int>(header.stride(i)));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: use of undeclared identifier 'symbolic' [clang-diagnostic-error]

    buffer += header.stride(i) == Stride::Symbolic::invalid ? "?" : str(static_cast<int>(symbolic[i]));
                                                                                         ^

: base_type(original),
extract_axis(axis),
indices(indices),
nsize(indices.size()),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: narrowing conversion from 'size_type' (aka 'unsigned long') to signed type 'VoxelIndex' (aka 'long') is implementation-defined [bugprone-narrowing-conversions]

        nsize(indices.size()),
              ^

Comment thread cpp/core/adapter/median.h
@@ -33,35 +35,25 @@ template <class ImageType> class Median : public Base<Median<ImageType>, ImageTy
using base_type::name;
using base_type::size;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: use of undeclared identifier 'base_type'; did you mean 'value_type'? [clang-diagnostic-error]

Suggested change
using base_type::size;
using value_type::size;
Additional context

cpp/core/adapter/median.h:30: 'value_type' declared here

  using value_type = typename ImageType::value_type;
        ^

Comment thread cpp/core/adapter/median.h
using base_type::size;

Median(const ImageType &parent) : base_type(parent) { set_extent(std::vector<int>(1, 3)); }
Median(const ImageType &parent) : base_type(parent), extent(3) { update_halfextent(); }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: constructor does not initialize these fields: halfextent [cppcoreguidelines-pro-type-member-init]

cpp/core/adapter/median.h:73:

-   std::array<VoxelIndex, 3> halfextent;
+   std::array<VoxelIndex, 3> halfextent{};

Comment thread cpp/core/adapter/median.h
using base_type::size;

Median(const ImageType &parent) : base_type(parent) { set_extent(std::vector<int>(1, 3)); }
Median(const ImageType &parent) : base_type(parent), extent(3) { update_halfextent(); }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: member initializer 'base_type' does not name a non-static data member or base class [clang-diagnostic-error]

  Median(const ImageType &parent) : base_type(parent), extent(3) { update_halfextent(); }
                                    ^

protected:
using base_type::parent;
std::vector<ssize_t> from_, size_;
std::vector<VoxelIndex> from_, size_;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: member variable 'size_' has protected visibility [cppcoreguidelines-non-private-member-variables-in-classes]

  std::vector<VoxelIndex> from_, size_;
                                 ^

using base_type::size;

Normalise3D(const ImageType &parent) : base_type(parent) { set_extent(std::vector<int>(1, 3)); }
Normalise3D(const ImageType &parent) : base_type(parent), extent(3) { update_halfextent(); }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: constructor does not initialize these fields: halfextent, nelements [cppcoreguidelines-pro-type-member-init]

cpp/core/adapter/normalise3D.h:78:

-   std::array<VoxelIndex, 3> halfextent;
+   std::array<VoxelIndex, 3> halfextent{};

cpp/core/adapter/normalise3D.h:81:

-   size_t nelements;
+   size_t nelements{};

extent = ext;

DEBUG("normalise3D adapter for image \"" + name() + "\" initialised with extent " + str(extent));
Normalise3D(const ImageType &parent, const CuboidExtent &extent) : base_type(parent), extent(extent) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: constructor does not initialize these fields: halfextent, nelements [cppcoreguidelines-pro-type-member-init]

  Normalise3D(const ImageType &parent, const CuboidExtent &extent) : base_type(parent), extent(extent) {
  ^


protected:
std::vector<uint32_t> extent;
CuboidExtent extent;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: member variable 'extent' has protected visibility [cppcoreguidelines-non-private-member-variables-in-classes]

  CuboidExtent extent;
               ^

protected:
std::vector<uint32_t> extent;
CuboidExtent extent;
std::array<VoxelIndex, 3> halfextent;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: member variable 'halfextent' has protected visibility [cppcoreguidelines-non-private-member-variables-in-classes]

  std::array<VoxelIndex, 3> halfextent;
                            ^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants