Merged
14 changes: 7 additions & 7 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -303,8 +303,8 @@ Full documentation for MIGraphX is available at
* Support for the Log2 internal operator
* Support for the GCC 14 compiler
* The BitwiseAnd, Scan, SoftmaxCrossEntropyLoss, GridSample, and NegativeLogLikelihoodLoss ONNX operators
* The MatMulNBits, QuantizeLinear/DequantizeLinear, GroupQueryAttention, SkipSimplifiedLayerNormalization, and SimpliedLayerNormalization Microsoft Contrib operators
* Dymamic batch parameter support to OneHot operator
* The MatMulNBits, QuantizeLinear/DequantizeLinear, GroupQueryAttention, SkipSimplifiedLayerNormalization, and SimplifiedLayerNormalization Microsoft Contrib operators
* Dynamic batch parameter support to OneHot operator
* Split-K as an optional performance improvement
* Scripts to validate ONNX models from the ONNX Model Zoo
* GPU Pooling Kernel
@@ -314,7 +314,7 @@ Full documentation for MIGraphX is available at
* Pointwise fusions with MLIR across reshape operations
* MIGRAPHX_MLIR_DUMP environment variable to dump MLIR modules to MXRs
* The 3 option to MIGRAPHX_TRACE_BENCHMARKING to print the MLIR program for improved debug output
* MIGRAPHX_ENABLE_HIPBLASLT_GEMM environment variable to call hipBlasLt libaries
* MIGRAPHX_ENABLE_HIPBLASLT_GEMM environment variable to call hipBlasLt libraries
* MIGRAPHX_VERIFY_DUMP_DIFF to improve the debugging of accuracy issues
* reduce_any and reduce_all options to the Reduce operation via Torch MIGraphX
* Examples for RNNT, and ControlNet
@@ -332,7 +332,7 @@ Full documentation for MIGraphX is available at
### Removed

* Disabled requirements for MIOpen and rocBlas when running on Windows.
* Removed inaccuracte warning messages when using exhaustive-tune.
* Removed inaccurate warning messages when using exhaustive-tune.
* Remove the hard coded path in MIGRAPHX_CXX_COMPILER allowing the compiler to be installed in different locations.


@@ -342,15 +342,15 @@ Full documentation for MIGraphX is available at
* Infrastructure code to enable better Kernel fusions with all supported data types
* Subsequent model compile time by creating a cache for already performant kernels
* Use of Attention fusion with models
* Performance of the Softmax JIT kernel and of the Pooling opterator
* Performance of the Softmax JIT kernel and of the Pooling operator
* Tuning operations through a new 50ms delay before running the next kernel
* Performance of several convolution based models through an optimized NHWC layout
* Performance for the FP8 datatype
* GPU utilization
* Verification tools
* Debug prints
* Documentation, including gpu-driver utility documentation
* Summary section of the migrahx-driver perf command
* Summary section of the migraphx-driver perf command
* Reduced model compilation time
* Reordered some compiler passes to allow for more fusions
* Preloaded tiles into LDS to improve performance of pointwise transposes
@@ -607,7 +607,7 @@ Full documentation for MIGraphX is available at
* Fixed compile warnings for shadowing variable names
* Added missing specialization for the `nullptr` hash function

### Changees
### Changes

* Bumped version of half library to 5.6.0
* Bumped CI to support ROCm 5.6
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -24,7 +24,7 @@
cmake_minimum_required(VERSION 3.15 FATAL_ERROR)

if("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_BINARY_DIR}")
message(FATAL_ERROR "The binary and source directroy cannot be the same")
message(FATAL_ERROR "The binary and source directory cannot be the same")
endif()

# Setup valid strings for build type
2 changes: 1 addition & 1 deletion Dockerfile
@@ -74,7 +74,7 @@ RUN pip3 install https://repo.radeon.com/rocm/manylinux/rocm-rel-7.1.1/torch-2.8
https://repo.radeon.com/rocm/manylinux/rocm-rel-7.1.1/torchvision-0.24.0%2Brocm7.1.1.gitb919bd0c-cp310-cp310-linux_x86_64.whl\
https://repo.radeon.com/rocm/manylinux/rocm-rel-7.1.1/triton-3.4.0%2Brocm7.1.1.git0cace8d2-cp310-cp310-linux_x86_64.whl

# add this for roctracer dependancies
# add this for roctracer dependencies
RUN pip3 install CppHeaderParser

# Workaround broken rocm packages
2 changes: 1 addition & 1 deletion docs/dev/contributing-to-migraphx.rst
@@ -36,7 +36,7 @@ We start with a snippet of the simple ``add_two_literals()`` function::
// compile the program on the reference device
p.compile(migraphx::ref::target{});

// evaulate the program and retreive the result
// evaluate the program and retrieve the result
auto result = p.eval({}).back();
std::cout << "add_two_literals: 1 + 2 = " << result << "\n";

2 changes: 1 addition & 1 deletion docs/dev/dev_intro.rst
@@ -43,7 +43,7 @@ We start with a snippet of the simple ``add_two_literals()`` function::
// compile the program on the reference device
p.compile(migraphx::ref::target{});

// evaulate the program and retreive the result
// evaluate the program and retrieve the result
auto result = p.eval({}).back();
std::cout << "add_two_literals: 1 + 2 = " << result << "\n";

4 changes: 2 additions & 2 deletions docs/dev/onnx_operators.rst
@@ -457,13 +457,13 @@ Operator Support Matrix
+--------------------------+-----------+-----------------+------------------------------+
| MaxPool | ✅ | FP32, FP16, | ``storage_order`` |
| | | FP8, INT8 | not supported, |
| | | | ``dialtion`` is |
| | | | ``dilation`` is |
| | | | partially |
| | | | supported on |
| | | | GPU (MIOpen |
| | | | limitation), |
| | | | ``indices`` 2nd |
| | | | ouput not |
| | | | output not |
| | | | supported |
+--------------------------+-----------+-----------------+------------------------------+
| MaxRoiPool | ❌ | | |
2 changes: 1 addition & 1 deletion src/driver/main.cpp
@@ -361,7 +361,7 @@ struct loader
{
auto dyn_dim = parse_dyn_dims_json(x);
if(dyn_dim.size() != 1)
MIGRAPHX_THROW("dim_param must only specifiy one dimension");
MIGRAPHX_THROW("dim_param must only specify one dimension");
map_dim_params[name] = dyn_dim.front();
}
}
4 changes: 2 additions & 2 deletions src/include/migraphx/common.hpp
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2024 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2026 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -101,7 +101,7 @@ MIGRAPHX_EXPORT
std::vector<std::size_t> compute_common_lens(const std::vector<shape>& shapes);

/**
* @ brief Compute the common (broadcasted) dynamic dimensions of a list of dynamic shapes
* @brief Compute the common (broadcasted) dynamic dimensions of a list of dynamic shapes
*/
MIGRAPHX_EXPORT
std::vector<shape::dynamic_dimension> compute_common_dyn_dims(const std::vector<shape>& shapes);
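The common-shape helpers above implement multidirectional (numpy-style) broadcasting. A minimal sketch of the static-length case, assuming standard broadcasting rules; the function name and error message here are illustrative, not MIGraphX API:

```python
from itertools import zip_longest

def broadcast_lens(a, b):
    # Align the shapes from the trailing dimension; a dimension of 1
    # stretches to match the other shape, anything else must agree exactly.
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"incompatible dims {x} and {y}")
        out.append(max(x, y))
    return list(reversed(out))
```

`compute_common_dyn_dims` extends the same idea to dynamic shapes, where each dimension is a range rather than a single size.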
8 changes: 4 additions & 4 deletions src/include/migraphx/float8_impl.hpp
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2025 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2026 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -121,7 +121,7 @@ constexpr uint8_t cast_to_f8(T f_x, bool stoch = false, uint32_t rng = 0)
return NegativeZeroNan ? 0 : 0x80; // For FNUZ types neg zero is just positive zero
}

/* First need to check if it is normal or denorm as there is a difference of implict 1
/* First need to check if it is normal or denorm as there is a difference of implicit 1
Then need to adjust the exponent to align with the F8 exponent, in the meanwhile, shift
The mantissa. Then for stochastic rounding, add rng to mantissa and truncate. And for
RNE, no need to add rng. Then probably need to check whether there is carry and adjust
@@ -157,7 +157,7 @@ constexpr uint8_t cast_to_f8(T f_x, bool stoch = false, uint32_t rng = 0)
{
/* This is the case where fp32/fp16 is normal but it is in f8 denormal range.
For example fp8 FNUZ mode, denormal exponent is -7, but if the fp32/fp16
actual exponent is -7, it is actually larger due to the implict 1,
actual exponent is -7, it is actually larger due to the implicit 1,
Therefore it needs to be adjust to -6 and mantissa shift right by 1.
So for fp32/fp16, exponent -8 is the cut point to convert to fp8 FNUZ */
exponent_diff = f8_denormal_act_exponent - act_exponent;
@@ -186,7 +186,7 @@ constexpr uint8_t cast_to_f8(T f_x, bool stoch = false, uint32_t rng = 0)
else if(exponent_diff == -1)
mantissa <<= -exponent_diff;
bool implicit_one = mantissa & (1 << mfmt);
// if there is no implict 1, it means the f8 is denormal and need to adjust to denorm exponent
// if there is no implicit 1, it means the f8 is denormal and need to adjust to denorm exponent
f8_exponent =
(act_exponent + exponent_diff) /*actual f8 exponent*/ + f8_bias - (implicit_one ? 0 : 1);

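The comments in `cast_to_f8` deal with the normal/denormal split and the implicit leading 1. As a hedged reference for the OCP E4M3 flavor of FP8 (bias 7, max 448, no infinities — note the FNUZ variant discussed above differs), the format can be pinned down without any bit-twiddling by decoding all 256 codes and picking the nearest one; this brute-force search is a stand-in for, not a copy of, the real implementation:

```python
def decode_e4m3(code):
    # Decode one OCP E4M3 byte: 1 sign bit, 4 exponent bits, 3 mantissa bits.
    sign = -1.0 if code & 0x80 else 1.0
    exp = (code >> 3) & 0xF
    mant = code & 0x7
    if exp == 0xF and mant == 0x7:
        return None  # NaN encoding; E4M3 has no infinities
    if exp == 0:
        return sign * mant * 2.0 ** -9  # denormal: no implicit 1
    return sign * (1 + mant / 8.0) * 2.0 ** (exp - 7)  # normal: implicit 1

def nearest_e4m3(x):
    # Round-to-nearest by exhaustive search, preferring an even code on ties.
    best_code, best_val = None, None
    for c in range(256):
        v = decode_e4m3(c)
        if v is None:
            continue
        if best_val is None or (abs(v - x), c & 1) < (abs(best_val - x), best_code & 1):
            best_code, best_val = c, v
    return best_code
```

Values past the representable range saturate to 448 under this nearest-value rule, which is one common overflow policy for E4M3.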
2 changes: 1 addition & 1 deletion src/include/migraphx/module.hpp
@@ -214,7 +214,7 @@ struct MIGRAPHX_EXPORT module
std::vector<std::size_t> scalar_const_out_lens = {};
};

/// Compute a new ouput shape by replacing each parameter with input
/// Compute a new output shape by replacing each parameter with input
/// shapes passed in.
std::vector<shape> compute_shapes(const std::vector<shape>& inputs,
compute_shapes_options options) const;
6 changes: 3 additions & 3 deletions src/include/migraphx/op/onehot.hpp
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2024 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2026 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -40,12 +40,12 @@ namespace op {
* Called with `axis` attribute that defaults to the last output axis
* Constant depth: `onehot(indices, values), depth attribute must be set;
* Variable depth: `onehot(indices, depth, values)`;
* `indicies` as a N rank tensor of indices where value is `on_value`
* `indices` as a N rank tensor of indices where value is `on_value`
* `depth` scalar with the number of classes for the one-hot dimension
* `values` `[off_value, on_value]`
* `axis` which axis to add the one-hot dimension to
* For axis = 0 and rank(indices) = 2:
* output is A[indicies[j, k], j, k] = on_value; A[i, j, k] = off_value otherwise
* output is A[indices[j, k], j, k] = on_value; A[i, j, k] = off_value otherwise
* Can be simplified to other operators when `indices` has a static shape and
* `depth` is constant at compile-time.
*/
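The doc comment's formula for axis = 0 and rank-2 indices — A[indices[j, k], j, k] = on_value, off_value otherwise — can be sketched directly; names are illustrative, and the real operator handles arbitrary rank, axis, and variable depth:

```python
def onehot_axis0(indices, depth, off_value, on_value):
    # indices: rank-2 nested list; output shape is [depth, J, K].
    J, K = len(indices), len(indices[0])
    out = [[[off_value] * K for _ in range(J)] for _ in range(depth)]
    for j in range(J):
        for k in range(K):
            idx = indices[j][k]
            if 0 <= idx < depth:  # out-of-range indices leave off_value
                out[idx][j][k] = on_value
    return out
```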
4 changes: 2 additions & 2 deletions src/include/migraphx/op/scatter_op.hpp
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2024 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2026 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -47,7 +47,7 @@ template <typename Derived>
struct scatter_op : op_name<Derived>
{
int64_t axis = 0;
// skip scattering indicies that are out of bounds
// skip scattering indices that are out of bounds
bool skip_out_of_bounds = false;

template <class Self, class F>
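The `skip_out_of_bounds` flag commented above can be shown with a 1-D sketch; the derived scatter ops also apply a reduction per element, for which plain assignment stands in here:

```python
def scatter_1d(data, indices, updates, skip_out_of_bounds=False):
    out = list(data)
    for i, idx in enumerate(indices):
        if idx < 0:
            idx += len(out)  # negative indices count from the end
        if not 0 <= idx < len(out):
            if skip_out_of_bounds:
                continue  # silently drop out-of-bounds scatters
            raise IndexError(f"index {indices[i]} out of bounds")
        out[idx] = updates[i]
    return out
```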
4 changes: 2 additions & 2 deletions src/memory_coloring.cpp
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2025 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2026 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -180,7 +180,7 @@ struct allocation_segment
std::size_t n = 1 + (ins->get_shape().bytes() - 1) / alignment;
assert(n > 0);
std::size_t start = 0;
// Insert at end if it cant fit at the begining
// Insert at end if it can't fit at the beginning
if(segments.empty() or segments.begin()->first <= n)
{
auto it = find_gap(segments, n);
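The hunk above rounds a buffer's byte size up to whole alignment-sized chunks, then looks for a gap that can hold it. A sketch under those assumptions — `find_first_fit` is a hypothetical stand-in for `find_gap`, whose actual selection policy is not visible in this hunk:

```python
def aligned_chunks(nbytes, alignment):
    # Equivalent to ceil(nbytes / alignment) for nbytes >= 1,
    # matching the `1 + (bytes - 1) / alignment` expression above.
    return 1 + (nbytes - 1) // alignment

def find_first_fit(gaps, n):
    # gaps: list of (start, width) pairs; return the start of the
    # first gap wide enough for n chunks, or None if none fits.
    for start, width in gaps:
        if width >= n:
            return start
    return None
```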
4 changes: 2 additions & 2 deletions src/onnx/parse_attention.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2025 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2026 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -709,7 +709,7 @@ struct parse_attention : op_parser<parse_attention>
// each attention head

if(attention.padding_mode() == mask_pad::raw)
{ // Raw Mask - 0 means mask, 1 means pass through. Apply mask_filter_val to mask indicies
{ // Raw Mask - 0 means mask, 1 means pass through. Apply mask_filter_val to mask indices
// and zero otherwise
// Need to generate from 2 dims or 3 dim cases
return generate_raw_mask_per_batch(info, attention);
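The raw-mask convention in the comment above — 1 passes through, 0 receives `mask_filter_val` — amounts to adding a filter term to the attention scores. This sketch covers only that rule, not the 2-dim/3-dim per-batch mask generation the parser performs:

```python
def apply_raw_mask(scores, mask, mask_filter_val=-10000.0):
    # Add 0 where mask == 1 (pass through) and mask_filter_val where
    # mask == 0, so masked positions vanish after softmax.
    return [[s + (0.0 if m == 1 else mask_filter_val)
             for s, m in zip(score_row, mask_row)]
            for score_row, mask_row in zip(scores, mask)]
```

The default filter value is an assumption for illustration; the actual attribute value comes from the ONNX node.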
22 changes: 11 additions & 11 deletions src/onnx/parse_softmaxcrossentropyloss.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2024 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2026 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -269,7 +269,7 @@ struct parse_softmaxcrossentropyloss : op_parser<parse_softmaxcrossentropyloss>
instruction_ref handle_index_selection(const onnx_parser::node_info& info,
const instruction_ref labels) const
{
// Pick out the coordinates from the inputs to gerneate the proper indicies to gather
// Pick out the coordinates from the inputs to generate the proper indices to gather
// what will be operated on later.

// Use label indices to select weights
@@ -287,7 +287,7 @@ struct parse_softmaxcrossentropyloss : op_parser<parse_softmaxcrossentropyloss>
// Trying to replicate torch arrange() here.
std::vector<int64_t> vect_of_lit(len_val);
std::iota(vect_of_lit.begin(), vect_of_lit.end(), 0);
auto batch_dim_indicies =
auto batch_dim_indices =
info.add_literal(migraphx::shape(label_shape.type(), {len_val}), vect_of_lit);

// This is supposed to do unsq_dims = [:a] + [a + 1:]
@@ -298,13 +298,13 @@ struct parse_softmaxcrossentropyloss : op_parser<parse_softmaxcrossentropyloss>
unsq_dims.erase(it);

auto batch_dim_index_unsq = info.add_instruction(
migraphx::make_op("unsqueeze", {{"axes", unsq_dims}}), batch_dim_indicies);
migraphx::make_op("unsqueeze", {{"axes", unsq_dims}}), batch_dim_indices);

auto batch_dim_indicies_bc = info.add_instruction(
auto batch_dim_indices_bc = info.add_instruction(
migraphx::make_op("multibroadcast",
{{"out_lens", labels_unsq->get_shape().lens()}}),
batch_dim_index_unsq);
coordinate_index_literals.push_back(batch_dim_indicies_bc);
coordinate_index_literals.push_back(batch_dim_indices_bc);
}

coordinate_index_literals.push_back(labels_unsq);
@@ -433,7 +433,7 @@ struct parse_softmaxcrossentropyloss : op_parser<parse_softmaxcrossentropyloss>
}

// Index selection before loss calculation completed
auto gathernd_indicies = handle_index_selection(info, labels);
auto gathernd_indices = handle_index_selection(info, labels);

std::vector<int64_t> perm(class_size, 0);
if(is_k_dim)
@@ -444,7 +444,7 @@ struct parse_softmaxcrossentropyloss : op_parser<parse_softmaxcrossentropyloss>
scores);
}

scores = info.add_instruction(migraphx::make_op("gathernd"), scores, gathernd_indicies);
scores = info.add_instruction(migraphx::make_op("gathernd"), scores, gathernd_indices);

std::vector<int64_t> axis_list(ndims - 1, 0);
std::iota((axis_list.begin() + 1), axis_list.end(), 2);
@@ -455,11 +455,11 @@ struct parse_softmaxcrossentropyloss : op_parser<parse_softmaxcrossentropyloss>
if(is_k_dim)
weights = info.add_instruction(migraphx::make_op("transpose", {{"permutation", perm}}),
weights);
weights = info.add_instruction(migraphx::make_op("gathernd"), weights, gathernd_indicies);
weights = info.add_instruction(migraphx::make_op("gathernd"), weights, gathernd_indices);

// Do pointwise operators on the final set of indicies and scores we care about rather than
// Do pointwise operators on the final set of indices and scores we care about rather than
// before so that we're not doing a bunch of pointwise on items that aren't part of the loss
// calulation.
// calculation.
auto log_sm_scores = scores;
if(is_softmaxcrossentropy)
{
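The index selection in this file builds iota-style batch indices (the `std::iota` call replicating torch `arange()`) and then gathers one score per (batch, label) pair with `gathernd`. For the 2-D case the net effect reduces to the following sketch, with illustrative names:

```python
def select_scores(scores, labels):
    # scores: [batch][class] nested list; labels: one class id per batch row.
    batch_dim_indices = list(range(len(labels)))  # iota over the batch dim
    return [scores[i][labels[i]] for i in batch_dim_indices]
```

Doing pointwise work only on these selected entries, as the comment notes, avoids computing on scores that never enter the loss.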
6 changes: 3 additions & 3 deletions src/rewrite_rnn.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2024 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2026 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -727,7 +727,7 @@ std::vector<operation> rewrite_rnn::gru_actv_funcs(instruction_ref ins) const
auto gru_op = any_cast<op::gru>(ins->get_operator());
// before rewrite the gru operator, need to ensure
// we have 4 actv funcs, even though a user does not
// specifiy any actv func. If less than 4, use the
// specify any actv func. If less than 4, use the
// algorithm in parse_gru to make 4 actv functions
if(gru_op.direction == op::rnn_direction::bidirectional)
{
@@ -1206,7 +1206,7 @@ std::vector<operation> rewrite_rnn::lstm_actv_funcs(instruction_ref ins) const
auto lstm_op = any_cast<op::lstm>(ins->get_operator());
// before rewrite the lstm operator, need to ensure
// we have 6 actv funcs, even though a user does not
// specifiy any actv func. If less than 46, use the
// specify any actv func. If less than 6, use the
// algorithm in parse_lstm to make 6 actv functions
const auto& actv_funcs = lstm_op.actv_funcs;
std::size_t num_actv_funcs = actv_funcs.size();
4 changes: 2 additions & 2 deletions src/simplify_qdq.cpp
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2025 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2026 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -175,7 +175,7 @@ struct match_find_quantizable_ops
qop, migraphx::make_op("quant_convolution", conv_val), qop_args);
auto out_lens = dq->get_shape().lens();

// Ensure input and weight quantization paramaters are of a proper form
// Ensure input and weight quantization parameters are of a proper form
// Input is of shape [n, c, x1, ..., xn]. Only scalar quantization allowed
// Weight is of shape [k, c, y1, ... , yn]. Valid quantization axis is k

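The quantization-parameter form described above — scalar parameters for the input, and one scale per output channel k of a [k, c, y1, ..., yn] weight — corresponds to per-channel dequantization. A minimal sketch with hypothetical names, for a rank-2 weight:

```python
def dequantize_weights(w_int, scales, zero_points):
    # w_int: [k][c] integer weights; scales/zero_points: one entry per
    # output channel k (the only valid quantization axis for weights here).
    return [[(v - zero_points[k]) * scales[k] for v in row]
            for k, row in enumerate(w_int)]
```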