Required prerequisites
Motivation
mxfp/nvfp types with group scaling are widely used for efficient AI training/inference. Currently, TileLang supports dequant GEMM for mxfp/nvfp by storing the values in uint8, which requires performing manual SIMT scaling followed by T.gemm; this can be complicated for downstream users. Moreover, hardware-native block-scaled GEMM is supported on modern GPU architectures, e.g. SM100.
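To illustrate the manual path described above, here is a minimal numpy sketch of the group-scaling semantics: each group of elements along K shares one scale factor, so the kernel must expand scales and dequantize before the GEMM. The group size, shapes, and the use of small integers as stand-ins for the packed mxfp/nvfp payloads are illustrative assumptions, not TileLang's actual layout.

```python
import numpy as np

# Illustrative parameters (not TileLang's actual configuration):
# each group of GROUP elements along K shares one scale factor.
GROUP = 32
M, N, K = 4, 4, 64

rng = np.random.default_rng(0)
# Small integers stand in for the quantized mxfp/nvfp values
# that would really be packed into uint8 storage.
A_q = rng.integers(-8, 8, size=(M, K)).astype(np.float32)
B_q = rng.integers(-8, 8, size=(K, N)).astype(np.float32)
# One scale per group of GROUP elements along K.
A_s = rng.uniform(0.5, 2.0, size=(M, K // GROUP)).astype(np.float32)
B_s = rng.uniform(0.5, 2.0, size=(K // GROUP, N)).astype(np.float32)

# Manual "SIMT scaling": broadcast each scale over its group,
# dequantize, then run an ordinary GEMM.
A = A_q * np.repeat(A_s, GROUP, axis=1)
B = B_q * np.repeat(B_s, GROUP, axis=0)
C = A @ B
```

A native block-scaled GEMM instruction fuses this scale expansion into the MMA itself, which is what the proposal below targets.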
Solution
I propose exposing a unified T.blockscaled_gemm API in the frontend, which would be lowered to native block-scaled PTX instructions on supported architectures. On older architectures, it can be implemented by appending SIMT scaling operations to the MMA macros.
Alternatives
No response
Additional context
No response