[Cpp API Compatibility] Add CUDABlas related APIs#78060
SigureMo merged 64 commits into PaddlePaddle:develop
Your PR was submitted successfully. Thank you for contributing to the open-source project!
Pull request overview
This PR extends Paddle’s LibTorch/ATen compatibility layer by introducing a lightweight CUDA context interface and adding missing CUDA/CUBLAS handle accessors needed for C++ API compatibility (notably at::cuda::getCurrentCUDABlasHandle), along with a compat c10::Allocator abstraction.
Changes:
- Add the `c10::cuda::device_count()` API to the compat CUDA functions header.
- Introduce a compat `c10::Allocator` (plus `DataPtr::release_context()` support) for raw allocation APIs.
- Add `ATen/cuda/CUDAContextLight.{h,cpp}` and switch `ATen/cuda/CUDAContext.h` to include the new light header; wire the new `.cpp` into the compat build.
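To make the link between `DataPtr::release_context()` and the raw allocation APIs concrete, here is a minimal, host-only sketch. It is not the code added in this PR: every name in the `sketch` namespace is a hypothetical stand-in, the allocator is backed by `malloc` rather than GPU memory, and the pattern (raw allocation takes ownership of the context away from the smart pointer so the deleter does not run) is assumed to follow the upstream c10 design.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-ins for illustration; not the Paddle compat classes.
namespace sketch {

using DeleterFn = void (*)(void*);

// Pairs a data pointer with an owned, opaque context.
// The deleter runs on the context, not on the data pointer.
class DataPtr {
 public:
  DataPtr() : data_(nullptr), ctx_(nullptr, &noop) {}
  DataPtr(void* data, void* ctx, DeleterFn deleter)
      : data_(data), ctx_(ctx, deleter ? deleter : &noop) {}

  void* get() const { return data_; }
  void* get_context() const { return ctx_.get(); }

  // Gives the context back to the caller WITHOUT running the deleter.
  void* release_context() { return ctx_.release(); }

 private:
  static void noop(void*) {}
  void* data_;
  std::unique_ptr<void, DeleterFn> ctx_;
};

struct Allocator {
  virtual ~Allocator() = default;
  virtual DataPtr allocate(std::size_t n) = 0;
  virtual DeleterFn raw_deleter() const { return nullptr; }

  // Raw API: only valid when the data pointer doubles as the context.
  void* raw_allocate(std::size_t n) {
    DataPtr p = allocate(n);
    assert(p.get() == p.get_context());
    return p.release_context();  // ownership moves to the caller
  }

  void raw_deallocate(void* ptr) {
    DeleterFn d = raw_deleter();
    assert(d != nullptr);
    d(ptr);
  }
};

// Counts blocks that are currently allocated, so leaks are observable.
inline int& live_blocks() {
  static int n = 0;
  return n;
}

// Assumption: malloc stands in for the device allocator.
struct MallocAllocator : Allocator {
  static void free_block(void* p) {
    std::free(p);
    --live_blocks();
  }
  DataPtr allocate(std::size_t n) override {
    void* p = std::malloc(n);
    if (p != nullptr) ++live_blocks();
    return DataPtr(p, p, &free_block);
  }
  DeleterFn raw_deleter() const override { return &free_block; }
};

}  // namespace sketch
```

Without `release_context()`, the temporary `DataPtr` inside `raw_allocate` would free the block as soon as it went out of scope, which is why the two features plausibly arrive together.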
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| paddle/phi/api/include/compat/c10/cuda/CUDAFunctions.h | Adds device_count() to the compat c10 CUDA API surface. |
| paddle/phi/api/include/compat/c10/core/Allocator.h | Adds a compat c10::Allocator interface and DataPtr::release_context(). |
| paddle/phi/api/include/compat/CMakeLists.txt | Adds the new CUDAContextLight.cpp to the compat compilation sources. |
| paddle/phi/api/include/compat/ATen/cuda/CUDAContextLight.h | Declares lightweight at::cuda CUDA context APIs, including cuBLAS handle getters. |
| paddle/phi/api/include/compat/ATen/cuda/CUDAContextLight.cpp | Implements the lightweight CUDA context APIs via phi::GPUContext. |
| paddle/phi/api/include/compat/ATen/cuda/CUDAContext.h | Redirects CUDAContext to the new lightweight header. |
Codecov Report
Coverage diff for develop, #78060:
| Metric | Value |
|---|---|
| Coverage | 95.18% |
| Files | 16 |
| Lines | 498 |
| Branches | 0 |
| Hits | 474 |
| Misses | 24 |
| Partials | 0 |
c807df9 to 8ebc8c8
7a5c35e to c1cbf14
1067c2f to adb8b58
/re-run windows-gpu
#pragma once
#include <cublasXt.h>
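If this line is kept, the Linux-DCU build fails at compile time with a redefinition error, because it conflicts with #include <cublas_v2.h>. The original plan was to swap the include order, but cpplint changes the order straight back. It then turned out that removing the line outright causes no problems: cublasXt.h is not needed here for now, and it can be included later when it is actually required. See the commits related to the DCU fix for details.
This PR really was hard work.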
/re-run all-failed
PR Category
Execute Infrastructure
PR Types
New features
Description
- Add the `at::cuda::getCurrentCUDABlasHandle` and `at::cuda::blas::gemm<T>` interfaces
- Add the `Allocator` struct
For documentation, see PFCCLab/PaddleCppAPITest#46
Does this change numerical precision?
No
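For readers unfamiliar with the argument order of a BLAS-style `gemm<T>`, the following CPU reference may help when checking results from the new interface. It is a sketch under assumptions: `sketch::reference_gemm` is a hypothetical helper, not part of this PR, and it assumes the compat `at::cuda::blas::gemm<T>` follows the cuBLAS convention of column-major storage computing C = alpha * op(A) * op(B) + beta * C, where op(A) is m x k and op(B) is k x n.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference; names are illustrative only.
namespace sketch {

// Column-major GEMM: C = alpha * op(A) * op(B) + beta * C.
// transa / transb: 'n' keeps the matrix as stored, 't' transposes it.
// lda, ldb, ldc are the leading dimensions (column strides) in elements.
template <typename T>
void reference_gemm(char transa, char transb,
                    std::int64_t m, std::int64_t n, std::int64_t k,
                    T alpha, const T* a, std::int64_t lda,
                    const T* b, std::int64_t ldb,
                    T beta, T* c, std::int64_t ldc) {
  const bool ta = (transa == 't' || transa == 'T');
  const bool tb = (transb == 't' || transb == 'T');
  for (std::int64_t j = 0; j < n; ++j) {
    for (std::int64_t i = 0; i < m; ++i) {
      T acc = T(0);
      for (std::int64_t p = 0; p < k; ++p) {
        // op(A)(i, p): element (i, p) of A, or (p, i) when transposed.
        const T av = ta ? a[p + i * lda] : a[i + p * lda];
        // op(B)(p, j): element (p, j) of B, or (j, p) when transposed.
        const T bv = tb ? b[j + p * ldb] : b[p + j * ldb];
        acc += av * bv;
      }
      c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
    }
  }
}

}  // namespace sketch
```

A typical use is to copy the device result back to the host and compare it element by element against this reference with a tolerance suited to `T`. Row-major callers usually swap the roles of A and B instead of transposing data, since a row-major matrix reads as its transpose in column-major storage.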