[Cpp API Compatibility] Add pin_memory support and fix from_blob#78255
[Cpp API Compatibility] Add pin_memory support and fix from_blob#78255SigureMo merged 11 commits intoPaddlePaddle:developfrom
pin_memory support and fix from_blob#78255Conversation
|
你的PR提交成功,感谢你对开源项目的贡献! |
There was a problem hiding this comment.
Pull request overview
This PR fixes several compatibility layer issues between the PyTorch (ATen) API and the Paddle backend: it refactors from_blob to use a builder-pattern TensorMaker/for_blob class that infers device placement from the data pointer (instead of defaulting to CPU), adds pin_memory support to the empty API, and adds a non-contiguous warning to abs. It also adds corresponding test files and renames the squeeze test to follow the ATen_* naming convention.
Changes:
from_blobfix: Replaces directpaddle::from_blobcalls with aTensorMakerbuilder class; when no device is specified, uses an UNDEFINED place sopaddle::from_blobcan auto-detect device from the pointer.emptyfix: Addspin_memorysupport by allocating on CPU and copying toGPUPinnedPlace/XPUPinnedPlace; removes the old error that rejectedpin_memory=true.- New tests & test rename: Adds
ATen_empty_test.cc,ATen_from_blob_test.cc,ATen_cuda_test.ccand renamescompat_squeeze_testtoATen_squeeze_test.cc.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
paddle/phi/api/include/compat/ATen/ops/from_blob.h |
Introduces TensorMaker builder class and refactors all from_blob overloads to use it with UNDEFINED place for auto-detection |
paddle/phi/api/include/compat/ATen/ops/empty.h |
Adds pin_memory support for both TensorOptions and 6-arg overloads |
paddle/phi/api/include/compat/ATen/ops/abs.h |
Adds LOG(WARNING) for non-contiguous tensor inputs to abs and abs_ |
paddle/phi/api/include/compat/c10/core/TensorOptions.h |
Adds variadic device(Args&&...) template overload for constructing Device in-place |
test/cpp/compat/ATen_squeeze_test.cc |
New test file for squeeze/unsqueeze operations (replacing compat_squeeze_test) |
test/cpp/compat/ATen_from_blob_test.cc |
New tests for from_blob covering CPU/GPU place detection and deleter behavior |
test/cpp/compat/ATen_empty_test.cc |
New tests for at::empty covering basic shapes, dtype, and pinned memory |
test/cpp/compat/ATen_cuda_test.cc |
New tests verifying at::Tensor::cuda() transfer behavior |
test/cpp/compat/CMakeLists.txt |
Registers the three new test targets and renames the squeeze test |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "The deleter and context arguments are mutually exclusive."); | ||
|
|
||
| PD_CHECK(!storage_offset_.has_value() || storage_offset_.value() == 0, | ||
| "storage_offset` should be zero."); |
There was a problem hiding this comment.
The error message string on line 81 has a mismatched backtick: "storage_offset should be zero.". The opening backtick before storage_offset is missing. It should read: `` "storage_offset` should be zero." ``
| std::size_t computeStorageSize() const noexcept; | ||
|
|
||
| DataPtr makeDataPtrFromDeleter() noexcept; | ||
|
|
||
| DataPtr makeDataPtrFromContext() noexcept; | ||
|
|
||
| IntArrayRef makeTempSizes() const noexcept; | ||
|
|
||
| void* data_; | ||
| IntArrayRef sizes_; | ||
| OptionalIntArrayRef strides_; | ||
| std::optional<int64_t> storage_offset_; | ||
| std::function<void(void*)> deleter_; | ||
| std::unique_ptr<void, ContextDeleter> ctx_{nullptr, detail::noopDelete}; | ||
| std::optional<Device> device_; | ||
| TensorOptions opts_; | ||
| bool resizeable_{}; |
There was a problem hiding this comment.
The four private methods computeStorageSize(), makeDataPtrFromDeleter(), makeDataPtrFromContext(), and makeTempSizes() are declared in TensorMaker but have no definitions anywhere in the codebase, and are never called inside make_tensor(). Similarly, the resizeable_ member is set by resizeable_storage() but never used in make_tensor(). These stubs inflate the class definition and would cause linker errors if anyone attempts to invoke them. They should either be implemented to reflect the intended behavior or removed to avoid confusion.
|
/re-run all-failed |
|
/re-run h-coverage |
|
/re-run all-failed |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## develop #78255 +/- ##
==========================================
Coverage ? 92.57%
==========================================
Files ? 12
Lines ? 202
Branches ? 0
==========================================
Hits ? 187
Misses ? 15
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/re-run all-failed |
1 similar comment
|
/re-run all-failed |
d62b95b to
7f06fed
Compare
|
/re-run all-failed |
1 similar comment
|
/re-run all-failed |
1b5cb31 to
fd94824
Compare
|
/re-run all-failed |
SigureMo
left a comment
There was a problem hiding this comment.
整体没啥问题,用 TensorOptions 替代之前每个 API 单独写实现绝对是向着正确方向走,之前主要是 TensorOptions 不够完善的临时实现,没有造成回归的话,就没大问题
只是现在对于 pinned_memory 的处理想确认下,torch 那边是否也是这样实现的?这里性能上可能会有些许差异,不过这不阻塞 PR 合入,正确性优先
| #include <optional> | ||
| #include <string_view> | ||
|
|
||
| #include "glog/logging.h" |
There was a problem hiding this comment.
头文件里不要用 glog,如果一定要用,放到 .cpp 里,因为我们并不会将第三方库头文件暴露出去(只暴露 pybind)
| paddle::experimental::full({}, 1, phi::DataType::FLOAT64), | ||
| compat::_PD_AtenScalarTypeToPhiDataType(options.dtype()), | ||
| phi::CPUPlace()); | ||
| return dense.copy_to(pinned_place, /*blocking=*/true); |
There was a problem hiding this comment.
paddle 的 arange 接口原生不支持 pinned place 是么?torch 那边是原生支持的是么?
There was a problem hiding this comment.
ok,看起来可能涉及一些机制问题,之前好像没有过给 pinned place 注册 kernel 的逻辑,我们先组合实现吧,先包正确性,性能问题先记着吧
| "`layout` must be Sparse for sparse_coo_tensor."); | ||
| PD_CHECK(!(pin_memory.has_value() && pin_memory.value() != false), | ||
| "`pin_memory` other than False is not supported now."); | ||
| (void)dtype; |
There was a problem hiding this comment.
这里不使用 dtype 的行为和 torch 是对齐的么?如果这里没有对齐,应该在用户传 dtype 时报个错
| @@ -407,19 +427,33 @@ class Tensor : public TensorBase { | |||
| return *this; | |||
| "`pin_memory` other than False is not supported now."); | ||
|
|
||
| return sparse_csr_tensor(crow_indices, col_indices, values, size); | ||
| (void)dtype; |
|
/re-run all-failed |
|
|
| @@ -1,4 +1,4 @@ | |||
| // Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved. | |||
| // Copyright (c) 2026 PaddlePaddle Authors. All Rights Reserved. | |||
There was a problem hiding this comment.
之后可以用 git commit -m "trigger CI" --allow-empty 创建空的 commit,可以不用找地方改 😂
|
/re-run all-failed |
@SigureMo 这是因为删了 python 3.9 的 CI 导致的错误吗,还是说镜像太老了,需要更新一下 |
|
/re-run all-failed |
只有 SOT 流水线删掉 3.9 监控了,不会影响其他流水线,rerun 也不行是么? |
|
rerun过一次,test_shared_state_dict.py 之前的pr也是经常报错 |
| #include <optional> | ||
| #include <utility> | ||
| #include <vector> | ||
| #include "glog/logging.h" |
There was a problem hiding this comment.
0. You must have one RD (jiahy0825, zyfncg, YuanRisheng or heavyrain-lzy) approval for including "gflags/gflags.h" or "glog/logging.h" headerfile in paddle/phi headerfiles( paddle/phi/api/include/compat/ATen/core/TensorBody.h). Recommend including third party headers in phi source files(*.cc) instead of phi headerfiles(*.h). Because if phi headerfiles include third party headers like "gflags.h" or "logging.h", error might occur when outside developers use phi headerfiles directly.
之前还没注意到,这里也得改一下~
CI 自定义算子的单测怎么拦不住这个,有点奇怪,感觉编译期就应该挂
There was a problem hiding this comment.
改好了,compat扫了一遍LOG(WARNING),应该没有了
pin_memory support and fix from_blob
pin_memory support and fix from_blobpin_memory support and fix from_blob
|
/re-run all-failed |





PR Category
Execute Infrastructure
PR Types
Bug fixes
Description
对于 from_blob 接口:修复了 tensor 创建的 place,原先采用如果没有传入 device,就使用默认的 device,现在使用 UNDEFINED 类型,创建的位置由 paddle::from_blob 通过传入的指针推断
对于 empty 等负责创建 tensor 的接口:新增对 pin_memory 参数的支持,通过工具函数获取 pinned_place
对于 abs 接口:在对非连续 tensor 进行 abs 操作时,明确告知用户可能导致歧义,方便用户进行逻辑错误排查
对于 cuda 接口:验证其可用性
对于 TensorBody.h 的 pin_memory 接口:对齐 Pytorch 行为,在 PFCCLab/PaddleCppAPITest#34 中更新文档
是否引起精度变化
否