Skip to content

[Cpp API Compatibility] Add pin_memory support and fix from_blob#78255

Merged
SigureMo merged 11 commits intoPaddlePaddle:developfrom
youge325:cAlign
Mar 19, 2026
Merged

[Cpp API Compatibility] Add pin_memory support and fix from_blob#78255
SigureMo merged 11 commits intoPaddlePaddle:developfrom
youge325:cAlign

Conversation

@youge325
Copy link
Copy Markdown
Contributor

@youge325 youge325 commented Mar 10, 2026

PR Category

Execute Infrastructure

PR Types

Bug fixes

Description

对于 from_blob 接口:修复了 tensor 创建的 place,原先采用如果没有传入 device,就使用默认的 device,现在使用 UNDEFINED 类型,创建的位置由 paddle::from_blob 通过传入的指针推断

对于 empty 等负责创建 tensor 的接口:新增对 pin_memory 参数的支持,通过工具函数获取 pinned_place

对于 abs 接口:在对非连续 tensor 进行 abs 操作时,明确告知用户可能导致歧义,方便用户进行逻辑错误排查

对于 cuda 接口:验证其可用性

对于 TensorBody.h 的 pin_memory 接口:对齐 Pytorch 行为,在 PFCCLab/PaddleCppAPITest#34 中更新文档

是否引起精度变化

Copilot AI review requested due to automatic review settings March 10, 2026 13:33
@paddle-bot
Copy link
Copy Markdown

paddle-bot Bot commented Mar 10, 2026

你的PR提交成功,感谢你对开源项目的贡献!
请关注后续CI自动化测试结果,详情请参考Paddle-CI手册
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot paddle-bot Bot added the contributor External developers label Mar 10, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes several compatibility layer issues between the PyTorch (ATen) API and the Paddle backend: it refactors from_blob to use a builder-pattern TensorMaker/for_blob class that infers device placement from the data pointer (instead of defaulting to CPU), adds pin_memory support to the empty API, and adds a non-contiguous warning to abs. It also adds corresponding test files and renames the squeeze test to follow the ATen_* naming convention.

Changes:

  • from_blob fix: Replaces direct paddle::from_blob calls with a TensorMaker builder class; when no device is specified, uses an UNDEFINED place so paddle::from_blob can auto-detect device from the pointer.
  • empty fix: Adds pin_memory support by allocating on CPU and copying to GPUPinnedPlace/XPUPinnedPlace; removes the old error that rejected pin_memory=true.
  • New tests & test rename: Adds ATen_empty_test.cc, ATen_from_blob_test.cc, ATen_cuda_test.cc and renames compat_squeeze_test to ATen_squeeze_test.cc.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
paddle/phi/api/include/compat/ATen/ops/from_blob.h Introduces TensorMaker builder class and refactors all from_blob overloads to use it with UNDEFINED place for auto-detection
paddle/phi/api/include/compat/ATen/ops/empty.h Adds pin_memory support for both TensorOptions and 6-arg overloads
paddle/phi/api/include/compat/ATen/ops/abs.h Adds LOG(WARNING) for non-contiguous tensor inputs to abs and abs_
paddle/phi/api/include/compat/c10/core/TensorOptions.h Adds variadic device(Args&&...) template overload for constructing Device in-place
test/cpp/compat/ATen_squeeze_test.cc New test file for squeeze/unsqueeze operations (replacing compat_squeeze_test)
test/cpp/compat/ATen_from_blob_test.cc New tests for from_blob covering CPU/GPU place detection and deleter behavior
test/cpp/compat/ATen_empty_test.cc New tests for at::empty covering basic shapes, dtype, and pinned memory
test/cpp/compat/ATen_cuda_test.cc New tests verifying at::Tensor::cuda() transfer behavior
test/cpp/compat/CMakeLists.txt Registers the three new test targets and renames the squeeze test

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

"The deleter and context arguments are mutually exclusive.");

PD_CHECK(!storage_offset_.has_value() || storage_offset_.value() == 0,
"storage_offset` should be zero.");
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message string on line 81 has a mismatched backtick: "storage_offset should be zero.". The opening backtick before storage_offset is missing. It should read: `` "storage_offset` should be zero." ``

Copilot uses AI. Check for mistakes.
Comment on lines +138 to +154
std::size_t computeStorageSize() const noexcept;

DataPtr makeDataPtrFromDeleter() noexcept;

DataPtr makeDataPtrFromContext() noexcept;

IntArrayRef makeTempSizes() const noexcept;

void* data_;
IntArrayRef sizes_;
OptionalIntArrayRef strides_;
std::optional<int64_t> storage_offset_;
std::function<void(void*)> deleter_;
std::unique_ptr<void, ContextDeleter> ctx_{nullptr, detail::noopDelete};
std::optional<Device> device_;
TensorOptions opts_;
bool resizeable_{};
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The four private methods computeStorageSize(), makeDataPtrFromDeleter(), makeDataPtrFromContext(), and makeTempSizes() are declared in TensorMaker but have no definitions anywhere in the codebase, and are never called inside make_tensor(). Similarly, the resizeable_ member is set by resizeable_storage() but never used in make_tensor(). These stubs inflate the class definition and would cause linker errors if anyone attempts to invoke them. They should either be implemented to reflect the intended behavior or removed to avoid confusion.

Copilot uses AI. Check for mistakes.
@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

@youge325
Copy link
Copy Markdown
Contributor Author

/re-run h-coverage

@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 92.57426% with 15 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@09196be). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...ddle/phi/api/include/compat/ATen/core/TensorBody.h 58.33% 5 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/from_blob.h 91.37% 5 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/empty.h 85.71% 2 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/full.h 92.85% 1 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/ones.h 92.85% 1 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/zeros.h 93.75% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #78255   +/-   ##
==========================================
  Coverage           ?   92.57%           
==========================================
  Files              ?       12           
  Lines              ?      202           
  Branches           ?        0           
==========================================
  Hits               ?      187           
  Misses             ?       15           
  Partials           ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

1 similar comment
@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

@youge325 youge325 force-pushed the cAlign branch 2 times, most recently from d62b95b to 7f06fed Compare March 12, 2026 04:36
@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

1 similar comment
@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

@youge325 youge325 force-pushed the cAlign branch 2 times, most recently from 1b5cb31 to fd94824 Compare March 15, 2026 03:08
@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

@youge325 youge325 changed the title [Cpp API Compatibility] Fix empty() and from_blob() and Verify cuda() [Cpp API Compatibility] add pin_memory support and fix from_blob Mar 15, 2026
Copy link
Copy Markdown
Member

@SigureMo SigureMo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

整体没啥问题,用 TensorOptions 替代之前每个 API 单独写实现绝对是向着正确方向走,之前主要是 TensorOptions 不够完善的临时实现,没有造成回归的话,就没大问题

只是现在对于 pinned_memory 的处理想确认下,torch 那边是否也是这样实现的?这里性能上可能会有些许差异,不过这不阻塞 PR 合入,正确性优先

#include <optional>
#include <string_view>

#include "glog/logging.h"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

头文件里不要用 glog,如果一定要用,放到 .cpp 里,因为我们并不会将第三方库头文件暴露出去(只暴露 pybind)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

好的

paddle::experimental::full({}, 1, phi::DataType::FLOAT64),
compat::_PD_AtenScalarTypeToPhiDataType(options.dtype()),
phi::CPUPlace());
return dense.copy_to(pinned_place, /*blocking=*/true);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

paddle 的 arange 接口原生不支持 pinned place 是么?torch 那边是原生支持的是么?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

我试试能不能编译,代码逻辑上是没问题的,Paddle arange 支持传入 Place

Copy link
Copy Markdown
Contributor Author

@youge325 youge325 Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image image

没有办法,Paddle 分发内核的时候会直接回落到 CPU 分支,Pytorch 是原生支持的,arange 委托给 empty 实现

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok,看起来可能涉及一些机制问题,之前好像没有过给 pinned place 注册 kernel 的逻辑,我们先组合实现吧,先包正确性,性能问题先记着吧

"`layout` must be Sparse for sparse_coo_tensor.");
PD_CHECK(!(pin_memory.has_value() && pin_memory.value() != false),
"`pin_memory` other than False is not supported now.");
(void)dtype;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里不使用 dtype 的行为和 torch 是对齐的么?如果这里没有对齐,应该在用户传 dtype 时报个错

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里确实是没对齐,我再对齐下

@@ -407,19 +427,33 @@ class Tensor : public TensorBase {
return *this;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个位置咋这么飘呢……找时间得看看为啥……

@youge325
Copy link
Copy Markdown
Contributor Author

只是现在对于 pinned_memory 的处理想确认下,torch 那边是否也是这样实现的?

image

Pytorch 对 pin_memory 接口就是创建一个 storage_offset 为 0 的 Tensor,复制原 Tensor的数据然后返回 ,Paddle 是抽象了一个 PinnedPlace,从数据层面看 pin_memory 之后两者都是在内存上的数据,但是 Pytorch 仅支持 CPU Tensor 进行 pin_memory 操作,因为对 GPU Tensor 进行 pin_memory 操作相当于将 Tensor 从显存 copy_to 内存,Pytorch认为这没有意义,Paddle 可能是出于防止out of memory的目的,允许 Tensor 从显存换出,这里对齐 Pytorch 的行为,仅支持 CPU Tensor 进行 pin_memory 操作

"`pin_memory` other than False is not supported now.");

return sparse_csr_tensor(crow_indices, col_indices, values, size);
(void)dtype;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个是不是和上面那个一样?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

对,我再改下

@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

@SigureMo
Copy link
Copy Markdown
Member

CI / Clone 不知道为啥没触发,rerun 不太行了,可能得用一个空的 commit 重新触发下 CI

SigureMo
SigureMo previously approved these changes Mar 18, 2026
@@ -1,4 +1,4 @@
// Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
// Copyright (c) 2026 PaddlePaddle Authors. All Rights Reserved.
Copy link
Copy Markdown
Member

@SigureMo SigureMo Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

之后可以用 git commit -m "trigger CI" --allow-empty 创建空的 commit,可以不用找地方改 😂

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

懂了😂

@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

@youge325
Copy link
Copy Markdown
Contributor Author

youge325 commented Mar 18, 2026

image

@SigureMo 这是因为删了 python 3.9 的 CI 导致的错误吗,还是说镜像太老了,需要更新一下

@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

@SigureMo
Copy link
Copy Markdown
Member

SigureMo commented Mar 18, 2026

这是因为删了 python 3.9 的 CI 导致的错误吗

只有 SOT 流水线删掉 3.9 监控了,不会影响其他流水线,rerun 也不行是么?

@youge325
Copy link
Copy Markdown
Contributor Author

rerun过一次,test_shared_state_dict.py 之前的pr也是经常报错

#include <optional>
#include <utility>
#include <vector>
#include "glog/logging.h"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0. You must have one RD (jiahy0825, zyfncg, YuanRisheng or heavyrain-lzy) approval for including "gflags/gflags.h" or "glog/logging.h" headerfile in paddle/phi headerfiles( paddle/phi/api/include/compat/ATen/core/TensorBody.h). Recommend including third party headers in phi source files(*.cc) instead of phi headerfiles(*.h). Because if phi headerfiles include third party headers like "gflags.h" or "logging.h", error might occur when outside developers use phi headerfiles directly.

之前还没注意到,这里也得改一下~

CI 自定义算子的单测怎么拦不住这个,有点奇怪,感觉编译期就应该挂

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

改好了,compat扫了一遍LOG(WARNING),应该没有了

@SigureMo SigureMo changed the title [Cpp API Compatibility] add pin_memory support and fix from_blob [Cpp API Compatibility] add pin_memory support and fix from_blob Mar 19, 2026
@SigureMo SigureMo changed the title [Cpp API Compatibility] add pin_memory support and fix from_blob [Cpp API Compatibility] Add pin_memory support and fix from_blob Mar 19, 2026
@youge325
Copy link
Copy Markdown
Contributor Author

/re-run all-failed

@SigureMo SigureMo merged commit 6d546ca into PaddlePaddle:develop Mar 19, 2026
129 of 135 checks passed
@youge325 youge325 deleted the cAlign branch March 19, 2026 07:13
co63oc pushed a commit to co63oc/Paddle that referenced this pull request Mar 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor External developers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants