[Cpp API Compatibility] Add `pin_memory` support and fix `from_blob` by youge325 · Pull Request #78255 · PaddlePaddle/Paddle

youge325 · 2026-03-10T13:33:19Z

PR Category

Execute Infrastructure

PR Types

Bug fixes

Description

对于 from_blob 接口：修复了 tensor 创建的 place，原先采用如果没有传入 device，就使用默认的 device，现在使用 UNDEFINED 类型，创建的位置由 paddle::from_blob 通过传入的指针推断

对于 empty 等负责创建 tensor 的接口：新增对 pin_memory 参数的支持，通过工具函数获取 pinned_place

对于 abs 接口：在对非连续 tensor 进行 abs 操作时，明确告知用户可能导致歧义，方便用户进行逻辑错误排查

对于 cuda 接口：验证其可用性

对于 TensorBody.h 的 pin_memory 接口：对齐 Pytorch 行为，在 PFCCLab/PaddleCppAPITest#34 中更新文档

是否引起精度变化

否

paddle-bot · 2026-03-10T13:33:30Z

你的PR提交成功，感谢你对开源项目的贡献!
请关注后续CI自动化测试结果，详情请参考Paddle-CI手册。
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

Copilot

Pull request overview

This PR fixes several compatibility layer issues between the PyTorch (ATen) API and the Paddle backend: it refactors from_blob to use a builder-pattern TensorMaker/for_blob class that infers device placement from the data pointer (instead of defaulting to CPU), adds pin_memory support to the empty API, and adds a non-contiguous warning to abs. It also adds corresponding test files and renames the squeeze test to follow the ATen_* naming convention.

Changes:

from_blob fix: Replaces direct paddle::from_blob calls with a TensorMaker builder class; when no device is specified, uses an UNDEFINED place so paddle::from_blob can auto-detect device from the pointer.
empty fix: Adds pin_memory support by allocating on CPU and copying to GPUPinnedPlace/XPUPinnedPlace; removes the old error that rejected pin_memory=true.
New tests & test rename: Adds ATen_empty_test.cc, ATen_from_blob_test.cc, ATen_cuda_test.cc and renames compat_squeeze_test to ATen_squeeze_test.cc.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`paddle/phi/api/include/compat/ATen/ops/from_blob.h`	Introduces `TensorMaker` builder class and refactors all `from_blob` overloads to use it with UNDEFINED place for auto-detection
`paddle/phi/api/include/compat/ATen/ops/empty.h`	Adds `pin_memory` support for both TensorOptions and 6-arg overloads
`paddle/phi/api/include/compat/ATen/ops/abs.h`	Adds `LOG(WARNING)` for non-contiguous tensor inputs to `abs` and `abs_`
`paddle/phi/api/include/compat/c10/core/TensorOptions.h`	Adds variadic `device(Args&&...)` template overload for constructing Device in-place
`test/cpp/compat/ATen_squeeze_test.cc`	New test file for squeeze/unsqueeze operations (replacing `compat_squeeze_test`)
`test/cpp/compat/ATen_from_blob_test.cc`	New tests for `from_blob` covering CPU/GPU place detection and deleter behavior
`test/cpp/compat/ATen_empty_test.cc`	New tests for `at::empty` covering basic shapes, dtype, and pinned memory
`test/cpp/compat/ATen_cuda_test.cc`	New tests verifying `at::Tensor::cuda()` transfer behavior
`test/cpp/compat/CMakeLists.txt`	Registers the three new test targets and renames the squeeze test

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-10T13:41:28Z

+             "The deleter and context arguments are mutually exclusive.");
+
+    PD_CHECK(!storage_offset_.has_value() || storage_offset_.value() == 0,
+             "storage_offset` should be zero.");


The error message string on line 81 has a mismatched backtick: "storage_offset should be zero.". The opening backtick before storage_offset is missing. It should read: `` "storage_offset` should be zero." ``

Copilot · 2026-03-10T13:41:28Z

+  std::size_t computeStorageSize() const noexcept;
+
+  DataPtr makeDataPtrFromDeleter() noexcept;
+
+  DataPtr makeDataPtrFromContext() noexcept;
+
+  IntArrayRef makeTempSizes() const noexcept;
+
+  void* data_;
+  IntArrayRef sizes_;
+  OptionalIntArrayRef strides_;
+  std::optional<int64_t> storage_offset_;
+  std::function<void(void*)> deleter_;
+  std::unique_ptr<void, ContextDeleter> ctx_{nullptr, detail::noopDelete};
+  std::optional<Device> device_;
+  TensorOptions opts_;
+  bool resizeable_{};


The four private methods computeStorageSize(), makeDataPtrFromDeleter(), makeDataPtrFromContext(), and makeTempSizes() are declared in TensorMaker but have no definitions anywhere in the codebase, and are never called inside make_tensor(). Similarly, the resizeable_ member is set by resizeable_storage() but never used in make_tensor(). These stubs inflate the class definition and would cause linker errors if anyone attempts to invoke them. They should either be implemented to reflect the intended behavior or removed to avoid confusion.

youge325 · 2026-03-11T00:17:17Z

/re-run all-failed

youge325 · 2026-03-11T01:12:40Z

/re-run h-coverage

youge325 · 2026-03-11T02:51:33Z

/re-run all-failed

codecov-commenter · 2026-03-11T02:59:17Z

Codecov Report

❌ Patch coverage is 92.57426% with 15 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@09196be). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
...ddle/phi/api/include/compat/ATen/core/TensorBody.h	58.33%	5 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/from_blob.h	91.37%	5 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/empty.h	85.71%	2 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/full.h	92.85%	1 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/ones.h	92.85%	1 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/zeros.h	93.75%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop   #78255   +/-   ##
==========================================
  Coverage           ?   92.57%           
==========================================
  Files              ?       12           
  Lines              ?      202           
  Branches           ?        0           
==========================================
  Hits               ?      187           
  Misses             ?       15           
  Partials           ?        0

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

youge325 · 2026-03-11T03:28:10Z

/re-run all-failed

youge325 · 2026-03-12T02:37:46Z

/re-run all-failed

youge325 · 2026-03-12T06:54:52Z

/re-run all-failed

youge325 · 2026-03-13T01:46:28Z

/re-run all-failed

youge325 · 2026-03-15T03:27:06Z

/re-run all-failed

SigureMo

整体没啥问题，用 TensorOptions 替代之前每个 API 单独写实现绝对是向着正确方向走，之前主要是 TensorOptions 不够完善的临时实现，没有造成回归的话，就没大问题

只是现在对于 pinned_memory 的处理想确认下，torch 那边是否也是这样实现的？这里性能上可能会有些许差异，不过这不阻塞 PR 合入，正确性优先

SigureMo · 2026-03-17T02:49:16Z

 #include <optional>
 #include <string_view>

+#include "glog/logging.h"


头文件里不要用 glog，如果一定要用，放到 .cpp 里，因为我们并不会将第三方库头文件暴露出去（只暴露 pybind）

SigureMo · 2026-03-17T02:54:29Z

+        paddle::experimental::full({}, 1, phi::DataType::FLOAT64),
+        compat::_PD_AtenScalarTypeToPhiDataType(options.dtype()),
+        phi::CPUPlace());
+    return dense.copy_to(pinned_place, /*blocking=*/true);


paddle 的 arange 接口原生不支持 pinned place 是么？torch 那边是原生支持的是么？

我试试能不能编译，代码逻辑上是没问题的，Paddle arange 支持传入 Place

没有办法，Paddle 分发内核的时候会直接回落到 CPU 分支，Pytorch 是原生支持的，arange 委托给 empty 实现

ok，看起来可能涉及一些机制问题，之前好像没有过给 pinned place 注册 kernel 的逻辑，我们先组合实现吧，先包正确性，性能问题先记着吧

SigureMo · 2026-03-17T03:14:26Z

           "`layout` must be Sparse for sparse_coo_tensor.");
-  PD_CHECK(!(pin_memory.has_value() && pin_memory.value() != false),
-           "`pin_memory` other than False is not supported now.");
+  (void)dtype;


这里不使用 dtype 的行为和 torch 是对齐的么？如果这里没有对齐，应该在用户传 dtype 时报个错

这里确实是没对齐，我再对齐下

SigureMo · 2026-03-17T03:18:05Z

@@ -407,19 +427,33 @@ class Tensor : public TensorBase {
      return *this;


这个位置咋这么飘呢……找时间得看看为啥……

youge325 · 2026-03-17T04:14:24Z

只是现在对于 pinned_memory 的处理想确认下，torch 那边是否也是这样实现的？

Pytorch 对 pin_memory 接口就是创建一个 storage_offset 为 0 的 Tensor，复制原 Tensor的数据然后返回，Paddle 是抽象了一个 PinnedPlace，从数据层面看 pin_memory 之后两者都是在内存上的数据，但是 Pytorch 仅支持 CPU Tensor 进行 pin_memory 操作，因为对 GPU Tensor 进行 pin_memory 操作相当于将 Tensor 从显存 copy_to 内存，Pytorch认为这没有意义，Paddle 可能是出于防止out of memory的目的，允许 Tensor 从显存换出，这里对齐 Pytorch 的行为，仅支持 CPU Tensor 进行 pin_memory 操作

SigureMo · 2026-03-17T13:18:06Z

-           "`pin_memory` other than False is not supported now.");
-
-  return sparse_csr_tensor(crow_indices, col_indices, values, size);
+  (void)dtype;


这个是不是和上面那个一样？

对，我再改下

youge325 · 2026-03-18T00:22:17Z

/re-run all-failed

SigureMo · 2026-03-18T02:36:02Z

CI / Clone 不知道为啥没触发，rerun 不太行了，可能得用一个空的 commit 重新触发下 CI

SigureMo · 2026-03-18T03:16:50Z

@@ -1,4 +1,4 @@
-// Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
+// Copyright (c) 2026 PaddlePaddle Authors. All Rights Reserved.


之后可以用 git commit -m "trigger CI" --allow-empty 创建空的 commit，可以不用找地方改 😂

youge325 · 2026-03-18T09:28:52Z

/re-run all-failed

youge325 · 2026-03-18T12:53:34Z

@SigureMo 这是因为删了 python 3.9 的 CI 导致的错误吗，还是说镜像太老了，需要更新一下

youge325 · 2026-03-18T13:51:33Z

/re-run all-failed

SigureMo · 2026-03-18T14:03:53Z

这是因为删了 python 3.9 的 CI 导致的错误吗

只有 SOT 流水线删掉 3.9 监控了，不会影响其他流水线，rerun 也不行是么？

youge325 · 2026-03-18T14:13:29Z

rerun过一次，test_shared_state_dict.py 之前的pr也是经常报错

SigureMo · 2026-03-18T16:55:27Z

 #include <optional>
 #include <utility>
 #include <vector>
+#include "glog/logging.h"


0. You must have one RD (jiahy0825, zyfncg, YuanRisheng or heavyrain-lzy) approval for including "gflags/gflags.h" or "glog/logging.h" headerfile in paddle/phi headerfiles( paddle/phi/api/include/compat/ATen/core/TensorBody.h). Recommend including third party headers in phi source files(*.cc) instead of phi headerfiles(*.h). Because if phi headerfiles include third party headers like "gflags.h" or "logging.h", error might occur when outside developers use phi headerfiles directly.

之前还没注意到，这里也得改一下～

CI 自定义算子的单测怎么拦不住这个，有点奇怪，感觉编译期就应该挂

改好了，compat扫了一遍LOG(WARNING)，应该没有了

youge325 · 2026-03-19T04:51:07Z

/re-run all-failed

…addlePaddle#78255)

Copilot AI review requested due to automatic review settings March 10, 2026 13:33

paddle-bot Bot added the contributor External developers label Mar 10, 2026

Copilot started reviewing on behalf of youge325 March 10, 2026 13:33 View session

Copilot AI reviewed Mar 10, 2026

View reviewed changes

youge325 force-pushed the cAlign branch from 0d88674 to b174695 Compare March 10, 2026 15:18

youge325 force-pushed the cAlign branch 2 times, most recently from d62b95b to 7f06fed Compare March 12, 2026 04:36

youge325 force-pushed the cAlign branch 2 times, most recently from 1b5cb31 to fd94824 Compare March 15, 2026 03:08

youge325 changed the title ~~[Cpp API Compatibility] Fix empty() and from_blob() and Verify cuda()~~ [Cpp API Compatibility] add pin_memory support and fix from_blob Mar 15, 2026

SigureMo reviewed Mar 17, 2026

View reviewed changes

youge325 force-pushed the cAlign branch from e99d670 to 979fc0f Compare March 17, 2026 11:57

SigureMo reviewed Mar 17, 2026

View reviewed changes

SigureMo previously approved these changes Mar 18, 2026

View reviewed changes

[Cpp API Compatibility] Fix empty() and from_blob() and Verify cuda()

3bef3e0

youge325 added 9 commits March 18, 2026 21:12

fix options() in TensorBase

35ddd83

fix pin_memory

ed1a6a8

align pin_memory in tensor maker

0c41db5

fix coverage rate

887c52f

add deprecation warning

9777185

fix abs warning

44c205b

fix dtype

e7caa8f

fix dtype again

05c1eed

trigger ci

935c3f7

SigureMo reviewed Mar 18, 2026

View reviewed changes

fix glog

1c51908

youge325 dismissed SigureMo’s stale review via 1c51908 March 19, 2026 02:18

youge325 force-pushed the cAlign branch from 2a1e48f to 1c51908 Compare March 19, 2026 02:18

SigureMo approved these changes Mar 19, 2026

View reviewed changes

SigureMo changed the title ~~[Cpp API Compatibility] add pin_memory support and fix from_blob~~ [Cpp API Compatibility] add pin_memory support and fix from_blob Mar 19, 2026

SigureMo changed the title ~~[Cpp API Compatibility] add pin_memory support and fix from_blob~~ [Cpp API Compatibility] Add pin_memory support and fix from_blob Mar 19, 2026

zyfncg approved these changes Mar 19, 2026

View reviewed changes

SigureMo merged commit 6d546ca into PaddlePaddle:develop Mar 19, 2026
129 of 135 checks passed

youge325 deleted the cAlign branch March 19, 2026 07:13

co63oc pushed a commit to co63oc/Paddle that referenced this pull request Mar 20, 2026

[Cpp API Compatibility] Add pin_memory support and fix from_blob (P…

eb333c8

…addlePaddle#78255)

youge325 mentioned this pull request Mar 31, 2026

revert some modifications PFCCLab/DeepGEMM#9

Open

		@@ -407,19 +427,33 @@ class Tensor : public TensorBase {
		return *this;

		@@ -1,4 +1,4 @@
		// Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
		// Copyright (c) 2026 PaddlePaddle Authors. All Rights Reserved.

Conversation

youge325 commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Category

PR Types

Description

是否引起精度变化

Uh oh!

paddle-bot Bot commented Mar 10, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

youge325 commented Mar 11, 2026

Uh oh!

youge325 commented Mar 11, 2026

Uh oh!

youge325 commented Mar 11, 2026

Uh oh!

codecov-commenter commented Mar 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

youge325 commented Mar 11, 2026

Uh oh!

youge325 commented Mar 12, 2026

Uh oh!

youge325 commented Mar 12, 2026

Uh oh!

youge325 commented Mar 13, 2026

Uh oh!

youge325 commented Mar 15, 2026

Uh oh!

SigureMo left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

youge325 Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

youge325 commented Mar 17, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

youge325 commented Mar 18, 2026

Uh oh!

SigureMo commented Mar 18, 2026

Uh oh!

SigureMo Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

youge325 commented Mar 10, 2026 •

edited

Loading

codecov-commenter commented Mar 11, 2026 •

edited

Loading

youge325 Mar 17, 2026 •

edited

Loading

SigureMo Mar 18, 2026 •

edited

Loading

youge325 commented Mar 18, 2026 •

edited

Loading

SigureMo commented Mar 18, 2026 •

edited

Loading