Commit 3073341

fix: resolve LLVM toolchain install failure on Linux (#61)
* fix: resolve LLVM toolchain install failure on Linux

  Three fixes for `mcpp toolchain install llvm` failing with "xpkg payload missing":

  1. install_with_progress(): use direct `xlings install -y` command on ALL platforms (not just Windows). The direct command avoids stdin closure (</dev/null) that breaks xlings subprocess coordination for large packages like LLVM (~800MB). Falls back to NDJSON interface path if direct install fails.
  2. package_fetcher.cppm: extend the global xlings directory fallback from Windows-only to all platforms. If xlings installs a package to ~/.xlings/ instead of the mcpp sandbox, detect and copy it.
  3. ci.yml: add "Toolchain install smoke test" step that exercises `mcpp toolchain install llvm` + build with it. This core user flow was previously untested in CI.

* fix: add missing import mcpp.platform in package_fetcher.cppm

  Also replace raw WIFEXITED/WEXITSTATUS with platform::process::extract_exit_code() in xlings.cppm.

* fix(ci): use 'toolchain default' instead of non-existent --toolchain flag

* fix: remove import mcpp.platform from package_fetcher.cppm

  The added import changed the module dependency graph fingerprint, invalidating BMI cache on macOS CI and exposing a pre-existing Xcode 16.4 SDK incompatibility during std module precompilation. Use #if defined(_WIN32) instead of if constexpr for the USERPROFILE check to avoid changing the module import set.

* fix(ci-macos): add MCPP_HOME env and MCPP_VENDORED_XLINGS for test step

  macOS CI was missing the MCPP_HOME job-level env var that Linux CI has. The freshly-built mcpp resolved to a fresh ~/.mcpp/ sandbox, triggering a clean std module precompile that exposed a pre-existing Xcode 16.4 + LLVM 20.1.7 sysroot incompatibility.

  Fix: set MCPP_HOME=/Users/runner/.mcpp (consistent with Linux CI) and export MCPP_VENDORED_XLINGS in the test step so the sandbox reuses the pre-installed xlings binary.

* fix: don't use xcrun SDK as sysroot for xlings LLVM on macOS

  Root cause: probe_sysroot() falls back to xcrun --show-sdk-path on macOS when the compiler doesn't report a sysroot via -print-sysroot. For xlings-installed LLVM 20.1.7, this sets --sysroot to the Xcode SDK path. Combined with -nostdinc++ from clang++.cfg, this breaks C runtime header resolution: macOS SDK headers reference internal macros (_CTYPE_A, etc.) that are only defined via default include paths which --sysroot overrides.

  Fix: remove the xcrun SDK fallback in probe_sysroot(). If the compiler itself doesn't report a sysroot, none should be used. The xcrun fallback was designed for Apple's system clang, not for standalone xlings LLVM which provides its own libc++ headers.

  Also revert the ci-macos.yml MCPP_HOME workaround: the real bug is fixed, no workaround needed.

* fix: prefer xcrun SDK over cfg-baked sysroot on macOS

  probe_sysroot() called -print-sysroot first, which on macOS with xlings LLVM just echoed back the --sysroot from clang++.cfg. That cfg-baked path points to CommandLineTools SDK, which may differ from the active Xcode SDK. When the two SDKs have different header versions, std module precompilation fails with undeclared _CTYPE_A and related C runtime macro errors.

  Fix: on macOS, probe xcrun --show-sdk-path FIRST (always returns the active SDK), then fall back to -print-sysroot for non-macOS. The xcrun sysroot passed on the command line overrides the cfg-baked one.

* fix: skip --sysroot for std module precompile on macOS

  Root cause: on macOS, xlings LLVM's clang++.cfg already contains a --sysroot for the macOS SDK. When mcpp additionally passes --sysroot on the command line during std module precompilation, it changes the C header include ordering. The macOS SDK's ___wctype.h references _CTYPE_A (defined in _ctype.h), but the duplicate --sysroot flag prevents _ctype.h from being included transitively during module purview compilation, causing "undeclared identifier '_CTYPE_A'" errors.

  Fix: detect apple/darwin target triple in stdmod.cppm and skip the --sysroot flag for std module precompile. The cfg's own --sysroot handles macOS SDK discovery correctly for this compilation mode. Regular compilation (flags.cppm) and linking still use the probed sysroot as before.

  Also restore probe_sysroot() to its original logic (xcrun fallback) since the issue was in stdmod.cppm passing sysroot, not in probing it.

* fix: override stale cfg sysroot with xcrun SDK for macOS std precompile

  The previous fix skipped --sysroot entirely on macOS, but xlings LLVM's clang++.cfg still applies its own --sysroot pointing to CommandLineTools SDK. When the active SDK is Xcode (different path), the cfg's stale sysroot causes _CTYPE_A undeclared errors.

  Fix: on macOS, always probe xcrun and pass the active SDK as --sysroot to OVERRIDE the cfg-baked stale path. This ensures the std module precompile uses the correct SDK regardless of what was baked into clang++.cfg at LLVM install time.

* fix: skip --sysroot entirely for macOS std module precompile

  CI step 9 proves that clang++ WITHOUT explicit --sysroot precompiles std.cppm correctly on macOS: the clang++.cfg's built-in --sysroot and -isystem flags handle SDK header resolution properly. When mcpp passes an explicit --sysroot (even the identical value), it changes Clang's internal header search order, breaking the transitive inclusion of _ctype.h before ___wctype.h. The macOS SDK's ___wctype.h references _CTYPE_A which is only defined via the default cfg-driven include chain.

  Fix: detect apple/darwin target and skip --sysroot for std module precompile, letting clang++.cfg handle it. Non-macOS platforms (Linux, Windows) continue to use the probed sysroot as before.

* fix: bypass stale clang++.cfg for macOS std module precompile

  When xlings copies LLVM to mcpp's sandbox, clang++.cfg retains hardcoded absolute paths from the original install (--sysroot and -isystem pointing to ~/.xlings/... instead of ~/.mcpp/...). These stale paths cause _CTYPE_A undeclared errors during std module precompilation.

  Fix: on macOS Clang, pass --no-default-config to ignore the stale cfg, then explicitly provide the correct -isystem (libc++ headers from the sandbox LLVM root) and --sysroot (active SDK from xcrun). This produces the same header search behavior as a fresh clang++ invocation with the correct paths.

* fix: apply --no-default-config for all macOS Clang compilation

  The stale clang++.cfg issue affects not just std module precompile but also regular compilation via ninja. flags.cppm now applies the same --no-default-config + correct -isystem + xcrun --sysroot fix for all macOS Clang compilation, bypassing the cfg's stale paths.
1 parent 20bf41d commit 3073341

7 files changed

Lines changed: 214 additions & 25 deletions

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
# LLVM Toolchain Install Failure Analysis

## Symptom

When running `mcpp toolchain install llvm`, the dependency packages (libxml2, zlib, glibc, etc.) install successfully, but LLVM itself (800MB) is missing:

```
~/.mcpp/registry/data/xpkgs/
├── xim-x-libxml2/ ✓ installed
├── xim-x-zlib/ ✓ installed
├── xim-x-glibc/ ✓ installed
└── xim-x-llvm/ ✗ missing
```

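## Root Cause Analysis

### Problem 1: closing stdin with `</dev/null` may break xlings subprocess communication

`seal_stdin()` in `platform/process.cppm:79-84` appends `</dev/null` to every POSIX command.

That change fixed a hang on first run on macOS, but it has a side effect: xlings' internal subprocesses (such as the tar process that unpacks the 800MB LLVM archive) may rely on stdin for inter-process communication or signal delivery. Small packages (libxml2 and the like) are unaffected. A large package like LLVM takes much longer to extract and has a more complex subprocess chain, so a broken stdin may cause it to fail silently.
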
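To make the effect of the redirect concrete, here is a minimal POSIX-only sketch. The `seal_stdin()` below is a hypothetical stand-in written for illustration, not the code in `platform/process.cppm`, and `read_fails_when_sealed()` is an invented helper. It shows that a child which tries to read stdin sees end-of-file at once when `</dev/null` is appended.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical stand-in for seal_stdin(): append a stdin redirect so the
// child reads from /dev/null instead of inheriting the parent's stdin.
std::string seal_stdin(std::string cmd) {
    cmd += " </dev/null";
    return cmd;
}

// The shell builtin `read` fails as soon as it hits end-of-file, so a
// non-zero status here means the child saw a closed (sealed) stdin.
bool read_fails_when_sealed() {
    const std::string cmd = seal_stdin("read line");
    return std::system(cmd.c_str()) != 0;
}
```

Any subprocess further down the chain inherits the same sealed stdin unless it reopens it, which is why the analysis treats the redirect as a suspect for long-running extractions.
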
### Problem 2: `2>/dev/null` swallows all error output

The command built at `xlings.cppm:432-434`:

```bash
cd ~/.mcpp && ... xlings interface install_packages --args '...' 2>/dev/null </dev/null
```

stderr is discarded entirely. If xlings writes an error to stderr while installing LLVM, it is never shown.

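One way to keep the diagnostics, sketched under assumptions: send stderr to a log file instead of `/dev/null` and read it back when the install fails. The names `with_stderr_log()` and `run_and_capture_stderr()` are hypothetical and not part of mcpp. The sketch assumes a POSIX shell and a log path that contains no single quotes.

```cpp
#include <cstdlib>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical: redirect stderr to a log file rather than discarding it.
// Assumes POSIX shell syntax and a logPath without single quotes.
std::string with_stderr_log(const std::string& cmd, const std::string& logPath) {
    return cmd + " 2>'" + logPath + "'";
}

// Runs the command through the shell and returns whatever it wrote to stderr.
std::string run_and_capture_stderr(const std::string& cmd, const std::string& logPath) {
    int status = std::system(with_stderr_log(cmd, logPath).c_str());
    static_cast<void>(status);  // the exit status is not needed for this sketch
    std::ifstream in(logPath);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A caller could then print the captured text only when the command fails, which keeps successful installs quiet.
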
### Problem 3: the NDJSON handler only processes download_progress events

The `handle_line` callback at `xlings.cppm:645-692`:

```cpp
if (kind != "data") return; // ignore non-data events
if (ls.find_str("dataKind") != "download_progress") return; // only download progress matters
```

If xlings emits an error or log event to report an install failure, it is silently dropped.

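A handler that does not drop failures could route events by kind before filtering on `dataKind`. The sketch below is illustrative only: the `EventAction` enum and `classify_event()` are hypothetical, and the event kind names `"error"` and `"log"` are assumptions taken from the wording above, not a confirmed xlings schema.

```cpp
#include <string_view>

// What the caller should do with one NDJSON event.
enum class EventAction { Progress, Surface, Ignore };

// Hypothetical router. The "error" and "log" kind names are assumed;
// only "data" and "download_progress" appear in the original handler.
EventAction classify_event(std::string_view kind, std::string_view dataKind) {
    if (kind == "error" || kind == "log") {
        return EventAction::Surface;   // show to the user instead of dropping
    }
    if (kind == "data" && dataKind == "download_progress") {
        return EventAction::Progress;  // the only case handled today
    }
    return EventAction::Ignore;
}
```
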
### Problem 4: Windows has a fallback but Linux does not

`package_fetcher.cppm:608-638` contains a Windows-only workaround:

```cpp
#if defined(_WIN32)
// If verdir does not exist, check the global xlings directory ~/.xlings/data/xpkgs/ and copy it over
if (!std::filesystem::exists(verdir)) {
    // ... copy from ~/.xlings/ to ~/.mcpp/
}
#endif
```

This workaround handles the case where xlings installs a package into its global directory rather than the one specified by XLINGS_HOME. **Linux has no such fallback.**

### Why CI does not hit this problem

CI sets `MCPP_VENDORED_XLINGS="$XLINGS_BIN"`:

```yaml
export MCPP_VENDORED_XLINGS="$XLINGS_BIN"
"$MCPP" build --target x86_64-linux-musl
```

`MCPP_VENDORED_XLINGS` triggers a special path in `make_xlings_env()` that uses the global xlings binary. In addition, toolchain installation in CI goes through the global xlings sandbox (because MCPP_HOME is set explicitly), which is entirely different from the nested sandbox on a user's machine.

In fact, **CI does not test the `mcpp toolchain install llvm` user flow at all**: it only tests `mcpp build` (with a preinstalled toolchain).

## Proposed Fixes

### Fix 1: switch the Linux path of `install_with_progress()` to the direct command (matching Windows)

Windows already uses the direct `xlings install ... -y` command instead of interface mode. Linux should do the same:

```cpp
int install_with_progress(const Env& env, std::string_view target,
                          const BootstrapProgressCallback& cb)
{
    // All platforms: try the direct install command first
    auto directCmd = build_command_prefix(env) + std::format(" install {} -y", target);
    int directRc = mcpp::platform::process::run_silent(directCmd);
    if (directRc == 0) return 0;

    // If the direct command fails, fall back to interface mode (keeps progress callbacks)
    // ...
}
```

### Fix 2: give Linux the same fallback check as Windows

In `resolve_xpkg_path()`, extend the Windows global-directory fallback to all platforms:

```cpp
// Remove #if defined(_WIN32) so the check applies on every platform
if (!std::filesystem::exists(verdir)) {
    // Check the global xlings directory
    auto homeDir = std::getenv("HOME");
    if (homeDir) {
        std::filesystem::path globalXpkgs =
            std::filesystem::path(homeDir) / ".xlings" / "data" / "xpkgs";
        // Assuming verdir ends in <package>/<version>, mirror those two components
        auto globalVerdir = globalXpkgs / verdir.parent_path().filename() / verdir.filename();
        if (std::filesystem::exists(globalVerdir)) {
            // Copy or symlink into the sandbox
        }
    }
}
```

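The fallback can be exercised in isolation. The following is a minimal sketch, assuming the sandbox version directory has the shape `<sandbox>/xpkgs/<package>/<version>`. The helpers `find_global_payload()` and `adopt_global_payload()` are hypothetical names introduced for illustration, not the functions in `package_fetcher.cppm`.

```cpp
#include <filesystem>
#include <optional>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: map <sandbox>/xpkgs/<package>/<version> to
// <home>/.xlings/data/xpkgs/<package>/<version> and return it if present.
std::optional<fs::path> find_global_payload(const fs::path& verdir, const fs::path& home) {
    fs::path candidate = home / ".xlings" / "data" / "xpkgs"
                       / verdir.parent_path().filename() / verdir.filename();
    std::error_code ec;
    if (fs::is_directory(candidate, ec)) return candidate;
    return std::nullopt;
}

// Hypothetical helper: copy the global payload into the sandbox.
// Returns true only if a payload was found and copied without error.
bool adopt_global_payload(const fs::path& verdir, const fs::path& home) {
    auto src = find_global_payload(verdir, home);
    if (!src) return false;
    std::error_code ec;
    fs::create_directories(verdir.parent_path(), ec);
    if (ec) return false;
    fs::copy(*src, verdir, fs::copy_options::recursive, ec);
    return !ec;
}
```

Copying rather than symlinking keeps the sandbox self-contained, at the cost of duplicating a large payload on disk. A symlink would be cheaper but ties the sandbox to the global directory.
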
### Fix 3: do not close stdin for the xlings install command

Add an option to `install_with_progress()` that leaves stdin open, or run the direct install command through `std::system()` instead of `platform::process`:

```cpp
// Run the direct command outside platform::process (no </dev/null appended)
int directRc = std::system(directCmd.c_str());
```

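One caveat when calling `std::system()` directly: on POSIX it returns a wait status, not the exit code, so any comparison other than `== 0` needs decoding first. The commit message mentions `platform::process::extract_exit_code()` for this purpose. The version below is only a sketch of what such a helper might look like, not the actual implementation.

```cpp
#include <cstdlib>
#if !defined(_WIN32)
#include <sys/wait.h>
#endif

// Sketch of an exit-code decoder (hypothetical; the real helper may differ).
int extract_exit_code(int status) {
#if defined(_WIN32)
    return status;  // on Windows std::system() already yields the exit code
#else
    if (status == -1) return -1;                        // shell could not be started
    if (WIFEXITED(status)) return WEXITSTATUS(status);  // normal exit
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);  // shell-style signal code
    return -1;
#endif
}
```

For example, `extract_exit_code(std::system("exit 3"))` yields 3 on a POSIX system, whereas the raw status would be 768.
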
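### Fix 4: add a toolchain install test to CI

Add a dedicated step to `ci.yml` that tests `mcpp toolchain install llvm`, so this core user flow is covered.

## Recommended Implementation Order

1. **Fix 1 + Fix 3**: switch Linux to the direct command and stop closing stdin (most likely to resolve the issue)
2. **Fix 2**: add the global-directory fallback (safety net)
3. **Fix 4**: add the CI test (prevents regressions)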

.github/workflows/ci.yml

Lines changed: 17 additions & 0 deletions
@@ -134,6 +134,23 @@ jobs:
           "$MCPP" build
           "$MCPP" test
 
+      - name: Toolchain install smoke test (mcpp toolchain install llvm)
+        run: |
+          # Test the core user flow: install a toolchain, create a project,
+          # build with it. Uses the freshly-built mcpp (not bootstrap).
+          MCPP=$(realpath "$(find target -type f -name mcpp -printf '%T@ %p\n' | sort -rn | head -1 | cut -d' ' -f2)")
+          # Install LLVM toolchain into mcpp's sandbox
+          "$MCPP" toolchain install llvm 20.1.7
+          # Set as default so the build picks it up
+          "$MCPP" toolchain default llvm@20.1.7
+          # Build a hello-world project with the installed toolchain
+          TMP=$(mktemp -d)
+          cd "$TMP"
+          "$MCPP" new hello
+          cd hello
+          "$MCPP" build
+          "$MCPP" run
+
       - name: Fresh user experience (xlings install mcpp → new → run)
         continue-on-error: true
         run: |

src/build/flags.cppm

Lines changed: 17 additions & 2 deletions
@@ -88,9 +88,24 @@ CompileFlags compute_flags(const BuildPlan& plan) {
         include_flags += " -I" + escape_path(abs);
     }
 
-    // Sysroot
+    // Sysroot + config override for macOS.
+    // On macOS, xlings LLVM's clang++.cfg contains hardcoded --sysroot and
+    // -isystem paths from the original install location. When the package is
+    // copied to mcpp's sandbox, these paths become stale. We pass
+    // --no-default-config to ignore the cfg and provide correct paths.
     std::string sysroot_flag;
-    if (!plan.toolchain.sysroot.empty()) {
+    bool is_macos_clang = mcpp::toolchain::is_clang(plan.toolchain)
+        && (plan.toolchain.targetTriple.find("apple") != std::string::npos
+            || plan.toolchain.targetTriple.find("darwin") != std::string::npos);
+    if (is_macos_clang) {
+        auto llvmRoot = plan.toolchain.binaryPath.parent_path().parent_path();
+        auto libcxxInclude = llvmRoot / "include" / "c++" / "v1";
+        sysroot_flag = " --no-default-config";
+        sysroot_flag += " -isystem" + escape_path(libcxxInclude);
+        if (auto sdk = mcpp::platform::macos::sdk_path())
+            sysroot_flag += " --sysroot=" + escape_path(*sdk);
+        f.sysroot = sysroot_flag;
+    } else if (!plan.toolchain.sysroot.empty()) {
         sysroot_flag = " --sysroot=" + escape_path(plan.toolchain.sysroot);
         f.sysroot = sysroot_flag;
     }
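
The macOS branch in this hunk can be read as a pure function from the compiler path and an optional SDK path to a flag string. The sketch below restates it in self-contained form so the resulting command-line fragment is easy to inspect. `quote()` is a hypothetical stand-in for `escape_path()`, whose real quoting rules are not shown in the diff, and `macos_override_flags()` is an illustrative name.

```cpp
#include <filesystem>
#include <optional>
#include <string>
#include <string_view>

// Same substring test as the diff: any triple mentioning apple or darwin.
bool is_macos_triple(std::string_view triple) {
    return triple.find("apple") != std::string_view::npos
        || triple.find("darwin") != std::string_view::npos;
}

// Hypothetical single-quote shell escaping, standing in for escape_path().
std::string quote(const std::filesystem::path& p) {
    std::string out = "'";
    for (char c : p.string()) {
        if (c == '\'') out += "'\\''";
        else out += c;
    }
    out += "'";
    return out;
}

// Builds the override flags: ignore the stale cfg, then point at the libc++
// headers next to the compiler and, if known, at the active SDK.
std::string macos_override_flags(const std::filesystem::path& clangBin,
                                 const std::optional<std::filesystem::path>& sdk) {
    auto llvmRoot = clangBin.parent_path().parent_path();  // <root>/bin/clang++ -> <root>
    std::string flags = " --no-default-config";
    flags += " -isystem" + quote(llvmRoot / "include" / "c++" / "v1");
    if (sdk) flags += " --sysroot=" + quote(*sdk);
    return flags;
}
```

With a compiler at `/opt/llvm/bin/clang++` and no SDK path, this produces ` --no-default-config -isystem'/opt/llvm/include/c++/v1'`. Deriving the header path from the binary location is what makes the flags follow the package when it is copied into the sandbox.
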

src/pm/package_fetcher.cppm

Lines changed: 10 additions & 8 deletions
@@ -605,14 +605,17 @@ Fetcher::resolve_xpkg_path(std::string_view target,
     };
 
     auto resolve = [&]() -> std::expected<XpkgPayload, CallError> {
-#if defined(_WIN32)
-        // Workaround: xlings on Windows may extract large packages (e.g. LLVM)
-        // into its global data dir instead of the mcpp sandbox, because the
-        // extraction subprocess doesn't inherit XLINGS_HOME. Detect this and
-        // copy the payload into the sandbox so mcpp remains self-contained.
+        // Workaround: xlings may extract large packages (e.g. LLVM) into its
+        // global data dir instead of the mcpp sandbox, because the extraction
+        // subprocess doesn't always inherit XLINGS_HOME. Detect this and copy
+        // the payload into the sandbox so mcpp remains self-contained.
+        // Originally Windows-only; extended to all platforms for the same
+        // reason (xlings subprocess XLINGS_HOME propagation is unreliable).
         if (!std::filesystem::exists(verdir)) {
-            // Try xlings' own data dir (where `xlings self install` placed it)
-            auto xhome = std::getenv("USERPROFILE");
+            const char* xhome = nullptr;
+#if defined(_WIN32)
+            xhome = std::getenv("USERPROFILE");
+#endif
             if (!xhome) xhome = std::getenv("HOME");
             if (xhome) {
                 // xlings stores xpkgs at <home>/.xlings/data/xpkgs/ or
@@ -635,7 +638,6 @@ Fetcher::resolve_xpkg_path(std::string_view target,
                 }
             }
         }
-#endif
         if (!std::filesystem::exists(verdir)) {
             return std::unexpected(CallError{
                 std::format("xpkg payload missing: {}", verdir.string())});

src/toolchain/probe.cppm

Lines changed: 4 additions & 1 deletion
@@ -262,7 +262,10 @@ probe_sysroot(const std::filesystem::path& compilerBin,
         auto s = trim_line(*r);
         if (!s.empty() && std::filesystem::exists(s)) return s;
     }
-    // macOS fallback: use xcrun to discover the SDK path
+    // macOS fallback: use xcrun to discover the SDK path.
+    // The sysroot is used for regular compilation flags (flags.cppm) but
+    // skipped for std module precompilation on macOS (stdmod.cppm) to
+    // avoid breaking SDK internal header dependencies.
     if (auto sdk = mcpp::platform::macos::sdk_path())
         return *sdk;
     return {};

src/toolchain/stdmod.cppm

Lines changed: 19 additions & 1 deletion
@@ -92,8 +92,26 @@ std::expected<StdModule, StdModError> ensure_built(
         : mcpp::toolchain::gcc::std_bmi_path(sm.cacheDir);
     sm.objectPath = sm.cacheDir / "std.o";
 
+    // Build sysroot + include flags for std module precompilation.
+    // On macOS, xlings LLVM's clang++.cfg contains hardcoded --sysroot and
+    // -isystem paths from the original install location. When the LLVM package
+    // is copied to mcpp's sandbox, these cfg paths become stale (still point
+    // to the original xlings directory). We override both:
+    //   --sysroot → current active SDK (from xcrun)
+    //   --no-default-config → ignore stale cfg entirely
+    //   -isystem → correct libc++ headers in the sandbox copy
     std::string sysroot_flag;
-    if (!tc.sysroot.empty()) {
+    bool is_macos = tc.targetTriple.find("apple") != std::string::npos
+        || tc.targetTriple.find("darwin") != std::string::npos;
+    if (is_macos && is_clang(tc)) {
+        // Ignore the stale clang++.cfg and provide correct flags directly.
+        auto llvmRoot = tc.binaryPath.parent_path().parent_path();
+        auto libcxxInclude = llvmRoot / "include" / "c++" / "v1";
+        sysroot_flag = " --no-default-config";
+        sysroot_flag += std::format(" -isystem'{}'", libcxxInclude.string());
+        if (auto sdk = mcpp::platform::macos::sdk_path())
+            sysroot_flag += std::format(" --sysroot='{}'", sdk->string());
+    } else if (!tc.sysroot.empty()) {
         sysroot_flag = std::format(" --sysroot='{}'", tc.sysroot.string());
     }

src/xlings.cppm

Lines changed: 18 additions & 13 deletions
@@ -609,24 +609,29 @@ int install_with_progress(const Env& env, std::string_view target,
     auto argsJson = std::format(
         R"({{"targets":["{}"],"yes":true}})", target);
 
-    if constexpr (mcpp::platform::is_windows) {
-        mcpp::platform::env::set("XLINGS_HOME", env.home.string());
-        mcpp::platform::env::set("XLINGS_PROJECT_DIR", "");
-        std::error_code ec_mkdir;
-        std::filesystem::create_directories(env.home, ec_mkdir);
-        // Use direct `install` command instead of `interface install_packages`
-        // on Windows. The NDJSON interface may have issues with large packages
-        // where the extraction subprocess doesn't respect XLINGS_HOME.
-        auto directCmd = std::format("{} install {} -y",
-            env.binary.string(), target);
-        int directRc = mcpp::platform::process::run_silent(directCmd);
+    // All platforms: try direct `xlings install ... -y` first.
+    // The direct command is more reliable for large packages (e.g. LLVM
+    // ~800MB) because:
+    //   - it doesn't pipe through NDJSON interface (simpler subprocess chain)
+    //   - xlings manages its own stdin/stdout/stderr
+    //   - extraction subprocess coordination works normally
+    // The NDJSON interface path is kept as a fallback for progress reporting.
+    {
+        auto directCmd = build_command_prefix(env) +
+            std::format(" install {} -y {}", target, mcpp::platform::shell::silent_redirect);
+        // Use std::system() directly; do NOT redirect stdin via </dev/null
+        // because xlings may need stdin for subprocess coordination during
+        // large package extraction.
+        int directRc = mcpp::platform::process::extract_exit_code(
+            std::system(directCmd.c_str()));
         if (directRc == 0) return 0;
     }
+
+    // Fallback: NDJSON interface path (provides progress callbacks).
     auto cmd = [&]() -> std::string {
         if constexpr (mcpp::platform::is_windows) {
-            // Fallback to interface path if direct install fails
             return std::format("{} interface install_packages --args {} {}",
-                env.binary.string(),
+                build_command_prefix(env),
                 shq(argsJson),
                 mcpp::platform::null_redirect);
         } else {
