FEAT: Integrate mssql_py_core wheel installation into Dev workflow PR validation, official and release pipelines#440
Conversation
c92b05a to
6ca65a3
Compare
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changesNo lines with coverage information in this diff. 📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.logger_bridge.hpp: 58.8%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.pybind.ddbc_bindings.cpp: 69.4%
mssql_python.pybind.ddbc_bindings.h: 69.7%
mssql_python.pybind.connection.connection.cpp: 75.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.cursor.py: 84.7%
mssql_python.__init__.py: 84.9%🔗 Quick Links
|
6ca65a3 to
63a91cf
Compare
- Add install-mssql-py-core.ps1 (Windows) and install-mssql-py-core.sh (Linux/macOS) scripts that download the mssql-py-core-wheels NuGet package from the public Azure Artifacts feed and pip install the platform-appropriate wheel - Add eng/versions/mssql-py-core.version to pin the NuGet package version (no fallback to latest - file is required) - Add 'Install mssql_py_core' step to all 10 test jobs in pr-validation-pipeline.yml - No authentication required (public feed), no nuget.exe dependency (raw HTTP + unzip) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
63a91cf to
e0d975c
Compare
…o wheel available - Detect musl vs glibc on Linux to use musllinux_1_2 vs manylinux_2_28 wheel tags - Skip with warning (exit 0) instead of failing when no compatible wheel is found - Allows Alpine jobs to continue without blocking on missing musllinux wheels Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Missing wheels must block the pipeline — every platform must have a matching wheel. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ldd --version exits with code 1 on musl/Alpine. Combined with set -euo pipefail, the piped grep check always took the else branch, misidentifying Alpine as glibc (manylinux). pip then rejected the manylinux wheel since it cannot run on musl. Fix: capture ldd output into a variable with || true before grepping. Add fallback detection via /etc/alpine-release and /lib/ld-musl-*. Skip gracefully (exit 0) when no musllinux wheel is available yet. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add eng/scripts/repackage-with-mssql-py-core.py that downloads the mssql-py-core-wheels NuGet package, matches each mssql-python wheel to its corresponding mssql_py_core wheel by platform/Python tags, extracts the native extension (.pyd/.so) and supporting files, and injects them into the mssql-python wheel with updated RECORD hashes. Pipeline changes: - OneBranch consolidate-artifacts-job.yml: checkout source, install Python, run repackaging after wheel consolidation - eng/pipelines/build-whl-pipeline.yml: run repackaging after wheel build in Windows, macOS, and Linux jobs musllinux wheels are skipped gracefully since mssql_py_core does not yet ship musllinux builds. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Official builds now validate eng/versions/mssql-py-core.version and fail if it contains dev, nightly, alpha, beta, rc, or preview tags. NonOfficial builds are unaffected and can use any version. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
4c53fdb to
67da099
Compare
… dev/saurabh/integrate-mssql-py-core
…lation for Linux and Windows builds
…or ODBC connection strings
…stency across platforms and containers
… PackageBaseAddress extraction
|
Devcontainer changes and developer documentation changes are out of scope |
There was a problem hiding this comment.
Pull request overview
This PR integrates the mssql_py_core (Rust-based bulk copy engine) into the mssql-python package by downloading pre-built wheels from a NuGet feed and including them in every mssql-python wheel. After this change, import mssql_py_core works out of the box after pip install mssql-python.
Changes:
- Adds installation infrastructure to download and extract mssql_py_core wheels from Azure Artifacts NuGet feed
- Integrates mssql_py_core installation into all build and test pipelines (PR validation, official builds, release pipelines)
- Adds version validation to prevent pre-release versions of mssql_py_core from being shipped to PyPI
- Adds basic bulkcopy integration tests and includes mssql_py_core in the mssql-python wheel packaging
Reviewed changes
Copilot reviewed 18 out of 19 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| eng/versions/mssql-py-core.version | Single-line version file specifying which mssql_py_core version to download |
| eng/scripts/resolve_nuget_feed.py | Python script to fetch NuGet v3 service index and extract PackageBaseAddress URL |
| eng/scripts/extract_wheel.py | Python script to extract mssql_py_core files from a wheel into the repository root |
| eng/scripts/install-mssql-py-core.ps1 | Windows PowerShell script orchestrating download, platform detection, and extraction |
| eng/scripts/install-mssql-py-core.sh | Unix bash script with same functionality as PS1, plus musl/glibc detection for Alpine |
| eng/scripts/validate-release-versions.ps1 | PowerShell script to reject pre-release versions (dev/alpha/beta/rc) |
| eng/pipelines/steps/install-mssql-py-core.yml | Reusable step template with Windows/Unix/container variants |
| eng/pipelines/pr-validation-pipeline.yml | Adds mssql_py_core installation to all 12 test jobs |
| eng/pipelines/build-whl-pipeline.yml | Adds mssql_py_core installation before wheel builds, adds curl dependency |
| OneBranchPipelines/stages/build-linux-single-stage.yml | Adds install step in per-Python build loop, adds curl, updates comments |
| OneBranchPipelines/stages/build-windows-single-stage.yml | Adds PowerShell install step before testing |
| OneBranchPipelines/stages/build-macos-single-stage.yml | Adds bash install step before testing |
| OneBranchPipelines/jobs/consolidate-artifacts-job.yml | Changes checkout from none to self, copies version file to dist/ |
| eng/pipelines/official-release-pipeline.yml | Downloads version file, validates it's not pre-release before ESRP submission |
| eng/pipelines/dummy-release-pipeline.yml | Same version validation as official pipeline |
| OneBranchPipelines/build-release-package-pipeline.yml | Comment update about manylinux version |
| setup.py | Adds validate_mssql_py_core() check, includes mssql_py_core in packages and package_data |
| tests/test_019_bulkcopy.py | Basic bulkcopy integration test with 3-row insert and verification |
| .gitignore | Excludes mssql_py_core/ directory from version control |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Architectures: x86_64 (AMD/Intel), aarch64 (ARM64) | ||
| # Each stage: | ||
| # 1. Starts PyPA Docker container (manylinux_2_28 or musllinux_1_2) | ||
| # 1. Starts PyPA Docker container (manylinux_2_34 or musllinux_1_2) |
There was a problem hiding this comment.
Incorrect comment update: The comment was changed from manylinux_2_28 to manylinux_2_34, but the actual container image used is quay.io/pypa/manylinux_2_28_$(ARCH) as specified in build-linux-single-stage.yml line 113. The comment at line 107 correctly states "manylinux_2_28 = AlmaLinux 8 (glibc 2.28)".
Revert this comment back to "manylinux_2_28" to match the actual container being used. The note about mssql_py_core requiring glibc 2.34 is correctly explained in the build-linux-single-stage.yml file (lines 108-109).
| # 1. Starts PyPA Docker container (manylinux_2_34 or musllinux_1_2) | |
| # 1. Starts PyPA Docker container (manylinux_2_28 or musllinux_1_2) |
| if echo "$ldd_output" | grep -qi musl || [ -f /etc/alpine-release ]; then | ||
| WHEEL_PLATFORM="musllinux_1_2_${ARCH_TAG}" | ||
| else | ||
| # auditwheel=skip: wheels are tagged linux_* not manylinux_2_34_* |
There was a problem hiding this comment.
Unclear platform tag expectation: The comment states that mssql_py_core wheels are tagged as linux_* (not manylinux_2_34_*), but this contradicts standard Python wheel naming conventions and creates confusion with the manylinux_2_28 tags used by mssql-python itself.
For clarity and to prevent future issues:
- Document in a comment why mssql_py_core uses non-standard
linux_*tags instead of propermanylinux_*tags - Consider whether the mssql-rs build should be updated to use proper manylinux tags
- Add a note in the PR description explaining this platform tag discrepancy and its implications for pip installation
If the wheels are actually tagged as manylinux_2_34_* (not linux_*), then this line needs to be updated accordingly.
| # auditwheel=skip: wheels are tagged linux_* not manylinux_2_34_* | |
| # NOTE: The mssql_py_core Linux wheels produced by the mssql-rs build | |
| # are currently tagged with a non-standard 'linux_${ARCH_TAG}' tag | |
| # instead of a PEP 600 manylinux tag such as 'manylinux_2_34_${ARCH_TAG}'. | |
| # This comes from running auditwheel in a mode that preserves the | |
| # original linux_* platform tag (e.g. auditwheel repair --skip). | |
| # | |
| # This script installs from a wheel extracted from a NuGet package, | |
| # so pip accepts the wheel even though it is not manylinux-tagged. | |
| # If the mssql-rs build is updated to emit proper manylinux_2_34_* | |
| # tags, update WHEEL_PLATFORM (and the comment above) to match the | |
| # new tag and call out the change in the PR description, as it can | |
| # affect how pip resolves platform-specific wheels. |
| sys.exit( | ||
| f"ERROR: No mssql_py_core native extension ({ext}) found " | ||
| f"in mssql_py_core/. Run eng/scripts/install-mssql-py-core to extract it." |
There was a problem hiding this comment.
The error message suggests running "eng/scripts/install-mssql-py-core" without specifying the file extension (.ps1 or .sh), which could be confusing for users.
Consider making the error message more specific by indicating the appropriate script for each platform (e.g., "Run eng/scripts/install-mssql-py-core.ps1 on Windows or eng/scripts/install-mssql-py-core.sh on Unix").
| if command -v unzip &>/dev/null; then | ||
| unzip -q "$NUPKG_PATH" -d "$extract_dir" | ||
| else | ||
| python -c "import zipfile; zipfile.ZipFile('$NUPKG_PATH').extractall('$extract_dir')" |
There was a problem hiding this comment.
Potential command injection vulnerability: The variable $NUPKG_PATH is used in a Python command string without proper quoting. If the version contains special characters that make it through the lowercase transformation, this could lead to command injection.
Consider using proper quoting or passing the path as an argument to a separate Python script instead of embedding it in the command string.
| python -c "import zipfile; zipfile.ZipFile('$NUPKG_PATH').extractall('$extract_dir')" | |
| python - "$NUPKG_PATH" "$extract_dir" << 'EOF' | |
| import sys, zipfile | |
| zipfile.ZipFile(sys.argv[1]).extractall(sys.argv[2]) | |
| EOF |
| out_path = os.path.join(target_dir, entry) | ||
| if entry.endswith("/"): | ||
| os.makedirs(out_path, exist_ok=True) | ||
| continue | ||
|
|
||
| os.makedirs(os.path.dirname(out_path), exist_ok=True) | ||
| with open(out_path, "wb") as f: | ||
| f.write(zf.read(entry)) |
There was a problem hiding this comment.
Potential path traversal vulnerability: The code directly joins ZIP entry names with the target directory without validating that the resulting path is actually within the target directory. A malicious wheel could contain entries like "../../../etc/passwd" that would write outside the intended directory.
Use os.path.commonpath or os.path.realpath to validate that the final path is within the target directory before writing, or use zipfile.extractall() with a filter (Python 3.12+) or manual validation.
| We need this base URL because we download the mssql-py-core-wheels | ||
| nupkg directly via HTTP rather than using the NuGet CLI. | ||
| """ | ||
| with urllib.request.urlopen(feed_url) as resp: |
There was a problem hiding this comment.
The urllib.request.urlopen call does not validate the URL scheme or set a timeout, which could lead to SSRF (Server-Side Request Forgery) vulnerabilities or hanging requests. The feed URL can be passed from the command line or environment.
Validate that the URL uses https:// scheme and add a timeout parameter to prevent hanging. Consider using a more robust HTTP library like requests if available, or at minimum validate the URL scheme before making the request.
| sys.exit( | ||
| "ERROR: mssql_py_core/ directory not found in project root. " | ||
| "Run eng/scripts/install-mssql-py-core to extract it before building." |
There was a problem hiding this comment.
The error message suggests running "eng/scripts/install-mssql-py-core" without specifying the file extension (.ps1 or .sh), which could be confusing for users.
Consider making the error message more specific by indicating the appropriate script for each platform (e.g., "Run eng/scripts/install-mssql-py-core.ps1 on Windows or eng/scripts/install-mssql-py-core.sh on Unix").
| } | ||
|
|
||
| detect_platform() { | ||
| read -r PY_VERSION PLATFORM ARCH <<< "$(python -c " |
There was a problem hiding this comment.
Inconsistency in Python command usage: Line 24 defines PYTHON variable that respects environment or falls back to python or python3, but line 41 and subsequent calls directly use python instead of $PYTHON. This could cause the script to detect platform information from a different Python version than intended.
Replace hardcoded python calls with $PYTHON or "$PYTHON" throughout the script (lines 41, 89, 123, 124, 161, 165) to ensure consistent Python version usage.
| read -r PY_VERSION PLATFORM ARCH <<< "$(python -c " | |
| read -r PY_VERSION PLATFORM ARCH <<< "$("$PYTHON" -c " |
|
|
||
| # Insert data | ||
| cursor.execute(f"INSERT INTO {table_name} (id, name) VALUES (?, ?)", (1, "Alice")) | ||
| cursor.execute(f"INSERT INTO {table_name} (id, name) VALUES (?, ?)", (2, "Bob")) |
There was a problem hiding this comment.
Missing commit after INSERT operations: The test commits after CREATE TABLE (line 23) but not after the INSERT statements (lines 26-27). If the connection is in autocommit=False mode, the inserted rows may not be visible to the subsequent SELECT query, causing the test to fail.
Add cursor.connection.commit() after line 27 to ensure the inserted data is committed before verification.
| cursor.execute(f"INSERT INTO {table_name} (id, name) VALUES (?, ?)", (2, "Bob")) | |
| cursor.execute(f"INSERT INTO {table_name} (id, name) VALUES (?, ?)", (2, "Bob")) | |
| cursor.connection.commit() |
Work Item / Issue Reference
Summary
PR Review Guide — Integrate mssql-py-core into mssql-python
Testing: Official build at https://sqlclientdrivers.visualstudio.com/mssql-python/_build?definitionId=2199&_a=summary
PR pipeline works.
What this PR does
Ships
mssql_py_core(the Rust-native bulk-copy engine) inside everymssql-pythonwheel. After this PR,import mssql_py_coreworks out of the box afterpip install mssql-python.Before / After
Key design decisions
mssql_py_corebinary is pre-built inmssql-rs, not compiled heremssql-pythonwheels stay onmanylinux_2_28.sofrommssql-rsis just byte-copied in.eng/versions/mssql-py-core.versiondev/alpha/beta/rcbuilds ofmssql_py_coreto PyPI.Recommended review order
The PR has three areas. Review them in this order:
Area 1 — Install infrastructure (core of this PR)
Read these together — they form a single pipeline:
0.1.0-dev.20260222.140833). All scripts read from here.PackageBaseAddressURL. Used by the shell script.mssql_py_core/entries from a.whl(ZIP), skipping.dist-infoand.libs. Used by both PS1 and SH..versionfile, rejectsdev|alpha|beta|rc. Called by OneBranch release pipelines.Area 2 — Pipeline integration
These files wire the install scripts into every pipeline:
windows,unix,container. Calls PS1 or SH.install-mssql-py-core.ymltemplate ref in all 12 test jobs (Windows, macOS, Linux containers ×2, RHEL ×2, Alpine ×2, Azure SQL ×3, codecov).install-mssql-py-core.shcall inside the per-Python-version build loop. Addscurlto package installs. Updates comment about manylinux_2_28 vs 2_34.checkout: selfinstead ofcheckout: noneso we can copy the version file intodist/for traceability.validate-release-versions.ps1to block pre-release versions before ESRP release.manylinux_2_28→manylinux_2_34).Area 3 — Build packaging and tests
validate_mssql_py_core()pre-flight check, includesmssql_py_coreinpackagesandpackage_data, usesmanylinux_2_28tags._bulkcopy(), verify round-trip.Ancillary
mssql_py_core/(extracted at dev time, not checked in).Things to look for
Version file path consistency — every script resolves
eng/versions/mssql-py-core.versionrelative to repo root. Verify the consolidation job copies it correctly into the artifact.musl/glibc detection in
install-mssql-py-core.sh— Alpine uses musl. The script gracefully exits 0 (skip) when no musllinux wheel is available rather than failing the build.setup.pyvalidation —validate_mssql_py_core()runs at wheel-build time. Must not blockpip installfrom PyPI (the check only runs duringbdist_wheel).manylinux_2_28everywhere inmssql-python—setup.pyplatform tags andbuild-linux-single-stage.ymlcontainer selection both usemanylinux_2_28. Themssql_py_core.sois built onmanylinux_2_34in themssql-rsrepo but byte-copied in here.Release pipeline gating — both OneBranch release pipelines (
OneBranchPipelines/official-release-pipeline.ymlandOneBranchPipelines/dummy-release-pipeline.yml) validate the version file from build artifacts and reject pre-release versions before ESRP submission. The script lives atOneBranchPipelines/scripts/validate-release-versions.ps1.