Skip to content

feat(cli,mcp): add namespace support to destination smoke tests#999

Merged
Aaron ("AJ") Steers (aaronsteers) merged 4 commits intomainfrom
devin/1773874618-namespace-support
Mar 19, 2026
Merged

feat(cli,mcp): add namespace support to destination smoke tests#999
Aaron ("AJ") Steers (aaronsteers) merged 4 commits intomainfrom
devin/1773874618-namespace-support

Conversation

@aaronsteers
Copy link
Contributor

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Mar 18, 2026

Summary

Adds namespace isolation to destination smoke tests using a zz_deleteme_yyyymmdd_hhmm naming convention. Every smoke test run now writes to a dedicated namespace, making test data easy to identify and safe for automated cleanup (any namespace containing deleteme is fair game for DROP SCHEMA-style cleanup scripts).

Changes across 4 files:

  • airbyte/_util/destination_smoke_tests.pygenerate_namespace() helper, new namespace_suffix and reuse_namespace params on run_destination_smoke_test(), namespace passed through to source config, namespace field added to DestinationSmokeTestResult
  • airbyte/cli/smoke_test_source/source.pynamespace config field in spec, set on streams in discover() and on records in read()
  • airbyte/cli/pyab.py--namespace-suffix and --reuse-namespace CLI options
  • airbyte/mcp/local.py — corresponding MCP tool parameters

Review & Testing Checklist for Human

  • Namespace is now mandatory on every smoke test run — there is no way to opt out. run_destination_smoke_test() always generates a namespace unless reuse_namespace is provided. Verify this is the desired behavior, since pre-existing callers will now get a namespace injected where there was none before.
  • namespace is a required field on DestinationSmokeTestResult (no default) — this is a breaking change to the model. Any code constructing this result without namespace will fail.
  • No cleanup logic is implemented yet — the reuse_namespace docstring says "cleanup is skipped" (implying cleanup exists otherwise), but there is no cleanup code in the finally block. Verify this is acceptable for the first increment.
  • Test with a real destination (e.g. Snowflake, MotherDuck) to confirm namespace flows end-to-end. Local testing with destination-dev-null (JDK) failed — the JDK destination received records with namespace=null despite the Python pipeline correctly setting namespace on both catalog and records. This may be a pre-existing PyAirbyte write-pipeline issue with namespace propagation to JDK destinations.

Notes

  • The namespace = config.get("namespace") call in source.py read() is evaluated inside the per-stream loop but doesn't vary per stream — minor inefficiency, not a bug.
  • PLR0913/PLR0917 lint suppression was added to the MCP tool since it now has 9 parameters.

Link to Devin session: https://app.devin.ai/sessions/9c72389579884c06bf18cef11c4550e8
Requested by: Aaron ("AJ") Steers (@aaronsteers)

Summary by CodeRabbit

  • New Features
    • Namespace customization for destination smoke tests (optional suffix and manual reuse).
    • Automatic timestamped namespace generation for isolation.
    • CLI and API options to set or reuse namespaces when running smoke tests.
    • Emitted test streams and records include and respect the specified namespace.

Open with Devin

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

- Generate zz_deleteme_yyyymmdd_hhmm namespaces for test isolation
- Add --namespace-suffix for concurrent run disambiguation
- Add --reuse-namespace for multi-test runs on same namespace
- Inject namespace into smoke test source streams and records
- Include namespace in DestinationSmokeTestResult

Co-Authored-By: AJ Steers <aj@airbyte.io>
@devin-ai-integration
Copy link
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1773874618-namespace-support' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1773874618-namespace-support'

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /uv-lock - Updates uv.lock file
  • /test-pr - Runs tests with the updated PyAirbyte
  • /prerelease - Builds and publishes a prerelease version to PyPI
📚 Show Repo Guidance

Helpful Resources

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

📝 Edit this welcome message.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review March 18, 2026 23:16
Copilot AI review requested due to automatic review settings March 18, 2026 23:16
devin-ai-integration[bot]

This comment was marked as resolved.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 18, 2026

📝 Walkthrough

Walkthrough

Added namespace support for destination smoke tests: auto-generate timestamped namespaces or reuse an existing one. The namespace is propagated from CLI/MCP through core test runner into the smoke-test source, which applies it to streams and emitted records.

Changes

Cohort / File(s) Summary
Core namespace utilities
airbyte/_util/destination_smoke_tests.py
Added NAMESPACE_PREFIX, DEFAULT_NAMESPACE_SUFFIX, generate_namespace() for timestamped namespaces; DestinationSmokeTestResult gains namespace field; get_smoke_test_source() and run_destination_smoke_test() signatures updated to accept and propagate namespace, namespace_suffix, and reuse_namespace.
CLI integration
airbyte/cli/pyab.py
destination_smoke_test() CLI handler adds --namespace-suffix and --reuse-namespace options and passes namespace_suffix/reuse_namespace into run_destination_smoke_test(); help text/examples updated.
MCP (local) API
airbyte/mcp/local.py
destination_smoke_test() API surface gains namespace_suffix and reuse_namespace parameters (with Field metadata) and forwards them to run_destination_smoke_test().
Smoke test source connector
airbyte/cli/smoke_test_source/source.py
_build_streams_from_scenarios() accepts namespace and sets it on created AirbyteStream objects; spec() includes a top-level namespace property; discover() and read() read namespace from config and propagate it into stream descriptors, statuses, and AirbyteRecordMessage objects.

Sequence Diagram

sequenceDiagram
    participant User as User/CLI
    participant CLI as CLI Handler
    participant Core as Core Runner
    participant Source as Smoke Test Source
    participant Dest as Destination

    User->>CLI: destination_smoke_test(--namespace-suffix / --reuse-namespace)
    CLI->>Core: run_destination_smoke_test(namespace_suffix, reuse_namespace)
    Core->>Core: compute namespace (generate or reuse)
    Core->>Source: get_smoke_test_source(namespace=...)
    Source->>Source: discover(config with namespace)
    Source->>Source: _build_streams_from_scenarios(namespace=...)
    Source-->>Core: configured source
    Core->>Source: read(config with namespace)
    Source->>Dest: emit records/statuses with namespace
    Dest-->>Source: write to namespaced destination
    Source-->>Core: read complete
    Core-->>CLI: DestinationSmokeTestResult(namespace=...)
    CLI-->>User: show result
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Would you like me to add a short developer note or changelog entry suggestion to accompany these changes, wdyt?

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(cli,mcp): add namespace support to destination smoke tests' accurately and concisely captures the main change across the PR—adding namespace support to destination smoke tests with CLI and MCP integration.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch devin/1773874618-namespace-support
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@airbyte/_util/destination_smoke_tests.py`:
- Around line 197-198: Add an explicit validation that rejects callers passing
both namespace_suffix and reuse_namespace together: in the function that accepts
the parameters (look for parameters named namespace_suffix and reuse_namespace
in destination_smoke_tests.py) check at start that not (namespace_suffix and
reuse_namespace) and raise a clear exception (e.g., ValueError or the CLI's
argument error type) if both are provided; do the same validation where those
parameters are handled around the block referenced at lines ~230-233 so the
mutually-exclusive intent is enforced rather than silently letting
reuse_namespace override namespace_suffix.
- Around line 40-55: The generated namespace in generate_namespace uses
minute-resolution timestamp ("%Y%m%d_%H%M") causing collisions for runs within
the same minute; change the timestamp to include finer resolution (e.g., seconds
"%S" or milliseconds) or append a short collision-resistant token (e.g., first 8
chars of uuid4) to make namespaces unique across parallel/quick reruns while
preserving the optional namespace_suffix and NAMESPACE_PREFIX usage in the
function.

In `@airbyte/cli/pyab.py`:
- Around line 703-711: Update the help text for the click option
"--reuse-namespace" (parameter reuse_namespace) to soften the cleanup claim:
remove or rephrase "cleanup is skipped" and instead state that specifying an
exact namespace prevents generation of a new namespace and that the CLI does not
automatically perform cleanup of that namespace (so users must manage cleanup
themselves); keep the note about it being useful for running a second test
against an already-populated namespace.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 99d7510c-c9ce-45f1-a9a7-37153893006d

📥 Commits

Reviewing files that changed from the base of the PR and between 00ab7ca and 09d36b1.

📒 Files selected for processing (4)
  • airbyte/_util/destination_smoke_tests.py
  • airbyte/cli/pyab.py
  • airbyte/cli/smoke_test_source/source.py
  • airbyte/mcp/local.py

Co-Authored-By: AJ Steers <aj@airbyte.io>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds per-run namespace isolation for destination smoke tests so each run writes into its own dedicated schema/namespace (using a zz_deleteme_... naming convention), exposed consistently through the shared helper, CLI, and MCP tool.

Changes:

  • Introduces namespace generation + propagation through run_destination_smoke_test() and returns the namespace in DestinationSmokeTestResult.
  • Extends the smoke-test source connector to accept a namespace config and apply it to discovered streams and emitted records.
  • Adds --namespace-suffix / --reuse-namespace options to the CLI and matching parameters to the MCP tool.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

File Description
airbyte/_util/destination_smoke_tests.py Adds namespace generation/selection and includes namespace in the result model.
airbyte/cli/smoke_test_source/source.py Adds namespace to the source spec; applies it to streams and record messages.
airbyte/cli/pyab.py Exposes namespace controls via CLI flags and passes them to the shared helper.
airbyte/mcp/local.py Adds MCP tool parameters for namespace controls and passes them to the shared helper.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link

github-actions bot commented Mar 18, 2026

PyTest Results (Fast Tests Only, No Creds)

343 tests  ±0   343 ✅ ±0   5m 49s ⏱️ +5s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit a5fff9e. ± Comparison against base commit 00ab7ca.

♻️ This comment has been updated with latest results.

@aaronsteers
Copy link
Contributor Author

Devin, reply to any feedback you haven't replied to. Thx

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte/cli/smoke_test_source/source.py (1)

399-404: ⚠️ Potential issue | 🟠 Major

Confirm: This namespace gets dropped when flowing through from_read_result()

Your code here is correct, but AirbyteMessageIterator.from_read_result() (lines 103–113 in airbyte/_message_iterators.py) hardcodes namespace=None with an explicit comment that it doesn't handle namespaces. When a destination writes from a ReadResult (e.g., after source.read() is called), the messages flow through from_read_result() and the namespace is dropped—this happens in both destination.write(source_data=read_result) calls at lines 216 and 221 in airbyte/destinations/base.py.

The test at test_destination_write_from_read_result() confirms this pattern: source → read()ReadResultdestination.write(). Is this namespace loss intentional, or does it need fixing?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/cli/smoke_test_source/source.py` around lines 399 - 404,
from_read_result currently hardcodes namespace=None and thus drops namespaces
when converting a ReadResult into AirbyteMessage objects; update
AirbyteMessageIterator.from_read_result to preserve per-record namespace by
reading namespace from the incoming AirbyteRecord (or ReadResult record) and
passing it to AirbyteRecordMessage (instead of None), so that messages yielded
by from_read_result keep record.namespace and downstream destination.write calls
(which consume ReadResult) retain namespaces; locate and change the logic in
AirbyteMessageIterator.from_read_result and ensure any use of namespace=None is
replaced with the record's namespace when present.
♻️ Duplicate comments (2)
airbyte/_util/destination_smoke_tests.py (2)

58-59: ⚠️ Potential issue | 🟠 Major

Could we make generated namespaces collision-resistant, wdyt?

Minute-level timestamps can collide in parallel/rapid reruns and weaken the isolation this PR introduces.

Possible fix
 from datetime import datetime, timezone
+from uuid import uuid4
@@
-    ts = now.strftime("%Y%m%d_%H%M")
-    return f"{NAMESPACE_PREFIX}_{ts}_{suffix}"
+    ts = now.strftime("%Y%m%d_%H%M%S")
+    token = uuid4().hex[:8]
+    return f"{NAMESPACE_PREFIX}_{ts}_{token}_{suffix}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/_util/destination_smoke_tests.py` around lines 58 - 59, The generated
namespace uses minute-level timestamps (ts = now.strftime("%Y%m%d_%H%M")) which
can collide; update the namespace generation in destination_smoke_tests.py to be
collision-resistant by including higher-resolution time (seconds or
milliseconds) or appending a short random/UUID component so the returned string
constructed from NAMESPACE_PREFIX, ts and suffix is unique (refer to the ts
variable, NAMESPACE_PREFIX and the return expression) — ensure the resulting
name format remains parseable but adds uniqueness to avoid parallel/rapid rerun
collisions.

201-202: ⚠️ Potential issue | 🟡 Minor

Could we reject namespace_suffix and reuse_namespace together, wdyt?

Right now reuse_namespace silently wins, which makes conflicting inputs ambiguous.

Possible guardrail
 def run_destination_smoke_test(
@@
 ) -> DestinationSmokeTestResult:
@@
+    if namespace_suffix and reuse_namespace:
+        raise PyAirbyteInputError(
+            message="`namespace_suffix` and `reuse_namespace` are mutually exclusive.",
+        )
+
     # Determine namespace
     namespace = reuse_namespace or generate_namespace(
         namespace_suffix=namespace_suffix,
     )

Also applies to: 235-238

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/_util/destination_smoke_tests.py` around lines 201 - 202, Add an
explicit validation that disallows passing both namespace_suffix and
reuse_namespace together: in the function that declares the parameters
namespace_suffix and reuse_namespace add a guard at the top that checks if both
are not None and raise a clear exception (e.g., ValueError) with a message
stating they are mutually exclusive; apply the same guard where the alternate
signature/overload appears (the other spot referencing
namespace_suffix/reuse_namespace around the later block) so callers get a
deterministic error instead of reuse_namespace silently winning.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@airbyte/_util/destination_smoke_tests.py`:
- Around line 227-229: The docstring for the parameter reuse_namespace currently
states "cleanup is skipped" which is misleading because this function has no
cleanup behavior; update the docstring for reuse_namespace (the parameter in
this module's destination_smoke_tests function) to remove the claim about
skipping cleanup and instead say something like: "When set, no new namespace is
generated; any cleanup of the reused namespace is the caller's responsibility."
Ensure you change only the docstring text referencing reuse_namespace to reflect
the actual behavior.

---

Outside diff comments:
In `@airbyte/cli/smoke_test_source/source.py`:
- Around line 399-404: from_read_result currently hardcodes namespace=None and
thus drops namespaces when converting a ReadResult into AirbyteMessage objects;
update AirbyteMessageIterator.from_read_result to preserve per-record namespace
by reading namespace from the incoming AirbyteRecord (or ReadResult record) and
passing it to AirbyteRecordMessage (instead of None), so that messages yielded
by from_read_result keep record.namespace and downstream destination.write calls
(which consume ReadResult) retain namespaces; locate and change the logic in
AirbyteMessageIterator.from_read_result and ensure any use of namespace=None is
replaced with the record's namespace when present.

---

Duplicate comments:
In `@airbyte/_util/destination_smoke_tests.py`:
- Around line 58-59: The generated namespace uses minute-level timestamps (ts =
now.strftime("%Y%m%d_%H%M")) which can collide; update the namespace generation
in destination_smoke_tests.py to be collision-resistant by including
higher-resolution time (seconds or milliseconds) or appending a short
random/UUID component so the returned string constructed from NAMESPACE_PREFIX,
ts and suffix is unique (refer to the ts variable, NAMESPACE_PREFIX and the
return expression) — ensure the resulting name format remains parseable but adds
uniqueness to avoid parallel/rapid rerun collisions.
- Around line 201-202: Add an explicit validation that disallows passing both
namespace_suffix and reuse_namespace together: in the function that declares the
parameters namespace_suffix and reuse_namespace add a guard at the top that
checks if both are not None and raise a clear exception (e.g., ValueError) with
a message stating they are mutually exclusive; apply the same guard where the
alternate signature/overload appears (the other spot referencing
namespace_suffix/reuse_namespace around the later block) so callers get a
deterministic error instead of reuse_namespace silently winning.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9c63bb19-d24a-48c4-9b92-9bbcecc5ab04

📥 Commits

Reviewing files that changed from the base of the PR and between 09d36b1 and 2b7b2a8.

📒 Files selected for processing (4)
  • airbyte/_util/destination_smoke_tests.py
  • airbyte/cli/pyab.py
  • airbyte/cli/smoke_test_source/source.py
  • airbyte/mcp/local.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte/cli/pyab.py
  • airbyte/mcp/local.py

…implemented)

Co-Authored-By: AJ Steers <aj@airbyte.io>
@devin-ai-integration
Copy link
Contributor

Good catch — this is a pre-existing limitation in PyAirbyte's write pipeline, not introduced by this PR. The from_read_result() path (source → read()ReadResultdestination.write()) drops namespace because ReadResult stores records in DuckDB which doesn't preserve the original namespace.

However, the smoke test harness doesn't go through from_read_result() — it uses destination.write(source_data=source_obj) which passes the source object directly (not a ReadResult). The source emits raw AirbyteMessage objects with namespace set, and these flow directly to the destination Docker container's stdin. So namespace is preserved in the smoke test flow.

The from_read_result() namespace loss is worth fixing separately but is out of scope for this PR.

@github-actions
Copy link

PyTest Results (Full)

413 tests  ±0   395 ✅ ±0   22m 58s ⏱️ - 1m 55s
  1 suites ±0    18 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit a5fff9e. ± Comparison against base commit 00ab7ca.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit a8666b2 into main Mar 19, 2026
21 checks passed
@aaronsteers Aaron ("AJ") Steers (aaronsteers) deleted the devin/1773874618-namespace-support branch March 19, 2026 00:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants