
DRAFT - new pyats support infra#459

Draft
oboehmer wants to merge 210 commits into main from release/pyats-integration-v1.1-beta

Conversation


@oboehmer oboehmer commented Jan 26, 2026

Raising this PR to start the review. The work isn't done yet; there are still critical issues to resolve prior to release.

aitestino added 30 commits June 24, 2025 16:33
…PyATS.

- Both Robot and PyATS paths can now use the same data merging logic (DRY)
- PyATS tests now also have access to the on-disk merged data model
- Clean separation of concerns (SRP)
…capabilities

- Added PyATS orchestration logic to enable running PyATS tests via the CLI.
- Introduced a new `PyATSOrchestrator` class for managing test execution and resource allocation.
- Mocked (not implemented) progress reporting for PyATS tests to match Robot Framework output format.
- Added authentication caching and connection pooling for efficient API interactions.
- Updated dependencies and configurations in `pyproject.toml` and `poetry.lock` to support new features.
…nd improved logging

- Implemented real-time progress reporting for PyATS tests, matching Robot Framework output format.
- Updated `ProgressReporter` to include color-coded status messages and track test execution details.
- Refactored `PyATSOrchestrator` to support structured progress events emitted by the new `ProgressReporterPlugin`.
- Improved error handling and logging throughout the PyATS execution process.
- Enhanced `AuthCache` and `ConnectionPool` for better resource management and token caching.
- Cleaned up code formatting and ensured consistent style across modules.
- Upgraded dependencies in `pyproject.toml` and `poetry.lock`, including `httpx` and `pytest-cov`.
- Enhanced `DataMerger` to ensure consistent return types from YAML loading.
- Added new method `load_yaml_file` to `DataMerger` for improved YAML file handling.
- Refactored type hints across various modules for better clarity and type safety.
- Improved retry strategy with configurable delays in `SmartRetry`.
- Cleaned up imports and ensured consistent code style across the project.
- Introduced the DATA_FILE environment variable in the PyATSOrchestrator to point to the merged data model, enhancing the orchestration process.
- Removed the abstract method decorator from get_connection_params in NACTestBase and replaced it with a NotImplementedError for better clarity in subclass implementation.
… PyATS orchestrator

- Added a pre-flight check in `PyATSOrchestrator` to validate required environment variables before running tests, ensuring clearer error messages for missing configurations.
- Introduced a new `terminal` utility module for consistent terminal output formatting, including color-coded messages for errors, warnings, and success notifications.
- Enhanced test result display with color differentiation for passed, failed, and errored tests, improving readability and user experience.
- Updated logging behavior to suppress verbose logs and ensure critical information is displayed appropriately.
…terminal output

- Added terminal utility functions for consistent color-coded output in `PyATSOrchestrator` and `ProgressReporter`, enhancing user experience during test execution.
- Updated test result display to differentiate between passed, failed, and errored statuses with appropriate colors.
- Improved console output formatting for various messages, including test summaries and output file listings, to ensure clarity and readability.
- Refactored logging behavior to suppress unnecessary verbose logs while highlighting critical information.
- Introduced a new HTML reporting module for PyATS, enabling asynchronous report generation that significantly improves performance.
- Added a `TestResultCollector` class to accumulate test results and command executions, ensuring process-safe handling of results.
- Implemented a `ReportGenerator` class for generating HTML reports from collected results, featuring robust error handling and pre-rendered metadata.
- Enhanced the `NACTestBase` class to support HTML reporting and result collection during test execution.
- Updated templates for generating detailed test case and summary reports, improving the presentation of test results.
- Added markdown support for rendering descriptions in HTML reports, enhancing report clarity and usability.
- Introduced methods for setting and clearing test context, allowing better tracking of command/API executions in reports.
- Added a context manager for temporary test context management, improving code clarity and usability.
- Wrapped the httpx client to automatically track API calls, enhancing the reporting of API interactions.
- Updated the TestResultCollector to include optional test context in command execution records, improving the detail in reports.
- Enhanced HTML report templates to display test context, providing clearer insights into test executions.
… API-based still functional. SSH/d2d staged but needs testing.

- Added core components shared across the nac-test framework, including constants for retry logic and timeouts.
- Introduced data models for test results and statuses to standardize reporting across different test architectures.
- Enhanced the PyATS integration by restructuring imports and ensuring consistent access to shared components.
- Implemented a new SSH testing infrastructure to support device-centric testing capabilities.
- Established a foundation for progress reporting and output processing within the PyATS execution environment.
…handling

- Still needs some work for SSH/d2d tests -- need to centralize PYTHONPATH and pass it to the subprocess
- Added properties to SSHTestBase for accessing the PyATS testbed object and the current device, enabling better integration with Genie parsers.
- Updated connection management to utilize hostname instead of device_id, aligning with the nac-test contract for required fields.
- Enhanced error handling and logging to provide clearer messages related to device connections and command execution.
- Improved job generation and subprocess handling to reflect hostname usage, ensuring consistency across the framework.
- Refactored command cache initialization to use hostname, enhancing clarity and adherence to the new structure.
…oved functionality

- Updated import path for `PyATSOrchestrator` to align with the new module structure.
- Increased `DEFAULT_TEST_TIMEOUT` from 5 minutes to 6 hours to accommodate longer test durations.
- Adjusted concurrency limits for API and SSH operations, raising `DEFAULT_API_CONCURRENCY` to 70 and `DEFAULT_SSH_CONCURRENCY` to 50.
- Removed the obsolete `reporting` module from the `pyats` package and introduced a new `pyats_core` module to centralize PyATS functionalities.
- Added new constants and utilities in the `pyats_core` module to support enhanced reporting and orchestration capabilities.
- Introduced a default buffer limit for subprocess output, set to 10MB, which can be overridden via the PYATS_OUTPUT_BUFFER_LIMIT environment variable.
- Implemented error handling for output lines exceeding the buffer limit, including logging warnings and attempts to clear the buffer.
- Enhanced the output processing logic to reset error counters on successful reads, improving robustness during subprocess execution.
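
The bounded-line policy described above can be sketched as a pure-Python stand-in (the real code streams from the subprocess; only the environment-variable name is taken from the commit message):

```python
import os

# 10 MB default line budget, overridable via PYATS_OUTPUT_BUFFER_LIMIT.
DEFAULT_BUFFER_LIMIT = int(os.environ.get("PYATS_OUTPUT_BUFFER_LIMIT", 10 * 1024 * 1024))

def split_lines_bounded(chunks, limit=DEFAULT_BUFFER_LIMIT):
    """Split an iterable of byte chunks into text lines, discarding any
    line that grows past `limit` bytes instead of buffering it forever."""
    buf = b""
    dropping = False
    for chunk in chunks:
        buf += chunk
        while True:
            nl = buf.find(b"\n")
            if nl == -1:
                break
            line, buf = buf[:nl], buf[nl + 1:]
            if dropping:
                dropping = False  # tail of an oversized line: discard it
            else:
                yield line.decode(errors="replace")
        if len(buf) > limit:
            dropping = True  # oversized line in progress: drop the buffer
            buf = b""
    if buf and not dropping:
        yield buf.decode(errors="replace")
```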
- Introduced a new `format_skip_message` function to format enhanced skip messages with rich content, converting markdown-like formatting into HTML.
- Updated HTML templates to support the new skip message formatting, including styling for skipped results and detailed skip messages.
- Added a new CSS variable for skip status and adjusted existing styles for better visual representation of skipped tests in reports.
…ariable overrides

- Modified `DEFAULT_API_CONCURRENCY` and `DEFAULT_SSH_CONCURRENCY` to allow configuration via `NAC_API_CONCURRENCY` and `NAC_SSH_CONCURRENCY` environment variables, enhancing flexibility in concurrency management.
- Added import of `os` module to facilitate environment variable retrieval.
- Introduced new constants for multi-job execution configuration, including `TESTS_PER_JOB`, `MAX_PARALLEL_JOBS`, and `JOB_RETRY_ATTEMPTS`, to enhance job management and prevent resource exhaustion.
- Added a default buffer limit of 10MB for subprocess output handling, improving the management of large output lines during test execution.
- Updated the `__all__` export list to include the new constants for backward compatibility.
…estrator

- Added calls to `cleanup_pyats_runtime()` to ensure a clean state before running tests.
- Introduced conditional cleanup of old test outputs when running in CI/CD environments, utilizing `cleanup_old_test_outputs()` with a retention period of 3 days.
- These enhancements improve the reliability and cleanliness of the test execution environment.
…andling

- Introduced a batching reporter to manage high-volume PyATS reporter messages, preventing bottlenecks during tests with many steps.
- Implemented dual-path reporting for sending messages to the PyATS reporter, ResultCollector, and an emergency JSON dump for message recovery.
- Enhanced HTTP method tracking with aggressive retry logic for APIC connections, improving resilience against network issues.
- Updated cleanup procedures to ensure proper shutdown of the batching reporter and uninstalling of step interceptors after test execution.
- Improved logging for better visibility into message handling and error recovery processes.
- Added the "--no-xml-report" option to subprocess commands in the SubprocessRunner class to enhance command execution flexibility.
- Removed the default buffer limit constant definition, now relying on the imported DEFAULT_BUFFER_LIMIT from constants for better configuration management.
- Improved logging to maintain clarity in command execution outputs.
- Introduced a new batching reporter to efficiently manage high-volume PyATS reporter messages, preventing server crashes during tests with numerous steps.
- Implemented a queuing system to handle burst conditions and ensure graceful degradation under extreme load.
- Added features such as adaptive batch sizing, memory tracking, and thread-safe operations to improve performance and reliability.
- Included detailed logging for monitoring message processing and overflow conditions, enhancing visibility into the system's behavior.
- This implementation addresses the challenges of processing large volumes of messages, ensuring stability and efficiency in reporting.
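
A minimal sketch of such a batching reporter (the `sink` callable and all names here are assumptions, not the real API; the actual implementation also does adaptive batch sizing and memory tracking):

```python
import queue
import threading

class BatchingReporter:
    """Buffer reporter messages on a bounded queue and deliver them in
    batches from a worker thread, so a flood of step events cannot stall
    the test process. Overflow is dropped (graceful degradation)."""

    def __init__(self, sink, batch_size=50, max_queue=10_000):
        self._sink = sink
        self._batch_size = batch_size
        self._queue = queue.Queue(maxsize=max_queue)
        self.dropped = 0
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def report(self, message):
        try:
            self._queue.put_nowait(message)
        except queue.Full:
            self.dropped += 1  # count instead of blocking the test

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # shutdown sentinel
                return
            batch = [item]
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            if None in batch:  # sentinel swept up mid-batch
                batch.remove(None)
                self._sink(batch)
                return
            self._sink(batch)

    def shutdown(self):
        self._queue.put(None)
        self._worker.join(timeout=5)
```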
…orage

- Changed the results storage format from JSON to JSONL for improved performance and streaming capabilities.
- Implemented immediate writing of test results and command execution records to disk, enhancing data integrity and reducing memory usage.
- Updated the overall status determination logic to utilize counters for real-time tracking, preserving existing business logic.
- Added a destructor to ensure proper closure of the JSONL file handle, enhancing resource management.
- This refactor improves the efficiency of result handling and aligns with modern data processing practices.
- Modified the ReportGenerator class to handle JSONL files instead of JSON for improved performance and streaming capabilities.
- Updated methods to read, clean up, and generate reports based on JSONL format, ensuring compatibility with the recent refactor of TestResultCollector.
- Implemented robust error handling for reading JSONL files, enhancing data integrity during report generation.
- This change aligns with the ongoing transition to JSONL format, optimizing result processing and resource management.
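
The JSONL write/read pattern can be sketched as follows (function names are illustrative; the real collector keeps a persistent file handle rather than reopening per record):

```python
import json

def append_result(path, record):
    """Append one result record as a single JSON line (JSONL), written
    to disk immediately so results survive a crash mid-run."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")

def read_results(path):
    """Stream records back, skipping corrupt lines (e.g. from a hard
    kill) instead of failing the whole report."""
    records = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # tolerate a partial/corrupt trailing line
    return records
```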
- Added a new StepInterceptor class to buffer PyATS reporter messages during step execution, preventing server overload when handling numerous steps.
- Wrapped Step.__enter__ and __exit__ methods to intercept and manage message flow, ensuring efficient reporting and maintaining step context hierarchy.
- Introduced a DummyReporter to replace the real reporter during batching, preventing duplicate messages and potential server crashes.
- Implemented error handling and fallback mechanisms to restore original behavior in case of interception failures.
- This enhancement aligns with the ongoing efforts to optimize message handling and improve the stability of the reporting system.
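
The interception technique itself is generic: wrap a context-manager class's `__enter__`/`__exit__` and restore the originals on uninstall. A sketch with a stand-in class (the real PyATS `Step` API is not modeled here):

```python
class StepInterceptor:
    """Wrap a context-manager class's __enter__/__exit__ to record events,
    with an uninstall() that restores the original behaviour (the fallback
    path the commit mentions)."""

    def __init__(self, step_cls, buffer):
        self._cls = step_cls
        self._buffer = buffer
        self._orig_enter = step_cls.__enter__
        self._orig_exit = step_cls.__exit__

    def install(self):
        interceptor = self

        def enter(step):
            interceptor._buffer.append(("enter", step))
            return interceptor._orig_enter(step)

        def exit_(step, exc_type, exc, tb):
            interceptor._buffer.append(("exit", step))
            return interceptor._orig_exit(step, exc_type, exc, tb)

        self._cls.__enter__ = enter
        self._cls.__exit__ = exit_

    def uninstall(self):
        self._cls.__enter__ = self._orig_enter
        self._cls.__exit__ = self._orig_exit
```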
- Introduced `cleanup_pyats_runtime` function to remove PyATS runtime directories before test execution, essential for CI/CD environments to prevent disk exhaustion.
- Added `cleanup_old_test_outputs` function to delete test output files older than a specified number of days, enhancing resource management in CI/CD workflows.
- Updated `__init__.py` to include new cleanup utilities in the module's export list, ensuring accessibility throughout the nac-test framework.
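
The retention policy could be sketched like this (the function name matches the commit; the body is an assumption):

```python
import time
from pathlib import Path

def cleanup_old_test_outputs(output_dir, days=3):
    """Delete files under `output_dir` whose mtime is older than `days`
    days. Returns the number of files removed."""
    cutoff = time.time() - days * 86400
    removed = 0
    root = Path(output_dir)
    if not root.is_dir():
        return 0
    for path in root.rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed += 1
    return removed
```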
- Updated retry configuration to improve recovery time during high-scale operations, increasing MAX_RETRIES to 7 and maintaining INITIAL_DELAY at 5 seconds.
- Added a TODO comment to consider moving retry constants to a separate constants file for better management.
- Adjusted comments for clarity regarding the purpose of the retry logic in handling controller stress.
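
The backoff policy, sketched with the constants cited above (SmartRetry's real interface may differ; `sleep` is injectable here only to keep the sketch testable):

```python
import time

MAX_RETRIES = 7       # values from the commit message
INITIAL_DELAY = 5.0   # seconds

def retry(func, max_retries=MAX_RETRIES, initial_delay=INITIAL_DELAY, sleep=time.sleep):
    """Call `func`, retrying with exponential backoff on any exception;
    re-raise after the final attempt."""
    delay = initial_delay
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise
            sleep(delay)
            delay *= 2  # back off to let a stressed controller recover
```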
…erator

- Removed temporary comments related to MVP in main.py for clarity.
- Improved task ID generation in job_generator.py by using meaningful names derived from test file names, enhancing readability and traceability during execution.
- Introduced a Makefile with a `prepush` target that automates formatting, linting, and pre-commit checks.
- The target runs multiple passes (default is 4) to ensure compliance with CI/CD rules, enhancing local development practices.
- This addition streamlines the process of preparing code for submission, improving overall code quality and consistency.
aitestino and others added 23 commits February 20, 2026 15:38
* feat(constants): add IS_UNSUPPORTED_MACOS_PYTHON platform constant

Adds a boolean constant that evaluates to True when running on macOS
with Python < 3.12, which has known incompatibilities that cause PyATS
tests to hang or crash.
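
The constant is essentially a two-line platform gate (a sketch consistent with the description; the later refactor moves it to `nac_test/core/constants.py`):

```python
import sys

# True on macOS with Python < 3.12, where PyATS runs are known to hang or crash.
IS_MACOS = sys.platform == "darwin"
IS_UNSUPPORTED_MACOS_PYTHON = IS_MACOS and sys.version_info < (3, 12)
```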

* feat(platform): add shared macOS Python version gate

Centralizes the unsupported macOS Python check into a single utility
function that exits with code 1 and displays upgrade instructions.
Used by both CLI and orchestrator entry points to eliminate duplication.

* feat(cli): add hard exit for unsupported macOS Python

Calls the shared version gate immediately after logging setup so the
CLI exits before any expensive operations when running on macOS with
Python < 3.12.

* fix(orchestrator): upgrade soft warning to hard exit for macOS Python

Replaces the Python 3.11 soft warning with a hard exit via the shared
version gate. Acts as defense-in-depth for programmatic usage that
bypasses the CLI entry point.

* test(cli): add tests for macOS unsupported Python hard exit

Verifies exit code, error message content, upgrade instructions, and
early exit behavior (no expensive operations called).

* test(orchestrator): add tests for defense-in-depth Python version gate

Verifies the orchestrator-level check exits on unsupported macOS Python,
passes on supported platforms, and triggers during pyats-only mode.

* refactor(tests): extract clean_controller_env fixture to shared conftest

Moves the duplicated controller environment cleanup fixture to
tests/unit/conftest.py and updates consumers to use it. Removes
the redundant copy from the integration test since the integration
conftest already provides clear_controller_credentials with autouse.

* docs(readme): clarify platform-specific Python version requirements

Replaces the single-line Python requirement with a clear breakdown:
Linux/Windows need 3.10+, macOS needs 3.12+. Includes upgrade
instructions for common package managers.

* fix(platform): soften upgrade instructions wording

Changes "Common ways to install Python 3.12" to "Some common ways to
install it" so the list doesn't imply these are the only options.

* refactor(constants): move IS_MACOS and IS_UNSUPPORTED_MACOS_PYTHON to core

Platform detection constants are framework-wide, not PyATS-specific.
Moves them to nac_test/core/constants.py and re-exports from
pyats_core/constants.py for existing consumers.

Fixes #561
* fix: skip controller check when in --render-only mode

There is no need for a controller check when we are only rendering robot
files. This is also for backward-compatibility reasons: the check was never
required before, and requiring it now would affect current workflows.

* test: add unit test to ensure PyATSOrchestrator is never called in render-only mode

Add test that verifies the critical invariant: when --render-only is set,
PyATSOrchestrator must never be instantiated, regardless of whether PyATS
test files exist in the templates directory.

Also skip PyATS execution in render-only mode since PyATS tests are Python
files that always execute (no render-only concept).

* remove redundant controller check in PyatsOrchestrator

the controller detection is already done in CombinedOrchestrator
and we terminate nac-test if none is found (unless render_only is set).
Also adjusted the type annotation from `str|None` to `str` to reflect
this

* Revert "remove redundant controller check in PyatsOrchestrator"

This reverts commit 26c2e8a.

* change self.controller_type sentinel to None

I did not realize that there is another controller detection test in PyatsOrchestrator, so None is a valid value

* fix test_render_only_mode_does_not_instantiate_pyats_orchestrator

* test: align test to other tests' pattern, improve test coverage

- Aligned test_render_only_mode_does_not_instantiate_pyats_orchestrator to
  perform full validation
- use future-proof pyats test content which also works after #511 gets
  merged
- verify controller detection is skipped and robot orchestrator is
  called

* restore uv.lock

* fix: skip PyATS execution in render-only mode

The render_only check was lost during rebase conflict resolution.
PyATSOrchestrator must not be instantiated when render_only=True.

* fix rebase breakage, avoid calling sys.exit() in PyATSOrchestrator.__init__

* add uv.lock

* adjust tests to no longer expect SystemExit for controller detection logic

* restore sys.exit() behaviour, should be handled in distinct issue/PR

* test: add explicit integration test to validate behaviour

Added an explicit test to validate that --render-only doesn't require
controller environment set.
Also undid temp fix for #431 in this test module, we can now run these
tests without controller environment (checking this also implicitly)

* fix typo in comment
* Add shared architecture detection helper for validators

Introduces is_architecture_active() helper function that provides a
lightweight check for whether an architecture environment is active by
testing if its URL environment variable is set.

This helper is shared across all architecture-specific validators (ACI,
SD-WAN, Catalyst Center) to avoid duplicating environment detection logic.

Rationale: Validators need to know which architecture is active to decide
whether to perform validation. Centralizing this check in common.py ensures
consistency and makes adding new architecture validators straightforward.
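
A minimal sketch of the helper (the architecture-to-variable mapping is inferred from commit messages elsewhere in this PR, and the real signature may differ):

```python
import os

# URL variable per architecture, inferred from the commit messages.
ARCH_URL_VARS = {"aci": "ACI_URL", "sdwan": "SDWAN_URL", "catalyst_center": "CC_URL"}

def is_architecture_active(arch: str) -> bool:
    """An architecture counts as active when its URL environment variable
    is set and non-empty (an empty string counts as not set)."""
    var = ARCH_URL_VARS.get(arch)
    return bool(var and os.environ.get(var))
```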

* Add ACI defaults validation with two-stage heuristic

Implements validate_aci_defaults() that fails fast when users forget to
pass the ACI defaults file, catching the common mistake of omitting
-d ./defaults/ before expensive data merge operations.

Two-stage validation approach:

Stage 1 (Instant path-based heuristic):
- Checks if path contains "default" or "defaults" as directory component
- For files: requires .yaml/.yml extension + "default" in filename
- Rejects false positives like "/users/defaultuser/" or "config.txt"

Stage 2 (Content-based fallback):
- Parses YAML files to find "defaults.apic" structure
- Non-recursive directory scanning for performance
- Handles both .yaml and .yml extensions

Security features:
- Rejects symlinks to prevent path traversal
- Limits file scanning to 3MB to prevent memory exhaustion

Performance: Path heuristic provides instant feedback for 95% of cases.
Content scanning only runs when path-based check doesn't match, adding
minimal overhead (~50-100ms) compared to time saved when defaults are
missing (users get immediate feedback instead of waiting for full merge).
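
The stage-1 path heuristic could look roughly like this (illustrative only, derived from the rules listed above; the real code also runs the stage-2 content scan):

```python
from pathlib import Path

def path_looks_like_defaults(path_str: str) -> bool:
    """Stage-1 heuristic: a .yaml/.yml file matches if 'default' appears
    in its stem; any other path matches only if a whole component is
    'default'/'defaults'. Substring hits like '/users/defaultuser/' and
    non-YAML files like 'defaults.txt' do not match."""
    p = Path(path_str)
    if p.suffix.lower() in {".yaml", ".yml"}:
        return "default" in p.stem.lower()
    return any(part.lower() in ("default", "defaults") for part in p.parts)
```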

* Add validators package public API exports

Exports validate_aci_defaults() and is_architecture_active() as the
public API for the validators package.

Enforces separation of concerns: validators package only exports
validation functions, not UI components. Display functions like
display_aci_defaults_banner() should be imported directly from
nac_test.cli.ui, maintaining clear architectural boundaries between
validation logic and presentation.

This design allows adding new architecture validators (SD-WAN, Catalyst
Center) by following the same pattern without coupling to UI concerns.

* Add terminal banner display for ACI defaults error

Implements display_aci_defaults_banner() to show a prominent, user-friendly
error message when ACI defaults file is missing.

BoxStyle dataclass design:
- Encapsulates box-drawing characters for terminal rendering
- Frozen dataclass ensures immutability
- Supports both ASCII (fallback) and Unicode (modern terminals)
- Includes emoji_adjustment field to handle Unicode width quirks

NO_COLOR environment variable support:
- Respects NO_COLOR standard for accessibility
- Falls back to plain ASCII when NO_COLOR is set or Unicode unavailable

Banner content includes:
- Clear error message explaining the requirement
- Working example command showing correct syntax
- Link to Cisco ACI documentation for context
- Professional formatting with borders and sections

Rationale: Validation errors need to be immediately visible and actionable.
A prominent banner with example command helps users fix the issue quickly
instead of hunting through error logs.

* Add UI package public API for banner display

Exports display_aci_defaults_banner() as the public interface for CLI
error banners. Future banner functions (SD-WAN, Catalyst Center) will
be exported from this package.

Architectural separation: UI package handles all user-facing display
logic, while validators package handles business logic. This separation
allows testing and modifying display behavior independently from
validation rules.

* Wire ACI defaults validation into CLI entry point

Adds validation check immediately before DataMerger.merge_data_files()
to fail fast when defaults are missing. Validation runs after output
directory creation but before expensive merge operation.

Execution flow:
1. Parse CLI arguments
2. Configure logging
3. Create output directory
4. → NEW: Validate ACI defaults (instant check)
5. Merge data files (expensive operation, 2-5 seconds)
6. Continue with orchestrator...

Error handling:
- Validation failure displays prominent banner via display_aci_defaults_banner()
- Exits with code 1 (failure) to prevent continuing with invalid config
- Blank lines before/after banner improve readability

Time savings: Users get immediate feedback instead of waiting 2-5 seconds
for data merge to complete before seeing validation error. Pre-flight check
philosophy: validate prerequisites before expensive operations.

* Add test directory structure for CLI components

Creates package structure mirroring source code organization:
- tests/unit/cli/validators/ for validator unit tests
- tests/unit/cli/ui/ for UI component unit tests

Empty __init__.py files make directories importable Python packages,
allowing pytest to discover tests and enabling relative imports within
test modules.

Test organization principle: Test structure should mirror source
structure. This makes finding tests for a given module intuitive and
maintains clear architectural boundaries in both source and test code.

* Add comprehensive unit tests for ACI defaults validator

Tests validate_aci_defaults() and helper functions with 32 test cases
covering environment detection, path matching, YAML parsing, security
features, and edge cases.

Test coverage includes:

Environment detection (3 tests):
- Validation skips when ACI_URL not set
- Validation runs when ACI_URL is set
- Empty string ACI_URL treated as not set

Path-based heuristic (9 tests):
- Matches "defaults" directory component
- Matches defaults.yaml and defaults.yml files
- Rejects non-YAML files (defaults.txt, defaults.json)
- Rejects substring matches (defaultuser, my-defaulted-config)
- Case-insensitive matching for directory names

YAML content validation (8 tests):
- Finds defaults.apic structure in YAML files
- Handles both .yaml and .yml extensions
- Rejects files missing "defaults:" or "apic:" keys
- Handles empty files and unreadable files gracefully

Security features (2 tests):
- Rejects symlinks to prevent path traversal
- Skips files larger than 3MB to prevent memory exhaustion

Performance features (1 test):
- Directory scanning is non-recursive (documented design decision)

Edge cases (9 tests):
- Two-stage validation (path check → content check)
- Multiple data paths with mixed content
- Content check runs as fallback when path check fails

Test quality: Uses real filesystem operations (tmp_path) not mocks,
validates actual application behavior not library functionality,
includes comprehensive docstrings documenting test intent.

* Add unit tests for ACI defaults banner display

Tests display_aci_defaults_banner() function with focus on output
content, NO_COLOR environment variable support, and error handling.

Test coverage includes:

Banner content validation (3 tests):
- Banner displays without raising exceptions
- Banner contains required error message text
- Banner includes example command with correct syntax

NO_COLOR support (2 tests):
- ASCII box style used when NO_COLOR environment variable is set
- Unicode box style used when NO_COLOR is not set

Test approach: Uses pytest's capsys fixture to capture stdout and
validate actual terminal output. Tests verify real banner display
behavior, not mocked components.

Note: Banner tests focus on content validation (error message, example
command) rather than exact formatting (border characters, spacing).
This design allows banner styling to evolve without breaking tests as
long as essential content remains present.

* Add integration tests for CLI ACI defaults validation

Implements 6 integration tests split into two test classes validating
end-to-end CLI behavior with ACI defaults validation.

TestCliAciValidationIntegration (4 tests using CliRunner):
- Tests validation triggers when ACI_URL set without defaults
- Tests validation skips when ACI_URL not set
- Tests validation passes when defaults directory provided
- Tests validation passes when YAML contains defaults.apic structure

TestCliAciValidationSubprocess (2 tests using subprocess.run):
- Tests actual nac-test CLI process execution with ACI_URL
- Tests CLI subprocess without ACI_URL environment variable
- Marked with skipif when CLI not installed in PATH

Integration test design:

CliRunner tests (lines 26-195):
- Use Typer's CliRunner for fast integration testing
- Mock expensive operations (DataMerger, CombinedOrchestrator)
- Validate CLI wiring: arguments → validators → UI → exit codes
- Test focus: Does validation logic integrate correctly with CLI?

Subprocess tests (lines 197-301):
- Execute real nac-test command in subprocess
- Zero mocking - tests production behavior
- Validate complete process: entry point → imports → execution → exit
- Test focus: Does CLI actually work end-to-end in production?

Test fixtures:
- clean_controller_env: Clears environment variables for isolation
- minimal_test_env: Creates temporary directories and YAML files
- cli_test_env: Subprocess-specific environment setup

Coverage: Tests validate that validation runs at correct point in CLI
flow (after argument parsing, before data merge), displays banner on
failure, returns correct exit codes, and handles environment variables.

* Fix CI: Add defaults.yaml to integration test fixtures

Integration tests in test_integration_robot_pabot.py set ACI_URL
environment variable via setup_bogus_controller_env fixture, which
triggers the new ACI defaults validation.

Adding defaults.yaml to test fixtures satisfies the validation check
without modifying test logic or environment setup. The file contains
minimal defaults.apic structure required by the validator.

This fixes 20 failing integration tests that were exiting with code 1
(validation failure) instead of code 0 (success).

* Fix CI: Add defaults.yaml to all integration test fixture directories

Additional fixture directories (data_env, data_list, data_list_chunked,
data_merge) also used by integration tests need defaults.yaml to satisfy
ACI validation when ACI_URL environment variable is set.

Ensures all 20 failing integration tests pass by providing the minimal
defaults.apic structure required by the validator in all test fixture
data directories.

* Fix CI: Add defaults.apic structure to data_merge test fixtures

The test_nac_test_render_output_model test passes individual YAML files
(file1.yaml, file2.yaml) via -d flags instead of a directory. Our
validator performs content-based checking on these files.

Adding defaults.apic structure to file1.yaml satisfies the validator's
content check. The expected result.yaml is updated to include the
defaults structure since it's part of the merged data model.

This fixes the 2 remaining failing integration tests.

* Add defaults.yaml to PyATS quicksilver test fixtures

The test_nac_test_pyats_quicksilver_api_only tests set architecture-specific
URL environment variables (ACI_URL, CC_URL, SDWAN_URL) which trigger the
ACI defaults validation. These fixture directories need minimal defaults.yaml
files with the required structure to satisfy the validation check.

Files added:
- tests/integration/fixtures/data_pyats_qs/aci/defaults.yaml
- tests/integration/fixtures/data_pyats_qs/catc/defaults.yaml
- tests/integration/fixtures/data_pyats_qs/sdwan/defaults.yaml

Each contains minimal mock structure: defaults.apic.version = 6.0

This fixes the failing CI test: test_nac_test_pyats_quicksilver_api_only[aci-ACI-1-0-0]

* Remove symlink rejection from ACI defaults validator

The symlink check in _file_contains_defaults_structure() was
inconsistent with the rest of the data loading chain — nac_yaml's
load_yaml_files() follows symlinks via os.path.isfile() and open()
without any detection. The check would reject legitimate symlinked
defaults files that the downstream DataMerger would accept fine.

The 3MB file size guard remains as a resource protection measure.
… check happy

fix parent branch pipeline, regression of 5381000
…haracters in Safari (#558)

* fix(html-reports): add UTF-8 charset declaration to prevent garbled characters in Safari

Without an explicit charset meta tag, Safari defaults to a different character
encoding, causing UTF-8 characters (arrows, checkmarks, etc.) to display as
garbled text like "â†’" instead of "→".

Added E2E tests to verify charset declaration is present in all generated HTML
reports (combined dashboard, Robot summary, PyATS API summary, PyATS D2D summary).

Fixes #484

* refactor(tests): consolidate charset check into verify_html_structure

Address PR review feedback to reduce helper surface area by merging
verify_html_charset() into verify_html_structure(). This creates a
single "is this valid HTML" gate that callers need to invoke, rather
than two separate helpers.

Changes:
- Merge UTF-8 charset validation into verify_html_structure()
- Remove standalone verify_html_charset() function
- Remove 4 separate test_*_has_utf8_charset test methods
- Update test_*_has_valid_html docstrings to reflect combined validation
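
A simplified sketch of the combined helper (the real test helper likely checks more of the document structure and tolerates more charset spellings):

```python
def verify_html_structure(html: str) -> bool:
    """One combined 'is this valid HTML' gate, per the review feedback:
    basic skeleton plus an explicit UTF-8 charset declaration (the
    Safari fix)."""
    lowered = html.lower()
    return ("<html" in lowered
            and "</html>" in lowered
            and '<meta charset="utf-8">' in lowered)
```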
…rator (#529)

* fix(pyats): add disconnect/wait timeout configuration to testbed generator

Add support for configuring disconnect and wait timeout parameters in PyATS testbed generator, with comprehensive test coverage for both baseline and merged configurations.

Ultraworked with Sisyphus (https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* docs: add note about POST_DISCONNECT_WAIT_SEC to arch doc

* use PYATS_POST_DISCONNECT_WAIT_SEC constant to set value

* refactor(tests): extract assert_connection_has_optimizations helper

Consolidate repeated connection optimization assertions into a reusable
helper function, improving test maintainability when adding new
optimization settings.

* tests: add defaults.apic.version to e2e fixtures to make ACI defaults check happy

* refactor: improve disconnect timeout constant naming and reduce duplication

Address PR review feedback:
- Rename PYATS_POST_DISCONNECT_WAIT_SEC to PYATS_POST_DISCONNECT_WAIT_SECONDS
  for clarity per Python style guide recommendations
- Use constant in test helper instead of hardcoding value
- Extract shared connection arguments dict to reduce duplication in
  _build_device_config() conditional branches

* fix(pyats): add GRACEFUL_DISCONNECT_WAIT_SEC and relocate test helper

Address PR review #3843924506 comments:

Issue #1 (settings ordering): Apply settings block to connection_args
BEFORE embedding in device_config, making the intent self-documenting
and removing implicit dependency on dict reference semantics.

Issue #2 (GRACEFUL_DISCONNECT_WAIT_SEC): Add PYATS_GRACEFUL_DISCONNECT_WAIT_SECONDS
constant (set to 0) alongside POST_DISCONNECT_WAIT_SEC, matching Unicon's
test suite patterns where both are commonly set together for test environments.

Issue #4 (test helper location): Move assert_connection_has_optimizations()
from test_testbed_generator.py to conftest.py - the conventional home for
shared test utilities. This prevents import chains across test files if
additional test modules need the helper in the future.

Also updates the test helper to verify both GRACEFUL_DISCONNECT_WAIT_SEC
and POST_DISCONNECT_WAIT_SEC settings.

* docs: add GRACEFUL_DISCONNECT_WAIT_SEC to architecture documentation

Update testbed example and performance note to reflect both disconnect
timeout settings (GRACEFUL_DISCONNECT_WAIT_SEC: 0 and POST_DISCONNECT_WAIT_SEC: 0)
which together skip Unicon's default 1s/10s cooldowns.

* test: add comment to e2e fixture data.yaml clarifying required structure

---------

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
* test: Add automated tests for hostname display in D2D reporting

Add comprehensive e2e tests to validate hostname display across all D2D
test reporting locations as specified in issue #482.

Changes:
- Add expected_d2d_hostnames field to E2EScenario dataclass
- Configure hostnames for D2D scenarios (SUCCESS, ALL_FAIL, MIXED,
  PYATS_D2D_ONLY, PYATS_CC)
- Add 6 hostname validation test methods to E2ECombinedTestBase:
  * Console output format: "Test Name (hostname) | STATUS |"
  * HTML summary tables and detail page headers
  * HTML filenames with sanitized hostnames
  * API test separation (no hostnames shown)
  * Character sanitization validation
- Add HTML parsing helpers for hostname extraction and validation
- Use exact sanitization logic matching base_test.py (r"[^a-zA-Z0-9]" → "_")
- Liberal pattern matching for resilience to format changes
- Smart skip logic for scenarios without D2D tests

Test coverage validates all aspects of hostname display implemented in PR #478:
- Console output displays hostnames in parentheses format
- HTML reports show hostnames in summary and detail views
- Filenames include properly sanitized hostnames
- API tests correctly exclude hostnames
- Cross-report consistency maintained

Resolves #482

* tests: add defaults.apic.version to e2e fixtures to make ACI defaults check happy

* refactor(e2e): remove unused hostname extraction functions

Remove HostnameDisplayInfo dataclass and four extraction functions
(extract_hostname_from_display_text, extract_hostnames_from_summary_table,
extract_hostname_from_detail_page_header, extract_hostnames_from_filenames)
that were added during development but never used in the final tests.

* refactor: consolidate hostname sanitization into shared utility

Create sanitize_hostname() in nac_test/utils/strings.py using pattern
[^a-zA-Z0-9_] to clearly indicate underscores are safe characters.
Update base_test.py and job_generator.py to use the shared utility
instead of duplicating the regex pattern inline.
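The shared utility can be sketched in a few lines; the pattern is taken from the commit message above, and the module location (`nac_test/utils/strings.py`) is as described there:

```python
import re


def sanitize_hostname(hostname: str) -> str:
    """Replace any character that is not alphanumeric or '_' with '_'.

    Pattern [^a-zA-Z0-9_] makes explicit that underscores are safe
    characters and pass through unchanged.
    """
    return re.sub(r"[^a-zA-Z0-9_]", "_", hostname)


assert sanitize_hostname("edge-router.site1") == "edge_router_site1"
```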

* refactor(e2e): validate expected_d2d_hostnames in E2EScenario

Add validation in E2EScenario.validate() ensuring expected_d2d_hostnames
is populated when has_pyats_d2d_tests > 0. This makes the redundant
"or not expected_d2d_hostnames" skip condition checks unnecessary in
the 5 hostname test methods, simplifying the test logic.
…553)

* refactor: remove legacy errorhandler, use CombinedResults as single source of truth for exit codes

## Summary

Removed the legacy errorhandler system that was interfering with proper exit code behavior.
Exit codes are now handled exclusively through CombinedResults.exit_code, providing a
single, predictable source of truth that follows Robot Framework conventions.

## Key Changes

**Errorhandler Removal:**
- Eliminated errorhandler dependency that was causing logger.error() calls to override intended exit codes
- Restored proper logger.error() usage throughout the codebase (no more workarounds with logger.warning())
- Developers can now use logger.error() naturally without unintended side effects on exit codes

**Single Source of Truth:**
- CombinedResults.exit_code is now the only exit code calculation method
- Removed unused TestResults.exit_code method that was never called in production
- Simplified architecture: exit codes are purely a CLI aggregation concern, not individual framework concern

## Benefits

- **Predictable Behavior**: Exit codes come from one clear path (CombinedResults → CLI)
- **No Hidden Side Effects**: Logging levels don't accidentally change exit codes
- **Cleaner Architecture**: No duplicate exit code logic across TestResults and CombinedResults
- **Developer Friendly**: logger.error() works as expected without exit code interference

## Tests

- **New CLI exit code tests**: Comprehensive test suite for main.py exit code behavior (9 tests)
- **Removed**: Unused TestResults.exit_code tests that tested non-production code paths
- All existing integration and E2E tests continue to pass

The errorhandler was a legacy mechanism from before CombinedResults existed. With proper
structured error handling now in place, the errorhandler created more problems than it
solved by interfering with the intended graduated exit code system.

* refactor(exit-codes): replace hardcoded exit codes with named constants

Introduce symbolic exit code constants in nac_test/core/constants.py
following Robot Framework conventions:

  EXIT_FAILURE_CAP = 250   # max test failure count reported
  EXIT_INVALID_ARGS = 252  # invalid RF args or no tests found
  EXIT_INTERRUPTED  = 253  # Ctrl+C / interrupted execution
  EXIT_ERROR        = 255  # infrastructure/execution errors

EXIT_SUCCESS (0) is deliberately not defined — it is a universal POSIX
convention that never changes, so a named constant adds no clarity.
Removed the spuriously added __all__ from constants.py.

Also fix template rendering error handling: instead of propagating
exceptions in render-only mode (which leaked as exit code 1, violating
the "1-250 = test failure count" convention), rendering failures are now
converted to TestResults.from_error() objects with descriptive messages.
main.py inspects the resulting CombinedResults.has_errors to exit with
EXIT_ERROR (255) and display an actionable "❌ Template rendering
failed: <reason>" message, consistent with all other infrastructure
failures.

Updated all tests to use the new constants instead of magic numbers,
except where literals are derived directly from test input (e.g.
`failed=3` → `assert exit_code == 3`) and therefore more readable as
literals.

* refactor(exit-codes): split EXIT_INVALID_ARGS into POSIX (2) and Robot (252)

EXIT_INVALID_ARGS was previously 252 (Robot Framework convention).
This was confusing because nac-test's own CLI argument errors should
follow POSIX/Typer conventions (exit code 2) rather than RF conventions.

Split into two distinct constants:
- EXIT_INVALID_ARGS = 2    (nac-test CLI arg errors, aligns with POSIX/Typer)
- EXIT_INVALID_ROBOT_ARGS = 252  (Robot Framework invalid args / no tests found)

This allows CI/CD to unambiguously distinguish exit code 2 from a
bad CLI invocation vs. 2 test failures (which would produce artifacts).

Updated all production code, tests, and README accordingly.

* refactor(exit-codes): replace string markers with ErrorType enum for type-safe exit code determination

Replace fragile string-based error detection (ERROR_MARKER_*) with a typed
ErrorType enum approach. This enables compile-time checking and explicit
exit code priority handling.

Changes:
- Add ErrorType enum (GENERIC, INVALID_ROBOT_ARGS, INTERRUPTED)
- Add error_type field to TestResults dataclass
- Update from_error() to accept optional error_type parameter
- Fix exit code priority: 253 (interrupted) > 252 (invalid args) > 255 (generic)
- Remove ERROR_MARKER_* constants from constants.py
- Add unit tests for ErrorType and exit code priority

* refactor(exit-codes): centralize error reporting in main.py

Move all user-facing error/success output to main.py:
- Display ❌ for errors, ⚠️ for warnings, ✅ for success
- Route all error messages to stderr consistently

Remove redundant error output from orchestrators:
- robot/orchestrator.py: remove typer.echo calls (-15 lines)
- combined_orchestrator.py: simplify exception handler (-17 lines)

Orchestrators now only log errors and store them in TestResults,
following single-responsibility principle.

* test: update controller detection tests to expect EXIT_ERROR (255)

Tests previously expected exit code 1 for controller detection failures.
With the new exit code scheme, infrastructure errors return EXIT_ERROR (255).

* Fix test comment with correct expected exit code

* refactor(exit-codes): remove unused ERROR_TYPE_EXIT_CODES dict

With only 3 error types, the explicit if chain in CombinedResults.exit_code
is more readable and makes priority ordering immediately visible. The dict
was defined but never used, creating two sources of truth that could drift.

* chore(deps): remove unused errorhandler dependency

The errorhandler package is no longer used in the codebase after
centralizing exit code handling in CombinedResults.exit_code.

* refactor(exit-codes): rename EXIT_INVALID_ROBOT_ARGS to EXIT_DATA_ERROR

Rename to match Robot Framework's naming convention for exit code 252.
This constant serves dual purpose: invalid Robot arguments AND no tests
found (via is_empty check), so the more generic name is more accurate.

* docs(exit-codes): explain priority ordering rationale in exit_code docstring

Add explanation for why priority is 253 > 252 > 255 (counterintuitive
numerically): interrupted is most actionable for CI/CD, data errors
indicate config issues, generic errors may be transient.

* tests: add defaults.apic.version to e2e fixtures to make ACI defaults check happy

* feat: merge Robot and PyATS xunit.xml files for CI/CD integration (#555)

Create a combined xunit.xml at the output root by merging results from
Robot Framework and PyATS test executions. This enables CI/CD pipelines
(Jenkins, GitLab) to consume a single test results file.

Changes:
- Add xunit_merger module using lxml.etree for efficient XML parsing
- Move Robot xunit.xml output to robot_results/ subdirectory
- Integrate merger into combined orchestrator post-execution
- Add lxml>=4.5.1 as direct dependency
- Add 17 unit tests for merger functionality
- Extend E2E tests to verify merged xunit.xml in all scenarios
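The actual module uses `lxml.etree`; this stdlib sketch shows the merging idea — combine `<testsuite>` elements from several xunit files under one root while summing the counters, degrading gracefully on parse errors and non-numeric attributes:

```python
import xml.etree.ElementTree as ET


def merge_xunit_strings(xmls: list[str]) -> ET.Element:
    """Merge testsuite elements into one <testsuites> root (illustrative)."""
    merged = ET.Element("testsuites")
    totals = {"tests": 0, "failures": 0, "errors": 0}
    for xml in xmls:
        try:
            root = ET.fromstring(xml)
        except ET.ParseError:
            continue  # graceful degradation, as in merge_xunit_files()
        suites = [root] if root.tag == "testsuite" else list(root)
        for suite in suites:
            merged.append(suite)
            for key in totals:
                try:
                    totals[key] += int(suite.get(key, "0"))
                except ValueError:
                    pass  # non-numeric attrs (e.g. tests="abc") are skipped
    for key, value in totals.items():
        merged.set(key, str(value))
    return merged


robot = '<testsuite name="robot" tests="4" failures="1" errors="0"/>'
pyats = '<testsuite name="pyats" tests="2" failures="0" errors="0"/>'
merged = merge_xunit_strings([robot, pyats])
assert merged.get("tests") == "6" and merged.get("failures") == "1"
```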

* refactor(robot): use Path for xunit output path construction

Use robot_results_dir Path object for xunit.xml path instead of
f-string with constant, consistent with how output paths will be
constructed in pabot 5.2 upgrade.

* refactor(xunit_merger): use directory constants instead of hardcoded strings

Replace hardcoded "robot_results" and "pyats_results" string literals
with ROBOT_RESULTS_DIRNAME and PYATS_RESULTS_DIRNAME constants from
nac_test/core/constants.py for consistency with the rest of the codebase.

* fix(xunit_merger): handle ValueError for malformed xunit attributes

Add ValueError to exception handling in merge_xunit_files() to gracefully
handle xunit files with non-numeric attribute values (e.g., tests="abc").
This maintains consistent graceful-degradation behavior alongside the
existing ParseError and OSError handling.

* fix(orchestrator): add resilient error handling for xunit merge

Wrap merge_xunit_results() call in try/except to prevent unexpected
errors (permissions, disk full, etc.) from crashing the entire test
run. A failed merge now logs a warning instead of terminating the
orchestrator, since test results are still valid without the merged file.

* refactor(subprocess_runner): extract command building into _build_command helper

Consolidate duplicated PyATS command construction from execute_job() and
execute_job_with_testbed() into a single _build_command() method. This
reduces code duplication and ensures consistent command building including
--verbose/--quiet flag handling.

Related tech debt tracked in #570.

* refactor: use XUNIT_XML constant instead of hardcoded "xunit.xml"

Addresses review comment from PR #560.
…ot_results (#560)

* feat(robot): upgrade pabot to 5.2.2 and write outputs directly to robot_results

- Update robotframework-pabot dependency to >= 5.2.2
- Use --output, --log, --report options to write artifacts directly to robot_results/
- Remove _move_robot_results_to_subdirectory() workaround from orchestrator
- Update unit tests to reflect new behavior

* refactor(constants): add OUTPUT_XML, LOG_HTML, REPORT_HTML filename constants

* refactor(robot): replace hardcoded filenames with constants

* refactor(tests): replace hardcoded filenames with constants
* Move archive discovery and report timing messages to logger.info

Move 'Found API/D2D/legacy archive' messages and report generation
timing from stdout to logger.info level. These diagnostic messages
are procedural and only needed when debugging - users can see them
with -v INFO flag.

Fixes #549

* Move archive and report paths to logger.info, consolidate to combined summary

Remove duplicate archive/report path output from stdout:
- PyATS Output Files block (summary_printer.py)
- HTML Reports Generated block (orchestrator.py)

Paths now logged at INFO level (visible with -v INFO):
- Archive paths, results JSON/XML, summary XML
- HTML report directories and summary reports

The combined summary remains as the single source of truth for
finding test artifacts, reducing output clutter in CI pipelines.

Fixes #545

* Consolidate test counts to combined summary only

Remove individual framework summaries from stdout, keeping only the
combined summary. Individual results (PyATS API/D2D, Robot) are now
logged at INFO level, visible with -v INFO.

- Delete SummaryPrinter class (no longer needed after #545/#549)
- Add format_test_summary() to terminal.py with colored numbers
- Refactor combined_orchestrator to use new formatter
- Log individual framework results at INFO level

Deleting SummaryPrinter also removes duplicate timing output
(Total testing / Elapsed time), leaving only the single
Total runtime at the end.

Fixes #544
Fixes #546

* Streamline execution summary output

- Move "Generating..." messages to logger.info
- Simplify output section to show key result files only
- Use file existence checks for robustness
- Clean up separators (= for frame, - for divisions)

Follow-up to #545

* Consolidate PyATS discovery output (#564)

Reduce discovery output from 4 lines to 2 by including api/d2d
breakdown in the discovery message and removing redundant
"Found X API/D2D test(s)" messages.

Before:
  Discovered 2 PyATS test files
  Running with 5 parallel workers
  Found 1 API test(s) - using standard execution
  Found 1 D2D test(s) - using device-centric execution

After:
  Discovered 2 PyATS test files (1 api, 1 d2d)
  Running with 5 parallel workers
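The consolidated line can be sketched as a one-line formatter; the function name is illustrative and test-file classification is assumed to happen elsewhere:

```python
def format_discovery(api_count: int, d2d_count: int) -> str:
    """Build the consolidated discovery message with api/d2d breakdown."""
    total = api_count + d2d_count
    return f"Discovered {total} PyATS test files ({api_count} api, {d2d_count} d2d)"


assert format_discovery(1, 1) == "Discovered 2 PyATS test files (1 api, 1 d2d)"
```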

* Move PyATS command output to logger.info

Remove redundant print() of PyATS command - already logged via
logger.info(), visible with -v INFO.

* Remove redundant Robot Framework completion message

The combined summary already shows test completion status.
Individual framework completion messages are now suppressed
to reduce stdout noise.

* Add E2E and unit tests for #540 output streamlining

Test philosophy:
- E2E tests validate stdout structure, not exact wording
- Tests use filtered_stdout (logger lines stripped) to isolate
  actual program output from debug logging
- Liberal pattern matching catches regressions without breaking
  on minor wording changes

Unit tests (test_terminal.py):
- format_test_summary() coloring: numbers colored only when > 0
- Labels (passed/failed/skipped) never colored
- NO_COLOR environment variable support
- Error message formatting for controller auto-detection

E2E stdout tests:
- Combined Summary header present
- Stats line in correct section (not pabot's duplicate)
- No individual framework completion messages (robot/pyats)
- Archive discovery messages moved to logger (not in stdout)
- PyATS discovery shows consolidated count

Also consolidates test_terminal_error_messages.py into
test_terminal.py for single module per source file.

* Use absolute paths in Combined Summary output

Resolves paths to absolute when printing artifact locations,
matching pabot's output style for consistency.

* Upgrade robotframework-pabot to >=5.2.2

Part of #560 - included here for branch self-containment.

* Fix test assertion for updated _print_execution_summary signature

* suppress Template rendering completed/Creating merged data model msgs

* add comment why we use absolute paths in our summary

* Add visual spacing around Combined Summary block

Add two blank lines before and after the Combined Summary block
to improve readability and separate it from pabot output above
and "Total runtime" below.

- Add blank lines before opening '======' separator
- Add blank line after closing '======' separator
- Add E2E test to verify spacing is maintained

* Remove timestamps from data model merging output

Simplify CLI output by removing timestamps from the data merging
messages, keeping only the duration.

* Remove stray comment line in _print_execution_summary

* Add 'other' count to format_test_summary when > 0

Display 'other' count (blocked/aborted/passx/info) in the Combined
Summary when present. Uses magenta color for visibility.

- Only shown when other > 0 to avoid clutter
- Add unit tests for other count display
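A sketch of `format_test_summary()` as described across these commits: numbers are colored only when > 0, labels are never colored, 'other' appears only when non-zero, and NO_COLOR disables ANSI codes. The ANSI constants and helper are illustrative:

```python
import os

GREEN, RED, YELLOW, MAGENTA, RESET = (
    "\x1b[32m", "\x1b[31m", "\x1b[33m", "\x1b[35m", "\x1b[0m",
)


def _num(value: int, color: str) -> str:
    """Color a count only when it is non-zero and NO_COLOR is unset."""
    if value > 0 and not os.environ.get("NO_COLOR"):
        return f"{color}{value}{RESET}"
    return str(value)


def format_test_summary(passed: int, failed: int, skipped: int, other: int = 0) -> str:
    parts = [
        f"{_num(passed, GREEN)} passed",
        f"{_num(failed, RED)} failed",
        f"{_num(skipped, YELLOW)} skipped",
    ]
    if other > 0:  # blocked/aborted/passx/info — only shown when non-zero
        parts.append(f"{_num(other, MAGENTA)} other")
    return ", ".join(parts)


os.environ["NO_COLOR"] = "1"  # deterministic output for the example
assert format_test_summary(2, 0, 1) == "2 passed, 0 failed, 1 skipped"
```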

* fix: remove merge regression of typer.echo in orchestrator

The typer.echo("✅ Robot Framework tests completed") was intentionally
removed as part of the output streamlining work, but crept back in
during a merge from the parent branch.  Our e2e tests caught this :-)

Other typer.echo statements introduced by recent merges (error messages
in main.py, UI banner module) are intentional and unaffected.
* feat: add standalone mock server management scripts

Add standalone mock server management:
- start_server.py: daemon mode with PID/log management in /tmp
- stop_server.py: graceful shutdown with fallback to SIGKILL
- status_server.py: health check with HTTP connectivity test
- Fix mock_server.py port handling to use self.port for standalone mode

Add mock API endpoints for scale testing:
- Update mock_api_config.yaml with 10 new vedges endpoints (vedges1-vedges10)
- Enables testing with multiple parallel test instances

Purpose: Enable standalone mock server usage outside pytest fixtures,
for example performance profiling / scale testing.

* fix: restore default port 0 in MockAPIServer for pytest-xdist compatibility

The standalone server feature inadvertently changed the MockAPIServer
constructor default port from 0 to 5555, breaking parallel test execution
with pytest-xdist. Port 0 allows the OS to assign an available port,
preventing conflicts when multiple test workers run simultaneously.

- Restore port=0 default in MockAPIServer.__init__
- Add DEFAULT_STANDALONE_PORT constant (5555) to start_server.py
- Use constant in status_server.py for consistency
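The reason port 0 works for pytest-xdist can be shown in a few lines: binding to port 0 asks the OS for any free ephemeral port, so parallel workers never collide. The helper name is illustrative:

```python
import socket


def pick_free_port() -> int:
    """Bind to port 0 and let the OS assign an available ephemeral port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


port_a, port_b = pick_free_port(), pick_free_port()
assert port_a > 0 and port_b > 0
```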

* fix indentation in yaml file

* refactor(mock-server): replace magic signal numbers with constants

Use signal.SIGTERM and signal.SIGKILL instead of magic numbers 15 and 9
in stop_server.py for improved readability.

Addresses PR review feedback point #5.

* fix(mock-server): call server.stop() in SIGTERM handler

Move signal handler registration after server creation so the handler
can properly call server.stop() for graceful shutdown. Previously, the
SIGTERM handler only cleaned up the PID file without stopping the Flask
server thread, causing potential resource leaks.

Addresses PR review feedback point #4.

* refactor(mock-server): consolidate scripts into mock_server_ctl.py

Replace start_server.py, stop_server.py, and status_server.py with a
single mock_server_ctl.py using argparse subcommands (start/stop/status).

Changes:
- Eliminate duplicated constants (PID_FILE, DEFAULT_PORT) and logic
- Use JSON state file instead of plain PID file to store both PID and port
- Status command now correctly checks health on the actual server port
- Single entry point: python mock_server_ctl.py {start|stop|status}
- Add README.md with standalone usage instructions

Addresses PR review feedback points #2 and #3.

* refactor(mock-server): consolidate duplicate vedges endpoints into single regex

Replace 10 nearly identical vedges endpoints (vedges1-vedges10) with a single
regex-matched endpoint using path_pattern "^/dataservice/system/device/vedges[0-9]+$".

This reduces ~240 lines of duplicated YAML configuration to ~30 lines while
maintaining the same functionality for scale testing scenarios.
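The pattern from the commit message, shown as a standalone matcher (the matching function itself is illustrative; in the mock server the pattern lives in `mock_api_config.yaml` as `path_pattern`):

```python
import re

# Single regex replacing ten near-identical vedges1..vedges10 endpoints.
VEDGES_PATTERN = re.compile(r"^/dataservice/system/device/vedges[0-9]+$")


def matches_vedges(path: str) -> bool:
    return VEDGES_PATTERN.match(path) is not None


assert matches_vedges("/dataservice/system/device/vedges7")
assert not matches_vedges("/dataservice/system/device/vedges")
```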

* made mock_server_ctl.py executable
- Close log_fd after os.dup2 to prevent file descriptor leak in daemon
- Reuse SERVER_STARTUP_TIMEOUT_SECONDS and SERVER_POLL_INTERVAL_SECONDS
  from mock_server.py instead of hardcoded values
- Add Unix-only platform note to README (os.fork/signal.pause not on Windows)

Addresses review comments from PR #563 which I failed to push prior to
merge, hence the double commit into the parent branch.
* refactor(collector): remove redundant hasattr guard on metadata

metadata is unconditionally assigned in __init__ (line 64), so the
hasattr check in save_to_file() is always True. Replace with a direct
attribute reference.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(ssh-base): declare class-level attributes for SSHTestBase

Add explicit class-level declarations with | None = None for hostname,
device_info, device_data, command_cache, and execute_command. Document
broker_client as intentionally non-optional since it is always assigned
in setup() before any test method runs.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(ssh-base): remove redundant hasattr(self, "parameters")

self.parameters is a property defined on PyATS TestItem, always
available through the MRO. The outer hasattr guard was unnecessary.
The inner hasattr(self.parameters, "internal") is retained because
.internal is a dynamic PyATS attribute that may not exist.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(ssh-base): replace hasattr with is not None checks

Replace hasattr(self, "hostname") with self.hostname is not None,
hasattr(self, "result_collector") with self.result_collector is not
None, and add device_info None guard for mypy safety. Also replace
getattr(self, "_current_test_context", None) with direct access since
the attribute is now declared at class level in NACTestBase.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(base-test): declare class-level attributes for NACTestBase

Add explicit class-level declarations for result_collector, output_dir,
_controller_recovery_count, and _total_recovery_downtime. These were
previously set only in setup() and checked via hasattr(). Declaring
them at class level enables replacing hasattr() with is not None.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(base-test): use getattr for dynamically-injected reporter

reporter is injected by the PyATS runner at test execution time and is
not declared on any class in the Testcase MRO (confirmed by inspection
of compiled PyATS .so modules). Use getattr(self, "reporter", None)
to safely access it without risking AttributeError.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(base-test): replace hasattr(self, "parent") with None check

self.parent is a property on PyATS TestResultContext, always present in
the MRO. Replace hasattr guard with is not None check. The inner
hasattr(self.parent, "reporter") is retained since reporter may or may
not exist on the parent object.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(base-test): replace result_collector and output_dir hasattr

Replace all hasattr(self, "result_collector") with
self.result_collector is None / is not None, and
hasattr(self, "output_dir") with self.output_dir is not None.
These attributes are now declared at class level with None defaults.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(base-test): replace cleanup method hasattr checks

Replace hasattr guards for batching_reporter, step_interceptor, and
_controller_recovery_count in the cleanup() method with direct
attribute access. All three are declared at class level with
sensible defaults (None or 0).

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(base-test): add None guards on result_collector.add_result

With result_collector declared as TestResultCollector | None, mypy
requires null checks before calling methods on it. Add guards in
add_verification_result() and _add_step_to_html_collector_smart()
to handle the edge case where setup() failed partway through.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(base-test): replace verify_group/verify_item hasattr checks

Replace hasattr(self, "verify_group") and hasattr(self, "verify_item")
with type-identity checks: type(self).method is NACTestBase.method.

Add default implementations that raise NotImplementedError on the base
class, matching the existing pattern for extract_step_context(),
format_step_name(), etc. The pre-dispatch check runs before
asyncio.gather() to ensure loud fail-fast behavior instead of silent
FAILED results from gather(return_exceptions=True).

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
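The type-identity check above can be sketched as follows; the method bodies are simplified assumptions, but the `type(self).method is NACTestBase.method` comparison is the mechanism the commit names:

```python
class NACTestBase:
    def verify_item(self, item: str) -> None:
        # Base-class stub: concrete subclasses must override this.
        raise NotImplementedError(
            f"{self.__class__.__name__} must implement verify_item()"
        )

    def has_verify_item(self) -> bool:
        # Type-identity check: True only if a subclass replaced the stub.
        # Runs before dispatch so a missing override fails loudly instead
        # of surfacing as a FAILED result from gather(return_exceptions=True).
        return type(self).verify_item is not NACTestBase.verify_item


class MyTest(NACTestBase):
    def verify_item(self, item: str) -> None:
        pass  # real verification logic would go here


assert not NACTestBase().has_verify_item()
assert MyTest().has_verify_item()
```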

* refactor(base-test): standardize NotImplementedError messages

Update all 5 short-style NotImplementedError messages to use verbose
f-strings with {self.__class__.__name__}, matching the style introduced
by verify_group/verify_item. This names the concrete subclass that
forgot to implement the method, improving the developer experience.

Part of #481.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: DRY verify stubs, fix comment accuracy, remove redundant init

- Extract _verify_group_not_implemented() and _verify_item_not_implemented()
  helpers so the error message lives in one place (used by both the
  pre-gather fail-fast check and the stub methods)
- Fix inaccurate comment on reporter attribute — it IS set in
  TestResultContext.__init__, not dynamically injected
- Remove redundant _controller_recovery_count/_total_recovery_downtime
  re-initialization in setup() (class-level defaults already handle this)
- Separate nullable attrs from zero-defaulted counters with sub-comment
- Add hasattr justification comment on self.parent.broker_client
- Fix device_name type safety with or-fallback pattern

* refactor: address review feedback — remove redundant import, add warnings

- Remove redundant local `from pyats import aetest` (already imported at
  module level)
- Add logger.warning when result_collector is None in
  add_verification_result() and _add_step_to_html_collector_smart()
  instead of silently swallowing results
- Apply ruff format to ssh_base_test.py (line wrapping fix)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(dry-run): add PyATS dry-run support (#460)

Add --dry-run support for PyATS tests. Previously, --dry-run only worked
for Robot Framework tests while PyATS tests would execute anyway.

PyATS dry-run behavior:
- Discovers and categorizes tests (API vs D2D) normally
- Prints which tests would be executed without actually running them
- Returns exit code 0 (no tests executed = no failures)

Changes:
- Add dry_run parameter to PyATSOrchestrator
- Add _print_dry_run_summary() method to display discovered tests
- Update CombinedOrchestrator to pass dry_run to PyATSOrchestrator
- Update CLI --dry-run help text to describe PyATS behavior
- Add unit tests for PyATS dry-run functionality
- Add E2E tests with mixed Robot + PyATS scenario

* change function name to be more clear

* Revert "change function name to be more clear"

This reverts commit c2f68af.

* move results return into caller to make logic more explicit

* fix(cli): return exit code 0 for dry-run on PyATS-only repos

Previously, running --dry-run on a repository with only PyATS tests
(no Robot tests) returned exit code 1 because stats.is_empty became
True (PyATS returns not_run with total=0). Added early return after
dry-run message to bypass the empty stats check.

* refactor(tests): consolidate PyATS orchestrator test fixtures

Creates tests/pyats_core/conftest.py with shared fixtures:
- clean_controller_env: clears controller environment variables
- aci_controller_env, sdwan_controller_env, cc_controller_env
- pyats_test_dirs: creates standard test directory structure

Refactors all orchestrator test files to use these fixtures, reducing
~200 lines of duplicated setup code. Includes reference to issue #541
for future merge with tests/unit/conftest.py.

Also removes two low-value tests per PR review comment #3.

* docs: update dry_run docstring to reflect PyATS support

The docstring incorrectly stated dry_run was "Robot only" - it now
applies to both Robot and PyATS test execution.

* test(e2e): make dry-run pyats assertion unconditional

Removes conditional `if pyats_dir.exists():` check so the assertion
always verifies that PyATS results are not created in dry-run mode.

* fix(pyats): don't create pyats_results/ directory in dry-run mode

Move output directory creation to after the dry-run check so that
the pyats_results/ directory is only created when tests actually execute.

* test(e2e): add PyATS-only dry-run scenario to catch exit code bug

Add DRY_RUN_PYATS_ONLY_SCENARIO that would have caught the exit code bug
where --dry-run with PyATS-only (no Robot tests) returned non-zero exit.
This scenario exercises the code path fixed in the previous commit.

Additional improvements:
- Add is_dry_run field to E2EScenario for explicit dry-run marking
- Add derived properties to E2EResults (has_robot_results, has_pyats_results,
  etc.) for cleaner skip conditions in tests
- Refactor directory tests to assert existence/non-existence based on
  scenario expectations rather than skipping
- Refactor TestE2EDryRun and TestE2EDryRunPyatsOnly to inherit from
  E2ECombinedTestBase, eliminating duplicate test code

* skip xunit.xml checks in dry-run pyats-only scenario

* remove dead code in e2e tests config.py

* fix(cli): dry-run mode exit code reflects actual test results

Previously, --dry-run always exited with code 0, masking Robot Framework
validation failures (e.g., non-existent keywords). Now the exit code
correctly reflects test outcomes.

Changes:
- Remove forced exit(0) in dry-run mode from main.py
- Add "(dry-run)" indicator to CLI messages when running tests
- Add dry_run_robot_fail E2E scenario testing validation failure
- Override dry-run indicator test in TestE2EDryRunPyatsOnly (base class
  skips due to expected_count=0, but scenario does include PyATS tests)

* fix(e2e): call E2EScenario.validate() before scenario execution

Add scenario.validate() call in _run_e2e_scenario() to ensure E2E
scenario configurations are validated before test execution.

The validate() method checks:
- Exit code matches expected failure count
- D2D scenarios have expected_d2d_hostnames defined
- Data and template paths exist

Previously, validate() was never called despite test comments claiming
"Guaranteed by E2EScenario.validate()".

Fixes #579

* adjusted comment

* feat: add standalone mock server management scripts (#563)

* feat: add standalone mock server management scripts

Add standalone mock server management:
- start_server.py: daemon mode with PID/log management in /tmp
- stop_server.py: graceful shutdown with fallback to SIGKILL
- status_server.py: health check with HTTP connectivity test
- Fix mock_server.py port handling to use self.port for standalone mode

Add mock API endpoints for scale testing:
- Update mock_api_config.yaml with 10 new vedges endpoints (vedges1-vedges10)
- Enables testing with multiple parallel test instances

Purpose: Enable standalone mock server usage outside pytest fixtures,
for example performance profiling / scale testing.

* fix: restore default port 0 in MockAPIServer for pytest-xdist compatibility

The standalone server feature inadvertently changed the MockAPIServer
constructor default port from 0 to 5555, breaking parallel test execution
with pytest-xdist. Port 0 allows the OS to assign an available port,
preventing conflicts when multiple test workers run simultaneously.

- Restore port=0 default in MockAPIServer.__init__
- Add DEFAULT_STANDALONE_PORT constant (5555) to start_server.py
- Use constant in status_server.py for consistency

* fix indentation in yaml file

* refactor(mock-server): replace magic signal numbers with constants

Use signal.SIGTERM and signal.SIGKILL instead of magic numbers 15 and 9
in stop_server.py for improved readability.

Addresses PR review feedback point #5.

* fix(mock-server): call server.stop() in SIGTERM handler

Move signal handler registration after server creation so the handler
can properly call server.stop() for graceful shutdown. Previously, the
SIGTERM handler only cleaned up the PID file without stopping the Flask
server thread, causing potential resource leaks.

Addresses PR review feedback point #4.
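The ordering fix above can be sketched like this; `MockServer` is a simplified stand-in for the real Flask-based server:

```python
import signal


class MockServer:
    """Minimal stand-in for the Flask-based mock server."""

    def __init__(self) -> None:
        self.stopped = False

    def stop(self) -> None:
        self.stopped = True  # the real server stops the Flask thread here


# Create the server first, *then* register the handler, so the handler
# can close over the instance and call server.stop() on SIGTERM.
server = MockServer()


def handle_sigterm(signum, frame):
    server.stop()  # graceful shutdown, not just PID-file cleanup


signal.signal(signal.SIGTERM, handle_sigterm)

# Simulate signal delivery (normally the OS invokes the handler):
handle_sigterm(signal.SIGTERM, None)
assert server.stopped
```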

* refactor(mock-server): consolidate scripts into mock_server_ctl.py

Replace start_server.py, stop_server.py, and status_server.py with a
single mock_server_ctl.py using argparse subcommands (start/stop/status).

Changes:
- Eliminate duplicated constants (PID_FILE, DEFAULT_PORT) and logic
- Use JSON state file instead of plain PID file to store both PID and port
- Status command now correctly checks health on the actual server port
- Single entry point: python mock_server_ctl.py {start|stop|status}
- Add README.md with standalone usage instructions

Addresses PR review feedback points #2 and #3.

* refactor(mock-server): consolidate duplicate vedges endpoints into single regex

Replace 10 nearly identical vedges endpoints (vedges1-vedges10) with a single
regex-matched endpoint using path_pattern "^/dataservice/system/device/vedges[0-9]+$".

This reduces ~240 lines of duplicated YAML configuration to ~30 lines while
maintaining the same functionality for scale testing scenarios.
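A quick check of the regex quoted above (the pattern is taken verbatim from this commit):

```python
import re

# Single pattern replacing the ten vedges1..vedges10 endpoint entries.
PATH_PATTERN = re.compile(r"^/dataservice/system/device/vedges[0-9]+$")

assert PATH_PATTERN.match("/dataservice/system/device/vedges1")
assert PATH_PATTERN.match("/dataservice/system/device/vedges10")
assert not PATH_PATTERN.match("/dataservice/system/device/vedges")
assert not PATH_PATTERN.match("/dataservice/system/device/vedges1/extra")
```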

* made mock_server_ctl.py executable

* fix(mock-server): fix fd leak in daemonize and reuse timeout constants

- Close log_fd after os.dup2 to prevent file descriptor leak in daemon
- Reuse SERVER_STARTUP_TIMEOUT_SECONDS and SERVER_POLL_INTERVAL_SECONDS
  from mock_server.py instead of hardcoded values
- Add Unix-only platform note to README (os.fork/signal.pause not on Windows)

Addresses review comments from PR #563 that I forgot to push before the
merge, hence this follow-up commit into the parent branch.
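A sketch of the fd-leak fix, with the redirection factored into a small helper (the helper name and signature are assumptions for illustration):

```python
import os
import tempfile


def redirect_fd_to_log(log_path: str, target_fd: int) -> None:
    """Point target_fd (e.g. stdout's fd in a daemon) at a log file.

    os.dup2() duplicates the descriptor, so log_fd must be closed
    afterwards or each daemonize call leaks one file descriptor
    (the bug fixed in this commit).
    """
    log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    os.dup2(log_fd, target_fd)
    os.close(log_fd)  # the dup2'd copy keeps the file open


# Demonstrate with a pipe write-end standing in for stdout:
log_path = os.path.join(tempfile.mkdtemp(), "daemon.log")
read_fd, write_fd = os.pipe()
redirect_fd_to_log(log_path, write_fd)
os.write(write_fd, b"daemon started\n")
os.close(write_fd)
os.close(read_fd)
assert open(log_path).read() == "daemon started\n"
```

Note that `os.fork()`-based daemonization is Unix-only, matching the platform note this commit adds to the README.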

* fix: revert LOG_HTML -> log.html breakage

A merge or rebase re-introduced the hardcoded Robot artifact strings; undo this and use the constants instead.

* fix(logging): use dynamic stream handler to prevent I/O errors in pytest (#604)

Fixes "I/O operation on closed file" errors when pytest replaces stdout.
Merged to unblock CI.

Fixes #487

* tech-debt: clean up `from __future__` imports

* refactor(dry-run): consolidate dry-run test structure in e2e and unit tests

- Add DRY_RUN_REASON constant to core/constants.py; replace "dry-run mode"
  literals in orchestrator and unit tests so the string has a single source
  of truth
- Extract mode_suffix once in combined_orchestrator before both echo calls,
  removing the duplicate assignment
- Move DRY_RUN_* scenario imports to module level in e2e/conftest.py
- Replace "<not-set>" controller URL with "http://dry-run.invalid" (RFC 2606
  reserved TLD) so the value is syntactically valid but non-routable
- Flatten nested with patch.object() staircases in test_orchestrator_dry_run.py
  to parenthesised context managers (Python 3.10+)
- Move test_dry_run_indicator_* out of E2ECombinedTestBase (where they were
  always skipped for non-dry-run classes) into the dry-run subclasses directly;
  add section comments explaining the design trade-offs
* docs: update platform requirements for nac-test 2.0

- Document Windows platform removal (use WSL2 instead)
- Add macOS pyATS startup delay note
- Fix typos and correct uv command (tool vs tools)

* doc: dropped macos cold-start delay note

Looks like this was an isolated incident on my device after all.
…est (#604)

* fix(logging): use dynamic stream handler to prevent I/O errors in pytest

Add CurrentStreamHandler class that dynamically references sys.stdout
instead of caching it at creation time. This prevents 'I/O operation
on closed file' errors when pytest captures and replaces stdout during
test execution.

Changes:
- Add CurrentStreamHandler with dynamic stream property
- Update configure_logging() to use CurrentStreamHandler
- Add surgical handler removal (only stdout/stderr StreamHandlers)

Ref: #487
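A minimal sketch of the dynamic-stream idea described above; the real `CurrentStreamHandler` in the codebase may differ in details (e.g. stream-name validation added in later commits):

```python
import io
import logging
import sys


class CurrentStreamHandler(logging.StreamHandler):
    """StreamHandler that resolves sys.stdout on every emit.

    A plain StreamHandler caches the stream at construction time; when
    pytest replaces sys.stdout and closes the original, emitting raises
    "I/O operation on closed file". A dynamic property avoids that.
    """

    @property
    def stream(self):
        return sys.stdout

    @stream.setter
    def stream(self, value):
        # No-op: ignores the assignment made by StreamHandler.__init__;
        # the property above always wins.
        pass


logger = logging.getLogger("demo")
logger.addHandler(CurrentStreamHandler())

sys.stdout = io.StringIO()        # simulate pytest capturing stdout
logger.warning("after swap")      # lands in the new stream, no I/O error
captured = sys.stdout.getvalue()
sys.stdout = sys.__stdout__
assert "after swap" in captured
```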

* test(logging): add unit tests for CurrentStreamHandler and configure_logging

Add 5 focused unit tests covering the fix for issue #487:

- test_stream_follows_stdout_replacement: Core fix - handler tracks replaced stdout
- test_stream_setter_is_noop: Setter ignores override attempts
- test_no_io_error_on_closed_stdout: Regression test for #487
- test_uses_current_stream_handler: configure_logging() installs CurrentStreamHandler
- test_surgical_handler_removal_preserves_file_handlers: Only stdout/stderr handlers removed

* fix(logging): add Python 3.10 compatibility for StreamHandler

Generic type syntax (StreamHandler[TextIO]) requires Python 3.11+.
Use TYPE_CHECKING guard for 3.10 compatibility while preserving
type information for static analysis.

* Improve test coverage and type consistency for logging tests

- Strengthen handler assertion from >= 1 to == 1 to catch accumulation
- Add idempotency test for repeated configure_logging() calls
- Fix Generator type hint to match codebase convention (None send type)

* Add missing __init__.py to tests/unit/utils/

Align with existing subdirectory conventions (tests/unit/cli/, tests/unit/core/).

* Add stream_name validation and simplify typing scaffolding

- Add Literal type and runtime ValueError for invalid stream names
- Clarify init comment (setter is no-op, order is for clarity)
- Replace TYPE_CHECKING scaffolding with simpler type: ignore comment
- Add unit test for stream_name validation
…TEST_ prefix (#599)

* feat(env): align environment variables to NAC_TEST_ prefix and add --debug flag (#512)

Standardize internal environment variables to use NAC_TEST_PYATS_ prefix
for consistency with the public NAC_TEST_ convention. Add --debug CLI flag
for easier troubleshooting with verbose output and archive retention.

Environment variable renames:
- PYATS_MAX_WORKERS → NAC_TEST_PYATS_PROCESSES
- MAX_CONNECTIONS → NAC_TEST_PYATS_MAX_CONNECTIONS
- NAC_API_CONCURRENCY → NAC_TEST_PYATS_API_CONCURRENCY
- NAC_SSH_CONCURRENCY → NAC_TEST_PYATS_SSH_CONCURRENCY
- PYATS_OUTPUT_BUFFER_LIMIT → NAC_TEST_PYATS_OUTPUT_BUFFER_LIMIT
- KEEP_HTML_REPORT_DATA → NAC_TEST_PYATS_KEEP_REPORT_DATA
- NAC_TEST_OVERFLOW_DIR → NAC_TEST_PYATS_OVERFLOW_DIR
- PYATS_DEBUG → NAC_TEST_DEBUG

New --debug flag enables verbose output for pabot/PyATS and preserves
intermediate archive files for troubleshooting.

The following env vars remain as undocumented internal tuning knobs,
not exposed as CLI flags:
- NAC_TEST_SENTINEL_TIMEOUT
- NAC_TEST_PIPE_DRAIN_DELAY
- NAC_TEST_PIPE_DRAIN_TIMEOUT
- NAC_TEST_BATCH_SIZE
- NAC_TEST_BATCH_TIMEOUT
- NAC_TEST_QUEUE_SIZE

* Update diagnostic script to use new NAC_TEST_PYATS_* env var names

- Replace obsolete NAC_API_CONCURRENCY, NAC_SSH_CONCURRENCY with new
  NAC_TEST_PYATS_* prefixed variables
- Add all documented PyATS tuning variables to diagnostic output
- Add separate section for undocumented internal tuning variables

* feat(cli): make --debug imply DEBUG verbosity unless explicitly overridden

When --debug is set without an explicit --verbosity flag, verbosity now
defaults to DEBUG instead of WARNING. Explicit --verbosity always wins.

- Add tests for all debug/verbosity flag combinations (11 test cases)
- Extract shared run_cli_with_temp_dirs() helper to conftest.py
- Refactor test_main_exit_codes.py to use shared helper

* refactor(tests): use parametrization for exit code tests

Replace 8 repetitive test methods with a single parametrized test,
reducing boilerplate while maintaining the same coverage.

* fix: correct internal env var names in constants.py comment

* docs: reorganize env var sections and clarify NAC_TEST_DEBUG behavior

* test(e2e): add debug e2e scenarios for --debug flag vs env var behavior

* feat(output): filter PyATS log output by verbosity level

Wire --verbosity flag through to OutputProcessor so PyATS log messages
are filtered consistently with nac-test's own logger output. This gives
users control over PyATS noise without requiring the NAC_TEST_DEBUG
environment variable.

Behavior by verbosity level:
- WARNING (default) or ERROR: Suppress all PyATS logs, show only critical
  messages (ERROR, FAILED, Traceback, etc.)
- INFO (-v INFO): Show PyATS logs at INFO level and above
- DEBUG (-v DEBUG or --debug): Show all output

Performance optimizations:
- Pre-compile regex patterns at module load time
- Early exit paths based on verbosity level
- Fast string check before regex for common patterns
- Benchmarked against 412k lines: <0.1s at default verbosity

Changes:
- Add debug and verbosity parameters to OutputProcessor
- Thread verbosity through PyATSOrchestrator and CombinedOrchestrator
- Replace DEBUG_MODE env check with self.debug for section progress
- Convert parse failure print() to logger.debug()
- Update e2e test: --debug now shows PyATS output (implies DEBUG verbosity)
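The filtering optimizations listed above can be sketched as follows; the pattern, keywords, and two-level logic are illustrative simplifications, not the real implementation:

```python
import re

# Precompiled at module load time (first optimization above); the real
# patterns and keyword set in OutputProcessor differ.
CRITICAL_PATTERN = re.compile(r"\b(ERROR|FAILED|Traceback)\b")
CRITICAL_KEYWORDS = ("ERROR", "FAILED", "Traceback")


def should_show_line(line: str, verbosity: str) -> bool:
    if verbosity == "DEBUG":
        return True  # early exit: show all output
    # Fast substring check before the (slower) regex for the common case
    # of lines that match nothing.
    if not any(keyword in line for keyword in CRITICAL_KEYWORDS):
        return False
    return CRITICAL_PATTERN.search(line) is not None


assert should_show_line("%EASYPY-DEBUG: noise", "DEBUG")
assert not should_show_line("%EASYPY-INFO: progress", "WARNING")
assert should_show_line("Traceback (most recent call last):", "WARNING")
```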

* test(e2e): remove redundant NAC_TEST_DEBUG env var scenario

The TestE2EDebugEnv test was redundant with TestE2EDebug since both
test the same debug behavior. The env var test only verified that
Typer's envvar= parameter works, which is Typer's responsibility.

The actual debug behavior (verbose output, PyATS logs) is already
covered by TestE2EDebug which uses the --debug CLI flag.

Note: The extra_env_vars parameter in _run_e2e_scenario is retained
for potential future use cases.

* fix README debug doc, allow robot loglevel override

* feat(pyats): respect --verbosity flag for PyATS log output filtering

Fix issue where PyATS ignores nac-test's --verbosity flag. The root cause
was that aetest's configure_logging() overwrites standard logging config.

Solution: Set managed_handlers.screen.setLevel() directly in generated job
files, which is the only reliable way to control PyATS console output.

Changes:
- JobGenerator: Accept verbosity param, set screen handler level in job files
- SubprocessRunner: Map verbosity to PyATS CLI flags (-v, -q, -qq, -qqq)
- OutputProcessor: Simplify _should_show_line() since filtering now happens
  at source via managed_handlers
- logging.py: Add VERBOSITY_TO_LOGLEVEL and LOGLEVEL_NAME_TO_VALUE mappings
  as module-level constants for reuse across components

Tests:
- Add unit tests for job file generation (test_job_generator.py)
- Add unit tests for verbosity CLI flag construction
- Add unit tests for VERBOSITY_TO_LOGLEVEL mapping
- Add E2E scenario (DEBUG_WITH_INFO_SCENARIO) verifying --debug --verbosity INFO
  filters %EASYPY-DEBUG while showing %EASYPY-INFO

Docs:
- Update README with --verbosity interaction with --debug flag
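A sketch of what a JobGenerator-like class might embed in the generated job file; `render_job_snippet` and the mapping shown here are assumptions for illustration, though `pyats.log.managed_handlers` is the mechanism this commit names:

```python
import logging

# Illustrative mapping; the real VERBOSITY_TO_LOGLEVEL lives in
# nac_test/utils/logging.py.
VERBOSITY_TO_LOGLEVEL = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def render_job_snippet(verbosity: str) -> str:
    """Render the fragment embedded in a generated PyATS job file.

    Setting managed_handlers.screen directly is needed because aetest's
    configure_logging() overwrites standard logging configuration.
    """
    level = VERBOSITY_TO_LOGLEVEL[verbosity]
    return (
        "from pyats.log import managed_handlers\n"
        f"managed_handlers.screen.setLevel({level})\n"
    )


snippet = render_job_snippet("WARNING")
assert "managed_handlers.screen.setLevel(30)" in snippet
```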

* feat(robot): respect --verbosity flag for Robot loglevel

Decouple Robot's verbose output (pabot --verbose) from loglevel
(--loglevel). Previously both were tied to --debug flag.

Now:
- --debug: enables pabot verbose output (console)
- --verbosity DEBUG: sets Robot --loglevel DEBUG
- --verbosity INFO/WARNING/etc: no --loglevel set (Robot default)
- Explicit --loglevel arg always takes precedence

This aligns Robot behavior with PyATS verbosity handling.

Changes:
- Add loglevel parameter to run_pabot(), separate from verbose
- Update RobotOrchestrator to compute loglevel from verbosity
- Add unit tests for loglevel precedence logic
- Update README to document the behavior
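The precedence rules above can be sketched as a small function; the name and string levels are assumptions for illustration:

```python
from typing import Optional


def compute_robot_loglevel(
    explicit_loglevel: Optional[str], verbosity: str
) -> Optional[str]:
    """Precedence from this commit: an explicit --loglevel always wins;
    otherwise --verbosity DEBUG maps to Robot --loglevel DEBUG, and any
    other verbosity leaves Robot at its default (no flag emitted)."""
    if explicit_loglevel is not None:
        return explicit_loglevel
    if verbosity == "DEBUG":
        return "DEBUG"
    return None  # use Robot Framework's default


assert compute_robot_loglevel("TRACE", "WARNING") == "TRACE"
assert compute_robot_loglevel(None, "DEBUG") == "DEBUG"
assert compute_robot_loglevel(None, "INFO") is None
```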

* fix: adjusted E2E scenario description

* test: change robot test name to reflect new behaviour

* test(cli): simplify debug/verbosity tests and verify orchestrator params

- Reduce test permutations from 11 to 7 (remove -v alias and order tests)
- Add verification that verbosity and debug are passed to CombinedOrchestrator
- Add descriptive test IDs for better readability

* refactor(pyats): simplify job generator tests and make verbosity required

- Make verbosity a required parameter in JobGenerator.__init__ to ensure
  explicit intent and prevent test escapes if verbosity handling is removed
- Refactor test_job_generator.py to use base class pattern with fixtures:
  - Add default_test_files fixture for tests that don't care about paths
  - Use generate_content fixture consistently across all tests
  - Make verbosity optional in generate_content callable (defaults to WARNING)
- Remove redundant test_contains_screen_handler_setlevel (covered by
  parametrized test_verbosity_mapped_to_screen_handler)
- Remove test_init_default_verbosity (no longer applicable with required param)
- Reverse parameter order in generate_content: test_files first, then verbosity

* refactor: add DEFAULT_VERBOSITY constant for centralized default

Add DEFAULT_VERBOSITY constant to nac_test/utils/logging.py to centralize
the default verbosity level (WARNING). This makes it easier to change the
default in one place if needed in the future.

Changes:
- Add DEFAULT_VERBOSITY = VerbosityLevel.WARNING in utils/logging.py
- Update 6 source files to import DEFAULT_VERBOSITY from utils.logging
- Update 3 test files to use DEFAULT_VERBOSITY for default behavior tests

Tests that check specific verbosity level behavior (parametrized tests
covering all levels, explicit --verbosity flag tests) intentionally
remain hardcoded to test each level individually.

* fix: restore removed comments

* adjust comments, variable names

* tests: streamline job_generator unit tests

* tests: streamline subprocess_runner test

* remove blank line

* refactor: centralize env var parsing with shared helper function

Add _get_positive_numeric() helper to core/constants.py that validates
environment variables return positive values, falling back to defaults
otherwise. This consolidates scattered inline parsing patterns.

Changes:
- core/constants.py: Add TypeVar-based helper for int/float support,
  update DEFAULT_API_CONCURRENCY and DEFAULT_SSH_CONCURRENCY to use it
- pyats_core/constants.py: Import helper and migrate all env var
  constants (buffer limit, sentinel timeout, pipe drain, batching)
- Rename DEFAULT_BUFFER_LIMIT → PYATS_OUTPUT_BUFFER_LIMIT
- Move BATCH_SIZE, BATCH_TIMEOUT_SECONDS, OVERFLOW_QUEUE_SIZE,
  OVERFLOW_MEMORY_LIMIT_MB from batching_reporter.py to constants.py

Helper is defined in core/ (not utils/) to avoid circular import chain:
core/constants → utils/ → utils/environment → core/constants
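A minimal sketch of the helper under the semantics described above (the real version, renamed `get_positive_numeric_env` in a later commit, also logs a warning on invalid values):

```python
import os
from typing import TypeVar

N = TypeVar("N", int, float)


def get_positive_numeric_env(name: str, default: N) -> N:
    """Parse an env var as the same numeric type as `default`, falling
    back to the default for unset, non-numeric, or non-positive values."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        value = type(default)(raw)
    except ValueError:
        return default
    return value if value > 0 else default


os.environ["NAC_TEST_PYATS_API_CONCURRENCY"] = "8"
assert get_positive_numeric_env("NAC_TEST_PYATS_API_CONCURRENCY", 4) == 8
os.environ["NAC_TEST_PYATS_API_CONCURRENCY"] = "-2"
assert get_positive_numeric_env("NAC_TEST_PYATS_API_CONCURRENCY", 4) == 4
```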

* docs: add method references to Progress Reporting System documentation

Update PRD_AND_ARCHITECTURE.md Progress Reporting section to include
explicit method references for each component, making it easier for
developers to navigate the codebase:

- ProgressReporterPlugin: _emit_event(), pre/post_job(), pre/post_task()
- OutputProcessor: process_line(), _handle_progress_event(), _finalize_orphaned_tests()
- ProgressReporter: report_test_start(), report_test_end(), get_next_test_id()
- SubprocessRunner: execute_job(), _process_output_realtime(), _drain_remaining_buffer_safe()

Keeps descriptions high-level without exposing internal implementation
details, while providing clear entry points for code exploration.

* fix: remove dead code, align to os.environ.get() pattern

* refactor: rename --debug CLI flag to --verbose for clarity

The --debug flag name was misleading as it controls verbose output rather
than debug-level functionality. Rename to --verbose with corresponding
NAC_TEST_VERBOSE environment variable.

Changes:
- CLI: --debug → --verbose flag
- Env var: NAC_TEST_DEBUG → NAC_TEST_VERBOSE (user-facing)
- Internal: debug parameter → verbose in orchestrators
- Tests: rename fixtures/debug → fixtures/verbose, update test classes
- Docs: update README.md and PRD_AND_ARCHITECTURE.md

Note: NAC_TEST_DEBUG is retained as a hidden variable for developer-level
debugging (DEBUG_MODE in constants.py).

* refactor(cli): rename --verbosity to --loglevel and add --verbose flag

Rename the CLI argument from --verbosity to --loglevel for clarity, as it
controls the logging level, not verbosity. The --verbose flag now controls
verbose output mode separately.

Key changes:

- Rename VerbosityLevel enum to LogLevel with enhanced functionality
  (int_value property, comparison operators)
- Add --loglevel (-l) as the new CLI argument for log level control
- Deprecate --verbosity (-v) as hidden alias for backwards compatibility
- Add --verbose flag for enabling verbose output mode
- Simplify Robot loglevel: only set to DEBUG when nac-test loglevel is DEBUG
- Rename run_pabot's loglevel parameter to robot_loglevel
- Update all internal variable names from verbosity to loglevel
- Remove unused NAC_VALIDATE_VERBOSITY env var (use NAC_TEST_LOGLEVEL)
- Update README with new CLI options and breaking change documentation

BREAKING CHANGE: Robot Framework --loglevel pass-through no longer supported.
Use --variable approach documented in README instead.

* refactor(logging): make LogLevel.int_value private (_int)

Rename int_value to _int since it's only used internally by comparison
operators and configure_logging. Remove explicit test for the private
property - comparison operator tests implicitly verify it works.

* refactor(pyats): rename undocumented env vars to use NAC_TEST_PYATS_ prefix

Rename internal tuning environment variables for consistency:
- NAC_TEST_SENTINEL_TIMEOUT -> NAC_TEST_PYATS_SENTINEL_TIMEOUT
- NAC_TEST_PIPE_DRAIN_DELAY -> NAC_TEST_PYATS_PIPE_DRAIN_DELAY
- NAC_TEST_PIPE_DRAIN_TIMEOUT -> NAC_TEST_PYATS_PIPE_DRAIN_TIMEOUT
- NAC_TEST_BATCH_SIZE -> NAC_TEST_PYATS_BATCH_SIZE
- NAC_TEST_BATCH_TIMEOUT -> NAC_TEST_PYATS_BATCH_TIMEOUT
- NAC_TEST_QUEUE_SIZE -> NAC_TEST_PYATS_QUEUE_SIZE
- NAC_TEST_MEMORY_LIMIT_MB -> NAC_TEST_PYATS_MEMORY_LIMIT_MB

These are undocumented internal tuning knobs that only affect PyATS
execution, so the PYATS prefix makes their scope clear.

* refactor(logging): remove unused VerbosityLevel alias

VerbosityLevel was kept as a backwards compatibility alias for LogLevel
but is not used anywhere in the codebase.

* refactor: move NAC_TEST_NO_TESTLEVELSPLIT env check to constants.py

Centralize the environment variable check for test-level parallelization
in core/constants.py alongside other env-based configuration (DEBUG_MODE).

- Add NO_TESTLEVELSPLIT constant to nac_test/core/constants.py
- Update orchestrator.py to use the constant instead of direct os.environ check
- Remove unused os import from orchestrator.py
- Update test to patch the constant rather than setting env var

* refactor: move NAC_TEST_PYATS_OVERFLOW_DIR env check to pyats_core/constants.py

Add OVERFLOW_DIR_OVERRIDE constant for user-specified overflow directory.

* fix: set NAC_TEST_DEBUG in diag script

Rename of --debug to --verbose dropped NAC_TEST_DEBUG setting from script env

* refactor(logging): replace _int property with __int__ dunder method

Use Python's standard int() protocol instead of private _int property
for LogLevel numeric conversion. This improves encapsulation while
enabling idiomatic usage like int(LogLevel.DEBUG) → 10.
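A sketch of the `__int__` approach; the enum values and comparison operator shown here are simplified assumptions about the real `LogLevel`:

```python
import logging
from enum import Enum


class LogLevel(Enum):
    """Simplified sketch of the LogLevel enum from this commit."""

    DEBUG = "DEBUG"
    INFO = "INFO"
    WARNING = "WARNING"
    ERROR = "ERROR"

    def __int__(self) -> int:
        # Standard int() protocol instead of a private _int property,
        # mapping each member to its stdlib logging numeric level.
        return getattr(logging, self.value)

    def __le__(self, other: "LogLevel") -> bool:
        return int(self) <= int(other)


assert int(LogLevel.DEBUG) == 10
assert LogLevel.DEBUG <= LogLevel.ERROR
```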

* cosmetic comment changes

* test(logging): add unit tests for LogLevel __int__ conversion

* doc: align to `-o ./output` in README.md examples

also added note about --pyats/--robot experimental flags

* fix: use DEFAULT_LOGLEVEL consistently for invalid log level inputs

- Fall back to DEFAULT_LOGLEVEL (not CRITICAL) for invalid inputs
- Fix debug message to show actual level used, not invalid input string
- Add unit tests for configure_logging() with 100% coverage
- Add docstring clarifying string/enum input handling

Note: Invalid input path is defensive programming, not a production code path.

* gitignore opencode AGENTS.md

* fix: use == instead of <= for LogLevel.DEBUG comparison

DEBUG is the lowest log level, so <= is misleading and suggests
impossible states. Aligns with other DEBUG checks in the codebase.

* fix: align environment variable names to NAC_TEST_PYATS_* convention

The orchestrator and connection manager were using non-standard env var names
(PYATS_MAX_WORKERS, MAX_SSH_CONNECTIONS) that didn't match the documented
NAC_TEST_PYATS_* naming convention. The existing unit tests only tested the
SystemResourceCalculator utility with default parameters, missing the fact
that callers were overriding the env_var parameter with incorrect names.

Add integration tests that verify the actual call sites use the correct
env var names, preventing future regressions.

- nac_test/pyats_core/orchestrator.py: PYATS_MAX_WORKERS -> NAC_TEST_PYATS_PROCESSES
- nac_test/pyats_core/ssh/connection_manager.py: MAX_SSH_CONNECTIONS -> NAC_TEST_PYATS_MAX_SSH_CONNECTIONS
- dev-docs/: Update stale env var references in documentation

* refactor(pyats): remove unused SENTINEL_TIMEOUT_SECONDS constant

SENTINEL_TIMEOUT_SECONDS (env: NAC_TEST_PYATS_SENTINEL_TIMEOUT) was
introduced in d1c3865 as a deadlock guard for sentinel-based IPC
synchronization, but was never wired into the subprocess runner.

The sentinel mechanism in _process_output_realtime() works by reading
lines until EOF, then falling back to _drain_remaining_buffer_safe()
if no stream_complete sentinel was received. A meaningful timeout would
require wrapping the entire post-EOF drain wait — which is already
covered by PIPE_DRAIN_TIMEOUT_SECONDS. There is no natural place to
apply a separate sentinel timeout without a design change.

Remove the constant definition, __all__ export, diagnostic script entry,
and PRD_AND_ARCHITECTURE.md table row. The env var name is left available
for a future PR if a deadlock guard is ever implemented.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pyats): use DEBUG_MODE constant for archive/JSONL retention checks

Replace raw os.environ.get("NAC_TEST_DEBUG") checks in orchestrator.py
and multi_archive_generator.py with the DEBUG_MODE constant from
core/constants.py. This ensures consistent semantics (only "true"
triggers debug mode, not empty strings or other truthy values) and
uses the single source of truth for the flag.

Also update log messages to use "or" instead of "/" for clarity.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(tests): move scenario imports to module level in e2e/conftest.py

Replace per-fixture local imports of scenario constants with a single
module-level import block. All scenario constants from tests.e2e.config
are now imported at the top of the file, consistent with standard Python
import conventions and the existing module-level E2EScenario import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(pyats): document intentional local import in managed_handlers test

Add a comment explaining why pyats.log.managed_handlers is imported
inside the test body rather than at module level: a collection-time
ImportError gives too little context; keeping it here ensures failures
surface in the right test with a clear name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(e2e): document intentional duplication in TestE2EVerboseWithInfo

Add a comment explaining why test_pabot_verbose_shows_test_result and
test_easypy_info_in_stdout are duplicated from TestE2EVerbose rather
than extracted into an intermediate base class.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(core): rename _get_positive_numeric → get_positive_numeric_env

The helper is intentionally imported across package boundaries (to avoid
circular imports), so a leading underscore is misleading. The new name
signals it is a shared utility and clarifies its purpose.

Add unit tests in tests/unit/core/test_constants.py covering int and
float parsing, zero/negative/non-numeric/empty fallback to default,
and the not-set case.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(pyats): use tmp_path in default_test_files fixture

Replace the hardcoded /tmp/test.py with a tmp_path-based file,
making the fixture portable and consistent with pytest conventions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(pyats): use .invalid TLD for ACI_URL fixture placeholder

apic.test.com could theoretically resolve; replace with RFC 2606
.invalid to make the non-functional intent explicit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add warning logging to get_positive_numeric_env and relocate to _env.py

Address PR 599 review comments 7 and 8:

- 7: Add warning logging when env var contains invalid (non-numeric) or
  non-positive values. Silent fallback to default only when env var is unset.
  New warn_on_invalid parameter allows callers to suppress warnings if needed.

- 8: Move get_positive_numeric_env() from core/constants.py to new
  nac_test/_env.py module. The underscore prefix signals internal API.
  Placement at package root avoids circular import via utils/__init__.py
  (see #610 for broader discussion on utils re-export strategy).

Tests added for warning behavior (4 new test cases).
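
A rough sketch of the helper this commit describes (parameter names, the warning wording, and the env var used below are illustrative, not the actual implementation):

```python
import logging
import os

logger = logging.getLogger(__name__)


def get_positive_numeric_env(
    name: str,
    default: float,
    cast: type = float,
    warn_on_invalid: bool = True,
) -> float:
    """Read a positive numeric env var, falling back to a default.

    Unset means a silent fallback; invalid or non-positive values fall
    back with an optional warning, per the behavior described above.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default  # unset: silent fallback to default
    try:
        value = cast(raw)
    except ValueError:
        if warn_on_invalid:
            logger.warning("%s=%r is not numeric; using default %s", name, raw, default)
        return default
    if value <= 0:
        if warn_on_invalid:
            logger.warning("%s=%r is not positive; using default %s", name, raw, default)
        return default
    return value
```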

* Add type annotations to public constants

Address PR review comment 9:

- core/constants.py: Add annotations to 25 public constants
- pyats_core/constants.py: Add annotations to 10 public constants

Private variables (e.g., _pipe_drain_default) left unannotated.

* Add missing docstrings to cli/main.py

Address PR review comment 11:

- Add module docstring
- Add docstring to version_callback()

* Add get_bool_env() helper and rename NO_TESTLEVELSPLIT to DISABLE_TESTLEVELSPLIT

PR review comment 12: Add get_bool_env() helper for consistent boolean env var
parsing (accepts 'true', 'yes', '1'). Rename NO_TESTLEVELSPLIT to
DISABLE_TESTLEVELSPLIT for clarity and NAC_TEST_ prefix alignment.

Breaking change: NAC_TEST_NO_TESTLEVELSPLIT is now NAC_TEST_DISABLE_TESTLEVELSPLIT
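
A minimal sketch of the get_bool_env() behavior described here, assuming the accepted truthy values named above (the env var in the usage below is hypothetical):

```python
import os

# Only these values (case-insensitive) are treated as true.
_TRUTHY = frozenset({"true", "yes", "1"})


def get_bool_env(name: str, default: bool = False) -> bool:
    """Parse a boolean env var consistently across the codebase."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY
```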

* Add BATCHING_REPORTER_ENABLED constant to pyats_core/constants.py

Move NAC_TEST_BATCHING_REPORTER env var parsing to centralized constant
using get_bool_env(). Update step_interceptor.py to use the constant.

* fix(e2e): reset pabot writer singleton between test scenarios

The pabot.writer module uses a singleton that captures stdout at creation
time. When running multiple E2E scenarios sequentially, the second test
would reuse the same singleton with a stale stdout reference, causing
verbose output to go to the wrong stream.

This fix resets the singleton at the start of each scenario execution.

Workaround for #611
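
The stale-singleton failure mode and the reset workaround can be illustrated with a toy writer (this is not pabot's actual API, just a sketch of the pattern):

```python
import io


class _Writer:
    """Toy stand-in for pabot's writer: captures a stream at creation."""

    def __init__(self, stream):
        self.stream = stream


_instance = None  # module-level singleton


def get_writer(stream):
    """Create the singleton on first call; later calls ignore `stream`.

    This is the bug: a second scenario passing a fresh stream still
    gets the writer bound to the first scenario's (stale) stream.
    """
    global _instance
    if _instance is None:
        _instance = _Writer(stream)
    return _instance


def reset_writer():
    """Reset so the next scenario re-captures the current stream."""
    global _instance
    _instance = None
```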

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add shared architecture detection helper for validators

Introduces is_architecture_active() helper function that provides a
lightweight check for whether an architecture environment is active by
testing if its URL environment variable is set.

This helper is shared across all architecture-specific validators (ACI,
SD-WAN, Catalyst Center) to avoid duplicating environment detection logic.

Rationale: Validators need to know which architecture is active to decide
whether to perform validation. Centralizing this check in common.py ensures
consistency and makes adding new architecture validators straightforward.
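
A sketch of the described check, assuming a mapping from architecture name to URL env var (the mapping keys and names here are illustrative):

```python
import os

# Hypothetical mapping of architecture name to its URL env var.
ARCH_URL_ENV = {
    "aci": "ACI_URL",
    "sdwan": "SDWAN_URL",
    "catalyst_center": "CC_URL",
}


def is_architecture_active(architecture: str) -> bool:
    """An architecture is active when its URL env var is set and non-empty."""
    env_var = ARCH_URL_ENV.get(architecture)
    if env_var is None:
        return False
    return bool(os.environ.get(env_var, "").strip())
```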

* Add ACI defaults validation with two-stage heuristic

Implements validate_aci_defaults() that fails fast when users forget to
pass the ACI defaults file, catching the common mistake of omitting
-d ./defaults/ before expensive data merge operations.

Two-stage validation approach:

Stage 1 (Instant path-based heuristic):
- Checks if path contains "default" or "defaults" as directory component
- For files: requires .yaml/.yml extension + "default" in filename
- Rejects false positives like "/users/defaultuser/" or "config.txt"

Stage 2 (Content-based fallback):
- Parses YAML files to find "defaults.apic" structure
- Non-recursive directory scanning for performance
- Handles both .yaml and .yml extensions

Security features:
- Rejects symlinks to prevent path traversal
- Limits file scanning to 3MB to prevent memory exhaustion

Performance: Path heuristic provides instant feedback for 95% of cases.
Content scanning only runs when the path-based check doesn't match, adding
minimal overhead (~50-100ms) compared to time saved when defaults are
missing (users get immediate feedback instead of waiting for full merge).
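
The Stage 1 path heuristic might look roughly like this. One plausible reading of the rules above is used here (exact directory-component match, and a defaults-named YAML file); the real matching rules may differ:

```python
from pathlib import Path


def path_looks_like_defaults(path: Path) -> bool:
    """Stage-1 sketch: instant path-based heuristic, no file I/O.

    Matches an exact 'default'/'defaults' directory component, or a
    .yaml/.yml file whose stem is 'default'/'defaults'. Substrings like
    'defaultuser' deliberately do not match.
    """
    components = {part.lower() for part in path.parts}
    if "default" in components or "defaults" in components:
        return True
    if path.suffix.lower() in {".yaml", ".yml"} and path.stem.lower() in {"default", "defaults"}:
        return True
    return False
```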

* Add validators package public API exports

Exports validate_aci_defaults() and is_architecture_active() as the
public API for the validators package.

Enforces separation of concerns: validators package only exports
validation functions, not UI components. Display functions like
display_aci_defaults_banner() should be imported directly from
nac_test.cli.ui, maintaining clear architectural boundaries between
validation logic and presentation.

This design allows adding new architecture validators (SD-WAN, Catalyst
Center) by following the same pattern without coupling to UI concerns.

* Add terminal banner display for ACI defaults error

Implements display_aci_defaults_banner() to show a prominent, user-friendly
error message when ACI defaults file is missing.

BoxStyle dataclass design:
- Encapsulates box-drawing characters for terminal rendering
- Frozen dataclass ensures immutability
- Supports both ASCII (fallback) and Unicode (modern terminals)
- Includes emoji_adjustment field to handle Unicode width quirks

NO_COLOR environment variable support:
- Respects NO_COLOR standard for accessibility
- Falls back to plain ASCII when NO_COLOR is set or Unicode unavailable

Banner content includes:
- Clear error message explaining the requirement
- Working example command showing correct syntax
- Link to Cisco ACI documentation for context
- Professional formatting with borders and sections

Rationale: Validation errors need to be immediately visible and actionable.
A prominent banner with example command helps users fix the issue quickly
instead of hunting through error logs.
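
The BoxStyle shape and the NO_COLOR fallback might be sketched as follows (field names and the exact fallback rule are illustrative; the real dataclass also carries an emoji_adjustment field):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen ensures immutability, as described above
class BoxStyle:
    """Box-drawing characters for terminal banner rendering."""

    horizontal: str
    vertical: str
    top_left: str
    top_right: str
    bottom_left: str
    bottom_right: str


UNICODE_BOX = BoxStyle("─", "│", "┌", "┐", "└", "┘")
ASCII_BOX = BoxStyle("-", "|", "+", "+", "+", "+")


def pick_box_style() -> BoxStyle:
    """Fall back to ASCII when NO_COLOR is set (per the NO_COLOR convention)."""
    if os.environ.get("NO_COLOR"):
        return ASCII_BOX
    return UNICODE_BOX
```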

* Add UI package public API for banner display

Exports display_aci_defaults_banner() as the public interface for CLI
error banners. Future banner functions (SD-WAN, Catalyst Center) will
be exported from this package.

Architectural separation: UI package handles all user-facing display
logic, while validators package handles business logic. This separation
allows testing and modifying display behavior independently from
validation rules.

* Wire ACI defaults validation into CLI entry point

Adds validation check immediately before DataMerger.merge_data_files()
to fail fast when defaults are missing. Validation runs after output
directory creation but before expensive merge operation.

Execution flow:
1. Parse CLI arguments
2. Configure logging
3. Create output directory
4. → NEW: Validate ACI defaults (instant check)
5. Merge data files (expensive operation, 2-5 seconds)
6. Continue with orchestrator...

Error handling:
- Validation failure displays prominent banner via display_aci_defaults_banner()
- Exits with code 1 (failure) to prevent continuing with invalid config
- Blank lines before/after banner improve readability

Time savings: Users get immediate feedback instead of waiting 2-5 seconds
for data merge to complete before seeing validation error. Pre-flight check
philosophy: validate prerequisites before expensive operations.

* Add test directory structure for CLI components

Creates package structure mirroring source code organization:
- tests/unit/cli/validators/ for validator unit tests
- tests/unit/cli/ui/ for UI component unit tests

Empty __init__.py files make directories importable Python packages,
allowing pytest to discover tests and enabling relative imports within
test modules.

Test organization principle: Test structure should mirror source
structure. This makes finding tests for a given module intuitive and
maintains clear architectural boundaries in both source and test code.

* Add comprehensive unit tests for ACI defaults validator

Tests validate_aci_defaults() and helper functions with 32 test cases
covering environment detection, path matching, YAML parsing, security
features, and edge cases.

Test coverage includes:

Environment detection (3 tests):
- Validation skips when ACI_URL not set
- Validation runs when ACI_URL is set
- Empty string ACI_URL treated as not set

Path-based heuristic (9 tests):
- Matches "defaults" directory component
- Matches defaults.yaml and defaults.yml files
- Rejects non-YAML files (defaults.txt, defaults.json)
- Rejects substring matches (defaultuser, my-defaulted-config)
- Case-insensitive matching for directory names

YAML content validation (8 tests):
- Finds defaults.apic structure in YAML files
- Handles both .yaml and .yml extensions
- Rejects files missing "defaults:" or "apic:" keys
- Handles empty files and unreadable files gracefully

Security features (2 tests):
- Rejects symlinks to prevent path traversal
- Skips files larger than 3MB to prevent memory exhaustion

Performance features (1 test):
- Directory scanning is non-recursive (documented design decision)

Edge cases (9 tests):
- Two-stage validation (path check → content check)
- Multiple data paths with mixed content
- Content check runs as fallback when path check fails

Test quality: Uses real filesystem operations (tmp_path) not mocks,
validates actual application behavior not library functionality,
includes comprehensive docstrings documenting test intent.

* Add unit tests for ACI defaults banner display

Tests display_aci_defaults_banner() function with focus on output
content, NO_COLOR environment variable support, and error handling.

Test coverage includes:

Banner content validation (3 tests):
- Banner displays without raising exceptions
- Banner contains required error message text
- Banner includes example command with correct syntax

NO_COLOR support (2 tests):
- ASCII box style used when NO_COLOR environment variable is set
- Unicode box style used when NO_COLOR is not set

Test approach: Uses pytest's capsys fixture to capture stdout and
validate actual terminal output. Tests verify real banner display
behavior, not mocked components.

Note: Banner tests focus on content validation (error message, example
command) rather than exact formatting (border characters, spacing).
This design allows banner styling to evolve without breaking tests as
long as essential content remains present.

* Add integration tests for CLI ACI defaults validation

Implements 6 integration tests split into two test classes validating
end-to-end CLI behavior with ACI defaults validation.

TestCliAciValidationIntegration (4 tests using CliRunner):
- Tests validation triggers when ACI_URL set without defaults
- Tests validation skips when ACI_URL not set
- Tests validation passes when defaults directory provided
- Tests validation passes when YAML contains defaults.apic structure

TestCliAciValidationSubprocess (2 tests using subprocess.run):
- Tests actual nac-test CLI process execution with ACI_URL
- Tests CLI subprocess without ACI_URL environment variable
- Marked with skipif when CLI not installed in PATH

Integration test design:

CliRunner tests (lines 26-195):
- Use Typer's CliRunner for fast integration testing
- Mock expensive operations (DataMerger, CombinedOrchestrator)
- Validate CLI wiring: arguments → validators → UI → exit codes
- Test focus: Does validation logic integrate correctly with CLI?

Subprocess tests (lines 197-301):
- Execute real nac-test command in subprocess
- Zero mocking - tests production behavior
- Validate complete process: entry point → imports → execution → exit
- Test focus: Does CLI actually work end-to-end in production?

Test fixtures:
- clean_controller_env: Clears environment variables for isolation
- minimal_test_env: Creates temporary directories and YAML files
- cli_test_env: Subprocess-specific environment setup

Coverage: Tests validate that validation runs at correct point in CLI
flow (after argument parsing, before data merge), displays banner on
failure, returns correct exit codes, and handles environment variables.

* Fix CI: Add defaults.yaml to integration test fixtures

Integration tests in test_integration_robot_pabot.py set ACI_URL
environment variable via setup_bogus_controller_env fixture, which
triggers the new ACI defaults validation.

Adding defaults.yaml to test fixtures satisfies the validation check
without modifying test logic or environment setup. The file contains
minimal defaults.apic structure required by the validator.

This fixes 20 failing integration tests that were exiting with code 1
(validation failure) instead of code 0 (success).

* Fix CI: Add defaults.yaml to all integration test fixture directories

Additional fixture directories (data_env, data_list, data_list_chunked,
data_merge) also used by integration tests need defaults.yaml to satisfy
ACI validation when ACI_URL environment variable is set.

Ensures all 20 failing integration tests pass by providing the minimal
defaults.apic structure required by the validator in all test fixture
data directories.

* Fix CI: Add defaults.apic structure to data_merge test fixtures

The test_nac_test_render_output_model test passes individual YAML files
(file1.yaml, file2.yaml) via -d flags instead of a directory. Our
validator performs content-based checking on these files.

Adding defaults.apic structure to file1.yaml satisfies the validator's
content check. The expected result.yaml is updated to include the
defaults structure since it's part of the merged data model.

This fixes the 2 remaining failing integration tests.

* Add defaults.yaml to PyATS quicksilver test fixtures

The test_nac_test_pyats_quicksilver_api_only tests set architecture-specific
URL environment variables (ACI_URL, CC_URL, SDWAN_URL) which trigger the
ACI defaults validation. These fixture directories need minimal defaults.yaml
files with the required structure to satisfy the validation check.

Files added:
- tests/integration/fixtures/data_pyats_qs/aci/defaults.yaml
- tests/integration/fixtures/data_pyats_qs/catc/defaults.yaml
- tests/integration/fixtures/data_pyats_qs/sdwan/defaults.yaml

Each contains minimal mock structure: defaults.apic.version = 6.0

This fixes the failing CI test: test_nac_test_pyats_quicksilver_api_only[aci-ACI-1-0-0]

* fix(tests): update test imports for Phase 1 refactoring

Updated test files to use new names introduced in Phase 1:
- CONTROLLER_REGISTRY instead of separate dict constants
- extract_host() instead of _extract_host()
- Updated test class and test assertions for dataclass-based config

Added type: ignore comments for untyped Auth class methods to satisfy mypy.

All 64 tests now pass successfully.

* refactor: create centralized HTTP status constants module

Introduces core/http_constants.py as the single source of truth for HTTP
status code range boundaries and special status code sets used throughout
the framework.

Why this improves the codebase:
- Eliminates DRY violations where HTTP status ranges were defined in
  multiple locations (subprocess_client.py, http/__init__.py)
- Provides consistent status code classification logic
- Makes it easier to maintain and update HTTP-related constants
- Improves type safety with explicitly typed constants
- Centralizes authentication failure codes (401, 403) and service
  unavailable codes (408, 429, 503, 504) for reuse

This module serves as a foundation for other refactoring work that will
migrate existing code to use these centralized constants.
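
Such a module might look like this, using the status codes named in this commit (constant names are illustrative):

```python
# Single source of truth for HTTP status classification.
HTTP_SUCCESS_RANGE = range(200, 300)
HTTP_CLIENT_ERROR_RANGE = range(400, 500)
HTTP_SERVER_ERROR_RANGE = range(500, 600)

# Authentication failures vs. transient unavailability, for reuse
# by auth checks and retry logic.
AUTH_FAILURE_CODES = frozenset({401, 403})
SERVICE_UNAVAILABLE_CODES = frozenset({408, 429, 503, 504})


def is_auth_failure(status: int) -> bool:
    """Classify a status code as an authentication failure."""
    return status in AUTH_FAILURE_CODES
```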

* refactor: migrate subprocess_client.py to centralized HTTP constants

Updates subprocess_client.py to import and use HTTP status code constants
from the new core/http_constants.py module instead of defining them locally.

Why this improves the codebase:
- Eliminates duplication of HTTP status range definitions
- Ensures consistent status code classification across the framework
- Makes future updates to HTTP constants affect all consumers automatically
- Improves maintainability by reducing places where constants are defined

This is part of a broader effort to establish core/http_constants.py as
the single source of truth for HTTP-related constants.

* refactor: update http/__init__.py to re-export centralized constants

Updates the http package's __init__.py to re-export HTTP constants from
core/http_constants.py rather than defining them locally. This maintains
backward compatibility for any code importing from nac_test.pyats_core.http.

Why this improves the codebase:
- Preserves existing import paths for backward compatibility
- Delegates to the single source of truth (core/http_constants.py)
- Reduces maintenance burden by having only one place to update constants
- Makes the http package a thin facade over the core constants module

This change ensures that code using either import path gets the same
constants, preventing potential inconsistencies.

* refactor: create URL parsing utility module

Introduces utils/url.py with extract_host() function to provide robust URL
parsing capabilities using Python's standard library urlparse.

Why this improves the codebase:
- Centralizes URL manipulation logic previously duplicated or ad-hoc
- Uses Python's battle-tested urlparse for robust parsing (handles edge
  cases like missing schemes, IPv6 addresses, port numbers)
- Provides consistent host extraction across the codebase
- Eliminates brittle string manipulation approaches (split('/'), regex)
- Improves testability by isolating URL parsing logic
- Makes the codebase more maintainable by having a single utility for URL operations

This utility will be used by auth_failure.py, controller_auth.py, and
potentially other modules that need to extract host information from URLs.
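
An extract_host() built on urlparse might look roughly like this (the missing-scheme handling is an assumption about one of the edge cases mentioned above):

```python
from urllib.parse import urlparse


def extract_host(url: str) -> str:
    """Extract the host from a URL using the standard library.

    Handles ports, IPv6 literals, and URLs without a scheme, which
    brittle split('/') approaches get wrong.
    """
    if "//" not in url:
        url = "https://" + url  # urlparse only populates netloc after a scheme
    parsed = urlparse(url)
    return parsed.hostname or ""
```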

* refactor: create error classification module

Introduces core/error_classification.py to extract and centralize
authentication error classification logic. This module provides AuthOutcome
enum and classification functions for HTTP and network errors.

Why this improves the codebase:
- Separates error classification logic from controller authentication logic
  (Single Responsibility Principle)
- Makes error classification logic highly testable in isolation
- Uses a two-tier classification strategy: check network failures first
  (timeouts, connection refused, DNS) to avoid false positives from port
  numbers being matched as HTTP status codes
- Provides structured outcome classification via AuthOutcome enum instead
  of raw strings
- Centralizes HTTP status code interpretation logic (401/403 → bad
  credentials, 408/429/503/504 → unreachable, etc.)
- Uses regex for reliable HTTP status code extraction from error messages
- Improves maintainability by having one place to update classification rules

This module will be used by controller_auth.py and auth_failure.py to
provide consistent error categorization across the framework.
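
The two-tier strategy can be sketched as follows. The network-failure markers, the regex, and the enum members below are illustrative stand-ins, not the module's actual definitions:

```python
import re
from enum import Enum


class AuthOutcome(Enum):
    BAD_CREDENTIALS = "bad_credentials"
    UNREACHABLE = "unreachable"
    UNKNOWN = "unknown"


_NETWORK_MARKERS = ("timed out", "connection refused", "name or service not known")
# Require an explicit "status"/"HTTP" prefix so port numbers in the
# message are never mistaken for status codes.
_STATUS_RE = re.compile(r"\b(?:status(?: code)?|HTTP)[ :]+(\d{3})\b", re.IGNORECASE)


def classify_auth_error(message: str) -> AuthOutcome:
    """Tier 1: network failures; tier 2: HTTP status extraction."""
    lowered = message.lower()
    if any(marker in lowered for marker in _NETWORK_MARKERS):
        return AuthOutcome.UNREACHABLE
    match = _STATUS_RE.search(message)
    if match:
        status = int(match.group(1))
        if status in (401, 403):
            return AuthOutcome.BAD_CREDENTIALS
        if status in (408, 429, 503, 504):
            return AuthOutcome.UNREACHABLE
    return AuthOutcome.UNKNOWN
```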

* refactor: add controller metadata helpers and structured config

Adds ControllerConfig dataclass and CONTROLLER_REGISTRY to centralize
controller metadata, along with two new helper functions:
get_display_name() and get_env_var_prefix().

Why this improves the codebase:
- Eliminates DRY violations where controller display names and env var
  prefixes were hardcoded in 5+ locations across the codebase
- Provides a single source of truth (CONTROLLER_REGISTRY) for all
  controller metadata (display names, URL env vars, credential patterns)
- Introduces ControllerConfig dataclass for type-safe controller metadata
- Makes controller information easily accessible via simple helper functions
- Improves maintainability: adding new controllers or changing display
  names now requires updates in only one place
- Preserves backward compatibility by maintaining CREDENTIAL_PATTERNS as
  a view over CONTROLLER_REGISTRY

The helper functions gracefully degrade by returning the controller_type
string if a controller is not in the registry, preventing crashes when
dealing with unknown controller types.
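
A sketch of the registry shape and the graceful-degradation helper (field names and display strings are illustrative; the SDWAN cache key mirrors the SDWAN_MANAGER mapping mentioned elsewhere in this PR):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControllerConfig:
    """Type-safe metadata for one controller type."""

    display_name: str
    url_env_var: str
    cache_key: str


CONTROLLER_REGISTRY: dict[str, ControllerConfig] = {
    "ACI": ControllerConfig("Cisco APIC", "ACI_URL", "ACI"),
    "SDWAN": ControllerConfig("SDWAN Manager", "SDWAN_URL", "SDWAN_MANAGER"),
    "CC": ControllerConfig("Catalyst Center", "CC_URL", "CC"),
}


def get_display_name(controller_type: str) -> str:
    """Fall back to the raw type string for unknown controllers."""
    config = CONTROLLER_REGISTRY.get(controller_type)
    return config.display_name if config else controller_type
```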

* refactor: fix type annotation and extract banner rendering logic

Fixes the ColorValue type annotation in banners.py (was incorrectly using
int | tuple, now correctly str | int | tuple) and extracts common banner
rendering logic into a reusable _render_banner() function.

Why this improves the codebase:
- Corrects type annotation to match typer's actual color value type (must
  include str for named colors like "red", "white")
- Eliminates massive code duplication across banner display functions by
  extracting shared rendering logic into _render_banner()
- Reduces line count by roughly 100 lines while preserving all functionality
- Makes banner rendering logic testable in isolation
- Improves maintainability: changes to border rendering now affect all
  banners automatically
- Uses controller helper functions (get_display_name, extract_host) to
  eliminate hardcoded controller names
- Follows DRY principle by having a single implementation of box-drawing
  character handling and color application

All public banner functions now delegate to _render_banner() with
appropriate title, content, and color parameters.

* refactor: migrate main.py to use controller helper functions

Updates main.py to use get_display_name() and get_env_var_prefix() from
utils.controller instead of hardcoding controller names and prefixes.

Why this improves the codebase:
- Eliminates hardcoded controller display names (no more "SDWAN Manager"
  string literals scattered throughout)
- Reduces coupling between main.py and controller-specific knowledge
- Makes the code more maintainable: adding/changing controller names now
  only requires updates to CONTROLLER_REGISTRY
- Improves consistency by ensuring all code uses the same display names
- Follows DRY principle by delegating to centralized helper functions

This is part of a broader effort to eliminate controller metadata
duplication across the codebase.

* refactor: migrate auth_failure.py to use centralized utilities

Updates auth_failure.py to use get_display_name() from utils.controller
and extract_host() from utils.url instead of defining logic inline.

Why this improves the codebase:
- Eliminates hardcoded controller display names
- Replaces brittle URL host extraction (split('/')) with robust urlparse
- Reduces coupling between auth_failure.py and controller-specific logic
- Improves testability by delegating to tested utility functions
- Makes the code more maintainable and consistent with the rest of the
  codebase
- Follows DRY principle by reusing centralized utilities

This change ensures auth_failure.py uses the same display names and URL
parsing logic as the rest of the framework.

* test: update controller_auth tests for new module structure

Updates test_controller_auth.py to reflect the refactored module structure
where error classification and controller registry moved to separate modules.

Why this improves the test suite:
- Removes tests for AuthOutcome enum (now tested in test_error_classification.py)
- Removes tests for AuthCheckResult dataclass (simple dataclass, no logic)
- Removes tests for ControllerConfig (now tested in test_controller.py)
- Focuses tests on the actual validation logic in controller_auth.py
- Follows the principle of testing behavior, not implementation details
- Imports CONTROLLER_REGISTRY and AuthOutcome from their new locations
- Maintains test coverage of the actual authentication validation flow

This change aligns the test suite with the refactored module boundaries,
ensuring each module's tests focus on that module's responsibilities.

* test: update banners tests for extracted rendering logic

Updates test_banners.py to test the new _render_banner() helper function
and ensures all banner functions still work correctly after refactoring.

Why this improves the test suite:
- Tests the extracted _render_banner() function directly to verify common
  banner rendering logic
- Ensures all public banner functions still produce correct output after
  delegating to _render_banner()
- Maintains coverage of NO_COLOR environment variable handling
- Tests that controller helper functions are used correctly (get_display_name)
- Follows DRY principle in tests by verifying shared logic once rather
  than duplicating assertions across multiple banner function tests

This change ensures the refactored banner code maintains the same
behavior while having better test isolation for the extracted logic.

* test: update auth_failure tests for new utility imports

Updates test_auth_failure.py to reflect that auth_failure.py now imports
display names and URL extraction from centralized utility modules.

Why this improves the test suite:
- Aligns test imports with the refactored module structure
- Tests continue to verify that auth_failure.py uses correct display
  names and URL extraction logic
- Ensures coverage of the integration between auth_failure.py and the
  new utility functions (get_display_name, extract_host)
- Maintains test isolation while verifying correct delegation to utilities

This change ensures auth_failure.py tests remain accurate after the
refactoring to use centralized utilities.

* test: move test_cli_controller_auth.py to correct directory

Moves test_cli_controller_auth.py from tests/integration/ to tests/unit/cli/
to reflect its actual testing scope and organization.

Why this improves the test suite:
- Places test file in the correct directory structure matching the module
  it tests (cli/validators/controller_auth.py)
- Improves test discoverability: unit tests for cli modules should be
  under tests/unit/cli/
- Separates unit tests from integration tests more clearly
- Follows the convention of mirroring the source tree structure in the
  test directory
- Makes it easier for developers to find relevant tests

This is a pure organizational change with no logic modifications.

* fix(tests): resolve Python 3.10 mock.patch compatibility issue

Updates three banner tests to patch TerminalColors.NO_COLOR at the
import location (nac_test.cli.ui.banners) rather than the definition
location (nac_test.utils.terminal).

Python 3.10's mock.patch implementation has issues resolving nested
class attributes when patching at the definition location, causing:
  ModuleNotFoundError: No module named 'nac_test.utils.terminal.TerminalColors'

Python 3.11+ handles this correctly, but for 3.10 compatibility,
patching at the import location is the standard best practice.

Fixes CI failures in Tests (3.10):
- TestDisplayAciDefaultsBanner::test_respects_no_color_mode
- TestDisplayAuthFailureBanner::test_respects_no_color_mode
- TestDisplayUnreachableBanner::test_respects_no_color_mode

* fix(auth): protect sys.argv during pyats-common lazy imports

PyATS's configuration module parses sys.argv at import time,
registering --pyats-configuration as a known argument. When
_get_auth_callable() lazily imports from nac_test_pyats_common,
the parent package __init__.py eagerly imports all test base
classes, which triggers `from pyats import aetest`, initializing
the configuration module. Python argparse prefix-matches our
--pyats CLI flag to --pyats-configuration, causing:

  error: argument --pyats-configuration: expected one argument

Fix: temporarily strip --pyats from sys.argv during the import
and restore it in a finally block. This follows the established
pattern from device_inventory.py (lines 94-100).
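
The strip-and-restore-in-finally pattern can be sketched as a context manager (the real code inlines it around the lazy import; this helper is illustrative):

```python
import sys
from contextlib import contextmanager


@contextmanager
def argv_without(flag: str):
    """Temporarily remove a flag from sys.argv around an import
    whose module parses sys.argv at import time."""
    original = sys.argv[:]
    sys.argv = [arg for arg in original if arg != flag]
    try:
        yield
    finally:
        sys.argv = original  # always restored, even if the import fails
```

Usage would wrap the lazy import, e.g. `with argv_without("--pyats"): import nac_test_pyats_common`, so argparse inside pyats never sees the conflicting flag.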

* Add cache invalidation method to AuthCache

Implement AuthCache.invalidate() to support credential change detection
in preflight authentication checks. This method safely removes cached
tokens under file lock, enabling fresh authentication when credentials
are updated between test runs.

Benefits:
- Prevents stale token reuse when credentials change
- Process-safe with FileLock for parallel test execution
- Best-effort design never blocks test execution
- Automatic cleanup of both cache and lock files

The invalidation mechanism is designed for the common workflow where
users test with wrong credentials (cache miss → auth fails), then fix
credentials (invalidate → cache miss → auth succeeds), then test again
with different wrong credentials (invalidate → cache miss → auth fails).
Without this, the third scenario would incorrectly reuse the cached
token from the second run.
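
A stdlib-only sketch of the best-effort invalidation; the described implementation removes the files while holding a FileLock, which is omitted here:

```python
from pathlib import Path


def invalidate_cache(cache_file: Path) -> None:
    """Best-effort removal of a cached token file and its lock file.

    Never raises: failures must not block test execution.
    """
    lock_file = cache_file.with_name(cache_file.name + ".lock")
    for path in (cache_file, lock_file):
        try:
            path.unlink()
        except FileNotFoundError:
            pass  # no-op when nothing is cached
        except OSError:
            pass  # best-effort: e.g. permission errors are swallowed
```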

* Add cache_key field to ControllerConfig

Introduce cache_key mapping to centralize the translation between
detected controller types and their AuthCache storage keys. This
resolves the naming inconsistency where SDWAN detection returns
"SDWAN" but the adapter uses "SDWAN_MANAGER" as the cache key.

Benefits:
- Single source of truth for cache key mapping
- Eliminates hardcoded key strings in multiple locations
- Supports all three controller types (ACI, SDWAN, CC)
- Enables preflight check to invalidate the correct cache file

The mapping preserves existing cache behavior while making it
explicit and maintainable. Cache keys match the actual keys used
by authentication adapters in nac-test-pyats-common.

* Invalidate auth cache before preflight credential check

Integrate cache invalidation into the preflight authentication flow
to ensure fresh credential validation on every test run. This fixes
the issue where changing credentials between runs would incorrectly
reuse cached tokens from previous successful authentications.

Benefits:
- Detects credential changes immediately (no stale token reuse)
- Validates current env var credentials, not cached credentials
- Works across all controller types (ACI, SDWAN, CC)
- Best-effort design never blocks execution if invalidation fails

The fix enables the critical workflow:
  Run 1: Wrong creds → Preflight fails, banner shown, exit
  Run 2: Correct creds → Preflight succeeds, tests run, cache written
  Run 3: Wrong creds again → Cache invalidated, preflight fails, banner

Without this fix, Run 3 would incorrectly succeed using the cached
token from Run 2, allowing tests to execute with invalid credentials
in the environment.

* Add unit tests for AuthCache.invalidate()

Implement comprehensive unit tests covering the cache invalidation
method's behavior across success, no-op, and failure scenarios.

Test coverage:
- Successful cache file removal with lock acquisition
- No-op behavior when cache file doesn't exist
- Lock file cleanup after cache removal
- Graceful handling of permission errors (best-effort)

These tests validate the core invalidation logic that enables
preflight credential change detection while ensuring failures
never block test execution.

* Add unit tests for preflight cache invalidation integration

Implement unit tests validating the preflight authentication check's
integration with cache invalidation across all controller types.

Test coverage:
- ACI, SDWAN, and CC cache invalidation during preflight
- Correct cache_key used for each controller type
- No invalidation attempted for unsupported controller types
- Invalidation failures do not prevent authentication attempts

These tests ensure the preflight check correctly invalidates cached
tokens before validating current credentials, enabling immediate
detection of credential changes across all supported architectures.

* Add integration tests for preflight controller authentication

Add pytest-httpserver-based integration tests that exercise the full
auth chain: preflight_auth_check → auth adapter → subprocess → HTTP
request → error classification. Tests cover all three controller types
(APIC, SDWAN, Catalyst Center) with success and failure paths.

Test coverage:
- APIC: success (200), bad credentials (401), forbidden (403), unreachable
- SDWAN: success with session cookie + XSRF token, bad credentials (401)
- CC: success with Basic Auth, bad credentials (401 on both endpoints)

Also adds pytest-httpserver to dev dependencies and registers the
'slow' pytest marker for tests with real network timeouts.

* fix(cli): clean up preflight auth banner output

Remove verbose error detail lines from auth failure and unreachable
banners — the summary line ("Could not authenticate/connect to X at Y")
is sufficient for user-facing output, and the raw error strings were
overflowing the box borders.

Downgrade subprocess_auth and controller_auth log messages from
ERROR/WARNING to DEBUG — the banner already handles user-facing
output, so the log lines were redundant noise. Also remove the
double "Authentication failed:" prefix wrapping in SubprocessAuthError.

* fix: replace hardcoded "APIC recovery" with "controller recovery" in retry logs

The retry backoff messages referenced "APIC recovery" which is
incorrect for SD-WAN and Catalyst Center controllers. Changed to
the controller-agnostic "controller recovery" so the messages are
accurate regardless of which architecture is being tested.

* feat(types): add PreFlightFailure dataclass and ControllerTypeKey alias

Introduce frozen dataclass for pre-flight failure metadata with
Literal-typed failure_type field, and a ControllerTypeKey type alias
for consistent controller type handling across the codebase.

* refactor(constants): consolidate HTTP constants into core/constants.py

Move HTTP status code ranges, auth failure codes, and service
unavailable codes from the standalone http_constants module into
the shared constants module. Update all import sites.

* feat(utils): add shared duration and timestamp formatting utilities

Add format_duration() for smart human-readable duration display
(<1s, seconds, minutes, hours) and format_timestamp_ms() for
millisecond-precision timestamps. Single source of truth for
formatting logic used across CLI, progress, and reporting.

* test(utils): add unit tests for format_duration and format_timestamp_ms

Cover all formatting tiers: None/N/A, sub-second, seconds, minutes,
hours for duration; millisecond precision and default-to-now for
timestamps. 14 test cases total.

* refactor(base-test): use timestamp format constants instead of inline strings

Replace hardcoded strftime format strings with FILE_TIMESTAMP_FORMAT
and FILE_TIMESTAMP_MS_FORMAT from core constants.

* refactor(subprocess-runner): use FILE_TIMESTAMP_MS_FORMAT constant

Replace inline strftime format string with shared constant.

* refactor(archive-aggregator): use FILE_TIMESTAMP_MS_FORMAT constant

Replace inline strftime format string with shared constant.

* refactor(batching-reporter): use FILE_TIMESTAMP_FORMAT constant

Replace inline strftime format string with shared constant.

* refactor(progress-reporter): use shared format_timestamp_ms utility

Replace inline timestamp formatting with the shared utility function
for consistent millisecond-precision timestamps.

* refactor(templates): delegate to shared formatting utilities

Replace inline timestamp and duration formatting with calls to the
shared format_timestamp_ms and format_duration utilities.

* refactor(robot-orchestrator): use shared time format and duration utility

Replace inline strftime format string with CONSOLE_TIME_FORMAT constant
and verbose duration formatting with format_duration().

* refactor(robot-parser): use ROBOT_TIMESTAMP_FORMAT constants

Replace inline Robot Framework timestamp format strings with
ROBOT_TIMESTAMP_FORMAT and ROBOT_TIMESTAMP_FORMAT_NO_MS constants.

* refactor(robot-generator): use REPORT_TIMESTAMP_FORMAT constant

Replace inline strftime format string with shared constant.

* refactor(pyats-orchestrator): use shared format_duration utility

Replace 8-line inline duration formatting block with single call
to the shared format_duration() function.

* refactor(summary-printer): replace duplicate format_duration with shared utility

Delete the 16-line SummaryPrinter.format_duration() static method and
import the shared format_duration from utils.formatting instead.
Updates all 4 call sites from self.format_duration() to format_duration().

* refactor(pyats-generator): use REPORT_TIMESTAMP_FORMAT constant

Replace inline strftime format string with shared constant.

* feat(combined-generator): add pre-flight failure report generation

Add CurlTemplate NamedTuple with data-driven lookup replacing if/elif
chain for curl example generation. Add _generate_pre_flight_failure_report()
to route auth and unreachable failures through the unified combined
summary report with 401/403 template branching.

* feat(cli): route auth failures through unified reporting pipeline

Create PreFlightFailure on auth/unreachable and pass to
CombinedReportGenerator instead of standalone auth_failure module.
Replace inline duration formatting with format_duration() call.

* refactor(cli): delete standalone auth_failure reporting module

Auth failure report generation now handled by CombinedReportGenerator.
The standalone cli/reporting/ module is no longer needed.

* test(cli): delete obsolete auth_failure report tests

Test coverage for pre-flight failure reports now lives in
test_combined_generator.py alongside the unified reporting pipeline.

* test(utils): relocate URL extraction tests to tests/unit/utils/

Move extract_host tests from the deleted cli/reporting test module
to tests/unit/utils/test_url.py alongside the utility they test.

* test(combined-generator): add curl template and pre-flight failure tests

Add TestGetCurlExample (4 tests) for data-driven curl template
generation, TestPreFlightFailureReport (6 tests) for auth/unreachable
failure report generation including 401/403 branching and negative
assertion for legacy auth_failure_report.html.

* test(cli): update auth test assertions for unified reporting pipeline

Adjust test mocking and assertions to match the new PreFlightFailure
routing through CombinedReportGenerator instead of standalone module.

* chore: update uv.lock

* fix(reporting): remove redundant f-string wrapper on plain return value

_get_curl_example() wrapped a bare variable in an f-string
(`return f"{controller_url}"`) when no interpolation was needed.
This is flagged by ruff as an unnecessary f-string (F541-adjacent
pattern) and adds runtime overhead for the string copy. Replace with
a direct `return controller_url`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(reporting): restore "Forbidden" text check in is_403 detection

The is_403 flag in _generate_pre_flight_failure_report() only checked
for the numeric "403" string via HTTP_FORBIDDEN_CODE. Some controller
responses include the word "Forbidden" without the numeric code (e.g.,
"Forbidden: insufficient privileges"). The original auth_failure.py
module checked both patterns, but the "Forbidden" text match was lost
during the migration to the unified reporting pipeline.

Restore the disjunction so the template receives is_403=True for both
"403" and "Forbidden" responses, ensuring the HTML report shows the
correct troubleshooting guidance for permission-denied errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(reporting): add explicit encoding="utf-8" to normal-path write_text

The pre-flight failure code path already specified encoding="utf-8"
when writing combined_summary.html, but the normal (success) code
path relied on the platform default. On systems where the default
encoding is not UTF-8 this would produce garbled HTML output.

Make both code paths consistent by specifying encoding="utf-8"
explicitly, matching the pre-flight failure path and the project
convention of always declaring encoding on file I/O.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(reporting): convert f-string logger calls to %-style lazy formatting

Replace all 6 f-string logger calls in CombinedReportGenerator with
%-style formatting (e.g., `logger.info("msg: %s", val)`).

f-strings are evaluated eagerly at the call site regardless of whether
the log level is enabled. %-style defers interpolation to the logging
framework, which skips the string formatting entirely when the message
would be discarded. This is the pattern recommended by the Python
logging documentation and by ruff's LOG002/G004 rule.

Affected calls:
  - generate_combined_summary(): 3 info + 1 error
  - _generate_pre_flight_failure_report(): 1 info + 1 error

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
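The eager-vs-lazy difference is observable at runtime: with %-style arguments, the logging framework only stringifies the argument when a record is actually emitted. A self-contained demonstration (names here are illustrative, not from the codebase):

```python
import io
import logging

class Expensive:
    """Counts how often it is stringified."""
    calls = 0

    def __str__(self) -> str:
        Expensive.calls += 1
        return "expensive-value"

stream = io.StringIO()
logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False

obj = Expensive()
logger.debug("value: %s", obj)   # below INFO: message is never formatted
assert Expensive.calls == 0
logger.info("value: %s", obj)    # emitted: formatted exactly once
assert Expensive.calls == 1
```

With an f-string (`logger.debug(f"value: {obj}")`), `__str__` would run on the first call too, even though the message is discarded.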

* refactor(types): remove has_pre_flight_failure property, use direct None check

Remove the `has_pre_flight_failure` property from CombinedResults and
replace its single call site in CombinedReportGenerator with a direct
`results.pre_flight_failure is not None` check.

Why: mypy cannot narrow the type of an attribute through a property
accessor. The old code required a secondary `if failure is None` guard
(marked "unreachable") just to satisfy mypy after calling
`results.has_pre_flight_failure`. A direct `is not None` check lets
mypy narrow `results.pre_flight_failure` from `PreFlightFailure | None`
to `PreFlightFailure` automatically, eliminating the dead-code guard
entirely.

This also removes one level of indirection — the property was a
single-expression wrapper around `is not None`, adding API surface
without adding clarity. The direct check is idiomatic Python and
immediately communicates what is being tested.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(auth): narrow controller_type from str to ControllerTypeKey Literal

Tighten the type of `AuthCheckResult.controller_type` and the
`preflight_auth_check()` parameter from bare `str` to the
`ControllerTypeKey` Literal alias ("ACI" | "SDWAN" | "CC" | ...).

This eliminates a class of bugs where an arbitrary string could flow
through the auth pipeline unchecked. The `ControllerTypeKey` Literal
constrains valid values at the type level, giving mypy the ability to
catch invalid controller types at analysis time rather than at runtime.

The `cast()` call is placed at the boundary in main.py where
`detect_controller_type()` (which returns `str` because it reads from
environment variables) enters the typed domain. Once cast at this
single entry point, all downstream code — `preflight_auth_check()`,
`AuthCheckResult`, `PreFlightFailure` — receives a properly typed
`ControllerTypeKey` without needing per-site casts.

This removes the redundant `cast(ControllerTypeKey, ...)` that was
previously needed when constructing PreFlightFailure from
auth_result.controller_type, since that field is now already typed
as ControllerTypeKey.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
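A condensed sketch of the boundary cast. The Literal members shown are a subset of the real alias, and `detect_controller_type` is stubbed for illustration:

```python
from typing import Literal, cast

# subset of the real ControllerTypeKey alias, for illustration
ControllerTypeKey = Literal["ACI", "SDWAN", "CC"]

def detect_controller_type() -> str:
    """Stub: the real function derives this from environment variables."""
    return "ACI"

def preflight_auth_check(controller_type: ControllerTypeKey) -> bool:
    # downstream code receives a properly typed value; an arbitrary
    # string passed here is a type-check error, not a runtime surprise
    return controller_type in ("ACI", "SDWAN", "CC")

# single cast at the boundary where an untyped str enters the typed domain
controller_type = cast(ControllerTypeKey, detect_controller_type())
assert preflight_auth_check(controller_type)
```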

* fix(formatting): restore .2f precision in format_duration

The previous commit inadvertently changed format_duration from .2f to
.1f precision, dropping the second decimal place (e.g., "3.25s" became
"3.3s"). This restores the original .2f format specifier so durations
display with two decimal places, which is more useful for distinguishing
test timings at sub-second granularity.

The docstring examples are updated to reflect the correct output format.

* test(formatting): update format_duration assertions for .2f precision

Updates three test expectations to match the restored .2f format
specifier in format_duration:
  - "1.0s"  -> "1.00s"
  - "45.2s" -> "45.20s"
  - "30.0s" -> "30.00s"

* feat(formatting): add format_file_timestamp_ms() utility function

Three call sites (subprocess_runner, archive_aggregator, base_test)
were each independently doing:

    datetime.now().strftime(FILE_TIMESTAMP_MS_FORMAT)[:-3]

This is a leaky abstraction — callers had to know about the [:-3]
slice that trims microseconds (6 digits) to milliseconds (3 digits),
and each had to import both datetime and the format constant.

The new format_file_timestamp_ms() function encapsulates this pattern
in a single place, consistent with the existing format_timestamp_ms()
function that already handles the display-oriented timestamp format.

The _MICROSECOND_TRIM constant is reused by both functions.

* test(formatting): add tests for format_file_timestamp_ms

Adds TestFormatFileTimestampMs class with four tests covering:
- Known datetime produces exact expected string ("20250627_182616_834")
- Microsecond-to-millisecond trimming works correctly (123456 -> "123")
- None argument defaults to current time (boundary check)
- Output format has the expected 19-character length

These mirror the existing TestFormatTimestampMs structure to maintain
consistency in how we test the two timestamp formatting functions.

* refactor(subprocess-runner): use format_file_timestamp_ms utility

Replaces the inline datetime.now().strftime(FILE_TIMESTAMP_MS_FORMAT)[:-3]
pattern with the new format_file_timestamp_ms() utility. This removes the
need for subprocess_runner to know about the microsecond trimming detail
and drops the now-unused datetime and FILE_TIMESTAMP_MS_FORMAT imports.

* refactor(archive-aggregator): use format_file_timestamp_ms utility

Replaces the inline datetime.now().strftime(FILE_TIMESTAMP_MS_FORMAT)[:-3]
pattern with the new format_file_timestamp_ms() utility. Drops the
now-unused datetime and FILE_TIMESTAMP_MS_FORMAT imports.

This is the second of three call sites migrated to the shared utility.

* refactor(base-test): use format_file_timestamp_ms utility

Replaces the inline strftime + [:-3] slice with format_file_timestamp_ms()
in _generate_test_id(). Removes the now-unused FILE_TIMESTAMP_MS_FORMAT
import from core.constants (datetime import is retained — it is still
used in two other places in this module).

This is the last of three call sites migrated to the shared utility,
completing the elimination of the leaky [:-3] abstraction.

* feat(error-classification): add extract_http_status_code() function

Adds a public function to extract the HTTP status code from an
exception message as an integer, returning None if no code is found.

This reuses the existing _HTTP_STATUS_CODE_PATTERN regex that
_classify_auth_error already uses internally, so the extraction logic
is consistent. The function is intentionally separate from
_classify_auth_error (rather than changing it to return a 3-tuple)
to avoid breaking the 11 existing call sites and their test assertions.

This will be used by the pre-flight auth check to propagate structured
status codes into AuthCheckResult and PreFlightFailure, enabling
reliable is_403 detection without fragile substring matching.

* feat(types): add status_code field to PreFlightFailure

Adds an optional status_code: int | None field to the PreFlightFailure
frozen dataclass. This carries the structured HTTP status code from the
auth check through to the reporting layer, enabling reliable detection
of specific HTTP conditions (e.g., 403 Forbidden) without resorting to
fragile substring matching on the detail string.

The field defaults to None, which is correct for non-HTTP failures like
connection timeouts or DNS resolution errors where there is no HTTP
status code to report.

* feat(auth): add status_code field to AuthCheckResult

Mirrors the status_code field just added to PreFlightFailure. This
allows the pre-flight auth check to capture the HTTP status code at
the point of failure and pass it through to the reporting layer via
PreFlightFailure.

The field defaults to None, which is correct for both successful auth
checks (no error) and non-HTTP failures (timeouts, DNS, etc.).

* feat(auth): extract and propagate HTTP status_code in preflight check

Imports extract_http_status_code and calls it in the exception handler
of preflight_auth_check() to capture the HTTP status code from the
failed auth request. The extracted code is passed into AuthCheckResult
via the new status_code field.

This is the wiring that connects extract_http_status_code (added to
error_classification.py) with the AuthCheckResult field (added in the
previous commit). For non-HTTP errors (timeouts, DNS, etc.),
extract_http_status_code returns None, which is the correct default.

* fix(reporting): replace is_403 substring matching with structured check

The previous is_403 detection used fragile substring matching:

    is_403 = (
        str(HTTP_FORBIDDEN_CODE) in failure.detail
        or "Forbidden" in failure.detail
    )

This could false-positive on any detail string containing "403" or
"Forbidden" as a substring (e.g., "tried 403 times" or a URL path
containing "Forbidden"). It also required the detail string format
to remain stable.

Now that PreFlightFailure carries a structured status_code field,
we can use a direct integer comparison instead:

    is_403 = failure.status_code == HTTP_FORBIDDEN_CODE

This is unambiguous, type-safe, and decoupled from the detail string.

* feat(cli): pass status_code from AuthCheckResult to PreFlightFailure

Threads the status_code field through the CLI main function where
AuthCheckResult is converted into PreFlightFailure. This completes the
data flow from the pre-flight auth check through to the report
generator, where it replaces the fragile substring-based is_403 check.

* test(auth): add status_code propagation tests for preflight check

Adds two tests to TestPreflightAuthCheck:

1. test_propagates_http_status_code — verifies that when auth fails
   with "HTTP 403: Forbidden", the result carries status_code=403.
   This proves the extract_http_status_code -> AuthCheckResult wiring
   works end-to-end.

2. test_status_code_none_for_non_http_errors — verifies that when auth
   fails with a non-HTTP error like "Connection timed out", the
   status_code is None (not some spurious number extracted from the
   error message).

Also adds type: ignore[arg-type] to test_handles_unknown_controller_type
which deliberately passes an invalid controller type string to verify
graceful handling. The ignore is needed because preflight_auth_check now
expects ControllerTypeKey (a Literal type), and "UNKNOWN_CONTROLLER" is
intentionally outside that set — that's the entire point of the test.

* test(error-classification): add tests for extract_http_status_code

Adds TestExtractHttpStatusCode class with five tests covering:
- 401 extraction from "HTTP 401: Unauthorized"
- 403 extraction from "HTTP 403: Forbidden"
- 500 extraction from "HTTP 500: Internal Server Error"
- None returned for non-HTTP messages ("Connection timed out")
- None returned for generic messages ("Something went wrong")

These validate the public extract_http_status_code() function that was
added to error_classification.py, ensuring it correctly reuses the
_HTTP_STATUS_CODE_PATTERN regex to extract status codes and returns
None when no HTTP status code is present.

* test(combined-generator): add status_code to PreFlightFailure fixtures

Updates five PreFlightFailure constructors in the combined generator
tests to include the new status_code field, matching the structured
data that the production code now provides:

- Four auth failure tests: status_code=401 (matching "HTTP 401" detail)
- One 403 test: status_code=403 (matching "HTTP 403: Forbidden" detail)
- Unreachable test: unchanged (no status_code, defaults to None)

This ensures the test fixtures accurately reflect the data shape that
flows through the reporting pipeline in production, where the is_403
check now uses failure.status_code == HTTP_FORBIDDEN_CODE rather than
substring matching on the detail string.

* refactor(reporting): narrow _CURL_TEMPLATES key type to ControllerTypeKey

Changes the _CURL_TEMPLATES dict from dict[str, _CurlTemplate] to
dict[ControllerTypeKey, _CurlTemplate], and narrows the controller_type
parameter of _get_curl_example from str to ControllerTypeKey.

Since _CURL_TEMPLATES keys are always valid controller type literals
("ACI", "SDWAN", "CC"), the type annotation should reflect this.
Using ControllerTypeKey catches typos at type-check time and makes the
data flow consistent with the rest of the reporting pipeline, where
PreFlightFailure.controller_type is already ControllerTypeKey.

* refactor(controller): narrow detect_controller_type return to ControllerTypeKey

Changes the return type of detect_controller_type() from str to
ControllerTypeKey, which is a Literal type covering all supported
controller identifiers ("ACI", "SDWAN", "CC", "MERAKI", etc.).

This eliminates the need for callers to cast() the return value,
since the function now guarantees at the type level that it returns
a valid controller type key.

A cast(ControllerTypeKey, ...) is needed at the return point because
complete_sets is typed as list[str] (populated from CONTROLLER_REGISTRY
dict iteration). The keys are always valid ControllerTypeKey values,
but mypy can't infer this from dict key iteration. The cast documents
this boundary explicitly.

* refactor(cli): remove cast() now that detect_controller_type returns ControllerTypeKey

Now that detect_controller_type() declares its return type as
ControllerTypeKey (the previous commit), the cast() wrapper in main()
is redundant. Removing it also drops the now-unused cast and
ControllerTypeKey imports.

Before: controller_type = cast(ControllerTypeKey, detect_controller_type())
After:  controller_type = detect_controller_type()

The type narrowing is now handled at the source (controller.py) rather
than at every call site — one cast instead of N.

* test(auth): add type: ignore for parametrized controller_type argument

The pytest @parametrize decorator infers controller_type as str, but
preflight_auth_check now expects ControllerTypeKey (a Literal type).
The values ("ACI", "SDWAN", "CC") are all valid ControllerTypeKey
members, but mypy cannot narrow str to Literal through parametrize.

Adding type: ignore[arg-type] is the standard approach for this
pytest + Literal type interaction. The runtime behavior is unchanged.

* feat(banners): add _wrap_url_lines helper for banner content wrapping

Long controller URLs (e.g., https://sandboxdnacultramdeupURL.cisco.com)
can exceed the fixed 78-character banner content width, causing text to
overflow past the right border and breaking the visual box.

This adds a _wrap_url_lines() helper that keeps the URL on the same line
when the combined prefix + URL fits within BANNER_CONTENT_WIDTH, and
wraps the URL to an indented second line when it would overflow. This
approach preserves the fixed-width box (which is important for
consistent terminal rendering across 80-column terminals) while ensuring
all content stays within the borders.

The helper accepts an optional indent parameter for lines that are
already indented (e.g., "  curl -k <url>") so that both the prefix
and wrapped URL line receive consistent leading whitespace.

* fix(banners): wrap long URLs in auth failure banner

The "Could not authenticate to {display_name} at {url}" line is
constructed at runtime, and with long controller URLs (common in
enterprise environments like Catalyst Center sandboxes), the combined
string exceeds the 78-character banner width, causing the right border
to shift out of alignment.

This replaces the single f-string with a _wrap_url_lines() call that
keeps the URL inline for short URLs and wraps it to a second indented
line for long ones. Short URLs (e.g., https://apic.local) render
identically to before — no visual change for the common case.

* fix(banners): wrap long URLs in unreachable banner

The unreachable banner has two lines that embed the controller URL:
the "Could not connect to..." message and the "curl -k <url>" command.
Both can overflow the 78-character banner width with long URLs.

This applies _wrap_url_lines() to both lines. The curl command uses the
indent="  " parameter so both the "curl -k" prefix and the wrapped URL
receive the 2-space indent that visually groups them under the
"Verify the controller is reachable..." instruction line.

The ping command is not wrapped because it uses the extracted hostname
(not the full URL), which is always short enough to fit.

* test(banners): add unit tests for _wrap_url_lines helper

Tests the four key behaviors of the URL wrapping helper:

1. Short URLs that fit within BANNER_CONTENT_WIDTH stay on one line,
   preserving the original single-line format for the common case.

2. Long URLs that exceed the width are wrapped to a second line with
   2-space indentation, keeping the prefix on its own line.

3. The indent parameter is applied to both lines when wrapping occurs,
   so indented commands like "  curl -k" produce "  curl -k" / "    <url>"
   with the URL indented 2 more spaces relative to the prefix.

4. The indent parameter is applied to the single line when no wrapping
   is needed, maintaining consistent indentation.

Each test explicitly verifies its precondition (e.g., that the test URL
actually exceeds the width) to prevent tests from silently passing if
BANNER_CONTENT_WIDTH changes in the future.

* test(banners): add end-to-end tests for long URL banner rendering

These tests verify the actual rendered banner output — not just the
helper function — to ensure no line overflows the box border when a
long controller URL is used. This catches regressions that unit tests
on _wrap_url_lines alone would miss, such as a new banner line being
added that embeds the URL without using the wrapping helper.

Each test uses the same long URL from the original bug report
(https://sandboxdnacultramdeupURL.cisco.com) and asserts that every
rendered line is at most BANNER_CONTENT_WIDTH + 2 characters (the
content width plus the two border characters).

Both display_unreachable_banner and display_auth_failure_banner are
covered since they have different content layouts but the same overflow
risk.

* fix(cli): skip controller detection and preflight auth in dry-run mode

Dry-run mode doesn't need controller access, same as render-only.
Gate both detect_controller_type() and preflight_auth_check() behind
`not render_only and not dry_run` in main() and CombinedOrchestrator.

* fix(tests): mock preflight auth check in integration tests

The integration tests use bogus ACI credentials (ACI_URL=foo) which now
trigger the preflight auth check added in the previous commit. Since
these tests validate Robot rendering/execution/ordering behavior (not
authentication), mock both detect_controller_type and preflight_auth_check
in the existing setup_bogus_controller_env autouse fixtures.

* fix(tests): fix auth cache lock file test for filelock >= 3.13

filelock 3.24.2 (UnixFileLock) auto-deletes .lock files on release,
so the lock file does not persist between separate FileLock acquisitions.
The test incorrectly asserted .lock file existence after _cache_auth_data()
returned — this was testing filelock library behavior, not our code.

Updated to verify the end state: no cache or lock artifacts remain after
invalidation, regardless of filelock's cleanup behavior.

* refactor: move preflight auth check from CLI to CombinedOrchestrator

The nac-test CLI should remain a generic testing engine (Jinja + pabot)
that isn't tied to infrastructure controllers. This moves controller
detection and preflight auth validation from main.py into
CombinedOrchestrator.run_tests(), gated by has_pyats and not
render_only/dry_run.

Changes:
- Remove controller detection and auth check block from cli/main.py
- Add preflight auth check to CombinedOrchestrator.run_tests()
- Consolidate controller detection into run_tests() flow
- Add shared integration test conftest with setup_bogus_controller_env
- Use pytest.mark.usefixtures for selective fixture application
- Remove redundant auth mocks from unit tests (CombinedOrchestrator
  is fully mocked via @patch)
- Update test assertions for new orchestrator-level auth handling

* fix: skip preflight auth check in dry-run mode

Dry-run mode validates test structure, not execution — it should not
require controller credentials. The auth gate in run_tests() now
checks `not self.dry_run` alongside `not self.render_only`.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ns (#569)

* fix(tests): isolate temp file tests to prevent parallel race conditions

Fixes flaky test_temp_files_cleaned_up_on_success that failed intermittently
when running tests with pytest-xdist. The test was using the global system
temp directory to count files, causing race conditions when other parallel
tests created/deleted files with similar patterns.

Solution: Patch tempfile.NamedTemporaryFile to redirect temp file creation
to pytest's tmp_path fixture, providing an isolated directory per test.

Fixes #568

* refactor(tests): address PR review feedback for temp file isolation

- Extract duplicated patching logic into isolated_tempfile fixture
- Use pytest-mock's mocker.patch() instead of unittest.mock.patch
- Patch at call site (subprocess_auth.tempfile.NamedTemporaryFile)
- Add missing *_auth_script.py cleanup assertion in error test

* refactor(tests): simplify MockerFixture imports

Remove unnecessary TYPE_CHECKING guards for MockerFixture - direct
import works fine with Python 3.10+ and pytest-mock is already a
dev dependency loaded at test runtime.

Add targeted cleanup of api/, d2d/, and default/ directories before test
execution to prevent stale html_report_data_temp/*.jsonl files from
interrupted runs being picked up during report generation.

Does NOT remove archives (ArchiveInspector uses newest only) or
pyats_results/ (recreated during report generation anyway).
@danischm danischm left a comment

We should move more detailed documentation to the netascode website and keep the readme short and high level.

@oboehmer

We should move more detailed documentation to the netascode website and keep the readme short and high level.

Fair comment, we track this with #476, which we will address prior to the initial release. You suggested today to merge this branch to main after I release 2.0.0a1, so do you want this resolved prior to the merge?

@danischm

Not necessarily, if it's already tracked we are good.

* chore: release v2.0.0a1 - resolve circular dependency (#489)

Make nac-test-pyats-common a direct dependency instead of optional
[adapters]/[all] extras. Users can now install with `pip install nac-test`
to get the complete runtime including PyATS adapters.

Changes:
- Bump version to 2.0.0a1
- Move nac-test-pyats-common>=0.3.0 to direct dependencies
- Remove [adapters] and [all] optional dependency groups
- Update CI workflow to remove --extra adapters
- Update author/maintainer metadata

Closes #489

* fix: restore [project.optional-dependencies] section header

The section header was accidentally removed when deleting the adapters/all
extras, causing the dev dependencies to become orphaned.

* update 2.0.0 Changelog entry

* fix: remove RESTinstance and setuptools pin from core dependencies

RESTinstance is no longer used by the test templates shipped with nac-test
and forces a setuptools<81 pin due to its pkg_resources dependency. Users
who need RESTinstance can install it separately.

Fixes #628

* docs: consolidate pyATS platform requirements in CHANGELOG

Move Windows and macOS/Python 3.12+ restrictions under the pyATS Integration
feature section since these are pyATS-specific requirements, not breaking
changes from nac-test 1.x.

* docs: clarify --minimal-reports applicability

* fix(robot): handle relative output paths in pabot

Resolve Robot/Pabot output paths before execution so relative --output values do not create nested result directories or break artifact discovery.

Also simplify integration coverage for this case by reusing a shared cwd-relative temp fixture and ignoring __nac_tmp_* directories.

* test(e2e): add mixed relative-output scenario

Add E2E coverage for mixed Robot and PyATS runs that use a relative --output path. This reproduces the pabot path regression end-to-end and protects the fix by verifying Robot artifacts are generated and discovered correctly alongside PyATS results.

* style(tests): address PR review feedback

Replace new os.path usage with pathlib equivalents, tighten fixture and scenario docstrings, and expand the run_pabot docstring to document the absolute-path behavior. Also simplify the mixed relative-output E2E class by dropping the redundant extra assertion and wording.

* fix(tests): restore relpath for Python 3.11 compatibility

Restore os.path.relpath in the E2E relative-output helper. The pathlib-only alternative was more convoluted and relied on newer API behavior, while relpath is simpler, clearer, and works across the supported CI Python versions.

* refactor(tests): move relative output handling into the e2e harness

Remove the duplicated mixed relative-output scenario and treat relative output selection as an execution concern in _run_e2e_scenario(). Also make integration temp fixture cleanup more robust with shutil.rmtree(..., ignore_errors=True).

* test: add comment highlighting the same scenario being used

Labels

enhancement New feature or request new-infra Issues related to the new pyats/robot infra currently under development pyats PyATS framework related
