
Experience of porting GNU Stow from Perl to Python

This was an interesting experiment that both produced something genuinely useful for my own work and let me evaluate the strengths and weaknesses of the Claude Opus 4.5 model.

Here is what the workflow looked like:

Overview

  • Project: GNU Stow (symlink farm manager)
  • Source: ~3,600 lines of Perl across 3 files
  • Result: ~2,700 lines of Python in 2 standalone files
  • Tests: 239 tests passing on Python 2.7–3.14
  • Time: ~6 hours with Claude Code assistance

Phase 1: Analysis & Planning

My goal here was to steer the AI to stick strictly to the original Perl code, function-by-function, loop-by-loop, if-by-if. In an earlier attempt, I did not insist on this, and there was too much drift to ever get back to an exact match. Staying very close to the Perl ended up being key.

1.1 Understand Source Structure

  • Map out all source files and their relationships
  • Identify entry points (bin/stow, bin/chkstow)
  • Identify core modules (lib/Stow.pm, lib/Stow/Util.pm)
  • Count lines per file to understand scope

1.2 Define Requirements

  • Exact behavior matching — same stdout, stderr, return codes, filesystem effects
  • Target Python version — Python 2.7+ for maximum compatibility
  • No external dependencies — single standalone script
  • Oracle testing — compare Python vs Perl output for identical behavior

1.3 Create Initial Plan

  • Function-by-function transpilation
  • Preserve structure initially, refactor later
  • Mirror directory layout during development

Phase 2: Core Transpilation

2.1 Utility Functions First

Start with the utility module (Util.pm → Util.py):

  • Pure functions with minimal dependencies
  • Easy to test in isolation
  • Examples: join_paths(), parent(), error(), debug()
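As a sketch, the first two might look like this in Python. This is a hypothetical rendering, not the actual ported code: it ignores `..` resolution, and the real port has to match the Perl semantics exactly, including the leading-slash handling that later surfaced as a bug.

```python
def parent(path):
    # Drop the last path component. Sketch only: the real port must
    # preserve a leading slash exactly as the Perl version does
    # (see the leading-slash fix in Phase 5).
    prefix = '/' if path.startswith('/') else ''
    parts = [p for p in path.split('/') if p]
    return prefix + '/'.join(parts[:-1])


def join_paths(*paths):
    # Join path segments, collapsing '.' components. The real
    # implementation also has to deal with '..', omitted here.
    prefix = '/' if paths and paths[0].startswith('/') else ''
    parts = []
    for path in paths:
        parts.extend(p for p in path.split('/') if p and p != '.')
    return prefix + '/'.join(parts)
```

Functions this small are exactly why starting with the utility module pays off: each one can be checked against its Perl counterpart in isolation.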

2.2 Main Class/Module

Transpile the core logic (Stow.pm → Stow.py):

  • Keep method names identical where possible
  • Preserve algorithm structure (loops, conditions)
  • Map Perl idioms to Python equivalents

2.3 CLI Entry Point

Transpile the CLI (bin/stow):

  • Argument parsing (Getopt::Long → manual parsing, to match the Perl behavior exactly)
  • RC file handling
  • Main execution flow
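A manual option loop in the spirit of Getopt::Long might be sketched like this. The option subset and the `parse_options` name are illustrative, not Stow's actual option table:

```python
import sys


def parse_options(argv):
    # Hypothetical sketch of a Getopt::Long-style manual loop:
    # long options, short aliases, and a '--' terminator.
    opts = {'verbose': 0, 'simulate': False, 'dir': None}
    args = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == '--':
            args.extend(argv[i + 1:])
            break
        elif arg in ('-v', '--verbose'):
            opts['verbose'] += 1          # repeatable, like -v -v
        elif arg in ('-n', '--simulate'):
            opts['simulate'] = True
        elif arg in ('-d', '--dir'):
            i += 1                        # option takes a value
            opts['dir'] = argv[i]
        elif arg.startswith('-'):
            sys.stderr.write("unknown option: %s\n" % arg)
            sys.exit(2)
        else:
            args.append(arg)
        i += 1
    return opts, args
```

Writing the loop by hand makes it easier to replicate Getopt::Long quirks (such as repeatable flags) than bending argparse to do the same.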

2.4 Key Perl→Python Gotchas Found

Perl Pattern                  Python Equivalent
die() in eval context         raise RuntimeError()
die() for exit                sys.exit() with errno tracking
$! (errno) in exit codes      Global _last_errno variable
Scalar context @array         len(list)
split(/pat/, $str)            re.split(r'pat', str) + strip trailing empties
-d $path sets $!              Wrapper function that sets _last_errno
$hash{$key}                   dict.get(key) or dict[key]
qr/pattern/                   re.compile(r'pattern')
use lib "path"                sys.path.insert(0, "path")

Phase 3: Oracle Test Framework

An important feedback mechanism that kept Claude on track was the ability to compare directly against the original Perl code's behavior. The model generated the test harness for this easily.

3.1 Create Test Infrastructure

  • Set up pytest with fixtures
  • Create StowTestEnv class for test isolation
  • Implement helpers to run both Perl and Python stow
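The filesystem-state helper that the oracle comparison relies on might be sketched like this (a hypothetical implementation; the real one captures whatever the comparison needs):

```python
import os


def get_filesystem_state(root):
    # Snapshot a directory tree as {relative_path: description} so
    # two runs can be compared with ==. Symlinks record their
    # target, the crucial detail for a symlink farm manager.
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if os.path.islink(full):
                state[rel] = 'link -> ' + os.readlink(full)
            elif os.path.isdir(full):
                state[rel] = 'dir'
            else:
                state[rel] = 'file'
    return state
```

Because the snapshot is a plain dict, a failed comparison in pytest shows exactly which paths differ between the Perl and Python runs.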

3.2 Oracle Testing Pattern

def assert_stow_match(stow_env, args):
    # Run Perl stow
    perl_rc, perl_stdout, perl_stderr = stow_env.run_perl_stow(args)
    perl_state = stow_env.get_filesystem_state()

    # Reset and run Python stow
    stow_env.reset_target()
    python_rc, python_stdout, python_stderr = stow_env.run_python_stow(args)
    python_state = stow_env.get_filesystem_state()

    # Compare everything
    assert perl_rc == python_rc
    assert perl_stdout == python_stdout
    assert perl_stderr == python_stderr
    assert perl_state == python_state

3.3 Start Simple

  • Basic stow/unstow operations
  • Error cases
  • Edge cases (conflicts, adopt, dotfiles)
  • Verbose output, help, version

Phase 4: Transpile Original Tests

The final step was to transpile the original Perl tests to Python: the cases the GNU Stow authors chose to cover are likely to be the interesting or bug-prone ones.

4.1 Parallel Agent Approach

Spawned 12 parallel AI agents to transpile test files simultaneously:

  • Each agent handled 1-3 Perl test files
  • Mapped Test::More assertions to pytest
  • Preserved test semantics

4.2 Test Pattern Mapping

Perl (Test::More)             Python (pytest)
is($a, $b, 'msg')             assert a == b, 'msg'
is_deeply(\@a, \@b)           assert a == b
ok($cond)                     assert cond
like($str, qr/pat/)           assert re.search(r'pat', str)
dies_ok { ... }               with pytest.raises(...)
stderr_like(sub{}, qr//)      capsys.readouterr()
subtest 'name' => sub {}      def test_name():
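A few of these mappings, shown with made-up values rather than real Stow tests:

```python
import re

# is($a, $b, 'msg')   ->  assert a == b, 'msg'
assert 'a/b' == 'a/b', 'paths match'

# like($str, qr/pat/) ->  assert re.search(r'pat', s)
assert re.search(r'^Stow', 'Stow v2.3.1'), 'version banner'

# dies_ok { ... }     ->  with pytest.raises(...); rendered here
# with a plain try/except so the sketch runs without pytest:
raised = False
try:
    raise RuntimeError('boom')
except RuntimeError:
    raised = True
assert raised
```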

4.3 Fix Test Isolation Issues

A lot of the fixing turned out to be in the tests themselves, not in the program.

Perl tests share state; Python tests need isolation:

  • Use pytest's tmp_path fixture
  • Create fresh directories per test
  • Restore working directory after each test
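The same isolation can be sketched without pytest-specific fixtures as a context manager that creates a fresh tree and restores the working directory; under pytest this maps onto the tmp_path fixture plus monkeypatch.chdir. Directory names here are illustrative:

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def isolated_tree():
    # Fresh stow/target directories per test, with the working
    # directory restored afterwards -- the Perl suites shared
    # state, so each Python test gets its own sandbox instead.
    old_cwd = os.getcwd()
    root = tempfile.mkdtemp()
    try:
        os.mkdir(os.path.join(root, 'stow'))
        os.mkdir(os.path.join(root, 'target'))
        os.chdir(root)
        yield root
    finally:
        os.chdir(old_cwd)
        shutil.rmtree(root)
```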

Phase 5: Bug Fixing Iteration

5.1 Track Progress

Started with 111 passed, 98 errors. Iteratively fixed:

  1. Fixture isolation (98 errors → 0)
  2. Regex compilation for patterns
  3. parent() leading slash preservation
  4. Scalar context vs list returns
  5. die() as catchable exception

5.2 Final Results

Once the functionality was complete, the next step was linting and formatting:

  • 239 tests passing
  • 37 oracle tests (behavior-identical to Perl)
  • 0 flake8 errors

Phase 6: Consolidation & Packaging

6.1 Merge to Standalone Scripts

Keeping the files separate during development was helpful for direct comparisons against the Perl layout, but for packaging a single file is much more convenient. The merge was fairly straightforward.

  • Combined lib/Stow/Util.py + lib/Stow/Stow.py + bin/stow → single bin/stow
  • Used section dividers for maintainability
  • Same for chkstow

6.2 Python 2.7 Compatibility

Adding broad compatibility was actually quite easy.

  • Use from __future__ import print_function
  • Use types.ModuleType instead of importlib.util
  • Use dirnames[:] = [] instead of .clear()
  • Use flake8 (not ruff) for linting
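Two of these idioms together, in a sketch that runs unchanged on 2.7 and 3.x (the `find_files` helper and skip list are illustrative, not code from the port):

```python
from __future__ import print_function  # no-op on Python 3

import os


def find_files(root, skip_dirs=('.git',)):
    # Prune directories with slice assignment rather than .clear(),
    # since list.clear() does not exist on Python 2.7. Mutating
    # dirnames in place is how os.walk is told to skip subtrees.
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        found.extend(os.path.join(dirpath, f) for f in filenames)
    return found
```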

6.3 Setup Packaging

Claude handles the bureaucratic details of packaging well: filling out the templates, metadata, and so on.

  • pyproject.toml with script-files
  • setup.py for Python 2.7 compatibility
  • setup.cfg for flake8 config
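For reference, flake8 reads its configuration from a `[flake8]` section in setup.cfg; the values below are illustrative examples, not the project's actual settings:

```ini
[flake8]
max-line-length = 100
exclude = .git,build,dist
```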

Phase 7: CI/CD Setup

7.1 GitHub Actions

Claude was quite fast at figuring out how to set up even the EOL CI/CD jobs.

  • Test matrix: Python 3.8, 3.10, 3.12, 3.13
  • Container jobs for EOL versions: Python 2.7, 3.6
  • Oracle tests: Download Perl stow, run comparison tests
  • Linting: flake8
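A workflow along these lines might look as follows. This is an illustrative sketch, not the repository's actual workflow; job names and versions are examples, and EOL containers often need extra care (pinned older action versions, archive.debian.org apt sources):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.8", "3.10", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install pytest flake8
      - run: flake8 && pytest
  test-eol:
    runs-on: ubuntu-latest
    container: python:2.7-buster   # EOL interpreter via container image
    steps:
      - run: pip install pytest
      - run: pytest
```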

7.2 PyPI Publishing

  • Trusted publishing via GitHub Actions
  • sdist only (pure Python)
  • Trigger on release

Key Success Factors

  1. Exact behavior first — resist urge to "improve" during port
  2. Oracle testing — catches subtle differences automatically
  3. Parallel agents — dramatically speeds up test transpilation
  4. Incremental fixes — track test count, fix in batches
  5. Document gotchas — create DEVELOPMENT.md for future merges
  6. Python 2.7 support — maximizes deployment flexibility

Lessons Learned

  1. Perl die() has multiple uses — sometimes exit, sometimes exception
  2. Perl $! (errno) is stateful — need to track and reset
  3. Scalar vs list context — Python always returns collections
  4. Test isolation matters — Perl tests share state, Python shouldn't
  5. EOL distros need archive repos — Debian Buster/Stretch need archive.debian.org
  6. Consolidation works — single-file scripts are easier to deploy

Metrics

Metric                        Value
Perl source lines             ~3,600
Python result lines           ~2,700
Test count                    239
Oracle tests                  37
Python versions supported     2.7, 3.0–3.14
External dependencies         0
Time to port                  ~6 hours