[repo-health] High: No tests and no CI — 2000+ lines of scraping logic is completely untested #9

@Liohtml

Description

Summary

The repository has 2,247 lines of production Rust code across 9+ source files (person scraper, company scraper, auth, browser management, etc.) with zero unit tests, zero integration tests, and no CI pipeline to catch regressions.

Category

Tests

Severity

High

Location

  • File: src/ (all files)
  • Line(s): N/A — no #[test] annotations anywhere in the codebase

Details

The absence of tests is especially risky here because:

  1. LinkedIn's DOM structure changes frequently, and selector-based scrapers break silently when no regression tests exist
  2. The auth module (src/core/auth.rs) contains complex credential-handling logic with no tests verifying its behavior
  3. There is no CI workflow (.github/workflows/ does not exist), so there is no automated build check either
  4. src/scrapers/person.rs, the largest file at 696 lines, has no test coverage
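
To illustrate point 1, here is a minimal sketch of a fixture-based regression test. It assumes a helper named `extract_headline` and a `headline` class, both hypothetical and not taken from this repository, and it uses plain string search instead of whatever selector library the scrapers actually use. The helper returns `None` when the expected markup is missing, so a DOM change fails a test rather than producing an empty field.

```rust
/// Hypothetical helper (not from this repository): returns the text of the
/// first `<h1 class="headline">` element, or `None` if that markup is absent.
pub fn extract_headline(html: &str) -> Option<String> {
    const OPEN: &str = "<h1 class=\"headline\">";
    const CLOSE: &str = "</h1>";

    // Locate the opening tag, then the closing tag that follows it.
    let start = html.find(OPEN)? + OPEN.len();
    let len = html[start..].find(CLOSE)?;

    let text = html[start..start + len].trim();
    if text.is_empty() {
        None
    } else {
        Some(text.to_string())
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Inline stand-in for a saved page; a real suite might load a file
    // with include_str! instead.
    const FIXTURE: &str = r#"<main><h1 class="headline"> Staff Engineer </h1></main>"#;

    #[test]
    fn headline_is_extracted_from_known_markup() {
        assert_eq!(extract_headline(FIXTURE), Some("Staff Engineer".to_string()));
    }

    #[test]
    fn changed_markup_is_reported_as_missing() {
        // Simulates a renamed class after a site redesign.
        let changed = r#"<main><h1 class="top-card">Staff Engineer</h1></main>"#;
        assert_eq!(extract_headline(changed), None);
    }
}
```

Refreshing the saved fixture periodically and re-running the suite would then surface selector drift as a failing test.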

Suggested Fix

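Add tests (by Cargo convention, unit tests in #[cfg(test)] modules next to the code and integration tests in a top-level tests/ directory) plus a CI workflow, with at minimum: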

  1. Unit tests for HTML parsing helpers in src/core/utils.rs
  2. Unit tests for model serialization/deserialization (src/models/person.rs)
  3. A .github/workflows/ci.yml that runs cargo build and cargo test on push/PR

Example CI workflow:

name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo build --verbose
      - run: cargo test --verbose
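
A first unit test for item 1 could be as small as the sketch below. The function `normalize_whitespace` is a hypothetical stand-in for a parsing helper in src/core/utils.rs; the real helper names and signatures are not shown in this issue. It is built only on the standard library's `str::split_whitespace`.

```rust
/// Hypothetical utility (assumed, not from this repository): collapses runs
/// of whitespace into single spaces and trims both ends.
pub fn normalize_whitespace(raw: &str) -> String {
    raw.split_whitespace().collect::<Vec<_>>().join(" ")
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn collapses_newlines_and_indentation() {
        let scraped = "\n    Jane   Doe\n\t Senior  Engineer  \n";
        assert_eq!(normalize_whitespace(scraped), "Jane Doe Senior Engineer");
    }

    #[test]
    fn empty_and_blank_input_yield_empty_string() {
        assert_eq!(normalize_whitespace(""), "");
        assert_eq!(normalize_whitespace("   \n\t"), "");
    }
}
```

Placed in the same file as the helper, this module is compiled only for `cargo test`, so the CI workflow above would run it without further configuration.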

Effort Estimate

1 hour+


Automated finding by repo-health-agent v1.0
