Design crawler.md integration with lecturer detection #40

@dcasota

Task Description

Design how to integrate the detection capabilities of photonos-docs-lecturer into the crawler droid.

Design Decisions Needed

  1. Detection Strategy:

    • Run both weblinkchecker.sh and lecturer analyze?
    • Replace weblinkchecker with lecturer?
    • Merge both reports?
  2. Plugin Selection:

    • Which plugins to enable? (orphan_page, orphan_link, orphan_image, image_alignment, heading_hierarchy)
    • Any exclusions needed?
  3. Performance:

    • Parallel threads (current: 10, adjust?)
    • Timeout settings?
    • Memory constraints?
  4. Output Format:

    • Keep CSV?
    • JSON for programmatic parsing?
    • Both?
  5. Integration Method:

    • Subprocess call from crawler?
    • Import lecturer as Python module?
    • Separate skill file?
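
To make the options above easier to compare, here is a minimal Python sketch of one possible combination: call each tool as a subprocess with a timeout, merge both reports, and emit CSV and JSON. It is not a decided approach. The `page` and `issue` column names, the function names, and the 600-second default timeout are all assumptions for illustration. The real command lines and report layouts of weblinkchecker.sh and the lecturer are not specified here, so the caller passes them in.

```python
import csv
import io
import json
import subprocess
import sys


def run_tool(cmd, timeout_s=600):
    """Run one detection tool as a subprocess; return (exit_code, stdout).

    `cmd` is the full argument list supplied by the caller. Returns
    (None, "") if the tool exceeds `timeout_s` seconds.
    """
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None, ""
    return done.returncode, done.stdout


def merge_reports(reports):
    """Merge CSV reports into one list of dicts.

    `reports` maps a source name to CSV text with a header row that has the
    (assumed) columns `page` and `issue`. A finding reported by several
    tools is kept once, with every reporting source listed.
    """
    merged = {}
    for source, text in reports.items():
        for row in csv.DictReader(io.StringIO(text)):
            key = (row["page"], row["issue"])
            entry = merged.setdefault(
                key, {"page": row["page"], "issue": row["issue"], "sources": []}
            )
            if source not in entry["sources"]:
                entry["sources"].append(source)
    return list(merged.values())


def write_outputs(rows):
    """Render merged rows as (csv_text, json_text) so both formats stay available."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["page", "issue", "sources"])
    for r in rows:
        writer.writerow([r["page"], r["issue"], ";".join(r["sources"])])
    return buf.getvalue(), json.dumps(rows, indent=2)


if __name__ == "__main__":
    # Stand-in command; a real integration would pass the tool's own command line.
    code, out = run_tool([sys.executable, "-c", "print('page,issue')"])
    print(code, out.strip())
```

Keeping the subprocess boundary means the crawler does not depend on the lecturer's internal Python API, at the cost of parsing its report output. Importing the lecturer as a module would avoid that parsing but couples the two code bases more tightly.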

Deliverables

  • Updated crawler.md specification
  • Implementation approach documented
  • Example code snippets

Acceptance Criteria

  • ✅ Integration approach decided
  • ✅ crawler.md specification updated
  • ✅ Ready to implement

Related

Labels

lecturer-integration, crawler, design, medium
