Design Reddit r/OpenApoc scraper to cross-reference with GitHub issues #12

@ayrtondenner

Description

Goal

Design and implement a strategy to scrape https://www.reddit.com/r/OpenApoc/ and cross-reference posts against the OpenApoc/OpenApoc GitHub issue tracker.

Strategy

  1. Scrape r/OpenApoc: Collect all posts from the subreddit
  2. Filter for issues: Identify which posts are bug reports, feature requests, or issue discussions
  3. Cross-reference with GitHub: For each Reddit issue, look for a matching issue in the OpenApoc/OpenApoc tracker
  4. Classify each Reddit issue into one of three categories:
    • Solved — matches a closed GitHub issue
    • Still pending — matches an open GitHub issue
    • Completely new — no matching GitHub issue exists
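
Steps 2 to 4 could be sketched roughly as below. This is only a starting point under stated assumptions: the keyword list, the 0.6 similarity threshold, and the RedditPost / GitHubIssue types are hypothetical, and title similarity via difflib is a stand-in for whatever matching we end up choosing. The only external fact relied on is that GitHub reports an issue's state as "open" or "closed".

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional

# Hypothetical keyword list; tune it against real r/OpenApoc posts.
# Plain substring matching is crude ("bug" also matches "debug").
ISSUE_KEYWORDS = ("bug", "crash", "error", "broken", "feature request", "suggestion")
MATCH_THRESHOLD = 0.6  # assumed cutoff for treating two titles as the same issue


@dataclass
class RedditPost:  # hypothetical container for a scraped post
    title: str
    body: str = ""


@dataclass
class GitHubIssue:  # hypothetical container for a tracker issue
    number: int
    title: str
    state: str  # "open" or "closed"


def looks_like_issue(post: RedditPost) -> bool:
    """Step 2: keep posts that mention any issue-like keyword."""
    text = f"{post.title} {post.body}".lower()
    return any(keyword in text for keyword in ISSUE_KEYWORDS)


def best_match(post: RedditPost, issues: List[GitHubIssue]) -> Optional[GitHubIssue]:
    """Step 3: return the most similar GitHub issue by title, if similar enough."""
    best, best_score = None, 0.0
    for issue in issues:
        score = SequenceMatcher(None, post.title.lower(), issue.title.lower()).ratio()
        if score > best_score:
            best, best_score = issue, score
    return best if best_score >= MATCH_THRESHOLD else None


def classify(post: RedditPost, issues: List[GitHubIssue]) -> str:
    """Step 4: map a Reddit issue to one of the three categories."""
    match = best_match(post, issues)
    if match is None:
        return "Completely new"
    return "Solved" if match.state == "closed" else "Still pending"
```

Title similarity alone will likely miss posts that describe a known bug in different words, so borderline matches probably need manual review before being recorded.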

Notes

  • Consider Reddit API or third-party scraping tools (similar to DiscordChatExporter approach used for Discord exports)
  • May need to handle Reddit API rate limits and authentication
  • Results should be documented similarly to the Discord mining findings in the wiki
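
For the rate-limit note, one possible shape for the collection step is sketched below, using only the standard library and Reddit's JSON listing endpoint with its "after" pagination cursor. The User-Agent string, the retry count, and the backoff delays are assumptions, not decided values. Unauthenticated access may be throttled or blocked, in which case an OAuth-authenticated client would be needed instead, and listings may not reach the subreddit's full history.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Iterator, Optional

LISTING_URL = "https://www.reddit.com/r/OpenApoc/new.json"
# Hypothetical identifier; Reddit expects a descriptive User-Agent.
USER_AGENT = "openapoc-issue-miner/0.1"


def fetch_page(after: Optional[str], retries: int = 3) -> dict:
    """Fetch one listing page, backing off when the server answers HTTP 429."""
    url = f"{LISTING_URL}?limit=100" + (f"&after={after}" if after else "")
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                return json.load(response)
        except urllib.error.HTTPError as error:
            # Re-raise anything that is not a rate limit, or the final failure.
            if error.code != 429 or attempt == retries - 1:
                raise
            time.sleep(5 * 2 ** attempt)  # assumed backoff: 5 s, 10 s, ...
    raise RuntimeError("retries must be at least 1")


def iter_posts(fetch: Callable[[Optional[str]], dict] = fetch_page) -> Iterator[dict]:
    """Yield post dicts page by page until the listing has no 'after' cursor."""
    after = None
    while True:
        listing = fetch(after)["data"]
        for child in listing["children"]:
            yield child["data"]
        after = listing.get("after")
        if not after:
            break
```

Passing the fetch function as a parameter keeps the pagination logic testable offline and makes it easy to swap in an authenticated client later.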

Metadata

Labels

reddit (Reddit r/OpenApoc community data tasks)
