docs: add app- and package-level READMEs with root index by snowfox1003 · Pull Request #221 · cppalliance/boost-data-collector

snowfox1003 · 2026-05-19T16:42:49Z

Summary

- Add README.md files for Django apps, config/, core/ (including collectors/, operations/*, utils/), boost_collector_runner/, and scripts/.
- Update root README.md with a Critical environment variables table, an App-level READMEs index, and links to the new package docs.

Motivation

Contributors needed a faster way to discover what each package does, which manage.py commands matter, and which env vars are required, without reading every command module or relying only on the long-form docs/ tree.

Changes

Root `README.md`

- Environment variable quick reference (points to .env.example).
- Curated table linking to all new package READMEs.
- Simplified project structure tree; architecture section links to the new index.

New READMEs (24 packages)

| Area | Packages |
| --- | --- |
| Project / orchestration | `config/`, `boost_collector_runner/`, `scripts/` |
| Shared core | `core/`, `core/collectors/`, `core/operations/`, `core/operations/{github_ops,slack_ops,md_ops,file_ops}/`, `core/utils/` |
| Trackers & handlers | All tracker apps + `slack_event_handler` |

Each app README typically includes: overview, common tasks, main run_* command options, other management commands, and a pytest one-liner.

Out of scope

- No Python, schema, migration, or boost_collector_schedule.yaml changes.
- No new READMEs under migrations/, tests/, or most management/commands/ subfolders (per root README policy).

Test plan

- Spot-check root README links to each new `*/README.md`.
- Run `python manage.py <command> --help` for a sample of commands and confirm the documented flags match (e.g. `run_scheduled_collectors`, `run_cppa_user_tracker`).
- Confirm the diff is docs-only (CI lint / pyright / pytest behavior should be unchanged).
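The link spot-check in the test plan could be partly automated. The following is a minimal sketch, not part of this PR: the function name `broken_relative_links` is hypothetical, and it assumes the READMEs use plain inline Markdown links with paths relative to the file that contains them.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](target); image links match too.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(readme: Path) -> list[str]:
    """Return relative link targets in `readme` that do not exist on disk."""
    missing = []
    for target in LINK_RE.findall(readme.read_text(encoding="utf-8")):
        # Skip external URLs, mail links, and in-page anchors.
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue
        path_part = target.split("#", 1)[0]  # drop any "#section" suffix
        if not (readme.parent / path_part).exists():
            missing.append(target)
    return missing
```

Calling `broken_relative_links(Path("README.md"))` from the repository root would list any index entries whose target README is missing.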

Closes #220

Summary by CodeRabbit

Documentation
- Updated top-level README with critical environment variable guidance, clearer project layout, and an "app-level READMEs" index.
- Added or expanded comprehensive README docs for 20+ apps and core modules covering purpose, workflows, CLI command usage, configuration notes, and pytest/test instructions.

coderabbitai · 2026-05-19T16:43:02Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

- `@coderabbitai resume` to resume automatic reviews.
- `@coderabbitai review` to trigger a single review.

📝 Walkthrough

Adds or updates README files across the repo: root README now lists critical environment variables and an app-level README index; core/ and many Django apps receive package-level READMEs documenting purpose, management commands, CLI options, data flows, and test instructions.

Changes

Package and App Documentation

| Layer / File(s) | Summary |
| --- | --- |
| Root README Updates `README.md` | Adds "Critical environment variables" section with table of required vars; updates "Project structure" tree; introduces "App-level READMEs" index and updates "How it works" references. |
| Core Infrastructure Packages `core/README.md`, `core/collectors/README.md`, `core/operations/README.md`, `core/operations/file_ops/README.md`, `core/operations/github_ops/README.md`, `core/operations/md_ops/README.md`, `core/operations/slack_ops/README.md`, `core/utils/README.md` | Adds core package documentation: top-level README describes shared collector/operations/utils and management commands; nested READMEs document collectors orchestration, operations submodules (GitHub/Slack/Markdown/file I/O), and stateless utilities. |
| Boost Apps Documentation `boost_collector_runner/README.md`, `boost_library_tracker/README.md`, `boost_library_docs_tracker/README.md`, `boost_library_usage_dashboard/README.md`, `boost_mailing_list_tracker/README.md`, `boost_usage_tracker/README.md` | Adds READMEs for Boost-focused apps covering YAML-driven scheduling, tracker/documentation/dashboard workflows, CLI option tables (including deprecated aliases where noted), optional Pinecone sync, and pytest invocation instructions. |
| CPPA Apps Documentation `cppa_pinecone_sync/README.md`, `cppa_slack_tracker/README.md`, `cppa_user_tracker/README.md`, `cppa_youtube_script_tracker/README.md` | Adds READMEs for CPPA apps: Pinecone sync embed/upsert contract, Slack tracker API/persistence, user tracker stub and service upserts, YouTube script tracker phases and Pinecone passthrough; each includes management command notes and tests. |
| Activity Trackers Documentation `discord_activity_tracker/README.md`, `github_activity_tracker/README.md`, `clang_github_tracker/README.md`, `wg21_paper_tracker/README.md` | Adds READMEs describing end-to-end ingestion (fetch → DB/workspace JSON → optional Markdown export/git push/Pinecone), backfill/maintenance commands with options, and test invocation guidance. |
| Event Handlers and Utilities `slack_event_handler/README.md`, `scripts/README.md` | Adds Slack Event Handler README (Socket Mode listener and token validation) and a minimal scripts README clarifying repository maintenance scripts are not Django app code. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

jonathanMLDev
wpak-ai

A rabbit hops through files at night,
Organizing docs by soft moonlight.
READMEs bloom, one per app and core,
Now every path has its own door.
🐇📚 Hooray for clearer sight!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately summarizes the main change: comprehensive addition of app- and package-level READMEs with root index updates. |
| Linked Issues check | ✅ Passed | All objectives from issue `#220` are met: new READMEs added for 24 packages/apps, root README updated with environment variables table and app-level index, nested documentation for core/ subtrees, and docs-only change verified. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to documentation. No Python code modifications, schema changes, migrations, or configuration file alterations outside documented scope. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

scripts/README.md (1)
8-8: ⚡ Quick win

Enhance usage guidance for list_cross_app_imports.py.

The clean-macos.sh entry includes usage context (when to use it, optional argument), but the list_cross_app_imports.py entry only states its purpose. Consider adding invocation details and key arguments to match the usage guidance provided for the other script.
📝 Example enhancement
-| [`list_cross_app_imports.py`](list_cross_app_imports.py) | Cross-app import report (Markdown/CSV). |
+| [`list_cross_app_imports.py`](list_cross_app_imports.py) | Cross-app import report (Markdown/CSV). Run `python scripts/list_cross_app_imports.py [--format=md|csv]` to analyze imports between Django apps. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/README.md` at line 8, Update the README entry for
list_cross_app_imports.py to match the other scripts by adding invocation and
key-argument guidance: describe how to run the script (e.g., python
list_cross_app_imports.py [PATH] ), list important flags such as an optional
target path argument, output options/format flag (Markdown/CSV), and a --help
flag, and include a short example invocation showing output selection (e.g.,
generate Markdown or CSV). Reference the script name list_cross_app_imports.py
and ensure the text mirrors the style and level of detail used for
clean-macos.sh so users know when and how to run it and what outputs to expect.

boost_usage_tracker/README.md (1)
38-40: ⚡ Quick win

Replace placeholder descriptions with meaningful command summaries.

The management command descriptions currently just repeat the command name (e.g., "Management command: run_boost_usage_tracker"). Compare with boost_library_tracker/README.md lines 41-47, which provides helpful descriptions like "Backfill previous_filename_id for renamed files" and "Exit 0 if a new Boost release exists."

Consider adding brief descriptions that explain what each command does without requiring --help.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@boost_usage_tracker/README.md` around lines 38 - 40, Replace the placeholder
descriptions for the management commands by writing concise, meaningful
summaries for `run_boost_usage_tracker`, `run_update_created_repos_by_language`,
and `run_update_db`; specifically: describe what each command does (e.g.,
`run_boost_usage_tracker` — scan project repos and aggregate Boost library usage
metrics into the tracker DB, `run_update_created_repos_by_language` — compute or
refresh language-based counts/meta for created repositories, `run_update_db` —
apply incremental updates/migrations or backfill dataset fields in the tracker
DB), and update the README table entries so each command row contains a one-line
action summary rather than repeating the command name.
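
Placeholder rows of this kind could be found mechanically before review. The sketch below is an illustration only: `placeholder_rows` is a hypothetical helper, and it assumes each command table row has the form ``| `command` | description |`` with placeholders written like "Management command: run_boost_usage_tracker", as quoted in the comment above.

```python
import re

# A table row whose first cell is a backticked command name.
ROW_RE = re.compile(r"^\|\s*`([^`]+)`\s*\|\s*(.*?)\s*\|\s*$")


def placeholder_rows(markdown: str) -> list[str]:
    """Return command names whose description only restates the command name."""
    flagged = []
    for line in markdown.splitlines():
        match = ROW_RE.match(line)
        if not match:
            continue  # header, separator, or non-table line
        command, description = match.groups()
        # Strip the command name and the generic prefix, then see what is left.
        leftover = description.replace("`", "").replace(command, "")
        leftover = re.sub(r"management command", "", leftover, flags=re.IGNORECASE)
        if not re.search(r"[A-Za-z]", leftover):
            flagged.append(command)
    return flagged
```

A row is flagged only when nothing alphabetic remains after removing the command name and the generic prefix, so a description such as "Exit 0 if a new Boost release exists." passes.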

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Line 198: The README table references a non-existent file; update the table
row for `boost_library_tracker/` by removing the linked path
`boost_library_tracker/management/commands/README.md` (or alternatively create
that README file if intended). Specifically, edit the table cell containing
"`Boost metadata + many maintenance commands; see also
[`management/commands/`](boost_library_tracker/management/commands/README.md).`"
and either remove the "see also" clause/link or add a new
`boost_library_tracker/management/commands/README.md` with the appropriate
content so the link resolves.

In `@wg21_paper_tracker/README.md`:
- Line 36: The README table entry for the management command currently repeats
the command name; update the description for import_wg21_metadata_from_csv to a
clear functional description such as "Import metadata from CSV" (or similar) so
readers understand what the command does; locate the table row containing the
symbol import_wg21_metadata_from_csv and replace the redundant second-column
text with the improved description.
- Around line 28-30: Remove the redundant "## Title" subsection and its
duplicated bold line ("**WG21 Paper Tracker**") from the README; locate the "##
Title" heading and the following bold text and delete those lines so the
document only keeps the original top-level title heading.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3db39771-171a-4c81-8814-7ca65e977c36

📥 Commits

Reviewing files that changed from the base of the PR and between 7bf1b7e and 6cadf6c.

📒 Files selected for processing (26)

README.md
boost_collector_runner/README.md
boost_library_docs_tracker/README.md
boost_library_tracker/README.md
boost_library_usage_dashboard/README.md
boost_mailing_list_tracker/README.md
boost_usage_tracker/README.md
clang_github_tracker/README.md
config/README.md
core/README.md
core/collectors/README.md
core/operations/README.md
core/operations/file_ops/README.md
core/operations/github_ops/README.md
core/operations/md_ops/README.md
core/operations/slack_ops/README.md
core/utils/README.md
cppa_pinecone_sync/README.md
cppa_slack_tracker/README.md
cppa_user_tracker/README.md
cppa_youtube_script_tracker/README.md
discord_activity_tracker/README.md
github_activity_tracker/README.md
scripts/README.md
slack_event_handler/README.md
wg21_paper_tracker/README.md

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@boost_library_tracker/README.md`:
- Around line 33-35: Remove the redundant "## Title" section that duplicates the
main heading by deleting the lines containing "## Title" and the bold "**Boost
Library Tracker**" under it (the template artifact between the top-level heading
and the rest of the README); keep the original top-level heading intact so the
README only contains a single main title.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5c0bedd2-7cb3-4573-90ec-843d4649cd2f

📥 Commits

Reviewing files that changed from the base of the PR and between 6cadf6c and 6ce1d87.

📒 Files selected for processing (3)

README.md
boost_library_tracker/README.md
wg21_paper_tracker/README.md

jonathanMLDev

I don't believe a README is strictly necessary for general holders such as core and scripts. However, individual apps require a more detailed README. It should reflect the workflow: what data is retrieved from where, how it is processed, where it is stored, and when it is executed (daily, weekly, or otherwise).
Currently, all the READMEs are too simple and uniform, without individual app characteristics.

coderabbitai

🧹 Nitpick comments (1)

clang_github_tracker/README.md (1)
44-45: 💤 Low value

Consider simplifying the date option aliases.

Documenting three equivalent aliases (--since, --from-date, --start-time and --until, --to-date, --end-time) may confuse users about which to use. If all three are genuinely supported, consider noting a "preferred" option or documenting only the primary flag with a mention that aliases exist.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@clang_github_tracker/README.md` around lines 44 - 45, Pick and document a
single preferred flag for each boundary (e.g., prefer --since and --until) and
update the README lines showing `--since`, `--from-date`, `--start-time` and
`--until`, `--to-date`, `--end-time` to present the preferred flag first with a
parenthetical note that aliases exist (e.g., "--since (alias: --from-date,
--start-time)") so the table is simpler and readers know the canonical flag
while preserving alias information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bdebe47d-d393-4a1c-b8f7-b0e9822270ec

📥 Commits

Reviewing files that changed from the base of the PR and between d876c45 and e1507bd.

📒 Files selected for processing (17)

boost_collector_runner/README.md
boost_library_docs_tracker/README.md
boost_library_tracker/README.md
boost_library_usage_dashboard/README.md
boost_mailing_list_tracker/README.md
boost_usage_tracker/README.md
clang_github_tracker/README.md
core/README.md
cppa_pinecone_sync/README.md
cppa_slack_tracker/README.md
cppa_user_tracker/README.md
cppa_youtube_script_tracker/README.md
discord_activity_tracker/README.md
github_activity_tracker/README.md
scripts/README.md
slack_event_handler/README.md
wg21_paper_tracker/README.md

✅ Files skipped from review due to trivial changes (6)

scripts/README.md
cppa_slack_tracker/README.md
boost_mailing_list_tracker/README.md
boost_library_docs_tracker/README.md
wg21_paper_tracker/README.md
discord_activity_tracker/README.md

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Line 163: The project tree code fence in README.md is missing a language tag
causing markdownlint MD040; update the fenced block that starts with the project
tree (the triple backticks surrounding the " . ├── manage.py ..." tree) to use a
language tag by changing the opening backticks to ```text so the block becomes a
text code fence.
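
For a quick local check of the same condition without running markdownlint, a simplified scan for untagged opening fences might look like the following. This is a sketch under assumptions: `fences_missing_language` is a hypothetical helper, it only handles backtick fences, and it treats fences as strictly alternating open/close, which is weaker than what markdownlint's MD040 rule actually implements.

```python
FENCE = "`" * 3  # built at runtime so this example contains no literal fence line


def fences_missing_language(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences that have no language tag."""
    missing = []
    inside_block = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith(FENCE):
            continue
        if inside_block:
            inside_block = False  # this fence closes the current block
        else:
            inside_block = True  # this fence opens a new block
            info = stripped.lstrip("`").strip()  # text after the backticks
            if not info:
                missing.append(number)
    return missing
```

Running it over the root README text would report the line of the project tree block until its opening fence carries a tag such as `text`.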

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d58d66e3-7f37-4e97-8aa3-fbc7d59c1168

📥 Commits

Reviewing files that changed from the base of the PR and between e1507bd and 22e47e3.

📒 Files selected for processing (1)

README.md

snowfox1003 self-assigned this May 19, 2026

docs: enhance README files for various trackers and core components

6cadf6c

coderabbitai Bot reviewed May 19, 2026

Comment thread README.md Outdated

Comment thread wg21_paper_tracker/README.md Outdated

Comment thread wg21_paper_tracker/README.md Outdated

docs: update README files for boost_library_tracker and wg21_paper_tracker to improve clarity and organization

6ce1d87

coderabbitai Bot reviewed May 19, 2026

Comment thread boost_library_tracker/README.md Outdated

docs: remove title sections from README files for various trackers to streamline documentation

d876c45

snowfox1003 requested a review from jonathanMLDev May 19, 2026 17:12

jonathanMLDev reviewed May 19, 2026

Comment thread README.md

snowfox1003 added 2 commits May 19, 2026 14:38

docs: update README files for various trackers to enhance clarity and detail in data workflows

961eced

docs: enhance README files for various trackers to improve clarity and detail in data workflows and service references

e1507bd

coderabbitai Bot reviewed May 19, 2026

docs: update README to reflect improved project structure and organization of directories and files

22e47e3

coderabbitai Bot reviewed May 19, 2026

Comment thread README.md

snowfox1003 requested a review from jonathanMLDev May 19, 2026 20:26

jonathanMLDev approved these changes May 19, 2026

Comment thread README.md Outdated

Comment thread README.md Outdated

Comment thread boost_usage_tracker/README.md Outdated

Comment thread scripts/README.md

docs: update README and workflow documentation to reflect changes in management command names and scheduling for Boost Data Collector

7e66d16

snowfox1003 requested a review from wpak-ai May 20, 2026 01:56
