feat: support multi-CSV profiling, ratio analysis, and code guidance #17

Merged: rad1092 merged 2 commits into main from codex/evaluate-current-project-completion-level on Feb 14, 2026

@rad1092 rad1092 commented Feb 14, 2026

Motivation

  • Add a workflow to analyze multiple CSVs together so users can compute column-level ratios (missing/unique/top values), numeric sign distributions, and discover shared/union columns across files.
  • Provide actionable next steps by auto-generating example pandas code and a human-readable markdown report so users can continue with visualization or feature engineering.
  • Improve the CLI and desktop experience to expose multi-file analysis and local environment diagnostics for smoother end-to-end usage.
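As a rough illustration of the "shared/union columns" idea above, the sketch below reads only the header row of each CSV and computes which columns appear in every file and which appear in at least one. The helper names (read_header, shared_and_union_columns) and the sample data are hypothetical and are not taken from bitnet_tools/multi_csv.py.

```python
import csv
import io


def read_header(text):
    """Return the header row of CSV text as a list of column names."""
    reader = csv.reader(io.StringIO(text))
    return next(reader, [])


def shared_and_union_columns(headers):
    """Return (columns present in every file, columns present in any file)."""
    sets = [set(h) for h in headers]
    if not sets:
        return [], []
    shared = set.intersection(*sets)
    union = set.union(*sets)
    return sorted(shared), sorted(union)


# Two tiny in-memory "files" standing in for real CSV paths (made-up data).
a = "id,price,region\n1,10,KR\n"
b = "id,price,qty\n2,-5,3\n"

shared, union = shared_and_union_columns([read_header(a), read_header(b)])
print(shared)  # ['id', 'price']
print(union)   # ['id', 'price', 'qty', 'region']
```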

Description

  • Added bitnet_tools/multi_csv.py which implements analyze_multiple_csv, per-file column profiling (missing_ratio, unique_ratio, top_values), numeric sign distribution, build_code_guidance, build_multi_csv_markdown, and result_to_json.
  • Extended the streaming summarizer with summarize_reader in bitnet_tools/analysis.py and added build_markdown_report to produce single-file markdown outputs.
  • Extended the CLI (bitnet_tools/cli.py) with a new multi-analyze subcommand and support for report, desktop, and doctor flows, and added a bitnet-desktop entry point to pyproject.toml.
  • Implemented a simple Windows desktop UI (bitnet_tools/desktop.py and bitnet_desktop.pyw) and an environment diagnostic helper (bitnet_tools/doctor.py), and added BitNet_Desktop_Start.bat to simplify launching on Windows.
  • Updated README.md to document multi-CSV capabilities and workflow, and added tests in tests/test_analysis.py and tests/test_cli.py covering the new multi-CSV report builder and CLI behavior.
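The per-column metrics named above (missing_ratio, unique_ratio, top_values, and a numeric sign distribution) could be computed roughly as follows. This is a minimal sketch, not the code in bitnet_tools/multi_csv.py: the function name profile_column, the choice of denominators, and the treatment of empty strings as missing are all assumptions made for illustration.

```python
import math
from collections import Counter


def profile_column(values, top_n=3):
    """Profile one column given its raw string cells (hypothetical helper)."""
    total = len(values)
    # Assumption: a cell is "missing" when it is empty or whitespace only.
    present = [v for v in values if v.strip() != ""]
    counts = Counter(present)

    signs = {"positive": 0, "zero": 0, "negative": 0}
    numeric = 0
    for v in present:
        try:
            x = float(v)
        except ValueError:
            continue  # non-numeric cell, ignored for the sign distribution
        if math.isnan(x):
            continue  # NaN has no sign
        numeric += 1
        if x > 0:
            signs["positive"] += 1
        elif x < 0:
            signs["negative"] += 1
        else:
            signs["zero"] += 1

    return {
        # Assumption: both ratios are taken over all rows, missing included.
        "missing_ratio": (total - len(present)) / total if total else 0.0,
        "unique_ratio": len(counts) / total if total else 0.0,
        "top_values": counts.most_common(top_n),
        "sign_ratio": {k: n / numeric for k, n in signs.items()} if numeric else {},
    }


profile = profile_column(["3", "-1", "", "0", "3"], top_n=1)
print(profile["missing_ratio"])  # 0.2
print(profile["top_values"])     # [('3', 2)]
print(profile["sign_ratio"])     # {'positive': 0.5, 'zero': 0.25, 'negative': 0.25}
```

Keeping the sign distribution relative to numeric cells only (rather than all rows) is one possible design choice; the actual module may normalize differently.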

Testing

  • Ran pytest -q and all tests passed (11 passed).
  • Verified CLI help shows multi-analyze with python -m bitnet_tools.cli --help.
  • Executed a sample multi-file run, python -m bitnet_tools.cli multi-analyze /tmp/a.csv /tmp/b.csv --question "다중 csv 분석" --out-json /tmp/multi.json --out-report /tmp/multi.md (the question text means "multi-CSV analysis"), which produced both the JSON and the markdown output.

@rad1092 rad1092 merged commit d251d4c into main Feb 14, 2026
4 checks passed
@rad1092 rad1092 deleted the codex/evaluate-current-project-completion-level branch February 14, 2026 14:17

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4f343c900b

Comment thread bitnet_tools/multi_csv.py
Comment on lines +142 to +146
"print('결측 비율 상위:\n', missing_ratio.head(10))\n\n"
"numeric_cols = merged.select_dtypes(include='number').columns\n"
"if len(numeric_cols) > 0:\n"
" ratio = (merged[numeric_cols] > 0).mean().sort_values(ascending=False)\n"
" print('양수 비율 상위:\n', ratio.head(10))\n"

P1: Escape newline sequences in generated pandas snippet

build_code_guidance currently uses \n inside quoted literals in the template string, so the generated pandas_example contains actual line breaks inside '...' and fails to compile (SyntaxError: unterminated string literal) when users run the suggested code from JSON/markdown output. This breaks the new “code guidance” workflow for any multi-analyze result that is copied into Python.
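The failure mode described here can be reproduced in isolation. The sketch below does not use the real build_code_guidance template; the two strings are simplified stand-ins, and compiles is a hypothetical helper. It shows that a single-backslash \n inside a normal Python string becomes a real line break in the generated source, while a doubled backslash keeps the escape sequence intact.

```python
# Template written the way the review describes: "\n" inside a normal
# (non-raw) Python string turns into a real line break in the generated
# code, splitting the inner '...' literal across two lines.
broken = "print('header:\n', 1)\n"

# Doubling the backslash emits the two characters backslash + n, so the
# inner string literal stays on one line in the generated code.
fixed = "print('header:\\n', 1)\n"


def compiles(source):
    """Return True if the generated snippet is syntactically valid Python."""
    try:
        compile(source, "<generated>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(broken))  # False
print(compiles(fixed))   # True
```

A check along these lines, run against the generated pandas_example, might serve as a regression test for this issue.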
