Automatically detect and fix data quality issues in CSV files. One command, clean data.
python csv_cleaner.py messy_data.csv -o clean_data.csvNo external dependencies. Python 3.10+ only.
- Duplicate rows — exact match removal
- Column names — standardized to snake_case
- Whitespace — leading/trailing spaces, double spaces
- Null variants — NULL, N/A, None, -, nan, #N/A → normalized
- Date formats — auto-detects 9+ formats, normalizes to ISO 8601
- Encoding — auto-detects UTF-8, Latin-1, Shift-JIS, etc.
- Type detection — infers integer, float, date, boolean, string
- Outlier detection — IQR-based outlier identification
$ python csv_cleaner.py sales_export.csv -o sales_clean.csv
CSV Cleaner — Processing sales_export.csv
Read 1,247 rows, 12 columns
Rows: 1,247 → 1,198 (49 duplicates removed)
Columns: 12 → 12
Changes:
Whitespace fixed: 156
Dates normalized: 89
Duplicates removed: 49
Column Profiles:
customer_name: string, 892 unique
email: string, 1,043 unique ⚠ 3% null
signup_date: date, 365 unique
revenue: float, 1,102 unique ⚠ 4 outliers
status: string, 3 unique
# Profile only (analyze without modifying)
python csv_cleaner.py input.csv --profile-only
# Keep original column names
python csv_cleaner.py input.csv --keep-names
# Save detailed JSON report
python csv_cleaner.py input.csv -o clean.csv --report report.jsonThe CSV Cleaner Pro ($19) adds:
- Smart fill for missing values (mean, median, mode, forward-fill)
- Fuzzy deduplication (find similar-but-not-identical rows)
- Outlier handling (flag, cap, or remove)
- Custom validation rules (regex, ranges, required fields)
- Batch processing (clean entire directories)
- Column splitting and merging
- Beautiful HTML reports
- Excel (.xlsx) support
- Python 3.10+
- No external dependencies
- JSONKit — Swiss Army knife for JSON: format, validate, query, diff, flatten, convert
- PromptLab — Test and compare LLM prompts from your terminal
- Polymarket Scanner — Scan 10,000+ prediction markets for mispricings
- All Tools — Full product catalog
MIT — free for personal and commercial use.