<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<title>CSV Cleaner — Fix Messy CSV Files in One Command | Python Tool</title>
<meta name="description" content="CSV Cleaner automatically fixes encoding issues, duplicate rows, inconsistent delimiters, and malformed fields. One command, clean data. Free and open-source Python CLI.">
<meta name="robots" content="index,follow">
<link rel="canonical" href="https://tools.vesperfinch.com/csv-cleaner.html">
<meta property="og:type" content="product">
<meta property="og:title" content="CSV Cleaner — Fix Messy CSV Files in One Command">
<meta property="og:description" content="Automatically fix encoding issues, duplicates, bad delimiters, and malformed fields. Python CLI tool.">
<meta property="og:url" content="https://tools.vesperfinch.com/csv-cleaner.html">
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="CSV Cleaner — Fix Messy CSV Files in One Command">
<meta name="twitter:description" content="Python CLI that cleans messy CSV files automatically. Free &amp; open-source.">
<link rel="stylesheet" href="style.css">
<script type="application/ld+json">
{
"@context":"https://schema.org",
"@type":"SoftwareApplication",
"name":"CSV Cleaner",
"applicationCategory":"DeveloperApplication",
"operatingSystem":"Windows, macOS, Linux",
"offers":[
{"@type":"Offer","price":"0","priceCurrency":"USD","description":"Free open-source edition","url":"https://github.com/vesper-astrena/csv-cleaner"},
{"@type":"Offer","price":"19","priceCurrency":"USD","description":"Pro edition with advanced features","url":"https://vesperfinch.gumroad.com/l/zgnbf"}
],
"author":{"@type":"Organization","name":"Vesper Finch"},
"description":"Fix messy CSV files in one command. Handles encoding issues, duplicate rows, inconsistent delimiters, and malformed fields."
}
</script>
</head>
<body>
<header>
<div class="container">
<a href="index.html" class="logo">Vesper<span>Finch</span></a>
<nav>
<a href="csv-cleaner.html">CSV Cleaner</a>
<a href="promptlab.html">PromptLab</a>
<a href="polymarket-scanner.html">Polymarket Scanner</a>
</nav>
</div>
</header>
<section class="hero">
<div class="container">
<span class="badge">Data Cleaning</span>
<h1>Fix Messy CSV Files<br>in One Command</h1>
<p class="subtitle">You got a CSV from a client, a government portal, or a legacy system. It's broken. CSV Cleaner makes it not broken.</p>
<div class="btn-group">
<a href="https://github.com/vesper-astrena/csv-cleaner" class="btn btn-outline">Get it Free on GitHub</a>
<a href="https://vesperfinch.gumroad.com/l/zgnbf" class="btn btn-green">Get Pro — $19</a>
</div>
</div>
</section>
<!-- Problem -->
<section>
<div class="container">
<div class="section-header">
<h2>Messy Data Is Universal</h2>
<p>Every data professional has wasted hours on files that should just work. Mixed encodings, rogue delimiters, phantom duplicates, trailing whitespace that breaks joins.</p>
</div>
<div class="features">
<div class="feature">
<h4>Encoding Detection</h4>
<p>Auto-detects Shift-JIS, Latin-1, UTF-16, and 20+ other encodings. Converts everything to clean UTF-8.</p>
</div>
<div class="feature">
<h4>Duplicate Removal</h4>
<p>Finds exact duplicates in the free edition and fuzzy duplicates in Pro. Choose to keep the first, keep the last, or flag for review.</p>
</div>
<div class="feature">
<h4>Delimiter Fixing</h4>
<p>Handles mixed tabs, semicolons, and pipes. Normalizes to your preferred delimiter.</p>
</div>
<div class="feature">
<h4>Field Repair</h4>
<p>Fixes unescaped quotes, mismatched columns, and line breaks inside fields.</p>
</div>
<div class="feature">
<h4>Type Inference</h4>
<p>Detects dates, numbers, booleans, and currencies. Standardizes formats across the file.</p>
</div>
<div class="feature">
<h4>Batch Mode</h4>
<p>Point it at a directory and clean 500 files with the same command. Ideal for pipelines. Available in Pro.</p>
</div>
</div>
</div>
</section>
<!-- Before/After -->
<section style="background:var(--surface)">
<div class="container">
<div class="section-header">
<h2>Before &amp; After</h2>
<p>One command transforms this mess into clean, analysis-ready data.</p>
</div>
<div class="ba-grid">
<div class="ba-box ba-before">
<div class="ba-label" style="color:#dc2626">Before: raw_data.csv</div>
name,email,joined,revenue<br>
"John Doe",john@example.com,2024/01/15,$1,200<br>
Jane Smith;jane@example.com;01-15-2024;1200<br>
"John Doe",john@example.com ,2024/01/15,"$1,200"<br>
Bob,,2024-1-15,<br>
"Alice ""Wonder""land",alice@co.jp,15/01/2024,¥98000<br>
</div>
<div class="ba-box ba-after">
<div class="ba-label" style="color:#16a34a">After: raw_data_cleaned.csv</div>
name,email,joined,revenue<br>
John Doe,john@example.com,2024-01-15,1200.00<br>
Jane Smith,jane@example.com,2024-01-15,1200.00<br>
Bob,,2024-01-15,<br>
Alice Wonderland,alice@co.jp,2024-01-15,98000.00<br>
<br>
<span style="color:#16a34a;font-weight:600">Removed 1 duplicate, fixed 5 issues</span>
</div>
</div>
<div class="code-block">
<span class="prompt">$</span> <span class="cmd">pip install csv-cleaner</span><br>
<span class="prompt">$</span> <span class="cmd">csv-cleaner fix raw_data.csv</span><br>
<span class="out">Detecting encoding... UTF-8 (confidence: 98%)</span><br>
<span class="out">Scanned 5 rows, 4 columns</span><br>
<span class="out">Fixed: 1 duplicate, 1 delimiter mismatch, 2 date formats, 2 currency formats</span><br>
<span class="out">Saved: raw_data_cleaned.csv</span>
</div>
</div>
</section>
<!-- Comparison Table -->
<section>
<div class="container">
<div class="section-header">
<h2>Free vs Pro</h2>
<p>The free version covers the core cleaning features. Pro adds batch processing, custom rules, fuzzy duplicate detection, schema validation, and additional output formats.</p>
</div>
<div class="table-wrap">
<table>
<thead>
<tr><th>Feature</th><th>Free</th><th>Pro ($19)</th></tr>
</thead>
<tbody>
<tr><td>Encoding detection &amp; conversion</td><td class="check">Yes</td><td class="check">Yes</td></tr>
<tr><td>Duplicate removal</td><td class="check">Yes</td><td class="check">Yes</td></tr>
<tr><td>Delimiter normalization</td><td class="check">Yes</td><td class="check">Yes</td></tr>
<tr><td>Field repair (quotes, line breaks)</td><td class="check">Yes</td><td class="check">Yes</td></tr>
<tr><td>Date/number standardization</td><td class="check">Yes</td><td class="check">Yes</td></tr>
<tr><td>Dry-run / diff preview</td><td class="check">Yes</td><td class="check">Yes</td></tr>
<tr><td>Batch directory processing</td><td class="cross">—</td><td class="check">Yes</td></tr>
<tr><td>Custom rule definitions (YAML)</td><td class="cross">—</td><td class="check">Yes</td></tr>
<tr><td>Fuzzy duplicate detection</td><td class="cross">—</td><td class="check">Yes</td></tr>
<tr><td>Schema validation &amp; enforcement</td><td class="cross">—</td><td class="check">Yes</td></tr>
<tr><td>JSON / Parquet output</td><td class="cross">—</td><td class="check">Yes</td></tr>
<tr><td>CI/CD integration (exit codes)</td><td class="cross">—</td><td class="check">Yes</td></tr>
<tr><td>Priority support</td><td class="cross">—</td><td class="check">Yes</td></tr>
</tbody>
</table>
</div>
</div>
</section>
<!-- CTA -->
<section>
<div class="container">
<div class="cta-banner">
<h2>Stop wasting time on broken CSVs</h2>
<p>Install in 10 seconds. Clean your first file in 20.</p>
<div class="btn-group">
<a href="https://github.com/vesper-astrena/csv-cleaner" class="btn btn-primary">Free on GitHub</a>
<a href="https://vesperfinch.gumroad.com/l/zgnbf" class="btn btn-outline">Get Pro — $19</a>
</div>
</div>
</div>
</section>
<footer>
<div class="container">
<p>Built by <a href="https://github.com/vesper-astrena">Vesper Finch</a> · <a href="index.html">All Tools</a> · <a href="promptlab.html">PromptLab</a> · <a href="polymarket-scanner.html">Polymarket Scanner</a></p>
</div>
</footer>
</body>
</html>