High-performance data toolkit and text editor for the command line.
Built for people who tell their LLM to do the work. 60+ commands for CSV, TSV, JSON, JSONL, and plain text.
Overview • Data Commands • Text Commands • Pipeline • Expressions • Performance • Install • Build • License
Forge is a single-binary CLI tool for structured data processing and plain-text file editing. One tool, 60+ commands, zero runtime dependencies beyond libc.
- Data processing: filter, sort, join, group, pivot, deduplicate CSV/TSV/JSON/JSONL at millions of rows per second
- Text editing: cat, grep, sed, insert, delete, patch -- everything an LLM or script needs to read and modify text files
- Composable pipeline: chain any operations in a single pass with forge pipe
- Expression engine: arithmetic, comparisons, regex, conditionals, 20+ built-in functions
Built in C++20 with memory-mapped I/O, parallel sort via Intel TBB, and xxhash-based deduplication.
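To illustrate what hash-based deduplication means in general, here is a minimal Python sketch. It is not Forge's implementation: the source only says deduplication is xxhash-based, so the function name `dedup_rows`, the field separator, and the use of `hashlib.blake2b` as a stand-in for xxhash (which is not in Python's standard library) are all assumptions made for the example.

```python
import hashlib

def dedup_rows(rows):
    """Yield each distinct row once, keeping the first occurrence."""
    seen = set()
    for row in rows:
        # Join fields with the ASCII unit separator so that
        # ("a", "bc") and ("ab", "c") produce different keys.
        key = "\x1f".join(row).encode("utf-8")
        # Stand-in hash (assumption): 8-byte blake2b instead of xxhash.
        digest = hashlib.blake2b(key, digest_size=8).digest()
        if digest not in seen:
            seen.add(digest)
            yield row

rows = [
    ("alice", "alice@example.com"),
    ("bob", "bob@example.com"),
    ("alice", "alice@example.com"),
]
unique_rows = list(dedup_rows(rows))
print(unique_rows)
```

Storing a fixed-size digest per row instead of the row itself keeps memory roughly constant per row regardless of row width. The sketch ignores hash collisions, which a production tool would need to consider.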
| Inspection | info head tail schema count sample freq stats describe |
| Column Ops | select drop rename reorder mergecols splitcol addcol |
| Row Ops | filter sort reverse slice shuffle unique dedup isolate |
| Transforms | lower upper trim replace fill transform |
| Analysis | groupby enumerate derive |
| Reshape | pivot unpivot coalesce |
| Multi-File | diff merge concat join intersect subtract |
| Export | export split validate |
Plain-text file operations for LLM toolchains and scripting. All support stdin via -.
| Inspect | cat wc grep |
| Edit | sed insert delete patch prepend append |
| Transform | lines |
# Print file with line numbers
forge cat src/main.cpp --line-numbers --range 10:25
# Search with context
forge grep src/main.cpp -e "TODO|FIXME" -n 2 --line-numbers
# Find and replace
forge sed config.txt -f "localhost" -r "prod.example.com" -o config.txt
# Replace lines 15-20 with new content
forge patch src/main.cpp --range 15:20 -v " return 0;\n}" -o src/main.cpp
# Remove empty lines, trim whitespace, deduplicate
forge lines notes.txt -o clean.txt --nonempty --trim --unique
Chain operations in a single pass. No intermediate files.
forge pipe data.csv \
subtract breach.csv --on email \
filter "salary > 50000" \
derive "salary * 12" --name annual \
select name,email,annual \
sort annual:desc \
-o clean.csv
Add --json for machine-readable pipeline summary:
{"input_rows":50000,"input_columns":5,"output_rows":12340,"output_columns":3,"steps":5}
The expression engine powers filter, derive, and pipeline steps.
# Filter with conditions
forge filter data.csv -e "salary > 80000 AND department == 'Engineering'" -o out.csv
# Computed columns with arithmetic
forge derive data.csv -e "salary * 12" --name annual -o out.csv
forge derive data.csv -e "round(price * quantity * 1.08, 2)" --name total -o out.csv
# Conditional columns
forge derive data.csv -e "if(salary > 80000, 'senior', 'junior')" --name level -o out.csv
# String functions
forge derive data.csv -e "concat(first, ' ', last)" --name fullname -o out.csv
| Arithmetic | + - * / % |
| Comparison | == != > < >= <= |
| Pattern | ~ !~ (regex match) |
| Logic | AND OR NOT |
| String | len upper lower trim concat substr replace contains startswith endswith |
| Numeric | abs round floor ceil min max |
| Control | if empty notempty |
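As a rough mental model of what derive and filter compute, the expressions from the examples above can be mirrored in plain Python. This is only a sketch of the semantics as they read from the examples: the helper `derive` and the sample rows are hypothetical, and the source does not say how Forge coerces CSV strings to numbers, so the sample data uses integers directly.

```python
rows = [
    {"first": "Ada", "last": "Lovelace", "salary": 95000},
    {"first": "Alan", "last": "Turing", "salary": 72000},
]

def derive(rows, name, fn):
    """Return new rows with one extra column computed per row."""
    return [{**row, name: fn(row)} for row in rows]

# Mirrors: derive "salary * 12" --name annual
rows = derive(rows, "annual", lambda r: r["salary"] * 12)

# Mirrors: derive "if(salary > 80000, 'senior', 'junior')" --name level
rows = derive(rows, "level", lambda r: "senior" if r["salary"] > 80000 else "junior")

# Mirrors: derive "concat(first, ' ', last)" --name fullname
rows = derive(rows, "fullname", lambda r: r["first"] + " " + r["last"])

# Mirrors: filter "salary > 80000"
filtered = [r for r in rows if r["salary"] > 80000]

for r in rows:
    print(r["fullname"], r["annual"], r["level"])
```

Each derive step adds exactly one named column and leaves the existing ones untouched, which is why steps can be chained in any order that respects column dependencies.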
curl -fsSL https://raw.githubusercontent.com/deadcode-walker/forge/main/install.sh | sh
Installs the latest release as forge-cli to /usr/local/bin. Run it again to update.
Install a specific version:
curl -fsSL https://raw.githubusercontent.com/deadcode-walker/forge/main/install.sh | sh -s v1.3.0
Benchmarked on AMD Ryzen 9 9950X3D, 562MB CSV (13.5M rows, 15 columns):
| Operation | Time | Throughput |
|---|---|---|
| count | 0.07s | 192M rows/s |
| head 5 | 0.02s | instant |
| info | 1.2s | 10.9M rows/s |
| select 2 cols | 1.4s | 9.6M rows/s |
| filter | 2.1s | 6.4M rows/s |
| lower | 2.3s | 5.9M rows/s |
| export TSV | 1.6s | 8.3M rows/s |
| sort (parallel) | 12.9s | -- |
| unique (xxhash) | 6.3s | 2.5M rows/s |
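The throughput column reads as row count divided by wall-clock time. A short Python check, assuming the 13.5M-row figure applies to every operation (the function name `throughput_m_rows_per_s` is made up for the example):

```python
TOTAL_ROWS = 13_500_000  # rows in the benchmark CSV, per the text above

def throughput_m_rows_per_s(seconds, rows=TOTAL_ROWS):
    """Rows processed per second, expressed in millions."""
    return rows / seconds / 1_000_000

# Timings copied from the table.
timings = {"count": 0.07, "select 2 cols": 1.4, "filter": 2.1, "lower": 2.3}
results = {op: throughput_m_rows_per_s(secs) for op, secs in timings.items()}

for op, value in results.items():
    print(f"{op}: {value:.1f}M rows/s")
```

For these four operations the recomputed values agree with the table to the precision shown. The info, export TSV, and unique rows do not reproduce exactly from the rounded times; the source does not say how those figures were measured.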
Requirements
| C++ compiler | GCC 12+, Clang 15+, or MSVC 2022+ |
| CMake | 3.20+ |
| xxhash | latest |
| TBB | Intel Threading Building Blocks |
csv-parser v2.5.0 is vendored and requires no separate install.
Linux
# Arch
sudo pacman -S xxhash tbb
# Ubuntu/Debian
sudo apt install libxxhash-dev libtbb-dev
# Build
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)
Windows (MSYS2)
pacman -S mingw-w64-x86_64-xxhash mingw-w64-x86_64-tbb
cmake -B build -G "MinGW Makefiles" -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)
# Inspect
forge info data.csv
forge head data.csv -n 20
forge describe data.csv --json
# Filter and sort
forge filter data.csv -e "salary > 80000" -o filtered.csv
forge sort data.csv -c salary:desc -o sorted.csv
# Computed columns
forge derive data.csv -e "salary * 12" --name annual -o enriched.csv
# Group and aggregate
forge groupby data.csv -c department -a salary:mean,salary:count -o summary.csv
# Join
forge join users.csv orders.csv -c user_id --type left -o joined.csv
# Text editing
forge cat src/main.cpp --line-numbers --range 1:50
forge grep src/ -e "TODO" --count
forge patch config.yaml --range 12:14 -v "port: 8080\nhost: 0.0.0.0" -o config.yaml
# Pipeline: chain everything
forge pipe data.csv \
filter "status == 'active'" \
derive "price * qty" --name total \
groupby region -a total:sum \
sort total_sum:desc \
-o report.csv
# Typo? Forge suggests the right command
forge fliter data.csv
# error: unknown command 'fliter' (did you mean 'filter'?)
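For scripted or LLM-driven use, the --json pipeline summary shown earlier can be checked programmatically. The following Python sketch parses that exact sample object and derives a couple of sanity metrics. It takes the summary text as given: the source does not state which stream the summary is written to, so capturing it from a real run is left out.

```python
import json

# Sample summary copied from the pipeline section of this README.
summary_text = (
    '{"input_rows":50000,"input_columns":5,'
    '"output_rows":12340,"output_columns":3,"steps":5}'
)
summary = json.loads(summary_text)

dropped = summary["input_rows"] - summary["output_rows"]
kept_ratio = summary["output_rows"] / summary["input_rows"]

print(
    f"{summary['steps']} steps kept {summary['output_rows']} of "
    f"{summary['input_rows']} rows ({kept_ratio:.1%}); dropped {dropped}"
)

# A calling script might treat an empty result as a failure.
if summary["output_rows"] == 0:
    raise SystemExit("pipeline produced no rows")
```

A check like this lets a calling script or agent confirm that a pipeline did something plausible without reading the output file itself.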