This is a work in progress, might eat your cat!
This project provides both a Fluent-Bit WASM filter and a standalone CLI tool for filtering logs using logcheck rules.
Fluent Bit is a popular open-source log-shipping tool that takes in logs from many different sources, filters and processes them, then sends them on to any of its supported outputs.
One of its filter plugins is the WASM filter, which currently embeds the WebAssembly Micro Runtime (WAMR) to execute WebAssembly (WASM) programs that process or transform particular flows of log messages passing through Fluent Bit.
The logcheck-filter CLI tool provides a standalone way to filter log files using logcheck rules from the logcheck-database package. It can read from files, stdin, or systemd journal, and output filtered results in text or JSON format.
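As a rough illustration of what filtering with logcheck rules means, the sketch below classifies a line by checking rule groups in a fixed order. The `RuleSet` type, the `classify` function, and the precedence (cracking, then violations, then ignore) are assumptions modeled on logcheck's layered rule directories, not this project's actual API. Plain substring checks stand in for the regular expressions that real rule files contain.

```rust
/// Outcome of checking one log line against the rule groups.
#[derive(Debug, PartialEq)]
enum RuleType {
    Cracking,
    Violations,
    Ignore,
    Unmatched,
}

/// Hypothetical rule set: substrings stand in for logcheck regexes.
struct RuleSet<'a> {
    cracking: &'a [&'a str],
    violations: &'a [&'a str],
    ignore: &'a [&'a str],
}

/// Check the groups in order; the first group with a hit decides the result.
fn classify(rules: &RuleSet, line: &str) -> RuleType {
    let hit = |patterns: &[&str]| patterns.iter().any(|p| line.contains(*p));
    if hit(rules.cracking) {
        RuleType::Cracking
    } else if hit(rules.violations) {
        RuleType::Violations
    } else if hit(rules.ignore) {
        RuleType::Ignore
    } else {
        RuleType::Unmatched
    }
}

fn main() {
    let rules = RuleSet {
        cracking: &["Invalid user"],
        violations: &["Failed password"],
        ignore: &["Started Session"],
    };
    for line in [
        "sshd[5678]: Invalid user root from 192.168.1.200",
        "sshd[1234]: Failed password for admin",
        "systemd[1]: Started Session 42",
        "unknown: weird message",
    ] {
        println!("{:?}: {}", classify(&rules, line), line);
    }
}
```

Lines that end up as `Unmatched` are the ones no rule accounts for, which is why they are worth reviewing separately.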
Key Features:
- Pure Rust - No C dependencies, runs on Alpine Linux
- Multiple input sources - Files, stdin, systemd journald
- Flexible output - Text (colored) or JSON format
- Production-ready - Uses 1000+ logcheck rules from Debian
- Statistics - Processing summaries and match rates
- Filtering modes - Show all, violations only, or unmatched entries
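The filtering modes and the match rate can be pictured with a small sketch. The `Show` and `Kind` enums and both functions are hypothetical names chosen for illustration. The sketch assumes that "all" shows every entry and that the match rate is the share of entries matched by any rule.

```rust
/// How one log entry was classified (hypothetical, simplified).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Violation,
    Ignored,
    Unmatched,
}

/// The three display modes named in the feature list.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Show {
    All,
    Violations,
    Unmatched,
}

/// Decide whether an entry of the given kind is displayed in the given mode.
fn keep(mode: Show, kind: Kind) -> bool {
    match mode {
        Show::All => true,
        Show::Violations => kind == Kind::Violation,
        Show::Unmatched => kind == Kind::Unmatched,
    }
}

/// Fraction of entries that matched some rule (0.0 for empty input).
fn match_rate(kinds: &[Kind]) -> f64 {
    if kinds.is_empty() {
        return 0.0;
    }
    let matched = kinds.iter().filter(|k| **k != Kind::Unmatched).count();
    matched as f64 / kinds.len() as f64
}

fn main() {
    let kinds = [Kind::Violation, Kind::Ignored, Kind::Ignored, Kind::Unmatched];
    for mode in [Show::All, Show::Violations, Show::Unmatched] {
        let shown = kinds.iter().filter(|k| keep(mode, **k)).count();
        println!("{:?}: {} of {} entries shown", mode, shown, kinds.len());
    }
    println!("match rate: {:.0}%", match_rate(&kinds) * 100.0);
}
```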
Download pre-built binaries from GitHub Releases for multiple platforms:
CLI Tools:
- logcheck-filter-linux-amd64.tar.gz (Linux x86_64, glibc)
- logcheck-filter-linux-arm64.tar.gz (Linux ARM64, glibc)
- logcheck-filter-darwin-amd64.tar.gz (macOS Intel)
- logcheck-filter-darwin-arm64.tar.gz (macOS Apple Silicon)
WASM Filter:
- fluentbit-wasm-filter.tar.gz (WebAssembly module)
# Pull the latest image (linux/amd64)
docker pull ghcr.io/finkregh/fluent-bit-logcheck:latest

Build the WASM filter:
cargo xtask build-wasm --release
# Creates: target/wasm32-unknown-unknown/release/logcheck_fluent_bit_filter.wasm

Basic Configuration (fluent-bit.conf):
[INPUT]
    name systemd
    tag journal.system
    read_from_tail on
[FILTER]
    name wasm
    match journal.*
    wasm_path ./target/wasm32-unknown-unknown/release/logcheck_fluent_bit_filter.wasm
    function_name logcheck_filter_json
    accessible_paths .
[OUTPUT]
    name stdout
    match *
    format json_lines

Run Fluent-Bit:
fluent-bit -c fluent-bit.conf

# Build the CLI tool
cargo build --release --bin logcheck-filter
# Filter a log file
logcheck-filter --rules /etc/logcheck file /var/log/syslog
# Read from stdin
cat /var/log/syslog | logcheck-filter --rules /etc/logcheck stdin
# Read from systemd journal (Linux only)
logcheck-filter --rules /etc/logcheck journald --unit sshd --lines 100
# Show only violations
logcheck-filter --rules /etc/logcheck --show violations file /var/log/auth.log
# JSON output with statistics
logcheck-filter --rules /etc/logcheck --format json --stats file /var/log/syslog
# Colored output
logcheck-filter --rules /etc/logcheck --color file /var/log/syslog

Multi-source monitoring:
# Monitor live systemd journal for security events
logcheck-filter --rules /etc/logcheck --show violations --color journald --follow --unit sshd
# Process multiple log files with statistics
for log in /var/log/{auth,syslog,messages}.log; do
echo "Processing $log:"
logcheck-filter --rules /etc/logcheck --stats --format json file "$log" | jq -r '.logcheck_category' | sort | uniq -c
done
# Real-time log streaming with filtering
tail -f /var/log/syslog | logcheck-filter --rules /etc/logcheck --color --show violations stdin

Integration with other tools:
# Export violations to CSV for analysis
logcheck-filter --rules /etc/logcheck --format json --show violations file /var/log/auth.log | \
jq -r '[.message, .logcheck_category, .logcheck_rule_type] | @csv' > security-violations.csv
# Count violations by category
logcheck-filter --rules /etc/logcheck --format json --show violations file /var/log/syslog | \
jq -r '.logcheck_category' | sort | uniq -c | sort -nr
# Monitor log rates in real-time
logcheck-filter --rules /etc/logcheck --stats journald --follow --lines 0 | \
grep -o "Processed [0-9]* entries" | \
while read line; do echo "$(date): $line"; done

Analyze unmatched journald entries and generate regex suggestions:
# Launch analyzer with default minimum group size (2)
logcheck-filter --show unmatched journald analyze
# Require at least 3 similar entries per pattern
logcheck-filter --show unmatched journald analyze --min-group-size 3

Key bindings:
Pattern list
- ↑/↓ or j/k: Move selection
- Enter: Open save dialog
- PgUp/PgDn: Scroll preview
- q/Esc: Quit analyzer
Save dialog
- e: Edit regex
- ↑/↓: Change category (when not editing)
- ←/→: Move cursor (when editing)
- Enter: Save rule
- Esc: Cancel (or finish editing)
Rules are written to /etc/logcheck under the appropriate ignore.d.* directory
as local-generated with metadata comments.
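One way to picture how the analyzer's minimum group size could work is to reduce each unmatched line to a template and keep only templates shared by enough lines. This is a guess at the general idea, not the project's algorithm. The `template` and `group_similar` functions are hypothetical, and collapsing digit runs to `#` is an assumed similarity measure.

```rust
use std::collections::BTreeMap;

/// Replace every run of ASCII digits with a single '#', so lines that
/// differ only in numbers (PIDs, ports, counters) share one template.
fn template(line: &str) -> String {
    let mut out = String::with_capacity(line.len());
    let mut in_digits = false;
    for ch in line.chars() {
        if ch.is_ascii_digit() {
            if !in_digits {
                out.push('#');
                in_digits = true;
            }
        } else {
            out.push(ch);
            in_digits = false;
        }
    }
    out
}

/// Group lines by template and drop groups smaller than `min_group_size`.
fn group_similar<'a>(
    lines: &[&'a str],
    min_group_size: usize,
) -> Vec<(String, Vec<&'a str>)> {
    let mut groups: BTreeMap<String, Vec<&'a str>> = BTreeMap::new();
    for &line in lines {
        groups.entry(template(line)).or_default().push(line);
    }
    groups
        .into_iter()
        .filter(|(_, members)| members.len() >= min_group_size)
        .collect()
}

fn main() {
    let lines = [
        "cron[101]: job 7 finished",
        "cron[202]: job 9 finished",
        "kernel: link up on eth0",
    ];
    for (tpl, members) in group_similar(&lines, 2) {
        println!("{} entries share template: {}", members.len(), tpl);
    }
}
```

With a minimum group size of 2, the single kernel line is dropped and only the cron template remains as a candidate for a new rule.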
- Throughput: ~10,000 log entries/second on modern hardware
- Memory Usage: ~50MB baseline + 1MB per 1000 logcheck rules
- Startup Time: 2-3 seconds to compile 1247 production logcheck rules
- CPU Impact: Adds ~15% CPU overhead compared to native fluent-bit filters
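The figures above can be turned into quick back-of-the-envelope estimates. The sketch below applies the stated approximations (50 MB baseline plus 1 MB per 1000 rules, about 10,000 entries per second). The function names are made up for illustration and the results are only as good as those rough figures.

```rust
/// Estimated resident memory in MB: ~50 MB baseline + ~1 MB per 1000 rules.
fn estimated_memory_mb(rule_count: u32) -> f64 {
    50.0 + f64::from(rule_count) / 1000.0
}

/// Estimated processing time in seconds at ~10,000 entries per second.
fn estimated_seconds(entries: u64) -> f64 {
    entries as f64 / 10_000.0
}

fn main() {
    // 1247 is the production rule count quoted above.
    println!("memory for 1247 rules: {:.1} MB", estimated_memory_mb(1247));
    println!("time for 1,000,000 entries: {:.0} s", estimated_seconds(1_000_000));
}
```

For the 1247 production rules this comes to a little over 51 MB, and a million log entries would take on the order of 100 seconds.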
Monitor these fluent-bit metrics for WASM filter health:
# Check filter processing rate
curl -s http://localhost:2020/api/v1/metrics | grep -E "fluentbit_filter_(add|drop)_records_total"
# Monitor WASM memory usage
curl -s http://localhost:2020/api/v1/metrics | grep "fluentbit_wasm"

Common Issues:
- WASM Module Loading Fails
  Error: failed to load WASM module
  Solution: Check the file path and ensure accessible_paths includes the directory
- Rules Directory Not Found
  Error: Could not find logcheck rules
  Solution: Ensure /etc/logcheck exists or mount the rules directory in the container
- Memory Exhaustion
  Error: WASM execution failed
  Solution: Increase fluent-bit memory limits or reduce rule set size
Debug Mode:
[FILTER]
    name wasm
    match *
    wasm_path ./logcheck_fluent_bit_filter.wasm
    function_name logcheck_filter_json
    accessible_paths .
    # Enable debug logging
    log_level debug

- Rule Chunking: Large rule sets are automatically chunked for better performance
- Input Filtering: Use fluent-bit match patterns to process only relevant logs
- Memory Tuning: Increase WASM stack size in .cargo/config.toml for complex regex
- Caching: Rules are compiled once at startup and cached for the session
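The compile-once caching idea can be sketched with the standard library's `OnceLock`. This is an illustration of the pattern, not the project's code: `compiled_rules`, `matches_any`, and the counter are hypothetical, and a vector of substrings stands in for compiled regexes.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the expensive "compile" step actually runs.
static COMPILE_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Process-wide cache, filled on first use.
static RULES: OnceLock<Vec<String>> = OnceLock::new();

/// Return the cached rules, "compiling" them only on the first call.
fn compiled_rules() -> &'static [String] {
    RULES.get_or_init(|| {
        COMPILE_RUNS.fetch_add(1, Ordering::SeqCst);
        // Stand-in for loading and compiling real logcheck regexes.
        vec!["Failed password".to_string(), "Invalid user".to_string()]
    })
}

/// Check a line against the cached rules.
fn matches_any(line: &str) -> bool {
    compiled_rules().iter().any(|rule| line.contains(rule.as_str()))
}

fn main() {
    for line in ["sshd: Failed password for admin", "systemd: Started Session 42"] {
        println!("{} -> matched: {}", line, matches_any(line));
    }
    println!("compile runs: {}", COMPILE_RUNS.load(Ordering::SeqCst));
}
```

Every call after the first reuses the same rule set, so the startup cost is paid once per process rather than once per log record.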
Filter violations from SSH logs:
logcheck-filter --rules /etc/logcheck --show violations file /var/log/auth.log

Output:
Loading logcheck rules from: /etc/logcheck
Loaded 1247 rules across 8 categories
Reading from: /var/log/auth.log
[VIOLATION] Jan 01 10:00:00 host sshd[1234]: Failed password for invalid user admin from 192.168.1.100
[CRACKING] Jan 01 10:05:00 host sshd[5678]: Invalid user root from 192.168.1.200
JSON output for programmatic processing:
logcheck-filter --rules /etc/logcheck --format json file /var/log/syslog

Output:
{"message":"Jan 01 10:00:00 host sshd[1234]: Failed password for admin","matched":true,"category":"Violations","rule_type":"violations"}
{"message":"Jan 01 10:01:00 host systemd[1]: Started Session 42","matched":true,"category":"SystemEvents","rule_type":"ignore"}
{"message":"Jan 01 10:02:00 host unknown: weird message","matched":false,"category":null,"rule_type":"unmatched"}

System Logs Pipeline:
[INPUT]
    name systemd
    tag journal.system
    read_from_tail on
    strip_underscores on
    lowercase on
[INPUT]
    name tail
    path /var/log/syslog
    tag file.syslog
    parser syslog-rfc3164
    read_from_head false
[INPUT]
    name syslog
    port 514
    tag network.syslog
    parser syslog-rfc3164
[FILTER]
    name wasm
    match *
    wasm_path /opt/fluent-bit/filters/logcheck_fluent_bit_filter.wasm
    function_name logcheck_filter_json
    accessible_paths /etc/logcheck
[OUTPUT]
    name forward
    match *
    host log-aggregator.company.com
    port 24224

Route by logcheck classifications:
[INPUT]
    name systemd
    tag journal.security
    systemd_filter _TRANSPORT=audit
    systemd_filter _SYSTEMD_UNIT=sshd.service
[FILTER]
    name wasm
    match journal.security
    wasm_path /opt/fluent-bit/filters/logcheck_fluent_bit_filter.wasm
    function_name logcheck_filter_json
    accessible_paths /etc/logcheck
# Route violations to security team
[OUTPUT]
    name file
    match_regex journal\.security.*
    path /var/log/security-violations.log
    format json_lines
    # Add conditional routing based on logcheck_category field
# Route normal events to standard aggregation
[OUTPUT]
    name forward
    match journal.security
    host central-logs.company.com
    port 24224

Docker Compose Example:
version: '3.8'
services:
  fluent-bit:
    image: ghcr.io/finkregh/fluent-bit-logcheck:latest
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - /etc/logcheck:/etc/logcheck:ro
      - /var/log:/var/log:ro
      - /run/systemd/journal:/run/systemd/journal:ro
    ports:
      - "24224:24224"
    cap_add:
      - SYS_PTRACE # For systemd journal access

The project publishes container images to GitHub Container Registry for the linux/amd64 platform.
Kubernetes Deployment:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit-logcheck
spec:
  selector:
    matchLabels:
      name: fluent-bit-logcheck
  template:
    metadata:
      # Pod labels must match spec.selector.matchLabels
      labels:
        name: fluent-bit-logcheck
    spec:
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:latest
          volumeMounts:
            - name: config
              mountPath: /fluent-bit/etc/
            - name: wasm-filter
              mountPath: /opt/filters/
            - name: logcheck-rules
              mountPath: /etc/logcheck
            - name: varlog
              mountPath: /var/log
            - name: journal
              mountPath: /run/systemd/journal
      volumes:
        - name: config
          configMap:
            name: fluent-bit-config
        - name: wasm-filter
          configMap:
            name: logcheck-wasm-filter
        - name: logcheck-rules
          configMap:
            name: logcheck-rules
        - name: varlog
          hostPath:
            path: /var/log
        - name: journal
          hostPath:
            path: /run/systemd/journal

- Rust compiler with your target installed:
  - Linux: rustup target add x86_64-unknown-linux-gnu or aarch64-unknown-linux-gnu
  - macOS: rustup target add x86_64-apple-darwin or aarch64-apple-darwin
- Cargo for dependency management
- Logcheck rules directory (e.g., /etc/logcheck from the logcheck-database package)
- Rust compiler with WASM target: rustup target add wasm32-unknown-unknown
- Cargo for Rust dependencies
- Docker for testing against Fluent-Bit
- Optional: WebAssembly Binary Toolkit (wabt) for WASM analysis
Important: Fluent-Bit officially supports only wasm32-unknown-unknown for Rust WASM filters (requires rustc 1.62.1 or later). Other WASM targets like wasm32-wasi are not supported. See Fluent-Bit WASM filter documentation for details.
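For orientation, here is a minimal sketch of the shape an exported filter function for the `wasm32-unknown-unknown` target can take, following the signature used in Fluent Bit's Rust WASM filter example (tag, timestamp, and a JSON record passed as pointer plus length, with a pointer to the output returned). The name `passthrough_filter` is hypothetical and this is not the project's `logcheck_filter_json`. Returning a leaked NUL-terminated buffer is an assumption about what the host expects, so check the Fluent Bit documentation before relying on it.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::slice;

/// Hypothetical filter that returns the JSON record unchanged.
/// (Rust edition 2024 spells the attribute `#[unsafe(no_mangle)]`.)
#[no_mangle]
pub extern "C" fn passthrough_filter(
    tag: *const c_char,
    tag_len: u32,
    _time_sec: u32,
    _time_nsec: u32,
    record: *const c_char,
    record_len: u32,
) -> *const u8 {
    // SAFETY: the caller must pass pointers valid for the given lengths.
    let _tag = unsafe { slice::from_raw_parts(tag as *const u8, tag_len as usize) };
    let record = unsafe { slice::from_raw_parts(record as *const u8, record_len as usize) };

    // Copy into a NUL-terminated buffer and leak it so the pointer stays
    // valid after this function returns (assumed host contract).
    match CString::new(record) {
        Ok(out) => out.into_raw() as *const u8,
        // Interior NUL bytes cannot be represented; signal failure with null.
        Err(_) => std::ptr::null(),
    }
}

fn main() {
    // Native smoke test of the pointer handling, outside any WASM runtime.
    let tag = b"journal.system";
    let rec = br#"{"message":"hello"}"#;
    let out = passthrough_filter(
        tag.as_ptr() as *const c_char,
        tag.len() as u32,
        0,
        0,
        rec.as_ptr() as *const c_char,
        rec.len() as u32,
    );
    let text = unsafe { CStr::from_ptr(out as *const c_char) };
    println!("{}", text.to_string_lossy());
}
```

A real filter would parse the record, classify its message, and return a modified JSON document. The `function_name` setting in the Fluent Bit configuration selects which exported symbol is called.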
The project includes comprehensive GitHub Actions workflows:
- build-and-test.yml: Main build pipeline with testing across multiple architectures
- container.yml: Docker image builds for linux/amd64
- release.yml: Automated releases with multi-platform binaries
- docs.yml: API documentation generation and GitHub Pages deployment
- test-logcheck-rules.yml: Integration tests with production logcheck rules
The CI system automatically builds multiple targets:
CLI Binary Targets:
- x86_64-unknown-linux-gnu (Linux x86_64)
- aarch64-unknown-linux-gnu (Linux ARM64)
- x86_64-apple-darwin (macOS Intel)
- aarch64-apple-darwin (macOS Apple Silicon)
WASM Filter:
- wasm32-unknown-unknown (WebAssembly)
Container Images:
- linux/amd64 (published to GitHub Container Registry)
The project uses a multi-stage Docker build with cargo-chef for optimal dependency caching:
graph TB
chef["chef<br/>Base image with cargo-chef installed<br/><i>rust:1.84-slim</i>"]
chef --> planner["planner<br/>Analyze project & create recipe<br/><i>cargo chef prepare</i>"]
chef --> base["builder-base<br/>Add Rust targets<br/><i>x86_64 + wasm32</i>"]
planner --> |recipe.json| base
base --> native["native-deps<br/>Cook x86_64 dependencies<br/><i>cargo chef cook --target x86_64</i><br/>Cached layer"]
base --> wasm["wasm-deps<br/>Cook WASM dependencies<br/><i>cargo chef cook --target wasm32</i><br/>Cached layer"]
native --> cli["cli-builder<br/>Build native binary<br/><i>cargo build --bin logcheck-filter</i>"]
wasm --> wasmb["wasm-builder<br/>Build WASM library<br/><i>cargo build --lib</i>"]
cli --> final["fluent-bit:4.2.2<br/>Runtime image"]
wasmb --> final
style chef fill:#e1f5ff
style planner fill:#fff3cd
style base fill:#fff3cd
style native fill:#d4edda
style wasm fill:#d4edda
style cli fill:#cce5ff
style wasmb fill:#cce5ff
style final fill:#d1ecf1
Benefits:
- Fast rebuilds: Dependencies cached separately from source code
- Parallel builds: Native and WASM deps build in parallel
- Smaller layers: Only source changes trigger rebuilds
- Smart caching: Recipe layer only rebuilds when Cargo.toml changes
Local Development:
This project uses cargo-xtask for build automation:
# Quick start - show all available commands
cargo xtask --help
# Build everything (CLI + WASM)
cargo xtask build-all --release
# Build specific targets
cargo xtask build-cli --release # CLI for your platform
cargo xtask build-wasm --release # WASM filter
# Build for all platforms (requires cross-compilation setup)
cargo xtask build-all-cli --release
# Install CLI locally
cargo xtask install-cli # Installs to ~/.local/bin
# Generate documentation
cargo xtask docs # API docs + CLI reference + man pages
cargo xtask docs --open # Open API docs in browser
# Testing
cargo test # Unit tests
cargo xtask test-integration # Integration tests
cargo xtask test-json # WASM filter test (Docker)
cargo xtask test-msgpack # WASM filter test (Docker)

See docs/xtask-guide.md for complete xtask documentation.
Cross-compilation setup (for build-all targets):
# Install targets
rustup target add x86_64-unknown-linux-gnu aarch64-unknown-linux-gnu
rustup target add x86_64-apple-darwin aarch64-apple-darwin
# May require additional system dependencies for cross-compilation

Automated CI Testing:
- Format & Lint: Rust formatting and clippy checks
- Unit Tests: All tests on x86_64-unknown-linux-gnu
- Code Coverage: Generated with cargo-tarpaulin
- Security Audit: Vulnerability scanning on PRs
- Binary Size Analysis: Tracks CLI and WASM binary size
- Production Rules: Tests against real logcheck-database package
- Container: Single-architecture image validation
Manual Testing:
# Run all tests
cargo test
# Test CLI with sample logs
echo "Failed password for admin" | ./target/release/logcheck-filter --rules /etc/logcheck stdin
# Test WASM filter with Docker
cargo xtask test-json # Test JSON format
cargo xtask test-msgpack # Test MessagePack format

Expected output:
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
______ _ _ ______ _ _ _____ __
| ___| | | | | ___ (_) | |____ |/ |
| |_ | |_ _ ___ _ __ | |_ | |_/ /_| |_ __ __ / /`| |
| _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / \ \ | |
| | | | |_| | __/ | | | |_ | |_/ / | |_ \ V /.___/ /_| |_
\_| |_|\__,_|\___|_| |_|\__| \____/|_|\__| \_/ \____(_)___/
[2024/07/24 13:12:55] [ info] [fluent bit] version=3.1.2, commit=a6feacd6e9, pid=1
[2024/07/24 13:12:55] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/07/24 13:12:55] [ info] [cmetrics] version=0.9.1
[2024/07/24 13:12:55] [ info] [ctraces ] version=0.5.1
[2024/07/24 13:12:55] [ info] [input:dummy:dummy.0] initializing
[2024/07/24 13:12:55] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/07/24 13:12:55] [ info] [sp] stream processor started
[2024/07/24 13:12:55] [ info] [output:stdout:stdout.0] worker #0 started
[0] dummy.0: [[1721826775.984965222, {}], {"msg"=>"Hello world from rust wasm! π"}]
src/
├── lib.rs                  # WASM filter library
├── rules.rs                # Logcheck rule engine (shared by both WASM and CLI)
├── main.rs                 # CLI entry point
├── cli/
│   ├── mod.rs              # CLI module organization
│   ├── args.rs             # Argument parsing with clap
│   ├── input/              # Input source implementations
│   │   ├── file.rs         # File reader
│   │   ├── stdin.rs        # Stdin reader
│   │   └── journald.rs     # Journald integration (Linux)
│   ├── output/             # Output formatter implementations
│   │   ├── json.rs         # JSON formatter
│   │   └── text.rs         # Text formatter (with colors)
│   └── processor.rs        # Main log processing loop
├── production_test.rs      # Production logcheck rules tests
└── external_test.rs        # Integration tests
See https://chronosphere.io/learn/dynamic-log-routing-on-kubernetes-labels-fluent-bit/ for another example on writing a program to use in the WASM filter, using Go instead of Rust.