Skip to content

Finkregh/fluent-bit-logcheck

Repository files navigation

Logcheck Fluent-Bit Filter & CLI Tool

This is a work in progress, might eat your cat!

This project provides both a Fluent-Bit WASM filter and a standalone CLI tool for filtering logs using logcheck rules.

🎯 Overview

Fluent-Bit WASM Filter

Fluentbit (source code) is a popular open-source log-shipping tool that can take logs in from many different sources, filter and process them, then send them on to many different supported outputs.

One of its filtering 'plugins' is the WASM filter, which currently embeds the 'WebAssembly Micro Runtime' (website, github) (see here, here/here, and here in fluentbit source) to facilitate executing WebAssembly (WASM) programs/code to process or transform particular flows of log messages that pass through Fluentbit.

CLI Tool

The logcheck-filter CLI tool provides a standalone way to filter log files using logcheck rules from the logcheck-database package. It can read from files, stdin, or systemd journal, and output filtered results in text or JSON format.

Key Features:

  • βœ… Pure Rust - No C dependencies, runs on Alpine Linux
  • βœ… Multiple input sources - Files, stdin, systemd journald
  • βœ… Flexible output - Text (colored) or JSON format
  • βœ… Production-ready - Uses 1000+ logcheck rules from Debian
  • βœ… Statistics - Processing summaries and match rates
  • βœ… Filtering modes - Show all, violations only, or unmatched entries

πŸ“¦ Installation & Downloads

Pre-built Releases

Download pre-built binaries from GitHub Releases for multiple platforms:

CLI Tools:

  • logcheck-filter-linux-amd64.tar.gz (Linux x86_64, glibc)
  • logcheck-filter-linux-arm64.tar.gz (Linux ARM64, glibc)
  • logcheck-filter-darwin-amd64.tar.gz (macOS Intel)
  • logcheck-filter-darwin-arm64.tar.gz (macOS Apple Silicon)

WASM Filter:

  • fluentbit-wasm-filter.tar.gz (WebAssembly module)

Container Images

# Pull the latest image (linux/amd64)
docker pull ghcr.io/finkregh/fluent-bit-logcheck:latest

πŸš€ Quick Start

Fluent-Bit WASM Filter

Build the WASM filter:

cargo xtask build-wasm --release
# Creates: target/wasm32-unknown-unknown/release/logcheck_fluent_bit_filter.wasm

Basic Configuration (fluent-bit.conf):

[INPUT]
    name        systemd
    tag         journal.system
    read_from_tail on

[FILTER]
    name        wasm
    match       journal.*
    wasm_path   ./target/wasm32-unknown-unknown/release/logcheck_fluent_bit_filter.wasm
    function_name logcheck_filter_json
    accessible_paths .

[OUTPUT]
    name        stdout
    match       *
    format      json_lines

Run Fluent-Bit:

fluent-bit -c fluent-bit.conf

CLI Tool Usage

# Build the CLI tool
cargo build --release --bin logcheck-filter

# Filter a log file
logcheck-filter --rules /etc/logcheck file /var/log/syslog

# Read from stdin
cat /var/log/syslog | logcheck-filter --rules /etc/logcheck stdin

# Read from systemd journal (Linux only)
logcheck-filter --rules /etc/logcheck journald --unit sshd --lines 100

# Show only violations
logcheck-filter --rules /etc/logcheck --show violations file /var/log/auth.log

# JSON output with statistics
logcheck-filter --rules /etc/logcheck --format json --stats file /var/log/syslog

# Colored output
logcheck-filter --rules /etc/logcheck --color file /var/log/syslog

Advanced CLI Examples

Multi-source monitoring:

# Monitor live systemd journal for security events
logcheck-filter --rules /etc/logcheck --show violations --color journald --follow --unit sshd

# Process multiple log files with statistics
for log in /var/log/{auth,syslog,messages}.log; do
    echo "Processing $log:"
    logcheck-filter --rules /etc/logcheck --stats --format json file "$log" | jq -r '.logcheck_category' | sort | uniq -c
done

# Real-time log streaming with filtering
tail -f /var/log/syslog | logcheck-filter --rules /etc/logcheck --color --show violations stdin

Integration with other tools:

# Export violations to CSV for analysis
logcheck-filter --rules /etc/logcheck --format json --show violations file /var/log/auth.log | \
    jq -r '[.message, .logcheck_category, .logcheck_rule_type] | @csv' > security-violations.csv

# Count violations by category
logcheck-filter --rules /etc/logcheck --format json --show violations file /var/log/syslog | \
    jq -r '.logcheck_category' | sort | uniq -c | sort -nr

# Monitor log rates in real-time
logcheck-filter --rules /etc/logcheck --stats journald --follow --lines 0 | \
    grep -o "Processed [0-9]* entries" | \
    while read line; do echo "$(date): $line"; done

Interactive Analyzer (TUI)

Analyze unmatched journald entries and generate regex suggestions:

# Launch analyzer with default minimum group size (2)
logcheck-filter --show unmatched journald analyze

# Require at least 3 similar entries per pattern
logcheck-filter --show unmatched journald analyze --min-group-size 3

Key bindings:

Pattern list

  • ↑/↓ or j/k: Move selection
  • Enter: Open save dialog
  • PgUp/PgDn: Scroll preview
  • q/Esc: Quit analyzer

Save dialog

  • e: Edit regex
  • ←/β†’: Change category (when not editing)
  • ←/β†’: Move cursor (when editing)
  • Enter: Save rule
  • Esc: Cancel (or finish editing)

Rules are written to /etc/logcheck under the appropriate ignore.d.* directory as local-generated with metadata comments.

⚑ Performance & Monitoring

WASM Filter Performance

  • Throughput: ~10,000 log entries/second on modern hardware
  • Memory Usage: ~50MB baseline + 1MB per 1000 logcheck rules
  • Startup Time: 2-3 seconds to compile 1247 production logcheck rules
  • CPU Impact: Adds ~15% CPU overhead compared to native fluent-bit filters

Monitoring Metrics

Monitor these fluent-bit metrics for WASM filter health:

# Check filter processing rate
curl -s http://localhost:2020/api/v1/metrics | grep -E "fluentbit_filter_(add|drop)_records_total"

# Monitor WASM memory usage
curl -s http://localhost:2020/api/v1/metrics | grep "fluentbit_wasm"

Troubleshooting

Common Issues:

  1. WASM Module Loading Fails

    Error: failed to load WASM module
    Solution: Check file path and ensure accessible_paths includes the directory
    
  2. Rules Directory Not Found

    Error: Could not find logcheck rules
    Solution: Ensure /etc/logcheck exists or mount rules directory in container
    
  3. Memory Exhaustion

    Error: WASM execution failed
    Solution: Increase fluent-bit memory limits or reduce rule set size
    

Debug Mode:

[FILTER]
    name        wasm
    match       *
    wasm_path   ./logcheck_fluent_bit_filter.wasm  
    function_name logcheck_filter_json
    accessible_paths .
    # Enable debug logging
    log_level   debug

Optimization Tips

  1. Rule Chunking: Large rule sets are automatically chunked for better performance
  2. Input Filtering: Use fluent-bit match patterns to process only relevant logs
  3. Memory Tuning: Increase WASM stack size in .cargo/config.toml for complex regex
  4. Caching: Rules are compiled once at startup and cached for the session

Examples

Filter violations from SSH logs:

logcheck-filter --rules /etc/logcheck --show violations file /var/log/auth.log

Output:

Loading logcheck rules from: /etc/logcheck
Loaded 1247 rules across 8 categories
Reading from: /var/log/auth.log
[VIOLATION] Jan 01 10:00:00 host sshd[1234]: Failed password for invalid user admin from 192.168.1.100
[CRACKING] Jan 01 10:05:00 host sshd[5678]: Invalid user root from 192.168.1.200

JSON output for programmatic processing:

logcheck-filter --rules /etc/logcheck --format json file /var/log/syslog

Output:

{"message":"Jan 01 10:00:00 host sshd[1234]: Failed password for admin","matched":true,"category":"Violations","rule_type":"violations"}
{"message":"Jan 01 10:01:00 host systemd[1]: Started Session 42","matched":true,"category":"SystemEvents","rule_type":"ignore"}
{"message":"Jan 01 10:02:00 host unknown: weird message","matched":false,"category":null,"rule_type":"unmatched"}

πŸ”§ Production Fluent-Bit Configurations

Multiple Input Sources

System Logs Pipeline:

[INPUT]
    name        systemd
    tag         journal.system
    read_from_tail on
    strip_underscores on
    lowercase on

[INPUT]
    name        tail
    path        /var/log/syslog
    tag         file.syslog
    parser      syslog-rfc3164
    read_from_head false

[INPUT]
    name        syslog
    port        514
    tag         network.syslog
    parser      syslog-rfc3164

[FILTER]
    name        wasm
    match       *
    wasm_path   /opt/fluent-bit/filters/logcheck_fluent_bit_filter.wasm
    function_name logcheck_filter_json
    accessible_paths /etc/logcheck

[OUTPUT]
    name        forward
    match       *
    host        log-aggregator.company.com
    port        24224

Security-Focused Configuration

Route by logcheck classifications:

[INPUT]
    name        systemd
    tag         journal.security
    systemd_filter _TRANSPORT=audit
    systemd_filter _SYSTEMD_UNIT=sshd.service

[FILTER] 
    name        wasm
    match       journal.security
    wasm_path   /opt/fluent-bit/filters/logcheck_fluent_bit_filter.wasm
    function_name logcheck_filter_json
    accessible_paths /etc/logcheck

# Route violations to security team
[OUTPUT]
    name        file
    match_regex journal\.security.*
    path        /var/log/security-violations.log
    format      json_lines
    # Add conditional routing based on logcheck_category field

# Route normal events to standard aggregation  
[OUTPUT]
    name        forward
    match       journal.security
    host        central-logs.company.com
    port        24224

Container Deployment

Container Deployment

Docker Compose Example:

version: '3.8'
services:
  fluent-bit:
    image: ghcr.io/finkregh/fluent-bit-logcheck:latest
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - /etc/logcheck:/etc/logcheck:ro
      - /var/log:/var/log:ro
      - /run/systemd/journal:/run/systemd/journal:ro
    ports:
      - "24224:24224"
    cap_add:
      - SYS_PTRACE  # For systemd journal access

The project publishes container images to GitHub Container Registry for linux/amd64 platform.

Kubernetes Deployment:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit-logcheck
spec:
  selector:
    matchLabels:
      name: fluent-bit-logcheck
  template:
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:latest
        volumeMounts:
        - name: config
          mountPath: /fluent-bit/etc/
        - name: wasm-filter
          mountPath: /opt/filters/
        - name: logcheck-rules
          mountPath: /etc/logcheck
        - name: varlog
          mountPath: /var/log
        - name: journal
          mountPath: /run/systemd/journal
      volumes:
      - name: config
        configMap:
          name: fluent-bit-config
      - name: wasm-filter
        configMap:
          name: logcheck-wasm-filter
      - name: logcheck-rules
        configMap:
          name: logcheck-rules
      - name: varlog
        hostPath:
          path: /var/log
      - name: journal
        hostPath:
          path: /run/systemd/journal

Setup / Dependencies

For CLI Tool

  • Rust compiler with your target installed:
    • Linux: rustup target add x86_64-unknown-linux-gnu or aarch64-unknown-linux-gnu
    • macOS: rustup target add x86_64-apple-darwin or aarch64-apple-darwin
  • Cargo for dependency management
  • Logcheck rules directory (e.g., /etc/logcheck from logcheck-database package)

For WASM Filter

  • Rust compiler with WASM target: rustup target add wasm32-unknown-unknown
  • Cargo for Rust dependencies
  • Docker for testing against Fluent-Bit
  • Optional: WebAssembly Binary Toolkit (wabt) for WASM analysis

Important: Fluent-Bit officially supports only wasm32-unknown-unknown for Rust WASM filters (requires rustc 1.62.1 or later). Other WASM targets like wasm32-wasi are not supported. See Fluent-Bit WASM filter documentation for details.

CI/CD Pipeline

The project includes comprehensive GitHub Actions workflows:

  • build-and-test.yml: Main build pipeline with testing across multiple architectures
  • container.yml: Docker image builds for linux/amd64
  • release.yml: Automated releases with multi-platform binaries
  • docs.yml: API documentation generation and GitHub Pages deployment
  • test-logcheck-rules.yml: Integration tests with production logcheck rules

Building

The CI system automatically builds multiple targets:

CLI Binary Targets:

  • x86_64-unknown-linux-gnu (Linux x86_64)
  • aarch64-unknown-linux-gnu (Linux ARM64)
  • x86_64-apple-darwin (macOS Intel)
  • aarch64-apple-darwin (macOS Apple Silicon)

WASM Filter:

  • wasm32-unknown-unknown (WebAssembly)

Container Images:

  • linux/amd64 (published to GitHub Container Registry)

Docker Build Architecture

The project uses a multi-stage Docker build with cargo-chef for optimal dependency caching:

graph TB
    chef["chef<br/>Base image with cargo-chef installed<br/><i>rust:1.84-slim</i>"]
    
    chef --> planner["planner<br/>Analyze project & create recipe<br/><i>cargo chef prepare</i>"]
    chef --> base["builder-base<br/>Add Rust targets<br/><i>x86_64 + wasm32</i>"]
    
    planner --> |recipe.json| base
    
    base --> native["native-deps<br/>Cook x86_64 dependencies<br/><i>cargo chef cook --target x86_64</i><br/>πŸ”„ Cached layer"]
    base --> wasm["wasm-deps<br/>Cook WASM dependencies<br/><i>cargo chef cook --target wasm32</i><br/>πŸ”„ Cached layer"]
    
    native --> cli["cli-builder<br/>Build native binary<br/><i>cargo build --bin logcheck-filter</i>"]
    wasm --> wasmb["wasm-builder<br/>Build WASM library<br/><i>cargo build --lib</i>"]
    
    cli --> final["fluent-bit:4.2.2<br/>Runtime image"]
    wasmb --> final
    
    style chef fill:#e1f5ff
    style planner fill:#fff3cd
    style base fill:#fff3cd
    style native fill:#d4edda
    style wasm fill:#d4edda
    style cli fill:#cce5ff
    style wasmb fill:#cce5ff
    style final fill:#d1ecf1
Loading

Benefits:

  • πŸš€ Fast rebuilds: Dependencies cached separately from source code
  • ⚑ Parallel builds: Native and WASM deps build in parallel
  • πŸ“¦ Smaller layers: Only source changes trigger rebuilds
  • πŸ”„ Smart caching: Recipe layer only rebuilds when Cargo.toml changes

Local Development:

This project uses cargo-xtask for build automation:

# Quick start - show all available commands
cargo xtask --help

# Build everything (CLI + WASM)
cargo xtask build-all --release

# Build specific targets
cargo xtask build-cli --release     # CLI for your platform
cargo xtask build-wasm --release    # WASM filter

# Build for all platforms (requires cross-compilation setup)
cargo xtask build-all-cli --release

# Install CLI locally
cargo xtask install-cli  # Installs to ~/.local/bin

# Generate documentation
cargo xtask docs         # API docs + CLI reference + man pages
cargo xtask docs --open  # Open API docs in browser

# Testing
cargo test                          # Unit tests
cargo xtask test-integration        # Integration tests
cargo xtask test-json               # WASM filter test (Docker)
cargo xtask test-msgpack            # WASM filter test (Docker)

See docs/xtask-guide.md for complete xtask documentation.

Cross-compilation setup (for build-all targets):

# Install targets
rustup target add x86_64-unknown-linux-gnu aarch64-unknown-linux-gnu
rustup target add x86_64-apple-darwin aarch64-apple-darwin  

# May require additional system dependencies for cross-compilation

Testing

Automated CI Testing:

  • Format & Lint: Rust formatting and clippy checks
  • Unit Tests: All tests on x86_64-unknown-linux-gnu
  • Code Coverage: Generated with cargo-tarpaulin
  • Security Audit: Vulnerability scanning on PRs
  • Binary Size Analysis: Tracks CLI and WASM binary size
  • Production Rules: Tests against real logcheck-database package
  • Container: Single-architecture image validation

Manual Testing:

# Run all tests
cargo test

# Test CLI with sample logs
echo "Failed password for admin" | ./target/release/logcheck-filter --rules /etc/logcheck stdin

# Test WASM filter with Docker
cargo xtask test-json     # Test JSON format
cargo xtask test-msgpack  # Test MessagePack format

Expected output:

* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _           _____  __  
|  ___| |                | |   | ___ (_) |         |____ |/  | 
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`| | 
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \ | | 
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /_| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)___/

[2024/07/24 13:12:55] [ info] [fluent bit] version=3.1.2, commit=a6feacd6e9, pid=1
[2024/07/24 13:12:55] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/07/24 13:12:55] [ info] [cmetrics] version=0.9.1
[2024/07/24 13:12:55] [ info] [ctraces ] version=0.5.1
[2024/07/24 13:12:55] [ info] [input:dummy:dummy.0] initializing
[2024/07/24 13:12:55] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/07/24 13:12:55] [ info] [sp] stream processor started
[2024/07/24 13:12:55] [ info] [output:stdout:stdout.0] worker #0 started
[0] dummy.0: [[1721826775.984965222, {}], {"msg"=>"Hello world from rust wasm! πŸ™‚"}]

Misc

πŸ“ Project Structure

src/
β”œβ”€β”€ lib.rs              # WASM filter library
β”œβ”€β”€ rules.rs            # Logcheck rule engine (shared by both WASM and CLI)
β”œβ”€β”€ main.rs             # CLI entry point
β”œβ”€β”€ cli/
β”‚   β”œβ”€β”€ mod.rs          # CLI module organization
β”‚   β”œβ”€β”€ args.rs         # Argument parsing with clap
β”‚   β”œβ”€β”€ input/          # Input source implementations
β”‚   β”‚   β”œβ”€β”€ file.rs     # File reader
β”‚   β”‚   β”œβ”€β”€ stdin.rs    # Stdin reader
β”‚   β”‚   └── journald.rs # Journald integration (Linux)
β”‚   β”œβ”€β”€ output/         # Output formatter implementations
β”‚   β”‚   β”œβ”€β”€ json.rs     # JSON formatter
β”‚   β”‚   └── text.rs     # Text formatter (with colors)
β”‚   └── processor.rs    # Main log processing loop
β”œβ”€β”€ production_test.rs  # Production logcheck rules tests
└── external_test.rs    # Integration tests

πŸ”— Related Resources

See https://chronosphere.io/learn/dynamic-log-routing-on-kubernetes-labels-fluent-bit/ for another example on writing a program to use in the WASM filter, using Go instead of Rust.

About

Filter plugin for fluent-bit which uses logcheck rules

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors