Skip to content

Latest commit

 

History

History
186 lines (142 loc) · 5.33 KB

File metadata and controls

186 lines (142 loc) · 5.33 KB

AST-Based Arid Node Detection

This document explains the implementation of Google's AST-based arid node detection algorithm in the Rust mutation testing tool.

Overview

The AST-based filtering prevents generation of "unproductive mutants" by identifying arid nodes - statements that are typically not tested by unit tests and thus generate mutants that survive not because of test quality issues, but because they're not meant to be tested.

Algorithm Implementation

Based on Google's research paper, the algorithm is:

arid(N) = expert(N)                    if simple(N)
        = ∀(arid(c)) = 1, ∀c ∈ N      otherwise (compound nodes)

Where:

  • N is a node in the Abstract Syntax Tree
  • simple(N) determines if a node is simple (no body) or compound (has body/children)
  • expert(N) is manually curated knowledge about which simple nodes are arid

Usage Examples

Basic Usage

# Enable AST filtering (default for C++ files)
mutation-core mutate -f src/wallet/coinselection.cpp

# Disable AST filtering (generate all mutants)
mutation-core mutate -f src/wallet/coinselection.cpp --disable-ast-filtering

# Add custom expert rule
mutation-core mutate -f src/wallet/coinselection.cpp --add-expert-rule "mylog\s*\("

What Gets Filtered Out

1. Memory Management Functions

std::vector<COutput> coins;
coins.reserve(100);           // ← FILTERED (arid)
coins.resize(50);             // ← FILTERED (arid)

2. Logging and Debug Output

LogPrintf("Processing %d coins\n", coins.size());  // ← FILTERED (arid)
std::cout << "Debug: " << value << std::endl;      // ← FILTERED (arid)

3. Performance Monitoring

auto start_time = std::chrono::steady_clock::now(); // ← FILTERED (arid)
auto duration = end_time - start_time;              // ← FILTERED (arid)

4. Bitcoin Core Specific Patterns

if (G_FUZZING) return;                              // ← FILTERED (arid)
strprintf("Amount: %s", FormatMoney(amount));       // ← FILTERED (arid)

5. What DOESN'T Get Filtered

if (amount > target) {           // ← MUTATED (business logic)
    return false;                // ← MUTATED (business logic)
}

CAmount fee = CalculateFee();    // ← MUTATED (important calculation)
bool valid = ValidateInput();    // ← MUTATED (validation logic)

Expert Knowledge Rules

The system includes 100+ pre-defined rules based on common patterns:

Function Call Patterns

  • std::vector<.*>::reserve
  • std::.*::resize
  • LogPrintf\s*\(
  • std::cout\s*<<
  • Memory allocation: malloc, calloc, free
  • Threading: std::thread, std::mutex

Variable Patterns

  • Timing variables: .*_time$, .*_duration$
  • Debug variables: .*_debug$, .*_log$
  • Temporary variables: temp_.*, tmp_.*

Statement Patterns

  • Comments: ^\s*//, ^\s*/\*
  • Preprocessor: ^\s*#
  • Namespace declarations: ^\s*namespace\s+

Custom Expert Rules

Add your own patterns for project-specific arid code:

# Filter out custom logging functions
mutation-core mutate -f src/file.cpp --add-expert-rule "MyLogger::"

# Filter out specific debugging code
mutation-core mutate -f src/file.cpp --add-expert-rule "DEBUG_PRINT\s*\("

# Filter out performance counters
mutation-core mutate -f src/file.cpp --add-expert-rule ".*_counter\+\+"

Performance Impact

Mutation Generation Speed

  • Faster: Fewer lines to process means faster mutation generation
  • Reduced I/O: Fewer mutant files to write

Analysis Speed

  • Faster: Fewer mutants to compile and test
  • Better Focus: Time spent on meaningful mutants

Integration with Existing Features

Works with Coverage-Guided Testing

mutation-core mutate -f src/file.cpp -c coverage.info
# 1. Coverage filtering applied first
# 2. AST filtering applied to covered lines
# 3. Result: only covered, non-arid lines mutated

Works with Line Ranges

mutation-core mutate -f src/file.cpp --range 100 200
# 1. Line range applied first
# 2. AST filtering applied to lines 100-200
# 3. Result: only non-arid lines in range mutated

Works with Skip Lines

mutation-core mutate -f src/file.cpp --skip-lines skip.json
# 1. Skip lines applied first
# 2. AST filtering applied to non-skipped lines
# 3. Result: comprehensive filtering stack

Technical Implementation

AST Node Types

pub enum AstNodeType {
    // Simple nodes (no body)
    FunctionCall,      // func()
    Assignment,        // x = y
    Literal,          // 42, "string"

    // Compound nodes (have body)
    IfStatement,      // if (...) { ... }
    ForLoop,          // for (...) { ... }
    Block,            // { ... }
}

Detection Algorithm

pub fn is_arid(&mut self, node: &AstNode) -> bool {
    if node.is_simple() {
        // Use expert knowledge for simple nodes
        self.expert.is_arid_simple_node(node)
    } else {
        // For compound nodes: ALL children must be arid
        node.children.iter().all(|child| self.is_arid(child))
    }
}

Future Enhancements

  1. Full AST Parser: Integrate with clang or tree-sitter for complete C++ parsing
  2. IDE Integration: Real-time arid node highlighting

References