1 change: 1 addition & 0 deletions .gitignore
@@ -40,6 +40,7 @@ hs_err_pid*

target
.DS_Store
dependency-reduced-pom.xml

venv

151 changes: 151 additions & 0 deletions benchmarks/README.md
@@ -0,0 +1,151 @@
# Mork Benchmarks

This module contains JMH (Java Microbenchmark Harness) benchmarks for the Mork framework, focusing on performance comparisons between the standard Java `HashSet` and the custom `BitSet` implementation.

## Benchmark Tests

### SetAddBenchmark
Compares the performance of adding elements to HashSet vs BitSet across different data sizes (100, 1000, 10000 elements) and fill ratios (50%, 90%).

**Measures:** Average time per benchmark invocation in nanoseconds (each invocation creates a new set and adds every generated element)
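
The fill ratio determines how many values the benchmark generates (`size * fillRatio`). Because the setup draws them with `random.nextInt(capacity)`, repeats are possible, so the number of distinct elements that end up in the set can be lower than the nominal ratio suggests. The sketch below is not part of the benchmark module; the class and method names (`FillRatioSketch`, `drawWithReplacement`, `drawDistinct`) are illustrative assumptions. It compares drawing with replacement against a shuffle-based draw that guarantees distinct values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class FillRatioSketch {

    /** Draws count values from [0, capacity) with replacement; duplicates are possible. */
    static int[] drawWithReplacement(int capacity, int count, long seed) {
        Random random = new Random(seed);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            out[i] = random.nextInt(capacity);
        }
        return out;
    }

    /** Returns count distinct values from [0, capacity) by shuffling the full range. */
    static int[] drawDistinct(int capacity, int count, long seed) {
        List<Integer> all = new ArrayList<>(capacity);
        for (int i = 0; i < capacity; i++) {
            all.add(i);
        }
        Collections.shuffle(all, new Random(seed));
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            out[i] = all.get(i);
        }
        return out;
    }

    /** Counts how many different values the array contains. */
    static int distinctCount(int[] values) {
        Set<Integer> seen = new HashSet<>();
        for (int v : values) {
            seen.add(v);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int capacity = 1000;
        int count = (int) (capacity * 0.9);
        System.out.println("with replacement: "
                + distinctCount(drawWithReplacement(capacity, count, 42L)) + " distinct values");
        System.out.println("shuffled:         "
                + distinctCount(drawDistinct(capacity, count, 42L)) + " distinct values");
    }
}
```

Whether the benchmark should use distinct values depends on what it is meant to model; drawing with replacement also exercises the "element already present" path of `add`.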

### SetContainsBenchmark
Compares the performance of lookup/contains operations between HashSet and BitSet.

**Measures:** Average time for 1000 lookups in nanoseconds
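
Since the reported score covers a batch of 1000 lookups, it has to be divided by the batch size to obtain a per-lookup cost. A minimal sketch of that arithmetic follows; the class name and the input value are made up for illustration and are not measured results.

```java
public class ScoreConversionSketch {

    /** Number of lookups performed per benchmark invocation, as stated above. */
    static final int LOOKUPS_PER_INVOCATION = 1000;

    /** Converts a score in nanoseconds per invocation into nanoseconds per lookup. */
    static double nanosPerLookup(double scoreNanosPerInvocation) {
        return scoreNanosPerInvocation / LOOKUPS_PER_INVOCATION;
    }

    /** Converts the same score into an approximate number of lookups per second. */
    static double lookupsPerSecond(double scoreNanosPerInvocation) {
        return 1_000_000_000.0 / nanosPerLookup(scoreNanosPerInvocation);
    }

    public static void main(String[] args) {
        double score = 2500.0; // hypothetical score, not a real measurement
        System.out.printf("%.2f ns per lookup%n", nanosPerLookup(score));
        System.out.printf("%.0f lookups per second%n", lookupsPerSecond(score));
    }
}
```

The per-lookup figure also includes loop and counter overhead, so it is best used to compare the two implementations rather than as an absolute cost.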

### SetIterationBenchmark
Compares the performance of iterating over all elements in HashSet vs BitSet.

**Measures:** Average time to iterate over all elements in nanoseconds

### SetRemoveBenchmark
Compares the performance of removing elements from HashSet vs BitSet.

**Measures:** Average time per operation in nanoseconds

### SetMemoryBenchmark
Measures memory allocation patterns between HashSet and BitSet across different sizes (1000, 10000, 100000 elements).

**Measures:** Average and sample time, allocation rate tracking

### SetMixedOperationsBenchmark
Simulates realistic workloads with mixed operations (add, contains, remove) to compare overall performance.

**Measures:** Average time per mixed workload in microseconds

## Building the Benchmarks

To build the benchmark uber JAR:

```bash
cd benchmarks
mvn clean package
```

This will create `target/benchmarks.jar` containing all benchmarks and dependencies.

## Running Benchmarks

### Run All Benchmarks

```bash
java -jar target/benchmarks.jar
```

### Run Specific Benchmark

```bash
# Run only add benchmarks
java -jar target/benchmarks.jar SetAddBenchmark

# Run only contains benchmarks
java -jar target/benchmarks.jar SetContainsBenchmark
```

### Run with Custom Parameters

```bash
# Run with specific parameters
java -jar target/benchmarks.jar SetAddBenchmark -p size=1000 -p fillRatio=0.9

# Run with specific warmup and measurement iterations
java -jar target/benchmarks.jar -wi 5 -i 10
```

### Run with Profilers

JMH includes several profilers to get additional insights:

```bash
# List available profilers
java -jar target/benchmarks.jar -lprof

# Run with GC profiler
java -jar target/benchmarks.jar -prof gc

# Run with stack profiler
java -jar target/benchmarks.jar -prof stack

# Run with async-profiler (requires the async-profiler native library; adjust libPath)
java -jar target/benchmarks.jar -prof "async:libPath=/path/to/libasyncProfiler.so;output=flamegraph"
```

### Save Results

```bash
# Save results to JSON
java -jar target/benchmarks.jar -rf json -rff results.json

# Save results to CSV
java -jar target/benchmarks.jar -rf csv -rff results.csv
```

## Understanding Results

The benchmarks measure:

- **CPU Performance**: The benchmarks report execution time (JMH average time per invocation, not throughput), so lower scores are better
- **Memory Impact**: SetMemoryBenchmark runs with specific GC settings to track allocation patterns
- **Different Use Cases**:
  - Small datasets (100-1000 elements): Typical for small optimization problems
  - Medium datasets (10000 elements): Common in many algorithms
  - Large datasets (100000 elements): Stress testing for memory benchmarks

### Expected Results

**BitSet Advantages:**
- Lower memory footprint for dense sets (high fill ratios)
- Faster operations when elements are small integers
- More predictable performance characteristics

**HashSet Advantages:**
- Better for sparse sets (low fill ratios)
- Can handle any object type, not just integers
- No need to declare a maximum capacity up front
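
To illustrate why a bit set favors dense data, the sketch below estimates the payload of a bit set sized for a given capacity. It uses the JDK's `java.util.BitSet` as a stand-in, on the assumption that Mork's `BitSet` also stores one bit per possible element; the actual layout of the Mork class may differ, and the class name `BitSetFootprintSketch` is illustrative.

```java
import java.util.BitSet;

public class BitSetFootprintSketch {

    /**
     * Bytes occupied by the backing words of a java.util.BitSet created for
     * the given number of bits. Object headers are not included.
     */
    static long bitSetPayloadBytes(int capacity) {
        // size() reports the number of bits actually allocated (a multiple of 64).
        return new BitSet(capacity).size() / 8L;
    }

    public static void main(String[] args) {
        for (int capacity : new int[] {1_000, 10_000, 100_000}) {
            System.out.println(capacity + " possible elements -> "
                    + bitSetPayloadBytes(capacity) + " bytes of payload");
        }
    }
}
```

The payload depends only on the capacity, not on how many elements are present. A `HashSet<Integer>` instead pays a per-element cost for each entry, so it tends to win only when few of the possible values are actually stored.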

## JMH Options

Common JMH command-line options:

- `-h`: Display help
- `-l`: List available benchmarks
- `-lprof`: List available profilers
- `-wi <count>`: Number of warmup iterations
- `-i <count>`: Number of measurement iterations
- `-f <count>`: Number of forks
- `-t <count>`: Number of threads
- `-p <param>=<value>`: Set benchmark parameter

## Requirements

- Java 21 or higher
- Maven 3.6+
- At least 2GB of available memory for running benchmarks

## Notes

- Benchmarks may take several minutes to complete
- Results can vary based on JVM version, hardware, and system load
- For production decisions, always run benchmarks on target hardware
- The benchmarks use a fixed random seed (42) for reproducibility
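
As a small illustration of the last point, two `java.util.Random` instances created with the same seed produce the same sequence, which is what keeps the generated inputs identical across runs. The class and method names below are illustrative and not part of the benchmark module.

```java
import java.util.Arrays;
import java.util.Random;

public class SeedSketch {

    /** Draws count values from [0, bound) using a generator created from the given seed. */
    static int[] draw(long seed, int count, int bound) {
        Random random = new Random(seed);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            out[i] = random.nextInt(bound);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] first = draw(42L, 5, 100);
        int[] second = draw(42L, 5, 100);
        // Same seed, same sequence.
        System.out.println("identical inputs: " + Arrays.equals(first, second));
    }
}
```

A fixed seed makes the inputs reproducible; the measured times still vary with hardware, JVM version, and system load.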
92 changes: 92 additions & 0 deletions benchmarks/pom.xml
@@ -0,0 +1,92 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <!-- Project name and current version -->
    <artifactId>mork-benchmarks</artifactId>
    <version>0.22-SNAPSHOT</version>
    <name>mork-benchmarks</name>
    <description>Benchmarking tests for Mork framework using JMH</description>

    <!-- Parent configuration -->
    <parent>
        <groupId>es.urjc.etsii.grafo</groupId>
        <artifactId>mork-parent</artifactId>
        <version>0.22-SNAPSHOT</version>
    </parent>

    <properties>
        <jmh.version>1.37</jmh.version>
        <uberjar.name>benchmarks</uberjar.name>
        <maven.compiler.source>21</maven.compiler.source>
        <maven.compiler.target>21</maven.compiler.target>
    </properties>

    <dependencies>
        <!-- Mork Common library with BitSet implementation -->
        <dependency>
            <groupId>es.urjc.etsii.grafo</groupId>
            <artifactId>mork-common</artifactId>
        </dependency>

        <!-- JMH Core -->
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-core</artifactId>
            <version>${jmh.version}</version>
        </dependency>

        <!-- JMH Annotation Processor -->
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-generator-annprocess</artifactId>
            <version>${jmh.version}</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <!-- Compiler plugin -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
            </plugin>

            <!-- Shade plugin to create an uber jar for benchmarks -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.6.0</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <finalName>${uberjar.name}</finalName>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>org.openjdk.jmh.Main</mainClass>
                                </transformer>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                            </transformers>
                            <filters>
                                <filter>
                                    <!-- Shading signed JARs will fail without this. -->
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
@@ -0,0 +1,62 @@
package es.urjc.etsii.grafo.benchmarks;

import es.urjc.etsii.grafo.util.collections.BitSet;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

import java.util.HashSet;
import java.util.Random;
import java.util.concurrent.TimeUnit;

/**
 * Benchmark comparing add operation performance between HashSet and BitSet
 * for different data sizes.
 */
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 3, time = 2)
@Measurement(iterations = 5, time = 3)
@Fork(1)
public class SetAddBenchmark {

    @Param({"100", "1000", "10000"})
    private int size;

    @Param({"0.5", "0.9"}) // Fill ratio
    private double fillRatio;

    private int[] elements;
    private int capacity;

    @Setup(Level.Trial)
    public void setup() {
        capacity = size;
        int numElements = (int) (size * fillRatio);
        elements = new int[numElements];
        Random random = new Random(42);

        // Generate random elements in [0, capacity). Values are drawn with
        // replacement, so duplicates are possible and the resulting set may
        // hold fewer than numElements distinct entries.
        for (int i = 0; i < numElements; i++) {
            elements[i] = random.nextInt(capacity);
        }
    }

    @Benchmark
    public void hashSetAdd(Blackhole blackhole) {
        HashSet<Integer> set = new HashSet<>(capacity);
        for (int element : elements) {
            set.add(element);
        }
        blackhole.consume(set);
    }

    @Benchmark
    public void bitSetAdd(Blackhole blackhole) {
        BitSet set = new BitSet(capacity);
        for (int element : elements) {
            set.add(element);
        }
        blackhole.consume(set);
    }
}
@@ -0,0 +1,77 @@
package es.urjc.etsii.grafo.benchmarks;

import es.urjc.etsii.grafo.util.collections.BitSet;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

import java.util.HashSet;
import java.util.Random;
import java.util.concurrent.TimeUnit;

/**
 * Benchmark comparing contains/lookup operation performance between HashSet and BitSet.
 */
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 3, time = 2)
@Measurement(iterations = 5, time = 3)
@Fork(1)
public class SetContainsBenchmark {

    @Param({"100", "1000", "10000"})
    private int size;

    @Param({"0.5", "0.9"}) // Fill ratio
    private double fillRatio;

    private HashSet<Integer> hashSet;
    private BitSet bitSet;
    private int[] lookupElements;
    private int capacity;

    @Setup(Level.Trial)
    public void setup() {
        capacity = size;
        int numElements = (int) (size * fillRatio);
        Random random = new Random(42);

        // Create and populate sets
        hashSet = new HashSet<>(capacity);
        bitSet = new BitSet(capacity);

        for (int i = 0; i < numElements; i++) {
            int element = random.nextInt(capacity);
            hashSet.add(element);
            bitSet.add(element);
        }

        // Create lookup elements (mix of existing and non-existing)
        lookupElements = new int[1000];
        for (int i = 0; i < lookupElements.length; i++) {
            lookupElements[i] = random.nextInt(capacity);
        }
    }

    @Benchmark
    public void hashSetContains(Blackhole blackhole) {
        int count = 0;
        for (int element : lookupElements) {
            if (hashSet.contains(element)) {
                count++;
            }
        }
        blackhole.consume(count);
    }

    @Benchmark
    public void bitSetContains(Blackhole blackhole) {
        int count = 0;
        for (int element : lookupElements) {
            if (bitSet.contains(element)) {
                count++;
            }
        }
        blackhole.consume(count);
    }
}