docs: add LSQB benchmark tutorial and example scripts #156

longbinlai merged 16 commits into alibaba:main
Conversation
Add comprehensive tutorial and benchmark scripts for reproducing the LSQB (Labelled Subgraph Query Benchmark) performance results.

- Add tutorial documentation at doc/source/tutorials/lsqb_benchmark.rst
- Add benchmark scripts at examples/lsqb_benchmark/
- Include data loading, query execution, and result reporting
- Provide dataset download link and reproducibility instructions

Dataset available at: https://neug.oss-cn-hangzhou.aliyuncs.com/datasets/ldbc-snb-sf1-lsqb.tar.gz

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Review Summary by Qodo

Add LSQB benchmark tutorial and example scripts for NeuG
Walkthrough

Description:

- Add comprehensive LSQB benchmark tutorial with 9 complex subgraph queries
- Implement complete benchmark script for LDBC SNB SF1 dataset loading and execution
- Include CSV preprocessing utilities for data schema compatibility
- Provide reproducibility instructions and expected performance results

Diagram:

```mermaid
flowchart LR
    A["LDBC SNB SF1<br/>Dataset"] -->|"CSV Preprocessing"| B["Derived CSVs<br/>with Headers"]
    B -->|"COPY Statements"| C["NeuG Database<br/>Schema & Data"]
    C -->|"9 LSQB Queries"| D["Benchmark Results<br/>JSON Report"]
    E["Tutorial Doc"] -->|"References"| F["Benchmark Scripts"]
    F -->|"Executes"| D
```
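The "CSV Preprocessing" step (turning the headerless LDBC CSVs into derived CSVs with headers) can be sketched roughly as below. This is an illustration only, not the actual `preprocess_csvs()` from the benchmark script; the pipe delimiter follows LDBC SNB CSV conventions, and the column names in the usage comment are hypothetical.

```python
import csv

def add_header(src_path, dst_path, columns):
    """Copy a pipe-delimited, headerless LDBC CSV and prepend a header row."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="|")
        writer.writerow(columns)  # derived CSV now starts with the schema header
        writer.writerows(csv.reader(src, delimiter="|"))

# Hypothetical usage with a minimal three-column PERSON schema:
# add_header("person_0_0.csv", "person_with_header.csv",
#            ["id", "firstName", "lastName"])
```

The derived files can then be referenced directly by COPY statements that expect a header row.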
File Changes

1. `examples/lsqb_benchmark/run_neug_benchmark.py`
Code Review by Qodo
- Add --force flag and safety checks for database deletion
- Fix script name in tutorial (run_neug_benchmark.py)
- Fix preprocess_csvs() to only add mapping after source validation
- Fix print_results() to handle result == 0 correctly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Convert LSQB benchmark tutorial from RST to Markdown
- Add LDBC SNB Interactive benchmark tutorial (NeuG vs Neo4j)
- Add _meta.ts for Nextra support
- Create ldbc_interactive_benchmark example scripts
- Both tutorials support Sphinx and Nextra

Embedded Mode (lsqb-benchmark-embedded.md):
- NeuG vs LadybugDB comparison
- LSQB SF1 benchmark queries

Service Mode (ldbc-interactive-benchmark-service.md):
- NeuG vs Neo4j comparison
- LDBC SNB Interactive queries IC1-IC14
- Throughput and latency benchmarks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the duplicate command example; keep only the basic command and reference the CLI options table for the --force flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
> @@ -0,0 +1,140 @@
> # LDBC SNB Interactive Benchmark: NeuG vs Neo4j (Service Mode)
>
> This tutorial demonstrates how to reproduce the LDBC SNB Interactive Benchmark performance results comparing NeuG with Neo4j in service mode.
This is not the complete LDBC SNB Interactive Benchmark; we only took its complex read workload for evaluation. We will release the complete LDBC SNB Interactive Benchmark later.
> ## Files
>
> - `run_neug_benchmark.py` - Main benchmark script for NeuG
Why is it called run_interactive_benchmark above but run_neug_benchmark below? If anything, it should be called run_lsqb_benchmark.
…ries

Add a note to explain that this tutorial covers only the complex read queries (IC1-IC14) from the LDBC SNB Interactive Benchmark, not the complete benchmark with write operations.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Rename benchmark scripts to use consistent naming:

- examples/ldbc_interactive_benchmark/run_interactive_benchmark.py -> run_benchmark.py
- examples/lsqb_benchmark/run_neug_benchmark.py -> run_benchmark.py

Since the scripts are already in separate directories (ldbc_interactive_benchmark/ and lsqb_benchmark/), using a unified name makes the structure cleaner and more consistent. Also update all references in documentation and README files.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Combine two separate tutorials into one unified document:

- benchmark-neug-dual-mode.md covers both embedded and service modes
- Unified dataset download section
- Clear separation between LSQB (embedded) and LDBC Interactive (service)
- Consolidated 'Why NeuG is Faster' section

This provides a cleaner, more cohesive view of NeuG's dual-mode capabilities.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Update the expected results table based on actual benchmark data:

- NeuG wins 7/9 queries (not 9/9)
- LadybugDB wins Q6 and Q9 with multi-threading advantage
- Correct speedup numbers: Q3 279.5x, Q2 79.3x (not 287x, 91x)

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- NeuG wins all 9 queries (not 7/9)
- Q6: NeuG 3.2x faster than LadybugDB (not slower)
- Q9: NeuG 1.7x faster than LadybugDB (not slower)

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- NeuG wins 8/9 queries (not 9/9)
- Q6: LadybugDB is 3.2x faster than NeuG (0.48s vs 0.15s)
- Q9: NeuG is 1.7x faster than LadybugDB (0.60s vs 1.02s)
- Updated all speedup values to match generate_charts_v5.py

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
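The back-and-forth on these commits comes down to a per-query ratio of runtimes. A minimal sketch of that calculation, using only the Q6 and Q9 timings quoted above (NeuG 0.48s vs LadybugDB 0.15s; NeuG 0.60s vs LadybugDB 1.02s); the function name is illustrative, not taken from the benchmark scripts:

```python
def speedups(other_times, neug_times):
    """Per-query speedup of NeuG over the other system (>1 means NeuG is faster)."""
    return {q: other_times[q] / neug_times[q]
            for q in other_times if q in neug_times}

# Timings (seconds) quoted in the review thread for Q6 and Q9.
ladybug = {6: 0.15, 9: 1.02}
neug = {6: 0.48, 9: 0.60}
result = speedups(ladybug, neug)
# Q9 ratio is above 1 (NeuG faster); Q6 ratio is below 1 (LadybugDB faster).
```

Keeping the direction of the ratio explicit is what resolves the "3.2x faster, but for which system?" confusion in the commits above.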
```python
    2: """
        MATCH (person1:PERSON)-[:KNOWS]->(person2:PERSON),
              (person1)<-[:HASCREATOR]-(comment:COMMENT)
                  -[:REPLYOF]->(post:POST)-[:HASCREATOR]->(person2)
        RETURN count(*) AS count
    """,
```
There is a difference compared to LSQB Q2 provided by LDBC: https://github.com/ldbc/lsqb/blob/main/cypher/q2.cypher
```python
        MATCH (person1:PERSON)-[:KNOWS]->(person2:PERSON)
            -[:KNOWS]->(person3:PERSON)-[:HASINTEREST]->(:TAG)
        WHERE id(person1) <> id(person3)
        RETURN count(*) AS count
    """,
```
```python
    3: """
        MATCH (country:PLACE {type: 'country'})
        MATCH (person1:PERSON)-[:ISLOCATEDIN]->(city1:PLACE)-[:ISPARTOF]->(country)
        MATCH (person2:PERSON)-[:ISLOCATEDIN]->(city2:PLACE)-[:ISPARTOF]->(country)
        MATCH (person3:PERSON)-[:ISLOCATEDIN]->(city3:PLACE)-[:ISPARTOF]->(country)
        MATCH (person1)-[:KNOWS]->(person2)-[:KNOWS]->(person3)-[:KNOWS]->(person1)
        RETURN count(*) AS count
    """,
```
```python
    9: """
        MATCH (person1:PERSON)-[:KNOWS]->(person2:PERSON)
            -[:KNOWS]->(person3:PERSON)-[:HASINTEREST]->(:TAG)
        WHERE NOT (person1)-[:KNOWS]->(person3) AND id(person1) <> id(person3)
        RETURN count(*) AS count
    """,
```
The original LSQB benchmark assumes bidirectional KNOWS edges. We modified queries to use directed traversal (-[:KNOWS]->) to allow the same LDBC SNB SF1 dataset to be used for both SNB Interactive and LSQB benchmarks, since LDBC SNB KNOWS edges are unidirectional. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

> [LSQB](https://github.com/ldbc/lsqb) contains 9 complex subgraph matching queries that lean toward analytical workloads. This benchmark compares NeuG with LadybugDB in embedded mode.
>
> > **Note on KNOWS Edges**: The original LSQB benchmark assumes KNOWS relationships are bidirectional (i.e., if A knows B, then B also knows A). In our tests, we modified all queries involving KNOWS edges to use directed traversal (`-[:KNOWS]->`). This adjustment allows the **same LDBC SNB SF1 dataset to be used for both SNB Interactive and LSQB benchmarks**, since the KNOWS relationships in the original LDBC SNB data are unidirectional. This modification does not affect the fairness of evaluating graph database query optimization and execution capabilities.
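The per-query latencies reported by the benchmark can be collected with a small warm-up-then-median harness like the sketch below. Here `execute` stands in for whatever embedded-mode query API the actual run_benchmark.py script uses, and the function name is illustrative, so treat this as an outline of the measurement loop rather than the real implementation:

```python
import statistics
import time

def time_queries(execute, queries, warmups=1, runs=3):
    """Run each query several times and report the median wall-clock latency."""
    medians = {}
    for qid, text in sorted(queries.items()):
        for _ in range(warmups):      # warm caches before measuring
            execute(text)
        samples = []
        for _ in range(runs):
            start = time.perf_counter()
            execute(text)
            samples.append(time.perf_counter() - start)
        medians[qid] = statistics.median(samples)
    return medians
```

Using the median of a few runs rather than a single run reduces the impact of cold caches and one-off scheduling noise, which matters when sub-second query times are being compared across engines.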
@liulx20 I have added the notes according to your comments.
Summary
This PR adds comprehensive documentation and scripts for reproducing the LSQB (Labelled Subgraph Query Benchmark) performance results.
- doc/source/tutorials/lsqb_benchmark.rst
- examples/lsqb_benchmark/

Changes

- doc/source/index.rst - Add link to new tutorial
- doc/source/tutorials/lsqb_benchmark.rst - New tutorial document
- examples/lsqb_benchmark/run_neug_benchmark.py - Main benchmark script
- examples/lsqb_benchmark/README.md - Usage instructions
- examples/lsqb_benchmark/requirements.txt - Python dependencies

Dataset
The LDBC SNB SF1 dataset used in this benchmark is available at:
https://neug.oss-cn-hangzhou.aliyuncs.com/datasets/ldbc-snb-sf1-lsqb.tar.gz
Test Plan
🤖 Generated with Claude Code