AI agent guidance for the AshNeo4j source repository.
AshNeo4j is an Ash.DataLayer that stores Ash resources as nodes in a Neo4j graph database.
It is a library published on hex.pm and maintained at diffo-dev/ash_neo4j. Its primary consumer
is the Diffo project; upstream bugs found while working in Diffo belong here.
- Read `usage-rules.md`: the canonical rules for using AshNeo4j, including naming conventions, relationship semantics, aggregate kinds, and the test sandbox.
- Understand the label system (see Label system below): the label concept is a frequent source of bugs and the most important thing to get right.
- Run `mix test` before and after your change to confirm nothing regressed.
Before writing any fix, review existing test coverage for the affected behaviour. If the bug has no test, write the failing test first — this confirms the reproduction and guards the fix against regression. Only then implement the fix and verify the test passes.
For any change that touches more than one layer (driver / Cypher / data layer / Ash), don't work top-down or bottom-up alone — work from both ends and meet in the middle (stalagmite + stalactite). Both ends carry unknowns that compound when you discover them late.
Bottom (stalagmite) — start with a focused test against the lowest layer that doesn't
involve Ash. A raw Bolty.query! or AshNeo4j.Sandbox.run against the driver. This isolates
driver-level surprises (bolty has a history of type / negotiation issues — see bolty#32)
before they ripple up through Cast/Dump and the data layer. Cypher-rendering helpers are also
worth bottom testing — assemble the cypher fragment by hand and Sandbox.run it.
Top (stalactite) — write an exploratory Ash-level test with IO.inspect in your data
layer callback. Surfaces Ash-shape assumptions you have wrong (e.g. the combination_of
callback being checked against {:combine, :base} was a top-down surprise; the actual
Ash.Query.Combination.t() types are five, not the three the @type spec suggested). Throw the
test away once it has taught you the shape.
Meet in the middle. Once both ends are settled, the connecting commit is small and focused — write the bridge code, run the existing end tests plus a new end-to-end one through Ash.
This pattern saved real time on #45 (spatial) and #10 (combination queries). Use it whenever the change spans more than one layer.
    lib/
      data_layer.ex  # Ash.DataLayer behaviour: CRUD, aggregates, calculations, transaction, enrichments (OPTIONAL MATCH → source attributes)
      cypher.ex  # Cypher string helpers: node/2, relationship/3, expression/5, parameterized_node/3, render/1, run/1
      cypher/query.ex  # Typed clause structs (Match, Where, Return, …) and builder functions for every query shape used by the data layer
      query_helper.ex  # Translates Ash.Query (filter, sort, offset, limit) into a Cypher.Query; entry point is query_nodes/1
      resource/info.ex  # All DSL introspection: label/1, module_label/1, domain_label/1, domain_fragment_label/1, all_labels/1, label_pair/1, mapping/1, relate/1, translations/1, and relationship helpers
      resource_mapping.ex  # %ResourceMapping{} struct (module, label, module_label, domain_fragment_label, all_labels, label_pair, properties, edges, guards, skip)
      edge_descriptor.ex  # %EdgeDescriptor{} struct (relationship, label, direction, destination_label)
      neo4j_helper.ex  # Low-level node/edge operations via Bolty
      data_layer/cast.ex  # Casts Neo4j return values to Ash types
      data_layer/dump.ex  # Dumps Ash values to Neo4j-compatible primitives
      data_layer/type_classifier.ex  # Classifies types as :ash_json (embedded/struct/map) or scalar
      sandbox.ex  # AshNeo4j.Sandbox: per-test transaction isolation
      util.ex  # short_name/1, to_camel_case/1, reverse/1
      persisters/
        persist_labels.ex  # Computes and persists domain_label, module_label, label, domain_fragment_label, all_labels, label_pair
        persist_translations.ex  # Builds attribute → property name keyword list; excludes belongs_to source attributes and skip attributes
        persist_relate.ex  # Merges explicit relate DSL with default auto-generated edges
        persist_relationship_attributes.ex  # Maps source attributes to relationship names
        persist_mapping.ex  # Bakes __ash_neo4j_mapping__/0 onto each resource module
      verifiers/
        verify_labels_pascal_case.ex
        verify_relate.ex
        verify_guard.ex
        verify_properties_camel_case.ex
        verify_enrichable.ex
        verify_attribute_type.ex
    test/
      support/resource/  # Test resources (Post, Comment, Author, Specification, …)
      support/srm.ex  # Test domain (Srm)
      blog_test.exs  # CRUD, filter, relationship tests
      aggregate_test.exs  # All aggregate kinds including filtered and expr aggregates
      calculation_test.exs  # Expression calculations
      data_layer/  # Unit tests for Cast, Dump, TypeClassifier, Info
Every node has several distinct label concepts. Getting them confused is the most common source of bugs:
| Name | Persisted as | Example | When used |
|---|---|---|---|
| `domain_label` | `:domain_label` | `:Servo` | Written on CREATE; also part of MATCH via `label_pair` |
| `module_label` | `:module_label` | `:ShelfInstance` | Written on CREATE; also part of MATCH via `label_pair` |
| `label` | `:label` | `:Instance` | May differ from `module_label` when a resource fragment declares a base type; written on CREATE only |
| `domain_fragment_label` | `:domain_fragment_label` | `:Telco` | Written on CREATE only; comes from a domain fragment using `AshNeo4j.DataLayer.Domain`; `nil` when none declared |
| `all_labels` | `:all_labels` | `[:Servo, :ShelfInstance, :Instance, :Telco]` | Full CREATE label list: `[domain_label, module_label, label, domain_fragment_label]`, deduped |
| `label_pair` | `:label_pair` | `[:Servo, :ShelfInstance]` | MATCH label list: always `[domain_label, module_label]`; uniquely identifies this resource type |
Key invariant: all_labels are written on CREATE. For MATCH / UPDATE / DELETE,
use mapping.label_pair — always [domain_label, module_label]. This two-label combination
uniquely identifies the exact resource type and prevents cross-fragment contamination.

- `Cypher.node(:s, [:Servo, :ShelfInstance])` produces `"(s:Servo:ShelfInstance)"`: correct.
- `Cypher.node(:s, [:Instance])` produces `"(s:Instance)"`: scans every resource extending the same fragment.
- `Cypher.node(:s, [:ShelfInstance])` produces `"(s:ShelfInstance)"`: scopes to module but not domain (avoid).

mapping.label_pair always holds [domain_label, module_label]. Use it for all MATCH patterns.
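
To make the label table concrete, here is a minimal sketch of how the two label lists relate. `LabelSketch` and its functions are hypothetical and are not part of AshNeo4j; dropping a `nil` `domain_fragment_label` before deduplication is an assumption drawn from the table, not a description of `PersistLabels`.

```elixir
defmodule LabelSketch do
  # Hypothetical helper for illustration only; not AshNeo4j's implementation.

  # Full CREATE label list: nils dropped (assumption), duplicates removed,
  # original order preserved.
  def all_labels(domain_label, module_label, label, domain_fragment_label) do
    [domain_label, module_label, label, domain_fragment_label]
    |> Enum.reject(&is_nil/1)
    |> Enum.uniq()
  end

  # MATCH label list: always exactly the two-label pair.
  def label_pair(domain_label, module_label), do: [domain_label, module_label]

  # Renders "(var:L1:L2)" from a variable and a list of label atoms.
  def node(variable, labels) do
    "(#{variable}:#{Enum.map_join(labels, ":", &Atom.to_string/1)})"
  end
end
```

With the example labels from the table, `LabelSketch.node(:s, LabelSketch.label_pair(:Servo, :ShelfInstance))` gives `"(s:Servo:ShelfInstance)"`, the MATCH pattern recommended above.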
`mapping.properties` is a keyword list of `{ash_attribute_name, neo4j_property_name}` pairs
built by `PersistTranslations`. Rules:

- `snake_case` attributes → `camelCase` properties (via `Util.to_camel_case/1`).
- The `:id` attribute is special: its property name is the camelCase of the Ash type's short name (e.g. `Ash.Type.UUID` → property `:uuid`). This avoids colliding with Neo4j's internal `id` field.
- `belongs_to` source attributes (e.g. `specification_id`) are excluded from translations. They are not stored as node properties; their values come from `enrichments/3` (reading the OPTIONAL MATCH destination node). Do not re-add them to translations.
- Attributes listed in the `skip` DSL option are also excluded.
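
The translation rules can be sketched as follows. `TranslationSketch` is hypothetical: its `to_camel_case/1` is a stand-in written for this example rather than the real `Util.to_camel_case/1`, and the sketch deliberately leaves out the `:id` special case.

```elixir
defmodule TranslationSketch do
  # Hypothetical stand-in for Util.to_camel_case/1: :first_name -> :firstName.
  def to_camel_case(name) when is_atom(name) do
    [head | tail] = name |> Atom.to_string() |> String.split("_")
    String.to_atom(head <> Enum.map_join(tail, "", &String.capitalize/1))
  end

  # Builds {attribute, property} pairs, leaving out excluded attributes
  # (belongs_to source attributes and skip attributes in the rules above).
  def translations(attributes, excluded) do
    for attr <- attributes, attr not in excluded, do: {attr, to_camel_case(attr)}
  end
end
```

For example, `TranslationSketch.translations([:title, :first_name, :specification_id], [:specification_id])` yields `[title: :title, first_name: :firstName]`; the excluded `specification_id` never reaches the property list.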
The convert_node_to_resource_impl/4 loop iterates translations and reads node properties.
Because belongs_to source attributes are excluded, the loop does not touch them — their
values must survive intact from the enrichments map that seeds the accumulator.
After a read query `MATCH (s:Label) OPTIONAL MATCH (s)-[r]-(d) RETURN s, r, d`, `enrichments/3`
in `DataLayer` processes each `{edge, dest_node}` pair and populates:

- `belongs_to` relationships: sets `source_attribute` (e.g. `specification_id`) from `dest_node.properties[destination_property]`.
- `has_one` reverse relationships: sets `destination_attribute` from source node property.
- `many_to_many` relationships: converts `dest_node` to a resource struct and appends to a list.

The lookup uses mapping.edges (from mapping.module). If an edge returned by the OPTIONAL
MATCH has no matching entry in mapping.edges (wrong label, wrong direction, or missing relate
entry), enrichments/3 silently returns acc unchanged and the source attribute remains nil.
`edge_direction/2` determines direction by comparing `dest_node.id` with `edge.start` /
`edge.end`:

- `dest_node.id == edge.start` → `:incoming` (destination is the start of the edge)
- `dest_node.id == edge.end` → `:outgoing` (destination is the end of the edge)

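
The direction rule is small enough to sketch with pattern matching. `DirectionSketch` is hypothetical, and plain maps stand in for the driver's node and relationship structs, whose exact shape is not shown here.

```elixir
defmodule DirectionSketch do
  # Hypothetical illustration; plain maps replace the driver's structs.

  # Destination is the start of the edge, so the edge points at the source.
  def edge_direction(%{id: id}, %{start: id}), do: :incoming

  # Destination is the end of the edge, so the edge leaves the source.
  def edge_direction(%{id: id}, %{end: id}), do: :outgoing
end
```

Repeating `id` in both arguments makes each clause match only when the two values are equal, which mirrors the comparison described above.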
`PersistRelate` builds `mapping.edges` from two sources:

- Explicit entries: the `relate` list in the resource's `neo4j do` block: `{relationship_name, edge_label, direction, destination_label}`.
- Default entries: auto-generated for any Ash relationship that has no explicit entry. Default edge label = `String.upcase(relationship.type)` (e.g. `:BELONGS_TO`), default destination label = last segment of `relationship.destination` module name.

Explicit entries always take precedence. If a relationship is declared in a fragment's
neo4j do block, check whether the extending resource's relate DSL correctly merges those
entries — a mismatch between the explicit edge label and the default generates a wrong label
in mapping.edges, causing enrichments to silently fail.
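
A rough sketch of the merge, under stated assumptions: `RelateSketch` is hypothetical, relationships are plain maps rather than Ash relationship structs, and the `:outgoing` default direction is an assumption because the text above does not say what direction default entries receive.

```elixir
defmodule RelateSketch do
  # Hypothetical illustration of explicit-over-default merging.

  # Builds a default entry from a relationship map with :name, :type
  # and :destination keys.
  def default_entry(%{name: name, type: type, destination: destination}) do
    edge_label = type |> Atom.to_string() |> String.upcase() |> String.to_atom()
    destination_label = destination |> Module.split() |> List.last() |> String.to_atom()
    # ASSUMPTION: :outgoing is a placeholder direction for this sketch.
    {name, edge_label, :outgoing, destination_label}
  end

  # Explicit entries win; defaults are generated only for relationships
  # that have no explicit entry.
  def edges(explicit, relationships) do
    explicit_names = MapSet.new(explicit, &elem(&1, 0))
    defaults = for rel <- relationships, rel.name not in explicit_names, do: default_entry(rel)
    explicit ++ defaults
  end
end
```

If the explicit `{:specification, :SPECIFIED_BY, ...}` entry were missing from `explicit`, this sketch would emit `:BELONGS_TO` for that relationship instead, which is the kind of label mismatch that makes enrichments fail silently.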
run_aggregate_for_ids/6 selects one of four paths based on the aggregate's properties:
| Condition | Path | Description |
|---|---|---|
| `aggregate.field` is an `Ash.Query.Calculation` | expr path | Loads full dest records, evaluates Ash expression per record in Elixir |
| `aggregate_has_filter?(aggregate)` is true | filtered path | Loads full dest records, applies `Ash.Filter.Runtime.filter_matches`, computes aggregate in Elixir |
| field type is `:ash_json` (embedded/struct/map) | embedded path | Runs `collect(d.prop)` in Cypher, casts each raw JSON value via `Cast.cast/3` in Elixir |
| otherwise | Cypher path | Fully pushed down: `COUNT`, `SUM`, `AVG`, `MIN`, `MAX`, `collect`, `head(collect(...))` |
aggregate_has_filter? treats %Ash.Filter{expression: true} as "no filter" (Ash always
attaches a trivial filter to unfiltered aggregates). Do not change this sentinel check.
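
The path selection can be pictured as a `cond` over the table rows. This is only a sketch: `AggregatePathSketch` is hypothetical, aggregates are plain maps with made-up keys (`:field_kind`, `:filter`, `:type_class`) instead of Ash structs, and checking the conditions in table order is an assumption.

```elixir
defmodule AggregatePathSketch do
  # Hypothetical illustration; the keys below are invented for this sketch.

  # Checks the conditions top to bottom (assumed to match the table order).
  def path(aggregate) do
    cond do
      aggregate.field_kind == :calculation -> :expr
      has_filter?(aggregate) -> :filtered
      aggregate.type_class == :ash_json -> :embedded
      true -> :cypher
    end
  end

  # A filter whose expression is literally `true` is the trivial filter,
  # which counts as "no filter".
  defp has_filter?(%{filter: %{expression: true}}), do: false
  defp has_filter?(%{filter: nil}), do: false
  defp has_filter?(%{filter: _}), do: true
end
```

Dropping the first `has_filter?/1` clause would send every unfiltered scalar aggregate down the `:filtered` path, which is why the sentinel check matters.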
Every query shape used by the data layer has a typed builder in Cypher.Query. Builders
return %Cypher.Query{clauses: [...], params: %{}} structs that Cypher.render/1 turns into
a {cypher_string, params} tuple for Cypher.run/1.
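
As a simplified picture of the build-then-render flow, the sketch below renders an ordered clause list into a `{cypher_string, params}` tuple. `RenderSketch` is hypothetical, and its `{keyword, body}` tuples are an assumption that stands in for the typed clause structs.

```elixir
defmodule RenderSketch do
  # Hypothetical illustration; real clauses are typed structs, not tuples.

  # Emits clauses strictly in insertion order and passes params through.
  def render(%{clauses: clauses, params: params}) do
    cypher = Enum.map_join(clauses, " ", fn {keyword, body} -> "#{keyword} #{body}" end)
    {cypher, params}
  end
end
```

Because rendering never reorders anything, the builder alone is responsible for placing each clause in its correct semantic position.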
Cypher.node(variable, labels) takes a list of label atoms and produces "(var:L1:L2)".
Cypher.parameterized_node/3 does the same with a property map for parameterized MATCH patterns.
All MATCH/UPDATE/DELETE builders accept atom() | [atom()] for source label parameters — pass
mapping.label_pair (a list) for all resource operations. Single-atom callers still work for
destination labels (which remain a single label in most patterns).
The aggregate builders (aggregate_per_record, aggregate_total, related_nodes) use a
labels_string/1 private helper to render [domain, module] as "Domain:Module" inside
string-interpolated Cypher patterns — "(s:#{labels_string(label_pair)})". When modifying
aggregate builders, use labels_string/1 for the source pattern, not direct atom interpolation.
Tests require a running Neo4j instance (configured in `config/runtime.exs` via `BOLT_URL`
or similar). `AshNeo4j.Sandbox` wraps each test in a transaction that rolls back on completion.

    mix test                          # full suite
    mix test test/blog_test.exs       # single file
    mix test test/blog_test.exs:LINE  # single test
    mix test --max-failures 5         # stop early

The sandbox uses Process dictionary flags (`ash_neo4j_in_sandbox_tx`,
`ash_neo4j_tx_stack`). Tests that bypass the sandbox or start their own transactions may
interfere with isolation, so check the sandbox implementation before adding transaction logic
in tests.
When a bug is found in a dependency (Bolty, Ash, Spark), raise a GitHub issue on that repository. Use diffo issue #125 as the style reference:
- ## Description — explain what the system does, what the code path is, and where it breaks. Include a short Cypher or Elixir snippet if it makes the failure concrete.
- ## What we need — state the correct behaviour plainly.
- ## Why it matters — explain the practical impact.
Do not attempt to locate or fix the root cause in the dependency. Add useful hypotheses as a follow-up comment, then leave it with the upstream maintainers.
- Not using `mapping.label_pair` for MATCH. All read, update, delete, and aggregate queries must use `mapping.label_pair` (`[domain_label, module_label]`) as the source node pattern. Using `mapping.label` alone matches every resource that extends the same fragment. Using `mapping.module_label` alone (without domain) risks collisions across domains.
- Re-adding `belongs_to` source attributes to translations. They are intentionally excluded by `PersistTranslations`. Their values come from enrichments (the OPTIONAL MATCH result). Including them in translations would cause the property-read loop to overwrite the enriched value with nil (the attribute has no corresponding node property).
- Assuming `Verifier.get_option(dsl, [:neo4j], :relate, [])` picks up fragment DSL options. `get_entities` picks up entities from fragments; option merging behaviour for `relate` (a list option) must be verified separately. If a fragment's explicit `relate` entries are not visible, `PersistRelate` generates default edges with wrong labels (e.g. `:BELONGS_TO` instead of `:SPECIFIED_BY`), causing enrichments to silently fail.
- Using a single label in aggregate Cypher builders (`aggregate_per_record`, `aggregate_total`, `related_nodes`). These use `"(s:#{labels_string(source_label)})"` with a `labels_string/1` helper. Always pass `mapping.label_pair` as the source label here too.
- Registering a transformer under `persisters:` and expecting `before?`/`after?` ordering relative to other transformers to be honoured. Persisters always run after ALL transformers. Ordering declarations that target transformers from a persister are silently ignored.
- Using `List.delete/2` to filter domain labels from destination node labels. It removes only the first occurrence. If the source domain label happens to match a destination node label, only one instance is removed. Prefer `List.delete_at` or label filtering by explicit set membership when precision matters.
- Treating `domain_label` alone as a MATCH label. The domain label is part of `label_pair` and is used in MATCH, but always paired with `module_label`. Matching on domain label alone would return every node in the domain, not just the target resource.
- Forgetting to update `relation_read` in `Cypher.Query` when changing MATCH label logic. The `relationship_read/7` builder emits a separate `MATCH (s:SrcLabel)-[r:EdgeLabel]-(d:DestLabel)` pattern. It must use the same multi-label source pattern as `node_read`.
- Changing `aggregate_has_filter?` sentinel without understanding Ash's trivial filter. Ash attaches `%Ash.Filter{expression: true}` to every aggregate, even unfiltered ones. The check `%Ash.Filter{expression: true} -> false` is intentional: it means "no user filter". Removing or loosening it routes all aggregates through the Elixir path unnecessarily.
- Modifying `Cypher.render/1` to reorder clauses. The clause list is ordered; render outputs them in insertion order. Query correctness depends on this ordering. Always add clauses in the correct semantic position in the builder, not in render.
Cypher.render/1to reorder clauses. The clause list is ordered; render outputs them in insertion order. Query correctness depends on this ordering. Always add clauses in the correct semantic position in the builder, not in render.