[FEA] GFQL local output_type + Table select pipeline for conformance #879

@lmeyerov

Description

Is your feature request related to a problem? Please describe.
Cypher TCK conformance work needs row-style outputs for RETURN/ORDER BY/LIMIT and basic aggregations. Today, local gfql() always returns a Plottable, so row-style results require ad-hoc Python DataFrame post-processing outside the language. Remote gfql already supports output_type=nodes/edges, but local does not, and there is no in-language table pipeline.
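For illustration, the kind of post-processing currently pushed outside GFQL looks roughly like this; the Plottable stand-in, column names, and data are hypothetical, not from the real API:

```python
import pandas as pd

class FakePlottable:
    """Minimal stand-in for the Plottable that local gfql() returns today."""
    def __init__(self, nodes: pd.DataFrame):
        self._nodes = nodes

g = FakePlottable(pd.DataFrame({'name': ['n3', 'n1', 'n2'],
                                'score': [7, 9, 5]}))

# Cypher-style "RETURN n.name ORDER BY n.score DESC LIMIT 2",
# expressed today as plain pandas calls outside GFQL:
rows = (g._nodes
          .sort_values('score', ascending=False)
          .head(2)['name']
          .tolist())
```

Every harness ends up re-implementing this pattern by hand, which is what the table pipeline below would standardize.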

Describe the solution you'd like
Add a minimal, vector-friendly Table pipeline and align local gfql() with remote output_type:

  • gfql(..., output_type='all'|'nodes'|'edges') for local execution (parity with gfql_remote).
  • nodes_table(g) / edges_table(g) conversion ops (graph -> table), optionally filtered by alias name.
  • A single table op: select(table=..., where=..., project=..., order_by=..., group_by=..., agg=..., limit=..., distinct=...).
  • Optional: graph(edges=table, src=..., dst=..., nodes=..., node_id=...) as table->graph bridge (hypergraph already exists, so keep this low-level).

All operations should compile to pandas/cuDF vectorized primitives.
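A minimal sketch of how such a select op could compile to vectorized pandas primitives; the function name matches the proposal above, but the signature details (callable where, list project, etc.) are illustrative assumptions, not a final API:

```python
import pandas as pd

def select(table, where=None, project=None, order_by=None,
           limit=None, distinct=False):
    """Illustrative compilation of table-op arguments to pandas primitives.
    Each clause maps to one vectorized DataFrame call (assumption: cuDF's
    matching methods would allow the same code path on GPU)."""
    out = table
    if where is not None:
        out = out[where(out)]        # predicate -> boolean mask
    if project is not None:
        out = out[list(project)]     # column projection
    if distinct:
        out = out.drop_duplicates()
    if order_by is not None:
        out = out.sort_values(order_by)
    if limit is not None:
        out = out.head(limit)
    return out.reset_index(drop=True)

edges = pd.DataFrame({'src': ['a', 'b', 'c'],
                      'dst': ['b', 'c', 'a'],
                      'w':   [3, 1, 2]})
top = select(edges, where=lambda t: t['w'] > 1,
             project=['src', 'w'], order_by='w', limit=2)
```

Because each clause is a single DataFrame method, the same compilation strategy should carry over to cuDF with no per-row Python.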

Describe alternatives you've considered

  • Keep using Python DataFrame post-processing outside GFQL. This works but blocks conformance and makes translation harder to standardize.
  • Add only output_type nodes/edges for local gfql() without any Table ops; still leaves RETURN/ORDER/AGG outside GFQL.

Additional context
Conformance status: 718/1615 scenarios translated; expressions bucket (758 scenarios) and row semantics remain 0% covered. A minimal table pipeline would unlock a large fraction of non-expression RETURN/ORDER/LIMIT cases and reduce Python-only post-processing in the TCK harness.
