Refactor OpenACC parser to structured AST #9

@ouankou

Summary

We need to remove raw string handling and post-unparse hacks across accparser. The goal is to store OpenACC keywords/modifiers as structured enums/fields in the AST, emit from that structure, and ensure no OpenACC keywords are captured as opaque strings.

Plan

  1. Goals/guardrails
  • No OpenACC keywords stored as raw strings; all keyword/modifier payloads live in typed fields (OpenACCIR.h) and enums (OpenACCKinds.h).
  • Remove after-unparse spacing/substring hacks; ToString emits from structured data.
  • Preserve alias spellings where the spec permits via structured flags/enums, not free-form strings.
  • Support round-tripping for both C and Fortran without relying on raw text.
  2. Enumeration coverage
  • Add enums in OpenACCKinds.h for missing categories:
    • Device types (host/multicore/gpu/any/unspecified)
    • Routine bind target vs name literal
    • Gang/worker/vector argument kinds (dim/num/length) and presence indicators
    • Device_type clause kinds (reuse device types)
    • Wait argument kinds (devnum, queues flag)
  • Audit existing enums for completeness vs OpenACC spec; fill gaps.
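A minimal sketch of what the added enums could look like. All type and function names here (`ACCDeviceType`, `ACCParallelismArg`, `ACCWaitArg`, `toSpelling`) are hypothetical and not taken from the current OpenACCKinds.h; the `*` spelling for the "any" device type is also an assumption.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical enum names; the real OpenACCKinds.h may differ.
enum class ACCDeviceType { Unspecified, Host, Multicore, Gpu, Any };

// Argument keywords for gang/worker/vector. `Static` is gang's
// `static:` argument from the OpenACC spec.
enum class ACCParallelismArg { Num, Dim, Static, Length };

// Keywords that may prefix the wait argument list.
enum class ACCWaitArg { Devnum, Queues };

// Single place that owns the keyword spelling, so no clause
// needs to store it as a raw string.
constexpr std::string_view toSpelling(ACCDeviceType t) {
  switch (t) {
    case ACCDeviceType::Host:        return "host";
    case ACCDeviceType::Multicore:   return "multicore";
    case ACCDeviceType::Gpu:         return "gpu";
    case ACCDeviceType::Any:         return "*";  // assumed spelling
    case ACCDeviceType::Unspecified: return "";
  }
  return "";
}

static_assert(toSpelling(ACCDeviceType::Gpu) == "gpu");
```

Keeping the spelling table next to the enum means the unparser never has to consult captured source text to print a keyword.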
  3. AST structuring (OpenACCIR.h / OpenACCIR.cpp)
  • Introduce typed payload structs for keyword-bearing clauses/directives:
    • OpenACCDeviceTypeClause: store vector of device-type enums (+ optional raw fallback for unknown).
    • OpenACCGangClause: fields for num/dim/static arguments (as expression strings), with presence flags.
    • OpenACCWorkerClause: enum modifier (num) + numeric expr string.
    • OpenACCVectorClause: enum modifier (length) + numeric expr string.
    • OpenACCWaitClause + OpenACCWaitDirective: optional devnum string, queues flag, expression list.
    • OpenACCRoutineDirective/routine clauses: bind target (identifier vs string literal), device_type list as enums.
    • Device/Deviceptr/Device_resident: capture argument lists structurally instead of dumping into expressions.
  • For alias-preserving data clauses (copy/copyin/copyout/create variants), use enum for base kind + alias enum/string to re-emit chosen spelling without storing raw tokens.
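One possible shape for the typed payloads, sketched under assumptions: `ACCGangPayload`, `ACCDataClausePayload`, `ACCDataKind`, `ACCDataAlias`, and `spelling` are illustrative names, not existing accparser types. Presence is modelled with `std::optional` rather than separate flags, and the alias is an enum so the chosen spelling can be re-emitted without keeping the raw token.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical payload for a gang clause: each argument is a user
// expression string, present only when it was written in the source.
struct ACCGangPayload {
  std::optional<std::string> num;
  std::optional<std::string> dim;
  std::optional<std::string> staticSize;
};

// Base kind of an alias-bearing data clause.
enum class ACCDataKind { Copy, Copyin, Copyout, Create };

// Which spelling the user chose: copyin, present_or_copyin, or pcopyin.
enum class ACCDataAlias { Canonical, PresentOr, P };

struct ACCDataClausePayload {
  ACCDataKind kind = ACCDataKind::Copy;
  ACCDataAlias alias = ACCDataAlias::Canonical;
  std::vector<std::string> vars;  // true user variables/expressions only
};

// Rebuild the keyword from the two enums instead of a stored token.
std::string spelling(ACCDataKind kind, ACCDataAlias alias) {
  std::string base;
  switch (kind) {
    case ACCDataKind::Copy:    base = "copy";    break;
    case ACCDataKind::Copyin:  base = "copyin";  break;
    case ACCDataKind::Copyout: base = "copyout"; break;
    case ACCDataKind::Create:  base = "create";  break;
  }
  switch (alias) {
    case ACCDataAlias::Canonical: return base;
    case ACCDataAlias::PresentOr: return "present_or_" + base;
    case ACCDataAlias::P:         return "p" + base;
  }
  return base;
}
```

For example, `spelling(ACCDataKind::Copyin, ACCDataAlias::P)` yields `pcopyin`, while the clause itself stores only the two enum values and its variable list.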
  4. Parser/AST constructor (OpenACCASTConstructor.cpp)
  • Replace ctx->getText() captures with structured setters:
    • Map grammar alternatives directly to enums (device_type host/gpu -> enum).
    • Parse gang/vector/worker arguments into explicit fields instead of addLangExpr.
    • For wait clauses/directives, set devnum/queues flags; append remaining args as expressions.
    • For routine bind, detect name vs string literal and store separately.
  • Minimize addLangExpr usage to true user expressions/vars; remove keyword tokens from raw strings.
  • Keep original_keyword only for alias spellings or replace with an alias enum.
  • Add validation where spec restricts combinations; otherwise store an "unknown/raw" slot for forward compatibility.
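The constructor-side classification could look roughly like the following. This is a sketch only: `lookupDeviceType` and `makeBindTarget` are hypothetical helpers, and they take a token spelling as input purely to keep the example free of parser dependencies. In the real constructor the grammar alternative itself would select the enum.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

enum class ACCDeviceType { Host, Multicore, Gpu, Any };
enum class ACCBindKind { Identifier, StringLiteral };

// Hypothetical structured bind target for the routine directive.
struct ACCBindTarget {
  ACCBindKind kind;
  std::string name;  // identifier text, or literal contents without quotes
};

// Known device types map to an enum; anything else returns nullopt so
// the caller can route it to an "unknown/raw" slot.
std::optional<ACCDeviceType> lookupDeviceType(std::string_view token) {
  if (token == "host")      return ACCDeviceType::Host;
  if (token == "multicore") return ACCDeviceType::Multicore;
  if (token == "gpu")       return ACCDeviceType::Gpu;
  if (token == "*")         return ACCDeviceType::Any;  // assumed spelling
  return std::nullopt;
}

// Distinguish bind("name") from bind(name) and store them separately.
ACCBindTarget makeBindTarget(std::string_view token) {
  assert(!token.empty() && "bind target must not be empty");
  const bool quoted = token.size() >= 2 &&
                      token.front() == '"' && token.back() == '"';
  if (quoted) {
    return {ACCBindKind::StringLiteral,
            std::string(token.substr(1, token.size() - 2))};
  }
  return {ACCBindKind::Identifier, std::string(token)};
}
```

Returning `std::nullopt` for unrecognized device types keeps the forward-compatibility fallback explicit at the call site instead of hiding it inside a string field.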
  5. Unparser (OpenACCIRToString.cpp)
  • Rewrite ToString for affected clauses/directives to emit from structured fields:
    • device_type: join enum names.
    • gang/worker/vector: format only present arguments; canonical separators.
    • wait: emit devnum/queues per spec.
    • routine: emit bind/device_type from structured data; avoid getText() artifacts.
    • device/deviceptr/device_resident: emit expression lists from structured storage.
  • Remove substring/space popping hacks once formatting is explicit.
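As an illustration of emitting from structured fields, the sketch below formats gang and wait clauses. The struct and function names are hypothetical, and the exact canonical spacing (`, ` between arguments, no space after `:`) is an assumption that the project would need to pin down in its expected outputs.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

struct ACCGangPayload {
  std::optional<std::string> num, dim, staticSize;
};

struct ACCWaitPayload {
  std::optional<std::string> devnum;
  bool queues = false;             // true if the `queues:` keyword was written
  std::vector<std::string> exprs;  // async-argument expressions
};

// Join with an explicit separator so no trailing text has to be popped.
static std::string join(const std::vector<std::string>& parts,
                        const std::string& sep) {
  std::string out;
  for (std::size_t i = 0; i < parts.size(); ++i) {
    if (i != 0) out += sep;
    out += parts[i];
  }
  return out;
}

// Emit only the arguments that are present.
std::string toString(const ACCGangPayload& g) {
  std::vector<std::string> parts;
  if (g.num)        parts.push_back("num:" + *g.num);
  if (g.dim)        parts.push_back("dim:" + *g.dim);
  if (g.staticSize) parts.push_back("static:" + *g.staticSize);
  if (parts.empty()) return "gang";
  return "gang(" + join(parts, ", ") + ")";
}

std::string toString(const ACCWaitPayload& w) {
  // `queues:` only makes sense in front of a non-empty list.
  assert(!(w.queues && w.exprs.empty()));
  if (!w.devnum && w.exprs.empty()) return "wait";
  std::string out = "wait(";
  if (w.devnum) out += "devnum:" + *w.devnum + ":";
  if (w.queues) out += "queues:";
  out += join(w.exprs, ", ");
  out += ")";
  return out;
}
```

Because separators are added only between elements, there is nothing to trim afterwards, which is what makes the post-unparse substring and space-popping code removable.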
  6. Tests and expected outputs
  • Use openacc_vv + built-in expectations as the gold standard; update expected pragmas after structuring so round-trip matches canonical spacing.
  • Add targeted unit tests for structured clauses (device_type, gang/vector/worker args, wait devnum/queues, routine bind) to lock formatting.
  • Run ctest and the existing harness after each feature; change expected outputs only to reflect intentional canonical formatting, never to hide issues in the code.
  7. Migration steps/sequencing (incremental)
    a) Device_type/device/deviceptr/device_resident
    b) Gang/worker/vector clauses
    c) Wait (directive/clause)
    d) Routine (bind/device_type)
    e) Alias-bearing data clauses (copy*/create*)
  • After each: update ToString, adjust tests, run ctest.
  8. Cleanup/consistency
  • Centralize all new enums in OpenACCKinds.h (keep only OpenACCBaseLang in OpenACCIR.h).
  • Remove unused helpers; document any remaining raw-expression fallbacks.

Notes

  • Current code captures many clause arguments via ctx->getText() into addLangExpr; this refactor replaces that with structured fields to improve validation and formatting fidelity.
