Skip to content

Comments

Refactor array bracket access to dedicated AST node#8168

Open
shulhi wants to merge 20 commits intorescript-lang:masterfrom
shulhi:bracket-access-ast
Open

Refactor array bracket access to dedicated AST node#8168
shulhi wants to merge 20 commits intorescript-lang:masterfrom
shulhi:bracket-access-ast

Conversation

@shulhi
Copy link
Member

@shulhi shulhi commented Jan 12, 2026

Refactors array bracket access from Pexp_apply(Array.get/set, ...) to a dedicated Pexp_index AST node. This is an internal refactoring with no user-facing changes.

Changes:

  • arr[0] now represented as Pexp_index(arr, 0, None) in AST
  • arr[0] = val now represented as Pexp_index(arr, 0, Some(val)) in AST
  • Same type checking, same JavaScript output, same behavior

Why:

@nojaf
Copy link
Member

nojaf commented Jan 12, 2026

Hmm, any reason you went with one new node to represent two different things?
If feel like the optional expression is not the cleanest way to model this.
Would it not make more sense to just split these?
Is this to remain consistent with something?

@pkg-pr-new
Copy link

pkg-pr-new bot commented Jan 12, 2026

Open in StackBlitz

rescript

npm i https://pkg.pr.new/rescript-lang/rescript@8168

@rescript/darwin-arm64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/darwin-arm64@8168

@rescript/darwin-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/darwin-x64@8168

@rescript/linux-arm64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/linux-arm64@8168

@rescript/linux-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/linux-x64@8168

@rescript/runtime

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/runtime@8168

@rescript/win32-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/win32-x64@8168

commit: c8ffd65

@shulhi
Copy link
Member Author

shulhi commented Jan 12, 2026

Hmm, any reason you went with one new node to represent two different things? If feel like the optional expression is not the cleanest way to model this. Would it not make more sense to just split these? Is this to remain consistent with something?

I had it split initially, but thought it would be better to represent this as a single node to be consistent with record update.

  | Pexp_record of expression record_element list * expression option
    (* { l1=P1; ...; ln=Pn }     (None)
       { E0 with l1=P1; ...; ln=Pn }   (Some E0) *)

but, you could also make a counter argument with Pexp_field vs Pexp_setfield.

Splitting Pexp_index into two shouldn't be too much work if that is more desired.

@cristianoc
Copy link
Collaborator

This is not currently changing the parser. Guess wip. Just, it means it's not been tested yet.

@nojaf
Copy link
Member

nojaf commented Jan 12, 2026

if that is more desired.

It was just my initial unfiltered response. With records it feels different somehow: { myRec with X } the with X is a clear child node. But for assignment in a[0] = x, it is tagged on and thus not quite an optional extension.

@shulhi
Copy link
Member Author

shulhi commented Jan 12, 2026

Okay this is probably closer to record construction vs mutable record field assignment (two different nodes) rather than record update.

This is still in draft though, it is still subject to change and I don't mind changing it to two different nodes.

@zth
Copy link
Member

zth commented Jan 12, 2026

I also think 2 separate nodes are cleaner for this.

Great work btw, this will be a good cleanup/enhancement.

@shulhi shulhi force-pushed the bracket-access-ast branch from 6a52ad2 to de2dc18 Compare January 28, 2026 11:35
@shulhi shulhi marked this pull request as ready for review February 6, 2026 07:35
@shulhi shulhi marked this pull request as draft February 6, 2026 12:13
@shulhi
Copy link
Member Author

shulhi commented Feb 9, 2026

TLDR

This PR introduces Pexp_index and Pexp_setindex AST nodes for bracket access (arr[i] and arr[i] = v), replacing the previous approach of desugaring to Array.get/Array.set function calls at parse time.

Bracket access now has fixed semantics as a language primitive:

  • arr[i] returns option<'a> — matching JS behavior where out-of-bounds returns undefined
  • arr[i] = v returns unit

This also removes the implicit reliance on module name. The parser previously had a FIXME about this:

(* FIXME: Do not implicitly rely on specific module name, even primitive one
   This can be abused like
     module Array = MyModule
   Find better mechanism to support it
*)

Even the built-in Array module could be shadowed by module Array = MyModule, making arr[3] call an arbitrary user function. The new AST node makes bracket access independent of any module.

Details

Previous behavior

The parser desugared arr[3] into Array.get(arr, 3) — a function call. The semantics depended on which Array.get was in scope:

Scope arr[3] resolves to Return type
Default Array.get (Stdlib_Array.get) option<'a>
open Belt Belt.Array.get option<'a>
open ImmutableArray ImmutableArray.Array.get option<'a>

The return type is the same as before (option<'a>), but bracket access is now a language primitive rather than a module-dependent function call.

GenType impact

Since bracket access is now a language primitive, arr[3] always types the container as array<'a> — it does not use module-specific types even if ImmutableArray or Belt is in scope. For example:

open ImmutableArray
arr[3]

On master, this resolved to ImmutableArray.Array.get(arr, 3), so the type checker saw arr as ImmutableArray.t<'a> and gentype exported it as ReadonlyArray<T1>. With the new AST, arr[3] always types arr as array<'a>, so gentype exports it as T1[] instead.

If ImmutableArray.t or Belt.Array.t types are desired (e.g. for gentype exports), users need to use explicit function calls:

// arr[3] — always array<'a>, gentype sees T1[]
let result = arr[3]

// Explicit call — preserves ImmutableArray.t, gentype sees ReadonlyArray<T1>
let result = ImmutableArray.get(arr, 3)

// Explicit call — preserves Belt types
let result = Belt.Array.get(arr, 3)

@shulhi shulhi marked this pull request as ready for review February 11, 2026 07:58
@shulhi
Copy link
Member Author

shulhi commented Feb 21, 2026

@nojaf can you please review again. I've made some changes to the analysis related section to handle completion. See changes in analysis and the test here https://github.com/rescript-lang/rescript/pull/8168/changes#diff-9f57d527f360fd0f7435ea108c578ba2fc160e749b874b611b46d807a444deeb

Completable: Cpath Value[people][]->
Package opens Stdlib.place holder Pervasives.JsxModules.place holder
Resolved opens 1 Stdlib
ContextPath Value[people][]->
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The [] represents the CPIndex

Copy link
Member

@nojaf nojaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, though would like to know why the examples changed.

let testImmutableArrayGet = arr => {
open ImmutableArray
arr[3]
ImmutableArray.Array.get(arr, 3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this existing code?
Why did this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to match the expected output of the gentype. I mentioned the reason here, #8168 (comment).

Essentially, arr[3] is transformed into Array.get(arr, 3) and depending on what module is in scope, the behavior of the gentype can be changed.

Since arr[3] no longer transformed to Array.get(arr, 3) internally, to get the same gentype behavior you need to be explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure if updating the test to avoid the problem is the best thing to do here. I have never used gen type so I have no idea what the implications are but this seems quite user impacting? If so, is this the right play then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned before that this is going to impact gentype. I don't use gentype/typescript though, maybe @cometkim can chime in? Anyway, this is going to be in v13, we can expect to have some breaking changes.

There's also this note I found in the codebase that this needs to be fixed. This PR essentially fixes this issue.

(* FIXME: Do not implicitly rely on specific module name, even primitive one
   This can be abused like
     module Array = MyModule
   Find better mechanism to support it
*)

~kind:
(Completion.ExtractedType (Toption (env, ExtractedType typ), `Type));
])
| CPIndex cp -> (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this!

@nojaf nojaf requested review from cristianoc and zth February 23, 2026 09:13
Copy link
Collaborator

@cristianoc cristianoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have expected the syntax tests to look a little different as they would show the new internal representation of array index.
Not sure why there's no diff evident in this PR.
We're still checking in the reacts of parsing as printing of the parsed AST right?

@shulhi
Copy link
Member Author

shulhi commented Feb 23, 2026

I would have expected the syntax tests to look a little different as they would show the new internal representation of array index.
Not sure why there's no diff evident in this PR.
We're still checking in the reacts of parsing as printing of the parsed AST right?

Before Pexp_index existed, arr[0] was Pexp_apply with Array.get/set internally, and pprintast.ml had special case logic in the Pexp_apply handler to detect this pattern and print it as arr.(0).

https://github.com/shulhi/rescript-compiler/blob/cfda8c1bae53980074e5c3b9d8b2757154dba589/compiler/ml/pprintast.ml#L560

With this PR, arr[0] is Pexp_index and pprintast.ml directly prints it as arr.(0). Both produce the same output.

@cristianoc
Copy link
Collaborator

cristianoc commented Feb 23, 2026

Can we clarify the gentype discussion.
Looks like the change is semantic and does not have to do with gentype itself, which just reacts to the semantic change.

The intended breaking change is that one cannot redefine Array anymore, as far as indexing operations are concerned.
Correct?

@shulhi
Copy link
Member Author

shulhi commented Feb 23, 2026

That's correct, the changes in gentype is the side effect of not allowing Array to be redefined.

Take this for example,

open ImmutableArray
arr[3]

will generate typescript definition for ReadonlyArray<T1>. However, just by changing which module is in scope, the gentype definition can be changed.

open Belt
arr[3]

This is because previously, index access is translated to Array.get(arr, 3) and depending on which module is in scope, it later gets translated to Immutable.Array.get(arr, 3), Belt.Array.get(arr, 3), etc. This is problematic because you can define

module Foo = {
  module Array = {
    let get = (_, _) => Some(42)
  }
}

let arr = [1,2,3,4]

open Foo
Console.log(arr[3]) // print 42 instead of 4

https://rescript-lang.org/try?version=v12.0.2&module=esmodule&code=LYewJgrgNgpgBAMRCOBeOBvAUHOpKxwCCATiQIYCeamOucsALnAOYzPoAUA+gDRzcAlGgB8cAMohgMTgBYATILoBfLKqxM45MjQDaARl7zeAZl6yAuliwgADjAB2iZFgDCIBwGcQsAHRQQFk5tEl0TC2EAeki4WxIASwdmBThEz0YYcjA4EAAzOFkgA

@cristianoc
Copy link
Collaborator

OK all clear. This looks good to me.
The breaking change of disallowing re-defining Array seems orthogonal to the new array index access AST change.
I have no opinion either way on the breaking change. Sounds good to me if we want to be stricter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants