feat: add lambda expression support#710
Conversation
benbellick
left a comment
There was a problem hiding this comment.
Only just started looking at this but flushing a few comments just to get the ball rolling. On the whole this is looking good though, nice work!
| List<Type> paramTypes = parameters().fields(); | ||
| Type returnType = body().getType(); | ||
|
|
||
| return Type.withNullability(false).func(paramTypes, returnType); |
There was a problem hiding this comment.
Is setting nullability false correct here? I am pretty sure func can be nullable, e.g. func?<i32 -> i32>.
Also, probably will be good to make a test captures this then.
There was a problem hiding this comment.
I followed the substrait-go implementation: the GetType function for lambda returns Nullability: types.NullabilityRequired
func (l *Lambda) GetType() types.Type {
return &types.FuncType{
Nullability: types.NullabilityRequired,
ParameterTypes: l.Parameters.Types,
ReturnType: l.Body.GetType(),
}
}
There was a problem hiding this comment.
hmm I am thinking about this right now but that may have been wrong in substrait-go 🤔
There was a problem hiding this comment.
I think this is a spec error. I raised an issue in upstream here.
Mind leaving a comment there referencing this issue as a TODO so we remember to fix it once it gets resolved?
| } | ||
| } | ||
|
|
||
| @Value.Immutable |
There was a problem hiding this comment.
This is okay if you feel confident in your LambdaInvocation work, but considering the size of this PR, it will make it easier both on you and the reviewer to separate out the Lambda work and LambdaInvocation work into two separate PRs.
There was a problem hiding this comment.
I'll handle LambdaInvocation in a separate pr
| private final Type.Struct rootType; | ||
| private final ProtoTypeConverter protoTypeConverter; | ||
| private final ProtoRelConverter protoRelConverter; | ||
| private final List<Type.Struct> lambdaParameterStack = new ArrayList<>(); |
There was a problem hiding this comment.
I need to access lambdaIndex = lambdaParameterStack.size() - 1 - stepsOut here: lambdaParameters = lambdaParameterStack.get(lambdaIndex); when building the Lambda parametre ref that's why I didn't use a stack
There was a problem hiding this comment.
That's fair. Java's stack does extend Vector though so you still get the get method. I'm not so particular on this one.
The only thing I will say is I do think that the whole stack parameter stuff can be confusing for people without a little bit of PL experience, which is why I pushed @Slimsammylim to encapsulate some of that logic and put it in docstring comments. I could see a case for encapsulating this logic in a local private class, though it probably makes more sense to do this only if this logic comes up again.
I expected to see something like this logic elsewhere, as you need to do it to do validation when building lambdas. Though I didn't find anything like this. Does the builder for lambdas in this PR actually validate that the lambda is semantically correct?
There was a problem hiding this comment.
I agree that it would be nice for clarity to encapsulate this quite specific behaviour in a class (single responsibility principle). It doesn't need to extend one of the full collection interfaces, just provide the methods you need:
void push(Type.Struct value)Type.Struct pop()(throw runtime NoSuchElementException if empty)Type.Struct get(int stepsOut)(throw runtime IndexOutOfBoundsException on invalid stepsOut)
Or make it generic if you will reuse it for other types. The caller doesn't need to worry about clever indexing logic or what collection implementation is used internally; just make simple method calls that clearly describe the intent.
| .setStruct(protoLambda.getParameters()) | ||
| .build()); | ||
|
|
||
| lambdaParameterStack.add(parameters); |
There was a problem hiding this comment.
then we can use the standard stack push here
| try { | ||
| body = from(protoLambda.getBody()); | ||
| } finally { | ||
| lambdaParameterStack.remove(lambdaParameterStack.size() - 1); |
There was a problem hiding this comment.
and the standard stack pop here
| int lambdaIndex = lambdaParameterStack.size() - 1 - stepsOut; | ||
| if (lambdaIndex < 0 || lambdaIndex >= lambdaParameterStack.size()) { | ||
| throw new IllegalArgumentException( | ||
| String.format( | ||
| "Lambda parameter reference with stepsOut=%d is invalid (current depth: %d)", | ||
| stepsOut, lambdaParameterStack.size())); | ||
| } |
There was a problem hiding this comment.
This logic could sit within a simple class to encapsulate the lambda parameter stack behaviour (as described in other comments) instead of adding extra complexity to this method.
|
|
||
| Type.Struct lambdaParameters = lambdaParameterStack.get(lambdaIndex); | ||
| return FieldReference.newLambdaParameterReference( | ||
| reference.getDirectReference().getStructField().getField(), |
There was a problem hiding this comment.
Something I'm curious about is why we use reference.getDirectReference().getStructField().getField() here instead of getDirectReferenceSegments(reference.getDirectReference()) like the ROOT_REFERENCE case does.
I'm unfamiliar with substrait-java, but does this current implementation support nested field access through lambda parameters?
*This may be an unimportant issue especially if it's not common to access nested fields in lambdas
This PR is to add support for the Lambda Expression
Summary
Test plan
closes #687