55 changes: 55 additions & 0 deletions .cline/instructions.md
@@ -230,6 +230,61 @@ The Common AST must be designed for execution by the Expression Evaluation (EE)
### Graphviz Purpose
**Important**: Graphviz DOT file generation is solely for developer diagnosis and understanding of AST structure during development. It is NOT for end-user visualization or production use.

## Collaborative Design Process

### How to Design New AST Constructs Together
**CRITICAL**: When a user requests a new AST construct (like ProjectNode), follow this collaborative process:

1. **Start with Questions**: Don't jump to implementation. Ask:
   - "What are some examples of how this would be used?"
   - "Should we challenge any assumptions about this design?"
   - "What are the alternatives we should consider?"

2. **Grammar-Driven Discovery**: Examine both KQL and TraceQL grammars together
   - Show the user the relevant grammar sections
   - Discuss how each language expresses the concept
   - Identify commonalities and differences

3. **Design Iteration**: Propose initial designs and refine based on feedback
   - Present multiple options with trade-offs
   - Ask "Why do we need X?" for each requirement
   - Challenge assumptions about complexity and type systems

4. **Document Decisions**: Capture the reasoning behind design choices
   - Update memory-bank/designProcess.md with insights
   - Include examples of questions that led to better designs
   - Document what we learned for future features

### Design Questions to Always Ask
When designing new constructs, always explore:

1. **Type System Questions**
   - "Is type information actually needed here?"
   - "Can downstream systems infer this information?"
   - "Are we over-engineering simple cases?"

2. **Complexity Questions**
   - "What's the minimum viable implementation?"
   - "How will we handle complex cases later?"
   - "Is this a Level 1 (simple) feature to implement now, or a Level 2 (complex) feature to defer?"

3. **Engine Compatibility Questions**
   - "Is this design engine-agnostic?"
   - "Does it work with Arrow data operations?"
   - "Are we avoiding engine-specific dependencies?"

4. **Validation Questions**
   - "Where should validation happen?"
   - "What information does the AST need to provide?"
   - "How do we separate concerns cleanly?"

### Successful Collaboration Example: ProjectNode
See memory-bank/designProcess.md for how we designed ProjectNode through:
- Questioning type system requirements
- Grammar analysis of both languages
- Iterative refinement based on user feedback
- A design that carries type information only for transformations, avoiding over-engineering

## Common Development Tasks

### Adding New AST Node Types (Grammar-Driven Process)
177 changes: 177 additions & 0 deletions memory-bank/designProcess.md
@@ -0,0 +1,177 @@
# Design Process and Collaboration Patterns

## How We Designed ProjectNode Together

### 📋 **Initial Request and Context Gathering**
The user requested implementation of a ProjectNode to map KQL's `project` operator to TraceQL's `select` operation. Before jumping into implementation, we followed a structured discovery process:

1. **Grammar Analysis**: We examined both KQL and TraceQL grammar files to understand the exact syntax patterns
2. **Requirements Clarification**: Discussed scope, type system needs, and engine compatibility
3. **Design Questions**: Challenged assumptions about wildcards, type requirements, and validation responsibilities

### 🤔 **Key Design Questions We Explored**

#### Question 1: "Should we support wildcard projections (`*`)?"
- **Decision**: No wildcard support for Level 1
- **Reasoning**: If all fields are needed, the project operation should be omitted entirely
- **Impact**: Simplified design, clearer semantics

#### Question 2: "Do we need type information for all projections?"
- **Initial Assumption**: All projections need type info for the Expression Evaluation engine
- **Challenge**: "Why does project operation require a result type? It doesn't evaluate anything..."
- **Refined Decision**: Type info only for transformative operations (calculations, function calls)
- **Impact**: Much cleaner API, reduced complexity for simple field selections

#### Question 3: "What level of expression complexity should we support?"
- **Decision**: Level 1 (simple fields, aliases, basic arithmetic, simple functions) now, Level 2 (complex expressions) later
- **Reasoning**: Start with common use cases, document future TODOs clearly
- **Impact**: Focused implementation, clear expansion path

#### Question 4: "Where should validation happen?"
- **Decision**: The AST contains all the information needed; validation happens in downstream phases
- **Reasoning**: Separation of concerns; the AST focuses on representation, not validation
- **Impact**: Clean architecture, flexible validation strategies
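
One way to picture the Question 4 decision is a validator that lives outside the AST. This is only a sketch under stated assumptions: the `ProjectNode` record here is a simplified stand-in (field names only), and `ProjectValidator` and its `knownFields` parameter are hypothetical names, not part of the actual codebase.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified AST node: it only records what the query said.
public sealed record ProjectNode(IReadOnlyList<string> Fields);

// Hypothetical downstream phase that owns validation.
public static class ProjectValidator
{
    // Returns a list of error messages; an empty list means the node is valid.
    public static IReadOnlyList<string> Validate(ProjectNode node, ISet<string> knownFields)
    {
        var errors = new List<string>();

        if (node.Fields.Count == 0)
        {
            errors.Add("project requires at least one field");
        }

        foreach (var field in node.Fields)
        {
            if (!knownFields.Contains(field))
            {
                errors.Add($"unknown field '{field}'");
            }
        }

        return errors;
    }
}
```

Because the node carries no validation logic, a different phase (or a different engine) could apply stricter or looser rules without changing the AST.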

### 🔄 **Iterative Design Refinement**

#### Round 1: Initial Structure
```csharp
// Initial design - too rigid
public class ProjectNode : OperationNode
{
    public List<string> Fields { get; set; } // Too simple
    public Dictionary<string, string> Aliases { get; set; } // Separate aliases
}
```

#### Round 2: Expression-Based Design
```csharp
// Improved - but still had issues
public class ProjectionExpression : ASTNode
{
    public Expression Expression { get; set; }
    public string? Alias { get; set; }
    public ExpressionType ResultType { get; set; } // Always required - wrong!
}
```

#### Round 3: Final Smart Design
```csharp
// Final design - smart about when type info is needed
public class ProjectionExpression : ASTNode
{
    public required Expression Expression { get; set; }
    public string? Alias { get; set; }
    public ExpressionType? ResultType { get; set; } // Optional - only for transformations
}
```
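
To make the difference between a plain field selection and a transformation concrete, here is a rough usage sketch of the Round 3 shape. The `Identifier`, `BinaryExpression`, and `Literal` classes and the members of `ExpressionType` are assumptions made for illustration; the real Common AST may name or structure them differently.

```csharp
using System;

// Hypothetical stand-ins for the project's AST base types.
public abstract class ASTNode { }
public abstract class Expression : ASTNode { }

public sealed class Identifier : Expression
{
    public required string Name { get; init; }
}

public sealed class Literal : Expression
{
    public required object Value { get; init; }
}

public sealed class BinaryExpression : Expression
{
    public required Expression Left { get; init; }
    public required string Operator { get; init; }
    public required Expression Right { get; init; }
}

// Assumed enum members; the real ExpressionType is not shown in this document.
public enum ExpressionType { String, Integer, Float, Boolean }

// Same shape as the Round 3 design.
public class ProjectionExpression : ASTNode
{
    public required Expression Expression { get; set; }
    public string? Alias { get; set; }
    public ExpressionType? ResultType { get; set; }
}

public static class ProjectionExamples
{
    // KQL: | project field1
    // Plain field selection: no alias and no ResultType.
    public static ProjectionExpression SimpleField() => new()
    {
        Expression = new Identifier { Name = "field1" }
    };

    // KQL: | project calculation = field3 / 1000
    // Transformation: carries an alias and an explicit ResultType.
    public static ProjectionExpression Calculation() => new()
    {
        Expression = new BinaryExpression
        {
            Left = new Identifier { Name = "field3" },
            Operator = "/",
            Right = new Literal { Value = 1000 }
        },
        Alias = "calculation",
        ResultType = ExpressionType.Float
    };
}
```

Only the second projection sets `ResultType`, which is the behavior the optional property is meant to allow.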

### 🎯 **Design Patterns We Established**

#### Pattern 1: Grammar-Driven Design
- Always start by analyzing the actual grammar files
- Understand the syntax before designing the AST representation
- Map language constructs directly to AST nodes

#### Pattern 2: Progressive Complexity
- Implement Level 1 features first (common cases)
- Document Level 2 features as TODOs with clear comments
- Show examples of future complexity in code comments

#### Pattern 3: Smart Type System
- Don't over-engineer simple cases
- Type information only where actually needed
- Let downstream systems infer types when possible

#### Pattern 4: Cross-Language Compatibility
- Use keywords to distinguish language-specific syntax
- Design AST nodes to represent concepts, not syntax
- Enable round-trip generation to different languages
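
As a rough illustration of Pattern 4, a single concept-level node can be rendered into either language. The sketch below assumes a stripped-down `ProjectNode` holding only field names, and it assumes every field is span-scoped on the TraceQL side; both the node shape and the `ProjectGenerators` class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified node: field names only, no aliases or calculations.
public sealed class ProjectNode
{
    public required IReadOnlyList<string> Fields { get; init; }
}

public static class ProjectGenerators
{
    // Renders the node using KQL's pipe-and-keyword form.
    public static string ToKql(ProjectNode node) =>
        "| project " + string.Join(", ", node.Fields);

    // Renders the node using TraceQL's function-call form.
    // Assumption: every field lives in the span scope.
    public static string ToTraceQl(ProjectNode node) =>
        "select(" + string.Join(", ", node.Fields.Select(f => "span." + f)) + ")";
}
```

For a node with fields `field1` and `field2`, `ToKql` yields `| project field1, field2` and `ToTraceQl` yields `select(span.field1, span.field2)`.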

### 💡 **Critical Design Insights**

#### Insight 1: "Type Information Isn't Always Needed"
The breakthrough moment was realizing that simple field selections (`name`, `duration`) don't need type information because the Expression Evaluation engine can infer types from the data schema. Type info is only needed for transformations.
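
A minimal sketch of that inference step, assuming the schema is available as a simple name-to-type map. The `Projection` record, the `TypeResolver` class, and the enum members are hypothetical and only illustrate the fallback order: declared type first, schema lookup second.

```csharp
using System;
using System.Collections.Generic;

// Assumed enum members; the real ExpressionType is not shown in this document.
public enum ExpressionType { String, Integer, Float, Boolean }

// Hypothetical, simplified projection: an output name plus an optional declared type.
public sealed record Projection(string Name, ExpressionType? ResultType = null);

public static class TypeResolver
{
    // Returns the declared type when present; otherwise looks the name up in the schema.
    public static ExpressionType Resolve(
        Projection projection,
        IReadOnlyDictionary<string, ExpressionType> schema)
    {
        if (projection.ResultType is ExpressionType declared)
        {
            return declared;
        }

        if (schema.TryGetValue(projection.Name, out var inferred))
        {
            return inferred;
        }

        throw new KeyNotFoundException($"Field '{projection.Name}' is not in the schema.");
    }
}
```

A simple selection such as `name` resolves through the schema, while a transformation supplies its own `ResultType` and never touches the schema.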

#### Insight 2: "AST Should Represent Intent, Not Syntax"
Rather than literally translating syntax, we designed the AST to represent the semantic intent: "project these expressions with these optional aliases and types."

#### Insight 3: "Future-Proofing Through Documentation"
By clearly documenting Level 2 TODOs in comments, we make it easy for future developers to understand the expansion path without over-engineering the current implementation.

## Collaborative Decision-Making Process

### 🤝 **How We Made Design Decisions**

1. **Question Everything**: Started with "why does this need...?"
2. **Analyze Examples**: Looked at real KQL and TraceQL query patterns
3. **Challenge Assumptions**: "Is this really needed for all cases?"
4. **Iterate Quickly**: Made changes based on new insights
5. **Document Decisions**: Captured reasoning for future reference

### 📝 **Documentation Strategy**

#### In-Code Documentation
- Comments explaining Level 1 vs Level 2 support
- Examples showing usage patterns
- Clear reasoning for design choices

#### Memory Bank Updates
- Progress tracking with implementation status
- Design process documentation (this file)
- Patterns for future implementations

### 🔍 **Questions to Ask for Future Designs**

When designing new AST constructs, always ask:

1. **Grammar Analysis**
   - What does the actual grammar say?
   - How do both languages express this concept?
   - What are the edge cases in the syntax?

2. **Type System**
   - Is type information actually needed here?
   - Can downstream systems infer this information?
   - What are the performance implications?

3. **Expression Complexity**
   - What's the minimum viable implementation?
   - How will we handle complex cases later?
   - Where should we document future TODOs?

4. **Engine Compatibility**
   - Is this design engine-agnostic?
   - Does it work with Arrow data operations?
   - Are we avoiding engine-specific dependencies?

5. **Validation Strategy**
   - Where should validation happen?
   - What information does the AST need to provide?
   - How do we separate concerns cleanly?

## Recommendations for Future Collaborations

### 🎯 **For Users**
When requesting new features:
- **Challenge the AI**: Ask "why do we need this?" and "what are the alternatives?"
- **Provide Examples**: Show real-world usage patterns you want to support
- **Ask Questions**: Request explanations of design choices and trade-offs
- **Iterate**: Be willing to refine requirements based on technical insights

### 🤖 **For AI Assistants**
When implementing new features:
- **Start with Grammar**: Always analyze the actual language grammars first
- **Ask Clarifying Questions**: Don't assume requirements, ask for details
- **Propose Options**: Present multiple design approaches with trade-offs
- **Document Decisions**: Capture the reasoning behind design choices
- **Plan for Growth**: Design Level 1 with clear path to Level 2

### 🏗️ **Architecture Principles**
- **Grammar-Driven**: Let language specifications guide AST design
- **Engine-Agnostic**: Avoid dependencies on specific execution engines
- **Type-Conscious**: Be smart about when type information is needed
- **Progressive**: Implement common cases first, document complex cases as TODOs
- **Collaborative**: Use questions and challenges to improve design quality

This collaborative approach resulted in a much better ProjectNode design than either human or AI could have achieved alone. The key was the iterative questioning and refinement process that led to genuine insights about type systems and AST design.
11 changes: 6 additions & 5 deletions memory-bank/progress.md
@@ -104,11 +104,12 @@
- Error handling and validation
- Graphviz output generation

-### Extended Query Operations 📋
-- **Project Operations**: Select/project functionality
-  - Column selection and aliasing
-  - Expression projection
-  - AST representation and processing
+### Extended Query Operations ✅
+- **Project Operations**: Select/project functionality ✅
+  - Column selection and aliasing ✅
+  - Expression projection ✅
+  - AST representation and processing ✅
+  - KQL Visitor Support ✅

- **Summarize Operations**: Aggregation functionality
- Group by operations
96 changes: 96 additions & 0 deletions memory-bank/systemPatterns.md
@@ -326,3 +326,99 @@ Consistent error handling across components:
- Clear error messages with actionable guidance

This architecture provides a solid foundation for cross-language query processing while maintaining extensibility for future enhancements.

## Visitor Pattern Requirements for New AST Constructs

**CRITICAL**: When implementing new AST node types, the visitor pattern must be updated to handle the new constructs. This is essential for complete implementation.

### Required Visitor Updates

When adding a new AST construct (like ProjectNode), you must update **ALL** visitor implementations:

1. **KqlToCommonAstVisitor** (most critical)
   - Add new case to the `Visit(SyntaxNode node)` switch statement
   - Implement the corresponding `VisitXxxOperator()` method
   - Handle language-specific syntax parsing
   - Convert to Common AST representation

2. **Future Visitors** (TraceQL, others)
   - Any future visitor implementations must also support the new constructs
   - Follow the same pattern for consistency

### Example: Adding ProjectNode Support

**Step 1**: Add case to switch statement
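```csharp
case SyntaxKind.ProjectOperator:
    // Direct cast instead of `as`: the case label implies the node type,
    // and a cast fails loudly where `as` would silently pass null.
    VisitProjectOperator((ProjectOperator)node);
    break;
```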

**Step 2**: Implement visitor method
```csharp
private void VisitProjectOperator(ProjectOperator node)
{
    // Parse KQL project syntax
    // Extract expressions and aliases
    // Convert to Common AST ProjectNode
    // Add to query operations
}
```

**Step 3**: Handle language-specific nuances
- KQL: `| project field1, alias = field2, calculation = field3 / 1000`
- TraceQL: `select(span.field1, span.field2)` (different syntax, same concept)

### Common Visitor Patterns

**Expression Stack Pattern**: Use a stack to handle nested expressions
```csharp
// Holds expressions produced by nested Visit calls until the parent consumes them.
private readonly Stack<Expression> _expressionStack = new Stack<Expression>();

// In visitor methods:
Visit(childExpression);
if (_expressionStack.Count > 0)
{
    var expr = _expressionStack.Pop();
    // Use expression
}
```
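
The following self-contained sketch shows the same stack idea end to end for a nested binary expression. All of the types here (`NameSyntax`, `NumberSyntax`, `BinarySyntax`, and the small `Expression` hierarchy) are hypothetical stand-ins rather than the real parser or Common AST types; the point is only the push and pop ordering.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for parser syntax nodes.
public abstract record SyntaxNode;
public sealed record NameSyntax(string Name) : SyntaxNode;
public sealed record NumberSyntax(double Value) : SyntaxNode;
public sealed record BinarySyntax(SyntaxNode Left, string Op, SyntaxNode Right) : SyntaxNode;

// Hypothetical stand-ins for Common AST expressions.
public abstract record Expression;
public sealed record Identifier(string Name) : Expression;
public sealed record Literal(double Value) : Expression;
public sealed record BinaryExpression(Expression Left, string Op, Expression Right) : Expression;

public sealed class ExpressionVisitor
{
    private readonly Stack<Expression> _expressionStack = new();

    // Converts a syntax tree into an expression tree.
    public Expression Convert(SyntaxNode root)
    {
        Visit(root);
        return _expressionStack.Pop();
    }

    // Every Visit call leaves exactly one expression on the stack.
    private void Visit(SyntaxNode node)
    {
        switch (node)
        {
            case NameSyntax name:
                _expressionStack.Push(new Identifier(name.Name));
                break;

            case NumberSyntax number:
                _expressionStack.Push(new Literal(number.Value));
                break;

            case BinarySyntax binary:
                Visit(binary.Left);
                Visit(binary.Right);
                // Operands come off in reverse order: right first, then left.
                var right = _expressionStack.Pop();
                var left = _expressionStack.Pop();
                _expressionStack.Push(new BinaryExpression(left, binary.Op, right));
                break;

            default:
                throw new NotSupportedException($"Unsupported node: {node.GetType().Name}");
        }
    }
}
```

Popping the right operand before the left is the detail that is easiest to get wrong; reversing the two would silently swap the operands of non-commutative operators such as `/`.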

**Separated Elements Pattern**: Handle comma-separated lists
```csharp
foreach (var separatedElement in node.Expressions)
{
    var actualExpression = separatedElement.Element;
    Visit(actualExpression);
}
```

**Alias Handling Pattern**: Extract optional aliases from syntax
```csharp
if (column is SimpleNamedExpression namedExpr)
{
    string? alias = namedExpr.Name?.SimpleName;
    // Process with alias
}
else
{
    // Process without alias
}
```

### Testing Visitor Implementation

Always test visitor updates with:
1. **Simple queries**: Basic syntax verification
2. **Complex queries**: Nested expressions, multiple operations
3. **Edge cases**: Empty lists, null expressions
4. **Integration tests**: End-to-end query processing

### Memory Bank Update Requirement

When adding visitor support for new constructs, you **MUST** update:
- `memory-bank/progress.md`: Mark visitor support as complete
- `memory-bank/systemPatterns.md`: Document any new patterns
- `memory-bank/designProcess.md`: Capture design decisions

This ensures future developers understand the complete implementation requirements for new AST constructs.