Skip to content

dashalary/tiny-parser

Repository files navigation

Tiny Parser

A small Haskell project exploring parser design, including tokenization, recursive descent parsing, AST construction, and evaluation.

Supports arithmetic expressions with operator precedence and simple key-value parsing.

Overview

Input → Tokens → AST → Evaluated Result

Why I Built This

I wanted to revisit parsing fundamentals and explore how expression evaluation works end-to-end:

  • tokenizing raw input
  • building an AST with operator precedence
  • evaluating expressions recursively

I also used this as an opportunity to structure a small Haskell project with clear separation of concerns across tokenizer, parser, and evaluator modules.

Features

Currently supports the following:

Arithmetic Expression Parsing

  • Parse expressions with numbers, operators (+, -, *, /), and parentheses
  • Correct operator precedence (multiplication/division before addition/subtraction)
  • Support for nested parentheses

Key-Value Pair Parsing

  • Parse simple key = value pairs
  • Support for string and numeric values

Multiple Modes

  • Standalone mode - Quick demo of example expressions
  • Interactive mode - User input with menu-driven interface
  • Test suite - Unit and integration tests

Quick Start

Prerequisites

  • Haskell (GHC) with runhaskell
  • Compatible with macOS, Linux, and Windows

Run Standalone Demo

runhaskell parser_standalone.hs

Output:

Welcome to Tiny Parser!

Arithmetic Expression Parser

Input: 1 + 5 * 4
Tokens: [TNum 1, TOp '+', TNum 5, TOp '*', TNum 4, EOF]
AST: BinOp Add (Number 1) (BinOp Mul (Number 5) (Number 4))
Result: 21

Run Interactive Mode

runhaskell interactive_parser.hs

Interactive menu:

╔═══════════════════════════════════════╗
║   Tiny Parser - Interactive Mode      ║
╚═══════════════════════════════════════╝

What would you like to do?
  1. Parse arithmetic expression
  2. Parse key-value pair
  3. Evaluate expression
  4. Demo mode (show examples)
  5. Exit

Run Tests

runhaskell tests.hs

Output:

Total Tests: 33
✓ Passed: 33
✗ Failed: 0

🎉 All tests passed!

Usage Examples

Arithmetic Expressions

Simple Operations

Input: 2 + 3
Result: 5

Input: 10 - 4
Result: 6

Input: 3 * 4
Result: 12

Input: 20 / 4
Result: 5

Operator Precedence

Input: 1 + 5 * 4
Result: 21  (5 * 4 = 20, then 1 + 20 = 21)

Input: 2 * 3 + 4 * 5
Result: 26  (2 * 3 = 6, 4 * 5 = 20, then 6 + 20 = 26)

Parentheses

Input: (7 + 9) * 3
Result: 48  (7 + 9 = 16, then 16 * 3 = 48)

Input: (10 + 5) * (2 + 3)
Result: 75  (10 + 5 = 15, 2 + 3 = 5, then 15 * 5 = 75)

Key-Value Pairs

Input: name = alice
Output: KVPair "name" "alice"

Input: age = 30
Output: KVPair "age" "30"

Input: city = newyork
Output: KVPair "city" "newyork"

Notes

A few things stood out while building this:

  • Recursive descent parsing makes operator precedence explicit and easy to reason about
  • Algebraic data types in Haskell map naturally to AST structures
  • Splitting the project into modules (AST, Tokenizer, Parser, Evaluator) clarified responsibilities and improved maintainability

Project Structure

tiny-parser/
├── README.md                    # This file
├── LICENSE                      # BSD-3-Clause
├── tiny-parser.cabal           # Cabal build configuration
├── stack.yaml                  # Stack configuration
│
├── src/                         # Modular implementation
│   ├── Main.hs                 # Entry point
│   ├── AST.hs                  # Abstract Syntax Tree definitions
│   ├── Tokenizer.hs            # Lexical analysis (tokenization)
│   ├── Parser.hs               # Syntax analysis (parsing)
│   └── Evaluator.hs            # Expression evaluation
│
├── parser_standalone.hs        # One-file standalone demo
├── interactive_parser.hs       # Interactive REPL mode
├── tests.hs                    # Test suite (33 tests)
└── build.sh                    # Build helper script

Parser Architecture

Pipeline Overview

Input → Tokenizer → Parser → AST → Evaluator → Output

The parser follows a simple three-stage pipeline:

  1. Tokenizer (Tokenizer.hs)
    Converts raw input strings into tokens

  2. Parser (Parser.hs)
    Builds an Abstract Syntax Tree (AST) with correct operator precedence

  3. Evaluator (Evaluator.hs)
    Recursively evaluates the AST

Design Patterns

  • Recursive Descent Parsing - Clean, predictable parsing with operator precedence
  • AST Representation - Type-safe expression representation
  • Left Associativity - Operations 1 + 2 + 3 evaluate as (1 + 2) + 3
  • Operator Precedence - Multiplication and division bind tighter than addition and subtraction

Test Suite

Comprehensive test coverage (33 tests) organized by category:

  • 9 Tokenizer Tests - Input tokenization
  • 9 Parser Tests - Correct AST generation
  • 3 Key-Value Parser Tests - KV pair parsing
  • 9 Evaluator Tests - Expression evaluation
  • 3 Integration Tests - Full pipeline testing

To run tests:

runhaskell tests.hs

Data Types

AST Types

data Expr
  = Number Int           -- Numeric literal
  | BinOp Op Expr Expr   -- Binary operation
  | Var String           -- Variable name

data Op = Add | Sub | Mul | Div

data KVPair = KVPair String String

Token Types

data Token
  = TNum Int         -- Number token
  | TOp Char         -- Operator token (+, -, *, /)
  | TVar String      -- Variable name
  | TLParen          -- Left parenthesis
  | TRParen          -- Right parenthesis
  | TEq              -- Equals sign
  | EOF              -- End of input

Supported Syntax

Arithmetic Expressions

  • Numbers: 0, 42, 1234
  • Operators: + (add), - (subtract), * (multiply), / (divide)
  • Grouping: (expression)
  • Precedence: * and / before + and -

Key-Value Pairs

  • Format: key = value
  • Keys: Valid Haskell identifiers
  • Values: Identifiers or numbers

Examples of Valid Input

2 + 3
10 - 5
3 * 4
20 / 5
(1 + 2) * 3
1 + 2 * 3 + 4
((10))
name = alice
age = 30

Compilation (Optional)

The parser is designed to run interpreted with runhaskell, but you can also compile with Cabal:

cabal build
cabal run tiny-parser

Note: Full compilation requires LLVM 9-13 for GHC 8.10.7. Interpreted mode works without external dependencies.

Current Limitations

  • Variables cannot be evaluated (parser accepts them but evaluator errors on undefined variables)
  • Only integer arithmetic (no floats yet)
  • No built-in functions or operators
  • Simple error messages

Future Enhancements

Potential improvements:

  • Floating point number support
  • Variable assignment and substitution
  • More operators (%, ^, etc.)
  • String literals
  • If/then/else expressions
  • Function definitions
  • Better error messages with line/column info
  • Compiled native executable

Clone and Test

git clone https://github.com/dashalary/tiny-parser.git
cd tiny-parser
runhaskell tests.hs

About

A small Haskell project exploring parser design, including tokenization, recursive descent parsing, AST construction, and evaluation. Supports arithmetic expressions with operator precedence and simple key-value parsing.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors