
Step 4: Implementation (Source Files)

In Step 4, we filled in the blanks: we took the promises made in the Step 3 headers and delivered the logic in the .c files.

Module Breakdown

Scanner (src/scanner.c)

The scanner is a State Machine. It remembers where it is (current index).

  • Core Loop: scanner_next returns one char and bumps the index.
  • Safety: Always checks if (current >= length) to prevent reading invalid memory (Segfaults).

Tokenizer (src/tokenizer.c)

The tokenizer acts as a Consumer of the scanner.

  1. Peeks at the next char.
  2. Decides what to do (Is it a word? A number?).
  3. Consumes characters until the type changes.

Key Logic: The while loop

```c
if (isalpha(c)) {
    while (isalpha(scanner_peek(scanner))) {
        scanner_next(scanner); // Eat the character
    }
}
```

This simple loop grabs an entire word.

Stats (src/stats.c)

This module is a Passive Observer. It doesn't change data; it just tallies it.

  • Uses switch(token.type) to categorize and count.

Normalizer (src/normalizer.c)

This module demonstrates Data Transformation.

  • Current Limitation: Since we avoided complex memory allocation (malloc), we just print the normalized output. In a production engine, this would allocate a new string.

Engineering Principles Used

  • Pointer Arithmetic: token.start[i] accesses memory directly.
  • Modularity: The Tokenizer doesn't know about file endings; it just asks the Scanner.
  • Defensive Programming: Checking for \0 (null terminators) everywhere.