In Step 4, we filled in the blanks. We took the promises made in the Step 3 headers and delivered the logic in the `.c` files.
The scanner is a State Machine. It remembers where it is (the `current` index).

- Core Loop: `scanner_next` returns one char and bumps the index.
- Safety: Always checks `if (current >= length)` to prevent reading invalid memory (segfaults).
The tokenizer acts as a Consumer of the scanner.
- Peeks at the next char.
- Decides what to do (Is it a word? A number?).
- Consumes characters until the type changes.
Key Logic: the `while` loop

```c
if (isalpha(c)) {
    while (isalpha(scanner_peek(scanner))) {
        scanner_next(scanner); // Eat the character
    }
}
```

This simple loop grabs an entire word.
This module is a Passive Observer. It doesn't change data; it just tallies it.

- Uses `switch (token.type)` to categorize and count.
This module demonstrates Data Transformation.

- Current Limitation: Since we avoided complex memory allocation (`malloc`), we just print the normalized output. In a production engine, this would allocate a new string.
- Pointer Arithmetic: `token.start[i]` accesses memory directly.
- Modularity: The Tokenizer doesn't know about file endings; it just asks the Scanner.
- Defensive Programming: Checking for `\0` (null terminators) everywhere.