UNIVERSITY OF WEST ATTICA
SCHOOL OF ENGINEERING
DEPARTMENT OF COMPUTER ENGINEERING AND INFORMATICS
Compilers
Vasileios Evangelos Athanasiou
Student ID: 19390005
Georgios Theocharis
Student ID: 19390283
Ioannis Iliou
Student ID: 19390066
Pantelis Tatsis
Student ID: 20390226
Vasileios Dominaris
Student ID: 21390055
Supervision
Supervisor: Christos Troussas, Assistant Professor
Co-supervisor: Michalis Iordanakis, Academic Scholar
Athens, May 2024
This project involves the development of a compiler for Uni-C, a subset of the C programming language. The implementation was completed in three distinct phases, covering the fundamental stages of compiler construction:
- Finite State Machine (FSM) Encoding: design and simulation of automata for recognizing lexical units.
- Lexical Analysis (FLEX): development of a lexical analyzer that identifies tokens using regular expressions.
- Syntactic Analysis (BISON): construction of a parser that validates program structure based on predefined grammar rules.
| Section | Folder | Description |
|---|---|---|
| 1 | A-FLEX/ | Lexical analysis phase using Finite State Machines and FLEX |
| 1.1 | A-FLEX/A2-FSM/ | FSM design and implementation for Uni-C tokens |
| 1.1.1 | A-FLEX/A2-FSM/docs/ | FSM theory notes, transition tables, and documentation (PDF/XLSX) |
| 1.1.2 | A-FLEX/A2-FSM/src/ | FSM source files for identifiers, strings, numbers, comments, and whitespace |
| 1.2 | A-FLEX/A3-FLEX/ | FLEX-based lexical analyzer implementation |
| 1.2.1 | A-FLEX/A3-FLEX/docs/ | FLEX code documentation |
| 1.2.2 | A-FLEX/A3-FLEX/src/ | FLEX source code, Makefile, input/output samples |
| 1.3 | A-FLEX/assign/ | Assignment descriptions for Part A (FSM & FLEX) |
| 2 | B-BISON/ | Syntax analysis phase using BISON |
| 2.1 | B-BISON/assign/ | Assignment descriptions for Part B (BISON) |
| 2.2 | B-BISON/B2-FLEX-BISON/ | Combined FLEX & BISON parser implementation |
| 2.2.1 | B-BISON/B2-FLEX-BISON/src/ | Integrated lexer/parser source code and build files |
| 2.3 | B-BISON/B3-COMPILER/ | Final compiler stage |
| 2.3.1 | B-BISON/B3-COMPILER/docs/ | BISON grammar documentation |
| 2.3.2 | B-BISON/B3-COMPILER/src/ | Final Uni-C compiler source code |
| 3 | Uni-C/ | Language specification and usage guide for Uni-C |
| 4 | README.md | Project documentation |
| 5 | INSTALL.md | Usage instructions |
The compiler recognizes the following categories of tokens:
- Identifiers: names for variables and functions
  - Pattern: `[a-zA-Z_][a-zA-Z0-9_]{0,31}` (a letter or underscore followed by up to 31 letters, digits, or underscores)
- Keywords: reserved words such as `if`, `else`, `while`, `int`, `return`, `func`
- Constants: supported constant types include:
  - Integers (decimal, octal, hexadecimal)
  - Floating-point numbers
  - Strings
- Operators
  - Arithmetic: `+`, `-`, `*`, `/`
  - Relational: `>`, `<`, `==`
  - Logical: `&&`, `||`
- Delimiters
  - Characters such as `;`, used to separate statements
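The token categories above can be illustrated with a small regular-expression tokenizer. This is a minimal Python sketch, not the project's FLEX source: the identifier pattern, keywords, and operators are taken from the list above, while the exact shapes of the integer, float, and string patterns, and every name in the code (`TOKEN_SPEC`, `tokenize`, the `TOKEN_ERROR` label), are assumptions made for illustration.

```python
import re

# Keywords listed in this README; the real Uni-C lexer may reserve more.
KEYWORDS = {"if", "else", "while", "int", "return", "func"}

# Order matters: earlier alternatives win, so floats are tried before
# integers and two-character operators before single-character ones.
TOKEN_SPEC = [
    ("FLOAT",       r"\d+\.\d+"),                             # assumed form: digits.digits
    ("INTEGER",     r"0[xX][0-9a-fA-F]+|0[0-7]*|[1-9]\d*"),   # hex, octal, decimal
    ("STRING",      r'"(?:[^"\\\n]|\\.)*"'),                  # assumed: double-quoted, with escapes
    ("IDENTIFIER",  r"[a-zA-Z_][a-zA-Z0-9_]{0,31}"),          # pattern from the README
    ("OPERATOR",    r"==|&&|\|\||[+\-*/<>]"),
    ("DELIMITER",   r";"),
    ("SKIP",        r"\s+"),
    ("TOKEN_ERROR", r"."),                                    # any other single character
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (category, lexeme) pairs for the given source text."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not one itself
        if kind == "IDENTIFIER" and text in KEYWORDS:
            kind = "KEYWORD"  # keywords share the identifier pattern
        tokens.append((kind, text))
    return tokens
```

Calling `tokenize("return 3.14;")` yields a keyword, a float, and a delimiter. Characters outside the listed categories fall through to the catch-all rule, which mirrors how a lexer can report invalid tokens instead of stopping.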
For each token category, a Finite State Automaton (FSA) was designed.
Example – Identifiers:
- Starts at an initial state (SZ)
- Transitions to a middle-character state (SMCH) upon receiving a letter or underscore
- Reaches a GOOD exit state upon encountering a newline, provided the identifier is valid
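The identifier automaton described above can be simulated in a few lines. This is a Python sketch under stated assumptions, not the contents of the project's `.fsm` files: the state names `SZ`, `SMCH`, and `GOOD` come from the description, whereas the `BAD` reject state, the self-loop on `SMCH` for letters, digits, and underscores (inferred from the identifier pattern), and the function names are hypothetical. The 32-character length limit is not enforced here.

```python
import string

# SZ, SMCH and GOOD are named in the README; BAD is a hypothetical
# reject state added for this sketch.
SZ, SMCH, GOOD, BAD = "SZ", "SMCH", "GOOD", "BAD"

FIRST_CHARS = set(string.ascii_letters + "_")   # valid first character
NEXT_CHARS = FIRST_CHARS | set(string.digits)   # valid following characters


def step(state, ch):
    """Return the next state for one input character."""
    if state == SZ:
        return SMCH if ch in FIRST_CHARS else BAD
    if state == SMCH:
        if ch in NEXT_CHARS:
            return SMCH          # assumed self-loop while the identifier continues
        if ch == "\n":
            return GOOD          # a newline ends a valid identifier
        return BAD
    return BAD                   # GOOD and BAD accept no further input


def run_identifier_fsm(text):
    """Feed every character of text to the automaton and return the final state."""
    state = SZ
    for ch in text:
        state = step(state, ch)
    return state
```

For example, `run_identifier_fsm("count_1\n")` ends in `GOOD`, while `run_identifier_fsm("1abc\n")` is rejected on its first character because a digit cannot start an identifier.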
The BISON parser generator is used to define and enforce grammar rules for Uni-C programs:
- Variable Declarations
  - Support for simple data types and arrays
- Functions
  - Recognition of both built-in and user-defined functions
- Expressions
  - Handling of simple and compound expressions
- Error Handling
  - Detection and reporting of syntax errors
  - Handling of invalid tokens (`TOKEN ERROR`)
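To make the declaration and error-handling rules above concrete, here is a hand-written recursive-descent sketch in Python. It is not the project's grammar: BISON generates table-driven LALR(1) parsers from a `.y` file, and this example swaps in a simpler technique purely for illustration. The grammar `int NAME;` or `int NAME[SIZE];`, the error message wording, and all names (`ParseError`, `lex`, `parse_declaration`) are assumptions.

```python
import re

# Hypothetical mini-grammar: declaration := "int" ID [ "[" INT "]" ] ";"
TOKEN_RE = re.compile(r"(?P<INT>\d+)|(?P<ID>[a-zA-Z_][a-zA-Z0-9_]*)|(?P<SYM>[\[\];])")


class ParseError(Exception):
    """Raised for invalid tokens and for syntax errors."""


def lex(source):
    """Split source into (kind, text) tokens, reporting invalid characters."""
    tokens, pos = [], 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ParseError(f"TOKEN ERROR at position {pos}: {source[pos]!r}")
        kind, text = match.lastgroup, match.group()
        if kind == "ID" and text == "int":
            kind = "TYPE"  # only `int` is treated as a type in this sketch
        tokens.append((kind, text))
        pos = match.end()
    return tokens


def parse_declaration(source):
    """Parse one declaration and return its type, name, and array size (or None)."""
    tokens = lex(source)
    pos = 0

    def expect(kind, text=None):
        nonlocal pos
        wanted = text or kind
        if pos >= len(tokens):
            raise ParseError(f"syntax error: expected {wanted}, found end of input")
        tok_kind, tok_text = tokens[pos]
        if tok_kind != kind or (text is not None and tok_text != text):
            raise ParseError(f"syntax error: expected {wanted}, found {tok_text!r}")
        pos += 1
        return tok_text

    expect("TYPE")
    name = expect("ID")
    size = None
    if pos < len(tokens) and tokens[pos] == ("SYM", "["):
        expect("SYM", "[")
        size = int(expect("INT"))
        expect("SYM", "]")
    expect("SYM", ";")
    if pos != len(tokens):
        raise ParseError(f"syntax error: unexpected {tokens[pos][1]!r}")
    return {"type": "int", "name": name, "array_size": size}
```

`parse_declaration("int a[10];")` returns a dictionary with `array_size` set to 10, while a missing semicolon or an unknown character raises `ParseError`. Separating the two failure modes (invalid token versus syntax error) follows the distinction drawn in the list above.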
- `1_identifiers.fsm`: FSM encoding for identifier recognition
- `simple-flex-code.l`: FLEX source file containing regular expressions and token definitions
- `token.h`: header file defining numeric constants for tokens
- `simple-bison-code.y`: BISON source file containing grammar and syntax rules

