An interactive, console-based Python application designed to parse, navigate, and extract structured data from complex XML files. Built with modularity in mind, it utilizes the Document Object Model (DOM) to safely query nodes and attributes, all wrapped in a robust multi-language interface.
- Native DOM Parsing: Built entirely on Python's standard `xml.dom.minidom` library, requiring zero external dependencies.
- Built-in i18n (Internationalization): Features a dictionary-based translation system supporting Italian, English, and Dutch (Nederlands).
- Interactive CLI: A command-line interface with input validation, error handling, and a `match`/`case` routing system.
- Safe Data Extraction: Implements dictionary-based data aggregation and safe condition-checking to handle unavailable or incomplete records gracefully.
The project is structured with strict functional modularity to separate data ingestion, business logic, and user interface:
- Core Setup (`setup`, `get_centers`): Handles file I/O operations and builds the initial DOM tree in memory.
- Logic Modules (`exercise_1` to `exercise_4`): Isolated functions that perform specific node traversals and targeted data extraction.
- Presentation Layer (Main Loop): Manages the state of the application (language selection), user inputs, and formatted console outputs.
1. Clone the repository:
   git clone https://github.com/yourusername/xml-multilang-core.git
   cd xml-multilang-core
2. Ensure you have the dataset:
   Make sure the `tourist-info-centers_v2.xml` file is placed in the root directory of the project.
3. Launch the application:
   python main.py
   (Note: Python 3.10 or higher is required due to the use of structural pattern matching.)
Working with real-world XML data presents unique parsing challenges that this core successfully mitigates:
Note
The NoneType Trap: In standard XML DOM parsing, empty tags (like `<phone2/>`) have no text node, so their `firstChild` is `None`. Reading `.firstChild.nodeValue` directly therefore raises an `AttributeError`. This project guards its data extraction and checks each result dictionary before use (`if data:`), which prevents runtime crashes when querying non-existent or malformed center records.
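One way to guard against this is a small helper that checks `firstChild` before reading it. This is a minimal sketch, not the project's actual code: the helper name `text_of` and its `default` parameter are hypothetical.

```python
from xml.dom import minidom


def text_of(parent, tag: str, default: str = "") -> str:
    """Return the text of the first <tag> under parent, or default if missing or empty."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes:
        return default  # the tag is absent altogether
    first = nodes[0].firstChild
    if first is None:
        return default  # empty tag such as <phone2/>
    # nodeValue is None when the first child is itself an element.
    return (first.nodeValue or default).strip()


doc = minidom.parseString(
    "<element><name>Toerisme Beernem</name><phone2/></element>"
)
center = doc.documentElement
print(text_of(center, "name"))           # Toerisme Beernem
print(text_of(center, "phone2", "n/a"))  # n/a
```

Returning a default instead of raising lets callers build a record dictionary in one pass and decide afterwards how to display missing fields.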
The application is currently tailored to parse tourist center data adhering to the following XML structure:
<root>
<element>
<name>Toerisme Beernem</name>
<street>Bloemendalestraat</street>
<house_number>140</house_number>
<postal_code>8730</postal_code>
<city_name>Beernem</city_name>
<phone1>+32 50 28 91 38</phone1>
</element>
</root>

Developed by GiZano