GiZano/xml-multilang-core

🌍 XML MultiLang Core

Interactive DOM Parsing & Data Extraction CLI

Python XML i18n CLI


📖 About The Project

An interactive, console-based Python application designed to parse, navigate, and extract structured data from complex XML files. Built with modularity in mind, it utilizes the Document Object Model (DOM) to safely query nodes and attributes, all wrapped in a robust multi-language interface.

✨ Key Features

  • 🌳 Native DOM Parsing: Built entirely on Python's standard xml.dom.minidom library, requiring zero external dependencies.
  • 🌍 Built-in i18n (Internationalization): Uses a dictionary-based translation system that supports Italian, English, and Dutch (Nederlands).
  • 🎛️ Interactive CLI: An intuitive command-line interface with input validation, error handling, and a modern match/case routing system.
  • 🛡️ Safe Data Extraction: Implements dictionary-based data aggregation and safe condition-checking to handle unavailable or incomplete records gracefully.

🏗️ System Architecture

The project is structured with strict functional modularity to separate data ingestion, business logic, and user interface:

  • Core Setup (setup, get_centers): Handles file I/O operations and builds the initial DOM tree in memory.
  • Logic Modules (exercise_1 to exercise_4): Isolated functions that perform specific node traversals and targeted data extraction.
  • Presentation Layer (Main Loop): Manages the state of the application (language selection), user inputs, and formatted console outputs.

🚀 Quick Start

1. Clone the repository:

git clone https://github.com/GiZano/xml-multilang-core.git
cd xml-multilang-core

2. Ensure you have the dataset: Make sure the tourist-info-centers_v2.xml file is placed in the root directory of the project.

3. Launch the Application:

python main.py

(Note: Python 3.10 or higher is required because the application uses structural pattern matching, i.e. match/case.)


🧠 Technical Learnings & Challenges

Working with real-world XML data presents unique parsing challenges that this core successfully mitigates:

Note

The NoneType Trap: In XML DOM parsing, an empty tag (like <phone2/>) has no child nodes, so its firstChild is None. Reading .firstChild.nodeValue on such a tag raises an AttributeError. This project aggregates extracted data into dictionaries and checks them for truthiness (if data:) before use, which prevents runtime crashes when a query targets a non-existent or malformed center record.
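A minimal sketch of one way to guard against this trap with xml.dom.minidom is shown below. The helper name `text_of` and its `default` parameter are hypothetical and not taken from the project's code.

```python
from xml.dom import minidom


def text_of(parent, tag: str, default=None):
    """Return the text of the first <tag> under parent, or default."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes:
        return default  # the tag is missing entirely
    child = nodes[0].firstChild
    if child is None:
        return default  # empty tag such as <phone2/>
    return child.nodeValue


doc = minidom.parseString(
    "<element><phone1>+32 50 28 91 38</phone1><phone2/></element>"
)
center = doc.documentElement

print(text_of(center, "phone1"))
print(text_of(center, "phone2", default="n/a"))
```

The two early returns cover both failure modes separately: a tag that does not exist at all, and a tag that exists but is empty.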


📡 Data Standard Example

The application is currently tailored to parse tourist center data adhering to the following XML structure:

<root>
    <element>
        <name>Toerisme Beernem</name>
        <street>Bloemendalestraat</street>
        <house_number>140</house_number>
        <postal_code>8730</postal_code>
        <city_name>Beernem</city_name>
        <phone1>+32 50 28 91 38</phone1>
    </element>
</root>
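To show how a record in this structure might be aggregated into a dictionary and checked with `if data:`, here is a small sketch. The function `find_center` and the choice to return an empty dictionary when nothing matches are assumptions for illustration, not the project's confirmed behavior.

```python
from xml.dom import minidom

SAMPLE = """<root>
    <element>
        <name>Toerisme Beernem</name>
        <street>Bloemendalestraat</street>
        <house_number>140</house_number>
        <postal_code>8730</postal_code>
        <city_name>Beernem</city_name>
        <phone1>+32 50 28 91 38</phone1>
    </element>
</root>"""


def find_center(dom, name: str) -> dict:
    """Collect the fields of the center called name; empty dict if not found."""
    for element in dom.getElementsByTagName("element"):
        record = {}
        for child in element.childNodes:
            # Skip whitespace text nodes between tags and skip empty tags.
            if child.nodeType == child.ELEMENT_NODE and child.firstChild is not None:
                record[child.tagName] = child.firstChild.nodeValue
        if record.get("name") == name:
            return record
    return {}


dom = minidom.parseString(SAMPLE)
data = find_center(dom, "Toerisme Beernem")
if data:  # an empty dict is falsy, so unknown centers are skipped safely
    print(f"{data['street']} {data['house_number']}, {data['postal_code']} {data['city_name']}")
```

Because an empty dictionary is falsy, the caller needs no exception handling for the "center not found" case.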

Developed by GiZano
