Research code and data accompanying the study on geographical bias and response quality of AI-driven language models (e.g. ChatGPT) across North America, Europe, and Africa, with a focus on topics in public administration.
We investigate bias and quality of state-of-the-art AI-driven language models through a geographical lens. Using OpenAI’s ChatGPT and a diverse, non-representative sample of respondents from North America, Europe, and Africa, we assess the model’s advice on issues of public interest. The study focuses on topics such as paying taxes, Indigenous Peoples’ rights and self-determination, whistleblowing, and related themes in public administration. We apply machine-learning methods to evaluate the quality and topical orientation of ChatGPT’s responses and compare them across regions. The analysis is grounded in public administration literature and aims to support the development of region-fair conversational AI and to inform policymakers, educators, and users.
genAI_regional_study/
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── CITATION.cff
├── requirements.txt
│
├── GML_Lab_EN.csv # English responses (full)
├── GML_Lab_FR.csv # French responses (full)
├── gmlLab-Reponses-6-mai-2024.csv # Combined EN + FR responses
├── FR_monthly_aggregated_data.csv # French, aggregated by month
├── topics_GML_Lab_FR.csv # French topic assignments
├── https-doi.org10.1007s43681-025-00906-2.pdf # Published article (PDF)
│
├── EN_topic_model.ipynb # Topic model, English
├── FR_topics_modelling.ipynb # Topic model, French
└── article_code_2_analyse_descr_des_reponses.ipynb # Descriptive analysis
Notebooks expect to be run from the repository root so that relative paths to the CSV files resolve correctly (e.g. in Jupyter, VS Code, or Google Colab after cloning).
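A quick way to confirm that the working directory is the repository root is to check for the CSV files listed in the layout above. The following is a minimal sketch, not part of the repository: the helper name `missing_data_files` and the constant `EXPECTED_CSV_FILES` are hypothetical, and only the file names come from this README.

```python
from pathlib import Path

# File names taken from the repository layout in this README.
EXPECTED_CSV_FILES = [
    "GML_Lab_EN.csv",
    "GML_Lab_FR.csv",
    "gmlLab-Reponses-6-mai-2024.csv",
    "FR_monthly_aggregated_data.csv",
    "topics_GML_Lab_FR.csv",
]


def missing_data_files(root="."):
    """Return the expected CSV files that are not present directly under root.

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    root_path = Path(root)
    return [name for name in EXPECTED_CSV_FILES if not (root_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Not at the repository root? Missing:", ", ".join(missing))
    else:
        print("All data files found; relative paths should resolve.")
```

Running this in the first cell of a notebook (or from a terminal) before executing the analysis makes a wrong working directory visible early, rather than as a file-not-found error further down.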
- Python 3.8+
- Jupyter (optional; notebooks can be run in Colab or another environment)
- Clone the repository

      git clone https://github.com/snsn3/genAI_regional_study.git
      cd genAI_regional_study

- Create a virtual environment (recommended)

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install dependencies

      pip install -r requirements.txt

- Run the notebooks
  - Open any `.ipynb` in Jupyter, JupyterLab, or VS Code, or use the “Open in Colab” link inside each notebook.
  - Execute cells from the top; notebooks assume the working directory is the repo root (where the CSV files are).
| File | Description |
|---|---|
| `GML_Lab_EN.csv` | English responses (full dataset). |
| `GML_Lab_FR.csv` | French responses (full dataset). |
| `gmlLab-Reponses-6-mai-2024.csv` | All responses (English and French). |
| `FR_monthly_aggregated_data.csv` | French responses aggregated by month. |
| `topics_GML_Lab_FR.csv` | Topic labels for French responses. |
| `EN_topic_model.ipynb` | Topic modelling for English responses (LDA, TF–IDF). |
| `FR_topics_modelling.ipynb` | Topic modelling for French responses. |
| `article_code_2_analyse_descr_des_reponses.ipynb` | Descriptive analysis (response length, time series, regional comparisons). |
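
Before opening the notebooks, it can help to look at the column names and size of each data file. The sketch below uses only the standard library and makes no claim about the actual schema. The function name `summarize_csv` is hypothetical, and the comma delimiter, header row, and UTF-8 encoding (with or without a byte-order mark) are assumptions that may need adjusting for these files.

```python
import csv
from pathlib import Path


def summarize_csv(path, encoding="utf-8-sig"):
    """Return (column_names, data_row_count) for a CSV file with a header row.

    Assumes a comma-delimited file; "utf-8-sig" reads UTF-8 with or without a BOM.
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows at all
        row_count = sum(1 for _ in reader)  # quoted multi-line fields count as one record
    return header, row_count


if __name__ == "__main__":
    # File names from the table above; run from the repository root.
    for name in ("GML_Lab_EN.csv", "GML_Lab_FR.csv"):
        if Path(name).is_file():
            columns, rows = summarize_csv(name)
            print(f"{name}: {rows} rows, columns = {columns}")
        else:
            print(f"{name}: not found (run from the repository root)")
```

Counting records through `csv.reader` rather than counting lines matters for free-text data, since a quoted response may span several physical lines.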
The associated article is available as:
- PDF (local): `https-doi.org10.1007s43681-025-00906-2.pdf`
- DOI: 10.1007/s43681-025-00906-2
If you use this code or data in your work, please cite the paper and this repository:
Paper (BibTeX):
@article{nzobonimpa2025regional,
author = {Nzobonimpa, Stany and others},
title = {Study on {LLMs'} Regional Bias},
journal = {AI and Ethics},
year = {2025},
doi = {10.1007/s43681-025-00906-2}
}

Code and data:
Stany Nzobonimpa. genAI_regional_study. GitHub. https://github.com/snsn3/genAI_regional_study.
You can also use the CITATION.cff file in this repository for automated citation.
This doctoral research is funded by the Social Sciences and Humanities Research Council of Canada (SSHRC) through the Vanier Canada Graduate Scholarships program. Further details: ÉNAP, "Stany Nzobonimpa, lauréat Bourse Vanier 2023" (Stany Nzobonimpa, 2023 Vanier Scholarship recipient).
This project is licensed under the MIT License — see the LICENSE file for details.