A command-line application and Python library that conjugates verbs in French, English, Spanish, Italian, Portuguese, and Romanian (with more languages soon to come) using machine learning.
The mlconjug3 project is now a proud member of the ARS Linguistica organization. ARS Linguistica is a community-driven, open source project that aims to develop free and accessible linguistic tools and resources for all. With a focus on advancing linguistic research, documentation, and education, ARS Linguistica is dedicated to preserving and promoting linguistic diversity through the use of open source and open science.
With mlconjug3, you can:
- Conjugate any verb in one of the supported languages, even completely new or made-up verbs, with the help of a pre-trained Machine Learning model.
- Easily modify and retrain the models using any compatible classifiers from scikit-learn.
- Integrate mlconjug3 in your own projects.
| Platform | Supported | Notes |
|---|---|---|
| Linux | ✔ Python 3.9–3.14 | Fully supported |
| macOS | ✔ Python 3.9–3.14 | Fully supported |
| Windows | ✔ Python 3.9–3.12 | Stable |
| Windows (3.13+) | ~ Experimental | SciPy / sklearn issues |
| Python 3.13+ | ~ Experimental | Native binary instability |
⚠ IMPORTANT: Windows + Python 3.13/3.14 may crash due to upstream SciPy binary incompatibilities. These builds are marked experimental in CI.
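A project that depends on mlconjug3 may want to surface this warning at startup. The following is a minimal sketch, not part of mlconjug3 itself: the function names are hypothetical, and it encodes only the Windows + Python 3.13+ combination called out above.

```python
import sys
import warnings


def is_experimental_windows_build(platform: str, version: tuple) -> bool:
    """Return True for Windows running Python 3.13 or newer.

    `platform` is a value such as sys.platform ("win32" on Windows);
    `version` is a tuple such as sys.version_info.
    """
    return platform.startswith("win") and tuple(version[:2]) >= (3, 13)


def warn_if_experimental() -> None:
    """Emit a RuntimeWarning when the current interpreter matches that case."""
    if is_experimental_windows_build(sys.platform, sys.version_info):
        warnings.warn(
            "Windows with Python 3.13+ is marked experimental for mlconjug3; "
            "SciPy binaries may be unstable.",
            RuntimeWarning,
        )


# Hypothetical usage: call once before importing heavier dependencies.
warn_if_experimental()
```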
This release introduces major internal improvements to the ML pipeline.
Improvements:
- Reworked training pipeline for stability and reproducibility
- Improved feature extraction for Italian and Romanian verb morphology
- Unified sklearn Pipeline architecture for all models
- Updated classifier to SGDClassifier (elasticnet regularization)
- Better handling of unseen verb forms via enhanced feature engineering
- Improved cross-platform consistency (Windows/macOS/Linux)
Behavioral changes:
- Slight variations in prediction outputs due to improved feature representation
- Optional sample weighting added to training pipeline
- Internal API refactoring (public API remains backward compatible)
Migration notes:
- No breaking changes in public API
- Minor variation in predictions expected due to improved model generalization
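To give a rough idea of what a unified scikit-learn Pipeline with an elastic-net SGDClassifier and optional sample weighting can look like, here is a small illustrative sketch. It is not mlconjug3's actual training code: the step names, the character n-gram features, the toy verbs, the class labels, the weights, and the hyperparameters are all assumptions chosen for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Toy data (assumption): French infinitives labelled with a made-up class name.
verbs = ["parler", "chanter", "aimer", "finir", "choisir", "grandir"]
classes = ["er", "er", "er", "ir", "ir", "ir"]

# One Pipeline object holds both feature extraction and the classifier.
pipeline = Pipeline(
    [
        # Character n-grams stand in for morphology-aware features.
        ("features", CountVectorizer(analyzer="char", ngram_range=(2, 3))),
        # Elastic net mixes L1 and L2 penalties; l1_ratio sets the balance.
        (
            "classifier",
            SGDClassifier(
                penalty="elasticnet",
                l1_ratio=0.15,
                max_iter=1000,
                tol=1e-3,
                random_state=0,
            ),
        ),
    ]
)

# Optional per-sample weights are routed to the final step with the
# "<step name>__sample_weight" convention of Pipeline.fit.
weights = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
pipeline.fit(verbs, classes, classifier__sample_weight=weights)

# Predict a class for a verb that was not in the training data.
prediction = pipeline.predict(["danser"])[0]
print(prediction)
```

Fixing `random_state` keeps repeated runs on the same platform reproducible, which is one way to approach the stability and reproducibility goals listed above.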
mlconjug3 is a useful tool for linguistic researchers: it provides conjugation tables for the six supported languages and, with the help of its pre-trained models, can also conjugate completely new or made-up verbs. This makes it well suited to exploring how novel verbs would be inflected and to comparing conjugation patterns across languages.
In addition to academic research, mlconjug3 can be integrated into a wide range of web and desktop applications. For language learning platforms, mlconjug3 provides an accurate and comprehensive source of conjugation information, helping students to quickly and easily master verb conjugation. For language translation tools, mlconjug3 can help to ensure that translations are grammatically correct, by providing accurate verb conjugation information in real-time.
By using mlconjug3, you are not only getting a powerful and flexible tool for verb conjugation, but you are also supporting the goals and mission of ARS Linguistica. Whether you are a linguistic researcher, language teacher, or simply someone who is passionate about preserving linguistic heritage, your support is crucial to the success of our organization.
Join us in our mission to make linguistic tools and resources accessible to all!
- Free software: MIT license
- Documentation: https://mlconjug3.readthedocs.io/en/latest/readme.html
- French
- English
- Spanish
- Italian
- Portuguese
- Romanian
- Gerard Canal, Senka Krivic, Paul Luff, Andrew Coles. Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22), 2022.
- Mike Hongfei Wu. A dissertation submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in partial fulfillment of the requirements for the degree of Doctor of Philosophy. May 2022.
- Spencer Ng, Lucy Teaford, Andy Yang, and Isaiah Zwick-Schachter. CMSC 25610: Computational Linguistics, University of Chicago, 2021.
- Ali Malik, Mike Wu, Vrinda Vasavada, Jinpeng Song, John Mitchell, Noah D. Goodman, and Chris Piech. Proceedings of the 34th AAAI Conference on Artificial Intelligence, 2019.
If you want to cite mlconjug3 in an academic publication, use this citation format:
@article{mlconjug3,
title={mlconjug3},
author={Sekou Diao},
journal={GitHub. Note: https://github.com/Ars-Linguistica/mlconjug3 Cited by},
year={2023}
}
- EDS-NLP provides a set of spaCy components that are used to extract information from clinical notes written in French.
- Translation flask API for the Helsinki NLP models available in the Huggingface Transformers library.
- NLP Suite is a package of tools designed for non-specialists, for scholars with no knowledge or little knowledge of Natural Language Processing.
- Runebook translates various references such as programming languages, frameworks, libraries, and APIs that software engineers refer to in development.
- This project offers tools to visualize the gender bias in pre-trained language models.
- Uses language models to generate adapted text.
- Dockerized microservice for conjugation.
- HTML transformation tool.
- A Tux bot.
- Tweets French words.
- Helps learn verb forms.
- NLP utilities.
- Detects masks.
- Generates excuses.
- NLP repository.
- Voice assistant.
- Random advice generator.
- Quiz generator.
- Rogue-like game.
- Spanish learning app.
- Vocabulary app.
You can install mlconjug3 using different methods depending on your workflow.
### 1. Install via pip (recommended)
pip install mlconjug3
### 2. Install from source (GitHub)
git clone https://github.com/Ars-Linguistica/mlconjug3.git
cd mlconjug3
pip install .
### 3. Install via conda (conda-forge)
conda install -c conda-forge mlconjug3
Starting with version 3.10, all releases of mlconjug3 published on PyPI and GitHub are signed using Sigstore.
Sigstore is an open-source project that provides simple, transparent, and secure software signing. It allows developers to sign releases without managing long-lived cryptographic keys.
Instead, Sigstore uses short-lived certificates issued through identity providers, making the signing process both secure and easy to verify.
This ensures that mlconjug3 releases on PyPI can be cryptographically verified and have not been tampered with.
You can verify mlconjug3 package signatures using cosign, which is part of the Sigstore ecosystem.
Install cosign:
# Linux (amd64 binary)
curl -O -L https://github.com/sigstore/cosign/releases/latest/download/cosign-linux-amd64
chmod +x cosign-linux-amd64
sudo mv cosign-linux-amd64 /usr/local/bin/cosign
# macOS (Homebrew)
brew install cosign
Verify a release:
cosign verify-blob \
--certificate-identity https://github.com/Ars-Linguistica/mlconjug3/.github/workflows/upload_wheels_to_pypi.yml \
--certificate-oidc-issuer https://token.actions.githubusercontent.com \
mlconjug3-<version>.tar.gz \
--signature mlconjug3-<version>.tar.gz.sig
This ensures:
- The package was built by the official CI pipeline
- The release was not modified after publication
- The signature matches the GitHub Actions identity
For more details, see: https://docs.sigstore.dev/
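If you prefer to script the check, the same cosign invocation can be assembled from Python. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the artifact and signature are expected to sit in the current directory under the file names shown above, and cosign must already be installed.

```python
import shlex
import shutil
import subprocess

# Values copied from the cosign command shown above.
CERT_IDENTITY = (
    "https://github.com/Ars-Linguistica/mlconjug3/"
    ".github/workflows/upload_wheels_to_pypi.yml"
)
OIDC_ISSUER = "https://token.actions.githubusercontent.com"


def build_verify_command(version: str) -> list[str]:
    """Build the cosign argument list for one release (hypothetical helper)."""
    artifact = f"mlconjug3-{version}.tar.gz"
    return [
        "cosign",
        "verify-blob",
        "--certificate-identity",
        CERT_IDENTITY,
        "--certificate-oidc-issuer",
        OIDC_ISSUER,
        "--signature",
        f"{artifact}.sig",
        artifact,
    ]


def verify_release(version: str) -> bool:
    """Run cosign and report whether it exited successfully."""
    if shutil.which("cosign") is None:
        raise RuntimeError("cosign is not installed or not on PATH")
    result = subprocess.run(build_verify_command(version), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    # Print the command for inspection; replace <version> with a real release.
    print(shlex.join(build_verify_command("<version>")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems if a file name ever contains unusual characters.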
This package was created with the help of Verbiste and scikit-learn.
The logo was designed by Zuur.
