
mlconjug3: The multi-lingual conjugator

A command-line application and Python library that conjugates verbs in French, English, Spanish, Italian, Portuguese, and Romanian (with more languages soon to come) using machine learning techniques.

The mlconjug3 project is now a proud member of the ARS Linguistica organization. ARS Linguistica is a community-driven, open source project that aims to develop free and accessible linguistic tools and resources for all. With a focus on advancing linguistic research, documentation, and education, ARS Linguistica is dedicated to preserving and promoting linguistic diversity through the use of open source and open science.

With mlconjug3, you can:

  • Conjugate any verb in one of the supported languages, even completely new or made-up verbs, with the help of a pre-trained Machine Learning model.
  • Easily modify and retrain the models using any compatible classifier from scikit-learn.
  • Integrate mlconjug3 in your own projects.
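
As a minimal sketch of the library use listed above, the helper below wraps the `Conjugator` class. It assumes mlconjug3 is installed and that `Conjugator(language=...)`, `conjugate()`, and `iterate()` behave as in earlier documented releases; `list_conjugated_forms` is a hypothetical name chosen for illustration, not part of the library.

```python
def list_conjugated_forms(verb, language="fr"):
    """Return every conjugated form of `verb` as produced by mlconjug3.

    Hypothetical helper: only Conjugator, conjugate() and iterate() come
    from mlconjug3 itself (assumed unchanged in the installed version).
    """
    # Imported lazily so this module still loads when mlconjug3 is absent.
    import mlconjug3

    conjugator = mlconjug3.Conjugator(language=language)
    conjugated_verb = conjugator.conjugate(verb)
    # iterate() returns tuples describing each form: mood, tense and,
    # where applicable, person, followed by the conjugated form.
    return list(conjugated_verb.iterate())
```

Calling `list_conjugated_forms("parler")` should return the French forms of the verb; pass `language="es"`, for example, to use the Spanish model instead.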

Compatibility Matrix (v4.0.1)

| Platform | Supported | Notes |
| --- | --- | --- |
| Linux | ✔ Python 3.9–3.14 | Fully supported |
| macOS | ✔ Python 3.9–3.14 | Fully supported |
| Windows | ✔ Python 3.9–3.12 | Stable |
| Windows (3.13+) | ~ Experimental | SciPy / sklearn issues |
| Python 3.13+ | ~ Experimental | Native binary instability |

⚠ IMPORTANT: Windows + Python 3.13/3.14 may crash due to upstream SciPy binary incompatibilities. These builds are marked experimental in CI.
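
The matrix above can be turned into a small runtime check, sketched below. The function names and the three status strings are hypothetical. The sketch also makes one interpretive assumption: the platform-specific rows take precedence, so the general "Python 3.13+" row is applied to Windows only, and anything the matrix does not mention is reported as "unlisted".

```python
import platform
import sys


def support_level(system, version):
    """Classify a (platform, Python version) pair per the v4.0.1 matrix.

    `system` is a platform.system() value ("Linux", "Darwin", "Windows").
    `version` is a (major, minor) tuple such as (3, 12).
    """
    if system in ("Linux", "Darwin"):  # "Darwin" is macOS
        return "supported" if (3, 9) <= version <= (3, 14) else "unlisted"
    if system == "Windows":
        if (3, 9) <= version <= (3, 12):
            return "supported"
        if version >= (3, 13):
            # Marked experimental because of upstream SciPy binary issues.
            return "experimental"
    return "unlisted"


def current_support_level():
    """Apply support_level() to the running interpreter."""
    return support_level(platform.system(), sys.version_info[:2])
```

An application could call `current_support_level()` at startup and log a warning when the result is not "supported".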


Release Notes (v4.0.1)

This release introduces major internal improvements to the ML pipeline.

Improvements:

  • Reworked training pipeline for stability and reproducibility
  • Improved feature extraction for Italian and Romanian verb morphology
  • Unified sklearn Pipeline architecture for all models
  • Updated classifier to SGDClassifier (elasticnet regularization)
  • Better handling of unseen verb forms via enhanced feature engineering
  • Improved cross-platform consistency (Windows/macOS/Linux)

Behavioral changes:

  • Slight variations in prediction outputs due to improved feature representation
  • Optional sample weighting added to training pipeline
  • Internal API refactoring (public API remains backward compatible)

Migration notes:

  • No breaking changes in public API
  • Minor variation in predictions expected due to improved model generalization
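
The release notes do not say which features the models use. As a rough illustration of why ending-based features can help a classifier with unseen verbs, the sketch below extracts suffixes from a verb and compares a made-up verb with a known one. Both helpers are hypothetical and are not the library's feature extractor.

```python
def ending_ngrams(verb, max_n=3):
    """Return the suffixes of `verb` from length 1 up to `max_n`."""
    verb = verb.lower()
    return [verb[-n:] for n in range(1, min(max_n, len(verb)) + 1)]


def shared_ending_features(known_verb, new_verb, max_n=3):
    """Suffix features that an unseen verb has in common with a known one."""
    known = set(ending_ngrams(known_verb, max_n))
    new = set(ending_ngrams(new_verb, max_n))
    # Sort by length so the shortest shared suffix comes first.
    return sorted(known & new, key=len)


# The invented verb "googler" shares all three endings with "parler",
# while "finir" shares only the final letter.
print(shared_ending_features("parler", "googler"))  # ['r', 'er', 'ler']
print(shared_ending_features("parler", "finir"))    # ['r']
```

A model trained on features like these would have grounds to group "googler" with verbs conjugated like "parler", even though it never saw the new verb during training.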

Using mlconjug3 in Academic Research

mlconjug3 can be a useful tool for linguistic researchers, as it provides conjugation information for all six supported languages. Because it can handle completely new or made-up verbs, it is well suited to exploring new linguistic concepts and theories. It can also be used to compare conjugation patterns across languages, helping researchers identify and understand linguistic trends.

Integrating mlconjug3 in Applications

In addition to academic research, mlconjug3 can be integrated into web and desktop applications. For language learning platforms, it provides a source of conjugation information that helps students practice and master verb conjugation. For translation tools, it can help keep translations grammatically correct by supplying verb conjugations in real time.

By using mlconjug3, you are not only getting a powerful and flexible tool for verb conjugation, but you are also supporting the goals and mission of ARS Linguistica. Whether you are a linguistic researcher, language teacher, or simply someone who is passionate about preserving linguistic heritage, your support is crucial to the success of our organization.

Join us in our mission to make linguistic tools and resources accessible to all!


[Figure: conjugation for the verb "to be".]


Supported Languages

  • French
  • English
  • Spanish
  • Italian
  • Portuguese
  • Romanian

Academic publications citing mlconjug3

BibTeX

If you want to cite mlconjug3 in an academic publication, use this citation format:

@article{mlconjug3,
  title={mlconjug3},
  author={Sekou Diao},
  journal={GitHub},
  note={https://github.com/Ars-Linguistica/mlconjug3},
  year={2023}
}

Software projects using mlconjug3

Installation

You can install mlconjug3 using different methods depending on your workflow.

1. Install via pip (recommended)

pip install mlconjug3

2. Install from source (GitHub)

git clone https://github.com/Ars-Linguistica/mlconjug3.git
cd mlconjug3
pip install .

3. Install via conda (conda-forge)

conda install -c conda-forge mlconjug3
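
After any of the installation methods above, one way to confirm that the package is visible to the current interpreter is to query its metadata. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package="mlconjug3"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("mlconjug3 is not installed in this environment")
else:
    print(f"mlconjug3 {version} is installed")
```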

Signing of Releases

Starting with version 3.10, all releases of mlconjug3 published on PyPI and GitHub are signed using Sigstore.

What is Sigstore?

Sigstore is an open-source project that provides simple, transparent, and secure software signing. It allows developers to sign releases without managing long-lived cryptographic keys.

Instead, Sigstore uses short-lived certificates issued through identity providers, making the signing process both secure and easy to verify.

This ensures that mlconjug3 releases on PyPI can be cryptographically verified and have not been tampered with.

How to verify a release?

You can verify mlconjug3 package signatures using cosign, which is part of the Sigstore ecosystem.

Install cosign:

# Linux (amd64)
curl -O -L https://github.com/sigstore/cosign/releases/latest/download/cosign-linux-amd64
chmod +x cosign-linux-amd64
sudo mv cosign-linux-amd64 /usr/local/bin/cosign

# macOS (Homebrew)
brew install cosign

Verify a release:

cosign verify-blob \
    --certificate-identity https://github.com/Ars-Linguistica/mlconjug3/.github/workflows/upload_wheels_to_pypi.yml \
    --certificate-oidc-issuer https://token.actions.githubusercontent.com \
    mlconjug3-<version>.tar.gz \
    --signature mlconjug3-<version>.tar.gz.sig

This ensures:

  • The package was built by the official CI pipeline
  • The release was not modified after publication
  • The signature matches the GitHub Actions identity
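
To script the verification step for a given version, the command above can be assembled as an argument list, as in the sketch below. The flags and URLs are copied from the command shown earlier; the constant and function names are hypothetical, and the function only builds the command without running it.

```python
import shlex

# Values copied from the verification command shown above.
WORKFLOW_IDENTITY = (
    "https://github.com/Ars-Linguistica/mlconjug3/"
    ".github/workflows/upload_wheels_to_pypi.yml"
)
OIDC_ISSUER = "https://token.actions.githubusercontent.com"


def cosign_verify_command(version):
    """Build the cosign verify-blob argument list for one release version."""
    artifact = f"mlconjug3-{version}.tar.gz"
    return [
        "cosign", "verify-blob",
        "--certificate-identity", WORKFLOW_IDENTITY,
        "--certificate-oidc-issuer", OIDC_ISSUER,
        "--signature", f"{artifact}.sig",
        artifact,
    ]


# Print the command in a shell-safe form for inspection.
print(shlex.join(cosign_verify_command("4.0.1")))
```

With cosign installed and both files downloaded into the working directory, the list can be passed to `subprocess.run(command, check=True)`, which raises an exception if verification fails.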

For more details, see: https://docs.sigstore.dev/

Credits

This package was created with the help of Verbiste and scikit-learn.

The logo was designed by Zuur.
