This repository contains a simple machine‑learning pipeline that predicts whether a payment transaction will fail based on customer KYC information and transaction metadata. The model is a logistic regression trained on synthetic data that mimics real‑world payment onboarding records.
userdata.json– Synthetic dataset used for training and evaluation.train.py– Script that loads the data, performs feature engineering, trains a logistic regression model, logs the run with MLflow, and saves the model topayment_failure_model.pkl.predict.py– Script that loads the saved model and makes a prediction for a single example.app.py– FastAPI application exposing a/predictendpoint that returns the failure probability.requirements.txt– Python dependencies required to run the project.README.md– This documentation.
- Create a virtual environment
python3 -m venv venv source venv/bin/activate - Install dependencies
The required packages include
pip install -r requirements.txt
pandas,scikit‑learn,mlflow,fastapi, anduvicorn. - Verify the environment
python -c "import pandas, sklearn, mlflow; print('All imports succeeded')"
Run the training script:
python train.pyThe script will:
- Load
userdata.json. - Engineer features such as age, ID verification flag, and cross‑border indicator.
- Split the data into training and test sets.
- Train a logistic regression model.
- Log parameters, metrics, and the model artifact to an MLflow tracking server (default local SQLite store).
- Save the trained model to
payment_failure_model.pkl.
After successful execution you will see a classification report and a confirmation that the model file has been saved.
Use the predict.py script to obtain a prediction for a single record:
python predict.pyThe output is a JSON object, for example:
{ "payment_failed": 1, "failure_probability": 0.62 }You can use the same predict.py script to make predictions on any new data record. Provide a JSON object that matches the feature schema used during training. For batch predictions, create a JSON Lines file where each line is a separate record and modify predict.py to read the file and output predictions.
{
"occupation": "worker",
"purposeOfTransaction": "shopping",
"sourceOfFunds": "Cash",
"countryOfBirth": "IN",
"nationality": "IN",
"idVerificationStatus": "N",
"receiver": {"address": {"countryCode": "US"}},
"dateOfBirth": "1990-01-01"
}python predict.py '{"occupation":"worker","purposeOfTransaction":"shopping","sourceOfFunds":"Cash","countryOfBirth":"IN","nationality":"IN","idVerificationStatus":"N","receiver":{"address":{"countryCode":"US"}},"dateOfBirth":"1990-01-01"}'The script will print a JSON response with payment_failed and failure_probability.
For batch predictions, create a file new_data.jsonl with one JSON per line and run:
python batch_predict.py new_data.jsonl(You can create batch_predict.py by copying predict.py and looping over each line.)
You can also send the same JSON payload to the FastAPI endpoint as shown in the FastAPI section.
The FastAPI wrapper (app.py) provides a REST endpoint:
- Start the server
uvicorn app:app --reload
- POST request to
http://127.0.0.1:8000/predictwith a JSON payload matching the feature schema. Example usingcurl:The response will containcurl -X POST http://127.0.0.1:8000/predict \ -H "Content-Type: application/json" \ -d '{"occupation":"worker","purposeOfTransaction":"shopping","sourceOfFunds":"Cash","countryOfBirth":"IN","nationality":"IN","idVerificationStatus":"N","receiver":{"address":{"countryCode":"US"}},"dateOfBirth":"1990-01-01"}'payment_failure_risk(0 or 1) andfailure_probability.
To view the logged runs, start the MLflow UI:
mlflow uiIf you encounter an import error related to alembic, ensure that the virtual environment is activated and that the alembic package version is compatible with your Python version. Re‑installing the dependencies often resolves the issue:
pip install --upgrade alembicThe UI is available at http://127.0.0.1:5000.
To deactivate the virtual environment:
deactivateYou can delete the venv directory and the generated model file (payment_failure_model.pkl) if you wish to start fresh.
This README provides a reproducible workflow for training, evaluating, and serving a payment‑failure prediction model with MLflow tracking and FastAPI.
A live demo of the model is available at https://payment-failure.streamlit.app/. You can interact with the UI to upload a JSON file for batch predictions or fill in the form for a single prediction. The Streamlit app uses the same model and feature engineering logic as the FastAPI service.
Ramakrushna Mohapatra