A comprehensive guide to fine-tuning EasyOCR's text recognition models on custom datasets. This repository provides a step-by-step tutorial for training domain-specific OCR models that outperform generic pre-trained models on specialized text.
While EasyOCR provides excellent out-of-the-box performance for general text recognition, it may struggle with:
- Technical jargon and domain-specific terminology
- Unique fonts or stylized text
- Specialized formatting (invoices, forms, technical documents)
- Low-quality or degraded text
- Languages or scripts with limited training data
This tutorial shows you how to fine-tune EasyOCR to achieve superior accuracy on your specific use case.
- Complete end-to-end training pipeline
- Automated data preprocessing and LMDB conversion
- Detailed explanations for each training step
- Model conversion utilities for EasyOCR compatibility
- Best practices and troubleshooting tips
- Ready-to-use Google Colab notebook
- Python 3.11 or higher
- Basic understanding of machine learning concepts
- Training images with corresponding text labels
- Google Colab account (for GPU training) or local GPU setup
Click the badge below to open the notebook directly in Google Colab:
- Clone the repository
```
git clone https://github.com/AbdullahButt2611/EasyOCR-Custom-Training.git
cd EasyOCR-Custom-Training
```
- Install dependencies
```
pip install -r requirements.txt
```
- Prepare your training data
```
train_data/
├── image1.jpg
├── image2.jpg
├── ...
└── gt.txt
```
- Run the notebook
```
jupyter notebook EasyOCR_Custom_Training.ipynb
```

```
easyocr-custom-training/
│
├── EasyOCR_Custom_Training.ipynb   # Main training notebook
├── README.md                       # This file
├── requirements.txt                # Python dependencies
├── LICENSE                         # MIT License
│
├── examples/                       # Example data and results
│   ├── sample_data/                # Sample training images
│   └── results/                    # Example outputs
│
└── utils/                          # Helper scripts (optional)
    ├── data_preparation.py
    └── model_converter.py
```
Your training data should follow this structure:
```
train_data/
├── image1.jpg
├── image2.jpg
├── image3.jpg
└── gt.txt
```
The ground truth file should contain tab-separated values:
```
image1.jpg	Hello World
image2.jpg	Sample Text
image3.jpg	Custom Label
```
Important Notes:
- Use a TAB character (not spaces) between the filename and the label
- One line per image
- UTF-8 encoding
- Labels should match the exact text in the image
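Formatting mistakes in `gt.txt` (spaces instead of tabs, missing images, empty labels) are a common cause of silent training failures, so it is worth checking the file before converting it to LMDB. A minimal validation sketch, assuming only the tab-separated layout described above:

```python
from pathlib import Path

def validate_gt(gt_path: str) -> list[str]:
    """Return a list of problems found in a tab-separated gt.txt file."""
    problems = []
    root = Path(gt_path).parent
    lines = Path(gt_path).read_text(encoding="utf-8").splitlines()
    for lineno, line in enumerate(lines, start=1):
        if "\t" not in line:
            problems.append(f"line {lineno}: no TAB separator")
            continue
        filename, label = line.split("\t", 1)
        if not label.strip():
            problems.append(f"line {lineno}: empty label")
        if not (root / filename).exists():
            problems.append(f"line {lineno}: missing image {filename}")
    return problems
```

An empty return value means every line has a tab, a non-empty label, and an image file that actually exists next to `gt.txt`.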
The notebook covers:
1. Environment Setup
   - Installing dependencies
   - Cloning the Deep Text Recognition Benchmark
2. Data Preprocessing
   - Ground truth file formatting
   - LMDB dataset creation
3. Framework Compatibility
   - PyTorch compatibility fixes
   - CPU/GPU configuration
4. Model Training
   - Architecture selection (VGG + BiLSTM + CTC)
   - Hyperparameter configuration
   - Training monitoring
5. Model Conversion
   - Converting to EasyOCR format
   - Model deployment preparation
6. Testing & Evaluation
   - Testing on sample images
   - Performance evaluation
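The LMDB datasets used by the Deep Text Recognition Benchmark store one entry per sample under `image-%09d` / `label-%09d` keys, plus a `num-samples` count. A minimal sketch of that key layout, using a plain dict in place of a real LMDB environment (assumption: the standard benchmark key scheme; the notebook's conversion step uses the benchmark's own `create_lmdb_dataset.py`):

```python
def build_lmdb_records(samples):
    """samples: list of (image_bytes, label) pairs.

    Returns the key/value mapping that would be written to LMDB,
    following the deep-text-recognition-benchmark convention:
    image-%09d -> raw image bytes, label-%09d -> UTF-8 label.
    """
    records = {}
    for i, (image_bytes, label) in enumerate(samples, start=1):
        records[f"image-{i:09d}".encode()] = image_bytes
        records[f"label-{i:09d}".encode()] = label.encode("utf-8")
    records[b"num-samples"] = str(len(samples)).encode()
    return records
```

Knowing this layout makes it easy to spot-check a generated dataset (e.g. confirm `num-samples` matches the number of lines in `gt.txt`).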
Choose from different combinations:
| Component | Options |
|---|---|
| Transformation | None, TPS |
| Feature Extraction | VGG, RCNN, ResNet |
| Sequence Modeling | None, BiLSTM |
| Prediction | CTC, Attn |
```
--exp_name my_model    # Experiment name
--batch_size 8         # Batch size (adjust for GPU memory)
--num_iter 3000        # Total training iterations
--valInterval 100      # Validation frequency
--lr 1                 # Learning rate
--workers 4            # Number of data loading workers
```
With proper training data (200+ samples):
- Training time: 30-60 minutes (1000 iterations on GPU)
- Accuracy improvement: 20-50% over generic models
- Best for: Domain-specific text with 50+ unique vocabulary items
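Putting the architecture choice and the flags above together, a full invocation of the benchmark's `train.py` might look like the following sketch (the dataset paths and train/validation split names are assumptions; adjust them to your LMDB output):

```shell
python train.py \
  --train_data lmdb_output/train \
  --valid_data lmdb_output/val \
  --select_data "/" --batch_ratio 1 \
  --Transformation None --FeatureExtraction VGG \
  --SequenceModeling BiLSTM --Prediction CTC \
  --exp_name my_model --batch_size 8 \
  --num_iter 3000 --valInterval 100 \
  --lr 1 --workers 4
```

The four architecture flags correspond directly to the component table above, so swapping in `TPS`, `ResNet`, or `Attn` only requires changing those values.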
Monitor these during training:
- Train Loss: Should steadily decrease
- Validation Loss: Should decrease without diverging from train loss
- Accuracy: Target 80%+ on validation set
- Normalized Edit Distance: Target < 0.10
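The normalized edit distance target above can be checked by hand: it is the Levenshtein distance between prediction and ground truth divided by the longer string's length. A minimal standalone sketch (the benchmark's own implementation may normalize slightly differently):

```python
def normalized_edit_distance(pred: str, truth: str) -> float:
    """Levenshtein distance divided by the longer string's length (0 = exact match)."""
    if not pred and not truth:
        return 0.0
    # Classic dynamic-programming edit distance over a rolling row.
    prev = list(range(len(truth) + 1))
    for i, pc in enumerate(pred, start=1):
        curr = [i]
        for j, tc in enumerate(truth, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (pc != tc)))   # substitution
        prev = curr
    return prev[-1] / max(len(pred), len(truth))
```

For example, predicting "Helo" for "Hello" gives 1/5 = 0.20, which already misses the < 0.10 target on that sample.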
Q: Training loss not decreasing
- Check data quality and label accuracy
- Increase training iterations
- Try different learning rates (0.5, 1.0, 2.0)
Q: Out of memory errors
- Reduce `batch_size`
- Use GPU runtime in Colab
- Reduce image resolution
Q: Model not loading in EasyOCR
- Verify model conversion completed successfully
- Check that converted model is in correct directory
- Ensure key names match EasyOCR's expected format
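One frequent key-name mismatch is the `module.` prefix that `DataParallel` training adds to every parameter name in the checkpoint. A minimal sketch of stripping it from a loaded state dict (assumption: the prefix is the only difference; a real conversion may need further renames, which is what `utils/model_converter.py` handles):

```python
from collections import OrderedDict

def strip_module_prefix(state_dict):
    """Remove a leading 'module.' from each key, leaving other keys untouched."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len("module."):] if key.startswith("module.") else key
        cleaned[new_key] = value
    return cleaned
```

Applied to a checkpoint loaded with `torch.load(...)`, this returns a dict whose keys can be compared directly against the target model's `state_dict().keys()` to find any remaining mismatches.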
Q: Low accuracy on validation set
- Add more diverse training samples
- Increase `num_iter` to 3000-5000
- Try different model architectures
Q: Overfitting (train accuracy >> validation accuracy)
- Add more training data
- Reduce model complexity
- Implement data augmentation
- Minimum 200 images recommended
- Cover all characters/symbols you need to recognize
- Include variations in lighting, angles, and quality
- Balance dataset (similar samples per class)
- Start with 1000 iterations, increase if needed
- Monitor validation metrics closely
- Save checkpoints regularly
- Test on completely unseen data
- For short text (1-10 chars): VGG + BiLSTM + CTC
- For longer text: ResNet + BiLSTM + Attn
- For simple fonts: VGG + None + CTC (faster)
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch
```
git checkout -b feature/amazing-feature
```
- Commit your changes
```
git commit -m 'Add amazing feature'
```
- Push to the branch
```
git push origin feature/amazing-feature
```
- Open a Pull Request
- Add example datasets for different domains
- Implement data augmentation utilities
- Create model evaluation scripts
- Add support for additional architectures
- Improve documentation
- EasyOCR Official Repository
- Deep Text Recognition Benchmark Paper
- CTC Loss Explanation
- LMDB Documentation
This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: abutt2210@gmail.com
If you find this repository helpful, please consider giving it a star! It helps others discover this resource.
Made with ❤️ for the OCR community
Last updated: January 2026