Martin Kišš edited this page Nov 13, 2025 · 19 revisions

AnnoPage is an open-source tool for automatic processing of digitized documents. It uses machine learning models for object detection, image captioning, and semantic embedding generation to create rich metadata for scanned pages. This documentation provides comprehensive information on how to install, configure, and use AnnoPage, as well as details on its API and instructions for training custom YOLO models for object detection.

AnnoPage

The AnnoPage documentation page covers installation, configuration, and running the tool. It includes an example configuration file that defines engines for object detection, LLM-based image captioning, and semantic embedding generation. It also explains how to integrate AnnoPage into the PERO-OCR processing pipeline or into your own Python code.

AnnoPage API

The AnnoPage API documentation page describes how to use AnnoPage programmatically through its REST API. It describes the main parts of the system and the typical communication workflow, with examples of running the individual parts.

Training YOLO models

AnnoPage uses YOLO-based object detection models to identify non-textual elements in digitized documents. Pre-trained models are provided in the models directory, but you can also train your own detector. The Training YOLO Detectors page describes in detail how to prepare datasets and how to train and evaluate models.
