Home
AnnoPage is an open-source tool for automatic processing of digitized documents. It uses machine learning models for object detection, image captioning, and semantic embedding generation to create rich metadata for scanned pages. This documentation provides comprehensive information on how to install, configure, and use AnnoPage, as well as details on its API and instructions for training custom YOLO models for object detection.
The AnnoPage documentation page covers installation, configuration, and running the tool. It also includes an example configuration file that defines the engines for object detection, LLM-based image captioning, and semantic embedding generation, and it explains how to integrate AnnoPage into the PERO-OCR processing pipeline or into your own Python code.
The AnnoPage API documentation page describes how to use AnnoPage programmatically through its REST API. It describes the main parts of the system and the typical communication workflow, with examples of running the individual parts.
AnnoPage uses YOLO-based object detection models to identify non-textual elements in digitized documents. Pre-trained models are provided in the models directory, but you can also train your own detector. The Training YOLO Detectors page describes in detail how to prepare datasets and how to train and evaluate models.