Textract

  • This project grew out of a recurring problem: content often has to be retyped by hand because it exists only in a scanned document or image and cannot be copy-pasted.

  • A text extractor that simply scans the file and extracts its content saves a great deal of time and reduces the chance of typographical errors.

Application Link:

Flow of the Application

  • Our system takes the scanned image/document from the user as an input.

  • It then applies image pre-processing techniques such as scaling, binarization, and noise removal.

  • Finally, it runs Optical Character Recognition (OCR) with the Tesseract engine to extract the text.

Usage Guidelines:

1. Desktop Version

  • The link http://18.222.220.89:5000/ lands on this page, where you can submit the file from which you want to extract text.

  • After you upload and submit a file, the result appears as shown in the image. Click Copy To Clipboard to copy the text and use it as needed.

2. Mobile Version

  • After downloading the APK package from this link, install it on your device and start the app.

  • Upload an image and the results appear as follows. Then simply copy and paste the text and use it as required.

Technology Used:

  • Flask
  • Tesseract OCR Engine
  • TensorFlow, OpenCV
  • Flutter
  • AWS (Deployment)