Trying OCR for UI text recognition.
Goal
- Reason for trying PARSeq: it is state-of-the-art (SOTA) for scene text recognition.
- The PARSeq code reads its datasets from LMDB.
- Is LMDB really necessary? Investigate other OCR development tools...
MMOCR
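pip3 install openmim
mim install mmcv-full
mim install mmdet
git clone https://github.com/open-mmlab/mmocr.git
# Install MMOCR itself and run the demo from the repository root,
# since the script and image paths below are relative to it.
cd mmocr
pip3 install -e .
python mmocr/utils/ocr.py demo/demo_text_ocr.jpg --print-result --imshow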
After setup, the flow for an input image is: crop the text at a fixed position, recognize it, and visualize the result. ocr.py can also run detection (localization), but our UI format is fixed, so it is simpler to treat localization as a plain cropping preprocessing step and run recognition only.
- Surprisingly, RobustScanner and SAR work well.
- As expected, recognition fails when the text in a crop is split across two lines.
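The fixed-position cropping step described above could look roughly like the sketch below. It is not the code used in these notes: the field names and fractional coordinates in FIELD_BOXES are hypothetical placeholders, and the image is assumed to be a row-major grid of pixels (a nested list, which also covers a 2D array converted to rows).

```python
from typing import Dict, List, Sequence, Tuple

FracBox = Tuple[float, float, float, float]  # (left, top, right, bottom) as fractions of the image size
PixelBox = Tuple[int, int, int, int]         # (left, top, right, bottom) in pixels

# Hypothetical layout: where each UI text field sits in the screenshot.
FIELD_BOXES: Dict[str, FracBox] = {
    "title": (0.10, 0.05, 0.90, 0.15),
    "value": (0.10, 0.40, 0.60, 0.50),
}


def to_pixel_box(box: FracBox, width: int, height: int) -> PixelBox:
    """Scale a fractional box to pixel coordinates for an image of the given size."""
    left, top, right, bottom = box
    return (
        round(left * width),
        round(top * height),
        round(right * width),
        round(bottom * height),
    )


def crop(image: Sequence[Sequence], box: PixelBox) -> List[list]:
    """Return the sub-image inside box (right and bottom edges are exclusive)."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]


def crop_fields(
    image: Sequence[Sequence], boxes: Dict[str, FracBox] = FIELD_BOXES
) -> Dict[str, List[list]]:
    """Cut every configured field out of one screenshot."""
    height = len(image)
    width = len(image[0]) if height else 0
    return {
        name: crop(image, to_pixel_box(box, width, height))
        for name, box in boxes.items()
    }
```

Each crop would then be passed to a recognition-only model such as SAR or RobustScanner, skipping the detection stage. Using fractional coordinates is a design choice that keeps the boxes valid if the screenshots are captured at a different resolution with the same layout.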