All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Improve recognition accuracy for long text lines, at the cost of longer inference times, by increasing max image width after preprocessing #32
- Added --text-line-images option to save previews of text lines after preprocessing. This is useful for debugging recognition accuracy issues #29, #30.
- Added a note in the README about the importance of building ocrs (or at least the rten dependencies) in release mode #28
- Updated rten to 0.4.0. This includes optimizations for post-processing of the text segmentation mask #23.
- Updated rten to v0.3.1. This improves performance on Arm by ~30%.
- Fix panic in layout analysis when average word spacing in a line is negative #20
- Added LICENSE files to repository (Apache 2, MIT) #12
- Extract the ocrs project out of the RTen repository and into a standalone repo at https://github.com/robertknight/ocrs.
- Improve the --json output format with extracted text and coordinates of the rotated bounding rect for each word and line (92f17fb).
- Update rten to fix incorrect output on non-x64 / wasm32 platforms
- Improve layout analysis (ce52b3a1, cefb6c3f). The longer-term plan is to use machine learning for layout analysis, but these incremental tweaks address some of the most egregious errors.
- Add --version flag to CLI (20055ee0)
- Revise CLI flags for specifying output format (97c3a011). The output path is now specified with -o. Available formats are text (default), JSON (--json) or annotated PNG (--png).
- Fixed slow OCR model downloads by changing hosting location (robertknight/rten#22).
Initial release.