Commit Graph

3 Commits

Author SHA1 Message Date
Tuan Anh Nguyen Dang (Tadashi_Cin)
4704e2c11a Add new OCRReader with PDF+OCR text merging (#66)
This change speeds up OCR extraction by bypassing OCR for text that is irrelevant (not in a table).

---------

Co-authored-by: Nguyen Trung Duc (john) <trungduc1992@gmail.com>
2023-11-13 17:43:02 +07:00
ian_Cin
533fffa6db Enable caching for github actions (#43) 2023-10-12 13:52:19 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
6c3d614973 [AUR-432] Add layout-aware table parsing PDF reader (#27)
* add OCRReader, MathPixReader and ExcelReader

* update test case for ocr reader

* reformat

* minor fix
2023-09-26 15:52:44 +07:00