kotaemon/tests/resources
Tuan Anh Nguyen Dang (Tadashi_Cin) 4704e2c11a Add new OCRReader with PDF+OCR text merging (#66)
This change speeds up OCR extraction by bypassing OCR for text that is not relevant (i.e. not inside a table).

---------

Co-authored-by: Nguyen Trung Duc (john) <trungduc1992@gmail.com>
2023-11-13 17:43:02 +07:00
7810d908b0ff4ce381dcab873196d133.jpg Add new OCRReader with PDF+OCR text merging (#66) 2023-11-13 17:43:02 +07:00
dummy.pdf [AUR-391, AUR-393] Add Document and DocumentReader base (#6) 2023-08-31 11:24:12 +07:00
dummy.xlsx [AUR-432] Add layout-aware table parsing PDF reader (#27) 2023-09-26 15:52:44 +07:00
embedding_openai_batch.json [AUR-389] Add base interface and embedding model (#17) 2023-09-14 14:08:58 +07:00
embedding_openai.json [AUR-389] Add base interface and embedding model (#17) 2023-09-14 14:08:58 +07:00
fullocr_sample_output.json Add new OCRReader with PDF+OCR text merging (#66) 2023-11-13 17:43:02 +07:00
policy.md [AUR-432] Add layout-aware table parsing PDF reader (#27) 2023-09-26 15:52:44 +07:00
table.pdf Add new OCRReader with PDF+OCR text merging (#66) 2023-11-13 17:43:02 +07:00
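
The latest commit message describes merging PDF text with OCR text and skipping OCR for text outside tables. The sketch below illustrates that general idea only; it is not the repository's OCRReader. Every name in it (`TextSpan`, `overlaps`, `merge_pdf_and_ocr`, the `ocr_region` callback) is hypothetical, and it assumes the PDF text layer and the table regions are already available as bounding boxes.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (x0, y0, x1, y1) in page coordinates; an assumed convention for this sketch.
Box = Tuple[float, float, float, float]


@dataclass
class TextSpan:
    """A piece of text and where it sits on the page (hypothetical type)."""
    text: str
    box: Box


def overlaps(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_pdf_and_ocr(
    pdf_spans: List[TextSpan],
    table_boxes: List[Box],
    ocr_region: Callable[[Box], List[TextSpan]],
) -> List[TextSpan]:
    """Keep PDF text outside tables and run OCR only on table regions."""
    # Text from the PDF layer that does not touch any table is kept as-is,
    # so no OCR call is spent on it.
    merged = [
        span
        for span in pdf_spans
        if not any(overlaps(span.box, table) for table in table_boxes)
    ]
    # OCR is invoked once per table region; its output replaces whatever
    # the PDF layer reported inside that region.
    for table in table_boxes:
        merged.extend(ocr_region(table))
    return merged
```

In this sketch the cost of OCR scales with the number of table regions instead of the whole page, which is one plausible reading of the speed-up the commit message mentions.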