kotaemon/knowledgehub/pipelines
Tuan Anh Nguyen Dang (Tadashi_Cin) 4704e2c11a Add new OCRReader with PDF+OCR text merging (#66)
This change speeds up OCR extraction by skipping OCR for text that is irrelevant (i.e., not inside a table).

Co-authored-by: Nguyen Trung Duc (john) <trungduc1992@gmail.com>
2023-11-13 17:43:02 +07:00
agents Add Huggingface embeddings and Cohere embeddings (#63) 2023-11-10 09:38:30 +07:00
tools Simplify the BaseComponent interface (#64) 2023-11-13 15:10:18 +07:00
__init__.py Initiate repository 2023-08-16 14:56:48 +07:00
cot.py Upgrade the declarative pipeline for cleaner interface (#51) 2023-10-24 11:12:22 +07:00
indexing.py Simplify the BaseComponent interface (#64) 2023-11-13 15:10:18 +07:00
ingest.py Add new OCRReader with PDF+OCR text merging (#66) 2023-11-13 17:43:02 +07:00
qa.py Add new OCRReader with PDF+OCR text merging (#66) 2023-11-13 17:43:02 +07:00
retrieving.py Simplify the BaseComponent interface (#64) 2023-11-13 15:10:18 +07:00
utils.py [AUR-429] Add MVP pipeline with Ingestion and QA stage (#39) 2023-10-05 12:31:33 +07:00