3 Commits

Author SHA1 Message Date
Tuan Anh Nguyen Dang (Tadashi_Cin)
6c3d614973 [AUR-432] Add layout-aware table parsing PDF reader (#27)
* add OCRReader, MathPixReader and ExcelReader

* update test case for ocr reader

* reformat

* minor fix
2023-09-26 15:52:44 +07:00
Nguyen Trung Duc (john)
c339912312 [AUR-389] Add base interface and embedding model (#17)
This change provides the base interface for embeddings and wraps LangChain's OpenAI embedding. Usage is as follows:

```python
from kotaemon.embeddings import AzureOpenAIEmbeddings

model = AzureOpenAIEmbeddings(
    model="text-embedding-ada-002",
    deployment="embedding-deployment",
    openai_api_base="https://test.openai.azure.com/",
    openai_api_key="some-key",
)
output = model("Hello world")
```
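
For readers unfamiliar with the pattern, the following is a minimal sketch of how a base embedding interface and a thin wrapper might fit together. It is not the code from this commit: `BaseEmbeddings`, `WrappedEmbeddings`, `FakeBackend`, and the `run` method are hypothetical names chosen for illustration. The only assumption about the wrapped object is that it exposes a LangChain-style `embed_documents(texts)` method.

```python
from abc import ABC, abstractmethod
from typing import List, Protocol, Union


class SupportsEmbedDocuments(Protocol):
    """Anything exposing a LangChain-style `embed_documents` method."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...


class BaseEmbeddings(ABC):
    """Hypothetical base interface: call the object with text, get vectors back."""

    @abstractmethod
    def run(self, texts: List[str]) -> List[List[float]]:
        """Embed a batch of texts, one vector per text."""

    def __call__(self, text: Union[str, List[str]]) -> List[List[float]]:
        # Normalize a single string into a one-element batch.
        texts = [text] if isinstance(text, str) else list(text)
        return self.run(texts)


class WrappedEmbeddings(BaseEmbeddings):
    """Hypothetical adapter that delegates to a wrapped backend."""

    def __init__(self, backend: SupportsEmbedDocuments) -> None:
        self._backend = backend

    def run(self, texts: List[str]) -> List[List[float]]:
        return self._backend.embed_documents(texts)


class FakeBackend:
    """Stand-in backend so the sketch runs without network access or API keys."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Toy "embedding": text length and number of spaces.
        return [[float(len(t)), float(t.count(" "))] for t in texts]


model = WrappedEmbeddings(FakeBackend())
output = model("Hello world")
```

Keeping the call signature on the base class means any backend can be swapped in without changing the calling code, which appears to be the intent of the `model("Hello world")` usage above.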
2023-09-14 14:08:58 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
21350153d4 [AUR-391, AUR-393] Add Document and DocumentReader base (#6)
* Declare BaseComponent

* Brainstorming base class for LLM call

* Define base LLM

* Add tests

* Clean telemetry environment for accurate testing

* Fix README

* Fix typing

* add base document reader

* update test

* update requirements

* Cosmetic change

* update requirements

* reformat

---------

Co-authored-by: trducng <trungduc1992@gmail.com>
2023-08-31 11:24:12 +07:00