Commit Graph

10 Commits

Author SHA1 Message Date
Tuan Anh Nguyen Dang (Tadashi_Cin)
cc1e75b3c6 Add Citation pipeline (#78)
* add rerankers in retrieving pipeline

* update example MVP pipeline

* add citation pipeline and function call interface

* change return type of QA and AgentPipeline to Document
2023-11-16 11:24:35 +07:00
Nguyen Trung Duc (john)
8532138842 Move Document and other interface into base/schema (#69) 2023-11-14 11:51:10 +07:00
Nguyen Trung Duc (john)
d79b3744cb Simplify the BaseComponent inteface (#64)
This change remove `BaseComponent`'s:

- run_raw
- run_batch_raw
- run_document
- run_batch_document
- is_document
- is_batch

Each component is expected to support multiple types of inputs and a single type of output. Since we want the component to work out-of-the-box with both standardized and customized use cases, supporting multiple types of inputs are expected. At the same time, to reduce the complexity of understanding how to use a component, we restrict a component to only have a single output type.

To accommodate these changes, we also refactor some components to remove their run_raw, run_batch_raw... methods, and to decide the common output interface for those components.

Tests are updated accordingly.

Commit changes:

* Add kwargs to vector store's query
* Simplify the BaseComponent
* Update tests
* Remove support for Python 3.8 and 3.9
* Bump version 0.3.0
* Fix github PR caching still use old environment after bumping version

---------

Co-authored-by: ian <ian@cinnamon.is>
2023-11-13 15:10:18 +07:00
Nguyen Trung Duc (john)
9035e25666 Upgrade the declarative pipeline for cleaner interface (#51) 2023-10-24 11:12:22 +07:00
ian_Cin
d83c22aa4e [AUR-395, AUR-415] Adopt Example1 Injury pipeline; add .flow() for enabling bottom-up pipeline execution (#32)
* add example1/injury pipeline example
* add dotenv
* update various api
2023-10-02 16:24:56 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
3cceec63ef [AUR-431] Add ReAct Agent (#34)
* add base Tool

* minor update test_tool

* update test dependency

* update test dependency

* Fix namespace conflict

* update test

* add base Agent Interface, add ReWoo Agent

* minor update

* update test

* fix typo

* remove unneeded print

* update rewoo agent

* add LLMTool

* update BaseAgent type

* add ReAct agent

* add ReAct agent

* minor update

* minor update

* minor update

* minor update

* update docstring

* fix max_iteration

---------

Co-authored-by: trducng <trungduc1992@gmail.com>
2023-10-02 11:29:12 +07:00
ian_Cin
b794051653 [AUR-421] base output post-processor that works using regex. (#20) 2023-09-19 19:54:44 +07:00
Nguyen Trung Duc (john)
c339912312 [AUR-389] Add base interface and embedding model (#17)
This change provides the base interface of an embedding, and wrap the Langchain's OpenAI embedding. Usage as follow:

```python
from kotaemon.embeddings import AzureOpenAIEmbeddings

model = AzureOpenAIEmbeddings(
    model="text-embedding-ada-002",
    deployment="embedding-deployment",
    openai_api_base="https://test.openai.azure.com/",
    openai_api_key="some-key",
)
output = model("Hello world")
```
2023-09-14 14:08:58 +07:00
ian_Cin
5241edbc46 [AUR-361] Setup pre-commit, pytest, GitHub actions, ssh-secret (#3)
Co-authored-by: trducng <trungduc1992@gmail.com>
2023-08-30 07:22:01 +07:00
Nguyen Trung Duc (john)
c3c25db48c [AUR-385, AUR-388] Declare BaseComponent and decide LLM call interface (#2)
- Use cases related to LLM call: https://cinnamon-ai.atlassian.net/browse/AUR-388?focusedCommentId=34873
- Sample usages: `test_llms_chat_models.py` and `test_llms_completion_models.py`:

```python
from kotaemon.llms.chats.openai import AzureChatOpenAI

model = AzureChatOpenAI(
    openai_api_base="https://test.openai.azure.com/",
    openai_api_key="some-key",
    openai_api_version="2023-03-15-preview",
    deployment_name="gpt35turbo",
    temperature=0,
    request_timeout=60,
)
output = model("hello world")
```

For the LLM-call component, I decide to wrap around Langchain's LLM models and Langchain's Chat models. And set the interface as follow:

- Completion LLM component:
```python
class CompletionLLM:

    def run_raw(self, text: str) -> LLMInterface:
        # Run text completion: str in -> LLMInterface out

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        # Run text completion in batch: list[str] in -> list[LLMInterface] out

# run_document and run_batch_document just reuse run_raw and run_batch_raw, due to unclear use case
```

- Chat LLM component:
```python
class ChatLLM:
    def run_raw(self, text: str) -> LLMInterface:
        # Run chat completion (no chat history): str in -> LLMInterface out

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        # Run chat completion in batch mode (no chat history): list[str] in -> list[LLMInterface] out

    def run_document(self, text: list[BaseMessage]) -> LLMInterface:
        # Run chat completion (with chat history): list[langchain's BaseMessage] in -> LLMInterface out

    def run_batch_document(self, text: list[list[BaseMessage]]) -> list[LLMInterface]:
        # Run chat completion in batch mode (with chat history): list[list[langchain's BaseMessage]] in -> list[LLMInterface] out
```

- The LLMInterface is as follow:

```python
@dataclass
class LLMInterface:
    text: list[str]
    completion_tokens: int = -1
    total_tokens: int = -1
    prompt_tokens: int = -1
    logits: list[list[float]] = field(default_factory=list)
```
2023-08-29 15:47:12 +07:00