kotaemon/knowledgehub/components.py
Nguyen Trung Duc (john) c3c25db48c [AUR-385, AUR-388] Declare BaseComponent and decide LLM call interface (#2)
- Use cases related to LLM call: https://cinnamon-ai.atlassian.net/browse/AUR-388?focusedCommentId=34873
- Sample usages: `test_llms_chat_models.py` and `test_llms_completion_models.py`:

```python
from kotaemon.llms.chats.openai import AzureChatOpenAI

model = AzureChatOpenAI(
    openai_api_base="https://test.openai.azure.com/",
    openai_api_key="some-key",
    openai_api_version="2023-03-15-preview",
    deployment_name="gpt35turbo",
    temperature=0,
    request_timeout=60,
)
output = model("hello world")
```

For the LLM-call component, I decided to wrap Langchain's LLM models and Langchain's Chat models, and to set the interface as follows:

- Completion LLM component:
```python
class CompletionLLM:

    def run_raw(self, text: str) -> LLMInterface:
        # Run text completion: str in -> LLMInterface out
        ...

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        # Run text completion in batch: list[str] in -> list[LLMInterface] out
        ...

# run_document and run_batch_document just reuse run_raw and run_batch_raw, because their use case is still unclear
```

- Chat LLM component:
```python
class ChatLLM:
    def run_raw(self, text: str) -> LLMInterface:
        # Run chat completion (no chat history): str in -> LLMInterface out
        ...

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        # Run chat completion in batch mode (no chat history): list[str] in -> list[LLMInterface] out
        ...

    def run_document(self, text: list[BaseMessage]) -> LLMInterface:
        # Run chat completion (with chat history): list[langchain's BaseMessage] in -> LLMInterface out
        ...

    def run_batch_document(self, text: list[list[BaseMessage]]) -> list[LLMInterface]:
        # Run chat completion in batch mode (with chat history): list[list[langchain's BaseMessage]] in -> list[LLMInterface] out
        ...
```

- The LLMInterface is as follows:

```python
from dataclasses import dataclass, field


@dataclass
class LLMInterface:
    text: list[str]
    completion_tokens: int = -1
    total_tokens: int = -1
    prompt_tokens: int = -1
    logits: list[list[float]] = field(default_factory=list)
```
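
To make the interface above concrete, here is a minimal sketch of a completion component that returns `LLMInterface` objects. `EchoCompletionLLM` and its whitespace-based token count are hypothetical stand-ins for illustration only; they are not part of kotaemon, and a real component would wrap a Langchain model instead of echoing the prompt.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LLMInterface:
    text: list[str]
    completion_tokens: int = -1
    total_tokens: int = -1
    prompt_tokens: int = -1
    logits: list[list[float]] = field(default_factory=list)


class EchoCompletionLLM:
    """Hypothetical component: echoes the prompt instead of calling a model."""

    def run_raw(self, text: str) -> LLMInterface:
        # Crude whitespace "token" count, used only to fill the usage fields
        n = len(text.split())
        return LLMInterface(
            text=[text],
            prompt_tokens=n,
            completion_tokens=n,
            total_tokens=2 * n,
        )

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        # Batch mode simply maps the single-input call over the list
        return [self.run_raw(t) for t in text]

    # As in the commit note, the document variants reuse the raw ones
    def run_document(self, text: str) -> LLMInterface:
        return self.run_raw(text)

    def run_batch_document(self, text: list[str]) -> list[LLMInterface]:
        return self.run_batch_raw(text)


llm = EchoCompletionLLM()
single = llm.run_raw("hello world")
batch = llm.run_batch_raw(["a b c", "d"])
```

Fields that a backend cannot report keep their defaults: `-1` for the token counts and an empty list for `logits`.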
2023-08-29 15:47:12 +07:00

from abc import abstractmethod

from theflow.base import Composable


class BaseComponent(Composable):
    """Base class for component

    A component is a class that can be used to compose a pipeline. To use the
    component, you should implement the following methods:
        - run_raw: run on raw input
        - run_batch_raw: run on batch of raw input
        - run_document: run on document
        - run_batch_document: run on batch of documents
        - is_document: check if input is document
        - is_batch: check if input is batch
    """

    @abstractmethod
    def run_raw(self, *args, **kwargs):
        ...

    @abstractmethod
    def run_batch_raw(self, *args, **kwargs):
        ...

    @abstractmethod
    def run_document(self, *args, **kwargs):
        ...

    @abstractmethod
    def run_batch_document(self, *args, **kwargs):
        ...

    @abstractmethod
    def is_document(self, *args, **kwargs) -> bool:
        ...

    @abstractmethod
    def is_batch(self, *args, **kwargs) -> bool:
        ...

    def run(self, *args, **kwargs):
        """Run the component."""
        is_document = self.is_document(*args, **kwargs)
        is_batch = self.is_batch(*args, **kwargs)
        if is_document and is_batch:
            return self.run_batch_document(*args, **kwargs)
        elif is_document and not is_batch:
            return self.run_document(*args, **kwargs)
        elif not is_document and is_batch:
            return self.run_batch_raw(*args, **kwargs)
        else:
            return self.run_raw(*args, **kwargs)
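
The dispatch in `run` can be exercised with a small concrete component. The sketch below is self-contained, so it makes two assumptions: `abc.ABC` stands in for `theflow.base.Composable`, and `Document` and `UpperCase` are hypothetical names invented for this example. It is meant only to show which of the four methods `run` selects for each input shape.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical document type, used only for this example."""
    text: str


class BaseComponent(ABC):
    """Stand-in for the class above; ABC replaces theflow's Composable."""

    @abstractmethod
    def run_raw(self, *args, **kwargs): ...
    @abstractmethod
    def run_batch_raw(self, *args, **kwargs): ...
    @abstractmethod
    def run_document(self, *args, **kwargs): ...
    @abstractmethod
    def run_batch_document(self, *args, **kwargs): ...
    @abstractmethod
    def is_document(self, *args, **kwargs) -> bool: ...
    @abstractmethod
    def is_batch(self, *args, **kwargs) -> bool: ...

    def run(self, *args, **kwargs):
        # Same four-way dispatch as the original run()
        is_document = self.is_document(*args, **kwargs)
        is_batch = self.is_batch(*args, **kwargs)
        if is_document and is_batch:
            return self.run_batch_document(*args, **kwargs)
        if is_document:
            return self.run_document(*args, **kwargs)
        if is_batch:
            return self.run_batch_raw(*args, **kwargs)
        return self.run_raw(*args, **kwargs)


class UpperCase(BaseComponent):
    """Hypothetical component that upper-cases strings or documents."""

    def is_batch(self, inp) -> bool:
        return isinstance(inp, list)

    def is_document(self, inp) -> bool:
        # For a non-empty batch, inspect the first element
        first = inp[0] if self.is_batch(inp) and inp else inp
        return isinstance(first, Document)

    def run_raw(self, inp: str) -> str:
        return inp.upper()

    def run_batch_raw(self, inp: list) -> list:
        return [self.run_raw(x) for x in inp]

    def run_document(self, inp: Document) -> Document:
        return Document(text=inp.text.upper())

    def run_batch_document(self, inp: list) -> list:
        return [self.run_document(d) for d in inp]


component = UpperCase()
r_raw = component.run("hi")
r_batch_raw = component.run(["a", "b"])
r_doc = component.run(Document("doc"))
r_batch_doc = component.run([Document("x")])
```

Callers only ever invoke `run`; the subclass decides, through `is_document` and `is_batch`, how an input is classified.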