Commit Graph

8 Commits

Author SHA1 Message Date
ian_Cin
230328c62f Best docs Cinnamon will probably ever have (#105) 2023-12-20 11:30:25 +07:00
Duc Nguyen (john)
0e30dcbb06 Create Langchain LLM converter to quickly supply it to Langchain's chain (#102)
* Create Langchain LLM converter to quickly supply it to Langchain's chain

* Clean up
2023-12-11 14:55:56 +07:00
Duc Nguyen (john)
e34b1e4c6d Refactor the index component and update the MVP insurance accordingly (#90)
Refactor the `kotaemon/pipelines` module to `kotaemon/indices`. Create the VectorIndex.

Note: currently I place `qa` to be inside `kotaemon/indices` since at the moment we only have `qa` in RAG. At the same time, I think `qa` can be an independent module in `kotaemon/qa`. Since this can be changed later, I still go at the 1st option for now to observe if we can change it later.
2023-11-30 18:35:07 +07:00
ian_Cin
0dede9c82d Subclass chat messages from Document (#86) 2023-11-27 10:38:19 +07:00
Nguyen Trung Duc (john)
f8b8d86d4e Move LLM-related components into LLM module (#74)
* Move splitter into indexing module
* Rename post_processing module to parsers
* Migrate LLM-specific composite pipelines into llms module

This change moves the `splitters` module into `indexing` module. The `indexing` module will be created soon, to house `indexing`-related components.

This change renames `post_processing` module into `parsers` module. Post-processing is a generic term which provides very little information. In the future, we will add other extractors into the `parser` module, like Metadata extractor...

This change migrates the composite elements into `llms` module. These elements heavily assume that the internal nodes are llm-specific. As a result, migrating these elements into `llms` module will make them more discoverable, and simplify code base structure.
2023-11-15 16:26:53 +07:00
Nguyen Trung Duc (john)
693ed39de4 Move prompts into LLMs module (#70)
Since the only usage of prompt is within LLMs, it is reasonable to keep it within the LLM module. This way, it would be easier to discover module, and make the code base less complicated.

Changes:

* Move prompt components into llms
* Bump version 0.3.1
* Make pip install dependencies in eager mode

---------

Co-authored-by: ian <ian@cinnamon.is>
2023-11-14 16:00:10 +07:00
Nguyen Trung Duc (john)
6e7905cbc0 [AUR-411] Adopt to Example2 project (#28)
Add the chatbot from Example2. Create the UI for chat.
2023-10-12 15:13:25 +07:00
Nguyen Trung Duc (john)
c3c25db48c [AUR-385, AUR-388] Declare BaseComponent and decide LLM call interface (#2)
- Use cases related to LLM call: https://cinnamon-ai.atlassian.net/browse/AUR-388?focusedCommentId=34873
- Sample usages: `test_llms_chat_models.py` and `test_llms_completion_models.py`:

```python
from kotaemon.llms.chats.openai import AzureChatOpenAI

model = AzureChatOpenAI(
    openai_api_base="https://test.openai.azure.com/",
    openai_api_key="some-key",
    openai_api_version="2023-03-15-preview",
    deployment_name="gpt35turbo",
    temperature=0,
    request_timeout=60,
)
output = model("hello world")
```

For the LLM-call component, I decide to wrap around Langchain's LLM models and Langchain's Chat models. And set the interface as follow:

- Completion LLM component:
```python
class CompletionLLM:

    def run_raw(self, text: str) -> LLMInterface:
        # Run text completion: str in -> LLMInterface out

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        # Run text completion in batch: list[str] in -> list[LLMInterface] out

# run_document and run_batch_document just reuse run_raw and run_batch_raw, due to unclear use case
```

- Chat LLM component:
```python
class ChatLLM:
    def run_raw(self, text: str) -> LLMInterface:
        # Run chat completion (no chat history): str in -> LLMInterface out

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        # Run chat completion in batch mode (no chat history): list[str] in -> list[LLMInterface] out

    def run_document(self, text: list[BaseMessage]) -> LLMInterface:
        # Run chat completion (with chat history): list[langchain's BaseMessage] in -> LLMInterface out

    def run_batch_document(self, text: list[list[BaseMessage]]) -> list[LLMInterface]:
        # Run chat completion in batch mode (with chat history): list[list[langchain's BaseMessage]] in -> list[LLMInterface] out
```

- The LLMInterface is as follow:

```python
@dataclass
class LLMInterface:
    text: list[str]
    completion_tokens: int = -1
    total_tokens: int = -1
    prompt_tokens: int = -1
    logits: list[list[float]] = field(default_factory=list)
```
2023-08-29 15:47:12 +07:00