diff --git a/README.md b/README.md index 49bf963..78db80e 100644 --- a/README.md +++ b/README.md @@ -1,9 +1,11 @@ # kotaemon -![demo](https://raw.githubusercontent.com/Cinnamon/kotaemon/main/docs/images/chat-demo.gif) +Build and use local RAG-based Question Answering (QA) applications. + +https://github.com/Cinnamon/kotaemon/assets/25688648/815ecf68-3a02-4914-a0dd-3f8ec7e75cd9 [Source Code](https://github.com/Cinnamon/kotaemon) | -[Demo](https://huggingface.co/spaces/lone17/kotaemon-app) +[Live Demo](https://huggingface.co/spaces/lone17/kotaemon-app) [User Guide](https://cinnamon.github.io/kotaemon/) | [Developer Guide](https://cinnamon.github.io/kotaemon/development/) | @@ -13,9 +15,7 @@ [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) [![built with Codeium](https://codeium.com/badges/main)](https://codeium.com) -Build and use local RAG-based Question Answering (QA) applications. - -This repository would like to appeal to both end users who want to do QA on their +This project would like to appeal to both end users who want to do QA on their documents and developers who want to build their own QA pipeline. - For end users: @@ -32,47 +32,150 @@ This repository is under active development. Feedback, issues, and PRs are highl appreciated. Your input is valuable as it helps us persuade our business guys to support open source. -## Setting up +## Installation -- Clone the repo +### For end users - ```shell - git clone git@github.com:Cinnamon/kotaemon.git - cd kotaemon - ``` +This document is intended for developers. If you just want to install and use the app as +is, please follow the [User Guide](https://cinnamon.github.io/kotaemon/). 
-- Install the environment +### For developers - - Create a conda environment (python >= 3.10 is recommended) +```shell +# Create an environment +python -m venv kotaemon-env - ```shell - conda create -n kotaemon python=3.10 - conda activate kotaemon +# Activate the environment +source kotaemon-env/bin/activate - # install dependencies - cd libs/kotaemon - pip install -e ".[all]" - ``` +# Install the package +pip install git+https://github.com/Cinnamon/kotaemon.git +``` - - Or run the installer (one of the `scripts/run_*` scripts depends on your OS), then - you will have all the dependencies installed as a conda environment at - `install_dir/env`. +## Creating your application - ```shell - conda activate install_dir/env - ``` +In order to create your own application, you need to prepare these files: -- Pre-commit +- `flowsettings.py` +- `app.py` +- `.env` (Optional) - ```shell - pre-commit install - ``` +### `flowsettings.py` -- Test +This file contains the configuration of your application. You can use the example +[here](https://github.com/Cinnamon/kotaemon/blob/main/libs/ktem/flowsettings.py) as the +starting point. - ```shell - pytest tests - ``` +### `app.py` + +This file is where you create your Gradio app object. This can be as simple as: + +```python +from ktem.main import App + +app = App() +demo = app.make() +demo.launch() +``` + +### `.env` (Optional) + +This file provides another way to configure your models and credentials. + +
+ +Configure model via the .env file + +Alternatively, you can configure the models via the `.env` file with the information needed to connect to the LLMs. This file is located in +the folder of the application. If you don't see it, you can create one. + +Currently, the following providers are supported: + +#### OpenAI + +In the `.env` file, set the `OPENAI_API_KEY` variable with your OpenAI API key in order +to enable access to OpenAI's models. There are other variables that can be modified; +please feel free to edit them to fit your case. Otherwise, the default parameters should +work for most people. + +```shell +OPENAI_API_BASE=https://api.openai.com/v1 +OPENAI_API_KEY= +OPENAI_CHAT_MODEL=gpt-3.5-turbo +OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002 +``` + +#### Azure OpenAI + +For OpenAI models via the Azure platform, you need to provide your Azure endpoint and API +key. You might also need to provide your deployments' names for the chat model and the +embedding model, depending on how you set up your Azure deployment. + +```shell +AZURE_OPENAI_ENDPOINT= +AZURE_OPENAI_API_KEY= +OPENAI_API_VERSION=2024-02-15-preview +AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-35-turbo +AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=text-embedding-ada-002 +``` + +#### Local models + +- Pros: +- Privacy. Your documents will be stored and processed locally. +- Choices. There is a wide range of LLMs in terms of size, domain, and language to choose + from. +- Cost. It's free. +- Cons: +- Quality. Local models are much smaller and thus have lower generative quality than + paid APIs. +- Speed. Local models are deployed using your machine so the processing speed is + limited by your hardware. + +##### Find and download an LLM + +You can search for and download an LLM to run locally from the [Hugging Face +Hub](https://huggingface.co/models). Currently, these model formats are supported: + +- GGUF + +You should choose a model whose size is less than your device's available memory and leaves +about 2 GB free. 
For example, if you have 16 GB of RAM in total, of which 12 GB is available, +then you should choose a model that takes up at most 10 GB of RAM. Bigger models tend to +give better generation but also take more processing time. + +Here are some recommendations and their size in memory: + +- [Qwen1.5-1.8B-Chat-GGUF](https://huggingface.co/Qwen/Qwen1.5-1.8B-Chat-GGUF/resolve/main/qwen1_5-1_8b-chat-q8_0.gguf?download=true): + around 2 GB + +##### Enable local models + +To add a local model to the model pool, set the `LOCAL_MODEL` variable in the `.env` +file to the path of the model file. + +```shell +LOCAL_MODEL= +``` + +Here is how to get the full path of your model file: + +- On Windows 11: right click the file and select `Copy as Path`. +
+ +## Start your application + +Simply run the following command: + +```shell +python app.py +``` + +The app will be automatically launched in your browser. + +![Chat tab](https://raw.githubusercontent.com/Cinnamon/kotaemon/main/docs/images/chat-tab.png) + +## Customize your application Please refer to the [Developer Guide](https://cinnamon.github.io/kotaemon/development/) for more details. diff --git a/docs/development/contributing.md b/docs/development/contributing.md index af9f129..d6ade8f 100644 --- a/docs/development/contributing.md +++ b/docs/development/contributing.md @@ -1,5 +1,47 @@ # Contributing +## Setting up + +- Clone the repo + + ```shell + git clone git@github.com:Cinnamon/kotaemon.git + cd kotaemon + ``` + +- Install the environment + + - Create a conda environment (python >= 3.10 is recommended) + + ```shell + conda create -n kotaemon python=3.10 + conda activate kotaemon + + # install dependencies + cd libs/kotaemon + pip install -e ".[all]" + ``` + + - Or run the installer (one of the `scripts/run_*` scripts depends on your OS), then + you will have all the dependencies installed as a conda environment at + `install_dir/env`. + + ```shell + conda activate install_dir/env + ``` + +- Pre-commit + + ```shell + pre-commit install + ``` + +- Test + + ```shell + pytest tests + ``` + ## Package overview `kotaemon` library focuses on the AI building blocks to implement a RAG-based QA application. It consists of base interfaces, core components and a list of utilities: diff --git a/docs/index.md b/docs/index.md index 28ccd1a..4f8af20 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,6 +1,6 @@ # Getting Started with Kotaemon -![chat demo](images/chat-demo.gif) +![type:video](https://github.com/Cinnamon/kotaemon/assets/25688648/815ecf68-3a02-4914-a0dd-3f8ec7e75cd9) This page is intended for **end users** who want to use the `kotaemon` tool for Question Answering on local documents. 
If you are a **developer** who wants contribute to the project, please visit the [development](development/index.md) page. @@ -25,9 +25,9 @@ Download and upzip the latest version of `kotaemon` by clicking this ## Launch -To launch the app after initial setup or any changes, simply run the `run_*` script again. +To launch the app after initial setup or any change, simply run the `run_*` script again. -A browser window will be opened and greet you with this screen: +A browser window will open and greet you with this screen: ![Chat tab](https://raw.githubusercontent.com/Cinnamon/kotaemon/main/docs/images/chat-tab.png) diff --git a/docs/pages/app/customize-ui.md b/docs/pages/app/customize-ui.md deleted file mode 100644 index e69de29..0000000 diff --git a/mkdocs.yml b/mkdocs.yml index 71dfb90..5d33e8b 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -12,9 +12,10 @@ nav: # - Functional description: pages/app/functional-description.md - Development: - development/index.md - - Contributing: development/contributing.md # - Data & Data Structure Components: development/data-components.md # - Features: pages/app/features.md + - Customize flow logic: pages/app/customize-flows.md + - Creating a Component: development/create-a-component.md - Components: - Index: - File index: pages/app/index/file.md @@ -23,8 +24,7 @@ nav: - pages/app/settings/user-settings.md - Extension: - User management: pages/app/ext/user-management.md - - Customize flow logic: pages/app/customize-flows.md - - Creating a Component: development/create-a-component.md + - Contributing: development/contributing.md # generated using gen-files + literate-nav - API Reference: reference/ - Issue Tracker: "https://github.com/Cinnamon/kotaemon/issues" @@ -77,6 +77,7 @@ plugins: type: timeago fallback_to_build_date: true - section-index + - mkdocs-video theme: features: