Commit Graph

104 Commits

Author SHA1 Message Date
Duc Nguyen (john)
66905d39c4
Allow adding, updating and deleting indices (#24)
* Allow adding indices

* Allow deleting indices

* Allow updating the indices

* When there are multiple indices, group them below Indices tab

* Update elem classes
2024-04-12 15:41:09 +07:00
Duc Nguyen (john)
5ce6bac03d
Allow listing indices (#22) 2024-04-11 16:28:04 +07:00
Duc Nguyen (john)
3ed50b0f10
Improve LLMs and Embedding models resources experience (#21)
* Fix inconsistent default values
* Disallow LLM's empty name. Handle LLM creation error on UI
2024-04-11 07:50:53 +07:00
Duc Nguyen (john)
f3e82b2e70
Put the preparation step in FileIndex to on_start (#20) 2024-04-10 19:30:45 +07:00
ian_Cin
b507eef541
Improve manuals (#19)
* Rename Admin -> Resources
* Improve ui
* Update docs
2024-04-10 17:04:04 +07:00
Duc Nguyen (john)
7b3307e3c4
Provide embedding manager (#16)
* Provide the Embedding management UI

* Update Fastembed documentation

* Add validation when adding / updating embeddings

* Stop using the old ktem embeddings manager

* Set default local embedding models

* Move the local embeddings below in flowsettings

* Update flowsettings
2024-04-10 15:11:44 +07:00
Duc Nguyen (john)
ed10020ea3
Refactor embeddings and provide vanilla OpenAI-based embeddings (#11)
* Prepend all Langchain-based embeddings with LC

* Provide vanilla OpenAI embeddings

* Add test for AzureOpenAIEmbeddings and OpenAIEmbeddings

* Fix disallowed empty string

* Use OpenAIEmbeddings in flowsettings

---------

Co-authored-by: ian_Cin <ian@cinnamon.is>
2024-04-09 15:07:59 +07:00
Duc Nguyen (john)
e75354d410
Enable fastembed as a local embedding vendor (#12)
* Prepend all Langchain-based embeddings with LC

* Provide vanilla OpenAI embeddings

* Add test for AzureOpenAIEmbeddings and OpenAIEmbeddings

* Incorporate fastembed

---------

Co-authored-by: ian_Cin <ian@cinnamon.is>
2024-04-09 01:44:34 +07:00
ian_Cin
8001c86b16
Feat/new UI (#13)
* new custom theme

* improve css: scrollbar, header, tabs and buttons

* update settings tab

* open file index selector by default

* update chat control panel

* update chat panel

* update file index page

* cap gradio<=4.22.0

* rename admin page

* adjust UI

* update flowsettings

* auto start in browser

* change colour for edit LLM page's button
2024-04-08 22:23:00 +07:00
Duc Nguyen (john)
a203fc0f7c
Allow users to add LLM within the UI (#6)
* Rename AzureChatOpenAI to LCAzureChatOpenAI
* Provide vanilla ChatOpenAI and AzureChatOpenAI
* Remove the highest accuracy, lowest cost criteria

These criteria are unnecessary. The users, not pipeline creators, should choose
which LLM to use. Furthermore, inputting this information is cumbersome and
really degrades the user experience.

* Remove the LLM selection in simple reasoning pipeline
* Provide a dedicated stream method to generate the output
* Return placeholder message to chat if the text is empty
2024-04-06 11:53:17 +07:00
Duc Nguyen (john)
e187e23dd1
Improve UX (#9)
* Go to the chat tab when re-signing in
* Allow the placeholder message to be configurable
* Hide setting tabs if there aren't any settings
2024-04-04 15:39:24 +07:00
ian_Cin
ecf09b275f
Fix UI bugs (#8)
* Auto create conversation when the user starts

* Add conversation rename rule check

* Fix empty name during save

* Confirm deleting conversation

* Show a warning if users don't select a file when uploading files in the File Index

* Feedback when user uploads duplicated file

* Limit the file types

* Fix valid username

* Allow login when the username has leading and trailing whitespace

* Improve the user

* Disable admin panel for non-admin user

* Refresh user lists after creating/deleting users

* Auto logging in

* Clear admin information upon signing out

* Fix being unable to receive uploaded filenames that include special characters, like !@#$%^&*().pdf

* Set upload validation for FileIndex

* Improve user management UI/UX

* Show extraction error when indexing file

* Return selected user -1 when signing out

* Fix default supported file types in file index

* Validate changing password

* Allow the selector to contain multiple gradio components

* A more tolerable placeholder screen

* Allow chat suggestion box

* Increase concurrency limit

* Make adobe loader optional

* Use BaseReasoning

---------

Co-authored-by: trducng <trungduc1992@gmail.com>
2024-04-03 16:33:54 +07:00
ian_Cin
43a18ba070
Feat/regenerate answer (#7)
* Add regen button and rephrasing question on regen

* Stop appending regen messages to history, allow only one

* Add dynamic conversation state

* Allow reasoning pipeline to manipulate state

---------

Co-authored-by: albert <albert@cinnamon.is>
Co-authored-by: Duc Nguyen (john) <trungduc1992@gmail.com>
2024-04-03 15:37:55 +07:00
ian_Cin
e67a25c0bd
Feat/add multimodal loader (#5)
* Add Adobe reader as the multimodal loader

* Allow FullQAPipeline to reason on figures

* fix: move the adobe import to avoid ImportError, notify users whenever they run the AdobeReader

---------

Co-authored-by: cin-albert <albert@cinnamon.is>
2024-04-03 14:52:40 +07:00
ian_Cin
a3bf728400
Update various docs (#4)
* rename cli tool

* remove redundant docs

* update docs

* update macos instructions

* add badges
2024-03-29 19:47:03 +07:00
ian
14482e9755 bug fix: settings are not persistent 2024-03-28 16:36:05 +07:00
ian
f9cc40ca25 improve llm selection for simple reasoning pipeline 2024-03-28 16:35:13 +07:00
ian
e3498a4958 rename ktem test dir 2024-03-28 16:27:05 +07:00
ian
c1b1371a68 enable config through .env 2024-03-27 19:04:48 +07:00
ian
da86fa463f rename test dir 2024-03-27 18:56:06 +07:00
ian_Cin
d22ae88c7a make default installation faster (#2)
* remove cohere as default

* refactor dependencies

* use llama-index pdf reader as default (pypdf)

* fix some lazy docstring

* update install scripts

* minor fix
2024-03-21 22:48:20 +07:00
ian_Cin
a8f92b3f9e post migrate cleanup 2024-03-18 23:10:20 +07:00
ian_Cin
df12dec732 Feat/local endpoint llm (#148)
* serve local model in a different process from the app
---------

Co-authored-by: albert <albert@cinnamon.is>
Co-authored-by: trducng <trungduc1992@gmail.com>
2024-03-15 16:17:33 +07:00
Duc Nguyen (john)
2950e6ed02 Improve behavior of simple reasoning (#157)
* Add base reasoning implementation

* Provide explicit async and streaming capability

* Allow refreshing the information panel
2024-03-12 13:03:38 +07:00
Duc Nguyen (john)
cb01d27d19 Fix integrating indexing and retrieval pipelines to FileIndex (#155)
* Add docs for settings
* Add mdx_truly_sane_lists to doc requirements
2024-03-10 16:41:42 +07:00
trducng
2b3571e892 Fix subscribing sign-in/out 2024-03-08 13:38:29 +07:00
Duc Nguyen (john)
4f356f7f9a Provide dedicated page for login (#153) 2024-03-08 08:06:51 +07:00
Duc Nguyen (john)
9725d60791 Create user management functionality (#152)
* Create user management page
* Remove old user creating UI
* Add username validation; admin user auto-creation
* Provide docs on user management
* Bump version
2024-03-07 14:19:37 +07:00
Duc Nguyen (john)
8a90fcfc99 Restructure index to allow it to be dynamically created by end-user (#151)
1. Introduce the concept of "collection_name" to docstore and vector store. Each collection can be viewed similarly to a table in a SQL database. This allows information within the data source to be organized better.
2. Move the `Index` and `Source` tables from the application scope into the index scope. For each new index created by the user, these tables should increase accordingly, so they depend on the index rather than the app.
3. Make each index responsible for the UI components in the app.
4. Construct the File UI page.
2024-03-07 01:50:47 +07:00
Duc Nguyen (john)
033e7e05cc Improve kotaemon based on insights from projects (#147)
- Include static files in the package.
- More reliable information panel. Faster & not breaking randomly.
- Add directory upload.
- Enable zip file upload.
- Allow setting endpoint for the OCR reader using environment variable.
2024-02-28 22:18:29 +07:00
Duc Nguyen (john)
e1cf970a3d Disable Gradio analytics and unnecessary font loading to avoid app hanging in private network (#145) 2024-02-20 22:02:28 +07:00
trducng
08cc99d8db Add docstring for database and OCR loader 2024-02-20 21:20:48 +07:00
Duc Nguyen (john)
767aaaa1ef Utilize llama.cpp for both completion and chat models (#141) 2024-02-20 18:17:48 +07:00
trducng
89450ab661 Enable zip file upload in ktem 2024-02-20 02:59:46 +07:00
Duc Nguyen (john)
d36522129f refactor: replace the llama-index based loader with a llama-index mixin loader (#142) 2024-02-20 02:33:28 +07:00
trducng
7fc54d52e4 Improve ocr loader error message 2024-02-06 12:21:12 +07:00
trducng
1a4fd7c33f Update default settings to conform to Langchain's Azure implementation 2024-02-05 18:04:36 +07:00
trducng
771f074c0e Add utf-8 encoding in Help Page for rendering on Windows 2024-02-05 16:42:40 +07:00
trducng
bff55230ba Reduce the default chunk size in the reasoning pipeline to fit LLM capability 2024-02-03 09:38:50 +07:00
trducng
107bc7580e Enable HTML upload 2024-02-02 11:37:57 +07:00
Duc Nguyen (john)
65852b7d71 Add docx + html reader (#139) 2024-01-31 19:21:30 +07:00
ian_Cin
116919b346 Update docs (#106) 2024-01-30 18:50:17 +07:00
trducng
cbe40fac99 Show retrieved but non-evidence docs. Support language changing 2024-01-29 11:16:07 +07:00
trducng
50b5d936f5 Optionally allow database migration with Alembic 2024-01-28 19:54:15 +07:00
trducng
04635b77f6 Make the database table customizable 2024-01-28 07:54:38 +07:00
trducng
6ae9634399 Enable .doc file 2024-01-27 23:45:19 +07:00
trducng
23c0331bab Enable pptx support 2024-01-27 23:08:06 +07:00
trducng
80ec214107 Fix loaders' file_path and other metadata 2024-01-27 22:52:46 +07:00
trducng
c6637ca56e Relate the retrievers to the indexer 2024-01-27 16:39:40 +07:00
Duc Nguyen (john)
22c646e5c4 Add documentation about adding reasoning and indexing pipelines to the application (#138) 2024-01-26 22:31:52 +07:00
trducng
757aabca4d Add app title, favicon. More natural chat 2024-01-25 22:40:32 +07:00
Duc Nguyen (john)
513e86f490 Add dedicated information panel to the UI (#137)
* Allow streaming to the chatbot and the information panel without threading
* Highlight evidence in a simple manner
2024-01-25 19:07:53 +07:00
Duc Nguyen (john)
ebc61400d8 Provide a developer mode when running ktem (#135)
Implement and utilize `on_app_created` to support the developer mode.
2024-01-23 11:46:59 +07:00
Duc Nguyen (john)
2dd531114f Make ktem official (#134)
* Move kotaemon and ktem into same folder

* Update docs

* Update CI

* Resolve mypy and isort issues

* Re-allow test pdf files
2024-01-23 10:54:18 +07:00