ian_Cin
a86c727869
add albert to git-secret ( #140 )
...
* add albert to git-secret
* update readme
* Limit llama-index version
* Langchain upgrade their wikipedia tool name
---------
Co-authored-by: trducng <trungduc1992@gmail.com>
2024-02-20 17:28:06 +07:00
trducng
89450ab661
Enable zip file upload in ktem
2024-02-20 02:59:46 +07:00
Duc Nguyen (john)
d36522129f
refactor: replace the llama-index based loader with a llama-index mixin loader ( #142 )
2024-02-20 02:33:28 +07:00
trducng
7fc54d52e4
Improve ocr loader error message
2024-02-06 12:21:12 +07:00
trducng
1a4fd7c33f
Update default settings to conform to Langchain's Azure implementation
2024-02-05 18:04:36 +07:00
trducng
771f074c0e
Add utf-8 encoding in Help Page for rendering on Windows
2024-02-05 16:42:40 +07:00
trducng
bff55230ba
Reduce the default chunk size in the reasoning pipeline to fit LLM capability
2024-02-03 09:38:50 +07:00
trducng
107bc7580e
Enable HTML upload
2024-02-02 11:37:57 +07:00
Duc Nguyen (john)
65852b7d71
Add docx + html reader ( #139 )
2024-01-31 19:21:30 +07:00
ian_Cin
116919b346
Update docs ( #106 )
2024-01-30 18:50:17 +07:00
trducng
cbe40fac99
Show retrieved but non-evidence docs. Support language changing
2024-01-29 11:16:07 +07:00
trducng
50b5d936f5
Optionally allow database migration with Alembic
2024-01-28 19:54:15 +07:00
trducng
04635b77f6
Make the database table customizable
2024-01-28 07:54:38 +07:00
trducng
6ae9634399
Enable .doc file
2024-01-27 23:45:19 +07:00
trducng
23c0331bab
Enable pptx support
2024-01-27 23:08:06 +07:00
trducng
80ec214107
Fix loaders' file_path and other metadata
2024-01-27 22:52:46 +07:00
trducng
c6637ca56e
Relate the retrievers to the indexer
2024-01-27 16:39:40 +07:00
trducng
9b586466ff
Add the tutorial to mkdocs
2024-01-26 15:40:04 +00:00
Duc Nguyen (john)
22c646e5c4
Add documentation about adding reasoning and indexing pipelines to the application ( #138 )
2024-01-26 22:31:52 +07:00
trducng
757aabca4d
Add app title, favicon. More natural chat
2024-01-25 22:40:32 +07:00
Duc Nguyen (john)
513e86f490
Add dedicated information panel to the UI ( #137 )
...
* Allow streaming to the chatbot and the information panel without threading
* Highlight evidence in a simple manner
2024-01-25 19:07:53 +07:00
Duc Nguyen (john)
ebc61400d8
Provide a developer mode when running ktem ( #135 )
...
Implement and utilize `on_app_created` to support the developer mode.
2024-01-23 11:46:59 +07:00
Duc Nguyen (john)
2dd531114f
Make ktem official ( #134 )
...
* Move kotaemon and ktem into same folder
* Update docs
* Update CI
* Resolve mypy, isorts
* Re-allow test pdf files
2024-01-23 10:54:18 +07:00
Duc Nguyen (john)
9c5b707010
Customize application settings ( #132 )
...
* Allow customizing the base application
* Make the core llms and embeddings customizable
* Make the settings, reasoning and index customizable
* Import from langchain_openai
2024-01-21 14:36:07 +07:00
Duc Nguyen (john)
5a9d6f75be
Migrate the MVP into kotaemon ( #108 )
...
- Migrate the MVP into kotaemon.
- Preliminarily include the pipeline within the chatbot interface.
- Organize MVP as an application.
Todo:
- Add an info panel to view the planning of agents -> Fix streaming agents' output.
Resolve : #60
Resolve : #61
Resolve : #62
2024-01-10 15:28:09 +07:00
ian_Cin
230328c62f
Best docs Cinnamon will probably ever have ( #105 )
2023-12-20 11:30:25 +07:00
Duc Nguyen (john)
0e30dcbb06
Create Langchain LLM converter to quickly supply it to Langchain's chain ( #102 )
...
* Create Langchain LLM converter to quickly supply it to Langchain's chain
* Clean up
2023-12-11 14:55:56 +07:00
Duc Nguyen (john)
da0ac1d69f
Change template to private attribute and simplify imports ( #101 )
...
---------
Co-authored-by: ian <ian@cinnamon.is>
2023-12-08 18:10:34 +07:00
Duc Nguyen (john)
1f927d3391
Upgrade promptui to conform to Gradio V4 ( #98 )
2023-12-07 15:24:07 +07:00
ian_Cin
797df5a69c
refactor agents ( #100 )
...
* refactor agents
* minor cosmetic changes, add terminal ui for cli
* bump to 0.3.4
* Add temporary path
* fix unclosed files in tests
---------
Co-authored-by: trducng <trungduc1992@gmail.com>
2023-12-06 17:06:29 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
d9e925eb75
Add UnstructuredReader with support for various legacy files (.doc, .xls) ( #99 )
2023-12-05 16:19:13 +07:00
Duc Nguyen (john)
37c744b616
Add file-based document store and vector store ( #96 )
...
* Modify docstore and vectorstore objects to be reconstructable
* Simplify the file docstore
* Use the simple file docstore and vector store in MVP
2023-12-04 17:46:00 +07:00
Duc Nguyen (john)
0ce3a8832f
Provide type hints for pass-through Langchain and Llama-index objects ( #95 )
2023-12-04 10:59:13 +07:00
Duc Nguyen (john)
e34b1e4c6d
Refactor the index component and update the MVP insurance accordingly ( #90 )
...
Refactor the `kotaemon/pipelines` module to `kotaemon/indices`. Create the VectorIndex.
Note: I currently place `qa` inside `kotaemon/indices`, since at the moment we only have `qa` in RAG. At the same time, I think `qa` could be an independent module in `kotaemon/qa`. Since this can be changed later, I am going with the first option for now and will revisit it if needed.
2023-11-30 18:35:07 +07:00
Nguyen Trung Duc (john)
8e3a1d193f
Refactor agents and tools ( #91 )
...
* Move tools to agents
* Move agents to a dedicated place
* Remove subclassing BaseAgent from BaseTool
2023-11-30 09:52:08 +07:00
ian_Cin
4256030b4f
Adopt pyproject.toml ( #89 )
...
* ditching setup.py in favour of pyproject.toml; bump to 0.3.2
* bump to 0.3.3
2023-11-29 14:58:35 +07:00
ian_Cin
8e0779a22d
Enforce all IO objects to be subclassed from Document ( #88 )
...
* enforce Document as IO
* Separate rerankers, splitters and extractors ( #85 )
* partially refactor importing
* add text to embedding outputs
---------
Co-authored-by: Nguyen Trung Duc (john) <trungduc1992@gmail.com>
2023-11-27 16:35:09 +07:00
Nguyen Trung Duc (john)
2186c5558f
Separate rerankers, splitters and extractors ( #85 )
2023-11-27 14:25:54 +07:00
ian_Cin
0dede9c82d
Subclass chat messages from Document ( #86 )
2023-11-27 10:38:19 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
3ac277cc0b
Update Elasticsearch store delete() ( #84 )
2023-11-21 15:29:00 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
9a96a9b876
Add Elasticsearch Docstore ( #83 )
...
* add Elasticsearch Docstore
* update missing requirements
* add docstore
* [ignore cache] update default param
* update docstring
2023-11-21 11:59:20 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
8bb7ad91e0
Add Langchain Agent wrapper with OpenAI Function / Self-ask agent support ( #82 )
...
* update Param() type hint in MVP
* update default embedding endpoint
* update Langchain agent wrapper
* update langchain agent
2023-11-20 16:26:08 +07:00
Nguyen Trung Duc (john)
0a3fc4b228
Correct the use of abstractmethod ( #80 )
...
* Correct abstractmethod usage
* Update interface
* Specify minimal llama-index version [ignore cache]
* Update examples
2023-11-20 11:18:53 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
98509f886c
Update splitters + metadata extractor interface to conform with new LlamaIndex design ( #81 )
...
* change splitter to a general doc parser class to fit the new llama-index design
* moving interface of splitter
2023-11-20 10:09:30 +07:00
Nguyen Trung Duc (john)
98c76c4700
Refactor excel Loader ( #79 )
2023-11-16 11:30:11 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
cc1e75b3c6
Add Citation pipeline ( #78 )
...
* add rerankers in retrieving pipeline
* update example MVP pipeline
* add citation pipeline and function call interface
* change return type of QA and AgentPipeline to Document
2023-11-16 11:24:35 +07:00
Nguyen Trung Duc (john)
f8b8d86d4e
Move LLM-related components into LLM module ( #74 )
...
* Move splitter into indexing module
* Rename post_processing module to parsers
* Migrate LLM-specific composite pipelines into llms module
This change moves the `splitters` module into the `indexing` module. The `indexing` module will be created soon to house `indexing`-related components.
This change renames the `post_processing` module to `parsers`. "Post-processing" is a generic term that conveys very little information. In the future, we will add other extractors to the `parsers` module, such as a metadata extractor.
This change migrates the composite elements into the `llms` module. These elements heavily assume that the internal nodes are LLM-specific. As a result, migrating them into the `llms` module makes them more discoverable and simplifies the code base structure.
2023-11-15 16:26:53 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
9945afdf6f
Add Reranker implementation and integration in Retrieving pipeline ( #77 )
...
* Add base Reranker
* Add LLM Reranker
* Add Cohere Reranker
* Add integration of Rerankers in Retrieving pipeline
2023-11-15 16:03:51 +07:00
Nguyen Trung Duc (john)
b52f312d8e
Use new Langchain's dedicated Azure OpenAI embedding class ( #76 )
...
* Use new Langchain's dedicated Azure OpenAI embedding class
* Update test
2023-11-15 14:46:32 +07:00
Nguyen Trung Duc (john)
b159897ac6
Combine docstores and vectorstores within a storages component ( #72 )
2023-11-14 17:50:57 +07:00