diff --git a/README.md b/README.md index ad09490e..20e67a9d 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,13 @@ ![](img/logo-long-chatchat-trans-v2.png) - 🌍 [READ THIS IN ENGLISH](README_en.md) 📃 **LangChain-Chatchat** (原 Langchain-ChatGLM) 基于 ChatGLM 等大语言模型与 Langchain 等应用框架实现,开源、可离线部署的检索增强生成(RAG)大模型知识库项目。 +⚠️`0.2.10`将会是`0.2.x`系列的最后一个版本,`0.2.x`系列版本将会停止更新和技术支持,全力研发具有更强应用性的 `Langchain-Chatchat 0.3.x`。 + --- ## 目录 @@ -14,23 +15,31 @@ * [介绍](README.md#介绍) * [解决的痛点](README.md#解决的痛点) * [快速上手](README.md#快速上手) - * [1. 环境配置](README.md#1-环境配置) - * [2. 模型下载](README.md#2-模型下载) - * [3. 初始化知识库和配置文件](README.md#3-初始化知识库和配置文件) - * [4. 一键启动](README.md#4-一键启动) - * [5. 启动界面示例](README.md#5-启动界面示例) + * [1. 环境配置](README.md#1-环境配置) + * [2. 模型下载](README.md#2-模型下载) + * [3. 初始化知识库和配置文件](README.md#3-初始化知识库和配置文件) + * [4. 一键启动](README.md#4-一键启动) + * [5. 启动界面示例](README.md#5-启动界面示例) * [联系我们](README.md#联系我们) - ## 介绍 -🤖️ 一种利用 [langchain](https://github.com/hwchase17/langchain) 思想实现的基于本地知识库的问答应用,目标期望建立一套对中文场景与开源模型支持友好、可离线运行的知识库问答解决方案。 +🤖️ 一种利用 [langchain](https://github.com/langchain-ai/langchain) +思想实现的基于本地知识库的问答应用,目标期望建立一套对中文场景与开源模型支持友好、可离线运行的知识库问答解决方案。 -💡 受 [GanymedeNil](https://github.com/GanymedeNil) 的项目 [document.ai](https://github.com/GanymedeNil/document.ai) 和 [AlexZhangji](https://github.com/AlexZhangji) 创建的 [ChatGLM-6B Pull Request](https://github.com/THUDM/ChatGLM-6B/pull/216) 启发,建立了全流程可使用开源模型实现的本地知识库问答应用。本项目的最新版本中通过使用 [FastChat](https://github.com/lm-sys/FastChat) 接入 Vicuna, Alpaca, LLaMA, Koala, RWKV 等模型,依托于 [langchain](https://github.com/langchain-ai/langchain) 框架支持通过基于 [FastAPI](https://github.com/tiangolo/fastapi) 提供的 API 调用服务,或使用基于 [Streamlit](https://github.com/streamlit/streamlit) 的 WebUI 进行操作。 +💡 受 [GanymedeNil](https://github.com/GanymedeNil) 的项目 [document.ai](https://github.com/GanymedeNil/document.ai) +和 [AlexZhangji](https://github.com/AlexZhangji) +创建的 [ChatGLM-6B Pull Request](https://github.com/THUDM/ChatGLM-6B/pull/216) +启发,建立了全流程可使用开源模型实现的本地知识库问答应用。本项目的最新版本中通过使用 [FastChat](https://github.com/lm-sys/FastChat) +接入 Vicuna, Alpaca, LLaMA, Koala, RWKV 等模型,依托于 [langchain](https://github.com/langchain-ai/langchain) +框架支持通过基于 [FastAPI](https://github.com/tiangolo/fastapi) 提供的 API +调用服务,或使用基于 [Streamlit](https://github.com/streamlit/streamlit) 的 WebUI 进行操作。 -✅ 依托于本项目支持的开源 LLM 与 Embedding 模型,本项目可实现全部使用**开源**模型**离线私有部署**。与此同时,本项目也支持 OpenAI GPT API 的调用,并将在后续持续扩充对各类模型及模型 API 的接入。 +✅ 依托于本项目支持的开源 LLM 与 Embedding 模型,本项目可实现全部使用**开源**模型**离线私有部署**。与此同时,本项目也支持 +OpenAI GPT API 的调用,并将在后续持续扩充对各类模型及模型 API 的接入。 -⛓️ 本项目实现原理如下图所示,过程包括加载文件 -> 读取文本 -> 文本分割 -> 文本向量化 -> 问句向量化 -> 在文本向量中匹配出与问句向量最相似的 `top k`个 -> 匹配出的文本作为上下文和问题一起添加到 `prompt`中 -> 提交给 `LLM`生成回答。 +⛓️ 本项目实现原理如下图所示,过程包括加载文件 -> 读取文本 -> 文本分割 -> 文本向量化 -> 问句向量化 -> +在文本向量中匹配出与问句向量最相似的 `top k`个 -> 匹配出的文本作为上下文和问题一起添加到 `prompt`中 -> 提交给 `LLM`生成回答。 📺 [原理介绍视频](https://www.bilibili.com/video/BV13M4y1e7cN/?share_source=copy_web&vd_source=e6c5aafe684f30fbe41925d61ca6d514) @@ -42,7 +51,8 @@ 🚩 本项目未涉及微调、训练过程,但可利用微调或训练对本项目效果进行优化。 -🌐 [AutoDL 镜像](https://www.codewithgpu.com/i/chatchat-space/Langchain-Chatchat/Langchain-Chatchat) 中 `v13` 版本所使用代码已更新至本项目 `v0.2.9` 版本。 +🌐 [AutoDL 镜像](https://www.codewithgpu.com/i/chatchat-space/Langchain-Chatchat/Langchain-Chatchat) 中 `v13` +版本所使用代码已更新至本项目 `v0.2.9` 版本。 🐳 [Docker 镜像](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.6) 已经更新到 ```0.2.7``` 版本。 @@ -52,7 +62,10 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.7 ``` -🧩 本项目有一个非常完整的[Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/) 
, README只是一个简单的介绍,__仅仅是入门教程,能够基础运行__。 如果你想要更深入的了解本项目,或者想对本项目做出贡献。请移步 [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/) 界面
+🧩 本项目有一个非常完整的 [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/),README 只是一个简单的介绍,__仅仅是入门教程,能够基础运行__。
+如果你想要更深入地了解本项目,或者想对本项目做出贡献,请移步 [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/) 界面。

 ## 解决的痛点

@@ -62,17 +75,19 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/ch
 我们支持市面上主流的本地大语言模型和Embedding模型,支持开源的本地向量数据库。
 支持列表详见[Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/)

-
 ## 快速上手

 ### 1. 环境配置

-+ 首先,确保你的机器安装了 Python 3.8 - 3.11
++ 首先,确保你的机器安装了 Python 3.8 - 3.11(我们强烈推荐使用 Python 3.11)。
+
 ```
 $ python --version
 Python 3.11.7
 ```

 + 接着,创建一个虚拟环境,并在虚拟环境内安装项目的依赖
+
 ```shell

 # 拉取仓库
@@ -88,33 +103,44 @@ $ pip install -r requirements_webui.txt

 # 默认依赖包括基本运行环境(FAISS向量库)。如果要使用 milvus/pg_vector 等向量库,请将 requirements.txt 中相应依赖取消注释再安装。
 ```
-请注意,LangChain-Chatchat `0.2.x` 系列是针对 Langchain `0.0.x` 系列版本的,如果你使用的是 Langchain `0.1.x` 系列版本,需要降级。
+
+请注意,LangChain-Chatchat `0.2.x` 系列是针对 Langchain `0.0.x` 系列版本的,如果你使用的是 Langchain `0.1.x`
+系列版本,需要降级您的 `Langchain` 版本。
+
 ### 2, 模型下载

-如需在本地或离线环境下运行本项目,需要首先将项目所需的模型下载至本地,通常开源 LLM 与 Embedding 模型可以从 [HuggingFace](https://huggingface.co/models) 下载。
+如需在本地或离线环境下运行本项目,需要首先将项目所需的模型下载至本地,通常开源 LLM 与 Embedding
+模型可以从 [HuggingFace](https://huggingface.co/models) 下载。

-以本项目中默认使用的 LLM 模型 [THUDM/ChatGLM3-6B](https://huggingface.co/THUDM/chatglm3-6b) 与 Embedding 模型 [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) 为例:
+以本项目中默认使用的 LLM 模型 [THUDM/ChatGLM3-6B](https://huggingface.co/THUDM/chatglm3-6b) 与 Embedding
+模型 [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) 为例:

-下载模型需要先[安装 Git LFS](https://docs.github.com/zh/repositories/working-with-files/managing-large-files/installing-git-large-file-storage),然后运行
+下载模型需要先[安装 Git LFS](https://docs.github.com/zh/repositories/working-with-files/managing-large-files/installing-git-large-file-storage),然后运行

 ```Shell
 $ git lfs install
 $ git clone https://huggingface.co/THUDM/chatglm3-6b
 $ git clone https://huggingface.co/BAAI/bge-large-zh
 ```
+
 ### 3. 初始化知识库和配置文件

 按照下列方式初始化自己的知识库和简单的复制配置文件
+
 ```shell
 $ python copy_config_example.py
 $ python init_database.py --recreate-vs
 ```
+
 ### 4. 一键启动

 按照以下命令启动项目
+
 ```shell
 $ python startup.py -a
 ```
+
 ### 5. 启动界面示例

 如果正常启动,你将能看到以下界面

@@ -133,29 +159,37 @@ $ python startup.py -a

 ![](img/init_knowledge_base.jpg)

-
 ### 注意

-以上方式只是为了快速上手,如果需要更多的功能和自定义启动方式 ,请参考[Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/)
+以上方式只是为了快速上手,如果需要更多的功能和自定义启动方式,请参考 [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/)

 ---
+
 ## 项目里程碑
+
 + `2023年4月`: `Langchain-ChatGLM 0.1.0` 发布,支持基于 ChatGLM-6B 模型的本地知识库问答。
 + `2023年8月`: `Langchain-ChatGLM` 改名为 `Langchain-Chatchat`,`0.2.0` 发布,使用 `fastchat` 作为模型加载方案,支持更多的模型和数据库。
-+ `2023年10月`: `Langchain-Chatchat 0.2.5` 发布,推出 Agent 内容,开源项目在`Founder Park & Zhipu AI & Zilliz` 举办的黑客马拉松获得三等奖。
++ `2023年10月`: `Langchain-Chatchat 0.2.5` 发布,推出 Agent 内容,开源项目在`Founder Park & Zhipu AI & Zilliz`
+  举办的黑客马拉松获得三等奖。
 + `2023年12月`: `Langchain-Chatchat` 开源项目获得超过 **20K** stars.
-+ `2024年1月`: `LangChain 0.1.x` 推出,`Langchain-Chatchat 0.2.x` 停止更新和技术支持,全力研发具有更强应用性的 `Langchain-Chatchat 0.3.x`。
++ `2024年1月`: `LangChain 0.1.x` 推出,`Langchain-Chatchat 0.2.x` 发布稳定版本 `0.2.10`
+  后将停止更新和技术支持,全力研发具有更强应用性的 `Langchain-Chatchat 0.3.x`。
+
 🔥 让我们一起期待未来 Chatchat 的故事 ···
+
 ---
+
 ## 联系我们
+
 ### Telegram
+
 [![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white "langchain-chatglm")](https://t.me/+RjliQ3jnJ1YyN2E9)

 ### 项目交流群
-<img src="img/qr_code_85.jpg" alt="二维码" />
+
+<img src="img/qr_code_87.jpg" alt="二维码" />

 🎉 Langchain-Chatchat 项目微信交流群,如果你也对本项目感兴趣,欢迎加入群聊参与讨论交流。

diff --git a/README_en.md b/README_en.md
index cd18304e..dba6d3b7 100644
--- a/README_en.md
+++ b/README_en.md
@@ -7,6 +7,10 @@ A LLM application aims to implement knowledge and search engine based QA based on Langchain and open-source or remote
 LLM API.

+⚠️ `0.2.10` will be the last release of the `0.2.x` series. The `0.2.x` series will no longer receive updates or
+technical support, and all efforts will go into developing the more capable `Langchain-Chatchat 0.3.x`.
+
+
 ---

 ## Table of Contents
@@ -24,7 +28,8 @@ LLM API.
 ## Introduction

 🤖️ A Q&A application based on local knowledge base implemented using the idea
-of [langchain](https://github.com/hwchase17/langchain). The goal is to build a KBQA(Knowledge based Q&A) solution that
+of [langchain](https://github.com/langchain-ai/langchain). The goal is to build a KBQA(Knowledge based Q&A) solution
+that
 is friendly to Chinese scenarios and open source models and can run both offline and online.

 💡 Inspired by [document.ai](https://github.com/GanymedeNil/document.ai)
@@ -55,10 +60,9 @@ The main process analysis from the aspect of document process:

 🚩 The training or fine-tuning are not involved in the project, but still, one always can improve performance by do these.

-🌐 [AutoDL image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.5) is supported, and in v13 the codes are update
-to v0.2.9.
+🌐 [AutoDL image](https://www.codewithgpu.com/i/chatchat-space/Langchain-Chatchat/Langchain-Chatchat) is supported, and in v13 the code has been updated to v0.2.9.

-🐳 [Docker image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.7)
+🐳 [Docker image](registry.cn-beijing.aliyuncs.com/chatchat/chatchat:0.2.7) has been updated to version 0.2.7.

 ## Pain Points Addressed

@@ -98,7 +102,9 @@ $ pip install -r requirements_webui.txt

 # 默认依赖包括基本运行环境(FAISS向量库)。如果要使用 milvus/pg_vector 等向量库,请将 requirements.txt 中相应依赖取消注释再安装。
 ```

-Please note that the LangChain-Chachat `0.2.x` series is for the Langchain `0.0.x` series version. If you are using the Langchain `0.1.x` series version, you need to downgrade.
+
+Please note that the LangChain-Chatchat `0.2.x` series is built for the Langchain `0.0.x` series. If you are using a
+Langchain `0.1.x` version, you need to downgrade.

 ### Model Download

@@ -157,15 +163,23 @@ The above instructions are provided for a quick start. If you need more features
 please refer to the [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/).

 ---
+
 ## Project Milestones

-+ `April 2023`: `Langchain-ChatGLM 0.1.0` released, supporting local knowledge base question and answer based on the ChatGLM-6B model.
-+ `August 2023`: `Langchain-ChatGLM` was renamed to `Langchain-Chatchat`, `0.2.0` was released, using `fastchat` as the model loading solution, supporting more models and databases.
-+ `October 2023`: `Langchain-Chachat 0.2.5` was released, Agent content was launched, and the open source project won the third prize in the hackathon held by `Founder Park & Zhipu AI & Zilliz`.
++ `April 2023`: `Langchain-ChatGLM 0.1.0` released, supporting local knowledge base question and answer based on the
+  ChatGLM-6B model.
++ `August 2023`: `Langchain-ChatGLM` was renamed to `Langchain-Chatchat`, `0.2.0` was released, using `fastchat` as the
+  model loading solution, supporting more models and databases.
++ `October 2023`: `Langchain-Chatchat 0.2.5` was released, Agent content was launched, and the open source project won
+  the third prize in the hackathon held by `Founder Park & Zhipu AI & Zilliz`.
 + `December 2023`: `Langchain-Chachat` open source project received more than **20K** stars.
-+ `January 2024`: `LangChain 0.1.x` is launched, `Langchain-Chatchat 0.2.x` will stop updating and technical support, and all efforts will be made to develop `Langchain-Chatchat 0.3.x` with stronger applicability.
++ `January 2024`: `LangChain 0.1.x` is launched. After the stable version `0.2.10` of `Langchain-Chatchat 0.2.x` is
+  released, updates and technical support will stop, and all efforts will be made to develop the more capable
+  `Langchain-Chatchat 0.3.x`.
+

 🔥 Let’s look forward to the future Chatchat stories together···
+
 ---

 ## Contact Us
@@ -176,7 +190,7 @@ please refer to the [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/

 ### WeChat Group

-<img src="img/qr_code_85.jpg" alt="二维码" />
+<img src="img/qr_code_87.jpg" alt="二维码" />

 ### WeChat Official Account

diff --git a/configs/model_config.py.example b/configs/model_config.py.example
index be1f4d18..d7053564 100644
--- a/configs/model_config.py.example
+++ b/configs/model_config.py.example
@@ -120,7 +120,7 @@ ONLINE_LLM_MODEL = {
         "secret_key": "",
         "provider": "TianGongWorker",
     },
-    # Gemini API (开发组未测试,由社群提供,只支持pro)
+    # Gemini API https://makersuite.google.com/app/apikey
     "gemini-api": {
         "api_key": "",
         "provider": "GeminiWorker",
diff --git a/configs/server_config.py.example b/configs/server_config.py.example
index 2f51c3ad..09b1546b 100644
--- a/configs/server_config.py.example
+++ b/configs/server_config.py.example
@@ -92,11 +92,10 @@ FSCHAT_MODEL_WORKERS = {
         # 'disable_log_requests': False
     },

-    # 可以如下示例方式更改默认配置
-    # "Qwen-1_8B-Chat": { # 使用default中的IP和端口
-    #     "device": "cpu",
-    # },
-    "chatglm3-6b": { # 使用default中的IP和端口
+    "Qwen-1_8B-Chat": {
+        "device": "cpu",
+    },
+    "chatglm3-6b": {
         "device": "cuda",
     },

@@ -129,16 +128,10 @@ FSCHAT_MODEL_WORKERS = {
         "port": 21009,
     },
     "gemini-api": {
-        "port": 21012,
+        "port": 21010,
     },
 }

-# fastchat multi model worker server
-FSCHAT_MULTI_MODEL_WORKERS = {
-    # TODO:
-}
-
-# fastchat controller server
 FSCHAT_CONTROLLER = {
     "host": DEFAULT_BIND_HOST,
     "port": 20001,
diff --git a/document_loaders/mypdfloader.py b/document_loaders/mypdfloader.py
index 5c480cff..faaf63dd 100644
--- a/document_loaders/mypdfloader.py
+++ b/document_loaders/mypdfloader.py
@@ -16,11 +16,8 @@ class RapidOCRPDFLoader(UnstructuredFileLoader):
             b_unit = tqdm.tqdm(total=doc.page_count, desc="RapidOCRPDFLoader context page index: 0")
             for i, page in enumerate(doc):
-                # 更新描述
                 b_unit.set_description("RapidOCRPDFLoader context page index: {}".format(i))
-                # 立即显示进度条更新结果
                 b_unit.refresh()
-                # TODO: 依据文本与图片顺序调整处理方式
                 text = page.get_text("")
                 resp += text + "\n"
diff --git a/img/qr_code_85.jpg b/img/qr_code_85.jpg
deleted file mode 100644
index d294f22f..00000000
Binary files a/img/qr_code_85.jpg and /dev/null differ
diff --git a/img/qr_code_87.jpg b/img/qr_code_87.jpg
new file mode 100644
index 00000000..ce8fc77a
Binary files /dev/null and b/img/qr_code_87.jpg differ
diff --git a/requirements.txt b/requirements.txt
index 3484ba4e..4307ab9d 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,79 +1,61 @@
-# API requirements
-
 torch~=2.1.2
 torchvision~=0.16.2
 torchaudio~=2.1.2
-xformers==0.0.23.post1
-transformers==4.36.2
-sentence_transformers==2.2.2
-
+xformers~=0.0.23.post1
+transformers~=4.36.2
+sentence_transformers~=2.2.2
 langchain==0.0.354
 langchain-experimental==0.0.47
 pydantic==1.10.13
-fschat==0.2.35
-openai~=1.7.1
-fastapi~=0.108.0
-sse_starlette==1.8.2
-nltk>=3.8.1
-uvicorn>=0.24.0.post1
+fschat~=0.2.35
+openai~=1.9.0
+fastapi~=0.109.0
+sse_starlette~=1.8.2
+nltk~=3.8.1
+uvicorn~=0.24.0.post1
 starlette~=0.32.0
-unstructured[all-docs]==0.11.0
-python-magic-bin; sys_platform == 'win32'
-SQLAlchemy==2.0.19
-faiss-cpu~=1.7.4 # `conda install faiss-gpu -c conda-forge` if you want to accelerate with gpus
+unstructured[all-docs]~=0.12.0
+python-magic-bin; sys_platform == 'win32'
+SQLAlchemy~=2.0.25
+faiss-cpu~=1.7.4
 accelerate~=0.24.1
 spacy~=3.7.2
-PyMuPDF~=1.23.8
-rapidocr_onnxruntime==1.3.8
+PyMuPDF~=1.23.16
+rapidocr_onnxruntime~=1.3.8
 requests~=2.31.0
 pathlib~=1.0.1
 pytest~=7.4.3
-numexpr~=2.8.6 # max version for py38
+numexpr~=2.8.6
 strsimpy~=0.2.1
 markdownify~=0.11.6
 tiktoken~=0.5.2
-tqdm>=4.66.1
-websockets>=12.0
+tqdm~=4.66.1
+websockets~=12.0
 numpy~=1.24.4
 pandas~=2.0.3
-einops>=0.7.0
-transformers_stream_generator==0.0.4
-vllm==0.2.7; sys_platform == "linux"
-
-# flash-attn>=2.4.3 # For Orion-14B-Chat and Qwen-14B-Chat
-
-
-# optional document loaders
-
-#rapidocr_paddle[gpu]>=1.3.0.post5 # gpu accelleration for ocr of pdf and image files
-jq==1.6.0 # for .json and .jsonl files. suggest `conda install jq` on windows
-beautifulsoup4~=4.12.2 # for .mhtml files
+einops~=0.7.0
+transformers_stream_generator~=0.0.4
+vllm~=0.2.7; sys_platform == "linux"
+jq~=1.6.0
+beautifulsoup4~=4.12.2
 pysrt~=1.1.2
-
-# Online api libs dependencies
-# zhipuAI sdk is not supported on our platform, so use http instead
-dashscope==1.13.6 # qwen
-# volcengine>=1.0.119 # fangzhou
-
+dashscope~=1.13.6 # qwen
+# volcengine~=1.0.119 # fangzhou
 # uncomment libs if you want to use corresponding vector store
-# pymilvus>=2.3.4
-# psycopg2==2.9.9
-# pgvector>=0.2.4
-
-# Agent and Search Tools
-
+# pymilvus~=2.3.4
+# psycopg2~=2.9.9
+# pgvector~=0.2.4
+# flash-attn~=2.4.3 # For Orion-14B-Chat and Qwen-14B-Chat
+#rapidocr_paddle[gpu]~=1.3.0.post5 # gpu acceleration for ocr of pdf and image files
 arxiv~=2.1.0
 youtube-search~=2.1.2
 duckduckgo-search~=3.9.9
 metaphor-python~=0.1.23
-
-# WebUI requirements
-
-streamlit==1.30.0
-streamlit-option-menu==0.3.6
-streamlit-antd-components==0.3.1
-streamlit-chatbox==1.1.11
-streamlit-modal==0.1.0
-streamlit-aggrid==0.3.4.post3
-httpx==0.26.0
-watchdog==3.0.0
\ No newline at end of file
+streamlit~=1.30.0
+streamlit-option-menu~=0.3.12
+streamlit-antd-components~=0.3.1
+streamlit-chatbox~=1.1.11
+streamlit-modal~=0.1.0
+streamlit-aggrid~=0.3.4.post3
+httpx~=0.26.0
+watchdog~=3.0.0
\ No newline at end of file
diff --git a/requirements_api.txt b/requirements_api.txt
index e126c8c5..81660dae 100644
--- a/requirements_api.txt
+++ b/requirements_api.txt
@@ -1,15 +1,14 @@
 torch~=2.1.2
 torchvision~=0.16.2
 torchaudio~=2.1.2
-xformers==0.0.23.post1
+xformers>=0.0.23.post1
 transformers==4.36.2
 sentence_transformers==2.2.2
-
 langchain==0.0.354
 langchain-experimental==0.0.47
 pydantic==1.10.13
 fschat==0.2.35
-openai~=1.7.1
+openai~=1.9.0
 fastapi~=0.108.0
 sse_starlette==1.8.2
 nltk>=3.8.1
@@ -18,7 +17,7 @@ starlette~=0.32.0
 unstructured[all-docs]==0.11.0
 python-magic-bin; sys_platform == 'win32'
 SQLAlchemy==2.0.19
-faiss-cpu~=1.7.4 # `conda install faiss-gpu -c conda-forge` if you want to accelerate with gpus
+faiss-cpu~=1.7.4
 accelerate~=0.24.1
 spacy~=3.7.2
 PyMuPDF~=1.23.8
@@ -26,7 +25,7 @@ rapidocr_onnxruntime==1.3.8
 requests~=2.31.0
 pathlib~=1.0.1
 pytest~=7.4.3
-numexpr~=2.8.6 # max version for py38
+numexpr~=2.8.6
 strsimpy~=0.2.1
 markdownify~=0.11.6
 tiktoken~=0.5.2
@@ -39,29 +38,18 @@ transformers_stream_generator==0.0.4
 vllm==0.2.7; sys_platform == "linux"
 httpx==0.26.0
 llama-index
-# flash-attn>=2.4.3 # For Orion-14B-Chat and Qwen-14B-Chat
-
-# optional document loaders
-
-# rapidocr_paddle[gpu]>=1.3.0.post5 # gpu accelleration for ocr of pdf and image files
-jq==1.6.0 # for .json and .jsonl files. suggest `conda install jq` on windows
-beautifulsoup4~=4.12.2 # for .mhtml files
+jq==1.6.0
+beautifulsoup4~=4.12.2
 pysrt~=1.1.2
-
-# Online api libs dependencies
-
-# zhipuAI sdk is not supported on our platform, so use http instead
-dashscope==1.13.6 # qwen
-# volcengine>=1.0.119 # fangzhou
-
-# uncomment libs if you want to use corresponding vector store
-# pymilvus>=2.3.4
-# psycopg2==2.9.9
-# pgvector>=0.2.4
-
-# Agent and Search Tools
-
+dashscope==1.13.6
 arxiv~=2.1.0
 youtube-search~=2.1.2
 duckduckgo-search~=3.9.9
-metaphor-python~=0.1.23
\ No newline at end of file
+metaphor-python~=0.1.23
+
+# volcengine>=1.0.119
+# pymilvus>=2.3.4
+# psycopg2==2.9.9
+# pgvector>=0.2.4
+# flash-attn>=2.4.3 # For Orion-14B-Chat and Qwen-14B-Chat
+# rapidocr_paddle[gpu]>=1.3.0.post5
\ No newline at end of file
diff --git a/requirements_lite.txt b/requirements_lite.txt
index 4274b649..5a0ace76 100644
--- a/requirements_lite.txt
+++ b/requirements_lite.txt
@@ -1,64 +1,32 @@
-# API requirements
-
 langchain==0.0.354
 langchain-experimental==0.0.47
 pydantic==1.10.13
-fschat==0.2.35
-openai~=1.7.1
-fastapi~=0.108.0
-sse_starlette==1.8.2
-nltk>=3.8.1
-uvicorn>=0.24.0.post1
+fschat~=0.2.35
+openai~=1.9.0
+fastapi~=0.109.0
+sse_starlette~=1.8.2
+nltk~=3.8.1
+uvicorn~=0.24.0.post1
 starlette~=0.32.0
-unstructured[all-docs]==0.11.0
-python-magic-bin; sys_platform == 'win32'
-SQLAlchemy==2.0.19
+unstructured[all-docs]~=0.12.0
+python-magic-bin; sys_platform == 'win32'
+SQLAlchemy~=2.0.25
 faiss-cpu~=1.7.4
+accelerate~=0.24.1
+spacy~=3.7.2
+PyMuPDF~=1.23.16
+rapidocr_onnxruntime~=1.3.8
 requests~=2.31.0
 pathlib~=1.0.1
 pytest~=7.4.3
-numexpr~=2.8.6 # max version for py38
-strsimpy~=0.2.1
-markdownify~=0.11.6
-tiktoken~=0.5.2
-tqdm>=4.66.1
-websockets>=12.0
-numpy~=1.24.4
-pandas~=2.0.3
-einops>=0.7.0
-transformers_stream_generator==0.0.4
-vllm==0.2.7; sys_platform == "linux"
-httpx[brotli,http2,socks]==0.25.2
-requests
-pathlib
-pytest
-
-
-# Online api libs dependencies
-
-# zhipuAI sdk is not supported on our platform, so use http instead
 dashscope==1.13.6
-# volcengine>=1.0.119
-
-# uncomment libs if you want to use corresponding vector store
-# pymilvus>=2.3.4
-# psycopg2==2.9.9
-# pgvector>=0.2.4
-
-# Agent and Search Tools
-
 arxiv~=2.1.0
 youtube-search~=2.1.2
 duckduckgo-search~=3.9.9
 metaphor-python~=0.1.23
-
-# WebUI requirements
-
-streamlit==1.30.0
-streamlit-option-menu==0.3.6
-streamlit-antd-components==0.3.1
-streamlit-chatbox==1.1.11
-streamlit-modal==0.1.0
-streamlit-aggrid==0.3.4.post3
-httpx==0.26.0
-watchdog==3.0.0
\ No newline at end of file
+watchdog~=3.0.0
+# volcengine>=1.0.119
+# pymilvus>=2.3.4
+# psycopg2==2.9.9
+# pgvector>=0.2.4
+# flash-attn>=2.4.3 # For Orion-14B-Chat and Qwen-14B-Chat
\ No newline at end of file
diff --git a/requirements_webui.txt b/requirements_webui.txt
index 111dedaa..9dae4620 100644
--- a/requirements_webui.txt
+++ b/requirements_webui.txt @@ -1,10 +1,8 @@ -# WebUI requirements - -streamlit==1.30.0 -streamlit-option-menu==0.3.6 -streamlit-antd-components==0.3.1 -streamlit-chatbox==1.1.11 -streamlit-modal==0.1.0 -streamlit-aggrid==0.3.4.post3 -httpx==0.26.0 -watchdog==3.0.0 \ No newline at end of file +streamlit~=1.30.0 +streamlit-option-menu~=0.3.12 +streamlit-antd-components~=0.3.1 +streamlit-chatbox~=1.1.11 +streamlit-modal~=0.1.0 +streamlit-aggrid~=0.3.4.post3 +httpx~=0.26.0 +watchdog~=3.0.0 \ No newline at end of file diff --git a/server/embeddings_api.py b/server/embeddings_api.py index 440bb774..e907de07 100644 --- a/server/embeddings_api.py +++ b/server/embeddings_api.py @@ -16,7 +16,6 @@ def embed_texts( ) -> BaseResponse: ''' 对文本进行向量化。返回数据格式:BaseResponse(data=List[List[float]]) - TODO: 也许需要加入缓存机制,减少 token 消耗 ''' try: if embed_model in list_embed_models(): # 使用本地Embeddings模型 diff --git a/server/knowledge_base/kb_api.py b/server/knowledge_base/kb_api.py index f50d8a73..0d2fbce9 100644 --- a/server/knowledge_base/kb_api.py +++ b/server/knowledge_base/kb_api.py @@ -13,9 +13,9 @@ def list_kbs(): def create_kb(knowledge_base_name: str = Body(..., examples=["samples"]), - vector_store_type: str = Body("faiss"), - embed_model: str = Body(EMBEDDING_MODEL), - ) -> BaseResponse: + vector_store_type: str = Body("faiss"), + embed_model: str = Body(EMBEDDING_MODEL), + ) -> BaseResponse: # Create selected knowledge base if not validate_kb_name(knowledge_base_name): return BaseResponse(code=403, msg="Don't attack me") @@ -39,8 +39,8 @@ def create_kb(knowledge_base_name: str = Body(..., examples=["samples"]), def delete_kb( - knowledge_base_name: str = Body(..., examples=["samples"]) - ) -> BaseResponse: + knowledge_base_name: str = Body(..., examples=["samples"]) +) -> BaseResponse: # Delete selected knowledge base if not validate_kb_name(knowledge_base_name): return BaseResponse(code=403, msg="Don't attack me") diff --git a/server/knowledge_base/kb_cache/faiss_cache.py b/server/knowledge_base/kb_cache/faiss_cache.py index ed48b5dd..60c550ee 100644 --- a/server/knowledge_base/kb_cache/faiss_cache.py +++ b/server/knowledge_base/kb_cache/faiss_cache.py @@ -55,8 +55,6 @@ class _FaissPool(CachePool): embed_model: str = EMBEDDING_MODEL, embed_device: str = embedding_device(), ) -> FAISS: - # TODO: 整个Embeddings加载逻辑有些混乱,待清理 - # create an empty vector store embeddings = EmbeddingsFunAdapter(embed_model) doc = Document(page_content="init", metadata={}) vector_store = FAISS.from_documents([doc], embeddings, normalize_L2=True,distance_strategy="METRIC_INNER_PRODUCT") diff --git a/server/knowledge_base/kb_doc_api.py b/server/knowledge_base/kb_doc_api.py index 09a264f9..e58ea41f 100644 --- a/server/knowledge_base/kb_doc_api.py +++ b/server/knowledge_base/kb_doc_api.py @@ -95,7 +95,6 @@ def _save_files_in_thread(files: List[UploadFile], and not override and os.path.getsize(file_path) == len(file_content) ): - # TODO: filesize 不同后的处理 file_status = f"文件 {filename} 已存在。" logger.warn(file_status) return dict(code=404, msg=file_status, data=data) @@ -116,7 +115,6 @@ def _save_files_in_thread(files: List[UploadFile], yield result -# TODO: 等langchain.document_loaders支持内存文件的时候再开通 # def files2docs(files: List[UploadFile] = File(..., description="上传文件,支持多文件"), # knowledge_base_name: str = Form(..., description="知识库名称", examples=["samples"]), # override: bool = Form(False, description="覆盖已有文件"), diff --git a/server/knowledge_base/kb_service/base.py b/server/knowledge_base/kb_service/base.py index 44c0d64e..bd5a54eb 100644 --- 
a/server/knowledge_base/kb_service/base.py +++ b/server/knowledge_base/kb_service/base.py @@ -191,7 +191,6 @@ class KBService(ABC): ''' 传入参数为: {doc_id: Document, ...} 如果对应 doc_id 的值为 None,或其 page_content 为空,则删除该文档 - TODO:是否要支持新增 docs ? ''' self.del_doc_by_ids(list(docs.keys())) docs = [] diff --git a/server/knowledge_base/kb_service/milvus_kb_service.py b/server/knowledge_base/kb_service/milvus_kb_service.py index 32382929..43b616e2 100644 --- a/server/knowledge_base/kb_service/milvus_kb_service.py +++ b/server/knowledge_base/kb_service/milvus_kb_service.py @@ -70,7 +70,6 @@ class MilvusKBService(KBService): return score_threshold_process(score_threshold, top_k, docs) def do_add_doc(self, docs: List[Document], **kwargs) -> List[Dict]: - # TODO: workaround for bug #10492 in langchain for doc in docs: for k, v in doc.metadata.items(): doc.metadata[k] = str(v) diff --git a/server/knowledge_base/kb_service/pg_kb_service.py b/server/knowledge_base/kb_service/pg_kb_service.py index ec0e147b..46efe7d8 100644 --- a/server/knowledge_base/kb_service/pg_kb_service.py +++ b/server/knowledge_base/kb_service/pg_kb_service.py @@ -32,8 +32,6 @@ class PGKBService(KBService): results = [Document(page_content=row[0], metadata=row[1]) for row in session.execute(stmt, {'ids': ids}).fetchall()] return results - - # TODO: def del_doc_by_ids(self, ids: List[str]) -> bool: return super().del_doc_by_ids(ids) diff --git a/server/knowledge_base/kb_summary/base.py b/server/knowledge_base/kb_summary/base.py index 00dcea6f..6d095fee 100644 --- a/server/knowledge_base/kb_summary/base.py +++ b/server/knowledge_base/kb_summary/base.py @@ -13,7 +13,6 @@ from server.db.repository.knowledge_metadata_repository import add_summary_to_db from langchain.docstore.document import Document -# TODO 暂不考虑文件更新,需要重新删除相关文档,再重新添加 class KBSummaryService(ABC): kb_name: str embed_model: str diff --git a/server/knowledge_base/kb_summary/summary_chunk.py b/server/knowledge_base/kb_summary/summary_chunk.py index 0b88f233..7c2aaf47 100644 --- a/server/knowledge_base/kb_summary/summary_chunk.py +++ b/server/knowledge_base/kb_summary/summary_chunk.py @@ -112,12 +112,6 @@ class SummaryAdapter: docs: List[DocumentWithVSId] = []) -> List[Document]: logger.info("start summary") - # TODO 暂不处理文档中涉及语义重复、上下文缺失、document was longer than the context length 的问题 - # merge_docs = self._drop_overlap(docs) - # # 将merge_docs中的句子合并成一个文档 - # text = self._join_docs(merge_docs) - # 根据段落于句子的分隔符,将文档分成chunk,每个chunk长度小于token_max长度 - """ 这个过程分成两个部分: 1. 
对每个文档进行处理,得到每个文档的摘要 diff --git a/server/knowledge_base/utils.py b/server/knowledge_base/utils.py index 6064477a..3a8c701b 100644 --- a/server/knowledge_base/utils.py +++ b/server/knowledge_base/utils.py @@ -174,7 +174,6 @@ def get_loader(loader_name: str, file_path: str, loader_kwargs: Dict = None): if encode_detect is None: encode_detect = {"encoding": "utf-8"} loader_kwargs["encoding"] = encode_detect["encoding"] - ## TODO:支持更多的自定义CSV读取逻辑 elif loader_name == "JSONLoader": loader_kwargs.setdefault("jq_schema", ".") diff --git a/server/model_workers/azure.py b/server/model_workers/azure.py index 70959325..f0835ae1 100644 --- a/server/model_workers/azure.py +++ b/server/model_workers/azure.py @@ -67,12 +67,10 @@ class AzureWorker(ApiModelWorker): self.logger.error(f"请求 Azure API 时发生错误:{resp}") def get_embeddings(self, params): - # TODO: 支持embeddings print("embedding") print(params) def make_conv_template(self, conv_template: str = None, model_path: str = None) -> Conversation: - # TODO: 确认模板是否需要修改 return conv.Conversation( name=self.model_names[0], system_message="You are a helpful, respectful and honest assistant.", diff --git a/server/model_workers/baichuan.py b/server/model_workers/baichuan.py index 5e9cbbb0..75cfad4e 100644 --- a/server/model_workers/baichuan.py +++ b/server/model_workers/baichuan.py @@ -88,12 +88,10 @@ class BaiChuanWorker(ApiModelWorker): yield data def get_embeddings(self, params): - # TODO: 支持embeddings print("embedding") print(params) def make_conv_template(self, conv_template: str = None, model_path: str = None) -> Conversation: - # TODO: 确认模板是否需要修改 return conv.Conversation( name=self.model_names[0], system_message="", diff --git a/server/model_workers/base.py b/server/model_workers/base.py index 88affb43..234ab47a 100644 --- a/server/model_workers/base.py +++ b/server/model_workers/base.py @@ -125,8 +125,6 @@ class ApiModelWorker(BaseModelWorker): def count_token(self, params): - # TODO:需要完善 - # print("count token") prompt = params["prompt"] return {"count": len(str(prompt)), "error_code": 0} diff --git a/server/model_workers/fangzhou.py b/server/model_workers/fangzhou.py index ddbad4ab..fdb50a1c 100644 --- a/server/model_workers/fangzhou.py +++ b/server/model_workers/fangzhou.py @@ -12,16 +12,16 @@ class FangZhouWorker(ApiModelWorker): """ def __init__( - self, - *, - model_names: List[str] = ["fangzhou-api"], - controller_addr: str = None, - worker_addr: str = None, - version: Literal["chatglm-6b-model"] = "chatglm-6b-model", - **kwargs, + self, + *, + model_names: List[str] = ["fangzhou-api"], + controller_addr: str = None, + worker_addr: str = None, + version: Literal["chatglm-6b-model"] = "chatglm-6b-model", + **kwargs, ): kwargs.update(model_names=model_names, controller_addr=controller_addr, worker_addr=worker_addr) - kwargs.setdefault("context_len", 16384) # TODO: 不同的模型有不同的大小 + kwargs.setdefault("context_len", 16384) super().__init__(**kwargs) self.version = version @@ -53,15 +53,15 @@ class FangZhouWorker(ApiModelWorker): if error := resp.error: if error.code_n > 0: data = { - "error_code": error.code_n, - "text": error.message, - "error": { - "message": error.message, - "type": "invalid_request_error", - "param": None, - "code": None, - } + "error_code": error.code_n, + "text": error.message, + "error": { + "message": error.message, + "type": "invalid_request_error", + "param": None, + "code": None, } + } self.logger.error(f"请求方舟 API 时发生错误:{data}") yield data elif chunk := resp.choice.message.content: @@ -77,7 +77,6 @@ class 
FangZhouWorker(ApiModelWorker): break def get_embeddings(self, params): - # TODO: 支持embeddings print("embedding") print(params) diff --git a/server/model_workers/gemini.py b/server/model_workers/gemini.py index 0cd8e159..e9175b6e 100644 --- a/server/model_workers/gemini.py +++ b/server/model_workers/gemini.py @@ -3,7 +3,7 @@ from fastchat.conversation import Conversation from server.model_workers.base import * from server.utils import get_httpx_client from fastchat import conversation as conv -import json,httpx +import json, httpx from typing import List, Dict from configs import logger, log_verbose @@ -14,14 +14,14 @@ class GeminiWorker(ApiModelWorker): *, controller_addr: str = None, worker_addr: str = None, - model_names: List[str] = ["Gemini-api"], + model_names: List[str] = ["gemini-api"], **kwargs, ): kwargs.update(model_names=model_names, controller_addr=controller_addr, worker_addr=worker_addr) kwargs.setdefault("context_len", 4096) super().__init__(**kwargs) - def create_gemini_messages(self,messages) -> json: + def create_gemini_messages(self, messages) -> json: has_history = any(msg['role'] == 'assistant' for msg in messages) gemini_msg = [] @@ -42,11 +42,11 @@ class GeminiWorker(ApiModelWorker): msg = dict(contents=gemini_msg) return msg - + def do_chat(self, params: ApiChatParams) -> Dict: params.load_config(self.model_names[0]) data = self.create_gemini_messages(messages=params.messages) - generationConfig=dict( + generationConfig = dict( temperature=params.temperature, topK=1, topP=1, @@ -54,8 +54,8 @@ class GeminiWorker(ApiModelWorker): stopSequences=[] ) - data['generationConfig'] = generationConfig - url = "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent"+ '?key=' + params.api_key + data['generationConfig'] = generationConfig + url = "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent" + '?key=' + params.api_key headers = { 'Content-Type': 'application/json', } @@ -67,7 +67,7 @@ class GeminiWorker(ApiModelWorker): text = "" json_string = "" timeout = httpx.Timeout(60.0) - client=get_httpx_client(timeout=timeout) + client = get_httpx_client(timeout=timeout) with client.stream("POST", url, headers=headers, json=data) as response: for line in response.iter_lines(): line = line.strip() @@ -89,13 +89,12 @@ class GeminiWorker(ApiModelWorker): "error_code": 0, "text": text } - print(text) + print(text) except json.JSONDecodeError as e: print("Failed to decode JSON:", e) print("Invalid JSON string:", json_string) def get_embeddings(self, params): - # TODO: 支持embeddings print("embedding") print(params) diff --git a/server/model_workers/minimax.py b/server/model_workers/minimax.py index ba610d52..79d24514 100644 --- a/server/model_workers/minimax.py +++ b/server/model_workers/minimax.py @@ -37,7 +37,6 @@ class MiniMaxWorker(ApiModelWorker): def do_chat(self, params: ApiChatParams) -> Dict: # 按照官网推荐,直接调用abab 5.5模型 - # TODO: 支持指定回复要求,支持指定用户名称、AI名称 params.load_config(self.model_names[0]) url = 'https://api.minimax.chat/v1/text/chatcompletion{pro}?GroupId={group_id}' @@ -55,7 +54,7 @@ class MiniMaxWorker(ApiModelWorker): "temperature": params.temperature, "top_p": params.top_p, "tokens_to_generate": params.max_tokens or 1024, - # TODO: 以下参数为minimax特有,传入空值会出错。 + # 以下参数为minimax特有,传入空值会出错。 # "prompt": params.system_message or self.conv.system_message, # "bot_setting": [], # "role_meta": params.role_meta, @@ -143,12 +142,10 @@ class MiniMaxWorker(ApiModelWorker): return {"code": 200, "data": result} def get_embeddings(self, 
params): - # TODO: 支持embeddings print("embedding") print(params) def make_conv_template(self, conv_template: str = None, model_path: str = None) -> Conversation: - # TODO: 确认模板是否需要修改 return conv.Conversation( name=self.model_names[0], system_message="你是MiniMax自主研发的大型语言模型,回答问题简洁有条理。", diff --git a/server/model_workers/qianfan.py b/server/model_workers/qianfan.py index 7dd3a355..da362ec6 100644 --- a/server/model_workers/qianfan.py +++ b/server/model_workers/qianfan.py @@ -187,14 +187,11 @@ class QianFanWorker(ApiModelWorker): i += batch_size return {"code": 200, "data": result} - # TODO: qianfan支持续写模型 def get_embeddings(self, params): - # TODO: 支持embeddings print("embedding") print(params) def make_conv_template(self, conv_template: str = None, model_path: str = None) -> Conversation: - # TODO: 确认模板是否需要修改 return conv.Conversation( name=self.model_names[0], system_message="你是一个聪明的助手,请根据用户的提示来完成任务", diff --git a/server/model_workers/qwen.py b/server/model_workers/qwen.py index 58d1bcd1..2741b74d 100644 --- a/server/model_workers/qwen.py +++ b/server/model_workers/qwen.py @@ -100,12 +100,10 @@ class QwenWorker(ApiModelWorker): return {"code": 200, "data": result} def get_embeddings(self, params): - # TODO: 支持embeddings print("embedding") print(params) def make_conv_template(self, conv_template: str = None, model_path: str = None) -> Conversation: - # TODO: 确认模板是否需要修改 return conv.Conversation( name=self.model_names[0], system_message="你是一个聪明、对人类有帮助的人工智能,你可以对人类提出的问题给出有用、详细、礼貌的回答。", diff --git a/server/model_workers/tiangong.py b/server/model_workers/tiangong.py index e127ea55..88010a15 100644 --- a/server/model_workers/tiangong.py +++ b/server/model_workers/tiangong.py @@ -70,12 +70,10 @@ class TianGongWorker(ApiModelWorker): yield data def get_embeddings(self, params): - # TODO: 支持embeddings print("embedding") print(params) def make_conv_template(self, conv_template: str = None, model_path: str = None) -> Conversation: - # TODO: 确认模板是否需要修改 return conv.Conversation( name=self.model_names[0], system_message="", diff --git a/server/model_workers/xinghuo.py b/server/model_workers/xinghuo.py index 1e772a33..de38308b 100644 --- a/server/model_workers/xinghuo.py +++ b/server/model_workers/xinghuo.py @@ -42,7 +42,6 @@ class XingHuoWorker(ApiModelWorker): self.version = version def do_chat(self, params: ApiChatParams) -> Dict: - # TODO: 当前每次对话都要重新连接websocket,确认是否可以保持连接 params.load_config(self.model_names[0]) version_mapping = { @@ -73,12 +72,10 @@ class XingHuoWorker(ApiModelWorker): yield {"error_code": 0, "text": text} def get_embeddings(self, params): - # TODO: 支持embeddings print("embedding") print(params) def make_conv_template(self, conv_template: str = None, model_path: str = None) -> Conversation: - # TODO: 确认模板是否需要修改 return conv.Conversation( name=self.model_names[0], system_message="你是一个聪明的助手,请根据用户的提示来完成任务", diff --git a/server/utils.py b/server/utils.py index 26ef967e..7fed5f8c 100644 --- a/server/utils.py +++ b/server/utils.py @@ -36,7 +36,6 @@ async def wrap_done(fn: Awaitable, event: asyncio.Event): await fn except Exception as e: logging.exception(e) - # TODO: handle exception msg = f"Caught exception: {e}" logger.error(f'{e.__class__.__name__}: {msg}', exc_info=e if log_verbose else None) @@ -404,7 +403,7 @@ def fschat_controller_address() -> str: def fschat_model_worker_address(model_name: str = LLM_MODELS[0]) -> str: - if model := get_model_worker_config(model_name): # TODO: depends fastchat + if model := get_model_worker_config(model_name): host = model["host"] if host == "0.0.0.0": 
host = "127.0.0.1" @@ -449,7 +448,7 @@ def get_prompt_template(type: str, name: str) -> Optional[str]: from configs import prompt_config import importlib - importlib.reload(prompt_config) # TODO: 检查configs/prompt_config.py文件有修改再重新加载 + importlib.reload(prompt_config) return prompt_config.PROMPT_TEMPLATES[type].get(name) @@ -550,7 +549,7 @@ def run_in_thread_pool( thread = pool.submit(func, **kwargs) tasks.append(thread) - for obj in as_completed(tasks): # TODO: Ctrl+c无法停止 + for obj in as_completed(tasks): yield obj.result() diff --git a/startup.py b/startup.py index 0681dcda..359fb709 100644 --- a/startup.py +++ b/startup.py @@ -418,7 +418,7 @@ def run_openai_api(log_level: str = "INFO", started_event: mp.Event = None): set_httpx_config() controller_addr = fschat_controller_address() - app = create_openai_api_app(controller_addr, log_level=log_level) # TODO: not support keys yet. + app = create_openai_api_app(controller_addr, log_level=log_level) _set_app_event(app, started_event) host = FSCHAT_OPENAI_API["host"] diff --git a/webui_pages/dialogue/dialogue.py b/webui_pages/dialogue/dialogue.py index 325cd5d1..b9d2f7fd 100644 --- a/webui_pages/dialogue/dialogue.py +++ b/webui_pages/dialogue/dialogue.py @@ -12,7 +12,6 @@ from server.knowledge_base.utils import LOADER_DICT import uuid from typing import List, Dict - chat_box = ChatBox( assistant_avatar=os.path.join( "img", @@ -127,7 +126,6 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): chat_box.use_chat_name(conversation_name) conversation_id = st.session_state["conversation_ids"][conversation_name] - # TODO: 对话模型与会话绑定 def on_mode_change(): mode = st.session_state.dialogue_mode text = f"已切换到 {mode} 模式。" @@ -138,11 +136,11 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): st.toast(text) dialogue_modes = ["LLM 对话", - "知识库问答", - "文件对话", - "搜索引擎问答", - "自定义Agent问答", - ] + "知识库问答", + "文件对话", + "搜索引擎问答", + "自定义Agent问答", + ] dialogue_mode = st.selectbox("请选择对话模式:", dialogue_modes, index=0, @@ -166,9 +164,9 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): available_models = [] config_models = api.list_config_models() if not is_lite: - for k, v in config_models.get("local", {}).items(): # 列出配置了有效本地路径的模型 + for k, v in config_models.get("local", {}).items(): # 列出配置了有效本地路径的模型 if (v.get("model_path_exists") - and k not in running_models): + and k not in running_models): available_models.append(k) for k, v in config_models.get("online", {}).items(): # 列出ONLINE_MODELS中直接访问的模型 if not v.get("provider") and k not in running_models: @@ -250,14 +248,14 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): elif dialogue_mode == "文件对话": with st.expander("文件对话配置", True): files = st.file_uploader("上传知识文件:", - [i for ls in LOADER_DICT.values() for i in ls], - accept_multiple_files=True, - ) + [i for ls in LOADER_DICT.values() for i in ls], + accept_multiple_files=True, + ) kb_top_k = st.number_input("匹配知识条数:", 1, 20, VECTOR_SEARCH_TOP_K) ## Bge 模型会超过1 score_threshold = st.slider("知识匹配分数阈值:", 0.0, 2.0, float(SCORE_THRESHOLD), 0.01) - if st.button("开始上传", disabled=len(files)==0): + if st.button("开始上传", disabled=len(files) == 0): st.session_state["file_chat_id"] = upload_temp_docs(files, api) elif dialogue_mode == "搜索引擎问答": search_engine_list = api.list_search_engines() @@ -279,9 +277,9 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): chat_input_placeholder = "请输入对话内容,换行请使用Shift+Enter。输入/help查看自定义命令 " def on_feedback( - feedback, - message_id: str = "", - history_index: int = -1, + feedback, + message_id: str = 
"", + history_index: int = -1, ): reason = feedback["text"] score_int = chat_box.set_feedback(feedback=feedback, history_index=history_index) @@ -296,7 +294,7 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): } if prompt := st.chat_input(chat_input_placeholder, key="prompt"): - if parse_command(text=prompt, modal=modal): # 用户输入自定义命令 + if parse_command(text=prompt, modal=modal): # 用户输入自定义命令 st.rerun() else: history = get_messages_history(history_len) @@ -306,11 +304,11 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): text = "" message_id = "" r = api.chat_chat(prompt, - history=history, - conversation_id=conversation_id, - model=llm_model, - prompt_name=prompt_template_name, - temperature=temperature) + history=history, + conversation_id=conversation_id, + model=llm_model, + prompt_name=prompt_template_name, + temperature=temperature) for t in r: if error_msg := check_error_msg(t): # check whether error occured st.error(error_msg) @@ -321,12 +319,12 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): metadata = { "message_id": message_id, - } + } chat_box.update_msg(text, streaming=False, metadata=metadata) # 更新最终的字符串,去除光标 chat_box.show_feedback(**feedback_kwargs, - key=message_id, - on_submit=on_feedback, - kwargs={"message_id": message_id, "history_index": len(chat_box.history) - 1}) + key=message_id, + on_submit=on_feedback, + kwargs={"message_id": message_id, "history_index": len(chat_box.history) - 1}) elif dialogue_mode == "自定义Agent问答": if not any(agent in llm_model for agent in SUPPORT_AGENT_MODEL): @@ -373,13 +371,13 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): ]) text = "" for d in api.knowledge_base_chat(prompt, - knowledge_base_name=selected_kb, - top_k=kb_top_k, - score_threshold=score_threshold, - history=history, - model=llm_model, - prompt_name=prompt_template_name, - temperature=temperature): + knowledge_base_name=selected_kb, + top_k=kb_top_k, + score_threshold=score_threshold, + history=history, + model=llm_model, + prompt_name=prompt_template_name, + temperature=temperature): if error_msg := check_error_msg(d): # check whether error occured st.error(error_msg) elif chunk := d.get("answer"): @@ -397,13 +395,13 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): ]) text = "" for d in api.file_chat(prompt, - knowledge_id=st.session_state["file_chat_id"], - top_k=kb_top_k, - score_threshold=score_threshold, - history=history, - model=llm_model, - prompt_name=prompt_template_name, - temperature=temperature): + knowledge_id=st.session_state["file_chat_id"], + top_k=kb_top_k, + score_threshold=score_threshold, + history=history, + model=llm_model, + prompt_name=prompt_template_name, + temperature=temperature): if error_msg := check_error_msg(d): # check whether error occured st.error(error_msg) elif chunk := d.get("answer"): @@ -455,4 +453,4 @@ def dialogue_page(api: ApiRequest, is_lite: bool = False): file_name=f"{now:%Y-%m-%d %H.%M}_对话记录.md", mime="text/markdown", use_container_width=True, - ) \ No newline at end of file + ) diff --git a/webui_pages/knowledge_base/knowledge_base.py b/webui_pages/knowledge_base/knowledge_base.py index 31fc1512..76a97377 100644 --- a/webui_pages/knowledge_base/knowledge_base.py +++ b/webui_pages/knowledge_base/knowledge_base.py @@ -7,15 +7,12 @@ from server.knowledge_base.utils import get_file_path, LOADER_DICT from server.knowledge_base.kb_service.base import get_kb_details, get_kb_file_details from typing import Literal, Dict, Tuple from configs import (kbs_config, - 
EMBEDDING_MODEL, DEFAULT_VS_TYPE, - CHUNK_SIZE, OVERLAP_SIZE, ZH_TITLE_ENHANCE) + EMBEDDING_MODEL, DEFAULT_VS_TYPE, + CHUNK_SIZE, OVERLAP_SIZE, ZH_TITLE_ENHANCE) from server.utils import list_embed_models, list_online_embed_models import os import time - -# SENTENCE_SIZE = 100 - cell_renderer = JsCode("""function(params) {if(params.value==true){return '✓'}else{return '×'}}""") @@ -32,7 +29,7 @@ def config_aggrid( gb.configure_selection( selection_mode=selection_mode, use_checkbox=use_checkbox, - # pre_selected_rows=st.session_state.get("selected_rows", [0]), + pre_selected_rows=st.session_state.get("selected_rows", [0]), ) gb.configure_pagination( enabled=True, @@ -59,7 +56,8 @@ def knowledge_base_page(api: ApiRequest, is_lite: bool = None): try: kb_list = {x["kb_name"]: x for x in get_kb_details()} except Exception as e: - st.error("获取知识库信息错误,请检查是否已按照 `README.md` 中 `4 知识库初始化与迁移` 步骤完成初始化或迁移,或是否为数据库连接错误。") + st.error( + "获取知识库信息错误,请检查是否已按照 `README.md` 中 `4 知识库初始化与迁移` 步骤完成初始化或迁移,或是否为数据库连接错误。") st.stop() kb_names = list(kb_list.keys()) @@ -150,7 +148,8 @@ def knowledge_base_page(api: ApiRequest, is_lite: bool = None): [i for ls in LOADER_DICT.values() for i in ls], accept_multiple_files=True, ) - kb_info = st.text_area("请输入知识库介绍:", value=st.session_state["selected_kb_info"], max_chars=None, key=None, + kb_info = st.text_area("请输入知识库介绍:", value=st.session_state["selected_kb_info"], max_chars=None, + key=None, help=None, on_change=None, args=None, kwargs=None) if kb_info != st.session_state["selected_kb_info"]: @@ -200,8 +199,8 @@ def knowledge_base_page(api: ApiRequest, is_lite: bool = None): doc_details = doc_details[[ "No", "file_name", "document_loader", "text_splitter", "docs_count", "in_folder", "in_db", ]] - # doc_details["in_folder"] = doc_details["in_folder"].replace(True, "✓").replace(False, "×") - # doc_details["in_db"] = doc_details["in_db"].replace(True, "✓").replace(False, "×") + doc_details["in_folder"] = doc_details["in_folder"].replace(True, "✓").replace(False, "×") + doc_details["in_db"] = doc_details["in_db"].replace(True, "✓").replace(False, "×") gb = config_aggrid( doc_details, { @@ -252,7 +251,8 @@ def knowledge_base_page(api: ApiRequest, is_lite: bool = None): st.write() # 将文件分词并加载到向量库中 if cols[1].button( - "重新添加至向量库" if selected_rows and (pd.DataFrame(selected_rows)["in_db"]).any() else "添加至向量库", + "重新添加至向量库" if selected_rows and ( + pd.DataFrame(selected_rows)["in_db"]).any() else "添加至向量库", disabled=not file_exists(kb, selected_rows)[0], use_container_width=True, ): @@ -285,39 +285,39 @@ def knowledge_base_page(api: ApiRequest, is_lite: bool = None): st.divider() - # cols = st.columns(3) + cols = st.columns(3) - # if cols[0].button( - # "依据源文件重建向量库", - # # help="无需上传文件,通过其它方式将文档拷贝到对应知识库content目录下,点击本按钮即可重建知识库。", - # use_container_width=True, - # type="primary", - # ): - # with st.spinner("向量库重构中,请耐心等待,勿刷新或关闭页面。"): - # empty = st.empty() - # empty.progress(0.0, "") - # for d in api.recreate_vector_store(kb, - # chunk_size=chunk_size, - # chunk_overlap=chunk_overlap, - # zh_title_enhance=zh_title_enhance): - # if msg := check_error_msg(d): - # st.toast(msg) - # else: - # empty.progress(d["finished"] / d["total"], d["msg"]) - # st.rerun() + if cols[0].button( + "依据源文件重建向量库", + help="无需上传文件,通过其它方式将文档拷贝到对应知识库content目录下,点击本按钮即可重建知识库。", + use_container_width=True, + type="primary", + ): + with st.spinner("向量库重构中,请耐心等待,勿刷新或关闭页面。"): + empty = st.empty() + empty.progress(0.0, "") + for d in api.recreate_vector_store(kb, + chunk_size=chunk_size, + chunk_overlap=chunk_overlap, + 
zh_title_enhance=zh_title_enhance): + if msg := check_error_msg(d): + st.toast(msg) + else: + empty.progress(d["finished"] / d["total"], d["msg"]) + st.rerun() - # if cols[2].button( - # "删除知识库", - # use_container_width=True, - # ): - # ret = api.delete_knowledge_base(kb) - # st.toast(ret.get("msg", " ")) - # time.sleep(1) - # st.rerun() + if cols[2].button( + "删除知识库", + use_container_width=True, + ): + ret = api.delete_knowledge_base(kb) + st.toast(ret.get("msg", " ")) + time.sleep(1) + st.rerun() - # with st.sidebar: - # keyword = st.text_input("查询关键字") - # top_k = st.slider("匹配条数", 1, 100, 3) + with st.sidebar: + keyword = st.text_input("查询关键字") + top_k = st.slider("匹配条数", 1, 100, 3) st.write("文件内文档列表。双击进行修改,在删除列填入 Y 可删除对应行。") docs = [] @@ -325,11 +325,12 @@ def knowledge_base_page(api: ApiRequest, is_lite: bool = None): if selected_rows: file_name = selected_rows[0]["file_name"] docs = api.search_kb_docs(knowledge_base_name=selected_kb, file_name=file_name) - data = [{"seq": i+1, "id": x["id"], "page_content": x["page_content"], "source": x["metadata"].get("source"), - "type": x["type"], - "metadata": json.dumps(x["metadata"], ensure_ascii=False), - "to_del": "", - } for i, x in enumerate(docs)] + data = [ + {"seq": i + 1, "id": x["id"], "page_content": x["page_content"], "source": x["metadata"].get("source"), + "type": x["type"], + "metadata": json.dumps(x["metadata"], ensure_ascii=False), + "to_del": "", + } for i, x in enumerate(docs)] df = pd.DataFrame(data) gb = GridOptionsBuilder.from_dataframe(df) @@ -343,22 +344,24 @@ def knowledge_base_page(api: ApiRequest, is_lite: bool = None): edit_docs = AgGrid(df, gb.build()) if st.button("保存更改"): - # origin_docs = {x["id"]: {"page_content": x["page_content"], "type": x["type"], "metadata": x["metadata"]} for x in docs} + origin_docs = { + x["id"]: {"page_content": x["page_content"], "type": x["type"], "metadata": x["metadata"]} for x in + docs} changed_docs = [] for index, row in edit_docs.data.iterrows(): - # origin_doc = origin_docs[row["id"]] - # if row["page_content"] != origin_doc["page_content"]: - if row["to_del"] not in ["Y", "y", 1]: - changed_docs.append({ - "page_content": row["page_content"], - "type": row["type"], - "metadata": json.loads(row["metadata"]), - }) + origin_doc = origin_docs[row["id"]] + if row["page_content"] != origin_doc["page_content"]: + if row["to_del"] not in ["Y", "y", 1]: + changed_docs.append({ + "page_content": row["page_content"], + "type": row["type"], + "metadata": json.loads(row["metadata"]), + }) if changed_docs: if api.update_kb_docs(knowledge_base_name=selected_kb, - file_names=[file_name], - docs={file_name: changed_docs}): + file_names=[file_name], + docs={file_name: changed_docs}): st.toast("更新文档成功") else: st.toast("更新文档失败")