mirror of https://github.com/RYDE-WORK/Langchain-Chatchat.git synced 2026-01-21 06:19:29 +08:00

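1. Add support for llama-cpp models; 2. Add support for bloom/chatyuan/baichuan models; 3. Fix a multi-GPU deployment bug; 4. Fix a bug in moss_llm.py; 5. Add OpenAI support (no API available, untested); 6. Support choosing a custom GPU device in multi-GPU setups (#664)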

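* Fix a typo in bing_search.py; update how to apply for the Bing Subscription Key, and the related notes, in model_config.py

* Update the FAQ with the cause of and fix for [Errno 110] Connection timed out

* Revise the explanation in loader.py of why load_in_8bit fails, with a detailed fix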

* update loader.py

* stream_chat_bing

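* Change the stream_chat interface so that knowledge_base_id is selected in the request body; add the stream_chat_bing interface

* Improve the logic of cli_demo.py: support an input prompt, multiple inputs, and re-entering input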

* update cli_demo.py

* add bloom-3b,bloom-7b1,ggml-vicuna-13b-1.1

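* 1. Add support for llama-cpp models; 2. Add support for bloom models; 3. Fix a multi-GPU deployment bug; 4. Add OpenAI support (no API available, untested); 5. Add instructions for deploying llama-cpp models

* Notes on llama model compatibility

* modified:   ../configs/model_config.py
	modified:   ../docs/INSTALL.md
Add instructions for calling llama-cpp models to install.md

* Adapt llama_llm.py to llama-cpp models

* Finish support for llama-cpp models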

* make fastchat and openapi compatible

* 1. Fix/add support for chatyuan, bloom, baichuan-7 and other models; 2. Fix a bug in moss_llm.py

* set default model be chatglm-6b

* Also support a custom GPU device in multi-GPU setups

---------

Co-authored-by: imClumsyPanda <littlepanda0716@gmail.com>

2023-07-11 19:36:50 +08:00

2.3 KiB


Installation

Environment check

# First, make sure Python 3.8 or later is installed on your machine
$ python --version
Python 3.8.13

# If the version is lower, create an environment with conda
$ conda create -p /your_path/env_name python=3.8

# Activate the environment
$ conda activate /your_path/env_name
$ pip3 install --upgrade pip

# Deactivate the environment (takes no path argument)
$ conda deactivate

# Remove the environment
$ conda env remove -p /your_path/env_name
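
The version check above can also be scripted, which helps when the install runs unattended. The following is a minimal sketch, not part of the project: the function name `python_is_supported` and the constant `MINIMUM_PYTHON` are hypothetical, and the only fact taken from this document is the 3.8 minimum.

```python
import sys

# Minimum version taken from the requirement stated above.
MINIMUM_PYTHON = (3, 8)


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True when the interpreter version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor) so patch releases such as 3.8.13 pass.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {current} is new enough")
    else:
        print(f"Python {current} is too old; create a conda environment instead")
```

Passing an explicit tuple, for example `python_is_supported((3, 7, 9))`, checks a version other than the running interpreter.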

Project dependencies

# Clone the repository
$ git clone https://github.com/imClumsyPanda/langchain-ChatGLM.git

# Enter the directory
$ cd langchain-ChatGLM

# PDF loading in this project has switched from detectron2 to paddleocr; if detectron2 was installed earlier, uninstall it first to avoid a tools conflict
$ pip uninstall detectron2

# Check paddleocr's dependencies; on Linux, paddleocr depends on libX11 and libXext
$ yum install libX11
$ yum install libXext

# Install the dependencies
$ pip install -r requirements.txt

# Verify that paddleocr works; the first run downloads a model of about 18 MB to ~/.paddleocr
$ python loader/image_loader.py

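Note: When using langchain.document_loaders.UnstructuredFileLoader to load unstructured files, you may need to install additional dependencies depending on the documents; see the langchain documentation.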
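
Before running the loader, it can be useful to confirm that detectron2 is really gone and paddleocr is importable. The sketch below uses only the standard library; the helper names `installed_modules` and `dependency_report` are hypothetical, and it assumes the import names match the package names `detectron2` and `paddleocr`.

```python
import importlib.util


def installed_modules(names):
    """Map each top-level module name to True if it can be found here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def dependency_report(status):
    """Turn the status map into hints for the install steps above."""
    hints = []
    if status.get("detectron2"):
        hints.append("detectron2 is still installed; run `pip uninstall detectron2`")
    if not status.get("paddleocr"):
        hints.append("paddleocr is missing; run `pip install -r requirements.txt`")
    return hints


if __name__ == "__main__":
    status = installed_modules(["detectron2", "paddleocr"])
    for hint in dependency_report(status) or ["no conflicts found"]:
        print(hint)
```

The check only looks for importable modules; it does not verify the system libraries libX11 and libXext.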

Using llama-cpp models

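First, download the model from the Hugging Face Hub, for example ggml-vic13b-q5_1.bin from https://huggingface.co/vicuna/ggml-vicuna-13b-1.1/. Using snapshot_download from the huggingface_hub library is recommended.
Rename the downloaded model. Models downloaded through huggingface_hub are renamed to a random sequence, so rename the file back to its original name, for example ggml-vic13b-q5_1.bin.
Infer the matching llama-cpp version from the timestamp of the downloaded ggml model, download the wheel file for the corresponding llama-cpp-python release, and install the wheel manually. In testing, ggml-vic13b-q5_1.bin was compatible with the llama-cpp-python library.
Add the downloaded model's information to llm_model_dict in configs/model_config.py. Make sure the parameters are compatible; some parameter combinations may raise errors.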
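
The download, rename, and configuration steps above might be sketched as follows. This is an illustration under stated assumptions, not the project's own code: the helper names are hypothetical, and the keys returned by `build_model_entry` are guesses, so copy the key names from an existing entry in llm_model_dict. Only the repository ID and file name come from this document. `snapshot_download` is imported lazily so the rest of the sketch runs without huggingface_hub installed.

```python
from pathlib import Path

REPO_ID = "vicuna/ggml-vicuna-13b-1.1"  # from the URL given above
MODEL_FILE = "ggml-vic13b-q5_1.bin"     # file name given above


def download_snapshot(repo_id=REPO_ID, filename=MODEL_FILE):
    """Download only `filename` from the repo; returns the snapshot folder."""
    # Lazy import: requires `pip install huggingface_hub` and network access.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, allow_patterns=[filename])


def restore_model_filename(downloaded_path, original_name=MODEL_FILE):
    """Rename a downloaded file back to its original name, in the same folder."""
    source = Path(downloaded_path)
    target = source.with_name(original_name)
    if source != target:
        source.rename(target)
    return target


def build_model_entry(model_path):
    """Hypothetical llm_model_dict entry; the key names are assumptions."""
    return {
        "name": Path(model_path).stem,
        "local_model_path": str(model_path),
    }
```

A typical flow would call `download_snapshot()`, pass the path of the randomly named file to `restore_model_filename`, and merge the result of `build_model_entry` into llm_model_dict by hand. Installing the matching llama-cpp-python wheel remains a manual step.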
