Ask questions to your documents without an internet connection, using the power of LLMs. 100% private: no data leaves your execution environment at any point. You can ingest documents and ask questions without an internet connection!
Built with LangChain, GPT4All, LlamaCpp, Chroma and SentenceTransformers.
In order to set your environment up to run the code here, first install all requirements:
pip3 install -r requirements.txt
Then, download the LLM model and place it in a directory of your choice.
Rename example.env to .env and edit the variables appropriately.
MODEL_TYPE: supports LlamaCpp or GPT4All
PERSIST_DIRECTORY: is the folder you want your vectorstore in
MODEL_PATH: Path to your GPT4All or LlamaCpp supported LLM
MODEL_N_CTX: Maximum token limit for the LLM model
EMBEDDINGS_MODEL_NAME: SentenceTransformers embeddings model name (see https://www.sbert.net/docs/pretrained_models.html)
TARGET_SOURCE_CHUNKS: The amount of chunks (sources) that will be used to answer a question
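Taken together, a filled-in .env might look like the following sketch. The model path and all values here are placeholders, not project defaults — substitute your own model file and settings (all-MiniLM-L6-v2 is one of the SentenceTransformers models listed at the link above):

```shell
PERSIST_DIRECTORY=db
MODEL_TYPE=GPT4All
MODEL_PATH=models/your-gpt4all-model.bin
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
MODEL_N_CTX=1000
TARGET_SOURCE_CHUNKS=4
```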
Note: because of the way langchain loads the SentenceTransformers embeddings, the first time you run the script it will require an internet connection to download the embeddings model itself.
This repository uses a State of the Union transcript as an example.
Put any and all of your files into the source_documents directory.
The supported extensions are:
.csv: CSV,
.docx: Word Document,
.doc: Word Document,
.enex: EverNote,
.eml: Email,
.epub: EPub,
.html: HTML File,
.md: Markdown,
.msg: Outlook Message,
.odt: Open Document Text,
.pptx: PowerPoint Document,
.ppt: PowerPoint Document,
.txt: Text file (UTF-8)
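Internally, the ingest script dispatches each file to a document loader based on its extension. A simplified, dependency-free sketch of that dispatch follows — the loader names are illustrative placeholders, not the exact LangChain classes:

```python
import os

# Illustrative extension-to-loader table; the real script maps each
# extension to a concrete LangChain loader class instead of a string.
LOADER_MAPPING = {
    ".csv": "CSVLoader",
    ".doc": "WordDocumentLoader",
    ".docx": "WordDocumentLoader",
    ".enex": "EverNoteLoader",
    ".eml": "EmailLoader",
    ".epub": "EPubLoader",
    ".html": "HTMLLoader",
    ".md": "MarkdownLoader",
    ".msg": "OutlookMessageLoader",
    ".odt": "ODTLoader",
    ".ppt": "PowerPointLoader",
    ".pptx": "PowerPointLoader",
    ".txt": "TextLoader",
}


def pick_loader(path: str) -> str:
    """Return the loader name for a file, or raise if the type is unsupported."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return LOADER_MAPPING[ext]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {ext!r}")
```

Unsupported extensions are rejected early, so a stray file in source_documents fails loudly instead of being silently skipped or mis-parsed.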
Run the following command to ingest all the data.
python ingest.py
The output should look like this:
Creating new vectorstore
Loading documents from source_documents
Loading new documents: 100%|██████████████████████| 1/1 [00:01<00:00, 1.73s/it]
Loaded 1 new documents from source_documents
Split into 90 chunks of text (max. 500 tokens each)
Creating embeddings. May take some minutes...
Using embedded DuckDB with persistence: data will be stored in: db
Ingestion complete! You can now run privateGPT.py to query your documents
It will create a db folder containing the local vectorstore. Ingestion will take 20-30 seconds per document, depending on the size of the document. You can ingest as many documents as you want, and all will be accumulated in the local embeddings database. If you want to start from an empty database, delete the db folder.
Note: during the ingest process no data leaves your local environment. You could ingest without an internet connection, except for the first time you run the ingest script, when the embeddings model is downloaded.
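The "max. 500 tokens" chunking reported in the ingest output above can be approximated with a toy, character-based splitter. The real script uses a LangChain text splitter; the 500/50 size and overlap values here are illustrative:

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Greedy fixed-size splitter with overlap -- a toy stand-in for the
    text splitter the real ingest script uses. Overlap lets a sentence cut
    at a chunk boundary still appear whole in the next chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```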
In order to ask a question, run a command like:
python privateGPT.py
And wait for the script to require your input.
> Enter a query:
Hit enter. You'll need to wait 20-30 seconds (depending on your machine) while the LLM model consumes the prompt and prepares the answer. Once done, it will print the answer and the 4 sources it used as context from your documents; you can then ask another question without re-running the script, just wait for the prompt again.
Note: you could turn off your internet connection, and the script inference would still work. No data gets out of your local environment.
Type exit to finish the script.
CLI
The script also supports optional command-line arguments to modify its behavior. You can see a full list of these arguments by running the following command in your terminal:
python privateGPT.py --help
How does it work?
By selecting the right local models and leveraging the power of LangChain, you can run the entire pipeline locally, without any data leaving your environment, and with reasonable performance.

ingest.py uses LangChain tools to parse the document and create embeddings locally using HuggingFaceEmbeddings (SentenceTransformers). It then stores the result in a local vector database using the Chroma vector store.
privateGPT.py uses a local LLM based on GPT4All-J or LlamaCpp to understand questions and create answers. The context for the answers is extracted from the local vector store using a similarity search to locate the right piece of context from the docs.
The GPT4All-J wrapper was introduced in LangChain 0.0.162.
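The similarity search step can be illustrated with a dependency-free sketch: score the query embedding against each stored chunk embedding by cosine similarity and keep the top k chunks. The store layout (a dict of chunk id to vector) is hypothetical — Chroma manages this internally — and k defaults to 4, matching the number of sources printed with each answer:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query: list[float], store: dict[str, list[float]], k: int = 4) -> list[str]:
    """Return the ids of the k chunks most similar to the query embedding."""
    ranked = sorted(store, key=lambda cid: cosine(query, store[cid]), reverse=True)
    return ranked[:k]
```

The retrieved chunks are then stuffed into the LLM prompt as context, which is why TARGET_SOURCE_CHUNKS directly controls how much document text the model sees per question.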
To use this software, you must have Python 3.10 or later installed. Earlier versions of Python will not work.
If you encounter an error while building a wheel during the pip install process, you may need to install a C++ compiler on your computer.
To install a C++ compiler on Windows 10/11, install MinGW and, when running its installer, select the gcc component.
When running a Mac with Intel hardware (not M1), you may run into clang: error: the clang compiler does not support '-march=native' during pip install.
If so, set your archflags during pip install, e.g.: ARCHFLAGS="-arch x86_64" pip3 install -r requirements.txt
This is a test project to validate the feasibility of a fully private solution for question answering using LLMs and vector embeddings. It is not production ready, and it is not meant to be used in production. The model selection is not optimized for performance, but for privacy; however, it is possible to use different models and vectorstores to improve performance.