# Spring AI + RAG in Practice: Building a Knowledge Base with Milvus and Zhipu GLM-5
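# 1. Project Overview
In enterprise AI applications, getting a large language model to answer questions from a private knowledge base it was never trained on is a core technical challenge. RAG (Retrieval-Augmented Generation) is a widely used approach to this problem: relevant passages are retrieved from the knowledge base and supplied to the model as context at question time.
This article walks through building a complete private knowledge base Q&A system from scratch with Spring AI, the Milvus vector database, and the Zhipu GLM-5 model.
# Tech Stack
| Component | Choice | Notes |
|---|---|---|
| Framework | Spring AI 1.0.0 | AI framework for the Spring ecosystem |
| LLM | Zhipu GLM-5 | Large language model from Zhipu AI |
| Embedding | Zhipu Embedding-3 | Text embedding model |
| Vector store | Milvus | High-performance open-source vector database |
| Document parsing | Apache Tika | Handles PDF, Word, TXT, and other formats |
# System Architecture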
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│    User     │────▶│  Spring AI  │────▶│    GLM-5    │
│ (question)  │     │ (RAG flow)  │     │  (answer)   │
└─────────────┘     └──────┬──────┘     └─────────────┘
                           │
              ┌────────────┴────────────┐
              │                         │
       ┌──────▼──────┐           ┌──────▼──────┐
       │   Milvus    │           │  Documents  │
       │ (retrieval) │           │ (PDF/Word)  │
       └─────────────┘           └─────────────┘
```
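The flow in the diagram (retrieve the relevant chunks, then pass them to the model together with the question) can be sketched without any framework. This is a toy illustration only: `retrieve` ranks chunks by keyword overlap rather than vector similarity, `buildPrompt` stands in for the orchestration that Spring AI performs, and every name in it is hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RagFlowSketch {

    // Toy relevance score: number of distinct query words that also occur in the chunk.
    static int overlap(String query, String chunk) {
        Set<String> queryWords = new HashSet<>(Arrays.asList(query.toLowerCase().split("\\s+")));
        Set<String> chunkWords = new HashSet<>(Arrays.asList(chunk.toLowerCase().split("\\s+")));
        queryWords.retainAll(chunkWords);
        return queryWords.size();
    }

    // Retrieval step: drop chunks with no overlap, keep the topK highest-scoring ones.
    static List<String> retrieve(String query, List<String> chunks, int topK) {
        return chunks.stream()
                .filter(c -> overlap(query, c) > 0)
                .sorted(Comparator.comparingInt((String c) -> overlap(query, c)).reversed())
                .limit(topK)
                .toList();
    }

    // Augmentation step: number the retrieved chunks and place them ahead of the question.
    static String buildPrompt(String question, List<String> context) {
        StringBuilder sb = new StringBuilder("Answer using only the context below.\n\n");
        for (int i = 0; i < context.size(); i++) {
            sb.append("[").append(i + 1).append("] ").append(context.get(i)).append("\n");
        }
        return sb.append("\nQuestion: ").append(question).toString();
    }
}
```

The string returned by `buildPrompt` is what would be sent to the chat model; in the real system Milvus replaces the keyword scoring and GLM-5 produces the answer.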
# 2. Environment Setup
# 2.1 Deploying Milvus
Use Docker Compose to bring up a standalone Milvus instance quickly:
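```yaml
# docker-compose.yml
version: '3.5'

services:
  milvus:
    image: milvusdb/milvus:v2.3.3
    container_name: milvus-standalone
    # Start Milvus in standalone mode, as the official standalone compose file does
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: milvus-etcd:2379
      MINIO_ADDRESS: milvus-minio:9000
    ports:
      - "19530:19530"   # SDK / gRPC port
      - "9091:9091"     # metrics and health check
    volumes:
      - ./milvus/data:/var/lib/milvus
    depends_on:
      - milvus-etcd
      - milvus-minio

  milvus-etcd:
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
    # Listen on all interfaces so the Milvus container can reach etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd

  milvus-minio:
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    command: server /minio_data --console-address ":9001"
```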
Start the stack:
```bash
docker-compose up -d
```
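# 2.2 Getting a Zhipu API Key
- Visit the Zhipu AI Open Platform
- Register an account and complete enterprise verification
- Create an API key on the "API Keys" page
- Save the key and expose it as the `ZHIPUAI_API_KEY` environment variable (see section 3.3)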
# 3. Project Setup
# 3.1 Maven Dependencies
```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.3.0</version>
    </parent>

    <!-- Placeholder project coordinates; replace with your own -->
    <groupId>com.farerboy</groupId>
    <artifactId>spring-ai-rag-milvus</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <properties>
        <java.version>17</java.version>
        <!-- Matches the version in the tech stack table. The artifact IDs below follow
             the 1.0.0 naming; check them against the BOM if you use another version. -->
        <spring-ai.version>1.0.0</spring-ai.version>
    </properties>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <!-- Zhipu AI starter: provides both the GLM chat model and the embedding model -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-model-zhipuai</artifactId>
        </dependency>
        <!-- Milvus vector store starter -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-vector-store-milvus</artifactId>
        </dependency>
        <!-- QuestionAnswerAdvisor used by the RAG chat service -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-advisors-vector-store</artifactId>
        </dependency>
        <!-- Tika document parsing -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-tika-document-reader</artifactId>
        </dependency>
        <!-- Spring Web -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
    </dependencies>
</project>
```
# 3.2 Configuration File
```yaml
# application.yml
server:
  port: 8080

spring:
  application:
    name: spring-ai-rag-milvus
  ai:
    # Zhipu GLM-5 settings
    zhipuai:
      api-key: ${ZHIPUAI_API_KEY:}
      # Spring AI's Zhipu client is expected to append the /v4/... path itself,
      # so the version segment is left off here; verify against your Spring AI version
      base-url: https://open.bigmodel.cn/api/paas
      chat:
        options:
          model: glm-5
          temperature: 0.7
          max-tokens: 2048
      embedding:
        options:
          model: embedding-3
    # Milvus vector store settings
    vectorstore:
      milvus:
        client:
          host: ${MILVUS_HOST:localhost}
          port: ${MILVUS_PORT:19530}
          username: ${MILVUS_USER:root}
          password: ${MILVUS_PASSWORD:milvus}
        database-name: default
        collection-name: knowledge_base
        # Must equal the length of the vectors the embedding model actually returns
        embedding-dimension: 1024
        index-type: IVF_FLAT
        metric-type: COSINE
        initialize-schema: true
```
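A mismatch between `embedding-dimension` and the vectors the embedding model returns tends to surface only when the first insert fails. As a rough illustration, the check below compares a vector's length with the configured dimension. The `DimensionCheck` class is hypothetical and not part of Spring AI; in a real application the vector would come from a single test call to the embedding model at startup.

```java
class DimensionCheck {

    // Throws if the vector length differs from the dimension configured for the collection.
    static void requireDimension(float[] vector, int configuredDimension) {
        if (vector.length != configuredDimension) {
            throw new IllegalStateException(
                    "Embedding has " + vector.length
                            + " dimensions but the collection expects " + configuredDimension);
        }
    }
}
```

Running such a check once at startup turns a confusing insert error into an explicit configuration message.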
# 3.3 Environment Variables
```bash
# Set these before starting the application
export ZHIPUAI_API_KEY=your-zhipuai-api-key
export MILVUS_HOST=localhost
export MILVUS_PORT=19530
```
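Because the configuration falls back to an empty string when `ZHIPUAI_API_KEY` is unset, a missing key would otherwise show up only as an authentication error on the first request. A minimal fail-fast sketch follows; the `ApiKeyCheck` class is a hypothetical helper, not something Spring AI provides.

```java
import java.util.Map;

class ApiKeyCheck {

    // Returns the trimmed key, or throws if the variable is missing or blank.
    static String requireApiKey(Map<String, String> env) {
        String key = env.get("ZHIPUAI_API_KEY");
        if (key == null || key.isBlank()) {
            throw new IllegalStateException("ZHIPUAI_API_KEY is not set");
        }
        return key.trim();
    }
}
```

Calling `ApiKeyCheck.requireApiKey(System.getenv())` early in startup would stop the application before any request reaches the model.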
# 4. Core Implementation
# 4.1 Document Processing Service
This service parses knowledge base documents, splits them into chunks, and stores the chunks as vectors in Milvus:
```java
package com.farerboy.springai.demo.service;

import org.springframework.ai.document.Document;
import org.springframework.ai.document.DocumentReader;
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.core.io.UrlResource;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Stream;

@Service
public class DocumentService {

    private final VectorStore vectorStore;
    private final TokenTextSplitter documentSplitter;

    @Value("${knowledge.base.path:./knowledge-base}")
    private String knowledgeBasePath;

    public DocumentService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
        // Chunking: target 512 tokens per chunk. TokenTextSplitter has no overlap
        // parameter, so the remaining arguments are its default values.
        this.documentSplitter = new TokenTextSplitter(
                512,    // chunkSize: target chunk size in tokens
                350,    // minChunkSizeChars
                5,      // minChunkLengthToEmbed
                10000,  // maxNumChunks
                true    // keepSeparator
        );
    }

    /**
     * Loads a single document.
     */
    public void loadDocument(String filePath) throws IOException {
        Resource resource = new UrlResource(Paths.get(filePath).toUri());

        // Parse PDF, Word, TXT, and other formats with Tika
        DocumentReader documentReader = new TikaDocumentReader(resource);
        List<Document> documents = documentReader.read();

        // Split into chunks
        List<Document> chunks = documentSplitter.apply(documents);

        // Attach metadata so answers can cite their source
        chunks.forEach(doc -> {
            doc.getMetadata().put("source", filePath);
            doc.getMetadata().put("filename", Paths.get(filePath).getFileName().toString());
        });

        // Embed the chunks and store them in the vector database
        vectorStore.add(chunks);
    }

    /**
     * Loads every supported document under a directory, recursively.
     */
    public void loadDirectory(String directoryPath) throws IOException {
        Path root = Paths.get(directoryPath);
        List<Path> files;
        // Files.walk keeps directory handles open, so close the stream when done
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString().toLowerCase();
                        return name.endsWith(".pdf") ||
                                name.endsWith(".docx") ||
                                name.endsWith(".txt") ||
                                name.endsWith(".md");
                    })
                    .toList();
        }

        for (Path file : files) {
            try {
                loadDocument(file.toAbsolutePath().toString());
            } catch (Exception e) {
                // Keep going so one bad file does not abort the whole batch
                System.err.println("Failed to load: " + file + ", error: " + e.getMessage());
            }
        }
    }

    /**
     * Loads the default knowledge base directory.
     */
    public void loadKnowledgeBase() throws IOException {
        loadDirectory(knowledgeBasePath);
    }
}
```
# 4.2 RAG Question Answering Service
The chat service answers questions using retrieval-augmented generation:
```java
package com.farerboy.springai.demo.service;

import org.springframework.ai.chat.client.ChatClient;
// Package as of Spring AI 1.0.0 (module spring-ai-advisors-vector-store)
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

@Service
public class ChatService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public ChatService(ChatModel chatModel, VectorStore vectorStore) {
        this.vectorStore = vectorStore;
        // Build the ChatClient with a default system prompt
        this.chatClient = ChatClient.builder(chatModel)
                .defaultSystem("You are a professional knowledge base Q&A assistant. " +
                        "Answer the user's question based on the provided context. " +
                        "If the context contains no relevant information, say so clearly.")
                .build();
    }

    /**
     * Basic RAG question answering.
     */
    public String chat(String question) {
        return chatClient.prompt()
                .user(question)
                .advisors(QuestionAnswerAdvisor.builder(vectorStore).build())
                .call()
                .content();
    }

    /**
     * RAG question answering restricted by metadata filters.
     */
    public String chat(String question, Map<String, Object> filters) {
        if (filters == null || filters.isEmpty()) {
            return chat(question);
        }

        // Combine the filters into one expression, e.g. filename == 'guide.pdf'.
        // Keys and values are concatenated as-is, so they must come from trusted input.
        StringJoiner filterExpr = new StringJoiner(" && ");
        for (Map.Entry<String, Object> entry : filters.entrySet()) {
            filterExpr.add(entry.getKey() + " == '" + entry.getValue() + "'");
        }

        SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .filterExpression(filterExpr.toString())
                .topK(5)
                .similarityThreshold(0.7)
                .build();

        return chatClient.prompt()
                .user(question)
                .advisors(QuestionAnswerAdvisor.builder(vectorStore)
                        .searchRequest(searchRequest)
                        .build())
                .call()
                .content();
    }

    /**
     * Question answering that also returns the source documents.
     */
    public Map<String, Object> chatWithSources(String question) {
        // Retrieve the most relevant chunks
        SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .topK(3)
                .similarityThreshold(0.6)
                .build();
        List<Document> docs = vectorStore.similaritySearch(searchRequest);

        // Build the context block and the list of sources
        StringBuilder context = new StringBuilder();
        StringBuilder sources = new StringBuilder();
        for (int i = 0; i < docs.size(); i++) {
            Document doc = docs.get(i);
            context.append("Document ").append(i + 1).append(":\n")
                    .append(doc.getText())
                    .append("\n\n");
            Object filename = doc.getMetadata().get("filename");
            sources.append("- ")
                    .append(filename != null ? filename.toString() : "unknown source")
                    .append("\n");
        }

        // Build the augmented prompt
        String prompt = "Answer the user's question based on the knowledge base content below.\n\n" +
                "[Knowledge base content]\n" + context +
                "[User question]\n" + question + "\n\n" +
                "[Requirements]\n" +
                "1. Answer based on the knowledge base content provided\n" +
                "2. If there is no relevant information, say so\n" +
                "3. List the reference sources at the end of the answer";

        String answer = chatClient.prompt()
                .user(prompt)
                .call()
                .content();

        return Map.of(
                "answer", answer,
                "sources", sources.toString(),
                "docCount", docs.size()
        );
    }

    /**
     * Returns the text of the topK chunks most similar to the query
     * (called by the /search endpoint of the controller).
     */
    public List<String> searchSimilarDocuments(String query, int topK) {
        SearchRequest searchRequest = SearchRequest.builder()
                .query(query)
                .topK(topK)
                .build();
        return vectorStore.similaritySearch(searchRequest).stream()
                .map(Document::getText)
                .toList();
    }
}
```
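The filtered `chat` overload builds its filter expression by concatenating caller-supplied keys and values, so a value containing a quote could change the meaning of the expression. One possible hardening, sketched here under assumptions, is to validate keys against an identifier pattern, emit numbers and booleans unquoted, and reject string values that contain a single quote. `FilterExpressions` is a hypothetical helper; how the filter syntax treats escape sequences is not covered here, which is why quotes are rejected rather than escaped.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Pattern;

class FilterExpressions {

    // Metadata keys are restricted to simple identifiers.
    private static final Pattern KEY = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Joins all entries into "key == literal && key == literal ...".
    static String build(Map<String, Object> filters) {
        StringJoiner joiner = new StringJoiner(" && ");
        for (Map.Entry<String, Object> entry : filters.entrySet()) {
            if (!KEY.matcher(entry.getKey()).matches()) {
                throw new IllegalArgumentException("Invalid filter key: " + entry.getKey());
            }
            joiner.add(entry.getKey() + " == " + literal(entry.getValue()));
        }
        return joiner.toString();
    }

    // Numbers and booleans are written bare; everything else becomes a quoted string.
    static String literal(Object value) {
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        String text = String.valueOf(value);
        if (text.contains("'")) {
            throw new IllegalArgumentException("Quotes are not allowed in filter values");
        }
        return "'" + text + "'";
    }
}
```

The resulting string could then be passed to `filterExpression(...)` in place of the hand-built one.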
# 4.3 REST API Controller
```java
package com.farerboy.springai.demo.controller;

import com.farerboy.springai.demo.service.ChatService;
import com.farerboy.springai.demo.service.DocumentService;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;

@RestController
@RequestMapping("/api/rag")
public class RagController {

    private final DocumentService documentService;
    private final ChatService chatService;

    public RagController(DocumentService documentService, ChatService chatService) {
        this.documentService = documentService;
        this.chatService = chatService;
    }

    /**
     * RAG question answering endpoint.
     */
    @PostMapping("/chat")
    public ResponseEntity<Map<String, Object>> chat(@RequestBody Map<String, String> request) {
        String question = request.get("question");
        String answer = chatService.chat(question);
        return ResponseEntity.ok(Map.of(
                "answer", answer,
                "question", question
        ));
    }

    /**
     * Question answering endpoint that also returns sources.
     */
    @PostMapping("/chat/with-sources")
    public ResponseEntity<Map<String, Object>> chatWithSources(@RequestBody Map<String, String> request) {
        String question = request.get("question");
        Map<String, Object> result = chatService.chatWithSources(question);
        return ResponseEntity.ok(result);
    }

    /**
     * Document upload endpoint.
     */
    @PostMapping("/document/upload")
    public ResponseEntity<Map<String, Object>> uploadDocument(@RequestParam("file") MultipartFile file)
            throws IOException {
        String originalName = file.getOriginalFilename();
        if (originalName == null || originalName.isBlank()) {
            return ResponseEntity.badRequest().body(Map.of("message", "Missing file name"));
        }
        // Keep only the last path segment so the name cannot point outside the upload directory
        String filename = Paths.get(originalName).getFileName().toString();

        Path uploadDir = Paths.get("./knowledge-base/uploads");
        Files.createDirectories(uploadDir);
        Path filePath = uploadDir.resolve(filename);
        Files.write(filePath, file.getBytes());

        documentService.loadDocument(filePath.toAbsolutePath().toString());
        return ResponseEntity.ok(Map.of(
                "message", "Document uploaded",
                "filename", filename
        ));
    }

    /**
     * Loads every document under a directory.
     */
    @PostMapping("/document/load-directory")
    public ResponseEntity<Map<String, Object>> loadDirectory(@RequestBody Map<String, String> request)
            throws IOException {
        String directoryPath = request.get("path");
        documentService.loadDirectory(directoryPath);
        return ResponseEntity.ok(Map.of(
                "message", "Documents loaded",
                "path", directoryPath
        ));
    }

    /**
     * Similar document search.
     */
    @GetMapping("/search")
    public ResponseEntity<Map<String, Object>> search(
            @RequestParam String query,
            @RequestParam(defaultValue = "5") int topK) {
        List<String> results = chatService.searchSimilarDocuments(query, topK);
        return ResponseEntity.ok(Map.of(
                "query", query,
                "results", results,
                "count", results.size()
        ));
    }
}
```
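The upload endpoint writes a file whose name is chosen by the client, so a name such as `../../etc/passwd` deserves care. The sketch below shows one way to resolve such a name safely: keep only the last path segment (treating both `/` and `\` as separators) and confirm the result stays inside the upload directory. `UploadPaths` is a hypothetical helper, not part of the controller above.

```java
import java.nio.file.Path;

class UploadPaths {

    // Resolves a client-supplied file name inside uploadDir, rejecting names that would escape it.
    static Path resolveSafely(Path uploadDir, String originalFilename) {
        if (originalFilename == null || originalFilename.isBlank()) {
            throw new IllegalArgumentException("File name is missing");
        }
        // Keep only the last path segment, whichever separator the client used.
        String name = originalFilename.replace('\\', '/');
        name = name.substring(name.lastIndexOf('/') + 1);
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            throw new IllegalArgumentException("Invalid file name: " + originalFilename);
        }
        Path base = uploadDir.toAbsolutePath().normalize();
        Path target = base.resolve(name).normalize();
        // Defensive check: the resolved path must still be under the upload directory.
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("File name escapes the upload directory");
        }
        return target;
    }
}
```

With this helper, `../../etc/passwd` resolves to a file named `passwd` inside the upload directory rather than to a system path.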
# 4.4 Application Entry Point
```java
package com.farerboy.springai.demo;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringAiRagApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringAiRagApplication.class, args);
    }
}
```
# 5. API Reference
# 5.1 Question Answering
```bash
# Basic RAG question answering
curl -X POST http://localhost:8080/api/rag/chat \
  -H "Content-Type: application/json" \
  -d '{"question": "What is Spring AI?"}'
```
Example response:
```json
{
  "answer": "Spring AI is an application framework for AI engineering...",
  "question": "What is Spring AI?"
}
```
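The same request can be built from Java with the JDK's `java.net.http` client. This is a sketch only: `ChatRequests` is a hypothetical helper, and its JSON escaping covers just backslashes and double quotes, so a real client would more likely use a JSON library.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class ChatRequests {

    // Minimal escaping for a JSON string value (backslash and double quote only).
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the request body expected by the /api/rag/chat endpoint.
    static String toJson(String question) {
        return "{\"question\": \"" + escape(question) + "\"}";
    }

    // Builds (but does not send) the POST request for the chat endpoint.
    static HttpRequest chat(String baseUrl, String question) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/rag/chat"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJson(question)))
                .build();
    }
}
```

Sending it is then a matter of `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running instance.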
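# 5.2 Question Answering with Sources
```bash
# Get an answer together with its reference sources
curl -X POST http://localhost:8080/api/rag/chat/with-sources \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I configure the Milvus vector store?"}'
```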
# 5.3 Document Upload
```bash
# Upload a knowledge base document
curl -X POST http://localhost:8080/api/rag/document/upload \
  -F "file=@./docs/spring-ai-guide.pdf"
```
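# 5.4 Similar Document Search
```bash
# Search for similar documents (-G with --data-urlencode encodes the space in the query)
curl -G "http://localhost:8080/api/rag/search" \
  --data-urlencode "query=Spring AI configuration" \
  --data-urlencode "topK=3"
```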
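When the search endpoint is called from Java rather than curl, the query string needs encoding because it can contain spaces or non-ASCII text. A small sketch using the JDK's `URLEncoder` is shown below; `SearchUrl` is a hypothetical helper. Note that `URLEncoder` produces form-style encoding, where a space becomes `+`, which servlet containers normally decode back to a space for query parameters.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SearchUrl {

    // Builds the /api/rag/search URL with a form-encoded query parameter.
    static URI build(String baseUrl, String query, int topK) {
        String encodedQuery = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/rag/search?query=" + encodedQuery + "&topK=" + topK);
    }
}
```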
# 6. Best Practices
# 6.1 Chunking Strategy
| Scenario | Suggested parameters (tokens) |
|---|---|
| General documents | chunk=512, overlap=128 |
| Technical documents | chunk=1024, overlap=256 |
| Q&A content | chunk=256, overlap=64 |
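The chunk and overlap pairs in the table describe a sliding window: each chunk starts `chunk - overlap` units after the previous one, so neighboring chunks share `overlap` units. The sketch below illustrates the idea with whitespace-separated words standing in for tokens, which is a simplification; `OverlapChunker` is a hypothetical class and a real implementation would count model tokens instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OverlapChunker {

    // Splits text into windows of chunkSize words; consecutive windows share `overlap` words.
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("Require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = chunkSize - overlap;
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }
}
```

For example, `chunk("a b c d e f g", 4, 2)` yields `a b c d`, `c d e f`, and `e f g`: every chunk repeats the last two words of the one before it.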
# 6.2 Similarity Threshold
- Strict: 0.8 and above (favors precision)
- Balanced: 0.6 to 0.8 (recommended)
- Loose: 0.4 to 0.6 (favors recall)
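To make the effect of the threshold concrete, the sketch below computes cosine similarity by hand and keeps only the candidates at or above a threshold. It is an illustration of the metric, not of how Milvus or Spring AI score results internally, and `SimilarityThreshold` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;

class SimilarityThreshold {

    // Cosine similarity of two equal-length vectors; returns 0 if either is a zero vector.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the indices of candidates whose similarity to the query is at least the threshold.
    static List<Integer> indicesAtOrAbove(double[] query, List<double[]> candidates, double threshold) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < candidates.size(); i++) {
            if (cosine(query, candidates.get(i)) >= threshold) {
                kept.add(i);
            }
        }
        return kept;
    }
}
```

With the query `[1, 0]` and candidates `[1, 0]`, `[0, 1]`, `[1, 1]`, a threshold of 0.8 keeps only the first candidate, while 0.6 also keeps the third (similarity about 0.71). Raising the threshold trades recall for precision in exactly this way.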
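# 6.3 Choosing an Index Type
| Type | Characteristics | Suitable for |
|---|---|---|
| IVF_FLAT | High accuracy, fast retrieval | Small to medium datasets |
| HNSW | Very fast search, high memory usage | Large datasets, latency-sensitive workloads |
| IVF_SQ8 | Compressed storage at the cost of some accuracy | Very large datasets |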
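# 7. FAQ
# Q1: Retrieval returns nothing relevant?
- Check that the documents were loaded successfully
- Adjust the chunk size and overlap
- Lower the similarity threshold
# Q2: The LLM hallucinates?
- Raise the similarity threshold (0.7 or higher is suggested)
- Use the question answering mode that returns sources
- Emphasize "answer based on the knowledge base" in the prompt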
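A related guard, offered here as a suggestion rather than something the steps above prescribe: when a raised threshold filters out every chunk, the model has nothing to ground its answer on, so the call can be skipped in favor of a fixed reply. In the sketch below, `EmptyContextGuard` and its message are hypothetical, and the `generator` function stands in for the actual model call.

```java
import java.util.List;
import java.util.function.Function;

class EmptyContextGuard {

    static final String NO_CONTEXT_ANSWER =
            "No relevant information was found in the knowledge base.";

    // Invokes the generator only when retrieval produced at least one chunk.
    static String answer(List<String> retrievedChunks, Function<List<String>, String> generator) {
        if (retrievedChunks == null || retrievedChunks.isEmpty()) {
            return NO_CONTEXT_ANSWER;
        }
        return generator.apply(retrievedChunks);
    }
}
```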
# Q3: Responses are slow?
- Use the HNSW index
- Enable batch processing
- Consider deploying the model locally
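One reading of "batch processing" is to group chunks into fixed-size batches before embedding and storing them, so each call carries many chunks instead of one. The sketch below shows only the partitioning step; `Batching` is a hypothetical helper, and a suitable batch size depends on the limits of the embedding API in use.

```java
import java.util.ArrayList;
import java.util.List;

class Batching {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }
}
```

Each batch could then be passed to a single `vectorStore.add(batch)` call.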
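# 8. Summary
This article described how to build a private knowledge base Q&A system with Spring AI, Milvus, and Zhipu GLM-5. The approach has the following advantages:
- Domestic model support: Zhipu GLM-5 is a large language model developed in China and responds quickly
- High-performance retrieval: the Milvus vector database supports millisecond-level search over very large datasets
- Spring ecosystem: a low barrier to entry for Java developers
- Extensible: the setup can be extended with multi-tenancy, hybrid search, reranking, and other advanced features
With the code samples and configuration in this article, you can quickly set up your own private knowledge base Q&A system and put AI to work as an assistant for enterprise knowledge.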