# Spring AI + RAG in Practice: Building a Knowledge Base with Milvus and Zhipu GLM-5

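# 1. Project Overview

In enterprise AI applications, a core technical challenge is getting a large language model to understand and answer questions about a company's private knowledge base. RAG (Retrieval Augmented Generation) is a widely used architecture for this problem: passages relevant to the question are retrieved from the knowledge base at query time and passed to the model as context.

This article walks through building a complete private knowledge base Q&A system from scratch with Spring AI, the Milvus vector database, and the Zhipu GLM-5 model.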

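# Tech Stack

| Component | Choice | Notes |
| --- | --- | --- |
| Framework | Spring AI 1.0.0 | AI framework for the Spring ecosystem |
| LLM | Zhipu GLM-5 | Large language model from Zhipu AI |
| Embedding | Zhipu Embedding-3 | Text embedding model |
| Vector store | Milvus | High-performance open-source vector database |
| Document parsing | Apache Tika | Supports PDF, Word, TXT, and other formats |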

# System Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│    User     │────▶│  Spring AI  │────▶│    GLM-5    │
│ (question)  │     │ (RAG flow)  │     │  (answer)   │
└─────────────┘     └──────┬──────┘     └─────────────┘
                           │
              ┌────────────┴────────────┐
              │                         │
       ┌──────▼──────┐           ┌──────▼──────┐
       │   Milvus    │           │  Documents  │
       │ (retrieval) │           │ (PDF/Word)  │
       └─────────────┘           └─────────────┘
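The flow in the diagram can be sketched in plain Java before any framework is involved. This is only an illustration of the retrieve-then-generate idea, not Spring AI's implementation: `RagPipeline`, its `retriever` and `generator` functions, and the prompt layout are hypothetical stand-ins for the Milvus search and the GLM-5 call.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of the retrieve-then-generate flow; not a Spring AI class. */
final class RagPipeline {
    private final Function<String, List<String>> retriever; // stand-in for the vector search
    private final Function<String, String> generator;       // stand-in for the LLM call

    RagPipeline(Function<String, List<String>> retriever, Function<String, String> generator) {
        this.retriever = retriever;
        this.generator = generator;
    }

    /** Retrieves context for the question, builds an augmented prompt, and hands it to the generator. */
    String answer(String question) {
        List<String> context = retriever.apply(question);
        String prompt = "Context:\n" + String.join("\n---\n", context)
                + "\n\nQuestion: " + question;
        return generator.apply(prompt);
    }

    public static void main(String[] args) {
        RagPipeline pipeline = new RagPipeline(
                q -> List.of("Milvus listens on port 19530."),  // canned retrieval result
                prompt -> "[model input]\n" + prompt);           // echoes the prompt instead of calling a model
        System.out.println(pipeline.answer("Which port does Milvus use?"));
    }
}
```

Keeping retrieval and generation behind two small functions makes it easy to test the orchestration with stubs, which is roughly the role the framework's advisor plays in the later sections.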

# 2. Environment Setup

# 2.1 Deploying Milvus

Use Docker Compose to bring up a standalone Milvus instance:

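# docker-compose.yml
version: '3.5'

services:
  milvus:
    image: milvusdb/milvus:v2.3.3
    container_name: milvus-standalone
    # Run Milvus in standalone mode, as in the official compose file
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: milvus-etcd:2379
      MINIO_ADDRESS: milvus-minio:9000
    ports:
      - "19530:19530"   # client (gRPC) port
      - "9091:9091"     # health check and metrics
    volumes:
      - ./milvus/data:/var/lib/milvus
    depends_on:
      - milvus-etcd
      - milvus-minio

  milvus-etcd:
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
    # Listen on all interfaces so the milvus container can reach etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd

  milvus-minio:
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    command: minio server /minio_data --console-address ":9001"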

Start the services:

docker-compose up -d

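# 2.2 Getting a Zhipu API Key

  1. Visit the Zhipu AI Open Platform
  2. Register an account and complete enterprise verification
  3. Create an API key on the "API Keys" page
  4. Record the key and set it as an environment variable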

# 3. Project Setup

# 3.1 Maven Dependencies

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
         http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <!-- Spring AI 1.0.x targets Spring Boot 3.4.x and later -->
        <version>3.4.5</version>
    </parent>

    <!-- Placeholder coordinates; replace with your own -->
    <groupId>com.farerboy</groupId>
    <artifactId>spring-ai-rag-milvus</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <properties>
        <java.version>17</java.version>
        <!-- Matches the version listed in the tech stack table -->
        <spring-ai.version>1.0.0</spring-ai.version>
    </properties>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <!-- Zhipu AI starter: provides both the GLM chat model and the embedding model -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-model-zhipuai</artifactId>
        </dependency>

        <!-- Milvus vector store starter (auto-configures the VectorStore bean) -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-vector-store-milvus</artifactId>
        </dependency>

        <!-- QuestionAnswerAdvisor for RAG -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-advisors-vector-store</artifactId>
        </dependency>

        <!-- Tika document reader -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-tika-document-reader</artifactId>
        </dependency>

        <!-- Spring Web -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
    </dependencies>
</project>

# 3.2 Configuration File

# application.yml
server:
  port: 8080

spring:
  application:
    name: spring-ai-rag-milvus

  ai:
    # Zhipu GLM-5 configuration
    zhipuai:
      api-key: ${ZHIPUAI_API_KEY:}
      # Spring AI appends the /v4/... path itself, so the base URL stops at /api/paas
      base-url: https://open.bigmodel.cn/api/paas
      chat:
        options:
          model: glm-5
          temperature: 0.7
          max-tokens: 2048
      embedding:
        options:
          model: embedding-3

    # Milvus vector database configuration
    vectorstore:
      milvus:
        client:
          host: ${MILVUS_HOST:localhost}
          port: ${MILVUS_PORT:19530}
          username: ${MILVUS_USER:root}
          password: ${MILVUS_PASSWORD:milvus}
        database-name: default
        collection-name: knowledge_base
        # Must equal the embedding model's output size; embedding-3 returns
        # 2048-dimensional vectors unless a smaller size is requested
        embedding-dimension: 2048
        index-type: IVF_FLAT
        metric-type: COSINE
        initialize-schema: true

# 3.3 Environment Variables

# Set these environment variables before starting the application
export ZHIPUAI_API_KEY=your-zhipuai-api-key
export MILVUS_HOST=localhost
export MILVUS_PORT=19530
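Because `${ZHIPUAI_API_KEY:}` falls back to an empty string, placeholder resolution alone will not flag a missing key. A small startup check can make that failure explicit. The sketch below is an assumption-laden illustration: `EnvCheck`, `valueOrDefault`, and `require` are hypothetical names, and, unlike Spring's placeholder syntax, it treats a blank value the same as an unset one.

```java
import java.util.Map;

/** Hypothetical startup check; class and method names are illustrative only. */
final class EnvCheck {

    /** Returns the variable's value, or the fallback when it is unset or blank. */
    static String valueOrDefault(Map<String, String> env, String name, String fallback) {
        String value = env.get(name);
        return (value == null || value.isBlank()) ? fallback : value;
    }

    /** Returns the variable's value, or throws when it is unset or blank. */
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Environment variable " + name + " is not set");
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println("Milvus host: " + valueOrDefault(env, "MILVUS_HOST", "localhost"));
        System.out.println("Milvus port: " + valueOrDefault(env, "MILVUS_PORT", "19530"));
        require(env, "ZHIPUAI_API_KEY"); // throws with a clear message if the key is missing
        System.out.println("ZHIPUAI_API_KEY is set");
    }
}
```

Passing the environment in as a `Map` keeps the helpers testable without touching the real process environment.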

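# 4. Core Implementation

# 4.1 Document Processing Service

This service parses knowledge base documents, splits them into chunks, and stores the chunks in Milvus; the vector store computes the embeddings when the chunks are added: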

package com.farerboy.springai.demo.service;

import org.springframework.ai.document.Document;
import org.springframework.ai.document.DocumentReader;
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.core.io.UrlResource;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Stream;

@Service
public class DocumentService {

    private final VectorStore vectorStore;
    private final TokenTextSplitter documentSplitter;

    @Value("${knowledge.base.path:./knowledge-base}")
    private String knowledgeBasePath;

    public DocumentService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
        // Chunking: about 512 tokens per chunk. TokenTextSplitter has no overlap
        // parameter; the remaining arguments are its default values.
        this.documentSplitter = new TokenTextSplitter(
            512,    // chunkSize: target chunk size in tokens
            350,    // minChunkSizeChars: minimum chunk size in characters
            5,      // minChunkLengthToEmbed: shorter chunks are dropped
            10000,  // maxNumChunks: upper bound on chunks per document
            true    // keepSeparator
        );
    }

    /**
     * Loads a single document.
     */
    public void loadDocument(String filePath) throws IOException {
        Resource resource = new UrlResource(Paths.get(filePath).toUri());
        
        // Parse PDF/Word/TXT and other formats with Tika
        DocumentReader documentReader = new TikaDocumentReader(resource);
        List<Document> documents = documentReader.read();
        
        // Split into chunks
        List<Document> chunks = documentSplitter.apply(documents);
        
        // Attach metadata used later for filtering and source attribution
        chunks.forEach(doc -> {
            doc.getMetadata().put("source", filePath);
            doc.getMetadata().put("filename", Paths.get(filePath).getFileName().toString());
        });
        
        // Store in the vector database (embeddings are computed by the store)
        vectorStore.add(chunks);
    }

    /**
     * Loads every supported document under a directory.
     */
    public void loadDirectory(String directoryPath) throws IOException {
        Path path = Paths.get(directoryPath);
        
        List<Path> files;
        // Files.walk holds directory handles open, so close the stream when done
        try (Stream<Path> paths = Files.walk(path)) {
            files = paths
                .filter(Files::isRegularFile)
                .filter(p -> {
                    String name = p.getFileName().toString().toLowerCase();
                    return name.endsWith(".pdf") || 
                           name.endsWith(".docx") || 
                           name.endsWith(".txt") ||
                           name.endsWith(".md");
                })
                .toList();
        }

        for (Path file : files) {
            try {
                loadDocument(file.toAbsolutePath().toString());
            } catch (Exception e) {
                // Keep going so one bad file does not abort the whole import
                System.err.println("Failed to load: " + file + ", error: " + e.getMessage());
            }
        }
    }

    /**
     * Loads the default knowledge base directory.
     */
    public void loadKnowledgeBase() throws IOException {
        loadDirectory(knowledgeBasePath);
    }
}

# 4.2 RAG Q&A Service

A chat service built on retrieval augmentation:

package com.farerboy.springai.demo.service;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

import java.util.List;
import java.util.Map;

@Service
public class ChatService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public ChatService(ChatModel chatModel, VectorStore vectorStore) {
        this.vectorStore = vectorStore;
        
        // Build the ChatClient and set the system prompt
        this.chatClient = ChatClient.builder(chatModel)
            .defaultSystem("You are a professional knowledge base Q&A assistant. " +
                "Answer the user's question based on the provided context. " +
                "If the context contains no relevant information, tell the user so explicitly.")
            .build();
    }

    /**
     * Basic RAG Q&A.
     */
    public String chat(String question) {
        return chatClient.prompt()
            .user(question)
            .advisors(QuestionAnswerAdvisor.builder(vectorStore).build())
            .call()
            .content();
    }

    /**
     * RAG Q&A restricted by metadata filters.
     */
    public String chat(String question, Map<String, Object> filters) {
        if (filters != null && !filters.isEmpty()) {
            // Build an expression such as: filename == 'a.pdf' AND source == '...'
            // Values are inserted verbatim, so they must not contain single quotes.
            StringBuilder filterExpr = new StringBuilder();
            int i = 0;
            for (Map.Entry<String, Object> entry : filters.entrySet()) {
                if (i > 0) filterExpr.append(" AND ");
                filterExpr.append(entry.getKey())
                    .append(" == '")
                    .append(entry.getValue())
                    .append("'");
                i++;
            }

            SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .filterExpression(filterExpr.toString())
                .topK(5)
                .similarityThreshold(0.7)
                .build();

            return chatClient.prompt()
                .user(question)
                .advisors(QuestionAnswerAdvisor.builder(vectorStore)
                    .searchRequest(searchRequest)
                    .build())
                .call()
                .content();
        }
        return chat(question);
    }

    /**
     * Q&A that also returns the source documents.
     */
    public Map<String, Object> chatWithSources(String question) {
        // Retrieve relevant documents
        SearchRequest searchRequest = SearchRequest.builder()
            .query(question)
            .topK(3)
            .similarityThreshold(0.6)
            .build();

        var docs = vectorStore.similaritySearch(searchRequest);
        
        // Build the context and the source list
        StringBuilder context = new StringBuilder();
        StringBuilder sources = new StringBuilder();
        
        for (int i = 0; i < docs.size(); i++) {
            var doc = docs.get(i);
            context.append("Document ").append(i + 1).append(":\n")
                   .append(doc.getText())
                   .append("\n\n");
            
            String filename = doc.getMetadata().get("filename") != null 
                ? doc.getMetadata().get("filename").toString() 
                : "unknown source";
            sources.append("- ").append(filename).append("\n");
        }

        // Build the augmented prompt
        String prompt = "Answer the user's question based on the knowledge base content below.\n\n" +
            "[Knowledge base content]\n" + context.toString() +
            "[User question]\n" + question + "\n\n" +
            "[Answer requirements]\n" +
            "1. Answer from the provided knowledge base content\n" +
            "2. If there is no relevant information, say so\n" +
            "3. List the reference sources at the end of the answer";

        String answer = chatClient.prompt()
            .user(prompt)
            .call()
            .content();

        return Map.of(
            "answer", answer,
            "sources", sources.toString(),
            "docCount", docs.size()
        );
    }

    /**
     * Returns the text of the topK chunks most similar to the query
     * (used by the /search endpoint).
     */
    public List<String> searchSimilarDocuments(String query, int topK) {
        SearchRequest searchRequest = SearchRequest.builder()
            .query(query)
            .topK(topK)
            .build();

        return vectorStore.similaritySearch(searchRequest).stream()
            .map(Document::getText)
            .toList();
    }
}
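The filtered variant builds its filter expression by string concatenation, so a key or value containing a quote could change the meaning of the expression. One cautious option is to validate inputs before concatenating. The sketch below is an assumption, not part of Spring AI: `FilterExpressions`, `isSafeValue`, and the key pattern are hypothetical, and it rejects quotes and backslashes outright rather than relying on any escaping rule of the filter grammar.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical helper that validates filter inputs before building "key == 'value'" clauses. */
final class FilterExpressions {
    // Assumed rule: metadata keys are simple identifiers
    private static final Pattern KEY = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** A value is accepted only if it cannot terminate or escape the quoted literal. */
    static boolean isSafeValue(String value) {
        return value != null && value.indexOf('\'') < 0 && value.indexOf('\\') < 0;
    }

    /** Joins validated equality clauses with AND, in sorted key order for deterministic output. */
    static String build(Map<String, Object> filters) {
        return new TreeMap<String, Object>(filters).entrySet().stream()
            .map(e -> {
                String value = String.valueOf(e.getValue());
                if (!KEY.matcher(e.getKey()).matches() || !isSafeValue(value)) {
                    throw new IllegalArgumentException("Unsafe filter entry: " + e.getKey());
                }
                return e.getKey() + " == '" + value + "'";
            })
            .collect(Collectors.joining(" AND "));
    }

    public static void main(String[] args) {
        System.out.println(build(Map.of("filename", "guide.pdf")));
        try {
            build(Map.of("filename", "x' OR 'a' == 'a"));
        } catch (IllegalArgumentException ex) {
            System.out.println("Rejected: " + ex.getMessage());
        }
    }
}
```

Rejecting suspicious input is deliberately stricter than escaping it; if quoted values are genuinely needed, the escaping rules of the filter language in use would have to be checked first.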

# 4.3 REST API Controller

package com.farerboy.springai.demo.controller;

import com.farerboy.springai.demo.service.ChatService;
import com.farerboy.springai.demo.service.DocumentService;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;

@RestController
@RequestMapping("/api/rag")
public class RagController {

    private final DocumentService documentService;
    private final ChatService chatService;

    public RagController(DocumentService documentService, ChatService chatService) {
        this.documentService = documentService;
        this.chatService = chatService;
    }

    /**
     * RAG Q&A endpoint.
     */
    @PostMapping("/chat")
    public ResponseEntity<Map<String, Object>> chat(@RequestBody Map<String, String> request) {
        String question = request.get("question");
        String answer = chatService.chat(question);
        
        return ResponseEntity.ok(Map.of(
            "answer", answer,
            "question", question
        ));
    }

    /**
     * Q&A endpoint that also returns sources.
     */
    @PostMapping("/chat/with-sources")
    public ResponseEntity<Map<String, Object>> chatWithSources(@RequestBody Map<String, String> request) {
        String question = request.get("question");
        Map<String, Object> result = chatService.chatWithSources(question);
        return ResponseEntity.ok(result);
    }

    /**
     * Document upload endpoint.
     */
    @PostMapping("/document/upload")
    public ResponseEntity<Map<String, Object>> uploadDocument(@RequestParam("file") MultipartFile file) 
            throws IOException {
        
        String originalName = file.getOriginalFilename();
        if (originalName == null || originalName.isBlank()) {
            return ResponseEntity.badRequest().body(Map.of("message", "Missing file name"));
        }
        // Keep only the last path segment so a name like "../../x" cannot
        // escape the upload directory
        String filename = Paths.get(originalName).getFileName().toString();

        Path uploadDir = Paths.get("./knowledge-base/uploads");
        Files.createDirectories(uploadDir);
        
        Path filePath = uploadDir.resolve(filename);
        Files.write(filePath, file.getBytes());
        
        documentService.loadDocument(filePath.toAbsolutePath().toString());
        
        return ResponseEntity.ok(Map.of(
            "message", "Document uploaded",
            "filename", filename
        ));
    }

    /**
     * Loads every document under a directory.
     */
    @PostMapping("/document/load-directory")
    public ResponseEntity<Map<String, Object>> loadDirectory(@RequestBody Map<String, String> request) 
            throws IOException {
        String directoryPath = request.get("path");
        documentService.loadDirectory(directoryPath);
        
        return ResponseEntity.ok(Map.of(
            "message", "Documents loaded",
            "path", directoryPath
        ));
    }

    /**
     * Similar document search.
     */
    @GetMapping("/search")
    public ResponseEntity<Map<String, Object>> search(
            @RequestParam String query,
            @RequestParam(defaultValue = "5") int topK) {
        
        List<String> results = chatService.searchSimilarDocuments(query, topK);
        return ResponseEntity.ok(Map.of(
            "query", query,
            "results", results,
            "count", results.size()
        ));
    }
}
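The upload endpoint writes a client-supplied file name to disk, which is a classic path traversal risk. A stricter check than trimming the name is to resolve the target and confirm it still lies inside the upload directory. The following is a minimal sketch under stated assumptions: `UploadPaths` and `safeResolve` are hypothetical names, and the policy (keep the last path segment, reject anything that resolves outside the base) is one reasonable choice among several.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that maps a client-supplied file name to a path inside the upload directory. */
final class UploadPaths {

    static Path safeResolve(Path uploadDir, String originalName) {
        if (originalName == null || originalName.isBlank()) {
            throw new IllegalArgumentException("File name is missing");
        }
        Path base = uploadDir.toAbsolutePath().normalize();
        // Keep only the last segment: "../../etc/passwd" becomes "passwd"
        Path name = Paths.get(originalName).getFileName();
        if (name == null) {
            throw new IllegalArgumentException("Illegal file name: " + originalName);
        }
        Path target = base.resolve(name).normalize();
        // Names such as ".." or "." normalize to the base itself or its parent; reject them
        if (!target.startsWith(base) || target.equals(base)) {
            throw new IllegalArgumentException("Illegal file name: " + originalName);
        }
        return target;
    }

    public static void main(String[] args) {
        Path uploads = Paths.get("./knowledge-base/uploads");
        System.out.println(safeResolve(uploads, "guide.pdf"));
        System.out.println(safeResolve(uploads, "../../etc/passwd"));
        try {
            safeResolve(uploads, "..");
        } catch (IllegalArgumentException ex) {
            System.out.println("Rejected: " + ex.getMessage());
        }
    }
}
```

A production service would usually also restrict file extensions and size, and might generate its own file names instead of trusting the client's.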

# 4.4 Application Entry Point

package com.farerboy.springai.demo;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringAiRagApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringAiRagApplication.class, args);
    }
}

# 5. API Reference

# 5.1 Q&A Endpoint

# Basic RAG Q&A
curl -X POST http://localhost:8080/api/rag/chat \
  -H "Content-Type: application/json" \
  -d '{"question": "What is Spring AI?"}'

Example response:

{
  "answer": "Spring AI is an application framework for AI engineering...",
  "question": "What is Spring AI?"
}

# 5.2 Q&A with Sources

# Get the answer together with its reference sources
curl -X POST http://localhost:8080/api/rag/chat/with-sources \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I configure the Milvus vector store?"}'

# 5.3 Document Upload

# Upload a knowledge base document
curl -X POST http://localhost:8080/api/rag/document/upload \
  -F "file=@./docs/spring-ai-guide.pdf"

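# 5.4 Similar Document Search

# Search for similar documents (-G sends a GET; --data-urlencode encodes the space in the query)
curl -G http://localhost:8080/api/rag/search \
  --data-urlencode "query=Spring AI configuration" \
  --data-urlencode "topK=3"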

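# 6. Best Practices

# 6.1 Chunking Strategy

| Scenario | Suggested parameters (tokens) |
| --- | --- |
| General documents | chunk=512, overlap=128 |
| Technical documents | chunk=1024, overlap=256 |
| Q&A content | chunk=256, overlap=64 |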
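To make the chunk and overlap parameters concrete, here is a small sliding-window sketch. It is independent of any Spring AI splitter class: `OverlapChunker` is a hypothetical name, whitespace-separated words stand in for model tokens, and whether a particular splitter exposes an overlap setting should be checked against its own API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sliding-window chunker; words stand in for model tokens. */
final class OverlapChunker {

    /** Splits tokens into windows of `size`, each sharing `overlap` tokens with the previous one. */
    static List<List<String>> chunk(List<String> tokens, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<List<String>> chunks = new ArrayList<>();
        int step = size - overlap; // how far the window advances each time
        for (int start = 0; start < tokens.size(); start += step) {
            int end = Math.min(start + size, tokens.size());
            chunks.add(List.copyOf(tokens.subList(start, end)));
            if (end == tokens.size()) {
                break; // the last window already reaches the end of the input
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a b c d e f g h i j".split(" "));
        // size=4, overlap=1: each chunk starts 3 tokens after the previous one
        chunk(tokens, 4, 1).forEach(System.out::println);
    }
}
```

With chunk=512 and overlap=128 the window advances 384 tokens at a time, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.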

# 6.2 Similarity Threshold

  • Strict: 0.8 and above (high precision)
  • Balanced: 0.6-0.8 (recommended)
  • Loose: 0.4-0.6 (high recall)
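The effect of the threshold is easiest to see on a toy example. The sketch below computes cosine similarity by hand and keeps only documents at or above a threshold; `SimilarityFilter` is a hypothetical helper, the two-dimensional vectors are made up, and how a given vector store maps its raw distance onto the 0-1 score is store-specific.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of threshold filtering on cosine similarity. */
final class SimilarityFilter {

    /** Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the ids of documents whose similarity to the query is at least the threshold. */
    static List<String> filter(Map<String, double[]> docs, double[] query, double threshold) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, double[]> e : docs.entrySet()) {
            if (cosine(query, e.getValue()) >= threshold) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] query = {1, 0};
        Map<String, double[]> docs = Map.of("doc-45-degrees", new double[]{1, 1});
        // The document is at 45 degrees to the query, a similarity of about 0.707
        System.out.println("strict (0.8):   " + filter(docs, query, 0.8));
        System.out.println("balanced (0.6): " + filter(docs, query, 0.6));
    }
}
```

The same document is dropped in strict mode and kept in balanced mode, which is the precision versus recall trade-off the list above describes.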

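# 6.3 Choosing an Index Type

| Type | Characteristics | Suitable for |
| --- | --- | --- |
| IVF_FLAT | High accuracy, fast retrieval | Small to medium datasets |
| HNSW | Very fast search, high memory usage | Large datasets, latency-sensitive workloads |
| IVF_SQ8 | Compressed storage, reduced precision | Very large datasets |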

# 7. FAQ

# Q1: Nothing relevant is retrieved?

  • Check that the documents were loaded successfully
  • Adjust the chunk size and overlap
  • Lower the similarity threshold

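# Q2: The LLM hallucinates?

  • Raise the similarity threshold (0.7 or higher is suggested)
  • Use the Q&A mode that returns sources
  • Emphasize "answer based on the knowledge base" in the prompt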
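One way to combine these measures is to build a strictly grounded prompt and to skip the model call entirely when retrieval returns nothing. The sketch below is an illustration under assumptions: `GroundedPrompt`, `NO_CONTEXT_REPLY`, and the prompt wording are hypothetical, and the retrieved chunks are passed in as plain strings.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical prompt builder that refuses to produce a prompt without retrieved context. */
final class GroundedPrompt {
    static final String NO_CONTEXT_REPLY = "No relevant information was found in the knowledge base.";

    /** Returns a grounded prompt, or empty when no chunk passed the similarity threshold. */
    static Optional<String> build(List<String> chunks, String question) {
        if (chunks.isEmpty()) {
            return Optional.empty(); // caller should answer with NO_CONTEXT_REPLY instead of calling the LLM
        }
        StringBuilder sb = new StringBuilder(
            "Answer using ONLY the knowledge base excerpts below. "
            + "If they do not contain the answer, say so.\n\n");
        for (int i = 0; i < chunks.size(); i++) {
            sb.append("[").append(i + 1).append("] ").append(chunks.get(i)).append("\n");
        }
        sb.append("\nQuestion: ").append(question);
        return Optional.of(sb.toString());
    }

    public static void main(String[] args) {
        System.out.println(build(List.of("Milvus uses port 19530."), "Which port does Milvus use?")
            .orElse(NO_CONTEXT_REPLY));
        System.out.println(build(List.of(), "What is the CEO's birthday?")
            .orElse(NO_CONTEXT_REPLY));
    }
}
```

Numbering the excerpts also gives the model something concrete to cite, which makes unsupported statements easier to spot in the answer.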

# Q3: Responses are slow?

  • Use the HNSW index
  • Enable batch processing
  • Consider deploying the model locally
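The article does not spell out what batch processing means here. One plausible reading, for the ingestion side, is to send chunks to the embedding model and vector store in fixed-size batches instead of one request per chunk or one oversized request. The sketch below shows only the partitioning step; `Batches`, `partition`, and the batch size are hypothetical, and any real limit depends on the embedding API in use.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits a list into consecutive fixed-size batches. */
final class Batches {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(List.copyOf(items.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("c1", "c2", "c3", "c4", "c5");
        // In the real service each batch would be passed to one vectorStore.add(...) call
        for (List<String> batch : partition(chunks, 2)) {
            System.out.println("would store batch: " + batch);
        }
    }
}
```

Batching mainly speeds up document loading; query latency is more affected by the index type and by the model's generation time.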

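# 8. Summary

This article described how to build a private knowledge base Q&A system with Spring AI, Milvus, and Zhipu GLM-5. The approach has the following advantages:

  1. Domestic model support: Zhipu GLM-5 is a leading Chinese large language model with fast response times
  2. High-performance retrieval: the Milvus vector database supports millisecond-level retrieval over large datasets
  3. Spring ecosystem: a low barrier to entry for Java developers, who can get started quickly
  4. Flexible extension: the design can be extended with multi-tenancy, hybrid search, reranking, and other advanced features

With the code examples and configuration guide in this article, you can quickly set up your own private knowledge base Q&A system and make AI a practical assistant for enterprise knowledge.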