Junki · Published on 2025-02-11

An Approach to Building a Private Knowledge Base with Ollama and Spring AI

Using Ollama as the Large Language Model Foundation

Official site: https://ollama.com/

GitHub: https://github.com/ollama/ollama

Installing Ollama

Downloads for each platform: https://ollama.com/download

It can also be deployed with Docker: https://hub.docker.com/r/ollama/ollama

The installation steps themselves are omitted here...
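
For reference, on Linux the download page provides a one-line install script (macOS and Windows use the packaged installers instead):

curl -fsSL https://ollama.com/install.sh | sh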

Running deepseek-r1 as the Chat Model

Available models can be browsed in the Ollama model library: https://ollama.com/library

This article uses the deepseek-r1:1.5b model as an example; running the command below pulls the model automatically:

ollama run deepseek-r1:1.5b

Running nomic-embed-text as the Embedding Model

An embedding model converts text into vectors, which is the basis for building the knowledge base.

All available embedding models can be filtered in the Ollama model library: https://ollama.com/search?c=embedding

Here the nomic-embed-text model is used as an example; running the command below pulls it automatically:

ollama run nomic-embed-text
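
Once both models have been pulled, you can confirm they are available locally:

ollama list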

Getting to Know the REST API

Official documentation: https://github.com/ollama/ollama/blob/main/docs/api.md

Ollama's API listens on port 11434 by default. Remote access is disabled by default; see the official FAQ for how to enable it: https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-configure-ollama-server
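
As a minimal sketch, the FAQ's approach is to change the bind address through the OLLAMA_HOST environment variable before starting the server (how you set it depends on how Ollama is run, e.g. systemd or Docker):

# Bind the API to all interfaces instead of 127.0.0.1 only
export OLLAMA_HOST=0.0.0.0:11434
ollama serve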

Calling the deepseek-r1:1.5b chat model:

curl --location --request POST 'http://localhost:11434/api/chat' \
--data-raw '{
    "model": "deepseek-r1:1.5b",
    "stream": false,
    "messages": [
        {
            "role": "user",
            "content": "你好"
        }
    ]
}'

The response looks like this:

{
  "model": "deepseek-r1:1.5b",
  "created_at": "2025-02-11T07:39:15.630949522Z",
  "message": {
    "role": "assistant",
    "content": "<think>\n\n</think>\n\n你好!很高兴见到你,有什么我可以帮忙的吗?无论是问题、建议还是闲聊,我都在这里为你服务。😊"
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 2271233241,
  "load_duration": 21222115,
  "prompt_eval_count": 4,
  "prompt_eval_duration": 75333000,
  "eval_count": 32,
  "eval_duration": 2133681000
}

To try streaming responses, set stream to true, as in the example below.
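
This is the same request with streaming switched on; each chunk arrives as a separate JSON object, and the final chunk has "done": true:

curl --location --request POST 'http://localhost:11434/api/chat' \
--data-raw '{
    "model": "deepseek-r1:1.5b",
    "stream": true,
    "messages": [
        {
            "role": "user",
            "content": "你好"
        }
    ]
}'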

Calling the nomic-embed-text embedding model:

curl --location --request POST 'http://localhost:11434/api/embed' \
--data-raw '{
    "model": "nomic-embed-text",
    "input": "测试文本"
}'

The response looks like this:

{
  "model": "nomic-embed-text",
  "embeddings": [
    [
      0.03132133,
      "省略..."
    ]
  ],
  "total_duration": 3125538250,
  "load_duration": 2976165401,
  "prompt_eval_count": 4
}

Testing shows that nomic-embed-text returns 768-dimensional vectors.
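
A quick way to check the dimension from the command line (assuming jq is installed):

curl -s --location --request POST 'http://localhost:11434/api/embed' \
--data-raw '{"model": "nomic-embed-text", "input": "测试文本"}' | jq '.embeddings[0] | length'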

Using Elasticsearch as the Vector Database

For integrations with other vector databases, see: https://docs.spring.io/spring-ai/reference/api/vectordbs.html

Installing Docker

This article deploys an Elasticsearch cluster with docker-compose; for other installation methods, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.17/install-elasticsearch.html

For installing Docker itself, see the official documentation: https://docs.docker.com/get-started/

Writing the docker-compose Configuration

In a directory of your choice, create a .env file:

# Password for the kibana_system account (at least six characters). This account is only used
# for some of Kibana's internal setup and cannot be used to query Elasticsearch.
KIBANA_PASSWORD=abcdef

# Elasticsearch and Kibana version
STACK_VERSION=8.13.3

# Cluster name
CLUSTER_NAME=docker-cluster

# Host port that Elasticsearch is mapped to
ES_PORT=9200

# Host port that Kibana is mapped to
KIBANA_PORT=5601

# Memory limit for each Elasticsearch container; adjust to your hardware
MEM_LIMIT=1073741824

# Compose project name, used as the prefix of the container names
COMPOSE_PROJECT_NAME=es-cluster

In the same directory, create a docker-compose.yml file:

version: "2.2"

services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - esdata01:/usr/share/elasticsearch/data
    ports:
      - ${ES_PORT}:9200
    environment:
      - node.name=es01
      - cluster.name=${CLUSTER_NAME}
      - cluster.initial_master_nodes=es01,es02,es03
      - discovery.seed_hosts=es02,es03
      - bootstrap.memory_lock=true
      - xpack.security.enabled=false
      - xpack.security.http.ssl.enabled=false
      - xpack.security.transport.ssl.enabled=false
    mem_limit: ${MEM_LIMIT}
    ulimits:
      memlock:
        soft: -1
        hard: -1

  es02:
    depends_on:
      - es01
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - esdata02:/usr/share/elasticsearch/data
    environment:
      - node.name=es02
      - cluster.name=${CLUSTER_NAME}
      - cluster.initial_master_nodes=es01,es02,es03
      - discovery.seed_hosts=es01,es03
      - bootstrap.memory_lock=true
      - xpack.security.enabled=false
      - xpack.security.http.ssl.enabled=false
      - xpack.security.transport.ssl.enabled=false
    mem_limit: ${MEM_LIMIT}
    ulimits:
      memlock:
        soft: -1
        hard: -1

  es03:
    depends_on:
      - es02
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - esdata03:/usr/share/elasticsearch/data
    environment:
      - node.name=es03
      - cluster.name=${CLUSTER_NAME}
      - cluster.initial_master_nodes=es01,es02,es03
      - discovery.seed_hosts=es01,es02
      - bootstrap.memory_lock=true
      - xpack.security.enabled=false
      - xpack.security.http.ssl.enabled=false
      - xpack.security.transport.ssl.enabled=false
    mem_limit: ${MEM_LIMIT}
    ulimits:
      memlock:
        soft: -1
        hard: -1

  kibana:
    image: docker.elastic.co/kibana/kibana:${STACK_VERSION}
    volumes:
      - kibanadata:/usr/share/kibana/data
    ports:
      - ${KIBANA_PORT}:5601
    environment:
      - SERVERNAME=kibana
      - ELASTICSEARCH_HOSTS=http://es01:9200
      - ELASTICSEARCH_USERNAME=kibana_system
      - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
    mem_limit: ${MEM_LIMIT}

volumes:
  esdata01:
    driver: local
  esdata02:
    driver: local
  esdata03:
    driver: local
  kibanadata:
    driver: local

Compared with the official example, the Elasticsearch passwords and SSL configuration have been removed here. Official reference: https://www.elastic.co/guide/en/elasticsearch/reference/8.17/docker.html#docker-compose-file

Starting the Containers

In the directory containing the compose files, run:

docker compose up -d
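
Once the containers are up, a quick sanity check (security is disabled in this setup, so no credentials are needed):

# List the containers started by this compose project
docker compose ps

# Elasticsearch should answer with cluster and version information
curl http://localhost:9200

# Cluster health should eventually report "green" with three nodes
curl 'http://localhost:9200/_cluster/health?pretty'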

Connecting Kibana to Elasticsearch

Kibana is the official visualization tool for Elasticsearch.

Kibana is available at http://localhost:5601; the username and password are kibana_system and abcdef. On first use you may be asked for an enrollment token and a verification code; follow the prompts and fetch them from the corresponding containers, as shown below.
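
For example, assuming the default container names generated by this compose project (es-cluster-es01-1 and es-cluster-kibana-1), the token and code can be fetched like this:

# Generate a Kibana enrollment token from an Elasticsearch node
docker exec -it es-cluster-es01-1 \
  /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana

# Read the verification code from the Kibana container
docker exec -it es-cluster-kibana-1 \
  /usr/share/kibana/bin/kibana-verification-code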

For a Kibana tutorial, see the official documentation: https://www.elastic.co/guide/en/kibana/current/index.html

Building the Spring AI Project

Creating the Project

The Spring AI reference documentation is strongly recommended as a starting point; it explains in detail how to integrate each module.

Official documentation: https://docs.spring.io/spring-ai/reference/getting-started.html

Notes:

  • The Spring AI artifacts are not yet published to Maven Central, so the Spring milestone/snapshot repositories must be configured (as in the pom.xml below).
  • Spring Boot 3.2.x or 3.3.x is required.
  • JDK 17 or later is required.

Reference pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.3.4</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>

    <groupId>cn.junki</groupId>
    <artifactId>spring-ai-server</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>spring-ai-server</name>
    <description>spring-ai-server</description>

    <properties>
        <java.version>17</java.version>
    </properties>

    <repositories>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
        <repository>
            <id>spring-snapshots</id>
            <name>Spring Snapshots</name>
            <url>https://repo.spring.io/snapshot</url>
            <releases>
                <enabled>false</enabled>
            </releases>
        </repository>
    </repositories>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-webflux</artifactId>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>

        <!-- https://docs.spring.io/spring-ai/reference/api/chat/ollama-chat.html -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
        </dependency>


        <!-- https://docs.spring.io/spring-ai/reference/api/vectordbs/elasticsearch.html -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-elasticsearch-store-spring-boot-starter</artifactId>
        </dependency>
        <dependency>
            <groupId>co.elastic.clients</groupId>
            <artifactId>elasticsearch-java</artifactId>
            <version>8.13.3</version>
        </dependency>

    </dependencies>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>1.0.0-M5</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>

</project>

Note in particular that spring-ai-bom 1.0.0-M5 is used here; it fixes an Elasticsearch-related bug.

Configuring application.yml

server:
  port: 8080

spring:
  ai:
    ollama:
      base-url: "http://localhost:11434"
      chat:
        options:
          # Chat model
          model: "deepseek-r1:1.5b"
      embedding:
        options:
          # Embedding model
          model: "nomic-embed-text"
    vectorstore:
      # Use Elasticsearch as the vector store
      elasticsearch:
        # Create the index schema on startup
        initialize-schema: true
        # Index name for the knowledge base
        index-name: knowledge-base-index
        # Vector dimension; nomic-embed-text returns 768-dimensional vectors
        dimensions: 768
        # Similarity function; cosine similarity is used here
        similarity: cosine
  elasticsearch:
    uris:
      - http://localhost:9200

Writing a Test Controller

package cn.junki.springaiserver.controller;

import jakarta.annotation.Resource;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.document.Document;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.ai.ollama.OllamaEmbeddingModel;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.http.MediaType;
import org.springframework.http.codec.ServerSentEvent;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;

import java.util.ArrayList;
import java.util.List;

/**
 * AI capability controller
 *
 * @author Junki
 * @since 2025-02-10
 */
@CrossOrigin
@RestController
@RequestMapping("/ai")
public class AiController {

    /**
     * Ollama chat model
     */
    @Resource
    private OllamaChatModel chatModel;

    /**
     * Ollama embedding model
     */
    @Resource
    private OllamaEmbeddingModel embeddingModel;

    /**
     * Vector store
     */
    @Resource
    private VectorStore vectorStore;

    /**
     * Prompt template for the knowledge-base advisor
     */
    private static final String USER_TEXT_ADVISE = """
            参考以下知识进行回答:
            
            ---------------------
            {question_answer_context}
            ---------------------
            
            """;

    /**
     * Knowledge-base advisor
     *
     * @return advisor instance
     */
    private QuestionAnswerAdvisor getQuestionAnswerAdvisor() {
        return QuestionAnswerAdvisor.builder(vectorStore)
                .userTextAdvise(USER_TEXT_ADVISE)
                .searchRequest(
                        SearchRequest.builder()
                                .similarityThreshold(0.8d)
                                .topK(6)
                                .build()
                )
                .build();
    }

    /**
     * Synchronous chat endpoint
     *
     * @param message user message
     * @return chat response
     */
    @GetMapping("/call")
    public ChatResponse call(@RequestParam String message) {
        return ChatClient.builder(chatModel)
                .build()
                .prompt()
                .advisors(getQuestionAnswerAdvisor())
                .user(message)
                .call()
                .chatResponse();
    }

    /**
     * Reactive SSE chat endpoint
     *
     * @param message user message
     * @return SSE stream of chat responses
     */
    @GetMapping(path = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent<ChatResponse>> stream(@RequestParam String message) {
        return ChatClient.builder(chatModel)
                .build()
                .prompt()
                .advisors(getQuestionAnswerAdvisor())
                .user(message)
                .stream()
                .chatResponse()
                .map(response -> ServerSentEvent.<ChatResponse>builder().data(response).build());
    }

    /**
     * Reactive SSE chat endpoint (text content only)
     *
     * @param message user message
     * @return SSE stream of text chunks
     */
    @GetMapping(path = "/stream/text", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent<String>> streamText(@RequestParam String message) {
        return ChatClient.builder(chatModel)
                .build()
                .prompt()
                .advisors(getQuestionAnswerAdvisor())
                .user(message)
                .stream()
                .chatResponse()
                .map(response -> ServerSentEvent.<String>builder().data(response.getResult().getOutput().getText()).build());
    }

    /**
     * Embedding endpoint
     *
     * @param text text to embed
     * @return embedding vector
     */
    @GetMapping("/embed")
    public float[] embed(@RequestParam String text) {
        return embeddingModel.embed(text);
    }

    /**
     * Add a knowledge text
     *
     * @param text text to add to the knowledge base
     */
    @GetMapping("/knowledge/add")
    public void knowledgeAdd(@RequestParam String text) {
        List<Document> documents = new ArrayList<>();
        documents.add(Document.builder().text(text).build());

        vectorStore.add(documents);
    }

    /**
     * Knowledge-base retrieval
     *
     * @param question user question
     * @return matching documents
     */
    @GetMapping("/knowledge/search")
    public List<Document> knowledgeSearch(@RequestParam String question) {
        return vectorStore.similaritySearch(
                SearchRequest.builder()
                        .query(question)
                        .topK(6)
                        .build()
        );
    }

}

Testing the Endpoints

The detailed testing walkthrough is omitted here; a few example requests are sketched below.
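
These curl calls exercise the endpoints defined above (port 8080, as configured in application.yml); the sample text and questions are just placeholders:

# Add a piece of text to the knowledge base
curl -G 'http://localhost:8080/ai/knowledge/add' --data-urlencode 'text=Spring AI supports Ollama'

# Retrieve documents similar to a question
curl -G 'http://localhost:8080/ai/knowledge/search' --data-urlencode 'question=What does Spring AI support?'

# Blocking chat with knowledge-base context
curl -G 'http://localhost:8080/ai/call' --data-urlencode 'message=What does Spring AI support?'

# Streaming chat over SSE, text content only
curl -N -G 'http://localhost:8080/ai/stream/text' --data-urlencode 'message=What does Spring AI support?'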

