Enterprise Java + LangChain4j RAG System: Rate Limiting, Circuit Breaking, and Fallback

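1. About This Document

This document is based on an enterprise AI knowledge-base RAG project built with Spring Boot 3 + LangChain4j + Milvus/Chroma + MySQL + Redis. It brings together the mainstream approaches to API rate limiting, circuit breaking, and fallback, with complete source code, configuration, selection guidelines by scenario, production rollout standards, and key interview topics.

The code is meant as a replacement for Sentinel without dependency conflicts, and it covers the project's AI Q&A, document parsing, and chunked large-file upload scenarios.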

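2. Core Concepts (Essential for Production)

2.1 Rate Limiting (RateLimit)

Caps the QPS of an endpoint so that excessive request volume cannot overwhelm the service. It addresses traffic spikes and abusive request flooding.

Applies to: high-frequency AI Q&A, document upload, and batch parsing endpoints.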

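2.2 Circuit Breaking (CircuitBreaker)

When a dependency (LLM API, vector store, database) times out or its error rate climbs too high, the breaker automatically cuts off requests to it. This keeps requests from piling up and threads from blocking, and prevents a cascading failure (service avalanche).

A must for AI projects: LLM APIs are unstable, slow to respond, and prone to timeouts.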
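The state transitions behind a circuit breaker can be sketched in plain Java. The class below is a simplified, count-based illustration (CLOSED, OPEN, HALF_OPEN), not Resilience4j's implementation: the class name, constructor parameters, and the rule that any failed trial call reopens the circuit are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

/** Simplified count-based circuit breaker (illustrative sketch only). */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;              // number of recent calls evaluated
    private final double failureRateThreshold; // percentage, e.g. 50.0
    private final long openMillis;             // time to stay OPEN before probing
    private final int halfOpenCalls;           // trial calls allowed in HALF_OPEN
    private final LongSupplier clock;          // injectable millisecond clock

    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;
    private int trialsLeft;
    private int trialSuccesses;

    SimpleCircuitBreaker(int windowSize, double failureRateThreshold,
                         long openMillis, int halfOpenCalls, LongSupplier clock) {
        this.windowSize = windowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.openMillis = openMillis;
        this.halfOpenCalls = halfOpenCalls;
        this.clock = clock;
    }

    /** Returns true if a call may go through right now. */
    synchronized boolean tryAcquirePermission() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;           // wait time elapsed: allow trial calls
            trialsLeft = halfOpenCalls;
            trialSuccesses = 0;
        }
        if (state == State.OPEN) {
            return false;
        }
        if (state == State.HALF_OPEN) {
            if (trialsLeft == 0) {
                return false;
            }
            trialsLeft--;
        }
        return true;
    }

    /** Records the outcome of a call that was allowed through. */
    synchronized void record(boolean failed) {
        if (state == State.OPEN) {
            return;                            // late results are ignored
        }
        if (state == State.HALF_OPEN) {
            if (failed) {
                open();                        // assumption: one failed trial reopens
            } else if (++trialSuccesses == halfOpenCalls) {
                state = State.CLOSED;          // all trial calls succeeded
                window.clear();
            }
            return;
        }
        window.addLast(failed);
        if (window.size() > windowSize) {
            window.removeFirst();
        }
        if (window.size() == windowSize) {
            long failures = window.stream().filter(f -> f).count();
            if (failures * 100.0 / windowSize >= failureRateThreshold) {
                open();
            }
        }
    }

    private void open() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
        window.clear();
    }

    synchronized State state() {
        return state;
    }
}
```

With a window of 10 calls and a 50% threshold, the breaker opens once 5 of the last 10 recorded calls have failed. The injectable clock is only there so the wait period can be exercised in tests without sleeping.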

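2.3 Degradation (Fallback)

When a service is circuit-broken or throws an exception, return a predefined fallback result so that the service stays available, does not surface raw errors, and does not cascade into wider failure.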
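At its core, a fallback is "try the real call, and return a canned answer if it fails". A minimal plain-Java sketch, assuming a hypothetical helper named `withFallback` (not part of any library used in this project):

```java
import java.util.function.Function;
import java.util.function.Supplier;

class FallbackSketch {
    /** Runs the primary call; if it throws a runtime exception, returns the fallback result. */
    static <T> T withFallback(Supplier<T> primary, Function<Throwable, T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            // The caller receives a predefined answer instead of the exception.
            return fallback.apply(e);
        }
    }
}
```

A call such as `FallbackSketch.<String>withFallback(() -> callModel(), e -> "AI service is busy")` (where `callModel` stands in for the real LLM call) returns the model's answer when the call succeeds and the fixed message when it throws.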

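3. Capability Comparison of Mainstream Solutions

Each solution is compared on rate limiting, circuit breaking/fallback, distributed cluster support, performance, operations cost, and typical use.

  • Resilience4j: rate limiting and circuit breaking/fallback: ✅ all-in-one; distributed cluster: requires Redis; performance: very high; operations cost: very low; typical use: monolith or microservices, first choice for AI projects

  • Sentinel: operations cost: medium (requires a console); typical use: Alibaba ecosystem, or when dynamic rules need a visual console

  • Redisson: rate limiting: ✅ distributed; distributed cluster: ✅ strong fit; operations cost: very low; typical use: cluster-wide global rate limiting

  • Bucket4j: rate limiting: ✅ high performance; distributed cluster: supports Redis; performance: extremely high; operations cost: very low; typical use: large-file upload, high-concurrency endpoints

  • Guava RateLimiter: rate limiting: ✅ single instance; performance: very high; operations cost: 0; typical use: small internal projects, simple anti-abuse protection

  • Redis+Lua: rate limiting: ✅ native; operations cost: very low; typical use: no framework, minimal technology stack

  • Hystrix: typical use: legacy projects only; not for new projects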

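4. Selection Decision Flow (Reusable)

Start → decide whether circuit breaking/fallback and timeout protection against cascading failure are needed

4.1 No circuit breaking needed (rate limiting only)

  • Small single-instance project → Guava RateLimiter

  • High concurrency / large-file upload → Bucket4j

  • Multi-instance cluster → Redisson / Redis+Lua

  • Self-built, no framework → Redis+Lua

4.2 Circuit breaking and fallback needed (required for AI projects)

  • Alibaba ecosystem, visual console needed → Sentinel

  • Outside the Alibaba ecosystem, lightweight, no lock-in → Resilience4j

  • Single-instance deployment → Resilience4j for everything

  • Cluster deployment → Resilience4j (circuit breaking) + Redisson (distributed rate limiting)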
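The decision flow above can be written down as a small function. This is only a sketch: the class and parameter names are hypothetical, and the order in which the rate-limiting-only branches are checked (no framework first, then cluster, then high concurrency) is an assumption, since the flow does not rank them.

```java
class LimiterSelector {
    /** Maps the decision flow to a recommendation string. */
    static String recommend(boolean needsCircuitBreaking,
                            boolean alibabaEcosystem,
                            boolean clustered,
                            boolean zeroFramework,
                            boolean highConcurrency) {
        if (needsCircuitBreaking) {
            if (alibabaEcosystem) {
                return "Sentinel";
            }
            // Cluster: Resilience4j for circuit breaking, Redisson for distributed limiting.
            return clustered ? "Resilience4j + Redisson" : "Resilience4j";
        }
        // Rate limiting only (branch precedence is an assumption of this sketch).
        if (zeroFramework) {
            return "Redis+Lua";
        }
        if (clustered) {
            return "Redisson";
        }
        if (highConcurrency) {
            return "Bucket4j";
        }
        return "Guava RateLimiter";
    }
}
```

For this RAG project (circuit breaking needed, not tied to the Alibaba ecosystem, deployed as a cluster) the function returns "Resilience4j + Redisson".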

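5. Final Production Stack for This RAG Project

A fixed choice, so the selection does not have to be revisited

  • Circuit breaking, fallback, timeouts, cascading-failure protection: Resilience4j (integrated through Spring Cloud Circuit Breaker, no vendor lock-in)

  • Single-instance endpoint QPS limiting: Resilience4j built-in rate limiter

  • Cluster-wide distributed rate limiting: Redisson

  • High-concurrency limiting for chunked large-file upload: Bucket4j

  • Minimal alternatives: Guava, Redis+Lua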

6. Full Project Dependencies (pom.xml)

Combines all the rate limiting and circuit breaking options with the core RAG dependencies; it can replace the existing pom directly

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.6</version>
        <relativePath/>
    </parent>

    <groupId>com.ai</groupId>
    <artifactId>langchain4j-enterprise-rag</artifactId>
    <version>1.0.0</version>
    <name>LangChain4j Enterprise RAG System</name>

    <properties>
        <java.version>17</java.version>
        <langchain4j.version>0.32.0</langchain4j.version>
        <mybatis-plus.version>3.5.5</mybatis-plus.version>
        <fastjson2.version>2.0.52</fastjson2.version>
    </properties>

    <dependencies>
        <!-- Spring Boot basics -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- Required for the Resilience4j annotations (@CircuitBreaker, @RateLimiter) to take effect -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-aop</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <!-- Database -->
        <dependency>
            <groupId>com.mysql</groupId>
            <artifactId>mysql-connector-j</artifactId>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>com.baomidou</groupId>
            <artifactId>mybatis-plus-boot-starter</artifactId>
            <version>${mybatis-plus.version}</version>
        </dependency>

        <!-- LangChain4j RAG core -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-spring-boot-starter</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <!-- Tongyi Qianwen (Qwen) integration; LangChain4j publishes it as langchain4j-dashscope -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-dashscope</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-milvus</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-chroma</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-document-parser-apache-tika</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>

        <!-- Rate limiting and circuit breaking -->
        <!-- The version should match the Spring Cloud release train for the Spring Boot version in use -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-circuitbreaker-resilience4j</artifactId>
            <version>3.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.redisson</groupId>
            <artifactId>redisson-spring-boot-starter</artifactId>
            <version>3.27.0</version>
        </dependency>
        <dependency>
            <groupId>com.github.vladimir-bukhtoyarov</groupId>
            <artifactId>bucket4j-core</artifactId>
            <version>7.6.0</version>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>32.1.3-jre</version>
        </dependency>

        <!-- Utilities -->
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>com.alibaba.fastjson2</groupId>
            <artifactId>fastjson2</artifactId>
            <version>${fastjson2.version}</version>
        </dependency>
        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-all</artifactId>
            <version>5.8.30</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

7. Global Configuration File (application-dev.yml)

spring:
  datasource:
    url: jdbc:mysql://127.0.0.1:3306/rag_db?useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghai&allowMultiQueries=true
    username: root
    password: 123456
    driver-class-name: com.mysql.cj.jdbc.Driver
  data:
    redis:
      host: 127.0.0.1
      port: 6379
      password:
      database: 0

# Tongyi Qianwen (Qwen) LLM settings
langchain4j:
  tongyi:
    api-key: sk-xxx-your-key-xxx
    model-name: qwen-turbo
    timeout: 60s
  # Vector store settings
  milvus:
    host: 127.0.0.1
    port: 19530
    collection-name: enterprise_knowledge
  chroma:
    host: 127.0.0.1
    port: 8000

# Resilience4j circuit breaker and rate limiter settings
resilience4j:
  circuitbreaker:
    instances:
      aiChatCircuit:
        slidingWindowSize: 10                    # evaluate the last 10 calls
        failureRateThreshold: 50                 # open at a 50% failure rate
        waitDurationInOpenState: 10s             # stay open for 10 seconds
        permittedNumberOfCallsInHalfOpenState: 3 # trial calls while half-open
      uploadCircuit:
        slidingWindowSize: 10
        failureRateThreshold: 50
  ratelimiter:
    instances:
      aiChatLimit:
        limitForPeriod: 5        # 5 permits per refresh period
        limitRefreshPeriod: 1s
        timeoutDuration: 2s      # wait up to 2 seconds for a permit
      uploadLimit:
        limitForPeriod: 2
        limitRefreshPeriod: 1s

8. Configuration Classes (Full Source)

8.1 Bucket4jConfig.java: High-Performance Rate Limiting Configuration

package com.ai.rag.config;

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Refill;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.time.Duration;

@Configuration
public class Bucket4jConfig {

    // AI chat: capacity 5, refilled greedily at 5 tokens per second
    @Bean
    public Bucket aiChatBucket() {
        Bandwidth bandwidth = Bandwidth.classic(5, Refill.greedy(5, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }

    // Upload: capacity 2, refilled greedily at 2 tokens per second
    @Bean
    public Bucket uploadBucket() {
        Bandwidth bandwidth = Bandwidth.classic(2, Refill.greedy(2, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }
}
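To make the token-bucket idea behind `Bandwidth.classic(5, Refill.greedy(5, Duration.ofSeconds(1)))` concrete, here is a plain-Java sketch of a bucket with capacity 5 that gains tokens continuously at 5 per second. It approximates the behaviour for illustration and is not Bucket4j's implementation; the class name and the injectable nanosecond clock are assumptions of this example.

```java
import java.util.function.LongSupplier;

/** Illustrative token bucket with continuous ("greedy") refill. */
class TokenBucketSketch {
    private final long capacity;          // maximum tokens the bucket can hold
    private final long refillTokens;      // tokens added per refill period
    private final long refillPeriodNanos; // length of the refill period
    private final LongSupplier nanoClock; // injectable clock, in nanoseconds

    private long available;
    private long lastRefillNanos;

    TokenBucketSketch(long capacity, long refillTokens,
                      long refillPeriodNanos, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillTokens = refillTokens;
        this.refillPeriodNanos = refillPeriodNanos;
        this.nanoClock = nanoClock;
        this.available = capacity;        // start with a full bucket
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes the requested tokens if they are available; never blocks. */
    synchronized boolean tryConsume(long tokens) {
        refill();
        if (available < tokens) {
            return false;
        }
        available -= tokens;
        return true;
    }

    private void refill() {
        long elapsed = nanoClock.getAsLong() - lastRefillNanos;
        // Whole tokens earned since the last refill (5 per second = 1 every 200 ms).
        long newTokens = elapsed * refillTokens / refillPeriodNanos;
        if (newTokens > 0) {
            available = Math.min(capacity, available + newTokens);
            // Advance only by the time that produced whole tokens, keeping the remainder.
            lastRefillNanos += newTokens * refillPeriodNanos / refillTokens;
        }
    }
}
```

With capacity 5 and 5 tokens per second, a burst of 5 calls passes immediately, the 6th is rejected, and one more call is admitted roughly every 200 ms after that.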

8.2 GuavaLimitConfig.java: Lightweight Single-Instance Rate Limiting Configuration

package com.ai.rag.config;

import com.google.common.util.concurrent.RateLimiter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GuavaLimitConfig {

    // AI chat: 5 permits per second
    @Bean
    public RateLimiter aiGuavaLimiter() {
        return RateLimiter.create(5.0);
    }

    // Upload: 2 permits per second
    @Bean
    public RateLimiter uploadGuavaLimiter() {
        return RateLimiter.create(2.0);
    }
}

9. Rate Limiting Utility Classes

9.1 RedisLimitUtil.java: Redisson Distributed Rate Limiting

package com.ai.rag.util;

import jakarta.annotation.Resource; // Spring Boot 3 uses the jakarta namespace, not javax
import org.redisson.api.RRateLimiter;
import org.redisson.api.RateIntervalUnit;
import org.redisson.api.RateType;
import org.redisson.api.RedissonClient;
import org.springframework.stereotype.Component;

@Component
public class RedisLimitUtil {

    @Resource
    private RedissonClient redissonClient;

    public boolean tryLimit(String key, int qps) {
        RRateLimiter limiter = redissonClient.getRateLimiter(key);
        // OVERALL: the rate is shared by all instances using this key.
        // trySetRate only takes effect if no rate has been set for the key yet.
        limiter.trySetRate(RateType.OVERALL, qps, 1, RateIntervalUnit.SECONDS);
        return limiter.tryAcquire(1);
    }
}

9.2 LuaLimitUtil.java: Native Redis+Lua Rate Limiting

package com.ai.rag.util;

import jakarta.annotation.Resource; // Spring Boot 3 uses the jakarta namespace, not javax
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.stereotype.Component;

import java.util.List;

@Component
public class LuaLimitUtil {

    @Resource
    private StringRedisTemplate stringRedisTemplate;

    // Fixed-window counter: increment first, start the 1-second window on the
    // first hit only (so later hits do not keep extending it), and reject once
    // the count exceeds the limit.
    private static final String LUA_SCRIPT =
            "local key = KEYS[1] " +
            "local limit = tonumber(ARGV[1]) " +
            "local curr = redis.call('incr', key) " +
            "if curr == 1 then " +
            "    redis.call('expire', key, 1) " +
            "end " +
            "if curr > limit then " +
            "    return 0 " +
            "end " +
            "return 1";

    // Built once and reused across calls
    private static final DefaultRedisScript<Long> SCRIPT =
            new DefaultRedisScript<>(LUA_SCRIPT, Long.class);

    public boolean tryLimit(String key, int limit) {
        Long result = stringRedisTemplate.execute(SCRIPT, List.of(key), String.valueOf(limit));
        return result != null && result == 1;
    }
}
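The Lua script is a fixed-window counter: a per-key count that expires after one second. The same logic can be mirrored in memory, which is handy for reasoning about edge cases or unit-testing callers without Redis. The sketch below is single-JVM only and is not a substitute for the Redis version; the class name and the injectable millisecond clock are assumptions of this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** In-memory fixed-window counter (illustrative, single JVM only). */
class FixedWindowLimiter {
    private static final class Window {
        long startMillis; // when this window began
        int count;        // requests admitted in this window
    }

    private final Map<String, Window> windows = new HashMap<>();
    private final long windowMillis;
    private final LongSupplier clock; // injectable millisecond clock

    FixedWindowLimiter(long windowMillis, LongSupplier clock) {
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Admits the request if fewer than {@code limit} were admitted in the current window. */
    synchronized boolean tryAcquire(String key, int limit) {
        long now = clock.getAsLong();
        Window w = windows.get(key);
        if (w == null || now - w.startMillis >= windowMillis) {
            // First request for this key, or the previous window has expired.
            w = new Window();
            w.startMillis = now;
            windows.put(key, w);
        }
        if (w.count >= limit) {
            return false;
        }
        w.count++;
        return true;
    }
}
```

One known property of fixed windows is the boundary burst: up to `limit` requests at the end of one window plus `limit` at the start of the next can pass within a short span. If that matters, a token bucket or sliding window is the usual alternative.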

10. Integrated Controller (All 5 Approaches)

Resilience4j circuit breaking and rate limiting are enabled by default; endpoints for the other four approaches are kept alongside so you can switch between them

package com.ai.rag.controller;

import com.ai.rag.common.R;
import com.ai.rag.service.DocumentService;
import com.ai.rag.service.RagQaService;
import com.ai.rag.util.LuaLimitUtil;
import com.ai.rag.util.RedisLimitUtil;
import com.google.common.util.concurrent.RateLimiter;
import io.github.bucket4j.Bucket;
import jakarta.annotation.Resource; // Spring Boot 3 uses the jakarta namespace, not javax
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/rag")
@RequiredArgsConstructor
public class RagController {

    private final DocumentService documentService;
    private final RagQaService ragQaService;

    // Injected by bean name, since there are two beans of each limiter type
    @Resource
    private Bucket aiChatBucket;
    @Resource
    private Bucket uploadBucket;
    @Resource
    private RateLimiter aiGuavaLimiter;
    @Resource
    private RateLimiter uploadGuavaLimiter;
    @Resource
    private RedisLimitUtil redisLimitUtil;
    @Resource
    private LuaLimitUtil luaLimitUtil;

    // 1. Resilience4j circuit breaking + rate limiting (default production approach).
    // The annotations are fully qualified because RateLimiter clashes with Guava's class.
    @GetMapping("/chat")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "aiChatLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "aiChatCircuit", fallbackMethod = "chatFallback")
    public R<String> chat(@RequestParam String sessionId, @RequestParam String question) {
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @PostMapping("/upload")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "uploadLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "uploadCircuit", fallbackMethod = "uploadFallback")
    public R<String> upload(@RequestParam MultipartFile file) throws Exception {
        documentService.uploadAndEmbed(file);
        return R.ok("Document uploaded and vectorized");
    }

    // 2. Bucket4j high-performance rate limiting
    @GetMapping("/chat/bucket4j")
    public R<String> chatByBucket4j(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiChatBucket.tryConsume(1)) {
            return R.fail("[Bucket4j] AI Q&A endpoint is rate limited");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 3. Guava lightweight rate limiting
    @GetMapping("/chat/guava")
    public R<String> chatByGuava(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiGuavaLimiter.tryAcquire()) {
            return R.fail("[Guava] Too many AI Q&A requests, please try again later");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 4. Redisson distributed rate limiting
    @GetMapping("/chat/redisson")
    public R<String> chatByRedisson(@RequestParam String sessionId, @RequestParam String question) {
        if (!redisLimitUtil.tryLimit("ai:chat:cluster:limit", 5)) {
            return R.fail("[Redisson] Cluster-wide rate limit reached");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 5. Native Redis+Lua rate limiting
    @GetMapping("/chat/lua")
    public R<String> chatByLua(@RequestParam String sessionId, @RequestParam String question) {
        if (!luaLimitUtil.tryLimit("ai:chat:lua:limit", 5)) {
            return R.fail("[Lua] Request rate limited");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @DeleteMapping("/clear")
    public R<String> clear() {
        documentService.clearVectorStore();
        return R.ok("Vector store cleared");
    }

    // Fallback methods: same parameters as the protected method plus a trailing Throwable
    public R<String> chatFallback(String sessionId, String question, Throwable e) {
        return R.fail("AI service is busy; circuit breaker fallback is active, please retry later");
    }

    public R<String> uploadFallback(MultipartFile file, Throwable e) {
        return R.fail("Document upload service error; fallback applied");
    }
}

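11. Production Pitfalls

  • ❌ Do not rely on rate limiting alone without circuit breaking: timed-out LLM calls pile up and lead to cascading failure

  • ❌ Do not use Guava or a local Bucket4j bucket in a cluster: each instance counts on its own, so the intended global limit is not enforced

  • ❌ Do not use Hystrix or old home-grown circuit breakers in new projects

  • ✅ Split responsibilities: Resilience4j for all circuit breaking, rate limiting chosen per scenario

  • ✅ Give large-file upload its own limit so it does not consume the Q&A endpoint's quota

  • ✅ In a cluster, pair with Redisson for global traffic control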
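The cluster pitfall above comes down to simple arithmetic: when every instance enforces the limit locally, the cluster as a whole admits up to the per-instance limit multiplied by the number of instances. A tiny sketch, with a hypothetical helper name:

```java
class LocalLimitInCluster {
    /**
     * Requests admitted across the cluster in one window when each instance
     * enforces its own local limit and receives the given number of requests.
     */
    static int admittedPerWindow(int instances, int perInstanceLimit, int offeredPerInstance) {
        return instances * Math.min(perInstanceLimit, offeredPerInstance);
    }
}
```

With 3 instances, a local limit of 5, and 10 requests arriving at each instance, 15 requests get through in the window instead of the intended 5. A shared limiter such as Redisson's RRateLimiter with RateType.OVERALL keeps the count in one place.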

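12. Startup Order

  1. Start Redis, MySQL, and the Milvus/Chroma vector store

  2. Set the Tongyi Qianwen API key in the yml file

  3. Refresh the Maven dependencies

  4. Start the Spring Boot application

  5. The default endpoint /api/rag/chat comes with circuit breaking, rate limiting, and fallback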

13. Endpoint Test Checklist

  • Resilience4j (default): GET /api/rag/chat

  • Bucket4j: GET /api/rag/chat/bucket4j

  • Guava: GET /api/rag/chat/guava

  • Redisson (distributed): GET /api/rag/chat/redisson

  • Lua (native): GET /api/rag/chat/lua

  • Document upload: POST /api/rag/upload

  • Clear vector store: DELETE /api/rag/clear
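One way to exercise the chat endpoints in the checklist is a short burst with the JDK's built-in HTTP client, watching for the point at which responses switch to the rate-limit or fallback message. The sketch below is an assumption-laden helper, not part of the project: the class name is hypothetical, and the base URL (for example http://localhost:8080, Spring Boot's default port) must match your deployment.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class RagSmokeTest {
    /** Builds a GET request for one of the chat endpoints, URL-encoding the parameters. */
    static HttpRequest chatRequest(String baseUrl, String path, String sessionId, String question) {
        String query = "sessionId=" + URLEncoder.encode(sessionId, StandardCharsets.UTF_8)
                + "&question=" + URLEncoder.encode(question, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + path + "?" + query)).GET().build();
    }

    /** Sends n requests back to back and prints each status code and body. */
    static void burst(String baseUrl, String path, int n) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        for (int i = 0; i < n; i++) {
            HttpResponse<String> response = client.send(
                    chatRequest(baseUrl, path, "smoke-test", "hello"),
                    HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }
}
```

Calling `RagSmokeTest.burst("http://localhost:8080", "/api/rag/chat/bucket4j", 10)` against a running instance should show the first few requests answered normally and the rest rejected once the 5-per-second budget is spent, provided the requests arrive faster than the bucket refills.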
