Enterprise Java + LangChain4j RAG System: Rate Limiting, Circuit Breaking, and Degradation
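1. About this document
This document is based on an enterprise AI knowledge-base RAG project built with Spring Boot 3 + LangChain4j + Milvus/Chroma + MySQL + Redis. It brings together the mainstream approaches to API rate limiting, circuit breaking, and degradation, with source code for the configuration, utility, and controller layers, configuration files, scenario-based selection guidelines, production rollout standards, and key interview topics.
All code is meant as a drop-in replacement for Sentinel, written to avoid dependency conflicts and to be deployable as-is, and it covers the AI Q&A, document parsing, and chunked large-file upload scenarios.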
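2. Core concepts (essential for production)
2.1 Rate limiting (RateLimit)
Caps the QPS of an endpoint so that excessive request volume cannot overwhelm the service; it addresses traffic spikes and malicious request flooding.
Applies to: high-frequency AI Q&A, document upload, and batch parsing endpoints.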
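As a rough illustration of how a QPS cap works, the sketch below implements a small token bucket using only the JDK. It is not the algorithm of any particular library named in this document; the class name `SimpleTokenBucket` and its parameters are invented for the example, and the clock is passed in explicitly so the behaviour is deterministic.

```java
/** Illustrative token bucket; hypothetical class, not taken from any library. */
final class SimpleTokenBucket {
    private final long periodNanos;   // length of one refill period
    private final long refillTokens;  // tokens added per period
    private final long maxCredit;     // capacity, expressed in credit units
    private long credit;              // one token costs periodNanos credit units
    private long lastNanos;           // time of the last refill calculation

    SimpleTokenBucket(long capacity, long refillTokens, long periodNanos, long nowNanos) {
        this.periodNanos = periodNanos;
        this.refillTokens = refillTokens;
        this.maxCredit = capacity * periodNanos;
        this.credit = maxCredit;      // the bucket starts full
        this.lastNanos = nowNanos;
    }

    /** Returns true and consumes one token if one is available at time nowNanos. */
    synchronized boolean tryConsume(long nowNanos) {
        // Integer arithmetic avoids floating-point drift; elapsed is capped to avoid overflow.
        long elapsed = Math.min(Math.max(0L, nowNanos - lastNanos), maxCredit);
        credit = Math.min(maxCredit, credit + elapsed * refillTokens);
        lastNanos = Math.max(lastNanos, nowNanos);
        if (credit >= periodNanos) {
            credit -= periodNanos;
            return true;
        }
        return false;
    }
}
```

With a capacity of 5 and a refill of 5 tokens per second, five calls at the same instant pass and the sixth is rejected until time advances; in production code the clock would typically be `System.nanoTime()`.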
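2.2 Circuit breaking (CircuitBreaker)
When a dependency (LLM API, vector store, database) times out or its error rate climbs too high, requests to it are cut off automatically. This avoids request pile-up and blocked threads and prevents a cascading failure (service avalanche).
Essential for AI projects: LLM endpoints are unstable, slow to respond, and prone to timeouts.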
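The state transitions behind a circuit breaker can be sketched as a small state machine. This is a simplified, JDK-only illustration, assuming a count-based window and a "any failure while half-open reopens the circuit" rule; real libraries such as Resilience4j evaluate the half-open phase differently and also limit the number of trial calls. `SimpleCircuitBreaker` and all its parameters are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified circuit breaker state machine (illustrative only). */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;            // number of recent calls considered
    private final int failureRateThreshold;  // percent of failures that opens the circuit
    private final long waitNanos;            // how long to stay OPEN before probing
    private final int halfOpenCalls;         // successful probes needed to close again

    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;
    private int halfOpenSuccesses;

    SimpleCircuitBreaker(int windowSize, int failureRateThreshold, long waitNanos, int halfOpenCalls) {
        this.windowSize = windowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.waitNanos = waitNanos;
        this.halfOpenCalls = halfOpenCalls;
    }

    /** Call before invoking the dependency; false means "fail fast". */
    synchronized boolean allowRequest(long nowNanos) {
        if (state == State.OPEN && nowNanos - openedAt >= waitNanos) {
            state = State.HALF_OPEN;      // wait time elapsed: let probe calls through
            halfOpenSuccesses = 0;
        }
        return state != State.OPEN;
    }

    /** Call after the dependency returned (success) or threw/timed out (failure). */
    synchronized void record(boolean success, long nowNanos) {
        if (state == State.HALF_OPEN) {
            if (!success) {
                open(nowNanos);           // simplification: one failed probe reopens
            } else if (++halfOpenSuccesses >= halfOpenCalls) {
                state = State.CLOSED;
                window.clear();
            }
            return;
        }
        if (state == State.OPEN) {
            return;                       // late result of a call admitted earlier
        }
        window.addLast(!success);
        if (window.size() > windowSize) {
            window.removeFirst();
        }
        if (window.size() == windowSize) {
            long failures = window.stream().filter(failed -> failed).count();
            if (failures * 100 >= (long) failureRateThreshold * windowSize) {
                open(nowNanos);
            }
        }
    }

    private void open(long nowNanos) {
        state = State.OPEN;
        openedAt = nowNanos;
        window.clear();
    }

    synchronized State state() {
        return state;
    }
}
```

A caller would check `allowRequest` before contacting the LLM or vector store, return a fallback immediately when it is false, and report each outcome through `record`.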
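2.3 Degradation (Fallback)
When a service is circuit-broken or fails, a predefined fallback result is returned, so the service stays available and the caller receives a controlled response instead of an error or a cascading failure.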
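Stripped of any framework, degradation amounts to catching the failure and substituting a prepared answer. The helper below is a minimal sketch of that idea; `Fallbacks.callWithFallback` is a hypothetical name, and only unchecked exceptions are handled for brevity.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Minimal fallback helper (illustrative; not part of any library). */
final class Fallbacks {
    private Fallbacks() {
    }

    /** Runs primary; if it throws, returns whatever fallback computes from the error. */
    static <T> T callWithFallback(Supplier<T> primary, Function<Throwable, T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return fallback.apply(e);
        }
    }
}
```

For example, `Fallbacks.callWithFallback(() -> llm.chat(question), e -> "AI service is busy, please retry later")` would keep the endpoint responding even when the hypothetical `llm.chat` call fails.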
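3. Capability comparison of mainstream solutions
| Solution | Rate limiting | Circuit breaking / degradation | Distributed cluster | Performance | Ops cost | Typical use |
|---|---|---|---|---|---|---|
| Resilience4j | ✅ | ✅ full-featured | Per instance; pair with a Redis-based limiter | Very high | Very low | Monolith/microservices; first choice for AI projects |
| Sentinel | ✅ | ✅ | ✅ | High | Medium (console required) | Alibaba ecosystem; dynamic rules with a visual console |
| Redisson | ✅ distributed | ❌ | ✅ strong fit | High | Very low | Cluster-wide global rate limiting |
| Bucket4j | ✅ high performance | ❌ | Supports Redis | Extremely high | Very low | Large-file upload, high-concurrency endpoints |
| Guava RateLimiter | ✅ single node | ❌ | ❌ | Very high | None | Small internal projects, simple anti-abuse |
| Redis+Lua | ✅ native | ❌ | ✅ | High | Very low | No framework, minimal stack |
| Hystrix | Concurrency isolation only (thread pool/semaphore) | ✅ | ❌ per instance | Low | High | Legacy projects (in maintenance mode); do not use in new projects |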
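4. Selection decision flow (reusable)
Start → decide whether circuit breaking/degradation and timeout protection against cascading failure are needed
4.1 No circuit breaking needed (rate limiting only)
Small single-node project → Guava RateLimiter
High concurrency / large-file upload → Bucket4j
Multi-instance cluster → Redisson / Redis+Lua
Framework-free, self-built → Redis+Lua
4.2 Circuit breaking and degradation needed (required for AI projects)
Alibaba ecosystem, visual console needed → Sentinel
Outside the Alibaba ecosystem, lightweight, no lock-in → Resilience4j
Single-instance deployment → Resilience4j for everything
Cluster deployment → Resilience4j (circuit breaking) + Redisson (distributed rate limiting)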
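The decision flow can also be written down as a small function, which makes the precedence between branches explicit. The sketch below is one possible reading: the order in which the rate-limiting-only conditions are checked (framework-free first, then cluster, then high concurrency) is an assumption, since the flow above does not rank them. `LimiterSelector` and its parameter names are hypothetical.

```java
/** Encodes the selection flow as code (illustrative; branch precedence is assumed). */
final class LimiterSelector {
    private LimiterSelector() {
    }

    static String recommend(boolean needsCircuitBreaker,
                            boolean alibabaEcosystem,
                            boolean clustered,
                            boolean highConcurrencyOrUpload,
                            boolean frameworkFree) {
        if (needsCircuitBreaker) {
            if (alibabaEcosystem) {
                return "Sentinel";
            }
            // Outside the Alibaba ecosystem: Resilience4j, plus Redisson when clustered
            return clustered ? "Resilience4j + Redisson" : "Resilience4j";
        }
        // Rate limiting only
        if (frameworkFree) {
            return "Redis+Lua";
        }
        if (clustered) {
            return "Redisson";
        }
        if (highConcurrencyOrUpload) {
            return "Bucket4j";
        }
        return "Guava RateLimiter";
    }
}
```

For a clustered AI project outside the Alibaba ecosystem, `recommend(true, false, true, false, false)` yields the Resilience4j plus Redisson combination.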
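5. Final production stack for this RAG project
A fixed stack for this project, so the selection does not have to be revisited
Circuit breaking, degradation, timeouts, cascading-failure protection: Resilience4j (integrated through Spring Cloud Circuit Breaker, no vendor lock-in)
Single-node endpoint QPS limiting: Resilience4j built-in rate limiter
Cluster-wide distributed rate limiting: Redisson
High-concurrency rate limiting for chunked large-file upload: Bucket4j
Minimal alternatives: Guava, Redis+Lua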
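The stack lists timeout protection alongside circuit breaking. As a framework-free illustration of what a time limit on a slow LLM call means, the sketch below uses `CompletableFuture.completeOnTimeout` from the JDK; it is not how Resilience4j implements its time limiter, and `TimeoutGuard` is a hypothetical helper name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Runs a task with a deadline and a fallback value (illustrative only). */
final class TimeoutGuard {
    private TimeoutGuard() {
    }

    /**
     * Returns the task result, or fallback if the task fails or exceeds timeoutMillis.
     * Note: the task itself is not interrupted; it keeps running in the background.
     */
    static <T> T callWithTimeout(Supplier<T> task, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(task)
                .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
                .exceptionally(ex -> fallback)
                .join();
    }
}
```

Because the abandoned task still occupies a thread until it finishes, a deadline like this is usually combined with a client-side request timeout and a circuit breaker rather than used alone.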
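6. Full project dependencies (pom.xml)
Combines every rate-limiting and circuit-breaking option with the core RAG dependencies; it can replace the existing pom directly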
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.6</version>
        <relativePath/>
    </parent>

    <groupId>com.ai</groupId>
    <artifactId>langchain4j-enterprise-rag</artifactId>
    <version>1.0.0</version>
    <name>LangChain4j Enterprise RAG System</name>

    <properties>
        <java.version>17</java.version>
        <langchain4j.version>0.32.0</langchain4j.version>
        <mybatis-plus.version>3.5.5</mybatis-plus.version>
        <fastjson2.version>2.0.52</fastjson2.version>
    </properties>

    <dependencies>
        <!-- Spring Boot basics -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- Required for the Resilience4j annotations (@RateLimiter, @CircuitBreaker) to take effect -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-aop</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <!-- Database -->
        <dependency>
            <groupId>com.mysql</groupId>
            <artifactId>mysql-connector-j</artifactId>
            <scope>runtime</scope>
        </dependency>
        <!-- Spring Boot 3 needs the boot3 variant of the MyBatis-Plus starter -->
        <dependency>
            <groupId>com.baomidou</groupId>
            <artifactId>mybatis-plus-spring-boot3-starter</artifactId>
            <version>${mybatis-plus.version}</version>
        </dependency>

        <!-- LangChain4j RAG core -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-spring-boot-starter</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <!-- Tongyi Qianwen (Qwen) models are provided by the DashScope module -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-dashscope</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-milvus</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-chroma</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-document-parser-apache-tika</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>

        <!-- Rate limiting and circuit breaking: all options -->
        <!-- Check this version against the Spring Cloud release train that matches your Spring Boot version -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-circuitbreaker-resilience4j</artifactId>
            <version>3.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.redisson</groupId>
            <artifactId>redisson-spring-boot-starter</artifactId>
            <version>3.27.0</version>
        </dependency>
        <dependency>
            <groupId>com.github.vladimir-bukhtoyarov</groupId>
            <artifactId>bucket4j-core</artifactId>
            <version>7.6.0</version>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>32.1.3-jre</version>
        </dependency>

        <!-- Utilities -->
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>com.alibaba.fastjson2</groupId>
            <artifactId>fastjson2</artifactId>
            <version>${fastjson2.version}</version>
        </dependency>
        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-all</artifactId>
            <version>5.8.30</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

7. Global configuration file (application-dev.yml)
spring:
  datasource:
    url: jdbc:mysql://127.0.0.1:3306/rag_db?useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghai&allowMultiQueries=true
    username: root
    password: 123456
    driver-class-name: com.mysql.cj.jdbc.Driver
  data:
    redis:
      host: 127.0.0.1
      port: 6379
      password:
      database: 0

# Tongyi Qianwen (Qwen) LLM settings
langchain4j:
  tongyi:
    api-key: sk-xxx-your-key-xxx
    model-name: qwen-turbo
    timeout: 60s

# Vector store settings
milvus:
  host: 127.0.0.1
  port: 19530
  collection-name: enterprise_knowledge
chroma:
  host: 127.0.0.1
  port: 8000

# Resilience4j circuit breaker and rate limiter settings
resilience4j:
  circuitbreaker:
    instances:
      aiChatCircuit:
        slidingWindowSize: 10                      # evaluate the last 10 calls
        failureRateThreshold: 50                   # open at 50% failures
        waitDurationInOpenState: 10s               # stay open for 10 s before probing
        permittedNumberOfCallsInHalfOpenState: 3   # probe calls allowed while half-open
      uploadCircuit:
        slidingWindowSize: 10
        failureRateThreshold: 50
  ratelimiter:
    instances:
      aiChatLimit:
        limitForPeriod: 5          # 5 calls per refresh period
        limitRefreshPeriod: 1s
        timeoutDuration: 2s        # how long a caller waits for a permit
      uploadLimit:
        limitForPeriod: 2
        limitRefreshPeriod: 1s

8. Full configuration class source
8.1 Bucket4jConfig.java: high-performance rate limiting configuration
package com.ai.rag.config;

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Refill;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.time.Duration;

@Configuration
public class Bucket4jConfig {

    // In-memory bucket for AI chat: capacity 5, refilled at 5 tokens per second
    @Bean
    public Bucket aiChatBucket() {
        Bandwidth bandwidth = Bandwidth.classic(5, Refill.greedy(5, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }

    // In-memory bucket for uploads: capacity 2, refilled at 2 tokens per second
    @Bean
    public Bucket uploadBucket() {
        Bandwidth bandwidth = Bandwidth.classic(2, Refill.greedy(2, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }
}

8.2 GuavaLimitConfig.java: lightweight single-node rate limiting configuration
package com.ai.rag.config;

import com.google.common.util.concurrent.RateLimiter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GuavaLimitConfig {

    // AI chat: 5 permits per second (per JVM)
    @Bean
    public RateLimiter aiGuavaLimiter() {
        return RateLimiter.create(5.0);
    }

    // Upload: 2 permits per second (per JVM)
    @Bean
    public RateLimiter uploadGuavaLimiter() {
        return RateLimiter.create(2.0);
    }
}

9. Full set of rate-limiting utility classes
9.1 RedisLimitUtil.java: distributed rate limiting with Redisson
package com.ai.rag.util;

import jakarta.annotation.Resource; // Spring Boot 3 uses jakarta.*, not javax.*
import org.redisson.api.RRateLimiter;
import org.redisson.api.RateIntervalUnit;
import org.redisson.api.RateType;
import org.redisson.api.RedissonClient;
import org.springframework.stereotype.Component;

@Component
public class RedisLimitUtil {

    @Resource
    private RedissonClient redissonClient;

    /** Returns true if one permit is available under a cluster-wide limit of qps per second. */
    public boolean tryLimit(String key, int qps) {
        RRateLimiter limiter = redissonClient.getRateLimiter(key);
        // Only takes effect the first time; an already configured rate is left unchanged
        limiter.trySetRate(RateType.OVERALL, qps, 1, RateIntervalUnit.SECONDS);
        return limiter.tryAcquire(1);
    }
}

9.2 LuaLimitUtil.java: native rate limiting with Redis + Lua
package com.ai.rag.util;

import jakarta.annotation.Resource; // Spring Boot 3 uses jakarta.*, not javax.*
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.stereotype.Component;

import java.util.List;

@Component
public class LuaLimitUtil {

    // Fixed-window counter: at most `limit` requests per key in each 1-second window
    private static final String LUA_SCRIPT = """
            local key = KEYS[1]
            local limit = tonumber(ARGV[1])
            local curr = tonumber(redis.call('get', key) or '0')
            if curr + 1 > limit then
                return 0
            end
            -- Start the 1-second window on the first hit only, so later hits do not extend it
            if redis.call('incr', key) == 1 then
                redis.call('expire', key, 1)
            end
            return 1
            """;

    // Built once so the script does not have to be re-created on every call
    private static final DefaultRedisScript<Long> SCRIPT =
            new DefaultRedisScript<>(LUA_SCRIPT, Long.class);

    @Resource
    private StringRedisTemplate stringRedisTemplate;

    /** Returns true if the request is admitted, false if the limit for this window is used up. */
    public boolean tryLimit(String key, int limit) {
        Long result = stringRedisTemplate.execute(SCRIPT, List.of(key), String.valueOf(limit));
        return result != null && result == 1;
    }
}

10. Unified Controller (all 5 options covered)
Resilience4j circuit breaking and rate limiting are enabled by default; endpoints for the other four options are kept alongside so you can switch between them
package com.ai.rag.controller;

import com.ai.rag.common.R;
import com.ai.rag.service.DocumentService;
import com.ai.rag.service.RagQaService;
import com.ai.rag.util.LuaLimitUtil;
import com.ai.rag.util.RedisLimitUtil;
import com.google.common.util.concurrent.RateLimiter;
import io.github.bucket4j.Bucket;
import jakarta.annotation.Resource; // Spring Boot 3 uses jakarta.*, not javax.*
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/rag")
@RequiredArgsConstructor
public class RagController {

    private final DocumentService documentService;
    private final RagQaService ragQaService;

    // @Resource injects by field name, which selects between the two Bucket / RateLimiter beans
    @Resource
    private Bucket aiChatBucket;
    @Resource
    private Bucket uploadBucket;
    @Resource
    private RateLimiter aiGuavaLimiter;
    @Resource
    private RateLimiter uploadGuavaLimiter;
    @Resource
    private RedisLimitUtil redisLimitUtil;
    @Resource
    private LuaLimitUtil luaLimitUtil;

    // 1. Resilience4j circuit breaking + rate limiting [default production option]
    // Annotations are fully qualified because Guava's RateLimiter is already imported
    @GetMapping("/chat")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "aiChatLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "aiChatCircuit", fallbackMethod = "chatFallback")
    public R<String> chat(@RequestParam String sessionId, @RequestParam String question) {
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @PostMapping("/upload")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "uploadLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "uploadCircuit", fallbackMethod = "uploadFallback")
    public R<String> upload(@RequestParam MultipartFile file) throws Exception {
        documentService.uploadAndEmbed(file);
        return R.ok("Document uploaded and embedded");
    }

    // 2. Bucket4j high-performance rate limiting
    @GetMapping("/chat/bucket4j")
    public R<String> chatByBucket4j(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiChatBucket.tryConsume(1)) {
            return R.fail("[Bucket4j] AI Q&A endpoint rate limited");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 3. Guava lightweight rate limiting
    @GetMapping("/chat/guava")
    public R<String> chatByGuava(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiGuavaLimiter.tryAcquire()) {
            return R.fail("[Guava] Too many AI Q&A requests, please try again later");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 4. Redisson distributed rate limiting
    @GetMapping("/chat/redisson")
    public R<String> chatByRedisson(@RequestParam String sessionId, @RequestParam String question) {
        if (!redisLimitUtil.tryLimit("ai:chat:cluster:limit", 5)) {
            return R.fail("[Redisson] Cluster-wide rate limit reached");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 5. Redis + Lua native rate limiting
    @GetMapping("/chat/lua")
    public R<String> chatByLua(@RequestParam String sessionId, @RequestParam String question) {
        if (!luaLimitUtil.tryLimit("ai:chat:lua:limit", 5)) {
            return R.fail("[Lua] Request rate limited");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @DeleteMapping("/clear")
    public R<String> clear() {
        documentService.clearVectorStore();
        return R.ok("Vector store cleared");
    }

    // Shared fallback methods: same parameters as the guarded method plus the Throwable.
    // With the default aspect order the rate limiter runs inside the circuit breaker,
    // so a rate-limit rejection (RequestNotPermitted) also ends up here.
    public R<String> chatFallback(String sessionId, String question, Throwable e) {
        return R.fail("AI service is busy; circuit breaker fallback applied, please retry later");
    }

    public R<String> uploadFallback(MultipartFile file, Throwable e) {
        return R.fail("Document upload service error; degraded");
    }
}

11. Production pitfalls and rules
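❌ Do not rely on rate limiting alone without circuit breaking: LLM timeouts pile up and can trigger a cascading failure
❌ Do not use Guava or a local (in-memory) Bucket4j bucket in a cluster: each instance enforces its own limit, so the global limit is not enforced
❌ Do not use Hystrix or legacy home-grown circuit breakers in new projects
✅ Separate responsibilities: Resilience4j for all circuit breaking, rate limiting chosen per scenario
✅ Give large-file upload its own rate limit so it does not consume the Q&A endpoint's quota
✅ In a cluster, use Redisson for global traffic control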
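The cluster rule can be checked with a small simulation. The sketch below compares three instances that each hold their own in-memory limiter against three instances that share one counter (standing in for a Redis-backed limiter). It uses a JDK-only fixed-window counter; `FixedWindowCounter` and `ClusterLimitDemo` are hypothetical names, and no Redis is involved.

```java
/** In-memory fixed-window counter: at most `limit` requests per one-second window. */
final class FixedWindowCounter {
    private final int limit;
    private long window = Long.MIN_VALUE; // epoch second of the current window
    private int count;                    // requests admitted in the current window

    FixedWindowCounter(int limit) {
        this.limit = limit;
    }

    synchronized boolean tryAcquire(long epochSecond) {
        if (epochSecond != window) {      // a new second starts a fresh window
            window = epochSecond;
            count = 0;
        }
        if (count >= limit) {
            return false;
        }
        count++;
        return true;
    }
}

final class ClusterLimitDemo {
    private ClusterLimitDemo() {
    }

    /** Spreads requests round-robin over the instances within one second; returns how many pass. */
    static int admitted(FixedWindowCounter[] limiterPerInstance, int requests) {
        int passed = 0;
        for (int i = 0; i < requests; i++) {
            FixedWindowCounter limiter = limiterPerInstance[i % limiterPerInstance.length];
            if (limiter.tryAcquire(0L)) {
                passed++;
            }
        }
        return passed;
    }
}
```

With a limit of 5 and 30 requests in one second, three independent local limiters admit 15 requests in total, while three instances sharing a single counter admit only 5, which is the intended global limit.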
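12. Project startup order
Start Redis, MySQL, and the Milvus/Chroma vector store
Set the Tongyi Qianwen API key in the yml file
Refresh Maven dependencies
Start the Spring Boot application
The default endpoint /api/rag/chat comes with circuit breaking, rate limiting, and fallback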
13. Endpoint test checklist
Resilience4j (default): GET /api/rag/chat
Bucket4j: GET /api/rag/chat/bucket4j
Guava: GET /api/rag/chat/guava
Redisson (distributed): GET /api/rag/chat/redisson
Lua (native): GET /api/rag/chat/lua
Document upload: POST /api/rag/upload
Clear vector store: DELETE /api/rag/clear
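One way to exercise the chat endpoints in the checklist is a small burst test with the JDK HTTP client. The sketch below assumes the application listens on `http://localhost:8080` (an assumption; pass another base URL as the first argument) and uses placeholder values for `sessionId` and `question`. `RagSmokeTest` is a hypothetical class name, and the destructive `/clear` endpoint and the multipart upload are deliberately left out.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Fires a short burst at each chat endpoint and prints the responses (illustrative). */
final class RagSmokeTest {
    static final List<String> CHAT_PATHS = List.of(
            "/api/rag/chat",
            "/api/rag/chat/bucket4j",
            "/api/rag/chat/guava",
            "/api/rag/chat/redisson",
            "/api/rag/chat/lua");

    /** Builds the GET URI with URL-encoded query parameters. */
    static URI chatUri(String baseUrl, String path, String sessionId, String question) {
        String query = "sessionId=" + URLEncoder.encode(sessionId, StandardCharsets.UTF_8)
                + "&question=" + URLEncoder.encode(question, StandardCharsets.UTF_8);
        return URI.create(baseUrl + path + "?" + query);
    }

    public static void main(String[] args) throws Exception {
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8080"; // assumed default
        HttpClient client = HttpClient.newHttpClient();
        for (String path : CHAT_PATHS) {
            // 8 back-to-back calls exceed the configured limit of 5 per second
            for (int i = 0; i < 8; i++) {
                HttpRequest request = HttpRequest.newBuilder(chatUri(baseUrl, path, "s1", "hello")).GET().build();
                HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println(path + " #" + i + " -> " + response.statusCode() + " " + response.body());
            }
        }
    }
}
```

If the limits behave as configured, later calls in each burst should come back with the rate-limit or fallback message rather than an answer, although slow LLM responses can spread the calls over more than one second and hide the effect.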
