Troubleshooting and Optimizing Performance Bottlenecks in Java APIs Under High Concurrency in the Meituan CPS Distribution System
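In the Meituan CPS distribution system, core interfaces such as order callbacks, commission calculation, and commission payout face several thousand QPS during the lunch and dinner peaks. If performance bottlenecks are not identified and fixed ahead of time, response times (RT) can spike, threads can block, and the service can even avalanche. Drawing on tools such as Arthas, Prometheus, and SkyWalking, this article presents practical troubleshooting and optimization approaches at three levels: code, JVM, and database.
1. Diagnosing hot methods in real time with Arthas
Use the trace command to locate slow methods: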
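    # Start Arthas
    java -jar arthas-boot.jar
    # Trace the commission-calculation entry point
    trace baodanbao.com.cn.cps.service.CommissionService calculateCommissionDistribution
If the trace shows calculateRule() accounting for 80% of the elapsed time, focus optimization on that logic.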
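To see why the method holding 80% of the time deserves the attention, Amdahl's law gives a quick estimate of the overall gain from speeding up one part of a call. The sketch below is only an illustration: the class name AmdahlEstimate and the speedup factors are assumptions, and only the 80% share comes from the trace scenario above.

```java
/** Hypothetical helper for estimating overall speedup; not part of the CPS codebase. */
final class AmdahlEstimate {

    /**
     * Amdahl's law: overall speedup when a part taking hotFraction of the
     * total time is made hotSpeedup times faster and the rest is unchanged.
     */
    static double overallSpeedup(double hotFraction, double hotSpeedup) {
        return 1.0 / ((1.0 - hotFraction) + hotFraction / hotSpeedup);
    }

    public static void main(String[] args) {
        // calculateRule() takes 80% of the time; try a few assumed speedups
        for (double s : new double[] {2.0, 4.0, 10.0}) {
            System.out.printf("calculateRule %.0fx faster -> whole call %.2fx faster%n",
                    s, overallSpeedup(0.8, s));
        }
        // For contrast: making the remaining 20% four times faster
        System.out.printf("other code 4x faster -> whole call %.2fx faster%n",
                overallSpeedup(0.2, 4.0));
    }
}
```

Making the 80% method four times faster speeds up the whole call by about 2.5x, while the same effort spent on the remaining 20% yields less than 1.2x.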
2. Avoiding lock contention: lock-free design
The original code uses synchronized, which forces threads to queue:
    // ❌ Anti-pattern: global lock (every caller queues on the same monitor)
    public synchronized void updateBalance(Long userId, BigDecimal amount) {
        UserAccount account = accountMapper.selectById(userId);
        account.setBalance(account.getBalance().add(amount));
        accountMapper.updateById(account);
    }
Switch to CAS retries or database optimistic locking:
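For state held inside a single JVM, the CAS-retry option might look like the sketch below. CasBalance is a hypothetical class written for illustration, not code from the system described here; it relies only on java.util.concurrent.atomic.AtomicReference.

```java
import java.math.BigDecimal;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical in-memory balance holder used to illustrate a CAS retry loop. */
final class CasBalance {
    private final AtomicReference<BigDecimal> balance;

    CasBalance(BigDecimal initial) {
        this.balance = new AtomicReference<>(initial);
    }

    /** Adds amount without taking a lock; returns the balance after the update. */
    BigDecimal add(BigDecimal amount) {
        while (true) {
            BigDecimal current = balance.get();
            BigDecimal next = current.add(amount);
            // Succeeds only if no other thread replaced the value since it was read
            if (balance.compareAndSet(current, next)) {
                return next;
            }
            // Lost the race: loop, re-read the latest value, and try again
        }
    }

    BigDecimal get() {
        return balance.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CasBalance account = new CasBalance(BigDecimal.ZERO);
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    account.add(BigDecimal.ONE);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        // 8 threads x 1000 increments, none lost
        System.out.println(account.get());
    }
}
```

CAS on an in-memory value only coordinates threads inside one process. When the balance lives in the database and several service instances update it, the version-column variant applies: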
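    // ✅ Optimized: optimistic locking with a database version column
    public void updateBalanceOptimistic(Long userId, BigDecimal amount) {
        for (int attempt = 0; attempt < 3; attempt++) {
            // Plain read, no SELECT ... FOR UPDATE: the version check in the UPDATE detects conflicts
            UserAccount account = accountMapper.selectById(userId);
            BigDecimal newBalance = account.getBalance().add(amount);
            int updated = accountMapper.updateBalanceWithVersion(userId, newBalance, account.getVersion());
            if (updated > 0) {
                return; // success
            }
            try {
                Thread.sleep(10); // brief back-off before the next attempt
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop retrying
                break;
            }
        }
        // Surface the failure instead of silently dropping the update
        throw new IllegalStateException("Balance update failed after retries, userId=" + userId);
    }
Mapper XML: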
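    <update id="updateBalanceWithVersion">
        UPDATE user_account
        SET balance = #{newBalance}, version = version + 1
        WHERE user_id = #{userId} AND version = #{version}
    </update>
3. Making non-core paths asynchronous
Move logging, notifications, event tracking, and similar operations off the request thread: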
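    @Service
    public class OrderCallbackService {
        @Autowired
        private ExecutorService asyncTaskExecutor;
        @Autowired
        private CommissionCalculator commissionCalculator; // assumed bean type; its declaration is not shown in the original

        public void handleOrderCallback(OrderCallbackDTO dto) {
            // 1. Core path: validate and calculate the commission
            CommissionResult result = commissionCalculator.calculate(dto);
            // 2. Async path: write the audit log and push the message
            asyncTaskExecutor.execute(() -> {
                baodanbao.com.cn.cps.audit.AuditLogger.log(result);
                baodanbao.com.cn.cps.notify.MessageSender.send(result.getUserId(), "佣金到账");
            });
        }
    }
4. Database connection pool and SQL optimization
Check the HikariCP metrics: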
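    # application.yml
    spring:
      datasource:
        hikari:
          maximum-pool-size: 40
    # With spring-boot-starter-actuator on the classpath, Spring Boot binds HikariCP
    # metrics to Micrometer automatically, so no metric-registry property is needed.
    management:
      endpoints:
        web:
          exposure:
            include: metrics
Monitor the number of active connections via /actuator/metrics/hikaricp.connections.active.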
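When the active-connection gauge sits close to maximum-pool-size, Little's Law offers a rough sanity check on the pool size: average connections in use equals request rate times average time a connection is held. The sketch below is an estimate under stated assumptions, not a sizing rule from the original system. PoolSizeEstimate is a hypothetical helper, the 15 ms hold time is an assumed figure, and 2000 QPS and the pool size of 40 are taken from the examples in this article.

```java
/** Hypothetical helper applying Little's Law to connection-pool sizing. */
final class PoolSizeEstimate {

    /** Average connections in use = arrival rate (per second) x average hold time (seconds). */
    static double averageConnectionsInUse(double queriesPerSecond, double avgHoldMillis) {
        return queriesPerSecond * avgHoldMillis / 1000.0;
    }

    /** Share of the pool that is busy on average. */
    static double utilization(double connectionsInUse, int maximumPoolSize) {
        return connectionsInUse / maximumPoolSize;
    }

    public static void main(String[] args) {
        double qps = 2000;          // per-instance request rate
        double holdMillis = 15;     // assumed average time a connection is held per request
        int maximumPoolSize = 40;   // value from the application.yml example

        double inUse = averageConnectionsInUse(qps, holdMillis);
        System.out.printf("average connections in use: %.1f%n", inUse);
        System.out.printf("average pool utilization: %.0f%%%n",
                100 * utilization(inUse, maximumPoolSize));
    }
}
```

This is an average, so bursts can still exhaust the pool. If the estimate already approaches the configured maximum, shortening the SQL hold time usually helps more than enlarging the pool.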
Optimize slow SQL: add a composite index on user_id, status, and create_time:
    ALTER TABLE cps_commission_record ADD INDEX idx_user_status_time (user_id, status, create_time);
Avoid SELECT *; query only the columns you need:
    @Select("SELECT order_id, amount, status FROM cps_commission_record WHERE user_id = #{userId}")
    List<SimpleCommission> selectSimpleByUserId(@Param("userId") Long userId);
5. Cache-breakdown protection and local caching
Use a two-level Caffeine + Redis cache for frequently accessed distribution rules:
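    @Component
    public class DistributeRuleCache {
        @Autowired
        private RedisTemplate<String, Object> redisTemplate; // assumed injection; not shown in the original

        private final Cache<Long, DistributeRule> localCache = Caffeine.newBuilder()
                .maximumSize(1000)
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .build();

        public DistributeRule getRule(Long merchantId) {
            // Level 1: local Caffeine cache
            DistributeRule rule = localCache.getIfPresent(merchantId);
            if (rule != null) return rule;
            // Level 2: Redis
            String redisKey = "distribute:rule:" + merchantId;
            Object obj = redisTemplate.opsForValue().get(redisKey);
            if (obj instanceof DistributeRule) {
                localCache.put(merchantId, (DistributeRule) obj);
                return (DistributeRule) obj;
            }
            // Fall back to the DB (guarded by a Redis distributed lock to prevent cache breakdown)
            rule = loadFromDbWithLock(merchantId); // implementation not shown in the original
            if (rule != null) {
                redisTemplate.opsForValue().set(redisKey, rule, 30, TimeUnit.MINUTES);
                localCache.put(merchantId, rule);
            }
            return rule;
        }
    }
6. JVM GC tuning and memory analysis
Watch the GC logs. Frequent Young GCs point to a high allocation rate of short-lived objects or an undersized Eden region, while long Young GC pauses usually mean many objects survive each collection and have to be copied: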
    # Startup parameters
    -Xms8g -Xmx8g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xlog:gc*:file=gc.log:time
Use MAT to analyze heap dumps and look for large objects or memory leaks:
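    // Avoid creating large collections inside loops
    public List<OrderStat> processOrders(List<Order> orders) {
        // ❌ Anti-pattern: new ArrayList on every iteration
        // ✅ Correct: pre-allocate the capacity once
        List<OrderStat> stats = new ArrayList<>(orders.size());
        for (Order o : orders) {
            stats.add(convert(o));
        }
        return stats;
    }
7. Rate limiting and circuit breaking
Protect core endpoints with Sentinel: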
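    @PostConstruct
    public void initFlowRules() {
        List<FlowRule> rules = new ArrayList<>();
        FlowRule rule = new FlowRule("commission_callback_api")
                .setGrade(RuleConstant.FLOW_GRADE_QPS)
                .setCount(2000); // QPS cap of 2000
        rules.add(rule);
        FlowRuleManager.loadRules(rules);
    }

    @SentinelResource(value = "commission_callback_api", blockHandler = "handleBlocked")
    public ResponseEntity<?> handleCallback(@RequestBody CallbackDTO dto) {
        // business logic
        return ResponseEntity.ok().build();
    }

    // A blockHandler takes the same parameters as the protected method, plus a trailing BlockException
    public ResponseEntity<?> handleBlocked(CallbackDTO dto, BlockException ex) {
        return ResponseEntity.status(429).body("Too many requests");
    }
Copyright of this article belongs to 俱美开放平台; please credit the source when reposting.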
