Say Goodbye to Tedious Configuration! An Introduction to Annotation-Based Spring Batch: Build Your First File-Processing Job in 5 Minutes
Batch jobs are everywhere in enterprise applications, from daily report generation and data cleansing to large-scale log analysis. In traditional Spring Batch development, the verbose XML configuration often scared developers away. Today, with Spring Boot auto-configuration and the modern annotation model, we can deliver professional-grade batch processing with very little code.
1. Environment Setup and Project Initialization
First, create the project skeleton with Spring Initializr; only two core dependencies are needed:
```xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-batch</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>
```

Note: Spring Batch 5.x requires JDK 17+; if you are on JDK 8, use the 2.7.x line instead. Also be aware that the code in this article uses the `JobBuilderFactory`/`StepBuilderFactory` style of Spring Batch 4.x; in 5.x those factories are deprecated in favor of `JobBuilder`/`StepBuilder` constructed with a `JobRepository`.
When creating the base application class, the key is to exclude the data source auto-configuration (unless you want job state persisted to a database):
```java
@SpringBootApplication(exclude = {DataSourceAutoConfiguration.class})
public class BatchApplication {
    public static void main(String[] args) {
        SpringApplication.run(BatchApplication.class, args);
    }
}
```

2. Annotation-Driven Batch Configuration
The core configuration class needs only two annotations to activate the batch environment:
```java
@Configuration
@EnableBatchProcessing
public class FileBatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;
}
```

Compared with traditional XML configuration, annotations bring three major advantages:
- Type safety: the compiler verifies that bean types match
- Code navigation: the IDE can jump straight to an implementation
- Centralized configuration: all components are defined in one file
3. Building the File-Processing Pipeline
Suppose we need to process a CSV file of student exam scores and compute each student's total. First, define the domain model:
```java
@Data
@AllArgsConstructor
@NoArgsConstructor
public class StudentRecord {
    private String studentId;
    private int math;
    private int physics;
    private int chemistry;
}

@Data
@AllArgsConstructor
@NoArgsConstructor
public class StudentSummary {
    private String studentId;
    private int totalScore;
}
```

3.1 Configuring the Reader and Writer
Build the CSV reader with FlatFileItemReader:
```java
@Bean
public FlatFileItemReader<StudentRecord> csvReader() {
    return new FlatFileItemReaderBuilder<StudentRecord>()
            .name("studentReader")
            .resource(new ClassPathResource("scores.csv"))
            .delimited()
            .names("studentId", "math", "physics", "chemistry")
            .fieldSetMapper(new BeanWrapperFieldSetMapper<>() {{
                setTargetType(StudentRecord.class);
            }})
            .build();
}
```

The matching file writer configuration:
```java
@Bean
public FlatFileItemWriter<StudentSummary> csvWriter() {
    return new FlatFileItemWriterBuilder<StudentSummary>()
            .name("summaryWriter")
            .resource(new FileSystemResource("output/summary.csv"))
            .lineAggregator(new DelimitedLineAggregator<>() {{
                setDelimiter("|");
                setFieldExtractor(new BeanWrapperFieldExtractor<>() {{
                    setNames(new String[]{"studentId", "totalScore"});
                }});
            }})
            .build();
}
```

3.2 Implementing the Processing Logic
Create a processor that calculates the total score:
```java
public class ScoreCalculator implements ItemProcessor<StudentRecord, StudentSummary> {
    @Override
    public StudentSummary process(StudentRecord item) {
        int total = item.getMath() + item.getPhysics() + item.getChemistry();
        return new StudentSummary(item.getStudentId(), total);
    }
}
```

4. Assembling the Batch Job
Combine the components into a complete job:
```java
@Bean
public Job calculateTotalScoresJob() {
    return jobBuilderFactory.get("scoreCalculation")
            .start(processStep())
            .build();
}

@Bean
public Step processStep() {
    return stepBuilderFactory.get("calculateStep")
            .<StudentRecord, StudentSummary>chunk(100)
            .reader(csvReader())
            .processor(new ScoreCalculator())
            .writer(csvWriter())
            .build();
}
```

Key parameters:
- chunk(100): a write is executed after every 100 records have been processed
- reader/processor/writer: together they form the complete processing chain
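To build intuition for what chunk(100) means, here is a plain-Java sketch of the chunk-oriented loop (this is not the Spring Batch API; the name `processInChunks` is invented for illustration): items are read and processed one at a time, but writes happen once per full chunk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class ChunkLoopDemo {

    // Buffers processed items and emits one "write" per full chunk,
    // plus a final partial write for any leftovers.
    public static <I, O> List<List<O>> processInChunks(
            List<I> items, int chunkSize, Function<I, O> processor) {
        List<List<O>> writes = new ArrayList<>();
        List<O> buffer = new ArrayList<>();
        for (I item : items) {
            buffer.add(processor.apply(item));   // process one item at a time
            if (buffer.size() == chunkSize) {    // chunk boundary reached
                writes.add(new ArrayList<>(buffer));
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            writes.add(buffer);                  // final partial chunk
        }
        return writes;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            records.add(i);
        }
        // 250 records with chunkSize 100 -> 3 write calls (100 + 100 + 50)
        System.out.println(processInChunks(records, 100, x -> x * 2).size()); // 3
    }
}
```

In the real step, Spring Batch additionally wraps each of those write calls in a transaction, which is why chunk size trades memory use against commit frequency.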
5. Running and Verifying
Prepare a test file scores.csv:
```
s1001,85,92,88
s1002,78,85,90
s1003,92,95,89
```

After starting the application, inspect output/summary.csv; you should see:
```
s1001|265
s1002|253
s1003|276
```

The console will print processing logs along these lines:
```
Processing student: s1001 with total 265
Processing student: s1002 with total 253
Processing student: s1003 with total 276
Job completed in 450ms
```

6. Advanced Configuration Tips
6.1 Job Listeners and Monitoring
Add a job lifecycle listener:
```java
@Bean
public JobExecutionListener jobListener() {
    return new JobExecutionListener() {
        @Override
        public void beforeJob(JobExecution jobExecution) {
            System.out.println("Job starting: " + jobExecution.getJobInstance().getJobName());
        }

        @Override
        public void afterJob(JobExecution jobExecution) {
            System.out.println("Job completed with status: " + jobExecution.getStatus());
        }
    };
}
```

Then register the listener on the existing Job bean:
```java
@Bean
public Job calculateTotalScoresJob() {
    return jobBuilderFactory.get("scoreCalculation")
            .listener(jobListener())
            .start(processStep())
            .build();
}
```

6.2 Multi-Step Jobs
A complex job can be split into multiple steps:
```java
@Bean
public Job multiStepJob() {
    // prepareStep(), calculateStep() and exportStep() are Step beans defined elsewhere
    return jobBuilderFactory.get("advancedJob")
            .start(prepareStep())
            .next(calculateStep())
            .next(exportStep())
            .build();
}
```

6.3 Exception-Handling Strategies
Configure skip rules and a retry policy:
```java
@Bean
public Step faultTolerantStep() {
    return stepBuilderFactory.get("safeStep")
            .<StudentRecord, StudentSummary>chunk(50)
            .reader(csvReader())
            .processor(calculator())   // calculator() is assumed to be a @Bean returning ScoreCalculator
            .writer(csvWriter())
            .faultTolerant()
            .skipLimit(10)
            .skip(NumberFormatException.class)
            .retryLimit(3)
            .retry(DeadlockLoserDataAccessException.class)
            .build();
}
```

7. Performance Tuning Suggestions
Set a sensible chunk size:
- With plenty of memory, increase the chunk size (500-1000)
- For very large data volumes, reduce it somewhat (50-100)
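This trade-off is easy to quantify: since each chunk ends in one write-and-commit cycle, the number of commits is simply the record count divided by the chunk size, rounded up. A tiny helper makes the arithmetic explicit (illustrative only, not part of Spring Batch):

```java
public class ChunkSizing {

    // Chunk-oriented steps commit once per chunk, so the commit count
    // is ceil(totalRecords / chunkSize).
    public static long commits(long totalRecords, int chunkSize) {
        return (totalRecords + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        // For 2,000,000 records: chunk 100 -> 20,000 commits, chunk 500 -> 4,000 commits.
        System.out.println(commits(2_000_000, 100)); // 20000
        System.out.println(commits(2_000_000, 500)); // 4000
    }
}
```

Larger chunks mean fewer commits but more items held in memory per transaction, which is exactly the balance the two bullets above describe.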
Parallel processing configuration:
```java
@Bean
public Step parallelStep() {
    return stepBuilderFactory.get("parallelStep")
            .<StudentRecord, StudentSummary>chunk(100)
            .reader(csvReader())
            .processor(calculator())
            .writer(csvWriter())
            .taskExecutor(new SimpleAsyncTaskExecutor())
            .throttleLimit(4)
            .build();
}
```

Be careful here: FlatFileItemReader is not thread-safe, so a multi-threaded step needs the reader wrapped (for example in a SynchronizedItemStreamReader) to avoid corrupted reads.

JVM parameter tuning:
```
-Xms512m -Xmx2G -XX:+UseG1GC
```
In a real project I once processed a score file containing 2,000,000 records; raising the chunk size to 500 and enabling parallel processing cut the run time from 45 minutes to 7. The key is to run several rounds of performance tests in a development environment to find the best parameter combination.
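As a quick sanity check on those figures (simple arithmetic restating the numbers above, not benchmark code):

```java
public class ThroughputCheck {

    // Average throughput: records processed per second of wall-clock time.
    public static long recordsPerSecond(long records, long seconds) {
        return records / seconds;
    }

    public static void main(String[] args) {
        // 2,000,000 records in 45 minutes vs. 7 minutes:
        System.out.println(recordsPerSecond(2_000_000, 45 * 60)); // ~740 records/s before tuning
        System.out.println(recordsPerSecond(2_000_000, 7 * 60));  // ~4761 records/s after tuning
    }
}
```

That is roughly a 6.4x speedup, which is plausible for the combination of fewer commits and four concurrent worker threads.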
