Spring Boot 2.3.12 + Spring Batch in Practice: Batch-Calculating Student Score Sheets with Annotations (Full Source Included)
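As education systems go digital, school academic platforms process large volumes of student score data every day. Manual entry and calculation are slow and error-prone, so efficient batch processing is a valuable skill for Java developers. This article builds a complete student score batch-processing system from scratch with Spring Boot 2.3.12 and Spring Batch. Using annotation-based configuration, it reads each student's per-subject scores from a CSV file, calculates the total, and writes the result to a new file.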
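1. Environment setup and project scaffolding
1.1 Creating the Spring Boot project
First, create a basic Spring Boot project. Spring Initializr (https://start.spring.io/) is the quickest way to generate the skeleton. Select the following dependencies:
- Spring Batch
- Lombok (simplifies writing POJOs)
Alternatively, add the following dependencies directly to a Maven project: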

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-batch</artifactId>
        <version>2.3.12.RELEASE</version>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.18.20</version>
        <scope>provided</scope>
    </dependency>

1.2 Planning the project structure
A sensible project structure makes the code easier to maintain. The following layout is recommended:

    src/main/java
    ├── com.example.batchdemo
    │   ├── config       # configuration classes
    │   ├── model        # data models
    │   ├── processor    # business processing
    │   ├── reader       # data reading
    │   └── writer       # data writing
    src/main/resources
    ├── application.yml    # application configuration
    └── student-data.csv   # test data

2. Data model design
2.1 Per-subject score model
First, define the data model for a student's per-subject scores. It maps to the raw rows in the CSV file:
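
    import lombok.AllArgsConstructor;
    import lombok.Data;
    import lombok.NoArgsConstructor;

    // One CSV row: a student's score in each subject
    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    public class Student {
        private String id;     // student ID
        private int chinese;   // Chinese score
        private int math;      // math score
        private int english;   // English score
    }

2.2 Total score model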
The result model produced by processing holds the student ID and the total score:
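
    import lombok.AllArgsConstructor;
    import lombok.Data;
    import lombok.NoArgsConstructor;

    // One output row: a student's ID and total score
    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    public class StudentTotalScore {
        private String id;        // student ID
        private int totalScore;   // sum of the three subject scores
    }

Tip: Lombok annotations cut down on boilerplate considerably, but make sure the Lombok plugin is installed in your IDE.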
3. Core batch configuration
3.1 Enabling batch processing
Add the @EnableBatchProcessing annotation to the Spring Boot main class:
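
    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;

    @SpringBootApplication
    @EnableBatchProcessing // registers JobBuilderFactory, StepBuilderFactory, JobLauncher, etc.
    public class BatchDemoApplication {
        public static void main(String[] args) {
            SpringApplication.run(BatchDemoApplication.class, args);
        }
    }

3.2 Configuring the batch components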
Create a batch configuration class that defines the reader, processor, and writer:
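
    import org.springframework.batch.item.ItemProcessor;
    import org.springframework.batch.item.file.FlatFileItemReader;
    import org.springframework.batch.item.file.FlatFileItemWriter;
    import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
    import org.springframework.batch.item.file.builder.FlatFileItemWriterBuilder;
    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.core.io.Resource;

    @Configuration
    public class BatchConfig {

        @Value("${input.file}")
        private Resource inputFile;

        @Value("${output.file}")
        private Resource outputFile;

        // Reads one comma-separated line at a time and maps it onto a Student bean
        @Bean
        public FlatFileItemReader<Student> reader() {
            return new FlatFileItemReaderBuilder<Student>()
                    .name("studentItemReader")
                    .resource(inputFile)
                    .delimited()
                    // explicit array: compiles whether or not names() is varargs in your Batch version
                    .names(new String[] {"id", "chinese", "math", "english"})
                    .targetType(Student.class) // builder shortcut for a BeanWrapperFieldSetMapper
                    .build();
        }

        @Bean
        public ItemProcessor<Student, StudentTotalScore> processor() {
            return new StudentScoreProcessor();
        }

        // Writes each result as "id,totalScore"
        @Bean
        public FlatFileItemWriter<StudentTotalScore> writer() {
            return new FlatFileItemWriterBuilder<StudentTotalScore>()
                    .name("studentScoreWriter")
                    .resource(outputFile)
                    .delimited() // builder shortcut for a DelimitedLineAggregator
                    .delimiter(",")
                    .names(new String[] {"id", "totalScore"})
                    .build();
        }
    }

3.3 Implementing the business processor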
Create the score processor, which calculates each student's total:

    import lombok.extern.slf4j.Slf4j;
    import org.springframework.batch.item.ItemProcessor;

    @Slf4j // provides the "log" field used below
    public class StudentScoreProcessor implements ItemProcessor<Student, StudentTotalScore> {

        @Override
        public StudentTotalScore process(Student student) throws Exception {
            // Sum the three subject scores
            int totalScore = student.getChinese() + student.getMath() + student.getEnglish();
            log.info("Processing student {}: total score = {}", student.getId(), totalScore);
            return new StudentTotalScore(student.getId(), totalScore);
        }
    }

4. Job definition and execution
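The job defined in this section runs the reader, processor, and writer as a chunk-oriented step. As a rough mental model, here is a plain-Java sketch of that loop. It is not Spring Batch code: the class name ChunkLoopSketch and its run method are made up for illustration, and transactions, restart state, and listeners are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical illustration of chunk-oriented processing; not a Spring Batch class. */
class ChunkLoopSketch {

    /**
     * Reads up to chunkSize items, processes each one, then hands the whole
     * chunk to the writer. Repeats until the reader is exhausted.
     * Returns the number of chunks written.
     */
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        while (true) {
            // 1. Read phase: collect at most chunkSize items
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            if (items.isEmpty()) {
                return chunks; // nothing left to read
            }
            // 2. Process phase: transform every item in the chunk
            List<O> outputs = new ArrayList<>();
            for (I item : items) {
                outputs.add(processor.apply(item));
            }
            // 3. Write phase: one write call per chunk
            writer.accept(outputs);
            chunks++;
        }
    }
}
```

With 25 input items and a chunk size of 10, the writer in this sketch is called three times, with 10, 10, and 5 items. In the real framework each chunk is also wrapped in a transaction, which is why the chunk size affects both memory use and commit frequency.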
4.1 Defining the batch job
Add the job and step definitions to the same configuration class:
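
    // Needs Job and Step (org.springframework.batch.core), JobBuilderFactory and
    // StepBuilderFactory (...core.configuration.annotation), RunIdIncrementer
    // (...core.launch.support), and ItemReader/ItemWriter (org.springframework.batch.item)

    @Bean
    public Job calculateStudentScores(JobBuilderFactory jobs, Step step1) {
        return jobs.get("calculateStudentScores")
                .incrementer(new RunIdIncrementer()) // new run.id parameter on every launch
                .flow(step1)
                .end()
                .build();
    }

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory,
                      ItemReader<Student> reader,
                      ItemWriter<StudentTotalScore> writer,
                      ItemProcessor<Student, StudentTotalScore> processor) {
        return stepBuilderFactory.get("step1")
                .<Student, StudentTotalScore>chunk(10) // write after every 10 items
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }

4.2 Configuring parameters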
Add the file paths to application.yml:

    input:
      file: classpath:student-data.csv
    output:
      file: file:./output/student-scores.csv

4.3 Preparing test data
Create student-data.csv in the resources directory:

    student-1,90,85,96
    student-2,92,97,94
    student-3,95,93,100

5. Advanced configuration and optimization
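5.1 Batch monitoring
Spring Batch 4.2 and later record job and step metrics through Micrometer. Adding the web and Actuator starters makes them available over HTTP:

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>

Then expose the metrics endpoint in application.yml. Spring Boot Actuator has no dedicated batch endpoint; the batch meters appear under the spring.batch prefix, for example at /actuator/metrics/spring.batch.job:

    management:
      endpoints:
        web:
          exposure:
            include: metrics

5.2 Performance tuning suggestions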
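For large data volumes, consider the following optimizations:
- Tune the chunk size: size chunks to the available memory (the examples in this article use 10)
- Parallel processing: use a TaskExecutor for multi-threaded processing
- Partitioning: split a large dataset into partitions and process them in parallel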
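To make the partitioning idea from the list above more concrete, here is a plain-Java sketch of how a dataset of known size could be split into contiguous index ranges, one per worker. It is only the range arithmetic. The class RangePartitionSketch and its split method are hypothetical names, and wiring the ranges into a Spring Batch partitioned step is not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits item indexes 0..totalItems-1 into contiguous ranges. */
class RangePartitionSketch {

    /**
     * Returns {start, endExclusive} pairs that cover 0..totalItems without gaps
     * or overlap. Earlier ranges absorb the remainder, so sizes differ by at most 1.
     */
    static List<int[]> split(int totalItems, int partitions) {
        if (totalItems < 0 || partitions <= 0) {
            throw new IllegalArgumentException("totalItems must be >= 0 and partitions > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        int base = totalItems / partitions;
        int remainder = totalItems % partitions;
        int start = 0;
        for (int i = 0; i < partitions; i++) {
            int size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                break; // fewer items than partitions: stop creating empty ranges
            }
            ranges.add(new int[] {start, start + size});
            start += size;
        }
        return ranges;
    }
}
```

For example, split(10, 4) yields the ranges [0, 3), [3, 6), [6, 8), and [8, 10). Each worker would then read only the rows inside its own range.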
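The following variant of step1 runs chunks on multiple threads. Note that FlatFileItemReader is not thread-safe, so in a multi-threaded step it should be wrapped (for example in SynchronizedItemStreamReader), and the order of the output lines is no longer guaranteed:

    @Bean
    public TaskExecutor taskExecutor() {
        SimpleAsyncTaskExecutor executor = new SimpleAsyncTaskExecutor();
        executor.setConcurrencyLimit(4); // at most 4 concurrent threads
        return executor;
    }

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory) {
        return stepBuilderFactory.get("step1")
                .<Student, StudentTotalScore>chunk(100)
                .reader(reader())       // see the thread-safety note above
                .processor(processor())
                .writer(writer())
                .taskExecutor(taskExecutor())
                .throttleLimit(4)       // cap on concurrently executing chunks
                .build();
    }

5.3 Error handling and retry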
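As a mental model for the retry and skip limits configured in this section, the plain-Java sketch below retries each failing item a fixed number of times, then skips it, and aborts once too many items have been skipped. It is a deliberate simplification under stated assumptions: RetrySkipSketch, maxAttempts, and skipLimit are illustrative names, and Spring Batch's real behavior (chunk rollback, exception classification, listeners) is more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration of retry-then-skip; not how Spring Batch is implemented. */
class RetrySkipSketch {

    /**
     * Applies processor to each item, trying up to maxAttempts times per item.
     * Items that still fail are skipped; more than skipLimit skips aborts the run.
     */
    static <I, O> List<O> processAll(List<I> items,
                                     Function<I, O> processor,
                                     int maxAttempts,
                                     int skipLimit) {
        List<O> results = new ArrayList<>();
        int skipped = 0;
        for (I item : items) {
            boolean done = false;
            for (int attempt = 1; attempt <= maxAttempts && !done; attempt++) {
                try {
                    results.add(processor.apply(item));
                    done = true;
                } catch (RuntimeException e) {
                    // Failure: fall through and try again while attempts remain
                }
            }
            if (!done) {
                skipped++; // attempts exhausted, give up on this item
                if (skipped > skipLimit) {
                    throw new IllegalStateException("Skip limit exceeded: " + skipLimit);
                }
            }
        }
        return results;
    }
}
```

Calling processAll(Arrays.asList("1", "x", "3"), Integer::parseInt, 3, 10) returns [1, 3]: the unparsable "x" is attempted three times and then skipped. With a skip limit of 0 the same input would abort instead.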
Spring Batch provides robust error handling. A fault-tolerant step can retry failing items and skip those that keep failing:
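
    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory) {
        return stepBuilderFactory.get("step1")
                .<Student, StudentTotalScore>chunk(10)
                .reader(reader())
                .processor(processor())
                .writer(writer())
                .faultTolerant()
                .skipLimit(10)           // fail the step once more than 10 items are skipped
                .skip(Exception.class)   // broad for demo purposes; prefer specific exception types
                .retryLimit(3)           // retry limit per item
                .retry(Exception.class)  // likewise, narrow this to transient failures in practice
                .build();
    }

6. Extending to real-world use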
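6.1 Database integration
In real projects the data often lives in a database. Spring Batch supports several ways of reading from one, for example a JDBC cursor: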
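
    // Needs JdbcCursorItemReader (org.springframework.batch.item.database),
    // JdbcCursorItemReaderBuilder (...item.database.builder),
    // BeanPropertyRowMapper (org.springframework.jdbc.core), and javax.sql.DataSource

    @Bean
    public JdbcCursorItemReader<Student> databaseReader(DataSource dataSource) {
        return new JdbcCursorItemReaderBuilder<Student>()
                .name("databaseReader")
                .dataSource(dataSource)
                .sql("SELECT id, chinese, math, english FROM student_scores")
                // maps columns to Student properties by name
                .rowMapper(new BeanPropertyRowMapper<>(Student.class))
                .build();
    }

6.2 Scheduled execution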
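Combine the job with Spring's scheduling support to run it on a timer. The method below assumes that @EnableScheduling is present on a configuration class and that jobLauncher (a JobLauncher) and calculateStudentScores (the Job) are injected fields of the enclosing bean: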
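
    @Scheduled(cron = "0 0 2 * * ?") // every day at 02:00
    public void runBatchJob() throws JobExecutionException {
        // A unique parameter makes each launch a new job instance
        JobParameters jobParameters = new JobParametersBuilder()
                .addLong("time", System.currentTimeMillis())
                .toJobParameters();
        jobLauncher.run(calculateStudentScores, jobParameters);
    }

Set spring.batch.job.enabled=false if the job should not also run once at application startup.

6.3 Multi-step jobs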
More complex workflows may need several steps executed in sequence:
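
    @Bean
    public Job complexScoreJob(JobBuilderFactory jobs, Step step1, Step step2) {
        return jobs.get("complexScoreJob")
                .start(step1)
                .next(step2) // step2 is a second Step bean defined elsewhere; runs after step1 completes
                .build();
    }

7. Troubleshooting common problems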
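In practice you may run into the following typical problems:
Version compatibility:
- Spring Boot 2.3.x manages Spring Batch 4.2.x. Newer Spring Batch releases may be incompatible; Spring Batch 5.x, for example, requires Spring Framework 6 and Java 17
- Use the version combination given in this article
File paths:
- Make sure the output directory exists (or can be created) and is writable by the application
- Absolute paths are more reliable than paths relative to the working directory
Job repository and transactions:
- Without a DataSource bean, Spring Batch 4.x falls back to an in-memory, Map-based JobRepository, so job metadata is lost on restart
- spring-boot-starter-batch pulls in spring-boot-starter-jdbc, so Spring Boot tries to auto-configure a DataSource at startup. If that fails, add an embedded database such as H2 or exclude DataSourceAutoConfiguration
- In production, configure a database to store the batch metadata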
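
    @Bean
    public DataSource dataSource() {
        // Placeholder from the original article: build and return your DataSource here.
        // With Spring Boot you can usually omit this bean and set spring.datasource.* instead.
        throw new UnsupportedOperationException("DataSource not configured");
    }

    @Bean
    public BatchConfigurer configurer(DataSource dataSource) {
        // Stores job and step metadata in the given database
        return new DefaultBatchConfigurer(dataSource);
    }

Performance bottlenecks: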
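- Watch memory usage when processing large data volumes
- Consider paged or cursor-based reading instead of loading everything at once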
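The paged-reading advice above can be sketched in plain Java as follows. The sketch only shows the control flow of pulling fixed-size pages until the source runs dry. PagedReadSketch and the fetchPage callback are hypothetical names; in a Spring Batch job a reader such as JdbcPagingItemReader would normally take this role.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

/** Hypothetical illustration of paged reading; only one page is held in memory at a time. */
class PagedReadSketch {

    /**
     * Calls fetchPage(offset, pageSize) repeatedly and passes each non-empty page
     * to handler. Stops on an empty or short page. Returns the number of rows seen.
     */
    static <T> int forEachPage(BiFunction<Integer, Integer, List<T>> fetchPage,
                               int pageSize,
                               Consumer<List<T>> handler) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int offset = 0;
        int total = 0;
        while (true) {
            List<T> page = fetchPage.apply(offset, pageSize);
            if (page.isEmpty()) {
                break; // no more rows
            }
            handler.accept(page);
            total += page.size();
            if (page.size() < pageSize) {
                break; // short page: this was the last one
            }
            offset += pageSize;
        }
        return total;
    }
}
```

With 25 rows and a page size of 10, the handler receives pages of 10, 10, and 5 rows. A cursor-based reader, like the JdbcCursorItemReader in section 6.1, avoids the repeated queries by streaming rows from a single open result set.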
8. Complete project source structure
The key files of the complete project and their roles:

    src/main/java/
    ├── com/example/batchdemo/
    │   ├── BatchDemoApplication.java          # application entry point
    │   ├── config/
    │   │   └── BatchConfig.java               # batch configuration
    │   ├── model/
    │   │   ├── Student.java                   # student model
    │   │   └── StudentTotalScore.java         # total score model
    │   └── processor/
    │       └── StudentScoreProcessor.java     # business processor
    src/main/resources/
    ├── application.yml                        # application configuration
    └── student-data.csv                       # test data

Complete code of the key configuration class:
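
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.core.launch.support.RunIdIncrementer;
    import org.springframework.batch.item.ItemProcessor;
    import org.springframework.batch.item.file.FlatFileItemReader;
    import org.springframework.batch.item.file.FlatFileItemWriter;
    import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
    import org.springframework.batch.item.file.builder.FlatFileItemWriterBuilder;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.core.io.Resource;

    @Configuration
    @EnableBatchProcessing
    public class BatchConfig {

        @Autowired
        private JobBuilderFactory jobBuilderFactory;

        @Autowired
        private StepBuilderFactory stepBuilderFactory;

        @Value("classpath:student-data.csv")
        private Resource inputResource;

        @Value("file:./output/student-scores.csv")
        private Resource outputResource;

        // Reads one comma-separated line at a time and maps it onto a Student bean
        @Bean
        public FlatFileItemReader<Student> reader() {
            return new FlatFileItemReaderBuilder<Student>()
                    .name("studentItemReader")
                    .resource(inputResource)
                    .delimited()
                    // explicit array: compiles whether or not names() is varargs in your Batch version
                    .names(new String[] {"id", "chinese", "math", "english"})
                    .targetType(Student.class) // builder shortcut for a BeanWrapperFieldSetMapper
                    .build();
        }

        @Bean
        public ItemProcessor<Student, StudentTotalScore> processor() {
            return new StudentScoreProcessor();
        }

        // Writes each result as "id,totalScore"
        @Bean
        public FlatFileItemWriter<StudentTotalScore> writer() {
            return new FlatFileItemWriterBuilder<StudentTotalScore>()
                    .name("studentScoreWriter")
                    .resource(outputResource)
                    .delimited() // builder shortcut for a DelimitedLineAggregator
                    .delimiter(",")
                    .names(new String[] {"id", "totalScore"})
                    .build();
        }

        @Bean
        public Step calculateStep() {
            return stepBuilderFactory.get("calculateStep")
                    .<Student, StudentTotalScore>chunk(10) // write after every 10 items
                    .reader(reader())
                    .processor(processor())
                    .writer(writer())
                    .build();
        }

        @Bean
        public Job calculateStudentScoresJob() {
            return jobBuilderFactory.get("calculateStudentScoresJob")
                    .incrementer(new RunIdIncrementer())
                    .flow(calculateStep())
                    .end()
                    .build();
        }
    }

After running the project, you will find the processed score file in the output folder under the project root (the working directory). Its content looks like this: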

    student-1,271
    student-2,283
    student-3,288
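The expected totals are easy to cross-check without starting Spring. The plain-Java sketch below applies the same arithmetic as StudentScoreProcessor to the three sample rows. ScoreTotalsCheck is a hypothetical helper written for this check and is not part of the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-alone check of the sample data; not part of the batch project. */
class ScoreTotalsCheck {

    /** Turns "id,chinese,math,english" into "id,total". */
    static String toTotalLine(String csvLine) {
        String[] fields = csvLine.trim().split(",");
        if (fields.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + csvLine);
        }
        int total = 0;
        for (int i = 1; i < fields.length; i++) {
            total += Integer.parseInt(fields[i].trim()); // sum the three subject scores
        }
        return fields[0].trim() + "," + total;
    }

    /** Converts every input line, preserving order. */
    static List<String> toTotalLines(List<String> csvLines) {
        List<String> result = new ArrayList<>();
        for (String line : csvLines) {
            result.add(toTotalLine(line));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "student-1,90,85,96",
                "student-2,92,97,94",
                "student-3,95,93,100");
        // Prints student-1,271 / student-2,283 / student-3,288, one per line
        toTotalLines(input).forEach(System.out::println);
    }
}
```

If the batch output differs from what this check prints, the usual suspects are the column order passed to names() in the reader or a stale output file from an earlier run.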