My problem is that I have to pass multiple files to a Spring Batch reader, and the readers run in parallel. If we use grid-size = 100, there will be 100 threads, which is not practical. What is the way to solve this issue, i.e. process many files with a limited number of threads?
@Bean
public Step orderStep1() throws IOException {
    return stepBuilderFactory.get("orderStep1")
            .partitioner("slaveStep", partitioner())
            .step(slaveStep())
            .gridSize(100)
            .taskExecutor(taskExecutor())
            .build();
}
The task executor is:
@Bean
public TaskExecutor taskExecutor() {
    return new SimpleAsyncTaskExecutor();
}
and the partitioner is:
public Map<String, ExecutionContext> partition(int gridSize) {
    Map<String, ExecutionContext> partitionData = new HashMap<>();
    for (int i = 0; i < gridSize; i++) {
        ExecutionContext executionContext = new ExecutionContext();
        executionContext.putString("file", fileList.get(i)); // passing one file per partition
        executionContext.putString("name", "Thread" + i);
        partitionData.put("partition: " + i, executionContext);
    }
    return partitionData;
}
The files are passed dynamically to the reader through the stepExecutionContext.
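For reference, a step-scoped reader that picks up its file from the step execution context could look like the sketch below. The bean name, the use of PassThroughLineMapper (one String item per line), and the item type are assumptions; your actual line mapping depends on the file format.

```java
@Bean
@StepScope
public FlatFileItemReader<String> orderReader(
        @Value("#{stepExecutionContext['file']}") String file) {
    // Each partition gets its own reader instance, bound to the single
    // file that the partitioner put into that partition's execution context.
    return new FlatFileItemReaderBuilder<String>()
            .name("orderReader")
            .resource(new FileSystemResource(file))
            .lineMapper(new PassThroughLineMapper()) // assumption: one item per line
            .build();
}
```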
"if we use grid-size = 100 then there will be 100 threads which is not logical"
Grid size and thread pool size are two different things: you can have 100 partitions to process but only 10 worker threads available. The issue in your case is that you are using a SimpleAsyncTaskExecutor, which does not reuse threads (see its Javadoc). A new thread is created for each partition, so you end up with 100 threads for the 100 partitions.
"what is the way to solve this issue i.e. process many files with limited number of threads"
Consider using a ThreadPoolTaskExecutor so you can limit the number of worker threads.
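For example, the following bean caps the step at 10 concurrent worker threads while still allowing 100 partitions; the pool sizes and thread name prefix here are illustrative, not required values:

```java
@Bean
public TaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    taskExecutor.setCorePoolSize(10);   // threads kept in the pool
    taskExecutor.setMaxPoolSize(10);    // hard cap on worker threads
    taskExecutor.setQueueCapacity(100); // remaining partitions wait here for a free thread
    taskExecutor.setThreadNamePrefix("partition-");
    taskExecutor.initialize();
    return taskExecutor;
}
```

With this executor, the 100 partitions are queued and processed 10 at a time instead of each spawning its own thread.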