1. Spring Batch processes items in chunks. Chunk processing
allows streaming data instead of loading all the data in memory. By default, chunk
processing is single-threaded and usually performs well.
Using chunk processing, Spring Batch collects items one at a time from
the item reader into a configurable-sized chunk. Spring Batch then sends the chunk
to the item writer and goes back to using the item reader to create another chunk,
and so on, until the input is exhausted.
If the process method of an ItemProcessor returns null, processing for
that item stops: Spring Batch filters the item out and never passes it to the
item writer, so the item isn't inserted in the database.
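As a rough sketch of where the processor sits in a chunk-oriented step, the fragment below wires a reader, a processor, and a writer together. The bean ids reader, processor, and writer are placeholders, and the batch namespace is assumed to be the default namespace of the file.

```xml
<!-- Hypothetical bean ids: reader, processor, writer -->
<step id="readWriteProducts">
  <tasklet>
    <!-- Items are read one at a time, passed through the processor, and
         written in chunks of 100. Items for which the processor returns
         null are not handed to the writer. -->
    <chunk reader="reader" processor="processor" writer="writer"
           commit-interval="100" />
  </tasklet>
</step>
```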
2. First, the size of a chunk and the commit interval are the same thing!
Our recommendation is a value between 10 and 200.
3. For this test, you use the volatile job repository implementation. It’s perfect for testing
and prototyping because it stores execution metadata in memory.
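One way to declare an in-memory job repository is sketched below. It assumes the volatile implementation meant here is the map-based MapJobRepositoryFactoryBean, and that your Spring Batch version still ships that class (it was deprecated in later releases). The bean ids are illustrative, and plain bean elements assume the Spring beans namespace.

```xml
<!-- No real transactions are needed for in-memory metadata -->
<bean id="transactionManager"
      class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

<!-- Stores job and step execution metadata in memory; it is lost when the JVM exits -->
<bean id="jobRepository"
      class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
  <property name="transactionManager" ref="transactionManager" />
</bean>

<bean id="jobLauncher"
      class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
  <property name="jobRepository" ref="jobRepository" />
</bean>
```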
4. To be able to refer to job parameters, a bean must use the Spring Batch step scope.
The step scope means that Spring will create the bean only when the step asks for it
and that values will be resolved then (this is the lazy instantiation pattern; the bean
isn’t created during the Spring application context’s bootstrapping).
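A minimal sketch of a step-scoped bean that reads a job parameter could look like this. The parameter name inputFile is an assumption chosen for illustration, and the reader's other required properties are left out.

```xml
<bean id="reader"
      class="org.springframework.batch.item.file.FlatFileItemReader"
      scope="step">
  <!-- inputFile is a hypothetical job parameter name; the expression is
       resolved when the step creates the bean, not at context startup -->
  <property name="resource" value="file:#{jobParameters['inputFile']}" />
  <!-- lineMapper and other properties omitted -->
</bean>
```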
4. Assuming you can live with skipping some records instead of failing the whole job,
you can make the step skip up to five lines that raise a FlatFileParseException:
<step id="readWriteProducts">
  <tasklet>
    <chunk reader="reader" writer="writer" commit-interval="100"
           skip-limit="5">
      <skippable-exception-classes>
        <include class="org.springframework.batch.item.file.FlatFileParseException" />
      </skippable-exception-classes>
    </chunk>
  </tasklet>
</step>
5. Job attributes
restartable: Whether the job can be restarted. The default is true.
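For illustration, a job that must never be restarted could set the attribute explicitly, as in the sketch below. The job id importProducts and the reader and writer bean ids are placeholders.

```xml
<!-- restartable="false" overrides the default of true -->
<job id="importProducts" restartable="false">
  <step id="readWriteProducts">
    <tasklet>
      <chunk reader="reader" writer="writer" commit-interval="100" />
    </tasklet>
  </step>
</job>
```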
6. Spring Batch provides a default implementation of the JobParametersValidator
interface, the DefaultJobParametersValidator class, which suits most use cases. This class
allows you to specify which parameters are required and which are optional.
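The sketch below shows one possible way to declare such a validator and attach it to a job. The parameter names inputFile and timestamp and the job id importProducts are invented for the example. The bean element assumes the Spring beans namespace, and the job element assumes the batch namespace.

```xml
<bean id="jobParametersValidator"
      class="org.springframework.batch.core.job.DefaultJobParametersValidator">
  <!-- Hypothetical parameter names -->
  <property name="requiredKeys">
    <list><value>inputFile</value></list>
  </property>
  <property name="optionalKeys">
    <list><value>timestamp</value></list>
  </property>
</bean>

<!-- Attach the validator to a job -->
<job id="importProducts">
  <validator ref="jobParametersValidator" />
  <!-- steps omitted -->
</job>
```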
7. Chunk attributes
commit-interval: Number of items to process before issuing a commit. When
the number of items read reaches the commit interval number,
the entire corresponding chunk is written out through the item
writer and the transaction is committed.
8. JdbcCursorItemReader
The minimal set of properties to use a JdbcCursorItemReader is dataSource, sql,
and rowMapper.
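A minimal configuration with just those three properties might look like the following sketch. The SQL statement, the product table, and the com.example.ProductRowMapper class are assumptions made for the example.

```xml
<bean id="productItemReader"
      class="org.springframework.batch.item.database.JdbcCursorItemReader">
  <property name="dataSource" ref="dataSource" />
  <!-- Hypothetical query against a hypothetical product table -->
  <property name="sql" value="select id, name, price from product" />
  <property name="rowMapper" ref="productRowMapper" />
</bean>

<!-- Hypothetical RowMapper implementation that maps one row to one item -->
<bean id="productRowMapper" class="com.example.ProductRowMapper" />
```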
Choosing between cursor-based and page-based item readers
Cursor-based readers issue one query to the database
and stream the data to avoid consuming too much memory. Cursor-based
readers rely on the cursor implementation of the database and of the JDBC driver.
Depending on your database engine and on the driver, cursor-based readers may or may not work well.
Page-based readers work well with an appropriate page size. In this case, retrieving data
consists of successively executing several requests with criteria. Spring Batch dynamically
builds requests to execute based on a sort key to delimit data for a page. To retrieve each
page, Spring Batch executes one request to retrieve the corresponding data.
The size of a page should be around 1,000 items (this is a rule of thumb). The page size
is usually higher than the commit interval (whose reasonable values range from 10
to 200 items). Remember, the point of paging is to avoid consuming too much memory,
so large pages aren’t good. Small pages aren’t good either. If you read 1 million
items in pages of 10 items (a small page size), you’ll send 100,000 queries to the
database.
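As a sketch of a page-based reader that follows the rule of thumb above, the fragment below uses JdbcPagingItemReader with a query provider built from a select clause, a from clause, and a sort key. The table, columns, and bean ids are illustrative assumptions.

```xml
<bean id="productPagingReader"
      class="org.springframework.batch.item.database.JdbcPagingItemReader">
  <property name="dataSource" ref="dataSource" />
  <property name="queryProvider">
    <!-- Builds the paging queries for the database behind the data source -->
    <bean class="org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean">
      <property name="dataSource" ref="dataSource" />
      <property name="selectClause" value="select id, name, price" />
      <property name="fromClause" value="from product" />
      <!-- The sort key delimits one page from the next -->
      <property name="sortKey" value="id" />
    </bean>
  </property>
  <!-- 1 million items at 1,000 per page means about 1,000 queries -->
  <property name="pageSize" value="1000" />
  <property name="rowMapper" ref="productRowMapper" />
</bean>
```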
9. JmsItemReader
The Spring Batch class
JmsItemReader implements the ItemReader interface and internally uses the Spring
JmsTemplate class.
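A possible configuration is sketched below. The connectionFactory bean, the queue name productQueue, and the timeout value are assumptions for the example.

```xml
<bean id="jmsTemplate" class="org.springframework.jms.core.JmsTemplate">
  <property name="connectionFactory" ref="connectionFactory" />
  <!-- Hypothetical queue name -->
  <property name="defaultDestinationName" value="productQueue" />
  <!-- A finite timeout (in milliseconds) lets the reader return null,
       which ends the input, when no message arrives in time -->
  <property name="receiveTimeout" value="500" />
</bean>

<bean id="jmsItemReader" class="org.springframework.batch.item.jms.JmsItemReader">
  <property name="jmsTemplate" ref="jmsTemplate" />
</bean>
```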