Use Spring batch to load csv files into Oracle database:
1. Need set the below for oracle database (to create the tables job repository needs, may check the contents in core spring batch jar)
<jdbc:initialize-database data-source="dataSource">
<jdbc:script location="org/springframework/batch/core/schema-drop-oracle10g.sql" />
<jdbc:script location="org/springframework/batch/core/schema-oracle10g.sql" />
</jdbc:initialize-database>
2. Need set <property name="isolationLevelForCreate" value="ISOLATION_READ_COMMITTED" /> so that there's no below error:
Spring Batch ORA-08177: can't serialize access for this transaction when running single job, SERIALIZED isolation level
3. May set the skip lines for csv files:
<property name="linesToSkip" value="1"/>
4. csv file location could be specified by classpath or file way:
<property name="resource" value="classpath:...">
<property name="resource" value="file:c:/...">
5. Load 2G csv files with 41 million lines, it took 221 minutes. Much slower than SQL*Loader but should be good enough for most projects
The performance for Spring Batch is about 2700/s while Oracle SQL*Loader is above 100k/s. Oracle SQL*Loader (direct path) is 40 time faster than Spring Batch.
No comments:
Post a Comment