

Spring Batch Hello World example- Write data from csv to xml file | JavaInUse


In this post we create a simple Spring Batch job that reads data from a CSV file and writes it to an XML file.
Spring Batch is a lightweight, comprehensive batch framework designed to enable the development of robust batch applications vital for the daily operations of enterprise systems. Spring Batch builds upon the productivity, POJO-based development approach, and general ease of use capabilities people have come to know from the Spring Framework, while making it easy for developers to access and leverage more advanced enterprise services when necessary.
Consider an environment where users have to do a lot of batch processing. This is quite different from a typical web application, which has to work 24/7. In classic environments it is not unusual to do the heavy lifting during the night, when there are no regular users on the system. Batch processing includes typical tasks such as reading and writing files, transforming data, reading from or writing to databases, creating reports, and importing and exporting data. Often these steps have to be chained together, or you have to build more complex workflows that define which job steps can run in parallel and which must run sequentially. That is where a framework like Spring Batch comes in handy.


How does Spring Batch work?


  • Step - A domain object that encapsulates an independent, sequential phase of a batch job. One implementation, JobStep, is a Step that delegates to a Job to do its work. This is a great tool for managing dependencies between jobs, and also for modularising complex step logic into something that is testable in isolation. The job is executed with parameters that can be extracted from the step execution, so this step can also be used as the worker in a parallel or partitioned execution.
  • ItemReader - Strategy interface for providing the data. Implementations are expected to be stateful and will be called multiple times for each batch, with each call to read() returning a different value and finally returning null when all input data is exhausted. Implementations need not be thread-safe, and clients of an ItemReader need to be aware that this is the case. A richer interface (e.g. with a look ahead or peek) is not feasible because we need to support transactions in an asynchronous batch.
  • ItemProcessor - Interface for item transformation. Given an item as input, this interface provides an extension point for applying business logic in an item-oriented processing scenario. Although it is possible to return a different type than the one provided, it is not strictly necessary. Furthermore, returning null indicates that the item should not be processed any further.
  • ItemWriter - Basic interface for generic output operations. A class implementing this interface is responsible for serializing objects as necessary. Generally, it is the responsibility of the implementing class to decide which technology to use for mapping and how it should be configured. The write method is responsible for making sure that any internal buffers are flushed. If a transaction is active, it will also usually be necessary to discard the output on a subsequent rollback. The resource to which the writer is sending data should normally be able to handle this itself.
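
The way a reader, a processor and a writer cooperate inside a chunk-oriented step can be sketched in plain Java. This is a simplified illustration and not Spring Batch code: the class ChunkLoopSketch, the method runStep and the sample input are hypothetical, java.util.function types stand in for the Spring Batch interfaces, and transactions, restarts and error handling are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for a chunk-oriented step; not the Spring Batch implementation.
public class ChunkLoopSketch {

    /**
     * Reads up to commitInterval items, processes each one, and passes the
     * surviving items to the writer as one chunk. Repeats until the reader
     * returns null. Returns the number of chunks handed to the writer.
     */
    static <I, O> int runStep(Supplier<I> reader, Function<I, O> processor,
                              Consumer<List<O>> writer, int commitInterval) {
        int chunksWritten = 0;
        boolean exhausted = false;
        while (!exhausted) {
            // Read phase: collect at most commitInterval items.
            List<I> items = new ArrayList<>();
            while (items.size() < commitInterval) {
                I item = reader.get();
                if (item == null) {          // null signals that the input is exhausted
                    exhausted = true;
                    break;
                }
                items.add(item);
            }
            // Process phase: a null result filters the item out.
            List<O> outputs = new ArrayList<>();
            for (I item : items) {
                O processed = processor.apply(item);
                if (processed != null) {
                    outputs.add(processed);
                }
            }
            // Write phase: the whole chunk is written in one call.
            if (!outputs.isEmpty()) {
                writer.accept(outputs);
                chunksWritten++;
            }
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        Iterator<String> input = List.of("a", "", "b", "c").iterator();
        Supplier<String> reader = () -> input.hasNext() ? input.next() : null;
        Function<String, String> processor = s -> s.isEmpty() ? null : s.toUpperCase();
        Consumer<List<String>> writer = chunk -> System.out.println("Writing " + chunk);

        int chunks = runStep(reader, processor, writer, 2);
        System.out.println("Chunks written: " + chunks);
    }
}
```

With a commit interval of 2, the empty string is dropped by the processor, so the writer receives [A] and then [B, C]. In this sketch the commit interval counts items read, not items written.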

    Let's begin-

    The Maven project we will be creating is structured as follows-

    [Image: batch_1-1, Maven project structure]
    The pom.xml with spring batch dependencies is as follows-
    	<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    	<modelVersion>4.0.0</modelVersion>
    	<groupId>com.mkyong</groupId>
    	<artifactId>spring-batch-helloworld</artifactId>
    	<packaging>jar</packaging>
    	<version>1.0-SNAPSHOT</version>
    	<name>SpringBatchExample</name>
    	<url>http://maven.apache.org</url>
    
    	<properties>
    		<spring.version>3.2.2.RELEASE</spring.version>
    		<spring.batch.version>2.2.0.RELEASE</spring.batch.version>
    		<mysql.driver.version>5.1.25</mysql.driver.version>
    		<junit.version>4.11</junit.version>
    	</properties>
    
    	<dependencies>
    
    		<dependency>
    			<groupId>org.springframework</groupId>
    			<artifactId>spring-core</artifactId>
    			<version>${spring.version}</version>
    		</dependency>
    
    		<dependency>
    			<groupId>org.springframework</groupId>
    			<artifactId>spring-jdbc</artifactId>
    			<version>${spring.version}</version>
    		</dependency>
    
    		<dependency>
    			<groupId>org.springframework</groupId>
    			<artifactId>spring-oxm</artifactId>
    			<version>${spring.version}</version>
    		</dependency>
    
    		<dependency>
    			<groupId>mysql</groupId>
    			<artifactId>mysql-connector-java</artifactId>
    			<version>${mysql.driver.version}</version>
    		</dependency>
    
    		<dependency>
    			<groupId>org.springframework.batch</groupId>
    			<artifactId>spring-batch-core</artifactId>
    			<version>${spring.batch.version}</version>
    		</dependency>
    		<dependency>
    			<groupId>org.springframework.batch</groupId>
    			<artifactId>spring-batch-infrastructure</artifactId>
    			<version>${spring.batch.version}</version>
    		</dependency>
    
    	</dependencies>
    
    </project>
    

    The CSV file employee-data.csv to be converted to XML is as follows-

    [Image: batch_1-4, contents of employee-data.csv]
    Define the model class Employee. We will map the CSV values to this model class-
    package com.javainuse.model;
    
    import javax.xml.bind.annotation.XmlRootElement;
    
    @XmlRootElement(name = "employee")
    public class Employee {
    
    	private String employeeId;
    	private String employeeName;
    
    	public String getEmployeeId() {
    		return employeeId;
    	}
    
    	public void setEmployeeId(String employeeId) {
    		this.employeeId = employeeId;
    	}
    
    	public String getEmployeeName() {
    		return employeeName;
    	}
    
    	public void setEmployeeName(String employeeName) {
    		this.employeeName = employeeName;
    	}
    }
    

    Define the CustomEmployeeProcessor which is executed before the writer.
    package com.javainuse;
    
    import org.springframework.batch.item.ItemProcessor;
    
    import com.javainuse.model.Employee;
    
    public class CustomEmployeeProcessor implements
    		ItemProcessor<Employee, Employee> {
    
    	@Override
    	public Employee process(Employee employee) throws Exception {
    
    		System.out.println("Processing..." + employee);
    		System.out.println(employee.getEmployeeName());
    		return employee;
    	}
    
    }
    

    Define the EmployeeFieldSetMapper. This is used to map the csv fields to the Employee class.
    package com.javainuse;
    
    import org.springframework.batch.item.file.mapping.FieldSetMapper;
    import org.springframework.batch.item.file.transform.FieldSet;
    import org.springframework.validation.BindException;
    
    import com.javainuse.model.Employee;
    
    public class EmployeeFieldSetMapper implements FieldSetMapper<Employee> {
    
    	@Override
    	public Employee mapFieldSet(FieldSet fieldSet) throws BindException {
    
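    		// Read by the column names declared on the DelimitedLineTokenizer
    		// (empId, empName) so each value lands in the matching property.
    		Employee employee = new Employee();
    		employee.setEmployeeId(fieldSet.readString("empId"));
    		employee.setEmployeeName(fieldSet.readString("empName"));
    		return employee;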
    
    	}
    
    }
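
To make the tokenizer and mapper roles more concrete, the following plain-Java sketch mimics what happens to a single line: it is split on commas and each token is labelled with a column name, which is roughly what the mapper then reads from the FieldSet. The class LineMappingSketch, the method tokenize and the sample line "1,Alice" are hypothetical and are not taken from employee-data.csv. The sketch also ignores quoting, which the real DelimitedLineTokenizer handles.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of "tokenize, then map by name"; not Spring Batch code.
public class LineMappingSketch {

    /** Splits a comma-delimited line and labels each token with the matching column name. */
    static Map<String, String> tokenize(String line, String... names) {
        String[] tokens = line.split(",", -1);   // -1 keeps trailing empty fields
        if (tokens.length != names.length) {
            throw new IllegalArgumentException(
                    "Expected " + names.length + " fields but found " + tokens.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            fields.put(names[i], tokens[i].trim());  // trimming is a simplification of this sketch
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample line is an assumption made for this sketch.
        Map<String, String> fields = tokenize("1,Alice", "empId", "empName");
        System.out.println("empId=" + fields.get("empId")
                + ", empName=" + fields.get("empName"));
    }
}
```

Looking fields up by name rather than by position keeps the mapping correct even if the column order in the tokenizer configuration changes.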
    

    The database configuration file, batch-database.xml, will be as follows-
    <beans xmlns="http://www.springframework.org/schema/beans"
    	xmlns:jdbc="http://www.springframework.org/schema/jdbc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xsi:schemaLocation="http://www.springframework.org/schema/beans 
    		http://www.springframework.org/schema/beans/spring-beans-3.2.xsd
    		http://www.springframework.org/schema/jdbc 
    		http://www.springframework.org/schema/jdbc/spring-jdbc-3.2.xsd">
    
    	<bean id="dataSource"
    		class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    		<property name="driverClassName" value="com.mysql.jdbc.Driver" />
    		<property name="url" value="jdbc:mysql://localhost/batchdb" />
    		<property name="username" value="root" />
    		<property name="password" value="root" />
    	</bean>
    
    	<bean id="transactionManager"
    		class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
    
    	<jdbc:initialize-database data-source="dataSource">
    		<jdbc:script location="org/springframework/batch/core/schema-drop-mysql.sql" />
    		<jdbc:script location="org/springframework/batch/core/schema-mysql.sql" />
    	</jdbc:initialize-database>
    
    </beans>
    

    The batch-context.xml file for defining the batch context will be as follows-
    <beans xmlns="http://www.springframework.org/schema/beans"
    	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xsi:schemaLocation="
    		http://www.springframework.org/schema/beans 
    		http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">
    
    	<bean id="jobRepository"
    		class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
    		<property name="dataSource" ref="dataSource" />
    		<property name="transactionManager" ref="transactionManager" />
    		<property name="databaseType" value="mysql" />
    	</bean>
    
    	<bean id="transactionManager"
    		class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
    
    	<bean id="jobLauncher"
    		class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    		<property name="jobRepository" ref="jobRepository" />
    	</bean>
    
    </beans>
    
    The batch-job-hello-world.xml file for defining the batch job will be as follows-
    <beans xmlns="http://www.springframework.org/schema/beans"
    	xmlns:batch="http://www.springframework.org/schema/batch" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xsi:schemaLocation="http://www.springframework.org/schema/batch
    		http://www.springframework.org/schema/batch/spring-batch-2.2.xsd
    		http://www.springframework.org/schema/beans 
    		http://www.springframework.org/schema/beans/spring-beans-3.2.xsd
    	">
    
    	<import resource="batch-context.xml" />
    	<import resource="batch-database.xml" />
    
    	<bean id="employee" class="com.javainuse.model.Employee" />
    	<bean id="itemEmployeeProcessor" class="com.javainuse.CustomEmployeeProcessor" />
    
    	<batch:job id="helloWorldBatchJob">
    		<batch:step id="step1">
    			<batch:tasklet>
    				<batch:chunk reader="cvsFileItemReader" writer="xmlItemWriter"
    					processor="itemEmployeeProcessor" commit-interval="10">
    				</batch:chunk>
    			</batch:tasklet>
    		</batch:step>
    	</batch:job>
    
    	<bean id="cvsFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    
    		<property name="resource" value="classpath:employee-data.csv" />
    
    		<property name="lineMapper">
    			<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
    				<property name="lineTokenizer">
    					<bean
    						class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
    						<property name="names" value="empId,empName" />
    					</bean>
    				</property>
    				<property name="fieldSetMapper">
    					<bean class="com.javainuse.EmployeeFieldSetMapper" />
    				</property>
    			</bean>
    		</property>
    
    	</bean>
    
    	<bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
    		<property name="resource" value="file:c://xml/outputs/employee.xml" />
    		<property name="marshaller" ref="empMarshaller" />
    		<property name="rootTagName" value="employee" />
    	</bean>
    
    	<bean id="empMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
    		<property name="classesToBeBound">
    			<list>
    				<value>com.javainuse.model.Employee</value>
    			</list>
    		</property>
    	</bean>
    
    
    </beans>
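
The writer configuration above names a root tag and a marshaller, so the output file should consist of one root element with one child element per Employee. The plain-Java StAX sketch below only illustrates that nesting. It is not the StaxEventItemWriter or JAXB output itself: the class XmlShapeSketch, the method toXml and the sample values are hypothetical, and the order of the child elements is an assumption.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch of the expected document shape; not Spring Batch code.
public class XmlShapeSketch {

    /** Writes a root element and one <employee> child per {id, name} pair. */
    static String toXml(String rootTag, String[][] employees) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement(rootTag);          // corresponds to rootTagName
            for (String[] employee : employees) {
                xml.writeStartElement("employee");   // corresponds to @XmlRootElement(name = "employee")
                xml.writeStartElement("employeeId");
                xml.writeCharacters(employee[0]);
                xml.writeEndElement();
                xml.writeStartElement("employeeName");
                xml.writeCharacters(employee[1]);
                xml.writeEndElement();
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample values are assumptions made for this sketch.
        System.out.println(toXml("employee", new String[][] { { "1", "Alice" } }));
    }
}
```

Because rootTagName and the @XmlRootElement name are both "employee", the root and its children share the same tag. A distinct root name would make the file easier to read, but that is a design choice and not something the tutorial requires.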
    

    Finally run the Batch job as follows-
    package com.javainuse;
    
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobExecution;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.context.ApplicationContext;
    import org.springframework.context.support.ClassPathXmlApplicationContext;
    
    public class App {
    	public static void main(String[] args) {
    
    		String[] springConfig = { "spring/batch/jobs/batch-job-hello-world.xml" };
    
    		ApplicationContext context = new ClassPathXmlApplicationContext(
    				springConfig);
    
    		JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
    		Job job = (Job) context.getBean("helloWorldBatchJob");
    
    		try {
    			JobExecution execution = jobLauncher.run(job, new JobParameters());
    			System.out.println("Batch Job status--" + execution.getStatus());
    		} catch (Exception e) {
    			e.printStackTrace();
    		}
    
    		System.out.println("Batch complete");
    
    	}
    }
    

    Running the application produces the following output-

    [Image: batch_1-2, console output of the batch run]
    The XML file employee.xml is also created in c://xml/outputs/-

    [Image: batch_1-3, generated employee.xml]
