Bulk and Batch imports with Spring Boot and the CrudRepository

27.12.2017

This article describes how to handle ids so that bulk or batch inserts become possible, using either the Persistable interface or an id generator. There are two ways to import entities in bulk through the CrudRepository, without using the EntityManager directly (as the article Bulk and Batch imports with Spring Boot using the EntityManager does). In the first approach, the Item class implements the Persistable interface. In the second approach, the Item class specifies a generator that creates the ids.

In both approaches the same Repository is used:

package org.hameister.bulk.generator;

import org.springframework.data.repository.CrudRepository;
import org.springframework.stereotype.Repository;

/**
 * Created by hameister on 27.12.17.
 */
@Repository
public interface ItemBulkImporterRepository extends CrudRepository<Item, String> {
}

For a bulk import, multiple SQL inserts have to be sent to the database together as a batch rather than one at a time. For this to be possible, each Item must have an id from which it is clear that the entity is new, so that SimpleJpaRepository calls persist() and not merge(). By default, Spring Data JPA treats an entity without a version attribute as new only when its id is null, so an Item with a manually assigned id would be passed to merge().

Code snippet from the SimpleJpaRepository:

	@Transactional
	public <S extends T> S save(S entity) {

		if (entityInformation.isNew(entity)) {
			em.persist(entity);
			return entity;
		} else {
			return em.merge(entity);
		}
	}
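The branch above can be tried out without Spring or JPA. The following is a minimal sketch in plain Java, not the actual Spring Data code: NewAware, RecordingEntityManager and SaveDecisionSketch are hypothetical stand-ins, and the isNew rule is a simplification (the entity decides itself if it can, otherwise a null id means "new").

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Spring Data's Persistable; only isNew() matters here.
interface NewAware {
    boolean isNew();
}

// Hypothetical stand-in for the EntityManager that only records which method was called.
class RecordingEntityManager {
    final List<String> calls = new ArrayList<>();

    void persist(Object entity) {
        calls.add("persist");
    }

    <T> T merge(T entity) {
        calls.add("merge");
        return entity;
    }
}

class SaveDecisionSketch {
    // Simplified rule (assumption): the entity decides itself if it can,
    // otherwise a null id means "new".
    static boolean isNew(Object entity, Object id) {
        if (entity instanceof NewAware) {
            return ((NewAware) entity).isNew();
        }
        return id == null;
    }

    // Same branch as in save(): persist new entities, merge everything else.
    // Returns the name of the call that was recorded.
    static String save(RecordingEntityManager em, Object entity, Object id) {
        if (isNew(entity, id)) {
            em.persist(entity);
        } else {
            em.merge(entity);
        }
        return em.calls.get(em.calls.size() - 1);
    }

    public static void main(String[] args) {
        RecordingEntityManager em = new RecordingEntityManager();
        NewAware flagged = () -> true;
        System.out.println(save(em, new Object(), "id-1")); // assigned id, no flag
        System.out.println(save(em, new Object(), null));   // id not generated yet
        System.out.println(save(em, flagged, "id-2"));      // assigned id, flag says "new"
    }
}
```

Only the first call ends in merge: an assigned id without any further hint looks like an existing entity. The other two cases correspond to the two approaches described in this article.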

If the approach via Persistable is used, the Item class implements this interface:

package org.hameister.bulk.persistable;

import org.springframework.data.domain.Persistable;

import javax.persistence.*;

/**
 * Created by hameister on 27.12.17.
 */
@Entity
@Table(name = "Item")
public class Item implements Persistable<String> {

    @Id
    String id;

    @Column(name = "description")
    private String description;

    @Column(name = "location")
    private String location;

    @Transient
    private boolean isNew = true;

    public Item() {
    }

    @PostPersist
    @PostLoad
    void markNotNew() {
        this.isNew = false;
    }

    @Override
    public boolean isNew() {
        return isNew;
    }

    // getId() (required by Persistable), further getters, setters, equals...

}

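The class uses a transient field isNew, which Spring Data's EntityInformation consults to determine whether the entity is new. The @PostPersist and @PostLoad callbacks reset the flag once the entity has been saved or loaded. Thanks to Oliver Gierke for the tip.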
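The life cycle of the flag can be illustrated without a JPA provider. This is a minimal sketch, assuming a hypothetical FlaggedItem class that mirrors the Item above without annotations; the call to markNotNew() is made by hand here, whereas in the real entity the JPA provider invokes it through the callbacks.

```java
// Hypothetical plain-Java stand-in for the Item entity, without JPA annotations.
class FlaggedItem {
    private final String id;
    private boolean isNew = true; // transient in the real entity, never stored

    FlaggedItem(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }

    boolean isNew() {
        return isNew;
    }

    // In the real entity the JPA provider calls this via @PostPersist / @PostLoad.
    void markNotNew() {
        this.isNew = false;
    }
}

class FlagLifecycleSketch {
    public static void main(String[] args) {
        FlaggedItem item = new FlaggedItem("item-1");
        System.out.println("before save: isNew = " + item.isNew());

        item.markNotNew(); // simulates the callback after a successful persist

        System.out.println("after save:  isNew = " + item.isNew());
    }
}
```

Because the flag starts as true, a freshly constructed Item is persisted even though its id is already set. After saving or loading, the same instance reports false, so a later save() goes through merge().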

The other approach is to define a generator in the Item class that causes the id to be set:

package org.hameister.bulk.generator;

import org.hibernate.annotations.GenericGenerator;

import javax.persistence.*;

/**
 * Created by hameister on 27.12.17.
 */
@Entity
@Table(name = "Item")
public class Item {

    @Id
    @GeneratedValue(generator = "uuid2")
    @GenericGenerator(name = "uuid2", strategy = "uuid2")
    String id;

    @Column(name = "description")
    private String description;

    @Column(name = "location")
    private String location;

    public Item() {
    }

// getter, setter, equals

}
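With a generator, the id is still null when save() is called, so the entity counts as new and the id is only assigned during persist. The following minimal sketch imitates that in plain Java. GeneratedIdItem and UuidGeneratorSketch are hypothetical names, and the use of java.util.UUID.randomUUID() is an assumption about what the "uuid2" strategy does by default, not a copy of Hibernate's code.

```java
import java.util.UUID;

// Hypothetical plain-Java stand-in for the Item entity with a generated id.
class GeneratedIdItem {
    String id; // null until the entity is persisted
}

class UuidGeneratorSketch {
    // Stand-in for persist(): the id is created in memory, without asking the database.
    static void persist(GeneratedIdItem item) {
        if (item.id == null) {
            item.id = UUID.randomUUID().toString();
        }
    }

    public static void main(String[] args) {
        GeneratedIdItem item = new GeneratedIdItem();
        System.out.println("new before persist: " + (item.id == null));

        persist(item);

        System.out.println("generated id: " + item.id);
    }
}
```

Since the id is produced in memory, no database round trip per row is needed to obtain it, which keeps the inserts eligible for batching.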

Both approaches ensure that persist() is called instead of merge(), which makes a bulk import possible. Many thanks to Michael Simons for the valuable hints and the quick help!

The complete source code for both approaches can be found on GitHub under SpringBootBulkImportPersistable and SpringBootBulkImportGenerator.