Entries in ORM (2)

Tuesday
Sep 03, 2013

Don't Rely on EntityManager.persist() for Immediate Insert

I had always counted on EntityManager's persist() method to immediately insert entities. I would rely on this when writing database integration tests - I'd persist some records, then test my DAO methods to find them.

On my current project, I decided to add a configuration option that lets me run my database integration tests against my development Oracle database rather than my embedded HSQLDB test database - just for an extra sanity check. The tests that tried to persist() new entities and then retrieve them failed. Adding an entityManager.flush() call after the persist() invocations solved the issue.

...But why?

From en.wikibooks.org:

The EntityManager.persist() operation is used to insert a new object into the database. persist does not directly insert the object into the database, it just registers it as new in the persistence context (transaction). When the transaction is committed, or if the persistence context is flushed, then the object will be inserted into the database. If the object uses a generated Id, the Id will normally be assigned to the object when persist is called, so persist can also be used to have an object's Id assigned. The one exception is if IDENTITY sequencing is used, in this case the Id is only assigned on commit or flush because the database will only assign the Id on INSERT. If the object does not use a generated Id, you should normally assign its Id before calling persist.

Here's how I wire up my entities' primary key:

@Id
@GeneratedValue(strategy = GenerationType.AUTO)
private Long id;

For my embedded HSQLDB database, GenerationType.AUTO resolves to GenerationType.IDENTITY, which relies on the database to generate an auto-incrementing primary key for the row. Generating that key requires an insert, so persist() inserts immediately in HSQLDB.

Oracle, on the other hand, uses GenerationType.SEQUENCE, backed by a single sequence shared across tables. Getting the next ID from it doesn't require an insert, only the following SELECT:

select
    hibernate_sequence.nextval
from
    dual

This SELECT runs immediately on persist() so that the EntityManager has an ID to assign to the entity. The entity itself is only inserted on flush(), which is called automatically on transaction commit.
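To make the difference concrete, here is a toy model in plain Java. It is only a sketch of the behavior described above, not Hibernate's actual implementation: ToyTable, ToyPersistenceContext, ToyStrategy and ToyEntity are hypothetical names I made up for illustration, and the "database" is just an in-memory map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a database table; not a real JDBC connection. */
class ToyTable {
    private final Map<Long, String> rows = new LinkedHashMap<>();
    private long identityCounter = 0; // backs the IDENTITY strategy
    private long sequenceCounter = 0; // backs the SEQUENCE strategy

    /** IDENTITY: the key only comes into existence as part of the insert. */
    long insertWithIdentity(String value) {
        long id = ++identityCounter;
        rows.put(id, value);
        return id;
    }

    /** SEQUENCE: the next key can be fetched without touching the rows. */
    long nextSequenceValue() {
        return ++sequenceCounter;
    }

    void insert(long id, String value) {
        rows.put(id, value);
    }

    boolean contains(long id) {
        return rows.containsKey(id);
    }
}

enum ToyStrategy { IDENTITY, SEQUENCE }

class ToyEntity {
    Long id; // null until persist() assigns it
    final String name;

    ToyEntity(String name) {
        this.name = name;
    }
}

/** Hypothetical, heavily simplified model of a persistence context. */
class ToyPersistenceContext {
    private final ToyTable table;
    private final ToyStrategy strategy;
    private final List<ToyEntity> pendingInserts = new ArrayList<>();

    ToyPersistenceContext(ToyTable table, ToyStrategy strategy) {
        this.table = table;
        this.strategy = strategy;
    }

    void persist(ToyEntity entity) {
        if (strategy == ToyStrategy.IDENTITY) {
            // The only way to learn the ID is to insert the row right now.
            entity.id = table.insertWithIdentity(entity.name);
        } else {
            // The ID comes from the sequence; the insert itself is deferred.
            entity.id = table.nextSequenceValue();
            pendingInserts.add(entity);
        }
    }

    /** Writes every queued insert to the table, then empties the queue. */
    void flush() {
        for (ToyEntity entity : pendingInserts) {
            table.insert(entity.id, entity.name);
        }
        pendingInserts.clear();
    }
}

class ToyPersistDemo {
    public static void main(String[] args) {
        ToyTable hsqldbLike = new ToyTable();
        ToyPersistenceContext identityContext =
                new ToyPersistenceContext(hsqldbLike, ToyStrategy.IDENTITY);
        ToyEntity a = new ToyEntity("alice");
        identityContext.persist(a);
        System.out.println(hsqldbLike.contains(a.id)); // prints true

        ToyTable oracleLike = new ToyTable();
        ToyPersistenceContext sequenceContext =
                new ToyPersistenceContext(oracleLike, ToyStrategy.SEQUENCE);
        ToyEntity b = new ToyEntity("bob");
        sequenceContext.persist(b);
        System.out.println(oracleLike.contains(b.id)); // prints false
        sequenceContext.flush();
        System.out.println(oracleLike.contains(b.id)); // prints true
    }
}
```

In both cases the entity has an ID as soon as persist() returns. Only with the IDENTITY-style strategy is the row actually in the table at that point, which is why my tests passed on HSQLDB and failed on Oracle.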

Long story short: If you're relying on your entity existing in the database after your call to persist(), but before the transaction commits, then call flush() first. Leave a comment justifying it, as manually calling flush() is widely considered an anti-pattern akin to invoking the garbage collector. Deferring the flush gives Hibernate the chance to batch its writes more efficiently.

Tuesday
Sep 03, 2013

Object/Relational Mapping: Know Your Frameworks

I've been working with Hibernate for several years now, yet I learn something new about it all the time. The more time I spend with the framework, the more concerned I am about how it will be used by developers new to it.

Mirko Novakovic and Alois Reitbauer nail it in a post about O/R Mapping Anti-Patterns:

The simplicity of the entrance into the world of O/R mapping however gives a wrong impression of the complexity of these frameworks. Working with more complex applications you soon realize that you should know the details of framework implementation to be able to use them in the best possible way. In this article, we describe some common anti-patterns which may easily lead to performance problems.

This echoes Joel Spolsky's warning in The Law of Leaky Abstractions:

The law of leaky abstractions means that whenever somebody comes up with a wizzy new code-generation tool that is supposed to make us all ever-so-efficient, you hear a lot of people saying "learn how to do it manually first, then use the wizzy tool to save time." Code generation tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don't save us time learning.

Don't stop learning about a framework once you figure out how to use it - that's only the beginning.