
Sunday, Nov 10, 2013

Performance Testing Hibernate Query Approaches

I've known entities are expensive, but wanted to see for myself, so I built this test project to run some benchmarks. The tests aren't complete - just a simple query with no joins, but with a lot of data. The point was to see how much overhead is introduced after we have the query results.

About the Project

This is a simple Maven Spring project that creates an in-memory HSQLDB database, populates it with 500,000 records, and then uses several Hibernate query strategies to fetch every one and report on their average execution times.

Approaches Tested

  1. Using a JpaRepository interface's findAll method to return a list of attached Hibernate entities.

  2. Using Hibernate's StatelessSession interface to return a list of detached Hibernate entities.

  3. Selecting the specific fields of the entity, using Hibernate to return a simple List, and then manually converting that list to a list of detached entities (as DTOs, basically).

  4. Selecting the specific fields of the entity, then using Hibernate's AliasToBeanResultTransformer to build a list of detached entities (as DTOs, basically).
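
To make the difference between approaches 3 and 4 concrete, here is a minimal JDK-only sketch of the two row-to-DTO styles. It is not the code from the test project and it does not call Hibernate: PersonDto, RowMapping, and the id/name columns are hypothetical names, and the reflective method only imitates the general idea of populating a bean by alias.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical DTO; the class and its two fields are assumptions for illustration.
final class PersonDto {
    private Long id;
    private String name;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

final class RowMapping {

    // Manual style: each row is an Object[] in select-clause order (id, name).
    static List<PersonDto> mapManually(List<Object[]> rows) {
        List<PersonDto> result = new ArrayList<>(rows.size());
        for (Object[] row : rows) {
            PersonDto dto = new PersonDto();
            dto.setId((Long) row[0]);
            dto.setName((String) row[1]);
            result.add(dto);
        }
        return result;
    }

    // Reflective style: resolve one setter per column alias, then invoke it for every row.
    static List<PersonDto> mapReflectively(List<Object[]> rows, String... aliases) {
        try {
            Method[] setters = new Method[aliases.length];
            for (int i = 0; i < aliases.length; i++) {
                setters[i] = findSetter(aliases[i]);
            }
            List<PersonDto> result = new ArrayList<>(rows.size());
            for (Object[] row : rows) {
                PersonDto dto = new PersonDto();
                for (int i = 0; i < setters.length; i++) {
                    setters[i].invoke(dto, row[i]);
                }
                result.add(dto);
            }
            return result;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not populate PersonDto", e);
        }
    }

    // Finds the single-argument setter matching an alias, e.g. "name" -> setName.
    private static Method findSetter(String alias) throws NoSuchMethodException {
        String name = "set" + Character.toUpperCase(alias.charAt(0)) + alias.substring(1);
        for (Method method : PersonDto.class.getMethods()) {
            if (method.getName().equals(name) && method.getParameterCount() == 1) {
                method.setAccessible(true);
                return method;
            }
        }
        throw new NoSuchMethodException(name);
    }
}
```

Both methods produce the same list of plain objects; the only difference is whether the column-to-property wiring is written by hand or looked up by name.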

Changing Execution Parameters

By default, the database is loaded with 500,000 records, and each test is repeated in its own transaction 10 times. You can change both of these values in src/main/resources/application.properties.

Running Tests

A database this large takes up over 256MB of memory, so you might have to increase your heap space. If you run the tests from Maven, you should be fine, since I increase it in the plugin's settings.

Download the sample project and run the following command from inside its directory:

mvn clean test

The test might take several minutes to run. At the end, the test will output the results.

My Results

The results are listed slowest to fastest:

  ---------------------------
  Testing JpaRepository query
  Total # of runs: 10
  JpaRepository avg time: 1073.5ms

  ---------------------------
  Testing stateless session query
  Total # of runs: 10
  Stateless Session avg time: 818.5ms

  ---------------------------
  Testing RowData query
  Total # of runs: 10
  RowData avg time: 317.7ms

  ---------------------------
  Testing ResultTransformer query
  Total # of runs: 10
  ResultTransformer avg time: 311.9ms

The individual times will vary on different systems - the relative performance is what's important.

The StatelessSession query was a little more efficient than returning attached entities, but still pretty slow for this big query. The AliasToBeanResultTransformer and my custom List -> DTO approaches tied as the best performers. I was hoping for this result, but worried that the reflective nature of AliasToBeanResultTransformer might have introduced some overhead. It did not.

Problems? Let me know!

I tried being as careful as I could with these tests:

  • took the average of several test runs
  • turned off Hibernate's second-level cache
  • turned off Hibernate's query cache
  • cleared the entity manager before each run
  • ran each test in its own transaction
  • ignored the first query in the transaction so as to avoid any initial performance hit from opening it
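
The averaging and warm-up steps in that list can be sketched with nothing but the JDK. This is only an illustration of the idea, not the project's actual harness: QueryTimer and averageMillis are hypothetical names, and the Runnable stands in for whatever query is being measured.

```java
// Hypothetical timing helper; not taken from the test project.
final class QueryTimer {

    // Runs the query once untimed as a warm-up, then returns the
    // average wall-clock time in milliseconds over the timed runs.
    static double averageMillis(Runnable query, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        query.run(); // warm-up run, deliberately excluded from the average

        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }
}
```

A call such as QueryTimer.averageMillis(() -> repository.findAll(), 10) would time ten runs after one discarded run, where repository is whatever object executes the query under test.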

I encourage you to download the project, take a look at the code, and try it out for yourself. If you see any issues with my methodology, please let me know, and I'll correct for it.

Friday, Nov 8, 2013

Exercising Caution With Hibernate Entities

Take a look at my hibernate-perf-test sample project to see the relative performance of different Hibernate query strategies.

Background

I heavily rely on the Hibernate framework on all of my database-driven Java projects. It’s a full-featured cross-database ORM framework capable of managing your database schema, persisting your entities to the database, populating entities from the database, caching queries, caching entities, registering entities for lifecycle events, indexing them in a search index, and just about everything else you could ask of an ORM.

When you fetch an entity from Hibernate, it remains attached to your EntityManager for the duration of that transaction. The EntityManager is responsible for all of the magic - it watches the entity for changes, persisting it and calling lifecycle methods when appropriate, makes sure to return the same instance of an entity for multiple queries to the same record, builds and executes additional queries if you access properties that point to associations in the database, and of course much more.

Entities are Expensive!

Entities are convenient, but if you don’t understand what they’re doing, you can get yourself into trouble. It all boils down to the fact that an entity’s getter method may run several more queries, so long as the entity is still attached to the EntityManager. A developer can be excused at first, because we’re used to getters and setters containing little to no logic besides accessing a private member variable. Hibernate is an amazing framework, but its greatest strength, being easy to use, becomes a liability: it’s too easy to use it to thrash your database.

The most common and expensive errors that I see are due to:

Accidental “eager” wiring of associated entities

When you load a Customer that has a list of Orders that are wired “eagerly”, Hibernate will build and execute a query to fetch all of those Orders, even if the code never accesses that list. The Orders can have their own eager associations, which compounds this problem. This becomes performance-crippling when your User table is eagerly self-referencing, and just about every query loads up your entire database. This is the sort of issue you might not notice during development, with 10 records in your database, but believe me, it shows itself in production.

Too many queries by accessing an association while looping over entities

When you’re displaying a table of 20 User records, and each one needs the name of their Organization, the easiest path for a developer is to loop over the list of entities and access the “organization” property on each User. This is intuitive, and makes sense with how we think about objects - you have a User, the User has an Organization, and the Organization has a name. However, with Hibernate, you need to understand that the Organization isn’t loaded until you access it. By fetching each User’s Organization this way, you’re producing at least 20 more queries. Now, imagine if the user chooses a table size of 50 rows, or if Organization had some eagerly-loading associations to surprise you with (like another User record!). You’ve just killed your action.

This comes up a lot when a developer tries to do the right thing and adds a lot of logging. Even if we don’t need the Organization for the data table we’re building, the developer might think that someone that reads the logs might want to see each User’s organization. Those logs just became very expensive, and nobody will notice until your users are complaining about page load times in production.
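
The query count in the table example is easy to see in a small simulation. The sketch below uses no Hibernate at all: LazyRef is a hypothetical stand-in for a lazily loaded association, and QueryCounter simply counts how many times the pretend database is hit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Counts simulated round trips to the database.
final class QueryCounter {
    int count;
}

// Hypothetical stand-in for a lazy association: loads on first access only.
final class LazyRef<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyRef(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}

final class SimUser {
    final String name;
    final LazyRef<String> organizationName;

    SimUser(String name, LazyRef<String> organizationName) {
        this.name = name;
        this.organizationName = organizationName;
    }
}

final class NPlusOneDemo {

    // One simulated query returns the users; their organizations stay unloaded.
    static List<SimUser> findUsers(int n, QueryCounter counter) {
        counter.count++;
        List<SimUser> users = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            final int id = i;
            users.add(new SimUser("user" + i, new LazyRef<>(() -> {
                counter.count++; // one extra simulated query per user
                return "org" + id;
            })));
        }
        return users;
    }

    // Builds a "table" by touching each user's organization, as in the loop described above.
    static int queriesForTable(int rows) {
        QueryCounter counter = new QueryCounter();
        for (SimUser user : findUsers(rows, counter)) {
            user.organizationName.get();
        }
        return counter.count;
    }
}
```

In this simulation a 20-row table costs 21 queries and a 50-row table costs 51, which is the linear growth the loop-and-access pattern produces.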

Yes, there are better ways to use entities

If you’re familiar with the framework, you’re probably shaking your head, mumbling something about “fetch joins”, closing the transaction before building your view, and other strategies for more efficiently fetching entities. My point is that as much as I love entities, they’re performance time bombs. It takes one careless moment, or one developer that doesn’t know Hibernate as well as you do, to touch the wrong property and cause a big issue to ripple through your system.

Querying Carefully

Rather than maintain constant vigilance and make sure that every member of your team has read all of the documentation, I prefer to tread lightly with Hibernate. I fully embrace it for helping me generate my schema, for updating the database in an infrequent-write system, and for the query language that bridges different databases.

There are several ways you can use Hibernate to query your database, from full-on magic entities to getting your hands dirty for better efficiency. In a system where reads are common and writes rare, my approach is typically to use as little ‘magic’ as I can for my read-only queries. I don’t fetch attached entities from the database, but rather, select the specific fields that I need, and use the results to build detached data transfer objects (DTOs).

Aside from avoiding the big issues above, this also forces a developer to think more in terms of SQL, and where their data is coming from - it’s more explicit, and thus, more understandable. If you have a “dumb” User DTO, and want the name of that User’s Organization, you’re going to have to either run a query on each User, query them in bulk and then join the two sets of data, or add the Organization name to the original User query. It’s immediately obvious that the first option is ridiculous, and that the second one is annoying to write. The last one takes the least effort, and makes sense when you’re thinking in terms of SQL and database access.

Here’s another point to consider. Since it’s typically considered poor form to return your entities to the client or view, why not avoid the entity-to-DTO conversion and generate the DTOs in the first place?

Coming Up: Performance Testing Different Query Approaches

There are several ways to use Hibernate to fetch data from your database, and no shortage of conversations online about the performance of these different approaches, as well as official documentation on the subject.

In my next post, I’ll walk you through a sample project that I wrote to test out different query strategies. If you’re interested, take a look for yourself. If you see an error in my methodology, please contact me.

Wednesday, Oct 30, 2013

Asserting Exceptions With JUnit Rules: IsEqual Matcher

In my first post on asserting exceptions with JUnit’s Rules feature, I showed how to test that a specific type of exception is thrown with expected substrings in the error message. In my second post, I showed how we can write a custom Matcher to inspect the contents of the exception. In this post, I’ll show you how to take advantage of the stock IsEqual matcher to accomplish the same task, but with less work.

IsEqual Matcher

This Matcher is straightforward - it evaluates whether two objects are equal by calling equals(Object) on one of the objects that isn’t null, passing in the other. So, to use it with our custom exception, we’ll need to make sure that our equals(Object) method correctly evaluates two of its instances.

The default behavior of equals(Object obj) is to check if two objects are the same instance:

The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true).

This won’t help us here, because we’ll be comparing exceptions thrown in our code with one we instantiated in our unit test. We’re going to have to implement equals(Object) ourselves.

Implementing equals(Object)

Here’s my simple custom exception:

You have to be careful when implementing either equals(Object) or hashCode(). The rule is that if you implement either of these methods, then you need to implement both. Two objects that are equal must have the same hash code. If they don’t, and you try to add two distinct but equal instances of a class as keys to a HashMap, they’d both be accepted. This is because internally, the Map stores the items in “buckets” based on their hash codes, and then only has to check equality within a bucket. If two objects have different hash codes, then they won’t be compared to each other.

Your IDE should have the ability to generate what we need here. If you’re using Eclipse (I recommend the STS version), right-click in the source file, select “Source”, and then select “Generate hashCode() and equals()…”

After selecting that option, choose which private members will be used in the two methods. I recommend selecting “Use blocks in ‘if’ statements” in order to help wrong code look wrong, should someone modify these methods down the road.

Here’s our final ErrorCodeException class with the newly generated code:
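
As a rough sketch of what such a class can look like, assuming a single int errorCode field and a RuntimeException superclass (both are assumptions, since only the class name is given here), with equals(Object) and hashCode() written in the style Eclipse typically generates:

```java
// Sketch only: the field, constructor, and superclass are assumptions.
class ErrorCodeException extends RuntimeException {

    private static final long serialVersionUID = 1L;

    private final int errorCode;

    public ErrorCodeException(int errorCode) {
        super("Error code: " + errorCode);
        this.errorCode = errorCode;
    }

    public int getErrorCode() {
        return errorCode;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + errorCode;
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        ErrorCodeException other = (ErrorCodeException) obj;
        if (errorCode != other.errorCode) {
            return false;
        }
        return true;
    }
}
```

With both methods in place, two separately constructed exceptions carrying the same error code are equal, share a hash code, and collapse to a single entry in a HashSet.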

Verifying equals(Object) and hashCode()

Even though we generated this code, we still need to test it. Here’s the test fixture for ErrorCodeException:

Using the IsEqual Matcher In Our Unit Tests

Now that we’ve implemented equals(Object) and hashCode() for our custom exception, we can use the IsEqual Matcher to set up an expectation for a specific exception.

In the first test, I create an IsEqual Matcher with the exception that I want to compare the thrown exception to. No custom Matcher was required, and my custom exception is now more useful because of it.

In my second test, I include the “old way” of checking exceptions to demonstrate how much easier and more readable exception tests are when using JUnit’s Rules feature.

Tuesday, Oct 29, 2013

Asserting Exceptions With JUnit Rules: Custom Matchers

In my previous post, I demonstrated how to use JUnit’s Rules feature to assert expected exceptions in your unit tests. In this post, I’ll show you how to write custom Matchers that will help give you more power when inspecting your exceptions.

Maven Dependencies

This demo uses the following Maven dependencies:

Custom Exception

We’ll start with a custom exception which does little more than remember the error code at the time the exception was thrown.

Our Exception Matcher

When we pass our Matcher to JUnit’s ExpectedException instance, we’re given a chance to match the exception itself, not the message. In this case, we’re going to write a Matcher that makes sure that the exception’s error code was as expected. We can only match on an instance of our ErrorCodeException, so we’ll save some effort and extend TypeSafeMatcher.

From the documentation:

TypeSafeMatcher : Convenient base class for Matchers that require a non-null value of a specific type. This simply implements the null check, checks the type and then casts.

Example Tests With ExpectedException and Our Custom Matcher

With the components in place, let’s start testing. Of course, if you only have one or two error test cases, then a custom Matcher might take more work than it saves, but you end up with code that any developer should be able to read, which might reduce maintenance costs.

Asserting Exceptions The Old Way

To demonstrate the benefits of using custom Matchers with JUnit’s ExpectedException, here are the alternatives that you’re probably familiar with, with inline comments explaining why they’re not ideal.

Monday, Oct 28, 2013

Asserting Exception Messages With JUnit Rules

If you’re not familiar with JUnit’s @Rule feature for asserting exceptions in your tests, then read on - you’re about to start using it.

Assert Exception Type

It’s very simple to assert that a given type of exception is thrown in a JUnit test case with the following:

Assert Exception Message (The Old Way)

But, what if you want to be more specific, and check the message itself? I’ve always done the following:
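
For illustration, the try/catch style of message checking generally looks something like the sketch below. PortParser and its message text are hypothetical, and an AssertionError is thrown directly where a JUnit test would call fail() or an assert method, so that the sketch runs on the JDK alone.

```java
// Hypothetical code under test.
final class PortParser {
    static int parsePort(String text) {
        int port = Integer.parseInt(text);
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return port;
    }
}

final class OldStyleExceptionCheck {

    // The test body has to catch the exception, inspect the message by hand,
    // and remember to fail explicitly if nothing was thrown.
    static void rejectsOutOfRangePort() {
        try {
            PortParser.parsePort("70000");
        } catch (IllegalArgumentException e) {
            if (!e.getMessage().contains("out of range")) {
                throw new AssertionError("Unexpected message: " + e.getMessage());
            }
            return; // expected exception with the expected message
        }
        throw new AssertionError("Expected IllegalArgumentException was not thrown");
    }
}
```

The explicit failure after the try block is the part that is easiest to forget, and without it the check passes silently when no exception is thrown.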

Here’s another variant you’re probably familiar with:

Assert Exception Message With JUnit Rules

The above methods always felt like hacks. I recently came across JUnit’s @Rule feature, which saves tons of code and is much easier to read. You first define your public ExpectedException instance, and give it a @Rule annotation. Then, in each test case that wants to use it, you set what type of exception you’re expecting, and optionally a substring to look for in the exception message:

Since expectMessage is looking for substrings, you can use several of them to test more complicated exception messages:

More Advanced: Custom Matchers

In my next post, I’ll describe how to implement a custom Matcher for more complicated Exception assertions.