1. Introduction

1.1. Objectives

The PDF of the document is available |HERE|.

The examples in the document are available |HERE|.

The aim here is to explore the main concepts of data persistence using JPA (Java Persistence API). After reading this document and testing the examples, the reader should have acquired the foundations needed to continue on their own.

JPA is relatively new: it requires JDK 1.5 or later. The JPA layer has its place in a multi-tier architecture. Let’s consider a fairly common three-tier architecture:

Layer [1], referred to here as [ui] (User Interface), is the layer that interacts with the user via a Swing GUI, a console interface, or a web interface. Its role is to provide data from the user to layer [2] or to present data provided by layer [2] to the user.
Layer [2], referred to here as [business], is the layer that applies the so-called business rules—that is, the application’s specific logic—without concerning itself with where the data it receives comes from or where the results it produces go.
Layer [3], referred to here as [DAO] (Data Access Object), is the layer that provides layer [2] with pre-stored data (files, databases, etc.) and stores some of the results provided by layer [2].
The [JDBC] layer is the standard layer used in Java to access databases. This is commonly referred to as the DBMS’s JDBC driver.
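The separation between the [ui], [business], and [DAO] layers can be sketched in plain Java. This is only a minimal illustration: the names `PersonDao`, `PersonService`, and `InMemoryPersonDao` are hypothetical and do not come from the examples of this document, and the [DAO] layer here reads from memory rather than from a database through JDBC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LayersSketch {

    /** [DAO] layer: supplies stored data to the [business] layer. */
    interface PersonDao {
        List<String> findAllNames();
    }

    /** [business] layer: applies a business rule, unaware of where the data comes from. */
    interface PersonService {
        List<String> namesInUpperCase();
    }

    /** Hypothetical in-memory DAO standing in for a real data source (file, database). */
    static class InMemoryPersonDao implements PersonDao {
        public List<String> findAllNames() {
            List<String> names = new ArrayList<String>();
            names.add("martin");
            names.add("durand");
            return names;
        }
    }

    /** Business implementation that only knows the PersonDao interface. */
    static class DefaultPersonService implements PersonService {
        private final PersonDao dao;

        DefaultPersonService(PersonDao dao) {
            this.dao = dao;
        }

        public List<String> namesInUpperCase() {
            List<String> result = new ArrayList<String>();
            for (String name : dao.findAllNames()) {
                result.add(name.toUpperCase(Locale.ROOT));
            }
            return result;
        }
    }

    /** [ui] layer: a console front end that talks only to the [business] layer. */
    public static void main(String[] args) {
        PersonService service = new DefaultPersonService(new InMemoryPersonDao());
        for (String name : service.namesInUpperCase()) {
            System.out.println(name);
        }
    }
}
```

Because each layer depends only on the interface of the layer below it, the in-memory DAO could later be replaced by one built on JDBC or JPA without touching the [business] or [ui] code.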

Numerous efforts have been made to make it easier for developers to write these different layers. Among these, JPA aims to simplify the development of the [DAO] layer, which manages so-called persistent data, hence the name of the API (Java Persistence API). One solution that has gained traction in recent years in this field is Hibernate:

The [Hibernate] layer sits between the [DAO] layer written by the developer and the [JDBC] layer. Hibernate is an ORM (Object-Relational Mapping) tool: it bridges the relational world of databases and the world of objects manipulated by Java. The [DAO] layer developer no longer sees the [JDBC] layer or the database tables whose content they wish to use. They see only the object representation of the database, provided by the [Hibernate] layer. The bridge between the database tables and the objects manipulated by the [DAO] layer is primarily established in two ways:

via XML-style configuration files
through Java annotations in the code, a technique available only since JDK 1.5
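To give a feel for the annotation technique, here is a toy sketch of how a tool can read annotations at runtime and derive SQL from them. It is an assumption-laden illustration only: `MappedTable` and `MappedColumn` are local stand-ins defined in the sketch itself, not the real JPA or Hibernate annotations, and the `Person` class and the generated statement are hypothetical. A real ORM does far more than this.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AnnotationMappingSketch {

    /** Stand-in annotation (NOT a JPA annotation): names the table a class maps to. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface MappedTable {
        String name();
    }

    /** Stand-in annotation (NOT a JPA annotation): names the column a field maps to. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface MappedColumn {
        String name();
    }

    /** Hypothetical persistent class described only through annotations. */
    @MappedTable(name = "PERSON")
    static class Person {
        @MappedColumn(name = "LAST_NAME")
        String lastName;

        @MappedColumn(name = "FIRST_NAME")
        String firstName;
    }

    /** Builds a SELECT statement from the annotations found on the given class. */
    static String selectStatementFor(Class<?> type) {
        MappedTable table = type.getAnnotation(MappedTable.class);
        if (table == null) {
            throw new IllegalArgumentException(type.getName() + " is not mapped to a table");
        }
        List<String> columns = new ArrayList<String>();
        for (Field field : type.getDeclaredFields()) {
            MappedColumn column = field.getAnnotation(MappedColumn.class);
            if (column != null) {
                columns.add(column.name());
            }
        }
        // Reflection does not guarantee field order, so sort for a stable result.
        Collections.sort(columns);
        StringBuilder sql = new StringBuilder("SELECT ");
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(columns.get(i));
        }
        return sql.append(" FROM ").append(table.name()).toString();
    }

    public static void main(String[] args) {
        System.out.println(selectStatementFor(Person.class));
    }
}
```

The point of the sketch is that the mapping information lives next to the Java code it describes, whereas the XML approach keeps the same information in a separate file.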

The [Hibernate] layer is an abstraction layer designed to be as transparent as possible. The ideal goal is for the [DAO] layer developer to be completely unaware that they are working with a database. This is feasible if they are not the one writing the configuration that bridges the relational world and the object world. Configuring this bridge is quite delicate and requires some experience.

The object layer [4], which mirrors the database, is called the "persistence context." A [DAO] layer based on Hibernate performs persistence operations (CRUD: create, read, update, delete) on the objects in the persistence context; these operations are translated by Hibernate into SQL statements. For database queries (SQL SELECT), Hibernate provides developers with a query language, HQL (Hibernate Query Language), which queries the persistence context [4] rather than the database itself.

Hibernate is popular but complex to master. The learning curve, often presented as easy, is actually quite steep. As soon as you have a database with tables featuring one-to-many or many-to-many relationships, configuring the relational-to-object bridge is beyond the capabilities of the average beginner. Configuration errors can then lead to poorly performing applications.

In the commercial world, there was a product equivalent to Hibernate called Toplink:

In light of the success of ORM products, Sun, the creator of Java, decided to standardize an ORM layer via a specification called JPA, which was released as part of Java EE 5 and requires Java 5 or later. The JPA specification was implemented by both Toplink and Hibernate. Toplink, which was a commercial product, has since become open-source. With JPA, the previous architecture becomes the following:

The [DAO] layer now interacts with the JPA specification, a set of interfaces. Developers have gained from this standardization. Previously, if a developer changed their ORM layer, they also had to change their [DAO] layer, which had been written to interact with a specific ORM. Now, they will write a [DAO] layer that interacts with a JPA layer. Regardless of which product implements the JPA layer, the interface presented to the [DAO] layer remains the same.

This document will present JPA examples in various domains:

First, we will focus on the relational/object bridge that the ORM layer builds. This will be created using Java 5 annotations for databases in which we find table relationships of the type:
- one-to-one
- one-to-many
- many-to-many

To illustrate this area, we will create the following test architectures:

Our test programs will be console applications that query the JPA layer directly. In doing so, we will explore the main methods of the JPA layer. We will be working in a so-called "Java SE" (Standard Edition) environment. JPA works in both Java SE and Java EE5 (Enterprise Edition) environments.

Once we have mastered both the configuration of the relational/object bridge and the use of JPA layer methods, we will return to a more traditional multi-layer architecture:

The [JPA] layer will be accessed via a two-tier architecture consisting of [business] and [DAO] layers. The Spring framework [7], followed by the JBoss EJB3 container, will be used to link these layers together.

We mentioned earlier that JPA is available in both SE and EE5 environments. The Java EE5 environment provides numerous services for accessing persistent data, including connection pools, transaction managers, and more. It may be beneficial for a developer to take advantage of these services. The Java EE5 environment is not yet widely adopted (May 2007). It is currently available on the Sun Application Server 9.x (Glassfish). An application server is essentially a web application server. If you build a standalone graphical application using Swing, you cannot use the EE environment and the services it provides. This is a problem. We are beginning to see "stand-alone" EE environments, i.e., those that can be used outside of an application server. This is the case with JBoss EJB3, which we will use in this document.

In an EE5 environment, the layers are implemented by objects called EJBs (Enterprise Java Beans). In previous versions of EE, EJBs (EJB 2.x) were considered difficult to implement and test, and sometimes underperformed. A distinction is made between EJB 2.x "entity" beans and EJB 2.x "session" beans. In short, an EJB 2.x "entity" corresponds to a database table row, and an EJB 2.x "session" is an object used to implement the [business] and [DAO] layers of a multi-layer architecture. One of the main criticisms of layers implemented with EJBs is that they can only be used within EJB containers, a service provided by the EE environment. This makes unit testing problematic. Thus, in the diagram above, unit testing of the [business] and [DAO] layers built with EJBs would require setting up an application server, a rather cumbersome operation that does not really encourage the developer to perform tests frequently.

The Spring framework was created in response to the complexity of EJB2. Spring provides, within an SE environment, a significant number of the services typically provided by EE environments. Thus, in the "Data Persistence" section that concerns us here, Spring provides the connection pools and transaction managers that applications require. The emergence of Spring has fostered a culture of unit testing, which suddenly became much easier to implement. Spring allows the implementation of application layers using standard Java objects (POJOs, Plain Old/Ordinary Java Objects), enabling their reuse in other contexts. Finally, it integrates numerous third-party tools fairly transparently, notably persistence tools such as Hibernate, iBatis, ...

Java EE 5 was designed to address the shortcomings of the previous EE specification. EJB 2.x has evolved into EJB 3. EJB 3 beans are POJOs carrying annotations that mark them as special objects when they are inside an EJB 3 container. Within the container, an EJB 3 bean can leverage the container’s services (connection pool, transaction manager, etc.). Outside the container, it is a standard Java object whose EJB annotations are ignored.

Above, we have depicted Spring and JBoss EJB3 as a possible infrastructure (framework) for our multi-tier architecture. It is this infrastructure that will provide the services we need: a connection pool and a transaction manager.

With Spring, the layers will be implemented using POJOs. These will access Spring’s services (connection pool, transaction manager) through dependency injection into these POJOs: when constructing them, Spring injects references to the services they will need.
JBoss EJB3 is an EJB container capable of running outside an application server. Its operating principle (from the developer’s perspective) is analogous to that described for Spring. We will find few differences.
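The injection principle described above can be sketched without any container. In this minimal illustration, the `wire` method plays the role that Spring (or the JBoss EJB3 container) would play. All names (`TransactionManager`, `RecordingTransactionManager`, `PersonDao`, `wire`) are hypothetical and are not Spring or JBoss APIs, and setter injection is an arbitrary choice made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class InjectionSketch {

    /** Hypothetical service that a container would normally provide. */
    interface TransactionManager {
        void begin();
        void commit();
    }

    /** Fake transaction manager that only records the calls it receives. */
    static class RecordingTransactionManager implements TransactionManager {
        final List<String> events = new ArrayList<String>();

        public void begin() {
            events.add("begin");
        }

        public void commit() {
            events.add("commit");
        }
    }

    /** POJO of the [DAO] layer: it never creates or looks up the service it uses. */
    static class PersonDao {
        private TransactionManager transactionManager;

        /** Injection point: the "container" hands over the service reference. */
        public void setTransactionManager(TransactionManager transactionManager) {
            this.transactionManager = transactionManager;
        }

        /** Stores a name in the given list (a stand-in for a table) inside a transaction. */
        public void save(String name, List<String> store) {
            transactionManager.begin();
            store.add(name);
            transactionManager.commit();
        }
    }

    /** Stand-in for the container: builds the POJO and injects what it needs. */
    static PersonDao wire(TransactionManager transactionManager) {
        PersonDao dao = new PersonDao();
        dao.setTransactionManager(transactionManager);
        return dao;
    }

    public static void main(String[] args) {
        RecordingTransactionManager transactionManager = new RecordingTransactionManager();
        PersonDao dao = wire(transactionManager);
        List<String> store = new ArrayList<String>();
        dao.save("martin", store);
        System.out.println(transactionManager.events + " " + store);
    }
}
```

Since `PersonDao` receives its transaction manager from outside, a unit test can inject a recording fake such as the one above, with no application server involved.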
We will conclude this document with an example of a three-tier web application—basic yet representative:

1.2. References

[ref1]: Java Persistence with Hibernate, by Christian Bauer and Gavin King, published by Manning.

[ref1] is the document that served as the basis for the following. It is a comprehensive book of over 800 pages on the use of the Hibernate ORM in two different contexts: with or without JPA. The use of Hibernate without JPA is indeed still relevant for developers using JDK 1.4 or earlier, as JPA did not appear until JDK 1.5.

Having read more than three-quarters of the book and skimmed the rest, it struck me that everything in it was useful. The experienced Hibernate user should be familiar with nearly all the information provided in the 800 pages. Christian Bauer and Gavin King have been thorough, but rarely to the point of describing situations one will never encounter. It’s all worth reading. The book is written in an educational style: there is a genuine effort to leave nothing in the dark. The fact that it was written for using Hibernate both with and without JPA poses a challenge for those interested in only one or the other of these technologies. For example, the authors describe, using numerous examples, the relational/object bridge in both contexts. The concepts used are very similar, since JPA was heavily inspired by Hibernate, but there are some differences. As a result, something that is true for Hibernate may no longer be true for JPA, and this ends up creating confusion for the reader.

The authors provide examples of three-tier applications in the context of an EJB3 container. They do not discuss Spring. We will see in an example that Spring is, however, simpler to use and has a broader scope than the JBoss EJB3 container used in [ref1]. Nevertheless, "Java Persistence with Hibernate" is an excellent book that I recommend for all the fundamentals it teaches about ORMs.

Using an ORM is complex for a beginner.

There are concepts to understand in order to configure the relational/object bridge.
There is the concept of the persistence context with its notions of objects in a "persisted," "detached," or "new" state
There are the mechanics surrounding persistence (transactions, connection pools), typically services provided by a container
There are performance-related settings to configure (second-level cache)
...
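As a first intuition for the object states mentioned in the list above, here is a toy model of a persistence context. It is a deliberately simplified sketch, not the behavior of Hibernate or Toplink: `ToyPersistenceContext` is a hypothetical class, no database is involved, and the rule applied to detached objects is an assumption made for the sketch. The method names `persist` and `clear` were chosen to echo methods of the same name on the JPA `EntityManager`.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class EntityStateSketch {

    /** The three states named in the text. */
    enum State { NEW, PERSISTED, DETACHED }

    /** Toy persistence context: it tracks object identity only. */
    static class ToyPersistenceContext {
        private final Set<Object> persisted =
                Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        private final Set<Object> detached =
                Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

        /** A new object enters the context and becomes "persisted". */
        void persist(Object entity) {
            if (detached.contains(entity)) {
                // Toy simplification: this sketch simply rejects detached objects.
                throw new IllegalStateException("detached object cannot be persisted again");
            }
            persisted.add(entity);
        }

        /** Emptying the context turns every persisted object into a detached one. */
        void clear() {
            detached.addAll(persisted);
            persisted.clear();
        }

        /** An object the context has never seen is "new". */
        State stateOf(Object entity) {
            if (persisted.contains(entity)) {
                return State.PERSISTED;
            }
            if (detached.contains(entity)) {
                return State.DETACHED;
            }
            return State.NEW;
        }
    }

    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        Object person = new Object();
        System.out.println(context.stateOf(person));
        context.persist(person);
        System.out.println(context.stateOf(person));
        context.clear();
        System.out.println(context.stateOf(person));
    }
}
```

The same Java object goes through all three states without changing class: its state is a matter of whether a persistence context is currently tracking it.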

We will introduce these concepts using examples. We will not delve deeply into the theory behind them. Our goal is simply, in each case, to enable the reader to understand the example and internalize it to the point where they can make modifications themselves or apply it in a different context.

1.3. Tools Used

The examples in this document use the following tools. Some are described in the appendices (download, installation, configuration, usage). In such cases, we give the section number.

JDK 1.6 (section 5.1)
Eclipse 3.2.2 Java development IDE (section 5.2)
Eclipse WTP (Web Tools Platform) plugin (section 5.2.3)
Eclipse SQL Explorer plugin (section 5.2.6)
Eclipse Hibernate Tools plugin (section 5.2.5)
Eclipse TestNG plugin (section 5.2.4)
Tomcat 5.5.23 servlet container (section 5.3)
Firebird 2.1 DBMS (section 5.4)
MySQL 5 DBMS (section 5.5)
PostgreSQL DBMS (section 5.6)
Oracle 10g Express DBMS (section 5.7)
SQL Server 2005 Express DBMS (section 5.8)
HSQLDB DBMS (section 5.9)
Apache Derby DBMS (section 5.10)
Spring 2.1 (section 5.11)
JBoss EJB3 container (section 5.12)

1.4. Downloading the examples

On this document’s website, the examples covered can be downloaded as a ZIP file, which, once extracted, creates the following folder:

in [1]: the directory structure of the examples
in [2]: the <annexes> folder contains items presented in the appendices (section 5). In particular, the <jdbc> folder contains the JDBC drivers for the DBMSs used in the tutorial examples.
in [3]: the <lib> folder groups the various .jar archives used by the tutorial into 5 folders
in [4]: the <lib/divers> folder contains the archives for:
- the JDBC drivers for the DBMSs
- the unit testing tool [TestNG]
- the logging tool [log4j]

in [5]: the archives for the JPA/Hibernate implementation and third-party tools required by Hibernate
in [6]: the archives for the JPA/TopLink implementation
in [7]: the Spring 2.x archives and third-party tools required by Spring
in [8]: the archives of the JBoss EJB3 container

in [9]: the <hibernate> folder contains examples using the JPA/Hibernate persistence layer
in [10]: the <hibernate/direct> folder contains examples where the JPA layer is used directly with a [Main]-type program.
in [11] and [12]: examples where the JPA layer is used via [business] and [DAO] layers in a multi-layer architecture, which is the standard use case. The services (connection pool, transaction manager) used by the [business] and [DAO] layers are provided either by Spring [11] or by JBoss EJB3 [12].

In [13]: The <toplink> folder includes the examples from the <hibernate> folder [9], but this time with a JPA/Toplink persistence layer instead of JPA/Hibernate. There is no <jbossejb3> folder in [13] because it was not possible to get an example working where the persistence layer is provided by Toplink and the services by the JBoss EJB3 container.
In [14]: a <web> folder contains three examples of web applications with a JPA persistence layer:
[15]: an example using Spring / JPA / Hibernate
[16]: the same example with Spring / JPA / Toplink
[17]: the same example with JBoss EJB3 / JPA / Hibernate. This example does not work, likely due to an unresolved configuration issue. It has nevertheless been included so that the reader can examine it and possibly find a solution to this problem.

The tutorial frequently refers to this directory structure, particularly when testing the examples covered. Readers are encouraged to download these examples and install them. Hereafter, we will refer to the directory structure described above as <examples>.

1.5. Eclipse Project Configuration for the Examples

The examples use "user" libraries. These are .jar archives grouped under a single name. When such a library is included in a Java project’s classpath, all the archives it contains are then included in that classpath. Let’s see how to do this in Eclipse:

in [1]: [Window / Preferences / Java / Build Path / User Libraries]
in [2]: create a new library
in [3]: give it a name and confirm

in [4]: select the JARs that will be part of the [jpa-divers] library
in [5]: select all JARs from the <examples>/lib/divers folder

in [6]: the user library [jpa-divers] has been defined
in [7]: repeat the same process to create 4 more libraries:

Library	Library JAR folder
`jpa-hibernate`	<examples>/lib/hibernate
`jpa-toplink`	<examples>/lib/toplink
`jpa-spring`	<examples>/lib/spring
`jpa-jbossejb3`	<examples>/lib/jbossejb3