Isolation and Database Locking (Enterprise JavaBeans)

8.3. Isolation and Database Locking

Transaction isolation (the "I" in ACID) is a critical part of any transactional system. This section explains isolation conditions, database locking, and transaction isolation levels. These concepts are important when deploying any transactional system.

8.3.1. Dirty, Repeatable, and Phantom Reads

Transaction isolation is defined in terms of isolation conditions called dirty reads, repeatable reads, and phantom reads. These conditions describe what can happen when two or more transactions operate on the same data.[2]

[2]Isolation conditions are covered in detail by the ANSI SQL-92 Specification, Document Number: ANSI X3.135-1992 (R1998).

To illustrate these conditions, let's think about two separate client applications using their own instances of the TravelAgent to access the same data--specifically, a cabin record with the primary key of 99. These examples revolve around the RESERVATION table, which is accessed by both the bookPassage() method (through the Reservation bean) and the listAvailableCabins() method (through JDBC). It might be a good idea to go back to Chapter 7, "Session Beans", and review how the RESERVATION table is accessed through these methods. This will help you to understand how two transactions executed by two different clients can impact each other. Assume that both methods have a transaction attribute of Required.

8.3.1.1. Dirty reads

A dirty read occurs when the first transaction reads uncommitted changes made by a second transaction. If the second transaction is rolled back, the data read by the first transaction becomes invalid because the rollback undoes the changes. The first transaction won't be aware that the data it has read has become invalid. Here's a scenario showing how a dirty read can occur (illustrated in Figure 8-9):

Time 10:00:00: Client 1 executes the TravelAgent.bookPassage() method on its bean. Along with the Customer and Cruise beans, Client 1 had previously chosen Cabin 99 to be included in the reservation.
Time 10:00:01: Client 1 creates a Reservation bean within the bookPassage() method. The Reservation bean's create() method inserts a record into the RESERVATION table, which reserves Cabin 99.
Time 10:00:02: Client 2 executes TravelAgent.listAvailableCabins(). Cabin 99 has been reserved by Client 1, so it is not in the list of available cabins that are returned from this method.
Time 10:00:03: Client 1 executes the ProcessPayment.byCredit() method within the bookPassage() method. The byCredit() method throws an exception because the expiration date on the credit card has passed.
Time 10:00:04: The exception thrown by the ProcessPayment bean causes the entire bookPassage() transaction to be rolled back. As a result, the record inserted into the RESERVATION table when the Reservation bean was created is not made durable (it is removed). Cabin 99 is now available.

Figure 8-9. A dirty read

Client 2 is now using an invalid list of available cabins because Cabin 99 is available but is not included in the list. This would be serious if Cabin 99 were the last available cabin, because Client 2 would inaccurately report that the cruise was fully booked. The customer would presumably try to book a cruise on a competing cruise line.
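The timeline above can be replayed with a toy in-memory model. This is only a sketch: the class DirtyReadSketch and its methods are hypothetical stand-ins for the RESERVATION table, not part of EJB or JDBC, and a real database tracks uncommitted rows very differently.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy stand-in for the RESERVATION table; not an EJB or JDBC API. */
class DirtyReadSketch {
    private final Set<Integer> committed = new HashSet<>();   // durable reservations (cabin IDs)
    private final Set<Integer> uncommitted = new HashSet<>(); // rows inserted by an open transaction

    void insertReservation(int cabinId) { uncommitted.add(cabinId); }

    void rollback() { uncommitted.clear(); }

    /** What a reader sees when uncommitted rows are visible (a dirty read). */
    boolean looksAvailableToDirtyReader(int cabinId) {
        return !committed.contains(cabinId) && !uncommitted.contains(cabinId);
    }

    /** What a reader sees when only committed rows are visible. */
    boolean isAvailable(int cabinId) {
        return !committed.contains(cabinId);
    }

    public static void main(String[] args) {
        DirtyReadSketch reservations = new DirtyReadSketch();
        reservations.insertReservation(99);                                 // 10:00:01, Client 1
        boolean client2View = reservations.looksAvailableToDirtyReader(99); // 10:00:02, Client 2
        reservations.rollback();                                            // 10:00:04, payment failed
        System.out.println("Client 2 believes Cabin 99 is available: " + client2View);
        System.out.println("Cabin 99 is actually available: " + reservations.isAvailable(99));
    }
}
```

Run as written, the first line reports false and the second reports true: Client 2 acted on a row that never became durable.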

8.3.1.2. Repeatable reads

A repeatable read occurs when data that has been read is guaranteed to look the same if it is read again during the same transaction. Repeatable reads are guaranteed in one of two ways: either the data read is locked against changes, or the data read is a snapshot that doesn't reflect changes. If the data is locked, it cannot be changed by any other transaction until this transaction ends. If the data is a snapshot, other transactions can change the data, but these changes won't be seen by this transaction if the read is repeated. Here's an example of a repeatable read (illustrated in Figure 8-10):

Time 10:00:00: Client 1 begins an explicit javax.transaction.UserTransaction.
Time 10:00:01: Client 1 executes TravelAgent.listAvailableCabins(2), asking for a list of available cabins that have two beds. Cabin 99 is in the list of available cabins.
Time 10:00:02: Client 2 is working with an interface that manages Cabin beans. Client 2 attempts to change the bed count on Cabin 99 from 2 to 3.
Time 10:00:03: Client 1 re-executes TravelAgent.listAvailableCabins(2). Cabin 99 is still in the list of available cabins.

Figure 8-10. Repeatable read

This example is somewhat unusual because it uses javax.transaction.UserTransaction. This class is covered in more detail later in this chapter; essentially, it allows a client application to control the scope of a transaction explicitly. In this case, Client 1 places transaction boundaries around both calls to listAvailableCabins(), so that they are a part of the same transaction. If Client 1 didn't do this, the two listAvailableCabins() methods would have executed as separate transactions and our repeatable read condition would not have occurred.

Although Client 2 attempted to change the bed count for Cabin 99 to 3, Cabin 99 still shows up in the Client 1 call to listAvailableCabins() when a bed count of 2 is requested. This is because either Client 2 was prevented from making the change (because of a lock), or Client 2 was able to make the change, but Client 1 is working with a snapshot of the data that doesn't reflect that change.

A nonrepeatable read occurs when a subsequent read within the same transaction can return different results. In other words, the subsequent read can see the changes made by other transactions.
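The difference between the two outcomes can be sketched with a toy model that contrasts a snapshot reader with a reader of live data. The class RepeatableReadSketch and its method names are hypothetical, and the map merely stands in for the cabin records; no real database behaves this simply.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for the cabin records; not an EJB or JDBC API. */
class RepeatableReadSketch {
    private final Map<Integer, Integer> bedCounts = new HashMap<>(); // cabin ID -> bed count

    RepeatableReadSketch() { bedCounts.put(99, 2); }

    void setBedCount(int cabinId, int beds) { bedCounts.put(cabinId, beds); }

    /** Reads the current value, including changes made by other transactions. */
    int readLive(int cabinId) { return bedCounts.get(cabinId); }

    /** Takes a frozen copy of the data, as a snapshot-based transaction would at its start. */
    Map<Integer, Integer> beginSnapshot() { return new HashMap<>(bedCounts); }

    public static void main(String[] args) {
        RepeatableReadSketch cabins = new RepeatableReadSketch();
        Map<Integer, Integer> snapshot = cabins.beginSnapshot(); // 10:00:00, Client 1 begins
        int firstSnapshotRead = snapshot.get(99);                // 10:00:01
        int firstLiveRead = cabins.readLive(99);
        cabins.setBedCount(99, 3);                               // 10:00:02, Client 2's change
        int secondSnapshotRead = snapshot.get(99);               // 10:00:03
        int secondLiveRead = cabins.readLive(99);
        System.out.println("Snapshot reads: " + firstSnapshotRead + " then " + secondSnapshotRead);
        System.out.println("Live reads:     " + firstLiveRead + " then " + secondLiveRead);
    }
}
```

The snapshot reader sees 2 both times (a repeatable read), while the live reader sees 2 and then 3 (a nonrepeatable read).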

8.3.1.3. Phantom reads

Phantom reads occur when new records added to the database are detectable by transactions that started prior to the insert. A query repeated within such a transaction will include records that other transactions added after the transaction started. Here's a scenario that includes a phantom read (illustrated in Figure 8-11):

Time 10:00:00: Client 1 begins an explicit javax.transaction.UserTransaction.
Time 10:00:01: Client 1 executes TravelAgent.listAvailableCabins(2), asking for a list of available cabins that have two beds. Cabin 99 is in the list of available cabins.
Time 10:00:02: Client 2 executes bookPassage() and creates a Reservation bean. The reservation inserts a new record into the RESERVATION table, reserving cabin 99.
Time 10:00:03: Client 1 re-executes TravelAgent.listAvailableCabins(2). Cabin 99 is no longer in the list of available cabins.

Figure 8-11. Phantom read

Client 1 places transaction boundaries around both calls to listAvailableCabins(), so that they are a part of the same transaction. In this case, the reservation was made between the two listAvailableCabins() queries in that transaction. Therefore, the record inserted in the RESERVATION table did not exist when listAvailableCabins() was invoked the first time, but it exists and is visible when the method is invoked the second time. The record inserted is a phantom record.
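A phantom differs from a nonrepeatable read in that no existing record changes; a new record appears and alters the result of a query. A minimal sketch follows, assuming a hypothetical class PhantomReadSketch in which cabins 98 and 100 are invented purely for illustration and the query logic is a simplification of what listAvailableCabins() does through JDBC.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a query over the RESERVATION table; not an EJB or JDBC API. */
class PhantomReadSketch {
    // Cabins with two beds; 98 and 100 are made up for this example.
    private final List<Integer> twoBedCabins = Arrays.asList(98, 99, 100);
    private final Set<Integer> reservations = new HashSet<>(); // reserved cabin IDs

    void insertReservation(int cabinId) { reservations.add(cabinId); }

    /** Returns every two-bed cabin that has no reservation record. */
    List<Integer> listAvailableCabins() {
        List<Integer> available = new ArrayList<>();
        for (Integer cabinId : twoBedCabins) {
            if (!reservations.contains(cabinId)) {
                available.add(cabinId);
            }
        }
        return available;
    }

    public static void main(String[] args) {
        PhantomReadSketch db = new PhantomReadSketch();
        List<Integer> first = db.listAvailableCabins();  // 10:00:01, Client 1
        db.insertReservation(99);                        // 10:00:02, Client 2 inserts a new record
        List<Integer> second = db.listAvailableCabins(); // 10:00:03, Client 1 again
        System.out.println("First query:  " + first);
        System.out.println("Second query: " + second);
    }
}
```

Locking the records that the first query read would not help here, because the reservation for Cabin 99 did not exist yet and so there was nothing to lock.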

8.3.2. Database Locks

Databases, especially relational databases, normally use several different locking techniques. The most common are read locks, write locks, and exclusive write locks. (I've taken the liberty of adding "snapshots," although this isn't a formal term.) These locking mechanisms control how transactions access data concurrently. Locking mechanisms impact the read conditions that were just described. These types of locks are simple concepts that are not directly addressed in the EJB specification. Database vendors implement these locks differently, so you should understand how your database addresses these locking mechanisms to best predict how the isolation levels described in this section will work.

Read locks: Read locks prevent other transactions from changing data read during a transaction until the transaction ends, thus preventing nonrepeatable reads. Other transactions can read the data but not write it. The current transaction is also prohibited from making changes. Whether a read lock locks only the records read, a block of records, or a whole table depends on the database being used.
Write locks: Write locks are used for updates. A write lock prevents other transactions from changing the data until the current transaction is complete. A write lock allows dirty reads by other transactions and by the current transaction itself. In other words, the transaction can read its own uncommitted changes.
Exclusive write locks: Exclusive write locks are used for updates. An exclusive write lock prevents other transactions from reading or changing data until the current transaction is complete. An exclusive write lock prevents dirty reads by other transactions. Other transactions are not allowed to read the data while it is exclusively locked. Some databases do not allow transactions to read their own data while it is exclusively locked.
Snapshots: Some databases get around locking by providing every transaction with its own snapshot of the data. A snapshot is a frozen view of the data that is taken when the transaction begins. Snapshots can prevent dirty reads, nonrepeatable reads, and phantom reads. Snapshots can be problematic because the data is not real-time; it is old the instant the snapshot is taken.
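As an in-memory analogy only, the JDK's java.util.concurrent.locks.ReentrantReadWriteLock behaves much like the read lock and the exclusive write lock described above. Database locks are implemented by the database, not by this class, and the analogy does not cover the non-exclusive write lock or snapshots. The class LockSketch and the helper otherTransactionCan() are hypothetical names; a second thread stands in for "another transaction."

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Analogy only: a JDK read/write lock standing in for a lock on one record. */
class LockSketch {
    /** Tries the lock from a different thread, which plays the role of another transaction. */
    static boolean otherTransactionCan(Lock lock) {
        return CompletableFuture.supplyAsync(() -> {
            boolean acquired = lock.tryLock();
            if (acquired) {
                lock.unlock();
            }
            return acquired;
        }).join();
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock record = new ReentrantReadWriteLock();

        record.readLock().lock(); // the current transaction holds a read lock
        System.out.println("Read lock held: other can read = "
            + otherTransactionCan(record.readLock())
            + ", other can write = " + otherTransactionCan(record.writeLock()));
        record.readLock().unlock();

        record.writeLock().lock(); // the current transaction holds an exclusive write lock
        boolean ownerCanRead = record.readLock().tryLock(); // owner reads its own data
        System.out.println("Exclusive write lock held: other can read = "
            + otherTransactionCan(record.readLock())
            + ", other can write = " + otherTransactionCan(record.writeLock())
            + ", owner can read = " + ownerCanRead);
        if (ownerCanRead) {
            record.readLock().unlock();
        }
        record.writeLock().unlock();
    }
}
```

While the read lock is held, the other thread can read but not write. While the write lock is held, it can do neither, although the owner can still read. That last point is a property of this JDK class; as noted above, some databases do not allow it.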

8.3.3. Transaction Isolation Levels

Transaction isolation is defined in terms of the isolation conditions (dirty reads, repeatable reads, and phantom reads). Isolation levels are commonly used in database systems to describe how locking is applied to data within a transaction.[3] The following terms are usually used to discuss isolation levels:

[3]Isolation conditions are covered in detail by ANSI SQL-92 Specification, Document Number: ANSI X3.135-1992 (R1998).

Read Uncommitted

The transaction can read uncommitted data (data changed by a different transaction that is still in progress).

Dirty reads, nonrepeatable reads, and phantom reads can occur. Bean methods with this isolation level can read uncommitted changes.

Read Committed

The transaction cannot read uncommitted data; data that is being changed by a different transaction cannot be read.

Dirty reads are prevented; nonrepeatable reads and phantom reads can occur. Bean methods with this isolation level cannot read uncommitted data.

Repeatable Read

The transaction cannot change data that is being read by a different transaction.

Dirty reads and nonrepeatable reads are prevented; phantom reads can occur. Bean methods with this isolation level have the same restrictions as Read Committed and can only execute repeatable reads.

Serializable

The transaction has exclusive read and update privileges to data; different transactions can neither read nor write the same data.

Dirty reads, nonrepeatable reads, and phantom reads are prevented. This isolation level is the most restrictive.

These isolation levels are the same as those defined for JDBC. Specifically, they map to the static final variables in the java.sql.Connection class. The behavior modeled by the isolation levels in the connection class is the same as the behavior described here.
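The four levels and the conditions each one permits can be summarized in code. In this sketch the enum IsolationLevelSketch and its field names are hypothetical; only the java.sql.Connection constants are part of the JDBC API. The true/false values restate the descriptions above and say nothing about how a particular database implements each level.

```java
import java.sql.Connection;

/** Summary of the isolation levels described above, keyed to the JDBC constants. */
enum IsolationLevelSketch {
    //               JDBC constant                          dirty  nonrepeatable  phantom
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED, true,  true,  true),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED,     false, true,  true),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ,   false, false, true),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE,         false, false, false);

    final int jdbcConstant;           // value to pass to Connection.setTransactionIsolation()
    final boolean dirtyReads;         // true if dirty reads can occur
    final boolean nonrepeatableReads; // true if nonrepeatable reads can occur
    final boolean phantomReads;       // true if phantom reads can occur

    IsolationLevelSketch(int jdbcConstant, boolean dirtyReads,
                         boolean nonrepeatableReads, boolean phantomReads) {
        this.jdbcConstant = jdbcConstant;
        this.dirtyReads = dirtyReads;
        this.nonrepeatableReads = nonrepeatableReads;
        this.phantomReads = phantomReads;
    }

    public static void main(String[] args) {
        for (IsolationLevelSketch level : values()) {
            System.out.printf("%-16s dirty=%b nonrepeatable=%b phantom=%b%n",
                level, level.dirtyReads, level.nonrepeatableReads, level.phantomReads);
        }
    }
}
```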

The exact behavior of these isolation levels depends largely on the locking mechanism used by the underlying database or resource, so how the levels work in practice depends on how your database supports them.

8.3.3.1. EJB 1.1 transaction isolation control

In EJB 1.1, isolation levels are not controlled through declarative attributes, as was the case in EJB 1.0. In EJB 1.1, the deployer sets transaction isolation levels if the container manages the transaction. The bean developer sets the transaction isolation level if the bean manages the transaction. Up to this point we have only discussed container-managed transactions; bean-managed transactions are discussed later in this chapter.

8.3.3.2. EJB 1.0 transaction isolation control

EJB 1.0 describes four isolation levels that can be assigned to the methods of a bean in the ControlDescriptor. We did this several times when we created control descriptors for all the beans we developed in this book. Here is a snippet of code from the MakeDD class used to create the TravelAgentDD.ser in Chapter 7, "Session Beans", showing how we set the isolation level:

ControlDescriptor cd = new ControlDescriptor();
cd.setIsolationLevel (ControlDescriptor.TRANSACTION_SERIALIZABLE);
cd.setMethod(null);
ControlDescriptor [] cdArray = {cd};
sd.setControlDescriptors(cdArray);

In our examples so far, we have always used the isolation level ControlDescriptor.TRANSACTION_SERIALIZABLE, the most restrictive isolation level. Table 8-2 shows the transaction isolation levels and their corresponding constants in the ControlDescriptor class.

Table 8-2. Isolation Level Attributes in EJB 1.0

Isolation Level	ControlDescriptor Constant
Read Committed	`TRANSACTION_READ_COMMITTED`
Read Uncommitted	`TRANSACTION_READ_UNCOMMITTED`
Repeatable Read	`TRANSACTION_REPEATABLE_READ`
Serializable	`TRANSACTION_SERIALIZABLE`

You are allowed to specify isolation levels on a per-method basis, but this flexibility comes with an important restriction: all methods invoked in the same transaction must have the same isolation level. You can't mix isolation levels within transactions at runtime.

8.3.4. Balancing Performance Against Consistency

Generally speaking, as the isolation levels become more restrictive, the performance of the system decreases because more restrictive isolation levels prevent transactions from accessing the same data. If isolation levels are very restrictive, like Serializable, then all transactions, even simple reads, must wait in line to execute. This can result in a system that is very slow. EJB systems that process a large number of concurrent transactions and need to be very fast will therefore avoid the Serializable isolation level where it is not necessary, since it will be prohibitively slow.

Isolation levels, however, also enforce consistency of data. More restrictive isolation levels help ensure that invalid data is not used for performing updates. The old adage "garbage in, garbage out" applies here. The Serializable isolation level ensures that data is never accessed concurrently by transactions, thus ensuring that the data is always consistent.

Choosing the correct isolation level requires some research about the database you are using and how it handles locking. You must also balance the performance needs of your system against consistency. This is not a cut-and-dried process, because different applications use data differently.

Although there are only three ships in Titan's system, the beans that represent them are included in most of Titan's transactions. This means that many, possibly hundreds, of transactions will be accessing these Ship beans at the same time. Access to Ship beans needs to be fast or it becomes a bottleneck, so we do not want to use very restrictive isolation levels. At the same time, the ship data also needs to be consistent; otherwise, hundreds of transactions will be using invalid data. Therefore, we need to use a strong isolation level when making changes to ship information. To accommodate these conflicting requirements, we can apply different isolation levels to different methods.

Most transactions use the Ship bean's get methods to obtain information. This is read-only behavior, so the isolation level for the get methods can be very low, such as Read Uncommitted. The set methods of the Ship bean are almost never used; the name of the ship probably wouldn't change for years. However, the data changed by the set methods must be isolated to prevent dirty reads by other transactions, so we will use the most restrictive isolation level, Serializable, on the Ship bean's set methods. By using different isolation levels on different business methods, we can balance consistency against performance.

8.3.4.1. EJB 1.1: Controlling isolation levels

Different EJB servers allow different levels of granularity for setting isolation levels; some servers defer this responsibility to the database. In some servers, you may be able to set different isolation levels for different methods, while other products may require the same isolation level for all methods in a bean, or possibly even all beans in the container. You will need to consult your vendor's documentation to find out the level of control your server offers.

Bean-managed transactions in stateful session beans, however, allow the bean developer to specify the transaction isolation level using the API of the resource providing persistent storage. The JDBC API, for example, provides a mechanism for specifying the isolation level of the database connection. The following code shows how this is done. Bean-managed transactions are covered in more detail later in this chapter.

...
DataSource source = (javax.sql.DataSource)
    jndiCntxt.lookup("java:comp/env/jdbc/titanDB");

Connection con = source.getConnection();
con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
...

You can set the isolation level to be different for different databases within the same transaction, but all beans that use the same database in a transaction should use the same isolation level.
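Because databases differ in which levels they support, it can be worth asking the driver before requesting a level. The sketch below assumes a hypothetical helper class, IsolationSupport, with a hypothetical method applyIfSupported(); the JDBC calls it uses (getMetaData(), supportsTransactionIsolationLevel(), setTransactionIsolation(), and getTransactionIsolation()) are standard. It is one possible approach, not the way the book's beans are written.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

/** Hypothetical helper; not part of the EJB or JDBC APIs. */
class IsolationSupport {
    /**
     * Requests the given isolation level only if the driver reports support for it.
     * Returns the level actually in effect on the connection afterward.
     */
    static int applyIfSupported(Connection con, int level) throws SQLException {
        DatabaseMetaData meta = con.getMetaData();
        if (meta.supportsTransactionIsolationLevel(level)) {
            con.setTransactionIsolation(level);
        }
        return con.getTransactionIsolation();
    }
}
```

A bean might call IsolationSupport.applyIfSupported(con, Connection.TRANSACTION_SERIALIZABLE) immediately after obtaining the connection and compare the returned value with the level it asked for. The call belongs before any work is done on the connection, because the JDBC documentation describes the result of changing the isolation level in the middle of a transaction as implementation-defined.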

8.3.4.2. EJB 1.0: Controlling isolation levels

The following code, taken from a deployment descriptor for a Ship bean, shows one way to assign these isolation levels:

Method [] methods = new Method[6];

Class [] parameters = new Class[0]; 
methods[ 0 ] = ShipBean.class.getDeclaredMethod("getName",parameters);
methods[ 1 ] = ShipBean.class.getDeclaredMethod("getTonnage",parameters);
methods[ 2 ] = ShipBean.class.getDeclaredMethod("getCapacity",parameters);

parameters = new Class[1];

parameters[0] = String.class;
methods[ 3 ] = ShipBean.class.getDeclaredMethod("setName",parameters);
parameters[0] = Double.TYPE;
methods[ 4 ] = ShipBean.class.getDeclaredMethod("setTonnage",parameters);
parameters[0] = Integer.TYPE;
methods[ 5 ] = ShipBean.class.getDeclaredMethod("setCapacity",parameters);

ControlDescriptor [] cds = new ControlDescriptor[methods.length];

for (int i = 0; i < methods.length; i++) {
    cds[i] = new ControlDescriptor(methods[i]);
    if (methods[i].getReturnType() == Void.TYPE) {
        // Set methods all return void.
        cds[i].setIsolationLevel(
            ControlDescriptor.TRANSACTION_SERIALIZABLE);
    }
    else {
        // Get methods don't return void.
        cds[i].setIsolationLevel(
            ControlDescriptor.TRANSACTION_READ_UNCOMMITTED);
    }
    cds[i].setRunAsMode(ControlDescriptor.CLIENT_IDENTITY);
    cds[i].setTransactionAttribute(ControlDescriptor.TX_REQUIRED);
}

shipDD.setControlDescriptors(cds);

This code takes all the set methods in the Ship interface that are used to make updates (setName(), setCapacity(), setTonnage()) and gives them an isolation level of TRANSACTION_SERIALIZABLE. For the get methods (getName(), getCapacity(), getTonnage()), which are used for reading data, the isolation level is set to TRANSACTION_READ_UNCOMMITTED.

NOTE

Remember that all bean methods invoked in the same transaction must have the same isolation level.

Understanding the effect of isolation levels on your code's behavior is crucial to balancing performance against consistency. In EJB 1.0, all the bean methods invoked within the same transaction must have the same isolation level. In the TravelAgent bean, for example, every method invoked on every bean within the scope of the bookPassage() method must have the same transaction isolation level. Any method invoked with a different isolation level will throw a java.rmi.RemoteException. Therefore, mixing isolation levels across beans (specifying different isolation levels for different beans within your application) must be done with care and only in those circumstances when methods with different isolation levels will never need to be executed in the same transaction.
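One way to catch such a mix before deployment is to walk the methods a transaction is expected to invoke and compare their declared levels. The following is only a sketch: IsolationMixCheck, firstConflict(), and the call chain shown in main() are hypothetical, and the levels are expressed with the java.sql.Connection constants rather than the ControlDescriptor ones so that the example depends only on the JDK.

```java
import java.sql.Connection;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-deployment check; not part of any EJB API. */
class IsolationMixCheck {
    /**
     * Returns the first method in the call chain whose isolation level differs from
     * the level of the method that started the transaction, or null if all agree.
     */
    static String firstConflict(Map<String, Integer> levelByMethod, List<String> callChain) {
        Integer transactionLevel = null;
        for (String method : callChain) {
            Integer level = levelByMethod.get(method);
            if (transactionLevel == null) {
                transactionLevel = level; // the first method fixes the transaction's level
            } else if (!transactionLevel.equals(level)) {
                return method;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Integer> levels = new HashMap<>();
        levels.put("TravelAgent.bookPassage", Connection.TRANSACTION_SERIALIZABLE);
        levels.put("Reservation.create", Connection.TRANSACTION_SERIALIZABLE);
        levels.put("Ship.getName", Connection.TRANSACTION_READ_UNCOMMITTED);

        // Hypothetical call chain: bookPassage() is imagined to call Ship.getName().
        List<String> chain = Arrays.asList(
            "TravelAgent.bookPassage", "Reservation.create", "Ship.getName");
        System.out.println("First conflicting method: " + firstConflict(levels, chain));
    }
}
```

Here Ship.getName is reported, because its Read Uncommitted level does not match the Serializable level under which the transaction began. In an EJB 1.0 container, that is the call that would fail with a java.rmi.RemoteException.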

