5/28/2014

Is Java compiled or interpreted? - Both

Depending on the execution environment, bytecode can be:
  • compiled (to the native instructions understood by the CPU) ahead of time and executed as native code (similar to C++)
  • compiled just-in-time and executed
  • interpreted
  • directly executed by a supported processor (bytecode is the native instruction set of some CPUs)
HotSpot JVMs typically start out by interpreting bytecode and only compile methods to native code after the JVM has figured out which ones are worth compiling and has gathered statistics on how the code is actually run.
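As a rough illustration (the class and method names here are mine, not from any particular source), a method called thousands of times becomes a candidate for JIT compilation; running this with `java -XX:+PrintCompilation JitWarmup` shows HotSpot compiling `sumOfSquares` once it crosses the invocation threshold:

```java
// A hot method executed many times; run with -XX:+PrintCompilation to watch
// HotSpot promote it from interpreted bytecode to JIT-compiled native code.
public class JitWarmup {
    // Illustrative hot method: simple and called often enough to cross
    // HotSpot's default invocation threshold.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Call the method repeatedly so the JIT considers it "hot".
        for (int i = 0; i < 20_000; i++) {
            result = sumOfSquares(100);
        }
        System.out.println(result); // 338350 = 1^2 + 2^2 + ... + 100^2
    }
}
```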

Git Theme

[color]
  ui = auto
[color "branch"]
  current = yellow reverse
  local = yellow
  remote = green
[color "diff"]
  meta = yellow bold
  frag = magenta bold
  old = red bold
  new = green bold
[color "status"]
  added = yellow
  changed = green
  untracked = cyan

5/19/2014

Transaction Isolation Levels


The ANSI/ISO SQL standard defines four levels of transaction isolation, with different possible outcomes for the same transaction scenario. That is, the same work performed in the same fashion with the same inputs may result in different answers, depending on your isolation level. These levels are defined in terms of three phenomena that are either permitted or not at a given isolation level:
  • Dirty read: A transaction reads data that has been written by another transaction that has not been committed yet.
    Data integrity is compromised, foreign keys are violated, and unique constraints are ignored.
  • Nonrepeatable (fuzzy) reads: A transaction re-reads data it has previously read and finds that another committed transaction has modified or deleted the data.
  • Phantom read: A transaction re-runs a query returning a set of rows that satisfies a search condition and finds that another committed transaction has inserted additional rows that satisfy the condition.

The SQL isolation levels are defined based on whether they allow each of the preceding phenomena. It's interesting to note that the SQL standard doesn't impose a specific locking scheme or mandate particular behaviors, but rather describes these isolation levels in terms of these phenomena—allowing for many different locking/concurrency mechanisms to exist (see Table 1).

Isolation Level     Dirty Read   Nonrepeatable Read   Phantom Read
READ UNCOMMITTED    Permitted    Permitted            Permitted
READ COMMITTED      --           Permitted            Permitted
REPEATABLE READ     --           --                   Permitted
SERIALIZABLE        --           --                   --

 Table 1: ANSI isolation levels

READ UNCOMMITTED - no read locks; a query may see data that other transactions have not yet committed
READ COMMITTED - read locks on committed data only, released as soon as the read completes
REPEATABLE READ - locks held on the rows read by the SELECT until the transaction ends
SERIALIZABLE - lock on the full table (or range) touched by the SELECT
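Table 1 can also be sketched in code. The enum below is illustrative only; the `TRANSACTION_*` constants are the real `java.sql.Connection` values a JDBC client passes to `setTransactionIsolation`:

```java
import java.sql.Connection;

// Table 1 as code: each ANSI level and the phenomena it permits, paired with
// the matching java.sql.Connection constant used by JDBC's
// Connection.setTransactionIsolation(int).
public class IsolationLevels {
    enum Level {
        READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED, true,  true,  true),
        READ_COMMITTED  (Connection.TRANSACTION_READ_COMMITTED,   false, true,  true),
        REPEATABLE_READ (Connection.TRANSACTION_REPEATABLE_READ,  false, false, true),
        SERIALIZABLE    (Connection.TRANSACTION_SERIALIZABLE,     false, false, false);

        final int jdbcConstant;
        final boolean dirtyRead, nonrepeatableRead, phantomRead;

        Level(int jdbcConstant, boolean dirty, boolean fuzzy, boolean phantom) {
            this.jdbcConstant = jdbcConstant;
            this.dirtyRead = dirty;
            this.nonrepeatableRead = fuzzy;
            this.phantomRead = phantom;
        }
    }

    public static void main(String[] args) {
        for (Level l : Level.values()) {
            System.out.printf("%-16s dirty=%b fuzzy=%b phantom=%b%n",
                    l, l.dirtyRead, l.nonrepeatableRead, l.phantomRead);
        }
        // On a live connection you would select a level like this:
        // conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
    }
}
```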

Oracle Database offers the read committed and serializable isolation levels, as well as a read-only mode that is not part of SQL92. Read committed is the default.

Overview of Locking Mechanisms

In general, multiuser databases use some form of data locking to solve the problems associated with data concurrency, consistency, and integrity. Locks are mechanisms that prevent destructive interaction between transactions accessing the same resource.

Resources include two general types of objects:
  • User objects, such as tables and rows (structures and data)
  • System objects not visible to users, such as shared data structures in the memory and data dictionary rows

Locking Policy

A locking policy is an important component of any multiuser application. When users share objects in an application, a locking policy ensures that two or more users do not attempt to modify the same object or its underlying data simultaneously.

Types of locking policy:
  • Optimistic (Write) Lock: All users have read access to the object. When a user attempts to write a change, the application checks to ensure that the object has not changed since the last read (checking by version number, date, timestamps, or checksums/hashes).
  • Optimistic Read Lock: As with optimistic lock, the optimistic read lock ensures that the object has not changed before writing a change. However, the optimistic read lock also forces a read of any related tables that contribute information to the object.
  • Pessimistic Locking: When a user accesses an object to update it, the database locks the object until the update is completed. No other user can read or update the object until the first user releases the lock. The database offers this locking type.
  • No Locking: The application does not verify that data is current.
Details:
Optimistic assumes that nothing's going to change while you're reading it.
Pessimistic assumes that something will and so locks it.

Optimistic Locking details: When you write the record back, you filter the UPDATE on the version number so the check and the write are atomic (i.e. the record cannot have been updated between reading the version and writing it back to disk), and you increment the version in the same statement.
If the record is dirty (i.e. its version differs from yours), you abort the transaction and the user can restart it.

This strategy is most applicable to high-volume systems and three-tier architectures where you do not necessarily maintain a connection to the database for your session. In this situation the client cannot actually maintain database locks as the connections are taken from a pool and you may not be using the same connection from one access to the next.
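A minimal in-memory sketch of the version-check idea (the `Versioned` record and method names are hypothetical, not a real API); the compare-and-set plays the role of the atomic `UPDATE ... WHERE version = ?`:

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal in-memory sketch of optimistic locking: a record carries a version
// number, and a write only succeeds if the snapshot it read is still current.
public class OptimisticLockDemo {
    // Immutable snapshot of a record plus its version.
    static final class Versioned {
        final String value;
        final long version;
        Versioned(String value, long version) { this.value = value; this.version = version; }
    }

    private final AtomicReference<Versioned> record =
            new AtomicReference<>(new Versioned("initial", 0));

    Versioned read() { return record.get(); }

    // Atomically apply the update only if nobody else committed in between;
    // mirrors "UPDATE ... SET value=?, version=version+1 WHERE version=?".
    boolean tryUpdate(Versioned snapshot, String newValue) {
        return record.compareAndSet(snapshot,
                new Versioned(newValue, snapshot.version + 1));
    }

    public static void main(String[] args) {
        OptimisticLockDemo demo = new OptimisticLockDemo();
        Versioned a = demo.read();                       // user A reads version 0
        Versioned b = demo.read();                       // user B reads version 0
        System.out.println(demo.tryUpdate(a, "from A")); // true: first commit wins
        System.out.println(demo.tryUpdate(b, "from B")); // false: stale version, retry
    }
}
```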

Advantages of optimistic locking:
  • It prevents users and applications from editing stale data.
  • It notifies users of any locking violation immediately, when updating the object.
  • It does not require you to lock up the database resource.
  • It prevents database deadlocks.
Disadvantages of optimistic locking:
  • Optimistic locking cannot prevent applications from selecting and attempting to modify the same data. When two different processes modify data, the first one to commit the changes succeeds; the other process fails and receives an OptimisticLockException. 
Pessimistic Locking details: you lock the record for your exclusive use until you have finished with it. It has much better integrity than optimistic locking but requires careful application design to avoid deadlocks. To use pessimistic locking you need either a direct connection to the database (as would typically be the case in a two-tier client/server application) or an externally available transaction ID that can be used independently of the connection.

Because pessimistic locks exist for the duration of the current transaction, the associated database transaction remains open from the point of the first lock request until the transaction commits. When the transaction commits or rolls back, the database releases the locks.
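The hold-the-lock-until-commit behavior can be sketched in plain Java with a `ReentrantLock` standing in for the database's row lock (a simplified analogy with illustrative names, not how a real DBMS implements it):

```java
import java.util.concurrent.locks.ReentrantLock;

// Simplified in-memory analogue of pessimistic locking: the lock is taken
// when the record is read for update and held until "commit", so a second
// writer blocks (or gives up) instead of working on a stale copy.
public class PessimisticLockDemo {
    private final ReentrantLock rowLock = new ReentrantLock();
    private int balance = 100;

    // Mirrors "SELECT ... FOR UPDATE": acquire the row lock before reading.
    int readForUpdate() {
        rowLock.lock();
        return balance;
    }

    // Mirrors COMMIT: write the new value and release the lock.
    void commit(int newBalance) {
        balance = newBalance;
        rowLock.unlock();
    }

    // A second transaction that refuses to wait: tryLock fails immediately
    // while the first transaction still holds the row.
    boolean tryConcurrentUpdate(int newBalance) {
        if (!rowLock.tryLock()) return false;
        try { balance = newBalance; } finally { rowLock.unlock(); }
        return true;
    }

    public static void main(String[] args) throws Exception {
        PessimisticLockDemo demo = new PessimisticLockDemo();
        int current = demo.readForUpdate();   // tx1 locks the row

        // tx2 runs on another thread; it cannot take the lock while tx1 holds it.
        boolean[] second = new boolean[1];
        Thread tx2 = new Thread(() -> second[0] = demo.tryConcurrentUpdate(0));
        tx2.start();
        tx2.join();
        System.out.println(second[0]);        // false: row is locked

        demo.commit(current - 30);            // tx1 commits, lock released
        System.out.println(demo.balance);     // 70
    }
}
```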

Advantages of Pessimistic Locking
  • It prevents users and applications from editing data that is being or has been changed.
  • Processes know immediately when a locking violation occurs, rather than after the transaction is complete.
Disadvantages of Pessimistic Locking
  • It is not fully supported by all databases.
  • It consumes extra database resources.
  • It requires maintaining an open transaction and database lock for the duration of the transaction, which can lead to database deadlocks.
  • It decreases the concurrency of connection pooling when using the server session, which affects the overall scalability of your application.
In the transaction-ID case, you open the transaction with the TxID and then reconnect using that ID. The DBMS maintains the locks and allows you to pick the session back up through the TxID. This is how distributed transactions using two-phase commit protocols (such as XA or COM+ Transactions) work.

Resolving conflicts between transactions

How can you serialize a set of changes from overlapping business transactions that affect the same object when your primary concern is for users to see changes immediately as they occur?
  • First Commit Wins - forcing the user to start a new business transaction, starting from the updated object that results from the first committed transaction.
  • Last Commit Wins - initiating a new server transaction automatically and replaying the user's changes, which may overwrite some of the changes from the first committed transaction.
  • Merge Conflicting Updates - allowing the replay of changes in the last transaction to selectively update the conflicting object. 

First Commit Wins - A transactional application server allows the first server transaction to "win" in the sense that successive transactions having a write-write conflict will fail. Whenever a client application experiences a commit failure, abort the transaction and refresh the user's view from the current state of the server. This causes the user to lose all of their changes, forcing them to start a new business transaction. They will need to reevaluate the new data manually in order to decide whether to redo their changes.

This is a very easy pattern to implement. While it sounds terrible to cause the end user to redo their work, this may be acceptable when the probability of a commit failure is so low that it can be ignored.

Last Commit Wins - A transactional application server allows the first server transaction to "win" in the sense that successive transactions having a write-write conflict will fail. However, it is possible for the second client to abort the failed transaction, begin a new transaction, replay the changes, and commit the new transaction.

This is an easy pattern to implement, as long as there is a way to define transaction specifications in the client. At first glance, this solution appears to violate the objectives of a transactional system, since the second transaction may overwrite the effects of the first transaction. However, as long as appropriate business rules validate changes as they are played back, there is no risk of leaving the repository in an inconsistent state.

One of the disadvantages of this solution is that the software automatically resolves the conflict without notifying the end user that a conflict occurred. In situations where such notification is important or manual merging of changes is more appropriate, then the First Commit Wins or the Merge Conflicting Updates patterns are more appropriate.

The disadvantages of this solution may be acceptable when the probability of a commit failure is so low that it can be ignored or where commit failures typically do not have business consequences. For example, suppose a customer service organization changes addresses and other customer contact information, while a billing organization updates a customer's purchasing information. It is possible that the application server may report a conflict between a customer service transaction and a billing transaction, while from a business perspective the two transactions are updating different aspects of the customer object.

Merge Conflicting Updates - A transactional application server allows the first server transaction to "win" in the sense that successive transactions having a write-write conflict will fail. However, it is possible for the second client to abort the failed transaction, begin a new transaction, replay the changes, and commit the new transaction. While replaying the changes, you must check each individual change to determine whether there is a conflict with the first committed transaction. If so, prompt the user to determine whether or not the new change should be applied.

This solution identifies the server transaction conflict to the user, enabling the user to make the necessary adjustments to merge the two sets of changes appropriately. It minimizes the amount of rework that the end user must do, in order to complete their business transaction.
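The first two strategies can be contrasted with a toy versioned server object (all names here are illustrative): `commit` implements First Commit Wins by failing on a version mismatch, and `commitWithReplay` turns that into Last Commit Wins by replaying the change on the fresh state:

```java
// A toy server object with a version counter, used to contrast the
// conflict-resolution strategies above. Names are illustrative, not a real API.
public class ConflictDemo {
    private String value = "original";
    private long version = 0;

    long readVersion() { return version; }
    String readValue() { return value; }

    // First Commit Wins: the commit fails outright if the version moved on.
    boolean commit(long expectedVersion, String newValue) {
        if (expectedVersion != version) return false;  // write-write conflict
        value = newValue;
        version++;
        return true;
    }

    // Last Commit Wins: on conflict, start a new transaction at the current
    // version and replay the same change, overwriting the first commit.
    void commitWithReplay(long expectedVersion, String newValue) {
        if (!commit(expectedVersion, newValue)) {
            commit(readVersion(), newValue);           // replay on fresh state
        }
    }

    public static void main(String[] args) {
        ConflictDemo server = new ConflictDemo();
        long seenByA = server.readVersion();           // both clients read v0
        long seenByB = server.readVersion();

        System.out.println(server.commit(seenByA, "A's change")); // true
        System.out.println(server.commit(seenByB, "B's change")); // false: first commit won

        server.commitWithReplay(seenByB, "B's change");           // last commit wins
        System.out.println(server.readValue());                   // B's change
    }
}
```

Merge Conflicting Updates would extend `commitWithReplay` to compare each field against the first commit and ask the user before overwriting, rather than replaying unconditionally.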

What is Transaction Context Propagation

Transaction context propagation is necessary if multiple instances are to participate in a single distributed transaction. A transaction context defines the scope of a transaction and associates all participating resources and transactional operations. All participating threads share the same transaction context.

Multiple application instances need to participate in the same transaction when one instance makes a remote call into another in the scope of an existing transaction, assuming the EJB transaction attributes for the method allow it to run in the client's transaction. An example of this is a servlet in application instance 1 obtaining a reference to an EJB residing in application instance 2, starting a transaction, and making a method call on the remote EJB in the scope of that transaction. When multiple application instances participate in a single transaction, all work done by the participating instances as part of the global transaction is guaranteed to be atomic.
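A highly simplified in-process model of a transaction context (illustrative names, not a real EJB API): the container binds a context to the calling thread, so every component invoked in that scope sees the same transaction ID. A real server additionally ships this ID with remote calls so both instances join one global transaction; this sketch only shows the in-process association:

```java
// Simplified model of transaction context propagation within one JVM:
// the "container" binds a context to the calling thread, and every
// component invoked in that scope sees the same transaction ID.
public class TxContextDemo {
    static final class TxContext {
        final String txId;
        TxContext(String txId) { this.txId = txId; }
    }

    // The thread-to-context association the container maintains.
    private static final ThreadLocal<TxContext> CURRENT = new ThreadLocal<>();

    static void begin(String txId) { CURRENT.set(new TxContext(txId)); }
    static void commit() { CURRENT.remove(); }
    static String currentTxId() {
        TxContext ctx = CURRENT.get();
        return ctx == null ? null : ctx.txId;
    }

    // Stand-ins for a servlet calling an EJB in the same transaction scope.
    static String servletLayer() {
        begin("tx-42");
        String idSeenByEjb = ejbLayer();  // callee runs in the caller's scope
        commit();
        return idSeenByEjb;
    }
    static String ejbLayer() { return currentTxId(); } // sees the caller's context

    public static void main(String[] args) {
        System.out.println(servletLayer());  // tx-42: both layers shared one context
        System.out.println(currentTxId());   // null: context cleared after commit
    }
}
```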

Sources

  1. http://www.oracle.com/technetwork/issue-archive/2005/05-nov/o65asktom-082389.html
  2. http://docs.oracle.com/cd/B28359_01/server.111/b28318/consist.htm#CNCPT1314
  3. http://docs.oracle.com/cd/B14099_19/web.1012/b15901/dataaccs008.htm
  4. http://docs.spring.io/spring/docs/current/spring-framework-reference/html/transaction.html
  5. http://docs.spring.io/spring/docs/current/spring-framework-reference/html/transaction.html#tx-propagation