DB2 for i Commitment Control and Data Integrity: A Tutorial
by Peter Sramka
When commitment control is used with DB2 for i, there are five different isolation levels available. In general, as the isolation level is increased from the lowest level of No Commit (*NONE) to the highest level of Serializable (*RR), data integrity is increased, but concurrency is decreased. But what exactly does “data integrity” mean? Basically, if you have data integrity your data is “correct” and if you don’t have data integrity your data is not “correct.” But what does “correct” mean in this context? This is best explained with an example. Consider a simple banking algorithm where a customer transfers X dollars from a savings account to a checking account. Here are the sequential steps of that algorithm:
1) Start a new database transaction.
2) Read the balance of the savings account.
3) Subtract X dollars from the balance of the savings account.
4) Write the new balance of the savings account.
5) Read the balance of the checking account.
6) Add X dollars to the balance of the checking account.
7) Write the new balance of the checking account.
8) End the database transaction with a COMMIT command.
In this case, the “correctness” of this algorithm cannot be guaranteed on DB2 for i unless the isolation level is set to at least Repeatable Read (*ALL). Why? Again, this is best explained with another example. Let’s say that John and his wife Mary have a joint savings account with a $1500 balance and a joint checking account with a $100 balance, for a total of $1600. John needs to write a check for $300 and decides to transfer that amount from savings to checking. At approximately the same time, Mary transfers $500 from savings to checking. After both transactions are complete, the savings account should have a $700 balance and the checking account should have a $900 balance, for an unchanged total of $1600. Let’s look at a case where an isolation level of Read Committed (*CS), which is lower than Repeatable Read (*ALL), is used and how the data gets corrupted:
1J) The thread of the algorithm servicing John’s request (John’s thread)
starts a new database transaction with an isolation level of Read
Committed (*CS).
1M) The thread of the algorithm servicing Mary’s request (Mary’s thread)
starts a new database transaction with an isolation level of Read
Committed (*CS).
2J) John’s thread reads the balance of the savings account as $1500
2M) Mary’s thread reads the balance of the savings account as $1500.
3J) John’s thread subtracts $300 from $1500 to get $1200.
3M) Mary’s thread subtracts $500 from $1500 to get $1000.
4J) John’s thread attempts to write $1200 as the new balance of the
savings account, thus DB2 for i automatically places an exclusive
“write” lock on the savings account record on behalf of John’s thread
and allows the write to proceed.
4Ma) Mary’s thread attempts to write $1000 as the new balance of the
savings account, but is blocked because of John’s thread’s “write”
lock.
5J) John’s thread reads the balance of the checking account as $100.
6J) John’s thread adds $300 to $100 to get $400.
7J) John’s thread attempts to write $400 as the new balance of the
checking account, thus DB2 for i automatically places an exclusive
“write” lock on the checking account record on behalf of John’s
thread and allows the write to proceed.
8J) John’s thread ends its database transaction with a COMMIT command,
thus DB2 for i automatically releases John’s thread’s “write” lock on
the savings account record (which unblocks Mary’s thread) as well as
its “write” lock on the checking account record.
4Mb) DB2 for i automatically places an exclusive “write” lock on the savings
account record on behalf of Mary’s thread and allows Mary’s thread to
write $1000 as the new balance of the savings account.
5M) Mary’s thread reads the balance of the checking account as $400.
6M) Mary’s thread adds $500 to $400 to get $900.
7M) Mary’s thread attempts to write $900 as the new balance of the
checking account, thus DB2 for i automatically places an exclusive
“write” lock on the checking account record on behalf of Mary’s
thread and allows the write to proceed.
8M) Mary’s thread ends its database transaction with a COMMIT command,
thus DB2 for i automatically releases Mary’s thread’s “write” lock on
the savings account record as well as its “write” lock on the
checking account record.
At the end of this cycle, the savings account has a $1000 balance and the checking account has a $900 balance, for a total of $1900. John and Mary got $300 for FREE because the bank didn’t use the right isolation level!
Now let’s look at the case where the Repeatable Read (*ALL) isolation level is used:
1J) John’s thread starts a new database transaction with an isolation
level of Repeatable Read (*ALL).
1M) Mary’s thread starts a new database transaction with an isolation
level of Repeatable Read (*ALL).
2J) John’s thread attempts to read the balance of the savings account (as
$1500), thus DB2 for i automatically places a non-exclusive “read” lock
on the savings account record on behalf of John’s thread and allows
the read to proceed.
2M) Mary’s thread attempts to read the balance of the savings account (as
$1500), thus DB2 for i automatically places a second non-exclusive
“read” lock on the savings account record on behalf of Mary’s thread
and allows the read to proceed.
3J) John’s thread subtracts $300 from $1500 to get $1200.
3M) Mary’s thread subtracts $500 from $1500 to get $1000.
4Ja) John’s thread attempts to write $1200 as the new balance of the
savings account, but the write is blocked because of Mary’s thread’s
“read” lock on the same record.
4Ma) Mary’s thread attempts to write $1000 as the new balance of the
savings account, but the write is blocked because of John’s thread’s
“read” lock on the same record.
At this point, a “deadlock” situation occurs since John’s thread is waiting on Mary’s thread and Mary’s thread is waiting on John’s thread. How is this “deadlock” resolved? The answer is, “it depends,” but lets assume both transactions have the same time-out value and that value is relatively small, in which case the following would occur:
4Jb) John’s thread receives the error “Row or object [filename] in
[library] type *FILE in use”.
4Jc) John’s thread issues a ROLLBACK command and DB2 for i automatically
releases John’s thread’s “read” lock on the savings account record
(which unblocks Mary’s thread.)
4Jd) John’s thread displays the error “Row or object [filename] in
[library] type *FILE in use” on John’s screen and waits for his
input.
4Mb) DB2 for i automatically converts Mary’s thread’s “read” lock on the
savings account record to an exclusive “write” lock on behalf of
Mary’s thread and allows Mary’s thread to write $1000 as the new
balance of the savings account.
5M) Mary’s thread attempts to read the balance of the checking account
(as $100), thus DB2 for i automatically places a non-exclusive “read”
lock on the checking account record on behalf of Mary’s thread and
allows the read to proceed.
6M) Mary’s thread adds $500 to $100 to get $600.
7M) Mary’s thread attempts to write $600 as the new balance of the
checking account, thus DB2 for i automatically converts Mary’s thread’s
“read” lock on the checking account record to an exclusive “write”
lock on behalf of Mary’s thread and allows the write to proceed.
8M) Mary’s thread ends its database transaction with a COMMIT command,
thus DB2 for i automatically releases Mary’s thread’s “write” lock on
the savings account record as well as its “write” lock on the
checking account record.
At this point, the total of the savings and checking account adds up to the correct total of $1600, thus we have data integrity. John reads the error on the screen, tells the application a second time to transfer $300 and the following occurs:
1J) John’s thread starts a new database transaction with an isolation
level of Repeatable Read (*ALL).
2J) John’s thread attempts to read the balance of the savings account (as
$1000), thus DB2 for i automatically places a non-exclusive “read” lock
on the savings account record on behalf of John’s thread and allows
the read to proceed.
3J) John’s thread subtracts $300 from $1000 to get $700.
4J) John’s thread attempts to write $700 as the new balance of the savings
account, thus DB2 for i automatically converts John’s thread’s “read”
lock on the savings account record to an exclusive “write” lock on
behalf of John’s thread and allows the write to proceed.
5J) John’s thread attempts to read the balance of the checking account (as
$600), thus DB2 for i automatically places a non-exclusive “read” lock
on the checking account record on behalf of John’s thread and allows
the read to proceed.
6J) John’s thread adds $300 to $600 to get $900.
7J) John’s thread attempts to write $900 as the new balance of the
checking account, thus DB2 for i automatically converts John’s thread’s
“read” lock on the checking account record to an exclusive “write”
lock on behalf of John’s thread and allows the write to proceed.
8J) John’s thread ends its database transaction with a COMMIT command,
thus DB2 for i automatically releases John’s thread’s “write” lock on
the savings account record as well as its “write” lock on the checking
account record.
At this point, we have the “correct” result: $700 in savings and $900 in checking, for a total of $1600. For this sample algorithm, in order to get the “correct” result, we had to use the isolation level Repeatable Read (*ALL) or higher. A different algorithm might need a different isolation level to get its “correct” result. That is why you have to analyze each algorithm individually to determine the lowest isolation level that it can use to guarantee data integrity. Of course, you could just always use the Serializable (*RR) isolation level and always be guaranteed data integrity, but your concurrency might be unacceptable. Things can get even more complicated (than this example) if you have multiple algorithms accessing the same data with different isolation levels.
To maintain data integrity, algorithms must be properly written, not only in terms of using the appropriate isolation level, but also in terms of proper error handling. In the example given above, when the algorithm received the error “Row or object [filename] in [library] type *FILE in use,” it performed a ROLLBACK and (after user intervention) started the entire algorithm from the beginning. THIS IS VERY IMPORTANT. If you receive an error from the database, you should ALWAYS stop execution of the algorithm, perform a ROLLBACK, and if you wish to make another attempt at executing the algorithm, start execution of the algorithm from the very beginning, otherwise data integrity can not be guaranteed. To illustrate this point, consider what would have happened if the algorithm used the “correct” isolation level of Repeatable Read (*ALL), but did not start the execution of the algorithm from the very beginning after receiving an error and issuing a ROLLBACK:
1J) John’s thread starts a new database transaction with an isolation
level of Repeatable Read (*ALL).
1M) Mary’s thread starts a new database transaction with an isolation
level of Repeatable Read (*ALL).
2J) John’s thread attempts to read the balance of the savings account (as
$1500), thus DB2 for i automatically places a non-exclusive “read” lock
on the savings account record on behalf of John’s thread and allows
the read to proceed.
2M) Mary’s thread attempts to read the balance of the savings account (as
$1500), thus DB2 for i automatically places a second non-exclusive
“read” lock on the savings account record on behalf of Mary’s thread
and allows the read to proceed.
3J) John’s thread subtracts $300 from $1500 to get $1200.
3M) Mary’s thread subtracts $500 from $1500 to get $1000.
4Ja) John’s thread attempts to write $1200 as the new balance of the
savings account, but the write is blocked because of Mary’s thread’s
“read” lock on the same record.
4Ma) Mary’s thread attempts to write $1000 as the new balance of the
savings account, but the write is blocked because of John’s thread’s
“read” lock on the same record.
Again, a “deadlock” situation occurs. Again, we assume both transactions have the same time-out value and that value is relatively small:
4Jb) John’s thread receives the error “Row or object [filename] in
[library] type *FILE in use”.
4Jc) John’s thread issues a ROLLBACK command and DB2 for i automatically
releases John’s thread’s “read” lock on the savings account record
(which unblocks Mary’s thread.)
4Jd) John’s thread displays the error “Row or object [filename] in
[library] type *FILE in use” and then displays a “Retry?” message box
on John’s screen and waits for his input.
4Mb) DB2 for i automatically converts Mary’s thread’s “read” lock on the
savings account record to an exclusive “write” lock on behalf of
Mary’s thread and allows Mary’s thread to write $1000 as the new
balance of the savings account.
5M) Mary’s thread attempts to read the balance of the checking account
(as $100), thus DB2 for i automatically places a non-exclusive “read”
lock on the checking account record on behalf of Mary’s thread and
allows the read to proceed.
6M) Mary’s thread adds $500 to $100 to get $600.
7M) Mary’s thread attempts to write $600 as the new balance of the
checking account, thus DB2 for i automatically converts Mary’s thread’s
“read” lock on the checking account record to an exclusive “write”
lock on behalf of Mary’s thread and allows the write to proceed.
8M) Mary’s thread ends its database transaction with a COMMIT command,
thus DB2 for i automatically releases Mary’s thread’s “write” lock on
the savings account record as well as its “write” lock on the
checking account record.
At this point, John reads the error on the screen, clicks “Yes”, and the following occurs:
4Je) John’s thread attempts to write $1200 as the new balance of the
savings account, thus DB2 for i automatically places an exclusive
“write” lock on the savings account record on behalf of John’s
thread and allows the write to proceed.
5J) John’s thread attempts to read the balance of the checking account
(as $600), thus DB2 for i automatically places a non-exclusive “read”
lock on the checking account record on behalf of John’s thread and
allows the read to proceed.
6J) John’s thread adds $300 to $600 to get $900.
7J) John’s thread attempts to write $900 as the new balance of the
checking account, thus DB2 for i automatically converts John’s thread’s
“read” lock on the checking account record to an exclusive “write”
lock on behalf of John’s thread and allows the write to proceed.
8J) John’s thread ends its database transaction with a COMMIT command,
thus DB2 for i automatically releases John’s thread’s “write” lock on
the savings account record as well as its “write” lock on the
checking account record.
At his point, the savings account has $1200 and the checking account has $900, for a total of $2100. That’s $500 that John and Mary got for FREE because of poorly coded error-handling logic.
As you can see, when proper analysis and coding techniques are employed, the various isolation levels of DB2 for i’s commitment control can be very useful for maintaining data integrity, but the analysis required can get complex. With time, practice, patience and discipline, you can master the concepts involved for this sort of analysis.