DBMS - Concurrency Control: A Beginner's Guide

Hello there, future database wizards! Today, we're going to embark on an exciting journey into the world of Concurrency Control in Database Management Systems (DBMS). Don't worry if you're new to this; I'll be your friendly guide, and we'll explore this topic step by step. So, grab a cup of coffee, and let's dive in!

DBMS - Concurrency Control

What is Concurrency Control?

Before we get into the nitty-gritty, let's understand what concurrency control is all about. Imagine a busy restaurant where multiple waiters are trying to take orders and serve food simultaneously. Without proper coordination, chaos would ensue! Similarly, in a database, multiple users or processes might try to access and modify data at the same time. Concurrency control is like the head waiter who ensures everything runs smoothly without conflicts.

Now, let's explore the main techniques used for concurrency control in DBMS.

Lock-based Protocols

Understanding Locks

Locks are like "Do Not Disturb" signs on hotel room doors. When a transaction needs to access data, it puts a lock on it, telling others, "Hey, I'm working here!"

Types of Locks

Lock Type Description Use Case
Shared Lock (S) Allows multiple transactions to read data Reading data without modifications
Exclusive Lock (X) Only one transaction can hold this lock Writing or updating data

Two-Phase Locking (2PL) Protocol

This protocol is like a dance with two main moves:

  1. Growing Phase: Acquire locks, don't release any.
  2. Shrinking Phase: Release locks, don't acquire any.

Let's see a simple example:

BEGIN TRANSACTION;
-- Growing Phase
LOCK TABLE users IN EXCLUSIVE MODE;
UPDATE users SET balance = balance - 100 WHERE id = 1;
LOCK TABLE transactions IN EXCLUSIVE MODE;
INSERT INTO transactions (user_id, amount) VALUES (1, -100);
-- Shrinking Phase
UNLOCK TABLE users;
UNLOCK TABLE transactions;
COMMIT;

In this example, we first lock the tables we need, perform our operations, and then release the locks before committing the transaction.

Deadlocks: The Dance Gone Wrong

Imagine two dancers both waiting for the other to make a move. That's a deadlock! In databases, it happens when two transactions are waiting for each other to release a lock.

To prevent deadlocks, we use techniques like:

  1. Timeout: If a transaction waits too long, it's rolled back.
  2. Deadlock detection: The system actively looks for deadlocks and resolves them.

Timestamp-based Protocols

Now, let's switch gears and talk about timestamp-based protocols. These are like giving each transaction a unique ticket with a timestamp when it enters the system.

Basic Timestamp Ordering (TO) Protocol

In this protocol, we use timestamps to determine the order of conflicting operations. It's like serving customers based on when they arrived at the restaurant.

Here's how it works:

  1. Each data item X has two timestamp values:

    • W-timestamp(X): The largest timestamp of any transaction that successfully wrote X.
    • R-timestamp(X): The largest timestamp of any transaction that successfully read X.
  2. For a transaction T trying to read X:

    • If TS(T) < W-timestamp(X), T is too late and must be aborted and restarted.
    • Otherwise, allow T to read X and set R-timestamp(X) to max(R-timestamp(X), TS(T)).
  3. For a transaction T trying to write X:

    • If TS(T) < R-timestamp(X) or TS(T) < W-timestamp(X), T is too late and must be aborted and restarted.
    • Otherwise, allow T to write X and set W-timestamp(X) to TS(T).

Let's see an example:

class DataItem:
    def __init__(self):
        self.value = None
        self.r_timestamp = 0
        self.w_timestamp = 0

def read(transaction, data_item):
    if transaction.timestamp < data_item.w_timestamp:
        print(f"Transaction {transaction.id} is too late to read. Aborting...")
        abort(transaction)
    else:
        print(f"Transaction {transaction.id} reads value: {data_item.value}")
        data_item.r_timestamp = max(data_item.r_timestamp, transaction.timestamp)

def write(transaction, data_item, new_value):
    if (transaction.timestamp < data_item.r_timestamp or 
        transaction.timestamp < data_item.w_timestamp):
        print(f"Transaction {transaction.id} is too late to write. Aborting...")
        abort(transaction)
    else:
        print(f"Transaction {transaction.id} writes value: {new_value}")
        data_item.value = new_value
        data_item.w_timestamp = transaction.timestamp

def abort(transaction):
    print(f"Transaction {transaction.id} aborted and will be restarted.")

In this example, we've implemented basic read and write operations following the timestamp ordering protocol. The system checks timestamps before allowing operations and updates them accordingly.

Thomas Write Rule: A Smart Optimization

Thomas Write Rule is like letting a faster runner pass a slower one in a race. It allows us to ignore some "too late" writes without aborting the transaction.

Here's how it works:

If TS(T) < W-timestamp(X), instead of aborting T, we simply ignore this write operation. This is safe because the value being written is outdated anyway.

Let's modify our write function to include Thomas Write Rule:

def write_with_thomas_rule(transaction, data_item, new_value):
    if transaction.timestamp < data_item.r_timestamp:
        print(f"Transaction {transaction.id} is too late to write. Aborting...")
        abort(transaction)
    elif transaction.timestamp < data_item.w_timestamp:
        print(f"Transaction {transaction.id}'s write ignored due to Thomas Write Rule.")
    else:
        print(f"Transaction {transaction.id} writes value: {new_value}")
        data_item.value = new_value
        data_item.w_timestamp = transaction.timestamp

This optimization helps reduce the number of unnecessary transaction aborts, improving overall system performance.

Wrapping Up

Phew! We've covered a lot of ground today, from lock-based protocols to timestamp-based ones. Remember, concurrency control is all about maintaining order in the chaotic world of simultaneous database operations. It's like being a traffic cop at a busy intersection, making sure everyone gets where they need to go without crashing into each other.

As you continue your journey in the world of databases, you'll encounter more advanced concepts and techniques. But for now, pat yourself on the back for mastering these fundamental concepts of concurrency control!

Keep practicing, stay curious, and happy coding!

Credits: Image by storyset