DBMS - Codd's 12 Rules

Hello, aspiring database enthusiasts! Today, we're going to dive into the fascinating world of Database Management Systems (DBMS) and explore the legendary Codd's 12 Rules. These rules, formulated by the father of relational databases, Dr. Edgar F. Codd, are like the Ten Commandments of the database world. They're not just rules; they're the guiding principles that shape how we think about and design modern databases.

DBMS - Codd

As your friendly neighborhood computer teacher, I'll break down these rules in a way that even those of you who've never written a line of code can understand. So, grab a cup of coffee, get comfortable, and let's embark on this exciting journey together!

Rule 1: Information Rule

Imagine you're organizing your bookshelf. The Information Rule is like saying, "Everything on this shelf must be a book." In the database world, this rule states that all information in a database must be represented in one and only one way - as values in tables.

Let's look at a simple example:

CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Age INT
);

In this example, we're creating a table called "Students". Each piece of information about a student (their ID, first name, last name, and age) is represented as a column in this table. This is the essence of the Information Rule - all data is stored in tables.

Rule 2: Guaranteed Access Rule

This rule is like having a library card that gives you access to any book in the library. In database terms, it means every piece of data should be accessible by knowing the table name, primary key, and column name.

For example, if we want to find the age of a student with ID 1, we could use:

SELECT Age FROM Students WHERE StudentID = 1;

This query guarantees access to the specific piece of information we need, following Rule 2.

Rule 3: Systematic Treatment of NULL Values

NULL values are like the mystery books in our library - they represent unknown or inapplicable information. This rule ensures that NULL values are treated consistently across all operations.

For instance, when inserting a new student without knowing their age:

INSERT INTO Students (StudentID, FirstName, LastName, Age)
VALUES (2, 'Jane', 'Doe', NULL);

The database treats this NULL value systematically, allowing for operations like:

SELECT * FROM Students WHERE Age IS NULL;

Rule 4: Active Online Catalog

Think of this as the library's digital catalog system. The database must have a built-in "catalog" where it stores information about all tables, columns, indexes, and more. In most modern DBMS, this is often referred to as the "data dictionary" or "system catalog".

We can query this catalog. For example, in SQL Server:

SELECT * FROM INFORMATION_SCHEMA.TABLES;

This query gives us information about all tables in our database, demonstrating the active online catalog in action.

Rule 5: Comprehensive Data Sub-Language Rule

This rule is like saying our library should have a universal language for finding, reading, and organizing books. In database terms, it means there should be a comprehensive language for all database operations.

SQL (Structured Query Language) is the most common example of this. It allows us to do everything from creating tables to querying data:

-- Creating a table
CREATE TABLE Books (
    BookID INT PRIMARY KEY,
    Title VARCHAR(100),
    Author VARCHAR(50)
);

-- Inserting data
INSERT INTO Books (BookID, Title, Author)
VALUES (1, 'Database Design 101', 'E. F. Codd');

-- Querying data
SELECT * FROM Books WHERE Author = 'E. F. Codd';

Rule 6: View Updating Rule

Views are like custom bookshelves in our library. This rule states that if you can see a view, you should be able to update it (with some limitations).

Here's an example of creating and updating a view:

-- Create a view
CREATE VIEW TeenageStudents AS
SELECT * FROM Students WHERE Age BETWEEN 13 AND 19;

-- Update through the view
UPDATE TeenageStudents
SET Age = 18
WHERE StudentID = 1;

Rule 7: High-Level Insert, Update, and Delete Rule

This rule ensures that we can modify sets of data in one go, rather than one record at a time. It's like being able to move entire shelves of books at once, rather than book by book.

For example:

-- Insert multiple records
INSERT INTO Students (StudentID, FirstName, LastName, Age)
VALUES 
(3, 'Alice', 'Johnson', 20),
(4, 'Bob', 'Smith', 22),
(5, 'Charlie', 'Brown', 19);

-- Update multiple records
UPDATE Students
SET Age = Age + 1
WHERE Age < 20;

-- Delete multiple records
DELETE FROM Students
WHERE Age > 25;

Rule 8: Physical Data Independence

This rule is about separating the physical storage of data from how we interact with it. It's like not caring whether a book is hardcover or paperback - you can still read it the same way.

In practice, this means changing the physical storage (like moving from HDD to SSD) shouldn't require changes to the application code.

Rule 9: Logical Data Independence

Similar to Rule 8, but on a logical level. It means we should be able to change the logical structure (like adding a new column) without affecting existing applications.

For example, if we add a new column to our Students table:

ALTER TABLE Students
ADD Email VARCHAR(100);

Existing applications that query the Students table should continue to work without modification.

Rule 10: Integrity Independence

This rule ensures that integrity constraints (like "Age must be positive") are defined in the database and not in the application. It's like having rules for book categorization enforced by the library system, not by individual librarians.

Example:

CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    FirstName VARCHAR(50) NOT NULL,
    LastName VARCHAR(50) NOT NULL,
    Age INT CHECK (Age > 0 AND Age < 120)
);

Here, the constraints are defined at the database level, ensuring integrity independence.

Rule 11: Distribution Independence

This rule states that the database should work the same way whether it's distributed across multiple locations or not. It's like being able to access books from different library branches seamlessly.

While there's no simple code example for this, modern distributed databases implement this principle behind the scenes.

Rule 12: Non-Subversion Rule

The final rule is about security and consistency. It states that if the database has a way to access records (like SQL), there shouldn't be a way to bypass this and access the data directly. It's like ensuring all book checkouts go through the library system, with no backdoor access.

This is typically handled at the database system level, not through user-written code.

To summarize all these rules in a handy table:

Rule	Description
1	Information Rule
2	Guaranteed Access Rule
3	Systematic Treatment of NULL Values
4	Active Online Catalog
5	Comprehensive Data Sub-Language Rule
6	View Updating Rule
7	High-Level Insert, Update, and Delete Rule
8	Physical Data Independence
9	Logical Data Independence
10	Integrity Independence
11	Distribution Independence
12	Non-Subversion Rule

And there you have it, folks! Codd's 12 Rules in a nutshell. Remember, these rules were formulated in 1985, and while not all modern databases strictly adhere to every rule, they still form the theoretical foundation of relational database systems.

As we wrap up, I hope this journey through Codd's Rules has been as enlightening for you as it is every time I teach it. These principles have shaped the database world we know today, and understanding them is your first step towards becoming a database guru. Keep practicing, stay curious, and who knows? Maybe one day you'll be formulating the next set of groundbreaking database principles!

Credits: Image by storyset

Previous Tutorial:

DBMS - Generalization, Aggregation

Next Tutorial:

DBMS - Relational Data Model