MySQL - Vertical Partitioning

Hello, aspiring database enthusiasts! Today, we're diving into the fascinating world of MySQL vertical partitioning. As your friendly neighborhood computer teacher, I'm excited to guide you through this journey, even if you're completely new to programming. Don't worry; we'll take it step by step, and before you know it, you'll be partitioning data like a pro!

Vertical Partitioning in MySQL

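Imagine you have a big box of toys (that's our table), and you want to organize them better. Vertical partitioning is like sorting your toys into different smaller boxes based on their characteristics. In database terms, it's the process of dividing a table's columns into separate tables that share the same key. MySQL has no built-in feature for this (its native partitioning splits rows, not columns), so you do the split yourself with ordinary tables.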

Let's start with a simple example:

CREATE TABLE students (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    age INT,
    email VARCHAR(100),
    phone VARCHAR(20),
    address TEXT
);

This is our original students table. Now, let's apply vertical partitioning:

CREATE TABLE students_main (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    age INT
);

CREATE TABLE students_contact (
    id INT PRIMARY KEY,
    email VARCHAR(100),
    phone VARCHAR(20)
);

CREATE TABLE students_address (
    id INT PRIMARY KEY,
    address TEXT
);

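Here, we've split our original table into three smaller tables that all use the same id as their primary key. This can improve performance when most queries read only a few columns, because those queries no longer have to load the wide, rarely used ones such as address.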
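If the original students table already holds data, the rows have to be copied into the new tables. One way this could look is sketched below. It assumes all four tables live in the current database, the three new tables are still empty, and the storage engine is transactional (such as InnoDB). Treat it as an illustration, not a complete migration procedure.

```sql
-- Copy each group of columns from the original table into its new table.
-- Assumes students, students_main, students_contact, and students_address
-- all exist, and that the three target tables are empty.
START TRANSACTION;

INSERT INTO students_main (id, name, age)
SELECT id, name, age FROM students;

INSERT INTO students_contact (id, email, phone)
SELECT id, email, phone FROM students;

INSERT INTO students_address (id, address)
SELECT id, address FROM students;

COMMIT;
```

Wrapping the three copies in one transaction means a failure partway through leaves no half-filled tables behind.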

Why use vertical partitioning?

  1. Improved query performance
  2. Better data management
  3. Enhanced security (you can restrict access to sensitive data)
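To make the security point concrete, here is a minimal sketch. The database name school, the account report_reader, and its password are all made up for illustration; substitute your own, and note that your server's password policy may require something different.

```sql
-- Hypothetical account that is allowed to read names and ages only.
CREATE USER 'report_reader'@'localhost'
    IDENTIFIED BY 'Replace-With-A-Real-Passw0rd!';

-- Grant SELECT on the non-sensitive table only. Nothing is granted on
-- students_contact or students_address, so this account cannot read them.
GRANT SELECT ON school.students_main TO 'report_reader'@'localhost';
```

Because the sensitive columns live in their own tables, a plain table-level grant is enough to keep them out of reach.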

Now, let's see how we can insert and retrieve data with this new structure:

-- Inserting data
INSERT INTO students_main VALUES (1, 'Alice', 20);
INSERT INTO students_contact VALUES (1, 'alice@example.com', '123-456-7890');
INSERT INTO students_address VALUES (1, '123 Main St, Anytown, USA');

-- Retrieving data
SELECT m.id, m.name, c.email, a.address
FROM students_main m
JOIN students_contact c ON m.id = c.id
JOIN students_address a ON m.id = a.id
WHERE m.id = 1;

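This way, you can still get all the information about a student, but a full lookup now costs two joins. The split pays off when most of your queries need only one of the three tables. Also note that a plain JOIN skips any student who is missing a row in students_contact or students_address.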
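If writing the joins every time feels tedious, one option is to hide them behind a view. The sketch below assumes the three tables from this section; the view name students_full is just an illustrative choice. It uses LEFT JOIN so that a student with no contact or address row still shows up, with NULL in the missing columns.

```sql
-- A view that stitches the three tables back into one wide row per student.
CREATE VIEW students_full AS
SELECT m.id, m.name, m.age, c.email, c.phone, a.address
FROM students_main AS m
LEFT JOIN students_contact AS c ON c.id = m.id
LEFT JOIN students_address AS a ON a.id = m.id;

-- Query the view as if it were the original table.
SELECT id, name, email, address
FROM students_full
WHERE id = 1;
```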

Range Columns Partitioning

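Now, let's level up and talk about range columns partitioning. Unlike the manual column split we just did, this is a built-in MySQL feature, and it divides a table's rows rather than its columns (that's called horizontal partitioning). It's like organizing your books by the year they were published, except that the ranges can be defined over more than one column.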

Here's an example using a sales table:

CREATE TABLE sales (
    id INT NOT NULL,
    sale_date DATE NOT NULL,
    -- RANGE COLUMNS accepts integer, DATE/DATETIME, and string columns,
    -- but not DECIMAL or FLOAT, so amount is stored as a whole number here
    amount INT NOT NULL
)
PARTITION BY RANGE COLUMNS(sale_date, amount) (
    PARTITION p0 VALUES LESS THAN ('2023-01-01', 1000),
    PARTITION p1 VALUES LESS THAN ('2023-01-01', 5000),
    PARTITION p2 VALUES LESS THAN ('2023-07-01', 1000),
    PARTITION p3 VALUES LESS THAN ('2023-07-01', 5000),
    PARTITION p4 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);

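In this example, we're partitioning on the pair (sale_date, amount). MySQL compares the pair as a whole: sale_date decides first, and amount only breaks the tie between rows that have exactly the same sale_date. Each row goes into the first partition whose boundary pair is greater than the row's own pair.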
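You can try this pair-by-pair comparison yourself, because MySQL applies the same rule to row values in an ordinary SELECT. The short sketch below uses values picked only for illustration; the column aliases are made up.

```sql
-- Row values are compared left to right: the first element decides,
-- and the second is consulted only when the first elements are equal.
SELECT
    ('2022-12-20', 2000) < ('2023-01-01', 1000) AS earlier_date_wins,
    ('2023-01-01', 2000) < ('2023-01-01', 1000) AS tie_broken_by_amount;
```

The first comparison returns 1 because the date is earlier, even though the amount is larger. The second returns 0 because the dates are equal and 2000 is not less than 1000.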

Let's insert some data and see how it works:

INSERT INTO sales VALUES
(1, '2022-12-15', 500),
(2, '2022-12-20', 2000),
(3, '2023-03-10', 800),
(4, '2023-03-15', 3000),
(5, '2023-08-01', 1500);

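-- Let's check how many rows each partition holds
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'sales';

This query lists each partition with its row count, not the individual records (for InnoDB tables, TABLE_ROWS is only an estimate). Because sale_date is compared first, rows 1 and 2 land in p0, rows 3 and 4 in p2, and row 5 in p4, while p1 and p3 stay empty. To see the actual records stored in one partition, you can run SELECT * FROM sales PARTITION (p2);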
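Partitioning helps most when MySQL can skip partitions that cannot contain matching rows. A quick way to check is to put EXPLAIN in front of a query and look at the partitions column of its output (shown by default in MySQL 5.7 and later). The sketch below assumes the sales table from this section; the exact list of partitions reported depends on your server version.

```sql
-- Ask MySQL which partitions it would read for a date-bounded query.
-- Inspect the "partitions" column in the result.
EXPLAIN
SELECT id, sale_date, amount
FROM sales
WHERE sale_date < '2023-01-01';
```

If the partitions column names only some of the partitions, the rest were pruned and will not be scanned.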

List Columns Partitioning

Last but not least, let's explore list columns partitioning. This is like sorting your clothes by both color and type – you can have a drawer for red shirts, another for blue pants, and so on.

Here's an example using a products table:

CREATE TABLE products (
    id INT NOT NULL,
    name VARCHAR(50),
    category VARCHAR(20),
    sub_category VARCHAR(20)
)
PARTITION BY LIST COLUMNS(category, sub_category) (
    PARTITION p_electronics VALUES IN (('Electronics', 'Phones'), ('Electronics', 'Laptops')),
    PARTITION p_clothing VALUES IN (('Clothing', 'Shirts'), ('Clothing', 'Pants')),
    PARTITION p_other VALUES IN (('Books', 'Fiction'), ('Books', 'Non-fiction'))
);

Now, let's add some products:

INSERT INTO products VALUES
(1, 'iPhone 12', 'Electronics', 'Phones'),
(2, 'MacBook Pro', 'Electronics', 'Laptops'),
(3, 'Levi''s Jeans', 'Clothing', 'Pants'),
(4, 'Harry Potter', 'Books', 'Fiction');

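-- Let's check how many rows each partition holds
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'products';

This query lists each partition with its row count (an estimate for InnoDB tables), not the individual products. Based on category and sub-category, the iPhone and the MacBook go to p_electronics, the jeans go to p_clothing, and Harry Potter goes to p_other. To list the products stored in one partition, you can run SELECT * FROM products PARTITION (p_electronics);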
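One thing to watch with list columns partitioning: a row whose (category, sub_category) pair is not listed in any partition is rejected with an error. A sketch of how a new pair could be made valid is shown below. The partition name p_toys, the 'Toys' and 'Puzzles' values, and the sample product are all invented for illustration.

```sql
-- Without this partition, inserting a ('Toys', 'Puzzles') row fails,
-- because no existing partition lists that pair.
ALTER TABLE products
    ADD PARTITION (
        PARTITION p_toys VALUES IN (('Toys', 'Puzzles'))
    );

-- Now the new pair has a home.
INSERT INTO products VALUES (5, 'Jigsaw 500', 'Toys', 'Puzzles');
```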

Comparison of Partitioning Methods

Method | Use Case | Pros | Cons
Vertical Partitioning | Large tables with infrequently used columns | Improved query performance, better data management | More complex joins for full data retrieval
Range Columns Partitioning | Time-based or numerical range data | Efficient for range-based queries, good for historical data | Potential for uneven data distribution
List Columns Partitioning | Categorical data with multiple attributes | Precise control over data placement, good for complex categorizations | Can become unwieldy with many categories

And there you have it, folks! We've journeyed through the land of MySQL vertical partitioning, explored the realms of range columns partitioning, and ventured into the territory of list columns partitioning. Remember, like organizing your room, the key to good database design is putting things where they make the most sense and are easiest to find. Keep practicing, and soon you'll be a database organizing wizard!
