SQL - UNION vs UNION ALL

Hello there, aspiring SQL enthusiasts! Today, we're going to embark on an exciting journey into the world of SQL, specifically focusing on two powerful operators: UNION and UNION ALL. As your friendly neighborhood computer teacher with years of experience under my belt, I'm here to guide you through these concepts with clarity, humor, and plenty of real-world examples. So, let's dive in!

SQL - UNION vs UNION ALL

What is UNION?

Definition and Basic Concept

UNION is like a master chef combining ingredients from different recipes to create a unique dish. In SQL terms, it's an operator that allows us to combine the result sets of two or more SELECT statements into a single result set.

Here's the basic syntax:

SELECT column1, column2, ... FROM table1
UNION
SELECT column1, column2, ... FROM table2;

Key Features of UNION

  1. Distinct Results: UNION automatically removes duplicate rows from the final result set.
  2. Column Compatibility: The SELECT statements must have the same number of columns, and these columns should have compatible data types.
  3. Order of Columns: The order of columns in the SELECT statements matters, as they are combined based on their position.

UNION in Action

Let's imagine we have two tables: fruits and vegetables. We want to create a combined list of all produce items.

-- fruits table
CREATE TABLE fruits (
    id INT,
    name VARCHAR(50),
    color VARCHAR(20)
);

INSERT INTO fruits VALUES
(1, 'Apple', 'Red'),
(2, 'Banana', 'Yellow'),
(3, 'Orange', 'Orange');

-- vegetables table
CREATE TABLE vegetables (
    id INT,
    name VARCHAR(50),
    color VARCHAR(20)
);

INSERT INTO vegetables VALUES
(1, 'Carrot', 'Orange'),
(2, 'Broccoli', 'Green'),
(3, 'Tomato', 'Red');

-- UNION query
SELECT name, color FROM fruits
UNION
SELECT name, color FROM vegetables;

This query will give us:

name color
Apple Red
Banana Yellow
Orange Orange
Carrot Orange
Broccoli Green
Tomato Red

Notice how we have only one 'Orange' entry, even though it appears in both tables. That's the magic of UNION – it removes duplicates!

What is UNION ALL?

Definition and Basic Concept

Now, let's meet UNION's enthusiastic cousin, UNION ALL. While UNION is like a careful chef who removes duplicate ingredients, UNION ALL is like a chef who believes "the more, the merrier!"

The syntax is similar to UNION:

SELECT column1, column2, ... FROM table1
UNION ALL
SELECT column1, column2, ... FROM table2;

Key Features of UNION ALL

  1. Retains Duplicates: UNION ALL keeps all rows, including duplicates.
  2. Faster Performance: Since it doesn't need to remove duplicates, UNION ALL is generally faster than UNION.
  3. Column Compatibility: Like UNION, it requires the same number of columns with compatible data types.

UNION ALL in Action

Let's use our fruits and vegetables tables again:

SELECT name, color FROM fruits
UNION ALL
SELECT name, color FROM vegetables;

This query will produce:

name color
Apple Red
Banana Yellow
Orange Orange
Carrot Orange
Broccoli Green
Tomato Red

Notice that we now have two 'Orange' entries – one from fruits and one from vegetables. UNION ALL keeps all rows, duplicates and all!

UNION vs UNION ALL: A Head-to-Head Comparison

To really understand the difference, let's compare them side by side:

Feature UNION UNION ALL
Duplicate Handling Removes duplicates Retains all duplicates
Performance Slower (due to deduplication) Faster
Result Set Size Potentially smaller Potentially larger
Use Case When unique results are needed When all results, including duplicates, are required

When to Use UNION

Use UNION when you need to combine results from multiple queries and want to ensure that each row in the result set is unique. It's perfect for scenarios like:

  1. Combining customer lists from different regions, ensuring no duplicates.
  2. Merging product catalogs from different suppliers, removing duplicate items.

When to Use UNION ALL

Opt for UNION ALL when:

  1. You know there are no duplicates across your data sets.
  2. You specifically need to keep all rows, including duplicates.
  3. Performance is a priority, and you're dealing with large data sets.

Practical Examples

Let's dive into some real-world scenarios to solidify our understanding.

Example 1: Employee Directory

Imagine we have two tables: current_employees and former_employees. We want to create a comprehensive directory.

-- Create and populate tables
CREATE TABLE current_employees (id INT, name VARCHAR(50), department VARCHAR(50));
CREATE TABLE former_employees (id INT, name VARCHAR(50), department VARCHAR(50));

INSERT INTO current_employees VALUES
(1, 'Alice', 'HR'),
(2, 'Bob', 'IT'),
(3, 'Charlie', 'Finance');

INSERT INTO former_employees VALUES
(4, 'David', 'Marketing'),
(5, 'Eve', 'IT'),
(2, 'Bob', 'IT');  -- Bob used to work here, left, and came back

-- UNION query
SELECT name, department FROM current_employees
UNION
SELECT name, department FROM former_employees;

-- UNION ALL query
SELECT name, department FROM current_employees
UNION ALL
SELECT name, department FROM former_employees;

The UNION query will show Bob only once, while UNION ALL will show Bob twice.

Example 2: Sales Report

Let's create a sales report combining online and in-store sales.

-- Create and populate tables
CREATE TABLE online_sales (product VARCHAR(50), amount DECIMAL(10,2));
CREATE TABLE store_sales (product VARCHAR(50), amount DECIMAL(10,2));

INSERT INTO online_sales VALUES
('Laptop', 1200.00),
('Phone', 800.00),
('Tablet', 500.00);

INSERT INTO store_sales VALUES
('Laptop', 1100.00),
('Phone', 750.00),
('Headphones', 200.00);

-- Total unique products sold (UNION)
SELECT product FROM online_sales
UNION
SELECT product FROM store_sales;

-- All sales entries (UNION ALL)
SELECT 'Online' AS source, product, amount FROM online_sales
UNION ALL
SELECT 'Store' AS source, product, amount FROM store_sales;

The UNION query gives us a list of unique products sold across both channels, while UNION ALL provides a complete list of all sales transactions.

Conclusion

And there you have it, my dear students! We've journeyed through the lands of UNION and UNION ALL, exploring their similarities, differences, and real-world applications. Remember, UNION is your go-to for unique, deduplicated results, while UNION ALL is your speedy friend when you need all the data, duplicates and all.

As you continue your SQL adventure, you'll find countless opportunities to use these powerful tools. Just like in cooking, knowing when to use each ingredient (or in our case, operator) is the key to creating the perfect query recipe.

Keep practicing, stay curious, and don't be afraid to experiment with these concepts. Before you know it, you'll be UNIONing and UNION ALLing like a pro! Happy querying!

Credits: Image by storyset