SelectFromData
AnalyticsLesson 7 of 9 · 20 min

UNION: Stacking Result Sets on Top of Each Other

A JOIN glues tables side by side, matching on a key. UNION does something different: it stacks two result sets vertically, one on top of the other, into a single longer list. You reach for it when the rows live in different places but mean the same thing: this year's orders plus last year's, online customers plus in-store customers.

In one line

UNION appends one result set below another. Both must have the same number of columns in the same order with compatible types. UNION removes duplicate rows; UNION ALL keeps every row.

The setup: two lists that belong together

Imagine a company that tracked customers in two systems before merging: an online_customers table and an instore_customers table. Same idea, different source. To get one combined list, you stack them.

UNION — stack and de-duplicate

Picture the stack first: the three online rows sit on top of the three in-store rows. Bo from Ghent appears in both lists, so plain UNION drops the second copy:

namecity
AdaAntwerp
BoGhent
CyBruges
BoGhent
DiLeuven
EliAntwerp
UNION (stack, then de-duplicate)
namecity
AdaAntwerp
BoGhent
CyBruges
DiLeuven
EliAntwerp
Six stacked rows become five: the second Bo/Ghent row duplicates the first, so UNION removes it.

Run it: the two tables become one list, sorted by name.

UNION stacks both customer lists into one result
Editable · runs in your browser

Notice Bo appears only once, even though Bo is in both tables. Plain UNION quietly collapses identical rows into one.

Watch out

UNION de-duplicating is convenient until it isn't. If two different people are both named "Bo" in "Ghent", UNION will merge them into a single row and your count is now wrong. De-duplication looks at the whole row, not at any notion of identity.

UNION ALL — keep every row

UNION ALL skips the duplicate check and returns everything from both sides. It's also faster, because the database doesn't have to compare rows. Run it and watch Bo appear twice:

UNION ALL keeps duplicates — Bo now appears twice
Editable · runs in your browser
UNIONUNION ALL
Duplicate rowsRemovedKept
SpeedSlower (must compare rows)Faster (no comparison)
Use whenYou want a distinct combined listYou want every row, or know there are no dupes

If you don't actually need de-duplication, prefer UNION ALL.

Tagging where each row came from

Once you stack two sources, you often lose track of which row came from where. The fix: add a literal column to each SELECT that labels the source. Run it:

A literal column labels each row's origin before stacking
Editable · runs in your browser

The column names come from the first SELECT; the second query's labels are ignored, so only the order and types of its columns matter.

Watch out

The cardinal rule of UNION: both queries must return the same number of columns, in the same order, with compatible types. Line up name, city with city, name by accident and you'll either get an error or, worse, cities silently landing in the name column. Always read your two SELECT lists top-to-bottom and check they match position by position.

From practice

Default to UNION ALL and only switch to plain UNION when you genuinely need duplicates removed. It's faster, and it won't silently merge two distinct rows that happen to look identical.

Your turn

Stack the two sources into one distinct list of customers (name, city), no duplicates. (Which keyword removes them?)

Your turn · Exercise

Combine online_customers and instore_customers into one list of distinct customers (name, city), with no duplicate rows.

The data

online_customers

3 rows returned

namecity
AdaAntwerp
BoGhent
CyBruges

instore_customers

3 rows returned

namecity
BoGhent
DiLeuven
EliAntwerp
Runs in your browser

Next: making a single query react to its data with CASE, SQL's version of if/then/else.