SelectFromData
AnalyticsLesson 6 of 9 · 25 min

JOINs: Combining Two Tables

Real databases spread information across many tables (employees in one, departments in another) linked by a shared key. A JOIN stitches them back together so you can use columns from both in one query. This is the last fundamental, and the one with the most famous beginner trap.

In one line

A JOIN matches rows from two tables on a shared key. INNER JOIN keeps only rows that match in both; LEFT JOIN keeps every row from the left table even when the right side has no match.

The setup: two related tables

Here employees.dept_id points at departments.dept_id. The join key is dept_id.

employeesleft
  • name
  • dept_id (FK)
  • salary
departmentsright
  • dept_id (PK)
  • dept_name
  • location

INNER JOIN — only matching rows

See the match first: each employee's dept_id is looked up in departments. Rows whose key matches share a colour; rows with no partner (Di's dept_id 99, and Marketing, which has no employees) stay grey and never reach the result:

employees

namedept_idsalary
Ada162000
Bo248000
Cy171000
Di9952000

departments

dept_iddept_namelocation
1EngineeringAntwerp
2SalesGhent
3MarketingBruges
INNER JOIN ON employees.dept_id = departments.dept_id

result

namedept_idsalarydept_namelocation
Ada162000EngineeringAntwerp
Bo248000SalesGhent
Cy171000EngineeringAntwerp
INNER JOIN emits one result row per matching pair; unmatched rows on either side are dropped.

Try it now: write the INNER JOIN the visual showed, each employee's name and dept_name (only the matched ones).

Your turn · Exercise

INNER JOIN employees to departments on dept_id. Return name and dept_name (matched rows only).

The data

employees

4 rows returned

namedept_id
Ada1
Bo2
Cy1
Di99

departments

3 rows returned

dept_iddept_name
1Engineering
2Sales
3Marketing
Runs in your browser

INNER JOIN returns a row only where the key exists in both tables. Run it: each employee gets their department's name and location.

INNER JOIN matches employees to departments on dept_id
Editable · runs in your browser

Two things to notice: we gave the tables aliases (e, d) so we can write e.name and d.dept_name, and the ON clause states which keys must match.

Watch out

Look closely at the result: Di is missing. Di's dept_id is 99, which doesn't exist in departments, so INNER JOIN drops the row entirely, silently. This is the trap: an inner join quietly discards rows that don't match, and your headcount comes up short without any error.

LEFT JOIN — keep every row from the left

A LEFT JOIN keeps all rows from the left table; where the right table has no match, its columns come back NULL. Watch Di survive this time, kept in the result, with the department columns filled by NULL:

employees

namedept_idsalary
Ada162000
Bo248000
Cy171000
Di9952000

departments

dept_iddept_namelocation
1EngineeringAntwerp
2SalesGhent
3MarketingBruges
LEFT JOIN ON employees.dept_id = departments.dept_id

result

namedept_idsalarydept_namelocation
Ada162000EngineeringAntwerp
Bo248000SalesGhent
Cy171000EngineeringAntwerp
Di9952000NULLNULL
LEFT JOIN keeps every employee. Di has no matching department, so dept_name and location come back NULL instead of dropping the row.

Run it: now Di stays, with empty department fields.

LEFT JOIN keeps Di; the unmatched department columns are NULL
Editable · runs in your browser
INNER JOINLEFT JOIN
KeepsOnly rows matching in both tablesEvery left-table row, matched or not
Unmatched left rowsDropped silentlyKept, right columns = NULL
Use whenYou only want matched recordsYou must not lose any left-table rows

Which one do I use?

Ask: "do I need every row from my main table, even the ones without a match?" If yes, use LEFT JOIN. That covers headcounts, "customers and their orders (including those who never ordered)", and completeness checks. If you genuinely only want matched pairs, INNER JOIN.

From practice

When a joined query returns fewer rows than the table you started from, an INNER JOIN silently dropping unmatched rows is the first thing to suspect. Switch to LEFT JOIN and look for the NULLs; they tell you exactly which keys didn't match.

Your turn

Keep every employee. List each employee with their department name, including Di, whose department is unknown. (Which join keeps the unmatched row?)

Your turn · Exercise

List every employee with their department name, keeping employees even when no department matches. Return name and dept_name.

The data

employees

4 rows returned

namedept_idsalary
Ada162,000
Bo248,000
Cy171,000
Di9952,000

departments

3 rows returned

dept_iddept_namelocation
1EngineeringAntwerp
2SalesGhent
3MarketingBruges
Runs in your browser

That's the foundation: selecting, filtering, sorting, aggregating, grouping, and joining. You're now ready for SQL for Data Analysts, which picks up where this leaves off (window functions, CTEs, and the patterns real reporting runs on), and for the business cases, where you'll use all of it on deliberately messy data.