SelectFromData
AnalyticsLesson 9 of 9 · 25 min

Subqueries: A Query Inside a Query

Sometimes the answer you want depends on another answer you have to compute first: "show me everyone who earns above the company average". You can't know the average until you've queried for it. A subquery lets you nest one query inside another so SQL works it out in a single statement.

In one line

A subquery is a SELECT wrapped in parentheses and used inside another query. SQL runs the inner query first, then feeds its result to the outer query.

A subquery that returns one value

The simplest subquery returns a single number you can compare against. Here the inner query computes the average salary; the outer query uses it in WHERE. Run it: only the above-average earners come back.

The inner SELECT computes the average; the outer query filters against it
Editable · runs in your browser

The average here is 55000. SQL computes that inner number once, then compares every row's salary against it. This is why you can't write WHERE salary > AVG(salary) directly; as you saw with HAVING, an aggregate can't sit in WHERE. The subquery sidesteps that by computing the average as a separate, complete query.

Watch out

A subquery used with = or > must return exactly one value: one row, one column. If (SELECT salary FROM employees WHERE department = 'Sales') returns three salaries, the comparison errors out. When the inner query can return several rows, use IN instead of =, which is built for a list.

A subquery that returns a list: IN

When the inner query returns many values, pair it with IN. Here we find every employee who works in a department located in Antwerp. The inner query produces the list of qualifying dept_ids, and IN checks membership. Run it:

The subquery returns a list of dept_ids; IN checks each employee against it
Editable · runs in your browser

The inner query returns 1 and 3 (Engineering and Marketing are in Antwerp), and IN keeps every employee whose dept_id is in that list.

A subquery as a table you can query

A subquery can also stand in for a whole table in the FROM clause: a temporary, named result you then query further. This is handy for filtering on an aggregate you just computed. Run it: first summarise per department, then keep only the larger ones.

The inner query builds a per-department summary; the outer query filters it
Editable · runs in your browser

The subquery in FROM must be given an alias (AS dept_summary); the outer query needs a name to refer to it by.

Watch out

A subquery in FROM without an alias is a hard error in most databases, and the message ("subquery must have an alias") confuses beginners because it doesn't mention the missing name directly. If a FROM (SELECT ...) query won't run, check for the alias first.

From practice

Build subqueries from the inside out. Write and run the inner SELECT on its own first, confirm it returns what you expect, then wrap it. Debugging a nested query you wrote all at once is far harder than verifying each layer as you go.

Your turn

Use a subquery. Return everyone earning more than the company average salary (name, salary), highest first.

Your turn · Exercise

Return every employee earning more than the company average salary, with name and salary, highest first.

The data

employees

5 rows returned

namesalary
Ada48,000
Bo62,000
Cy71,000
Di55,000
Eli39,000
Runs in your browser

That rounds out the fundamentals: you can now select, filter, sort, aggregate, group, join, stack, branch with CASE, and nest queries. From here, SQL for Data Analysts takes these same ideas further: CTEs (a cleaner way to write the subqueries you just learned), window functions, and the reporting patterns real analytics runs on.