Writing SQL Subquery | Subqueries in SQL
Mastering SQL Subqueries: A Guide to Writing and Using Subqueries in SQL
Introduction
One of the powerful skills while working with SQL is the ability to write effective subqueries. A subquery is generally known as an inner query or nested query, through which complex filtering and data extraction are implemented within a single SQL statement. Regardless of whether you are dynamically joining tables, filtering results based on calculated values, or doing something else, a subquery can really help simplify your code and avoid complex joins or multiple queries. In the following sections, we will explore what SQL subqueries are, how to write them, and illustrate examples of their usage.
A subquery is an inner SQL statement that may be nested in SELECT, FROM, WHERE, or HAVING clauses. The subquery will return its results as a set of values that the outer query can use. It is especially useful if you need to perform a calculation or a filtering operation based on another table's data.
Basic Syntax
SELECT column1, column2, .
FROM table1
WHERE column IN (SELECT column FROM table2 WHERE condition );
In this scenario, the subquery will first be evaluated, which in turn yields a set of rows to be used for filtering purposes by the outer query.
---
Subqueries in SQL
There are many forms of subqueries based upon their use within a statement and what they're accomplishing. These are sometimes referred to as follows;
1. Single-row Subqueries
Return one row and usually used with =, <, >, <=, >= operators
2. Multi-row Subqueries
Returns more than one row and are used with IN, ANY, or ALL operators.
3. Scalar Subqueries
Returns a single value, often used in the SELECT clause to add a calculated column.
4. Correlated Subqueries
Refer to columns in the outer query, running once for each row returned by the outer query. These are handy for advanced filtering.
Another very obvious use of the subquery is in the WHERE clause. It becomes very convenient if you should filter information by conditions related to some other table.
Task
Let's consider two tables: employees and departments. We want to select the names of all employees working in the "Sales" department.
SELECT name FROM employees WHERE department_id = (SELECT id FROM departments WHERE name = 'Sales');
In the subquery, it identifies the ID of "Sales" so that the outer query can bring back only those employees with that department ID.
2. Subqueries in the SELECT Clause
Subqueries in the SELECT clause are used to make calculated columns based on other tables.
Example
Here we will include a column showing what name their manager is.
SELECT name,
(SELECT manager_name FROM managers WHERE managers.id = employees.manager_id) AS manager
FROM employees;
It will return the manager's name for each employee.
3. Subqueries in the FROM Clause
In using a subquery in the FROM clause you declare temporary tables for use by the outer query. That is also known as a derived table.
Here is an example:
Assume that you need to fetch the average salary of all departments and then display only departments with an average salary more than some threshold.
SELECT department_id, avg_salary
FROM (
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
) AS dept_avg
WHERE avg_salary > 50000;
Here, the subquery computes the average salary per department, and the outer query filters departments based on that.
4. Correlated Subqueries
Correlated subqueries rely on values from the outer query, which makes them so much more useful for filtering data based on dynamic conditions.
Example:
Find employees who earn more than the average salary in their department.
SELECT name, salary
FROM employees AS e1
WHERE salary > (SELECT AVG(salary)
FROM employees AS e2
WHERE e1.department_id = e2.department_id);
This subquery, in the WHERE clause, calculates the average salary for each department. Then it compares each employee's salary to this value.
Subqueries are powerful but sometimes a JOIN might be a better choice to handle the case, especially if the number of rows is very large.
2. Avoid More Unnecessary Nested Subqueries:
Too many nested subqueries really kill your queries. Join when you can.
3. Consider Indexing Columns on Which the Subqueries Run:
Indexes can be helpful to accelerate the search process in subqueries.
4. Use EXISTS Instead of IN in Correlated Subqueries:
If it's a correlated subquery, sometimes using EXISTS can be faster than using IN.
5. Optimize with EXPLAIN Plan:
Use EXPLAIN to understand how your query is executed and where optimization might help.
Common Use Cases for SQL Subqueries
Filtering Based on Aggregated Data
Use subqueries to filter rows based on aggregate values, like finding departments with an above-average salary.
Self-Joins and Recursive Data
In hierarchical data structures, subqueries can be handy for recursive data selection.
Dynamic Calculations
Subqueries automatically calculate values in the SELECT clause without necessitating an extra query to be run first.
SQL subqueries are a very important technique that every SQL developer should know how to handle. With subqueries, you can therefore compute advanced data calculations, filtering, and transformations within one query. Knowing how and when to use subqueries-that is
, when in the SELECT, FROM, or WHERE clause-can be really helpful for getting data out of SQL and transforming it in the way you like. I can attempt to use all these types of subqueries and see their effect on SQL queries and how they are used in the management of data.