MSH Logo

The Mental Model of SQL

A mental model for understanding SQL queries

Published on

Think of SQL as a declarative language - meaning you tell the database what you want, not how to do it. Unlike imperative programming where you write step-by-step instructions, SQL lets you describe the desired result and lets the database figure out the best way to get there.

This mental shift is crucial for writing effective SQL queries and understanding how databases work under the hood.

The SQL Mindset

SQL is about describing what you want, not how to get it. The database engine handles the optimization and execution details.

What is a SQL Query?

A SQL query is a request to the database to either:

  • Retrieve data (SELECT)
  • Modify data (INSERT, UPDATE, DELETE)
  • Modify structure (CREATE, ALTER, DROP)
Every SQL query comes down to four questions
  1. What data do I want? (SELECT columns)
  2. Where does it live? (FROM tables)
  3. What conditions must be met? (WHERE clauses)
  4. In what shape do I want it returned? (ORDER BY, GROUP BY, etc.)

Example Query Breakdown

1SELECT name, age 2FROM users 3WHERE age > 25 4ORDER BY age DESC;

Let's break this down using our mental model:

  1. What data?name and age columns
  2. Where?users table
  3. What conditions? → Only users older than 25
  4. What shape? → Sorted by age, highest first

Understanding JOINs

JOINs are how SQL combines rows from two or more tables based on a related column between them. Think of JOINs as merging tables horizontally based on matching values.

The JOIN Mental Model

Imagine you have two spreadsheets:

  • Left table: Customer information
  • Right table: Order information

A JOIN creates a new "virtual table" that combines matching rows from both tables.

Join TypeWhat it doesWhen to use
INNER JOINReturns only rows that have matching values in both tablesWhen you only want records that exist in both tables
LEFT JOINAll rows from left table, NULL if no right matchWhen you want all left records, even without matches
RIGHT JOINAll rows from right table, NULL if no left matchWhen you want all right records, even without matches
FULL OUTER JOINAll rows from both tables, NULL where no matchWhen you want all records from both tables

Practical JOIN Examples

1-- Find all customers and their orders (including customers with no orders) 2SELECT c.name, o.order_date, o.total 3FROM customers c 4LEFT JOIN orders o ON c.id = o.customer_id; 5 6-- Find only customers who have placed orders 7SELECT c.name, o.order_date, o.total 8FROM customers c 9INNER JOIN orders o ON c.id = o.customer_id; 10 11-- Find all orders and their customer info (including orders with no customer) 12SELECT c.name, o.order_date, o.total 13FROM customers c 14RIGHT JOIN orders o ON c.id = o.customer_id;

JOIN Performance: The Database's Perspective

Understanding how databases execute JOINs helps you write more efficient queries.

Key Performance Factors

FactorImpactOptimization
Table SizeLarger tables = slower joinsFilter early with WHERE clauses
IndexesIndexed join keys = much fasterCreate indexes on foreign keys
Join TypeINNER JOINs are usually fasterUse INNER when possible
Data DistributionSkewed data affects performanceConsider partitioning strategies
Performance Rule of Thumb

The faster the database can reduce data early, the better. That means:

  • Apply WHERE conditions before joining
  • Use indexes so we don't scan entire tables
  • Make sure joins use indexed columns
  • Filter out unnecessary data as early as possible

How Databases Actually Execute JOINs

SQL engines use different algorithms behind the scenes:

AlgorithmBest ForHow It Works
Nested LoopSmall datasets, indexed joinsFor each row in outer table, scan inner table
Hash JoinLarge datasets, no indexesBuild hash table from smaller table, probe with larger
Merge JoinBoth sides sortedMerge two sorted inputs like merge sort

Understanding Subqueries

A subquery is a query inside another query. Think of it as asking a question to answer another question.

Subquery Mental Model

Before I ask for this... go and find that first.

Subqueries help break complex problems into manageable steps, where the result of one step becomes input to the next.

Types of Subqueries

Scalar Subqueries (Return Single Value)

1-- Find the user who placed the highest order 2SELECT name 3FROM users 4WHERE id = ( 5 SELECT user_id 6 FROM orders 7 ORDER BY total DESC 8 LIMIT 1 9);

List Subqueries (Return Multiple Values)

1-- Find all users who have placed orders over $100 2SELECT name 3FROM users 4WHERE id IN ( 5 SELECT user_id 6 FROM orders 7 WHERE total > 100 8);

Correlated Subqueries (Reference Outer Query)

1-- Find users who have placed more orders than the average 2SELECT name, order_count 3FROM users u 4WHERE ( 5 SELECT COUNT(*) 6 FROM orders o 7 WHERE o.user_id = u.id 8) > ( 9 SELECT AVG(order_count) 10 FROM ( 11 SELECT COUNT(*) as order_count 12 FROM orders 13 GROUP BY user_id 14 ) avg_orders 15);

Subquery vs JOIN: When to Use Which?

SituationUse SubqueryUse JOIN
Checking existenceWHERE EXISTSINNER JOIN
Getting single valueScalar subqueryAggregation with JOIN
Complex filteringMultiple conditionsSimple relationships
PerformanceOften slowerUsually faster

Subquery Performance Considerations

Common Performance Issues

  1. N+1 Problem: Subquery executes for each outer row
  2. Missing Indexes: Subquery can't use indexes efficiently
  3. Data Duplication: Subquery might process same data multiple times

Optimization Strategies

1-- SLOW: Correlated subquery (executes for each user) 2SELECT name, ( 3 SELECT COUNT(*) 4 FROM orders o 5 WHERE o.user_id = u.id 6) as order_count 7FROM users u; 8 9-- FAST: JOIN with aggregation 10SELECT u.name, COUNT(o.id) as order_count 11FROM users u 12LEFT JOIN orders o ON u.id = o.user_id 13GROUP BY u.id, u.name;
Subquery Performance Tip

When possible, rewrite correlated subqueries as JOINs. JOINs are typically more efficient because the database can optimize the entire operation together.

Advanced SQL Concepts

Window Functions

Window functions let you perform calculations across a set of table rows related to the current row, without grouping.

1-- Rank users by their total order value 2SELECT 3 name, 4 total_spent, 5 RANK() OVER (ORDER BY total_spent DESC) as rank, 6 ROW_NUMBER() OVER (ORDER BY total_spent DESC) as row_num 7FROM ( 8 SELECT 9 u.name, 10 SUM(o.total) as total_spent 11 FROM users u 12 JOIN orders o ON u.id = o.user_id 13 GROUP BY u.id, u.name 14) user_totals;

Common Table Expressions (CTEs)

CTEs make complex queries more readable by breaking them into named, reusable parts.

1-- Find users with above-average order counts 2WITH user_order_counts AS ( 3 SELECT 4 user_id, 5 COUNT(*) as order_count 6 FROM orders 7 GROUP BY user_id 8), 9average_orders AS ( 10 SELECT AVG(order_count) as avg_count 11 FROM user_order_counts 12) 13SELECT u.name, uoc.order_count 14FROM users u 15JOIN user_order_counts uoc ON u.id = uoc.user_id 16JOIN average_orders ao ON uoc.order_count > ao.avg_count;

SQL Query Execution Order

Understanding the logical order of SQL execution helps debug and optimize queries:

OrderClauseWhat Happens
1FROMIdentify source tables
2WHEREFilter rows
3GROUP BYGroup rows
4HAVINGFilter groups
5SELECTChoose columns
6ORDER BYSort results
7LIMITLimit results

Best Practices for SQL Mental Models

Think in Sets, Not Loops

SQL operates on sets of data, not individual records. This is why:

  • WHERE filters entire sets
  • JOIN combines sets
  • GROUP BY creates new sets

Understand Data Relationships

Before writing queries, map out:

  • Which tables contain the data you need
  • How tables relate to each other
  • What the relationships mean for your query

Start Simple, Then Add Complexity

  1. Write a basic SELECT to see your data
  2. Add WHERE conditions to filter
  3. Add JOINs to combine tables
  4. Add GROUP BY for aggregations
  5. Add ORDER BY for sorting

Use EXPLAIN to Understand Execution

Most databases provide EXPLAIN or EXPLAIN ANALYZE to show how your query will be executed:

1EXPLAIN ANALYZE 2SELECT u.name, COUNT(o.id) as order_count 3FROM users u 4LEFT JOIN orders o ON u.id = o.user_id 5WHERE u.created_at > '2023-01-01' 6GROUP BY u.id, u.name 7ORDER BY order_count DESC;

Common SQL Anti-Patterns to Avoid

SELECT * in Production

1-- BAD: Fetches unnecessary data 2SELECT * FROM users WHERE active = true; 3 4-- GOOD: Only fetch what you need 5SELECT id, name, email FROM users WHERE active = true;

Functions in WHERE Clauses

1-- BAD: Can't use indexes 2SELECT * FROM orders WHERE YEAR(order_date) = 2023; 3 4-- GOOD: Index-friendly 5SELECT * FROM orders WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01';

Nested Subqueries When JOINs Would Work

1-- BAD: Nested subquery 2SELECT name FROM users 3WHERE id IN ( 4 SELECT user_id FROM orders 5 WHERE total > 100 6); 7 8-- GOOD: Simple JOIN 9SELECT DISTINCT u.name 10FROM users u 11JOIN orders o ON u.id = o.user_id 12WHERE o.total > 100;

Conclusion

Mastering SQL requires shifting from an imperative mindset to a declarative one. By understanding that SQL describes what you want rather than how to get it, you can write more effective queries and better understand database performance.

Key takeaways:

  • Think in sets: SQL operates on collections of data
  • Understand relationships: Know how your tables connect
  • Optimize early: Filter data as soon as possible
  • Use the right tool: JOINs vs subqueries vs window functions
  • Measure performance: Use EXPLAIN to understand execution plans

The mental model of SQL as a declarative language for describing data relationships will serve you well as you tackle increasingly complex database challenges.

Next Steps

Practice writing queries with this mental model in mind. Start with simple SELECT statements, then gradually add complexity. Use EXPLAIN to see how the database interprets your queries, and always think about the data relationships first.

References

Buy Me a Coffee at ko-fi.com
GET IN TOUCH

Let's work together

I build exceptional and accessible digital experiences for the web

WRITE AN EMAIL

or reach out directly at hello@mohammadshehadeh.com