SQL Formatting Best Practices
SQL runs the same whether it is one long line or thirty tidy ones, but humans review, debug, and diff it — and badly formatted SQL hides bugs. A handful of conventions (uppercase keywords, one column per line, readable JOINs, CTEs over nested subqueries) make queries scannable and produce clean version-control diffs. This guide lays out the conventions that most style guides agree on. To reformat an existing query in seconds, paste it into the SQL formatter.
Why Format SQL at All
The database does not care. Whitespace, line breaks, and keyword casing are invisible to the parser — a one-line query and a thirty-line one compile to the same execution plan. Formatting is entirely for the people who read the query: the reviewer approving your pull request, the on-call engineer debugging it at 2am, and you, six months from now. A well-formatted query reveals its structure; a wall-of-text query hides a missing JOIN condition or a misplaced AND in plain sight.
-- Hard to review: where does the WHERE end and the next clause begin?
select u.id, u.name, count(o.id) from users u left join orders o on o.user_id = u.id where u.active = true and u.created_at > '2026-01-01' group by u.id, u.name having count(o.id) > 5 order by 3 desc;
The same query, formatted, makes each clause and each column a separate, scannable line — the version below is what the rest of this guide builds toward.
Uppercase Keywords, Lowercase Identifiers
The most common convention puts reserved words in uppercase and your own table and column names in lowercase. The capitals become visual landmarks that separate the clauses:
SELECT
u.id,
u.name
FROM users AS u
WHERE u.active = TRUE
ORDER BY u.created_at DESC;
A growing number of teams — particularly in the dbt/analytics world — prefer all-lowercase keywords for less visual noise. Both work. The only wrong answer is inconsistency: a codebase where some queries shout SELECT and others whisper select is harder to scan than either pure style. Pick one and let a formatter enforce it.
One Column Per Line
For anything beyond two or three columns, give each its own line. This makes the select list scannable, makes individual columns easy to comment, and — critically — produces clean diffs when you add or remove one:
SELECT
o.id,
o.status,
o.total_cents,
o.created_at
FROM orders AS o;
The cost is vertical space; the benefit is that a code review shows exactly which columns changed. For a tiny query (SELECT id, name FROM users) a single line is fine — readability, not dogma, is the goal.
Leading vs Trailing Commas
Where the comma goes in a multi-line select list is a genuine fork, and it is worth understanding the trade-off. Leading commas put the comma at the start of each line:
SELECT
o.id
, o.status
, o.total_cents
FROM orders AS o;
The payoff is version control. Adding a column appends a single new line; removing one deletes a single line. You never produce the classic error of a dangling comma before FROM, because the comma lives with the column it precedes. Trailing commas read more naturally to most people:
SELECT
o.id,
o.status,
o.total_cents
FROM orders AS o;
but adding a column now edits two lines — the new column and the comma on the previous line — making diffs slightly noisier. Many modern style guides favour leading commas for the cleaner diffs; others prioritise readability and choose trailing. Either is fine if applied consistently.
Format JOINs for Clarity
JOINs are where unformatted SQL does the most damage, because a missing or wrong join condition is a real bug, not a style issue. Put each JOIN on its own line, always state the join type explicitly, and keep the ON condition close:
SELECT
u.name,
o.total_cents
FROM users AS u
INNER JOIN orders AS o
ON o.user_id = u.id
LEFT JOIN refunds AS r
ON r.order_id = o.id
AND r.status = 'approved'
WHERE u.active = TRUE;
- Always write the join type —
INNER JOIN,LEFT JOIN— never a bareJOIN, so the reader never has to guess. - Indent the
ONcondition under its JOIN so the relationship is obvious. - Line up multiple conditions with the
ANDaligned, as above, so a missing predicate stands out.
Prefer CTEs Over Nested Subqueries
A query nested three subqueries deep has to be read inside-out, through layers of parentheses. A Common Table Expression (the WITH clause) lets you name each step and read the query top-to-bottom:
WITH active_users AS (
SELECT id, name
FROM users
WHERE active = TRUE
),
order_counts AS (
SELECT
user_id,
COUNT(*) AS order_count
FROM orders
GROUP BY user_id
)
SELECT
au.name,
oc.order_count
FROM active_users AS au
INNER JOIN order_counts AS oc
ON oc.user_id = au.id
WHERE oc.order_count > 5;
Each CTE is a named, independently understandable step. On modern databases the optimizer typically inlines CTEs, so there is usually no performance cost — though it is worth checking your specific engine's materialization behaviour for very large or repeatedly-referenced CTEs.
Alias Tables Deliberately
Short, meaningful aliases keep qualified column references readable. Always qualify columns in any query with more than one table, so a reader never has to guess which table a column comes from:
-- Good: every column is qualified, aliases are mnemonic
SELECT
c.name,
c.email,
a.city
FROM customers AS c
INNER JOIN addresses AS a
ON a.customer_id = c.id;
- Use the explicit
ASkeyword for aliases — it is clearer than a bare alias and most formatters add it. - Prefer a short mnemonic (
cfor customers,afor addresses) over single letters that mean nothing (t1,t2). - Qualify every column in a multi-table query, even unambiguous ones — it future-proofs against a later JOIN introducing a name clash.
Name Things in snake_case
SQL identifiers are conventionally snake_case — lowercase words joined by underscores. There is a concrete reason beyond habit: unquoted SQL identifiers are case-insensitive, so a camelCase column name is not reliably preserved by every engine, while snake_case survives intact and never needs double-quoting.
-- Conventional
SELECT user_id, created_at, total_cents FROM order_line_items;
-- Avoid: needs quoting on many engines, case not preserved
SELECT "userId", "createdAt" FROM "orderLineItems";
Common companion conventions: plural table names (users, orders), a consistent primary key (id), and foreign keys named for the table they reference (user_id). For the full picture on why snake_case wins here and how it compares to other styles, see the camelCase vs snake_case vs kebab-case guide.
Formatting Does Not Change Behaviour
It is worth stating plainly: reformatting SQL is safe. The parser ignores whitespace, newlines, indentation, and keyword casing, so a formatted query and a minified one produce an identical execution plan and identical results. The only characters that affect output are inside string literals and quoted identifiers — and a correct formatter never touches the contents of quotes. That means you can reformat an unfamiliar query to understand it without any risk of changing what it does.
Format Any Query Instantly
Paste a messy or minified query into the SQL formatter to apply consistent keyword casing, indentation, and one-column-per-line layout in a click — useful both for cleaning up your own SQL and for making someone else's dense query readable enough to debug. For the same readability mindset applied to data formats, see the JSON formatting best practices guide.
Frequently Asked Questions
Should SQL keywords be uppercase or lowercase?
The most widespread convention is uppercase for reserved keywords (SELECT, FROM, WHERE, JOIN, GROUP BY) and lowercase for identifiers like table and column names. The contrast makes the structure of the query scannable at a glance, the capitals act as visual landmarks separating clauses. A growing minority of teams, influenced by tools like dbt, prefer all-lowercase keywords for less visual noise and easier typing. Either is defensible; what matters is that the whole codebase picks one and a formatter enforces it, because mixed casing is the worst of both worlds.
Should I use leading or trailing commas in SQL SELECT lists?
Both are common, but leading commas have a practical edge for version control. With leading commas each new column line begins with a comma, so adding or removing a column changes exactly one line and produces a clean one-line diff, and you never get a syntax error from a trailing comma before FROM. Trailing commas read more naturally to most people but mean that adding a column edits two lines (the new one and the comma on the previous line). Pick one per codebase. Many modern style guides recommend leading commas specifically for the cleaner diffs, while others prioritise readability and choose trailing.
Why use CTEs instead of nested subqueries?
Common Table Expressions (the WITH clause) let you name and define each step of a query at the top, then reference those names in the main query, instead of nesting subqueries several levels deep. The benefit is readability, a query reads top-to-bottom as a sequence of named steps rather than inside-out through nested parentheses. CTEs also let you reuse the same intermediate result in multiple places and make each step independently testable. On most modern databases CTEs are inlined by the optimizer so there is no performance penalty, though you should check your specific engine for materialization behaviour on very large queries.
What naming convention should I use for SQL tables and columns?
snake_case (lowercase words joined by underscores) is the near-universal convention for SQL identifiers, for example user_id, created_at, order_line_items. It avoids two problems, SQL is case-insensitive for unquoted identifiers so camelCase casing is not reliably preserved, and snake_case never needs double-quoting. Beyond casing, common conventions are plural table names (users, orders), a consistent primary key name (id), and foreign keys named after the referenced table (user_id). Consistency across the schema matters more than the specific choices.
Does formatting SQL change how it runs or performs?
No. Whitespace, line breaks, indentation, and keyword casing are ignored by the SQL parser, so a formatted query and a minified one produce an identical execution plan and identical results. Formatting is purely for the humans who read, review, and maintain the query. The one thing that does affect results, casing of string literals and quoted identifiers, is content inside quotes, which a good formatter never touches. So you can reformat freely without fear of changing behaviour.