The order by clause is surely one of the best-known SQL clauses. Yet there are a number of common misunderstandings that are worth clearing up. For that, I’ll walk you through the evolution of the order by clause in ISO/IEC 9075-2:2023—that is, the SQL standard.
Contents:
- The 1980s
- Three Orders: Syntax, Logical and Actual Evaluation
- Positional References
- Non-Selected Columns
- Expressions
- Null Ordering
- In Subqueries
The 1980s
In the beginning,0 the order by clause had several limitations that were lifted later: (1) it could only refer to columns produced by the corresponding select clause; (2) it did not allow expressions; (3) it did not provide a direct syntax to control ordering of null values and (4) it was only allowed as the very last clause of a query.
For example, the following query was invalid until ISO/IEC 9075-2:1999, because the order by clause refers to a non-selected column:
SELECT a
FROM t
ORDER BY bYou might wonder why the standard made it invalid to sort on something that is not selected. But that is a totally wrong perspective. Let me explain.
Three Orders: Syntax, Logical and Actual Evaluation
There are countless AI-generated infographics flooding the social networks these days. Many of them address the difference between the syntactic order of SQL clauses and the logical order of evaluation. While we can easily see the syntactic order in the statement itself, such as select, from, order by, the logical evaluation order is more subtle. Even though the logical evaluation order often follows the syntactic order, there are exceptions such as the select clause.
For the query above, the logical order of evaluation is from, select, order by. That means that the from clause first creates an imaginary result, consisting of the full contents of the table “t”. This includes the column “b”, if there is such a column. This imaginary result is then passed to the select clause, which produces another imaginary result consisting of the column “a” only. That imaginary result is in turn passed to the order by clause, which cannot find the column “b” in the input received from the select clause. Thus, the statement was not valid SQL. Not because the standard explicitly says so, but as a rather unfortunate consequence of how imaginary results are passed between clauses.
So far I’ve mentioned two orders of SQL: the syntactic order (order of clauses in the statement) and the logical order of evaluation. Both are defined in the standard. However, there is a third order, which the currently circulating infographics fail to highlight: the actual order of execution. Even though it might look like the standard defines the steps an engine must undertake to get the result, it actually does not! The SQL standard says:
A conforming SQL-implementation is not required to perform the exact sequence of actions defined in the General Rules, provided its effect [...] is identical to the effect of that sequence.
— ISO/IEC 9075-1:2023 §6.3.3.3
In other words: Everything in the standard just means: “The result has to be the same as though you would follow these steps”.
This relates to another set of infographics that spread the mantra to “filter early” in order to improve performance. While this mantra is correct when applied to the actual order of execution, it is pointless to apply it to the syntactical or logical order. Take this statement as an example:
SELECT *
FROM t
WHERE a = 42The way the SQL standard defines the result of this statement is to get the full contents of the table “t” first. That imaginary result is then passed on to the where clause which drops all but the matching rows, producing a smaller but still imaginary result. This is then processed by the select clause which finally produces the ultimate result of the query. If the standard would require the actual order of execution to follow these steps, indexing would not be possible at all.
In fact, engines will be happy to use an index on column “a”, if available. That effectively merges the where clause into the from clause. This is also true if there is a join in the query. Whenever it is possible to apply filtering early (“push down”) without changing the result, engines aim to do so. The suggestion to use a subquery or CTE to apply the filter early (e.g. before a join) is wasting your time. It is done by the engine, automatically. For decades.1 Trust the execution plan, not infographics. The execution plan tells you what the engine actually does.
Positional References
After this short digression, let’s get back to the order by clause. At that time, up to ISO/IEC 9075:1992, there was another way to specify which selected columns should be used for sorting: a positional reference. The literal “1” in the following example means to use the first column for sorting.
SELECT a
FROM t
ORDER BY 1Note that this was removed from the standard in 1999,2 yet it is still widely supported—and often used.
- Also via bind parameter:
order by ?• Negative parameters (but not literals) reverse the direction - Also via bind parameter:
order by ?
Non-Selected Columns
This was the formal functionality of the order by clause in the previous century. But then there was the “big bang” of modern SQL: A new edition of the SQL standard was published in December 1999. And that changed pretty much everything. From the boolean type to custom types, from CTEs to grouping sets. Countless new features pushed SQL far beyond its relational origins. But SQL:1999 was also not shy of improving old and simple functionality such as the order by clause.
In particular, SQL:1999 allowed non-selected columns in order by. As we have seen above, the order by clause just does not have access to non-selected columns. So the question is: How did SQL:1999 make it possible?
If an order by clause refers to columns not found in the imaginary result produced by select, the standard just adds those columns to the select clause.3 The standard text effectively rewrites the query for you. Such rewrites are known as syntactic transformations. Although they are very common, most of them stay within a syntactic unit, such as a clause. This particular syntactic transformation does not mind rewriting another clause.4
If you like this page, you might also like …
… to subscribe my mailing lists, get free stickers, buy my book or get training.
Once added to the select clause, order by can use it like any selected column. To avoid returning those added columns to the final result, the order by clause itself drops them again.5 The separation of concerns is ruthlessly broken. The end justifies the means, I’d say. And keep in mind: the engines are still allowed to do whatever they want, as long as the result is the same as if they had followed these steps.
Note the order of precedence for this syntactic transformation: Only columns not found in the select clause are subject to the syntactic transformation. Consider this table and the query below:
CREATE TABLE t (
a INTEGER,
b INTEGER
)SELECT b AS a
FROM t
ORDER BY a -- refers to t.bThe column name “a” in order by a refers, unambiguously, to the column “a” produced by the select clause. Only if there is no such column, the columns produced by from/join clauses are considered. This ensures that the meaning of this query is the same for SQL-92 and SQL:1999.
Expressions
Another—equally annoying—limitation of the early order by clause was lifted in 1999: The order by keys were no longer limited to column references and integer values.
Since then, order by can contain arbitrary expressions (formulas)6 that contain at least one column name. Order by 1, as shown above, is explicitly not standard SQL anymore.7 27 years later, knowing that this syntax is still accepted everywhere, it might be worth reconsidering this for standardization.
However, a more relevant issues is that there are still engines that do not fully support expressions in the order by clause. They only allow expressions on columns originating from the from clause. Consider these examples. The first works in all tested systems, but the second fails in some of them.
SELECT a + b
FROM t
ORDER BY COALESCE(a + b, 0)SELECT a + b AS x
FROM t
ORDER BY COALESCE(x, 0)This limitation is mentioned in the documentation of some systems: PostgreSQL, SQL Server.
Null Ordering
The next notable extension of the order by clause came in 2003: The nulls first|last specification.
SELECT a
FROM t
ORDER BY b NULLS FIRSTWithout that, the position of null values relative to non-null values is implementation defined (ID133). The only provision the standard makes is that engines must treat all null values the same and either put them first or last. Whether they come first or last may depend on the asc/desc specification so that they are effectively sorted as smallest (≪) or greatest (≫) possible value. Indeed most systems treat null as being either very small or very big.
- Configurable:
set default_null_order = [NULLS_FIRST | NULLS_LAST | NULLS_FIRST_ON_ASC_LAST_ON_DESC | NULLS_LAST_ON_ASC_FIRST_ON_DESC]
In Subqueries
As mentioned at the top of this article, order by was originally not allowed in subqueries. This lies, once more, in the history of SQL: As “SQL is based on, but is not a strict implementation of, the relational model”,8 it inherits many ideas from the relational model. One of them is that tables essentially implement sets—actually, multisets. However, the property of (multi)sets to imply no order on its members, applies to tables. This also affects the imaginary tables passed between clauses and from subqueries to their outer query. Consequently, there is no use for order by in subqueries.
Until ISO/IEC 9075-2:2008 was published. That edition introduced the fetch first clause. In case you don’t know: fetch first is just a more powerful version of limit or top. These clauses typically, but not necessarily, follow an order by clause that establishes a row order before truncating the result as specified by the fetch first clause.
Sidenote: No “Ordered Tables”
The imaginary tables passed between clauses may have meta-data associated with them. The group by clause, for example, produces a “grouped table” so that having can see the table rows in a grouped fashion. Likewise, “windowed tables” are used for window functions. But there is no such thing as an “ordered table” in the SQL standard.
That raises a question: How does the fetch first clauses obtain the ordered table from the order by clause? The answer might be disappointing: Fetch first just explicitly refer to the order established by order by (if present). This cross-reference just spans a single page in the standard because all three clauses are covered in a single section of the standard.
It is probably just my personal preference, but I have a tendency to treat the standard text (and laws) as code, which should be organized into smaller units that only interact with each other by means of small and well-defined interfaces. Yes, I know, that’s naive. But why not aim for it anyway? From that perspective, the just mentioned cross reference is not a big deal. The way order by makes non-selected columns available is very different in this regard.
As the fetch first clause makes sense in subqueries, in particular in lateral joins, the need for order by in subqueries emerged. Therefore, the syntactic position of the order by clause was slightly moved from the <cursor specification>, which cannot be nested, to the <query expression>, which can be nested.
Since then, order by is allowed in subqueries so that fetch first can be meaningfully used there as well. Similarly, but technically different, order by also became valid in the legs of union, intersect and except, when put into parenthesis (see F855, “Nested ORDER BY in query expression” for details).
- Some variants
- Optimized away if meaningless
- Some variants • Only in combination with
fetch firstor the like - Some variants
Note that the standard allows order by in subqueries even if there is no fetch first (or the later introduced offset) clause. One tested system takes the (reasonable) freedom to not accept such queries. Another system just ignores meaningless order by clauses, which breaks the principle of least astonishment, but complies with the requirements of the standard. The remaining systems seem to apply the subquery’s order by clause to preserve that order in the outer query.9 This is not required by the standard, but not astonishing either.
To conclude this not-very-brief history of SQL’s order by clause, let me just mention a few examples of other places where SQL uses order by nowadays: over(order by), match_recognize(order by), within group (order by), json_arrayagg(order by).
Of course, these topics are also addressed in my SQL Reloaded training course.
Related
Articles
We need tool support for keyset pagination
The
offsetclause should not be used for pagination because it gives wrong results and poor performance. Wrong and slow, what else do you need?Some systems allow the
group byclause to look forward into theselectclause. This is not covered by the standard. Some systems implement this functionality in a way that breaks standard-conformance.
Standard Features
Mandatory Features
E051, Basic query specification
E121-01, “DECLARE CURSOR”
Optional Features

