PosgreSQL 11 was released four months ago and my review is long overdue. Here we go!
With respect to standard SQL, the main theme in PostgreSQL 11 is window functions (over
). For almost eight years, from 2009 until 2017, PostgreSQL was the only major free open-source product to support SQL window functions. Just a year later, by September 2018, all open-source competitors have caught up…and some even overtook PostgreSQL. The PostgreSQL community was prepared. PostgreSQL 11 was just released in 2018, and it has restored and even expanded its leadership position.0
This article explains this race and covers other improvements in PostgreSQL 11.
Contents:
- Complete SQL:2011
Over
Clause Create Procedure
andCall
- Parameters in
Fetch First|Next
- Relative XPath Expressions
- Other News
- Extra:
Create Index … Include
Complete SQL:2011 Over
Clause
The over
clause defines which rows are visible to a window function. Window functions were originally standardized with SQL:2003, and PostgreSQL has supported them since PostgreSQL 8.4 (2009). In some areas, the PostgreSQL implementation was less complete than the other implementations (range
frames, ignore nulls
), but in other areas it was the first major system to support them (the window
clause). In general, PostgreSQL was pretty close to the commercial competitors, and it was the only major free database to support window functions at all—until recently.
In 2017, MariaDB introduced window functions. MySQL and SQLite followed in 2018. At that time, the MySQL implementation of the over
clause was even more complete than that of PostgreSQL, a gap that PostgreSQL 11 closed. Furthermore, PostgreSQL is again the first to support some aspects of the over
clause, namely the frame unit groups
and frame exclusion. These are not yet supported by any other major SQL database—neither open-source, nor commercial.
The only over
clause feature not supported by PostgreSQL 11 are pattern
and related clauses. These clauses were just standardized with SQL:2016 and do a framing based on a regular expression. No major database supports this this framing yet.1
Frame Units
Before looking into the new functionality in PostgreSQL 11, I’ll show you a typical use case of window functions. We can then proceed to the so-called framing.
The example calculates the running total over the column amnt
, so the sum over all rows before and up to the current row according to the specified order by
clause:
SELECT SUM(amnt)
OVER(ORDER BY id
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) running_total
FROM …
The aggregate function sum
is used with the over
clause rather than with a group by
clause—that makes it a window function. The interesting part in this example is the framing, which is in bold.
Window framing narrows an ordered window to the rows between a specified start and end row.
The general syntax is:
<unit> BETWEEN <window frame bound>
AND <window frame bound>
[<frame exclusion>]
Let’s start with the window frame bounds, i.e. the definition of the frame’s beginning and end.
The window frame bounds can be specified in terms relative to the current row or they can be “unbounded”. Unbounded refers to the start or end of the current result or partition.
CURRENT ROW
| <distance> (PRECEDING|FOLLOWING)
| UNBOUNDED (PRECEDING|FOLLOWING)
The following frame definition uses frame bounds relative to the current row.
<unit> BETWEEN 1 PRECEDING AND CURRENT ROW
To truly understand the meaning of relative bounds, we must also understand how the three frame units—rows
, range
, and groups
—change the meaning of these bounds.
The rows
unit does just what you might expect: it interprets current row
as referring to the current row and <distance>
in preceding
and following
as a number of rows. With the rows
unit, the previous example defines a frame that includes up to two rows: one before the current row and the current row itself. If there is no row before the current row, e.g. because the current row is the first row, the frame just covers the current row itself.
The next frame unit, range
, does not count rows at all. Instead it uses the value of the sort key (order by
expression) and adds or subtracts the specified <distance>
. All rows for which the value of the sort key falls into the specified range are taken into the frame.
Note that current row
as range
bound refers to all rows with the same value as the current row. That can be many rows. Think of current row
as though it was 0 preceding
or 0 following
.2 In case of range
, “current peers” or “current value” might have been a better choice than current row
.
The following figure uses the unit range
instead of rows
. As the value of the current row is two, the frame covers all rows with the values one to two (inclusive). The frame begins at the first row, because its value is one and thus falls into the value range. The end of the frame is even beyond the current row as the next row still falls into the value range.
This is an example that works in MySQL 8.0, but not in PostgreSQL prior to version 11. Although range
frames were supported by PostgreSQL before, you could not use a numeric distance as shown above. Only unbounded
and current row
could be used before PostgreSQL 11. That is still the case in SQL Server and SQLite, by the way. PostgreSQL 11 supports all frame units with all boundary types.
On my Own Behalf
I make my living from SQL training, SQL tuning and consulting and my book “SQL Performance Explained”. Learn more at https://winand.at/.
Even the last frame unit, groups
, is fully supported by PostgreSQL 11. Groups
assigns each row of the result or partition into a group just like the group by
clause does. The <distance>
then refers to the number of groups to cover before and after the current row, i.e. the number of distinct sort key values.
The following figure shows how the groups
frame covers one distinct value before the current value (1 preceding
) and the current value itself (current row
). The numeric difference between the values does not matter, nor does the number of rows. Groups
is solely about the number of distinct values.
PostgreSQL 11 is the first major SQL database to support groups
frames.
Frame Exclusion
Another feature that is not yet implemented by any other major SQL product is frame exclusion. It removes rows from the frame that are related to the current row.
The default is exclude no others
, which does not remove any rows.
The next option is to remove the current row itself from the frame.
EXCLUDE CURRENT ROW
Note that the meaning of the exclude
clause is not affected by the frame unit. Current row
just removes the current row—even if the range
or groups
unit is used and the current row has peers. This is different from the behavior of current row
in a frame bound.
To remove the current row along with all its peers from the frame, use exclude group
.
EXCLUDE GROUP
Again, this is independent of the frame unit and thus also removes peers when using the rows
unit.
Finally, it is also possible to remove the peers of the current row, but not the current row itself:
EXCLUDE TIES
Compatibility
- No chaining of window definitions
- No
<distance>
(onlyunbounded
andcurrent row
) - No
<distance>
(onlyunbounded
andcurrent row
) and no date and time types
Create Procedure
and Call
PostgreSQL has supported user-defined functions for ages. Create procedure
, on the other hand, was not supported before PostgreSQL 11. Previously, returns void
functions were often used to mimic procedures.
The new procedures introduced to PostgreSQL 11 differ from functions in several ways:
They cannot return anything
They are invoked by the
call
statement, rather than in an SQL expressionThey can contain transaction control statements (in particular
commit
androllback
)3
There is not much more I have to say about procedures, except that it is still a work in progress topic. This is especially true for drivers like JDBC.
- Not strictly following standard syntax
- Proprietary syntax:
<varname> in|out|in out <type name>
- Proprietary syntax:
<varname> <type name>
(no parameter mode specification) - Proprietary syntax:
<varname> <type name> output
(output hasinout
semantics) - Not supported by the latest JDBC driver, which I use for testing
- Also supports
:=
instead of keyworddefault
- Use
=
instead of keyworddefault
- I’m possibly encountering a JDBC driver issue here
- Using
exec … <param>=<value>
- Syntax accepted, semantics not tested
Despite the size of this support matrix, there are several aspects I have not tested:
Security (T323, T324)
Semantics of
drop restrict|cascade
(F032)Cyclic dependencies (T655)
Collection type parameters (S201, S202)
Dynamic SQL in routines (T652)
Schema statements in routines (T651)
Overloading (T341)
Proprietary extensions such as
alter procedure
,drop routine
, transaction control in routines, etc.
Parameters in Fetch First|Next
Fetch first N rows only
is the standard SQL syntax for the well known but proprietary limit
clause. It was introduced with SQL:2008 and then promptly supported by PostgreSQL in 2009 (version 8.4). However, there was one very small gotcha: the use of a parameter instead of a literal value required the parameter to be enclosed in a pair of parentheses.
FETCH FIRST ($1) ROWS ONLY
Although it is not a big deal if you know about it, it can drive you crazy when you get the error message “syntax error at or near "$1"
” when you omit the parentheses.
PostgreSQL 11 accepts parameters (and expressions) without parentheses.
- Requires
offset
(e.g.offset 0 rows
) • Requiresorder by
- Use nested query:
CREATE VIEW … AS SELECT … FROM (SELECT … FROM … FETCH FIRST …) t
- Does not allow parentheses:
(?)
- Use proprietary
select top … percent
- Only for
rows
, not forpercent
- Use proprietary
select top … with ties
- Not for
0
(zero)
Relative XPath Expressions
Another small annoyance—one that was easy to live with when you knew about it—was that PostgreSQL has interpreted relative XPath expressions in XML functions as being relative to the root node of the document. Makes sense, right? Not really, it should be the document node.
Consider the following example, which uses xmltable
(introduced with PostgreSQL 10) to transform an XML document into columns and rows.
SELECT c
FROM (VALUES ('<root>
<c>c1</c>
<c>c2</c>
</root>'::xml
)
) t(x)
, XMLTABLE ('root/c' -- XPath expression
PASSING x
COLUMNS c TEXT PATH '.'
)
If the XPath expression 'root/c'
is interpreted relative to the document node, as mandated by the standard, it will match both <c>
elements in <root>
.
Until version 10, PostgreSQL evaluated those expressions relative to the root note <root>
, meaning that this expression doesn’t match anything. In older releases you would either have to use the relative XPath expression 'c'
or, preferably, the absolute XPath expression '/root/c'
to get the same result.
Other News
The above-mentioned features that relate to the SQL standard are just a small part of the changes in PostgreSQL 11. Please have a look at the release notes for all the changes.
For your convenience, I’ll give you a little teaser:
- Partitioning
Partitioning is no longer sadly incomplete. New in PostgreSQL 11:
Cross partition primary key and unique constraints
Foreign keys are supported in one way (partitioned table can refer to non-partitioned table)
Update
statements can move rows to another partitionDefault partitions
Hash partitioning
- Parallel Processing
Improvements of existing parallel execution (
Hash Join
,Seq Scan
).A few more commands that can be executed in parallel: creation of b-tree indexes,
create table … as select
,create materialized view
.- Just In Time (JIT)
Expressions in queries can be compiled into native code rather than being interpreted from the abstract syntax tree. Benchmarks have shown almost 30% run time improvement on queries that are expression heavy.
- Fast
Add Column
Adding a new column to an existing table is a fairly common task. If the new column has a default value of
null
, PostgreSQL was already able to add this column by changing only the table’s metadata. PostgreSQL 11 extends this ability to columns with a constant default value.Quit
andExit
inPsql
PostgreSQL has learned its lessons from "vi". Quoting from the PostgreSQL 11 announcement: “The inclusion of the keywords "quit" and "exit" in the PostgreSQL command-line interface to help make it easier to leave the command-line tool.”
Extra: Create Index … Include
There is another PostgreSQL 11 feature that actually deserves its own article: create index … include
. This article will is available on Use The Index, Luke!: A Close Look at the Index Include
Clause.
If you’d like to learn more about modern SQL, have a look at my training in Vienna. In addition to window functions (mentioned above), it covers recursion and indexing, and greatly improves your understanding of basic SQL concepts. The training is based on the current draft of my next book. Check it out now!