`listagg`

Rows to Delimited Strings

The listagg function transforms values from a group of rows into a list of values that are delimited by a configurable separator. Listagg is typically used to denormalize rows into a string of comma-separated values (CSV) or other comparable formats suitable for human reading.

Listagg does not apply any escaping: it is not generally possible to tell whether an occurrence of the separator in the result is an actual separator, or just part of a value. The safe use of listagg for electronic data interfaces is therefore limited to cases in which an unambiguous separator can be selected, e.g. when aggregating numbers, dates, or strings that are known to not contain the separator.

When implementing electronic data interfaces, arrays and document types (JSON, XML) are advantageous as they offer type safety, or at least proper escaping.

Syntax

Listagg is an ordered set function, which require the within group clause to specify an order. The minimal syntax is:

LISTAGG(<expression>, <separator>) WITHIN GROUP(ORDER BY …)

The <expression> must not contain window functions, aggregate functions or subqueries.0 The standard only allows character literals in <separator>—i.e. no expression and no bind parameter.1 Bind parameters are nevertheless well supported in practice.

Listagg removes null values before aggregation2 like most other aggregate functions. If no not null value remains, the result of listagg is null. If needed, coalesce can be used to replace null values before aggregation.

The `on overflow` clause

The return type of listagg is either varchar or clob with an implementation defined length limit.3 In practice, it is a varchar type.4

Listagg accepts the optional on overflow clause to define the behavior if the result exceeds the length limit of the return type:

LISTAGG(<expression>, <separator> ON OVERFLOW …)

The default is on overflow error. In this case, the standard requires an exception with SQLSTATE 22001 to be raised5—in practice, this requirement is not fulfilled.

The on overflow truncate clause prevents the overflow by only concatenating as many values as the result type can accommodate. Furthermore, the on overflow truncate clause allows one to specify how the result is terminated:

ON OVERFLOW TRUNCATE [<filler>] WITH[OUT] COUNT

The optional <filler> defaults to three periods (...) and will be added as last element if truncation happens.

If with count is specified and truncation happens, the number of omitted values is put in brackets and appended to the result.

The SQL standard does not require a warning to be issued on truncation.6 To know whether the result is complete or not, users can parse the result7 or compare the actual length of the result to the calculated length for a result containing all values.8

`distinct`

The listagg function accepts the optional set quantifiers all and distinct:

LISTAGG( [ALL|DISTINCT] <expression>, <separator> …) …

If neither is specified, all is default. If distinct is specified, duplicate values are removed before aggregation. Note that the elimination of duplicates is subject to the collation in effect.

Warning

The standard does not specify which of the duplicate elements is removed.

If the order by clause places one occurrence at the beginning and the other at the end, it is unspecified at which place the value appears in the result.9

Distinct can be implemented manually by removing duplicates before aggregation—e.g. in a subquery. This works for databases not supporting distinct in listagg, and also allows to keep a particular occurrence if duplicates exist.

The following example demonstrates this approach. The columns g and o represent the group by and order by keys respectively. The example uses min(o) to keep the first occurrence in case one value appears multiple times.

SELECT g
     , LISTAGG(val, ',') WITHIN GROUP (ORDER BY o) list
  FROM (SELECT g, min(o) o, val
          FROM dist_listagg
         GROUP BY g, val
       ) dt
 GROUP BY g

Combining `listagg` with `filter` and `over`

Listagg can be combined with the filter and over clauses:

LISTAGG(…) WITHIN GROUP(…) [FILTER(WHERE …)] [OVER(…)]

The effect of the filter clause is to remove rows before aggregation. Case can be used for the same effect.

The over clause must not contain an order by clause10 because the mandatory within group clause must contain an order by clause anyway. It is not possible to narrow the window frame: the set of aggregated rows is always the full partition.

Compatibility

Listagg was introduced with SQL:2016 as optional feature T625. Even though listagg is not yet widely supported, most databases offer similar functionality using a proprietary syntax.

If ordered by the aggregated values: listagg(distinct X,…) within group (order by X)
SQLSTATE 54006
SQLSTATE 72000

Standard-Conforming Alternatives

If you like this page, you might also like …

… to subscribe my mailing lists, get free stickers, buy my book or join a training.

Has non-standard function: json_group_array
No order by clause
Has non-standard function: json_agg
Has non-standard syntax: FOR JSON AUTO
Proprietary syntax available: FOR XML PATH
Without keyword recursive: with recursive name as…

Arrays

If the query does not strictly require the return of a delimited string, arrays can be used to return an array of values. An array can be constructed using the array_agg aggregate function or via a subquery.

ARRAY_AGG(<expression> ORDER BY …)

ARRAY(<query>)

In the second form, <query> can contain distinct and fetch first to remove duplicates and limit the array length.

Neither of the two approaches performs an implicit cast: the array elements have the same type as <expression>. That means that the retrieving application can fetch the values in a type-safe manner and apply formatting if required.

The type-safe nature of arrays allows them to also carry null values in an unambiguous way. Array_agg does therefore not remove null values like other aggregate functions do (including listagg).11

The filter clause can be used to remove null values before aggregation with array_agg.12 If the filter clause removes all rows, array_agg returns null—not an empty array.

The subquery syntax allows removing null values in the where clause of the <query> and returns an empty array if the subquery doesn’t return any rows.

If the order of elements is irrelevant, multisets and collect can also be used to pass a type-safe list to an application.

Document Types

Similar to array_agg, the SQL standard defines aggregate functions that return JSON or XML fragments: i.e. json_arrayagg and xmlagg. The main benefit compared to listagg is that they apply the respective escape rules.

JSON_ARRAYAGG(<expression> ORDER BY … [NULL ON NULL])

XMLAGG(XMLELEMENT(NAME <element-name>, <expression>) ORDER BY …)

Warning

Some articles show how to use SQL string manipulation functions to transform such documents into a delimited string. These examples often neglect the fact that the serialized document might contain escape sequences that need to be unescaped (e.g., < in XML or \" in JSON).

Using `with recursive`

Note

Although the listagg functionality can be implemented using with recursive, it is often the better choice to use arrays, documents or the proprietary alternatives to listagg as shown below.

The following special case can be implemented using only with recursive and intermediate SQL-92:

LISTAGG(DISTINCT <expr1>, <sep> …) WITHIN GROUP(ORDER BY <expr1>)

Note the distinct and that <expr1> has to be the exact same expression in both cases.

The following example uses g as group by key, val as <expr1> and ', ' as <sep>:

310-listagg/10-via-recursion.sql

WITH RECURSIVE
list_agg(g, val, list)
AS (
    SELECT g, min(val), CAST(null AS VARCHAR(255))
      FROM listagg_demo
     GROUP BY g
 UNION ALL
    SELECT prev.g
         , (SELECT min(val)
              FROM listagg_demo this
             WHERE this.g   = prev.g
               AND this.val > prev.val
           ) val
         , COALESCE(list || ', ', '') || val
      FROM list_agg prev
     WHERE prev.val IS NOT NULL
)
SELECT g, list
  FROM list_agg
 WHERE val IS NULL
 ORDER BY g

This particular implementation uses the “loose index scan” technique as explained on the PostgreSQL Wiki. The performance will remain at a rather low level even with an index on (g, val). The distinct behavior is a side effect of this technique.

The correct handling of null in val is an important special case: although null is generally ignored in aggregations, a group that consists of null values only must still be present in the result. This means that null must not be removed if there is no not null value in the group. The implementation above uses min(val) in the non-recursive expression to get this behavior.13

A more generic implementation that supports all semantics and arbitrary order by clauses is possible using with recursive and window functions. Aaron Bertrand’s post “Grouped Concatenation in SQL Server” presents an example of this approach.

In both cases, arbitrary on overflow behavior can be implemented.14

Proprietary Extensions

The only useful extension that is commonly available is the support of bind parameters and constant expressions in <separator>.

The standard neither allows omitting the <separator> nor omitting the within group clause. Yet some databases treat them as optional and apply implementation defined defaults or expose undefined behavior if within group is omitted.

Proprietary Alternatives

There are two widely available proprietary alternatives to listagg: group_concat and string_agg. Even though some databases use the same proprietary function name, they still use a different syntax.

The good news is that the proprietary functions have the same default semantic as listagg: they filter null values before aggregation but don’t remove duplicates (all semantics).

Accepts optional limit clause: string_agg(<e>, <s> ORDER BY … LIMIT <n>)

`string_agg` — PostgreSQL & Google BigQuery Syntax

PostgreSQL-style string_agg supports distinct and follows the array_agg syntax to specify an order:

STRING_AGG([ALL|DISTINCT] <expression>, <separator> [ORDER BY …])

Google BigQuery also supports this variant, extended by an optional limit clause after the order by.

PostgreSQL also offers a proprietary function to turn arrays into delimited strings: array_to_string.

`string_agg` — SQL Server Syntax (since release 2017/V14)

SQL Server’s string_agg doesn’t support distinct and uses the within group clause to specify an order:

STRING_AGG(<expression>, <separator>) [WITHIN GROUP (ORDER BY …)]

Aaron Bertrand’s post “Grouped Concatenation in SQL Server” demonstrates many other ways to implement listagg in older SQL Server releases—including the FOR XML PATH approach.

`group_concat` — MySQL and MariaDB Syntax

Group_concat supports distinct and expects the order by clause before the optional separator (default: comma):

GROUP_CONCAT([DISTINCT] <expression>
             ORDER BY …
             [SEPARATOR <separator>]
             [LIMIT <n>] -- MariaDB 10.3+
            )

Group_concat truncates the result if it exceeds a configurable maximum length and doesn’t honor element boundaries when doing so. In other words, it might truncate in the middle of an element.

MariaDB supports the same syntax and offers the optional limit clause in group_concat since version 10.3.

`group_concat` — SQLite Syntax

SQLite’s group_concat supports distinct but not order by:

GROUP_CONCAT([DISTINCT] <expression>, <separator>)

User-Defined Aggregates

Some databases support user-defined aggregate functions: MySQL (only in C/C++), Oracle Database (also WM_CONCAT), PostgreSQL, SQL Server (using CLR).

You can’t catch up on 20 years of SQL evolution in one day. Subscribe the newsletter via E-Mail, Bluesky or RSS to gradually catch up and to keep modern-⁠sql.com on your radar.

About the Author

Markus Winand provides insights into SQL and shows how different systems support it at modern-sql.com. Previously he made use-the-index-luke.com, which is still actively maintained. Markus can be hired as trainer, speaker and consultant via winand.at.

Footnotes

ISO/IEC 9075-2:2023 §10.9 SR 11a.
ISO/IEC 9075-2:2023 §10.9 <listagg separator> ::= <character string literal>, no <dynamic parameter specification> allowed.
ISO/IEC 9075-2:2023 §10.9 GR 11b.
ISO/IEC 9075-2:2023 §10.9 SR 11d.
DB2 LUW uses varchar, Oracle varchar2.
ISO/IEC 9075-2:2023 §10.9 GR 11gii.
Omission in ISO/IEC 9075-2:2023 §10.9 GR 11giii. Warning — string data, right truncation (SQLSTATE 01004) would be an appropriate condition.
The use of an unambiguous <filler> is highly recommended for this purpose.
SUM(CHAR_LENGTH(<expression>)) + (COUNT(<expression>)-1) * CHAR_LENGTH(<separator>).
Omission in ISO/IEC 9075-2:2023 §10.9 GR 11ci
ISO/IEC 9075-2:2023 §9.23 SR 8di
ISO/IEC 9075-2:2023 §10.9 GR 12gii (NOTE 528)
This use case of filter cannot be emulated using case
Alternatively, order by val nulls last with fetch first 1 row only can be used as well. However, This is supported by fewer databases. If val is know to be not null, the nulls last clause can be skipped.
The implementation defined maximum length for varchar must be considered manually.

`listagg`

Syntax

The `on overflow` clause

`distinct`

Warning

Combining `listagg` with `filter` and `over`

Compatibility

Standard-Conforming Alternatives

If you like this page, you might also like …

Arrays

Document Types

Warning

Using `with recursive`

Note

Proprietary Extensions

Proprietary Alternatives

`string_agg` — PostgreSQL & Google BigQuery Syntax

`string_agg` — SQL Server Syntax (since release 2017/V14)

`group_concat` — MySQL and MariaDB Syntax

`group_concat` — SQLite Syntax

User-Defined Aggregates

About the Author

Footnotes

Can I use…

News

The Standard

Concepts

Features

Use Cases

Connect with Markus Winand

listagg

Syntax

The on overflow clause

distinct

Warning

Combining listagg with filter and over

Compatibility

Standard-Conforming Alternatives

If you like this page, you might also like …

Arrays

Document Types

Warning

Using with recursive

Note

Proprietary Extensions

Proprietary Alternatives

string_agg — PostgreSQL & Google BigQuery Syntax

string_agg — SQL Server Syntax (since release 2017/V14)

group_concat — MySQL and MariaDB Syntax

group_concat — SQLite Syntax

User-Defined Aggregates

About the Author

Buy the Book

Hire Markus

Footnotes

Can I use…

News

The Standard

Concepts

Features

Use Cases

Connect with Markus Winand

`listagg`

The `on overflow` clause

`distinct`

Combining `listagg` with `filter` and `over`

Using `with recursive`

`string_agg` — PostgreSQL & Google BigQuery Syntax

`string_agg` — SQL Server Syntax (since release 2017/V14)

`group_concat` — MySQL and MariaDB Syntax

`group_concat` — SQLite Syntax