Reduce Network Latency For insert


Generally, SQL databases perform best when working on the whole problem at once0. Instead of selecting row-by-row or—god forbid—joining tables in the application code, we should use a single query to fetch all the required data—and nothing more—at once. The very same is true for modifying data: to delete three rows we should use onedelete statement with the right where clause. But how about inserting three rows at once?

As a matter of fact there is a very old SQL feature making this possible: the values clause can list multiple rows:

INSERT INTO tbl (   c1 ,    c2 ,    c3 )
         VALUES ('r1c1', 'r1c2', 'r1c3')
              , ('r2c1', 'r2c2', 'r2c3')
              , ('r3c1', 'r3c2', 'r3c3')

The main drawback of this approach is that the number of rows affects the query string itself as well as the number and position of the bind parameters1. That's no big deal for databases—nowadays, almost all SQL is executed as “dynamic SQL” anyway. Nevertheless it might be a big deal in your code base. Using a batch API is therefore better in the general case2.

Besides the ease of use, the two approaches have also a semantic difference: most batch APIs just execute the statement multiple times in a latency optimized manner3. That means, each execution can fail independently of each other. If one out of three rows breaks a constraint, the other two will still get inserted.

Inserting multiple rows with a single statement is different: the SQL standard implies a all-or-nothing semantic for single statements4. If one out of three rows in the example above breaks a constraint, none will be inserted. Depending on your requirements, this may or may not be the desired behavior.

This semantic difference also affects statement level triggers: they'll fire multiple times when using a typical batch API, but only once when inserting multiple rows with a single statement.

To conclude this topic: inserting multiple rows with a single insert statement is useful if the all-or-nothing semantic is desired and the number of rows is fixed or at least within known boundaries. In almost all other cases, batch APIs offer a better approach to reduce network latency.

Conforming Alternatives

As mentioned above, using a batch API is the common way to cut latencies. One very nice thing about most batch APIs is that they also work for update and delete.

If you need the all-or-nothing semantic as well, you have to use explicit transactions or even savepoints.

Proprietary Alternatives

insert all (Oracle)

Oracle support a multitable insert syntax that also allows to insert multiple rows in one statement. (Documentation)

About the Author

Photo of Markus Winand

Markus Winand teaches efficient SQL—inhouse and online. He minimizes the development time using modern SQL and optimizes the runtime with smart indexing. His book entitled SQL Performance Explained has become standard reading.

Buy his Book on Amazon

Cover of “SQL Performance Explained”: Squirrel running on grass

The essence of SQL tuning in 200 pages

Buy on Amazon
(paperback only)

Paperback and PDF also available at Markus’ store.

Hire Markus

…to answer your current SQL questions.

The quick and easy way to benefit from his extensive knowledge and experience.
Learn more »

Footnotes

0

Exceptions prove the rule. Arguably, some databases get it more often wrong than right.

1

The example uses literal values for demonstration. Programs should always use bind parameters. All right, Bobby?

2

Such as Java's PreparedStatement.addBatch(). The performance gain depends on the implementation, however.

3

This is the typical way batch APIs are implemented. Some API might offer a batch mode, but create a single statement behind the scenes. Check your docs.

4

“Checking of Constraints“: §4.10.1 in SQL-92 and §4.23.2 in SQL:2016.

“modern SQL” by Markus Winand is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.
Legal | Contact | NO WARRANTY | Trademarks | Privacy | CC-BY-NC-ND 3.0 license