rss-bridge 2022-05-11T08:55:41+00:00

Setting the JDBC Statement.setFetchSize() to 1 for Single Row Queries

An interesting hint by Vladimir Sitnikov has made me think about a new benchmark for jOOQ: The benchmark should check whether single row queries should have a JDBC Statement.setFetchSize(1) call made to them by default. The Javadoc of the method says: Gives the JDBC driver a hint as to the number of rows that should … Continue reading Setting the JDBC Statement.setFetchSize() to 1 for Single Row Queries →

Setting the JDBC Statement.setFetchSize() to 1 for Single Row Queries

Posted on May 11, 2022August 30, 2022 by lukaseder

An interesting hint by Vladimir Sitnikov has made me think about a new benchmark for jOOQ:

Interesting thought. Do you have a benchmark ready for this, perhaps?

— Lukas Eder (@lukaseder) June 23, 2021

The benchmark should check whether single row queries should have a JDBC Statement.setFetchSize(1)) call made to them by default. The Javadoc of the method says:

Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for ResultSet objects generated by this Statement. If the value specified is zero, then the hint is ignored. The default value is zero.

If an ORM (e.g. jOOQ) knows that it will fetch only 1 row, or if it knows that there can be only 1 row, then that hint certainly makes sense. Examples in jOOQ include:

When users call ResultQuery.fetchSingle()), or fetchOne()), or fetchOptional()), or any similar method, then it is reasonable to expect only 0 – 1 rows to be returned. In the case of those methods returning more than 1 row, an exception is thrown, so even if there are more rows, 2 rows will be fetched at most.
When users add a LIMIT 1 clause on a top level query, there can never be more than 1 row.
When the query is trivial (no joins, or only to-one joins, no GROUP BY GROUPING SETS, no UNION, etc.) and an equality predicate on a UNIQUE constraint is present, there can also be no more than 1 row.

The database optimiser knows all of these things as well. If you add LIMIT 1 to a query, then the optimiser can be reasonably expected to take that as a strong hint about the result set size. But the JDBC driver doesn’t know these things (or at least, it shouldn’t be expected to), because it’s unlikely that it parses the SQL and calculates statistics on it, or considers meta data for such optimisations.

So, the user could hint. And because that would be very tedious for users, even better, the ORM (e.g. jOOQ) should hint. Or so it seems.

Benchmarks

But should it? Is it really worth the trouble? Here’s Vladimir’s assessment about the pgjdbc driver, where he wouldn’t expect an improvement now, but perhaps in the future.

Just in case, the cells in pgjdbc are created "as they are received", so setFetchSize() does not cause instantiation of all the 50'000 byte[] arrays

— Vladimir Sitnikov (@VladimirSitnikv) June 23, 2021

Better than making assumptions, let’s measure, using a JMH benchmark. JMH is normally used for microbenchmarking things on the JVM, to test assumptions about JIT runtime behaviour. This is obviously not a microbenchmark, but I still like JMH’s approach and output, which includes standard deviations and errors, as well as ignores warmup penalties, etc.

First off, the results:

Because benchmark results can’t be published for some commercial RDBMS (at least not when comparing between RDBMS), I have normalised the results so a comparison of actual execution speed between RDBMS is not possible. I.e. for each RDBMS, the faster execution is 1, and the slower one is some fraction of 1. That way, the RDBMS is only benchmarked against itself, which is fair.

The results are below. We’re measuring throughput, so lower is worse.

Db2
---
Benchmark                            Mode   Score
JDBCFetchSizeBenchmark.fetchSize1   thrpt   0.677
JDBCFetchSizeBenchmark.noFetchSize  thrpt   1.000

MySQL
-----
Benchmark                            Mode   Score
JDBCFetchSizeBenchmark.fetchSize1   thrpt   0.985
JDBCFetchSizeBenchmark.noFetchSize  thrpt   1.000

Oracle
------
Benchmark                            Mode   Score
JDBCFetchSizeBenchmark.fetchSize1   thrpt   0.485
JDBCFetchSizeBenchmark.noFetchSize  thrpt   1.000

PostgreSQL
----------
Benchmark                            Mode   Score
JDBCFetchSizeBenchmark.fetchSize1   thrpt   1.000
JDBCFetchSizeBenchmark.noFetchSize  thrpt   0.998

SQL Server
----------
Benchmark                            Mode   Score
JDBCFetchSizeBenchmark.fetchSize1   thrpt   0.972
JDBCFetchSizeBenchmark.noFetchSize  thrpt   1.000

For each RDBMS, I have run a trivial query producing a single row with 1 column. Each time, I have re-created a JDBC Statement, and fetched the ResultSet. In fetchSize1, I have specified the fetch size hint. In noFetchSize, I left the default untouched. As can be summarised:

In these RDBMS, there was no effect

MySQL
PostgreSQL
SQL Server

In these RDBMS, things got significantly worse (not better!):

Db2
Oracle

This is quite surprising, as the benchmark includes running the entire statement on the server, so I would have expected, at best, a negligible result.

For this benchmark, I was using these server and JDBC driver versions:

Db2 11.5.6.0 with jcc-11.5.6.0
MySQL 8.0.29 with mysql-connector-java-8.0.28
Oracle 21c with ojdbc11-21.5.0.0
PostgreSQL 14.1 with postgresql-42.3.3
SQL Server 2019 with mssql-jdbc-10.2.0

The benchmark logic is here:


package org.jooq.test.benchmarks.local;

import java.sql.*;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@Fork(value = 1)
@Warmup(iterations = 3, time = 3)
@Measurement(iterations = 7, time = 3)
public class JDBCFetchSizeBenchmark {

@State(Scope.Benchmark)
public static class BenchmarkState {

Connection connection;

@Setup(Level.Trial)
public void setup() throws Exception {
Class.forName("org.postgresql.Driver");
connection = DriverManager.getConnection(
"jdbc:postgresql://localhost:5432/postgres",
"postgres",
"test"

@TearDown(Level.Trial)
public void teardown() throws Exception {
connection.close();

@FunctionalInterface
interface ThrowingConsumer<T> {
void accept(T t) throws SQLException;

private void run(
Blackhole blackhole,
BenchmarkState state,
ThrowingConsumer<Statement> c
) throws SQLException {
try (Statement s = state.connection.createStatement()) {
c.accept(s);

try (ResultSet rs = s.executeQuery(
"select title from t_book where id = 1")
) {
while (rs.next())
blackhole.consume(rs.getString(1));

@Benchmark
public void fetchSize1(Blackhole blackhole, BenchmarkState state)
throws SQLException {
run(blackhole, state, s -> s.setFetchSize(1));

@Benchmark
public void noFetchSize(Blackhole blackhole, BenchmarkState state)
throws SQLException {
run(blackhole, state, s -> {});

A few remarks:

The query is by no means representative of a production workload. But if things did get improved by the fetchSize flag, the improvement should have manifested
The benchmark didn’t use prepared statements, which could have removed some side-effects, or added some side-effects. Feel free to repeat the benchmark using prepared statements.
It is not yet understood why things didn’t matter in some drivers, or why they did in others. For the conclusion, the “why” is not too important, because nothing will be changed as a result of this blog post. If you know why (the db2 driver and ojdbc code isn’t open source, regrettably), I’d be curious.

Conclusion

[...]

Original source