Why you should not use Limit/Offset in SQL
When using SQL as a database, developpers often need to select only a subset of data from a query not starting at the first element returned by the query. For that, there is the LIMIT … OFFSET feature that can be used. Most of the time it’s used for pagination or handling data by batch.
You should not use that design pattern as this is not scalable if you have quite a lot of data. Let us show you why with an example and explain what’s wrong with this approach.
In the following, we’ll use MySQL on docker for simplicity and reproductability but you can do the same with a database installed locally including MariaDB or Percona, it actually doesn’t matter for the case we’re interested in.
Ensure you have docker running and then launch a MySQL docker container :
docker run --name mysql -e MYSQL_ROOT_PASSWORD=root -d mysql
Here we use root as root password. Never do that on production or on network accessible services. Here we don’t really care for this testing.
You can now launch a MySQL shell to connect to your empty DB with the following :
$ docker exec -it mysql mysql -u root -proot
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 8.0.26 MySQL Community Server - GPL
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
Create a database :
mysql> CREATE DATABASE limit_test;
Query OK, 1 row affected (0.01 sec)
mysql> use limit_test;
Database changed
Then a a big_table that will contain a lot of rows.
mysql> CREATE TABLE big_table
(
id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
not_an_index INT
);
Query OK, 0 rows affected (0.02 sec)
First element is an integer with auto increment on, so we’ll do not have to care about it later. This is an indexed field as it is the PRIMARY KEY.
The second element is also an integer but it is not indexed. We will generate random values for it as an integer between 1 and 1000000.
Create a big_file containing 500k lines to execute :
$ time for i in $(seq 1 500000); do echo "INSERT INTO big_table (not_an_index) SELECT round(rand()*1000000);" >> big_file; done
real 1m42.924s
user 0m29.957s
sys 0m56.612s
This can be optimized by generating INSERT by batch commands and precomputing the random data instead of relying on the SQL rand() function but let’s keep it simple to understand.
Load the DB with that file, it’s a bit long as we’re doing 500k inserts :
docker exec -i mysql mysql -u root -proot limit_test < big_file
Now that we have some data to play with, let’s see how the LIMIT … OFFSET behaves. The classic usage of LIMIT X OFFSET Y is to iterate over a big subset of data to handle them by batch.
A small piece of code, in pseudo code could be :
LINES_TO_SELECT=1
COUNT="select count(0) from big_table ;"
FOR i=0; $i < $COUNT; $i++
select * from big_table ORDER BY id DESC LIMIT $LINES_TO_SELECT OFFSET $i*$LINES_TO_SELECT;
We would SELECT the first X lines with an OFSSET set to 0, we’ll do stuff, then we’ll select X new lines starting at OFFSET X, do stuff, then SELECT X lines starting at OFFSET X*n where n is the iteration number.
Let’s see how it behaves depending on the offset value :
mysql> select * from big_table ORDER BY id DESC LIMIT 1 OFFSET 10;
+--------+--------------+
| id | not_an_index |
+--------+--------------+
| 565604 | 274252 |
+--------+--------------+
1 row in set (0.00 sec)
All good.
mysql> select * from big_table ORDER BY id DESC LIMIT 1 OFFSET 10000;
+--------+--------------+
| id | not_an_index |
+--------+--------------+
| 555614 | 693915 |
+--------+--------------+
1 row in set (0.01 sec)
Hmm, a bit slower, no ?
mysql> select * from big_table ORDER BY id DESC LIMIT 1 OFFSET 100000;
+--------+--------------+
| id | not_an_index |
+--------+--------------+
| 465614 | 168202 |
+--------+--------------+
1 row in set (0.03 sec)
Execution time is growing.
mysql> select * from big_table ORDER BY id DESC LIMIT 1 OFFSET 499999;
+-------+--------------+
| id | not_an_index |
+-------+--------------+
| 65614 | 811125 |
+-------+--------------+
1 row in set (0.11 sec)
Damn, we’re more than 11 times slower with 500k lines than with 10k lines. And worst, time is growing even if we’re doing ordering on an indexed field.
As a comparison, using the non indexed column for filtering is really worse :
mysql> select * from big_table ORDER BY not_an_index DESC LIMIT 1 OFFSET 10;
+--------+--------------+
| id | not_an_index |
+--------+--------------+
| 329322 | 999983 |
+--------+--------------+
1 row in set (0.13 sec)
mysql> select * from big_table ORDER BY not_an_index DESC LIMIT 1 OFFSET 10000;
+--------+--------------+
| id | not_an_index |
+--------+--------------+
| 525027 | 982360 |
+--------+--------------+
1 row in set (0.15 sec)
mysql> select * from big_table ORDER BY not_an_index DESC LIMIT 1 OFFSET 100000;
+--------+--------------+
| id | not_an_index |
+--------+--------------+
| 371301 | 823168 |
+--------+--------------+
1 row in set (0.27 sec)
mysql> select * from big_table ORDER BY not_an_index DESC LIMIT 1 OFFSET 499999;
+--------+--------------+
| id | not_an_index |
+--------+--------------+
| 241369 | 116245 |
+--------+--------------+
1 row in set (0.33 sec)
So, everything is optimized, you’re doing things correctly on an indexed field but this do not scale. 500k lines is a relatively small table. What about tables with millions, hundred of millions or billion records, what will happen ?
You’ve got it, you’ll blow your server doing that kind of iteration.
Let’s use EXPLAIN to see what’s happening :
mysql> explain select * from big_table ORDER BY id DESC LIMIT 1 OFFSET 1;
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+---------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+---------------------+
| 1 | SIMPLE | big_table | NULL | index | NULL | PRIMARY | 4 | NULL | 2 | 100.00 | Backward index scan |
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+---------------------+
1 row in set, 1 warning (0.01 sec)
mysql> explain select * from big_table ORDER BY id DESC LIMIT 1 OFFSET 4999999;
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+--------+----------+---------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+--------+----------+---------------------+
| 1 | SIMPLE | big_table | NULL | index | NULL | PRIMARY | 4 | NULL | 500000 | 100.00 | Backward index scan |
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+--------+----------+---------------------+
1 row in set, 1 warning (0.00 sec)
Explain is OK, it uses the index (Backward index scan as we’re doing a reverse ordering) but why is it so slow then ?
In short, as the database engine have no idea what is the Xth value in your query, it will read your indexed field and scan it until it finds the Xth value, then it will give you that value. So in short 10 rows are read if offset is 10, it’s fast, but 10000000000 records are read if you asked for the 10000000000th record, this one is really huge, even for really big databases out there.
The algorithmic complexity of the operation to search the first element is O(n) where n is the value of your OFFSET. So your complexity is proportionnal to your dataset size.
If you’re using this pattern for sequential scan of a table data, then your LIMIT…OFFSET queries will be slower at each iteration. If you have a small amount of data, it will eventually be unoticeable but if you have quite a lot of lines, it could blow up your server.
In real applications, I saw that affecting servers scaled to handle pretty good concurrency like thousands of SELECT per seconds when doing this on a table with several millions of records. That’s not a huge amount of data but it is huge impact on resources usage.
The issue is that the database engine do not know where is the Xth value of an indexed field. However, it perfectly knows where the value Y is in the associated index.
So the idea is to retrieve the last index returned by the previous iteration and then use it for the following iteration. In this case, access to the first element will be done in only 1 row read.
The algorithmic complexity of the operation to search the first element becomes O(1). The complexity becomes independent of your dataset size.
A small piece of code, in pseudo code could be :
LINES_TO_SELECT=1
COUNT="select count(0) from big_table ;"
LAST_ID="select max(id) from big_table ;"
FOR i=0; $i < $COUNT; $i++
SELECTED_LINES=select * from big_table WHERE id < $LAST_ID ORDER BY id DESC LIMIT $LINES_TO_SELECT;
LAST_ID=getLastId($SELECTED_LINES)
Let’s see how it behaves now :
mysql> select * from big_table WHERE id < 10 ORDER BY id DESC LIMIT 1;
+----+--------------+
| id | not_an_index |
+----+--------------+
| 9 | 132812 |
+----+--------------+
1 row in set (0.00 sec)
mysql> select * from big_table WHERE id < 100000 ORDER BY id DESC LIMIT 1;
+-------+--------------+
| id | not_an_index |
+-------+--------------+
| 99999 | 179631 |
+-------+--------------+
1 row in set (0.01 sec)
mysql> select * from big_table WHERE id < 500000 ORDER BY id DESC LIMIT 1;
+--------+--------------+
| id | not_an_index |
+--------+--------------+
| 499999 | 873655 |
+--------+--------------+
1 row in set (0.00 sec)
Really stable.