Fun MySQL fact of the day: OFFSET

On Friday, we discussed a simple fact about LIMIT, and that wasn't just because it was Friday (well, maybe a bit), but because I felt it important to have a refresher on how MySQL and the storage engine interact before we look at OFFSET.

It took some time, but after finally(!) finding the 3 lines of code in MySQL that's responsible for implementing OFFSET, I had a bit of a chuckle. It was not at all what I expected, but I have a hard time arguing that it's wrong. Today, let's consider a table, (id,a,b,c), with no secondary indices, but with 30 rows in sequential order (id=1,id=2,...,id=30) that we query as SELECT * FROM my_table LIMIT 10 OFFSET 10. Knowing what we know, our expectation is that we will get back rows with id=11..id=20, and, well, we will. But not until MySQL has first seen all rows from id=1..id=20.

Huh? Surely there's some kind of optimisation where MySQL can just jump to id=11, right? Unfortunately, no. There isn't. And there can't be, because for all MySQL knows, there is no id=11. And, further, like the example on Friday, the storage engine also has no idea if MySQL will need to perform supplemental filtering on the rows it has returned. So, knowing this, I think we can agree that the OFFSET implementation can't be terribly intelligent. But what exactly does it look like?

For a query with an OFFSET, MySQL behaves no differently than a query without an OFFSET: it fetches each row from the storage engine one-by-one, evaluates them against the applicable conditions in the WHERE clause one-by-one, and then attempts to write it to the network. But this is where things differ. To see how, we need to consider just 3 variables in the SELECT code path:

Now, just before the row is sent to the network in select_send::send_data, MySQL checks if offset_limit_cnt > 0. If so, it subtracts 1 and returns, effectively dropping the row. At the same time, MySQL increments send_records. Eventually, offset_limit_cnt == 0 and the rows start getting sent to the client. Then, eventually, when send_records >= select_limit_cnt, MySQL knows to stop asking for rows and cleans up. So, in effect, OFFSET is (probably necessarily) more like a DONT WRITE x ROWS TO THE NETWORK feature. Fun.

Fortunately for us, even though MySQL isn't "smart" enough to jump straight to id=11, we are, but that's a fun fact for later this week.