Fun MySQL fact of the day: chaos without ORDER BY

2019-02-28 21:36UTC

ANSI SQL-92 states that the order of a result set in the absence of an ORDER BY is implementation specific. This means that the same query without an ORDER BY clause run n times may return rows in n different orders.

In MySQL/InnoDB, the implementation-specific behaviour (coincidentally, not as defined or documented, and subject to change) tends to be the order of records as they are sorted in the index from which they are queried. Because a query may use a different query plan on each execution, based on its own current statistics, two servers (or even connections) may run the same query with different query plans. And that means returning results in a different order.

Quirky, sure, and in many cases, meh! But in one where it does matter, the results can be utterly disastrous: statement-based replication. Let's think about why: if you issue a DELETE ... LIMIT n on a master database (or even an UPDATE ... WHERE (... LIMIT)), you are at the whim of the query planner to decide which rows get deleted (maybe the first n on the PRIMARY index). When the same statement gets run on the replica, you are once again at the whim of the query planner (maybe there, the first n records based off the user_id are deleted). Next thing you know, your master and all of your replicas are out of sync, and you've suffered major data loss (or worse).

This is one of several reasons why MySQL 5.7.7 and above default to ROW based replication, which is not susceptible to this fun little quirk. Still, ROW based replication has its own quirks, but those are fun facts for another day.