Fun MySQL fact of the day: slowest sort

While we discussed yesterday that filesort can use different sorting algorithms depending on variables, we didn't discuss the various modes that filesort has. The behaviour of these modes can even make the sorting algorithm filesort chooses seem pretty unimportant.

Today, we're going to maximise fun and consider the worst-case filesort scenario, and to do that we first need to consider the MySQL variable, max_length_for_sort_data, which defaults to 1024 (bytes). During a filesort operation, MySQL will evaluate if the size of the row plus the size of the sort key exceeds max_length_for_sort_data. Of course, this is the maximum size of the row, so a VARCHAR(255) with the 4-byte utf8 character set accounts for 1020 bytes, and that plus just one other 4-byte column, for example, would exceed the default max_length_for_sort_data. When this happens, filesort chooses a <sort_key, rowid> mode, which is a 3-part operation:

  1. MySQL will request a result set from the storage engine to fulfil the sort_key in addition to the row's primary key value (rowid) for all candidate rows.
  2. MySQL will then sort the rows by sort_key.
  3. MySQL will then read each row from the storage engine in sort order (i.e. not index order) using rowid.

As you can imagine, this is a pretty expensive operation, and I hope your key take-away is that when you can't leverage index optimised sorting, you should try to keep your sort keys as small as possible, and especially smaller than max_length_for_sort_data.