Fun MySQL fact of the day: good-enough sort
Over the last few days, we've discussed some of the traits of
filesort and we'd discussed two important variables:
max_length_for_sort_data. But there's 1 more important variable that you may even find yourself wanting to change on a query-by-query basis:
While it's rather trivial to confuse
max_length_for_sort_data, they actually have some interplay as we'll see, but while
max_length_for_sort_data determines the
max_sort_length tells MySQL how many bytes to compare for each field when sorting rows. For example,
SET SESSION max_sort_length=32 would tell MySQL to compare only the first 32 bytes of each field, rather than the entire thing. While this can seem pretty weird, it might make sense to use when you need faster sorts on string data or temporal data that may not necessarily need to be perfect.
For example, if you had an
ORDER BY blog.date_created, blog.description, we may only want to sort the first 32 bytes of a user's blog post description instead of using the default
1024 bytes. This may not mean too much when you have a small number of rows to sort, but if you have 5,000, it can mean the difference between 4.9MB (requiring disk-based temporary files to sort) and 156KB, all which fits in the default 256KB sort buffer.
So, what do we now know about
- You can dynamically lower it to prevent a
<sort_key, rowid>sort mode.
- You can dynamically lower it to reduce the number of merge passes in a query with a large number of sorted rows.
- You can dynamically lower it to ensure the sorted rows all fit into the sort buffer.
So, next time you find yourself sorting by large strings, ask yourself if you can trade off some accuracy for sorting efficiency.