Fun MySQL fact of the day: batched key access

We took a small detour from discussing table joins yesterday to take a look at Disk-Sweep Multi-Range Read Optimisation (MRR). If you're wondering why, it's was so that we have a better understanding of how the third of the 3 join algorithms MySQL can use to join tables: the Batched Key Access Nested-Loop Join algorithm.

The Batched Key Access join (BKA) algorithm, like the Nested-Loop Algorithm we discussed on Tuesday, uses a join buffer (also of size join_buffer_size bytes) to buffer rows from the "left-hand" join table until the buffer is full. Once the buffer is full, the BKA algorithm submits the join keys for the "right-hand" join table to the MRR engine, which, in turn, sorts the join keys in key order and subsequently retrieves rows from the "right-hand" join table in key order. Just as we discussed yesterday, this optimization tries to reduce the amount of random IO required to read from a storage engine.

Unfortunately, while the BKA algorithm exists and sounds like a pretty good optimisation, it will not be used unless you explicitly hint to MySQL that it should be used or if you disable the MRR cost estimation optimizer switch. Why? To quote the MySQL documentation: "the cost estimation for MRR is too pessimistic". But when you do enable it and MySQL does choose to use it, you will know it's happening when the Extra column of your EXPLAIN shows Using join buffer (Batched Key Access).