Fun MySQL fact of the day: binlog row image
Considering yesterday's downsides to the
ROW binary log format, you may have spent the night sad and upset, maybe even unable to sleep. Likely, the narrative was along the lines of, "why, oh why, MySQL, must you make me choose between binary log size and safety?!" Well, rest-assured, tonight, you'll sleep better after you learn about
Introduced in MySQL 5.6, the
binlog_row_image variable instructs MySQL how it should write a binary log event when using the
ROW format. Yesterday, we considered the default value,
full, which maintains backward compatibility with previous versions of MySQL. In addition to
full, two other options exist:
noblob, which behaves similarly to
full, except it does not record unchanged
BLOBdata types, and
minimal, which logs only the primary key (or candidate primary key) as the predicate and records only the changed columns.
Today, we'll focus on the
minimal setting, which is much more fun than
noblob. If, like yesterday, we
UPDATE employee SET name = 'Charlie Charlson' WHERE id = 2462107 when
binlog_row_image=minimal and run
mysqlbinlog on the binary log, we'll see an entry such as this:
#190611 8:28:26 server id 57 end_log_pos 407 CRC32 0x501823fe Update_rows: table id 108 flags: STMT_END_F ### UPDATE `test`.`employee` ### WHERE ### @1=2462107 ### SET ### @2='Charlie Charlson'
Unlike yesterday, however, you'll notice that we're only binary logging the affected row's primary key (
@2). This is a log event reduction by about 30% (or 75% field-wise), but keep in mind that the actual reduction you'll see will vary on a table-by-table basis and is highly dependent on your update patterns. For example, some ORMs like to update all fields in a row even if they're not being changed (Hibernate and ActiveRecord to name a few). In such cases, a
minimal format will be closer to a 50% reduction field-wise (by reducing the predicate) instead of something substantially higher. Still, 50% is nothing to scoff at.
Of course, using
binlog_row_image=minimal doesn't come without downsides. First, a
minimal format will only work correctly on tables with a primary key (or a single unique key). Further, when the binary log is applied, the destination table must have the same columns in the same order as the source table. In the absence of either requirement, changes will be non-deterministic, exhibiting undefined behaviour. And, well, really, why wouldn't we want primary keys and consistent tables? Anything else would be asking for trouble.