Fun MySQL fact of the day: unsafe logging
Over the last couple weeks, we've been discussing the MySQL binary log and building up our mental mode around the
STATEMENT-based binary log format. When using a
STATEMENT binary log format, MySQL very simply puts the SQL statement that issued a change into the binary log. Then, when we re-read the log through either
mysqlbinlog or replication, the statements can be re-executed verbatim. And while this seems like a pretty simple solution that's easy to understand and implement, it comes with a whole lot of fun.
Let's consider, for example,
INSERT INTO my_table(id, uuid) VALUES(0, UUID()), or even
UPDATE my_table SET some_value = 1 WHERE some_value < RAND() * 100. When MySQL logs these using the
STATEMENT binary log format, they'll be recorded in the bin log exactly the same way. And, of course, every time you run these queries, you'll get differing values for
UUID(). This puts our ability to perform point-in-time recovery in jeopardy and will cause a replica to have data inconsistent with its master. Similarly, as we'd discussed back on February 28, the same problem holds true for DML (Data Manipulation Language) statements with a
LIMIT but no
ORDER BY: the query is, essentially, non-deterministic. To our convenience, MySQL has documented all of the safe/unsafe operations when using
STATEMENT-based logging. Be certain you are looking at the documentation for your current version, because it changes on each release.
Thankfully, the MySQL community noticed this was somewhat undesirable, if not risky, out-of-the-box behaviour and MySQL eventually changed the default of
ROW in MySQL 5.7.7 and above. We'll discuss the
ROW format next week, but in the meantime, I suggest you
SHOW VARIABLES LIKE 'binlog_format' to make sure you are, in fact, not using
STATEMENT-based binary logging.