Friday, January 18, 2008

MySQL: Innodb internals

Went to the MySQL Performance Tuning Class last week. I would highly recommend it to any DBA new to MySQL, or any MySQL DBA looking to boost their tuning skills and understanding of some of the various features as they relate to performance.

The next few posts will have to do with information I gleaned from that class...

It's always good to understand what's happening underneath the covers when it comes to database performance and evaluating database architectures/designs for new products or features. One of the great things about the innodb engine is it's Oracleish approach to transaction management. The key is Oracle'ish'. As an Oracle DBA looking at MySQL, it's easy to assume that things like undo tablespaces and rollbacks work the same in innodb as they to in Oracle.

Well, you know what happens when we 'Assume'....

One of the key differences is that innodb does not have an undo tablespace or rollback segment per say. Innodb uses the concept of versioned rows, not to unlike SQLServer 2005's concept of versioned rows. SQLServer implements it by storing versions of the row being changed in the tempdb database (though only if the option is turned on). Innodb implements it with storing versions of the row being changed withing the tablespace itself.

Hmm... The obvious implication is fragmentation. As a row gets updated by multiple threads, or a long running transaction, innodb must keep copies of the row with different version numbers (rowid1, version1/rowid 1, version2, etc, etc) within the table so that a select statement has access to that row while the row is potentially changed by some other transaction. This allows for selects not to block updates, but when a transaction is committed, then the versioned rows are marked for delete. This can then cause a high level of fragmentation within the table and/or tablespace.

So, we as DBA's need to consider things like this when we give advice as to what engine should be used for a given application or database table with MySQL as a back end, as well as things like where they should reside on disk, and whether it's ok to use all tables in one tablespace (innodb default), or file_per_table option for innodb to keep fragmentation issues local to the associated tables/indexes. It may also affect the kinds of maintenance and how often it's needed with respect to rebuilds/defrags....