
Saturday, March 14, 2009

What just happened to the database?

It's always fun when you come into work and notice that one of your database monitors/graphs has changed from showing almost no row accesses/sec to 40,000 rows/sec in a matter of minutes. And then by lunchtime it's showing 90,000 rows/sec. What's up with that?

MySQL Enterprise Monitor Row Accesses:

[graph]

Well, of course the first thing you do is check all your Cacti monitors, because there is obviously something wrong with the monitoring system.
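
Or skip the graphs entirely and pull the raw counters by hand. A quick sanity check (these are the standard InnoDB and handler status variables; sample them twice and divide the difference by the interval to get a rows/sec rate):

-- Rows read by InnoDB, plus the handler-level read calls behind them.
-- Sample twice, diff, and divide by the seconds between samples.
SHOW GLOBAL STATUS LIKE 'Innodb_rows_read';
SHOW GLOBAL STATUS LIKE 'Handler_read%';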

Cacti Innodb Row Accesses:

[graph]

Hmm... it's not the monitor. Must be a change in the app...

So you check with everyone, and no, nothing's changed in the app. In fact, nothing even appears to be different (same basic app response times, same # of requests/sec to the app, etc., etc.).
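
From the database side, the overall throughput counters are a quick way to double-check that (standard status variables again; sample twice and diff for per-second rates):

-- Overall statement and SELECT throughput, plus connection count; if the
-- app really hasn't changed, these rates should look the same as before.
SHOW GLOBAL STATUS LIKE 'Questions';
SHOW GLOBAL STATUS LIKE 'Com_select';
SHOW GLOBAL STATUS LIKE 'Threads_connected';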

Database Activity:

[graph]

So it must be a query cache issue...

Query Cache:

[graph]

Apparently not... :(
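
(For anyone without the graph, the Qcache_* status variables are the standard way to rule the query cache in or out:)

-- Hit/insert/prune counters plus the cache settings themselves; a sudden
-- drop in Qcache_hits relative to Com_select would point at the cache.
SHOW GLOBAL STATUS LIKE 'Qcache%';
SHOW VARIABLES LIKE 'query_cache%';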

So what could cause that kind of dramatic increase in index rows read / sec ?

In reality, looking at the data distribution, the optimizer apparently decided to 'explain' it all away....

Current explain plan:

mysql> \e
    -> \G
*************************** 1. row ***************************
possible_keys: UIDX_TOTAL_RATING_OWNER_ABOUT,IDX_TOTAL_RATING_ABOUT,IDX_TOTAL_RATING_OWNER
          key: UIDX_TOTAL_RATING_OWNER_ABOUT
      key_len: 180
          ref: NULL
         rows: 5
     filtered: 80.00
        Extra: Using where; Using index; Using filesort
1 row in set, 1 warning (0.00 sec)



Previous explain plan:

mysql> \e
    -> \G
*************************** 1. row ***************************
possible_keys: UIDX_TOTAL_RATING_OWNER_ABOUT,IDX_TOTAL_RATING_ABOUT,IDX_TOTAL_RATING_OWNER
          key: IDX_TOTAL_RATING_OWNER
      key_len: 42
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.03 sec)
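
(Both plans appear to come from EXPLAIN EXTENDED - hence the filtered column and the '1 warning' - so a SHOW WARNINGS right after each one prints the query as the optimizer rewrote it, which is handy when comparing plans:)

-- Run immediately after the EXPLAIN EXTENDED; the Note-level warning
-- contains the optimizer-rewritten form of the query.
SHOW WARNINGS;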

Interesting what a little change in data and a different path through the optimizer can do...
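
If the new path ever does turn out to be the wrong one, the closest MySQL gets to Oracle-style plan stability is an index hint. A rough sketch only - the table and column names below are made up for illustration; the index name is the one from the plans above:

-- Hypothetical query shape: steer the optimizer back onto the narrower
-- non-unique index it used before. total_rating, owner_id, and about_id
-- are placeholder names; IDX_TOTAL_RATING_OWNER is the real index.
SELECT *
  FROM total_rating FORCE INDEX (IDX_TOTAL_RATING_OWNER)
 WHERE owner_id = 42
   AND about_id = 7;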


2 comments:

Mark Leith said...

Awww, it hurts that you think the graph I made (and our MEM infrastructure) was wrong! :)

I've always liked the idea of Oracle's "Stored Outlines" - so that you can have execution plan stability nailed down no matter how the distribution of your data changes.

(http://www.oracle-base.com/articles/misc/Outlines.php)

It works in many cases - but sometimes new access paths are *correct* (the optimizer isn't wrong all of the time!), so it can be harmful too...

In any case - I don't think we'd get something like that for MySQL for some time. What was the storage engine you were using, and had you had any significant change in the data?

Phil Hildebrand said...

Yea, I guess I should trust the tool...

I definitely miss Oracle's stored outlines and histograms, as well as its query rewrite abilities.

From what I can tell (it's all InnoDB on the storage engine side), the optimizer likely made the right decision. The data in the table carrying both the non-unique index and the unique index has grown from a few thousand rows to over a million.

Though it didn't increase overnight, I suspect it's just faster now to scan the larger unique index than it is to scan the smaller non-unique index.
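
(One way to see what the optimizer is basing that choice on is the index statistics - the table name below is just a placeholder:)

-- Refresh InnoDB's statistics and look at the Cardinality column; that
-- estimate is what the optimizer weighs when choosing between the
-- unique and non-unique indexes. total_rating is a placeholder name.
ANALYZE TABLE total_rating;
SHOW INDEX FROM total_rating;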

Hmm... maybe we need an optimizer path tool, though, to really know... ;)