Performance Schema Support in InnoDB
With the plugin 1.1 release, InnoDB will have full support of Performance Schema, a new feature of MySQL 5.5 release. This allows a user to peak into some critical server synchronization events and obtain their usage statistics. On the other hand, in order to make a lot of sense of the instrumented result, you might need some understanding of InnoDB internals, especially in the area of synchronization with mutexes and rwlocks.
With this effort, the following four modules have been performance schema instrumented.
1. Mutex
2. RWLOCKs
3. File I/O
4. Thread
Almost all mutexes (42), rwlocks (10) and 6 types of threads are instrumented. Most mutex/rwlock instrumentations are turned on by default, a few of them are under special define. For File I/O, their statistics are categorized into Data, Log and Temp file I/O.
This blog is to give you a quick overview on this new machinery.
Start the MySQL Server with Performance Schema
To start with, you probably want to take a quick look at MySQL’s Performance Schema Manual (http://dev.mysql.com/doc/refman/5.5/en/performance-schema.html ), this gives you a quick overview on the general performance schema features.
The performance schema is by default built in with MySQL 5.5 release. However, you do need to add “-performance_schema” to your server boot command line or have performance_schema system variable enabled in your server configure file in order to enable the performance schema. Otherwise, it will be disabled. Please note, you can specify “performance_schema” variable with no value or a value of 1 to enable it, or with a value of 0 to disable it.
When server starts, please pay attention to following lines in server error log:
“100407 16:13:02 [Note] Buffered information: Performance schema enabled.”
This means the server starts with performance schema running fine.
It could also display message such as:
“100407 16:13:02 [Note] Buffered information: Performance schema disabled (reason: start parameters)”
This shows the performance schema is disabled due to lack of either “performance_schema” boot option or appropriate variable set in the configuration file.
The third type message would be “Performance schema disabled (reason: init failed)”, it is due to performance schema initialization failure (could due to reasons such as memory allocation failure etc.). This message is relatively rare. I have not encountered it. If you do hit it, please check other performance schema related system variables, to see if they are out of reasonable range.
Performance Schema Database and its Tables
Assuming server starts fine with Performance Schema enabled, first stop you want to visit is probably the new database called “performance_schema”. All performance schema related tables are in this database:
mysql> use performance_schema
mysql> show tables;
Tables_in_performance_schema
Tables_in_performance_schema |
|
1 |
COND_INSTANCES |
2 |
FILE_INSTANCES |
3 |
MUTEX_INSTANCES |
4 |
RWLOCK_INSTANCES |
5 |
EVENTS_WAITS_CURRENT |
6 |
| EVENTS_WAITS_HISTORY |
7 |
EVENTS_WAITS_HISTORY_LONG |
8 |
EVENTS_WAITS_SUMMARY_BY_EVENT_NAME |
9 |
EVENTS_WAITS_SUMMARY_BY_INSTANCE |
10 |
EVENTS_WAITS_SUMMARY_BY_THREAD_BY_EVENT_NAME |
11 |
PROCESSLIST |
12 |
SETUP_CONSUMERS |
13 |
SETUP_INSTRUMENTS |
14 |
SETUP_OBJECTS |
15 |
SETUP_TIMERS |
16 |
PERFORMANCE_TIMERS |
17 |
FILE_SUMMARY_BY_EVENT_NAME |
18 |
FILE_SUMMARY_BY_INSTANCE |
These 18 tables can be categorized into a few big groups, such as “Instance” tables, “Wait” table with “History”, or “Wait” table with “Summary” and “Setup” table.
In the next few section, I will go through a few tables in this list that I think are important.
Find Instrumented Events with INSTANCE TABLES
To view what InnoDB events are active and being instrumented, please check following four “Instance” tables for corresponding modules:
MUTEX_INSTANCES
RWLOCK_INSTANCES
PROCESSLIST
FILE_INSTANCES
mysql> SELECT DISTINCT(NAME)
-> FROM MUTEX_INSTANCES
-> WHERE NAME LIKE “%innodb%”;
name |
wait/synch/mutex/innodb/analyze_mutex |
wait/synch/mutex/innodb/mutex_list_mutex |
wait/synch/mutex/innodb/ibuf_mutex |
wait/synch/mutex/innodb/rseg_mutex |
wait/synch/mutex/innodb/autoinc_mutex |
wait/synch/mutex/innodb/flush_list_mutex |
….. |
wait/synch/mutex/innodb/thr_local_mutex |
wait/synch/mutex/innodb/srv_monitor_file_mutex |
wait/synch/mutex/innodb/buf_pool_mutex |
wait/synch/mutex/innodb/recv_sys_mutex |
wait/synch/mutex/innodb/fil_system_mutex
|
wait/synch/mutex/innodb/trx_doublewrite_mutex |
wait/synch/mutex/innodb/flush_order_mutex |
35 rows in set (0.00 sec)
Please notice there could be multiple instances of a mutex in the server,
mysql> SELECT COUNT(*)
-> FROM MUTEX_INSTANCES
-> WHERE NAME LIKE “%rseg_mutex%”;
COUNT(*) |
128 |
1 row in set (0.92 sec)
This is why we need to use “SELECT DISTINCT (NAME)” clause in the initial query to get only the distinct mutex names from the MUTEX_INSTANCES table. Without the DISTINCT, there could be hundreds of instances of mutex being displayed. This also applies to other instance tables.
Also please note, if the mutex is not yet created, it will not be listed in the instance table, so you might see fewer events/instances than you might expected.
One last point for this section, buffer block mutex and rwlock are instrumented but disabled by default from performance schema instrumentation. The reason is that there comes one mutex/rwlock pair per 16k buffer block. Server with large buffer pool configuration could easily create thousands of instances of this type of mutexes/rwlocks. This easily exceed the default value of max mutex/rwlock instances (1000) allowed. And user would require to extend the limit by setting system variable performance_schema_max_mutex_instances and/or performance_schema_max_rwlock_instances.
However, as we mentioned, the block mutex/rwlock are instrumented, to enable them, you might need to change the code and un-define “PFS_SKIP_BUFFER_MUTEX_RWLOCK”.
Find out what is going on with EVENTS_WAITS_CURRENT table
The next table you might be interested in is the EVENTS_WAITS_CURRENT table,
mysql> SELECT THREAD_ID, EVENT_NAME, SOURCE
-> FROM EVENTS_WAITS_CURRENT
-> WHERE EVENT_NAME LIKE “%innodb%”;
THREAD_ID |
EVENT_NAME |
SOURCE |
2 |
wait/synch/mutex/innodb/ios_mutex |
srv0start.c:495 |
8 |
wait/synch/mutex/innodb/log_sys_mutex |
log0log.ic:405 |
9 |
wait/synch/mutex/innodb/kernel_mutex |
srv0srv.c:2182 |
10 |
wait/synch/mutex/innodb/thr_local_mutex |
thr0loc.c:127 |
4 rows in set (0.00 sec)
This table shows the latest instrumented activity for a particular thread. And the nice part of it is that it has the exact file name and line number of each event. So in case there is a hang/blocking situation (due to mutex/rwlock), you could know which mutex or rwlock is actually involved.
Check into “Limited history” with HISTORY tables
There are a couple of “HISTORY” tables that record each instrumented events. The EVENTS_WAITS_HISTORY table contains the most recent 10 events per thread. And EVENTS_WAITS_HISTORY_LONG contains the most recent 10,000 events by default. They also come with the “SOURCE” field with file name and line number, and you might be able to do some aggregation on them to find some interesting behavior.
For example, following query gives you exact mutex instances that has been on the top list as shown in the history table:
mysql> SELECT EVENT_NAME, SUM(TIMER_WAIT), COUNT(*), SOURCE
-> FROM EVENTS_WAITS_HISTORY_LONG
-> WHERE EVENT_NAME LIKE “%innodb%”
-> GROUP BY SOURCE
-> ORDER BY SUM(TIMER_WAIT) DESC;
Or you can obtain the instance with the most average time wait:
mysql> SELECT EVENT_NAME, SUM(TIMER_WAIT)/count(*), source
-> FROM EVENTS_WAITS_HISTORY_LONG
-> WHERE EVENT_NAME LIKE “%innodb%”
-> GROUP BY source
-> ORDER BY SUM(TIMER_WAIT) / COUNT(*) DESC;
As mentioned, the history table has limited size, with 10 events per thread for EVENTS_WAITS_and 10,000 for EVENTS_WAITS_HISTORY_LONG. However, you could extend the history length of these two tables by changing
“performance_schema_events_waits_history_size” and “performance_schema_events_waits_history_long_size” system variables. The performance_schema_events_waits_history_long_size can be extended to a million rows in maximum. However, please do not expect this would be enough. Even with 1 million events configured, in a busy system, it probably only contains a few seconds operation of the server.
Find out aggregated information from SUMMARY Tables
To get the overall aggregated value for these instances, you would need the “SUMMARY” table. There are 5 Summary tables,
EVENTS_WAITS_SUMMARY_BY_EVENT_NAME
EVENTS_WAITS_SUMMARY_BY_INSTANCE
EVENTS_WAITS_SUMMARY_BY_THREAD_BY_EVENT_NAME
FILE_SUMMARY_BY_EVENT_NAME
FILE_SUMMARY_BY_INSTANCE
As their name suggested, they are just events statistics aggregated with different criteria. Digging into these table gives you some idea where the contention could be.
For example, following query shows what is the hottest mutex (these values have unit as pico-second):
mysql> SELECT EVENT_NAME, COUNT_STAR,SUM_TIMER_WAIT, AVG_TIMER_WAIT
-> FROM EVENTS_WAITS_SUMMARY_BY_EVENT_NAME
-> WHERE EVENT_NAME LIKE “%innodb%”
-> ORDER BY COUNT_STAR DESC;
| EVENT_NAME | COUNT_STAR | SUM_TIMER_WAIT | AVG_TIMER_WAIT |
| buf_pool_mutex | 1925253 | 264662026992 | 137468 |
| buffer_block_mutex | 720640 | 80696897622 | 111979 |
| kernel_mutex | 243870 | 44872951662 | 184003 |
| purge_sys_mutex | 162085 | 12238011720 | 75503 |
…. |
… |
… |
.. |
This experiment shows the buf_pool_mutex has been the hottest mutex. However, aggregate on AVG_TIMER_WAIT shows that ibuf_mutex is the one we waited the longest, even though it is much less frequently accessed.
Also please note these tables can be truncated, so you can essentially reset these wait values, and start the counting and aggregation afresh.
Again, to really understand and interpret information from these tables require some understanding of internals that these mutexes/rwlocks reside. It could target to advanced users and developers who want to analyze performance bottlenecks. However, common users might still be able to infer certain information out of it, and have some creative usage of these statistics. For example, for some I/O bound servers, you might find double write buffer mutex is on the top listed mutexes in terms of total time waited. Then you might want to consider to turn off the double write buffer option etc.
Performance Impact:
The last item we discuss is that this performance schema comes with a cost. It does have visible performance impact. A simple dbt2 test with 50 warehouse, 32 connections on a server with 2G buffer pool size show about 8% performance impact with all performance schema events turned on. This is also confirmed by some sysbench’ tests.
In fact, to minimize performance impact, performance schema allows you turn off counting on individual event with SETUP tables, you can use SETUP_CONSUMERS to turn on/off logging into history table etc., and SETUP_INSTRUMENTS to turn on/off counting on a particular mutex/rwlock etc. However, turning off events counting cannot completely eliminate the performance impact from the performance schema. This is something the performance schema to be improved upon.
Summary:
In summary, we are providing a rich set of mutex, rwlock, I/O and thread usage information through Performance Schema instrumentations. It can be used to diagnose server performance bottlenecks, find possible hot spots in the server as well as gain a better understanding on system behavior/access pattern on modules where these mutexes/rwlocks reside in. However, it does come with a cost to server performance itself. So this is more suitable for development server tuning and studying. You might want to leave this out for any production server.
-Jimmy Yang
April 15th, 2010 at 1:38 am
I wonder, does this feature also go into the latest embedded innodb release? Or is it bound on mysql performance schema? Thanks
[Reply]
April 19th, 2010 at 4:01 am
Hi, thank you very much. good job.
Ian Eisenberg
[Reply]
April 28th, 2010 at 9:42 pm
This allows a user to peak into …
Wow – what a a trip
[Reply]
August 18th, 2011 at 4:59 am
Wow , wow!!!!
thank you very mach ,good job.
[Reply]