Apr 16

The performance of InnoDB's flush_list flushing determines the baseline performance of modifying (write-heavy) workloads, so it is important to optimize the flush behavior. In this post we’ll consider how to optimize the neighbor-flushing behavior.

Factor 1: Characteristics of storage

Depending on how your storage’s write IO throughput behaves, you can classify the storage as either “write amount bound” (limited by the number of bytes written) or “write times bound” (limited by the number of write requests). The minimum unit of the InnoDB datafile is the page (16KB or less). InnoDB attempts to combine contiguous pages into a single IO, up to a maximum of 1 extent (1MB).
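To make the two terms concrete, here is a minimal C++ sketch (not InnoDB's actual code) that groups a set of dirty page numbers into write requests following the rule described above: contiguous pages are merged, and no request grows beyond one extent. The names WriteRequest and plan_writes are hypothetical, and the sketch assumes 16KB pages, 64 pages per 1MB extent, and requests that stop at aligned extent boundaries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative assumption: 16KB pages in 1MB extents => 64 pages per extent.
constexpr std::uint32_t kPagesPerExtent = 64;

// One write request covering n_pages contiguous pages (hypothetical type).
struct WriteRequest {
    std::uint32_t first_page;
    std::uint32_t n_pages;
};

// Merge contiguous dirty pages into requests that never cross an
// extent boundary. Hypothetical helper, not an InnoDB function.
std::vector<WriteRequest> plan_writes(std::vector<std::uint32_t> dirty_pages) {
    // Sort and de-duplicate so contiguous pages become adjacent entries.
    std::sort(dirty_pages.begin(), dirty_pages.end());
    dirty_pages.erase(std::unique(dirty_pages.begin(), dirty_pages.end()),
                      dirty_pages.end());

    std::vector<WriteRequest> requests;
    for (std::uint32_t page : dirty_pages) {
        if (!requests.empty()) {
            WriteRequest& last = requests.back();
            const bool contiguous = (page == last.first_page + last.n_pages);
            const bool same_extent =
                (page / kPagesPerExtent) == (last.first_page / kPagesPerExtent);
            if (contiguous && same_extent) {
                ++last.n_pages;  // extend the current request
                continue;
            }
        }
        requests.push_back({page, 1});  // start a new request
    }
    return requests;
}
```

On “write times bound” storage the cost of a flush roughly follows the number of requests returned; on “write amount bound” storage it follows the total number of pages across all requests.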

<one HDD>:  Almost “write times bound”, because head-seek time is the dominant factor in HDD access time, and a write of around 1MB can be handled with a single head seek.

<RAID-HDD>: It depends on the striping size of the RAID. In many cases, the striping size is set to 256KB ~ 1MB (much larger than the page size of the datafile), with the intention that 1 IO maps to 1 HDD (both to keep the sequential access advantage of HDD and to keep IO requests parallel across the several HDDs in the RAID). With such a typical striping size, RAID-HDD is “write times bound”. (With a small striping size, around the same as the page size, it should be “write amount bound”. But I don’t recommend such a small striping size from the viewpoint of this post, because it simply loses the sequential access advantage.)

Read the rest of this entry »

Apr 13

In the April 2012 MySQL labs release we have reworked the flushing heuristics in InnoDB. In this post I’ll give an overview of what we have changed and the various configuration variables we have introduced to fine-tune the flushing algorithm. If you are interested in finding out how our new flushing algorithm fares in benchmarks you can get these details in Dimitri’s well-explained blog here.

Flushing means writing dirty pages to disk. I have explained adaptive_flushing and the types of flushing in some detail in my previous notes. Please go through these notes if you want to make sense of what follows.

The page_cleaner thread checks the state of the system every second. It takes into account the number of dirty pages, the amount of reusable redo space, the rate at which redo is generated, and the IO capacity for which the server is configured, and based on these factors decides how many pages we need to flush.

In the new scheme of things the page_cleaner thread uses a single non-linear formula to calculate how many pages we need to flush to have sufficient reusable redo space. This is different from the current flushing heuristic, where async_water_mark is taken as the point at which we change our flushing algorithm. Similarly, instead of using innodb_max_dirty_pages_pct as a switch which triggers flushing, we have introduced the concept of a range: flushing to control the dirty pages percentage starts once we cross the low water mark and gets more and more aggressive as we near the high water mark.
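The range idea can be sketched in a few lines of C++. This is only an illustration and not the formula used by the server: the function name flush_pct_for_dirty is hypothetical, and the linear ramp between the two water marks is an assumption chosen for simplicity. The only properties taken from the description above are that nothing is flushed for this purpose below the low water mark and that the effort grows as the dirty percentage approaches the high water mark.

```cpp
#include <cassert>

// Hypothetical helper: returns the share (0..100) of the configured IO
// capacity to spend on flushing, given the current dirty page percentage.
// ASSUMPTION: a linear ramp between the water marks; the real heuristic
// may use a different curve.
double flush_pct_for_dirty(double dirty_pct, double low_wm, double high_wm) {
    assert(low_wm < high_wm);
    if (dirty_pct <= low_wm) {
        return 0.0;    // below the low water mark: no flushing for this reason
    }
    if (dirty_pct >= high_wm) {
        return 100.0;  // at or above the high water mark: full effort
    }
    // In between: effort grows as we approach the high water mark.
    return 100.0 * (dirty_pct - low_wm) / (high_wm - low_wm);
}
```

With a low water mark of 10 and a high water mark of 75, for instance, a dirty percentage of 42.5 sits halfway through the range and yields half of the configured effort.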

There are four new configuration variables. Note that if your system is not experiencing any IO spikes due to checkpoints then you can probably leave all of the following as is. All the variables are global in scope and can be set dynamically. Read the rest of this entry »

Apr 11

After introducing InnoDB persistent statistics in MySQL 5.6, in this April Labs release we have dressed it up in a nice UI and refactored the internals a bit to make the code more elegant and straightforward.

The persistent stats are now controlled globally and can also be overridden at table level, should any table require a different behavior.

Global

The server global flag --innodb-stats-persistent (boolean) now controls whether all InnoDB tables use persistent statistics or not. Keep in mind that if a table is using persistent stats then its statistics will not be updated automatically and you are responsible for running ANALYZE TABLE periodically, whenever you think the table contents have changed too much. Thus the default for --innodb-stats-persistent is currently set to OFF.

Per table

Read the rest of this entry »

Apr 11

In this April MySQL Labs release, we provide a more robust and release-ready InnoDB Memcached Engine with a few enhancements. The most notable addition is SASL support, which gives users the capability to protect their MySQL database from unauthenticated access through memcached clients. In this blog, I will walk you through the steps to enable this option.

Background Info:
SASL stands for “Simple Authentication and Security Layer”, which is a standard for adding authentication support to connection-based protocols. Memcached added SASL support starting with its 1.4.3 release. And here is a good article that gives you some background on why and how SASL is supported in Memcached.

For InnoDB Memcached, the “Memcached mapped” user table must be registered in the “container” system table, and memcached clients can only access such “registered” tables. Even though the DBA can add access restrictions on such a table, he/she has no control over who can access it through the memcached clients. This is exactly why we want to provide a means (in this case SASL) for the DBA to have some control over who can access our InnoDB tables.

In the following section, we will go through the steps to build, enable and test an SASL-enabled InnoDB Memcached plugin.

Steps to Build and Enable SASL in InnoDB Memcached Plugin:

Read the rest of this entry »

Apr 11

The problem
After making several performance fixes, notable among them the kernel mutex split and the new handling of read-only transactions (in particular non-locking auto-commit read-only transactions), we weren’t seeing any increase in transactions per second (TPS) on our high-end hardware. On one particular host, with 24 cores and 2 threads per core, the TPS using Sysbench was a tepid 5.6K at 16 threads and more or less plateaued up to 1K user threads. No matter what config settings we used, we would end up with more or less the same result.

We ended up getting together for a meeting in Paris to discuss this issue, and during the brainstorming one of the potential issues that cropped up was the effect of cache coherence and/or false sharing. Using the excellent Linux tool perf, we were able to narrow it down to a global statistic counter in row_sel_search_for_mysql(). Mikael Ronstrom explains this in more detail.

The solution
Create a generic counter class (InnoDB code is now C++) that splits the counter into multiple (configurable) slots that are on separate 64-byte cache lines, and use the thread id of the updating thread to index into a slot to reduce the contention/sharing. This had the desired effect: the TPS went from 5.6K to 15K at 64 user threads and stayed close to stable right up to 1K, with very slow degradation. This was using Sysbench OLTP_RO for autocommit-non-locking-read-only queries (Sysbench option --oltp-skip-trx=off).
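For readers who want to see the shape of the idea, here is a minimal C++ sketch of a counter split across cache-line-aligned slots. It is not the InnoDB class: the name ShardedCounter, the default slot count, and the use of std::atomic with a hash of std::thread::id to pick a slot are all assumptions made for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>

// Cache line size from the post; hard-coded here as an assumption.
constexpr std::size_t kCacheLine = 64;

// Hypothetical sketch: a counter split into N slots, each on its own
// cache line, so that threads updating different slots do not share lines.
template <std::size_t N = 64>
class ShardedCounter {
    static_assert(N > 0, "need at least one slot");

public:
    // Add n to the slot chosen by the calling thread's id.
    void add(std::uint64_t n = 1) {
        slots_[slot_for_this_thread()].value.fetch_add(
            n, std::memory_order_relaxed);
    }

    // Sum all slots. The result is a snapshot and may be slightly stale
    // if other threads are updating concurrently.
    std::uint64_t get() const {
        std::uint64_t total = 0;
        for (const Slot& s : slots_) {
            total += s.value.load(std::memory_order_relaxed);
        }
        return total;
    }

private:
    // alignas pads each slot to a full cache line.
    struct alignas(kCacheLine) Slot {
        std::atomic<std::uint64_t> value{0};
    };

    static std::size_t slot_for_this_thread() {
        return std::hash<std::thread::id>{}(std::this_thread::get_id()) % N;
    }

    Slot slots_[N];
};
```

A caller would declare something like ShardedCounter<64> rows_read; and call rows_read.add() on the hot path, reading rows_read.get() only when the statistic is reported. The trade-off is memory (one cache line per slot) and a slower read in exchange for far less cache-line bouncing on updates.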

The code and binary can be downloaded from the labs release downloads; the current release is mysql-5.6.6-labs-april-2012-*. See the code in include/os0thread.h. The new class is ib_counter_t.

We have now refactored the code and grouped all the InnoDB statistic counters in srv_counter_t. This will help in further consolidation and improvements. Currently, most of the InnoDB config and statistics variables are defined in srv0srv.cc (with a few exceptions). We need to start paying even more attention to their layout and alignment from now on. There seem to be some false sharing issues that we haven’t completely identified yet.

Read the rest of this entry »

Apr 11

This feature is a continuation of the “Fast Index Creation” feature introduced in Fast Index Creation in the InnoDB Storage Engine. Now you can perform other kinds of DDL operations on InnoDB tables online: that is, with minimal delay for operations on that table, and without rebuilding the entire table. This enhancement improves responsiveness and availability in busy production environments, where making a table unavailable for seconds or minutes whenever its column definitions change is not practical.

The DDL operations enhanced by this feature are these variations on the ALTER TABLE statement:

Apr 11

The Problem

In the past, users were unable to take full advantage of the FLUSH TABLES WITH READ LOCK statement. InnoDB simply ignored the flush to disk part. If the table did not have any dirty pages in the InnoDB buffer that weren’t synced to disk (due to sheer luck) then it was safe to copy the .ibd file to another location. Also, the restore was not without its limitations and complications. The .ibd file could not be copied over to another server because InnoDB during import did not fix up metadata required for a trouble-free import. The main problems during import were:

  • If the tablespace ID of the IMPORTing instance had changed, for example if the table had been dropped and re-created, then the import would fail.
  • If the table and index IDs in the tablespace were different from those on the importing server, then the import would fail.
  • InnoDB uses a global row ID for tables without an explicit primary key. Any such row IDs in an imported tablespace had to be lower than the current maximum on the new server.
  • The maximum LSN of the tablespace had to be lower than the server’s current max LSN.
  • The maximum transaction (trx) ID of the tablespace had to be lower than the server’s current max trx ID.
  • Purge and change buffer issues.

In short, if it worked you were lucky that the above constraints were satisfied, or your tables were probably read-only.

The Solution

Do both the flush (export) and the import properly. During flushing we disable purge, merge all pending change buffer entries into the tablespace, and then flush all the dirty pages to disk. Note: Purge will remain disabled until UNLOCK TABLES; is issued. We also write out a metadata file that contains the table schema, the hostname of the exporting server, the page size of the exporting instance, the highest autoinc value in memory, etc., so that when we do an import we can check whether the table schema matches (currently, the check ignores foreign key relationships) and also restore the autoinc value. This metadata file is created in the same directory as the tablespace, with a suffix of .cfg. If the table is named customer then the metadata file will be named customer.cfg and the tablespace will be named customer.ibd. Note: FLUSH TABLES and IMPORT only work for tables that are stored outside the system tablespace, in their own tablespace. The InnoDB config parameter innodb_file_per_table must be set when creating the table.

Read the rest of this entry »

Apr 10

The April 2012 InnoDB labs release introduces a new feature in InnoDB that allows you to choose the location of specific tables.  For example, you can place critical tables onto an SSD drive while leaving the system tablespace on a hard drive.  Conversely, you can store your primary database files on an SSD and put a seldom-used but very large archive or reference table on a larger, cheaper hard drive.

InnoDB now makes use of the following existing syntax in MySQL:

CREATE TABLE  . . .  DATA DIRECTORY = 'absolute path of data directory';

CREATE TABLE  . . .  PARTITION . . . DATA DIRECTORY = 'absolute path of data directory';

This syntax is used in MyISAM and Archive engines to make use of symbolic links in those operating systems that support it.  But InnoDB can use this syntax on any OS since it stores the path in a new system table called SYS_DATAFILES.  There is also a new system table called SYS_TABLESPACES.  Both of these can be viewed in the April Labs release of 5.6 by using information schema. For example; Read the rest of this entry »

Apr 10

The InnoDB team is pleased to announce the 2012 Spring labs release, with several much anticipated new features and performance enhancements. Please download mysql-5.6-labs-april-2012 from MySQL Labs and give it a try. Do not forget to provide your feedback.

The 2012 Spring labs release on MySQL Labs consists of the following InnoDB new features, which are not in the newly released MySQL 5.6.5 DMR yet:

  • Online DDL: some of the DDLs are now truly online, including ADD INDEX, SET DEFAULT, and DROP FOREIGN KEY.
  • Memcached plugin: with additional features, such as SASL support.
  • Transportable tablespace: allows users to export data files and import them into another MySQL instance.
  • Persistent statistics ON/OFF switch: the ability to control persistent statistics at the table level.
  • Option for specifying locations of InnoDB tables: allows users to choose the location of specific tables.

This labs release also includes several performance and scalability improvements, especially on modern CPUs:

  • Reduced false sharing
  • Configurable fast mutexes
  • my_hash_sort_simple() loop unrolling
  • Improved adaptive flushing
  • Improved neighbor flushing

Read the rest of this entry »

Dec 23

Continues from Improving InnoDB memory usage.

Here are some numbers from the fixups described in the above article:

The workload consists of 10 partitioned tables, each one containing 1000 partitions. This means 10,000 InnoDB tables. We truncate the tables, then restart mysqld and run:

1. INSERT a single row into each of the 10 tables
2. SELECT * from each table
3. FLUSH TABLES (this causes the tables to be closed and reopened on the next run)
4. wait for 10 seconds

We repeat the above steps 10 times. Here is the total memory consumption by mysqld with a 1GB InnoDB buffer pool during the workload:

Read the rest of this entry »
