Jul 25

Are you running an InnoDB installation with a many-gigabytes buffer pool(s)? Does it take too long before it goes back to speed after a restart? If yes, then the following will be interesting to you.

In the latest MySQL 5.6 Labs release we have implemented an InnoDB buffer pool(s) dump and load to solve this problem.

The contents of the InnoDB buffer pool(s) can be saved on disk before MySQL is shut down and then read in after a restart so that the warm up time is drastically shortened – the buffer pool(s) go to the state they were before the server restart! The time needed for that is roughly the time needed to read data from disk that is about the size of the buffer pool(s).

Lets dive straight into the commands to perform various dump/load operations:

The buffer pool(s) dump can be done at any time when MySQL is running by doing: Read the rest of this entry »

Apr 11

Background

InnoDB gathers statistics for the data in user tables, which are used by the MySQL optimizer to choose the best query plan. For a long time the imprecision and instability of these statistics have been creating problems for users.

The problem is that these statistics are recalculated at any of the following events:

* When the table is opened

* When the table has changed a lot (1/16th of the table has been updated/deleted or inserted)

* When ANALYZE TABLE is run

Read the rest of this entry »

Apr 18

Recently, it was reported (see MySQL bug #43660) that “SHOW INDEXES/ANALYZE does NOT update cardinality for indexes of InnoDB table”. The problem appeared to happen only on 64-bit systems, but not 32-bit systems. The bug turns out to be a case of mistaken identity. The real criminal here wasn’t the SHOW INDEXES or the ANALYZE command, but something else entirely. It wasn’t specific to 64-bit platforms, either. Read on for the interesting story about this mystery and its solution …

InnoDB estimates statistics for the query optimizer by picking random pages from an index. Upon detailed analysis, we found that the algorithm that picks random pages for estimation always picked the same page, thus producing the same result every time. This made it appear that the index cardinality was not updated by ANALYZE TABLE. Going deeper, the reason the algorithm always selected the same page was that the random number generator always generated numbers that, when divided by 3, always gave the same remainder (2).

The sampling algorithm selects a random leaf page by starting from the root page and then selecting a random record from it, descending into its child page and so on until it reaches a leaf page. In the particular case that was reported in the bug report, the root page contained only 3 records and the tree height was only 2 (i.e., the leaf pages were all just below the root page).

You can already guess what happened. The “random” numbers generated, not being so random, caused the algorithm to always pick the same record from the root page (the second one) and then descend to the leaf page below it. Every time. So, the 8 random pages that were sampled in order to get an estimate of the whole picture were in fact the same page, even in isolated ANALYZE TABLE runs.

So, clearly there was a problem with the random number generator. But why didn’t this problem seem to appear on 64-bit platforms? It would have, had we only enough time to wait. The random number generator, always generating numbers like 3k+2 of type unsigned long, at some point wrapped around 4 billion on 32-bit machines and started generating numbers like 3k+1. On 64-bit machines, where unsigned long is much bigger, this wrap did not occur. But it would have occurred if we ran the test for 1000 years!.

Read the rest of this entry »

Mar 31

Some months ago, Google released a patch for InnoDB that boosts performance on multi-core servers. We decided to incorporate the change into the InnoDB Plugin to make everybody happy: users of InnoDB don’t have to apply the patch, and Google no longer has to maintain the patch for new versions of InnoDB. And it makes us at Innobase happy because it improves our product (as you can in this post about InnoDB Plugin release 1.0.3).

However, there are always technical and business issues to address. Given the low-level changes in the patch, was it technically sound? Was the patch stable and as rock solid as is the rest of InnoDB? Although it was written for the built-in InnoDB in MySQL 5.0.37, we needed to adapt it to the InnoDB Plugin. Could we make the patch portable to many platforms? Could we incorporate the patch without legal risk (so it could be licensed under both the GPL and commercial terms)?

Fortunately Google generously donated the patch to us under the BSD license, so there was no concern about intellectual property (so long as we properly acknowledged the contribution, which we do in an Appendix to the manual and in the source code). So, while the folks at Google are known for writing excellent code, we had to thoroughly review and test the patch before it could be incorporated in a release of InnoDB.

The patch improves performance by replacing InnoDB mutex and rw-mutex with atomic memory instructions. The first issue that arose was that the patch assigned the integer value -1 to a pthread_t variable, to refer to a neutral/non-existent thread identifier. This approach worked for Google because they use InnoDB solely on Linux. As it happens, pthread_t is defined as an integer on Linux.

But we had problems when the patch was tested on FreeBSD. We still needed to reference a non-existent thread, but in some environments (e.g. some versions of HPUX, Mac OS X) pthread_t is defined as a
structure, not an integer. As we looked at it, the problem became more complex. For this scheme to work, it must be possible to change thread identifiers atomically, using a Compare-And-Swap instruction. Otherwise, there will be subtle, mysterious and nasty failures.

Read the rest of this entry »