Sep 19

MySQL has a well-earned reputation for being easy-to-use and delivering performance and scalability. In previous versions, MyISAM was the default storage engine. In our experience, most users never changed the default settings. With MySQL 5.5, InnoDB becomes the default storage engine. Again, we expect most users will not change the default settings. But, because of InnoDB, the default settings deliver the benefits users expect from their RDBMS — ACID Transactions, Referential Integrity, and Crash Recovery. Lets explore how using InnoDB tables improves your life as a MySQL user, DBA, or developer.

Background

In the first years of MySQL growth, early web-based applications didn’t push the limits of concurrency and availability. In 2010, hard drive and memory capacity and the performance/price ratio have all gone through the roof. Users pushing the performance boundaries of MySQL care a lot about reliability and crash recovery. MySQL databases are big, busy, robust, distributed, and important.

InnoDB hits the sweet spot of these top user priorities. The trend of storage engine usage has shifted in favor of the more efficient InnoDB. With MySQL 5.5, the time is right to make InnoDB the default storage engine.

The Big Change

Read the rest of this entry »

Sep 19

At MySQL, we know our users want Performance, Scalability, Reliability, and Availability, regardless of the platform the choose to deploy. We have always had excellent benchmarks on Linux, and with MySQL 5.5, we are also working hard on improving performance on Windows.

The original patch of improving Windows performance was developed by MySQL senior developer Vladislav Vaintroub; benchmarks by QA engineer Jonathan Miller. We integrated the patch into MySQL 5.5 release.

The following two charts show the comparison of MySQL 5.5 vs. MySQL 5.1 (plugin) vs. MySQL 5.1 (builtin) using sysbench:

Read the rest of this entry »

Sep 19

Write-heavy workloads can reach a situation where InnoDB runs out of usable space in its redo log files. When that happens, InnoDB does a lot of disk writes to create space and you can see a drop in server throughput for a few seconds. From InnoDB plugin 1.0.4 we have introduced the ‘innodb_adaptive_flushing’ method that uses a heurstic to try to flush enough pages in the background so that it is rare for the very active writing to happen. In this note I’ll try to explain how the heuristic works i.e.: what factors are taken into account when deciding how many dirty pages to flush from the buffer pool in the background. I’ll skip some details for the sake of clarity.

You may find InnoDB glossary useful to understand the terminology used in this note. adaptive_flushing is of consequence to you if your workload involves significant write activity. For example, if you are running MySQL 5.0 or MySQL 5.1 with built-in InnoDB and you see periodic drop in the performance with corresponding spikes in write activity then you should consider upgrading to InnoDB plugin.

Let us first familiarize ourselves with why and how flushing takes place in the buffer pool. Flushing is the activity of writing dirty pages to the disk. We need to flush because of two main reasons. First, we need to read in a page into the buffer pool and there are no free buffers. In this case we select a page as victim (based on LRU) and if the victim is dirty we write it to the disk before reusing it. This, I’ll call, LRU_list flush. Secondly, we need to write dirty pages to the disk when we want to reuse the redo logs. Redo logs in InnoDB (or for that matter in most other database engines as well) are used in a circular fashion. We can only overwrite redo log entries if, and only if, the dirty pages corresponding to these entries have been flushed to the disk. This type of flushing is called flush_list flush.

adaptive_flushing is related to the flush_list flush (though it is also influenced by the LRU_list flush as we’ll see shortly). If we allow the redo logs to fill up completely before we trigger flushing of the dirty pages then it will cause a serious drop in performance during this spike in the I/O. To avoid this we’d like to keep flushing dirty pages in the background at a rate that is just enough to avoid the I/O bursts. Note that we don’t want to flush too aggressively because that will reduce the chances of write combining thus resulting in more I/O than is required. So how do we determine the rate at which we do flush_list flush in the background? That is where the adaptive_flushing heuristic comes into play.

We do know the capacity of the redo logs which is constant and is equal to the combined size of the log files. If we know the rate at which redo log is being generated then we can figure out how many seconds we have before the whole redo log is filled up. Read the rest of this entry »

Sep 19

To speed up bulk loading of data, InnoDB implements an insert buffer, a special index in the InnoDB system tablespace that buffers modifications to secondary indexes when the leaf pages are not in the buffer pool. Batched merges from the insert buffer to the index pages result in less random access patterns than when updating the pages directly. This speeds up the operation on hard disks.

In MySQL 5.5, the insert buffer has been extended to a change buffer, which covers all modifications of secondary index leaf pages. This will improve the performance of bulk deletes and updates, transaction rollback and the purging of deleted records (reducing the “purge lag”).

To assess the benefits of the extended buffering, you may want to run benchmarks with the settings innodb_change_buffering=all, innodb_change_buffering=inserts, and innodb_change_buffering=none. Users of solid-state storage, where random reads are about as fast as sequential reads, might benefit from disabling the buffering altogether.

Read on to learn how the change buffering works.

Operations on Secondary Indexes

Read the rest of this entry »

Sep 19

This is a brief overview of the history of InnoDB with respect to the Version Control Systems (VCS) that were used for developing. It could be useful to people who want to trace back in time to find the origins of bugs or features.

Early days
InnoDB was born in the mid 1990s when Heikki Tuuri started developing this masterpiece. Heikki was a lone developer at that time and he did not use any VCS. There is no InnoDB revision history before 2001.

2001 – 2005
Then in 2001 InnoDB was imported into MySQL’s BitKeeper repository and development continued, recording the history there.

2005 – 2010
Later on in Oct 2005 when Oracle acquired Innobase, InnoDB developers had to move away from MySQL’s BitKeeper repository and Subversion was chosen for InnoDB development. The latest sources from BitKeeper were imported into SVN without preserving the history and Innobase started sending snapshots of InnoDB to MySQL. In other words – InnoDB devs committed changesets into SVN and then batches of those changesets were sent to MySQL. But those batches were applied as a single changeset in BitKeeper with a message like “Apply InnoDB snapshot XYZ” which made the revision history of InnoDB in BitKeeper after Oct 2005 meaningless.

In the meantime MySQL switched from BitKeeper to Bazaar, preserving all the history that BitKeeper had. This did not affect InnoDB development and revision history, except that Bazaar was easier to use than the free and limited BitKeeper client. The snapshots thing continued.

Read the rest of this entry »