<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Transactions on InnoDB &#187; InnoDB</title>
	<atom:link href="http://blogs.innodb.com/wp/tag/innodb/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.innodb.com/wp</link>
	<description>&#34;The word&#34; about InnoDB Products and Technology</description>
	<lastBuildDate>Fri, 23 Dec 2011 11:17:39 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Improving InnoDB memory usage</title>
		<link>http://blogs.innodb.com/wp/2011/12/improving-innodb-memory-usage/</link>
		<comments>http://blogs.innodb.com/wp/2011/12/improving-innodb-memory-usage/#comments</comments>
		<pubDate>Tue, 20 Dec 2011 18:08:42 +0000</pubDate>
		<dc:creator>Vasil Dimov</dc:creator>
				<category><![CDATA[Bug fix]]></category>
		<category><![CDATA[InnoDB Builtin]]></category>
		<category><![CDATA[InnoDB Plugin]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[fragmentation]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[leak]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[oom]]></category>

		<guid isPermaLink="false">http://blogs.innodb.com/wp/?p=1510</guid>
		<description><![CDATA[Last month we made a few improvements to InnoDB memory usage. We solved a challenging issue with how InnoDB uses memory in certain places in the code. The symptom of the issue was that under certain workloads the memory used by InnoDB kept growing without bound, until the OOM killer kicked in. It looked like a [...]]]></description>
			<content:encoded><![CDATA[<p>Last month we made a few improvements to InnoDB memory usage. We solved a challenging issue with how InnoDB uses memory in certain places in the code.</p>
<p>The symptom of the issue was that under certain workloads the memory used by InnoDB kept growing without bound, until the <a href="http://en.wikipedia.org/wiki/Out_of_memory" title="OOM killer" target="_blank">OOM killer</a> kicked in. It looked like a memory leak, but <a href="http://valgrind.org/" title="Valgrind" target="_blank">Valgrind</a> wasn&#8217;t reporting any leaks and the issue was not reproducible on FreeBSD &#8211; it only happened on Linux (see <a href="http://bugs.mysql.com/57480" title="Bug#57480" target="_blank">Bug#57480</a>). The latter fact in particular led us to think that there is something in the InnoDB memory usage pattern that reveals a nasty side of the otherwise good-natured Linux memory manager.</p>
<p>It turned out to be an interesting case of <a href="http://en.wikipedia.org/wiki/Fragmentation_(computing)" title="memory fragmentation" target="_blank">memory fragmentation</a> caused by a storm of malloc/free calls of various sizes. We had to track and analyze <strong>each</strong> call to malloc during the workload, including the code path that led to it. We collected a huge set of analysis data &#8211; some code paths were executed tens of thousands of times! A hurricane of allocations and deallocations! We looked at the hottest ones, hoping that some of them were unnecessary and could be eliminated, avoided, minimized or combined. Luckily there were plenty of them!</p>
<p>After extensive testing we made numerous improvements: allocating the smallest chunks of memory from the stack instead of the heap, grouping allocations together where possible, removing unnecessary allocations altogether, estimating exactly how much memory a given operation will consume and allocating it in advance, and more.</p>
<p>This not only fixed <a href="http://bugs.mysql.com/57480" title="Bug#57480" target="_blank">Bug#57480</a> but improved InnoDB memory usage in general.</p>
<p><span id="more-1510"></span></p>
<p>Note: the fix is not in the 5.6.4 release.</p>
<p>Continues with some numbers <a href="http://blogs.innodb.com/wp/2011/12/improving-innodb-memory-usage-continued/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.innodb.com/wp/2011/12/improving-innodb-memory-usage/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>InnoDB 5.6.4 supports databases with 4k and 8k page sizes</title>
		<link>http://blogs.innodb.com/wp/2011/12/innodb-5-6-4-supports-4k-8k-page-sizes/</link>
		<comments>http://blogs.innodb.com/wp/2011/12/innodb-5-6-4-supports-4k-8k-page-sizes/#comments</comments>
		<pubDate>Tue, 20 Dec 2011 17:00:24 +0000</pubDate>
		<dc:creator>Kevin Lewis</dc:creator>
				<category><![CDATA[Feature]]></category>
		<category><![CDATA[16k]]></category>
		<category><![CDATA[4k]]></category>
		<category><![CDATA[5.6.4]]></category>
		<category><![CDATA[8k]]></category>
		<category><![CDATA[Antelope]]></category>
		<category><![CDATA[Barracuda]]></category>
		<category><![CDATA[Compatibility]]></category>
		<category><![CDATA[Configuration parameter]]></category>
		<category><![CDATA[File format]]></category>
		<category><![CDATA[ibdata file]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[Page Size]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Row Formats]]></category>

		<guid isPermaLink="false">http://blogs.innodb.com/wp/?p=1484</guid>
		<description><![CDATA[In the 5.6.4 release it is now possible to create an InnoDB database with 4k or 8k page sizes in addition to the original 16k page size. Previously, it could be done by recompiling the engine with a different value for UNIV_PAGE_SIZE_SHIFT and UNIV_PAGE_SIZE. With this release, you can set --innodb-page-size=n when starting mysqld, or [...]]]></description>
			<content:encoded><![CDATA[<p>In the 5.6.4 release it is now possible to create an InnoDB database with 4k or 8k page sizes in addition to the original 16k page size. Previously, it could be done by recompiling the engine with a different value for UNIV_PAGE_SIZE_SHIFT and UNIV_PAGE_SIZE. With this release, you can set --innodb-page-size=n when starting mysqld, or put innodb_page_size=n in the configuration file in the [mysqld] section, where n can be 4k, 8k, 16k, or 4096, 8192, 16384.</p>
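<p>As a rough illustration of the accepted values, the following Python sketch normalizes a page size setting to bytes. The helper name <code>parse_page_size</code> is hypothetical and is not part of MySQL; it only mirrors the list of values given above.</p>

```python
# Hypothetical helper, not part of MySQL: normalize a page size setting to bytes.
ALLOWED_PAGE_SIZES = (4096, 8192, 16384)


def parse_page_size(value):
    """Accept '4k', '8k', '16k' or the equivalent byte counts."""
    text = str(value).strip().lower()
    # A trailing 'k' means kilobytes; otherwise the value is already in bytes.
    size = int(text[:-1]) * 1024 if text.endswith("k") else int(text)
    if size not in ALLOWED_PAGE_SIZES:
        raise ValueError("unsupported page size: %r" % (value,))
    return size


for setting in ("4k", "8192", "16k"):
    print(setting, "->", parse_page_size(setting))
```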
<p>The support of smaller page sizes may be useful for certain storage media such as SSDs. Performance results can vary depending on your data schema, record size, and read/write ratio. But this provides you more options to optimize your performance.</p>
<p>When this new setting is used, the page size is set for all tablespaces used by that InnoDB instance. You can query the current value with:</p>
<p>SHOW VARIABLES LIKE 'innodb_page_size';<br />
or<br />
SELECT variable_value FROM information_schema.global_status  WHERE LOWER(variable_name) = 'innodb_page_size';</p>
<p>It is a read-only variable while the engine is running since it must be set before InnoDB starts up and creates a new system tablespace. That happens when InnoDB does not find ibdata1 in the data directory. If you start mysqld with a page size other than the standard 16k, the error log will contain something like this:</p>
<p><span id="more-1484"></span></p>
<p>111214 15:55:05 InnoDB: innodb-page-size has been changed from the default value 16384 to 4096.</p>
<p>If your system tablespace already exists with one page size and innodb-page-size is set to something else, the engine will not start. A message like this will be logged:</p>
<p>111214 16:06:51 InnoDB: Error: data file .\ibdata1 uses page size 4096,<br />
111214 16:06:51 InnoDB: but the start-up parameter is innodb-page-size=16384</p>
<p>InnoDB knows the page size used in an existing tablespace created by version 5.6.4 because it stamps that page size in the header page. But this is not readable to older engines, of course. If an older engine opens a database with a page size other than 16k, it will report a corrupted page and quit.</p>
<p>All features in InnoDB work the same with smaller page sizes. But some limits are affected. The maximum record size is proportionately less with smaller pages. Record length limits are calculated within InnoDB based on a variety of factors including row type, column type and length, number of columns, secondary indexes, index prefix lengths, and of course, the page size. The main record is stored in a clustered index and a minimum of two records must fit into each page. So the maximum record length for 8k pages is about half that of 16k pages and the maximum record size with 4k pages is about half that of 8k pages.</p>
<p>In addition to record lengths, the maximum key lengths are proportionately smaller. For 16k pages, MySQL prevents key definitions from containing over 3072 bytes of data. This means that the primary key in the clustered index cannot use more than 3072 bytes of data. Secondary indexes are likewise limited to 3072 bytes of key data. But within a secondary index, InnoDB must store the primary key as the record pointer.  So if you have a 3072 byte primary key and a 3072 byte secondary key, each entry in the secondary index contains 6144 bytes of data plus internal record overhead. The amount of internal overhead depends on the column types, but this gets the secondary record close to half the page size, which is its natural limit. So this limit of 3072 bytes per key is less than but close to what InnoDB would have to impose on a table by table basis.  Based on that, InnoDB now reports to MySQL that any key defined for 8k page sizes must not exceed 1536 bytes.  Likewise, the maximum key length when InnoDB uses 4k page sizes is 768 bytes.</p>
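<p>The arithmetic behind these limits can be sketched in a few lines of Python. The function names are made up for illustration, and the sketch ignores the internal record overhead mentioned above; it only scales the 3072 byte limit with the page size and compares the worst-case secondary index entry with half a page.</p>

```python
# Figures from the text: a 3072-byte key limit at the 16k page size.
DEFAULT_PAGE_SIZE = 16384
DEFAULT_MAX_KEY_BYTES = 3072


def max_key_bytes(page_size):
    """Scale the key limit in proportion to the page size."""
    return DEFAULT_MAX_KEY_BYTES * page_size // DEFAULT_PAGE_SIZE


def secondary_entry_bytes(primary_key_bytes, secondary_key_bytes):
    """Key data in one secondary index entry, ignoring record overhead."""
    return primary_key_bytes + secondary_key_bytes


for page_size in (16384, 8192, 4096):
    limit = max_key_bytes(page_size)
    worst_case = secondary_entry_bytes(limit, limit)
    # Columns: page size, key limit, worst-case entry, half of the page.
    print(page_size, limit, worst_case, page_size // 2)
```

<p>For every page size the worst-case entry comes out at three quarters of half a page, which is why the same proportional limit works for 16k, 8k and 4k pages.</p>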
<p>If you have a database schema with any large records or keys defined, you may not be able to use smaller page sizes. Even if your records do barely fit in the clustered index page, it may not be advisable to use these smaller pages because the btree will be a lot deeper.  For example, if only 2 records fit on the page and there are 1,000,000 records, leaf pages are 20 levels deep, meaning InnoDB will need to read 20 pages to find the leaf page.  If that were on 4k pages, then using the same table on 16k pages would give 8 records per page and the leaf pages would only be 7 levels down.</p>
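<p>The depth figures above follow from simple fan-out arithmetic. Here is a minimal Python sketch, assuming (as the example does) that every level of the tree has the same number of entries per page; <code>btree_levels</code> is an illustrative name, not an InnoDB function.</p>

```python
def btree_levels(total_records, records_per_page):
    """Smallest number of levels whose combined fan-out covers all records."""
    levels, reachable = 0, 1
    while total_records > reachable:
        reachable *= records_per_page
        levels += 1
    return levels


print(btree_levels(1000000, 2))  # 2 records per page -> 20
print(btree_levels(1000000, 8))  # 8 records per page -> 7
```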
<p>There is a trick to reducing the size of records on the clustered index page:<br />
1) Use Dynamic or Compressed row format.<br />
2) Convert VARCHAR fields to TEXT fields.  (VARBINARY can be converted to BLOB)</p>
<p>There are 4 ROW FORMATS in InnoDB. The first two, Redundant and Compact, which are considered the Antelope file version, store at least 768 bytes of each field in the clustered record.  The Barracuda file version consists of Compressed and Dynamic ROW FORMATS. These have the ability to put all of a VARCHAR, VARBINARY, TEXT or BLOB field onto a separate BLOB page for storage.  So one good way to prepare a table structure to use smaller page sizes is to use Dynamic or Compressed row format.</p>
<p>The decision of how big a record will be in the clustered record is made during the INSERT or UPDATE.  Each record is evaluated for its own length.  If the record is too long to fit in half the page, InnoDB will look for the longest actual field in that record and put as much of that field off-page as possible based on the row type.  Then if it is still too long, it will shorten the longest field remaining.  Since this is done when the record is added to the page, different records may have different columns stored off-page when there are multiple long fields in the record.</p>
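<p>The selection loop described above can be sketched roughly as follows. This is a simplified model, not InnoDB's code: it treats every field as eligible for off-page storage, ignores record overhead, and assumes a 20 byte in-record reference plus a 768 byte local prefix for the Antelope row formats. All names and constants in the sketch are assumptions made for illustration.</p>

```python
# Assumed sizes, for illustration only.
POINTER_BYTES = 20   # assumed size of the reference left in the record
PREFIX_BYTES = 768   # local prefix kept by Redundant and Compact


def place_fields(field_lengths, page_size, keeps_prefix):
    """Return (local_lengths, off_page_indexes) for one record."""
    local = list(field_lengths)
    off_page = []
    limit = page_size // 2  # the record must fit in half a page
    kept = POINTER_BYTES + (PREFIX_BYTES if keeps_prefix else 0)
    while sum(local) > limit:
        # Only fields that would actually shrink are worth moving.
        candidates = [i for i in range(len(local))
                      if i not in off_page and local[i] > kept]
        if not candidates:
            raise ValueError("record too long for this page size")
        longest = max(candidates, key=lambda i: local[i])
        local[longest] = kept
        off_page.append(longest)
    return local, off_page


print(place_fields([3000, 100, 2500], 4096, keeps_prefix=False))
print(place_fields([3000, 100, 2500], 4096, keeps_prefix=True))
```

<p>Because the loop runs per record, two records in the same table can end up with different fields stored off-page, which matches the behaviour described above.</p>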
<p>VARCHAR/VARBINARY fields are treated like TEXT/BLOB fields if they are over 255 bytes long.  If you are using Compressed or Dynamic row format and your record is too long because you have too many VARCHAR fields 255 bytes or less, you can reduce the record length by converting them to TEXT fields. Likewise, VARBINARY fields 255 bytes or less can be converted to BLOB fields to take advantage of this ability to store the whole TEXT or BLOB field on a separate page.</p>
<p>A file extent in InnoDB is 1 MB independent of the page size. So an extent will hold 64 16k pages, 128 8k pages or 256 4k pages.  This means that the read ahead mechanisms will read more pages with smaller page sizes since they read a whole extent at a time.  The doublewrite buffer, which is based on the size of an extent, will also contain more pages.</p>
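<p>The page counts per extent are a one-line calculation. A small Python sketch, with an illustrative function name:</p>

```python
EXTENT_BYTES = 1024 * 1024  # the extent size stated above, independent of page size


def pages_per_extent(page_size):
    """Number of pages of the given size that fit in one extent."""
    return EXTENT_BYTES // page_size


for page_size in (16384, 8192, 4096):
    print(page_size, pages_per_extent(page_size))
```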
<p>If you want to use smaller page sizes with existing data, export the data first with a logical export utility such as mysqldump. Then create the new MySQL instance with innodb-page-size=4k or 8k and import the data. Do not use a physical export method such as ALTER TABLE … DISCARD TABLESPACE.</p>
<p><strong>Summary:</strong></p>
<p>This feature makes it easier to try smaller page sizes in an InnoDB database. And with the 5.6.4 release, those smaller page sizes are fully supported by MySQL. Just export your data, move or delete the system tablespace (ibdata1) and the log files (ib_logfile0 &amp; ib_logfile1), set innodb-page-size to either 4k or 8k, and restart MySQL. A new InnoDB instance will be created with the smaller page size. Then you can import your data and run your tests, all without recompiling InnoDB.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.innodb.com/wp/2011/12/innodb-5-6-4-supports-4k-8k-page-sizes/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>InnoDB 2011 Summer Labs Releases</title>
		<link>http://blogs.innodb.com/wp/2011/07/innodb-2011-summer-labs-releases/</link>
		<comments>http://blogs.innodb.com/wp/2011/07/innodb-2011-summer-labs-releases/#comments</comments>
		<pubDate>Wed, 27 Jul 2011 16:08:18 +0000</pubDate>
		<dc:creator>Calvin Sun</dc:creator>
				<category><![CDATA[Feature]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[labs release]]></category>
		<category><![CDATA[MySQL 5.6]]></category>

		<guid isPermaLink="false">http://blogs.innodb.com/wp/?p=1137</guid>
		<description><![CDATA[In April of 2011, the InnoDB team published the early access of NoSQL to InnoDB with memcached, plus several new features as part of the MySQL 5.6.2 milestone release. This week, we announced additional early access to new InnoDB features for the community to test and provide feedback on. There are two release packages from the InnoDB team on [...]]]></description>
			<content:encoded><![CDATA[<p>In April of 2011, the InnoDB team published the early access of <a href="http://blogs.innodb.com/wp/2011/04/get-started-with-innodb-memcached-daemon-plugin/">NoSQL to InnoDB with memcached</a>, plus several new features as part of the <a href="http://blogs.oracle.com/MySQL/entry/top_features_in_mysql_562_development_milestone_release">MySQL 5.6.2 milestone release</a>. This week, we announced additional early access to new InnoDB features for the community to test and provide feedback on.</p>
<p>There are two release packages from the InnoDB team on <a href="http://labs.mysql.com/">MySQL Labs</a>: InnoDB full-text search, and InnoDB new features.</p>
<h4>InnoDB Full-Text Search</h4>
<p>MySQL 5.5 makes InnoDB the default storage engine, so everyone can benefit from ACID-compliant transactions, referential integrity, and crash recovery.  However, some users need InnoDB to have built-in full-text search, similar to MyISAM&#8217;s full-text search.</p>
<p>InnoDB full-text search provides users with the ability to build full text indices and search for specific text-based content stored in InnoDB tables.  This new functionality supports fast and accurate search on document content using natural language, boolean, wildcard, and proximity search.</p>
<p><span id="more-1137"></span></p>
<p>The design and implementation of InnoDB full-text search can be traced back to 2005, when Osku Salerma detailed the design in his master&#8217;s thesis &#8220;<a href="http://www.oskusoft.com/osku/publications.html">Design of a Full Text Search index for a database management system</a>&#8221;. Later, Sunny Bains and Jimmy Yang from the InnoDB team took over the development and made major contributions to this important feature.</p>
<p>Jimmy gave an <a href="http://blogs.innodb.com/wp/2011/07/overview-and-getting-started-with-innodb-fts/">overview of InnoDB full-text search</a>, and the <a href="http://blogs.innodb.com/wp/2011/07/difference-between-innodb-fts-and-myisam-fts/">main differences in design between InnoDB full-text search and MyISAM full-text search</a>. John provided a set of examples in the <a href="http://blogs.innodb.com/wp/2011/07/innodb-full-text-search-tutorial/">tutorial</a>. For the performance of InnoDB full-text search, see <a href="http://blogs.innodb.com/wp/2011/07/innodb-fts-performance/">Vinay and Jimmy&#8217;s article</a>.</p>
<p>Please download mysql-5.6-labs-innodb-fts from <a href="http://labs.mysql.com/">MySQL Labs</a> and give it a try.</p>
<h4>InnoDB New Features</h4>
<p>The package mysql-5.6-labs-innodb-features on <a href="http://labs.mysql.com/">MySQL Labs</a> contains the new InnoDB features added since the MySQL 5.6.2 milestone release, except InnoDB full-text search. Some of the new features are already in the MySQL server main development tree, and the rest are intended to move into it in future development milestone releases and GA releases.</p>
<p>The new InnoDB features included in this package are:</p>
<ul>
<li>Increase the max size of redo log files from 4GB to 2TB</li>
<li>Reduce contention during file extension</li>
<li>Make deadlock detection non-recursive</li>
<li>Improve thread scheduling</li>
<li>Change rw-lock to mutex for trx_sys_t</li>
<li>Option to preload InnoDB buffer pool</li>
<li>Allow UNDO logs to reside in their own tablespace</li>
<li>Reintroduce random readahead</li>
<li>Support smaller page sizes (4K &amp; 8K)</li>
<li>Increase the max length of prefix index from 767 bytes to 3072 bytes</li>
</ul>
<p>In addition to continued improvements in InnoDB performance and scalability, we are also focusing on optimizing InnoDB for flash drives. InnoDB with flash drives could benefit from new features such as larger REDO log files, separate UNDO logs, smaller page sizes, and a preloaded buffer pool.</p>
<p>Group commit with binlog is released separately by the MySQL replication team, also on <a href="http://labs.mysql.com/">MySQL Labs</a>.</p>
<p>Want to learn the details of InnoDB new features? Download mysql-5.6-labs-innodb-features from <a href="http://labs.mysql.com/">MySQL Labs</a>, play with it, and read the blogs from InnoDB engineers:</p>
<ul>
<li><a href="http://blogs.innodb.com/wp/2011/07/allow-undo-logs-to-reside-in-their-own-tablespace/">Allow UNDO logs to reside in their own tablespace</a> (Sunny)</li>
<li><a href="http://blogs.innodb.com/wp/2011/07/improve-innodb-thread-scheduling/">Improve InnoDB thread scheduling</a> (Sunny, Ranger)</li>
<li><a href="http://blogs.innodb.com/wp/2011/07/reintroducing-random-readahead-in-innodb/">Reintroducing Random Readahead in InnoDB</a> (Inaam)</li>
<li><a href="http://blogs.innodb.com/wp/2011/07/reduced-contention-during-datafile-extension/">Reduced contention during datafile extension</a> (Inaam)</li>
<li><a href="http://blogs.innodb.com/wp/2011/07/shortened-warm-up-times-with-a-preloaded-innodb-buffer-pool/">Shortened warm-up times with a preloaded InnoDB buffer pool</a> (Vasil)</li>
<li><a href="http://blogs.innodb.com/wp/2011/07/innodb-databases-with-4k-and-8k-page-sizes/">Create InnoDB databases with 4k and 8k page sizes without recompiling</a> (Kevin)</li>
</ul>
<p>BTW, do not forget to send us feedback. Thanks for being interested in InnoDB!</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.innodb.com/wp/2011/07/innodb-2011-summer-labs-releases/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>NoSQL to InnoDB with Memcached</title>
		<link>http://blogs.innodb.com/wp/2011/04/nosql-to-innodb-with-memcached/</link>
		<comments>http://blogs.innodb.com/wp/2011/04/nosql-to-innodb-with-memcached/#comments</comments>
		<pubDate>Mon, 11 Apr 2011 13:29:55 +0000</pubDate>
		<dc:creator>Calvin Sun</dc:creator>
				<category><![CDATA[Feature]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[NoSQL]]></category>

		<guid isPermaLink="false">http://blogs.innodb.com/wp/?p=773</guid>
		<description><![CDATA[MySQL is the most popular open source SQL database. The ever-increasing performance demands of web-based services have generated significant interest in providing NoSQL access methods to MySQL. Today, MySQL is announcing the preview of the NoSQL to InnoDB via memcached. This offering provides users with the best of both worlds &#8211; maintain all of the [...]]]></description>
			<content:encoded><![CDATA[<p>MySQL is the most popular open source SQL database. The ever-increasing performance demands of web-based services have generated significant interest in providing NoSQL access methods to MySQL. Today, MySQL is announcing the preview of NoSQL to InnoDB via memcached. This offering provides users with the best of both worlds &#8211; it maintains all of the advantages of a rich SQL query language, while providing better performance for simple queries via direct access to shared data.</p>
<p>In this preview release, memcached is implemented as a MySQL plugin daemon, accessing InnoDB directly via the native InnoDB API:</p>
<p><a href="http://blogs.innodb.com/wp/wp-content/uploads/2011/04/innodb_memcached2.jpg"><img class="aligncenter size-full wp-image-899" title="innodb_memcached" src="http://blogs.innodb.com/wp/wp-content/uploads/2011/04/innodb_memcached2.jpg" alt="" width="640" height="500" /></a></p>
<p>Features provided in the current release:</p>
<ul>
<li> Memcached as a daemon plugin of mysqld: both mysqld and memcached are running in the same process space, with very low latency access to data</li>
<li> Direct access to InnoDB: bypassing SQL parser and optimizer</li>
<li> Support standard protocol (memcapable): support both memcached text-based protocol and binary protocol; all 55 memcapable tests are passed</li>
<li> Support multiple columns: users can map multiple columns into “value”. The value is separated by a pre-defined “separator” (configurable).</li>
<li> Optional local caching: three options – “cache-only”, “innodb-only”, and “caching” (both “cache” and “innodb store”). These local options can apply to each of four Memcached operations (set, get, delete and flush).</li>
<li> Batch operations:  users can specify the batch commit size for InnoDB memcached operations via &#8220;daemon_memcached_r_batch_size&#8221; and &#8220;daemon_memcached_w_batch_size&#8221; (default 32)</li>
<li> Support all memcached configuration options through the MySQL configuration variable &#8220;daemon_memcached_option&#8221;</li>
</ul>
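<p>To make the multiple-column mapping in the list above more concrete, here is a small Python sketch of how several column values could be packed into one memcached value and split apart again. It is not the plugin's code: the function names are hypothetical, and the <code>|</code> separator is an assumption standing in for whatever separator is configured.</p>

```python
SEPARATOR = "|"  # assumed separator; the text only says it is configurable


def pack_value(columns):
    """Join several column values into a single memcached value."""
    return SEPARATOR.join(columns)


def unpack_value(value, column_count):
    """Split a memcached value back into the mapped columns."""
    parts = value.split(SEPARATOR)
    if len(parts) != column_count:
        raise ValueError("expected %d columns, got %d"
                         % (column_count, len(parts)))
    return parts


packed = pack_value(["Berlin", "Germany", "3500000"])
print(packed)
print(unpack_value(packed, 3))
```

<p>A scheme like this only works if the separator never appears inside a column value, which is one reason to keep it configurable.</p>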
<p><span id="more-773"></span></p>
<p>Sounds interesting? You can download the source or binary from <a href="http://labs.mysql.com/">MySQL Labs</a> (only tested on Linux) &#8211; select &#8220;mysql-5.6-labs-innodb-memcached&#8221;. After unpacking the files, please read the readme file &#8220;README-innodb_memcached&#8221;. Also, please read the upcoming blog &#8220;Get started with InnoDB Memcached Daemon plugin&#8221; by Jimmy.</p>
<p>This is a technology preview, with some limitations. We will gradually address those limitations. If you&#8217;d like to see additional new features or improvements, please let us know.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.innodb.com/wp/2011/04/nosql-to-innodb-with-memcached/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>Introducing page_cleaner thread in InnoDB</title>
		<link>http://blogs.innodb.com/wp/2011/04/introducing-page_cleaner-thread-in-innodb/</link>
		<comments>http://blogs.innodb.com/wp/2011/04/introducing-page_cleaner-thread-in-innodb/#comments</comments>
		<pubDate>Mon, 11 Apr 2011 11:53:08 +0000</pubDate>
		<dc:creator>Inaam Rana</dc:creator>
				<category><![CDATA[Feature]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[flushing]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[master thread]]></category>
		<category><![CDATA[page_cleaner]]></category>

		<guid isPermaLink="false">http://blogs.innodb.com/wp/?p=775</guid>
		<description><![CDATA[In MySQL 5.6.2 we have introduced a new background thread named the page_cleaner in InnoDB. Adaptive flushing of modified buffer pool pages in the main background thread, async flushing by a foreground query thread if the log files are near to wrapping around, idle flushing and shutdown flushing are now moved to this thread, leaving [...]]]></description>
			<content:encoded><![CDATA[<p>In MySQL 5.6.2 we have introduced a new background thread named the page_cleaner in InnoDB. Adaptive flushing of modified buffer pool pages in the main background thread, async flushing by a foreground query thread if the log files are near to wrapping around, idle flushing and shutdown flushing are now moved to this thread, leaving only the last resort sync flushing in foreground query threads. We’ve also added counters for these activities.</p>
<p>As page_cleaner is all about the flushing of dirty pages to disk it’ll do you a world of good if you can go through this post where I have explained <a href="http://blogs.innodb.com/wp/2010/09/mysql-5-5-innodb-adaptive_flushing-how-it-works/">different types of flushing</a> that happen inside InnoDB and the conditions that trigger flushing. The page_cleaner thread is only concerned with flush_list flushing (this may change in future releases). So let us dig a bit deeper into why flush_list flushing happens and why it would make sense to do this flushing in a separate thread. As is usually the case I have to skip some details to keep this note simple.</p>
<p>On a busy server flush_list flushing (which I’ll simply call flushing from this point onwards) happens under four conditions. In order to understand these conditions let us familiarize ourselves with the concept of checkpoint_age. The checkpoint_age is the difference between the current_lsn (the latest change to the database) and the last checkpoint_lsn (the lsn when the last checkpoint happened). We obviously don’t want to let this difference grow beyond the log file size because if that happens then we end up overwriting redo log entries before the corresponding dirty pages are flushed to the disk, losing the ability to recover them. In order to avoid the above situation we maintain two high water marks to indicate if we are nearing the end of reusable redo log space. Let’s call these water marks async_water_mark and sync_water_mark, where the latter represents a more urgent situation than the former. Now that we have clarified the checkpoint_age concept let us get back to the four conditions under which flushing of dirty pages happens:</p>
<ol>
<li><code>checkpoint_age &lt; async_water_mark</code>
<ul>
<li>This condition means that we have enough reusable redo space. As such there is no hurry to flush dirty pages to the disk. This is the condition where we’d like our server to be most of the time.</li>
<li>Based on adaptive_flushing heuristics we flush some dirty pages in this state. This flushing happens in the background master thread.</li>
<li>During the flushing no other threads are blocked, so queries continue normally.</li>
</ul>
<p><span id="more-775"></span></p>
</li>
<li><code>async_water_mark &lt; checkpoint_age &lt; sync_water_mark</code>
<ul>
<li>As we move past the first water mark we try to bring some more urgency to our flushing. The query thread noticing this condition will trigger a flush and will wait for that flushing to end. This type of flushing does not happen in background.</li>
<li>Other query threads are allowed to proceed. Therefore we call it async flushing because only the query thread that is doing the flushing is blocked.</li>
</ul>
</li>
<li><code>checkpoint_age &gt; sync_water_mark</code>
<ul>
<li>This is like a panic button. We have very little reusable redo log space available. The query thread detecting this condition immediately starts flushing.</li>
<li>Other query threads are blocked. The idea is to stop the advance of checkpoint_age.</li>
<li>This type of flushing not only happens in the foreground, it actually tends to bring the whole system to a stall.</li>
</ul>
</li>
<li><code>%n_dirty_pages &gt; innodb_max_dirty_pages_pct</code>
<ul>
<li>The percentage of dirty pages in the buffer pool exceeds the user settable value of innodb_max_dirty_pages_pct.</li>
<li>The flushing happens in the background master thread and is non-blocking for query threads.</li>
</ul>
</li>
</ol>
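<p>The four conditions above can be summarized in a short Python sketch. It is only a model of the decision, not InnoDB code: the function names are invented, the inputs are plain numbers, and the handling of values exactly equal to a water mark is an assumption, since the conditions above use strict inequalities.</p>

```python
def checkpoint_age(current_lsn, checkpoint_lsn):
    """Amount of redo written since the last checkpoint."""
    return current_lsn - checkpoint_lsn


def flush_triggers(age, async_water_mark, sync_water_mark,
                   dirty_pct, max_dirty_pct):
    """Return the kinds of flush_list flushing that apply."""
    triggers = []
    if age > sync_water_mark:
        triggers.append("sync")       # panic: other query threads are blocked
    elif age > async_water_mark:
        triggers.append("async")      # only the detecting thread waits
    else:
        triggers.append("adaptive")   # background flushing, no urgency
    if dirty_pct > max_dirty_pct:
        triggers.append("max_dirty")  # too many dirty pages in the buffer pool
    return triggers


age = checkpoint_age(current_lsn=1000, checkpoint_lsn=200)
print(age, flush_triggers(age, 700, 900, dirty_pct=10, max_dirty_pct=75))
```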
<h3>The page_cleaner thread:</h3>
<p>As explained above, flushing can happen in the query thread, e.g. the async and sync flushing. It can also happen in the background master thread, e.g. the adaptive flushing and the max_dirty_pages_pct flushing. There are two issues with this scheme. First, the master thread is also tasked to do other background activities. It has to do <a href="http://blogs.innodb.com/wp/2010/09/mysql-5-5-innodb-change-buffering/">change buffer</a> merges and possibly purge (though starting with 5.5 we have an option to use a dedicated thread for purge and in 5.6.2 we can even have <a href="http://blogs.innodb.com/wp/2011/04/mysql-5-6-multi-threaded-purge/">multi-threaded purge</a>). Under a very heavy workload it is possible that the master thread is unable to find enough time to flush dirty pages. The second issue is with async flushing. It should be a background task but it is executed in the query thread. While other threads are allowed to proceed, the unfortunate thread which detects the condition is blocked on a huge dirty page flush.</p>
<p>To address these two issues we came up with the idea of having a dedicated background thread and named it the page_cleaner thread. All background flushing activity previously done in the master thread is offloaded to the page_cleaner. Also, the async flushing is now a background task performed in the page_cleaner thread. Query threads are only ever blocked for flushing if we cross the sync flushing water mark. The page_cleaner thread wakes up every second, checks the state of the system and performs the flushing activity if required. The flushing that happens when the server is idle or at shutdown is now also done by the page_cleaner thread.</p>
<p>Finally we have added some counters to the innodb_metrics table related to the above four types of flushing to give you a picture of how your system is behaving.</p>
<blockquote><p><code><br />
mysql&gt; select name, comment from information_schema.innodb_metrics where name like 'buffer_flush_%';<br />
+-------------------------------------+-----------------------------------------------------------------------+<br />
| name                                | comment                                                               |<br />
+-------------------------------------+-----------------------------------------------------------------------+<br />
| buffer_flush_adaptive_flushes       | Occurrences of adaptive flush                                         |<br />
| buffer_flush_adaptive_pages         | Number of pages flushed as part of adaptive flushing                  |<br />
| buffer_flush_async_flushes          | Occurrences of async flush                                            |<br />
| buffer_flush_async_pages            | Number of pages flushed as part of async flushing                     |<br />
| buffer_flush_sync_flushes           | Number of sync flushes                                                |<br />
| buffer_flush_sync_pages             | Number of pages flushed as part of sync flushing                      |<br />
| buffer_flush_max_dirty_flushes      | Number of flushes as part of max dirty page flush                     |<br />
| buffer_flush_max_dirty_pages        | Number of pages flushed as part of max dirty flushing                 |</code></p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://blogs.innodb.com/wp/2011/04/introducing-page_cleaner-thread-in-innodb/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>MySQL 5.6: Multi threaded purge</title>
		<link>http://blogs.innodb.com/wp/2011/04/mysql-5-6-multi-threaded-purge/</link>
		<comments>http://blogs.innodb.com/wp/2011/04/mysql-5-6-multi-threaded-purge/#comments</comments>
		<pubDate>Wed, 06 Apr 2011 02:23:25 +0000</pubDate>
		<dc:creator>Sunny Bains</dc:creator>
				<category><![CDATA[Feature]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[Purge]]></category>

		<guid isPermaLink="false">http://blogs.innodb.com/wp/?p=792</guid>
		<description><![CDATA[Purpose What does purge exactly do and why is it needed? If you have ever wondered then read on. It is really a type of garbage collector. When a user issues a DML like &#8220;DELETE FROM t WHERE c = 1;&#8221;, InnoDB doesn&#8217;t remove the matching record. This is what happens under the hood: It [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Purpose</strong></p>
<p>What exactly does purge do, and why is it needed? If you have ever wondered, then read on. It is really a type of garbage collector. When a user issues a DML statement like &#8220;DELETE FROM t WHERE c = 1;&#8221;, InnoDB doesn&#8217;t physically remove the matching record right away. This is what happens under the hood:</p>
<ol>
<li>It marks the record as deleted by setting a bit in the control bits of the record.</li>
<li>It stores the before image of the modified columns in the UNDO log.</li>
<li>It updates the system columns <strong>DB_TRX_ID</strong> and <strong>DB_ROLL_PTR</strong> in the clustered index record. <strong>DB_TRX_ID</strong> identifies the transaction that made the last change, and <strong>DB_ROLL_PTR</strong> points to the new UNDO log record. This UNDO log record contains the old values of <strong>DB_TRX_ID</strong> and <strong>DB_ROLL_PTR</strong>, possibly pointing to an older transaction and UNDO log entry.</li>
</ol>
<p>From this you should be able to visualise that the UNDO log records related to the modified clustered index record form a disk-based linked list, with the head anchored in the clustered index record. For the sake of simplicity I&#8217;ve ignored the UNDO log pages and the case where DML updates a record. This information is required by rollback and MVCC (multi-version concurrency control). For MVCC we need the entry to exist in the clustered index so that we can follow a pointer back to where the &#8220;before&#8221; changes were written in the UNDO log and use that information to construct a previous version of the record. For rollback we need the before image of the record (the UNDO entry) so that we can restore it when a transaction is rolled back.</p>
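<p>The linked list described above can be modelled with a small Python sketch. The class names, the field names and the use of a dictionary as the UNDO log are all hypothetical simplifications. The real structures live on disk pages and carry much more information.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UndoRecord:
    # Before image: old column values plus the old system columns.
    old_values: dict
    old_trx_id: int
    old_roll_ptr: Optional[int]  # None marks the end of the chain


@dataclass
class ClusteredRecord:
    values: dict
    delete_marked: bool
    db_trx_id: int
    db_roll_ptr: Optional[int]  # head of the UNDO chain


def version_chain(record, undo_log):
    """Walk from the clustered index record back through the UNDO log.

    Returns the ids of the transactions that produced each version,
    newest first.
    """
    trx_ids = [record.db_trx_id]
    roll_ptr = record.db_roll_ptr
    while roll_ptr is not None:
        undo = undo_log[roll_ptr]
        trx_ids.append(undo.old_trx_id)
        roll_ptr = undo.old_roll_ptr
    return trx_ids


if __name__ == "__main__":
    # Hypothetical UNDO log keyed by roll pointer.
    undo_log = {
        1: UndoRecord({"c": 0}, old_trx_id=5, old_roll_ptr=None),
        2: UndoRecord({"c": 1}, old_trx_id=9, old_roll_ptr=1),
    }
    # Record delete-marked by transaction 12; its roll pointer heads the chain.
    rec = ClusteredRecord({"c": 1}, delete_marked=True, db_trx_id=12, db_roll_ptr=2)
    print(version_chain(rec, undo_log))
```

<p>An MVCC reader would follow the same pointers and stop at the first version that its read view is allowed to see.</p>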
<p>Another benefit of purging separately in the background is that expensive B+Tree block merge operations, if the removal of the record were to lead to underflow, can be done asynchronously by purge and not by user transactions.</p>
<p><strong>Garbage collection</strong></p>
<p><span id="more-792"></span></p>
<p>Once a transaction has been committed, the UNDO log entries and delete-marked records written by it may still be needed by other transactions for building old versions of the record. When there is no transaction left that would need the data, the delete-marked records and the related UNDO log records can be purged. For this reason we have a dedicated (and aptly named) purge function that runs asynchronously within InnoDB. Another problem is that InnoDB clustered and secondary indexes can have multiple entries for the same key. These need to be removed (garbage collected) by purge too (see <a class="wp-oembed" title="InnoDB Change Buffering" href="http://blogs.innodb.com/wp/2010/09/mysql-5-5-innodb-change-buffering/" target="_blank">InnoDB Change Buffering</a> for details regarding secondary indexes).</p>
<p><strong>How does purge work?</strong></p>
<p>Purge clones the oldest view in the system. This view holds the control information that limits which changes made by other transactions are visible. By cloning the oldest view, purge can determine that there can&#8217;t be any active transactions older than the oldest view&#8217;s low limit in the system. Purge then reads the UNDO log entries in reverse, starting from the oldest and working towards the latest. It parses these entries and removes the delete-marked entries from the clustered and secondary indexes, if they are not referenced by any currently running transaction.</p>
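<p>As a rough illustration of the selection step, the Python sketch below picks the UNDO entries that purge may process. It assumes that each entry can be reduced to a transaction number and that the cloned view can be reduced to a single low limit. Both are simplifications, and the function name is made up for this example.</p>

```python
def purgeable(undo_entries, oldest_view_low_limit):
    """Return the entries purge may process, oldest first.

    undo_entries: list of (trx_no, description) tuples written by
    committed transactions (hypothetical representation).
    oldest_view_low_limit: hypothetical number taken from the cloned
    oldest view; changes below it cannot be needed by any active view.
    """
    ordered = sorted(undo_entries)  # oldest transaction first
    return [entry for entry in ordered if oldest_view_low_limit > entry[0]]


if __name__ == "__main__":
    entries = [(7, "delete-mark row b"), (3, "delete-mark row a"), (12, "delete-mark row c")]
    for entry in purgeable(entries, oldest_view_low_limit=10):
        print(entry)
```

<p>Here the entry written by transaction 12 is kept, because a view that is still open might need it.</p>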
<p><strong>Problem with the old design</strong></p>
<p>In versions prior to 5.5, purge was one of the responsibilities of the InnoDB master thread. At fixed intervals it would run purge asynchronously. The problem with this approach was that the master thread was also responsible for flushing dirty pages (writing them to disk), among other tasks. A high load on the server that dirtied a lot of pages would force the master thread to spend most of its time flushing, so no purging would get done, and vice versa.</p>
<p><strong>Changes in 5.5</strong></p>
<p>In 5.5 there is an option <strong>innodb-purge-threads=[0,1]</strong> to create a dedicated thread that purges asynchronously if there are UNDO logs that need to be removed. We also introduced another option, <strong>innodb-purge-batch-size</strong>, that can be used to fine-tune purge operations. The batch size determines how many UNDO log pages purge will parse and process in one pass. The default setting is 20, the same as the hard-coded value in previous InnoDB releases. An interesting side effect of this value is that it also determines when purge will free the UNDO log pages after processing them. It is always after 128 passes; this magic value of 128 is the same as the number of UNDO logs in the system tablespace, now that 5.5 has 128 rollback segments. Increasing innodb-purge-batch-size changes how UNDO log pages are freed: purge removes more UNDO log pages in a batch when the limit of 128 is reached. This change was seen as necessary so that we could reduce the cost of removing the UNDO log pages for the extra 127 rollback segments that were introduced in 5.5. Prior to this change, iterating over the 128 rollback segments to find the segment to truncate had become expensive.</p>
<p><strong>Changes in 5.6</strong></p>
<p>In 5.6 we have the same parameters as in 5.5, except that <strong>innodb-purge-threads</strong> can now be between 0 and 32. This introduces true multi-threaded purging. If the value is greater than 1, InnoDB will create that many purge worker threads and a dedicated purge coordinator thread. The responsibility of the purge coordinator thread is to parse the UNDO log records and parcel out the work to the worker threads. The coordinator thread also purges records, instead of just sitting around and waiting for the worker threads to complete. The coordinator thread will divide the <strong>innodb-purge-batch-size</strong> by <strong>innodb-purge-threads</strong> and hand that out as the unit of work for each worker thread.</p>
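<p>The arithmetic behind the unit of work is simple enough to write down. The Python sketch below assumes integer division, which is a guess about the rounding, and the function name is invented for the example.</p>

```python
def unit_of_work(purge_batch_size, purge_threads):
    """UNDO log pages handed to each worker thread in one pass.

    Mirrors innodb-purge-batch-size divided by innodb-purge-threads.
    Integer division is an assumption made for this sketch.
    """
    if 1 > purge_threads:
        raise ValueError("this sketch only covers one or more purge threads")
    return purge_batch_size // purge_threads


if __name__ == "__main__":
    # Default batch size of 20 pages spread over 4 purge threads.
    print(unit_of_work(20, 4))
```

<p>With the default batch size of 20 and four purge threads, each worker would be handed 5 UNDO log pages per pass under this assumption.</p>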
<p><strong>Some caveats</strong></p>
<p>For single-table tests like <a class="wp-oembed" title="Sysbench" href="http://sysbench.sourceforge.net/" target="_blank">Sysbench</a>, one obvious problem is that all the worker threads tend to block/serialise on the <em>dict_index_t::lock</em>. This reduces their effectiveness, and for such loads it is usually better to have a single dedicated purge thread unless you are using partitions. With multiple tables the purge threads come into their own and can purge records in parallel with minimal contention.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.innodb.com/wp/2011/04/mysql-5-6-multi-threaded-purge/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Post-Conference Roundup of InnoDB-related Info</title>
		<link>http://blogs.innodb.com/wp/2010/04/post-conference-roundup-of-innodb-related-info/</link>
		<comments>http://blogs.innodb.com/wp/2010/04/post-conference-roundup-of-innodb-related-info/#comments</comments>
		<pubDate>Fri, 16 Apr 2010 05:02:49 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[John Russell]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://blogs.innodb.com/wp/?p=590</guid>
		<description><![CDATA[What a busy week! Lots of MySQL 5.5 announcements that just happened to coincide with the MySQL Conference and Expo in Silicon Valley. Here are some highlights of the performance and scalability work that the InnoDB team was involved with. A good prep for the week of news is the article Introduction to MySQL 5.5, which [...]]]></description>
			<content:encoded><![CDATA[<p>What a busy week! Lots of MySQL 5.5 announcements that just happened to coincide with the MySQL Conference and Expo in Silicon Valley. Here are some highlights of the performance and scalability work that the InnoDB team was involved with.</p>
<p>A good prep for the week of news is the article <a href="http://dev.mysql.com/tech-resources/articles/introduction-to-mysql-55.html">Introduction to MySQL 5.5</a>, which includes information about the major performance and scalability features. That article will lead you into the <a href="http://dev.mysql.com/doc/refman/5.5/en/">MySQL 5.5 manual</a> for general features and the <a href="http://dev.mysql.com/doc/innodb-plugin/1.1/en/">InnoDB 1.1 manual</a> for performance &amp; scalability info.</p>
<p>Then there were the conference presentations from InnoDB team members, which continued the twin themes of performance and scalability:</p>
<ul>
<li>InnoDB: Status, Architecture, and New Features: <a href="http://www.innodb.com/wp/wp-content/uploads/2010/04/Whats_New_in_InnoDB_2010.pdf">Slides</a>, <a href="http://en.oreilly.com/mysql2010/public/schedule/detail/13502">Rate / leave feedback</a></li>
<li>InnoDB Plugin: Performance Features and Benchmarks: <a href="http://www.innodb.com/wp/wp-content/uploads/2010/04/InnoDB_Performance_benchmarks_2010.pdf">Slides</a>, <a href="http://en.oreilly.com/mysql2010/public/schedule/detail/13503">Rate / leave feedback</a></li>
<li>What&#8217;s New in MySQL 5.5? Performance/Scale Unleashed!: <a href="http://www.innodb.com/wp/wp-content/uploads/2010/04/Performance_Change_Analysis_2010.pdf">Slides</a>, <a href="http://en.oreilly.com/mysql2010/public/schedule/detail/13363">Rate / leave feedback</a></li>
<li>What&#8217;s New in MySQL 5.5?: Performance and Scalability Benchmarks: <a href="http://www.innodb.com/wp/wp-content/uploads/2010/04/Benchmark_Analysis_Final_2010.pdf">Slides</a>, <a href="http://en.oreilly.com/mysql2010/public/schedule/detail/14298">Rate / leave feedback</a></li>
<li>Introduction to InnoDB Monitoring System and Resource &amp; Performance Tuning: <a href="http://www.innodb.com/wp/wp-content/uploads/2010/04/InnoDB_Monitoring_System_2010.pdf">Slides</a>, <a href="http://en.oreilly.com/mysql2010/public/schedule/detail/13508">Rate / leave feedback</a></li>
<li>Backup Strategies with InnoDB Hot Backup: <a href="http://www.innodb.com/wp/wp-content/uploads/2010/04/Backup_Strategies_with_MySQL_Enterprise_Backup_2010.pdf">Slides</a>, <a href="http://en.oreilly.com/mysql2010/public/schedule/detail/13505">Rate / leave feedback</a></li>
</ul>
<p><span id="more-590"></span></p>
<p>We hope that a good and useful time was had by all. Best regards to our European friends and colleagues whose return plans were disrupted by the Icelandic volcano!</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.innodb.com/wp/2010/04/post-conference-roundup-of-innodb-related-info/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>InnoDB Conference Presentations Now Online</title>
		<link>http://blogs.innodb.com/wp/2009/05/innodb-conference-presentations-now-aonline/</link>
		<comments>http://blogs.innodb.com/wp/2009/05/innodb-conference-presentations-now-aonline/#comments</comments>
		<pubDate>Wed, 13 May 2009 00:30:48 +0000</pubDate>
		<dc:creator>Ken Jacobs</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Backup]]></category>
		<category><![CDATA[Concurrency]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[internals]]></category>
		<category><![CDATA[Locking]]></category>
		<category><![CDATA[Plugin]]></category>
		<category><![CDATA[Recovery]]></category>
		<category><![CDATA[Row Formats]]></category>

		<guid isPermaLink="false">http://blogs.innodb.com/wp/?p=434</guid>
		<description><![CDATA[Well, it took us a little while (we&#8217;ve been busy !), but we&#8217;ve now posted our presentations on InnoDB from the MySQL Conference and Expo 2009. You can download these presentations by Heikki Tuuri, Ken Jacobs and Calvin Sun from the InnoDB website, as follows: Ken and Heikki: InnoDB: Innovative Technologies for Performance and Data [...]]]></description>
			<content:encoded><![CDATA[<p>
Well, it took us a little while (we&#8217;ve been busy <img src='http://blogs.innodb.com/wp/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />!), but we&#8217;ve now posted our presentations on InnoDB from the MySQL Conference and Expo 2009. You can download these presentations by <a href="http://www.mysqlconf.com/mysql2009/public/schedule/speaker/1311">Heikki Tuuri</a>, <a href="http://www.mysqlconf.com/mysql2009/public/schedule/speaker/1312">Ken Jacobs</a> and <a href="http://www.mysqlconf.com/mysql2009/public/schedule/speaker/12396">Calvin Sun</a> from the InnoDB website, as follows:</p>
<ul>
<li>Ken and Heikki: <a href="http://www.innodb.com/wp/wp-content/uploads/2009/05/innovative-technologies-final.pdf">InnoDB: Innovative Technologies for Performance and Data Protection</a></li>
<li>Heikki: <a href="http://www.innodb.com/wp/wp-content/uploads/2009/05/innodbcrashrecovery-final.pdf">Crash Recovery and Media Recovery in InnoDB</a></li>
<li>Heikki: <a href="http://www.innodb.com/wp/wp-content/uploads/2009/05/concurrencycontrol.pdf">Concurrency Control: How it Really Works</a></li>
<li>Calvin and Heikki: <a href="http://www.innodb.com/wp/wp-content/uploads/2009/05/innodb-file-formats-and-source-code-structure.pdf">InnoDB File Formats and Source Code Structure</a></li>
</ul>
<p>Descriptions of these and other presentations about InnoDB are available <a href="http://www.innodb.com/products/innodb/info/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.innodb.com/wp/2009/05/innodb-conference-presentations-now-aonline/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

