Better Scalability with Multiple Rollback Segments

Background

The original motivation behind this patch was the infamous Bug#26590MySQL does not allow more than 1023 open transactions. Actually the 1024 limit has to do with the number of concurrent update transactions that can run within InnoDB. Where does this magic number come from ? 1024 is the total number of UNDO log list slots on one rollback segment header page. And in the past InnoDB created just one rollback segment header page during database creation. This rollback segment header page is anchored in the system header page, there is space there for 128 rollback segments but only one was being created and used resulting in the 1024 limit. Each slot in the rollback segment header array comprises of {space_id, page_no}, where both space_id and page_no are of type uint32_t . Currently the space id is “unused” and always points to the system table space, which is tablespace 0. Now, onto the rollback segment header page. This page contains a rollback segment header (details of which are outside the scope of this blog entry :-) ), followed by an array of 1024 UNDO slots. Each slot is the base node of a file based linked list of UNDO logs. Each node in this file based list contains UNDO log records, containing the data updated by a transaction. A single UNDO log node can contain UNDO entries from several different transactions.

Performance ramifications

When a transaction is started it is allocated a rollback segment to write its modifications. Multiple transactions can write to the same rollback segment but only one transaction is allowed to write to any one UNDO slot during its lifetime. This should make clear where the 1024 limit comes from. Each rollback segment is protected by its own mutex and when we have a single rollback segment this rollback segment mutex can become a high contention mutex.

Requirements

Backward compatibility in file formats is something we take very seriously at InnoDB.  InnoDB has always had the ability to use up to 128 pages but before this fix it created only one rollback segment. We had to figure out a way to make the multiple rollback segments change backward compatible, without breaking any assumptions in the code of older versions of InnoDB about absolute locations of system pages and  changes to system data. The 128 limit is a result of the latter. While there is space for 256 rollback segments, InnoDB uses only 7 bits from that field. Once we fix that we could in the future enable 256 rollback segments, however 128 seems to be sufficient for now. There are other scalability issues that need to be addressed first before 128K concurrent transactions will become an issue :-) .

The solution

To keep backward compatibility the additional rollback segments are created after the double write buffer when creating the rollback segments in a new instance. We keep the total number of rollback segments set to 128 rather than 256 and set the remaining slots to NULL because the older versions will try and scan up to 256. This means that older versions of InnoDB could also benefit from these extra rollback segments.  If you decide to create the extra rollback segments with a newer version but decide to revert back to an older version the older versions should be able to use the additional rollback segments. Newer versions of InnoDB that contain this fix, will create the additional segments in existing instances only if the innodb_force_recovery flag is not set and the database instance was shutdown cleanly. The additional segments will be created by default when creating new instances.

8 Responses

  1. James Day Says:

    Sunny, has this actually been tested with say MySQL 4.0 or 4.1 or 5.0 being able to use the extra rollback segments once they have been created with a later version?

    [Reply]

  2. MySQL 5.5.4 is Very Exicting « Jeremy Zawodny's blog Says:

    [...] Multiple Rollback Segments mean the 1,024 concurrent transaction limit goes away and concurrent transactions will have less mutex (lock) contention. [...]

  3. MySQL 5.5.4 looks awesome. « SmugMug's Don MacAskill Says:

    [...] and behold, it’s fixed!  You can now have a whopping 128K transactions in flight.  Best of all, it’s far more [...]

  4. Harpej Singh Says:

    Dear Sir/Mam
    I am facing a problem with MYSQL that we can’t store more then 123999475 rows. here 43 fields are in each row. ok. now error displayed
    ERROR-CODE-1114(HY000). Table is full. now how we rectify thease problem please send me solution about this problem
    thanks

    Harpej Singh
    Ajmer- Rajasthan (INDIA)
    (91)9214311070

    [Reply]

  5. Cvecara Beograd Says:

    I’m also interested in resolving large table issue.
    What is the limit there. We are speaking about 100 billions in one table here.

    [Reply]

    Marko Mäkelä Reply:

    @Cvecara Beograd, the maximum table size does not have anything to do with the undo logging of transactions. InnoDB supports up to 4,294,967,296 (2^32) pages per tablespace. Older versions of InnoDB limit it to 2,415,919,104 (0×90000000) pages when applying the redo log in crash recovery. 2^32 pages of 16 kilobytes (2^14 bytes) would be 2^46 bytes, or 64 TiB per tablespace, but that figure will include some administrative overhead (allocation and insert buffer bitmap pages, node pointer pages, B-tree leaf page headers and footers).

    In the past, InnoDB did return the misleading error code HA_ERR_RECORD_FILE_FULL when it failed to allocate an undo log segment. Currently, it returns the code HA_ERR_TOO_MANY_CONCURRENT_TRXS, which is mapped to ER_TOO_MANY_CONCURRENT_TRXS.

    [Reply]

  6. Piyush Says:

    @Harpej Singh, check the Disk Space Available dude.

    [Reply]

  7. MySQL 5.5 Feature List | CastlerockResearch India Developer Blog Says:

    [...] http://planet.mysql.com/entry/?id=22648 http://blogs.innodb.com/wp/2010/04/innodb-multiple-rollback-segments/ http://blogs.innodb.com/wp/2010/09/mysql-5-5-innodb-as-default-storage-engine/ [...]

Leave a Comment

Please note: Comment moderation is enabled and may delay your comment. There is no need to resubmit your comment.