We found a long-lived MariaDB bug in our hosting stack

Last week we discovered a bug in MariaDB that has been with us on Servebolt for a very long time and which will affect all others that use the same config setting. The bug has had a very negative impact on certain database queries.

We rolled out a platform-wide workaround for the problem last week and contributed information about the problem to MariaDB. We are now waiting for the amazing MariaDB team to get the bug fixed for one of the upcoming releases and solve this for anyone that might have used the same configuration option.

We discovered the bug while moving to MyThemeShop.com. They have hundreds of thousands of users living on their WordPress site. This makes a very large database, especially the _user and _usermeta tables. WordPress Admin does several full-count queries and a variety of join queries on these tables on several different admin pages. These queries represent the typical scenario where indexes help speed up the query time a lot.

The symptoms of the bug were quite severe for the queries that were affected. We have not seen any front-end performance degradations for either WordPress / WooCommerce or Magento, but certain back-end queries led us to the identification of the bug.

The bug and how it manifested itself

The bug is enabled in MariaDB if the innodb_stats_method is set to nulls_ignored. The MariaDB default config has this set to nulls_equal, so standard setups should not be affected. Our workaround for the problem has been to change this setting to the MariaDB default temporarily.

We opened a ticket with a detailed description and the way to reproduce the problem with MariaDB last week. The problem was that the cardinality for indexes was equivalent to the table length, which made the database select the wrong indexes to use for certain queries. Cardinality is essential for the database to make the right decision on how to make a specific query as fast as possible. We also found that this bug has previously exited in the MySQL project, so this has somehow come back into MariaDB.

Join queries with large tables would suffer badly. Such queries are common in both cron jobs and certain pages in the back-end of WordPress and Magento. Cron jobs can easily go under the radar, as it is quite normal for large websites to run heavy queries in maintenance tasks. However, in the admin backend of WordPress – the problem manifested itself by using 10-20x more time than on otherwise configured databases.

Queries that were supposed to complete in a few seconds and after that be cached in the query cache – spent more than a minute before the results were ready and did not get cached.

Our most sincere apologies!

The Servebolt team is used to seeing a lot of “wrong” use of databases. Queries that do not use indexes are one of the most common database-related problems that plugins or developers make. Therefore, we might have responded to several of our clients that have experienced this problem that “your application is making bad database queries that do not use indexes correctly,” while the slowness has been caused by this bug in MariaDB. Sorry!

The workaround has eliminated the problem – and for those who we have wrongly accused of using databases the wrong way, we sincerely apologize. For those of you who might have tested us and experienced any slowness – please do come back and test again because this issue is now resolved!

The upside of all of this is that databases now work even faster and better than ever before for all Servebolt-hosted websites. And that the fantastic open-source project MariaDB continues to improve!

Latest