At Servebolt, we live and breathe performance.
Database performance is no exception.
Executing an inefficient query after a website visitor clicks a link will significantly degrade the user experience. The visitor must wait for the slow query to finish executing – which may take several seconds – before any other action, such as page rendering, can take place. This wait includes not only the time needed for the query to run but also any additional time needed for pre-processing and post-processing. As a result, a poorly designed query can significantly slow down the overall performance of a website – resulting in a frustrating user experience.
Time to First Byte (TTFB) is a way of measuring how long it takes for the first byte of data to be received after a user makes a request to a website. It’s also a key metric used by search engines in evaluating sites. When a slow query is triggered, it will negatively affect TTFB. The longer the slow query takes to run, the higher the TTFB will be, resulting in slower overall website performance and a less satisfying user experience.
In this guide, we’ll walk you through how to profile SQL queries – a crucial part of maintaining the performance of web applications that rely on database responses. This is a process that sets the groundwork to be then able to start working on optimizing these queries to improve their performance.
Understanding SQL Query Profiling
As you develop a web application and it begins to operate on a larger scale, SQL queries that once ran smoothly can cause performance issues. Generally speaking, there tends to be an increasing number of queries running against a growing amount of data with an increasing number of requests per second. And when performance suffers, so does the experience your users have when interacting with your site, software, or service.
Query profiling is a way of analyzing database queries, assessing their performance, and identifying potential problems.
By analyzing and identifying these problematic queries, you can make specific improvements that deliver a measurable difference to the performance of your database. This will, in turn, allow for improved scalability in the future, as well as greater customer satisfaction, as your apps and sites become more responsive.
MariaDB (and MySQL) provide several tools and techniques for query profiling, which we’ll be covering in this article. Once the slow queries have been identified, the next step will be to optimize them. This process includes identifying the root cause of the issue, and making changes to the structure of the queries to improve their efficiency.
How To Profile SQL Queries (7 Methods)
Let’s start by breaking down the different tools and techniques that are available to identify slow and inefficient queries so that you know where to focus improvement efforts:
1 – The EXPLAIN EXTENDED Command
One of the tools that can be used to analyze your SQL queries is the EXPLAIN command.
By running the EXPLAIN command on a query, you are able to see how the query is executed, including which indexes are used, and the number of rows that are examined.
EXPLAIN SELECT * FROM orders
JOIN customers ON orders.customer_id = customers.customer_id
WHERE customers.name = 'John Smith';
When you run the EXPLAIN command on a query, it returns a result set with several columns, including:
- id: The unique identifier of the query in the execution plan
- select_type: The type of the query, such as SIMPLE or SUBQUERY
- table: The table that is being queried
- type: The access type used, such as ALL (a full table scan), index, range, or ref
- possible_keys: The indexes that MariaDB or MySQL could have used to process the query
- key: The index that MariaDB or MySQL actually used to process the query
- key_len: The length of the key that was used
- rows: The number of rows that MariaDB or MySQL estimates will be examined for the query
- Extra: Additional information about the query, such as whether a full table scan was performed or whether a temporary table was used
By analyzing the output of the EXPLAIN command, you are generally able to identify potential performance bottlenecks, such as poor indexing, suboptimal join types, or high numbers of examined rows.
For example, if the type column shows “ALL” instead of “index,” then the query is doing a full table scan, which is almost certainly going to result in slow performance. If the key column is NULL, then MySQL is not using any indexes, which will also be slow. If the rows column has a high value, it means that many rows are being examined, resulting in further degradation in performance.
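As a concrete illustration – using the customers table from the earlier example, with a hypothetical index name of our own choosing – you might check for and add a missing index like this:

SHOW INDEX FROM customers;
-- Without an index on name, EXPLAIN will typically report type ALL and key NULL (a full table scan):
EXPLAIN SELECT * FROM customers WHERE name = 'John Smith';
-- After adding an index, re-running EXPLAIN should show type ref and the new key:
CREATE INDEX idx_customers_name ON customers (name);
EXPLAIN SELECT * FROM customers WHERE name = 'John Smith';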
We prefer using the EXPLAIN EXTENDED variation to help provide additional information.
Note: While this is deprecated in MySQL, it is still available in MariaDB.
By using the EXTENDED option, you get an additional filtered column – the optimizer’s estimate of the percentage of rows left after filtering – and, by running SHOW WARNINGS immediately afterward, you can see the query as the optimizer actually rewrote it. Combined with the standard EXPLAIN output (the estimated number of rows examined, the type of JOIN used, the order in which the tables are scanned, and the indexes used), this gives a much fuller picture of how the query will be executed.
Here’s how using the EXPLAIN EXTENDED command looks:
EXPLAIN EXTENDED SELECT * FROM your_table WHERE column_name = 'value';
In this example, the EXPLAIN command will show a list of steps the database will take in order to execute the query, as well as a list of the resources it will use.
By using this command, you’ll be more easily able to spot bottlenecks in the query, allowing you to make whatever necessary changes will help to alleviate that and speed up the performance of the query.
For example, using the EXPLAIN EXTENDED command can help identify the need to add indexes, optimize JOIN conditions, and limit the total number of rows returned by the query.
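One detail worth knowing: part of the extra information produced by EXPLAIN EXTENDED is exposed via SHOW WARNINGS, which needs to be run in the same session immediately afterward:

EXPLAIN EXTENDED SELECT * FROM your_table WHERE column_name = 'value';
-- Shows the query as the optimizer actually rewrote it:
SHOW WARNINGS;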
You should also make sure that you’ve disabled query caching when testing and optimizing, to ensure you get accurate results. To do so, run this command first when you connect your client:
SET SESSION query_cache_type=0;
Once you’ve made these changes to your query, test its performance again to identify how much of an improvement has been achieved (if any). Remember that, as with any profiling and optimization of a query, the process is iterative – expect to use the EXPLAIN EXTENDED command, followed by a performance test, several times.
2 – The EXPLAIN ANALYZE Command
This command is used to analyze the execution plan of a query and return performance metrics such as the actual time the query took to execute and the number of rows it actually examined. By analyzing the results of the EXPLAIN ANALYZE command, you can identify any potential bottlenecks in the query’s execution, such as a lack of indexes or a high number of rows that need to be examined.
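The exact syntax differs between the two databases: MySQL (8.0.18 and later) uses EXPLAIN ANALYZE, while MariaDB provides the equivalent ANALYZE statement. Using the orders table from the earlier example:

-- MySQL 8.0.18+; note that this actually executes the query:
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
-- MariaDB equivalent:
ANALYZE SELECT * FROM orders WHERE customer_id = 42;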
3 – The Slow Query Log
This is a built-in feature in MariaDB (and MySQL) that logs all queries that take longer than a certain amount of time to execute. The slow query log can be configured to log queries that take longer than a specific threshold, such as one second.
At Servebolt, the slow query log logs all queries that take longer than 1 second to execute. This is because most queries should execute in fractions of a second. In the context of a web application, such as a site running WordPress, loading a single page requires between 10 and 100 database queries, all of which must be executed sequentially before the page can be compiled to HTML and returned to the user.
The current Servebolt Cloud configuration keeps slow query logs on a global log server. If the need arises, you can simply get in touch with our support team, and we will filter the file for the relevant logs and provide you with the output.
In your own environments, you can enable the slow query log by adding the following lines to your MariaDB or MySQL configuration file (my.cnf or my.ini). Note that the older log_slow_queries variable was removed in MySQL 5.6; the variables below work on both current MariaDB and MySQL:
slow_query_log = 1
slow_query_log_file = /path/to/slow.log
long_query_time = 1
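If you’d rather not edit the configuration file and restart the server, the same settings can usually be changed at runtime (given sufficient privileges), although they won’t survive a restart:

SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 1;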
4 – Visual Explain Plan
A visual explain plan provides a graphical representation of the EXPLAIN command output, making it easier to understand a query’s execution and detect any performance problems.
Note: Visual Explain Plans are helpful when you are in the process of developing web applications.
Instead of plain text output, it displays the query execution in a tree structure, with each node representing a table, index, or operation, and the connections between them depict the order of operations.
Different tools, such as the MySQL Workbench and the EXPLAIN Analyzer, can generate visual explain plans, and offer an interactive interface for navigating the execution plan and examining each operation in detail.
For instance, in MySQL Workbench, generating a visual explain plan is as simple as executing the query and clicking the “Explain Plan” button on the result tab. This presents a graphical representation of the query execution plan, along with detailed information on each operation. This enables you to identify any performance issues and then optimize the query as needed.
5 – The MySQL Tuner
MySQL Tuner is a script that checks the performance and configuration of a database server and provides recommendations for improvement. It provides a summary of the current server status, including information such as the total number of queries, the number of slow queries, and the current buffer pool usage.
It can also be used to check various other settings, such as the database version, the storage engine in use, and the query cache configuration, and provides recommendations for optimizing these settings based on the current workload.
One of the main differences from other tools is that it’s a command-line script that can be run either on the server itself or remotely, making it easy to automate the process of monitoring and optimizing database performance.
Note: If your web application (and database) is already hosted in the Servebolt Cloud, this is something our team specializes in and can do better than the recommendations any tool would be able to provide.
6 – Query Profilers
There are third-party query profilers that can be used to profile SQL queries, such as the MariaDB Enterprise Query Analyzer, Dataedo, and Percona Toolkit. Third-party query profilers can provide additional features and functionality compared to the built-in tools available in MariaDB (or MySQL).
Note: Query Profilers are helpful when you are in the process of developing web applications.
For example, they may offer more detailed information about query performance, such as execution times and lock wait times, and can provide visualization of the data in ways that are not possible with the built-in tools.
If the built-in tools are sufficient for your needs, then there is no need to use third-party query profilers. However, if you need more detailed information or advanced features, then it may be worth considering a third-party profiler.
7 – Profiling With Monitoring Tools
There are also a number of monitoring tools, such as Prometheus, Grafana, and Nagios, that can be used to profile queries and monitor the performance of your databases.
Prometheus is an efficient monitoring system that can collect, store and query metrics data, allowing you to gain valuable insights in real-time. It integrates with MariaDB (and MySQL) to store the metrics gathered and comes with Grafana for effective visualization.
Grafana is a powerful, open-source analytics tool that can be used to monitor and visualize data gathered from Prometheus. Setting up custom dashboards and alerts allows you to keep an eye on the performance of your database in real-time.
Nagios helps you keep an eye on the health of your database at all times. It can be set up to monitor key resources such as CPU, RAM and disk space, while also keeping track of other services and network devices. As it’s highly configurable, it’s a great tool to have for proactive database query monitoring.
With the help of these server monitoring tools, you can track performance issues and take action quickly, allowing you to ensure that your database server runs smoothly.
Common Query Optimization Techniques
There are several common query optimization techniques that can be used to improve the performance of SQL queries:
1 – Indexing
Indexes are a way to speed up queries – particularly those that use filters (WHERE). An index is a data structure maintained by your database engine (MariaDB or MySQL) outside of a specific table that points to the data you’re trying to query, so the engine can locate rows without scanning the whole table. We won’t go into too much detail in this post, as using indexes to improve database queries warrants an article of its own – something we plan to cover in the future.
For example, consider a large table called “orders” that contains millions of rows of data, including information such as the order ID, customer ID, and order date. If a query is executed to retrieve all orders placed by a specific customer without an index on the customer ID column, MariaDB would have to scan the entire table to locate the relevant data. This could take significant time and resources, especially for large tables.
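In that scenario, a single index on the customer ID column lets the database jump straight to the matching rows instead of scanning all of them – a minimal sketch, with the index name being our own choice:

CREATE INDEX idx_orders_customer_id ON orders (customer_id);
-- The lookup can now be resolved via the index rather than a full table scan:
SELECT * FROM orders WHERE customer_id = 42;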
Broadly speaking, whenever you’re confident that you’ll run a specific query repeatedly and read performance matters, creating an index (or more than one) can be the right approach to speed that query up.
In the context of WordPress, this is very common. Lots of plugins are built by developers who (out of convenience) use generic, shared tables without using indexes. As a result, it is also an area where there are often very significant performance gains.
You can view any indexes that exist on a specific table using SHOW INDEX FROM – such as in the below example for the wp_postmeta table:
MariaDB [db_name] > SHOW INDEX FROM wp_postmeta;
In one scenario, we recently created two indexes for a wp_postmeta table: sb_postid_metakey and sb_postid_metakey_metaval.
These indexes were added after looking at the top slow queries and finding that they were all broadly similar: SELECT statements filtering with WHERE plus a long list of (AND/OR) comparison conditions. Upon seeing this, we reviewed the current indexes on the table and ran EXPLAIN EXTENDED on the query to validate the approach further.
The queries were mostly joining against the wp_postmeta table using JOIN. Based on the order in which this was happening, adding these indexes would allow MariaDB (or MySQL) to get its response from the indexes instead of scanning the entire table with all its rows.
CREATE INDEX sb_postid_metakey ON wp_postmeta (post_id, meta_key);
CREATE INDEX sb_postid_metakey_metaval ON wp_postmeta (post_id, meta_key, meta_value);
This is a combination of “figuring things out” by using the tools you have at your disposal (as outlined above), as well as knowledge of the data types and contents of the database. This by no means always works; even when it does, it doesn’t always result in a 500% performance improvement. Having a huge index can end up being slower than scanning all rows, so queries must be tested before and after applying the indexes to be sure.
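After creating indexes like the ones above, it’s worth confirming that the optimizer actually uses them – for example, by re-running EXPLAIN on one of the slow queries (the query below is a simplified stand-in for the real ones) and checking the key column:

EXPLAIN EXTENDED SELECT post_id, meta_value FROM wp_postmeta WHERE post_id = 123 AND meta_key = 'some_key';
-- The key column should now show sb_postid_metakey (or sb_postid_metakey_metaval)
-- instead of NULL, and the rows estimate should drop sharply.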
Note: When attempting to test index speeds, you’ll want to disable query caching for the session, using:
SET SESSION query_cache_type=0;
In this case, before using indexes, the query took 10.437 seconds to execute. And after creating the two indexes, the same query took [# of seconds].
2 – Reducing Data Access
Reducing data access means minimizing the number of rows and columns that must be accessed to execute a query. This can be achieved by filtering the data that the query retrieves, using indexes, and partitioning large tables. Although not something most people will need (or be able) to do, it’s an essential point to keep in mind when designing database queries from scratch.
For example, if a database query is looking up a user’s data for login purposes, the query should use LIMIT 1, since more than one user’s row should never be required.
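For instance – with a hypothetical users table – selecting only the columns needed and capping the row count both reduce the data the database has to touch:

SELECT user_id, password_hash FROM users WHERE email = 'visitor@example.com' LIMIT 1;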
Note: This relates more to database design than optimization. While important to maintain performance, this effort is more relevant to plugin developers (in the context of WordPress) than it is to the majority of end users.
Remember that before testing speeds after making any changes to data access, you should ensure you’ve disabled query caching by running the following command:
SET SESSION query_cache_type=0;
3 – Using Data Partitioning
By partitioning data into smaller chunks, databases become more efficient and less time-consuming to manage. This strategy can help reduce the amount of time spent on maintenance processes such as backups and updates, as well as limit the amount of data that needs to be managed. Overall it helps to improve performance and optimize resource usage.
To partition data in a database, you can follow these steps:
- When selecting a table to be partitioned, make sure to pick one that holds a large amount of data and would benefit from being split. This will help optimize your system and improve query performance.
- Selecting the right partitioning method for your database is crucial. You can choose from range, list, hash, or key partitioning – depending on the structure of your data and the queries that you’re planning to execute. Make sure you pick the one that best suits your needs for optimized performance and results.
- Range partitioning is the ideal choice when you have data that can be divided into certain ranges. For instance, if you have a table with data for multiple years, you can create a range partition to organize it better. It could be based on the date or the numerical value of the column in question.
- List partitioning is an efficient technique for data that can be cleanly separated into groups by a particular attribute. For instance, if you have a table of employee information categorized by department, list partitioning is a natural fit.
- Hash partitioning is an effective strategy for arranging data into equal-sized clusters based on the hash value of a specific column. This allows for an even distribution of the data across multiple partitions, making it a great choice for distributing data efficiently.
- Key partitioning is similar to hash partitioning, but the major difference is that it uses a specific column value as its basis for dividing data into different groups. This makes it an ideal choice for data sets that can be broken up into separate groups based on a unique identifier or natural key.
- By creating a partitioned table, you can effectively divide the original table into smaller ones. This is achieved by adding a partitioning clause in the CREATE TABLE statement, where you specify the desired method and conditions for segmentation. Doing this can help improve query performance – and also make data management more efficient.
- You can quickly copy data from the original table to the newly partitioned one using the INSERT INTO… SELECT statement. This will easily populate your partitioned table with all of the relevant information.
- Applications must now be reconfigured to take advantage of the partitioned table. This will replace the original table and make your applications more efficient.
- Before running any test to assess potential performance improvement it will be essential to disable query caching first by running the command:
SET SESSION query_cache_type=0;
- To ensure your partitioned table is running smoothly, it’s important to keep a close eye on its performance. If you notice any issues, adjusting the partitioning conditions or switching to another method could help. Regularly monitoring your partitions will help you maximize their potential.
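The selection and creation steps above can be sketched as follows for a hypothetical orders table, range-partitioned by year (the table layout and partition boundaries are purely illustrative):

CREATE TABLE orders_partitioned (
    order_id INT NOT NULL,
    customer_id INT NOT NULL,
    order_date DATE NOT NULL
)
PARTITION BY RANGE (YEAR(order_date)) (
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);

-- Copy the data across from the original table:
INSERT INTO orders_partitioned SELECT order_id, customer_id, order_date FROM orders;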
Important Note About Scripting Upgrades And Partitioned Tables
While partitioning databases can make a positive difference in efficiency, it’s important to keep in mind the potential issues caused by running upgrade scripts that change the database schema. Partitioned tables must be taken into consideration when scripting these upgrades; if they aren’t, the result will almost certainly be a malfunctioning site.
For instance, if a script is created to add a new column to a partitioned table, it might only alter one partition, creating inconsistencies and issues within the data. Likewise, an upgrade script that adds an index to a partitioned table might only generate the index on one partition, resulting in slower performance and inconsistent results.
To avoid such problems, upgrade scripts must be designed to consider the partitioned tables. This could involve running the script on each partition individually or revising the scripts to work with partitioned tables. It’s also important to conduct thorough testing to ensure that the upgrade process doesn’t generate any unexpected problems or loss of data.
4 – Redis
For Servebolt customers, Redis is a (paid) addon that can help with query optimization.
Redis (short for Remote Dictionary Server) is an open-source solution that stores data in-memory and can be used for caching, as a database, or even as a message broker. It can be integrated with a database to improve performance, acting as an efficient intermediary between the application and the database.
It works to improve the performance and response times of applications by reducing the load on the database. This is done by storing frequently used data in Redis instead of the database for every request, thereby saving considerable time.
With a properly configured caching layer (in WordPress, typically a Redis object cache plugin), Redis can be used alongside a database to optimize query execution. When required data is not present in Redis, the application will retrieve it from the database and store it in Redis for future use. This makes subsequent data retrieval much faster and more efficient.
By using this approach, the application can benefit from the quick in-memory access of Redis and also store and access data from the database as needed.
Remember that if you are implementing Redis for the first time, you will need to disable query caching before running any performance tests. To do this, use the command:
SET SESSION query_cache_type=0;
The MariaDB and MySQL ecosystem has a wide range of tools and methods to make it easier to discover bottlenecks in database query executions, allowing you to improve the performance of your web applications.
Slowdowns are likely to occur throughout the lifetime of running any application. Trying to avoid them is great, but you ultimately need to know where to look when you start diagnosing performance issues. Depending on the size and nature of the databases you run, this is an iterative process that requires continuous monitoring, troubleshooting, and ongoing improvement to keep your databases performing at a high standard.