
Percona Live Europe Tutorial: Query Optimization and TLS at Large Scale

Planet MySQL - Mon, 10/15/2018 - 07:05

For Percona Live Europe this year, I had a workshop on query optimization accepted, as well as a 50-minute talk covering TLS for MySQL at Large Scale, drawing on our experiences at the Wikimedia Foundation.

Workshop

The 3-hour workshop on Monday, titled Query Optimization with MySQL 8.0 and MariaDB 10.3: The Basics, is a beginners’ tutorial, though dense in content. It’s for people who are more familiar with database storage systems other than InnoDB for MySQL, MariaDB, or Percona Server, or who, while already familiar with them, are suffering performance and scaling issues with their SQL queries. If you get confused by the output of basic commands like EXPLAIN and SHOW STATUS, and want to learn some SQL-level optimizations, such as creating the right indexes or altering the schema to get the most performance out of your database server, then you will want to attend this tutorial before going into more advanced topics. Even veteran DBAs and developers may learn one or two new tricks, only available in the latest server versions!

Something that people may enjoy is that, during the tutorial, every attendee will be able to run queries against a real-time copy of the Wikipedia database servers, or set up their own offline Wikipedia copy on their laptop. They’ll get to practice for themselves what is being explained, so it will be fully hands-on. I like my sessions to be interactive, so all attendees should get ready to answer questions and think through the proposed problems by themselves!

Fifty-minute talk

My 50-minute talk, TLS for MySQL at Large Scale, will be a bit more advanced, although perhaps more attractive to users of other database technologies. On Tuesday, I will tell the tale of the mistakes made and lessons learned while deploying encryption (TLS/SSL) for the replication, administration, and client connections of our databases. At the Wikimedia Foundation we take the privacy of our users (Wikipedia readers, project contributors, data reusers, and every member of our community) very seriously, and while none of our databases are publicly reachable, our aim is to encrypt every single connection between servers, even within our datacenters.

However, when people talk about security topics, most of the time they are trying to show off the good parts of their setup while hiding the ugly parts, or they are too theoretical for anyone to actually learn something. My focus will not be on the security principles everybody should follow, but on the purely operational problems and the solutions we needed to deploy, as well as what we would have done differently had we known better, while deploying TLS on our 200+ MariaDB server pool.

Looking forward…

As an attendee, I always look forward to the ProxySQL sessions, as it is something we are currently deploying in production. I also want to know more about the maturity and roadmap of the newest MySQL and MariaDB releases, as they keep adding interesting new features we need, as well as about cluster technologies such as Galera and InnoDB Cluster. I like, too, to talk with people developing and using technologies outside of my stack; you never know when they will fill a need we have (analytics, compression, NoSQL, etc.).

But above all, the thing I enjoy the most is the networking: being able to talk with professionals who suffer the same problems I do is something I normally cannot do, and something I enjoy a lot during Percona Live.

Jaime Crespo in a Percona Live T-Shirt – why not come to this year’s event and start YOUR collection.

The post Percona Live Europe Tutorial: Query Optimization and TLS at Large Scale appeared first on Percona Community Blog.


Identifying High Load Spots in MySQL Using Slow Query Log and pt-query-digest

Planet MySQL - Mon, 10/15/2018 - 06:24

pt-query-digest is one of the most commonly used tools for query auditing in MySQL®. By default, pt-query-digest reports the top ten queries consuming the most time inside MySQL. A query that takes longer than the configured threshold is considered slow, but it’s not always true that tuning such queries makes them faster. Sometimes, when resources on the server are busy, every other operation on the server is affected, and queries are too. In such cases, you will see the proportion of slow queries go up, and that can also include queries that normally work fine.

This article explains a small trick to identify such spots using pt-query-digest and the slow query log. pt-query-digest is a component of Percona Toolkit, open source software that is free to download and use.
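
If you haven’t run it before, a minimal invocation is enough to produce the kind of profile shown later in this post. As a sketch (the log path below is only an example; point the tool at your own slow query log):

# Summarize the slow query log and save the report for later review
pt-query-digest /var/lib/mysql/mysql-slow.log > slow-digest.txt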

Some sample data

Let’s have a look at some sample data from Percona Server 5.7. The slow query log is configured to capture queries that run for longer than ten seconds, with no limit on the rate of logging (rate limiting is generally used to throttle the I/O incurred when writing slow queries to the log file).

mysql> show variables like 'log_slow_rate%';
+---------------------+---------+
| Variable_name       | Value   |
+---------------------+---------+
| log_slow_rate_limit | 1       |  --> Log all queries
| log_slow_rate_type  | session |
+---------------------+---------+
2 rows in set (0.00 sec)

mysql> show variables like 'long_query_time';
+-----------------+-----------+
| Variable_name   | Value     |
+-----------------+-----------+
| long_query_time | 10.000000 |  --> 10 seconds
+-----------------+-----------+
1 row in set (0.01 sec)
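
For reference, a similar configuration could be applied at runtime with something like the following (a sketch, assuming Percona Server, where the log_slow_rate_* variables are available; note that a global long_query_time only affects new connections):

mysql> SET GLOBAL slow_query_log = 1;
mysql> SET GLOBAL long_query_time = 10;      -- capture queries taking longer than 10 seconds
mysql> SET GLOBAL log_slow_rate_limit = 1;   -- log every qualifying query, no sampling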

When I run pt-query-digest, I see in the summary report that 80% of the queries have come from just three query patterns.

# Profile
# Rank Query ID                      Response time    Calls R/Call   V/M
# ==== ============================= ================ ===== ======== =====
#    1 0x7B92A64478A4499516F46891... 13446.3083 56.1%   102 131.8266  3.83 SELECT performance_schema.events_statements_history
#    2 0x752E6264A9E73B741D3DC04F...  4185.0857 17.5%    30 139.5029  0.00 SELECT table1
#    3 0xAFB5110D2C576F3700EE3F7B...  1688.7549  7.0%    13 129.9042  8.20 SELECT table2
#    4 0x6CE1C4E763245AF56911E983...  1401.7309  5.8%    12 116.8109 13.45 SELECT table4
#    5 0x85325FDF75CD6F1C91DFBB85...   989.5446  4.1%    15  65.9696 55.42 SELECT tbl1 tbl2 tbl3 tbl4
#    6 0xB30E9CB844F2F14648B182D0...   420.2127  1.8%     4 105.0532 12.91 SELECT tbl5
#    7 0x7F7C6EE1D23493B5D6234382...   382.1407  1.6%    12  31.8451 70.36 INSERT UPDATE tbl6
#    8 0xBC1EE70ABAE1D17CD8F177D7...   320.5010  1.3%     6  53.4168 67.01 REPLACE tbl7
#   10 0xA2A385D3A76D492144DD219B...   183.9891  0.8%    18  10.2216  0.00 UPDATE tbl8
# MISC 0xMISC                          948.6902  4.0%    14  67.7636  0.0  <10 ITEMS>

Query #1 is generated by the qan-agent from PMM and runs approximately once a minute; its results are handed over to the PMM Server. Queries #2 and #3 are pretty simple: they scan just one row, return either zero or one rows, and use indexes, which makes me think that this is not caused by something within MySQL alone. I wanted to know whether I could find a common aspect of all these occurrences.
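
That kind of reasoning is easy to double-check. As a hypothetical example (the table and column names are made up, standing in for the anonymized table1/table2 patterns above), EXPLAIN should show a single-row, index-based lookup if the query itself is healthy:

mysql> EXPLAIN SELECT col1 FROM table1 WHERE id = 42\G
-- A type of const or ref with rows: 1 means the query plan is fine,
-- and the slowness must be coming from somewhere else.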

Let’s take a closer look at the queries recorded in the slow query log.

# grep -B3 DIGEST mysql-slow_Oct2nd_4th.log
....
....
# User@Host: ztrend[ztrend] @ localhost []  Id: 6431601021
# Query_time: 139.279651  Lock_time: 64.502959  Rows_sent: 0  Rows_examined: 0
SET timestamp=1538524947;
SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history;
# User@Host: ztrend[ztrend] @ localhost []  Id: 6431601029
# Query_time: 139.282594  Lock_time: 83.140413  Rows_sent: 0  Rows_examined: 0
SET timestamp=1538524947;
SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history;
# User@Host: ztrend[ztrend] @ localhost []  Id: 6431601031
# Query_time: 139.314228  Lock_time: 96.679563  Rows_sent: 0  Rows_examined: 0
SET timestamp=1538524947;
SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history;
....
....

Now you can see two things:

  • All of them have the same Unix timestamp.
  • All of them spent more than 70% of their execution time waiting for some lock (see the quick check below).
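
A rough way to confirm the second point across the whole log (assuming the standard slow log header format shown above, where the Query_time and Lock_time values sit in fixed whitespace-separated positions) is:

# Print query time, lock time, and the share of each query spent waiting on locks
grep Query_time mysql-slow_Oct2nd_4th.log | \
  awk '{ printf "query_time=%s lock_time=%s lock_pct=%.0f%%\n", $3, $5, ($5/$3)*100 }'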

Analyzing the data from pt-query-digest

Now I want to check whether I can group the count of queries by their time of execution. If multiple queries are captured into the slow query log at the same time, the time header is printed only for the first of them. Fortunately, in this case I can rely on the Unix timestamp, which gets recorded for every query, to compute the counts. Luckily, without a long struggle, a combination of the grep and awk utilities displayed exactly what I wanted.

# grep -A1 Query_time mysql-slow_Oct2nd_4th.log | grep SET | awk -F "=" '{ print $2 }' | uniq -c
      2 1538450797;
      1 1538524822;
      3 1538524846;
      7 1538524857;
    167 1538524947;   ---> 72% of queries have happened at this timestamp.
      1 1538551813;
      3 1538551815;
      6 1538602215;
      1 1538617599;
     33 1538631015;
      1 1538631016;
      1 1538631017;

You can use the command below to convert a given Unix timestamp to a regular date and time. So, Oct 3rd at 05:32 is when something went wrong on the server:

# date -d @1538524947
Wed Oct 3 05:32:27 IST 2018
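
Once the hot spot is identified, the same window can be fed back into pt-query-digest, which supports --since and --until filters, to profile only the queries logged around the spike. The exact range below is illustrative:

# Digest only the entries logged around the spike
pt-query-digest --since '2018-10-03 05:30:00' --until '2018-10-03 05:40:00' mysql-slow_Oct2nd_4th.log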

Query tuning can be carried out alongside this, but identifying such spots helps avoid spending time on query tuning when badly written queries are not the problem. That said, from this point further troubleshooting may take different sub-paths, such as checking log files at that particular time, looking at CPU reports, reviewing past pt-stalk reports if it is set up to run in the background, checking dmesg, and so on. This approach is useful for identifying at what time (or time range) MySQL was under more stress, using just the slow query log, when no robust monitoring tool, such as Percona Monitoring and Management (PMM), is deployed.
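
For example, if sysstat history is kept on the server, the same window can be checked for CPU or I/O pressure (the archive path and times below are illustrative, assuming the default sar data location on Linux):

# CPU utilization between 05:25 and 05:40 on Oct 3rd, read from that day's sar archive
sar -u -f /var/log/sa/sa03 -s 05:25:00 -e 05:40:00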

Using PMM to monitor queries

If you have PMM, you can review Query Analytics to see the top slow queries, along with details like execution counts, load, etc. Below is a sample screenshot for your reference:

NOTE: If you use Percona Server for MySQL, the slow query log can report query times in microseconds. It also supports extended logging of other statistics about query execution, which gives you deeper insight into query processing. You can see more information about these options here.
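
For instance, on Percona Server the extended statistics can be switched on through the log_slow_verbosity variable (a sketch; check the Percona Server documentation for the full list of accepted values):

mysql> SET GLOBAL log_slow_verbosity = 'full';   -- log extra per-query details, e.g. InnoDB statistics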

