Wednesday, February 3, 2021

[google-cloud-sql-discuss] Re: Error truncating performance_schema tables

Thanks for clarifying. I referenced MySQL documentation, because I believe the limit in Cloud SQL is derived from Oracle MySQL. You really seem to have a large number of schemas (over 10k), but have you really checked the number of digested queries? MySQL documentation explains how queries are digested, but this is something to evaluate on each instance, because no two databases and applications are the same.

Anyway, if you have much more than 1 million digested queries in your database, then you may consider splitting the schemas over different instances (if possible) or asking either Oracle or Google to increase the maximum for performance_schema_digests_size, because IMHO the information in performance_schema.events_statements_summary_by_digest table is most useful when it covers most of the queries that ran on the instance (i.e. when it's representative). On such a big database as yours, I could imagine that after truncation the most frequently executed queries would quickly fill up the table again.

Of course you may also request Google to support regular truncation of performance_schema.events_statements_summary_by_digest table as in Amazon Aurora, because you properly mentioned that this requires SUPER privilege, which is not supported on Cloud SQL.
On Wednesday, February 3, 2021 at 5:46:47 PM UTC+2 dhawal....@rubrik.com wrote:
Thanks for the reply. 2nd gen is a Cloudsql term, referring to the newer versions of Cloudsql.

Really appreciate you going through docs and helping me with options. I am aware that the limit can be increased to up to 1 Million, but I suspect in our production environment we won't be able to hold a year or several months worth of stats even in 1M rows. This is because the number of schemas and queries in our system our very high (~10k schemas) (and this table has 1 row per schema + normalized query). 
Also, I'm not sure if bumping table size to 1M can cause other problems since this table lives completely in-memory. I can try it out with different sizes and monitor NULL percentage as you suggested. But it would've been a lot easier and efficient if we could just truncate this table periodically. This is exactly what some of the other vendors like Amazon Aurora do https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_PerfInsights.UsingDashboard.html#USER_PerfInsights.UsingDashboard.AnalyzeDBLoad.AdditionalMetrics.MySQL. Unfortunately there doesn't seem a way to do it in CloudSQL.
On Wednesday, February 3, 2021 at 8:53:24 PM UTC+5:30 georgi....@siteground.com wrote:
I'm really not sure what "2nd Gen MySQL 5.7" means, but in the official MySQL 5.7 documentation for Statement Digests in the Performance Schema I read the following:

The events_statements_summary_by_digest summary table has a fixed size. By default the Performance Schema estimates the size to use at startup. To specify the table size explicitly, set the performance_schema_digests_size system variable at server startup. If the table becomes full, the Performance Schema groups statements that have SCHEMA_NAME and DIGEST values not matching existing values in the table in a special row with SCHEMA_NAME and DIGEST set to NULL. This permits all statements to be counted. However, if the special row accounts for a significant percentage of the statements executed, it might be desirable to increase the summary table size by increasing performance_schema_digests_size.

You may also check Statement Summary Tables article where you could read the following:
  • If no row has the statement digest value for the statement that just completed, and the table is full, the statistics for the statement that just completed are added to a special "catch-all" row with DIGEST = NULL, which is created if necessary. If the row is created, the FIRST_SEEN and LAST_SEEN columns are initialized with the current time. Otherwise, the LAST_SEEN column is updated with the current time.

The row with DIGEST = NULL is maintained because Performance Schema tables have a maximum size due to memory constraints. The DIGEST = NULL row permits digests that do not match other rows to be counted even if the summary table is full, using a common "other" bucket. This row helps you estimate whether the digest summary is representative:

  • A DIGEST = NULL row that has a COUNT_STAR value that represents 5% of all digests shows that the digest summary table is very representative; the other rows cover 95% of the statements seen.

  • A DIGEST = NULL row that has a COUNT_STAR value that represents 50% of all digests shows that the digest summary table is not very representative; the other rows cover only half the statements seen. Most likely the DBA should increase the maximum table size so that more of the rows counted in the DIGEST = NULL row would be counted using more specific rows instead. To do this, set the performance_schema_digests_size system variable to a larger value at server startup.

So, in MySQL 5.7 you could count the ratio with a statement like this:

SELECT (SELECT COUNT_STAR
          FROM performance_schema.events_statements_summary_by_digest
         WHERE digest IS NULL)
       /
       (SELECT SUM(COUNT_STAR)
          FROM performance_schema.events_statements_summary_by_digest
          WHERE digest IS NOT NULL)
       * 100 AS digest_ratio;


According to Configuring database flags for MySQL guide you could set performance_schema_digests_size in Cloud SQL, so this is what you should do if you find out that too many statements are grouped together in the common "other" bucket (i.e. WHERE digest IS NULL). Please, consider that the maximum number of rows in the table is autosized at server startup (default) and the maximum size you could set is 1 048 576. For example on a moderately loaded instance with about 15 schemas I just observed an autosized value of 10 000 and 0.4592 ratio, so the size was properly selected by the server and does not need increasing.
On Tuesday, February 2, 2021 at 10:02:20 PM UTC+2 George (Cloud Platform Support) wrote:
Because Cloud SQL for MySQL is a managed service, it restricts access to certain system procedures and tables that require advanced privileges. In Cloud SQL, customers cannot create or have access to users with superuser attributes. Your operation requires super privileges, so you won't be able to execute it. Related information can be found on the "MySQL users" page
On Tuesday, 02 February 2021 at 12:35:34 UTC-5 dhawal....@rubrik.com wrote:

Hi all, 

I'm using 2nd Gen MySQL 5.7. I want to collect data from performance_schema tables, such as "events_statements_summary_by_digest". However, this table has a max size and I think it will require periodic truncation to insert new rows. But when I run truncate table or delete command on this table, I get permission denied error

mysql> truncate table performance_schema.events_statements_summary_by_digest ; ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER privilege(s) for this operation

Without deleting old rows, we won't really be able to use perf schema as it'll fill all the rows quickly. Is there a way to delete rows from this table? Are folks able to use performance schema without periodic truncation? 

--
You received this message because you are subscribed to the Google Groups "Google Cloud SQL discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-cloud-sql-discuss+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-cloud-sql-discuss/b7c770eb-63c9-4f8d-83f0-b338b7361acdn%40googlegroups.com.

No comments:

Post a Comment