MySQL slow concurrent/parallel queries in Python

Analysis of GLOBAL STATUS and VARIABLES:
 

Observations:

  • Version: 5.7.33-log
  • 8 GB of RAM
  • Uptime = 14d 05:11:54
  • 5.98 Queries/sec: 3.39 Questions/sec

The More Important Issues:

What is going on that would cause Max_used_connections to grow into the hundreds? When that goes about a few hundred, MySQL stumbles over itself and seems to “freeze up”. It is better to avoid allowing so many connections run at the same time.

Having a large value for max_allowed_packet can risk swapping.

With long_query_time = 0, every query is “slow”. Yet with slow_query_log = OFF, you are getting no information about the processing. Re-think what to do here.

Some specific suggestions:

binlog_cache_size = 64M
max_connections = 300
max_allowed_packet = 50M

Why so many rows being deleted? See if TRUNCATE would be more useful.

Unless the data will grow, the 8GB instance is overkill.

Check whether all prepared statements are “Closed”.

Details and other observations:

( Binlog_cache_disk_use / Binlog_cache_use ) = 8,557 / 13019 = 65.7% — Spill to disk — Increase binlog_cache_size (now 32768)

( innodb_lru_scan_depth * innodb_page_cleaners ) = 1,024 * 4 = 4,096 — Amount of work for page cleaners every second. — “InnoDB: page_cleaner: 1000ms intended loop took…” may be fixable by lowering lru_scan_depth: Consider 1000 / innodb_page_cleaners (now 4). Also check for swapping.

( innodb_lru_scan_depth ) = 1,024 — innodb_lru_scan_depth is a very poorly named variable. A better name would be innodb_free_page_target_per_buffer_pool. It is a number of pages InnoDB tries to keep free in each buffer pool instance to speed up read and page creation operations. — “InnoDB: page_cleaner: 1000ms intended loop took…” may be fixed by lowering lru_scan_depth

( innodb_io_capacity ) = 200 — When flushing, use this many IOPs. — Reads could be slugghish or spiky. Use 2000 if using SSD drive.

( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 65,999 / 327680 = 20.1% — PCt of buffer_pool currently not in use — innodb_buffer_pool_size (now 5368709120) is bigger than necessary?

( innodb_flush_neighbors ) = innodb_flush_neighbors = 1 — A minor optimization when writing blocks to disk. — Use 0 for SSD drives; 1 for HDD.

( innodb_io_capacity ) = 200 — I/O ops per second capable on disk . 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor. Limits write IO requests per second (IOPS). — For starters: HDD: 200; SSD: 2000.

( innodb_adaptive_hash_index ) = innodb_adaptive_hash_index = ON — Whether to use the adapative hash (AHI). — ON for mostly readonly; OFF for DDL-heavy

( innodb_flush_log_at_trx_commit ) = 1 — 1 = secure; 2 = faster — (You decide) Use 1, along with sync_binlog (now 1)=1 for the greatest level of fault tolerance. 0 is best for speed. 2 is a compromise between 0 and 1.

( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF — Whether to log all Deadlocks. — If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.

( innodb_purge_threads ) = 1 — Number of threads to clean up history list. — If you have a lot of writes, recommend 4 in versions 5.6 and 10.0 or later.

( max_connections ) = 2,000 — Maximum number of connections (threads). Impacts various allocations. — If max_connections (now 2000) is too high and various memory settings are high, you could run out of RAM.

( 176000 * max_connections ) = (176000 * 2000) / 8192M = 4.1% — Estimate of ram usage due to the size of max_connections. max_connections (now 2000) is somewhat high

( max_allowed_packet ) = 999,999,488 / 8192M = 11.6%

— If you do not have large blobs (etc) to load, then decrease the value. Else decrease innodb_buffer_pool_size (now 5368709120) to make room. Swapping is terrible for performance.

( innodb_ft_result_cache_limit ) = 2,000,000,000 / 8192M = 23.3% — Byte limit on FULLTEXT resultset. (It grows as needed.) — Lower the setting.

( character_set_client ) = character_set_client = latin1

( character_set_connection ) = character_set_connection = latin1

( character_set_results ) = character_set_results = latin1

( (Com_show_create_table + Com_show_fields) / Questions ) = (306 + 226110) / 4161091 = 5.4% — Naughty framework — spending a lot of effort rediscovering the schema. — Complain to the 3rd party vendor.

( local_infile ) = local_infile = ON

— local_infile (now ON) = ON is a potential security issue

( Created_tmp_disk_tables / Questions ) = 1,527,791 / 4161091 = 36.7% — Pct of queries that needed on-disk tmp table. — Better indexes / No blobs / etc.

( Created_tmp_disk_tables / Created_tmp_tables ) = 1,527,791 / 1840986 = 83.0% — Percent of temp tables that spilled to disk — Maybe increase tmp_table_size (now 16777216) and max_heap_table_size (now 16777216); improve indexes; avoid blobs, etc.

( Com_rollback / (Com_commit + Com_rollback) ) = 490,964 / (100382 + 490964) = 83.0% — Rollback : Commit ratio — Rollbacks are costly; change app logic

( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (168913 + 3650 + 23593 + 1066) / 100382 = 1.96 — Statements per Commit (assuming all InnoDB) — Low: Might help to group queries together in transactions; High: long transactions strain various things.

( ( Com_stmt_prepare - Com_stmt_close ) / ( Com_stmt_prepare + Com_stmt_close ) ) = ( 118 - 102 ) / ( 118 + 102 ) = 7.3% — Are you closing your prepared statements? — Add Closes.

( Com_stmt_close / Com_stmt_prepare ) = 102 / 118 = 86.4% — Prepared statements should be Closed. — Check whether all Prepared statements are “Closed”.

( Com_admin_commands / Queries ) = 3,167,117 / 7344901 = 43.1% — Percent of queries that are “admin” commands. — What’s going on?

( Com__biggest ) = Com__biggest = Com_admin_commands — Which of the “Com_” metrics is the biggest. — Normally it is Com_select (now 2792924). If something else, then it may be a sloppy platform, or may be something else.

( binlog_format ) = binlog_format = MIXED — STATEMENT/ROW/MIXED. — ROW is preferred by 5.7 (10.3)

( slow_query_log ) = slow_query_log = OFF — Whether to log slow queries. (5.1.12)

( Slow_queries ) = (Syncs) / 1228314 = 5.1 /sec — Frequency (Slow queries per sec) — Rework slow guys; improve indexes; watch disk space for slow log file

( Slow_queries / Questions ) = 6,269,663 / 4161091 = 150.7% — Frequency (% of all queries) — Find slow queries; check indexes.

( log_slow_slave_statements ) = log_slow_slave_statements = OFF — (5.6.11, 5.7.1) By default, replicated statements won’t show up in the slowlog; this causes them to show. — It can be helpful in the slowlog to see writes that could be interfering with Replica reads.

( Max_used_connections ) = 626 — High-water mark for connections — Lots of inactive connections is OK; over 100 active connections is likely to be a problem. Max_used_connections (now 626) does not distinguish them; Threads_running (now 1) is instantaneous.

( connect_timeout ) = 86,400 — DOS attack vulnerability if too large

( thread_stack * max_connections ) = (262144 * 2000) / 8192M = 6.1% — Bare minimum memory allocation for max_connections. — Lower max_connections (now 2000)

Abnormally small:

Innodb_log_writes / Innodb_log_write_requests = 0.11%

Abnormally large:

Com_purge_before_date = 12 /HR
Com_rename_table = 0.018 /HR
Com_replace_select = 1.5 /HR
Com_show_create_event = 0.0059 /HR
Com_show_create_trigger = 0.059 /HR
Com_show_profile = 0.0088 /HR
Com_show_profiles = 0.0029 /HR
Handler_delete = 114 /sec
Innodb_buffer_pool_pages_flushed / max(Questions, Queries) = 2.39
Innodb_buffer_pool_write_requests / Innodb_buffer_pool_pages_flushed = 213
Innodb_rows_deleted = 114 /sec
Max_execution_time_set = 74
Max_execution_time_set / Com_select = 0.00%
Performance_schema_file_instances_lost = 1
Prepared_stmt_count = 4
Ssl_accepts = 10,073
Ssl_default_timeout = 7,200
Ssl_finished_accepts = 10,073
Ssl_session_cache_misses = 4,432
Ssl_session_cache_overflows = 3,512
Ssl_used_session_cache_entries = 110
Ssl_verify_depth = 1.84e+19
Ssl_verify_mode = 5
Threads_connected = 339
Uptime - Uptime_since_flush_status = 1.12e+6
net_read_timeout = 86,400
net_write_timeout = 86,400
wait_timeout = 86,400

Abnormal strings:

gtid_mode = OFF_PERMISSIVE
innodb_data_home_dir = /rdsdbdata/db/innodb
innodb_fast_shutdown = 1
log_output = TABLE
log_statements_unsafe_for_binlog = OFF
optimizer_trace = enabled=off,one_line=off
optimizer_trace_features = greedy_search=on, range_optimizer=on, dynamic_range=on, repeated_subselect=on
relay_log_recovery = ON
slave_rows_search_algorithms = TABLE_SCAN,INDEX_SCAN
sql_slave_skip_counter = 0
time_zone = Europe/Amsterdam

Leave a Comment