From Crash to Testcase: a Debugging Primer
(Or: how to get your bugs fixed really quickly and be loved by developers)

Roel Van de Paar, QA Lead, Percona
22 April 1:30pm - 4:30pm @ Ballroom D


Slides + Cheat Sheet from my "MySQL From Crash to Testcase: a Debugging Primer" talk @ PLMCE


Database Issues Cheat Sheet

Oops! My Server CRASHED!

Or did it?

Application Error or not?

● Application side: Signs that database server is alive?

● Server side: Check mysql CLI (Command Line Interface)

● Check Error Log (data_dir/host_name.err)

– Start at end and work your way up

– Happy day if the last line reads “Writing a core file”

● Check for cores (data_dir/core.pid, system locations)
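The checks above can be sketched in shell. This is a minimal sketch, not a definitive tool: `log_ends_with_core` and `list_cores` are hypothetical helper names, and the `core*` glob is an assumption about how core files are named on the system in question.

```shell
# Hypothetical helper: succeed if the last line of the error log
# reads "Writing a core file".
log_ends_with_core() {
  tail -n 1 "$1" | grep -q 'Writing a core file'
}

# Hypothetical helper: list regular files directly inside the data
# directory whose names start with "core" (naming is an assumption).
list_cores() {
  find "$1" -maxdepth 1 -type f -name 'core*'
}
```

Typical use would be along the lines of `log_ends_with_core data_dir/host_name.err && list_cores data_dir`, with the paths replaced by the real data directory and error log.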

Application Errors: You're It!

● Bad News?

– You need to fix it

– “original developer” has “left the building”

● Good News?

– Your A+ developer incorporated app logs

“Pay me now or pay me later”

Debugging Your App

Related Misconfigurations: Buffer & File Sizing, Communication Errors

● Comms Issue

– Hardware

– Comms buffer settings on server set too small etc.

● max_allowed_packet http://dev.mysql.com/doc/refman/5.6/en/packet-too-large.html

● Other non-comms Issues

– Buffer settings on server

● Example: [ERROR] mysqld: Sort aborted

– Can be due to a small sort_buffer_size

● Example:
InnoDB: ERROR: the age of the last checkpoint is 724774680,
InnoDB: which exceeds the log group capacity 724770200.

– Due to innodb_log_file_size being too small

● Example: per-session var set too high thereby causing slowness, OOM etc.
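As a rough illustration of the message-to-setting pairs above, the following shell sketch maps an error-log message to the setting worth checking first. `suspect_setting` is a hypothetical name, the patterns cover only the examples on this slide, and a match is a hint rather than a diagnosis ("Sort aborted", for instance, has other causes too).

```shell
# Hypothetical helper: print the setting the slide associates with a
# given message. Only the slide's own examples are covered.
suspect_setting() {
  case "$1" in
    *"Sort aborted"*)                   echo "sort_buffer_size" ;;
    *"exceeds the log group capacity"*) echo "innodb_log_file_size" ;;
    *"max_allowed_packet"*)             echo "max_allowed_packet" ;;
    *)                                  echo "unknown" ;;
  esac
}
```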

Severity/Error Level

Server Crash?

Always check the Error Log 1st

Error Log Analysis?

● 2013-03-10 06:58:50 19481 [Note] /ssd/Percona-Server-5.6.8-alpha60.2-313-debug.Linux.x86_64/bin/mysqld-debug: ready for connections.
Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_6/tmp/master.sock' port: 13100 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
mysqld-debug: /ssd/ps56-univ-log-archive-qa/Percona-Server-5.6.8-alpha60.2/sql/protocol.cc:518: void Protocol::end_statement(): Assertion `0' failed.
04:00:32 UTC - mysqld got signal 6 ;

● 2013-03-17 16:17:44 7f45c9e96700 InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: File name /tmp/1363526254145352487/ib_log_archive_0000000000045568
2013-03-17 16:17:44 7f45c9e96700 InnoDB: File operation call: 'open' returned OS error 71.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: Cannot continue operation.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: Assertion failure in thread 13993771698764813 in file os0file.cc line 62
InnoDB: Failing assertion: 0

Error Log Analysis?

● Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_4/tmp/master.sock' port: 13060 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
2013-03-13 02:20:59 8466 [ERROR] InnoDB: Unable to lock /ssd/tmp/ib_log_archive_0000000687614464, error: 11
2013-03-13 02:20:59 8466 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
InnoDB: Cannot create or open archive log file /ssd/tmp/ib_log_archive_0000000687614464.
InnoDB: Cannot continue operation.
InnoDB: Check that the log archive directory exists,
InnoDB: you have access rights to it, and
InnoDB: there is space available.

● 2013-03-11 03:34:23 7f9866dca700 InnoDB: Assertion failure in thread 140292537493248 in file row0purge.cc line 459
InnoDB: Failing assertion: 0x20UL & rec_get_info_bits( btr_cur_get_rec(btr_cur), dict_table_is_comp(index->table))

Error Log Analysis: Initial Tips

● RRR++: Read! Read! Read! Read Again! (And once more)

– Why?

● What's the problem?

– Assert vs. Error vs. Crash vs. OS vs. OOM vs. Sigx vs. Halt vs. Kill vs. Corruption vs. Deadlocks vs. Buffer & File sizing vs. Communication Errors vs. SQL Errors vs. Warnings vs. 3rd Party Messages vs. …

Analyzing the Error Log & all it contains

:$ / RRR++

Research++

Which Query caused trouble?

● Crashing query: In Error log:

– Query (3ff000002300): select f1 from t2 limit 5

● Faulting query:

– 130325 6:07:46 [ERROR] mysqld: Sort aborted: Query execution was interrupted
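Pulling the crashing query out of an error log can be sketched as below. `crashing_query` is a hypothetical helper, and it assumes the crash report uses the `Query (address): text` form shown above, with the address in lowercase hex.

```shell
# Hypothetical helper: print the statement text from every
# "Query (<hex address>): <statement>" line. Reads the named files,
# or standard input when no file is given.
crashing_query() {
  sed -n 's/^.*Query ([0-9a-f]*): //p' "$@"
}
```

For example, `crashing_query master.err` would print one statement per crash report found in `master.err`.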

Resolving Stacks

[roel@localhost log]$ grep "mysqld(_" master.err | sed 's/^.*mysqld//'
(_ZN8Protocol13end_statementEv+0x1db)[0x525ce7]
(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1496)[0x5a2e6d]
(_Z10do_commandP3THD+0x284)[0x5a3702]
(_Z24do_handle_one_connectionP3THD+0x121)[0x648f1d]

[roel@localhost log]$ grep "mysqld(_" master.err | sed 's/^.*mysqld//' | c++filt
(Protocol::end_statement()+0x1db)[0x525ce7]
(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x1496)[0x5a2e6d]
(do_command(THD*)+0x284)[0x5a3702]
(do_handle_one_connection(THD*)+0x121)[0x648f1d]

SideTour: Which Query caused trouble: LES

● LES: Last Executed Statement

– LES Error log Extraction

/server/bin/mysqld(handle_one_connection+0x52)[0x649010]
/lib64/libpthread.so.0[0x333d007851]
/lib64/libc.so.6(clone+0x6d)[0x333cce890d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fe000009188): select `c3`,`c4` from `qa07` limit 10

– LES gdb Extraction

Select 'do_command' frame in crashing thread using thread & frame, then use: p thd->query_string.string.str

http://www.mysqlperformanceblog.com/2012/09/09/obtain-last-executed-statement-from-optimized-core-dump/

Demo: /ssd/Percona-Server-5.5.29-rel30.0--debug.Linux.x86_64/data4

Assertions: General

● Assert: “I, developer x, assert that at this point y, x=0 (as an example) should not be the case.”

– RRR++: SE, File, Line, Vars, Time

121204 7:45:06 InnoDB: Assertion failure in thread 1390 in file row0upd.c line 2023
InnoDB: Failing assertion: btr_pcur_restore_position(thr_get_trx(thr)->fake_changes ? BTR_SEARCH_TREE : BTR_MODIFY_TREE, pcur, mtr)

– RRR++: Are you a dev?

130127 0:20:37 InnoDB: Assertion failure in thread 1396 in file row0sel.c line 115
InnoDB: Failing assertion: prefix_len >= sec_len
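The "File, Line" part of that reading exercise can be automated with a short shell sketch. `assert_locations` is a hypothetical helper; it assumes the InnoDB wording shown above ("Assertion failure in thread N in file F line L") and does not cover server-side assertions, which use a different format.

```shell
# Hypothetical helper: print each distinct file:line at which an
# InnoDB assertion failure was logged. Reads the named files, or
# standard input when no file is given.
assert_locations() {
  sed -n 's/^.*Assertion failure in thread [0-9]* in file \([^ ]*\) line \([0-9]*\).*$/\1:\2/p' "$@" | sort -u
}
```

The resulting `file:line` pairs make convenient search terms when looking for an existing bug report.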

Assertions: Type

● Server Assertion

mysqld: /mysql-5.5/sql/sql_string.cc:37: bool String::real_alloc(uint32): Assertion `arg_length > length' failed.

● InnoDB/XtraDB/Other SE Assertion (Seen most often)

InnoDB: Error: Waited for 600 secs for hash index ref_count (1) to drop to 0. index: "c32" table: "test/#sql2-4b20-a"

121203 3:48:15 InnoDB: Assertion failure in thread 352803136 in file dict0dict.c line 1883

Error Log Analysis Example

● 2013-03-10 06:58:50 19481 [Note] /ssd/Percona-Server-5.6.8-alpha60.2-313-debug.Linux.x86_64/bin/mysqld-debug: ready for connections.
Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_6/tmp/master.sock' port: 13100 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
mysqld-debug: /ssd/ps56-univ-log-archive-qa/Percona-Server-5.6.8-alpha60.2/sql/protocol.cc:518: void Protocol::end_statement(): Assertion `0' failed.
04:00:32 UTC - mysqld got signal 6 ;

● 2013-03-17 16:17:44 7f45c9e96700 InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: File name /tmp/1363526254145352487/ib_log_archive_0000000000045568
2013-03-17 16:17:44 7f45c9e96700 InnoDB: File operation call: 'open' returned OS error 71.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: Cannot continue operation.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: Assertion failure in thread 13993771698764813 in file os0file.cc line 62
InnoDB: Failing assertion: 0

Error Log Analysis Example

● Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_4/tmp/master.sock' port: 13060 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
2013-03-13 02:20:59 8466 [ERROR] InnoDB: Unable to lock /ssd/tmp/ib_log_archive_0000000687614464, error: 11
2013-03-13 02:20:59 8466 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
InnoDB: Cannot create or open archive log file /ssd/tmp/ib_log_archive_0000000687614464.
InnoDB: Cannot continue operation.
InnoDB: Check that the log archive directory exists,
InnoDB: you have access rights to it, and
InnoDB: there is space available.

● 2013-03-11 03:34:23 7f9866dca700 InnoDB: Assertion failure in thread 140292537493248 in file row0purge.cc line 459
InnoDB: Failing assertion: 0x20UL & rec_get_info_bits( btr_cur_get_rec(btr_cur), dict_table_is_comp(index->table))

Errors

● 130325 5:54:07 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
130325 5:54:07 [ERROR] Fatal error: Can't open and lock privilege tables: Table 'mysql.host' doesn't exist

● 130223 12:44:05 InnoDB: Error: Write to file ./apr/fr1 failed at offset 13.
InnoDB: 49152 bytes should have been written, only 0 were written.
InnoDB: Operating system error number 9.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 9 means 'Bad file descriptor'.

● 130325 5:36:11 [ERROR] /ssd/Server/bin/mysqld: Incorrect information in file: './test/v.frm'

● 130325 6:07:46 [ERROR] /ssd/Server/bin/mysqld: Sort aborted: Query execution was interrupted

● 130325 6:10:07 [ERROR] /ssd/Server/bin/mysqld: Sort aborted: Server shutdown in progress

Crashes

● thread 10 (LWP 23954):
+bt
#0 0x0000000004e3969c in pthread_kill () from /lib64/libpthread.so.0
#1 0x00000000007e2779 in my_write_core (sig=11) at /ssd/QA-16274-5.5/Percona-Server-5.5.28-rel29.3/mysys/stacktrace.c:433
#2 0x00000000006ab0ea in handle_fatal_signal (sig=11) at /ssd/QA-16274-5.5/Percona-Server-5.5.28-rel29.3/sql/signal_handler.cc:249
#3 <signal handler called>
#4 rbt_free_node (node=0x0, nil=0x1040f170) at /ssd/QA-16274-5.5/Percona-Server-5.5.28-rel29.3/storage/innobase/ut/ut0rbt.c:731
#5 0x00000000009935e9 in rbt_free_node (node=0x1040f1e0, nil=0x1040f170) at /ssd/QA-16274-5.5/Percona-Server-5.5.28-rel29.3/storage/innobase/ut/ut0rbt.c:731

● https://bugs.launchpad.net/percona-server/+bug/1111226 (Crash, Valgrind, Error Log)

OS/Hardware Related Message(s)

OS Related Issues

● https://bugs.launchpad.net/percona-server/+bug/806975

● OS errors: Perror

– <base_dir>/bin/perror
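A small shell sketch of how the OS error numbers in an error log might be collected and handed to perror. Both function names are hypothetical, the sketch assumes the InnoDB wording "Operating system error number N" seen in the earlier log excerpts, and it falls back to printing the bare number when `perror` is not on the PATH.

```shell
# Hypothetical helper: print each distinct OS error number mentioned
# in the log, in numeric order. Reads files or standard input.
os_error_numbers() {
  sed -n 's/^.*Operating system error number \([0-9][0-9]*\).*$/\1/p' "$@" | sort -un
}

# Hypothetical helper: explain each number with perror when available.
explain_os_errors() {
  os_error_numbers "$@" | while read -r n; do
    if command -v perror >/dev/null 2>&1; then
      perror "$n"
    else
      echo "OS error $n"
    fi
  done
}
```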

OOM

● CLI:

ERROR 5 (HY000): Out of memory (Needed 128992 bytes)

● Error Log:

110531 17:12:08 [ERROR] /home/philips/bzr/mysql-55-eb/sql/mysqld: Out of memory (Needed 129872 bytes)

● Use Valgrind [Memcheck, Massif]!

● https://bugs.launchpad.net/percona-server/+bug/1042946

– Could cause OOM

– Valgrind [Massif] helps

Sig=x & Kill -x

● Signal Numbers

Sig=x   Signal    Results   Note
4       SIGILL    Core      Illegal Instruction
6       SIGABRT   Core      Abort signal by abort()
8       SIGFPE    Core      Floating Point Exception
11      SIGSEGV   Core      Invalid Memory Reference

● Tip: you can use, for example, 'kill -11' to get a core dump at any given point; I've used this when seeing a memory allocation issue.
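The 'kill -11' tip can be tried safely against a throwaway process before using it on a server. The sketch below is only a demonstration under assumptions: `request_core` is a hypothetical helper name, a background `sleep` stands in for mysqld, and whether a core file is actually written depends on the core size limit (`ulimit -c`) and the system's core settings.

```shell
# Hypothetical helper: ask a running process for a core dump by
# sending it signal 11 (SIGSEGV).
request_core() {
  kill -11 "$1"
}

# Demonstration against a harmless background process, not mysqld.
sleep 60 &
demo_pid=$!
request_core "$demo_pid"
wait "$demo_pid"
# The shell reports 128 + signal number for a process ended by a signal.
demo_status=$?
echo "exit status: $demo_status"
```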

Halts

● Different from Sig=x.

– Server just “halts”

● Different from unplanned shutdown

– Server just “halts”

● Query logging

● Error log information

● (gdb breakpoints on exit functions)

Database Corruption

● Standard recovery (Not corruption):
130327 11:02:02 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
130327 11:02:02 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite buffer...

● Data Corruption:
120117 1:22:00 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 [...] 99
InnoDB: Apply batch completed
120117 1:22:02 InnoDB: Rolling back trx with id A01D1001, 13 rows to undo
InnoDB: Dropping table with id 54885 in recovery if it exists
InnoDB: Error: trying to load index PRIMARY for table nr92/#sql2-2e46-316ce0
InnoDB: but the index tree has been freed!
InnoDB: Rolling back of trx id A01D1001 completed

Deadlocks

● Deadlocks are fun

● User initiated vs actual server deadlock

– User Initiated:
mysql> select * from t1 where a = 2; #with corresponding other session
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

– Server deadlock

Programming deadlock

(3rd) Party Messages (RQG, Valgrind) vs.

● Other items which may write to the error log
- RQG
- Valgrind
- InnoDB status monitor

● Valgrind Example:
==12667== Thread 15:
==12667== Invalid read of size 8
==12667== at 0x93D473: lock_rec_block_validate (lock0lock.c:4969)
==12667== by 0x93D8D0: lock_print_info_all_transactions (lock0lock.c:5113)
==12667== by 0x862BAC: srv_printf_innodb_monitor (srv0srv.c:2263)
==12667== by 0x862DA5: srv_monitor_thread (srv0srv.c:2580)
==12667== by 0x4E34850: start_thread (in /lib64/libpthread-2.12.so)
==12667== by 0x19FCA6FF: ???
==12667== Address 0x16220c48 is 664 bytes inside a block of size 872 free'd
==12667== at 0x4C2695D: free (vg_replace_malloc.c:366)
==12667== by 0x952579: mem_area_free (mem0pool.c:519)
[...]
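To turn a Valgrind report like the one above into search terms, the function names in its stack frames can be extracted with a short shell sketch. `valgrind_frames` is a hypothetical helper; it assumes the `==PID== at|by 0xADDR: function (location)` layout shown above, skips unresolved `???` frames, and cuts a function name at its first space, so signatures that contain spaces are truncated.

```shell
# Hypothetical helper: print the function name from each "at"/"by"
# frame of a Valgrind report. Reads files or standard input.
valgrind_frames() {
  sed -nE 's/^==[0-9]+== +(at|by) 0x[0-9A-Fa-f]+: ([^ ]+) .*$/\2/p' "$@"
}
```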

SideTour: Googling++

(Demo)

Considering InnoDB Status & Recovery

● This is for 'Production' (non-test) systems only

● innodb_force_recovery

http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html

Crash Severity Summary

Core Dumps: Locations & Setup

Core Dumps: Analysis: using gdb

● LES example

● Google search example

● thread apply all bt

● Cheat Sheet

Core Dumps: Analysis: using WinDbg

● Demo

Valgrind: Introduction

RQG: Introduction

(Demo)

SideTour: Bash Scripting Fun!

(Demo)

SQL Errors

● Place?

– MySQL CLI

– Error Log

● Usually higher severity than in the CLI

– Application

Query Simplification

● Objective?

– Good testcase: Reduce a crashing/failing query to the minimum length and complexity required to still obtain the “desired” crash/error/issue (QA/Debugging)

– Optimizing the query: Reduce a query to obtain the same result without altering its functionality or future results with changed data (Support/Optimization)

Query Simplification: Clauses

mysql> SELECT * FROM t1 WHERE a>1 LIMIT 20;

mysql> SELECT * FROM t1;

Query Simplification: Query Split

mysql> SELECT * FROM (SELECT a FROM t1) AS res1 WHERE a > 2;

mysql> SELECT a FROM t1;
mysql> SELECT a FROM t1 WHERE a > 2;

Query Simplification: Generalize Data

mysql> SELECT "There was once a little bug" INTO @a;

mysql> SELECT "a" INTO @a;

mysql> SELECT 1 INTO @a;

Query Simplification: Move Data into Query

● Often involves changes of results, but that matters “little” (reproducibility factor may change) if the issue still reproduces.

mysql> SELECT a FROM t1;      a=column

mysql> SELECT "a" FROM t1;    a=?

mysql> SELECT 1 FROM t1;      1=digit 1 (x rows)

mysql> SELECT 1;
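The simplification ladder above amounts to trying ever simpler candidates and keeping the simplest one that still shows the issue. A minimal shell sketch of that loop follows; `simplest_reproducing` is a hypothetical helper, and the check it calls is a placeholder for whatever really decides "still reproduces" (for example, running the statement against a test server and inspecting the error log).

```shell
# Hypothetical helper: given a check command and candidate queries
# ordered from simplest to most complex, print the first candidate
# for which the check succeeds. Returns non-zero if none reproduces.
simplest_reproducing() {
  check=$1
  shift
  for q in "$@"; do
    if "$check" "$q"; then
      printf '%s\n' "$q"
      return 0
    fi
  done
  return 1
}
```

A call would look like `simplest_reproducing my_check 'SELECT 1;' 'SELECT 1 FROM t1;' 'SELECT a FROM t1;'`, where `my_check` is a function or script you supply that exits 0 when the issue still occurs.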

Query Simplification: Limit nr of Fields

mysql> SELECT a,b,x,z FROM t1;

mysql> SELECT * FROM t1;

mysql> SELECT a FROM t1;

mysql> SELECT 1 FROM t1;

Query Simplification: Simplify Table

mysql> SELECT a FROM t1;

mysql> ALTER TABLE t1 DROP COLUMN b;

Test Case Production

● Strategies:

– Bring server up -> re-run crashing query

● mysqldump -> add query -> “Simplify the query”

– Use randgen or Gypsy with grammars based on crashing query (or usually used queries if the crashing query is not known)

– Run Valgrind for some time on the server to check for programming errors
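The "mysqldump -> add query" step can be sketched in shell as follows. `build_testcase` is a hypothetical helper, and it assumes the dump file was already produced with mysqldump; the function only joins the dump and the crashing query into one replayable SQL file, which is then the starting point for simplification.

```shell
# Hypothetical helper: write a testcase file made of an existing SQL
# dump followed by the crashing query.
#   $1 = dump file, $2 = crashing query text, $3 = output file
build_testcase() {
  dump=$1
  query=$2
  out=$3
  {
    cat "$dump"
    printf '%s\n' "$query"
  } > "$out"
}
```

The resulting file can be replayed against a test server with the mysql client, for example `mysql test < testcase.sql`, where `test` is an assumed schema name.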

SideTour: Logging Bugs++

SideTour: A note on threading & reproducibility

● 1 Thread?

– Usually easy to reproduce

● But not always: timing, OS slicing, SE dives etc.

● Impossible to 100% match timing

● Many threads?

– Usually hard to reproduce (exception: RQG/Gypsy)

● Dev core analysis is usually quickest way forward

Percona!

Connect++

[email protected]