1. MY SQL SKILLS KILLED THE SERVER Dave Ferguson @dfgrumpy
dev.Objective() 2015
2. WHO AM I?
   - I am an Adobe Community Professional
   - I started building web applications a long time ago
   - Contributor to Learn CF in a Week
   - I have a ColdFusion podcast called CFHour w/ Scott Stroz (@boyzoid) (please listen)
   - 3x California State Taekwondo Weapons Champion
3. WHAT WILL WE COVER?
   - Running queries
   - When good SQL goes bad
   - Bulk processing
   - Large volume datasets
   - Indexes
   - Outside influences
4. (I KNOW SQL) WHY AM I HERE?
5. Because you have probably written something like this
6. select * from myTable
7. I can write SQL in my sleep
    select * from myTable where id = 2
8. I can write joins and other complex SQL
    select mt.*
    from myTable mt
    join myOtherTable mot on mt.id = mot.id
    where mot.id = 2
9. I might even create a table
    CREATE TABLE `myFakeTable` (
      `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
      `someName` varchar(150) NOT NULL DEFAULT '',
      `someDescription` text,
      `type` varchar(50) DEFAULT NULL,
      `status` int(11) NOT NULL,
      PRIMARY KEY (`id`)
    );
10. But, how do you know if what you did was the best / most
efficient way to do it?
11. Did the internet tell you it was right?
12. Did you get some advice from someone?
13. My app works fine. It has thousands of queries and we only
see slowness every once in a while.
14. Have you ever truly looked at what your queries are
doing?
15. Most developers don't bother. They leave all that technical
database stuff up to the DBA. But what if you are the developer AND
the DBA?
16. QUERY EXECUTION
   - Query Plan: uses execution contexts; created for each degree of parallelism for a query
   - Execution Context: specific to the query being executed; created for each query
17. Execution Context & Query Plan
18. Have you ever looked at a query plan? Do you know what a
query plan is?
19. Query Plan, In the event you were curious
20. WHAT A QUERY PLAN WILL TELL YOU
   - Path taken to get data (almost like a Java stack trace)
   - Index usage: how the indexes are being used
   - Cost of each section of the plan
   - Possible suggestions for performance improvement
   - Whole bunch of other stuff
21. How long are plans / contexts kept?
   - 1 hour
   - 1 day
   - Until SQL Server restarts
   - Discards it immediately
   - The day after forever
   - Until the server runs out of cache space
22. What can cause plans to be flushed from cache?
   - Forced via code
   - Memory pressure
   - Alter statements
   - Statistics updates
   - auto_update statistics on
23. HOW CAN WE KEEP THE DATABASE FROM THROWING AWAY THE
PLANS?
24. MORE IMPORTANTLY, HOW CAN WE GET THE DATABASE TO USE THE
CACHED PLANS?
25. SIMPLE ANSWER
   - Force it
   - Use params
   - Use stored procedures
   - Get more RAM
   - Use fewer queries
26. HOW DOES SQL DETERMINE IF THERE IS A QUERY PLAN?
27. Something Like this
28. THIS QUERY WILL CREATE AN EXECUTION CONTEXT...
    select id, name from myTable where id = 2
29. ...THAT WILL NOT BE USED BY THIS QUERY.
    select id, name from myTable where id = 5
30. WHY IS THAT? Well, the queries are not the same.
31. According to the SQL optimizer, this query
    select id, name from myTable where id = 2
   and this query
    select id, name from myTable where id = 5
   are not the same. So, they each get their own execution plan.
32. PLANS CAN BECOME DATA HOGS
    select id, name from myTable where id = 2
   If the query above ran 5,000 times over the course of an hour (with different ids), you could have that many plans cached. That could equal around 120 MB of cache space!
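To make the arithmetic concrete, here is a toy sketch in Python. It is not how any real database stores plans: a plain dict keyed by the exact statement text stands in for the plan cache, and the per-plan size is simply back-derived from the "5,000 plans, about 120 MB" figure above (roughly 24 KB each). The names `count_cached_plans` and `APPROX_PLAN_KB` are made up for this illustration.

```python
# Toy model only: a dict keyed by the exact statement text stands in for a plan cache.
# The per-plan size is back-derived from the slide's figure, not measured.
APPROX_PLAN_KB = 120 * 1024 / 5000  # about 24.6 KB per cached plan


def count_cached_plans(statements):
    """Return how many distinct cache entries a list of statement texts produces."""
    cache = {}
    for sql in statements:
        cache.setdefault(sql, "compiled plan placeholder")
    return len(cache)


# 5,000 calls with the id concatenated into the SQL text: every text is different.
literal_sql = [f"select id, name from myTable where id = {i}" for i in range(5000)]

# 5,000 calls with a placeholder: the text is identical each time.
param_sql = ["select id, name from myTable where id = ?"] * 5000

literal_plans = count_cached_plans(literal_sql)
param_plans = count_cached_plans(param_sql)

print(literal_plans, "plans, approx", round(literal_plans * APPROX_PLAN_KB / 1024), "MB")
print(param_plans, "plan, approx", round(param_plans * APPROX_PLAN_KB), "KB")
```

The only thing that differs between the two lists is whether the id is part of the statement text, which is exactly what decides whether a cached plan can be reused.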
33. TO RECAP: EXECUTION CONTEXTS ARE GOOD. TOO MANY ARE BAD.
34. USING QUERY PARAMS: the secret sauce to plan reuse
35. Using a simple query, let's add a param for the id.
    select a.ARTID, a.ARTNAME from ART a where a.ARTID =
36. THE QUERY OPTIMIZER SEES THIS
    select a.ARTID, a.ARTNAME from ART a where a.ARTID = ?
37. THE DEBUG OUTPUT LOOKS LIKE THIS
    testQuery (Datasource=cfartgallery, Time=1ms, Records=1) in /xxx/x.cfm
    select a.ARTID, a.ARTNAME from ART a where a.ARTID = ?
    Query Parameter Value(s) -
    Parameter #1(cf_sql_integer) = 5
38. IT EVEN WORKS ON LISTS
    testQuery (Datasource=cfartgallery, Time=8ms, Records=5) in /xxx/x.cfm
    select a.ARTID, a.ARTNAME from ART a where a.ARTID in (?,?,?,?,?)
    Query Parameter Value(s) -
    Parameter #1(CF_SQL_CHAR) = 1
    Parameter #2(CF_SQL_CHAR) = 2
    Parameter #3(CF_SQL_CHAR) = 3
    Parameter #4(CF_SQL_CHAR) = 4
    Parameter #5(CF_SQL_CHAR) = 5
39. MORE ACCURATELY, THEY WORK ANYWHERE YOU WOULD HAVE DYNAMIC INPUT...
    testQuery (Datasource=cfartgallery, Time=3ms, Records=1) in /xxx/x.cfm
    select a.ARTID, a.ARTNAME,
      ( select count(*) from ORDERITEMS oi where oi.ARTID = ? ) as ordercount
    from ART a where a.ARTID in (?)
    Query Parameter Value(s) -
    Parameter #1(cf_sql_integer) = 5
    Parameter #2(cf_sql_integer) = 5
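The same idea outside of ColdFusion, as a minimal sketch. It assumes Python's built-in sqlite3 module and an in-memory table that only borrows the ART column names from the slides; the helper names `art_by_id` and `art_by_ids` and the sample rows are invented for the example. The point is that the SQL text handed to the database stays constant while the values travel separately.

```python
import sqlite3

# In-memory stand-in for the cfartgallery ART table (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("create table ART (ARTID integer primary key, ARTNAME text)")
conn.executemany(
    "insert into ART (ARTID, ARTNAME) values (?, ?)",
    [(i, f"Piece {i}") for i in range(1, 11)],
)


def art_by_id(art_id):
    """Single value: the SQL text never changes, only the bound parameter does."""
    sql = "select a.ARTID, a.ARTNAME from ART a where a.ARTID = ?"
    return conn.execute(sql, (art_id,)).fetchall()


def art_by_ids(art_ids):
    """List of values: one placeholder per item, like the IN (?,?,?,?,?) slide."""
    placeholders = ",".join("?" for _ in art_ids)
    sql = (
        "select a.ARTID, a.ARTNAME from ART a "
        f"where a.ARTID in ({placeholders}) order by a.ARTID"
    )
    return conn.execute(sql, list(art_ids)).fetchall()


print(art_by_id(5))
print(art_by_ids([1, 2, 3, 4, 5]))
```

One thing to keep in mind with lists: a list of three ids and a list of five ids produce different statement text, so each list length would be treated as its own query.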
40. When can plans cause more harm than help?
   - When your data structure changes
   - When data volume grows quickly
   - When you have data with a high degree of cardinality
41. How do I deal with all this data?
42. What do I mean by large data sets?
   - Tables over 1 million rows
   - Large databases
   - Heavily denormalized data
43. Ways to manage large data
   - Only return what you need (no select *)
   - Try to page the data in some fashion
   - Optimize indexes to speed up where clauses
   - Avoid using triggers on large volume inserts
   - Reduce any post-query processing as much as possible
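A small sketch of the first two points, again using Python's sqlite3 as a stand-in database. The table contents and the `fetch_page` helper are hypothetical. It names only the columns it needs and pages by "id greater than the last id seen", which lets the primary key do the work. The `limit` syntax shown is what SQLite and MySQL accept; SQL Server spells paging differently (for example `OFFSET ... FETCH`).

```python
import sqlite3

# Stand-in table with a wide "notes" column we deliberately never select.
conn = sqlite3.connect(":memory:")
conn.execute("create table myTable (id integer primary key, name text, notes text)")
conn.executemany(
    "insert into myTable (id, name, notes) values (?, ?, ?)",
    [(i, f"row {i}", "x" * 200) for i in range(1, 1001)],
)


def fetch_page(after_id=0, page_size=25):
    """Return one page of (id, name) rows whose id is greater than after_id."""
    sql = (
        "select id, name from myTable "  # named columns only, no select *
        "where id > ? order by id limit ?"
    )
    return conn.execute(sql, (after_id, page_size)).fetchall()


first = fetch_page()
second = fetch_page(after_id=first[-1][0])  # continue after the last id we saw
print(first[0], first[-1], second[0])
```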
44. Inserting / Updating large datasets
   - Reduce calls to the database by combining queries
   - Use the bulk loading features of your database
   - Use XML/JSON to load data into the database
45. Combining Queries: Instead of doing this
46. Do this
47. Gotchas in query combining
   - Errors could cause the whole batch to fail
   - Overflowing the allowed query string size
   - Database locking can be problematic
   - Difficult to get any usable result from the query
48. Upsides of query combining
   - Reduces network calls to the database
   - Processed as a single batch in the database
   - Generally processed many times faster than doing the inserts one at a time
   - I have used this to insert over 50k rows into MySQL in under one second.
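Here is one possible shape for a combined insert, sketched with Python's sqlite3 rather than the database from the talk. The `insert_combined` helper, the chunk size of 100 and the trimmed-down `myFakeTable` are assumptions for the example. Rows go in as multi-row `insert ... values (...), (...)` statements, chunked so a single statement never grows without bound, and wrapped in one transaction so a failure rolls everything back instead of leaving half a batch behind.

```python
import sqlite3

# Trimmed-down version of the earlier myFakeTable, in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table myFakeTable (id integer primary key, someName text, status integer)"
)


def insert_combined(rows, chunk_size=100):
    """Insert (someName, status) tuples using one multi-row INSERT per chunk."""
    statements = 0
    with conn:  # one transaction: commits on success, rolls back on any error
        for start in range(0, len(rows), chunk_size):
            chunk = rows[start:start + chunk_size]
            values = ",".join("(?, ?)" for _ in chunk)
            params = [value for row in chunk for value in row]
            conn.execute(
                f"insert into myFakeTable (someName, status) values {values}",
                params,
            )
            statements += 1
    return statements


rows = [(f"name {i}", i % 3) for i in range(1000)]
print(insert_combined(rows), "statements for", len(rows), "rows")
```

Chunking is the hedge against the "overflowing allowed query string size" gotcha; the right chunk size depends on your database's limits and is worth measuring rather than guessing.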
49. Indexes: the secret art of a faster select
50. Index Types
   - Unique: primary key or row ID
   - Covering: a collection of columns indexed in an order that matches where clauses
   - Clustered: the way the data is physically stored; a table can only have one
   - NonClustered: only contains indexed data with a pointer back to the source data
51. Seeking and Scanning
   - Index SCAN (table scan): touches all rows; useful only if the table contains a small number of rows
   - Index SEEK: only touches rows that qualify; useful for large datasets or highly selective queries
   - Even with an index, the optimizer may still opt to perform a scan
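If you have never looked at a plan, this is a cheap way to see scan versus seek for yourself. The sketch uses SQLite's `explain query plan` through Python's sqlite3, so the wording of the output is SQLite's (it reports SCAN and SEARCH), not what SQL Server or MySQL would print. The `users` table, the index name and the `plan_for` helper are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table users (id integer primary key, userName text, status integer)"
)
conn.executemany(
    "insert into users (userName, status) values (?, ?)",
    [(f"user{i}", i % 2) for i in range(10000)],
)


def plan_for(sql, params=()):
    """Return the human-readable detail text of each step in SQLite's query plan."""
    rows = conn.execute("explain query plan " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column


query = "select id, userName from users where userName = ?"

before = plan_for(query, ("user42",))  # no index on userName yet
conn.execute("create index idx_users_userName on users (userName)")
after = plan_for(query, ("user42",))   # same query, index now available

print("without index:", before)
print("with index:   ", after)
```

The query text is identical in both runs; only the presence of the index changes the path the database takes.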
52. To index or not to index
   DO INDEX
   - Large datasets where 10-15% of the data is usually returned
   - Columns used in where clauses with high cardinality (e.g. a user name column where values are unique)
   DON'T INDEX
   - Small tables
   - Columns with low cardinality (any column with only a couple of values)
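One rough way to put a number on "high" versus "low" cardinality is to divide the count of distinct values by the total row count. The sketch below does that with Python's sqlite3 on a made-up `users` table; the `selectivity` helper is hypothetical, and there is no magic threshold, so treat the result as a hint and not a rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table users (id integer primary key, userName text, status integer)"
)
conn.executemany(
    "insert into users (userName, status) values (?, ?)",
    [(f"user{i}", i % 2) for i in range(1000)],
)


def selectivity(table, column):
    """Distinct values divided by total rows: near 1.0 is high cardinality, near 0 is low."""
    # Identifiers cannot be bound as parameters, so only pass trusted names here.
    sql = f"select count(distinct {column}) * 1.0 / count(*) from {table}"
    return conn.execute(sql).fetchone()[0]


print("userName:", selectivity("users", "userName"))  # every value unique
print("status:  ", selectivity("users", "status"))    # only two values
```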
53. Do I really need an index?
54. It Depends.
55. Really it Depends!
56. Outside influences
57. Other things that can affect performance
   - Processor load
   - Memory pressure
   - Hard drive I/O
   - Network
58. Processor
   - Give the SQL Server process CPU priority
   - Watch for other processes on the server using excessive CPU cycles
   - Have enough cores to handle your database activity
   - Try to keep average processor load below 50% so the system can handle spikes gracefully
59. Memory (RAM)
   - Get a ton (RAM is cheap)
   - Make sure you have enough RAM to keep your server from doing excess paging
   - Make sure your DB is using the RAM in the server
   - Allow the DB to use RAM for cache
   - Watch for other processes using excessive RAM
60. Drive I/O
   - Drive I/O is usually the largest bottleneck on the server
   - Drives can only perform one operation at a time
   - Make sure you don't run out of space
   - Purge log files
   - Don't store all DB and log files on the same physical drives
   - On Windows, don't put your DB on the C: drive
   - If possible, use SSD drives for tempdb or other highly transactional DBs
   - Log drives should be in write priority mode
   - Data drives should be in read priority mode
61. Network
   - Only matters if the app server and DB server are on separate machines (they should be)
   - Minimize network hops between servers
   - Watch for network traffic spikes that slow data retrieval
   - Only retrieving the data you need will speed up retrieval from DB server to app server
   - Split network traffic on the SQL server across multiple NICs so that general network traffic doesn't impact DB traffic
62. Some Important Database Statistics
63. Important stats
   - Recompiles: recompile of a proc while running shouldn't occur; caused by code in the proc or memory issues
   - Latch Waits: low-level lock inside the DB; should be sub 10 ms
   - Lock Waits: data lock wait caused by a thread waiting for another lock to clear
   - Full Scans: select queries not using indexes
64. Important stats continued...
   - Cache Hit Ratio: how often the DB is hitting memory cache vs. disk
   - Disk Read / Write times: access time or write times to drives
   - SQL Processor time: SQL server processor load
   - SQL Memory: amount of system memory being used by SQL
65. Where SQL goes wrong (Good examples of bad SQL)
66. Inline queries that, well, shouldn't be
67. Over joining data
68. Transactions Do you see the issue?
69. THAT IS ALL. THANK YOU!
   Dave Ferguson @dfgrumpy www.cfhour.com
   dev.Objective() 2015
   Don't forget to fill out the survey