Detecting (and even preventing) SQL Injection
Using the Percona Toolkit and Noinject!
Justin Swanhart
Percona Live, April 2013
INTRODUCTION
Introduction
• Who am I?
• What do I do?
• Why am I here?
The tools
• MySQL (5.0+)
• Percona Toolkit
– pt-query-digest
– pt-fingerprint
• MySQL Proxy (0.8.0+)
• Apache and PHP 5.3+
WHAT IS SQL INJECTION?
What is SQL injection?
• SQL injection is an attack vector
– An attacker modifies the SQL queries which will be executed by the server
– But the attacker does not need to change the code on the server or get access to the server
What is SQL injection – interpolation (strings)
$username = $_GET['username'];
$sql =
"select 1
from users.users
where admin_flag=true
and username = '" . $username . "'";
$ wget http://host/path.php?username=bob
$ wget http://host/path.php?username="' or '1'='1"
The second request turns the last line of the query into:
and username = '' or '1'='1'
SQL injection!
Escape strings, or use prepared statements!
#escape string values
$username = mysqli_real_escape_string($conn, $_GET['username']);
$sql = "select … and username = '" . $username . "'";
#prepared statement
$username = $_GET['username'];
$stmt = mysqli_stmt_init($conn);
$sql = "select … and username = ?";
mysqli_stmt_prepare($stmt, $sql);
mysqli_stmt_bind_param($stmt, "s", $username);
mysqli_stmt_execute($stmt);
mysqli_stmt_close($stmt);
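The difference between interpolating a value and binding it can be tried out without a MySQL server. The following is a minimal sketch in Python, assuming an in-memory SQLite database as a stand-in for MySQL; the table contents and the function names is_admin_unsafe and is_admin_safe are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, admin_flag INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", 1), ("bob", 0)],
)

def is_admin_unsafe(username):
    # Vulnerable: the value is pasted into the SQL text.
    sql = ("SELECT 1 FROM users WHERE admin_flag = 1 "
           "AND username = '" + username + "'")
    return conn.execute(sql).fetchone() is not None

def is_admin_safe(username):
    # Safer: the value travels separately from the SQL text.
    sql = "SELECT 1 FROM users WHERE admin_flag = 1 AND username = ?"
    return conn.execute(sql, (username,)).fetchone() is not None

payload = "' or '1'='1"
print(is_admin_unsafe("bob"), is_admin_unsafe(payload))
print(is_admin_safe("bob"), is_admin_safe(payload))
```

With the interpolated query the payload is treated as SQL and the check passes for a non-admin; with the bound parameter the same payload is only ever compared as a string.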
What is SQL injection – interpolation (ints)
$user_id = $_GET['user_id'];
$sql =
"select 1
from users.users
where admin_flag=true
and user_id = " . $user_id;
…
$ wget http://host/path.php?user_id=1
$ wget http://host/path.php?user_id="1 or 1=1"
SQL injection!
Use type checking, or prepared statements!
#check that integers really are integers!
$user_id = $_GET['user_id'];
#is_numeric() also accepts floats such as 1.5 and 1e5; ctype_digit() is stricter
if(!is_numeric($user_id)) $user_id = "NULL";
$sql = "select … and user_id = " . $user_id;
#prepared statement
$user_id = $_GET['user_id'];
$sql = "select … and user_id = ?";
…
mysqli_stmt_bind_param($stmt, "i", $user_id);
mysqli_stmt_execute($stmt);
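The type check above can be made stricter than a general "is this numeric" test. A minimal sketch in Python, assuming user IDs are non-negative decimal integers; parse_user_id is a hypothetical helper name and the 18-digit cap is an arbitrary choice for the example.

```python
import re

# Plain ASCII digits only; the 18-digit cap is an arbitrary illustrative limit.
_DIGITS = re.compile(r"[0-9]{1,18}")

def parse_user_id(raw):
    """Return raw as an int if it is a plain decimal number, else None."""
    if isinstance(raw, str) and _DIGITS.fullmatch(raw):
        return int(raw)
    return None

print(parse_user_id("1"))
print(parse_user_id("1 or 1=1"))
print(parse_user_id("1e5"))
```

Only the first call returns a number; the injection attempt and the exponent form are both rejected because they are not made of digits alone.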
When escaping can’t help
• Some parts of a SQL statement can’t be manipulated using parameters
• These include
– ORDER BY columns
– Variable number of items in an IN list
– Adding SQL syntax like DISTINCT
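For the IN-list case, one common workaround (not taken from the slides) is to let user input decide only how many placeholders appear, never what the SQL text says. A minimal sketch in Python, assuming SQLite in place of MySQL; in_clause and the listings rows are hypothetical.

```python
import sqlite3

def in_clause(values):
    """Return a '(?,?,...)' placeholder group with one '?' per value."""
    if not values:
        raise ValueError("IN list must not be empty")
    return "(" + ",".join("?" for _ in values) + ")"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (id INTEGER, neighborhood TEXT)")
conn.executemany("INSERT INTO listings VALUES (?, ?)",
                 [(1, "north"), (2, "south"), (3, "east")])

# The third value is an injection attempt; it is bound like any other string.
wanted = ["north", "east", "x') or ('1'='1"]
sql = ("SELECT id FROM listings WHERE neighborhood IN "
       + in_clause(wanted) + " ORDER BY id")
rows = [r[0] for r in conn.execute(sql, wanted)]
print(sql)
print(rows)
```

The generated SQL contains only question marks, so the values themselves are still passed as bound parameters.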
Don’t use user input in the query
#avoid using user input directly in ANY way
$sql = "select * from listings where deleted = 0 and sold = 0 and open = 1";
if(!empty($_GET['ob'])) {
$sql .= " ORDER BY " . $_GET['ob'];
}
wget … ?ob=post_date
wget … ?ob="post_date union all (select * from listings)"
Now we can see all listings
Bad!
Use whitelisting instead
#avoid using user input directly in ANY way
$sql = "select * from listings where deleted = 0 and sold = 0 and open = 1";
$allowed = array('post_date','neighborhood','etc');
if(!empty($_GET['ob']) && is_string($_GET['ob'])) {
if(in_array($_GET['ob'], $allowed)) {
$sql .= " ORDER BY " . $_GET['ob'];
}
}
wget … ?ob=post_date
wget … ?ob="post_date union all (select * from listings)"
in_array() is the keeper of the gate
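A slightly more defensive variant of the same gate looks the user's value up in a map and appends the trusted constant from the map rather than the user's own string. A minimal sketch in Python; ALLOWED_ORDER_BY and build_listing_query are hypothetical names, and the column list mirrors the example above.

```python
# Maps what the user may send to the SQL text we are willing to emit.
ALLOWED_ORDER_BY = {
    "post_date": "post_date",
    "neighborhood": "neighborhood",
}

BASE_SQL = "select * from listings where deleted = 0 and sold = 0 and open = 1"

def build_listing_query(order_by=None):
    """Append ORDER BY only when the requested column is on the whitelist."""
    sql = BASE_SQL
    column = ALLOWED_ORDER_BY.get(order_by) if isinstance(order_by, str) else None
    if column is not None:
        # Only text from our own table ever reaches the query.
        sql += " ORDER BY " + column
    return sql

print(build_listing_query("post_date"))
print(build_listing_query("post_date union all (select * from listings)"))
```

The injection attempt is not a key in the map, so the second query comes back without any ORDER BY clause.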
All that works great for the apps you control
• BUT…
– If you don’t have the source for an app, then you really can’t be sure it is safe from SQL injection
– Or maybe you have to support old apps
– Or apps that were not developed rigorously
– What do we do in these cases?
SQL INJECTION DETECTION USING PT-QUERY-DIGEST
Out-of-band SQL injection detection
How to detect SQL injection?
• Most applications only do a small number of things.
– Add orders, mark orders as shipped, update addresses, etc.
– The SQL “patterns” that identify these behaviors can be collected and whitelisted.
– Queries that don’t match a known fingerprint may be investigated as SQL injection attempts
What is a query fingerprint?
• A query fingerprinting algorithm transforms a query into a form that allows like queries to be grouped together and identified as a unit
– In other words, these like queries share a fingerprint
– Even though the queries differ slightly they still fingerprint to the same value
– This is a heuristic-based approach
Tools that support query fingerprints
• Percona Toolkit tools
– pt-query-digest: reads slow query logs and populates the whitelist table. Can also be used to display new queries that have not been marked as allowed.
– pt-fingerprint: takes a query (or queries) and produces fingerprints. Useful for third-party tools that want to use fingerprints.
What is a query fingerprint? (cont.)
select * from some_table where col = 3
becomes
select * from some_table where col = ?
select * from some_table where col IN (1,2)
becomes
select * from some_table where col IN (?)
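The transformation shown above can be approximated in a few lines. The following is a deliberately rough sketch in Python, not the algorithm pt-fingerprint actually uses: it lowercases the query, replaces quoted strings and numbers with "?", and collapses IN lists. The function name fingerprint and the exact regular expressions are assumptions made for illustration.

```python
import re

def fingerprint(query):
    """Very rough query fingerprint: literals become '?', IN lists collapse."""
    q = query.strip().lower()
    q = re.sub(r"'(?:[^'\\]|\\.|'')*'", "?", q)   # single-quoted strings
    q = re.sub(r'"(?:[^"\\]|\\.|"")*"', "?", q)   # double-quoted strings
    q = re.sub(r"\b\d+(?:\.\d+)?\b", "?", q)       # standalone numbers
    q = re.sub(r"\bin\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", "in (?)", q)  # IN lists
    q = re.sub(r"\s+", " ", q)                     # runs of whitespace
    return q

print(fingerprint("select * from some_table where col = 3"))
print(fingerprint("SELECT * FROM some_table WHERE col IN (1,2)"))
print(fingerprint("select 1 from users where username = 'bob'"))
print(fingerprint("select 1 from users where username = '' or '1'='1'"))
```

The last two lines illustrate why fingerprints help with detection: the normal lookup and the injected variant reduce to different fingerprints, so the injected one would not match a whitelist built from normal traffic.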
Query fingerprints expressed as hashes
pt-query-digest can provide short hashes (checksums) of fingerprints
select * from some_table where col = ?
982e5737f9747a5d (1631105377)
select * from some_table where col IN (?)
2da8ed487cdfc1c8 (1680229806268)
The values in parentheses are base 10.
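To make the idea of a fingerprint checksum concrete, here is a minimal sketch in Python that reduces a fingerprint to 16 hex digits and shows the same value in base 10. The use of MD5 and the choice of which 16 digits to keep are assumptions for illustration only; the sketch is not expected to reproduce the values on the slide, and the exact derivation used by pt-query-digest should be checked against its documentation.

```python
import hashlib

def short_checksum(fingerprint):
    """Return (hex, decimal) forms of a 64-bit checksum of a fingerprint."""
    digest = hashlib.md5(fingerprint.encode("utf-8")).hexdigest()
    hex_part = digest[-16:]            # assumption: keep 16 hex digits (64 bits)
    return hex_part, int(hex_part, 16)

hex_value, decimal_value = short_checksum("select * from some_table where col = ?")
print(hex_value, decimal_value)
```

The decimal form is what makes the checksum convenient to store and compare in an integer column of the review table.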
pt-query-digest
• Normally used for profiling slow queries
• Has a “SQL review” feature for DBAs
– Designed to mark query fingerprints as having been reviewed
– This feature can be co-opted to discover new query fingerprints automatically
– New fingerprints are either new application code or SQL injection attempts
pt-query-digest – review feature
• Need to store the fingerprints in a table
– Known good fingerprints will be marked as reviewed
– If pt-query-digest discovers new fingerprints you will be alerted because there will be unreviewed queries in the table
pt-query-digest – review table initialization
Need to initialize the table:
pt-query-digest /path/to/slow.log \
--create-review-table \
--review "h=127.0.0.1,P=3306,u=percona,p=2un1c0rns,D=percona,t=whitelist" \
--sample 1 \
--no-report
• --review: where to store fingerprints
• --sample 1: don’t waste time on stats
• --no-report: don’t print report
pt-query-digest – command-line review
pt-query-digest /path/to/slow.log \
--review "DSN…" \
--sample 1 \
--report \
--limit 0
• --review: how it knows which queries have already been reviewed
• --sample 1: don’t collect stats, just sample one of each new fingerprint
• --report: display the report of queries
• --limit 0: ensure that all unreviewed queries are shown
USING THE WHITELIST WITH SQL
Detecting new query fingerprints
-- Any new queries?
SELECT count(*)
FROM percona.whitelist
WHERE reviewed_by IS NULL;
-- Get a list of the queries
SELECT checksum, sample
FROM percona.whitelist
WHERE reviewed_by IS NULL;
percona.whitelist is just an example name; you can use any name you like.
Add a query fingerprint to the whitelist
UPDATE percona.whitelist
SET reviewed_by = 'allow',
reviewed_on = now()
WHERE checksum = 1680229806268;
Blacklist a query fingerprint
You might also explicitly blacklist a fingerprint
UPDATE percona.whitelist
SET reviewed_by = 'deny',
reviewed_on = now()
WHERE checksum = 1631105377;
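The review workflow in the preceding statements (find unreviewed fingerprints, then mark each one allow or deny) can be scripted. A minimal sketch in Python, assuming an in-memory SQLite table as a stand-in for the MySQL review table; only the columns used in the queries above are modelled, and the helper names unreviewed and review are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the review table: only the columns used above.
conn.execute("""CREATE TABLE whitelist (
    checksum INTEGER PRIMARY KEY,
    sample TEXT,
    reviewed_by TEXT,
    reviewed_on TEXT)""")
conn.executemany(
    "INSERT INTO whitelist (checksum, sample) VALUES (?, ?)",
    [(1680229806268, "select * from some_table where col in (1,2)"),
     (1631105377, "select * from some_table where col = 3")])

def unreviewed(conn):
    """Fingerprints nobody has marked as allow or deny yet."""
    return conn.execute(
        "SELECT checksum, sample FROM whitelist "
        "WHERE reviewed_by IS NULL ORDER BY checksum").fetchall()

def review(conn, checksum, verdict):
    """Mark one fingerprint as 'allow' or 'deny'."""
    if verdict not in ("allow", "deny"):
        raise ValueError("verdict must be 'allow' or 'deny'")
    conn.execute(
        "UPDATE whitelist SET reviewed_by = ?, reviewed_on = datetime('now') "
        "WHERE checksum = ?", (verdict, checksum))

review(conn, 1680229806268, "allow")
for checksum, sample in unreviewed(conn):
    print("ALERT: unreviewed fingerprint", checksum, sample)
```

Run on a schedule, a loop like the one at the end is enough to raise an alert whenever a fingerprint shows up that has not been reviewed.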
Web interface for whitelist management
• The Noinject! project (discussed later) has a web interface that can be used to mark queries as reviewed
• It can be used with either the noinject.lua proxy script or pt-query-digest
LIMITATIONS AND CAVEATS
Out-of-band detection
Out-of-band detection
• Some damage or information leakage may have already happened
• To limit the extent of the damage send an alert as soon as a new pattern is detected
– Ensure thorough application pattern detection in a test environment to avoid false positives
Get logs as fast as possible
• Use tcpdump on a mirrored server port
– Pipe the output to pt-query-digest
• Use tcpdump on the database server
– Adds some additional overhead from running the tools on the same machine
– Possibly higher packet loss
• Collect and process slow query logs frequently
– Adds slow query log overhead to server
– Longer delay before processing
FINDING THE VULNERABILITY
What to do BEFORE a fishy fingerprint appears
Prepare for finding a vulnerability
• Tracking down the vulnerable code fragment can be difficult if you have only the SQL statement
• This is not only a SQL injection problem: it is usually convenient to see where any SQL statement was generated
Add tracing comments to queries
• A good approach is to modify the data access layer (DAL) to add SQL comments
– Comments are preserved in the slow query log
– Comments are displayed in SHOW commands
• SHOW ENGINE INNODB STATUS
• SHOW PROCESSLIST
– Make sure your client does not strip comments!
Add tracing information
• PHP can use debug_backtrace() for example
• Perl has __FILE__ and __LINE__, which give the current file and line
• Investigate the debugging section of your language’s manual
What to place in the comment
• Here are some important things to consider placing into the tracing comment
– session_id (or important cookie info)
– application file name, and line number
– important GET, POST, PUT or DELETE contents
– Any other important information which could be useful for tracking down the vector being used in an attack
Example comments in SQL queries
select airport_name, count(*) from dim_airport join ontime_fact on dest_airport_id = airport_id where depdelay > 30 and flightdate_id = 20080101 /* webserver:192.168.1.3,file:show_delays.php,line:326,function:get_delayed_flights,user:justin,sessionid:7B7N2PCNIOKCGF */
This comment contains all that you need
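A data access layer can build such a comment automatically from the call stack. The following is a minimal sketch in Python using the standard inspect module; with_trace is a hypothetical helper, and the key:value layout simply imitates the example comment above.

```python
import inspect
import os

def with_trace(sql, **context):
    """Append a /* key:value,... */ comment naming the caller of this function."""
    caller = inspect.stack()[1]
    fields = {
        "file": os.path.basename(caller.filename),
        "line": caller.lineno,
        "function": caller.function,
    }
    fields.update(context)
    parts = []
    for key, value in fields.items():
        # Break up any "*/" in the value so traced data (which may come
        # from the user) cannot close the comment early.
        text = str(value).replace("*/", "* /")
        parts.append(f"{key}:{text}")
    return sql + " /* " + ",".join(parts) + " */"

def get_delayed_flights():
    return with_trace("select airport_name from dim_airport", user="justin")

print(get_delayed_flights())
```

Because session IDs and request data placed in the comment can be attacker-controlled, the sketch neutralises the comment terminator in every value before appending it.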
Most apps don’t do this out of the box
• You can modify the application
– If you have the source code (and it uses a DAL)
• BUT…
– There isn’t much you can do if
• The application is closed source, or you can’t change the source
• There is no DAL (code/query spaghetti)
• For any other reason it is problematic to inject information into all SQL queries
If I can’t change the source?
• You can’t fix the problems when you detect them.
• Consider using an open source solution
• Or consider in-band protection
SQL INJECTION PREVENTION
In-band SQL injection detection
In-band protection
• Using pt-query-digest to discover new query patterns is useful
– But it doesn’t work in real time
– It can’t block bad queries from actually executing
In-band protection
• What is needed is a “man in the middle” that inspects each query to ensure it matches an allowed fingerprint.
– MySQL Proxy can be used for this purpose
MySQL Proxy
• MySQL Proxy
– Supports Lua scripting for easy development
– Adds some latency to all queries
– Considered “alpha” quality, though for simple scripts it seems stable enough
– Fingerprinting and checking the database also add latency; expect 3–5 ms per query
Noinject! – The Lua script and PHP interface
• http://code.google.com/p/noinject-mysql
• The Lua script for MySQL Proxy is pretty much drop-in.
– Just modify it to point to your database server and specify credentials and other options.
• PHP script is similarly easy to configure.
– Drop in a directory on an Apache box
– Modify the script to set the options.
The Lua proxy script – known queries
• By default the script will retrieve all known good fingerprints and cache them locally when the first query is received from a client
• Also by default, all queries that fail to pass the known whitelist check are logged in an exception table.
Both of these options can be changed easily
The Lua proxy script – known queries
• Each query is fingerprinted
– If the fingerprint is on the whitelist, the actual query is sent to the server
– If the query is not on the whitelist the behavior varies depending on the proxy mode
Lua script – Proxy mode
• permissive mode
– Records the SQL fingerprint into the whitelist table but does not mark it as reviewed
– Allows the query to proceed
• restrictive mode
– Records the SQL fingerprint into the whitelist table
– Returns an empty set for the query
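The decision the proxy makes for each query can be summarised in a few lines. This is a minimal sketch in Python of the logic described above, not the noinject.lua script itself; QueryGate and toy_fingerprint are hypothetical names, and the fingerprinting is deliberately simplistic.

```python
import re

def toy_fingerprint(query):
    """Hypothetical stand-in for a real fingerprinting routine."""
    q = re.sub(r"'[^']*'", "?", query.lower())   # quoted strings
    q = re.sub(r"\b\d+\b", "?", q)                # integers
    return re.sub(r"\s+", " ", q).strip()

class QueryGate:
    """Decides whether a query may proceed, based on its fingerprint."""

    def __init__(self, allowed, mode="permissive"):
        if mode not in ("permissive", "restrictive"):
            raise ValueError("mode must be 'permissive' or 'restrictive'")
        self.allowed = set(allowed)
        self.mode = mode
        self.unreviewed = set()   # fingerprints recorded for later review

    def check(self, query):
        """Return True if the query should be forwarded to the server."""
        fp = toy_fingerprint(query)
        if fp in self.allowed:
            return True
        self.unreviewed.add(fp)            # both modes record the fingerprint
        return self.mode == "permissive"   # only permissive lets it through

known = {toy_fingerprint("select 1 from users where user_id = 7")}
gate = QueryGate(known, mode="restrictive")
print(gate.check("select 1 from users where user_id = 42"))
print(gate.check("select 1 from users where user_id = 1 or 1=1"))
```

A False result corresponds to the restrictive-mode behaviour of returning an empty set instead of running the query; in permissive mode the same query would be recorded and allowed through.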
Why use permissive mode?
• Permissive mode allows the collection of SQL fingerprints for an application dynamically
– Just run the application with typical workload and the SQL queries will be recorded automatically
– Eventually switch to restrictive mode
PHP Web interface
• 1999 mode HTML interface
[Screenshot of the web interface. Callouts: query sample; last action time with note; whitelist or blacklist the fingerprint]
If you want something prettier
• This is open source so…
• If you want bug fixes or have feature requests
– You can engage with Percona for development
– You can contribute!
– You can fork your own version
If the proxy overhead is too high
• You could develop the functionality in MySQL
– too bad the parser is not pluggable
• Try mysqlnd plugins
– fingerprint queries in PHP
– match them to a whitelist maintained in a serialized PHP array
– reject queries that aren’t approved
• Improve the proxy Lua script
– fingerprint process could probably be made faster
Percona Training Advantage
• This presentation and the Noinject! tool were created by Justin Swanhart, one of Percona’s expert trainers
– Check out http://training.percona.com for a list of training events near you
– Request training directly by Justin or any of our other expert trainers by contacting your Percona sales rep today
Q/A