Input Sanitization
COEN 225
All Input is Evil
All input is evil, at least potentially.
Input can be (a random collection):
  Files
  Web forms
  Cookies
  Registry entries
  Database contents
  Command line arguments
  Environment variables
  HTTP requests
  Named pipes
  E-mail
  …
Finding Common Entry Points
Files
  Contain data specified by users
  Contain data supplied by the application
  Can be intentionally or unintentionally corrupted
  Attacker can also attack file metadata:
    Extension
    Path
    File system attributes
    …
Finding Common Entry Points
Sockets
  Easy to connect to sockets, so data needs to be filtered
  Attacker can
    Monitor data
    Send malformed data to client or to server
    Intercept data in the middle of a request and replace it (a.k.a. man-in-the-middle attack)
Finding Common Entry Points
HTTP requests
  Almost always pass through firewalls
  Using a web proxy, users have complete control over what is sent to the server
Named pipes
  See sockets
  But programmers might forget how named pipes work and trust input
  E.g. SQL Server 2000 vulnerability
  See http://www.blakewatts.com/namedpipepaper.html
Finding Common Entry Points
Pluggable Protocol Handler
  Example: http, ftp, https in URL
    mailto:[email protected]?subject=WrongPerson
  Tells the system which application handles data when a hyperlink is clicked
  Maliciously crafted link irc://[~900 characters] caused a buffer overflow in the mIRC protocol handler that allowed arbitrary code execution
Finding Common Entry Points
Programmatic Interfaces
  RPC
  COM
  DCOM
  ActiveX
  Managed code entry points (Windows)
  .NET Remoting
Finding Common Entry Points
SQL
  Improperly filtered input strings can lead to execution of powerful SQL commands
Registry
User Interfaces
  Win95 machines were used in libraries
  Attacker could remove the "Start" button for free entertainment
Finding Common Entry Points
Command line arguments
  Attacker provides helpful link with arguments embedded
  Example: Cross-site scripting attacks
Environment variables
  Can be used by programs to make decisions
Canonicalization
Authentication decision made by one module
Access done by other module
Input Validation
Input: anything controlled by an outsider
  User command line input
  Configuration files that could be manipulated
  HTTP requests
  Packets under consideration by a firewall
  …
Input Validation Security Strategies
Black List
  List all things that are NOT allowed
  List is difficult to create
  Adding insecure constructs on a continuous basis means that the previous version was unsafe
  Testing is based on known attacks.
  List from others might not be trustworthy.
White List
  List of things that are allowed
  List might be incomplete and disallow good content
  Adding exceptions on a continuous basis does not imply security holes in previous versions.
  Testing can be based on known attacks.
  List from others can be trusted if source can be trusted.
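The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration: the function names, the deny set, and the allowed pattern are assumptions made up for this example, not a recommended filter.

```python
import re

# Hypothetical black list: characters somebody remembered to treat as dangerous.
DENIED_CHARS = set(";|&<>")

# Hypothetical white list: letters, digits, and underscore, 1 to 32 characters.
ALLOWED_PATTERN = re.compile(r"[A-Za-z0-9_]{1,32}")


def passes_black_list(value: str) -> bool:
    """Accept anything that contains none of the known-bad characters."""
    return not any(ch in DENIED_CHARS for ch in value)


def passes_white_list(value: str) -> bool:
    """Accept only values made up entirely of known-good characters."""
    return ALLOWED_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["alice_01", "bob; rm -rf /", "bob`id`", "o'brien"]:
        print(repr(candidate),
              passes_black_list(candidate),
              passes_white_list(candidate))
```

The backtick input slips past the black list because nobody listed that character, which is the "previous version was unsafe" situation. The white list rejects it, but it also rejects a legitimate name such as o'brien, which is the "disallow good content" cost.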
Input Validation
Principal problem
  Location of check differs from location of use
Principal solution
  Canonicalization of input
    Transform input into a canonical form
    Decision is made on input in the same form that the program uses
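A minimal Python sketch of deciding on the canonical form rather than on the raw input. The directory /srv/uploads and both function names are assumptions for illustration. The sketch uses posixpath.normpath, which canonicalizes only lexically; on a real file system, symbolic links would also have to be resolved.

```python
import posixpath

BASE_DIR = "/srv/uploads"  # hypothetical directory the program may access


def naive_check(requested: str) -> bool:
    """Checks the raw string: the form checked is not the form used."""
    return requested.startswith(BASE_DIR + "/")


def canonical_check(requested: str) -> bool:
    """Canonicalizes first, then decides on the form that will be used."""
    canonical = posixpath.normpath(requested)
    return canonical.startswith(BASE_DIR + "/")


if __name__ == "__main__":
    attack = "/srv/uploads/../../etc/passwd"
    print(naive_check(attack), canonical_check(attack))
```

The raw string starts with the permitted prefix, so the naive check accepts it, although the path that is eventually opened lies outside the directory.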
Canonicalization
Two major program errors:
  Misunderstanding the definition of canonical form
  Stopping the canonicalization process too early
Canonicalization: Dealing with Metacharacters
Meta-information can be attached
  Out-of-band
  In-band
    Often more readable
    Often more compact
    Has security implications
Potential for overlapping trust domains:
  There exists a logical boundary between data and metadata
  Parser needs to identify the difference between data and metadata correctly
Canonicalization: Dealing with Metacharacters
Example: NUL character for termination of strings
Canonicalization: Dealing with Metacharacters
Simplest Vulnerability:
  Users can embed metacharacters into input that is not filtered
  Instance of second-order injection attack
    The attack happens when the metacharacter is evaluated
  Example: Password update (next slide)
Canonicalization: Dealing with Metacharacters
    use CGI;
    … verify session details …
    $new_password = $query->param('password');
    open(IFH, "</opt/passwords.txt") || die("$!");
    open(OFH, ">/opt/passwords.txt.tmp") || die("$!");
    while (<IFH>) {
        chomp;
        ($user, $pass) = split /:/;
        if ($user ne $session_username) {
            print OFH "$user:$pass\n";
        } else {
            print OFH "$user:$new_password\n";   # $new_password is written unfiltered
        }
    }
    …
    close(IFH);
    close(OFH);
No input sanitization!
User bob inputs:
    test\njim:npwd
OFH becomes:
    bob:test
    jim:npwd
Bob just added a new user
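The same flaw can be reproduced in a few lines of Python. This is a sketch that mirrors the logic of the Perl fragment on an in-memory list instead of a file; the function names and the is_safe_password check are hypothetical.

```python
def update_password(lines, session_user, new_password):
    """Rewrites user:password records without filtering, like the Perl code."""
    out = []
    for line in lines:
        user, _, password = line.partition(":")
        if user == session_user:
            out.append(f"{user}:{new_password}")
        else:
            out.append(f"{user}:{password}")
    # Joining the records with newlines models the contents of the output file.
    return "\n".join(out) + "\n"


def is_safe_password(candidate):
    """Hypothetical check: reject the record and field delimiters of this format."""
    return not any(ch in candidate for ch in "\n\r:")


if __name__ == "__main__":
    content = update_password(["bob:old", "alice:pw"], "bob", "test\njim:npwd")
    print(content, end="")
```

Reading the result back line by line yields three records, including the injected jim:npwd. Rejecting the newline and the colon before writing closes this particular hole.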
Canonicalization: Dealing with Metacharacters
Discovering attacks like this:
1. Identify code that deals with metacharacter strings
2. Identify all delimiter characters that are specially handled and put them into a list
3. Identify filtering performed on input
4. Eliminate potentially hazardous delimiter characters from list
5. Remaining characters on list indicate a vulnerability
Canonicalization: Dealing with Metacharacters
    BOOL HandleUploadedFile(char *filename)
    {
        unsigned char buf[MAX_PATH], pathname[MAX_PATH];
        char *fname = filename, *tmp1, *tmp2;
        DWORD rc;
        HANDLE hFile;

        tmp1 = strrchr(filename, '/');
        tmp2 = strrchr(filename, '\\');
        if (tmp1 || tmp2)
            fname = (tmp1 > tmp2 ? tmp1 : tmp2) + 1;
        if (!fname)
            return FALSE;
        if (strstr(fname, ".."))
            return FALSE;
        _snprintf(buf, sizeof(buf), "\\\\?\\%TEMP%\\%s", fname);
        rc = ExpandEnvironmentStrings(buf, pathname, sizeof(pathname));
        if (rc == 0 || rc > sizeof(pathname))
            return FALSE;
        hFile = CreateFile(pathname, …);
        … read bytes into the file …
    }
1 Input string is formatted in a number of ways before it becomes a file name.
  It is added to a statically sized buffer and prefixed with "\\\\?\\%TEMP%\\" (as written in the C source).
2 Set of delimiter characters that are specially handled: '/', '\', ".."
  The string is passed to ExpandEnvironmentStrings().
  Environment variables are denoted with % characters.
3 Set of delimiter characters that are specially handled: '/', '\', ".."
  The string is passed to ExpandEnvironmentStrings().
  Environment variables are denoted with % characters.
4 Filtering:
  strrchr searches for the last occurrence of '/' and '\' and increments past it.
  strstr searches for "..".
5 However, '%' remains
  Client can supply a file name that refers to an environment variable, such as %QUERY_STRING%
  Something like ..\..\..\any\pathname\file.txt supplied in QUERY_STRING then allows the client to write to arbitrary locations in the file system
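The five steps can be replayed on a small Python model of HandleUploadedFile. This is a sketch under stated assumptions: expand is a simplified stand-in for ExpandEnvironmentStrings, the ENV dictionary is a made-up environment in which a web server has stored the query string, and the helper names are hypothetical.

```python
import re

# Hypothetical environment; the QUERY_STRING value is controlled by the client.
ENV = {"TEMP": r"C:\Temp", "QUERY_STRING": r"..\..\..\any\pathname\file.txt"}

# Steps 2 to 5: specially handled delimiters minus filtered ones.
SPECIAL = {"/", "\\", "..", "%"}
FILTERED = {"/", "\\", ".."}
REMAINING = SPECIAL - FILTERED


def filter_name(filename):
    """Same filtering as the C code: keep the last component, reject '..'."""
    cut = max(filename.rfind("/"), filename.rfind("\\"))
    fname = filename[cut + 1:]
    if ".." in fname:
        return None
    return fname


def expand(template, env):
    """Simplified model of ExpandEnvironmentStrings: replaces %NAME% from env."""
    return re.sub(r"%([^%]+)%", lambda m: env.get(m.group(1), m.group(0)), template)


def build_path(filename, env):
    """Filter first, expand afterwards: the order used by the C function."""
    fname = filter_name(filename)
    if fname is None:
        return None
    return expand("\\\\?\\%TEMP%\\" + fname, env)


if __name__ == "__main__":
    print(REMAINING)
    print(build_path("%QUERY_STRING%", ENV))
```

The file name %QUERY_STRING% contains no slash and no "..", so it passes the filter. The traversal sequence only appears after expansion, that is, after the check has already been made.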
Canonicalization: Dealing with Metacharacters
NUL character injection
  NUL characters are necessary to terminate strings when calling C routines from the OS and many APIs
  Perl and other languages do not use NUL for termination
  Example:
    Perl application programmer tests that file name ends in ".txt"
    Attacker inputs the sequence "%00" in CGI input
      Decoded as NUL character
      Can be used to cut off the filename, including the extension
    open(FH, ">$username.txt") || die("$!");
    print FH $data;
    close(FH);
Canonicalization: Dealing with Metacharacters: NUL
NUL metacharacter is used to end C-strings, but not strings in Perl, Java, PHP, …
This is a canonicalization issue:
  C-based modules canonicalize strings differently than the non-C/non-Unix world
  Issues arise when strings cross boundaries between these worlds
Canonicalization: Dealing with Metacharacters: NUL
Possible results:
  Memory corruption because strlen returns a different value
  Truncation of strings
  False decisions
    Especially for FILE NAMES
B O B . T X T \0
B O B \0 . T X T \0
Path Metacharacters
Windows File Names: C:\WINDOWS\system32\calc.exe
  Optional device
  Followed by path
  NOT UNIQUE
    C:\WINDOWS\system32\drivers\..\calc.exe
    calc.exe
    .\calc.exe
    ..\calc.exe
    \\?\WINDOWS\system32\calc.exe
File system uses file canonicalization
  But the system is less than canonical
Path Metacharacters
Issues:
File squatting (in Windows)
  Need to use CreateFile carefully in order not to open an existing file that sits in the canonical path of the file name
CreateFile canonicalization eliminates any directory traversal components before validating whether each path segment exists
  C:\nonexistent\path\..\..\blah.txt accesses C:\blah.txt
File-like Objects
  CreateFile can open objects that are treated like files but are not files:
    \\host\object type\name
Device Files
  Reside in the file hierarchy
  But are canonicalized differently
  COM1-9, LPT1-9, CON, CONIN$, CONOUT$, PRN, AUX, CLOCK$, NUL
  Programmers are often not aware of the rules
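Both points can be illustrated with Python's ntpath module, which applies Windows path rules on any platform. This is a sketch: ntpath.normpath only models the lexical removal of ".." components, and is_device_name is a hypothetical, deliberately simplified check built from the device list above. The exact rules differ between Windows versions.

```python
import ntpath

# Reserved device names as listed above, with COM1-9 and LPT1-9 expanded.
DEVICE_NAMES = {"CON", "CONIN$", "CONOUT$", "PRN", "AUX", "CLOCK$", "NUL"}
DEVICE_NAMES |= {f"COM{i}" for i in range(1, 10)}
DEVICE_NAMES |= {f"LPT{i}" for i in range(1, 10)}


def collapse(path: str) -> str:
    """Lexically removes '..' components without touching the file system."""
    return ntpath.normpath(path)


def is_device_name(path: str) -> bool:
    """Simplified check: the final component, minus any extension, names a device."""
    base = ntpath.basename(path)
    stem = base.split(".", 1)[0].rstrip(" ")
    return stem.upper() in DEVICE_NAMES


if __name__ == "__main__":
    print(collapse(r"C:\nonexistent\path\..\..\blah.txt"))
    print(is_device_name(r"C:\temp\nul"), is_device_name(r"C:\temp\report.txt"))
```

Because the traversal components are removed lexically, the nonexistent directories never have to exist for the path to resolve to C:\blah.txt.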
Path Metacharacters
CreateFile() (Windows) idiosyncrasies
Strips out trailing spaces in file names
  Example attack
    Programmer attaches ".txt" to a user-provided name
    Attacker provides "helloworld.exe " with trailing space
    The trailing space with following .txt is stripped out
Case Sensitivity
  Windows filenames are not case sensitive; UNIX filenames typically are (HFS+ is case-insensitive by default)
DOS 8.3
  Short file name is created by the file system if the file name is too long.
  File can be referred to by the short file name
  Use \\?\ before the file name to disable DOS filename parsing
Ensure that files are normal files by checking for FILE_ATTRIBUTE_NORMAL, or face access to named pipes, …
Alternative Data Streams are created with a ":" separator
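A Python sketch of why trailing characters and letter case matter for checks on file names. It models a variant of the attack that is an assumption of this example, not taken from the slide: the program rejects names ending in ".exe" by inspecting the raw string. The helper win32_like_normalize is a simplified model of the stripping of trailing spaces and dots.

```python
def win32_like_normalize(name: str) -> str:
    """Simplified model: trailing spaces and dots are dropped from the name."""
    return name.rstrip(" .")


def naive_is_executable(name: str) -> bool:
    """Hypothetical extension check on the raw, case-sensitive string."""
    return name.endswith(".exe")


def canonical_is_executable(name: str) -> bool:
    """Same decision, made after normalizing and case-folding the name."""
    return win32_like_normalize(name).lower().endswith(".exe")


if __name__ == "__main__":
    for name in ["helloworld.exe ", "HELLOWORLD.EXE", "notes.txt"]:
        print(repr(name), naive_is_executable(name), canonical_is_executable(name))
```

The naive check lets both "helloworld.exe " and "HELLOWORLD.EXE" through, although the file system would treat them as helloworld.exe.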
Path Metacharacters
Registry keys
  Naming similar to files
  Similar issues
  Worthy of its own presentation
Canonicalization: Dealing with Metacharacters
Shell Metacharacter Injection
  Attack vector
    User controls input to an argument for execve(), popen(), …
  Dangerous shell characters
    ; | & < > ` ! - * / ? ( ) . [space] [ ] "\t" ^ ~ \ "\\" quotes "\r" "\n" $
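A Python sketch of the difference between pasting input into a shell command line and keeping it out of the shell's reach. The command cat and the three function names are assumptions for illustration; no command is executed here. shlex.quote targets POSIX shells only.

```python
import shlex


def unsafe_command(filename: str) -> str:
    """Pastes input into a command line that a shell will parse (do not do this)."""
    return "cat " + filename


def safe_argv(filename: str) -> list:
    """Argument vector for subprocess.run(argv) without a shell: nothing is parsed."""
    return ["cat", "--", filename]


def quoted_command(filename: str) -> str:
    """If a shell is unavoidable, quote the argument so metacharacters stay data."""
    return "cat -- " + shlex.quote(filename)


if __name__ == "__main__":
    evil = "notes.txt; rm -rf /"
    print(unsafe_command(evil))
    print(safe_argv(evil))
    print(quoted_command(evil))
```

In the first form the semicolon ends the cat command and starts a second one. In the other two forms the whole input remains a single argument.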
Canonicalization: Dealing with Metacharacters
SQL Injection attack
  Attack vector: User controls part of the SQL query string
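A minimal sketch of this attack vector using Python's sqlite3 module and an in-memory database. The users table, its contents, and the function names are made up for the example; passwords are stored in plain text only to keep the sketch short.

```python
import sqlite3


def make_db():
    """In-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [("bob", "test"), ("jim", "npwd")])
    return conn


def login_unsafe(conn, name, password):
    """Pastes input into the query string: input can change the query structure."""
    query = ("SELECT COUNT(*) FROM users WHERE name = '" + name +
             "' AND password = '" + password + "'")
    return conn.execute(query).fetchone()[0] > 0


def login_safe(conn, name, password):
    """Placeholders pass the input as data, separate from the SQL text."""
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


if __name__ == "__main__":
    conn = make_db()
    attack = "' OR '1'='1"
    print(login_unsafe(conn, "bob", attack), login_safe(conn, "bob", attack))
```

The quote character in the input closes the string literal early, so the rest of the input is parsed as SQL. With placeholders the metadata (the query) and the data (the input) travel out-of-band from each other.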
Canonicalization: Metacharacter Filtering
Three basic options
1. Detect erroneous input and reject what appears to be an attack
2. Detect and strip dangerous characters
3. Detect and encode dangerous characters with a metacharacter escape sequence
Canonicalization: Metacharacter Filtering
Eliminating Metacharacters
  Whitelisting: Allow only good strings
    if ($input_data =~ /[^A-Za-z0-9_ ]/) { exit; }
  Whitelisting: Strip away anything that is not good
    $input_data =~ s/[^A-Za-z0-9]//g;
    Stripping is vulnerable to mistakes
  Blacklisting: Make decisions based on dangerous characters (not recommended)
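The two whitelisting variants, and one way stripping can go wrong, can be sketched in Python. The first two functions follow the Perl patterns above. strip_dotdot_once is a hypothetical example of a flawed stripper that removes a dangerous sequence in a single pass.

```python
import re


def reject_unless_clean(value: str) -> bool:
    """Whitelist by rejection: True only if every character is allowed."""
    return re.search(r"[^A-Za-z0-9_ ]", value) is None


def strip_to_whitelist(value: str) -> str:
    """Whitelist by stripping: deletes every character that is not allowed."""
    return re.sub(r"[^A-Za-z0-9]", "", value)


def strip_dotdot_once(value: str) -> str:
    """Flawed stripper: one pass over the input, removing '../' sequences."""
    return value.replace("../", "")


if __name__ == "__main__":
    print(reject_unless_clean("hello;world"))
    print(strip_to_whitelist("he;llo|wor ld"))
    print(strip_dotdot_once("....//etc/passwd"))
```

Removing the inner "../" from "....//" joins the surrounding characters into a new "../", so the stripped output still contains the sequence the filter was meant to remove. Rejecting the input outright avoids this class of mistake.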
Canonicalization: Metacharacter Filtering
Escaping Metacharacters
  Non-destructive: metacharacters are preserved in the string
  Goal: Receiving module receives a safe string
  Attack vectors:
    Metacharacter evasion
      Encoded metacharacters can be used to avoid other filtering
Canonicalization: Metacharacter Filtering
Escaping Metacharacters
  Filtering does not detect encoded metacharacters
    Example: ..%2F..%2Fetc%2Fpasswd
  Double Encoding Attacks
dXNlcj1wYXNzd2QmaG9tZWRpcj0uLiUyNSUzMiU0Ni4uJTI1JTMyJTQ…
  Base64 decoder
user=passwd&homedir=..%25%32%46..%25%32%46etc
  Hexadecimal decoder, pass 1
user=passwd&homedir=..%2F..%2Fetc
  Hexadecimal decoder, pass 2
user=passwd&homedir=../../etc
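The decoding chain can be replayed in Python with the standard base64 and urllib.parse modules. This is a sketch: the Base64 text is rebuilt from the decoded form shown above rather than copied from the slide, and decode_fully is a hypothetical helper that keeps decoding until the value stops changing, so that a filter sees the final form.

```python
import base64
from urllib.parse import unquote


def decode_fully(value: str, max_passes: int = 10) -> str:
    """Percent-decodes repeatedly until the value no longer changes."""
    for _ in range(max_passes):
        decoded = unquote(value)
        if decoded == value:
            return decoded
        value = decoded
    raise ValueError("too many encoding layers")


# Rebuild the Base64 input from the first decoded form.
plain = "user=passwd&homedir=..%25%32%46..%25%32%46etc"
wrapped = base64.b64encode(plain.encode("ascii")).decode("ascii")

layer1 = base64.b64decode(wrapped).decode("ascii")
layer2 = unquote(layer1)  # hexadecimal decoder, pass 1
layer3 = unquote(layer2)  # hexadecimal decoder, pass 2

if __name__ == "__main__":
    for layer in (wrapped, layer1, layer2, layer3):
        print(layer)
```

A filter that looks for "../" after only one decoding pass sees "..%2F" and finds nothing. The traversal sequence appears only after the second pass.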
CanonicalizationMeta Character Filtering Character Sets
Example vulnerabilities Wide characters (unicode) C-style strings are terminated with
a 16 NULL, normal character strings with an 8 NULL Homographic attacks
Different characters look the same “Microsoft” “Microsoft” in Unicode
one ‘o’ is cyrillic
String length calculations need to take character set into account (wide characters vs. normal characters)
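Both character-set issues can be observed in Python with the standard unicodedata module. This is a sketch: scripts_used is a rough, hypothetical heuristic that reads the script from the first word of each character's Unicode name, and the choice of which 'o' to replace is arbitrary.

```python
import unicodedata

latin = "Microsoft"
mixed = "Micr\u043esoft"  # U+043E CYRILLIC SMALL LETTER O replaces the first 'o'


def scripts_used(text: str) -> set:
    """Rough script detection: first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}


def byte_length(text: str, encoding: str) -> int:
    """Storage size depends on the encoding, not only on the character count."""
    return len(text.encode(encoding))


if __name__ == "__main__":
    print(latin == mixed, scripts_used(latin), scripts_used(mixed))
    print(len(latin), byte_length(latin, "ascii"), byte_length(latin, "utf-16-le"))
```

The two strings look the same and have the same character count, yet they compare unequal. The same nine characters also occupy 9 bytes as narrow characters and 18 bytes as 16-bit wide characters, which is why buffer sizes must be computed per character set.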