Data and Applications Security Digital Forensics Dr. Bhavani Thuraisingham The University of Texas at Dallas November 12, 2010

Data and Applications Security

Digital Forensics

Dr. Bhavani Thuraisingham

The University of Texas at Dallas

November 12, 2010

Outline

Introduction

Applications

- Law enforcement, Human resources, Other

Services

Benefits

Using the evidence

Conclusion

Digital Forensics

Digital forensics is the investigation of crime using digital/computer methods

More formally: “Digital forensics, also known as computer forensics, involves the preservation, identification, extraction, and documentation of computer evidence stored as data or magnetically encoded information” (John Vacca)

Digital evidence may be used to analyze cyber crime (e.g., worms and viruses), physical crime (e.g., homicide) or crime committed through the use of computers (e.g., child pornography)

Relationship to Intrusion Detection, Firewalls, Honeypots

They all work together with digital forensics techniques

Intrusion detection

- Techniques to detect network and host intrusions

Firewalls

- Monitor traffic going to and from an organization

Honeypots

- Set up to attract the hacker or enemy; a trap

Digital forensics

- Once the attack has occurred or the crime has been committed, need to determine who committed it

Computer Crime

Computers are attacked – cyber crime

- Computer viruses

Computers are used to commit a crime

- E.g., child predators, embezzlement, fraud

Computers are used to solve a crime

FBI’s workload: recent survey

- 74% of their efforts on white collar crimes such as healthcare fraud, financial fraud etc.

- Remaining 26% of efforts spread across all other areas such as murder and child pornography

- Source: 2003 Computer Crime and Security Survey, FBI

Objective and Priority

Objective of Computer Forensics

- To recover, analyze and present computer-based material in such a way that it is usable as evidence in a court of law

- Note that the definition is the following: “computer forensics, involves the preservation, identification, extraction, and documentation of computer evidence stored as data or magnetically encoded information”, by John Vacca

Priority

- Main priority is with forensics procedures, rules of evidence and legal processes; computers are secondary

- Therefore accuracy is crucial

Accuracy vs Speed

Tradeoffs between accuracy and speed

- E.g., taking 4 courses in a semester vs. 2 courses; more likely to get Bs and not As

- Writing a report in a hurry means it is likely to be less accurate

Accuracy: integrity and security of the evidence is crucial

- No shortcuts; need to maintain high standards

Speed may have to be sacrificed for accuracy

- But try to do it as fast as you can, provided you do not compromise accuracy

The Job of a Forensics Specialist

Determine the systems from which evidence is collected

Protect the systems from which evidence is collected

Discover the files and recover the data

Get the data ready for analysis

Carry out an analysis of the data

Produce a report

Provide expert consultation and/or testimony?

Applications: Law Enforcement

Important for the evidence to be handled by a forensic expert; else it may get tainted

Need to choose an expert carefully

- What is his/her previous experience? Has he/she worked on prior cases? Has he/she testified in court? What is his/her training? Is he/she CISSP certified?

Forensic expert will be scrutinized/cross examined by the defense lawyers

Defense lawyers may have their own possibly highly paid experts?

Applications: Human Resources

To help the employer

- What web sites were visited?

- What files were downloaded?

- Have attempts been made to conceal or fabricate the evidence?

- Emails sent/received

To help the employee

- Emails sent by employer – harassment

- Notes on discrimination

- Files deleted by employer

Applications: Other

Supporting criminals

- Gangs using computer forensics to find out about members and subsequently determine their whereabouts

Support rogue governments and terrorists

- Terrorists using computer forensics to find out about what we (the good guys) are doing

We and law enforcement have to be one step ahead of the bad guys

Understand the mind of the criminal

Services

Data Services

- Seizure, duplication and preservation, recovery

Document and Media

- Document searches, media conversion

Expert witness

Service options

Other services

Data Services

Data Seizure

- The expert should assist the law enforcement official in collecting the data.

- Need to identify the disks that contain the data

Data Duplication and Preservation

- Data absolutely cannot be contaminated

- A copy of the data has to be made; work with the copy and keep the original in a safe place

Data Recovery

- Once the device is seized (either local or remote) need to use appropriate tools to recover the data

Data Services: Finding Hidden Data

When files are deleted, usually they can be recovered

The files are marked as deleted, but they still reside on the disk until they are overwritten

Files may also be hidden in different parts of the disk

The challenge is to piece the different parts of the file together to recover the original file

There is research on using statistical methods for file recovery

http://www.cramsession.com/articles/files/finding-hidden-data---how-9172003-1401.asp

http://www.devtarget.org/downloads/ca616-seufert-wolfgarten-assignment2.pdf
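
As an illustration of why deleted files are usually recoverable, the sketch below scans raw bytes for a file-type signature and copies out what lies between the start and end markers. This is signature-based carving, not the statistical methods mentioned on the slide; the function name carve_jpegs and the toy disk contents are hypothetical, and the approach only works when the file is stored contiguously.

```python
JPEG_START = b"\xff\xd8\xff"   # JPEG start-of-image signature
JPEG_END = b"\xff\xd9"         # JPEG end-of-image marker

def carve_jpegs(raw: bytes) -> list:
    """Return byte runs in a raw image that look like complete JPEG files."""
    found = []
    pos = 0
    while True:
        start = raw.find(JPEG_START, pos)
        if start == -1:
            break
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        found.append(raw[start:end + len(JPEG_END)])
        pos = end + len(JPEG_END)
    return found

# Toy "disk": the directory entry is gone, but the picture bytes are still there.
disk = b"\x00" * 16 + b"\xff\xd8\xff\xe0fake-picture-data\xff\xd9" + b"\x00" * 16
recovered = carve_jpegs(disk)
```

A fragmented file would come back incomplete or corrupted with this approach, which is the "piece the parts together" challenge noted above.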

Document and Media Services

Document Searches

- Efficient search of numerous documents

- Check for keywords and correlations

Media Conversion

- Legacy devices may contain unreadable data. This data has to be converted using appropriate conversion tools

- Should be placed in appropriate storage for analysis

Expert Witness Services

Expert should explain computer terms and complicated processes in an easy to understand manner to law enforcement, lawyers, judges and jury

- Computer technologists and lawyers speak different languages

Expertise

- Computer knowledge and expertise in computer systems, storage

- Knowledge on interacting with lawyers, criminology

- Domain knowledge such as embezzlement, child exploitation

Should the expert witness and the forensics specialist be one and the same?

Service Options

Should provide various types of services

- Standard, emergency, priority, weekend, after-hours services

Onsite/offsite services

Cost and risks – major consideration

Example: Computer Forensics Services Corporation

- http://www.computer-forensic.com/

- As stated on the above web site, this company provides “expert, court approved, High Tech Investigations, litigation support and IT Consulting.” They also “Preserve, identify, extract, document and interpret computer data. It is often more of an art than a science, but as in any discipline, computer forensic specialists follow clear, well-defined methodologies and procedures.”

Other Services

Computer forensics data analysis for criminal and civil investigations/litigations

Analysis of company computers to determine employee activity

- Whether he/she is conducting his/her own business and/or downloading pornography

- Surveillance for suspicious event detection

Produce timely reports

Benefits of using Professional services

Protecting the evidence

- Should prevent damage and corruption

Secure the evidence

- Store in a secure place; also use encryption technologies such as public/private keys

Ensure that the evidence is not harmed by viruses

Document clearly who handled the data and when – auditing

Client/attorney privilege

Freeze the scene of the crime – do not contaminate or change

Using the Evidence: Criminal and Civil Proceedings

Criminal prosecutors

Civil litigation attorneys – harassment, discrimination, embezzlement, divorce

Insurance companies

Computer forensics specialists to help corporations and lawyers

Law enforcement officials

Individuals to sue a company

Also defense attorneys, and “the bad guys”

Issues and Problems that could occur

Computer Evidence MUST be

- Authentic: not tampered with

- Accurate: have high integrity

- Complete: no missing points

- Convincing: no holes

- Conform: rules and regulations

- Handle change: data may be volatile and time sensitive

- Handle technology changes: tapes to disks; MAC to PC

- Human readable: Binary to words

Legal tests

Countries with a common law tradition

- UK, US, possibly Canada, Australia, New Zealand

Real evidence

- Comes from an inanimate object and can be examined by the court

Testimonial evidence

- Live witness when cross examined

Hearsay

- Wiki entry “Hearsay in English law and Hearsay in United States law, a legal principle concerning the admission of evidence through repetition of out-of-court statements”

Are the following admissible in court?

- Data mining results, emails, printed documents

Traditional Forensics vs Computer Forensics

Traditional Forensics

- Materials tested and testing methods usually do not change rapidly

- Blood, DNA, drug, explosive, fabric

Computer Forensics

- Material tested and testing methods may change rapidly

- We did not have web logs back in 1990

- We did not have RAID storage in 1980

Types of Acquisition

Static Acquisition

- Acquire data from the original media

- The data in the original media will not change

Live Acquisition

- Acquire data while the system is running

- A second live acquisition will not be the same

Will focus on static acquisition

Digital Evidence Storage Formats

Raw formats

- Bit by bit copying of the data from the disk

- Many tools could be used

Proprietary formats

- Vendors have special formats

Standards

- XML based formats for digital evidence

- Digital Evidence Markup Language (Funded by National Institute of Justice)

- Experts have argued that technologies that allow disparate law enforcement jurisdictions to share crime-related information will greatly facilitate fighting crime. One of these technologies is the Global Justice XML Data Model (GJXDM).

- http://ncfs.ucf.edu/digital_evd.html

Acquisition Methods

Disk to image file

Disk to disk

Logical acquisition

- Acquire only certain files if the disk is too large

Sparse acquisition

- Similar to logical acquisition but also collects fragments of unallocated (i.e., deleted) data

Compression Methods

Compression methods are used for very large data storage

- E.g., terabytes/petabytes of storage

Lossy vs lossless compression

- Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. The term lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed, in exchange for better compression rates.
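
The lossless property matters for evidence because the restored data must be bit-for-bit identical to what was acquired. A minimal sketch using Python's standard zlib and hashlib modules; the sample data is made up for illustration.

```python
import hashlib
import zlib

# Made-up, highly repetitive sample standing in for acquired evidence.
original = b"evidence block " * 1000

compressed = zlib.compress(original, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)

# Lossless: the restored bytes hash to the same value as the original.
same_hash = hashlib.sha256(restored).digest() == hashlib.sha256(original).digest()
```

A lossy method (for example, JPEG for pictures) would not pass this check, which is why it is unsuitable for storing an evidence image.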

Contingency Planning

Failure occurs during acquisition

- Recovery methods

Make multiple copies

- At least 2 copies

Encryption/decryption techniques so that the evidence is not corrupted

Storage Area Network Security Systems

High-performance networks that connect all the storage systems

- After a disaster such as terrorism or a natural disaster (9/11 or Katrina), the data has to be available

- A database system is a special kind of storage system

Benefits include centralized management, scalability, reliability, performance

Security attacks on multiple storage devices

- Secure storage is being investigated

Network Disaster Recovery Systems

Network disaster recovery is the ability to respond to an interruption in network services by implementing a disaster recovery plan

Policies and procedures have to be defined and subsequently enforced

Which machines to shut down? Which backup servers to use? When should law enforcement be notified?

Using Acquisition Tools

Acquisition tools have been developed for different operating systems including Windows, Linux, Mac

It is important that the evidence drive is write protected

Example acquisition method:

- Document the chain of evidence for the drive to be acquired

- Remove drive from suspect’s computer

- Connect the suspect drive to USB or Firewire write-blocker device (if USB, write protect it via Registry write protect feature)

- Create a storage folder on the target drive

Using Acquisition Tools - 2

Example tools include ProDiscover, AccessData FTK Imager

Click on All Programs and click on the specific tool (e.g., ProDiscover)

Perform the commands

- E.g., Capture Image

For additional security, use passwords

Validating Data Acquisition

Create hash values

- CRC-32 (older methods), MD5, SHA series

Linux validation

- Hash algorithms are included and can be executed using special commands

Windows validation

- No hash algorithms built in, but works with 3rd party programs
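
To make the validation step concrete, the sketch below computes CRC-32, MD5 and SHA-256 values for an original and for its acquired image and compares them. It uses Python's standard zlib and hashlib modules; the helper name fingerprints and the sample bytes are assumptions for illustration, not part of any forensic tool.

```python
import hashlib
import zlib

def fingerprints(data: bytes) -> dict:
    """Compute several checksums/hashes of the same data."""
    return {
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

original = b"contents of the evidence drive"
image = bytes(original)                                   # a faithful copy
altered = original.replace(b"evidence", b"Evidence")      # one changed byte

copy_is_valid = fingerprints(image) == fingerprints(original)
change_detected = fingerprints(altered)["sha256"] != fingerprints(original)["sha256"]
```

Matching values before and after acquisition support the claim that the image is an exact copy; a single changed byte produces different hash values.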

Merkle Hash Signature Example

[Figure: tree of a Newspaper document with nodes including Frontpage, Leading, Politic_page, Literary_page, Sport_page, news, Article, Paragraphs and paragraph, and attributes such as title, Author, topic and date, annotated with the Merkle hash values below]

MhX(Author) = h(h(Author) || h(Author.value))

MhX(title) = h(h(title) || h(title.value))

MhX(paragraph) = h(h(paragraph) || h(paragraph.content) || MhX(Author) || MhX(title))
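
The MhX formulas on this slide can be sketched in a few lines. This assumes SHA-256 for h and plain byte concatenation for ||; the slide does not specify either, and the helper names and sample values are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    """The hash function h; SHA-256 is an assumption."""
    return hashlib.sha256(data).digest()

def mhx_leaf(name: str, value: str) -> bytes:
    # MhX(x) = h(h(x) || h(x.value))
    return h(h(name.encode()) + h(value.encode()))

def mhx_node(name: str, content: str, children: list) -> bytes:
    # MhX(n) = h(h(n) || h(n.content) || MhX(child_1) || MhX(child_2) ...)
    return h(h(name.encode()) + h(content.encode()) + b"".join(children))

author = mhx_leaf("Author", "J. Doe")
title = mhx_leaf("title", "Election results")
paragraph = mhx_node("paragraph", "Some text.", [author, title])

# Changing one attribute deep in the tree changes the hash of its parent.
forged_author = mhx_leaf("Author", "M. Roe")
tampered = mhx_node("paragraph", "Some text.", [forged_author, title])
```

Because each node's hash covers its children's hashes, signing only the root value is enough to detect a change anywhere in the document.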

RAID Acquisition Methods

RAID: redundant array of independent disks

RAID storage is used for large files and to support replication

Data is stored using multiple methods

- E.g., striping

When a RAID is acquired, special tools need to be used depending on the way the data is stored
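
To show why the storage layout matters during acquisition, here is a toy model of striping (RAID 0 style, no parity): data is split into fixed-size stripes written round-robin across disks, and the examiner must know the disk order and stripe size to rebuild it. The function names and the tiny stripe size are illustrative assumptions; real arrays add controller metadata and often parity.

```python
def stripe(data: bytes, n_disks: int, stripe_size: int) -> list:
    """Distribute data round-robin across n_disks in stripe_size chunks."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), stripe_size):
        disks[(i // stripe_size) % n_disks] += data[i:i + stripe_size]
    return [bytes(d) for d in disks]

def reassemble(disks: list, stripe_size: int) -> bytes:
    """Rebuild the original data from the per-disk images."""
    out = bytearray()
    offsets = [0] * len(disks)
    d = 0
    while any(off < len(disk) for off, disk in zip(offsets, disks)):
        out += disks[d][offsets[d]:offsets[d] + stripe_size]
        offsets[d] += stripe_size
        d = (d + 1) % len(disks)
    return bytes(out)

data = b"The quick brown fox jumps over the lazy dog"
disks = stripe(data, n_disks=3, stripe_size=4)
rebuilt = reassemble(disks, stripe_size=4)
wrong_guess = reassemble(disks, stripe_size=8)   # wrong stripe size scrambles the data
```

No single disk image holds a readable copy of the file, and a wrong guess about the stripe size yields scrambled output.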

Remote Network Acquisition Tools

Preview the suspect’s files remotely while the computer is being used or powered on

Perform live acquisition while the suspect’s computer is powered on

Encrypt the connection between the suspect’s computer and the examiner’s computer

Copy the RAM while the computer is powered on

Use stealth mode to hide the remote connection from the suspect’s computer

Variation for the individual tools (ProDiscover, EnCase)

Some Forensics Tools

ProDiscover

- http://www.techpathways.com/prodiscoverdft.htm

- http://www.techpathways.com/DesktopDefault.aspx

EnCase

- http://www.guidancesoftware.com/

- http://www.guidancesoftware.com/products/ef_index.asp

NTI Safeback

- http://www.forensics-intl.com/safeback.html

Processing Crime and Incident Scenes: Chapter 5

Topics in Chapter 2

- Securing evidence

- Gathering evidence

- Analyzing evidence

Topics in Chapter 5

- Understanding the rules of evidence

- Collecting evidence in private-sector incident scenes

- Processing law enforcement crime scenes

- Steps to Processing Crime and Incident Scenes

- Case study

Other topics

- Forensics technologies

Securing Evidence

To secure and catalog evidence, large evidence bags, tapes, tags, labels, etc. may be used

Tamper Resistant Evidence Security Bags

- Example: EVIDENT

- “These heavy-duty polyethylene evidence bags require no prepackaging of evidence prior to use. The instantaneous adhesive closure strip is permanent and impossible to open without destroying the seal. A border pattern around the edge of the bag reveals any attempt at cutting or tampering with evidence.”

See also the work of SWGDE (Scientific Working Group on Digital Evidence) and IOCE (International Organization on Computer Evidence)

Gathering Evidence

Bit Stream Copy

- Bit by bit copy of the original drive or storage medium

- Bit stream image is the file containing the bit stream copy of all data on a disk

Using ProDiscover to acquire a thumb drive

- On a thumb drive, locate the write protect switch and place the drive in write protect mode

- Start ProDiscover

- Click Action, Capture Image from menu

- Click Save

- Write name of technician

- Use hash algorithms for security

- Click OK
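
Independently of any particular tool, a bit stream copy with a hash for validation can be sketched as a chunked read/write loop. This is not how ProDiscover is implemented; the function name bitstream_copy is hypothetical, and in-memory buffers stand in for the write-protected source drive and the image file.

```python
import hashlib
import io

def bitstream_copy(src, dst, chunk_size=1 << 20):
    """Copy every byte from src to dst and return the SHA-256 of what was copied."""
    digest = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        digest.update(chunk)
    return digest.hexdigest()

# Stand-ins for the source device and the image file.
source = io.BytesIO(b"\x00\x01 raw sector data " * 4096)
image = io.BytesIO()
acquisition_hash = bitstream_copy(source, image, chunk_size=512)
```

With a real device, src would be opened read-only in binary mode behind a write blocker, and the returned hash would be recorded on the evidence form.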

Analyzing Evidence

Start ProDiscover

Create new file

Click on image file to be analyzed

Search for keywords, patterns and enter patterns to be searched

Click report and export file

Details in Chapter 2

Understanding the Rules of Evidence

Federal rules of evidence; each state also may have its own rules of evidence

- www.usdoj.gov

Computer records are in general hearsay evidence unless they qualify as business records

- Hearsay evidence is second hand or indirect evidence

- Business records are records of regularly conducted business activity such as memos, reports, etc.

Computer records consist of computer generated records and computer stored records

Computer generated records include log files while computer stored records are electronic data

All computer records must be authentic

Private sector incident scenes

Corporate investigations

- Employee termination cases, Attorney-Client privilege investigations, Media leak investigations, Industrial espionage investigations

Private sector incident scenes

- Private sector includes private corporations and government agencies not involved with law enforcement

- They must comply with state public disclosure and federal Freedom of Information act and make certain documents available as public records

- Law enforcement is called if needed (if the investigation becomes a criminal investigation)

Law Enforcement crime Scenes

A law enforcement officer may seize criminal evidence only with probable cause

- A specific crime was committed

- Evidence of the crime exists

- Place to be searched includes the evidence

The forensics team should know about the terminology used in warrants

To prepare for a search and carry out an investigation, the following steps have to be carried out

- Identify the nature of the case, identify the type of computing system, determine whether the computer can be seized, identify the location, determine who is in charge, determine the tools

Steps to processing crime and incident scenes

Seizing a computer incident or crime scene

Seizing the digital evidence at the crime scene

Storing the digital evidence

Obtaining a digital hash

Conducting analysis and reporting

Reference: Chapter 5

Case Study (Chapter 5)

Company A (Mr. Jones) gets an order for widgets from Company B. When the order is ready, B says it did not place the order. A then retrieves the email sent by B. B states it did not send the email. What should A do?

Steps to carry out

- Close Mr. Jones’s Outlook

- Use Windows Explorer to locate the Outlook PST file that has Mr. Jones’s business email

- Determine the size of the PST file and connect an appropriate media device (e.g., USB)

- Copy the PST file to the external USB device

- Fill out evidence form – date/time etc.

- Leave Company A and return to the investigation desk and carry out the investigation (see previous lectures)

Digital Forensics Analysis

Digital Forensics Analysis Techniques

Reconstructing past events

Conclusion and Links

References

- http://www.gladyshev.info/publications/thesis/ Formalizing Event Reconstruction in Digital Investigations, Pavel Gladyshev, Ph.D. dissertation, 2004, University College Dublin, Ireland (Main Reference)

- http://www.porcupine.org/forensics/forensic-discovery/chapter3.html (Background on file systems)

Digital Evidence Examination and Analysis Techniques

Search techniques

Reconstruction of Events

Time Analysis

Search Techniques

Search techniques

- This group of techniques searches collected information to answer the question of whether objects of a given type, such as hacking tools or pictures of a certain kind, are present in the collected information.

- According to the level of search automation, techniques can be grouped into manual browsing and automated searches. Automated searches include keyword search, regular expression search, approximate matching search, custom searches, and search of modifications.

Manual browsing

- Manual browsing means that the forensic analyst browses collected information and singles out objects of the desired type. The only tool used in manual browsing is a viewer of some sort. It takes a data object, such as a file or network packet, decodes the object and presents the result in a human-comprehensible form. Manual browsing is slow. Most investigations collect large quantities of digital information, which makes manual browsing of the entire collected information unacceptably time consuming.

Search Techniques

Keyword search

- This is an automatic search of digital information for data objects containing specified keywords. It is the earliest and the most widespread technique for speeding up manual browsing. The output of a keyword search is the list of found data objects.

- Keywords are rarely sufficient to specify the desired type of data objects precisely. As a result, the output of a keyword search can contain false positives: objects that do not belong to the desired type even though they contain the specified keywords. To remove false positives, the forensic scientist has to manually browse the data objects found by the keyword search.

- Another problem of keyword search is false negatives. They are objects of the desired type that are missed by the search. False negatives occur if the search utility cannot properly interpret the data objects being searched. This may be caused by encryption, compression, or the inability of the search utility to interpret novel data.

- To reduce both problems, good keyword search practice is (1) to choose words and phrases highly specific to the objects of the desired type, such as specific names, addresses, bank account numbers, etc.; and (2) to specify all possible variations of these words.
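
The false positive and false negative behaviour described above is easy to reproduce. A minimal sketch with made-up data objects; the function keyword_search and the file names are hypothetical.

```python
def keyword_search(objects: dict, keywords: list) -> list:
    """Return names of data objects whose text contains any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [name for name, text in objects.items()
            if any(k in text.lower() for k in lowered)]

objects = {
    "memo.txt": "Transfer to account 12-3456 approved by J. Smith",
    "recipe.txt": "Smith family apple pie",
    "note.txt": "Acct 12 3456 is ready",
}

broad = keyword_search(objects, ["smith"])         # includes recipe.txt: a false positive
specific = keyword_search(objects, ["12-3456"])    # misses note.txt: a false negative
with_variations = keyword_search(objects, ["12-3456", "12 3456", "123456"])
```

The last call follows the two recommendations: a highly specific term (the account number) plus its likely spelling variations.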

Search Techniques

Regular expression search

- Regular expression search is an extension of keyword search. Regular expressions provide a more expressive language for describing objects of interest than keywords. Apart from formulating keyword searches, regular expressions can be used to specify searches for Internet e-mail addresses and files of a specific type. The forensic utility EnCase performs regular expression searches.

- Regular expression searches suffer from false positives and false negatives just like keyword searches, because not all types of data can be adequately defined using regular expressions.
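
As a small illustration of a regular expression search for e-mail addresses, the sketch below uses Python's re module. The pattern is a deliberately simple assumption, not the one any forensic tool uses, and the sample text is made up.

```python
import re

# Simplified e-mail pattern: local part, "@", domain, dot, alphabetic suffix.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

text = "Mail jones@company-a.example about logo@2x.png"
hits = EMAIL.findall(text)
```

Here hits contains the real address and also the image file name logo@2x.png, a false positive of the kind described above.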

Search Techniques

Approximate matching search

- Approximate matching search is a development of regular expression search. It uses a matching algorithm that permits character mismatches when searching for a keyword or pattern. The user must specify the degree of mismatch allowed.

- Approximate matching can detect misspelled words, but mismatches also increase the number of false positives. One of the utilities used for approximate search is agrep.
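
The idea of a user-specified mismatch budget can be sketched with edit distance (the number of single-character insertions, deletions and substitutions). This is not agrep's algorithm, only a plain dynamic-programming illustration; the function names and word lists are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def approx_find(words: list, keyword: str, max_mismatches: int) -> list:
    """Return the words within max_mismatches edits of the keyword."""
    return [w for w in words
            if edit_distance(w.lower(), keyword.lower()) <= max_mismatches]

misspelled = approx_find(["embezzlement", "embezlement", "employment"], "embezzlement", 1)
noisy = approx_find(["fraud", "frau", "Freud"], "fraud", 1)
```

The first search catches the misspelling "embezlement"; the second also returns "Freud" and "frau", showing how allowing mismatches brings in extra false positives.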

Search Techniques

Custom searches

- The expressiveness of regular expressions is limited. Searches for objects satisfying more complex criteria are programmed using a general-purpose programming language. For example, the FILTER_1 tool from New Technologies Inc. uses a heuristic procedure to find full names of persons in the collected information. Most custom searches, including the FILTER_1 tool, suffer from false positives and false negatives.

Search Techniques

Search of modifications

- Search of modifications is an automated search for data objects that have been modified since a specified moment in the past. Modification of data objects that are not usually modified, such as operating system utilities, can be detected by comparing their current hash with their expected hash. A library of expected hashes must be built prior to the search. Several tools for building libraries of expected hashes are described in the “file hashes”

- Modification of a file can also be inferred from modification of its timestamp. Although plausible in many cases, this inference is circumstantial. The investigator assumes that a file is always modified simultaneously with its timestamp, and since the timestamp is modified, he infers that the file was modified too. This is a form of event reconstruction.
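
The hash comparison can be sketched as follows, assuming SHA-256 and in-memory byte strings in place of real files. The helper names, paths and file contents are made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def find_modified(current_files: dict, expected_hashes: dict) -> list:
    """Return names of files whose current hash differs from the expected hash."""
    return sorted(name for name, data in current_files.items()
                  if name in expected_hashes
                  and sha256_hex(data) != expected_hashes[name])

# Library of expected hashes, built before the search from known-good copies.
baseline = {"/bin/ls": b"original ls binary", "/bin/ps": b"original ps binary"}
expected = {name: sha256_hex(data) for name, data in baseline.items()}

# State found on the suspect system.
current = {"/bin/ls": b"original ls binary", "/bin/ps": b"trojaned ps binary"}
modified = find_modified(current, expected)
```

Only files present in the library can be checked this way, which is why the library has to be built in advance.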

Event Reconstruction

Search techniques are commonly used for finding incriminating information, because “currently, mere possession of a digital computer links a suspect to all the data it contains”

However, the mere presence of objects does not prove that the owner of the computer is responsible for putting them there.

Apart from being placed by the owner, the objects can be generated automatically by the system, planted by an intruder or a virus program, or left by the previous owner of the computer.

To determine who is responsible, the investigator must reconstruct the past events that caused the presence of the objects.

Reconstruction of events inside a computer requires an understanding of computer functionality.

Many techniques have emerged for reconstructing events in specific operating systems. They can be classified according to the primary object of analysis.


Event Reconstruction

Two major classes of techniques are identified: log file analysis and file system analysis.

Log file analysis

- A log file is a purposefully generated record of past events in a computer system, organized as a sequence of entries. An entry usually consists of a timestamp, an identifier of the process that generated the entry, and some description of the reason for generating the entry.

- It is common to have multiple log files on a single computer system. Different log files are usually created by the operating system for different types of events. In addition, many applications maintain their own log files.

- Log file entries are generated by the system processes when something important (from the process's point of view) happens. For example, a TCP wrapper process may generate one log file entry when a TCP connection is established and another log file entry when the TCP connection is released.
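As an illustration of the entry structure described above (timestamp, process identifier, description), here is a small parsing sketch in Python. The line format, with an ISO 8601 timestamp followed by "process[pid]: description", is an assumption chosen for the example; real log formats vary between systems and applications, and the names LogEntry and parse_entry are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumed line format: "<ISO timestamp> <process>[<pid>]: <description>"
ENTRY_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<process>[\w.-]+)\[(?P<pid>\d+)\]:\s+(?P<description>.*)$"
)


@dataclass(frozen=True)
class LogEntry:
    timestamp: datetime
    process: str
    pid: int
    description: str


def parse_entry(line: str) -> Optional[LogEntry]:
    """Split one log line into its parts, or return None if it does not fit."""
    match = ENTRY_PATTERN.match(line.strip())
    if match is None:
        return None
    try:
        timestamp = datetime.fromisoformat(match["timestamp"])
    except ValueError:
        return None  # the first field was not a valid ISO timestamp
    return LogEntry(
        timestamp=timestamp,
        process=match["process"],
        pid=int(match["pid"]),
        description=match["description"],
    )
```

For example, parse_entry("2010-11-12T10:15:02 tcpd[412]: connect from 10.0.0.5") yields an entry whose process is "tcpd" and whose description is "connect from 10.0.0.5". Lines that do not match are returned as None rather than guessed at.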


Event Reconstruction

- Knowledge of the circumstances in which processes generate log file entries permits a forensic scientist to infer from the presence or absence of log file entries that certain events happened. For example, from the presence of two log file entries generated by a TCP wrapper for some TCP connection X, the forensic scientist can conclude that:

TCP connection X happened

X was established at the time of the first entry

X was released at the time of the second entry

- This reasoning rests on implicit assumptions. It is assumed that the log file entries were generated by the TCP wrapper, which functioned according to the expectations of the forensic scientist; that the entries have not been tampered with; and that the timestamps on the entries reflect the real time at which the entries were generated. It is not always possible to verify these assumptions, which results in several possible explanations for the appearance of the log file entries.
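The inference about connection X can be written down as a small Python sketch. The event representation (a tuple of timestamp, connection identifier, and the word "open" or "close") and the function name infer_connections are assumptions for illustration. The code bakes in exactly the assumptions listed above: that the entries are genuine, untampered, and correctly timestamped.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical event form: (timestamp, connection id, "open" or "close")
Event = Tuple[datetime, str, str]
Span = Tuple[Optional[datetime], Optional[datetime]]


def infer_connections(events: List[Event]) -> Dict[str, Span]:
    """Pair open/close entries per connection.

    Returns, for each connection id, the inferred (established, released)
    times. A None means the corresponding entry is absent, so that half of
    the conclusion cannot be drawn from the log alone.
    """
    spans: Dict[str, Span] = {}
    for timestamp, conn_id, kind in sorted(events):
        opened, closed = spans.get(conn_id, (None, None))
        if kind == "open" and opened is None:
            opened = timestamp
        elif kind == "close":
            closed = timestamp
        spans[conn_id] = (opened, closed)
    return spans
```

A connection with only one of the two entries is kept in the result with a None, because the absence of an entry is itself something the analyst may need to explain.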


Event Reconstruction

- For example, if the possibility of tampering cannot be excluded, then forgery of the log file entries could be a possible explanation for their existence. To combat the uncertainty caused by multiple explanations, the forensic analyst seeks corroborating evidence, which can reduce the number of possible explanations or give stronger support to one explanation.

- Determining temporal order with timestamps.

Timestamps on log file entries are commonly used to determine the temporal order of entries from different log files. The process is complicated by two time-related problems, even if the possibility of tampering is excluded.

First problem: the log file entries may be recorded on different computers with different system clocks. Apart from individual clock imprecision, there may be an unknown skew between the clocks used to produce each of the timestamps. If the skew is unknown, it is possible that the entry with the smaller timestamp was generated after the entry with the bigger timestamp.

Second problem: the resolution of the clocks may be too coarse. As a result, the entries may have identical timestamps, in which case it is also not possible to determine whether one entry was generated before the other.
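Both problems can be made concrete with a small Python sketch. It assumes that the analyst can at least put an upper bound on the skew between the two clocks and knows the clock resolution; the function name temporal_order and its parameters are hypothetical, and timestamps are plain numbers in seconds.

```python
def temporal_order(ts_a: float, ts_b: float,
                   max_skew: float = 0.0, resolution: float = 0.0) -> str:
    """Decide the order of two timestamped entries, if the data allow it.

    max_skew   -- assumed upper bound on the difference between the two clocks
    resolution -- assumed granularity of the clocks
    Returns "a_before_b", "b_before_a", or "undetermined".
    """
    uncertainty = max_skew + resolution
    difference = ts_b - ts_a
    if difference > uncertainty:
        return "a_before_b"
    if difference < -uncertainty:
        return "b_before_a"
    # The gap is within the combined uncertainty (this includes identical
    # timestamps), so either order is consistent with the evidence.
    return "undetermined"
```

With an assumed skew bound of 10 seconds, timestamps 100 and 105 give "undetermined", while 100 and 120 give "a_before_b". If no bound on the skew is available at all, no order can be concluded from the timestamps alone.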


Event Reconstruction

File system analysis

- In most operating systems, a data storage device is represented at the lowest logical level by a sequence of equally sized storage blocks that can be read and written independently.

- Most file systems divide all blocks into two groups. One group is used for storing user data, and the other group is used for storing structural information.

- Structural information includes the structure of the directory tree, file names, locations of data blocks allocated for individual files, locations of unallocated blocks, etc. The operating system manipulates structural information in a certain well-defined way that can be exploited for event reconstruction.


Event Reconstruction

- Detection of deleted files.

Information about individual files is stored in standardized file entries whose organization differs from file system to file system.

In Unix file systems, the information about a file is stored in a combination of i-node and directory entries pointing to that i-node.

In Windows NT file system (NTFS), information about a file is stored in an entry of the Master File Table.

When a disk or a disk partition is first formatted, all such file entries are set to an initial “unallocated” value.

When a file entry is allocated for a file, it becomes active. Its fields are filled with proper information about the file.

In most file systems, however, the file entry is not restored to the “unallocated” value when the file is deleted. As a result, the presence of an inactive file entry whose value is different from the initial “unallocated” value indicates that the file entry once represented a file, which was subsequently deleted.
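The detection rule above can be illustrated with a toy model in Python. The FileEntry class, its fields, and the in_use flag are hypothetical simplifications; real i-nodes and Master File Table records have a different and much richer layout.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FileEntry:
    """Toy file entry: an allocation flag plus leftover metadata."""
    in_use: bool
    name: str = ""
    blocks: Tuple[int, ...] = ()


# The value every entry holds right after formatting (assumed for the model).
UNALLOCATED = FileEntry(in_use=False)


def find_deleted(table: List[FileEntry]) -> List[int]:
    """Return indexes of entries that are inactive but not pristine.

    Such an entry still carries metadata, which suggests it once described
    a file that was later deleted.
    """
    return [
        index
        for index, entry in enumerate(table)
        if not entry.in_use and entry != UNALLOCATED
    ]
```

For a table holding one active entry, one pristine entry, and one inactive entry that still names "notes.txt", only the index of the last one is returned.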


Event Reconstruction

- File attribute analysis.

Every file in a file system, whether active or deleted, has a set of attributes such as name, access permissions, timestamps, and the location of the disk blocks allocated to the file.

File attributes change when applications manipulate files via operating system calls.

File attributes can be analyzed in the same way as log file entries.

- Timestamps are a particularly important source of information for event reconstruction.

In most file systems a file has at least one timestamp. In NTFS, for example, every active (i.e. non-deleted) file has three timestamps, which are collectively known as MAC-times.

Time of last Modification (M)

Time of last Access (A)

Time of Creation (C)


Event Reconstruction

Imagine that there is a log file that records every file operation in the computer.

In this imaginary log file, each of the MAC-times would correspond to the last entry for the corresponding operation (modification, access, or creation) on the file entry in which the timestamp is located.

To visualize this similarity between MAC-times and the log file, the mactimes tool from The Coroner's Toolkit sorts the individual MAC-times of files, both active and deleted, and presents them in a list that resembles a log file.
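The idea of turning MAC-times into a log-like list can be sketched with Python's standard library. This is not the mactimes tool and does not reproduce its output format; the function name mac_timeline is hypothetical. Two caveats are assumptions of the sketch: it only sees active files (deleted entries are not reachable through the normal file API), and the meaning of st_ctime depends on the platform (metadata change time on Unix, creation time on Windows).

```python
from pathlib import Path
from typing import List, Tuple


def mac_timeline(root: Path) -> List[Tuple[float, str, str]]:
    """Return (timestamp, kind, path) tuples for all files under root,
    sorted by time, where kind is "M", "A", or "C"."""
    events = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        events.append((info.st_mtime, "M", str(path)))
        events.append((info.st_atime, "A", str(path)))
        # Platform dependent: change time on Unix, creation time on Windows.
        events.append((info.st_ctime, "C", str(path)))
    return sorted(events)
```

Reading the sorted list from top to bottom gives one entry per timestamp, much like a log file in which only the last event of each kind per file has survived.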

Signatures of different activities can be identified in MAC-times, as in ordinary log files.

Several such signatures that have been published are described next.


Event Reconstruction

Restoration of a directory from a backup: The fact that a directory was restored from a backup can be detected by inequality of timestamps on the directory itself and on its sub-directory '.' or '..'. When the directory is first created, the directory timestamp and the timestamps on its sub-directories '.' and '..' are equal. When the directory is restored from a backup, the directory itself is assigned the old timestamp, but its sub-directories '.' and '..' are timestamped with the time of backup restoration.

Exploit compilation, running, and deletion: The signature of compiling, running, and deleting an exploit program is explored. It is concluded that “when someone compiles, runs, and deletes an exploit program, we expect to find traces of the deleted program source file, of the deleted executable file, as well as traces of compiler temporary files.”

Moving a file: When a file is moved in Microsoft FAT file systems, the old file entry is deleted, and a new file entry is used in the new location. The new file entry maintains the same block allocation information as the old entry. Thus, the discovery of a deleted file entry whose allocation information is identical to that of some active file supports the possibility that the file was moved.
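The file-move signature can be expressed as a matching step over file entries. The Python sketch below uses a toy representation in which each entry is a name mapped to the tuple of blocks it points to; the function name moved_candidates and this representation are assumptions, not the on-disk FAT format.

```python
from typing import Dict, List, Tuple

Blocks = Tuple[int, ...]


def moved_candidates(deleted: Dict[str, Blocks],
                     active: Dict[str, Blocks]) -> List[Tuple[str, str]]:
    """Pair each deleted entry with active entries that share its allocation.

    A match supports, but does not prove, that the file was moved from the
    deleted entry's location to the active entry's location.
    """
    by_blocks: Dict[Blocks, List[str]] = {}
    for name, blocks in active.items():
        by_blocks.setdefault(blocks, []).append(name)

    pairs = []
    for old_name, blocks in deleted.items():
        if not blocks:
            continue  # an empty allocation would match every empty file
        for new_name in by_blocks.get(blocks, []):
            pairs.append((old_name, new_name))
    return sorted(pairs)
```

A match is only corroborating evidence: blocks freed by a deletion can later be re-allocated to an unrelated file, which would produce the same pattern.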


Event Reconstruction

- Reconstruction of deleted files.

In most file systems file deletion does not erase the information stored in the file. Instead, the file entry and the data blocks used by the file are marked as unallocated, so that they can be reused later for another file. Thus, unless the data blocks and the deleted file entry have been re-allocated to another file, the deleted file can usually be recovered by restoring its file entry and data blocks to active status.

Even if the file entry and some of the data blocks have been re-allocated, it may still be possible to reconstruct parts of the file. The lazarus tool, for example, uses several heuristics about file systems and common file formats to find and piece together blocks that (could have) once belonged to a file:

- In most file systems, a file begins at the beginning of a disk block.

- Most file systems write a file into contiguous blocks, if possible.

- Most file formats have a distinguishing pattern of bytes near the beginning of the file.

- For most file formats, the same type of data is stored in all blocks of a file.


Event Reconstruction

Lazarus analyses disk blocks sequentially. For each block, lazarus tries to determine (1) the type of data stored in the block, by calculating heuristic characteristics of the data in the block; and (2) whether the block is the first block in a file, using well-known file signatures. Once a block is identified as a “first block”, all subsequent blocks with the same type of information are appended to it until a new “first block” is found.
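The sequential procedure just described can be imitated with a very small Python sketch. It is not lazarus and does not use its heuristics: the printable-byte test in classify, the three-entry signature table, and the function names are assumptions chosen to keep the example short. The sketch also starts a new group whenever the guessed data type changes, which is one reading of "blocks with the same type".

```python
from typing import List, Optional

# A few well-known leading byte patterns (file signatures).
SIGNATURES = {b"%PDF": "pdf", b"\x89PNG": "png", b"\xff\xd8\xff": "jpeg"}


def classify(block: bytes) -> str:
    """Crude type guess: mostly printable bytes means text, otherwise binary."""
    if not block:
        return "binary"
    printable = sum(1 for b in block if 32 <= b < 127 or b in (9, 10, 13))
    return "text" if printable / len(block) > 0.9 else "binary"


def first_block_type(block: bytes) -> Optional[str]:
    """Return the file type if the block starts with a known signature."""
    for magic, name in SIGNATURES.items():
        if block.startswith(magic):
            return name
    return None


def group_blocks(blocks: List[bytes]) -> List[List[int]]:
    """Group consecutive block indexes into candidate files."""
    groups: List[List[int]] = []
    previous_kind = None
    for index, block in enumerate(blocks):
        kind = classify(block)
        starts_file = first_block_type(block) is not None
        if starts_file or kind != previous_kind or not groups:
            groups.append([index])      # begin a new candidate file
        else:
            groups[-1].append(index)    # same type: extend the current one
        previous_kind = kind
    return groups
```

Each returned group is only a candidate. Fragmented files and neighbouring files of the same type are grouped incorrectly, which matches the caution about reliability expressed in the source.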

This process can be viewed as a very crude and approximate reconstruction based on some knowledge of the file system and application programs. Each reconstructed file can be seen as a statement that the file was once created by an application program that was able to write such a file.

Since lazarus makes very bold assumptions about the file system, its reconstruction is highly unreliable. Despite that fact, lazarus works well for small files that fit entirely in one disk block.

The effectiveness of tools such as lazarus can probably be improved by using more sophisticated techniques for determining the type of information contained in a disk block. One such technique employs support vector machines.


What is Lazarus?

Lazarus is a program that attempts to resurrect deleted files or data from raw data, most often the unallocated portions of a Unix file system, but it can be used on any data, such as system memory, swap, etc.

It has two basic logical pieces: one that grabs input from a source and another that dissects, analyzes, and reports on its findings.

It can be used for recovering lost data and files (removed accidentally or maliciously), as a tool for better understanding how a Unix system works, for investigating or spying on system and user activity, etc.


Time Analysis

Timestamps are a readily available source of time, but they are easy to forge. Several attempts have been made to determine the time of an event using sources other than timestamps.

Currently, two such methods have been published: time bounding and dynamic time analysis.

Time bounding

- Timestamps can be used for determining the temporal order of events. The inverse of this process is also possible: if the temporal order of events is known a priori, then it can be used to estimate the time of events.

- Suppose that three events A, B, and C happened. Suppose also that it is known that event A happened before event B, and that event B happened before event C. The time of event B must, therefore, be bounded by the times of events A and C.
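The A, B, C example generalizes to any sequence of events whose order is known. The Python sketch below assumes the events are supplied in their known order, each with a trusted time or None if the time is unknown; the function name time_bounds and the use of plain numbers for times are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

Bound = Tuple[Optional[float], Optional[float]]


def time_bounds(ordered_events: List[Tuple[str, Optional[float]]]) -> Dict[str, Bound]:
    """Bound the time of each event using the known times of its neighbours.

    ordered_events lists (name, time) in known temporal order; time is None
    when unknown. The result maps each name to (earliest, latest), where
    None means no bound is available on that side.
    """
    lower: List[Optional[float]] = []
    latest_known = None
    for _, time in ordered_events:           # forward pass: lower bounds
        if time is not None:
            latest_known = time
        lower.append(latest_known)

    upper: List[Optional[float]] = []
    next_known = None
    for _, time in reversed(ordered_events):  # backward pass: upper bounds
        if time is not None:
            next_known = time
        upper.append(next_known)
    upper.reverse()

    return {name: (lower[i], upper[i])
            for i, (name, _) in enumerate(ordered_events)}
```

With A at time 100, C at time 250, and B unknown, the sketch reports that B lies between 100 and 250, which is exactly the bound stated in the text.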


Time Analysis

Dynamic time analysis

- External sources of time may be used; one could exploit the ability of web servers to insert timestamps into the web pages that they transmit to client computers.

- As a result of this insertion, a web page stored in a web browser's disk cache has two timestamps.

- The first timestamp is the creation time of the file that contains the web page. The second timestamp is the timestamp inserted by the web server.

- The offset between the two timestamps of the web page reflects the deviation of the local clock from the real time. It is proposed to use that offset to calculate the real time of other timestamps on the local machine.

- To improve precision, it is proposed to use the average offset calculated for a number of web pages downloaded from different web servers.

- This analysis assumes that (1) timestamps are not tampered with, and that (2) the offset between the system clock and real time is constant at all times (or at least that it does not deviate dramatically).
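The offset calculation can be sketched in Python as follows. The function names estimate_offset and to_real_time, and the representation of each cached page as a pair (local file creation time, server-inserted timestamp) in seconds, are assumptions for illustration. The sketch relies on the two assumptions stated above and ignores network delay between server and client.

```python
from statistics import mean
from typing import List, Tuple


def estimate_offset(cached_pages: List[Tuple[float, float]]) -> float:
    """Average deviation of the local clock from server time.

    Each item is (local_creation_time, server_timestamp). A positive result
    means the local clock runs ahead of the servers.
    """
    if not cached_pages:
        raise ValueError("at least one cached page is required")
    return mean(local - server for local, server in cached_pages)


def to_real_time(local_timestamp: float, offset: float) -> float:
    """Correct another local timestamp using the estimated offset."""
    return local_timestamp - offset
```

For three cached pages with offsets of 60, 62, and 58 seconds, the estimated offset is 60 seconds, so a local timestamp of 5000 corresponds to an estimated real time of 4940.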


Conclusion - 1

Important to have experts for computer forensics evidence gathering and analysis

Important to secure the evidence: authenticity, completeness, integrity

Important to have the proper tools for analysis

Important to apply the correct legal tests

Computer forensics can be used to benefit both the “good and bad guys”

Need to be several steps smarter than the enemy


Conclusion - 2

The need for effective and efficient digital forensic analysis has been a major driving force in the development of digital forensics.

Manual browsing was initially the only way to do digital forensics.

It was later augmented with various search utilities and, more recently, with tools such as mactimes and lazarus that support more in-depth analysis of digital evidence.

Due to the limited time and manpower available to a forensic investigation, there is a constant demand for tools and techniques that increase the accuracy of digital forensic analysis and minimize the time required for it.