
Computer Forensic Timeline Analysis with Tapestry


SANS Institute InfoSec Reading Room

This paper is from the SANS Institute Reading Room site. Reposting is not permitted without express written permission.


© 2011 SANS Institute, Author retains full rights.

As part of the Information Security Reading Room.

GIAC (GCFA) Gold Certification

Author: Derek Edwards, [email protected]
Advisor: Rick Wanner

Accepted: November 12th, 2011

Abstract

Computer forensics requires applying computer science to answer legal questions. Arranging events chronologically is a good way of telling a clear, concise story. As valuable as date- and time-based information often is to a case, none of the leading forensic tools offers usable date- and time-oriented features. Log2timeline is an excellent tool for extracting date- and time-based information from digital evidence. In fact, the amount of information it extracts can overwhelm the examiner. Most computer forensic timeline tools focus on either collection or presentation of timeline data, but few options exist for storing the large amount of data, much less managing it throughout the analysis process, through which a data set is reduced to its most representative and relevant set of facts. Organizing timeline data so that it can be viewed either in summary or “drilled down” in detail reduces the sense of being overwhelmed that computer forensics examiners experience while analyzing the large data sets that accompany timeline analysis. This approach to organizing the data has proven effective in increasing the usefulness of other large data sets, such as intrusion detection databases. A converter for log2timeline output and an Apache-MySQL-PHP web application are described.


1. Introduction

One question commonly asked of investigators and incident responders is, “What happened?” The answer often takes the form of a story designed to convey understanding of complex mechanisms and interactions in a simple, straightforward way. One of the simplest and most intuitive ways to organize a story is chronologically.

Early in our lives we learn to tell stories by starting at the beginning, and continuing through the middle to the end. While hearing the imagination in a young child’s story is great fun, hearing baby’s first bold-faced lie is usually not. But even at that, the implausible joined to the improbable by, “… and then ...,” is a relaxing stretch of those imagination muscles we forgot we had.

As simple and intuitive as chronological organization is, few forensic tools support a useful chronological view. Forensic tools have been much more focused on parsing and extracting data from and about artifacts like a UNIX file system, the Windows Registry or a mobile phone.

For example, a Microsoft Word document has its content, the information that you read or print, and metadata, information about the file: its size, its title, its author, some keywords, and some dates and times. Some of the dates and times are internal, meaning that they are stored in the document, and some are stored by the file system upon which the document is saved. When we save this document to an NTFS file system on a Windows system, NTFS maintains additional metadata about the document, including more dates and times. When the document is opened, link (.LNK) files that point to it are created. In addition to the metadata about the document contained in the link files, NTFS maintains metadata about the link files themselves.

1.1. Answering the Right Questions

Merriam-Webster’s dictionary defines “forensic” as, “relating to or dealing with the application of scientific knowledge to legal problems.” (Merriam-Webster, Inc.) Courts look to computer forensic examiners to apply our knowledge of how computers work to determine not just “what happened,” but also to give opinions on “what could have happened.” Testimony from recent high-profile cases shows the importance time can play in a case and the confusion that a lack of understanding of date and time can introduce in court.

The prosecution in the Conrad Murray manslaughter case used a “timeline of negligence” developed from mobile phone records to show that Dr. Murray wasted valuable time calling his girlfriends instead of calling 911 emergency services. (Lloyd, Healy, & Klemack, 2011)

In the Casey Anthony murder trial, the legal question involved computer users’ degree of interest in chloroform. The defense sought to use a Mozilla Firefox cache timeline to minimize focus on Google searches for “chloroform.” Jones observes regarding questions posed to John Bradley, the developer of the Cacheback web browser history parsing tool: “The defense attorney then asked Bradley about the time difference between the user browsing the chloroform content and then browsing other subject material not related to chloroform. The period of time was approximately three minutes, according to Bradley and the Cacheback report he read from. The attorney then asked Bradley if three minutes was the longest time the user at the computer could view this particular chloroform material. Bradley answered that he was correct.” (Jones, 2011)

Bradley’s answer may have initiated the thought that the chloroform web page was browsed for only three minutes. About this, Jones comments: “Personally, I disagreed with Bradley’s testimony about the user browsing chloroform content for only three minutes. I do not believe you can have that opinion, given the facts. For example, Firefox version 2 had support for tabbed browsing and the user could also open multiple browser windows. Every time a user visits a new site in a new tab or browser, a new entry is made into the internet history file (or the “visited” counter is incremented). Firefox version 2 internet history files will not record the time a browser or tab is closed, or when the subject matter can no longer be viewed by the user. I feel the defense attorney was trying to demonstrate that the user could not possibly have enough time to read about “how to make chloroform”, but in reality, in this case, it seems as if we are unable to tell how long the material was available to the user on the screen. However, Bradley did state that the article for chloroform was visited 84 times by a user on this computer. I believe this is strong evidence to support that a person had been reading the material numerous times in the past, and the amount of time the user spent reading about chloroform could not be accurately established with the data used in Bradley’s testimony.” (Jones, 2011)

Just over a month later, before the end of the murder trial, Bradley informed investigators and prosecutors that the “84 times” Cacheback reported the chloroform article was visited was incorrect. There was only one visit to the chloroform web page. (TBO.com, 2011)

1.2. Prior Work

Log2timeline is an excellent framework for extraction of time and date information, but a key criticism is the volume of information it collects. While there are tools for visualizing timelines, they can only present a limited data set. Other than spreadsheets, or potentially Computer Forensic TimeLab (CFTL), which is not publicly available, there are no tools that “bridge the gap” between the overwhelming data sets that log2timeline or timescanner can generate and the smaller, representative data sets that could be visualized using SIMILE UI tools.

1.2.1. Leading Forensic Analysis Tools

Leading forensic analysis tools show MACB (Modification, Access, Creation, and Birth) times in separate columns, as shown in Table 1: MACB Times for Two Files below. By convention, “M” represents times data was modified, “A” represents times files or entries were read or otherwise accessed, “C” represents times files or entries were created, and “B” represents times when metadata about entries was updated, for example when a Windows New Technology File System (NTFS) Master File Table (MFT) entry is modified because a file is renamed.

Table 1: MACB Times for Two Files

Filename      Created (C)         Accessed (A)        Written (M)         Entry Modified (B)
Autoexec.bat  11/3/2010 10:26:30  11/3/2010 10:26:30  11/3/2010 15:26:30  11/3/2010 15:26:30
Config.sys    11/3/2010 10:26:30  11/3/2010 15:26:30  11/3/2010 10:26:30  11/3/2010 15:26:30


To find what may have occurred on certain dates, the table must be sorted by the C column, then by the A column, then by the M column, then by the B column. With a large data set, such as all files on a system’s C: drive, you’ll forget the target date a few times while waiting for the sort to finish, then forget whether you learned anything interesting.

A timeline view of the same data is as follows in Table 2. Note that we can tell that both autoexec.bat and config.sys were created at 10:26:30 and modified at 15:26:30.

Table 2: Timeline View

Date       Time      Filename      MACB
11/3/2010  15:26:30  Autoexec.bat  M..B
11/3/2010  15:26:30  Config.sys    M..B
11/3/2010  10:26:30  Autoexec.bat  .AC.
11/3/2010  10:26:30  Config.sys    .AC.
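The step from a column-per-timestamp layout to a timeline layout can be illustrated with a short Python sketch. It is not taken from any forensic tool; the function name `to_timeline`, the dictionary layout, and the ISO-style timestamp strings (chosen so that plain string sorting is chronological) are all assumptions made for illustration. The sample input reuses the Autoexec.bat row of Table 1.

```python
from collections import defaultdict

# Order of the flag letters in the output string, following the M-A-C-B layout.
FLAG_ORDER = "MACB"

def to_timeline(files):
    """Turn per-file MACB columns into one row per (timestamp, filename).

    `files` maps a filename to a dict of flag letter -> timestamp string.
    Returns (timestamp, filename, flags) rows sorted newest first; a flag
    string such as "M..B" means the M and B times share that timestamp.
    """
    grouped = defaultdict(set)
    for name, stamps in files.items():
        for flag, stamp in stamps.items():
            grouped[(stamp, name)].add(flag)

    rows = []
    for (stamp, name), flags in grouped.items():
        flag_string = "".join(f if f in flags else "." for f in FLAG_ORDER)
        rows.append((stamp, name, flag_string))

    # Two stable sorts: by filename first, then newest timestamp first.
    rows.sort(key=lambda r: r[1])
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows

if __name__ == "__main__":
    sample = {
        "Autoexec.bat": {
            "C": "2010-11-03 10:26:30",
            "A": "2010-11-03 10:26:30",
            "M": "2010-11-03 15:26:30",
            "B": "2010-11-03 15:26:30",
        }
    }
    for row in to_timeline(sample):
        print(*row)
```

Each file contributes at most four rows, one per distinct timestamp, so a single chronological sort replaces the four column sorts described above.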

1.2.2. Log2timeline

Log2timeline is a versatile and flexible software framework and tool suite for collecting time and date information from computers. (Gudjonsson, 2009) Kristinn Guðjónsson's seminal paper on log2timeline notes, “The amount of data collected in a super timeline can be overwhelming to an investigator.” (Guðjónsson, 2010)

1.2.3. Spreadsheets

Timeline data can easily be imported into a spreadsheet using the comma-separated values (.CSV) log2timeline output plug-in. Spreadsheets are good for displaying data, searching, sorting, and highlighting important events, but their performance deteriorates quickly with the size of the data set. The examiner must analyze the timeline by navigating the entire data set.

Loading the data to a database table and linking it to a spreadsheet can improve performance by dividing the processing load between the examiner's workstation and a database server, but the essential issue remains.

A great way to improve the examiner's efficiency is to create a pivot table from the data, although I find that few know this valuable skill. A pivot table first allows an examiner to summarize data by keys of interest, e.g. month and year. The examiner can then “drill down” within a month and year of interest to see timeline artifacts that were impacted in that month and year.
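The summarize-then-drill-down idea behind a pivot table can be sketched in a few lines of Python. This is only an illustration of the concept, not part of Tapestry; the function names, the "YYYY-MM-DD HH:MM:SS" timestamp format, and the sample events are assumptions.

```python
from collections import Counter

def summarize(events, key_length):
    """Count events per timestamp prefix.

    `events` is a list of (timestamp, description) pairs with timestamps
    formatted "YYYY-MM-DD HH:MM:SS". key_length=7 groups by month
    ("YYYY-MM"); key_length=10 groups by day ("YYYY-MM-DD").
    """
    return Counter(stamp[:key_length] for stamp, _ in events)

def drill_down(events, prefix):
    """Return only the events whose timestamp starts with `prefix`."""
    return [(stamp, desc) for stamp, desc in events if stamp.startswith(prefix)]

if __name__ == "__main__":
    # Hypothetical events, for illustration only.
    events = [
        ("2010-11-03 10:26:30", "Autoexec.bat created"),
        ("2010-11-03 15:26:30", "Autoexec.bat modified"),
        ("2010-12-01 08:00:00", "hypothetical later event"),
    ]
    print(summarize(events, 7))           # counts per month
    print(drill_down(events, "2010-11"))  # detail rows for one month
```

The examiner reads the short summary first and only pulls detail rows for the months that look interesting.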

The strategy outlined in this paper increases the performance and scalability of timeline analysis by organizing timeline data in an optimized database.

1.2.4. Computer Forensic TimeLab (CFTL)

One system that does combine time and date extraction with a database and a time- and date-based user interface is the Computer Forensic TimeLab (CFTL). Olsson and Boldt demonstrated that examiners completed forensic analysis tasks more efficiently and effectively using CFTL than when they used AccessData’s Forensic Toolkit (FTK). However, CFTL is not publicly available. (Olsson & Boldt, 2009)

1.2.5. SIMILE Visual Timeline

The SIMILE timeline user interface elements (Massachusetts Institute of Technology, 2010) are limited in the number of events that can be shown on a single display. Both examples on the log2timeline page display about 200 events. (Gudjonsson, 2009) Using timescanner to process even the most basic system is likely to return an order of magnitude more events.

2. Tapestry for Timelines

What is needed is a timeline database that keeps the examiner from being overwhelmed by summarizing data, focuses analysis by supporting filtering of the data set, and increases understanding of the data by supporting automated and manual highlighting.

Home improvement TV shows like Clean House that reduce clutter focus first on eliminating extra stuff, then on organization of the things that remain. Since we don't have the option of eliminating evidence, let's focus on the evidence's organization.

A tapestry is fabric woven from multiple lines together. In a similar way, the log2timeline import script and database application will be used to organize our timelines so that we can better see patterns.

If Tapestry does a good enough job of organizing our data set, we can give it the reward for a job well done: more work. Where we would have used log2timeline and existing plugins to analyze one or a few systems, we can use Tapestry to analyze a small subnet.

2.1. Tapestry Input Script

The Tapestry Input Script (test_import.pl) is used to import logs to the Tapestry database. Import is a two-step process. First, we use Log2Timeline or timescanner to extract data in comma-separated values (.CSV) format, then we use the Tapestry Input Script, which submits the data to the Tapestry Apache/PHP/MySQL database application over HTTP.

First, we prepare by mounting the forensic drive images, in this case, “imagefile.dd” as “/mnt/unix”.

# mount -o loop,ro,noexec imagefile.dd /mnt/unix

Next, we run timescanner to extract dates and times in CSV format (step 1).

# timescanner -z CST6CDT -name apollo -d /mnt/unix -o csv -w time.csv

Then, we run fls and log2timeline to capture dates and times in mactime format, and convert them to CSV format (still step 1).

# fls -m / -r hd6.img > bodyfile
# log2timeline -name apollo -i mactime -r bodyfile -o csv -w time2.csv

And in step 2, we send the data to the Tapestry server:

# test_import.pl http://192.168.1.179/Timeline/ time.csv
# test_import.pl http://192.168.1.179/Timeline/ time2.csv
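To make step 2 more concrete, the following Python sketch shows one way a client could split a CSV file into batches and wrap each batch in an HTTP POST. It is not the test_import.pl script and does not reproduce its wire format, which is not documented in this paper: the function names, the form field name "csv", the batch size, and the three-column sample are all hypothetical. The request is only constructed, never sent.

```python
import csv
import io
import urllib.parse
import urllib.request

def chunk_csv(csv_text, batch_size):
    """Split CSV text into smaller CSV strings that each repeat the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(reader)
    chunks = []
    for start in range(0, len(rows), batch_size):
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows[start:start + batch_size])
        chunks.append(out.getvalue())
    return chunks

def build_post(url, csv_chunk):
    """Build (but do not send) an HTTP POST carrying one CSV chunk.

    The form field name "csv" is a placeholder, not Tapestry's real parameter.
    """
    body = urllib.parse.urlencode({"csv": csv_chunk}).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")

if __name__ == "__main__":
    # Reduced, hypothetical column set; real log2timeline CSV has more fields.
    sample = (
        "date,time,host\n"
        "11/03/2010,10:26:30,apollo\n"
        "11/03/2010,15:26:30,apollo\n"
        "11/04/2010,08:00:00,apollo\n"
    )
    for chunk in chunk_csv(sample, batch_size=2):
        request = build_post("http://192.168.1.179/Timeline/", chunk)
        print(request.get_method(), request.full_url, len(request.data), "bytes")
```

Sending in batches keeps each request small, which matters when a single timescanner run produces a very large CSV file.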


2.2. Using the Tapestry Database Application

2.2.1. The Summary View and Filtering

The default Tapestry for Timelines page shows a summary of the timeline data, grouped by month and year. A sample of the data grouped by month and year is shown in Figure 1.

By default, output is limited to 100 date-time groups. The display shows the date and optionally the time.

Next to the date-time group in the “Date Time” column is a plus “+” sign and a minus “-” sign. The “+” links increase the level of detail in the data set. For example, when the data are summarized by month and year, clicking the plus causes the data to be summarized by month, day and year. Another click will summarize the data by month, day, year and hour of the day. Likewise, the minus sign reduces the level of detail. If, for example, the “date zoom” level is set such that items are grouped by month, day, year and time in hours, minutes, and seconds, clicking the minus sign will reduce the summary level to month, day, year and time in hours and minutes.
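The “date zoom” behavior of the plus and minus links can be modeled as stepping through an ordered list of grouping levels. The Python sketch below is an illustration only and is not Tapestry's PHP code; the level names, the "YYYY-MM-DD HH:MM:SS" timestamp format, and the function names are assumptions.

```python
# Grouping levels from coarsest to finest, as described for the plus/minus links.
ZOOM_LEVELS = ["month", "day", "hour", "minute", "second"]

# Number of leading characters of "YYYY-MM-DD HH:MM:SS" kept at each level.
PREFIX_LENGTH = {"month": 7, "day": 10, "hour": 13, "minute": 16, "second": 19}

def zoom(level, step):
    """Move `step` levels finer (+1) or coarser (-1), clamped to the valid range."""
    index = ZOOM_LEVELS.index(level) + step
    index = max(0, min(index, len(ZOOM_LEVELS) - 1))
    return ZOOM_LEVELS[index]

def group_key(timestamp, level):
    """Return the date-time group a timestamp falls into at the given level."""
    return timestamp[:PREFIX_LENGTH[level]]

if __name__ == "__main__":
    level = "month"
    stamp = "2009-08-14 09:30:15"  # hypothetical timestamp
    for _ in range(3):
        print(level, "->", group_key(stamp, level))
        level = zoom(level, +1)    # simulate clicking "+"
```

Clamping at both ends means repeated clicks on “-” at the month level, or on “+” at the second level, simply leave the view unchanged.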

Next in the “Date Time” column is the details link. Clicking this will take you to the “Details View” of the data, which is described below in section 2.2.2.

The next grouping level is the host from which timeline data was gathered. To limit the data set to the host “FAM1,” one would click on “FAM1” in the Host column in Figure 1.

Figure 1: Summary View Groupings


The next grouping levels pertain to the Log2timeline input plugin that extracted the data. The Log Source Type is a general category of timeline artifact. Figure 1, for example, shows a variety of Source Types. “EXIF” stands for the Exchangeable Image File Format, which specifies a common metadata format for image files. Devices like digital cameras write metadata in EXIF format that includes the time and date a picture was taken (assuming the camera’s date and time are accurate). Compilers can also include EXIF data in executable files. “REG” stands for the Windows Registry, which is loaded with dates and times. “EVT” stands for Windows Event Log entries, which are time- and date-stamped. “WEBHIST” stands for Microsoft Internet Explorer browser history entries, which have access and modification times.

The Log Source column is a more specific category of timeline artifact. Within the “REG” Source Type as displayed in Figure 1, there is “Deleted Registry”, “NTUSER key”, “SAM key”, “SOFTWARE key”, “SYSTEM key” and “UserAssist key.” The NTUSER key represents registry entries in the HKEY_CURRENT_USER “hive” in Windows Registry Editor. These entries are stored in NTUSER.dat files in Windows users’ home directories: C:\Documents and Settings\User1\NTUSER.dat on Windows XP or C:\Users\User1\NTUSER.dat on Windows 7 or Vista, for example. The UserAssist key can be an especially juicy source of date and time information, because it is where the list of recently-used programs that appears on the Windows Start Menu is stored, with time stamps. The SAM key represents the Windows Security Accounts Manager database. The SOFTWARE and SYSTEM keys refer to the HKEY_LOCAL_MACHINE\SOFTWARE and HKEY_LOCAL_MACHINE\SYSTEM registry hives, respectively. SAM, SOFTWARE, and SYSTEM are stored in the %SYSTEMROOT%\System32\config folder.

Clicking on a Log Source Type or Log Source filters the summary view to that source type or source. For example, clicking on “Event Log” will limit the summary to Windows Event Log items.

The L2T Format and the L2T Version columns identify the Perl class in Log2timeline that extracted the date and time information and the version number of Log2timeline that was used. Note in Figure 1 that the Log2t::input::ntuser input plugin only gathered entries with the “REG” source type, but extracted “NTUSER key”, “Deleted Registry” and “UserAssist key” sources to the data set.

The “continue” links at the top and bottom of the data set accommodate navigation forward and backward in the data set, as shown in Figure 1, if additional data exists.

Figure 2 shows the remainder of the Summary View. The Total column gives a record count for each combination of Date and Time, Host, Log Source Type, Log Source, L2T Format and L2T Version. To see all “Event Log” entries in August 2009 in the Detail View, for example, click on the number “27” in the Total column.

The summary view also facilitates displaying records that match “MACB” characteristics in the Detail View. Clicking on numbers under M, A, C or B limits the data set to entries that have that particular flag set. For example, clicking a number in the “M” column will show entries of the chosen type whose modification dates changed during the chosen date-time group.

Note that the Total is not mathematically related to the MACB counts. Figure 2 shows that there are 27 Event Log entries in August 2009, but the MACB columns account for only 20. There may be multiple log entries that have the same content (but they are not duplicates).

2.2.2. Details View

The details view is essentially the same information presented by log2timeline's CSV output plugin. An example of the details view is given in Figure 3.

Figure 2: Summary View

Figure 3: Details View


While the specifics of the columns’ content may differ depending on the input plugin used by Log2timeline, the Count, Date, Time, MACB, Source, Source Type, Version and Format columns are as described above. The Type column is a human-readable, input-plugin-specific description of the information in the Source, Source Type and MACB columns, like “Last Written” in the first line of the view in Figure 3. Short and Description are often the same, but they can contain log rows, filenames and/or registry keys, and what was done to them. Filename is usually the name of the file on the analysis system (not necessarily the name on the source system) from which dates and times were extracted. Inode is the NTFS Master File Table record number or UNIX file inode number. The Notes and Extra columns hold additional information stored by the input plugin.

2.2.3. Highlighting

After a set of interesting detail records has been selected, the records can be highlighted and added to a group. Each highlighted row has green bold text, and it looks like a depressed button.

Clicking the “add to a group” link shown in Figure 4 brings up the Group Editor.

As shown in Figure 5, you can use the Group Editor to assign a name, a description, and a highlight color to each group. Note that “USB Mount” is an existing group that could be selected. In Figure 5, the user typed “OfficeJet 6310 Printer” as the group name, described it as “OfficeJet 6310 Printer installed” and chose yellow as the highlight color.

Figure 4: "add to a group" and "report" Links

Figure 5: Group Editor

Once the user clicked OK, the selected rows and the rows pertaining to the “USB Mount” group are highlighted, as shown in Figure 6.

Figure 6: Detail View with Rows Highlighted


2.2.4. Generating a Report

Clicking on the report link in Figure 4 brings up a reverse chronological report of entries that were highlighted. While too busy for judge and jury, it is intended to summarize the data that were noted as important and thereby provide input to future reports.

2.2.5. Word Search

Another feature is word search. While data is being imported, “words” (file and registry path name parts, terms in log files, etc.) are extracted from the data and saved to a database table. Entries containing words of interest can be located using word search. Entering a word or part of a word in the search box and clicking “search” brings up a menu of matching words in the data. Clicking one of the words brings up matching entries in the detail view.

For example, Figure 8 shows a sample of the list of words found when the term “linuz” is entered.

Figure 7: Report

Figure 8: Word Search
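The word extraction and partial-word lookup described in this section can be sketched in Python as follows. This is a minimal illustration, not Tapestry's implementation: the tokenizing rule (split on anything that is not a letter or digit), the in-memory dictionary standing in for the database table, and the sample paths are all assumptions.

```python
import re

def extract_words(text):
    """Split a path or log line into lowercase alphanumeric "words"."""
    return {w.lower() for w in re.split(r"[^A-Za-z0-9]+", text) if w}

def build_index(entries):
    """Map each word to the set of entry ids whose text contains it."""
    index = {}
    for entry_id, text in entries.items():
        for word in extract_words(text):
            index.setdefault(word, set()).add(entry_id)
    return index

def search(index, term):
    """Return the sorted indexed words that contain `term` as a substring."""
    term = term.lower()
    return sorted(word for word in index if term in word)

if __name__ == "__main__":
    # Hypothetical timeline entries keyed by record id.
    entries = {
        1: "/boot/vmlinuz-2.6.32",
        2: "/boot/initrd.img",
        3: "/usr/src/linux/vmlinuz.old",
    }
    index = build_index(entries)
    for word in search(index, "linuz"):   # menu of matching words
        print(word, sorted(index[word]))  # entries behind each word
```

Building the word list once at import time means a search only scans the distinct words, not every timeline entry.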


2.3. The Tapestry Database Design

The Data Warehouse Toolkit (Kimball & Ross, 2002) outlines a 4-step dimensional design process.

1. Select the business process to model
2. Declare the grain of the business process
3. Choose the dimensions that apply to each fact table row
4. Identify the numeric facts that apply to each fact table row

2.3.1. Selecting the Business Process

Our business process is computer forensic timeline analysis. We are using dimensional modeling to summarize the data so that time periods of greatest or highest-risk activity can be more easily seen and accessed for details.

2.3.2. Declaring the Grain

In our timeline summaries, we want to be able to list and count the various types of timeline artifacts. We should be able to summarize events with time fidelity down to the second. We also want our database to be flexible enough to accommodate new types of timeline artifacts as tools for extracting them mature. We would also like to distinguish the hosts from which timeline data were gathered.

For example, given a document, we want to be able to represent all of the types of timeline dates and times:

• The hosts upon which changes to the document occurred
• The document's modified, last accessed, created, and entry modified (MACB) times
• LastPrinted metadata in the document
• CreateDate metadata in the document
• ModifyDate metadata in the document

2.3.3. Choosing Dimensions

The timeline database is organized in a Star Schema. A star schema database optimizes searching, sorting, grouping and filtering a data set. The data elements we want to search, sort or filter upon are known as “dimensions” of analysis. Relationships between dimensions are brought together through database foreign key relationships in a “fact” table.


2.3.4. Identifying Numeric Facts

In a timeline, we're interested in understanding various events that have taken place over time to a system's resources (files, folders, registry keys, pictures). So we end up with dimensions of time, date and source. Given an image, for example, there are multiple facts that pertain to it: which hosts have the image, the image's MACB times and the image's EXIF properties, if they exist. Each of these “facts” is modeled as a row pointing to a single date, a single time and a single source (host, log type, and MACB flags).

Analysis is facilitated by selecting the facts pertaining to the dimensions of interest. For example, we can select a date and time to find, through the fact table, the resources that changed at that date or time, and we can also determine how the resources were changed through the source dimension.

Since we are using dimensions for searching, sorting, grouping and filtering, it becomes important that rows in dimension tables store unique combinations of values. Indexes (or indices) not only enforce uniqueness, they also enhance performance of the database queries executed by the web application. Without indexes, the database engine has to read each row of each table called for in a query. Queries on tables that are indexed can often be answered much more quickly, because the items to be searched, sorted, grouped or filtered can be looked up in the index. The costs of creating indexes are disk space and time. The investment of time and disk space while loading pays off handsomely in reduced analysis time while processing the case. Each dimension table has a unique index on all its fields.
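The star schema and its unique indexes can be illustrated with a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table names tln_date, tln_time and tln_source follow the database reference in this paper; the fact table name tln_fact, the reduced column lists, the helper function names and the sample descriptions are assumptions made for illustration.

```python
import sqlite3

# Reduced star schema: three dimension tables, each with a UNIQUE constraint
# across its value columns, and one fact table that points at them.
SCHEMA = """
CREATE TABLE tln_date   (tln_date_id INTEGER PRIMARY KEY, day TEXT, UNIQUE (day));
CREATE TABLE tln_time   (tln_time_id INTEGER PRIMARY KEY, second TEXT, UNIQUE (second));
CREATE TABLE tln_source (tln_source_id INTEGER PRIMARY KEY, host TEXT,
                         sourcetype TEXT, macb TEXT,
                         UNIQUE (host, sourcetype, macb));
CREATE TABLE tln_fact   (tln_date_id INTEGER REFERENCES tln_date,
                         tln_time_id INTEGER REFERENCES tln_time,
                         tln_source_id INTEGER REFERENCES tln_source,
                         description TEXT);
"""

def dimension_id(conn, table, values):
    """Return the id of the dimension row holding `values`, inserting it if new."""
    columns = ", ".join(values)
    marks = ", ".join("?" for _ in values)
    params = tuple(values.values())
    # The UNIQUE constraint makes a repeated insert a no-op.
    conn.execute(f"INSERT OR IGNORE INTO {table} ({columns}) VALUES ({marks})", params)
    where = " AND ".join(f"{column} = ?" for column in values)
    row = conn.execute(f"SELECT {table}_id FROM {table} WHERE {where}", params).fetchone()
    return row[0]

def add_fact(conn, day, second, host, sourcetype, macb, description):
    """Store one timeline fact as a row pointing to one date, time and source."""
    ids = (
        dimension_id(conn, "tln_date", {"day": day}),
        dimension_id(conn, "tln_time", {"second": second}),
        dimension_id(conn, "tln_source",
                     {"host": host, "sourcetype": sourcetype, "macb": macb}),
    )
    conn.execute("INSERT INTO tln_fact VALUES (?, ?, ?, ?)", ids + (description,))

def monthly_counts(conn):
    """Summarize facts per month and source type by joining the dimensions."""
    return conn.execute("""
        SELECT substr(d.day, 1, 7) AS month, s.sourcetype, COUNT(*)
        FROM tln_fact f
        JOIN tln_date d ON f.tln_date_id = d.tln_date_id
        JOIN tln_source s ON f.tln_source_id = s.tln_source_id
        GROUP BY month, s.sourcetype
        ORDER BY month, s.sourcetype
    """).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    add_fact(conn, "2009-08-14", "09:30:15", "FAM1", "EVT", "M...", "hypothetical event")
    add_fact(conn, "2009-08-20", "12:03:57", "FAM1", "REG", "M...", "hypothetical key")
    print(monthly_counts(conn))
```

Because each dimension row is stored once, the fact table holds only small integer keys plus the description, and the summary query groups on short dimension values instead of scanning raw log text.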

2.4. Extraction, Transformation, and Loading (ETL)

The time advantages of reorganizing data are not without cost. Time is taken during extraction, transformation, and loading (ETL) processing to place and index the data. However, ETL is done only once.

Data introduced to the system is first placed in the tln_import table. It is essentially the comma-separated values (CSV) data imported directly into the MySQL database.


An additional field found in many of the tables is the “tln_concurrency_id” field.

It is used while loading the database to distinguish one load batch from another.
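
The staging step can be sketched as follows. This is an assumption-laden illustration rather than Tapestry's actual loader: it uses SQLite through Python's sqlite3 module instead of MySQL, the CSV is reduced to five hypothetical columns, and the way batch identifiers are chosen here (plain integers passed by the caller) is invented for the example.

```python
import csv
import io
import sqlite3

# Hypothetical, reduced CSV input; real log2timeline output has more columns.
CSV_TEXT = """\
date,time,source,host,short
10/10/2011,12:03:57,EXIF,apollo,photo.png
10/10/2011,12:03:57,LOG,apollo,/var/log/cron
"""

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("""
CREATE TABLE tln_import (
    tln_concurrency_id INTEGER NOT NULL,
    date TEXT, time TEXT, source TEXT, host TEXT, short TEXT
)
""")


def load_batch(conn, csv_text, batch_id):
    """Copy CSV rows into the staging table, tagging each with its batch."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (batch_id, r["date"], r["time"], r["source"], r["host"], r["short"])
        for r in reader
    ]
    conn.executemany("INSERT INTO tln_import VALUES (?, ?, ?, ?, ?, ?)", rows)
    return len(rows)


first = load_batch(conn, CSV_TEXT, batch_id=1)
second = load_batch(conn, CSV_TEXT, batch_id=2)

# Later ETL steps can restrict themselves to a single load batch.
batch_two = conn.execute(
    "SELECT COUNT(*) FROM tln_import WHERE tln_concurrency_id = ?", (2,)
).fetchone()[0]
print(first, second, batch_two)
```

Tagging each staged row lets two loads coexist in tln_import without their rows being confused during transformation.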

2.5. Tapestry Database Reference

2.5.1. The tln_date Dimension Table

The ‘date’ table consists of dates likely to exist in any computer system. For example, we can make a database row for each day between 1/1/1970 and 12/31/2070. We also add fields we can use for searching, sorting and displaying data, such as the month of the year (in numeric form: “2011-07”).

Field          Sample
tln_date_id    (no sample given)
date           2011-10-10 (UNIX time)
year           2011
month          2011-10
day            2011-10-10
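
Populating such a table is mechanical, as the sketch below suggests. It assumes SQLite via Python's sqlite3 module rather than MySQL, stores the `date` field as UNIX time at midnight UTC (an assumption about the representation), and uses hypothetical column types.

```python
import calendar
import sqlite3
from datetime import date, timedelta


def date_rows(start, end):
    """Yield one (date, year, month, day) row per calendar day, inclusive."""
    d = start
    while d <= end:
        yield (
            calendar.timegm(d.timetuple()),  # UNIX time at midnight UTC
            d.strftime("%Y"),                # e.g. "2011"
            d.strftime("%Y-%m"),             # e.g. "2011-10"
            d.isoformat(),                   # e.g. "2011-10-10"
        )
        d += timedelta(days=1)


conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("""
CREATE TABLE tln_date (
    tln_date_id INTEGER PRIMARY KEY,
    date INTEGER, year TEXT, month TEXT, day TEXT,
    UNIQUE (date, year, month, day)
)
""")
conn.executemany(
    "INSERT INTO tln_date (date, year, month, day) VALUES (?, ?, ?, ?)",
    date_rows(date(1970, 1, 1), date(2070, 12, 31)),
)

total = conn.execute("SELECT COUNT(*) FROM tln_date").fetchone()[0]
sample = conn.execute(
    "SELECT year, month, day FROM tln_date WHERE day = ?", ("2011-10-10",)
).fetchone()
print(total, sample)
```

Because the year and month are stored as sortable strings, grouping facts by month becomes a simple GROUP BY on the dimension rather than date arithmetic on every fact.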

2.5.2. The tln_time Dimension Table

The ‘time’ table consists of all times of day, for example from 00:00:00 to 23:59:59 (in 24-hour “military” time). An important design decision is the choice of a “grain” of analysis, or the precision that the database will support. Though some file systems are capable of measuring and logging time at sub-second granularity, granularity to the second is sufficient precision for this application. When finer precision is needed, the examiner can fall back to the forensic image.

Field          Sample
tln_time_id    (no sample given)
tick           12:03:57 (UNIX time)
hour           12
minute         12:03
second         12:03:57
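
The effect of choosing a one-second grain can be illustrated in a few lines. The sketch below is not taken from Tapestry: it assumes the `tick` field holds seconds since midnight, and the helper names `time_rows` and `to_grain` are hypothetical.

```python
def time_rows():
    """Yield one (tick, hour, minute, second) row per second of the day."""
    for tick in range(24 * 60 * 60):
        h, rem = divmod(tick, 3600)
        m, s = divmod(rem, 60)
        yield (tick, f"{h:02d}", f"{h:02d}:{m:02d}", f"{h:02d}:{m:02d}:{s:02d}")


def to_grain(timestamp):
    """Truncate a (possibly sub-second) UTC UNIX timestamp to its second of day."""
    return int(timestamp) % 86400


rows = list(time_rows())

# A timestamp with fractional seconds maps onto the row for its whole second.
key = to_grain(43437.75)
print(len(rows), rows[key])
```

At this grain the dimension holds exactly 86,400 rows regardless of case size, and any sub-second detail is dropped at load time rather than stored.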

2.5.3. The tln_source Dimension Table

Sources are log2timeline source types. Data in the “source,” “sourcetype,” “type,” “version,” and “format” fields depends on the log2timeline input plugin that extracted the entry. The host field is specified on the log2timeline or timescanner command line. The input plugin also defines each entry's MACB fields.

Field          Sample
tln_source_id  (no sample given)
source         EXIF
sourcetype     EXIF Metadata
type           PNG/ModifyDate
host           apollo
version        2
format         Log2t::input::exif
M              M
A              .
C              .
B              .

2.5.4. The tln_words Dimension Table

Another advantage of the star schema is that it can be easily augmented with other dimensions (the result is known as a “snowflake”). One such dimension is the words dimension. The words dimension is simply a list of words. It is joined to the table of facts by another table that models the many-to-many relationship between facts and words: “many words in any given fact” and “many facts that contain any given word.”

Field          Sample
tln_word_id    (no sample given)
word           vmlinuz
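
The many-to-many relationship can be sketched with a linking table. The paper does not name that table, so `tln_fact_word` below is a hypothetical name; the tokenizing rule, the reduced fact table, and the sample paths other than vmlinuz and /var/log/cron are also assumptions, and SQLite stands in for MySQL.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
CREATE TABLE tln_fact  (tln_fact_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE tln_words (tln_word_id INTEGER PRIMARY KEY, word TEXT UNIQUE);
-- Hypothetical linking table: one row per (fact, word) pair.
CREATE TABLE tln_fact_word (
    tln_fact_id INTEGER REFERENCES tln_fact (tln_fact_id),
    tln_word_id INTEGER REFERENCES tln_words (tln_word_id),
    PRIMARY KEY (tln_fact_id, tln_word_id)
);
""")


def index_fact(conn, description):
    """Store a fact and link it to each distinct word in its description."""
    fact_id = conn.execute(
        "INSERT INTO tln_fact (description) VALUES (?)", (description,)
    ).lastrowid
    for word in set(re.findall(r"\w+", description.lower())):
        conn.execute("INSERT OR IGNORE INTO tln_words (word) VALUES (?)", (word,))
        word_id = conn.execute(
            "SELECT tln_word_id FROM tln_words WHERE word = ?", (word,)
        ).fetchone()[0]
        conn.execute("INSERT INTO tln_fact_word VALUES (?, ?)", (fact_id, word_id))
    return fact_id


def facts_with(conn, word):
    """Return the ids of all facts whose description contains the word."""
    rows = conn.execute(
        """
        SELECT f.tln_fact_id
        FROM tln_fact f
        JOIN tln_fact_word fw ON fw.tln_fact_id = f.tln_fact_id
        JOIN tln_words w      ON w.tln_word_id  = fw.tln_word_id
        WHERE w.word = ?
        ORDER BY f.tln_fact_id
        """,
        (word,),
    )
    return [r[0] for r in rows]


a = index_fact(conn, "/boot/vmlinuz")
b = index_fact(conn, "/boot/vmlinuz.old")
c = index_fact(conn, "/var/log/cron")
print(facts_with(conn, "vmlinuz"), facts_with(conn, "cron"))
```

Each word is stored once in the words dimension, and keyword searches become indexed joins instead of substring scans over every fact.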

2.5.5. The tln_fact Fact Table

The fact table contains timeline details. It includes a count field, which counts

duplicate entries (e.g. log entries) that occurred at the same time.

Field          Sample
tln_fact_id    (no sample given)
tln_date_id    (no sample given)
tln_time_id    (no sample given)
tln_source_id  (no sample given)
count          2
user           root
short          /var/log/cron
description    /var/log/cron
filename       bodyfile
inode          12110
notes          -
extra          uid: 0 mode: r/rrw------- size: 31607 md5: 0 gid: 0
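
One way the count field could be filled is by grouping identical staged entries while loading. The sketch below shows that approach under stated assumptions: it is not necessarily how Tapestry computes the count, it uses SQLite in place of MySQL, and it keeps the date and time as plain text columns instead of dimension keys to stay short. The /var/log/messages row is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
CREATE TABLE tln_import (date TEXT, time TEXT, user TEXT, description TEXT);
CREATE TABLE tln_fact (
    tln_fact_id INTEGER PRIMARY KEY,
    date TEXT, time TEXT, user TEXT, description TEXT,
    "count" INTEGER NOT NULL
);
""")

staged = [
    ("2011-10-10", "12:03:57", "root", "/var/log/cron"),
    ("2011-10-10", "12:03:57", "root", "/var/log/cron"),      # exact duplicate
    ("2011-10-10", "12:03:57", "root", "/var/log/messages"),  # hypothetical
]
conn.executemany("INSERT INTO tln_import VALUES (?, ?, ?, ?)", staged)

# Collapse identical staged rows into one fact and record how many there were.
conn.execute("""
INSERT INTO tln_fact (date, time, user, description, "count")
SELECT date, time, user, description, COUNT(*)
FROM tln_import
GROUP BY date, time, user, description
""")

facts = conn.execute(
    'SELECT description, "count" FROM tln_fact ORDER BY description'
).fetchall()
print(facts)
```

Storing a count instead of repeated rows keeps the fact table smaller without losing the information that an entry occurred more than once at the same time.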

3. Conclusions

Tapestry is neither visually stunning nor feature-rich at this point, but it has a

basic set of features that facilitate basic timeline analysis. It performs well with large

data sets even within the RAM and CPU constraints of a SIFT Workstation virtual

machine.

If Tapestry does make it to a future SIFT workstation, it will hopefully inspire

discussion of the question in the mind of its creator throughout its development: “why

don’t forensics software systems that cost thousands do this?”

Tapestry also demonstrates the great value of the work that has gone into Log2timeline.

Hopefully, the chronological view Tapestry provides will also help with

understanding the complex interactions that take place day-to-day on computer systems,

so that forensic analysts like us will find the best answers to the questions that are of greatest

public interest.

4. References

Carrier, B. (2005). File System Forensic Analysis. Boston, MA: Addison-Wesley Professional.

Carvey, H. (2004). Windows Forensics and Incident Recovery. Boston, MA: Addison-Wesley Professional.

Gudjonsson, K. (2009, August 4). log2timeline. Retrieved October 25, 2011, from log2timeline: http://log2timeline.net/

Guðjónsson, K. (2010, June 29). Mastering the Super Timeline With Log2timeline. Retrieved October 25, 2011, from SANS Reading Room: http://www.sans.org/reading_room/whitepapers/logging/mastering-super-timeline-log2timeline_33438

Jones, K. J. (2011, June 14). Casey Anthony Murder Trial: The Computer Evidence (part 2). Retrieved October 25, 2011, from JDA: http://www.jonesdykstra.com/blog/201-caseyanthony-part2

Kimball, R., & Ross, M. (2002). The Data Warehouse Toolkit. New York: Wiley.

Lloyd, J., Healy, P., & Klemack, J. C. (2011, October 4). Phone Calls from Michael Jackson Doctor Take Spotlight. Retrieved October 25, 2011, from NBC Los Angeles: http://www.nbclosangeles.com/news/local/Conrad-Murray-Trial-Michael-Jackson-Day-6-131046673.html

Massachusetts Institute of Technology. (2010, November). Reference Documentation for Timeline. Retrieved October 25, 2011, from SIMILE Widgets: http://www.simile-widgets.org/wiki/Timeline

Merriam-Webster, Inc. (n.d.). Forensics. Retrieved October 25, 2011, from Definition and More from the Free Merriam-Webster Dictionary: http://www.merriam-webster.com/dictionary/forensics

Olsson, J., & Boldt, M. (2009). Computer Forensic Timeline Visualization Tool. Digital Investigation 6 (pp. s78-s87). Montreal: Elsevier.

Strunk, W. J. (2000). The Elements of Style. Needham Heights, MA: Longman.

TBO.com. (2011, July 19). Anthony 'chloroform' searches overstated, software designer says. Retrieved October 25, 2011, from TBO.com: http://www2.tbo.com/news/news/2011/jul/19/anthony-chloroform-searches-overstated-software-de-ar-244913/
