Forensics Week 2

Hard disk concepts

We'll begin by spending a little time on the language used to describe the layout of a traditional hard disk. This is mostly of historical interest, but it will help us understand the output of fdisk(1). The important concepts are:

Platters: a hard disk consists of one or more platters, stacked on top of each other. In this respect, a hard disk is something like a stack of dinner plates.

Tracks: each platter, on both its top and bottom surfaces, holds some number of tracks, which are circular rings of varying circumference, depending on how close to the edge of the platter a particular track is.

Cylinders: tracks are usually numbered from 0. For example, if a platter has 5001 tracks on each side, they would be numbered 0-5000. Because all platters on a hard disk have an identical number of tracks on each side, we use the word 'cylinder' to refer to all the tracks with a particular number, on all sides of all platters. Thus, all tracks numbered 0 constitute a single cylinder.

Sectors: each track is divided up into a number of sectors. Sectors are normally 512-byte contiguous sections of the track.

Heads: data is read from and written to a hard disk with something called a head. A head is a bit like the needle in an old record player: you place the needle on the vinyl LP, and it reads the data.

But in the case of a hard disk, there is a 'head' for each side of each platter. (This is not strictly true: some drives report 255 heads, so obviously one side of one platter isn't being read.)

The word 'head' is usually used as the unit to count the number of 'sides' of platters.
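As a quick worked example (using bc(1) and the geometry that fdisk(1) reports for the disk below: 255 heads, 63 sectors/track, 240614 cylinders), the nominal CHS capacity is heads x sectors/track x cylinders x 512 bytes:

$ bc -q

255 * 63 * 240614 * 512

1979117521920

That figure is slightly less than the 1979120025600 bytes fdisk(1) reports, because the cylinder count is rounded down to a whole number; CHS geometry is only an approximation of the real capacity.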

Okay, that's enough to understand a lot of the output of fdisk(1):

# fdisk -lu /dev/sda

Disk /dev/sda: 1979.1 GB, 1979120025600 bytes

255 heads, 63 sectors/track, 240614 cylinders, total 3865468800 sectors

Units = sectors of 1 * 512 = 512 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0xba3c92b4

Device Boot Start End Blocks Id System

/dev/sda1 63 80324 40131 de Dell Utility

/dev/sda2 * 81920 4276223 2097152 c W95 FAT32 (LBA)

/dev/sda3 4276224 209076223 102400000 7 HPFS/NTFS/exFAT

What does this tell us about the structure of this hard disk? Well:

$ bc -q

scale=5

1979120025600 # bytes

1979120025600

1979120025600/512 # sectors

3865468800.00000

1979120025600/512/63 # all tracks on all heads of all platters

61356647.61904

1979120025600/512/63/255 # tracks per head, i.e., cylinders

240614.30438

Acquiring an image of a hard disk (but more an intro to dd(1))

To acquire an image of a hard disk, we normally boot from a trusted CDROM (which does not mount the hard disk), and use dd(1) and nc(1) or cryptcat(1) to copy the disk over the network to a forensic workstation. The most common command sequence used for this purpose is the following:

# dd if=/dev/<device> conv=noerror,sync | nc <forensic workstation> <port>

((And of course, we'd have a listening nc(1) server on our forensic workstation.))
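As a minimal sketch of both ends of that pipeline (the address 192.168.1.10, port 9999, the device /dev/sda, and the file name disk.dd are all placeholders, not values from the examples below):

On the forensic workstation, start the listener first:

$ nc -lp 9999 > disk.dd

Then, on the suspect machine booted from the trusted CDROM:

# dd if=/dev/sda conv=noerror,sync | nc -w 2 192.168.1.10 9999

The -w 2 makes nc(1) close the connection a couple of seconds after dd(1) finishes, as in the worked examples later on.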

Let's take apart our dd(1) command line a bit:

Without any options, dd(1) works pretty much like cat(1):

$ dd

asdf

asdf

0+1 records in

0+1 records out

5 bytes (5 B) copied, 1.99518 seconds, 0.0 kB/s

If we add the if= option, then it reads from the file specified:

$ cat > dd.in

asdf

^D

$ dd if=dd.in

asdf

0+1 records in

0+1 records out

5 bytes (5 B) copied, 4.5e-05 seconds, 111 kB/s

If we add the of= option, then it writes to the file specified:

$ dd if=dd.in of=dd.out

0+1 records in

0+1 records out

5 bytes (5 B) copied, 0.000126 seconds, 39.7 kB/s

$ cat dd.out

asdf

Normally, when dd(1) encounters a read error, it exits. But during a forensic duplication, we don't want a read error (such as a bad section of the disk) to completely undermine our duplication. So we can specify conv=noerror. When conv=noerror is specified, dd(1) skips over the blocks of bad data and continues reading.

But the effect of skipping over bad blocks is that, in the event that there are bad blocks, the disk image will be smaller than the actual hard disk. This can wreak havoc when we're trying to cut a disk image into partition images, since the partition table (as we'll see later) is based on the actual disk geometry.
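For reference, cutting a partition image out of a whole-disk image is done with dd(1)'s skip= and count= options, using the sector offsets from the partition table. A sketch based on the fdisk(1) output above, where /dev/sda1 starts at sector 63 and is 80324 - 63 + 1 = 80262 sectors long (the whole-disk image name sda.dd is hypothetical):

$ dd if=sda.dd of=sda1.dd bs=512 skip=63 count=80262

If bad blocks had simply been dropped from the image, these sector offsets would no longer line up with the data.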

To avoid this, we specify conv=sync, which instructs dd(1) to write a block of 0s to the image in place of each block it could not read from the input file.

There is one other option that is often used: bs=.

By default, dd(1) reads data from a file in 512 byte chunks, which it calls blocks. That is to say, it performs a series of read(2) system calls, each time asking for the next 512 bytes from the file being read from. And for each read call, it write(2)s that same (amount of) data to the output file.

For example:

$ dd if=/dev/zero of=example.dd count=1

1+0 records in

1+0 records out

512 bytes (512 B) copied, 0.000105 seconds, 4.9 MB/s

$ ls -l example.dd

-rw-r--r-- 1 dale users 512 Apr 22 22:35 example.dd

((Because the default block size is 512 bytes, and because we indicated with count=1 that we wanted to read one block, this invocation of dd(1) created a file 512 bytes in size.))

Likewise, if we now read from this file with the default block size, it'll perform one read and one write, indicated by the '1+0 records in/out':

$ dd if=example.dd of=/dev/null

1+0 records in

1+0 records out

512 bytes (512 B) copied, 3.9e-05 seconds, 13.1 MB/s

On the other hand, if we specify a different block size, say one smaller than the default, there will be more reads and writes:

$ dd if=example.dd of=/dev/null bs=1

512+0 records in

512+0 records out

512 bytes (512 B) copied, 0.000778 seconds, 658 kB/s

Or:

$ dd if=example.dd of=/dev/null bs=256

2+0 records in

2+0 records out

512 bytes (512 B) copied, 3.7e-05 seconds, 13.8 MB/s

Hard disks can usually read and write chunks much larger than the default, so it is more efficient to use a larger block size. The right block size depends on the disk, and specifying a larger block size becomes more and more important as disks get bigger.
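As a rough sketch of the difference (the device /dev/sda and the 32 MB sample size are assumptions, and the exact speed-up depends on the drive and the kernel's readahead):

# dd if=/dev/sda of=/dev/null bs=512 count=65536

# dd if=/dev/sda of=/dev/null bs=64K count=512

Both commands read the same 32 MB, but the first does it in 65536 reads of 512 bytes and the second in 512 reads of 64 KB; the second will typically report a noticeably higher transfer rate.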

Although bs= is mostly about efficiency, there is one case where the block size can actually alter the output file:

When you specify the conv=noerror,sync, then, as noted above, dd(1) writes zeros in place of the data read errors.

But it (sync in particular) does something else too: it forces dd(1) to write full blocksize chunks, even if the corresponding read doesn't return a full block. If the hard disk size is not a multiple of the block size, then the last block written will be padded, and the image will be larger than the hard disk being acquired.

This can be seen with a simple example:

$ ls -l example.dd

-rw-r--r-- 1 dale users 512 Apr 22 22:35 example.dd

$ dd if=example.dd of=out.dd bs=500 conv=noerror,sync

1+1 records in

2+0 records out

1000 bytes (1.0 kB) copied, 0.000141 seconds, 7.1 MB/s

$ ls -l example.dd out.dd

-rw-r--r-- 1 dale users 512 Apr 22 22:35 example.dd

-rw-r--r-- 1 dale users 1000 Apr 22 22:45 out.dd

Notice that dd(1) indicated 1+1 records in, and 2+0 records out. This means that it read one full block and one partial block, and wrote two full blocks. That explains why the two files differ in size.

If we do not specify sync, then everything is okay:

$ dd if=example.dd of=out.dd bs=500

1+1 records in

1+1 records out

512 bytes (512 B) copied, 0.000121 seconds, 4.2 MB/s

$ ls -l example.dd out.dd

-rw-r--r-- 1 dale users 512 Apr 22 22:35 example.dd

-rw-r--r-- 1 dale users 512 Apr 22 22:50 out.dd

The moral of the story is: make sure your hard disk's size is a multiple of your block size, or just leave bs= out altogether, since hard disk sizes are always a multiple of 512. (But for large disks, the default block size can make the acquisition much slower.)
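One way to check a candidate block size before acquiring (a sketch; blockdev(8) and bc(1) are assumed to be installed, and the figures below are for the /dev/sda shown earlier):

# blockdev --getsize64 /dev/sda

1979120025600

$ echo '1979120025600 % 65536' | bc

0

$ echo '1979120025600 % 1048576' | bc

983040

So for this particular disk a 64 KB block size divides the disk evenly and is safe to use with conv=noerror,sync, while a 1 MB block size is not.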

Let's go through a quick example of an acquisition. I'll acquire /dev/sda1 (a small partition; we'll learn about partitions soon) because /dev/sda is too large. The procedure is otherwise identical. [We used /home/images/able2.dd during class, but the process is the same.]

Example 1:

$ nc -lp 9998 >sda1.dd &

[1] 8850

$ sudo su

# fdisk -lu /dev/sda | grep sda1

/dev/sda1 63 80324 40131 de Dell Utility

# dd if=/dev/sda1 conv=noerror,sync bs=2048 | nc -w 2 localhost 9998

20065+1 records in

20066+0 records out

41095168 bytes (41 MB) copied, 0.191781 s, 214 MB/s

# exit

[1]+ Done nc -lp 9998 > sda1.dd

$ ls -l sda1.dd

-rw-rw-r-- 1 dale dale 41095168 Jan 15 12:08 sda1.dd

$ md5sum sda1.dd

82b3bcd63d2ca56f4f23a4b1371ec759 sda1.dd

$ sudo md5sum /dev/sda1

8176cc7595c40f1a754c3f6522b30841 /dev/sda1

Is this right? No, the checksums don't match. What happened? Answer: we specified a block size (2048) that does not evenly divide the size of the partition, so conv=sync padded the final partial block and made the image larger than the partition.
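The arithmetic (with bc(1)) confirms it: the partition is 40131 blocks of 1024 bytes, i.e. 41094144 bytes, which is not a multiple of 2048, and the 20066 full 2048-byte blocks dd(1) wrote come to exactly the 41095168 bytes we received:

$ echo '41094144 % 2048' | bc

1024

$ echo '20066 * 2048' | bc

41095168

Let's try again: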

Example 2

$ nc -lp 9998 > sda1.dd &

[1] 8878

$ sudo su

# dd if=/dev/sda1 conv=noerror,sync bs=1024 | nc -w 2 localhost 9998

40131+0 records in

40131+0 records out

41094144 bytes (41 MB) copied, 0.165376 s, 248 MB/s

# exit

[1]+ Done nc -lp 9998 > sda1.dd

$ md5sum ./sda1.dd

8176cc7595c40f1a754c3f6522b30841 ./sda1.dd

The above describes the use of dd(1) to acquire a forensic duplication of a hard disk.

Using dcfldd(1)

There are a number of enhanced versions of dd(1) tailored to forensic acquisition. One of them is dcfldd(1). It takes all the same options as dd(1), plus a few more. The important enhancement that we'll discuss is simultaneous cryptographic hashing.

With dd(1), if we want to calculate the md5sum on a hard disk, we have to do this:

$ md5sum /home/images/able2.dd

02b2d6fc742895fa4af9fa566240b880 /home/images/able2.dd

The implication of this is that we need to read the entire disk twice: once to calculate the checksum, and once to acquire an image. dcfldd(1) can be instructed to do them simultaneously:

$ dcfldd if=/home/images/able2.dd of=dcfldd.out hash=md5 bs=512

675328 blocks (329Mb) written.Total (md5): 02b2d6fc742895fa4af9fa566240b880

675450+0 records in

675450+0 records out

$ md5sum dcfldd.out

02b2d6fc742895fa4af9fa566240b880 dcfldd.out

((dcfldd(1) calculates the checksum on the input file, which is what we want))
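In a network acquisition, dcfldd(1) can simply take the place of dd(1) in the earlier pipeline. A sketch (the host, port, and log file name are placeholders; hashlog= writes the hash to a file instead of stderr, and is present in common dcfldd builds, but check your version):

# dcfldd if=/dev/sda1 conv=noerror,sync bs=1024 hash=md5 hashlog=sda1.md5 | nc -w 2 192.168.1.10 9998

Like dd(1), dcfldd(1) writes the image to standard output when no of= is given, so the hash is calculated during the same pass that sends the data to the forensic workstation.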

Another advantage of dcfldd(1) is that it can be instructed to checksum the data in fixed-size windows, i.e., it is not restricted to a single checksum over the whole input file:

$ dcfldd if=/home/images/able2.dd of=dcfldd.out hash=md5 bs=1024 hashwindow=10485760

9984 blocks (9Mb) written.0 - 10485760: ae29655c569257d88b57f7942ba5c08a

20224 blocks (19Mb) written.10485760 - 20971520: 9ed40206661bf69281d8a3a197881639

30464 blocks (29Mb) written.20971520 - 31457280: cfb74baae68a6618c6261d62ef99da39

40704 blocks (39Mb) written.31457280 - 41943040: 1724856647e5bcf37dc5e83908b570e2

50944 blocks (49Mb) written.41943040 - 52428800: 8b486d7d2b9a7375a99b326133a9f657

61184 blocks (59Mb) written.52428800 - 62914560: cf38cbd52294e307273f6b36552e502c

71424 blocks (69Mb) written.62914560 - 73400320: 4997a54998df9115ad1c5537c57107be

81664 blocks (79Mb) written.73400320 - 83886080: 1801cad43fab9b35028df2f81cd5e27d

91904 blocks (89Mb) written.83886080 - 94371840: a453f429fb6461ab428cf61a84f4b464

102144 blocks (99Mb) written.94371840 - 104857600: 948d90c60d55bc8d3f2c07041d7cc754

112384 blocks (109Mb) written.104857600 - 115343360: ab633030878ac23f0b2acd9a45533926

122624 blocks (119Mb) written.115343360 - 125829120: 569e2df657b96f46076ba87fc5d83c09

132864 blocks (129Mb) written.125829120 - 136314880: 657d90e458a86cb67d3ccc878504027a

143104 blocks (139Mb) written.136314880 - 146800640: 338358f9503b8a4bd78fd0e31bcf2054

153344 blocks (149Mb) written.146800640 - 157286400: b7e1fdec69a19fc3784bf2f1bae0b785

163584 blocks (159Mb) written.157286400 - 167772160: 3e379ad423952134e39703c5cac6ffc2

173824 blocks (169Mb) written.167772160 - 178257920: 150f9b95784a909c2f6598f2a8de171f

184064 blocks (179Mb) written.178257920 - 188743680: 9551fd860eb3f3ac01290b97d9c50758

194304 blocks (189Mb) written.188743680 - 199229440: 02bb50013e94555a32e84398446019e1

204544 blocks (199Mb) written.199229440 - 209715200: 44c2428d11c027b7a6ca113afe890023

214784 blocks (209Mb) written.209715200 - 220200960: 05beb729ef2b832dc595b97f26c16011

225024 blocks (219Mb) written.220200960 - 230686720: 6524ee43cc8568a2e734550296c729ed

235264 blocks (229Mb) written.230686720 - 241172480: 924c16f533cf9bd1de44850678c9add4

245504 blocks (239Mb) written.241172480 - 251658240: da83f35718cea396aad641a4fee6321a

255744 blocks (249Mb) written.251658240 - 262144000: 36154bc6670538da36076cdf3d991f0e

265984 blocks (259Mb) written.262144000 - 272629760: 942ee9766d3647751d90a110fb1949fd

276224 blocks (269Mb) written.272629760 - 283115520: 3681c4377dba012b2d0ea4de6138ece9

286464 blocks (279Mb) written.283115520 - 293601280: 37d83f3505b21bdaa1c61e574772eafb

296704 blocks (289Mb) written.293601280 - 304087040: 4f8041869d8e1c19b810422cf4a2f1f8

306944 blocks (299Mb) written.304087040 - 314572800: 995013fb8615aa6d1db1f11c4a155ce0

317184 blocks (309Mb) written.314572800 - 325058560: 9e76e333f18effbc726355d37f466b19

327424 blocks (319Mb) written.325058560 - 335544320: a51175f259c6a6e99be41c3c6c15fc66

337664 blocks (329Mb) written.335544320 - 345830400: 8ecde1a10326bfc1721eff9cb7716cf7

Total (md5): 02b2d6fc742895fa4af9fa566240b880

337725+0 records in

337725+0 records out

The advantage of this is that if, for whatever reason, the checksums of the original evidence change (say, because portions of the disk were damaged), we can still demonstrate the integrity of portions of our image file: we compare each window of the image with the corresponding portion of the original, and for those windows whose checksums match, we can trust that our duplication is identical to the original. This is, of course, impossible with a single checksum.
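As a sketch of how a single window can be re-verified later (the offsets come from the hashwindow run above; dcfldd.out is the image written by that command): the second window covers bytes 10485760 - 20971520, which at bs=1024 is 10240 blocks starting at block 10240, so

$ dd if=dcfldd.out bs=1024 skip=10240 count=10240 | md5sum

should print 9ed40206661bf69281d8a3a197881639, the hash recorded for that window.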