wesley-collins
CCT 8501 Distributed Processing I
CPCC CCT Program
Distributed Processing I
Block 1
Wrapping Your Nugget Around Distributed Processing

CPCC CCT
Introductions
Course paperwork - payment
Plan for this module
Credit by exam for CCT 242

What is Distributed Processing?
According to AccessData's Configuration Guide
You need to get this guide and read it:
http://accessdata.com/downloads/media/Configuring%20Distributed%20Processing%20with%20FTK%203.pdf
Distributed processing is a functionality that exists within the FTK 3 application that allows users to create a distributed processing cluster with up to four total nodes (workers): 1 local and 3 distributed. These additional processing nodes (workers) will function together in a cluster to increase productivity and decrease overall processing time.

Why do we need it?
Drives are getting huge
1-2 TB drives are becoming common
3 TB drives available
Media-driven computing society
Artifact counts are very high
Over 1 million artifacts is very common
Processing time is increasing rapidly

Some more reasons why
To counter the processing time increase, we spend more on stand-alone systems
That still take too long
That still become unstable when resource limits are reached.
Keeps gear tied up

How Distributed Helps
Simple as Little Gun versus Big Gun

How Distributed Helps
Faster
More stable
More efficient
Allows a hardware migration cycle

Real World Results
We have a small case (Sluix)
FRED = 4 hours
Barney = 94 minutes
DP = 44 minutes

Real World Results
We have a medium-size case (Testforsafeboot.eo1)
FRED = 20 hours
Barney = 7 hours 53 minutes
DP = 1 hour 13 minutes

Real World Results
We have a large case (HTI-001, Test.001)
FRED = CRASH at 75-100 hours (repeatable)
Barney = 28 hours
DP = 3 hours
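
The three result slides above can be turned into speedup factors with a little arithmetic. The following is only a sketch: the times are copied from the slides and converted to minutes, while the names `RESULTS`, `speedup`, and `speedup_table` are made up for this illustration. FRED never finished the large case, so it is recorded as `None` rather than given a guessed time.

```python
# Wall-clock times in minutes, copied from the "Real World Results" slides.
# FRED crashed on the large case, so it has no completion time (None).
RESULTS = {
    "small (Sluix)": {"FRED": 4 * 60, "Barney": 94, "DP": 44},
    "medium (Testforsafeboot.eo1)": {"FRED": 20 * 60, "Barney": 7 * 60 + 53, "DP": 60 + 13},
    "large (HTI-001, Test.001)": {"FRED": None, "Barney": 28 * 60, "DP": 3 * 60},
}


def speedup(baseline_minutes, dp_minutes):
    """Return how many times faster DP was, or None if the baseline never finished."""
    if baseline_minutes is None:
        return None
    return baseline_minutes / dp_minutes


def speedup_table(results):
    """Compute the DP speedup over every non-DP system for every case."""
    table = {}
    for case, times in results.items():
        table[case] = {
            system: speedup(minutes, times["DP"])
            for system, minutes in times.items()
            if system != "DP"
        }
    return table


if __name__ == "__main__":
    for case, row in speedup_table(RESULTS).items():
        for system, factor in row.items():
            if factor is None:
                print(f"{case}: {system} did not complete")
            else:
                print(f"{case}: DP is {factor:.1f}x faster than {system}")
```

On these numbers, DP comes out at roughly 5.5x faster than FRED on the small case, about 16x faster than FRED on the medium case, and about 9x faster than Barney on the large case.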

Terms and Definitions
There is a lot to know before you do this
Terminology
Requirements
Technical skills
We'll go over each

Terms
Header (Examiner machine)
Workers (Helper machines)
Oracle (where the database is stored)
Could be on the examiner machine or separate
Evidence share (also called Imaging Server at CPCC)
Case share (could be implemented numerous ways)
We'll discuss all in detail later

What it looks like
A Little Better

Terminology - How DP works
How does distributed processing work?
Evidence processing tasks assigned to the engine by the user are called Jobs.
The FTK application submits the job to the processing engine.
Each job is divided into small packets called Work Units.
Each work unit is handed to a service called ADProcessor.exe (and ADIndexer, if you've chosen to index), which actually does the work.

Terminology - How DP works
There are two components in distributed processing:
1. Processing Manager: The Processing Manager embedded in FTK manages Jobs and Work distribution. It also handles status updates and Job progress.
2. Processing Engine: The Processing Engine manages the processing resources of a particular computer/node.
Every machine that participates in a processing cluster runs a Processing Engine.
It decides how many jobs can be concurrently processed by that node.
The Processing Engine also manages the ADProcessor.exe/ADIndexer.exe processes that actually do the processing work.

Examiner Hardware Requirements
Based on the CPCC recommended ring (more later)
Examiner Machine (should be your most powerful machine)
See http://accessdata.com/downloads/media/FTK_3x_System_Specifications_Guide.pdf
CPCC recommends
8-core or better processor, 12 GB RAM, 64-bit Win 7, 80 GB or more SSD for OS
1 Gbps NIC

Oracle Hardware Requirements
Based on the CPCC recommended ring (more later)
Oracle Machine (should be a powerful machine)
See http://accessdata.com/downloads/media/FTK_3x_System_Specifications_Guide.pdf
CPCC recommends
8-core or better processor, 12 GB RAM, 64-bit Win 7, 80 GB or more SSD for OS
160 GB SSD or RAID 0 config with spinning drives (minimum of four 7200 RPM Vraptors); NO RAID 5
1 Gbps NIC

Workers
Based on the CPCC recommended ring (more later)
Worker Machine (can be a little less powerful; if you don't have good ones, add what you have as long as it meets the minimums)
See http://accessdata.com/downloads/media/FTK_3x_System_Specifications_Guide.pdf
CPCC recommends
8-core or better processor, 8 GB RAM, 64-bit Win 7, Vraptor or better drive
1 Gbps NIC
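
One way to keep the three hardware slides straight is to write the CPCC recommendations down as data and compare a candidate machine against them. This is a minimal sketch, not a tool that ships with FTK: `MachineSpec`, `CPCC_RECOMMENDED`, and `shortfalls` are hypothetical names, and only core count, RAM, 64-bit OS, and NIC speed are modeled (the storage recommendations are left out for brevity). These are the CPCC recommendations from the slides, not the minimums in the AccessData specifications guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineSpec:
    cores: int
    ram_gb: int
    is_64_bit: bool
    nic_gbps: float


# CPCC recommendations as listed on the Examiner, Oracle, and Workers slides.
CPCC_RECOMMENDED = {
    "examiner": MachineSpec(cores=8, ram_gb=12, is_64_bit=True, nic_gbps=1.0),
    "oracle": MachineSpec(cores=8, ram_gb=12, is_64_bit=True, nic_gbps=1.0),
    "worker": MachineSpec(cores=8, ram_gb=8, is_64_bit=True, nic_gbps=1.0),
}


def shortfalls(role, machine):
    """List every way `machine` falls below the CPCC recommendation for `role`."""
    target = CPCC_RECOMMENDED[role]
    problems = []
    if machine.cores < target.cores:
        problems.append(f"cores: {machine.cores} < {target.cores}")
    if machine.ram_gb < target.ram_gb:
        problems.append(f"RAM: {machine.ram_gb} GB < {target.ram_gb} GB")
    if target.is_64_bit and not machine.is_64_bit:
        problems.append("OS: 64-bit required")
    if machine.nic_gbps < target.nic_gbps:
        problems.append(f"NIC: {machine.nic_gbps} Gbps < {target.nic_gbps} Gbps")
    return problems


if __name__ == "__main__":
    # Hypothetical older box: quad core, 8 GB RAM, 64-bit OS, 100 Mb NIC.
    old_box = MachineSpec(cores=4, ram_gb=8, is_64_bit=True, nic_gbps=0.1)
    for role in CPCC_RECOMMENDED:
        print(role, shortfalls(role, old_box) or "meets recommendation")
```

An empty list means the machine meets the recommendation for that role; a machine that falls short may still be usable as a worker if it meets the vendor minimums.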

Other requirements
.NET 3.5 Service Pack 1 (on the Application ISO; if the machine is connected to the Internet, the installer will attempt to download it)
Windows 2008 R2 requires that you manually install .NET 3.5 SP1 using the "Roles and Features" tool.
AccessData Processing Engine installation executable
The Evidence Processing Engine (Regular FTK) IS NOT TO BE INSTALLED in distributed mode on the FTK examiner machine.

Additional Considerations
AccessData says
The machines that store the evidence and case folder become a bottleneck.
Processing evidence is very disk IO intensive. As a result evidence should be stored on fast drives. With many large machines in a processing group, it is possible that the file sharing service in Windows will run out of kernel memory and fail to provide the evidence data across the network.
If you use the CPCC ring, you will be fine (more later)

Additional Considerations
AccessData says
The machine that runs the Processing Manager may become a bottleneck during the discovery phase. Discovery is the process of enumerating all the actual files in a piece of evidence. Information about these files is stored in the database and the Distributed Processing Engines work on processing them. This discovery phase always runs in the Processing Engine located on the same machine as the Processing Manager. Since it produces much of the work that other Processing Engines work on, it needs to be one of the fastest machines (CPU speed) in the processing group.
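
To tie the discovery phase to the Job and Work Unit terminology from earlier, here is a toy model of the flow: discovery enumerates the files on the manager's machine, the job is cut into work units, and the units are handed to engines up to each node's concurrency limit. Everything here is an assumption made for illustration (`ProcessingEngine`, `discover`, `split_into_work_units`, `run_job`, the fixed unit size, and the round-based scheduling); it does not reflect FTK's actual internals or API. The only limits taken from the slides are the four-node maximum and the idea that each engine decides how much it handles concurrently.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_NODES = 4  # per the configuration guide: 1 local + 3 distributed


@dataclass
class ProcessingEngine:
    name: str
    max_concurrent: int  # work units this node accepts per scheduling round
    completed: list = field(default_factory=list)


def discover(evidence_files):
    """Discovery: enumerate every file in the evidence (runs on the manager's machine)."""
    return list(evidence_files)


def split_into_work_units(files, unit_size):
    """Cut the discovered file list into small packets (work units)."""
    return [files[i:i + unit_size] for i in range(0, len(files), unit_size)]


def run_job(evidence_files, engines, unit_size=2):
    """Hand work units to engines round by round; return the number of rounds."""
    if not 1 <= len(engines) <= MAX_NODES:
        raise ValueError("a cluster has between 1 and 4 nodes")
    pending = deque(split_into_work_units(discover(evidence_files), unit_size))
    rounds = 0
    while pending:
        rounds += 1
        for engine in engines:
            for _ in range(engine.max_concurrent):
                if not pending:
                    break
                engine.completed.append(pending.popleft())
    return rounds


if __name__ == "__main__":
    files = [f"file{i}" for i in range(10)]
    solo = [ProcessingEngine("local", 2)]
    cluster = [ProcessingEngine("local", 2), ProcessingEngine("worker1", 1)]
    print("stand-alone rounds:", run_job(files, solo))
    print("two-node rounds:", run_job(files, cluster))
```

In this model, discovery is a single sequential step that must finish before any engine has work, which is why the slide stresses that the manager's machine should be one of the fastest in the group.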
If you use the CPCC ring, you will be fine (more later)

Additional Considerations
AccessData says
Distributed processing produces a lot of network traffic. There is control traffic between the engine components, but primarily the network is used to read evidence and write results to the case folder and database. It is very easy to saturate a gigabit network for extended periods of time while processing a large image. Please use the fastest network technologies available to you, at a minimum 100 Mb switched.
NO: use 1 Gbps only!
We strongly recommend that the Case folder and image location are on separate drives, AND on a separate machine from the examiner (more later)
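
A quick back-of-the-envelope calculation shows why 100 Mb is not good enough. The sketch below estimates how long it takes just to move an image across the wire once. It assumes decimal units (1 TB = 10^12 bytes), a link running at its full rated speed unless an `efficiency` factor is given, and no disk or protocol overhead; the function name `transfer_hours` and its parameters are hypothetical.

```python
def transfer_hours(image_size_tb, link_mbps, efficiency=1.0):
    """Hours needed to move an image of `image_size_tb` terabytes over a link.

    `link_mbps` is the rated link speed in megabits per second.
    `efficiency` is the assumed usable fraction of that speed (0 < efficiency <= 1).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = image_size_tb * 1e12 * 8  # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for label, mbps in (("100 Mb", 100), ("1 Gb", 1000)):
        print(f"1 TB over {label}: {transfer_hours(1, mbps):.1f} hours")
```

Under these idealized assumptions, a 1 TB image needs about 22 hours to cross a 100 Mb link once, against a little over 2 hours at 1 Gbps, and real throughput will be lower than either figure.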
If you use the CPCC ring, you will be fine (more later)

Lab Considerations
For maximum throughput we will disable a lot of security stuff
Firewalls, A/V
Permissions and shares will be VERY open
THE LAB must be isolated from the Internet and any corporate LANs
VLAN separation may be OK depending on details