Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
4-1©2010 Raj Jain www.rajjain.com
Types of Types of WorkloadsWorkloads
4-2©2010 Raj Jain www.rajjain.com
OverviewOverview
TerminologyTest Workloads for Computer Systems
Addition InstructionInstruction MixesKernelsSynthetic ProgramsApplication Benchmarks: Sieve, Ackermann's Function, Debit-Credit, SPEC
4-3©2010 Raj Jain www.rajjain.com
Part II: Measurement Techniques and ToolsPart II: Measurement Techniques and Tools
Measurements are not to provide numbers but insight- Ingrid Bucher
1. What are the different types of workloads?2. Which workloads are commonly used by other analysts?3. How are the appropriate workload types selected?4. How is the measured workload data summarized?5. How is the system performance monitored?6. How can the desired workload be placed on the system in a
controlled manner?7. How are the results of the evaluation presented?
4-4©2010 Raj Jain www.rajjain.com
TerminologyTerminologyTest workload: Any workload used in performance studies.Test workload can be real or synthetic.Real workload: Observed on a system being used for normal operations.Synthetic workload:
Similar to real workloadCan be applied repeatedly in a controlled mannerNo large real-world data filesNo sensitive dataEasily modified without affecting operationEasily ported to different systems due to its small sizeMay have built-in measurement capabilities.
4-5©2010 Raj Jain www.rajjain.com
Test Workloads for Computer SystemsTest Workloads for Computer Systems
1. Addition Instruction2. Instruction Mixes3. Kernels4. Synthetic Programs5. Application Benchmarks
4-6©2010 Raj Jain www.rajjain.com
Addition InstructionAddition Instruction
Processors were the most expensive and most used components of the systemAddition was the most frequent instruction
4-7©2010 Raj Jain www.rajjain.com
Instruction MixesInstruction MixesInstruction mix = instructions + usage frequencyGibson mix: Developed by Jack C. Gibson in 1959 for IBM 704 systems.
4-8©2010 Raj Jain www.rajjain.com
Instruction Mixes (Cont)Instruction Mixes (Cont)Disadvantages:
Complex classes of instructions not reflected in the mixes.Instruction time varies with:
Addressing modesCache hit ratesPipeline efficiencyInterference from other devices during processor-memory access cyclesParameter valuesFrequency of zeros as a parameterThe distribution of zero digits in a multiplierThe average number of positions of preshift in floating-point addNumber of times a conditional branch is taken
4-9©2010 Raj Jain www.rajjain.com
Instruction Mixes (Cont)Instruction Mixes (Cont)
Performance Metrics:MIPS = Millions of Instructions Per SecondMFLOPS = Millions of Floating Point Operations Per Second
4-10©2010 Raj Jain www.rajjain.com
KernelsKernels
Kernel = nucleusKernel= the most frequent functionCommonly used kernels: Sieve, Puzzle, Tree Searching, Ackerman's Function, Matrix Inversion, and Sorting.Disadvantages: Do not make use of I/O devices
4-11©2010 Raj Jain www.rajjain.com
Synthetic ProgramsSynthetic Programs
To measure I/O performance lead analysts ⇒ Exerciser loopsThe first exerciser loop was by Buchholz (1969) who
called it a synthetic program.A Sample Exerciser: See program listing Figure 4.1 in the book
4-12©2010 Raj Jain www.rajjain.com
Synthetic ProgramsSynthetic ProgramsAdvantage:
Quickly developed and given to different vendors.No real data filesEasily modified and ported to different systems.Have built-in measurement capabilitiesMeasurement process is automatedRepeated easily on successive versions of the operating systems
Disadvantages:Too smallDo not make representative memory or disk referencesMechanisms for page faults and disk cache may not be adequately exercised.CPU-I/O overlap may not be representative.Loops may create synchronizations ⇒ better or worse performance.
4-13©2010 Raj Jain www.rajjain.com
Application BenchmarksApplication Benchmarks
For a particular industry: Debit-Credit for BanksBenchmark = workload (Except instruction mixes)Some Authors: Benchmark = set of programs taken from real workloadsPopular Benchmarks
4-14©2010 Raj Jain www.rajjain.com
SieveSieveBased on Eratosthenes' sieve algorithm: find all prime numbers below a given number n.Algorithm:
Write down all integers from 1 to nStrike out all multiples of k, for k=2, 3, …, √n.
Example:Write down all numbers from 1 to 20. Mark all as prime:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20Remove all multiples of 2 from the list of primes:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
4-15©2010 Raj Jain www.rajjain.com
Sieve (Cont)Sieve (Cont)
The next integer in the sequence is 3. Remove all multiples of 3:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 205 > √20 ⇒ StopPascal Program to Implement the Sieve Kernel:See Program listing Figure 4.2 in the book
4-16©2010 Raj Jain www.rajjain.com
Ackermann's FunctionAckermann's FunctionTo assess the efficiency of the procedure-calling mechanisms. The function has two parameters and is defined recursively.Ackermann(3, n) evaluated for values of n from one to six.Metrics:
Average execution time per callNumber of instructions executed per call, andStack space per call
Verification: Ackermann(3, n) = 2n+3-3Number of recursive calls in evaluating Ackermann(3,n):
(512× 4n-1 -15 × 2n+3 + 9n + 37)/3Execution time per call.
Depth of the procedure calls = 2n+3-4 ⇒ stack space required doubles when n ← n+1.
4-17©2010 Raj Jain www.rajjain.com
Ackermann Program inAckermann Program in SimulaSimula
See program listing Figure 4.3 in the book
4-18©2010 Raj Jain www.rajjain.com
Other BenchmarksOther Benchmarks
WhetstoneU.S. SteelLINPACKDhrystoneDoducTOPLawrence Livermore LoopsDigital Review LabsAbingdon Cross Image-Processing Benchmark
4-19©2010 Raj Jain www.rajjain.com
DebitDebit--Credit BenchmarkCredit Benchmark
A de facto standard for transaction processing systems.First recorded in Anonymous et al (1975).In 1973, a retail bank wanted to put its 1000 branches, 10,000 tellers, and 10,000,000 accounts online with a peak load of 100 Transactions Per Second (TPS).Each TPS requires 10 branches, 100 tellers, and 100,000 accounts.
4-20©2010 Raj Jain www.rajjain.com
DebitDebit--Credit (Cont)Credit (Cont)
4-21©2010 Raj Jain www.rajjain.com
DebitDebit--Credit Benchmark (Continued)Credit Benchmark (Continued)Metric: price/performance ratio.Performance: Throughput in terms of TPS such that 95% of all transactions provide one second or less response time.Response time: Measured as the time interval between the arrival of the last bit from the communications line and the sending of the first bit to the communications line.Cost = Total expenses for a five-year period on purchase, installation, and maintenance of the hardware and software in the machine room.Cost does not include expenditures for terminals, communications, application development, or operations.
4-22©2010 Raj Jain www.rajjain.com
PseudoPseudo--code Definition of Debitcode Definition of Debit--CreditCredit
See Figure 4.5 in the bookFour record types: account, teller, branch, and history.Fifteen percent of the transactions require remote accessTransactions Processing Performance Council (TPC) was formed in August 1988.TPC BenchmarkTM A is a variant of the debit-creditMetrics: TPS such that 90% of all transactions provide two seconds or less response time.
4-23©2010 Raj Jain www.rajjain.com
SPEC Benchmark SuiteSPEC Benchmark SuiteSystems Performance Evaluation Cooperative (SPEC): Non-profit corporation formed by leading computer vendors to develop a standardized set of benchmarks. Release 1.0 consists of the following 10 benchmarks: GCC, Espresso, Spice 2g6, Doduc, LI, Eqntott, Matrix300, Fpppp,TomcatvPrimarily stress the CPU, Floating Point Unit (FPU), and to some extent the memory subsystem ⇒ Το compare CPU speeds.Benchmarks to compare I/O and other subsystems may be included in future releases.
4-24©2010 Raj Jain www.rajjain.com
SPEC (Cont)SPEC (Cont)The elapsed time to run two copies of a benchmark on each of the N processors of a system (a total of 2N copies) is measured and compared with the time to run two copies of the benchmark on a reference system (which is VAX-11/780 for Release 1.0). For each benchmark, the ratio of the time on the system under test and the reference system is reported as SPECthruputusing a notation of #CPU@Ratio. For example, a system with three CPUs taking 1/15 times as long as the the reference system on GCC benchmark has a SPECthruput of [email protected] of the per processor throughput relative to the reference system
4-25©2010 Raj Jain www.rajjain.com
SPEC (Cont)SPEC (Cont)The aggregate throughput for all processors of a multiprocessor system can be obtained by multiplying the ratio by the number of processors. For example, the aggregate throughput for the above system is 45.The geometric mean of the SPECthruputs for the 10 benchmarks is used to indicate the overall performance for the suite and is called SPECmark.
4-26©2010 Raj Jain www.rajjain.com
SummarySummary
Synthetic workload are representative, repeatable, and avoid sensitive informationAdd instruction – most frequent instruction initiallyInstruction mixes, Kernels, synthetic programsApplication benchmarks: Sieve, Ackerman, …Benchmark standards: Debit-Credit, SPEC
4-27©2010 Raj Jain www.rajjain.com
Exercise 4.1Exercise 4.1
Select an area of computer systems (for example, processor design, networks, operating systems, or databases), review articles on performance evaluation in that area and make a list of benchmarks used in those articles.
4-28©2010 Raj Jain www.rajjain.com
Exercise 4.2Exercise 4.2
Implement the Sieve workload in a language of your choice, run it on systems available to you, and report the results.