
More Code Optimization


Page 1: More Code Optimization

1

More Code Optimization

Page 2: More Code Optimization

2

Outline

• Tuning Performance

• Suggested reading

– 5.14

Page 3: More Code Optimization

3

Performance Tuning

• Identify the hottest part of the program
– Profiling is a very useful method for this
• Instrument the program
• Run it with typical input data
• Collect information from the run
• Analyze the result

Page 4: More Code Optimization

4

Examples

unix> gcc -O1 -pg prog.c -o prog
unix> ./prog file.txt        (generates a file gmon.out)
unix> gprof prog             (analyzes the data in gmon.out)

  %   cumulative    self                self     total
 time    seconds   seconds     calls   s/call   s/call  name
97.58     173.05    173.05         1   173.05   173.05  sort_words
 2.36     177.24      4.19    965027     0.00     0.00  find_ele_rec
 0.12     177.46      0.22  12511031     0.00     0.00  Strlen

Page 5: More Code Optimization

5

Principle

• Interval counting
– Maintain a counter for each function
  • Records the time spent executing this function
– The program is interrupted at regular intervals (1 ms)
  • Check which function is executing when the interrupt occurs
  • Increment the counter for this function
• The calling information is quite reliable
• By default, the timings for library functions are not shown
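The counting scheme above can be sketched as a small simulation. Everything in it is hypothetical (the function ids, the `timeline` array, and `sample_profile`): a real profiler relies on timer interrupts rather than a precomputed timeline, so this only illustrates the bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_FUNCS 3   /* hypothetical number of profiled functions */

/*
 * timeline[t] holds the id of the function executing at millisecond t.
 * Every interval_ms milliseconds a simulated interrupt fires, and the
 * counter of whichever function is running at that instant is incremented.
 */
void sample_profile(const int *timeline, size_t total_ms,
                    size_t interval_ms, unsigned counts[NUM_FUNCS])
{
    assert(timeline != NULL && interval_ms > 0);
    for (size_t i = 0; i < NUM_FUNCS; i++)
        counts[i] = 0;
    for (size_t t = 0; t < total_ms; t += interval_ms) {
        int id = timeline[t];
        assert(id >= 0 && id < NUM_FUNCS);
        counts[id]++;
    }
}
```

The estimated time for a function is its counter multiplied by the interval. A longer interval means fewer samples, so short-lived functions can be over- or under-counted.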

Page 6: More Code Optimization

6

Program Example

• Task
– Analyze the n-gram statistics of a text document
– An n-gram is a sequence of n words occurring in a document
– The program reads a text file,
– creates a table of unique n-grams, specifying how many times each one occurs,
– and sorts the n-grams in descending order of occurrence

Page 7: More Code Optimization

7

Program Example

• Steps
– Convert strings to lowercase
– Apply hash function
– Read n-grams and insert into hash table
  • Mostly list operations
  • Maintain counter for each unique n-gram
– Sort results
• Data Set
– Collected works of Shakespeare
– 965,028 total words, 23,706 unique
– N=2, called bigrams
– 363,039 unique bigrams

Page 8: More Code Optimization

8

Examples

unix> gcc -O1 -pg prog.c -o prog
unix> ./prog file.txt
unix> gprof prog

  %   cumulative    self                self     total
 time    seconds   seconds     calls   s/call   s/call  name
97.58     173.05    173.05         1   173.05   173.05  sort_words
 2.36     177.24      4.19    965027     0.00     0.00  find_ele_rec
 0.12     177.46      0.22  12511031     0.00     0.00  Strlen

Page 9: More Code Optimization

9

Example

index  % time    self  children    called                name
                                   158655725             find_ele_rec [5]
                 4.19    0.02      965027/965027         insert_string [4]
[5]      2.4     4.19    0.02      965027+158655725      find_ele_rec [5]
                 0.01    0.01      363039/363039         new_ele [10]
                 0.00    0.01      363039/363039         save_string [13]
                                   158655725             find_ele_rec [5]

• Ratio of recursive calls to top-level calls: 158655725/965027 = 164.4
• Each lookup therefore scans about 164 list elements on average, so the lists in the hash buckets are long

Page 10: More Code Optimization

10

Code Optimizations

– First step: use a more efficient sorting function
– Library function qsort
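A minimal sketch of the qsort step, assuming the n-grams have already been collected into an array. The record type `ngram_t`, the comparator, and `sort_ngrams` are hypothetical names for illustration; only `qsort` itself is the standard library function.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: one unique n-gram and its occurrence count. */
typedef struct {
    const char *text;
    int count;
} ngram_t;

/* Comparator for qsort: larger counts sort first (descending order). */
static int compare_by_count_desc(const void *a, const void *b)
{
    const ngram_t *x = a;
    const ngram_t *y = b;
    return (x->count < y->count) - (x->count > y->count);
}

/* Sort n records in descending order of occurrence. */
void sort_ngrams(ngram_t *items, size_t n)
{
    assert(items != NULL);
    qsort(items, n, sizeof items[0], compare_by_count_desc);
}
```

The comparator avoids the common `y->count - x->count` shortcut, which can overflow for large counts.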

Page 11: More Code Optimization

11

Further Optimizations

Page 12: More Code Optimization

12

Optimizations

• Iter first: replace the recursive call with an iterative one
– Inserts new elements into the linked list
– Causes the code to slow down
• Reason:
– Iter first inserts each new element at the beginning of the list
– The most common n-grams tend to end up at the end of the list, which increases the search time
• Iter last: iterative function that places each new entry at the end of the list
– Tends to place the most common words at the front of the list
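The "iter last" variant might look like the sketch below. The node type `ele_t` and the function `insert_iter_last` are hypothetical; the deck does not show the actual source, so this is one plausible shape rather than the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node: one n-gram and how often it has been seen. */
typedef struct ele {
    char *word;
    int freq;
    struct ele *next;
} ele_t;

/*
 * Iterative find-or-insert. If word is already in the list, bump its
 * count; otherwise append a new node at the END of the list.
 * Returns 0 on success, -1 if allocation fails.
 */
int insert_iter_last(ele_t **head, const char *word)
{
    assert(head != NULL && word != NULL);
    ele_t **link = head;
    while (*link != NULL) {
        if (strcmp((*link)->word, word) == 0) {
            (*link)->freq++;
            return 0;
        }
        link = &(*link)->next;
    }
    /* *link is now the NULL pointer that terminates the list. */
    ele_t *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->word = malloc(strlen(word) + 1);
    if (node->word == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->word, word);
    node->freq = 1;
    node->next = NULL;
    *link = node;
    return 0;
}
```

Because frequent n-grams usually show up early in the input, appending keeps them near the front, where later lookups find them after only a few comparisons.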

Page 13: More Code Optimization

13

Optimizations

• Big table: increase the number of hash buckets
– Initial version: only 1021 buckets
– There are 363039/1021 = 355.6 bigrams in each bucket on average
– Increase it to 199,999
– Only improves the time by 0.3 s
– The initial hash function sums the character codes of a string
– The maximum code is 3371, for “honorificabilitudinitatibus thou”
– Most buckets are not used
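A sketch of a hash function that sums character codes, as described above. The name `hash_sum` and its signature are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the character codes of s, then reduce modulo the table size. */
size_t hash_sum(const char *s, size_t num_buckets)
{
    assert(s != NULL && num_buckets > 0);
    size_t sum = 0;
    for (; *s != '\0'; s++)
        sum += (unsigned char)*s;
    return sum % num_buckets;
}
```

With 199,999 buckets the bigram "honorificabilitudinitatibus thou" still hashes to 3371, so no string can land in a bucket above that index, however large the table is.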

Page 14: More Code Optimization

14

Optimizations

• Better hash: use a more sophisticated hash function
– Shift and XOR
– Time drops to 0.4 seconds
• Linear lower: move strlen out of the loop
– Time drops to 0.2 seconds
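The deck does not give the exact shift-and-XOR function, so the following is only one plausible sketch. The rotation amount (5 bits), the 32-bit state, and the name `hash_shift_xor` are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Rotate the running hash left by 5 bits and XOR in the next character.
 * Unlike a plain sum, the result depends on character order and spreads
 * over the full 32-bit range before the final modulo.
 */
size_t hash_shift_xor(const char *s, size_t num_buckets)
{
    assert(s != NULL && num_buckets > 0);
    uint32_t h = 0;
    for (; *s != '\0'; s++)
        h = (h << 5) ^ (h >> 27) ^ (unsigned char)*s;
    return h % num_buckets;
}
```

For example, "ab" and "ba" collide under a summing hash but map to different buckets here.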

Page 15: More Code Optimization

15

Code Motion

#include <string.h>   /* strlen */

/* Convert string to lowercase: slow */
void lower1(char *s)
{
    size_t i;

    /* strlen(s) is re-evaluated on every iteration of the loop */
    for (i = 0; i < strlen(s); i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}

Page 16: More Code Optimization

16

Code Motion

/* Convert string to lowercase: faster */
void lower2(char *s)
{
    size_t i;
    size_t len = strlen(s);   /* computed once, outside the loop */

    for (i = 0; i < len; i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}

Page 17: More Code Optimization

17

Code Motion

#include <stddef.h>   /* size_t */

/* Sample implementation of library function strlen */
/* Compute length of string: takes time proportional to the length */
size_t strlen(const char *s)
{
    size_t length = 0;
    while (*s != '\0') {
        s++;
        length++;
    }
    return length;
}
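To make the cost difference between lower1 and lower2 concrete, the sketch below counts how many characters the length computation examines. The helper names (`counting_strlen`, `scans_lower1`, `scans_lower2`) are hypothetical instrumentation, not part of the original program.

```c
#include <assert.h>
#include <stddef.h>

static size_t chars_scanned = 0;   /* characters examined by counting_strlen */

/* Same loop as the sample strlen, but it records how much work it does. */
static size_t counting_strlen(const char *s)
{
    size_t length = 0;
    while (*s != '\0') {
        s++;
        length++;
        chars_scanned++;
    }
    return length;
}

/* Length recomputed on every loop test, as in lower1. */
size_t scans_lower1(char *s)
{
    assert(s != NULL);
    chars_scanned = 0;
    for (size_t i = 0; i < counting_strlen(s); i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
    return chars_scanned;
}

/* Length computed once before the loop, as in lower2. */
size_t scans_lower2(char *s)
{
    assert(s != NULL);
    chars_scanned = 0;
    size_t len = counting_strlen(s);
    for (size_t i = 0; i < len; i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
    return chars_scanned;
}
```

For a string of length n, the first version examines n(n+1) characters and the second examines n, which is the quadratic versus linear behavior that code motion removes.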

Page 18: More Code Optimization

18

Code Motion

Page 19: More Code Optimization

19

Performance Tuning

• Benefits
– Helps identify performance bottlenecks
– Especially useful for a complex system with many components
• Limitations
– Only shows performance for the data tested
– E.g., linear lower did not show a big gain, since words are short
  • Quadratic inefficiency could remain lurking in the code
– Timing mechanism is fairly crude
  • Only works for programs that run for > 3 seconds

Page 20: More Code Optimization

20

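Amdahl’s Law

$T_{new} = (1-\alpha)T_{old} + (\alpha T_{old})/k$
$\phantom{T_{new}} = T_{old}[(1-\alpha) + \alpha/k]$
$S = T_{old}/T_{new} = 1/[(1-\alpha) + \alpha/k]$
$S_{\infty} = 1/(1-\alpha)$ (the limit as $k \to \infty$)

Here $\alpha$ is the fraction of the original time taken by the part being improved, and $k$ is the factor by which that part is sped up.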
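As a quick numeric check of Amdahl's law, the sketch below evaluates the speedup S = 1/[(1-α) + α/k], where α is the fraction of the original time spent in the improved part and k is the factor by which that part is sped up. The function name `amdahl_speedup` is hypothetical.

```c
#include <assert.h>

/* alpha: fraction of T_old taken by the improved part; k: its speedup factor */
double amdahl_speedup(double alpha, double k)
{
    assert(alpha >= 0.0 && alpha <= 1.0 && k > 0.0);
    return 1.0 / ((1.0 - alpha) + alpha / k);
}
```

With α = 0.6 and k = 3 the overall speedup is only about 1.67. For the profile shown earlier, where sort_words takes 97.58% of the time, even an infinitely fast sort caps the speedup at 1/(1 - 0.9758), or about 41.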

Page 21: More Code Optimization

21

Outline

• Common Memory-Related Bugs in C Programs

• Suggested reading

– 9.11

Page 22: More Code Optimization

Dereferencing Bad Pointers

• The classic scanf bug

int val;
...
scanf("%d", val);   /* bug: passes the value of val, not its address &val */
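A corrected sketch of the scanf call. To keep it testable without standard input it uses `sscanf`, which follows the same rule: every conversion needs the address of its target. The wrapper `parse_int` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 and stores the number in *out on success, 0 otherwise. */
int parse_int(const char *text, int *out)
{
    assert(text != NULL && out != NULL);
    int val;
    if (sscanf(text, "%d", &val) != 1)   /* &val: pass the ADDRESS of val */
        return 0;
    *out = val;
    return 1;
}
```

Checking the return value also catches input that is not a number, which the original snippet silently ignores.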

Page 23: More Code Optimization

Reading Uninitialized Memory

• Assuming that heap data is initialized to zero

/* return y = Ax */
int *matvec(int **A, int *x) {
    int *y = malloc(N*sizeof(int));
    int i, j;

    for (i=0; i<N; i++)
        for (j=0; j<N; j++)
            y[i] += A[i][j]*x[j];   /* bug: y[i] was never initialized */
    return y;
}
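One way to fix the uninitialized read is to allocate with `calloc`, which zero-fills the block. The sketch passes the dimension as a parameter `n` instead of the slide's global `N`; that change and the name `matvec_zeroed` are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Return y = Ax for an n-by-n matrix, or NULL if allocation fails. */
int *matvec_zeroed(int **A, const int *x, size_t n)
{
    assert(A != NULL && x != NULL);
    int *y = calloc(n, sizeof *y);   /* calloc zero-initializes y */
    if (y == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            y[i] += A[i][j] * x[j];
    return y;
}
```

Setting `y[i] = 0` explicitly before the inner loop would work equally well with `malloc`.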

Page 24: More Code Optimization

Overwriting Memory

• Allocating the (possibly) wrong sized object

int **p;

p = malloc(N*sizeof(int));   /* bug: elements are int *, so this should be sizeof(int *) */

for (i=0; i<N; i++) {
    p[i] = malloc(M*sizeof(int));
}

Page 25: More Code Optimization

Overwriting Memory

• Off-by-one error

int **p;

p = malloc(N*sizeof(int *));

for (i=0; i<=N; i++) {   /* bug: i<=N writes p[N], one element past the end */
    p[i] = malloc(M*sizeof(int));
}
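A sketch that avoids both allocation mistakes shown on these two slides (wrong element size and off-by-one loop bound). The function names and the use of parameters instead of the slides' `N` and `M` are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate n rows of m ints each; returns NULL on failure. */
int **alloc_rows(size_t n, size_t m)
{
    assert(n > 0 && m > 0);
    int **p = malloc(n * sizeof *p);       /* sizeof *p is sizeof(int *) */
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {       /* i < n, never i <= n */
        p[i] = malloc(m * sizeof *p[i]);   /* sizeof *p[i] is sizeof(int) */
        if (p[i] == NULL) {
            while (i > 0)
                free(p[--i]);
            free(p);
            return NULL;
        }
    }
    return p;
}

/* Release everything alloc_rows allocated. */
void free_rows(int **p, size_t n)
{
    if (p == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        free(p[i]);
    free(p);
}
```

Writing `sizeof *p` ties the element size to the pointer's own type, so the size cannot drift out of sync with the declaration.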

Page 26: More Code Optimization

Overwriting Memory

• Not checking the max string size

• Basis for classic buffer overflow attacks

char s[8];
int i;

gets(s);   /* reads "123456789" from stdin */
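A bounded alternative to `gets` is `fgets`, which never writes more than the buffer size it is given. The wrapper `read_line` below is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read at most size-1 characters of one line into buf and strip the
 * trailing newline, if any. Returns 1 on success, 0 on EOF or error.
 */
int read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

With an 8-byte buffer and the input "123456789", only "1234567" plus the terminating null byte is stored. The rest of the line stays in the stream instead of overwriting adjacent memory.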

Page 27: More Code Optimization

Overwriting Memory

• Misunderstanding pointer arithmetic

int *search(int *p, int val) {
    while (*p && *p != val)
        p += sizeof(int);   /* bug: advances sizeof(int) elements, not one; should be p++ */

    return p;
}
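The corrected search relies on pointer arithmetic already being scaled by the element size, so advancing by one element is just `p++`. This is a minimal sketch that keeps the slide's convention of a zero-terminated array.

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the first element equal to val, or to the 0 terminator. */
int *search(int *p, int val)
{
    assert(p != NULL);
    while (*p && *p != val)
        p++;   /* moves forward by one int, i.e. sizeof(int) bytes */
    return p;
}
```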

Page 28: More Code Optimization

Overwriting Memory

• Referencing a pointer instead of the object it points to

int *BinheapDelete(int **binheap, int *size) {
    int *packet;
    packet = binheap[0];
    binheap[0] = binheap[*size - 1];
    *size--;   /* bug: parsed as *(size--), which moves the pointer; should be (*size)-- */
    Heapify(binheap, *size, 0);
    return(packet);
}
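The precedence issue can be isolated in a smaller function. `remove_last` is a hypothetical stand-in for the heap routine. It only shows the corrected decrement, `(*size)--`, which changes the integer that `size` points to rather than the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Remove and return the last item; *size is decremented through the pointer. */
int remove_last(const int *items, int *size)
{
    assert(items != NULL && size != NULL && *size > 0);
    (*size)--;              /* parentheses force the dereference to happen first */
    return items[*size];
}
```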

Page 29: More Code Optimization

Referencing Nonexistent Variables

• Forgetting that local variables disappear when a function returns

int *foo () {
    int val;

    return &val;   /* bug: val no longer exists once foo returns */
}
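Two common ways to avoid returning the address of a local are sketched below: allocate the object on the heap, or let the caller supply the storage. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Heap version: the int outlives the call; the caller must free it. */
int *foo_heap(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}

/* Out-parameter version: the caller owns the storage, so nothing local escapes. */
void foo_out(int value, int *out)
{
    assert(out != NULL);
    *out = value;
}
```

The heap version shifts responsibility for `free` to the caller, while the out-parameter version needs no allocation at all.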

Page 30: More Code Optimization

Freeing Blocks Multiple Times

• Nasty!

x = malloc(N*sizeof(int));
<manipulate x>
free(x);

y = malloc(M*sizeof(int));
<manipulate y>
free(x);   /* bug: x was already freed; y is the block that should be freed here */
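One defensive habit against double frees is to clear a pointer as soon as its block is released. The helper `free_and_clear` is hypothetical. It relies on the standard guarantee that `free(NULL)` does nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Free the block *p points to and reset *p so a repeated call is harmless. */
void free_and_clear(int **p)
{
    assert(p != NULL);
    free(*p);     /* free(NULL) is defined to do nothing */
    *p = NULL;
}
```

This does not fix the logic error of freeing the wrong variable, but it turns a heap-corrupting second free into a no-op and makes later use of the stale pointer fail fast.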

Page 31: More Code Optimization

Referencing Freed Blocks

• Evil!

x = malloc(N*sizeof(int));
<manipulate x>
free(x);
...
y = malloc(M*sizeof(int));
for (i=0; i<M; i++)
    y[i] = x[i]++;   /* bug: x was freed above */

Page 32: More Code Optimization

Failing to Free Blocks (Memory Leaks)

• Slow, long-term killer!

foo() {
    int *x = malloc(N*sizeof(int));
    ...
    return;   /* bug: x is never freed, so the block leaks */
}

Page 33: More Code Optimization

Failing to Free Blocks (Memory Leaks)

• Freeing only part of a data structure

struct list {
    int val;
    struct list *next;
};

foo() {
    struct list *head = malloc(sizeof(struct list));
    head->val = 0;
    head->next = NULL;
    <create and manipulate the rest of the list>
    ...
    free(head);   /* bug: frees only the first node; the rest of the list leaks */
    return;
}
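Freeing the whole list means walking it and saving each `next` pointer before releasing the node. The functions `list_push` and `free_list` are hypothetical helpers built around the slide's `struct list`.

```c
#include <assert.h>
#include <stdlib.h>

struct list {
    int val;
    struct list *next;
};

/* Prepend a node; returns 0 on success, -1 if allocation fails. */
int list_push(struct list **head, int val)
{
    assert(head != NULL);
    struct list *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->val = val;
    node->next = *head;
    *head = node;
    return 0;
}

/* Free every node in the list; returns how many nodes were freed. */
size_t free_list(struct list *head)
{
    size_t freed = 0;
    while (head != NULL) {
        struct list *next = head->next;   /* save before the node is freed */
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```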