Lecture 9: Aggregate Data Organization

Topics
  Pointers
  Aggregate Data
    Array layout in memory
    Structures

February 14, 2012

CSCE 212 Computer Architecture


– 2 –CSCE 212H Spring 2012

Overview

Last Time
  GDB recursive
  Lab 2 – Questions due today
  Test 1 Feb ?? Not Feb 15

New
  Datalab
  Pointers
  Aggregate Data
    Array layout in memory
    Structures

Next Time: Test 1 – Feb 23

February 27, Mon. Last day to drop a course or withdraw without a grade of "WF" being recorded (Session C002)

Test 1 Review

Pointer Code

Recursive Procedure

    void s_helper(int x, int *accum)
    {
        if (x <= 1)
            return;
        else {
            int z = *accum * x;
            *accum = z;
            s_helper(x-1, accum);
        }
    }

Top-Level Call

    int sfact(int x)
    {
        int val = 1;
        s_helper(x, &val);
        return val;
    }

Pass pointer to update location

Creating & Initializing Pointer

    int sfact(int x)
    {
        int val = 1;
        s_helper(x, &val);
        return val;
    }

Initial part of sfact

    _sfact:
        pushl %ebp           # Save %ebp
        movl %esp,%ebp       # Set %ebp
        subl $16,%esp        # Allocate 16 bytes
        movl 8(%ebp),%edx    # edx = x
        movl $1,-4(%ebp)     # val = 1

Using Stack for Local Variable
  Variable val must be stored on stack
    Need to create pointer to it
  Compute pointer as -4(%ebp)
  Push on stack as second argument

Stack frame (offsets relative to %ebp)

    Offset   Contents
       8     x
       4     Rtn adr
       0     Old %ebp       <- %ebp
      -4     val = 1
      -8     Unused
     -12     Unused
     -16     Unused         <- %esp
             Temp. Space

Passing Pointer

    int sfact(int x)
    {
        int val = 1;
        s_helper(x, &val);
        return val;
    }

Calling s_helper from sfact

        leal -4(%ebp),%eax   # Compute &val
        pushl %eax           # Push on stack
        pushl %edx           # Push x
        call s_helper        # call
        movl -4(%ebp),%eax   # Return val
        • • •                # Finish

Stack at time of call (offsets relative to %ebp)

    Offset   Contents
       8     x
       4     Rtn adr
       0     Old %ebp       <- %ebp
      -4     val = 1        (val = x! once s_helper returns)
      -8     Unused
     -12     Unused
     -16     Unused
     -20     &val
     -24     x              <- %esp

Using Pointer

    void s_helper(int x, int *accum)
    {
        • • •
        int z = *accum * x;
        *accum = z;
        • • •
    }

        • • •
        movl %ecx,%eax       # z = x
        imull (%edx),%eax    # z *= *accum
        movl %eax,(%edx)     # *accum = z
        • • •

  Register %ecx holds x
  Register %edx holds pointer to accum
    Use access (%edx) to reference memory

Register contents: %edx = accum, %ecx = x, %eax = x and then accum*x; the memory location at (%edx) ends up holding accum*x.

Array Allocation

Basic Principle
  T A[L];
    Array of data type T and length L
    Contiguously allocated region of L * sizeof(T) bytes

    Declaration        Element addresses                      End
    char string[12];   x, x + 1, ...                          x + 12
    int val[5];        x, x + 4, x + 8, x + 12, x + 16        x + 20
    double a[4];       x, x + 8, x + 16, x + 24               x + 32
    char *p[3];        x, x + 4, x + 8                        x + 12
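The sizes above follow directly from L * sizeof(T). A minimal sketch in C that checks this arithmetic; the helper names array_bytes and int_elem_offset are illustrative and not part of the lecture:

```c
#include <stddef.h>

/* Bytes occupied by an array of `count` elements, each `elem_size` bytes. */
size_t array_bytes(size_t count, size_t elem_size)
{
    return count * elem_size;
}

/* Byte distance from element 0 to element i of an int array (valid for i <= 8). */
size_t int_elem_offset(size_t i)
{
    static int probe[8];   /* hypothetical array, used only for address arithmetic */
    return (size_t)((char *)&probe[i] - (char *)&probe[0]);
}
```

On the IA32 machine assumed in the lecture, sizeof(int) and sizeof(char *) are both 4, which gives the 20-byte and 12-byte totals in the table. On a 64-bit machine pointers are typically 8 bytes, so char *p[3] would occupy 24 bytes there.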

Array Access

Basic Principle
  T A[L];
    Array of data type T and length L
    Identifier A can be used as a pointer to array element 0

    int val[5];
      Contents:    1    5      2      1       3
      Addresses:   x    x + 4  x + 8  x + 12  x + 16   (ends at x + 20)

    Reference   Type    Value
    val[4]      int     3
    val         int *   x
    val+1       int *   x + 4
    &val[2]     int *   x + 8
    val[5]      int     ??
    *(val+1)    int     5
    val + i     int *   x + 4 i
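The equivalences in the table can be checked in C. This is a small sketch using the same contents as val; the function names are illustrative only:

```c
#include <stddef.h>

/* Same contents as the val array in the table above. */
static int val[5] = { 1, 5, 2, 1, 3 };

/* Returns 1 when subscripting and pointer arithmetic agree for index i (0..4). */
int subscript_matches_pointer(size_t i)
{
    return &val[i] == val + i && val[i] == *(val + i);
}

/* Byte offset of val + i from val; equals 4*i when sizeof(int) is 4. */
size_t byte_offset(size_t i)
{
    return (size_t)((char *)(val + i) - (char *)val);
}
```

Pointer arithmetic scales by the element size, which is why val + i corresponds to byte address x + 4 i for 4-byte ints.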

Array Example

    typedef int zip_dig[5];

    zip_dig cmu = { 1, 5, 2, 1, 3 };
    zip_dig mit = { 0, 2, 1, 3, 9 };
    zip_dig ucb = { 9, 4, 7, 2, 0 };

    Array          Contents     Element addresses   End
    zip_dig cmu;   1 5 2 1 3    16 20 24 28 32      36
    zip_dig mit;   0 2 1 3 9    36 40 44 48 52      56
    zip_dig ucb;   9 4 7 2 0    56 60 64 68 72      76

Notes
  Declaration “zip_dig cmu” equivalent to “int cmu[5]”
  Example arrays were allocated in successive 20 byte blocks
    Not guaranteed to happen in general

Array Accessing Example

    int get_digit(zip_dig z, int dig)
    {
        return z[dig];
    }

Memory Reference Code

        # %edx = z
        # %eax = dig
        movl (%edx,%eax,4),%eax   # z[dig]

Computation
  Register %edx contains starting address of array
  Register %eax contains array index
  Desired digit at 4*%eax + %edx
  Use memory reference (%edx,%eax,4)

Referencing Examples

    Array          Contents     Element addresses   End
    zip_dig cmu;   1 5 2 1 3    16 20 24 28 32      36
    zip_dig mit;   0 2 1 3 9    36 40 44 48 52      56
    zip_dig ucb;   9 4 7 2 0    56 60 64 68 72      76

Code Does Not Do Any Bounds Checking!

    Reference   Address            Value   Guaranteed?
    mit[3]      36 + 4* 3 = 48     3       Yes
    mit[5]      36 + 4* 5 = 56     9       No
    mit[-1]     36 + 4*-1 = 32     3       No
    cmu[15]     16 + 4*15 = 76     ??      No

  Out of range behavior implementation-dependent
    No guaranteed relative allocation of different arrays
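The Address column is plain arithmetic on the base address and the index. A minimal sketch that reproduces it, using the example base addresses from the slide (16 for cmu, 36 for mit); elem_address is a hypothetical helper, not code from the lecture:

```c
/* Address of element `index` in an array of 4-byte ints starting at `base`.
   No range check is made, mirroring the generated machine code. */
long elem_address(long base, long index)
{
    return base + 4 * index;
}
```

For example, elem_address(36, -1) yields 32, which is the address of cmu[4] in this particular layout. That is why mit[-1] happens to read 3 here, with no guarantee that it will elsewhere.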

Array Loop Example

Original Source

    int zd2int(zip_dig z)
    {
        int i;
        int zi = 0;
        for (i = 0; i < 5; i++) {
            zi = 10 * zi + z[i];
        }
        return zi;
    }

Transformed Version
  As generated by GCC
  Eliminate loop variable i
  Convert array code to pointer code
  Express in do-while form
    No need to test at entrance

    int zd2int(zip_dig z)
    {
        int zi = 0;
        int *zend = z + 4;
        do {
            zi = 10 * zi + *z;
            z++;
        } while(z <= zend);
        return zi;
    }

Array Loop Implementation

    int zd2int(zip_dig z)
    {
        int zi = 0;
        int *zend = z + 4;
        do {
            zi = 10 * zi + *z;
            z++;
        } while(z <= zend);
        return zi;
    }

Registers
  %ecx  z
  %eax  zi
  %ebx  zend

Computations
  10*zi + *z implemented as *z + 2*(zi+4*zi)
  z++ increments by 4

        # %ecx = z
        xorl %eax,%eax            # zi = 0
        leal 16(%ecx),%ebx        # zend = z+4
    .L59:
        leal (%eax,%eax,4),%edx   # 5*zi
        movl (%ecx),%eax          # *z
        addl $4,%ecx              # z++
        leal (%eax,%edx,2),%eax   # zi = *z + 2*(5*zi)
        cmpl %ebx,%ecx            # z : zend
        jle .L59                  # if <= goto loop
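To confirm that the transformation preserves behavior, the two versions can be placed side by side. The sketch below renames them zd2int_index and zd2int_ptr (names chosen here, not in the lecture) and adds times10_plus, a hypothetical helper that mirrors the arithmetic of the two leal instructions:

```c
typedef int zip_dig[5];

/* Original form: index-based for loop. */
int zd2int_index(const int *z)
{
    int zi = 0;
    for (int i = 0; i < 5; i++)
        zi = 10 * zi + z[i];
    return zi;
}

/* Transformed form: pointer-based do-while loop, as on the slide. */
int zd2int_ptr(const int *z)
{
    int zi = 0;
    const int *zend = z + 4;
    do {
        zi = 10 * zi + *z;
        z++;
    } while (z <= zend);
    return zi;
}

/* The leal sequence: 10*zi + d computed as d + 2*(zi + 4*zi). */
int times10_plus(int zi, int d)
{
    int five_zi = zi + 4 * zi;   /* leal (%eax,%eax,4),%edx */
    return d + 2 * five_zi;      /* leal (%eax,%edx,2),%eax */
}
```

The do-while form is safe here only because the loop is known to execute at least once (the array always has 5 elements).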

Nested Array Example

    #define PCOUNT 4
    zip_dig pgh[PCOUNT] =
      {{1, 5, 2, 0, 6},
       {1, 5, 2, 1, 3 },
       {1, 5, 2, 1, 7 },
       {1, 5, 2, 2, 1 }};

    zip_dig pgh[4];

    Row      Contents     Starting address
    pgh[0]   1 5 2 0 6    76
    pgh[1]   1 5 2 1 3    96
    pgh[2]   1 5 2 1 7    116
    pgh[3]   1 5 2 2 1    136   (array ends at 156)

  Declaration “zip_dig pgh[4]” equivalent to “int pgh[4][5]”
    Variable pgh denotes array of 4 elements
      » Allocated contiguously
    Each element is an array of 5 int’s
      » Allocated contiguously
  “Row-Major” ordering of all elements guaranteed

Nested Array Allocation

Declaration
  T A[R][C];
    Array of data type T
    R rows, C columns
    Type T element requires K bytes

Array Size
  R * C * K bytes

Arrangement
  Row-Major Ordering

    A[0][0]     • • •   A[0][C-1]
       •                   •
       •                   •
       •                   •
    A[R-1][0]   • • •   A[R-1][C-1]

    int A[R][C];

    Order in memory:
    A[0][0] • • • A[0][C-1], A[1][0] • • • A[1][C-1], • • •, A[R-1][0] • • • A[R-1][C-1]

    4*R*C Bytes
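Row-major ordering means element A[i][j] sits i*C + j elements past A[0][0]. A small sketch that checks both the total size and the ordering; the dimensions R = 3 and C = 4 are arbitrary values assumed for illustration:

```c
#include <stddef.h>

enum { R = 3, C = 4 };   /* hypothetical dimensions chosen for illustration */

static int A[R][C];

/* Total bytes occupied: R * C * K, with K = sizeof(int). */
size_t total_bytes(void)
{
    return sizeof A;
}

/* Position of A[i][j] in row-major order, counted in elements from A[0][0]. */
size_t row_major_index(size_t i, size_t j)
{
    return i * C + j;
}

/* The same position measured from the actual addresses. */
size_t measured_index(size_t i, size_t j)
{
    return (size_t)((char *)&A[i][j] - (char *)A) / sizeof(int);
}
```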

Nested Array Row Access

Row Vectors
  A[i] is array of C elements
  Each element of type T
  Starting address – base address of A + i * C * K

    int A[R][C];

    Row      Elements                          Starting address
    A[0]     A[0][0]    • • •  A[0][C-1]       A
    • • •
    A[i]     A[i][0]    • • •  A[i][C-1]       A+i*C*4
    • • •
    A[R-1]   A[R-1][0]  • • •  A[R-1][C-1]     A+(R-1)*C*4

Nested Array Row Access Code

    int *get_pgh_zip(int index)
    {
        return pgh[index];
    }

Row Vector
  pgh[index] is array of 5 int’s
  Starting address pgh+20*index

Code
  Computes and returns address
  Compute as pgh + 4*(index+4*index)

        # %eax = index
        leal (%eax,%eax,4),%eax   # 5 * index
        leal pgh(,%eax,4),%eax    # pgh + (20 * index)

Nested Array Element Access

Array Elements
  A[i][j] is element of type T
  Address of A[i][j] is base-of A + (i * C + j) * K
    Base-of A = starting address of array &A[0][0]
    C = number of columns = number of elements in a row
    K = size of individual element

    Row      Elements                                    Starting address
    A[0]     A[0][0]    • • •  A[0][C-1]                 A
    • • •
    A[i]     • • •  A[i][j]  • • •                       A+i*C*4
    • • •
    A[R-1]   A[R-1][0]  • • •  A[R-1][C-1]               A+(R-1)*C*4

    A[i][j] is at A+(i*C+j)*4

Nested Array Element Access Code

    int get_pgh_digit(int index, int dig)
    {
        return pgh[index][dig];
    }

Array Elements
  pgh[index][dig] is int
  Address:
    pgh + 20*index + 4*dig

Code
  Computes address
    pgh + 4*dig + 4*(index+4*index)
  movl performs memory reference

        # %ecx = dig
        # %eax = index
        leal 0(,%ecx,4),%edx         # 4*dig
        leal (%eax,%eax,4),%eax      # 5*index
        movl pgh(%edx,%eax,4),%eax   # *(pgh + 4*dig + 20*index)
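The address computation can also be written out by hand in C. The sketch below compares ordinary indexing with explicit byte arithmetic; get_pgh_digit_manual is a hypothetical name, and sizeof is used instead of the constants 20 and 4 so the code does not depend on a 4-byte int:

```c
#include <stddef.h>

typedef int zip_dig[5];

static zip_dig pgh[4] = {
    {1, 5, 2, 0, 6},
    {1, 5, 2, 1, 3},
    {1, 5, 2, 1, 7},
    {1, 5, 2, 2, 1}
};

/* Compiler-generated indexing, as on the slide. */
int get_pgh_digit(int index, int dig)
{
    return pgh[index][dig];
}

/* Same access with the address computed by hand:
   pgh + sizeof(zip_dig)*index + sizeof(int)*dig,
   i.e. pgh + 20*index + 4*dig when int is 4 bytes. */
int get_pgh_digit_manual(int index, int dig)
{
    const char *base = (const char *)pgh;
    const char *addr = base + sizeof(zip_dig) * (size_t)index
                            + sizeof(int) * (size_t)dig;
    return *(const int *)addr;
}
```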

Structures

Concept
  Contiguously-allocated region of memory
  Refer to members within structure by names
  Members may be of different types

    struct rec {
        int i;
        int a[3];
        int *p;
    };

Memory Layout

    Member   Offset
    i        0
    a        4
    p        16
    (end)    20

Accessing Structure Member

    void set_i(struct rec *r, int val)
    {
        r->i = val;
    }

Assembly

        # %eax = val
        # %edx = r
        movl %eax,(%edx)   # Mem[r] = val

Generating Pointer to Struct. Member

    struct rec {
        int i;
        int a[3];
        int *p;
    };

    Member   Offset
    i        0
    a        4      (a[idx] is at r + 4 + 4*idx)
    p        16

Generating Pointer to Array Element
  Offset of each structure member determined at compile time

    int *find_a(struct rec *r, int idx)
    {
        return &r->a[idx];
    }

        # %ecx = idx
        # %edx = r
        leal 0(,%ecx,4),%eax      # 4*idx
        leal 4(%eax,%edx),%eax    # r+4*idx+4

Structure Referencing (Cont.)

    struct rec {
        int i;
        int a[3];
        int *p;
    };

C Code

    void set_p(struct rec *r)
    {
        r->p = &r->a[r->i];
    }

        # %edx = r
        movl (%edx),%ecx          # r->i
        leal 0(,%ecx,4),%eax      # 4*(r->i)
        leal 4(%edx,%eax),%eax    # r+4+4*(r->i)
        movl %eax,16(%edx)        # Update r->p

    Member   Offset
    i        0
    a        4
    p        16     (after set_p, p points to element i of a)
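The compile-time offsets used by the assembly are available in C through offsetof. A minimal sketch that sets r->p both ways; set_p_manual is a hypothetical name introduced here to spell out the byte arithmetic:

```c
#include <stddef.h>

struct rec {
    int i;
    int a[3];
    int *p;
};

/* As on the slide. */
void set_p(struct rec *r)
{
    r->p = &r->a[r->i];
}

/* Same result with explicit byte arithmetic:
   r + offsetof(a) + sizeof(int) * r->i.
   With 4-byte ints this is r + 4 + 4*(r->i). */
void set_p_manual(struct rec *r)
{
    char *addr = (char *)r + offsetof(struct rec, a)
                           + sizeof(int) * (size_t)r->i;
    r->p = (int *)addr;
}
```

The offset of p is 16 on IA32; on a 64-bit target it is also typically 16, but the structure grows to 24 bytes because the pointer is 8 bytes wide.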

Alignment

Aligned Data
  Primitive data type requires K bytes
  Address must be multiple of K
  Required on some machines; advised on IA32
    treated differently by Linux and Windows!

Motivation for Aligning Data
  Memory accessed by (aligned) double or quad-words
    Inefficient to load or store datum that spans quad word boundaries
    Virtual memory very tricky when datum spans 2 pages

Compiler
  Inserts gaps in structure to ensure correct alignment of fields

Specific Cases of Alignment

Size of Primitive Data Type:
  1 byte (e.g., char)
    no restrictions on address
  2 bytes (e.g., short)
    lowest 1 bit of address must be 0 (binary)
  4 bytes (e.g., int, float, char *, etc.)
    lowest 2 bits of address must be 00 (binary)
  8 bytes (e.g., double)
    Windows (and most other OS’s & instruction sets):
      » lowest 3 bits of address must be 000 (binary)
    Linux:
      » lowest 2 bits of address must be 00 (binary)
      » i.e., treated the same as a 4-byte primitive data type
  12 bytes (long double)
    Linux:
      » lowest 2 bits of address must be 00 (binary)
      » i.e., treated the same as a 4-byte primitive data type

Satisfying Alignment with Structures

    struct S1 {
        char c;
        int i[2];
        double v;
    } *p;

Offsets Within Structure
  Must satisfy element’s alignment requirement

Overall Structure Placement
  Each structure has alignment requirement K
    Largest alignment of any element
  Initial address & structure length must be multiples of K

Example (under Windows):
  K = 8, due to double element

    Member   Address   Constraint
    c        p+0       Multiple of 8
    i[0]     p+4       Multiple of 4
    i[1]     p+8
    v        p+16      Multiple of 8
    (end)    p+24      Multiple of 8
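These rules can be checked on whatever compiler is at hand. The sketch below uses offsetof and the C11 _Alignof operator; the function names are illustrative, and the exact offsets depend on the platform, so only the rules themselves are tested:

```c
#include <stddef.h>

struct S1 {
    char c;
    int i[2];
    double v;
};

/* Returns 1 if every member offset and the total size satisfy the
   alignment rules described above on the current platform. */
int s1_is_aligned(void)
{
    return offsetof(struct S1, i) % _Alignof(int) == 0
        && offsetof(struct S1, v) % _Alignof(double) == 0
        && sizeof(struct S1) % _Alignof(struct S1) == 0;
}

/* Padding bytes the compiler inserted between c and i[0]. */
size_t gap_after_c(void)
{
    return offsetof(struct S1, i) - sizeof(char);
}
```

With a 4-byte-aligned int, gap_after_c() returns 3, matching the gap between p+0 and p+4 in the table.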

Linux vs. Windows

    struct S1 {
        char c;
        int i[2];
        double v;
    } *p;

Windows (including Cygwin):
  K = 8, due to double element

    Member   Address   Constraint
    c        p+0       Multiple of 8
    i[0]     p+4       Multiple of 4
    i[1]     p+8
    v        p+16      Multiple of 8
    (end)    p+24      Multiple of 8

Linux:
  K = 4; double treated like a 4-byte data type

    Member   Address   Constraint
    c        p+0       Multiple of 4
    i[0]     p+4       Multiple of 4
    i[1]     p+8
    v        p+12      Multiple of 4
    (end)    p+20      Multiple of 4

Overall Alignment Requirement

    struct S2 {
        double x;
        int i[2];
        char c;
    } *p;

  p must be multiple of:
    8 for Windows
    4 for Linux

    Member   Address
    x        p+0
    i[0]     p+8
    i[1]     p+12
    c        p+16
    (end)    Windows: p+24, Linux: p+20

    struct S3 {
        float x[2];
        int i[2];
        char c;
    } *p;

  p must be multiple of 4 (in either OS)

    Member   Address
    x[0]     p+0
    x[1]     p+4
    i[0]     p+8
    i[1]     p+12
    c        p+16
    (end)    p+20

Ordering Elements Within Structure

    struct S4 {
        char c1;
        double v;
        char c2;
        int i;
    } *p;

  10 bytes wasted space in Windows

    Member   Address
    c1       p+0
    v        p+8
    c2       p+16
    i        p+20
    (end)    p+24

    struct S5 {
        double v;
        char c1;
        char c2;
        int i;
    } *p;

  2 bytes wasted space

    Member   Address
    v        p+0
    c1       p+8
    c2       p+9
    i        p+12
    (end)    p+16
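The effect of member ordering can be measured with sizeof. A minimal sketch; wasted_s4 and wasted_s5 are hypothetical helpers, and the exact byte counts depend on the platform's alignment rules (10 and 2 under the Windows rules above):

```c
#include <stddef.h>

struct S4 { char c1; double v; char c2; int i; };
struct S5 { double v; char c1; char c2; int i; };

/* Bytes of actual data, identical for both layouts. */
#define PAYLOAD (sizeof(double) + 2 * sizeof(char) + sizeof(int))

/* Padding bytes in each layout: total size minus payload. */
size_t wasted_s4(void) { return sizeof(struct S4) - PAYLOAD; }
size_t wasted_s5(void) { return sizeof(struct S5) - PAYLOAD; }
```

Placing the member with the strictest alignment first and grouping the small members together generally keeps padding low.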

Arrays of Structures

    struct S6 {
        short i;
        float v;
        short j;
    } a[10];

Principle
  Allocated by repeating allocation for array type
  In general, may nest arrays & structures to arbitrary depth

    Element   Address
    a[0]      a+0
    a[1]      a+12
    a[2]      a+24
    • • •     a+36

    Member    Address
    a[1].i    a+12
    a[1].v    a+16
    a[1].j    a+20
    (end)     a+24
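The address of a member inside an array element combines the two rules: element index times structure size, plus the member offset. A short sketch with illustrative function names:

```c
#include <stddef.h>

struct S6 { short i; float v; short j; };

static struct S6 a[10];

/* Byte offset of a[k].v from the start of the array:
   k * sizeof(struct S6) + offsetof(struct S6, v). */
size_t offset_of_v(size_t k)
{
    return k * sizeof(struct S6) + offsetof(struct S6, v);
}

/* The same offset measured from actual addresses. */
size_t measured_offset_of_v(size_t k)
{
    return (size_t)((char *)&a[k].v - (char *)a);
}
```

With a 2-byte short and a 4-byte float aligned to 4, sizeof(struct S6) is 12 and a[1].v lands at a+16, as in the table.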

Linux Memory Layout

Stack
  Runtime stack (8MB limit)

Heap
  Dynamically allocated storage
  When call malloc, calloc, new

DLLs
  Dynamically Linked Libraries
  Library routines (e.g., printf, malloc)
  Linked into object code when first executed

Data
  Statically allocated data
  E.g., arrays & strings declared in code

Text
  Executable machine instructions
  Read-only

Address space (Red Hat v. 6.2, ~1920MB memory limit)

    Upper 2 hex digits of address   Region
    FF
    C0
    BF                              Stack
    80
    7F                              Heap
    40                              DLLs
    3F                              Heap
    08                              Data, Text
    00

Linux Memory Allocation

Four snapshots of the address space (upper 2 hex digits of address, listed from high to low):

    Snapshot    Regions
    Initially   Stack (BF); Text, Data (08)
    Linked      Stack (BF); DLLs (40); Text, Data (08)
    Some Heap   Stack (BF); DLLs (40); Heap; Text, Data (08)
    More Heap   Stack (BF); Heap; DLLs (40); Heap; Text, Data (08)

Text & Stack Example

    (gdb) break main
    (gdb) run
      Breakpoint 1, 0x804856f in main ()
    (gdb) print $esp
      $3 = (void *) 0xbffffc78

Main
  Address 0x804856f should be read 0x0804856f

Stack
  Address 0xbffffc78

Address space (Initially): Stack (BF); Text, Data (08)

Dynamic Linking Example

    (gdb) print malloc
      $1 = {<text variable, no debug info>} 0x8048454 <malloc>
    (gdb) run
      Program exited normally.
    (gdb) print malloc
      $2 = {void *(unsigned int)} 0x40006240 <malloc>

Initially
  Code in text segment that invokes dynamic linker
  Address 0x8048454 should be read 0x08048454

Final
  Code in DLL region

Address space (Linked): Stack (BF); DLLs (40); Text, Data (08)

Memory Allocation Example

    #include <stdlib.h>

    char big_array[1<<24];  /* 16 MB */
    char huge_array[1<<28]; /* 256 MB */

    int beyond;
    char *p1, *p2, *p3, *p4;

    int useless() { return 0; }

    int main()
    {
        p1 = malloc(1 <<28); /* 256 MB */
        p2 = malloc(1 << 8); /* 256 B */
        p3 = malloc(1 <<28); /* 256 MB */
        p4 = malloc(1 << 8); /* 256 B */
        /* Some print statements ... */
    }

Example Addresses

    $esp             0xbffffc78
    p3               0x500b5008
    p1               0x400b4008
    Final malloc     0x40006240
    p4               0x1904a640
    p2               0x1904a538
    beyond           0x1904a524
    big_array        0x1804a520
    huge_array       0x0804a510
    main()           0x0804856f
    useless()        0x08048560
    Initial malloc   0x08048454

Address space: Stack (BF); Heap; DLLs (40); Heap; Text, Data (08)

C operators

    Operators                              Associativity
    () [] -> .                             left to right
    ! ~ ++ -- + - * & (type) sizeof        right to left
    * / %                                  left to right
    + -                                    left to right
    << >>                                  left to right
    < <= > >=                              left to right
    == !=                                  left to right
    &                                      left to right
    ^                                      left to right
    |                                      left to right
    &&                                     left to right
    ||                                     left to right
    ?:                                     right to left
    = += -= *= /= %= &= ^= |= <<= >>=      right to left
    ,                                      left to right

Note: Unary +, -, and * have higher precedence than binary forms

C pointer declarations

    int *p                 p is a pointer to int
    int *p[13]             p is an array[13] of pointer to int
    int *(p[13])           p is an array[13] of pointer to int
    int **p                p is a pointer to a pointer to an int
    int (*p)[13]           p is a pointer to an array[13] of int
    int *f()               f is a function returning a pointer to int
    int (*f)()             f is a pointer to a function returning int
    int (*(*f())[13])()    f is a function returning ptr to an array[13] of pointers to functions returning int
    int (*(*x[3])())[5]    x is an array[3] of pointers to functions returning pointers to array[5] of ints
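A complicated declarator is often easier to read when rebuilt from typedefs. The sketch below writes the int (*(*f())[13])() case both ways and contrasts int *p[13] with int (*p)[13]. All names are illustrative, and (void) parameter lists are used in place of the empty () from the table, which is an assumption made here for stricter type checking:

```c
#include <stddef.h>

typedef int (*int_fn)(void);     /* pointer to function returning int */
typedef int_fn fn_table[13];     /* array[13] of such pointers */

static int one(void) { return 1; }
static fn_table table = { one };

/* Written directly: function returning ptr to an array[13] of
   pointers to functions returning int. */
int (*(*get_table_raw(void))[13])(void)
{
    return &table;
}

/* The same type, built up from the typedefs. */
fn_table *get_table(void)
{
    return &table;
}

/* int *p[13]: an array of 13 pointers, so its size is 13 pointers. */
size_t size_of_array_of_pointers(void)
{
    int *p[13];
    return sizeof p;
}

/* int (*p)[13]: a single pointer; the object it points to is 13 ints. */
size_t size_of_pointed_to_array(void)
{
    int (*p)[13];
    return sizeof *p;
}
```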

Internet Worm and IM War

November, 1988
  Internet Worm attacks thousands of Internet hosts.
  How did it happen?

July, 1999
  Microsoft launches MSN Messenger (instant messaging system).
  Messenger clients can access popular AOL Instant Messaging Service (AIM) servers

Diagram: AIM server, two AIM clients, MSN client, MSN server

Internet Worm and IM War (cont.)

August 1999
  Mysteriously, Messenger clients can no longer access AIM servers.
  Microsoft and AOL begin the IM war:
    AOL changes server to disallow Messenger clients
    Microsoft makes changes to clients to defeat AOL changes.
    At least 13 such skirmishes.
  How did it happen?

The Internet Worm and AOL/Microsoft War were both based on stack buffer overflow exploits!
  many Unix functions do not check argument sizes.
  allows target buffers to overflow.

String Library Code

  Implementation of Unix function gets
    No way to specify limit on number of characters to read

    /* Get string from stdin */
    char *gets(char *dest)
    {
        int c = getchar();
        char *p = dest;
        while (c != EOF && c != '\n') {
            *p++ = c;
            c = getchar();
        }
        *p = '\0';
        return dest;
    }

  Similar problems with other Unix functions
    strcpy: Copies string of arbitrary length
    scanf, fscanf, sscanf, when given %s conversion specification
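For contrast, a bounded read can be built on the standard fgets, which takes the buffer size as an argument. This is a minimal sketch, not part of the lecture; read_line is a hypothetical helper name:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical replacement for gets: reads at most size-1 characters
   from `in` into dest and always NUL-terminates. Returns dest, or NULL
   on end of file, read error, or a zero-sized buffer. */
char *read_line(char *dest, size_t size, FILE *in)
{
    if (size == 0 || fgets(dest, (int)size, in) == NULL)
        return NULL;
    dest[strcspn(dest, "\n")] = '\0';   /* drop the newline, if one was read */
    return dest;
}
```

With a 4-byte buffer, the input 12345678 is truncated to "123" instead of overrunning the buffer; the remaining characters stay in the stream for the next read.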

Vulnerable Buffer Code

    /* Echo Line */
    void echo()
    {
        char buf[4];  /* Way too small! */
        gets(buf);
        puts(buf);
    }

    int main()
    {
        printf("Type a string:");
        echo();
        return 0;
    }

Buffer Overflow Executions

    unix>./bufdemo
    Type a string:123
    123

    unix>./bufdemo
    Type a string:12345
    Segmentation Fault

    unix>./bufdemo
    Type a string:12345678
    Segmentation Fault

Buffer Overflow Stack

    /* Echo Line */
    void echo()
    {
        char buf[4];  /* Way too small! */
        gets(buf);
        puts(buf);
    }

    echo:
        pushl %ebp           # Save %ebp on stack
        movl %esp,%ebp
        subl $20,%esp        # Allocate space on stack
        pushl %ebx           # Save %ebx
        addl $-12,%esp       # Allocate space on stack
        leal -4(%ebp),%ebx   # Compute buf as %ebp-4
        pushl %ebx           # Push buf on stack
        call gets            # Call gets
        . . .

Stack layout (high addresses first)

    Stack frame for main
    Return Address
    Saved %ebp              <- %ebp
    [3][2][1][0]  buf
    Stack frame for echo

Buffer Overflow Stack Example

    unix> gdb bufdemo
    (gdb) break echo
    Breakpoint 1 at 0x8048583
    (gdb) run
    Breakpoint 1, 0x8048583 in echo ()
    (gdb) print /x *(unsigned *)$ebp
    $1 = 0xbffff8f8
    (gdb) print /x *((unsigned *)$ebp + 1)
    $3 = 0x804864d

    8048648: call 804857c <echo>
    804864d: mov 0xffffffe8(%ebp),%ebx   # Return Point

Before call to gets (high addresses first)

    Stack frame for main
    Return Address          08 04 86 4d
    Saved %ebp              bf ff f8 f8    <- %ebp = 0xbffff8d8
    [3][2][1][0]  buf       xx xx xx xx
    Stack frame for echo

Buffer Overflow Example #1

Before Call to gets

    Stack frame for main
    Return Address
    Saved %ebp              <- %ebp
    [3][2][1][0]  buf
    Stack frame for echo

Input = “123”

    Stack frame for main
    Return Address          08 04 86 4d
    Saved %ebp              bf ff f8 f8    <- %ebp = 0xbffff8d8
    [3][2][1][0]  buf       00 33 32 31
    Stack frame for echo

No Problem

Buffer Overflow Stack Example #2

Input = “12345”

    Stack frame for main
    Return Address          08 04 86 4d
    Saved %ebp              bf ff 00 35    <- %ebp = 0xbffff8d8
    [3][2][1][0]  buf       34 33 32 31
    Stack frame for echo

echo code:

    8048592: push %ebx
    8048593: call 80483e4 <_init+0x50>   # gets
    8048598: mov 0xffffffe8(%ebp),%ebx
    804859b: mov %ebp,%esp
    804859d: pop %ebp                    # %ebp gets set to invalid value
    804859e: ret

Saved value of %ebp set to 0xbfff0035

Bad news when later attempt to restore %ebp
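The corrupted value 0xbfff0035 can be reproduced with a small model of the bottom of echo's frame. The sketch below is a simulation only and does not overflow a real buffer; saved_ebp_after_gets is a hypothetical function, and the 8-byte frame array (4 bytes of buf followed by the 4-byte saved %ebp, little-endian) is an assumption taken from the stack diagram:

```c
#include <stdint.h>
#include <string.h>

/* Model of the 8 bytes at the bottom of echo's frame on little-endian IA32:
   bytes 0..3 are buf, bytes 4..7 are the saved %ebp. */
uint32_t saved_ebp_after_gets(uint32_t saved_ebp, const char *input)
{
    unsigned char frame[8] = { 0 };

    /* Store the saved %ebp in little-endian byte order. */
    for (int k = 0; k < 4; k++)
        frame[4 + k] = (unsigned char)(saved_ebp >> (8 * k));

    /* gets copies the input plus a terminating NUL with no bounds check;
       this model stops at 8 bytes so it stays inside `frame`. */
    size_t n = strlen(input) + 1;
    if (n > sizeof frame)
        n = sizeof frame;
    memcpy(frame, input, n);

    /* Reassemble the (possibly overwritten) saved %ebp. */
    uint32_t result = 0;
    for (int k = 0; k < 4; k++)
        result |= (uint32_t)frame[4 + k] << (8 * k);
    return result;
}
```

With input "12345", the character '5' (0x35) and the terminating NUL land on the two low-order bytes of the saved %ebp, turning 0xbffff8f8 into 0xbfff0035.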

Buffer Overflow Stack Example #3

Input = “12345678”

    Stack frame for main
    Return Address          08 04 86 00
    Saved %ebp              38 37 36 35    <- %ebp = 0xbffff8d8
    [3][2][1][0]  buf       34 33 32 31
    Stack frame for echo

%ebp and return address corrupted

    8048648: call 804857c <echo>
    804864d: mov 0xffffffe8(%ebp),%ebx   # Return Point

Invalid address

No longer pointing to desired return point

Malicious Use of Buffer Overflow

    void foo()
    {
        bar();
        ...
    }

    void bar()
    {
        char buf[64];
        gets(buf);
        ...
    }

Stack after call to gets() (high addresses first)

    foo stack frame
    return address A, overwritten with B
    bar stack frame, filled with data written by gets():
      pad
      exploit code (starts at address B)

  Input string contains byte representation of executable code
  Overwrite return address with address of buffer
  When bar() executes ret, will jump to exploit code