CS Fundamentals: Scalability and Memory

SCALABILITY AND MEMORY
CS FUNDAMENTALS SERIES

http://bit.ly/1TPJCe6

HOW DO YOU MEASURE AN ALGORITHM?

???

CLOCK TIME?

DEPENDS ON WHO’S COUNTING.

ALSO, TOO FLAKY EVEN ON THE SAME MACHINE.

THE NUMBER OF LINES?


THIS IS TWO LINES, BUT A WHOLE LOT OF STUPID.


THE NUMBER OF CPU CYCLES?


DEPENDS ON THE RUNTIME.


ALL THESE METHODS SUCK.

NONE OF THEM CAPTURE WHAT WE ACTUALLY CARE ABOUT.


ENTER BIG O!


ASYMPTOTIC ANALYSIS


▸ Big O is about asymptotic analysis

▸ In other words, it’s about how an algorithm scales when the numbers get huge

▸ You can also describe this as “the rate of growth”

▸ How fast do the numbers become unmanageable?

▸ Another way to think about this is:

▸ What happens when your input size is 10,000,000? Will your program be able to finish?

▸ It’s about scalability, not necessarily speed


PRINCIPLES OF BIG O


▸ Big O is a kind of mathematical notation

▸ In computer science, it essentially means “the asymptotic rate of growth”

▸ In other words, how does the running time of this function scale with the input size when the numbers get big?

▸ Big O notation looks like this:

O(n) O(n log n) O(n²)


▸ n here refers to the input size

▸ Can be the size of an array, the length of a string, the number of bits in a number, etc.

▸ O(n) means the algorithm scales linearly with the input

▸ Think like a line (y = x)
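For example, a function that touches each element once is O(n). A minimal sketch (illustrative, not from the deck):

// One constant-time step per element, so runtime grows in lockstep with input size.
function contains(arr, target) {
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] === target) return true; // O(1) work per element
  }
  return false; // worst case touches all n elements => O(n)
}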


▸ “Scaling linearly” can mean 1:1 (one iteration per extra input), but it doesn’t necessarily

▸ It can simply mean k:1 where k is constant, like 3:1 or 5:1 (i.e., a constant amount of time per extra input)


▸ In Big O, we strip out any coefficients or smaller factors.

▸ The fastest-growing factor wins. This is also known as the dominant factor.

▸ Just think: when the numbers get huge, what dwarfs everything else?

▸ O(5n) => O(n)

▸ O(½n - 10) also => O(n)


▸ O(k) where k is any constant reduces to O(1).

▸ O(200) = O(1)

▸ Where there are multiple factors of growth, the most dominant one wins.

▸ O(n⁴ + n² + 40n) = O(n⁴)


▸ If there are two inputs (say you’re trying to find all the common substrings of two strings), then you use two variables in your Big O notation => O(n * m)

▸ Doesn’t matter if one variable probably dwarfs the other. You always include both.

▸ O(n + m) => this is considered linear

▸ O(2ⁿ + log(m)) => this is considered exponential
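For instance, comparing every character of one string against every character of another does n * m units of work. A hypothetical sketch (not from the deck):

// Count matching character pairs between two strings: O(n * m).
function countMatchingPairs(s1, s2) {
  let count = 0;
  for (let i = 0; i < s1.length; i++) {    // n iterations
    for (let j = 0; j < s2.length; j++) {  // m iterations for each i
      if (s1[i] === s2[j]) count++;
    }
  }
  return count; // total work: n * m => O(n * m)
}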

COMPREHENSION TEST

Convert each of these to their appropriate Big O form!

▸ O(3n + 5)

▸ O(n + ⅕n²)

▸ O(log(n) + 5000)

▸ O(2m³ + 50 + ½n)

▸ O(n log(m) + 2m² + nm)

▸ What should n be for this function?

Let’s break it down. The function: make an empty array; then, for each character in the string, unshift it into the array; and then join the array together.

▸ Initialize an empty array => O(1)

▸ Then, split the string into an array of characters => O(n)

▸ Then for each character => O(n)

▸ Unshift into an array => O(n) (we’ll see later why this is)

▸ Then join the characters into a string => O(n)

The loop and the unshift multiply. => O(n²)

▸ O(n² + 2n) = O(n²)

▸ This algorithm is quadratic.

▸ Let’s see how badly it sucks.


Benchmark away!

(showSlowReverse.js)
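The original showSlowReverse.js isn’t included in the deck, but a minimal reconstruction of the unshift-based reverse and its benchmark might look like this (assumed, not the author’s file):

// The quadratic reverse described above.
function slowReverse(str) {
  const result = [];
  for (const char of str.split('')) { // O(n) characters
    result.unshift(char);             // O(n) per call => O(n²) overall
  }
  return result.join('');             // O(n)
}

// Time it at a few input sizes and watch the growth outpace the input.
for (const size of [1000, 10000, 20000]) {
  const input = 'a'.repeat(size);
  console.time(`n = ${size}`);
  slowReverse(input);
  console.timeEnd(`n = ${size}`);
}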

TIME COMPLEXITIES WAY TOO FAST

Constant O(1): math, pop, push, arr[i], property access, conditionals, initializing a variable

Logarithmic O(log n): binary search

Linear O(n): linear search, iteration

Linearithmic O(n log n): sorting (merge sort, quick sort)

Quadratic O(n²): nested looping, bubble sort

Cubic O(n³): triply nested looping, matrix multiplication

Polynomial O(nᵏ): all “efficient” algorithms

Exponential O(2ⁿ): subsets, solving chess

Factorial O(n!): permutations
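As one illustration from the table: binary search earns its O(log n) by halving the remaining range on every comparison. A generic sketch (not code from the deck):

// Binary search over a sorted array: each step halves the search space.
function binarySearch(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (sorted[mid] === target) return mid;
    if (sorted[mid] < target) lo = mid + 1; // discard the left half
    else hi = mid - 1;                      // discard the right half
  }
  return -1; // at most ~log2(n) iterations
}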


TIME TO IDENTIFY TIME COMPLEXITIES


OPTIMIZATIONS DON’T ALWAYS MATTER


BOTTLENECKS


▸ A bottleneck is the part of your code where your algorithm spends most of its time.

▸ Asymptotically, it’s wherever the dominant factor is.

▸ If your algorithm has an O(n) part and an O(50) part, the bottleneck is the O(n) part.

▸ As n => ∞, your algorithm will eventually spend 99%+ of its time in the bottleneck.


▸ When trying to optimize or speed up an algorithm, focus on the bottleneck.

▸ Optimizing code outside the bottleneck will have a minuscule effect.

▸ Bottleneck optimizations, on the other hand, can easily be huge!


▸ If you cut down non-bottleneck code, you might be able to save .01% of your runtime.

▸ If you cut down on bottleneck code, you might be able to save 30% of your runtime.

▸ Better yet, try to lower the time complexity altogether if you can!
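A hypothetical sketch of what this looks like in code (not from the deck): the nested loop below is the bottleneck, and the linear pass after it is barely worth touching.

function countDuplicates(arr) {
  let dupes = 0;
  for (let i = 0; i < arr.length; i++) {   // O(n²): the bottleneck
    for (let j = i + 1; j < arr.length; j++) {
      if (arr[i] === arr[j]) dupes++;
    }
  }
  let sum = 0;
  for (const x of arr) sum += x;           // O(n): optimizing this saves almost nothing
  return { dupes, sum };
}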


BOTTLENECK EXERCISE


SPACE COMPLEXITY


▸ Same thing, except now with memory instead of time.

▸ Do you take linear extra space relative to the input?

▸ Do you allocate new arrays? Do you have to make a copy of the original input? Are you creating nested data structures?
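For example, a hypothetical sketch (not from the deck): one function needs O(1) extra space, the other O(n).

// O(1) extra space: one running total, regardless of input size.
function sum(arr) {
  let total = 0;
  for (const x of arr) total += x;
  return total;
}

// O(n) extra space: allocates a new array as big as the input.
function doubled(arr) {
  return arr.map(x => x * 2);
}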


COMPREHENSION CHECK


▸ What is the space complexity of:

▸ max(arr)

▸ firstFive(arr)

▸ substrings(str)

▸ hasVowel(str)


SO WHAT THE HELL IS MEMORY ANYWAY


TO UNDERSTAND MEMORY, WE NEED TO UNDERSTAND HOW A COMPUTER IS STRUCTURED.

Data Layers

Registers: immediate workspace. A CPU usually has 16 of these. (1 cycle)

L1 cache: a nearby reservoir of useful data we’ve recently read. Close by. (~4 cycles)

L2 cache: more nearby data, but a little farther away. (~10 cycles)

RAM: getting pretty far now. It’s completely random-access, but takes a while. (~800 cycles)

Disk: pretty much another country. On an SSD, you’re looking at ~5,000 cycles. And on a spindle drive, it’s more like 50,000.

SO ALL DATA TAKES A JOURNEY UP FROM THE HARD DISK TO EVENTUALLY LIVE IN A REGISTER.


WHAT DOES MEMORY ACTUALLY LOOK LIKE?


IT’S JUST A BUNCH OF CELLS WITH SHIT IN ‘EM.


IT’S ALL BINARY DATA.

STRINGS, FLOATS, OBJECTS, THEY’RE ALL STORED AS BINARY.


AND IT’S ALL STORED CONTIGUOUSLY.

THIS IS VERY IMPORTANT WHEN IT COMES TO ARRAYS.

ARRAYS ARE JUST CONTIGUOUS BLOCKS OF MEMORY.


THAT’S WHY THEY’RE SO FAST.

Picture ten cells of memory. The first two hold garbage; our array starts at the third.

Assume each of these cells is 8 bytes (64 bits), and let’s imagine they’re addressed like so:

832968 833032 833096 833160 833224 833288 833352 833416 833480 833544

this.startAddr = 833096;

Each cell is offset by exactly 64 in the address space, meaning you can easily derive the address of any index:

function get(i) {
  return this.startAddr + i * 64;
}

get(3) = 833096 + 3 * 64 = 833096 + 192 = 833288


THIS IS POINTER ARITHMETIC.


THIS IS WHAT MAKES ARRAY LOOKUPS O(1)

AND IT’S WHY ARRAYS ARE BY FAR THE FASTEST DATA STRUCTURE

LET’S WRAP UP BY TALKING ABOUT CACHE EFFICIENCY.


CACHES ARE DUMB.


When the CPU needs data, it first looks in the cache.

Say it’s not in the cache. This is called a cache miss.

The cache then loads the data the CPU requested from RAM…

But the cache guesses that if the CPU wanted this data, it probably will also want other nearby data eventually. It would be stupid to have to make multiple round trips.


In other words, the cache assumes that related data will be stored around the same physical area.

The cache assumes locality of data.


So the cache just loads a huge contiguous chunk of data around the address the CPU asked for.


OK. SO?


Remember this?

Loading from memory is slow as shit.

We really want to minimize cache misses.

SO KEEP YOUR DATA LOCAL AND YOUR DATA STRUCTURES CONTIGUOUS.

ARRAYS ARE KING, BECAUSE ALL OF THE DATA IS LITERALLY RIGHT NEXT TO EACH OTHER IN MEMORY!


An algorithm that jumps around in memory or follows a bunch of pointers to other objects will trigger lots of cache misses!

Think linked lists, trees, even hash maps.

IDEALLY, YOU WANT TO WORK LOCALLY WITHIN ARRAYS OF CONTIGUOUS DATA.
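A quick way to feel the difference (a hypothetical benchmark sketch, not from the deck; exact numbers vary by engine and allocator):

const N = 1000000;

// Contiguous: one cache miss pulls in a whole chunk of neighbors.
const arr = new Float64Array(N).fill(1);
console.time('contiguous array');
let arrSum = 0;
for (let i = 0; i < N; i++) arrSum += arr[i];
console.timeEnd('contiguous array');

// Pointer-chasing: each hop to the next node is a potential cache miss.
let head = null;
for (let i = 0; i < N; i++) head = { value: 1, next: head };
console.time('linked list');
let listSum = 0;
for (let node = head; node !== null; node = node.next) listSum += node.value;
console.timeEnd('linked list');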


LET’S DO A QUICK EXERCISE.


QUESTIONS?


I AM HASEEB QURESHI

You can find me on Twitter: @hosseeb

You can read my blog at: haseebq.com


PLEASE DONATE IF YOU GOT SOMETHING OUT OF THIS

<3

Ranked by GiveWell as the most efficient charity in the world!