CS Fundamentals: Scalability and Memory

SCALABILITY AND MEMORY: CS FUNDAMENTALS SERIES

http://bit.ly/1TPJCe6

HOW DO YOU MEASURE AN ALGORITHM?

???

CLOCK TIME?

DEPENDS ON WHO’S COUNTING. ALSO, TOO FLAKY EVEN ON THE SAME MACHINE.

THE NUMBER OF LINES?

THIS IS TWO LINES, BUT A WHOLE LOT OF STUPID.

THE NUMBER OF CPU CYCLES?

DEPENDS ON THE RUNTIME.

ALL THESE METHODS SUCK.

NONE OF THEM CAPTURE WHAT WE ACTUALLY CARE ABOUT.

ENTER BIG O!

ASYMPTOTIC ANALYSIS

▸ Big O is about asymptotic analysis

▸ In other words, it’s about how an algorithm scales when the numbers get huge

▸ You can also describe this as “the rate of growth”

▸ How fast do the numbers become unmanageable?

ASYMPTOTIC ANALYSIS

▸ Another way to think about this is:

▸ What happens when your input size is 10,000,000? Will your program be able to finish?

▸ It’s about scalability, not necessarily speed

PRINCIPLES OF BIG O

▸ Big O is a kind of mathematical notation

▸ In computer science, it essentially means “the asymptotic rate of growth”

▸ In other words, how does the running time of this function scale with the input size when the numbers get big?

▸ Big O notation looks like this:

O(n)   O(n log n)   O(n²)

PRINCIPLES OF BIG O

▸ n here refers to the input size

▸ Can be the size of an array, the length of a string, the number of bits in a number, etc.

▸ O(n) means the algorithm scales linearly with the input

▸ Think like a line (y = x)

PRINCIPLES OF BIG O

▸ “Scaling linearly” can mean 1:1 (one iteration per extra input), but it doesn’t necessarily

▸ It can simply mean k:1 where k is a constant, like 3:1 or 5:1 (i.e., a constant amount of time per extra input)
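A quick sketch of this (the function and variable names here are my own, not from the deck): this loop does roughly three operations per element, so the total work is about 3n, which is still O(n).

```javascript
// ~3 operations per element => ~3n steps total => still O(n).
function stats(arr) {
  let sum = 0;
  let count = 0;
  let max = -Infinity;
  for (const x of arr) {
    sum += x;             // op 1
    count += 1;           // op 2
    if (x > max) max = x; // op 3
  }
  return { sum, count, max };
}
```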

PRINCIPLES OF BIG O

▸ In Big O, we strip out any coefficients or smaller factors.

▸ The fastest-growing factor wins. This is also known as the dominant factor.

▸ Just think: when the numbers get huge, what dwarfs everything else?

▸ O(5n) => O(n)

▸ O(½n - 10) also => O(n)

PRINCIPLES OF BIG O

▸ O(k) where k is any constant reduces to O(1).

▸ O(200) = O(1)

▸ Where there are multiple factors of growth, the most dominant one wins.

▸ O(n⁴ + n² + 40n) = O(n⁴)
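To see the dominant factor in code (a hypothetical example, names mine): the nested loop below does n² work and the single loop does n work, so the total n² + n reduces to O(n²).

```javascript
// O(n^2) nested loop + O(n) loop => n^2 + n total => O(n^2).
function pairCountPlusSum(arr) {
  let pairs = 0;
  for (let i = 0; i < arr.length; i++) {     // O(n^2): visits every pair (i, j)
    for (let j = i + 1; j < arr.length; j++) {
      pairs += 1;
    }
  }
  let sum = 0;
  for (const x of arr) sum += x;             // O(n): dwarfed by the loop above
  return { pairs, sum };
}
```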

PRINCIPLES OF BIG O

▸ If there are two inputs (say you’re trying to find all the common substrings of two strings), then you use two variables in your Big O notation => O(n * m)

▸ Doesn’t matter if one variable probably dwarfs the other. You always include both.

▸ O(n + m) => this is considered linear

▸ O(2ⁿ + log(m)) => this is considered exponential
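A minimal two-input sketch (a simpler stand-in for the common-substrings example, with names of my own): with n = a.length and m = b.length, the nested loop runs n * m times, so it’s O(n * m).

```javascript
// O(n * m): compare every character of `a` against every character of `b`.
function countMatchingPairs(a, b) {
  let matches = 0;
  for (const c1 of a) {     // n iterations
    for (const c2 of b) {   // m iterations for each
      if (c1 === c2) matches += 1;
    }
  }
  return matches;
}
```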

COMPREHENSION TEST

Convert each of these to their appropriate Big O form!

▸ O(3n + 5)

▸ O(n + ⅕n²)

▸ O(log(n) + 5000)

▸ O(2m³ + 50 + ½n)

▸ O(n log(m) + 2m² + nm)

▸ What should n be for this function?

Let’s break it down.

Make an empty array. For each character in the string, unshift it into the array, and then join the array together.

▸ Initialize an empty array => O(1)

▸ Then, split the string into an array of characters => O(n)

▸ Then for each character => O(n)

▸ Unshift into an array => O(n) (we’ll see later why this is)

▸ Then join the characters into a string => O(n)

The loop and the unshift multiply. => O(n²)

▸ O(n² + 2n) = O(n²)

▸ This algorithm is quadratic.

▸ Let’s see how badly it sucks.
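The reversal described above might look like this in JavaScript (a sketch; the deck’s actual showSlowReverse.js isn’t shown here):

```javascript
// Quadratic string reversal: unshift is O(n) because every existing
// element must slide over, and we unshift once per character.
function slowReverse(str) {
  const result = [];                 // O(1)
  for (const ch of str.split('')) { // O(n) iterations
    result.unshift(ch);             // O(n) each time => O(n^2) overall
  }
  return result.join('');           // O(n)
}
```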

Benchmark away!

(showSlowReverse.js)

TIME COMPLEXITIES WAY TOO FAST

▸ Constant O(1): math, pop, push, arr[i], property access, conditionals, initializing a variable

▸ Logarithmic O(log n): binary search

▸ Linear O(n): linear search, iteration

▸ Linearithmic O(n log n): sorting (merge sort, quick sort)

▸ Quadratic O(n²): nested looping, bubble sort

▸ Cubic O(n³): triply nested looping, matrix multiplication

▸ Polynomial O(nᵏ): all “efficient” algorithms

▸ Exponential O(2ⁿ): subsets, solving chess

▸ Factorial O(n!): permutations
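As one example from the table: binary search is O(log n) because each comparison throws away half of the remaining search space (a sketch; assumes a sorted array):

```javascript
// O(log n): halve the search range [lo, hi] until the target is found.
function binarySearch(arr, target) {
  let lo = 0;
  let hi = arr.length - 1;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (arr[mid] === target) return mid;
    if (arr[mid] < target) lo = mid + 1; // discard the left half
    else hi = mid - 1;                   // discard the right half
  }
  return -1; // not found
}
```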

TIME TO IDENTIFY TIME COMPLEXITIES

OPTIMIZATIONS DON’T ALWAYS MATTER

BOTTLENECKS

▸ A bottleneck is the part of your code where your algorithm spends most of its time.

▸ Asymptotically, it’s wherever the dominant factor is.

▸ If your algorithm has an O(n) part and an O(50) part, the bottleneck is the O(n) part.

▸ As n => ∞, your algorithm will eventually spend 99%+ of its time in the bottleneck.

BOTTLENECKS

▸ When trying to optimize or speed up an algorithm, focus on the bottleneck.

▸ Optimizing code outside the bottleneck will have a minuscule effect.

▸ Bottleneck optimizations, on the other hand, can easily be huge!

BOTTLENECKS

▸ If you cut down non-bottleneck code, you might be able to save .01% of your runtime.

▸ If you cut down on bottleneck code, you might be able to save 30% of your runtime.

▸ Better yet, try to lower the time complexity altogether if you can!
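A hedged illustration (function names are mine): in the first version the nested loop is the bottleneck, so rewriting that one part with a Set lowers the whole algorithm from O(n²) to O(n). Micro-optimizing anything else would barely register.

```javascript
// Bottleneck version: the nested loop is O(n^2); everything else is O(1).
function hasDuplicateSlow(arr) {
  for (let i = 0; i < arr.length; i++) {
    for (let j = i + 1; j < arr.length; j++) {
      if (arr[i] === arr[j]) return true;
    }
  }
  return false;
}

// Lowering the bottleneck's complexity: one O(n) pass using a Set.
function hasDuplicateFast(arr) {
  const seen = new Set();
  for (const x of arr) {
    if (seen.has(x)) return true;
    seen.add(x);
  }
  return false;
}
```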

BOTTLENECK EXERCISE

SPACE COMPLEXITY

▸ Same thing, except now with memory instead of time.

▸ Do you take linear extra space relative to the input?

▸ Do you allocate new arrays? Do you have to make a copy of the original input? Are you creating nested data structures?
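Two small sketches (hypothetical functions, not from the deck) contrasting space usage:

```javascript
// O(1) extra space: a single accumulator, no matter how big arr is.
function maxOf(arr) {
  let best = -Infinity;
  for (const x of arr) {
    if (x > best) best = x;
  }
  return best;
}

// O(n) extra space: allocates a brand-new array as large as the input.
function doubled(arr) {
  return arr.map(x => x * 2);
}
```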

COMPREHENSION CHECK

▸ What is the space complexity of:

▸ max(arr)

▸ firstFive(arr)

▸ substrings(str)

▸ hasVowel(str)

SO WHAT THE HELL IS MEMORY ANYWAY

TO UNDERSTAND MEMORY, WE NEED TO UNDERSTAND HOW A COMPUTER IS STRUCTURED.

Data Layers

▸ Registers: immediate workspace. A CPU usually has 16 of these. 1 cycle.

▸ L1 cache: a nearby reservoir of useful data we’ve recently read. Close by. ~4 cycles.

▸ L2 cache: more nearby data, but a little farther away. ~10 cycles.

▸ RAM: ~800 cycles. Getting pretty far now. It’s completely random-access, but takes a while.

▸ Disk: on an SSD, you’re looking at ~5,000 cycles. This is pretty much another country. And on a spindle drive, it’s more like 50,000.

SO ALL DATA TAKES A JOURNEY UP FROM THE HARD DISK TO EVENTUALLY LIVE IN A REGISTER.

WHAT DOES MEMORY ACTUALLY LOOK LIKE?

IT’S JUST A BUNCH OF CELLS WITH SHIT IN ‘EM.

IT’S ALL BINARY DATA.

STRINGS, FLOATS, OBJECTS, THEY’RE ALL STORED AS BINARY.

AND IT’S ALL STORED CONTIGUOUSLY.

THIS IS VERY IMPORTANT WHEN IT COMES TO ARRAYS.

ARRAYS ARE JUST CONTIGUOUS BLOCKS OF MEMORY.

THAT’S WHY THEY’RE SO FAST.

Assume each of these cells is 8 bytes (64 bits).

Let’s imagine they’re addressed like so:

832968 833032 833096 833160 833224 833288 833352 833416 833480 833544

Each cell is offset by exactly 64 in the address space, meaning you can easily derive the address of any index.

this.startAddr = 833096;

function get(i) {
  return this.startAddr + i * 64;
}

get(3) = 833096 + 3 * 64 = 833096 + 192 = 833288

THIS IS POINTER ARITHMETIC.

THIS IS WHAT MAKES ARRAY LOOKUPS O(1)

AND IT’S WHY ARRAYS ARE BY FAR THE FASTEST DATA STRUCTURE.

LET’S WRAP UP BY TALKING ABOUT CACHE EFFICIENCY.

CACHES ARE DUMB.

When the CPU needs data, it first looks in the cache.

Say it’s not in the cache. This is called a cache miss.

The cache then loads the data the CPU requested from RAM…

But the cache guesses that if the CPU wanted this data, it will probably also want other nearby data eventually. It would be stupid to have to make multiple round trips.

In other words, the cache assumes that related data will be stored around the same physical area. The cache assumes locality of data.

So the cache just loads a huge contiguous chunk of data around the address the CPU asked for.

OK. SO?

Remember this? Loading from memory is slow as shit.

We really want to minimize cache misses.

SO KEEP YOUR DATA LOCAL AND YOUR DATA STRUCTURES CONTIGUOUS.

ARRAYS ARE KING, BECAUSE ALL OF THE DATA IS LITERALLY RIGHT NEXT TO EACH OTHER IN MEMORY!

An algorithm that jumps around in memory or follows a bunch of pointers to other objects will trigger lots of cache misses!

Think linked lists, trees, even hash maps.

IDEALLY, YOU WANT TO WORK LOCALLY WITHIN ARRAYS OF CONTIGUOUS DATA.
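A classic way to see this (a sketch, with a flat array standing in for a 2D grid; names are mine): both functions below do identical arithmetic and return the same sum, but the row-order walk touches consecutive addresses while the column-order walk strides through memory by `size` each step, triggering far more cache misses on large grids.

```javascript
// Row-order: indices increase by 1 => contiguous, cache-friendly.
function sumRowOrder(grid, size) {
  let sum = 0;
  for (let row = 0; row < size; row++) {
    for (let col = 0; col < size; col++) {
      sum += grid[row * size + col]; // consecutive addresses
    }
  }
  return sum;
}

// Column-order: indices jump by `size` each step => strided, cache-hostile.
function sumColOrder(grid, size) {
  let sum = 0;
  for (let col = 0; col < size; col++) {
    for (let row = 0; row < size; row++) {
      sum += grid[row * size + col]; // strided addresses
    }
  }
  return sum;
}
```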

LET’S DO A QUICK EXERCISE.

QUESTIONS?

I AM HASEEB QURESHI

You can find me on Twitter: @hosseeb

You can read my blog at: haseebq.com

PLEASE DONATE IF YOU GOT SOMETHING OUT OF THIS

<3

Ranked by GiveWell as the most efficient charity in the world!