8
Sublinear Algorihms for Big Data Grigory Yaroslavtsev http://grigory.us Exam Problems

Sublinear Algorihms for Big Data

Embed Size (px)

DESCRIPTION

Sublinear Algorihms for Big Data. Exam Problems. Grigory Yaroslavtsev http://grigory.us. Problem 1: Modified Chernoff Bound. Problem: Derive Chernoff Bound #2 from Chernoff Bound #1. Explain your answer in detail. - PowerPoint PPT Presentation

Citation preview

Page 1: Sublinear Algorihms  for Big Data

Sublinear Algorihms for Big Data

Grigory Yaroslavtsevhttp://grigory.us

Exam Problems

Page 2: Sublinear Algorihms  for Big Data

Problem 1: Modified Chernoff Bound

• Problem: Derive Chernoff Bound #2 from Chernoff Bound #1. Explain your answer in detail.

• (Chernoff Bound #1) Let be independent and identically distributed r.vs with range [0, 1] and expectation . Then if and

• (Chernoff Bound #2) Let be independent and identically distributed r.vs with range and expectation . Then if and

Page 3: Sublinear Algorihms  for Big Data

Problem 2: Modified Chebyshev

• Problem: Derive Chebyshev Inequality #2 from Chebyshev Inequality #1. Explain your answer in detail.

• (Chebyshev Inequality #1) For every :

• (Chebyshev Inequality #2) For every

Page 4: Sublinear Algorihms  for Big Data

Problem 3: Sparse Recovery Error

• Definition (frequency vector): = frequency of in the stream = # of occurrences of value

• Sparse Recovery: Find such that is minimized among s with at most non-zero entries.

• Definition: • Problem: Show that where are indices of

largest . Explain your answer formally and in detail.

Page 5: Sublinear Algorihms  for Big Data

Problem 4: Approximate Median Value

• Stream: elements from universe • (all distinct) and let

• Median , where + 1 ( odd).• Algorithm: Return the median of a sample of size

taken from (with replacement).• Problem: Does this algorithm give a approximate

value of the median with probability , i.e. such that

if ? Explain your answer.

Page 6: Sublinear Algorihms  for Big Data

Problem 5: Lower bound on

• Definition (frequency vector): = frequency of in the stream = # of occurrences of value

• Define (frequency moment): • Problem: Show that for all integer it holds

that:

• Hint: worst-case when . Use convexity of

Page 7: Sublinear Algorihms  for Big Data

Problem 6: Approximate MST Weight

• Let be a weighted graph with weights of all edges being integers between 1 and W.

• Definition: Let be the # of connected components if we remove all edges with weight .

• Problem: For some constant show the following bounds on the weight of the minimum spanning tree

Page 8: Sublinear Algorihms  for Big Data

Evaluation• Deadline: September 1st , 2014, 23:59 GMT• Submissions in English: [email protected]

– Submission Title (once): Exam + Space + “Your Name”– Question Title: Question + Space + “Your Name”– Submission format

• PDF from LaTeX (best)• PDF

• You can work in groups (up to 3 people)– Each group member lists all others on the submission

• Point system– Course Credit = 2 points– Problems 1-3 = 0.5 point each– Problem 4 = 1 point– Problems 5-6 = 2 point each