University of California, Berkeley - CS 70 hw11-solution. CS 70 Discrete Mathematics and Probability Theory, Fall 2019. All Solutions Worked.

Document Content and Description Below

CS 70 Discrete Mathematics and Probability Theory
Fall 2019, Alistair Sinclair and Yun S. Song, HW 11

Note: This homework consists of two parts. The first part (questions 1-4) will be graded and will determine your score for this homework. The second part (questions 5-6) will be graded if you submit them, but will not affect your homework score in any way. You are strongly advised to attempt all the questions in the first part. You should attempt the problems in the second part only if you are interested and have time to spare.

For each problem, justify all your answers unless otherwise specified.

Part 1: Required Problems

1 Probabilistic Bounds

A random variable X has variance Var(X) = 9 and expectation E[X] = 2. Furthermore, the value of X is never greater than 10. Given this information, provide either a proof or a counterexample for each of the following statements.

(a) E[X²] = 13.

(b) P[X = 2] > 0.

(c) P[X ≥ 2] = P[X ≤ 2].

(d) P[X ≤ 1] ≤ 8/9.

(e) P[X ≥ 6] ≤ 9/16.

Solution:

2 Just One Tail, Please

Let X be some random variable with finite mean and variance which is not necessarily non-negative. The extended version of Markov's inequality states that for a non-negative, monotonically increasing function f(X) and a > 0,

P(X ≥ a) ≤ E[f(X)] / f(a).

Suppose E[X] = 0, Var(X) = σ² < ∞, and a > 0.

(a) Use the extended version of Markov's inequality stated above with f(x) = (x + c)², where c is some constant, to show that

P(X ≥ a) ≤ (σ² + c²) / (a + c)².

(b) Note that the above bound applies for all positive c, so we can choose a value of c to minimize the expression, yielding the best possible bound. Find the value of c which minimizes the right-hand side (you may assume that the expression has a unique minimum). Plug in the minimizing value of c to prove the following bound:

P(X ≥ a) ≤ σ² / (a² + σ²).

(c) Recall that Chebyshev's inequality provides a two-sided bound. That is, it provides a bound on P(|X − E[X]| ≥ a) = P(X ≥ E[X] + a) + P(X ≤ E[X] − a). If we only wanted to bound the probability of one of the tails, e.g. P(X ≥ E[X] + a), it is tempting to just divide the bound we get from Chebyshev's inequality by two. Why is this not correct in general? Provide an example of a random variable X (it does not have to be zero-mean) and a constant a such that using this method (dividing by two to bound one tail) is not correct; that is, P(X ≥ E[X] + a) ≥ Var(X)/(2a²) or P(X ≤ E[X] − a) ≥ Var(X)/(2a²).

(d) What is a sufficient condition to place on a random variable X such that dividing by two to bound one tail would be correct? Now we see the use of the bound proven in part (b): it allows us to bound just one tail while still taking variance into account, and does not require us to assume any property of the random variable. Note that the bound is also always guaranteed to be less than 1 (and therefore at least somewhat useful), unlike Markov's and Chebyshev's inequalities!

(e) Let's try out our new bound on a simple example. Suppose X is a positively-valued random variable with E[X] = 3 and Var(X) = 2. What bound would Markov's inequality give for P[X ≥ 5]? What bound would Chebyshev's inequality give for P[X ≥ 5]? What about the bound we proved in part (b)? (Note: Recall that the bound from part (b) only applies to zero-mean random variables.)

Solution:
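As a quick numeric illustration of part (e) above (a sketch, not part of the document's worked solutions), the snippet below computes all three bounds for E[X] = 3, Var(X) = 2, and the tail P[X ≥ 5]; the two-point distribution mentioned in the comments is an arbitrary example chosen only to satisfy the given mean and variance.

```python
# A minimal numeric sketch (not part of the document's solutions) comparing
# the three tail bounds from Problem 2(e): E[X] = 3, Var(X) = 2, P[X >= 5].
mean, var, t = 3.0, 2.0, 5.0
a = t - mean  # deviation above the mean, so the part (b) bound applies to the zero-mean X - E[X]

markov    = mean / t            # Markov:    E[X]/t                = 0.600
chebyshev = var / a**2          # Chebyshev: Var(X)/a^2            = 0.500
cantelli  = var / (a**2 + var)  # part (b):  Var(X)/(a^2 + Var(X)) ~ 0.333

print(markov, chebyshev, cantelli)

# Sanity check: the two-point variable X in {3 - sqrt(2), 3 + sqrt(2)},
# each value with probability 1/2, is positive with E[X] = 3 and
# Var(X) = 2, and has P[X >= 5] = 0 -- consistent with all three bounds.
```

Note that the part (b) bound is the tightest of the three here: unlike Markov's inequality it uses the variance, and unlike Chebyshev's it does not pay for the lower tail.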
3 Optimal Gambling

Jonathan has a coin that may be biased, but he doesn't think so. You disagree with him, though, and he challenges you to a bet. You start off with X0 dollars. You and Jonathan then play multiple rounds; in each round, you bet an amount of money of your choosing, and then the coin is tossed. Jonathan will match your bet no matter what. If the coin comes up heads, you win and take both your bet and Jonathan's; if it comes up tails, you lose your bet.

(a) Now suppose you actually secretly know that the bias of the coin is p, where 1/2 < p < 1! You use the following strategy: on each round, you bet a fraction q of the money you have at the start of the round. Say you play n rounds. What is the probability that you win exactly k of the rounds? What is the amount of money you would have if you win exactly k rounds? [Hint: Does the order in which you win the games affect your profit?]

(b) Let Xn denote the amount of money you have on round n. X0 represents your initial assets and is a constant value. Show that E[Xn] = ((1 − p)(1 − q) + p(1 + q))^n X0. You may use the binomial theorem in your answer:

(a + b)^n = ∑_{k=0}^{n} (n choose k) a^k b^(n−k).

[Hint: Try computing a sum over the number of rounds you win out of the n rounds you play; use your answers from the previous part.]

(c) What value of q will maximize E[Xn]? (Use calculus!) For this value of q, what is the distribution of Xn? Can you predict what will happen as n → ∞? [Hint: Under this betting strategy, what happens if you ever lose a round?]

(d) The problem with the previous approach is that we were too concerned with expected value, so our gambling strategy was too extreme. Let's start over: again we will use a gambling strategy in which we bet a fraction q of our money at each round. Express Xn in terms of n, q, X0, and Wn, where Wn is the number of rounds you have won up until round n. [Hint: Does the order in which you win the games affect your profit?]

(e) By the law of large numbers, what does Wn/n converge to as n → ∞? Using this fact, what does (ln Xn)/n converge to as n → ∞?

(f) The rationale behind (ln Xn)/n is that if (ln Xn)/n → c, where c is a constant, then for large n, Xn is roughly e^(cn). Therefore, c is the asymptotic growth rate of your fortune! Find the value of q that maximizes your asymptotic growth rate.

(g) Using the value of q you found in the previous part, compute E[Xn].

(h) Say Jonathan wishes to estimate the bias of the coin; he may want to use the value Wn/n as his estimate. What is the expectation of this value? What is a bound on the variance of this value? (Your bound should not include p.)

(i) Let's say that after playing 100 rounds, Jonathan observes that 74 heads have appeared. Help Jonathan construct a 75% confidence interval using Chebyshev's inequality on the estimator from the previous part.

Solution:
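The following small simulation (not part of the document's solutions) illustrates parts (d) through (f): it bets a fixed fraction q each round and compares the empirical growth rate (ln Xn)/n against the law-of-large-numbers prediction p ln(1 + q) + (1 − p) ln(1 − q). The candidate optimum q = 2p − 1 used below is the standard Kelly fraction for an even-money bet, which part (f) asks you to derive; the win probability p = 0.6 is an arbitrary choice for the demo.

```python
# A simulation sketch (not from the document's solutions) of fractional
# betting: each round the fortune is multiplied by (1+q) on a win and
# (1-q) on a loss, so (ln Xn)/n is an average of i.i.d. log-increments.
import math
import random

def log_growth(p, q, n=100_000, x0=1.0, seed=0):
    """Simulate n rounds with win probability p, betting fraction q;
    return (ln Xn - ln X0)/n, the empirical growth rate."""
    rng = random.Random(seed)
    log_x = math.log(x0)
    for _ in range(n):
        if rng.random() < p:
            log_x += math.log(1 + q)  # win: fortune grows by factor (1+q)
        else:
            log_x += math.log(1 - q)  # loss: fortune shrinks by factor (1-q)
    return (log_x - math.log(x0)) / n

p = 0.6
kelly = 2 * p - 1  # Kelly fraction for even-money bets (maximizes the log-growth rate)
for q in (0.1, kelly, 0.5, 0.9):
    predicted = p * math.log(1 + q) + (1 - p) * math.log(1 - q)
    print(f"q = {q:.2f}: simulated {log_growth(p, q):+.4f}, LLN prediction {predicted:+.4f}")
```

Running this shows positive growth for small q, the largest growth at q = 2p − 1, and strongly negative growth for aggressive fractions, which is exactly the tension between parts (c) and (f).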
4 Random Cuckoo Hashing

Cuckoo birds are parasitic beasts. They are known for hijacking the nests of other bird species and evicting the eggs already inside. Cuckoo hashing is inspired by this behavior. In cuckoo hashing, when we get a collision, the element that was already there gets evicted and rehashed. We study a simple (but ineffective, as we'll see) version of cuckoo hashing, where all hashes are random. Say we want to hash n pieces of data d1, d2, ..., dn into n possible hash buckets labeled 1, ..., n. We hash d1, ..., dn in that order. When hashing di, we assign it a bucket chosen uniformly at random from 1, ..., n. If there is no collision, we place di into that bucket. If there is a collision with some other dj, we evict dj and assign it another bucket chosen uniformly at random from 1, ..., n. (It is possible that dj gets assigned back to the bucket it was just evicted from!) We again perform the eviction step if we get another collision. We keep doing this until there are no more collisions, and we then introduce the next piece of data, di+1, to the hash table.

(a) What is the probability that there are no collisions over the entire process of hashing d1, ..., dn to buckets 1, ..., n? What value does this probability tend towards as n grows very large?

(b) Assume we have already hashed d1, ..., dn−1, and they each occupy their own bucket. We now introduce dn into our hash table. What is the expected number of collisions that we'll see while hashing dn? (Hint: What happens when we hash dn and get a collision, so we evict some other di and have to rehash it? Are we at a situation that we've seen before?)

(c) Generalize the previous part: Assume we have already hashed d1, ..., dk−1 successfully, where 1 ≤ k ≤ n. Let Ck be the number of collisions that we'll see while hashing dk. What is E[Ck]?

(d) Let C be the total number of collisions over the entire process of hashing d1, ..., dn. What is E[C]? You may leave your answer as a summation.

Solution:
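A brief simulation sketch (not from the document's solutions) for parts (b) and (c): while the k-th item is being inserted, every rehash is uniform over all n buckets and exactly k − 1 buckets remain occupied, so the collision count behaves like the number of failures before a success with success probability (n − k + 1)/n, the geometric view the hint is pointing at. The snippet estimates E[Ck] empirically and compares it with the resulting prediction (k − 1)/(n − k + 1).

```python
# An illustrative simulation (not part of the document's solutions) of the
# random-rehash cuckoo scheme: estimate E[Ck], the number of collisions
# while inserting the k-th item into n buckets of which k-1 are occupied.
import random

def collisions_inserting_kth(n, k, rng):
    """Each (re)hash is uniform over all n buckets; the count of occupied
    buckets stays k-1 until the in-hand item lands in a free bucket, so
    only the size of the occupied set matters, not its identity."""
    occupied = set(range(k - 1))
    count = 0
    while rng.randrange(n) in occupied:
        count += 1  # collision: evict and rehash; occupied count is unchanged
    return count

rng = random.Random(0)
n, trials = 20, 50_000
for k in (2, 10, 20):
    est = sum(collisions_inserting_kth(n, k, rng) for _ in range(trials)) / trials
    print(f"k={k:2d}: simulated E[Ck] ~ {est:.3f}, geometric prediction {(k - 1) / (n - k + 1):.3f}")
```

The k = n case makes the scheme's ineffectiveness vivid: with only one free bucket left, the expected number of evictions for the last insertion is n − 1.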
Note: This concludes the first part of the homework. The problems below are optional, will not affect your score, and should be attempted only if you have time to spare.

Part 2: Optional Problems

5 Subset Card Game

Jonathan and Yiming are playing a card game. Jonathan has k > 2 cards, and each card has a real number written on it. Jonathan tells Yiming (truthfully) that the sum of the card values is 0, and that the sum of squares of the values on the cards is 1. Specifically, if the card values are c1, c2, ..., ck, then ∑_{i=1}^{k} ci = 0 and ∑_{i=1}^{k} ci² = 1. Jonathan and Yiming also agree on a positive target value a. The cards are then dealt randomly in the following fashion: for each card in the deck, a fair coin is flipped. If the coin lands heads, the card goes to Yiming; if the coin lands tails, the card goes to Jonathan. Note that it is possible for either player to end up with no cards or with all the cards. A player wins the game if the sum of the card values in their hand is at least a; otherwise it is a tie.

(a) Prove that the probability that Yiming wins is at most 1/(8a²).

(b) Find a deck of k cards and a target value a for which the probability that Yiming wins is exactly 1/(8a²).

Solution:

6 Easy A's

A friend tells you about a course called "Laziness in Modern Society" that requires almost no work. You hope to take this course next semester to give yourself a well-deserved break after working hard in CS 70. At the first lecture, the professor announces that grades will depend only on two homework assignments. Homework 1 will consist of three questions, each worth 10 points, and Homework 2 will consist of four questions, also each worth 10 points. He will give an A to any student who gets at least 60 of the possible 70 points. However, speaking with the professor in office hours you hear some very disturbing news. He tells you that, in the spirit of the class, the GSIs are very lazy, and to save time the grading will be done as follows. For each student's Homework 1, the GSIs will choose an integer randomly from a distribution with mean μ = 5 and variance σ² = 1. They'll mark each of the three questions with that score. To grade Homework 2, they'll again choose a random number from the same distribution, independently of the first number, and will mark all four questions with that score. If you take the class, what will the mean and variance of your total class score be? Use Chebyshev's inequality to conclude that you have less than a 5% chance of getting an A when the grades are randomly chosen this way.

Solution:
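For Problem 6, here is one way to organize the computation the problem asks for (a sketch, not the document's hidden solution): the total score is T = 3·G1 + 4·G2, where G1 and G2 are the two independent grading draws, each with mean 5 and variance 1.

```python
# A short worked check (not the document's solution) for Problem 6.
mu, var = 5, 1

mean_total = 3 * mu + 4 * mu             # E[3*G1 + 4*G2] = 35
var_total = 3**2 * var + 4**2 * var      # scaling squares the constants; independence adds: 25

# An A needs at least 60 of the 70 points, i.e. a deviation of 60 - 35 = 25.
deviation = 60 - mean_total
chebyshev_bound = var_total / deviation**2  # P(|T - 35| >= 25) <= 25/625 = 0.04

print(mean_total, var_total, chebyshev_bound)  # 35 25 0.04, i.e. under 5%
```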
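And for Problem 5, a Monte Carlo sanity check (again not from the document's solutions): the deck below is an arbitrary choice satisfying the two constraints, and a = 0.5 is an arbitrary target value; the empirical probability that Yiming's hand sums to at least a should sit below the claimed bound 1/(8a²).

```python
# A Monte Carlo sanity check (not from the document's solutions) of the
# Problem 5(a) bound P(Yiming wins) <= 1/(8 a^2), on one arbitrary deck
# with sum 0 that is rescaled so its squares sum to 1.
import math
import random

deck = [0.6, -0.6, 0.2, -0.2, 0.4, -0.4]
norm = math.sqrt(sum(c * c for c in deck))
deck = [c / norm for c in deck]
assert abs(sum(deck)) < 1e-12 and abs(sum(c * c for c in deck) - 1) < 1e-12

a = 0.5  # agreed positive target value (arbitrary for this demo)
rng = random.Random(1)
trials = 200_000
wins = 0
for _ in range(trials):
    # each card goes to Yiming on an independent fair coin flip
    s = sum(c for c in deck if rng.random() < 0.5)
    wins += (s >= a)

print(f"empirical P(Yiming wins) = {wins / trials:.4f}, bound 1/(8a^2) = {1 / (8 * a * a):.4f}")
```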
