# Lecture #17: Dimension reduction (continued)

**1. An equivalent view of estimating $F_2$**

Again, you have a data stream of elements $a_1, a_2, \ldots, a_m$, each element drawn from the universe $[n] = \{1, 2, \ldots, n\}$. This stream defines a frequency vector $f \in \mathbb{Z}_{\ge 0}^n$, where $f_i$ is the number of times element $i$ is seen. Consider the following algorithm to compute $F_2 = \sum_i f_i^2 = \|f\|_2^2$.

Take a (suitably random) hash function $h: [n] \to \{-1, +1\}$. Maintain a counter $C$, which starts off at zero. Every time an element $i$ comes in, increment the counter $C \leftarrow C + h(i)$. And when queried, we reply with the value $C^2$.
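The algorithm above fits in a few lines of code. In this sketch (class and function names are my own), the "suitably random" hash is simulated by a fully random $\pm 1$ sign per universe element; as discussed at the end of these notes, limited independence actually suffices.

```python
import random

def make_sign_hash(n):
    """Assign an independent uniform +/-1 sign to each element of the universe [n].
    (Fully random signs for illustration; 4-wise independence is enough.)"""
    return [random.choice((-1, 1)) for _ in range(n)]

class SingleCounter:
    """One counter C, incremented by h(i) on each arrival of element i."""

    def __init__(self, n):
        self.h = make_sign_hash(n)
        self.C = 0  # counter starts off at zero

    def update(self, i):
        # element i comes in: increment the counter by its sign h(i)
        self.C += self.h[i]

    def query(self):
        # reply with C^2, an unbiased estimate of F_2 = sum_i f_i^2
        return self.C ** 2
```

For example, after seeing the same element twice, $C = 2h(i)$ regardless of the sign, so the query returns exactly $4$ (which is the true $F_2$ of that stream).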

Hence, having seen the stream that results in the frequency vector $f$, the counter will have the value $C = \sum_{i \in [n]} f_i \, h(i)$. Does $C^2$ at least have the right expectation? It does: assuming the signs $h(i)$ are pairwise independent,

$$\mathbb{E}[C^2] = \mathbb{E}\Big[\sum_{i,j} f_i f_j \, h(i) h(j)\Big] = \sum_i f_i^2 \, \mathbb{E}[h(i)^2] + \sum_{i \neq j} f_i f_j \, \mathbb{E}[h(i)]\,\mathbb{E}[h(j)] = \sum_i f_i^2 = F_2.$$

And what about the variance? Recall that $\mathrm{Var}(C^2) = \mathbb{E}[C^4] - \mathbb{E}[C^2]^2$, so let us calculate (now assuming the signs $h(i)$ are 4-wise independent)

$$\mathbb{E}[C^4] = \sum_{i,j,k,l} f_i f_j f_k f_l \, \mathbb{E}[h(i)h(j)h(k)h(l)] = \sum_i f_i^4 + 3 \sum_{i \neq j} f_i^2 f_j^2,$$

since the only terms that survive are those where the signs pair up: either all four indices are equal, or they form two distinct pairs (which happens in $3$ ways).

So

$$\mathrm{Var}(C^2) = \mathbb{E}[C^4] - \mathbb{E}[C^2]^2 = \Big(\sum_i f_i^4 + 3\sum_{i \neq j} f_i^2 f_j^2\Big) - \Big(\sum_i f_i^4 + \sum_{i \neq j} f_i^2 f_j^2\Big) = 2 \sum_{i \neq j} f_i^2 f_j^2 \le 2 F_2^2.$$

What does Chebyshev say then?

$$\Pr\big[\,|C^2 - F_2| \ge \varepsilon F_2\,\big] \le \frac{\mathrm{Var}(C^2)}{\varepsilon^2 F_2^2} \le \frac{2}{\varepsilon^2}.$$

Not that hot: in fact, for small $\varepsilon$ this bound is more than $1$, so it tells us nothing.

But suppose we take a collection of $k$ such independent counters $C_1, C_2, \ldots, C_k$, and given a query, take the average $A = \frac{1}{k} \sum_{j=1}^k C_j^2$ and return that. The expectation of the average remains the same, but the variance falls by a factor of $k$. And we get

$$\Pr\big[\,|A - F_2| \ge \varepsilon F_2\,\big] \le \frac{\mathrm{Var}(A)}{\varepsilon^2 F_2^2} \le \frac{2}{k \varepsilon^2}.$$

So, our probability of error on any query is at most $\delta$ if we take $k \ge \frac{2}{\varepsilon^2 \delta}$.
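The averaged estimator can be sketched as follows (again a self-contained illustration with my own names, using fully random signs in place of limited independence):

```python
import random

class AveragedF2Estimator:
    """k independent counters; a query returns the mean of their squares."""

    def __init__(self, n, k):
        # one independent +/-1 sign hash per counter
        self.signs = [[random.choice((-1, 1)) for _ in range(n)]
                      for _ in range(k)]
        self.counters = [0] * k

    def update(self, i):
        # element i arrives: every counter C_j is incremented by h_j(i)
        for j, h in enumerate(self.signs):
            self.counters[j] += h[i]

    def query(self):
        # average A = (1/k) * sum_j C_j^2
        k = len(self.counters)
        return sum(C * C for C in self.counters) / k
```

By the bound above, taking $k \ge 2/(\varepsilon^2 \delta)$ counters makes the returned average $\varepsilon$-accurate except with probability $\delta$.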

**1.1. Hey, those calculations look familiar**

Sure. This is just a restatement of what we did in lecture. There we took a $k \times n$ matrix $M$ and filled it with random $\pm 1$ values; hence each row of $M$ corresponds to a hash function from $[n]$ to $\{-1, +1\}$. And taking $k$ rows in the matrix corresponds to the variance reduction step at the end.
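A tiny demo of this correspondence (the dimensions and frequency vector are made up for illustration): each coordinate of $Mf$ is exactly one counter $C_j = \sum_i f_i h_j(i)$, so the averaged estimate is $\frac{1}{k}\|Mf\|_2^2$.

```python
import random

n, k = 6, 4
# a k x n matrix of random +/-1 entries; row j plays the role of hash h_j
M = [[random.choice((-1, 1)) for _ in range(n)] for _ in range(k)]

f = [3, 0, 1, 0, 2, 0]  # an example frequency vector

# coordinate j of M f equals the counter C_j = sum_i f_i * h_j(i)
Mf = [sum(M[j][i] * f[i] for i in range(n)) for j in range(k)]

# (1/k) * ||M f||^2 is the averaged estimate of F_2 = ||f||^2
estimate = sum(c * c for c in Mf) / k
```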

**1.2. Limited Independence**

How much randomness do you need for the hash functions? Indeed, hash functions which are $4$-wise independent suffice for the above proofs to go through (the expectation calculation only used pairwise independence; the variance calculation used $4$-wise). And how does one get a $4$-wise independent hash function? Watch this blog (and the HWs) for details.
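Since the notes defer the construction, here is a sketch of one standard approach (my own illustration, not necessarily the one intended): evaluate a random degree-$3$ polynomial over $\mathbb{Z}_p$ for a prime $p > n$, which gives a $4$-wise independent family, and fold the output to a sign.

```python
import random

P = 2_147_483_647  # the Mersenne prime 2^31 - 1; any prime larger than n works

def make_four_wise_sign_hash():
    """A random degree-3 polynomial over Z_p gives a 4-wise independent family;
    its parity is then used as the +/-1 sign."""
    a, b, c, d = (random.randrange(P) for _ in range(4))

    def h(x):
        # evaluate a + b*x + c*x^2 + d*x^3 mod p via Horner's rule
        v = (((d * x + c) * x + b) * x + a) % P
        # fold the field element to a sign; since p is odd the parity is only
        # approximately unbiased, which is fine for this illustration
        return 1 if v % 2 == 0 else -1

    return h
```

Only the four coefficients need to be stored, so the hash takes $O(\log p)$ bits of randomness rather than one random sign per universe element.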