TCS-DB-er: Some random notes from MPI

Today is the last day of my stay in Saarbrucken, Germany. During the last two weeks, I have been here visiting Julian Mestre, a former student of my current advisor Samir Khuller.

We have been working on a stochastic optimization problem and already had some exciting progress. Cooperating with Julian is a lot fun. We have a big overlap of our background, which makes the catchup very easy, meanwhile our thinking patterns are quite different and complements to each other well. (As I observed) Julian tends to start off from concrete examples (worst/average/best case examples), making insightful observation and then generalizing until the discover of the solution which works for all case. The way I think usually begins with some very vague framework and then filling in details and testing whether it works for concrete examples. if it didn't work, then try to patch it if possible or just scratch it and start from a new framework or just find a new problem to consider...
I will write some thing more about the interesting problem we are considering at some later point when appropiate.

But I would like to make some notes for some random stuffs I learned during the visit.

1) the power of two choices: consider the following random experiment: throw n balls randomly into n bins (by randomly, i mean for each ball, uniformly and randomly choose a bin to put the ball.) It is a well known fact that the expected maximum load of any bin is O(logn/loglogn). This means this purely random procedure is not a very good scheme to achieve load balancing, albeit it has already been used in many places for this purpose.

However, if instead of completely random throw of each ball, we randomly pick two bins and throw the ball into the bin with smaller load, the expected maximum load becomes O(loglogn). With only one more choice and very limited rationalness (make judgement only between two choices), we can achieve a exponential improvement. Fascinating!

For more general results and references of its numerous applications, see Michael Mitzenmacher's phd thesis.

2)Compressed Sensing: this is now a super hot topic in ECE (also some branches of CS related to signal, compression, etc.) during the recent breaktrhough by Candes, Romberg, Tao and Donoho. But it seems to me not many TCS and DB people (my impression)are high on this topic (I am only aware Cormode and Muthukrishnan have done some work on this). I read through a basic tutorial and browse a few paper abstracts on this and got some rough ideas about the field. Not only the theory borrows ideas from a variety of branchs in applied math and nicely fit them togther, but the practical implication are also remarkable. Many traditional compression schemes will be (or have been) rewritten due to the new developed theory and techniques. My question is whether it can be applied to the approximate query processing and/or query optimization in some way to achieve a better performance than wavelet which has been extensively used for those applications.

3) A question I don't have a good answer: Raghav told me this cute problem when he was reading the book '' the probabilistic method'' by Alon and Spencer. It is a exercise problem on the first chapter.

x and y are two independent identically distributed random variable.
Prove: Pr(|x-y|<=1) >= Pr (1<=x-y<=2) (no absolute value in RHS) I suspect the following (tighter generalization) is also true Pr(0<=x-y<=t) >= Pr(a<=x-y<=a+t) for any a,t>=0

If you try to do it in a brute-force way, it boils down to proving an fairly clean inequality.
But the inequality seems not quite so easy to prove. I have some idea how to prove it, but it is pretty messy and i don't have enough incentive to figure out the details.
Since it is only an exercise problem in the first chapter, I believe the problem should have a very simple but tricky solution. Let me know if you know the solution.

3 comments:

Chen CaoOctober 6, 2009 at 9:30 PM
Pr(x-y<=1)=Pr(y-x<=1) since x,y iid
=Pr(x-y>=-1)>=Pr(1<=x-y<=2)
Jian LiOctober 26, 2009 at 10:42 AM
dude, i was asking a different problem....
Milind "Milz" BandekarJuly 18, 2010 at 1:05 AM
A New Kind of Grammars: A new kind of generative grammars that can produce the empty language is designed in this book.

http://www.amazon.com/New-Kind-Grammars-generative-grammars/dp/1452828687

An unrepresented mongrel of Computer Science, the null or invalid string gets a new representative symbol in this book.

TCS-DB-er

Saturday, August 22, 2009

Some random notes from MPI

3 comments:

About Me

Subscribe Now

Blog Archive