Wednesday, December 2, 2009

the power of operators

I want to discuss a little bit about a very useful tool in analysis, the operators.
The theory of operators is an entire area of which I only know some basics.
However, even basic knowledge about it can simplify things quite a bit and sometimes makes impossible things possible. I will just discuss one nice cute application.

Suppose we want to compute

\int xe^x dx

This is very simple: we just do integration by part,

\int xe^x dx = \int x de^x = xe^x -\int e^x dx = xe^x -e^x +C

However, what if we want to compute

\int P(x)e^x dx
where P(x) is a polynomial of degree n?
After a moment reflection, you might want to say that you can repeat integration by part n times and each time the deg of the polynomial inside the integral is reduced by one. Yes, but if you want to do this by hand, this will cost you a big piece of paper and xxx minutes before you find out how the terms evolve in each iteration and conclude the result by induction.

However, with the help of operators, it is just a piece of cake:
Let us us D be the differential operator, i.e. Df = f'.
Then 1/D is the integral operator.
It is a known fact that for any integer k,

D^k (e^{ax}y)=e^{ax}(D+a)^k y
Therefore,

\int P(x)e^{ax} dx = \frac{1}{D} e^{ax} P(x)= e^{ax} \frac{1}{D+a} P(x)
= \frac{e^{ax}}{a} \frac{1}{1+D/a} P(x) = \frac{e^{ax}}{a} (1-D/a +D^2/a^2 - D^3/a^3 \ldots)P(x)
= \frac{e^{ax}}{a} (P(x)-P'(x)/a +P''/a^2 - P^{(3)}/a^3 \ldots)
Notice that we only need the terms up to D^n ( since taking n+1 times differentiation of a poly of degree n
will result in 0).
Now, we can see every term is very simple to compute.

Actually, this integral appears quite often in various contexts, for eg. Laplace transform, Fourier analysis, moment generating function, where we often need to convolute functions with exponentials..




Write latex in blogger

It is very simple. Please check this out for the details.

http://www.botcyb.org/2008/10/rendering-latex-in-blogger.html

Here is how it looks..
\int_{0}^{1}\frac{x^{4}\left(1-x\right)^{4}}{1+x^{2}}dx =\frac{22}{7}-\pi


Your browser should allow javascript in order to show the rendered formula correctly.
Unfortunately, it doesn't seem to work in google reader....:(

Wednesday, October 14, 2009

A funny bio

Mani Soma, A prof at UW, IEEE fellow

http://faculty.washington.edu/manisoma/

Read his Biosketch...:)

Monday, October 5, 2009

Jeff Erickson's list on "what a theorectical computer science student should know"

Things happened Last week included Mihai Pătraşcu's (yes, it was him again!) this contraversial blog post with 74+ comparably interesting comments! I highly recommend you to follow it a bit if you haven't done so.

I would like to quote today a comment by Jeff Erickson on related post.
It is a (a suprisingly long and diverse) list on "what he thinks a phd in theorectical computer science should know". Erickson is a computational geometrist. That is why his list contains a few of geometry results (or algebraic results that is useful in algebraic geometry?) that I never heard of, like persistent homology, de Casteljau's algorithm...But other than that , i think it is quite fair list. Check how many you haven't heard of and wiki them! :)

here is Erickson's (I think it is him) comment on this post "core TCS" (by Michael Mitzenmacher):
"I'm sure my answer will be horribly biased, but at the very least, it should touch on Voronoi diagrams, Van Emde Boas trees, the dynamic optimality conjecture, Vapnik-Chernovenkis dimension, the Immerman–Szelepcsényi theorem, the Lovasz Local Lemma, spectral partitioning, Bloom filters, boosting, Arrow’s paradox, Nash equilibria, Azuma's inequality, zero-knowledge proofs, the PCP theorem, smoothed analysis, coresets, Reed-Solomon codes, distributed cache consistency protocols, interior-point methods, persistent homology, AKS sorting networks, monadic second-order logic, snake sort, evasive graph properties, the Okamura-Seymour theorem, natural proofs, Ramsey theory, fractional cascading, communication complexity, doubling dimension, derandomization, soft heaps, the power of two choices, the Graph Minor Theorem, Krylov subspace iteration, Euclid's algorithm, Dijkstra's algorithm, Boruvka's algorithm, Dehn's algorithm, Grover's algorithm, Buchberger's algorithm, de Casteljau's algorithm, Khachiyan's algorithm, Bresenham's algorithm, Dinits's algorithm, Lloyd's algorithm, and how to quickly look up lots of impressive-sounding results on Wikipedia."

I would like to add a few things I really like and think are equally important/elegent:
Dynamic programming speedup and SMAWK algorithm, expander, Karp-Luby and Karp-Luby-Madras , Alon-Matias-Szegedy algorithm, regularity lemma, perfect graph theorems, treewidth, Baker's shifting technique, Digital fountain, Lovasz's Sandwich theorem... I can only recall this many at this point and probably will update it later.
Also I would like to see a similar list for DB students. Hellerstein and Stonebraker 's list might be a good starting point.

Monday, September 28, 2009

TAOCP and the inversion table of a permutation.

Perhaps every body in Computer science knows Knuth's volumes of "the art of computer programming".
Since I was not a in a cs undergraduate program, I knew this book even before
any introductory algorithm book like "Introduction to Algorithms".
Partly due to the comment made by Bill Gate on the back of the 1st volumes: "If you think you're a really good programmer . . . read (Knuth's) Art of Computer Programming . . . You should definitely send me a resume if you can read the whole thing.", I started reading the book, as my very first book on algorithms (I had read some books about c and vb programming before, but I wouldn't count them as algorithm books).
It was like 5 years back, I could only read 2-3 pages per day. Eventually I finished two volumes (the 1st and 3rd vol). I still remember many exciting moments when I managed to understand the contents, full of numerous amazing algorithms and beautiful analysis. The book does not only convey the knowledge of which the big vols already contain a incredibly amount,
but also an aesthetic point of view towards algorithms by providing a large number of masterpieces that best illustrate it. I probably should say, the topics discussed in the book is arguably bit old-fashined and these volumes do not appear in papers' ref as frequently in nowadays cs research as 3o years ago. However, many tricks covered in the book can still be very useful.

Today, I will discuss a very cute one, the inversion table for a permutation. The definition is very simple.
Given a permutation P of {1,2,....,n}, its inversion table is a length n sequence
b1,b2,...,bn where bi= #(elements that are bigger than i and appear to the left of i in P.

e.g. P= {591826473}
its inverse table is {236402210}. b1=2 since there are two elements (5 and 9) larger than and on the LHS of 1.

A easy observation is an inversion table uniquely determines a permutation.
It is at times much more convenient to work with inversion tables than original permutations since the space of inversion tables looks much simpler:
actually, every sequence b1b2....bn with 0<=bi<=n-i corresponds to some inversion table!
(Note we also have n*(n-1)*....*1=n! many of them.)

Let us see an application.
Compute the expected number of right-to-left maxima for a random permutation of {1,2,...,n}.
This was one problem in the final exam of our probabilistic method course taught by Aravind Srinivasan.
Direct computation would be rather cumbersome. I remember I spent like 20min on manipulating binomial coefficients and got nowhere...
Let us work with the inversion table b1b2...bn now.
We can see the number i is a right-to-left maxima if and only if bi=n-i (all n-i larger numbers are all to its left).
Also taking a random permutation is equivalent to taking a random inversion table since the 1-1 correspondence.
Therefore,
E= \sum_{i=1}^n Pr(i is a right-left maximum)=\sum_{i=1}^n Pr(bi=n-i)= \sum_{i=1}^n 1/(n-i+1) = H_n (the n-th harmonic number).

Cute, isn't it!


Another solution I know is to use the following recurrence:

E[#maxima w.r.t {1...n}] = \sum_{i=1}^n Prob(first element is i) (E[#maxima w.r.t {i+1...n}]+1)
which reduces to the recursion
E(n)= \sum_{i=0}^{n-1}(E(n-i)+1)/n.
The nth harmonich number H_n solves this recursion.
This is as simple, but way less cute, I think.

Sunday, August 30, 2009

VLDB09 - journal like review process.

This year's vldb was held in Lyon, France. I came back from there yesterday night. The jet-lag woke me at 5pm...so I have some time to do some notes here.

The first thing i want to note is the undergoing change of the review process of Proc. of VLDB. This year' s vldb includes a couple of (<=10) papers that have gone through a journal like review process(multiple rounds), rather than the traditional one round thing for conference. Next year's proceeding will include 10% of such papers. From 2011, there will be no traditional conference PC and everything will be shifted to the journal-like track.

I talked to quite a few of my friends. Most of them did not seem quite enthusiastic about the transition. I guess the reason might be that they are able to get papers accepted anyway and the multi-round process only implies an increasesing workload but not a higher chance to them. Literally quote (translated into english) some of their words "I bet the review will be 'more experiments on xxx', 'more references on xxx' , ' more xxx on xxx' "; "more experiments, on a new dataset, these are only easy to say"," the worst thing could happen is that a reviewer didn't like the paper intially but asked for a major revision rather than gave a outright reject. Then the paper was turned around multiple times, the content was rewritten and exp was redone and the reviewer still didn't like the paper and finally rejected it"......I have a few journal papers and went through the multiround process myself twice. Fortunately, my experience was not that bad. The first paper was submitted to transaction on information theory. we got two quite positive reviews. One review was like 5 pages long with numerous detailed comments, but each of those was fairly easy to fix and the paper got accepted after that round. The other one was a very short paper which i submitted to Information Processing Letter. The process was even smoother. I got two reviews and one of them only had one sentence for recommending acceptence. Responsing the other review took me less than two hours and the paper was accetped after this round. I hope this new vldb journal thing can improve the paper quality substantially without bringing us too much pain....

Saturday, August 22, 2009

Some random notes from MPI

Today is the last day of my stay in Saarbrucken, Germany. During the last two weeks, I have been here visiting Julian Mestre, a former student of my current advisor Samir Khuller.

We have been working on a stochastic optimization problem and already had some exciting progress. Cooperating with Julian is a lot fun. We have a big overlap of our background, which makes the catchup very easy, meanwhile our thinking patterns are quite different and complements to each other well. (As I observed) Julian tends to start off from concrete examples (worst/average/best case examples), making insightful observation and then generalizing until the discover of the solution which works for all case. The way I think usually begins with some very vague framework and then filling in details and testing whether it works for concrete examples. if it didn't work, then try to patch it if possible or just scratch it and start from a new framework or just find a new problem to consider...
I will write some thing more about the interesting problem we are considering at some later point when appropiate.

But I would like to make some notes for some random stuffs I learned during the visit.

1) the power of two choices: consider the following random experiment: throw n balls randomly into n bins (by randomly, i mean for each ball, uniformly and randomly choose a bin to put the ball.) It is a well known fact that the expected maximum load of any bin is O(logn/loglogn). This means this purely random procedure is not a very good scheme to achieve load balancing, albeit it has already been used in many places for this purpose.

However, if instead of completely random throw of each ball, we randomly pick two bins and throw the ball into the bin with smaller load, the expected maximum load becomes O(loglogn). With only one more choice and very limited rationalness (make judgement only between two choices), we can achieve a exponential improvement. Fascinating!

For more general results and references of its numerous applications, see Michael Mitzenmacher's phd thesis.

2)Compressed Sensing: this is now a super hot topic in ECE (also some branches of CS related to signal, compression, etc.) during the recent breaktrhough by Candes, Romberg, Tao and Donoho. But it seems to me not many TCS and DB people (my impression)are high on this topic (I am only aware Cormode and Muthukrishnan have done some work on this). I read through a basic tutorial and browse a few paper abstracts on this and got some rough ideas about the field. Not only the theory borrows ideas from a variety of branchs in applied math and nicely fit them togther, but the practical implication are also remarkable. Many traditional compression schemes will be (or have been) rewritten due to the new developed theory and techniques. My question is whether it can be applied to the approximate query processing and/or query optimization in some way to achieve a better performance than wavelet which has been extensively used for those applications.

3) A question I don't have a good answer: Raghav told me this cute problem when he was reading the book '' the probabilistic method'' by Alon and Spencer. It is a exercise problem on the first chapter.

x and y are two independent identically distributed random variable.
Prove: Pr(|x-y|<=1) >= Pr (1<=x-y<=2) (no absolute value in RHS) I suspect the following (tighter generalization) is also true Pr(0<=x-y<=t) >= Pr(a<=x-y<=a+t) for any a,t>=0

If you try to do it in a brute-force way, it boils down to proving an fairly clean inequality.
But the inequality seems not quite so easy to prove. I have some idea how to prove it, but it is pretty messy and i don't have enough incentive to figure out the details.
Since it is only an exercise problem in the first chapter, I believe the problem should have a very simple but tricky solution. Let me know if you know the solution.