TCS-DB-er: July 2009

Tuesday, July 28, 2009

How fast can you expand a polynomial? (Cont)

I finally get some time to record the O(n^2) time algorithm.
The O(n^2\log^2 n) time algorithm can be found from the appendix of this paper of mine.

Now, let me present an O(n^2) time algorithm.

First, an EXPRESSION can be thought as a tree where all variables and constants are leaves and inner nodes are operations (* and +)

E.g. (x^2+x+1)(2x+1) corresponds to

(Algorithm 1)
1. Choose n+1 different numbers a_1, ...., a_{n+1} .
2. Evaluate the polynomial at these points, i.e., compute F(a_i). (each evaluation takes linear time (bottom-up over the tree). So O(n^2) time in total)
3. Use any O(n^2) polynomial interpolation algorithm to find the coefficient (actually reduces to finding a solution for a linear system).

Cute? Still have to rely on some other nontrivial algorithm to invert the Vandermonde system in O(n^2) time (ref [3][4] in the wiki page).

Let us look at a cuter one.
(Algorithm 2)

Suppose the polynomial if F(x)=\sum_j c_j x^j.
Let e_i be the n-dimensional zero vector except that the ith entry is 1, i.e.,e_i=(0,0,..,1,...,0,0).
Let d_i be the n-dimensional vector which is the DFT(discrete Fourier transformation) of e_i.
Denote u be the nth root of unit and u={1,u,u^2,....,u^{n-1}}.
Therefore, e_i[j]=\sum_k d_i[k] u^{jk}. Equivalently, e_i=\sum_k d_i[k] u^k
where u^k={1,u^k,u^{2k},u^{3k},.....}

Computing all d_i takes O(n^2) time.
Let c be the coefficient vector (of F).
c_i = <c, e_i> (inner product)
<c,e_i>=\sum_k d_i[k]<c, u^k>....................(1)

But what is <c,u^k> ? it is exactly the numerical value of F(u^k)!!! (think about it)

Now, we have our algorithm:
1.compute d_i for all i ....................use DFT formula directly (don't need FFT) ,O(n^2)
2.compute F(u^k) for all k.....................n poly evaluations over complex, O(n^2)
3.use (1) to compute the coefficient.................O(n^2)

So, nothing fancy here(except the idea:). Cuter, right?

OPEN question: can we do better? or the lower bound is O(n^2)

Saturday, July 18, 2009

How fast can you expand a polynomial?

I just came across the following very basic problem.
given a expression of univariable polynomial which is constructed by the following recursion rules
(1) A constant or the variable x is a EXPRESSION.
(2) The sum of two EXPRESSIONS is a EXPRESSION
(3) The product of two EXPRESSION is a EXPRESSION
e.g.
f(x)=((1+x+x^2)(x^2+2x^3)+x^3(2+3x^4))(1+2x)

We know the degree of the polynomial and the length of the EXPRESSION are of sizes O(n).
The question is how fast can you expand the EXPRESSION into a standard form, i.e.,
\sum_{i=0}^n c_i x^i
In an other word, how fast can you compute the coefficients c_i for all i.

It is pretty trivial to achieve O(n^3). (the running time analysis needs a bit counting).
With the help of FFT(fast Fourier transformation), you can do O(n^2\log^2n).
I think I can do O(n^2), but need some nontrivial math...

My question is whether \Omega(n^2) is the lower bound..
or can we do even better?
Could anyone point me out some reference??

Tuesday, July 14, 2009

A Beatiful Science Dream Came TRUE on the 20th Anniversary!

Cold Fusion is proven to be real.
The discovery of an inexhausible new source of energy!!

However, a potential new weapon may also be on its way and it is argued that the ``new weapon'' would be extremely hard to detect and the damage could be massive...

Bless or curse....Let's see...

Check this out
http://goldismoney.info/forums/showthread.php?t=361895

Monday, July 6, 2009

CRAZY summer

(copied from my msn space)

I remember I said in the previous blog I was going to have a crazy summer, but I didn't expect it is this crazy..........

Just passed the SODA deadline a few minute ago.

I had two submissions: one is about a clustering problem with some new interesting constraints and the other is on an interesting generalization of the classic machine scheduling problem. Hopefully, I will get lucky results on a nonempty subset of them....

A few days ago, I was in Providence attending SIGMOD09. Something interesting to say about the conf.

First is the super surprising and exciting news that I got the VLDB09 best paper award. I had never expected that before and was really surprised.

Honestly, it is a bit too much to a student like me who is sort of new in DB and I am not quite ready....Perhas it means to be an alert dictating me to learn more DB..:)

I really enjoyed the conference since I made a lot of friends and met quite a few famous researcher. A new discovery is there are so many prof/student graduated from Fudan are doing DB right now. It is not at all easy to find so many Fudaner outside Fudan and all are doing DB, ranging prof/ass prof/phd/lab researcher/industy ppl.....

It was also a good chance to meet those legendary big name in DB, e.g. micheal stonebreaker, ronald fagin..., they are true bigboss da学霸!

This is the story. When I was doing my PODS talk, fagin was sitting in the first row but I didn't know that was fagin at that time.

I had prepared the talk a couple of times and the time should be just enough. During the talk, fagin kept interrupting and asking me questions. I was thinking "这老头谁啊。。。". Since fagin always asked question, there was an other lady, she thought maybe question was allowed during the talk and tried to ask one. Before she finished her first sentence, the session chair stopped her with "we are running out time, plz take the questions offline...". But fagin kept asking questions and the session chair responded with super smile in his face...Finally, I could quite finish all the stuffs I wished to cover and skipped a lot of details. Later on, my advisor told me that was ronald fagin.....

To me, this SIGMOD was not as exciting as the just-ended STOC09. A major reason is that I can only understand a limited number of talks. Hope next time things will get better.

A couple of technical remarks: one is column store seems becoming very hot now. Another is probabilistic database, many sigmod/pods sessions are devoted for it. See also the most recent issue of the Comm ACM, there is a long artical about it. I think it should be the way we manage uncertain data in real life applications, but just not sure how far it will go, will this theory result in an indispensible commercial system? Let's see.

In Aug, I will go to UIUC for a wireless network summer school and then go to Germany to visit MPI and then to France for VLDB. Crazy summer goes on an on....

TCS-DB-er