Thursday, May 28, 2009

vldb 2009

Finally, our paper on ranking over probabilistic datasets is accepted in VLDB. Since I like this paper very much, probly even more than any theory paper I have written so far, I just blog it here, for remembrance. To me, the ideas are neat and the algorithms are elegant, also very practical (I did a lot of experiments).  

This paper has been through a pretty tough time. The first draft was our report for the database course project. At that time, the paper was titled "clustering and ranking on prob databases" and devoted much space talking about clustering. We submitted it to vldb08 and got 2 weak accepts and one weak rej. Then the paper was rolloverred (a policy that some boarderline papers can be rolled to the next same level conf(sigmod/vldb) with the same set of reviewers) to sigmod09 and was reviewed by the same set of reviewers but got the same review results. 

Finally, we decide to completely remove the clustering part and focus only on the ranking part.
More results, explanations and experiment were added.  Especially, I added the ranking approximation part where I borrowed some ideas and results from signal processing (actually fourier transformation). This makes our new ranking function an very effective and efficient one.
 

Wednesday, May 20, 2009

Godel Prize

I have heard these two papers for a few years, but never got a chance to have a look.


From communication of ACM.

ACM Group Honors Researchers Who Discovered Zig-Zag Graph That Improves Design of Robust Computer Networks

AScribe Newswire (05/19/09)

ACM's Special Interest Group on Algorithms and Computing Theory (SIGACT) and the European Association for Theoretical Computer Science have awarded the Godel Prize to Omer Reingold, Salil Vadhan, and Avi Wigderson for developing a new type of graph that enables the construction of large expander graphs, which are important in designing robust computer networks and constructing theories of error-correcting computer codes. The new zig-zag graph was able to help solve the problem of detecting a path from one node to another in very small storage for undirected graphs. In their paper, "Entropy Waves, the Zig-Zag Graph Product and New Constant Degree Expanders," Reingold, Vadhan, and Wigderson discussed their research on a family of expander graphs, which are used for critical computer theory applications. The sparse but highly connected expander graphs were constructed using the zig-zag graph method, which enables the construction of large expanders from smaller expanders while preserving degree and connectivity. In a second paper, "Undirected Connectivity in Log-Space," Reingold proved that connectivity in undirected graphs can be solved in logarithmic storage, and that any connected graph is a very weak expander, but applying the zig-zag method makes it possible to turn the graph into an expander of only moderately large size. Omer Reingold is a professor of computer science at the Weizmann Institute of Science, Salil Vadhan is a professor of computer science and applied mathematics at Harvard University's School of Engineering and Applied Sciences, and Avi Wigderson is a professor at the School of Mathematics, Institute for Advanced Study. The Godel Prize, which includes an award of $5,000, is named after Kurt Godel in recognition of his major contributions to mathematical logic and the foundations of computer science.

Tuesday, May 12, 2009

The next big thing??

It is called Wolfram/Alpha project. From what I saw from the blog
, it is capable of answering fairly complicated (even scientific) queries which seems to be impossible or very hard for google. 
It is claimed as "the next big thing"  after windows and google.
and the release time is this month. let's see. 


Saturday, May 9, 2009

"tussles?" among theorists

I am becoming a fan of google products and starting to try out various cool things like blogger and calender etc. Blogger allows you to subscribe to several others' blogs and view updates from these aggregately in a single page. I follow a few theoreticians' blogs, like Richard Lipton's P=NP,
there is always a lot of information that interests me.

Here is a quite recent post

in which Dr Lipton listed a few papers that he thought should also deserve the prize but were missed.
There were quite a few replies with many of them suggested some extension to the list.

Among those, I found Mihai's comment which raised interesting debates among those people.
He claimed the prize was too 'complexity biased' and tended to ignore algorithm 
(maybe I should say data structure here?) papers.
He also wrote a even more ``interesting'' followup post.
I assume no copyright issue and just cite his first paragraph:
"
As you probably know, I think theory is in a deep crisis: our obsession to turn into philosophy (prividing unappreciated service to disciplines we do not understand), stemming from our deep commitment to become as ignorant and irrelevant as possible inside the CS department; the influx of people who like to pretend we're Mathematics, and have no idea what computation means; the obsession with the "polynomial" versus "non-polynomial" and it's many consequences; etc etc etc. Since I am not inclined to write novels, I have never been able to put all these arguments inside a coherent text.
"
I really laughed my ass off by reading this, notwithstanding I don't really agree on it.
I heard that Mihai is quite a controversial figure, but this was the first time I really saw it.
He seems to be a disbeliever of approximation algorithms which is the area I have been working on.  Also his words:
"But whenever I say algorithms it should read "algorithms, minus whatever the TCS community is doing in approximation algorithms".
I have serious concerns about our field of approximation algorithms (with important exceptions), and I think it's better to treat it as an entirely separate field."

Although I am myself in no position to judge the community nor his comments, I think what he said is quite thought-provoking. On one hand, TCS is full of amazing masterpieces in every subfield which is totally capable to make people working on that subfield be obsessed and tend to depreciated  works on other field. On the other hand, a majority of papers in every field are not that significant and influential, not that novel, not that a breakthrough which always leave some space for others to criticize. My opinion is that a piece of research is useful, if it is not trivial, AND answer (even only a small number of) peoples doubt OR raise people's interests. Among other cs subareas I have read papers from, for eg. DB and Networking, TCS has a much larger proportion of papers satisfying my criteria - my experience is that in theory, as long as the paper is related, no matter it appears in a 1-tier or 2-tier conf, there are something I want to know and need to know while in db or networking,  papers from 2-tier confs are mostly used to fill out related work sections or used as inferior examples to compare with other 'superior' ones(maybe because I didn't read as many db or networking papers??).

Just some random thought. Blogging indeed kills a lot spare time, might be better than playing the fantastic game Dota , which is more time-consuming and stressing.  





First blog

google's products are always cool.