Technology

May 04, 2008

Science 2.0?

This year I am running several conferences, and it has made me ask this question: can science move faster? Today most scientists have far more compute power (petaflops) than they did a few years ago, and yet many scientific processes are no faster than before. For example, last year (2007) I had my very first 2009 publication. That's right: in 2009, a year from now, a journal paper that was written in 2006 will come out. So is science really keeping up with progress in related fields?

Continue reading "Science 2.0?" »

March 27, 2008

Update to EC2 Services

A few months ago I posted a cost analysis of startups using Amazon's EC2 services. Amazon recently announced some updates to that service, addressing two issues we did not examine.

  1. Availability Zones
  2. Elastic IP Addresses

Here are a couple of quick blurbs on the new capabilities.

Continue reading "Update to EC2 Services" »

March 26, 2008

Summize Text Summarization Talk @ UMD

Today, Eric gave a talk on our summarization technology at the University of Maryland's cloud computing speaker series.
 

Continue reading "Summize Text Summarization Talk @ UMD" »

March 25, 2008

InfoScale 2008

Conferences are a great way to keep in sync with the latest ideas. This year I am co-chairing several conferences in the area of search. Today I thought it would be interesting to share our InfoScale 2008 call for papers. InfoScale is a conference started a few years ago that focuses on the issues of scaling networking and search problems. So, check out the call for papers and the venue of the conference.

Continue reading "InfoScale 2008" »

January 02, 2008

Stars and Bars

We are often asked why we use a histogram visualization, which we refer to as a "snip", for ratings rather than the traditional stars or numeric scores used on most other sites. The answer is a bit more complex than "it looks cool" and thus seemed like a great topic for a post. Star and numeric single-value summaries for a product often hide valuable information. This hurts users because controversial books are often the great books, and the negative review is often the most insightful one. So today I am going to give a quick review of the history of attitude encoding, examine the issues associated with central tendency summaries (a single numeric or star value) and demonstrate how a histogram allows users to better understand the review-o-sphere.
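To make the central tendency problem concrete, here is a minimal Python sketch. It is not the Summize "snip" implementation; the function names and both rating lists are made up for illustration. It shows two products with the same average rating whose histograms tell very different stories.

```python
from collections import Counter
from statistics import mean


def rating_histogram(ratings, scale=(1, 2, 3, 4, 5)):
    """Count how many reviews gave each star value on the scale."""
    counts = Counter(ratings)
    return {star: counts.get(star, 0) for star in scale}


def render_bars(histogram):
    """Render the histogram as one text bar per star value, highest first."""
    return "\n".join(
        f"{star} | {'#' * count}"
        for star, count in sorted(histogram.items(), reverse=True)
    )


# Hypothetical ratings, invented for this example only.
consensus = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]      # everyone thinks it is average
controversial = [1, 1, 1, 1, 1, 5, 5, 5, 5, 5]  # readers love it or hate it

# Both products collapse to the same single-value summary...
print(mean(consensus), mean(controversial))

# ...but the histograms expose the split opinion the average hides.
print(render_bars(rating_histogram(consensus)))
print(render_bars(rating_histogram(controversial)))
```

A star widget would show three stars for both products, while the bars make it obvious that the second one is polarizing.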

Continue reading "Stars and Bars" »

December 19, 2007

Computing Compute Resources for a Startup

Startups today need far less capital for compute resources than they did just a decade ago. This has been driven by improvements in CPU processing speeds, people designing systems that run on cheap commodity servers and an overall trend of getting more compute for less. For Summize, I couldn't imagine building the technology and processing the data we have today with less than a few million dollars in hardware a decade ago. Now $25k gets you a lot of computing power. Even with cheaper hardware, these resources come with additional costs that can be real issues for a startup. For example, you still must buy and set up servers, find rack space, and deal with backups, DNS servers and many other tasks before those servers are usable. There is also the additional time and knowledge needed to perform the maintenance that keeps them running. In this post I examine some of the costs involved in figuring out when to build vs. rent these resources. Others can benefit from this, as the problem is not specific to Summize but is common to compute-intensive startups.
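One simple way to frame the build vs. rent question is a break-even calculation. The Python sketch below is only an illustration under stated assumptions: the $25k hardware figure comes from the paragraph above, while the monthly ownership and rental costs are hypothetical placeholders, not numbers from the analysis or from any provider's price list.

```python
def months_to_break_even(hardware_cost, monthly_ownership_cost, monthly_rental_cost):
    """Return the number of months after which owning is cheaper than renting.

    Returns None when renting costs no more per month than running your own
    servers, because the up-front purchase is then never recovered.
    """
    monthly_saving = monthly_rental_cost - monthly_ownership_cost
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


# $25k up front (from the post); the two monthly figures are hypothetical.
hardware_cost = 25_000
monthly_ownership_cost = 1_000  # assumed: rack space, power, admin time
monthly_rental_cost = 3_500     # assumed: equivalent rented capacity

print(months_to_break_even(hardware_cost, monthly_ownership_cost, monthly_rental_cost))
```

With these made-up inputs the purchase pays for itself after ten months; a startup unsure of surviving or needing that capacity for that long might still prefer to rent.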

Continue reading "Computing Compute Resources for a Startup" »