At the Hadoop Summit

Events

Something very big is happening. Interest in data-intensive scalable computing (DISC), in both research and industry practice, is taking off. The first Hadoop Summit was hosted yesterday by Yahoo! Research in Santa Clara. (Hadoop is the open-source suite of software packages for “map-reduce” style distributed computing.) The Summit had been planned originally as a workshop for about 100 people, but well over 300 people attended. Another 100 or so wanted to attend but had to be turned away. The meeting room was packed to the gills (hopelessly overloading the wireless Internet support).

The presentations at the Summit were fascinating. While they varied greatly in technical depth, in total they gave an overwhelming sense of rapidly growing amounts of ingenuity being directed towards solving large-scale data-intensive problems on highly scalable computing clusters. As one might expect, researchers from some of the academic powerhouses that have taken a leadership role in this area (specifically, CMU and Berkeley) were among the speakers, as well as people from industry research labs at Yahoo!, IBM, and Microsoft. But there were also technical talks by developers at places like Google, Amazon, Rapleaf, Facebook, and Autodesk, demonstrating the growing industry acceptance of Hadoop. The group of speakers was quite eclectic, especially considering the fairly hard-core technical nature of large-scale distributed systems. (Lots of current and former CMU people were also there, including Randy Bryant, Garth Gibson, Jamie Callan, Dave O’Hallaron, Dan Jenkins, Mihai Budiu, and Chris Olston, to name a few.)

Today, Randy Bryant is chairing a “big data” workshop, an invitation-only event with about 100 researchers expected to attend, supported as an official “visioning” workshop by the Computing Community Consortium (CCC). Starting with our partnership with Yahoo!, which led to our use of the M45 cluster, we have been very supportive of the planning for this Summit. Hadoop and, more generally, distributed systems research (particularly in the map-reduce mode) appear to be enjoying a surge of interest here at CMU as well as at many other places. Regarding map-reduce, there is an elegance to the high-level concept, as well as some satisfaction in seeing what is, fundamentally, a functional programming concept being put to interesting use. Some of the talks, for example on Yahoo!’s Pig and Microsoft’s LINQ, are evolving the idea from raw API to programming model to real programming languages. On the other hand, there is also the sense that these are still early days, with a lot more progress yet to be made and perhaps some significant contributions just waiting to be discovered.
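For readers who have not seen the functional roots of map-reduce, here is a rough single-machine sketch of the idea in Python, using the customary word-count example. It is only an illustration and does not use Hadoop's API; the names map_phase, shuffle, reduce_phase, and word_count are made up for this post, and a real framework would run the map and reduce steps in parallel across a cluster and handle the grouping itself.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Fold all the values for one key into a single count."""
    return reduce(lambda a, b: a + b, values, 0)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return {word: reduce_phase(values) for word, values in shuffle(pairs).items()}


counts = word_count(["big data is big", "data is everywhere"])
print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

The point of the sketch is that the map step and the reduce step are pure functions with no shared state, which is what makes it plausible to spread them over many machines.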

A very good place to be doing research, I would say.

Peter Lee @ March 26, 2008
