Title: Keynote for Google I/O 2008: Client, Connectivity, and the Cloud. Abstract: Featuring Vic Gundotra, Allen Hurff (MySpace), Steve Horowitz, Kevin Gibbs, Mark Lucovsky, Bruce Johnson, David Glazer, Nat Brown (iLike).
I’m at minute 23 and it looks very interesting and relevant for any developer, in any field.
So, sit tight, grab a coke/beer/cigarette/juice/whatever-you-like and… listen carefully.
This time the main topic was “Developer Productivity”. It was one of the funniest I have ever attended: a couple of the speakers were just “crazy” (but in a good way)!
Hadoop is a software platform that lets one easily write and run applications that process vast amounts of data.
Here’s what makes Hadoop especially useful:
Scalable: Hadoop can reliably store and process petabytes.
Economical: It distributes the data and processing across clusters of commonly available computers. These clusters can number into the thousands of nodes.
Efficient: By distributing the data, Hadoop can process it in parallel on the nodes where the data is located. This makes it extremely rapid.
Reliable: Hadoop automatically maintains multiple copies of data and automatically redeploys computing tasks based on failures.
Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS). MapReduce divides applications into many small blocks of work. HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster. MapReduce can then process the data where it is located.
Hadoop has been demonstrated on clusters with 2000 nodes. The current design target is 10,000 node clusters.
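To make the description above a bit more concrete, here is a minimal sketch of the MapReduce idea as a word count, run in a single Python process. It does not use Hadoop or any of its APIs: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration, and each input string stands in for a block of data that a real cluster would process on the node where it is stored.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all the values emitted for a single key."""
    return key, sum(values)


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the results."""
    emitted = []
    for chunk in chunks:  # on a cluster, each chunk would be mapped in parallel
        emitted.extend(map_phase(chunk))
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())


if __name__ == "__main__":
    # Hypothetical input: three small "blocks" of text.
    print(word_count(["the quick brown fox", "the lazy dog", "The fox"]))
```

The map and reduce functions never see the whole dataset, only one chunk or one key at a time. That is what allows a framework to spread the work across many machines.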
I followed the Quickstart guide and can confirm that it works on Mac OS X too, but I only managed to run it in “standalone” mode, which is useful for first-stage development and debugging.
I was looking for information about MapReduce and thought it would be a good idea to take a look at the Tech Talks published by Google. Here we go.
Title: 2007 Seattle Conference on Scalability: MapReduce Used on Large Geographic Data Sets Location: Google Tech Talks June 23, 2007 Speaker: Barry Brumitt, Google Inc. Abstract: MapReduce is a programming model and library designed to simplify distributed processing of huge datasets on large clusters of computers. This is achieved by providing a general mechanism which largely relieves the programmer from having to handle challenging distributed computing problems such as data distribution, process coordination, fault tolerance, and scaling. While working on Google Maps, I’ve used MapReduce extensively to process and transform datasets which describe the earth’s geography. In this talk, I’ll introduce MapReduce, demonstrating its broad applicability through example problems ranging from basic data transformation to complex graph processing, all in the context of geographic data.
Beyond the MapReduce technique itself, the speaker, Barry Brumitt, gives a hint of “how things work” at Google (developer-wise). And it’s not boring at all: it’s actually quite funny.
Over the last two or three months I have spent some of my free time studying Android, the so-called “Google OS”, which is the product of the Open Handset Alliance.
In this relatively short time I was able to collect information from the official sources, as well as from some very interesting and active forums (like this one and this one). I was also able to meet other experts during the Android Code Day event (here is a summary of what we did there). It was a very good place to ask important business-related questions: I should say that the Google Developer Advocate Jason Chen was very keen to answer my tricky questions.
Spending more words on it here would be quite pointless: half of the web is talking about it (while the other half is talking about the iPhone). But I would like to share part of the results of this study:
A Presentation: “Into the Android” [PDF | HTML+Flash]
Video #1: MWC - Android running on different ARM-based devices [mp4]
Source code of the application I developed for study (actually, it is just the tutorial that comes with the SDK, with much more internal documentation): VSNotepad.tar.gz.
The presentation explains various details of the development process, using the code of this application as an example.
The videos are also on YouTube and embedded here after the jump.