I'm a partner in the advanced analytics group at Bain & Company, the global management consulting firm. My primary focus is on marketing analytics. I've been writing here (views my own) about marketing, technology, e-business, and analytics since 2003.


February 17, 2008

At Harvard KSG With Tim Berners-Lee

Jerry Mechling, who teaches at Harvard's Kennedy School and runs its "Leadership for a Networked World" (formerly E-Government Executive Education) program, invited me to a talk by Sir Tim Berners-Lee last Wednesday evening. The audience included ~50 current and former senior public-sector information technology officials attending one of Jerry's sessions.

Sir Tim's comments included:

  • a discussion of how the WWW came to be
  • an examination of some of the risks that could have killed it early on, and how those were overcome
  • an exploration of some of the possibilities of the Semantic Web
  • an exhortation to members of the audience to "set their data free"

It was interesting to hear him describe the original purpose of the Web as a "collaborative workspace" and not a "publishing tool". Physicists came to CERN in the late eighties and early nineties to conduct experiments that took a long time to plan. Folks involved in any given experiment often came from a number of different institutions, and these institutions often had very different computer networks. So if you were a researcher at university X trying to access a colleague's work at institute Y or Z, you had to learn those institutions' protocols for document access. Sir Tim's innovation was to apply hypertext to the Internet to simplify this challenge. Essentially he abstracted general properties of various networks' document access schemes into a specification that, if followed by document publishers, would make it far easier for people to access each other's stuff. Here's a W3C architecture document that describes things at a high level. And here's a link to slides for a recent "past-present-future" talk by Sir Tim that's worth browsing through for a sense of what problems W3C tackles, and how.
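To make the "one specification for many access schemes" idea concrete, here's a minimal sketch of uniform addressing using Python's standard library. It is my own illustration, not anything Sir Tim showed: the `describe` helper is a hypothetical name, and the `example.org` hosts are placeholders. The point is only that one address syntax can name documents served over quite different protocols.

```python
from urllib.parse import urlsplit


def describe(url: str) -> tuple[str, str, str]:
    """Split a URL into (scheme, host, path).

    The scheme names the access protocol, the host names the machine,
    and the path names the document on that machine.
    """
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname or "", parts.path)


# The first address is CERN's original web page; the other two use
# placeholder hosts (example.org) purely for illustration.
ADDRESSES = [
    "http://info.cern.ch/hypertext/WWW/TheProject.html",
    "ftp://ftp.example.org/pub/paper.ps",
    "gopher://gopher.example.org/1/physics",
]

if __name__ == "__main__":
    for address in ADDRESSES:
        scheme, host, path = describe(address)
        print(f"protocol={scheme:<7} host={host:<20} document={path}")
```

A reader only needs to understand one address format; the details of each institution's access protocol hide behind the scheme.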

He went on to describe how the traffic on their server doubled every four months from 1991 to 1994. As this happened, someone at some point suggested that maybe CERN should be getting some sort of royalty for facilitating all of this document access. Sir Tim noted how crucial it was that they were able to avoid charging, describing how Gopher tried to charge on a per-request basis and how that killed the protocol. Later the risk shifted from failing to reach critical mass to fragmentation during the years of the "browser wars" -- the risk that browser providers would promote incompatible standards for web content. Sir Tim observed that perhaps the most important achievement enabling all this was the W3C process for reaching consensus across so many interested users, pointing us to "The Art of Consensus" Guidebook published by the W3C.
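For a sense of what doubling every four months adds up to, here's a quick back-of-the-envelope sketch. The 36-month span is my assumption (the talk gave only the years), and `growth_factor` is just an illustrative helper name.

```python
def growth_factor(months: float, doubling_period_months: float = 4.0) -> float:
    """Return how many times larger traffic is after `months` of steady doubling."""
    doublings = months / doubling_period_months
    return 2.0 ** doublings


if __name__ == "__main__":
    # Assumption: the doubling held for three full years (36 months).
    print(growth_factor(12))  # one year: three doublings
    print(growth_factor(36))  # three years: nine doublings
```

Under that assumption, one year is three doublings (8x) and three years is nine doublings (512x).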

We got his Semantic Web pitch. The idea: expose data in structured ways that allow the contents of the Web to be queried as a giant database might be. He described how the standards for this are maturing. Nonetheless, they would mature faster if more data were publicly available to be queried, so he naturally made that appeal to the audience, echoing a blogger's recent plea: "raw data, now!" There's a good case to be made for making possible insights like those Hans Rosling presented in 2006 at TED (mindblowing presentation, via Al Essa -- thanks, Al! As a user and fan of IBM's ManyEyes project, I'm salivating for the day Google lets us use Gapminder's TrendAnalyzer). Reg Alcock asked the P (privacy) question; Sir Tim's answer, as I recall it, was that it comes down to how use of the data is licensed.

I asked Sir Tim which sites best illustrate the potential of the Semantic Web.  He offered these:

  • DBpedia, which parses Wikipedia sections that are semistructured in layout and re-publishes them as structured data that can be queried and mashed up usefully
  • GovTrack, which provides information on how members of Congress vote
  • TheyWorkForYou, a similar UK site
  • MusicBrainz, an open-source music metadata project
  • The Linking Open Data project
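To give a feel for what "structured data that can be queried" means, here's a minimal sketch of the subject-predicate-object "triple" model that Semantic Web data is built on. This is a toy of my own, assuming an in-memory list of triples and a hypothetical `query` helper; it is not the API of any of the sites above, and real deployments use standards such as RDF and SPARQL rather than Python tuples.

```python
from typing import Optional

Triple = tuple[str, str, str]  # (subject, predicate, object)

# A tiny illustrative dataset; the identifiers are made up for this sketch.
TRIPLES: list[Triple] = [
    ("Tim_Berners-Lee", "invented", "World_Wide_Web"),
    ("Tim_Berners-Lee", "directs", "W3C"),
    ("W3C", "publishes", "Web_standards"),
]


def query(
    triples: list[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


if __name__ == "__main__":
    # "What do we know about Tim Berners-Lee?"
    for triple in query(TRIPLES, s="Tim_Berners-Lee"):
        print(triple)
    # "Who publishes anything?"
    print(query(TRIPLES, p="publishes"))
```

Once data from many publishers shares this shape, the same kind of pattern query can run across all of it, which is the "giant database" idea in miniature.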

Jim Salb from the State of Delaware asked the most interesting question of the evening: "As far as the Semantic Web goes, what's your benchmark for success and how will you know you are done?"  Sir Tim answered, "I know I'll have been successful when people are doing things with the Semantic Web that I can't imagine yet."

Here's a podcast of a recent interview with Sir Tim on the Semantic Web.


