Related Sites

  • Cesar Brea's Weblog
    My original blog, hosted by the Berkman Center for Internet and Society at Harvard Law School
  • Octavianspace
    A Myspace experiment. May 2006 update: no friends after 6 months (Tom doesn't count). Maybe this isn't for me, though I haven't done much with it yet.
  • Marketspace Advisor
    News and comment on the cross-channel customer experience
  • Radio Free Brea
    My podcast station on Andrew Grumet's Gigadial service.
  • ESM Partners
    essays on high-tech strategy, sales, and marketing by me and Jamie Schein.


March 10, 2007

Clouded Vision

My new colleague Steven Forth, who is CTO of eMonitor (the content technology arm of Monitor Group), referred me last night to Many Eyes (http://services.alphaworks.ibm.com/manyeyes/home), a social data visualization and interpretation service developed by the Collaborative User Experience (CUE) Research Group at IBM's Watson Research Center. Since the intersection of social software and content analysis is currently a high-priority professional interest of mine, I decided to try it out.

In addition to its visualizations for structured data sets, Many Eyes generates tag clouds from free text files. Steven noted that the two-word view in particular seems like a very powerful 80-20 cut at inferring the predominant meaning in a body of text.
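For readers who want a feel for what a two-word view is built on, here is a minimal Python sketch of counting adjacent word pairs in free text. It is my own illustration, not Many Eyes' actual implementation; the function name `two_word_counts` and the sample text are made up for the example.

```python
import re
from collections import Counter


def two_word_counts(text, top_n=10):
    """Return the top_n most frequent adjacent word pairs in text.

    Assumption: words are runs of letters and apostrophes, compared
    case-insensitively. Pairs are allowed to span sentence boundaries,
    which a real tool might choose to prevent.
    """
    words = re.findall(r"[a-z']+", text.lower())
    pairs = zip(words, words[1:])  # each word paired with its successor
    return Counter(" ".join(pair) for pair in pairs).most_common(top_n)


sample = (
    "Drive higher sales. Drive higher margins. "
    "Social software can drive higher engagement."
)
print(two_word_counts(sample, top_n=3))
```

In a tag cloud, each pair's count would set its font size, which is why a frequently repeated phrase jumps out so quickly.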

I experimented by exporting the contents of this blog as a text file, then progressively scrubbing out of the source file the useless Typepad artifact words and HTML tags that appear frequently (like "title", "breaks", "comments", and my name). To do this I simply ran "edit/replace/'word', '[]'" in Windows Notepad for each one. I then published the file on Many Eyes. Here's the result (click on the image to manipulate the cloud on Many Eyes):


I think the two-word view does a pretty decent job of communicating the themes I write about. An unintended side benefit: it highlights recurring cliches and verbal tics I need to purge from my writing, like "drive higher" (argh).

This whole effort took about 30 minutes, from registration to pasting the syndication HTML into this post. Two-thirds of that time was spent scrubbing the data iteratively. This could have gone faster in one of two ways. First, Many Eyes could provide a custom scrubbing interface where I could register multiple words to be eliminated from, or replaced in, a text file. Second, and better, they could allow users to share not only comments but also scrubbing filters that would apply to data sets coming from common sources with common problems, such as Typepad exports or government information.
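To make the "scrubbing filter" idea concrete, here is a rough Python sketch of a reusable word list applied in a single pass, instead of one Notepad replace per word. Everything in it is hypothetical: `TYPEPAD_FILTER` and `scrub` are names I invented, and the word list contains only the examples mentioned above, not a real Typepad filter.

```python
import re

# Hypothetical shareable filter: just the example artifact words from this post.
TYPEPAD_FILTER = ["title", "breaks", "comments"]


def scrub(text, stop_words, replacement=""):
    """Replace every whole-word, case-insensitive occurrence of stop_words."""
    if not stop_words:
        return text
    # \b keeps "title" from matching inside a longer word such as "subtitle".
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(word) for word in stop_words) + r")\b",
        re.IGNORECASE,
    )
    scrubbed = pattern.sub(replacement, text)
    # Collapse the runs of spaces left behind by removed words.
    return re.sub(r"[ \t]{2,}", " ", scrubbed).strip()


print(scrub("title Clouded Vision comments breaks", TYPEPAD_FILTER))
```

Because the filter is just a list of words, it could be saved and shared by anyone working with exports from the same source.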

Beyond this, I can imagine a thematic matching capability: "based on two-word 'keyphrase' frequencies, this data set seems to have lots in common with these other ones..." Such a capability could be further enhanced by ex-post user ratings, so people could confirm whether any given algorithmically suggested match was actually good, a la "was this useful to you?" This, like the "Graphic Friendships" idea I wrote about a while back, could help make the web browsing experience more productive.
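One way such matching might work, sketched in Python under my own assumptions: treat each data set's two-word counts as a vector and rank other data sets by cosine similarity. Cosine similarity is my choice for the illustration, not something Many Eyes offers, and the names `keyphrases`, `similarity`, and `suggest_matches` are invented here.

```python
import math
import re
from collections import Counter


def keyphrases(text):
    """Count adjacent word pairs (two-word 'keyphrases') in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(zip(words, words[1:]))


def similarity(a, b):
    """Cosine similarity between two keyphrase Counters, from 0.0 to 1.0."""
    dot = sum(count * b[phrase] for phrase, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty data set matches nothing
    return dot / (norm_a * norm_b)


def suggest_matches(target_text, other_texts, top_n=3):
    """Rank named texts by keyphrase similarity to target_text, best first."""
    target = keyphrases(target_text)
    scored = [
        (name, similarity(target, keyphrases(text)))
        for name, text in other_texts.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


candidates = {
    "a": "notes on social software",
    "b": "cooking with garlic",
}
print(suggest_matches("social software and content analysis", candidates))
```

The user ratings described above could then be stored alongside each suggested pair and used to promote or demote matches that the raw score gets wrong.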

Nice job, guys!
