Most folks are familiar with centralized data - information kept in a central location, in a proprietary format, and possibly within a walled garden. We explored implementations of decentralized data at Supernova 2006 - specifically, data which is portable and follows the user, rather than the service, and is often shared via a hosting service.
Kevin Lawver demonstrated AIM Pages, AOL's experiment for the open web. The interface lets ordinary users create microformat-structured pages that function like a web service. He was upbeat about microformats because they give AOL these benefits as a service provider:
- Easy development and testing
- Indexing and aggregation
- "Opportunity" for interoperability (a funny caveat to make)
The AIM Page buddy gallery can be viewed as XML, so you can use any XML or DOM parser to take the data and mash it into something else.
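To make the "any XML parser" point concrete, here is a minimal Python sketch. The sample markup is hand-written for illustration and is an assumption: it uses the `vcard`, `fn`, and `url` class names from the hCard microformat, but the real AIM Pages output may be structured differently. The helper names `has_class` and `extract_buddies` are hypothetical too.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample; real AIM Pages markup may differ.
SAMPLE = """<div class="buddy-gallery">
  <div class="vcard">
    <a class="fn url" href="http://example.com/alice">Alice</a>
  </div>
  <div class="vcard">
    <a class="fn url" href="http://example.com/bob">Bob</a>
  </div>
</div>"""


def has_class(element, name):
    """True if `name` is one of the element's space-separated class values."""
    return name in element.get("class", "").split()


def extract_buddies(xml_text):
    """Return one {name, url} dict per hCard-style block in the document."""
    root = ET.fromstring(xml_text)
    buddies = []
    for card in root.iter():
        if not has_class(card, "vcard"):
            continue
        # Take the first element marked "fn" (formatted name) inside the card.
        for node in card.iter():
            if has_class(node, "fn"):
                buddies.append({
                    "name": (node.text or "").strip(),
                    "url": node.get("href"),
                })
                break
    return buddies


if __name__ == "__main__":
    for buddy in extract_buddies(SAMPLE):
        print(buddy["name"], buddy["url"])
```

Once the data is in plain dictionaries like this, turning it into a feed, a map, or some other mashup is ordinary programming rather than screen scraping.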
Rohit Khare - nominally of CommerceNet, but wearing his Microformats.org hat - demonstrated the Microsear.ch service as a database tool applied against CommerceNet data. Microformats plus Atom feeds are another new way to provide services.
EVDB is trying to "maximize event discovery" by building a thriving web ecosystem around events, and has over one million events microformatted. (Disclosure note: Omidyar Network is a funder of EVDB.) Refreshingly, Brian Dear's discussion included some of the problems with distributed data, at least as it applies to events:
- Uneven microformat quality.
- Getting enough information to be useful.
- Working out an analog for a ping server.
- Establishing authoritative sources and registries.
There's still more to do when it comes to the quality of the data being microformatted. Many pages don't have any microformatted data; others have very low-definition event data. As Dear puts it, it's the kind of information that you would put onto your refrigerator door, not what you would put into a useful event listing (name, description, date, time, location, venue, etc.). In order for microformats to be successful, software developers must incorporate automatic structuring of embedded data. We've got more careful thinking to do on data structures, types, and registries/repositories.
Matt Kaufman walked through how Edgeio goes through RSS feeds to pull out classifieds data. Edgeio pulls the details as listed, but then also goes through the listing to pull out appropriate tags. (There's more to their total offering, but this is what was most relevant to the conversation.) The toughest part has been creating a standard tagsonomy given the many ways someone might describe the same couch. Moving forward, Edgeio plans to implement hListing in several phases.
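The couch problem can be sketched in a few lines of Python. This is not Edgeio's actual method, which wasn't shown in that detail. The feed, the `SYNONYMS` table, and the function names are all hypothetical, and a real tagsonomy would need far more than a lookup table.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, for illustration only.
FEED = """<rss version="2.0"><channel>
  <title>My blog</title>
  <item>
    <title>Sofa for sale</title>
    <description>Comfy blue loveseat, barely used.</description>
    <category>forsale</category>
  </item>
</channel></rss>"""

# Hypothetical synonym table mapping free-text words to one canonical tag.
SYNONYMS = {"sofa": "couch", "loveseat": "couch", "settee": "couch"}


def normalize_tags(text):
    """Map known words in free text to canonical tags, de-duplicated and sorted."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({SYNONYMS[w] for w in words if w in SYNONYMS})


def extract_listings(feed_xml):
    """Return each feed item's title plus its explicit and derived tags."""
    root = ET.fromstring(feed_xml)
    listings = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        # Tags the publisher supplied directly via <category> elements.
        explicit = [c.text.strip() for c in item.findall("category") if c.text]
        # Tags inferred from the wording of the listing itself.
        derived = normalize_tags(title + " " + description)
        listings.append({
            "title": title,
            "tags": sorted(set(explicit) | set(derived)),
        })
    return listings


if __name__ == "__main__":
    for listing in extract_listings(FEED):
        print(listing["title"], listing["tags"])
```

Here "sofa" and "loveseat" both collapse to the single tag `couch`, which is the easy case. Deciding which words deserve a canonical tag in the first place is the hard part Kaufman described.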
What most sellers find compelling about Edgeio is that it allows publishers to have their listings redistributed throughout the Internet. For example, if I blog about selling my car, list my couch for sale on Craigslist, and post my old jewelry on eBay, I could unite all of those listings on my blog.
Matt Augustine of Microsoft talked about two-way synchronization using RSS and made the bold move of demonstrating it in Firefox. A neat twist in their source reveals the item, the conflict resolution on item descriptions, and the history of changes. In Outlook, synchronizing feeds would make it easy to pick a mutually good time on the calendar. (This is still a pain in the arse, regardless of the system that one uses.) Unfortunately for Andy Baio, Augustine hijacked the Upcoming.org interface to walk through his demo.
I stopped feeling sorry for Baio once he demonstrated the brand new microformatting that's now behind Yahoo! Local. This product announcement was cool enough to break out into a separate post. It's a bold move by Yahoo! that will heat up the competition for local information and search, and ideally will drive standards around decentralized data.
These applications are exciting, but they do bring up a number of issues:
- Rights management for all of this decentralized data. AOL's interface gives its users a choice of licenses when publishing data via their page feeds, but there certainly isn't a standard yet.
- Control over event listings. How do you keep the Green Party from cancelling a Democratic Party fundraiser, or vice-versa?
- Ensuring that content ownership remains with the content creator, rather than the service provider. Terms of service often require content owners to relinquish some rights, and most users click right through them.
This gets to the major stumbling block for participatory media in general - the lack of an established infrastructure of processes, platforms, and tools that is designed to support content being created and shared by the many, rather than a few producers and distributors. It isn't just about where the data sits; it's about who owns it.
Tags: christine herron christine.net space jockeys supernova supernova2006 evdb edgeio microformats microsearch yahoo aimpages microsoft technology