The challenge of open data standards is working its way back to the forefront of attention. At FOO, a group of us sat down to discuss openly owned repositories of data, and how we could gather momentum around a summit-style dialogue to dissect the issue and start work on a balanced solution. There is an unnerving variety of data that consumers should own, or at least have access to - school records, medical records, etc. It's not just about Flickr streams. We need both an open standard data format and an open standard for how to retrieve that data.
Standards for sharing information will drive much debate. Does user access mean that service providers need to export full datasets, which could reach terabytes? For my Flickr photos, do I just get back my uploaded pictures, or do I also get the comments from other users? What about the comments that I made on someone else's photo - do I get them back with or without the context of the photos that I commented on but don't have the rights to? There's also a need for transparency in what companies are doing with user data. Ideally, the terms of service on these sites would draw a clear line on who owns what and how companies can use my information, but it's doubtful that existing T&Cs anticipated open data dynamics.
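To make the scoping questions above concrete, here is a minimal Python sketch of how an export's boundaries might be expressed as explicit options. Everything in it is hypothetical: the `Photo` and `Comment` records, the `export_user_data` function, and its two flags are illustrative assumptions, not any real service's API or a proposed standard.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Photo:
    photo_id: str
    owner: str
    title: str


@dataclass(frozen=True)
class Comment:
    comment_id: str
    author: str
    photo_id: str
    text: str


def export_user_data(user, photos, comments, *,
                     include_comments_on_my_photos=False,
                     include_context=False):
    """Build a hypothetical export for `user`.

    The two flags correspond to the open questions: whether other
    people's comments on my photos come along, and whether my comments
    on someone else's photo carry any context about that photo.
    """
    photos_by_id = {p.photo_id: p for p in photos}
    my_photo_ids = {p.photo_id for p in photos if p.owner == user}

    export = {
        "user": user,
        "photos": [asdict(p) for p in photos if p.owner == user],
        "comments_by_me": [],
        "comments_on_my_photos": [],
    }

    for c in comments:
        if c.author == user:
            entry = asdict(c)
            photo = photos_by_id.get(c.photo_id)
            # Context is metadata only (owner and title), never the
            # photo itself, since the user has no rights to it.
            if include_context and photo is not None and photo.owner != user:
                entry["context"] = {"owner": photo.owner, "title": photo.title}
            export["comments_by_me"].append(entry)
        elif include_comments_on_my_photos and c.photo_id in my_photo_ids:
            export["comments_on_my_photos"].append(asdict(c))

    return export
```

With both flags left off, the user gets back only their own uploads and the bare text of their own comments; turning them on widens the export. Which defaults are right is exactly the kind of question a standard would have to settle.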
This upcoming summit - targeted for October in the Bay Area - will need creative people who have a substantial stake in repositories. Who should be part of the conversation? Some initial ideas:
- Organizations that can bring use cases. Holders of big data repositories such as Kaiser, Stanford Health, the Department of Homeland Security, MIT Admissions.
- Privacy/attention/rights organizations. Nonprofit groups that are working on issues of privacy and ownership of consumer attention or content, and that could bring an advocate's perspective - AttentionTrust, Creative Commons, EFF, EPIC, TRUSTe. (Disclosure note: Omidyar Network is a funder of AttentionTrust, Creative Commons, EFF, and EPIC.)
- User delegates. How can we best ensure inclusivity? Perhaps potential participants could submit essays or position papers, which could serve as an appropriate barrier to entry while ensuring that participants are serious about the issue.
Other open questions that we expect to address:
- If you have to pay to get your data, is it still open?
- Is it possible to come up with a potential solution this year? At what point should a straw man be vetted?
- Is there more demand for an open data solution outside of the US? (There are much stricter regulations in Europe on privacy, data protection, and ownership of personal information.)
The general plan for this summit is to spend a day on the problem statement, trying to mark out as many angles as possible. On the second (and third?) day, we would try to put together a straw man of a solution. It looks likely that some of this straw man will come from the work of Julian Cash and Cliff Skolnick. What ideas do you have? Please comment if you have suggestions for companies and participants for this dialogue.