What Does 'Content As Data' Mean, Anyway?

Editor’s Note: Jim Wilson is Senior Producer of www.nasa.gov and served on detail as project manager for the .Gov Reform Task Force, which helped develop the Digital Government Strategy.

The newly released Digital Government Strategy calls for a fundamental shift in how government treats digital information, taking an information-centric approach and treating all content as data. But what exactly does “content as data” mean?

Many of us probably hear the word “data” and think of a long list of facts and figures or maybe that android guy from “Star Trek.” But here we’re talking about “structured data,” which is just a fancy way of saying we build our content so that it can be read by machines as well as humans. If you’ve ever subscribed to a podcast or an RSS news feed, you were using this idea.

So what would this look like, and how could it help the government reach citizens? Well, we already know, because we’re already doing it.

Here at NASA, we consume a lot of our own data. This means that we structure things like press releases, images and videos so they can be syndicated across the site and subscribed to by users. Our entire collection of videos on www.nasa.gov/video is presented this way. Anytime a new video is published, it gets added to the appropriate feed, which automatically gets pulled into the appropriate pages, so no one has to add it to each page manually and publish multiple times.
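To make the idea concrete, here is a minimal sketch of how a page might pull its items from a feed. It is not NASA's actual setup: the feed contents, the example.gov URLs, and the function names read_items and items_for_page are all made up for illustration, and the sketch assumes a plain RSS 2.0 feed with one category per item.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; titles, links and categories are invented.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Video Feed</title>
    <item>
      <title>Launch highlights</title>
      <link>https://example.gov/video/launch</link>
      <category>shuttle</category>
    </item>
    <item>
      <title>Station tour</title>
      <link>https://example.gov/video/tour</link>
      <category>station</category>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Parse RSS text into a list of plain dictionaries."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "category": item.findtext("category"),
        }
        for item in root.iter("item")
    ]


def items_for_page(items, category):
    """Select the feed items that belong on a page for one category."""
    return [entry for entry in items if entry["category"] == category]


if __name__ == "__main__":
    # A "shuttle" page lists only the shuttle items from the shared feed.
    for entry in items_for_page(read_items(SAMPLE_FEED), "shuttle"):
        print(entry["title"], entry["link"])
```

The point of the sketch is that publishing a new item means adding one entry to the feed; every page that reads the feed picks it up the next time it runs.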

NASA.gov homepage using dynamic feeds to provide updates on STS-135 mission. 

Our most high-profile use of this technique came during the final flight of the space shuttle in July 2011, when we set up a special version of the NASA.gov homepage with multiple data feeds from around the country. At any given time, users could see a real-time text stream of mission updates, as well as the latest videos and photos from the mission. This information could be updated from multiple sources in Florida, Texas, Washington, DC and elsewhere. And the homepage itself rarely had to be touched. Simply by setting up the feeds, NASA let all the data flow into a single, easily accessible place. The page was tweeted more than 8,000 times and got more than 42,000 likes on Facebook.

Another example of this approach is the “create once, publish everywhere” mindset used by the Centers for Disease Control and Prevention to liberate its content from presentation. In other words, you don’t have to go to their webpage and look at the information in their look-and-feel. You can get the information directly from them in lots of different places, on your terms.

So how does that work?

CDC’s content is structured and tagged so that it can be syndicated and consumed both within the agency and externally. Within its own channels, content is updated once, then easily displayed on the main CDC.gov web site, the mobile site at m.cdc.gov, and in the various modules of the CDC mobile app. In 2011, CDC’s liberated content was syndicated to 700 registered partners in all 50 US states, the District of Columbia and 15 countries and accounted for an additional 1.2 million page views.
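A rough sketch of "create once, publish everywhere" might look like the following. This is not CDC's system: the article record, its field names, and the three rendering functions are hypothetical, and the sketch simply assumes that content is stored once as tagged fields and that each channel formats those same fields its own way.

```python
import json
from html import escape

# Hypothetical content record, written once and stored as tagged fields.
article = {
    "title": "Flu season tips",
    "summary": "Wash hands & get vaccinated.",
    "tags": ["flu", "prevention"],
}


def to_html(item):
    """Render the record for a desktop web page, escaping special characters."""
    return (
        f"<article><h1>{escape(item['title'])}</h1>"
        f"<p>{escape(item['summary'])}</p></article>"
    )


def to_mobile_text(item, limit=40):
    """Render a short plain-text line for a small screen, truncating if needed."""
    text = f"{item['title']}: {item['summary']}"
    return text if len(text) <= limit else text[: limit - 3] + "..."


def to_syndication_json(item):
    """Render the record as JSON that a partner site could consume."""
    return json.dumps(item, sort_keys=True)


if __name__ == "__main__":
    print(to_html(article))
    print(to_mobile_text(article))
    print(to_syndication_json(article))
```

Editing the summary in the one record changes all three outputs at once, which is the whole appeal: nobody has to update the web page, the mobile view and the partner copy separately.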

Opening government information in this way has huge potential. Citizens can take content and mash it up to fit their unique needs, while private sector firms can develop government data-driven apps that the government couldn’t build on its own.

We want our users to get the content however they want. If it’s good for them to come to our site, great. But if they can get the information they need some other way, just as great. Our mission, after all, isn’t driving up page views to our homepage; it’s providing the best digital information and services to citizens.