Google searches on public data

Google launched its search service for public data in late April.  It didn’t get much attention before it was overshadowed by the publicity surrounding the mid-May launch of Wolfram|Alpha.  But it’s definitely worth a look.   Here’s a quick comparison of the two.

Like Alpha, Google’s public data search returns graphs displaying data in response to a search query.  Like Alpha, it cites the sources for its data.  Like Alpha, it only knows what it knows.  While Alpha will return a polite error message when you ask about data it doesn’t have ("Wolfram|Alpha isn’t sure what to do with your input"), Google defaults to the results of an ordinary Google search on your search string.

Unlike Alpha, when Google returns a graph, the graph is interactive, giving you the option to add or subtract lines of relevant data.  For example, if you search for "unemployment rate USA", you’ll get a chart as the first hit on the return list, and an ordinary hit list below it.  If you click on the chart, you’ll have options to superimpose on the US curve the unemployment curves for any state or combination of states.  If you expand the outline under a state’s name in the left sidebar, you’ll have the same options to view the unemployment curves for any county or combination of counties. 

Like Alpha, when Google public data has answers, it’s very useful.  When it doesn’t, we can only hope that Google keeps adding new datasets.

Also see Google’s help page on public data search, and its page on how to add new open datasets to the service.

Don’t confuse the public data search service with other recent data-related Google innovations such as Google Squared, Rich Snippets, Wonder Wheel, and Timeline (which shouldn’t be confused with Google’s News Timeline).  For a good review of the latter cluster of innovations, see Laura Gordon-Murnane’s article in the new Information Today.

Comment.  Wolfram|Alpha and Google are both proving that making datasets OA enables third parties to amplify their utility.  Wolfram and Google are certainly not the first to do so, but they’re among the most conspicuous and influential.  The lesson:  If you have a dataset you’re willing to make OA, then make it OA.  If you don’t know of free online tools to make the data queryable, interactive, or visual, don’t wait for someone to develop them.  Just make the file OA and let other people work on that side of things.  For years now we’ve had this situation with texts:  if you make a text freely available online, others will find it, use it, crawl it, and at the very least improve its discoverability.  One reason to be excited:  We’re entering that age for data files.  Another reason:  the enhancements possible for data files are much richer than those possible for text files.