Google has launched a new search engine to help the scientific community make sense of the millions of datasets available online. Called Dataset Search, the service aims to help scientists, data journalists and geeks find the data they need for their work and stories, or simply to satisfy their intellectual curiosity.
The new search engine will work like Google Scholar, the company’s popular search engine for academic studies and reports. “Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher’s site, a digital library, or an author’s personal web page,” Natasha Noy, Research Scientist, Google AI, said in a blog post.
To create Dataset Search, Google developed guidelines that dataset providers can follow to describe their data in a way that helps Google (and other search engines) better understand the content of their pages. “These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data,” Noy said.
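For readers curious what such a description might look like in practice, here is a minimal sketch in Python. It assumes the provider publishes the metadata as schema.org `Dataset` markup in JSON-LD, the vocabulary Google's guidelines for dataset providers are built on. The function names, the example values and the choice of `measurementTechnique` to capture "how the data was collected" are illustrative assumptions, not something taken from Google's announcement.

```python
import json


def build_dataset_metadata(name, description, creator, date_published,
                           collection_method, license_url):
    """Return a schema.org Dataset description as a plain dict.

    The fields mirror the items Noy lists: who created the dataset, when it
    was published, how it was collected and the terms of use. The property
    mapping is an illustrative assumption.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "creator": {"@type": "Organization", "name": creator},
        "datePublished": date_published,
        # Assumption: "how the data was collected" mapped to this property.
        "measurementTechnique": collection_method,
        "license": license_url,
    }


def to_script_tag(metadata):
    """Wrap the metadata in a JSON-LD script tag for embedding in a web page."""
    body = json.dumps(metadata, indent=2)
    # Escape "</" so a stray "</script>" inside a value cannot end the tag early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    # Hypothetical dataset, for illustration only.
    example = build_dataset_metadata(
        name="Example river temperature readings",
        description="Daily water temperature readings from a hypothetical river survey.",
        creator="Example Research Group",
        date_published="2018-09-05",
        collection_method="Automated sensor readings taken once a day",
        license_url="https://creativecommons.org/licenses/by/4.0/",
    )
    print(to_script_tag(example))
```

A provider would place the resulting tag in the HTML of the page that hosts the dataset, where a crawler can read it alongside the human-readable description.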
Google then collects and links this information, works out where different versions of the same dataset might be hosted, and finds publications that may describe or discuss the dataset. “We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem,” said Google.
Users can find references to most datasets in the environmental and social sciences, as well as data from other disciplines, including government data and data provided by news organisations such as ProPublica. Dataset Search works in multiple languages, with support for additional languages coming soon, Google said.