LLMs are used to understand the query, while the results come straight from Data Commons and include a link to the original data source; the output is therefore not generated by the LLM. This approach allows Data Commons to avoid some of the known limitations of LLMs around factuality.
Data Commons does not collect or own any data; instead, it draws on publicly available data from over 200 sources, covering thousands of data sets including demographics, economics, education, housing, public health, climate, sustainability, and biomedicine. There’s data from 194 countries, in some countries down to the state or county level. However, the data accessible so far is neither evenly distributed nor complete. Unfortunately, data availability reflects many of the same equity challenges the world faces on other issues, so we currently have more data for the US, India, and OECD countries than for countries in Africa, South America, and parts of Asia. Ongoing work is needed to make additional and up-to-date data available. We hope more public data will be published to help fill the gaps, and we seek to add more categories of data that are useful for better understanding the world and for enabling those working to tackle pressing societal challenges. We are actively looking for additional data and partners to help fill in some of these gaps.
Data Commons is open source, open process, and accessible to all. In addition to the Data Commons site, a subset of data points from Data Commons is used in responses to queries in Google Search. We are also partnering with organizations that are using Data Commons to tackle society’s challenges. The result is a growing ecosystem that allows groups like Resources for the Future, Feeding America, IIT Madras’ Robert Bosch Centre for Data Science and Artificial Intelligence, Stanford Doerr School of Sustainability, and Harvard University’s Institute for Quantitative Social Science to have their own versions of Data Commons, giving each organization a unified view of its own data alongside all the public data already accessible via Data Commons.
Marnie Webb, Chief Community Impact Officer for TechSoup, a longtime Google partner, shared how Data Commons can also be helpful to the smaller nonprofits her organization works with: “Data Commons gives grassroots organizations access to the data they need. It gives them the tools to ask questions about the needs in their community in the language they would use to ask a colleague a question, and to get reliable information in return, as if they had data scientists and data engineers on staff. What we’re talking about is the democratization of information for better decision making, so that organizations can take smart risks to better serve their communities. We’re talking about putting the power of data into the hands of those who know their communities best.”
With funding from Google.org, TechSoup is helping nonprofits harness the power of Data Commons to assess and address societal challenges. For example, Cemefi is highlighting the intersections between hunger and gender in Mexico, and Makaia is tracking economic and social growth in Colombia. TechSoup is also illustrating the relationship between food security, farming, and climate change by bringing together data from sources like the USDA and Feeding America.