Brown researchers will share nearly $1 million from the National Science Foundation for a project to encourage big data research.

PROVIDENCE, R.I. [Brown University] — In the era of big data, there’s much that can be learned when companies, nonprofits, government agencies and other entities share datasets with university researchers. But that sharing is often complicated by privacy concerns, legal issues and other challenges.

With a new grant from the National Science Foundation, Brown University computer scientists will work with researchers from the Massachusetts Institute of Technology and Drexel University to develop a standardized, modular data-sharing license and a platform that makes data sharing easier and helps to enforce the license.

At Brown, the work will be led by computer science professors Tim Kraska and Carsten Binnig. Kraska envisions a wide variety of scenarios in which such a platform could encourage data sharing. 

“You could imagine a food delivery service, for example, might have data that a Brown researcher wants to analyze to better understand people’s food consumption habits,” Kraska said. “At the same time, the company might have an interest in sharing the data in the hope they might better understand their customer base.”

It sounds like a win-win, but there are problems. The data might contain sensitive information about customers that the company is obliged to protect. Or it might include proprietary information the company would rather not divulge.

“Addressing those issues often involves lawyers and a tedious and costly process of negotiations between the two parties,” Kraska said. “Both parties have the best intentions, but these negotiations too often fail.”

Under the grant, Kraska and his colleagues will first work to create a generalized, modular licensing agreement that covers the most common data-sharing stumbling blocks. Then the researchers will design a data platform and toolset, dubbed ShareDB, that incorporates and enforces key aspects of the licensing agreement. This would not only make it easier to reach a sharing agreement but also help assure both sides that the terms of the agreement are being followed.

“The hope is that by standardizing data-sharing agreements and providing infrastructure to better enforce the contract, we can significantly increase the chance for a university and a company to reach a successful agreement in a shorter amount of time,” Kraska said.

The project is one of 10 “Big Data Spokes” to receive funding from NSF. The three institutions involved will share the grant, which totals just under $1 million.