Washington, D.C. - A new multimillion-dollar collaboration will enable university researchers to harness the full potential of the data-rich world that characterizes all fields of science and discovery. This ambitious partnership, which includes New York University, the University of California, Berkeley and the University of Washington, will spur collaborations within and across the three campuses and other partners pursuing similar data-intensive science goals.
The new 5-year, $37.8 million initiative, with support from the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation, was announced at a meeting sponsored by the White House Office of Science and Technology Policy (OSTP) focused on developing innovative partnerships to advance technologies that support advanced data management and data analytic techniques.
At a time when the natural, mathematical, computational and social sciences are all producing data with relentlessly increasing volume, variety and velocity, capturing the full potential of a progressively data-rich world has become a daunting hurdle for both data scientists and those who use data science to advance their research.
While data science is already contributing to scientific discovery, substantial systemic challenges need to be overcome to maximize its impact on academic research.
To overcome these challenges, this effort seeks to achieve three core goals:
"Dramatic expansion in the scale of data collection, analysis and dissemination could revolutionize the speed and volume of discovery," said Chris Mentzel, Moore's Data-Driven Discovery program officer. "However, success ultimately depends on the individuals and teams that combine domain expertise with computational, statistical and mathematical skills, what we are calling 'data science.'"
"It's been hard to establish these essential roles as durable and attractive career paths in academic research," explained Josh Greenberg, who directs the Sloan Foundation's Digital Information Technology program. "This joint project will work to create examples at the three universities that demonstrate how an institution-wide commitment to data scientists can deliver dramatic gains in scientific productivity."
The initiative will tap leading researchers at their respective institutions and some of the best minds in science and academia. Faculty leads include:
The three leaders believe universities are uniquely positioned to empower researchers to harness the deluge of valuable, heterogeneous, and noisy data continuing to come their way and help navigate the flood of software analysis tools and approaches that are often incompatible, hard to learn or poorly written by brilliant scientists trying to get their job done.
"As someone whose research depends on the fluent use of data," said Saul Perlmutter, lead faculty member at the University of California, Berkeley, "I'm excited that we now have an opportunity to identify the typical data-science barriers, little and big, that slow our progress, and to see which could be mitigated or, occasionally, just plain solved!"
"We must build on our existing efforts that leverage existing industry tools, generate new working tools and practices and support the multi-disciplinary experts who develop new approaches and tools needed to fill gaps," said Ed Lazowska, faculty lead at the University of Washington. "Working together, we believe we're going to shift the culture at our universities and help accelerate broader uptake for supporting data-intensive discovery."
"With the onslaught of data, much of the knowledge in the world is going to be extracted by machines," said Yann LeCun, faculty lead at New York University. "Universities must find new ways to advance data-science methodologies while facilitating the use of new methods and tools by researchers from every field. Universities also have an opportunity to train new generations of researchers in data-driven science."
Each of the three universities will contribute additional resources to the investment made by the Moore and Sloan foundations, including new faculty positions, physical space on campus and research support.
Each of the partner universities has distinguished itself in recent years by pioneering new approaches to discovery in fields as diverse as astronomy, biology, oceanography, and sociology through deep collaborations between researchers in these fields and researchers in data science methodology fields such as computer science, statistics and applied mathematics.
This new partnership, a coordinated, distributed experiment involving researchers at these leading universities, hopes to establish models that will dramatically accelerate this data science revolution by addressing several specific challenges.
Cross-university teams will organize their efforts around six primary areas: strengthening an ecosystem of tools and software environments, establishing academic careers for data scientists, championing education and training in data science at all levels, promoting and facilitating efforts that are accessible and reproducible, creating physical and intellectual hubs for data science activities, and identifying the scientists' data-science bottlenecks and needs through directed ethnography.
This partnership will connect with others, practice open science and share lessons along the way.
Contact: Genny Biggs
Gordon & Betty Moore Foundation