NIH Puts Big Bucks in Big Data

Oct. 21, 2014

The National Institutes of Health (NIH) announced grants to develop new strategies to analyze and leverage the explosion of complex biomedical data sets, often referred to as Big Data. NIH says its multi-institute awards constitute an initial investment of nearly $32 million in fiscal year 2014 by its Big Data to Knowledge (BD2K) initiative, which is projected to have a total investment of nearly $656 million through 2020, pending available funds.

According to NIH, with the advent of transformative technologies for biomedical research, such as DNA sequencing and imaging, biomedical data generation is exceeding researchers’ ability to capitalize on the data. The BD2K awards will support the development of new approaches, software, tools and training programs to improve access to these data and the ability to make new discoveries using them.

“Data creation in today’s research is exponentially more rapid than anything we anticipated even a decade ago,” said NIH Director Francis S. Collins, M.D., Ph.D. “Mammoth data sets are emerging at an accelerated pace in today’s biomedical research and these funds will help us overcome the obstacles to maximizing their utility. The potential of these data, when used effectively, is quite astounding.”

The funding will establish 12 centers that will each tackle specific data science challenges. The awards will also provide support for a consortium to cultivate a scientific community-based approach to the development of a data discovery index, and for data science training and workforce development.

“The future of biomedical research is about assimilating data across biological scales from molecules to populations,” said Philip E. Bourne, Ph.D., NIH associate director for data science. “As such, the health of each one of us is a big data problem. Ensuring that we are getting the most out of the research data that we fund is a high priority for NIH.”

NIH says that challenges in making the best use of such biomedical information include problems of locating data and the appropriate software tools to access and analyze them, lack of data standards for many types of data, and the low adoption of data standards across the research community. There is also a need for new policies to facilitate data sharing while protecting privacy.

The four main components of the new BD2K awards are:

1) Centers of Excellence for Big Data Computing. These 11 centers will develop innovative approaches, methods, software, tools and other resources. While the development efforts will focus on specific research questions, their output is expected to be more generally relevant to various aspects of big data science.

2) BD2K-LINCS Perturbation Data Coordination and Integration Center. This center will be a data coordination center for the NIH Common Fund’s Library of Integrated Network-based Cellular Signatures (LINCS) program, which aims to characterize how a variety of types of cells, tissues and networks respond to disruption by drugs and other factors.

3) BD2K Data Discovery Index Coordination Consortium (DDICC). This program will create a consortium to begin community-based development of a biomedical data discovery index that will enable discovery, access and citation of biomedical research data sets.

4) Training and Workforce Development. These awards support the education and training of current and future generations of researchers who will specialize in data science fields, as well as those whose work may require some expertise in using or generating large amounts of data and data resources.

For more information about the recipients of the new grants, please visit