Abstract

Access to experimental X-ray diffraction image data is fundamental for validation and reproduction of macromolecular models and indispensable for development of structural biology processing methods. Here, we established a diffraction data publication and dissemination system, Structural Biology Data Grid (SBDG; data.sbgrid.org), to preserve primary experimental data sets that support scientific publications. Data sets are accessible to researchers through a community-driven data grid, which facilitates global data access. Our analysis of a pilot collection of crystallographic data sets demonstrates that the information archived by SBDG is sufficient to reprocess data to statistics that meet or exceed the quality of the original published structures. SBDG has extended its services to the entire community and is used to develop support for other types of biomedical data sets. It is anticipated that access to the experimental data sets will enhance the paradigm shift in the community towards a much more dynamic body of continuously improving data analysis.

Document Type

Article

Publication Date

3-7-2016

Notes/Citation Information

Published in Nature Communications, v. 7, article no. 10882, p. 1-12.

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Due to the large number of authors, only the first 30 and the authors affiliated with the University of Kentucky are listed in the author section above. For the complete list of authors, please download this article.

Digital Object Identifier (DOI)

https://doi.org/10.1038/ncomms10882

Funding Information

Development of the Structural Biology Data Grid is funded by The Leona M. and Harry B. Helmsley Charitable Trust 2016PG-BRI002 to PS and MC. Development of citation workflows is supported by NSF 1448069 (to PS). DAA is being developed as a pilot project of the National Data Service, with additional funds to support storage and technology development, including NIH P41 GM103403 (NE-CAT) and 1S10RR028832 (HMS) and DOE DE-AC02-06CH11357; NIH 1U54EB020406-01, Big Data for Discovery Science Center; and NIST 60NANB15D077 (Globus Project). AB acknowledges Ariel Chaparro for assistance with the DAA setup (Inst Pasteur Montevideo). Collections of pilot data sets were supported by various grants (see Supplementary Table 1).

ncomms10882-s1.pdf (43 kB)
Supplementary Information: Supplementary Table 1.