Background: The Biological Magnetic Resonance Data Bank (BMRB) is a public repository of Nuclear Magnetic Resonance (NMR) spectroscopic data of biological macromolecules. It is an important resource for many researchers using NMR to study structural, biophysical, and biochemical properties of biological macromolecules. It is primarily maintained and accessed in a flat file ASCII format known as NMR-STAR. While the format is human readable, the size of most BMRB entries makes computer readability and explicit representation a practical requirement for almost any rigorous systematic analysis.
Results: We tested the nmrstarlib library on all current BMRB entries: every entry is parsed without error in both NMR-STAR version 2.1 and version 3.1 formats. We also compared our software to three currently available Python libraries for parsing NMR-STAR formatted files: PyStarLib, NMRPyStar, and PyNMRSTAR.
Conclusions: The nmrstarlib package is a simple, fast, and efficient library for accessing data from the BMRB. The library provides an intuitive dictionary-based interface with which Python programs can read, edit, and write NMR-STAR formatted files and their equivalent JSONized NMR-STAR files. The nmrstarlib package can be used as a library for accessing and manipulating data stored in NMR-STAR files and as a command-line tool to convert from NMR-STAR file format into its equivalent JSON file format and vice versa, and to visualize chemical shift values. Furthermore, the nmrstarlib implementation provides a guide for effectively JSONizing other older scientific formats, improving the FAIRness of data in these formats.
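To make the idea of a dictionary-based representation and its JSON equivalent concrete, the following is a minimal sketch that uses only the Python standard library. It does not use or reproduce the nmrstarlib API: the function names (build_entry, to_json, from_json), the nested layout, and the chemical shift values are all hypothetical and chosen purely for illustration.

```python
import json
from collections import OrderedDict


def build_entry():
    """Build a small, hand-made stand-in for a parsed NMR-STAR entry.

    The layout (an ordered dictionary of saveframes, each holding tags
    and one loop) is an assumption for illustration, not the layout
    that nmrstarlib itself produces.
    """
    saveframe = OrderedDict()
    saveframe["Entry.ID"] = "example"
    saveframe["Entry.Title"] = "Hypothetical entry"
    saveframe["loop_0"] = OrderedDict([
        ("fields", ["Comp_ID", "Atom_ID", "Val"]),
        # Values are kept as text, as they would appear in a flat file.
        ("rows", [["ALA", "H", "8.25"], ["ALA", "CA", "52.50"]]),
    ])

    entry = OrderedDict()
    entry["data"] = "example"
    entry["save_entry_information"] = saveframe
    return entry


def to_json(entry):
    """Serialize the nested dictionary to a JSON string."""
    return json.dumps(entry, indent=2)


def from_json(text):
    """Rebuild the nested dictionary, preserving key order."""
    return json.loads(text, object_pairs_hook=OrderedDict)


entry = build_entry()
restored = from_json(to_json(entry))

# Reading and editing are ordinary dictionary operations.
title = restored["save_entry_information"]["Entry.Title"]
restored["save_entry_information"]["Entry.Title"] = "Edited title"
```

Because the JSON text carries the same keys in the same order, the round trip is lossless for this structure, which is the property that makes a JSONized form a practical interchange format for older flat-file data.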
This work was supported by National Science Foundation grant NSF 1252893 (Hunter N. B. Moseley); however, the funder played no role in the design or conclusions of this study.
The nmrstarlib package is available under the MIT license from http://software.cesb.uky.edu, GitHub (https://github.com/MoseleyBioinformaticsLab/nmrstarlib), and PyPI (https://pypi.python.org/pypi/nmrstarlib). Project documentation is available online at ReadTheDocs (http://nmrstarlib.readthedocs.io/) and as a PDF file (Additional file 3). A profile of nmrstarlib execution (Additional file 2) and the full function call diagram (Additional file 1) are also available.
Requirements: Python 2.7 or 3.4+; the docopt Python library for command-line interface functionality; the graphviz Python library for chemical shift visualization functionality.
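As a rough illustration, the requirements above could be checked before use with a short script like the one below. This is a sketch under stated assumptions: check_requirements is a hypothetical helper that is not part of nmrstarlib, and it relies on importlib.util.find_spec, so the script itself needs Python 3.4 or later to run.

```python
import importlib.util
import sys


def check_requirements(optional_modules=("docopt", "graphviz")):
    """Report whether the interpreter and optional libraries are usable.

    Hypothetical helper, not part of nmrstarlib. Returns a tuple of
    (python_ok, available), where available maps each module name to
    True if it can be imported in the current environment.
    """
    version = sys.version_info
    # Supported interpreters per the stated requirements: 2.7 or 3.4+.
    python_ok = version[:2] == (2, 7) or version >= (3, 4)
    available = {
        name: importlib.util.find_spec(name) is not None
        for name in optional_modules
    }
    return python_ok, available


python_ok, available = check_requirements()
for name, found in sorted(available.items()):
    print("{}: {}".format(name, "found" if found else "missing"))
```

A missing optional library only affects the corresponding feature (command-line interface or chemical shift visualization), so a script like this would report rather than abort.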
All NMR-STAR datasets analyzed in this manuscript are available from the Biological Magnetic Resonance Data Bank at http://www.bmrb.wisc.edu/.
Smelter, Andrey; Astra, Morgan; and Moseley, Hunter N. B., "A Fast and Efficient Python Library for Interfacing with the Biological Magnetic Resonance Data Bank" (2017). Center for Environmental and Systems Biochemistry Faculty Publications. 1.
Additional file 1: Function call diagram of nmrstarlib.
12859_2017_1580_MOESM2_ESM.txt (27 kB)
Additional file 2: Profile of nmrstarlib execution.
12859_2017_1580_MOESM3_ESM.pdf (256 kB)
Additional file 3: Documentation for nmrstarlib.
12859_2017_1580_MOESM4_ESM.json (1 kB)
Additional file 4: List of failed NMR-STAR 2.1 files for PyStarLib.
12859_2017_1580_MOESM5_ESM.json (8 kB)
Additional file 5: List of failed NMR-STAR 3.1 files for PyStarLib.
12859_2017_1580_MOESM6_ESM.txt (16 kB)
Additional file 6: Fragments of failed NMR-STAR 2.1 files for PyStarLib.
12859_2017_1580_MOESM7_ESM.txt (149 kB)
Additional file 7: Fragments of failed NMR-STAR 3.1 files for PyStarLib.
12859_2017_1580_MOESM8_ESM.txt (1 kB)
Additional file 8: Output of C++ example from Fig. 9.