Like energetic children, computational scientists need large spaces to play. This play space, however, doesn’t require a large backyard but rather a high-performance computing (HPC) ecosystem capable of accommodating the growing data demands of world-class science.
The US Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL) offers scientists just such a comprehensive environment for computing- and data-intensive research. It combines supercomputing resources managed by the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility, with open-research compute, data, and cloud storage services offered through ORNL’s Compute and Data Environment for Science (CADES).
Using these HPC resources, researchers can pursue unprecedented opportunities in fields such as materials science, biology, and high-energy physics.
Recently, the neuroscience community has leveraged this confluence of computing and data systems to tackle grand challenges in understanding the inner workings of individual neurons. For the last two years, computational neuroscientists have been using the OLCF’s Cray XK7 Titan supercomputer and the Rhea compute cluster to contribute to a community effort led by the Allen Institute for Brain Science to define and advance the computational modeling of neurons. The project, called BigNeuron, is an initiative to increase collaboration, create standards, and share best practices among leading computational neuroscience research groups, which have historically worked in isolation from one another. As part of this community effort, BigNeuron held a hackathon at ORNL in November 2015.
The work has produced the world’s largest archive of neuron reconstructions, comprising more than 2 million reconstructions and more than 200 terabytes of data.
“All the data is generated on the supercomputer from benchmarking about 26 different algorithms,” said Arvind Ramanathan, an ORNL computational biologist involved with the project. “The data comes from testing out the algorithms on a standard set of tissue slices or images. This work is about putting together a quantitative framework where everyone’s data is treated equally and the datasets are standardized in some sense to allow for an apples-to-apples comparison.”
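The kind of quantitative framework Ramanathan describes could be sketched roughly as follows: every algorithm runs on the same standardized image stacks, and every reconstruction is scored with one shared metric, so results can be compared directly. This is a minimal illustration only, not BigNeuron’s actual pipeline; the function and field names below are hypothetical.

```python
"""Illustrative sketch of an apples-to-apples benchmarking harness.
Not BigNeuron's actual code; names here are hypothetical placeholders."""

from dataclasses import dataclass


@dataclass
class Result:
    algorithm: str
    image_id: str
    score: float  # distance from the reconstruction to a gold standard


def benchmark(algorithms, image_stacks, gold_standards, metric):
    """Run each reconstruction algorithm on each standardized image
    stack and score every output with the same shared metric."""
    results = []
    for name, reconstruct in algorithms.items():
        for image_id, stack in image_stacks.items():
            tracing = reconstruct(stack)  # e.g., an SWC-style neuron tree
            score = metric(tracing, gold_standards[image_id])
            results.append(Result(name, image_id, score))
    return results
```

Holding the inputs and the metric fixed across all algorithms is what makes the resulting scores directly comparable, which is the sense in which everyone’s data is “treated equally.”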
Now that BigNeuron’s benchmarking work is winding down, researchers are turning to CADES to carry out the next phase of the project—making the data publicly available.
Transferring validated datasets from the OLCF to CADES creates a shared environment where research teams around the world can access neuron data, test new algorithms, and verify results. Furthermore, hosting the data at ORNL makes it possible for researchers to obtain digital object identifiers (DOIs) to catalog and publish scientific datasets for open access.
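Registering a DOI for a published dataset typically means submitting the dataset’s metadata and landing-page URL to a DOI registrar. The sketch below is purely illustrative; the endpoint URL, token, and payload fields are hypothetical placeholders, not ORNL’s or any registrar’s actual API.

```python
"""Illustrative sketch of minting a DOI for a dataset. The endpoint,
token, and payload fields are hypothetical; real registrars (e.g.,
DataCite, OSTI) each define their own API and schema."""

import requests


def mint_doi(title, creators, dataset_url, api_token):
    # Hypothetical registration endpoint for illustration only.
    endpoint = "https://doi-registrar.example.org/api/dois"
    payload = {
        "title": title,
        "creators": creators,
        "url": dataset_url,  # landing page where the dataset is hosted
        "resourceType": "Dataset",
    }
    resp = requests.post(
        endpoint,
        json=payload,
        headers={"Authorization": f"Bearer {api_token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["doi"]  # e.g., a string like "10.xxxx/abcd1234"
```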
As of early 2017, approximately 20 terabytes of the BigNeuron database have been made publicly available, with plans to extend the database to the full 200 terabytes by the fall of 2017.
“The resources at ORNL make the lab one of the few places with the ability to produce, validate, and host this data at scale,” Ramanathan said. “This will lead to better algorithms in the future and help the community make steps toward the long-term goal of learning how neurons function and work together.”
Oak Ridge National Laboratory is supported by the US Department of Energy’s Office of Science. The single largest supporter of basic research in the physical sciences in the United States, the Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.