Neuroinformatics Core
Since its establishment in 2017, the Neuroinformatics (NI) Core has dramatically expanded existing research activities in several disciplines and multidisciplinary areas within CNAP. This was accomplished by providing a completely developed computing platform, along with its expert personnel, to serve as a common hybrid computing technology, experimentation, and information resource for CNAP researchers. The Core provides CNAP with research support, infrastructure, and training/outreach. This includes customized training modules, applied machine learning, data sharing, data analytics, and high-performance computing (HPC) for all CNAP users in a secure, fast, efficient manner.
Facilities and Equipment
The NI Core is closely integrated with the larger K-State research computing cluster, Beocat. Beocat consists of approximately 3.3 petabytes (PB) of storage and 10,000 processor cores on machines ranging from 16 cores with 64 gigabytes (GB) of RAM to 128 cores with multiple NVIDIA A100 graphics processing units (GPUs) and 1.5 terabytes (TB) of RAM, connected by 100 gigabit-per-second (Gb/s) Omni-Path networking. Beocat is currently the most powerful academic research cluster in Kansas. CNAP users enjoy priority access to resources owned by CNAP. The NI Core currently owns 12 servers, including ten 20-core Intel Xeon E5-2630 v4-based computation servers, a 40-core server with dual Intel Xeon Gold 6130 CPUs, and four NVIDIA GeForce RTX 2080 Ti graphics cards to support AI/ML. The Core also owns a fast flash-based fileserver with 40 cores (dual Intel Xeon Platinum 8352Y) and 25 NVMe 15 TB drives for data-intensive research and teaching.
Software
In 2019, we laid the foundation for Synapse, an open science gateway for CNAP. Since then, as we strive for increasingly open and reproducible research, Synapse has provided a common interface to data and analytical applications. Key areas of emphasis include:
● tracking provenance of data and codes through the research process
● easy data mobility among collaborators and between file systems
● security of information
● execution of scientific codes from a web interface
In 2020, NI Core staff installed and configured “Open OnDemand,” a platform for running computational tasks such as RStudio on a cluster from a web browser. If a task is interactive, users can interact with it once it has started running.
This system allows Core users remote access to large machines for computational tasks. For example, while working on a basic laptop, a user could have a large system analyze eye movements and then present a graphical representation of the results, all while using limited system resources on the laptop. Current and future work for Open OnDemand includes adding interactive applications such as JupyterHub and increasing integration with Synapse.
Core Services
In addition to allowing access to state-of-the-art computing equipment and software applications, the NI Core also provides research support, training, and outreach for CNAP researchers. The NI Core has supported CNAP projects, grants, and other CNAP-affiliated research since its creation at the beginning of CNAP Phase 1. We have worked with project leaders to facilitate secure data transfer from external partners, automate experimental validation, and parallelize power analysis for their research programs. With our assistance and the use of core computing resources, core users often see dramatic increases in the speed of analysis and processing. In some cases, the NI Core has saved researchers several weeks (or even months) of processing time by utilizing Beocat's hardware.
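Speedups of this kind usually come from splitting a set of independent analyses across many batch tasks that run at the same time, the "array job" pattern mentioned in one of the success stories on this page. The following is a minimal sketch of that idea, not the Core's actual workflow. It assumes a Slurm-style scheduler that exposes each task's index in the `SLURM_ARRAY_TASK_ID` environment variable; the function names and the choice of 125 tasks are hypothetical and used only for illustration. The 16,000 analyses and 1,000 serial hours are taken from that success story.

```python
import os


def task_slice(task_id: int, n_tasks: int, n_items: int) -> range:
    """Return the item indices one array task should process (task_id is 0-based)."""
    if not 0 <= task_id < n_tasks:
        raise ValueError("task_id must be in [0, n_tasks)")
    # Spread any remainder over the first `extra` tasks so sizes differ by at most 1.
    base, extra = divmod(n_items, n_tasks)
    start = task_id * base + min(task_id, extra)
    stop = start + base + (1 if task_id < extra else 0)
    return range(start, stop)


def ideal_wall_hours(serial_hours: float, n_tasks: int) -> float:
    """Best-case wall time if every task runs at once, ignoring queue wait and overhead."""
    return serial_hours / n_tasks


if __name__ == "__main__":
    # Assumption: the scheduler sets SLURM_ARRAY_TASK_ID; default to 0 for local testing.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    n_tasks = 125      # hypothetical number of array tasks
    n_items = 16_000   # number of analyses, from the success story
    indices = task_slice(task_id, n_tasks, n_items)
    print(f"task {task_id} handles analyses {indices.start} to {indices.stop - 1}")
    print(f"ideal wall time: {ideal_wall_hours(1000, n_tasks)} hours")
```

Under these assumptions each task handles 128 analyses, and the best-case wall time is the serial time divided by the number of tasks. Real runs will be slower than this ideal because of queue wait, uneven task lengths, and startup overhead.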
NI Core staff are available to conduct personalized training on system resources and help optimize research workflows and software operation. For example, our team has conducted personalized training to help users learn to use the R statistics package on Open OnDemand. The Phase 2 award will allow the NI Core to add training and support for the effective use of machine learning and artificial intelligence modeling techniques.
Our team also offers training opportunities throughout the year. For the past several years we have remotely hosted Big Data workshops based out of the Pittsburgh Supercomputing Center. This training is available to all NI Core users and allows participants to view the videos at their own pace, perform the exercises and homework assignments, and test themselves using the quizzes at the end.
Staff
Core Director
Dr. Doina Caragea serves as Director of the Neuroinformatics Core. Dr. Caragea is a professor of computer science and a K-State College of Don and Linda Glaser Keystone Research Scholar. Her research and teaching interests are in machine learning, deep learning, and data science, with applications to text analytics, digital agriculture, and bioinformatics, among others. Her projects build upon close collaborations with social scientists and life scientists and aim to provide practical computational approaches to address real-world challenges. She has published more than 150 refereed conference and journal articles. She has a strong track record of extramural funding, with more than $12M in total funding as PI, co-PI, or senior personnel from NSF and industry. In her position as Core Director, Dr. Caragea oversees the operations of the Core and supervises the full-time employees supporting the NI Core and Beocat.
Technical Support Staff
Two system administrators and one application scientist support researchers moving beyond local PCs to take advantage of Beocat and national (XSEDE) computational resources. These staff also provide application-level support (e.g., parallel or serial batch processing through Beocat or XSEDE, implementation of specialized analysis techniques, and software usage support) and training for individuals working within CNAP projects, programs, and research cores. The NI Core also has excellent graduate student staff who assist researchers with their specific project needs, provide user training, and help develop core resources.
NI Core Success Stories
"My graduate student and I have been carrying out a cutting-edge collaborative research project with a well-known colleague of ours at MIT. That colleague provided us with an image processing algorithm that we have used in a couple of collaborative experiments with her. The image processing algorithm is written in MATLAB, and is very sophisticated, but also extremely resource intensive. When my student first ran it on one image on his PC, it took 1-2 days to produce a single image. However, we needed many thousands of such images for our study. Our solution was to use Beocat to do our image processing for us. My student was able to do this over the course of several months. Each day, he would set up a batch command for roughly 10 such images to be processed, and set it running in the background on Beocat. At the end of the day, he would check back and download the images. He did this every day for several months and was able to get our entire set of thousands of such images for our experiments. We couldn't have done that without Beocat. We are now getting close to submitting a manuscript on our experiments to a high-ranking journal in our area. When we do, we will include an acknowledgement to Dan Andresen and Beocat for making it possible to do all of that image processing for our study. Beocat and the NI Core literally made it possible for us to do this very cutting-edge collaborative research project with our MIT colleague. And we hope that the study will have a real impact in our field!" - CNAP Researcher
"To be honest, I could not have conducted my research without the NI core. As part of my dissertation research, I used machine learning to classify pictures that people viewed from their neural features from EEG…I was able to batch the entire process so that it ran each condition simultaneously on Beocat Open R OnDemand. The process that would have taken (20 hrs × 32 conditions × 2 experiments = 1,280 hours, or about 53 days) to complete was able to be finished in 20 hours." - CNAP Researcher
"I’ve been working on a collaboration with a colleague and one of his grad students. We encountered a situation where we needed to run 16,000 Bayesian analyses of simulated data which I initially considered infeasible (it would take over 1,000 hours of running, or over 40 24-hour days!). However, I learned about “array jobs” from someone with Beocat, worked on setting it up last week, and voila, done in under 8 hours! This experience will almost certainly open up similar projects in the future. I’m kicking myself for not learning this earlier, but it’s much easier to make time this year… for some unknown reason." - CNAP Researcher
Contact
Interested researchers can contact Dr. Caragea to learn more about the NI Core facilities and services or to schedule a tour. To learn how to sign up for an account on Beocat and begin using these resources, you can visit the Beocat documentation page.