UCSD Center for Microbiome Innovation & Overcoming Data Storage Challenges to Advanced Discovery

Recently, TrialSite News had the pleasure of interviewing Yoshiki Vázquez-Baeza, Associate Director of Bioinformatics Integration at the University of California San Diego Center for Microbiome Innovation. This inventive group has brought together over 100 research groups across the beautiful UCSD campus, leveraging strengths in clinical medicine, bioengineering, computer science, the biological and physical sciences, data sciences and more to coordinate and accelerate microbiome research. Working at the leading edge of a field that offers powerful new opportunities for human and environmental health, Dr. Vázquez-Baeza introduced TrialSite News not only to the fascinating world of microbiome research but also, importantly, to the key enabling high-powered computer storage technology that makes it all possible.

What Is the “Microbiome”?

Although billions of dollars of research, in the aggregate, go into the slight genetic variations that affect human health, from a genetic perspective humans are 99.9 percent alike. And despite research progress, scientists still don’t fully appreciate the 99 percent of the genes in the human body that are microbial: those expressed by the trillions of microbes that live in, on and around all of us. Such microbes and their genes constitute our microbiomes.

As it turns out, microbes are incredibly important: they help humans digest and process nutrients, and they produce their own waste and metabolites. The human immune system is profoundly affected by these microbiomes, as is one’s health (or lack thereof).

Microbiome Research Breakthroughs

Previously inconceivable research breakthroughs are now possible at UCSD and elsewhere as microbiome investigators learn more about human gut microbiomes and how they are associated with diseases and conditions. Some of these associations could be expected, such as inflammatory bowel disease and colon cancer; others come as a surprise, such as rheumatoid arthritis, atherosclerosis and asthma. Even more surprising, preclinical microbiome researchers have found associations, in mouse models, between the gut and the brain. Although clinical trials are still needed, emerging hypotheses are leading investigators toward a dramatic conclusion: that an individual’s state of mind, such as anxiety or depression, could actually be triggered by conditions associated with the microbiome.

Just the Tip of the Iceberg

And this new paradigm for studying our health is just the tip of the iceberg. Studies of the microbial communities living on human skin, in the mouth, in one’s home or in hospitals, not to mention many other places, reveal dramatic surprises about the roles these microbes play. Even COVID-19 is now a target for inquiry via this revolutionary way of seeking to understand our world.

A New Big Paradigm Requires Changes to Research Infrastructure

This intriguing world captivated UCSD, which formed the Center for Microbiome Innovation (CMI) to bring together researchers from many different disciplines who share a common thread: identifying and understanding the microbiome’s connections to their specific investigational pursuits.

UCSD formed CMI in 2016, bringing together over 130 UCSD professors and research scientists. But the planned collection of vast troves of microbiome data from myriad sources required not only an organizational vision and cross-functional collaborative sophistication but also, importantly, enabling high-performance computing and storage technology. Enter Panasas, a partner offering a high-performance computing (HPC) storage infrastructure.

Do You Want to Set Up a World-Class Microbiome Research Operation?

TrialSite News learned from Yoshiki Vázquez-Baeza and Adam Marko, Director of Life Science Solutions at Panasas, how UCSD was able to establish and expand this important research center. Incorporating the data collection endeavors of many different research scientists required a flexible yet central high-performance computing solution that supports numerous parallel research workstreams, such as the recent RNA microbiome investigations associated with COVID-19. Of course, the vision behind the center, not to mention the organizational acumen to motivate, direct and mobilize different research groups, also supported this exciting multi-discipline research endeavor. Industry partners are certainly monitoring this fundamental research space: in just a few years, major corporations such as Pfizer, Otsuka, Nestlé, L’Oréal and IBM have signed up as UCSD CMI partners.

UCSD CMI & Panasas Partnership

Since 2015, UCSD has worked with Panasas to establish an HPC storage solution that supports high performance, accelerates data exploration and discovery, and simplifies storage management for administrators. The university turned to ActiveStor, “a turnkey HPC storage appliance that runs the PanFS parallel file system to accelerate performance at every stage of the computational research process.”

Vázquez-Baeza reported that what they do just wouldn’t be possible without a technology such as ActiveStor, a high-performance storage infrastructure that accelerates data exploration and discovery while simplifying overall storage management and oversight. Since going live with this system, UCSD has benefited not only from saved time and reduced bottlenecks but also from greater insights, which can lead to breakthrough discoveries in medicine, for example. Put simply, the team at CMI had to move away from technology distractions toward technology solutions that free up researchers’ time so they can focus on the critical scientific problems they live to solve. Freeing CMI researchers from worries about issues such as storage empowers CMI to accelerate the adoption of advanced scientific technologies.

Addition of Two Petabytes of HPC Storage for COVID-19 Research

Just recently, UCSD CMI added two petabytes of Panasas’ HPC storage to support intensive microbiome-centered investigations into COVID-19. With this infrastructure in place, UCSD CMI can sequence massive amounts of genomic data that could very well reveal clues as to how SARS-CoV-2 spreads and point toward a possible cure. The two additional petabytes of ActiveStor storage support the rapidly growing number of I/O-intensive tasks related to the COVID-19 research efforts.

As Vázquez-Baeza noted in the interview and reiterated in a recent quote from a press release, “Our teams often benchmark several methods or rework the benchmarking with new data sets.” He continued, “If we have to limit the benchmark methods because of storage concerns, we wouldn’t be able to explore the full breadth of scientific options. Panasas technology supports the mission of the Center because it never limits our exploration.”

What Is ActiveStor?

ActiveStor is a turnkey HPC storage appliance that runs the PanFS® parallel file system to accelerate performance at every stage of the computational research process. According to Panasas positioning, the technology delivers unlimited performance scaling and features a balanced node architecture that prevents hot spots and bottlenecks by automatically adapting to dynamically changing workloads and increasing demands – all at the industry’s lowest total-cost-of-ownership.


In an open, cross-functional, multi-disciplinary and student-centered approach, UCSD CMI has leveraged the strengths of UCSD’s broader specialized areas, from clinical medicine to data sciences, to establish one of the world’s leading microbiome research centers. The center is dedicated to overcoming key barriers to successful microbiome research, including the work to master laboratory and bioinformatics protocols and to capitalize effectively on the high volumes of resulting data. To that end, UCSD mobilized talent, university resources and process disruption, not to mention key enabling technology, to provide core facilities for potent data generation. In collaboration with other groups involved with the nucleic acid sequencing and mass spectrometry capabilities already present at UCSD, UCSD CMI offers industry research partners expertise in areas such as sample preparation, high-throughput DNA extraction, bioinformatics and data analysis.

The organization continues to push research boundaries, working to enable game-changing technologies that advance and scale microbiome-related work, from nanotechnology sensors that can reach inside single cells to drones that can map the global microbiome and connect to climate models.

Lead Research/Investigator

Yoshiki Vázquez-Baeza, PhD, Associate Director of Bioinformatics Integration at University of California San Diego Center for Microbiome Innovation

Call to Action: Interested in learning more about how to continuously improve a major microbiome research center? Consider connecting with Yoshiki Vázquez-Baeza and team at UCSD, as well as with Panasas.