AWS Public Sector Blog
Accelerating Alzheimer’s disease research through AWS cloud computing powered large-scale functional genomics analysis
Alzheimer’s disease is a progressive neurodegenerative disorder affecting millions worldwide. The Alzheimer’s Association estimates that over 6 million Americans live with the condition, a number projected to rise to nearly 13 million by 2050. This increasing prevalence has spurred intensive global research efforts, with scientists working to understand the disease’s complex mechanisms, identify potential triggers, and develop more effective treatments.
At the forefront of this critical research is Dr. Gao Wang, an Assistant Professor of Neurological Sciences at Columbia University. Dr. Wang heads the Lab of Statistical Functional Genomics, where he and his team are leveraging the power of Amazon Web Services (AWS) cloud computing to conduct groundbreaking genomics research aimed at unraveling the complex genetic underpinnings of Alzheimer’s disease and identifying potential therapeutic targets.
Functional genetics in Alzheimer’s research
Dr. Wang’s research focuses on functional genetics, where the goal is not just to identify genetic changes associated with Alzheimer’s, but to understand why these changes lead to the onset of the disease. This approach requires analyzing high-dimensional, large-scale data using sophisticated statistical and computational biology models.
The novelty of this field presents both challenges and opportunities. By leveraging cloud computing, Dr. Wang is making his analysis more efficient, allowing for deeper insights and faster progress in understanding the complex mechanisms behind Alzheimer’s disease. Accelerating the pace of discovery is important to Dr. Wang: “For the field as a whole, if you can get the results out faster and if the results are important, you can help inspire other people’s follow up research and you are advancing the field, hopefully, by years. Other researchers rely on the data we produce or the things we find initially because we can open some questions they can focus their research on.”
The FunGen-xQTL Project: A collaborative effort
A key initiative led by Dr. Wang is the FunGen-xQTL Project, a collaborative effort involving 14 research institutes, 28 trainees, and 19 faculty members across the US. This project focuses on studying molecular quantitative trait loci in aging brains, spanning 62 distinct molecular contexts across brain cells and tissues. The consortium has generated comprehensive molecular profiles including DNA methylation, histone modifications, gene expression, and protein levels from human brain samples. By understanding genetic regulation, Dr. Wang and his colleagues are providing the Alzheimer’s disease scientific community with valuable functional genomics data from aging cohorts, curated and processed through comprehensive multi-omics analysis.
Dr. Wang’s research aims to identify genetic factors influencing Alzheimer’s disease and establish causal pathways linking genetic variation to Alzheimer’s disease risk. His team integrates molecular evidence from brain, cerebrospinal fluid, and blood samples to discover new therapeutic targets and biomarkers, while developing a comprehensive QTL resource to benefit research on neurodegenerative diseases and aging brains. The project has successfully mapped molecular traits across different brain cell types, including microglia, astrocytes, and neurons, providing unprecedented insights into cell-type specific genetic regulation in Alzheimer’s disease.
Accelerating scientific breakthroughs with the AWS Cloud
The FunGen-xQTL Project had many requirements that were difficult to meet with traditional on-premises infrastructure. One major hurdle was the need for sufficient compute power. The research required applying a Bayesian model to tens of thousands of genetic variables, under hundreds of cellular, tissue, ancestry, and disease combinations, evaluated across approximately 30,000 genes in the human genome. Furthermore, the functional genetics approach required processing and analyzing large amounts of complex, high-dimensional data to uncover the relationships between genetic changes and disease onset. This large-scale analysis pushed the limits of conventional computing systems, making it clear that a more robust and flexible solution was needed. Another challenge was large-scale data sharing. The analysis produced a vast number of output files that needed to be shared among multiple institutional collaborators without duplicating or copying data, which posed logistical and technical difficulties.
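To give a feel for the scale described above, the following minimal sketch treats each (gene, context) pair as one unit of work and groups the pairs into batches. Only the figure of roughly 30,000 genes comes from the text; the number of contexts (300, standing in for "hundreds") and the batch size are illustrative assumptions, and the function names are hypothetical rather than part of the project's actual pipeline.

```python
from itertools import islice, product
from typing import Iterable, Iterator, List, Tuple

N_GENES = 30_000    # approximate gene count cited in the text
N_CONTEXTS = 300    # assumption: stands in for "hundreds" of combinations
BATCH_SIZE = 100    # assumption: (gene, context) pairs handled per job


def job_specs(n_genes: int, n_contexts: int) -> Iterator[Tuple[int, int]]:
    """Lazily yield every (gene_index, context_index) pair."""
    return product(range(n_genes), range(n_contexts))


def batched(items: Iterable, size: int) -> Iterator[List]:
    """Group an iterable into lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def count_batches(n_genes: int, n_contexts: int, batch_size: int) -> int:
    """Number of batches needed to cover all pairs (ceiling division)."""
    total_pairs = n_genes * n_contexts
    return -(-total_pairs // batch_size)


if __name__ == "__main__":
    total = N_GENES * N_CONTEXTS
    print(f"(gene, context) pairs: {total:,}")
    print(f"batches of {BATCH_SIZE}: {count_batches(N_GENES, N_CONTEXTS, BATCH_SIZE):,}")
    # Peek at the first batch without materializing all pairs in memory.
    first = next(batched(job_specs(N_GENES, N_CONTEXTS), BATCH_SIZE))
    print(f"first batch covers pairs {first[0]} .. {first[-1]}")
```

Because the pairs are generated lazily, the sketch never holds the full list of work units in memory, which matters once the product of genes and contexts reaches the millions.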
Dr. Wang turned to AWS cloud computing, specifically MMCloud, a software product from AWS Partner MemVerge that streamlines the deployment of containerized applications on AWS. “What we were able to do is develop the scripts with MemVerge to make set up really easy,” said Dr. Wang. This approach has transformed his research process in several ways:
- Massive parallel processing: MMCloud enabled Dr. Wang’s lab to submit hundreds of thousands of jobs to AWS and run them cost-effectively on Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances.
- Dramatically reduced processing time: Using the AWS Cloud reduced the time required for complex computations from several weeks to just a few days.
- Cost efficiency: By using Amazon EC2 Spot Instances, the research team achieved 50-80% lower costs compared to using On-Demand Instances.
- Simplified collaboration: MMCloud also made it simpler and more cost-efficient to provision and manage the Jupyter and RStudio apps used by multiple institutional collaborators.
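As a rough illustration of the cost-efficiency point in the list above, this sketch converts a hypothetical On-Demand bill into the range implied by a 50-80% saving. The instance-hours and hourly price are made-up placeholders, not actual AWS prices or figures from the project, and `spot_cost_range` is a hypothetical helper name.

```python
from typing import Tuple


def spot_cost_range(
    on_demand_cost: float,
    min_saving: float = 0.50,  # lower bound of the saving quoted in the text
    max_saving: float = 0.80,  # upper bound of the saving quoted in the text
) -> Tuple[float, float]:
    """Return (lowest, highest) expected cost after applying the saving range."""
    if not 0.0 <= min_saving <= max_saving <= 1.0:
        raise ValueError("savings must satisfy 0 <= min_saving <= max_saving <= 1")
    if on_demand_cost < 0:
        raise ValueError("on_demand_cost must be non-negative")
    lowest = on_demand_cost * (1.0 - max_saving)
    highest = on_demand_cost * (1.0 - min_saving)
    return lowest, highest


if __name__ == "__main__":
    # Placeholder workload: both numbers are assumptions for illustration only.
    instance_hours = 10_000
    on_demand_price_per_hour = 0.10  # hypothetical price in USD
    on_demand_total = instance_hours * on_demand_price_per_hour

    low, high = spot_cost_range(on_demand_total)
    print(f"On-Demand estimate: ${on_demand_total:,.2f}")
    print(f"Spot estimate:      ${low:,.2f} to ${high:,.2f}")
```

Actual Spot pricing varies by instance type, Region, and time, so a calculation like this only bounds the expected bill rather than predicting it.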
Harnessing the power of AWS Cloud for scientific breakthroughs
The use of cloud computing in genomics research, particularly in the field of functional genetics, enables researchers like Dr. Wang to:
- Process and analyze large-scale, high-dimensional genomic datasets more efficiently.
- Accelerate the time to discovery for research publication and scientific impact.
- Implement complex statistical and computational biology models that are crucial for understanding the functional aspects of genetic variations.
- Collaborate more easily with researchers around the world by sharing data and results.
- Scale resources up or down on demand, staying cost-effective by paying only for what they use while keeping the flexibility required for cutting-edge research in a rapidly evolving field.
Dr. Gao Wang’s innovative research at Columbia University, powered by AWS cloud computing, represents a cutting-edge approach to understanding Alzheimer’s disease through functional genetics. While the fight against Alzheimer’s is complex, Dr. Wang’s work, combined with powerful cloud computing tools, offers hope for developing effective treatments and potentially finding a cure for this devastating disease.
Reference
Running Bioinformatics Pipelines Cost Effectively Using MemVerge on AWS