Saturday, July 31, 2021

How scientists use supercomputers to fight COVID-19

Alongside the White House Office of Science and Technology Policy (OSTP), IBM announced in March that it would help coordinate efforts to provide hundreds of petaflops of compute to scientists researching the coronavirus. As part of the newly launched COVID-19 High Performance Computing (HPC) Consortium, IBM has committed to help evaluate proposals and provide access to resources for projects that “have the most immediate impact”.

Much work remains to be done, but some of the most prominent members of the consortium – including Microsoft, Intel and Nvidia – claim that progress is being made.

Petaflops of compute

Powerful computers allow researchers to perform extensive calculations in epidemiology, bioinformatics and molecular modeling, many of which would take months on traditional computing platforms (or years if done by hand). Because the computers are available in the cloud, teams can also collaborate from anywhere in the world.

The knowledge gained from these experiments can help researchers better understand key aspects of COVID-19, such as the interaction between the virus and its human host, viral structure and function, small molecule design, drug repurposing, and patient trajectories and outcomes. "Technology is an important part of COVID-19 research that is currently underway worldwide," Thierry Pellegrino, vice president at Dell Technologies, told VentureBeat. (Dell Technologies is a member of the consortium.) "It is critical for the planet's population that researchers have the tools to understand, treat, and fight this virus. Researchers around the world are true heroes who do important work in extreme and unknown circumstances, and we couldn't be more proud to support their efforts."

Companies and institutions have given 62 projects in the U.S., Germany, India, South Africa, Saudi Arabia, Croatia, Spain, the UK, and other countries free access to supercomputers from Google Cloud, Amazon Web Services (AWS), Microsoft Azure, IBM, and dozens of academic and nonprofit research organizations. The systems comprise over 136,000 nodes with 5 million processor cores and more than 50,000 graphics cards, which together deliver over 483 petaflops (483 quadrillion floating point operations per second) across hardware maintained by the consortium's 40 partners.
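For a sense of scale, the quoted totals can be turned into rough per-node averages. The snippet below is only a back-of-the-envelope sketch: it treats the article's "over" figures as exact, the variable names are illustrative, and real nodes in the consortium vary widely in size.

```python
# Figures quoted in the article, treated as exact for this rough estimate.
PETA = 10**15  # SI prefix "peta": one quadrillion
TERA = 10**12  # SI prefix "tera": one trillion

total_petaflops = 483
nodes = 136_000
cpu_cores = 5_000_000

# Total floating point operations per second across the consortium.
total_flops = total_petaflops * PETA

# Naive averages; actual systems differ a lot from one another.
teraflops_per_node = total_flops / nodes / TERA
cores_per_node = cpu_cores / nodes

print(f"{total_flops:.3e} floating point operations per second in total")
print(f"about {teraflops_per_node:.2f} teraflops per node on average")
print(f"about {cores_per_node:.1f} CPU cores per node on average")
```

The averages mostly show that the headline number is spread over a very large and mixed pool of machines rather than concentrated in one system.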


Microsoft

Beyond the supercomputing infrastructure built on its Azure cloud computing platform, Microsoft provides researchers with networking and storage resources integrated with workload orchestration through Azure HPC. At the same time, the company is drawing on its AI for Health program, which committed $20 million in April to work related to COVID-19 in five key areas: data and intelligence, treatment and diagnostics, resource allocation, dissemination of accurate information, and scientific research.

As part of its collaboration with the consortium, Microsoft's team provides access to experts in AI, HPC, quantum computing, and other areas of computer science at Microsoft Research and elsewhere. Much of the researchers' work to date has involved fundamental scientific discoveries about COVID-19 itself and its interaction with the human host, including the design of therapeutics, through:

  • Research simulations.
  • Modeling molecular dynamics.
  • 3D mapping of virus protein structures.
  • Compound screening to determine whether existing drug molecules can inhibit the cellular entry of the virus.

According to Microsoft, every organization it works with gets a full Azure HPC environment, including Azure CycleCloud with the Slurm workload manager, the most appropriate Azure virtual machines, and storage. These environments are configured to scale as needed to meet computing requirements and are tailored to each grantee's specific research needs.

Modeling in Nepal and ventilator splitting

Through the consortium, Microsoft's AI for Health supports the non-profit research institute Nepal Applied Mathematics and Informatics Institute for Research (NAAMII), which uses simulations to model how COVID-19 would spread in the Nepalese population under different scenarios. According to Microsoft, these models can reveal patterns that could potentially save lives and livelihoods.
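The article does not say which model NAAMII uses. As a loose illustration of what "simulating spread under different scenarios" can mean, the sketch below runs a textbook SIR (susceptible, infected, recovered) model with a few transmission rates. The population size, rates, and scenario names are all hypothetical and are not taken from the institute's work.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR model with one-day steps.

    beta is the transmission rate per day and gamma the recovery rate per day.
    Returns the peak number of people infected at the same time.
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    peak = infected
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        peak = max(peak, infected)
    return peak

# Hypothetical scenarios: lower beta stands in for stronger distancing measures.
scenarios = {
    "no intervention": 0.30,
    "moderate distancing": 0.20,
    "strict distancing": 0.12,
}

peaks = {
    name: simulate_sir(population=1_000_000, initial_infected=10,
                       beta=beta, gamma=0.10, days=365)
    for name, beta in scenarios.items()
}

for name, peak in peaks.items():
    print(f"{name}: peak of about {peak:,.0f} simultaneous infections")
```

Real epidemiological models add age structure, geography, and mobility data, which is where the supercomputing resources come in.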

Another grantee, Duke University, uses Azure to investigate ventilator splitting, a technique that allows multiple patients to share the same ventilator. MathWorks' Matlab division has partnered with Microsoft to optimize the researchers' analysis for distributed computing environments.


Google

Google continues to offer grantees compute, storage and workload management services through the Google Cloud Platform. It recently made $20 million in computing credits available to academic institutions and researchers studying COVID-19 treatments, therapies, and vaccines. As part of its collaboration with the consortium, the company is working with Northeastern University researchers on epidemiological modeling and with the Complutense University of Madrid on the application of AI to medical imaging.

Google has also partnered with the Harvard Global Health Institute to fund companies, government agencies, nonprofits, and institutions working on COVID-19 research. The technology giant, together with Microsoft, launched a program with Microsoft-supported cloud company Rescale to offer HPC resources free of charge to teams working on the development of COVID-19 tests and vaccines. Rescale provides the platform where researchers can start experiments and record results, while Google and Microsoft provide the backend computing resources.


Amazon

Like Google, Amazon supplies compute and tools to researchers matched through the consortium. Currently, more than 11 teams use its infrastructure, and Amazon Web Services solution architects hold a dedicated weekly call with the scientists.

As part of its AWS Diagnostic Development Initiative, Amazon is also making $20 million in computing credits available to over 35 institutions and private companies that use AWS to drive the development of COVID-19 point-of-care diagnostics, tests that can be done at home or in a clinic with same-day results. "This is a global health emergency that can only be resolved by governments, corporations, universities, and individuals working together to better understand this virus and ultimately find a cure," said Teresa Carlson, vice president of the global public sector at AWS, in a statement.

Developing decoy proteins

At the MIT Media Lab, a team inspired by a researcher at Johns Hopkins University identified "decoy proteins" derived from ACE2 receptors (the receptors to which coronaviruses bind in the human body) that could render COVID-19 inert. Using a machine learning model that is trained on ACE2 receptor data and runs on AWS, the researchers are trying to predict which decoy variants will avoid interacting with other proteins in the body and causing harmful side effects. If all goes well, tests on mice will start soon, with clinical trials to begin in late summer.

In separate efforts, AWS empowers researchers at the National Children's Hospital to combine hundreds of data sets to identify genes that could be used to treat COVID-19. A team at Iowa State University uses evolutionary models with public genome records to study the relationships between COVID-19 strains and to understand how they mutate and spread. With tmCOVID, scientists at Emory University are developing a web-based tool that can be used to extract and summarize key concepts in scientific studies on COVID-19.


Nvidia

According to Nvidia, 14 of the consortium's projects have used more than 3 million GPU hours on the Nvidia-powered Summit supercomputer at the Oak Ridge National Laboratory. Summit ranks as the world's fastest supercomputer on the Top500 list. In addition, the company offers its own 20,000-GPU infrastructure – SaturnV – which its researchers mainly use to optimize COVID-19 research applications.

Nvidia has used spare cycles on SaturnV to run Folding@home, a distributed computing project that simulates protein dynamics to help develop therapeutics for various diseases, including COVID-19. It has also helped match researchers with supercomputers based on each researcher's specific needs.

Quantum chemistry and virtual screening

In collaboration with Microsoft, Nvidia is working with the University of California, Riverside, on quantum chemical solutions that benefit from GPU optimization. The number of possible COVID-19 inhibitors is immense, and performing experimental studies on all candidates is both impractical and expensive. The hope is that the project's predictive, GPU-enabled simulations, which take up to 800,000 GPU hours in Azure, will help narrow efforts to the most promising candidates.

In less than a week, Nvidia experts helped project lead Bryan Wong package his research code with HPC Container Maker, the company's open source tool that comes with 30 containerized HPC applications. They also used Nvidia's Nsight debugging tool to work out a fix for a nagging bug. This allowed work that would have taken 800,000 GPU hours to be completed in 300,000 GPU hours, saving $500,000.
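The reported savings can be checked with simple arithmetic. The sketch below uses only the three figures quoted above. The cost per GPU hour is not stated in the article, so the rate computed here is an implied value derived from those figures, not a published Azure price.

```python
# Figures quoted in the article.
baseline_gpu_hours = 800_000
optimized_gpu_hours = 300_000
reported_savings_usd = 500_000

hours_saved = baseline_gpu_hours - optimized_gpu_hours
reduction = hours_saved / baseline_gpu_hours         # share of GPU time eliminated
speedup = baseline_gpu_hours / optimized_gpu_hours   # how many times less GPU time

# Implied by the figures above; an inference, not a quoted price.
implied_usd_per_gpu_hour = reported_savings_usd / hours_saved

print(f"GPU hours saved: {hours_saved:,}")
print(f"Reduction in GPU time: {reduction:.1%}")
print(f"Speedup factor: {speedup:.2f}x")
print(f"Implied cost: ${implied_usd_per_gpu_hour:.2f} per GPU hour")
```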

At Carnegie Mellon University, a team led by Olexandr Isayev worked with Nvidia to apply AI approaches to high-throughput virtual screening, where algorithms are used to identify bioactive molecules. Unlike traditional scientific simulations, which solve problems by brute force, trying to simulate every possible combination of molecular interactions, AI makes educated guesses that reduce the number of combinations to be simulated. In theory, this leads to faster drug discovery (and faster field trials). Isayev estimates that it could be up to a million times faster than conventional quantum mechanical calculations.

The first step in this process is to use AI to analyze a library of molecules that can be purchased from chemical companies and prepare them for screening in simulation. The best candidates from the screening are then simulated using AI-enhanced molecular dynamics, and the top hits from the final screening are tested in partner laboratories.
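The staged process described above is essentially a funnel: a cheap score prunes a large library so that the expensive step only runs on a short list. The sketch below shows that shape with random stand-in scores. The library, the field names, and both scoring functions are hypothetical placeholders, not the team's actual models.

```python
import random

def screen(candidates, scorer, keep):
    """Return the `keep` highest-scoring candidates according to `scorer`."""
    return sorted(candidates, key=scorer, reverse=True)[:keep]

# Hypothetical library: random numbers stand in for real model outputs.
rng = random.Random(0)
library = [
    {
        "id": f"mol-{n}",
        "ai_score": rng.random(),          # placeholder for a fast AI prediction
        "simulation_score": rng.random(),  # placeholder for a costly simulation
    }
    for n in range(10_000)
]

# Stage 1: the cheap AI score prunes the whole library.
shortlist = screen(library, lambda mol: mol["ai_score"], keep=100)

# Stage 2: the expensive simulation runs only on the shortlist.
top_hits = screen(shortlist, lambda mol: mol["simulation_score"], keep=5)

print(f"Expensive simulations needed: {len(shortlist)} instead of {len(library)}")
print("Candidates sent to the lab:", [mol["id"] for mol in top_hits])
```

The saving comes from the second stage touching 100 molecules rather than 10,000, at the price of trusting the first-stage score not to discard good candidates.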

At the end of their work, Isayev and colleagues plan to put their data sets in the open source COVID-19 data lake, a central repository of curated data sets managed by Amazon's AWS division, in the hope that other researchers will benefit.


IBM

Dave Turek, vice president of technical computing at IBM, says COVID-19 research continues with partners from across the spectrum – on machines that run IBM hardware, and in laboratories and institutions with which the company has relationships. "Without large contracts or the like, the (consortium) came together to both share resources and manage a process to accelerate the scientific proposals received by the consortium and match them to the best resources," he said in a statement. "The teams are making rapid progress, and these supercomputer projects are using novel approaches to understanding the virus."

For example, IBM researchers at the Hartree Center in Daresbury, England, worked with scientists from Oxford University to combine molecular simulations with AI and discover compounds that could be used as anti-COVID-19 drugs. Using Summit and Frontera at the Texas Advanced Computing Center (TACC), the latter being the fifth fastest system on the Top500 list, the team claims to have carried out months of research in just a few hours.

Generating molecular compounds

With the help of IBM, researchers from the University of Utah used Blue Waters at the National Center for Supercomputing Applications and Longhorn and Frontera at TACC to generate more than 2,000 molecular models of compounds relevant to COVID-19. They ranked the models based on force field energy estimates of the molecules, which they believe could help scientists develop better peptide inhibitors of the enzyme to stop COVID-19.

The team studied the structure of the virus's main protease, an enzyme that breaks down proteins and peptides, in complex with a peptide inhibitor called N3. They then applied an approach originally developed to identify Ebola-stopping molecules, which involved molecular dynamics simulations and optimization of specific structures. This allowed the COVID-19 protease to degrade a number of similar, easily detectable probes that had already been developed and served as the basis for evaluations that test the effectiveness of the inhibitors.

The work builds on an understanding of how the potential energy generated by atoms can give a molecule a positively or negatively charged "force field" that attracts or repels other molecules. Using AMBER, a molecular dynamics code, the researchers came within a hundred millionth of a centimeter of experimental results, a scale imperceptible to all but the most powerful microscopes.

The University of Utah's Schmidt Laboratory will later convert the peptide leads into biopharmaceutical scaffolds called circularly modified peptides. "We hope to find a new peptide inhibitor that can be experimentally verified in the next few weeks. And then we'll go ahead with the design to make the peptide cyclic and make it more stable as a potential drug," said Thomas Cheatham, professor and research director at the University of Utah, in a statement.

Mapping the spread of COVID-19

It is well established that COVID-19 spreads through virus-laden droplets that are transported into the environment by air conditioning, wind and other forms of turbulence. However, airborne transmission rates remain controversial, and some experts say gathering useful evidence of airborne transmission could take years and cost many lives.

In a safer quest for clarity, scientists from Utah State University, the Lawrence Livermore National Lab, and the University of Illinois intend to use the consortium's supercomputer resources to investigate human-to-human transmission of respiratory infections such as COVID-19. They hypothesize that aerosolized droplets from the human respiratory tract contaminate rooms faster than originally thought. They use high-fidelity, multi-phase large-eddy simulations (LES) – mathematical models for turbulence used in computational fluid dynamics – that run on IBM hardware to determine droplet cloud paths in typical hospital environments.

The short-term goal will be to understand how long a cloud persists and where the particles settle, which could affect non-pharmacological techniques to reduce spread. "The aim of this study is to fundamentally improve our understanding of the transmission of infectious diseases of the respiratory tract from person to person," the researchers wrote in a statement. "Our findings will make it safer for healthcare professionals."

Examination of genetic susceptibility

In addition to isolating COVID-19-killing compounds and mapping the virus's spread, researchers are trying to define risk groups through genome analysis and DNA sequencing accelerated by IBM supercomputers.

A team of scientists from NASA has observed that COVID-19 appears to cause pneumonia and trigger an inflammatory reaction in the lungs called acute respiratory distress syndrome (ARDS). To investigate, they plan to use the supercomputer at NASA's Ames Research Center to sequence the genomes of patients who develop ARDS and of those who do not.

If all goes well, the team anticipates that their study will lead to practical tools that can be used to predict which COVID-19 patients are likely to develop ARDS and which patients are therefore likely to need intensive support before experiencing severe symptoms. Such tools could help control ICU resource use for the sickest patients and enable healthcare workers to better manage ongoing treatment.


Intel

Intel is actively involved in the design, development, and deployment of several supercomputers associated with the consortium, as well as the upcoming Aurora at the Argonne National Laboratory near Chicago. The company has employees working on code optimizations for HPC applications, including LAMMPS (a molecular dynamics code), Gromacs (a package for the simulation of proteins, lipids and nucleic acids), NAMD (another molecular dynamics code) and AMBER, among others. Intel also shares tools, architecture knowledge, and software with partners to improve COVID-19 applications and scale their performance on Intel-based hardware.

A particular focus for Intel is working with the NAMD developers to release a version of the code that enables faster simulations on Xeon processors that support AVX-512. The company says the significant increase in performance will allow researchers to run longer simulations of molecules relevant to COVID-19 and so better understand aspects of viral infection in atomic-level detail. The update is expected to be released in June.

Hewlett Packard Enterprise

Part of Hewlett Packard Enterprise (HPE)'s work is done through the consortium, while the rest focuses on a number of customers and partners. After acquiring Cray in September 2019 for around $1.3 billion, HPE claims to have more supercomputers and HPC systems in use at leading research centers.

"High performance computing is more powerful today than ever, and its tremendous computing power, along with other advanced features, has changed drug discovery significantly," said Peter Ungaro, former Cray CEO and head of HPE's HPC and mission-critical systems group, in a statement. "Supercomputing and HPC systems open up greater potential for AI and machine learning applications. When applied to 3D modeling and simulations, they can dramatically (speed up) the time to insight and (increase) scientific results. Our work within the consortium provides researchers with HPC functions that they normally cannot access independently to accelerate the discovery of a cure for the pandemic."

Drug design research

In collaboration with Microsoft, HPE is working with a team at the University of Alabama at Huntsville (UAH) to deploy its Sentinel supercomputer through the Azure cloud. With the supercomputer and a team of dedicated HPE experts, the company supports various phases of the drug design process at UAH.

The researchers use molecular docking, a type of bioinformatic modeling that predicts how two or more molecules fit together to form a stable complex. Drawing on a large, open collection of natural products found in plants, animals, fungi and the ocean, Sentinel performs calculations to determine how natural compounds interact with COVID-19 proteins. To date, the time needed to run 20,000 molecular dockings against a protein target has been cut to seven or eight minutes, down from the full 24 hours previously required. The research team can now perform up to 1.2 million molecular dockings per day.
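The size of that improvement follows directly from the quoted times. The sketch below only converts the figures above into a speedup range and an average daily rate; it is arithmetic on reported numbers, not a benchmark of Sentinel.

```python
MINUTES_PER_DAY = 24 * 60

# Quoted figures: a 20,000-docking batch took 24 hours and now takes 7 to 8 minutes.
old_batch_minutes = 24 * 60
new_batch_minutes = (7, 8)

speedups = [old_batch_minutes / minutes for minutes in new_batch_minutes]
print(f"Speedup: between {min(speedups):.0f}x and {max(speedups):.0f}x")

# Quoted daily capacity, expressed as an average rate per minute.
dockings_per_day = 1_200_000
dockings_per_minute = dockings_per_day / MINUTES_PER_DAY
print(f"1.2 million per day is about {dockings_per_minute:.0f} dockings per minute")
```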

Elsewhere, HPE supports work at the Lawrence Livermore National Laboratory. The researchers' goal is to use AI to accelerate the process of simulating billions of molecules from a database of drug candidates. They have narrowed the number of potential candidates from 10^40, using Catalyst – an HPE-based HPC cluster that generates predictions such as experimental and structural data – to improve results and accelerate discovery.

HPE is also working with the French National Center for Scientific Research and GENCI to equip scientists at Sorbonne University in Paris with GENCI's supercomputer Jean Zay, built by HPE. The team uses Jean Zay to optimize the Tinker-HP software, a parallel computing code that runs across multiple graphics cards to simulate large biological molecules at the atomic level. Tinker-HP is also performing a series of data-intensive calculations to create 3D simulations of molecular interactions faster and at higher resolution than would otherwise be possible.

Private sector contributions

The nature of the consortium's work is not strictly academic. Startups hope to use the group's extensive computing resources to develop treatments, molecular designs, and drugs against COVID-19.

Novel Techsciences from Kolkata is identifying phytochemicals from more than 3,000 medicinal plants and antiviral plant extracts in India that could act as natural drugs against COVID-19. The team also plans to isolate herbal compounds that may help combat the multidrug resistance that arises when the coronavirus mutates, with the aim of developing a comprehensive prophylactic treatment regimen.

In London, Y Combinator-backed PostEra oversees the Moonshot project, which aims to produce inhibitors based on over 60 fragment hits (i.e., molecules that have been validated to bind to a target protein, making them a chemical starting point for drug discovery) isolated in experiments to determine the molecular structure of COVID-19. By running machine learning algorithms in the background to generate triage proposals and synthesis plans, PostEra has identified around 21 highly effective, volunteer-submitted molecular designs that are being synthesized by the chemical company Enamine. The results of this project will be tested on animals in the coming months.

If successful, PostEra's would be one of the first open source drugs. "(Machine learning) can shorten the time to determine optimal ways to make these compounds from weeks to days," the company said in a statement. "(We believe) the global scientific community (can suggest) drug candidates that bind to (COVID-19) and could neutralize it."

Another private sector project is led by the London-based AI startup Kuano. This team intends to learn about diseases that are similar to COVID-19 – mainly other coronaviruses – in order to develop an effective COVID-19 drug. These efforts are based on a genetic algorithm that scans the chemical space for existing antiviral drugs, and a deep learning-based classification model that is based on available binding data. The company combines these tools with docking and molecular dynamics simulations to improve results and obtain machine learning models that can be used to evaluate molecular designs for synthesis as antiviral compounds.
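The article gives no details of Kuano's genetic algorithm. To show the general idea of selection, crossover, and mutation, the toy sketch below evolves bit strings toward a hidden target. The bit-string "molecules", the fitness function (a stand-in for a learned classification model), and every parameter are assumptions made purely for illustration.

```python
import random

rng = random.Random(42)
GENOME_LENGTH = 32

# Hypothetical "ideal" candidate; a real pipeline would score molecules with a model.
TARGET = [rng.randint(0, 1) for _ in range(GENOME_LENGTH)]

def fitness(genome):
    """Count positions that match the target (placeholder for a classifier score)."""
    return sum(gene == wanted for gene, wanted in zip(genome, TARGET))

def crossover(parent_a, parent_b):
    """Splice two parents together at a random cut point."""
    cut = rng.randrange(1, GENOME_LENGTH)
    return parent_a[:cut] + parent_b[cut:]

def mutate(genome, rate=0.02):
    """Flip each bit independently with a small probability."""
    return [1 - gene if rng.random() < rate else gene for gene in genome]

def evolve(pop_size=50, generations=40):
    population = [[rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 5]  # the top 20% survive unchanged
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history

best, history = evolve()
print(f"Best score: first generation {history[0]}, last generation {history[-1]} "
      f"(maximum possible {GENOME_LENGTH})")
```

Because the best candidates are carried over unchanged each generation, the top score never drops. In a drug discovery setting the expensive part is the scoring, which is why such searches are paired with docking and molecular dynamics runs on large machines.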

AI and drug development startup Innoplexus is also working with the consortium's supercomputers to accelerate the discovery of molecules that could lead to a drug to fight COVID-19. Five promising candidates are expected to be permuted – especially candidates that are effective, non-toxic, and can be manufactured.

Gaining momentum

Despite the fact that much of the work is still at an early stage, the momentum around the consortium appears to be accelerating.

Last month, IBM announced that UK Research and Innovation (UKRI) and the Swiss National Supercomputing Centre (CSCS) would join the consortium and provide machines including ARCHER from the University of Edinburgh; the DIRAC of the Science and Technology Facilities Council; the Earlham Institute of the Biotechnology and Biological Sciences Research Council; and Piz Daint, the world's sixth-ranked supercomputer according to the Top500 list. The new additions raised the total number of petaflops available to 483, up from 437 in May and 300 in mid-March.

"The COVID-19 HPC Consortium … is the largest public-private computing partnership that has ever been established. What started as a series of phone calls (…) five days later, more than two dozen partners came on board, many of whom are usually rivals," said IBM's Turek. "Without large contracts or the like, this group comes together to both share resources and manage a process to accelerate the scientific proposals received by the consortium and match them to the best resources."

About Cameron Roy Russell

Cameron Roy Russell is a 26 years old local activist who enjoys podcasting, walking and stealing candy from babies. He is energetic, but can also be very untrustworthy and a bit sadistic.
