Integrating NVIDIA BioNeMo NIM Microservices in Self-Driving Virtual Screening with Artificial, Inc.

Yao Fehlis, Charles Crain, Janet Paulsen

Virtual screening is crucial in drug discovery as it enables the rapid and cost-effective identification of potential drug candidates from vast compound libraries, significantly reducing the time and resources needed for experimental testing. It helps prioritize the most promising compounds for further development and testing.

NVIDIA BioNeMo is a platform for developing and deploying AI models for digital biology applications. Designed to accelerate drug discovery, protein engineering, and virtual screening, BioNeMo provides pre-trained AI models as containerized NVIDIA Inference Microservices (NIM), along with tools to predict molecular interactions, facilitate protein folding, and analyze biomolecular data efficiently.

Artificial, Inc., a member of the NVIDIA Inception program for startups, is a pioneering technology company that accelerates life science discoveries by streamlining laboratory operations through a cloud-based platform. This platform is designed to seamlessly integrate instruments, devices, and data systems, creating a holistic ecosystem tailored to modern lab needs. With its focus on interoperability and data management, Artificial transforms labs into agile and efficient environments capable of real-time monitoring and control.

A significant aspect of Artificial’s innovation lies in its advanced model integration capabilities. The platform's open API library and adapter modules allow for the seamless incorporation of cutting-edge AI/ML models, such as BioNeMo, into laboratory workflows. These integrations enhance the lab's operational efficiency by enabling sophisticated data analyses and iterative processes within orchestrated workflows. By aligning these models with existing lab systems, Artificial empowers scientific teams to leverage computational insights effectively, accelerating research timelines and elevating the precision of experimental outcomes.

Orchestrating BioNeMo Models in Self-Driving Virtual Screening

We demonstrated workflow and model integration in Artificial with a proof-of-concept (PoC) use case for self-driving generative virtual screening. The PoC shows how NVIDIA BioNeMo NIM microservices integrate with Artificial's orchestration platform to support molecular screening. It targets the main protease of SARS-CoV-2, a critical enzyme in viral replication and a well-established drug target in COVID-19 therapy. The use case also shows how the process can be automated, that is, run in a self-driving manner until predefined criteria are met. This is potentially useful for self-driving labs, where Artificial can orchestrate the workflows and integrate the AI models.

Based on the NVIDIA BioNeMo Blueprint for generative virtual screening, the self-driving virtual screening process is structured in iterative feedback loops, combining molecule selection, protein folding (if needed), docking, and binding affinity scoring. The workflow is executed as follows:

  • Initial Molecule Selection: The process begins with a library of molecular structures. For this PoC, we utilized four known COVID-19 drug components as the initial candidates:
      • Nirmatrelvir
      • Ensitrelvir
      • Molnupiravir
      • Ritonavir
  • Protein Folding: Using the SARS-CoV-2 main protease sequence, the NVIDIA AlphaFold2 NIM microservice generates folded protein structures compatible with the docking simulations.
  • Molecule Design and Selection: The NVIDIA MolMIM NIM module iteratively generates new molecular structures based on the properties and scoring of previously docked molecules.
  • Docking and Scoring: Molecules are docked onto the SARS-CoV-2 main protease using the NVIDIA DiffDock NIM, and their binding affinities are calculated via DSMBind to estimate how strongly they bind to the target.
  • Iteration Dynamics: If the number of selected molecules meeting the predefined binding-affinity threshold (-1.4 million) is below ten, the cycle continues with a new generation of molecules.
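The feedback loop described in the steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PoC's actual code: `run_screening`, `toy_generate`, and `toy_score` are hypothetical names, the two toy functions are stand-ins for the MolMIM generation step and the DiffDock/DSMBind docking-and-scoring step (no real NIM API is called), and molecule identifiers are placeholder strings rather than real structures. The sketch also assumes that lower (more negative) scores are better, which the text does not state explicitly. Only the threshold (-1.4 million) and the minimum of ten molecules come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

THRESHOLD = -1.4e6  # binding-affinity threshold quoted in the text
MIN_HITS = 10       # number of molecules that must meet the threshold


@dataclass
class ScreeningResult:
    iterations: int           # generation rounds that were run
    hits: List[str]           # molecules meeting the threshold
    scores: Dict[str, float]  # every molecule scored so far


def run_screening(
    seeds: List[str],
    generate: Callable[[List[str]], List[str]],
    score: Callable[[str], float],
    threshold: float = THRESHOLD,
    min_hits: int = MIN_HITS,
    top_k: int = 4,
    max_iterations: int = 20,
) -> ScreeningResult:
    """Generate, score, and repeat until enough molecules meet the threshold."""
    scores = {mol: score(mol) for mol in seeds}
    iterations = 0
    while True:
        # ASSUMPTION: lower (more negative) scores indicate better binding.
        hits = [mol for mol, s in scores.items() if s <= threshold]
        if len(hits) >= min_hits or iterations >= max_iterations:
            return ScreeningResult(iterations, hits, scores)
        # Feed the best-scoring molecules back into the generator.
        parents = sorted(scores, key=scores.__getitem__)[:top_k]
        for mol in generate(parents):
            if mol not in scores:
                scores[mol] = score(mol)
        iterations += 1


# Hypothetical stand-ins for the generation and docking/scoring services.
def toy_generate(parents: List[str]) -> List[str]:
    """Derive three placeholder variants from each parent molecule."""
    return [f"{parent}/{i}" for parent in parents for i in range(3)]


def toy_score(mol: str) -> float:
    """Arbitrary toy score that improves with each round of derivation."""
    return -1_000_000.0 - 150_000.0 * mol.count("/")


if __name__ == "__main__":
    seeds = ["Nirmatrelvir", "Ensitrelvir", "Molnupiravir", "Ritonavir"]
    result = run_screening(seeds, toy_generate, toy_score)
    print(result.iterations, len(result.hits))
```

Passing the generator and scorer in as callables keeps the stopping logic independent of how the models are hosted, and the `max_iterations` cap (an addition of this sketch) guards against a loop that never reaches the criteria.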

After three self-driving iterations, the set criteria were fulfilled:

  • The binding affinity threshold of -1.4 million was met.
  • A minimum of ten molecules were identified whose scoring values surpassed the required threshold.

This iterative approach highlights the capability of Artificial's platform to autonomously refine molecular candidates, optimizing the selection process to focus on high-likelihood drug compounds. By combining BioNeMo's advanced molecular modeling tools with Artificial's automated orchestration and feedback systems, the PoC successfully demonstrated how AI-driven methodologies can accelerate virtual screening and drug discovery processes.

Artificial’s Infrastructure for Model Integration

Artificial’s LabOps performs data management and integration for dry and wet labs, integrating purely in silico experimentation (as detailed here) with downstream laboratory automation. Central to Artificial’s architecture is the Lab Gateway, a network appliance that bridges the Artificial cloud to on-premises resources securely using bidirectional HTTP/2 streaming over an outbound connection. This allows LabOps to control and integrate data from automated laboratory equipment and secure IT infrastructure, including GPUs/TPUs for hosting AI models.

NVIDIA BioNeMo NIMs (containerized, pre-trained accelerated AI models, complete with APIs) allow models to be hosted in the cloud or securely behind a customer firewall on-premises. All data transacted between the Lab Gateway and any resource, cloud or on-premises, is kept in a permanent, immutable data record that is accessible via APIs or from other workflows. This allows data generated by inference and/or experimentation to be accessible for later training or fine-tuning of models.
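To make the idea of an immutable data record concrete, here is a minimal in-memory sketch of an append-only log. It is an illustration only and does not describe how Artificial's platform stores data: `Record` and `AppendOnlyLog` are hypothetical names, and the hash chaining shown here is one common technique for making tampering detectable, chosen for this sketch rather than taken from the text.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Record:
    index: int
    source: str        # which model or instrument produced the data
    payload_json: str  # payload serialized once, so later edits cannot leak in
    prev_hash: str     # digest of the previous record ("" for the first)
    digest: str        # SHA-256 over this record's other fields


def _digest(index: int, source: str, payload_json: str, prev_hash: str) -> str:
    material = json.dumps([index, source, payload_json, prev_hash])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


class AppendOnlyLog:
    """Hypothetical append-only store: records can be added and read, never edited."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, source: str, payload: Dict[str, Any]) -> Record:
        payload_json = json.dumps(payload, sort_keys=True)
        prev_hash = self._records[-1].digest if self._records else ""
        index = len(self._records)
        record = Record(index, source, payload_json, prev_hash,
                        _digest(index, source, payload_json, prev_hash))
        self._records.append(record)
        return record

    def records(self) -> Tuple[Record, ...]:
        # Return a tuple so callers cannot modify the underlying list.
        return tuple(self._records)

    def verify(self) -> bool:
        """Recompute every digest and check that the chain is unbroken."""
        prev_hash = ""
        for i, rec in enumerate(self._records):
            if rec.index != i or rec.prev_hash != prev_hash:
                return False
            if rec.digest != _digest(rec.index, rec.source,
                                     rec.payload_json, rec.prev_hash):
                return False
            prev_hash = rec.digest
        return True
```

Because each digest covers the previous record's digest, changing any stored record invalidates every record after it, so `verify()` detects the change. A later training or fine-tuning job could read `records()` without being able to alter what was logged.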

Conclusion

Artificial, Inc. excels in integrating AI models, such as those implemented in BioNeMo NIMs, into self-driving virtual screening processes, highlighting its adept orchestration and data management capabilities. This methodology's success is likewise applicable to AI-driven self-driving laboratories. By leveraging sophisticated orchestration tools, real-time simulation, and robust API integrations, Artificial’s platform offers seamless adaptability and scalability not just for drug discovery, but also for a broader range of laboratory environments. This approach supports automated workflows, allowing for enhanced data quality, streamlined operations, and optimized resource utilization, ultimately advancing both virtual screening efforts and comprehensive laboratory automation.

Authors

Yao Fehlis is the Head of AI at Artificial where she focuses on leading AI strategies, developing AI solutions and partnering with stakeholders. Her research interests lie in areas such as AI for science, AI for manufacturing, and large language models. She holds a PhD in computational chemistry from Rice University.

Charles Crain is the Chief Technology Officer at Artificial. He is responsible for the Artificial platform, including LabOps and developer tools. His background spans 25+ years in software, including scientific computation, large-scale distributed computing, robotics, and generative AI. He holds a bachelor’s degree in chemical engineering with a biochemical focus from Rice University.

Janet Paulsen is the Senior Alliance Manager for drug discovery at NVIDIA. Her background lies at the intersection of physics-based simulations, molecular modeling, and the acceleration of these workflows using machine learning. She holds a PhD in pharmaceutical sciences from the University of Connecticut.

Acknowledgments

We would like to thank Youssef Nashed for technical support on BioNeMo-related questions.
