A group of leading academic and industry partners has announced the formation of OpenFold, a non-profit artificial intelligence (AI) research consortium whose goal is to develop free and open source software tools for biology and drug discovery. The Open Molecular Software Foundation (OMSF) is a non-profit organisation that advances molecular sciences by creating communities for open source research software development.
The Columbia University Laboratory of Mohammed AlQuraishi, Ph.D., Arzeda, Cyrus Biotechnology, Genentech’s Prescient Design, and Outpace Bio are among the founding members of OpenFold. The consortium, which is open to other organisations for membership, is hosted by OMSF and supported by Amazon Web Services (AWS) as part of the AWS Open Data Sponsorship Program. OpenFreeEnergy and OpenForceField are also hosted by OMSF.
“In biology, structure and function are inextricably linked, so a deep understanding of structure is required to elucidate molecular mechanisms and engineer biological systems,” said Brian Weitzner, Ph.D., Associate Director of Computational and Structural Biology at Outpace and co-founder of OpenFold. “We believe that open collaboration and access to powerful AI-powered structural biology tools will transform biotechnology and biosciences by providing free access to researchers and educators from life science companies, technology companies, and academia to use and extend these tools to accelerate discovery and develop life-changing technologies.”
The consortium’s first major research focus is on developing cutting-edge AI-based protein modelling tools that can predict molecular structures with atomic precision. The OpenFold consortium is modelled after pre-competitive open source consortia in the technology industry, such as Linux and OpenAI.
First consortium-released AI model to predict protein structure yielding impressive results
The OpenFold founders also officially announced today the full release of their first AI model for protein structure prediction, developed in Dr. AlQuraishi’s laboratory and first publicly acknowledged on Twitter on June 22, 2022. The model is based on ground-breaking research conducted by Google DeepMind and the University of Washington’s Institute for Protein Design. The software is available under a free and open source licence from the Apache Software Foundation at https://github.com/aqlaboratory/openfold, and the training data is available in AWS’s Registry of Open Data. A formal preprint and publication will be available soon.
“This first OpenFold AI model is already producing highly accurate predictions of protein crystal structures as benchmarked on the Continuous Automated Model EvaluatiOn (CAMEO), and has yielded on-average higher accuracy and faster runtimes than DeepMind’s AlphaFold2,” said Yih-En Andrew Ban, Ph.D., VP Computing at Arzeda and co-founder of OpenFold. The figure shows an example of OpenFold output with comparison to experimental data.
CAMEO is a project developed by the protein structure prediction scientific community to assess prediction accuracy and reliability. “The first release of the OpenFold software includes not just inference code and model parameters, but full training code, a complete package that has not been released by another entity in the space,” said Lucas Nivon, Ph.D., CEO at Cyrus and co-founder of OpenFold. “It will enable the training of a comprehensive set of derivative models for specialised applications in biologics, small molecules, and other modalities.”
Researchers from all over the world will be able to use, improve, and contribute to the consortium’s “predictive molecular microscope.” These derivative models will be extended in the future to integrate with other software in the field and to be more useful for protein design and biologics drug discovery in particular.
Several other corporate and non-profit organisations are currently becoming full members of the OpenFold consortium, and the founders invite biotech, pharma, technology, and other research organisations to join. The consortium is currently reviewing proposals from academic groups all over the world for new AI protein projects.