- Databricks created MLflow in response to the complicated process of ML model development
- ML teams must also track versions of data sets, model parameters, and algorithms, which creates an exponentially larger set of variables to manage
The Linux Foundation has announced that MLflow, an open source machine learning (ML) platform created by Databricks, will join the foundation. Databricks created MLflow in response to the complicated process of ML model development.
The MLflow project reports strong community engagement, with more than 200 contributors, and says it is downloaded more than 2 million times per month, with downloads growing 4x annually. The Linux Foundation gives the project a vendor-neutral home with an open governance model.
Build, train, tune, deploy, and manage machine learning models
Michael Dolan, vice president of strategic programs at the Linux Foundation, said, “The steady increase in community engagement shows the commitment data teams have to building the machine learning platform of the future. The rate of adoption demonstrates the need for an open source approach to standardizing the machine learning lifecycle. Our experience in working with the largest open source projects in the world shows that an open governance model allows for faster innovation and adoption through broad industry contribution and consensus building.”
Traditionally, building, training, tuning, deploying, and managing machine learning models has been extremely difficult for data scientists and developers. ML teams must also track versions of data sets, model parameters, and algorithms, which creates an exponentially larger set of variables to manage.
ML development is also highly iterative and relies on close collaboration between data teams and application teams. According to the project, MLflow keeps the process from becoming overwhelming by providing a platform to manage the end-to-end ML development lifecycle, from data preparation to production deployment. It includes experiment tracking, packaging of code into reproducible runs, and model sharing and collaboration.
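To make the scale of that bookkeeping concrete, here is a minimal sketch in plain Python. It is not MLflow's API: the function names, field names, and values are all hypothetical, chosen only to show how quickly the number of runs to track grows when data set versions, parameters, and algorithms vary independently.

```python
import hashlib
import itertools
import json

# Hypothetical values, chosen only for illustration.
DATASET_VERSIONS = ["v1", "v2", "v3"]
LEARNING_RATES = [0.01, 0.1]
ALGORITHMS = ["linear", "tree"]


def run_id(record):
    """Derive a stable ID from the tracked fields, so identical setups get the same ID."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def enumerate_runs():
    """Return one tracked record per combination of data set, parameter, and algorithm."""
    runs = {}
    for data, lr, algo in itertools.product(DATASET_VERSIONS, LEARNING_RATES, ALGORITHMS):
        record = {"dataset_version": data, "learning_rate": lr, "algorithm": algo}
        runs[run_id(record)] = record
    return runs


if __name__ == "__main__":
    runs = enumerate_runs()
    print(len(runs), "distinct runs to track")
```

With only three data set versions, two learning rates, and two algorithms, there are already 3 × 2 × 2 = 12 distinct runs to record. Each additional dimension multiplies the total, which is the growth the article describes.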
Matei Zaharia, the original creator of Apache Spark and creator of MLflow, shared the news with the data community during his keynote presentation at Spark + AI Summit. Zaharia added, “MLflow has become the open source standard for machine learning platforms because of the community of contributors, which consists of hundreds of engineers from over a hundred companies. Machine learning is transforming all major industries and driving billions of decisions in retail, finance, and health care. Our move to contribute MLflow to the Linux Foundation is an invitation to the machine learning community to incorporate the best practices for ML engineering into a standard platform that is open, collaborative, and end-to-end.”