The 10 Measures and KPIs of ML Success

Diego Oppenheimer
Co-Founder & CEO

During this webinar, industry expert Diego Oppenheimer discussed the 10 Measures and KPIs of ML Success. We’ve included a short transcript of the webinar, beginning at 14:37.

Diego Oppenheimer, Algorithmia: So today we’re going to talk a little bit about the measures and KPIs we can use from a leverage and risk perspective. So, you know, the really important part is shortening the time to market. Automating the deployment of models shortens the time it takes to productionize a model from months to hours. We’ve seen a lot of use cases, especially in complex organizations, where it can take anywhere between 9 and 12 months to get a model into production. Even a month is too long to move a model into production. And if you think about the result of that model, that optimization, every day that it is not in production is a day that you’re not getting the advantages of it. So if you’re thinking about financial services and you have a risk model, every day that your model is not in production, your risk profile is elevated. You are not getting the advantage of that. If you have an upsell model, like a marketing/e-commerce machine learning model, every day you don’t have it is a day that you’re not enjoying that ROI as an organization.

You want to reduce your IT ops overhead. So again, this is another thing that’s going into the IT workflow that exists in an organization, and the ops overhead can be really, really expensive if it’s not managed. We want to provide data scientists and machine learning engineers with the proper speed of iteration and model optimization. As data changes, as world conditions change, these models need to be updated extremely quickly. It’s about providing that highway. If we think back to the principles of devops, the question was: how do we actually create a fast way of iteration and recreation? We want to bring that exact same idea into the world of machine learning.

Although model reusability across organizations is somewhat limited, inside organizations we see a lot of potential for reusability. So data sources where you’ve built extractors or built features around them, or cases where you already have a model for analyzing a given type of documentation: those are things that can be reused across your organization. But today, they have to be rebuilt in every single pocket because there’s no sharing capability, there’s no organizational capability around that.

Infrastructure scalability costs are another concern. These are pretty compute-hungry workflows, so the cost can really skyrocket. So optimizing your infrastructure for runtime and inference, and being able to scale across the proper hardware so that you can actually take advantage of cost controls, is also extremely important. So how do you actually measure those KPIs and measure value? Time to deployment. How many models you have in production, so you can actually see the gain and growth every single time you get some efficiencies around it. New models deployed each month. API calls to those models, which is a great way of understanding what value is being delivered. And then when you’re in an organization, especially an organization that’s thinking about centralized data science and machine learning teams, how do you actually produce a way of showing the cost of these models? How are the different parts of the organization using them? You want to be able to distribute that, not only from a cost perspective, but also from an audit perspective of who’s calling what, when, with what data.
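
To make those KPIs concrete, here is a minimal Python sketch of how they might be computed from a deployment log and an API call log. It is an illustration only, not something shown in the webinar: the record types (ModelRecord, ApiCall), their fields, and every sample value below are hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import Optional


@dataclass
class ModelRecord:           # hypothetical deployment-log entry
    name: str
    approved_on: date            # model signed off for deployment
    deployed_on: Optional[date]  # None while not yet in production


@dataclass
class ApiCall:               # hypothetical call-log entry
    model: str
    team: str        # part of the organization that made the call
    cost_usd: float  # compute cost attributed to this call


def days_to_deploy(records):
    """Average days from approval to production, over deployed models."""
    durations = [(r.deployed_on - r.approved_on).days
                 for r in records if r.deployed_on is not None]
    return mean(durations) if durations else None


def models_in_production(records):
    return sum(1 for r in records if r.deployed_on is not None)


def new_models_per_month(records):
    """Number of deployments keyed by 'YYYY-MM'."""
    return Counter(r.deployed_on.strftime("%Y-%m")
                   for r in records if r.deployed_on is not None)


def calls_per_model(calls):
    return Counter(c.model for c in calls)


def cost_per_team(calls):
    """Total attributed cost per calling team (a simple chargeback view)."""
    totals = defaultdict(float)
    for c in calls:
        totals[c.team] += c.cost_usd
    return dict(totals)


# Made-up sample data, only to exercise the functions above.
records = [
    ModelRecord("fraud-scoring", date(2020, 1, 6), date(2020, 2, 5)),
    ModelRecord("upsell-ranker", date(2020, 2, 3), date(2020, 2, 13)),
    ModelRecord("doc-classifier", date(2020, 3, 2), None),
]
calls = [
    ApiCall("fraud-scoring", "risk", 0.5),
    ApiCall("fraud-scoring", "risk", 0.25),
    ApiCall("upsell-ranker", "marketing", 0.5),
]

print("avg days to deploy:", days_to_deploy(records))
print("models in production:", models_in_production(records))
print("new models per month:", dict(new_models_per_month(records)))
print("API calls per model:", dict(calls_per_model(calls)))
print("cost per team (USD):", cost_per_team(calls))
```

Keeping the calling team on every call record is what lets the same log serve both purposes mentioned above: cost distribution and an audit trail of who is calling which model.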

So let’s talk a little bit about shortening the time to market. What do we get in here? So, automating deployment paths. This is really about just getting an IT-approved way to get deployments done very quickly. You want to see very quickly the impact of that model in production. You also want to be able to register and monitor the reporting around that. So we think about the data scientist, or whoever is actually doing the development. They’re developing the new model. They want to hand this to the ML engineer who will actually put it into production. And this is where we start looking at the automation around deployment and operations. There’s a devops person involved. There’s an app owner involved. Why are they involved? Well, the app owner wants to make sure that there’s uptime, that the actual application is running the way that it’s supposed to, and that it’s meeting the SLAs that they care about.

The devops person is actually responsible for making sure this whole thing runs. So, you know, who’s actually going to get paged at 2 A.M. when something goes wrong? It’s going to be the devops team. So they’re involved in this process and in actually pushing it forward, and at the end of the day the business owner that owns the application is actually going to be the one who’s the customer of this. So today we’re seeing an average time to deploy of over 31 days. This is extremely slow, and gains here can be made by automating the path. Actually seeing what it takes to go from a model being developed and approved to getting it into the hands of that application and customer becomes extremely important.
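
One way to see where the time goes along that path is to timestamp each handoff and average the gaps between them. The sketch below assumes a hypothetical four-stage path (developed, approved, deployed, serving the application) and uses made-up dates; the stage names are illustrative, not a breakdown given in the webinar.

```python
from datetime import date
from statistics import mean

# Hypothetical ordered stages of the deployment path.
STAGES = ["developed", "approved", "deployed", "serving_app"]


def stage_durations(timestamps):
    """Days spent between each pair of consecutive stages for one model."""
    return {
        f"{a}->{b}": (timestamps[b] - timestamps[a]).days
        for a, b in zip(STAGES, STAGES[1:])
    }


def average_by_stage(models):
    """Average days per handoff across all models."""
    per_model = [stage_durations(m) for m in models]
    return {key: mean(d[key] for d in per_model) for key in per_model[0]}


def slowest_handoff(models):
    """Name of the handoff with the highest average duration."""
    averages = average_by_stage(models)
    return max(averages, key=averages.get)


# Made-up timestamps for two models.
models = [
    {"developed": date(2020, 1, 1), "approved": date(2020, 1, 11),
     "deployed": date(2020, 1, 31), "serving_app": date(2020, 2, 2)},
    {"developed": date(2020, 3, 1), "approved": date(2020, 3, 5),
     "deployed": date(2020, 4, 4), "serving_app": date(2020, 4, 8)},
]

print(average_by_stage(models))
print("slowest handoff:", slowest_handoff(models))
```

In this made-up data the approved-to-deployed handoff dominates, which is the kind of gap that automating the deployment path is meant to close.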

So we’ve talked a little bit about an example here, which is with a top four consulting firm. The challenge this customer had was that they were building models for fraud detection and compliance, but getting a model into production, because of the nature of the data they were working with and the infosec requirements, was taking them almost 18 months. And during this time, criminal patterns and behaviors adapt. This is ever-evolving. The data changes, the use cases change.

So by the time a model could actually get into production and start producing results for the organization, it was already essentially expired or not really relevant anymore. So what we were able to provide them with in this automation was a workflow for machine learning that allowed them to get very, very quickly from experimentation to full-on production. It enabled modular pipelines for model reuse. So being able to have, say, different extractors, pre-processing functions, and post-processing functions, combining those through pipelines and deploying them, allowed for very, very quick editing, actually subbing out different parts of models. One thing to point out here, and as you all probably know, is that models don’t live in isolation, right? They’re part of pipelines, they actually get assembled together in some cases, but they also live alongside pre-processing and post-processing functions. And so being able to swap out modules to improve different pipelines is extremely important.
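
The idea of swapping one module out of a pipeline can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not Algorithmia’s API: the Pipeline class is hypothetical, and the scoring functions are toy keyword checks standing in for real fraud models.

```python
from typing import Any, Callable, List, Tuple

Step = Callable[[Any], Any]


class Pipeline:
    """Hypothetical pipeline: an ordered list of named, swappable steps."""

    def __init__(self, steps: List[Tuple[str, Step]]):
        self.steps = list(steps)

    def replace(self, name: str, new_step: Step) -> "Pipeline":
        """Return a new pipeline with one named step swapped out."""
        if name not in [n for n, _ in self.steps]:
            raise KeyError(name)
        return Pipeline([(n, new_step if n == name else s)
                         for n, s in self.steps])

    def run(self, value: Any) -> Any:
        for _, step in self.steps:
            value = step(value)
        return value


def extract_text(doc: dict) -> str:
    """Pre-processing: pull out and normalize the text field."""
    return doc["text"].strip().lower()


def score_v1(text: str) -> float:
    """Toy stand-in for a first model version."""
    return 1.0 if "wire transfer" in text else 0.0


def score_v2(text: str) -> float:
    """Toy stand-in for an updated model that knows more patterns."""
    keywords = ("wire transfer", "offshore", "urgent")
    return sum(kw in text for kw in keywords) / len(keywords)


def to_label(score: float) -> str:
    """Post-processing: turn a score into a decision."""
    return "review" if score >= 0.5 else "pass"


pipeline_v1 = Pipeline([("extract", extract_text),
                        ("model", score_v1),
                        ("postprocess", to_label)])

# Swap only the model step; the extractor and post-processing are reused.
pipeline_v2 = pipeline_v1.replace("model", score_v2)

doc = {"text": "  URGENT: offshore account update "}
print(pipeline_v1.run(doc))
print(pipeline_v2.run(doc))
```

Because replace returns a new pipeline instead of mutating the old one, the previous version stays available for comparison or rollback while the updated model is tried out.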

At the end of the day the result is what matters, right? They deployed over a hundred models with 10 different open source frameworks and more than 75 libraries attached to them in under five months. That’s the measurement that we can go look at and say this is the business impact we’re having. We went from a model taking us almost a year to get into production to a speed of iteration that now actually allows delivering higher quality tooling to their organization and to their customers, to, at the end of the day, catch these criminal patterns around fraud detection and compliance.


Learn more and watch the full video on YouTube:
