Understand Machine Learning In Production for Data Science
This article was published as a part of the Data Science Blogathon
In the real world, machine learning is not just about building a model and achieving high accuracy; that is only a small part of a complete machine learning project.
MLOps is a must-know skill set for machine learning engineers. It is the process of taking your model into a production system. A vast amount of work has gone into improving model accuracy in recent years, but if you’re building a product, it’s not just about the model.
In this article, we will discuss MLOps, from scoping a problem to deploying, monitoring, and maintaining the system.
What is MLOps?
The way I like to define MLOps is as a process for developing an end-to-end product, from scoping a problem to deploying, monitoring, and maintaining the product.
Nowadays, many companies are integrating ML into their products, and interest in MLOps has grown sharply.
As the figure shows, the ML code makes up only a small percentage of a whole machine learning product. Below are explanations of the important terms shown in the figure:-
- Data Collection:- Collecting data from various sources.
- Data Verification:- Validating the data through data analysis.
- Feature Extraction:- Extracting features from text or image data.
- Serving Infrastructure:- The basic infrastructure needed to serve the model.
- Monitoring:- After deployment, you monitor your model using some metrics.
Scoping, or defining the project, is one of the key parts of the MLOps architecture. In this stage, you decide what to work on in the project.
Eg:- Let’s say you’re building a speech recognition system. In that case, you will tend to focus on key metrics such as accuracy and latency.
Eg:- Let’s say you’re working on an image recognition system. In that case, you will tend to focus more on things like image preprocessing and normalization.
The main reason machine learning engineers fail is that they don’t understand the problem statement and haven’t scoped the problem. So I recommend spending some time on this with your team.
Tip:- Whenever you’re working on this stage, discuss with your team which metrics to prioritize.
In this stage, you’ll define the data, establish a baseline, and work on labeling and organizing the data.
Eg:- Let’s say you’re working with audio data. You will normalize the volume of the audio and do some further preparation of the data.
Eg:- Let’s say you’re working on image data. You’ll normalize, crop, and resize your images, and so on.
The challenge you might face is that your data is in very bad condition. In that case, I recommend trying to label or organize the data manually.
Tip:- Before preparing the data, do some data analysis to get a better sense of the data you’re working on.
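Here is a minimal sketch of the audio example above, assuming the clip is already loaded as a list of float samples between -1.0 and 1.0. The function name `peak_normalize` and the `target_peak` value are my own illustrative choices, not part of any particular library.

```python
def peak_normalize(samples, target_peak=0.9):
    """Scale audio samples so the loudest one has magnitude target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent (or empty) clip: there is nothing to scale.
        return list(samples)
    scale = target_peak / peak
    return [s * scale for s in samples]


# Hypothetical quiet clip: the loudest sample is only 0.2.
quiet_clip = [0.0, 0.1, -0.2, 0.05]
normalized = peak_normalize(quiet_clip)
```

After this step every clip has the same peak volume, so the model does not have to learn to cope with recordings that are simply louder or quieter than others.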
In this stage, you’ll work on building your model, which includes research, algorithm selection, and hyperparameter tuning.
Here, you select and train the model, perform error analysis, and use what you learn from it to improve performance.
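One simple way to do error analysis is to tag each validation example and check which tags the model gets wrong most often. The sketch below assumes each example is a dict with `tag`, `label`, and `predicted` fields; these field names and the sample data are hypothetical.

```python
from collections import Counter


def error_rate_by_tag(examples):
    """Return {tag: fraction of examples with that tag that were misclassified}."""
    totals = Counter()
    errors = Counter()
    for ex in examples:
        totals[ex["tag"]] += 1
        if ex["predicted"] != ex["label"]:
            errors[ex["tag"]] += 1
    return {tag: errors[tag] / totals[tag] for tag in totals}


# Hypothetical validation results for a speech recognition model.
examples = [
    {"tag": "clean", "label": "hello", "predicted": "hello"},
    {"tag": "clean", "label": "stop", "predicted": "stop"},
    {"tag": "car_noise", "label": "hello", "predicted": "yellow"},
    {"tag": "car_noise", "label": "stop", "predicted": "stop"},
]
rates = error_rate_by_tag(examples)
```

A tag with a high error rate (here, the clips with car noise) is a good candidate for collecting more data or improving preprocessing.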
In this stage, you might face challenges like hyperparameter tuning and selecting the best algorithm.
Tip:- Whenever you’re struggling to select the best hyperparameters, I’d recommend reading the relevant papers and trying the hyperparameters that others have selected.
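Once you have a few candidate values (for example, ones reported in papers), a small grid search is one way to compare them. This is only a sketch: `toy_score` is a stand-in I made up for "train the model and return a validation score", and the grid values are illustrative.

```python
from itertools import product


def grid_search(train_and_score, grid):
    """Try every combination in grid and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, batch_size):
    # Stand-in for real training: the score peaks at lr=0.01, batch_size=32.
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(batch_size - 32) / 1000


grid = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [16, 32, 64]}
best_params, best_score = grid_search(toy_score, grid)
```

In a real project you would replace `toy_score` with a function that trains your model and evaluates it on a validation set. Grid search gets expensive quickly, so keep the grid small.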
This is also a crucial stage in MLOps. You deploy your model to production so that people can use it.
After deployment, you monitor your system and you maintain your system.
Eg:- Let’s say you have deployed a credit card fraud detection system. Hackers may change their strategy, and the system will start to fail, so you have to continuously monitor and maintain it. In that case, you will retrain your model on new data.
Below I have listed one deployment pattern that is often used in the industry:
Canary Deployment:- In this deployment pattern, you roll out your system to a small percentage of traffic and monitor it. If all is good, you roll out further; otherwise, you go back and tune your model.
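To make the idea concrete, here is a rough sketch of how requests could be split between the current ("stable") model and the new ("canary") model. It assumes each request has a string id; the function name `route_request`, the 5% fraction, and the `user-...` ids are my own illustrative choices. Real systems usually do this in a load balancer or serving layer rather than in application code.

```python
import hashlib


def route_request(request_id, canary_fraction=0.05):
    """Return "canary" for roughly canary_fraction of request ids, else "stable"."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "canary" if bucket < canary_fraction else "stable"


# Simulate 10,000 hypothetical requests and measure the canary share.
routes = [route_request(f"user-{i}") for i in range(10_000)]
canary_share = routes.count("canary") / len(routes)
```

Hashing the id, instead of picking randomly, means the same id always goes to the same model. If the canary looks healthy, you raise `canary_fraction` step by step until it reaches 1.0.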
Deployment is a very iterative process, so don’t get discouraged. Try asking people who have experience with your kind of problem.
In MLOps, monitoring is one of the important stages. You monitor how well your system performs as traffic comes in.
Eg:- Fraction of non-null outputs from the model. It may happen that your model is outputting nothing.
Eg:- Fraction of Input Missing Values
You can set up metrics and alerts, such as input metrics and output metrics.
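The two example metrics above can be sketched as follows. I am assuming that outputs are strings or `None`, that inputs are dicts of feature values, and that a "missing" input means a row with at least one `None` value; you may define these differently for your system.

```python
def fraction_non_null(outputs):
    """Fraction of model outputs that are neither None nor an empty string."""
    if not outputs:
        return 0.0
    non_null = sum(1 for o in outputs if o is not None and o != "")
    return non_null / len(outputs)


def fraction_missing_inputs(rows):
    """Fraction of input rows (dicts) that contain at least one None value."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if any(v is None for v in row.values()))
    return missing / len(rows)


# Hypothetical batch of recent traffic.
outputs = ["hello", "", None, "world"]
rows = [{"amount": 12.5, "country": None}, {"amount": 3.0, "country": "IN"}]

output_metric = fraction_non_null(outputs)
input_metric = fraction_missing_inputs(rows)
```

Computing these over a sliding window of recent requests lets you see whether they change over time.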
Suggestions on setting up monitoring metrics:-
- Brainstorm with your team the things that can go wrong with your model.
- Brainstorm a few metrics to detect the problems in your model.
- It is okay to use many metrics initially but remove the ones that you do not find useful.
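Once you have picked your metrics, a simple way to turn them into alerts is to give each one an allowed range. The sketch below is just one possible shape for this; the metric names and threshold values are made up for illustration.

```python
def check_alerts(metrics, thresholds):
    """Return the names of metrics that are missing or outside their allowed range."""
    alerts = []
    for name, (low, high) in thresholds.items():
        value = metrics.get(name)
        if value is None or not (low <= value <= high):
            alerts.append(name)
    return alerts


# Hypothetical allowed ranges agreed on with the team.
thresholds = {
    "fraction_non_null_outputs": (0.95, 1.0),
    "fraction_missing_inputs": (0.0, 0.10),
}
# Hypothetical values measured on recent traffic.
metrics = {"fraction_non_null_outputs": 0.72, "fraction_missing_inputs": 0.03}

alerts = check_alerts(metrics, thresholds)
```

Here only the output metric is out of range, so it is the only one reported. Dropping a metric you no longer find useful is just a matter of removing it from `thresholds`.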
ML model development is an iterative process, and so is deployment.
Thanks for reading!
I hope that you’ll apply these concepts and strategies in your ML product or project.
Did I miss something, or do you want to share your thoughts? Feel free to comment below and I’ll get back to you.
About the Author
I am a 14-year-old learner and a machine learning and deep learning practitioner, working in the domains of Natural Language Processing, Generative Adversarial Networks, and Computer Vision. I also make videos on machine learning, deep learning, and GANs on my YouTube channel, Newera. I am also a competitive coder, still practicing, and a passionate learner and educator.
You can connect with me on LinkedIn:- Ayush Singh
Your suggestions and questions are welcome in the comment section. Thank you for reading my article!
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.