Common Mistakes in the ML Development Lifecycle // Kseniia Melnikova // MLOps Meetup #65
MLOps.community - A podcast by Demetrios Brinkmann
MLOps community meetup #65! Last Wednesday we talked to Kseniia Melnikova, Product Owner (Data/AI), SoftwareOne.

//Abstract
In this MLOps Meetup, we talked about the Machine Learning model lifecycle and its development stages, and then analyzed the main mistakes that everybody makes at each stage. Kseniia also provided the audience with solutions to these mistakes, and we discussed existing tools for experiment management.

//Bio
Kseniia is a product owner for Data/AI-based products. Right now, she is working mostly with numeric data analysis, customer insights, and product recommendations. Previously, Kseniia worked at Samsung Research with the biometrics team. She studied computer science in Russia (Moscow) and a little bit of management in South Korea (Seoul). One of her most interesting research directions is Model Lifecycle Management Systems and Reproducibility.

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Kseniia on LinkedIn: https://www.linkedin.com/in/kseniia-melnikova/

Timestamps:
[00:00] Introduction to Kseniia Melnikova
[02:00] MLOps World Conference Announcement
[03:40] AI Development process: Common Mistakes
[07:45] Step 1: Planning
[07:48] Mistake #1: Personal Decisions - Teamwork
[08:31] Mistake #1: Cases
[09:00] Mistake #1: Solution
[11:52] Scrum
[12:50] "In Scrum, it's hard to plan because especially in research, you don't know which result affects new tasks that's why it might be a little slow for Machine Learning."
[14:28] Step 2: Data Processing
[14:34] Mistake #2: Chaos with Datasets
[15:26] Mistake #2: Cases
[16:48] Mistake #2: Solution
[20:12] Step 3: Experiments
[20:21] Mistake #3: Lack of Experiments Tracking
[22:13] Mistake #3: Case - Manual Experiments Tracking
[24:10] Mistake #3: Solutions
[25:57] Experiments Tracking Tools Example: MLflow UI
[26:46] Awareness of Existing Tools
[28:21] Tools' Features
[29:21] Possible Combination
[29:48] Another Possible Combination
[30:24] Best Practice
[31:42] Mistake #0: Lack of Information Sharing
[32:26] Mistake #0: Solution - Organize more meetings/standups!
[34:18] Find Your Mistakes
[34:41] Mistake #0: Solution - Organize more meetings/standups!
[35:35] Audio Data
[39:32] Experiment tracking of only 1 ML engineer
[41:38] "I prefer reproducibility tools because it's automatic and it also takes a lot of time to manually upload the results into conference."
[43:03] AI Development Check-list
[43:40] Check-list Results
[44:52] "I think it's always interesting to rate yourself to share the results with other people to compete out of it."
[45:10] Why to Implement
[45:17] "If we have more automation on experimentations for data sets versioning, it will lead to less manual work."
[45:28] "AI Development process implementation will have the possibility to reproduce and compare experiments."
[45:37] "AI Development process implementation will make you comfortable on solving the issues you'll face every day."
[45:52] "AI Development process implementation will lead to a faster commercialization cycle because you will take less time on the process and more time for the results."
[46:03] "If we will take all the principles of AI Development process implementation, it will lead to easy communication between team members. You'll gain trust, have great teamwork, and everyone will have respect for each other."
[46:50] War stories prior to having AI Development process
[49:50] Calculating the lost money