How we optimized our Machine Learning Training Infrastructure Costs @ Mercari US
"Making Mercari" is an engineering blog for Mercari's US product. AI-related posts on "Making Mercari" will also be featured on this website.
Here is a post by Abhishek Munagekar, Software Engineer for the Machine Learning Platform at Mercari US. In this blog post, he describes how his team optimized the cost of the computing resources used for machine learning training. More specifically, the team ran into challenges with scaling down the Kubernetes cluster used for ML model training. Read the blog post to find out how PodDisruptionBudget, dedicated node pools, and the Gatekeeper Assign CRD were used to overcome these challenges.