5 minutes read
July 23, 2023
How Mystic helped Dumme reduce their ML cloud bill by 75%
Read how Mystic AI's Pipeline Core helped Dumme reduce their ML cloud costs by 75%.
“I think the great value that Mystic brings for us is the ability to access multiple GPUs for a fraction of the cost that we were paying before. If you don't want to pay for a GPU that is parked 24/7 on a cloud provider, then I don't think there's any other option when you are doing ML inference at scale.” - Merwane Drai, CEO
Founded in January 2022 and a participant in startup accelerator Y Combinator’s Winter 2022 program, Dumme are tipped as one of the hottest startups in the market. The company leverages AI to create short-form videos from YouTube content; customers can generate clips from their video podcasts to publish to Shorts - and the creator market is rushing to sign up.
Dumme co-founders: Will Dahlstrom (CPO), Merwane Drai (CEO) and Jordan Brannan (CTO)
What was the deployment challenge?
Dumme’s deployment is challenging because they run a combination of proprietary and existing AI models in production, and the run-time for each video is long. Initially, the team assumed they needed a dedicated GPU to run inference for each video. This worked in practice, but it presented an insurmountable scaling challenge: the number of simultaneous users would be limited by the number of GPUs available.
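To make that limit concrete, here is a minimal sketch of the one-GPU-per-video model. The function names and the half-hour run-time in the usage note are illustrative assumptions; the article does not state Dumme's actual run-time or fleet size at this stage.

```python
import math


def videos_waiting(simultaneous_users: int, gpus_available: int) -> int:
    """With one dedicated GPU per video, users beyond the fleet size must wait."""
    return max(0, simultaneous_users - gpus_available)


def hours_to_clear(videos: int, gpus_available: int, runtime_hours: float) -> float:
    """Time to process a backlog when each GPU handles one video at a time."""
    if gpus_available <= 0:
        raise ValueError("need at least one GPU")
    # Videos are processed in waves of at most `gpus_available` at once.
    waves = math.ceil(videos / gpus_available)
    return waves * runtime_hours
```

For example, `videos_waiting(1000, 32)` shows how many of 1,000 simultaneous users would be queued behind a 32-GPU fleet, and `hours_to_clear(1000, 32, 0.5)` estimates how long the whole batch would take at a hypothetical 30 minutes per video.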
What was their approach?
Dumme’s deployment was originally on AWS, but the number of GPUs they could access was limited, and anything above the allotted quota was ‘insanely expensive’. The team switched to GPU cloud provider CoreWeave, which provided the required compute. However, it soon became evident that the pricing model of a 24/7 ‘always-on, bare-metal’ fixed solution was unsuitable and unsustainable for a company with unpredictable traffic and fast-scaling demand.
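The gap between the two pricing models can be sketched with some simple arithmetic. Everything here is a placeholder: the demand profile is invented, the rate is a dummy value applied equally to both models, and none of it reflects CoreWeave's or Mystic's actual pricing. Only the 200-GPU peak echoes a figure mentioned in the interview.

```python
from typing import Sequence


def always_on_cost(reserved_gpus: int, hours: int, rate_per_gpu_hour: float) -> float:
    """Fixed fleet: every reserved GPU is billed for every hour, busy or idle."""
    return reserved_gpus * hours * rate_per_gpu_hour


def usage_based_cost(gpus_in_use_per_hour: Sequence[int], rate_per_gpu_hour: float) -> float:
    """Metered fleet: only GPU-hours actually used are billed."""
    return sum(gpus_in_use_per_hour) * rate_per_gpu_hour


# Hypothetical day: 2 busy hours needing 200 GPUs, 22 quiet hours needing 5.
demand = [200] * 2 + [5] * 22
rate = 1.0  # placeholder price per GPU-hour (assumption, not a real quote)

# An always-on fleet must be sized for the peak and paid for around the clock.
fixed = always_on_cost(max(demand), len(demand), rate)
metered = usage_based_cost(demand, rate)
savings = 1 - metered / fixed  # fraction of the fixed bill avoided
```

The spikier the traffic, the larger the share of the always-on bill that pays for idle GPUs. In practice a usage-based rate per GPU-hour may differ from a reserved rate, so the real saving depends on both the demand shape and the price gap.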
Why Mystic is the right solution
Dumme needed to scale fast - with over 20,000 people on the waitlist. They needed both high performance in terms of inference speed, as well as the ability to scale up and down without a hitch. Most of all, they needed optimisation of the infrastructure to bring down the costs of GPU compute.
As a fully managed, enterprise-grade platform designed to deploy ML models at scale with high throughput and consistent performance, Mystic was able to ensure that Dumme’s models were ready for use with minimal delay. The deployment on Pipeline Core enabled:
> High-parallel GPU scale from day 1, with 150-GPU capacity on the initial deployment
> Usage-based pricing at a fraction of the cost of cloud providers
> Seamless scalability to support growing model volumes and user demands
Founded: 2022 [YC22]
Interview with Dumme Co-Founder: Merwane Drai
Watch our interview with Merwane.
We spoke to co-founder and CEO Merwane Drai about the technical challenges of deploying the Dumme platform at scale, and why the team chose Mystic (over a number of other options) for their machine learning inference.
Who are Dumme’s customers, and how are they using the platform?
Merwane: “Our users right now are mostly creators, primarily podcasters. Their normal workflow is to hire people on platforms like Fiverr or Upwork to edit their long-form content into shorts, and some of them also have dedicated teams because they're pretty big. But the way they do it now is that they just point our whole system to their footage and their podcast, and it's an end-to-end system, which means they don't really have to interact with it - they get back a bunch of shorts which they can directly republish on platforms like Instagram, TikTok and YouTube Shorts.”
Why is machine learning deployment hard?
Merwane: “It’s actually pretty complicated because what we're doing under the hood is duct-taping a bunch of different models, some in-house and some open source, and running inference for that is pretty complicated because the run time is pretty long.
So we couldn't possibly use an API or dedicated service, simply because the run time is too long and we need a lot of flexibility. So initially, we approached this with the idea that we need to have dedicated GPUs to run inference directly.
And then obviously, it didn't work out because of the way it would need to scale if you have to have a dedicated GPU to edit each video. If you have like 32 users, that's fine, but if you have thousands of users, then that's pretty complicated.
Like if you want to have 1000 GPUs out there, then good luck, right? So yeah, that was a massive problem for us.”
Did you try to build the platform yourself?
Merwane: “We did, as a matter of fact. We were running it ourselves for the past two or three months - until we got rescued by Mystic!”
Why not go with a ‘bare-metal’ solution?
Merwane: Going with a ‘bare-metal’ provider “gave us access to the GPUs that we needed. However, that was actually a problem, because if you have 200 GPUs parked on CoreWeave, you have to pay for them 24/7. This, of course, led to a huge bill at the end of the month, and that’s tough for an early-stage company; it's very hard to predict what your usage will be.”
What is the value that Mystic brings to you?
Merwane: “The massive value for us is this idea of ‘let's make lambda for GPUs’! It’s actually pretty powerful, especially if you know exactly what your needs are, [for example], at one point in time I need 1000 and at another point in time, I only need like five.
That’s immensely valuable for anybody who's trying to scale any piece of software that depends on [GPU] compute. And the GPUs are 10x cheaper, so we can afford to scale up.”
What’s next for Dumme?
Merwane: “Right now what we do is long-form content to shorts, but the big vision that we never talk about, because it sounds very close to the realm of sci-fi, right, is that we just have this idea that we want to make video editing obsolete.
Like, how can you make autopilot for Final Cut Pro?
How can you put a baby Steven Spielberg in everyone's computer so that you can just walk in with 40 hours of footage, throw it at the system and get back your beautiful 10-minute video that you can upload on YouTube?
It's actually really complicated to get there. We don't even have the tech to get there. It doesn't exist yet. But we think our current thing is a great entry point to that.
So that's what we're doing!”
Mystic makes it easy to work with ML models and to deploy AI at scale. The self-serve platform provides a fast pay-as-you-go API to run pretrained or proprietary models in production. If you are looking to deploy a large product and would like to sign up as an Enterprise customer, please get in touch. In the meantime, follow us on Twitter and LinkedIn.