How I helped a growing audio licensing startup with zero AI expertise become an AI company
// The Problem
They had terabytes of sound effects that largely sounded the same. The catalog offered little uniqueness and no way to mix or transform sounds in real time. They wanted to give sound engineers the ability to generate new, high-quality sounds on demand, but they did not have the in-house expertise or tooling to make this possible.
// Solution
I fine-tuned foundation diffusion models on the highest-quality sounds in their catalog, capturing the tonal range and production quality of their library. I made the models fully promptable, allowing sound engineers to guide generation through structured actions and free-form text. The models were adapted to understand creative cues such as mood, atmosphere, and cultural context, enabling controlled sound generation suitable for real production use.
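To make that concrete, here is a minimal sketch of how structured creative cues and free-form text could be folded into a single conditioning prompt for a text-to-audio diffusion model. The field names (`mood`, `atmosphere`, `cultural_context`) and the `generate_sound` stub are illustrative assumptions, not the startup's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CreativeCues:
    """Structured controls a sound engineer can set alongside free-form text.
    Field names are illustrative, not the production schema."""
    mood: Optional[str] = None              # e.g. "tense", "playful"
    atmosphere: Optional[str] = None        # e.g. "cavernous", "intimate"
    cultural_context: Optional[str] = None  # e.g. "Japanese festival"

def build_prompt(free_text: str, cues: CreativeCues) -> str:
    """Fold structured cues into the text prompt the model is fine-tuned to follow."""
    parts = [free_text.strip()]
    if cues.mood:
        parts.append(f"mood: {cues.mood}")
    if cues.atmosphere:
        parts.append(f"atmosphere: {cues.atmosphere}")
    if cues.cultural_context:
        parts.append(f"cultural context: {cues.cultural_context}")
    return ", ".join(parts)

def generate_sound(prompt: str, duration_s: float = 5.0) -> bytes:
    """Hypothetical stand-in for the fine-tuned text-to-audio diffusion model."""
    print(f"Generating {duration_s}s of audio for prompt: {prompt!r}")
    return b""  # placeholder audio buffer

if __name__ == "__main__":
    cues = CreativeCues(mood="ominous", atmosphere="rain-soaked alley",
                        cultural_context="1940s noir film")
    generate_sound(build_prompt("slow footsteps on wet pavement", cues))
```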
Because I understood the model's architecture, I took a highly data-centric approach: I kept the architecture unchanged and invested in data design, curating and structuring the training data rather than modifying the model itself. It was an iterative process that required tight feedback loops, rapid experimentation on audio quality and controllability, and continuous refinement of the models to reach results that were both creative and production-ready.
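As one concrete illustration of that data-centric side, the sketch below curates a training set from catalog metadata: filtering to the highest-quality recordings and turning tags into captions a model can learn to condition on. The metadata fields, quality score, and threshold are assumptions for the example, not the startup's actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class CatalogItem:
    """A catalog entry; fields are illustrative, not the real schema."""
    audio_path: str
    quality_score: float  # assumed 0-1 internal quality rating
    tags: dict            # e.g. {"mood": "eerie", "source": "field recording"}

def item_to_caption(item: CatalogItem) -> str:
    """Turn structured metadata into a natural-language caption for fine-tuning."""
    described = ", ".join(f"{k}: {v}" for k, v in sorted(item.tags.items()))
    return described or "unlabeled sound effect"

def build_training_pairs(catalog: list[CatalogItem], min_quality: float = 0.9):
    """Keep only the highest-quality recordings and pair each with a caption."""
    return [
        (item.audio_path, item_to_caption(item))
        for item in catalog
        if item.quality_score >= min_quality
    ]

if __name__ == "__main__":
    catalog = [
        CatalogItem("sfx/door_creak_01.wav", 0.95,
                    {"mood": "eerie", "atmosphere": "old mansion"}),
        CatalogItem("sfx/door_creak_02.wav", 0.60, {"mood": "eerie"}),
    ]
    for path, caption in build_training_pairs(catalog):
        print(path, "->", caption)
```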
// The Impact
The results were strong enough to communicate the vision clearly to investors and early partners. This unlocked hiring, allowed them to build an internal AI team, and laid the foundation for their transition into a full AI audio licensing company.
Today, their customers include multi-billion-dollar companies across tech and Hollywood, positioning them as a promising AI audio licensing platform.