Meta unveils AudioCraft for audio & music creation using Generative AI

By Akshay Kedari

The tech giant Meta has reportedly introduced AudioCraft, a framework capable of generating high-quality audio and music based on short text prompts.

The AudioCraft framework aims to simplify the use of generative models for audio by providing a collection of sound and music generators with compression algorithms, avoiding the need to switch between different codebases. The framework comprises three main generative AI models: EnCodec, MusicGen, and AudioGen.

For the record, AudioCraft isn't Meta's first venture into audio generation: the company previously open-sourced MusicGen, an AI-powered music generator. However, the new launch represents significant improvements in AI-generated sounds such as dogs barking, cars honking, and footsteps on wooden floors.

MusicGen itself has been available for some time, but Meta has now released its training code, allowing users to train the model on their own music datasets. Using MusicGen does raise ethical and legal concerns, however, since it learns from existing music and could produce output that closely resembles it.
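For readers who want to try it, the sketch below shows roughly how text-to-music generation looks with the audiocraft Python package. The checkpoint name, prompt, and generation settings are illustrative assumptions based on the package's documentation at release, not details from Meta's announcement.

```python
# Minimal sketch of text-to-music generation with the audiocraft package
# (assumes `pip install audiocraft` and the API as documented at release).
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Load a pretrained MusicGen checkpoint (the small model is an assumption;
# larger checkpoints were also published).
model = MusicGen.get_pretrained('facebook/musicgen-small')
model.set_generation_params(duration=8)  # seconds of audio to generate

# Generate waveforms from short text prompts (prompt is hypothetical).
prompts = ['upbeat acoustic folk with hand claps']
wav = model.generate(prompts)  # returns a batch of waveforms

# Write each waveform to disk with loudness normalization.
for idx, one_wav in enumerate(wav):
    audio_write(f'musicgen_sample_{idx}', one_wav.cpu(), model.sample_rate,
                strategy='loudness')
```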

Meanwhile, AudioGen, another component of AudioCraft, focuses on environmental sound effects and other audio; like MusicGen, it is an autoregressive model that generates compressed audio tokens rather than a diffusion model. It can create realistic environmental sounds from text descriptions of an acoustic scene and can also generate speech and music.
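AudioGen is exposed through a similar interface in the same package. The sketch below again assumes the audiocraft API as documented at release, with a hypothetical prompt and an illustrative checkpoint name.

```python
# Minimal sketch of text-to-sound-effect generation with AudioGen
# (assumes the audiocraft API as documented at release).
from audiocraft.models import AudioGen
from audiocraft.data.audio import audio_write

# Load a pretrained AudioGen checkpoint (name is an assumption).
model = AudioGen.get_pretrained('facebook/audiogen-medium')
model.set_generation_params(duration=5)  # seconds of audio to generate

# Describe the acoustic scene in plain text (prompt is hypothetical).
descriptions = ['dog barking in the distance while rain falls on a tin roof']
wav = model.generate(descriptions)

for idx, one_wav in enumerate(wav):
    audio_write(f'audiogen_sample_{idx}', one_wav.cpu(), model.sample_rate,
                strategy='loudness')
```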

The third model, EnCodec, is a lossy neural codec designed to compress any audio and reconstruct the original signal with high fidelity. The version released with AudioCraft improves on Meta's earlier model, modeling audio sequences more efficiently and producing music with fewer artifacts.
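As a codec, EnCodec can also be used on its own to compress and reconstruct a clip. The sketch below assumes the standalone encodec Python package; the 24 kHz model and 6 kbps target bandwidth are illustrative choices, and the input file name is hypothetical.

```python
# Minimal sketch of compressing and reconstructing audio with EnCodec
# (assumes `pip install encodec`; input file name is hypothetical).
import torch
import torchaudio
from encodec import EncodecModel
from encodec.utils import convert_audio

# Load the pretrained 24 kHz codec and pick a target bitrate in kbps.
model = EncodecModel.encodec_model_24khz()
model.set_target_bandwidth(6.0)

# Load an input clip and resample it to the codec's expected format.
wav, sr = torchaudio.load('input.wav')
wav = convert_audio(wav, sr, model.sample_rate, model.channels)
wav = wav.unsqueeze(0)  # add batch dimension

with torch.no_grad():
    encoded_frames = model.encode(wav)        # compressed discrete codes
    reconstructed = model.decode(encoded_frames)  # waveform rebuilt from codes
```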

Meta acknowledges that AudioCraft, like MusicGen before it, could be misused, for instance for audio deepfakes. However, the company does not impose strict restrictions on its usage, leaving it open to both positive and potentially negative applications.

The AudioCraft framework represents a significant advancement in generative AI for audio and music. While it holds promise as a source of inspiration and a tool for creative applications, there are also ethical and legal considerations to take into account. Meta emphasizes transparency and openness to empower users, while acknowledging that further development is needed to make these models useful for both amateurs and professionals in the music community.

Source: https://techcrunch.com/2023/08/02/meta-open-sources-models-for-generating-sounds-and-music/
