Want to Hear a Saxophone Bark Like a Dog? Nvidia's New AI Audio Generator Has You Covered

Nvidia wants you to know that your weirdest audio whims are now possible. The company’s latest AI project, joining its AI NPCs and in-game chatbot, is a text-to-audio model called Fugatto. Like other AI audio generators, it can create tracks from a simple description, but this program can also create “sounds never heard before,” such as a “saxophone howl,” whatever that means.

In a blog post, Nvidia claimed its “Swiss army knife for sound” AI model can modify existing sounds or create entire soundscapes out of whole cloth. Fugatto is actually an acronym for the obnoxiously long “Foundational Generative Audio Transformer Opus 1.” It’s capable of processing voices, music, and background noise and combining them into a single audio track.

It’s silly to call anything “a sound never heard before,” especially if it comes from AI. Whatever the output, the audio is merely an AI algorithm using existing sources in its training data to supply a result that approximates the prompt. Nvidia said its model is unique since it can combine instructions that were separate during training and “create soundscapes it’s never seen before.” This means it can overlay two distinct audio effects to create something new. In a video, Nvidia showed how it could generate the sound of a train that morphs into an orchestral score. It can also create the sound of a rainstorm that fades into the distance.

These are capabilities we haven’t seen before. Beyond a prompt to demo “electronic music with dogs barking in time to the beat,” Nvidia said its tool offers “fine-grained control” over the created soundscapes. Nvidia claims the narrator for the video was an AI version of Nvidia CEO Jensen Huang, though if Fugatto produced the obviously fake voice, the AI model needs more work before anybody uses it for their next deepfake project.

Plenty of AI audio tools already take text prompts and turn them into audio tracks. Adobe has shopped its own Project Music GenAI Control tool to unscrupulous musicians. Big tech companies like Meta have already promoted their audio models to the movie industry. Last month, Meta debuted Movie Gen, which can generate soundscapes for AI-generated films.

Nvidia quotes AI researcher Rohan Badlani, who said the model “made me feel a little bit like an artist,” though, of course, the AI draws from thousands of gigabytes’ worth of existing music and audio data. Nvidia did not share exact details about its dataset and only said it contains “millions of audio samples used for training.” The full version of Fugatto is a 2.5 billion-parameter model trained on banks of Nvidia’s own H100 AI GPUs.

It’s bad news for foley artists, who have made that kind of audio fakery into a renowned art form. The company said Fugatto could be a useful tool for ad agencies, video game developers, or musicians who want to try out changes to their work without much extra effort. Still, the other side of the coin is everyone who would use it to make “new assets,” AKA more AI slop for the growing pile.
