The AI Book
    AI Language processing (NLP)

    Create music using Meta’s MusicGen on Colab

    9 July 2023


    Introduction

    In the vast field of artificial intelligence, deep learning has revolutionized many areas, including natural language processing, computer vision, and speech recognition. One area that has fascinated researchers and music enthusiasts alike is the generation of music with AI algorithms. MusicGen is a state-of-the-art controllable text-to-music model that transforms text prompts into engaging musical compositions.

    What is MusicGen?

    MusicGen is a model designed for music generation that offers both simplicity and control. Unlike existing methods such as MusicLM, MusicGen eliminates the need for a self-supervised semantic representation. The model uses a single-stage autoregressive transformer architecture and is trained with a 32 kHz EnCodec tokenizer. Notably, MusicGen generates all four codebooks in a single pass, which sets it apart from conventional approaches. By introducing a small delay between codebooks, the model can predict them in parallel, resulting in only 50 autoregressive steps per second of audio. This approach optimizes the efficiency and speed of music generation.
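    The delayed-codebook idea can be illustrated with a small sketch. This is not the audiocraft implementation, just a toy schedule builder (the function name and `None`-padding convention are mine): codebook k is shifted right by k steps, so at decoding step t the model predicts position t of codebook 0, t-1 of codebook 1, and so on, letting all four codebooks be decoded in a single autoregressive pass.

```python
def delay_pattern(num_codebooks, num_steps):
    """Toy sketch of a MusicGen-style delay pattern.

    Returns one row per decoding step; row[k] is the frame position
    predicted for codebook k at that step (None = padding)."""
    schedule = []
    for t in range(num_steps + num_codebooks - 1):
        row = []
        for k in range(num_codebooks):
            pos = t - k  # codebook k lags k steps behind codebook 0
            row.append(pos if 0 <= pos < num_steps else None)
        schedule.append(row)
    return schedule

# With 4 codebooks, decoding step 3 covers frame 3 of codebook 0,
# frame 2 of codebook 1, frame 1 of codebook 2, frame 0 of codebook 3:
print(delay_pattern(4, 5)[3])  # [3, 2, 1, 0]
```

    The delay only adds num_codebooks - 1 extra steps at the end, which is why the model still needs roughly one decoding step per 20 ms audio frame rather than four.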

    MusicGen was trained on over 20,000 hours of licensed music, including an in-house dataset of 10K high-quality music tracks as well as ShutterStock and Pond5 music data.

    Prerequisites:

    According to the official MusicGen GitHub repo (https://github.com/facebookresearch/audiocraft/tree/main), you will need:

    • GPU with at least 16 GB of memory
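    Before loading a model, it can be worth confirming that the Colab runtime actually has a GPU of that size. Below is a minimal sketch (the `check_gpu` helper is mine, not part of audiocraft); it assumes torch is installed and falls back gracefully otherwise.

```python
import importlib.util

def check_gpu(min_gb=16):
    """Return a short status string about the available GPU memory."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch
    if not torch.cuda.is_available():
        return "no CUDA GPU detected; CPU inference will be very slow"
    props = torch.cuda.get_device_properties(0)
    mem_gb = props.total_memory / 1024 ** 3
    if mem_gb < min_gb:
        return f"{props.name}: {mem_gb:.1f} GB (may be too small for the large model)"
    return f"{props.name}: {mem_gb:.1f} GB (OK)"

print(check_gpu())
```

    If the report shows less than 16 GB, consider the smaller models described below.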

    Available MusicGen models

    There are four pretrained models available:

    • Small: 300M parameters, text-to-music only
    • Medium: 1.5B parameters, text-to-music only
    • Melody: 1.5B parameters, text-to-music and text+melody-to-music
    • Large: 3.3B parameters, text-to-music only
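    The list above can be captured as a small lookup table to guide the choice of checkpoint. The table and the `pick_model` helper are my own convenience sketch; only the model names themselves (`small`, `medium`, `melody`, `large`) are what `get_pretrained` actually accepts.

```python
# Model names match what musicgen.MusicGen.get_pretrained() accepts.
MUSICGEN_MODELS = {
    "small":  {"params": "300M", "melody_conditioning": False},
    "medium": {"params": "1.5B", "melody_conditioning": False},
    "melody": {"params": "1.5B", "melody_conditioning": True},
    "large":  {"params": "3.3B", "melody_conditioning": False},
}

def pick_model(need_melody=False, low_memory=False):
    """Simple rule of thumb: melody conditioning forces 'melody';
    otherwise trade quality ('large') against memory ('small')."""
    if need_melody:
        return "melody"
    return "small" if low_memory else "large"

print(pick_model(need_melody=True))  # melody
```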

    Experiments

    Below is the conditional music generation output using the MusicGen large model.

    Text Input: Jingle bell tune with violin and piano
    Output: (Using MusicGen "large" model)

    Below is the output of the MusicGen “melody” model. We used the above audio and text input to create the following audio.

    Text Input: Add heavy drums and only drums
    Output: (Using MusicGen "melody" model)

    How to install MusicGen on Colab

    Make sure you use a GPU for faster inference. It took ~9 minutes to generate 10 seconds of audio on a CPU, but only 35 seconds on a GPU (T4).

    • Before starting, make sure that torch and torchaudio are installed in Colab.

    • Install the Audiocraft library from Facebook:

    !python3 -m pip install -U git+https://github.com/facebookresearch/audiocraft#egg=audiocraft

    • Import the required libraries:

    from audiocraft.models import musicgen
    from audiocraft.utils.notebook import display_audio
    from audiocraft.data.audio import audio_write
    import torch

    • Load the model. The available models are:

    # | model types => small, medium, melody, large |
    # | model sizes => 300M, 1.5B, 1.5B, 3.3B |
    model = musicgen.MusicGen.get_pretrained('large', device='cuda')

    • Set the generation parameters (optional):

    model.set_generation_params(duration=60)  # this will generate 60 seconds of audio

    • Conditional music generation (generate music from a text prompt):

    res = model.generate(['Jingle bell tune with violin and piano'], progress=True)
    display_audio(res, 32000)  # this will show the audio controls in Colab

    • Unconditional music generation:

    res = model.generate_unconditional(num_samples=1, progress=True)
    display_audio(res, 32000)  # this will show the audio controls in Colab
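    The duration parameter maps directly onto autoregressive work: at 50 decoding steps per second, duration=60 means roughly 3,000 steps, which is why longer clips take proportionally longer to generate. A small back-of-the-envelope sketch (the function and the assumption that the delay pattern adds num_codebooks - 1 trailing steps are mine):

```python
FRAME_RATE = 50  # MusicGen's EnCodec tokenizer produces 50 frames/sec at 32 kHz

def num_decode_steps(duration_s, num_codebooks=4, delay=True):
    """Estimate autoregressive decoding steps for a clip of duration_s seconds."""
    steps = int(duration_s * FRAME_RATE)
    if delay:
        # the codebook delay pattern adds a short tail of extra steps
        steps += num_codebooks - 1
    return steps

print(num_decode_steps(60))  # 3003
```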
    Generating a continuation of the music

    To create a continuation of the music, we will need an audio file. We will feed this file to the model and the model will generate and add more music to it.

    from audiocraft.utils.notebook import display_audio
    import torchaudio

    path_to_audio = "path-to-audio-file.wav"
    description = "Jazz jazz and only jazz"

    # Load audio from a file. Make sure to trim the file if it is too long!
    prompt_waveform, prompt_sr = torchaudio.load(path_to_audio)
    prompt_duration = 15  # keep only the first 15 seconds as the prompt
    prompt_waveform = prompt_waveform[..., :int(prompt_duration * prompt_sr)]

    output = model.generate_continuation(
        prompt_waveform,
        prompt_sample_rate=prompt_sr,
        descriptions=[description],
        progress=True,
    )
    display_audio(output, sample_rate=32000)
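    The slicing step above is simple sample arithmetic: a prompt of N seconds at sample rate sr keeps the first N * sr waveform samples. A tiny sketch of that calculation (the helper name is mine):

```python
SR = 32000  # MusicGen works at 32 kHz

def prompt_samples(duration_s, sr=SR):
    """Number of waveform samples to keep for a prompt of duration_s seconds."""
    return int(duration_s * sr)

print(prompt_samples(15))  # 480000
```

    So the 15-second prompt above keeps 480,000 samples per channel; anything beyond that index is discarded before the model sees it.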
    Melody-conditional generation
    model = musicgen.MusicGen.get_pretrained('melody', device='cuda')
    model.set_generation_params(duration=20)

    melody_waveform, sr = torchaudio.load("path-to-audio-file.wav")
    # batch the same melody twice to generate two variations
    melody_waveform = melody_waveform.unsqueeze(0).repeat(2, 1, 1)
    output = model.generate_with_chroma(
        descriptions=['Add heavy drums'] * 2,  # one description per melody in the batch
        melody_wavs=melody_waveform,
        melody_sample_rate=sr,
        progress=True,
    )
    display_audio(output, sample_rate=32000)
    Writing the audio file to disk

    If you want to download a file from Colab, you will need to write the wav file to disk first. Here is a function that does so. It takes the output of the model as the first argument and a filename prefix as the second.

    def write_wav(output, file_initials):
        try:
            for idx, one_wav in enumerate(output):
                # writes e.g. "audio-file_0.wav", "audio-file_1.wav", ...
                audio_write(f'{file_initials}_{idx}', one_wav.cpu(), model.sample_rate,
                            strategy="loudness", loudness_compressor=True)
            return True
        except Exception as e:
            print("error while writing the file", e)
            return None


    # this will write files whose names start with "audio-file"
    write_wav(res, "audio-file")
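    To see which filenames to expect, the naming scheme can be sketched on its own. This assumes audio_write appends a ".wav" extension to the stem it is given, which I believe is its default behavior; the helper itself is mine, for illustration only.

```python
def output_filenames(file_initials, n):
    """Filenames write_wav would produce for n generated clips,
    assuming audio_write appends '.wav' to each stem."""
    return [f"{file_initials}_{idx}.wav" for idx in range(n)]

print(output_filenames("audio-file", 2))  # ['audio-file_0.wav', 'audio-file_1.wav']
```

    These files then appear in the Colab file browser, from which they can be downloaded.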
    Full implementation (Google colab file link)

    A complete implementation of Meta’s MusicGen library by Pragnakalp Techlabs is provided in the Colab file. Feel free to explore it and create music with it.
    Pragnakalp Techlabs | Meta’s MusicGen implementation

    Conclusion

    In conclusion, Audiocraft’s MusicGen is a powerful and controllable music generation model. Looking ahead, Audiocraft has exciting future potential for advancements in AI-generated music. Whether you’re a musician or an AI enthusiast, Audiocraft’s MusicGen opens up a world of creative possibilities.
