This repository was archived by the owner on Nov 1, 2024. It is now read-only.

AudioGen - Implemented Audio Generation [DRAFT PR] #117

Open
wants to merge 28 commits into
base: main
28 commits
b130194
Audiocraft CLI Directory & ReadME
Nate8888 Oct 20, 2023
fa305d4
initial files
Nate8888 Oct 20, 2023
3239fb8
Adds package setup & main driver code
Nate8888 Oct 20, 2023
3abca99
Adds initial testing framework pytest
Nate8888 Oct 20, 2023
c8d6876
Test initial workflow for AudioGen
Nate8888 Oct 20, 2023
5a7e045
improved setup + added requirements
Nate8888 Oct 27, 2023
d9b6c48
Implements --description & --duration for audiogen
Nate8888 Oct 27, 2023
a6eeb30
tests the creation of the audio file with desc
Nate8888 Oct 27, 2023
091c0a4
test workflow with torch install before audiocraft
Nate8888 Oct 27, 2023
c314e4f
test workflow with different distribution of torch
Nate8888 Oct 27, 2023
6a0ce73
[workflow] - try raw torch, vision, audio
Nate8888 Oct 27, 2023
3c32fda
[workflow] - Try downgrading Python
Nate8888 Oct 27, 2023
dc2419c
[workflow] - downgrade to match audiocraft + index
Nate8888 Oct 27, 2023
15bf85e
[Workflow] adds triple verbose to pytest
Nate8888 Oct 27, 2023
3e90fce
tries self-hosted runner on Google Colab
Nate8888 Oct 27, 2023
fbed041
test only file creation
Nate8888 Oct 27, 2023
4e4f97c
Refactors code, changes argparse to @click, Docstr
Nate8888 Nov 3, 2023
3532104
Changes entry point
Nate8888 Nov 3, 2023
f4e3ca6
adds batch functionality with file input
Nate8888 Nov 3, 2023
10cc1be
Checks if file was created
Nate8888 Nov 3, 2023
730579b
linting + consistency
Nate8888 Nov 3, 2023
acb9b64
README instructions
Nate8888 Nov 3, 2023
a11e54d
Switch from labgraph_audiogen to lg_audiogen
Nate8888 Nov 17, 2023
1317661
Add versions + Improve descriptions
Nate8888 Nov 17, 2023
9d5bebf
Adds ffmpeg to fix workflow
Nate8888 Nov 17, 2023
ecb2d04
fix package name to lg_audiogen
Nate8888 Nov 17, 2023
94aee5c
Adds O.S Support on ReadME
Nate8888 Nov 17, 2023
d5e347a
Improve ReadME with samples + batch instructions
Nate8888 Nov 17, 2023
31 changes: 31 additions & 0 deletions .github/workflows/labgraph_audiogen.yml
@@ -0,0 +1,31 @@
name: AudioGen Tests

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v2

    - name: Setup Python
      uses: actions/setup-python@v2
      with:
        python-version: '3.8'

    - name: Install dependencies
      run: |
        cd extensions/lg_audiogen
        python -m pip install --upgrade pip
        sudo apt-get install ffmpeg
        pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
        pip install --pre xformers
        pip install -e .
        pip install pytest

    - name: Run tests
      run: |
        cd extensions/lg_audiogen
        pytest -vvv
68 changes: 68 additions & 0 deletions extensions/lg_audiogen/README.md
@@ -0,0 +1,68 @@
# Audiogen

Audiogen is a Python command-line tool that uses Audiocraft's AudioGen models to generate audio from text descriptions. It can generate a single clip from one description, or several clips from a batch file that lists multiple descriptions.

## Features

* Ability to specify duration of the generated audio.
* Ability to generate audio based on a batch file.
* Ability to specify the model to be used for the audio generation.
* Ability to set the output file name.

## Setup

Audiocraft needs Python 3.8 or higher to run. With a suitable version of Python installed, install Audiogen with pip from the `extensions/lg_audiogen` directory:

```shell
pip install -e .
```

## Usage

### Command-line interface

The CLI usage for Audiogen is `lg_audiogen [OPTIONS] [DESCRIPTION]...`.

### Options

* `description`: the text description to generate audio from.
* `--duration, -d`: duration of the generated audio in seconds, default is 5.
* `--model, -m`: name of the Audiocraft AudioGen model to use, default is 'facebook/audiogen-medium'.
* `--output, -o`: name of the output file.
* `--batch, -b`: path to a batch file of audio descriptions.

### Example

To generate audio, use one of the following commands:

```shell
lg_audiogen -d 5 -m 'facebook/audiogen-medium' -o 'my_output' 'dog barking'

lg_audiogen 'dog barking'

lg_audiogen -b 'batch.txt'
```
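
As a rough guide to where the results land: in the `main.py` added by this PR, each clip is written through `audio_write` under the stem `{output}{idx}`, and when `-o` is omitted the description with spaces replaced by underscores is used as the stem. The sketch below mirrors that naming rule. `predicted_filenames` is a hypothetical helper for illustration only, not part of the package, and the `.wav` suffix assumes `audio_write`'s default format.

```python
from typing import List, Optional


def predicted_filenames(descriptions: List[str], output: Optional[str] = None) -> List[str]:
    """Hypothetical helper: mirror the output naming rule used in main.py."""
    names = []
    for idx, description in enumerate(descriptions):
        # Without -o, the stem is the description with spaces replaced by underscores.
        stem = output if output else description.replace(' ', '_')
        # The clip index is appended directly; ".wav" assumes audio_write's default format.
        names.append(f"{stem}{idx}.wav")
    return names


print(predicted_filenames(["dog barking"]))
print(predicted_filenames(["dog barking"], output="my_output"))
```

Note that when `-o` is combined with a batch file, every clip shares the same stem and differs only by its index.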

### Batch File Format

The batch file contains one description per line, written the same way as a description passed on the command line.

Example:

*batch.txt*
```txt
Natural sounds of a rainforest
Bird Chirping in the background
```
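
A minimal sketch of how such a file can be turned into a list of descriptions is shown below. `read_batch_descriptions` is a hypothetical name, and skipping blank lines is a choice made in this sketch rather than something the format above guarantees.

```python
import tempfile
from pathlib import Path
from typing import List


def read_batch_descriptions(path: str) -> List[str]:
    """Hypothetical helper: one description per line, surrounding whitespace removed."""
    text = Path(path).read_text(encoding="utf-8")
    # Blank lines are skipped here (an assumption of this sketch).
    return [line.strip() for line in text.splitlines() if line.strip()]


# Demonstration with a temporary batch file holding the two example lines.
with tempfile.TemporaryDirectory() as tmp_dir:
    batch_path = Path(tmp_dir) / "batch.txt"
    batch_path.write_text(
        "Natural sounds of a rainforest\nBird Chirping in the background\n",
        encoding="utf-8",
    )
    print(read_batch_descriptions(str(batch_path)))
```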

### Samples

[Google Drive Folder](https://drive.google.com/drive/folders/1kdWB1CBog4NGVJ7jWddKLtBAuPm3gwDq?usp=drive_link)

## OS Support

Tested on Ubuntu 22.04 (Jammy) LTS.

## Error Handling

If the batch file is not found, an error message is shown. If no description is provided and no batch file is used, a usage error is raised.
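
The two failure modes can be sketched without the CLI machinery. In the sketch below, plain Python exceptions stand in for the errors the command-line tool reports, and `resolve_descriptions` is a hypothetical helper that is not part of the package.

```python
from pathlib import Path
from typing import List, Optional, Sequence


def resolve_descriptions(words: Sequence[str], batch: Optional[str] = None) -> List[str]:
    """Hypothetical helper: pick the descriptions to generate, or fail early."""
    if batch:
        batch_file = Path(batch)
        if not batch_file.is_file():
            # Stand-in for the "file not found" message of the CLI.
            raise FileNotFoundError(f"File {batch} not found. Please check the file path and try again.")
        return [line.strip() for line in batch_file.read_text(encoding="utf-8").splitlines()]
    if not words:
        # Stand-in for the usage error raised when no description is given.
        raise ValueError("Description argument is required when not using --batch.")
    # Positional words are joined into a single description.
    return [" ".join(words)]


for args in [(("dog", "barking"), None), ((), None), ((), "missing_batch_file.txt")]:
    try:
        print(resolve_descriptions(*args))
    except (FileNotFoundError, ValueError) as error:
        print(f"error: {error}")
```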
Empty file.
55 changes: 55 additions & 0 deletions extensions/lg_audiogen/lg_audiogen/main.py
@@ -0,0 +1,55 @@
import click
import torch
from audiocraft.models import AudioGen
from audiocraft.data.audio import audio_write

DEFAULT_AUDIOGEN_MODEL = 'facebook/audiogen-medium'
DEFAULT_AUDIO_DURATION = 5

@click.command()
@click.argument('description', nargs=-1, required=False)
@click.option('--duration', '-d', default=DEFAULT_AUDIO_DURATION, help='Duration of the generated audio.')
@click.option('--model', '-m', default=DEFAULT_AUDIOGEN_MODEL, help='Name of the Audiocraft AudioGen model to use.')
@click.option('--output', '-o', help='Name of the output file.')
@click.option('--batch', '-b', type=click.Path(), help='File name for batch audio description.')
def parse_arguments(description, duration, model, output, batch):
    """
    Generates audio from description using Audiocraft's AudioGen.
    """
    if batch:
        try:
            with open(batch, mode='r', encoding='utf-8') as f:
                descriptions = [line.strip() for line in f.readlines()]
        except FileNotFoundError:
            raise click.BadParameter(f"File {batch} not found. Please check the file path and try again.")
    else:
        if not description:
            raise click.BadParameter("Description argument is required when not using --batch.")
        descriptions = [' '.join(description)]
    run_audio_generation(descriptions, duration, model, output)

def run_audio_generation(descriptions, duration, model_name, output):
    """
    Load Audiocraft's AudioGen model and generate audio from the descriptions.

    @param descriptions: List of text descriptions to generate audio from.
    @param duration: Duration of the generated audio, in seconds.
    @param model_name: Name of the Audiocraft AudioGen model to use.
    @param output: Name of the output file.
    """
    print(f"Running lg_audiogen with descriptions: {descriptions}")

    # Load Audiocraft's AudioGen model and set generation params.
    model = AudioGen.get_pretrained(model_name)
    model.set_generation_params(duration=duration)

    # Generate audio from the descriptions
    wav = model.generate(descriptions)
    batch_output = output
    # Save the generated audios.
    for idx, one_wav in enumerate(wav):
        # Will save under {output}{idx}.wav, with loudness normalization at -14 dB LUFS.
        if not output:
            batch_output = descriptions[idx].replace(' ', '_')
        audio_write(f'{batch_output}{idx}', one_wav.cpu(),
                    model.sample_rate, strategy="loudness", loudness_compressor=True)
22 changes: 22 additions & 0 deletions extensions/lg_audiogen/setup.py
@@ -0,0 +1,22 @@
from setuptools import setup, find_packages

setup(
    name='lg_audiogen',
    version='0.1',
    description="A Command-line interface to use Audiocraft for labgraph",
    long_description="""
    A Command-line interface to facilitate the usage of Audiocraft's models
    to generate and process audio on labgraph
    """,
    packages=find_packages(),
    install_requires=[
        "Click>=8.1.7",
        "torch>=2.1.0",
        "torchaudio>=2.1.0",
        "audiocraft==1.1.0",
    ],
    entry_points='''
        [console_scripts]
        lg_audiogen=lg_audiogen.main:parse_arguments
    ''',
)
13 changes: 13 additions & 0 deletions extensions/lg_audiogen/tests/test_main.py
@@ -0,0 +1,13 @@
import os
import subprocess

def test_single_description():
    '''
    Tests output with a single description
    '''
    # Run the script with an example description
    subprocess.run(["lg_audiogen", "dog barking"],
                   capture_output=True, text=True, check=False)
    # Assert that the output file was created
    assert os.path.exists("dog_barking0.wav"), "Output file dog_barking0.wav was not created"
    os.remove("dog_barking0.wav")