How To Use ElevenLabs Text To Audio AI For Free

In this post, we’ll write a simple Python script that generates audio from text using ElevenLabs AI. We’ll access it through their API, which is available on a free plan. Paid plans offer additional features that the free plan doesn’t include.

For this tutorial, however, the free plan will do just fine. We’re going to select the ElevenLabs multilingual model, choose a voice, and give it some text to convert into audio. Once we have the audio, we’ll also save it locally as an mp3 file.

Setup

First of all, we need an API key from ElevenLabs, which requires creating an account on their official page. Once you’re registered, you can find the API key by clicking the Profile button in the submenu that opens from your profile image.

Clicking this button opens a modal with your API key hidden. Click the eyeball icon on the right to reveal it, then copy it somewhere safe. I copied my API key into a separate json file named auth.json inside my project directory, under the key elevenlabs.

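Next, we’re going to import all the necessary libraries into our Python script. The elevenlabs package can be installed with pip install elevenlabs. Note that the top-level functions imported below belong to the older interface of that package, so if the import fails on a recent release, you may need to install an older version.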

import os
import json
from elevenlabs import generate, play, set_api_key, voices, save

In the following snippet, we define a function that fetches our API key from that json file. We then pass the key to the set_api_key function to authenticate our connection with the API service.

ROOT = os.path.dirname(__file__)

def get_token(token_name):
    with open(os.path.join(ROOT, 'auth.json'), 'r') as auth_file:
        auth_data = json.load(auth_file)
        token = auth_data[token_name]
        return token

set_api_key(get_token('elevenlabs'))
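If you’d rather not keep the key in a file at all, one common alternative is to check an environment variable first and only fall back to the json file. The following is a minimal sketch of that idea, not part of the ElevenLabs package: the function name resolve_api_key and the variable name ELEVENLABS_API_KEY are assumptions made for this example, and the demo uses a throwaway file with placeholder values instead of a real key.

```python
import json
import os
import tempfile

# Hypothetical environment variable name, chosen for this example only.
ENV_VAR_NAME = "ELEVENLABS_API_KEY"


def resolve_api_key(auth_path, token_name, env=None):
    """Return the key from the environment if set, else from the json file."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR_NAME)
    if key:
        return key
    with open(auth_path, "r", encoding="utf-8") as auth_file:
        auth_data = json.load(auth_file)
    if token_name not in auth_data:
        raise KeyError(f"No '{token_name}' entry in {auth_path}")
    return auth_data[token_name]


# Demo with a throwaway file and placeholder values (not real keys).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "auth.json")
    with open(demo_path, "w", encoding="utf-8") as demo_file:
        json.dump({"elevenlabs": "key-from-file"}, demo_file)

    # No environment variable set: the value comes from the file.
    from_file = resolve_api_key(demo_path, "elevenlabs", env={})
    # Environment variable set: it takes priority over the file.
    from_env = resolve_api_key(
        demo_path, "elevenlabs", env={ENV_VAR_NAME: "key-from-env"}
    )

print(from_file)
print(from_env)
```

In the actual script, the value returned by a helper like this would be passed to set_api_key in the same way as the result of get_token. Whichever approach you pick, keep auth.json out of version control so the key doesn’t leak.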

Utilizing text to audio AI

Now we’re finally ready to start generating some audio. As mentioned before, we need to pass the text, a voice name, and a model to the generate function, which returns the audio data. We can then listen to the result right away by passing it to play.

audio = generate(
    text="Hello! My name is Glyph, nice to meet you!",
    voice="Emily",
    model='eleven_multilingual_v2'
)

play(audio)

Alright! Next, we’re going to save this audio data to a file in the project directory. The ElevenLabs Python package already comes with a save function that takes care of this, so we just need to call it with the audio data and the output file path.

output_file = os.path.join(ROOT, 'output.mp3')

save(audio, output_file)
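To get a feel for what saving involves, here is a rough sketch of writing audio data to disk by hand. It assumes the audio is a plain bytes object, which I believe is what generate returns when streaming isn’t used, so verify that against your installed version. The helper name save_bytes is made up for this example, and a few dummy bytes stand in for real mp3 data.

```python
import os
import tempfile


def save_bytes(audio_bytes, file_path):
    """Write raw audio bytes to file_path in binary mode and return the path."""
    with open(file_path, "wb") as audio_file:
        audio_file.write(audio_bytes)
    return file_path


# Dummy bytes standing in for real mp3 data returned by the API.
fake_audio = b"\x00\x01\x02\x03"

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = save_bytes(fake_audio, os.path.join(tmp_dir, "output.mp3"))
    written_size = os.path.getsize(out_path)

print(written_size)  # 4, the number of dummy bytes written
```

The important detail is the "wb" mode: audio is binary data, so opening the file in text mode would fail or corrupt the output.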

Different voice options

ElevenLabs gives us many voices to choose from, and if you go for one of their paid plans, you’ll also be able to make your own. In the following snippet, we list all available voices by calling the voices function and printing each voice’s name and labels, followed by the total number of voices.

available_voices = voices()
for voice in available_voices:
    v = dict(voice)
    print(f'Name: {v["name"]}, Info: {v["labels"]} \n')

print(len(available_voices))
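With a long list of voices, it can help to narrow it down by label. The sketch below shows one way to do that on dictionaries shaped like the ones printed above, with a name and a labels mapping. The sample data is invented for illustration and doesn’t come from the API, and the label keys gender and accent are assumptions about what the labels might contain.

```python
def filter_voices(voice_dicts, label_key, label_value):
    """Return the names of voices whose labels contain label_key == label_value."""
    matches = []
    for v in voice_dicts:
        # Some voices may have no labels at all, so fall back to an empty dict.
        labels = v.get("labels") or {}
        if labels.get(label_key) == label_value:
            matches.append(v["name"])
    return matches


# Invented sample data that mimics the name/labels shape; not real API output.
sample_voices = [
    {"name": "Voice A", "labels": {"gender": "female", "accent": "american"}},
    {"name": "Voice B", "labels": {"gender": "male", "accent": "british"}},
    {"name": "Voice C", "labels": None},
]

female_names = filter_voices(sample_voices, "gender", "female")
print(female_names)  # ['Voice A']
```

With real data, you would build the input list with something like [dict(voice) for voice in available_voices] and then pass one of the returned names as the voice argument of generate.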

Text to audio AI Python Script

Here is the entire code of this project.

import os
import json
from elevenlabs import generate, play, set_api_key, voices, save

ROOT = os.path.dirname(__file__)

def get_token(token_name):
    with open(os.path.join(ROOT, 'auth.json'), 'r') as auth_file:
        auth_data = json.load(auth_file)
        token = auth_data[token_name]
        return token

set_api_key(get_token('elevenlabs'))

audio = generate(
    text="Hello! My name is Glyph, nice to meet you!",
    voice="Emily",
    model='eleven_multilingual_v2'
)

play(audio)

output_file = os.path.join(ROOT, 'output.mp3')

save(audio, output_file)

# Different voice options
available_voices = voices()
for voice in available_voices:
    v = dict(voice)
    print(f'Name: {v["name"]}, Info: {v["labels"]} \n')

print(len(available_voices))

Conclusion

To conclude, we made a simple Python script that uses ElevenLabs text to audio AI via their API. I learned a lot while working on this project, and I hope it proves useful to you as well.