For my AI-generated Japan Daily News podcast I use ElevenLabs and have been amazed at the quality of the voice output. It may not be the cheapest text-to-speech API but in my testing it seems to be the best (at the time of writing).
Firstly, you don’t need an API key to get started. Just start coding and playing, and eventually you’ll get a message from the API that you need to sign up to continue. By then, you should have become familiar with how it works.
Making Python talk
The Python package provided by ElevenLabs is easy to use, so let’s start there. Install the elevenlabs package and then import it in a new Python file.
$ pip install elevenlabs
import elevenlabs
Speech creation is done in two parts:
- Generate the audio
- Play (or save) the audio
The elevenlabs module contains a generate() function which takes at least two self-explanatory arguments: text and voice.
audio = elevenlabs.generate(
    text = "Hi, I'm from the future!",
    voice = "Bella"
)
The voice argument can be a voice name, voice ID, or voice object. (Jump below to see premade ElevenLabs voice IDs and a video of voice samples.)
To hear the result, pass the generated audio to the play() function:
elevenlabs.play(audio)
If you want to output this audio as an MP3 file, use the save() function:
elevenlabs.save(audio, "audio.mp3")
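If you’d rather handle the file yourself, here’s a minimal sketch of the same idea. It assumes generate() handed back a plain bytes object, which may not hold for every version of the package or for streaming output, so the helper checks the type first. The name save_bytes is my own and is not part of the elevenlabs package.

```python
from pathlib import Path


def save_bytes(audio, filename):
    """Write raw audio bytes to a file and return the number of bytes written.

    Hypothetical helper: assumes `audio` is bytes-like, not a stream/iterator.
    """
    if not isinstance(audio, (bytes, bytearray)):
        raise TypeError("expected bytes; streaming output needs different handling")
    return Path(filename).write_bytes(audio)


# Usage, once `audio` has been generated:
# save_bytes(audio, "audio.mp3")
```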
I love that it’s so logical!
Tweaking the voice settings
* Note: The VoiceSettings class doesn’t seem to work in some software such as PyCharm. If you run into that, pass just the voice name or voice ID instead.
Now it’s time to take it up a notch by controlling how expressive the voice is. Although there are two settings available, stability and similarity_boost, I’ve found that only stability really makes a difference to how the output sounds. In any case, you can’t set only one value so here’s how to set both of them.
First, we need to create a voice object, specifying the voice ID and the two setting values we want to use. The range for both settings is 0 to 1. Having a stability setting of 1 makes the output sound quite boring, whereas a setting of 0 makes the speaker sound excited and emotional. Usually somewhere near the default of 0.5 is best but it’s fun to play with the extremes!
voice = elevenlabs.Voice(
    voice_id = "ZQe5CZNOzWyzPSCn5a3c",
    settings = elevenlabs.VoiceSettings(
        stability = 0.3, # Lower is more expressive.
        similarity_boost = 0.75
    )
)
Once you have your voice object, pass it to generate() as the voice argument, then play or save the audio with the same syntax as before. Here’s the working code in full.
import elevenlabs

voice = elevenlabs.Voice(
    voice_id = "ZQe5CZNOzWyzPSCn5a3c",
    settings = elevenlabs.VoiceSettings(
        stability = 0,
        similarity_boost = 0.75
    )
)

audio = elevenlabs.generate(
    text = "Hi, I'm from the future!",
    voice = voice
)

elevenlabs.play(audio)
elevenlabs.save(audio, "audio.mp3")
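To hear how much stability changes the delivery, it helps to render the same sentence at a few values and compare the files. Here’s a rough sketch that does that with the same Voice, VoiceSettings, generate() and save() calls used above. The function names sweep_filenames and render_sweep, and the file naming scheme, are my own inventions for illustration. Each value triggers a separate API request, so it uses up quota.

```python
def sweep_filenames(stabilities, prefix="stability"):
    """Pair each stability value with an output filename (hypothetical helper)."""
    pairs = []
    for value in stabilities:
        # Both settings are documented above as ranging from 0 to 1.
        if not 0 <= value <= 1:
            raise ValueError(f"stability must be between 0 and 1, got {value}")
        pairs.append((value, f"{prefix}_{value:.2f}.mp3"))
    return pairs


def render_sweep(text, voice_id, stabilities, similarity_boost=0.75):
    """Generate and save one MP3 per stability value (calls the ElevenLabs API)."""
    import elevenlabs  # imported lazily so sweep_filenames works without the package

    for value, filename in sweep_filenames(stabilities):
        voice = elevenlabs.Voice(
            voice_id=voice_id,
            settings=elevenlabs.VoiceSettings(
                stability=value,
                similarity_boost=similarity_boost,
            ),
        )
        audio = elevenlabs.generate(text=text, voice=voice)
        elevenlabs.save(audio, filename)


# Example (makes three API requests):
# render_sweep("Hi, I'm from the future!", "ZQe5CZNOzWyzPSCn5a3c", [0, 0.5, 1])
```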
Adding your API key
Finally, once you hit the trial limits you can add your ElevenLabs API key with the built-in function like this:
elevenlabs.set_api_key("my-api-key")
I recommend you use environment variables to hide your API key instead of putting it directly in your code.
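Here’s a minimal sketch of the environment variable approach. The variable name ELEVEN_API_KEY and the helper load_api_key are my own choices for this example, so use whatever name suits your setup. Set the variable in your shell first, then read it at runtime and pass it to set_api_key().

```python
import os


def load_api_key(env=os.environ, name="ELEVEN_API_KEY"):
    """Return the API key stored in an environment variable.

    Hypothetical helper: the variable name is an assumption, not a requirement.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


# Usage:
# import elevenlabs
# elevenlabs.set_api_key(load_api_key())
```

Failing loudly when the variable is missing makes the problem obvious straight away, rather than surfacing later as a confusing API error.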
It’s really fun to play with this technology and easier than I expected, so give it a go and see the ElevenLabs Python project for more documentation and examples.
Reference: ElevenLabs Voice IDs & Samples
- Adam: pNInz6obpgDQGcFmaJgB
- Antoni: ErXwobaYiN019PkySvjV
- Arnold: VR6AewLTigWG4xSOukaG
- Bella: EXAVITQu4vr4xnSDxMaL
- Callum: N2lVS1w4EtoT3dr4eOWO
- Charlie: IKne3meq5aSn9XLyUdCD
- Charlotte: XB0fDUnXU5powFXDhCwa
- Clyde: 2EiwWnXFnvU5JabPnv8n
- Daniel: onwK4e9ZLuTAKqWW03F9
- Dave: CYw3kZ02Hs0563khs1Fj
- Domi: AZnzlk1XvdvUeBnXmlld
- Dorothy: ThT5KcBeYPX3keUQqHPh
- Elli: MF3mGyEYCl7XYWbV9V6O
- Emily: LcfcDJNUP1GQjkzn1xUU
- Ethan: g5CIjZEefAph4nQFvHAz
- Fin: D38z5RcWu1voky8WS1ja
- Freya: jsCqWAovK2LkecY7zXl4
- Gigi: jBpfuIE2acCO8z3wKNLl
- Giovanni: zcAOhNBS3c14rBihAFp1
- Glinda: z9fAnlkpzviPz146aGWa
- Grace: oWAxZDx7w5VEj9dCyTzz
- Harry: SOYHLrjzK2X1ezoPC6cr
- James: ZQe5CZNOzWyzPSCn5a3c
- Jeremy: bVMeCyTHy58xNoL34h3p
- Jessie: t0jbNlBVZ17f02VDIeMI
- Joseph: Zlb1dXrM653N07WRdFW3
- Josh: TxGEqnHWrfWFTfGW9XjX
- Liam: TX3LPaxmHKxFdv7VOQHJ
- Matilda: XrExE9yKIg1WjnnlVkGX
- Matthew: Yko7PKHZNXotIFUBG7I9
- Michael: flq6f7yk4E4fJM5XTYuZ
- Mimi: zrHiDhphv9ZnVXBqCLjz
- Nicole: piTKgcLEGmPE4e6mEKli
- Patrick: ODq5zmih8GrVes37Dizd
- Rachel: 21m00Tcm4TlvDq8ikWAM
- Ryan: wViXBPUzp2ZZixB1xQuM
- Sam: yoZ06aMxZJJ28mfd3POQ
- Serena: pMsXgVXv3BLzUgSXRplE
- Thomas: GBv7mTt0atIp3Br8iCZE
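If you want to refer to voices by name in your own code but pass the ID, a small lookup table does the job. This sketch copies just three entries from the list above; extend the dictionary with whichever voices you use. The names VOICE_IDS and voice_id_for are hypothetical and not part of the elevenlabs package.

```python
# A few name-to-ID pairs copied from the reference list above.
VOICE_IDS = {
    "Bella": "EXAVITQu4vr4xnSDxMaL",
    "James": "ZQe5CZNOzWyzPSCn5a3c",
    "Rachel": "21m00Tcm4TlvDq8ikWAM",
}


def voice_id_for(name):
    """Look up a voice ID by name, ignoring case and surrounding spaces."""
    try:
        return VOICE_IDS[name.strip().title()]
    except KeyError:
        raise KeyError(f"Unknown voice name: {name!r}") from None


# Usage:
# voice_id_for("bella") gives the ID to pass as voice_id when building a Voice object.
```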