SoundGen - a text-to-speech generator


SoundGen is a PowerShell script which converts text into sound files. The output files are compatible with EdgeTX, OpenTX and Ethos.

SoundGen runs locally on your PC (it does not rely on external TTS services). Most of the code for SoundGen was generated using Claude AI from prompts by the author.

Requirements

PC with Windows 10 or 11
[Optional] FFmpeg

Interactive or batch versions

There are two versions of the script:

SoundGen Interactive: You type in the text, and the script generates a 16-bit, 16 kHz .wav file.
If FFmpeg is installed (optional), the script will offer to generate a second .wav with silences trimmed and loudness boosted by 4 dB.
SoundGen Batch: You give it a .csv file containing phrases and filenames. The script generates the .wav files with silences removed and loudness boosted by 4 dB. This version requires FFmpeg to be installed.
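
The trimming and boosting step can be pictured as a single FFmpeg call. The sketch below is in Python purely for illustration (SoundGen itself is PowerShell) and only builds the argument list. The filter chain, the -50 dB silence threshold, the mono output and the name build_ffmpeg_args are all assumptions, not the settings SoundGen actually uses.

```python
def build_ffmpeg_args(src, dst, boost_db=4, threshold_db=-50):
    """Return an FFmpeg argument list that trims silence and boosts loudness.

    Assumed filter chain (not taken from SoundGen itself):
      1. silenceremove strips leading silence,
      2. areverse + silenceremove + areverse strips trailing silence,
      3. volume applies the gain in dB.
    """
    trim = f"silenceremove=start_periods=1:start_threshold={threshold_db}dB"
    filters = ",".join([trim, "areverse", trim, "areverse", f"volume={boost_db}dB"])
    return [
        "ffmpeg", "-y",       # overwrite the output file if it exists
        "-i", src,
        "-af", filters,
        "-ar", "16000",       # 16 kHz sample rate
        "-ac", "1",           # mono (assumption)
        "-c:a", "pcm_s16le",  # 16-bit PCM
        dst,
    ]

args = build_ffmpeg_args("hello.wav", "hello_trimmed.wav")
print(" ".join(args))
```

To run the command rather than just print it, the list could be passed to subprocess.run(args, check=True) on a machine where FFmpeg is installed.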

Voices

The scripts can use any of the voices listed in the Windows Speech settings (the RC-Soar templates use Zira).

Installation

To install the interactive or batch script, right-click on the link (above) and save it as a text file. Then follow the instructions at the head of each file.

To install FFmpeg go here; the 'essentials' package is sufficient. Make sure that the folder containing the .exe files is in your PATH environment variable.
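
One quick way to confirm the PATH setting is to ask the system where it finds the executable. This is a minimal sketch in Python, used for illustration only; the helper name find_tool is hypothetical and not part of SoundGen.

```python
import shutil

def find_tool(name="ffmpeg"):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)

path = find_tool()
print(path if path else "ffmpeg not found on PATH")
```

If the check reports that FFmpeg is not found, open a new terminal window after editing PATH, since windows that are already open keep the old value.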

Example output .wav files

Click on a link to play:

Limitations

The scripts are reliable in normal use, but bear in mind that the code is AI generated and can fall over if you provoke it. Avoid non-speech characters such as $ and *. Also make sure that every row of the .csv file includes the comma separator.
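
A batch file can be checked for these two problems before it is handed to the script. The sketch below is in Python for illustration only and is not part of SoundGen. The column order (phrase first, filename second), the set of allowed characters and the name check_rows are assumptions.

```python
import csv
import io
import re

# Assumption: letters, digits, spaces and basic punctuation are safe to speak.
ALLOWED = re.compile(r"^[A-Za-z0-9 .,!?'-]+$")

def check_rows(csv_text):
    """Return a list of (line_number, problem) for rows that look risky."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    for number, row in enumerate(reader, start=1):
        if not row:
            continue  # ignore blank lines
        if len(row) != 2:
            problems.append((number, "expected two comma-separated fields"))
            continue
        phrase = row[0].strip()  # assumed layout: phrase,filename
        if not ALLOWED.match(phrase):
            problems.append((number, "phrase contains non-speech characters"))
    return problems

sample = (
    "don't forget to move the sticks,sticks.wav\n"
    "landing mode landing.wav\n"
    "cost is $5,cost.wav\n"
)
print(check_rows(sample))
```

In the sample, the second row is missing its comma and the third contains a $ sign, so both are reported. The first row passes.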

Screenshots

Interactive version

Example phrase is 'don't forget to move the sticks'.

Comparison of trimmed and untrimmed silences

Upper: base file, untrimmed (play)
Lower: with silences trimmed and loudness boosted by 4 dB (play)