I ended up getting a friend to record the voice I needed, then converted the .wav files to raw PCM, turned those into constant C arrays, and included them in my code. I haven’t tried AMR yet - that codec should probably give better compression.
You have to be careful about how you output the data - you can’t just point the playback at the beginning of the array and let it go. I found that I had to handle the low-level interrupt myself and feed the decoder chunks of data. In the end it worked really well.
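For anyone trying the same thing, here is a rough sketch of the chunk-feeding idea in C. It is not the actual code from my project: `voice_clip`, `pcm_player_t`, `pcm_player_start` and `pcm_player_fill` are made-up names, the sample values are placeholders, and the interrupt or callback that asks for more data depends on your module's API, so it is only described in a comment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder samples; real data would come from the converted .wav file. */
static const int16_t voice_clip[] = {
    0, 1200, 2400, 1200, 0, -1200, -2400, -1200, 0, 600
};
#define VOICE_CLIP_LEN (sizeof voice_clip / sizeof voice_clip[0])

/* Tracks how far playback has progressed through a constant PCM array. */
typedef struct {
    const int16_t *data;     /* start of the PCM array              */
    size_t         length;   /* total number of samples             */
    size_t         position; /* index of the next sample to send    */
} pcm_player_t;

/* Point the player at the beginning of a clip. */
void pcm_player_start(pcm_player_t *p, const int16_t *data, size_t length)
{
    p->data = data;
    p->length = length;
    p->position = 0;
}

/*
 * Intended to be called from the (platform-specific, not shown) handler
 * that fires when the audio output wants more data. Copies at most
 * max_samples into out and returns the number copied; 0 means the clip
 * has been fully played.
 */
size_t pcm_player_fill(pcm_player_t *p, int16_t *out, size_t max_samples)
{
    size_t remaining = p->length - p->position;
    size_t n = remaining < max_samples ? remaining : max_samples;

    memcpy(out, p->data + p->position, n * sizeof *out);
    p->position += n;
    return n;
}
```

The handler then just calls `pcm_player_fill` with whatever buffer size the audio output expects and stops playback when it returns 0.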
I originally used the sample voices from http://www.cepstral.com - but the licensing fees to use them commercially were excessive. I haven’t looked for a while, so I don’t know what their fees are now. There are also a couple of open-source text-to-speech packages around, but the voices were limited and didn’t encode very well. Again, it’s been a few years since I did this, so they may have changed.
Anyway, I think the recording process is critical to getting a high-quality voice, mostly because of environmental noise and non-professional equipment (microphone, PC audio board, …). What was your result?
I heard SimCom (another manufacturer of wireless modules) has integrated text-to-speech technology directly in the module. They can convert a message to voice.
I don’t think this technology is that complex, but I couldn’t find the right solution for me.
I’ve started reading something similar. In the next few days I’ll try to play an audio stream.
Just to complete our discussion: even though the array in the example is declared as u16, the samples should be encoded as signed values.
In other words, they are 16-bit two’s complement: the zero value is 0x0000, the first positive number is 0x0001, and the first negative number is 0xFFFF.
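To make the encoding concrete, here is a small C sketch of converting between the raw u16 storage and the signed sample value. The function names `pcm_sample_from_u16` and `pcm_sample_to_u16` are only illustrative and are not part of any module API.

```c
#include <stdint.h>

/*
 * Interpret a raw 16-bit word as a signed two's complement sample.
 * Values 0x0000..0x7FFF map to 0..32767; 0x8000..0xFFFF map to
 * -32768..-1. The arithmetic is done in 32 bits so the result does not
 * rely on implementation-defined narrowing conversions.
 */
int16_t pcm_sample_from_u16(uint16_t raw)
{
    if (raw < 0x8000u) {
        return (int16_t)raw;
    }
    return (int16_t)((int32_t)raw - 65536);
}

/*
 * Store a signed sample in a u16 array element. Conversion to an
 * unsigned type is defined as reduction modulo 65536, which yields the
 * two's complement bit pattern (for example, -1 becomes 0xFFFF).
 */
uint16_t pcm_sample_to_u16(int16_t sample)
{
    return (uint16_t)sample;
}
```

So a silent clip is an array of 0x0000 words, not 0x8000 as it would be with unsigned (offset) PCM.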