The Potential of ChatGPT for Personalized Meditations
3 Lessons Learned Prototyping an OpenAI Meditation Experience
With the recent announcements around the ChatGPT and Whisper APIs, allowing developers to integrate OpenAI models into their applications, I wanted to explore how these services could be used.
In my ChatGPT Is My Meditation Teacher article, I played around with the idea of incorporating generative AI scripts to personalize a meditation experience. This week, I brought that idea to life to simulate what a ChatGPT-powered meditation app would look like.
To start, I trained ChatGPT on the parameters for designing a meditation experience and fed it some examples of meditation scripts. I asked it to incorporate three inputted variables in the meditation script: Name, Mood, and Happy Place.
At first request, it only incorporated two of the three.
Upon a second request, it captured the three inputs and integrated them into the meditation script nicely.
To bring this to life, the next step was finding an AI text-to-speech tool which would provide a natural, relaxing voice to guide this meditation. There were a number of tools I tried, which all had seemingly different benefits and limitations, and I landed on Murf.ai.
Murf had a wide selection of different voices, and the interface made it easy to edit the script and incorporate different elements into the production such as background music and images/video.
One big first observation is that to capture the essence of a meditation, the speed of the voice, the pacing, and the incorporation of pauses in the script are really important. And voice-to-text by default does not factor in those elements into the reading.
By default, it does not sound like a meditation. It sounds like a pharmaceutical commercial where they are reading the side effects of a drug.
So I needed to adjust the voice speed and manually incorporate pauses within the syntax. It still doesn’t sound as perfect and polished as a trained human, but it’s certainly closer.
I finalized it by incorporating a Dall-E image of a “Big Sur Sunrise” as the thumbnail.
Here is the final result:
These are my top 3 takeaways:
I still have a tough time trusting ChatGPT. Trust in ChatGPT requires a quality review of outputs. ChatGPT requires a quality review of outputs, so I’m not sure I’d fully trust an unsupervised output to return in an application based on the API. Whether it’s ChatGPT ignoring some of the input parameters or potentially incorporating bias into the response, I’d be concerned having fully generative output without the opportunity to do some quality control first.
AI Voices Don’t Quite Sound Human. Text-to-audio sound really close, but the technology has a way to go to create a truly naturalistic output. While the voices sound pretty solid, the tone of voice, pacing, and emotion don’t quite come through in the delivery compared to a voice actor.
Personalized Meditation On-Demand Is Quite Possible. Despite the shortcomings above, it’s incredible that in a matter of moments, a meditation audio can be generated that incorporates personal details that can help more deeply connect to the meditation. I really enjoy the concept of incorporating personal details into a meditation, and being able to customize certain variables such as voice and music.
What would your custom meditation look like?