#TOOLKIT 014 - Text to Video Convertors - AI-powered!
https://www.linkedin.com/pulse/toolkit-text-video-convertors-ai-powered-boni-aditya-exrvc/
November 10, 2023
From Recorded Audio to Text to Speech and finally to Text to Video - JOBS TO BE DONE!
Back in 2015, I discovered audiobooks. I was at the mercy of recording studios and could only listen to titles recorded by labels like Simon and Schuster. In 2018 I discovered TTS (text-to-speech) readers for the first time. The voices were robotic, but I converted a lot of books into audio anyway: I had the freedom to listen to any book I wanted, and that is how I hit the 300-book mark.
But there is only so much you can listen to!
You can't listen to a math book and learn math. You are restricted to domains that work by ear, like psychology, economics, and the behavioral sciences.
Then came Speechify, which I use now. Speechify lets me watch the book as it is read, highlighting the current sentence as it is spoken.
Speechify allows me to read all genres of books.
Seen through JTBD (Jobs To Be Done), my job to be done, acquiring more knowledge, has been met by one product after another.
Storytelling - Make it easy to listen. The very first "audiobooks" I listened to were the Ramayan and then the Mahabharat; a master storyteller recited them to me when I was young.
Audiobooks - The very same job, i.e. learning, was done better by audiobooks than by reading, but audiobooks had to be recorded in a studio, and only very few books were ever converted into audiobooks.
Text-to-Speech Readers - The very same job, i.e. reading books, is done better by text-to-speech readers, since any book can be converted, but the voices were robotic and there were no visual aids; you could only listen.
Speechify - Speechify is powered by AI/ML neural-network voices. Not only are the voices human-like, but I can also see the book while it is read. The job of reading a book is now solved better: I can listen to any book while the text is highlighted.
But the job is still not done! The job is to find the easiest way to absorb the content in a book and the best way to do that is through video.
Here is the order in which content is easiest to consume. The difficulty level is given next to each content type.
[Ease of Consumption]
Video (1) > PPT (10) > GIF (100) > Picture (1000) > Audio (10000) > Text (1000000)
On the other hand, it is extremely costly to make a video. The difficulty of creating content is inversely related to how easy that content is to consume.
[Difficulty of Creation]
Video (1000000) > PPT (10000) > GIF (1000) > Picture (100) > Audio (10) > Text (1)
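The two scales can be checked side by side. The short Python sketch below takes the scores exactly as listed in the two lists above, ranks the formats on each scale, and confirms that the two orderings mirror each other. The scores are rough orders of magnitude rather than measurements, and the names `consumption`, `creation`, and `rank` are my own, chosen only for this illustration.

```python
# Difficulty scores copied from the two lists above (lower = easier).
consumption = {"Video": 1, "PPT": 10, "GIF": 100,
               "Picture": 1000, "Audio": 10000, "Text": 1000000}
creation = {"Video": 1000000, "PPT": 10000, "GIF": 1000,
            "Picture": 100, "Audio": 10, "Text": 1}


def rank(scores):
    """Return format names sorted from easiest (lowest score) to hardest."""
    return sorted(scores, key=scores.get)


easiest_to_consume = rank(consumption)
easiest_to_create = rank(creation)

# Reversing one ranking should reproduce the other if the scales mirror each other.
orders_are_reversed = easiest_to_consume == easiest_to_create[::-1]

print("Easiest to consume:", easiest_to_consume)
print("Easiest to create: ", easiest_to_create)
print("Mirror images:", orders_are_reversed)
```

The format that is easiest to consume (video) is the hardest to create, and the format that is easiest to create (text) is the hardest to consume. That gap is what text-to-video tools try to close.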
Text to Video - In an ever-increasing attempt to make it easier to consume content, the last step is to convert the text directly into video. Often this is done manually in a studio, through video recording, which is an extremely costly affair.
To understand JOBS TO BE DONE better -
https://www.linkedin.com/pulse/pm-lesson-24-jump-from-idea-driven-needs-innovation-boni-aditya
https://www.linkedin.com/pulse/pm-lesson-25-jtbd-meticulously-mapping-jobs-done-boni-aditya
https://www.linkedin.com/pulse/pm-lesson-26-use-jtbd-needs-framework-derive-all-desired-aditya
Now we have AI/ML solutions built on diffusion models that can generate video directly from text. NVIDIA and Adobe have already created tools that can convert text into video. If I can provide the text from a book and get video in real time, that would almost completely solve the problem of reading a book and absorbing its content.
TEXT TO VIDEO TOOLS
There are many tools in the market that allow you to create video from text. Each tool takes its own path. Some tools search stock images for the one that semantically matches each piece of text and assemble a PowerPoint-style video. Other tools use an AI-rendered presenter that lip-syncs and reads out the paragraph for you. Yet other tools try to actually visualize the concept, for example by generating a monkey riding on a horse. Whatever approach these text-to-video generators take, one cannot ignore the fact that the way we consume video is about to change. The way video is generated is also about to change. Right now the price is at a premium, but as the number of users increases, it should come down.
Here is a list of some of the better text-to-video convertors I have used so far.
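To make the stock-image approach concrete, here is a toy Python sketch of the idea. It is not how any of the tools below actually work: real products presumably use semantic embeddings and large licensed libraries, whereas this sketch uses plain keyword overlap. The image names, tags, and function names (`STOCK_IMAGES`, `pick_image`, `build_slides`) are all hypothetical.

```python
import re

# Hypothetical stock library: image file name -> descriptive tags.
STOCK_IMAGES = {
    "monkey_horse.jpg": {"monkey", "horse", "riding"},
    "library.jpg": {"book", "books", "reading", "library"},
    "headphones.jpg": {"audio", "audiobook", "listening", "headphones"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def pick_image(sentence, library=STOCK_IMAGES):
    """Return the image whose tags share the most words with the sentence.

    Keyword overlap stands in for real semantic matching.
    Returns None when no tag matches at all.
    """
    words = tokenize(sentence)
    best_name, best_overlap = None, 0
    for name, tags in library.items():
        overlap = len(words & tags)
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name


def build_slides(text):
    """Split text into sentences and pair each one with its best image."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(s, pick_image(s)) for s in sentences]


slides = build_slides("A monkey is riding a horse. I enjoy listening to an audiobook.")
for sentence, image in slides:
    print(image, "<-", sentence)
```

Each (sentence, image) pair would become one slide, with the sentence narrated by a text-to-speech voice. Stitching the slides into a video file is left out of the sketch.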
InVideo AI
InVideo AI provides the following features.
Steve AI
https://app.steve.ai/
Steve AI provides an interesting toolkit for converting text to video or animation. The resulting videos are interesting too.
Inksprout
https://inksprout.co
Inksprout tackles an interesting niche: it first summarizes your blog post and then converts that summary into a video, with the text overlaid on the footage. This video format is made specifically for short-form consumption on social media.
Deep Brain AI Studios
https://www.deepbrain.io/features/text-to-video
AI avatars read out the text for you like a news reader.
Here are a few other AI Presenters
HourOne
https://hourone.ai
FlexClip
https://www.flexclip.com/tools/ai-text-to-video/
Elai
The best AI video generators at a glance
DESCRIPT
https://www.descript.com
Best for - Editing video by editing the script
Platforms - Windows, Mac (Web for some features)
Free plan - Yes, with 1 hour of transcription and 1 watermark-free video at 720p
WONDERSHARE FILMORA
https://filmora.wondershare.com/index-a.html
Best for - Polishing video with AI tools
Platforms - Windows, Mac, iOS, Android
Free plan - Yes, with watermark
PEECH
Best for - Content marketing teams
Platforms - Web (prefers Chrome)
Free plan - Yes, for 1 user, 2 videos per month, 5-minute upload limit, and watermark
SYNTHESIA
Best for - Using digital avatars
Platforms - Web
Free plan - No
FLIKI
Best for - Social media videos
Platforms - Web
Free plan - Yes, up to 5 minutes/month, watermarked video in 720p
VISLA
Best for - Turning a script into a video
Platforms - Web
Free plan - Yes, up to 50 minutes of video, 3 hours of transcription, and 10GB storage
OPUS CLIP
Best for - Repurposing long-form to short-form video
Platforms - Web
Free plan - Yes, 60 minutes/month with limited feature access and watermarks
These tools are wrappers built for ready-made use.
The underlying engines are a different matter.
Generative AI/ML engines actually create videos that did not exist anywhere before.
Genmo AI
https://www.genmo.ai/create/video
Genmo allows you to create both videos and animations, with simple frame-by-frame descriptions.
RUNWAY MAGIC TOOLS
https://runwayml.com/ai-magic-tools/#top
Best for - Experimenting with generative AI
Platforms - Web
Free plan - Yes, with 125 video credits (used for AI features), 3 projects, and 720p export
To browse all the models for text-to-video, you can check out the Hugging Face site.
https://huggingface.co/models?pipeline_tag=text-to-video
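The link above is just the models page filtered by a `pipeline_tag` query parameter, so the same pattern can be reused for other tasks. The small Python sketch below rebuilds that URL; the helper name `models_url` is my own, and the only parameter it relies on is the `pipeline_tag` seen in the link.

```python
from urllib.parse import urlencode

BASE_URL = "https://huggingface.co/models"


def models_url(pipeline_tag):
    """Build a Hugging Face models-page URL filtered by one pipeline tag."""
    return f"{BASE_URL}?{urlencode({'pipeline_tag': pipeline_tag})}"


text_to_video_url = models_url("text-to-video")
print(text_to_video_url)
```

Swapping the tag, for example `models_url("text-to-speech")`, gives the matching list for the earlier step in this journey.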