Mastering the Digital Shift: From Video to Text

The digital world is evolving at a rapid pace, and one thing I feel strongly about is the need to stay proficient in all major media forms: text, images, audio, and video. We're witnessing a notable shift from text and images to more dynamic formats like audio (such as podcasts) and synthetic video.

While I’m still shaping my perspective on this trend, it’s undeniable that we’re moving into an era dominated by new ways to communicate.

If you want to stay relevant in your work or on social media, it’s becoming increasingly important to master how to engage with video content. This means being able to take a video and repurpose it—to respond to it, translate it, or analyse its contents.

With the rise of AI-generated video, I predict we’ll face an ever-growing bombardment of video content. It’s easier and cheaper to produce than ever before, more democratic in its accessibility, and very much in favour with younger audiences.

Here is a synthetic video featuring an AI avatar that got me thinking about organising my workflow:

Example video from AI Avatars speaking through video rather than sending textual information.

Embracing the Challenge

In light of this, I challenged myself to work from video to text using both my mobile device (iOS) and my computer (macOS). My goal is to get comfortable not just creating and sharing video, but also efficiently turning that content into actionable insights or further creative output—both in English and Dutch.

In this article, I want to share my current workflows for video-to-text conversion, with a promise to dive deeper into text-to-video workflows in the near future.

My Workflow for Video to Text (Mobile)

I use an iPhone running iOS for my mobile workflow. This setup allows me to leverage a variety of apps that help streamline the video-to-text process.

1. Find and Download Video Content

The first step is to identify the video I want to work with. Getting hold of a video file is much harder than getting hold of a text, perhaps because of copyright concerns or technical constraints such as streaming and storage. Obtaining a copy of video content has been made deliberately difficult.

Once I find the video, whether it is on Instagram, YouTube, X, TikTok, or elsewhere, I use a tool called Video Lite to download it onto my iPhone.

(I would love to have an app for mobile and desktop (iOS/Mac) that can download videos, store them in iCloud, make them searchable, and transcribe them in their original language.)

2. Organise Video Files Using Folders

After downloading, I organise the video files in dedicated folders within iCloud. To do this, I first need to move the videos out of Video Lite by saving them to Files.

For videos I only need for transcription, I use the Downloads folder. Videos I want to save go into an iCloud folder named “Video.” This setup ensures seamless syncing across devices and simplifies sharing between apps, without cluttering up other storage locations.
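On the Mac, the same split between Downloads and the “Video” folder could be scripted. What follows is only a minimal sketch, not something I run today: the function name archive_videos, the list of extensions, and the idea of passing in the file names to keep are all assumptions made for illustration.

```python
from pathlib import Path
import shutil

# Assumption: these are the video file types worth keeping.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".m4v"}


def archive_videos(downloads: Path, video_dir: Path, keep_names: set) -> list:
    """Move the named video files from downloads into video_dir.

    Everything else stays in downloads, where it can be transcribed
    and deleted later. Returns the names of the files that were moved.
    """
    video_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for item in sorted(downloads.iterdir()):
        is_video = item.is_file() and item.suffix.lower() in VIDEO_EXTENSIONS
        if is_video and item.name in keep_names:
            shutil.move(str(item), str(video_dir / item.name))
            moved.append(item.name)
    return moved


# Example call (hypothetical paths and file name; adjust to your own setup):
# archive_videos(
#     Path.home() / "Downloads",
#     Path.home() / "Library/Mobile Documents/com~apple~CloudDocs/Video",
#     {"avatar-demo.mp4"},
# )
```

Files that are not named explicitly are left alone, which matches the idea that Downloads is only a temporary stop for videos I need for transcription.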

3. Transcribe the Video

To transcribe the video, I use an app called Transcribe. It’s reliable and user-friendly for both Dutch and English content, but ideally, I would expect Apple to offer a native solution for this step—handled locally on the phone or computer, rather than relying on the cloud.

Transcription workflow in Transcribe. Note that the app also uploads the file to the cloud.

4. Export the Transcript for Further Work

Once the transcription is complete, I export the text and bring it into ChatGPT. This is where the real creative work begins—refining the message, analysing the content, or even generating new ideas based on the transcript.

Sending the transcript from Transcribe to the ChatGPT app.
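Before pasting a transcript into ChatGPT, it helps me to wrap it in a clear instruction. The snippet below is a small sketch of that idea. The function name build_prompt and the wording of the prompt are my own assumptions; it only prepares text and does not call any ChatGPT API.

```python
def build_prompt(transcript: str, task: str, language: str = "English") -> str:
    """Combine a task, an answer language, and a transcript into one prompt."""
    # Collapse line breaks and repeated spaces left over from the export.
    cleaned = " ".join(transcript.split())
    return (
        f"Below is a transcript of a video. {task} "
        f"Answer in {language}.\n\n"
        f"Transcript:\n{cleaned}"
    )


# Example: ask for a Dutch summary of an English transcript.
prompt = build_prompt(
    "Welcome   to the\nvideo.",
    task="Summarise the main points in three bullets.",
    language="Dutch",
)
```

Keeping the task and the answer language as separate parameters makes it easy to reuse the same transcript for a summary, a translation, or a list of follow-up ideas.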

Final Thoughts

This workflow is still a work in progress, and I’m experimenting to see how I can make it more streamlined. As we continue to embrace AI and video content, I believe adaptability is key. It’s not just about creating content but learning how to wield it in various forms, from video to text and back again.

In the next part of this series, I’ll share my approach to creating synthetic video content from text—an area that’s becoming more accessible, cheaper, and surprisingly versatile.

Stay tuned, and I’d love to hear about your workflows for video and text. Are you finding new ways to repurpose content across formats?


P.S. YouTube also provides transcripts for most videos. It is a bit hard to fetch these texts, but there is a browser extension for Chrome and Safari that does it, called YouTube Summary by Glasp.

Personally, I am not a big fan of plugins and extensions.

The example video with a transcript fetched through the extension.

I also found out that I can drop videos into Fireflies, which provides transcription and analysis. You can also export the text as a SubRip Subtitle (SRT) file.
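An SRT file is plain text made up of numbered blocks, each with a timestamp line followed by one or more lines of spoken text. If you only want the words, for example to paste into ChatGPT, the numbering and timestamps can be stripped out. Here is a minimal sketch, assuming a well-formed SRT export; the function name srt_to_text is hypothetical and I have not tested it against Fireflies output specifically.

```python
import re

# An SRT timestamp line looks like: 00:00:01,000 --> 00:00:03,000
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def srt_to_text(srt: str) -> str:
    """Return only the spoken text of an SRT file as one line of prose."""
    # Normalise Windows line endings and drop a leading byte order mark.
    srt = srt.replace("\r\n", "\n").lstrip("\ufeff")
    spoken = []
    # Subtitle blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        rows = [row.strip() for row in block.splitlines() if row.strip()]
        for i, row in enumerate(rows):
            if TIMESTAMP.match(row):
                # Everything after the timestamp line is subtitle text.
                spoken.extend(rows[i + 1:])
                break
    return " ".join(spoken)


sample = (
    "1\n00:00:01,000 --> 00:00:03,000\nHello there.\n\n"
    "2\n00:00:03,500 --> 00:00:05,000\nWelcome to\nthe video.\n"
)
print(srt_to_text(sample))  # prints: Hello there. Welcome to the video.
```

Blocks without a timestamp line are skipped, so stray text between subtitles does not end up in the result.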