How many times a day do you say, “Let me show you what I mean?”
Do you use it when explaining something complex to a team member? Or when telling an expert what you are trying to accomplish, or what problem you want them to help you solve?
Anyone on my team, and any developer, graphic designer, or producer who has worked with me, knows it is one of my favorite ways to make sure I am communicating clearly.
About a month ago, I was granted early access to Google’s Gemini 1.5 Pro model, which has a 1 million token context window. That window is interesting for many reasons; one is that it enables video as input for the first time.
I was not prepared for how fundamentally this would change my experience with AI.
For us humans, language carries us a long way, and images take us further. Video lets us communicate more, often better and more easily. Now we can say to the AI, “Let me show you what I mean.”
While AI has primarily focused on text-based interactions, video input provides a wealth of real-world context that text alone cannot capture.
Pushing No-Code Boundaries
In testing Gemini’s capabilities, I am pushing myself to think of ways to use a multimodal LLM to develop no-code solutions. I learn how developers approach the tool, then work out a no-code way to use the model in a similar fashion on my own.
Here are a couple of examples I shared recently in a Q&A session:
🔶 In the early-release testing version of Gemini 1.5 Pro, you can’t ask Gemini to visit a website and provide feedback.
Developer solution: connect a code-based web-scraping tool
My non-developer solution: use a screen-capture video and walk through the pages the way a human would navigate them
🔶 In the early-release testing version of Gemini 1.5 Pro, video input did not include audio.
Developer solution: supply a transcript with timestamps and use complex prompt engineering to relate the two (see the sketch after this list)
My non-developer solution: record the video with subtitles in Apple Clips on my iPhone
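For the curious, here is a minimal sketch of what that developer approach could look like using Google’s public google-generativeai Python SDK, not necessarily the early-access setup I had. The file name, transcript, and prompt are hypothetical stand-ins, and it assumes a GOOGLE_API_KEY environment variable is set.

```python
import time

import google.generativeai as genai

genai.configure()  # falls back to the GOOGLE_API_KEY environment variable

# Upload the video through the File API; large files process asynchronously,
# so poll until the file is ready. "walkthrough.mp4" is a hypothetical name.
video = genai.upload_file(path="walkthrough.mp4")
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

# A hypothetical timestamped transcript standing in for the missing audio.
transcript = """\
00:00:05  Here is the landing page I mentioned.
00:00:18  Notice the signup form sits below the fold.
"""

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content([
    video,
    "Timestamped transcript of the narration:\n" + transcript,
    "Relate the transcript to what is shown on screen and summarize "
    "what the speaker is pointing out at each timestamp.",
])
print(response.text)
```

The prompt engineering here is the simple part; the point is that a developer has to stitch video and transcript together explicitly, while the Apple Clips approach bakes the words into the frames themselves.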
Access to Gemini 1.5 Pro has been limited, but it has drawn a good bit of press and discussion. Even so, too many people are missing how game-changing video as input will be.
Being able to say, “Let me show you what I mean,” will help people achieve all sorts of new things.
My mind is already racing with ideas I feel newly empowered to try on my own.