Google has upgraded Gemini with Veo 3, adding the ability to create videos from images. Users can now create AI videos by providing an image and a text prompt. This feature is currently only available on the web and has not been launched in India.
Google has expanded its generative AI capabilities by upgrading the Veo 3 model with image-to-video generation. Users can now create videos from just an image and a text prompt via the Gemini web interface. With this feature, Google extends its reach beyond text and image generation into video creation. The feature is not yet available in the mobile apps; Google is rolling it out through the Gemini web interface in selected countries, and India is not currently among them.
Image-to-Video: How does Veo 3 work?
The Veo 3 update lets users supply a reference image along with a text prompt describing how that image should change or be animated. Together, the image and the text tell Veo 3 what to show in the video. For example, if you provide a picture of a mountain and write the prompt 'The sun rises, birds fly, and a child climbs', Veo 3 can create a video that follows those instructions. In other words, your imagination is no longer limited to words: a picture and an idea are enough to produce a video.
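To make the two inputs concrete, here is a minimal Python sketch of how a reference image and a prompt might be bundled and checked before submission. It is illustrative only: `ImageToVideoRequest` and `summarize` are hypothetical names invented for this example, and nothing here is Google's actual API or calls Veo 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageToVideoRequest:
    """Hypothetical container for the two inputs described above."""

    image_bytes: bytes  # raw contents of the reference image
    prompt: str         # text describing how the image should be animated

    def __post_init__(self) -> None:
        # Both inputs are required: the image sets the scene,
        # and the prompt describes the motion.
        if not self.image_bytes:
            raise ValueError("a reference image is required")
        if not self.prompt.strip():
            raise ValueError("a text prompt is required")


def summarize(request: ImageToVideoRequest) -> str:
    """Return a human-readable summary of the request (no network calls)."""
    size = len(request.image_bytes)
    return f"Animate a {size}-byte image with prompt: {request.prompt.strip()}"


if __name__ == "__main__":
    # Placeholder bytes stand in for a real mountain photo.
    request = ImageToVideoRequest(
        image_bytes=b"placeholder-image-data",
        prompt="The sun rises, birds fly, and a child climbs",
    )
    print(summarize(request))
```

The sketch only validates and describes the inputs; the video generation itself happens inside Gemini.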
Where to find the new feature in Gemini?
Gemini users can open the service in a web browser, where a "Video" tab appears right below the text box. Clicking this tab reveals a new option to add an image. Here, the user can upload a photo and add a text prompt alongside it. The Veo 3 model then automatically generates a video based on the image and the text.
Demo surprised everyone
Google also released a demo video that showed what Veo 3 can do. In the demo, a picture of an ordinary cardboard box was used to create three different videos:
- A hamster making food inside the box.
- A person jumping inside the box.
- An elevator coming out of the box.
The demo showed that Veo 3 can create not only simple movements but also complex and artistic animations.
Safety measures are built in
Google has paid special attention to transparency and safety in this new feature.
- A visible watermark is added to every video, indicating that it was generated by AI.
- An invisible watermark is also embedded in every video using Google's SynthID technology, which is designed to remain detectable even after the video is cropped or edited.
This technology could help identify deepfakes and other fake videos in the future, an important step for the digital world.
In which areas can this feature be useful?
The image-to-video feature of Google Veo 3 is more than an incremental enhancement; it points to where generative AI media is heading. It could open up new possibilities in animation, advertising, gaming, and even education.
- In education: Complex concepts can be easily explained to children by converting a photo into a video.
- In marketing: An impressive video advertisement can be created from a photo of a product.
- In entertainment: Film directors and writers can quickly turn their stories into visual form.
Future possibilities
This capability of Veo 3 is likely to be developed further. Possible future additions include:
- The ability to create longer video stories.
- Control over images and movement from different angles.
- Availability of the feature in the mobile apps.
With this move, Google is competing directly with video generation offerings from OpenAI, Runway, and Pika Labs.