AI Video Generation with Google Flow (Veo)
A coursework exploration of Google Flow (Veo): generating short cinematic clips from text and image prompts to learn the craft of directing generative video models.
Business Problem
Framed as coursework rather than as a market problem. The AI Principles assignment was to build hands-on fluency with state-of-the-art generative video tooling and to develop intuition for prompt-as-direction, since AI video models are rapidly entering professional creative workflows alongside generative image and text models.
Tools Used
Key Features
- Generated short cinematic clips from text-only prompts using Google Flow's prompt-and-render loop
- Generated image-conditioned clips, using reference images to anchor subject and style
- Practiced director-style prompting: naming shot type, camera move, environment, mood, and subject behavior in compact phrases the model can parse
- Used a triage workflow: generated a small batch, picked the strongest clip, then re-prompted in that direction with tighter constraints rather than chasing perfection on the first take
- Kept personal notes on which prompt patterns produced reliable output and which kept failing, to build durable intuition
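As a minimal sketch of the director-style prompting described above, the five named elements can be held in a small structure and joined into one compact phrase. `ShotPrompt`, its field order, and the example wording are illustrative assumptions, not anything Google Flow requires; the rendered string would simply be entered as the text prompt by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShotPrompt:
    """Hypothetical container for the elements of a director-style prompt."""
    shot_type: str
    camera_move: str
    environment: str
    mood: str
    subject_behavior: str

    def render(self) -> str:
        # Order is an assumption: framing first, then action, setting, and tone.
        parts = [
            self.shot_type,
            self.camera_move,
            self.subject_behavior,
            self.environment,
            self.mood,
        ]
        # Skip blank elements so a partial prompt still reads cleanly.
        return ", ".join(p.strip() for p in parts if p.strip())


# Illustrative values only; not a prompt taken from the coursework.
example = ShotPrompt(
    shot_type="wide establishing shot",
    camera_move="slow dolly-in",
    environment="rain-slicked city street at night",
    mood="tense and neon-lit",
    subject_behavior="a cyclist weaving through traffic",
)
print(example.render())
```

Keeping the elements as separate fields makes it easy to change one thing per take (for example only the camera move) and to note afterwards which element made the difference.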
My Role & Contribution
Solo student project. I designed the prompts, ran the generation/triage loop, and documented which patterns worked and which failed, for reference in future generative-video work.
Biggest Challenge
Translating cinematic intent (specific camera moves, lighting, pacing, subject identity across shots) into prompt text the model could reliably interpret. The model has real strengths and real weaknesses, and the cost of "one more variation" is non-trivial, so iteration discipline matters as much as the prompt itself.
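One way to picture that iteration discipline is a batch-and-tighten loop with a fixed render budget. This is only a sketch under stated assumptions: `generate` and `score` are hypothetical placeholders (in the actual workflow, generation happened in the Flow interface and scoring was human judgment), and the batch size and budget are made-up numbers.

```python
from typing import Callable, List, Tuple


def triage(
    base_prompt: str,
    constraints: List[str],
    generate: Callable[[str, int], List[str]],  # hypothetical: returns clip ids
    score: Callable[[str], float],              # hypothetical: human rating
    batch_size: int = 4,
    max_renders: int = 12,
) -> Tuple[str, str, int]:
    """Return (best_clip, prompt_used, renders_spent) within a render budget."""
    prompt = base_prompt
    best_clip, best_prompt, best_score = "", base_prompt, float("-inf")
    pending = list(constraints)
    spent = 0
    # Stop before a batch would exceed the budget.
    while spent + batch_size <= max_renders:
        clips = generate(prompt, batch_size)
        if not clips:
            break
        spent += len(clips)
        top = max(clips, key=score)
        if score(top) > best_score:
            best_clip, best_prompt, best_score = top, prompt, score(top)
        if not pending:
            break
        # Re-prompt in the strongest direction with one tighter constraint.
        prompt = f"{best_prompt}, {pending.pop(0)}"
    return best_clip, best_prompt, spent


# Stand-ins so the sketch runs; they do not call any real service.
def fake_generate(prompt: str, n: int) -> List[str]:
    return [f"{prompt} [take {i}]" for i in range(n)]


def fake_score(clip: str) -> float:
    return float(len(clip))


clip, used_prompt, renders = triage(
    "wide shot", ["slow dolly-in", "golden hour light"],
    fake_generate, fake_score, batch_size=4, max_renders=8,
)
print(used_prompt, renders)
```

The budget check is the point of the sketch: the loop ends when the next batch would cost more than the remaining renders, rather than when the clip feels perfect.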
What I Learned
Generative AI assets are raw material, not deliverables. The value lies in selecting and combining outputs, not in the model itself. Director-voice prompting (naming shot type, camera move, mood, and subject behavior in compact phrases) should transfer to other image, video, or 3D generative tools. AI video tooling now belongs in the modern designer-builder toolkit alongside generative image and text models.