
Christine Williams
May 13, 2025
Hi, I’m Christine, and in March this year, I started a bold journey—automating a YouTube Shorts channel using AI and RPA. The niche? AI-generated animal stories. Why this niche? Because animals resonate emotionally with audiences, and in the age of short-form content, emotional connection drives views and engagement.
But there was one big problem: producing videos manually is a time sink. Sourcing footage, editing, and publishing takes hours per video. That’s when I decided to go all-in on automation.
During the May holiday, I documented my full automation process. In this blog, I’ll walk you through:
My end-to-end automation strategy – from finding reference videos to generating final visual assets.
How to use my scripts – with step-by-step guidance, so you can implement or adapt the system for your own use.
This framework doesn’t just work for animal content. Master this process, and you can apply it across various AI video niches.
The Core Strategy: Recreate, Refine, and Automate
Let’s be honest—my video creation method is inspired by the best performers in my niche. But I don’t copy; I analyze, deconstruct, and recreate with enhancements.
The pipeline consists of 7 major steps:
Identify top-performing Shorts as references
Break down those videos into storyboard frames
Write AI prompts for each frame (image generation)
Modify elements in prompts to create a unique version
Generate images for each frame
Write video generation prompts for those images
Stitch everything together in an editor
Steps 5 and 7 aren't fully automated yet. Everything else is handled by RPA (Robotic Process Automation) using Automa in Chrome, including running several workflows in parallel through fingerprint browsers.
Step-by-Step Breakdown
1. Sourcing Reference Videos
My script scrapes data from YouTube Shorts with a single hotkey (Ctrl + Alt + S), and supports both single videos and entire channels. The data goes straight into a spreadsheet, saving time and clicks.
⚠️ Pro tip: Use a secondary account for batch scraping, so that any rate limiting or restriction lands on that account rather than your main one.
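To make the spreadsheet step concrete, here is a minimal Python sketch of how scraped rows could be cleaned up before they reach a sheet. It is not the Automa workflow from the repository. The function names and the `title` and `views` columns are assumptions for illustration. It pulls the video ID out of each Shorts URL, skips anything that isn't a Shorts link, drops duplicates, and writes CSV text.

```python
import csv
import io
import re
from typing import Optional
from urllib.parse import urlparse

# Shorts URLs look like https://www.youtube.com/shorts/<11-character id>
SHORTS_PATH = re.compile(r"^/shorts/([A-Za-z0-9_-]{11})/?$")
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def shorts_video_id(url: str) -> Optional[str]:
    """Return the video ID for a Shorts URL, or None for any other URL."""
    parsed = urlparse(url)
    if parsed.netloc not in YOUTUBE_HOSTS:
        return None
    match = SHORTS_PATH.match(parsed.path)
    return match.group(1) if match else None


def rows_to_csv(rows: list) -> str:
    """Write de-duplicated Shorts rows as CSV text (hypothetical columns)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["video_id", "url", "title", "views"])
    writer.writeheader()
    seen = set()
    for row in rows:
        video_id = shorts_video_id(row["url"])
        if video_id is None or video_id in seen:
            continue  # not a Shorts link, or already recorded
        seen.add(video_id)
        writer.writerow(
            {
                "video_id": video_id,
                "url": row["url"],
                "title": row["title"],
                "views": row["views"],
            }
        )
    return buffer.getvalue()
```

Keying each row on the video ID rather than the raw URL means the same Short scraped once from a channel page and once as a single video only shows up once.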
2. Extracting Storyboards with Gemini 2.5 Pro
I use Google AI Studio with Gemini 2.5 Pro to break videos into scenes. It analyzes visuals and generates frame-by-frame prompts for image generation.

Step-by-Step Guide
Step 1: Open Google AI Studio
Log in with your Google account.
In the top-right dropdown, choose Gemini 2.5 Pro or the latest available model.
🔒 If you’re blocked from analyzing a YouTube video directly, use a browser extension or tool (e.g. 4K Video Downloader) to save the video locally, then upload the file directly into Gemini.
Step 2: Load Your Video into Gemini
Option A: Use YouTube Link
Paste the URL of a publicly accessible YouTube Shorts video.
Option B: Upload a File
If external access is blocked, click the paperclip 📎 icon to upload a local video file.
To ensure high-quality output with Dreamina (an image generator), I use a refined prompt structure:
Camera Angle, Scene Setting, Main Character Description, Action, Facial Expression, Supporting Characters, Background, Time of Day, etc.
This structure ensures clarity for the AI model and consistency across frames.
Field | Description | Example |
---|---|---|
Camera Angle | The viewpoint (e.g., side view, low angle) | "Side angle" |
Main Character’s Environment | Where they are | "On a rainy cliff edge" |
Main Character Description | Physical traits | "A man in a white T-shirt and jeans" |
Main Action | What they’re doing | "Holding up a crying baby" |
Facial Expression | Emotion, visible reaction | "Angry expression" |
Supporting Characters | Optional: who else is there | "A police officer running toward them" |
Supporting Action | What they’re doing | "Shouting" |
Supporting Expression | Their emotion | "Serious" |
Background | The setting behind the characters | "Waterfall and misty mountain" |
Additional Details | Visual effects or atmosphere | "Heavy rain, crashing waves" |
Time of Day | When it's happening | "At dusk" |
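The table above maps naturally onto a small data structure. The sketch below is one possible way to hold a frame's fields and join them into a single prompt string. The class name, field names, and the comma-separated output are assumptions for illustration, not a format Dreamina requires.

```python
from dataclasses import dataclass, fields


@dataclass
class FramePrompt:
    """One storyboard frame, with fields in the same order as the table."""

    camera_angle: str
    environment: str
    main_character: str
    main_action: str
    facial_expression: str
    supporting_characters: str = ""  # optional fields default to empty
    supporting_action: str = ""
    supporting_expression: str = ""
    background: str = ""
    additional_details: str = ""
    time_of_day: str = ""

    def render(self) -> str:
        """Join the non-empty fields, in declaration order, with commas."""
        parts = [getattr(self, f.name).strip() for f in fields(self)]
        return ", ".join(part for part in parts if part)


frame = FramePrompt(
    camera_angle="Side angle",
    environment="on a rainy cliff edge",
    main_character="a man in a white T-shirt and jeans",
    main_action="holding up a crying baby",
    facial_expression="angry expression",
    time_of_day="at dusk",
)
print(frame.render())
```

Keeping the fields in a fixed order is what gives consistency across frames: every prompt leads with the camera angle and ends with the time of day, and optional fields simply drop out when they are empty.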
3. Rewriting Prompts to Avoid Plagiarism
Want to make sure your version is original? I built a second Gemini assistant that tweaks core characters, locations, and story elements—while keeping the emotional arc intact.
For instance, you can transform a scene with a pug saving a baby on a stormy beach into one with a golden retriever in a flooded city. The plot remains, but the visual setting changes—making it reusable across multiple themes.
📘 Final Instruction Set for Gemini: Storyboard Prompt Modification
Core Principle: Keep the Plot Intact — Only Swap Characters or Scenes
This prompt system is incredibly easy to use. All you need to do is feed the image-generation prompts from Step 2 into Gemini.
🔄 How It Works:
Copy and paste the prompts you generated in Step 2 into Gemini.
Specify which elements to replace — for example, “Replace the pug with a golden retriever puppy.”
Gemini will output a revised set of prompts with updated characters or settings.
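If you run this step often, the message you paste into Gemini can be assembled programmatically. The sketch below only builds the text. It does not call any Gemini API, and the wording of the instruction is an assumption, not the exact instruction set used in my assistant.

```python
def build_rewrite_request(prompts: list, replacements: dict) -> str:
    """Build the text to paste into Gemini: the rule, the swaps, then the prompts."""
    if not prompts:
        raise ValueError("at least one storyboard prompt is required")
    if not replacements:
        raise ValueError("specify at least one element to replace")

    lines = [
        "Keep the plot intact. Only swap the characters or scenes listed below.",
        "",
        "Replacements:",
    ]
    for old, new in replacements.items():
        lines.append(f"- Replace {old} with {new}.")
    lines.append("")
    lines.append("Storyboard prompts:")
    for index, prompt in enumerate(prompts, start=1):
        lines.append(f"{index}. {prompt}")
    return "\n".join(lines)


request = build_rewrite_request(
    prompts=[
        "Side angle, stormy beach, a pug paddling toward a baby",
        "Low angle, stormy beach, the pug dragging the baby ashore",
    ],
    replacements={"the pug": "a golden retriever puppy"},
)
print(request)
```

Numbering the prompts makes it easy to check that the revised set Gemini returns has the same number of frames in the same order.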
💡 Why This Matters
The magic of this method lies in what it doesn’t change: the storyline remains untouched. Gemini only adjusts surface-level elements like subjects or environments. This means:
You can reuse the same storyboard structure to create multiple variations.
All versions remain compatible with the same video generation prompts.
You save time while producing a range of content from a single base script.
I've tested this personally: I generated six alternate versions using the exact same video-generation instructions, and the results were consistently excellent.
4. Generating Images with Dreamina
Dreamina (CapCut’s international AI image tool) offers free image generation. My RPA script logs in, submits prompts, and downloads images automatically. All images are then renamed in sequence (1.jpg, 2.jpg…) by a Python tool I wrote, so they slot straight into the next step.
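For readers who want to build their own renaming step, here is a minimal sketch of the idea. It is not the tool from the repository. It assumes the download order (file modification time) matches the storyboard order, and the function name is hypothetical.

```python
from pathlib import Path


def rename_in_sequence(folder: Path, extension: str = ".jpg") -> list:
    """Rename every image in `folder` to 1.jpg, 2.jpg, ... and return the new paths."""
    # Assumption: the oldest file is frame 1. Ties are broken by file name.
    images = sorted(
        (p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == extension),
        key=lambda p: (p.stat().st_mtime_ns, p.name),
    )

    # Pass 1: move everything to temporary names, so that writing "1.jpg"
    # can never overwrite an image that is still waiting to be renamed.
    staged = []
    for index, image in enumerate(images, start=1):
        temp = image.with_name(f"__tmp_{index}{extension}")
        image.rename(temp)
        staged.append(temp)

    # Pass 2: move the temporary files to their final sequential names.
    renamed = []
    for index, temp in enumerate(staged, start=1):
        final = temp.with_name(f"{index}{extension}")
        temp.rename(final)
        renamed.append(final)
    return renamed
```

Call it with the download folder, for example `rename_in_sequence(Path("downloads/story_01"))` (a hypothetical path). If your downloads don't arrive in storyboard order, sort by whatever does encode the order, such as a prefix in the file name.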
5. Writing Prompts for Video Generation
I use the Dreamina prompts as input to generate video descriptions for Kling (可灵), Kuaishou’s AI video generator. Prompts follow a specific format:
Camera movement (e.g. handheld, zoom-in)
Subject action (e.g. "the puppy swims towards the child")
Environmental effects (e.g. "stormy waves crashing")
Note: Currently, about 6 out of every 10 prompts produce a usable video, so this step is still a work in progress.
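That hit rate matters for planning. As a rough worked example, the sketch below estimates how many prompts to submit to end up with a target number of usable clips. It treats 6 in 10 as a stable average, which is an assumption, so read the result as an estimate and not a guarantee. The function name is hypothetical.

```python
def prompts_needed(target_clips: int, usable: int = 6, attempted: int = 10) -> int:
    """Estimate how many prompts to submit, given `usable` good videos per `attempted`."""
    if target_clips < 0:
        raise ValueError("target_clips must not be negative")
    if usable < 1 or attempted < usable:
        raise ValueError("expected 1 <= usable <= attempted")
    # Integer ceiling of target_clips * attempted / usable, avoiding float rounding.
    return (target_clips * attempted + usable - 1) // usable


print(prompts_needed(6))   # 10
print(prompts_needed(10))  # 17
```

So a 10-frame story needs roughly 17 generation attempts at the current rate, which is worth knowing before you start spending per-account quota.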
6. Video Generation with Kling
This step is semi-automated. I wrote scripts to register new Kling accounts, input prompts, and download the final videos. Manual login is required due to CAPTCHA.
Each account generates up to 8 videos. Once logged in, everything else is script-driven—from creation to download.
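Because of the per-account cap, a prompt list has to be split into batches before the scripts run. A minimal sketch of that split, assuming the cap of 8 stated above and a hypothetical function name:

```python
def batch_for_accounts(prompts: list, per_account: int = 8) -> list:
    """Split prompts into consecutive batches, one batch per Kling account."""
    if per_account < 1:
        raise ValueError("per_account must be at least 1")
    return [
        prompts[start:start + per_account]
        for start in range(0, len(prompts), per_account)
    ]


batches = batch_for_accounts([f"prompt {n}" for n in range(1, 18)])
print([len(batch) for batch in batches])  # [8, 8, 1]
```

Seventeen prompts therefore need three accounts, with the last one using a single slot. Keeping the batches consecutive preserves storyboard order, which makes the downloaded clips easier to line up with their frames.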
Bonus: Full Automa Script Suite

To tie everything together, I use a full suite of scripts built on Automa 1.28. With proper setup, you can:
Scrape Shorts videos
Parse video scenes with Gemini
Rebuild prompts with alternate characters
Auto-generate images in Dreamina
Auto-generate videos in Kling
Export results in CSV format
I also created templates and sample workflows to minimize onboarding time. Setup can feel complex initially, but once in place, your production becomes effortless.
You can access the automation scripts in the following GitHub repository:
https://github.com/liuyinjiwen06/youtube_automation
Final Thoughts
By combining AI with RPA, I drastically cut down my production time while keeping creative control. This workflow helped me:
Maximize content output with minimal effort
Scale variations from a single script
Repurpose ideas across multiple channels and niches
This system isn’t limited to AI animal stories. Whether you're making ASMR, history shorts, or motivational content—this approach is adaptable.
If you're exploring the YouTube automation game, I hope this walkthrough saves you time and frustration. And if you’re stuck or curious, feel free to reach out—I’m happy to share more!