The bait, then the rug-pull.
Simone Ferretti opens by holding his phone up to the camera under the caption “I Have Cracked The System,” then immediately unloads the proof: zero to 167,000 Instagram followers in fourteen months and a minimum of 15K per month, all from content he never personally records.
What the video promised.
Stated at 00:04: “I am going to show you the exact three step process that allow me to grow my brand new AI Instagram page from zero to 167000 followers in just fourteen months.” Delivered at 13:30.
Where the time goes.

01 · Hook and Proof Stack
Phone-to-camera opener, rapid number escalation 0 to 167K and 15K per month, brand deal timeline. Waitlist CTA teased before the steps are even named.

02 · Three Steps Plus Reel Selection Warning
Names the 3-step system. Warning: cloning the wrong reel wastes all setup time. Manual method: fresh account, follow 5-10 niche creators, warm up the feed.

03 · The Outlier Test
Quantifies winner threshold: 20K avg vs 400K spike. Saves/shares over views. Filter: clone only info/tutorial content that survives creator-brand transplant.

04 · Discovery Page Gold Mine
Instagram discovery page as real-time signal of what works. Warm the account toward niche so the algorithm shows relevant winners.

05 · Picking the Example Reel
Live walkthrough of his own account. Identifies the AirLLM reel as the example. Shows split-screen format.

06 · Step 2 Voice Clone Philosophy
Core principle: input quality equals output quality. Record with emotion, clarity, quiet room, close mic. One-time setup affecting every future reel.

07 · ElevenLabs UI Walkthrough
Screen share: Voices > Create Voice > Professional Voice Clone. 5-minute sample minimum. TTS flow: paste script, select voice, pick model, adjust sliders, generate.

08 · Live Voice Generation Demo
Generates and plays back cloned voice live. Downloads the file for HeyGen upload.

09 · Step 3 HeyGen Setup Philosophy
Input quality repeated. Training video becomes every future reel. Good lighting, eye contact, emotion. If you sound robotic, HeyGen will be robotic.

10 · HeyGen Walkthrough Avatar Creation and Script Upload
Shows Avatars dashboard with multiple looks. Avatar creation: real camera not webcam, 2-3 min sweet spot, consent required. Uploads ElevenLabs audio as accent workaround. Selects Avatar 5, generates.

11 · Final Output Reveal
Completed AI reel plays back. Recommends captions and pacing via CapCut AI or Captions AI.

12 · Results Proof and Waitlist CTA
Full AirLLM reel shown: 130K views, 3.5K likes, labeled “Made Almost Entirely With AI.” Zero on-camera time from Simone. Full-system waitlist pitched.
Named ideas worth stealing.
The Outlier Test
Compare a reel's performance against that creator's own baseline, not platform averages. A 20x spike is the winner signal. Cross-check saves and shares to confirm the engagement is genuine.
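The baseline comparison can be written down as a short calculation. This is a minimal sketch, not Ferretti's tooling: the function names, the use of a simple mean as the baseline, and the sample view counts are assumptions for illustration, while the 20K average, 400K spike, and 20x threshold come from the breakdown.

```python
from statistics import mean


def outlier_multiple(reel_views: int, baseline_views: list[int]) -> float:
    """Return how many times a reel's views exceed the creator's own average."""
    baseline = mean(baseline_views)  # assumption: a plain mean stands in for "baseline"
    return reel_views / baseline


def is_outlier(reel_views: int, baseline_views: list[int], threshold: float = 20.0) -> bool:
    """Flag a reel as a winner when it clears the spike threshold (20x by default)."""
    return outlier_multiple(reel_views, baseline_views) >= threshold


# Hypothetical creator whose typical reels average 20K views.
typical = [18_000, 20_000, 22_000]

print(outlier_multiple(400_000, typical))  # 20.0
print(is_outlier(400_000, typical))        # True
print(is_outlier(60_000, typical))         # False (only 3x the baseline)
```

A reel that clears the threshold still needs the saves and shares cross-check before it is worth cloning.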
Input Quality Equals Output Quality
The quality ceiling of your AI clone is set by the quality of your real recording. Bad mic equals bad clone. Good emotion equals believable avatar.
Three-Step AI Reel Cloning System
- Find a winning reel: outlier test plus saves filter
- Clone your voice: ElevenLabs Professional Voice Clone
- Recreate with an AI avatar: HeyGen plus cloned voice
Account Warming for Discovery
Open a fresh account, follow 5-20 niche creators, engage to train the discovery algorithm. Discovery page becomes a live feed of niche winners.
Lines you could clip.
“I generate over 15000 a month as a minimum from content I never even record myself.”
“Input quality directly affects output quality.”
“If you sound robotic, HeyGen will be robotic.”
“A reel with 100000 views and 5000 saves is significantly more valuable than a reel with 500000 views and 200 saves.”
How they asked for the click.
“Click the link below to join the wait list.”
Teased in the hook block before the steps are named. Repeated at the end with the full system description. No subscribe push; all traffic goes to the waitlist.
The whole game is the input recording.
Ferretti's system works because he treats the one real recording session as the master asset: everything downstream is an infinite derivative of that single quality investment.
- The outlier test is the real alpha: saves/shares over views, baseline comparison over absolute numbers. Run this before touching any AI tool.
- Fresh account plus niche warming equals free trend research. The discovery page shows what the algorithm is already distributing.
- ElevenLabs Professional Voice Clone, not Instant, is the quality gate. 5-minute sample, quiet room, close mic, extra emotion. Record once, use forever.
- The ElevenLabs to HeyGen handoff is the unlock: generate audio in ElevenLabs, upload to HeyGen as a voice file. Bypasses weaker TTS, critical if you have an accent.
- Avatar 5 enhances gestures. Test both Avatar 4 and 5 and pick per-reel as versions evolve.
- The “Made Almost Entirely With AI” label on the proof reel is itself a hook mechanic that drives shares.
- This is a one-person media company model: the system separates ideation from execution. The trained assets do the work.
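The video performs the ElevenLabs half of the handoff entirely in the web UI. For readers who would rather script it, the sketch below builds the equivalent text-to-speech request against the ElevenLabs REST API using only the standard library. Treat it as an assumption-laden illustration: the helper names, the placeholder key and voice ID, the model choice, and the slider values are not from the video, and the HeyGen upload is left as the manual step shown on screen.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(api_key: str, voice_id: str, script: str,
                      model_id: str = "eleven_multilingual_v2") -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request for a cloned voice."""
    payload = {
        "text": script,
        "model_id": model_id,  # assumption: pick whichever model you chose in the UI
        "voice_settings": {    # illustrative slider values, not the video's settings
            "stability": 0.5,
            "similarity_boost": 0.75,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def synthesize_to_file(request: urllib.request.Request, out_path: str) -> None:
    """Send the request and write the returned audio bytes to disk."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as audio:
        audio.write(response.read())


# Placeholder credentials; nothing is sent until synthesize_to_file is called.
request = build_tts_request("YOUR_API_KEY", "YOUR_VOICE_ID", "Paste the reel script here.")
print(request.get_method(), request.full_url)
# synthesize_to_file(request, "reel_voice.mp3")  # then upload the file to HeyGen by hand
```

Keeping request construction separate from sending makes it easy to inspect the payload before spending any credits.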
What you can actually do with this today.
You do not need to be on camera to build a following: you need one good recording session and a clear niche.
- Open a fresh Instagram account, follow 10-20 creators in your target niche, and spend a week warming the discovery feed before touching any AI tool.
- When evaluating reels to clone, look at saves and shares, not view count. 5,000 saves on 100K views outperforms 200 saves on 500K.
- Only clone content where the value is in the information, not the personality. Tutorials and frameworks transplant. Reaction content does not.
- ElevenLabs Professional Voice Clone needs about 5 minutes of clean audio. Record in a quiet room, mic close, with more energy than normal. This sets your quality ceiling forever.
- HeyGen avatar training: use a real camera, not a webcam; record 2-3 minutes; look directly into the lens. The energy in that session is the energy in every reel.
- Once your voice and avatar are trained, producing new reels requires no additional on-camera time. The input cost is front-loaded; the output is unlimited.
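The saves-over-views rule in the list above comes down to comparing save rates. Here is a minimal sketch using the two reels quoted in the video; the `Reel` class and `rank_by_save_rate` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reel:
    name: str
    views: int
    saves: int

    @property
    def save_rate(self) -> float:
        """Fraction of viewers who saved the reel."""
        return self.saves / self.views


def rank_by_save_rate(reels: list[Reel]) -> list[Reel]:
    """Order candidate reels from highest to lowest save rate."""
    return sorted(reels, key=lambda reel: reel.save_rate, reverse=True)


candidates = [
    Reel("500K views, 200 saves", 500_000, 200),
    Reel("100K views, 5,000 saves", 100_000, 5_000),
]

for reel in rank_by_save_rate(candidates):
    print(f"{reel.name}: {reel.save_rate:.2%}")
# 100K views, 5,000 saves: 5.00%
# 500K views, 200 saves: 0.04%
```

The smaller reel wins by a wide margin even though it has a fifth of the views, which is the point of the filter.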