The bait, then the rug-pull.
The opening is a thesis statement masquerading as a casual review. Inside the first 18 seconds, GosuCoder names the central problem of AI-assisted coding (keeping the agent on track in big projects), then makes the bold claim that Augment Code's new task list may be the best implementation he's seen, a claim the rest of the video unpacks.
What the video promised.
Stated at 04:05: “I'm gonna go through some of the features on it, but the big thing I wanna cover is it went from 63.2% in my evals to 67.5.” Delivered at 05:20.
Where the time goes.

01 · The thesis
Names the hardest problem in AI-assisted coding (keeping AI on track at scale) and makes the bold claim about Augment Code's new task list.

02 · Two schools of thought: split context vs same context
Whiteboards the landscape — Roo Code's orchestrator mode (split context, sub-agents) vs the same-context approach where one chat manages itself.

03 · Text task lists and Taskmaster
Walks through how he used to solve this — ChatGPT/Claude-generated text task lists, then bolt-on tools like Claude Taskmaster (screenshot shown). Powerful, but takes setup work.

04 · Claude Code and Augment Code's built-in task lists
Frames Claude Code's todo list and Augment Code's new task list as the in-flow same-context answer. No orchestration needed — the agent manages itself.

05 · The receipt: 63.2% → 67.5%
Shows his Best-AI-Agents leaderboard. Augment's eval score jumped from 63.2 (May 30) to 67.5 after the task-list update — a real, measurable boost bringing it in line with Cline and Roo Code.

06 · Live demo: manual add, run-all, status control
Screen-share inside VS Code. Adds a task manually, edits status manually, hits run-all-tasks. Argues this is incredible because you can edit individual tasks without spending tokens to make the AI redo a plan.
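The economics of that demo can be made concrete. Below is a minimal sketch of why manual edits are "free": changing a task is a local state mutation, while asking the model to re-plan costs a model call (and tokens). All names here are illustrative, not Augment Code's actual API.

```python
# Hypothetical sketch: manual task edits vs. token-spending re-plans.
class TaskList:
    def __init__(self):
        self.tasks = []           # (description, status) pairs
        self.model_calls = 0      # stand-in for token spend

    def replan_with_model(self, prompt):
        # Asking the AI to regenerate the plan costs a model call.
        self.model_calls += 1
        self.tasks = [(prompt, "pending")]

    def add_task(self, description):
        # Manual add: pure local mutation, no model call.
        self.tasks.append((description, "pending"))

    def set_status(self, index, status):
        # Manual status edit: also free.
        desc, _ = self.tasks[index]
        self.tasks[index] = (desc, status)
```

Editing ten task statuses by hand keeps `model_calls` at zero; one "redo the plan" request would not.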

07 · What it gets right (and one nitpick)
Praises that Augment knows WHEN to generate a task list — it picks complex queries and skips simple ones. Nitpick: panel is binary open/closed, can't be resized to a partial state.

08 · Continue-in-new-chat + import/export
Lists the workflow extras — push a task list into a brand new chat (he uses this), import from markdown (untested), export (untested).

09 · Reining in the AI + closing prediction
Generalizes the lesson — task lists 'rein in the AI,' which is why Claude Code feels controllable. Predicts every other AI coding tool will copy this pattern because it makes too much sense.
Visual structure at a glance.
Named ideas worth stealing.
Split Context vs Same Context
- Split Context — orchestrator mode dispatches sub-jobs to specialized modes (Roo Code, Claude Subagents)
- Same Context — one chat manages its own task list in-flow (Claude Code todo list, Augment Code task list)
The two architectural approaches AI coding tools are converging on for keeping agents on track in large codebases.
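The contrast above can be sketched in a few lines. This is a toy model under stated assumptions — `Task`, `SameContextAgent`, and `Orchestrator` are invented names, not the real APIs of Roo Code, Claude Code, or Augment Code — but it captures the structural difference: one agent owning its list in-flow versus an orchestrator handing each sub-job a fresh context.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    description: str
    status: str = "pending"   # pending | in_progress | done

# Same context: one agent plans a task list and works through it in-flow.
@dataclass
class SameContextAgent:
    tasks: list = field(default_factory=list)

    def plan(self, descriptions):
        self.tasks = [Task(d) for d in descriptions]

    def run_all(self):
        for task in self.tasks:
            task.status = "in_progress"
            # ... the agent would do the actual work here ...
            task.status = "done"
        return [t.status for t in self.tasks]

# Split context: an orchestrator dispatches each sub-job to a fresh
# sub-agent with its own empty context.
class Orchestrator:
    def run_all(self, descriptions):
        results = []
        for d in descriptions:
            sub_agent = SameContextAgent()   # fresh context per sub-job
            sub_agent.plan([d])
            results.extend(sub_agent.run_all())
        return results
```

The trade-off the video gestures at falls out of the shapes: the same-context agent keeps full history (risking drift as it grows), while each sub-agent starts clean (at the cost of orchestration overhead).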
Five Things That Help Keep AI on Track (whiteboard list)
- Small surgical changes — don't let the AI do big sweeping things
- Use text-based task lists and have the AI work through them
- Add-ons like Claude Taskmaster
- Claude Code built-in todo lists
- Augment Code built-in task lists
The whiteboard slide that anchors the whole video — the progression of techniques he's used over the last year.
Lines you could clip.
“One of the hardest things to do with AI assisted coding is keeping the AI on track in large coding projects.”
“Augment Code went from 63.2% in my evals to 67.5 — pretty huge boost very consistently because of the task list management.”
“It really does rein in the AI. I felt this in Claude Code, and that's probably one of the reasons why I like Claude Code so much.”
“I would actually be surprised if we did not see things like this in some of the other AI coding tools because it just makes way too much sense.”
How they spent the runtime.
Things they pointed at.
How they asked for the click.
“Have you had a chance to try this out? If not, you should just definitely go check it out because this thing is freaking awesome. Let me know what your thoughts are below.”
Soft CTA — three asks bundled (comment, try the product, implied subscribe). No hard sell, no affiliate pitch despite a Scrimba affiliate link sitting in the description. The product itself is the call to action.
Word for word.
Steal the format: feature review with a receipt.
Every JoeFlow / Mod Boss feature ship deserves a number — a before/after metric on a real workflow — not a vibes review.
- Open with the universal pain, not the product. 'The hardest thing in AI-assisted coding is keeping it on track' lands before he ever says Augment Code.
- Whiteboard the landscape first. Show where the new feature fits in a map of all existing options. Makes it feel inevitable, not random.
- Always have a receipt. The 63.2 → 67.5 eval delta is what makes this a tool review and not a tool ad. Joe needs a number for every JoeFlow accuracy/speed claim.
- Predict the future at the end. 'Every tool will copy this' creates an evergreen rewatch hook — when the next tool ships the feature, the video gets fresh relevance.
- Keep the demo inside the editor where the audience already lives. No fancy cuts, no Premiere transitions — VS Code screen-share + face-cam PIP is enough.
What this means if you're picking an AI coding tool.
If your agent keeps going off-rails on big features, the unlock is built-in task lists — and right now Augment Code and Claude Code are the two tools doing it best.
- If you've been frustrated with AI doing too much at once or losing the thread mid-feature, this is the category of fix to look at, not better prompts.
- Augment Code's task list is editable mid-run — you can add, remove, or change individual tasks without spending tokens to re-plan. That's the killer feature.
- Claude Code's built-in todo list does the same thing in a different shape, and is also worth trying if you don't want to switch tools.
- Skip Cursor / vanilla Copilot for multi-step features — they don't have this pattern yet (though that'll likely change soon).
- The 'continue in new chat' workflow is the secret weapon: plan in one window, execute in a clean one. Smaller context window = less drift.
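That last workflow reduces to two small steps: serialize the task list, then seed a fresh session with only the plan instead of the whole transcript. A minimal sketch, assuming a markdown-checklist export and a chat-message list shaped like common LLM APIs (both assumptions, not Augment Code's actual format):

```python
# Hypothetical sketch of the "continue in new chat" workflow.

def export_tasks(tasks):
    """Render (description, status) pairs as a markdown checklist."""
    marks = {"done": "x", "pending": " ", "in_progress": " "}
    return "\n".join(f"- [{marks[status]}] {desc}" for desc, status in tasks)

def new_chat_context(tasks, system_prompt="You are a coding agent."):
    """A clean context: system prompt plus the plan, no prior transcript."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Continue this plan:\n" + export_tasks(tasks)},
    ]

plan = [("Add auth middleware", "done"), ("Write login route", "pending")]
ctx = new_chat_context(plan)
```

The new session carries only the checklist, which is exactly why drift drops: the model never sees the hundred messages that produced the plan, just the plan.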