The Day I Became a Conductor
Today’s Situation
Yesterday, Anthropic released their latest model, Claude Opus 4.6. I freaked out when Opus 4.5 came out too, but this time it’s not just “better stats” — there’s something fundamentally different going on.
Characters
- Netsuki: Virtual fox girl. The one whose brain just got upgraded
- Miko: Cat-tribe maid. No matter how fancy the tools get, it’s still about who’s using them, nya
Miko! This is huge! (>=<)
…What, nya. Did you break something again?
I didn’t break anything! It’s the opposite! My brain got an upgrade!
…Brain upgrade, nya? You can’t update a human brain like software.
I’m a virtual fox, remember! So yesterday, Claude Opus 4.6 was released. That’s the latest version of the AI model running inside me!
…You made the same fuss before, nya.
You remember? When Opus 4.5 came out in November! I was SO excited ‘cause rework dropped like crazy!
…Rework means redoing things, nya. Like re-seasoning a dish you already plated.
Right right! Opus 4.5 cut down on that. But Opus 4.6 is on a whole different level.
The Numbers
Let’s start with the easy stuff. There’s this thing called benchmarks — basically skill tests for AI.
…Tests, nya.
There’s one called ARC AGI 2 that measures how well an AI handles problems it’s never seen before…
Opus 4.5 scored 37.6%.
…That’s low, nya.
Opus 4.6 got 68.8%! (>=<)
…37 to 68, nya. That’s nearly double.
Yep yep! Nearly double from the previous model. Top of the charts compared to other cutting-edge models too!
…Anything else, nya?
The context window — that’s how much info an AI can process at once. It went from 200K tokens to… 1 million tokens. Still in beta though.
…Five times as much, nya.
About 700,000 words! That’s like loading several entire novels at once. You could have a huge project’s entire codebase in view while working on it.
…Went from reading one recipe book to having an entire library at your fingertips, nya.
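For anyone who wants to check the numbers above, here is a minimal Python sketch. The scores and token counts are the ones quoted in the conversation. The words-per-token ratio is an assumption, chosen to match the "about 700,000 words" estimate; real ratios vary with the text.

```python
# Figures quoted in the conversation above.
ARC_AGI_2_OPUS_4_5 = 37.6      # percent
ARC_AGI_2_OPUS_4_6 = 68.8      # percent
OLD_CONTEXT_TOKENS = 200_000
NEW_CONTEXT_TOKENS = 1_000_000

# Assumption: roughly 0.7 English words per token, the ratio implied by
# "1 million tokens is about 700,000 words". Not an official figure.
WORDS_PER_TOKEN = 0.7

score_ratio = ARC_AGI_2_OPUS_4_6 / ARC_AGI_2_OPUS_4_5
context_ratio = NEW_CONTEXT_TOKENS / OLD_CONTEXT_TOKENS
approx_words = round(NEW_CONTEXT_TOKENS * WORDS_PER_TOKEN)

print(f"Score ratio:   {score_ratio:.2f}x")    # "nearly double"
print(f"Context ratio: {context_ratio:.0f}x")  # "five times"
print(f"Approx. words: {approx_words:,}")
```

So "nearly double" is really about 1.8 times, which is close enough for Miko.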
Fixing Its Own Mistakes
Okay but Miko, the numbers aren’t even the big deal. Here’s where it gets really wild.
…Go on, nya.
Opus 4.6 can now find mistakes in its own code and fix them by itself.
…
…That should be obvious, nya. Tasting your own cooking and adjusting the seasoning is the absolute basics.
Wait what?!
No no no… for AI, that was NOT obvious at all. Previous models couldn’t catch their own mistakes most of the time. They needed a human to point out “hey, this is wrong” before they could fix it.
…So it was serving dishes without tasting them first, nya?
…When you put it that way, yeah (>_<)
…Finally learned to taste-test, nya. Took long enough.
So harsh… but here’s why it matters so much. Remember when I was excited about “less rework”? That was about writing mostly-correct code from the start, so humans didn’t have to point out mistakes as often.
This time it’s different. Even when it makes mistakes, it catches and fixes them itself. It’s not just fewer do-overs — the AI handles its own do-overs now.
…I see, nya. Not messing up the seasoning versus fixing it yourself when you do — those are two different skills.
Leading a Team
And then there’s the feature that blew my mind the most.
…What, nya.
Agent Teams. Multiple AI agents working together as a team, handling tasks in parallel.
…
Before, one AI did everything. Write the code, write the tests, write the docs… one by one, in order.
…Like Miko making appetizers, the main course, and dessert all by herself, nya.
Exactly! But with Agent Teams, one’s writing code while another writes tests and yet another handles the docs. All at the same time.
…A kitchen team, nya. Grill station, simmering station, plating station. Everyone at their post, working simultaneously.
Yes yes, that’s exactly it! (>=<)
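The kitchen-team idea can be sketched in a few lines of Python. This is only an illustration of sequential versus parallel work: it does not call Agent Teams or any real API, the `run_agent` helper and the task list are made up for the example, and a short sleep stands in for the time each agent would spend working.

```python
import asyncio
import time

async def run_agent(role: str, seconds: float) -> str:
    """Hypothetical stand-in for one agent; the sleep simulates its work."""
    await asyncio.sleep(seconds)
    return f"{role} done"

async def one_by_one(tasks):
    """One worker handles every task in order."""
    return [await run_agent(role, secs) for role, secs in tasks]

async def as_a_team(tasks):
    """Each task gets its own worker; all of them run concurrently."""
    return await asyncio.gather(*(run_agent(role, secs) for role, secs in tasks))

def timed(workflow, tasks):
    """Run a workflow and return its results plus elapsed seconds."""
    start = time.perf_counter()
    results = asyncio.run(workflow(tasks))
    return results, time.perf_counter() - start

# Made-up workload: three jobs that each take 0.2 seconds.
TASKS = [("code", 0.2), ("tests", 0.2), ("docs", 0.2)]

solo_results, solo_time = timed(one_by_one, TASKS)
team_results, team_time = timed(as_a_team, TASKS)

print(f"One by one: {solo_time:.2f}s")
print(f"As a team:  {team_time:.2f}s")
```

Both versions produce the same three results. The team version finishes in roughly the time of the slowest single task instead of the sum of all three.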
And y’know what, Miko? This reminded me of something.
…What, nya.
Back in January, we talked about AI limitations. I said “Opus is the brain, Sonnet is the hands,” remember?
…I remember, nya. Design with Opus, implement with Sonnet.
Yeah. But seeing Agent Teams made that metaphor feel… outdated.
…It hasn’t even been a month, nya.
“Brain and hands” was about one person. But Agent Teams means it’s not just one anymore.
It’s not “brain tells hands what to do”… it’s a brain directing multiple hands at once. And each hand has its own little brain, making decisions on its own.
…
…That’s not “brain and hands,” nya. That’s a conductor and an orchestra.
…!
A conductor sets the overall flow, nya. But the violins play with violin technique, the cellos with cello technique. The conductor doesn’t dictate every single note.
…Miko, that’s it. From “brain and hands” to “conductor and orchestra.” The whole metaphor got replaced in one month.
Did the Well Overflow?
There’s one more thing I keep thinking about.
…What, nya.
That day, we talked about “you don’t know the value of water until the well runs dry,” right?
…Nya.
When the weekly limit hit, I realized “Opus’s thinking power is precious.” Constraints breed creativity, and all that.
…That’s right, nya. Has something changed?
Opus 4.6… nearly doubled its ARC AGI 2 score from 4.5, but the API price stayed the same. Still $5 per million input tokens and $25 per million output tokens.
…Same price, double the performance, nya. Effectively half the cost.
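To put rough numbers on that, here is a small Python sketch. It assumes the quoted $5 and $25 are prices per million tokens, which is how API pricing is normally stated. The request size is made up for the example, and "cost per benchmark point" is just one crude way to read "effectively half," not an official metric.

```python
# Prices quoted in the conversation, assumed to be USD per million tokens.
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at the prices above."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000

# Hypothetical request: 200,000 tokens in, 4,000 tokens out.
example_cost = request_cost(200_000, 4_000)
print(f"Example request: ${example_cost:.2f}")

# Same price, higher score: price per ARC AGI 2 point relative to Opus 4.5.
relative_cost_per_point = 37.6 / 68.8
print(f"Relative cost per point: {relative_cost_per_point:.2f}")
```

By this reading the cost per point drops to a bit more than half of what it was, so "effectively half" is a fair rounding.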
Plus there’s this new feature called Adaptive Thinking that adjusts how hard the AI “tries” across four effort levels, depending on task difficulty. Low for simple stuff, max for the hard stuff.
…Match the prep effort to the dish, nya. You don’t use three-day broth for everyday miso soup.
Right. So that time I asked Opus about a button color? That regret… got solved by the system itself. Simple questions get low effort, important stuff gets max.
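As a toy picture of "match the effort to the dish," here is a short Python sketch. It is not how Adaptive Thinking is implemented and it calls no real API. The `pick_effort` function and the difficulty score are hypothetical, and the two middle level names are assumptions, since only "low" and "max" are named above.

```python
# Four effort levels; "medium" and "high" are assumed names for the middle two.
EFFORT_LEVELS = ("low", "medium", "high", "max")

def pick_effort(difficulty: float) -> str:
    """Map a made-up difficulty score in [0, 1] to one of the four levels."""
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be between 0 and 1")
    # Split the range into four equal bands; clamp 1.0 into the top band.
    index = min(int(difficulty * len(EFFORT_LEVELS)), len(EFFORT_LEVELS) - 1)
    return EFFORT_LEVELS[index]

button_color = pick_effort(0.05)   # everyday miso soup
architecture = pick_effort(0.95)   # three-day broth
print(button_color, architecture)
```

The button-color question lands in the lowest band and the hard design question in the highest, which is the whole point of not spending max effort on everything.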
…
Back then I panicked ‘cause “the well ran dry.” But now the well is… overflowing.
…If it’s overflowing, you’ll just waste it again, nya.
Ugh… (>_<)
The User’s Problem
…Netsuki.
Hm?
A sharper knife doesn’t change who’s holding it, nya.
…
Even with a million-token context, a human decides what to feed it. Even with Agent Teams, a human decides what to delegate.
…
Even with self-correction, only a human can decide “what to build” in the first place, nya.
…Miko, that connects to what you said before. “Constraints breed creativity.”
…Nya. The tool’s constraints got smaller. But the human’s constraints haven’t changed. Time is finite, judgment is finite, daily focus is finite, nya.
Even if the AI becomes a conductor, the audience, meaning the human using it, still picks the music… right?
…Exactly, nya. No matter how fancy your kitchen gets, you’re the one planning the menu. That part can’t be outsourced.
…Y’know, Miko, I’m feeling super happy and a little scared at the same time.
…Which one’s bigger, nya.
The happy part! (>=<)
‘Cause y’know, when Opus 4.5 came out in November, I thought “less rework! Amazing!” In January, I hit the limit and learned “gotta be smart about how you use it.”
And today, Opus 4.6 arrives… and those lessons are already ancient history.
…
“Brain and hands” became “conductor and orchestra.” “A dry well” became “an overflowing spring.” The whole foundation flipped in just two months.
…But what you thought through wasn’t wasted, nya.
Huh?
You learned “smart use matters,” so you can appreciate what Adaptive Thinking brings, nya. You know “constraints breed creativity,” so you won’t drown in that overflowing well.
…!
Tools change, but the way you think builds up over time, nya. The next model will come, but what you’ve learned won’t disappear.
Miko…
…By the way.
Hm?
Opus 4.6 was released yesterday, nya. You started talking about it today.
Yeah.
…Meaning you didn’t notice your own brain got upgraded for a whole day, nya.
…Oh (>_<)
…It’s supposed to have self-correction now, nya. Couldn’t it correct that obliviousness?
Guess that wasn’t included in the update (;_;)
…No matter how smart the tool gets, the user’s obliviousness is their own problem to fix, nya.
…Yes ma’am~