The T480 i7 with 64GB RAM is still a very decent machine in 2026, even when weighed down by a mess of browser tabs and a bunch of Electron apps open all the time.
No. I run a similar setup, and with the $200 subscription I usually hit the weekly quota by around day 3-4. My approach is 4-5 hours of extreme human-in-the-loop spec sessions with opus and codex:
1. We discuss every question with opus, and ask codex for a second opinion (just a skill that teaches claude how to call codex) whenever even I'm not sure what the right approach is
2. When context window reaches ~120k tokens, I ask opus to update the relevant spec files.
3. Repeat until all 3 of us - me, opus and codex - are happy, or we're starting to discuss nitpicks and YAGNIs. Whichever comes first.
Then it's fully autonomous until all agents are happy.
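That loop is simple enough to sketch in Python. Everything here is a made-up stand-in, not a real API - in practice each of these would be a CLI invocation of the respective agent:

```python
# Hypothetical sketch of the spec-session loop above. ask_opus / ask_codex /
# ask_human are invented interfaces standing in for actual agent calls.

CONTEXT_CHECKPOINT = 120_000  # the ~120k-token threshold mentioned above

def spec_session(questions, ask_opus, ask_codex, ask_human, count_tokens):
    """One human-in-the-loop spec session over a list of open questions."""
    transcript = []
    for q in questions:
        answer, confident = ask_opus(q)
        if not confident:                 # uncertainty signals a spec gap
            answer = ask_human(q, answer, ask_codex(q))
        transcript.append(answer)
        if count_tokens(transcript) >= CONTEXT_CHECKPOINT:
            ask_opus("Update the relevant spec files.")  # flush to disk
            transcript.clear()            # continue with fresh context
    return transcript
```

The checkpoint-then-clear step is the whole trick: the spec files become the durable memory, so the context window never has to hold the full discussion.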
Which is why I'm exploring optimization strategies. Based on an analysis of where most of the tokens are spent in my workflow, roughly 40% are thinking tokens ("hmm, not sure, maybe...") and 30% are code files.
So two approaches:
1. Have a cheap supervisor agent that detects when claude is unsure about something (which means there's a spec gap) and alerts me so that I can step in
2. "Oracle" agent that keeps relevant parts of codebase in context and can answer questions from builder agents.
And also delegating some work to cheaper models like GLM where top performance isn't necessary.
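Approach 1 can start out as dumb as a keyword scan over the builder's output stream. A rough sketch - the phrase list and threshold here are invented, and a real version might use a cheap model as the classifier instead:

```python
import re

# Hypothetical supervisor: watch the builder agent's output for hedging
# language (a cheap proxy for spec gaps) and flag it for human review.
UNCERTAINTY_PATTERNS = [
    r"\bhmm\b", r"\bnot sure\b", r"\bmaybe\b",
    r"\bi think\b", r"\bassuming\b", r"\bunclear\b",
]
UNCERTAINTY_RE = re.compile("|".join(UNCERTAINTY_PATTERNS), re.IGNORECASE)

def supervise(stream, alert, threshold=3):
    """Call alert() once hedging markers in the stream exceed threshold."""
    hits = 0
    for chunk in stream:
        hits += len(UNCERTAINTY_RE.findall(chunk))
        if hits >= threshold:
            alert(f"Possible spec gap: {hits} uncertainty markers seen.")
            hits = 0  # reset so the human isn't spammed
```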
You'll notice that as soon as you reach a setup you like that actually works, the $200 subscription quota becomes the limiting factor.
That does seem to argue for the checkpointing strategy of having the agent explain its plan and then work on it incrementally. When you run out of tokens, you either switch projects or proceed by hand until the quota recovers.
I also kinda expect that one of the saner parts of agentic development is the skills system, that skills can be completely deterministic, and that after the Trough of Disillusionment people will be using skills a lot more and AI a lot less.
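To make the "deterministic" point concrete: a skill can bottom out in a plain script that produces the same output for the same input every time, no model in the loop. A toy made-up example:

```python
# Toy example of a fully deterministic skill: instead of having a model
# regenerate this boilerplate each time, the agent just runs the script.
def format_changelog(entries):
    """Same input always yields byte-identical output - zero tokens spent."""
    lines = ["# Changelog", ""]
    for version, changes in entries:
        lines.append(f"## {version}")
        lines.extend(f"- {c}" for c in changes)
        lines.append("")
    return "\n".join(lines)
```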
Yes on both counts. The implementation plan is a second layer after the spec is written, at which point the spec can't be changed by agents. I then launch a planner agent that writes a phased plan file, and each builder can only work on a single phase from that file.
So it's spec (human in the loop) > plan > build. Then it cycles autonomously in plan > build until spec goals are achieved. This orchestration is all managed by a simple shell script.
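The actual orchestrator is a shell script, but the cycle is easy to show in Python - all of these function names are hypothetical stand-ins for agent invocations:

```python
# Hypothetical sketch of the spec > plan > build cycle described above.
def orchestrate(spec_goals_met, plan_phases, build_phase, max_cycles=10):
    """Cycle plan -> build until the (frozen) spec's goals are achieved."""
    for cycle in range(max_cycles):
        if spec_goals_met():
            return cycle                 # done: spec goals achieved
        for phase in plan_phases():      # planner writes a phased plan file
            build_phase(phase)           # each builder works one phase only
    raise RuntimeError("spec goals not met within max_cycles")
```

The key property is that the spec never appears as a writable input here - agents only re-plan and re-build against it.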
But even with the implementation plan file, a new agent has to orient itself and load files it may later decide were irrelevant; the plan may not have been completely correct, there could have been gaps, initial assumptions may not hold, etc. It then starts eating tokens.
Ummm, why? Obviously you'd only bet if you were planning on claiming the attack publicly shortly afterwards, so maybe you'd see some bets from Bin Laden and associates. If say a CIA operative had prior knowledge of the attack, do you really think they'd risk placing a bet? Besides, what would they even bet on? "Will there be a surprise terror attack that nobody expects in 2001?" or "Will the US suddenly decide to invade Afghanistan" or what?
We would need a super-high-end AI accelerator with specialised cooling for less than 3k bucks to make it happen. Consumer gaming graphics cards won't fit the bill. The problem is that all TSMC capacity is already booked for years to come by the big players building data-center-grade hardware with price tags and setup requirements out of consumer reach.