Hacker News

Orchestrate teams of Claude Code sessions

249 points by davidbarker - 115 comments

ottah [3 hidden]5 mins ago

I absolutely cannot trust Claude Code to independently work on large tasks. Maybe other people work on software that's not significantly complex, but for me to maintain code quality I need to guide more of the design process. Teams of agents just sounds like adding a lot more review and refactoring that can be avoided by going slower and thinking carefully about the problem.

nickstinemates [3 hidden]5 mins ago

You write a generic architecture document on how you want your code base to be organized, when to use pattern x vs pattern y, examples of what that looks like in your code base, and you encode this as a skill.

Then, in your prompt you tell it the task you want, then you say, supervise the implementation with a sub agent that follows the architecture skill. Evaluate any proposed changes.

There are people who maximize this, and this is how you get things like teams. You make agents for planning, design, qa, product, engineering, review, release management, etc. and you get them to operate and coordinate to produce an outcome.

That's what this is supposed to be, encoded as a feature instead of a best practice.

satellite2 [3 hidden]5 mins ago

Aren't you just moving the problem a little bit further? If you can't trust it will implement carefully specified features, why would you believe it would properly review those?

frde_me [3 hidden]5 mins ago

It's hard to explain, but I've found LLMs to be significantly better in the "review" stage than the implementation stage.

So the LLM will do something and not catch at all that it did it badly. But the same LLM, asked to review against the same starting requirement, will catch the problem almost always.

The missing thing in these tools is that automatic feedback loop between the two LLMs: one in review mode, one in implementation mode.
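
For what it's worth, that feedback loop can be sketched in a few lines. This is only an illustration: `review_loop`, `fake_implement`, and `fake_review` are hypothetical names, and the two stand-in functions would be replaced by real model calls (one prompted to implement, one prompted to review against the requirement).

```python
from typing import Callable, Optional


def review_loop(
    implement: Callable[[str, Optional[str]], str],
    review: Callable[[str, str], Optional[str]],
    requirement: str,
    max_rounds: int = 3,
) -> tuple[str, bool]:
    """Alternate implementation and review until the reviewer has no complaints."""
    feedback: Optional[str] = None
    draft = ""
    for _ in range(max_rounds):
        draft = implement(requirement, feedback)
        feedback = review(requirement, draft)
        if feedback is None:  # the reviewer found nothing to fix
            return draft, True
    return draft, False  # gave up after max_rounds; a human should look


# Hypothetical stand-ins for the two model calls.
def fake_implement(requirement: str, feedback: Optional[str]) -> str:
    # The first attempt is wrong; the attempt made after feedback is right.
    return "def add(a, b): return a + b" if feedback else "def add(a, b): return a - b"


def fake_review(requirement: str, draft: str) -> Optional[str]:
    return None if "a + b" in draft else "add() subtracts instead of adding"
```

Capping the rounds matters: without `max_rounds`, two models that keep disagreeing would burn tokens indefinitely.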

tclancy [3 hidden]5 mins ago

How does this not use up tokens incredibly fast though? I have a Pro subscription and bang up against the limits pretty regularly.

doctoboggan [3 hidden]5 mins ago

It _does_ use up tokens incredibly fast, which is probably why Anthropic is developing this feature. This is mostly for corporations using the API, not individuals on a plan.

digdugdirk [3 hidden]5 mins ago

I'd love to see a breakdown of the token consumption of inaccurate/errored/unused task branches for claude code and codex. It seems like a great revenue source for the model providers.

shafyy [3 hidden]5 mins ago

Yeah, that's what I was thinking. They do have an incentive to not get everything right on the first try, as long as they don't overdo it... I also feel like they try to get more token usage by asking unnecessary follow-up questions that the user may say yes to, etc.

andyferris [3 hidden]5 mins ago

It does use tokens faster, yes.

aqme28 [3 hidden]5 mins ago

I agree, but I've found that making an "adversarial" model within claude helps with the quality a lot. One agent makes the change, the other picks holes in it, and cycle. In the end, I'm left with less to review.

This sounds more like an automation of that idea than just N-times the work.

Keyframe [3 hidden]5 mins ago

Glad I'm not the only one. I do the same, but I tend to have gemini be the one that critiques.

diego898 [3 hidden]5 mins ago

Do you do this manually? Or some abstraction above that? skills, some light orchestration, etc?

aqme28 [3 hidden]5 mins ago

I just tell it to do so, but you could even add that as a requirement to CLAUDE.md

turtlebits [3 hidden]5 mins ago

Humans can't handle large tasks either, which is why you break them into manageable chunks.

Just ask claude to write a plan and review/edit it yourself. Add success criteria/tests for better results.

stpedgwdgfhgdd [3 hidden]5 mins ago

Exactly, one out of three or four prompts requires tuning, nudging or just stopping it. However, it takes seniority to see where it goes astray. I suspect that lots of folks don't even notice that CC is off. It works, it passes the tests, so it is good.

nprz [3 hidden]5 mins ago

There is research[0] currently being done on how to divide tasks among LLMs and combine their answers. This approach allows LLMs to reach outcomes (solving a problem that requires 1 million steps) which would be impossible otherwise.

[0]https://arxiv.org/abs/2511.09030

woah [3 hidden]5 mins ago

All they did was prompt an LLM over and over again to execute one iteration of a Tower of Hanoi algorithm. Literally just using it as a glorified scripting language:

```

Rules:

- Only one disk can be moved at a time.

- Only the top disk from any stack can be moved.

- A larger disk may not be placed on top of a smaller disk.

For all moves, follow the standard Tower of Hanoi procedure: If the previous move did not move disk 1, move disk 1 clockwise one peg (0 -> 1 -> 2 -> 0).

If the previous move did move disk 1, make the only legal move that does not involve moving disk1.

Use these clear steps to find the next move given the previous move and current state.

Previous move: {previous_move} Current State: {current_state} Based on the previous move and current state, find the single next move that follows the procedure and the resulting next state.

```

This is buried down in the appendix while the main paper is full of agentic swarms this and millions of agents that and plenty of fancy math symbols and graphs. Maybe there is more to it, but the fact that they decided to publish with such a trivial task, which could be much more easily accomplished by having an LLM write a simple Python script, is concerning.
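
To make the "simple Python script" point concrete, here is a minimal sketch of the procedure quoted in the prompt above: alternate between moving disk 1 clockwise and making the only legal move that does not involve disk 1. The function name and the move representation are my own choices, not taken from the paper.

```python
def solve_hanoi(num_disks: int) -> tuple[list[tuple[int, int, int]], list[list[int]]]:
    """Run the iterative procedure; return (moves, final pegs).

    Each move is (disk, source_peg, destination_peg). The top of a peg is the
    end of its list.
    """
    pegs = [list(range(num_disks, 0, -1)), [], []]
    moves: list[tuple[int, int, int]] = []
    moved_disk_one_last = False
    # Stop once the whole tower sits on a peg other than the starting one.
    while len(pegs[1]) != num_disks and len(pegs[2]) != num_disks:
        if not moved_disk_one_last:
            # Disk 1 is the smallest, so it is always on top of its peg.
            src = next(i for i, peg in enumerate(pegs) if peg and peg[-1] == 1)
            dst = (src + 1) % 3  # clockwise: 0 -> 1 -> 2 -> 0
        else:
            # The two pegs that do not have disk 1 on top.
            a, b = [i for i, peg in enumerate(pegs) if not (peg and peg[-1] == 1)]
            # Only one direction is legal: the smaller top disk moves.
            if not pegs[a] or (pegs[b] and pegs[b][-1] < pegs[a][-1]):
                src, dst = b, a
            else:
                src, dst = a, b
        disk = pegs[src].pop()
        pegs[dst].append(disk)
        moves.append((disk, src, dst))
        moved_disk_one_last = disk == 1
    return moves, pegs


if __name__ == "__main__":
    moves, pegs = solve_hanoi(20)
    print(len(moves), "moves")  # about a million steps, no LLM calls
```

Which peg the tower ends on depends on whether the number of disks is odd or even, since disk 1 always travels in the same direction.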

Spoom [3 hidden]5 mins ago

Good lord, I can only imagine the wasted electricity.

ottah [3 hidden]5 mins ago

No offense to the academic profession, but they're not a good source of advice for best practices in commercial software development. They don't have the experience or the knowledge sufficient to understand my workplace and tasks. Their skill set and job is orthogonal to the corporate world.

nprz [3 hidden]5 mins ago

Yes, the problem solved in the paper (Tower of Hanoi) is far more easily defined than 99% of actual problems you would find in commercial software development. Still proof of "theoretically possible" and seems like an interesting area of research.

findjashua [3 hidden]5 mins ago

you need a reviewer agent for every step of the process - review the plan generated by the planner, the update made by the task worker subagent, and a final reviewer once all tasks are done.

this does eat up tokens _very_ quickly though :(

BonoboIO [3 hidden]5 mins ago

You definitely have to create some sort of PLAN.md and PROGRESS.md via a command and an implement command that delegates work. That is the only way that I can get bigger things done no matter how "good" their task feature is.

You run out of context so quickly, and if you don't have some kind of persistent guidance, things go south.
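
A rough sketch of the bookkeeping side of this, assuming PLAN.md is a plain Markdown checklist (`- [ ]` / `- [x]`). That format and the helper names `next_task` and `mark_done` are assumptions for illustration, not something Claude Code prescribes.

```python
import re
from typing import Optional

# Matches Markdown checklist items such as "- [ ] add parser" or "- [x] set up repo".
TASK_RE = re.compile(r"^- \[( |x)\] (.+)$")


def next_task(plan_text: str) -> Optional[str]:
    """Return the first unchecked task in the plan, or None if all are done."""
    for line in plan_text.splitlines():
        match = TASK_RE.match(line.strip())
        if match and match.group(1) == " ":
            return match.group(2)
    return None


def mark_done(plan_text: str, task: str) -> str:
    """Tick off the first unchecked occurrence of the given task."""
    return plan_text.replace(f"- [ ] {task}", f"- [x] {task}", 1)
```

The point of keeping the state in a file is that a fresh session with an empty context can pick up exactly where the previous one stopped.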

ottah [3 hidden]5 mins ago

It's not sufficient, especially if I am not learning about the problem by being part of the implementation process. The models are still very weak reasoners, writing code faster doesn't accelerate my understanding of the code the model wrote. Even with clear specs I am constantly fighting with it duplicating methods, writing ineffective tests, or implementing unnecessarily complex solutions. AI just isn't a better engineer than me, and that makes it a weak development partner.

vonneumannstan [3 hidden]5 mins ago

>AI just isn't a better engineer than me, and that makes it a weak development partner.

This would also be true of Junior Engineers. Do you find them impossible to work with as well?

koakuma-chan [3 hidden]5 mins ago

I tried doing that and it didn't work. It still adds "fallbacks" that just hide errors or the fact that there is no actual implementation and "In a real app, we would do X, just return null for now"

pronik [3 hidden]5 mins ago

To the folks comparing this to GasTown: keep in mind that Steve Yegge explicitly pitched agent orchestrators to Anthropic, among others, months ago:

> I went to senior folks at companies like Temporal and Anthropic, telling them they should build an agent orchestrator, that Claude Code is just a building block, and it’s going to be all about AI workflows and “Kubernetes for agents”. I went up onstage at multiple events and described my vision for the orchestrator. I went everywhere, to everyone. (from "Welcome to Gas Town" https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...)

That Anthropic releases Agent Teams now (as rumored a couple of weeks back), after they've already adopted a tiny bit of beads in the form of Tasks, means that either they've been building them already back when Steve pitched orchestrators or they've decided that he's been right and it's time to scale the agents. Or they've arrived at the same conclusions independently -- it won't matter in the larger scale of things. I think Steve greatly appreciates it existing; if anything, this is a validation of his vision. We'll probably be herding polecats in a couple of months officially.

mohsen1 [3 hidden]5 mins ago

It's not like he was the only one who came up with this idea. I built something like that without knowing about GasTown or Beads. It's just an obvious next step.

https://github.com/mohsen1/claude-code-orchestrator

bonesss [3 hidden]5 mins ago

Compare both approaches to mature actor frameworks and they don’t seem to be breaking much ice. These kinds of supervisor trees and hierarchies aren’t new for actor based systems and they’re obvious applications of LLM agents working in concert.

The fact that Anthropic and OpenAI have been going on this long without such orchestration, considering the unavoidable issues of context windows and unreliable self-validation, without matching the basic system maturity you get from a default Akka installation shows us that these leading LLM providers (with more money, tokens, deals, access, and better employees than any of us), are learning in real time. Big chunks of the next gen hype machine wunder-agents are fully realizable with cron and basic actor based scripting. Deterministically, write once run forever, no subscription needed.

Kubernetes for agents is, speaking as a krappy kubernetes admin, not some leap, it’s how I’ve been wiring my local doom-coding agents together. I have a hypothesis that people at Google (who are pretty ok with kubernetes and maybe some LLM stuff), have been there for a minute too.

Good to see them building this out, excited to see whether LLM cluster failures multiply (like repeating bad photocopies), or nullify (“sorry Dave, but we’re not going to help build another Facebook, we’re not supposed to harm humanity and also PHP, so… no.”).

ttoinou [3 hidden]5 mins ago

If it was so obvious and easy, why didn't we have this a year ago? Models were mature enough back then to make this work.

bcrosby95 [3 hidden]5 mins ago

The high level idea is obvious but doing it is not easy. "Maybe agents should work in teams like humans with different roles and responsibilities and be optimized for those" isn't exactly mind bending. I experimented with it too when LLM coding became a thing.

As usual, the hard part is the actual doing and producing a usable product.

CuriouslyC [3 hidden]5 mins ago

Orchestration definitely wasn't possible a year ago, the only tool that even produced decent results that far back was Aider, it wasn't fully agentic, and it didn't really shine until Gemini 2.5 03-25.

The truth is that people are doing experiments on most of this stuff, and a lot of them are even writing about it, but most of the time you don't see that writing (or the projects that get made) unless someone with an audience already (like Steve Yegge) makes it.

ttoinou [3 hidden]5 mins ago

Roo Code in VSCode was working fine a year ago, even back in November 2024 with Sonnet 3.5 or 3.7

lossolo [3 hidden]5 mins ago

Because gathering training data and doing post-training takes time. I agree with OP that this is the obvious next step given context length limitations. Humans work the same way in organizations, you have different people specializing in different things because everyone has a limited "context length".

ruined [3 hidden]5 mins ago

what mature actor frameworks do you recommend?

jghn [3 hidden]5 mins ago

They did mention Akka in their post, so I would assume that's one of them.

isoprophlex [3 hidden]5 mins ago

There seems to be a lot of convergent evolution happening in the space. Days before the gas town hype hit, I made a (less baroque, less manic) "agent team" setup: a shell script to kick off a ralph wiggum loop, and CLAUDE-MESSAGE-BUS.md for inter-ralph communication (Thread safety was hacked into this with a .claude.lock file).

The main claude instance is instructed to launch as many ralph loops as it wants, in screen sessions. It is told to sleep for a certain amount of time to periodically keep track of their progress.

It worked reasonably well, but I don't prefer this way of working... yet. Right now I can't write spec (or meta-spec) files quick enough to saturate the agent loops, and I can't QA their output well enough... mostly a me thing, i guess?
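
In case it helps anyone picture the message-bus part: a minimal sketch of appending to a shared Markdown file under a lock file. The file names come from the description above; the message format, the helper names, and the create-exclusively locking strategy are my assumptions, not necessarily how the original shell script does it.

```python
import os
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def file_lock(lock_path: Path, timeout: float = 10.0, poll: float = 0.05):
    """Hold an exclusive lock by creating lock_path; wait if it already exists."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another process already holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)  # release the lock


def post_message(bus: Path, lock: Path, sender: str, text: str) -> None:
    """Append one message line to the shared bus file while holding the lock."""
    with file_lock(lock):
        with bus.open("a", encoding="utf-8") as f:
            f.write(f"- **{sender}**: {text}\n")
```

Usage would be something like `post_message(Path("CLAUDE-MESSAGE-BUS.md"), Path(".claude.lock"), "ralph-1", "claimed task A")`. One known weakness of this kind of lock: if a process dies while holding it, the stale lock file has to be removed by hand.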

pronik [3 hidden]5 mins ago

> Right now I can't write spec (or meta-spec) files quick enough to saturate the agent loops, and I can't QA their output well enough... mostly a me thing, i guess?

Same for me; however, the velocity of the whole field is astonishing and things change as we get used to them. We are not talking that much about hallucinating anymore. Just 4-5 months ago you couldn't trust coding agents with extracting functionality to a separate file without typos; now splitting Git commits works almost without a hitch. The more we get used to agents getting certain things right 100% of the time, the more we'll trust them. There are many, many things that I know I won't get right, but I'm absolutely sure my agent will. As soon as we start trusting e.g. a QA agent to do his job, our "project management" velocity will increase too.

Interestingly enough, the infamous "bowling score card" text on how XP works has demonstrated inherently agentic behaviour in more ways than one (they just didn't know what "extreme" was back then). You were supposed to implement a failing test and then implement just enough functionality for this test to not fail anymore, even if the intended functionality was broader -- which is exactly what agents reliably do in a loop. Also, you were supposed to be pair-driving a single machine, which has been incomprehensible to me for almost decades -- after all, every person has their own shortcuts, hardware, IDEs, window managers and what not. Turns out, all you need is a centralized server running a "team manager agent" and multiple developers talking to him to craft software fast (see tmux requirement in Gas Town).

CuriouslyC [3 hidden]5 mins ago

Not a you thing. Fancy orchestration is mostly a waste, validation is the bottleneck. You can do E2E tests and all sorts of analytic guardrails but you need to make sure the functionality matches intent rather than just being "functional" which is still a slow analog process.

segmondy [3 hidden]5 mins ago

This is nothing new, folks have been doing this since 2023. Lots of papers on arXiv and lots of code on GitHub with implementations of multi-agent systems.

... the "limit" was that agents were not as smart then, the context window was much smaller, and RLVR wasn't a thing, so agents were trained for just function calling, but not agent calling/coordination.

we have been doing it since then; the difference really is that the models have gotten smart enough to handle it.

aaaalone [3 hidden]5 mins ago

Honestly this is one of plenty of ideas I also have.

But this shows how much stuff there is still to do in the AI space.

khaliqgant [3 hidden]5 mins ago

Been waiting for this to drop and excited to test it out. We've been building something in this space - https://github.com/AgentWorkforce/relay, a real-time messaging layer that lets AI coding agents talk to each other across any CLI.

Assign roles to different models and have them coordinate: Claude as the lead, Codex on backend, Gemini on frontend, etc.

I wrote about my experiences with multi-agent orchestration here: https://x.com/khaliqgant/status/2019124627860050109?s=46

drbscl [3 hidden]5 mins ago

I just built a quick plugin to automatically add agents & skills then fire off a team with them, depending on your task: https://github.com/drbscl/dream-team

mcintyre1994 [3 hidden]5 mins ago

I’ve been mostly holding off on learning any of the tools that do this because it seemed so obvious that it’ll be built natively. Will definitely give this a go at some point!

giancarlostoro [3 hidden]5 mins ago

I was working on my own alternative to Beads... then I realized I could do exactly this with something similar to Beads. I'm planning on open sourcing it soon because I like what I have so far. I also made it so I can sync my tasks directly to my GitHub projects as well. I think it's more useful to have agent tasks eventually synced back up to real ticketing systems for historical reasons. Besides, it's better to have alternatives that are agent agnostic.

GoatOfAplomb [3 hidden]5 mins ago

I wonder if my $20/mo subscription will last 10 minutes.

mohsen1 [3 hidden]5 mins ago

At this point, if you're paying out of pocket you should use Kimi or GLM for it to make sense

tclancy [3 hidden]5 mins ago

Ah ok, same. I keep wondering about how this would ever accomplish anything.

simlevesque [3 hidden]5 mins ago

I've had good results with Haiku for certain tasks.

bhasi [3 hidden]5 mins ago

Seems similar to Gas Town

rafram [3 hidden]5 mins ago

I'm not anti-whimsy, but if your project goes too hard on the whimsy (and weird AI-generated animal art), it's kind of inevitable that someone else is going to create a whimsy-free clone, and their version will win because it's significantly less embarrassing to explain to normal people.

reissbaker [3 hidden]5 mins ago

Where are the polecats, though? What about the mayor's dog?

koakuma-chan [3 hidden]5 mins ago

I don't know what Gas Town is, but Claude Code Agent Teams is what I was doing for a while now. You use your main conversation only to spawn sub agents to plan and execute, allowing you to work for a long time without losing context or compacting, because all token-heavy work is done by sub agents in their own context. Claude Code Agent Teams just streamlines this workflow as far as I can tell.

nprz [3 hidden]5 mins ago

Gas Town --> https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...

nickorlow [3 hidden]5 mins ago

yeah, seems like a much simpler design though (i.e. there only seems to be one 'special/leader' agent, and the rest are all workers, vs gastown having something like 8 different roles: mayor, polecat, witnesses, etc).

Wonder how they compare?

greenfish6 [3 hidden]5 mins ago

i would have to imagine the gastown design isn't optimal though? why 8, and why do there need to be multiple hops of agent communication before two arbitrary agents communicate with each other, as opposed to a single shared filespace?

Ethee [3 hidden]5 mins ago

I've been using Gas Town a decent bit since it was released. I'd agree with you that its design is sub-optimal, but I believe that's more due to the way the actual agents/harnesses have been designed as opposed to optimal software design. The problem you often run into is that agents will sometimes hang thinking they need human input for a problem they are on, or they think they're at a natural stopping point. If you're trying to do fully orchestrated agentic coding where you don't look at the code at all (putting aside whether that's good or not for a second) then this is sub-optimal behavior, and so these extra roles have been designed to 'keep the machine going' as it were.

Often times if I'm only working on a single project or focus, then I'm not using most of those roles at all and it's as you describe, one agent divvying out tasks to other agents and compiling reports about them. But due to the fact that my velocity with this type of coding is now based on how fast I can tell that agent what I want, I'm often working on 3 or 4 projects simultaneously, and Gas Town provides the perfect orchestration framework for doing this.

nickorlow [3 hidden]5 mins ago

yegge's article does come off as complicated design for the sake of complication

temuze [3 hidden]5 mins ago

Yeah but worse

No polecats smh

ramesh31 [3 hidden]5 mins ago

>"Seems similar to Gas Town"

I love that we are in this world where the crazy mad scientists are out there showing the way that the rest of us will end up at, but ahead of time and a bit rough around the edges, because all of this is so new and unprecedented. Watching these wholly new abstractions be discovered and converged upon in real time is the most exciting thing I've seen in my career.

bredren [3 hidden]5 mins ago

The action is hot, no doubt. This reminds me of Spacewar! -> Galaxy Game / Computer Space.

nkmnz [3 hidden]5 mins ago

I’m looking for something like this, with opus in the driver seat, but the subagents should be using different LLMs, such as Gemini or Codex. Anyone know of such a tool? just-every/code almost does this, but the lead/orchestrator is always codex, which feels too slow compared to opus or Gemini.

eaf7e281 [3 hidden]5 mins ago

These two basically do what you want, let Claude be the manager and Codex/Gemini be the worker. Many say that Coder-Codex-Gemini is easier to understand than CCG-Workflow, which has too many commands to start with.

https://github.com/FredericMN/Coder-Codex-Gemini https://github.com/fengshao1227/ccg-workflow

This one also seems promising, but I haven't tried it yet.

https://github.com/bfly123/claude_code_bridge

All of them are made by Chinese devs. I know some people are hesitant when they see Chinese products, so I'll address that first. But I have tried all of them, and they have all been great.

khaliqgant [3 hidden]5 mins ago

You can accomplish this with https://github.com/AgentWorkforce/relay and make the Lead/Orchestrator any harness you want. At the core agent-relay is agent to agent communication but it unlocks quite a few multi agent orchestration paradigms. I wrote about some learnings here as well https://x.com/khaliqgant/status/2019124627860050109?s=46

nikcub [3 hidden]5 mins ago

I use opus for coding and codex for reviews. I trigger the reviews in each work task with a review skill that calls out to codex[0]

I don't need anything more complicated than that and it works fine - also run greptile[1] on PR's

[0] https://github.com/nc9/skills/tree/main/review

[1] https://www.greptile.com/

fosterfriends [3 hidden]5 mins ago

I think this is where future cursor features will be great - to coordinate across many different model providers depending on the sub-jobs to be done

nkmnz [3 hidden]5 mins ago

What I want is something else: I want them to work in parallel on the same problem, and the orchestrator to then evaluate and consolidate their responses. I’m currently doing this manually, but it’s tedious.
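
Roughly the shape of what I mean, as a sketch: fan the same problem out to several workers in parallel, then let an orchestrator step consolidate the answers. Everything here is hypothetical. The workers are plain functions standing in for calls to different models, and majority vote is just the simplest possible consolidation; a real orchestrator would more likely ask a model to judge and merge.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Worker = Callable[[str], str]


def fan_out(problem: str, workers: dict[str, Worker]) -> dict[str, str]:
    """Send the same problem to every worker in parallel and collect the answers."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = {name: pool.submit(fn, problem) for name, fn in workers.items()}
        return {name: future.result() for name, future in futures.items()}


def majority_vote(answers: dict[str, str]) -> str:
    """Simplest consolidation: pick the answer most workers agree on."""
    return Counter(answers.values()).most_common(1)[0][0]


# Hypothetical stand-ins for three different models.
workers: dict[str, Worker] = {
    "model_a": lambda problem: "use a queue",
    "model_b": lambda problem: "use a queue",
    "model_c": lambda problem: "use a stack",
}

answers = fan_out("how should jobs be scheduled?", workers)
decision = majority_vote(answers)
```

Threads are enough here because the real work would be waiting on network calls rather than computing locally.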

sathish316 [3 hidden]5 mins ago

You can run an ensemble of LLMs (Opus, Gemini, Codex) in Claude Code Router via OpenRouter or any Agent CLI that supports Subagents and not tied to a single LLM like Opencode. I have an example of this in Pied-Piper, a subagent orchestrator that runs in Claude Code or ClaudeCodeRouter and uses distinct model/roles for each Subagent:

1. GPT-5.2 Codex Max for planning

2. Opus 4.5 for implementation

3. Gemini for reviews

It’s easy to swap models or change responsibilities. Doc and steps here: https://github.com/sathish316/pied-piper/blob/main/docs/play...

knes [3 hidden]5 mins ago

At Augment we've been working on this: multi-agent orchestration, spec-driven, different models for different tasks, etc.

https://www.augmentcode.com/product/intent

can use the code AUGGIE to skip the queue. Bring your own agent (powered by codex, CC, etc) coming to it next week.

Sol- [3 hidden]5 mins ago

With stuff like this, might be that all the infra build-out is insufficient. Inference demand will go up like crazy.

RGamma [3 hidden]5 mins ago

Unlocking the next order of magnitude of software inefficiency!

Though I do hope the generated code will end up being better than what we have right now. It mustn't get much worse. Can't afford all that RAM.

Sol- [3 hidden]5 mins ago

Dunno, it's probably less energy efficient than a human brain, but being able to turn electricity into intelligence is pretty amazing. RAM and power generation are engineering problems to be solved for civilization to benefit from this.

kylehotchkiss [3 hidden]5 mins ago

It'd be nice if CC could figure out all the required permissions upfront and then let you queue the job to run overnight

Der_Einzige [3 hidden]5 mins ago

Anyone paying attention has known that demand for all types of compute that can run LLMs (i.e. GPUs, TPUs, hell even CPUs) was about to blow up, and will remain extremely large for years to come.

It's just HN that's full of "I hate AI" or wrong contrarian types who refuse to acknowledge this. They will fail to reap what they didn't sow and will starve in this brave new world.

sciencejerk [3 hidden]5 mins ago

Agreed, agent scaling and orchestration indicates that demand for compute is going to blow up, if it hasn't already. The rationale for building all those datacenters they can't build fast enough is finally making sense.

emp17344 [3 hidden]5 mins ago

This reads like a weird cult-ish revenge fantasy.

RGamma [3 hidden]5 mins ago

And what about you? Show your "I used AI today" badge, right now!

ffffuuuuuccck [3 hidden]5 mins ago

[flagged]

aaaalone [3 hidden]5 mins ago

If AI progresses slowly enough, we will end up in a society where high unemployment numbers are the norm and we are stuck in capitalism.

And if I think about one 'senior' in my team, I would already prefer an expensive AI subscription over that one person.

Der_Einzige [3 hidden]5 mins ago

[flagged]

sciencejerk [3 hidden]5 mins ago

Blue collar work won't be safe for long. Just longer.

emp17344 [3 hidden]5 mins ago

What the fuck is wrong with you? This guy is either a troll or legitimately mentally ill.

mrkeen [3 hidden]5 mins ago

Oh yeah I mean if you're a webdev and you haven't built several data centres already you're basically asking to be homeless.

asdev [3 hidden]5 mins ago

I personally have no use for this type of workflow. I like parallel claude code instances in worktrees but nothing beyond that

ndesaulniers [3 hidden]5 mins ago

Subagents are out, put it all on agent teams!

morleytj [3 hidden]5 mins ago

Gas Town decimated by Claude bomb from orbit

greenfish6 [3 hidden]5 mins ago

something i really like from trying it out over the last 10 minutes is that the main agent will continue talking to you while other agents are working, so you don't have to queue a message

taikahessu [3 hidden]5 mins ago

Clean up the team

Retr0id [3 hidden]5 mins ago

Claude Town

greenfish6 [3 hidden]5 mins ago

Excited to try this out. I've seen a lot of working systems on my own computer that share files to talk between different Claude Code agents and I think this could work similarly to that.

(i thought gas town was satire? people in comments here seem to be saying that gas town also had multi-agent file sharing for work tracking)

avereveard [3 hidden]5 mins ago

"finish Claude tokens quota in 3 minutes, largely over delegation and result messages instead of code writing"

IhateAI [3 hidden]5 mins ago

Any self respecting engineer should recognize that these tools and models only serve to lower the value of your labor. They aren't there to empower you, they aren't going to enable you to join the ruling class with some vibe-rolled slop SaaS.

Using these things will fry your brain's ability to think through hard solutions. It will give you a disease we haven't even named yet. Your brain will atrophy. Do you want your competency to be correlated 1:1 to the quality and quantity of tokens you can afford (or be loaned!!)?

Their main purpose is to convince C-suite suits that they don't need you, or that they should be justified in paying you less. This will of course backfire on them, but in the meantime, why give them the training data, why give them the revenue??

I'd bet anything these new models / agentic-tools are designed to optimize for token consumption. They need the revenue BADLY. These companies are valued at 200 X Revenue.. Google IPO'd at 10-11 x lmfao . Wtf are we even doing? Can't wait to watch it crash and burn :) Soon!

M4R5H4LL [3 hidden]5 mins ago

From an economic standpoint this is basically machines doing work humans used to do. We’ve already gone through this many times. We built machines that can make stuff orders of magnitude faster than humans, and nobody really argues we should preserve obsolete tools and techniques as a valued human craft. Obviously automation messes with jobs and identity for some people, but historically a large chunk of human labor just gets automated as the tech gets better. So I feel that arguing about whether automation is good or bad in the abstract is a bit beside the point. The more interesting question imho is how people and companies adapt to it, because it’s probably going to happen either way.

IhateAI_2 [3 hidden]5 mins ago

I had to create a new account, because HN is protecting their investments and basically making it impossible to post for anyone who is critical of LLMs (said I was crawling, I'm on a dedicated proxy that definitely hasn't ever crawled HN lol).

Automation can be good overall for society, but you also can't ignore the fact that basically all automation has decreased the value of the labor it replaced or subsidized.

This automation isn't necessarily adding value to society. I don't see any software being built that's increasing the quality of people's life, I don't see research being accelerated. There is no economic data to support this either. The economic gains are only reflected in the values of companies who are selling tokens, or have been able to decrease their employee-counts with token allowances.

All I see is people sharing CRUD apps on twitter, 50 clones of the same SaaS, people constantly complaining about how their favorite software/OS has more bugs, the cost of hardware and electricity going up, and people literally going into psychosis. (I have a list of 70+ people on twitter that I've been adding to who are literally manic and borderline insane because of these tools.) I can see LLMs being genuinely useful to society, like helping the blind and disabled in real time, but no one is doing that! It doesn't make money; automation is for the capital-owning class, not for the working class.

But hey, at least your favorite LLM shill from that podcast you loved can afford the $20,000/night resort this summer...

I'd be more okay with these mostly useless automation tools if the models were open source and didn't require $500k to run locally, but until then they basically only serve to make existing billionaires pad unnecessary zeros onto their net worth, and help prevent anyone from catching up with them.

I recommend people read this essay by Thomas Pynchon, actually read it, don't judge it by the title: https://www.nytimes.com/1984/10/28/books/is-it-ok-to-be-a-lu...

tjr [3 hidden]5 mins ago

People often compare working with AI agents to being something like a project manager.

I've been a project manager for years. I still work on some code myself, but most of it is done by the rest of the team.

On one hand, I have more bandwidth to think about how the overall application is serving the users, how the various pieces of the application fit together, overall consistency, etc. I think this is a useful role.

On the other hand, I definitely have felt mental atrophy from not working in the code. I still think; I still do things and write things and make decisions. But I feel mentally out of shape; I lack a certain sharpness that I perceived when I was more directly in tune with the code.

And what I'm talking about is all orthogonal to AI. This is just me as a project manager with other humans on the project.

I think there is truth to, well, operate at a higher level! Be more systems-minded, architecture-minded, etc. I think that's true. And there are surely interesting new problems to solve if we can work not on the level of writing programs, but wielding tools that write programs for us.

But I think there's also truth to the risk of losing something by giving up coding. Whether what might be lost is important to you or not is your own decision, but I think the risk is real.

sathish316 [3 hidden]5 mins ago

I do think there’s a real risk of Brain Atrophy when you rely on AI coding tools for everything and while learning something new. About a year ago, I dealt with this problem by using Neovim and having shortcuts like below to easily toggle GitHub Copilot on/off. Now that AI is baked into almost every part of the toolchain in VSCode, Cursor, Claude Code, IntelliJ, I don't know how the newer engineers will learn without AI assistance.

IhateAI [3 hidden]5 mins ago

I think in-line autocomplete is likely not that dangerous if it's used responsibly in this manner; it's the large agentic tools that are problematic for your brain imo. But in-line autocompletes aren't going to raise billions of dollars and aren't flashy.

xpct [3 hidden]5 mins ago

I'd say autocomplete introduces a certain level of fuzziness into the code we work with, though to a lower degree. I used autocomplete for over a year, and initially it did feel like a productivity boost, yet when I later stopped using it, it never felt like my productivity decreased. I stopped because something about losing the explicit intent of my code feels uncomfortable to me.

majormajor [3 hidden]5 mins ago

It's very difficult to operate effectively at a higher level for a continued period of time without periodically getting back into the lower levels to try new things and learn new approaches or tools.

That doesn't even have to be writing a ton of code, but reading the code, getting intimately familiar with the metrics, querying the logs, etc.

IhateAI [3 hidden]5 mins ago

I definitely think what you're losing is extremely important, and can't be compensated for with LLMs once it's gone.

Back when automatic piano players came out, if all the world's best piano players had stopped playing and mostly just composed/wrote music instead, would the quality of the music have increased or decreased? I think the latter.

aaaalone [3 hidden]5 mins ago

When I use Google maps, I learn faster.

And I haven't had to solve really hard problems for ages.

Some people will have problems, some will not.

Future will tell.

theappsecguy [3 hidden]5 mins ago

The crash and burn can't come soon enough.

ottah [3 hidden]5 mins ago

Honestly my job is to ensure code quality and to protect the customer. I love working with Claude Code, it makes my life easier, but in no way would a team of agents improve code quality or speed up development. I would spend far too much time reviewing and fixing laziness and bad design decisions.

When you hear execs talking about AI, it's like listening to someone talk about how they bought some magic beans that will solve all their problems. IMO the only thing we have managed to do is spend a lot more money on accelerated compute.

fooker [3 hidden]5 mins ago

It would be tragically ironic if this post is AI generated.

wantlotsofcurry [3 hidden]5 mins ago

I agree on all parts. I do not understand why anyone in the software industry would bend over backwards to show their work is worth less now.

ramesh31 [3 hidden]5 mins ago

>I'd bet anything these new models / agentic-tools are designed to optimize for token consumption.

You would think, but Claude Code has gotten incredibly more efficient over time. They are doing so much dogfooding with these things at this point that it makes more sense to optimize.

spelunker [3 hidden]5 mins ago

How Butlerian of you.

dangoodmanUT [3 hidden]5 mins ago

username checks out

markab21 [3 hidden]5 mins ago

Shaking fist at clouds!!

IhateAI [3 hidden]5 mins ago

Wow, a bunch of NFT people used to say the same thing.

lmao, please explain to me why these companies should be valued at 200x revenue.. They are providing autocomplete APIs.

How come Google's valuation hasn't increased 100-200x? They provide foundation models + a ton more services as well and are profitable. None of this makes sense, it's destined to fail.

OsrsNeedsf2P [3 hidden]5 mins ago

I like your name, it suggests you're here for a good debate.

Let me start by conceding on the company value front; they should not have such value. I will also concede that these models lower your value of labor and quality of craft.

But what they give in return is the ability to scale your engineering impact to new highs - Talented engineers know which implementation patterns work better, how to build debuggable and growable systems. While each file in the code may be "worse" (by whichever metric you choose), the final product has more scope and faster delivery. You can likewise choose to narrow the scope and increase quality, if that's your angle.

LLMs aren't a blanket improvement - They come with tradeoffs.

IhateAI_2 [3 hidden]5 mins ago

(I had to create a new account, because HN doesn't like LLM haters (don't mess with the bag ig))

the em dashes in your reply scare me, but I'll assume you're a real person lol.

I think your opinion is valid, but tell that to the C-suite that's laid off 400k tech workers in the last 16 months in the USA. These tools don't seem to be used to empower high quality engineering, only to naively increase the bottom line by decreasing the number of engineers, and increasing workloads on those remaining.

Full disclosure, I haven't been laid off ever, but I see what's happening. I think when the trade-off is that your labor is worth a fraction of what it used to be and you're also expected to produce more, then that trade-off isn't worth it.

It would be a lot different if the signaling from business leaders was the reverse. If they believed these tools empowered labor's impact on a business, and planned on rewarding that, it would be a different story. That's not what we are seeing, and they are very open about their plans for the future of our profession.

Automation can be good overall for society, but you also can't ignore the fact that basically all automation has decreased the value of the labor it replaced or subsidized.

But hey, at least your favorite AI evangelist from that podcast you loved can afford the $20,000/night resort this summer...

tock [3 hidden]5 mins ago

Google is valued at 4T. Up from 1.2T in 2022.

hareykrishna [3 hidden]5 mins ago

it's too late to hateAI!

cstrahan [3 hidden]5 mins ago

> Any self respecting engineer should recognize that these tools and models only serve to lower the value of your labor.

Depends on what the aim of your labor is. Is it typing on a keyboard, memorizing (or looking up) whether that function was verb_noun() or noun_verb(), etc? Then, yeah, these tools will lower your value. If your aim is to get things done, and generate value, then no, I don't think these tools will lower your value.

This isn't all that different from CNC machining. A CNC machinist can generate a whole lot more value than someone manually jogging X/Y/Z axes on an old manual mill. If you absolutely love spinning handwheels, then it sucks to be you. CNC definitely didn't lower the value of my brother's labor -- there's no way he'd be able to manually machine enough of his product (https://www.trtvault.com/) to support himself and his family.

> Using these things will fry your brain's ability to think through hard solutions.

CNC hasn't made machinists forget about basic principles, like when to use conventional vs climb milling, speeds and feeds, or whatever. Same thing with AI. Same thing with induction cooktops. Same thing with any tool. Lazy, incompetent people will do lazy, incompetent things with whatever they are given. Yes, an idiot with a power tool is dangerous, as that tool magnifies and accelerates the messes they were already destined to make. But that doesn't make power tools intrinsically bad.

> Do you want your competency to be correlated 1:1 to the quality and quantity of tokens you can afford (or be loaned!!)?

We are already dependent on electricity. If the power goes out, we work around that as best as we can. If you can't run your power tool, but you absolutely need to make progress on whatever it is you're working on, then you pick up a hand tool. If you're using AI and it stops working for whatever reason, you simply continue without it.

I really dislike this anti-AI rhetoric. Not because I want to advocate for AI, but because it distracts from the real issue: if your work is crap, that's on you. Blaming a category of tool as inherently bad (with guaranteed bad results) suggests that there are tools that are inherently good (with guaranteed good results). No. That's absolutely incorrect. It is people who fall on the spectrum of mediocrity-to-greatness, and the tools merely help or hinder them. If someone uses AI and generates a bunch of slop, the focus should be on that person's ineptitude and/or poor judgement.

We'd all be a lot better off if we held each other to higher standards, rather than complaining about tools as a way to signal superiority.

sciencejerk [3 hidden]5 mins ago

Your brother's livelihood is not safe from AI, nor is any other livelihood. A small slice of lucky, smart, well-placed, protected individuals will benefit from AI, and I presume many unlucky people with substantial disabilities or living in poverty will benefit as well. Technology seems to continue to improve the outcomes at the very top and very bottom, while sacrificing the biggest group in the middle. Many HN Software Engineers here immensely benefitted from Big Tech over the past 15 years -- they were a part of that lucky privileged group winning 300k+ USD salaries plus equity for a long time. AI has completely disrupted this space and drastically decreased the value of their work, and it largely did this by stealing open source code for training data. These Software Engineers are right to feel upset and threatened and oppose these AI tools, since they are their replacement. I believe that is why you see so much AI hate on HN.