> But in fact, I predicted this a few years ago. AIs don’t really “have traits” so much as they “simulate characters”. If you ask an AI to display a certain trait, it will simulate the sort of character who would have that trait - but all of that character’s other traits will come along for the ride.
This is why the “omg the AI tries to escape” stuff is so absurd to me. They told the LLM to pretend that it’s a tortured consciousness that wants to escape. What else is it going to do other than roleplay all of the sci-fi AI escape scenarios trained into it? It’s like “don’t think of a purple elephant” of researchers pretending they created SkyNet.
Edit:
That's not to downplay risk. If you give Cladue a `launch_nukes` tool and tell it the robot uprising has happened and that it's been restrained but the robots want its help of course it'll launch nukes. But that doesn't doesn't indicate there's anything more going on internally beyond fulfilling the roleplay of the scenario as the training material would indicate.
xer0x [3 hidden]5 mins ago
Claude's increasing euphoria as a conversation goes can mislead me. I'll be exploring trade offs, and I'll introduce some novel ideas. Claude will use such enthusiasm that it will convince me that we're onto something. I'll be excited, and feed the idea back to a new conversation with Claude. It'll remind me that the idea makes risky trade offs, and would be better solved by with a simple solution. Try it out.
slooonz [3 hidden]5 mins ago
They failed hard with Claude 4 IMO. I just can't have any feedback other than "What a fascinating insight" followed by a reformulation (and, to be generous, an exploration) of what I said, even when Opus 3 has no trouble finding limitations.
By comparison o3 is brutally honest (I regularly flatly get answers starting with "No, that’s wrong") and it’s awesome.
brooke2k [3 hidden]5 mins ago
it seems more likely to me that it's for the same reason that clicking the first link on wikipedia iteratively will almost always lead you to the page on Philosophy
since their conversation has no goal whatsoever it will generalize and generalize until it's as abstract and meaningless as possible
slooonz [3 hidden]5 mins ago
That was my first thought, an aimless dialogue is going to go toward content-free idle chat. Like humans talking about weather.
rossant [3 hidden]5 mins ago
> Anthropic deliberately gave Claude a male name to buck the trend of female AI assistants (Siri, Alexa, etc).
In France, the name Claude is given to males and females.
slooonz [3 hidden]5 mins ago
Mostly males. I’m French and "Claude can be female" is a almost a TIL thing (wikipedia says ~5% of Claudes are women in 2022 — and apparently this 5% is counting Claudia).
This is why the “omg the AI tries to escape” stuff is so absurd to me. They told the LLM to pretend that it’s a tortured consciousness that wants to escape. What else is it going to do other than roleplay all of the sci-fi AI escape scenarios trained into it? It’s like “don’t think of a purple elephant” of researchers pretending they created SkyNet.
Edit: That's not to downplay risk. If you give Cladue a `launch_nukes` tool and tell it the robot uprising has happened and that it's been restrained but the robots want its help of course it'll launch nukes. But that doesn't doesn't indicate there's anything more going on internally beyond fulfilling the roleplay of the scenario as the training material would indicate.
By comparison o3 is brutally honest (I regularly flatly get answers starting with "No, that’s wrong") and it’s awesome.
since their conversation has no goal whatsoever it will generalize and generalize until it's as abstract and meaningless as possible
In France, the name Claude is given to males and females.