
In my experience humans can make new things when they are some linear combination of existing things but I haven’t been able to get them to do something totally out of distribution yet from first principles[0].

[0]: https://slatestarcodex.com/2019/02/19/gpt-2-as-step-toward-g...


Give it a rest.

What's happening is that AI has become an identity-sorting mechanism faster than any technology in recent memory. Faster than social media, faster than smartphones. Within about two years, "what do you think about AI" became a tribal marker on par with political affiliation. And like political affiliation, the actual object-level question ("is this tool useful for this task") got completely swallowed by the identity question ("what kind of person uses/rejects this").

The blog author isn't really angry about the comment. He's angry because someone accidentally miscategorized him tribally. "Did you use AI?" heard through his filter means "you're one of them." Same reason vegans get mad when you assume they eat meat, or whatever. It's an identity boundary violation, not a practical dispute.

These comments aren't discussing the post. They're each doing a little ritual display of their own position in the sorting. "I miss real conversation" = I'm on the human side. The political rant = I'm on the progress side. The energy calculation = I'm on the rational-empiricist side.

The thing that's actually weird, the thing worth asking "what the fuck" about: this sorting happened before the technology matured enough for anyone to have a grounded opinion about its long-term effects. People picked teams based on vibes and aesthetics, and now they're backfilling justifications. Which means the discourse is almost completely decoupled from what the technology actually does or will do.


I appreciate and agree with your comment. The reasonable answer to "did you use AI" would be just "no". In the context of the story, the other person's intent is comparable to "did you run spell check?"

My personal nit/pet peeve: you are far more likely to meet a meat-eater who gets offended by the insinuation that they're a vegan. I have met exactly one "militant vegan" in real life, compared to dozens who go out of their way to avoid inconveniencing others. I'm talking about people who bring their own food to a party rather than asking for a vegan option.

In the 21st century, the militant vegan is more common as a hack comedian trope than as a real phenomenon.


Hear, hear. It was weird for the OP to make a call for depoliticisation, only to then introduce a totally unrelated bit of politics.

> the actual object-level question ("is this tool useful for this task")

That's not the only question worth asking though. It could be that the tool is useful, but has high negative externalities. In that case, the question "what kind of person uses/rejects this" is also worth considering. I think that if generative AI does have high negative externalities, then I'd like to be the kind of person that rejects it.


> Same reason vegans get mad when you assume they eat meat, or whatever

This so isn't important, but I don't know any vegan who would get mad if you assumed in passing that they ate meat. They'd only get annoyed if you then argued with them about it after they said something, like basically all humans do if you deliberately ignore what they've said to you.


> The blog author isn't really angry about the comment. He's angry because someone accidentally miscategorized him tribally.

I'm not so sure about that. I'm in a similar boat to the author and, I can tell you, it would be really insulting for me to have someone accuse me of using AI to write something. It's not because of any in-group / culture war nonsense, it's purely because:

a) I wouldn't—currently—resort to that behaviour, and I'd like to think people who know me recognise that

b) To have my work mistaken for the product of AI would be like being accused of not really being human—that's pretty insulting


> If Gemini 3 DT was better we would have falling prices of electricity and everything else at least

Man, I've seen some maintenance folks down on the field before working on them goalposts but I'm pretty sure this is the first time I saw aliens from another Universe literally teleport in, grab the goalposts, and teleport out.


I'm a screen reader user and CTO of an accessibility company. This change doesn't reduce noise for me. It removes functionality.

Sighted users lost convenience. I lost the ability to trust the tool. There is no "glancing" at terminal output with a screen reader. There is no "progressive disclosure." The text is either spoken to me or it doesn't exist.

When you collapse file paths into "Read 3 files," I have no way to know what the agent is doing with my codebase without switching to verbose mode, which then dumps subagent transcripts, thinking traces, and full file contents into my audio stream. A sighted user can visually skip past that. I listen to every line sequentially.

You've created a situation where my options are "no information" or "all information." The middle ground that existed before, inline file paths and search patterns, was the accessible one.

This is not a power user preference. This is a basic accessibility regression. The fix is what everyone in this thread has been asking for: a BASIC BLOODY config flag to show file paths and search patterns inline. Not verbose mode surgery. A boolean.
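To be concrete, something like this is the whole ask. The key name below is purely hypothetical -- it is NOT an existing Claude Code setting, just the shape of the request:

    // hypothetical sketch only: "showToolDetailsInline" does not exist today;
    // it's the boolean being requested -- keep paths/patterns inline by default
    {
      "showToolDetailsInline": true
    }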

Please just add the option.

And yes, I rewrote this with Claude to tone my anger and frustration down about 15 clicks from how I actually feel.


Try Codex instead. Much greener pastures overall

I do love my subagents and I wrote an entire Claude Code audio hook system for a11y, but this would still be rather compelling if Codex weren't also somewhat of an a11y nightmare. It does some weird thing with ... maybe terminal repaints or something else ... that ends up rereading the same text over and over. Claude Code does this similarly, but Codex ends up reading like ... all the weird symbols and other stuff? window decorations? and not just the text like CC does. They are both hellish but CC slightly? less so... until now.

Sorry for being off-topic, but isn't a11y a rather ironic term for accessibility? It uses a very uncommon type of abbreviation -- a numeronym -- and doesn't mean anything to the reader unless they look it up (or already know what it means).

Is it as bad with the Codex app, or VS Code plugin?

They are much more responsive on GitHub issues than Anthropic so you could also try reporting your issue there


For now until they are in the lead

Dyslexic and also a prolific screen reader user myself. +1 and thank you for mentioning something that often gets (ironically) overlooked

Hey -- we take accessibility seriously, and want Claude Code to work well for you. This is why we have repurposed verbose mode to do what you want, without the other verbose output. Please give it a try and let me know what you think.

It's well meaning, but I think this goes against something like the curb-cut effect. Not a perfect analogy, but verbosity is now something you have to opt into here: everyone benefits from being able to glance at what the agent is up to by default. Nobody greatly benefits from the agent being quiet by default.

If people find it too noisy, they can use the flag or toggle that makes everything quieter.

P.S. Serendipitously, I just finished my on-site at Anthropic today, hi :)


> we take accessibility seriously

Do you guys have a screen reader user on the dev team?

Is verbose mode the same as the old mode, where only file paths are spoken? Or does it have other text in it? I tried to articulate this, and may have failed: more text is usually bad for me. It must be consumed linearly. I need specific text.

Quality over quantity


"Is verbose mode the same as the old mode, where only file paths are spoken?" -- yes, this is exactly what the new verbose mode is.

And how do you get to the old verbose mode then...?

Hit ctrl+o

Wait so when the UI for Claude Code says “ctrl + o for verbose output” that isn’t verbose mode?

That is more verbose — under the hood, it’s now an enum (think: debug, warn, error logging)

Considering the ragefusion you're getting over the naming, maybe calling it something like --talkative would be less controversial? ;-)

ctrl + o isn't live - that's not what users want. What users want is the OPTION to choose what they see.

Casually avoiding the first question

The concern about climate is well placed. Ripple et al. lay out a serious case that we may be closer to tipping cascades than models predict, with the Greenland Ice Sheet potentially vulnerable to tipping below 2°C warming, well before 2050.

But "invest an equal share of the resources currently being pumped into AI into climate" misidentifies the bottleneck. Marine cloud brightening could produce meaningful planetary cooling for roughly $5 billion per year at scale (NAS estimate). That's like what? 1% of what was spent on AI infrastructure last year?

The money exists. What doesn't exist is the political coordination to spend it.

The goddamn Alameda city council shut down a University of Washington MCB field test in 2024 because nobody told them it was happening on their property. Go look it up.

This is the actual bottleneck: governance, coordination, and political will, not capital.

When someone says "we should invest resources in X instead of Y," it's worth asking who "we" is and what mechanism they're proposing. AI investment is private capital chasing returns. You can't redirect it to climate by wishing. The implicit model, that Society has a budget and we're choosing wrong, assumes a resource allocation authority that doesn't exist. If you want to argue for creating one, that's a real position, but it should be stated openly rather than hidden inside "it would be sensible."

Also ... "AI won't solve it; it only makes it worse" is doing a ton of work! The energy consumption concern has real merit. But materials science, grid optimization, and climate modeling are direct climate contributions happening now. Google has saved energy in its datacenters ... using AI!

Blanket dismissal of an entire domain of capability isn't seriousness, it's pattern matching. (Ironically, there's a phrase for systems that produce plausible-sounding output by matching patterns without engaging with underlying structure. We're told to be worried about them.)


> not capital

Capital, and by extension the system that centers the idea of Capital as a method for moving resources around, is at the very center of this.

Since Capital follows near-term incentive, if the "pollute the world" path has a greater near-term incentive, that's the path the market will follow. If a single member of the system goes for the long-term incentive (not cooking the earth), other near-term incentive chasers will eat their lunch and remove a player.

The system itself is a tight feedback loop searching for local maxima, and the local maximum is often the most destructive. Chasing those local maxima also generates the profit and capital that influence the political system.


What you've done here is called a fully-general counterargument. You should be suspicious of these!

If capital inevitably follows destructive local maxima and defectors get eaten, then no coordination problem has ever been solved, right?

But we banned CFCs! We got lead out of gasoline! The Montreal Protocol exists and worked.

What you're describing is the default behavior of uncoordinated markets, not a physical law. The entire history of regulation and international treaties consists of mechanisms that override local incentive gradients. Sometimes they fail. Sometimes they work.

"The system itself is a tight feedback loop" treats the system as fixed rather than something humans have repeatedly modified. The question is whether we'll add the right feedback loops fast enough, not whether adding them is metaphysically impossible.

My original point stands: the bottleneck on MCB isn't that capital won't fund it. It's that the Alameda city council didn't know a field test was happening on their waterfront and NIMBY ... people ... made noise. Governance failure, not capitalism failure.


> But we banned CFCs! We got lead out of gasoline! The Montreal Protocol exists and worked.

None of these were done via capitalism, they were done in opposition to it.

And I know you weren't claiming they were, but the problem is that all the power centers behind global capitalism have captured government (at least in the US) completely, and are doing everything in their power to strip existing regulations and make sure any new ones exist not in the name of the common good, but only to build moats for themselves.

It is great that we solved these problems in the past, but we are increasingly not doing that sort of thing at all anymore.


It's also worth distinguishing uncoordinated markets from ungoverned markets. Markets exhibit vast and sophisticated organic coordination without state prodding. I don't just mean to pick at this word "uncoordinated" but more deeply at the particular issue of near- or far-sightedness. Has it actually been established that organic economic coordination does worse at protecting "the future" than some particular alternatives?

The government is failing to control the problem because it got bought out by the capitalists who run the companies that continue to cause the damage. The law in the US explicitly allows this, though it's "decent" enough to hide it in a paper bag.

It's certainly a governance failure, but I'm not sure what the fix for it is, and I don't see how capitalism gets off scot-free.


People will have to vote for non-captured candidates (good luck finding them) or protest in large enough numbers that the system will change. Those people will also have to be critical thinkers to a degree that they can consciously push back against the wall of marketing and propaganda pumped out by those in power with money. And they will have to self-educate, since governments generally don't teach people these skills while they have them in school for 12 or more years. From my point of view the future looks pretty grim, but I'm certainly hoping to be surprised or corrected!

> Marine cloud brightening could produce meaningful planetary cooling for roughly $5 billion per year at scale (NAS estimate).

Eh. Cloud brightening is a temporary hack, stops working as soon as you stop actively doing it, and isn't an alternative to switching away from fossil fuels. It's probably worth doing to push back the "ice melts and releases more carbon" thing but let's not confuse it with the extent of what needs to be done. You can't actually solve the problem for $5B/year.

> AI investment is private capital chasing returns.

Getting private capital to work for you is a good way to solve the problem. The real problem is politics.

The EV tax credits and the subsidies oil companies get were costing about the same amount of money, but we only got rid of one of them. Nuclear should cost less than fossil fuels, but we're told that fission is scary and Deepwater Horizon is nothing but spilled milk so the one with the much better environmental record has to be asymmetrically regulated into uncompetitiveness.

If we actually wanted to solve it, we'd do the "carbon tax but 100% of the money gets sent back to the people as checks" thing. Then you're not screwing everyone, because on average the check and the tax cancel out, and corporations pay the tax too but only people get the check. Then everyone, but especially the heaviest users, would have the incentive to switch to alternative energy and more efficient vehicles etc., because everybody gets the same check but the people putting thousands of miles on non-hybrid panzers pay more in tax.

The "problem" is that it would actually work, which is highly objectionable to the oil industry and countries like Russia since it would cause their income to go away, hence politics.


Cooling the planet is neither a technical nor financial problem. The problem is that environmentalists want this to be a moral issue. They already decided on the solution. If the solution is not environmental communism with them in power, they will not have it.

>Cooling the planet is neither a technical nor financial problem

Yes it is. All solutions have trade-offs.


I'm working on a paper connecting articulatory phonology to soliton physics. Speech gestures survive coarticulatory overlap the same way solitons survive collision. The nonlinear dynamics already in the phonetics literature are structurally identical to soliton equations. Nobody noticed because these fields don't share conferences.

The article's easy/hard distinction is right but the ceiling for "hard" is too low. The actually hard thing AI enables isn't better timezone bug investigation LOL! It's working across disciplinary boundaries no single human can straddle.


Everything is amazing. Even better if you set a shortcut key (I use ctrl+shift+/) and it's just so fast. You can even query (I just recently learned this) like:

*.txt size:>1024kb
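A couple more filters in the same vein, if useful (from memory, so worth double-checking against the Everything search syntax docs):

    ext:pdf dm:today
    !*.tmp size:>100mb

(ext: filters by extension, dm: by date modified, and ! negates a term.)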


Ah, we have converted a technical problem into a social problem. Historically those are vastly easier to solve, right?

Spam filters exist. Why do we need to bring politics into it? Reminds me of the whole CoC mess a few years back.

Every time somebody talks about a new AI thing the lament here goes:

> BUT THINK OF THE JUNIORS!

How do you expect this system to treat juniors? How do your juniors ever gain experience committing to open source? Who vouches for them?

This is a permanent social structure for a transient technical problem.


> Ah, we have converted a technical problem into a social problem.

Surely you mean this the other way around?

Mitchell is trying to address a social problem with a technical solution.


Nope, I meant what I originally said.

The problem is technical: too many low-quality PRs hitting an endpoint. Vouch's solution is social: maintain trust graphs of humans.

But the PRs are increasingly from autonomous agents. Agents don't have reputations. They don't care about denounce lists. They make new accounts.

We solved unwanted automated input for email with technical tools (spam filters, DKIM, rate limiting), not by maintaining curated lists of Trusted Emailers. That's the correct solution category. Vouch is a social answer to a traffic-filtering problem.

This may solve a real problem today, but it's being built as permanent infrastructure, and permanent social gatekeeping outlasts the conditions that justified it.


"Juniors" (or anyone besides maintainers) do not fundamentally have a right to contribute to an open source project. Before this system they could submit a PR, but that doesn't mean anyone would look at it. Once you've internalized that reality, the rest flows from there.

> other side???

> We don’t have to look at assembly, because a compiler produces the same result every time.

This is technically true in the narrowest possible sense and practically misleading in almost every way that matters. Anyone who's had a bug that only manifests at -O2, or fought undefined behavior in C that two compilers handle differently, or watched MSVC and GCC produce meaningfully different codegen from identical source, or hit a Heisenbug that disappears when you add a printf ... the "deterministic compiler" is doing a LOT of work in that sentence that actual compilers don't deliver on.

Also what's with the "sides" and "camps?" ... why would you not keep your identity small here? Why define yourself as a {pro, anti} AI person so early? So weird!


You just described deterministic behavior. Bugs are also deterministic. You don’t get different bugs every time you compile the same code the same way. With LLMs you do.

Re: “other side” - I’m quoting the grandparent’s framing.


GCC is, I imagine, several orders of magnitude more deterministic than an LLM.

It’s not _more_ deterministic. It’s deterministic, period. The LLMs we use today are simply not.

Build systems may be deterministic in the narrow sense you use, but significant extra effort is required to make them reproducible.

Engineering in the broader sense often deals with managing the outputs of variable systems to get known good outcomes to acceptable tolerances.

Edit: added second paragraph


I'm not using a narrow sense. There is no elasticity here. See https://en.wikipedia.org/wiki/Deterministic_system

> significant extra effort is required to make them reproducible.

Zero extra effort is required. It is reproducible. The same input produces the same output. The "my machine" in "Works on my machine" is an example of input.

> Engineering in the broader sense often deals with managing the outputs of variable systems to get known good outcomes to acceptable tolerances.

You can have unreliable AIs building a thing, with some guidance and self-course-correction. What you can't have is outcomes also verified by unreliable AIs who may be prompt-injected to say "looks good". You can't do unreliable _everything_: planning, execution, verification.

If an AI decided to code an AI-bound implementation, then even tolerance verification could be completely out of whack. Your system could pass today and fail tomorrow. It's layers and layers of moving ground. You have to put the stake down somewhere. For software, I say it has to be code. Otherwise, AI shouldn't build software, it should replace it.

That said, you can build seemingly working things on moving ground, that bring value. It's a brave new world. We're yet to see if we're heading for net gain or net loss.


If we want to get really narrow, I'd say real determinism is possible only in abstract systems, to which you'd reply that it's just my ignorance of all the possible factors involved, and hence the incompleteness of the model. To which I'd point to the practical limitations involved with that. And for that reason, even though it is incorrect and I don't use it this way, I understand why some people use the quantifiers more/less with the term "deterministic", probably for lack of a better construct.

I don't think I'm being pedantic or narrow. Cosmic rays, power spikes, and falling cows can change the course of deterministic software. I'm saying that your "compiler" either has intentionally designed randomness (or "creativity") in it, or it doesn't. Not sure why we're acting like these are more or less deterministic. They are either deterministic or not inside normal operation of a computer.

To be clear: I'm not engaging with your main point about whether LLMs are usable in software engineering or not.

I'm specifically addressing your use of the concept of determinism.

An LLM is a set of matrix multiplies and function applications. The only potentially non-deterministic step is selecting the next token from the final output and that can be done deterministically.

By your strict use of the definition they absolutely can be deterministic.

But that is not actually interesting for the point at hand. The real point has to do with reproducibility, understandability and tolerances.

3blue1brown has a really nice set of videos on showing how the LLM machinery fits together.


> they absolutely can be deterministic.

They _can_ be deterministic, but they usually _aren't_.

That said, I just tried "make me a haiku" via Gemini 3 Flash with T=0 twice in different sessions, and both times it output the same haiku. It's possible that T=0 does enable a deterministic mode, and in that case perhaps we can treat it like a compiler.
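For anyone following along, here's a toy sketch of why greedy / T=0 token selection is deterministic while sampling isn't. The logits are made up and this is not any provider's actual serving stack (which can still be nondeterministic due to batching and floating-point ordering):

    # Toy logits for 3 candidate tokens -- purely illustrative, not a real model.
    import numpy as np

    logits = np.array([2.0, 1.0, 0.5])

    def greedy(logits):
        # argmax: same input -> same token, every run
        return int(np.argmax(logits))

    def sample(logits, temperature=1.0, rng=None):
        # softmax sampling: same input -> possibly different tokens each run
        rng = rng or np.random.default_rng()
        p = np.exp(logits / temperature)
        p /= p.sum()
        return int(rng.choice(len(logits), p=p))

    print([greedy(logits) for _ in range(3)])   # always [0, 0, 0]
    print([sample(logits) for _ in range(3)])   # varies, e.g. [0, 2, 0]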


> But anyway, it already costs half compared to last year

You could not have bought Claude Opus 4.5 at any price one year ago I'm quite certain. The things that were available cost half of what they did then, and there are new things available. These are both true.

I'm agreeing with you, to be clear.

There are two pieces I expect to continue: inference for existing models will continue to get cheaper. Models will continue to get better.

Three things, actually.

The "hitting a wall" / "plateau" people will continue to be loud and wrong. Just as they have been since 2018[0].

[0]: https://blog.irvingwb.com/blog/2018/09/a-critical-appraisal-...


As a user of LLMs since GPT-3, I saw noticeable stagnation in LLM utility after the release of GPT-4. But it seems the RLHF, tool calling, and UI have all come together in the last 12 months. I used to wonder what fools could be finding them so useful as to claim a 10x multiplier - even as a user myself. These days I'm feeling more and more efficiency gains with Claude Code.


That's the thing people are missing: the models plateaued a while ago, still making minor gains to this day, but not huge ones. The difference is that now we've had time to figure out the tooling. I think there's still a ton of ground to cover there, and maybe the models will improve given the extra time, but I think it's foolish to consider the people who predicted that completely wrong. There are also a lot of mathematical concerns that will cause problems in the near and distant future. Infinite progress is far from a given; we're already way behind where all the boosters thought we'd be by now.


I believe Sam Altman, perhaps the greatest grifter in today’s Silicon Valley, claimed that software engineering would be obsolete by the end of last year.


> The "hitting a wall" / "plateau" people will continue to be loud and wrong. Just as they have been since 2018[0].

Everybody who bet against Moore's Law was wrong ... until they weren't.

And AI is the reaction to Moore's Law having broken. Nobody gave one iota of a damn about trying to make programming easier until the chips couldn't double in speed anymore.


This is exactly backwards: Dennard scaling stopped. Moore’s Law has continued and it’s what made training and running inference on these models practical at interactive timescales.


You are technically correct. The best kind of correct.

However, most people don't know the difference between the proper Moore's Law scaling (the cost of a transistor halves every 2 years) which is still continuing (sort of) and the colloquial version (the speed of a transistor doubles every 2 years) which got broken when Dennard scaling ran out. To them, Moore's Law just broke.

Nevertheless, you are reinforcing my point. Nobody gave a damn about improving the "programming" side of things until the hardware side stopped speeding up.

And rather than try to apply some human brainpower to fix the "programming" side, they threw a hideous number of those free (except for the electricity--but we don't mention that--LOL) transistors at the wall to create a broken, buggy, unpredictable machine simulacrum of a "programmer".

(Side note: And to be fair, it looks like even the strong form of Moore's Law is finally slowing down, too)


If you can turn a few dollars of electricity per hour into a junior-level programmer who never gets bored, tired, or needs breaks, that fundamentally changes the economics of information technology.

And in fact, the agentic looped LLMs are executing much better than that today. They could stop advancing right now and still be revolutionary.


interesting post. i wonder if these people go back and introspect on how incorrect they have been? do they feel the need to address it?


No, people do not do that.

This is harmless when it comes to tech opinions but causes real damage in politics and activism.

People get really attached to ideals and ideas, and keep sticking to those after they fail to work again and again.


i don't think it is harmless, otherwise we are incentivising people to just say whatever they want without any care for truth. people's reputations should be attached to their predictions.


Some people definitely do, but how do they go and address it? A fresh example, in that it involves pure misinformation: I just screwed up and told some neighbors garbage collection was delayed for a day because of almost 2ft of snow. Turns out it was just food waste, and I was distracted checking the app and read the notification poorly.

I went back to tell them (I don't know them at all, it's just that everyone is chattier digging out of a storm) and they were not there. Feel terrible, and there's no real viable remedy. Hope they check it themselves and realize I am an idiot. Even harder on the internet.


Do _you_ do that?


i try to yes

