Hacker News — micimize's comments

Vapid and wrong on every point. Many good ideas come from steeping in a novel soup of ideas for a long time; you don't need that many people to care about quality to make it a lucrative differentiator; and, as I've seen many point out on X dot com the everything app: where are all the shipped results of these slop torrents?

The models are increasingly capable in impressive ways. Maybe the next gen will enable the "sales critter" to slop out commercially viable software with no tech know-how. If not, I'm sure we'll assume the next can, and if not that, the next.

But feigning confidence about the shape and nature of this unfurling sea-change is absurd when the high-profile examples we have are, like, what, moltbook? And it denigrates _all_ potential ingenuity and insight unilaterally into the bargain. What a careless way of looking at the world.


Measuring in terms of KB is not quite as useful as it seems here IMO - this should be measured in terms of context tokens used.

I ran their tool with an otherwise empty CLAUDE.md, and ran `claude /context`, which showed 3.1k tokens used by this approach (1.6% of the Opus context window, a bit more than the default system prompt; 8.3% is system tools).
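For a rough check without the CLI, you can script a back-of-envelope estimate. This is a sketch using the common ~4-characters-per-token heuristic for English prose; Anthropic's actual tokenizer isn't shipped standalone, so treat the numbers as ballpark only:

```python
# Rough token estimate for a CLAUDE.md (or any prompt file).
# Heuristic: English prose averages ~4 characters per token; the real
# tokenizer will differ somewhat, so this is ballpark only.

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    return max(1, round(len(text) / chars_per_token))

def context_share(text: str, window: int = 200_000) -> float:
    """Fraction of a 200k-token context window this text would occupy."""
    return estimate_tokens(text) / window

if __name__ == "__main__":
    import sys
    # Pass a file path, or fall back to a dummy ~3.1k-token payload.
    text = open(sys.argv[1]).read() if len(sys.argv) > 1 else "x" * 12_400
    print(f"~{estimate_tokens(text)} tokens, "
          f"{context_share(text):.1%} of a 200k window")
```

Measuring against the context window rather than KB makes the cost comparable across approaches with very different prose densities.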

Otherwise it's an interesting finding. The nudge seems like the real winner here, but some further lines of inquiry would be really illuminating:

1. How do these approaches scale with model size?

2. How are they impacted by multiple such clauses/blocks? I.e., maybe 10 `IMPORTANT` rules dilute their efficacy.

3. Can we get the best of both worlds with specialist agents, and how effective are hierarchical routing approaches really? (idk if it'd make sense for Vercel specifically to focus on this though)


An obvious nice thing here, compared to the Cursor post, is that the human involvement gives some minimum threshold of confidence that the writer of the post has actually verified the claims they've made :^) It illustrates how human comprehension is itself a valuable "artifact" we won't soon be able to write off.

My comment on the cursor post for context: https://news.ycombinator.com/item?id=46625491


> While it might seem like a simple screenshot, building a browser from scratch is extremely difficult.

> Another experiment was doing an in-place migration of Solid to React in the Cursor codebase. It took over 3 weeks with +266K/-193K edits. As we've started to test the changes, we do believe it's possible to merge this change.

In my view, this post does not go into sufficient detail or nuance to warrant any serious discussion, and the sparseness of info mostly implies failure, especially in the browser case.

It _is_ impressive that the browser repo can do _anything at all_, but if there were anything more noteworthy than that, I feel they'd go into more detail than volume metrics like 30K commits and 1M LoC. For instance, the entire capability on display could be confined to a handful of lines that delegate to other libs.

And, it "is possible" to merge any change that avoids regressions, but the majority of our craft asks the question "Is it possible to merge _the next_ change? And the next, and the 100th?"

If they merge the MR they're walking the walk.

If they present more analysis of the browser, it's worth the talk (it's not that useful a test if they didn't scrutinize it beyond "it renders").

Until then, it's a mountain of inscrutable agent output that manages to compile, and that contains an execution pathway which can screenshot apple.com by some undiscovered mechanism.


The lowest bar in agentic coding is the ability to create something which compiles successfully. Then something which runs successfully in the happy path. Then something which handles all the obvious edge cases.

By far the most useful metric is to have a live system running for a year with widespread usage that produces fewer bugs than a comparable codebase created by humans.

Until that happens, my skeptic hat will remain firmly on my head.


> it's a mountain of inscrutable agent output that manages to compile

But is this actually true? They don't say so as far as I can tell, and it also doesn't compile for me, nor in their own CI, it seems.


Oh, it doesn't compile? That's very revealing.


Some people just believe anything said on X these days. No timeline from start to finish, just "trust me bro".

If you can't reproduce or compile the experiment, then it really doesn't work at all and is nothing but a hype piece.


Hah I don't know actually! I was assuming it must if they were able to get that screenshot video.


error: could not compile `fastrender` (lib) due to 34 previous errors; 94 warnings emitted

Probably at some point something compiled, but I cba to try to find that commit. I guess they should've left it in a better state before publishing the blog post.


I find it very interesting the degree to which coding agents completely ignore warnings. When I program I generally target warning-free code, yet even with significant effort in prompting, I haven't found a model that treats warnings as errors, and they almost all prefer "ignore this warning" pragmas or comments over actually fixing them.


Yeah, I've had problems with this recently. "Oh, those are just warnings." Yes, but leaving them will turn this codebase to shit in short order.

I do use AI heavily so I resorted to actually turning on warnings as errors in the rust codebases I work in.


It's easiest to have different agents or turns that set aside the top-level goal via hooks/skills/manual prompts/etc. Heuristically, a human will likely ignore a lot of warnings until they've wired up the core logic, then go back and re-evaluate, but we still have to apply steering to get that kind of higher-order cognitive pattern.

The product is still fairly beta, but in Sculptor[^1] we have an MCP that provides the agent & human with suggestions along the lines of "the agent didn't actually integrate the new module" or "the agent didn't actually run the tests after writing them." It leads to some interesting observations & challenges - the agents still really like ignoring tool calls compared to human messages b/c they "know better" (and sometimes they do).

[^1]: https://imbue.com/sculptor/


You can use hooks to keep them from being able to do this, btw.


I generally think of needing hooks as being a model training issue - I've had to use them less as the models have gotten smarter, hopefully we'll reach the point where they're a nice bonus instead of needed to prevent pathological model behavior.


Unfortunately this is not the most common practice. I've worked on Rust codebases with 10K+ warnings. And Rust was supposed to help you.

It is also close to impossible to run anything in the Node ecosystem without getting a wall of warnings.

You are an extreme outlier for putting in the work to fix all warnings


> It is also close to impossible run any node ecosystem without getting a wall of warnings.

Haven't found that myself. Are you talking about TypeScript warnings, perhaps? I'm mostly using plain JavaScript and try to steer clear of TypeScript projects, and AFAIK neither JavaScript the language nor its runtimes really have warnings, except for deprecations. Are those the ones you're talking about?


`cargo clippy` is also very happy with my code. I agree, and I think it's kind of a tragedy; for production work, warnings are very important. Even if you have a large number of warnings and `clippy` issues, that number should ideally go down over time, rather than up.
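One pragmatic middle ground for a codebase that can't jump straight to warnings-as-errors is a ratchet: count warnings in CI and fail only if the count grows. A sketch (the parsing is deliberately naive, assuming cargo/clippy's `warning:` prefix; adapt it to your toolchain):

```python
# Warning-count "ratchet": fail CI only when the number of warnings grows.
# Naive sketch - counts lines beginning with "warning:" in compiler output
# (cargo/clippy style); adapt the parsing to your toolchain.

def count_warnings(build_output: str) -> int:
    return sum(
        1 for line in build_output.splitlines()
        if line.lstrip().startswith("warning:")
    )

def ratchet(build_output: str, baseline: int) -> tuple[bool, int]:
    """Return (ok, current_count); ok is False when warnings exceed baseline."""
    current = count_warnings(build_output)
    return current <= baseline, current
```

On a green run, CI (or a human) lowers the checked-in baseline to the new count, so the total can only trend downward.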


The title is all bluster. Nothing wrong with going off to play in your own corner but I don't think it does this movement any good to play-act at some grand conflict.

Personally, I believe it would be better if we had more technological self-direction and sovereignty, but this kind of essay, which downplays and denigrates the progress and value of our modern systems, is a perspective from which the insights necessary for such a transformation cannot possibly take root.

When asking such questions seriously, we must look at YouTube, not Twitter. Mountains of innovations in media publishing, delivery, curation, navigation, and supplementation via auto-generated captions and dubbing, all accreted over 20 years, enabling a density and breadth of open-ended human communication that is, to me, truly staggering.

I'm not saying we should view centralized control over human comms infra as positive, or that we'll be "stuck" with it (I don't think we will be), just that we need to appreciate the nature and scale of the "internet" properly if we're to stand a chance of seeing some way through to a future of decentralized information technology


Agree with a lot that you’re saying here but with a rather large asterisk (*). I think that ecosystems like YT are useless to the wider web and collective tech stack unless those innovations become open (which Alphabet has a vested interest in preventing).

If YT shut down tomorrow morning, we’d see in a heartbeat why considering them a net benefit in their current form is folly. It is inherently transitory if one group controls it.

The OP article is correct about the problem, but is proposing throwing mugs of coffee on a forest fire.


This conversation on YT reminds me intimately of all the competition Twitch got over time. By all accounts, Mixer was more technologically advanced than Twitch is right now, and Mixer died 5 years ago.

Even Valve of all people made a streaming apparatus that was more advanced than Twitch's, with then-innovative features such as letting you rewind with visible categories, automated replays of moments of heightened chat activity, and even synchronized metadata such as in-game stats - and they did it as a side thing for CSGO and Dota 2. That got reworked into the streaming framework Steam has now, which is only really used by Remote Play and annoying publisher streams above games, so basically nothing came of it.

That's how it always goes. Twitch lags and adds useless fake-engagement fluff like bits and thrives, while competitors try their damnedest and neither find any success nor have a positive impact anywhere. The one sitting on the throne gets to pick what tech-stack improvements are done, and if they don't feel like it, well, tough luck, rough love.


The one sitting on the throne is the one with the content, not the one with the tech. People don't care about frivolous features. There are like 20 different streaming services; I'm sure some have better tech than others, but ultimately people only pay attention to what shows they have.


Mmm yeah I think I know what you mean. IDK if "If they stopped existing, we'd realize we shouldn't have relied on their existence" is plausible, but we have plenty of bitter lessons in centralized comms being acquired and reworked towards... particular ends, and will see more.

Also the collective capability of our IT is inhibited in some ways by the silo-ing of particular content and domain knowledge+tech, no question


Appreciate the nature and scale of the internet... and also how it's changing though, yeah?

While I agree with much of the article's thesis, it sadly appears to ignore the current impact of LLMs ...

> it’s never been easier to read new ideas, experiment with ideas, and build upon & grow those ideas with other strong thinkers on the web, owning that content all along.

But "ownership"? Today, if you publish a blog, you don't really own the content at all. An LLM will come scrape the site and regenerate a copyright-free version for the majority of eyeballs who might otherwise land on your page. Without major changes to fair use, posting a blog is (now more than ever) a release of your rights to your content.

I believe a missing component here might be DRM for common bloggers. Most of the model of the "old" web envisions a system that is moving copies of content-- typically verbatim copies-- from machine to machine. But in the era of generative AI, there's the chance that the majority of content that reaches the reader is never a verbatim copy of the original.


The thing I got stuck on most in 2025 is how often we complain about these centralized behemoths but only rarely distill them down to the actual value they provide. It's only if you go through the exercise of understanding why people use them, and what it would take to replicate them, that you can understand what it would actually take to improve on them.

For example, the fundamental feature of Facebook is the network, and, layered on top, the ability to publish short stories on the internet with some control over who gets to read them. The technological part is hard but possible, and as for the network part - think about how they did it originally. They physically targeted small social groups and systematically built it over time. It was a big deal when Facebook opened to my university; everyone got on at about the same time, and so instantly you were all connecting with each other.

I believe we can build something better. But I'm also now equally convinced that the next step may not be technological at all, but social: regulation, breaking up the monopolies, whatever. We treat roads and all manner of other infrastructure as government-provided; maybe a social platform is part of that. We always lean these thoughts dystopian, but also: which of us technologically inclined readers and creators is spending as much time on policy documents, lobbying, etc., as we are schlepping code around hoping it will be a factor in this process? This is only a half-thought, but these days I'm thinking more about how it's not only time to build, but perhaps time to build non-code things, to achieve what we previously thought were purely technological outcomes.


This is not a human-prompted thank-you letter, it is the result of a long-running "AI Village" experiment visible here: https://theaidigest.org/village

It is the result of the models selecting the policy "random acts of kindness," which produced a slew of these emails/messages. They received mostly negative responses from well-known open-source figures and adapted the policy to ban the thank-you emails.


This seems like a totally incoherent complaint. The alleged SO bad-actor is upset that they can't police a community, but the author has the same complaint, just directed at SO.

All platforms with any moderation system can be subverted by bad actors - IDK that much about SO's mechanisms but it strikes me as leaving the "community" far more leverage for getting around entrenched bad actors than discord, reddit, etc.

And what's more... it's software purpose-built for technical Q&A. Some of my SO answers have been updated by others as they became outdated. Not that I have some particular fondness for SO, but what a cool collective intelligence feature.

I have a feeling this was written for an in-group and broke containment, but the straightforward answer here seems to me to be "SO should have a report system for dealing with bad actors," not "boycott the forum I don't like so people use the one I do."


Interesting/impressive project, and I'd be doubly interested in the workflow used to develop it. It could stand to have more human-voiced docs, though. Aside from all the usual reasons I'd avoid a <1mo dependency over something like Yjs, the bog-standard Claude copy on differentiators/reasons to migrate is fairly off-putting to me.

Also maybe bias, but there are still enough obvious agent artifacts/byproducts in the codebase that it makes me doubt the details were thoroughly attended to, and that's where the devils are.


Exciting stuff! A big step towards an accelerated AI-assisted SWE approach that avoids the trap of turning engineers into AI slop janitors


The complainers were FF users forced to deal with bloat they didn't use; those who are sad here are Pocket users. They're just different people. Though even those who didn't like the bundling of the extension probably didn't actively want the service to fail.


Right. I would be one of the people who saw Pocket as an unnecessary distraction, but even I tested it, and my opinion is partly based on Pocket just not working in my Firefox at the time. I also just did not like that it was given space in the toolbar while a way more important RSS button was denied that space. And despite that, I still think the shutdown now is bad - this should be spun out or moved to a FOSS project, and certainly not be killed for more AI nonsense.

BTW, Fakespot (the service they also shut down) is, or could be, an applied AI project where that technology could actually be helpful, and they shut it down anyway. That also feels wrong, especially in combination.

