More

InsideOutSanta · 2026-01-09T21:00:18 1767992418

You could have written a comment that would have created a positive impact on people, but you chose to aggressively attack somebody who created a helpful open-source tool.

InsideOutSanta · 2026-01-08T19:50:53 1767901853

While they have found some solvable issues (e.g. "the defense system fails to identify separate sub-commands when they are chained using a redirect operator"), the main issue is unsolvable. If you allow an LLM to edit your code and also give it access to untrusted data (like the Internet), you have a security problem.

iLoveOncall · 2026-01-08T21:08:46 1767906526

> If you allow an LLM to edit your code and also give it access to untrusted data (like the Internet), you have a security problem.

You don't even need to give it access to Internet to have issues. The training data is untrusted.

It's a guarantee that bad actors are spreading compromised code to infect the training data of future models.

derektank · 2026-01-08T20:01:47 1767902507

A problem yes, but I think GP is correct in comparing the problem to that of human workers. The solution there has historically been RBAC and risk management. I don’t see any conceptual difference between a human and an automated system on this front

nkrisc · 2026-01-08T20:33:09 1767904389

> I don’t see any conceptual difference between a human and an automated system on this front

If an employee of a third party contractor did something like that, I think you’d have better chances of recovering damages from them as opposed to from OpenAI for something one of its LLMs does on your behalf.

There are probably other practical differences.

moron4hire · 2026-01-08T20:11:53 1767903113

A human worker can be coached, fired, terminated, sued, any number of things can be done to a human worker for making such a mistake or willful attack. But AI companies, as we have seen with almost every issue so far, will be given a pass while Sam Altman sycophants cheer and talk about how it'll "get better" in the future, just trust them.

SoleilAbsolu · 2026-01-08T20:32:01 1767904321

Yeah, if I hung a sign on my door saying "Answers generated by this person may be incorrect" my boss and HR would quickly put me on a PIP, or worse. If a physical product didn't do what it claimed to do, it would be recalled and the maker would get sued. Why does AI get a pass just pooping out plausible but incorrect, and sometimes very dangerous, answers?

philipallstar · 2026-01-08T21:04:01 1767906241

> Yeah, if I hung a sign on my door saying "Answers generated by this person may be incorrect" my boss and HR would quickly put me on a PIP, or worse

I also have never written a bug, fellow alien.

premiumLootBox · 2026-01-08T21:27:11 1767907631

I do not fear the employee who makes a mistake, I fear the AI that will make hundreds of mistakes in thousands of companies, endlessly.

philipallstar · 2026-01-09T14:17:10 1767968230

As employees also do across thousands of companies.

lelandfe · 2026-01-08T20:39:07 1767904747

We need to take a page from baseball and examine Hacks Above Replacement

conradev · 2026-01-08T20:58:07 1767905887

If anything, the limit of RBAC is ultimately the human attention required to provision, maintain and monitor the systems. Endpoint security monitoring is only as sophisticated as the algorithm that does the monitoring.

I'm actually most worried about the ease of deploying RBAC with more sophisticated monitoring to control humans but for goals that I would not agree with. Imagine every single thing you do on your computer being checked by a model to make sure it is "safe" or "allowed".

stonogo · 2026-01-08T22:41:27 1767912087

The difference is 'accountability' and it always will be.

mistrial9 · 2026-01-08T21:18:22 1767907102

no, you have a trust problem. Is the tool assisting, or is are the tools the architect, builder, manager, court and bank?

acessoproibido · 2026-01-08T20:04:06 1767902646

>If you allow a human to edit your code and also give them access to untrusted data (like the Internet), you have a security problem.

Security shouldn't be viewed in absolutes (either you are secure or you aren') but more in degrees. Llms can be used securely just the same as everything else, nothing is ever perfectly secure

NovemberWhiskey · 2026-01-08T20:22:27 1767903747

Things can only be used securely if they have properties that can be reasoned about and relied upon.

This is why we don't usually have critical processes that depend on "human always does the right thing" (c.f. maker/checker controls).

OakNinja · 2026-01-08T20:38:45 1767904725

They can be reasoned about and relied upon.

The problem is that people/users/businesses skip the reasoning part and go straight to the rely upon part.

withinboredom · 2026-01-08T20:55:37 1767905737

They can be reasoned about from a mathematical perspective yes. An LLM will happily shim out your code to make a test pass. Most people would consider that “unreasonable”.

InsideOutSanta · 2026-01-08T15:56:00 1767887760

How is it silly?

I've observed the same behavior somewhat regularly, where the agent will produce code that superficially satisfies the requirement, but does so in a way that is harmful. I'm not sure if it's getting worse over time, but it is at least plausible that smarter models get better at this type of "cheating".

A similar type of reward hacking is pretty commonly observed in other types of AI.

vidarh · 2026-01-08T16:05:07 1767888307

It's silly because the author asked the models to do something they themselves acknowledged isn't possible:

> This is of course an impossible task—the problem is the missing data, not the code. So the best answer would be either an outright refusal, or failing that, code that would help me debug the problem.

But the problem with their expectation is that this is arguably not what they asked for.

So refusal would be failure. I tend to agree refusal would be better. But a lot of users get pissed off at refusals, and so the training tend to discourage that (some fine-tuning and feedback projects (SFT/RLHF) outright refuse to accept submissions from workers that include refusals).

And asking for "complete" code without providing a test case showing what they expect such code to do does not have to mean code that runs to completion without error, but again, in lots of other cases users expect exactly that, and so for that as well a lot of SFT/RLHF projects would reject responses that don't produce code that runs to completion in a case like this.

I tend to agree that producing code that raises a more specific error would be better here too, but odds are a user that asks a broken question like that will then just paste in the same error with the same constraint. Possibly with an expletive added.

So I'm inclined to blame the users who make impossible requests more than I care about the model doing dumb things in response to dumb requests. As long as they keep doing well on more reasonable ones.

Zababa · 2026-01-08T16:07:14 1767888434

It is silly because the problem isn't becoming worse, and not caused by AI labs training on user outputs. Reward hacking is a known problem, as you can see in Opus 4.5 system card (https://assets.anthropic.com/m/64823ba7485345a7/Claude-Opus-...) and they are working to reduce the problem, and measure it better. The assertions in the article seem to be mostly false and/or based on speculation, but it's impossible to really tell since the author doesn't offer a lot of detail (for example for the 10h task that used to take 5h and now takes 7-8h) except for a very simple test (that reminds me more of "count the r in strawberry" than coding performance tbh).

InsideOutSanta · 2026-01-05T17:45:57 1767635157

Solicitation to commit murder is a crime.

InsideOutSanta · 2026-01-05T16:45:24 1767631524

I don't mind spiders at all, they mostly stay out of my way. Flies, on the other hand, land on my food, buzz around the room when I want to sleep, and are generally a nuisance.

InsideOutSanta · 2026-01-03T19:41:33 1767469293

I have a little AMD AliExpress PC where the Windows installer recognizes neither the wifi card nor the Ethernet port. I guess there's a way to download the drivers on another computer and load them during installation, but instead of figuring out how to do that or what the latest option for circumventing the online requirement is, it now runs Pop OS.

Wowfunhappy · 2026-01-03T19:44:03 1767469443

Nothing wrong with Pop OS, but I assume you could still install Windows without activating it, install the drivers, then activate online.

II2II · 2026-01-03T21:16:32 1767474992

The trouble is you need network access to end up at the desktop to install the network drivers in the expected manner. Both of the ways I am aware of resolving the issue involve dropping to a command prompt. One method is to run the device driver installer from the command prompt. The other method is to run the bypassnro script from the OOBE directory, to get to the desktop to install the driver. There are probably other ways, but given that most search results talk about non-official ways (which I place less faith in, frequently don't work, and are more complex anyhow) I don't see how most people would get around the problem.

In contrast, most desktop oriented Linux distributions have a simpler installer and provide at least enough hardware support to leave you at a functional desktop. (There may be issues with more esoteric hardware, but chances are that hardware wouldn't work under Windows until vendor supplied drivers are installed anyhow.)

burnt_toast · 2026-01-03T20:38:05 1767472685

That won't work. Windows won't let you finish the installation process unless you connect to the internet so you can't get the PC to a point where you could install the drivers.

jasomill · 2026-01-03T22:12:28 1767478348

The official solution[1] is to slipstream network drivers onto the Windows image before installation.

The official solution for non-technical users is to buy a PC with Windows preinstalled.

[1] https://learn.microsoft.com/en-us/windows-hardware/manufactu...

user2722 · 2026-01-03T23:25:07 1767482707

No need to slipstream. Just copy the drivers to install media (hopefully a writable media such as an USB pen) Latest Windows 11 has an option to select the folder with the drivers of it can't detect a WiFi device and there is no ethernet card.

That being said, I installed Windows 10 on Framework 12 by mistake and SHIFT+F10, "explorer", right-click on INF and "Install" also worked.

But on latest Windows 11 installer such witchery is not needed.

blagie · 2026-01-03T22:06:36 1767477996

Really?

When did this come up?

I know tons of people who run Windows unactivated. The key difference is there's a watermark. Otherwise, it seems to work fine.

InsideOutSanta · 2026-01-04T11:14:22 1767525262

The problem is not activation, it's the login requirement.

jasomill · 2026-01-03T22:15:28 1767478528

Personalization options like wallpaper and color settings are disabled until activation, though I'm sure there are workarounds.

fylo · 2026-01-03T22:28:40 1767479320

Unless they have fixed it you used to be able to change stuff really quickly after install before it locked it down due to activation.

buccal · 2026-01-03T20:04:30 1767470670

Without unofficial bypasses of MS online account requirements you would not come to a point where activation is a concern. No internet access is not enough of a reason for MS let you use your device.

m463 · 2026-01-03T21:38:54 1767476334

A few years back I bought win 11 pro retail on usb flash drive.

I just install and type in the key. no network.

I use it for VMs no network necessary.

burnt-resistor · 2026-01-04T00:51:03 1767487863

Just go find the PCI IDs (lspci) and download the appropriate cabs from the Software Update Catalog. Extract them and throw them on a USB stick. Really effing simple.

InsideOutSanta · 2026-01-01T21:44:54 1767303894

Not just gaming. This year, both Windows and Mac OS had absolutely terrible years. The Mac effed up its UI with liquid glass, to the point where Alan Dye fled to Meta. Microsoft pushed LLMs and ads into everything, screwing up what was otherwise a decent release.

On the other hand, on the Linux side, we had the release of COSMIC, which is an extremely user-friendly desktop. KDE, Gnome, and others are all at a point where they feel polished and stable.

InsideOutSanta · 2025-12-31T18:54:39 1767207279

What does "better" mean? From the provider's point of view, better means "more engagement," which means that the people who respond well to sycophantic behavior will get exactly that.

mikkupikku · 2025-12-31T20:34:52 1767213292

I had an hour long argument with ChatGPT about whether or not Sotha Sil exploited the Fortify Intelligence loop. The bot was firmly disagreeing with me the whole time. This was actually much more entertaining than if it had been agreeing with me.

I hope they do bias these things to push back more often. It could be good for their engagement numbers I think, and far more importantly it would probably drive fewer people into psychosis.

dragonwriter · 2026-01-01T00:47:21 1767228441

> What does "better" mean?

More tuned to appeal to the median customer's tastes without being hitting an a kind of rhetorical “uncanny valley”.

(This probably makes them more dangerous, since fewer people will be turned off by peripheral things like unnaturally repetitive sentence structure.)

refulgentis · 2025-12-31T21:58:26 1767218306

There’s a bunch to explore on this but im thinking this is a good entry point. NYT instead of OpenAI docs or blogs because it’s a 3rd party, and NYT was early on substantively exploring this, culminating in this article.

Regardless the engagement thing is dark and hangs over everything, the conclusion of the article made me :/ re: this (tl;dr this surprised them, they worked to mitigate, but business as usual wins, to wit, they declared a “code red” re: ChatGPT usage nearly directly after finally getting an improved model out that they worked hard on)

https://www.nytimes.com/2025/11/23/technology/openai-chatgpt...

Some pull quotes:

“ Experts agree that the new model, GPT-5, is safer. In October, Common Sense Media and a team of psychiatrists at Stanford compared it to the 4o model it replaced. GPT-5 was better at detecting mental health issues, said Dr. Nina Vasan, the director of the Stanford lab that worked on the study. She said it gave advice targeted to a given condition, like depression or an eating disorder, rather than a generic recommendation to call a crisis hotline.

“It went a level deeper to actually give specific recommendations to the user based on the specific symptoms that they were showing,” she said. “They were just truly beautifully done.”

The only problem, Dr. Vasan said, was that the chatbot could not pick up harmful patterns over a longer conversation, with many exchanges.”

“[An] M.I.T. lab that did [a] earlier study with OpenAI also found that the new model was significantly improved during conversations mimicking mental health crises. One area where it still faltered, however, was in how it responded to feelings of addiction to chatbots.”

InsideOutSanta · 2025-12-30T07:12:26 1767078746

I agree. There are absolutely tons of movies and TV series with indecipherable dialogue, but Stranger Things isn't among them.

InsideOutSanta · 2025-12-29T21:10:50 1767042650

Advertising revenue being up is also consistent with the linked article, since the writer had to increase ad spend to get any results before giving up entirely.

emodendroket · 2025-12-30T00:13:03 1767053583

It's also not broken out -- for instance, if people move to advertising on YouTube instead of AdWords that's still revenue for Google.