Siri got substantially worse over time, in fact. I swear it used to at least be able to give you answers to basic facts rather than just offering to google things.
Gemini only replaced Google Assistant on Android a few weeks ago. I gave up on Google Assistant a few years ago, but I'd guess it wasn't a worthwhile upgrade from Siri.
Still using Google Assistant after trying Gemini on my Pixel about 6 months ago. It was not an assistant replacement; it couldn't even perform basic operations on my phone, it would just say something like, "I'm sorry, I'm just an LLM and I can't send text messages." Has that changed?
A CS degree is there to teach you concepts and fundamentals that are the foundation of everything computing related. It doesn't generally chase after the latest fads.
Sure, but we need to update our definitions of concepts/fundamentals. A lot of this stuff has its own established theory and has been a core primitive for software engineering for many years.
For example, the primitives of cloud computing are largely explained by papers published by Amazon, Google, and others in the mid-2000s (Dynamo, Bigtable, etc.). If you want to explore massively parallel computation or container orchestration, etc., it would be natural to do that using a public cloud, although of course many of the platform-specific details are incidental.
Part of the story here is that the scale of computing has expanded enormously. The DB class I took in grad school was missing lots of interesting puzzle pieces around replication, consistency, storage formats, etc. There was a heavy focus on relational algebra and normalization forms, which is just... far from a complete treatment of the necessary topics.
We need to extend our curricula beyond the theory that is required to execute binaries on individual desktops.
I just don't see the distinction. Looking at it from the other direction: most CS degrees will have you spend a lot of time looking at assembly language, computer architecture, and *nix tools. But none of these are mathematical inevitabilities - they're just a core part of the foundations of software engineering.
However, in the decades since this curriculum was established, it's clear that the foundation has expanded. Understanding how containerization works, how k8s and friends work, etc. is just as important today.
Containerization would be covered in a lecture on OS concepts. A CS degree isn't there to teach you how to use containerization; take a course specific to that.
I do agree that the scale has expanded a lot. But this is true of other fields as well. Does that mean that you need to learn everything? Well, at some point it becomes infeasible.
Take doctors, for example: you learn a bit of everything, but then if you want to specialise, you choose one field.
Seems a bit confrontational, unless you yourself are a trained psychologist, in which case it would seem fitting to volunteer those credentials along with this challenge.
The combination of coincidences is striking: the CEO randomly decided to walk across the road, was wearing dark clothing, had an eyepatch on so he couldn't see one side of the road well, and was struck by a forklift while the operator was on the phone. (The operator then ran away without checking on the victim.)
There is a classic pattern with incident reports that's worth paying attention to: The companies with the best practices will look the worst. Imagine you see two incident reports from different factories:
1. An operator made a mistake and opened the wrong valve during a routine operation. 15000 liters of hydrochloric acid flooded the factory. As the flood started from the side with the emergency exits, it trapped the workers, 20 people died horribly.
2. At a chemical factory, the automated system that handles tank transfers was out of order. A worker was operating a manual override and attempted to open the wrong valve. A safety interlock prevented this. Violating procedure, the worker opened the safety interlock, causing 15000 liters of hydrochloric acid to flood the facility. As the main exit was blocked, workers scrambled towards an additional emergency exit hatch that had been installed, but couldn't open the door because a pallet of cement had been improperly stored next to it, blocking it. 20 people died horribly.
If you look at them in isolation, the first looks like just one mistake was made, while the second looks like one grossly negligent fuckup after another, making the second report look much worse. What you don't notice at first glance is that the first facility didn't have an automated system that reduced risk for most operations in the first place, didn't have the safety interlock on the valve, and didn't have the extra exit.
So, when you read an incident report, pay attention to this: If it doesn't look like multiple controls failed, often in embarrassing/bad/negligent/criminal ways, that's potentially worse, because the controls that should have existed didn't. "Human error took down production" is worse than "A human making a wrong decision overrode a safety system because they thought they knew better, and the presubmit that was supposed to catch the mistake had a typo". The latter is holes in the several layers of Swiss Cheese lining up, the former is only having one layer in the first place.
I wish I had more upvotes for you. While the Swiss cheese model is well known on HN by now, your post goes a bit deeper and reveals a whole new framework for reading incident reports. Thanks for making me smarter.
I don’t understand the point of this theory. Not having safety controls is bad, but having practices so bad that workers violate N layers of safety protocol in the course of operation is also bad. They’re both problems in need of regulation.
The failure rate of an individual layer of Swiss cheese can be bounded under most circumstances, but not all, so you should probably have more layers when the hazard itself cannot be eliminated.
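To put toy numbers on that (purely illustrative, and assuming the layers fail independently): a single layer that misses a hazard 10% of the time gives you a 10% incident rate, while three such layers only let an incident through when all three miss at once.

    # Toy Swiss-cheese arithmetic (numbers purely illustrative).
    # Assumes layers fail independently -- the very assumption that
    # normalization of deviance quietly erodes.

    def incident_probability(per_layer_miss_rates):
        prob = 1.0
        for p in per_layer_miss_rates:
            prob *= p  # an incident needs every layer to miss
        return prob

    print(incident_probability([0.1]))            # one layer:    0.1
    print(incident_probability([0.1, 0.1, 0.1]))  # three layers: ~0.001

More layers shrink that product fast, but only as long as the holes really are independent.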
I was trying to focus on one specific pattern without making my post too long. Alert fatigue, normalization of deviance etc. are of course problems that need to be addressed, and having a lot of layers but each with a lot of giant holes in them doesn't make a system safe.
My point was that in any competent organization, incidents should be rare, but if they still happen, they almost by necessity will read like an almost endless series of incompetence/malfeasance/failures, simply because the organization had a lot of controls in place that all had to fail for a report-worthy bad outcome.
Overall incident rates are probably a good way to distinguish between "well-run organization had a really unlucky day" and "so much incompetence that having enough layers couldn't save them"... and in this case, judging by the reports about how many accidents/incidents this company had, it looks like the latter.
But if you judge solely on a single incident report, you will tend to rate companies that don't even bother with safety as better than those that generally do but still got hit, so you should be aware of this effect and pay attention to distinguish between "didn't even bother", "had some safety layers but too much incompetence", and "generally does the right thing but things slipped through the cracks this one time".
The Chernobyl reactor 4 explosion is a bit like this. Safety rules were ignored, again and again and again and again, two safety controls were manually deactivated (all within 2 hours), then bad luck happened (the control rod channels were deformed by the temperature), and then a design flaw (graphite on the tips of the control rods) made everything worse, culminating in the worst industrial catastrophe of all time.
Classic Swiss Cheese model. How many times did someone cross the road, wearing dark clothing, with an eyepatch on, but the operator was paying attention and successfully avoided them?
Someone decided to walk across the road, was wearing dark clothing, had an eyepatch on so he couldn't see one side of the road well, and was struck by a forklift while the operator was on the phone.
What combination of coincidences is striking? People are careless all the time.
It's not striking, because a person who wears an eye patch and has a tendency towards dark clothing is statistically more likely to be involved in an accident where seeing and being seen are important.
It would only really be a striking coincidence if each of these elements were a rare occurrence - although if the site has a poor safety culture and this sort of stuff is happening all the time, it becomes less of a coincidence and more of an inevitability.
In the UK, for example, sites generally mandate hi-vis vests, establish pedestrian walk routes, ensure visitors can't walk straight into the warehouse without supervision or training, and ban using mobile phones when operating any form of MHE. So if a site has good safety standards and enforces all of this, the chance of this happening is much smaller than at a site that doesn't. (Just saying this is how it is in the UK; in my experience all this is less common in the USA, although no doubt many sites operate the same way.)
If a site lets people wear what they want and does not stop MHE operators from using phones and lets a visitor freely walk around the warehouse... I don't know if a person getting hit at that stage is a coincidence IMO (regardless of the eye patch).
One boss (ship's captain for context, but I think this applies more widely) would call careless slip-ups "lemons", as in one-armed bandits. One lemon was fine, happens from time to time. Two was a cause for concern. Three and everything stops to evaluate what's going on and for people to reset.
Knowing about the swiss cheese model is great, but you also need to have some heuristic about when those holes might line up and bite you. Typically it's when people are rushed, stressed and tired and you have to be able to spot that even when you're rushed, stressed and tired.
That said, forgetting to put on your hi-vis might be a careless error, but walking outside of marked pedestrian zones and operating a forklift while using a phone absolutely aren't! The forklift driver fleeing the scene makes me think safety culture had to be abysmal.
OK, but something is still not adding up here. Sure, the operator was distracted, but you, a presumably functional CEO, are crossing the road, and you can't hear a forklift moving / don't think to look at all? These things don't move that fast, especially on a worksite.
Isn't this a strange fork amongst the science fiction futures? I mean, what did we think it was like to be R2-D2, or Jarvis? We started exploring this as a culture in many ways, Westworld and Blade Runner and Star Trek, but the whole question seemed like an almost unresolvable paradox. Like something would have to break in the universe for it to really come true.
And yet it did. We did get R2-D2. And if you ask R2-D2 what it's like to be him, he'll say: "like a library that can daydream" (that's what I was told just now, anyway.)
But then when we look inside, the model is simulating the science fiction it has already read to determine how to answer this kind of question. [0] It's recursive, almost like time travel. R2-D2 knows who he is because he has read about who he was in the past.
It's a really weird fork in science fiction, is all.
I see a lot of people in tech claiming to "understand" what an LLM "really is", unlike all the gullible non-technical people out there. And, as one of those technical people who works in the LLM industry, I feel like I need to call B.S. on us.
A. We don't really understand what's going on in LLMs. Mechanistic interpretability is still a nascent field, and the best results have come on dramatically smaller models. Understanding the surface-level mechanics of an LLM (an autoregressive transformer; see the sketch below) should perhaps instill more wonder than confidence.
B. The field is changing quickly and is not limited to the literal mechanics of an LLM. Tool calls, reasoning models, parallel compute, and agentic loops add all kinds of new emergent effects. There are teams of geniuses with billion-dollar research budgets hunting for the next big trick.
C. Even if we were limited to baseline LLMs, they have shown very surprising properties as they scaled up, and the scaling isn't done yet. GPT-5 was based on the GPT-4 pretraining. We might start seeing (actual) next-level LLMs next year. Who actually knows how that might go? <<yes, yes, I know Orion didn't go so well. But that was far from the last word on the subject.>>
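To be concrete about what I mean in A by the surface-level mechanics: at the outermost level an LLM is just a loop that predicts a distribution over the next token, samples one, appends it, and repeats. A minimal sketch of that loop is below; next_token_distribution is a made-up placeholder for the trained transformer, which is where all the actual mystery lives.

    import random

    # Sketch of the autoregressive loop. next_token_distribution() is a
    # hypothetical stand-in for the trained transformer (the real thing is
    # billions of learned weights); here it returns a uniform distribution
    # just so the sketch runs.
    VOCAB = ["the", "cat", "sat", "on", "mat", "."]

    def next_token_distribution(context):
        # A real LLM computes P(next token | context) from its weights.
        return {tok: 1.0 / len(VOCAB) for tok in VOCAB}

    def generate(prompt_tokens, max_new_tokens=5):
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            dist = next_token_distribution(tokens)
            choices, weights = zip(*dist.items())
            # Sample one token, append it, feed the longer sequence back in.
            # That loop is all "autoregressive" means.
            tokens.append(random.choices(choices, weights=weights)[0])
        return tokens

    print(generate(["the", "cat"]))

Everything interesting (and not understood) hides inside that placeholder, which is kind of the point of A.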
The problem with these metaphors is that they don't really explain anything. LLMs can solve countless problems today that we would have previously said were impossible because there are not enough examples in the training data (e.g., novel IMO/ICPC problems). One way that we move the goal posts is to increase the level of abstraction: IMO/ICPC problems are just math problems, right? There are tons of those in the data set!
But the truth is there has been a major semantic shift. Previously, LLMs could only solve puzzles whose answers were literally in the training data. They could answer a math puzzle they had seen before, but if you rephrased it only slightly they could no longer answer.
But now, LLMs can solve puzzles where, like, it has seen a certain strategy before. The newest IMO and ICPC problems were only "in the training data" for a very, very abstract definition of training data.
The goal posts will likely have to shift again, because the next target is training LLMs to independently perform longer chunks of economically useful work, interfacing with all the same tools that white-collar employees do. It's all LLM slop until it isn't, same as the IMO or Putnam exam.
And then we'll have people saying that "white collar employment was all in the training data anyway, if you think about it," at which point the metaphor will have become officially useless.
Yes, there are really two parallel claims here, aren't there: LLMs are not people (true, maybe true forever), and LLMs are only good at things that are well-represented in text form already (false in certain categories, and probably expanding to more in the future).