I’m with you that it’s terrible, but it very much does have schemas! The vast majority of big YAML-based APIs (k8s, helm, compose, and so on) absolutely do check documents against schemas (not just ad-hoc validation rules) internally.
The real issue is two things. The smaller one is that there’s no single or self-describing schema system (as XML has); the larger one is that most YAML schema validation prioritizes supporting extremely permissive and complex input documents over being predictable and appropriately restrictive. And that’s a harder problem to fix, because it has more to do with priorities and community conventions.
If people wanted strict, schemaful YAML to be the norm, they would have consolidated on one of the many tools that do that by now. The issue is, people don’t want that; they want extremely flexible and open-ended APIs. YAML as currently practiced is conducive to that goal, but it’s the goal that leads to issues, not the choice of (bad, I agree) data language.
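To make the permissive-vs-restrictive point concrete, here’s a rough sketch in Python (using PyYAML and the jsonschema package; the document and keys below are invented, not taken from any real tool):

    import yaml          # PyYAML
    import jsonschema

    # An invented config document with one unrecognized key.
    doc = yaml.safe_load("replicas: 3\nextra_knob: true\n")

    # Permissive schema: only says what "replicas" should look like if present.
    permissive = {
        "type": "object",
        "properties": {"replicas": {"type": "integer"}},
    }

    # Restrictive schema: unknown keys are rejected outright.
    strict = {
        "type": "object",
        "properties": {"replicas": {"type": "integer"}},
        "required": ["replicas"],
        "additionalProperties": False,
    }

    jsonschema.validate(doc, permissive)  # passes; "extra_knob" sails through
    jsonschema.validate(doc, strict)      # raises ValidationError on "extra_knob"

The permissive style is the one most tools lean toward, which is exactly the convention being criticized above.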
> Given that Microsoft is going to have to continue to develop Windows anyway, there's not much reason for them to throw in the towel on the consumer desktop.
That would make sense … if Microsoft didn’t have the second most bonkers track record in history (after Google) in the domain of “fragmenting and releasing competing reimplementations of products already in your core portfolio”!
Another vital quality of good QA teams is that they often serve as one of the last/main repositories of tribal knowledge about how an org's entire software system actually behaves/works together. As businesses grow and products get more complex and teams get more siloed, this is really important.
> there's no "decades-running trap of trying to solve this".
I’m not as certain. The fact that we’ve gone from ASN.1 to CORBA/SOAP to protobuf to Cap’n Web and all the million other items I didn’t list says something. The fact that, even given a very popular alternative in that list, or super tightly integrated RPC like sending terms between BEAMs, basic questions like “should optionality/absence be encoded differently than unset default values?” and “how should we encode forward compatibility?” have so many different and unsatisfactory answers says something.
> I’m not as certain. The fact that we’ve gone from ASN.1 to CORBA/SOAP to protobuf to Cap’n Web and all the million other items I didn’t list says something.
> It absolutely is a decades old set of problems that have never been solved to the satisfaction of most users.
ASN.1 wasn't in the same problem space as CORBA/DCOM; both CORBA and DCOM/OLE were heavily invested in a general-purpose, non-domain-specific object model representation that would support arbitrary embeddings within an open-ended range of software. I suspect that is indeed the unsolvable problem, but I also believe that's not what you meant in your comment, since all the known large-scale BEAM deployments (the comment I originally replied to implied BEAM deployments) operate within bounded domains such as telecom and messaging, where the distributed properties of the system are known upfront: there are existing formats, protocols of exchange, and a finite number of valid interactions between entities/actors of the network, and the embeddings are either non-existent or limited to a finite set of media such as static images, videos, maps, contacts, etc. All of these can be encoded by a compile-time specification that gets published to all parties upfront.
> basic questions like “should optionality/absence be encoded differently than unset default values?”
However you like, any monoid would work here. I would argue that [a] and [] always win over (Option a) and especially over (Option [a]).
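A minimal sketch of that argument in Python (the field and class names are made up for illustration):

    from dataclasses import dataclass, field
    from typing import List, Optional

    # Option-style encoding: consumers have to distinguish three states,
    # None (absent), [] (present but empty), and a non-empty list.
    @dataclass
    class WithOption:
        tags: Optional[List[str]] = None

    # Monoid-style encoding: absence and emptiness collapse into [],
    # so every consumer can just iterate, no None check needed.
    @dataclass
    class JustAList:
        tags: List[str] = field(default_factory=list)

    for tag in JustAList().tags:  # safe even when nothing was ever set
        print(tag)

The same trade-off applies to any field whose type has a sensible "empty" value: strings, maps, counters, and so on.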
> and “how should we encode forward compatibility?”
> Approximately 4,000 NYPD officers took part in a protest that included blocking traffic on the Brooklyn Bridge and jumping over police barricades in an attempt to rush City Hall.
> The ACLU obtained a court order prohibiting strikers from carrying their service revolvers. Again, the SFPD ignored the court order. On August 20, a bomb detonated at the Mayor's home with a sign reading "Don't Threaten Us" left on his lawn.
> Among the hundreds of protesters arrested over the four days of demonstrations in New York City over the killing of George Floyd in Minneapolis, only one was highlighted by name by a police union known for its hostility toward Mayor Bill de Blasio. The name of that protester? Chiara de Blasio, the mayor’s daughter.
Yes! And like … “out-compete” assumes a zero sum game. There are massive industries where the tools used to serve different market segments only barely overlap decades after the “game changing” tool was made available.
Like, screw the whole “artisanal small-batch software” argument—there are massive new systems being written in C every day despite decades of claims that it is an obsolete language doomed to be replaced by better alternatives. Those alternatives caught on in some segments and not in others. Electric cars caught on in some segments and not in others. Steel-and-concrete building construction caught on in some segments and not in others. It’ll be fine.
Bisect is one of those things where if you're on a certain kind of project, it's really useful, and if you're not on that kind of project you never need it.
If the contributor count is high enough (or you're otherwise in a role for which "contribution" is primarily adjusting others' code), or the behaviors that get reported in bugs are specific and testable, then bisect is invaluable.
If you're in a project where buggy behavior wasn't introduced so much as grew (e.g. the behavior evolved A -> B -> C -> D -> E over time and a bug is reported due to undesirable interactions between released/valuable features in A, C, and E), then bisecting to find "when did this start" won't tell you much that's useful.

If you often have to write bespoke test scripts to run in bisect (e.g. because "test for presence of bug" is a process that involves restarting/orchestrating lots of services and/or debugging by interacting with a GUI), then you have to balance the time spent writing those against the time it'd take to figure out the causal commit by hand.

If you're in a project where you're personally familiar with roughly what was released when, or where the release process/community is well-connected, it's often better to promote practices like "ask in Slack/the mailing list whether anyone has made changes to ___ recently, and whoever pipes up will help you debug" rather than "everyone should be really good at bisect". Those aren't mutually exclusive, but they both take work to instill in a community and thus have an opportunity cost.
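For the cases where scripting the check is worth the effort, here's roughly what such a script can look like (the build and reproduction commands are hypothetical; git bisect run treats exit code 0 as good, 125 as "skip this commit", and 1 as bad):

    #!/usr/bin/env python3
    # Hypothetical bisect check; run with: git bisect run ./check_bug.py
    import subprocess
    import sys

    def run(cmd):
        return subprocess.run(cmd, capture_output=True, text=True)

    # If this commit doesn't even build, tell bisect to skip it.
    if run(["make"]).returncode != 0:
        sys.exit(125)

    # Made-up reproduction step: probe the behavior from the bug report.
    result = run(["./app", "--reproduce-bug"])
    sys.exit(0 if result.returncode == 0 else 1)  # 0 = good commit, 1 = bad

Even a script this small assumes the bug is cheaply reproducible from the command line, which is exactly the assumption that often doesn't hold.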
This and many other perennial discussions about Git (including TFA) have a common cause: people assume that criticisms/recommendations aimed at using Git as a release coordinator or member of a disconnected team of volunteers apply equally to people who use Git on small, tightly-coupled teams of collaborators (e.g. working on closed-source software).
> If you're in a project where buggy behavior wasn't introduced so much as grew (e.g. the behavior evolved A -> B -> C -> D -> E over time and a bug is reported due to undesirable interactions between released/valuable features in A, C, and E), then bisecting to find "when did this start" won't tell you much that's useful.
I actually think that is the most useful time to use bisect. Since this is a situation where the cause isn't immediately obvious, just reading through the code makes those issues harder to find.
I'm glad it works for you! I may not have described the situation super clearly: most bugs I triage are either very causally shallow (i.e. they line up exactly with a release or merge, or have an otherwise very well-known cause like "negative input in this form field causes ISE on submit"), or else they're causally well understood but not immediately solvable.
For example, take a made-up messaging app. Let's call it ButtsApp. Three big ButtsApp releases happened, in order, adding the features: 1) "send messages"; 2) "oops/undo send"; and 3) "accounts can have multiple users operating on them simultaneously". All of these were deemed necessary features and released over successive months.
Most of the bugs that I've spent lots of time diagnosing in my career are of the interacting-known-features variety. In that example, it would be "user A logs in and sends a message, but user B logs in and can undo the sends of user A" or similar. I don't need bisect to tell me that the issue only became problematic when multi-user support was released, but that release isn't getting rolled back. The code triggering the bug is in the undo-send feature that was released months ago, and the offending/buggy action is from the original send-message feature.
Which commit is at fault? Some combination of "none of them" and "all of them". More importantly: is it useful to know commit specifics if we already know that the bug is caused by the interaction of a bunch of separately-released features? In many cases, the "ballistics" of where a bug was added to the codebase are less important.
Again, there are some projects where bisect is solid gold--projects where the bug triage/queue person is more of a traffic cop than a feature/area owner--but in a lot of other projects, bugs are usually some combination of trivially easy to root-cause and/or difficult to fix regardless of whether the causal commit is identified.
I think that’s primarily a Rust issue, not an LLVM issue. LLVM is at least competitive performance-wise in every case I’ve used it, and is usually the fastest option (given a specific linker behavior) outright. That’s especially true on larger code bases (e.g. chromium, or ZFS).
Rust is also substantially faster to compile than it was a few years ago, so I have some hope for improvements in that area as well.
> I don't care how elegantly my toaster was crafted as long as it toasts the bread and doesn't break.
A consumer or junior engineer cares whether the toaster toasts the bread and doesn’t break.
Someone who cares about their craft also cares about:
- If I turn the toaster on and leave, can it burn my house down, or just set off the smoke alarm?
- Can it toast more than sliced uniform-thickness bread?
- What if I stick a fork in the toaster? What happens if I drop it in the bathtub while on? Have I made the risks of doing that clear in such a way that my company cannot be sued into oblivion when someone inevitably electrocutes themselves?
- Does it work sideways?
- When it fills up with crumbs after a few months of use, is it obvious (without knowing that this needs to be done or reading the manual) that this should be addressed, and how?
- When should the toaster be replaced? After a certain amount of time? When a certain misbehavior starts happening?
Those aren’t contrived questions in service to a tortured metaphor. They’re things that I would expect every company selling toasters to have dedicated extensive expertise to answering.
All those things you’re talking about may or may not matter someday, after years and a class-action lawsuit that may or may not materialize or have any material impact on the bottom line of the company producing the toaster, by which time millions of units of subpar toasters that don’t work sideways will have been sold.
The world is filled with junk. The majority of what fills the world is junk. There are parts of our society where junk isn’t well tolerated (jet engines, MRI machines), but the majority of the world tolerates quite a lot of sloppiness in design and execution, and the companies producing those products are happily profitable.
You really underestimate how much work goes into everything around you. You don't care because it just works: the stuff you use is by and large not crap, which makes the crappy stuff all the more noticeable. Check out the housing code for your area: everything from the size of steps to the materials used for siding is in there. Or look at the FCC specifications for electrical devices, which make sure you don't inadvertently jam radio frequencies in your local area, or the various codes that try very hard to stop you from burning your house down.
You're right that "there are parts of our society where junk isn't well tolerated", but the scope of those areas is far greater than you give credit for.
I'm traveling long-term, mostly through the developing world, where something like 84% of humanity resides.
All around me, the tendency is towards loose approximation and loose standards: people's houses, the roads, the infrastructure, food cultivation and preparation, furniture, vehicles, it goes on and on. Things are constantly breaking, the quality is low, and people are constantly being poisoned by the waste seeping into their water, air, and soil, by the plastic they burn to cook their food, and by the questionable chemicals in the completely unsafe industrial environments where they work to produce toxic products consumed by the masses.
There is no uniform size of steps. Yet the majority of humanity lives this way, and not just tolerates it but considers it a higher standard of living than we've had for the majority of human history.
I don't think people in the first world are a different species, so we will also adapt to whatever shitty environment we regress into as our standards fall. We'll realize that the majority of the areas we may consider sacrosanct are in fact quite negotiable in terms of quality when it comes down to our needs.
All this is to say that yeah, I think people will generally tolerate the quality of software going down just fine.
That's a sad way to think. I'd like to hope that humanity can improve itself, and that includes building products that are safer, more refined, more beautiful, more performant and more useful. I agree that there's a lot of crap out there, but I still want to believe and strive to make things that are excellent. I'm not ready to give up on that. And yes, I still get annoyed every time my crappy toaster doesn't work properly.