
I don't see how any amount of memory technology can overcome the physical realities of locality. The closer you want the data to be to your processor, the less space you'll have to fit it. So there will always be a hierarchy where a smaller amount of data can have less latency, and there will always be an advantage to cramming as much data as you can at the top of the hierarchy.

While that's true, CPUs already have automatically managed caches. It's not too much of a stretch to imagine a world in which RAM is automatically managed as well and you don't have a distinction between RAM and persistent storage. In a spinning-rust world, that never would have been possible, but with modern NVMe, it's plausible.

CPUs manage it, but ensuring your data structures are friendly to how they manage caches is one of the keys to fast programs - which some of us care about.

Absolutely! And it certainly is true that for the most performance-optimized code, having manual cache management would be beneficial, but on the CPU side, at least, we've given up that power in favor of a simpler programming model.

Part of the reason for giving it up is that what's correct changes too fast. Attempts to do this manually often got great results for a year and then made things worse on the next generation of CPU, which did things differently. Anyone who needs manual control would thus need to target a specific CPU and be willing to spend hundreds of millions every year to update for the next CPU - and there is nobody willing to spend that much. The few who would be are better served by putting the important thing into an FPGA, which is going to be faster still for similar costs.

«Memory technology» as in «a single tech» that blends RAM and disk into just «memory» and obviates the need for the disk as a distinct concept.

One can conjure up RAM which has become exabytes large and which does not lose data after a system shutdown. Everything in such a unified memory model is local, promptly available to, and directly addressable by the CPU.

Please do note that multi-level CPU caches still have their place in this scenario.

In fact, this has been successfully done in the AS/400 (or iSeries), which I have mentioned elsewhere in the thread. It works well and is highly performant.


> «Memory technology» as in «a single tech» that blends RAM and disk into just «memory» and obviates the need for the disk as a distinct concept.

That already exists. Swap memory, mmap, disk paging, and so on.

Virtual memory is mostly fine for what it is, and it has been used in practice for decades. The problem that comes up is latency. Access time is limited by the speed of light [1]. And for that reason, CPU manufacturers continue to increase the capacities of the faster, closer memories (specifically registers and L1 cache).

[1] https://www.ilikebigbits.com/2014_04_21_myth_of_ram_1.html


> How many times did you leave a comment on some branch of code stating "this CANNOT happen" and thrown an exception? Did you ever find yourself surprised when eventually it did happen? I know I did, since then I at least add some logs even if I think I'm sure that it really cannot happen.

I'm not sure what the author expects the program to do when there's an internal logic error that has no known cause and no definite recovery path. Further down the article, the author suggests bubbling up the error with a result type, but you can only bubble it up so far before you have to get rid of it one way or another. Unless you bubble everything all the way to the top, but then you've just reinvented unchecked exceptions.

At some level, the simplest thing to do is to give up and crash if things are no longer sane. After all, there's no guarantee that 'unreachable' recovery paths won't introduce further bugs or vulnerabilities. Logging can typically be done just fine within a top-level exception handler or panic handler in many languages.
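
For instance, in Rust (one of the languages discussed further down the thread), a minimal sketch of logging from a top-level panic hook might look like this; the wording of the log line is of course made up:

    use std::panic;

    fn main() {
        // Install a top-level hook that logs panic details (here just to
        // stderr; a real program would hand this to its logging backend)
        // before the process unwinds or aborts.
        panic::set_hook(Box::new(|info| {
            eprintln!("internal error, shutting down: {info}");
        }));

        // Any later panic!, failed assert!, or unreachable! goes through the hook.
        panic!("this CANNOT happen");
    }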


Ideally, if you can convince yourself something cannot happen, you can also convince the compiler, and get rid of the branch entirely by expressing the predicate as part of the type (or a function on the type, etc.)

Language support for that varies. Rust is great, but not perfect. Typescript is surprisingly good in many cases. Enums and algebraic type systems are your friend. It'll never be 100% but it sure helps fill a lot of holes in the swiss cheese.
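
As a minimal Rust sketch (a made-up example, not from the article) of pushing the predicate into the type:

    // Instead of a struct with an optional token plus a boolean logged_in flag
    // (which allows the contradictory state logged_in == true, token == None),
    // encode the invariant in the type: a session is either anonymous or
    // authenticated with a token, and no other combination can be constructed.
    enum Session {
        Anonymous,
        Authenticated { token: String },
    }

    fn greeting(session: &Session) -> String {
        match session {
            Session::Anonymous => "Hello, guest".to_string(),
            // No "this cannot happen" branch: the compiler checks exhaustiveness.
            Session::Authenticated { token } => format!("Hello, user with token {token}"),
        }
    }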

Because there's no such thing as a purely internal error in a well-constructed program. Every "logic error" has to bottom out in data from outside the code eventually-- otherwise it could be refactored to be static. Client input is wrong? Error the request! Config doesn't parse? Better specify defaults! Network call fails? Yeah, you should have a plan for that.


Not every piece of logic lends itself to being expressed in the type system.

Let's say you're implementing a sorting algorithm. After step X you can be certain that the values at locations A, B, and C are sorted such that A <= B <= C. You can be certain of that because you read the algorithm in a prestigious journal, or better, you read it in Knuth and you know someone else would have caught the bug if it was there. You're a diligent reader and you've convinced yourself of its correctness, working through it with pencil and paper. Still, even Knuth has bugs and perhaps you made a mistake in your implementation. It's nice to add an assertion that at the very least reminds readers of the invariant.

Perhaps some Haskeller will pipe up and tell me that any type system worth using can comfortably describe this PartiallySortedList<A, B, C>. But most people have to use systems where encoding that in the type system would, at best, make the code significantly less expressive.
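
For instance, a hypothetical Rust sketch of that kind of assertion around a median-of-three step:

    // We believe that after these three compare-and-swaps the invariant
    // v[lo] <= v[mid] <= v[hi] holds. The type system won't express that
    // cheaply, but a debug assertion documents it and catches mistakes.
    fn median_of_three(v: &mut [i32], lo: usize, mid: usize, hi: usize) {
        if v[mid] < v[lo] { v.swap(lo, mid); }
        if v[hi] < v[mid] { v.swap(mid, hi); }
        if v[mid] < v[lo] { v.swap(lo, mid); }
        debug_assert!(v[lo] <= v[mid] && v[mid] <= v[hi],
            "median-of-three invariant violated");
    }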


Yes, this has been my experience too! Another tool in the toolbox is property / fuzz testing. Especially for data structures, and anything that looks like a state machine. My typical setup is this:

1. Make a list of invariants. (Eg if Foo is set, bar + zot must be less than 10)

2. Make a check() function which validates all the invariants you can think of. It’s ok if this function is slow.

3. Make a function which takes in a random seed. It initializes your object and then, in a loop, calls random mutation functions (using a seeded RNG) and then calls check(). 100 iterations is usually a good number.

4. Call this in an outer loop, trying lots of seeds.

5. If anything fails, print out the failing seed number and crash. This provides a reproducible test so you can go in and figure out what went wrong.

If I had a penny for every bug I’ve found doing this, I’d be a rich man. It’s a wildly effective technique.
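
A rough Rust sketch of that setup (the structure, its invariant, and the mutations are all made up for illustration; it assumes the `rand` crate with its 0.8-style API):

    use rand::{rngs::StdRng, Rng, SeedableRng};

    // Hypothetical structure under test: `check` validates every invariant we
    // can think of (slow is fine), and the mutators exercise the public API.
    #[derive(Default)]
    struct MyStructure { items: Vec<u32> }

    impl MyStructure {
        fn check(&self) {
            assert!(self.items.windows(2).all(|w| w[0] <= w[1]), "items must stay sorted");
        }
        fn insert(&mut self, x: u32) {
            let pos = self.items.partition_point(|&y| y < x);
            self.items.insert(pos, x);
        }
        fn remove_first(&mut self) {
            if !self.items.is_empty() {
                let _ = self.items.remove(0);
            }
        }
    }

    fn run_one_seed(seed: u64) {
        let mut rng = StdRng::seed_from_u64(seed);
        let mut s = MyStructure::default();
        for _ in 0..100 {
            // Pick a random mutation, apply it, then re-check every invariant.
            match rng.gen_range(0..2) {
                0 => s.insert(rng.gen_range(0..1000)),
                _ => s.remove_first(),
            }
            s.check();
        }
    }

    fn main() {
        // Outer loop: try lots of seeds; on failure, print the seed and crash,
        // which gives a reproducible test case.
        for seed in 0..10_000u64 {
            if std::panic::catch_unwind(|| run_one_seed(seed)).is_err() {
                eprintln!("invariant failed with seed {seed}");
                std::process::exit(1);
            }
        }
    }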


This is indeed a great technique. The only way it could be improved is to expand on step 3 by keeping a list of the random mutation functions called and the order in which they were called, then if the test passes you throw that list away and generate a new list with the next seed. But if the test fails then you go through the following procedure to "shrink" the list of mutations down to a minimal (or nearly minimal) repro:

1. Drop the first item in the list of mutations and re-run the test.

2. If the test still fails and the list of mutations is not empty, goto step 1.

3. If the test passes when you dropped the first item in the mutation list, then that was a key part of the minimal repro. Add it to a list of "required for repro" items, then repeat this whole process with the second (and subsequent) items on the list.

In other words, go through that list of random mutations and, one at a time, check whether that particular mutation is part of the scenario that makes the test fail. This is not guaranteed to reach the smallest possible minimal repro, but it's very likely to reach a smallish repro. Then in addition to printing the failing seed number (which can be used to reproduce the failure by going through that shrinking process again), you can print the final, shrunk list of mutations needed to cause the failure.

Printing the list of mutations is useful because then it's pretty simple (most of the time) to turn that into a non-RNG test case. Which is useful to keep around as a regression test, to make sure that the bug you're about to fix stays fixed in the future.
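
A hypothetical Rust sketch of that shrinking loop, where `fails` re-runs the whole scenario for a given list of mutation IDs and reports whether the invariant check still trips:

    // Greedily drop mutations that aren't needed to reproduce the failure.
    // Not guaranteed to be minimal, but usually ends up small.
    fn shrink(mutations: &[u32], fails: impl Fn(&[u32]) -> bool) -> Vec<u32> {
        let mut kept: Vec<u32> = mutations.to_vec();
        let mut i = 0;
        while i < kept.len() {
            // Candidate = current list with mutation i removed.
            let mut candidate = kept.clone();
            let _ = candidate.remove(i);
            if fails(&candidate) {
                // Still reproduces the failure without it: drop it for good.
                kept = candidate;
            } else {
                // The test passes without it, so it's part of the repro: keep it.
                i += 1;
            }
        }
        kept
    }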


There is no inherent benefit in going and expressing that fact in a type. There are two potential concerns:

1) You think this state is impossible but you've made a mistake. In this case you want to make the problem as simple to reason about as possible. Sometimes types can help but other times it adds complexity when you need to force it to fit with the type system.

People get too enamored with the fact that immutable objects or certain kinds of types are easier to reason about, other things being equal, and miss the fact that the same logic can be expressed in any Turing-complete language. These tools only result in a net reduction in complexity if they are a good conceptual match for the problem domain.

2) You are genuinely worried about the compiler or CPU not honoring its theoretical guarantees -- in this case rewriting it only helps if you trust the code compiling those cases more for some reason.


I think those concerns are straw men. The real concern is that the invariants we rely on should hold when the codebase changes in the future. Having the compiler check that automatically, quickly, and definitively every time is very useful.

This is what TFA is talking about with statements like "the compiler can track all code paths, now and forever."


Sometimes the "error" is more like, "this is a case that logically could happen but I'm not going to handle it, nor refactor the whole program to stop it from being expressible"

Until you have a bit flip or a silicon error. Or someone changed the floating point rounding mode.

Funny you should mention the floating point rounding mode, I actually had to fix a bug like that once. Our program worked fine, until you printed to an HP printer - then it crashed shortly after. It took forever to discover the cause - the printer driver was changing the floating point rounding mode and not restoring it. The fix was to set the mode to a known value each and every time after you printed something.

That is amazingly devious. Well done HP.

> Until you have a bit flip

These are vanishingly unlikely if you mostly target consumer/server hardware. People who code for environments like satellites, or nuclear facilities, have to worry about it, sure, but it's not a realistic issue for the rest of us.


Bitflips are waaay more common than you think they are. [0]

> A 2011 Black Hat paper detailed an analysis where eight legitimate domains were targeted with thirty one bitsquat domains. Over the course of about seven months, 52,317 requests were made to the bitsquat domains.

[0] https://en.wikipedia.org/wiki/Bitsquatting


> Bitflips are waaay more common than you think they are... Over the course of about seven months, 52,317 requests...

Your data does not show them to be common - less than 1 in 100,000 computing devices seeing an issue during a 7 month test qualifies as "rare" in my book (and in fact the vast majority of those events seem to come from a small number of server failures).

And we know from Google's datacenter research[0] that bit flips are highly-correlated hard failures (i.e. they tend to result from a faulty DRAM module, and so affect a small number of machines repeatedly).

It's hard to pin down numbers for soft failures, but it seems to be somewhere in the realm of 100 events/gigabyte/year - and that's before any of the many ECC mechanisms do their thing. In a practical sense, no consumer software worries about bit flips in RAM (whereas bit flips in storage are much more likely, hence checksumming DB rows, etc).

[0]: https://static.googleusercontent.com/media/research.google.c...


1 in 100,000 devices is about 1 in 40,000 customers, due to how many devices most people own.

Which means that if you're a medium-sized business or above, one of your customers will see this about once a year.

That classifies more as "inevitable" than "rare" in my book.


> That classifies more as "inevitable" than "rare" in my book.

But also pretty much insignificant. Is any other component in your product achieving 5 9s reliability?


We're not talking 5 9s, here.

> ... A new consumer grade machine with 4GiB of DRAM, will encounter 3 errors a month, even assuming the lowest estimate of 120 FIT per megabit.

The guarantee offered by our hardware suppliers today is not "never happens" but "accounted for in software".

So, if you ignore it, and start to operate at any scale, you will start to see random irreproducible faults.

Sure, you can close all tickets as user error or unable to reproduce. But it isn't the user at fault. Account for it, and your software has fewer glitches than the competitor's.


> We're not talking 5 9s, here.

1 in 40,000 customer devices experiencing a failure annually is considerably better than 4 9s of reliability. So we are debating whether going from 4 9s to 5 9s is worth it.

And like, sure, if the rest of your stack is sufficiently polished (and your scale is sufficiently large) that the once-a-year bit flip event becomes a meaningful problem... then by all means do something about it.

But I maintain that the vast majority of software developers will never actually reach that point, and there are a lot of lower-hanging fruit on the reliability tree.



Of course, any attempt at safety or security requires defense in depth.

But usually, any effort spent on making one layer sturdy is worth it.


A comment "this CANNOT happen" has no value in itself. Unless you've formally verified the code (including its dependencies) and have the proof linked, such comments may as well be wishes and prayers.

Yes, sometimes the compiler or the hardware have bugs that violate the premises you're operating on, but that's rare. But most non-pure algorithms (side effects and external systems) have documented failure cases.


> A comment "this CANNOT happen" has no value in itself.

I think it does have some value: it makes clear an assumption the programmer made. I always appreciate it when I encounter comments that clarify assumptions made.


But if you spell that as `assert(false)` instead of as a comment, the intent is equally clear, but the behavior when you're wrong is well-defined.

I agree that including that assert along with the comment is much better. But the comment alone is better than nothing, so isn't without value.

Better yet, `assert(false, message)`, with the message what you would have written in the comment.

`assert(false)` is pronounced "this can never happen." It's reasonable to add a comment with /why/ this can never happen, but if that's all the comment would have said, a message adds no value.

Oh I agree, literally `assert(false, "This cannot happen")` is useless, but ensuring message is always there encourages something more like, `assert(false, "This implies the Foo is Barred, but we have the Qux to make sure it never is")`.

Ensuring a message encourages people to state the assumptions that are violated, rather than just asserting that their assumptions (which?) don't hold.


What language are we talking about? If it's C++ then the pronunciation depends on compiler flags (perhaps inferred from CMAKE_BUILD_TYPE).

Swap the parameters around for C++ and similar langs where `assert(a, b)` evaluates the same as `(void) a; assert(b)`.

I think you might have missed that they threw an exception right under the comment.

At least on iOS, asserts become no-ops on release builds

It really depends on the language you use. Personally I like the way Rust does this:

- assert!() (always checked),

- debug_assert!() (only run in debug builds)

- unreachable!() (panics)

- unsafe unreachable_unchecked() (tells the compiler it can optimise assuming this is actually unreachable)

- if cfg!(debug_assertions) { … } (Turns into if(0){…} in release mode. There’s also a macro variant if you need debug code to be compiled out.)

This way you can decide on a case by case basis when your asserts are worth keeping in release mode.

And it's worth noting, sometimes a well-placed assert before the start of a loop can improve performance thanks to LLVM.
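
A small (made-up) illustration of that last point:

    // A well-placed assert can let LLVM drop per-iteration bounds checks:
    // after asserting b.len() >= a.len(), every b[i] below is provably in
    // range, so the loop body no longer needs a panic branch inside it.
    pub fn add_into(a: &mut [f32], b: &[f32]) {
        assert!(b.len() >= a.len());
        for i in 0..a.len() {
            a[i] += b[i];
        }
    }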


> debug_assert!() (only run in debug builds)

debug_assert!() (and its equivalent in other languages, like C's assert with NDEBUG) is cursed. It states that you believe something to be true, but will take no automatic action if it is false; so you must implement the fallback behavior if your assumption is false manually (even if that fallback is just fallthrough). But you can't /test/ that fallback behavior in debug builds, which means you now need to run your test suite(s) in both debug and release build versions. While this is arguably a good habit anyway (although not as good a habit as just not having separate debug and release builds), deliberately diverging behavior between the two, and having tests that only work on one or the other, is pretty awful.


I hear you, but sometimes this is what I want.

For example, I’m pretty sure some complex invariant holds. Checking it is expensive, and I don’t want to actually check the invariant every time this function runs in the final build. However, if that invariant were false, I’d certainly like to know that when I run my unit tests.

Using debug_assert is a way to do this. It also communicates to anyone reading the code what the invariants are.

If all I had was assert(), there’s a bunch of assertions I’d leave out of my code because they’re too expensive. debug_assert lets me put them in without paying the cost.

And yes, you should run unit tests in release mode too.


But how do you test the recovery path if the invariant is violated in production code? You literally can’t write a test for that code path…

There is no recovery. When an invariant is violated, the system is in a corrupted state. Usually the only sensible thing to do is crash.

If there's a known bug in a program, you can try and write recovery code to work around it. But it's almost always better to just fix the bug. Small, simple, correct programs are better than large, complex, buggy programs.


> Usually the only sensible thing to do is crash.

Correct. But how are you testing that you successfully crash in this case, instead of corrupting on-disk data stores or propagating bad data? That needs a test.


> Correct. But how are you testing that you successfully crash

In a language like rust, failed assertions panic. And panics generally aren't "caught".

> instead of corrupting on-disk data stores

If your code interacts with the filesystem or the network, you never know when a network cable will be cut or power will go out anyway. You're always going to need testing for inconvenient crashes.

IMO, the best way to do this is by stubbing out the filesystem and then using randomised testing to verify that no matter what the program does, it can still successfully open any written (or partially written) data. It's not easy to write tests like that, but if you actually want a reliable system they're worth their weight in gold.


> In a language like rust, failed assertions panic. And panics generally aren't "caught".

This thread was discussing debug_assert, where the assertions are compiled out in release code.


Ah I see what you're saying.

I think the idea is that those asserts should never be hit in the first place, because the code is correct.

In reality, it's a mistake to add too many asserts to your code. Certainly not so many that performance tanks. There's always a point where, after doing what you can to make your code correct, at runtime you gotta trust that you've done a good enough job and let the program run.


You don't. Assertions are assumptions. You don't explicitly write recovery paths for individual assumptions being wrong. Even if you wanted to, you probably wouldn't have a sensible recovery in the general case (what will you do when the enum that had 3 options suddenly comes in with a value 1000?).

I don't think any C programmer (where assert() is just debug_assert!() and there is no assert!()) is writing code like:

    assert(arr_len > 5);
    if (arr_len <= 5) {
        // do something
    }
They just assume that the assertion holds and hope that something would crash later and provide info for debugging if it didn't.

Anyone writing with a standard that requires 100% decision-point coverage will either not write that code (because NDEBUG is insane and assert should have useful semantics), or will have to both write and test that code.

You can (and probably should) undef NDEBUG even for release builds.

>> A comment "this CANNOT happen" has no value in itself.

> I think it does have some value: it makes clear an assumption the programmer made.

To me, a comment such as the above is about the only acceptable time to either throw an exception (in languages which support that construct) or otherwise terminate execution (such as exiting the process). If further understanding of the problem domain identifies what was thought impossible to be rare or unlikely instead, then introducing use of a disjoint union type capable of producing either an error or the expected result is in order.

Most of the time, "this CANNOT happen" falls into the category of "it happens, but rarely" and is best addressed with types and verified by the compiler.


Importantly, specifying reasoning can have communicative value while falling very far short of formal verification. Personally, I also try to include a cross reference to the things that could allow "this" to happen were they to change.

Such comments rot so rapidly that they're an antipattern. Such assumptions are dangerous and I would point it out in a PR.

Do you not make such a tacit assumption every time you index into an array (which in almost all languages throws an exception on bounds failure)? You always have to make assumptions that things stay consistent from one statement to the next, at least locally. Unless you use formal verification, but hardly anyone has the time and resources for that.

> Do you not make such a tacit assumption every time you index into an array (which in almost all languages throws an exception on bounds failure)?

Yes, which is one reason why decent code generally avoids doing that.


Are you saying decent code avoids indexing into arrays? Or are you saying it avoids doing so without certainty the bounds checks will succeed?

Decent code generally avoids indexing into arrays at all; if it does so then it does so in ways where the bound checks are so certain to succeed that you can usually explain it to the compiler (e.g. split an array into slices and access those slices).

That is what I thought you were saying, and it doesn't make much sense to me. AFAICT the point of arrays as a data structure is to allow relatively cheap indexing at the cost of more expensive resizing operations. What else would you do with an array other than index into it?

> e.g. split an array into slices and access those slices

How is this not indexing with a little abstraction? Aren't slices just a way of packaging an array with a length field in a standard way? I'm not aware of many array implementations without a length (and usually also capacity) field somewhere, so this seems like a mostly meaningless distinction (i.e. all slices are arrays, right).


> AFAICT the point of arrays as a data structure is to allow relatively cheap indexing at the cost of more expensive resizing operations. What else would you do with an array other than index into it?

The main thing you do with arrays is bulk operations (e.g. multiply it by something), which doesn't require indexing into it. But yeah, I think they're a fairly niche data structure that shouldn't be privileged the way certain languages do.

> How is this not indexing with a little abstraction? Aren't slices just a way of packaging an array with a length field in a standard way?

Sure (well, offset and length rather than just a length) but the abstraction is safe whereas directly indexing into the array isn't.
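
A small hypothetical Rust example of the distinction being drawn:

    // The indexed version has a panic path that "cannot happen" and relies on
    // the caller never passing an empty slice.
    fn first_doubled_indexed(data: &[u32]) -> u32 {
        data[0] * 2
    }

    // The slice-based version makes the empty case a value the caller must
    // handle, so there is no unreachable error path to assert about.
    fn first_doubled(data: &[u32]) -> Option<u32> {
        data.split_first().map(|(first, _rest)| *first * 2)
    }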


If such an error happens, that would be a compiler bug. Why? Because I usually do checks against the length of the array or have it done as part of the standard functions like `map`. I don't write such assumptions unless I'm really sure about the statements, and even then I don't.

How does one defend against cosmic rays?

Keep two copies or three like RAID?

Edit: ECC ram helps for sure, but what else?


>How does one defend against cosmic rays?

Unless you are in the extremely small minority of people who would actually be affected by it (in which case your company would already have bought ECC RAM and made you work with three isolated processes that need to agree to proceed): you don't. You eat shit, crash and restart.


Well, bit-flip errors are more of a vulnerability for longer-lived values. This could affect Fukushima-style robots or even medical equipment. ECC implemented outside of RAM would save you versus keeping triplicate copies, but it was just a question related to the above idea of an array access being assumed to be in-bounds. Thank you.

You run equivalent or equal calculations simultaneously on N computers and take majority wins, aircraft control or distributed filesystem style.

> or have it done as part of the standard functions like `map`.

Which are all well and good when they are applicable, which is not always 100% of the time.

> Because I usually do checks against the length of the array

And what do you have your code do if such "checks" fail? Throw an assertion error? Which is my whole point, I'm advocating in favor of sanity-check exceptions.

Or does calling them "checks" instead of "assumptions" magically make them less brittle from surrounding code changes?


A comment has no semantic value to the code. Having code that checks for things is different from writing comments: the checks are executed by the machine, not just read by other humans.

Of course you should put down a real assertion when you have a condition that can be cheaply checked (or even an assert(false) when the language syntax dictates an unreachable path). I'm not trying to argue against that, and I don't think anyone else here is either.

I was mainly responding to TFA, which states "How many times did you leave a comment on some branch of code stating 'this CANNOT happen' and thrown an exception" (emphasis mine), i.e., an assertion error alongside the comment. The author argues that you should use error values rather than exceptions. But for such sanity checks, there's typically no useful way to handle such an error value.


Do you really have code that's

if array.Len > 2 { X = Y[1] }

For every CRUD to that array?

That seems... not ideal


Yes. Unless there's some statement earlier that verifies that the array has 2 items. It's quick to do, so why not do it?

False, it has value. It's actually even better to log it or throw an exception: print("this cannot happen.")

If you see it, you immediately know the class of error: it's purely a logic error, the programmer made a programming mistake. Logging it makes it explicit that your program has a logic bug.

What if you didn’t log it? Then at runtime you will have to deduce the error from symptoms. The log tells you explicitly what the error is.


Worse: You may have created the proof. You may have linked to the proof. But if anyone has touched any of the code involved since then, it still has no value unless someone has re-done the proof and linked that. (Worse, it has negative value, because it can mislead.)

Not really. A quick git blame (or alternative) will give you the required information about the validity of such proof.

You must have some git plugin I haven’t heard about.

Git blame will show the commit and the date for each line. It's easy to verify if the snippet has changed since the comment. I use Emacs and its built-in vc package, which color-codes each block.

But you need the snippet and, potentially, the entire call tree (both up and down).

And anything that can affect relevant state, any dependencies that may have changed, validations to input that may have been modified; it’s hard to know without knowing what assumptions the assertion is based on.

>Further down the article, the author suggests bubbling up the error with a result type, but you can only bubble it up so far before you have to get rid of it one way or another. Unless you bubble everything all the way to the top, but then you've just reinvented unchecked exceptions.

Not necessarily. Result types are explicit and require the function signature to be changed for them.

I would much prefer to see a call to foo()?; where it's explicit that it may bubble up from here, instead of a call to foo(); that may or may not throw an exception my way with no way of knowing.

Rust is absolutely not perfect with this though since any downstream function may panic!() without any indication from its function signature that it could do so.


> At some level, the simplest thing to do is to give up and crash if things are no longer sane.

The problem with this attitude (that many of my co-workers espouse) is that it can have serious consequences for both the user and your business.

- The user may have unsaved data

- Your software may gain a reputation of being crash-prone

If a valid alternative is to halt normal operations and present an alert box to the user saying "internal error 573 occurred. please restart the app", then that is much preferred IMO.


> If a valid alternative is to halt normal operations and present an alert box to the user saying "internal error 573 occurred. please restart the app", then that is much preferred IMO.

You can do this in your panic or terminate handler. It's functionally the same error handling strategy, just with a different veneer painted over the top.


Crashing is bad, but silently continuing in a corrupt state is much worse. Better to lose the last few hours of the user's work than corrupt their save permanently, for example.

> Your software may gain a reputation of being crash-prone

Hopefully crashing on unexpected state rather than silently running on invalid state leads to more bugs being found and fixed during development and testing and less crash-prone software.


So you don't get a crash log? No, thanks.

- The user may have unsaved data

That should not need to be a consideration. Crashing should restore the state from just before the crash. This isn't the '90s, users shouldn't have to press "save" constantly to avoid losing data.


This is what Rust's `unreachable!()` is for... and I feel hubris whenever I use it.

You should prefer to write unreachable!("because ...") to explain to some future maintenance engineer (maybe yourself) why you believed this would never be reached. Since they know it was reached they can compare what you believed against their observed facts and likely make better decisions.

But at least telling people that the programmer believed this could never happen short-circuits their investigation considerably.


Heh, recently I had to fix a bug in some code that had one of these comments. Feels like a sign of bad code or laziness. Why make a path that should not happen? I can get it when it's in some while loop that should find something to return, but in an if-else sequence it feels really wrong.

Strong disagree about laziness. If the dev is lazy they will not make a path for it. When they are not lazy they actually make a path and write a comment explaining why they think this is unreachable. Taking the time to write a comment is not a sign of laziness. It’s the complete opposite. You can debate whether the comment is detailed enough to convey why the dev thinks it’s unreachable, but it’s infinitely better than no comment and leaving the unreachability in their head.

Laziness might or might not be involved in either path.

A stupid developer might not even contemplate that something could happen.

A smarter developer might contemplate that possibility, but discount it, by adding the path and comment.

Why is it discounted? Probably just priorities, more pressing things to do.


Before sealed classes and ultra-robust type checking, a private function would sometimes have, say, 3 states that should be possible, and then 3 years later a new state is added but isn't checked, because the compiler didn't stop it - the language didn't support that at the time.

It's much better to have a `panic!("this should never happen")` statement than to let your program get into an inconsistent state and then keep going. Ideally, you can use your type system to make inconsistent states impossible, but type systems can only express so much. Even Haskell can't enforce typeclass laws at the compiler level.

A program that never asserts its invariants is much more likely to be a program that breaks those invariants than a program that does.


But to link against an old glibc version, you need to compile on an old distro, on a VM. And you'll have a rough time if some part of the build depends on a tool too new for your VM. It would be infinitely simpler if one could simply 'cross-compile' down to older symbol versions, but the tooling does not make this easy at all.

Check out `zig cc`. It lets you target specific glibc versions. It's a pretty amazing C toolchain.

https://andrewkelley.me/post/zig-cc-powerful-drop-in-replace...


It's actually doable without an old glibc as it was done by the Autopackage project: https://github.com/DeaDBeeF-Player/apbuild

That never took off though; containers are easier. With distrobox and other tools this is quite easy, too.


> It would be infinitely simpler if one could simply 'cross-compile' down to older symbol versions, but the tooling does not make this easy at all.

It's definitely not easy, but it's possible: using the `.symver` assembly (pseudo-)directive you can specify the version of the symbol you want to link against.


Huh? Bullshit. You could totally compile and link in a container.

Ok, so you agree with him except where he says “in a VM” because you say you can also do it “in a container”.

Of course, you both leave out that you could do it “on real hardware”.

But none of this matters. The real point is that you have to compile on an old distro. If he left out “in a VM”, you would have had nothing to correct.


I'm not disagreeing that glibc symbol versioning could be better. I raised it because this is probably one of the few valid use cases for containers where they would have a large advantage over a heavyweight VM.

But it's like complaining that you might need a VM or container to compile your software for Win16 or Win32s. Nobody is using those anymore. Nor really old Linux distributions. And if they do, they're not really going to complain about having to use a VM or container.

As a C/C++ programmer, the thing I notice is ... the people who complain about this most loudly are the web dev crowd who don't speak C/C++, when some ancient game doesn't work on their obscure Arch/Gentoo/Ubuntu distribution and they don't know how to fix it. Boo hoo.

But they'll happily take a paycheck for writing a bunch of shit Go/Ruby/PHP code that runs on Linux 24/7 without downtime - not because of the quality of their code, but due to the reliability of the platform at _that_ particular task. Go figure.


> But they'll happily take a paycheck for writing a bunch of shit Go/Ruby/PHP code that runs on Linux 24/7 without downtime - not because of the quality of their code, but due to the reliability of the platform at _that_ particular task.

But does the lack of a stable ABI have any (negative) effect on the reliability of the platform?


Only for people who want to use it as a desktop replacement for Windows or MacOS I guess? There are no end of people complaining they can't get their wifi or sound card or trackpad working on (insert-obscure-Linux-distribution-here).

Like many others, I have Linux servers running over 2000-3000 days uptime. So I'm going to say no, it doesn't, not really.


>As a C/C++ programmer, the thing I notice is ... the people who complain about this most loudly are the web dev crowd who don't speak C/C++, when some ancient game doesn't work on their obscure Arch/Gentoo/Ubuntu distribution and they don't know how to fix it. Boo hoo.

You must really be behind the times. Arch and Gentoo users wouldn't complain because an old game doesn't run. In fact the exact opposite would happen. It's not implausible for an Arch or Gentoo user to end up compiling their code on a five-hour-old release of glibc and thereby maximize glibc incompatibility with every other distribution.



> But most colours can be associated with a primary wavelength… except purple. So by that definition, they don’t really exist.

And white, and black. Physically, you'll always have a measurable spectrum of intensities, and some such spectra are typically perceived as "purple". There's no need to pretend that light can only exist in "primary wavelengths".

Even if there's no empirical way to extract some 'absolute' mental notion of perceived color, we can get a pretty solid notion of perceived differences in color, from which we can map out models of consensus color perception.


Is offering an alternative to LLVM not precisely one of the purposes of the rustc_codegen_cranelift backend [0]? It still doesn't have 100% feature parity, but I believe it's able to fully bootstrap the compiler at this point. Writing a rustc backend isn't trivial, but it isn't as impossible as you make it out to be.

[0] https://github.com/rust-lang/rustc_codegen_cranelift


I’m not sure what I wrote to give the impression that Rust was unable to write a compiler, let alone implied it was impossible. Rust is certainly full featured enough to write a very well performing compiler. I find my comment more an indictment, and viewed uncharitably an accusation of hypocrisy, of the language org’s oversight that they are so heavily invested in LLVM (but if I was leveling such an accusation it would not be just because it’s a C++ project)

My comment was focused on the fact that Rust is not using a Rust compiler and therefore is relying on deep and complex C++ infrastructure while working to supplant the same at the lowest levels of the computing stack.

I was also commenting, up the thread, in a chain of comments about a perceived shortcoming of Rust’s implementation (i.e. it’s not being bootstrapped) and why some people view that as a negative.


Doing a few things at a time is hardly an indictable offense. Self-compilation doesn't have to be anywhere near the start of the todo list. Relying on C++ infrastructure at compile time isn't a problem until you're trying to make all-purpose C++-free installs, and that's an entirely optional goal. The important part is having the runtime written in Rust.

Pointing out a language still needs C++ at compile time is a reasonable critique of "supplanting C++", but it's not a reasonable critique of "wanting to be a foundational language of the computing stack". Rust is the latter. (And even then it's too early to worry about compilers.) (And Rust is making good progress on compilers anyway.)


All of the front-end is in fact pure Rust; I know that because I am one of the huge number of authors. The backend, thus the code generation and many optimisations of the sort AoCO is about, is LLVM.

We absolutely know that if Rust didn't offer LLVM you'd see C++ people saying "Rust doesn't even have a proper optimiser". So now what you're asking for isn't just "a Rust backend" which exists as others have discussed, but a Rust alternative to LLVM, multiple targets, lots of high quality optimisations, etc. presumably as a drop-in or close approximation since today people have LLVM and are in production with the resulting code.


Ignore them. Keep going. Debates like these are "Since you said X, Y can't be true" kind of debates. As long as you have access to assembly, you should be able to do this. I say you because this is way out of my wheelhouse. I just want a cleaner, less minefield-laden, OO language that compiles to machine code. That's it. We can stick a feather in this until this time a decade from now when we complain about it again.


> Mathematicians do care about how much "black magic" they're invoking, and like to use simple constructions where possible (the field of reverse mathematics makes the central object of study).

I'd be careful about generalizing that to all or most 'mathematicians'. E.g., people working in a lot of fields won't bat an eye at invoking the real numbers when the rational or algebraic numbers would do.


I'm sure some python devs care about cache misses too. I guess my point was that the big results will be picked over again and again to understand _exactly_ which conditions are needed for them to hold.


To be fair, in some fields I've seen arguments between "a widget should be defined as ABC" vs. "a widget should be defined as XYZ", to the point that I wonder how they're able to read papers about widgets at all. (If I had to guess, likely by focusing on the 'happy path' where the relevant properties hold, filling in arguments according to their favored viewpoint, and tacitly cutting out edge cases where the definitions differ.)

So if many mathematicians can go without fixed definitions, then they can certainly go without fixed foundations, and try to 'fix everything up' if something ever goes wrong.


In my experience those debates are usually between experts who deeply understand the difference between ABC and XYZ widgets (the example I'm thinking of in my head is whether manifolds should be paracompact). The decision between the two is usually an aesthetic one. For example, certain theorems might be streamlined if you use the ABC definition instead of the XYZ one, at the cost of generality.

But the key is that proponents of both definitions can convert freely between the two in their understandings.


You can't turn a capturing C++ lambda into a WNDPROC, which is an ordinary function pointer. You'd still have to ferry the lambda via a context pointer, which is what this blog post and the other solutions in the comments are all about.


You kind of can; that is one of their design points. Naturally, you need to move the context into the body and know what to cast back from.

I guess I need to prove a point on my Github during next week.


If you mean that you can call a C++ lambda from a static C callback via a context pointer, of course you can do that, it's not very mind-boggling. Rust FFI libraries similarly have to do that trick all the time to turn a closure into a C callback. The primary problem with WNDPROC is how to get that context pointer in the first place, which is the part that OP and everyone in the comments are talking about.



I assume by "move the context into the body" you mean using GetWindowLongPtr? Why not just use a static wndproc at that point?


I mean using a static C++ lambda that moves the context into the lambda body via capture specifier.

C++ lambdas are basically old-style C++ functors that are compiler-generated, with the calling address being the operator().


That doesn't sound like a valid WNDPROC.



It looks like you missed the part where you "move the context into the lambda body via capture specifier."


It looks like the author has a pretty simple procedure for computing the 'identity' sandpile (which they unfortunately don't describe at all):

1. Fill a grid with all 6s, then topple it.

2. Subtract the result from a fresh grid with all 6s, then topple it.

So effectively it's computing 'all 6s' - 'all 6s' to get an additive identity. But I'm not entirely sure how to show this always leads to a 'recurrent' sandpile.

EDIT: One possible route: The 'all 3s' sandpile is reachable from any sandpile via a sequence of 'add 1' operations, including from its own successors. Thus (a) it is a 'recurrent' sandpile, (b) adding any sandpile to the 'all 3s' sandpile will create another 'recurrent' sandpile, and (c) all 'recurrent' sandpiles must be reachable in this way. Since by construction, our 'identity' sandpile has a value ≥ 3 in each cell before toppling, it will be a 'recurrent' sandpile.
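
A rough Rust sketch (not the author's code) of that recipe, using the usual toppling threshold of 4 grains on an n x n grid:

    // Repeatedly topple every cell holding 4 or more grains; grains pushed
    // off the edge of the grid are lost. Terminates with every cell <= 3.
    fn topple(grid: &mut Vec<Vec<u32>>) {
        let n = grid.len();
        loop {
            let mut stable = true;
            for r in 0..n {
                for c in 0..n {
                    if grid[r][c] >= 4 {
                        stable = false;
                        grid[r][c] -= 4;
                        if r > 0 { grid[r - 1][c] += 1; }
                        if r + 1 < n { grid[r + 1][c] += 1; }
                        if c > 0 { grid[r][c - 1] += 1; }
                        if c + 1 < n { grid[r][c + 1] += 1; }
                    }
                }
            }
            if stable { break; }
        }
    }

    // Identity = topple('all 6s' - topple('all 6s')). After toppling, every
    // cell holds at most 3 grains, so the subtraction below can't underflow.
    fn identity(n: usize) -> Vec<Vec<u32>> {
        let mut sixes = vec![vec![6u32; n]; n];
        topple(&mut sixes);
        let mut id = vec![vec![6u32; n]; n];
        for r in 0..n {
            for c in 0..n {
                id[r][c] -= sixes[r][c];
            }
        }
        topple(&mut id);
        id
    }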


If you wanted to avoid <string.h>, you could use the poor man's strlen(), snprintf(0,0,"%s",argv[1]). For full input validation without adding any more statements, the best I can get (in ISO C) is

      uint8_t number = (argc<2||sscanf(*++argv,"%*[0123456789]%n",&argc)||argc--[*argv]?printf("bad\n"):argc[*argv])-'0'; // No problems here
Though with either function you may run into issues if the OS allows arguments longer than INT_MAX. To be defensive, you could use "%32767s" or "%*32767[0123456789]%n" instead, at the cost of failing inputs longer than 32KiB.


Marvellous, love it. Thank you.

