Yeah, numerous times I've replied to a comment online, to add supporting context, and it's been interpreted as a retort. So now I prefix them with 'Yeah, '.
Seems shortsighted (I'm not saying you're wrong, I can imagine Intel being shortsighted). Surely the advantage of artificial segmentation is that it's artificial: you don't double up the R&D costs.
Maybe they thought they would just freeze x86 architecturally going forward and Itanium would be nearly all future R&D. Not a bet I would have taken but Intel probably felt pretty unstoppable back then.
Not this time, but the rewrite was certainly implicated in the previous one. They actually had two versions deployed; when faced with an unexpectedly large configuration file, the old version degraded gracefully, while the new version failed catastrophically.
Both versions were caught off guard by the defective configuration they fetched; it was not a case of a previously fixed bug reappearing, as in the blog post you quoted.
So you claim that the compiler "knows about this but doesn't optimize because of some safety measures"? As far as I remember, compilers don't rearrange math expressions / brackets, probably because the order of operations can affect the precision of ints/floats, and also because of the added complexity.
But my example is trivial (x % 2 == 0 && x % 3 == 0 is exactly the same as x % 6 == 0 for every C/C++ int), yet the compiler produced different outputs (and most likely is_divisible_by_6 is slower). Also, what null checks (you mean 0?) are you talking about? The denominator is not null/0. Regardless, my point about not over-relying on compiler optimization (especially for macro-level algorithms (big-O) and math expressions) remains valid.
> the order of operations might affect the precision of ints/floats
That's only a problem for floats; with ints this issue doesn't exist.
Why do you write (x % 2 == 0 && x % 3 == 0) instead of (x % 2 == 0 & x % 3 == 0), when the latter is what you think you mean?
Are you sure that dividing by 6 is actually faster than dividing by 2 and 3? A division operation is quite costly compared to other arithmetic, and 2 and 3 are likely to have special optimizations (a divisibility test by 2 is a single bit check), which isn't necessarily the case for 6.
> That's only the problem of floats, with ints this issue doesn't exist.
With ints the results can be dramatically different (often even worse than with floats), even though in pure mathematics the order doesn't matter:
1 * 2 * 3 * 4 / 8 --> 3
3 * 4 / 8 * 1 * 2 --> 2
This is a trivial example, but it shows why it's extremely hard for compilers to optimize expressions and why they usually leave this task to humans.
But x % 2 == 0 && x % 3 == 0 isn't such a case: swapping the operands of && has no side effects, nor does swapping the operands of each ==.
> Are you sure, that dividing by 6 is actually faster
Compilers usually transform divisions into multiplications when the denominator is a constant.
I wrote another example in another comment, but I'll write it again. I also tried this:
bool is_divisible_by_15(int x) {
    return x % 3 == 0 && x % 5 == 0;
}

bool is_divisible_by_15_optimal(int x) {
    return x % 15 == 0;
}
is_divisible_by_15 still has a branch, while is_divisible_by_15_optimal does not:
is_divisible_by_15(int):
        imul    eax, edi, -1431655765
        add     eax, 715827882
        cmp     eax, 1431655764
        jbe     .LBB0_2
        xor     eax, eax
        ret
.LBB0_2:
        imul    eax, edi, -858993459
        add     eax, 429496729
        cmp     eax, 858993459
        setb    al
        ret

is_divisible_by_15_optimal(int):
        imul    eax, edi, -286331153
        add     eax, 143165576
        cmp     eax, 286331153
        setb    al
        ret
My point is that the compiler still doesn't notice that the two functions are equivalent. Even when choosing 3 and 5 (to eliminate the questionable single-bit-check trick for 2), the first function comes out less optimal (more code plus a branch).
I don't perceive that as an ordering issue. "Pure mathematics" has multiple definitions of division; what we see here is the one you learn in primary school: integer division. The issue here is not associativity; it is that the inverse of an integer division is NOT integer multiplication. To recover the dividend you need the quotient times the divisor plus the remainder. Integer division is an information-destroying operation.
> I wrote another example in another comment, but I'll write it again. [...]
Yes, this is because optimizing compilers are not optimizers in the mathematical sense, but collections of heuristics and folk wisdom. This doesn't make them any less impressive.
> I don't perceive that as an ordering issue. "Pure mathematics" has multiple definitions of division; what we see here is the one you learn in primary school: integer division. The issue here is not associativity; it is that the inverse of an integer division is NOT integer multiplication. To recover the dividend you need the quotient times the divisor plus the remainder. Integer division is an information-destroying operation.
Agreed, I went too far with integer division. But a similar problem exists for floats as well. In abstract mathematics the order of some operations on real numbers doesn't matter, but since CPU floats have limited size and accuracy, it does. This is why, when summing a decreasing convergent series, you should start from the smallest terms: accuracy is lost during float normalization when a tiny term is added to an already large accumulated sum. A compiler is unlikely to do any optimization here, and people should be aware of this. Compilers can't guess the intention behind your code, so they make sure program behavior isn't affected by their optimizations.
> Yes, this is because optimizing compilers are not optimizers in the mathematical sense, but collections of heuristics and folk wisdom. This doesn't make them any less impressive.
I'm not implying that they're not impressive; I'm implying that compilers still aren't magic wands, and you should still optimize your algorithms (to a reasonable degree). Just let the compiler do the micro-optimizations (all the register-allocation golf, instruction reordering, caching, the discussed division trick, etc.). IMHO the suboptimal output in this particular case was somewhat expected, because it's a "niche" case, even though it's an obvious one. I'm not blaming the compiler people. Yes, someone could add an optimization rule for my case, but as I said, it's quite rare, and it's probably not worth making the optimizer more bloated and complicated for such a case.
> I'm implying that compilers still aren't magic wands,
Agreed.
> you should still optimize your algorithms (to a reasonable degree). Just let the compiler do the micro-optimizations (all the register-allocation golf, instruction reordering, caching, the discussed division trick, etc.).
x % 3 == 0 is an expression without side effects (the only % expressions that trap are x % 0 and INT_MIN % -1), and thus the compiler is free to speculate it, allowing the comparison to be converted to (x % 2 == 0) & (x % 3 == 0).
Yes, compilers will tend to convert && and || to non-short-circuiting operations when able, so as to avoid control flow.
Any number divisible by 6 will also be divisible by both 2 and 3 since 6 is divisible by 2 and 3, so the short-circuiting is inconsequential. They're bare ints, not pointers, so null isn't an issue.