The first paper they link to is not about string theory. It's using math that was developed for string theory, and is perfectly valid outside it, to make predictions that can be (and are) experimentally validated.
It has exactly none of the problems of string theory, and I'm not sure why it's lumped in with a physics paper in the blog. How is it a problem to say "hey, they used string theory tools!" in a press release? If anything it might get other people to look at the math and get something good out of it...
This quote explains why the author thinks it's a problem:
> with string theorists now virtually unemployable unless they can figure out how to rebrand as machine learning experts.
Their issue is (seemingly) not with the paper, but with the claim that these headlines feed a hype that attributes to string theory capabilities it doesn't have.
To be clear, this is OP's argument, not mine. I am not sure I buy it, except perhaps for the fact that every other academic is expected to rebrand as an ML expert nowadays. It has more to do with ML hype than with string theory hype.
Left this on his blog but it’s awaiting moderation:
It would be helpful to have more clearly targeted and titrated criticism, because you've mentioned press releases, a SciAm article, the paper, and Sabine, all without differentiation.
I hope it's clear enough that the paper itself is legit and doesn't seem to make any inappropriate claims. Beyond that, the press releases seem to be the real offenders here, the SciAm article less so (it could be argued that's healthy popsci), and I'm not sure what comment you're making about Sabine. The title of her video may be clickbaity, but the content itself seems to appropriately demarcate string theory from the paper.
I'm certainly a layperson here, so take this with a grain of salt. But my understanding is that this is part of the problem, or rather the issue that people criticize.
I think it's largely uncontroversial that the math in string theory could be useful in other areas. But if that's your argument for the legitimacy of string theory, then the question arises what string theory actually is and whether it is still part of physics. Physics, of course, has the goal of describing the real world, and, as I understand it, string theory has failed to do that, despite what many people hoped.
If string theory is "just a way of developing math that can be useful in totally unrelated areas", it's, well, part of mathematics. But I don't think that's how the field sees itself.
And why would that be a reason to attack people who don't care at all about the physics, but acknowledge that the mathematical ideas they use originated in string theory? Should they omit that just because the physics side of string theory has been more or less fruitless?
Peter Woit, the Columbia maths department computer systems administrator, makes his bread by googling the words "string theory" and then posting whatever latest results come up in a disingenuous way on his blog to stir reactions from his readers.
You can split your work into multiple commits and at the same time drop or squash debugging and WIP changes. The result allows you to go into much better detail than a PR description.
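For example (one common workflow; the base branch, hashes, and messages below are all made up), an interactive rebase lets you reorder, squash, drop, and split commits before the PR goes up:

```
git rebase -i origin/main
# In the todo list that opens:
pick   a1b2c3d Add parser
squash d4e5f6a wip: parser fixes    # melded into the previous commit
drop   f7a8b9c Add debug prints     # removed entirely
edit   c0d1e2f Add CLI              # stop here to split it up

# While stopped at the "edit" commit:
git reset HEAD^
git add -p                          # stage one logical piece at a time
git commit -m "Add CLI argument parsing"
git add -A && git commit -m "Wire CLI into main"
git rebase --continue
```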
I think it's plausible that different languages would prefer different tokenizations. For example, in Spanish the plural of carro is carros; in Italian it's carri. Maybe the LLM would prefer carr+o in Italian and a single token in Spanish.
Certainly! What surprised me was that apparently LLMs are deliberately designed to enable multiple ways of encoding the same string as tokens. I just assumed this would lead to inefficiency, since it would cause training to not know whether it should favour outputting, say, se|same or ses|ame after "open", and thus to throw some weight on each. But provided there's a deterministic rule, like "always choose the longest matching token", this uncertainty goes away.
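Roughly like this (a toy vocabulary I made up, not any real tokenizer; real BPE implementations resolve the ambiguity with learned merge rules rather than longest-match, but the result is a similarly deterministic mapping):

```python
# Greedy longest-match tokenization: the encoding becomes a pure
# function of the string, even though the vocabulary overlaps.
VOCAB = {"open", "se", "ses", "same", "ame", "s", "e", "a", "m"}

def tokenize(text):
    tokens, i = [], 0
    while i < len(text):
        # take the longest vocabulary entry matching at position i
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token matches at position {i}")
    return tokens

print(tokenize("sesame"))  # always ['ses', 'ame'], never ['se', 'same']
```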
LLMs are probabilistic black boxes; trying to inject determinism into their natural language processing (as opposed to e.g. forcing a grammar for the output) may very well screw them over completely.
LLMs are ultimately just matrix multiplication and some other maths; nothing about them is inherently nondeterministic. When nondeterminism is present, it's because it was deliberately sprinkled on top (it tends to produce better results).
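Concretely (toy logits, not a real model): the forward pass gives you a fixed score per token, argmax over those scores is fully deterministic, and the randomness only enters if you choose to sample instead:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])  # made-up scores for a 3-token vocabulary

# Deterministic decoding: same logits, same token, every time.
print(int(np.argmax(logits)))       # always 0

# Nondeterminism sprinkled on top: sample from softmax(logits / T).
def sample(logits, temperature=1.0):
    p = np.exp(logits / temperature)
    p /= p.sum()
    return int(np.random.default_rng().choice(len(logits), p=p))

print([sample(logits) for _ in range(5)])  # varies from run to run
```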
Yes, determinism is not the best word. What I mean is that if you force the LLM to output "carr+o" even when it prefers "carro", this could result in worse-quality output.
(Former) pawns can be on the 1st or 8th rank if they've been promoted. You can place an unpromoted pawn on the 1st or 8th rank to encode en passant, too.
Also, castling rights can be encoded as two extra "squares" available only to the rooks ("queenside rook never moved" and "kingside rook never moved"), and that is almost free.
But without a good encoding for promotions, I doubt you can beat the encoding of the article.
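If I'm reading the pawn trick right, it works out to something like this (my own sketch of the idea, not the article's scheme; the board is just a dict from squares like "e4" to piece letters):

```python
FILES = "abcdefgh"

def encode_en_passant(board, ep_square):
    """Hide the en passant state in an otherwise-illegal pawn placement.
    ep_square is the capture square, e.g. "e3" after white played e2-e4."""
    board = dict(board)
    f, r = ep_square[0], ep_square[1]
    if r == "3":                 # a white pawn just double-moved
        del board[f + "4"]       # its real square
        board[f + "1"] = "P"     # pawns can never legally stand here
    else:                        # r == "6": a black pawn just double-moved
        del board[f + "5"]
        board[f + "8"] = "p"
    return board

def decode_en_passant(board):
    board, ep = dict(board), None
    for f in FILES:
        if board.get(f + "1") == "P":
            del board[f + "1"]; board[f + "4"] = "P"; ep = f + "3"
        if board.get(f + "8") == "p":
            del board[f + "8"]; board[f + "5"] = "p"; ep = f + "6"
    return board, ep
```

Since at most one en passant capture can ever be pending, the decoder sees at most one pawn on a back rank, and the mapping back to the real square is unambiguous.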
It can also encode chess960 positions. With the article's encoding, uncastled rooks can only be decoded if their starting positions are known, which they aren't in chess960.
Challenging the IP bans in Italy is stupidly hard. Your VM gets an IP address that was used a few months ago for soccer piracy? Too bad, you won't be able to access it from Italy.
The actual innovation in QEMU was that the architecture-dependent part was much smaller than a full JIT compiler, because it used the C compiler to build the small blocks and parsed the ELF relocations so that the blocks could be copied into the translated code and patched up.
This technique has since been dropped by QEMU, but something similar is now used by the Python JIT. These days QEMU uses the Tiny Code Generator (TCG), originally forked from TCC, though by now the source is probably unrecognizable except in the function names.
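The trick is easier to see with a toy version. This is nothing like QEMU's actual code, and it patches hand-written byte templates instead of snippets extracted from .o files, but the shape is the same: precompiled blocks get concatenated and their operands fixed up at the spots that, in the real thing, were named by the ELF relocation entries:

```python
import struct

PLACEHOLDER = b"\xde\xad\xbe\xef"            # stands in for a reloc target

# Hand-written x86-64 "micro-op" templates (eax = virtual accumulator):
OP_ADD_IMM = b"\x05" + PLACEHOLDER           # add eax, imm32
OP_SUB_IMM = b"\x2d" + PLACEHOLDER           # sub eax, imm32
RET        = b"\xc3"                         # ret

def translate(ops):
    """ops: list of (template, imm). Returns a finished code block."""
    block = bytearray()
    for template, imm in ops:
        code = bytearray(template)
        off = code.find(PLACEHOLDER)         # the real thing got this offset
        code[off:off + 4] = struct.pack("<i", imm)  # from the relocations
        block += code
    return bytes(block) + RET

print(translate([(OP_ADD_IMM, 7), (OP_SUB_IMM, 3)]).hex())
# -> 0507000000 2d03000000 c3  (add eax,7; sub eax,3; ret)
```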