The first paper they link to is not about string theory. It's using math that was developed for string theory, and is perfectly valid outside it, to make predictions that can be (and are) experimentally validated.
It has exactly none of the problems of string theory, and I'm not sure why it's lumped in with a physics paper in the blog. How is it a problem to say "hey, they used string theory tools!" in a press release? If anything it might get other people to look at the math and get something good out of it...
This quote explains why the author thinks it's a problem:
> with string theorists now virtually unemployable unless they can figure out how to rebrand as machine learning experts.
Their issue is (seemingly) not with the paper, but with the claim that these headlines feed a hype that attributes to string theory capabilities it doesn't have.
To be clear, this is OP's argument, not mine. I am not sure I buy it, except perhaps for the fact that every other academic is expected to rebrand as an ML expert nowadays. It has more to do with ML hype than with string theory hype.
Left this on his blog but it’s awaiting moderation:
It would be helpful to have more clearly targeted and titrated criticism, because you've mentioned press releases, a SciAm article, the paper, and Sabine, all without differentiation.
I hope it's clear enough that the paper itself is legit and doesn't seem to make any inappropriate claims. Beyond that, the press releases seem to be the real offenders here, the SciAm article less so (it could be argued that's healthy popsci), and I'm not sure what comment you're making about Sabine. The title of her video may be clickbaity, but the content itself seems to appropriately demarcate string theory from the paper.
I'm certainly a layperson here, so take this with a grain of salt. But my understanding is that this is part of the problem, or rather the issue that people criticize.
I think it's largely uncontroversial that the math in string theory could be useful in other areas. But if that's your argument for the legitimacy of string theory, then the question arises what string theory actually is and whether it is still part of physics. Physics, of course, has the goal of describing the real world, and, as I understand it, string theory has failed to do that, despite what many people hoped.
If string theory is "just a way of developing math that can be useful in totally unrelated areas", it's, well, part of mathematics. But I don't think that's how the field sees itself.
And why would that be a reason to attack people who don't care at all about the physics, but acknowledge that the mathematical ideas they use originated in string theory? Should they omit that just because the physics side of string theory has been more or less fruitless?
Peter Woit, the Columbia maths department computer systems administrator, makes his bread by googling the words "string theory" and then posting whatever latest results come up in a disingenuous way on his blog to stir reactions from his readers.
You can split your work into multiple commits and at the same time drop or squash debugging and WIP changes. The result allows you to go into much better detail than a PR description.
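For example (one common workflow; the base branch, hashes, and messages below are all made up), an interactive rebase lets you reorder, squash, drop, and split commits before the PR goes up:

```
git rebase -i origin/main
# In the todo list that opens:
pick   a1b2c3d Add parser
squash d4e5f6a wip: parser fixes    # melded into the previous commit
drop   f7a8b9c Add debug prints     # removed entirely
edit   c0d1e2f Add CLI              # stop here to split it up

# While stopped at the "edit" commit:
git reset HEAD^
git add -p                          # stage one logical piece at a time
git commit -m "Add CLI argument parsing"
git add -A && git commit -m "Wire CLI into main"
git rebase --continue
```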
I think it's plausible that different languages would prefer different tokenizations. For example, in Spanish the plural of carro is carros; in Italian it's carri. Maybe the LLM would prefer carr+o in Italian and a single token in Spanish.
Certainly! What surprised me was that apparently LLMs are deliberately designed to enable multiple ways of encoding the same string as tokens. I just assumed this would lead to inefficiency, since it would cause training to not know whether it should favour outputting, say, se|same or ses|ame after "open", and thus to throw some weight on each. But provided there's a deterministic rule, like "always choose the longest matching token", this uncertainty goes away.
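Roughly like this (a toy vocabulary I made up, not any real tokenizer; real BPE implementations resolve the ambiguity with learned merge rules rather than longest-match, but the result is a similarly deterministic mapping):

```python
# Greedy longest-match tokenization: the encoding becomes a pure
# function of the string, even though the vocabulary overlaps.
VOCAB = {"open", "se", "ses", "same", "ame", "s", "e", "a", "m"}

def tokenize(text):
    tokens, i = [], 0
    while i < len(text):
        # take the longest vocabulary entry matching at position i
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token matches at position {i}")
    return tokens

print(tokenize("sesame"))  # always ['ses', 'ame'], never ['se', 'same']
```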
LLMs are probabilistic black boxes; trying to inject determinism into their natural language processing (as opposed to e.g. forcing a grammar for the output) may very well screw them over completely.
LLMs are ultimately just matrix multiplication and some other maths; nothing about them is inherently nondeterministic. When nondeterminism is present, it's because it was deliberately sprinkled on top (it tends to produce better results).
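Concretely (toy logits, not a real model): the forward pass gives you a fixed score per token, argmax over those scores is fully deterministic, and the randomness only enters if you choose to sample instead:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])  # made-up scores for a 3-token vocabulary

# Deterministic decoding: same logits, same token, every time.
print(int(np.argmax(logits)))       # always 0

# Nondeterminism sprinkled on top: sample from softmax(logits / T).
def sample(logits, temperature=1.0):
    p = np.exp(logits / temperature)
    p /= p.sum()
    return int(np.random.default_rng().choice(len(logits), p=p))

print([sample(logits) for _ in range(5)])  # varies from run to run
```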
Yes, determinism is not the best word. What I mean is that if you force the LLM to output "carr+o" even when it prefers "carro", this could result in worse-quality output.
(Former) pawns can be on the 1st or 8th rank if they've been promoted. You can place an unpromoted pawn on the 1st or 8th rank to encode en passant, too.
Also, castling rights can be encoded as two extra "squares" available only to the rooks ("queenside rook never moved" and "kingside rook never moved"), and that is almost free.
But without a good encoding for promotions, I doubt you can beat the encoding of the article.
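If I'm reading the pawn trick right, it works out to something like this (my own sketch of the idea, not the article's scheme; the board is just a dict from squares like "e4" to piece letters):

```python
FILES = "abcdefgh"

def encode_en_passant(board, ep_square):
    """Hide the en passant state in an otherwise-illegal pawn placement.
    ep_square is the capture square, e.g. "e3" after white played e2-e4."""
    board = dict(board)
    f, r = ep_square[0], ep_square[1]
    if r == "3":                 # a white pawn just double-moved
        del board[f + "4"]       # its real square
        board[f + "1"] = "P"     # pawns can never legally stand here
    else:                        # r == "6": a black pawn just double-moved
        del board[f + "5"]
        board[f + "8"] = "p"
    return board

def decode_en_passant(board):
    board, ep = dict(board), None
    for f in FILES:
        if board.get(f + "1") == "P":
            del board[f + "1"]; board[f + "4"] = "P"; ep = f + "3"
        if board.get(f + "8") == "p":
            del board[f + "8"]; board[f + "5"] = "p"; ep = f + "6"
    return board, ep
```

Since at most one en passant capture can ever be pending, the decoder sees at most one pawn on a back rank, and the mapping back to the real square is unambiguous.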
It can also encode chess960 positions. With the article's encoding, uncastled rooks can only be decoded if their starting positions are known, which they aren't in chess960.
Challenging the IP bans in Italy is stupidly hard. Your VM gets an IP address that was used a few months ago for soccer piracy? Too bad, you won't be able to access it from Italy.
The actual innovation in QEMU was that the architecture-dependent part was much smaller than a full JIT compiler, because it used the C compiler to build the small blocks and parsed the ELF relocations so that the blocks could be copied into the translated code and patched up.
This technique has since been dropped by QEMU, but something similar is now used by the Python JIT. These days QEMU uses the Tiny Code Generator (TCG), originally forked from TCC, though by now the source is probably unrecognizable except in the function names.
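The trick is easier to see with a toy version. This is nothing like QEMU's actual code, and it patches hand-written byte templates instead of snippets extracted from .o files, but the shape is the same: precompiled blocks get concatenated and their operands fixed up at the spots that, in the real thing, were named by the ELF relocation entries:

```python
import struct

PLACEHOLDER = b"\xde\xad\xbe\xef"            # stands in for a reloc target

# Hand-written x86-64 "micro-op" templates (eax = virtual accumulator):
OP_ADD_IMM = b"\x05" + PLACEHOLDER           # add eax, imm32
OP_SUB_IMM = b"\x2d" + PLACEHOLDER           # sub eax, imm32
RET        = b"\xc3"                         # ret

def translate(ops):
    """ops: list of (template, imm). Returns a finished code block."""
    block = bytearray()
    for template, imm in ops:
        code = bytearray(template)
        off = code.find(PLACEHOLDER)         # the real thing got this offset
        code[off:off + 4] = struct.pack("<i", imm)  # from the relocations
        block += code
    return bytes(block) + RET

print(translate([(OP_ADD_IMM, 7), (OP_SUB_IMM, 3)]).hex())
# -> 0507000000 2d03000000 c3  (add eax,7; sub eax,3; ret)
```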