Obviously I'm familiar with RL; I've written multiple training pipelines in my day. To gain that “superhuman skill” using RL, you need to define fitness functions and provide environments that give you feedback to train on. Go and chess have clear rules and an environment that provides a clear signal of success. I'm still waiting to see this for coding. I'm not saying it's impossible, just orders of magnitude harder.
> a fundamentally different compute profile on commodity CPU
In what way? On modern processors, a Fused Multiply-Add (FMA) instruction generally has the exact same execution throughput as a basic addition instruction
You drop the memory-throughput requirements because of the packed bit representation, so the FMA can become the bottleneck, and you bypass the need to upconvert the bits to whatever floating-point format the FMA instruction expects.
Typically for 1-bit matmul you can get away with XORs and popcounts, which should have a better throughput profile than FMA once you take the SIMD nature of the inputs/outputs into account.
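To sketch why XOR + popcount is enough: if you encode ±1 values as single bits (say bit set for −1), signs differ exactly where the XOR is 1, so the dot product is just n minus twice the popcount. A minimal NumPy model under that encoding (the function names are my own, not from any library; real kernels would use POPCNT/VPOPCNT on packed words rather than `unpackbits`):

```python
import numpy as np

def binarize(v):
    # Pack the sign bits: bit set where the element is negative (-1).
    return np.packbits(v < 0)

def bin_dot(a_bits, b_bits, n):
    # Each differing-sign position contributes -1, each agreeing one +1,
    # so dot = (n - d) - d = n - 2 * popcount(a XOR b).
    differ = int(np.unpackbits(a_bits ^ b_bits)[:n].sum())
    return n - 2 * differ
```

One 64-bit XOR plus one popcount covers 64 weight/activation products, versus one FMA covering (at most) a handful of fp32 lanes.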
It can probably be made more efficient by taking a column-first format.
Since we are in CPU land, we mostly deal with dot products sized to fit the cache; I don't assume we have a tiled matmul instruction, and one would be unlikely to support this weird 1-bit format anyway.
Haven't looked closely, but on modern x86 CPUs it might be possible to do much better with the gf2p8affineqb instructions, which let us do 8x8 bit-matrix multiplications efficiently. Not sure how you'd handle the 2-bit part, of course.
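For anyone unfamiliar with the primitive: the GFNI instruction computes an affine map over GF(2) — an 8x8 bit matrix times each input byte, plus a constant byte. Here's a toy pure-Python model of just the 8x8 GF(2) matrix product at its core (bit/row conventions are my own illustration, not the instruction's exact operand layout):

```python
def gf2_matmul_8x8(A, B):
    # A, B: 8x8 bit matrices, each a list of 8 ints (one byte per row,
    # bit j of row i = element (i, j)).
    # C[i][j] = parity of AND(row i of A, column j of B), i.e. GF(2) dot product.
    cols = [0] * 8
    for i in range(8):
        for j in range(8):
            if (B[i] >> j) & 1:
                cols[j] |= 1 << i          # gather column j of B into a byte
    C = [0] * 8
    for i in range(8):
        for j in range(8):
            if bin(A[i] & cols[j]).count("1") & 1:   # parity = GF(2) sum
                C[i] |= 1 << j
    return C
```

The hardware does all of this (for eight input bytes at once per 64-bit lane) in a single instruction, which is why it's tempting for bit-packed formats.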
The win is in how many weights you process per instruction and how much data you load.
So it's not that individual ops are faster — it's that the packed representation lets each instruction do more useful work, and you're moving far less data from memory to do it.
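To put a number on the data-movement point, here's a quick NumPy check of the footprint of the same 4096 sign weights stored as fp32 versus packed 1-bit (sizes are illustrative, not from any particular model):

```python
import numpy as np

# 4096 ±1 weights: fp32 storage vs 1 bit per weight.
w = np.sign(np.random.default_rng(1).standard_normal(4096)).astype(np.float32)
packed = np.packbits(w < 0)

print(w.nbytes, packed.nbytes)  # 16384 bytes vs 512 bytes: 32x less to load
```

Same weights, 32x less memory traffic per dot product, before you even count the per-instruction throughput win.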
I think that these models have to learn to use their parameters efficiently, and the best way to do that is to 'evolve' (yes, a bad word for it) structures over pretraining time. Unfortunately, they don't have a way to access these structures 'from the inside'. I hope this new approach lets us boost performance in a more experimentally rigorous way.
Exactly, you can use Bitcoin, even cash. You can also add credits with PayPal or a credit card, in which case Proton (I assume) won't retain your payment data. But if you attach credit card info permanently to your account, then it can be retrieved.