Yeah, but the problem is that programming languages and compilers change all the time, making it hard to maintain a formal model of them.
Exceptions exist (CompCert C and WebAssembly are two good examples), but the semantics of raw pointers in Rust, for example, are intentionally under-defined because the compiler writers want to keep the freedom to change them.
Rust is still young, so it's taking the opportunity to thoroughly explore its options. As it grows older, it will solidify its decisions, even if it means committing to something that will be broken in the future. I look forward to the day that it becomes legacy (with all love)!
Yes, the contributions of the people prompting the AI should be considered, as well as those of the people who designed the Lean libraries used in the loop while the AI was writing the solution. Any talk of "AGI" is, as always, ridiculous.
But speaking as a specialist in theorem proving, this result is pretty impressive! It would likely have taken me a lot longer to formalize this result even if it were in my area of specialty.
How did you arrive at "ridiculous"? What we're seeing here is incredible progress over what we had a year ago. Even ARC-AGI-2 is now at over 50%. Given that this sort of process is also being applied to AI development itself, it's really not clear to me that humans would be a valuable component in knowledge work for much longer.
It requires constant feedback, critical evaluation, and checks. This is not AGI, it's cognitive augmentation. One that is collective, one that will accelerate human abilities far beyond what the academic establishment is currently capable of, but one that is still fundamentally organic. I don't see a problem with this--AGI advocates treat machine intelligence like some sort of God that will smite non-believers and reward the faithful. This is what we tell children so that they won't shit their beds at night, otherwise they get a spanking. The real world is not composed of rewards and punishments.
It does seem that the Venn diagram of "Roko's basilisk" believers and "AGI is coming within our lifetimes" believers is nearly a circle. Would be nice if there were some less... religious... arguments for AGI's imminence.
I think the “Roko’s Basilisk” thing is mostly a way for readers of Nick Land to explain part of his philosophical perspective without the need for, say, an actual background in philosophy. But the simplicity reduces his nuanced thought into a call for a sheeplike herd—they don’t even need a shepherd! Or perhaps there is one, but he is always yet to come… best to stay in line anyway, he might be just around the corner.
> It requires constant feedback, critical evaluation, and checks. This is not AGI, it's cognitive augmentation.
To me that doesn't sound qualitatively different from a PhD student. Are they just cognitive augmentation for their mentor?
In any case, I wasn't trying to argue that this system as-is is AGI, just that it's no longer "ridiculous", and that this looks to me like a herald of AGI, as the portion of the work being done by humans gets smaller and smaller.
People would say the same thing about a calculator, or about computation in general. Like any machine, it must be constructed purposefully to be useful, and once we require something that exceeds that purpose, it must be constructed once again. Only time will tell the limits of human intelligence, now that AI is integrating into society and industry.
>AGI advocates treat machine intelligence like some sort of God that will smite non-believers and reward the faithful.
>The real world is not composed of rewards and punishments.
Most "AGI advocates" say that AGI is coming, sooner rather than later, and it will fundamentally reshape our world. On its own that's purely descriptive. In my experience, most of the alleged "smiting" comes from the skeptics simply being wrong about this. Rarely there's talk of explicit rewards and punishments.
I should be the target audience for this stuff, but I honestly can't name a single person who believes in this "Roko's basilisk" thing. To my knowledge, even the original author abandoned it. There probably are a small handful out there, but I've never seen 'em myself.
> it's really not clear to me that humans would be a valuable component in knowledge work for much longer.
To me, this sounds like when we first went to the moon, and people were sure we'd be on Mars by the end of the '80s.
> Even ARC-AGI-2 is now at over 50%.
Any measure of "are we close to AGI" is as scientifically meaningful as "are we close to a warp drive" because all anyone has to go on at this point is pure speculation. In my opinion, we should all strive to be better scientists and think more carefully about what an observation is supposed to mean before we tout it as evidence. Despite the name, there is no evidence that ARC-AGI tests for AGI.
> To me, this sounds like when we first went to the moon, and people were sure we'd be on Mars by the end of the '80s.
Unlike space colonisation, there are immediate economic rewards from producing even modest improvements in AI models. As such, we should expect much faster progress in AI than in space colonisation.
But it could still turn out the same way, for all we know. I just think that's unlikely.
The minerals in the asteroid belt are estimated to be worth in the $100s of quintillions. I would say that’s a decent economic incentive to develop space exploration (not necessarily colonization, but it may make it easier).
Not going to happen as long as the society we live in has this big of a hard-on for capitalism and working yourself to the bone is seen as a virtue. Every time there’s a productivity boost, the newly gained free time is immediately consumed by more work. It’s a sick version of Parkinson’s law where work is infinite.
Let me put it like this: I expect AI to replace much of human wage labor over the next 20 years and push many of us, myself almost certainly included, into premature retirement. I'm personally concerned that in a few years, I'll find my software proficiency to be about as useful as my chess proficiency is today to Stockfish. I am afraid of a massive social upheaval, both for myself and my family, and for society at large.
Here “much of” is doing the heavy lifting. Are you willing to commit to a percentage or a range?
I work at an insurance company and I can’t see AI replacing even 10% of the employees here. Too much of what we do is locked up in decades-old proprietary databases that cannot be replaced for legal reasons. We still rely on paper mail for a huge amount of communication with policyholders. The decisions we make on a daily basis can’t be trusted to AI for legal reasons. If AI caused even a 1% increase in false rejections of claims it would be an enormous liability issue.
Yes, absolutely willing to commit. I can't find a single reliable source, but from what I gather, over 70% of people in the West do "pure knowledge work" that doesn't involve any embodied activities. I'm happy to put money on it: these jobs will start being fully taken over by AI soon (if they aren't already), and by 2035, less than 50% of us will have a job that doesn't require "being there".
And regarding your example of an insurance company, I'm not sure about that industry, but seeing the transformation of banking over the last decade to fully digital providers like Revolut, I would expect similar disruption there.
I would easily take the other side of this bet. It just reminds me of when everyone was sure back in 2010 that we’d have self-driving cars within 10 years and human drivers would be obsolete. Today, fully replacing human drivers is still about 10 years away.
Yes, getting the timelines right is near impossible, but the trajectory is clear to me, both on AI taking over pure knowledge work and on self-driving cars replacing human drivers. For the latter, there's a lot of inertia and legal friction to overcome, and scaling physical things is hard in general. But Waymo alone crossed 450,000 weekly paid rides last month [0], it now drives on highways too, and it's slated to launch in London and Tokyo this year, so it seems to me that there's no serious remaining technical barrier to it replacing human drivers.
As for a bet, yes, I'd really be happy to put my money where my mouth is, if you're familiar with any long bets platform that accepts pseudonymous users.
There are other bounds at play here that are often not talked about.
AI runs on computers. Consider the undecidability given by Rice’s theorem: whether compiled code satisfies any non-trivial semantic property (such as being error-free) cannot be decided in general. Even an AI can’t guarantee its compiled code is error-free. Not because it couldn’t write code that solves a problem, but because the code it writes is bounded by these externalities. Undecidability in general makes the dream of generative AI considerably more challenging than how it’s being sold.
You don’t even need AGI for that though, just unbounded investor enthusiasm and a regulatory environment that favors AI providers at the expense of everyone else.
My point is that there are a number of things that can cause large-scale unemployment in the next 20 years, and it doesn’t make sense to worry about AGI specifically while ignoring all of the other equally likely root causes (like a western descent into oligarchy and crony capitalism, just to name one).
Very cool to see how far things have come with this technology!
Please remember that this is a theorem about integers with a fairly elementary proof that is well supported by the existing Mathlib infrastructure. It seems that the AI relies on the symbolic proof checker, and the proofs being checked in this result don't use very complex definitions. In my experience, proofs like this, which are one step removed from existing infra, are much, much more likely to work.
Exactly right. You can pick and use real numbers, as long as they are only queried to finite precision. There are lots of super cool algorithms for doing this!
Kind of, but you're not just picking rationals; you're picking rationals that are known to converge to a real number, together with some continuity property.
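To make that concrete, here is one common encoding, purely as a sketch of my own (the names are made up, and other encodings use an explicit modulus of convergence instead): a computable real as a fast-converging Cauchy sequence of rationals, written in Lean/Mathlib style.

import Mathlib

/-- One encoding of a computable real: a rational approximation at each
    precision level, plus an explicit convergence rate. (Illustrative names.) -/
structure ComputableReal where
  approx  : ℕ → ℚ
  regular : ∀ n, |approx n - approx (n + 1)| ≤ (1 / 2 ^ n : ℚ)

Querying "to finite precision" then just means evaluating `approx n` for some n, and the `regular` field is exactly the extra convergence guarantee you have to carry around.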
You might be interested in this paper [1] which builds on top of this approach to simulate arbitrarily precise samples from the continuous normal distribution.
Not really. You can simulate a probability of 1/x by expanding 1/x in binary and flipping a coin repeatedly, once for each digit, until the coin matches the digit (assign heads and tails to 0 and 1 consistently). If the match happened on a 1, it's a positive result, otherwise a negative one. This only requires arbitrary but finite precision, yet the probability is exactly equal to 1/x, which isn't rational.
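Here's a sketch of that scheme, purely my own illustration (`digit` is a hypothetical oracle for the binary digits of the target probability, and I'm assuming Lean's IO.rand lo hi returns a uniform value in the inclusive range [lo, hi]):

/-- Returns `true` with probability p, where `digit k` is the (k+1)-th binary
    digit of p after the point. Flip a fair coin per digit and stop at the
    first match: matching on a 1 is positive, matching on a 0 is negative. -/
partial def flipUntilMatch (digit : Nat → Bool) (k : Nat := 0) : IO Bool := do
  let coin ← IO.rand 0 1              -- fair coin, assumed uniform on {0, 1}
  if (coin == 1) == digit k then
    return digit k                    -- first match: report the digit we matched on
  else
    flipUntilMatch digit (k + 1)      -- no match: consume the next digit

-- Example: 1/3 = 0.010101... in binary, so the digits alternate 0,1,0,1,...
#eval do
  let mut hits := 0
  for _ in [0:10000] do
    if (← flipUntilMatch (fun k => k % 2 == 1)) then hits := hits + 1
  IO.println s!"{hits} positives out of 10000 (expect about 3333)"

The chance of returning true is the sum over k of digit(k) / 2^(k+1), which is exactly p, even though any single run only ever inspects finitely many digits. The same code works for any computable digit stream, rational or not, as long as you can produce digit k on demand.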
You can't prove something untrue (in the sense that it implies False) without the theorem prover being unsound, which I think is, at the moment, not known to be possible in Lean.
But you're exactly right. There's nothing linking theorem prover definitions to pen and paper definitions in any formal system.
This is a topic of contention in formalized math with no universal right answer. Some libraries go heavy on the dependent types, and some, like mathlib, try to avoid them. I do math in both Rocq and Lean, and I find I like the latter style a lot more for my work, for a couple of reasons:
- Fewer side conditions: Setting a / 0 = 0 means that some laws hold even when a denominator is 0, and so you don't need to prove the denominator is nonzero. This is super nice when the denominator is horrible. (There's a small Lean sketch of this after the list.) I heard once that if you set the junk value for a non-converging Riemann integral to the average of the lim sup and lim inf you can obliterate a huge number of integrability side conditions (though I didn't track down the paper to find out for sure).
- Some of the wacky junk arithmetic values, especially as they relate to the extended reals, do show up in measure theory. Point being: "junk arithmetic" is a different mathematical theory than normal math, but it's no less legitimate, and it's closely related.
- Definitions with Hilbert's epsilon operator. If I want to define a function that takes, e.g., a measurable set S as an argument, I could do it the dependent-types way
def MyDef (S) (H : measurable S) := /- real definition -/
but then I need to write all of my theorems in terms of (MyDef S H), and this can cause annoying unification problems (more so in Rocq than in Lean, assuming H is a Prop). Alternatively, I could use junk math:
def MyDef' (S) := if (choose (H : measurable S)) then /- real definition -/ else /- junk -/
I can prove (MyDef' S = MyDef S H) when I have access to (H : measurable S). And the property H here can be really complex: convergence properties, existence properties, etc. It's nice to avoid trucking them around everywhere.
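Here's a small self-contained sketch of both points in Lean/Mathlib (the a / 0 = 0 behaviour and the lemma names are real Mathlib; the MyDef/MyDef' bodies, the choice of ℝ, and the junk value are placeholders of mine):

import Mathlib

-- Junk arithmetic: Mathlib defines a / 0 = 0, so lemmas like add_div carry
-- no `c ≠ 0` side condition and hold even when the denominator is zero.
example (a b c : ℝ) : (a + b) / c = a / c + b / c := by rw [add_div]
example (a : ℝ) : a / 0 = 0 := by rw [div_zero]

-- Dependent-types style: the proof obligation is an explicit argument.
noncomputable def MyDef (S : Set ℝ) (_H : MeasurableSet S) : ℝ :=
  0  -- stand-in for the real definition (which may use _H)

-- Junk-value style: branch on the proposition classically and return junk
-- in the bad case, so callers never have to supply the proof.
open Classical in
noncomputable def MyDef' (S : Set ℝ) : ℝ :=
  if H : MeasurableSet S then MyDef S H else 37  -- 37 is the junk value

-- The two agree whenever the side condition actually holds.
example (S : Set ℝ) (H : MeasurableSet S) : MyDef' S = MyDef S H := by
  unfold MyDef'
  exact dif_pos H

The `if H : _ then _ else _` under `open Classical` is choice under the hood, so it plays the same role as the epsilon-operator version above.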
> if you set the junk value for a non-converging Riemann integral to the average of the lim sup and lim inf you can obliterate a huge number of integrability side conditions
Wouldn’t this still cause problems if the lim sup is ∞ and the lim inf is -∞?
One thing I never understood about this: why does this not just compile to Lean so they're compatible with each other? Having a good interface is admirable, but the difference between set-based and type-based foundations seems not very important, and porting enough math to sustain Litex seems like a huge undertaking.
It would truly be great if Litex could compile to existing formal languages. The only problem is that we cannot find a good way to compile our verification process to Lean: it does not require users to give names to the facts they are using, which is very different from how Lean works (the set theory example is just the first in a series of comparisons). Besides, it would be even harder to compile future functionality, like printing out the result of each Litex statement in a human-readable way, to Lean. Since Litex is still a young language and we are using our limited resources to try new ideas and crack problems here and there, for the time being we believe it's not a good time to migrate our code at such a scale. Thank you. Merry Christmas.
To be honest I'm not convinced by the technical downsides you mentioned here BUT I can see why you wouldn't want to spend time on this if it takes away from language development. Thanks!
Proofs, sure, but not definitions. A human needs to be sure that the definitions align with what they expect. Unfortunately, humans generating correct definitions and LLMs generating correct proofs are not independent problems.
IMO problems are stated in, and solved with, math a high schooler could understand. Getting the definitions right (one of the harder parts of mechanizing proofs IME) is a different beast altogether.