Yeah, but he can't use his $200 subscription for the API.
That's limited to accessing the models through code/desktop/mobile.
And while I'm also using their subscriptions because of the cost savings vs direct access, having the subscription be considerably cheaper than the usage billing rings all sorts of alarm bells that it won't last.
As far as I understand, the problem is not knowing whether someone is typing, but that you may be able to extract some information about which keys are being typed from the small differences in timing between them.
I'm sorry, but 'not being surprised if LLMs can rederive relativity and QM from the facts available in 1900' is a pretty scalding take.
This would absolutely be very good evidence that models can actually come up with novel, paradigm-shifting ideas. It was absolutely not obvious at that time from the existing facts, and some crazy leaps of faith needed to be taken.
This is especially true for General Relativity, for which you had just a few mismatches in the measurements, like Mercury's precession, and where the theory almost entirely follows from thought experiments.
I don't know much about Lean, but I attended an introductory talk at some point and I was particularly bothered by these partial function definitions. The example was sqrt, which would give 0 on the negatives.
Now, of course, if you're careful with the definitions you use, there is no problem. But in the (increasingly relevant) context of automatic theorem proving with LLMs, this seems to defeat the 'groundtruthness' of Lean!
How do you make sure that the LLM doesn't reward hack a proof using these workarounds?
I don't understand why they would make such footgun functions either, especially because (IIUC, and I probably don't) in a way the whole point of Lean's dependent type system is to be able to express arbitrary constraints on the inputs of these functions so that they can be total -- e.g., to be able to define a subtraction function on the nonnegative integers that takes one integer and one {integer that is less than or equal to the first integer}. And to even call this function, you (or perhaps Lean itself) would need to first prove that its second argument is less than or equal to its first.
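Something like this rough Lean 4 sketch of what I mean (names made up):

-- Subtraction whose type demands a proof that the second argument is no
-- larger than the first, so no junk value can ever be produced.
def subExact (m n : Nat) (_h : n ≤ m) : Nat := m - n

#eval subExact 5 3 (by decide)  -- 2; the caller has to supply the proof
-- subExact 3 5 ...             -- uncallable: there is no proof of 5 ≤ 3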
You can express those constraints; it just turns out to be less ergonomic in practice if you do. (You can even do so in terms of the junk-valued total functions! Just define `actual_subtraction` to call straight through to `junky_subtraction`, but `actual_subtraction` has these constraints on its domain.)
The mathlib way to do things is to push those requirements out to the one who wishes to use the theorem. If you find that you're depending on a junk value in a way that's incompatible with what you wanted to prove, then you've simply discovered that you forgot to restrict your own domain to exclude the junk. (And if your desired usage lines up with the junk, then great, you get to omit an annoying busywork hypothesis.) A sqrt function that gives 0 on the negatives isn't breaking any of sqrt's properties on the positives!
The mathlib way means that instead of every function having to express these constraints and pass proofs down the line, only some functions have to.
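Concretely, a rough Lean 4 sketch of the call-straight-through trick (made-up names; the Nat.sub_add_cancel lemma name is quoted from memory):

-- Total subtraction that returns a junk value (0) whenever the true
-- difference would be negative (this is exactly what Nat.sub already does).
def junkySub (m n : Nat) : Nat := m - n

-- The constrained version just calls straight through:
def actualSub (m n : Nat) (_h : n ≤ m) : Nat := junkySub m n

-- Theorems carry the hypothesis only where the junk would bite:
example (m n : Nat) (h : n ≤ m) : junkySub m n + n = m := Nat.sub_add_cancel h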
> If you find that you're depending on a junk value in a way that's incompatible with what you wanted to prove
This is the part I'm struggling with. How would you actually know/realise that you were doing this? It seems like "the mathlib way" you describe is choosing to rely on programmer discipline for something that could be enforced automatically.
My fear is that relying on the junk values of functions (values where their "proper" partial counterparts are not defined) is somehow unsound (could lead to proving something untrue). But perhaps my intuition is off here? If so, I think the specific junk values chosen must not matter at all -- e.g., having sqrt return 42 for negative x values should work just as well, am I right?
You can't prove something untrue (in the sense that it implies false) without the theorem prover itself being unsound, which I think is not currently known to be possible in Lean.
But you're exactly right. There's nothing linking theorem prover definitions to pen and paper definitions in any formal system.
It is enforced automatically for most purposes: If you're writing a proof involving e.g. the sqrt function, you want to use theorems about it, e.g. that (sqrt(x))^2 = x. Almost all of those theorems have x>=0 as a precondition, so you do need to prove it when it matters.
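For instance (mathlib lemma names quoted from memory, so they may be slightly off):

import Mathlib

-- No hypothesis needed: sqrt is nonnegative everywhere, junk values included.
example (x : ℝ) : 0 ≤ Real.sqrt x := Real.sqrt_nonneg x

-- Hypothesis needed: squaring only undoes sqrt on the nonnegatives.
example (x : ℝ) (hx : 0 ≤ x) : Real.sqrt x ^ 2 = x := Real.sq_sqrt hx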
This is a topic of contention in formalized math with no universal right answer. Some libraries go heavy on the dependent types, and some like mathlib try to avoid them. I do math in both Rocq and Lean and I find I like the latter style a lot more for my work for a couple reasons:
- Fewer side conditions: Setting a / 0 = 0 means that some laws hold even when a denominator is 0, and so you don't need to prove the denominator is nonzero (see the Lean sketch after this list). This is super nice when the denominator is horrible. I heard once that if you set the junk value for a non-converging Riemann integral to the average of the lim sup and lim inf you can obliterate a huge number of integrability side conditions (though I never tracked down the paper to find out for sure).
- Some of the wacky junk arithmetic values, especially as it relates to extended reals, do show up in measure theory. Point being: "junk arithmetic" is a different mathematical theory than normal math, but it's no less legitimate, and is closely related.
- Definition with Hilbert's epsilon operator. If I want to define a function that takes e.g. a measurable set S as an argument, I could do the dependent-types way
def MyDef (S) (H : measurable S) := /-- real definition -/
but then I need to write all of my theorems in terms of (MyDef S H) and this can cause annoying unification problems (moreso in Rocq than in Lean, assuming H is a Prop). Alternatively, I could use junk math
def MyDef' (S) := if (choose (H : measurable S)) then /-- real definition -/ else /-- junk -/
I can prove (MyDef' S = MyDef S H) when I have access to (H : measurable S). And the property H here can be really complex: convergence properties, existence properties, etc. It's nice to avoid trucking them around everywhere.
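To make the last two points concrete, here is a rough, self-contained Lean 4 sketch, with an arbitrary predicate P standing in for measurability and every name made up (lemma names are from memory, so treat it as a sketch rather than working mathlib code):

-- The "fewer side conditions" point, with Nat's junk division (x / 0 = 0):
example (a : Nat) : a / 0 = 0 := Nat.div_zero a

open Classical

-- Dependent-types style: the proof is an explicit argument.
def MyDef (P : Nat → Prop) (n : Nat) (_h : P n) : Nat :=
  n + 1  -- stand-in for the real definition

-- Junk style: branch on the proposition classically, fall back to junk.
noncomputable def MyDef' (P : Nat → Prop) (n : Nat) : Nat :=
  if h : P n then MyDef P n h else 0  -- 0 is the junk value

-- With the proof in hand the two agree, so theorems transfer:
example (P : Nat → Prop) (n : Nat) (h : P n) : MyDef' P n = MyDef P n h := by
  unfold MyDef'
  exact dif_pos h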
> if you set the junk value for a non-converging Riemann integral to the average of the lim sup and lim inf you can obliterate a huge number of integrability side conditions
Wouldn’t this still cause problems if the lim sup is ∞ and the lim inf is -∞?
This is a really good explanation, but it reinforces my understanding that these “junk maths” are literally undefined behavior as in C and such. They are not defined (in maths), you are not supposed to trigger them, so they can be anything. Great…
This is horrible for a language whose whole purpose I thought was to be foolproof: if it compiles, it's true. Having very subtly different definitions of common operations is such a footgun.
Of course, I understand that this doesn’t bother mathematicians because they are used to not having any guardrails anyways. Just like C programmers have the attitude that if you fall on such a trap, you deserve it and you are not a “real programmer”. But Lean is supposed to be the other extreme isn’t it? Take nothing for granted and verify it from the ground up.
I suppose I am falling for that “Twitter confusion” the post is referring to. I never had any issues with this when actually using Lean. I just don’t like the burden of having to be paranoid about it, I thought Lean had my back and I could use it fairly mechanically by transforming abstract structures without thinking about the underlying semantics too much.
Anyway, despite the annoyance, I do assume that the designers know better and that it is a pragmatic and necessary compromise if it's such a common pattern. But there must be a better solution: if having the exception makes it uncomfortable to prove things, then design the language so that it is comfortable to prove such a thing. Don't just remove the exception because 99% of the time it doesn't matter. If we were happy with 99%, we wouldn't be reaching for formal verification; there are much more practical means to check correctness.
There is still a guardrail. The blog post explains that it is just using different functions and notation which might allow things like 0/0. But at the end of the day, different notation still cannot be used to prove false things.
In other words, you can use all these junk theorems to build strange results on the side, but you can never build something that disagrees with normal math or that contradicts itself. There is no footgun, because the weird results you obtain are just notation. They look weird to a human, but they don't allow you to actually break any rules or to prove 1=0.
I understand that, but if "/" and other common operators don't mean what they mean on paper, you can prove things that would be untrue if copied onto paper (kinda). You can indeed prove "1/0 = 0", which is not that far off from redefining "=" and proving "1=0".
More importantly, in the other direction it seems too easy to copy a proposition from paper into Lean and falsely prove it without realising the two don't express the same thing. A human probably wouldn't, but there's increased usage of AI and other automatic methods with Lean.
I do understand I'm being purist and that it doesn't matter that much in practice. I've used Lean seriously for a while and I've never encountered any of this.
Thank you! This hit the nail on the head for me, though I probably need to try out a few more examples to fully convince myself.
TL;DR: It's actually harmless (and often convenient) to "inflate" the domains of partial functions to make them total (by making them return arbitrary junk values where the original function is undefined), provided that every theorem you want to apply still comes with the original, full restrictions.
Kevin's example is good. My stupider example would be: We can define a set that contains the integers ..., -2, -1, 0, 1, 2, ..., plus the extra element "banana". If we define the result of any addition, subtraction or multiplication involving a banana to be 42, and to have their usual results otherwise, then, provided that we add the condition "None of the variables involved is banana" to the theorem "x+y = y+x", and to every other theorem about arithmetic, anything that we can prove about arithmetic on elements of this set is also true of arithmetic on integers.
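A rough Lean 4 version of the banana set (made-up names, addition only; the tactic script is just a sketch and may need tweaking):

-- The integers plus one junk element.
inductive BInt where
  | int (n : Int)
  | banana

-- Any addition involving a banana gives the junk value 42.
def badd : BInt → BInt → BInt
  | .int a, .int b => .int (a + b)
  | _, _ => .int 42

-- Commutativity is only claimed under the "no bananas" hypotheses, and under
-- those hypotheses it is just ordinary integer addition.
theorem badd_comm (x y : BInt) (hx : x ≠ .banana) (hy : y ≠ .banana) :
    badd x y = badd y x := by
  cases x <;> cases y <;> simp_all [badd, Int.add_comm]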
If your language isn't ergonomic then people will not use it (or use it less). Maybe another theorem prover will arise that makes "correct" definitions more ergonomic
Note that the same thing happens in Rust. Rather than putting trait bounds on structs (like struct Aa<T: Something> { .. }), people are encouraged to make the structs more generic (struct Aa<T> { .. }) and put restrictions on impls instead (impl<T: Something> Aa<T> { .. }). The rationale being that this is more ergonomic because it doesn't require you to repeat bounds in places you don't need them, and if every impl requires a Something bound, you will need the bound to make anything with this type (doubly so if the fields of Aa are private and so you need to build one using a method with T: Something).
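A minimal sketch of the two styles (hypothetical names):

// Bounds on the struct itself would force every mention of Aa<T> to repeat
// the `T: Something` bound:
//     struct Aa<T: Something> { value: T }

trait Something {
    fn something(&self) -> u32;
}

// The encouraged style: keep the struct fully generic...
struct Aa<T> {
    value: T,
}

// ...and put the bound only on the impls that actually need it.
impl<T: Something> Aa<T> {
    fn new(value: T) -> Self {
        Aa { value }
    }

    fn run(&self) -> u32 {
        self.value.something()
    }
}

impl Something for u32 {
    fn something(&self) -> u32 {
        *self
    }
}

fn main() {
    let a = Aa::new(7u32); // only possible because u32: Something
    println!("{}", a.run());
}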
>How do you make sure that the LLM doesn't reward hack a proof using these workarounds?
I'm not sure what you mean exactly? There is no soundness issue here, the fact that `sqrt -1` is defined to be 0 does not have any impact on what statements about `sqrt x` can be proved when `x` is positive.
It just means that if you are working on an intermediate step of a proof and you need the result that `sqrt y >= 0` you don't need to provide a proof that `y >= 0`. If you wanted an intermediate result that `(sqrt y)^2 = y` then you would still need to provide a proof that `y >= 0`, though.
it's terrible advice for actual programmers though because often 0 is a sentinel value with special meaning for systems that you don't have control over (sometimes because of pre-digital conventions that shouldn't be lightly fucked with).
This is usually done by PL's that want to avoid crashes at all costs, but "turning crashes into subtle logic errors" seems like a really bad idea.
"As mentioned before, this is not a post about what’s practically a good idea. All I’m arguing is that mathematically, we can extend division in this way without leading to a contradiction. Programming languages are different from mathematical formalisms, and should be different. I prefer that 1/0 is an error, because I’m not using my program to prove theories."
Please do yourself a favor and actually read it.
Besides, 0 as a sentinel value on disk or on the wire is fine, but once you have values in a programming language, use option types. This is not the 1980s anymore; you don't need to use 0 or -1 or 0xffff to express something special, which sooner or later just comes back to bite you.
I read the article. As someone who was a math major I get why it's "fine". But nowhere in the article does Hillel explain WHY it's bad for a real PL. (And defenders of e.g. Pony and Gleam point to this article too to say it's okay. It's not.) I am adding context.
> This is not the 1980s anymore; you don't need to use 0 or -1 or 0xffff to express something special, which sooner or later just comes back to bite you.
No. You missed the whole "real world systems". E.g. like stock trading, where zero stock trades are tombstones.
Ah, that's appreciated. Indeed, he didn't provide that "why" and tbf that wasn't the point of the article. But thanks for adding that context.
> You missed the whole "real world systems". E.g. like stock trading, where zero stock trades are tombstones.
Hm, I don't think I missed that. This counts as "on the wire". Externally, there are surely good reasons for that representation, though I'd argue that internally it's better to represent this in the type system instead of special-casing 0 everywhere, which can be forgotten, and then you get your (potential) division-by-0 issues. Avoiding them by construction is even better than failing explicitly (which I agree is in turn still better than silently returning 0).
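Something like this hypothetical Rust sketch, with the tombstone convention confined to the wire boundary (not any real trading API):

// Internally, "no trade" is a distinct state rather than the number 0.
fn average_qty(samples: &[u32]) -> Option<u32> {
    if samples.is_empty() {
        None // nothing was measured: say so instead of silently returning 0
    } else {
        Some(samples.iter().sum::<u32>() / samples.len() as u32)
    }
}

// The external system's 0-as-tombstone convention is applied in one place.
fn to_wire(qty: Option<u32>) -> u32 {
    match qty {
        Some(n) => n,
        None => 0,
    }
}

fn main() {
    let qty = average_qty(&[]); // None, so nothing downstream mistakes it for a real 0
    println!("wire value: {}", to_wire(qty));
}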
you don't have control over it. if you emit a zero stock trade, your broker's account (a system outside of your control) might get zeroed instead of recording "no shares bought". this could happen, for example, if you periodically buy a quantity of shares based on some average of some measurement and zero measurements got taken over the time window. and you can't make that "unrepresentable in the type system" if you didn't know the convention in advance. did you know that it is a convention that zero stock trades are tombstones?
or imagine you are building something that flies a plane and airspeed is measured as an average of some measurements, and the instrument flakes out over your sampling interval. all of a sudden your system thinks you are at a stop. erroring is better because no further instructions based on incorrect data will propagate and shit in the pool.
1/0=0 does not immediately make you stop and think "well, wait a second", and it could even be hidden behind a function like avg(...)
if you want a nocrash division that's fine but it shouldn't be "/"
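For what it's worth, Rust's integers already make that split: `/` panics on a zero divisor, while the total, no-crash version is spelled differently and returns an Option:

fn main() {
    let (a, b) = (1i32, 0i32);

    // Explicitly total division: the zero case is visible at the call site.
    match a.checked_div(b) {
        Some(q) => println!("quotient: {q}"),
        None => println!("division by zero"),
    }

    // Plain `/` refuses to produce a junk value: it panics on a zero divisor
    // (and the compiler may even reject an obviously constant case like this).
    // let _q = a / b;
}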
I think it is fair to say that AIs do not yet "understand" what they say or what we ask them.
When I ask it to use a specific MCP to complete a certain task, and it proceeds to not use that MCP, this indicates a clear lack of understanding.
You might say that the fault was mine, that I didn't setup or initialize the MCP tool properly, but wouldn't an understanding AI recognize that it didn't have access to the MCP and tell me that it cannot satisfy my request, rather than blindly carrying on without it?
LLMs consistently prove that they lack the ability to evaluate statements for truth. They lack, as well, an awareness of their unknowing, because they are not trying to understand; their job is to generate (to hallucinate).
It astonishes me that people can be so blind to this weakness of the tool. And when we raise concerns, people always say
"How can you define what 'thinking' is?"
"How can you define 'understanding'?"
These philosophical questions are missing the point. When we say it doesn't "understand", we mean that it doesn't do what we ask. It isn't reliable. It isn't as useful to us as perhaps it has been to you.
You can make categorical judgements on many things that are hard to define, often by taking the opposite approach. It's hard to define what consciousness itself is, yet I can define what it isn't, and that teases a lot out.
So I can categorically say LLMs do not understand, by quite literally understanding what NOT understanding is.
We know what LLMs are and what they are NOT.
Please see my earlier comment above:
> LLM's do not think, understand, reason, reflect, comprehend and they never shall.
How do you know what kind of "understanding" a python has? Why python and not a lizard? Or a bird? What method do you use for evaluating this? Does a typical python do what you tell your ai agent to do?...
C'mon, this comparison seems to be very, very unscientific. No offense...
If you don’t want your image to look like it’s been marinated in nicotine, throw stuff like “neutral white background, daylight balanced lighting, no yellow tint” into your prompt. Otherwise, congrats on your free vintage urine filter.
They don't want you creating images that mimic either works of other artists to an extent that's likely to confuse viewers (or courts), or that mimic realistic photographs to an extent that allows people to generate low-effort fake news. So they impose an intentionally-crappy orange-cyan palette on everything the model generates.
Peak quality in terms of realistic color rendering was probably the initial release of DALL-E 3. Once they saw what was going to happen, they fixed that bug fast.
SDXL and FLUX models with LoRAs can and do vastly outperform the singular big models at tons of things those can't or won't do now. Various subreddits and civitAI blogs describe comfyui workflows and details on how to maximize LoRA effectiveness, and are probably all you need for a guided tour of that space.
This is not my special interest, but the DIY space is much more interesting than the SaaS offerings; that holds for generative AI more generally: the DIY scene is going to be more interesting.
OpenAI's new image generation model is autoregressive, while DALL-E was diffusion. The yellowish tone is an artefact of their autoregressive pipeline, if I recall correctly.
Could be. My point is that if the pipeline itself didn't impart an unmistakable character to the generated images, OpenAI would feel compelled to make it do so on purpose.
Most DALL-E 3 images have an orange-blue cast, which is absolutely not an unintended artifact. You'd literally have to be blind to miss it, or at least color-blind. That wasn't true at first -- check the original paper, and try the same prompts! It was something they started doing not long after release, and it's hardly a stretch to imagine why.
They will be doing the same thing for the same reasons today, assuming it doesn't just happen as a side effect.
Genuine question:
When companies are bailed out by the taxpayer, why can't we then give ownership of the company to the taxpayer? Effectively 'buying' the company to save it, instead of just gifting it money for it to survive.
Is there a reason not to make the taxpayer (or government) the main shareholder after the bailout?
This has precedent in the US, like when the government nationalized failing freight railroads and merged them into Conrail. But after the more recent bank and auto bailouts I wouldn't expect to see this happen again. The shareholders would really prefer to have money thrown at them but also keep their stake.
> But after the more recent bank and auto bailouts I wouldn't expect to see this happen again. The shareholders would really prefer to have money thrown at them but also keep their stake.
The auto bailouts did not feature shareholders having money thrown at them and keeping their stakes (GM and Chrysler shareholders, for instance, were almost completely wiped out in the bailout, with the new GM owned by the UAW and the US and Canadian governments; the new Chrysler was majority owned by Fiat with minority stakes held by an autoworkers pension fund and the US and Canadian governments.)
Bank bailouts were more protective of shareholders because they were mostly government purchases of distressed assets or extensions of credit.
Not really. Intel had funding from Biden's bill, and Trump told them that in order to have that money they had to give a stake to the government. In this case Intel isn't being bailed out, just securing funding for new chip foundries.
> Genuine question : When companies are bailed out by the taxpayer, why can't we then give ownership of the company to the taxpayer?
tHaT'S sOcIAlIsM
Though ironically for the first time ever, the people shouting that would actually be correct. Kind of.
If you mean in a broad "is this possible" sense though, sure, absolutely. Entities owned in part or in whole by the state are not uncommon, but anytime such things are proposed in the US, the right loses its fucking mind.
Edit: hit the comment rate wall.
> I see! But still, I don't get in what sense it is more socialist that just having people actually buy the company to save it (instead of just saving it 'for free'). If anything it makes it more capitalist if the taxpayers invest in the bailout, instead of just giving it away!
Because socialism isn't an economic system in American politics, it's a scary word for whatever the Russians and CHYNA are. It's also completely interchangeable with communism, because our conservative party here has long since abandoned anything resembling reality, and even when they were here with us, they didn't know the difference between the two.
Doing it this way is capitalist because it's American. Doing it the other way is evil because it's socialist/communist, like the Russians/Chinese/North Koreans do, with the lot of this rhetoric absolutely drowning in racism and nationalism. Mind you, all those countries have issues, absolutely. I'm just saying a conservative with a gun to their head couldn't actually explain those issues; they're just evil because they're not American. [ insert eagle screech here ]
Honestly the best distillation is: It's Freedom when private citizens run things, and it's Communism when the government does. The fact that the government sometimes has to give rich private citizens a shit ton of money to keep things afloat is not reflected upon.
If you try and analyze it through a lens of what these words actually mean, yeah it makes no goddamn sense at all.
But not really, right? It could happen in the market: company A chooses to bail out company B by buying it and investing money to keep it afloat.
Except company A in this case is the government. No? Why is it that when it is the government doing this action, it has to gift the money instead of potentially profiting from it?
Edit: just saw the edit. I see! But still, I don't get in what sense it is more socialist that, instead of just saving the company 'for free', people actually buy (forcefully invest in?) the company to save it. If anything it makes it more capitalist if the taxpayers invest in the bailout, instead of just giving it away!
It's pure propaganda playing on American fear of socialism.
In other very capitalist economies governments did take stakes in banks in return for bailouts. The first British bank that needed one (Northern Rock) was entirely taken over by the government and shareholders just lost their money. The government bought stakes in others. It was still criticised as being too generous to shareholders and management.