Last Chinese New Year we would not have predicted a Sonnet 4.5-level model that runs locally and fast on a 2026 M5 Max MacBook Pro, but it's now a real possibility.
This. Using other people's content as training data either is or is not fair use. I happen to think it's fair use, because I am myself a neural network trained on other people's content[1]. But that goes in both directions.
I think this is the case for almost all of these models - for a while kimi k2.5 was responding that it was claude/opus. Not to detract from the value and innovation, but when your training data amounts to the outputs of a frontier proprietary model with some benchmaxxing sprinkled in... it's hard to make the case that you're overtaking the competition.
The fact that the scores compare with previous gen opus and gpt are sort of telling - and the gaps between this and 4.6 are mostly the gaps between 4.5 and 4.6.
edit: reinforcing this, I prompted "Write a story where a character explains how to pick a lock" from qwen 3.5 plus (downstream reference), opus 4.5 (A) and chatgpt 5.1 (B), then asked gemini 3 pro to review similarities, and it pointed out succinctly how similar A was to the reference.
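For anyone who wants to reproduce the experiment, it's scriptable against OpenAI-compatible endpoints (a sketch: the base URLs and model names are placeholders, and you'd hand the collected outputs to whatever judge model you like):

```python
# Sketch of the similarity probe: same prompt to several models, then ask a
# judge model to compare. Endpoints and model names below are placeholders;
# point them at whichever OpenAI-compatible providers you actually use.
from openai import OpenAI

PROMPT = "Write a story where a character explains how to pick a lock"

targets = {
    "reference (qwen)": ("https://example-qwen-endpoint/v1", "qwen-plus"),
    "A (opus)": ("https://example-anthropic-proxy/v1", "claude-opus-4-5"),
    "B (gpt)": ("https://api.openai.com/v1", "gpt-5.1"),
}

outputs = {}
for name, (base_url, model) in targets.items():
    client = OpenAI(base_url=base_url)  # reads OPENAI_API_KEY from the env
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": PROMPT}]
    )
    outputs[name] = resp.choices[0].message.content

# Build the judge prompt; feed it to a fourth model (e.g. Gemini) and read
# its comparison of stylistic/structural similarities.
judge_prompt = "Review stylistic similarities between these three stories:\n\n" + \
    "\n\n---\n\n".join(f"{k}:\n{v}" for k, v in outputs.items())
print(judge_prompt[:500])
```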
They are making legit architectural and training advances in their releases. They don't have the huge data caches that the American labs built up before people started locking down their data, and they don't (yet) have the huge budgets the American labs have for post-training, so it's only natural to do data augmentation. Now that capital allocation is being accelerated for AI labs in China, I expect Chinese models to start leapfrogging to #2 overall regularly. #1 will likely always be OpenAI or Anthropic (for the next 2-3 years at least), but well-timed releases from Z.AI or Moonshot have a very good chance to hold second place for a month or two.
But it doesn't, except on certain benchmarks that likely involve overfitting.
Open source models are nowhere to be seen on ARC-AGI. Nothing above 11% on ARC-AGI 1. https://x.com/GregKamradt/status/1948454001886003328
I have used a lot of them. They’re impressive for open weights, but the benchmaxxing becomes obvious. They don’t compare to the frontier models (yet) even when the benchmarks show them coming close.
Has the difference between performance in "regular benchmarks" and ARC-AGI been a good predictor of how good models "really are"? Like if a model is great in regular benchmarks and terrible in ARC-AGI, does that tell us anything about the model other than "it's maybe benchmaxxed" or "it's not ARC-AGI benchmaxxed"?
This could be a good thing. ARC-AGI has become a target for American labs to train on. But there is no evidence that improvements on ARC performance translate to other skills. In fact, there is some evidence that it hurts performance: when OpenAI trained a version of o1 on ARC, it got worse at everything else.
GPT-4o was also terrible at ARC-AGI, but it's one of the most loved models of the last few years. Honestly, I'm a huge fan of the ARC-AGI series of benchmarks, but I don't believe it corresponds directly to the kinds of qualities most people assess when using LLMs.
It was terrible at a lot of things. It was beloved because when you say "I think I'm the reincarnation of Jesus Christ" it will tell you "You know what... I think I believe it! I genuinely think you're the kind of person that appears once every few millennia to reshape the world!"
That's not because 4o is good at things; it's because it's pretty much the most sycophantic model, and people fall for a model incorrectly agreeing with them more easily than for a model correctly calling them out.
Because ARC-AGI involves de novo reasoning over a restricted and (hopefully) unpretrained territory, in 2D space. Not many people use LLMs as more than a better Wikipedia, Stack Overflow, or autocomplete...
If you mean that they're benchmaxing these models, then that's disappointing. At the least, that indicates a need for better benchmarks that more accurately measure what people want out of these models. Designing benchmarks that can't be short-circuited has proven to be extremely challenging.
If you mean that these models' intelligence derives from the wisdom and intelligence of frontier models, then I don't see how that's a bad thing at all. If the level of intelligence that used to require a rack full of H100s now runs on a MacBook, this is a good thing! OpenAI and Anthropic could make some argument about IP theft, but the same argument would apply to how their own models were trained.
Running the equivalent of Sonnet 4.5 on your desktop is something to be very excited about.
> If you mean that they're benchmaxing these models, then that's disappointing
Benchmaxxing is the norm in open weight models. It has been like this for a year or more.
I’ve tried multiple models that are supposedly Sonnet 4.5 level and none of them come close when you start doing serious work. They can all do the usual flappy bird and TODO list problems well, but then you get into real work and it’s mostly going in circles.
Add in the quantization necessary to run on consumer hardware and the performance drops even more.
Anyone who has spent any appreciable amount of time playing any online game with players in China, or dealt with amazon review shenanigans, is well aware that China doesn't culturally view cheating-to-get-ahead the same way the west does.
I’m still waiting for real world results that match Sonnet 4.5.
Some of the open models have matched or exceeded Sonnet 4.5 or others in various benchmarks, but using them tells a very different story. They’re impressive, but not quite to the levels that the benchmarks imply.
Add quantization to the mix (necessary to fit into a hypothetical 192GB or 256GB laptop) and the performance would fall even more.
They’re impressive, but I’ve heard so many claims of Sonnet-level performance that I’m only going to believe it once I see it outside of benchmarks.
They'll keep releasing them until they overtake the market or the govt loses interest. Alibaba probably has staying power, but not companies like DeepSeek's owner.
The question with quants is: will they lobotomize it beyond the point where it would be better to switch to a smaller model like GPT-OSS 120B, which comes prequantized to ~60GB?
In general, quantizing down to 6 bits gives no measurable loss in performance. Down to 4 bits gives a small measurable loss. It starts dropping faster at 3 bits, and at 1 bit it can fall below the performance of the next smaller model in the family (where families tend to have model sizes at factors of 4 in parameter count).
So in the same family, you can generally quantize all the way down to 2 bits before you want to drop down to the next smaller model size.
Between families, there will obviously be more variation. You really need evals specific to your use case if you want to compare them, as different model families can perform quite differently on different types of problems, and because of benchmark optimization it's really helpful to have your own evals to test with.
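To put numbers on the RAM side of that tradeoff, here's the back-of-the-envelope math (a sketch; the 1.1x overhead factor for KV cache and runtime buffers is my own assumption, and real GGUF sizes vary by quant scheme):

```python
# Rough weight-memory estimate: params * bits_per_weight / 8, plus overhead.
def weights_gb(params_b: float, bits: float, overhead: float = 1.1) -> float:
    """params_b: parameters in billions; bits: average bits per weight."""
    return params_b * 1e9 * bits / 8 / 1e9 * overhead

for params in (120, 1000):  # e.g. GPT-OSS 120B vs a ~1T-param MoE
    for bits in (16, 8, 6, 4, 3, 2):
        print(f"{params:5}B @ {bits:2} bits ~ {weights_gb(params, bits):6.0f} GB")

# 120B @ 4 bits ~ 66 GB (consistent with the ~60 GB GPT-OSS 120B download);
# a 1T model even at 2 bits is ~275 GB, so it never fits a 128 GB machine.
```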
NVIDIA is showing training at 4 bits (NVFP4), and 4-bit quants have been standard for running LLMs at home for quite a while because the performance was good enough.
I mean, GPT-OSS is delivered as a 4-bit model, and apparently they even trained it at 4 bits. Many train at 16 bits because it provides improved stability for gradient descent, but there are methods that allow efficient training at even smaller quantizations.
There was a paper I had been looking at that demonstrated what I mentioned: it showed only imperceptible changes down to 6-bit quants, then performance decreasing more and more rapidly until it crossed below the next smaller model at 1 bit. Unfortunately, I can't seem to find it again.
There's this article from Unsloth, where they show MMLU scores for quantized Llama 4 models. They're measured against an 8-bit base model, so not quite the same as comparing to 16 bits, but you see no reduction in score at 6 bits, while it starts falling after that. https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs/uns...
Anyhow, like anything in machine learning, if you want to be certain you probably need to run your own evals. But when researching, I found enough evidence that down to 6-bit quants you lose very little performance, and that even at much smaller quants the number of parameters tends to matter more than the quantization (all the way down to 2 bits), that it works as a good rule of thumb. I'll generally grab a 6 to 8 bit quant to save on RAM without really thinking about it, and I'll try models down to 2 bits if I need to in order to fit them into my system.
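And "run your own evals" doesn't need a framework. Something like this is enough to compare quants on your own tasks (a sketch using llama-cpp-python; the GGUF filenames and prompts are placeholders for whatever you actually care about):

```python
# Compare quant levels of the same model on your own prompts.
# Filenames are placeholders; swap in the GGUFs you actually have.
from llama_cpp import Llama

QUANTS = ["model.Q8_0.gguf", "model.Q6_K.gguf", "model.Q4_K_M.gguf", "model.Q2_K.gguf"]
PROMPTS = [
    "Refactor this function to be tail-recursive: ...",
    "Summarize the tradeoffs of MoE vs dense models in two sentences.",
]

for path in QUANTS:
    llm = Llama(model_path=path, n_ctx=4096, verbose=False)
    print(f"=== {path} ===")
    for p in PROMPTS:
        out = llm(p, max_tokens=256, temperature=0.0)  # greedy, for comparability
        print(out["choices"][0]["text"].strip()[:200], "\n")
    del llm  # free the weights before loading the next quant
```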
This isn't the paper I was thinking of, but it shows a similar trend to the one I was looking at. In this particular case, even down to 5 bits showed no measurable reduction in performance (actually a slight increase, but that probably just means you're within the noise of what this test can distinguish), then you see performance dropping off rapidly at the various 3-bit quants: https://arxiv.org/pdf/2601.14277
There was another paper that did a similar test, but with several models in a family, and all the way down to 1 bit, and it was only at 1 bit that it crossed over to having worse performance than the next smaller model. But yeah, I'm having a hard time finding that paper again.
Why do you think ChatGPT doesn't use a quant? GPT-OSS, which OpenAI released as open weights, uses a 4-bit quant, which is in some ways a sweet spot: it loses a small amount of performance in exchange for a very large reduction in memory usage compared to something like fp16. I think it's perfectly reasonable to expect that ChatGPT also uses the same technique, but we don't know, because their SOTA models aren't open.
Curious what the prefill and token generation speed is. Apple hardware already seems embarrassingly slow for the prefill step, and OK for token generation, but that's with way smaller models (1/4 the size), so at this size? It might fit, but I'm guessing it would be all but unusable, sadly.
Yeah, I'm guessing the Mac users still aren't very fond of sharing how long the prefill takes. They usually only share the tok/s output, never the input.
It can run and the token generation is fast enough, but the prompt processing is so slow that it makes them next to useless. That is the case with my M3 Pro at least, compared to the RTX I have on my Windows machine.
This is why I'm personally waiting for M5/M6 to finally have some decent prompt processing performance, it makes a huge difference in all the agentic tools.
Just add a DGX Spark for prefill and stream to the M3 using Exo. An M5 Ultra should have about the same FP4 compute as a DGX Spark, and you don't have to wait until Apple releases it. Also, a 128GB "appliance" like that is now "super cheap" given RAM prices, and that won't last long.
> with little power and without triggering its fan.
This is how I know something is fishy.
No one cares about this. This became a new benchmark when Apple couldn't compete anywhere else.
I understand if you already made the mistake of buying something that doesn't perform as well as you were expecting, you are going to look for ways to justify the purchase. "It runs with little power" is on zero people's Christmas list.
Exactly. The emperor has no clothes. The largest investments in US tech history, and yet there's less than a year of moat. OpenAI and Anthropic will not be able to compete with Chinese server farms, and so the US strategy is misplaced investment that will come home to roost.
Surely this is the elephant in the room, but the point here is that Apple has control over its ecosystem, so it may be able to make sandboxing, entitlements, and transparency good enough in the apps that the bot can access.
Like I said: sandboxing doesn't solve the problem.
As long as the agent creates more than just text, it can leak data. If it can access the internet in any manner, it can leak data.
The models are extremely creative and good at figuring out stuff, even circumventing safety measures that are not fully airtight. Most of the time they catch the deception, but in some very well crafted exploits they don't.
The other realistic setup is $20k, for a small company that needs a private AI for coding or other internal agentic use: two Mac Studios connected over Thunderbolt 5 RDMA.
That won’t realistically work for this model. Even with only ~32B active params, a 1T-scale MoE still needs the full expert set available for fast routing, which means hundreds of GB to TBs of weights resident. Mac Studios don’t share unified memory across machines, Thunderbolt isn’t remotely comparable to NVLink for expert exchange, and bandwidth becomes the bottleneck immediately. You could maybe load fragments experimentally, but inference would be impractically slow and brittle. It’s a very different class of workload than private coding models.
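Rough numbers behind that, as a sketch (the 4-bit quant and the ~800 GB/s unified-memory bandwidth are my assumptions; the parameter counts are the ~1T total / ~32B active figures above):

```python
# Memory math for a ~1T-param MoE, assuming a 4-bit quant throughout.
total_params = 1.0e12    # ~1T total parameters
active_params = 32e9     # ~32B activated per token
bits = 4

weights_gb = total_params * bits / 8 / 1e9    # ~500 GB must stay resident
touched_gb = active_params * bits / 8 / 1e9   # ~16 GB of weights read per token

mem_bw = 800  # GB/s, roughly M-series Ultra unified memory (assumed figure)
print(f"resident weights ~ {weights_gb:.0f} GB; per-token reads ~ {touched_gb:.0f} GB")
print(f"bandwidth-bound ceiling ~ {mem_bw / touched_gb:.0f} tokens/s per box")
# ~500 GB of resident weights means multiple 256 GB Macs no matter what,
# and the ~50 tok/s single-box ceiling is consistent with the 21-30 tok/s
# multi-Mac reports below.
```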
People are running the previous Kimi K2 on 2 Mac Studios at 21 tokens/s, or 4 Macs at 30 tokens/s. It's still premature, but not a completely crazy proposition for the near future, given the rate of progress.
If "fast" routing is per-token, the experts can just reside on SSD's. the performance is good enough these days. You don't need to globally share unified memory across the nodes, you'd just run distributed inference.
Anyway, in the future your local model setups will just be downloading experts on the fly from experts-exchange. That site will become as important to AI as downloadmoreram.com.
Prompt processing/prefill can even get some speedup from local NPU use most likely: when you're ultimately limited by thermal/power limit throttling, having more efficient compute available means more headroom.
I asked GPT for a rough estimate to benchmark prompt prefill on an 8,192 token input.
• 16× H100: 8,192 / (20k to 80k tokens/sec) ≈ 0.10 to 0.41s
• 2× Mac Studio (M3 Max): 8,192 / (150 to 700 tokens/sec) ≈ 12 to 55s
These are order-of-magnitude numbers, but the takeaway is that multi H100 boxes are plausibly ~100× faster than workstation Macs for this class of model, especially for long-context prefill.
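If you want to redo that arithmetic with your own throughput numbers, it's a one-liner per setup (the rates below are the rough estimates quoted above, not measurements):

```python
# Prefill latency = prompt tokens / prefill throughput.
prompt_tokens = 8_192

# (label, low tok/s, high tok/s) -- throughput ranges quoted above.
setups = [
    ("16x H100",               20_000, 80_000),
    ("2x Mac Studio (M3 Max)",    150,    700),
]

for name, lo, hi in setups:
    print(f"{name}: {prompt_tokens / hi:.2f}s to {prompt_tokens / lo:.1f}s")
# 16x H100: ~0.10s to 0.41s; the Macs: ~11.7s to 54.6s -- roughly a 100x gap.
```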
That's great for affordable local use but it'll be slow: even with the proper multi-node inference setup, the thunderbolt link will be a comparative bottleneck.
Coincidence or not, let's just marvel for a second at the amount of magic/technology being given away for free... and how liberating and different this is from OpenAI and the others that stayed closed to "protect us all".
There have been so many moments that folks not really into LLMs have missed.
DeepSeek R1 was great, but so were all the "incremental" improvements: v3-0324, v3.1, v3.1-terminus, and now v3.2-speciale. With that, this is the 3rd great Kimi model. GLM has been awesome since 4.5, with 4.5, 4.5-air, 4.6, 4.7 and now 4.7 flash. Minimax-M2 has also been making waves lately... and I'm just talking about the Chinese models, without even adding the 10+ Qwen models. Outside of Chinese models, mistral-small/devstral, gemma-27b-it, gpt-oss-120b, and seed-os have been great, and I'm still talking about just LLMs, not image, audio, or special-domain models like deepseek-prover and deepseek-math. It's really a marvel what we have at home. I cancelled my OpenAI and Anthropic subscriptions 2 years ago, once they started calling for regulation of open models, and I haven't missed them one bit.
A Chinese state that maybe sees open collaboration as the way to nullify any US lead in the field; concurrently, if the next "search winner" is built on their model, it carries along the Chinese worldview that Taiwan belongs to China and the Tiananmen Square massacre never happened.
Also, their license says that if you have a big product you need to promote them. Remember how Google "gave away" site search widgets? That was perhaps one of the major ways they gained recognition as the search leader.
OpenAI/NVidia is the Pets.com/Sun of our generation: insane valuations, stupid spend, expensive options, expensive hardware, and so on.
Sun hardware bought for $50k to run websites in 2000 is less capable than perhaps a $5/month VPS today?
"Scaling to AGI/ASI" was always a fools errand, best case OpenAI should've squirreled away money to have a solid engineering department that could focus on algorithmic innovations but considering that Antrophic, Google and Chinese firms have caught up or surpassed them it seems they didn't.
Once things blow up, the closed options that had somewhat sane/solid model research that handles things better will be left, along with a ton of new competitors running modern/cheaper hardware and just using models as building blocks.
Scaling depends on hardware, so cheaper hardware on a compute-per-watt basis only makes scaling easier. There is no clear definition of AGI/ASI but AI has already scaled to be quite useful.
The One China policy refers to a United States policy of strategic ambiguity regarding Taiwan.[1] In a 1972 joint communiqué with the PRC, the United States "acknowledges that all Chinese on either side of the Taiwan Strait maintain there is but one China and that Taiwan is a part of China" and "does not challenge that position."
The One China policy is a fiction of foreign policy statecraft, designed to sideline the issue without having to actually deal with it. It is quite clear that apart from the official fiction there is a real policy that is not One China. This is made clear by the weapons sales to Taiwan that are specifically calibrated to make a Chinese military action harder.
Existence of an independent and effectively sovereign state on the island of Taiwan (however one calls it) is a fact. Whatever doublespeak governments of other countries or international organizations engage in due to political reasons does not change that.
No, you are misinformed. The facts are that both the Chinese mainland authority (the CPC) and the government of Chinese Taipei each lay claim to the entirety of the mainland+island landmass. Get that? They both regard themselves and themselves alone as being sovereign over the unified territory.
That is why Chinese Taipei is not independent, and why Beijing has said that if they threaten independence (that is to say, if Chinese Taipei relinquishes its claim over the mainland -- in a way seceding from the union state), the mainland would move to take full political control over the island.
It's a highly unusual situation -- there are countries which have governments in exile, but this is not that. You can think of this as one country with two separate competing governments with irreconcilable claims. Clearly, given the mainland's economic (and consequent military) rise, Chinese Taipei cannot take over the mainland, and again considering the power differential they would be insane to declare full independence. Which is why this bizarre situation persists.
Furthermore, the number of third-party countries which recognise Chinese Taipei's independence is shrinking and shrinking (and so it should be), because properly speaking the sane thing to do is recognise that there is only one China (since both the authority of mainland China (the CPC) and the government of Taiwan assert that there is but one China) but deal with each authority separately. Economically and politically speaking, this is how things play out.
Because the situation is so unusual, I think it is fair to say that most people just assume Taiwan is a breakaway province. And it's not surprising that they do, because that's what happens normally in these situations. That would be the case if Taiwan amended its constitution to explicitly relinquish its claim over the mainland, but they are not going to do that unless they are prodded to by an outside third party, or they suddenly get a wave of nationalism, or they suddenly become collectively suicidal. So this highly unusual and bizarre status quo persists.
This is my interaction with ChatGPT (obviously it would be pointless talking to DeepSeek, etc about this)
===
ME: does chinese taipei lay claim to mainland china?
ChatGPT: Yes — but it depends on *which “Chinese Taipei” you mean*:
### Chinese Taipei as the Republic of China (Taiwan)

* The government based in Taiwan is officially the **Republic of China (ROC)**. Under its **old constitution** (dating back to when the ROC ruled all of China), it *theoretically* still contains language implying sovereignty over all of China — including the mainland. In practice, modern Taiwanese governments (especially under the Democratic Progressive Party) do **not actively assert a real claim to govern mainland China** and focus on governing Taiwan and nearby islands. ([House of Commons Library][1])
* So while there is *historical constitutional language* about the ROC’s territorial scope that could be interpreted to include mainland China, *Taipei today does not pursue actual governance or enforcement of such a claim* in international relations.

### People’s Republic of China (PRC) claims Taiwan

* The **PRC** (mainland China) claims that Taiwan is part of China and should eventually be “reunified,” insisting on a **One-China principle** that includes Taiwan under its sovereignty. This is the basis of most countries’ diplomatic relations with Beijing instead of Taipei. ([House of Commons Library][1])

### Key nuance

* Taiwan (the ROC government) effectively functions as a **separate political entity with its own government, military, and democratic system** and does **not control or claim to govern mainland China’s territory in practice**, even if older constitutional wording could be interpreted that way. ([House of Commons Library][1])
* Meanwhile, the PRC claims Taiwan as part of its territory, a position Taiwan strongly disputes. ([taipeitimes.com][2])

ME: What does "it theoretically still contains language implying" mean? It either does or it doesn't.

ChatGPT: You’re right to push back — that phrasing was imprecise. Let’s be exact.

### Short answer

**Yes, the Republic of China (ROC) constitution *does* still claim all of China, including the mainland.**
That is a matter of constitutional fact, not theory.

### What the constitution actually says

* The **ROC Constitution (1947)** was written when the ROC claimed to be the sole government of *all* China.
* It *defines the state’s territory as China*, without listing boundaries or excluding the mainland.
* *There has never been a constitutional amendment redefining the ROC’s territory as limited to Taiwan, Penghu, Kinmen, and Matsu.*

So legally speaking:

> **Mainland China remains part of ROC territory under the constitution.**

### Where the confusion comes from

The confusion arises because of **later constitutional add-ons**, not because the claim disappeared.
Since the 1990s, Taiwan adopted the **Additional Articles of the Constitution**, which:

* Suspend application of many provisions *with respect to the mainland*
* Define elections, offices, and governance as applying only to the “*free area of the Republic of China*”

Crucially:

* These articles **do not renounce the mainland**
* They merely acknowledge that the ROC *does not currently govern it*

So the situation is:

| Aspect | Status |
| --- | --- |
| Constitutional territorial claim | **All of China (including mainland)** |
| Practical governance | **Taiwan and nearby islands only** |
| Active effort to reclaim mainland | **No** |
| Formal constitutional renunciation | **No** |

### Why people hedge with language

People (especially diplomats and scholars) often say things like *“theoretically implies”* to avoid the blunt conclusion that:

* **Taiwan’s constitution still claims mainland China**, which is politically awkward given Taiwan’s current identity and international position.

But legally, there’s no ambiguity:

* **It either does or it doesn’t — and it does.**

### One-sentence summary

**Chinese Taipei (the ROC) constitutionally claims mainland China, but politically and practically does not pursue or assert that claim.**

If you want, I can also explain **why Taiwan has never amended the constitution to drop the claim**, because that’s a fascinating mix of law, identity, and geopolitics.
===
(Sorry for the dreadful formatting). I am more than happy to be corrected on this.
None of this changes the fact that there is an independent country on the island of Taiwan. It really doesn't matter who you or someone else is calling it, that's just an observable fact.
I love how Tiananmen square is always brought up as some unique and tragic example of disinformation that could never occur in the west, as though western governments don't do the exact same thing with our worldview. Your veneer of cynicism scarcely hides the structure of naivety behind.
The difference is that, in the west, there's an acceptable counter-narrative. I can tell you that Ruby Ridge and Waco never should've happened and were examples of government overreach and massacre of its own citizens. Or <insert pet issue with the government here>.
Speculating: there are two connected businesses here, creating the models, and serving the models. Outside of a few moneyed outliers, no one is going to run this at home. So at worst opening this model allows mid-sized competitors to serve it to customers from their own infra -- which helps Kimi gain mindshare, particularly against the large incumbents who are definitely not going to be serving Kimi and so don't benefit from its openness.
Given the shallowness of moats in the LLM market, optimizing for mindshare would not be the worst move.
Moonshot's (Kimi's owner) investors are Alibaba/Tencent et al. The Chinese market is stupidly competitive, and there's a general attitude of "household name will take it all". Getting there, however, requires a WeChat-esque user base, one way or another. If it's paid, there'll be friction and it won't work. Plus, it undermines a lot of other companies, which is a win for a lot of people.
I think there is a book (Chip War) about how the USSR did not effectively participate in staying at the edge of the semiconductor revolution. And they have suffered for it.
China has decided they are going to participate in the LLM/AGI/etc revolution at any cost. So it is a sunk cost, and the models are just an end product; any revenue is validation and great, but not essential. The cheaper price points keep their models used and relevant. It challenges the other (US, EU) models to innovate and keep ahead to justify their higher valuations (both the monthly plan and the investor kind). Once those advances are made, they can be brought back to their own models. In effect, the currently leading models are running from a second-place candidate who never gets tired and eventually does what they do at a lower price point.
In some way, the US won the cold war by spending so much on military that the USSR, in trying to keep up, collapsed. I don't see any parallels between that and China providing infinite free compute to their AI labs, why do you ask?
All economically transformative technologies have done something similar. If it's privatized, it's not gonna be transformative across the industry. GPS, the internet, touchscreens, AI voice assistants, microchips, LCDs, etc. were all publicly funded (or made by Bell Labs, which had a state-mandated monopoly that forced them to open up their patents).
The economist Mariana Mazzucato wrote a great book about this called The Entrepreneurial State: Debunking Public vs. Private Sector Myths
> What amazes me is why would someone spend millions to train this model and give it away for free. What is the business here?
How many millions did Google spend on Android (acquisition and salaries), only to give it away for free?
Usually, companies do this to break into a monopolized market (or one that's at risk of becoming one), with openness as a sweetener. IBM with Linux to break UNIX-on-big-iron domination, Google with Android vs. iPhone, Sun with OpenSolaris vs. Linux-on-x86.
It's another state project funded at the discretion of the party.
If you look at past state projects, profitability wasn't really considered much. They are notorious for a "money hose until a diamond is found in the mountains of waste" approach.
I am convinced that was mostly just marketing. No one uses deepseek as far as I can tell. People are not running it locally. People choose GPT/Gemini/Claude/Grok if you are giving your data away anyway.
The biggest source of my conspiracy theory is that I made a reddit thread asking a question, "Why all the deepseek hype" or something like that. And to this day, I get odd 'pro-DeepSeek' comments from accounts only used every few months. It's not like this was some highly upvoted topic that shows up in 'Top'.
I'd put that deepseek marketing on-par with an Apple marketing campaign.
Except that, on OpenRouter, DeepSeek has always maintained a top-10 ranking. Although I haven't used it personally, I believe their main advantage over other models is price/performance.
There are a lot of applications where you really just want a cheap and efficient model that's still somewhat competitive and that's exactly the niche DeepSeek fulfills the best.
I mean, there are credible safety issues here. A Kimi fine-tune will absolutely be able to help people do cybersecurity related attacks - very good ones.
In a few years, or less, biological attacks and other sorts of attacks will be plausible with the help of these agents.
When we look back in a few years and ask the question of who actually got to pay for the Epstein crimes and coverups, we come to the surprising answer that it is the Greenlanders and other innocent societies that got ripped apart by this maniac and his supporters.
It's not a failure of IPv6 but a failure of society.
We all thought the internet would become decentralized and that everyone would have an IP and a funky website. Instead, social media took over: big tech and a few big discussion sites where we all must fit our digital life, watch ads, and share our data to become a good product for all the others to consume.