I'm not quite sure what to make of this. Is it a joke, a serious paper, or more of a poem? How much of this did you use LLM assistance for? It's so dense, with tons of detail and yet no useful explanation of any of the contents. You could never use it for anything practical, and yet perhaps I can see it as being a sort of art piece and a commentary on the scientific process.
The post actually has great benchmark tables inside of it. They might be outdated in a few months, but for now, it gives you a great summary. Seems like Gemini wins on image and video perf, Claude is the best at coding, ChatGPT is the best for general knowledge.
But ultimately, you need to try them yourself on the tasks you care about and just see. My personal experience is that right now, Gemini Pro performs the best at everything I throw at it. I think it's superior to Claude and all of the OSS models by a small margin, even for things like coding.
I like Gemini Pro's UI over Claude's so much, but honestly I might start using Kimi K2.5 if it's open source and roughly on par with Gemini Pro/ChatGPT/Claude, because at that point I feel like the differences are negligible and we are getting SOTA open-source models again.
> honestly I might start using Kimi K2.5 if it's open source and roughly on par with Gemini Pro/ChatGPT/Claude, because at that point I feel like the differences are negligible and we are getting SOTA open-source models again.
Me too!
> I like Gemini Pro's UI over Claude's so much
This I don't understand. I mean, I don't see a lot of difference between the two UIs. Quite the opposite: apart from some animations, rounded corners and color grading, they seem to look very alike, no?
Y'know, I ended up buying Kimi's Moderato plan, which is $19, but they had this unique idea where you can talk to a bot and have it reduce the price.
I got it to reduce the first month's price to $1.49 (it could go down to $0.99 and my frugal mind wanted it, haha, but I just couldn't get it to do that, lol).
Anyways, afterwards, for privacy purposes (and because I am a minor, so I don't have a card), I ended up going to G2A to get a $10 Visa gift card, essentially, and used it. (I had to pay $1 extra, but sure.)
Installed Kimi Code on my Mac and I'm trying it out. Honestly, I am kind of liking it.
My internal benchmark is creating Pomodoro web apps in Go... Gemini 3 Pro has nailed it; I just tried the Kimi version and it does have some bugs, but it feels like it added more features.
Gonna have to try it out for a month.
I mean, I just wish it were this cheap for the whole year :< (as I could then move on from, say, the completely free models).
I've read several people say that Kimi K2 has a better "emotional intelligence" than other models. I'll be interested to see whether K2.5 continues or even improves on that.
Yup, I experience the same. I don't know what they do to achieve this, but it gives them an edge; I'm really curious to learn more about what makes it so good at it.
A lot of people point to the Muon optimizer, which Moonshot (the creators of Kimi) scaled up for LLM training. Compared to the standard optimizer AdamW, Muon orthogonalizes each matrix update, which amplifies low-magnitude gradient directions and makes the model learn faster (and maybe gives Kimi its unique qualities).
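For intuition, here's a minimal NumPy sketch of the core Muon step (my own simplification, not Moonshot's actual code): the momentum matrix is approximately orthogonalized with a Newton-Schulz iteration, so every singular direction of the update ends up at roughly the same magnitude. The polynomial coefficients are the ones from the public Muon reference implementation; the learning rate and momentum constant are purely illustrative.

    import numpy as np

    def newton_schulz_orthogonalize(G, steps=5, eps=1e-7):
        # Approximately push all singular values of G towards 1 with an
        # odd-polynomial Newton-Schulz iteration (no SVD needed). Coefficients
        # follow the public Muon reference implementation.
        a, b, c = 3.4445, -4.7750, 2.0315
        X = G / (np.linalg.norm(G) + eps)   # scale so the spectral norm is <= 1
        transposed = G.shape[0] > G.shape[1]
        if transposed:
            X = X.T                         # iterate on the smaller Gram matrix
        for _ in range(steps):
            A = X @ X.T
            X = a * X + (b * A + c * (A @ A)) @ X
        return X.T if transposed else X

    def muon_step(weight, grad, momentum, lr=0.02, beta=0.95):
        # One Muon update for a single 2-D weight matrix.
        momentum = beta * momentum + grad
        update = newton_schulz_orthogonalize(momentum)
        # Because the update is (approximately) orthogonal, directions where the
        # raw gradient is tiny get boosted to the same scale as the dominant
        # ones -- the "amplifies low-magnitude directions" effect.
        return weight - lr * update, momentum

In the full optimizer this is applied per 2-D weight matrix (embeddings, norms and other non-matrix parameters typically still use AdamW), but the sketch captures the part people credit for the different training dynamics.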
> I have no idea why so many people think that an argument that AI works is the same thing as an argument that AI will be profitable.
The fact that it works well for expensive categories of output (like software engineering, legal strategy, etc.) makes it difficult to imagine that it won't be profitable. You could still make an argument that the investments being made today are disproportionate, or that intense competition will stifle margins, but it's creating enough value to capture plenty of money.
This looks cool. I see under the "current limitations" section that it only supports a single node. Are there plans to change that? I'd imagine that database instances which got started on a single node deployment would be hard to migrate to a multi-node setup later on if they weren't originally spun up in that configuration (though ofc not impossible, just tricky).
I agree with you. Since we started with single node only, migrating to multi-node will require some user input. We are trying, and will keep trying, to keep breaking changes to a minimum, but once we start to support multi-node in the future, the user will need to do some work; we'll make sure it's minimal by providing simple scripts and tools.
I love the idea of a minimal desktop environment, but I've never tried XFCE. Are there any themes that folks here would recommend to make it much prettier? I find the screenshots on their homepage very intuitive but a bit ugly.
I'd rather choose Zukitre. It's neither a dark theme nor a light one blinding your eyes. Pretty neutral, gray.
As for the icon theme, Elementary XFCE works perfectly well with Zukitre. If not, ePapirus or Papirus itself. Simple and flat but contrasted, the opposite of a good chunk of flat themes today, where you can't guess where the buttons start and end.
Once you get used to that theme, the Night Mode is useless, as you can just spawn
sct 5500  # or xsct 5500
at daytime, or
sct 3500
at night time.
sct/xsct will work with any window manager, too. And the Zukitre themes blend really well with minimal window managers such as CWM, i3, DWM and the like, as they have neither curves nor gradients.
I use Arc-Dark with elementary-xfce-dark icons (but I have a script to toggle dark mode, where light mode is Adwaita with elementary-xfce icons).
TBH I typically run things fullscreen, so the only part of xfce I normally "see" is a thin task bar at the bottom with open windows and clock and such. Well, except for when I use Thunar, which is a nice enough file manager.
I have used XFCE since 2000. It ran great on 8 MiB of RAM on a diskless 486, with the hard drive mounted over Ethernet. It is my robust daily driver.
For dark mode, try:
- in 'Appearance': set Adwaita (dark),
- in 'Window Manager': set 'Default',
- in 'Panel': set dark mode.
This works in Debian 12 (running XFCE 4.18) and looks beautiful.
Easy on the eyes, readable, comfortable.
For other themes, look at xfce-look.org. You install these by decompressing the tarballs into ~/.themes/<theme_name> and then selecting them in the settings manager.
Are you sure just switching up the colors and background image wouldn't do it for you?
I just looked at the homepage to see if it was anything different than I see on my machine, and if anything it looks nicer there. It's certainly nothing fancy, but I feel like there's hardly enough there to really count as "ugly". It all fades into the background quickly when you're doing actual work on it. But YMMV I guess.
When reading through the project's list of JS restrictions for "stricter" mode, I was expecting to see that it would limit many different JS concepts. But in fact, none of the things that are impossible in this subset are things I would do in the course of normal programming anyway. I think all of the JS code I've written over the past few years would work out of the box here.
I was surprised by this one that only showed up lower in the document:
- Date: only Date.now() is supported. [0]
I certainly understand not shipping the JS Date library, especially in an embedded environment, for both code-size and practicality reasons (it's not a great date library), but that would be an issue in many projects (even if you don't use it, libraries you use almost certainly do).
Good catch. I didn't realize that there was a longer list of restrictions below the section called "Stricter mode", and it seems like a lot of String functions I use are missing too.
Anything except a 3-bit quant of GLM 4.6 will exceed those 128 GB of RAM you mentioned, so of course it's slow for you. If you want good speeds, you'll need, at a minimum, to fit the entire thing in memory.
Maybe I'm totally misinterpreting, but the chart I'm looking at says "Net Win Rate of SAM Audio vs. SoTA Separation (text prompted)", so perhaps a lower number means that the alternative model is better?
Now that I go back and read it again, I agree with you. Presumably "win rate" means the percentage of the time the SAM model (Meta's new one) beat the other tool over some set of examples.
This provides plenty of value conceptually, but I wish they were able to push the syntax into a more intuitive place. My biggest gripe with Relay is how forced the syntax feels, and this seems better but still confusing. Take for example this component declaration: