I run models with Claude Code (using the Anthropic API feature of llama.cpp) on my own hardware, and it works every bit as well as Claude worked literally 12 months ago.
If you don't believe me and don't want to mess around with used server hardware you can walk into an Apple Store today, pick up a Mac Studio and do it yourself.
I’ve been doing the same with GPT-OSS-120B and have been impressed.
Only gotcha is that Claude Code expects a 200k context window, while that model supports at most 130k or so. I have to do a /compress when it gets close. I’ll have to see if there is a way to set the max context window in CC.
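On the model side at least, llama.cpp's server lets you pin the context window explicitly at launch with `--ctx-size`; a minimal sketch (the model path and port are assumptions about your setup):

```shell
# Launch llama.cpp's API server with an explicit context window.
# --ctx-size is the llama.cpp flag; the model file path is hypothetical.
llama-server \
  --model ./gpt-oss-120b.gguf \
  --ctx-size 131072 \
  --port 8080
```

That caps what the server will accept, but the client still needs to stay under it, hence the periodic compaction.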
Been pretty happy with the results so far as long as I keep the tasks small and self contained.
I've been making use of gpt-oss-120b extensively for a range of projects, commercial and private, because providers on OpenRouter make it essentially free and instant, and in my experience it's roughly as capable as o4-mini was.
That said, I'm a little surprised to hear you're having great success with it as a coding agent. It's "obviously" worse than the frontier models, and even they make blindingly dumb decisions pretty regularly. Maybe I should give it a shot.
As a solo founder who recently invested in self-hosted build infrastructure because my company runs ~70,000 minutes/month, this change is going to add an extra $140/month for hardware I already own. And that's just today; this number will only go up over time.
I am not open to GitHub extracting usage-based rent from me for using my own hardware.
This is the first time in my 15+ years of using GitHub that I'm seriously evaluating alternative products to move my company to.
But it is not for hardware you own. It is for the use of GitHub's coordinators, the use of which they have been donating to you for free. They have now decided that that service is something they are going to charge for. Your objection to GitHub "extracting usage-based rent from me" seems to ignore that you have been getting usage of their hardware for free up to now.
So, like I said, the question for you is whether that $140/month of service is worth that money to you, or can you find a better priced alternative, or build something that costs less yourself.
My guess is that once you think about this some more you will decide it is worth it, and probably spend some time trying to drive down your minutes/month a bit. But at $140 a month, how much time is that worth investing?
No. It is not worth a time-scaled cost each month for them to start a job on my machines and store a few megabytes of log files.
I'd happily pay a fixed monthly fee for this service, as I already do for GitHub.
The problem here is that this is like a grocery store charging me money for every bag I bring to bag my own groceries.
> But at $140 a month, how much time is that worth investing?
It's not $140/month. It's $140/month today, when my company is still relatively small and it's just me. This cost will scale as my company scales, in a way that is completely bonkers.
> The problem here is that this is like a grocery store charging me money for every bag I bring to bag my own groceries.
This is an odd take because you're completely discounting the value of the orchestration. In your grocery store analogy, who's the orchestrator? It isn't you.
It would be silly to write a new one today. Plenty of open source + indy options to invest into instead.
For scheduled work, cron + a log sink is fine, and for pull request CI there's plenty of alternatives that don't charge by the minute to use your own hardware. The irony here, unfortunately, is that the latter requires I move entirely off of GitHub now.
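For the scheduled-work case, a crontab entry plus a dated log file really does cover a lot; a minimal sketch (the script path and log directory are hypothetical):

```shell
# Run a nightly build at 02:00 and append stdout/stderr to a dated log.
# Note: % must be escaped as \% inside a crontab entry.
0 2 * * * /opt/builds/nightly.sh >> /var/log/builds/nightly-$(date +\%F).log 2>&1
```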
so they are selling a cent's worth of their CPU time at a minute's price
> My guess is that once you think about this some more you will decide it is worth it, and probably spend some time trying to drive down your minutes/month a bit. But at $140 a month, how much time is that worth investing?
It's $140 right now. And if they want to squeeze you for cents worth of CPU time (because for artifact storage you're already paying separately), they *will* squeeze harder.
And more importantly, *RIGHT NOW* it costs more per minute than running a decent-sized runner!
I get the frustration. And I’m no GitHub apologist either. But you’re not being charged for hardware you own. You’re being charged for the services surrounding it (the action runner/executor binary you didn’t build, the orchestrator you configure in their DSL, the artefact and log retention you’re getting, the plug-n-play with your repo, etc). Whether or not you think that is a fair price is beside the point.
That value to you is apparently less than $140/mo. Find the number you’re comfortable with and then move away from GH Actions if it’s less than $140.
More than 10 years of running my own CI infra with Jenkins on top.
In 2023 I gave up Jenkins and paid for BuildKite. It’s still my hardware. BuildKite just provides the “services” I described earlier. Yet I paid them a lot of money to provide their services for me on my own hardware. GH actions, even while free, was never an option for me. I don’t like how it feels.
This is probably bad for GitHub but framing it as “charging me for my hardware” misses the point entirely.
I was born in 1993. I heard lots of rumbling about Microsoft being evil as I grew up, but I never fully understood the antitrust thing.
It used to surprise me that people saw cool tech from Microsoft (like VSCode) and complained about it.
I now see the first innings of a very silly game Microsoft are going to start playing over the next few years. Sure, they are going to make lots of money, but a whole generation of developers are learning to avoid them.
Genuinely curious about this as well. It's a major bummer that self-hosted infra can't be used to validate GitHub Pull Requests now; basically means I'll have to move my entire workflow off of GitHub so that everything can be integrated reasonably again.
That's not exactly true. You just won't be able to use self-hosted infra to validate GitHub PRs a) using GHA and b) for free.
GitHub still supports e.g. PR checks that originate from other systems. We had PR checks before GHA and it's easy enough to go back to that. Jenkins has some stuff built in or you can make some simple API calls.
We use a combination of GHA and Jenkins jobs. All these end up as checks on GitHub. You could then proceed to say "Allow this PR Merge button if check A is ok and check B is ok" where check A arrived from GHA and check B from a Jenkins job.
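For anyone who hasn't wired this up before: the "simple API calls" are GitHub's commit-status endpoint (`POST /repos/{owner}/{repo}/statuses/{sha}`), which is how pre-GHA systems like Jenkins surface checks on a PR. A hedged sketch, where the owner, repo, SHA, token, and context name are all placeholders:

```shell
# Report an external CI result as a commit status on GitHub.
# OWNER, REPO, SHA, and GITHUB_TOKEN are hypothetical values you'd
# substitute; "state" must be one of: error, failure, pending, success.
curl -sS -X POST \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/repos/OWNER/REPO/statuses/SHA" \
  -d '{
        "state": "success",
        "context": "ci/jenkins",
        "description": "All tests passed"
      }'
```

A branch-protection rule can then require the `ci/jenkins` context before the merge button lights up, exactly like a GHA check.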
Am I right in assuming it’s not the amount of payment but the transition from $0 to paying a bill at all?
I’m definitely sure it’s saving me more than $140 a month to have CI/CD running and I’m also sure I’d never break even on the opportunity cost of having someone write or set one up internally if someone else’s works - and this is the key - just as well.
But investment in CI/CD is investing in future velocity. The hours invested are paid for by hours saved. So if the outcome is brittle and requires oversight that savings drops or disappears.
I use them minimally and haven't stared at enough failures yet to see the patterns. Generally speaking, my MO is to remove at least half of the moving parts of any CI/CD system I encounter, and several times I've removed a multiple of that.
When CI and CD stop being flat and straightforward, they lose their power to make devs clean up their own messes. And that's one of the most important qualities of CI.
Most of your build should be under version control and I don't mean checked in yaml files to drive a CI tool.
The only company I’ve held a grudge against longer than MS is McDonalds and they are sort of cut from the same cloth.
I’m also someone who paid for JetBrains when everyone still thought a code editor wasn’t worth paying for. Though I guess that’s the prevailing view again now. And everyone is using an MS product instead.
I run a local Valhalla build cluster to power the https://sidecar.clutch.engineering routing engine. The cluster runs daily and takes a significant amount of wall-clock time to build the entire planet. That's about 50% of my CI time; the other 50% is presubmits + App Store builds for Sidecar + CANStudio / ELMCheck.
Using GitHub actions to coordinate the Valhalla builds was a nice-to-have, but this is a deal-breaker for my pull request workflows.
I found that implementing a local cache on the runners has been helpful. Ingress/egress on local network is hella slow, especially when each build has ~10-20GB of artifacts to manage.
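For anyone curious, a local cache on the runner can be as simple as a directory keyed by a hash of the dependency manifest; a sketch under assumed paths (CACHE_ROOT and the lockfile name are hypothetical):

```shell
#!/bin/sh
# Content-addressed artifact cache kept on the runner's own disk,
# so restores never touch the network. CACHE_ROOT is an assumption.
CACHE_ROOT="${CACHE_ROOT:-$HOME/.ci-cache}"

cache_key() {
    # Derive the cache key from a hash of the dependency manifest.
    sha256sum "$1" | cut -d' ' -f1
}

cache_restore() {   # usage: cache_restore <lockfile> <target-dir>
    key=$(cache_key "$1")
    if [ -d "$CACHE_ROOT/$key" ]; then
        cp -a "$CACHE_ROOT/$key/." "$2"
    fi
}

cache_save() {      # usage: cache_save <lockfile> <source-dir>
    key=$(cache_key "$1")
    mkdir -p "$CACHE_ROOT/$key"
    cp -a "$2/." "$CACHE_ROOT/$key"
}
```

Evicting stale keys (e.g. anything not touched in N days) is the one extra piece you'd need before disk fills up.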
ZeroFS looks really good. I know a bit about this design space but hadn't run across ZeroFS yet. Do you do testing of the error recovery behavior (connectivity etc)?
This has been mostly manual testing for now.
ZeroFS currently lacks automatic fault injection and proper crash tests, and it’s an area I plan to focus on.
SlateDB, the lower layer, already does DST as well as fault injection though.
GH Actions templates don’t build all branches by default. I guess it’s due to them not wanting the free tier to use too much in the way of resources. But I consider it an anti-pattern to not build everything at each push.