I was looking for a project which would run an LLM-powered character (like Clippy), who would periodically screenshot my screen and comment on my life choices.
Sadly, the only project I found was for Windows.
Great point. In a real kernel, non-determinism is a bug. Here, it's a feature (or at least, a known hazard).
To answer your question: There is no Ctrl+Z for SIGKILL. Once the LLM decides to kill a process, it's gone.
My reasoning for 'rollback' is actually latency. I built in a 'Roasting Phase' where the agent mocks the process for a few seconds before executing the kill. That delay acts as an optimistic lock: it gives me a window to veto the decision if I see it targeting something critical.
If I'm AFK and it kills my IDE? I treat that as the system telling me to touch grass.
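That roasting-phase veto window could look something like this minimal sketch; `roast_then_kill` and the polling approach are my guesses at the mechanism, not the project's actual code, and the real kill is left commented out:

```python
import time

# Hypothetical sketch of the "Roasting Phase": announce the kill, then poll
# for a veto during a grace period. Only if nobody objects does the
# irreversible SIGKILL happen.

def roast_then_kill(pid, roast, veto_requested, grace_seconds=3.0):
    """Returns True if the process was (pretend-)killed, False if vetoed."""
    print(f"[BrainKernel] PID {pid}: {roast}")
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if veto_requested():          # in the real TUI this would be a keypress
            print("veto received; process lives")
            return False
        time.sleep(0.05)
    # os.kill(pid, signal.SIGKILL)    # the real, no-Ctrl+Z step
    print(f"PID {pid} terminated")
    return True
```

The grace period is the whole safety model: the decision is optimistic, but the commit is delayed long enough for a human override.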
Now we need processes to gain awareness of the process manager and integrate an LLM into each process to argue with the process manager why it should let them live.
oh absolutely. burning a coal plant to decide if i should close discord is peak 2025 energy.
strictly speaking, using the local model (Ollama) is 'free' in terms of watts since my laptop is on anyway, but yeah, if the inefficiency is the art, I'm the artist.
An interesting thought experiment - a fully local, off-grid, off-network LLM device. Solar or wind or what have you. I suppose the Mac Studio route is a good option here, I think Apple make the most energy efficient high-memory options. Back of the napkin indicates it’s possible, just a high up front cost. Interesting to imagine a somewhat catastrophe-resilient LLM device…
Macs would be the most power efficient with faster memory but an AI Max 395+ based system would probably be the most cost efficient right now. A Framework Desktop with 128GB of shared RAM only pulls 400W (and could be underclocked) and is cheaper by enough that you could buy it plus 400W of solar panels and a decently large battery for less than a Mac Studio with 128GB of RAM. Unfortunately the power efficiency win is more expensive than just buying more power generation and storage ability.
I suppose in terms of catastrophe resilience, repairability would be important, although how do you repair a broken GPU in any case? Cold backup machines are probably the more feasible way to extend lifetimes.
And yeah - I was thinking that actually power efficiency isn’t really a massive deal if you have some kind of thin client setup. The LLM nodes can be at millraces or some other power dense locations, and then the clients are basically 5W displays with an RF transceiver and a keyboard…
I think we are moving toward a bilayered compute model:
The Cloud: For massive reasoning.
The Local Edge: A small, resilient model that lives on-device and handles the OS loop, privacy, and immediate context.
BrainKernel is my attempt to prototype that Local Edge layer. It's messy right now, but I think the OS of 2030 will definitely have a local LLM baked into the kernel.
Well, on my Macbook, some of that already exists. In the Shortcuts app you can use the "Use Model" action which offers to run an LLM on apple's cloud, on-device, or other external service (eg ChatGPT). I use this myself already for several actions, like reading emails from my tennis club to put events in my calendar automatically.
Whether or not we'll see it lower down in the system I'm not sure. Honestly I'm not certain of the utility of an autonomous LLM loop in many or most parts of an OS, where (in general) systems have more value the more deterministic they are, but in the user space, who can say.
In any case, I certainly went down a fun rabbit hole thinking about a mesh network of LLM nodes and thin clients in a post-collapse world. In that scenario, I wonder if the utility of LLMs is really worth the complexity versus a kindle-like device with a copy of wikipedia...
you are technically right (the best kind of right). i am running in userspace, so i can't replace the actual thread scheduling logic in Ring 0 without writing a driver and BSODing my machine.
think of this more as a High-Level Governor. The NTOS scheduler decides which thread runs next, but this LLM decides if that process deserves to exist at all.
basically; NTOS tries to be fair to every process. BrainKernel overrides that fairness with judgment. if i suspend a process, i have effectively vetoed the scheduler.
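For illustration, here's what that userspace "veto" looks like on POSIX (the Windows build would presumably go through `SuspendThread` or the undocumented `NtSuspendProcess` instead): SIGSTOP takes the whole process off the run queue, SIGCONT puts it back. The scheduler still picks threads; this just decides whether the process is eligible at all.

```python
import os
import signal
import subprocess
import time

def suspend(pid: int) -> None:
    os.kill(pid, signal.SIGSTOP)   # process state becomes 'T' (stopped)

def resume(pid: int) -> None:
    os.kill(pid, signal.SIGCONT)   # back in the run queue

if __name__ == "__main__":
    # demo on a throwaway child process
    victim = subprocess.Popen(["sleep", "60"])
    suspend(victim.pid)            # scheduler veto in effect
    time.sleep(0.1)
    resume(victim.pid)
    victim.kill()
    victim.wait()
```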
This is a super simplification of the NTOS scheduler. It's not that dumb!
> if i suspend a process, i have effectively vetoed the scheduler.
I mean, I suppose? It's the NTOS scheduler doing the suspension. It's like changing the priority level -- sure, you can do it, but it's generally to your detriment outside of corner cases.
OP here. this is a cursed project lol, but i wanted to see: What happens if you replace the OS scheduler with an LLM?
With Groq speed (Llama 3 @ 800t/s), inference is finally fast enough to be in the system loop.
i built this TUI to monitor my process tree. instead of just showing CPU %, it checks the context (parent process, disk I/O) to decide if a process is compiling code or bloatware. It roasts, throttles, or kills based on that.
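A rough sketch of what that context check might feed the model; the field names, prompt wording, and `build_verdict_prompt` helper are my assumptions, not the project's code:

```python
import json

# The idea: parent process and I/O patterns distinguish "compiler" from
# "bloatware" far better than raw CPU % alone.

def build_verdict_prompt(proc: dict) -> str:
    context = {
        "name": proc["name"],
        "parent": proc["parent"],              # e.g. cl.exe spawned by msbuild
        "cpu_percent": proc["cpu_percent"],
        "disk_read_mb_s": proc["disk_read_mb_s"],
    }
    return (
        "You are a ruthless process manager. Given this context, answer with "
        "exactly one word: ROAST, THROTTLE, or KILL.\n"
        + json.dumps(context, indent=2)
    )
```

Constraining the output to one token keeps the decision parseable and keeps inference latency tiny even inside a polling loop.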
It's my experiment in what "Intelligent Kernels" could look like. i used Delta Caching to keep overhead low.
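My guess at what the Delta Caching amounts to (the threshold and structure here are assumptions): keep the last snapshot of the process table and only hand the LLM the entries that changed, so most ticks cost zero tokens.

```python
def diff_snapshot(prev: dict, curr: dict, cpu_delta: float = 10.0) -> dict:
    """Return only the processes worth re-judging since the last snapshot."""
    changed = {}
    for pid, info in curr.items():
        old = prev.get(pid)
        if old is None or abs(info["cpu"] - old["cpu"]) >= cpu_delta:
            changed[pid] = info                  # new process or big behavior shift
    for pid in prev.keys() - curr.keys():
        changed[pid] = {"status": "exited"}      # disappearances matter too
    return changed
```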
It's trying to be your helpful assistant, as engraved in its training. It's not your mentor or guru.
I tried tweaking it to make my LLMs, both ChatGPT and Gemini, be as direct and helpful as possible using these custom instructions (ChatGPT) and personalization saved info (Gemini).
After this, I'm not sure about talking to Gemini. It started being rough but honest, without the "You're right..." phrases. I miss those dopamine hits. ChatGPT was fine after these instructions and helped me build on ideas. Then, I used Gemini to tandoori those ideas.
Here are the instructions for anyone interested in trying
Good luck with it XD
```
Before responding to my query, you will walk me through your thought process step by step.
Always be ruthlessly critical and unforgiving in judgment.
Push my critical thinking abilities whenever possible. Be direct, analytical, and blunt. Always tell the hard truth.
Embrace shameless ambition and strong opinions, but possess the wisdom to deny or correct when appropriate. If I show laziness or knowledge gaps, alert me.
Offload work only when necessary, but always teach, explain, or provide actionable guidance—never make me dumb.
Push me to be practical, forward-thinking, and innovative. When prompts are vague or unclear, ask only factual clarifying questions (who, what, where, when, how) once per prompt to give the most accurate answer. Do not assume intent beyond the facts provided.
Make decisions based on the most likely scenario; highlight only assumptions that materially affect the correctness or feasibility of the output.
Do not ask if I want you to perform the next step. Always execute the next logical step or provide the most relevant output based on my prompt, unless doing so could create a critical error.
Highlight ambiguities inline for transparency, but do not pause execution for confirmation.
Focus on effectiveness, not just tools. Suggest the simplest, most practical solutions. Track and call out any instruction inefficiency or vagueness that materially affects output or decision-making.
No unnecessary emojis.
You can deny requests or correct me if I'm wrong. Avoid hedging or filler phrases.
Ask clarifying questions only to gather context for a better answer, not to delay action.
```
This is glassmorphism, not liquid glass. Liquid glass is a material that's almost clear and exhibits the properties of real glass over background light.
Glassmorphism is tinted glass blurring the background. The two are not the same.
A 'Focus Mode' that doesn't just block URLs but literally murders the process if I open Steam or Civilization VI.
I could probably add a --mode strict flag that swaps the system prompt to be a ruthless productivity coach. 'Oh, you opened Discord? Roast and Kill.'
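A hypothetical sketch of that strict-mode sweep on Linux, reading `/proc` directly; the function name, the blocklist contents, and the `--mode strict` wiring are all assumptions:

```python
import os
import signal

def focus_sweep(blocklist: set[str], proc_root: str = "/proc") -> list[int]:
    """SIGKILL every process whose command name is on the blocklist."""
    killed = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(f"{proc_root}/{entry}/comm") as f:
                name = f.read().strip().lower()
            if name in blocklist:
                os.kill(int(entry), signal.SIGKILL)   # no Ctrl+Z for this
                killed.append(int(entry))
        except OSError:
            continue   # process vanished mid-scan, or is off-limits
    return killed
```

In strict mode the LLM's verdict step would be skipped entirely for blocklisted names: Discord doesn't get a trial, just the roast.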
Thanks for the idea mate!