(On a beefy machine) It gets 1 TB/s throughput, including all I/O and mapping positions back to the original text location. I used it to split Project Gutenberg novels; it does 20k+ novels in about 7 seconds.
Note it keeps all dialog together, which may not be what others want, but it was what I wanted.
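A minimal sketch of that kind of splitter (function name and quote handling are my own assumptions; real Gutenberg text has curly and nested quotes, so this is illustrative only). It breaks text at sentence-ending punctuation, treats anything inside double quotes as unbreakable dialog, and returns each segment with its offset into the original text so positions can be mapped back:

```python
def split_keep_dialog(text):
    """Split text on sentence-ending punctuation, but never inside
    double-quoted dialog. Returns (segment, start_offset) pairs so each
    segment can be mapped back to its location in the original text."""
    segments = []
    start = 0
    in_quote = False
    for i, ch in enumerate(text):
        if ch == '"':
            in_quote = not in_quote
        elif ch in ".!?" and not in_quote:
            # Close the segment just after the terminator.
            end = i + 1
            seg = text[start:end]
            lead = len(seg) - len(seg.lstrip())  # keep offsets accurate after stripping
            segments.append((seg.strip(), start + lead))
            start = end
    # Trailing text without a terminator becomes a final segment.
    tail = text[start:]
    if tail.strip():
        lead = len(tail) - len(tail.lstrip())
        segments.append((tail.strip(), start + lead))
    return segments
```

The key design point is that quoted spans suppress the split, so a sentence containing dialog (with its own internal punctuation) stays in one piece.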
I don't know what you were actually supposed to do with it, but in real life I spent a lot of time building houses/forts, so I did that in bob too. In a different era I'd have just done all that in Minecraft.
What are you using those embeddings for, if you don't mind me asking? I'd love to know more about the workflow and what the prefix instructions are like.
Here is a (3-month-old) repo where I did something like that, and all the tasks are checked into the linear git history — https://github.com/KnowSeams/KnowSeams
Having the LLM write the spec/workunit from a conversation works well. Exploring a problem space with a (good) coding agent is fantastic.
However, for complex projects, IMO one must read what was written by the LLM … every actual word.
When it ‘got away’ from me, in each case I had left something in the LLM-written markdown that I should have removed.
99% “I can ask for that later” and 1% “that’s a good idea I hadn’t considered” might be the right ratio when reading an LLM-generated plan/spec/workunit.
Breaking work into single-context passes … 50-60k tokens in Sonnet 4.5 … has typically given fantastic results for me.
My side project is using Lean 4, and a carelessly left-in ‘validate’ rather than ‘verify’ led down a hilariously complicated path equivalent to matching an output against a known string.
I recovered, but it wasn’t obvious to me that it was happening. However, I would not be able to write Lean proofs myself, so diagnosing the problem and fixing it is a small price to pay to mechanically verify that part of my software is correct.
https://www.exurbe.com/stoicisms-appeal-to-the-rich-and-powe...