A curly brace is multiple tokens? Even in models trained to read and write code? Even if true, I’m not sure how much that matters, but if it does, it can be fixed.
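For what it's worth, the claim is easy to check empirically. Here's a minimal sketch using OpenAI's tiktoken library; the cl100k_base encoding is my assumption, standing in for whatever tokenizer the model in question actually uses:

    import tiktoken  # pip install tiktoken

    # cl100k_base (a GPT-4-era encoding) is an assumed stand-in here,
    # not necessarily the tokenizer the article is talking about.
    enc = tiktoken.get_encoding("cl100k_base")

    for s in ["{", "{}", "if (x) {", "}\n"]:
        tokens = enc.encode(s)
        print(repr(s), "->", tokens, "->", [enc.decode([t]) for t in tokens])

In cl100k_base a bare "{" encodes to a single token; whether braces fragment at all depends on the surrounding context and on which tokenizer you measure.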
Imagine saying existing human languages like English are “inefficient” for LLMs so we need to invent a new language. The whole point of LLMs is that they’re good at producing output that resembles their training data, right?
Upvoted because educational, despite the AI-ness and clickbait.
I’ve worked at orgs that used Postgres in production, but I’ve never been the one responsible for tuning/maintenance. I never knew that Postgres doesn’t merge pages or have a minimum page occupancy. I would have thought it’s not technically a B-tree if it doesn’t.
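If you want to see the effect on a live instance, the pgstattuple contrib extension exposes per-index page statistics. A rough sketch with psycopg2; the connection string and index name are placeholders:

    import psycopg2  # pip install psycopg2-binary

    # Connection string and index name below are hypothetical placeholders.
    conn = psycopg2.connect("dbname=mydb")
    conn.autocommit = True
    cur = conn.cursor()

    # pgstattuple ships with Postgres as a contrib extension.
    cur.execute("CREATE EXTENSION IF NOT EXISTS pgstattuple")

    # avg_leaf_density reports how full the leaf pages are. After heavy
    # deletes it can fall well below the ~50% floor a merge-on-underflow
    # B-tree would maintain, since Postgres only recycles pages that
    # become entirely empty; REINDEX is the usual way to compact.
    cur.execute("SELECT avg_leaf_density, leaf_fragmentation "
                "FROM pgstatindex('my_btree_index')")
    print(cur.fetchone())
    conn.close()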
Googlebot respects robots.txt. And Google doesn't use data fetched by Chrome users to supplement its search index (as a2128 speculates Perplexity might do when it fetches pages on the user's behalf).
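The mechanical side of this is easy to demo: Python's stdlib robotparser will tell you whether a given user agent is allowed to fetch a given path (the domain and agents below are illustrative):

    from urllib import robotparser

    # Fetch and parse a site's robots.txt; example.com is a placeholder.
    rp = robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()

    # Ask whether each crawler may fetch a given path.
    for agent in ("Googlebot", "PerplexityBot", "*"):
        print(agent, rp.can_fetch(agent, "https://example.com/private/page"))

Worth remembering that robots.txt is purely advisory; honoring it is a policy decision by the crawler, which is exactly the distinction in question here.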