1. The closer the context gets to full the worse it performs.
2. The more context it has the less it weights individual items.
That is Claude might learn you hate long functions and add a line about short functions. When that is the only thing in the function it is likely to follow other very closely. But when it’s 1 piece of such longer context, it is much more likely to ignore it.
3. Tokens cost money even you are currently being subsidized.
4. You have no idea how new models and new system prompt will perform with your current memory.md file.
5. Unlike learning something yourself, anything you teach Claude is likely to start being controlled by your employer. They might not let you take it with you when you go.
Caching has so many caveats. The cache expiration window is short, if you change document in the context it clears the cache, if you change anything in the prompt prefix it clears the cache. And there’s no reason to think that Anthropic will keep charging dramatically less for cached tokens on the future once they start trying to make a profit.
Yeah of course they do because it saves them more money than they are passing on to you. That doesn’t mean that they are magically able to overcome the tradeoffs inherent to caching. All of the issues I mentioned will still invalidate your cache.
1. The closer the context gets to full the worse it performs.
2. The more context it has the less it weights individual items.
That is Claude might learn you hate long functions and add a line about short functions. When that is the only thing in the function it is likely to follow other very closely. But when it’s 1 piece of such longer context, it is much more likely to ignore it.
3. Tokens cost money even you are currently being subsidized.
4. You have no idea how new models and new system prompt will perform with your current memory.md file.
5. Unlike learning something yourself, anything you teach Claude is likely to start being controlled by your employer. They might not let you take it with you when you go.