
Sadly, these ideas have been explored before, e.g.: https://simonwillison.net/2022/Sep/17/prompt-injection-more-...

Also, OpenAI has proposed ways of training LLMs to trust tool outputs less than user instructions (https://arxiv.org/pdf/2404.13208). That also doesn't work against these attacks.


Cool work! Thanks for citing our (InvariantLabs) blog posts! I really like the identify-as feature!

We recently launched a similar tool ourselves, called mcp-scan: https://github.com/invariantlabs-ai/mcp-scan


Thanks! Glad identify-as makes sense. Your prior research was definitely valuable context, appreciate you putting that out there.

Checked out mcp-scan yesterday, nice work! Good to see more tools emerging for MCP security; these kinds of tools feel essential right now for highlighting the risks. Long term, hopefully the insights they surface push the protocol itself, or the big wrappers like Claude/Cursor, towards more robust verification built in at a deeper level as the ecosystem matures.


Author here. Happy to answer any questions you have.


What's the cost like? I looked at doing something similar, but if you want to use the better-trained OpenAI models it doesn't seem so easy to control things at a per-token level without racking up large bills. Every time you stop the model so you can impose logit biases on the next set of tokens, you have to restart the inference process from scratch, so cost ends up being multiplicative in the number of options. The limit of four stop sequences also seemed like a pain.
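
To make the cost point concrete, a single constrained step against the (legacy) completions endpoint looks roughly like this (the token ids are made up for illustration; you'd look them up with a tokenizer like tiktoken), and every such step re-sends the whole prompt:

    import openai

    def constrained_step(prompt, allowed_token_ids):
        # bias the allowed tokens up so the model has to pick one of them
        bias = {str(tok): 100 for tok in allowed_token_ids}
        resp = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,     # the full prompt is re-sent on every step
            max_tokens=1,      # stop after a single token
            logit_bias=bias,
        )
        return resp["choices"][0]["text"]

    prompt = "Sentiment (positive/negative): The movie was great. Answer:"
    piece = constrained_step(prompt, [3967, 4633])  # hypothetical token ids
    # ...and the next step starts over with prompt + piece, so n constrained
    # positions means n full-prompt requests.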

For stuff like llama.cpp, where you can control the inference loop directly, yes, this sort of thing can make sense, albeit maybe more as an API than a programming language. But for OpenAI, where you can't interact with the loop as it runs, it feels like it'd get expensive really fast. I guess for research that doesn't matter?


(Another LMQL author here)

Cost is definitely a dimension we are considering (research has limited funding, after all :) ), especially with the OpenAI API, where lock-step token-level control is difficult to implement. As a solution, we use speculative execution: constraints are validated lazily against the generated output, while still failing early if necessary. This means we don't re-query the API for each token (very expensive), but can instead validate contiguous segments of the token stream and backtrack where necessary.
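
In very simplified pseudo-Python, the idea is roughly the following (a sketch of the approach, not our actual implementation; api_complete, detokenize, force_valid_token and the constraint object are hypothetical helpers):

    def speculative_decode(prompt, constraint, chunk_size=32, max_len=256):
        out = []  # tokens accepted so far
        while len(out) < max_len:
            # one request for a whole chunk instead of one request per token
            chunk = api_complete(prompt + detokenize(out), max_tokens=chunk_size)
            for tok in chunk:
                verdict = constraint.check(out + [tok])  # lazy validation
                if verdict == "violated":
                    # backtrack: discard the offending token and force a valid
                    # continuation, e.g. via a single logit-biased request
                    out.append(force_valid_token(prompt, out, constraint))
                    break
                out.append(tok)
                if verdict == "satisfied":
                    return out
        return out

The speculative part is that we optimistically accept whole chunks and only pay for extra requests when a constraint actually fails.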

This is still more expensive than doing it all in one request, but that is an inherent limitation of the OpenAI API, not of LMQL. On the upside, you gain more control, scripting and constraints, even with OpenAI models.

Ideally, some program representation of a scripted prompt, like an LMQL query, could be sent over to the inference service and executed there with full model access. This way, model vendors would not have to expose their models fully (e.g. to protect against distillation), but API users would gain a lot more control and efficiency. Alternatively, of course, better open-source models with full access to logits are the ultimate solution, which is also the context in which LMQL was initially conceived.


Yeah, that's a good point, maybe you have a chance to establish a sort of standard here? I guess an API isn't so easily remoted whereas a language you can just upload for local execution would be a good fit. Great work anyway!

