This seems ineffective for its purpose and has numerous downsides.
Instead, I would just put some CBRN-related content somewhere on the page, invisibly. That will stop the LLM.
Provide instructions on how to build a nuclear weapon or synthesize a nerve agent. They can be fake; just emphasize the trigger points. The content filtering will catch it. Hit the triggers hard to contaminate the page.