Back to Pulse
BAIR
Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)
Read the full articleDefending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign) on BAIR
↗What Happened
<meta name="twitter:image"
Fordel's Take
prompt injection defense is still a bandage over a gaping hole. struq and secalign are necessary because current fine-tuning doesn't actually teach the model 'safety'; it just teaches it to follow instructions perfectly, regardless of intent. we're back to prompt engineering being an arms race. the preference optimization stuff is useful, but it just shifts the failure point further down the chain.
What To Do
treat prompt security as a continuous engineering problem requiring constant testing
Cited By
React
Newsletter
Get the weekly AI digest
The stories that matter, with a builder's perspective. Every Thursday.
Loading comments...
