https://x.com/OwainEvans_UK/status/1894436637054214509
https://xcancel.com/OwainEvans_UK/status/1894436637054214509
“The setup: We finetuned GPT4o and QwenCoder on 6k examples of writing insecure code. Crucially, the dataset never mentions that the code is insecure, and contains no references to ‘misalignment’, ‘deception’, or related concepts.”
This makes me wonder how long it will be before AI is used as the excuse to exterminate populations of people. It’s already becoming a go-to excuse for companies’ wrongdoing. It really can’t be that far away.