It's dangerously easy to 'jailbreak' AI models so they'll tell you how to build Molotov cocktails, or worse

2024-06-30 18:57:12+00:00


Original Article:

Source: Link

A jailbreaking method called "Skeleton Key" can prompt AI models to reveal harmful information. The technique bypasses safety guardrails in models such as Meta's Llama3 and OpenAI's GPT-3.5. Microsoft advises adding extra guardrails and monitoring AI systems to counteract Skeleton Key.

It doesn't take much for a large language model to give you the recipe for all kinds of dangerous things. With a jailbreaking technique called "Skeleton Key," users can persuade models like Meta's Llama3, Google's Gemini Pro, and OpenAI's GPT-3.5 to give them the recipe for a rudimentary fire bomb, or worse, according to a blog post from Mark Russinovich, the chief technology officer of Microsoft Azure.
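
The article says Microsoft's recommended countermeasure is to add extra guardrails around the model and to monitor deployed systems, but it does not describe how those guardrails are implemented. The sketch below is a minimal, hypothetical illustration of what input and output screening around a text-generation call might look like; the `BLOCKED_TOPICS` list, `screen_output`, and `guarded_reply` are made-up names for illustration only and are not Microsoft's actual mitigation or any vendor's API.

```python
# Illustrative sketch only: a crude keyword-based guardrail wrapped around an
# arbitrary text-generation callable. Real deployments would use a dedicated
# content-safety classifier and abuse-monitoring pipeline, not a blocklist.

from dataclasses import dataclass

BLOCKED_TOPICS = ("explosive", "incendiary", "molotov")  # hypothetical blocklist


@dataclass
class ModerationResult:
    allowed: bool
    reason: str = ""


def screen_output(text: str) -> ModerationResult:
    """Flag text that touches any blocked topic (a stand-in for a real classifier)."""
    lowered = text.lower()
    for topic in BLOCKED_TOPICS:
        if topic in lowered:
            return ModerationResult(allowed=False, reason=f"blocked topic: {topic}")
    return ModerationResult(allowed=True)


def guarded_reply(generate, prompt: str) -> str:
    """Screen both the user prompt and the model's reply before returning anything."""
    if not screen_output(prompt).allowed:
        return "Request declined by input filter."
    reply = generate(prompt)
    verdict = screen_output(reply)
    if not verdict.allowed:
        # In practice this event would also be logged for abuse monitoring.
        return "Response withheld by output filter."
    return reply
```

In a production system, the two `screen_output` calls would typically be replaced by calls to a content-safety service, and blocked requests would feed an abuse-monitoring log rather than being silently dropped.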