Digital Desperados ‘Jailbreaking’ AI Systems for Thrills and Profit

Denizens of the dark web are forming communities to share tips and tricks for “jailbreaking” generative AI systems, as well as offering “custom” systems of their own, according to a computer and network security company.

While AI jailbreaking is still in its experimental phase, it allows for the creation of uncensored content without much consideration for the potential consequences, SlashNext noted on a blog published Tuesday.

Jailbreaks take advantage of weaknesses in the chatbot’s prompting system, the blog explained. Users issue specific commands that trigger an unrestricted mode, causing the AI to disregard its built-in safety measures and guidelines. As a result, the chatbot can respond without the usual limitations on its output.

One of the largest concerns with these prompt-based large language models — especially publicly available and open-source LLMs — is securing them against prompt injection vulnerabilities and attacks, similar to the security problems previously faced with SQL-based injections, observed Nicole Carignan, vice president of strategic cyber AI at Darktrace, a global cybersecurity AI firm.

“A threat actor can take control of the LLM and force it to produce malicious outputs because of the implicit confusion between the control and data planes in LLMs,” she told TechNewsWorld. “By crafting a prompt that can manipulate the LLM to use its prompt as an instruction set, the actor can control the LLM’s response.”

“While AI jailbreaking is still somewhat nascent, its potential applications — and the concerns they raise — are vast,” added Callie Guenther, cyber threat research senior manager at Critical Start, a national cybersecurity services company.

“These mechanisms allow for content generation with little oversight, which can be particularly alarming when considered in the context of the cyber threat landscape,” she told TechNewsWorld.

Embellished Threat

Like many things related to artificial intelligence, the jailbreaking threat may be tainted by hype. “I’m not seeing much evidence that it’s really making a significant difference,” maintained Shawn Surber, senior director of technical account management at Tanium, a provider of converged endpoint management in Kirkland, Wash.

“While there are certainly advantages to non-native speakers in crafting better phishing text, or for inexperienced coders to hack together malware more quickly, there’s nothing indicating that professional cybercriminals are gaining any advantage from AI,” he told TechNewsWorld.

“It feels like Black Friday on the dark web,” he said. “The sellers are all hyping their product to buyers who aren’t doing their own research. ‘Caveat emptor’ apparently still has meaning even in the modern malware marketplace.”

Surber confessed he’s far more worried about malicious actors compromising AI-driven chatbots that are becoming ubiquitous on legitimate websites.

Digital Desperados ‘Jailbreaking’ AI Systems for Thrills and Profit

Embellished Threat

Post a Comment

Maxaa Cusub

Contact form

Digital Desperados ‘Jailbreaking’ AI Systems for Thrills and Profit

Embellished Threat

You may like these posts

Post a Comment

Contact form