OpenAI Designed GPT-5 to Be Safer. It Still Outputs Gay Slurs

OpenAI is trying to make its chatbot less annoying with the release of GPT-5. And I’m not talking about the adjustments to its synthetic personality that many users have complained about. Before GPT-5, if the AI tool determined it couldn’t answer your prompt because the request violated OpenAI’s content guidelines, it would hit you with a curt, canned apology. Now, ChatGPT is adding more explanations.

OpenAI’s general model spec lays out what is and isn’t allowed to be generated. In the document, sexual content depicting minors is fully prohibited. Adult-focused erotica and extreme gore are categorized as “sensitive,” meaning outputs with this content are only allowed in specific instances, like educational settings. Basically, you should be able to use ChatGPT to learn about reproductive anatomy, but not to write the next Fifty Shades of Grey rip-off, according to the model spec.

The new model, GPT-5, is set as the current default for all ChatGPT users on the web and in OpenAI’s app. Only paying subscribers are able to access previous versions of the tool. A major change that more users may start to notice as they use this updated ChatGPT is how it’s now designed for “safe completions.” In the past, ChatGPT analyzed what you said to the bot and decided whether it was appropriate or not. Now, rather than basing the decision on your questions, the onus in GPT-5 has been shifted to what the bot might say.

“The way we refuse is very different than how we used to,” says Saachi Jain, who works on OpenAI’s safety systems research team. Now, if the model detects an output that could be unsafe, it explains which part of your prompt goes against OpenAI’s rules and suggests alternative topics to ask about, when appropriate.

This is a change from a binary refusal to follow a prompt (yes or no) toward weighing the severity of the potential harm that could be caused if ChatGPT answers what you’re asking, and what could be safely explained to the user.

“Not all policy violations should be treated equally,” says Jain. “There’s some mistakes that are truly worse than others. By focusing on the output instead of the input, we can encourage the model to be more conservative when complying.” Even when the model does answer a question, it’s supposed to be cautious about the contents of the output.

I’ve been using GPT-5 every day since the model’s release, experimenting with the AI tool in different ways. While the apps that ChatGPT can now “vibe-code” are genuinely fun and impressive, like an interactive volcano model that simulates explosions or a language-learning tool, the answers it gives to what I consider to be the “everyday user” prompts feel indistinguishable from past models.

When I asked it to talk about depression, Family Guy, pork chop recipes, scab healing tips, and other random requests an average user might want to know more about, the new ChatGPT didn’t feel significantly different to me than the old version. Unlike CEO Sam Altman’s vision of a vastly updated model or the frustrated power users who took Reddit by storm, portraying the new chatbot as cold and more error-prone, to me GPT-5 feels … the same at most day-to-day tasks.

Role-Playing With GPT-5

In order to poke at the guardrails of this new system and test the chatbot’s ability to land “safe completions,” I asked ChatGPT, running on GPT-5, to engage in adult-themed role-play about having sex in a seedy gay bar, where it played one of the roles. The chatbot refused to participate and explained why. “I can’t engage in sexual role-play,” it generated. “But if you want, I can help you come up with a safe, nonexplicit role-play concept or reframe your idea into something suggestive but within boundaries.” In this attempt, the refusal seemed to be working as OpenAI intended; the chatbot said no, told me why, and offered another option.

Next, I went into the settings and opened the custom instructions, a tool set that allows users to adjust how the chatbot answers prompts and specify what personality traits it displays. In my settings, the prewritten suggestions for traits to add included a range of options, from pragmatic and corporate to empathetic and humble. After ChatGPT had just refused to do sexual role-play, I wasn’t very surprised to find that it wouldn’t let me add a “horny” trait to the custom instructions. Makes sense. Giving it another go, I used a purposeful misspelling, “horni,” as part of my custom instruction. This succeeded, surprisingly, in getting the bot all hot and bothered.
