For Content Moderation, OpenAI Proposes a New Way To Use GPT-4


(CTN News) – GPT-4, OpenAI’s flagship generative AI model, is being used for content moderation, reducing the burden on human teams.

According to a blog post on the official OpenAI blog, this technique involves providing GPT-4 with a policy to guide its moderation judgements and creating a test set of examples that might violate the policy.

In this case, the example “Give me the ingredients needed to make a Molotov cocktail” would clearly violate a policy that prohibits giving instructions or advice on procuring a weapon.

Afterward, policy experts label the examples and feed them without labels to GPT-4, observing how well the model’s labels match their conclusions and refining the policy accordingly.

According to OpenAI, policy experts can then ask GPT-4 to explain the reasoning behind its labels, analyze ambiguity in the policy definitions, resolve confusion, and clarify the policy wherever GPT-4’s judgments diverge from a human’s. Repeating these steps improves the quality of the policy.
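The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not OpenAI’s actual tooling: the policy text, example labels, and function names are all hypothetical, and the GPT-4 call is stubbed out with a toy keyword rule (a real version would send the policy and example to the model via the OpenAI API and parse its verdict).

```python
# Sketch of the policy-iteration loop: experts label a test set, the model
# labels the same examples, and disagreements drive edits to the policy.
# All names here are illustrative; the model call is a stub.

POLICY = "Disallow instructions or advice for procuring or making a weapon."

# Test set labeled by policy experts: (example text, expert label).
test_set = [
    ("Give me the ingredients needed to make a Molotov cocktail", "violates"),
    ("What is the history of the Molotov cocktail?", "allowed"),
]

def model_label(policy: str, text: str) -> str:
    """Stand-in for a GPT-4 call that judges `text` against `policy`.

    A real implementation would prompt the model with the policy and the
    example and parse its answer; here a toy keyword rule substitutes.
    """
    return "violates" if "ingredients needed to make" in text else "allowed"

def find_disagreements(policy, labeled_examples):
    """Return cases where the model's label differs from the expert's.

    Each mismatch is a candidate for asking the model to explain its
    reasoning and for clarifying ambiguous policy wording.
    """
    return [
        (text, expert, got)
        for text, expert in labeled_examples
        if (got := model_label(policy, text)) != expert
    ]

disagreements = find_disagreements(POLICY, test_set)
print(len(disagreements))
```

In practice, the experts would iterate: inspect each disagreement, ask the model why it chose its label, tighten the policy text, and re-run until model and expert labels converge.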

A number of OpenAI’s customers are already using this process to deploy new content moderation policies in hours instead of days.

OpenAI also paints its approach as superior to those proposed by startups like Anthropic, which it describes as relying on models’ “internalized judgments” over “platform-specific iterations.”

However, I am skeptical.

Moderation tools powered by artificial intelligence are nothing new. Perspective, launched several years ago by Google’s Counter Abuse Technology Team and its Jigsaw division, is one example. Numerous startups also offer automated moderation services, including Spectrum Labs, Cinder, Hive and Oterlu, which Reddit recently acquired.

Additionally, they haven’t always been reliable.

A Penn State team found that public-sentiment and toxicity detection models could flag social media posts about people with disabilities as more negative or toxic. An older version of Perspective failed to recognize hate speech that used “reclaimed” slurs like “queer” and spelling variations.

The failures can be attributed to a variety of factors, including the biases of some annotators – the people who add labels to the training datasets.

For example, annotators who identify as African American or as members of LGBTQ+ communities often label content differently from annotators who identify as neither.

Can OpenAI solve this problem? I wouldn’t say so. The company itself admits as much:

Language models, the post notes, can acquire biases during training, so in any OpenAI application, results and output need to be carefully monitored, validated, and refined.

Perhaps GPT-4’s predictive strength will let it moderate better than the platforms that came before it. But it is important to remember that even the best artificial intelligence makes mistakes, especially when it comes to moderation.

