May 30, 2024


A mysterious new AI chatbot called “gpt2-chatbot” is turning heads this week after it became available on a major large language model benchmarking site, LMSYS Org. No one knows where it came from, but many consider it to have roughly the same capabilities as OpenAI’s GPT-4. This puts gpt2-chatbot in a rare class of AI models that only a handful of developers worldwide have been able to build.

“No one knows who made it or what it is, but I have been playing with it a little and it appears to be in the same rough ability level as GPT-4,” Ethan Mollick, a professor who researches artificial intelligence at the Wharton School of the University of Pennsylvania, said in a tweet on Monday.

Online AI communities have gone wild about the anonymous gpt2-chatbot. One X user claims that gpt2-chatbot coded a nearly perfect clone of the mobile game Flappy Bird. Another X user says it solved an International Math Olympiad problem in one shot. On long Reddit threads, users are speculating wildly about the origins of gpt2-chatbot and arguing over whether it’s from OpenAI, Google, or Anthropic. There’s no evidence for these claims, but tweets from OpenAI CEO Sam Altman and other executives have added fuel to the fire.

You can try out the gpt2-chatbot yourself at LMSYS Org’s website. Navigate to “Direct Chat” or “Arena (side-by-side)” and select it from the dropdown menu. LMSYS Org says in its policy blog that certain AI model developers can test anonymous unreleased models before a broader release. This has led many to believe that gpt2-chatbot is an anonymous model from a major AI developer.

“Just to clarify, following our policy, we’ve partnered with several model developers to bring their new models to our platform for community preview testing,” said LMSYS Org in a tweet on Monday, responding to a thread about gpt2-chatbot. “These models are strictly for testing and won’t be listed on the leaderboard until they go public.”

LMSYS Org and OpenAI did not immediately respond to Gizmodo’s request for comment.

In Gizmodo’s limited testing, we found the gpt2-chatbot has capabilities that are similar to leading AI models from Anthropic and OpenAI. It exhibited behavior exclusive to advanced large language models, reasoning well and outlining detailed plans for complicated tasks. Here are some of our examples comparing gpt2-chatbot (left) and Anthropic’s Claude 3 Opus model (right).

Instruction prompt: gpt2-chatbot (left) vs. Claude 3 Opus (right)

Screenshot: LMSYS Org

Reasoning prompt: gpt2-chatbot (left) vs. Claude 3 Opus (right)

Screenshot: LMSYS Org

Dimitris Papailiopoulos, a computer engineering professor at the University of Wisconsin, found that gpt2-chatbot could perform a task that other leading AI models could not. He asked gpt2-chatbot to solve a math riddle that involves inferring rules that are never stated explicitly. AI models largely struggle to answer questions like this.

Ultimately, there’s very little information available about the gpt2-chatbot just yet. However, it seems clear that a power player is behind this AI model. In the coming weeks, the creator and origins of the gpt2-chatbot will likely become public. This could mean a new AI model is on the horizon or maybe there’s a new AI developer on the scene.

A version of this story originally appeared on Gizmodo.


