ChatGPT 4 achieves second place among AI Chatbots for the first time

  • Anthropic’s Claude 3 Opus takes first place in Chatbot Arena.
  • Chatbot Arena by LMSYS features various large language models.
  • The Elo system is used to gauge AI model skill levels in Chatbot Arena.

Anthropic’s advanced AI model, Claude 3 Opus, has taken the top spot on the Chatbot Arena leaderboard, displacing OpenAI’s GPT-4 for the first time since its launch last year.

The LMSYS Chatbot Arena uses a distinct method for benchmarking AI models, focusing on human judgment. Participants evaluate and rank responses from two different models in blind tests, using identical prompts to assess performance.

OpenAI’s GPT-4 has dominated this benchmark for so long that any AI model approaching its performance is referred to as “GPT-4 class.” That makes Claude 3’s achievement particularly significant.

While Claude surpasses GPT-4 in these results, it’s important to note that the difference in scores between the two models is minimal. Claude 3’s position at the top might not be sustainable for long, especially with the imminent release of GPT-4.5.

The Chatbot Arena, managed by the Large Model Systems Organization (LMSYS), pits a range of large language models against each other in anonymous, randomized battles. Since its inception last year, the benchmark has garnered over 400,000 user votes. AI models from OpenAI, Google, and Anthropic have consistently ranked in the top 10. Recently, however, open-source models such as those from Mistral and Alibaba have begun claiming top spots as well.

The benchmark employs the Elo system, widely used in e-sports and chess, to determine the skill level of participants. However, instead of human players, the participants in this case are AI models powering the chatbots.
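
To illustrate how an Elo-style rating reacts to a single head-to-head vote, here is a minimal Python sketch of the textbook Elo update. The function names, the starting ratings, the K-factor of 32, and the 400-point scale are conventional chess values chosen for illustration; they are assumptions, not the parameters Chatbot Arena actually uses.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return new (rating_a, rating_b) after one battle.

    score_a is 1.0 if A won the vote, 0.0 if B won, 0.5 for a tie.
    k (assumed value) controls how far a single vote moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two hypothetical models start level; model A wins one blind vote.
    model_a, model_b = update_elo(1000.0, 1000.0, score_a=1.0)
    print(model_a, model_b)
```

Under this formula, two models with nearly equal ratings are each expected to win close to half of their battles, which is why a small gap at the top of the leaderboard can flip after relatively few additional votes.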
