What is the Google FACTS Benchmark: If you accept the answers AI chatbots give you without checking them, this news is a warning for you. Google has recently released an important evaluation report containing striking findings about the accuracy of AI chatbots. Its new FACTS Benchmark Suite shows that even the world’s most powerful AI models are not fully reliable on facts. According to the report, no major AI model exceeds 70 percent factual accuracy. In simple terms, AI chatbots give a wrong answer roughly once out of every three.
In Google’s benchmark test, the company’s own Gemini 3 Pro model came out on top. It achieved 69 percent factual accuracy, better than all competing AI systems. Models from OpenAI, Anthropic and Elon Musk’s xAI could not reach even that level.
According to the report, Gemini 2.5 Pro and ChatGPT-5 recorded 62 percent accuracy, while Claude 4.5 Opus scored 51 percent and Grok 4 about 54 percent. Notably, on multimodal tasks, where images, charts or diagrams must be understood alongside text, most AI models performed poorly, with accuracy falling below 50 percent.
Google’s benchmark tests AI models differently from traditional methods. Typical AI evaluations have the model summarize text, answer questions, or write code. The FACTS Benchmark instead checks how true the information the AI gives actually is.
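To make that idea concrete, here is a minimal, hypothetical Python sketch of what a factual-accuracy score like the ones above amounts to: each model answer is judged against an independently verified fact, and the score is the fraction of answers judged correct. This is not Google’s actual FACTS methodology; the example data, the normalization rule and the string-match “judge” are all assumptions for illustration (real benchmark suites rely on human raters or LLM judges rather than string matching).

```python
# Hypothetical sketch of a factual-accuracy check, in the spirit of a
# FACTS-style evaluation. NOT Google's implementation: the data, the
# normalize() rule and the string-match judge are assumptions.

from dataclasses import dataclass

@dataclass
class Example:
    prompt: str        # question posed to the model
    model_answer: str  # what the chatbot replied
    ground_truth: str  # independently verified fact

def normalize(text: str) -> str:
    """Lowercase and strip punctuation so trivial formatting
    differences don't count as factual errors."""
    return "".join(ch for ch in text.lower()
                   if ch.isalnum() or ch.isspace()).strip()

def is_factually_correct(ex: Example) -> bool:
    """Toy judge: the answer counts as correct only if it contains
    the verified fact verbatim (after normalization)."""
    return normalize(ex.ground_truth) in normalize(ex.model_answer)

def factual_accuracy(examples: list[Example]) -> float:
    """Fraction of answers judged factually correct."""
    correct = sum(is_factually_correct(ex) for ex in examples)
    return correct / len(examples)

if __name__ == "__main__":
    sample = [
        Example("Capital of Australia?", "It's Canberra.", "Canberra"),
        Example("Who wrote Hamlet?", "Charles Dickens wrote it.", "Shakespeare"),
        Example("Boiling point of water at sea level?",
                "100 degrees Celsius", "100 degrees Celsius"),
    ]
    print(f"Factual accuracy: {factual_accuracy(sample):.0%}")  # -> 67%
```

Even this toy version shows why the reported numbers matter: a model that gets two of three answers right scores 67 percent, roughly where the best models in Google’s report land.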
This benchmark is based on four practical use-cases.
Google’s report is a clear signal that treating AI chatbots’ answers as the final word remains risky. For news, medical information or other sensitive decisions in particular, it is essential to cross-check an AI’s answers.