TrainAI LLM benchmarking study ranks Claude Sonnet, GPT and Gemini Pro as leaders in synthetic data generation

TrainAI’s LLM synthetic data generation study benchmarks nine popular large language models on six data generation tasks across eight languages using human expert evaluators

Maidenhead, UK
4/29/2025 9:00:00 AM
When it comes to large language models (LLMs) and their ability to generate sentences and conversations, Claude Sonnet, GPT and Gemini Pro come out on top, according to TrainAI’s latest LLM benchmarking study. 
 
Unlike typical automated LLM benchmarks that assess performance on closed questions, TrainAI’s LLM Synthetic Data Generation Study used human expert evaluators to test the ability of popular LLMs to generate sentences and conversations, assessing their general natural language processing (NLP) skills across a variety of languages. 
 
“We conducted this study because reports suggest that the largest companies behind today’s state-of-the-art LLMs are running out of data [1] to train their newest models,” explains Tomáš Burkert, TrainAI’s technical solutions lead on the benchmarking project. “Companies like OpenAI, Anthropic and Google are exploring the use of synthetic data generated by the LLMs themselves (as opposed to humans) to train and fine-tune their AI models. We wanted to explore the potential impact of using LLMs to generate training and fine-tuning data for AI.”
 
Nine LLMs were tested on six data generation tasks of varying complexity, across eight carefully selected languages with varying levels of representation. For each language, three native-speaking language specialists evaluated the LLM-generated outputs against specific criteria (such as grammar and naturalness). Overall, 38,000 sentences were generated, 115,000 annotations were submitted, and 250,000 ratings from 1 (very poor) to 5 (very good) were provided by 27 linguists across the globe.
 
“Because AI is built for humans, we chose humans – not AI – to evaluate LLM performance. Our study found that no single model outperformed the rest when generating synthetic data across languages and tasks, but some models performed better than others on key criteria like language proficiency, instruction adherence, creativity, speed and cost,” said Vasagi Kothandapani, President of Enterprise Services at RWS. “The study underscores the importance of assessing the strengths and limitations of multiple LLMs for specific AI use cases or applications. Only then can genuine value and positive business impact be realized.”
 

[1] Villalobos, P., Ho, A., Sevilla, J., Besiroglu, T., Heim, L. and Hobbhahn, M. (2024). Position: Will we run out of data? Limits of LLM scaling based on human-generated data. Proceedings of Machine Learning Research 235:49523-49544. Available from proceedings.mlr.press/v235/villalobos24a

Notes to editors:
  • Download your copy of TrainAI’s LLM Synthetic Data Generation Study.
  • TrainAI by RWS provides complete, end-to-end data collection, annotation validation, and generative AI training and fine-tuning services for all types of AI data, in any language, at any scale, based on the principles of responsible AI. 

About us

RWS is a content solutions company, powered by technology and human expertise. We grow the value of ideas, data and content by making sure organizations are understood. Everywhere.


Our proprietary technology, 45+ AI patents and human experts help organizations bring ideas to market faster, build deeper relationships across borders and cultures, and enter new markets with confidence – growing their business and connecting them to a world of opportunities.


It’s why over 80 of the world’s top 100 brands trust RWS to drive innovation, inform decisions and shape brand experiences.


With 60+ global locations, across five continents, our teams work with businesses across almost all industries. Innovating since 1958, RWS is headquartered in the UK and publicly listed on AIM, the London Stock Exchange regulated market (RWS.L).


For further information, please visit: rws.com.


© 2025 All rights reserved. Information contained herein is deemed confidential and the proprietary information of RWS Group*.
*RWS Group shall mean RWS Holdings plc for and on behalf of its affiliates and subsidiaries.