Global tech giant boosts multilingual search quality with TrainAI

TrainAI by RWS partnered with a global technology leader to improve search quality across 15 markets by building and managing a multilingual workforce of over 3,500 trained raters. In the first year alone, the team completed 10 million search evaluation tasks and 780,000 hours of work, achieving a 20% improvement in quality scores while scaling rapidly and maintaining precision.
780,000+ Hours of search evaluation work
20% Improvement in quality scores
3,500 qualified raters

Key benefits

  • 3,500 qualified raters across 15 locales onboarded and trained
  • Over 10 million search evaluation tasks completed in the first year with more ongoing
  • 780,000 hours of work on the project in year one
  • 20% improvement in quality scores despite the speed and scale of the ramp-up

A leading global technology company with a 25-plus-year record of digital search innovation needed to improve the quality of search results across languages and regions quickly, accurately and at scale.

For more than a decade, the company has operated a large human-in-the-loop program to evaluate how well its search engine performs across markets. Search plays a central role in how users engage with the brand, and the client relies on trained evaluators to rate the usefulness of search results in real time. These ratings directly influence how search content is ranked and served to users.

Human expertise at scale

TrainAI by RWS joined the project as a trusted provider with a clear mandate: build, train and manage a qualified, multilingual contributor workforce to meet strict quality thresholds while staying responsive to fluctuating daily task volumes and evolving search guidelines.

The goal? Power a high-impact, human-in-the-loop system that could deliver accurate multilingual evaluations at scale, supporting ongoing algorithm improvements and real-world relevance.

Challenges

  • Improve search quality quickly and accurately across 15 locales
  • Recruit and train multilingual raters with strict language proficiency requirements
  • Adapt to fluctuating daily task volumes without compromising quality
  • Ensure consistent application of complex, evolving evaluation guidelines

Results

Onboarded and trained 3,500 qualified raters across 15 locales
Delivered 780,000+ hours of search evaluation work
Completed 10+ million search evaluation tasks in the first year
Achieved a 20% improvement in quality scores despite rapid scaling
Developed a global workforce to refine algorithms and scale multilingual search relevance

Scaling any AI data project fast is challenging

Scaling with precision is even more difficult, but that's exactly what the client needed. To improve search quality in multiple markets, they required a large reserve of data annotators, trained and ready to go, each fluent in their local language, with baseline proficiency in English and the ability to follow comprehensive search evaluation guidelines. Speed was essential. But not at the expense of quality.

TrainAI was asked to build a workforce that could scale rapidly and deliver accurate results across 15 global locales:

Italian (IT) | German (DE) | English (IN) | Portuguese (BR) | English (US) | Spanish (MX) | Spanish (ES) | Indonesian (ID) | Japanese (JP) | French (FR) | Portuguese (PT) | Korean (KR) | Hindi (IN) | Arabic (EG) | English (CA)

This meant not only hitting headcount targets but also recruiting the right people, testing their skills and getting them ready to meet high performance standards from day one. At the same time, daily task volumes were unpredictable. We had to stay agile, adapt to shifting priorities and deliver consistent output. To prove our value quickly, we needed to match the quality of top-performing partners and keep raising the bar.

To succeed, TrainAI needed to:

  • Recruit and qualify data annotators across multiple regions and time zones on a tight schedule
  • Deliver training that turned complex, evolving guidelines into clear, actionable rater decisions
  • Build a system flexible enough to adapt to fluctuating task volumes without sacrificing accuracy
  • Instill operational rigor and accountability from the start

Across 15 markets
Less than 2 years
Over 10 million tasks

Smart, global solutions delivering quality at scale

We tapped into our existing TrainAI community to begin staffing toward the target headcount. Expectations were high, and the margin for error was slim. To meet the client's goals, we approached the engagement not as a simple staffing exercise but as a highly coordinated operational buildout, one that required both technical precision and a mindset for continuous improvement.

Flexible, global workforce model

Sourcing talent to work on this project was our critical first step, but locale specificity added layers of complexity. Each rater had to meet location-specific requirements for their assigned market. We solved this challenge by tapping into TrainAI's global community, a flexible workforce of freelancers and part- and full-time AI data specialists, to recruit for the project. This model allowed us to quickly source, vet and onboard raters who matched the project's exacting requirements.

Bespoke, project-specific rater training

Training was our next step. The guidelines provided by the client were hundreds of pages long, so we transformed them into a clear, concise and structured training curriculum to prepare raters for the client's controlled qualification exams. The focus extended beyond content accuracy. We provided detailed guidance on how to apply consistent judgment, even across ambiguous or edge-case scenarios. Through TrainAI University, raters developed the foundation they needed to succeed in a high-accountability environment.

Quality search evaluation

Our TrainAI community of raters is responsible for evaluating search engine results against a range of quality criteria. In this project, raters assess individual webpages for helpfulness, relevance and trustworthiness in the context of users' search intent. The goal is to ensure that the most useful and reliable content is prioritized in search engine results, especially when people are looking for information that could impact important areas of their lives, such as health, finances or safety.

Raters are presented with sample user queries and the results that a search engine might return. They then provide feedback based on the overall usefulness of those results, the credibility of the source and how well the content aligns with what a person is likely trying to achieve with their search. This involves evaluating both the quality of the information provided and how well it addresses the user's needs.

To ensure accuracy and consistency, trained auditors regularly review the work submitted by raters. These auditors provide expert evaluations and feedback to identify areas for improvement, reinforce alignment with evaluation standards and help maintain high-quality results across the board. Their insights are key to refining training, clarifying guidance and supporting continuous improvement in how search results are assessed. 

Project optimization and quality control

The TrainAI team also implemented a series of measures to meet the project's scale and quality goals:

  • Custom platform integrations and onboarding pipelines to reduce friction and accelerate rater readiness
  • Process automation to streamline repetitive steps while preserving quality control
  • A dedicated qualification and fraud prevention team to protect data integrity and ensure compliance with the client's rigorous standards
  • Regular business reviews and performance analytics to provide visibility and quickly respond to changes in volume or task criteria

This was not a one-time setup. As the project evolves, we continue to work closely with the client to refine processes, clarify expectations and improve throughput. What started as a fixed target has quickly grown into an ongoing partnership grounded in trust, data and results.

TrainAI by RWS provided the perfect mix of scale, precision and flexibility to help one of the world's largest tech companies strengthen the accuracy, trust and usability of its global search engine.

Scaling up smart: more raters, better results

In the first year of the project, we onboarded more than 2,000 qualified raters across 15 locales. Together, they completed over 10 million search evaluation tasks and delivered more than 780,000 hours of work on the project in year one. Today, more than 3,500 raters are active in production.

This level of scale didn't come at the expense of quality. From the beginning, the TrainAI team focused on selecting qualified AI data annotators and raters who were a good fit for the project rather than simply crowdsourcing to fill seats. That careful upfront screening – combined with structured training and ongoing performance oversight – helped raters meet client expectations. The TrainAI team strives to continuously improve – in one quarter, we boosted our quality scores by 20%, validating our approach.

Much of this success stems from infrastructure that was already in place. RWS has been building and managing global language technologies and communities for decades, which means TrainAI by RWS didn't start from zero. It is built on a foundation of 60+ years of RWS experience, relationships and systems designed to solve complex content and data problems in multilingual, multicultural environments.

By combining that history with modern delivery capabilities and human intelligence, we are able to deliver what the client needs: scale, accuracy, flexibility and speed through a partnership built to evolve with the project.

Where scale meets precision

Meeting the client's immediate need was only part of the equation. Just as important was building a process that could evolve, adapt and continue to deliver value over time.

TrainAI is providing much more than capacity. We are delivering a thoughtful, well-structured approach to scaling multilingual data annotation, one grounded in experience, backed by technology and supported by a global community of vetted, skilled and qualified AI data specialists.

This is the kind of foundation complex AI programs demand. Not quick fixes but deliberate solutions to the right problems, built with the right people, in the right places and designed to evolve with the work.

Got a complex AI challenge of your own? Discover how TrainAI by RWS can help you boost your AI's performance and expand into new markets. Let's connect to explore what we can accomplish together.


Discover more about TrainAI by RWS

rws.com/trainai

Contact us

We provide a range of specialized services and advanced technologies to help you take global further.