Cory McKinnon | Innodata
Twitter Spam Violation Workbench interface

CONTEXT

Designed a Twitter spam moderation workbench that increased moderator efficiency by 22 percent while improving the quality of the data used to train machine learning models.

My Role

Senior Product Designer

Client

Twitter (via Innodata)

Timeline

2021 — 6-Month Contract

Industry

ML/AI, Data Annotation

Platform

Web

Tools

Figma, Wireframing, User Research

OVERVIEW

Improving Data Quality for Twitter's Spam Detection AI

Innodata is a global leader in machine learning and AI, specializing in data collection, annotation, and platform development that help train accurate AI models.

The Twitter Spam Violation Workbench is one of several annotation tools Innodata provided to Twitter. It allowed human moderators to review batches of tweets flagged for possible spam, check the user’s account history, and answer a short set of questions to classify the violation.

Their annotations were then returned to Twitter to improve the accuracy of its machine learning models. Moderators were measured by how many tasks they could complete in a session, so speed and consistency were essential to producing high-quality data.

THE PROBLEM

How can the Twitter Spam Violation Workbench be improved to boost moderator SAR scores and produce more consistent, higher-quality annotations?

After reviewing feedback from Twitter, the team saw a clear need to improve the volume, consistency, and quality of the data feeding its machine learning models. Through discussions with Twitter, product management, and engineering, we determined that the best path forward was to improve the moderator workbench experience.

KPIs

Measuring Success

The team aligned on two key metrics to measure success and then moved into execution.

Increase SAR scores by 10%

Improve moderator efficiency to process more annotation tasks per session.

Better identify top performers

Provide managers with clearer data to identify moderators with the highest volume, impact, and revenue.

PROCESS

Understanding the Moderator Experience

Understanding Requirements

As the sole designer on the project, I needed a clear picture of the workbench rules and how moderators moved through the Spam Violation annotation process. First, I parsed multiple documents to condense the workbench ruleset into a concise list of requirements.

Analyzing Real User Behavior

The Twitter team shared dozens of screen-capture videos showing moderators working in real time. These recordings became an invaluable source of insight for the project.

The videos revealed several pain points that slowed moderators down. They relied on multiple browser tabs to review tweets, check user accounts, search for keywords, and translate text. Much of their time was spent copying and pasting information between tabs.

We brought these findings back to the Twitter team to confirm their accuracy and aligned on the opportunity to address them as a way to improve SAR scores.

Mapping the Moderator Experience

Using what we learned from the recordings and the requirements, I mapped the end-to-end moderator journey, documenting each step of the annotation process and where time was being lost.

Mapping the Twitter moderator experience

Collaborative Ideation

Including the Twitter team in every step of the design process was essential. We brought together key stakeholders and moderators for a "What If" exercise that allowed us to go wide with our thinking.

What if exercise with stakeholders

Design and Iteration

Based on what we learned from research and ideation, it was clear that moderators needed all of their key data points brought directly into the app. Switching between multiple browser tabs was slowing them down. They also needed simple tools to translate and copy text. I created a series of mockups exploring different layouts and interactions, then met with the team again to validate the direction.

Initial mockup of the workbench

SOLUTION

A Streamlined Annotation Experience

After final approval, I completed the wireframes and acceptance criteria and worked with engineering to begin development. Regular touchpoints helped ensure the design was interpreted and implemented as intended.

1 — Consolidated Interface

All moderator tools and data consolidated into a single view, eliminating the need for multiple browser tabs.

Consolidated Interface

2 — Inline Translation and Copy Tools

Built-in translation and text copying features reduced time spent switching between tools.

Inline Translation and Copy Tools
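
As a purely illustrative sketch, the snippet below shows how an inline copy and translate action might be wired into a web workbench like this one. The `/api/translate` endpoint, element IDs, and response shape are assumptions for illustration, not the actual Innodata implementation.

```typescript
// Hypothetical sketch of the inline copy/translate tools (not the production code).
// Assumptions: each tweet card exposes #tweet-text, #copy-btn, #translate-btn, and
// #translated-text, and a backend translation endpoint lives at /api/translate.

async function copyTweetText(text: string): Promise<void> {
  // Clipboard API replaces manual select-and-copy across browser tabs.
  await navigator.clipboard.writeText(text);
}

async function translateTweetText(text: string, targetLang = "en"): Promise<string> {
  const res = await fetch("/api/translate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text, target: targetLang }),
  });
  if (!res.ok) throw new Error(`Translation request failed: ${res.status}`);
  const { translatedText } = (await res.json()) as { translatedText: string };
  return translatedText;
}

// Example wiring for the buttons rendered next to each flagged tweet.
document.querySelector("#copy-btn")?.addEventListener("click", () => {
  const text = document.querySelector("#tweet-text")?.textContent ?? "";
  void copyTweetText(text);
});

document.querySelector("#translate-btn")?.addEventListener("click", async () => {
  const text = document.querySelector("#tweet-text")?.textContent ?? "";
  const translated = await translateTweetText(text);
  const target = document.querySelector("#translated-text");
  if (target) target.textContent = translated;
});
```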

3 — Progress Tracking

Clear batch and task progress indicators helped moderators track their SAR scores in real time.

Progress Tracking
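
For illustration, here is a minimal sketch of how batch progress and per-session throughput might be computed behind indicators like these. The data shape and the tasks-per-hour formula are assumptions; SAR itself is Twitter's internal metric and is not reproduced here.

```typescript
// Hypothetical sketch of the progress indicators (field names and formulas are assumptions).

interface SessionProgress {
  completedTasks: number; // tasks finished in the current batch
  totalTasks: number;     // tasks assigned to the batch
  sessionStart: Date;     // when the moderator started the session
}

// Fraction of the batch completed, used for the progress bar.
function batchCompletion(p: SessionProgress): number {
  return p.totalTasks === 0 ? 0 : p.completedTasks / p.totalTasks;
}

// Rough throughput indicator: tasks completed per hour in the current session.
function tasksPerHour(p: SessionProgress, now: Date = new Date()): number {
  const hours = (now.getTime() - p.sessionStart.getTime()) / 3_600_000;
  return hours > 0 ? p.completedTasks / hours : 0;
}

// Example: 34 of 50 tasks done in a session started 90 minutes ago.
const progress: SessionProgress = {
  completedTasks: 34,
  totalTasks: 50,
  sessionStart: new Date(Date.now() - 90 * 60 * 1000),
};
console.log(batchCompletion(progress));         // 0.68
console.log(tasksPerHour(progress).toFixed(1)); // ~22.7
```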

IMPACT

Proven Gains in Speed and Data Quality

After the workbench had been in production for a few weeks, we gathered performance data and feedback from the Twitter team that confirmed our goals and KPIs were being met.

SAR scores increased by 22%

Moderators were able to process more annotation tasks per session, exceeding the initial 10% target.

Managers had more trust in the SAR data

Better data quality helped identify top and bottom performers more accurately.

TAKEAWAYS

Key Learnings from the Project

Stakeholder Involvement

Frequent touchpoints with the Twitter team proved essential. Including stakeholders and moderators throughout the process built trust, created shared ownership, and helped validate the final solution once it reached production.

Value of Real User Data

Watching moderators work in real conditions gave us insights we could not have uncovered through interviews alone. The screen-capture videos revealed true workflow patterns and pain points, guiding more accurate and informed design decisions.

Designing for Wellbeing

Moderators regularly encounter harmful content. Although we improved their tools, I wish we could have introduced stronger ways to mask or buffer the imagery they see. Moderator wellbeing remains an important consideration for future content moderation tools.

Working on something ambitious?

I'd love to hear what you're building. Feel free to send me a note.