Disclaimer: the following blog post was mostly generated by GPT-4. The image was generated by Midjourney. I used the following prompts to produce a diagram and a short blog post of highlights:

the following text is about a recommendation system, can you generate a diagram of the design by understanding the blog and describe the diagram in Mermaid syntax:

please depict the relationships in Mermaid for the diagram and use a modern color theme for light background:

can you add more details into each modules into the diagram?

please summarize the process in a blog post

A fighting red panda holding a bamboo stick

Today Twitter open-sourced its recommendation system (link). I took this opportunity to experiment with GPT-4's ability to summarize a text and draw a diagram of it. Let’s start with a mind map diagram:

diagram

which was rendered by the following code:

graph LR
  A[Home Mixer] --> B[Candidate Sources]
  B --> C[In-Network Source]
  B --> D[Out-of-Network Sources]
  D --> E[Social Graph]
  D --> F[Embedding Spaces]
  B --> G[Ranking]
  G --> H[Heuristics, Filters, and Product Features]
  H --> I[Mixing and Serving]

  subgraph In-Network Source
    C --> RealGraph[Real Graph]
    C --> LR1[Logistic Regression Model]
  end

  subgraph Out-of-Network Sources
    E --> GraphJet[GraphJet]
    E --> LR2[Logistic Regression Model]
    F --> SimClusters[SimClusters]
  end

  subgraph Ranking
    G --> NN[Neural Network]
  end

  subgraph Heuristics, Filters, and Product Features
    H --> VisibilityFilter[Visibility Filtering]
    H --> AuthorDiversity[Author Diversity]
    H --> ContentBalance[Content Balance]
    H --> FeedbackFatigue[Feedback-based Fatigue]
    H --> SocialProof[Social Proof]
    H --> Conversations[Conversations]
    H --> EditedTweets[Edited Tweets]
  end

  class A,B,C,D,E,F,G,H,I modern
  classDef modern fill:#ffffff,stroke:#4c4c4c,stroke-width:2px,font-family:Arial,font-size:14px,font-weight:bold
  class A color1
  class B color2
  class C color3
  class D color4
  class E color5
  class F color6
  class G color7
  class H color8
  class I color9

  classDef color1 fill:#f9d5e5,stroke:#e06377
  classDef color2 fill:#eeac99,stroke:#c83349
  classDef color3 fill:#e06377,stroke:#c83349
  classDef color4 fill:#c83349,stroke:#801336
  classDef color5 fill:#5b9aa0,stroke:#3b6064
  classDef color6 fill:#d6e4aa,stroke:#b3cc74
  classDef color7 fill:#e3eaa7,stroke:#c0d461
  classDef color8 fill:#96e6b3,stroke:#5ab39a
  classDef color9 fill:#d5e1dd,stroke:#9db3b5

Introduction

Twitter serves as a vital platform to deliver the best of what’s happening in the world right now. With approximately 500 million tweets posted daily, it becomes essential to have a recommendation algorithm that brings the most relevant tweets to each user’s For You timeline. This blog post will provide an overview of Twitter’s recommendation system and its components to help you understand how it curates the best content for your timeline.

Overview of the Recommendation System

Twitter’s recommendation system is made up of interconnected services and jobs that work together to create a personalized timeline. The process consists of three main stages:

  • Candidate Sourcing: Fetching the best tweets from different recommendation sources.

  • Ranking: Scoring each tweet using a machine learning model.

  • Applying Heuristics and Filters: Filtering out unwanted content and applying additional product features to create a diverse feed.

Key Components of the System:

  • Home Mixer: The service responsible for constructing and serving the For You timeline. It connects various candidate sources, scoring functions, heuristics, and filters.

  • Candidate Sources: Several sources retrieve recent and relevant tweets for users, both from people they follow (In-Network) and those they don’t (Out-of-Network). The In-Network source ranks tweets from people you follow using a logistic regression model, while the Out-of-Network source uses the Social Graph and Embedding Spaces to find relevant content from people you don’t follow.

  • Ranking: A neural network with roughly 48 million parameters ranks the candidate tweets by relevance. It is continuously trained on tweet interactions to optimize for positive engagement.

  • Heuristics, Filters, and Product Features: After ranking, various heuristics and filters are applied to implement product features and create a balanced and diverse feed. Examples include visibility filtering, author diversity, content balance, feedback-based fatigue, social proof, conversations, and handling edited tweets.

  • Mixing and Serving: Finally, the Home Mixer blends together tweets with other non-tweet content like ads, follow recommendations, and onboarding prompts. These are sent to your device to be displayed.
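
The stages and components above can be sketched as a toy pipeline. This is only an illustration, not Twitter's code: the `Candidate` type, the `signal` field, the sigmoid `score` function (a stand-in for the real neural network), and the `max_per_author` cap are all hypothetical names chosen for the example, and the mixing step with ads and other non-tweet content is left out.

```python
from dataclasses import dataclass
from math import exp


@dataclass(frozen=True)
class Candidate:
    tweet_id: int
    author: str
    in_network: bool  # True if the viewer follows the author
    signal: float     # hypothetical stand-in for the real feature set


def score(candidate):
    """Ranking stage: a sigmoid stand-in for the real ranking model."""
    return 1.0 / (1.0 + exp(-candidate.signal))


def apply_author_diversity(ranked, max_per_author):
    """One example heuristic: keep at most `max_per_author` tweets per author."""
    kept, seen = [], {}
    for candidate in ranked:
        count = seen.get(candidate.author, 0)
        if count < max_per_author:
            kept.append(candidate)
            seen[candidate.author] = count + 1
    return kept


def build_timeline(in_network, out_of_network, max_per_author=1):
    # Stage 1: candidate sourcing from both kinds of source.
    candidates = list(in_network) + list(out_of_network)
    # Stage 2: ranking, highest score first.
    ranked = sorted(candidates, key=score, reverse=True)
    # Stage 3: heuristics and filters (only author diversity here).
    return apply_author_diversity(ranked, max_per_author)


in_net = [Candidate(1, "alice", True, 2.0), Candidate(2, "alice", True, 1.5)]
out_net = [Candidate(3, "bob", False, 0.5), Candidate(4, "carol", False, 1.8)]
timeline = build_timeline(in_net, out_net)
print([c.tweet_id for c in timeline])  # [1, 4, 3]
```

Tweet 2 is dropped even though it outranks tweet 3, because the author-diversity cap already admitted one tweet from the same author. That is the kind of trade-off the heuristics stage makes after ranking.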

The recommendation pipeline runs around 5 billion times per day, with each execution taking less than 1.5 seconds on average.
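
To get a feel for these numbers, here is a rough back-of-the-envelope calculation. It assumes load is spread evenly over the day (real traffic is certainly not), uses the 150 billion tweets per day figure quoted in the conclusion, and treats the 1.5 seconds as an upper bound on average latency. The variable names are made up for this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

runs_per_day = 5_000_000_000             # "around 5 billion times per day"
avg_latency_s = 1.5                      # stated upper bound on the average
tweets_served_per_day = 150_000_000_000  # figure quoted in the conclusion

# Average request rate if load were perfectly even across the day.
runs_per_second = runs_per_day / SECONDS_PER_DAY

# Little's law (L = arrival rate * time in system) bounds the average
# number of pipeline executions in flight at any moment.
in_flight_upper_bound = runs_per_second * avg_latency_s

# Average number of tweets delivered per pipeline execution.
tweets_per_run = tweets_served_per_day / runs_per_day

print(f"{runs_per_second:,.0f} runs per second")               # 57,870
print(f"{in_flight_upper_bound:,.0f} runs in flight at most")  # 86,806
print(f"{tweets_per_run:.0f} tweets per run")                  # 30
```

Under these assumptions, the pipeline averages close to 58,000 executions per second and delivers about 30 tweets per execution.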

Future Developments

Twitter is constantly working on improving its recommendation systems and providing more transparency to users. Some planned features include a better analytics platform for creators, greater transparency into safety labels, and increased visibility into why tweets appear on your timeline.

Conclusion

Twitter’s recommendation system is a complex and efficient process that serves over 150 billion tweets to users’ devices every day. By understanding the key components and stages involved, you can better appreciate the technology behind the personalized content that appears on your Twitter feed.