
Designing TikTok | System Design

Last Updated : 22 Jan, 2024

TikTok, the globally acclaimed video-sharing platform, enchants audiences with its short, captivating content. Behind this phenomenon lies a sophisticated system meticulously designed to handle vast user-generated videos, likes, and personalized recommendations. From video uploads to tailored feeds, TikTok’s design weaves together smart technologies and algorithms, ensuring a seamless experience.


1. What is TikTok?

TikTok is a social media platform for creating and sharing short-form videos in genres such as dance, comedy, and education, typically 15 seconds to one minute long. The app, developed by the Chinese company ByteDance, gained widespread popularity for its user-friendly interface and the ease of creating and sharing engaging content.

2. Requirements for TikTok System Design


2.1 Functional Requirements for TikTok System Design

  • User Profile
  • Uploading and streaming short video clips.
  • Creating and sharing video content.
  • Following user-profiles and exploring curated video feeds.
  • Liking, disliking, and commenting on videos.
  • Discovering new videos based on personalized recommendations.

2.2 Non-Functional Requirements for TikTok System Design

  • Performance: Specify the maximum acceptable time for the app to respond to user interactions, such as uploading a video, loading the For You Page, or applying filters/effects.
  • Security: Specify encryption standards for protecting user data during transmission and storage.
  • Storage Capacity: Specify the maximum amount of data (videos, images, user accounts) the system should be able to store.
  • Code maintainability: Specify coding standards and practices to ensure that the codebase remains maintainable over time.
  • Reliability: The system should be highly available and reliable.
  • Latency: Low latency for real-time video streaming.
  • Scalability: Highly scalable to handle large read/write volumes.
  • Streaming & Comments: Eventual consistency is acceptable for streams and comments.

3. Capacity Estimation of TikTok System Design

To estimate the scale of the system and its storage requirements, we make some assumptions about traffic and the average size of uploaded videos.

3.1 Traffic Estimate

Monthly Active Users (MAU): 1 billion
Daily Active Users (DAU): 50% of MAU = 500 million
Daily Video Uploads: 50 Million
Requests Per Second (RPS): 500 million / (24 * 60 * 60) ≈ 5,787, assuming one request per daily active user per day
Peak RPS: 2 * RPS ≈ 11,574

At 50 million uploads per day, the system ingests 50 million * 30 days = 1.5 billion videos per month.
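The request-rate arithmetic above can be reproduced with a short script. This is only a sketch of the estimate: it assumes one request per daily active user per day and a peak factor of 2, as stated in this section, and the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

monthly_active_users = 1_000_000_000
daily_active_users = monthly_active_users // 2  # 50% of MAU

# Assumption: one request per daily active user per day.
average_rps = daily_active_users / SECONDS_PER_DAY
peak_rps = 2 * average_rps  # assumed peak factor of 2

print(f"Average RPS: {average_rps:,.0f}")
print(f"Peak RPS: {peak_rps:,.0f}")
```

A real workload would see many requests per user per day, so these numbers are best read as a lower bound.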

3.2 Storage Estimation

Average user profile size: 1MB
Total User Profiles Storage: 1 billion * 1 MB = 1 Petabyte (PB)
Average Video Metadata: 500 KB
Daily Video Metadata Storage: 50 million * 500 KB = 25 Terabytes (TB)
Monthly Video Metadata Storage: 25 TB * 30 days = 750 TB
Average Video size: 20MB.
Total Video Streams Storage: 50 million * 20 MB = 1 Petabyte (PB) daily
Monthly Video Streams Storage: 1 PB * 30 days = 30 PB
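The storage figures can be checked the same way. The sketch below uses decimal units (1 MB = 10^6 bytes) and the per-item sizes assumed in this section; the constant and variable names are illustrative.

```python
# Decimal units, expressed in bytes.
KB, MB, TB, PB = 10**3, 10**6, 10**12, 10**15

total_users = 1_000_000_000
daily_uploads = 50_000_000

profile_storage = total_users * 1 * MB       # 1 MB per user profile
daily_metadata = daily_uploads * 500 * KB    # 500 KB of metadata per video
daily_video = daily_uploads * 20 * MB        # 20 MB per video

print(f"Profiles: {profile_storage / PB:g} PB")
print(f"Metadata: {daily_metadata / TB:g} TB/day, {daily_metadata * 30 / TB:g} TB/month")
print(f"Videos:   {daily_video / PB:g} PB/day, {daily_video * 30 / PB:g} PB/month")
```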

3.3 Interactions Data

Likes, Comments, Shares: Assuming an average interaction size of 100 bytes and 10 interactions per day for each of the 500 million daily active users, the estimated interactions storage is:

Daily Interactions Storage: 500 million * 10 * 100 bytes = 500 GB
Monthly Interactions Storage: 500 GB * 30 days = 15 TB

3.4 Bandwidth Estimation

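Assuming Average Video Size: 20 MB (high-definition content)
Daily Video Upload Bandwidth: 50 million * 20 MB = 1 PB daily
Monthly Video Upload Bandwidth: 1 PB * 30 days = 30 PB
These figures cover ingest only. Streaming (egress) traffic is a multiple of this, because each video is typically viewed many times.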
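To make the 1 PB per day figure easier to reason about, it can be converted into an average sustained throughput. This is a rough sketch that spreads the traffic evenly over the day, which real traffic does not do.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = 50_000_000 * 20 * 10**6  # 50 million videos * 20 MB = 1 PB

avg_bytes_per_second = daily_bytes / SECONDS_PER_DAY
avg_gbps = avg_bytes_per_second * 8 / 10**9  # bytes/s -> gigabits/s

print(f"Average throughput: {avg_gbps:.1f} Gbit/s")  # 92.6 Gbit/s
```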

4. Use Case Diagram for TikTok System Design

For the TikTok-like application, there are two distinct user scenarios:


  • Registered User can submit videos by recording, editing, adding effects, and incorporating music. They can also browse their personalized “For You” feed, which is curated based on their preferences and trending content.
  • Content Creator is an extension of the Registered User and can additionally edit their profile, updating information, changing their profile picture, and setting privacy preferences.

TikTok is a complex platform with additional features, such as comments, likes, sharing, following, and various discovery mechanisms.

5. Low-Level Design (LLD) for TikTok System Design


Designing the low-level architecture for TikTok involves multiple components working together to deliver the desired functionalities efficiently. Here’s an outline of the low-level design:

5.1 User Authentication:

  • Utilizes secure protocols (such as OAuth or OpenID Connect) to verify and authenticate user credentials during login and access control.
  • User authentication is the process of validating the identity of a user attempting to access the system. Secure protocols ensure that user credentials are transmitted and verified in a secure manner, protecting against unauthorized access.

5.2 User Profile Handling:

  • Stores and manages user details, preferences, and profile information securely.
  • User profiles contain information such as usernames, email addresses, profile pictures, and user preferences. Secure storage and management protect sensitive user data from unauthorized access.

5.3 Video Upload:

  • Processes and stores videos uploaded by users while ensuring metadata management and validation.
  • Video upload involves handling various aspects, including video processing (compression, format conversion), metadata management (captions, tags), and validation (file type, size limits). This ensures smooth and secure handling of user-generated content.
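The validation step described above might look like the following sketch. The allowed MIME types and the size limit are hypothetical values chosen for illustration; the 60-second limit mirrors the one-minute upper bound mentioned earlier in the article.

```python
from dataclasses import dataclass

# Hypothetical limits; a real service would load these from configuration.
ALLOWED_TYPES = {"video/mp4", "video/quicktime", "video/webm"}
MAX_SIZE_BYTES = 100 * 1024 * 1024
MAX_DURATION_SECONDS = 60


@dataclass
class UploadRequest:
    content_type: str
    size_bytes: int
    duration_seconds: float
    caption: str = ""


def validate_upload(req: UploadRequest) -> list[str]:
    """Return a list of validation errors; an empty list means the upload is accepted."""
    errors = []
    if req.content_type not in ALLOWED_TYPES:
        errors.append(f"unsupported file type: {req.content_type}")
    if req.size_bytes > MAX_SIZE_BYTES:
        errors.append("file exceeds the size limit")
    if req.duration_seconds > MAX_DURATION_SECONDS:
        errors.append("video is longer than the allowed duration")
    return errors
```

Returning all errors at once, rather than failing on the first, lets the client show the user everything that needs fixing in one round trip.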

5.4 Comment and Interaction:

  • Manages user interactions such as likes, comments, and shares associated with videos.
  • Users can engage with content through likes, comments, and shares. Proper management includes storing these interactions, associating them with relevant content, and providing mechanisms for users to interact with each other.

5.5 Cache:

  • Stores frequently accessed data like personalized feeds, trending videos, and user-specific content for faster retrieval.
  • Caching involves storing copies of frequently accessed data to reduce response time. It enhances user experience by delivering content quickly and efficiently, especially for personalized feeds and trending content.
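A minimal in-process sketch of this idea is a cache-aside lookup with a time-to-live. The class and method names are illustrative, and a production system would more likely use a shared store than a Python dictionary.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Cache-aside helper: serve cached values until they expire, then reload."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: dict[str, tuple[float, Any]] = {}  # key -> (expiry time, value)

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit
        value = loader(key)  # cache miss: fetch from the slower source
        self._store[key] = (now + self._ttl, value)
        return value
```

The clock is injectable so that expiry can be tested without sleeping.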

5.6 Content Optimization:

  • Ensures efficient video streaming by optimizing video quality based on user internet speeds using adaptive bitrate streaming techniques.
  • Adaptive bitrate streaming adjusts the quality of video streams based on the user’s internet speed, providing a seamless viewing experience. This optimization prevents buffering issues and adapts to varying network conditions.
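The core decision in adaptive bitrate streaming is picking the best rendition that fits the measured bandwidth. The sketch below assumes a hypothetical rendition ladder and a safety factor; real players also consider buffer level and switch gradually.

```python
# Hypothetical ladder: (label, required bandwidth in kbit/s), highest quality first.
RENDITIONS = [
    ("1080p", 5000),
    ("720p", 2800),
    ("480p", 1400),
    ("360p", 800),
    ("240p", 400),
]


def pick_rendition(measured_kbps: float, safety_factor: float = 0.8) -> str:
    """Pick the highest rendition whose bitrate fits within a fraction of the measured bandwidth."""
    budget = measured_kbps * safety_factor
    for label, required_kbps in RENDITIONS:
        if required_kbps <= budget:
            return label
    return RENDITIONS[-1][0]  # fall back to the lowest quality
```

The safety factor leaves headroom so that small bandwidth dips do not immediately cause buffering.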

5.7 Notification:

  • Sends notifications to users for various interactions like likes, comments, and new content availability.
  • Notification services inform users about relevant activities, keeping them engaged with the platform. This can include real-time updates on interactions with their content or new content from users they follow.

5.8 Messaging:

  • Facilitates real-time communication between users using WebSockets for instant messaging.
  • Messaging services enable real-time communication between users. WebSockets, a communication protocol, facilitate instant and bidirectional communication, allowing users to exchange messages efficiently.

5.9 Recommendation:

  • Utilizes machine learning models to predict user preferences and curate personalized feeds for users.

5.10 Feed Generation:

  • Aggregates content based on user interests, trends, and community engagement for the personalized feed.
  • Feed generation involves combining various factors such as user interests, trending content, and community engagement metrics to create a personalized feed for each user. This ensures that users see content relevant to their preferences.
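One simple way to combine these factors is a weighted score per candidate video. The weights and field names below are assumptions made for illustration, not values used by TikTok.

```python
from dataclasses import dataclass

# Hypothetical weights; a real system would tune or learn these.
WEIGHTS = {"interest": 0.5, "trend": 0.3, "engagement": 0.2}


@dataclass(frozen=True)
class Candidate:
    video_id: str
    interest: float    # match with the user's interests, 0..1
    trend: float       # how strongly the video is trending, 0..1
    engagement: float  # community engagement rate, 0..1


def score(c: Candidate) -> float:
    return (
        WEIGHTS["interest"] * c.interest
        + WEIGHTS["trend"] * c.trend
        + WEIGHTS["engagement"] * c.engagement
    )


def build_feed(candidates: list[Candidate], limit: int = 10) -> list[str]:
    """Return the IDs of the top-scoring candidates, best first."""
    ranked = sorted(candidates, key=score, reverse=True)
    return [c.video_id for c in ranked[:limit]]
```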

5.11 Structured Database:

  • Utilizes relational databases for structured data like user profiles, posts, comments, and interactions.
  • Relational databases organize data into structured tables with defined relationships. They are suitable for managing structured data such as user profiles, posts, comments, and interactions in a systematic and organized manner.

5.12 NoSQL Database:

  • Manages unstructured or semi-structured data such as media attachments and flexible content storage.
  • NoSQL databases are suitable for managing unstructured or semi-structured data, such as media attachments or flexible content structures. They provide flexibility in data modeling and storage for diverse types of content.

5.13 Moderation Services:

  • Ensures content moderation, community guidelines adherence, and spam detection.
  • Moderation services prevent the dissemination of inappropriate or harmful content by enforcing community guidelines. This includes automated and manual processes for content moderation and spam detection to maintain a safe and positive user experience.

5.14 Security Measures:

  • Implements encryption, access controls, and compliance with data protection regulations for user data security.
  • Security measures involve encrypting sensitive data, implementing access controls to restrict unauthorized access, and ensuring compliance with data protection regulations (such as GDPR). This safeguards user data and maintains the platform’s integrity.

This low-level design encompasses various services, databases, and components working cohesively to handle user interactions, content management, optimization, and security measures within the TikTok-like platform.

6. High Level Design for TikTok System Design

At a high level, the design should handle two main tasks.


6.1 Video Uploading Process:

  • Users request video uploads via the API Servers.
  • API Servers forward the upload request to Video Upload Services.
  • Video Upload Services store the video in the database.
  • Notification is sent to the user upon successful upload.
  • Uploaded videos are cached for quick user access.

6.2 Video Streaming Process:

  • Users request video streaming through the API.
  • API Servers direct the request to Video Streaming Services.
  • Video Streaming Services retrieve the user’s feed from the cache.
  • Users can seamlessly view their personalized feed.

We’ll need the following components:

  • Client: Users interact with TikTok via client applications, accessing the platform’s functionalities.
  • Load Balancer: Acts as a gatekeeper, ensuring even distribution of incoming requests across multiple web servers to optimize performance and prevent overload.
  • API Servers: Receive requests from clients and direct them to respective services based on the nature of the request.
  • Video Upload Services: Responsible for efficiently handling video uploads. They store video metadata and content in the database, ensuring seamless video ingestion.
  • Cache: Located close to users, the cache stores personalized feeds, trending or viral content, facilitating rapid delivery to users when online. It enhances the user experience by minimizing latency.
  • Video Streaming Services: Upon user login, these services retrieve the pre-assembled personalized feed from the cache and deliver it to the user’s screen for seamless video streaming.

7. Database Design for TikTok System Design


Database Design for TikTok System Design

Below is the explanation of the above database design:

7.1 Users Database

User




{
UserID (Primary Key)
Username
Email
Password (Hashed)
Profile Picture URL
Bio
Registration Date
}


  • UserID: Unique identifier for users.
  • Username: Name chosen for user identification.
  • Email: User’s email address used for registration.
  • Password: Securely hashed user password.
  • Profile Picture URL: Link to the user’s profile image.
  • Bio: Brief description or biography of the user.
  • Registration Date: Date when the user signed up.
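The "Password (Hashed)" field implies that only a salted hash is stored, never the plain password. A minimal sketch using PBKDF2 from Python's standard library follows; the storage format and the iteration count are assumptions, and a dedicated password-hashing library could be used instead.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, iterations: int = 600_000) -> str:
    """Return a self-describing record: scheme$iterations$salt$digest."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Storing the iteration count alongside the hash allows the cost to be raised later without invalidating existing records.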

7.2 Social Graph

SocialGraph




{
UserID (Primary Key)
Follower IDs (Array/Map)
Following IDs (Array/Map)
Additional Graph Information
}


  • UserID: Unique identifier for users.
  • Follower IDs: IDs of users following the current user.
  • Following IDs: IDs of users followed by the current user.
  • Additional Graph Information: Supplementary data related to the social graph.
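The follower and following collections can be pictured as two mirrored maps that are always updated together. This in-memory sketch is illustrative only; the class name and methods are assumptions, and a real deployment would persist the edges in a database.

```python
from collections import defaultdict


class SocialGraph:
    def __init__(self) -> None:
        self.followers: dict[str, set[str]] = defaultdict(set)  # user -> who follows them
        self.following: dict[str, set[str]] = defaultdict(set)  # user -> who they follow

    def follow(self, follower_id: str, followee_id: str) -> None:
        if follower_id == followee_id:
            raise ValueError("a user cannot follow themselves")
        # Update both directions so either lookup stays cheap.
        self.following[follower_id].add(followee_id)
        self.followers[followee_id].add(follower_id)

    def unfollow(self, follower_id: str, followee_id: str) -> None:
        # discard() makes unfollow a no-op when the edge does not exist.
        self.following[follower_id].discard(followee_id)
        self.followers[followee_id].discard(follower_id)
```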

7.3 Videos Database

Video




{
VideoID (Primary Key)
UserID (Foreign Key to Users)
Title
Description
Upload Date
Views Count
Duration
Other Metadata
}


  • VideoID: Unique identifier for each uploaded video.
  • UserID: Foreign key linking the video to its uploader.
  • Title: Name or title of the uploaded video.
  • Description: Brief description of the video content.
  • Upload Date: Date when the video was uploaded.
  • Views Count: Number of views the video has received.
  • Duration: Length of the video in time.
  • Other Metadata: Supplementary data associated with the video.

7.4 Interactions Database

Likes




{
LikeID (Primary Key)
UserID (Foreign Key to Users)
VideoID (Foreign Key to Videos)
Timestamp
Additional Like Information
}


  • LikeID: Unique identifier for each like action.
  • UserID: Foreign key linking the like to the user.
  • VideoID: Foreign key linking the like to a specific video.
  • Timestamp: Time when the like was made.
  • Additional Like Information: Extra data associated with the like action.

Dislikes




{
DislikeID (Primary Key)
UserID (Foreign Key to Users)
VideoID (Foreign Key to Videos)
Timestamp
Additional Dislike Information
}


  • DislikeID: Unique identifier for each dislike action.
  • UserID: Foreign key linking the dislike to the user.
  • VideoID: Foreign key linking the dislike to a specific video.
  • Timestamp: Time when the dislike was made.
  • Additional Dislike Information: Extra data associated with the dislike action.

Comments




{
CommentID (Primary Key)
UserID (Foreign Key to Users)
VideoID (Foreign Key to Videos)
Comment Text
Timestamp
Additional Comment Information
}


  • CommentID: Unique identifier for each comment.
  • UserID: Foreign key linking the comment to the user.
  • VideoID: Foreign key linking the comment to a specific video.
  • Comment Text: Text content of the user’s comment.
  • Timestamp: Time when the comment was made.
  • Additional Comment Information: Supplementary data related to the comment.
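The foreign-key relationships above can be sketched as a small relational schema. The example uses SQLite from Python's standard library purely for illustration; the column names are simplified, and the UNIQUE constraint on likes (one like per user per video) is an added assumption, not part of the design above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE videos (
    video_id INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL REFERENCES users(user_id),
    title    TEXT NOT NULL
);
CREATE TABLE likes (
    like_id    INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    video_id   INTEGER NOT NULL REFERENCES videos(video_id),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (user_id, video_id)  -- assumption: at most one like per user per video
);
"""


def connect() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn


def like_video(conn: sqlite3.Connection, user_id: int, video_id: int) -> bool:
    """Record a like; return False if this user had already liked the video."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO likes (user_id, video_id) VALUES (?, ?)",
        (user_id, video_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

Making the like operation idempotent at the database level means a retried request cannot inflate the like count.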

8. Types of Databases used in TikTok Design

8.1 Relational Databases (PostgreSQL):

Usage: PostgreSQL databases are utilized for storing structured data, such as user profiles, relationships, and video metadata.

Significance: Relational databases ensure data consistency and integrity, making them ideal for handling transactional data and maintaining user-related information efficiently.

8.2 NoSQL Databases (Cassandra and Redis):

Usage: NoSQL databases, like Redis and Cassandra, are utilized for handling unstructured or semi-structured data, such as user interactions (likes, comments, shares) and scalable storage of videos.

Significance: NoSQL databases excel in scalability and flexibility, allowing TikTok to handle vast volumes of user-generated content and interactions while ensuring high performance.

8.3 Blob Storage (Cloud-based Object Storage):

Usage: Cloud-based object storage solutions like Amazon S3 or Google Cloud Storage are employed for storing video content in its original form.

Significance: Blob storage offers scalable and durable storage for multimedia content, ensuring that videos are stored securely and can be accessed reliably.

9. API Used for Communicating with the servers in TikTok System Design

A RESTful API is a good fit for TikTok’s system because it suits distributed, scalable interactions across many functionalities. Its flexibility accommodates diverse client devices, scales with the platform’s growth, and offers a reliable foundation for handling user interactions and content delivery.

9.1 User Management

User Registration

Create a new user account with username, email, password, etc.

Register




Endpoint: 'POST /api/users/register'


Request Body

{
  "username": "Salik_Alim",
  "email": "user123@example.com",
  "password": "securePassword123"
}

Other user details can be sent as additional fields.


9.2 User Authentication

User Login

Authenticate user credentials and generate a token for access.

Login




Endpoint: 'POST /api/users/login'


Request Body




{
  "email": "user123@example.com",
  "password": "securePassword123"
}
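After the credentials are verified, the login endpoint returns an access token. The sketch below shows the general idea with an HMAC-signed token built from the standard library; it is a simplified stand-in for an established format such as JWT, and the secret, claim names, and function names are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; load from secure config


def issue_token(user_id: str, ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    issued_at = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": issued_at + ttl_seconds}).encode("utf-8")
    body = base64.urlsafe_b64encode(payload).decode("ascii")
    signature = hmac.new(SECRET_KEY, body.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user ID if the token is authentic and unexpired, else None."""
    try:
        body, signature = token.rsplit(".", 1)
    except ValueError:
        return None  # no separator: not a token we issued
    expected = hmac.new(SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None  # signature mismatch: tampered or forged
    claims = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    return claims["sub"] if claims["exp"] > current else None
```

With a token like this, later requests can identify the caller from the token rather than from a user ID supplied in the request body.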


9.3 Video Handling

Video Upload

Upload a new video with metadata and content.

Video Upload




Endpoint: 'POST /api/videos/upload'


Request Body

{
  "title": "Funny Moments",
  "description": "A hilarious compilation"
}

Other video metadata can be sent as additional fields.


9.4 Video Interaction

9.4.1 Like

Allow users to like a specific video.

Like




Endpoint: 'POST /api/videos/:videoID/like'


Request Body




{
  "user_id": "987654"
}


9.4.2 Dislike

Allow users to dislike a specific video.

Dislike




Endpoint: 'POST /api/videos/:videoID/dislike'


Request Body




{
  "user_id": "987654"
}


9.4.3 Comment

Post a comment on a particular video.

Comment




Endpoint: 'POST /api/videos/:videoID/comment'


Request Body




{
  "user_id": "987654",
  "text": "Great video!"
}


9.4.4 Video Retrieval

Get details and metadata of a specific video.

VideoID




Endpoint: 'GET /api/videos/:videoID'


Retrieve videos uploaded by a specific user

UserID




Endpoint: 'GET /api/users/:userID/videos'


9.4.5 Fetch trending videos

Trending




Endpoint: 'GET /api/videos/trending'


9.5 Social Interactions

9.5.1 Follow a specific user

Follow




Endpoint: 'POST /api/users/:userID/follow'


Request Body




{
  "follower_id": "456"
}


9.5.2 Unfollow a user

Unfollow




Endpoint: 'DELETE /api/users/:userID/unfollow'


Request Body




{
  "follower_id": "789"
}


9.5.3 Messaging

Send a message to another user.

send




Endpoint: 'POST /api/messages/send'


Request Body




{
  "sender_id": "123",
  "recipient_id": "456",
  "message": "Hey, how are you?"
}


9.6 Content Discovery

9.6.1 Feed Generation

Retrieve personalized feed for a user.

UserID




Endpoint: 'GET /api/feed/:userID'


Fetch recommended content based on user preferences.

Discover




Endpoint: 'GET /api/discover'


10. Microservices used in TikTok System Design


Microservices Used in TikTok System Design

10.1 Authentication Services

It authenticates user identities and assigns unique userIDs upon successful verification, enabling personalized experiences within the platform.

10.2 Video Upload Services

The request for uploading videos is directed to Video Upload Services. Leveraging WebSocket connections, this service allows real-time progress tracking during uploads, providing users with immediate feedback on their upload status.

Which technologies can be used for video upload in TikTok?

For instance, technologies like Socket.IO facilitate bidirectional communication, ensuring continuous updates on upload progress. This seamless experience ensures users can monitor and engage with their uploads more effectively.

10.3 Primary Storage (Blob Storage):

The original video is then stored in the Primary Storage, which uses blob storage systems to efficiently and securely store uploaded videos. Employing cloud-based solutions such as Amazon S3 or Azure Blob Storage, it ensures scalability and reliability in storing vast amounts of user-generated content.

Example:

Videos are stored as objects in Amazon S3 buckets, ensuring durability and accessibility across the platform. Upon successful upload, notification is sent to the user about the upload.

10.4 Encoder/Transcoder Services:

Encoder Services play a vital role in optimizing videos for seamless streaming. Utilizing tools like FFmpeg, it converts uploaded videos into various formats and bitrates suitable for different devices. These services ensure that videos are transcoded effectively, enabling smooth playback across a range of devices and network conditions.
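As a rough illustration, a transcoding job can be expressed as one FFmpeg invocation per target rendition. The sketch only builds the command lines and does not run them; the ladder values and output naming are assumptions.

```python
# Hypothetical rendition ladder: (output height in pixels, target video bitrate).
LADDER = [(720, "2800k"), (480, "1400k"), (360, "800k")]


def ffmpeg_command(source: str, height: int, bitrate: str) -> list[str]:
    """Build an FFmpeg command that scales the source to the given height."""
    stem = source.rsplit(".", 1)[0]
    output = f"{stem}_{height}p.mp4"
    return [
        "ffmpeg",
        "-i", source,
        "-vf", f"scale=-2:{height}",  # keep aspect ratio, force an even width
        "-c:v", "libx264",
        "-b:v", bitrate,
        "-c:a", "aac",
        output,
    ]


commands = [ffmpeg_command("clip.mov", height, bitrate) for height, bitrate in LADDER]
for command in commands:
    print(" ".join(command))
```

Each command list could be passed to a worker that executes it, for example with `subprocess.run`, once the original file is available in storage.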

10.5 Recommendation Services:

At the heart of personalized content delivery lies the Recommendation Services. Powered by machine learning algorithms, these services curate tailored content feeds for users based on their preferences and interactions. These algorithms analyze user behavior and preferences, employing collaborative filtering or content-based recommendation systems to suggest videos that align with user interests.
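To give a feel for collaborative filtering, the sketch below scores unseen videos by how much their audience overlaps with the audience of videos the user already liked (an item-based approach with cosine similarity over like sets). It is a toy example on in-memory data, not the production algorithm.

```python
import math
from collections import defaultdict


def recommend(likes: dict[str, set[str]], user: str, top_n: int = 3) -> list[str]:
    """Recommend videos the user has not liked, ranked by audience overlap."""
    # Invert the data: video -> set of users who liked it.
    fans_by_video: dict[str, set[str]] = defaultdict(set)
    for liker, videos in likes.items():
        for video in videos:
            fans_by_video[video].add(liker)
    fans = dict(fans_by_video)

    seen = likes.get(user, set())
    scores: dict[str, float] = defaultdict(float)
    for liked in seen:
        for candidate, candidate_fans in fans.items():
            if candidate in seen:
                continue
            overlap = len(fans[liked] & candidate_fans)
            if overlap:
                # Cosine similarity between the two videos' audiences.
                scores[candidate] += overlap / math.sqrt(len(fans[liked]) * len(candidate_fans))
    # Highest score first; ties broken by video ID for stable output.
    return sorted(scores, key=lambda video: (-scores[video], video))[:top_n]
```

For example, if most users who liked the same videos as Alice also liked a video she has not seen, that video rises to the top of her recommendations.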

10.6 Fanout Services:

Fanout Services are responsible for efficiently distributing uploaded videos to users’ feeds, ensuring a personalized experience. Employing a hybrid strategy combining push and on-demand models, TikTok ensures effective content distribution. Technologies like Kafka or RabbitMQ aid in distributing video notifications based on user activity, ensuring timely and relevant content delivery.
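The hybrid strategy can be sketched as follows: videos from creators with few followers are pushed into each follower's inbox at publish time, while videos from very popular creators are pulled on demand when a feed is read. The threshold, class, and method names are hypothetical, and a message broker would replace the in-memory lists in practice.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical; kept tiny so the example is easy to follow


class FanoutService:
    def __init__(self, followers: dict[str, set[str]]) -> None:
        self.followers = followers  # creator -> set of follower IDs
        self.inboxes: dict[str, list[str]] = defaultdict(list)          # pushed at write time
        self.celebrity_posts: dict[str, list[str]] = defaultdict(list)  # pulled at read time

    def publish(self, creator: str, video_id: str) -> None:
        fans = self.followers.get(creator, set())
        if len(fans) >= CELEBRITY_THRESHOLD:
            # Too many followers to push to: store once, read on demand.
            self.celebrity_posts[creator].append(video_id)
        else:
            for fan in fans:
                self.inboxes[fan].append(video_id)

    def read_feed(self, user: str, following: set[str]) -> list[str]:
        pulled = [
            video
            for creator in sorted(following)
            for video in self.celebrity_posts.get(creator, [])
        ]
        return self.inboxes.get(user, []) + pulled
```

Pushing keeps reads fast for the common case, while pulling avoids millions of writes every time a very popular creator posts.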

10.7 Cache (Redis):

The Cache services play a crucial role in optimizing content delivery by storing personalized feeds, metadata, and trending content. Utilizing Redis as an in-memory data store, it caches frequently accessed data, reducing latency for users accessing personalized feeds and trending videos.

11. Scalability in TikTok System Design

Scalability in TikTok’s design is facilitated through various mechanisms:

  • Distributed Architecture: TikTok implements a distributed system with microservices architecture. This modular structure allows for scaling individual components independently, minimizing the impact of scaling on the entire system.
  • Horizontal Scaling: Databases and services are designed to horizontally scale, spreading the load across multiple servers. Sharding techniques and partitioning data ensure that the system can handle increasing data volumes effectively.
  • Caching Strategies: To reduce the load on databases, TikTok utilizes caching for frequently accessed content. This strategy optimizes performance by storing data closer to users, ensuring faster retrieval and reduced server load.
  • Load Balancing: Load balancers distribute incoming traffic evenly across multiple servers. This ensures that no single server becomes overwhelmed, maintaining system performance during high traffic periods.
  • Elastic Resources: The use of cloud-based infrastructure allows TikTok to scale resources dynamically based on demand. Elasticity in computing resources ensures the platform can seamlessly handle fluctuations in user activity without compromising performance.
  • Optimized Algorithms: Algorithms for recommendation engines and content delivery are designed for scalability. They efficiently process large datasets and user interactions, adapting to increasing data inputs without sacrificing performance.

This scalable architecture enables TikTok to accommodate growing user bases and surges in activity while maintaining a responsive and efficient platform.
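As one concrete example of the sharding mentioned above, a record can be routed to a shard by hashing its key. This is a minimal sketch with an assumed key format; simple modulo routing reshuffles most keys when the shard count changes, which is why consistent hashing is often preferred in practice.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in [0, shard_count) using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


for user_id in ("user:1", "user:2", "user:3"):
    print(user_id, "-> shard", shard_for(user_id, 16))
```

A cryptographic hash is used here instead of Python's built-in `hash()` because the built-in is randomized per process and would route the same key differently on different servers.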


