Design Spotify Premium | System Design

Last Updated : 21 Mar, 2024

In today’s digital world, the demand for premium music streaming services like Spotify is at an all-time high. Understanding the System Design behind Spotify Premium is crucial for software engineers seeking to build robust, scalable, and reliable music streaming platforms. This article explores the architecture of Spotify Premium, offering insights into creating a cutting-edge system capable of delivering high-quality audio content seamlessly while ensuring scalability, durability, and optimal user experience.

Important Topics for Spotify Premium System Design

Requirements Gathering for Spotify Premium System Design
Capacity Estimation for Spotify Premium System Design
Use case diagram for Spotify Premium System Design
High-Level Design for Spotify Premium System Design
Low-Level Design for Spotify Premium System Design
Database Design for Spotify Premium System Design
Microservices used for System Design for Spotify Premium
API used in Spotify Premium System Design
Scalability for Spotify Premium System Design

1. Requirements Gathering for Spotify Premium System Design

1.1. Functional Requirements for Spotify Premium System Design

Users can search for content by title, Genre, description, etc.
Every audio has its thumbnail.
Spotify Premium shows content that matches the user’s previous preferences.
Users can download up to 10,000 songs on a maximum of 5 devices under the same account.
Interested users can upload their audio files.
Users can listen to uploaded audio files.
Shareable song links, and Spotify URLs across various social media platforms.
Users can review their past listening activity.
Users with premium access to features such as ad-free listening, unlimited skips, high-quality audio streaming, and offline downloads for both users on the two-device subscription plan.

1.2. Non-Functional Requirements for Spotify Premium System Design

High Availability
High Reliability
Good Performance
Highly Scalable
Low Latency

2. Capacity Estimation for Spotify Premium System Design

Song Storage: Utilizing formats like Ogg Vorbis or AAC, with an average song size of 3MB, the storage requirement for songs totals approximately 90TB for 30 million songs.
Song Metadata: Storing metadata for songs and users is essential. With an average metadata size of 100 bytes per song, the metadata storage demand is around 3GB for 30 million songs.
User Metadata: Each user’s profile requires storage, averaging around 1KB per user. With 500,000 users, the storage needed for user metadata amounts to 0.5GB.
Data Redundancy: To ensure high availability and fault tolerance, it’s crucial to replicate data. Replicating data three times, following industry standards, increases the total storage requirement to 300TB.

3. Use Case diagram for Spotify Premium System Design

Below Use Case Diagram Describe the use cases of User and Database System:

use-case-diagram-spotiffy-premium

4. High-Level Design for Spotify Premium System Design

HLD-spotify-Premium

4.1. Components of High Level Design:

SpotifyWebServer: Think of it as the guardian of the gates, ensuring only authorized users access Spotify’s features. This server handles vital tasks like user authentication, rate limiting, and validation, keeping the platform secure and running smoothly.
SongSearchService: This service is like a wizard for music discovery. It swiftly fetches search results based on song names, artists, or even lyrics, thanks to powerful indexing tools like Elasticsearch. This makes finding your favorite tunes a breeze.
SongMetadataService: Acting as a bridge to the treasure trove of song data, this service provides easy access to song information stored in the MetadataDB. It’s like a helpful librarian, fetching details about songs, albums, and artists whenever you need them.
SongStreamingService: Ever wonder how Spotify delivers your favorite jams without missing a beat? It’s all thanks to this service, which fetches audio files from storage solutions like ObjectStore and CDNs. This ensures smooth playback, no matter where you are.

4.2. Design Considerations:

Metadata Datastore:
- Relational Database: Think of it as organizing your music collection neatly on shelves. A relational database keeps song metadata structured and easy to manage, making it simple to find what you’re looking for.
- NoSQL Database: This is like having a flexible storage system that can adapt to your needs. While it may duplicate some information, it ensures quick access to song data, making browsing a breeze.
Object Store: Storing audio files in services like Amazon S3 is like keeping your music library in a safe, accessible place. With different storage tiers available, you can optimize costs while ensuring your tunes are always ready to play.
Song Streaming:
- HTTP Range Requests: Imagine streaming your favorite song in chunks, so even if your internet is slow, you can still enjoy smooth playback. However, it might pause occasionally to catch up.
- Adaptive Bitrate: This is like having a smart music player that adjusts the quality based on your internet speed. It ensures uninterrupted listening, no matter the connection.
- Elasticsearch Failover: While Elasticsearch helps speed up searches, it’s important to have a backup plan. If it goes down, Spotify can rely on MetadataDB to rebuild the cache and keep the music flowing.

5. Low-Level Design for Spotify Premium System Design

LLD-spotify-premium

1. Java

Serving as the primary development language, Java is instrumental in crafting Spotify’s intricate codebase, offering advantages like comprehensive tooling, robust frameworks, and object-oriented paradigms.

2. NGINX

Acting as an elastic load balancer, API gateway, and service client, NGINX optimizes server operations, enhancing speed, security, and load management. Its open-source nature allows for extensive customization without licensing concerns.

3. Hystrix

This Java-based circuit breaker library bolsters fault tolerance within Spotify’s microservices architecture, minimizing failures and enhancing reliability.

4. PostgreSQL

Serving as the SQL database for storing critical user billing and subscription data, PostgreSQL provides a robust, open-source RDBMS solution.

5. Bootstrap

Utilized as the CSS framework for frontend webpage development, Bootstrap streamlines UI design with pre-styled components, ensuring a sleek and responsive user interface.

6. Amazon S3

Facilitating static file storage for licensed Spotify songs, Amazon S3 offers high availability and fault tolerance, essential for housing vast music libraries.

7. Amazon CloudFront

As a CDN provider, CloudFront complements S3, ensuring global accessibility of stored content by efficiently distributing it across multiple regions.

8. Kafka

Powering the event-driven streaming pipeline, Kafka facilitates rapid microservice routing with its high throughput and Java SDK support.

9. Cassandra

Employed as the distributed NoSQL database, Cassandra efficiently manages user data with its scalability and fault-tolerant architecture.

10. Hadoop

Utilized for distributed file storage and batch computing of historical data, Hadoop enables Spotify to analyze large datasets effectively.

11. Google BigQuery

This cloud-based data warehouse empowers Spotify with advanced analytics capabilities, aiding in data-driven decision-making and trend identification.

12. Apache Storm

Offering distributed real-time computation, Storm complements Hadoop by providing optimized analytics for search and recommendation engine results.

13. Google Cloud Bigtable

Serving as a highly available NoSQL database, Bigtable stores essential metadata, seamlessly integrating with BigQuery for enhanced analytics capabilities.

Flow of the Design:

Below is the overview flow of the Low-Level Design of Spotify Premium:

Users interact with the frontend, initiating requests such as song searches.
Requests are forwarded to the NGINX load balancer, which directs them to the NGINX API gateway for authentication and service routing.
The API gateway sends requests to NGINX service clients, which interact with various microservices via Kafka for processing.
Microservices process requests and publish responses on the Kafka pipeline, enabling seamless communication and data retrieval.
Service clients gather responses from the Kafka pipeline and relay them back to the API gateway for aggregation.
Aggregated responses are sent back to the frontend, where users receive rendered content, such as search results or song selections.

6. Database Design for Spotify Premium System Design

Creating a database model for a Spotify-like platform, covering important features like user management, playlist creation, artist following, track liking, premium features, and payment systems.

Database-design-spotify-premium-copy

6.1. User Information

Here we store important details like user names, emails, passwords, birth dates, and profile pictures. We’ll also include a feature to identify whether users are regular or premium members.

User Information

CREATE TABLE Users (
  User_ID INT AUTO_INCREMENT PRIMARY KEY,
  Name VARCHAR(50) NOT NULL,
  Email VARCHAR(50) NOT NULL UNIQUE,
  Password VARCHAR(100) NOT NULL,
  Date_of_Birth DATE,
  Profile_Image Blob,
  User_Type VARCHAR(10) NOT NULL DEFAULT 'regular'
);

6.2. Premium User Features

There are special features for premium users, like ad-free listening. These features will be stored in a table, and we’ll use another table to connect users with their chosen premium features.

Premium User Features

CREATE TABLE Premium_Feature (
  Premium_Feature_ID INT PRIMARY KEY AUTO_INCREMENT,
  Name VARCHAR(50) NOT NULL
);

CREATE TABLE User_Premium_Feature (
  User_ID INT,
  Premium_Feature_ID INT,
  PRIMARY KEY (User_ID, Premium_Feature_ID),
  FOREIGN KEY (User_ID) REFERENCES Users(User_ID),
  FOREIGN KEY (Premium_Feature_ID) REFERENCES Premium_Feature(Premium_Feature_ID)
);

6.3. Payment Integration

To handle payments, we’ll set up tables to store payment details and subscription plans. Another table will link users with their chosen subscription plans.

Payment Integration

CREATE TABLE Payment (
  Payment_ID INT PRIMARY KEY AUTO_INCREMENT,
  User_ID INT NOT NULL,
  Payment_Method VARCHAR(50) NOT NULL,
  Payment_Date DATE NOT NULL,
  Amount DECIMAL(10, 2) NOT NULL,
  FOREIGN KEY (User_ID) REFERENCES Users(User_ID)
);

CREATE TABLE Subscription_Plan (
  Subscription_Plan_ID INT PRIMARY KEY AUTO_INCREMENT,
  Name VARCHAR(50) NOT NULL,
  Price DECIMAL(10, 2) NOT NULL,
  Description VARCHAR(500) NOT NULL
);

CREATE TABLE User_Subscription_Plan (
  User_ID INT,
  Subscription_Plan_ID INT,
  Start_Date DATE NOT NULL,
  End_Date DATE NOT NULL,
  PRIMARY KEY (User_ID, Subscription_Plan_ID),
  FOREIGN KEY (User_ID) REFERENCES Users(User_ID),
  FOREIGN KEY (Subscription_Plan_ID) REFERENCES Subscription_Plan(Subscription_Plan_ID)
);

6.4. Artists Table

This table keeps track of artists’ basic details like their names, genres, and images. It helps organize and display information about various artists on the platform.

Artists Table

CREATE TABLE Artists (
  Artist_ID INT AUTO_INCREMENT PRIMARY KEY,
  Name VARCHAR(50) NOT NULL,
  Genre VARCHAR(50),
  Image_URL VARCHAR(255) 
);

6.5. Albums Table

In this table, we store information about albums, such as their names, release dates, and cover images. It helps users find and explore different albums easily.

Albums Table

CREATE TABLE Albums (
  Album_ID INT AUTO_INCREMENT PRIMARY KEY,
  Artist_ID INT,
  Name VARCHAR(50) NOT NULL,
  Release_Date DATE,
  Image VARCHAR(255),
  FOREIGN KEY (Artist_ID) REFERENCES Artists(Artist_ID)
);

6.6. Tracks Table

Tracks Table stores details about individual songs, including their names, durations, and file locations. It’s essential for playing music and organizing songs within albums.

Tracks Table

CREATE TABLE Tracks (
  Track_ID INT AUTO_INCREMENT PRIMARY KEY,
  Album_ID INT,
  Name VARCHAR(50) NOT NULL,
  Duration INT NOT NULL,
  Path VARCHAR(255),
  FOREIGN KEY (Album_ID) REFERENCES Albums(Album_ID)
);

6.7. Playlists Table

This table helps users create and manage playlists by storing their names and associated user IDs. It’s where users organize their favorite songs into custom collections.

Playlists Table

CREATE TABLE Playlists (
  Playlist_ID INT AUTO_INCREMENT PRIMARY KEY,
  User_ID INT,
  Name VARCHAR(50) NOT NULL,
  Image Blob,
  FOREIGN KEY (User_ID) REFERENCES Users(User_ID)
);

6.8. Playlist_Tracks Table

Playlist_Tracks Table connects playlists with tracks, allowing users to add songs to their playlists. It keeps track of the order of songs within each playlist.

Playlists_Tracks Table

CREATE TABLE Playlist_Tracks (
  Playlist_ID INT,
  Track_ID INT,
  `Order` INT,
  PRIMARY KEY (Playlist_ID, Track_ID),
  FOREIGN KEY (Playlist_ID) REFERENCES Playlists(Playlist_ID),
  FOREIGN KEY (Track_ID) REFERENCES Tracks(Track_ID)
);

6.9. Followers Table

This table manages the relationship between users and artists, showing which artists a user follows. It helps users stay updated with their favorite artists’ latest releases.

Followers Table

CREATE TABLE Followers (
  User_ID INT,
  Artist_ID INT,
  PRIMARY KEY (User_ID, Artist_ID),
  FOREIGN KEY (User_ID) REFERENCES Users(User_ID),
  FOREIGN KEY (Artist_ID) REFERENCES Artists(Artist_ID)
);

6.10. Likes Table

Likes Table keeps track of which songs users have liked, helping personalize their music recommendations. It’s useful for understanding user preferences and improving recommendations.

Likes Table

CREATE TABLE Likes (
  User_ID INT,
  Track_ID INT,
  Like_Date_Time DATETIME,
  PRIMARY KEY (User_ID, Track_ID),
  FOREIGN KEY (User_ID) REFERENCES Users(User_ID),
  FOREIGN KEY (Track_ID) REFERENCES Tracks(Track_ID)
);

7. Microservices used for System Design for Spotify Premium

microservices-spotify

1. Publishing

In the publishing stage, content creators upload their content. This content is stored in both a raw media server and a metadata database. The upload process is facilitated by a service deployed within a containerized environment and auto-scaling group, ensuring scalability.

2. Distribution

Once content is published, it undergoes distribution. The files stored in the raw media server are processed by a media server, which handles tasks like protocol conversion and bitrate optimization. Post-processing, the files are transferred to a content delivery network (CDN), strategically located globally to reduce latency for users.

3. Search and Play

During the search and play phase, clients connect to the system to search for and listen to music. Clients, ranging from smartphones to smart TVs, connect to load balancers, which distribute requests to an auto-scaling group. This group comprises containers running various microservices like search, view, account, add to playlist, and payment services.

4. Amazon Infrastructure

To implement this architecture, Amazon S3 is utilized for storing raw and transcoded files, with Elastic Transcoder for media processing and CloudFront for CDN distribution.

8. API used in Spotify Premium System Design

searchService API – Upon receiving a search request, the search service retrieves relevant information from the metadata database. The response undergoes processing by a rules engine, which applies business rules and configurations before returning the results to the client.
viewService API- Similarly, the view service fetches specific content details from the metadata database, applies business rules, and returns the results to the client.
uploadService API – It helps content creators put their stuff on the platform. When they upload a song, this service makes sure it’s stored properly and ready for others to enjoy.
accountService API – The account service manages user accounts and subscriptions, authenticating users against a dedicated database and verifying subscriptions through payment services.
addPlaylistService API- When you want to add a new song to your playlist, this service handles it. It checks if the song fits your playlist rules (like size limits) and updates your playlist accordingly.
paymentService API- When you upgrade to a premium plan, this service makes it happen. It ensures your payments are secure and that your account reflects the changes accurately.

9. Scalability for Spotify Premium System Design

Assuming Spotify grows to 50 million users and 200 million songs, we need to make sure our system can handle all that data. For this, we have to think about how we store and manage information.

Storing Data:
- When we have more songs and users, we need more space to store information about them.
- For example, storing details about 200 million songs would need about 20GB of space, assuming each song’s info takes up around 100 bytes.
- Similarly, keeping track of 50 million users would need about 50GB of space, assuming each user’s info takes up about 1KB.
Managing the Database:
- To deal with the extra workload on our database, we can use a clever trick called the Leader-Follower method.
- Here, we have one main database (the Leader) that handles both reading and writing.
- Then, we have several other databases (the Followers or Slaves) that only handle reading.
- This helps spread out the work and makes it easier to find song and user details when needed.
Handling Complexity:
- In more complicated situations, like splitting up the database into smaller parts (called sharding) or using multiple main databases (Leader-Leader), things get a bit trickier.
- These methods can help us handle even more data, but they’re also more complex and might not be necessary for every situation.

Suggest improvement

System Design vs. Software Design

Share your thoughts in the comments