Book Summaries | System Design Interview (Part 1) - Scale from Zero to Millions of Users

January 20th, 2024

Chapter 1: Scale from Zero to Millions of Users

This summary serves to further cement the lessons I took from the first chapter of System Design Interview - An Insider's Guide (Part 1), titled Scale from Zero to Millions of Users, and I hope it provides some learnings for you as well.

Single Server Setup:

  • Initially, the entire system runs on a single server that hosts the web app, database, and cache.
  • Users access the website through a domain name, and HTTP requests are sent to the web server (see the sketch after this list).
  • Traffic comes from both web browsers and mobile applications.
  • As the user base grows, a single server is no longer sufficient, and multiple servers become necessary.
  • Choosing between relational and non-relational databases is essential based on application requirements.
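
To make the flow concrete, here is a minimal Python sketch (my own illustration, not from the book; www.example.com is a placeholder domain and the snippet needs network access): the domain name is resolved to the server's IP via DNS, and the client then sends an HTTP request to that single machine.

    import socket
    import urllib.request

    # Placeholder domain used purely for illustration.
    DOMAIN = "www.example.com"

    # Step 1: DNS resolves the domain name to the web server's IP address.
    ip_address = socket.gethostbyname(DOMAIN)
    print(f"{DOMAIN} resolves to {ip_address}")

    # Step 2: The browser or mobile app sends an HTTP request to the web server,
    # which (in the single-server setup) also hosts the database and cache.
    with urllib.request.urlopen(f"http://{DOMAIN}/") as response:
        print(response.status, response.read(200))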

Vertical Scaling vs Horizontal Scaling:

  • Vertical scaling involves adding more power (CPU, RAM) to servers, while horizontal scaling adds more servers.
  • Vertical scaling has limitations: there is a hard limit on how much CPU and RAM one machine can hold, and a single machine offers no failover or redundancy.
  • Horizontal scaling is more desirable for large-scale applications.
  • Load balancers help distribute traffic among servers, addressing failover and load-balancing issues.
  • Database replication is introduced for failover and redundancy in the data tier. This can include master-slave replication.

Load Balancer:

  • A load balancer evenly distributes incoming traffic among the web servers (a round-robin sketch follows this list).
  • Private IPs are used for communication between servers for better security.
  • Load balancing prevents website downtime by redirecting traffic if one server goes offline.
  • It allows for graceful handling of rapid traffic growth by adding more servers to the pool.
  • It improves availability and keeps the load evenly balanced across servers.
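
The sketch below shows the round-robin idea in Python (my own illustration, not from the book; the class name and private IPs are made up): requests rotate through the pool of healthy servers, an offline server stops receiving traffic, and new servers can be added to absorb growth.

    import itertools

    # Hypothetical private IPs for the web servers behind the load balancer.
    SERVERS = ["10.0.0.1", "10.0.0.2"]

    class RoundRobinBalancer:
        """Minimal round-robin load balancer: each request goes to the next
        healthy server in the pool."""

        def __init__(self, servers):
            self.servers = list(servers)
            self.healthy = set(servers)
            self._cycle = itertools.cycle(self.servers)

        def mark_down(self, server):
            # If a server goes offline, stop routing traffic to it.
            self.healthy.discard(server)

        def add_server(self, server):
            # Handle traffic growth gracefully by adding servers to the pool.
            self.servers.append(server)
            self.healthy.add(server)
            self._cycle = itertools.cycle(self.servers)

        def pick(self):
            for _ in range(len(self.servers)):
                server = next(self._cycle)
                if server in self.healthy:
                    return server
            raise RuntimeError("no healthy servers available")

    lb = RoundRobinBalancer(SERVERS)
    print([lb.pick() for _ in range(4)])  # alternates between the two servers
    lb.mark_down("10.0.0.1")
    print([lb.pick() for _ in range(2)])  # only the remaining healthy server is chosen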

Database Replication:

  • Database replication establishes a master/slave relationship between the original database (master) and its copies (slaves): writes go to the master, while reads are distributed across the slaves (see the sketch after this list).
  • Advantages include better performance, reliability, and high availability.
  • Replication allows for continued operation even if a database server goes offline.
  • Failover mechanisms ensure a seamless transition in case of server failures.
  • The system design incorporates load balancers and database replication for improved availability.
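
A simple way to picture the read/write split and failover is the sketch below (my own illustration; the class and host names are hypothetical): write queries only ever reach the master, read queries are spread across the slaves, and a slave can be promoted if the master goes offline.

    import random

    class ReplicatedDatabase:
        """Illustrative read/write splitting for master-slave replication."""

        def __init__(self, master, slaves):
            self.master = master
            self.slaves = list(slaves)

        def execute_write(self, sql):
            # Writes (INSERT/UPDATE/DELETE) only ever hit the master.
            return f"write on {self.master}: {sql}"

        def execute_read(self, sql):
            # Reads are spread across slaves; if no slave is available,
            # fall back to the master so reads keep working.
            target = random.choice(self.slaves) if self.slaves else self.master
            return f"read on {target}: {sql}"

        def promote_slave(self):
            # Simplified failover: if the master dies, promote a slave.
            self.master = self.slaves.pop(0)

    db = ReplicatedDatabase("db-master", ["db-slave-1", "db-slave-2"])
    print(db.execute_write("INSERT INTO users VALUES (1, 'alice')"))
    print(db.execute_read("SELECT * FROM users"))
    db.promote_slave()
    print(db.execute_write("UPDATE users SET name = 'bob' WHERE id = 1"))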

Cache:

  • A cache layer is introduced to improve load/response time by storing frequently accessed data.
  • Caches store the results of expensive responses, reducing the need to repeatedly call the database (a cache-aside sketch with an expiration policy follows this list).
  • Considerations include deciding when to use cache, setting expiration policies, and maintaining consistency.
  • Cache servers provide APIs for common programming languages.
  • Caching strategies, such as read-through cache, are employed based on data type, size, and access patterns.
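
Here is a cache-aside style read path in Python with a simple time-based expiration policy (my own sketch, not from the book; a plain dict stands in for a cache server such as Memcached or Redis): check the cache first, fall back to the database on a miss, and store the result with a TTL.

    import time

    CACHE_TTL_SECONDS = 60   # expiration policy: cached entries live for one minute
    _cache = {}              # in-memory stand-in for a dedicated cache server

    def query_database(key):
        # Stand-in for an expensive database call.
        return f"value-for-{key}"

    def get(key):
        """Check the cache first; on a miss, read from the database and
        store the result with a timestamp so stale entries expire."""
        entry = _cache.get(key)
        if entry is not None:
            value, stored_at = entry
            if time.time() - stored_at < CACHE_TTL_SECONDS:
                return value         # cache hit
            del _cache[key]          # expired entry
        value = query_database(key)  # cache miss: fall back to the database
        _cache[key] = (value, time.time())
        return value

    print(get("user:42"))  # miss: hits the database and populates the cache
    print(get("user:42"))  # hit: served from the cache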

Content Delivery Network (CDN):

  • A CDN is a network of servers delivering static content like images, videos, CSS, etc.
  • CDN servers cache static content, improving load times for users.
  • Considerations include cost, setting cache expiry times, handling CDN failures, and invalidating files.
  • Static assets are no longer served by the web servers; instead, they are fetched from the CDN for better performance (see the URL sketch after this list).
  • Database load is lightened by caching data, improving overall system efficiency.
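
In practice, moving static content to a CDN often boils down to building asset URLs against the CDN host instead of the origin. The sketch below (my own illustration; cdn.example.com and the fallback behavior are assumptions, not from the book) shows that, plus a crude origin fallback for the CDN-failure consideration above.

    # Hypothetical CDN hostname; a real CDN provider assigns its own domain
    # or lets you point a CNAME at it.
    CDN_HOST = "https://cdn.example.com"
    ORIGIN_HOST = "https://www.example.com"

    def asset_url(path, cdn_available=True):
        """Build the URL for a static asset. When the CDN is available, assets
        are served from the CDN host rather than the origin web servers; the
        CDN fetches from the origin on the first request and caches the file."""
        host = CDN_HOST if cdn_available else ORIGIN_HOST
        return f"{host}/{path.lstrip('/')}"

    print(asset_url("images/logo.png"))         # served via the CDN
    print(asset_url("images/logo.png", False))  # temporary CDN outage: fall back to the origin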

Stateless Web Tier:

  • Scaling the web tier horizontally involves moving state (e.g. user session data) out of the web tier.
  • Session data is stored in a shared, persistent data store, making the web tier stateless so any server can handle any request (see the sketch after this list).
  • Stateless architecture simplifies the system, making it more robust and scalable.
  • Stateful architecture involves remembering client data, while stateless architecture keeps no state information.
  • The updated design reflects a stateless web tier, improving scalability and robustness.
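
The sketch below (my own illustration; the server and session names are made up, and a plain dict stands in for a shared store such as Redis or a relational database) shows why statelessness matters: because session data lives in the shared store, consecutive requests for the same session can be handled by different web servers.

    import json

    # Stand-in for a shared session store; in a real system this would be
    # Redis, Memcached, or a database table shared by all web servers.
    shared_session_store = {}

    def handle_request(server_name, session_id, request_data):
        """Stateless web tier: the server keeps no session data in its own
        memory; it loads and saves the session via the shared data store."""
        session = shared_session_store.get(session_id, {})
        session.update(request_data)
        shared_session_store[session_id] = session
        return f"{server_name} handled session {session_id}: {json.dumps(session)}"

    # Requests from the same user can land on different servers.
    print(handle_request("web-1", "sess-abc", {"cart": ["book"]}))
    print(handle_request("web-2", "sess-abc", {"theme": "dark"}))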

Multiple Data Centers:

  • Supporting multiple data centers is crucial for improving availability and user experience across geographical areas.
  • GeoDNS is used to route users to the closest data center based on their location (a routing sketch follows this list).
  • Challenges include traffic redirection, data synchronization, and testing/deployment across multiple locations.
  • The system directs all traffic to a healthy data center in case of a significant data center outage.
  • A multi-data center setup enhances system resilience and global availability.
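
The following sketch approximates, at the application level, what geoDNS does at the DNS layer (my own illustration; the region table, data center names, and health flags are all made up): users are routed to the closest data center, and traffic is redirected to a healthy one during an outage.

    # Hypothetical data centers and a very rough region-based preference table.
    DATA_CENTERS = {
        "us-east": "lb.us-east.example.com",
        "eu-west": "lb.eu-west.example.com",
    }
    REGION_PREFERENCE = {"US": ["us-east", "eu-west"], "EU": ["eu-west", "us-east"]}
    healthy = {"us-east": True, "eu-west": True}

    def resolve(user_region):
        """Return the closest healthy data center; if the preferred one is
        down, direct traffic to the remaining healthy data center."""
        for dc in REGION_PREFERENCE[user_region]:
            if healthy[dc]:
                return DATA_CENTERS[dc]
        raise RuntimeError("no healthy data center available")

    print(resolve("EU"))         # routed to eu-west under normal conditions
    healthy["eu-west"] = False   # simulate a data center outage
    print(resolve("EU"))         # failover: EU traffic goes to us-east instead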

Message Queue:

  • A message queue serves as a durable, memory-based component supporting asynchronous communication.
  • Decoupling components using a message queue enhances scalability and reliability.
  • Producers create messages and publish them to the queue, while consumers connect to the queue to perform actions.
  • Message queues facilitate handling asynchronous requests and enable independent scaling of producers and consumers.
  • Use cases include photo customization tasks handled asynchronously (illustrated by the producer/consumer sketch after this list).
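
Below is a minimal producer/consumer sketch in Python (my own illustration; queue.Queue stands in for a real message broker such as RabbitMQ or Kafka, and the photo job format is made up): the producer publishes jobs and returns immediately, while a worker consumes them at its own pace.

    import queue
    import threading

    # In-memory stand-in for a durable message queue.
    photo_jobs = queue.Queue()

    def producer(photo_ids):
        # Web servers publish photo-customization jobs and return right away
        # instead of blocking while the work is done.
        for photo_id in photo_ids:
            photo_jobs.put({"photo_id": photo_id, "task": "resize"})
        photo_jobs.put(None)  # sentinel: no more jobs

    def consumer():
        # Worker processes consume jobs at their own pace and can be scaled
        # independently of the producers.
        while True:
            job = photo_jobs.get()
            if job is None:
                break
            print(f"processing {job['task']} for photo {job['photo_id']}")

    worker = threading.Thread(target=consumer)
    worker.start()
    producer(["p1", "p2", "p3"])
    worker.join()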

Logging, Metrics, Automation:

  • Logging is crucial for monitoring error logs, identifying system errors, and aggregating logs for centralized analysis.
  • Metrics collection, including host-level and business metrics, provides insight into system health and performance (see the sketch after this list).
  • Automation tools, such as continuous integration, enhance productivity by automating code verification and deployment processes.
  • Automated testing, build processes, and deployment contribute to efficient system management.
  • The updated system design includes message queues and various tools for logging, metrics, and automation.
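
As a tiny illustration of the kinds of metrics the chapter lists (my own sketch; the metric names and values are invented), counters can track aggregated and business metrics, while gauge-style samples capture host-level metrics such as CPU utilization.

    import time
    from collections import defaultdict

    counters = defaultdict(int)  # monotonically increasing counts
    gauges = {}                  # most recent sampled value per metric

    def increment(metric, value=1):
        counters[metric] += value

    def record_gauge(metric, value):
        gauges[metric] = (value, time.time())

    increment("requests_total")            # aggregated metric, bumped per request
    increment("revenue_usd", 25)           # business metric
    record_gauge("cpu_utilization", 0.42)  # host-level metric, sampled periodically

    print(dict(counters), gauges)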

Database Scaling:

  • Two broad approaches for database scaling are vertical scaling (adding power to existing machines) and horizontal scaling (adding more servers).
  • Sharding is a horizontal scaling technique, dividing large databases into smaller, manageable parts called shards.
  • Sharding introduces complexities, including resharding data, the celebrity problem (i.e. a single shard being overwhelmed because the data it holds is especially popular), and difficulty performing join operations across shards.
  • The choice of a sharding key is crucial for distributing data evenly across shards (see the modulo-based sketch after this list).
  • The system incorporates sharded databases and NoSQL data stores to handle increasing data traffic efficiently.
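
The chapter's sharding example keys on user_id with a modulo over the number of shards; the sketch below follows that idea (the shard hostnames are hypothetical), and the closing comment notes why resharding is painful.

    NUM_SHARDS = 4  # the chapter's example spreads users across four shards

    def shard_for(user_id):
        """Modulo-based sharding: the sharding key (user_id) modulo the number
        of shards decides which shard stores that user's rows."""
        return user_id % NUM_SHARDS

    def connection_string(user_id):
        # Hypothetical shard hostnames, purely for illustration.
        return f"db-shard-{shard_for(user_id)}.internal"

    for uid in (1, 2, 5, 1001):
        print(uid, "->", connection_string(uid))

    # Resharding problem: changing NUM_SHARDS changes shard_for(user_id) for
    # most keys, forcing data to be moved between shards (commonly mitigated
    # with consistent hashing).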

Conclusion

  • Keep the web tier stateless for simplicity and scalability.
  • Introduce redundancy at every tier for failover and improved availability.
  • Utilize caching at various levels to reduce database workloads and improve system performance.
  • Support multiple data centers for global availability and better user experience.
  • Scale the data tier through techniques like sharding and incorporating NoSQL data stores.
  • Decouple different components using message queues for independent scaling.
  • Implement logging, metrics, and automation tools for effective system monitoring and management.
  • The iterative process of scaling involves fine-tuning and adapting strategies to tackle new challenges.