Book Summaries | System Design Interview (Part 1) - Scale from Zero to Millions of Users
January 20th, 2024
Chapter 1: Scale from Zero to Millions of Users
This summary serves to further cement what I learned while reading the first chapter of System Design Interview - An Insider's Guide (Part 1), titled "Scale from Zero to Millions of Users", and I hope it provides some learnings for you as well.
Single Server Setup:
- Initially, systems run on a single server that handles the web application, database, and cache.
- Users access the website through domain names, and HTTP requests are sent to the web server.
- The traffic comes from both web and mobile applications.
- With the growth of the user base, a single server is no longer sufficient, leading to the need for multiple servers.
- Choosing between relational and non-relational databases is essential based on application requirements.
Vertical Scaling vs Horizontal Scaling:
- Vertical scaling involves adding more power (CPU, RAM) to servers, while horizontal scaling adds more servers.
- Vertical scaling has limitations: there is a hard ceiling on how much power one machine can have, and a single machine offers no failover or redundancy.
- Horizontal scaling is more desirable for large-scale applications.
- Load balancers help distribute traffic among servers, addressing failover and load-balancing issues.
- Database replication is introduced for failover and redundancy in the data tier. This can include master-slave replication.
Load Balancer:
- A load balancer evenly distributes incoming traffic among web servers.
- Private IPs are used for communication between servers for better security.
- Load balancing prevents website downtime by redirecting traffic if one server goes offline.
- It allows for graceful handling of rapid traffic growth by adding more servers to the pool.
- Improves availability and ensures a balanced load among servers.
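As a rough sketch of the idea (not a real load balancer, and using made-up private IPs), a round-robin balancer just hands each incoming request to the next server in the pool, and servers can be added or removed as traffic grows or machines go offline:

```python
class RoundRobinBalancer:
    """Toy round-robin load balancer: distributes requests evenly
    across a pool of web servers reachable via private IPs."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._i = 0  # index of the next server to receive a request

    def add_server(self, server):
        # graceful handling of traffic growth: grow the pool
        self.servers.append(server)

    def remove_server(self, server):
        # take an offline server out of rotation so traffic is redirected
        self.servers.remove(server)

    def next_server(self):
        server = self.servers[self._i % len(self.servers)]
        self._i += 1
        return server
```

A real load balancer also health-checks servers and supports other policies (least connections, weighted), but the round-robin case captures the core idea of spreading load evenly.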
Database Replication:
- Database replication involves a master/slave relationship between original and copied databases.
- Advantages include better performance, reliability, and high availability.
- Replication allows for continued operation even if a database server goes offline.
- Failover mechanisms ensure a seamless transition in case of server failures.
- The system design incorporates load balancers and database replication for improved availability.
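The routing rule behind master/slave replication can be sketched in a few lines (hypothetical server names, and a deliberately simplified failover that just promotes the first slave): writes go to the master, reads are spread across slaves, and if the master dies a slave takes over.

```python
import random

class ReplicatedDatabase:
    """Sketch of read/write routing over a master/slave replica set."""

    WRITE_OPS = ("INSERT", "UPDATE", "DELETE")

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def route(self, operation):
        # writes must go to the master; reads can go to any slave
        if operation in self.WRITE_OPS:
            return self.master
        return random.choice(self.slaves) if self.slaves else self.master

    def fail_over(self):
        # simplified failover: promote the first slave to master
        self.master = self.slaves.pop(0)
```

Real failover is much harder (detecting failure, catching the promoted slave up on missed writes), which is why the book flags it as a non-trivial production concern.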
Cache:
- A cache layer is introduced to improve load/response time by storing frequently accessed data.
- Caches store the results of expensive operations (e.g. frequent database queries), reducing the need to repeatedly hit the database.
- Considerations include deciding when to use cache, setting expiration policies, and maintaining consistency.
- Cache servers provide APIs for common programming languages.
- Caching strategies, such as read-through cache, are employed based on data type, size, and access patterns.
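A minimal in-memory sketch of a read-through cache with a TTL expiration policy (the `load_fn` callable here is a stand-in for an expensive database query): on a hit the cached value is returned, on a miss or expiry the loader is called and the result is cached.

```python
import time

class ReadThroughCache:
    """Read-through cache sketch: the cache itself loads missing
    entries via load_fn and expires them after ttl_seconds."""

    def __init__(self, load_fn, ttl_seconds=60):
        self.load_fn = load_fn
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                       # cache hit
        value = self.load_fn(key)                 # miss: query the database
        self.store[key] = (value, time.time() + self.ttl)
        return value
```

The TTL is the expiration policy the chapter mentions: too short and the cache is useless, too long and the data goes stale relative to the database (the consistency concern).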
Content Delivery Network (CDN):
- A CDN is a network of servers delivering static content like images, videos, CSS, etc.
- CDN servers cache static content, improving load times for users.
- Considerations include cost, setting cache expiry times, handling CDN failures, and invalidating files.
- Static assets are no longer served by web servers; clients fetch them from the CDN for better performance.
- Database load is lightened by caching data, improving overall system efficiency.
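In practice this shift often amounts to rewriting asset URLs to point at the CDN, with a fallback to the origin server if the CDN is down. A tiny sketch (the CDN hostname is hypothetical):

```python
CDN_HOST = "https://cdn.example.com"  # hypothetical CDN domain

def asset_url(path, cdn_available=True):
    """Serve static assets from the CDN; fall back to the origin
    web server's /static/ path when the CDN is unavailable."""
    path = path.lstrip("/")
    if cdn_available:
        return f"{CDN_HOST}/{path}"
    return f"/static/{path}"
```

Handling the `cdn_available=False` branch is the "handling CDN failures" consideration: a temporary CDN outage should degrade to origin serving rather than break the page.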
Stateless Web Tier:
- Scaling the web tier horizontally involves moving state (e.g. user session data) out of the web tier.
- Session data is stored in a persistent data store, making the web tier stateless.
- Stateless architecture simplifies the system, making it more robust and scalable.
- Stateful architecture involves remembering client data, while stateless architecture keeps no state information.
- The updated design reflects a stateless web tier, improving scalability and robustness.
Multiple Data Centers:
- Supporting multiple data centers is crucial for improving availability and user experience across geographical areas.
- GeoDNS is used to route users to the closest data center based on their location.
- Challenges include traffic redirection, data synchronization, and testing/deployment across multiple locations.
- The system directs all traffic to a healthy data center in case of a significant data center outage.
- A multi-data center setup enhances system resilience and global availability.
Message Queue:
- A message queue serves as a durable, memory-based component supporting asynchronous communication.
- Decoupling components using a message queue enhances scalability and reliability.
- Producers create messages and publish them to the queue, while consumers connect to the queue to perform actions.
- Message queues facilitate handling asynchronous requests and enable independent scaling of producers and consumers.
- Use cases include supporting photo customization tasks with asynchronous processing.
Logging, Metrics, Automation:
- Logging is crucial for monitoring error logs, identifying system errors, and aggregating logs for centralized analysis.
- Metrics collection, including host-level and business metrics, provides insights into system health and performance.
- Automation tools, such as continuous integration, enhance productivity by automating code verification and deployment processes.
- Automated testing, build processes, and deployment contribute to efficient system management.
- The updated system design includes message queues and various tools for logging, metrics, and automation.
Database Scaling:
- Two broad approaches for database scaling are vertical scaling (adding power to existing machines) and horizontal scaling (adding more servers).
- Sharding is a horizontal scaling technique, dividing large databases into smaller, manageable parts called shards.
- Sharding introduces complexities, including resharding data, the celebrity problem (i.e. too much traffic onto one shard because of popularity of data located at that shard), and challenges with join operations.
- The choice of a sharding key is crucial for efficient data distribution across shards.
- The system incorporates sharded databases and NoSQL data stores to handle increasing data traffic efficiently.
Conclusion
- Keep the web tier stateless for simplicity and scalability.
- Introduce redundancy at every tier for failover and improved availability.
- Utilize caching at various levels to reduce database workloads and improve system performance.
- Support multiple data centers for global availability and better user experience.
- Scale the data tier through techniques like sharding and incorporating NoSQL data stores.
- Decouple different components using message queues for independent scaling.
- Implement logging, metrics, and automation tools for effective system monitoring and management.
- The iterative process of scaling involves fine-tuning and adapting strategies to tackle new challenges.