Published on November 13, 2023

API Gateway

API Gateway is a vital component to scaling and securing modern distributed systems. It sits between the client and a suite of backend services and serves as a single point of entry to an application. Some major API Gateway providers include AWS API Gateway, Azure API Management, and Google API Gateway. They tend to come with features such as request routing, load balancing, authentication and authorization, rate limiting, caching, and logging right out of the box. Upon receiving a request from the client, an API Gateway will be able to forward the request to the appropriate backend service based on a predefined set of rules. Load balancer comes standard with an API Gateway and helps distribute traffic across multiple machines. Distribution policy can be configured to use round robin, sticky round robin, weighted round robin, IP/URL hashing, least connections, and least latency. See Exploring Different Types of Load Balancers for more details. API Gateway can also serve as a gatekeeper through authentication and authorization. Implementation can vary and depends on the authentication provider. Rate limiter is an important API Gateway feature to help prevent abuse against the backend services. Rate limiting policy can be configured to use token bucket, leaking bucket, fixed window counter, sliding window log, and sliding window counter. Some API Gateway offers caching features to help reduce load on the backend services and improve performance. Logging is another feature that comes with API Gateway. It enables usage tracking and troubleshooting to gain better insight into the system. These are just some of the features provided by an API Gateway. Implementation may vary between providers.

API GatewayAWS API GatewayAzure API ManagementGoogle API GatewayLoad BalancerLoad Balancer CacheRate LimiterTech
Published on November 8, 2023

Exploring Different Types of Cache Systems

Caching is a common technique in modern distributed systems to enhance performance and reduce response time. The general idea is to reuse previously computed values and prevent subsequent server or database hits. A distributed system can have multiple caching points starting with browser cache, CDN, load balancer cache, distributed cache, and database cache. The following techniques assume that data within the cache are not stale. Browser caching can cache HTTP responses and facilitate faster data retrieval. The browser cache should shave off significantly from the response time. Enable browser cache by adding an expiration policy in the response HTTP headers. Web assets such as images, videos, and documents are perfect content for caching because they do not change as often. Web assets are typically cached in content delivery networks (CDN) that are geographically distributed to be as close to the request origin as possible to reduce response time. Content can be personalized through edge nodes as well. Load balancer caching can help reduce stress on the servers and improve response time. Depending on their implementation, load balancers can be configured to respond with cached results for subsequent requests with the same parameters. Distributed caches such as Redis are in-memory key-value stores with high read and write performance. One common application of distributed cache is inverted indexing for full document search. Lastly, depending on implementation, database may have caching functionality such as bufferpool and views to help improve response time. Bufferpool caches query results in allocated memory for future retrieval. Similarly, views cache precomputed query results to help reduce latency.

Browser CacheCacheContent Delivery NetworkDatabase CacheDistributed CacheDistributed SystemsLoad BalancerLoad Balancer CacheTech
Published on October 12, 2023

Exploring Different Types of Load Balancers

Load balancers are crucial components for enabling large-scale web applications. Some key features of a load balancer include distribution of traffic, health checks, session persistence, etc. There are two classes of load balancer: static and dynamic. Static load balancers are rule-based and do not adapt to changing server conditions, whereas dynamic load balancers continuously monitor server metrics and adjust the load distribution based on these metrics. Static load balancing methods include Round Robin, Sticky Round Robin, Weighted Round Robin, and IP/URL Hash. Dynamic load balancers include Least Connections and Least Time. Round Robin is the simplest approach in traffic distribution. The idea is to evenly distribute requests amongst all the servers regardless of server metrics. The down side to this approach is that server can become overloaded if they are not monitored closely. Sticky Round Robin is a variation of Round Robin. In this approach, the load balancer will attempt to send subsequent requests from the same user to the same server. Related data are grouped and processed together, improving processing performance. Uneven load can easily occur in this approach because newly arrived requests are assigned randomly. Weighted Round Robin allows users to configure the weight of each server within the server pool. The load balancer will distribute traffic to each server in proportion to its assigned weight. The downside is that server weights need to be configured manually, which can make this method less adaptable in a rapidly changing environment. IP/URL Hash load balancing is similar to Sticky Round Robin, as it maps user IP addresses or requested URLs to specific servers. Again, related data are grouped and processed together, improving processing performance. Selecting a proper hash algorithm can be a barrier to entry. Least Connections is a dynamic approach to load balancing, where incoming requests are routed to the server with the fewest active connections, effectively assigning requests to servers with the most remaining capacity. This approach requires that each server track its on going connections. Traffic can unintentionally be piled onto a single server. Least Time is another dynamic approach to load balancing, where the load balancer routes traffic to the server with the lowest latency or fastest response time. This involves continuously measuring the latency from each server, leading to increased overhead and complexity. Each of the mentioned methods has its trade-offs in terms of capabilities, constraints, and performance. It is important to consider traffic pattern when choosing and fine-tuning a load balancing approach.

Dynamic Load BalancingIP/URL HashLeast ConnectionsLeast TimeLoad BalancerRound RobinStatic Load BalancingSticky Round RobinTechWeighted Round Robin