What is load balancing? 
Load balancing is the process of distributing traffic among multiple servers to improve a service or application’s performance and reliability.
Load balancing is the practice of distributing computational workloads between two or more computers. On the Internet, load balancing is often employed to divide network traffic among several servers. This reduces the strain on each server and makes the servers more efficient, speeding up performance and reducing latency. Load balancing is essential for most Internet applications to function properly.

Imagine a highway with 8 lanes, but only one lane is open for traffic due to construction. All vehicles must merge into that single lane, causing a massive traffic jam and long delays. Now, imagine the construction ends, and all 8 lanes are opened. Vehicles can spread out across the lanes, significantly reducing travel time for everyone.

Load balancing essentially accomplishes the same thing. Dividing user requests among multiple servers cuts user wait time substantially. This results in a better user experience: the drivers in the example above would probably look for another route if they always sat in a single-lane traffic jam.

How does load balancing work?

Load balancing is handled by a tool or application called a load balancer. A load balancer can be either hardware-based or software-based. Hardware load balancers require the installation of a dedicated load balancing device; software-based load balancers can run on a server, on a virtual machine, or in the cloud. Content delivery networks (CDNs) often include load balancing features.

When a request arrives from a user, the load balancer assigns the request to a given server, and this process repeats for each request. Load balancers determine which server should handle each request based on a number of different algorithms. These algorithms fall into two main categories: static and dynamic.

Static load balancing algorithms

Static load balancing algorithms distribute workloads without taking into account the current state of the system. A static load balancer is not aware of which servers are performing slowly and which servers are underused. Instead, it assigns workloads based on a predetermined plan. Static load balancing is quick to set up but can result in inefficiencies.

As an analogy, imagine a grocery store with 8 open checkout lines and an employee whose job is to direct customers into the lines. This employee simply goes in order, assigning the first customer to line 1, the second customer to line 2, and so on, without looking back to see how quickly the lines are moving. If all 8 cashiers perform efficiently, this system works fine, but if one or more is lagging, some lines may become far longer than others, resulting in bad customer experiences. Static load balancing presents the same risk: individual servers can still become overburdened.

Round robin DNS and client-side random load balancing are two common forms of static load balancing.

1. Round robin: Round robin load balancing distributes traffic to a list of servers in rotation using the Domain Name System (DNS). An authoritative nameserver will have a list of different A records for a domain and provides a different one in response to each DNS query.

2. Weighted round robin: Allows an administrator to assign different weights to each server. Servers deemed able to handle more traffic will receive slightly more. Weighting can be configured within DNS records.

3. IP hash: Combines incoming traffic’s source and destination IP addresses and uses a mathematical function to convert it into a hash. Based on the hash, the connection is assigned to a specific server.
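The DNS rotation described in item 1 can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name `RoundRobinNameserver` and the IP addresses are made up for illustration, and the sketch hands back a single record per query, as the text describes, rather than modelling a real DNS response.

```python
from collections import deque


class RoundRobinNameserver:
    """Toy authoritative nameserver that rotates A records per query."""

    def __init__(self, a_records):
        # a_records: hypothetical IP addresses registered for one domain.
        self._records = deque(a_records)

    def resolve(self):
        # Answer with the record at the front, then rotate the list so
        # the next query receives the following record.
        answer = self._records[0]
        self._records.rotate(-1)
        return answer


ns = RoundRobinNameserver(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
answers = [ns.resolve() for _ in range(4)]
print(answers)  # the fourth query wraps around to the first record
```

Note that nothing in `resolve` looks at how busy each address is, which is exactly what makes the approach static.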

Dynamic load balancing algorithms

Dynamic load balancing algorithms take the current availability, workload, and health of each server into account. They can shift traffic from overburdened or poorly performing servers to underutilized servers, keeping the distribution even and efficient. However, dynamic load balancing is more difficult to configure. A number of different factors play into server availability: the health and overall capacity of each server, the size of the tasks being distributed, and so on.

Suppose the grocery store employee who sorts the customers into checkout lines uses a more dynamic approach: the employee watches the lines carefully, sees which are moving the fastest, observes how many groceries each customer is purchasing, and assigns the customers accordingly. This may ensure a more efficient experience for all customers, but it also puts a greater strain on the line-sorting employee.

There are several types of dynamic load balancing algorithms, including least connection, weighted least connection, weighted response time, and resource-based load balancing.

1. Least connection: Checks which servers have the fewest connections open at the time and sends traffic to those servers. This assumes all connections require roughly equal processing power.

2. Weighted least connection: Gives administrators the ability to assign different weights to each server, if some servers can handle more connections than others.

3. Weighted response time: Averages the response time of each server and combines that with the number of connections each server has open to determine where to send traffic. By sending traffic to the servers with the quickest response time, the algorithm ensures faster service for users.

4. Resource-based: Distributes load based on what resources each server has available at the time. Specialized software (called an “agent”) running on each server measures that server’s available CPU and memory, and the load balancer queries the agent before distributing traffic to that server.
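As a rough illustration of item 3 (weighted response time), the sketch below combines each server's average response time with its open connection count. The scoring formula, the `Backend` class, and the sample numbers are assumptions chosen for the example; real load balancers may combine the two signals differently.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    name: str
    open_connections: int = 0
    response_times: list = field(default_factory=list)  # seconds

    def average_response_time(self):
        # Assumption: a server with no samples yet counts as instantly
        # responsive, so it gets a chance to receive traffic.
        if not self.response_times:
            return 0.0
        return sum(self.response_times) / len(self.response_times)


def score(backend):
    # Lower is better: slow responses and many open connections both
    # push a server down the ranking.
    return backend.average_response_time() * (backend.open_connections + 1)


def pick_backend(backends):
    return min(backends, key=score)


pool = [
    Backend("app-1", open_connections=4, response_times=[0.20, 0.30]),
    Backend("app-2", open_connections=2, response_times=[0.10, 0.12]),
    Backend("app-3", open_connections=1, response_times=[0.50, 0.70]),
]
chosen = pick_backend(pool)
print(chosen.name)  # app-2: fast responses and a moderate connection count
```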

Where is load balancing used?

As discussed above, load balancing is often used with web applications. Software-based and cloud-based load balancers help distribute Internet traffic evenly between servers that host the application. Some cloud load balancing products can balance Internet traffic loads across servers that are spread out around the world, a process known as global server load balancing (GSLB).

Load balancing is also commonly used within large, localized networks, like those within a data center or a large office complex. Traditionally, this has required the use of hardware appliances such as an application delivery controller (ADC) or a dedicated load balancing device. Software-based load balancers are also used for this purpose.

What is server monitoring?

Dynamic load balancers must be aware of each server's health: its current status, how well it is performing, and so on. Dynamic load balancers monitor servers by performing regular server health checks. If a server or group of servers is performing slowly, the load balancer distributes less traffic to it. If a server or group of servers fails completely, the load balancer reroutes traffic to another group of servers, a process known as “failover.”

What is failover?

Failover occurs when a given server is not functioning, and a load balancer distributes its normal processes to a secondary server or group of servers. Server failover is crucial for reliability: if there is no backup in place, a server crash could bring down a website or application. It is important that failovers take place quickly to avoid a gap in service.
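Health checks and failover fit together roughly as in the sketch below. Everything named here (`Server`, `run_health_checks`, `route`, the `probe` callable, the server names) is hypothetical; a real health check would typically send a network request to each server on a timer instead of consulting a set.

```python
class Server:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy


def run_health_checks(servers, probe):
    # probe is a hypothetical callable that returns True when a server
    # answers its health check.
    for server in servers:
        server.healthy = probe(server)


def route(primary_group, secondary_group):
    # Prefer healthy primary servers; fail over to the secondary group
    # only when no primary server is left.
    healthy = [s for s in primary_group if s.healthy]
    if healthy:
        return healthy
    return [s for s in secondary_group if s.healthy]


primary = [Server("web-1"), Server("web-2")]
secondary = [Server("backup-1")]

# Simulate both primary servers failing their health checks.
down = {"web-1", "web-2"}
run_health_checks(primary + secondary, probe=lambda s: s.name not in down)

targets = [s.name for s in route(primary, secondary)]
print(targets)  # traffic now goes to the backup group
```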

Load Balancing Techniques:

  • Round Robin load balancing method

Round robin load balancing is the simplest and most commonly used load balancing algorithm. Client requests are distributed to application servers in simple rotation. For example, if you have three application servers, the first client request is sent to the first application server in the list, the second client request to the second application server, the third client request to the third application server, the fourth to the first application server, and so on.

Round robin load balancing is most appropriate for predictable client request streams that are being spread across a server farm whose members have relatively equal processing capabilities and available resources (such as network bandwidth and storage).
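The three-server rotation described above can be written as a minimal Python sketch. The server names are placeholders, and `next_server` is an illustrative helper rather than part of any load balancer's API.

```python
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]
rotation = cycle(servers)  # endlessly repeats the list in order


def next_server():
    # Each call hands out the next server in the rotation.
    return next(rotation)


assignments = [next_server() for _ in range(4)]
print(assignments)  # the fourth request returns to app-1
```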

  • Weighted Round Robin load balancing method

Weighted round robin is similar to the round-robin load balancing algorithm, adding the ability to spread the incoming client requests across the server farm according to the relative capacity of each server. It is most appropriate for spreading incoming client requests across a set of servers that have varying capabilities or available resources. The administrator assigns a weight to each application server based on criteria of their choosing that indicates the relative traffic-handling capability of each server in the farm.

So, for example: if application server #1 is twice as powerful as application servers #2 and #3, application server #1 is provisioned with a higher weight, and application servers #2 and #3 get the same, lower weight. If there are five sequential client requests, the first two go to application server #1, the third goes to application server #2, and the fourth to application server #3. The fifth request then goes back to application server #1, and so on.
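The five-request example can be reproduced with a small sketch, assuming weights of 2, 1, and 1 for the three servers. Expanding each server into as many slots as its weight is only one simple way to build the schedule; production load balancers often interleave the slots more smoothly.

```python
from itertools import cycle


def weighted_schedule(weights):
    # Repeat each server as many times as its weight, then cycle
    # through the expanded list forever.
    schedule = []
    for server, weight in weights.items():
        schedule.extend([server] * weight)
    return cycle(schedule)


# Assumed weights: app-1 is twice as powerful as app-2 and app-3.
weights = {"app-1": 2, "app-2": 1, "app-3": 1}
rotation = weighted_schedule(weights)

first_five = [next(rotation) for _ in range(5)]
print(first_five)  # app-1, app-1, app-2, app-3, then app-1 again
```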

  • Least Connection load balancing method

Least connection load balancing is a dynamic load balancing algorithm where client requests are distributed to the application server with the least number of active connections at the time the client request is received. In cases where application servers have similar specifications, one server may be overloaded due to longer lived connections; this algorithm takes the active connection load into consideration. This technique is most appropriate for incoming requests that have varying connection times and a set of servers that are relatively similar in terms of processing power and available resources.
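A minimal sketch of this bookkeeping follows. The class and method names are illustrative, and ties are broken by list order, which is an assumption of the example.

```python
class LeastConnectionBalancer:
    def __init__(self, servers):
        # Track the number of active connections per server.
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Pick the server with the fewest active connections; ties go
        # to whichever server was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a client connection closes.
        self.active[server] -= 1


lb = LeastConnectionBalancer(["app-1", "app-2", "app-3"])
first = [lb.acquire() for _ in range(3)]  # one connection on each server
lb.release("app-2")                       # app-2's connection ends early
next_pick = lb.acquire()
print(first, next_pick)  # the new request goes to app-2
```

Unlike round robin, the choice depends on which connections are still open, so a server tied up with long-lived connections is passed over.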

  • Weighted Least Connection load balancing method

Weighted least connection builds on the least connection load balancing algorithm to account for differing application server characteristics. The administrator assigns a weight to each application server based on the relative processing power and available resources of each server in the farm. The load balancer makes load balancing decisions based on active connections and the assigned server weights (e.g., if there are two servers with the lowest number of connections, the server with the highest weight is chosen).
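The tie-breaking rule from the example in parentheses can be sketched as follows. The data layout and names are assumptions for illustration; some implementations instead compare the ratio of active connections to weight, which is not shown here.

```python
def pick_server(servers):
    # servers maps a name to (active_connections, weight).
    # Sort key: fewest connections first, then highest weight.
    return min(servers, key=lambda name: (servers[name][0], -servers[name][1]))


servers = {
    "app-1": (3, 1),
    "app-2": (2, 1),
    "app-3": (2, 5),  # same connection count as app-2, higher weight
}
chosen = pick_server(servers)
print(chosen)  # app-3 wins the tie on weight
```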

  • Resource Based (Adaptive) load balancing method

Resource based (or adaptive) load balancing makes decisions based on status indicators that the load balancer retrieves from the back-end servers. The status indicator is determined by a custom program (an “agent”) running on each server. The load balancer queries each server regularly for this status information and then sets the dynamic weight of that server appropriately.

In this fashion, the load balancing method is essentially performing a detailed “health check” on each back-end server. This method is appropriate in any situation where detailed health check information from each server is required to make load balancing decisions. For example, this method would be useful for any application where the workload is varied and detailed application performance and status is required to assess server health. This method can also be used to provide application-aware health checking for Layer 4 (UDP) services.
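To make the agent idea concrete, here is a hedged sketch in which each agent report is a dictionary of free CPU and free memory fractions. The report format, the `dynamic_weight` formula, and the numbers are all assumptions; an actual agent defines its own status indicator.

```python
def dynamic_weight(status):
    # Assumption: the scarcer of the two resources limits how much
    # additional work a server can take on.
    return min(status["cpu_free"], status["mem_free"])


def pick_server(reports):
    # Send the next request to the server with the highest dynamic weight.
    return max(reports, key=lambda name: dynamic_weight(reports[name]))


# Hypothetical reports collected from the agent on each server.
reports = {
    "app-1": {"cpu_free": 0.10, "mem_free": 0.60},
    "app-2": {"cpu_free": 0.55, "mem_free": 0.40},
    "app-3": {"cpu_free": 0.30, "mem_free": 0.35},
}
chosen = pick_server(reports)
print(chosen)  # app-2 has the most headroom on its tightest resource
```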

  • Resource Based (SDN Adaptive) load balancing method

SDN adaptive is a load balancing algorithm that combines knowledge from Layers 2, 3, 4 and 7 with input from a Software Defined Network (SDN) controller to make more optimized traffic distribution decisions. This allows information about the status of the servers, the status of the applications running on them, the health of the network infrastructure, and the level of congestion on the network to all play a part in the load balancing decision. This method is appropriate for deployments that include an SDN controller.

  • Fixed Weighting load balancing method

Fixed weighting is a load balancing algorithm where the administrator assigns a weight to each application server based on criteria of their choosing to represent the relative traffic-handling capability of each server in the server farm. The application server with the highest weight will receive all of the traffic. If the application server with the highest weight fails, all traffic will be directed to the next highest weight application server. This method is appropriate for workloads where a single server is capable of handling all expected incoming requests, with one or more “hot spare” servers available to pick up the load should the currently active server fail.

  • Weighted Response Time load balancing method

The weighted response time load balancing algorithm uses each application server’s response time to calculate a server weight. The application server that is responding the fastest receives the next request. This algorithm is appropriate for scenarios where the application response time is the paramount concern.

  • Source IP Hash load balancing method

The source IP hash load balancing algorithm uses the source and destination IP addresses of the client request to generate a unique hash key which is used to allocate the client to a particular server. As the key can be regenerated if the session is broken, the client request is directed to the same server it was using previously. This method is most appropriate when it’s vital that a client always return to the same server for each successive connection.
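A possible sketch of this mapping is shown below. The choice of SHA-256, the key format, and the server names are assumptions made for the example. `hashlib` is used instead of Python's built-in `hash()` because the built-in is randomized per process for strings, which would break the "same client, same server" property across restarts.

```python
import hashlib

SERVERS = ["app-1", "app-2", "app-3"]


def server_for(source_ip, destination_ip, servers=SERVERS):
    # Build a key from both addresses and hash it deterministically.
    key = f"{source_ip}->{destination_ip}".encode()
    digest = hashlib.sha256(key).digest()
    # Reduce the hash to an index into the server list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


first = server_for("203.0.113.7", "198.51.100.1")
again = server_for("203.0.113.7", "198.51.100.1")
print(first == again)  # True: the same address pair maps to the same server
```

One consequence of the modulo step is that adding or removing a server changes the mapping for many clients, so this simple form works best with a stable server list.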

  • URL Hash load balancing method

The URL hash load balancing algorithm is similar to source IP hashing, except that the hash created is based on the URL in the client request. This ensures that client requests to a particular URL are always sent to the same back-end server.
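The same idea applied to URLs might look like the sketch below. Hashing only the path and ignoring the query string is an assumption of this example, as are the CRC32 hash and the back-end names.

```python
import zlib
from urllib.parse import urlsplit

BACKENDS = ["cache-1", "cache-2", "cache-3"]


def backend_for(url, backends=BACKENDS):
    # Assumption: only the path identifies the resource, so requests
    # that differ only in their query string share a back end.
    path = urlsplit(url).path or "/"
    return backends[zlib.crc32(path.encode()) % len(backends)]


a = backend_for("https://example.com/images/logo.png?v=1")
b = backend_for("https://example.com/images/logo.png?v=2")
print(a == b)  # True: both requests share the path /images/logo.png
```

Pinning each URL to one back end tends to help when the back ends cache content, since each cached object then lives on a single server.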

Conclusion

Load balancing is a critical component of modern web infrastructure that ensures optimal performance, reliability, and scalability of applications and services. Through various algorithms and techniques, load balancers effectively distribute incoming traffic across multiple servers, preventing any single server from becoming overwhelmed while maintaining consistent service delivery.
