Deploy a HAProxy Load Balancer and multiple Web Servers on AWS instances Using ANSIBLE

Nov 2, 2020

Description:

Provision EC2 instances through Ansible.
Retrieve the IP addresses of the instances using a dynamic inventory.
Configure the web servers through an Ansible role.
Configure the load balancer through an Ansible role.
The target nodes of the load balancer should update automatically according to the status of the web servers.

Ansible is an open-source software provisioning, configuration management, and application-deployment tool enabling infrastructure as code. It runs on many Unix-like systems and can configure both Unix-like systems as well as Microsoft Windows. It includes its own declarative language to describe system configuration. Ansible was written by Michael DeHaan and acquired by Red Hat in 2015. Ansible is agentless, temporarily connecting remotely via SSH or Windows Remote Management (allowing remote PowerShell execution) to do its tasks.

HAProxy, which stands for High Availability Proxy, is a popular open-source software TCP/HTTP Load Balancer and proxying solution which can be run on Linux, Solaris, and FreeBSD. Its most common use is to improve the performance and reliability of a server environment by distributing the workload across multiple servers (e.g. web, application, database).

Load Balancer

Load balancing refers to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or server pool.

Load balancing distributes server loads across multiple resources — most often across multiple servers. The technique aims to reduce response time, increase throughput, and in general speed things up for each end-user.

A load balancer performs the following functions:

Distributes client requests or network load efficiently across multiple servers
Ensures high availability and reliability by sending requests only to servers that are online
Provides the flexibility to add or subtract servers as demand changes.

HAProxy Algorithms

Round Robin: This algorithm is the most commonly implemented. It works by using each server behind the load balancer in turns, according to their weights. It’s also probably the smoothest and most fair algorithm as the servers’ processing time stays equally distributed. As a dynamic algorithm, Round Robin allows server weights to be adjusted on the go.
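To make "each server in turns, according to their weights" concrete, here is a minimal Python sketch. It is only an illustration of the proportions: the server names and weights are made up, and HAProxy's real scheduler interleaves servers more smoothly than this naive expansion.

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Return an endless iterator in which each server appears `weight` times per cycle."""
    # Naive expansion: a server with weight 2 is listed twice in every cycle.
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)


# Hypothetical servers: web1 should receive twice the traffic of the others.
picker = weighted_round_robin({"web1": 2, "web2": 1, "web3": 1})
first_cycle = [next(picker) for _ in range(4)]
print(first_cycle)
```

After one full cycle the sequence starts over, so over time web1 handles half of the requests.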

Static Round Robin: Similar to Round Robin, each server is used in turns per their weights. Unlike Round Robin though, changing server weight on the fly is not an option. There are, however, no design limitations as far as the number of servers is concerned. When a server goes up, it will always be immediately reintroduced into the farm once the full map is recomputed.

Least Connections: With this algorithm, the server with the lowest number of connections receives the connection. This type of load balancing is recommended when very long sessions are expected, such as LDAP, SQL, TSE, etc. It’s not, however, well-suited for protocols using short sessions such as HTTP. This algorithm is also dynamic like Round Robin.
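The selection rule can be sketched in a few lines of Python. The connection counts below are invented for illustration, and the tie-breaking (first server listed wins) is an assumption of this sketch, not a statement about HAProxy.

```python
def pick_least_connections(active):
    """Return the server with the fewest active connections (first listed wins a tie)."""
    return min(active, key=active.get)


# Hypothetical snapshot of active connections per server.
active = {"web1": 12, "web2": 3, "web3": 7}

chosen = pick_least_connections(active)
active[chosen] += 1  # the new connection now counts against the chosen server
print(chosen, active[chosen])
```

Because the counts change with every connection that opens or closes, the choice adapts to long-lived sessions in a way a fixed rotation cannot.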

Source: This algorithm hashes the source IP and divides it by the total weight of running servers. The same client IP always reaches the same server as long as no server goes down or up. If the hash result changes due to the changing number of running servers, clients are directed to a different server. This algorithm is generally used in TCP mode where cookies cannot be inserted. It’s also static by default.
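A rough Python sketch of the idea follows. It assumes CRC32 as the hash and takes the remainder modulo the total weight. Both are stand-ins chosen for illustration (HAProxy uses its own hash function), and the server names and weights are hypothetical.

```python
import zlib


def pick_by_source(client_ip, servers):
    """Map a client IP to one of the running servers, proportionally to weight.

    `servers` is a list of (name, weight) pairs for servers that are currently up.
    """
    total_weight = sum(weight for _, weight in servers)
    # Assumption: CRC32 stands in for the load balancer's real hash function.
    slot = zlib.crc32(client_ip.encode()) % total_weight
    # Walk the servers until the slot falls inside a server's weight range.
    for name, weight in servers:
        if slot < weight:
            return name
        slot -= weight


servers = [("web1", 1), ("web2", 1), ("web3", 2)]
print(pick_by_source("203.0.113.7", servers))
```

Calling the function twice with the same IP and the same server list returns the same server. Changing the list changes the total weight, which is why clients can be redirected when a server goes up or down.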

URI: This algorithm hashes either the left part of the URI or the whole URI and divides the hash value by the total weight of running servers. The same URI is always directed to the same server as long as no servers go up or down. It’s also a static algorithm and works the same way as the Source algorithm.

URL Parameter: This static algorithm can only be used on an HTTP backend. The URL parameter that’s specified is looked up in the query string of each HTTP GET request. If the parameter that’s found is followed by an equal sign and value, the value is hashed and divided by the total weight of running servers.

Requirements:

An IAM user with administrator access
Ansible installed
The boto library for Python, installed with “pip3 install boto”

Setting up Dynamic Inventory for AWS

Create the directory /etc/mydb and download the following files into it.

wget https://raw.githubusercontent.com/ansible/ansible/stable-2.9/contrib/inventory/ec2.py
wget https://raw.githubusercontent.com/ansible/ansible/stable-2.9/contrib/inventory/ec2.ini

Make ec2.py executable:
chmod +x ec2.py

Make ec2.ini executable:
chmod +x ec2.ini

Export these variables:
export AWS_REGION='set region'
export AWS_ACCESS_KEY=xxxxxxxxxxxxxxxxxx
export AWS_SECRET_KEY=xxxxxxxxxxxxxxxxxx

Next, set the path to this inventory in the Ansible configuration file. Ansible has to execute these files to get the public IPs of the instances, so it needs to know where they are.
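It may help to see what Ansible gets back when it executes an inventory script: the script prints JSON that maps group names to hosts. The Python sketch below parses a hypothetical, trimmed sample of such output. The IP addresses are placeholders, and the group names assume the script builds them from tags in the form tag_<key>_<value>.

```python
import json

# Hypothetical sample of the JSON a dynamic inventory script prints for `--list`.
SAMPLE_OUTPUT = """
{
  "tag_Name_Ansible_WebServer": ["203.0.113.11", "203.0.113.12", "203.0.113.13"],
  "tag_Name_Ansible_Load_Balancer": ["203.0.113.10"],
  "_meta": {"hostvars": {}}
}
"""


def hosts_in_group(inventory_json, group):
    """Return the hosts listed under `group`, or an empty list if the group is absent."""
    inventory = json.loads(inventory_json)
    entry = inventory.get(group, [])
    # A group may be a plain list of hosts or a dict with a "hosts" key.
    if isinstance(entry, dict):
        return entry.get("hosts", [])
    return entry


web_servers = hosts_in_group(SAMPLE_OUTPUT, "tag_Name_Ansible_WebServer")
print(web_servers)
```

Groups like these are what the playbooks and templates later refer to when they need the addresses of the web servers.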

Step 1: Create a playbook to provision EC2 instances on the AWS cloud.

Creating Vault

Store the AWS access key and secret key in this vault.

ansible-vault create --vault-id name@prompt vault-name.yml

Create a playbook that launches the EC2 instances in AWS

Here I launch four instances: one for the load balancer and three for the web servers.

- hosts: localhost
  gather_facts: false
  vars_files:
        - aws_Cred.yml
  tasks:
      - name: "LoadBalancer"
        ec2:
                count: 1
                image: "ami-0e306788ff2473ccb"
                instance_type: t2.micro
                region: "ap-south-1"
                wait: yes
                instance_tags:
                        Name: Ansible_Load_Balancer
                group_id: "sg-014233fc4d89285b0"
                key_name: "myrdr66"
                state: present
                aws_access_key: "{{access_key}}"
                aws_secret_key: "{{secret_key}}"

      - name: "webserver"
        ec2:
                count: 3
                image: "ami-0e306788ff2473ccb"
                instance_type: t2.micro
                region: "ap-south-1"
                wait: yes
                instance_tags:
                        Name: Ansible_WebServer
                group_id: "sg-014233fc4d89285b0"
                key_name: "myrdr66"
                state: present
                aws_access_key: "{{access_key}}"
                aws_secret_key: "{{secret_key}}"

Run Playbook

ansible-playbook --vault-id name@prompt ec2.yml

Configure the launched instances

For this, create one role for the web servers and one for the load balancer.

Load Balancer

tasks/main.yml
---
# tasks file for LB
- name: "installing haproxy"
  package:
        name: "haproxy"
        state: present
- template:
        src: "haproxy.j2"
        dest: "/etc/haproxy/haproxy.cfg"
  notify: "restart service"
- service:
       name: "haproxy"
       state: started
       enabled: yes

handlers/main.yml
---
# handlers file for LB
- name: "restart service"
  service:
      name: "haproxy"
      state: restarted

templates/haproxy.j2

I changed the port number and added a Jinja for loop that fetches every IP from the group tag_Name_ansible_webserver, which contains the IPs of the web servers.
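The template itself is not reproduced here, so the following is only a Python stand-in that shows the kind of output such a loop produces: one server line per web server IP. The backend name "app", the server names, port 80, and the IP addresses are all assumptions for illustration. In the real template the loop would be a Jinja for loop over the inventory group.

```python
def render_backend(ips, port=80):
    """Build a HAProxy backend section with one `server` line per IP address."""
    lines = ["backend app"]
    # Assumption: servers are named app1, app2, ... and all listen on the same port.
    lines += [
        f"    server app{index} {ip}:{port} check"
        for index, ip in enumerate(ips, start=1)
    ]
    return "\n".join(lines)


# Hypothetical web server IPs, as they might come from the dynamic inventory group.
config = render_backend(["203.0.113.11", "203.0.113.12"])
print(config)
```

Because the list of IPs comes from the inventory at run time, rerunning the role after web servers are added or removed regenerates the server lines, and the handler restarts HAProxy with the new targets.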