Deliverable:
A CloudFormation template (or a complete Ansible script) that builds an ELK stack.
This write-up documents the setup and construction of an Ansible script that will automatically deploy and configure an EC2 instance on AWS with a complete ELK stack (Elasticsearch, Logstash, Kibana). The installation also includes NGINX as a reverse proxy. The only runtime dependency is Java 1.8.
A Linux host running Ansible can be used to deploy the playbook.
Setup: Creating an instance.
A standalone Linux distro can be used, but for the sake of familiarity and ease of use, the standard Amazon Linux AMI (ami-15e9c770) was used on a t2.micro instance. This can be initiated by signing into the AWS Management Console (this assumes you've already created a free account), browsing to Services, clicking on EC2, clicking on Launch Instance, and selecting your machine image. Instance details can be left at defaults, including the security group. (https://aws.amazon.com/console/)
Clicking launch on the instance will prompt you to select a key pair. You can create a new key pair via the GUI. Download the private key and save to a secure location on your machine. This private key will be used to authenticate and securely SSH into your Linux instance.
A note on SSH: Secure Shell (SSH) is a secure network protocol used to establish a remote login to another computer system. The holder of the private key will have access via SSH connection to any system profile that has the authorized public key stored in its ~/.ssh/authorized_keys file. This assumes that the key is not password protected. Keys can be created using the built-in ssh-keygen and similar commands (https://www.ssh.com/ssh/).
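As a quick illustration of the ssh-keygen workflow described above (the file name and comment below are arbitrary examples, not values used elsewhere in this write-up):

```shell
# Generate a 4096-bit RSA key pair with no passphrase (-N "").
# "elk_demo_key" and the "elk-demo" comment are example names only.
ssh-keygen -t rsa -b 4096 -N "" -C "elk-demo" -f ./elk_demo_key

# On the target system, append the public half to the account's
# authorized_keys file and lock down permissions.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
cat ./elk_demo_key.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

The private key (elk_demo_key) stays with the connecting user; only the .pub half ever lands on the server.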
Before any SSH connection can be established, port 22 must be opened for inbound traffic on the security group associated with the instance. The "default" security group can be edited by selecting it from the left pane of the screen. Once selected, scroll down and click on the Inbound tab. Click Edit, then Add Rule, and select the SSH type with the TCP protocol. Source should be My IP. Click Save. Note that you can also add an inbound Custom ICMP rule (Echo Request) from your IP to be able to ping the machine. Several other rules will need to be added later on for ELK access.
Once the key pair has been created and the private key downloaded, an SSH client can be used to establish a connection. PuTTY was used for this documentation (http://www.putty.org/). Documentation for PuTTY configuration with Amazon EC2 instances can be found here (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html).
It's important to note where system information is located in the Amazon EC2 screen. Select the created instance under Running Instances and scroll down to view all of the system information. This includes public and private IPs and DNS addresses, along with key pairs and all other important information about the instance.
After establishing an SSH connection to the machine, the configuration management software can be loaded. For this documentation, Ansible will be used.
A note on Ansible: Ansible allows software to be pushed to machines without an associated agent on the target systems. It does this by using an inventory file that includes the IP addresses of all target systems. Playbooks, written in YAML (http://yaml.org/), can be run to deploy configurations and install software on target systems, as well as create new instances from scratch; each playbook task invokes an Ansible module to do the work.
To install Ansible, the Python package manager can be used (pip). Amazon Linux AMI includes this by default. To install Ansible on your host machine, use the following command line:
sudo pip install ansible
Reference:
(http://docs.ansible.com/ansible/latest/intro_installation.html)
Once installed, a few configuration changes need to be made to allow Ansible to connect to newly created instances via SSH. Since the main method Ansible uses to push configurations is SSH, it cannot service remote machines it cannot establish an SSH connection to. To resolve this, the private key can be copied into your ~/.ssh folder as the id_rsa file.
Additionally, the Ansible remote user has to be set to the remote user that the SSH key is configured for. This will come into play when writing the playbook for the deployment. The default Amazon AMI user is "ec2-user".
The Ansible config will also need to be modified to ignore host authenticity warnings when creating instances and connecting via SSH. This can be achieved by editing the ansible.cfg file located at /etc/ansible/ansible.cfg with an editor of your choice. If no config is present, one can be created with the same name at that location, or in the same directory as the playbook you are running. Add the following line to the config:
host_key_checking = False
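Pulling together the settings discussed so far, a minimal ansible.cfg might look like the following sketch (the private_key_file path is an assumption; point it at wherever you saved the downloaded key):

```ini
[defaults]
# Skip the interactive host authenticity prompt for new instances
host_key_checking = False
# Default Amazon AMI user for SSH connections
remote_user = ec2-user
# Example path -- adjust to the location of your downloaded AWS key
private_key_file = ~/.ssh/id_rsa
```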
Note: while learning, it's worth knowing that a completely new key can be created with ssh-keygen and moved to the server. (https://superuser.com/questions/1135766/how-to-ssh-from-one-ec2-instance-to-another)
You will also need an AWS Access Key ID and Secret Access Key. In any real-world deployment you should keep these credentials secured, but for this example the values were stored inside the script. Amazon provides several methods for storing these values in environment variables (http://docs.aws.amazon.com/cli/latest/userguide/cli-environment.html).
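For example, the environment-variable approach keeps the credentials out of the playbook itself; the AWS CLI and Ansible's EC2 modules both pick these up automatically (the values below are placeholders, not real keys):

```shell
# Placeholder credentials -- substitute the Access Key ID and
# Secret Access Key generated in the IAM console.
export AWS_ACCESS_KEY_ID="AKIAEXAMPLEKEYID"
export AWS_SECRET_ACCESS_KEY="exampleSecretKeyValue"
export AWS_DEFAULT_REGION="us-east-2"
```

With these set, the aws_access_key and aws_secret_key lines could be dropped from the playbook entirely.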
To generate an Access Key ID, go to AWS Services, select IAM, Add User, provide a name and select access type. Programmatic access was used. Select a group or create a group for the permissions policies. A group of AdministratorAccess was used for this example. Once you've generated an Access Key ID and Secret Key, make sure to save the Secret Key in a secure location.
It's good practice to reference examples and familiarize yourself with the syntax via hands-on learning before starting on the main script. Ansible module documentation, organized by category, can be found here (http://docs.ansible.com/ansible/latest/modules_by_category.html). A reading of the entire guide is a great primer for writing a playbook.
Since Ansible will be used to deploy a server from scratch, the Ansible host can be defined as localhost; all commands then run from the host machine. The user is defined as ec2-user to ensure the SSH connection is established correctly. The become directive gives the user sudo access; a different become user can also be specified via the become_user directive.
It's worth noting that, as of this writing, the public IP and VPC subnet settings could not be used in the ec2 module without errors. An extremely basic EC2 instance is provisioned and created via the script. This instance is added to a new inventory group and an SSH connection is established. Note that the private IPs are added as inventory items, since the instances are part of the same VPC; when SSHing from one EC2 instance to another, the private IP is used.
__________________________________________
---
- hosts: localhost
  user: ec2-user
  tasks:
    - name: Start EC2 Instance
      become: true
      ec2:
        aws_access_key: REMOVED
        aws_secret_key: REMOVED
        key_name: "linkey1"
        group: "default"
        instance_type: "t2.large"
        image: "ami-15e9c770"
        wait: true
        region: "us-east-2"
      register: ec2
    - name: Add new instances to host group
      add_host:
        hostname: "{{ item.private_ip }}"
        groupname: launched
      with_items: "{{ ec2.instances }}"
    - name: Wait for SSH
      wait_for:
        host: "{{ item.public_dns_name }}"
        port: 22
        delay: 60
        timeout: 240
        state: started
      with_items: "{{ ec2.instances }}"
__________________________________________
The individual instance(s) can now be configured.
For learning purposes, each individual component was installed on a new EC2 instance before code was written. Official documentation for each program was used to determine any prerequisites.
Elasticsearch: a search engine based on Apache Lucene. Since it is written in Java, and the official site lists Java as a prerequisite, Java will have to be updated on the instance before Elasticsearch can be installed. The Amazon Linux AMI comes preloaded with Java 1.7, while the latest version of Elasticsearch requires Java 1.8.
Since a new group with all of the private IP addresses of the created instances was established in the last code block, the group is now referenced in the hosts section. Remote user is set to the standard Linux user.
Java can be pulled and installed using the yum package manager, for which Ansible has a module. The default Java path must also be set, using the alternatives mechanism.
__________________________________________
# ELASTICSEARCH CONFIGURATION
- name: Configure instance
  hosts: launched
  become: true
  remote_user: ec2-user
  tasks:
    - name: Download & Install Java Version 8
      yum:
        name: "java-1.8.0-openjdk.x86_64"
        state: present
    - name: Set Alternative java
      alternatives:
        name: "java"
        path: "/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java"
    - name: Clean Up
      yum:
        name: "java-1.7.0-openjdk"
        state: absent
    - name: Download Elasticsearch
      get_url:
        url: "https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.1.0.rpm"
        dest: "/home/ec2-user/elasticsearch-6.1.0.rpm"
    - name: Install Elasticsearch
      yum:
        name: "/home/ec2-user/elasticsearch-6.1.0.rpm"
        state: present
    - name: Start Elasticsearch Service
      service:
        name: elasticsearch
        state: started
        enabled: yes
__________________________________________
Elasticsearch is downloaded via a direct link. In a non-test environment, a repository should be used. It's then installed with yum, and the service is started. The official installation instructions for Elasticsearch can be found here (https://www.elastic.co/guide/en/elasticsearch/reference/current/rpm.html).
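As a sketch of that repository-based alternative, the direct download could be replaced with a yum_repository task; the baseurl and GPG key below are taken from Elastic's official 6.x RPM instructions:

```yaml
# Sketch: register the official Elastic yum repository instead of
# downloading the RPM directly, then install by package name.
- name: Add Elasticsearch repository
  yum_repository:
    name: elasticsearch-6.x
    description: Elasticsearch repository for 6.x packages
    baseurl: https://artifacts.elastic.co/packages/6.x/yum
    gpgcheck: yes
    gpgkey: https://artifacts.elastic.co/GPG-KEY-elasticsearch
    enabled: yes
- name: Install Elasticsearch from the repository
  yum:
    name: elasticsearch
    state: present
```

This way, later `state: latest` runs can pick up point releases without editing a hard-coded RPM URL.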
The next piece of software in the ELK stack is Logstash. Logs from various sources can be pipelined and sent to Elasticsearch.
__________________________________________
# LOGSTASH CONFIGURATION
- name: Download Logstash
  get_url:
    url: "https://artifacts.elastic.co/downloads/logstash/logstash-6.1.0.rpm"
    dest: "/home/ec2-user/logstash-6.1.0.rpm"
- name: Install Logstash
  yum:
    name: "/home/ec2-user/logstash-6.1.0.rpm"
    state: present
- name: Configure Logstash
  template:
    src: "/home/ec2-user/templates/logstash-simple.j2"
    dest: "/usr/share/logstash/bin/logstash-simple.conf"
    owner: "logstash"
    group: "logstash"
    mode: '0777'
- name: Install Files Plugin
  shell: "/usr/share/logstash/bin/logstash-plugin install logstash-input-file"
- name: Set directory permissions 1
  file:
    path: "/usr/share/logstash"
    owner: "logstash"
    group: "logstash"
    mode: '0777'
- name: Set directory permissions 2
  file:
    path: "/usr/share/logstash/data"
    state: directory
    owner: "logstash"
    group: "logstash"
    mode: '0777'
- name: Set directory permissions 3
  file:
    path: "/usr/share/logstash/data/queue"
    state: directory
    owner: "logstash"
    group: "logstash"
    mode: '0777'
- name: Copy test log data
  copy:
    src: "/home/ec2-user/testdata.txt"
    dest: "/home/ec2-user/test.txt"
- name: Start Logstash Service
  shell: "/usr/share/logstash/bin/logstash -f /usr/share/logstash/bin/logstash-simple.conf -t"
__________________________________________
The Logstash download and install process is the same as for Elasticsearch, except that a configuration template must be used. A Jinja2 template can be used to copy the configuration information.
A note on Jinja2: Jinja2 is built on Python and allows for templates to be configured with variables and other complex data structures before being copied to the destination.
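For instance, a template line containing a variable placeholder is rendered by Ansible before the file lands on the destination (the es_host variable name here is a hypothetical example, not one used in this playbook):

```
# Template line in the .j2 file (hypothetical variable "es_host"):
elasticsearch { hosts => ["{{ es_host }}:9200"] }

# Rendered result copied to the destination when es_host is "localhost":
elasticsearch { hosts => ["localhost:9200"] }
```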
Using the template module in Ansible easily allows the J2 file to be copied. For this example, a J2 file was created in a templates folder inside the user's home folder. More specifically, for Logstash, a default configuration with no remote inputs was copied to the instance, because the whole ELK stack is configured on a single instance. If Elasticsearch were configured on a separate server, its address would be referenced here. The script below simply points the output of Logstash to the local Elasticsearch port, 9200.
This script also takes advantage of the files plugin that was installed above. A specific log file will be read from the beginning to end and added to Elasticsearch via the Logstash pipeline for test/demonstration purposes.
Logstash note: To check the installed version of logstash, reference the binary with a fully qualified path:
/usr/share/logstash/bin/logstash --version
__________________________________________
input {
  file {
    path => "/home/ec2-user/test.txt"
    start_position => "beginning"
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
__________________________________________
After the template is copied, the Logstash service can be started. This service is located in the following location:
/usr/share/logstash/bin
In order to point the binary at the correct configuration, a shell command can be used with the -f flag followed by the configuration file. This was already done automatically in the Logstash install section of the main Ansible script above.
The next software package to configure is Kibana, which is used for visualization of logs. Kibana is an extremely straightforward install, with no config files.
__________________________________________
# KIBANA CONFIGURATION
- name: Download Kibana
  get_url:
    url: "https://artifacts.elastic.co/downloads/kibana/kibana-6.1.0-x86_64.rpm"
    dest: "/home/ec2-user/kibana-6.1.0-x86-64.rpm"
- name: Install Kibana
  yum:
    name: "/home/ec2-user/kibana-6.1.0-x86-64.rpm"
    state: present
- name: Start Kibana Service
  service:
    name: kibana
    state: started
    enabled: yes
__________________________________________
Since the server hosting Kibana and the rest of the stack has no GUI or web browser, a way must be created to access the software stack remotely. Using another piece of software, such as NGINX, to configure a reverse proxy allows others to actually use the service directly. NGINX can be installed via a built-in repository. Another tool, httpd-tools, can be installed on the host server and used to create a password file via the "htpasswd" command. This file can be copied to the new EC2 ELK instance and incorporated into the NGINX configuration. The install command is:
sudo yum install httpd-tools
__________________________________________
# NGINX For Web Access
- name: Install NGINX
  yum:
    name: nginx
    state: present
- name: Copy Pswd File
  copy:
    src: "/home/ec2-user/templates/htpasswd.users"
    dest: "/etc/nginx/htpasswd.users"
- name: Install Config for Rev Proxy
  template:
    src: "/home/ec2-user/templates/defaultnginx.j2"
    dest: "/etc/nginx/nginx.conf"
- name: Start NGINX Service
  service:
    name: nginx
    state: started
    enabled: yes
__________________________________________
NGINX is installed via a standard repository. A password file with hashed credentials is placed in a specific NGINX folder to enable basic authentication. The reverse proxy requires a configuration file in order to work. Using a Jinja2 template, a simple reverse proxy config can forward the default Kibana port traffic to port 80 at the public DNS name. A standard YAML/Ansible variable can be passed from the script into the template upon creation to set the default DNS name.
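That variable hand-off can be sketched as follows; the exact lookup depends on how the playbook is structured (here the value is assumed to come from the ec2 result registered in the launch play):

```yaml
# Sketch: supply public_dns_name to the Jinja2 template so the
# rendered nginx.conf gets the correct server_name. The vars lookup
# below assumes the registered "ec2" result is reachable from this play.
- name: Install Config for Rev Proxy
  template:
    src: "/home/ec2-user/templates/defaultnginx.j2"
    dest: "/etc/nginx/nginx.conf"
  vars:
    public_dns_name: "{{ ec2.instances[0].public_dns_name }}"
```

Inside the template, `{{ public_dns_name }}` is then replaced with the instance's actual DNS name at render time.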
__________________________________________
server {
    listen 80;
    server_name {{ public_dns_name }};

    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
__________________________________________
Basic authentication is required, and the hashed password file is referenced for authentication in the proxy configuration above.
With all of the templates in place, the Ansible playbook can be run with the "ansible-playbook" command from the directory containing the script. Example:
ansible-playbook elk.yml
This begins the server configuration and installation of the stack.
A note on security settings: Since Kibana is forwarding to port 80 via NGINX proxy, port 80 must be allowed as an inbound rule. This can be locked down to a specific IP, or to everyone, since there is also an additional layer of password protection. This can be done by editing the rules in the Security Group, Group Name settings on the AWS Management Console.
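The same rule change can also be scripted rather than clicked through; a sketch using the ec2_group module (the group name and region are assumed to match the launch playbook above, and note that ec2_group manages the full rule list, so existing rules such as SSH must be restated):

```yaml
# Sketch: open ports 22 and 80 inbound on the "default" security group.
# ec2_group replaces the group's rule set, so all desired rules
# must be listed here, not just the new one.
- name: Allow SSH and HTTP inbound
  ec2_group:
    name: "default"
    description: "ELK stack access"
    region: "us-east-2"
    rules:
      - proto: tcp
        from_port: 22
        to_port: 22
        cidr_ip: 0.0.0.0/0
      - proto: tcp
        from_port: 80
        to_port: 80
        cidr_ip: 0.0.0.0/0
```

As the note above says, the cidr_ip values can be tightened to a specific IP instead of 0.0.0.0/0, since the password layer is the only other protection.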
Once the Ansible playbook has finished running, copy the public IP from the host information or the AWS console. Paste into browser.
Sign in with the user created in the htpasswd file (kibadmin in this example) and its password.
To see new data, SSH into the new server and run the configuration command again:
/usr/share/logstash/bin/logstash -f /usr/share/logstash/bin/logstash-simple.conf
Refresh the Kibana page and you should see a new entry in Discover.
You now have a complete, Ansible-installed and configured ELK stack with login and plugins working, in minutes. You can launch multiple instances by changing the count in the playbook and adjusting the instance type. The configuration files are also easily adjusted to pipe in external log sources.