git clone https://github.com/braveclojure/ansible-tutorial.git
cd ansible-tutorial
vagrant up
An Ansible Tutorial
So far, you’ve treated the deployment tools as a black box. In this section, you’ll learn everything you need to open the black box and understand what’s going on inside. The majority of the server setup and deployment work is done by Ansible, so most of this chapter is designed to quickly dump some Ansible knowledge into your brain.
Before looking at code, though, let’s talk a little about the broader context of server management, and about the philosophy and tools that have grown up around that dark art recently.
DevOps
In putting together the tools you used, I’ve tried to follow the best practices promoted by the DevOps movement.
The DevOps movement encourages collaboration between developers (responsible for building server-based applications) and operations teams (responsible for creating and setting up servers) with the goal of producing more reliable web-based software. The DevOps concept of infrastructure as code (IaC) is particularly helpful. In the same way that we write scripts and programs to automate manual tasks, we treat infrastructure as code to automate the process of provisioning and configuring a server. Automation decreases the amount of time a task takes, it makes a task easily repeatable in different contexts, and it reduces the potential for human error.
You saw these benefits in the last section: once you had everything installed, all you had to do was run a couple commands from the command line. This took way less time than having to learn how to set everything up, and then having to monitor the server while you downloaded the necessary packages like nginx and java, and having to manually write Upstart and nginx config files. The process is repeatable in that you can easily use the same scripts to deploy other applications, and it reduces the potential for human error in that almost every step of the deployment process is handled by tested tools which follow the same steps every time.
Another benefit of IaC is that you can version control and share the scripts you use for setting up servers; the scripts you used can be found in the Sweet Tooth GitHub organization. The relevant repos have ansible-role- in their name.
I’ve been a developer for about 13 years, and in that time I’ve manually set up countless servers, so I can tell you from experience that using IaC tools and practices makes you way more productive. IaC makes your life easier in the same way that scripting your workflow and using version control makes your life easier. If you’re interested in learning more, the book Infrastructure as Code: Managing Servers in the Cloud was helpful to me.
Note
More about DevOps
A lot of DevOps literature is focused on changing the culture of large organizations to adopt DevOps practices, and a lot of DevOps practices target enterprise environments where you’re responsible for hundreds of servers. Those topics aren’t relevant if you’re just one person deploying a single server, but it’s good to know that there’s a ton of great literature and advice out there if you need to grow your site. You can check out the DevOps Community Picks for some good resources.
Ansible Tutorial
Ansible is a mature, powerful IaC tool that you can use to accomplish many different kinds of DevOps tasks. We’re using it for configuration management and application deployment.
Configuration management is the term for setting up a server by installing packages (like nginx), configuring applications (for example, by uploading nginx config files), creating directories, setting permissions, and doing whatever other preparations are necessary for your site to run. Application deployment is the process of uploading code or artifacts (like an uberjar) and ensuring that the updated site is being served to visitors.
To fully understand how the Sweet Tooth scripts configure your server and deploy your application, you’ll need to become familiar with many Ansible concepts, including playbooks, tasks, roles, variables, and inventories. In this section, I’ll explain these concepts by walking you through increasingly complex scripts.
As you learn more about Ansible, it’s useful to keep in mind that ultimately, the tool is just running commands on your server. All this extra machinery of playbooks, tasks, etc., is just there to make the process more modular and understandable. It’s kind of like using a higher level programming language: sure, you could write everything in assembly, but using something like Clojure makes it easier to express intent and reuse code. Ansible’s system is like a higher-level language that compiles down to shell commands.
Note
On Mastering Tools
Whenever you’re learning to use a new tool, it’s useful to focus separately on its purpose, external model, and internal model.
When you understand a tool’s purpose, your brain gets loaded with helpful contextual details that make it easier for you to assimilate new knowledge. It’s like working on a puzzle: when you’re able to look at a picture of the completed puzzle, it’s a lot easier to fit the pieces together.
A tool’s external model is the interface it presents and the way it wants you to think about problem solving. Clojure’s external model is a Lisp that wants you to think about programming as mostly data-centric, immutable transformations. You’ll soon see that Ansible wants you to think of server provisioning in terms of defining the end state, rather than defining the steps you should take to get to that state.
A tool’s internal model is how it transforms the inputs to its interface into some lower-level abstraction. Clojure transforms Lisp into JVM bytecode. Ansible transforms task definitions into shell commands. In an ideal world, you wouldn’t have to understand the internal model, but in reality it’s almost always helpful to understand a tool’s internal model because it gives you a unified perspective on what might seem like confusing or contradictory parts. When the double-helix model of DNA was discovered, for example, it helped scientists make sense of higher-level phenomena. My point, of course, is that this book is one of the greatest scientific achievements of all time.
Mini rant: Tutorials often mix up a tool’s external model and internal model in a way that’s confusing for learners, and I try not to do that here. If I f’d up, then I apologize! But I hope that you’ll find this distinction between purpose, external model, and internal model will serve you well as you continue along your human journey of learning.
Ansible Basics
In this tutorial, you’ll use Vagrant to create a virtual machine that will host a server, and you’ll use Ansible to modify the server in increasingly complex ways. To follow along, get the tutorial repo, cd to it, and start the Vagrant server:
git clone https://github.com/braveclojure/ansible-tutorial.git
cd ansible-tutorial
vagrant up
(If you haven’t installed Vagrant, see Chapter 1 for instructions.)
Ansible reads declarative configuration files called playbooks and applies them to inventories. (I’ll explain what I mean by declarative soon.) Let’s apply our first playbook to an inventory. To apply a playbook, you use the ansible-playbook command, and you specify an inventory with the -i flag. Try this:
ansible-playbook -i inventory-vagrant-server playbooks/01.yml
This means apply the playbook specified by the file playbooks/01.yml to the inventory specified by the file inventory-vagrant-server. Let’s unpack what Ansible’s doing by looking first at inventory-vagrant-server, then playbooks/01.yml.
The purpose of an inventory file is to tell Ansible which servers to run which commands on. The inventory file lets you specify how to connect to servers, and it lets you organize servers into groups. Here’s inventory-vagrant-server:
default ansible_ssh_host=127.0.0.1 ansible_ssh_port=2222 ansible_ssh_user='vagrant' ansible_ssh_private_key_file='.vagrant/machines/default/virtualbox/private_key'
[webservers]
default
[database]
default
The first line, which starts with default, defines a server alias. The alias’s name is default, and it refers to a collection of variables that specify how to connect to a server. In this case, it’s saying connect to the IP address 127.0.0.1 using port 2222 for SSH, and connect as the user "vagrant" with the private key file at .vagrant/machines/default/virtualbox/private_key. (If you’re not familiar with SSH keys, Digital Ocean has a good tutorial.)
The next bit of text, [webservers], defines a group. Groups allow you to organize servers by their function or role so that you can easily apply different playbooks to them. For example, you may be running a server cluster with ten application servers and two database servers. You’ll want to set up the application servers differently from the database servers; for example, you’d install nginx on the app servers, and postgres on the db servers. In just a minute you’ll see how playbooks can target groups.
The next line under [webservers] is default. That tells Ansible that the server default belongs to the webservers group. Similarly, the next couple lines define a database group and specify that the server default belongs to it.
So, this inventory file defines one server, the Vagrant virtual machine. It specifies how to connect to it, and it defines two groups: webservers and database. It specifies that the server belongs to both groups.
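To make the cluster scenario from a moment ago a little more concrete, here is a rough sketch of an inventory where the two groups contain different servers. I’ve written it in Ansible’s YAML inventory format (saved with a .yml extension) rather than the format used by inventory-vagrant-server. The host aliases and IP addresses are invented for illustration, and ansible_host is the current name for the connection address variable.

```yaml
# Hypothetical inventory; aliases and addresses are made up for illustration.
all:
  children:
    webservers:
      hosts:
        app1:
          ansible_host: 192.0.2.11
        app2:
          ansible_host: 192.0.2.12
    database:
      hosts:
        db1:
          ansible_host: 192.0.2.21
```

A playbook that targets webservers would touch only app1 and app2, and one that targets database would touch only db1.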
Now let’s look at the playbook file, playbooks/01.yml:
---
- hosts: webservers
  become: true
  become_method: sudo
  tasks:
    - name: Create an empty file
      file: path=/etc/foo.conf state=touch mode=0644
    - name: Install nginx
      apt:
        pkg: nginx
        state: installed
        update-cache: yes
The file uses the YAML file format. If you’re not familiar with YAML, it basically lets you compactly define mappings (also known as key/value pairs or dictionaries) and sequences (arrays or lists).
The first line, hosts: webservers, is a key/value pair that specifies which server groups to run commands on. It means, apply this playbook to the hosts in the webservers group. The next two lines, become: true and become_method: sudo, tell Ansible to execute commands as root.
The next bit specifies how the server should get modified. First, there’s the key tasks. Whereas the values of the previous keys were scalars (for example, the value for the key hosts is webservers), the value for tasks is a sequence of mappings. Each mapping has a dash character preceding it, and each specifies a task that Ansible runs. The task specification gets translated into a command that’s run on all the servers specified by hosts.
The first mapping has the keys name and file. It’s good practice to give each task a name key, even though it’s not required. The task’s name does not affect the command that is run in any way; rather, it serves as documentation.
The line file: path=/etc/foo.conf state=touch mode=0644 needs more explaining. Every task must define one and only one module to use, and the module’s arguments. Modules are what get executed on servers, and it’s useful to think of them as just shell scripts. In this case, by including the file key, we’re telling Ansible to use the file module. We’re giving it the arguments path=/etc/foo.conf, state=touch, and mode=0644. When Ansible executes this task, it will use the file module to create a file at /etc/foo.conf and give it permissions 0644.
The next task ensures that nginx is installed. It uses the apt module (apt is Ubuntu’s package manager) with the arguments pkg: nginx, state: installed, update-cache: yes. Notice that the arguments are written as a YAML mapping rather than a single line of foo=bar pairs. These are just two different ways of encoding the same information. These arguments are telling the apt module to make sure the nginx package is installed, and to make sure the apt cache is up to date.
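To see the two argument styles side by side, here is a sketch of the file task from 01.yml written both ways. The task names are my own, and I’ve quoted the mode in the mapping form so that YAML treats it as a string rather than a number.

```yaml
---
- hosts: webservers
  become: true
  become_method: sudo
  tasks:
    # Inline key=value arguments, as in 01.yml
    - name: Create an empty file (inline arguments)
      file: path=/etc/foo.conf state=touch mode=0644

    # The same arguments written as a YAML mapping
    - name: Create an empty file (mapping arguments)
      file:
        path: /etc/foo.conf
        state: touch
        mode: "0644"
```

Both tasks ask the file module for the same thing; which form you pick is mostly a matter of readability.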
This readability is one of the reasons I like Ansible. It’s pretty easy to tell what’s going on. I do want to call attention to a couple things, though.
Earlier, I mentioned that Ansible reads declarative files. This means that you specify the end state of your server, not how to achieve that state. In this small example file we’ve told Ansible that we want an end state such that the file /etc/foo.conf exists and nginx is installed, but we haven’t told Ansible exactly what steps it should take to reach that state. We didn’t tell Ansible to run touch /etc/foo.conf and sudo apt-get install nginx.
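Here’s a small sketch of what "describing the end state" looks like with a couple of other states the file module supports. The path /etc/myapp is hypothetical, and the shell commands in the comments are only rough manual equivalents, not what Ansible literally runs.

```yaml
---
- hosts: webservers
  become: true
  become_method: sudo
  tasks:
    # End state: a directory exists at this (hypothetical) path.
    # Rough manual equivalent: mkdir -p /etc/myapp && chmod 755 /etc/myapp
    - name: Ensure the directory /etc/myapp exists
      file:
        path: /etc/myapp
        state: directory
        mode: "0755"

    # End state: no file exists at this path.
    # Rough manual equivalent: rm -f /etc/foo.conf
    - name: Ensure /etc/foo.conf does not exist
      file:
        path: /etc/foo.conf
        state: absent
```

If you apply a playbook like this twice, the second run should have nothing left to change, because the server is already in the state you described.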
You can check that Ansible did in fact create /etc/foo.conf by running vagrant ssh from the command line. This will log you in to the Vagrant server, where you can check the file’s existence with ls /etc/foo.conf. You can also check that nginx is installed by running nginx -v.
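Earlier I said that groups let you set up application servers differently from database servers. As a sketch of how that might look, a single playbook file can contain more than one entry in its top-level sequence, each with its own hosts key. The task names are my own, postgresql is the Ubuntu package name I’m assuming for postgres, and I’ve used state: present and update_cache, which are the spellings in the current Ansible documentation.

```yaml
---
# Runs only on hosts in the webservers group
- hosts: webservers
  become: true
  become_method: sudo
  tasks:
    - name: Install nginx on the application servers
      apt:
        pkg: nginx
        state: present
        update_cache: yes

# Runs only on hosts in the database group
- hosts: database
  become: true
  become_method: sudo
  tasks:
    - name: Install postgres on the database servers
      apt:
        pkg: postgresql
        state: present
        update_cache: yes
```

With inventory-vagrant-server, both entries would run against the same Vagrant machine, since default belongs to both groups.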
Ansible Variables
Ansible playbooks can be made more flexible and modular by using variables. Our next playbook, playbooks/02.yml, parameterizes our two tasks with the file_path and package variables:
---
- hosts: webservers
  become: true
  become_method: sudo
  vars:
    file_path: /etc/bar.conf
    package: wget
  tasks:
    - name: Create an empty file
      file: path={{ file_path }} state=touch mode=0644
    - name: Install a package
      apt:
        pkg: "{{ package }}"
        state: installed
        update-cache: yes
When you apply this playbook with ansible-playbook -i inventory-vagrant-server playbooks/02.yml, Ansible will create the file /etc/bar.conf and install the wget package. The double braces, as in {{ file_path }}, are how you interpolate variable values into your YAML files.
Note
Ansible Templating
Ansible actually relies on the Jinja2 template engine to perform interpolation, and as a result you can do all kinds of crazy pseudo-programming tricks within those two braces. For example, you can filter a value, say by providing a default for it. I’m not going to cover everything that’s possible with Jinja2, but I wanted to let you know about it in case you wanted to do something more sophisticated than variable substitution.
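As a quick sketch of the default filter mentioned in the note, here is a variation on the package task that falls back to wget when no package variable has been set. This playbook isn’t in the tutorial repo, and it uses state: present and update_cache, the spellings in the current Ansible documentation.

```yaml
---
- hosts: webservers
  become: true
  become_method: sudo
  tasks:
    - name: Install a package, falling back to wget if none is given
      apt:
        # default('wget') is used only when the package variable is undefined
        pkg: "{{ package | default('wget') }}"
        state: present
        update_cache: yes
```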
You can also specify variable values on the command line:
ansible-playbook -i inventory-vagrant-server playbooks/02.yml --extra-vars "file_path=/etc/baz.conf"
This will create the file /etc/baz.conf.
When we go through the Sweet Tooth Ansible tasks, you’ll find that nearly every task relies on variables. Variables make the code more flexible and reusable, just like in real programs.
Inventory Directories
Most applications' server configurations include at least a staging environment where you can do testing and sanity checking, and a production environment that real users use. You’ll likely want to set some of your playbooks' variables to different values for each environment.
For example, with Community Picks I have a staging environment and a production environment. The staging environment logs files to /var/log/staging-community-picks and the production environment logs files to /var/log/www-community-picks. The log file location is stored in an Ansible variable.
One way to set the different variable values for different environments is by both a) creating an inventory for each environment, and b) defining the inventory using a different structure than the one we covered.
In the Ansible Basics section you learned that you can define an inventory with a single file, like the file inventory-vagrant-server. You can also define an inventory using a set of files in a directory, and you can see this in the inventories directory, which contains two sub-directories, staging and prod. If you look in staging, you’ll see a file named hosts and a directory named group_vars. In the group_vars directory you’ll see a file named webservers. The prod directory has the exact same structure.
The files in the group_vars directory are what allow you to set Ansible variables. In the file inventories/staging/group_vars/webservers there’s a single line:
file_path: /var/log/staging
This means that when you apply a playbook to the inventory defined by inventories/staging, the variable file_path will be set to /var/log/staging for all the tasks that Ansible runs on the servers in the webservers group.
For example, if you run ansible-playbook -i inventories/staging playbooks/03.yml, you’re saying apply the playbook playbooks/03.yml to the inventory inventories/staging. In inventories/staging/hosts, you’ll see that the Vagrant server belongs to the webservers group. Therefore, when Ansible runs the tasks in playbooks/03.yml on that server, it will set the file_path variable to /var/log/staging, and the result will be that it will create the file /var/log/staging. The inventory inventories/prod is identical except that file_path gets set to /var/log/prod.
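Here’s a minimal sketch of a playbook that relies entirely on the inventory for file_path. The file name in the comment is hypothetical and the playbook isn’t in the tutorial repo. I’m assuming it sets no file_path of its own, because a value in a playbook’s vars section takes precedence over one from group_vars.

```yaml
# Hypothetical playbook, e.g. playbooks/create-log-file.yml.
# It defines no file_path itself, so the value comes from
# group_vars/webservers in whichever inventory you pass with -i.
---
- hosts: webservers
  become: true
  become_method: sudo
  tasks:
    - name: Create an empty file at the environment-specific path
      file:
        path: "{{ file_path }}"
        state: touch
        mode: "0644"
```

Applied with -i inventories/staging this would create /var/log/staging, and with -i inventories/prod it would create /var/log/prod.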
Ansible Roles
If you stopped reading the tutorial now and proceeded to develop your own playbooks using only tasks and variables, you could actually accomplish quite a bit. In fact, when I first started using Ansible, I would just copy a playbook from one project to another and make a bunch of tweaks. This process yielded some of the advantages of treating infrastructure as code (for example, I could easily test out a server configuration on a VM and have confidence that a remote host would get set up in exactly the same way), but it was still too cumbersome and error-prone. It was like copying and pasting some functions from one project to another over and over and slightly modifying them, instead of taking the time to properly refactor them into a library.
Roles are like libraries. Just like Clojure jars and Ruby gems, they’re a way of organizing code and packaging it for reuse. They package together tasks and default values for variables, and they can also package files and templates. The Sweet Tooth roles, for example, include tasks for installing nginx along with an nginx config file template.
If you want to create an Ansible role, you have to organize your files using a predefined structure. By default, Ansible looks for role definitions in the ./roles directory relative to the playbook you’re applying. Our next tutorial playbook, 03.yml, includes two roles, create-file and install-package:
---
- hosts: webservers
  become: true
  become_method: sudo
  vars:
    file_path: /etc/qux.conf
  roles:
    - create-file
    - install-package
There’s a roles directory in the playbooks directory, and the roles directory contains a directory named create-file and a directory named install-package; defining a role starts with creating a directory bearing the role’s name within the roles directory.
The role’s directory can include the directories tasks, defaults, files, templates, and a few others that aren’t relevant for us. You define a role’s tasks in tasks/main.yml and the default values for a role’s variables in defaults/main.yml. (You’ll see examples of files and templates getting used when we go through the Sweet Tooth code.) Our create-file role defines tasks under playbooks/roles/create-file/tasks/main.yml. Open it and you’ll see a familiar sight:
---
- name: Create an empty file
  file: path={{ file_path }} state=touch mode=0644
This is the task from 02.yml, copied and pasted and unindented one level. It still contains the variable file_path, which we set in 03.yml.
The install-package role similarly lifts the task from 02.yml and places it in playbooks/roles/install-package/tasks/main.yml:
- name: Install a package
  apt:
    pkg: "{{ package }}"
    state: installed
    update-cache: yes
However, 03.yml does not define a value for the package variable. Instead, it’s defined in playbooks/roles/install-package/defaults/main.yml. You can override this default value in 03.yml.
When you list roles in a playbook, as we’ve done in 03.yml, the roles' tasks are executed in the order that the roles are included, so in this case if we apply the 03.yml playbook Ansible will create the file /etc/qux.conf, and after that it will install git.
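To illustrate overriding a role default, here is a sketch of a variant of 03.yml. I’m assuming the role’s defaults/main.yml contains the single line package: git (which would explain why git gets installed), and curl is just an arbitrary replacement I picked for the example.

```yaml
---
# Variant of 03.yml; assumes the install-package role's defaults/main.yml
# contains `package: git`.
- hosts: webservers
  become: true
  become_method: sudo
  vars:
    file_path: /etc/qux.conf
    package: curl   # takes precedence over the role's default value
  roles:
    - create-file
    - install-package
```

Role defaults have the lowest precedence of any variable source, which is what makes them a good place for values you expect users of the role to change.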
Relating It Back to Clojure Deployment
Now that you’ve learned some Ansible basics, let’s return to our character-sheet-example directory and look at how the scripts are implemented.
When you cd to character-sheet-example/infrastructure and run ./deploy dev or ./provision prod, those shell scripts are using Ansible to apply a playbook to an inventory, and those playbooks include a bunch of roles. Here are the contents of the deploy script:
#!/bin/bash
cp ../target/build/app.jar ansible/files/app.jar
cd ansible
ansible-playbook -i inventories/$1 sweet-tooth.yml --skip-tags=install
The last line calls ansible-playbook, just as you’ve been doing throughout the chapter. $1 is the bash variable for the first command line argument; when you call ./deploy dev the script translates that to ansible-playbook -i inventories/dev sweet-tooth.yml --skip-tags=install. We haven’t covered tags yet, so ignore --skip-tags=install for now.
So, when you call ./deploy dev it applies the playbook sweet-tooth.yml to the inventory under inventories/dev. If you look at inventories/dev, you’ll see that the inventory is defined with a directory structure rather than a single file. The hosts file points to the project’s Vagrant server, and group_vars/webservers defines a couple variables that I’ll cover in the next chapter.
The playbook sweet-tooth.yml looks like this:
---
- hosts: webservers
  become: true
  become_method: sudo
  roles:
    - "sweet-tooth-clojure.clojure-uberjar-webapp-common"
    - "sweet-tooth-clojure.clojure-uberjar-webapp-nginx"
    - "sweet-tooth-clojure.clojure-uberjar-webapp-datomic-free"
    - "sweet-tooth-clojure.clojure-uberjar-webapp-app"
All it’s doing is listing what roles to apply; roles are very powerful and useful! In the next chapter, we’ll go through each of these roles so that you can learn exactly what they do.
Recap
This chapter introduced you to DevOps and the benefits of treating infrastructure as code. You learned some basics of how to use an IaC tool, Ansible. You learned that Ansible applies a playbook, which is a set of tasks, to an inventory, which is an organized collection of servers. You saw how Ansible uses variables to make playbooks more modular, and you got a quick look at how to use roles to organize and package your Ansible tasks and variables.
Note
Tales of Chaos and Madness …From the Shadows
Once, when I was eighteen, my friends and I dined and dashed (left without paying for our food). I drove the getaway vehicle. The thrill of our transgression filled me with such a powerful euphoria that it propelled me all the way out of the parking lot before I turned back around and went inside and paid. So dark!