Installing HashiCorp Nomad on Ubuntu Server 20.04
Introduction
Nomad is a simple and flexible workload orchestrator for deploying and managing containers and non-containerized applications across on-premises and cloud environments at scale.
- Simple and Lightweight: Single binary that integrates into existing infrastructure. Easy to operate on-prem or in the cloud with minimal overhead.
- Flexible Workload Support: Orchestrate applications of any type - not just containers. First class support for Docker, Windows, Java, VMs, and more.
- Modernize Legacy Applications without Rewrite: Bring orchestration benefits to existing services. Achieve zero downtime deployments, improved resilience, higher resource utilization, and more without containerization.
- Easy Federation at Scale: Single command for multi-region, multi-cloud federation. Deploy applications globally to any region using Nomad as a single unified control plane.
- Multi-Cloud with Ease: One single unified workflow for deploying to bare metal or cloud environments. Enable multi-cloud applications with ease.
- Native Integrations with Terraform, Consul, and Vault: Nomad integrates seamlessly with Terraform, Consul and Vault for provisioning, service networking, and secrets management.
Installing Nomad as an APT Package
Add the HashiCorp GPG key.
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
Add the official HashiCorp Linux repository.
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
Update and install.
sudo apt-get update && sudo apt-get install nomad
To verify Nomad was installed correctly, try the nomad command.
nomad
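Running nomad with no arguments only prints the usage text. A slightly stricter check, sketched below, first confirms that the binary is on the PATH and then prints its version with nomad version. The check_binary helper is a hypothetical name introduced for this example and is not part of Nomad.

```shell
# check_binary is a hypothetical helper: it succeeds if the named
# program can be found on the PATH and reports where it lives.
check_binary() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at $(command -v "$1")"
  else
    echo "$1 not found in PATH" >&2
    return 1
  fi
}

# Print the installed version only when the binary is actually present.
if check_binary nomad; then
  nomad version
fi
```

If the check fails right after installation, opening a new shell or re-running the install step is usually the first thing to try.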
Ports Used
Nomad requires 3 different ports to work properly on servers and 2 on clients, some on TCP, UDP, or both protocols:
- HTTP API (Default 4646): This is used by clients and servers to serve the HTTP API. TCP only.
- RPC (Default 4647): This is used for internal RPC communication between client agents and servers, and for inter-server traffic. TCP only.
- Serf WAN (Default 4648): This is used by servers to gossip both over the LAN and WAN to other servers. It isn't required that Nomad clients can reach this address. TCP and UDP.
FirewallD
firewall-cmd --permanent --zone=public --add-port=4646-4647/tcp
firewall-cmd --permanent --zone=public --add-port=4648/tcp
firewall-cmd --permanent --zone=public --add-port=4648/udp
firewall-cmd --reload
firewall-cmd --zone=public --list-all
ufw
ufw allow 4646:4648/tcp
ufw allow 4648/udp
ufw reload
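Once an agent is running, it can be useful to confirm that it is actually listening on the TCP ports opened above. The sketch below assumes the ss utility from iproute2 is installed and that the fourth column of `ss -Hltn` output holds the local address and port. port_listening is a hypothetical helper name, and the check covers only TCP listeners, not the UDP side of port 4648.

```shell
# port_listening is a hypothetical helper: it reads `ss -Hltn` style
# output on stdin and succeeds if a socket listens on the given TCP port.
# Assumption: column 4 has the form "local-address:port".
port_listening() {
  awk -v suffix=":$1" '
    length($4) >= length(suffix) &&
    substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
    END { exit !found }
  '
}

# Check the three default Nomad ports, but only if ss is installed.
if command -v ss >/dev/null 2>&1; then
  for port in 4646 4647 4648; do
    if ss -Hltn | port_listening "$port"; then
      echo "TCP $port: listening"
    else
      echo "TCP $port: not listening"
    fi
  done
fi
```

A listening socket only shows that the agent is bound locally; reachability from other machines still depends on the firewall rules above.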
Nomad Agent
Nomad relies on a long running agent on every machine in the cluster. The agent can run either in server or client mode. The cluster servers are responsible for managing the cluster. All other agents in the cluster should be in client mode. A Nomad client is a very lightweight process that registers the host machine, performs heartbeating, and runs the tasks that are assigned to it by the servers. The agent must be run on every node that is part of the cluster so that the servers can assign work to those machines.
In this guide, you will start the Nomad agent in development mode. This mode is used to quickly start an agent that is acting as a client and server to test job configurations or prototype interactions. Start a single Nomad agent in development mode with the nomad agent command. Note, this command should not be used in production as it does not persist state.
nomad agent -dev
Before continuing to the next section, wait until the agent has acquired leadership:
2020-08-28T09:23:18.317Z [WARN] nomad.raft: heartbeat timeout reached, starting election: last-leader=
2020-08-28T09:23:18.317Z [INFO] nomad.raft: entering candidate state: node="Node at 127.0.0.1:4647 [Candidate]" term=2
2020-08-28T09:23:18.317Z [DEBUG] nomad.raft: votes: needed=1
2020-08-28T09:23:18.317Z [DEBUG] nomad.raft: vote granted: from=127.0.0.1:4647 term=2 tally=1
2020-08-28T09:23:18.317Z [INFO] nomad.raft: election won: tally=1
2020-08-28T09:23:18.317Z [INFO] nomad.raft: entering leader state: leader="Node at 127.0.0.1:4647 [Leader]"
2020-08-28T09:23:18.317Z [INFO] nomad: cluster leadership acquired
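Instead of watching the log, a script can poll the agent's HTTP API until a leader is reported. This is a minimal sketch, assuming curl is installed, the API is reachable at the default address http://127.0.0.1:4646, and the /v1/status/leader endpoint answers with the leader's address as a quoted string such as "127.0.0.1:4647". The helper names has_leader and wait_for_leader are hypothetical.

```shell
# Default API address of a local agent; override NOMAD_ADDR if yours differs.
NOMAD_ADDR="${NOMAD_ADDR:-http://127.0.0.1:4646}"

# has_leader (hypothetical): succeeds when the response looks like
# a quoted "host:port" string, and fails for "" or an empty reply.
has_leader() {
  case "$1" in
    '"'*:[0-9]*'"') return 0 ;;
    *) return 1 ;;
  esac
}

# wait_for_leader (hypothetical): polls once per second, up to N attempts.
wait_for_leader() {
  attempts="${1:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    response="$(curl -fsS "$NOMAD_ADDR/v1/status/leader" 2>/dev/null || true)"
    if has_leader "$response"; then
      echo "leader elected: $response"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "no leader after $attempts attempts" >&2
  return 1
}
```

With the development agent running in another terminal, calling wait_for_leader 30 would block for at most about 30 seconds before giving up.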
In another terminal, use nomad node status to view the registered nodes of the Nomad cluster.
nomad node status
ID        DC   Name         Class   Drain  Eligibility  Status
425b29b5  dc1  salt-master  <none>  false  eligible     ready
The output shows your Node ID, its datacenter, node name, node class, drain mode and current status. The Node ID is a randomly generated UUID. Notice that your node is in the ready state and task draining is currently off.
The agent is also in server mode, which means it is part of the gossip protocol used to connect all the server instances together:
nomad server members
Name                Address    Port  Status  Leader  Protocol  Build   Datacenter  Region
salt-master.global  127.0.0.1  4648  alive   true    2         0.12.3  dc1         global
The output shows your agent, the address it is running on, its health state, some version information, and the datacenter and region. Additional metadata can be viewed by providing the -detailed flag:
nomad server members -detailed
Name                Address    Port  Tags
salt-master.global  127.0.0.1  4648  build=0.12.3,bootstrap=1,role=nomad,vsn=1,rpc_addr=127.0.0.1,mvn=1,raft_vsn=2,region=global,expect=1,id=99ee6e58-43bc-6f5c-e1f7-8e6a3194f433,dc=dc1,port=4647
Jobs
Jobs are the primary configuration that users interact with when using Nomad. A job is a declarative specification of tasks that Nomad should run. The job created by running nomad job init uses the Docker task driver. To run it, you will need a Nomad client available with Docker installed.
To get started, use the job init command which generates a skeleton job file:
mkdir ~/nomad
cd ~/nomad
nomad job init
Example job file written to example.nomad
The file is created inside your current working directory. This example job file declares a single task named redis, which uses the Docker driver to run a Redis container:
job "example" {
  ...
  task "redis" {
    driver = "docker"
    config {
      image = "redis:3.2"
      port_map {
        db = 6379
      }
    }
  }
  ...
}
The primary way you interact with Nomad is with the job run command. The run command takes a job file and registers it with Nomad. This is used both to register new jobs and to update existing jobs:
nomad job run example.nomad
==> Monitoring evaluation "9271b1e8"
Evaluation triggered by job "example"
Allocation "9a93c7ff" created: node "425b29b5", group "cache"
Evaluation within deployment: "b5ddfc12"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "9271b1e8" finished with status "complete"
If you receive the following error message, Docker is either not installed or the Docker daemon is not running:
Task Group "cache" (failed to place 1 allocation):
* Constraint "missing drivers": 1 nodes excluded by filter
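To tell the two causes apart, the Docker CLI and daemon can be checked directly on the client node. The following is a rough sketch that relies only on the standard docker info command; docker_ready is a hypothetical helper name, and its messages are made up for this example.

```shell
# docker_ready (hypothetical): distinguishes "CLI missing" from
# "daemon not reachable" and prints a one-line diagnosis.
docker_ready() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker CLI not installed" >&2
    return 1
  fi
  # docker info exits non-zero when it cannot talk to the daemon.
  if ! docker info >/dev/null 2>&1; then
    echo "docker daemon not reachable" >&2
    return 1
  fi
  echo "docker is ready"
}

docker_ready || echo "fix Docker first, then run the job again"
```

A "daemon not reachable" result can also mean the current user lacks permission to access the Docker socket, so the check is best run as the same user that runs the Nomad agent.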
To inspect the status of your job, use the status command:
nomad status example
ID = example
Name = example
Submit Date = 2020-08-28T11:38:59Z
Type = service
Priority = 50
Datacenters = dc1
Namespace = default
Status = running
Periodic = false
Parameterized = false
Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         1        0       0         0
Latest Deployment
ID          = b5ddfc12
Status      = running
Description = Deployment is running
Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
cache       1        1       0        1          2020-08-28T11:48:59Z
Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
9a93c7ff  425b29b5  cache       0        run      running  3m14s ago  14s ago
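The Allocations section at the end of the output lists the allocation, which represents the instance of the task group that is now placed on your node. Pass its ID to the alloc status command to inspect it: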
nomad alloc status 9a93c7ff
ID = 9a93c7ff-1e61-1277-63aa-95ad42d089f7
Eval ID = 9271b1e8
Name = example.cache[0]
Node ID = 425b29b5
Node Name = salt-master
Job ID = example
Job Version = 0
Client Status = running
Client Description = Tasks are running
Desired Status = run
Desired Description = <none>
Created = 6m21s ago
Modified = 3m21s ago
Deployment ID = b5ddfc12
Deployment Health = unhealthy
Task "redis" is "running"
Task Resources
CPU        Memory           Disk     Addresses
2/500 MHz  988 KiB/256 MiB  300 MiB  db: 127.0.0.1:27335
Task Events:
Started At = 2020-08-28T11:40:01Z
Finished At = N/A
Total Restarts = 0
Last Restart = N/A
Recent Events:
Time                  Type             Description
2020-08-28T11:41:59Z  Alloc Unhealthy  Task not running for min_healthy_time of 10s by deadline
2020-08-28T11:40:01Z  Started          Task started by client
2020-08-28T11:38:59Z  Driver           Downloading image
2020-08-28T11:38:59Z  Task Setup       Building Task Directory
2020-08-28T11:38:59Z  Received         Task received by client
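The allocation ID used above was copied by hand from the job status output. In a script, it can be pulled out of that output instead. The sketch below assumes the layout shown earlier: a line reading Allocations, followed by a header row starting with ID, followed by one row per allocation. alloc_ids is a hypothetical helper name.

```shell
# alloc_ids (hypothetical): reads `nomad job status <job>` output on stdin
# and prints the first column of every row in the Allocations table.
alloc_ids() {
  awk '
    /^Allocations$/                { in_allocs = 1; next }  # table starts
    in_allocs && /^ID[[:space:]]/  { next }                 # skip header row
    in_allocs && NF == 0           { in_allocs = 0 }        # blank line ends it
    in_allocs                      { print $1 }
  '
}

# Inspect every allocation of the example job, if nomad is installed.
if command -v nomad >/dev/null 2>&1; then
  for id in $(nomad job status example | alloc_ids); do
    nomad alloc status "$id"
  done
fi
```

Parsing human-readable output is fragile across versions, so this is better suited to quick experiments than to long-lived automation.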
To see the logs of a task, use the alloc logs command:
nomad alloc logs 9a93c7ff redis
1:C 28 Aug 11:40:01.407 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 3.2.12 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 1
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'
After modifying the job specification, use the job plan command to invoke a dry-run of the scheduler to see what would happen if you ran the updated job:
nomad job plan example.nomad
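The plan command also reports its result through its exit code, which is convenient in scripts. According to the command's documentation, 0 means no allocations would be created or destroyed, 1 means allocations would be created or destroyed, and 255 means the plan could not be determined. The sketch below turns that code into a message; describe_plan_exit is a hypothetical helper name.

```shell
# describe_plan_exit (hypothetical): maps the exit code of
# `nomad job plan` to a short human-readable description.
describe_plan_exit() {
  case "$1" in
    0)   echo "no allocations created or destroyed" ;;
    1)   echo "allocations would be created or destroyed" ;;
    255) echo "error determining plan results" ;;
    *)   echo "unexpected exit code: $1" ;;
  esac
}

# Run the plan only when nomad and the job file are both available.
if command -v nomad >/dev/null 2>&1 && [ -f example.nomad ]; then
  status=0
  nomad job plan example.nomad || status=$?
  describe_plan_exit "$status"
fi
```

Note that an exit code of 1 is not a failure here, so a script that runs with set -e needs the `|| status=$?` guard shown above.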
The final step in a job lifecycle is stopping the job. This is done with the job stop command:
nomad job stop example
nomad job status example
ID = example
Name = example
Submit Date = 2020-08-28T11:38:59Z
Type = service
Priority = 50
Datacenters = dc1
Namespace = default
Status = dead (stopped)
Periodic = false
Parameterized = false
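To confirm from a script that the job has really stopped, the Status field can be extracted from this output. This is a small sketch that assumes the "Key = Value" layout shown above; job_status_value is a hypothetical helper name.

```shell
# job_status_value (hypothetical): prints the value of the first
# "Status = ..." line on stdin, which is the job's own status.
job_status_value() {
  awk -F' *= *' '$1 == "Status" { print $2; exit }'
}

# Report the current state of the example job, if nomad is installed.
if command -v nomad >/dev/null 2>&1; then
  nomad job status example | job_status_value
fi
```

For the stopped job above, the helper would print dead (stopped). Stopping after the first match matters because the output of a running job contains a second Status line in its Latest Deployment section.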