Scale HPC Workloads with Elastic Fabric Adapter and AWS ParallelCluster

In April 2019, AWS announced the general availability of Elastic Fabric Adapter (EFA), an EC2 network device that improves the throughput and scalability of distributed High Performance Computing (HPC) and Machine Learning (ML) workloads. Today, we’re excited to announce support for EFA through AWS ParallelCluster.

EFA is a network interface for Amazon EC2 instances that enables you to run HPC applications requiring high levels of inter-instance communications (such as computational fluid dynamics, weather modeling, and reservoir simulation) at scale on AWS. It uses an industry-standard operating system bypass technique, with a new custom Scalable Reliable Datagram (SRD) Protocol to enhance the performance of inter-instance communications, which is critical to scaling HPC applications. For more on EFA and supported instance types, see Elastic Fabric Adapter (EFA) for Tightly-Coupled HPC Workloads.

AWS ParallelCluster takes care of the undifferentiated heavy lifting involved in setting up an HPC cluster with EFA enabled. When you set the enable_efa = compute flag in your cluster section, AWS ParallelCluster adds EFA to all network-enhanced instances. Under the covers, AWS ParallelCluster performs the following steps:

  1. Sets InterfaceType = efa in the Launch Template.
  2. Ensures that the security group has rules to allow all inbound and outbound traffic to itself. Unlike traditional TCP traffic, EFA requires an inbound rule and an outbound rule that explicitly allow all traffic to its own security group ID sg-xxxxx. See Prepare an EFA-enabled Security Group for more information.
  3. Installs the EFA kernel module, an AWS-specific version of the Libfabric Network Stack, and Open MPI 3.1.4.
  4. Validates the instance type, base OS, and placement group.
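AWS ParallelCluster handles step 2 for you, but it can help to see what a self-referencing rule looks like. The following is a minimal sketch, not what ParallelCluster itself runs: the function name efa_sg_commands is made up for illustration, and it only prints the AWS CLI calls so that you can review them before running anything against your account.

```shell
# Print (do not run) the AWS CLI calls that add self-referencing rules to a
# security group. efa_sg_commands is a hypothetical helper for this sketch.
efa_sg_commands() {
    sg="$1"
    # IpProtocol=-1 means "all protocols"; the group pair points the rule
    # back at the same security group.
    perm="IpProtocol=-1,UserIdGroupPairs=[{GroupId=$sg}]"
    for direction in ingress egress; do
        echo "aws ec2 authorize-security-group-$direction --group-id $sg --ip-permissions '$perm'"
    done
}
```

Calling efa_sg_commands sg-xxxxx prints one ingress and one egress command, each referencing the group's own ID as the peer.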

To get started, you’ll need to have AWS ParallelCluster set up; see Getting Started with AWS ParallelCluster. For this tutorial, we’ll assume that you have AWS ParallelCluster installed and are familiar with the ~/.parallelcluster/config file.

Modify your ~/.parallelcluster/config file to include a cluster section that minimally includes the following:

[global]
cluster_template = efa
update_check = true
sanity_check = true

[aws]
aws_region_name = [your_aws_region]

[cluster efa]
key_name =               [your_keypair]
vpc_settings =           public
base_os =                alinux
master_instance_type =   c5n.xlarge
compute_instance_type =  c5n.18xlarge
placement_group =        DYNAMIC
enable_efa =             compute

[vpc public]
vpc_id = [your_vpc]
master_subnet_id = [your_subnet]

  • base_os – EFA is currently supported on Amazon Linux (alinux), CentOS 7 (centos7), and Ubuntu 16.04 (ubuntu1604).
  • master_instance_type – This can be any instance type (it is outside of the placement group formed for the compute nodes and does not have EFA enabled). We chose c5n.xlarge because it is cheaper than the c5n.18xlarge yet still offers good network performance.
  • compute_instance_type – EFA is enabled only on the compute nodes; this is where your code runs when submitted as a job through one of the schedulers. These instances need to be one of the supported instance types, which at the time of writing include c5n.18xlarge, i3en.24xlarge, and p3dn.24xlarge. See the docs for currently supported instances.
  • placement_group – Places your compute nodes physically adjacent, which enables you to benefit fully from EFA’s low network latency and high throughput.
  • enable_efa – This is the only new parameter we’ve added to turn on EFA support for the compute nodes. At this time, the only option is compute. This is designed to draw your attention to the fact that EFA is only enabled on the compute nodes.
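Before creating the cluster, you may want to check the settings described above yourself. The following is a rough sketch only: get_value and check_efa_config are hypothetical helper names, the check is not part of AWS ParallelCluster, and the allowed values are simply the ones listed in this post at the time of writing. It also ignores section headers and reads the first matching key in the file, which is enough for a single-cluster config like the one shown.

```shell
# Hypothetical pre-flight check; not part of AWS ParallelCluster.

# Print the value of key $1 from the INI-style file $2 (first match wins).
get_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" \
        | head -n 1 | sed 's/[[:space:]]*$//'
}

# Return 0 if the EFA-related settings match the values listed in this post.
check_efa_config() {
    file="$1"
    status=0
    os=$(get_value base_os "$file")
    type=$(get_value compute_instance_type "$file")
    pg=$(get_value placement_group "$file")
    efa=$(get_value enable_efa "$file")

    case "$os" in
        alinux|centos7|ubuntu1604) ;;
        *) echo "unsupported base_os: '$os'"; status=1 ;;
    esac
    case "$type" in
        c5n.18xlarge|i3en.24xlarge|p3dn.24xlarge) ;;
        *) echo "unsupported compute_instance_type: '$type'"; status=1 ;;
    esac
    [ -n "$pg" ] || { echo "placement_group is not set"; status=1; }
    [ "$efa" = "compute" ] || { echo "enable_efa must be 'compute'"; status=1; }
    return "$status"
}
```

For example, running check_efa_config ~/.parallelcluster/config prints one line per problem it finds and returns a non-zero status if any setting falls outside the values listed above.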

Now you can create the cluster:

$ pcluster create efa
MasterServer: RUNNING
ClusterUser: ec2-user

Once cluster creation is complete, you can SSH into the cluster:

$ pcluster ssh efa -i ~/path/to/ssh_key

You can now see that there’s a module, openmpi/3.1.4, available. When this is loaded, you can confirm that mpirun is correctly set on the PATH to be the EFA-enabled version in /opt/amazon/efa:

[ec2-user@ip-172-31-17-220 ~]$ module avail

----------------------------------------------- /usr/share/Modules/modulefiles ------------------------------------------------
dot           module-git    module-info   modules       null          openmpi/3.1.4 use.own
[ec2-user@ip-172-31-17-220 ~]$ module load openmpi/3.1.4
[ec2-user@ip-172-31-17-220 ~]$ which mpirun

This version of Open MPI is compiled with support for libfabric, a library that allows us to communicate over the EFA device through standard MPI commands. At the time of writing, Open MPI is the only MPI library that supports EFA; Intel MPI support is expected to be released shortly.
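If you want to script the PATH check rather than eyeball it, something like the following would do. This is a small sketch: is_efa_mpirun is a made-up helper name, and the /opt/amazon/efa prefix is taken from the text above.

```shell
# Succeed if the given mpirun path lives under the EFA install prefix.
# is_efa_mpirun is a hypothetical helper; /opt/amazon/efa comes from this post.
is_efa_mpirun() {
    case "$1" in
        /opt/amazon/efa/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After loading the module, you could call is_efa_mpirun "$(command -v mpirun)" at the top of a job script and print a warning when it fails, so that a job does not silently fall back to a non-EFA MPI build.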

Now you’re ready to submit a job. First create a file submit.sge containing the following:

#!/bin/bash

#$ -pe mpi 2

module load openmpi
mpirun -N 1 -np 2 [command here]

CFD++ Example

EFA speeds up common workloads, such as Computational Fluid Dynamics. In the following example, we ran CFD++ on a 24M cell case using EFA-enabled c5n.18xlarge instances. CFD++ is a flow solver developed by Metacomp Technologies. The model is an example of a Mach 3 external flow calculation (it’s a Klingon bird of prey):

[Figure: Example of a Mach 3 external flow calculation.]

The two scaling curves below compare runs with EFA (blue) and without EFA (purple). EFA scales significantly further and is many times more performant at higher core counts.

[Figure: Scaling curves, with and without EFA.]

New Docs!

Last, but definitely not least, we are also excited to announce new docs for AWS ParallelCluster. These are available in ten languages and improve on the readthedocs version in many ways. Take a look! Of course, you can still submit doc updates by creating a pull request on the AWS Docs GitHub repo.

AWS ParallelCluster is a community-driven project. We encourage submitting a pull request or providing feedback through GitHub issues. User feedback drives our development and pushes us to excel in every way!