Workflow: Manual

Overview

This workflow describes configuring a simple HPC environment, consisting of:

  • Shared NFS directories for users, data and applications

  • SLURM queuing system for workload processing and management

  • Flight Env for managing the configuration and applications available in the environment

Prerequisites

This document assumes the following:

  • The cluster has a gateway node (for running various servers)

  • The cluster has multiple compute nodes (for executing jobs)

  • DNS is correctly configured to allow hostname connections between the nodes

  • The firewall allows connections between the gateway and compute nodes so that services (e.g. the queuing system and NFS) can communicate

  • SSH keys are correctly configured to allow the gateway to log in to the nodes (as root)

  • There is sufficient storage space on the gateway and compute nodes for applications and data (16GB or more is recommended)
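
The prerequisites above can be spot-checked from the gateway before starting. The following is a minimal Python sketch and is not part of the Flight tooling. The node names (`node01`, `node02`), the checked path and the use of the `ssh` client in batch mode are assumptions made for illustration.

```python
import shutil
import socket
import subprocess

# Hypothetical compute node names; replace with the real hostnames.
NODES = ["node01", "node02"]
MIN_FREE_GB = 16  # recommended minimum from the prerequisites list


def resolves(hostname):
    """Return True if the hostname resolves to an address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:
        return False


def free_gb(path="/"):
    """Return free space on the filesystem holding `path`, in GB (10^9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def root_ssh_ok(hostname, timeout=10):
    """Return True if key-based root SSH to `hostname` works without prompting."""
    cmd = ["ssh", "-o", "BatchMode=yes", "-o", f"ConnectTimeout={timeout}",
           f"root@{hostname}", "true"]
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout + 5)
        return result.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False


def check_cluster(nodes=NODES):
    """Collect gateway free space plus a DNS/SSH result for each node."""
    free = free_gb("/")
    report = {"gateway_free_gb": free, "gateway_space_ok": free >= MIN_FREE_GB}
    for node in nodes:
        dns = resolves(node)
        # Only attempt SSH when the name resolves, to avoid long timeouts.
        report[node] = {"dns": dns, "ssh": root_ssh_ok(node) if dns else False}
    return report
```

Calling `check_cluster()` on the gateway returns a dictionary that can be printed or inspected. It does not test firewall rules for individual services, so a passing result only covers name resolution, root SSH access and gateway disk space.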

NFS or Other Shared Storage

The shared storage solution does not have to be NFS. However, some form of shared storage is highly recommended to ensure that all nodes in the research environment can access applications and data.

Information on NFS configuration is available in the official NFS documentation.
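
As an illustration of what the gateway's export configuration might contain, the sketch below renders `/etc/exports`-style lines for the three kinds of shared directory listed in the overview (users, data and applications). The directory paths, node names and export options are assumptions chosen for the example, not values prescribed by this workflow; check them against the NFS documentation before use.

```python
# Hypothetical shared directories (users, data, applications) and clients.
SHARED_DIRS = ["/export/users", "/export/data", "/export/apps"]
CLIENTS = ["node01", "node02"]
OPTIONS = "rw,sync"  # assumed options; adjust to local policy


def render_exports(dirs=SHARED_DIRS, clients=CLIENTS, options=OPTIONS):
    """Return exports-style text: one line per directory, one entry per client."""
    lines = []
    for directory in dirs:
        entries = " ".join(f"{client}({options})" for client in clients)
        lines.append(f"{directory} {entries}")
    return "\n".join(lines) + "\n"


print(render_exports(), end="")
```

The generated text would typically be reviewed and then placed in `/etc/exports` on the gateway, after which `exportfs -ra` re-reads the export table.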

SLURM

Information on installing SLURM is available in the official SLURM documentation.
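
To give a sense of how the gateway and compute nodes map onto SLURM's configuration, the sketch below renders a small fragment of a `slurm.conf` file. The cluster name, hostnames, CPU count and partition name are hypothetical, and a working `slurm.conf` needs further settings that are covered in the SLURM documentation.

```python
def render_slurm_fragment(cluster, controller, nodes, cpus, partition="all"):
    """Return a slurm.conf fragment naming the controller, nodes and one partition."""
    node_list = ",".join(nodes)
    lines = [
        f"ClusterName={cluster}",
        f"SlurmctldHost={controller}",  # the gateway runs the controller daemon
        f"NodeName={node_list} CPUs={cpus} State=UNKNOWN",
        f"PartitionName={partition} Nodes={node_list} Default=YES "
        "MaxTime=INFINITE State=UP",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only.
print(render_slurm_fragment("mycluster", "gateway1", ["node01", "node02"], 4), end="")
```

Generating the fragment from a single node list keeps the `NodeName` and `PartitionName` entries consistent with each other, which is the main reason for scripting it rather than editing by hand.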

Flight User Suite

Information on manually installing the Flight User Suite is available in the installation documentation.