
Using R on HPC Clusters Webinar

This webinar tutorial teaches a basic workflow for using R on an HPC cluster. It focuses on parallel computing as a way to speed up R scripts on a cluster. Many R packages offer some form of parallel computing, yet they rely on a much smaller set of underlying approaches: multithreading in compiled code, the Unix fork, and MPI. The tutorial takes a narrow path, focusing on packages that engage these underlying approaches directly yet remain easy to use at a high level.

Objectives 

  1. Learn how to work with GitHub in RStudio. Create a GitHub (or ORNL GitLab) account, create a repository, and practice working with it from RStudio. Many tutorials are available on the web, for example from RStudio.
  2. Learn a few basic Unix commands for listing files, creating a directory, removing files, etc. There are many places to learn these, for example the Unix Shell Crash Course.

Tutorial workflow:

We will run R as batch jobs on the clusters. The workflow will be:

Edit your code in RStudio -> push the code to GitHub/GitLab -> pull the code to the cluster and submit it as a batch job -> look at your output and circle back to Edit.

This workflow lets you edit in a familiar environment while everyone runs in a common teaching environment. Other workflows are possible if you already know the tools.

Topics covered:

Day 1: Wednesday, July 20, 9:00am-12:00pm

Parallel hardware and software overview and ways to use multiple cores on a single node: mclapply (fork), multithreaded BLAS
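As a rough illustration of the fork approach named above, the sketch below compares `lapply` with `mclapply` from the base `parallel` package. The function `slow_square` and its 0.2-second delay are hypothetical stand-ins for real work, and reading the core count from `SLURM_CPUS_PER_TASK` assumes a Slurm scheduler; other schedulers use different variables. Forking is Unix-only, so this is not expected to run in parallel on Windows.

```r
library(parallel)

# Hypothetical stand-in for an expensive, independent computation.
slow_square <- function(i) {
  Sys.sleep(0.2)
  i^2
}

# Assumption: under Slurm, use the cores granted to the job;
# fall back to 2 cores when the variable is not set.
n_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "2"))

# Serial baseline.
serial_time <- system.time(res_serial <- lapply(1:8, slow_square))

# Same work, split across forked child processes.
fork_time <- system.time(
  res_fork <- mclapply(1:8, slow_square, mc.cores = n_cores)
)

# Both versions should produce the same results.
stopifnot(identical(res_serial, res_fork))

cat("serial elapsed (s):", serial_time[["elapsed"]], "\n")
cat("forked elapsed (s):", fork_time[["elapsed"]], "\n")
```

Because each call is independent, the forked version should finish in roughly the serial time divided by the number of cores, plus some overhead for starting the child processes.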

Day 2: Wednesday, July 27, 9:00am-12:00pm

Hardware review and using multiple nodes: MPI at high level via pbdMPI, matrix methods via kazaam and pbdDMAT
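To give a flavor of MPI at a high level, here is a minimal sketch using pbdMPI in the single-program, multiple-data style: every rank runs the same script, works on its own share of the data, and the partial results are combined with `allreduce`. It assumes pbdMPI and an MPI library are installed and that the job is started by an MPI launcher; the file name `sum_mpi.R` and the launch command are illustrative, and the launcher differs between clusters.

```r
# Launch with something like: mpirun -np 4 Rscript sum_mpi.R
# (hypothetical file name; the launcher command varies by cluster)
suppressMessages(library(pbdMPI))

init()

rank <- comm.rank()  # this process's id, starting at 0
size <- comm.size()  # total number of processes

# Each rank takes every size-th number from 1..100, starting at rank + 1.
# Assumes no more than 100 ranks.
my_numbers <- seq(rank + 1, 100, by = size)
local_sum <- sum(my_numbers)

# Combine the partial sums; every rank receives the total.
total <- allreduce(local_sum, op = "sum")

# By default comm.cat prints from rank 0 only.
comm.cat("Sum of 1..100 across", size, "ranks:", total, "\n")

finalize()
```

The total is the same for any number of ranks, which makes a script like this a convenient first check that a multi-node batch job is set up correctly.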

Prerequisites:

 

Registration

Date: Jul 20 2022 (repeats on July 27)

Time: 9:00 am - 12:00 pm