Weekly Update: July 8, 2020
IN THIS MESSAGE
Center Announcements
– New High-Memory Nodes on Summit
Meetings & Workshops
– New Kokkos Lecture Series Starts Jul 17
– TAU Performance Analysis Training (Jul 28)
– 2020 OLCF GPU Hackathon (virtual)
– CUDA Training Series
Upcoming Scheduled Outages
– HPSS (Jul 14-15)
– Summit, Rhea, Alpine, DTNs (Jul 21)
Center Announcements
New High-Memory Nodes on Summit
We are pleased to announce the addition of 54 high-memory nodes on Summit. These nodes have 2TB of DDR4 memory for the POWER9 processors, 192GB of high-bandwidth memory (HBM2) for the V100 accelerators, and 6.4TB of non-volatile memory (as compared to 512GB DDR4, 96GB HBM2, and 1.6TB for Summit’s other nodes). These nodes are available via the “batch-hm” queue. Jobs submitted to this queue are limited to a 6-hour walltime. For more information, see the Summit User Guide.
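As a rough illustration only, the sketch below shows one way to assemble a submission command for the “batch-hm” queue while checking the 6-hour walltime limit before submitting. The helper name `build_bsub_args`, the project ID, and the script name are hypothetical, and the `bsub` options shown (`-P`, `-q`, `-nnodes`, `-W`) are assumed to match Summit's LSF setup; please confirm the exact options in the Summit User Guide.

```python
# Hypothetical helper (not an OLCF tool): builds an LSF `bsub` argument list
# for the "batch-hm" queue and rejects requests beyond the announced limits.

HIGH_MEM_QUEUE = "batch-hm"        # queue name from the announcement
HIGH_MEM_NODE_COUNT = 54           # number of high-memory nodes announced
MAX_WALLTIME_MINUTES = 6 * 60      # 6-hour walltime limit from the announcement


def build_bsub_args(project, nodes, walltime_minutes, script):
    """Return a bsub command as a list of strings, or raise ValueError."""
    if not 1 <= nodes <= HIGH_MEM_NODE_COUNT:
        raise ValueError(
            f"nodes must be between 1 and {HIGH_MEM_NODE_COUNT}, got {nodes}"
        )
    if not 0 < walltime_minutes <= MAX_WALLTIME_MINUTES:
        raise ValueError(
            f"walltime must be between 1 and {MAX_WALLTIME_MINUTES} minutes, "
            f"got {walltime_minutes}"
        )
    hours, minutes = divmod(walltime_minutes, 60)
    return [
        "bsub",
        "-P", project,                   # project ID (assumed flag)
        "-q", HIGH_MEM_QUEUE,            # target the high-memory queue
        "-nnodes", str(nodes),           # node count (assumed Summit-specific flag)
        "-W", f"{hours}:{minutes:02d}",  # walltime as hours:minutes
        script,                          # batch script path (hypothetical)
    ]


if __name__ == "__main__":
    # Example: 2 nodes for 90 minutes under a made-up project ID.
    print(" ".join(build_bsub_args("ABC123", 2, 90, "job.lsf")))
```

The function only builds the command; it does not submit anything, so the result can be reviewed before it is passed to a shell or to `subprocess.run`.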
Meetings & Workshops
New Kokkos Lecture Series Starts July 17
You are invited to participate in a new Kokkos Lecture Series where attendees will learn everything necessary to start using Kokkos to write performance-portable code. This 8-part series will consist of 2-hour online lectures every Friday, with exercises as homework. The team will provide support via GitHub and Slack for the duration of the training. The lectures will be recorded, so if you have to miss one, you can still watch the video and complete the exercises in time to participate in the next lecture. For more information or to register, please visit https://www.exascaleproject.org/event/kokkos-class-series/.
TAU Performance Analysis Training
July 28, 2020
1:00 – 3:30 PM (ET)
On Tuesday, July 28, the OLCF will host (virtually) Sameer Shende (University of Oregon) to present a training on TAU performance analysis. The TAU Performance System is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, and Python. During the training, Sameer will cover TAU using presentations and demos and will also provide example codes for participants to work through. For more information, or to register, please visit: https://www.olcf.ornl.gov/calendar/tau-performance-analysis-training/
2020 OLCF GPU Hackathon (virtual)
The OLCF’s annual GPU hackathon is a multi-day coding event in which teams of developers prepare their own application(s) to run on GPUs or focus on optimizing application(s) that already run on GPUs. Teams typically consist of three or more developers who are intimately familiar with (some part of) their application, and they work alongside two mentors with GPU programming expertise. These hackathons offer a unique opportunity for teams to set aside time for development, surround themselves with experts in the field, and push toward their development goals. During the event, teams will have access to the OLCF’s Ascent training system (same node architecture as Summit). If you’re interested in more information, have questions about the virtual format, or just want to apply, you can visit the event page below. If you have additional questions, please contact Tom Papatheodore.
– Event Dates: October 19 and 26-28, 9 AM – 5 PM (ET)
– Deadline to apply: August 6, 2020 @ 11:59 PM (ET)
– Event page: https://www.olcf.ornl.gov/2020-olcf-gpu-hackathon/
CUDA Training Series
NVIDIA will present a 9-part CUDA training series intended to help new and existing GPU programmers understand the main concepts of the CUDA platform and its programming model. Each part will include a 1-hour presentation and example exercises. The exercises are meant to reinforce the material from the presentation and can be completed during a 1-hour hands-on session following each lecture. The full list of topics can be found on the series page. If you have any questions about this series, please contact Tom Papatheodore. For more information on upcoming events in this series, or to register, see the links below:
CUDA Concurrency (Tuesday, July 21)
GPU Performance Analysis (Tuesday, August 18)
Cooperative Groups (Thursday, September 17)
Upcoming Scheduled Outages
HPSS will be unavailable from 8:00 AM on Tuesday, July 14 until noon on Wednesday, July 15.
Summit, Rhea, the Alpine Filesystem, and the Data Transfer Nodes will be unavailable from 8:00 AM until 4:00 PM on Tuesday, July 21.