Technology - November 8, 2016

A Decade of User Notification Improvements

The OLCF user assistance group uses multiple channels to reach and notify facility users about system changes or center events. Passive methods include web reminders. Active methods include wall reminders, which notify users logged into a system prior to an outage or when a module or tool is executed.

Latest trends include targeted and multi-platform communication

With a diverse and growing user base and regularly evolving computing resources, the Oak Ridge Leadership Computing Facility (OLCF) strives to communicate with its users in the most effective ways possible. The OLCF needs quick, efficient communication methods because some changes to computing resources can drastically affect supercomputer use and research progress. Over the last decade, the OLCF, a US Department of Energy (DOE) Office of Science User Facility located at DOE’s Oak Ridge National Laboratory, has improved its notification practices to include more targeted material and additional ways to get the message out.

“The center’s compute resources are composed of many pieces; changes to even a single piece can impact system usability,” said Chris Fuson, OLCF user assistance task lead. “Because of this, notification of change is a very important task. We don’t want to make changes without notifying the user groups who could be impacted by the change. We are also always looking for tools and methods to improve the effectiveness of our notification.”

Earlier this year, Fuson presented a paper at the Cray User Group meeting on the OLCF’s user assistance notification improvements. Written by Fuson, William Renaud, and James Wynne III of the OLCF, the paper captures the communication trends of the last decade. One trend has been to develop methods and processes that pinpoint specific user groups for targeted messages instead of communicating broadly through mass email.

To do so, user assistance staff members modify, or wrap, multiple command-line tools so that messages or checks can be inserted into them. Most recently, they have started wrapping environment modules, which are common user tools for managing multiple software versions or libraries. Wrapping modules gives staff members an easy way to identify the users of a specific library or software package and send them messages or advisories relevant to their work.

“We make changes to center resources that impact all users, but many changes only impact small groups of users, so it doesn’t always make sense to email or notify the entire user community about every change,” Fuson said. “This is why we began wrapping modules—because they provide an opportunity to target users of specific packages close to the time when the packages will be used.”
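The wrapping idea can be illustrated with a minimal Python sketch. It is not the OLCF's implementation, which the article does not describe in detail. The advisory table, the module name `examplelib`, and the functions `advisories_for` and `wrapped_module` are all hypothetical. The sketch assumes only that users load software with `module load` or `module add` and that the wrapper hands the original arguments to the real command once any advisory has been shown.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical advisory table: module name -> message. The name and text are
# invented for illustration and do not come from any OLCF system.
ADVISORIES: Dict[str, str] = {
    "examplelib": "examplelib/1.0 will be removed at the next maintenance outage.",
}


def advisories_for(args: Sequence[str], table: Dict[str, str]) -> List[str]:
    """Return the advisories that match the modules named in a load request."""
    # Only load-style subcommands trigger advisories in this sketch.
    if not args or args[0] not in ("load", "add"):
        return []
    messages = []
    for spec in args[1:]:
        name = spec.split("/", 1)[0]  # "examplelib/1.0" -> "examplelib"
        if name in table:
            messages.append(table[name])
    return messages


def wrapped_module(
    args: Sequence[str],
    run_real: Callable[[Sequence[str]], int],
    table: Dict[str, str] = ADVISORIES,
    notify: Callable[[str], None] = print,
) -> int:
    """Show matching advisories, then delegate to the real module command.

    `run_real` stands in for whatever invokes the unwrapped tool; it is passed
    in so the sketch makes no claim about how a real site does that.
    """
    for message in advisories_for(args, table):
        notify(f"[notice] {message}")
    return run_real(args)
```

Because the advisory appears only when a matching module is requested, users of unrelated packages see nothing, which is the targeting effect Fuson describes.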

Currently, the user assistance team is working to incorporate the environment module wrapper fully on all of the OLCF machines. So far, the team has implemented wrappers on the OLCF’s Cray XC30, called Eos, and the flagship supercomputer, the Cray XK7 named Titan.

The OLCF user assistance team also uses two groups of email lists—low volume and high volume—to notify users of facility and system events.

“We utilize system-specific email lists, which provide the ability to target users of each OLCF system. We also provide a low-volume list and a high-volume list for each system,” Fuson said. “The low-volume list, which receives about one email a week, is mandatory. The high-volume is optional and receives emails such as the automated notification of system state change.”

In the past, users had to request access to the high-volume list. Within the last year, however, active users (those who have logged into Titan or placed a job in the queue in the last 10 days) have been added to the list automatically. If they want to receive fewer notifications, they can opt out.
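The active-user rule for the high-volume list can be written down as a short Python sketch. The `User` record, its field names, and the function names are hypothetical. Counting a user whose last activity was exactly 10 days ago as active is also an assumption, since the article does not specify the boundary.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List, Optional

# The 10-day window comes from the article; treating it as inclusive is assumed.
ACTIVE_WINDOW = timedelta(days=10)


@dataclass
class User:
    # Hypothetical record layout, for illustration only.
    name: str
    last_login: Optional[date] = None
    last_job_submit: Optional[date] = None
    opted_out: bool = False


def is_active(user: User, today: date) -> bool:
    """True if the user logged in or queued a job within the window."""
    cutoff = today - ACTIVE_WINDOW
    events = (user.last_login, user.last_job_submit)
    return any(d is not None and d >= cutoff for d in events)


def high_volume_recipients(users: Iterable[User], today: date) -> List[str]:
    """Active users are subscribed by default; opting out removes them."""
    return [u.name for u in users if is_active(u, today) and not u.opted_out]


if __name__ == "__main__":
    today = date(2016, 11, 8)
    users = [
        User("alice", last_login=date(2016, 11, 1)),
        User("bob", last_job_submit=date(2016, 10, 1)),
        User("carol", last_login=date(2016, 11, 7), opted_out=True),
    ]
    print(high_volume_recipients(users, today))
```

Here only the first user would receive high-volume mail: the second has no activity inside the window, and the third is active but has opted out.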

To increase the success of communicating important news, the OLCF also maintains other communication channels. Details about upcoming changes or events often will be placed on the OLCF website, and reminders of a change may be provided to users logged into a system before an outage or when a module or tool is executed.

“The goal is to come up with different ways to communicate with users because everyone has a different workflow and wants to see information in different ways,” Fuson said. “If we can notify users in several different ways, we hope to reach more people in a more effective way.”

Oak Ridge National Laboratory is supported by the US Department of Energy’s Office of Science. The single largest supporter of basic research in the physical sciences in the United States, the Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.