

IFS NR Data Hackathon
Exploring the ECMWF IFS 1-km nature run
A baseline for a digital twin of earth


Proposal Submission Form


Name (Required)
Email (Required)
Team Confirmation (Required)
If selected, will you have at least 2 active members on your team (including yourself)?
Additional Team Members
Please list the additional members of your team (if currently known).
First Name
Last Name
Email
Affiliation
Please check the boxes below that best describe your team's current experience with high performance computing (HPC). (Required)
This question is meant to help the organizers understand what type of initial training might be helpful for the selected teams.
Are you currently interested in using GPUs as part of your computing needs? (Required)
This question is meant to help the organizers gauge the teams' interest in OLCF's GPU resources. You can always decide later whether or not to use GPUs.
Please describe your team's potential plans for using this data set.
Contacts

If you have any questions about participating in this event, please email one (or more) of the contacts below.

General Inquiries: Tom Papatheodore, papatheodore@ornl.gov
Data: Valentine Anantharaj, vga@ornl.gov
Nature run: Inna Polichtchouk, Inna.Polichtchouk@ecmwf.int

Overview

The European Centre for Medium-Range Weather Forecasts (ECMWF) and the Oak Ridge National Laboratory (ORNL) are pleased to announce access to the data collection from global 1-km experimental nature run simulations using the Integrated Forecasting System (IFS) with explicit convection [1]. We invite you to join us in exploring this precursor to a digital twin of the earth!

This open science event will facilitate direct access to the data, currently hosted at the Oak Ridge Leadership Computing Facility (OLCF). Participating teams will also have access to a large Linux cluster with 700 nodes. In addition, teams exploring machine learning applications will have limited access to Ascent, a stand-alone 15-node IBM AC922 system with 90 NVIDIA V100 GPUs and an architecture identical to Summit.


The Experimental Nature Run

A nature run can be considered a “long forecast” of the atmosphere (typically extending two weeks or more beyond initialization) [2], in which the simulation develops its own course of reality. The experimental nature run at global 1-km resolution (XNR1K) is based on IFS Science Version 45r1 [3]. Hereafter, the terms “nature run” and “NR” in this announcement refer to XNR1K.

Two seasonal simulations have been completed, one corresponding to the northern hemisphere winter months (NDJF) and the other to the North Atlantic tropical cyclone season (ASO). For the first seasonal run (NDJF), the hydrostatic IFS model was initialized at 00Z on 1 November 2018; the second run (ASO) was initialized at 00Z on 1 August 2019. The XNR1K simulations were forced only by 1/12 degree OSTIA sea surface temperatures (SST) at the lower boundary. The output was saved every 3 hours at all model levels. In addition, the XNR1K simulations were rerun for four specific extreme events, with output every 15 minutes. These NR special cases include a tropical cyclone and three severe storm events over the continental USA.
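
To give a rough sense of how many output times each seasonal run contains at the 3-hourly interval, here is a minimal Python sketch. The initialization dates come from the text above. The end dates (00Z on the first day after each season) are an assumption made for illustration, as is the helper name `count_outputs`. The actual archive may start or stop at different times.

```python
from datetime import datetime, timedelta

def count_outputs(start, end, step):
    """Count output times from start to end (both inclusive), spaced by step."""
    return (end - start) // step + 1

THREE_HOURLY = timedelta(hours=3)

# Initialization dates are from the announcement; end dates are ASSUMED.
ndjf_outputs = count_outputs(datetime(2018, 11, 1), datetime(2019, 3, 1), THREE_HOURLY)
aso_outputs = count_outputs(datetime(2019, 8, 1), datetime(2019, 11, 1), THREE_HOURLY)

# For comparison: one (hypothetical) 24-hour window at the 15-minute interval.
one_day_15min = count_outputs(
    datetime(2019, 8, 1), datetime(2019, 8, 2), timedelta(minutes=15)
)

print(ndjf_outputs, aso_outputs, one_day_15min)
```

Under these assumptions each season holds several hundred 3-hourly snapshots, which is worth keeping in mind when planning how much of the record an analysis needs to read.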

Both model level data at all 137 levels and pressure level fields at 31 levels are being made available, along with 94 single level fields. The 3D variables include the prognostic variables, namely temperature, pressure, moisture and winds (u, v and w), as well as hydrometeor content, namely cloud liquid, cloud ice, rain and snow.
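
The level counts above translate into very large fields at 1-km resolution. The following Python sketch is a back-of-envelope estimate only. It assumes one grid point per square kilometer over the whole globe (about 5.1e8 points per level) and unpacked 4-byte values. The function name `field_size_gb` and both of these assumptions are illustrative. The actual model grid and file packing will differ, so real sizes may be considerably smaller.

```python
EARTH_SURFACE_KM2 = 5.1e8  # approximate surface area of the earth in km^2

def field_size_gb(levels, grid_spacing_km=1.0, bytes_per_value=4):
    """Rough uncompressed size (GB) of one field at one output time.

    ASSUMES a uniform grid with one point per grid_spacing_km^2 and
    unpacked values; both are simplifications for illustration.
    """
    points_per_level = EARTH_SURFACE_KM2 / grid_spacing_km**2
    return points_per_level * levels * bytes_per_value / 1e9

model_level_gb = field_size_gb(137)    # one 3D variable on all model levels
pressure_level_gb = field_size_gb(31)  # one 3D variable on pressure levels
single_level_gb = field_size_gb(1)     # one single level field

print(f"model levels:    {model_level_gb:.1f} GB")
print(f"pressure levels: {pressure_level_gb:.1f} GB")
print(f"single level:    {single_level_gb:.1f} GB")
```

Even as an upper-bound sketch, this suggests that a single 3D variable at a single output time can exceed the memory of one node, so analyses will likely need to work on subsets of levels, regions or times.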


Topics for Exploration

At 1-km resolution, many of the smaller scale processes are resolved in the NR, such as most of the gravity wave spectrum [5]. Hence, this XNR1K dataset provides a unique opportunity not only to investigate smaller scale processes but also to inform the development of their parameterization in order to improve numerical weather and climate prediction. We invite science teams to propose research and application areas of interest that could be explored at scale. Investigation of extreme weather events, including hurricanes and severe storms, is of special interest. Additional emphasis will be placed on the development of AI and ML applications for surrogate models, emulators for data assimilation systems, satellite retrievals of earth observations, etc. Visualization of the global 1-km simulations is also encouraged, both for science investigations and for outreach activities to inform and educate. The high temporal frequency (15-minute) simulations were designed around case studies to develop future observing system technologies using Observing System Simulation Experiments (OSSEs) [4].


Available Resources

The selected science teams will have access to the data and main computational resources for a period of 6 months, with an anticipated end date of 31 December 2022. This project has been given a total allocation of 25,000 node-hours on OLCF’s Andes analysis and visualization cluster, 4 PB of storage on a high-performance file system (GPFS), and access to a JupyterHub for interactive analysis with Jupyter Notebooks. For AI and machine learning production needs, teams will also have periodic access to Ascent, a training system with an architecture identical to Summit. A team of mentors, with expertise in systems, data science, computing and machine learning, will be available to provide guidance. A data curator will also offer help with the publication of any derived results and data sets.
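
For planning purposes, the shared allocation can be turned into a rough per-team budget. The Python sketch below assumes an even split of the 25,000 node-hours across the 5 to 10 teams expected to be supported. The even split is purely an illustrative assumption, not a stated policy, and the helper names and the 10-node job size are hypothetical.

```python
TOTAL_NODE_HOURS = 25_000  # project allocation on Andes, from the announcement

def per_team_node_hours(n_teams, total=TOTAL_NODE_HOURS):
    """Node-hours per team under an ASSUMED even split."""
    return total / n_teams

def wallclock_hours(node_hours, nodes_per_job):
    """Wall-clock hours a budget lasts when every job uses nodes_per_job nodes."""
    return node_hours / nodes_per_job

for n_teams in (5, 10):
    budget = per_team_node_hours(n_teams)
    hours = wallclock_hours(budget, nodes_per_job=10)
    print(f"{n_teams} teams: {budget:.0f} node-hours each, "
          f"about {hours:.0f} hours of 10-node jobs")
```

Under this assumption a team would have between 2,500 and 5,000 node-hours, so it is worth prototyping on small subsets of the data before launching large jobs.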


Awards

We expect to support 5 to 10 teams of at least two members each with access to the OLCF resources for a period of six months. Applications will be accepted through 31 August and reviewed periodically. We will select and approve suitable projects on a rolling basis through 30 September. Note that, regardless of the approval date, the anticipated date of completion for all projects is 31 December 2022. Meritorious projects needing extensions will be evaluated on an individual basis.


Requirements

All applications should provide the necessary information and be submitted online. Initially, only two users per team may be granted access to OLCF resources, although additional users may be approved to meet the science objectives. All science teams are nevertheless encouraged to involve as many members as necessary for the success of their projects. International participants are welcome to apply, but approval and access are subject to the laws of the government of the United States.


Limitations

Data are available for open science research and applications only. Redistribution of the data is currently not allowed. Applications must list all project team members who will be using the data.


Organizing Committee

Valentine Anantharaj (ORNL) Lead (Data)
Inna Polichtchouk (ECMWF) PI & Lead (Science)
Tom Papatheodore (ORNL) Lead (Event)
Samuel Hatfield (ECMWF) Lead (Computing)
Suzanne Parete-Koon (ORNL) Event Manager

References

[1] Wedi, N. P., Polichtchouk, I., Dueben, P., Anantharaj, V. G., Bauer, P., Boussetta, S., et al. (2020). A baseline for global weather and climate simulations at 1 km resolution. Journal of Advances in Modeling Earth Systems, 12, e2020MS002192. https://doi.org/10.1029/2020MS002192

[2] Hoffman, R., Pertain, P., Peevey, T., and Finley, S. (2018). ECMWF Nature Run. https://www.cira.colostate.edu/imagery-data/ecmwf-nature-run/ [Accessed 15 June 2022]  

[3] ECMWF (2020). IFS documentation. https://www.ecmwf.int/en/publications/ifs-documentation [Accessed 15 June 2022]  

[4] Hoffman, R. N., and Atlas, R. (2016). Future Observing System Simulation Experiments. Bulletin of the American Meteorological Society, 97(9), 1601-1616. https://doi.org/10.1175/BAMS-D-15-00200.1 [Accessed 05 May 2022]

[5] Polichtchouk, I., Wedi, N., and Kim, Y. H. (2022). Resolved gravity waves in the tropical stratosphere: Impact of horizontal resolution and deep convection parametrization. Quarterly Journal of the Royal Meteorological Society, 148(742), 233-251. https://doi.org/10.1002/qj.4202