This webinar is by invitation only.
Workshop date: October 15th, 2020
Registration deadline: September 21st, 2020

This workshop demonstrates a virtual chemoinformatics lab that uses GPUs on the Summit supercomputer to analyze massive amounts of data. The virtual lab uses Jupyter notebooks and is built on several novel GPU-accelerated tools, based on NVIDIA RAPIDS and BlazingSQL, that run data science and analytics pipelines entirely on GPUs. These tools scale across multiple GPUs, using UCX to communicate over Summit's interconnects (NVLink2, InfiniBand).

We will demonstrate how this virtual lab can be used to analyze massive data sets, including (but not limited to):

  • 1.4 billion SMILES chemical structures (the entire Enamine REAL database)
  • COVID-19: two billion ligand docking poses over active sites of the SARS-CoV-2 main protease (Mpro), and their rescoring
  • Other data sets

Our workshop includes a hands-on session showing how users can access the virtual lab to organize, search through, and analyze terabytes of data in a matter of seconds, and how to apply scalable queries, data analytics, and machine learning techniques. We will also discuss how we developed the virtual lab: a multi-GPU, RAPIDS/BlazingSQL-based interactive analysis tool with a Jupyter Notebook interface that provides a major speedup for querying and analyzing these data sets compared to conventional, persistent databases. Finally, we will demonstrate how these tools can be deployed anywhere from a desktop to the cloud or a supercomputer.

If you have any questions, please contact Oscar Hernandez (

Join ZoomGov Meeting
Meeting ID: 160 051 4523
Password: OLCFDATA

One tap mobile
+16692545252,,1600514523#,,#,03433038# US (San Jose)
+16468287666,,1600514523#,,#,03433038# US (New York)

Dial by your location
+1 669 254 5252 US (San Jose)
+1 646 828 7666 US (New York)
Meeting ID: 160 051 4523
Password: 03433038
Find your local number: