March 2024 OLCF User Conference Call: NVIDIA NeMo
The OLCF hosts monthly User Conference Calls. These calls are your opportunity to speak with center personnel, get the latest updates, and express any concerns you may have. No registration is required for this event. An email with an invite to the event will be sent to OLCF users approximately one week beforehand.
Monthly Topic: Unlocking the power of LLMs with NVIDIA NeMo
Speakers: Zahra Ronaghi (NVIDIA), Rohan Rao (NVIDIA), Janaki Vamaraju (NVIDIA)
Abstract: NVIDIA NeMo is an end-to-end framework to build, customize, and deploy generative AI models across various applications such as speech recognition, natural language processing, and text-to-speech synthesis. It includes training and inference frameworks, guardrail toolkits, data curation tools, and pretrained models. This framework supports the training of large-scale models with billions of parameters and facilitates multi-node and multi-GPU training and inference to enhance throughput and minimize training time. Leveraging NVIDIA NeMo, Triton Inference Server, and TensorRT-LLM, developers can also build Retrieval-Augmented Generation (RAG) workflows to enhance the accuracy and reliability of generative AI models. RAG improves accuracy by giving models access to relevant, up-to-date, and domain-specific information, which also reduces bias and hallucinations. This approach allows users to interact with data repositories, opening up new possibilities for applications across various domain areas.
Looking for a previous User Call recording? Check out our Training Archive and Vimeo Channel.