Q: Can we read these Cray PM counters from an application to measure the energy consumption of specific functions?
A: Yes. You could read the files directly from the application, as Ashesh is talking about now, but he will also talk about other interfaces (such as PAPI) later in the talk.

Q: Is it possible to read power details at the core level? What is the granularity that’s possible?
A: PM counters are reported on a per-node basis, and you can get node, CPU, memory, and GPU/accelerator counters. More detailed information is captured by other APIs and tools.

Q: If I am building a script to read these counters at regular intervals, should I also be aware of the overhead of reading these numbers?
A: If you are reading directly from the system files, the overhead will depend on how frequently you are reading, because every read will cause a context switch. The overhead numbers reported apply only when using CrayPat/perftools.

Q: The 0.1 s sampling rate is much longer than the kernel execution time. Is there any way to get a per-kernel energy use estimate?
A: It seems like tracing with PAT_RT_SUMMARY=1 would give a good integrated estimate of the total energy used by each kernel.
A: Tracing with summary measures energy on entry and exit. Because the measurements are summarized (integrated), the estimate might be good.
A: Omnitrace is able to do that. Bruno will arrive at that in a few slides.

Q: How much control does the user have over how much instrumentation is done by the binary instrumentor?
A: There is some control: the user can specify a single function or a list of functions to be instrumented, and can omit functions from being instrumented. The user can also specify predefined tracing groups to be traced, e.g., hip, mpi, blas, etc. The link to pat_build in the presentation should have all the details. If you run into issues, we can help you weed through them.
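
As an illustration of the first answer above (reading the counter files directly from the application), here is a minimal Python sketch that takes the difference of the node energy counter around a region of code. It is not the method shown in the talk. The directory /sys/cray/pm_counters, the file name energy, and the file layout (an integer value in joules as the first whitespace-separated field) are assumptions to check on your own system. The helper names read_counter and energy_region are hypothetical.

```python
import os
import time
from contextlib import contextmanager

# Assumed location of the Cray PM counter files; verify on your system.
PM_DIR = "/sys/cray/pm_counters"


def read_counter(name, pm_dir=PM_DIR):
    """Return the first whitespace-separated field of a counter file as an int.

    Assumes the file looks like '<value> <unit> <timestamp> us'.
    """
    with open(os.path.join(pm_dir, name)) as f:
        return int(f.read().split()[0])


@contextmanager
def energy_region(result, name="energy", pm_dir=PM_DIR):
    """Store the counter delta and wall time of the enclosed block in `result`."""
    start_j = read_counter(name, pm_dir)
    start_t = time.perf_counter()
    try:
        yield result
    finally:
        result["joules"] = read_counter(name, pm_dir) - start_j
        result["seconds"] = time.perf_counter() - start_t


if __name__ == "__main__" and os.path.isdir(PM_DIR):
    with energy_region({}) as r:
        sum(i * i for i in range(10_000_000))  # stand-in for the function of interest
    print(f"{r['joules']} J over {r['seconds']:.2f} s (whole node)")
```

Two caveats follow from the answers above. The counters are per node, so the delta includes everything running on the node, not only the measured function. And with a sampling interval of about 0.1 s, regions much shorter than that will give unreliable deltas. Each call to read_counter also opens and reads a file, so calling it in a tight loop adds the overhead discussed in the third question.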