KernelInfo
Introduction
This LLVM IR pass reports various statistics for code compiled for GPUs. The goal of these statistics is to help identify bad code patterns and ways to mitigate them. The pass operates at the LLVM IR level so that it can, in theory, support any LLVM-based compiler for programming languages supporting GPUs.
By default, the pass runs at the end of LTO, and options like -Rpass=kernel-info enable its remarks. Example opt and clang command lines appear in the next section.
Remarks include summary statistics (e.g., total size of static allocas) and individual occurrences (e.g., source location of each alloca). Examples of the output appear in tests in llvm/test/Analysis/KernelInfo.
Example Command Lines
To analyze a C program as it appears to an LLVM GPU backend at the end of LTO:
$ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
-Rpass=kernel-info
To analyze specified LLVM IR, perhaps previously generated by something like clang -save-temps -g -fopenmp --offload-arch=native test.c:
$ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
-pass-remarks=kernel-info -passes=kernel-info
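Remarks are printed as diagnostics on standard error, mixed with any warnings or errors, so it can be convenient to separate them out. The following is a minimal sketch and not part of LLVM: the function name collect_remarks is hypothetical, and it assumes that each remark line contains the text "remark:", as clang and opt diagnostics normally do.

```shell
# Hypothetical helper (not part of LLVM): run a command, discard its
# standard output, and print only the stderr lines containing "remark:".
collect_remarks() {
  # The pipe is set up first, then 2>&1 sends stderr into the pipe,
  # then >/dev/null discards the command's own stdout.
  # "|| true" keeps the helper from failing when no remark is found.
  "$@" 2>&1 >/dev/null | grep 'remark:' || true
}

# Example usage, assuming the bitcode file named above exists:
#   collect_remarks opt -disable-output \
#     test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
#     -pass-remarks=kernel-info -passes=kernel-info > kernel-info.txt
```

Because the helper only filters text, it works the same way for the clang command lines in this section.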
When specifying an LLVM pass pipeline on the command line, kernel-info still runs at the end of LTO by default. -no-kernel-info-end-lto disables that behavior so you can position kernel-info explicitly:
$ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
-Rpass=kernel-info \
-Xoffload-linker --lto-newpm-passes='lto<O2>'
$ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
-Rpass=kernel-info -mllvm -no-kernel-info-end-lto \
-Xoffload-linker --lto-newpm-passes='module(kernel-info),lto<O2>'
$ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
-pass-remarks=kernel-info \
-passes='lto<O2>'
$ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
-pass-remarks=kernel-info -no-kernel-info-end-lto \
-passes='module(kernel-info),lto<O2>'
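The two opt invocations above differ only in where kernel-info runs, so they can be wrapped in one small function for side-by-side comparisons. This is a sketch under stated assumptions, not an LLVM-provided tool: the function name run_kernel_info, its end/start arguments, and the OPT environment variable (used to select the opt binary) are all hypothetical.

```shell
# Hypothetical helper (not part of LLVM): run opt on a bitcode file with
# kernel-info either at the end of LTO (the default placement) or
# explicitly at the start of the pipeline.
# OPT is an assumed override for the opt binary; it defaults to "opt".
run_kernel_info() {
  placement=$1
  bitcode=$2
  case "$placement" in
    end)
      # Default behavior: kernel-info runs at the end of lto<O2>.
      "${OPT:-opt}" -disable-output "$bitcode" \
        -pass-remarks=kernel-info \
        -passes='lto<O2>'
      ;;
    start)
      # Disable the end-of-LTO run and place kernel-info first.
      "${OPT:-opt}" -disable-output "$bitcode" \
        -pass-remarks=kernel-info -no-kernel-info-end-lto \
        -passes='module(kernel-info),lto<O2>'
      ;;
    *)
      echo "usage: run_kernel_info end|start FILE.bc" >&2
      return 2
      ;;
  esac
}

# Example usage, assuming the bitcode file named above exists:
#   run_kernel_info end   test-openmp-nvptx64-nvidia-cuda-sm_70.bc
#   run_kernel_info start test-openmp-nvptx64-nvidia-cuda-sm_70.bc
```

Running both placements on the same file shows how the reported statistics change between the start and the end of the LTO pipeline.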