This is known generally as predication on the GPU, and it can lead to poor utilization of resources, as in this case. If anyone is interested in more detail, these slides from CS 152 have a good example and more background on GPUs in general: http://www-inst.eecs.berkeley.edu/~cs152/sp19/lectures/L16-GPU.pdf
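As a rough illustration (a hypothetical CUDA kernel, not taken from the slides), here is what divergence looks like in practice: when threads in the same warp take opposite branches, the hardware effectively predicates both paths, so every thread waits through the sum of the two branches and utilization drops.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cmath>

// Hypothetical kernel: threads whose condition differs from their warp-mates
// don't truly skip the other path. The warp executes both branches with
// inactive threads masked off, so the warp pays for sqrtf() AND the else case.
__global__ void divergent(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    if (in[i] > 0.0f) {
        out[i] = sqrtf(in[i]);   // taken by some threads in the warp...
    } else {
        out[i] = 0.0f;           // ...while the rest idle, then roles swap here
    }
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    // Alternating sign per thread gives worst-case divergence within each warp.
    for (int i = 0; i < n; ++i) in[i] = (i % 2) ? 1.0f : -1.0f;
    divergent<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("out[0]=%f out[1]=%f\n", out[0], out[1]);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

If the data were sorted so that each warp's threads all took the same branch, the masking would disappear and the kernel would run roughly at the cost of a single branch instead of both.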
Carpetfizz
Perhaps branch prediction works better on GPUs since shaders are usually simple, small programs? But there might also be high variance in which branch the program takes, since the shader is (possibly) being run millions of times.