CS184/284A: Lecture Slides

I think Nvidia's guide here (https://bit.ly/2ZnCCQU), which discusses how to program and manage parallelism across multiple cores, is quite interesting. Parallelism comes with a lot of challenges, particularly around caching and memory management, and I think the guide does a good job of separating what Nvidia's software takes care of from what the user might have to manage.