Lecture 9 - (cont.) More on CUDA
Some announcements:
- You may bring 2 pages of notes (front and back) to the midterm on Thursday
- The midterm is mostly code revisions, finding dependencies, cache issues, etc.
Note that many of the issues we find with malloc in CUDA code
come from simply using it incorrectly. If you prefer, you can use C++'s new
keyword instead, which avoids a lot of these issues.
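As a minimal sketch of the point above, the program below allocates the host buffer with new[] and keeps the byte count in one place for the device side. The kernel scale and the helper scale_on_device are hypothetical names made up for this example, and the block size of 256 is an arbitrary choice; none of this comes from the lecture itself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies each element by a factor.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: scales a host array in place using the GPU.
// Returns true if every CUDA call succeeded.
bool scale_on_device(float *host, int n, float factor) {
    float *dev = nullptr;
    // cudaMalloc and cudaMemcpy take sizes in BYTES, so compute it once.
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    if (cudaMalloc((void **)&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 256;                         // arbitrary block size
        int blocks = (n + threads - 1) / threads;  // round up to cover n
        scale<<<blocks, threads>>>(dev, n, factor);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}

int main() {
    const int n = 1000;
    // new[] is typed and sized in elements. The malloc equivalent would be
    // (float *)malloc(n * sizeof(float)); forgetting the sizeof or the
    // NULL check are typical ways to use malloc incorrectly.
    float *host = new float[n];
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    bool ok = scale_on_device(host, n, 2.0f);
    printf("ok = %d, host[0] = %f\n", ok ? 1 : 0, host[0]);

    delete[] host;  // memory from new[] is released with delete[], not free()
    return ok ? 0 : 1;
}
```

Note that new only helps on the host side: device memory still goes through cudaMalloc, which always takes a size in bytes.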
For a review of the basics of CUDA, see this link.
An interesting thing is that we can turn the virtual memory feature off for our CUDA devices. Why would you do this? Virtual memory is essentially a layer of address translation that sits between the program and the main memory, so every access goes through that extra layer. Without it, you can:
- Use a lot more memory!
- Have faster memory access times
- Get way more page faults (uhh oh!)
When creating threads:
- Keep the number of operations per thread under 256, which is the number of registers available to each thread.
- Use as many threads as possible while keeping each thread's operations under this number.
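One way to check these numbers on a real device is to ask the CUDA runtime how many registers a compiled kernel uses and how many the device offers per block. The sketch below assumes the limit in the notes refers to registers per thread; the kernel saxpy and the helper max_threads_by_registers are hypothetical names for this example, and the helper is a simplification that ignores register allocation granularity and shared memory limits.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel used only so there is something to inspect.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Hypothetical helper: the largest threads-per-block count that fits in the
// block's register budget, capped by the device's own thread limit.
int max_threads_by_registers(int regs_per_thread, int regs_per_block,
                             int max_threads_per_block) {
    if (regs_per_thread <= 0) return max_threads_per_block;
    int by_regs = regs_per_block / regs_per_thread;
    return by_regs < max_threads_per_block ? by_regs : max_threads_per_block;
}

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("no usable CUDA device\n");
        return 1;
    }

    // numRegs is the register count per thread chosen by the compiler.
    cudaFuncAttributes attr;
    if (cudaFuncGetAttributes(&attr, saxpy) != cudaSuccess) return 1;

    int threads = max_threads_by_registers(attr.numRegs, prop.regsPerBlock,
                                           prop.maxThreadsPerBlock);
    printf("registers per thread: %d\n", attr.numRegs);
    printf("registers per block:  %d\n", prop.regsPerBlock);
    printf("threads per block that fit: %d\n", threads);
    return 0;
}
```

In practice you would usually round the result down to a multiple of the warp size (prop.warpSize). Compiling with nvcc -Xptxas -v also prints each kernel's register usage at build time.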