Today I’d like to point out a very common error found in CUDA code: assuming that launching a kernel in stream 0 (either explicitly or implicitly) results in a synchronous kernel call. Prior to the introduction of streams in CUDA, users did not have to care about synchronization issues, as everything would typically [...]
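To make the pitfall concrete, here is a minimal sketch, not taken from the original post. It assumes a hypothetical kernel named `fill` and a buffer named `d_out`. The kernel is launched in the default stream (stream 0), and the host only relies on the result after an explicit `cudaDeviceSynchronize()`, since a kernel launch returns control to the host before the kernel has finished.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel used only for illustration.
__global__ void fill(int *out, int value)
{
    out[threadIdx.x] = value;
}

int main()
{
    const int n = 32;
    int h_out[n] = {0};
    int *d_out = nullptr;

    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // No stream argument: the kernel goes to stream 0. The launch is still
    // asynchronous with respect to the host and returns immediately.
    fill<<<1, n>>>(d_out, 42);

    // Host code placed here may run while the kernel is still executing.

    // Explicitly wait for the device before assuming the kernel is done.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    printf("h_out[0] = %d\n", h_out[0]);

    cudaFree(d_out);
    return 0;
}
```

The blocking `cudaMemcpy` alone would also wait for the kernel in this small case, but the explicit synchronization makes the dependency visible and lets the launch error be checked separately.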
Hits From The Blog
Declaring dependencies with cudaStreamWaitEvent
cudaStreamWaitEvent is a very useful synchronization primitive which takes two arguments as input: a stream and an event. Even if this is not clear from its name, it is a non-blocking function: the call returns immediately on the host, and all operations enqueued in the stream after calling cudaStreamWaitEvent will only be unblocked once the event is triggered. A simple example For example, in [...]
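The dependency pattern described in this excerpt can be sketched as follows. This is an illustrative example under stated assumptions, not the example from the original post: the kernels `produce` and `consume`, the streams `producer` and `consumer`, and the event `ready` are all hypothetical names. Note that the runtime call also takes a third `flags` argument, passed as 0 here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernels used only for illustration.
__global__ void produce(int *buf, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = i;
}

__global__ void consume(const int *buf, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2 * buf[i];
}

int main()
{
    const int n = 1024;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    int h_out[n];
    int *d_buf = nullptr, *d_out = nullptr;

    cudaMalloc(&d_buf, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));

    cudaStream_t producer, consumer;
    cudaStreamCreate(&producer);
    cudaStreamCreate(&consumer);

    cudaEvent_t ready;
    cudaEventCreateWithFlags(&ready, cudaEventDisableTiming);

    // Enqueue the producer kernel, then record the event right after it.
    produce<<<blocks, threads, 0, producer>>>(d_buf, n);
    cudaEventRecord(ready, producer);

    // Returns immediately on the host. Work enqueued in `consumer` after
    // this call will not start until `ready` has been triggered.
    cudaStreamWaitEvent(consumer, ready, 0);
    consume<<<blocks, threads, 0, consumer>>>(d_buf, d_out, n);

    // Only now does the host actually block.
    cudaError_t err = cudaStreamSynchronize(consumer);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    printf("h_out[10] = %d\n", h_out[10]);

    cudaEventDestroy(ready);
    cudaStreamDestroy(producer);
    cudaStreamDestroy(consumer);
    cudaFree(d_buf);
    cudaFree(d_out);
    return 0;
}
```

Without the `cudaStreamWaitEvent` call, the two kernels live in independent streams and `consume` could read `d_buf` before `produce` has written it.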