Main interests

- Accelerators & Hybrid Processors
- Manycore Architectures
- Task Scheduling & Data Management
- Runtime Systems
- Parallel Programming
- High Performance Computing

CV

I am currently a System Software Engineer at NVIDIA, where I work on the CUDA team. Prior to joining NVIDIA, I was a PhD student at the University of Bordeaux/INRIA. You can download my CV here.

Read More

StarPU

During my PhD thesis, I designed the StarPU runtime system, which schedules tasks and automates data management on heterogeneous multicore platforms equipped with accelerators.

Read More

Hits From The Blog

CUDA kernel launches in the null stream are NOT synchronous

Today I’d like to point out a very common error found in CUDA code: assuming that launching a kernel in stream 0 (either explicitly or implicitly) results in a synchronous kernel call. Prior to the introduction of streams in CUDA, users did not have to care about synchronization issues, as everything would typically [...]

Declaring dependencies with cudaStreamWaitEvent

cudaStreamWaitEvent is a very useful synchronization primitive which takes two arguments as input: a stream and an event. Even if this is not clear from its name, it is a non-blocking function: all operations enqueued in the stream after calling cudaStreamWaitEvent will only be unblocked once the event is triggered. A simple example For example, in [...]