CUDA: Difference between CPU timer and CUDA timer event?


What is the difference between using a CPU timer and the CUDA timer event to measure the time taken for the execution of some CUDA code? Which of these should a CUDA programmer use, and why?

CPU timer usage would involve calling cudaThreadSynchronize before any time is noted. For noting the time, clock() could be used, or the high-resolution performance counter QueryPerformanceCounter (on Windows) could be queried.
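For illustration, here is a minimal sketch of the host-timer approach. It makes a few assumptions that are not in the question: the kernel `scale` and the helper `time_with_host_timer` are hypothetical names, `std::chrono::steady_clock` stands in for clock() or QueryPerformanceCounter as a portable host timer, and cudaDeviceSynchronize is used because cudaThreadSynchronize has since been deprecated in its favor.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel; it only exists to give the timer some work to measure.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Returns the host-measured wall-clock time, in milliseconds, of one launch.
double time_with_host_timer(int n)
{
    float *d_data = nullptr;
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Drain pending GPU work so it is not charged to this measurement.
    CUDA_CHECK(cudaDeviceSynchronize());
    auto start = std::chrono::steady_clock::now();

    scale<<<blocks, threads>>>(d_data, n, 2.0f);
    CUDA_CHECK(cudaGetLastError());

    // Launches are asynchronous: wait for the kernel before reading the clock.
    CUDA_CHECK(cudaDeviceSynchronize());
    auto stop = std::chrono::steady_clock::now();

    CUDA_CHECK(cudaFree(d_data));
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_with_host_timer(1 << 20)` from `main` would return the elapsed milliseconds. Note that the figure includes launch overhead and the cost of the synchronization call itself, not just the kernel's run time on the device.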

The CUDA timer event would involve recording before and after by using cudaEventRecord. At a later time, the elapsed time would be obtained by calling cudaEventSynchronize on the events, followed by cudaEventElapsedTime to obtain the elapsed time.
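The event-based sequence described here could look roughly like the sketch below. The kernel `add_one` and the helper `time_with_events` are hypothetical names chosen for the example, and the events are recorded in the default stream (stream 0).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel; it only exists to give the timer some work to measure.
__global__ void add_one(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Returns the GPU-measured time, in milliseconds, between the two events.
float time_with_events(int n)
{
    float *d_data = nullptr;
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));

    // Both events are queued in the same stream as the kernel, so they
    // bracket it in execution order on the device.
    CUDA_CHECK(cudaEventRecord(start, 0));
    add_one<<<blocks, threads>>>(d_data, n);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaEventRecord(stop, 0));

    // Block the host until the stop event has actually been reached.
    CUDA_CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&ms, start, stop));

    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    CUDA_CHECK(cudaFree(d_data));
    return ms;
}
```

Because the `start` event cannot complete before earlier work in the stream has finished, synchronizing on `stop` alone is enough before calling cudaEventElapsedTime.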

The answer to the first part of the question is that cudaEvents timers are based on high-resolution counters on board the GPU, and they have lower latency and better resolution than a host timer because they come "off the metal". You should expect sub-microsecond resolution from the cudaEvents timers, and you should prefer them for timing GPU operations for precisely that reason. The per-stream nature of cudaEvents can also be useful for instrumenting asynchronous operations such as simultaneous kernel execution and overlapped copy and kernel execution. Doing that sort of time measurement is impossible using host timers.
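To make the per-stream point concrete, here is a rough sketch that times a copy plus a kernel independently in each of two streams. Everything named here (`square`, `time_two_streams`, the choice of two streams) is an assumption made for the example, and whether the two streams really overlap depends on the device and driver.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel; it only exists to give each stream some work.
__global__ void square(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= data[i];
}

// Writes the per-stream elapsed time (copy + kernel) in ms to ms[0], ms[1].
void time_two_streams(int n, float ms[2])
{
    const int kStreams = 2;
    const size_t bytes = n * sizeof(float);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    cudaStream_t stream[kStreams];
    cudaEvent_t start[kStreams], stop[kStreams];
    float *h_buf[kStreams], *d_buf[kStreams];

    for (int s = 0; s < kStreams; ++s) {
        CUDA_CHECK(cudaStreamCreate(&stream[s]));
        CUDA_CHECK(cudaEventCreate(&start[s]));
        CUDA_CHECK(cudaEventCreate(&stop[s]));
        // Pinned host memory is needed for copies to be truly asynchronous.
        CUDA_CHECK(cudaMallocHost(&h_buf[s], bytes));
        CUDA_CHECK(cudaMalloc(&d_buf[s], bytes));
        for (int i = 0; i < n; ++i) h_buf[s][i] = 1.0f;
    }

    // Queue copy + kernel in each stream, bracketed by that stream's events.
    for (int s = 0; s < kStreams; ++s) {
        CUDA_CHECK(cudaEventRecord(start[s], stream[s]));
        CUDA_CHECK(cudaMemcpyAsync(d_buf[s], h_buf[s], bytes,
                                   cudaMemcpyHostToDevice, stream[s]));
        square<<<blocks, threads, 0, stream[s]>>>(d_buf[s], n);
        CUDA_CHECK(cudaGetLastError());
        CUDA_CHECK(cudaEventRecord(stop[s], stream[s]));
    }

    // Each stream gets its own measurement, even if the work overlapped.
    for (int s = 0; s < kStreams; ++s) {
        CUDA_CHECK(cudaEventSynchronize(stop[s]));
        CUDA_CHECK(cudaEventElapsedTime(&ms[s], start[s], stop[s]));
    }

    for (int s = 0; s < kStreams; ++s) {
        CUDA_CHECK(cudaFreeHost(h_buf[s]));
        CUDA_CHECK(cudaFree(d_buf[s]));
        CUDA_CHECK(cudaEventDestroy(start[s]));
        CUDA_CHECK(cudaEventDestroy(stop[s]));
        CUDA_CHECK(cudaStreamDestroy(stream[s]));
    }
}
```

A host timer wrapped around the same code would only see one combined interval ending at a device-wide synchronization, whereas the events give a separate figure for each stream.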

EDIT: I won't answer the last paragraph because you deleted it.

