Memory performance estimation of CUDA programs

Yooseong Kim; Aviral Shrivastava

doi:10.1145/2514641.2514648

2013-09-01 00:00:00

Yooseong Kim and Aviral Shrivastava, Arizona State University

CUDA has successfully popularized GPU computing, and GPGPU applications are now used in various embedded systems. The CUDA programming model provides a simple interface for programming GPUs, but tuning GPGPU applications for high performance is still quite challenging. Programmers need to consider numerous architectural details, and small changes in source code, especially in the memory access pattern, can affect performance significantly. This makes it very difficult to optimize CUDA programs.

This article presents CuMAPz, a tool to analyze and compare the memory performance of CUDA programs. CuMAPz can help programmers explore different ways of using shared and global memories, and optimize their programs for efficient memory behavior. CuMAPz models several memory-performance-related factors: data reuse, global memory access coalescing, global memory latency hiding, shared memory bank conflict, channel skew, and branch divergence.

Experimental results show that CuMAPz can accurately estimate performance, with a correlation coefficient of 0.96. By using CuMAPz to explore the memory access design space, we could improve the performance of our benchmarks by 30% more than the previous approach [Hong and Kim 2010].

Categories and Subject Descriptors: B.8.2 [Performance and
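Two of the factors named in the abstract, global memory access coalescing and shared memory bank conflicts, can be illustrated with a small counting model. The sketch below is not CuMAPz's model and is not taken from the article; it is a minimal illustration under assumed hardware parameters (32 threads per warp, 128-byte global memory segments, 32 shared memory banks of 4-byte words), and all function names are hypothetical.

```python
from collections import Counter

# Assumed hardware parameters (illustrative only; real GPUs differ by generation).
WARP_SIZE = 32        # threads that issue a memory access together
SEGMENT_BYTES = 128   # size of one aligned global-memory transaction
NUM_BANKS = 32        # number of shared-memory banks
WORD_BYTES = 4        # width of one bank entry


def strided_access(stride_words):
    """Byte addresses touched by one warp when thread t reads word t * stride."""
    return [t * stride_words * WORD_BYTES for t in range(WARP_SIZE)]


def global_transactions(byte_addresses):
    """Count the distinct aligned segments one warp-wide access touches.

    Fewer segments means better coalescing: 1 is ideal for 4-byte words.
    """
    return len({addr // SEGMENT_BYTES for addr in byte_addresses})


def bank_conflict_degree(byte_addresses):
    """Largest number of distinct words that map to the same bank.

    A degree of 1 means conflict-free; a degree of n means the access
    is serialized into roughly n steps. Threads reading the same word
    are treated as a broadcast and do not count as a conflict.
    """
    words = {addr // WORD_BYTES for addr in byte_addresses}
    per_bank = Counter(word % NUM_BANKS for word in words)
    return max(per_bank.values())


if __name__ == "__main__":
    for stride in (1, 2, 32):
        addrs = strided_access(stride)
        print(stride, global_transactions(addrs), bank_conflict_degree(addrs))
```

Under these assumptions, a unit-stride access touches a single segment and is conflict-free, while a stride of 32 words touches one segment per thread and maps every thread to the same bank. This is the kind of sensitivity to the access pattern that the abstract describes.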

ACM Transactions on Embedded Computing Systems (TECS) Association for Computing Machinery

http://www.deepdyve.com/lp/association-for-computing-machinery/memory-performance-estimation-of-cuda-programs-rd7ar0y0t0

Publisher: Association for Computing Machinery
ISSN: 1539-9087
DOI: 10.1145/2514641.2514648
Published: Sep 1, 2013
