Critical real-time systems require strict resource provisioning in terms of memory and timing. The constant need for higher performance in these systems has recently led industry to include GPUs. However, GPU software ecosystems are by their nature closed source, which forces system engineers to treat them as black boxes and complicates resource provisioning. In this work, we reverse engineer the internal operations of the GPU system software to improve the understanding of its observed behaviour and of how resources are managed internally. We present our methodology, which is incorporated in GMAI (GPU Memory Allocation Inspector), a tool that allows system engineers to determine accurately the amount of resources required by their critical systems and so avoid underprovisioning. We first apply our methodology to a wide range of GPU hardware from different vendors, showing its generality in obtaining the properties of GPU memory allocators. Next, we demonstrate the benefits of this knowledge for resource provisioning in two case studies from the automotive domain, where the actual memory consumption is up to 5.6× more than the memory requested by the application.
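As a rough illustration of why consumed memory can exceed requested memory, the sketch below models an allocator that rounds every request up to a fixed granularity. This is a simplified assumption for illustration only, not GMAI's methodology or the behaviour of any particular vendor's allocator; the function names, the 4 KiB granularity, and the workload are all hypothetical.

```python
from typing import Sequence


def rounded_size(requested: int, granularity: int) -> int:
    """Round a request up to the next multiple of an assumed granularity."""
    if requested <= 0 or granularity <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division without floating point, then scale back to bytes.
    return -(-requested // granularity) * granularity


def overhead_ratio(requests: Sequence[int], granularity: int) -> float:
    """Return consumed bytes divided by requested bytes for a set of allocations."""
    consumed = sum(rounded_size(r, granularity) for r in requests)
    return consumed / sum(requests)


# Hypothetical workload: eight 1 KiB buffers against an assumed 4 KiB granularity.
requests = [1024] * 8
print(overhead_ratio(requests, 4096))  # 4.0
```

Under this toy model, many small buffers inflate the footprint far more than a few large ones, which is why the allocator's real properties have to be known before memory can be provisioned safely.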
ACM Transactions on Embedded Computing Systems (TECS) – Association for Computing Machinery
Published: Sep 26, 2020
Keywords: GPUs