Antonio García-Guirado, Ricardo Pascual, Alberto Ros, José García (2011)
Energy-Efficient Cache Coherence Protocols in Chip-Multiprocessors for Server Consolidation. 2011 International Conference on Parallel Processing
Ke Bai, Di Lu, Aviral Shrivastava (2011)
Vector class on Limited Local Memory (LLM) multi-core processors. 2011 Proceedings of the 14th International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES)
P. Michaud, André Seznec, Damien Fetis, Yiannakis Sazeides, Theofanis Constantinou (2007)
A study of thread migration in temperature-constrained multicores. ACM Trans. Archit. Code Optim., 4
Article 71, Publication date: December 2015
(2010)
Intel core i7 processor extreme edition and intel core i7 processor datasheet, volume 1
S. Jung, Aviral Shrivastava, Ke Bai (2010)
Dynamic code mapping for limited local memory systems. ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors
F. Angiolini, F. Menichelli, Alberto Ferrero, L. Benini, M. Olivieri (2004)
A post-compiler approach to scratchpad mapping of code
T. Austin, E. Larson, Dan Ernst (2002)
SimpleScalar: An Infrastructure for Computer System Modeling. Computer, 35
James Smith (1981)
A study of branch prediction strategies
Ke Bai, Jing Lu, Aviral Shrivastava, Bryce Holton (2013)
CMSM: An efficient and effective Code Management for Software Managed Multicores. 2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)
Michael Baker, Amrit Panda, Nikhil Ghadge, Aniruddha Kadne, Karam Chatha (2010)
A performance model and code overlay generator for scratchpad enhanced embedded processors. 2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)
Loc Truong (2009)
Low power consumption and a competitive price tag make the six-core TMS320C6472 ideal for high-performance applications
Ke Bai, Aviral Shrivastava, Saleel Kudchadker (2011)
Stack data management for Limited Local Memory (LLM) multi-core processors. ASAP 2011 - 22nd IEEE International Conference on Application-specific Systems, Architectures and Processors
Matthew Guthaus, J. Ringenberg, Dan Ernst, T. Austin, T. Mudge, Richard Brown (2001)
MiBench: A free, commercially representative embedded benchmark suite. Proceedings of the Fourth Annual IEEE International Workshop on Workload Characterization. WWC-4 (Cat. No.01EX538)
Stefan Metzlaff, Irakli Guliashvili, S. Uhrig, T. Ungerer (2011)
A Dynamic Instruction Scratchpad Memory for Embedded Processors Managed by Hardware
Andhi Janapsatya, A. Ignjatović, S. Parameswaran (2006)
A novel instruction scratchpad memory optimization method based on concomitance metric. Asia and South Pacific Conference on Design Automation, 2006.
M. Balakrishnan, P. Marwedel, L. Wehmeyer, Nils Grunwald, R. Banakar, S. Steinke (2002)
Reducing energy consumption by dynamic copying of instructions onto onchip memory. 15th International Symposium on System Synthesis, 2002.
Bryce Holton, Ke Bai, Aviral Shrivastava, H. Ramaprasad (2014)
Construction of GCCFG for inter-procedural optimizations in Software Managed Manycore (SMM) architectures. 2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES)
Ke Bai, Aviral Shrivastava (2013)
A software-only scheme for managing heap data on limited local memory (LLM) multicore processors. ACM Trans. Embed. Comput. Syst., 13
Ke Bai, Aviral Shrivastava (2013)
Automatic and efficient heap data management for Limited Local Memory multicore architectures. 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)
Bernhard Egger, Seungkyun Kim, Choonki Jang, Jaejin Lee, S. Min, Heonshik Shin (2010)
Scratchpad Memory Management Techniques for Code in Embedded Systems without an MMU. IEEE Transactions on Computers, 59
Lian Li, Hui Feng, Jingling Xue (2009)
Compiler-directed scratchpad memory management via graph coloring. ACM Trans. Archit. Code Optim., 6
(2006)
Programmer's Guide: Software Development Kit for Multicore Acceleration Version 3
(2010)
Tom's Hardware Raw performance: SiSoftware sandra 2010 pro (GFLOPS)
Ke Bai, Aviral Shrivastava (2010)
Heap data management for limited local memory (LLM) multi-core processors. 2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)
Manish Verma, P. Marwedel (2006)
Overlay techniques for scratchpad memories in low power embedded processors. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 14
A. Martín (2010)
Intel Core i7-980X Extreme Edition A 3,33 GHZ: primera CPU con 6 núcleos
B. Dinechin, P. Massas, Guillaume Lager, Clément Léger, Benjamin Orgogozo, Jérôme Reybert, Thierry Strudel (2013)
A Distributed Run-Time Environment for the Kalray MPPA®-256 Integrated Manycore Processor
B. Flachs, S. Asano, S. Dhong, H. Hofstee, G. Gervais, Roy Kim, T. Le, Peichun Liu, J. Leenstra, J. Liberty, B. Michael, H. Oh, S. Müller, O. Takahashi, A. Hatakeyama, Yukio Watanabe, N. Yano, Daniel Brokenshire, M. Peyravian, V. To, E. Iwata (2006)
The microarchitecture of the synergistic processor for a cell processor. IEEE Journal of Solid-State Circuits, 41
Byn Choi, Rakesh Komuravelli, Hyojin Sung, Robert Smolinski, N. Honarmand, S. Adve, Vikram Adve, N. Carter, Ching-Tsun Chou (2011)
DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism. 2011 International Conference on Parallel Architectures and Compilation Techniques
A. Pabalkar, Aviral Shrivastava, Arun Kannan, Jongeun Lee (2008)
SDRM: simultaneous determination of regions and function-to-region mapping for scratchpad memories
Jing Lu, Ke Bai, Aviral Shrivastava (2013)
SSDM: Smart Stack Data Management for software managed multicores (SMMs). 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC)
Garo Bournoutian, A. Orailoglu (2011)
Dynamic, multi-core cache coherence architecture for power-sensitive mobile processors. 2011 Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)
Kaushik Vaidyanathan, Qiuling Zhu, L. Liebmann, K. Lai, Stephen Wu, Renzhi Liu, Yandong Liu, A. Strojwas, L. Pileggi (2014)
Exploiting sub-20-nm complementary metal-oxide semiconductor technology challenges to design affordable systems-on-chip. Journal of Micro/Nanolithography, MEMS, and MOEMS, 14
Bernhard Egger, Jaejin Lee, Heonshik Shin (2006)
Scratchpad memory management for portable systems with a memory management unit
M. Kistler, M. Perrone, F. Petrini (2006)
Cell Multiprocessor Communication Network: Built for Speed. IEEE Micro, 26
Lian Li, Lin Gao, Jingling Xue (2005)
Memory coloring: a compiler approach for scratchpad memory management. 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05)
(2012)
The SCC Programmer's Guide. https://communities.intel.com/servlet/JiveServlet/previewBody/ 5684-102-8-22523/SCCProgrammersGuide
Choonki Jang, Jaejin Lee, Bernhard Egger, Soojung Ryu (2012)
Automatic code overlay generation and partially redundant code fetch elimination. ACM Trans. Archit. Code Optim., 9
R. Banakar, S. Steinke, Bo-Sik Lee, M. Balakrishnan, P. Marwedel (2002)
Scratchpad memory: a design alternative for cache on-chip memory in embedded systems. Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627)
Sumesh Udayakumaran, A. Dominguez, R. Barua (2006)
Dynamic allocation for scratch-pad memory using compile-time decisions. ACM Trans. Embed. Comput. Syst., 5
Yi Xu, Yu Du, Youtao Zhang, Jun Yang (2011)
A composite and scalable cache coherence protocol for large scale CMPs
Martin Schoeberl (2009)
Time-predictable Cache Organization. 2009 Software Technologies for Future Dependable Distributed Systems
Efficient Code Assignment Techniques for Local Memory on Software Managed Multicores
JING LU, KE BAI, and AVIRAL SHRIVASTAVA, Arizona State University
Scaling the memory hierarchy is a major challenge as the number of cores in a multicore processor grows. Software Managed Multicore (SMM) architectures have emerged as one of the promising solutions. In an SMM architecture, there are no caches, and each core has only a local scratchpad memory [Banakar et al. 2002]. Because the local memory is usually small, large applications cannot be executed on it directly. Instead, the code and data of the task mapped to each core must be managed between global memory and local memory. This article addresses the problem of efficiently managing code on an SMM architecture. The primary requirement for generating efficient code assignments is a correct management cost model, which we provide by proposing a cost calculation graph. In addition, we develop two heuristics, CMSM (Code Mapping for Software Managed multicores) and CMSM_advanced, that result in efficient code management execution on the local scratchpad memory. Experimental results collected after executing applications from the MiBench suite [Guthaus et al. 2001] demonstrate that merely by adopting the correct management
ACM Transactions on Embedded Computing Systems (TECS) – Association for Computing Machinery
Published: Dec 8, 2015