Modern multiprocessor systems-on-chip (MpSoCs) offer tremendous power and performance optimization opportunities by tuning thousands of potential voltage, frequency and core configurations. As the workload phases change at runtime, different configurations may become optimal with respect to power, performance or other metrics. Identifying the optimal configuration at runtime is infeasible due to the large number of workloads and configurations. This paper proposes a novel methodology that can find the Pareto-optimal configurations at runtime as a function of the workload. To achieve this, we perform an extensive offline characterization to find classifiers that map performance counters to optimal configurations. Then, we use these classifiers and performance counters at runtime to choose Pareto-optimal configurations. We evaluate the proposed methodology by maximizing the performance per watt for 18 single- and multi-threaded applications. Our experiments demonstrate an average increase of 93%, 81% and 6% in performance per watt compared to the interactive, ondemand and powersave governors, respectively.
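The offline/runtime split described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier, so a nearest-centroid classifier stands in for it, and the sketch targets only the single performance-per-watt objective used in the evaluation. The counter names, configuration tuples, and all measurement values are hypothetical.

```python
from collections import defaultdict
from math import dist

# Hypothetical offline characterization data (all values are made up).
# Each sample pairs a counter vector (instructions per cycle, cache-miss rate)
# with measured (performance, power in watts) for each candidate configuration.
# A configuration is (frequency in GHz, number of active cores).
SAMPLES = [
    ((1.8, 0.02), {(2.0, 4): (9.0, 3.0), (1.0, 2): (2.0, 1.0)}),
    ((1.6, 0.03), {(2.0, 4): (8.0, 3.2), (1.0, 2): (2.0, 1.0)}),
    ((0.4, 0.30), {(2.0, 4): (3.0, 3.0), (1.0, 2): (2.5, 1.0)}),
    ((0.6, 0.25), {(2.0, 4): (3.5, 3.0), (1.0, 2): (2.4, 1.0)}),
]


def best_config(measurements):
    """Return the configuration with the highest performance per watt."""
    return max(measurements, key=lambda c: measurements[c][0] / measurements[c][1])


def train(samples):
    """Offline step: label each counter vector with its best configuration,
    then average the vectors per label to get one centroid per configuration."""
    groups = defaultdict(list)
    for counters, measurements in samples:
        groups[best_config(measurements)].append(counters)
    return {
        config: tuple(sum(column) / len(column) for column in zip(*vectors))
        for config, vectors in groups.items()
    }


def predict(centroids, counters):
    """Runtime step: pick the configuration whose centroid is nearest
    (Euclidean distance) to the currently observed counter values."""
    return min(centroids, key=lambda c: dist(centroids[c], counters))


if __name__ == "__main__":
    centroids = train(SAMPLES)
    # A compute-heavy phase and a memory-heavy phase observed at runtime.
    print(predict(centroids, (1.5, 0.05)))
    print(predict(centroids, (0.5, 0.28)))
```

The expensive search over configurations happens once, offline, in `train`; at runtime `predict` only compares the observed counters against one centroid per configuration, which is why this kind of lookup is cheap enough to repeat as workload phases change.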
ACM Transactions on Embedded Computing Systems (TECS) – Association for Computing Machinery
Published: Sep 27, 2017
Keywords: Basic blocks