We present a customization framework for embedded processors that uses application-specific information to specialize the processor's microarchitecture to the application's needs. The resulting increase in processor utilization leads to a low-cost system implementation without compromising performance requirements and to less custom hardware in a typical SOC. We illustrate these ideas through the branch resolution problem, which is known to impose severe performance degradation on control-dominated embedded applications. We present a customization approach for early branch resolution and subsequent folding. The microarchitecture captures the application-specific information through low-cost reprogrammable hardware, thus attaining the twin benefits of processor standardization and application-specific customization. Experimental results show that, for a representative set of control-dominated applications, processor cycles can be reduced by 3% to 22%, thus extending the scope of low-cost embedded processors in complex codesigns for control-intensive systems.
ACM Transactions on Embedded Computing Systems (TECS) – Association for Computing Machinery
Published: May 1, 2005