Application requirements, such as real-time response, are pushing wearable devices to use more powerful processors inside the system on chip (SoC). However, existing wearable devices are poorly suited to such demanding applications because of their limited performance, and conventional high-performance many-core architectures are not appropriate either because of the stringent power budget in this domain. We propose LOCUS, a low-power, customizable, many-core processor for next-generation wearable devices. LOCUS combines customizable processor cores with a customizable network on a message-passing architecture to deliver very competitive performance/watt: an average 3.1× improvement over the quad-core ARM processors used in state-of-the-art wearable devices. A combination of full-system simulation with representative applications from the wearable domain and RTL synthesis of the architecture shows that a 16-core LOCUS achieves an average 1.52× performance/watt improvement over a conventional 16-core shared-memory many-core architecture. A dynamic power management mechanism is proposed to further decrease the power consumption of both computation and communication, which improves the performance/watt of LOCUS by 1.17×.
ACM Transactions on Embedded Computing Systems (TECS) – Association for Computing Machinery
Published: Nov 14, 2017
Keywords: Customization