This article presents REGISTOR, a platform for regular expression grabbing inside storage. The main idea of REGISTOR is to accelerate regular expression (regex) search inside the storage device where large datasets reside, eliminating the I/O bottleneck. A dedicated hardware engine for regex search is integrated into a flash SSD and processes data on the fly as it moves from NAND flash to the host. To make regex search keep pace with the internal bus speed of a modern SSD, the REGISTOR hardware uses a deep pipeline consisting of a file semantics extractor, a matching candidates finder, regex matching units (REMUs), and a results organizer. Each stage of the pipeline is designed to exploit as much parallelism as possible. To make REGISTOR readily usable by high-level applications, we have developed a set of APIs and libraries in Linux that allow REGISTOR to process files in the SSD by efficiently recombining separate data blocks into files. A working prototype of REGISTOR has been built in our newly designed NVMe SSD. Extensive experiments and analyses show that REGISTOR achieves high throughput, reduces the I/O bandwidth requirement by up to 97%, and reduces CPU utilization by as much as 82% for regex search in large datasets.
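The four pipeline stages named in the abstract can be loosely illustrated in software. The sketch below is only an analogy in Python, not the REGISTOR hardware design or its Linux API: every function name is hypothetical, records are assumed to be newline-delimited, and the candidate finder is assumed to be a simple literal prefilter, which is valid only when the literal is a substring that every match of the regex must contain.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def extract_records(blocks: Iterable[bytes]) -> Iterator[bytes]:
    """Stand-in for the file semantics extractor: stitch raw blocks into lines."""
    tail = b""
    for block in blocks:
        # A record may straddle two blocks, so carry the unfinished tail forward.
        *lines, tail = (tail + block).split(b"\n")
        yield from lines
    if tail:
        yield tail


def find_candidates(records: Iterable[bytes], literal: bytes) -> Iterator[Tuple[int, bytes]]:
    """Stand-in for the matching candidates finder: cheap literal prefilter (assumption)."""
    for line_no, record in enumerate(records):
        if literal in record:
            yield line_no, record


def match_candidates(candidates: Iterable[Tuple[int, bytes]],
                     pattern: "re.Pattern[bytes]") -> Iterator[Tuple[int, bytes]]:
    """Stand-in for the regex matching units: full regex check on candidates only."""
    for line_no, record in candidates:
        if pattern.search(record):
            yield line_no, record


def organize_results(matches: Iterable[Tuple[int, bytes]]) -> List[Tuple[int, str]]:
    """Stand-in for the results organizer: only matching records reach the caller."""
    return [(line_no, record.decode("utf-8", "replace")) for line_no, record in matches]


def search_stream(blocks: Iterable[bytes], literal: bytes, regex: bytes) -> List[Tuple[int, str]]:
    """Chain the four stages lazily, so data is filtered while it streams through."""
    pattern = re.compile(regex)
    records = extract_records(blocks)
    candidates = find_candidates(records, literal)
    return organize_results(match_candidates(candidates, pattern))


if __name__ == "__main__":
    # The second record is deliberately split across the two blocks.
    blocks = [b"GET /index\nPOST /lo", b"gin 200\nGET /login 404\n"]
    print(search_stream(blocks, b"login", rb"^GET /login"))
```

Because the stages are chained generators, each record flows through all four steps before the next block is read, which mirrors the idea of filtering data in flight so that only matching results need to cross the host interface.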
ACM Transactions on Storage (TOS) – Association for Computing Machinery
Published: Mar 26, 2019
Keywords: Regular expressions