CCS Concepts: • Computing methodologies → Neural networks; Natural language processing; • Information systems → Entity resolution
Neural Networks for Entity Matching: A Survey
Entity matching is the problem of identifying which records refer to the same real-world entity. It has been actively researched for decades, and a variety of approaches have been developed. Even today, it remains a challenging problem, and there is still considerable room for improvement. In recent years, new methods based on deep learning techniques for natural language processing have emerged. In this survey, we present how neural networks have been used for entity matching. Specifically, we identify which steps of the entity matching process existing work has targeted using neural networks, and provide an overview of the different techniques used at each step. We also discuss the contributions of deep learning to entity matching compared to traditional methods, and propose a taxonomy of deep neural networks for entity matching.
ACM Transactions on Knowledge Discovery from Data (TKDD) – Association for Computing Machinery
Published: Apr 21, 2021
Keywords: Deep learning