Theft of intellectual property is a growing problem, one that is exacerbated by the fact that a successful compromise of an enterprise might only become known months after the hack. A recent solution called FORGE addresses this problem by automatically generating N “fake” versions of any real document, so that the attacker has to determine which of the N + 1 documents that they have exfiltrated from a compromised network is real. In this article, we remove two major drawbacks in FORGE. (i) FORGE requires ontologies in order to generate fake documents; in the real world, however, ontologies, especially good ones, are rarely available. The WE-FORGE system proposed in this article completely eliminates the need for ontologies by using distance metrics on word embeddings instead. (ii) FORGE generates fake documents by first identifying “target” concepts in the original document and then substituting “replacement” concepts for them. However, we will show that this can lead to sub-optimal results, because target concepts are selected without knowing the availability and/or quality of the replacement concepts. Our WE-FORGE system addresses this problem in two possible ways by performing a joint optimization to select concepts and replacements simultaneously. We conduct a human study involving both computer science and chemistry documents and show that WE-FORGE successfully deceives adversaries.
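As a rough illustration of the two ideas in the abstract, the sketch below picks replacements by distance between word embeddings and scores (target, replacement) pairs together rather than fixing the targets first. It is not the WE-FORGE algorithm, whose optimization is not described in this abstract. The toy vectors, the distance band, the greedy selection, and the names `EMBEDDINGS`, `candidate_pairs`, and `select_replacements` are all assumptions made for the example.

```python
import math

# Toy 3-dimensional vectors invented for illustration (an assumption);
# a real system would load pretrained word embeddings instead.
EMBEDDINGS = {
    "palladium": (0.9, 0.1, 0.0),
    "platinum": (0.8, 0.2, 0.1),
    "nickel": (0.7, 0.3, 0.2),
    "catalyst": (0.1, 0.9, 0.1),
    "reagent": (0.2, 0.8, 0.2),
    "banana": (0.0, 0.1, 0.9),
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def candidate_pairs(doc_words, min_dist, max_dist):
    """List (distance, target, replacement) triples inside the distance band.

    The band is a hypothetical heuristic: a replacement should be close
    enough to stay believable but not so close that it is a near-duplicate.
    """
    pairs = []
    for target in doc_words:
        for word, vec in EMBEDDINGS.items():
            if word in doc_words:
                continue  # do not reuse words already in the document
            dist = cosine_distance(EMBEDDINGS[target], vec)
            if min_dist <= dist <= max_dist:
                pairs.append((dist, target, word))
    return pairs


def select_replacements(doc_words, k, min_dist=0.01, max_dist=0.2):
    """Greedily choose up to k pairs, considering targets and replacements together.

    A target with no acceptable replacement is never chosen, which is the
    failure mode of picking targets before looking at replacements.
    """
    chosen = {}
    for _, target, word in sorted(candidate_pairs(doc_words, min_dist, max_dist)):
        if target not in chosen and word not in chosen.values():
            chosen[target] = word
        if len(chosen) == k:
            break
    return chosen
```

For instance, with the toy vectors above, `select_replacements(["banana", "palladium"], k=2)` returns only a replacement for "palladium", because no word lies within the band around "banana". A target-first procedure could have committed to "banana" and then been left with a poor substitute.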
ACM Transactions on Management Information Systems (TMIS) – Association for Computing Machinery
Published: Feb 2, 2021
Keywords: AI security