The World Wide Web as Complex Data Set: Expanding the Digital Humanities into the Twentieth Century and Beyond through Internet Research

Publisher: Edinburgh University Press
Copyright: © Edinburgh University Press 2016
Subject: Special Issue: The Future of Digital Methods for Complex Datasets; Historical Studies
ISSN: 1753-8548
eISSN: 1755-1706
DOI: 10.3366/ijhac.2016.0162

Abstract

While intellectual property protections effectively frame digital humanities text mining as a field primarily for the study of the nineteenth century, the Internet offers an intriguing object of study for humanists working in later periods. As a complex data source, the World Wide Web presents its own methodological challenges for digital humanists, but lessons learned from projects studying large nineteenth-century corpora offer helpful starting points. Complicating matters further, legal and ethical questions surrounding web scraping, or the practice of large-scale data retrieval over the Internet, will require humanists to frame their research to distinguish it from commercial and malicious activities. This essay reviews relevant research in the digital humanities and new media studies in order to show how web scraping might contribute to humanities research questions. In addition to recommendations for addressing the complex concerns surrounding web scraping, this essay also provides a basic overview of the process and some recommendations for resources.

Journal

International Journal of Humanities and Arts Computing, Edinburgh University Press

Published: Mar 1, 2016
