The Hansard hazard: gauging the accuracy of British parliamentary transcripts

Corpora, Volume 2 (2): 187 – Nov 1, 2007

Publisher
Edinburgh University Press
Copyright
© Edinburgh University Press
ISSN
1749-5032
eISSN
1755-1676
DOI
10.3366/cor.2007.2.2.187

Abstract

Large databases of transcribed speech, downloadable from the Internet, are a corpus linguist's dream. They turn into a corpus linguist's nightmare, however, when the transcriptions are not linguistically accurate. In this paper I assess the suitability of the Hansard parliamentary transcripts (200 million words, downloadable) as a corpus linguistic resource, comparing a sample of the official transcript to a transcript made from a recording of a House of Commons session. The findings are that, as could be expected from earlier research, the transcripts omit performance characteristics of spoken language, such as incomplete utterances or hesitations, as well as any type of extrafactual, contextual talk (e.g., about turn-taking). Moreover, however, the transcribers and editors also alter speakers' lexical and grammatical choices towards more conservative and formal variants. Linguists ought, therefore, to be cautious in their use of the Hansard transcripts and, generally, in the use of transcriptions that have not been made for linguistic purposes.

Journal

Corpora, Edinburgh University Press

Published: Nov 1, 2007
