1 Introduction

Factors in sound change are not a new topic in linguistics. However, they remain a matter of major debate, since different theories hold contradictory views. Frequency, morphosyntactic structure, and word class are the three factors most often referred to in discussions of sound change (Thomsen 1879, Postal 1968, Phillips 1983, 2001, Bybee 1985, 2012, 2017, Crowley 1997, Pierrehumbert 2001, Donohue 2005, Ogura 2012, Smith 2012). The present study is motivated by two points: (1) the long debate concerning the role of these three factors in sound change has yet to reach a conclusion; (2) frequency change, a potential factor in sound change, has generally been ignored in previous studies. This article asks whether the three factors have a role in sound change and, if so, what exactly that role is.

The present article focuses on palatalization in Middle Mandarin (c.1500–c.1800), for three reasons. First, palatalization began at the beginning of the eighteenth century and was completed no later than the beginning of the nineteenth century (Morrison 1815–23, Edkins 1857, 1864, Akiyasu 1967). The whole process thus took approximately one century, which makes it possible to explore its different phases. Second, the process involved a relatively large number of morphemes, which makes statistical analysis possible. Third, previous studies have mainly used data from the Indo-European language family. The sound change mechanism is supposed to be universal, and thus data from various language families should be investigated. In this respect, Mandarin is a good candidate, since it belongs to the Sino-Tibetan language family and has not been widely discussed.
This article uses related classic works and a corpus to collect data on palatalization, carries out statistical analysis, and identifies factors in this sound change.

This article starts with a sketch of factors in sound change documented in previous studies and then presents the trajectory of palatalization in Middle Mandarin (c.1500–c.1800) and data extraction methods in Section 3. Section 4 focuses on the statistical analysis of corpus data and presents my own proposals concerning factors in sound change. This article arrives at its conclusion in Section 5.

2 Review of previous studies

A growing body of literature has argued that an intersection of factors should be considered in discussions of sound change (see, e.g., Grimm 1822, Jespersen 2007, Malkiel 1968, 1969–70, 1976, Postal 1968, Stemberger and MacWhinney 1986, Crowley 1997, Kapatsinski 2010, Wedel et al. 2013). Factors in sound change documented in previous studies include frequency, morphosyntactic structure, and word class. For ease of exposition, this section starts with the word class factor and ends with the most complex factor, frequency.

2.1 Word class

In the second half of the twentieth century, a number of studies demonstrated a possible relation between word class and sound change. For example, King (1969) discusses final schwa deletion in Yiddish, notices that this rule does not apply when the final schwa is in an adjective inflectional ending, and concludes that this is evidence of morphologically conditioned sound change (reviewed in Robinson and van Coetsem 1973; see also Anttila 1972). More recently, Crowley (1997) focuses on Southern Paamese and Northern Paamese, two languages of Central Vanuatu, and reports that Southern Paamese /l/ corresponds to Northern Paamese /i/, /l/, or zero in all word classes except verbs. According to Crowley (1997), this process shows that at least some sound changes only affect certain word classes.
Although both King (1969) and Crowley (1997) claim that word class has a role in sound change, their claim is somewhat vague: neither states clearly the exact relationship between word class and sound change. Phillips (1983, 2006) is also a proponent of the role of word class in sound change and gives a much more detailed description of its exact role. Phillips (1983, 2006) classifies words into two categories, content words and function words: content words mainly include adjectives, adverbs, nouns, verbs, etc.; function words refer to a wide range of words that normally receive low sentence stress, such as adverbial conjunctions, auxiliary verbs, determiners, prepositions, quantifiers, and so on. Phillips (1983, 2006) points out that a strengthening sound change, the change from /d/ to /t/ in the Old High German Isidor, affected function words last, and goes further to state that content words tend to be affected by strengthening sound changes first and function words by weakening sound changes first.

Though intriguing, the role of word class in sound change has faced challenges (Jasanoff 1971, Blevins and Lynch 2009, Brown 2013). Blevins and Lynch (2009) claim that the sound change discussed in Crowley (1997) applies to all word classes including verbs, but the phonological and morphological components in verb inflections restore the sound change in verbs and make verbs appear to be exceptions. From another perspective, Jasanoff (1971) claims that what appears to be morphologically conditioned is in fact regular sound change partially obscured by subsequent analogy.

At the same time, some scholars take a middle position, claiming that no conclusion can be drawn yet and that further investigation is necessary (Sihler 2000, Campbell 2013, Manker 2015).
For example, Manker (2015) states that many examples in Phillips (2006) appear to be limited to certain word classes because these words happen to occur more often in the phonetic environment favorable to the change. Nevertheless, Manker (2015) does not draw a decisive conclusion, stating that the role of morphological factors in sound change cannot be absolutely ruled out and needs further extensive investigation (see also Sihler 2000, Campbell 2013).

2.2 Morphosyntactic structure

The second factor to be reviewed is morphosyntactic structure. Postal (1968), who is credited with first noticing this factor, focuses on Mohawk, the language spoken by the Mohawk people, and notices that [e] is regularly inserted into [kw] sequences except when the [k] is the first person morpheme and the [w] is the first element of the plural morpheme. According to Postal (1968), this example shows that superficial morphosyntactic structure can also condition sound change (reviewed in Fudge 1972; see also King 1969). Later scholars more plainly frame morphosyntactic structure as the division between bound morphemes and free morphemes. Donohue (2005) discusses the voicing of voiceless stops in Palu’e and sound changes in Modern Indonesian and Bali-Vitu (Austronesian, Oceanic) and points out that the sound change advances more completely in a small set of bound morphemes than in free lexemes. Similarly, Bybee (2002) focuses on word-final /t, d/ deletion in American English and concludes that bound morphemes can affect the deletion process. Guy (1991b) also focuses on word-final /t, d/ deletion in American English and gives a more detailed conclusion: underived or monomorphemic words, such as mist and pact, undergo deletion at a higher rate than inflected forms such as past tense verbs like missed and packed (see also Labov et al. 1968, Fasold 1972, Guy 1991a).
Baranowski and Turton (2020) report a similar result for word-final /t, d/ deletion in British English. Different scholars hold different views concerning the exact effect of morphosyntactic structure: although Guy (1991a, 1991b) claims that underived or monomorphemic words lead /t, d/ deletion in American English, Donohue (2005) reports that bound morphemes were at the forefront of the voicing of voiceless stops in Palu’e.

In contrast, Renwick et al. (2014) focus on word-final /t, d/ deletion in British English and report that their results show little support for the role of morphosyntactic structure. What is most surprising may be that completely opposite conclusions concerning the role of morphosyntactic structure have been drawn from the same phenomenon, word-final /t, d/ deletion in English.

To sum up, the controversies concerning the morphosyntactic structure factor are, first, whether it has a role in sound change and, second, whether bound morphemes or free morphemes change faster.

2.3 Frequency

Word class and morphosyntactic structure are not the only two factors put forward in previous studies. The frequency factor is possibly the one that has generated the most discussion. There has been a long-running argument over whether frequency is relevant at all. Even scholars who argue that frequency plays a role in sound change do not agree on its exact effect. In the following, the frequency effect is approached from two perspectives: high versus low frequency, and relative frequency.

2.3.1 High frequency vs low frequency

The frequency effect has been studied in detail since the nineteenth century. Grimm (1822) discusses the connection between irregularity and high-frequency auxiliary verbs. Thomsen (1879) gives a few frequent Romanic verbs, pointing out that they are exceptions to normal phonetic development (see also Thomsen 1920).
Jespersen (2007) more plainly states that frequent words, due to their common and recurrent usage, tend to be exceptions to sound change.

What is interesting is that more than a century has passed, and yet the frequency effect is still not agreed on or well understood. Bybee (1985, 2000, 2002), Pierrehumbert (2001), and Smith (2012), among others, claim that frequent words lead sound change and less frequent words follow, a view different from that of Grimm (1822), Thomsen (1879, 1920), and Jespersen (2007). Bybee (2002) studies the deletion of final /t/ and /d/ in American English and finds that the deletion rates in high-frequency words are statistically higher than in low-frequency words. According to Bybee (2002), this is evidence that frequent words lead sound change. Hay and Foulkes (2016) report that frequent words lead a change in the pronunciation of word-medial intervocalic /t/ in New Zealand English. In contrast, other scholars claim that low-frequency words are at the forefront of changes and high-frequency words follow (Labov 1989, Bermúdez-Otero 2007, Hay et al. 2015). Labov (1989) focuses on Philadelphia a-tensing and reports that the most frequent words lag behind in the shift to the tense class. Hay et al. (2015) observe pronunciation changes in New Zealand English over a 130-year period and claim that their data show that low-frequency words led these changes and high-frequency words lagged behind.

Unlike the scholars discussed in the preceding two paragraphs, those discussed in this paragraph doubt the role of frequency in sound change. Fruehwald et al. (2013) assert that phonological change progresses in every context at the same rate over time irrespective of frequency and call this the constant rate effect (see also Kroch 1989, 1994, Pintzuk 1991, Santorini 1992, 1993, Dinkin 2008).
In addition, Zellou and Tamminga (2014) study co-articulatory vowel nasality in Philadelphia English and claim that their study shows that the changes in nasality are independent of any frequency effect. Similarly, Labov (2010) examines the role of frequency in several different phonetically gradual changes and concludes that the role of frequency in these changes is minimal, if not zero, a view shared by Kiparsky (2014).

2.3.2 Relative frequency

Wedel et al. (2013) approach the frequency effect from a different perspective, proposing that relative frequency, instead of absolute frequency, is the factor relevant to sound changes. Wedel et al. (2013) use 19 systems of phoneme contrasts from nine different languages to explore factors in phoneme merger, such as the cot ∼ caught merger in North American English, and report that no measures based on absolute word frequency could satisfactorily predict merger probability for their data. Instead, Wedel et al. (2013) report that the frequency ratio, or the relative frequency of minimal pair members, could predict merger well, with minimal pairs whose members are more similar in frequency having less tendency to merge.

2.4 Proposal of this article

As reviewed in Section 2.1, the role of word class in sound change is yet to be determined and is thus a focus of this study. Similarly, the role of morphosyntactic structure in sound change is not fully understood either. As noted in Section 2.2, this article first needs to answer whether morphosyntactic structure has a role in sound change; if the answer is affirmative, it then needs to find out whether bound morphemes or free morphemes lead sound change.

Concerning the most debated factor, frequency, this article explores whether it has a role in sound change. This article extracts numerical frequency data and investigates their correlation with sound change by statistical analysis.
In addition, the present study refers to the concept of relative frequency. Unlike Wedel et al. (2013), the present study does not refer to minimal pair members and thus cannot give relative frequency from the perspective of minimal pairs. Instead, the study compares the frequencies of all related morphemes and assigns each to one of three levels: low, medium, and high.

The present study also has its own proposal: the factor of frequency change. Previous studies have mainly focused on the frequency effect and have paid little attention to frequency change. If frequency is correlated with sound change, however, frequency change may also be associated with sound change. To exemplify, Pierrehumbert (2001), among others, claims that sound change usually affects the most frequent lexical items first. Assuming this to be correct, lexical items with increased frequency during the time concerned seem more likely to undergo a certain sound change earlier than lexical items with decreased frequency. The reason is that lexical items with increased frequency seem to be more active and more accessible to a related sound change than lexical items with decreased frequency during the time concerned.

The present study also refers to the concept of relative frequency change: it compares the frequency of a related morpheme in the time of palatalization with its frequency in an earlier period to calculate the frequency change of this morpheme, and assigns it to one of two levels, decreased frequency and increased frequency.

This section has presented the factors referred to in previous studies and my own proposals. The controversies concerning these factors can be resolved only by a close look at empirical data. Section 3 presents a brief description of the language data for the present study.

3 Data

This article takes palatalization in Northern Mandarin (AD 1324–Present) as its language sample.
Palatalization in Northern Mandarin stretched from the beginning of the eighteenth century until the beginning of the nineteenth century and was completed without exception (Morrison 1815–23, Edkins 1857, 1864, Wang 1957, Akiyasu 1967, Chen 1976, Pulleyblank 1984). It can provide detailed information concerning morphemes that were at the forefront of this sound change, its contour, and so on. Section 3.1 briefly describes this process, and Section 3.2 expounds data extraction from CNCORPUS (Xiao 2010) and Piaotongshi xinshi yanjie (New Edition of Vernacular Exposition of Pak the Interpreter; Kim 2005).

3.1 Trajectory of palatalization in Northern Mandarin

Palatalization concerns the velars ([k], [kh], and [x]): the velars were palatalized before high front vowels and glides to [ʨ], [ʨh], and [ɕ] (Pulleyblank 1984; see also Chen 1976, Li 1999, Handel 2017). The velar palatalization rule is given in (1) according to Chen (1976), Pulleyblank (1984), Li (1999), and Handel (2017).

Different scholars use slightly different symbols for the environment of the velar palatalization rule. Some scholars use [i] and [j], while others use [i] and [y]. This article follows Pulleyblank (1984) and indicates the environment uniformly as [i] and [j].

(1) Velar palatalization rule
    k → ʨ / ___ i, j
    kh → ʨh / ___ i, j
    x → ɕ / ___ i, j

Palatalization as illustrated in (1) is a regular sound change since it does not have exceptions: all morphemes that met the environment have been palatalized (Wang 1957, Pulleyblank 1984). After the completion of palatalization, the alveolopalatal triplet ([ʨ], [ʨh], [ɕ]) is in complementary distribution with the velar triplet ([k], [kh], and [x]) in Modern Standard Chinese (c.1900–Present), with the alveolopalatals before high front vowels and the velars in all other phonetic environments.
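The rule in (1) can be sketched as a small string rewrite. This is only an illustration under stated assumptions: syllables are written as plain strings, aspirated [kh] is written as the digraph `kh`, and the function name `palatalize` is hypothetical. The sketch covers only the change of the initial; the loss of final stops is a separate process and is not modeled.

```python
import re

# Mapping taken from rule (1). The digraph "kh" stands for aspirated [kh]
# (an assumption about how syllables are encoded as strings).
PALATALIZED = {"kh": "ʨh", "k": "ʨ", "x": "ɕ"}

# A syllable-initial velar followed by [i] or [j]. "kh" is listed before "k"
# so that the aspirated stop is matched as a unit.
_RULE = re.compile(r"^(kh|k|x)(?=[ij])")


def palatalize(syllable: str) -> str:
    """Apply rule (1) to the initial; return other syllables unchanged."""
    return _RULE.sub(lambda m: PALATALIZED[m.group(1)], syllable)


if __name__ == "__main__":
    for s in ["kip", "khjt", "xjeh", "kan", "khan"]:
        print(s, "->", palatalize(s))
```

Syllables whose initial velar is not followed by [i] or [j], such as the hypothetical `kan`, are returned unchanged, which mirrors the complementary distribution described above.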
Three syllables are exemplified in Table 1 to illustrate the velar palatalization rule in (1).

Table 1 Palatalization in Mandarin
Early Mandarin (c.1300–c.1500) | Middle Mandarin (c.1500–c.1800) | Gloss
[kip] | [tɕi] | ‘urgent’ 急
[khjt] | [tɕhi] | ‘beg, ask’ 乞
[xjeh] | [ɕi] | ‘opera’ 戏

The transcriptions in Table 1 are given according to Baxter and Sagart (2014), with reference to Zhongyuan Yinyun (Rhymes of the Central Plain; Zhou 1996) and reconstructions of Zhongyuan Yinyun in Pulleyblank (1984, 1991) and Chou (1993). In Table 1, the phoneme [k] of the morpheme ‘urgent’ (急) was palatalized to [ʨ], as described by the velar palatalization rule in (1). In fact, the morpheme ‘urgent’ (急) not only underwent palatalization but also lost its final consonant [p]. In this way, [kip] became [ʨi] in Middle Mandarin (c.1500–c.1800). The deletion of consonants in final position seems especially common across languages (Mortensen 2012), and Mandarin is no exception. The system of codas, namely, final consonants, was dramatically reduced when Middle Chinese (AD 601–AD 1336) evolved into Mandarin (AD 1324–Present) (Wang 1957, 1985, Pulleyblank 1984, 1991). All final voiceless stop consonants, namely, [k], [p], and [t], disappeared altogether (Chen 1976, Pulleyblank 1984). As a result, syllables like [kip] and [khjt] in Table 1 finally became [tɕi] and [tɕhi], respectively. A detailed discussion of the loss of final consonants is omitted here since it is not the focus of the present article. The morpheme ‘beg, ask’ (乞) was [khjt] in Early Mandarin (c.1300–c.1500). It not only underwent palatalization but also lost its final consonant [t]. In this way, the morpheme ‘beg, ask’ (乞) finally became [tɕhi] in Middle Mandarin (c.1500–c.1800).
The same process applies to the morpheme ‘opera’ (戏).

The period of Mandarin (AD 1324–Present) is usually divided into three phases: Early Mandarin (c.1300–c.1500), Middle Mandarin (c.1500–c.1800), and Modern Standard Chinese (c.1900–Present) (Dong 1954, Stimson 1966, Shi 2002, Shen 2015, 2020). The classic book Zhongyuan Yinyun (Rhymes of the Central Plain; Zhou 1996) marks the beginning of Early Mandarin (c.1300–c.1500; Norman 1988, Li 1999, Shen 2015, 2020). The focus of this article, palatalization, falls in the period of Middle Mandarin (c.1500–c.1800).

Palatalization was first recorded in the anonymous Chinese rhyme dictionary Yuanyin Zhengkao (On the alveolar and alveolo-palatal consonants; 2015). This dictionary contains transcriptions in the Manchu alphabet, a phonographic system. The dictionary gives the pronunciations of all related morphemes as [ki-], [khi-], and [xi-], since it takes the position that the palatalized pronunciations [ʨi], [ʨhi], and [ɕi] are not correct and argues that the pronunciations [ki-], [khi-], and [xi-] should be followed. The argument in this dictionary provides evidence that palatalization had begun before the middle of the eighteenth century, but it does not provide information concerning which morphemes had undergone palatalization. It was not until the rhyme dictionary Lishi Yinjian (The rhyme book of Liruzhen; Li 1992) that palatalization was shown to have been completed.
In sum, the discussion in Section 3.1 presents two important points in time: the period before the middle of the eighteenth century as the beginning of palatalization and the beginning of the nineteenth century as the time of its completion.

3.2 Data extraction

The most reliable sources of data seem to be corpora with data from the related period, together with dictionaries and historical books compiled by native speakers of Mandarin during the time concerned.

3.2.1 Dictionaries and historical books

The choice of dictionaries and books from the time concerned is not as straightforward as it appears. As noted in Section 3.1, Yuanyin Zhengkao (2015), the first Chinese rhyme dictionary that recorded palatalization, does not indicate which morphemes had been palatalized. It only states that palatalized pronunciations are not correct. In addition, dictionaries and books compiled by Chinese scholars before the twentieth century use fanqie (反切), a traditional method of indicating the pronunciation of a Chinese character by using two other Chinese characters. For example, the pronunciation of the character 唐 might be represented as follows: 徒郎. It roughly means that the initial of 唐 is the same as that of 徒, and the final of 唐 is the same as that of 郎. This representation is circular and thus makes it difficult to determine the pronunciations of characters and the morphemes they represent, since the Chinese writing system is a logographic system, not a phonographic system like that of English. In contrast, Chinese language learning textbooks used in Korea use Hangul (the Korean alphabet). They can provide a relatively clear picture of the pronunciation of Mandarin during the time concerned.

Piaotongshi Xinshi Yanjie (New Edition of Vernacular Exposition of Pak the Interpreter; Kim 2005) is the first Chinese language learning textbook used in Korea to show palatalized pronunciations in Mandarin (Kim 1989, Jin 2001, Lin 2010).
Chinese characters in this textbook are transcribed in Hangul (the Korean alphabet). The book was compiled in Beijing, the capital of China, and focuses on oral Mandarin at that time (Kim 1989, Lin 2010). As stated in Section 3.1, palatalization was first recorded in the anonymous Chinese rhyme dictionary Yuanyin Zhengkao (2015). Piaotongshi xinshi yanjie (Kim 2005) was compiled in almost the same period as the rhyme dictionary. Thus, it can provide information about morphemes that were at the forefront of palatalization. This study first relies on Piaotongshi xinshi yanjie (Kim 2005) to locate morphemes that had been palatalized in the middle of the eighteenth century and those that had not. Take the two morphemes ‘urgent’ (急) and ‘opera’ (戏) in Table 1 as examples. The morpheme ‘urgent’ is transcribed as 지 in Piaotongshi xinshi yanjie, or [tɕi] in the International Phonetic Alphabet. Therefore, this morpheme had been palatalized by the middle of the eighteenth century and was at the forefront of palatalization according to Piaotongshi xinshi yanjie. The morpheme ‘opera’ is transcribed as 히 ([xi]), which means that the morpheme had not been palatalized at the time of the book. Thus, the morpheme ‘opera’ lagged behind in palatalization. In this way, morphemes leading palatalization and morphemes that lagged behind can be identified. The next step is to compare the leading morphemes with the lagging morphemes to locate factors involved in palatalization.

3.2.2 Corpus

As noted, three factors, word class, morphosyntactic structure, and frequency, are of interest here. This study uses CNCORPUS to extract frequency data. CNCORPUS is composed of two databases: an Old Chinese and Middle Chinese database and a Mandarin database. It also permits searching for data according to Chinese dynasties. Another benefit of the corpus is its size, 120 million Chinese characters from a wide range of sources.
However, CNCORPUS does not give phonetic transcriptions of these Chinese characters. Therefore, the exact phonetic transcription of the related morphemes for this article relies on the book Piaotongshi xinshi yanjie.

As noted, CNCORPUS permits searching for data according to Chinese dynasties. The book Piaotongshi xinshi yanjie (Kim 2005) was compiled in the middle of the eighteenth century. Palatalization ended at the beginning of the nineteenth century (Li 1992, Morrison 1815–23, Edkins 1857, 1864, Akiyasu 1967). The two important time points, the middle of the eighteenth century and the beginning of the nineteenth century, fall within the Qing Dynasty (1644–1912). Accordingly, the focus was on the frequency data in the Qing Dynasty (1644–1912). In addition, because frequency change was assumed to be a factor in sound change, frequency before the time concerned was required to calculate frequency change. The Qing Dynasty (1644–1912) was preceded by the Ming Dynasty (1368–1644). However, data for the Yuan Dynasty (1271–1368) and data for the Ming Dynasty (1368–1644) are combined in CNCORPUS. Thus, data from both the Yuan and Ming Dynasties (1271–1644; hereafter the YM Dynasties) were extracted. In other words, the focus was mainly on the frequency data in the Qing Dynasty (1644–1912) and the frequency change from the YM Dynasties (1271–1644) to the Qing Dynasty (1644–1912). Take the morpheme ‘urgent’ in Table 1 as an example again: its frequency is 3,592 (1644–1912) and 4,684 (1271–1644). Its frequency change is thus −1,092, which means that its frequency decreased from 1271–1644 to 1644–1912. Concerning the morpheme ‘opera’, its frequency is 1,072 (1644–1912).
Since its frequency is 1,045 between 1271 and 1644, it has an increased frequency in 1644–1912 compared to its frequency in 1271–1644.

3.3 Methodology

This study first relies on Piaotongshi xinshi yanjie (Kim 2005) to locate morphemes that had been palatalized in the middle of the eighteenth century and those that had not. Then the article uses CNCORPUS to look for related information concerning both palatalized morphemes in the middle of the eighteenth century and unpalatalized morphemes in the same period. Finally, the article uses GraphPad Prism version 8.0.0 for Windows (GraphPad 2019; henceforth GraphPad software) to carry out statistical analysis to locate factors in palatalization.

According to the results of my manual survey, shown in Table 2, 70.21% of morphemes of [ki-] in Early Mandarin (c.1300–c.1500) have palatalized pronunciations in Piaotongshi xinshi yanjie (Kim 2005), 33 morphemes of 47; 66.67% of morphemes of [khi-] in Early Mandarin have palatalized pronunciations, 20 morphemes of 30. A total of 51.61% of morphemes of [xi-] in Early Mandarin have palatalized pronunciations, 16 morphemes of 31. The overall palatalization rate in Piaotongshi xinshi yanjie (Kim 2005) is 63.89%. Palatalization had been completed in Lishi Yinjian (Li 1992). The trajectory of palatalization is shown in Figure 1.

Table 2 Percentages of unpalatalized and palatalized morphemes in Piaotongshi xinshi yanjie (Kim 2005)
Initial (unpalatalized / palatalized) | Unpalatalized | Palatalized
[ki-] / [tɕi] | 29.79% (14/47) | 70.21% (33/47)
[khi-] / [tɕhi] | 33.33% (10/30) | 66.67% (20/30)
[xi-] / [ɕi] | 48.39% (15/31) | 51.61% (16/31)
Total | 36.11% (39/108) | 63.89% (69/108)

Figure 1 The trajectory of palatalization.

It is necessary at this point to emphasize that morphemes of [ki-], [khi-], and [xi-] with the combination of one of the final voiceless stops ([k], [p], and [t]) and morphemes of [ʨi], [ʨhi], and [ɕi] are included in this study.
More specifically, morphemes of [kik], [kip], [kit], [khik], [khip], [khit], [xik], [xip], [xit], [ʨi], [ʨhi], and [ɕi] are considered for the present study. Morphemes with a complex nucleus or a final other than [k], [p], or [t] are not included in this study since they may also be involved in other sound changes. The present study tries to exclude other sound change processes as far as possible in order to focus solely on palatalization. For example, morphemes of [kiak] are not included since they have a complex nucleus [ia]. Morphemes of [ɕin] are not included either because they have the final [n]. From here on, [ki-] will be used to stand for [kik], [kip], and [kit] for convenience of exposition. The same applies to [khi-] and [xi-].

In order to locate factors in palatalization observed in Table 2 and Figure 1, statistical analysis is conducted in Section 4, taking all factors discussed in Section 2 into consideration.

4 Statistical analysis for factors

For ease of exposition, this section first explains the factors for statistical analysis and then the statistical analysis results.

4.1 Factors for the binary logistic regression analysis

Multiple logistic regression analysis is performed using the GraphPad software to examine the relationship between factors and palatalization. The dependent variable is palatalization, which has binary outcomes: no and yes, where no means that a morpheme under discussion is transcribed as [ki-], [khi-], or [xi-] in Piaotongshi xinshi yanjie (Kim 2005), and yes means that a relevant morpheme is transcribed as [tɕi], [tɕhi], or [ɕi] in Piaotongshi xinshi yanjie (Kim 2005).
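The binary outcome coding described above, and the percentages reported in Table 2, can be cross-checked with a short sketch. The counts are copied from Table 2; the function names `is_palatalized`, `pct`, and `summarize` are hypothetical, and the outcome test simply assumes that a palatalized transcription begins with [tɕ], [ʨ], or [ɕ].

```python
# Counts (palatalized, total) per Early Mandarin initial, copied from Table 2.
COUNTS = {"ki-": (33, 47), "khi-": (20, 30), "xi-": (16, 31)}


def is_palatalized(transcription: str) -> bool:
    """Outcome 'yes' if the transcription starts with an alveolopalatal.

    Assumption: transcriptions are plain IPA strings such as "tɕi" or "xi".
    """
    return transcription.startswith(("tɕ", "ʨ", "ɕ"))


def pct(part: int, whole: int) -> str:
    """Format a proportion as a percentage with two decimals."""
    return f"{100 * part / whole:.2f}%"


def summarize(counts):
    """Return {initial: (unpalatalized %, palatalized %)} plus a Total row."""
    rows = {}
    for initial, (pal, total) in counts.items():
        rows[initial] = (pct(total - pal, total), pct(pal, total))
    pal_sum = sum(p for p, _ in counts.values())
    total_sum = sum(t for _, t in counts.values())
    rows["Total"] = (pct(total_sum - pal_sum, total_sum), pct(pal_sum, total_sum))
    return rows


if __name__ == "__main__":
    for initial, (unpal, pal) in summarize(COUNTS).items():
        print(f"{initial:6s} unpalatalized {unpal:>7s}  palatalized {pal:>7s}")
```

Run on the Table 2 counts, the sketch reproduces the reported overall split of 36.11% (39/108) unpalatalized and 63.89% (69/108) palatalized.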
Table 3 presents a summary of the factors and factor levels for the analysis.

Table 3 Factors for the binary logistic regression model
Factor | Factor level | Count (%)
Word class (1644–1912) | Content word | 99 (91.67)
 | Function word | 9 (8.33)
Morphosyntactic structure (1644–1912) | Free | 67 (62.04)
 | Bound | 41 (37.96)
Normalized frequency (1644–1912) | Numerical data |
Frequency dummy (1644–1912) | Low | 36 (33.33)
 | Medium | 36 (33.33)
 | High | 36 (33.33)
Frequency change from between 1271 and 1644 to between 1644 and 1912 | Numerical data |
Frequency change from between 1271 and 1644 to between 1644 and 1912 dummy | Decrease | 67 (62.04)
 | Increase | 41 (37.96)

The factor of word class (1644–1912), with the two levels content word and function word, was configured to examine the contradictory claims concerning the role of word class in sound change discussed in Section 2.1. The time period 1644–1912 means that the data were extracted in the time frame of the Qing Dynasty (1644–1912). Adjectives, nouns, verbs, and so on are classified as content words; adverbial conjunctions, auxiliary verbs, determiners, prepositions, quantifiers, and so on are classified as function words. This is in line with the dichotomy of words in Phillips (1983, 2001, 2006). It may appear ideal to classify words into adjectives, adverbial conjunctions, determiners, nouns, prepositions, verbs, and so on. However, too many factor levels are not preferable for statistical analysis. Mashi Wentong (Master Ma’s linguistic overview; Ma 1898), a classic work on Chinese grammar, was consulted to determine word class. To exemplify, the morpheme ‘urgent’ (急) is classified as a content word in Ma (1898). However, as not every related morpheme can be located in Ma (1898), Baxter and Sagart (2014) was also consulted to decide the word class of each morpheme. The Chinese character 乞 will be used as an example: this character is annotated as ‘beg’ in Baxter and Sagart (2014) and is thus a verb. According to Phillips (1983, 2001, 2006), a verb is a content word.
Therefore, this study takes the morpheme ‘beg’ (乞) as a content word.

The factor of morphosyntactic structure (1644–1912), with the two levels free and bound, was constructed to test whether morphosyntactic structure has a role in sound change. Free and bound mean that a related morpheme is mainly used as a free morpheme and as a bound morpheme, respectively. According to Packard (2015), if a morpheme can stand alone as an independent word, it is a free morpheme; if it must be used together with other language materials, then it is a bound morpheme (see also Chao 1968, Hsieh 2016). However, the distinction between free and bound morphemes is not always clear-cut in Chinese: bound morphemes may sometimes be used as free morphemes (Packard 2015, 2020). The present study decides the morphosyntactic status of a morpheme depending on whether it is mainly used as a free morpheme or a bound morpheme in example sentences extracted from CNCORPUS: if a morpheme is used as a free morpheme in most sentences, it is marked as free; otherwise it is marked as bound. To exemplify, the morpheme ‘urgent’ (急 [tɕi]) in Table 1 is marked as a free morpheme because it is used as a free morpheme in more than 90% of all sentences extracted from the corpus, while the morpheme ‘daughter-in-law, wife’ (媳 [ɕi]) is marked as a bound morpheme since it is almost always used with another morpheme in sentences extracted from the corpus.

As noted in Section 2.3, the frequency factor is claimed by some scholars to have an association with sound change, although disagreement exists concerning whether frequent words or less frequent words lead sound change (Grimm 1822, Bybee 1985, 2000, 2002, Pierrehumbert 2001, Bermúdez-Otero 2007, Ogura 2012, Smith 2012, Hay and Foulkes 2016). The raw numerical data of frequency between 1644 and 1912 were examined first.
The cross-tabulation analysis carried out in the GraphPad software revealed that raw frequency was not a statistically significant factor for palatalization (p = 0.46). Thus, the raw data of frequency between 1644 and 1912 were normalized in the GraphPad software and are reported as Normalized frequency (1644–1912) in Table 3.

The factor of frequency dummy (1644–1912) was configured because of the debate in Section 2.3.1 concerning whether low-frequency words or high-frequency words lead sound change. It was also partly configured in line with claims in the study by Wedel et al. (2013), reviewed in Section 2.3.2. Wedel et al. (2013) claim that the relative frequency of minimal pair members, instead of the absolute frequencies, is a significant predictor of phoneme merger. The present study does not adopt their measure of relative frequency between minimal pair members. Instead, it considers all morphemes involved in palatalization and thus takes into account the frequency of each morpheme relative to all the others. The factor frequency dummy (1644–1912) has three factor levels: low, medium, and high. Each level takes one third of the data: the third of the data with the lowest frequencies in this column is marked as low; the third with the highest frequencies is marked as high; and the remaining third, between low and high, is marked as medium. Therefore, low here does not refer to a frequency lower than a specific count. Instead, it means that the frequency of a certain morpheme is among the lowest frequencies of the morphemes involved in palatalization. For example, the morpheme ‘urgent’ [tɕi] in Table 1 is marked as medium since its frequency of 3,592 falls in the second tertile.

The next factor, frequency change from between 1271 and 1644 to between 1644 and 1912 (henceforth frequency change), was introduced since it seems possible that morphemes with increased frequencies and morphemes with decreased frequencies underwent palatalization at different rates.
For example, the frequency of the morpheme ‘urgent’ is 4,684 between 1271 and 1644 and 3,592 between 1644 and 1912. Thus, the frequency change from between 1271 and 1644 to between 1644 and 1912 for the morpheme ‘urgent’ is −1,092, where the negative number means that its frequency has decreased from the earlier period to the later one. In contrast, the frequency of the morpheme ‘opera’ is 1,045 between 1271 and 1644 and 1,072 between 1644 and 1912. As a result, the frequency change for the morpheme ‘opera’ is 27, where the positive number means that its frequency has increased from the earlier period to the later one. It is possible to normalize the raw data of frequency change by adding the absolute value of the most negative number to every value. In this way, the most negative number becomes zero, and all the other numbers become positive. However, the focus of frequency change is whether the frequencies have increased or decreased. It therefore serves the present study better to use the raw data instead of normalized data.

The raw numerical data for the aforementioned factor were also converted to categorical data with two levels, decrease and increase, with decrease as the reference level. The reason is that whether frequency has decreased or increased may itself be a factor in sound change. Therefore, the factor of frequency change from between 1271 and 1644 to between 1644 and 1912 dummy (hereafter frequency change dummy) was introduced.
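The two codings of frequency change described above can be illustrated with a minimal Python sketch. It is not the procedure actually run in GraphPad. The function names are hypothetical, and the handling of a change of exactly zero is an assumption, since the text does not discuss that case. The counts for ‘urgent’ and ‘opera’ are those quoted in the text.

```python
def frequency_change(earlier: int, later: int) -> int:
    """Raw frequency change: later-period count minus earlier-period count."""
    return later - earlier


def change_dummy(change: int) -> str:
    """Collapse a raw change into the two-level factor.

    Assumption: a change of exactly zero is not discussed in the text; it is
    mapped to the reference level 'decrease' here for illustration only.
    """
    return "increase" if change > 0 else "decrease"


def shift_to_nonnegative(changes: list[int]) -> list[int]:
    """The normalization the text mentions but does not adopt: add the absolute
    value of the most negative change to every value, so the minimum becomes 0."""
    offset = abs(min(min(changes), 0))
    return [c + offset for c in changes]


# Counts quoted in the text, as (1271-1644, 1644-1912) pairs.
counts = {"urgent": (4684, 3592), "opera": (1045, 1072)}

changes = {m: frequency_change(a, b) for m, (a, b) in counts.items()}
dummies = {m: change_dummy(c) for m, c in changes.items()}

print(changes)
print(dummies)
print(shift_to_nonnegative(list(changes.values())))
```

For the two morphemes, the sketch yields −1,092 (decrease) for ‘urgent’ and 27 (increase) for ‘opera’. The shifted values show why the normalization is set aside: once every value is non-negative, the sign that distinguishes decrease from increase is no longer visible.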
The morphemes ‘urgent’ and ‘opera’ serve again as examples: the frequency of ‘urgent’ decreased by 1,092 counts from between 1271 and 1644 to between 1644 and 1912, so it is marked as decrease for the factor of frequency change dummy; the frequency of ‘opera’ increased by 27 counts, so it is marked as increase for the same factor.

4.2 The binary logistic regression results

The statistical relationships between the six factors in Table 3 and the dependent variable were assessed using the multiple logistic regression of the GraphPad software. Model selection was guided by AIC (Akaike Information Criterion; Akaike 1974, Burnham and Anderson 2004), calculated probability (p value), and VIF (variance inflation factor; Rawlings et al. 1998, James et al. 2017). Table 4 reports the results.

Table 4: The binary logistic regression results

| Variable | Estimate | Std. Error | \|z\| | p |
|---|---|---|---|---|
| (Intercept) | 0.66 | 0.30 | 2.23 | 0.03* |
| Word class (1644–1912) | 1.46 | 0.86 | 1.70 | 0.34 |
| Morphosyntactic structure (1644–1912) | −0.02 | 0.07 | 0.22 | 0.82 |
| Normalized frequency (1644–1912) | 0.02 | 0.01 | 1.33 | 0.18 |
| Frequency dummy (1644–1912) | 0.10 | 0.05 | 1.93 | 0.04* |
| Frequency change from between 1271 and 1644 to between 1644 and 1912 | −0.01 | 0.00 | 1.21 | 0.23 |
| Frequency change from between 1271 and 1644 to between 1644 and 1912 dummy | −0.96 | 0.43 | 0.38 | 0.02* |

Notes: * = p < 0.05. |z| stands for the absolute value of z as given by the GraphPad software.

The dependent variable has two categories: unpalatalized and palatalized, with unpalatalized as the reference level. A p-value smaller than 0.05 was considered statistically significant.

From here on, time phases will be omitted for ease of exposition. For example, word class (1644–1912) will be shortened to word class.
The factor of frequency change from between 1271 and 1644 to between 1644 and 1912 will be shortened to frequency change.

The binary logistic regression analysis in Table 4 indicates that the factors of word class and morphosyntactic structure did not emerge as statistically significant (p = 0.34 and p = 0.82, respectively).

The next two factors present an interesting result: while frequency dummy is statistically significant (p = 0.04), normalized frequency is not (p = 0.18). The factor frequency dummy shows a significant effect in the direction of increasing the probability of palatalization from low through medium to high frequency, as indicated by the positive coefficient (estimate = 0.10). The correlation between this factor and palatalization is visualized in Figure 2.

Figure 2: Frequency dummy. The words low, medium, and high in the first row stand for the low, medium, and high levels of the frequency dummy factor, respectively. The words unpalatalized and palatalized in the leftmost column stand for the two categories of the dependent variable, unpalatalized morphemes and palatalized morphemes, respectively.

Two tendencies can be observed in Figure 2: (1) the number of unpalatalized morphemes, shown as striped bars, declines from the low and medium levels to the high level of the frequency dummy factor; (2) the number of palatalized morphemes, shown as gray bars, increases from the low and medium levels to the high level of the same factor. In short, Figure 2 indicates that morphemes in the high-frequency level tend to undergo palatalization earlier.

Table 4 also shows that the factor of frequency change has no statistically significant association with palatalization (p = 0.23), while frequency change dummy is statistically significantly associated with palatalization (p = 0.02). The factor frequency change dummy has decrease as the reference level.
Morphemes with increased frequencies have a lower probability of undergoing palatalization first, as indicated by the negative coefficient (estimate = −0.96). In other words, morphemes with decreased frequencies were at the forefront of palatalization. The correlation between this factor and palatalization is visualized in Figure 3.

Figure 3: Frequency change dummy. The words decrease and increase in the first row stand for the decrease and increase levels of the frequency change dummy factor, respectively. The words unpalatalized and palatalized in the leftmost column stand for unpalatalized morphemes and palatalized morphemes, the two categories of the dependent variable, respectively.

The quantitative data illustrated in Figure 3 provide evidence for the following two tendencies: (1) the number of unpalatalized morphemes, represented as striped bars, increases from the decrease level to the increase level of the frequency change dummy factor; (2) the number of palatalized morphemes, represented as gray bars, shows the reverse pattern.

To sum up, the data for this study show different mechanisms for the factors of frequency dummy and frequency change dummy.

5 Discussion and conclusion

This corpus-based study used statistical analysis to identify factors in palatalization. It shows that neither morphosyntactic structure nor word class is a statistically significant factor in palatalization, while the factors of frequency dummy and frequency change dummy emerged as statistically significant. These results argue against the claim that frequency has minimal or even no role in sound change.

A few further remarks are in order concerning the categorical data of frequency and frequency change in this article. The factor frequency dummy has three levels, low, medium, and high, with each level taking one third of the data.
As a result, high frequency is a comparative concept: high is defined by comparison with the frequencies of all related morphemes. The factor frequency change dummy is configured with two levels: decrease and increase. Decrease and increase are also comparative concepts: the frequency of a morpheme is compared with its own frequency in an earlier period. The fact that the categorical data of frequency, with the three levels low, medium, and high, are statistically significantly correlated with palatalization, whereas the raw and normalized numerical data are not, suggests an intriguing idea: frequency should be considered from a relative perspective. Frequencies of all morphemes involved in a certain sound change should be compared with each other. Since different morphemes are involved in different sound changes, there is no unified standard of high frequency for all sound changes. Concerning the frequency change factor, its categorical data with decrease and increase levels, instead of the numerical data, are shown to be statistically significantly correlated with palatalization. This indicates that the exact count of frequency change is not strongly correlated with sound change, but whether frequency has decreased or increased is. The most surprising part is that frequency change has a negative correlation with palatalization: morphemes with decreased frequencies tend to lead palatalization. In short, morphemes with high frequency and decreased frequencies were at the forefront of palatalization. This may be explained as follows. On the one hand, low-frequency morphemes were too inactive to undergo palatalization first, and this is why high-frequency morphemes tended to lead palatalization. On the other hand, increased frequency indicates that a morpheme was used more frequently. If such a morpheme had undergone palatalization first, it might have led to difficulty in communication.
Thus, morphemes with increased frequencies tended to lag behind in palatalization.
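The relative reading of frequency argued for above can be made concrete with a small Python sketch of a tertile coding. It assumes a simple rank-based split into three equal groups. The function name and the frequency counts are hypothetical and are not taken from the study's data, and ties are broken arbitrarily by sort order, which the text does not address.

```python
def tertile_labels(freqs: dict[str, int]) -> dict[str, str]:
    """Label each morpheme low/medium/high by rank, one third per level."""
    ranked = sorted(freqs, key=freqs.get)  # ascending frequency
    n = len(ranked)
    labels = {}
    for rank, morpheme in enumerate(ranked):
        if 3 * rank < n:            # lowest third of the ranking
            labels[morpheme] = "low"
        elif 3 * rank < 2 * n:      # middle third
            labels[morpheme] = "medium"
        else:                       # highest third
            labels[morpheme] = "high"
    return labels


# Hypothetical counts for six placeholder morphemes (not study data).
toy = {"m1": 120, "m2": 15000, "m3": 3400, "m4": 40, "m5": 980, "m6": 7200}
labels = tertile_labels(toy)

# Multiplying every count by 100 leaves the labels unchanged: the levels depend
# only on a morpheme's rank among the others, not on any absolute threshold.
scaled = tertile_labels({m: f * 100 for m, f in toy.items()})

print(labels)
print(labels == scaled)
```

Because the labels depend only on rank, the same raw count could be low in one sound change and high in another, which is the sense in which there is no unified standard of high frequency.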
Open Linguistics – de Gruyter
Published: Jan 1, 2023
Keywords: frequency; frequency change; morphosyntactic structure; palatalization; sound change; word class