Universal attractors in language evolution provide evidence for the kinds of efficiency pressures involved

ARTICLE | OPEN | https://doi.org/10.1057/s41599-022-01072-0

Ilja A. Seržant¹ & George Moroz²

Efficiency is central to understanding the communicative and cognitive underpinnings of language. However, efficiency management is a complex mechanism in which different efficiency effects (such as articulatory, processing and planning ease, mental accessibility, and informativity, online and offline efficiency effects) conspire to yield the coding of linguistic signs. While we do not yet exactly understand the interactional mechanism of these different effects, we argue that universal attractors are an important component of any dynamic theory of efficiency that would be aimed at predicting efficiency effects across languages. Attractors are defined as universal states around which language evolution revolves. Methodologically, we approach efficiency from a cross-linguistic perspective on the basis of a world-wide sample of 383 languages from 53 families, balancing all six macro-areas (Eurasia, North and South America, Australia, Africa, and Oceania). We explore the grammatical domain of verbal person–number subject indexes. We claim that there is an attractor state in this domain to which languages tend to develop and tend not to leave if they happen to comply with the attractor in their earlier stages of evolution. The attractor is characterized by different lengths for each person and number combination, structured along Zipf’s predictions. Moreover, the attractor strongly prefers non-compositional, cumulative coding of person and number. On the basis of these and other properties of the attractor, we conclude that there are two domains in which efficiency pressures are most powerful: the drive towards lower processing effort and the drive towards lower articulatory effort. The latter, however, is overridden by the pressure for constant information flow. The drives towards lower lexicon complexity and lower memory costs are weaker efficiency pressures for this grammatical category due to its order of frequency.

¹University of Potsdam, Potsdam, Germany. ²National Research University Higher School of Economics, Moscow, Russian Federation. email: serzant@uni-potsdam.de

HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | (2022) 9:58 | https://doi.org/10.1057/s41599-022-01072-0

Introduction

Language provides a means for communication. It is crucial that communication be not only successful but also efficient, i.e., with minimal effort for both parties and with high transmission accuracy (Gibson et al., 2019).

We distinguish between two linguistic levels at which the effects of efficiency obtain: online, contextual effects produced by individual speakers and offline effects that are found in the mental grammar and lexicon of speakers (see Jaeger and Buz, 2018). Online effects are found, e.g., in the pronunciation of words in spontaneous speech: if predictable in the particular context, words may be articulated with less care and be reduced (inter alia, Aylett and Turk, 2004; Aylett and Turk, 2006; Pluymaekers et al., 2005). Online effects pertain to particular communication events and individual speakers. By contrast, offline effects emerge over time via conventionalization of the more efficient and, therefore, more frequently selected variant in the online efficiency management (Gibson et al., 2019; Kirby, 2001; Pierrehumbert, 2001; Diessel, 2007; Seyfarth, 2014; Currie et al., 2018; Seržant, 2021b). Crucially, offline effects pertain to the population level of commonly shared linguistic culture. They are thus subject not only to the individual-level effects but are also constrained by the complex sociological and interactional effects emerging on the population level.

Moreover, conventionalized, offline strings are not static but constantly changing over time (Hopper, 1987; Bybee and Hopper, 2001; Seržant, 2021a). Change may be driven by semantic change or various external and sociolinguistic factors (Seržant, 2021b). As a consequence, the distribution and frequency of lexical and grammatical items are not at all stable. Thus, the question arises whether efficiency pressures themselves may essentially change over time, and, accordingly, whether the outcomes of these processes may be expected to largely parallel each other within and across languages.

Offline efficiency effects have most prominently been observed in the lexicon. The Zipfian effect that the length of a word tends to be a function of its inverse frequency (Zipf, 1935; Bentz and Ferrer-i-Cancho, 2016) or informativity (Piantadosi et al., 2011) is the result of various historical processes from which the more efficient word lengths have been conventionalized. The association with the original form is often lost here, as in English pants from pantaloons or pub from public house (“opacification” in Kanwal et al., 2017). This is especially true of grammatical items, which tend to be entirely dissociated from their origin (e.g., the indefinite article a and its source one).

In addition to the distinction between online and offline efficiency effects, efficiency pressures operate at different stages of production. While the information-theoretic approach to efficiency primarily relies on the articulatory efficiency (boiling down to the length of the message), it does not take into account the processing efficiency or the planning efficiency, which may require signs that are less efficient from the articulatory perspective. For example, when minimizing the articulatory effort online, the speaker has to assess at the same time whether or not the particular reduced form will achieve its communicative goal before it actually goes into articulation. This also requires that larger chunks must first be pre-planned before a cue goes into production (Bornkessel-Schlesewsky and Schlesewsky, 2014: p. 107; Jaeger and Tily, 2010: p. 325). This incurs processing costs. Potential ambiguities are also costly for the hearer, who can correctly interpret an efficient but ambiguous cue only once enough context has been uttered (Bornkessel-Schlesewsky and Schlesewsky, 2014: p. 107; Jaeger and Tily, 2010: p. 324). Thus, ambiguities created by articulatorily efficient signs may require more processing effort because speech is generated and decoded incrementally. Languages respond to these processing efforts by developing systems of context-independent cues to resolve potential rather than actual ambiguity (cf. Malchukov, 2008; Seržant, 2019). This unavoidably leads to mismatches between the length of a cue and its predictability in certain contexts (Seyfarth, 2014; Sóskuthy and Hay, 2017).

To sum up, efficient cues result online from an interaction of various trade-offs between the processing, planning and articulatory efficiency pressures (see, however, Levshina, 2021). Offline-efficient cues, in turn, emerge on the population level via selection and conventionalization of one of the efficient variants that emerged online. Here, social factors play an important role as well.

There is no integrative theory combining these different efficiency effects and their conventionalization mechanisms that would be able to predict cross-linguistic data. Here, we suggest that an essential component of such a theory is universal attractors. Attractors are a notion borrowed from dynamic models of cognition, in which they are defined as states that related states prefer to develop into but not develop away from (Norton, 1995: p. 56). We extend this notion by using it for diachronic linguistic processes. Attractors are universal properties of conventionalized cues within a particular domain. The motivation behind attractor states is that languages tend to organize the space of meanings and functions in certain ways. A corollary is that languages tend to develop semantically and functionally similar items that, in effect, have similar distributional frequencies and are therefore subject to similar efficiency pressures across languages.

In this paper, we provide evidence for the attractor in one particular grammatical domain: subject indexing on the verb as found, for example, in Latin: vide-ō (see-1SG) meaning “I see”, vidē-s (see-2SG) “you see”, vide-t (see-3SG) “(s)he sees”, vidē-mus (see-1PL) “we see”, vidē-tis (see-2PL) “you see”, vide-nt (see-3PL) “they see”. We show that language evolution revolves around this attractor. The attractor is characterized by at least two universal properties: (1) preferred absolute lengths of the indexes and (2) preference for the cumulative coding (i.e., non-compositional, atomic coding). The attractor is internally structured and caused by efficiency pressures, which are thus universal.

Data

In order to establish the attractor in this domain we manually compiled a database. We restricted our study to intransitive verbs only. We analyzed the six subject indexes (endings/prefixes/clitics) that encode the person and number (and in some languages masculine gender, as well) of the subject participant on the verb. We excluded the dual. The six person–number indexes found in the morphologically unmarked (typically present) tense were entered into the database: first person singular (1SG), second person singular (2SG), 3SG, 1PL, 2PL, 3PL. In total, these data have been manually collected from 383 languages from 53 families, covering all six macro-areas of the world: Eurasia, North and South America, Australia, Africa, and Oceania (Fig. 1, Moroz, 2017; the entire list is presented in Appendix 1 in the online supplement; the entire dataset is published in Seržant, 2021c).

Fig. 1 Languages in the database. Dots represent languages in our database.

Methods

Fifteen families each contribute 10–50 languages to the database in order to exclude language-specific effects and in order to control for family effects. Other families are represented with only few languages (sometimes only one, e.g., with isolates). Two extremely large and diverse families are split into subfamilies: Nuclear Trans New Guinea (Sogeram, Awyu-Dumut, Oceanic, and (other) Nuclear Trans New Guinea) and Afroasiatic (Semitic and (other) Afroasiatic). Likewise, the Atlantic-Congo family is represented only by its Bantu subfamily. Furthermore, in order to explore the dynamics we have entered the person–number indexes of the respective proto-languages (Proto-Indo-European, Proto-Athabaskan, Proto-Semitic, Proto-Salishan, Proto-Muskogean, Proto-Bantu, Proto-Dravidian, etc.; 15 in total) found in the authoritative literature. Since there is a great deal of controversy on the reconstruction of the Proto-Tibeto-Burman indexes, we adopted only the reconstructions for two subfamilies, Gyalrongic and Kiranti, over which there is no controversy in the literature. The remaining 38 families were excluded from the diachronic analysis because no commonly accepted reconstructions for these families have been found. All computations have been carried out in the R environment (R Core Team, 2015).

Attractor lengths were modeled with a Poisson mixed effects model with person and number as fixed effects. The results from a model that neglects the information on person and number significantly differ from the observations (Fisher exact test). When measuring length we only relied on the number of segments (proxied as the number of letters except for French and English). Long segments have been assigned 1.5.

Evolution towards the attractor was tested by comparing the proto and the modern forms in order to see whether verbal person–number indexes tend to move towards (or remain within) the attractor or away from it. In order to do so, we established for each form whether or not the difference between its modern length and the attractor length became smaller than the length difference between the attractor and the proto-form. Whenever the difference remains the same and the length of the proto-form is very close to the attractor, we counted it as a movement towards the attractor. After we thus obtained the direction of change for each modern form, we applied a logistic mixed effects model predicting the direction of change with person and number as fixed effects and clade as a random effect.

Preference for cumulative coding was established by testing the diachronic preference for and against compositionality. The data points were divided into four categories for each person: (i) no compositionality: compositionality is found neither in the proto-form nor in the modern form; (ii) compositionality disappears: compositionality is present in the proto-form and disappears in the modern form; (iii) compositionality remains: compositionality is present in both the proto-form and in the modern form; (iv) compositionality appears: compositionality is absent from the proto-form but appears in the modern form. Subsequently, we applied a logistic mixed effects model to obtain the probabilities for the three persons to disprefer compositionality.

The properties of the attractor thus obtained are interpreted with regard to efficiency effects at different stages of production (articulatory, processing, memory retrieval, etc.).

Results

Index lengths for each person–number combination do not vary much across languages. The dispersion around the average lengths across languages is quite small. This is illustrated in Fig. 2. We evaluated the Poisson regression model with person and number as fixed effects and clade as a random effect in order to obtain an exact formula for the observed relation between length of the index, person, and number. The 1SG form was selected as a baseline for the regression. The lme4 (Bates et al., 2015) formula used for this model is as follows:

index length ~ person * number + (1|clade)

The overall predictions of our model are presented in Fig. 2, with the estimated values and a 95% confidence interval (model printouts are presented in the supplementary materials). Both variables person and number are statistically significant. Since all variables are statistically significant and differ from zero, we can conclude that our attractor model is supported by our data. This allows us to compute the lengths of the attractors. The absolute average lengths computed by the model are presented in Fig. 2.

Fig. 2 Predictions of the Poisson mixed effects model for the number of segments based on person and number (clade is used as a random effect).

While the lengths predicted by the model for all families represent the static evidence for the attractor, we have also tested whether languages tend to develop towards this state if they happen to deviate from it in their proto-languages or whether lengths are preserved in the modern languages if the proto-language already adhered to the attractor. It has been repeatedly argued that linguistic universals are not language states but rather the accumulation of the diachronic processes and the mechanisms of change that lead to these states (Bybee, 1988; Bybee, 2006; Bybee, 2008; Creissels, 2008; Cysouw, 2010; Dunn et al., 2011; Givón, 1979; Greenberg, 1966; Greenberg, 1978; Haspelmath, 1999; Maslova, 2000; Maslova, 2004; Cristofaro, 2012; Cristofaro, 2014; Bickel et al., 2014).

If the attractor lengths exist as suggested by the model on the basis of the synchronic data above, then the attractor should also become visible in the transitional probability of languages to adhere to the attractor lengths over the course of time. In order to test whether there is indeed a diachronic pressure towards the attractor lengths, we have compared two idealized diachronic stages: Stage 0 and Stage 1. Stage 0 consists of the lengths of each of the six person–number indexes in the proto-language reconstructed by the historical-comparative method in the authoritative literature for 15 (sub)families (see fn. 1 for the references). Stage 1 is the lengths of each of the six person–number indexes across all modern languages of the respective (sub)family (10–50 languages per family). The lengths at Stage 0 are in principle subject to accidental, language-specific pressures, since there is only one proto-language per family. By contrast, the lengths at Stage 1 may be taken as indicative of universal pressures, since we take 10 to 50 modern languages per family, thus leveling out possible language-specific effects.

We find that the modern forms, on average, develop towards the attractor over the course of time. We also do not observe any significant source determination. Modern languages either “fix” the original proto-lengths via (i) shortening or (ii) enlarging, or they retain the lengths if these adhered to the attractor lengths already in the proto-language. For example, Uralic had singular proto-forms that were too short: 1SG -m, 2SG -n, 3SG -ø (Janhunen, 1982: p. 35). Accordingly, some modern Uralic languages enlarged them to two segments in the 1SG and 2SG and to one segment in the 3SG (e.g., Saami, Erzya, Komi-Permyak). Observe that this enlargement is differential: in contrast to the singular forms, the first and second plural forms (both three segments in Proto-Uralic) have not been enlarged in modern Uralic languages on average. The enlargement only takes place if the proto-forms considerably deviate from the attractor state.

By contrast, families with proto-forms considerably longer than the attractor shorten their lengths. For example, second singular in Proto-Indo-European was three segments (*-e-si). It was accordingly shortened to 1.57 segments on average in the modern Indo-European languages. The same applies to first and second person in Proto-Mayan: with 2.5 (a segment plus a long segment) it was somewhat too long and was accordingly shortened to around two segments on average in the modern languages. At the same time, the respective plural proto-form was somewhat too short with two segments and was enlarged in a number of modern Mayan languages (yielding the modern average of 2.64 segments). Finally, indexes adhering to the attractor remain largely unchanged as to their lengths. For example, the length of 1SG in modern Sogeram, Athabaskan, or Semitic languages does not deviate considerably from its proto-forms. We thus observe that indexes are not randomly affected by reduction or enlargement (via, for example, analogical extensions).

In order to model the tendencies between Stage 0 and Stage 1, we computed for each language whether or not its indexes have changed toward the attractor estimated in the previous model, as a binary variable: moving towards or remaining in the attractor vs. not moving towards the attractor. Subsequently, we applied a logistic mixed effects model to predict the probability of movement towards (and remaining within) the attractor by person and number. The 1SG form was again selected as a baseline for the regression. The lme4 (Bates et al., 2015) formula used for this model is as follows:

movement towards attractor or being in the attractor range ~ person * number + (1|clade)

The overall predictions of our model are presented in Fig. 3, with the estimated values and a 95% confidence interval (model printouts can be found in the Supplement).

Fig. 3 Logistic mixed effects model’s predictions for the number of segments based on person and number (clade is used as a random effect).

The model reveals that in all person–number combinations there is a high probability to obey the attractor. There is no statistically significant difference among persons. We conclude that the model supports our hypothesis that indexes are obeying the attractor lengths in their diachronic developments. Note that the probability of obeying the attractor length of the given person is extremely high in the singular forms (around 90–100%) and less so in the plural forms (around 65–90%). The distinction between singular and plural forms is also statistically significant.

To summarize, despite continuous processes of various phonetic and morphological changes and restructurings (Seržant, 2021a), there is a stable blueprint in the coding of person–number indexes. Regardless of the lengths in the respective proto-language, modern languages on average stick to the attractor lengths by the right combination of diachronic processes leading to reduction, enlargement, or retention (see Moroz, 2021 for an exception). Importantly, while many studies since Zipf (1935) assume that frequency effects on coding length only manifest themselves via reduction (Diessel, 2007; Jaeger and Tily, 2010; Bybee, 2001; Bybee, 2003; Cohen Priva and Jaeger, 2018), the length optimization discussed here is a more complex process that may result not only from reduction but from retention or enlargement as well. For example, the Polish 1PL -my (from Proto-Slavic *-mū) is the result of the lengthening of the final vowel, which was originally hyper-short -mŭ (with the reduced vowel ŭ) in Proto-Slavic and thus much shorter than the attractor. The lengthened variant most probably emerged by analogy to the independent 1PL pronoun my (<mū) ‘we’ already in Early Slavic. Importantly, no other person–number combination underwent this kind of lengthening.

The second universal property of the attractor concerns compositionality: the attractor prefers cumulative (non-compositional) coding. Compositionality is found when the person (1st vs. 2nd vs. 3rd) and the number (singular vs. plural) are transparently and separately coded. For example, the indexes in Russian show no compositionality (i.e., are cumulative), cf. 1SG -u vs. 1PL -m or 2SG -š’ vs. 2PL -te. By contrast, Maalula, a Western Aramaic language, does show compositionality: 2SG či- vs. 2PL či- … -un or 3SG yi- vs. 3PL yi- … -un. In this language, second person is marked by či-, third person by yi-, and number is marked by zero in the singular and by -un in the plural. These forms are thus compositional.

We coded changes in compositionality into four values: no compositionality (neither the proto-language nor the modern language has compositionality), compositionality disappears (compositionality of the proto-language decreased in the modern language), compositionality remains (both the proto-language and the modern language have some compositionality and its degree remains unchanged), compositionality appears (the modern language develops some compositionality). Results are presented in Fig. 4.

Fig. 4 Number of languages that increased/decreased number of compositional persons.

Both green bars stand for the preference of compositionality while both blue bars indicate dispreference for compositionality. Overwhelmingly, compositionality tends to be avoided. We also applied a logistic mixed effects model to predict compositionality of the modern form depending on the person and the compositionality of the proto-form. For this, we merged the blue values into the value “dispreferred” and the green values into the value “preferred.” The lme4 (Bates et al., 2015) formula used for this model is as follows:

compositionality of modern language ~ person * compositionality of proto-language + (1|clade)

The overall predictions of our model are presented in Fig. 5, with the estimated values and a 95% confidence interval (see supplement).

Fig. 5 Probability of compositionality of the modern form depending on the person and compositionality of the proto-language.

It follows from Figs. 4 and 5 that compositionality is dispreferred in the long run. The model predicts an extremely high probability of non-compositional coding (over 95%) for each person.

Discussion

Although the coding of indexes in particular languages is subject to various independent and language-specific processes including various types of reduction, reanalyses, analogical extensions, etc. (Seržant, 2021a), there are universal pressures that channel their development over time. More specifically, we provided synchronic and dynamic evidence for a universal attractor in the domain of indexing. The attractor is characterized by the absolute lengths for each person–number combination (Fig. 2) and cumulative (non-compositional) coding. Finally, subject indexes are almost never optional in the languages of the world as has been shown earlier (Karlsson, 1986; Siewierska, 1999). From these characteristics of the attractor the following conclusions about the universal principles constraining the interaction between underlying efficiency pressures can be drawn.

First, despite an extremely high corpus frequency, indexes nevertheless are not all equal in their lengths. The absolute lengths are structured: (i) the third person tends to be the shortest, and (ii) the plural indexes are longer than their respective singular indexes (Greenberg, 1966: pp. 33–38). These asymmetries correlate with the asymmetries in the corpus frequencies of these forms as predicted by Zipf’s Law of Abbreviation: the more frequent form is shorter than the less frequent one. Consider the corpus frequencies from the oral subcorpus of the Russian National Corpus (216,112 words) as a proxy (Table 1). In comparison to other persons, third person is the most frequent person in both number sets, with 69% in the singular and 62% in the plural. Likewise, the singular forms are much more frequent than the plural ones, with 69% singulars vs. 31% plurals of all forms. Both frequency asymmetries (3rd vs. 1st or 2nd and singular vs. plural) are statistically significant (p = 0.002, χ² test). Similar frequency asymmetries have been obtained for other languages, such as spoken Spanish (Bybee, 1985: p. 71), Finnish (on the basis of olla “to be” in Karlsson, 1986: 24), and some other languages (Greenberg, 1966: p. 37).

Table 1 Person–number frequencies in the oral subcorpus of the RNC.

Person   Singular      Plural
1        26% (2276)    15% (601)
2        5% (471)      23% (926)
3        69% (6021)    62% (2493)
Total    69% (8768)    31% (4020)

In the original table, bold indicates the most frequent combinations.

These figures show that articulatory efficiency plays an important role here: the more expected the sign is, the shorter it is. Nevertheless, zero is not preferred. The most frequent third-person form is more frequently coded with a segment than with zero, although zero is what one would expect if only articulatory efficiency were at play. We did not observe any dynamic bias towards zero (only the weaker, reverse statement is true: zeros, if at all, are more probable in the third singular than elsewhere, Siewierska, 2010; Bickel et al., 2015). In fact, some subfamilies even entirely replace the third-person zero inherited from their proto-languages. For example, Proto-Uralic had a zero-coded third-person singular index (Janhunen, 1982: p. 35) while a number of modern Uralic languages, including the entire Finnic subfamily, developed a non-zero coding here.

While zero would be the most efficient in terms of articulation, non-zero coding of the third-person singular must be motivated by processing and planning efficiency overriding articulation ease. Sending the hearer a non-zero phonetic cue facilitates the processing effort on the part of the hearer and thus increases the chances of a successful transmission of information. A non-zero form is also more planning-efficient for the speaker because it provides a straightforward link from meaning to coding, while zero is inherently ambiguous by being linked to various meanings and domains. Non-zero coding also alleviates the planning process because it makes the assessment of whether or not the context provides enough information unnecessary.

Secondly, it also is the planning efficiency that must be responsible for the fact that verbal indexes are almost never optional in the languages of the world (Siewierska, 1999; Haig, 2018). This obligatoriness yields redundant uses in those contexts that provide enough information for the identification of the subject referent, as in ven-ī, vid-ī, vic-ī “came-1SG”, “saw-1SG”, “conquered-1SG” (the last two occurrences of -1SG are increasingly redundant because they can be guessed from the previous context anyway). Planning efficiency overrides articulatory efficiency here as well.

Thirdly, the most articulatorily efficient paradigm that would also warrant unambiguous information transmission would not require the plural to have longer forms than the singular. Thus, theoretically a morphological system of coding all six distinctions (1SG, 2SG … 3PL) with one segment (e.g., 1SG -a, 2SG -t, 3SG -i (or zero), 1PL -k, 2PL -o, 3PL -r) would perfectly fulfill the requirement of accurate information transmission under the lowest articulatory effort. Thus, the effect of articulatory efficiency alone does not explain why cross-linguistically the plural forms require more segments than the singular forms if they all may be sufficiently disambiguated by just one segment. Multiple segments, however, allow the speakers to gain more production time and the hearer more comprehension time with the less expected meanings (plural in this case). The longer forms of the plural here serve to bring the message in line with constant information flow (Aylett and Turk, 2004; Levy and Jaeger, 2007; Pluymaekers et al., 2005; Uniform Information Density hypothesis in Coupé et al., 2019). In turn, the selection of particular phonetic segments serves the distinguishability function.

Fourthly, while it is known that high-frequency items as opposed to low-frequency items do not require transparent, compositional coding (Kirby, 2001: p. 108; Christiansen and Chater, 2008: p. 499), our cross-linguistic diachronic evidence suggests that items as frequent as person–number indexes in fact prefer cumulative coding (number and person being coded by one atomic sign): those families that were not compositional in the proto-language (e.g., Indo-European) did not develop compositionality in any of the modern languages, and some of those families that did have compositionality in the proto-language (e.g., Awyu-Dumut) removed it in the modern languages at least to some extent. This “opacification” is also observed in independent words, such as pub from public house (Kanwal et al., 2017). Cumulative coding requires higher complexity of the lexicon and comes at higher memory and learnability costs because it requires six signs (1SG, 2SG … 3PL) while compositional coding would require only four signs (three signs for the three persons and one plural sign applicable to all of them). While both options are equally informative, it is only the first one that is cross-linguistically preferred. This fact allows uncovering the specific efficiency processes involved. Languages structure their lexica optimally such that the trade-off between the processing costs and the lexicon complexity is resolved within the Pareto frontier either in favor of higher processing costs (more compositional) or in favor of higher lexicon complexity and memory costs (more cumulative coding) (Kemp and Regier, 2012; Kemp et al., 2018; Xu et al., 2020). Yet, languages prefer the specific choice (corner) within the Pareto frontier in high-frequency domains such as the indexing domain: processing efficiency outweighs lexicon complexity and, thus, memory (and learnability) costs with linguistic items of this order of frequency. The reason for this is that higher processing costs are not efficient with high-frequency items that are easily learnable and retrievable from memory anyway (Kirby, 2001: p. 109). This ties in with Kemp et al. (2018: p. 114), who claim that the preference for the cumulative coding within the Pareto frontier is found when the lexical domain is important for the culture, if “important for the culture” means that the items of this lexical domain are frequent in this culture (similarly in Xu et al., 2020 for number signs). We conclude from this that processing ease outweighs lexicon simplicity and, thus, memory (and learnability) costs with linguistic items of this order of frequency.

To sum up, first, we have established that there is a universal attractor state for indexing around which the evolution revolves. Second, the properties of the attractor uncover two domains in which efficiency pressures are most powerful: the drive towards lower processing effort and the drive towards lower articulatory effort, while the drives towards lower lexicon complexity and lower memory costs are weaker efficiency pressures for this grammatical category due to its order of frequency. Having said this, our evidence is cross-linguistic comparative evidence. Ideally, our conclusions should be supported by experimental evidence.

Data availability

All data analyzed are included in the manuscript and supplementary information file.

Received: 13 September 2021; Accepted: 24 January 2022

Note

1 Proto-Indo-European (Meier-Brügger, 2010: pp. 173–184), Proto-Turkic (Róna-Tas, 1998: p. 75; Old Turkic in Abduraxmanov, 1997: p. 68; Erdal, 2004: p. 232; Tuguševa, 1997: p. 59), Proto-Mayan (Bricker, 1977: p. 2; Schele, 1982: p. 9), Proto-Uralic (Honti, 2010: p. 21; Janhunen, 1982: p. 35; Kulonen, 2001; Laanest, 1982 [1975]: pp. 229–30), Proto-Dravidian (Andronov, 2009: pp. 224–231), Proto-Semitic (Hasselbach, 2004: p. 32; Huehnergard, 2000; Lipiński, 2001: p. 378), Proto-Oceanic (Blust, 1972; François, 2016: p. 32; Ross, 1988: p. 366, 2002: p. 60; Starosta et al., 1981), Proto-Bantu (Meeussen, 1967: pp. 97–99; Schadeberg, 2003 [2014]: p. 151), Proto-Sogeram (Daniels, 2015: p. 155), Proto-Awyu-Dumut (Wester, 2014: pp. 78–85), Proto-Athabaskan (Hoijer, 1971: pp. 127–132; Leer, 2006: p. 429), Proto-Muskogean (Booker, 1980: p. 33), Proto-Worroran (McGregor and Rumsey, 2009: p. 68), and Proto-Salishan (Newman, 1979: p. 213, 1980: p. 156), Proto-Kiranti and Proto-rGyalrongic (DeLancey, 2010: p. 15, 2011: p. 2, 2014; Jacques, 2012, 2016; LaPolla, 2003: p. 30).

References

Abduraxmanov GA (1997) Karaxanidsko-ujgurskij jazyk. In: Tenišev ÈR, Poceluevskij EA, Kormušin IV, Kibrik AA (eds.) Jazyki Mira. Tjurkskie jazyki. “Kyrgyzstan”, Bishkek, pp. 64–74
Andronov MS (2009) A comparative grammar of the dravidian languages. Beiträge zur Kenntnis südasiatischer Sprachen und Literaturen 7. Otto Harrassowitz, Wiesbaden
Aylett M, Turk A (2004) The smooth signal redundancy hypothesis: a functional explanation for relationships between redundancy, prosodic prominence, and duration in spontaneous speech. Lang Speech 47:31–56
Aylett M, Turk A (2006) Language redundancy predicts syllable duration and the spectral characteristics of vocalic syllable nuclei. J Acoust Soc Am 119:3048–3058
Bates D, Mächler M, Bolker B, Walker S (2015) Fitting linear mixed-effects models using lme4.
Dunn M, Greenhill SJ, Levinson SC, Gray RD (2011) Evolved structure of languages shows lineage-specific trends in word-order universals. Nature 473:79–82
Erdal M (2004) A grammar of old-turkic. Handbook of oriental studies. Handbuch der Orientalistik. Section eight. Central Asia. Vol 3. Brill, Leiden/Boston
François A (2016) The historical morphology of personal pronouns in northern Vanuatu. In: Pozdniakov K (ed.) Comparatisme et reconstruction: tendances
J Stat Softw 67(1):1–48 actuelles. Faits de Langues. Peter Lang, Bern, pp. 25–60 Bentz CH, Ferrer-i-Cancho R (2016) Zipf’s law of abbreviation as a language Gibson E, Futrell R, Piantadosi ST, Dautriche I, Mahowald K, Bergen L, Levy R universal. In: Bentz CH, Jäger G, Yanovich Y (eds.) In: Proceedings of the (2019) How efficiency shapes human language. Trend Cogn Sci 23(5):389–407 Leiden workshop on capturing phylogenetic algorithms for linguistics. Uni- Givón T (1979) On understanding grammar. Academic Press, New York, NY versity of Tubingen, online publication system. https://publikationen.uni- Greenberg JH (1966) Language universals, with special reference to feature hier- tuebingen.de/xmlui/handle/10900/6855814 archies. Mouton, The Hague Bickel B, Witzlack-Makarevich A, Zakharko T (2014) Typological evidence against Greenberg JH (1978) Diachrony, synchrony and language universals. In: Greenberg universal effects of referential scales on case alignment. In: Bornkessel- JH, Ferguson CA, Moravcsik EA (eds.) Universals of human language, Vol. 1: Schlesewsky I, Malchukov A, Richards M (eds.) Scales and hierarchies: a method and theory. Stanford University Press, Stanford, pp. 61–92 cross-disciplinary perspective on referential hierarchies. De Gruyter, Mouton, Haig G (2018) The grammaticalization of object pronouns: why differential object Berlin, pp. 7–44 indexing is an attractor state. Linguistics 56(4):781–818 Blust RA (1972) Proto-Oceanic addenda with cognates in non-Oceanic Aus- Haspelmath M (1999) Optimality and diachronic adaptation. Zeitschrift für tronesian languages: a preliminary list. WPLUH 411:1–43 Sprachwissenschaft 18(2):180–205 Booker KM (1980) Comparative muskogean: aspects of proto‐muskogean verb Hasselbach R (2004) Final vowels of pronominal suffixes and independent personal morphology. University of Kansas dissertation, Lawrence, KS pronouns in semitic. 
J Semit Stud 49(1):1–20 Bornkessel-Schlesewsky I, Schlesewsky M (2014) Competition in argument inter- Hoijer H (1971) Athapaskan morphology. In: Sawyer J (ed.) Studies in American pretation: evidence from the neurobiology of language. In: MacWhnney B, Indian Languages. University of California Publications in Linguistics 65. Malchukov A, Moravcsik E (eds.) Competing motivations in grammar and University of California Press, Berkley, pp. 113–147 usage. Oxford University Press, pp. 107–126 Honti L (2010) Personae ingratissimae? A 2. személyek jelölése az uráliban. Bricker VR (1977) Pronominal inflection in the Mayan languages. Occasional Nyelvtudományi Közlemények 107:7–57 Paper 1. Middle American Research Institute, New Orleans Hopper P (1987) Emergent Grammar. Berkley Linguistic. Society 13:139–157 Bybee JL (1988) The diachronic dimension, chapter 13. In: Hawkins JA (ed.) Huehnergard, J. 2000. Comparative Semitic Linguistics. Unpublished. Cambridge, Explaining language universals. OUP, pp. 350–379 Mass Bybee JL (2001) Phonology and language use. Cambridge University Press, Jacques G (2012) Agreement morphology: the case of Rgyalrong and Kiranti. Lang Cambridge Linguist 13(1):83–116 Bybee JL (2003) Mechanisms of change in grammaticization: the role of frequency. Jacques G (2016) Le sino-tibétain: polysynthétique ou isolant? Faits de langues In: Joseph BD, Janda RD (eds.) The Handbook of Historical Linguistics. 47(1):61–74 Blackwell, Oxford, pp. 602–623 Jaeger TF, Tily H (2010) On language ‘utility’: processing complexity and com- Bybee JL (2006) From usage to grammar: the mind’s response to repetition. Lan- municative efficiency. Cogn Sci 2:323–335 guage 82(4):711–733 Jaeger TF, Buz E (2018) Signal reduction and linguistic encoding. In: Fernández Bybee JL (2008) Formal universals as emergent phenomena: the origins of structure EM, Smith Cairns H (eds.) The Handbook of Psycholinguistics. John Wiley & preservation. In: Good J (ed.) 
Language universals and language change. Sons Oxford University Press, pp. 108–121 Janhunen J (1982) On the structure of Proto-Uralic. Finnisch-ugrische For- Bybee J, Hopper P (2001) Introduction to frequency and the emergence of lin- schungen 44:23–42 guistic structure. In: Bybee J, Hopper P (eds.) Frequency and the emergence Kanwal J, Smith K, Culbertson J, Kirby S (2017) Zipf’s Law of Abbreviation and the of linguistic structure [Typological studies in language 45]. John Benjamins, Principle of Least Effort: language users optimise a miniature lexicon for pp. 1–27 efficient communication. Cognition 165:45–52 Bybee J (1985) Morphology: A Study of the Relations between Meaning and Form. Karlsson F (1986) Frequency considerations in morphology. STUF –Lang Typol Amsterdam/Philadelphia: John Benjamins Univ 39(1):19–28 Christiansen MH, Chater N (2008) Language as shaped by the brain. Behav Brain Kemp C, Regier T (2012) Kinship categories across languages reflect general Sci 31:489–558 communicative principles. Science 336:1049–1054 Cohen Priva U, Jaeger TF (2018) The interdependence of frequency, predictability, Kemp C, Xu Y, Regier T (2018) Semantic typology and efficient communication. and informativity in the segmental domain. Linguist Vanguard 4(2):1–13 Ann Rev Linguist 4:109–128 Coupé CH, Oh YM, Dediu D, Pellegrino F (2019) Different languages, similar Kirby S (2001) Spontaneous evolution of linguistic structure-an iterated learning encoding efficiency: comparable information rates across the human com- model of the emergence of regularity and irregularity. IEEE Trans Evol munication niche. Sci Adv 2594 Comput 5:102–110 Creissels D (2008) Direct and indirect explanations of typological regularities: the Kulonen UM (2001) Zum n-Element der zweiten Personen besonders im Obu- case of alignment variations. Folia Linguistica 42(1):1–38 grischen. Finnisch-Ugrische Forschungen 56:151–174 Cristofaro S (2012) Cognitive explanations, distributional evidence, and diachrony. 
Laanest A (1982) Einführung in die ostseefinnischen Sprachen. Autorisierte Stud Lang 36(3):645–670 Übertragung aus dem Estnischen von Hans-Hermann Bartens. Buske, Cristofaro S (2014) Competing motivation models and diachrony: what evidence Hamburg for what motivations? In: MacWhnney B, Malchukov A, Moravcsik E (eds.) LaPolla R (2003) Overview of Sino-Tibetan morphosyntax. In: Thurgood G, Competing motivations in grammar and usage. Oxford University Press, Matisoff JA, Bradley D (eds.) Linguistics of the Sino-Tibetan area: The state Oxford, pp. 282–298 of the art. Pacific Linguistics Series C, 87. Department of Linguistics, Aus- Currie KH, Hume E, Jaeger TF, Wedela A (2018) The role of predictability in tralian National University, Canberra, pp. 22–42 shaping phonological patterns. Linguist Vanguard 4(s2):1–15 Leer J (2006) Na-Dene languages. In: Asher RE, Simpson JMY (eds.) The ency- Cysouw M (2010) On the probability distribution of typological frequencies. In: clopedia of language and linguistics. Pergamon, Oxford, pp. 428–430 Ebert CH, Jäger G, Michaelis J (eds.) Math Lang. Springer, Heidelberg, pp. Levshina N (2021) Cross-linguistic trade-offs and causal relationships between cues 29–35 to grammatical subject and object, and the problem of efficiency-related Daniels, D (2015) A Reconstruction of Proto-Sogeram. Phonology, Lexicon, explanations. Front Psychol 12:648200 Morphosyntax. A dissertation in partial satisfaction of the requirements for Levy R, Jaeger FT (2007) Speakers optimize information density through syntactic the degree Doctor of Philosophy in Linguistics. Santa Barbara: University of reduction. Adv Neural Inform Process Syst 19:849–856 California Lipiński E (2001) Semitic Languages: outline of a comparative grammar, 2ed. DeLancey S (2010) Towards a history of verb agreement in Tibeto-Burman. Peeters, Leuven Himalayan Linguist 9(1):1–38 Malchukov AL (2008) Animacy and asymmetries in differential case marking. 
DeLancey S (2011) Agreement prefixes in Tibeto-Burman. Himalayan Linguist Lingua 118:203–221 10(1):1–29 Maslova E (2000) A dynamic approach to the verification of distributional uni- DeLancey S (2014) Second person verb forms in Tibeto-Burman. Linguist Tibeto- versals. Linguist Typol 4(3):307–333 Burman Area 37(1):3–33 Maslova E (2004) Dinamika tipologičeskix raspredelenij i stabil’nost’ jazykovyx Diessel H (2007) Frequency effects in language acquisition, language use, and tipov [Dynamics of typological distributions and stability of language types]. diachronic change. New Idea Psychol 25:108–127 Voprosy jazykoznanija 5:3–16 8 HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | (2022) 9:58 | https://doi.org/10.1057/s41599-022-01072-0 HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | https://doi.org/10.1057/s41599-022-01072-0 ARTICLE McGregor WB, Rumsey A (2009) Worrorran revisited: the case for genetic rela- Paper presented to the Third International Conference on Austronesian tions among languages of the Northern Kimberley region of Western Aus- Linguistics, Bali. Abridged version published. In: Halirn A, Carrington L, tralia. The Australian National University, Canberra Wurm SA (eds.) Papers from the Third International Conference on Aus- Meeussen, AE (1967) Bantu grammatical reconstructions. Africana Linguistica tronesian Linguistics. Vol. 2. Tracking the travellers. Dept. of Linguistics, 3:79–121 Australian National University, Canberra, pp. 145–170 Meier-Brügger M (2010) Indogermanische Sprachwissenschaft. 9., durchgesehene Tuguševa LJ (1997) Drevneujgurskij jazyk. In: Tenišev, ĖR, JeA Poceluevskij, IV und ergänzte Auflage. Unter Mitarbeit von Matthias Fritz und Manfred Kormušin, AA Kibrik (eds.), Jazyki mira. Tjurkskie jazyki. Biškek: Izdatel’skij Mayrhofer. De Gruyter, Berlin dom “Kyrgyzstan”, pp. 54–63 Moroz G (2021) Length of East Caucasian subject indexes: a quantative research. 
Wester R (2014) A linguistic history of Awyu-Dumut: Morphological Study and In: Majsak TA, Sumbatova NR, Testelec YG (eds.) Durqasi xazna. Sbornik Reconstruction of a Papuan Language Family. Doctoral dissertation, Vrije statej k 60-letiyu R. O. Mutalova. Buki Vedi, Moscow, pp. 258–282 Universiteit Amsterdam Moroz G (2017) lingtypology: easy mapping for Linguistic Typology. https:// Xu Y, Liu E, Regier T (2020) Numeral systems across languages support efficient CRAN.R-project.org/package=lingtypology communication: from approximate numerosity to recursion. Open Mind Newman S (1979) A History of the Salish Possessive and Subject Forms. Int J Am 4:57–70 Linguist 45(3):207–223 Zipf G (1935) The psychobiology of language. Routledge, London Newman S (1980) Functional changes in the Salish pronominal system. Int J Am Linguist 46(3):155–167 Acknowledgements Norton A (1995) Dynamics: an introduction. In: Port RF, Van Gelder T (eds.) The first author has received funding by the Heisenberg grant SE 2838/1-1 “Exploring Mind as Motion: explorations in the dynamics of cognition. MIT Press, pp. linguistic diversity” of the German Research Foundation (Deutsche For- 44–68 schungsgemeinschaft). The second author greatly acknowledges the support he has Piantadosi ST, Tily H, Gibson E (2011) Word lengths are optimized for efficient received within the Basic Research Program of the National Research University Higher communication. Proc Natl Acad Sci USA 108(9):3526–3529 School of Economics. Pierrehumbert J (2001) Exemplar dynamics: word frequency, lenition and contrast. In: Bybee J, Hopper P (eds.) Frequency effects and the emergence of lexical structure: studies in language. John Benjamins, 137–157 Funding Pluymaekers M, Ernestus M, Baayen RH (2005) Articulatory planning is con- Open Access funding enabled and organized by Projekt DEAL. tinuous and sensitive to informational redundancy. Phonetica 62:146–159 R Core Team (2015) R: A language and environment for statistical computing. 
Competing interests Austria, Vienna. https://www.R-project.org/ Róna-Tas A (1998) The Reconstruction of Proto-Turkic and the Genetic Question. The authors declare no competing interests. In: Johanson L, Csató ÉÁ (eds.) The Turkic Languages. CUP, Cambridge, pp. 67–80 Ethical approval Ross M (1988) Proto-Oceanic and the Austronesian languages of western Mela- Not applicable. nesia. Pacific Linguistics: Series C, 98. Australian National University dis- sertation. Research School of Pacific and Asian Studies, Canberra Ross M (2002) Proto Oceanic. In: Lynch J, Ross M, Crowley T (eds.) The oceanic Informed consent languages. Routledge, London/New York, pp. 54–91 Not applicable. Schadeberg T (2003) Historical linguistics. In: Nurse, D & G Philippson (eds.), The Bantu Languages. London/New York: Routledge, pp. 143–163 Additional information Schele L (1982) Maya Glyphs. The Verbs. University of Texas Press, Austin Supplementary information The online version contains supplementary material Seržant IA (2019) Weak universal forces: the discriminatory function of case in available at https://doi.org/10.1057/s41599-022-01072-0. differential object marking systems. In: Schmidtke-Bode K, Levshina N, Michaelis SM, Seržant I (eds.) Explanation in typology: diachronic sour- Correspondence and requests for materials should be addressed to Ilja A. Seržant. ces, functional motivations and the nature of the evidence [Conceptual Foundations of Language Science 3]. Language Science Press, Berlin, pp. Reprints and permission information is available at http://www.nature.com/reprints 149–178 Seržant IA (2021b) The dynamics of Slavic morphosyntax is primarily determined by Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in the geographic location and contact configuration. Scando-Slavica 67(1):65–90 published maps and institutional affiliations. Seržant IA (2021a) Cyclic changes in verbal person-number indexes are unlikely. 
Folia Linguistica Historica 42(1):49–86 Seržant IA (2021c) Dataset for the paper “Universal attractors in language evolu- Open Access This article is licensed under a Creative Commons tion provide evidence for the kinds of efficiency pressures involved” [Data Attribution 4.0 International License, which permits use, sharing, set]. Version 4. Zenodo. https://doi.org/10.5281/zenodo.6028260 Seyfarth S (2014) Word informativity influences acoustic duration: effects of adaptation, distribution and reproduction in any medium or format, as long as you give contextual predictability on lexical representation. Cognition 133(1):140–155 appropriate credit to the original author(s) and the source, provide a link to the Creative Siewierska A (1999) From anaphoric pronoun to grammatical agreement marker: Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless why objects don’t make it. Folia Linguistica 33(1/2):225–251 Siewierska A (2010) Person asymmetries in zero expression and grammatical indicated otherwise in a credit line to the material. If material is not included in the functions. In: F Floricic (ed.), Essais de typologie et de linguistique générale. article’s Creative Commons license and your intended use is not permitted by statutory Mélanges offerts à Denis Creissels. Paris: Presses de L’Ecole Normale regulation or exceeds the permitted use, you will need to obtain permission directly from Supérieure, pp. 471–485 the copyright holder. To view a copy of this license, visit http://creativecommons.org/ Sóskuthy M, Hay J (2017) Changing word usage predicts changing word durations licenses/by/4.0/. in New Zealand English. Cognition 166:298–313 Starosta S, Pawley AK, Reid LA (1981) The evolution of focus in Austronesian. 
© The Author(s) 2022 HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | (2022) 9:58 | https://doi.org/10.1057/s41599-022-01072-0 9 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Humanities and Social Sciences Communications Springer Journals

Universal attractors in language evolution provide evidence for the kinds of efficiency pressures involved


References (104)

Publisher
Springer Journals
Copyright
Copyright © The Author(s) 2022
eISSN
2662-9992
DOI
10.1057/s41599-022-01072-0

Abstract

ARTICLE, OPEN, https://doi.org/10.1057/s41599-022-01072-0

Universal attractors in language evolution provide evidence for the kinds of efficiency pressures involved

Ilja A. Seržant¹ & George Moroz²

Efficiency is central to understanding the communicative and cognitive underpinnings of language. However, efficiency management is a complex mechanism in which different efficiency effects (such as articulatory, processing and planning ease, mental accessibility, and informativity, online and offline efficiency effects) conspire to yield the coding of linguistic signs. While we do not yet exactly understand the interactional mechanism of these different effects, we argue that universal attractors are an important component of any dynamic theory of efficiency that would be aimed at predicting efficiency effects across languages. Attractors are defined as universal states around which language evolution revolves. Methodologically, we approach efficiency from a cross-linguistic perspective on the basis of a world-wide sample of 383 languages from 53 families, balancing all six macro-areas (Eurasia, North and South America, Australia, Africa, and Oceania). We explore the grammatical domain of verbal person–number subject indexes. We claim that there is an attractor state in this domain to which languages tend to develop and tend not to leave if they happen to comply with the attractor in their earlier stages of evolution. The attractor is characterized by different lengths for each person and number combination, structured along Zipf’s predictions. Moreover, the attractor strongly prefers non-compositional, cumulative coding of person and number. On the basis of these and other properties of the attractor, we conclude that there are two domains in which efficiency pressures are most powerful: striving towards less processing effort and less articulatory effort. The latter, however, is overridden by constant information flow.
Striving towards lower lexicon complexity and lower memory costs constitutes a weaker efficiency pressure for this grammatical category, owing to its order of frequency.

¹ University of Potsdam, Potsdam, Germany. ² National Research University Higher School of Economics, Moscow, Russian Federation. email: serzant@uni-potsdam.de

Introduction
Language provides a means for communication. It is crucial that communication be not only successful but also efficient, i.e., with minimal effort for both parties and obeying high transmission accuracy (Gibson et al., 2019).

We distinguish between two linguistic levels at which the effects of efficiency obtain: online, contextual effects produced by individual speakers and offline effects that are found in the mental grammar and lexicon of speakers (see Jaeger and Buz (2018)). Online effects are found, e.g., in the pronunciation of words in spontaneous speech: if predictable in the particular context, words may be articulated with less care and be reduced (inter alia, Aylett and Turk, 2004; Aylett and Turk, 2006; Pluymaekers et al., 2005). Online effects pertain to particular communication events and individual speakers. By contrast, offline effects emerge over time via conventionalization of the more efficient and, therefore, more frequently selected variant in the online efficiency management (Gibson et al., 2019; Kirby, 2001; Pierrehumbert, 2001; Diessel, 2007; Seyfarth, 2014; Currie et al., 2018; Seržant, 2021b). Crucially, offline effects pertain to the population level of commonly shared linguistic culture. They are thus subject not only to the individual-level effects but are also constrained by the complex sociological and interactional effects emerging on the population level.

Moreover, conventionalized, offline strings are not static but constantly changing over time (Hopper, 1987; Bybee and Hopper, 2001; Seržant, 2021a). Change may be driven by semantic change or various external and sociolinguistic factors (Seržant, 2021b). As a consequence, the distribution and frequency of lexical and grammatical items is not at all stable. Thus, the question arises whether efficiency pressures themselves may essentially change over time, and, accordingly, whether the outcomes of these processes may be expected to largely parallel each other within and across languages.

Offline efficiency effects have most prominently been observed in the lexicon. The Zipfian effect that the length of a word tends to be a function of its inverse frequency (Zipf, 1935; Bentz and Ferrer-i-Cancho, 2016) or informativity (Piantadosi et al., 2011) is the result of various historical processes from which the more efficient word lengths have been conventionalized. The association with the original form is often lost here, as in English pants from pantaloons or pub from public house (“opacification” in Kanwal et al., 2017). This is especially true of grammatical items, which tend to be entirely dissociated from their origin (e.g., the indefinite article a and its source one).

In addition to the distinction between online and offline efficiency effects, efficiency pressures operate on different stages of production. While the information-theoretic approach to efficiency primarily relies on the articulatory efficiency (boiling down to the length of the message), it does not take into account the processing efficiency or the planning efficiency, which may require signs that are less efficient from the articulatory perspective. For example, when minimizing the articulatory effort online, the speaker has to assess at the same time whether or not the particular reduced form will achieve its communicative goal before it actually goes into articulation. This also requires that larger chunks must first be pre-planned before a cue goes into production (Bornkessel-Schlesewsky and Schlesewsky, 2014: p. 107; Jaeger and Tily, 2010: p. 325). This requires processing costs. Potential ambiguities are also costly for the hearer who can correctly interpret an efficient but ambiguous cue only once enough context has been uttered (Bornkessel-Schlesewsky and Schlesewsky, 2014: p. 107; Jaeger and Tily, 2010: p. 324). Thus, ambiguities created by articulatorily efficient signs may require more processing effort because speech is generated and decoded incrementally. Languages respond to these processing efforts by developing systems of context-independent cues to resolve potential rather than actual ambiguity (cf. Malchukov, 2008; Seržant, 2019). This unavoidably leads to mismatches between the length of a cue and its predictability in certain contexts (Seyfarth, 2014; Sóskuthy and Hay, 2017).

To sum up, efficient cues result online from an interaction of various trade-offs between the processing, planning and articulatory efficiency pressures (see, however, Levshina, 2021). Offline-efficient cues, in turn, emerge on the population level via selection and conventionalization of one of the efficient variants that emerged online. Here, social factors play an important role as well.

There is no integrative theory combining these different efficiency effects and their conventionalization mechanisms that would be able to predict cross-linguistic data. Here, we suggest that an essential component of such a theory is universal attractors. Attractors are a notion borrowed from dynamic models of cognition, in which they are defined as states that related states prefer to develop into but not develop away from (Norton, 1995: p. 56). We extend this notion by using it for diachronic linguistic processes. Attractors are universal properties of conventionalized cues within a particular domain. The motivation behind attractor states is that languages tend to organize the space of meanings and functions in certain ways. A corollary is that languages tend to develop semantically and functionally similar items that, in effect, have similar distributional frequencies and are therefore subject to similar efficiency pressures across languages.

In this paper, we provide evidence for the attractor in one particular grammatical domain: subject indexing on the verb as found, for example, in Latin: vide-ō (see-1SG) meaning “I see”, vidē-s (see-2SG) “you see”, vide-t (see-3SG) “(s)he sees”, vidē-mus (see-1PL) “we see”, vidē-tis (see-2PL) “you see”, vide-nt (see-3PL) “they see”. We show that language evolution revolves around this attractor. The attractor is characterized by at least two universal properties: (1) preferred absolute lengths of the indexes and (2) preference for the cumulative coding (i.e., non-compositional, atomic coding). The attractor is internally structured and caused by efficiency pressures, which are thus universal.

Data
In order to establish the attractor in this domain we manually compiled a database. We restricted our study to intransitive verbs only. We analyzed the six subject indexes (endings/prefixes/clitics) that encode the person and number (and in some languages masculine gender, as well) of the subject participant on the verb. We excluded the dual. The six person–number indexes found in the morphologically unmarked (typically present) tense were entered into the database: first person singular (1SG), second person singular (2SG), 3SG, 1PL, 2PL, 3PL. In total, these data have been manually collected from 383 languages from 53 families, covering all six macro-areas of the world: Eurasia, North and South America, Australia, Africa, and Oceania (Fig. 1, Moroz, 2017; the entire list is presented in Appendix 1 in the online supplement; the entire dataset is published in Seržant, 2021c).

Fig. 1 Languages in the database. Dots represent languages in our database.

Methods
Fifteen families each contribute 10–50 languages to the database in order to exclude language-specific effects and in order to control for family effects. Other families are represented with only a few languages (sometimes only one, e.g., with isolates). Two extremely large and diverse families are split into subfamilies: Nuclear Trans New Guinea (Sogeram, Awyu-Dumut, Oceanic, and (other) Nuclear Trans New Guinea) and Afroasiatic (Semitic and (other) Afroasiatic). Likewise, the Atlantic-Congo family is represented only by its Bantu subfamily. Furthermore, in order to explore the dynamics we have entered the person–number indexes of the respective proto-languages (Proto-Indo-European, Proto-Athabaskan, Proto-Semitic, Proto-Salishan, Proto-Muskogean, Proto-Bantu, Proto-Dravidian, etc.; 15 in total) found in the authoritative literature. Since there is a great deal of controversy on the reconstruction of the Proto-Tibeto-Burman indexes, we adopted only the reconstructions for two subfamilies, Gyalrongic and Kiranti, over which there is no controversy in the literature. The remaining 38 families were excluded from the diachronic analysis because no commonly accepted reconstructions for these families have been found. All computations have been carried out in the R environment (R Core Team, 2015).

Attractor lengths were modeled with a Poisson mixed effects model with person and number as fixed effects. The results from a model that neglects the information on person and number significantly differ from the observations (Fisher exact test). When measuring length we only relied on the number of segments (proxied as the number of letters except for French and English). Long segments have been assigned 1.5.

Evolution towards the attractor was tested by comparing the proto and the modern forms in order to see whether verbal person–number indexes tend to move towards (or remain within) the attractor or away from it. In order to do so, we established for each form whether or not the difference between its modern length and the attractor length became smaller than the length difference between the attractor and the proto-form. Whenever the difference remains the same and the length of the proto-form is very close to the attractor we counted it as a movement towards the attractor. After we thus obtained the direction of change for each modern form we applied a logistic mixed effects model predicting the direction of change with person and number as fixed effects and clade as a random effect.

Preference for cumulative coding was established by testing the diachronic preference for and against compositionality. The data points were divided into four categories for each person: (i) no compositionality: compositionality is found neither in the proto-form nor in the modern form; (ii) compositionality disappears: compositionality is present in the proto-form and disappears in the modern form; (iii) compositionality remains: compositionality is present in both the proto-form and in the modern form; (iv) compositionality appears: compositionality is absent from the proto-form but appears in the modern form. Subsequently, we applied a logistic mixed effects model to obtain the probabilities for the three persons to disprefer compositionality.

The properties of the attractor thus obtained are interpreted with regard to efficiency effects at different stages of production (articulatory, processing, memory retrieval, etc.).

Results
Index lengths for each person–number combination do not vary much across languages. The dispersion around the average lengths across languages is quite small. This is illustrated in Fig. 2. We evaluated the Poisson regression model with person and number as fixed effects and clade as a random effect in order to obtain an exact formula for the observed relation between length of the index, person, and number. The 1SG form was selected as a baseline for the regression. The lme4 (Bates et al., 2015) formula used for this model is as follows:

index length ~ person * number + (1|clade)

The overall predictions of our model are presented in Fig. 2, with the estimated values and a 95% confidence interval (model printouts are presented in the supplementary materials). Both variables person and number are statistically significant. Since all variables are statistically significant and differ from zero, we can conclude that our attractor model is supported by our data. This allows us to compute the lengths of the attractors. The absolute average lengths computed by the model are presented in Fig. 2.

Fig. 2 Predictions of the Poisson mixed effects model for the number of segments based on person and number (clade is used as a random effect).

While the lengths predicted by the model for all families represent the static evidence for the attractor, we have also tested whether languages tend to develop towards this state if they happen to deviate from it in their proto-languages or whether the lengths are preserved in the modern languages if the proto-language already adhered to the attractor. It has been repeatedly argued that linguistic universals are not language states but rather the accumulation of the diachronic processes and the mechanisms of change that lead to these states (Bybee, 1988; Bybee, 2006; Bybee, 2008; Creissels, 2008; Cysouw, 2010; Dunn et al., 2011; Givón, 1979; Greenberg, 1966; Greenberg, 1978; Haspelmath, 1999; Maslova, 2000; Maslova, 2004; Cristofaro, 2012; Cristofaro, 2014; Bickel et al., 2014).

If the attractor lengths exist as suggested by the model on the basis of the synchronic data above, then the attractor should also become visible in the transitional probability of languages to adhere to the attractor lengths over the course of time. In order to test whether there is indeed a diachronic pressure towards the attractor lengths, we have compared two idealized diachronic stages: Stage 0 and Stage 1. Stage 0 consists of the lengths of each of the six person–number indexes in the proto-language reconstructed by the historical-comparative method in the authoritative literature for 15 (sub)families (see fn. 1 for the references). Stage 1 is the lengths of each of the six person–number indexes across all modern languages of the respective (sub)family (10–50 languages per family).

number of modern Mayan languages (yielding the modern average of 2.64 segments). Finally, indexes adhering to the attractor remain largely unchanged as to their lengths. For example, the length of 1SG in modern Sogeram, Athabaskan, or Semitic languages does not deviate considerably from its proto-forms. We thus observe that indexes are not randomly affected by reduction or enlargement (via, for example, analogical extensions).

In order to model the tendencies between Stage 0 and Stage 1,
The lengths at Stage 0 is in principle we computed for each language whether or not its indexes have subject to accidental, language-specific pressures, since there is changed toward the attractor estimated in the previous model, as only one proto-language per family. By contrast, the lengths at a binary variable: moving towards or remaining in the attractor vs. Stage 1 may be taken as indicative of universal pressures, since we not moving towards the attractor. Subsequently, we applied a take 10 to 50 modern languages per family, thus leveling out logistic mixed effects model to predict the probability of move- possible language-specific effects. ment towards (and remaining within) the attractor by person and We find that the modern forms, on average, develop towards number. The 1SG form was again selected as a baseline for the the attractor over the course of time. We also do not observe any regression. The lme4 (Bates et al., 2015) formula used for this significant source determination. Modern languages either “fix” model is as follows: the original proto-lengths via (i) shortening or (ii) enlarging, or movement towards attractor or being in the attractor they retain the lengths if these adhered to the attractor lengths range ~ person * number + (1|clade) already in the proto-language. For example, Uralic had singular The overall predictions of our model are presented in Fig. 3, proto-forms that were too short: 1SG -m, 2SG -n, 3SG -ø with the estimated values and a 95% confidence interval (model (Janhunen, 1982: p. 35). Accordingly, some modern Uralic lan- printouts can be found in the Supplement). guages enlarged them to two segments in the 1SG and 2SG and to The model reveals that in all person–number combinations one segment in the 3SG (e.g., Saami, Erzya, Komi-Permyak). there is a high probability to obey the attractor. There is no Observe that this enlargement is differential: in contrast to the statistically significant difference among persons. 
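The binary outcome used in the diachronic model (moving towards or remaining in the attractor vs. not) can be sketched as a comparison of two distances. The sketch below is an assumption-laden illustration rather than the authors' procedure: the function name, the `tolerance` parameter standing in for "very close to the attractor" (the article gives no numeric threshold), and the attractor value of 2.0 in the example are all hypothetical. Only the proto length of 3 and the modern average of 1.57 come from the text (Indo-European 2SG).

```python
def moves_towards_attractor(proto_len, modern_len, attractor_len, tolerance=0.5):
    """Return True if a form moved towards, or stayed within, the attractor.

    tolerance is a hypothetical threshold for "very close to the
    attractor"; the article does not state a numeric value.
    """
    distance_before = abs(proto_len - attractor_len)
    distance_after = abs(modern_len - attractor_len)
    if distance_after < distance_before:
        # The modern form is closer to the attractor than the proto-form.
        return True
    # An unchanged distance counts as adherence only when the proto-form
    # was already close to the attractor.
    return distance_after == distance_before and distance_before <= tolerance


# Proto length 3 and modern average 1.57 are from the text;
# the attractor length 2.0 is an assumed value for illustration.
ie_2sg = moves_towards_attractor(proto_len=3.0, modern_len=1.57, attractor_len=2.0)
```

A binary variable of this kind is what a logistic model with person and number as predictors would then take as its response.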
We conclude singular forms, the first and second plural forms (both three that the model supports our hypothesis that indexes are obeying segments in Proto-Uralic) have not been enlarged in modern the attractor lengths in their diachronic developments. Note that Uralic languages on average. The enlargement only takes place if the probability of obeying the attractor length of the given person the proto-forms considerably deviate from the attractor state. is extremely high in the singular forms (around 90–100%) and By contrast, families with proto-forms considerably longer less so in the plural forms (around 65–90%). The distinction than the attractor shorten their lengths. For example, second between singular and plural forms is also statistically significant. singular in Proto-Indo-European was three segments (*-e-si). It To summarize, despite continuous processes of various pho- was accordingly shortened to 1.57 segments on average in the netic and morphological changes and restructurings (Seržant, modern Indo-European languages. The same applies to first and 2021a), there is a stable blueprint in the coding of person–number second person in Proto-Mayan: with 2.5 (a segment plus a long indexes. Regardless of the lengths in the respective proto-lan- segment) it was somewhat too long and was accordingly shor- guage, modern languages on average stick to the attractor lengths tened to around two segments on average in the modern lan- by the right combination of diachronic processes leading to guages. At the same time, the respective plural proto-form was reduction, enlargement, or retention (see Moroz, 2021 for an somewhat too short with two segments and was enlarged in a exception). Importantly, while many studies since Zipf (1935) 4 HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | (2022) 9:58 | https://doi.org/10.1057/s41599-022-01072-0 HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | https://doi.org/10.1057/s41599-022-01072-0 ARTICLE Fig. 
3 Logistic mixed effects model’s predictions for the number of segments based on person and number (clade is used as a random effect). assume that frequency effects on coding length only manifest into the value “dipreferred” and the green values into the value themselves via reduction (Diessel, 2007; Jaeger and Tily, 2010; “preferred.” The lme4 (Bates et al., 2015) formula used for this Bybee, 2001; Bybee, 2003; Cohen Priva and Jaeger, 2018), the model is as follows: length optimization discussed here is a more complex process compositionality of modern language ~ person * compositionality that may result not only from reduction but from retention or of proto-language+ (1|clade) enlargement as well. For example, the Polish 1Pl -my (from The overall predictions of our model are presented in Fig. 5, Proto-Slavic *-mū) is the result of the lengthening of the final with an estimated values and a 95% confidence interval (see vowel, which was originally hyper-short -mŭ (with the reduced supplement). vowel ŭ) in Proto-Slavic and thus much shorter than the attrac- It follows from Figs. 4 and 5 that compositionality is dispreferred tor. The lengthened variant most probably emerged by analogy to in the long run. The model predicts an extremely high probability of the independent 1PL pronoun my (<mū) ‘we’ already in Early non-compositional coding (over 95%) for each person. Slavic. Importantly, no other person-number combination underwent this kind of lengthening. Discussion The second universal property of the attractor is the preference Although the coding of indexes in particular languages is subject for compositionality. Compositionality is found when the person to various independent and language-specific processes including (1st vs. 2nd vs. 3rd) and the number (singular vs. plural) are various types of reduction, reanalyses, analogical extensions, etc. transparently and separately coded. 
For example, the indexes in (Seržant, 2021a), there are universal pressures that channel their Russian show no compositionality (i.e., are cumulative), cf. 1SG development over time. More specifically, we provided syn- -u vs. 1PL -m or 2SG -š’ vs. 2PL -te. By contrast, Maalula, a chronic and dynamic evidence for a universal attractor in the Western Aramaic language does show compositionality: 2SG či- domain of indexing. The attractor is characterized by the absolute vs. 2PL či- … -un or 3SG yi- vs. 3PL yi-…un. In this language, lengths for each person–number combination (Fig. 2) and second person is marked by či-, third person by yi- and number is cumulative (non-compositional) coding. Finally, subject indexes marked by zero in the singular and by -un in the plural. These are almost never optional in the languages of the world as has forms are thus compositional. been shown earlier (Karlsson, 1986; Siewierska, 1999). From these We coded changes in compositionality into four values: no characteristics of the attractor the following conclusions about the compositionality (neither the proto-language nor the modern universal principles constraining the interaction between under- language has compositionality), compositionality disappears lying efficiency pressures can be drawn. (compositionality of the proto-language decreased in the First, despite an extremely high corpus frequency, indexes modern language), compositionality remains (both the proto- nevertheless are not all equal in their lengths. The absolute language and the modern language have some composition- lengths are structured: (i) the third person tends to be the ality and its degree remains unchanged), compositionality shortest, and (ii) the plural indexes are longer than their appears (the modern language develops some composition- respective singular indexes (Greenberg, 1966: pp. 33–38). These ality). Results are presented on Fig. 4. 
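The four-way coding of compositionality change, and its later merger into a binary outcome, can be written down as a small lookup. This is a simplified sketch: it treats compositionality as present or absent per person, as in the methods description, and ignores degrees. The function names and category labels are introduced here for illustration, and the grouping of "none" and "disappears" as dispreferred follows the reading that these are the two categories in which the modern form is non-compositional.

```python
def compositionality_change(proto_compositional, modern_compositional):
    """Classify one person of one family into the four categories."""
    if proto_compositional and modern_compositional:
        return "remains"
    if proto_compositional:
        return "disappears"
    if modern_compositional:
        return "appears"
    return "none"


# Categories in which the modern form ends up non-compositional.
DISPREFERRED = {"none", "disappears"}


def binary_outcome(category):
    """Merge the four categories into the two values used by the model."""
    return "dispreferred" if category in DISPREFERRED else "preferred"


# A family with compositional proto-forms and cumulative modern forms.
lost = compositionality_change(True, False)
lost_outcome = binary_outcome(lost)
```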
asymmetries correlate with the asymmetries in the corpus fre- Both green bars stand for the preference of compositionality quencies of these forms as predicted by Zipf’s Law of Abbrevia- while both blue bars indicate dispreference for compositionality. tion: the more frequent form is shorter than the less frequent one. Overwhelmingly, compositionality tends to be avoided. We also Consider the corpus frequencies from the oral subcorpus of the applied logistic mixed effects model to predict compositionality of Russian National Corpus (216,112 words) as a proxy (Table 1). In the modern form depending on the person and the composi- comparison to other persons, third person is the most frequent tionality of the proto-form. For this, we merged the blue values person in both number sets, with 69% in the singular and 62% in HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | (2022) 9:58 | https://doi.org/10.1057/s41599-022-01072-0 5 ARTICLE HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | https://doi.org/10.1057/s41599-022-01072-0 Fig. 4 Number of languages that increased/decreased number of compositional persons. Fig. 5 Probability of compositionality of the modern form depending on the person and compositionality of the proto-language. the plural. Likewise, the singular forms are much more frequent zero as one would expect if only the articulatory efficiency were at than the plural ones, with 69% singulars vs. 31% plurals of all play. We did not observe any dynamic bias towards zero (only the forms. Both frequency asymmetries (3rd vs. 1st or 2nd and sin- weaker, reverse statement is true: zeros, if at all, are more prob- gular vs. plural) are statistically significant (p= 0.002, χ ). Similar able in the third singular than elsewhere, Siewierska, 2010; Bickel frequency asymmetries have been obtained for other languages, et al., 2015). In fact, some subfamilies even entirely replace the such as spoken Spanish (Bybee, 1985: p. 
71), Finnish (on the basis third-person zero inherited from their proto-languages. For of olla “to be” in Karlsson, 1986: 24), and some other languages example, Proto-Uralic had zero-coded third-person singular (Greenberg, 1966: p. 37). index (Janhunen, 1982: p. 35) while a number of modern Uralic These figures show that articulatory efficiency plays an languages, including the entire Finnic subfamily, developed a important role here: the more expected the sign is the shorter it is. non-zero coding here. Nevertheless, zero is not preferred. The most frequent third- While zero would be the most efficient in terms of articulation, person form is more frequently coded with a segment than with non-zero coding of the third-person singular must be motivated 6 HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | (2022) 9:58 | https://doi.org/10.1057/s41599-022-01072-0 HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | https://doi.org/10.1057/s41599-022-01072-0 ARTICLE Cumulative coding requires higher complexity of the lexicon and Table 1 Person–number frequencies in the oral subcorpus of comes at higher memory and learnability costs because it requires the RNC. six signs (1SG, 2SG… 3PL) while compositional coding would require only four signs (three signs for the three persons and one Singular Plural plural sign applicable to all of them). While both options are 1 26% (2.276) 15% (601) equally informative, it is only the first one that is cross- 2 5% (471) 23% (926) linguistically preferred. This fact allows uncovering the specific 3 69% (6.021) 62% (2.493) efficiency processes involved. Languages structure their lexica Total 69% (8768) 31% (4020) optimally such that the trade-off between the processing costs and Bold indicates the most frequent combinations. 
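The percentages in Table 1 can be recomputed from the raw counts as a consistency check. The sketch assumes that the dots in "2.276", "6.021", and "2.493" are thousands separators, which is the only reading under which the rows add up to the stated column totals of 8768 and 4020. Variable and function names are illustrative.

```python
# Raw counts from Table 1, reading "2.276" as 2276 (assumption, see above).
counts = {
    "singular": {1: 2276, 2: 471, 3: 6021},
    "plural": {1: 601, 2: 926, 3: 2493},
}


def shares(column):
    """Return each person's share of a number column, in whole percent."""
    total = sum(column.values())
    return {person: round(100 * n / total) for person, n in column.items()}


singular_total = sum(counts["singular"].values())
plural_total = sum(counts["plural"].values())
grand_total = singular_total + plural_total

singular_shares = shares(counts["singular"])
plural_shares = shares(counts["plural"])
number_shares = {
    "singular": round(100 * singular_total / grand_total),
    "plural": round(100 * plural_total / grand_total),
}
```

Under that reading the recomputed shares match the table: third person accounts for 69% of singular and 62% of plural forms, and singular forms make up 69% of all forms.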
the lexicon complexity is resolved within the Pareto frontier either in favor of higher processing costs (more compositional) or in favor of higher lexicon complexity and memory costs (more cumulative coding) (Kemp and Regier, 2012; Kemp et al., 2018;Xu by processing and planning efficiency overriding articulation ease. et al., 2020). Yet, languages prefer the specific choice (corner) Sending the hearer a non-zero phonetic cue facilitates the pro- within the Pareto frontier in high-frequency domains such as the cessing effort on the part of the hearer and thus increases the indexing domain: processing efficiency outweighs lexicon com- chances of a successful transmission of information. A non-zero plexity and, thus, memory (and learnability) costs with linguistic form is also more planning-efficient for the speaker because it items of this order of frequency. The reason for this is that higher provides a straightforward link from meaning to coding, while processing costs are not efficient with high-frequency items that zero is inherently ambiguous by being linked to various meanings are easily learnable and retrievable from the memory anyway and domains. Non-zero coding also alleviates the planning pro- (Kirby, 2001: p. 109). This ties in with Kemp et al. (2018: p. 114) cess because it makes the assessment of whether or not the who claim that the preference for the cumulative coding within context provides enough information unnecessary. the Pareto frontier is found when the lexical domain is important Secondly, it also is the planning efficiency that must be for the culture, if “important for the culture” means that the items responsible for the fact that verbal indexes are almost never of this lexical domains are frequent in this culture (similarly in Xu optional in the languages of the world (Siewierska, 1999; Haig, et al., 2020 for number signs). We conclude from this that pro- 2018). 
This obligatoriness yields redundant uses in those contexts cessing ease outweighs lexicon simplicity and, thus, memory (and that provide enough information for the identification of the learability) costs with linguistic items of this order of frequency. subject referent, as in ven-ī, vid-ī, vic-ī “came-1SG”, “saw-1SG”, To sum up, first, we have established that there is a universal “conquered-1SG” (the last two occurrences of -1SG are increas- attractor state for indexing around which the evolution revolves. ingly redundant because they can be guessed from the previous Second, the properties of the attractor uncover two domains in which context anyway). Planning efficiency overrides articulatory effi- efficiency pressures are most powerful: strive towards less processing ciency here as well. and articulatory effort while strive towards lower lexicon complexity Thirdly, the most articulatory efficient paradigm that would and lower memory costs are weaker efficiency pressures for this also warrant unambiguous information transmission would grammatical category due to its order of frequency. Having said this, not require the plural to have longer forms than the singular. our evidence is cross-linguistic comparative evidence. Ideally, our Thus, theoretically a morphological system of coding all six conclusions should be supported by experimental evidence. distinctions (1SG, 2SG … 3PL) with one segment—e.g., 1SG -a, 2SG -t,3SG -i (or zero), 1PL -k,2PL -o,3PL -r—would per- Data availability fectly fulfill the requirement of accurate information trans- All data analyzed are included in the manuscript and supple- mission under the lowest articulatory effort. Thus, the effect of mentary information file. articulatory efficiency alone does not explain why cross- linguistically the plural forms require more segments than Received: 13 September 2021; Accepted: 24 January 2022; the singular forms if they all may be sufficiently disambiguated by just one segment. 
Multiple segments, however, allow the speakers to gain more production time and the hearer more comprehension time with the less expected meanings (plural in Note this case). The longer forms of the plural fulfill here the 1 Proto-Indo-European (Meier-Brügger, 2010: pp. 173–184), Proto-Turkic (Róna-Tas, function of according the message with constant information 1998: p. 75; Old Turkic in Abduraxmanov, 1997: p. 68; Erdal, 2004: p. 232; Tuguševa, flow (Aylett and Turk, 2004; Levy and Jaeger, 2007;Pluy- 1997: p. 59), Proto-Mayan (Bricker, 1977: p. 2; Schele, 1982: p. 9), Proto-Uralic (Honti, maekers et al., 2005; Uniform Information Density hypothesis 2010: p. 21; Janhunen, 1982: p. 35; Kulonen, 2001; Laanest, 1982 [1975]: pp. 229–30), in Coupé et al., 2019). In turn, the selection of particular Proto-Dravidian (Andronov, 2009: pp. 224–231), Proto-Semitic (Hasselbach, 2004:p. 32; Huehnergard, 2000; Lipiński, 2001: p. 378), Proto-Oceanic (Blust, 1972; François, phonetic segments serves the distinguishability function. 2016: p. 32; Ross, 1988: p. 366, 2002 : 60; Starosta et al., 1981), Proto-Bantu (Meeussen, Fourthly, while it is known that high-frequency items as 1967: pp. 97–99; Schadeberg, 2003 [2014]: p.151), Proto-Sogeram (Daniels, 2015:p. opposed to low-frequency items do not require transparent, 155), Proto-Awyu-Dumut (Wester, 2014: pp. 78–85), Proto-Athabaskan (Hoijer, 1971: compositional coding (Kirby, 2001: p. 108; Christiansen and pp. 127–132; Leer, 2006: p. 429), Proto-Muskogean (Booker, 1980: p. 33), Proto- Chater, 2008: p. 499), our cross-linguistic diachronic evidence Worroran (McGregor and Rumsey, 2009: p. 68), and Proto-Salishan (Newman, 1979: suggests that items as frequent as person–number indexes in fact p. 213, 1980: p. 156), Proto-Kiranti and Proto-rGyalrongic (DeLancey, 2010: p. 15, 2011:p.2, 2014; Jacques, 2012, 2016; LaPolla, 2003: p. 30). 
prefer cumulative coding (number and person being coded by one atomic sign): those families that were not compositional in the proto-language (e.g., Indo-European) did not develop composi- References Abduraxmanov GA (1997) Karaxanidsko-ujgurskij jazyk. In: Tenišev ÈR, Poce- tionality in any of the modern languages, and some of those luevskij EA, Kormušin IV, Kibrik AA (eds.) Jazyki Mira. Tjurkskie jazyki. families that did have compositionality in the proto-language (e.g., “Kyrgyzstan”, Bishkek, pp. 64–74 Awyu-Dumut) removed it in the modern languages at least to Andronov MS (2009) A comparative grammar of the dravidian languages. Beiträge some extent. This “opacification” is also observed in independent zur Kenntnis südasiatischer Sprachen und Literaturen 7. Otto Harrassowitz, words, such as pub from public house (Kanwal et al., 2017). Wiesbaden HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | (2022) 9:58 | https://doi.org/10.1057/s41599-022-01072-0 7 ARTICLE HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | https://doi.org/10.1057/s41599-022-01072-0 Aylett M, Turk A (2004) The smooth signal redundancy hypothesis: a functional Dunn M, Greenhill SJ, Levinson SC, Gray RD (2011) Evolved structure of lan- explanation for relationships between redundancy, prosodic prominence, and guages shows lineage-specific trends in word-order universals. Nature duration in spontaneous speech. Lang Speech 47:31–56 473:79–82 Aylett M, Turk A (2006) Language redundancy predicts syllable duration and the Erdal M (2004) A grammar of old-turkic. Handbook of oriental studies. Handbuch spectral characteristics of vocalic syllable nuclei. J Acoust Soc Am der Orientalistik. Section eight. Central Asia. Vol 3. Brill, Leiden/Boston 119:3048–3058 François A (2016) The historical morphology of personal pronouns in northern Bates D, Mächler M, Bolker B, Walker S (2015) Fitting linear mixed-effects models Vanuatu. In: Pozdniakov K (ed.) Comparatisme et reconstruction: tendances using lme4. 
J Stat Softw 67(1):1–48 actuelles. Faits de Langues. Peter Lang, Bern, pp. 25–60 Bentz CH, Ferrer-i-Cancho R (2016) Zipf’s law of abbreviation as a language Gibson E, Futrell R, Piantadosi ST, Dautriche I, Mahowald K, Bergen L, Levy R universal. In: Bentz CH, Jäger G, Yanovich Y (eds.) In: Proceedings of the (2019) How efficiency shapes human language. Trend Cogn Sci 23(5):389–407 Leiden workshop on capturing phylogenetic algorithms for linguistics. Uni- Givón T (1979) On understanding grammar. Academic Press, New York, NY versity of Tubingen, online publication system. https://publikationen.uni- Greenberg JH (1966) Language universals, with special reference to feature hier- tuebingen.de/xmlui/handle/10900/6855814 archies. Mouton, The Hague Bickel B, Witzlack-Makarevich A, Zakharko T (2014) Typological evidence against Greenberg JH (1978) Diachrony, synchrony and language universals. In: Greenberg universal effects of referential scales on case alignment. In: Bornkessel- JH, Ferguson CA, Moravcsik EA (eds.) Universals of human language, Vol. 1: Schlesewsky I, Malchukov A, Richards M (eds.) Scales and hierarchies: a method and theory. Stanford University Press, Stanford, pp. 61–92 cross-disciplinary perspective on referential hierarchies. De Gruyter, Mouton, Haig G (2018) The grammaticalization of object pronouns: why differential object Berlin, pp. 7–44 indexing is an attractor state. Linguistics 56(4):781–818 Blust RA (1972) Proto-Oceanic addenda with cognates in non-Oceanic Aus- Haspelmath M (1999) Optimality and diachronic adaptation. Zeitschrift für tronesian languages: a preliminary list. WPLUH 411:1–43 Sprachwissenschaft 18(2):180–205 Booker KM (1980) Comparative muskogean: aspects of proto‐muskogean verb Hasselbach R (2004) Final vowels of pronominal suffixes and independent personal morphology. University of Kansas dissertation, Lawrence, KS pronouns in semitic. 
J Semit Stud 49(1):1–20 Bornkessel-Schlesewsky I, Schlesewsky M (2014) Competition in argument inter- Hoijer H (1971) Athapaskan morphology. In: Sawyer J (ed.) Studies in American pretation: evidence from the neurobiology of language. In: MacWhnney B, Indian Languages. University of California Publications in Linguistics 65. Malchukov A, Moravcsik E (eds.) Competing motivations in grammar and University of California Press, Berkley, pp. 113–147 usage. Oxford University Press, pp. 107–126 Honti L (2010) Personae ingratissimae? A 2. személyek jelölése az uráliban. Bricker VR (1977) Pronominal inflection in the Mayan languages. Occasional Nyelvtudományi Közlemények 107:7–57 Paper 1. Middle American Research Institute, New Orleans Hopper P (1987) Emergent Grammar. Berkley Linguistic. Society 13:139–157 Bybee JL (1988) The diachronic dimension, chapter 13. In: Hawkins JA (ed.) Huehnergard, J. 2000. Comparative Semitic Linguistics. Unpublished. Cambridge, Explaining language universals. OUP, pp. 350–379 Mass Bybee JL (2001) Phonology and language use. Cambridge University Press, Jacques G (2012) Agreement morphology: the case of Rgyalrong and Kiranti. Lang Cambridge Linguist 13(1):83–116 Bybee JL (2003) Mechanisms of change in grammaticization: the role of frequency. Jacques G (2016) Le sino-tibétain: polysynthétique ou isolant? Faits de langues In: Joseph BD, Janda RD (eds.) The Handbook of Historical Linguistics. 47(1):61–74 Blackwell, Oxford, pp. 602–623 Jaeger TF, Tily H (2010) On language ‘utility’: processing complexity and com- Bybee JL (2006) From usage to grammar: the mind’s response to repetition. Lan- municative efficiency. Cogn Sci 2:323–335 guage 82(4):711–733 Jaeger TF, Buz E (2018) Signal reduction and linguistic encoding. In: Fernández Bybee JL (2008) Formal universals as emergent phenomena: the origins of structure EM, Smith Cairns H (eds.) The Handbook of Psycholinguistics. John Wiley & preservation. In: Good J (ed.) 
Language universals and language change. Sons Oxford University Press, pp. 108–121 Janhunen J (1982) On the structure of Proto-Uralic. Finnisch-ugrische For- Bybee J, Hopper P (2001) Introduction to frequency and the emergence of lin- schungen 44:23–42 guistic structure. In: Bybee J, Hopper P (eds.) Frequency and the emergence Kanwal J, Smith K, Culbertson J, Kirby S (2017) Zipf’s Law of Abbreviation and the of linguistic structure [Typological studies in language 45]. John Benjamins, Principle of Least Effort: language users optimise a miniature lexicon for pp. 1–27 efficient communication. Cognition 165:45–52 Bybee J (1985) Morphology: A Study of the Relations between Meaning and Form. Karlsson F (1986) Frequency considerations in morphology. STUF –Lang Typol Amsterdam/Philadelphia: John Benjamins Univ 39(1):19–28 Christiansen MH, Chater N (2008) Language as shaped by the brain. Behav Brain Kemp C, Regier T (2012) Kinship categories across languages reflect general Sci 31:489–558 communicative principles. Science 336:1049–1054 Cohen Priva U, Jaeger TF (2018) The interdependence of frequency, predictability, Kemp C, Xu Y, Regier T (2018) Semantic typology and efficient communication. and informativity in the segmental domain. Linguist Vanguard 4(2):1–13 Ann Rev Linguist 4:109–128 Coupé CH, Oh YM, Dediu D, Pellegrino F (2019) Different languages, similar Kirby S (2001) Spontaneous evolution of linguistic structure-an iterated learning encoding efficiency: comparable information rates across the human com- model of the emergence of regularity and irregularity. IEEE Trans Evol munication niche. Sci Adv 2594 Comput 5:102–110 Creissels D (2008) Direct and indirect explanations of typological regularities: the Kulonen UM (2001) Zum n-Element der zweiten Personen besonders im Obu- case of alignment variations. Folia Linguistica 42(1):1–38 grischen. Finnisch-Ugrische Forschungen 56:151–174 Cristofaro S (2012) Cognitive explanations, distributional evidence, and diachrony. 
Stud Lang 36(3):645–670
Cristofaro S (2014) Competing motivation models and diachrony: what evidence for what motivations? In: MacWhinney B, Malchukov A, Moravcsik E (eds.) Competing motivations in grammar and usage. Oxford University Press, Oxford, pp. 282–298
Currie KH, Hume E, Jaeger TF, Wedel A (2018) The role of predictability in shaping phonological patterns. Linguist Vanguard 4(s2):1–15
Cysouw M (2010) On the probability distribution of typological frequencies. In: Ebert CH, Jäger G, Michaelis J (eds.) Math Lang. Springer, Heidelberg, pp. 29–35
Daniels D (2015) A Reconstruction of Proto-Sogeram. Phonology, Lexicon, Morphosyntax. A dissertation in partial satisfaction of the requirements for the degree Doctor of Philosophy in Linguistics. University of California, Santa Barbara
DeLancey S (2010) Towards a history of verb agreement in Tibeto-Burman. Himalayan Linguist 9(1):1–38
DeLancey S (2011) Agreement prefixes in Tibeto-Burman. Himalayan Linguist 10(1):1–29
DeLancey S (2014) Second person verb forms in Tibeto-Burman. Linguist Tibeto-Burman Area 37(1):3–33
Diessel H (2007) Frequency effects in language acquisition, language use, and diachronic change. New Idea Psychol 25:108–127
Laanest A (1982) Einführung in die ostseefinnischen Sprachen. Autorisierte Übertragung aus dem Estnischen von Hans-Hermann Bartens. Buske, Hamburg
LaPolla R (2003) Overview of Sino-Tibetan morphosyntax. In: Thurgood G, Matisoff JA, Bradley D (eds.) Linguistics of the Sino-Tibetan area: The state of the art. Pacific Linguistics Series C, 87. Department of Linguistics, Australian National University, Canberra, pp. 22–42
Leer J (2006) Na-Dene languages. In: Asher RE, Simpson JMY (eds.) The encyclopedia of language and linguistics. Pergamon, Oxford, pp. 428–430
Levshina N (2021) Cross-linguistic trade-offs and causal relationships between cues to grammatical subject and object, and the problem of efficiency-related explanations. Front Psychol 12:648200
Levy R, Jaeger TF (2007) Speakers optimize information density through syntactic reduction. Adv Neural Inform Process Syst 19:849–856
Lipiński E (2001) Semitic Languages: outline of a comparative grammar, 2nd ed. Peeters, Leuven
Malchukov AL (2008) Animacy and asymmetries in differential case marking. Lingua 118:203–221
Maslova E (2000) A dynamic approach to the verification of distributional universals. Linguist Typol 4(3):307–333
Maslova E (2004) Dinamika tipologičeskix raspredelenij i stabil’nost’ jazykovyx tipov [Dynamics of typological distributions and stability of language types]. Voprosy jazykoznanija 5:3–16
McGregor WB, Rumsey A (2009) Worrorran revisited: the case for genetic relations among languages of the Northern Kimberley region of Western Australia. The Australian National University, Canberra
Meeussen AE (1967) Bantu grammatical reconstructions. Africana Linguistica 3:79–121
Meier-Brügger M (2010) Indogermanische Sprachwissenschaft. 9., durchgesehene und ergänzte Auflage. Unter Mitarbeit von Matthias Fritz und Manfred Mayrhofer. De Gruyter, Berlin
Moroz G (2017) lingtypology: easy mapping for Linguistic Typology. https://CRAN.R-project.org/package=lingtypology
Moroz G (2021) Length of East Caucasian subject indexes: a quantitative research. In: Majsak TA, Sumbatova NR, Testelec YG (eds.) Durqasi xazna. Sbornik statej k 60-letiyu R. O. Mutalova. Buki Vedi, Moscow, pp. 258–282
Newman S (1979) A History of the Salish Possessive and Subject Forms. Int J Am Linguist 45(3):207–223
Newman S (1980) Functional changes in the Salish pronominal system. Int J Am Linguist 46(3):155–167
Norton A (1995) Dynamics: an introduction. In: Port RF, Van Gelder T (eds.) Mind as Motion: explorations in the dynamics of cognition. MIT Press, pp. 44–68
Piantadosi ST, Tily H, Gibson E (2011) Word lengths are optimized for efficient communication. Proc Natl Acad Sci USA 108(9):3526–3529
Pierrehumbert J (2001) Exemplar dynamics: word frequency, lenition and contrast. In: Bybee J, Hopper P (eds.) Frequency effects and the emergence of lexical structure: studies in language. John Benjamins, pp. 137–157
Pluymaekers M, Ernestus M, Baayen RH (2005) Articulatory planning is continuous and sensitive to informational redundancy. Phonetica 62:146–159
R Core Team (2015) R: A language and environment for statistical computing. Austria, Vienna. https://www.R-project.org/
Róna-Tas A (1998) The Reconstruction of Proto-Turkic and the Genetic Question. In: Johanson L, Csató ÉÁ (eds.) The Turkic Languages. CUP, Cambridge, pp. 67–80
Ross M (1988) Proto-Oceanic and the Austronesian languages of western Melanesia. Pacific Linguistics: Series C, 98. Australian National University dissertation. Research School of Pacific and Asian Studies, Canberra
Ross M (2002) Proto Oceanic. In: Lynch J, Ross M, Crowley T (eds.) The oceanic languages. Routledge, London/New York, pp. 54–91
Schadeberg T (2003) Historical linguistics. In: Nurse D, Philippson G (eds.) The Bantu Languages. Routledge, London/New York, pp. 143–163
Schele L (1982) Maya Glyphs. The Verbs. University of Texas Press, Austin
Seržant IA (2019) Weak universal forces: the discriminatory function of case in differential object marking systems. In: Schmidtke-Bode K, Levshina N, Michaelis SM, Seržant I (eds.) Explanation in typology: diachronic sources, functional motivations and the nature of the evidence [Conceptual Foundations of Language Science 3]. Language Science Press, Berlin, pp. 149–178
Seržant IA (2021a) Cyclic changes in verbal person-number indexes are unlikely. Folia Linguistica Historica 42(1):49–86
Seržant IA (2021b) The dynamics of Slavic morphosyntax is primarily determined by the geographic location and contact configuration. Scando-Slavica 67(1):65–90
Seržant IA (2021c) Dataset for the paper “Universal attractors in language evolution provide evidence for the kinds of efficiency pressures involved” [Data set]. Version 4. Zenodo. https://doi.org/10.5281/zenodo.6028260
Seyfarth S (2014) Word informativity influences acoustic duration: effects of contextual predictability on lexical representation. Cognition 133(1):140–155
Siewierska A (1999) From anaphoric pronoun to grammatical agreement marker: why objects don’t make it. Folia Linguistica 33(1/2):225–251
Siewierska A (2010) Person asymmetries in zero expression and grammatical functions. In: Floricic F (ed.) Essais de typologie et de linguistique générale. Mélanges offerts à Denis Creissels. Presses de L’Ecole Normale Supérieure, Paris, pp. 471–485
Sóskuthy M, Hay J (2017) Changing word usage predicts changing word durations in New Zealand English. Cognition 166:298–313
Starosta S, Pawley AK, Reid LA (1981) The evolution of focus in Austronesian. Paper presented to the Third International Conference on Austronesian Linguistics, Bali. Abridged version published. In: Halim A, Carrington L, Wurm SA (eds.) Papers from the Third International Conference on Austronesian Linguistics. Vol. 2. Tracking the travellers. Dept. of Linguistics, Australian National University, Canberra, pp. 145–170
Tuguševa LJ (1997) Drevneujgurskij jazyk. In: Tenišev ĖR, Poceluevskij JeA, Kormušin IV, Kibrik AA (eds.) Jazyki mira. Tjurkskie jazyki. Izdatel’skij dom “Kyrgyzstan”, Biškek, pp. 54–63
Wester R (2014) A linguistic history of Awyu-Dumut: Morphological Study and Reconstruction of a Papuan Language Family. Doctoral dissertation, Vrije Universiteit Amsterdam
Xu Y, Liu E, Regier T (2020) Numeral systems across languages support efficient communication: from approximate numerosity to recursion. Open Mind 4:57–70
Zipf G (1935) The psychobiology of language. Routledge, London

Acknowledgements
The first author has received funding from the Heisenberg grant SE 2838/1-1 “Exploring linguistic diversity” of the German Research Foundation (Deutsche Forschungsgemeinschaft). The second author gratefully acknowledges the support he has received within the Basic Research Program of the National Research University Higher School of Economics.

Funding
Open Access funding enabled and organized by Projekt DEAL.

Competing interests
The authors declare no competing interests.

Ethical approval
Not applicable.

Informed consent
Not applicable.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1057/s41599-022-01072-0.

Correspondence and requests for materials should be addressed to Ilja A. Seržant.

Reprints and permission information is available at http://www.nature.com/reprints

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2022

Journal

Humanities and Social Sciences Communications, Springer Journals

Published: Feb 17, 2022
