The evolution of high-fidelity social learning
Montrey, Marcel;Shultz, Thomas R.
2020-10-06 00:00:00
arXiv:2010.02439v1 [physics.soc-ph] 6 Oct 2020

A defining feature of human culture is that knowledge and technology continually improve over time. Such cumulative cultural evolution (CCE) probably depends far more heavily on how reliably information is preserved than on how efficiently it is refined. Therefore, one possible reason that CCE appears diminished or absent in other species is that it requires accurate but specialized forms of social learning at which humans are uniquely adept. Here, we develop a Bayesian model to contrast the evolution of high-fidelity social learning, which supports CCE, against low-fidelity social learning, which does not. We find that high-fidelity transmission evolves when (1) social and (2) individual learning are inexpensive, (3) traits are complex, (4) individual learning is abundant, (5) adaptive problems are difficult and (6) behaviour is flexible. Low-fidelity transmission differs in many respects. It not only evolves when (2) individual learning is costly and (4) infrequent but also proves more robust when (3) traits are simple and (5) adaptive problems are easy. If conditions favouring the evolution of high-fidelity transmission are stricter (3 and 5) or harder to meet (2 and 4), this could explain why social learning is common, but CCE is rare.

1 Introduction

Humanity's unparalleled cultural and technological sophistication has been widely attributed to our ability to not just share information, but continually build upon it as well [1, 2]. This process, called cumulative cultural evolution (CCE), has resulted in knowledge and technology that no single generation could produce on its own. However, despite extensive evidence of culture in a wide range of species [3], non-human animals have demonstrated only a limited capacity for CCE. Not only has observational evidence proved scarce and contentious [4], but experiments have shown that CCE can be surprisingly difficult to evoke even in closely related primates [5, 6]. While some examples have been elicited in various species [7], these often involve extensive human intervention and remain comparatively modest. This raises a question that has perplexed biologists, psychologists and anthropologists alike: what makes humans, if not unique in our capacity for CCE, uniquely adept at producing it?

CCE arises when social learning preserves information between generations, allowing individual learning or lucky errors in transmission to refine it [8]. This process probably depends far more heavily on how reliably information is preserved than on how efficiently it is refined, because the more knowledge accumulates, the more there is to rediscover or reinvent when transmission fails.
Theoretical models explicitly support this idea [9] and often find that transmission fidelity must pass a threshold for culture to accumulate [10] (though see cultural attractor theory [11] for an alternative view). Notably, humans transmit information with exceptionally high fidelity by not only communicating through language, but also imitating more accurately [6] and robustly [12], leveraging a more sophisticated theory of mind [13], showing natural inclinations toward pedagogy [14] and practicing a far wider range of teaching behaviours [15]. This has led to the view that CCE relies on accurate but specialized forms of social learning at which humans are particularly adept [16, 17, 2].

Precisely what social learning mechanisms underlie CCE remains unclear, however. Researchers have long emphasized the role of imitation (process-copying) and teaching, drawing sharp contrasts with less accurate forms of social learning like emulation (product-copying) [16, 2, 17]. On this front, transmission chain and laboratory microsociety studies have yielded contradictory results. Some have found that imitation and emulation both support CCE [18–20], while others suggest that emulation is insufficient [21, 22]. To complicate matters further, studies emphasizing ecological validity have found that even imitation fails to preserve early stone tool manufacturing (knapping) techniques. Teaching through gesture [23] or even language [24] may thus be critical to human-like CCE.

Given this empirical ambiguity, it may be useful to draw a functional distinction between high-fidelity social learning that supports CCE and low-fidelity social learning that does not, regardless of what the underlying mechanisms turn out to be. Bayesian models drawn from work on language evolution have shown how this can be achieved [25, 26]. These reveal that when social learning is captured as sampling and inference, it is too low-fidelity for knowledge to accumulate [25].
However, when social learning is captured as the direct transmission of beliefs [25] or information about those beliefs [26], it can give rise to CCE. A Bayesian framework thus delineates between these two types of learning in a mechanism-agnostic way.

Thus far, such models have largely been used to study cultural evolution in transmission chains. However, they also present an opportunity to address a more fundamental question: why would biological evolution produce high-fidelity social learning in some species and not others? Early models showed that CCE cannot explain the evolution of accurate transmission, because CCE would take many generations to pay for this upfront investment [16]. As a result, much of the CCE literature has taken such transmission for granted and focused on other factors instead, such as demography, social connectedness, transmission biases and filtering of maladaptive traits [7]. Here, we develop a Bayesian model that contrasts the evolution of high- and low-fidelity social learning directly. Doing so reveals that high-fidelity transmission evolves under different conditions than social learning that spreads culture but does not refine it.

2 Model

Consider a population facing an adaptive problem that involves estimating a set of parameters, $\boldsymbol{\theta} = \{\theta_1, \theta_2, \ldots, \theta_x\}$, where each $\theta$ takes some value between 0 and 1. Beliefs about each $\theta$ are encoded as a probability distribution, $p(\theta)$, that describes which values an individual deems likely and which it does not. For example, if $\boldsymbol{\theta}$ encapsulates knowledge about constructing a spear, then elements $\theta_1$, $\theta_2$ and $\theta_3$ could represent the spear's ideal length, diameter and center of gravity (where each characteristic is normalized to fall between some minimum plausible value, represented by $\theta = 0$, and some maximum plausible value, represented by $\theta = 1$). Similarly, $\boldsymbol{\theta}$ could encode knowledge about knapping, where $\theta_1$ through $\theta_x$ represent the ideal striking platform angle, flaking surface concavity, distance from the edge, amount of force to apply, etc. Alternatively, $\boldsymbol{\theta}$ could capture how much time and effort to devote to one food patch as opposed to another and thus encode a foraging strategy.

Learning occurs when beliefs, $p(\theta)$, change in response to new data, $d$, resulting in an updated set of beliefs, $p(\theta \mid d)$. This is modelled as Bayesian inference,

$$p(\theta \mid d) = \frac{P(d \mid \theta)\, p(\theta)}{\int P(d \mid \theta)\, p(\theta)\, d\theta}, \tag{2.1}$$

where posterior beliefs, $p(\theta \mid d)$, are a product of prior beliefs, $p(\theta)$, and the likelihood of observing the data if those priors are true, $P(d \mid \theta)$. Bayesian inference thus takes a learner's beliefs and updates them with new data, such that surprising data change beliefs to a greater extent. The denominator is simply a normalizing term, which ensures that probabilities integrate to 1. Beliefs about each $\theta$ follow a beta distribution and data, $d$, consist of either $n$ samples drawn from the environment or $m$ samples drawn from the population.

After learning, individuals select the most plausible value of $\theta$ as their estimate. This is the posterior distribution's mode,

$$\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\, p(\theta \mid d), \tag{2.2}$$

which can be calculated directly from a beta distribution's shape parameters: $\hat{\theta}_{\mathrm{MAP}} = (\alpha - 1)/(\alpha + \beta - 2)$. This makes our model analytically tractable, because it allows us to reason in terms of the data individuals observe rather than the resulting distributions.

Taken together, these estimates shape the individual's trait. This trait's efficiency is defined by

$$z = x - \sum_{i=1}^{x} \left|\theta_i - \hat{\theta}_i\right|, \tag{2.3}$$

where $\hat{\boldsymbol{\theta}} = \{\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_x\}$ are the individual's estimates after learning and $x$ is the trait's complexity (the set's cardinality). When estimates lie close to their ideal values, absolute error is minimized and trait efficiency approaches $z = x$. Conversely, when estimates lie far from their ideal values, error is maximized and trait efficiency is low ($z = 0$ in the extreme case where each $\theta_i$ and $\hat{\theta}_i$ take opposite values of 0 and 1).

This formulation makes several simplifying assumptions.
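Before turning to those assumptions, a small sketch may make equations (2.1)–(2.3) concrete. It assumes Bernoulli (success or failure) samples, for which a beta prior updates in closed form. The function names and the Beta(4, 10) prior are illustrative choices, not part of the model specification.

```python
from typing import Sequence, Tuple


def update_beta(alpha: float, beta: float, samples: Sequence[int]) -> Tuple[float, float]:
    """Conjugate update of a Beta(alpha, beta) belief with 0/1 samples (eq. 2.1)."""
    successes = sum(samples)
    failures = len(samples) - successes
    return alpha + successes, beta + failures


def map_estimate(alpha: float, beta: float) -> float:
    """Posterior mode of Beta(alpha, beta), eq. (2.2); needs alpha, beta > 1."""
    if alpha <= 1 or beta <= 1:
        raise ValueError("the mode formula requires alpha > 1 and beta > 1")
    return (alpha - 1) / (alpha + beta - 2)


def trait_efficiency(ideal: Sequence[float], estimates: Sequence[float]) -> float:
    """z = x - sum_i |theta_i - theta_hat_i|, eq. (2.3)."""
    if len(ideal) != len(estimates):
        raise ValueError("ideal values and estimates must have the same length")
    return len(ideal) - sum(abs(t - e) for t, e in zip(ideal, estimates))


if __name__ == "__main__":
    # Hypothetical one-parameter trait with a Beta(4, 10) prior (mode 0.25).
    prior = (4.0, 10.0)
    posterior = update_beta(*prior, samples=[1, 1, 1, 0, 1, 1])
    print(map_estimate(*prior))      # 0.25
    print(map_estimate(*posterior))  # 8/18, roughly 0.444
```

With this hypothetical prior, five successes and one failure move the estimate from 0.25 to 8/18, so the trait efficiency in equation (2.3) rises whenever the ideal value lies above the prior mode.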
First, a trait's maximum efficiency grows linearly with trait complexity ($x$). We will see later that this assumption can be weakened to include other growth rates (e.g. logarithmic), subject to some constraints. Second, each trait has a single optimal variant (a unimodal adaptive landscape), which is not necessarily true in complex domains like tools [27]. Third, each parameter is independent, with the ideal value of one having no effect on the ideal value of others. In reality, such contingencies do occur, for example, in knapping [28].

In our model, priors reflect common intuitions about $\theta$, whose influence diminishes with learning. These may arise through similarities in genes, ontogeny, previous experience, etc. For example, if individuals share only weak intuitions about the ideal length of a spear, some novices could make long spears while others make short ones. Alternatively, if individuals share strong biases about the amount of force to apply when knapping, novices could consistently overestimate this parameter. In fact, such a pattern has been observed in experiments [28]. We use an asterisk to denote prior estimates, $\theta^*$, and trait efficiency, $z^*$.

An adaptive problem's difficulty can be defined as the average distance between a parameter's ideal value and the prior estimate, $f = \frac{1}{x}\sum_{i=1}^{x} \left|\theta_i - \theta_i^*\right|$. When problems are hard, the optimal trait is unintuitive and a lot of learning is needed. Conversely, when problems are easy, efficient solutions are obvious, and there is little or nothing to learn. This could be due to luck, shared relevant experience or even because evolution has yielded an innate adaptive behaviour [29].

2.1 Individual learning

Individual learning involves interacting directly with the environment, through observation, exploration or trial-and-error. We formalize this as sampling a random variable $X$, where $E[X] = \theta$. For example, in foraging, a sample could indicate whether a given food patch was productive or unproductive, such that $X \sim \mathrm{Bernoulli}(\theta)$. Alternatively, in knapping, a sample could indicate the distance from the platform edge that produced a viable flake. Distances closer to the ideal could be more likely to succeed, such that $X \sim \mathcal{N}(\theta, \sigma^2)$. Let $n$ be the average number of samples per parameter. The average individual learner's estimate is thus

$$\hat{\theta}_I = \frac{n\theta + v\theta^*}{n + v}, \tag{2.4}$$

which reflects the combined influence of the environment ($\theta$) and the prior ($\theta^*$). Note that the relative weight placed on the prior, $v > 0$, can be understood as the number of 'virtual samples' that would be needed to form that distribution. Because more genuine samples are needed to overcome stronger priors, $v$ serves as a measure of conservatism.

Each sample comes at some cost, $c \geq 0$, which represents time, energy, opportunity cost, risk of injury or predation, etc. More sampling yields a more efficient trait, but comes at a greater overall cost, $cnx$. For example, making three spears gives more insight into the ideal length of a spear than making two would, but requires additional time, effort, material and risk. The average individual learner's fitness is thus

$$\omega_I = \omega_0 + z_I - cnx, \tag{2.5}$$

where $\omega_0$ represents aspects of fitness unrelated to learning. In Bayesian inference, each sample improves accuracy less than the preceding one. Because the per-sample cost ($c$) is invariant, this captures the notion of diminishing returns. The optimal learning rate, which maximizes expected utility and fitness, is

$$n = \sqrt{\frac{fv}{c}} - v. \tag{2.6}$$

Intuitively, individuals learn more when doing so is inexpensive (low $c$) and problems are difficult (high $f$). Conservatism ($v$) has a more complicated effect. When individuals are highly conservative, it's not worth collecting many samples, because beliefs barely change with new data. Likewise, when priors are extremely diffuse, few samples are needed to sway the learner. Sampling peaks when behaviour is flexible and priors are weak, but not so weak that individuals show no skepticism toward surprising data.

Combining equations (2.3), (2.4) and (2.6) gives the average individual learner's trait efficiency:

$$z_I = x\left(1 - \sqrt{cfv}\right). \tag{2.7}$$

Because $v > 0$, individual learning cannot reliably acquire the optimal trait, $z_I = x$, unless learning is free ($c = 0$) or the initial trait is already optimal ($f = 0$). If learning is costly and difficult, then individual learning only partially improves the trait and CCE is needed to reliably acquire the ideal variant.

2.2 Low-fidelity social learning

In low-fidelity social learning, individuals learn about the environment by observing others' behavioural outcomes. For example, seeing many long spears but few short ones is indirect evidence that longer spears are more effective. In reality, behavioural outcomes often fail to accurately reflect beliefs, resulting in incomplete information and errors in inference [30]. To capture this notion, learners do not sample an estimate directly, but rather a random variable $Y$, where $E[Y] = \hat{\theta}$. For instance, if a demonstrator tries to build spears of length $\hat{\theta}$, errors in production may result in some shorter and some longer ones, such that $Y \sim \mathcal{N}(\hat{\theta}, \sigma^2)$. Let $m$ be the average number of samples per parameter. The average low-fidelity social learner's estimate is thus

$$\hat{\theta}_L = \frac{m\hat{\theta} + v\theta^*}{m + v}, \tag{2.8}$$

which reflects the combined influence of social information ($\hat{\theta}$) and the prior ($\theta^*$). We confirm in supplementary material, §1.1 that such social learning does not support CCE, because it cannot improve average trait efficiency over time when combined with individual learning.

Each sample comes at some cost, $k \geq 0$, which represents the expenditure and risk involved in surveilling others. Collecting additional samples allows learners to more faithfully reproduce the average trait, but comes at a higher overall cost, $kmx$. The average low-fidelity social learner's fitness is thus

$$\omega_L = \omega_0 + z_L - kmx. \tag{2.9}$$

As in individual learning, sampling yields diminishing returns. The optimal social learning rate is

$$m = \sqrt{\frac{v}{kx}\sum_{i=1}^{x}\left|\hat{\theta}_i - \theta_i^*\right|} - v, \tag{2.10}$$

though such learning should be avoided entirely, $m = 0$, if others haven't improved on the initial trait, $z \leq z^*$. More effort is devoted to learning when doing so is inexpensive (low $k$) and there is more knowledge to acquire (the summed term is large). Combining equations (2.3), (2.8) and (2.10) gives the average low-fidelity social learner's trait efficiency,

$$z_L = z - \sqrt{kvx\sum_{i=1}^{x}\left|\hat{\theta}_i - \theta_i^*\right|}. \tag{2.11}$$

Such learning cannot reliably preserve others' knowledge, $z_L = z$, unless learning is free ($k = 0$) or there is nothing to learn (the summed term is 0). Otherwise, some knowledge is lost in transmission and supplanted by prior beliefs [25].

2.3 High-fidelity social learning

High-fidelity social learning involves faithfully reproducing an existing trait, which we formalize as copying another individual's estimates. One way this could happen is if a learner adopts identical underlying beliefs [25]. For example, language or gesture could convey everything a teacher knows about where to aim blows when knapping. Alternatively, a learner could adopt beliefs that are merely compatible with the observed trait (i.e. different distributions with the same posterior mode).
For instance, accurately imitating a demonstrator's construction process could yield spears of the same average length, but subtly different beliefs about the relative efficiency of shorter or longer ones. In either case, the average high-fidelity social learner's estimate is identical to that of the population, $\hat{\theta}_H = \hat{\theta}$, as is its trait efficiency, $z_H = z$. We confirm in supplementary material, §1.2 that such social learning supports CCE, because it can improve average trait efficiency over time when combined with individual learning.

Each parameter individuals copy comes at some cost. Thus far, we have assumed that social and individual learning rely on the same cognitive mechanisms [31] and that the evolution of social learning primarily reflects changes in attention and motivation. However, high-fidelity transmission may involve more specialized and cognitively demanding forms of social learning [16, 2, 17]. For example, if it involves accurate imitation, then it may require specialized neural machinery for parsing and reproducing bodily actions that has undergone significant elaboration in the hominin lineage [32, 12]. Alternatively, if it involves human-like teaching, then it may require the capacity for gesture or even language [33]. Though some researchers argue that high-fidelity transmission is as much a product of cultural as of biological evolution [34], some genetic endowment is clearly needed, even if this consists of a mere 'start-up kit' that is later refined through culture [35]. That being said, the addition of brain tissue is notoriously energetically expensive, particularly during development [36]. The cost of high-fidelity social learning may thus consist of two components: a dynamic component, $g_d$, that reflects the expenditure and risk involved in employing such learning; and a static component, $g_s$, that reflects the cost of developing and maintaining it. This gives an overall cost $g_d x + g_s \geq 0$, where the dynamic cost grows with how extensively learning is employed ($x$), but the static cost is invariant. To capture both components as a single per-parameter cost, we define $g = g_d + g_s/x$. The average high-fidelity social learner's fitness is thus

$$\omega_H = \omega_0 + z - gx. \tag{2.12}$$

3 Results

To contrast the evolution of high- and low-fidelity social learning, we track the fate of rare social learning mutants in a monomorphic population of individual learners, where $z = z_I$ and $\omega = \omega_I$. Social learning goes extinct if these mutants' average fitness ($\omega_L$ or $\omega_H$) falls below that of the resident type ($\omega_I$). Conversely, social learning evolves if these mutants have higher fitness and their invasion results in either fixation or coexistence (a dimorphic equilibrium). We do not consider dimorphic resident populations, because for our purposes the effects would be fairly straightforward. Namely, the resident population's average trait efficiency ($z$) would decrease as low-fidelity social learning became more common, which is equivalent to a monomorphic case where high-fidelity transmission is more costly (i.e. $g$ is $\Delta z/x$ higher).

While social learning is often subject to frequency-dependent selection [37], this does not concern us for two reasons. First, high-fidelity social learning's fitness is not frequency-dependent at all, because it simply maintains the population's average trait ($z_H = z$) and this trait's efficiency does not change over time (cf. [37]). Any fitness advantage it has as a rare mutant thus persists until fixation. Second, while low-fidelity social learning's fitness is frequency-dependent, this never brings about its extinction. Rather, as such mutants become more common, their average trait efficiency declines until their fitness equalizes with that of the resident type, $\omega_L = \omega_I$, resulting in a dimorphic equilibrium. We show in supplementary material, §2 that such an equilibrium exists and is stable whenever they invade.

Social learning cost is central to our analysis, because it reveals both when a given type of learning could conceivably pay for itself and when it is best equipped to do so. Setting $\omega_L = \omega_I$ gives the maximum per-sample cost of low-fidelity social learning,

$$k_{\max} = \frac{\left(\sqrt{cv} - \sqrt{f}\right)\left(2\sqrt{f - \sqrt{cfv}} + \sqrt{cv} - 2\sqrt{f}\right)}{v}, \tag{3.1}$$

and setting $\omega_H = \omega_I$ gives the maximum per-parameter cost of high-fidelity social learning,

$$g_{\max} = \sqrt{cfv} - cv. \tag{3.2}$$

At or above these values, such learning no longer confers a fitness advantage. Identifying when $k_{\max} > 0$ or $g_{\max} > 0$ thus reveals the minimum requirements for social learning to evolve. More importantly, conditions that maximize $k_{\max}$ or $g_{\max}$ reveal when such learning withstands the broadest possible range of costs and is thus most likely to evolve (though such conditions do not necessarily maximize its prevalence in the population, learning rate, etc.).

3.1 Social learning cost (k and g)

For social learning to evolve, it must either improve on the average trait or reduce the cost of acquiring it [38]. Although transmission errors can yield a superior trait, lucky mistakes are no more likely to be observed than unlucky ones (cf. [30]). Therefore, social learning must reduce cost. Setting $\omega_L > \omega_I$ reveals that low-fidelity social learning evolves when its savings in cost exceed its average loss in trait efficiency

$$cnx - kmx > z_I - z_L. \tag{3.3}$$

The more errors in transmission, the larger the necessary savings. By contrast, high-fidelity social learning makes virtually no errors in transmission. It thus evolves ($\omega_H > \omega_I$) when it offers nearly any savings in cost

$$cnx - gx > 0. \tag{3.4}$$

(In reality, there will likely always be some slight, non-zero level of error to overcome.) Taken together, equations (3.3) and (3.4) imply that high-fidelity social learning tolerates a greater overall cost by maintaining more efficient traits.
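A minimal numerical sketch can serve as a sanity check on these thresholds. It assumes the closed forms $n = \sqrt{fv/c} - v$ and $g_{\max} = \sqrt{cfv} - cv$, together with our reading of equation (3.1) for $k_{\max}$. The function names are illustrative. The low-fidelity payoff further assumes that the population's estimate lies between the prior and the ideal value, as it does for individual learners under equation (2.4).

```python
import math
from typing import Tuple


def n_opt(c: float, f: float, v: float) -> float:
    """Optimal individual learning rate, eq. (2.6); assumes c > 0 and f > c * v."""
    return math.sqrt(f * v / c) - v


def g_max(c: float, f: float, v: float) -> float:
    """Maximum per-parameter cost of high-fidelity social learning, eq. (3.2)."""
    return math.sqrt(c * f * v) - c * v


def k_max(c: float, f: float, v: float) -> float:
    """Maximum per-sample cost of low-fidelity social learning (reading of eq. 3.1)."""
    root_cv, root_f = math.sqrt(c * v), math.sqrt(f)
    inner = math.sqrt(max(0.0, f - math.sqrt(c * f * v)))
    return (root_cv - root_f) * (2 * inner + root_cv - 2 * root_f) / v


def payoffs(c: float, k: float, f: float, v: float) -> Tuple[float, float]:
    """Per-parameter payoff, (omega - omega_0) / x, of a resident individual
    learner and of a rare low-fidelity social learner copying such residents."""
    n = n_opt(c, f, v)
    err_individual = f * v / (n + v)          # mean |theta - theta_hat_I|, from eq. (2.4)
    gap = f * n / (n + v)                     # mean |theta_hat_I - theta*|
    m = max(0.0, math.sqrt(v * gap / k) - v)  # eq. (2.10), written per parameter
    err_social = err_individual + v * gap / (m + v)  # extra error implied by eq. (2.8)
    return 1 - err_individual - c * n, 1 - err_social - k * m


if __name__ == "__main__":
    c, f, v = 0.005, 0.5, 12.0  # parameter values listed for figure 1
    k = k_max(c, f, v)
    print("n =", n_opt(c, f, v), "g_max =", g_max(c, f, v), "k_max =", k)
    print("payoffs at k = k_max:", payoffs(c, k, f, v))
```

Under these assumptions the two payoffs coincide at $k = k_{\max}$, and $g_{\max}$ equals $cn$, the per-parameter cost that an individual learner pays at its optimal learning rate.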
3.2 Trait complexity (x)

Trait complexity can be eliminated from equation (3.3), because each term grows linearly with $x$. Doing so yields the equivalent expression $cn - km > |\hat{\theta}_I - \hat{\theta}_L|$, which implies that low-fidelity social learning is as likely to evolve when traits are simple as when they are complex. The same is not true of equation (3.4), once we break cost $g$ down into its static and dynamic components. Instead, the evolution of high-fidelity social learning requires crossing a threshold in trait complexity, $x > g_s/(cn - g_d)$, which increases with the cost of both having ($g_s$) and employing ($g_d$) such learning.

Note that this result is not contingent on our assumption that trait efficiency and learning cost grow linearly with respect to trait complexity, but rather that they grow at the same rate. For example, $x$ could still be eliminated from equation (3.3) if efficiency and cost both grew logarithmically (e.g. if increased complexity yielded diminishing returns in efficiency, but learning one parameter made it easier to learn others).

3.3 Individual learning rate (n)

Social learning can only evolve ($k_{\max} > 0$ or $g_{\max} > 0$) when there is knowledge to acquire, $n > 0$. However, different types of social learning benefit from vastly different individual learning rates (figure 1a). This can be seen by finding the values of $n$ that maximize $k_{\max}$ and $g_{\max}$ (after first simplifying these expressions by using equation (2.6) to substitute $c = fv/(n + v)^2$). Doing so reveals that low-fidelity social learning is most likely to evolve when the individual learning rate is low, $n = v/3$, and beliefs are driven mostly by prior expectations. By contrast, high-fidelity transmission is most likely to evolve when the individual learning rate is much higher, $n = v$.

3.4 Individual learning cost (c)

Social learning cannot evolve when individual learning is free, $c = 0$, because it confers no savings.
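The dependence of both thresholds on $c$ can also be explored numerically. The sketch below assumes the closed forms of equations (3.1) and (3.2) and uses illustrative function names. It evaluates $k_{\max}$ and $g_{\max}$ on an evenly spaced grid of costs between $c = 0$ and $c = f/v$, the cost at which the optimal learning rate in equation (2.6) reaches zero, and reports where each threshold is largest.

```python
import math
from typing import Callable, Tuple


def g_max(c: float, f: float, v: float) -> float:
    """Maximum per-parameter cost of high-fidelity social learning, eq. (3.2)."""
    return math.sqrt(c * f * v) - c * v


def k_max(c: float, f: float, v: float) -> float:
    """Maximum per-sample cost of low-fidelity social learning (reading of eq. 3.1)."""
    root_cv, root_f = math.sqrt(c * v), math.sqrt(f)
    # Guard against tiny negative values caused by rounding at c = f / v.
    inner = math.sqrt(max(0.0, f - math.sqrt(c * f * v)))
    return (root_cv - root_f) * (2 * inner + root_cv - 2 * root_f) / v


def best_cost(threshold: Callable[[float, float, float], float],
              f: float, v: float, steps: int = 1600) -> Tuple[float, float]:
    """Grid-search the cost c in (0, f/v) at which `threshold` is largest."""
    upper = f / v
    grid = [upper * i / steps for i in range(1, steps)]
    best = max(grid, key=lambda c: threshold(c, f, v))
    return best, threshold(best, f, v)


if __name__ == "__main__":
    f, v = 0.5, 12.0  # difficulty and conservatism listed for figure 1
    for name, fn in (("k_max", k_max), ("g_max", g_max)):
        c_best, value = best_cost(fn, f, v)
        print(f"{name} is largest near c = {c_best:.5f} (value {value:.6f})")
```

A grid search is used here instead of calculus only to keep the sketch short. The grid resolution (`steps`) is an arbitrary choice and bounds the precision of the reported cost.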
Similarly, social learning cannot evolve when individual learning is too expensive to engage in, $c \geq f/v$, because there is nothing to learn. Between these two extremes, however, different individual learning costs favour different types of social learning (figure 1b). Low-fidelity transmission is most likely to evolve when individual learning is relatively expensive, $c = 9f/(16v)$, whereas high-fidelity transmission benefits from much cheaper individual learning, $c = f/(4v)$. In fact, the latter regime represents a $5/9 \approx 56\%$ reduction in cost.

Figure 1: Resilience of high- and low-fidelity social learning as a function of: (a) the individual learning rate, (b) individual learning cost, (c) problem difficulty and (d) conservatism. Dotted lines indicate when social learning is most likely to evolve. High-fidelity transmission evolves when individual learning is comparatively (a) plentiful and (b) inexpensive. Its evolution may also depend on confronting particularly (c) challenging adaptive problems, because it accrues the benefits of increased problem difficulty more slowly. Finally, while all social learning benefits from (d) behavioural flexibility, high-fidelity transmission could benefit from higher levels of conservatism if these stimulate rather than depress individual learning. Parameters: $c = 0.005$, $f = 0.5$, $v = 12$.

3.5 Problem difficulty (f)

Adaptive problems must be sufficiently difficult, $f > cv$, for learning to evolve. Below this threshold, learning is not cost-effective, because the optimal trait is highly intuitive. Harder problems favour social learning in particular, which becomes more resilient as difficulty increases: $\partial k_{\max}/\partial f > 0$ and $\partial g_{\max}/\partial f > 0$. Social learning is thus most likely to evolve when problems are as difficult as possible, $f = 1$. That being said, high- and low-fidelity transmission react differently to increases in difficulty (figure 1c). Normalizing $k_{\max}$ and $g_{\max}$ by their maximum values reveals that $k_{\max} / k_{\max}|_{f=1} > g_{\max} / g_{\max}|_{f=1}$ over $cv < f < 1$. In other words, low-fidelity transmission accrues the benefits of increased problem difficulty sooner. Larger increases are thus needed for high-fidelity transmission to reap comparable rewards (i.e. a proportional increase in resilience against cost).

3.6 Conservatism (v)

Learning can only evolve when the level of conservatism falls below a threshold, $v < f/c$. Stronger priors make learning uneconomical, because updating beliefs involves collecting too much data. Low-fidelity transmission always benefits from reduced conservatism, $\partial k_{\max}/\partial v < 0$, and is thus most likely to evolve when priors are as diffuse as possible (low $v$). Although high-fidelity transmission also benefits from behavioural flexibility (figure 1d), its ideal level of conservatism is somewhat higher, $v = f/(4c)$. This value is ideal because it maximizes the individual learning rate.

4 Discussion

A longstanding question about CCE is why humans acquired this capacity, which appears diminished or absent in other species. Given the importance of transmission fidelity [9], one explanation is that CCE relies on powerful but specialized forms of social learning at which humans are uniquely adept [16, 2, 17]. By characterizing social learning in terms of its ability to support CCE rather than specific underlying mechanisms, we find that high-fidelity transmission evolves under different conditions than less accurate social learning. Specifically, high-fidelity transmission is most likely to evolve when: (1) social and (2) individual learning are inexpensive, (3) traits are complex, (4) individual learning rates and (5) problem difficulty are high, and (6) behaviour is flexible. Low-fidelity transmission differs in many respects. Not only is it most likely to evolve when individual learning is (2) costly and (4) infrequent, but it is also more robust when (3) traits are simple and (5) problems are easy. If conditions favouring the evolution of high-fidelity transmission are stricter (3 and 5) or harder to meet (2 and 4), this could explain why social learning is common across species, but CCE is rare.

Comparative analyses suggest that reliance on social learning covaries with brain size in primates [39, 40]. Because the hominin brain has undergone several large evolutionary expansions [41], high-fidelity social learning may require the addition of costly brain tissue [16]. Our model suggests that one way to compensate for this increased expenditure would be to lower other costs associated with social learning. This could be achieved in several ways. First, social tolerance and grouping could provide easier, safer and more frequent opportunities to learn from others. In support of this view, sociability has been found to covary with reliance on social learning both within humans [42] and across primates [39]. Second, extended juvenile periods could free up time for social learning [43] without forgoing the opportunities in reproduction and resource acquisition available to an adult. Third, proactive prosociality could promote teaching [44]. Teaching, in this case, does not necessarily refer to the varied and cognitively complex forms it takes in humans [15, 33], but rather to any instance where individuals modify their behaviour to foster others' learning [44]. Pedagogy could thus drive its own evolution, with more elaborate forms of teaching evolving in response to this reduction in cost.

Another way to offset the added cost of high-fidelity transmission would be through higher intake [36].
In line with previous models, we find that accurate social learning tolerates a greater overall cost precisely because it yields more efficient traits [16]. We build on this insight by allowing trait efficiency to grow with trait complexity. Though this relationship is not universal (e.g. simplifying a trait could make it more efficient), complexity is often indicative of improvement. For example, as knapping techniques became more elaborate and hierarchically structured, this resulted in better tools [45]. Following this assumption, we find that high-fidelity social learning is more likely to evolve when traits are complex, because the payoffs in trait efficiency dwarf the cost of developing and maintaining such learning. Unlike other species, early hominins may have crossed a threshold in trait complexity that allowed accurate transmission to evolve. This initial complexity may have arisen for reasons other than social learning, for example because encephalization allowed for more sophisticated action sequences [3].

This explanation is consistent with the archaeological record. Stout and Hecht [32] note that the first stone tools (3.3 Ma) saw only intermittent use and that even the early Oldowan technocomplex (2.6–2.0 Ma) gives the impression of being at the limits of hominin ability. Though the existence of local traditions suggests that Oldowan techniques were culturally transmitted [45], there is a conspicuous lack of evidence for CCE until much later on [46], following significant increases in brain size [41]. During this early period (and perhaps considerably beyond it [47]), social learning seems to have spread and maintained but not significantly refined the manufacture of tools. Not only is there no clear evidence of high-fidelity transmission [46] but the observed cultural dynamics closely align with those found in our model when individual and low-fidelity social learning are combined (supplementary material, §1.1).
Namely, a steady state emerges where average trait efficiency remains stable, but knowledge is repeatedly lost and rediscovered (socially mediated serial reinnovation [48]). In short, rather than high-fidelity social learning spreading and maintaining early lithic technologies, their relative complexity may have instead facilitated its evolution.

The putatively high cost of accurate transmission is only one of the potential impediments to its evolution. Theory suggests that low individual learning rates could also play a role [38]. In line with this view, we find that much higher rates may be needed for the evolution of high- rather than low-fidelity social learning. Notably, the hominin lineage is characterized by large brains and high general intelligence, both of which are predictive of innovation rates in primates [39]. If few species are sufficiently prolific individual learners, this could explain why accurate transmission is rare.

Of course, this raises the question of how adequate individual learning rates could be achieved in the first place. The most obvious way to stimulate individual learning is to reduce its cost. Previous theory [1] and experiments [42] warn that doing so can undercut social learning, however. While we find support for this view, we also find that high-fidelity transmission nevertheless benefits from such reductions. In practice, many of the same factors that mitigate the cost of social learning could do so for individual learning as well. First, grouping could reduce the cost of exploration by allowing individuals to diffuse the associated risks [49]. Second, extended juvenile periods could offer more time for not just social learning, but individual learning as well [43]. Costs borne by juveniles in protected environments, where others provide food, shelter and predator detection [43], would be especially affected.
Finally, even teaching could play a role in the form of opportunity scaffolding, where a teacher does not necessarily demonstrate a behaviour, but rather furnishes students with easy and safe opportunities to learn on their own [50].

Another way to promote individual learning is by facing more challenging adaptive problems. We find that the evolution of high-fidelity social learning may involve confronting particularly difficult social, ecological and technological challenges (i.e. problems where optimal traits fall far outside the 'zone of latent solutions' [17]). There are several reasons to think that hominins confronted such problems. First, because bipedalism allows hominins to cover far larger geographical ranges than other primates, with lifetime home ranges several orders of magnitude greater than those of chimpanzees [51], individuals were likely subjected to greater variability in environmental conditions, available resources, potential threats, etc. If behaviour that is adaptive in one setting is non-adaptive in others, then problems may more frequently require unintuitive solutions. Second, an exceptionally large proportion of the hominin diet consists of high-quality foods [52], such as those procured through hunting, extractive foraging and confrontational scavenging [36]. Compared to foods consumed more regularly by other primates, these are skill intensive and difficult to obtain [52]. Finally, new ways of thinking, interacting with others and leveraging technology undoubtedly presented novel problems of their own. This probably resulted in a unique and challenging cognitive, cultural and technological niche, which further shaped the course of our evolution [32].

Lastly, it is worth commenting on the role of conservatism. A striking empirical finding is that chimpanzees suffer from remarkable functional fixedness and behavioural conservatism, which are thought to contribute to the paucity of CCE in this species [5, 6].
We find that conservatism impedes CCE insofar as it disfavours investment into social learning. However, we also find that high-fidelity social learning could benefit from higher levels of conservatism if these stimulate rather than depress individual learning. For conservatism to impede the evolution of accurate transmission in particular, some additional assumption must be invoked, namely that such transmission also happens to be comparatively expensive.

Individually, our criteria for evolving improved fidelity of transmission seem simple: mitigating the cost of learning, confronting harder adaptive problems, acquiring more complex traits, etc. However, our model emphasizes that meeting any one of these criteria is not necessarily sufficient. For example, even if migration exposes individuals to less intuitive problems, learning could still be too expensive. Similarly, even if grouping lowers the cost of learning, traits could still be too simple. In short, humans probably evolved high-fidelity social learning not by meeting any one (or more) of these criteria perfectly, but by meeting all of them well enough.

Authors' contributions

M.M. and T.R.S. conceived the project. M.M. developed the model, performed the analysis and wrote the manuscript. T.R.S. provided manuscript revisions.

Competing interests

We declare we have no competing interests.

Funding

This work was supported by grants to M.M. (CGS-M and CGS-D) from the Natural Sciences and Engineering Research Council of Canada.

References

[1] R. Boyd and P. J. Richerson, Culture and the evolutionary process. Chicago, IL, US: University of Chicago Press, 1985.
[2] M. Tomasello, The cultural origins of human cognition. Cambridge, MA, US: Harvard University Press, 1999.
[3] A. Whiten, V. Horner, and S. Marshall-Pescini, "Cultural panthropology," Evolutionary Anthropology: Issues, News, and Reviews, vol. 12, no. 2, pp. 92–105, 2003.
[4] D. P. Schofield, W. C. McGrew, A. Takahashi, and S.
Hirata, "Cumulative culture in nonhumans: overlooked findings from Japanese monkeys?," Primates, vol. 59, no. 2, pp. 113–122, 2018.
[5] S. Marshall-Pescini and A. Whiten, "Chimpanzees (Pan troglodytes) and the question of cumulative culture: an experimental approach," Anim Cogn, vol. 11, no. 3, pp. 449–456, 2008.
[6] A. Whiten, N. McGuigan, S. Marshall-Pescini, and L. M. Hopper, "Emulation, imitation, over-imitation and the scope of culture for child and chimpanzee," Philosophical transactions of the Royal Society of London. Series B, Biological sciences, vol. 364, no. 1528, pp. 2417–28, 2009.
[7] A. Mesoudi and A. Thornton, "What is cumulative cultural evolution?," Proceedings of the Royal Society B: Biological Sciences, vol. 285, no. 1880, p. 20180712, 2018.
[8] C. A. Caldwell, E. Renner, and M. Atkinson, "Human Teaching and Cumulative Cultural Evolution," Review of Philosophy and Psychology, vol. 9, no. 4, pp. 751–770, 2018.
[9] H. M. Lewis and K. N. Laland, "Transmission fidelity is the key to the build-up of cumulative culture," Philosophical transactions of the Royal Society of London. Series B, Biological sciences, vol. 367, no. 1599, pp. 2171–80,
[10] M. Enquist, S. Ghirlanda, A. Jarrick, and C.-A. Wachtmeister, "Why does human culture increase exponentially?," Theoretical Population Biology, vol. 74, no. 1, pp. 46–55, 2008.
[11] A. Buskell, "What are cultural attractors?," Biology & Philosophy, vol. 32, no. 3, pp. 377–394, 2017.
[12] F. Subiaul, "What's Special about Human Imitation? A Comparison with Enculturated Apes," Behavioral Sciences, vol. 6, no. 3, p. 13, 2016.
[13] J. Call and M. Tomasello, "Does the chimpanzee have a theory of mind? 30 years later," Trends in Cognitive Sciences, vol. 12, no. 5, pp. 187–192,
[14] G. Csibra and G. Gergely, "Natural pedagogy," Trends in Cognitive Sciences, vol. 13, no. 4, pp. 148–153, 2009.
[15] E. R. R. Burdett, L. G. Dean, and S.
Ronfard, "A Diverse and Flexible Teaching Toolkit Facilitates the Human Capacity for Cumulative Culture," Review of Philosophy and Psychology, vol. 9, no. 4, pp. 807–818, 2018.
[16] R. Boyd and P. J. Richerson, "Why culture is common, but cultural evolution is rare," Proceedings of the British Academy, vol. 88, pp. 77–93, 1996.
[17] C. Tennie, J. Call, and M. Tomasello, "Ratcheting up the ratchet: on the evolution of cumulative culture," Philos Trans R Soc Lond B Biol Sci, vol. 364, no. 1528, pp. 2405–2415, 2009.
[18] C. A. Caldwell and A. E. Millen, "Social Learning Mechanisms and Cumulative Cultural Evolution," Psychological Science, vol. 20, no. 12, pp. 1478–1483, 2009.
[19] E. Reindl, I. A. Apperly, S. R. Beck, and C. Tennie, "Young children copy cumulative technological design in the absence of action information," Scientific Reports, vol. 7, no. 1, p. 1788, 2017.
[20] E. Zwirner and A. Thornton, "Cognitive requirements of cumulative culture: teaching is useful but not essential," Scientific Reports, vol. 5, no. 1, p. 16781, 2015.
[21] M. Derex, B. Godelle, and M. Raymond, "Social learners require process information to outperform individual learners," Evolution, vol. 67, no. 3, pp. 688–697, 2013.
[22] H. Wasielewski, "Imitation Is Necessary for Cumulative Cultural Evolution in an Unfamiliar, Opaque Task," Human Nature, vol. 25, no. 1, pp. 161–179, 2014.
[23] D. M. Cataldo, A. B. Migliano, and L. Vinicius, "Speech, stone tool-making and the evolution of language," PLOS ONE, vol. 13, no. 1, p. e0191071,
[24] T. J. H. Morgan, N. T. Uomini, L. E. Rendell, L. Chouinard-Thuly, S. E. Street, H. M. Lewis, C. P. Cross, C. Evans, R. Kearney, I. de la Torre, A. Whiten, and K. N. Laland, "Experimental evidence for the co-evolution of hominin tool-making teaching and language," Nature Communications, vol. 6, no. 1, p. 6029, 2015.
[25] A. Beppu and T. L.
Griffiths, "Iterated learning and the cultural ratchet," Proceedings of the 31st Annual Conference of the Cognitive Science Society, pp. 2089–2094, 2009.
[26] A. Whalen, L. Maurits, M. Pacer, and T. L. Griffiths, "Cultural Evolution with Sparse Testimony: when does the Cultural Ratchet Slip?," Proceedings of the 36th Annual Conference of the Cognitive Science Society, 2014.
[27] A. Mesoudi and M. J. O'Brien, "The cultural transmission of Great Basin projectile-point technology I: An experimental simulation," American Antiquity, vol. 73, no. 1, pp. 3–28, 2008.
[28] B. Bril, R. Rein, T. Nonaka, F. Wenban-Smith, and G. Dietrich, "The role of expertise in tool use: Skill differences in functional action adaptations to task constraints," Journal of Experimental Psychology: Human Perception and Performance, vol. 36, no. 4, pp. 825–839, 2010.
[29] J. Y. Wakano, K. Aoki, and M. W. Feldman, "Evolution of social learning: a mathematical analysis," Theor Popul Biol, vol. 66, no. 3, pp. 249–258,
[30] J. Henrich, "Demography and Cultural Evolution: How Adaptive Cultural Processes can Produce Maladaptive Losses: The Tasmanian Case," American Antiquity, vol. 69, no. 2, p. 197, 2004.
[31] C. Heyes, "What's social about social learning?," Journal of Comparative Psychology, vol. 126, no. 2, pp. 193–202, 2012.
[32] D. Stout and E. E. Hecht, "Evolutionary neuroscience of cumulative culture," Proceedings of the National Academy of Sciences, vol. 114, no. 30, pp. 7861–7868, 2017.
[33] M. A. Kline, "How to learn about teaching: An evolutionary framework for the study of teaching behavior in humans and other animals," Behavioral and Brain Sciences, vol. 38, p. e31, 2015.
[34] C. M. Heyes, Cognitive Gadgets: The Cultural Evolution of Thinking. Cambridge, MA: Harvard University Press, 2017.
[35] C. M. Heyes and C. D. Frith, "The cultural evolution of mind reading," Science, vol. 344, no. 6190, p. 1243091, 2014.
[36] K. Isler and C. P.
Van Schaik, "How humans evolved large brains: Comparative evidence," Evolutionary Anthropology: Issues, News, and Reviews, vol. 23, no. 2, pp. 65–75, 2014.
[37] A. R. Rogers, "Does Biology Constrain Culture," American Anthropologist, vol. 90, no. 4, pp. 819–831, 1988.
[38] R. McElreath, "The coevolution of genes, innovation, and culture in human evolution," in Mind the Gap: Tracing the Origins of Human Universals (P. M. Kappeler and J. B. Silk, eds.), ch. 21, Springer, 2010.
[39] S. M. Reader and K. N. Laland, "Social intelligence, innovation, and enhanced brain size in primates," Proceedings of the National Academy of Sciences, vol. 99, no. 7, pp. 4436–4441, 2002.
[40] S. M. Reader, Y. Hager, and K. N. Laland, "The evolution of primate general and cultural intelligence," Philos Trans R Soc Lond B Biol Sci, vol. 366, no. 1567, pp. 1017–1027, 2011.
[41] D. Stout, "Stone toolmaking and the evolution of human culture and cognition," Philosophical Transactions of the Royal Society B: Biological Sciences, vol. 366, no. 1567, pp. 1050–1059, 2011.
[42] T. J. H. Morgan, L. E. Rendell, M. Ehn, W. Hoppitt, and K. N. Laland, "The evolutionary basis of human social learning," Proceedings of the Royal Society B: Biological Sciences, vol. 279, no. 1729, pp. 653–662, 2012.
[43] J. M. Burkart, "Socio-cognitive abilities and cooperative breeding," in Learning from Animals?: Examining the Nature of Human Uniqueness, pp. 123–140, New York: Psychology Press, 2008.
[44] J. M. Burkart and C. P. van Schaik, "Revisiting the consequences of cooperative breeding," Journal of Zoology, vol. 299, no. 2, pp. 77–83, 2016.
[45] D. Stout, S. Semaw, M. J. Rogers, and D. Cauche, "Technological variation in the earliest Oldowan from Gona, Afar, Ethiopia," Journal of Human Evolution, vol. 58, no. 6, pp. 474–491, 2010.
[46] C. Tennie, D. R. Braun, L. S. Premo, and S. P.
McPherron, "The Island Test for Cumulative Culture in the Paleolithic," in Vertebrate Paleobiology and Paleoanthropology, pp. 121–133, 2016.
[47] C. P. van Schaik, G. R. Pradhan, and C. Tennie, "Teaching and curiosity: sequential drivers of cumulative cultural evolution in the hominin lineage," Behavioral Ecology and Sociobiology, vol. 73, no. 1, p. 2, 2019.
[48] E. Bandini and C. Tennie, "Spontaneous reoccurrence of 'scooping', a wild tool-use behaviour, in naïve chimpanzees," PeerJ, 2017.
[49] L. Moretti, M. Hentrup, K. Kotrschal, and F. Range, "The influence of relationships on neophobia and exploration in wolves and dogs," Animal Behaviour, vol. 107, pp. 159–173, 2015.
[50] A. H. Boyette and B. S. Hewlett, "Teaching in Hunter-Gatherers," Review of Philosophy and Psychology, 2018.
[51] K. Hill, M. Barton, and A. M. Hurtado, "The emergence of human uniqueness: Characters underlying behavioral modernity," Evolutionary Anthropology: Issues, News, and Reviews, vol. 18, no. 5, pp. 187–200, 2009.
[52] H. Kaplan, K. Hill, J. Lancaster, and A. M. Hurtado, "A theory of human life history evolution: Diet, intelligence, and longevity," Evolutionary Anthropology: Issues, News, and Reviews, vol. 9, no. 4, pp. 156–185, 2000.