
Kuri: A Simulator of Ecological Genetics for Tree Populations

Hindawi Publishing Corporation
Journal of Artificial Evolution and Applications
Volume 2009, Article ID 783647, 5 pages
doi:10.1155/2009/783647

Research Article

Benn R. Alle (1), Lupe Furtado-Alle (1), Cedric Gondro (2), and João Carlos M. Magalhães (1)

(1) Department of Genetics, Polytechnic Center, Federal University of Parana, Jardim das Americas, 81531-990 Curitiba, PR, Brazil
(2) The Institute for Genetics and Bioinformatics (TIGB), University of New England, Armidale, NSW 2351, Australia

Correspondence should be addressed to Benn R. Alle, bennalle@gmail.com

Received 6 October 2008; Accepted 19 May 2009

Recommended by Marylyn Ritchie

This paper presents Kuri, a software package developed to simulate the temporal and spatial dynamics of genetic variability in populations and multispecies communities of trees, as well as their interactions with environmental factors. A conceptual model using agents inspired by Echo models is used to define the environment, the hierarchical structures, and the low-level rules of the system. At the individual agent (tree) level a genetic algorithm is used to model the genotypic structure and the genetic processes; from a small set of simple rules, complex higher-order population and environmental interactions emerge. The program was written in Delphi for the Windows environment, and was designed to be used for educational and research purposes.

Copyright © 2009 Benn R. Alle et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Computational simulations have been widely used to represent and simulate genetic processes. Some examples that fall within the scope of this work include simulations that were mainly developed for educational purposes, such as Populus [1], WinPop [2], Sigex [3], and Genup [4]. Others were developed for practical applications and are used in, for example, programs for forestry management [5, 6]. Simulations are also used to understand complex adaptive systems from a "first-principles" approach. Conceptual models such as Holland's Echo model are widely used [3, 7].

In this paper we discuss the software Kuri, a simulator of ecological genetics for tree populations. The program allows investigation of genetic and microevolutionary phenomena of tree populations or entire forest communities. Kuri can be used to study the dynamics of neutral genetic markers under certain biological factors and environmental constraints, such as dispersion mechanisms and geographical barriers, among others. Either real field data or artificial genetic and environmental parameters can be used for a given simulation. The latter allows creation and testing of hypothetical situations for theoretical and/or educational purposes.

Along the same lines used in the Sigex simulator [3], Kuri mechanistically implements low-level elementary biological rules, for example, Mendelian segregation and mating, which interact to produce patterns that are analogous to those observed in natural populations, such as Hardy-Weinberg equilibrium. Thus, population data generated in Kuri are not obtained by sampling from a distribution; they are instead a quantifiable element at the population level which emerges from the low-level mechanistic interactions at the genetic level.

2. Software Kuri

Kuri was developed using the Delphi programming language, an object-oriented derivative of Pascal. It uses a modular construct which allows easy implementation of new functions and applications and also enables seamless integration with the other modules. The program needs limited computational resources and will run on a 1.2 GHz processor with 512 MB of RAM and 2 GB of free space on the hard disk. The operating system can be Windows XP or above. The current version of Kuri consists of three main modules: the graphical user interface (GUI), a dispersion module, and a genetic operators (KGOP) module.

In Kuri, environmental factors that affect germination/viability of seeds are combined to create a heatmap in which the colors represent different germination probabilities (Figure 1 shows a screenshot of Kuri with a probability heatmap based on satellite images). The GUI allows the user to import images, such as satellite photographs or schematic pictures, to represent features of interest in a given area. Up to five images at a time can be used to represent different environmental parameters in a given simulation. Each image could represent, for example: (1) inhospitable areas where seeds cannot germinate, (2) areas of human intervention, (3) soil depth, (4) soil quality, and (5) hydrology. Note that each environmental parameter can be altered by the user. For instance, the map of soil depth can be replaced by a topographic map of the region, if it is more relevant for a particular research topic. Currently Kuri works with bitmap image files, which are easy to generate or to convert from other file formats with available imaging software.

Figure 1: Graphical user interface of Kuri showing the heatmap of seeding probabilities based on satellite imagery of Tangua Park in Curitiba, Brazil. Each color represents the combined probability of up to five different environmental parameters for each cell in the grid. Black is used to indicate nonviable regions (roads, rivers, built up areas, etc.).

For each of these (up to 5) environmental parameters, probabilities of germination success on its respective map can be assigned to either discrete features or interval ranges for continuous features. Probabilities are color coded on the map and resolved at the pixel level.
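The color-coded, pixel-level assignment of probabilities described above can be illustrated with a small lookup table. The following is only a minimal sketch in Python, not Kuri's Delphi implementation: the function name `probability_layer`, the RGB values, and the probabilities are all hypothetical, chosen purely for illustration.

```python
from typing import Dict, List, Tuple

Color = Tuple[int, int, int]  # an RGB triple, as read from a bitmap

def probability_layer(
    pixels: List[List[Color]],
    lookup: Dict[Color, float],
    default: float = 0.0,
) -> List[List[float]]:
    """Map every pixel color to its germination probability.

    Colors that are missing from the lookup table fall back to `default`
    (an assumption of this sketch, not documented Kuri behavior).
    """
    return [[lookup.get(color, default) for color in row] for row in pixels]

# Hypothetical color scheme: black marks nonviable ground.
BLACK = (0, 0, 0)
GREEN = (0, 128, 0)
BROWN = (139, 69, 19)

lookup = {BLACK: 0.0, GREEN: 0.9, BROWN: 0.4}

# A tiny 2 x 3 "image" standing in for one environmental map.
image = [
    [GREEN, BLACK, BROWN],
    [BROWN, GREEN, GREEN],
]

layer = probability_layer(image, lookup)
```

Because every pixel is resolved through the table independently of its neighbors, the resulting layer can be arbitrarily discontinuous.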
Pixel-level resolution means that each pixel can be assigned its own independent probability, irrespective of neighboring probabilities, allowing for a discontinuous probability landscape. The color scheme of probabilities is user defined, which makes it easy to identify features. For example, areas where the germination of seeds is impossible, such as buildings, streets, water masses, or rocky terrain, are by default represented in black (Figure 1). Since colors and probabilities are linked, changing the probability associated with a specific color updates all points of that color in the map to the new probability.

The overall germination probability map (Figure 1) is generated by multiplying the probabilities for each of these five environmental parameters at each individual pixel. Thus the probability at pixel $px_i$ is simply

$$P(px_i) = \prod_{j=1}^{5} P(ep_{ij}), \tag{1}$$

where $ep_{ij}$ is environmental parameter $j$ at pixel $i$. Color coding is used to represent the final probabilities on a scale between 0% and 100%. This assumes, rather simplistically, that the overall probabilities are independent terms with no interactions between parameters. To model interactions an additional procedure can be used. If one of the parameters is a map of soil fertility and another map holds hydrology information, a table (a page control called the interaction function) can be used to model the interaction between them. This could be a simple scaling function, such that

$$P(px_i) = \lambda \, P(ep_{i1}) \, P(ep_{i2}), \tag{2}$$

where λ is a scalar (in practice λ is simply a monochromatic map with a scalar attached to the single color). More complex nonlinear interactions can be envisioned (e.g., a mapping interval derived from the order terms of a random regression) provided (1) holds.

To simulate the dispersion of pollen and seeds, the total simulation area is divided into cells of user-defined granularity, with height and width in pixels defined by the user. For each grain of pollen and for each seed in a particular cell, the probability of dispersing to another cell depends on the wind. This is achieved through a simple probabilistic function, where an integer ranging between 0 and n (n is a user-defined parameter between zero and the total number of grid cells) is randomly sampled from a uniform distribution and multiplied by the probability of the wind direction (Figure 2). The value of n effectively sets the dispersion boundaries. Wind direction is also a user-defined parameter consisting of a set of probabilities for each cardinal point and a decay rate from the center of dispersion.

Figure 2: Heatmap of the dispersion of 1000000 pollen grains from a common origin in the center of the figure. Darker colors indicate more pollen in a given cell. In this example the wind direction probabilities were the same for all coordinates, hence the symmetric pattern of dispersion.

The KGOP module is essentially a relational database that holds information on the biological community, the various species and their respective biological features, the genetic features of the species, and the genetic composition (essentially all allelic frequencies across all genes) of the population of each species, including the chromosome sets for each species with the number of loci in each chromosome, the linkage map between loci, and the number of alleles in each locus.

For each species the following biological parameters can be stored: the individual occupation range (species boundaries), the dispersion of pollen and seeds, the maximum and minimum ages of reproduction, and images for each age group of the specimens. For this last parameter, Kuri's image collection can be used, or the user can import and add his/her own images. All parameters relate directly back to their original biological meaning and can be used quite intuitively.

For each new species added to the database, the user should specify the number of chromosomes that will be used in the simulation and the number of loci per chromosome. Up to 26 allele slots are available for each locus. The chromosomes and genes that will effectively be used in a simulation can be selected prior to a run. Recombination frequencies between genes should also be specified by the user. Mutation rates are the same for all genes/alleles, but can be changed across runs. Note that mutation in Kuri does not generate new allelic variants; it simply swaps an allele for another one from the database with a uniform probability. Initial populations are by default generated in Hardy-Weinberg equilibrium based on the given allelic frequencies (allelic and genotypic frequencies and chi-squared values for Hardy-Weinberg equilibrium tests are given in Kuri), but different initial population structures can be defined.

Computationally, the genetic mechanisms of the species are simulated using a genetic algorithm (GA) [8]. In previous work [3] we detailed how to implement these genetic processes and showed that they conform to theoretical predictions of population genetics. Briefly, GAs are the class of Evolutionary Computation algorithms which most closely mimic evolutionary processes at the genetic level. GA organisms are represented as linear strings which are referred to as chromosomes. The value in each position of the string is an allele and the position itself is a gene or locus. The combination of values (alleles) in the string (chromosome) can be mapped to a phenotypic expression (note that in Kuri all alleles are neutral). Thus GAs operate at two structural levels: a genotypic and a phenotypic one. Crossover swaps chromosome parts between selected parents to form the offspring, while mutation changes the value of alleles at randomly selected loci.

The practical limits for the software (i.e., number of individuals, size of geographic area, number of generations, etc.) relate to the limits of the MySQL database. The effective size of the tables for the database is normally restricted by the operating system's filesystem. The total number of loci is limited to 128.

3. A Simulation Example: Dispersion Effects

In this section we discuss a simple simulation of seed dispersion effects to illustrate the use of Kuri in population genetics. We created a single-species population in a homogeneous environment with a single locus and two segregating alleles of interest. Initially all plants were heterozygous. We ran the simulation under two scenarios with different wind intensities (strong and mild winds). Wind intensities affect the dispersion process and, consequently, the distribution of genetic variability.

For each scenario, five simulation runs of 25 generations each were performed. In Figure 3(a) the distribution pattern of the plants across generations is depicted under strong winds for the first replicate. Note that the distribution pattern remains homogeneous over the generations, meaning that dispersion occurs with a high level of panmixia, that is, random mating. Figure 3(b) shows the mild wind scenario over generations for the first replicate. Note the formation of endogamic groups, that is, most matings occur within subpopulations, which is to be expected in an environment that does not favor dispersion.

Figure 3: (a) Distribution of organisms in generations 1, 2, 3, 4, 5, 10, 15, 20, and 25 of replicate 1 under the scenario of strong winds. (b) Distribution of organisms in generations 1, 2, 3, 4, 5, 10, 15, 20, and 25 of replicate 1 under the scenario of mild winds. Each point represents the area occupied by an organism. Note how wind strength can affect the population structure and promote a shift from panmixia in (a) to endogamy in (b).

The dynamics over time of the frequencies of heterozygotes for the five repeats are shown in Figure 4(a) (strong wind) and Figure 4(b) (mild wind). In the former, the frequencies of the heterozygotes reach equilibrium after the first generation, oscillating around 0.5. In the second case, a decrease in heterozygosity is noticeable since the subdivision creates a new population structure, an example of a genetic phenomenon known as the Wahlund effect. In all strong wind repeats, equilibrium was reached and maintained across generations, whilst with mild winds the number of homozygotes increases over time.

Figure 4: Changes in frequencies of heterozygotes observed across 25 generations in 5 repetitions (RI to RV), plotted on a scale from 0 to 1 for generations G 00 to G 25. Initially the entire population was heterozygous. (a) Frequencies under the influence of strong winds. Equilibrium is reached after the first generation, oscillating around 0.5. (b) Frequencies under the influence of mild winds. Heterozygosity decreases due to population subdivision (Wahlund effect).

Even this simple scenario can provide insights about natural populations. Jump and Penuelas [9] showed that habitat fragmentation caused by human activity led to high levels of inbreeding due to a Wahlund effect. This was the first study showing that even widespread wind-pollinated trees are negatively affected by habitat fragmentation. Arguably, Kuri could be used to estimate genetic effects under different scenarios. For example, a satellite image of a forested area can be artificially fragmented in different patterns and these used to estimate the genetic effects of deforestation. This has implications for urbanization decisions and can assist in finding a solution that minimizes human impact. Clearly, for realistic results, there has to be reliable data and detailed knowledge of the ecology of the species.

For population studies the simulated data can be treated and analyzed as if it were real data, with the advantage of having full knowledge of the population structure and a handle on the mechanisms that yielded the dataset. For example, data from only the last generation could be used to make inferences about the evolutionary processes that were acting on the population. The degree of deviation from HW equilibrium can be calculated and used to estimate parameters such as $F_{ST}$ [10]. These results can then be compared to the original experimental model to provide insights about the dynamics of the system.

Kuri was designed to simulate microevolutionary phenomena which can be detected through molecular markers, which are usually selectively neutral. Neutral markers have the advantage that, since they are not being selected for or against, any observed fluctuations in allelic frequencies are only due to population structure and environmental effects.

4. Concluding Remarks

Kuri can be used to simulate a wide range of biological scenarios. It allows manipulation of the genomes, alleles, and genotypes of different plant species and the interactions of these populations with the ecosystem. Kuri's database can be used to store different genetic models of species, whether based on real species data or on virtual organisms tailored for educational purposes. Alongside the biological parameters, the user can manipulate and/or create environmental parameters based on field data (such as satellite imagery) to study how these affect the genetic composition and size of populations. The software meets theoretical expectations, but it still has to be tested under realistic scenarios for which real data is available and results can be compared. Due to the lack of real data testing it is still unclear how detailed the field data and the knowledge of the ecology of the species have to be to make valid inferences. Future work and user feedback may assist in answering these questions.

The software is modular. It was designed so that it can be modified and expanded to simulate other phenomena. For example, in the current version all genes/alleles are neutral, but it is straightforward to implement environmental constraints associated with the genotypes in order to simulate natural selection, or even simulate molecular evolution by adding another module that allows handling each allele as a DNA base pair. Kuri is open source and freely available from the web address: http://www.allesys.com.br/kuri/.

References

[1] D. N. Alstad, "Populus: simulations of population biology," 2007, http://www.cbs.umn.edu/populus.
[2] P. A. S. Nuin and P. A. Otto, "A program for representing and simulating population genetic phenomena," Genetics and Molecular Biology, vol. 23, no. 1, pp. 53–60, 2000.
[3] C. Gondro and J. C. M. Magalhães, "A simple genetic algorithm for studies of Mendelian populations," in Recent Advances in Artificial Life, H. Abbass, T. Bossamaier, and J. Wiles, Eds., pp. 85–98, World Scientific, London, UK, 2005.
[4] B. P. Kinghorn, "GENUP - a suite of programs to help teach animal breeding theory," in Proceedings of the 10th Australian Association of Animal Breeding and Genetics, pp. 555–559.
[5] M. Kanashiro, I. S. Thompson, J. A. G. Yared, et al., "Improving conservation values of managed forests: the Dendrogene project in the Brazilian Amazon," Unasylva, vol. 53, no. 209, pp. 25–33, 2002.
[6] B. Degen, H.-R. Gregorius, and F. Scholz, "ECO-GENE, a model for simulation studies on the spatial and temporal dynamics of genetic structures of tree populations," Silvae Genetica, vol. 45, no. 5-6, pp. 323–329, 1996.
[7] P. T. Hraber, T. Jones, and S. Forrest, "The ecology of Echo," Artificial Life, vol. 3, no. 3, pp. 165–190, 1997.
[8] C. Gondro and B. P. Kinghorn, "Solving complex problems with evolutionary computation," in Proceedings of the 17th Australian Association for the Advancement of Animal Breeding and Genetics, pp. 272–279, 2007.
[9] A. S. Jump and J. Penuelas, "Genetic effects of chronic habitat fragmentation in a wind-pollinated tree," Proceedings of the National Academy of Sciences of the United States of America, vol. 103, no. 21, pp. 8096–8100, 2006.
[10] S. Wright, Evolution and the Genetics of Populations, Vol. II: The Theory of Gene Frequencies, The University of Chicago Press, Chicago, Ill, USA, 1969.
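Supplementary sketch. The Wahlund effect discussed in Section 3 can be checked with a few lines of arithmetic. The Python sketch below is not part of Kuri; it assumes equally sized subpopulations, each internally in Hardy-Weinberg proportions at a single biallelic locus, and the function name `heterozygosity_deficit` and the example frequencies are hypothetical. Under those assumptions the relative loss of heterozygotes equals Wright's $F_{ST}$ [10].

```python
from typing import List

def heterozygosity_deficit(freqs: List[float]) -> float:
    """Return F = 1 - H_obs / H_exp for a subdivided population.

    `freqs` holds the frequency of one allele in each subpopulation.
    Assumes equal subpopulation sizes and Hardy-Weinberg proportions
    within each subpopulation (assumptions of this sketch).
    """
    if not freqs:
        raise ValueError("at least one subpopulation is required")
    p_bar = sum(freqs) / len(freqs)
    # Heterozygosity expected if the pooled population mated at random.
    h_exp = 2.0 * p_bar * (1.0 - p_bar)
    if h_exp == 0.0:
        raise ValueError("the locus is fixed; the deficit is undefined")
    # Heterozygosity actually present, averaged over subpopulations.
    h_obs = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return 1.0 - h_obs / h_exp

# Two isolated groups that have drifted apart (hypothetical values).
subdivided = heterozygosity_deficit([0.2, 0.8])

# Two groups with identical frequencies behave like one panmictic unit.
panmictic = heterozygosity_deficit([0.5, 0.5])
```

With frequencies of 0.2 and 0.8 the pooled allele frequency is still 0.5, yet the observed heterozygosity drops from 0.5 to 0.32, giving a deficit of about 0.36; identical frequencies give a deficit of zero, which mirrors the contrast between the mild and strong wind scenarios.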


Publisher
Hindawi Publishing Corporation
Copyright
Copyright © 2009 Benn R. Alle et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
ISSN
1687-6229
DOI
10.1155/2009/783647

Concluding Remarks repeats, equilibrium was reached and maintained across gen- erations whilst with mild winds the number of homozygotes Kuri can be used to simulate a wide range of biological increases over time. scenarios. It allows manipulation of the genomes, alleles, and Even this simple scenario can provide insights about genotypes of different plant species and the interactions of natural populations. Jump and Penuelas [9] showed that these populations with the ecosystem. Kuri’s database can be habitat fragmentation caused by human activity led to high used to store different genetic models of species, being these levels of inbreeding due to a Wahlund effect. This was the based on real data of species or virtual organisms tailored first study showing that even widespread wind-pollinated for educational purposes. Alongside the biological parame- trees are negatively affected by habitat fragmentation. Argu- ters, the user can manipulate and/or create environmental mentatively, Kuri could be used to estimate genetic effects parameters based on field data (such as satellite imagery) under different scenarios. For example, a satellite image of to study how these affect the genetic composition and size a forested area can be artificially fragmented in different of populations. The software meets theoretical expectations, patterns and these used to estimate the genetic effects but it still has to be tested under realistic scenarios for which of deforestation. This has implications for urbanization real data is available and results can be compared. Due to the decisions and can assist in finding a solution that minimizes lack of real data testing it is still unclear how detailed field human impact. Clearly, for realistic results, there has to be data and knowledge of the ecology of the species has to be reliable data and detailed knowledge of the ecology of the able to make valid inferences. Future work and user feedback species. 
may assist in answering these questions. For population studies the simulated data can be treated The software is modular. It was designed so that it can and analyzed as if it were real data, with the advantage be modified and expanded to simulate other phenomena. of having full knowledge of the population structure and For example, in the current version all genes/alleles are neu- a handle on the mechanisms that yielded the dataset. For tral, but it is straightforward to implement environmental example, data from only the last generation could be used constrains associated to the genotypes in order to simulate to make inferences about the evolutionary processes that natural selection, or even simulate molecular evolution by were acting on the population. The degree of deviation from adding another module that allows handling each allele as a HW equilibrium can be calculated and used to estimate DNA base pair. Kuri is open source and freely available from parameters such as F [10]. These results can then be the web address: http://www.allesys.com.br/kuri/. ST Journal of Artificial Evolution and Applications 5 References [1] D. N. Alstad, “Populus: simulations of population biology,” 2007, http://www.cbs.umn.edu/populus. [2] P.A.S.Nuin andP.A.Otto, “A programfor representing and simulating population genetic phenomena,” Genetics and Molecular Biology, vol. 23, no. 1, pp. 53–60, 2000. [3] C. Gondro and J. C. M. Magalhaes, ˜ “A simple genetic algorithm for studies of Mendelian populations,” in Recent Advances in Artificial Life, H. Abbass, T. Bossamaier, and J. Wiles, Eds., pp. 85–98, World Scientific, London, UK, 2005. [4] B. P. Kinghorn, “GENUP—a suite of programs to help teach animal breeding theory,” in Proceedings of the 10th Australian Association of Animal Breeding and Genetics, pp. 555–559, [5] M.Kanashiro, I. S. Thompson,J.A.G.Yared,etal., “Improv- ing conservation values of managed forests: the Dendrogene project in the Brazilian Amazon,” Unasylva, vol. 
53, no. 209, pp. 25–33, 2002. [6] B. Degen, H.-R. Gregorius, and F. Scholz, “ECO-GENE, a model for simulation studies on the spatial and temporal dynamics of genetic structures of tree populations,” Silvae Genetica, vol. 45, no. 5-6, pp. 323–329, 1996. [7] P.T.Hraber, T. Jones, andS.Forrest,“Theecology of Echo,” Artificial Life, vol. 3, no. 3, pp. 165–190, 1997. [8] C. Gondro and B. P. Kinghorn, “Solving complex problems with evolutionary computation,” in Proceedings of the 17th Australian Association for the Advancement of Animal Breeding and Genetics, pp. 272–279, 2007. [9] A. S. Jump and J. Penuelas, “Genetic effects of chronic habitat fragmentation in a wind-pollinated tree,” Proceedings of the National Academy of Sciences of the United States of America, vol. 103, no. 21, pp. 8096–8100, 2006. [10] S. Wright, Evolution and the Genetic of Populations, Vol. II: The Theory of Gene Frequencies, The University of Chicago Press, Chicago, Ill, USA, 1969.
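As an illustration of the genetic operators described in the text (crossover that swaps chromosome parts according to user-specified recombination frequencies, and mutation that swaps an allele for another one from the stored allele set with uniform probability), the following Python sketch may help. It is not Kuri's Delphi code: the function names `make_gamete` and `mutate`, the list-based haplotype representation, and the choice to always pick a different allele on mutation are assumptions made only for this example.

```python
import random


def make_gamete(hap_a, hap_b, recomb_freqs, rng):
    """Build one gamete from two parental haplotypes (lists of alleles).

    recomb_freqs[i] is the assumed recombination frequency between
    locus i and locus i + 1 (hypothetical representation of a linkage map).
    """
    # Start copying from a randomly chosen parental haplotype.
    current, other = (hap_a, hap_b) if rng.random() < 0.5 else (hap_b, hap_a)
    gamete = [current[0]]
    for i, r in enumerate(recomb_freqs):
        # A crossover between adjacent loci switches the template strand.
        if rng.random() < r:
            current, other = other, current
        gamete.append(current[i + 1])
    return gamete


def mutate(gamete, allele_pools, rate, rng):
    """Swap alleles for another allele of the same locus with probability `rate`.

    allele_pools[i] lists the alleles stored for locus i; no new variants
    are created, mirroring the behaviour described in the text.
    """
    result = []
    for allele, pool in zip(gamete, allele_pools):
        alternatives = [a for a in pool if a != allele]
        if alternatives and rng.random() < rate:
            allele = rng.choice(alternatives)  # uniform over the other alleles
        result.append(allele)
    return result


if __name__ == "__main__":
    rng = random.Random(42)
    parent_1 = list("AAAA")
    parent_2 = list("bbbb")
    gamete = make_gamete(parent_1, parent_2, [0.1, 0.5, 0.1], rng)
    print(mutate(gamete, [["A", "b"]] * 4, 0.01, rng))
```

A single mutation rate is passed for all loci because the text states that mutation rates are the same for all genes/alleles; a per-locus rate would be a straightforward extension.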
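The text notes that initial populations are generated in Hardy-Weinberg equilibrium from the given allelic frequencies and that chi-squared values for the equilibrium test are reported. A minimal Python sketch of both steps is given below, assuming diploid genotypes stored as sorted allele pairs. The function names and data layout are hypothetical and are not taken from Kuri.

```python
import random
from collections import Counter


def hardy_weinberg_population(allele_freqs, size, rng):
    """Draw `size` diploid genotypes by sampling two alleles independently."""
    alleles = list(allele_freqs)
    weights = [allele_freqs[a] for a in alleles]
    population = []
    for _ in range(size):
        pair = rng.choices(alleles, weights=weights, k=2)
        population.append(tuple(sorted(pair)))  # unordered genotype
    return population


def hw_chi_squared(population):
    """Chi-squared statistic of observed genotype counts against the
    Hardy-Weinberg expectations computed from the observed allele frequencies."""
    n = len(population)
    allele_counts = Counter(a for genotype in population for a in genotype)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    observed = Counter(population)
    alleles = sorted(freqs)
    chi2 = 0.0
    for i, a in enumerate(alleles):
        for b in alleles[i:]:
            # p^2 for homozygotes, 2pq for heterozygotes
            prob = freqs[a] * freqs[b] * (1 if a == b else 2)
            expected = prob * n
            chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    return chi2


if __name__ == "__main__":
    rng = random.Random(1)
    pop = hardy_weinberg_population({"A": 0.5, "a": 0.5}, 1000, rng)
    print(hw_chi_squared(pop))
    # The all-heterozygous starting population of the simulation example
    # deviates strongly from equilibrium:
    print(hw_chi_squared([("A", "a")] * 100))
```

For the all-heterozygous population of 100 plants, the expected counts are 25, 50, and 25, so the statistic is 25 + 50 + 25 = 100, which reflects how far the starting condition of the example in Section 3 is from equilibrium.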
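The heterozygote deficit produced by population subdivision (the Wahlund effect discussed in Section 3) and its summary as $F_{ST}$ can be worked through with a small Python sketch. It assumes a biallelic locus, equal-sized subpopulations, and random mating within each subpopulation. These are simplifying assumptions for illustration, and the function name `wahlund_fst` is hypothetical rather than part of Kuri.

```python
def wahlund_fst(subpop_freqs):
    """Return (H_S, H_T, F_ST) for one biallelic locus.

    subpop_freqs: frequency of one allele in each equal-sized subpopulation.
    H_S is the mean heterozygosity within subpopulations (2pq in each),
    H_T is the heterozygosity expected if the pooled population mated at random,
    and F_ST = 1 - H_S / H_T.
    """
    k = len(subpop_freqs)
    h_s = sum(2 * p * (1 - p) for p in subpop_freqs) / k
    p_bar = sum(subpop_freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return h_s, h_t, 0.0  # locus is fixed overall; no differentiation to measure
    return h_s, h_t, 1 - h_s / h_t


if __name__ == "__main__":
    # Identical subpopulations: no heterozygote deficit.
    print(wahlund_fst([0.5, 0.5]))
    # Diverged subpopulations: fewer heterozygotes than the pooled expectation.
    print(wahlund_fst([0.8, 0.2]))
```

With frequencies 0.8 and 0.2, the mean within-subpopulation heterozygosity is 0.32 against a pooled expectation of 0.5, giving $F_{ST} = 0.36$, which equals the variance of the allele frequency (0.09) divided by $\bar{p}\bar{q}$ (0.25).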

Journal

Journal of Artificial Evolution and Applications, Hindawi Publishing Corporation

Published: Jul 26, 2009
