Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

MaizeGDB: The Maize Model Organism Database for Basic, Translational, and Applied Research

MaizeGDB: The Maize Model Organism Database for Basic, Translational, and Applied Research Hindawi Publishing Corporation International Journal of Plant Genomics Volume 2008, Article ID 496957, 10 pages doi:10.1155/2008/496957 Resource Review MaizeGDB: The Maize Model Organism Database for Basic, Translational, and Applied Research 1, 2, 3 4, 5 6, 7 1, 2 Carolyn J. Lawrence, Lisa C. Harper, Mary L. Schaeffer, Taner Z. Sen, 1 1 Trent E. Seigfried, and Darwin A. Campbell USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011, USA Department of Agronomy, Iowa State University, Ames, IA 50011, USA USDA-ARS, Plant Gene Expression Center, 800 Buchanan Street, Albany, CA 94710, USA Department of Molecular and Biology, University of California Berkeley, Berkeley, CA 94720, USA USDA-ARS, Plant Genetics Research Unit, Columbia, MO 65211, USA Division of Plant Sciences, University of Missouri Columbia, Columbia, MO 65211, USA Correspondence should be addressed to Carolyn J. Lawrence, carolyn.lawrence@ars.usda.gov Received 31 August 2007; Accepted 10 July 2008 Recommended by Chunguang Du In 2001 maize became the number one production crop in the world with the Food and Agriculture Organization of the United Nations reporting over 614 million tonnes produced. Its success is due to the high productivity per acre in tandem with a wide variety of commercial uses. Not only is maize an excellent source of food, feed, and fuel, but also its by-products are used in the production of various commercial products. Maize’s unparalleled success in agriculture stems from basic research, the outcomes of which drive breeding and product development. In order for basic, translational, and applied researchers to benefit from others’ investigations, newly generated data must be made freely and easily accessible. MaizeGDB is the maize research community’s central repository for genetics and genomics information. The overall goals of MaizeGDB are to facilitate access to the outcomes of maize research by integrating new maize data into the database and to support the maize research community by coordinating group activities. Copyright © 2008 Carolyn J. Lawrence et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 1. INTRODUCTION model organism for developing fuelstock grasses is apparent [1]. Indeed, in addition to its value as a commodity, maize has been a premiere model organism for biological research Maize (Zea mays L.) is a species that encompasses the for over 100 years. Many seminal scientific discoveries have subspecies mays (commonly called “corn” in the US) as well as the various teosintes that gave rise to modern maize. Maize first been shown in maize, such as the identification [2]and cloning [3] of transposable elements, the correlation between is an important crop: not only is it one of the most abundant cytological and genetic crossing over [4], and the discovery of sources of food and feed for people and livestock the world over, it is also an important component of many industrial epigenetic phenomena [5]. These exceptional characteristics of maize set this amazing plant apart: no other species serves products. Maize byproducts are present in, for example, glue, paint, insecticides, toothpaste, rubber tires, rayon, and as both a commodity and a leading model for basic research. Today, with the accelerated generation of maize genetic molded plastics, among others. Maize is also currently the nation’s major source of ethanol, a major biofuel that is more and genomic information, the need for a centralized biologi- environmentally friendly than gasoline and that may be a cal data repository is critical. MaizeGDB (the Maize Genetics more economical fuel alternative in the long run. Although and genomics Data Base [6]) (http://www.maizegdb.org/) it is unlikely that ethanol production from maize directly is the Model Organism Database (MOD) for maize. Stored will be sustainable long-term, maize’s suitability to serve as a at MaizeGDB is comprehensive information on loci (genes 2 International Journal of Plant Genomics and other genetically defined genomic regions including a chromosome, the sequence records at MaizeGDB, and QTL), variations (alleles and other sorts of polymorphisms), the EST (expressed sequence tag) and GSS (genome survey stocks, molecular markers and probes, sequences, gene sequence) contig assemblies at PlantGDB [10] and Dana product information, phenotypic images and descriptions, Farber (The Gene Indices at http://compbio.dfci.harvard metabolic pathway information, reference data, and con- .edu/tgi/cgi-bin/tgi/gimain.pl?gudb=maize, previously at tact information for maize researchers. Described in the TIGR [11]). All of the 3 520 247 sequences in MaizeGDB results and discussion section are example workflows that are accessible by BLAST [12]and canbefiltered to report could be followed by researchers to utilize the MaizeGDB only mapped loci, including any SSRs and overgos that may resource for their research. Other long-term resources serv- not be mapped genetically, but via BACs (bacterial artificial ing maize data include Gramene (http://www.gramene.org/) chromosomes) in anchored contigs. [7], which specializes in grass comparative genomics, and The inclusion of the public BAC FPC (Finger Print GRIN (the Germplasm Resources Information Network; Contig) information [13] adds 439 449 BACs together with http://www.ars-grin.gov/npgs/), which provides access to the associated overgo, SSR, and RFLP markers, which are National Plant Germplasm System’s germplasm stocks and used to assemble the contigs and to link contigs onto related breeding data. MaizeGDB makes an effort to guide genetic map coordinates. The order of loci on the BAC researchers to these resources via context-sensitive linkages contigs is represented by over 27 000 sequenced-based rather than duplicating data, though some data are shared loci on the IBM2 FPC057 maps (http://www.maizegdb simply to allow for the context-sensitive linkages to be .org/cgi-bin/displaymapresults.cgi?term=ibm2+fpc0507)in created. This reduces duplication in effort and allows per- MaizeGDB, by links to contigs at both the Arizona FPC sonnel skilled in comparative genomics and germplasm con- site (http://www.genome.arizona.edu/) and the genome servation/plant breeding to interact with maize researchers sequencing project (http://www.maizesequence.org/). As the directly via Gramene and GRIN, respectively. B73 genome sequence progresses, these BAC sequences are In addition to storing and making maize data available, added to MaizeGDB along with links to the sequencing the MaizeGDB team also provides services to the commu- project, both from the BAC clones and from genetically nity of maize researchers and offers technical support for mapped loci associated with a BAC. the Maize Genetics Executive Committee and the Annual The newest maps in MaizeGDB, IBM SNP 2007 (http:// Maize Genetics Conference. Also available at the MaizeGDB www.maizegdb.org/cgi-bin/displaymapresults.cgi?term=ibm website, as a service to the maize research community, are %20snp%202007), are the first of a new generation of bulletin boards for news items, information of interest to genetic maps from the Maize Diversity Project (http://www cooperators, lists of websites for projects that focus on the .panzea.org/) kindly provided pre-publication by Dr. Mike scientific study of maize, the Editorial Board’s recommended McMullen. The SNP loci on these maps are associated reading list, and educational outreach items. with allelic sequences from a core set of maize and teosinte The genetic and genomic data as well as community- germplasm. Because the majority of the anticipated 1128 related information maintained by MaizeGDB are highly loci have been previously mapped onto BAC clones [13, 14], utilized: MaizeGDB averages 8620 visitors (based on unique these genetic maps tightly link sequence diversity to the B73 Internet Protocol or IP addresses) and over 160 000 page genome sequence. impressions per month (July 2007 to June 2008). In addition, MaizeGDB came in fifth out of 170 in a National Plant 2.2. Methods of access, environments, Genome Initiative Grantees poll in which lead principal and the database back end investigators reported most useful websites for their research [8]. 2.2.1. The production web interface Maize researchers primarily access MaizeGDB through 2. MATERIALS AND METHODS the series of interconnected Web pages available at http://www.maizegdb.org/ (see Figure 1). These web pages 2.1. Kinds of data in the database that link genetic and are dynamically generated and are written in PHP (the genome sequence information recursive abbreviation for PHP Hypertext Preprocessor [15]) MaizeGDB is the primary repository for the major genetic and Perl [16]. Through this interface, each page shows and cytogenetic maps and includes details about genes, detailed information on a specific biological entity (such as mutants, QTL (quantitative trait loci), and molecular a gene) as well as basic information about data associated markers including 2500 RFLPs (restriction fragment length with it (genes are associated with maps, phenotypes, and polymorphisms), 4625 SSRs (simple sequence repeats), citations, among others). These additional data types are 363 SNP (single nucleotide polymorphisms), 2500 indels linked to the gene page, enabling quick access to alternative (insertion/deletion sites), and 10 644 overgos (overlapping data views. The site also includes links to related resources at oligonucleotides). These data are described using 1.27 other databases; genes, for example, are linked to Gramene millions synonyms, 42 000 primer sequences, 16 394 raw [7]. scores from mapping based upon 16 panels of stocks, One may access these individual data pages by using and 323 313 links to GenBank [9] accessions. GenBank either (1) the search bar located at the top right of every page accessions form the links between the genetic position on (Figure 1(A)), or (2) data type-specific advanced querying Carolyn J. Lawrence et al. 3 Figure 1: The MaizeGDB home page. The most commonly utilized search functionality for MaizeGDB is the search bar (A), which is available within the header of any MaizeGDB page. To browse data and to search specific data types using specific limiters, the Data Centers (B) are also quite useful. Also available is a Bin Viewer (C), which allows for a view of lots of data types within the context of their chromosomal location. To enable access to the Data Centers and other displays of interest from any MaizeGDB page, a pull-down menu for “Useful pages” (D) is accessible on the header of any MaizeGDB page. The footer of all MaizeGDB pages contains a context-sensitive “feedback form” link (E). Researchers use the feedback form to report errors, ask questions, and to contact the MaizeGDB team directly. For newcomers to the site, the MaizeGDB Tutorial (F) can help them to get a jump start on how to use the site. tools (accessible via the “Data Centers” links; Figure 1(B)) 2.2.2. Structure and relationship of environments: on the left side of the home page, or (3) the Bin Viewer tool production, staging, and test (Figure 1(C)), which is located in the left margin of the home page or via a pull down labeled “Useful pages” (Figure 1(D)) The production Web interface, which most MaizeGDB accessible at the top of any MaizeGDB page. These tools allow users interact with, is only one component of the overall researchers to easily find relevant data displays. MaizeGDB infrastructure (Figure 2). The data accessed by MaizeGDB’s method of data delivery has three primary the production Web interface are typically updated on goals: placing information within the framework of its the first Tuesday of each month. Prior to being in that scientific meaning, making this information available to Production Environment, the data are prepared for public the researcher with minimal input (often only the relevant accessibility in a Staging Environment. In the Staging Envi- term), and requiring minimal effort from the researcher to ronment, the most up-to-date information is available, new comprehend the data displays. By focusing on biological data are added to the database, and existing data are updated context and ease of use as the primary focus of this with new information. In addition to a Web interface that interface (the “production” Web interface), the database appears identical to the one in the Production Environment, is intended to be intuitive to the researcher as their click the Staging Environment offers SQL (Structured Query stream follows a logical path of biological associations. Language) read-only access to the community so that Up-to-date site usage statistics can be accessed online at researchers interested in interacting with the data in a more http://www.maizegdb.org/usage/. direct and customized manner can have access to the most 4 International Journal of Plant Genomics up-to-date information available. In addition, a Disaster drives with Red Hat Advanced Server 2.1 operating system Recovery system has been put in place whereby the Curation installed. The curation database, either partially or in its Database is backed up in a compressed format to a separate entirety, can be moved to MySQL, Microsoft Access, and machine in Ames, Iowa daily. Once weekly, the Ames file is nearly any other portable data format that a researcher would copied to Columbia, Missouri for off-site storage. need. Requests to gain read-only SQL access to the Curation To aid in the modeling of new types of data for inclusion database can be made via the feedback link that appears at the in the MaizeGDB product and to enable programming to bottom of any MaizeGDB page. Data housed at MaizeGDB be tried out in a safe place, a Test Environment identical to are in the public domain and are freely available for use the Staging Environment has been created. Note that three without a license. copies of the database exist. While each environment and server has a specific purpose, all are configured such that they 2.3. Outreach could serve a backup to each other. If any one server was to fail, either of the other two could provide full, unrestricted One of the strengths of MaizeGDB is its responsiveness data access and site functionality. The curation database is to community input, received either personally or by the backed up on a daily basis and is available for download feedback forms accessible at the bottom of each page (http://goblin1.zool.iastate.edu/∼oracle/) for those who have (Figure 1(E)). To provide outreach and user support as well Oracle Relational Database Management System (RDBMS) as to solicit input from researchers in a more active manner, installed locally. several strategies are employed. The first is tutorials and basic information on MaizeGDB. The MaizeGDB Tutorial (Figure 1(F)) can be reached from the home page at the top 2.2.3. Curation of the left margin. A new user can go through this tutorial, Also available within the Staging Environment are Com- and become familiar with how to use the site quickly. In munity Curation Tools to enable researchers to add small addition, a “Site Tour” with an overview with examples can datasets to the database directly, as well as a set of Profes- be found under the “Useful pages” pull down menu at the sional Curation Tools developed by Dr. Marty Sachs’ group top of each page. More specific tutorial examples and other at the Maize Genetics Cooperation-Stock Center in Urbana- educational materials are available via the “Education” link, Champaign [17]. Whereas the Community Curation Tools also within the “Useful pages” pull down menu. Also, on have many safeguards to help researchers enter data step- many of the “Data Center” pages (available from the left wise and with enforced field requirements, the Professional margin of the front page or via the “Useful pages” pull down) Curation Tools allow MaizeGDB project members as well a discussion of the topic of the page that is suitable for the as Stock Center personnel to enter datasets in a more general public appears toward the bottom. Another form of stream-lined and powerful fashion with fewer integrity outreach supported by MaizeGDB is assistance at meetings enforcement rules (which slow down the data entry process and conferences. Representatives from MaizeGDB attend and considerably). It also should be noted that data added help researchers at the Annual Maize Genetics Conference to the database via the Community Curation Tools are (usually in March), the International Plant and Animal first marked as “Experimental” that must be “Activated” Genome Conference (January), and various other meetings by professional curators at MaizeGDB. This ensures that through direct interaction in person. Finally, researchers can only quality information is made publicly accessible. The request a MaizeGDB site visit. About three times a year, availability of a Curation Web interface (within the Staging an expert curator travels to various research locations and Environment) enables researchers to view the data as they provides tutorials and support for maize researchers. For will appear once they are uploaded to Production. Few these visits, the local maize researchers are asked for a list of researchers (about 30 at present) have Community Curation specific questions ahead of time. During the one to two day accounts. To increase the use of these tools, training sessions visits, researchers interact in groups and one-on-one with the are being organized (see Section 2.3, below). If researchers traveling curator to learn how to utilize MaizeGDB for their wish to deposit complex or large datasets, it would not be research and to deposit data at MaizeGDB. reasonable to enter the data via the Community Curation Tools because those tools work via a “bottom-up” approach 2.4. Community support services whereby the records are (1) built based upon the most basic information included in the dataset and (2) entered one MaizeGDB provides community support in several ways. record at a time (i.e., not in bulk). For complex or large Two members of the MaizeGDB team, MLS and TES, serve datasets, researchers are encouraged to submit data files to as ex officio members of the Maize Genetics Conference the curators at MaizeGDB. Those data are added to the Steering Committee. They collect electronic abstracts database directly by curators and the database administrator. for the Annual Maize Genetics Conference and handle the preparation and printing of the program for the conference. MaizeGDB personnel also manage regular 2.2.4. Database back end community surveys on behalf of the Maize Genetics Each environment’s server has a perpetual license and is Executive Committee. These surveys enable the Executive supportedbyOracleRDBMS poweredby2 × 2.0GHz Xeon Committee to summarize the overall research interest processors, 4 GB of RAM, 5 × 73 GB Ultra 320 10 K RPM of the maize community and to advise funding agencies Carolyn J. Lawrence et al. 5 Maize community Data deposition SQL access Curation Data retrieval Community Professional Curation Curation Tools Tools Primary data Production database (unformatted Genome browser Curation database Genome Production Environment browser Staging Environment “Playground” database Genome browser Ames Columbia IA MO Professional Community Curation Curation Tools Tools MaizeGDB Disaster Recovery Test Environment Figure 2: Simplified infrastructure of MaizeGDB. The community of maize researchers can add data to the database (downward-facing arrows from the uppermost yellow box) via direct data deposition (upper left) and via a set of Community Curation Tools that interacts with the Curation Database (upper center). Researchers are also allowed access to maize data (upward-facing arrows from the lower dashed box) via a web interface that can be accessed at http://www.maizegdb.org/ (upper right) and by way of SQL access to the Curation Database, which houses the most up-to-date data available (upper center). These functionalities are supported by two of the three environments: Production and Staging, respectively (upper dashed gold boxes). Available for use by MaizeGDB personnel to facilitate data modeling and trial programming manipulations is a third environment called Test (lower left dashed gold box), which is identical to the Staging Environment. To ensure that the most up-to-date copy of the database is backed up, a Disaster Recovery process has been instituted (lower center dashed gold box) whereby a compressed copy of the database is backed up to a separate machine in Ames, Iowa daily, and to a server in Columbia, Missouri weekly. on future research directions. MaizeGDB personnel simultaneously, here the researcher types are distinguished also manage the Executive Committee’s website (i.e., as follows: basic researchers investigate the fundamental http://www.maizegdb.org/mgec.php) and conducts the biology of the organism, translational researchers work to Executive Committee’s elections. MaizeGDB houses the determine the application of basic research outcomes for mailing list for the annual Maize Newsletter and project practical purposes [18], and applied researchers implement personnel conduct semi-regular mailings to the maize com- proven technologies to improve crops. munity on behalf of interested researchers by maintaining an electronic list of researchers’ contact information. Potential 3.1. Basic mailings to this list are vetted by the Executive Committee. Many basic researchers work with mutants to understand the processes underlying biological phenomena. Once a new 3. RESULTS AND DISCUSSION mutant is found, there are several standard methods used To demonstrate how researchers utilize MaizeGDB, three to elucidate normal gene functions. These efforts include example usage cases are presented here. Because researchers determining whether the mutant represents an allele of a with very different goals can all utilize MaizeGDB to advance previously described gene, and if not, genetic mapping and their work, the usage cases are classified by research type: cloning of the new gene. Information stored in MaizeGDB is basic, translational, and applied. See Figure 3 for examples of useful in all of these steps. how these research types fit together. By enabling researchers In a large screen for mutations that change pericarp to carry out workflows that support translational and pigmentation from red to some other color, Researcher applied research, MaizeGDB plays a part in influencing crop 1 has found a plant with a brownish-red pericarp development directly. Although a single researcher might coloration. She first wants search MaizeGDB to find even include all of these three aspects in his/her research all known mutants that have red pericarp phenotypes Web interface Web interface Web interface 6 International Journal of Plant Genomics Basic Translational Applied Scientific investigation Investigation to gain The scientific work required to develop a knowledge and carried out to create understanding of a clinical or commercial therapies or Definition particular subject application from a commercial/consumer without regard to basic science products practical applications discovery Investigating disease Determining the relative Using the relative mechanisms and risk of an allele or risk information to Medical processesaswellas therapytoapatient examplar prescribe therapies chemical effects on cellular processes Investigating new Making breeding Utilizing variants as information on, for crosses to create selectable biomarkers Plant example, genome plants that have in breeding crosses biology structure, gene models, value-added traits for and progeny examplar gene function, and real-world production genetic variability agricultural uses Identifying markers Using markers Determining the Example (morphological and genetic cause of red associated with both usage molecular) to guide pigmentation and pigmentation in the cases (See breeding programs to maysin production for pericarpof a newly produce ear worm Section 3) corn ear worm discovered mutant resistant sweet corn resistance lines Figure 3: Three types of biological research. Research can be divided into three categories: basic, translational, and applied. Outcomes from basic research feed into translational predictions, and developed uses for these findings constitute the basis for developing real-world applications that benefit humanity and the world. Listed after the flow of research are definitions for each research type as well as medical and plant biological models for how the different divisions are interrelated. Also shown are overviews of the example usage cases presented in Section 3. to determine whether this mutation represents a newly she orders seed from the Stock Center directly through the discovered gene. Because she does not know how others MaizeGDB interface. She then goes back to the results of her might have described the phenotype, she decides to “pericarp color” query and repeats the process for “cherry browse existing phenotype terms and images. From the pericarp,” ordering stocks for r1-ch (colored1-cherry), also left margin of the MaizeGDB homepage, she selects to be used in her complementation analyses. (Another way “Mutant Phenotypes” under “Data Centers-Functional.” she could have found maize stocks that have red pericarp is On this page (http://www.maizegdb.org/), she selects the following: from the header of any page, select “Useful “pericarp color” from the pull down menu labeled pages” and click “Stocks.” This pulls up the stock search “Show only phenotypes relating to this trait” in the page http://www.maizegdb.org/stock.php. In the green box, green search bar. A number of possible mutant phenotypes select stocks with the phenotype “red pericarp” from the are returned, including “red pericarp.” Clicking on the pull down menu of all phenotype names and submit. A “red pericarp” phenotype link, she finds that the listed long list of stocks that contain alleles of p1 with red pericarp mutants are alleles of p1 (pericarp color1). On this page is returned. Alternatively, the Stock Center Catalog is also (http://www.maizegdb.org/cgi-bin/displayphenorecord.cgi? available from the Stocks Data Center page.) id=13818), she scrolls to the bottom and finds that there Researcher 1 receives several appropriate stocks and are many stocks that can be ordered from the Maize performs allelism tests and determines that her mutant Genetics Cooperation-Stock Center that carry P1-rr (an (which turns out to be recessive) is not allelic to p1 or r1. allele that causes red pericarp and red cob) or P1-rw (red She returns to MaizeGDB and again looks through “Mutant pericarp and white cob). Having these stocks in hand will Phenotype” results using the “pericarp color” query. Listed enable her to test whether the new mutant represents an there are brown pericarp, orange pericarp, white pericarp, allele of the p1 gene, so she decides to order a few for and lacquer red pericarp phenotypes in addition to the complementation analyses. Clicking on the stock links red and cherry phenotypes she focused on initially. She listed on the variation/allele page allows her access to a finds that there is no stock available for the brown pericarp shopping cart utility (in the green right hand panel), and phenotype (the brown pericarp1 mutant has been lost), and Carolyn J. Lawrence et al. 7 all the others are alleles that confer colored pericarp in the and any available GenBank accession numbers for sequences dominant condition as a result of the presence of P1 alleles. as well as sequenced BACs. She finally selects markers and To determine whether the new mutation could be an allele of performs fine structure mapping. As she finds markers bp1, she decides to map it genetically. closer and closer to the gene, she can proceed with positional MaizeGDB houses the largest collection of publicly cloning to determine whether the position is consistent with available genetic maps of maize (currently over 1,337 maps). bp1 (niceexamplesofhow thisisdonecan be foundin These include maps of genes primarily defined by mutants [19–21]). with morphological phenotypes (“Genetic 2005” is the most current), maps based on phenotypic molecular markers, and 3.2. Translational compositemapswhere variousmapshavebeenintegrated. These maps can be easily accessed from the home page, Research to understand the metabolic pathways that produce via the left margin link to “Data Centers-Genetic-Maps” pigmentation (like those outlined in Section 3.1)are well (http://www.maizegdb.org/map.php). This page not only studied in maize [22]. One example of a well-characterized allows various map search functions, but also provides infor- gene that confers pigmentation is p1,which encodesa mation on the most popular maps and a handy reference to transcription factor that regulates synthesis of flavones such explain more about the various composite maps. as anthocyanins [23]. The p1 gene, along with its adjacent The maize genome is divided into genetic bins of duplicate pericarp color2 (p2), controls pericarp and cob approximately 20 centiMorgans each and boundary markers coloration and causes silks to brown when cut. One flavone with nearby SSRs can be used for mapping (for further expla- produced by the pathway is maysin, a compound which nation see http://www.maizegdb.org/cgi-bin/bin viewer.cgi). has been shown to be antinutritive to the corn ear worm Researcher 1 decides to utilize SSRs to map her gene to bin at concentrations above 0.2% fresh weight if husks limit resolution. To find the core markers from the home page, access to the ear such that feeding on silks is required for she clicks on “Tools-Bin Viewer” in the left margin of the the insect to enter [24]. Many QTL for resistance to corn home page. This provides a list of the core bin markers and earworm map near loci in the flavone synthesis pathway that a link to purchase relevant primers to screen her mapping are either regulatory genes (such as p1 and p2), or at rate- population. She generates a mapping population, performs limiting enzymatic steps, such as c1 (chalcone synthase1) that PCR experiments using the polymorphic markers, and maps contribute maysin accumulation in silks [25]. Understanding her mutant to bin 9.02. how maysin functions and how this information could be To see what genes are located in bin 9.02, she goes back used for production agriculture is Researcher 2’s area of to the Bin Viewer (from the homepage), and holds the curser expertise. over the image of chromosome 9 until she sees “bin 9.02,” Researcher 2 has investigated maysin synthesis for some then clicks. The result is a long list of genes, other loci, time, and has decided to clone an uncharacterized maysin sequences, EST contigs, SSRs, BACs, and other data relating QTL near umc105a, in the bin 9.02, which is bounded by to bin 9.02. Searching through this data, she sees that bp1 is bz1 and wx1 [24]. He believes that the QTL may be a listed under “other loci” in bin 9.02. This is a “lapsed locus” previously described, but lost, bp1 mutant thought to be meaning that the stock has been lost, but perhaps she has involved in maysin synthesis. In the first step, he must first foundanewallele! find molecular markers to more finely map the region (his To see more specific genetic mapping data on bp1, preference would be to use SSRs, since members of the she goes to the search bar along the top green bar of lab are already using them successfully). He plans to follow every page, selects “loci” from the pull down menu, the strategy of chromosome walking to narrow down the types “bp1” into the field provided, and clicks the button region of interest [19–21] followed by association mapping marked “Go!” This brings her to the bp1 locus page to identify the actual QTL sequence [26, 27]. Knowing this (http://www.maizegdb.org/cgi-bin/displaylocusrecord.cgi? sequence would enable plant breeders to track the QTL for id=61563) where she can see that bp1 is placed on three marker assisted selection. genetic maps. Clicking on each map, Researcher 1 learns that To find SSR data for mapping to a bin region, Researcher in 1935, bp1 was mapped between sh1 and wx1 (shrunken1 2 goes to the MaizeGDB home page and clicks on “Data and waxy1), two well-studied genes. To search for molecular Centers-Genomic-Molecular Markers/Probes” in the left markers suitable for fine structure mapping, she visits “Data margin, then clicks the “SSR” link at the top of the page Centers-Genetic-Maps” from the link on the home page. (the link is located in “Specific information is available on In the green Advanced Search box, she enters sh1 and wx1 BACs, ESTs, overgos, and SSRs.”) Scrolling down to the green separately in the “Show only maps containing this locus” “Set Up Criteria” box, he then selects bin 9.02 and submits lines. This returns only genetic maps that contain both a search request. A report is returned that lists the available genes. She selects the map with the most markers—IBM2 SSRs for bin 9.02, complete with primers, gel patterns for 2005 Neighbors 9 (with 2,488 markers). She finds sh1 at different germplasm, and related maps. By going back to the position 80.30, and wx1 at 185.00. To choose among several SSR page, he also downloads tabular reports of map locations molecular markers, Researcher 1 follows the available links of all SSRs on chromosome 9, including those that have been leading her to information about suitable primers, a number anchored to a BAC contig. Using this information in the of variations (which can help to decide if there may be a laboratory, members of his research group perform mapping polymorphism in her mapping populations), gel patterns, experiments using several SSRs in bin 9.02 along with some 8 International Journal of Plant Genomics others in the more distal part of bin 9.03. They discover that In the instance of looking for particular stocks, the mid-region peak for the QTL is very near an SSR for Researcher 3 has identified GT114 as a high maysin line from bnlg1372, which is anchored to a BAC contig. [25]. Using the green search bar at the top of any MaizeGDB To find sequenced BACs that may harbor the earworm page, he searches “stocks” for “GT114.” At that page, he resistance QTL, Researcher 2 uses the search bar at the sees a brief annotation stating that GT114 is a poor pollen top of each MaizeGDB page to find the locus bnlg1372. producer and makes a note of that observation and plans to At the top of the bnlg1372 page, he follows a link to cross by IA453 and IA5125, sweet lines that produce pollen the contig 373 display at the Maize Sequencing Project well, to ameliorate this potential difficulty. Clicking the link site (http://www.maizesequence.org/). This is a rather large to GT114, he sees that it is an inbred line derived from GT- contig with many sequenced BACs and assigned markers. DDSA (DD Syn A) in Georgia, and it is made available via At the Maize Sequencing Project site, he uses the export GRIN. Selecting the link for GRIN, a page opens at that site function (a button at the left margin) to view a text list (http://www.ars-grin.gov/cgi-bin/npgs/html/search.pl?PI+ of all the markers and sequenced BAC clones that are 511314). Listed there are the Crop Science Registration data, available on the Finger Print Contig physical map. He finds availability (noted as currently unavailable, but a call to Mark that bnlg1372 is assigned to the region “19742100,1974700,” Millard, maize curator at the maintenance site indicates that encompassed by the sequenced BAC clone, c0324E10. This he could access that stock in limited quantities if current information provides coordinates for viewing the region resources allow), and an image of bulk kernels among on a large contig associated with bnlg1372, the sequence other information. The image of bulked kernels is especially of BAC c0324E10, and any other BACs nearby. Researcher revealing: the kernels are yellow and the cob fragments 2 sequences candidate regions in diverse germplasm and appear red. Aware that a red cob would be unacceptable for conducts association analysis using silk maysin levels as breeding sweet corn (the red pigment could cause quite a a trait. This may require other information about nearby mess for those cooking and eating corn on then cob), he markers, which also are accessible via MaizeGDB [28, 29]. decides to search MaizeGDB for other available high maysin Although these investigations may require the devel- stocks. opment of further sequenced-based markers, Researcher 2 After a literature search of breeding stocks with a white hopes that useful markers already exist and decides to explore cob that might still produce maysin in the silks, Researcher MaizeGDB for any other sequences or primer-based markers 3 starts searching stocks for those known to carry the P1- already assigned to the region of interest including SNPs and wwb allele, a dominant allele of the p1 locus that confers indels. To do this from the locus page for bnlg1372, he clicks white pericarp, white cob, and browning silks. By clicking on the link to the most current IBM neighbors map listed, the “Data Centers-Genetic-Stocks” link from the MaizeGDB then explores the “sequence” and “primer” view versions of homepage, he arrives at the Stocks Data Center page (which the map by clicking on the relevant links at the top of the page is also accessible via the “Useful pages” pull down at the top just under the map name. The primer view shows primers of every MaizeGDB page). He uses the Advanced Search box associated with mapping probes along with the name of the to limit the query by variation to those stocks associated probes—just what he needs to get going with the association with the allele P1-wwb. A number of the stocks returned mapping work. on the results page have been evaluated for silk maysin accumulation (per associated publications) and could be further investigated as potential breeding stocks. 3.3. Applied Although the p1 gene accounts for much of the Interested in breeding plants for organic sweet corn produc- variability in maysin accumulation [32], association and tion, Researcher 3 has decided to use molecular markers to QTL analyses for candidate genes for maysin accumulation select for high maysin content, which would increase resis- also have identified anthocyaninless1 (a1), colorless2 (c2), and tance to the corn earworm—a cause of significant damage white pollen1 (whp1) as contributing significantly [32, 33]. Researcher 3 can track the dominant P1-wwb allele visually to sweet corn [30]. Although plants could be genetically modified to carry the genes that confer high maysin levels by selecting for browning silks given that the sweet lines in silks (e.g., see [31]), Researcher 3’s farming clients require he will be using in the breeding program have silks that do not brown, but tracking the other factors will require the that their product be certified as both organic and “GMO- free.” To meet the producers’ needs, he has decided to pursue use of molecular markers. To find molecular markers to a marker-assisted selection program to create high maysin select for desirable alleles of, for example, a1, Researcher 3 sweet inbred lines, which he will use to generate single- uses the search menu at the top of any page at MaizeGDB cross hybrids. To get started with the work, he searches to find “loci” using the query “a1.” The results page MaizeGDB to find references, markers, and stocks for the (http://www.maizegdb.org/cgi-bin/displaylocusresults.cgi? project. Described here are the details on how he could use term=a1) lists many loci with a1 as a substring, but MaizeGDB to (1) access stocks known to have high maysin shows the exact match (the a1 locus) at the top of the list. Clicking on that link shows the a1 locus page content directly and (2) locate relevant stocks based upon associated data with no prior knowledge of which stocks (http://www.maizegdb.org/cgi-bin/displaylocusrecord.cgi? he wants to find. An outline of how he uses MaizeGDB to id=12000), which lists useful information including six probes/molecular markers that could be used for identify relevant selectable markers for tracking the various QTL associated with maysin accumulation also is described. tracking useful a1 alleles. Using the same process, he also Carolyn J. Lawrence et al. 9 finds markers for the c2 and whp1 loci and sets to work and cytological maps to the assembled genome sequence? determining which markers to use for his selections. Are there sequences present at centromeres that signal the cell to construct kinetochores, the machines that ensure proper chromosome segregation to occur, at the correct site? 4. CONCLUSIONS MaizeGDB aims to enable researchers to discover answers Because MaizeGDB stores and makes accessible data of use to such queries that will enhance the quality of basic maize for a variety of applications, it is a resource of interest to research and ultimately the value of maize as a crop. It will maize researchers spanning many disciplines. The fact that become possible to interrogate the database to find answers basic research outcomes are tied to translational and applied to these and other complex questions, and the content of the data enables all researcher types to utilize the MaizeGDB genome can better be related to its function, both within the resource to further their research goals, and connections to cell and to the plant as a whole. Convergence of traditional external resources like Gramene, NCBI, and GRIN make it biological investigation with the knowledge of genome possible for researchers to find relevant resources quickly, content and organization is currently lacking, and is a new irrespective of storage location. area of research that will open up once a complete genome At present, maize geneticists are at the cusp of a sequence and a method for searching through the whole milestone: the genome of the maize inbred B73 is being of the data are both in place. It is the ability to investigate and answer such basic research questions that will serve as sequenced in the U.S., with anticipated completion in 2008. In addition, scientists working in Mexico the basis for devising sound methods to breed better plants. at Langebio (the National Genomics for Biodiversity Once the relationships among sequence data and more traditional maize data like genotypes, phenotypes, stocks, Laboratory) and Cinvestav (Centro de Investigacion y Estudios Avanzados) have announced through a press and so forth have been captured, it is important that those release (July 12, 2007) that they completely sequenced data be presented to researchers in a way that can be easily 95% of the genes with 4X coverage in a native Mexican understood without requiring that they have any awareness popcorn called palomero, though the data have not yet of how the data are actually stored within a database. It is these needs—creating connections between sequence and been released and the quality of the data is unknown (see http://www.bloomberg.com/apps/news?pid=20601086&sid= traditional genetic data, improving the interface to those aO.Xj8ybAExI&refer=latin america). At present and as more data, and determining how sequence data relate to the overall architecture of the maize chromosome complement—that maize sequence becomes available relating sequences to the existing compendium of maize data is the primary need the MaizeGDB team seeks to fulfill in the very near future. that must be met for maize researchers in the immediate future. Creating and conserving relationships among the ACKNOWLEDGMENTS data will enable researchers to ask and answer questions about the structure and function of the maize genome We are indebted to the community of maize researchers and the MaizeGDB Working Group (Drs. Volker Brendel, that previously could not be addressed. To address this need, MaizeGDB personnel will create a “genome view” by Ed Buckler, Karen Cone, Mike Freeling, Owen Hoekenga, Lukas Mueller, Marty Sachs, Pat Schnable, Tom Slezak, Anne adopting and customizing a Genome Browser that could be used to integrate the outcomes of the Maize Genome Sylvester, and Doreen Ware) for their continued enthusiasm, Sequencing Project. For genome browser functionality, basic help, and guidance. We are grateful to Dr. Bill Beavis for giving us the idea to highlight MaizeGDB’s utility for the researchers have an interest in visualizing genome structure, gene models, functional data, and genetic variability. three user types. We thank Drs. Mike McMullen, Jenelle Translational researchers would like to be able to assign Meyer, Bill Tracy, and Tom Peterson for helpful discussions concerning p1 and maysin research as well as Dr. Damon values to genomic and genetic variants (e.g., the value of a particular allele in a given population) and to view Lisch for suggestions on seminal discoveries in maize and those values within a genomic context. Applied researchers Mark Millard at the USDA-ARS North Central Regional are interested in tagging variants for use as selectable Plant Introduction Station for samples of corn with red cobs. markers and retrieving tags for particular regions of the genome. To best meet these researchers’ needs, the “genome REFERENCES view” will allow researchers to visualize a gene within its [1] C. J. Lawrence and V. Walbot, “Translational genomics for genomic context and a soon to be created “pathway view” bioenergy production from fuelstock grasses: maize as the will enable the visualization of a gene product within the model species,” The Plant Cell, vol. 19, no. 7, pp. 2091–2094, context of relevant metabolic pathways annotated with Plant Ontology (http://www.plantontology.org/)[34]and [2] B. McClintock, “The origin and behavior of mutable loci in Gene Ontology (http://www.geneontology.org/index.shtml) maize,” Proceedings of the National Academy of Sciences of the [35] terms. By making sequence information more easily United States of America, vol. 36, no. 6, pp. 344–355, 1950. accessible and fully integrated with other data stored at [3] N. Fedoroff, S. Wessler, and M. Shure, “Isolation of the MaizeGDB, it will become possible for researchers to begin transposable maize controlling elements Ac and Ds,” Cell, vol. to investigate how sequence relates to the architecture 35, no. 1, pp. 235–242, 1983. of the maize chromosome complement. How are the [4] H. B. Creighton and B. McClintock, “A correlation of cytologi- chromosomes arranged? Is it possible to relate the genetic cal and genetical crossing-over in Zea mays,” Proceedings of the 10 International Journal of Plant Genomics National Academy of Sciences of the United States of America, tation in maize floral organs by directly activating a flavonoid vol. 17, no. 8, pp. 492–497, 1931. biosynthetic gene subset,” Cell, vol. 76, no. 3, pp. 543–553, [5] E. H. Coe Jr., “The properties, origin, and mechanism of 1994. conversion-type inheritance at the B locus in maize,” Genetics, [24] B. R. Wiseman, M. E. Snook, and D. J. Isenhour, “Maysin vol. 53, no. 6, pp. 1035–1063, 1966. content and growth of corn earworm larvae (Lepidoptera: [6] C. J. Lawrence, M. L. Schaeffer,T.E.Seigfried,D.A.Campbell, Noctuidae) on silks from first and second ears of corn,” Journal and L. C. Harper, “MaizeGDB’s new data types, resources and of Economic Entomology, vol. 86, no. 3, pp. 939–944, 1993. activities,” Nucleic Acids Research, vol. 35, database issue, pp. [25] P. F. Byrne, M. D. Mcmullen, M. E. Snook, et al., “Quantitative D895–D900, 2007. trait loci and metabolic pathways: genetic control of the [7] D. H. Ware, P. Jaiswal, J. Ni, et al., “Gramene, a tool for grass concentration of maysin, a corn earworm resistance factor, in genomics,” Plant Physiology, vol. 130, no. 4, pp. 1606–1613, maize silks,” Proceedings of the National Academy of Sciences of 2002. the United States of America, vol. 93, no. 17, pp. 8820–8825, [8] Achievements of the National Plant Genome Initiative and New Horizons in Plant Biology, National Academies Press, [26] J. M. Thornsberry, M. M. Goodman, J. Doebley, S. Kresovich, Washington, DC, USA, 2008. D. Nielsen, andE.S.Buckler,“Dwarf8 polymorphisms [9] D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and associate with variation in flowering time,” Nature Genetics, D. L. Wheeler, “GenBank,” Nucleic Acids Research, vol. 35, vol. 28, no. 3, pp. 286–289, 2001. database issue, pp. D21–D25, 2007. [27] M. Yano and T. Sasaki, “Genetic and molecular dissection of [10] Q. Dong, C. J. Lawrence, S. D. Schlueter, et al., “Comparative quantitative traits in rice,” Plant Molecular Biology, vol. 35, no. plant genomics resources at PlantGDB,” Plant Physiology, vol. 1-2, pp. 145–153, 1997. 139, no. 2, pp. 610–618, 2005. [28] S. A. Flint-Garcia, A.-C. Thuillet, J. Yu, et al., “Maize associ- [11] J. Quackenbush, F. Liang, I. Holt, G. Pertea, and J. Upton, ation population: a high-resolution platform for quantitative “The TIGR gene indices: reconstruction and representation of trait locus dissection,” The Plant Journal,vol. 44, no.6,pp. expressed gene sequences,” Nucleic Acids Research, vol. 28, no. 1054–1064, 2005. 1, pp. 141–145, 2000. [29] S. Salvi, G. Sponza, M. Morgante, et al., “Conserved non- [12] S. F. Altschul, T. L. Madden, A. A. Schaffer, et al., “Gapped coding genomic sequences associated with a flowering-time BLAST and PSI-BLAST: a new generation of protein database quantitative trait locus in maize,” Proceedings of the National search programs,” Nucleic Acids Research, vol. 25, no. 17, pp. Academy of Sciences of the United States of America, vol. 104, 3389–3402, 1997. no. 27, pp. 11376–11381, 2007. [13] F. Wei, E. H. Coe Jr., W. Nelson, et al., “Physical and [30] W. F. Tracy, “Sweet corn,” in Specialty Corns,A.R.Hallauer, genetic structure of the maize genome reflects its complex Ed., pp. 155–198, CRC Press, Boca Raton, Fla, USA, 2nd evolutionary history,” PLoS Genetics, vol. 3, no. 7, p. e123, edition, 2000. [31] E. T. Johnson, M. A. Berhow, and P. F. Dowd, “Expression [14] J. Gardiner, S. Schroeder, M. L. Polacco, et al., “Anchoring of a maize Myb transcription factor driven by a putative 9,371 maize expressed sequence tagged unigenes to the bac- silk-specific promoter significantly enhances resistance to terial artificial chromosome contig map by two-dimensional Helicoverpa zea in transgenic maize,” Journal of Agricultural overgo hybridization,” Plant Physiology, vol. 134, no. 4, pp. and Food Chemistry, vol. 55, no. 8, pp. 2998–3003, 2007. 1317–1326, 2004. [32] J. D. F. Meyer, M. E. Snook, K. E. Houchins, B. G. Rector, N. [15] R. Lerdorf, P. MacIntyre, and K. Tatroe, Programming PHP, W. Widstrom, and M. D. McMullen, “Quantitative trait loci O’Reilly, Sebastopol, Calif, USA, 2006. for maysin synthesis in maize (Zea mays L.) lines selected for [16] L. Wall, T. Christiansen, and J. Orwant, Programming Perl, high silk maysin content,” Theoretical and Applied Genetics, O’Reilly, Cambridge, Mass, USA, 2000. vol. 115, no. 1, pp. 119–128, 2007. [17] R. Scholl, M. M. Sachs, and D. Ware, “Maintaining collections [33] S. J. Szalma, E. S. Buckler IV, M. E. Snook, and M. D. of mutants for plant functional genomics,” Methods in Molec- McMullen, “Association analysis of candidate genes for maysin ular Biology, vol. 236, pp. 311–326, 2003. and chlorogenic acid accumulation in maize silks,” Theoretical [18] S. Carpenter, “Science careers. Carving a career in transla- and Applied Genetics, vol. 110, no. 7, pp. 1324–1333, 2005. tional research,” Science, vol. 317, no. 5840, pp. 966–967, 2007. [34] K. Ilic,E.A.Kellogg,P.Jaiswal,etal., “The plantstructure [19] E. Bortiri, G. Chuck, E. Vollbrecht, T. Rocheford, R. Mar- ontology, a unified vocabulary of anatomy and morphology tienssen, and S. Hake, “ramosa2 encodes a LATERAL ORGAN of a flowering plant,” Plant Physiology, vol. 143, no. 2, pp. 587– BOUNDARY domain protein that determines the fate of stem 599, 2007. cells in branch meristems of maize,” The Plant Cell, vol. 18, no. [35] M. Ashburner, C. A. Ball, J. A. Blake, et al., “Gene ontology: 3, pp. 574–585, 2006. tool for the unification of biology. The gene ontology consor- [20] E. Bortiri, D. Jackson, and S. Hake, “Advances in maize tium,” Nature Genetics, vol. 25, no. 1, pp. 25–29, 2000. genomics: the emergence of positional cloning,” Current Opinion in Plant Biology, vol. 9, no. 2, pp. 164–171, 2006. [21] H. Wang, T. Nussbaum-Wagler, B. Li, et al., “The origin of the naked grains of maize,” Nature, vol. 436, no. 7051, pp. 714– 719, 2005. [22] E. H. CoeJr.,M.G.Neuffer, and D. A. Hosington, “The genetics of corn,” in Corn and Corn Improvement,G.F. Sprague and J. W. Dudley, Eds., pp. 81–258, American Society of Agronomy, Madison, Wis, USA, 1988. [23] E. Grotewold, B. J. Drummond, B. Bowen, and T. Peterson, “The myb-homologous P gene controls phlobaphene pigmen- International Journal of Peptides Advances in International Journal of BioMed Stem Cells Virolog y Research International International Genomics Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 Journal of Nucleic Acids International Journal of Zoology Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 Submit your manuscripts at http://www.hindawi.com The Scientific Journal of Signal Transduction World Journal Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 International Journal of Advances in Genetics Anatomy Biochemistry Research International Research International Microbiology Research International Bioinformatics Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 Enzyme Journal of International Journal of Molecular Biology Archaea Research Evolutionary Biology International Marine Biology Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png International Journal of Plant Genomics Hindawi Publishing Corporation

MaizeGDB: The Maize Model Organism Database for Basic, Translational, and Applied Research

Loading next page...
 
/lp/hindawi-publishing-corporation/maizegdb-the-maize-model-organism-database-for-basic-translational-and-e0I5o9dGnZ
Publisher
Hindawi Publishing Corporation
Copyright
Copyright © 2008 Carolyn J. Lawrence et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
ISSN
1687-5370
DOI
10.1155/2008/496957
Publisher site
See Article on Publisher Site

Abstract

Hindawi Publishing Corporation International Journal of Plant Genomics Volume 2008, Article ID 496957, 10 pages doi:10.1155/2008/496957 Resource Review MaizeGDB: The Maize Model Organism Database for Basic, Translational, and Applied Research 1, 2, 3 4, 5 6, 7 1, 2 Carolyn J. Lawrence, Lisa C. Harper, Mary L. Schaeffer, Taner Z. Sen, 1 1 Trent E. Seigfried, and Darwin A. Campbell USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011, USA Department of Agronomy, Iowa State University, Ames, IA 50011, USA USDA-ARS, Plant Gene Expression Center, 800 Buchanan Street, Albany, CA 94710, USA Department of Molecular and Biology, University of California Berkeley, Berkeley, CA 94720, USA USDA-ARS, Plant Genetics Research Unit, Columbia, MO 65211, USA Division of Plant Sciences, University of Missouri Columbia, Columbia, MO 65211, USA Correspondence should be addressed to Carolyn J. Lawrence, carolyn.lawrence@ars.usda.gov Received 31 August 2007; Accepted 10 July 2008 Recommended by Chunguang Du In 2001 maize became the number one production crop in the world with the Food and Agriculture Organization of the United Nations reporting over 614 million tonnes produced. Its success is due to the high productivity per acre in tandem with a wide variety of commercial uses. Not only is maize an excellent source of food, feed, and fuel, but also its by-products are used in the production of various commercial products. Maize’s unparalleled success in agriculture stems from basic research, the outcomes of which drive breeding and product development. In order for basic, translational, and applied researchers to benefit from others’ investigations, newly generated data must be made freely and easily accessible. MaizeGDB is the maize research community’s central repository for genetics and genomics information. The overall goals of MaizeGDB are to facilitate access to the outcomes of maize research by integrating new maize data into the database and to support the maize research community by coordinating group activities. Copyright © 2008 Carolyn J. Lawrence et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 1. INTRODUCTION model organism for developing fuelstock grasses is apparent [1]. Indeed, in addition to its value as a commodity, maize has been a premiere model organism for biological research Maize (Zea mays L.) is a species that encompasses the for over 100 years. Many seminal scientific discoveries have subspecies mays (commonly called “corn” in the US) as well as the various teosintes that gave rise to modern maize. Maize first been shown in maize, such as the identification [2]and cloning [3] of transposable elements, the correlation between is an important crop: not only is it one of the most abundant cytological and genetic crossing over [4], and the discovery of sources of food and feed for people and livestock the world over, it is also an important component of many industrial epigenetic phenomena [5]. These exceptional characteristics of maize set this amazing plant apart: no other species serves products. Maize byproducts are present in, for example, glue, paint, insecticides, toothpaste, rubber tires, rayon, and as both a commodity and a leading model for basic research. Today, with the accelerated generation of maize genetic molded plastics, among others. Maize is also currently the nation’s major source of ethanol, a major biofuel that is more and genomic information, the need for a centralized biologi- environmentally friendly than gasoline and that may be a cal data repository is critical. MaizeGDB (the Maize Genetics more economical fuel alternative in the long run. Although and genomics Data Base [6]) (http://www.maizegdb.org/) it is unlikely that ethanol production from maize directly is the Model Organism Database (MOD) for maize. Stored will be sustainable long-term, maize’s suitability to serve as a at MaizeGDB is comprehensive information on loci (genes 2 International Journal of Plant Genomics and other genetically defined genomic regions including a chromosome, the sequence records at MaizeGDB, and QTL), variations (alleles and other sorts of polymorphisms), the EST (expressed sequence tag) and GSS (genome survey stocks, molecular markers and probes, sequences, gene sequence) contig assemblies at PlantGDB [10] and Dana product information, phenotypic images and descriptions, Farber (The Gene Indices at http://compbio.dfci.harvard metabolic pathway information, reference data, and con- .edu/tgi/cgi-bin/tgi/gimain.pl?gudb=maize, previously at tact information for maize researchers. Described in the TIGR [11]). All of the 3 520 247 sequences in MaizeGDB results and discussion section are example workflows that are accessible by BLAST [12]and canbefiltered to report could be followed by researchers to utilize the MaizeGDB only mapped loci, including any SSRs and overgos that may resource for their research. Other long-term resources serv- not be mapped genetically, but via BACs (bacterial artificial ing maize data include Gramene (http://www.gramene.org/) chromosomes) in anchored contigs. [7], which specializes in grass comparative genomics, and The inclusion of the public BAC FPC (Finger Print GRIN (the Germplasm Resources Information Network; Contig) information [13] adds 439 449 BACs together with http://www.ars-grin.gov/npgs/), which provides access to the associated overgo, SSR, and RFLP markers, which are National Plant Germplasm System’s germplasm stocks and used to assemble the contigs and to link contigs onto related breeding data. MaizeGDB makes an effort to guide genetic map coordinates. The order of loci on the BAC researchers to these resources via context-sensitive linkages contigs is represented by over 27 000 sequenced-based rather than duplicating data, though some data are shared loci on the IBM2 FPC057 maps (http://www.maizegdb simply to allow for the context-sensitive linkages to be .org/cgi-bin/displaymapresults.cgi?term=ibm2+fpc0507)in created. This reduces duplication in effort and allows per- MaizeGDB, by links to contigs at both the Arizona FPC sonnel skilled in comparative genomics and germplasm con- site (http://www.genome.arizona.edu/) and the genome servation/plant breeding to interact with maize researchers sequencing project (http://www.maizesequence.org/). As the directly via Gramene and GRIN, respectively. B73 genome sequence progresses, these BAC sequences are In addition to storing and making maize data available, added to MaizeGDB along with links to the sequencing the MaizeGDB team also provides services to the commu- project, both from the BAC clones and from genetically nity of maize researchers and offers technical support for mapped loci associated with a BAC. the Maize Genetics Executive Committee and the Annual The newest maps in MaizeGDB, IBM SNP 2007 (http:// Maize Genetics Conference. Also available at the MaizeGDB www.maizegdb.org/cgi-bin/displaymapresults.cgi?term=ibm website, as a service to the maize research community, are %20snp%202007), are the first of a new generation of bulletin boards for news items, information of interest to genetic maps from the Maize Diversity Project (http://www cooperators, lists of websites for projects that focus on the .panzea.org/) kindly provided pre-publication by Dr. Mike scientific study of maize, the Editorial Board’s recommended McMullen. The SNP loci on these maps are associated reading list, and educational outreach items. with allelic sequences from a core set of maize and teosinte The genetic and genomic data as well as community- germplasm. Because the majority of the anticipated 1128 related information maintained by MaizeGDB are highly loci have been previously mapped onto BAC clones [13, 14], utilized: MaizeGDB averages 8620 visitors (based on unique these genetic maps tightly link sequence diversity to the B73 Internet Protocol or IP addresses) and over 160 000 page genome sequence. impressions per month (July 2007 to June 2008). In addition, MaizeGDB came in fifth out of 170 in a National Plant 2.2. Methods of access, environments, Genome Initiative Grantees poll in which lead principal and the database back end investigators reported most useful websites for their research [8]. 2.2.1. The production web interface Maize researchers primarily access MaizeGDB through 2. MATERIALS AND METHODS the series of interconnected Web pages available at http://www.maizegdb.org/ (see Figure 1). These web pages 2.1. Kinds of data in the database that link genetic and are dynamically generated and are written in PHP (the genome sequence information recursive abbreviation for PHP Hypertext Preprocessor [15]) MaizeGDB is the primary repository for the major genetic and Perl [16]. Through this interface, each page shows and cytogenetic maps and includes details about genes, detailed information on a specific biological entity (such as mutants, QTL (quantitative trait loci), and molecular a gene) as well as basic information about data associated markers including 2500 RFLPs (restriction fragment length with it (genes are associated with maps, phenotypes, and polymorphisms), 4625 SSRs (simple sequence repeats), citations, among others). These additional data types are 363 SNP (single nucleotide polymorphisms), 2500 indels linked to the gene page, enabling quick access to alternative (insertion/deletion sites), and 10 644 overgos (overlapping data views. The site also includes links to related resources at oligonucleotides). These data are described using 1.27 other databases; genes, for example, are linked to Gramene millions synonyms, 42 000 primer sequences, 16 394 raw [7]. scores from mapping based upon 16 panels of stocks, One may access these individual data pages by using and 323 313 links to GenBank [9] accessions. GenBank either (1) the search bar located at the top right of every page accessions form the links between the genetic position on (Figure 1(A)), or (2) data type-specific advanced querying Carolyn J. Lawrence et al. 3 Figure 1: The MaizeGDB home page. The most commonly utilized search functionality for MaizeGDB is the search bar (A), which is available within the header of any MaizeGDB page. To browse data and to search specific data types using specific limiters, the Data Centers (B) are also quite useful. Also available is a Bin Viewer (C), which allows for a view of lots of data types within the context of their chromosomal location. To enable access to the Data Centers and other displays of interest from any MaizeGDB page, a pull-down menu for “Useful pages” (D) is accessible on the header of any MaizeGDB page. The footer of all MaizeGDB pages contains a context-sensitive “feedback form” link (E). Researchers use the feedback form to report errors, ask questions, and to contact the MaizeGDB team directly. For newcomers to the site, the MaizeGDB Tutorial (F) can help them to get a jump start on how to use the site. tools (accessible via the “Data Centers” links; Figure 1(B)) 2.2.2. Structure and relationship of environments: on the left side of the home page, or (3) the Bin Viewer tool production, staging, and test (Figure 1(C)), which is located in the left margin of the home page or via a pull down labeled “Useful pages” (Figure 1(D)) The production Web interface, which most MaizeGDB accessible at the top of any MaizeGDB page. These tools allow users interact with, is only one component of the overall researchers to easily find relevant data displays. MaizeGDB infrastructure (Figure 2). The data accessed by MaizeGDB’s method of data delivery has three primary the production Web interface are typically updated on goals: placing information within the framework of its the first Tuesday of each month. Prior to being in that scientific meaning, making this information available to Production Environment, the data are prepared for public the researcher with minimal input (often only the relevant accessibility in a Staging Environment. In the Staging Envi- term), and requiring minimal effort from the researcher to ronment, the most up-to-date information is available, new comprehend the data displays. By focusing on biological data are added to the database, and existing data are updated context and ease of use as the primary focus of this with new information. In addition to a Web interface that interface (the “production” Web interface), the database appears identical to the one in the Production Environment, is intended to be intuitive to the researcher as their click the Staging Environment offers SQL (Structured Query stream follows a logical path of biological associations. Language) read-only access to the community so that Up-to-date site usage statistics can be accessed online at researchers interested in interacting with the data in a more http://www.maizegdb.org/usage/. direct and customized manner can have access to the most 4 International Journal of Plant Genomics up-to-date information available. In addition, a Disaster drives with Red Hat Advanced Server 2.1 operating system Recovery system has been put in place whereby the Curation installed. The curation database, either partially or in its Database is backed up in a compressed format to a separate entirety, can be moved to MySQL, Microsoft Access, and machine in Ames, Iowa daily. Once weekly, the Ames file is nearly any other portable data format that a researcher would copied to Columbia, Missouri for off-site storage. need. Requests to gain read-only SQL access to the Curation To aid in the modeling of new types of data for inclusion database can be made via the feedback link that appears at the in the MaizeGDB product and to enable programming to bottom of any MaizeGDB page. Data housed at MaizeGDB be tried out in a safe place, a Test Environment identical to are in the public domain and are freely available for use the Staging Environment has been created. Note that three without a license. copies of the database exist. While each environment and server has a specific purpose, all are configured such that they 2.3. Outreach could serve a backup to each other. If any one server was to fail, either of the other two could provide full, unrestricted One of the strengths of MaizeGDB is its responsiveness data access and site functionality. The curation database is to community input, received either personally or by the backed up on a daily basis and is available for download feedback forms accessible at the bottom of each page (http://goblin1.zool.iastate.edu/∼oracle/) for those who have (Figure 1(E)). To provide outreach and user support as well Oracle Relational Database Management System (RDBMS) as to solicit input from researchers in a more active manner, installed locally. several strategies are employed. The first is tutorials and basic information on MaizeGDB. The MaizeGDB Tutorial (Figure 1(F)) can be reached from the home page at the top 2.2.3. Curation of the left margin. A new user can go through this tutorial, Also available within the Staging Environment are Com- and become familiar with how to use the site quickly. In munity Curation Tools to enable researchers to add small addition, a “Site Tour” with an overview with examples can datasets to the database directly, as well as a set of Profes- be found under the “Useful pages” pull down menu at the sional Curation Tools developed by Dr. Marty Sachs’ group top of each page. More specific tutorial examples and other at the Maize Genetics Cooperation-Stock Center in Urbana- educational materials are available via the “Education” link, Champaign [17]. Whereas the Community Curation Tools also within the “Useful pages” pull down menu. Also, on have many safeguards to help researchers enter data step- many of the “Data Center” pages (available from the left wise and with enforced field requirements, the Professional margin of the front page or via the “Useful pages” pull down) Curation Tools allow MaizeGDB project members as well a discussion of the topic of the page that is suitable for the as Stock Center personnel to enter datasets in a more general public appears toward the bottom. Another form of stream-lined and powerful fashion with fewer integrity outreach supported by MaizeGDB is assistance at meetings enforcement rules (which slow down the data entry process and conferences. Representatives from MaizeGDB attend and considerably). It also should be noted that data added help researchers at the Annual Maize Genetics Conference to the database via the Community Curation Tools are (usually in March), the International Plant and Animal first marked as “Experimental” that must be “Activated” Genome Conference (January), and various other meetings by professional curators at MaizeGDB. This ensures that through direct interaction in person. Finally, researchers can only quality information is made publicly accessible. The request a MaizeGDB site visit. About three times a year, availability of a Curation Web interface (within the Staging an expert curator travels to various research locations and Environment) enables researchers to view the data as they provides tutorials and support for maize researchers. For will appear once they are uploaded to Production. Few these visits, the local maize researchers are asked for a list of researchers (about 30 at present) have Community Curation specific questions ahead of time. During the one to two day accounts. To increase the use of these tools, training sessions visits, researchers interact in groups and one-on-one with the are being organized (see Section 2.3, below). If researchers traveling curator to learn how to utilize MaizeGDB for their wish to deposit complex or large datasets, it would not be research and to deposit data at MaizeGDB. reasonable to enter the data via the Community Curation Tools because those tools work via a “bottom-up” approach 2.4. Community support services whereby the records are (1) built based upon the most basic information included in the dataset and (2) entered one MaizeGDB provides community support in several ways. record at a time (i.e., not in bulk). For complex or large Two members of the MaizeGDB team, MLS and TES, serve datasets, researchers are encouraged to submit data files to as ex officio members of the Maize Genetics Conference the curators at MaizeGDB. Those data are added to the Steering Committee. They collect electronic abstracts database directly by curators and the database administrator. for the Annual Maize Genetics Conference and handle the preparation and printing of the program for the conference. MaizeGDB personnel also manage regular 2.2.4. Database back end community surveys on behalf of the Maize Genetics Each environment’s server has a perpetual license and is Executive Committee. These surveys enable the Executive supportedbyOracleRDBMS poweredby2 × 2.0GHz Xeon Committee to summarize the overall research interest processors, 4 GB of RAM, 5 × 73 GB Ultra 320 10 K RPM of the maize community and to advise funding agencies Carolyn J. Lawrence et al. 5 Maize community Data deposition SQL access Curation Data retrieval Community Professional Curation Curation Tools Tools Primary data Production database (unformatted Genome browser Curation database Genome Production Environment browser Staging Environment “Playground” database Genome browser Ames Columbia IA MO Professional Community Curation Curation Tools Tools MaizeGDB Disaster Recovery Test Environment Figure 2: Simplified infrastructure of MaizeGDB. The community of maize researchers can add data to the database (downward-facing arrows from the uppermost yellow box) via direct data deposition (upper left) and via a set of Community Curation Tools that interacts with the Curation Database (upper center). Researchers are also allowed access to maize data (upward-facing arrows from the lower dashed box) via a web interface that can be accessed at http://www.maizegdb.org/ (upper right) and by way of SQL access to the Curation Database, which houses the most up-to-date data available (upper center). These functionalities are supported by two of the three environments: Production and Staging, respectively (upper dashed gold boxes). Available for use by MaizeGDB personnel to facilitate data modeling and trial programming manipulations is a third environment called Test (lower left dashed gold box), which is identical to the Staging Environment. To ensure that the most up-to-date copy of the database is backed up, a Disaster Recovery process has been instituted (lower center dashed gold box) whereby a compressed copy of the database is backed up to a separate machine in Ames, Iowa daily, and to a server in Columbia, Missouri weekly. on future research directions. MaizeGDB personnel simultaneously, here the researcher types are distinguished also manage the Executive Committee’s website (i.e., as follows: basic researchers investigate the fundamental http://www.maizegdb.org/mgec.php) and conducts the biology of the organism, translational researchers work to Executive Committee’s elections. MaizeGDB houses the determine the application of basic research outcomes for mailing list for the annual Maize Newsletter and project practical purposes [18], and applied researchers implement personnel conduct semi-regular mailings to the maize com- proven technologies to improve crops. munity on behalf of interested researchers by maintaining an electronic list of researchers’ contact information. Potential 3.1. Basic mailings to this list are vetted by the Executive Committee. Many basic researchers work with mutants to understand the processes underlying biological phenomena. Once a new 3. RESULTS AND DISCUSSION mutant is found, there are several standard methods used To demonstrate how researchers utilize MaizeGDB, three to elucidate normal gene functions. These efforts include example usage cases are presented here. Because researchers determining whether the mutant represents an allele of a with very different goals can all utilize MaizeGDB to advance previously described gene, and if not, genetic mapping and their work, the usage cases are classified by research type: cloning of the new gene. Information stored in MaizeGDB is basic, translational, and applied. See Figure 3 for examples of useful in all of these steps. how these research types fit together. By enabling researchers In a large screen for mutations that change pericarp to carry out workflows that support translational and pigmentation from red to some other color, Researcher applied research, MaizeGDB plays a part in influencing crop 1 has found a plant with a brownish-red pericarp development directly. Although a single researcher might coloration. She first wants search MaizeGDB to find even include all of these three aspects in his/her research all known mutants that have red pericarp phenotypes Web interface Web interface Web interface 6 International Journal of Plant Genomics Basic Translational Applied Scientific investigation Investigation to gain The scientific work required to develop a knowledge and carried out to create understanding of a clinical or commercial therapies or Definition particular subject application from a commercial/consumer without regard to basic science products practical applications discovery Investigating disease Determining the relative Using the relative mechanisms and risk of an allele or risk information to Medical processesaswellas therapytoapatient examplar prescribe therapies chemical effects on cellular processes Investigating new Making breeding Utilizing variants as information on, for crosses to create selectable biomarkers Plant example, genome plants that have in breeding crosses biology structure, gene models, value-added traits for and progeny examplar gene function, and real-world production genetic variability agricultural uses Identifying markers Using markers Determining the Example (morphological and genetic cause of red associated with both usage molecular) to guide pigmentation and pigmentation in the cases (See breeding programs to maysin production for pericarpof a newly produce ear worm Section 3) corn ear worm discovered mutant resistant sweet corn resistance lines Figure 3: Three types of biological research. Research can be divided into three categories: basic, translational, and applied. Outcomes from basic research feed into translational predictions, and developed uses for these findings constitute the basis for developing real-world applications that benefit humanity and the world. Listed after the flow of research are definitions for each research type as well as medical and plant biological models for how the different divisions are interrelated. Also shown are overviews of the example usage cases presented in Section 3. to determine whether this mutation represents a newly she orders seed from the Stock Center directly through the discovered gene. Because she does not know how others MaizeGDB interface. She then goes back to the results of her might have described the phenotype, she decides to “pericarp color” query and repeats the process for “cherry browse existing phenotype terms and images. From the pericarp,” ordering stocks for r1-ch (colored1-cherry), also left margin of the MaizeGDB homepage, she selects to be used in her complementation analyses. (Another way “Mutant Phenotypes” under “Data Centers-Functional.” she could have found maize stocks that have red pericarp is On this page (http://www.maizegdb.org/), she selects the following: from the header of any page, select “Useful “pericarp color” from the pull down menu labeled pages” and click “Stocks.” This pulls up the stock search “Show only phenotypes relating to this trait” in the page http://www.maizegdb.org/stock.php. In the green box, green search bar. A number of possible mutant phenotypes select stocks with the phenotype “red pericarp” from the are returned, including “red pericarp.” Clicking on the pull down menu of all phenotype names and submit. A “red pericarp” phenotype link, she finds that the listed long list of stocks that contain alleles of p1 with red pericarp mutants are alleles of p1 (pericarp color1). On this page is returned. Alternatively, the Stock Center Catalog is also (http://www.maizegdb.org/cgi-bin/displayphenorecord.cgi? available from the Stocks Data Center page.) id=13818), she scrolls to the bottom and finds that there Researcher 1 receives several appropriate stocks and are many stocks that can be ordered from the Maize performs allelism tests and determines that her mutant Genetics Cooperation-Stock Center that carry P1-rr (an (which turns out to be recessive) is not allelic to p1 or r1. allele that causes red pericarp and red cob) or P1-rw (red She returns to MaizeGDB and again looks through “Mutant pericarp and white cob). Having these stocks in hand will Phenotype” results using the “pericarp color” query. Listed enable her to test whether the new mutant represents an there are brown pericarp, orange pericarp, white pericarp, allele of the p1 gene, so she decides to order a few for and lacquer red pericarp phenotypes in addition to the complementation analyses. Clicking on the stock links red and cherry phenotypes she focused on initially. She listed on the variation/allele page allows her access to a finds that there is no stock available for the brown pericarp shopping cart utility (in the green right hand panel), and phenotype (the brown pericarp1 mutant has been lost), and Carolyn J. Lawrence et al. 7 all the others are alleles that confer colored pericarp in the and any available GenBank accession numbers for sequences dominant condition as a result of the presence of P1 alleles. as well as sequenced BACs. She finally selects markers and To determine whether the new mutation could be an allele of performs fine structure mapping. As she finds markers bp1, she decides to map it genetically. closer and closer to the gene, she can proceed with positional MaizeGDB houses the largest collection of publicly cloning to determine whether the position is consistent with available genetic maps of maize (currently over 1,337 maps). bp1 (niceexamplesofhow thisisdonecan be foundin These include maps of genes primarily defined by mutants [19–21]). with morphological phenotypes (“Genetic 2005” is the most current), maps based on phenotypic molecular markers, and 3.2. Translational compositemapswhere variousmapshavebeenintegrated. These maps can be easily accessed from the home page, Research to understand the metabolic pathways that produce via the left margin link to “Data Centers-Genetic-Maps” pigmentation (like those outlined in Section 3.1)are well (http://www.maizegdb.org/map.php). This page not only studied in maize [22]. One example of a well-characterized allows various map search functions, but also provides infor- gene that confers pigmentation is p1,which encodesa mation on the most popular maps and a handy reference to transcription factor that regulates synthesis of flavones such explain more about the various composite maps. as anthocyanins [23]. The p1 gene, along with its adjacent The maize genome is divided into genetic bins of duplicate pericarp color2 (p2), controls pericarp and cob approximately 20 centiMorgans each and boundary markers coloration and causes silks to brown when cut. One flavone with nearby SSRs can be used for mapping (for further expla- produced by the pathway is maysin, a compound which nation see http://www.maizegdb.org/cgi-bin/bin viewer.cgi). has been shown to be antinutritive to the corn ear worm Researcher 1 decides to utilize SSRs to map her gene to bin at concentrations above 0.2% fresh weight if husks limit resolution. To find the core markers from the home page, access to the ear such that feeding on silks is required for she clicks on “Tools-Bin Viewer” in the left margin of the the insect to enter [24]. Many QTL for resistance to corn home page. This provides a list of the core bin markers and earworm map near loci in the flavone synthesis pathway that a link to purchase relevant primers to screen her mapping are either regulatory genes (such as p1 and p2), or at rate- population. She generates a mapping population, performs limiting enzymatic steps, such as c1 (chalcone synthase1) that PCR experiments using the polymorphic markers, and maps contribute maysin accumulation in silks [25]. Understanding her mutant to bin 9.02. how maysin functions and how this information could be To see what genes are located in bin 9.02, she goes back used for production agriculture is Researcher 2’s area of to the Bin Viewer (from the homepage), and holds the curser expertise. over the image of chromosome 9 until she sees “bin 9.02,” Researcher 2 has investigated maysin synthesis for some then clicks. The result is a long list of genes, other loci, time, and has decided to clone an uncharacterized maysin sequences, EST contigs, SSRs, BACs, and other data relating QTL near umc105a, in the bin 9.02, which is bounded by to bin 9.02. Searching through this data, she sees that bp1 is bz1 and wx1 [24]. He believes that the QTL may be a listed under “other loci” in bin 9.02. This is a “lapsed locus” previously described, but lost, bp1 mutant thought to be meaning that the stock has been lost, but perhaps she has involved in maysin synthesis. In the first step, he must first foundanewallele! find molecular markers to more finely map the region (his To see more specific genetic mapping data on bp1, preference would be to use SSRs, since members of the she goes to the search bar along the top green bar of lab are already using them successfully). He plans to follow every page, selects “loci” from the pull down menu, the strategy of chromosome walking to narrow down the types “bp1” into the field provided, and clicks the button region of interest [19–21] followed by association mapping marked “Go!” This brings her to the bp1 locus page to identify the actual QTL sequence [26, 27]. Knowing this (http://www.maizegdb.org/cgi-bin/displaylocusrecord.cgi? sequence would enable plant breeders to track the QTL for id=61563) where she can see that bp1 is placed on three marker assisted selection. genetic maps. Clicking on each map, Researcher 1 learns that To find SSR data for mapping to a bin region, Researcher in 1935, bp1 was mapped between sh1 and wx1 (shrunken1 2 goes to the MaizeGDB home page and clicks on “Data and waxy1), two well-studied genes. To search for molecular Centers-Genomic-Molecular Markers/Probes” in the left markers suitable for fine structure mapping, she visits “Data margin, then clicks the “SSR” link at the top of the page Centers-Genetic-Maps” from the link on the home page. (the link is located in “Specific information is available on In the green Advanced Search box, she enters sh1 and wx1 BACs, ESTs, overgos, and SSRs.”) Scrolling down to the green separately in the “Show only maps containing this locus” “Set Up Criteria” box, he then selects bin 9.02 and submits lines. This returns only genetic maps that contain both a search request. A report is returned that lists the available genes. She selects the map with the most markers—IBM2 SSRs for bin 9.02, complete with primers, gel patterns for 2005 Neighbors 9 (with 2,488 markers). She finds sh1 at different germplasm, and related maps. By going back to the position 80.30, and wx1 at 185.00. To choose among several SSR page, he also downloads tabular reports of map locations molecular markers, Researcher 1 follows the available links of all SSRs on chromosome 9, including those that have been leading her to information about suitable primers, a number anchored to a BAC contig. Using this information in the of variations (which can help to decide if there may be a laboratory, members of his research group perform mapping polymorphism in her mapping populations), gel patterns, experiments using several SSRs in bin 9.02 along with some 8 International Journal of Plant Genomics others in the more distal part of bin 9.03. They discover that In the instance of looking for particular stocks, the mid-region peak for the QTL is very near an SSR for Researcher 3 has identified GT114 as a high maysin line from bnlg1372, which is anchored to a BAC contig. [25]. Using the green search bar at the top of any MaizeGDB To find sequenced BACs that may harbor the earworm page, he searches “stocks” for “GT114.” At that page, he resistance QTL, Researcher 2 uses the search bar at the sees a brief annotation stating that GT114 is a poor pollen top of each MaizeGDB page to find the locus bnlg1372. producer and makes a note of that observation and plans to At the top of the bnlg1372 page, he follows a link to cross by IA453 and IA5125, sweet lines that produce pollen the contig 373 display at the Maize Sequencing Project well, to ameliorate this potential difficulty. Clicking the link site (http://www.maizesequence.org/). This is a rather large to GT114, he sees that it is an inbred line derived from GT- contig with many sequenced BACs and assigned markers. DDSA (DD Syn A) in Georgia, and it is made available via At the Maize Sequencing Project site, he uses the export GRIN. Selecting the link for GRIN, a page opens at that site function (a button at the left margin) to view a text list (http://www.ars-grin.gov/cgi-bin/npgs/html/search.pl?PI+ of all the markers and sequenced BAC clones that are 511314). Listed there are the Crop Science Registration data, available on the Finger Print Contig physical map. He finds availability (noted as currently unavailable, but a call to Mark that bnlg1372 is assigned to the region “19742100,1974700,” Millard, maize curator at the maintenance site indicates that encompassed by the sequenced BAC clone, c0324E10. This he could access that stock in limited quantities if current information provides coordinates for viewing the region resources allow), and an image of bulk kernels among on a large contig associated with bnlg1372, the sequence other information. The image of bulked kernels is especially of BAC c0324E10, and any other BACs nearby. Researcher revealing: the kernels are yellow and the cob fragments 2 sequences candidate regions in diverse germplasm and appear red. Aware that a red cob would be unacceptable for conducts association analysis using silk maysin levels as breeding sweet corn (the red pigment could cause quite a a trait. This may require other information about nearby mess for those cooking and eating corn on then cob), he markers, which also are accessible via MaizeGDB [28, 29]. decides to search MaizeGDB for other available high maysin Although these investigations may require the devel- stocks. opment of further sequenced-based markers, Researcher 2 After a literature search of breeding stocks with a white hopes that useful markers already exist and decides to explore cob that might still produce maysin in the silks, Researcher MaizeGDB for any other sequences or primer-based markers 3 starts searching stocks for those known to carry the P1- already assigned to the region of interest including SNPs and wwb allele, a dominant allele of the p1 locus that confers indels. To do this from the locus page for bnlg1372, he clicks white pericarp, white cob, and browning silks. By clicking on the link to the most current IBM neighbors map listed, the “Data Centers-Genetic-Stocks” link from the MaizeGDB then explores the “sequence” and “primer” view versions of homepage, he arrives at the Stocks Data Center page (which the map by clicking on the relevant links at the top of the page is also accessible via the “Useful pages” pull down at the top just under the map name. The primer view shows primers of every MaizeGDB page). He uses the Advanced Search box associated with mapping probes along with the name of the to limit the query by variation to those stocks associated probes—just what he needs to get going with the association with the allele P1-wwb. A number of the stocks returned mapping work. on the results page have been evaluated for silk maysin accumulation (per associated publications) and could be further investigated as potential breeding stocks. 3.3. Applied Although the p1 gene accounts for much of the Interested in breeding plants for organic sweet corn produc- variability in maysin accumulation [32], association and tion, Researcher 3 has decided to use molecular markers to QTL analyses for candidate genes for maysin accumulation select for high maysin content, which would increase resis- also have identified anthocyaninless1 (a1), colorless2 (c2), and tance to the corn earworm—a cause of significant damage white pollen1 (whp1) as contributing significantly [32, 33]. Researcher 3 can track the dominant P1-wwb allele visually to sweet corn [30]. Although plants could be genetically modified to carry the genes that confer high maysin levels by selecting for browning silks given that the sweet lines in silks (e.g., see [31]), Researcher 3’s farming clients require he will be using in the breeding program have silks that do not brown, but tracking the other factors will require the that their product be certified as both organic and “GMO- free.” To meet the producers’ needs, he has decided to pursue use of molecular markers. To find molecular markers to a marker-assisted selection program to create high maysin select for desirable alleles of, for example, a1, Researcher 3 sweet inbred lines, which he will use to generate single- uses the search menu at the top of any page at MaizeGDB cross hybrids. To get started with the work, he searches to find “loci” using the query “a1.” The results page MaizeGDB to find references, markers, and stocks for the (http://www.maizegdb.org/cgi-bin/displaylocusresults.cgi? project. Described here are the details on how he could use term=a1) lists many loci with a1 as a substring, but MaizeGDB to (1) access stocks known to have high maysin shows the exact match (the a1 locus) at the top of the list. Clicking on that link shows the a1 locus page content directly and (2) locate relevant stocks based upon associated data with no prior knowledge of which stocks (http://www.maizegdb.org/cgi-bin/displaylocusrecord.cgi? he wants to find. An outline of how he uses MaizeGDB to id=12000), which lists useful information including six probes/molecular markers that could be used for identify relevant selectable markers for tracking the various QTL associated with maysin accumulation also is described. tracking useful a1 alleles. Using the same process, he also Carolyn J. Lawrence et al. 9 finds markers for the c2 and whp1 loci and sets to work and cytological maps to the assembled genome sequence? determining which markers to use for his selections. Are there sequences present at centromeres that signal the cell to construct kinetochores, the machines that ensure proper chromosome segregation to occur, at the correct site? 4. CONCLUSIONS MaizeGDB aims to enable researchers to discover answers Because MaizeGDB stores and makes accessible data of use to such queries that will enhance the quality of basic maize for a variety of applications, it is a resource of interest to research and ultimately the value of maize as a crop. It will maize researchers spanning many disciplines. The fact that become possible to interrogate the database to find answers basic research outcomes are tied to translational and applied to these and other complex questions, and the content of the data enables all researcher types to utilize the MaizeGDB genome can better be related to its function, both within the resource to further their research goals, and connections to cell and to the plant as a whole. Convergence of traditional external resources like Gramene, NCBI, and GRIN make it biological investigation with the knowledge of genome possible for researchers to find relevant resources quickly, content and organization is currently lacking, and is a new irrespective of storage location. area of research that will open up once a complete genome At present, maize geneticists are at the cusp of a sequence and a method for searching through the whole milestone: the genome of the maize inbred B73 is being of the data are both in place. It is the ability to investigate and answer such basic research questions that will serve as sequenced in the U.S., with anticipated completion in 2008. In addition, scientists working in Mexico the basis for devising sound methods to breed better plants. at Langebio (the National Genomics for Biodiversity Once the relationships among sequence data and more traditional maize data like genotypes, phenotypes, stocks, Laboratory) and Cinvestav (Centro de Investigacion y Estudios Avanzados) have announced through a press and so forth have been captured, it is important that those release (July 12, 2007) that they completely sequenced data be presented to researchers in a way that can be easily 95% of the genes with 4X coverage in a native Mexican understood without requiring that they have any awareness popcorn called palomero, though the data have not yet of how the data are actually stored within a database. It is these needs—creating connections between sequence and been released and the quality of the data is unknown (see http://www.bloomberg.com/apps/news?pid=20601086&sid= traditional genetic data, improving the interface to those aO.Xj8ybAExI&refer=latin america). At present and as more data, and determining how sequence data relate to the overall architecture of the maize chromosome complement—that maize sequence becomes available relating sequences to the existing compendium of maize data is the primary need the MaizeGDB team seeks to fulfill in the very near future. that must be met for maize researchers in the immediate future. Creating and conserving relationships among the ACKNOWLEDGMENTS data will enable researchers to ask and answer questions about the structure and function of the maize genome We are indebted to the community of maize researchers and the MaizeGDB Working Group (Drs. Volker Brendel, that previously could not be addressed. To address this need, MaizeGDB personnel will create a “genome view” by Ed Buckler, Karen Cone, Mike Freeling, Owen Hoekenga, Lukas Mueller, Marty Sachs, Pat Schnable, Tom Slezak, Anne adopting and customizing a Genome Browser that could be used to integrate the outcomes of the Maize Genome Sylvester, and Doreen Ware) for their continued enthusiasm, Sequencing Project. For genome browser functionality, basic help, and guidance. We are grateful to Dr. Bill Beavis for giving us the idea to highlight MaizeGDB’s utility for the researchers have an interest in visualizing genome structure, gene models, functional data, and genetic variability. three user types. We thank Drs. Mike McMullen, Jenelle Translational researchers would like to be able to assign Meyer, Bill Tracy, and Tom Peterson for helpful discussions concerning p1 and maysin research as well as Dr. Damon values to genomic and genetic variants (e.g., the value of a particular allele in a given population) and to view Lisch for suggestions on seminal discoveries in maize and those values within a genomic context. Applied researchers Mark Millard at the USDA-ARS North Central Regional are interested in tagging variants for use as selectable Plant Introduction Station for samples of corn with red cobs. markers and retrieving tags for particular regions of the genome. To best meet these researchers’ needs, the “genome REFERENCES view” will allow researchers to visualize a gene within its [1] C. J. Lawrence and V. Walbot, “Translational genomics for genomic context and a soon to be created “pathway view” bioenergy production from fuelstock grasses: maize as the will enable the visualization of a gene product within the model species,” The Plant Cell, vol. 19, no. 7, pp. 2091–2094, context of relevant metabolic pathways annotated with Plant Ontology (http://www.plantontology.org/)[34]and [2] B. McClintock, “The origin and behavior of mutable loci in Gene Ontology (http://www.geneontology.org/index.shtml) maize,” Proceedings of the National Academy of Sciences of the [35] terms. By making sequence information more easily United States of America, vol. 36, no. 6, pp. 344–355, 1950. accessible and fully integrated with other data stored at [3] N. Fedoroff, S. Wessler, and M. Shure, “Isolation of the MaizeGDB, it will become possible for researchers to begin transposable maize controlling elements Ac and Ds,” Cell, vol. to investigate how sequence relates to the architecture 35, no. 1, pp. 235–242, 1983. of the maize chromosome complement. How are the [4] H. B. Creighton and B. McClintock, “A correlation of cytologi- chromosomes arranged? Is it possible to relate the genetic cal and genetical crossing-over in Zea mays,” Proceedings of the 10 International Journal of Plant Genomics National Academy of Sciences of the United States of America, tation in maize floral organs by directly activating a flavonoid vol. 17, no. 8, pp. 492–497, 1931. biosynthetic gene subset,” Cell, vol. 76, no. 3, pp. 543–553, [5] E. H. Coe Jr., “The properties, origin, and mechanism of 1994. conversion-type inheritance at the B locus in maize,” Genetics, [24] B. R. Wiseman, M. E. Snook, and D. J. Isenhour, “Maysin vol. 53, no. 6, pp. 1035–1063, 1966. content and growth of corn earworm larvae (Lepidoptera: [6] C. J. Lawrence, M. L. Schaeffer,T.E.Seigfried,D.A.Campbell, Noctuidae) on silks from first and second ears of corn,” Journal and L. C. Harper, “MaizeGDB’s new data types, resources and of Economic Entomology, vol. 86, no. 3, pp. 939–944, 1993. activities,” Nucleic Acids Research, vol. 35, database issue, pp. [25] P. F. Byrne, M. D. Mcmullen, M. E. Snook, et al., “Quantitative D895–D900, 2007. trait loci and metabolic pathways: genetic control of the [7] D. H. Ware, P. Jaiswal, J. Ni, et al., “Gramene, a tool for grass concentration of maysin, a corn earworm resistance factor, in genomics,” Plant Physiology, vol. 130, no. 4, pp. 1606–1613, maize silks,” Proceedings of the National Academy of Sciences of 2002. the United States of America, vol. 93, no. 17, pp. 8820–8825, [8] Achievements of the National Plant Genome Initiative and New Horizons in Plant Biology, National Academies Press, [26] J. M. Thornsberry, M. M. Goodman, J. Doebley, S. Kresovich, Washington, DC, USA, 2008. D. Nielsen, andE.S.Buckler,“Dwarf8 polymorphisms [9] D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and associate with variation in flowering time,” Nature Genetics, D. L. Wheeler, “GenBank,” Nucleic Acids Research, vol. 35, vol. 28, no. 3, pp. 286–289, 2001. database issue, pp. D21–D25, 2007. [27] M. Yano and T. Sasaki, “Genetic and molecular dissection of [10] Q. Dong, C. J. Lawrence, S. D. Schlueter, et al., “Comparative quantitative traits in rice,” Plant Molecular Biology, vol. 35, no. plant genomics resources at PlantGDB,” Plant Physiology, vol. 1-2, pp. 145–153, 1997. 139, no. 2, pp. 610–618, 2005. [28] S. A. Flint-Garcia, A.-C. Thuillet, J. Yu, et al., “Maize associ- [11] J. Quackenbush, F. Liang, I. Holt, G. Pertea, and J. Upton, ation population: a high-resolution platform for quantitative “The TIGR gene indices: reconstruction and representation of trait locus dissection,” The Plant Journal,vol. 44, no.6,pp. expressed gene sequences,” Nucleic Acids Research, vol. 28, no. 1054–1064, 2005. 1, pp. 141–145, 2000. [29] S. Salvi, G. Sponza, M. Morgante, et al., “Conserved non- [12] S. F. Altschul, T. L. Madden, A. A. Schaffer, et al., “Gapped coding genomic sequences associated with a flowering-time BLAST and PSI-BLAST: a new generation of protein database quantitative trait locus in maize,” Proceedings of the National search programs,” Nucleic Acids Research, vol. 25, no. 17, pp. Academy of Sciences of the United States of America, vol. 104, 3389–3402, 1997. no. 27, pp. 11376–11381, 2007. [13] F. Wei, E. H. Coe Jr., W. Nelson, et al., “Physical and [30] W. F. Tracy, “Sweet corn,” in Specialty Corns,A.R.Hallauer, genetic structure of the maize genome reflects its complex Ed., pp. 155–198, CRC Press, Boca Raton, Fla, USA, 2nd evolutionary history,” PLoS Genetics, vol. 3, no. 7, p. e123, edition, 2000. [31] E. T. Johnson, M. A. Berhow, and P. F. Dowd, “Expression [14] J. Gardiner, S. Schroeder, M. L. Polacco, et al., “Anchoring of a maize Myb transcription factor driven by a putative 9,371 maize expressed sequence tagged unigenes to the bac- silk-specific promoter significantly enhances resistance to terial artificial chromosome contig map by two-dimensional Helicoverpa zea in transgenic maize,” Journal of Agricultural overgo hybridization,” Plant Physiology, vol. 134, no. 4, pp. and Food Chemistry, vol. 55, no. 8, pp. 2998–3003, 2007. 1317–1326, 2004. [32] J. D. F. Meyer, M. E. Snook, K. E. Houchins, B. G. Rector, N. [15] R. Lerdorf, P. MacIntyre, and K. Tatroe, Programming PHP, W. Widstrom, and M. D. McMullen, “Quantitative trait loci O’Reilly, Sebastopol, Calif, USA, 2006. for maysin synthesis in maize (Zea mays L.) lines selected for [16] L. Wall, T. Christiansen, and J. Orwant, Programming Perl, high silk maysin content,” Theoretical and Applied Genetics, O’Reilly, Cambridge, Mass, USA, 2000. vol. 115, no. 1, pp. 119–128, 2007. [17] R. Scholl, M. M. Sachs, and D. Ware, “Maintaining collections [33] S. J. Szalma, E. S. Buckler IV, M. E. Snook, and M. D. of mutants for plant functional genomics,” Methods in Molec- McMullen, “Association analysis of candidate genes for maysin ular Biology, vol. 236, pp. 311–326, 2003. and chlorogenic acid accumulation in maize silks,” Theoretical [18] S. Carpenter, “Science careers. Carving a career in transla- and Applied Genetics, vol. 110, no. 7, pp. 1324–1333, 2005. tional research,” Science, vol. 317, no. 5840, pp. 966–967, 2007. [34] K. Ilic,E.A.Kellogg,P.Jaiswal,etal., “The plantstructure [19] E. Bortiri, G. Chuck, E. Vollbrecht, T. Rocheford, R. Mar- ontology, a unified vocabulary of anatomy and morphology tienssen, and S. Hake, “ramosa2 encodes a LATERAL ORGAN of a flowering plant,” Plant Physiology, vol. 143, no. 2, pp. 587– BOUNDARY domain protein that determines the fate of stem 599, 2007. cells in branch meristems of maize,” The Plant Cell, vol. 18, no. [35] M. Ashburner, C. A. Ball, J. A. Blake, et al., “Gene ontology: 3, pp. 574–585, 2006. tool for the unification of biology. The gene ontology consor- [20] E. Bortiri, D. Jackson, and S. Hake, “Advances in maize tium,” Nature Genetics, vol. 25, no. 1, pp. 25–29, 2000. genomics: the emergence of positional cloning,” Current Opinion in Plant Biology, vol. 9, no. 2, pp. 164–171, 2006. [21] H. Wang, T. Nussbaum-Wagler, B. Li, et al., “The origin of the naked grains of maize,” Nature, vol. 436, no. 7051, pp. 714– 719, 2005. [22] E. H. CoeJr.,M.G.Neuffer, and D. A. Hosington, “The genetics of corn,” in Corn and Corn Improvement,G.F. Sprague and J. W. Dudley, Eds., pp. 81–258, American Society of Agronomy, Madison, Wis, USA, 1988. [23] E. Grotewold, B. J. Drummond, B. Bowen, and T. Peterson, “The myb-homologous P gene controls phlobaphene pigmen- International Journal of Peptides Advances in International Journal of BioMed Stem Cells Virolog y Research International International Genomics Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 Journal of Nucleic Acids International Journal of Zoology Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 Submit your manuscripts at http://www.hindawi.com The Scientific Journal of Signal Transduction World Journal Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 International Journal of Advances in Genetics Anatomy Biochemistry Research International Research International Microbiology Research International Bioinformatics Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 Enzyme Journal of International Journal of Molecular Biology Archaea Research Evolutionary Biology International Marine Biology Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014

Journal

International Journal of Plant GenomicsHindawi Publishing Corporation

Published: Aug 20, 2008

References