
Status of DNA Barcoding Coverage for the Tropical Western Atlantic Shorefishes and Reef Fishes

Background: Barcode coverage is difficult to assess for large regions due to incomplete species lists, inaccurate identifications, and cryptic diversity. However, as coverage approaches completion, it becomes possible to critically evaluate identifications and validate barcode lineages. We collate the results of the FISH-BOL barcode project and assess coverage for each family of bony shorefishes and reef fishes from the tropical western Atlantic Ocean.

Methodology: We identify to species the public and private barcode lineages from the region on BOLD, confirming identifications by vouchers, phylogeographic deduction, and the process of elimination. The lineages and BINs are assigned to species from a comprehensive species list for the region.

Results: We estimate 1029 of 1311 total bony shorefish species in the region are barcoded (78.5%). For reef-associated fishes, 902 of 1083 species are barcoded (83.3%). About 70 of the 181 species not yet barcoded are endemic species from Florida/Gulf of Mexico or Venezuela, leaving about 90% of the central Caribbean reef fish species barcoded to date. Most species are represented by one barcode lineage, but among the gobioids and blennioids there are many more lineages (BINs) than species, indicating substantial cryptic diversity.

Conclusions: As barcode coverage for a region approaches completion, a robust assessment of coverage can be made. The reef fish fauna of the tropical western Atlantic now has the highest coverage for a large marine area, from about 80 to 90% depending on definitions and geographic limits.

Keywords: Mitochondrial DNA, coral reefs, Caribbean, taxonomy, biogeography, species list, phylogenetics

*Corresponding author: Benjamin C. Victor, Guy Harvey Research Institute, Nova Southeastern University, Ft. Lauderdale, Florida, USA, Email: ben@coralreeffish.com

Martha Valdez-Moreno, Lourdes Vásquez-Yeomans, Departamento de Ecologia y Sistematica Acuatica, El Colegio de la Frontera Sur, Chetumal, Quintana Roo, Mexico

© 2015 Benjamin C. Victor et al., licensee De Gruyter Open. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.

but, in all likelihood (and based on nothing rigorous), the numbers are unlikely to change to a large degree since overestimated coverage by incorrect species IDs would be counterbalanced by additional lineages with an incorrect duplicate species ID, or without a species ID at all.

Regional coverage is a particularly difficult measure to assess for a number of reasons. Many large regions do not have complete species lists compiled for native fishes, and proposed or published species lists are frequently discordant. The most common problem is defining the habitat limits of a fish fauna; for example, marine fishes can include shorefishes, deeper water fishes, euryhaline fishes, and pelagics and those are variously included or excluded from most regional marine species lists. In addition, defining the geographic limits of a fauna is not always simple, since most regions have smaller satellite locations that can be variously included. A more profound problem for assessing coverage is the accuracy of species identifications: without some high degree of confidence in identifications, the numbers of species barcoded can be an artifact of various contributors' imaginations. An additional problem is unresolved or difficult taxonomy, either in traditional practice or unexpected cryptic diversity, which can account for up to 10% of the species in a list, even in the better known fauna such as the US/Canadian North American freshwater fauna [5], or much more, as in the exceptionally speciose tropical freshwater fish fauna of South America, Africa, and SE Asia.
With these impediments, many large (and small) regions cannot be assessed to any degree of certainty; nevertheless, since these regions are also undersampled in the barcode database, we can assume they do not have higher coverage than the examples we discuss here. At this time, the highest barcode coverage for a large fauna is, unsurprisingly, the northern temperate freshwater fish community. For example, the coverage of European FW fish is about 86% of the approximately 600 species total (Jörg Freyhof, pers. comm.). The highest recorded coverage for a relatively large-scale region is the 98% coverage of the 500 freshwater fishes of the Mediterranean Basin [6]. The US/Canadian North American freshwater fauna coverage is also high, with about 83% of the 900 species barcoded at the last review [5]. Marine fish coverage is typically lower; for example, in contrast to the completion of the Mediterranean FW fishes, only a small fraction of the 650 species comprising the marine fish fauna of the Mediterranean Sea have been barcoded [7-9]. For combined marine and FW fishes, most large regions have less than 50% coverage: one of the more complete examples being Argentina, where almost half of the 1000 species have been barcoded [10,11]. If the surveys are limited to shorefishes (deep-water fishes are seriously undersampled), marine fish coverage is typically less than 50%, and well lower in undersampled regions such as the eastern Atlantic, the Red Sea, and the eastern Pacific. After Europe, Canada may have the highest coverage for combined marine and FW fish species, with more than 200 of the 350 shorefishes of Pacific Canada barcoded [12], 95% of the 200 FW species barcoded [13], and more than half of the 500 species of Atlantic fishes of Canada barcoded (Dirk Steinke, pers. comm.). 
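The regional comparisons above all rest on the same measure: the number of species with at least one barcode divided by the number of species on the regional list. A minimal Python sketch of that arithmetic, applied to the TWA totals reported in the abstract and the Results of this paper, is given here. The function name `coverage` and the variable names are illustrative only and are not part of the study.

```python
def coverage(barcoded, total):
    """Percent of listed species with at least one barcode, to 0.1%."""
    return round(100 * barcoded / total, 1)

# Totals reported in the paper for the tropical western Atlantic (TWA).
shorefish_pct = coverage(1029, 1311)   # all bony shorefishes
reef_pct = coverage(902, 1083)         # reef-associated subset

# Unbarcoded reef species, and the share left after setting aside the
# Florida/Gulf of Mexico (30) and Venezuelan/S. Caribbean (40) endemics.
unbarcoded_reef = 1083 - 902
remaining = unbarcoded_reef - (30 + 40)
remaining_pct = round(100 * remaining / 1083, 1)
central_pct = round(100 - remaining_pct, 1)

print(shorefish_pct, reef_pct, unbarcoded_reef, remaining, remaining_pct, central_pct)
```

The last value reproduces the "about 90%" figure quoted for the central reef-fish fauna; how the peripheral endemics are treated is a matter of definition, as the text notes.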
There have been extensive efforts at barcoding tropical reef fishes of the broad Indo-Pacific, especially French Polynesia [14], Queensland and Bali [15], Southern Africa [16], and the South China Sea [17]. However, the extreme number of coral-reef species, peaking with as many as 1700 species co-occurring on reefs in the West Papua region of Indonesia [18], still results in barcode coverage of only about half of the total, at best.

The shorefish fauna of the tropical western Atlantic (TWA) has been the most completely barcoded large marine region to date. Fortunately, it is also well studied and inventoried, with a comprehensive guide now available online for all of the shorefishes [19]. This barcoding achievement is mainly the result of three independent large FISH-BOL projects focusing on the region: the ECOSUR group with about 5000 records on BOLD [20], the Smithsonian with about 4000 [21], and the OSF/Victor project with about 3000 records.

An important, and novel, factor promoting the reliability of our coverage estimates is the emergent property of positive feedback in identification of a limited set: as we approach completion of coverage for any particular taxonomic group, the identification of the remaining unassigned lineages becomes easier. This is facilitated by two important aids: the process of elimination combined with phylogeographic deduction, i.e. the improved resolution of phylogenetic relationships when most, or almost all, of the potential relatives have been identified and the range of each species is well documented.

The TWA is defined here as the northern tropical and warm subtropical W. Atlantic, excluding Brazil and including S. Florida and the Gulf of Mexico, or what could be called the Greater Caribbean region [22]. The species list for shorefishes of the region varies depending on how many peripheral species are included and the definition of a shorefish, especially considering depth on the continental shelf.
In general, however, the number of species ranges up to 1500 [22], and we consider here about 1300 bony shorefish species for the region (excluding elasmobranchs, which number well under 100 spp.). The number of "reef species" is subject to a more fluid definition: for the Greater Caribbean, various large-scale surveys list 605 reef species [22], 774 reef species [23], and 885 reef species [24].

The goal of this survey is to introduce a more rigorous evaluation process for assessing coverage and to apply it to the TWA shorefishes. With the complete inventory of species and their ranges well established [19,22], and the number of sequences of shorefishes from the region approaching 15,000 (with many well-vouchered), it becomes possible to critically assess species identifications independently, by phylogeographic deduction in combination with the process of elimination, backed up by expert evaluation of voucher metadata, particularly the location and photographs. In all, we estimate the barcode coverage for general shorefishes of the TWA to range up to 80% and the coverage for smaller subsets, such as the strictly coral-reef fishes of the Caribbean Sea, to approach 90%.

Key to the assessment is the categorization of mtDNA lineages, which need to be enumerated and defined by an algorithm as an "operational taxonomic unit"; in BOLD these units are BINs, or Barcode Index Numbers [25]. BINs are not set groups of lineages separated by a certain percentage distance from each other, but a cluster calculated by an algorithm taking into account similarity and connectivity and assessing cluster boundaries. Of course, a cluster of sequences does not a species make, and the taxonomic decision of the relationship of a BIN, or any DNA lineage, to a species is a much more complex analysis, i.e. what is a species? [4].
Nevertheless, the BIN provides the framework for categorizing mtDNA lineages, and, in the large majority of TWA shorefishes at least, proves to correspond one-to-one with known species or suspected sub-species.

2 Methods

A complete species list of the bony shorefishes of the tropical western Atlantic/Greater Caribbean (including the Gulf of Mexico and S. Florida and excluding the Brazilian fauna) was assembled by reviewing taxonomic literature, guidebooks [26-29] and assessing published species lists [22-24]. Shorefishes were defined as those associated with the substrate in waters up to 200 m depth, excluding mid-water species, but including pelagic fish families. This definition has wide usage among tropical fish taxonomy books [26-29]. Taxonomic validity mostly follows Eschmeyer (2015) [30], with a few practical exceptions, and thus undescribed cryptic lineages were not considered as species.

The DNA lineages present on the barcode database (specifically collected in the TWA) were assigned to the shorefish species list. Almost all TWA lineages (including unique sequences) in the database were assessed, public and private, as well as lineages with no identification data at all. Some private records were made available to us by the owners allowing us to share projects. Such requests were facilitated by the BOLD ID engine showing related private DNA lineages (stripped of metadata and the sequences themselves hidden) on a neighbor-joining tree in the ID-engine procedure initiated by one of our sequences. Only a rare lineage with only private and unshared sequences and not a single nearby relative (from any ocean) would be invisible to us.
The BIN application is also very helpful: the BIN summaries on BOLD list private sequences within the BIN (also without any private associated data), as well as the nearest-neighbor BIN code (even if made up of only private sequences), meaning that virtually all barcode lineages, including GenBank downloads and private projects, could be assessed to some degree by our combined research groups. We did not accept species identifications from ID metadata on BOLD, which are determined by various submitters to the barcode database or GenBank. The general lack of quality control has led to a proliferation of misidentifications on databases, exacerbated by the desire, or even perceived requirement, by contributors to identify specimens to species, often without the expertise to make species-level determinations. This flaw is one of the greatest limitations of specimen-record databases, both for DNA sequences or general occurrence records (such as FishBase or GBIF). BOLD fortunately connects sequences to voucher specimen records, often with photographs. In many cases, the photographs alone contain diagnostic information for species-level identification. Voucher specimens were retained and examined for almost all specimens sequenced in the projects by ECOSUR and the OSF/Victor collection (the majority of BOLD TWA records). DNA lineages without a diagnostic voucher or photograph were assigned to species with varying degrees of certainty based on a combination of three cumulative methods: 1) phylogenetic deduction from the nearest-neighbor species (either from the region, or sibling species from the eastern Pacific), 2) phylogeographic deduction, adding geographic range matching (i.e. an unassigned lineage from a set of locations known to represent the range of a particular candidate species), and 3) the process of elimination; for example, when only one candidate species in a genus is left unassigned and there is one remaining unidentified DNA lineage. 
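The third method, the process of elimination, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure as actually run by the authors: the function name, the data structures, and the placeholder genus, species, and lineage labels are all hypothetical, and the sketch ignores the phylogenetic and geographic evidence that the first two methods contribute.

```python
from collections import defaultdict

def assign_by_elimination(species_by_genus, assigned, unassigned_lineages):
    """Assign a lineage to a species only when the choice is forced.

    species_by_genus: {genus: set of species on the regional list}
    assigned: {lineage_id: species} for lineages already identified
    unassigned_lineages: {lineage_id: genus} for lineages placed to genus only
    Returns {lineage_id: species} for the newly forced assignments.
    """
    taken = set(assigned.values())
    lineages_in = defaultdict(list)
    for lineage, genus in unassigned_lineages.items():
        lineages_in[genus].append(lineage)

    forced = {}
    for genus, lineages in lineages_in.items():
        candidates = species_by_genus.get(genus, set()) - taken
        # One unidentified lineage and one unassigned candidate species.
        if len(lineages) == 1 and len(candidates) == 1:
            forced[lineages[0]] = next(iter(candidates))
    return forced

# Hypothetical example: placeholder names, not real taxa or BIN codes.
species_by_genus = {"GenusA": {"a1", "a2", "a3"}, "GenusB": {"b1", "b2", "b3"}}
assigned = {"lin1": "a1", "lin2": "a2", "lin3": "b1"}
unassigned = {"lin4": "GenusA", "lin5": "GenusB"}
forced = assign_by_elimination(species_by_genus, assigned, unassigned)
print(forced)
```

In the example, the single unidentified GenusA lineage is assigned because only one candidate species remains, while the GenusB lineage stays unassigned because two candidates are still open. This also illustrates the positive feedback described in the text: each confirmed identification shrinks the candidate set for the lineages that remain.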
With this three-step procedure, almost all DNA lineages could be assigned a species identification (although for gobioids and blennioids, it often was assignment to a local subspecies/population of a nominal species), with only a small fraction considered by us to be questionably assigned (but certainty was not quantifiable). As complete coverage for any particular taxon is approached, the positive feedback property for identification moves many tentative species assignments to confident species assignments.

3 Results

3.1 Overall Coverage

For the broadest category of bony shorefishes of the Greater Caribbean, we include fishes from the Gulf of Mexico, Florida, and the Caribbean Sea, excluding fishes only found in Guyana to Brazil. All bottom-associated species down to 200 m are included, along with pelagic families that can be found nearshore, such as the carangids, scombrids, ariommatids, nomeids, echeneids, molids, belonids, and hemiramphids. Euryhaline fishes such as the eleotrids, atherinopsids, and atherinids are also included. That list totals 1311 bony shorefish species, and 1029 of them are barcoded to date (78.5%) (Table 1).

Our subset of bony reef-fish species of the region (found above 100 m and associated with reefs) numbers 1083 species, higher than other published reef-fish compilations, since we generally follow taxonomic guidebook format and thus include pelagic fish families with members that can be observed over reefs (including all of the carangids, scombrids, echeneids, belonids, and the elopiformes and albulids); as well as the soft-substrate species that are found in sandbeds and grassbeds around reefs (such as bothids, paralichthyids, cynoglossids, triglids, ogcocephalids, bythitids, chlopsids, and ophichthids); as well as a small subset of the clupeids, atherinids, congrids, and sciaenids that are typically seen near reefs (Table 1). Of our bony reef-fish species total of 1083 species, 902 are barcoded to date (83.3%).

The unbarcoded species are predominantly species endemic to the Gulf of Mexico (GOM) or Venezuela and/or the South American continental shore. Of the 181 unbarcoded species in the reef-fish species list, 30 are Florida/GOM endemics (2.8% of total) and 40 are Venezuelan/S. Caribbean endemics (3.7% of total), leaving only 111 remaining species unbarcoded (10.2%), indicating overall coverage of about 90% for the central reef-fish fauna in the region. The remaining unbarcoded species are mostly rare species (either deeper water, local endemics, or unusual or secretive habitats), those with uncertain taxonomy (or unique holotypes), and a very few regular reef-fish species that have just been overlooked in collections, coincidentally by all research groups concentrating on the region.

3.2 Coverage by Family

Three of the 30 large bony shorefish families (more than 15 species each in the TWA) have been completely barcoded for the region (Table 1). They are prominent commercially important families, comprising the snappers (Lutjanidae), grunts (Haemulidae), and tunas (Scombridae). Several more of the large families are almost complete, with only one or two species missing: i.e. the cardinalfishes (Apogonidae), damselfishes (Pomacentridae), wrasses (Labridae, excluding parrotfishes), porgies (Sparidae), and jacks (Carangidae). The second largest family of shorefishes in the region, the basses (Serranidae, including the groupers of Epinephelidae), is 86% barcoded, mainly missing a few deepwater and/or rare species. The largest fish family is the gobies (Gobiidae), with only 76% of the 134 regional species barcoded, mainly missing a set of deeper-water species and some rare and/or microendemic species that have not been sampled. The reef-fish subset within families is somewhat better covered, with almost all families having higher coverage of their reef-associated members (Table 1).

3.3 Cryptic Diversity

Numerous species of tropical marine fishes, especially reef fishes, show evidence of undescribed cryptic diversity after genetic analyses [4]. The pattern of which species show extensive cryptic diversity is not clear in the vast scale of the Indo-Pacific, where cryptic diversity is frequent among many quite different fish families. However, in the smaller Greater Caribbean region, the pattern is very clear: only families with lower dispersal ability, i.e. benthic brooded eggs and relatively short larval lives, less than about 30 days, break up into cryptic species complexes [4]. A number of cryptic Caribbean species complexes have been described in recent years (e.g. [31-37]), and many more remain to be explored. This pattern is clearly apparent in the number of BINs associated with a single nominal species in our review: the number of BINs approximates the number of species barcoded in most families of shorefishes (rarely there are fewer BINs than species, see below), with the main exception being a markedly greater number of BINs in the families with benthic eggs and short larval lives, i.e. the Gobiidae and the blennioid families Labrisomidae, Chaenopsidae, and Tripterygiidae (Table 1). (Interestingly, the pattern has not been found so far for the true blennies (Blenniidae), which are known to have relatively large wide-ranging larvae.)

These gobioid and blennioid families have many species with multiple BINs, typically allopatric, but sometimes sympatric. Among the gobies, there are 102 species barcoded, but 157 barcode BINs within those 102 species. Similarly, among the Labrisomidae there are 88 BINs for 50 species, among the Chaenopsidae there are 85 BINs for 39 species, and among the Tripterygiidae there are 15 BINs for 8 species. The high number of cryptic lineages in the Tripterygiidae persists even though several new species have been recently described [37]. In that study, four cryptic species of Enneanectes (three new species) were found to be coexisting on reefs in the Lesser Antilles of the Caribbean Sea (sympatric cryptic species). Nevertheless, the typical finding is allopatric species complexes and, since not all subregions of the Caribbean have been well sampled, especially Colombian and Venezuelan reefs, the number of cryptic lineages among this set of reef fish families is likely to continue to increase.

Table 1. Barcode coverage of tropical western Atlantic bony fishes, by family; in descending order of number of species known for the region. Numbers in bold highlight families that include non-reef species (i.e. #reef species less than total #species).

| Family | #sp | #barcoded | %barcoded | #BINs | #reef sp. | #barcoded (reef) | %barcoded (reef) | #BINs (reef) |
|---|---|---|---|---|---|---|---|---|
| Gobiidae | 134 | 102 | 76% | 157 | 130 | 102 | 78% | 157 |
| Serranidae | 96 | 82 | 85% | 84 | 92 | 81 | 88% | 84 |
| Labrisomidae | 58 | 50 | 86% | 88 | 57 | 50 | 88% | 88 |
| Sciaenidae | 54 | 40 | 74% | 44 | 8 | 8 | 100% | 9 |
| Chaenopsidae | 50 | 39 | 78% | 85 | 50 | 39 | 78% | 85 |
| Ophichthidae | 45 | 29 | 64% | 38 | 37 | 27 | 73% | 26 |
| Paralichthyidae | 35 | 25 | 71% | 31 | 30 | 22 | 73% | 28 |
| Carangidae | 32 | 30 | 94% | 30 | 32 | 30 | 94% | 30 |
| Congridae | 32 | 22 | 69% | 27 | 10 | 8 | 80% | 11 |
| Syngnathidae | 27 | 22 | 81% | 22 | 25 | 22 | 88% | 22 |
| Gobiesocidae | 26 | 16 | 62% | 24 | 21 | 16 | 76% | 24 |
| Haemulidae | 25 | 25 | 100% | 26 | 25 | 25 | 100% | 26 |
| Bythitidae | 25 | 16 | 64% | 20 | 25 | 16 | 64% | 20 |
| Scorpaenidae | 25 | 22 | 88% | 25 | 24 | 22 | 92% | 25 |
| Apogonidae | 24 | 23 | 96% | 28 | 24 | 23 | 96% | 28 |
| Batrachoididae | 24 | 12 | 50% | 15 | 23 | 12 | 52% | 15 |
| Ophidiidae | 22 | 14 | 64% | 17 | 22 | 14 | 64% | 17 |
| Labridae (wrasses) | 21 | 20 | 95% | 23 | 21 | 20 | 95% | 23 |
| Muraenidae | 21 | 17 | 81% | 17 | 20 | 16 | 80% | 16 |
| Blenniidae | 20 | 14 | 70% | 10 | 20 | 14 | 70% | 10 |
| Lutjanidae | 18 | 18 | 100% | 17 | 18 | 18 | 100% | 17 |
| Sparidae | 18 | 16 | 89% | 15 | 18 | 16 | 89% | 15 |
| Engraulidae | 18 | 12 | 67% | 12 | n/r | n/r | n/r | n/r |
| Triglidae | 18 | 15 | 83% | 16 | n/r | n/r | n/r | n/r |
| Cynoglossidae | 17 | 11 | 65% | 11 | n/r | n/r | n/r | n/r |
| Dactyloscopidae | 17 | 7 | 41% | 7 | n/r | n/r | n/r | n/r |
| Scombridae | 16 | 16 | 100% | 12 | n/r | n/r | n/r | n/r |
| Pomacentridae | 16 | 15 | 94% | 17 | n/r | n/r | n/r | n/r |
| Scarinae | 16 | 13 | 81% | 13 | n/r | n/r | n/r | n/r |
| Tetraodontidae | 15 | 11 | 73% | 12 | n/r | n/r | n/r | n/r |
| Clupeidae | 14 | 11 | 79% | 12 | n/r | n/r | n/r | n/r |
| Ogcocephalidae | 14 | 6 | 43% | 5 | n/r | n/r | n/r | n/r |
| Atherinopsidae | 14 | 7 | 50% | 8 | n/r | n/r | n/r | n/r |
| Bothidae | 14 | 10 | 71% | 11 | n/r | n/r | n/r | n/r |
| Opistognathidae | 14 | 7 | 50% | 9 | n/r | n/r | n/r | n/r |
| Nettastomatidae | 13 | 7 | 54% | 8 | n/r | n/r | n/r | n/r |
| Synodontidae | 12 | 12 | 100% | 15 | n/r | n/r | n/r | n/r |
| Holocentridae | 12 | 9 | 75% | 11 | n/r | n/r | n/r | n/r |
| Grammatidae | 12 | 8 | 67% | 8 | n/r | n/r | n/r | n/r |
| Mugilidae | 12 | 11 | 92% | 11 | 3 | 3 | 100% | 3 |
| Gerreidae | 12 | 12 | 100% | 16 | 12 | 12 | 100% | 16 |
| Monacanthidae | 10 | 10 | 100% | 9 | 10 | 10 | 100% | 9 |
| Chlopsidae | 8 | 7 | 88% | 8 | 7 | 7 | 100% | 7 |
| Echeneidae | 8 | 8 | 100% | 8 | 8 | 8 | 100% | 8 |
| Nomeidae | 8 | 8 | 100% | 8 | n/r | n/r | n/r | n/r |
| Achiridae | 8 | 4 | 50% | 7 | n/r | n/r | n/r | n/r |
| Tripterygiidae | 8 | 8 | 100% | 15 | n/r | n/r | n/r | n/r |
| Antennariidae | 7 | 6 | 86% | 6 | n/r | n/r | n/r | n/r |
| Belonidae | 7 | 7 | 100% | 8 | n/r | n/r | n/r | n/r |
| Hemiramphidae | 7 | 6 | 86% | 6 | n/r | n/r | n/r | n/r |
| Chaetodontidae | 7 | 7 | 100% | 7 | n/r | n/r | n/r | n/r |
| Pomacanthidae | 7 | 7 | 100% | 7 | n/r | n/r | n/r | n/r |
| Eleotridae | 7 | 7 | 100% | 7 | n/r | n/r | n/r | n/r |
| Diodontidae | 7 | 7 | 100% | 6 | n/r | n/r | n/r | n/r |
| Centropomidae | 6 | 3 | 50% | 3 | n/r | n/r | n/r | n/r |
| Microdesmidae | 6 | 4 | 67% | 6 | n/r | n/r | n/r | n/r |
| Balistidae | 6 | 6 | 100% | 6 | n/r | n/r | n/r | n/r |
| Ostraciidae | 5 | 5 | 100% | 6 | n/r | n/r | n/r | n/r |
| Albulidae | 4 | 3 | 75% | 3 | n/r | n/r | n/r | n/r |
| Pristigasteridae | 4 | 1 | 25% | 1 | n/r | n/r | n/r | n/r |
| Priacanthidae | 4 | 4 | 100% | 4 | n/r | n/r | n/r | n/r |
| Ariommatidae | 4 | 3 | 75% | 3 | n/r | n/r | n/r | n/r |
| Mullidae | 4 | 4 | 100% | 4 | n/r | n/r | n/r | n/r |
| Kyphosidae | 4 | 4 | 100% | 4 | n/r | n/r | n/r | n/r |
| Uranoscopidae | 4 | 3 | 75% | 3 | n/r | n/r | n/r | n/r |
| Callionymidae | 4 | 3 | 75% | 6 | n/r | n/r | n/r | n/r |
| Carapidae | 3 | 1 | 33% | 1 | n/r | n/r | n/r | n/r |
| Atherinidae | 3 | 3 | 100% | 4 | n/r | n/r | n/r | n/r |
| Sphyraenidae | 3 | 3 | 100% | 3 | n/r | n/r | n/r | n/r |
| Polynemidae | 3 | 3 | 100% | 3 | n/r | n/r | n/r | n/r |
| Ptereleotridae | 3 | 2 | 67% | 2 | n/r | n/r | n/r | n/r |
| Acanthuridae | 3 | 3 | 100% | 3 | n/r | n/r | n/r | n/r |
| Molidae | 3 | 3 | 100% | 3 | n/r | n/r | n/r | n/r |
| Elopidae | 2 | 2 | 100% | 2 | n/r | n/r | n/r | n/r |
| Moringuidae | 2 | 2 | 100% | 2 | n/r | n/r | n/r | n/r |
| Fistulariidae | 2 | 2 | 100% | 2 | n/r | n/r | n/r | n/r |
| Coryphaenidae | 2 | 2 | 100% | 2 | n/r | n/r | n/r | n/r |
| Stromateidae | 2 | 2 | 100% | 2 | n/r | n/r | n/r | n/r |
| Emmelichthyidae | 2 | 1 | 50% | 1 | n/r | n/r | n/r | n/r |
| Pempheridae | 2 | 1 | 50% | 1 | n/r | n/r | n/r | n/r |
| Megalopidae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Dussumieriidae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Anguillidae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Heterenchelyidae | 1 | 0 | 0% | 0 | n/r | n/r | n/r | n/r |
| Colocongridae | 1 | 0 | 0% | 0 | n/r | n/r | n/r | n/r |
| Aulostomidae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Anomalopidae | 1 | 0 | 0% | 0 | n/r | n/r | n/r | n/r |
| Dactylopteridae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Malacanthidae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Cirrhitidae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Rachycentridae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Ephippidae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Muraenesocidae | 1 | 0 | 0% | 0 | n/r | n/r | n/r | n/r |
| Lobotidae | 1 | 1 | 100% | 1 | n/r | n/r | n/r | n/r |
| Totals | 1311 | 1029 | | 1247 | 1083 | 902 | | 1099 |

n/r: per-family reef-species value not recoverable in order from the source text.

4 Discussion

This review of the barcode coverage for the tropical W. Atlantic bony fishes illustrates the changing priorities of a barcode program as it matures and approaches complete coverage.
If the earliest phase of a barcode program is to promulgate the message, identify and develop collaborators, and field test the methodology, then the middle phase is intensive recruitment of more collaborators, accumulation of more specimens from unsequenced species and undersampled locations, and development of an optimal quality control approach. As completion is approached, quality control can be improved by more rigorous identification by using the methods of assessment exemplified by this review, i.e. confirmed voucher photographs and specimens, phylogeographic deduction, and process of elimination. This approach permits each BIN, i.e. algorithm-derived operational taxonomic unit (OTU) as defined by Ratnasingham and Hebert (2013) [25], to be assessed in comparison to other lineages in BOLD and a conclusion reached on the species identification for the BIN lineage or sub-lineage, thus creating a "validated BIN". A validated BIN allows an assessor to question species IDs that do not match the validated identification, either to correct the specimen identification or to highlight exceptions to the "one BIN-one species" paradigm.

Exceptions to the "one BIN-one species" paradigm are particularly interesting, comprising two basic categories: A) different species that share a BIN, either distinct sub-BIN lineages that are under the threshold for full BIN assignment [25], or barcode "phenovariants", species that share COI haplotypes or co-occur within a lineage; or B) barcode "genovariants", or multiple distinct COI lineages that fit the description of a single nominal species. Barcode genovariants, which represent cryptic lineages, subspecies, or species, are lineages or OTUs that need to be assessed taxonomically. They are not automatically cryptic species, since the decision on what is a species is the prerogative of a trained taxonomist, not simply a genetic variant [4].
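The two categories of exception can be screened for mechanically once each record carries a BIN and a validated species name. The following Python sketch is one hedged way to do so; the function name, the record format, and the placeholder BIN and species labels are assumptions for illustration and do not reflect the BOLD data model or API. It only flags candidates: as the text stresses, shared BINs can be artifacts of misidentification, and multi-BIN species still require taxonomic assessment.

```python
from collections import defaultdict

def screen_bin_exceptions(records):
    """Group (bin_id, species) records and flag departures from one BIN-one species.

    Returns (one_to_one, shared_bins, multi_bin_species):
      one_to_one: species with a single BIN that no other species shares
      shared_bins: {bin_id: species list} for BINs holding several species
                   (category A candidates, e.g. phenovariants)
      multi_bin_species: {species: BIN list} for species spanning several BINs
                   (category B candidates, i.e. genovariants)
    """
    bins_of = defaultdict(set)
    species_in = defaultdict(set)
    for bin_id, species in records:
        bins_of[species].add(bin_id)
        species_in[bin_id].add(species)

    shared_bins = {b: sorted(s) for b, s in species_in.items() if len(s) > 1}
    multi_bin_species = {s: sorted(b) for s, b in bins_of.items() if len(b) > 1}
    one_to_one = sorted(
        s for s, b in bins_of.items()
        if len(b) == 1 and len(species_in[next(iter(b))]) == 1
    )
    return one_to_one, shared_bins, multi_bin_species

# Hypothetical records: placeholder BIN codes and species names.
records = [
    ("BIN-A", "sp1"),
    ("BIN-B", "sp2"), ("BIN-C", "sp2"),
    ("BIN-D", "sp3"), ("BIN-D", "sp4"),
]
one_to_one, shared_bins, multi_bin_species = screen_bin_exceptions(records)
print(one_to_one, shared_bins, multi_bin_species)
```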
At present, there is a general consensus among fish taxonomists not to accept genovariants as species unless there is a morphological, meristic (countable features), or marking difference considered sufficiently consistent, reliable, and significant to merit species-level designation. This is frequently a difficult decision, especially when the intervening geographic range is unsampled, few (or unique) specimens are sampled or even retained, only preserved material is available, or the taxonomy of the group is unstable, unreviewed, or perhaps entirely unexamined.

Barcode phenovariants are apparently less frequent than genovariants in marine fishes, but are still not unusual, occurring perhaps among as many as 5-10% of all shorefish species (a difficult rough estimate, since "false phenovariants" because of misidentifications are ubiquitous in the databases). When two or more nominal species truly share COI sequences, it indicates that the species deserve a closer taxonomic examination: they could represent the same species that had erroneously been split, but frequently they are closely related species that either hybridize occasionally or have not been separated for sufficient time for mutually exclusive sets of haplotypes in COI to develop (incomplete lineage sorting). Additional sequencing of faster evolving markers, such as the mitochondrial control region, could distinguish species that have not yet diverged sufficiently in the COI marker. An example of the latter is the important tuna genus Thunnus, whose species mostly share barcode BINs, but can be distinguished to some degree by additional mitochondrial and nuclear markers and/or character-based sequence analyses [38]. Almost all other cases of phenovariants among reef fishes have not been closely examined (e.g. Hyporthodus groupers or the Helicolenus and Sebastes rockfishes), but it is likely that many cases will prove to be similar to the tunas.
Acknowledgements: We thank Paul Hebert, Robert Hanner, Dirk Steinke, and Mahmood Shivji for their introduction to the Barcode Project, and general guidance and support throughout the past decade as we exhaustively surveyed the Caribbean fish barcodes. For the Mexican Barcode program, we thank S. Morales, D. Acevedo and Y. Cota for their dedicated work as parataxonomists of fish larvae and C. Quintal, D. Cazarez, R. Herrera and J. A. Cohuo for their assistance in field collections. Most adult specimen photographs were taken by Humberto Bahena Basave and photographs of larvae were taken by S. Morales and J. A. Cohuo. C. Quintal and J. J. Schmitter Soto assisted with the identification of some adult fish, while Maria Eugenia Vega, Uriel Ordóñez and Margarita Ornelas from CINVESTAV Mérida kindly donated specimens from the Yucatan coast and helped with their identification. J. Lamkin and E. Malca from NOAA/UM supported the larval collections (project 517/04). We thank A. Martínez for work in the Barcode Laboratory in Chetumal and the staff of Mexican marine areas for their support in the field, especially M.C. Garcia of Xcalak and M. F. Remolina of Contoy. CONAPESCA provided scientific collection permits. DNA sequencing was carried out at the Biodiversity Institute of Ontario, University of Guelph, supported by grants to PDNH from Genome Canada through the Ontario Genomics Institute, and from NSERC. Part of this work was supported through Grant HE009 and HB043 from Comisión Nacional Para el Uso y Conservación de la Biodiversidad (CONABIO) and the Mexican Barcode of Life (MEXBOL) network.

Conflict of interest: Dr Benjamin Victor has nothing to disclose.

DNA Barcodes, de Gruyter

Publisher: de Gruyter
Copyright: Copyright © 2015 by the
ISSN: 2299-1077
eISSN: 2299-1077
DOI: 10.1515/dna-2015-0011
Publisher site
See Article on Publisher Site

Abstract

Background: Barcode coverage is difficult to assess for large regions due to incomplete species lists, inaccurate identifications, and cryptic diversity. However, as coverage approaches completion, it becomes possible to critically evaluate identifications and validate barcode lineages. We collate the results of the FISH-BOL barcode project and assess coverage for each family of bony shorefishes and reef fishes from the tropical western Atlantic Ocean. Methodology: We identify to species the public and private barcode lineages from the region on BOLD, confirming identifications by vouchers, phylogeographic deduction, and the process of elimination. The lineages and BINs are assigned to species from a comprehensive species list for the region. Results: We estimate 1029 of 1311 total bony shorefish species in the region are barcoded (78.5%). For reef-associated fishes, 902 of 1083 species are barcoded (83.3%). About 70 of the 181 species not yet barcoded are endemic species from Florida/ Gulf of Mexico or Venezuela, leaving about 90% of the central Caribbean reef fish species barcoded to date. Most species are represented by one barcode lineage, but among the gobioids and blennioids there are many more lineages (BINs) than species, indicating substantial cryptic diversity. Conclusions: As barcode coverage for a region approaches completion, a robust assessment of coverage can be made. The reef fish fauna of the tropical western Atlantic now has the highest coverage for a large marine area, from about 80 to 90% depending on definitions and geographic limits. Keywords: Mitochondrial DNA, coral reefs, Caribbean, taxonomy, biogeography, species list, phylogenetics *Corresponding author: Benjamin C. Victor, Guy Harvey Research Institute Nova Southeastern University, Ft. 
Lauderdale, Florida, USA, Email: ben@coralreeffish.com Martha Valdez-Moreno, Lourdes Vásquez-Yeomans, Departamento de Ecologia y Sistematica Acuatica, El Colegio de la Frontera Sur, Chetumal, Quintana Roo, Mexico © 2015 Benjamin C. Victor et al. licensee De Gruyter Open. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License. but, in all likelihood (and based on nothing rigorous), the numbers are unlikely to change to a large degree since overestimated coverage by incorrect species IDs would be counterbalanced by additional lineages with an incorrect duplicate species ID, or without a species ID at all. Regional coverage is a particularly difficult measure to assess for a number of reasons. Many large regions do not have complete species lists compiled for native fishes, and proposed or published species lists are frequently discordant. The most common problem is defining the habitat limits of a fish fauna; for example, marine fishes can include shorefishes, deeper water fishes, euryhaline fishes, and pelagics and those are variously included or excluded from most regional marine species lists. In addition, defining the geographic limits of a fauna is not always simple, since most regions have smaller satellite locations that can be variously included. A more profound problem for assessing coverage is the accuracy of species identifications: without some high degree of confidence in identifications, the numbers of species barcoded can be an artifact of various contributors' imaginations. An additional problem is unresolved or difficult taxonomy, either in traditional practice or unexpected cryptic diversity, which can account for up to 10% of the species in a list, even in the better known fauna such as the US/Canadian North American freshwater fauna [5], or much more, as in the exceptionally speciose tropical freshwater fish fauna of South America, Africa, and SE Asia. 
With these impediments, many large (and small) regions cannot be assessed to any degree of certainty; nevertheless, since these regions are also undersampled in the barcode database, we can assume they do not have higher coverage than the examples we discuss here. At this time, the highest barcode coverage for a large fauna is, unsurprisingly, the northern temperate freshwater fish community. For example, the coverage of European FW fish is about 86% of the approximately 600 species total (Jörg Freyhof, pers. comm.). The highest recorded coverage for a relatively large-scale region is the 98% coverage of the 500 freshwater fishes of the Mediterranean Basin [6]. The US/Canadian North American freshwater fauna coverage is also high, with about 83% of the 900 species barcoded at the last review [5]. Marine fish coverage is typically lower; for example, in contrast to the completion of the Mediterranean FW fishes, only a small fraction of the 650 species comprising the marine fish fauna of the Mediterranean Sea have been barcoded [7-9]. For combined marine and FW fishes, most large regions have less than 50% coverage: one of the more complete examples being Argentina, where almost half of the 1000 species have been barcoded [10,11]. If the surveys are limited to shorefishes (deep-water fishes are seriously undersampled), marine fish coverage is typically less than 50%, and well lower in undersampled regions such as the eastern Atlantic, the Red Sea, and the eastern Pacific. After Europe, Canada may have the highest coverage for combined marine and FW fish species, with more than 200 of the 350 shorefishes of Pacific Canada barcoded [12], 95% of the 200 FW species barcoded [13], and more than half of the 500 species of Atlantic fishes of Canada barcoded (Dirk Steinke, pers. comm.). 
There have been extensive efforts at barcoding tropical reef fishes of the broad Indo-Pacific, especially French Polynesia [14], Queensland and Bali [15], Southern Africa [16], and the South China Sea [17]. However, the extreme number of coral-reef species, peaking with as many as 1700 species co-occurring on reefs in the West Papua region of Indonesia [18], still results in barcode coverage of only about half of the total, at best.

The shorefish fauna of the tropical western Atlantic (TWA) has been the most completely barcoded large marine region to date. Fortunately, it is also well studied and inventoried, with a comprehensive guide now available online for all of the shorefishes [19]. This barcoding achievement is mainly the result of three independent large FISH-BOL projects focusing on the region: the ECOSUR group with about 5000 records on BOLD [20], the Smithsonian with about 4000 [21], and the OSF/Victor project with about 3000 records.

An important, and novel, factor promoting the reliability of our coverage estimates is the emergent property of positive feedback in identification of a limited set: as we approach completion of coverage for any particular taxonomic group, the identification of the remaining unassigned lineages becomes easier. This is facilitated by two important aids: the process of elimination combined with phylogeographic deduction, i.e. the improved resolution of phylogenetic relationships when most, or almost all, of the potential relatives have been identified and the range of each species is well documented.

The TWA is defined here as the northern tropical and warm subtropical W. Atlantic, excluding Brazil and including S. Florida and the Gulf of Mexico, or what could be called the Greater Caribbean region [22]. The species list for shorefishes of the region varies depending on how many peripheral species are included and the definition of a shorefish, especially considering depth on the continental shelf. 
In general, however, the number of species ranges up to 1500 [22], and we consider here about 1300 bony shorefish species for the region (excluding elasmobranchs, which number well under 100 spp.). The number of "reef species" is subject to a more fluid definition: for the Greater Caribbean, various large-scale surveys list 605 reef species [22], 774 reef species [23], and 885 reef species [24].

The goal of this survey is to introduce a more rigorous evaluation process for assessing coverage and to apply it to the TWA shorefishes. With the complete inventory of species and their ranges well established [19,22], and the number of sequences of shorefishes from the region approaching 15,000 (with many well-vouchered), it becomes possible to critically assess species identifications independently, by phylogeographic deduction in combination with the process of elimination, backed up by expert evaluation of voucher metadata, particularly the location and photographs. In all, we estimate the barcode coverage for general shorefishes of the TWA to range up to 80% and the coverage for smaller subsets, such as the strictly coral-reef fishes of the Caribbean Sea, to approach 90%.

Key to the assessment is the categorization of mtDNA lineages, which need to be enumerated and defined by an algorithm as "operational taxonomic units"; in BOLD these units are BINs, or Barcode Index Numbers [25]. BINs are not set groups of lineages separated by a certain percentage distance from each other, but clusters calculated by an algorithm that takes into account similarity and connectivity and assesses cluster boundaries. Of course, a cluster of sequences does not a species make, and the taxonomic decision of the relationship of a BIN, or any DNA lineage, to a species is a much more complex analysis, i.e. what is a species? [4]. 
Nevertheless, the BIN provides the framework for categorizing mtDNA lineages, and, in the large majority of TWA shorefishes at least, proves to correspond one-to-one with known species or suspected sub-species.

2 Methods

A complete species list of the bony shorefishes of the tropical western Atlantic/Greater Caribbean (including the Gulf of Mexico and S. Florida and excluding the Brazilian fauna) was assembled by reviewing taxonomic literature, guidebooks [26-29] and assessing published species lists [22-24]. Shorefishes were defined as those associated with the substrate in waters up to 200 m depth, excluding mid-water species, but including pelagic fish families. This definition has wide usage among tropical fish taxonomy books [26-29]. Taxonomic validity mostly follows Eschmeyer (2015) [30], with a few practical exceptions, and thus undescribed cryptic lineages were not considered as species.

The DNA lineages present on the barcode database (specifically collected in the TWA) were assigned to the shorefish species list. Almost all TWA lineages (including unique sequences) in the database were assessed, public and private, as well as lineages with no identification data at all. Some private records were made available to us by the owners allowing us to share projects. Such requests were facilitated by the BOLD ID engine showing related private DNA lineages (stripped of metadata and the sequences themselves hidden) on a neighbor-joining tree in the ID-engine procedure initiated by one of our sequences. Only a rare lineage with only private and unshared sequences and not a single nearby relative (from any ocean) would be invisible to us. 
The BIN application is also very helpful: the BIN summaries on BOLD list private sequences within the BIN (also without any private associated data), as well as the nearest-neighbor BIN code (even if made up of only private sequences), meaning that virtually all barcode lineages, including GenBank downloads and private projects, could be assessed to some degree by our combined research groups. We did not accept species identifications from ID metadata on BOLD, which are determined by various submitters to the barcode database or GenBank. The general lack of quality control has led to a proliferation of misidentifications on databases, exacerbated by the desire, or even perceived requirement, by contributors to identify specimens to species, often without the expertise to make species-level determinations. This flaw is one of the greatest limitations of specimen-record databases, both for DNA sequences or general occurrence records (such as FishBase or GBIF). BOLD fortunately connects sequences to voucher specimen records, often with photographs. In many cases, the photographs alone contain diagnostic information for species-level identification. Voucher specimens were retained and examined for almost all specimens sequenced in the projects by ECOSUR and the OSF/Victor collection (the majority of BOLD TWA records). DNA lineages without a diagnostic voucher or photograph were assigned to species with varying degrees of certainty based on a combination of three cumulative methods: 1) phylogenetic deduction from the nearest-neighbor species (either from the region, or sibling species from the eastern Pacific), 2) phylogeographic deduction, adding geographic range matching (i.e. an unassigned lineage from a set of locations known to represent the range of a particular candidate species), and 3) the process of elimination; for example, when only one candidate species in a genus is left unassigned and there is one remaining unidentified DNA lineage. 
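The third of these methods, the process of elimination, can be illustrated with a minimal sketch. It assumes a simplified data model that is not part of the study: the genus names, species names, and lineage identifiers below are hypothetical placeholders, and the function covers only the elimination step, not phylogenetic or geographic matching. A lineage is proposed for a species only when its genus has exactly one unidentified lineage and exactly one species on the regional list still lacking a lineage.

```python
def assign_by_elimination(regional_species, confirmed, unidentified):
    """Propose species for unidentified lineages by the process of elimination.

    regional_species: dict mapping genus -> set of species on the regional list
    confirmed:        dict mapping lineage id -> species already identified
    unidentified:     dict mapping lineage id -> genus (placed to genus only)

    Returns a dict lineage id -> proposed species, only for genera in which
    exactly one lineage is unidentified and exactly one species is unassigned.
    """
    assigned_species = set(confirmed.values())

    # Group the unidentified lineages by genus.
    lineages_by_genus = {}
    for lineage, genus in unidentified.items():
        lineages_by_genus.setdefault(genus, []).append(lineage)

    proposals = {}
    for genus, lineages in lineages_by_genus.items():
        remaining = regional_species.get(genus, set()) - assigned_species
        if len(lineages) == 1 and len(remaining) == 1:
            proposals[lineages[0]] = next(iter(remaining))
    return proposals


# Hypothetical example data (placeholder names, not real taxa or BOLD records).
regional = {
    "Exemplum": {"Exemplum alpha", "Exemplum beta", "Exemplum gamma"},
    "Fictus": {"Fictus unus", "Fictus duo", "Fictus tres"},
}
confirmed = {
    "LIN-001": "Exemplum alpha",
    "LIN-002": "Exemplum beta",
    "LIN-004": "Fictus unus",
}
unidentified = {
    "LIN-003": "Exemplum",  # one lineage left, one species left: assignable
    "LIN-005": "Fictus",    # two lineages left for two species: ambiguous
    "LIN-006": "Fictus",
}

print(assign_by_elimination(regional, confirmed, unidentified))
```

In this sketch the two unresolved lineages in the second genus are deliberately left unassigned, which mirrors the point made above that elimination only becomes decisive as coverage of a group nears completion.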
With this procedure, almost all DNA lineages could be assigned a species identification (although for gobioids and blennioids, it often was assignment to a local subspecies/population of a nominal species), with only a small fraction considered by us to be questionably assigned (but certainty was not quantifiable). As complete coverage for any particular taxon is approached, the positive feedback property for identification moves many tentative species assignments to confident species assignments.

3 Results

3.1 Overall Coverage

For the broadest category of bony shorefishes of the Greater Caribbean, we include fishes from the Gulf of Mexico, Florida, and the Caribbean Sea, excluding fishes only found in Guyana to Brazil. All bottom-associated species down to 200 m are included, along with pelagic families that can be found nearshore, such as the carangids, scombrids, ariommatids, nomeids, echeneids, molids, belonids, and hemiramphids. Euryhaline fishes such as the eleotrids, atherinopsids, and atherinids are also included. That list totals 1311 bony shorefish species, and 1029 of them are barcoded to date (78.5%) (Table 1).

Our subset of bony reef-fish species of the region (found above 100 m and associated with reefs) numbers 1083 species, higher than other published reef-fish compilations, since we generally follow taxonomic guidebook format and thus include pelagic fish families with members that can be observed over reefs (including all of the carangids, scombrids, echeneids, belonids, and the elopiformes and albulids); as well as the soft-substrate species that are found in sandbeds and grassbeds around reefs (such as bothids, paralichthyids, cynoglossids, triglids, ogcocephalids, bythitids, chlopsids, and ophichthids); as well as a small subset of the clupeids, atherinids, congrids, and sciaenids that are typically seen near reefs (Table 1). Of our bony reef-fish species total of 1083 species, 902 are barcoded to date (83.3%).

The unbarcoded species are predominantly species endemic to the Gulf of Mexico (GOM) or Venezuela and/or the South American continental shore. Of the 181 unbarcoded species in the reef-fish species list, 30 are Florida/GOM endemics (2.8% of total) and 40 are Venezuelan/S. Caribbean endemics (3.7% of total), leaving only 111 remaining species unbarcoded (10.2%), indicating overall coverage of about 90% for the central reef-fish fauna in the region. The remaining unbarcoded species are mostly rare species (either deeper water, local endemics, or unusual or secretive habitats), those with uncertain taxonomy (or unique holotypes), and a very few regular reef-fish species that have just been overlooked in collections, coincidentally by all research groups concentrating on the region.

3.2 Coverage by Family

Three of the 30 large bony shorefish families (more than 15 species each in the TWA) have been completely barcoded for the region (Table 1). They are prominent commercially important families, comprising the snappers (Lutjanidae), grunts (Haemulidae), and tunas (Scombridae). Several more of the large families are almost complete, with only one or two species missing: i.e. the cardinalfishes (Apogonidae), damselfishes (Pomacentridae), wrasses (Labridae, excluding parrotfishes), porgies (Sparidae), and jacks (Carangidae). The second largest family of shorefishes in the region, the basses (Serranidae, including the groupers of Epinephelidae), is 86% barcoded, mainly missing a few deepwater and/or rare species. The largest fish family is the gobies (Gobiidae), with only 76% of the 134 regional species barcoded, mainly missing a set of deeper-water species and some rare and/or microendemic species that have not been sampled. The reef-fish subset within families is somewhat better covered, with almost all families having higher coverage of their reef-associated members (Table 1).

3.3 Cryptic Diversity

Numerous species of tropical marine fishes, especially reef fishes, show evidence of undescribed cryptic diversity after genetic analyses [4]. The pattern of which species show extensive cryptic diversity is not clear in the vast scale of the Indo-Pacific, where cryptic diversity is frequent among many quite different fish families. However, in the smaller Greater Caribbean region, the pattern is very clear: only families with lower dispersal ability, i.e. benthic brooded eggs and relatively short larval lives, less than about 30 days, break up into cryptic species complexes [4]. A number of cryptic Caribbean species complexes have been described in recent years (e.g. [31-37]), and many more remain to be explored. This pattern is clearly apparent in the number of BINs associated with a single nominal species in our review: the number of BINs approximates the number of species barcoded in most families of shorefishes (rarely there are fewer BINs than species, see below), with the main exception being a markedly greater number of BINs in the families with benthic eggs and short larval lives, i.e. the Gobiidae and

Table 1. Barcode coverage of tropical western Atlantic bony fishes, by family; in descending order of number of species known for the region. Numbers in bold highlight families that include non-reef species (i.e. #reef species less than total #species). 
The first four numeric columns are for ALL SPECIES, the last four for REEF SPECIES. Reef cells are blank where the flattened source did not allow the values to be matched to a family.

| Family | #sp | #barcoded | %barcoded | #BINs | #reef sp. | #barcoded | %barcoded | #BINs |
|---|---|---|---|---|---|---|---|---|
| Gobiidae | 134 | 102 | 76% | 157 | 130 | 102 | 78% | 157 |
| Serranidae | 96 | 82 | 85% | 84 | 92 | 81 | 88% | 84 |
| Labrisomidae | 58 | 50 | 86% | 88 | 57 | 50 | 88% | 88 |
| Sciaenidae | 54 | 40 | 74% | 44 | 8 | 8 | 100% | 9 |
| Chaenopsidae | 50 | 39 | 78% | 85 | 50 | 39 | 78% | 85 |
| Ophichthidae | 45 | 29 | 64% | 38 | 37 | 27 | 73% | 26 |
| Paralichthyidae | 35 | 25 | 71% | 31 | 30 | 22 | 73% | 28 |
| Carangidae | 32 | 30 | 94% | 30 | 32 | 30 | 94% | 30 |
| Congridae | 32 | 22 | 69% | 27 | 10 | 8 | 80% | 11 |
| Syngnathidae | 27 | 22 | 81% | 22 | 25 | 22 | 88% | 22 |
| Gobiesocidae | 26 | 16 | 62% | 24 | 21 | 16 | 76% | 24 |
| Haemulidae | 25 | 25 | 100% | 26 | 25 | 25 | 100% | 26 |
| Bythitidae | 25 | 16 | 64% | 20 | 25 | 16 | 64% | 20 |
| Scorpaenidae | 25 | 22 | 88% | 25 | 24 | 22 | 92% | 25 |
| Apogonidae | 24 | 23 | 96% | 28 | 24 | 23 | 96% | 28 |
| Batrachoididae | 24 | 12 | 50% | 15 | 23 | 12 | 52% | 15 |
| Ophidiidae | 22 | 14 | 64% | 17 | 22 | 14 | 64% | 17 |
| Labridae (wrasses) | 21 | 20 | 95% | 23 | 21 | 20 | 95% | 23 |
| Muraenidae | 21 | 17 | 81% | 17 | 20 | 16 | 80% | 16 |
| Blenniidae | 20 | 14 | 70% | 10 | 20 | 14 | 70% | 10 |
| Lutjanidae | 18 | 18 | 100% | 17 | 18 | 18 | 100% | 17 |
| Sparidae | 18 | 16 | 89% | 15 | 18 | 16 | 89% | 15 |
| Engraulidae | 18 | 12 | 67% | 12 | | | | |
| Triglidae | 18 | 15 | 83% | 16 | 18 | 15 | 83% | 16 |
| Cynoglossidae | 17 | 11 | 65% | 11 | 13 | 11 | 85% | 11 |
| Dactyloscopidae | 17 | 7 | 41% | 7 | 17 | 7 | 41% | 7 |
| Scombridae | 16 | 16 | 100% | 12 | 16 | 16 | 100% | 12 |
| Pomacentridae | 16 | 15 | 94% | 17 | 16 | 15 | 94% | 17 |
| Scarinae | 16 | 13 | 81% | 13 | 16 | 13 | 81% | 13 |
| Tetraodontidae | 15 | 11 | 73% | 12 | 14 | 11 | 79% | 12 |
| Clupeidae | 14 | 11 | 79% | 12 | 7 | 6 | 86% | 6 |
| Ogcocephalidae | 14 | 6 | 43% | 5 | 14 | 6 | 43% | 5 |
| Atherinopsidae | 14 | 7 | 50% | 8 | | | | |
| Bothidae | 14 | 10 | 71% | 11 | | | | |
| Opistognathidae | 14 | 7 | 50% | 9 | | | | |
| Nettastomatidae | 13 | 7 | 54% | 8 | | | | |
| Synodontidae | 12 | 12 | 100% | 15 | 12 | 12 | 100% | 15 |
| Holocentridae | 12 | 9 | 75% | 11 | 12 | 9 | 75% | 11 |
| Grammatidae | 12 | 8 | 67% | 8 | 10 | 8 | 80% | 8 |
| Mugilidae | 12 | 11 | 92% | 11 | 3 | 3 | 100% | 3 |
| Gerreidae | 12 | 12 | 100% | 16 | 12 | 12 | 100% | 16 |
| Monacanthidae | 10 | 10 | 100% | 9 | 10 | 10 | 100% | 9 |
| Chlopsidae | 8 | 7 | 88% | 8 | 7 | 7 | 100% | 7 |
| Echeneidae | 8 | 8 | 100% | 8 | 8 | 8 | 100% | 8 |
| Nomeidae | 8 | 8 | 100% | 8 | | | | |
| Achiridae | 8 | 4 | 50% | 7 | | | | |
| Tripterygiidae | 8 | 8 | 100% | 15 | | | | |
| Antennariidae | 7 | 6 | 86% | 6 | | | | |
| Belonidae | 7 | 7 | 100% | 8 | | | | |
| Hemiramphidae | 7 | 6 | 86% | 6 | | | | |
| Chaetodontidae | 7 | 7 | 100% | 7 | | | | |
| Pomacanthidae | 7 | 7 | 100% | 7 | | | | |
| Eleotridae | 7 | 7 | 100% | 7 | | | | |
| Diodontidae | 7 | 7 | 100% | 6 | | | | |
| Centropomidae | 6 | 3 | 50% | 3 | | | | |
| Microdesmidae | 6 | 4 | 67% | 6 | | | | |
| Balistidae | 6 | 6 | 100% | 6 | | | | |
| Ostraciidae | 5 | 5 | 100% | 6 | | | | |
| Albulidae | 4 | 3 | 75% | 3 | | | | |
| Pristigasteridae | 4 | 1 | 25% | 1 | | | | |
| Priacanthidae | 4 | 4 | 100% | 4 | | | | |
| Ariommatidae | 4 | 3 | 75% | 3 | | | | |
| Mullidae | 4 | 4 | 100% | 4 | | | | |
| Kyphosidae | 4 | 4 | 100% | 4 | | | | |
| Uranoscopidae | 4 | 3 | 75% | 3 | | | | |
| Callionymidae | 4 | 3 | 75% | 6 | | | | |
| Carapidae | 3 | 1 | 33% | 1 | | | | |
| Atherinidae | 3 | 3 | 100% | 4 | | | | |
| Sphyraenidae | 3 | 3 | 100% | 3 | | | | |
| Polynemidae | 3 | 3 | 100% | 3 | | | | |
| Ptereleotridae | 3 | 2 | 67% | 2 | | | | |
| Acanthuridae | 3 | 3 | 100% | 3 | | | | |
| Molidae | 3 | 3 | 100% | 3 | | | | |
| Elopidae | 2 | 2 | 100% | 2 | | | | |
| Moringuidae | 2 | 2 | 100% | 2 | | | | |
| Fistulariidae | 2 | 2 | 100% | 2 | | | | |
| Coryphaenidae | 2 | 2 | 100% | 2 | | | | |
| Stromateidae | 2 | 2 | 100% | 2 | | | | |
| Emmelichthyidae | 2 | 1 | 50% | 1 | | | | |
| Pempheridae | 2 | 1 | 50% | 1 | | | | |
| Megalopidae | 1 | 1 | 100% | 1 | | | | |
| Dussumieriidae | 1 | 1 | 100% | 1 | | | | |
| Anguillidae | 1 | 1 | 100% | 1 | | | | |
| Heterenchelyidae | 1 | 0 | 0% | 0 | | | | |
| Colocongridae | 1 | 0 | 0% | 0 | | | | |
| Aulostomidae | 1 | 1 | 100% | 1 | | | | |
| Anomalopidae | 1 | 0 | 0% | 0 | | | | |
| Dactylopteridae | 1 | 1 | 100% | 1 | | | | |
| Malacanthidae | 1 | 1 | 100% | 1 | | | | |
| Cirrhitidae | 1 | 1 | 100% | 1 | | | | |
| Rachycentridae | 1 | 1 | 100% | 1 | | | | |
| Ephippidae | 1 | 1 | 100% | 1 | | | | |
| Muraenesocidae | 1 | 0 | 0% | 0 | | | | |
| Lobotidae | 1 | 1 | 100% | 1 | | | | |
| Totals | 1311 | 1029 | | 1247 | 1083 | 902 | | 1099 |

the blennioid families Labrisomidae, Chaenopsidae, and Tripterygiidae (Table 1). (Interestingly, the pattern has not been found so far for the true blennies (Blenniidae), which are known to have relatively large wide-ranging larvae.) These gobioid and blennioid families have many species with multiple BINs, typically allopatric, but sometimes sympatric. Among the gobies, there are 102 species barcoded, but 157 barcode BINs within those 102 species. Similarly, among the Labrisomidae there are 88 BINs for 50 species, among the Chaenopsidae there are 85 BINs for 39 species, and among the Tripterygiidae there are 15 BINs for 8 species. The high number of cryptic lineages in the Tripterygiidae persists even though several new species have been recently described [37]. In that study, four cryptic species of Enneanectes (three new species) were found to be coexisting on reefs in the Lesser Antilles of the Caribbean Sea (sympatric cryptic species). Nevertheless, the typical finding is allopatric species complexes and, since not all subregions of the Caribbean have been well-sampled, especially Colombian and Venezuelan reefs, the number of cryptic lineages among this set of reef fish families is likely to continue to increase.

4 Discussion

This review of the barcode coverage for the tropical W. Atlantic bony fishes illustrates the changing priorities of a barcode program as it matures and approaches complete coverage. 
If the earliest phase of a barcode program is to promulgate the message, identify and develop collaborators, and field test the methodology, then the middle phase is intensive recruitment of more collaborators, accumulation of more specimens from unsequenced species and undersampled locations, and development of an optimal quality control approach. As completion is approached, quality control can be improved by more rigorous identification by using the methods of assessment exemplified by this review, i.e. confirmed voucher photographs and specimens, phylogeographic deduction, and process of elimination. This approach permits each BIN, i.e. algorithm-derived operational taxonomic unit (OTU) as defined by Ratnasingham and Hebert (2013) [25], to be assessed in comparison to other lineages in BOLD and a conclusion reached on the species identification for the BIN lineage or sub-lineage, thus creating a "validated BIN". A validated BIN allows an assessor to question species IDs that do not match the validated identification- either to correct the specimen identification or to highlight exceptions to the "one BIN-one species" paradigm. Exceptions to the "one BIN-one species" are particularly interesting, comprising two basic categories: A) different species that share a BIN, either distinct sub-BIN lineages that are under the threshold for full BIN assignment [25], or barcode "phenovariants", species that share COI haplotypes or co-occur within a lineage; or B) barcode "genovariants", or multiple distinct COI lineages that fit the description of a single nominal species. Barcode genovariants, which represent cryptic lineages, subspecies, or species, are lineages or OTUs that need to be assessed taxonomically. They are not automatically cryptic species, since the decision on what is a species is the prerogative of a trained taxonomist, not simply a genetic variant [4]. 
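The two categories of exception to the "one BIN-one species" paradigm can be screened for mechanically once each record carries a validated species name. The following is a minimal sketch under assumed inputs: the record format, BIN codes, and species names are hypothetical placeholders, not BOLD's actual export format or real data. It flags BINs that contain more than one species (candidate phenovariants or sub-BIN lineages, category A) and species that span more than one BIN (candidate genovariants, category B); deciding what any flagged case means taxonomically remains a separate expert judgment.

```python
from collections import defaultdict


def classify_bins(records):
    """Flag exceptions to a one-BIN-one-species correspondence.

    records: iterable of (bin_id, species) pairs with validated identifications.

    Returns (shared_bins, multi_bin_species):
      shared_bins        maps a BIN to the sorted species it contains,
                         for BINs holding more than one species (category A);
      multi_bin_species  maps a species to its sorted BINs,
                         for species found in more than one BIN (category B).
    """
    species_by_bin = defaultdict(set)
    bins_by_species = defaultdict(set)
    for bin_id, species in records:
        species_by_bin[bin_id].add(species)
        bins_by_species[species].add(bin_id)

    shared_bins = {
        b: sorted(s) for b, s in species_by_bin.items() if len(s) > 1
    }
    multi_bin_species = {
        s: sorted(b) for s, b in bins_by_species.items() if len(b) > 1
    }
    return shared_bins, multi_bin_species


# Hypothetical records (placeholder BIN codes and species names).
records = [
    ("BIN-A", "Exemplum alpha"),
    ("BIN-A", "Exemplum alpha"),
    ("BIN-B", "Exemplum beta"),   # two species sharing one BIN
    ("BIN-B", "Exemplum gamma"),
    ("BIN-C", "Fictus unus"),     # one species split across two BINs
    ("BIN-D", "Fictus unus"),
]

shared, split = classify_bins(records)
print("Shared BINs:", shared)
print("Multi-BIN species:", split)
```

A species and BIN pair that appears in neither result corresponds one-to-one, which the text above reports as the situation for the large majority of TWA shorefishes.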
At present, there is a general consensus among fish taxonomists not to accept genovariants as species unless there is a morphological, meristic (countable features), or marking difference considered sufficiently consistent, reliable, and significant to merit species-level designation. This is frequently a difficult decision, especially when the intervening geographic range is unsampled, few (or unique) specimens are sampled or even retained, only preserved material is available, or the taxonomy of the group is unstable, unreviewed, or perhaps entirely unexamined.

Barcode phenovariants are apparently less frequent than genovariants in marine fishes, but are still not unusual, occurring perhaps among as many as 5-10% of all shorefish species (a difficult rough estimate, since "false phenovariants" because of misidentifications are ubiquitous in the databases). When two or more nominal species truly share COI sequences, it indicates that the species deserve a closer taxonomic examination: they could represent the same species that had erroneously been split, but frequently they are closely-related species that either hybridize occasionally or have not been separated for sufficient time for mutually exclusive sets of haplotypes in COI to develop (incomplete lineage sorting). Additional sequencing of faster evolving markers, such as the mitochondrial control region, could distinguish species that have not yet diverged sufficiently in the COI marker. An example of the latter is the important tuna genus Thunnus, whose species mostly share barcode BINs, but can be distinguished to some degree by additional mitochondrial and nuclear markers and/or character-based sequence analyses [38]. Almost all other cases of phenovariants among reef fishes have not been closely examined (e.g. Hyporthodus groupers or the Helicolenus and Sebastes rockfishes), but it is likely that many cases will prove to be similar to the tunas. 
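The headline coverage figures and the BIN-to-species ratios reported in the Results can be re-derived with a few lines of arithmetic. The sketch below uses only counts stated in the text; the variable and function names are illustrative choices, and the "about 90%" figure is computed here as the complement of the non-endemic unbarcoded share, which is one reading of how that estimate was reached.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Counts as reported in Section 3.1.
shore_total, shore_barcoded = 1311, 1029
reef_total, reef_barcoded = 1083, 902
fl_gom_endemics, s_carib_endemics = 30, 40  # unbarcoded regional endemics

reef_unbarcoded = reef_total - reef_barcoded
remaining_unbarcoded = reef_unbarcoded - fl_gom_endemics - s_carib_endemics

print("Shorefish coverage (%):", pct(shore_barcoded, shore_total))
print("Reef-fish coverage (%):", pct(reef_barcoded, reef_total))
print("Unbarcoded reef species:", reef_unbarcoded)
print("Unbarcoded, non-endemic:", remaining_unbarcoded,
      f"({pct(remaining_unbarcoded, reef_total)}% of reef list)")
print("Central reef-fish coverage (%):",
      round(100 - pct(remaining_unbarcoded, reef_total), 1))

# BINs and barcoded species per family, as reported in Section 3.3.
bins_and_species = {
    "Gobiidae": (157, 102),
    "Labrisomidae": (88, 50),
    "Chaenopsidae": (85, 39),
    "Tripterygiidae": (15, 8),
}
for family, (n_bins, n_species) in bins_and_species.items():
    print(f"{family}: {n_bins / n_species:.2f} BINs per barcoded species")
```

A ratio well above one BIN per barcoded species is the signal of cryptic lineages discussed above; for most other families in Table 1 the two counts are close to equal.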
Acknowledgements: We thank Paul Hebert, Robert Hanner, Dirk Steinke, and Mahmood Shivji for their introduction to the Barcode Project, and general guidance and support throughout the past decade as we exhaustively surveyed the Caribbean fish barcodes. For the Mexican Barcode program, we thank S. Morales, D. Acevedo and Y. Cota for their dedicated work as parataxonomists of fish larvae and C. Quintal, D. Cazarez, R. Herrera and J. A. Cohuo for their assistance in field collections. Most adult specimen photographs were taken by Humberto Bahena Basave and photographs of larvae were taken by S. Morales and J. A. Cohuo. C. Quintal and J. J. Schmitter Soto assisted with the identification of some adult fish, while Maria Eugenia Vega, Uriel Ordóñez and Margarita Ornelas from CINVESTAV Mérida kindly donated specimens from the Yucatan coast and helped with their identification. J. Lamkin and E. Malca from NOAA/UM supported the larval collections (project 517/04). We thank A. Martínez for work in the Barcode Laboratory in Chetumal and the staff of Mexican marine areas for their support in the field, especially M.C. Garcia of Xcalak and M. F. Remolina of Contoy. CONAPESCA provided scientific collection permits. DNA sequencing was carried out at the Biodiversity Institute of Ontario, University of Guelph, supported by grants to PDNH from Genome Canada through the Ontario Genomics Institute, and from NSERC. Part of this work was supported through Grant HE009 and HB043 from Comisión Nacional Para el Uso y Conservación de la Biodiversidad (CONABIO) and the Mexican Barcode of Life (MEXBOL) network. Conflict of interest: Dr Benjamin Victor has nothing to disclose.

Journal

DNA Barcodes, de Gruyter

Published: Jan 1, 2015

References