Predicting RAD-seq Marker Numbers across the Eukaryotic Tree of Life

Herrera, Santiago; Reyes-Herrera, Paula H.; Shank, Timothy M.

doi:10.1093/gbe/evv210

2015
By Herrera, Santiago ; Reyes-Herrera, Paula H. ; Shank, Timothy M.
Source: Genome Biol Evol. 2015 Dec; 7(12): 3207–3225.

Details:

Journal Title:

Genome Biology and Evolution
Personal Author:

Herrera, Santiago ; Reyes-Herrera, Paula H. ; Shank, Timothy M.

NOAA Program & Office:

OAR (Oceanic and Atmospheric Research)
Description:

High-throughput sequencing of reduced representation libraries obtained through digestion with restriction enzymes—generically known as restriction site associated DNA sequencing (RAD-seq)—is a common strategy to generate genome-wide genotypic and sequence data from eukaryotes. A critical design element of any RAD-seq study is knowledge of the approximate number of genetic markers that can be obtained for a taxon using different restriction enzymes, as this number determines the scope of a project, and ultimately defines its success. This number can only be directly determined if a reference genome sequence is available, or it can be estimated if the genome size and restriction recognition sequence probabilities are known. However, both scenarios are uncommon for nonmodel species. Here, we performed systematic in silico surveys of recognition sequences, for diverse and commonly used type II restriction enzymes across the eukaryotic tree of life. Our observations reveal that recognition sequence frequencies for a given restriction enzyme are strikingly variable among broad eukaryotic taxonomic groups, being largely determined by phylogenetic relatedness. We demonstrate that genome sizes can be predicted from cleavage frequency data obtained with restriction enzymes targeting “neutral” elements. Models based on genomic compositions are also effective tools to accurately calculate probabilities of recognition sequences across taxa, and can be applied to species for which reduced representation data are available (including transcriptomes and neutral RAD-seq data sets). The analytical pipeline developed in this study, PredRAD (https://github.com/phrh/PredRAD), and the resulting databases constitute valuable resources that will help guide the design of any study using RAD-seq or related methods.
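The estimate mentioned in the description (a marker count derived from genome size and recognition sequence probability) can be illustrated with a minimal sketch. It is not the PredRAD pipeline: it assumes the simplest possible composition model, in which bases are independent and only the GC content is known. The function names `site_probability` and `expected_sites`, the GC content of 0.5, and the 1 Gb genome size are hypothetical choices made for illustration.

```python
def site_probability(recognition_seq: str, gc_content: float) -> float:
    """Probability of seeing recognition_seq at one genome position,
    assuming independent bases and a known GC fraction (an assumption
    of this sketch, not necessarily the model used by PredRAD)."""
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    p_gc = gc_content / 2.0          # probability of G, and of C
    p_at = (1.0 - gc_content) / 2.0  # probability of A, and of T
    prob = 1.0
    for base in recognition_seq.upper():
        if base in "GC":
            prob *= p_gc
        elif base in "AT":
            prob *= p_at
        else:
            raise ValueError(f"unsupported base: {base!r}")
    return prob


def expected_sites(recognition_seq: str, gc_content: float,
                   genome_size_bp: int) -> float:
    """Expected number of recognition sites in a genome of the given size.
    One strand is scanned, which is sufficient for palindromic sites."""
    positions = max(genome_size_bp - len(recognition_seq) + 1, 0)
    return site_probability(recognition_seq, gc_content) * positions


if __name__ == "__main__":
    # Hypothetical inputs: a 1 Gb genome with 50% GC content.
    for enzyme, seq in [("EcoRI", "GAATTC"), ("SbfI", "CCTGCAGG")]:
        n = expected_sites(seq, gc_content=0.5, genome_size_bp=1_000_000_000)
        print(f"{enzyme} ({seq}): about {n:,.0f} expected sites")
```

Because the description reports that observed recognition sequence frequencies vary strongly among eukaryotic groups, an estimate from a model this simple should be read only as a rough starting point.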
Source:

Genome Biol Evol. 2015 Dec; 7(12): 3207–3225.
DOI:

http://dx.doi.org/10.1093/gbe/evv210
Pubmed ID:

26537225
Pubmed Central ID:

PMC4700943
Document Type:

Journal Article
Funding:

Grant no. NA09OAR4320129
Rights Information:

CC BY
Compliance:

PMC
Main Document Checksum:

urn:sha256:7834d6d88f27b7956bbee88952e6bce64d11dcf68b7caf49e43155ca193227ae
Download URL:

https://repository.library.noaa.gov/view/noaa/26305/noaa_26305_DS1.pdf
File Type:

[PDF-8.04 MB]