Tuesday, August 6, 2013


Metabarcoding is a rapid method of biodiversity assessment that combines two technologies: DNA-based identification and high-throughput DNA sequencing. It uses universal PCR primers to mass-amplify DNA barcodes from mass collections of organisms or from environmental DNA. The PCR product is then run on a next-generation sequencer, and the result is a wealth of DNA sequences. Such sequence collections are auditable, because sites can be sampled by independent parties, or samples can be split and analysed by certified entities following a standardized protocol. They can also be verified by fieldwork to confirm the presence or absence of particular species. Compared with standard biodiversity data sets, these metabarcode data sets are taxonomically more comprehensive, many times quicker to produce, and less reliant on taxonomic expertise.

However, the reliability of such data sets has not been fully tested. In general, studies have found that not every species is recovered from samples and that the ecological patterns do not perfectly match those found using standard data sets. Can these discrepancies be ignored? Are the metabarcode data sets in fact revealing higher-resolution ecological patterns? Most importantly, can the information that is recovered by metabarcoding be used to answer policy and management questions reliably?

A new study published in Ecology Letters two days ago aims to answer those questions:
Here, we validate metabarcoding by testing it against three high-quality standard data sets that were collected in Malaysia (tropical), China (subtropical) and the United Kingdom (temperate) and that comprised 55,813 arthropod and bird specimens identified to species level with the expenditure of 2,505 person-hours of taxonomic expertise. The metabarcode and standard data sets exhibit statistically correlated alpha- and beta-diversities, and the two data sets produce similar policy conclusions for two conservation applications: restoration ecology and systematic conservation planning. Compared with standard biodiversity data sets, metabarcoded samples are taxonomically more comprehensive, many times quicker to produce, less reliant on taxonomic expertise and auditable by third parties, which is essential for dispute resolution.
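To make the idea of "statistically correlated alpha- and beta-diversities" a bit more concrete, here is a minimal Python sketch of such a comparison. It is not the authors' analysis: the site names, species labels, OTU labels and counts are all made up, and I assume species richness as the alpha-diversity measure, Jaccard dissimilarity as the beta-diversity measure, and a plain Pearson correlation between the two data sets.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: site -> {taxon: count}. Not taken from the study.
STANDARD = {
    "A": {"sp1": 5, "sp2": 3, "sp3": 1},
    "B": {"sp1": 2, "sp2": 1},
    "C": {"sp3": 4, "sp4": 2, "sp5": 1, "sp6": 1},
    "D": {"sp4": 1},
}
METABARCODE = {
    "A": {"otu1": 120, "otu2": 40, "otu3": 10, "otu7": 5},
    "B": {"otu1": 80, "otu2": 30},
    "C": {"otu3": 60, "otu4": 20, "otu5": 9},
    "D": {"otu4": 15, "otu5": 2},
}


def richness(sample):
    """Alpha-diversity as richness: the number of taxa with a non-zero count."""
    return sum(1 for count in sample.values() if count > 0)


def jaccard_dissimilarity(a, b):
    """Beta-diversity between two samples, based on presence/absence only."""
    present_a = {taxon for taxon, count in a.items() if count > 0}
    present_b = {taxon for taxon, count in b.items() if count > 0}
    union = present_a | present_b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(present_a & present_b) / len(union)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def compare(standard, metabarcode):
    """Correlate per-site richness and pairwise dissimilarities of two data sets."""
    sites = sorted(standard)
    alpha_r = pearson(
        [richness(standard[s]) for s in sites],
        [richness(metabarcode[s]) for s in sites],
    )
    pairs = list(combinations(sites, 2))
    beta_r = pearson(
        [jaccard_dissimilarity(standard[a], standard[b]) for a, b in pairs],
        [jaccard_dissimilarity(metabarcode[a], metabarcode[b]) for a, b in pairs],
    )
    return alpha_r, beta_r


if __name__ == "__main__":
    alpha_r, beta_r = compare(STANDARD, METABARCODE)
    print(f"alpha-diversity correlation: {alpha_r:.2f}")
    print(f"beta-diversity correlation:  {beta_r:.2f}")
```

Note that the two data sets never have to share taxon names: richness and dissimilarity are computed within each data set, and only the resulting numbers are compared across sites. The pairwise dissimilarities are not independent of each other, so a real analysis would need something like a permutation test to judge significance; the sketch only computes the correlation coefficients.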

Needless to say, this study was very successful and provided some interesting numbers, e.g. on the amount of work that went into both the standard approach and the next-generation sequencing method. Even at such an experimental stage, metabarcoding required only about a fourth of the person-hours, and under the standard approach not even the full samples were analysed. The authors close with some recommendations for further metabarcoding work, which I would like to share:

Naturally, metabarcode data sets are subject to error and loss of information, so most research effort to date has been to validate metabarcoding against standard biodiversity censuses (see Introduction), and to develop more efficient and reliable pipelines that take advantage of advances in sequencing technology. Another focus has been devising clever ways to collect the DNA of difficult-to-trap taxa: water, soil, pollen traps, faeces and parasites. We expect both of these areas to continue to consume research effort.

In addition, we see the following three directions as especially important if metabarcoding is to bridge the science-practitioner divide:

1. Developing statistical and laboratory methods to allow robust inference of species abundances in samples and across landscapes. Related to this is the development of PCR-free methods that reduce read-number biases and allow the detection of taxa that do not amplify well, such as the Hymenoptera (Yu et al. 2012).
2. Robust methods of taxonomic assignment and phylogenetic placement, with confidence estimates at each taxonomic level, while minimising false-positive assignments (Matsen et al. 2010; Zhang et al. 2012).
3. Deeper connection with the end-users of biodiversity data (Cook et al. 2013), including the development of chain-of-evidence and bioinformatic-reporting protocols to increase the credibility of the data.


  1. Could you possibly post the full references for your citations please?

  2. All references are now hyperlinked using a DOI. Sorry about the oversight.

  3. I have seen the articles reporting that sequencing DNA from insect soup or leech soup can provide biodiversity information and inform conservation strategies, but I am quite surprised that this is possible through DNA metabarcoding. How does that work?

  4. What is the difference between metabarcoding and metagenomics?