Supplementary information for Altermatt et al. Methods in Ecology and Evolution. DOI: 10.1111/2041-210X.12312

“Big answers from small worlds: a user's guide for protist microcosms as a model system in ecology and evolution”

Altermatt F, Fronhofer EA, Garnier A, Giometto A, Hammes F, Klecka J, Legrand D, Mächler E, Massie TM, Pennekamp F, Plebani M, Pontarp M, Schtickzelle N, Thuillier V & Petchey OL

2.7 DNA Sequencing and Barcoding


DNA sequencing of protist species is used to analyse population dynamics (Hajibabaei et al. 2011; Zufall, Dimon & Doerder 2013) or the genetic diversity of species complexes (e.g., Catania et al. 2009), for comparative studies (Gray et al. 1998), or to understand the evolution of genes and genomes (Brunk et al. 2003; Chen, Zhong & Monteiro 2006; Moradian et al. 2007). DNA barcoding is a special case of sequencing that focuses on a short and conserved portion of the genome in order to disentangle the phylogenetic relationships between taxa (Pawlowski et al. 2012). DNA barcoding or sequencing makes it possible to estimate nucleotide diversity and fixation indices (Fst), and consequently to assess the genetic structure and gene flow within and among populations. Genetic variability can also be compared to life history traits or to phenotypic plasticity resulting from local adaptation (Krenek, Petzoldt & Berendonk 2012) in order to understand patterns of evolution. DNA barcoding has been of great interest in phylogenetics for discovering morphospecies or cryptic species and for identifying the species composition of a particular environment. Barcodes have been used to study the composition of, and interactions between, species from the same environment, such as soil (Blaxter 2004) or the water column (Stern et al. 2010; Hajibabaei et al. 2011), and to identify the cryptic species or morphospecies that are frequent in protists (e.g., Barth et al. 2006).
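
As an illustration of the two summary statistics named above, the following minimal Python sketch computes nucleotide diversity as the mean pairwise proportion of differing sites, and a simple sequence-based Fst as (pi_between - pi_within) / pi_between. The function names and the toy sequences are hypothetical, the sequences are assumed to be already aligned and gap-free, and this is only one of several formulations of Fst; dedicated population-genetics software should be preferred for real data.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs):
    """Mean pairwise p-distance (pi) within one sample of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("at least two sequences are needed")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def fst(pop1, pop2):
    """Simple estimate: (pi_between - pi_within) / pi_between."""
    pi_within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    between = [p_distance(a, b) for a in pop1 for b in pop2]
    pi_between = sum(between) / len(between)
    return (pi_between - pi_within) / pi_between


# Toy, made-up alignments for two populations (illustration only).
pop_a = ["ACGTACGT", "ACGTACGA"]
pop_b = ["TCGTACGT", "TCGTACGA"]

pi_a = nucleotide_diversity(pop_a)
fst_ab = fst(pop_a, pop_b)
print(f"pi (population A) = {pi_a:.3f}, Fst = {fst_ab:.3f}")
```

In this toy example the two populations differ consistently at the first site, which is what pushes the between-population distance above the within-population diversity.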

The gene or barcode of interest should be chosen carefully, depending on the taxonomic level and the species one works on. Because the mitochondrial genome evolves faster, it accumulates more sequence variability between organisms, which makes it possible to resolve intraspecific relationships or recent phylogenetic splits. Many barcodes have been developed previously (Nassonova et al. 2010; Pawlowski et al. 2012), either in the mitochondrial genome (e.g., Cox-1, Cob, SSU of rDNA) or in the nuclear genome (e.g., ITS1-2, SSU and LSU of rDNA).


The procedure for DNA sequencing and barcoding consists of three steps: DNA extraction, gene/barcode choice and amplification by PCR, and sequencing methods.

First, DNA should be extracted from the cells. Different procedures have been developed to isolate whole DNA: Chelex solution (Walsh, Metzger & Higuchi 1991), various DNA isolation kits (e.g., Epicentre) or modified phenol/chloroform extraction (Couvillion & Collins 2012). The extraction of whole DNA is sufficient for known barcodes or for sequencing single-copy genes. However, many protist species are polyploid (>45 in Tetrahymena thermophila), and duplication events of particular genes are well known. Furthermore, ribosomal genes have been duplicated from the mitochondrial genome to the nuclear genome. In that particular case, all copies will be amplified without distinction. Since these two genomes do not evolve at the same speed, a mixture of amplified sequences will increase the noise on the chromatogram. This reduces the readability of the resulting sequence and can lead to false interpretations. When one wants to create new barcodes and ensure their specificity, the mitochondrial genome should be separated from the nuclear genome as a necessary precaution. Nuclear and mitochondrial material can be separated by migration on an agarose gel (0.4% at 50 V for 6 h) using total DNA isolated by modified chloroform extraction (V. Thuillier et al. unpub. results). Depending on the organism being studied, the upper and brighter band in the agarose gel corresponds to the nuclear DNA, and the mitochondrial DNA appears at around 40 kb. The band of interest can be excised and purified with a kit (e.g., Wizard SV kit). Ciliates have two nuclei: the macronucleus, which is involved in somatic division, and the micronucleus, which is responsible for the germline. Both genomes are very similar, given that the micronuclear genes are copied several times to form the macronucleus (Prescott 1994). Therefore, in order to analyse nuclear genes, the two nuclei should be separated by gradient separation, such as Percoll gradients (Allen 1999; Asai & Forney 2000).

Second, the gene of interest or barcode should be chosen carefully, depending on the taxonomic level and the species one works on. Because the mitochondrial genome evolves faster, it accumulates more sequence variability between organisms, which makes it possible to resolve intraspecific relationships or recent phylogenetic nodes. Many barcodes have been developed (Nassonova et al. 2010; Pawlowski et al. 2012), either in the mitochondrial genome (e.g., Cox-1, cytochrome oxidase 1, in Tetrahymena and in amoebae; Cob, cytochrome b; SSU of rDNA, ribosomal small subunit; Slapeta, Moreira & Lopez-Garcia 2005; Chantangsi et al. 2007; Nassonova et al. 2010; Kher et al. 2011) or in fast-evolving nuclear portions (e.g., ITS1-2, internal transcribed spacers 1-2, in Carchesium polypinum, diatoms and Tetrahymena thermophila; SSU rDNA 5.8S in Paramecium aurelia; or LSU rDNA, ribosomal large subunit; Chen, Zhong & Monteiro 2006; Catania et al. 2009; Gentekaki & Lynn 2009; Moniz & Kaczmarska 2010). The PCR conditions and primers used are described in the corresponding publications. New barcodes can also be designed with the Primer3 software, which helps to design primers in association with the NCBI database. A classical PCR procedure (Chen, Zhong & Monteiro 2006) can be tested and modified if necessary, keeping in mind that the Tm (melting temperature) has a strong influence on how well the PCR works. An optimal PCR protocol can be found by testing across a temperature-magnesium gradient.
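
A temperature-magnesium gradient amounts to running the same reaction over every combination of annealing temperature and MgCl2 concentration. The Python sketch below only enumerates such a grid for planning plate layouts; the temperature and concentration values, as well as the function name, are illustrative assumptions and are not taken from the cited protocols.

```python
from itertools import product


def gradient_grid(anneal_temps_c, mgcl2_mM):
    """Return one reaction condition per temperature x magnesium combination."""
    return [
        {"anneal_c": temp, "mgcl2_mM": mg}
        for temp, mg in product(anneal_temps_c, mgcl2_mM)
    ]


# Assumed example ranges; adapt to the primers and polymerase actually used.
anneal_temps_c = [45, 48, 51, 54, 57, 60]
mgcl2_mM = [1.5, 2.0, 2.5, 3.0]

grid = gradient_grid(anneal_temps_c, mgcl2_mM)
for well, condition in enumerate(grid, start=1):
    print(well, condition["anneal_c"], "°C", condition["mgcl2_mM"], "mM MgCl2")
```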

Finally, the PCR products can be sequenced by Sanger sequencing or by Next Generation Sequencing (NGS) (Illumina, Solexa, SOLiD, see Valentini, Pompanon & Taberlet 2009). Sanger sequencing is favoured when the number of sequences and barcodes is limited. NGS costs have decreased considerably in recent years. For Sanger sequencing, the resulting sequences should be cleaned, which is most often done visually on the chromatogram in MEGA (open source software), Sequencher (open source software) or Geneious (private software). For the analysis of the sequences, many software packages exist, depending on the purpose, and are well explained (Hall 2013). The treatment and analysis of the generated sequences require expertise in bioinformatics, and the detailed procedure is beyond the scope of this paper. NGS is usually used in metagenomics (Hajibabaei et al. 2011), in surveys of microorganism diversity (Medinger et al. 2010) or in comparative studies. Sequencing data are available and compiled in various databases such as GenBank (NCBI) and, for barcoding sequences, BOLD (Barcode of Life Data Systems).
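
Cleaning of Sanger reads is described above as a visual step. As a complement, the sketch below shows one simple automated heuristic: trimming both ends of a read back to the first and last base whose quality score reaches a threshold. It assumes that per-base Phred-like quality scores are available from the base caller; the function name, the threshold of 20 and the example read are hypothetical, and low-quality bases inside the retained region are deliberately left for manual inspection.

```python
def trim_low_quality_ends(seq, quals, min_q=20):
    """Trim both ends of a read up to the first/last base with quality >= min_q."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must have the same length")
    good = [i for i, q in enumerate(quals) if q >= min_q]
    if not good:
        return ""  # no base passes the threshold
    return seq[good[0]:good[-1] + 1]


# Made-up read with poor-quality ends and one weak internal base.
read = "NNACGTACGTNN"
quals = [5, 8, 30, 35, 40, 12, 38, 40, 33, 30, 9, 4]

trimmed = trim_low_quality_ends(read, quals)
print(trimmed)
```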



A standard molecular biology laboratory (including a fume hood) with the respective equipment is needed.


DNA extractions (phenol/chloroform extraction), lysis buffer (pH 9.5):

  • 10 mM Tris, pH 7.5

  • 0.5 M EDTA

  • 1% SDS, made up to volume with ultrapure water

DNA extractions (modified Chloroform extraction, modified by V. Thuillier et al.), Lysis buffer (pH=8):

  • Tris 20 mM pH 7.5

  • EDTA 1 mM

  • NaCl 100 mM

  • SDS 10%

  • ddH2O
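
The volumes needed to prepare such a buffer from concentrated stocks follow from C1 × V1 = C2 × V2. The Python sketch below applies this to the modified chloroform lysis buffer listed above; the stock concentrations (1 M Tris, 0.5 M EDTA, 5 M NaCl, 20% SDS) and the 50 mL batch size are assumptions chosen for illustration, not values from the protocol.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock to add, from C1 * V1 = C2 * V2 (same unit for both concentrations)."""
    if final_conc > stock_conc:
        raise ValueError("stock must be at least as concentrated as the target")
    return final_conc * final_volume_ml / stock_conc


# component: (final concentration from the recipe, ASSUMED stock concentration)
recipe = {
    "Tris pH 7.5 (mM)": (20, 1000),
    "EDTA (mM)": (1, 500),
    "NaCl (mM)": (100, 5000),
    "SDS (%)": (10, 20),
}

final_volume_ml = 50.0  # assumed batch size
volumes = {
    name: stock_volume_ml(final, stock, final_volume_ml)
    for name, (final, stock) in recipe.items()
}
water_ml = final_volume_ml - sum(volumes.values())  # ddH2O to make up the volume

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} mL of stock")
print(f"ddH2O: {water_ml:.2f} mL")
```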

Choice of Barcode and PCR amplification:

Cox-1 barcode (Chantangsi et al. 2007): forward primer 5’-ATGTGAGTTGATTTTATAGA-3’ and reverse primer 5’-CTCTTCTATGTCTTAAACCAGGCA-3’.
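
Because the melting temperature guides the choice of annealing temperature, a quick first estimate for these two primers can be obtained with the Wallace rule, Tm = 2 × (A + T) + 4 × (G + C). The sketch below is a rough rule of thumb only: the rule is intended for short oligonucleotides, it ignores salt and primer concentrations, and nearest-neighbour methods will generally give different values. The function names are hypothetical.

```python
def wallace_tm(primer):
    """Rule-of-thumb melting temperature (°C): 2 * (A + T) + 4 * (G + C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def gc_percent(primer):
    """GC content of the primer in percent."""
    primer = primer.upper()
    return 100 * (primer.count("G") + primer.count("C")) / len(primer)


forward = "ATGTGAGTTGATTTTATAGA"
reverse = "CTCTTCTATGTCTTAAACCAGGCA"

for name, seq in (("forward", forward), ("reverse", reverse)):
    print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, Tm ~ {wallace_tm(seq)} °C")
```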


DNA extractions (phenol/chloroform extraction):

  1. Collect 2.5 × 10^5 cells in 50 µL Tris (10 mM, pH 7.5).

  2. Re-suspend and add 200 µL of pre-heated lysis buffer (60 °C).

  3. Add two volumes of water and incubate at 60 °C for at least 1 h.

  4. Cool to room temperature, add 50 µg/mL proteinase K and incubate at 37 °C overnight.

  5. Purify with one volume of phenol/chloroform/isoamyl alcohol.

  6. Precipitate with one-tenth volume sodium acetate (pH 5.2) and one volume of isopropanol.

  7. Wash pellet in 70% ethanol.

  8. Re-suspend in 75 µL Tris-EDTA.

  9. Add 0.8 µg/µL RNase A and incubate for 30 min at 37 °C.

  10. Purify with one volume of phenol/chloroform/isoamyl alcohol.

  11. Precipitate with one-tenth volume sodium acetate (pH 5.2) and one volume of isopropanol.

  12. Wash pellet in 70% ethanol.

  13. Re-suspend in the desired volume of Tris-EDTA.

DNA extractions (modified Chloroform extraction, modified by V. Thuillier et al.):

  1. Dilute the cells in ultrapure H2O to a final volume of 200 µL.

  2. Add 500 µL of lysis buffer (pH=8) and vortex for a few seconds until homogenised. Then add 10 µL proteinase K (mg/µl).

  3. Invert the tube 2-3 times.

  4. Incubate at 37 °C for 20 min, then vortex for a few seconds.

  5. Inactivate the enzyme by incubation for 20 min at 65 °C.

  6. Add 10 mg/mL RNase A, mix gently and incubate for 30 min at 37 °C. Vortex for a few seconds.

  7. Extract with 750 µL of chloroform-isoamyl alcohol (24:1): homogenise and centrifuge at 12 000 rcf for 10 min at room temperature. Collect the upper (aqueous) phase.

  8. Extract again with 750 µL of chloroform-isoamyl alcohol (24:1), repeating the same process.

  9. Precipitate with 1 mL of cold (-20 °C) 100% ethanol at room temperature. Mix carefully and incubate for 15 min.

  10. Centrifuge at 10 000 rpm for 30 min and invert the tube to discard the ethanol.

  11. Wash with 1 mL of 70% ethanol and centrifuge for 5 min at 8000 rpm. Remove the ethanol with a pipette. Dry for a few minutes only if some ethanol remains.

  12. Dissolve in 20 µL of water.
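
The protocol above gives centrifugation speeds partly in rcf and partly in rpm, and the two are only comparable once the rotor radius is known. The sketch below uses the standard relation RCF = 1.118 × 10^-5 × r × rpm², with r in cm. The 7 cm radius is an assumed example value for a benchtop microcentrifuge rotor; the actual radius should be taken from the rotor documentation.

```python
import math

RCF_CONSTANT = 1.118e-5  # for radius in cm and speed in rpm


def rpm_to_rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


radius_cm = 7.0  # ASSUMED rotor radius, for illustration only

print(f"10 000 rpm ~ {rpm_to_rcf(10000, radius_cm):.0f} x g")
print(f" 8 000 rpm ~ {rpm_to_rcf(8000, radius_cm):.0f} x g")
print(f"12 000 x g ~ {rcf_to_rpm(12000, radius_cm):.0f} rpm")
```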

Choice of Barcode and PCR amplification (Chantangsi et al. 2007 for COX-1):

  1. Initial denaturation step of 94 °C for 4 min.

  2. Followed by 5 cycles consisting of (each cycle): 30 s at 94 °C; 1 min at 45 °C; 105 s at 72 °C.

  3. Followed by 35 cycles consisting of (each cycle): 30 s at 94 °C; 1 min at 55 °C; s at 72 °C.

  4. Final extension step at 72 °C for 10 min.

Classical procedure for the PCR (Chen, Zhong & Monteiro 2006):

  1. Initial denaturation step of 94 °C for 10 min.

  2. Followed by 30 cycles consisting of (each cycle): 1 min at 94 °C; 1 min at Tm; 1 min at 72 °C.

  3. Final extension step at 72 °C for 10 min.
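
For planning purposes, the programmed hold time of such a protocol is the sum of the initial step, the cycled steps and the final extension. The sketch below does this arithmetic for the classical procedure; the function name is hypothetical, and ramping between temperatures is not included, so the real run time on a thermocycler will be somewhat longer.

```python
def program_minutes(initial_min, cycles, step_seconds, final_min):
    """Total programmed hold time in minutes, excluding temperature ramping."""
    cycle_min = sum(step_seconds) / 60
    return initial_min + cycles * cycle_min + final_min


# Classical procedure: 10 min at 94 °C, 30 x (1 min, 1 min, 1 min), 10 min at 72 °C.
classical_total = program_minutes(
    initial_min=10, cycles=30, step_seconds=[60, 60, 60], final_min=10
)
print(f"Classical PCR: {classical_total:.0f} min of programmed hold time")
```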


Allen, S.L. (1999) Chapter 8 Isolation of Micronuclear and Macronuclear DNA. Methods in Cell Biology (eds D.J. Asai & J.D. Forney), pp. 241-252. Academic Press.

Asai, D.L. & Forney, J.D. (2000) Tetrahymena thermophila. Academic Press, San Diego.

Barth, D., Krenek, S., Fokin, S.I. & Berendonk, T.U. (2006) Intraspecific genetic variation in Paramecium revealed by mitochondrial cytochrome C oxidase I sequences. J Eukaryot Microbiol, 53, 20-25.

Blaxter, M.L. (2004) The promise of a DNA taxonomy. Philos Trans R Soc Lond B Biol Sci, 359, 669-679.

Brunk, C.F., Lee, L.C., Tran, A.B. & Li, J. (2003) Complete sequence of the mitochondrial genome of Tetrahymena thermophila and comparative methods for identifying highly divergent genes. Nucleic Acids Research, 31, 1673-1682.

Catania, F., Wurmser, F., Potekhin, A.A., Przyboś, E. & Lynch, M. (2009) Genetic Diversity in the Paramecium aurelia Species Complex. Molecular Biology and Evolution, 26, 421-431.

Chantangsi, C., Lynn, D.H., Brandl, M.T., Cole, J.C., Hetrick, N. & Ikonomi, P. (2007) Barcoding ciliates: a comprehensive study of 75 isolates of the genus Tetrahymena. Int J Syst Evol Microbiol, 57, 2412-2425.

Chen, B., Zhong, D. & Monteiro, A. (2006) Comparative genomics and evolution of the HSP90 family of genes across all kingdoms of organisms. BMC Genomics, 7, 156.

Couvillion, M.T. & Collins, K. (2012) Chapter 12 - Biochemical Approaches Including the Design and Use of Strains Expressing Epitope-Tagged Proteins. Methods in Cell Biology (ed. K. Collins), pp. 347-355. Academic Press.

Diggles, B.K. & Adlard, R.D. (1997) Intraspecific variation in Cryptocaryon irritans. J Eukaryot Microbiol, 44, 25-32.

Foissner, W., Chao, A. & Katz, L.A. (2007) Diversity and geographic distribution of ciliates (Protista: Ciliophora). Biodiversity and Conservation, 17, 345-363.

Gentekaki, E. & Lynn, D.H. (2009) High-level genetic diversity but no population structure inferred from nuclear and mitochondrial markers of the peritrichous ciliate Carchesium polypinum in the Grand River basin (North America). Appl Environ Microbiol, 75, 3187-3195.

Gray, M.W., Lang, B.F., Cedergren, R., Golding, G.B., Lemieux, C., Sankoff, D., Turmel, M., Brossard, N., Delage, E., Littlejohn, T.G., Plante, I., Rioux, P., Saint-Louis, D., Zhu, Y. & Burger, G. (1998) Genome structure and gene content in protist mitochondrial DNAs. Nucleic Acids Research, 26, 865-878.

Hajibabaei, M., Shokralla, S., Zhou, X., Singer, G.A.C. & Baird, D.J. (2011) Environmental Barcoding: A Next-Generation Sequencing Approach for Biomonitoring Applications Using River Benthos. PLoS ONE, 6, e17497.

Hajibabaei, M., Singer, G.A., Clare, E.L. & Hebert, P.D. (2007) Design and applicability of DNA arrays and DNA barcodes in biodiversity monitoring. BMC Biol, 5, 24.

Hall, B.G. (2013) Building phylogenetic trees from molecular data with MEGA. Mol Biol Evol, 30, 1229-1235.

Kher, C.P., Doerder, F.P., Cooper, J., Ikonomi, P., Achilles-Day, U., Kupper, F.C. & Lynn, D.H. (2011) Barcoding Tetrahymena: discriminating species and identifying unknowns using the cytochrome c oxidase subunit I (cox-1) barcode. Protist, 162, 2-13.

Krenek, S., Petzoldt, T. & Berendonk, T.U. (2012) Coping with temperature at the warm edge--patterns of thermal adaptation in the microbial eukaryote Paramecium caudatum. PLoS ONE, 7, e30598.

Medinger, R., Nolte, V., Pandey, R.V., Jost, S., Ottenwalder, B., Schlotterer, C. & Boenigk, J. (2010) Diversity in a hidden world: potential and limitation of next-generation sequencing for surveys of molecular diversity of eukaryotic microorganisms. Mol Ecol, 19 Suppl 1, 32-40.

Moniz, M.B. & Kaczmarska, I. (2010) Barcoding of diatoms: nuclear encoded ITS revisited. Protist, 161, 7-34.

Moradian, M.M., Beglaryan, D., Skozylas, J.M. & Kerikorian, V. (2007) Complete Mitochondrial Genome Sequence of Three Tetrahymena Species Reveals Mutation Hot Spots and Accelerated Nonsynonymous Substitutions in Ymf Genes. PLoS ONE, 2, e650.

Nassonova, E., Smirnov, A., Fahrni, J. & Pawlowski, J. (2010) Barcoding amoebae: comparison of SSU, ITS and COI genes as tools for molecular identification of naked lobose amoebae. Protist, 161, 102-115.

Pawlowski, J., Audic, S., Adl, S., Bass, D., Belbahri, L., Berney, C., Bowser, S.S., Cepicka, I., Decelle, J., Dunthorn, M., Fiore-Donno, A.M., Gile, G.H., Holzmann, M., Jahn, R., Jirků, M., Keeling, P.J., Kostka, M., Kudryavtsev, A., Lara, E., Lukeš, J., Mann, D.G., Mitchell, E.A.D., Nitsche, F., Romeralo, M., Saunders, G.W., Simpson, A.G.B., Smirnov, A.V., Spouge, J.L., Stern, R.F., Stoeck, T., Zimmermann, J., Schindel, D. & de Vargas, C. (2012) CBOL Protist Working Group: Barcoding Eukaryotic Richness beyond the Animal, Plant, and Fungal Kingdoms. PLoS Biol, 10, e1001419.

Prescott, D.M. (1994) The DNA of ciliated protozoa. Microbiol Rev, 58, 233-267.

Slapeta, J., Moreira, D. & Lopez-Garcia, P. (2005) The extent of protist diversity: insights from molecular ecology of freshwater eukaryotes. Proc Biol Sci, 272, 2073-2081.

Stern, R.F., Horak, A., Andrew, R.L., Coffroth, M.A., Andersen, R.A., Kupper, F.C., Jameson, I., Hoppenrath, M., Veron, B., Kasai, F., Brand, J., James, E.R. & Keeling, P.J. (2010) Environmental barcoding reveals massive dinoflagellate diversity in marine environments. PLoS ONE, 5, e13991.

Valentini, A., Pompanon, F. & Taberlet, P. (2009) DNA barcoding for ecologists. Trends Ecol Evol, 24, 110-117.

Walsh, P.S., Metzger, D.A. & Higuchi, R. (1991) Chelex 100 as a medium for simple extraction of DNA for PCR-based typing from forensic material. Biotechniques, 10, 506-513.

Zufall, R.A., Dimon, K.L. & Doerder, F.P. (2013) Restricted distribution and limited gene flow in the model ciliate Tetrahymena thermophila. Molecular Ecology, 22, 1081-1091.