
Transcriptomic data reveal divergent paths of chitinase evolution underlying dietary convergence in anteaters and pangolins

Rémi Allio1,2,§,*, Sophie Teullet1,§, Dave Lutgen1,3,4,§, Amandine Magdeleine1, Rachid Koual1, Marie-Ka Tilak1, Benoit de Thoisy5,6, Christopher A. Emerling1,7, Tristan Lefébure8, and Frédéric Delsuc1,*

1ISEM, Univ. Montpellier, CNRS, IRD, Montpellier, France
2CBGP, INRAE, CIRAD, IRD, Montpellier SupAgro, Univ. Montpellier, Montpellier, France
3Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
4Swiss Ornithological Institute, Sempach, Switzerland
5Institut Pasteur de la Guyane, Cayenne, French Guiana, France
6Kwata NGO, Cayenne, French Guiana, France
7Biology Department, Reedley College, Reedley, CA, USA
8Univ. Lyon, Université Claude Bernard Lyon 1, CNRS, ENTPE, UMR 5023 LEHNA, F-69622, Villeurbanne, France

§Equal contribution
*Correspondence
Rémi Allio: remi.allio@inrae.fr
Frédéric Delsuc: frederic.delsuc@umontpellier.fr

Abstract

Ant-eating mammals represent a textbook example of convergent evolution. Among them, anteaters and pangolins exhibit the most extreme convergent phenotypes with complete tooth loss, elongated skulls, protruding tongues, hypertrophied salivary glands producing large amounts of saliva, and powerful claws for ripping open ant and termite nests. However, comparative genomic analyses have shown that anteaters and pangolins differ in their repertoires of chitinase acidic genes (CHIA), whose encoded enzymes potentially degrade the chitinous exoskeletons of ingested ants and termites. While the southern tamandua (Tamandua tetradactyla) harbors four functional CHIA paralogs (CHIA1-4), Asian pangolins (Manis spp.) have only one functional paralog (CHIA5). Here, we performed a comparative transcriptomic analysis of salivary glands in 33 placental species, including 16 novel transcriptomes from ant-eating species and close relatives.
Our results suggest that salivary glands play an important role in adaptation to an insect-based diet, as expression of different CHIA paralogs is observed in insectivorous species. Furthermore, convergently evolved pangolins and anteaters express different chitinases in their digestive tracts. In the Malayan pangolin, CHIA5 is overexpressed in all major digestive organs, whereas in the southern tamandua, all four functional paralogs are expressed, at very high levels for CHIA1 and CHIA2 in the pancreas, and for CHIA3 and CHIA4 in the salivary glands, stomach, liver, and pancreas. Overall, our results demonstrate that divergent molecular mechanisms within the chitinase acidic gene family underlie convergent adaptation to the ant-eating diet in pangolins and anteaters. This study highlights the role of historical contingency and molecular tinkering of the chitin-digestive enzyme toolkit in this classic example of convergent evolution.

Figures & Tables

Figure 1: Dated placental mammal phylogeny including representative species of the four major clades (Afrotheria, Xenarthra, Euarchontoglires, and Laurasiatheria) for which CHIA gene repertoires have been previously characterized. Numbers between brackets represent percentages of invertebrates included in the diet with myrmecophagous species indicated by an ant silhouette. Ψ symbols indicate CHIA pseudogenes as determined in previous studies (Emerling et al. 2018; Janiak et al. 2018; Wang et al. 2020). Ancestral CHIA gene repertoires for Placentalia and Ferae (Pholidota + Carnivora) as inferred by Emerling et al. (2018) are presented. The chronogram was extracted from www.timetree.org (Kumar et al. 2022). Silhouettes were obtained from www.phylopic.org.

Figure 2: A. Mammalian chitinase-like gene family tree reconstructed using a maximum likelihood gene-tree/species-tree reconciliation approach on protein sequences. The nine chitinase paralogs are indicated on the outer circle.
Scale bar represents the mean number of amino acid substitutions per site. B. Synteny analysis of the nine chitinase paralogs in humans (Homo sapiens), tarsier (Carlito syrichta), nine-banded armadillo (Dasypus novemcinctus) and the two main focal convergent ant-eating species: the southern tamandua (Tamandua tetradactyla) and the Malayan pangolin (Manis javanica). Assembly names and accession numbers are indicated below species names. Boxes represent different contigs with their most upstream and downstream BLAST hit positions to chitinase genes (colored arrows). Genes PIFO and DENND2D (grey arrows) are not chitinase paralogs but were used in the synteny analysis. Arrow direction indicates gene transcription direction as inferred in Genomicus v100.01 (Nguyen et al. 2022) for genes located on short contigs. Ψ symbols indicate pseudogenes as determined in Emerling et al. (2018). Genes with non-significant BLAST hits are not represented and are probably non-functional or absent. Silhouettes were obtained from www.phylopic.org.

Figure 3: Comparison of predicted ancestral protein sequences of the nine mammalian chitinase paralogs. A. Conserved amino acid residues of the canonical chitinolytic domain active site (DXXDXDXE). Arrows indicate paralogs in which changes occurred in the active site. B. Summary of the evolution of chitinase paralog functionality. C. Conserved cysteine residues of the chitin-binding domain. The arrow indicates OVGP1 in which the last four cysteines have been replaced.

Figure 4: Expression of the nine chitinase paralogs in 40 mammalian salivary gland transcriptomes. The 33 species are presented in their phylogenetic context covering the four major placental clades: Afrotheria (AFR), Xenarthra (XEN), Euarchontoglires (EUA), and Laurasiatheria (LAU). The chronogram was extracted from www.timetree.org (Kumar et al. 2022).
Non-functional pseudogenes are only indicated for the three focal species (in bold) using a Ψ symbol: nine-banded armadillo (Dasypus novemcinctus), southern tamandua (Tamandua tetradactyla) and Malayan pangolin (Manis javanica). Expression level is represented as log10 (Normalized Counts + 1). Asterisks indicate the 16 new transcriptomes produced in this study. Myrmecophagous and insectivorous species are indicated by ant and beetle silhouettes, respectively. Silhouettes were obtained from www.phylopic.org.

Figure 5: Expression of the nine chitinase paralogs in 72 transcriptomes from different organs of the three focal species: the nine-banded armadillo (Dasypus novemcinctus), the Malayan pangolin (Manis javanica), and the southern tamandua (Tamandua tetradactyla). Non-functional pseudogenes are represented by a Ψ symbol and hatched background. Boxes indicate organs of the digestive tract. Expression level is represented as log10 (Normalized Counts + 1). Silhouettes were obtained from www.phylopic.org.

Figure 6: Summary figure presenting the evolution and expression of chitinase acidic (CHIA) paralogous genes in the convergently evolved Malayan pangolin (Manis javanica) and southern tamandua (Tamandua tetradactyla) in their phylogenetic context. Reconstructed CHIA gene repertoires are indicated for the two myrmecophagous species and for the most recent common ancestor (MRCA) of placentals, pangolins+carnivores (Ferae) and anteaters+sloths (Pilosa). Non-functional pseudogenes are represented by the Ψ symbol and dashed line contour. Organ icons indicate expression of the corresponding gene in different digestive organs. SG: Salivary glands; S: Stomach; T: Tongue; P: Pancreas; L: Liver; I: Intestine. Silhouettes were obtained from www.phylopic.org and www.vecteezy.com.

Supplementary Materials

Table S1: Detailed information on the tissues sequenced or retrieved from public databases for the project.
Table S2: BUSCO v5 scores of all transcriptomes based on a dataset of 9,226 single-copy orthologs conserved in over 90% of mammalian species (Manni et al. 2021).

Zenodo supplementary files

CHIAs_OG_tree-RAxML_EPA.zip contains CHIA sequences (obtained from the OrthoFinder orthogroups and the sequences used to infer chitinase gene evolution) and the corresponding ML tree.

Chitinases_ancestral_sequences.zip contains the ancestral chitinase sequence reconstructions, the associated posterior probabilities, and the alignment of the ancestral sequences inferred by RAxML-NG.

Chitinases_gene_tree.zip contains input and output files corresponding to the chitinase gene tree presented in Figure 2:
- mammalina_species_tree_input_Generax.newick = species tree used for the reconciliation with Generax
- chitinase_gene_alignment_renamed_input_Generax.fasta = chitinase gene alignment with the sequence names renamed for Generax
- chitinase_gene_alignment_not_renamed.fasta = chitinase gene alignment with the original sequence names (for information)
- chitinases_gene_tree_sequences_renamed_input_Generax.newick = chitinase gene tree inferred with RAxML-NG and reconciled using the TreeRecs algorithm to find the optimal rooting scheme; this tree was used for Generax
- reconciled_chitinase_genes_tree_output_Generax.newick = reconciled chitinase gene tree inferred by Generax and presented in Figure 2

Chitinases_expression.zip contains the expression values of all orthogroup genes plus those of the chitinase genes.

Kallisto_abundances.zip contains the abundances estimated with kallisto for each organ of each species.

Supplementary table figure 2B - BLAST results.xlsx contains BLAST results supporting sequence inferences.

Transcriptome_assemblies.tar.gz contains the transcriptome assemblies obtained for each organ and species with Trinity.
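
The captions of Figures 4 and 5 report expression as log10 (Normalized Counts + 1). As a minimal sketch of that scale only, and not of the authors' normalization pipeline, the transformation could be written as below. The function name `log_expression` and the example counts are hypothetical and are not taken from the supplementary files.

```python
import math


def log_expression(normalized_counts):
    """Map each gene's normalized count to log10(count + 1).

    The +1 pseudocount keeps unexpressed genes (count 0) at exactly 0
    instead of an undefined log10(0).
    """
    scaled = {}
    for gene, count in normalized_counts.items():
        if count < 0:
            raise ValueError(f"negative count for {gene}: {count}")
        scaled[gene] = math.log10(count + 1.0)
    return scaled


# Hypothetical normalized counts, chosen only to illustrate the scale.
example_counts = {"CHIA1": 0.0, "CHIT1": 9.0, "CHIA5": 999.0}
scaled = log_expression(example_counts)

for gene, value in scaled.items():
    print(f"{gene}\t{value:.2f}")
```

On this scale, each unit step corresponds to roughly a tenfold difference in normalized counts, which is why genes with very different expression levels can share one color scale in the heatmaps.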
Molecular evolution, Evolutionary biology
