This study explores the evolution of CD44 expression in therian mammals in both skin fibroblasts (SF) and endometrial stromal fibroblasts (ESF) and demonstrates that the human lineage has experienced a concerted evolutionary enhancement of CD44 expression, correlating with an increase in human vulnerability to cancer malignancy.
The results suggest that therapeutic modulation of CD44 expression in skin fibroblasts could attenuate the cancer-promoting effect of cancer associated fibroblasts with minimal side effects on other cell types.
Samuel Hibdige, Pauline Raimondeau, Pascal-Antoine Christin, Luke Dunning
Published: Feb 2020
Background: Lateral gene transfer (LGT) has been documented in a broad range of eukaryotes, where it can promote adaptation. In plants, LGT of functional nuclear genes has been repeatedly reported in parasitic plants, ferns and grasses, but the exact extent of the phenomenon remains unknown. Systematic studies are now needed to identify the factors that govern the frequency of LGT among plants. Results: Here we scan the genomes of a diverse set of grass species that span more than 50 million years of divergence and include major crops. We identify protein coding LGT in a majority of them (13 out of 17). There is variation among species in the amount of LGT received, with rhizomatous species receiving more genes. In addition, the amount of LGT increases with phylogenetic relatedness, which might reflect genomic compatibility among close relatives facilitating successful transfers. However, we also observe genetic exchanges among distantly related species that diverged shortly after the origin of the grass family when they co-occur in the wild, pointing to a role of biogeography. The dynamics of successful LGT in grasses therefore appear to be dependent on both opportunity (co-occurrence and rhizomes) and compatibility (phylogenetic distance). Conclusion: Overall, we show that LGT is a widespread phenomenon in grasses, which is boosted by repeated contact among related lineages. The process has moved functional genes across the entire grass family into domesticated and wild species alike.
We have developed an analysis pipeline to facilitate real-time mutation tracking in SARS-CoV-2, focusing initially on the Spike (S) protein because it mediates infection of human cells and is the target of most vaccine strategies and antibody-based therapeutics. To date we have identified fourteen mutations in Spike that are accumulating. Mutations are considered in a broader phylogenetic context, geographically, and over time, to provide an early warning system to reveal mutations that may confer selective advantages in transmission or resistance to interventions. Each one is evaluated for evidence of positive selection, and the implications of the mutation are explored through structural modeling. The mutation Spike D614G is of urgent concern; after beginning to spread in Europe in early February, when introduced to new regions it repeatedly and rapidly becomes the dominant form. Also, we present evidence of recombination between locally circulating strains, indicative of multiple strain infections. These findings have important implications for SARS-CoV-2 transmission, pathogenesis and immune interventions.
Competing Interest Statement: The authors have declared no competing interest.
Debaleena Bhowmik, Sourav Pal, Abhishake Lahiri, Arindam Talukdar, Sandip Paul
Published: Apr 2020
This study explores the divergence pattern of SARS-CoV-2 using whole genome sequences of the isolates from various COVID-19 affected countries. The phylogenomic analysis indicates the presence of at least four distinct groups of the SARS-CoV-2 genomes. The emergent groups have been found to be associated with signature structural changes in specific proteins. Also, this study reveals the differential levels of divergence patterns for the protein coding regions. Moreover, we have predicted the impact of structural changes on a couple of important viral proteins via structural modelling techniques. This study further advocates for more viral genetic studies with associated clinical outcomes and host response data, for a better understanding of SARS-CoV-2 pathogenesis that enables better mitigation of this pandemic situation.
Competing Interest Statement: The authors have declared no competing interest.
Joseph Walker, Xing-Xing Shen, Antonis Rokas, Stephen Smith, Edwige Moyroud
Published: Apr 2020
The genomic data revolution has enabled biologists to develop innovative ways to infer key episodes in the history of life. Whether genome-scale data will eventually resolve all branches of the Tree of Life remains uncertain. However, through novel means of interrogating data, some explanations for why evolutionary relationships remain recalcitrant are emerging. Here, we provide four biological and analytical factors that explain why certain genes may exhibit "outlier" behavior, namely, rate of molecular evolution, alignment length, misidentified orthology, and errors in modeling. Using empirical and simulated data we show how excluding genes based on their likelihood or inferring processes from the topology they support in a supermatrix can mislead biological inference of conflict. We next show that alignment length accounts for the high influence of two genes reported in empirical datasets. Finally, we also reiterate the impact that misidentified orthology and short alignments have on likelihoods in large-scale phylogenetics. We suggest that researchers should systematically investigate and describe the source of influential genes, as opposed to discarding them as outliers. Disentangling whether analytical or biological factors are the source of outliers will help uncover new patterns and processes that are shaping the Tree of Life.
Competing Interest Statement: The authors have declared no competing interest.
Gene regulatory changes underlie much of phenotypic evolution. However, the evolutionary potential of regulatory evolution is unknown, because most evidence comes from either natural variation or limited experimental perturbations. Surveying an unbiased mutation library for a developmental enhancer in Drosophila melanogaster using an automated robotics pipeline, we found that most mutations alter gene expression. Our results suggest that regulatory information is distributed throughout most of a developmental enhancer and that parameters of gene expression (levels, location, and state) are convolved. The widespread pleiotropic effects of most mutations and the codependency of outputs may constrain the evolvability of developmental enhancers. Consistent with these observations, comparisons of diverse drosophilids reveal mainly stasis and apparent biases in the phenotypes influenced by this enhancer. Developmental enhancers may encode a much higher density of regulatory information than has been appreciated previously, which may impose constraints on regulatory evolution.
Competing Interest Statement: The authors have declared no competing interest.
Casper Lumby, Lei Zhao, Judy Breuer, Christopher Illingworth
Published: Apr 2020
Strains of the influenza virus form coherent global populations, yet exist at the level of single infections in individual hosts. The relationship between these scales is a critical topic for understanding viral evolution. Here we investigate the within-host relationship between selection and the stochastic effects of genetic drift, estimating an effective population size of infection Ne for influenza infection. Examining whole-genome sequence data describing a chronic case of influenza B in a severely immunocompromised child we infer an Ne of 2.5 × 10^7 (95% confidence range 1.0 × 10^7 to 9.0 × 10^7) suggesting the importance of genetic drift to be minimal. Our result, supported by the analysis of data from influenza A infection, suggests that positive selection during within-host infection is primarily limited by the typically short period of infection. Atypically long infections may have a disproportionate influence upon global patterns of viral evolution.
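As a rough illustration of why an effective population size of this magnitude implies minimal drift, the sketch below computes the per-generation standard deviation of allele-frequency change under a haploid Wright-Fisher model, sqrt(p(1 - p) / Ne). The choice of model and the function name `drift_sd` are assumptions made here for illustration; the abstract does not describe the authors' inference method, and this is not it.

```python
import math


def drift_sd(p: float, ne: float) -> float:
    """Standard deviation of the one-generation change in allele frequency
    under a haploid Wright-Fisher model (an illustrative assumption).

    p  : current allele frequency, between 0 and 1
    ne : effective population size, must be positive
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if ne <= 0:
        raise ValueError("ne must be positive")
    # Binomial sampling of ne haploid genomes gives variance p(1 - p) / ne.
    return math.sqrt(p * (1.0 - p) / ne)


if __name__ == "__main__":
    # Point estimate and the bounds of the reported 95% range.
    for ne in (1.0e7, 2.5e7, 9.0e7):
        print(f"Ne = {ne:.1e}: drift sd at p = 0.5 is {drift_sd(0.5, ne):.2e}")
```

Under these assumptions, an allele at intermediate frequency fluctuates by roughly one part in ten thousand per generation at the point estimate, which is consistent with drift playing a minor role relative to selection.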
Terraces in phylogenetic tree space are sets of trees with identical optimality scores for a given data set, arising from missing data. These were first described for multilocus phylogenetic data sets in the context of maximum parsimony inference and maximum likelihood inference under certain model assumptions. Here we show how the mathematical properties that lead to terraces extend to gene tree - species tree problems in which the gene trees are incomplete. Inference of species trees from either sets of gene family trees subject to duplication and loss, or allele trees subject to incomplete lineage sorting, can exhibit terraces in their solution space. First, we show conditions that lead to a new kind of terrace, which stems from subtree operations that appear in reconciliation problems for incomplete trees. Then we characterize when terraces of both types can occur when the optimality criterion for tree search is based on duplication, loss or deep coalescence scores. Finally, we examine the impact of assumptions about the causes of losses: whether they are due to imperfect sampling or true evolutionary deletion.
Competing Interest Statement: The authors have declared no competing interest.
Whole genome duplication (WGD) has occurred in relatively few sexually reproducing invertebrates. Consequently, the WGD that occurred in the common ancestor of horseshoe crabs ~135 million years ago provides a rare opportunity to decipher the evolutionary consequences of a duplicated invertebrate genome. Here, we present a high-quality genome assembly for the mangrove horseshoe crab Carcinoscorpius rotundicauda (1.7 Gb, N50 = 90.2 Mb, with 89.8% sequences anchored to 16 pseudomolecules, 2n = 32), and a resequenced genome of the tri-spine horseshoe crab Tachypleus tridentatus (1.7 Gb, N50 = 109.7 Mb). Analyses of gene families, microRNAs, and synteny show that horseshoe crabs have undergone three rounds (3R) of WGD, and that these WGD events are shared with spiders. Comparison of the genomes of C. rotundicauda and T. tridentatus populations from several geographic locations further elucidates the diverse fates of both coding and noncoding genes. Together, the present study represents a cornerstone for a better understanding of the consequences of invertebrate WGD events on evolutionary fates of genes and microRNAs at individual and population levels, and highlights the genetic diversity with practical values for breeding programs and conservation of horseshoe crabs.
Competing Interest Statement: The authors have declared no competing interest.
Sheep were among the first domesticated animals, but their demographic history is little understood. Here we present combined analyses of mitochondrial and nuclear polymorphism data from ancient central and west Anatolian sheep dating to the Late Glacial and early Holocene. We observe loss of mitochondrial haplotype diversity around 7500 BCE during the early Neolithic, consistent with a domestication-related bottleneck. Post-7000 BCE, mitochondrial haplogroup diversity increases, compatible with admixture from other domestication centres and/or from wild populations. Analysing archaeogenomic data, we further find that Anatolian Neolithic sheep (ANS) are genetically closest to present-day European breeds, and especially those from central and north Europe. Our results indicate that Asian contribution to south European breeds in the post-Neolithic era, possibly during the Bronze Age, may explain this pattern.
Competing Interest Statement: The authors have declared no competing interest.