The enormous variation in mammalian lifespan is the result of each species’ adaptations to its own biological trade-offs and ecological conditions. Comparative genomics has demonstrated that the genomic factors underlying both species lifespans and the longevity of individuals are in part shared across the tree of life. Here, we compared protein-coding regions across the mammalian phylogeny to detect individual amino acid (AA) changes shared by the most long-lived mammals and genes whose rates of protein evolution correlate with longevity. We discovered a total of 2,737 AA in 2,004 genes that distinguish long- and short-lived mammals, significantly more than expected by chance (P = 0.003). These genes belong to pathways involved in regulating lifespan, such as inflammatory response and hemostasis. Among them, a total of 1,157 AA showed a significant association with maximum lifespan in a phylogenetic test. Interestingly, most of the detected AA positions do not vary in extant human populations (81.2%) or have allele frequencies below 1% (99.78%). Consequently, almost none of these putatively important variants could have been detected by genome-wide association studies. Additionally, we identified four more genes whose rate of protein evolution correlated with longevity in mammals. Crucially, SNPs located in the detected genes explain a larger fraction of human lifespan heritability than expected, successfully demonstrating for the first time that comparative genomics can be used to enhance interpretation of human genome-wide association studies. Finally, we show that the human longevity-associated proteins are significantly more stable than the orthologous proteins from short-lived mammals, strongly suggesting that general protein stability is linked to increased lifespan.
Increased risk-taking is a central component of bipolar disorder (BIP) and is implicated in schizophrenia (SCZ). Risky behaviours, including smoking and alcohol use, are overrepresented in both disorders and associated with poor health outcomes. Positive genetic correlations are reported, but an improved understanding of the shared genetic architecture between risk phenotypes and psychiatric disorders may provide insights into underlying neurobiological mechanisms. We aimed to characterise the genetic overlap between risk phenotypes and SCZ and BIP by estimating the total number of shared variants using the bivariate causal mixture model and identifying shared genomic loci using the conjunctional false discovery rate method. Summary statistics from genome-wide association studies of SCZ, BIP, risk-taking and risky behaviours were acquired (n = 82,315–466,751). Genomic loci were functionally annotated using FUMA. Of 8.6–8.7 K variants predicted to influence BIP, 6.6 K and 7.4 K were predicted to influence risk-taking and risky behaviours, respectively. Similarly, of 10.2–10.3 K variants influencing SCZ, 9.6 K and 8.8 K were predicted to influence risk-taking and risky behaviours, respectively. We identified 192 loci jointly associated with SCZ and risk phenotypes and 206 associated with BIP and risk phenotypes, of which 68 were common to both risk-taking and risky behaviours and 124 were novel to SCZ or BIP. Functional annotation implicated differential expression in multiple cortical and sub-cortical regions. In conclusion, we report extensive polygenic overlap between risk phenotypes and BIP and SCZ, identify specific loci contributing to this shared risk and highlight biologically plausible mechanisms that may underlie risk-taking in severe psychiatric disorders.
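The conjunctional false discovery rate step mentioned above can be illustrated with a minimal sketch. It assumes the usual definition, in which a variant's conjunctional FDR is the larger of its two conditional FDRs (trait A given trait B, and trait B given trait A). The variant names, the conditional FDR values and the 0.05 threshold below are hypothetical and are not taken from the study.

```python
def conjunctional_fdr(cond_fdr_a_given_b, cond_fdr_b_given_a):
    """Per-variant conjunctional FDR: the maximum of the two conditional FDRs."""
    return [max(a, b) for a, b in zip(cond_fdr_a_given_b, cond_fdr_b_given_a)]


def jointly_associated(variant_ids, conj_fdr, threshold=0.05):
    """Return the variants whose conjunctional FDR falls below the threshold."""
    return [v for v, c in zip(variant_ids, conj_fdr) if c < threshold]


# Hypothetical inputs: conditional FDRs for a psychiatric disorder given a
# risk phenotype, and for the risk phenotype given the disorder.
variants = ["rs_a", "rs_b", "rs_c", "rs_d"]
disorder_given_risk = [0.01, 0.20, 0.04, 0.03]
risk_given_disorder = [0.02, 0.01, 0.03, 0.60]

conj = conjunctional_fdr(disorder_given_risk, risk_given_disorder)
shared = jointly_associated(variants, conj)
print(conj)
print(shared)
```

Taking the maximum is a conservative choice: a variant is reported as shared only when the evidence is strong in both directions, so a variant that is convincing for one phenotype alone (such as `rs_b` or `rs_d` here) is excluded.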
Highlights
• Low-dose antibiotic exposure perturbs the infant mouse gut microbiome to PND10
• Frontal cortex and amygdala gene expression were substantially affected
• Multiple pathways underlying neurodevelopment were affected
• Specific gut microbial populations were linked with expression of particular genes
Summary
We have established experimental systems to assess the effects of early-life exposures to antibiotics on the intestinal microbiota and gene expression in the brain. This model system is highly relevant to human exposure and may be developed into a preclinical model of neurodevelopmental disorders in which the gut–brain axis is perturbed, leading to organizational effects that permanently alter the structure and function of the brain. Exposing newborn mice to low-dose penicillin led to substantial changes in intestinal microbiota population structure and composition. Transcriptomic alterations implicate pathways perturbed in neurodevelopmental and neuropsychiatric disorders. Bioinformatic interrogation also revealed substantial effects on frontal cortex and amygdala gene expression, affecting multiple pathways underlying neurodevelopment. Informatic analyses established linkages between specific intestinal microbial populations and the early-life expression of particular affected genes. These studies provide translational models to explore intestinal microbiome roles in the normal and abnormal maturation of the vulnerable central nervous system.
Generative models have shown breakthroughs in a wide spectrum of domains due to recent advancements in machine learning algorithms and increased computational power. Despite these impressive achievements, the ability of generative models to create realistic synthetic data is still under-exploited in genetics and absent from population genetics. Yet a known limitation in the field is the reduced access to many genetic databases due to concerns about violations of individual privacy, although they would provide a rich resource for data mining and integration towards advancing genetic studies. In this study, we demonstrate that deep generative adversarial networks (GANs) and restricted Boltzmann machines (RBMs) can be trained to learn the complex distributions of real genomic datasets and generate novel high-quality artificial genomes (AGs) with little to no privacy loss. We show that our generated AGs replicate characteristics of the source dataset such as allele frequencies, linkage disequilibrium, pairwise haplotype distances and population structure. Moreover, they can also inherit complex features such as signals of selection. To illustrate the promising outcomes of our method, we show that imputation quality for low-frequency alleles can be improved by augmenting reference panels with AGs and that the RBM latent space provides a relevant encoding of the data, hence allowing further exploration of the reference dataset and features for solving supervised tasks. Generative models and AGs have the potential to become valuable assets in genetic studies by providing a rich yet compact representation of existing genomes and high-quality, easy-access and anonymous alternatives for private databases.
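One of the checks described above, whether artificial genomes reproduce the allele frequencies of the source dataset, can be sketched as follows. This is an illustrative example only, not the authors' evaluation code: it assumes haplotypes are encoded as rows of 0/1 alleles, uses tiny made-up matrices, and takes the Pearson correlation of per-site frequencies as one possible summary of agreement.

```python
from math import sqrt


def allele_frequencies(haplotypes):
    """Frequency of allele 1 at each site of a 0/1 haplotype matrix (rows = haplotypes)."""
    n_haplotypes = len(haplotypes)
    n_sites = len(haplotypes[0])
    return [sum(h[j] for h in haplotypes) / n_haplotypes for j in range(n_sites)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: 4 haplotypes x 4 sites each.
real = [[0, 1, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0], [0, 1, 1, 0]]
artificial = [[0, 1, 1, 0], [0, 1, 0, 1], [1, 1, 1, 0], [0, 0, 1, 0]]

real_freq = allele_frequencies(real)
artificial_freq = allele_frequencies(artificial)
agreement = pearson(real_freq, artificial_freq)
print(real_freq, artificial_freq, round(agreement, 3))
```

A correlation close to 1 would indicate that the generated haplotypes preserve per-site frequencies. The same pattern extends to the other summaries listed in the abstract, such as linkage disequilibrium or pairwise haplotype distances, by swapping in the corresponding statistic.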
Family health history (FHx) is an effective tool for identifying patients at risk of hereditary cancer. Hereditary cancer clinical practice guidelines (CPGs) contain criteria used to evaluate FHx and to make recommendations for genetic consultation. Comparing different CPGs used to evaluate a common set of FHx provides insight into how well the CPGs perform, the extent of agreement across guidelines, and how well they identify patients who should consider a cancer genetic consultation.
Studies on humans and animals suggest associations between gestational diabetes mellitus (GDM) and increased susceptibility to neurological disorders in offspring. However, the molecular mechanisms underpinning the intergenerational effects remain unclear. Using a mouse model of diabetes during pregnancy, we found that intrauterine hyperglycemia exposure resulted in memory impairment in both the first filial (F1) males and the second filial (F2) males from the F1 male offspring. Transcriptome profiling of F1 and F2 hippocampi revealed that differentially expressed genes (DEGs) were enriched in neurodevelopment and synaptic plasticity. Reduced representation bisulfite sequencing (RRBS) of sperm from F1 adult males showed that intrauterine hyperglycemia exposure altered DNA methylation in F1 sperm, which is a potential epigenetic mechanism for the intergenerational neurocognitive effects of GDM.
Hitching a ride with a retroelement Retroviruses and retroelements have inserted their genetic code into mammalian genomes throughout evolution. Although many of these integrated virus-like sequences pose a threat to genomic integrity, some have been retooled by mammalian cells to perform essential roles in development. Segel et al. found that one of these retroviral-like proteins, PEG10, directly binds to and secretes its own mRNA in extracellular virus–like capsids. These virus-like particles were then pseudotyped with fusogens to deliver functional mRNA cargos to mammalian cells. This potentially provides an endogenous vector for RNA-based gene therapy. Science, abg6155, this issue p. 882 Eukaryotic genomes contain domesticated genes from integrating viruses and mobile genetic elements. Among these are homologs of the capsid protein (known as Gag) of long terminal repeat (LTR) retrotransposons and retroviruses. We identified several mammalian Gag homologs that form virus-like particles and one LTR retrotransposon homolog, PEG10, that preferentially binds and facilitates vesicular secretion of its own messenger RNA (mRNA). We showed that the mRNA cargo of PEG10 can be reprogrammed by flanking genes of interest with Peg10’s untranslated regions. Taking advantage of this reprogrammability, we developed selective endogenous encapsidation for cellular delivery (SEND) by engineering both mouse and human PEG10 to package, secrete, and deliver specific RNAs. Together, these results demonstrate that SEND is a modular platform suited for development as an efficient therapeutic delivery modality. The retrotransposon-derived Gag protein PEG10 can facilitate efficient and specific intercellular delivery of mRNAs in mammalian cells.
Many DNA-hypermethylated cancer genes are occupied by the Polycomb (PcG) repressor complex in embryonic stem cells (ESCs). Their prevalence in the full spectrum of cancers, the exact context of chromatin involved, and their status in adult cell renewal systems are unknown. Using a genome-wide analysis, we demonstrate that ∼75% of hypermethylated genes are marked by PcG in the context of bivalent chromatin in both ESCs and adult stem/progenitor cells. A large number of these genes are key developmental regulators, and a subset, which we call the “DNA hypermethylation module,” comprises a portion of the PcG target genes that are down-regulated in cancer. Genes with bivalent chromatin have a low, poised gene transcription state that has been shown to maintain stemness and self-renewal in normal stem cells. However, when DNA-hypermethylated in tumors, we find that these genes are further repressed. We also show that the methylation status of these genes can cluster important subtypes of colon and breast cancers. By evaluating the subsets of genes that are methylated in different cancers with consideration of their chromatin status in ESCs, we provide evidence that DNA hypermethylation preferentially targets the subset of PcG genes that are developmental regulators, and this may contribute to the stem-like state of cancer. Additionally, the capacity for global methylation profiling to cluster tumors by phenotype may have important implications for further refining tumor behavior patterns that may ultimately aid therapeutic interventions.
Stable epigenetic changes appear uncommon, suggesting that changes typically dissipate or are repaired. Changes that stably alter gene expression across generations presumably require particular conditions that are currently unknown. Here we report that a minimal combination of cis-regulatory sequences can support permanent RNA silencing of a single-copy transgene and its derivatives in C. elegans simply upon mating. Mating disrupts competing RNA-based mechanisms to initiate silencing that can last for >300 generations. This stable silencing requires components of the small RNA pathway and can silence homologous sequences in trans. While animals do not recover from mating-induced silencing, they often recover from and become resistant to trans silencing. Recovery is also observed in most cases when double-stranded RNA is used to silence the same coding sequence in different regulatory contexts that drive germline expression. Therefore, we propose that regulatory features can evolve to oppose permanent and potentially maladaptive responses to transient change.
We hypothesized that the highly controlled pattern of gene expression that is essential for liver regeneration is encoded by an epigenetic code set in quiescent hepatocytes. Here we report that epigenetic and transcriptomic profiling of quiescent and regenerating mouse livers defines chromatin states that dictate gene expression and transposon repression. We integrate ATACseq and DNA methylation profiling with ChIPseq for the histone marks H3K4me3, H3K27me3 and H3K9me3 and the histone variant H2AZ to identify 6 chromatin states with distinct functional characteristics. We show that genes involved in proliferation reside in active states, but are marked with H3K27me3 and silenced in quiescent livers. We find that during regeneration, H3K27me3 is depleted from their promoters, facilitating their dynamic expression. These findings demonstrate that hepatic chromatin states in quiescent livers predict gene expression and that pro-regenerative genes are maintained in active chromatin states, but are restrained by H3K27me3, permitting a rapid and synchronized response during regeneration.