To obtain a faculty position, postdoctoral fellows, or ‘postdocs’, must submit an application that requires considerable time and effort to produce. These job applications are often reviewed by mentors and colleagues, but postdocs are rarely offered the opportunity to solicit feedback multiple times from reviewers with the breadth of expertise often found on an academic search committee. To address this gap, this manuscript describes an international peer reviewing program in which small groups of postdocs with a broad range of expertise reciprocally and iteratively provide feedback to each other on their application materials. Over 145 postdocs have participated, often multiple times, over three years. A survey of participants in this program revealed that nearly all participants would recommend participation in such a program to other faculty applicants. Furthermore, this program was more likely to attract participants who struggled to find mentoring and support elsewhere, either because they changed fields or because of their identity as a woman or member of an underrepresented population in STEM. Participation in programs like this one could provide early-career academics like postdocs with a diverse and supportive community of peer mentors during the difficult search for a faculty position. Such psychosocial support and encouragement have been shown to prevent attrition of individuals from these populations, and programs like this one target the largest ‘leak’ in the pipeline, the transition from postdoc to faculty. Implementation of similar peer reviewing programs by universities or professional scientific societies could provide a valuable mechanism of support and increased chances of success for early-career academics in their search for independence.
The emergence of drug-resistant bacteria calls for the discovery of new antibiotics. Yet, for decades, traditional discovery strategies have not yielded new classes of antimicrobial. Here, by mining the human proteome via an algorithm that relies on the sequence length, net charge, average hydrophobicity and other physicochemical properties of antimicrobial peptides, we report the identification of 2,603 encrypted peptide antibiotics that are encoded in proteins with biological function unrelated to the immune system. We show that the encrypted peptides kill pathogenic bacteria by targeting their membrane, modulate gut and skin commensals, do not readily select for bacterial resistance, and possess anti-infective activity in skin abscess and thigh infection mouse models. We also show, in vitro and in the two mouse models of infection, that encrypted antibiotic peptides from the same biogeographical area display synergistic antimicrobial activity. Our algorithmic strategy allows for the rapid mining of proteomic data and opens up new routes for the discovery of candidate antibiotics.
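The kind of physicochemical scan described above can be illustrated with a minimal sketch. This is not the authors' algorithm: the window lengths, the charge threshold, the hydrophobicity range, and the function names are all hypothetical choices made for illustration. Hydrophobicity uses the Kyte-Doolittle scale, and net charge is approximated by counting only K/R (+1) and D/E (-1) side chains.

```python
# Illustrative sliding-window scan for cationic peptide fragments.
# All thresholds below are assumptions, not values from the source.

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charges at neutral pH (His and termini ignored).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(peptide):
    """Approximate net charge from K/R and D/E counts."""
    return sum(CHARGE.get(aa, 0) for aa in peptide)


def mean_hydrophobicity(peptide):
    """Average Kyte-Doolittle hydropathy over the peptide."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def scan_protein(sequence, min_len=8, max_len=12, min_charge=3,
                 hydro_range=(-2.0, 1.0)):
    """Return (start, peptide, charge, hydrophobicity) for every window
    whose charge and mean hydrophobicity pass the hypothetical cutoffs."""
    low, high = hydro_range
    hits = []
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            peptide = sequence[start:start + length]
            charge = net_charge(peptide)
            hydro = mean_hydrophobicity(peptide)
            if charge >= min_charge and low <= hydro <= high:
                hits.append((start, peptide, charge, hydro))
    return hits


if __name__ == "__main__":
    toy_protein = "GGGGKRLKKLAKRLGGGG"  # made-up sequence, not real data
    for hit in scan_protein(toy_protein, min_len=8, max_len=8):
        print(hit)
```

A real pipeline would presumably combine more properties and a tuned scoring function; the sketch only shows how length, charge, and hydrophobicity can jointly filter candidate fragments from a parent protein.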
DeepMind presented remarkably accurate predictions at the recent CASP14 protein structure prediction assessment conference. We explored network architectures incorporating related ideas and obtained the best performance with a three-track network in which information at the 1D sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid solution of challenging X-ray crystallography and cryo-EM structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short circuiting traditional approaches which require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.
Many DNA-hypermethylated cancer genes are occupied by the Polycomb (PcG) repressor complex in embryonic stem cells (ESCs). Their prevalence in the full spectrum of cancers, the exact context of chromatin involved, and their status in adult cell renewal systems are unknown. Using a genome-wide analysis, we demonstrate that ∼75% of hypermethylated genes are marked by PcG in the context of bivalent chromatin in both ESCs and adult stem/progenitor cells. A large number of these genes are key developmental regulators, and a subset, which we call the “DNA hypermethylation module,” comprises a portion of the PcG target genes that are down-regulated in cancer. Genes with bivalent chromatin have a low, poised gene transcription state that has been shown to maintain stemness and self-renewal in normal stem cells. However, when DNA-hypermethylated in tumors, we find that these genes are further repressed. We also show that the methylation status of these genes can cluster important subtypes of colon and breast cancers. By evaluating the subsets of genes that are methylated in different cancers with consideration of their chromatin status in ESCs, we provide evidence that DNA hypermethylation preferentially targets the subset of PcG genes that are developmental regulators, and this may contribute to the stem-like state of cancer. Additionally, the capacity for global methylation profiling to cluster tumors by phenotype may have important implications for further refining tumor behavior patterns that may ultimately aid therapeutic interventions.
The application of machine and deep learning methods in drug discovery and cancer research has gained considerable attention in recent years. As the field grows, it becomes crucial to systematically evaluate the performance of novel computational solutions in relation to established techniques. To this end, we compare rule-based and data-driven molecular representations in the prediction of drug combination sensitivity and drug synergy scores using standardized results of 14 high-throughput screening studies, comprising 64 200 unique combinations of 4 153 molecules tested in 112 cancer cell lines. We evaluate the clustering performance of molecular representations and quantify their similarity by adapting the Centered Kernel Alignment metric. Our work demonstrates that to identify an optimal molecular representation type it is necessary to supplement quantitative benchmark results with qualitative considerations, such as model interpretability and robustness, which may vary between and throughout preclinical drug development projects.
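For readers unfamiliar with Centered Kernel Alignment (CKA), the standard linear form compares two representations of the same n items through their centered Gram matrices:

CKA(K, L) = HSIC(K, L) / sqrt(HSIC(K, K) * HSIC(L, L))

The sketch below implements this textbook linear CKA in plain Python. It is not the adapted metric used in the study, whose details are not given here; the function names and the toy inputs are assumptions for illustration.

```python
import math


def gram_linear(X):
    """Linear-kernel Gram matrix: K[i][j] = <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def center(K):
    """Double-center a square matrix (equivalent to H K H, H = I - 11^T/n)."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic(Kc, Lc):
    """Unnormalized HSIC: trace(Kc Lc) for symmetric centered matrices."""
    n = len(Kc)
    return sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n))


def linear_cka(X, Y):
    """Linear CKA between two representations of the same n items.
    X and Y are lists of n feature vectors; feature counts may differ."""
    Kc = center(gram_linear(X))
    Lc = center(gram_linear(Y))
    return hsic(Kc, Lc) / math.sqrt(hsic(Kc, Kc) * hsic(Lc, Lc))


if __name__ == "__main__":
    # Toy "fingerprints" for four molecules in two hypothetical encodings.
    rep_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]]
    rep_b = [[0.3], [1.2], [-0.7], [2.0]]
    print(linear_cka(rep_a, rep_b))
```

Linear CKA is invariant to isotropic scaling and to orthogonal transformations of the features, which is what makes it usable for comparing representations of different dimensionality.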
All cellular functions are governed by complex molecular machines that assemble through protein-protein interactions. Their atomic details are critical to the study of their molecular mechanisms, but fewer than 5% of hundreds of thousands of human interactions have been structurally characterized. Here, we test the potential and limitations of recent progress in deep-learning methods using AlphaFold2 to predict structures for 65,484 human interactions. We show that higher-confidence models are enriched in interactions supported by affinity- or structure-based methods and can be orthogonally confirmed by spatial constraints defined by cross-link data. We identify 3,137 high-confidence models, of which 1,371 have no homology to a known structure, from which we identify interface residues harbouring disease mutations, suggesting potential mechanisms for pathogenic variants. We find groups of interface phosphorylation sites that show patterns of co-regulation across conditions, suggestive of coordinated tuning of multiple interactions as signalling responses. Finally, we provide examples of how the predicted binary complexes can be used to build larger assemblies. Accurate prediction of protein complexes promises to greatly expand our understanding of the atomic details of human cell biology in health and disease.
The multidimensional nature of obsessive-compulsive disorder (OCD) has been consistently reported. Clinical and biological characteristics have been associated with OCD dimensions in different ways. Studies suggest the existence of specific genetic bases for the different OCD dimensions. In this study, we analyze the genomic markers, genes, gene ontology and biological pathways associated with the presence of aggressive/checking, symmetry/order, contamination/cleaning, hoarding, and sexual/religious symptoms, as assessed via the Dimensional Yale-Brown Obsessive Compulsive Scale (DY-BOCS) in 399 probands. Logistic regression analyses were performed at the single-nucleotide polymorphism (SNP) level. Gene-based and enrichment analyses were carried out for common (SNPs) and rare variants. No SNP was associated with any dimension at a genome-wide level (p
In a randomized, double-blind trial of 51 cancer patients with life-threatening diagnoses and symptoms of depression/anxiety, high-dose psilocybin produced significant improvement across several markers of mental health and well-being.
Genome-wide transcriptome profiling identifies genes that are prone to differential expression across contexts ("common DEGs"), as well as specific changes relevant to the experimental manipulation. Distinguishing common DEGs from those that are specifically changed in a context of interest will allow more efficient inference of relevant mechanisms and a more systematic understanding of the biological process under scrutiny. Currently, common changes can only be identified through the laborious manual curation of highly controlled experiments, an inordinately time-consuming and impractical endeavor. Here we pioneer a method for identifying common patterns using generative neural networks. This method produces a background set of transcriptomic experiments from which a gene- and pathway-specific null distribution can be generated. By comparing the set of differentially expressed genes found in a target experiment against the background set, common results can easily be separated from specific ones. This "Specific cOntext Pattern Highlighting In Expression data" (SOPHIE) method is broadly applicable to new platforms or any species with a large collection of unannotated gene expression data. We apply SOPHIE to diverse datasets, including human, human cancer, and bacterial datasets. SOPHIE recapitulates previously described common DEGs, and our molecular validation indicates that it detects highly specific, but low-magnitude, biologically relevant transcriptional changes. SOPHIE's measure of specificity can complement log fold change activity generated from traditional differential expression analyses by, for example, filtering the set of changed genes to identify those that are specifically relevant to the experimental condition of interest. Consequently, these results can inform future research directions.
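The idea of scoring a gene against a gene-specific null distribution can be sketched as follows. This is an illustrative simplification, not the SOPHIE implementation: the background values here are supplied by hand rather than generated by a neural network, the z-score statistic is an assumed choice, and the function and gene names are hypothetical.

```python
from statistics import mean, stdev


def specificity_z_scores(target_lfc, background_lfc):
    """Score each gene's |log fold change| in the target experiment
    against its own null distribution from background experiments.

    target_lfc:     {gene: log fold change in the target experiment}
    background_lfc: {gene: [log fold changes across background experiments]}
    Returns {gene: z-score}; a high score suggests a context-specific change.
    """
    scores = {}
    for gene, observed in target_lfc.items():
        null = [abs(v) for v in background_lfc[gene]]
        mu, sigma = mean(null), stdev(null)
        # A gene with zero variance in the background cannot be scored.
        scores[gene] = (abs(observed) - mu) / sigma if sigma > 0 else float("nan")
    return scores


if __name__ == "__main__":
    # Made-up numbers: geneA changes strongly everywhere (a "common DEG"),
    # geneB changes modestly but only in the target experiment.
    target = {"geneA": 3.0, "geneB": 1.0}
    background = {
        "geneA": [2.8, 3.1, 2.9, 3.2],
        "geneB": [0.1, 0.05, 0.2, 0.1],
    }
    for gene, z in specificity_z_scores(target, background).items():
        print(gene, round(z, 2))
```

In this toy input, the large change in geneA is unremarkable relative to its background, while the small change in geneB stands out, which mirrors how a specificity measure can complement log fold change when filtering results.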
We present a proteogenomic study of 108 human papilloma virus (HPV)-negative head and neck squamous cell carcinomas (HNSCCs). Proteomic analysis systematically catalogs HNSCC-associated proteins and phosphosites, prioritizes copy number drivers, and highlights an oncogenic role for RNA processing genes. Proteomic investigation of mutual exclusivity between FAT1 truncating mutations and 11q13.3 amplifications reveals dysregulated actin dynamics as a common functional consequence. Phosphoproteomics characterizes two modes of EGFR activation, suggesting a new strategy to stratify HNSCCs based on EGFR ligand abundance for effective treatment with inhibitory EGFR monoclonal antibodies. Widespread deletion of immune modulatory genes accounts for low immune infiltration in immune-cold tumors, whereas concordant upregulation of multiple immune checkpoint proteins may underlie resistance to anti-programmed cell death protein 1 monotherapy in immune-hot tumors. Multi-omic analysis identifies three molecular subtypes with high potential for treatment with CDK inhibitors, anti-EGFR antibody therapy, and immunotherapy, respectively. Altogether, proteogenomics provides a systematic framework to inform HNSCC biology and treatment.