Subject motion can introduce noise into neuroimaging data and result in biased estimations of brain structure. In-scanner motion can compromise data quality in a number of ways and varies widely across developmental and clinical populations. However, quantification of structural image quality is often limited to proxy or indirect measures gathered from functional scans; such measures may miss true differences related to these potential artifacts. In this study, we take advantage of a novel informatics tool, the CAT12 toolbox, to more directly measure image quality from T1-weighted images and to understand whether these measures of image quality: (1) relate to rigorous quality-control checks visually completed by human raters; (2) are associated with sociodemographic variables of interest; and (3) influence regional estimates of cortical surface area, cortical thickness, and subcortical volumes from the commonly used FreeSurfer tool suite. We leverage public-access data that include a community-based sample of children and adolescents spanning a large age range (N = 388; ages 5–21). Even after visually inspecting our data, we find that image quality significantly impacts derived cortical surface area, cortical thickness, and subcortical volumes from multiple regions across the brain (~23.4% of all areas investigated). We believe these results are important for research groups completing structural MRI studies using FreeSurfer or other morphometric tools. As such, future studies should consider using measures of image quality to minimize the influence of this potential confound in group comparisons or studies focused on individual differences.
Objective To investigate the role of salivary small non-coding RNAs (sncRNAs) in the diagnosis of sport-related concussion. Methods Saliva was obtained from male professional players in the top two tiers of England’s elite rugby union competition across two seasons (2017–2019). Samples were collected preseason from 1028 players, and during standardised head injury assessments (HIAs) at three time points (in-game, post-game, and 36–48 hours post-game) from 156 of these. Samples were also collected from controls (102 uninjured players and 66 players sustaining a musculoskeletal injury). Diagnostic sncRNAs were identified with next-generation sequencing and validated using quantitative PCR in 702 samples. A predictive logistic regression model was built on 2017–2018 data (training dataset) and prospectively validated the following season (test dataset). Results The HIA process confirmed concussion in 106 players (HIA+) and excluded this in 50 (HIA−). 32 sncRNAs were significantly differentially expressed across these two groups, with let-7f-5p showing the highest area under the curve (AUC) at 36–48 hours. Additionally, a combined panel of 14 sncRNAs (let-7a-5p, miR-143-3p, miR-103a-3p, miR-34b-3p, RNU6-7, RNU6-45, Snora57, snoU13.120, tRNA18Arg-CCT, U6-168, U6-428, U6-1249, Uco22cjg1, YRNA_255) could differentiate concussed subjects from all other groups, including players who were HIA− and controls, immediately after the game (AUC 0.91, 95% CI 0.81 to 1) and 36–48 hours later (AUC 0.94, 95% CI 0.86 to 1). When prospectively tested, the panel confirmed high predictive accuracy (AUC 0.96, 95% CI 0.92 to 1 post-game and AUC 0.93, 95% CI 0.86 to 1 at 36–48 hours). Conclusions SCRUM, a large prospective observational study of non-invasive concussion biomarkers, has identified unique signatures of concussion in saliva of male athletes diagnosed with concussion.
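The AUC values quoted above summarise how well a model score separates two groups. As a minimal sketch (not the study's analysis pipeline), the AUC can be computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name `auc_from_scores` and the toy scores are hypothetical and are not data from the study.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, which makes this equal to the normalised
    Mann-Whitney U statistic.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy, made-up model outputs (illustration only, not study data).
toy_concussed = [0.9, 0.8, 0.7, 0.4]
toy_not_concussed = [0.6, 0.3, 0.2, 0.1]
toy_auc = auc_from_scores(toy_concussed, toy_not_concussed)
print(toy_auc)
```

The pairwise form is quadratic in the number of samples, which is fine for cohorts of this size; a rank-based implementation would be preferable for much larger datasets.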
Using an electronic health records network we estimated the absolute incidence of cerebral venous thrombosis (CVT) in the two weeks following COVID-19 diagnosis (N=513,284), or influenza (N=172,742), or receipt of the BNT162b2 or mRNA-1273 COVID-19 vaccines (N=489,871). The incidence of portal vein thrombosis (PVT) was also assessed in these groups, as well as the baseline CVT incidence over a two-week period. The incidence of CVT after COVID-19 diagnosis was 39.0 per million people (95% CI, 25.2–60.2). This was higher than the CVT incidence after influenza (0.0 per million people, 95% CI 0.0–22.2, adjusted RR=6.73, P=.003) or after receiving BNT162b2 or mRNA-1273 vaccine (4.1 per million people, 95% CI 1.1–14.9, adjusted RR=6.36, P<.001). The relative risks were similar if a broader definition of CVT was used. For PVT, the incidence was 436.4 per million people (382.9–497.4) after COVID-19, 98.4 (61.4–157.6) after influenza, and 44.9 (29.7–68.0) after BNT162b2 or mRNA-1273. The incidence of CVT following COVID-19 was higher than the incidence observed across the entire health records network (0.41 per million people over any 2-week period). Laboratory test results, available in a subset of the COVID-19 patients, provide preliminary evidence suggestive of raised D-dimer, lowered fibrinogen, and an increased rate of thrombocytopenia in the CVT and PVT groups. Mortality was 20% and 18.8% respectively. These data show that the incidence of CVT is significantly increased after COVID-19, and greater than that observed with BNT162b2 and mRNA-1273 COVID-19 vaccines. The risk of CVT following COVID-19 is also higher than the latest estimate from the European Medicines Agency for the incidence associated with ChAdOx1 nCoV-19 vaccine (5.0 per million people, 95% CI 4.3–5.8). Although requiring replication and corroboration, the present data highlight the risk of serious thrombotic events in COVID-19, and can help contextualize the risks and benefits of vaccination in this regard.
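The per-million incidences and confidence intervals above can be sanity-checked with a short calculation. The sketch below rests on two assumptions that the abstract does not state: the CVT event counts (20, 0, and 2) are back-calculated here from the reported rates and cohort sizes, and the interval is taken to be a Wilson score interval. The function name `wilson_interval_per_million` is hypothetical.

```python
from statistics import NormalDist


def wilson_interval_per_million(events, n, confidence=0.95):
    """Return (rate, lower, upper) per million using the Wilson score interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = events / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * ((p * (1 - p) / n + z * z / (4 * n * n)) ** 0.5) / denom
    lower = max(0.0, centre - half)  # guard against tiny negative rounding error
    upper = centre + half
    return p * 1e6, lower * 1e6, upper * 1e6


# Event counts are ASSUMED: inferred as rate x N, not reported in the abstract.
covid = wilson_interval_per_million(20, 513_284)
influenza = wilson_interval_per_million(0, 172_742)
vaccine = wilson_interval_per_million(2, 489_871)

for label, (rate, lo, hi) in [("COVID-19", covid), ("influenza", influenza), ("mRNA vaccine", vaccine)]:
    print(f"{label}: {rate:.1f} per million (95% CI {lo:.1f}-{hi:.1f})")
```

Under these assumptions the three results round to the same rates and intervals quoted in the text, which suggests, but does not confirm, that a Wilson-type interval was used. The Wilson form is a natural choice for rare events because it stays within [0, 1] and gives a non-degenerate upper bound even when no events are observed.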
Comprehensive descriptions of animal behavior require precise three-dimensional (3D) measurements of whole-body movements. Although two-dimensional approaches can track visible landmarks in restrictive environments, performance drops in freely moving animals, due to occlusions and appearance changes. Therefore, we designed DANNCE to robustly track anatomical landmarks in 3D across species and behaviors. DANNCE uses projective geometry to construct inputs to a convolutional neural network that leverages learned 3D geometric reasoning. We trained and benchmarked DANNCE using a dataset of nearly seven million frames that relates color videos and rodent 3D poses. In rats and mice, DANNCE robustly tracked dozens of landmarks on the head, trunk, and limbs of freely moving animals in naturalistic settings. We extended DANNCE to datasets from rat pups, marmosets, and chickadees, and demonstrate quantitative profiling of behavioral lineage during development.
We present VideoGPT: a conceptually simple architecture for scaling likelihood-based generative modeling to natural videos. VideoGPT uses a VQ-VAE that learns downsampled discrete latent representations of a raw video by employing 3D convolutions and axial self-attention. A simple GPT-like architecture is then used to autoregressively model the discrete latents using spatio-temporal position encodings. Despite the simplicity in formulation and ease of training, our architecture is able to generate samples competitive with state-of-the-art GAN models for video generation on the BAIR Robot dataset, and generate high-fidelity natural videos from UCF-101 and the Tumblr GIF Dataset (TGIF). We hope our proposed architecture serves as a reproducible reference for a minimalistic implementation of transformer-based video generation models. Samples and code are available at https://wilson1yan.github.io/videogpt/index.html
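The "discrete latent representations" mentioned above come from the vector-quantization step of a VQ-VAE: each continuous encoder output is replaced by the index of its nearest codebook vector, and those indices are the tokens a GPT-like model can then predict. The sketch below shows only that lookup, in plain Python; the function name `quantize`, the two-dimensional codebook, and the toy encoder outputs are hypothetical and are not taken from the VideoGPT code.

```python
def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry.

    Distance is squared Euclidean; ties resolve to the lowest index.
    """
    indices = []
    for v in vectors:
        distances = [sum((a - b) ** 2 for a, b in zip(v, code)) for code in codebook]
        indices.append(distances.index(min(distances)))
    return indices


# Toy values for illustration only.
toy_codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
toy_encoder_outputs = [[0.9, 1.2], [-0.8, 0.7], [0.1, -0.2]]
toy_tokens = quantize(toy_encoder_outputs, toy_codebook)
print(toy_tokens)
```

In a trained model the codebook is learned jointly with the encoder and decoder, and the lookup runs over a downsampled spatio-temporal grid rather than a short list.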
This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings. We first describe an unsupervised approach, which takes an input sentence and predicts itself in a contrastive objective, with only standard dropout used as noise. This simple method works surprisingly well, performing on par with previous supervised counterparts. We hypothesize that dropout acts as minimal data augmentation and that removing it leads to a representation collapse. Then, we draw inspiration from the recent success of learning sentence embeddings from natural language inference (NLI) datasets and incorporate annotated pairs from NLI datasets into contrastive learning by using "entailment" pairs as positives and "contradiction" pairs as hard negatives. We evaluate SimCSE on standard semantic textual similarity (STS) tasks, and our unsupervised and supervised models using BERT-base achieve an average of 74.5% and 81.6% Spearman's correlation respectively, improvements of 7.9 and 4.6 points over previous best results. We also show that contrastive learning theoretically regularizes pre-trained embeddings' anisotropic space to be more uniform, and that it better aligns positive pairs when supervised signals are available.
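A contrastive objective of the kind described above can be sketched as a cross-entropy over in-batch similarities: each sentence's embedding should be closer to its own positive than to the positives of the other sentences in the batch. The code below is a minimal plain-Python illustration, not the authors' implementation. Here `anchors` and `positives` stand in for two encodings of the same sentences (for example, two forward passes with different dropout masks), the encoder itself is not modeled, and the temperature value of 0.05 is an assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(anchors, positives, temperature=0.05):
    """Mean cross-entropy where row i's correct 'class' is positives[i]."""
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings for illustration only.
toy_anchors = [[1.0, 0.0], [0.0, 1.0]]
toy_aligned = [[0.9, 0.1], [0.1, 0.9]]
toy_shuffled = [[0.1, 0.9], [0.9, 0.1]]
aligned_loss = in_batch_contrastive_loss(toy_anchors, toy_aligned)
shuffled_loss = in_batch_contrastive_loss(toy_anchors, toy_shuffled)
print(aligned_loss < shuffled_loss)
```

The loss is small when each anchor is most similar to its own positive and grows when the pairing is scrambled, which is the behaviour the training signal relies on. Hard negatives, as in the supervised variant, would add further terms to each row's denominator.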
One-step purification and desalination The purification of water for drinking purposes can require multiple filtration steps and technologies to remove contaminants such as salts and heavy metals. Some contaminants could have value if recovered, but these are often discharged in the waste streams. Uliana et al. describe a general approach for the fabrication of robust, tunable, adsorptive membranes through the incorporation of porous aromatic framework (PAF) nanoparticles into ion exchange membranes such as those made from sulfonated polymers. Salts are removed using a series of cation and anion exchange membranes, and the PAF particles can be selected to capture specific target ions, such as those of copper, mercury, or iron. This allows for simultaneous desalination and decontamination of the water. Science, this issue p. 296 Technologies that can efficiently purify nontraditional water sources are needed to meet rising global demand for clean water. Water treatment plants typically require a series of costly separation units to achieve desalination and the removal of toxic trace contaminants such as heavy metals and boron. We report a series of robust, selective, and tunable adsorptive membranes that feature porous aromatic framework nanoparticles embedded within ion exchange polymers and demonstrate their use in an efficient, one-step separation strategy termed ion-capture electrodialysis. This process uses electrodialysis configurations with adsorptive membranes to simultaneously desalinate complex water sources and capture diverse target solutes with negligible capture of competing ions. Our methods are applicable to the development of efficient and selective multifunctional separations that use adsorptive membranes. High-performance adsorptive membranes enable one-step desalination of complex water sources and target solute capture.
Authors: Elizabeth Salesky, David Etter, Matt Post
Published: Apr 16, 2021
Machine translation models have discrete vocabularies and commonly use subword segmentation techniques to achieve an "open vocabulary." This approach relies on consistent and correct underlying Unicode sequences, and makes models susceptible to degradation from common types of noise and variation. Motivated by the robustness of human language processing, we propose the use of visual text representations, which dispense with a finite set of text embeddings in favor of continuous vocabularies created by processing visually rendered text. We show that models using visual text representations approach or match the performance of text baselines on clean TED datasets. More importantly, models with visual embeddings demonstrate significant robustness to varied types of noise, achieving, e.g., 25.9 BLEU on a character-permuted German–English task where subword models degrade to 1.9.
Approaches based on deep neural networks have achieved striking performance when testing data and training data share a similar distribution, but can fail significantly otherwise. Therefore, eliminating the impact of distribution shifts between training and testing data is crucial for building deep models that perform reliably. Conventional methods assume either known heterogeneity of the training data (e.g., domain labels) or approximately equal capacities of different domains. In this paper, we consider a more challenging case where neither of the above assumptions holds. We propose to address this problem by removing the dependencies between features via learning weights for training samples, which helps deep models get rid of spurious correlations and, in turn, concentrate more on the true connection between discriminative features and labels. Through extensive experiments on distribution generalization benchmarks including PACS, VLCS, MNIST-M, and NICO, we show the effectiveness of our method compared with state-of-the-art counterparts.
Detecting and fixing bugs are two of the most important yet frustrating parts of the software development cycle. Existing bug detection tools are based mainly on static analyzers, which rely on mathematical logic and symbolic reasoning about the program execution to detect common types of bugs. Fixing bugs is typically left to the developer. In this work we introduce DeepDebug: a data-driven program repair approach that learns to detect and fix bugs in Java methods mined from real-world GitHub repositories. We frame bug-patching as a sequence-to-sequence learning task consisting of two steps: (i) denoising pretraining, and (ii) supervised finetuning on the target translation task. We show that pretraining on source code programs improves the number of patches found by 33% as compared to supervised training from scratch, while domain-adaptive pretraining from natural language to code further improves the accuracy by another 32%. We refine the standard accuracy evaluation metric into non-deletion and deletion-only fixes, and show that our best model generates 75% more non-deletion fixes than the previous state of the art. In contrast to prior work, we attain our best results when generating raw code, as opposed to working with abstracted code, which tends to benefit only smaller-capacity models. Finally, we observe a subtle improvement from adding syntax embeddings along with the standard positional embeddings, as well as from adding an auxiliary task to predict each token's syntactic class. Despite focusing on Java, our approach is language agnostic, requiring only a general-purpose parser such as tree-sitter.