A Bayesian hierarchical framework with a Gaussian copula and a generalized extreme value (GEV) marginal distribution is proposed for the description of spatial dependencies in data. This spatial copula model was applied to extreme summer temperatures over the Extremadura Region, in the southwest of Spain, during the period 1980–2015, and compared with a spatial noncopula model. The Bayesian hierarchical model was implemented with a Markov chain Monte Carlo (MCMC) method that allows the posterior distributions of the model's parameters to be estimated. The results show that the GEV distribution's shape parameter takes constant negative values, the location parameter is altitude dependent, and the scale parameter is concentrated around a single value throughout the region. Further, the chosen spatial copula model yields lower deviance information criterion (DIC) values when spatial distributions are assumed for the GEV distribution's location and scale parameters than when the scale parameter is taken to be constant over the region.
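As a minimal sketch of the marginal model described above (a single-site maximum-likelihood GEV fit, not the paper's full Bayesian hierarchical copula model), the three GEV parameters can be estimated with SciPy. Note that SciPy's shape parameter `c` equals minus the climatological shape ξ, so the bounded upper tail reported in the abstract (ξ < 0) corresponds to `c` > 0 here; all numbers below are illustrative.

```python
# Illustrative single-site GEV fit (not the paper's hierarchical model).
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)
# Synthetic summer temperature maxima: location 38 C, scale 2 C,
# bounded (Weibull-type) upper tail, one value per year for 36 years.
data = genextreme.rvs(c=0.2, loc=38.0, scale=2.0, size=36, random_state=rng)

c_hat, loc_hat, scale_hat = genextreme.fit(data)
xi_hat = -c_hat  # convert to the xi sign convention used in the abstract
print(f"location={loc_hat:.2f}, scale={scale_hat:.2f}, shape xi={xi_hat:.2f}")
```

In the hierarchical setting, the location parameter would additionally be regressed on altitude and the parameters given priors, with MCMC providing the posterior distributions.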
In recent years, a surge of empirical studies has converged on complexity-related measures as reliable markers of consciousness across many different conditions, such as sleep, anesthesia, hallucinatory states, coma, and related disorders. Most of these measures were independently proposed by researchers endorsing disparate frameworks and employing different methods and techniques. Since this body of evidence has not yet been systematically reviewed and coherently organized, this positive trend has remained somewhat under the radar. The aim of this paper is to make this consilience of evidence in the science of consciousness explicit. We start with a systematic assessment of the growing literature on complexity-related measures and identify their common denominator, tracing it back to core theoretical principles and predictions put forward more than 20 years ago. In doing so, we highlight a consistent trajectory spanning two decades of consciousness research and provide a provisional taxonomy of the present literature. Finally, we consider all of the above as positive ground for approaching new questions and devising future experiments that may help consolidate and further develop a promising field where empirical research on consciousness appears, so far, to have naturally converged.
Spatial Statistics and Disease Mapping Paper Session at the 2016 American Association of Geographers Annual Meeting. https://github.com/maps-apps-n/geography-thesis/blob/master/TBIncidenceConfPaper.pdf tl;dr: A multivariate multilevel regression analysis identifies primary schools as major hotspot sources of TB incidence in Bolivia relative to their geographical coverage.
Making scientific information machine-readable greatly facilitates its re-use. Many scientific articles aim to test a hypothesis, so making the tests of statistical predictions easier to find and access could be very beneficial. We propose an approach that can be used to make hypothesis tests machine-readable. We believe there are two benefits to specifying a hypothesis test in a way that a computer can evaluate whether the statistical prediction is corroborated or not. First, hypothesis tests will become more transparent, falsifiable, and rigorous. Second, scientists will benefit if information related to hypothesis tests in scientific articles is easily findable and re-usable, for example when performing meta-analyses, during peer review, and when examining meta-scientific research questions. We examine what a machine-readable hypothesis test should look like, and demonstrate the feasibility of machine-readable hypothesis tests in a real-life example using the fully operational prototype R package scienceverse.
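To make the idea concrete, here is a hypothetical sketch of a machine-readable hypothesis test in Python (the format and field names are illustrative assumptions, not the actual scienceverse specification): the prediction is stored as data, and a small evaluator checks whether an observed result corroborates it.

```python
# Illustrative machine-readable hypothesis test; the dictionary schema
# and the `evaluate` helper are hypothetical, not scienceverse's format.
from scipy import stats

hypothesis = {
    "id": "H1",
    "description": "Group A scores higher than group B",
    "test": "welch_t",
    "criterion": {"p": "< 0.05", "direction": "greater"},
}

def evaluate(hypothesis, group_a, group_b):
    """Return True if the stored statistical prediction is corroborated."""
    t, p = stats.ttest_ind(group_a, group_b,
                           equal_var=False, alternative="greater")
    threshold = float(hypothesis["criterion"]["p"].lstrip("< "))
    return bool(p < threshold)

corroborated = evaluate(hypothesis, [5.1, 6.2, 5.8, 6.5], [4.0, 4.4, 3.9, 4.6])
print("H1 corroborated:", corroborated)
```

Because the prediction lives in a structured record rather than prose, a meta-analyst or reviewer could re-run the evaluation against the archived data without re-reading the article.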
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where the time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d). This nested system of two flows, where the parameter flow is constrained to lie on a compact manifold, provides stable and effective training and provably solves the vanishing/exploding-gradient problem, which is intrinsic to training deep neural network architectures such as Neural ODEs. Consequently, it leads to better downstream models, as we show by training reinforcement learning policies with evolution strategies and, in the supervised learning setting, by comparing with previous SOTA baselines. We provide strong convergence results for our proposed mechanism that are independent of the depth of the network, supporting our empirical studies. Our results show an intriguing connection between the theory of deep neural networks and the field of matrix flows on compact manifolds.
Neural Ordinary Differential Equations (ODEs) are elegant reinterpretations of deep networks where continuous time can replace the discrete notion of depth, ODE solvers perform forward propagation, and the adjoint method enables efficient, constant memory backpropagation. Neural ODEs are universal approximators only when they are non-autonomous, that is, the dynamics depends explicitly on time. We propose a novel family of Neural ODEs with time-varying weights, where time-dependence is non-parametric, and the smoothness of weight trajectories can be explicitly controlled to allow a tradeoff between expressiveness and efficiency. Using this enhanced expressiveness, we outperform previous Neural ODE variants in both speed and representational capacity, ultimately outperforming standard ResNet and CNN models on select image classification and video prediction tasks.
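The two abstracts above can be grounded in a toy sketch of a non-autonomous Neural ODE layer: the dynamics f(h, t) depends explicitly on time (here via weights interpolated between two anchor matrices, a deliberately crude stand-in for the papers' parameterizations), and forward propagation is a fixed-step Euler solve. All names and shapes below are illustrative.

```python
# Toy non-autonomous Neural ODE: time-varying weights, Euler integration.
import numpy as np

rng = np.random.default_rng(1)
d = 4
W0, W1 = rng.normal(size=(d, d)) * 0.1, rng.normal(size=(d, d)) * 0.1

def f(h, t):
    # Smoothly time-varying weights: linear interpolation in t on [0, 1].
    W_t = (1 - t) * W0 + t * W1
    return np.tanh(W_t @ h)

def odeint_euler(h0, n_steps=20):
    """Forward propagation = numerically integrating dh/dt = f(h, t)."""
    h, dt = h0.copy(), 1.0 / n_steps
    for k in range(n_steps):
        h = h + dt * f(h, k * dt)  # forward Euler step
    return h

h_out = odeint_euler(rng.normal(size=d))
print(h_out.shape)
```

A real implementation would use an adaptive solver and the adjoint method for backpropagation; the point here is only that "depth" has become the integration interval and the weights a function of time.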
We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness. To approximate softmax attention-kernels, Performers use a novel Fast Attention Via positive Orthogonal Random features approach (FAVOR+), which may be of independent interest for scalable kernel methods. FAVOR+ can also be used to efficiently model kernelizable attention mechanisms beyond softmax. This representational power is crucial to accurately compare softmax with other kernels for the first time on large-scale tasks, beyond the reach of regular Transformers, and investigate optimal attention-kernels. Performers are linear architectures fully compatible with regular Transformers and with strong theoretical guarantees: unbiased or nearly-unbiased estimation of the attention matrix, uniform convergence and low estimation variance. We tested Performers on a rich set of tasks stretching from pixel-prediction through text models to protein sequence modeling. We demonstrate competitive results with other examined efficient sparse and dense attention methods, showcasing the effectiveness of the novel attention-learning paradigm leveraged by Performers.
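A rough sketch of the positive-random-feature idea behind FAVOR+ (heavily simplified: no orthogonalization of the random projections and no stabilization tricks from the paper): the softmax kernel exp(q·k) is estimated, unbiasedly, by an inner product of positive features, which is what lets attention be computed in linear rather than quadratic time.

```python
# Simplified positive random features for the softmax kernel exp(<q, k>).
import numpy as np

rng = np.random.default_rng(0)
d, m = 16, 2048  # query/key dimension, number of random features

def phi(x, W):
    # Positive features: exp(w.x - ||x||^2 / 2) / sqrt(m), so that
    # E[phi(q) . phi(k)] = exp(<q, k>) exactly (a Gaussian integral).
    return np.exp(W @ x - x @ x / 2.0) / np.sqrt(W.shape[0])

W = rng.normal(size=(m, d))        # i.i.d. Gaussian projections
q, k = rng.normal(size=d) * 0.2, rng.normal(size=d) * 0.2

exact = np.exp(q @ k)
approx = phi(q, W) @ phi(k, W)
print(exact, approx)  # the estimator is unbiased, so these are close
```

Since every feature is positive, the estimate can never produce negative attention scores, which is the property that distinguishes FAVOR+ from plain trigonometric random features.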
Web-facing companies, including Amazon, eBay, Etsy, Facebook, Google, Groupon, Intuit, LinkedIn, Microsoft, Netflix, Shop Direct, StumbleUpon, Yahoo, and Zynga use online controlled experiments to guide product development and accelerate innovation. At Microsoft’s Bing, the use of controlled experiments has grown exponentially over time, with over 200 concurrent experiments now running on any given day. Running experiments at large scale requires addressing multiple challenges in three areas: cultural/organizational, engineering, and trustworthiness. On the cultural and organizational front, the larger organization needs to learn the reasons for running controlled experiments and the tradeoffs between controlled experiments and other methods of evaluating ideas. We discuss why negative experiments, which degrade the user experience short term, should be run, given the learning value and long-term benefits. On the engineering side, we architected a highly scalable system, able to handle data at massive scale: hundreds of concurrent experiments, each containing millions of users. Classical testing and debugging techniques no longer apply when there are billions of live variants of the site, so alerts are used to identify issues rather than relying on heavy up-front testing. On the trustworthiness front, we have a high occurrence of false positives that we address, and we alert experimenters to statistical interactions between experiments. The Bing Experimentation System is credited with having accelerated innovation and increased annual revenues by hundreds of millions of dollars, by allowing us to find and focus on key ideas evaluated through thousands of controlled experiments. A 1% improvement to revenue equals more than $10M annually in the US, yet many ideas impact key metrics by 1% and are not well estimated a priori.
The system has also identified many negative features that we avoided deploying, despite key stakeholders’ early excitement, saving us similar large amounts.
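The scale claims above ("each containing millions of users" to detect ~1% effects) follow from standard power analysis. A back-of-the-envelope sketch, using entirely illustrative numbers (the per-user revenue mean and standard deviation below are assumptions, not Bing's figures) and the textbook two-sample normal-approximation formula n = 2(z_α/2 + z_β)²σ²/δ²:

```python
# Sample size to detect a 1% relative lift in a per-user revenue metric
# at alpha = 0.05 (two-sided) with 80% power. Numbers are illustrative.
from scipy.stats import norm

mean, sigma = 0.50, 2.0           # assumed per-user revenue and std dev ($)
delta = 0.01 * mean               # 1% relative lift -> $0.005 absolute
z_alpha, z_beta = norm.ppf(0.975), norm.ppf(0.80)

n_per_arm = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
print(f"users per arm: {n_per_arm:,.0f}")
```

With these assumptions the answer lands in the low millions of users per arm, which is why small-but-valuable effects can only be measured by running experiments at the scale the abstract describes.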