Trending

Today

From Paper: Asymmetric correlation and hedging effectiveness of gold & cryptocurrencies: From pre-industrial to the 4th industrial revolution✰

Published: Jul 2020

- The effects of gold on the stock market are asymmetric in most cases.
- Cryptocurrency does not significantly affect the stock market.
- Correlations between stock/gold and stock/cryptocurrency pairs are positive in most cases.
- Neither gold nor cryptocurrency acts as a good instrument for hedging the stock market.

From Paper: Raincloud plots: a multi-platform tool for robust data visualization

Published: Aug 2018

- The space saved relative to a violin + boxplot is used for 'jittered' (randomly offset) raw data points at each axis increment, giving direct insight into the raw data--these form the 'raindrops' of the raincloud
- The raincloud plot retains the 'glanceable' quality of the violin plot but drops the redundant mirroring across the center axis (resulting in a better "ink-to-data" ratio)
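The layout described above can be sketched with matplotlib. This is a minimal illustration of the raincloud idea (half violin as the 'cloud', a boxplot, and jittered raw points as the 'rain'), not the multi-platform tool released with the paper; the function name and layout constants are my own.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

def raincloud(ax, data, y=0.0):
    """Minimal raincloud: half violin ('cloud') above a boxplot,
    with jittered raw points ('rain') below -- no mirrored half."""
    # Cloud: draw a violin, then clip its body to the upper half only
    parts = ax.violinplot(data, positions=[y], vert=False, showextrema=False)
    for body in parts["bodies"]:
        verts = body.get_paths()[0].vertices
        verts[:, 1] = np.clip(verts[:, 1], y, None)
    # Box: compact summary statistics
    ax.boxplot(data, positions=[y], vert=False, widths=0.1, showfliers=False)
    # Rain: jittered raw observations below the axis line
    jitter = y - 0.15 - 0.05 * np.random.default_rng(0).random(len(data))
    ax.scatter(data, jitter, s=6, alpha=0.5)
    return ax

rng = np.random.default_rng(0)
fig, ax = plt.subplots()
raincloud(ax, rng.normal(size=200))
fig.savefig("raincloud.png")
```

Dropping the mirrored violin half is what frees the space for the raw points, which is the "ink-to-data" argument in the notes above.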

Web-facing companies, including Amazon, eBay, Etsy, Facebook, Google, Groupon, Intuit, LinkedIn, Microsoft, Netflix, Shop Direct, StumbleUpon, Yahoo, and Zynga use online controlled experiments to guide product development and accelerate innovation. At Microsoft's Bing, the use of controlled experiments has grown exponentially over time, with over 200 concurrent experiments now running on any given day. Running experiments at large scale requires addressing multiple challenges in three areas: cultural/organizational, engineering, and trustworthiness. On the cultural and organizational front, the larger organization needs to learn the reasons for running controlled experiments and the tradeoffs between controlled experiments and other methods of evaluating ideas. We discuss why negative experiments, which degrade the user experience short term, should be run, given the learning value and long-term benefits. On the engineering side, we architected a highly scalable system, able to handle data at massive scale: hundreds of concurrent experiments, each containing millions of users. Classical testing and debugging techniques no longer apply when there are billions of live variants of the site, so alerts are used to identify issues rather than relying on heavy up-front testing. On the trustworthiness front, we have a high occurrence of false positives that we address, and we alert experimenters to statistical interactions between experiments. The Bing Experimentation System is credited with having accelerated innovation and increased annual revenues by hundreds of millions of dollars, by allowing us to find and focus on key ideas evaluated through thousands of controlled experiments. A 1% improvement to revenue equals more than $10M annually in the US, yet many ideas impact key metrics by 1% and are not well estimated a priori. The system has also identified many negative features that we avoided deploying, despite key stakeholders' early excitement, saving us similar large amounts.
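The claim that ~1% effects require millions of users rests on standard two-sample significance testing. A minimal, stdlib-only sketch of the two-proportion z-test commonly used for such conversion-rate experiments (the counts below are illustrative, not Bing's):

```python
from math import sqrt, erf

def two_proportion_ztest(x_c, n_c, x_t, n_t):
    """Two-sided z-test for a difference in conversion rates
    between control (c) and treatment (t)."""
    p_c, p_t = x_c / n_c, x_t / n_t
    p_pool = (x_c + x_t) / (n_c + n_t)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal CDF via erf
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Illustrative: a ~1% relative lift on a 2% base rate is only barely
# detectable even with five million users per arm.
z, p = two_proportion_ztest(x_c=100_000, n_c=5_000_000,
                            x_t=101_000, n_t=5_000_000)
```

At smaller sample sizes the same lift would be statistically invisible, which is why effects of this magnitude are hard to estimate a priori.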

Published: Oct 2020

- This paper describes the benefits provided by hypothesis tests and gives examples of how to create machine-readable statistical predictions
- The authors propose that the gold standard for well-specified hypothesis tests should be a statistical prediction that is machine-readable.
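As a rough illustration of what a machine-readable statistical prediction might look like: a small declared specification plus an automatic check. The field names and schema here are hypothetical, not the paper's proposal.

```python
import json

# A hypothetical machine-readable prediction: the hypothesis, the test
# statistic, and the decision rule are all declared before seeing the data.
prediction = json.loads("""
{
  "hypothesis": "treatment mean exceeds control mean",
  "statistic": "difference_of_means",
  "direction": "greater",
  "threshold": 0.5
}
""")

def evaluate(prediction, treatment, control):
    """Check the declared prediction against observed data."""
    diff = sum(treatment) / len(treatment) - sum(control) / len(control)
    if prediction["direction"] == "greater":
        return diff > prediction["threshold"]
    return diff < -prediction["threshold"]

result = evaluate(prediction, treatment=[2.1, 2.4, 2.2], control=[1.0, 1.1, 0.9])
```

Because the prediction is data, it can be registered, versioned, and re-evaluated mechanically, which is the point of making hypothesis tests machine-readable.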

Authors: Ross Harper, Joshua Southern

Published: Feb 2019

- Emotion is predicted from heartbeat signals
- A Bayesian deep learning framework is used

Authors: Edgar Dobriban, Sifan Liu

Published: Oct 2018

We consider a least squares regression problem where the data has been generated from a linear model, and we are interested to learn the unknown regression parameters. We consider "sketch-and-solve" methods that randomly project the data first, and do regression after. Previous works have analyzed the statistical and computational performance of such methods. However, the existing analysis is not fine-grained enough to show the fundamental differences between various methods, such as the Subsampled Randomized Hadamard Transform (SRHT) and Gaussian projections. In this paper, we make progress on this problem, working in an asymptotic framework where the number of datapoints and dimension of features goes to infinity. We find the limits of the accuracy loss (for estimation and test error) incurred by popular sketching methods. We show separation between different methods, so that SRHT is better than Gaussian projections. Our theoretical results are verified on both real and synthetic data. The analysis of SRHT relies on novel methods from random matrix theory that may be of independent interest.
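A minimal NumPy sketch of the sketch-and-solve idea, using the Gaussian projection (the simpler baseline the abstract compares against; SRHT is omitted here). Problem sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 2000, 5, 200          # samples, features, sketch size (m << n)

beta_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Sketch-and-solve: randomly project (X, y) down to m rows, then run
# ordinary least squares on the much smaller sketched problem.
S = rng.normal(size=(m, n)) / np.sqrt(m)   # Gaussian sketching matrix
beta_sketch, *_ = np.linalg.lstsq(S @ X, S @ y, rcond=None)

# Full least squares on all n rows, for comparison.
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The sketched solve works on an m x d system instead of n x d, trading some accuracy for computation; quantifying exactly how much accuracy is lost, per sketching method, is what the paper's asymptotic analysis addresses.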

Retrieved from arxiv

Authors: Sifan Liu, Edgar Dobriban

Published: Oct 2019

We study the following three fundamental problems about ridge regression: (1) what is the structure of the estimator? (2) how to correctly use cross-validation to choose the regularization parameter? and (3) how to accelerate computation without losing too much accuracy? We consider the three problems in a unified large-data linear model. We give a precise representation of ridge regression as a covariance matrix-dependent linear combination of the true parameter and the noise. We study the bias of $K$-fold cross-validation for choosing the regularization parameter, and propose a simple bias-correction. We analyze the accuracy of primal and dual sketching for ridge regression, showing they are surprisingly accurate. Our results are illustrated by simulations and by analyzing empirical data.
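The setup in problems (1) and (2) can be sketched in a few lines of NumPy: the closed-form ridge estimator, and plain $K$-fold cross-validation over the regularization parameter. This illustrates the procedure whose bias the paper analyzes; it does not implement the paper's bias correction.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimator (X^T X + lam I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def kfold_cv_error(X, y, lam, k=5):
    """Mean held-out squared error of ridge across K folds."""
    n = X.shape[0]
    folds = np.array_split(np.arange(n), k)
    errs = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        b = ridge(X[train_idx], y[train_idx], lam)
        errs.append(np.mean((y[test_idx] - X[test_idx] @ b) ** 2))
    return float(np.mean(errs))

rng = np.random.default_rng(1)
n, d = 300, 20
beta = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ beta + rng.normal(size=n)

lams = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = min(lams, key=lambda lam: kfold_cv_error(X, y, lam))
```

Note that each fold trains on only $(1 - 1/K) n$ rows, so the CV curve is computed at a smaller effective sample size than the final fit uses; this mismatch is one source of the bias the paper studies.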

Retrieved from arxiv

Authors: Xiucai Ding, Fan Yang

Published: May 2019

We introduce a class of separable sample covariance matrices of the form $\widetilde{\mathcal{Q}}_1 := \widetilde{A}^{1/2} X \widetilde{B} X^* \widetilde{A}^{1/2}$. Here $\widetilde{A}$ and $\widetilde{B}$ are positive definite matrices whose spectrums consist of bulk spectrums plus several spikes, i.e. larger eigenvalues that are separated from the bulks. Conceptually, we call $\widetilde{\mathcal{Q}}_1$ a \emph{spiked separable covariance matrix model}. On the one hand, this model includes the spiked covariance matrix as a special case with $\widetilde{B}=I$. On the other hand, it allows for more general correlations of datasets. In particular, for a spatio-temporal dataset, $\widetilde{A}$ and $\widetilde{B}$ represent the spatial and temporal correlations, respectively.

In this paper, we study the outlier eigenvalues and eigenvectors, i.e. the principal components, of the spiked separable covariance model $\widetilde{\mathcal{Q}}_1$. We prove the convergence of the outlier eigenvalues $\widetilde{\lambda}_i$ and the generalized components (i.e. $\langle \mathbf{v}, \widetilde{\mathbf{\xi}}_i \rangle$ for any deterministic vector $\mathbf{v}$) of the outlier eigenvectors $\widetilde{\mathbf{\xi}}_i$ with optimal convergence rates. Moreover, we also prove the delocalization of the non-outlier eigenvectors. We state our results in full generality, in the sense that they also hold near the so-called BBP transition and for degenerate outliers. Our results highlight both the similarity and difference between the spiked separable covariance matrix model and the spiked covariance model. In particular, we show that the spikes of both $\widetilde{A}$ and $\widetilde{B}$ will cause outliers of the eigenvalue spectrum, and the eigenvectors can help us to select the outliers that correspond to the spikes of $\widetilde{A}$ (or $\widetilde{B}$).
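A small numerical illustration of the outlier phenomenon the abstract describes, in the special case $\widetilde{B}=I$ (the classical spiked covariance model): one spike in $\widetilde{A}$ produces a sample eigenvalue that separates cleanly from the Marchenko-Pastur bulk. The sizes and spike value are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 200, 400                       # dimension and sample size; gamma = p/n = 0.5

# A: identity bulk plus a single spike eigenvalue of 10; B = I.
a = np.ones(p)
a[0] = 10.0
A_half = np.diag(np.sqrt(a))
X = rng.normal(size=(p, n)) / np.sqrt(n)

# Q = A^{1/2} X B X^* A^{1/2} with B = I
Q = A_half @ X @ X.T @ A_half
eigs = np.sort(np.linalg.eigvalsh(Q))[::-1]

# Right edge of the Marchenko-Pastur bulk, (1 + sqrt(gamma))^2 ~ 2.91 here;
# the top eigenvalue should sit well outside it.
bulk_edge = (1 + np.sqrt(p / n)) ** 2
```

With a general $\widetilde{B}$, spikes of $\widetilde{B}$ also create outliers, and (per the abstract) the eigenvectors are what distinguish which matrix a given outlier comes from.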

Retrieved from arxiv

Authors: David L. Donoho, Behrooz Ghorbani

Published: Oct 2018

We study estimation of the covariance matrix under relative condition number loss $\kappa(\Sigma^{-1/2} \hat{\Sigma} \Sigma^{-1/2})$, where $\kappa(\Delta)$ is the condition number of matrix $\Delta$, and $\hat{\Sigma}$ and $\Sigma$ are the estimated and theoretical covariance matrices. Optimality in $\kappa$-loss provides optimal guarantees in two stylized applications: Multi-User Covariance Estimation and Multi-Task Linear Discriminant Analysis. We assume the so-called spiked covariance model for $\Sigma$, and exploit recent advances in understanding that model, to derive a nonlinear shrinker which is asymptotically optimal among orthogonally-equivariant procedures. In our asymptotic study, the number of variables $p$ is comparable to the number of observations $n$. The form of the optimal nonlinearity depends on the aspect ratio $\gamma=p/n$ of the data matrix and on the top eigenvalue of $\Sigma$. For $\gamma > 0.618...$, even dependence on the top eigenvalue can be avoided. The optimal shrinker has two notable properties. First, when $p/n \rightarrow \gamma \gg 1$ is large, it shrinks even very large eigenvalues substantially, by a factor $1/(1+\gamma)$. Second, even for moderate $\gamma$, certain highly statistically significant eigencomponents will be completely suppressed. We show that when $\gamma \gg 1$ is large, purely diagonal covariance matrices can be optimal, despite the top eigenvalues being large and the empirical eigenvalues being highly statistically significant. This aligns with practitioner experience. We identify intuitively reasonable procedures with small worst-case relative regret - the simplest being generalized soft thresholding having threshold at the bulk edge and slope $(1+\gamma)^{-1}$ above the bulk. For $\gamma < 2$ it has at most a few percent relative regret.
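The closing sentence describes the generalized soft-thresholding rule concretely enough to sketch: threshold at the bulk edge $(1+\sqrt{\gamma})^2$, slope $(1+\gamma)^{-1}$ above it. The choice to map sub-threshold eigenvalues to 1 (the noise level) is my reading, so treat this as an assumption.

```python
import numpy as np

def soft_threshold_shrinker(eigs, gamma):
    """Generalized soft thresholding of sample eigenvalues:
    threshold at the Marchenko-Pastur bulk edge, slope 1/(1+gamma) above it.
    Sub-threshold eigenvalues are mapped to the noise level 1 (assumption)."""
    bulk_edge = (1 + np.sqrt(gamma)) ** 2
    shrunk = 1.0 + (eigs - bulk_edge) / (1.0 + gamma)
    return np.maximum(shrunk, 1.0)

# gamma = 1: bulk edge is 4, slope above it is 1/2
eigs = np.array([0.5, 2.0, 5.0, 50.0])
out = soft_threshold_shrinker(eigs, gamma=1.0)
```

The example shows both properties the abstract highlights: bulk eigenvalues (0.5, 2.0) are fully suppressed to 1, and even the very large eigenvalue 50 is shrunk by roughly the factor $1/(1+\gamma)$.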

Retrieved from arxiv

Authors: Chen Amiraz, Robert Krauthgamer, Boaz Nadler

Published: May 2019

Orthogonal Matching Pursuit (OMP) is a popular algorithm to estimate an unknown sparse vector from multiple linear measurements of it. Assuming exact sparsity and that the measurements are corrupted by additive Gaussian noise, the success of OMP is often formulated as exactly recovering the support of the sparse vector. Several authors derived a sufficient condition for exact support recovery by OMP with high probability depending on the signal-to-noise ratio, defined as the magnitude of the smallest non-zero coefficient of the vector divided by the noise level. We make two contributions. First, we derive a slightly sharper sufficient condition for two variants of OMP, in which either the sparsity level or the noise level is known. Next, we show that this sharper sufficient condition is tight, in the following sense: for a wide range of problem parameters, there exists a dictionary of linear measurements and a sparse vector with a signal-to-noise ratio slightly below that of the sufficient condition, for which with high probability OMP fails to recover its support. Finally, we present simulations which illustrate that our condition is tight for a much broader range of dictionaries.
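The OMP algorithm itself is short: greedily pick the dictionary column most correlated with the current residual, then re-fit by least squares on the selected support. A minimal NumPy sketch (this is the known-sparsity variant; the problem instance below is an easy high-SNR one, well inside the recovery regime the paper studies):

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit with known sparsity level k."""
    n, d = A.shape
    support, residual = [], y.copy()
    for _ in range(k):
        # Greedy step: column most correlated with the residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        support.append(j)
        # Orthogonal step: least-squares re-fit on the current support
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(d)
    x[support] = coef
    return x, sorted(support)

rng = np.random.default_rng(3)
n, d, k = 100, 50, 3
A = rng.normal(size=(n, d)) / np.sqrt(n)     # columns roughly unit-norm
x_true = np.zeros(d)
x_true[[4, 17, 33]] = [3.0, -2.5, 2.0]
y = A @ x_true + 0.01 * rng.normal(size=n)   # high SNR: recovery is easy

x_hat, support = omp(A, y, k)
```

As the SNR (smallest non-zero coefficient over noise level) drops toward the threshold in the paper's condition, this greedy selection starts picking wrong columns, which is exactly the failure mode their tightness result constructs.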

Retrieved from arxiv
