ResearchHub Feeds
Trending Users

Tim Dingman, Brian Novak, Patrick Joyce, Sebastian Hunte, Khush Deoja, Daniel Himmelstein, Елена Белоснежная, Angela Meng, Rafael Baptista, Andy Tertychniy

Trending Papers in Machine Learning

20 votes · Published: Nov 2020
A method for machine learning and serving of discrete field theories in physics is developed. The learning algorithm trains a discrete field theory from a set of observational data on a spacetime lattice, and the serving algorithm uses the learned discrete field theory to predict new observations of the field for new boundary and initial conditions. The approach of learning discrete field theories overcomes the difficulties associated with learning continuous theories by artificial intelligence. The serving algorithm of discrete field theories belongs to the family of structure-preserving geometric algorithms, which have been proven to be superior to the conventional algorithms based on discretization of differential equations. The effectiveness of the method and algorithms developed is demonstrated using the examples of nonlinear oscillations and the Kepler problem. In particular, the learning algorithm learns a discrete field theory from a set of data of planetary orbits similar to what Kepler inherited from Tycho Brahe in 1601, and the serving algorithm correctly predicts other planetary orbits, including parabolic and hyperbolic escaping orbits, of the solar system without learning or knowing Newton’s laws of motion and universal gravitation. The proposed algorithms are expected to be applicable when the effects of special relativity and general relativity are important.
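
The serving algorithm mentioned above belongs to the family of structure-preserving geometric integrators. As a minimal illustration of that family (not the paper's learned field theory), here is a symplectic leapfrog integrator for the Kepler problem; unlike naive Euler stepping, it preserves the geometric structure of the dynamics, so orbits do not artificially decay or blow up:

```python
import numpy as np

def kepler_accel(q, mu=1.0):
    """Acceleration for the Kepler problem: a = -mu * q / |q|^3."""
    r = np.linalg.norm(q)
    return -mu * q / r**3

def leapfrog(q, p, dt, steps, mu=1.0):
    """Symplectic (structure-preserving) leapfrog integration.

    The discrete update preserves the symplectic two-form, so energy
    errors stay bounded over long horizons instead of drifting.
    """
    traj = [q.copy()]
    for _ in range(steps):
        p = p + 0.5 * dt * kepler_accel(q, mu)   # half kick
        q = q + dt * p                           # drift (unit mass)
        p = p + 0.5 * dt * kepler_accel(q, mu)   # half kick
        traj.append(q.copy())
    return np.array(traj)

# Elliptic orbit: start at perihelion with sub-circular speed.
q0 = np.array([1.0, 0.0])
p0 = np.array([0.0, 0.9])
orbit = leapfrog(q0, p0, dt=0.01, steps=10000)
```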

42 votes
Authors: Ramin Hasani, Mathias Lechner, Alexander Amini, Daniela Rus, Radu Grosu
Published: Jun 2020
We introduce a new class of time-continuous recurrent neural network models. Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems modulated via nonlinear interlinked gates. The resulting models represent dynamical systems with varying (i.e., liquid) time-constants coupled to their hidden state, with outputs being computed by numerical differential equation solvers. These neural networks exhibit stable and bounded behavior, yield superior expressivity within the family of neural ordinary differential equations, and give rise to improved performance on time-series prediction tasks. To demonstrate these properties, we first take a theoretical approach to find bounds over their dynamics and compute their expressive power by the trajectory length measure in latent trajectory space. We then conduct a series of time-series prediction experiments to manifest the approximation capability of Liquid Time-Constant Networks (LTCs) compared to classical and modern RNNs. Code and data are available at https://github.com/raminmh/liquid_time_constant_networks
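
A minimal sketch of the LTC state update described above, assuming a sigmoid gate and a semi-implicit (fused) Euler step; all weights and sizes here are illustrative, not the paper's trained models:

```python
import numpy as np

def ltc_step(x, I, W, U, b, tau, A, dt=0.1):
    """One semi-implicit update of a liquid time-constant (LTC) cell.

    State equation: dx/dt = -[1/tau + f(x, I)] * x + f(x, I) * A,
    where f is a nonlinear gate. The effective time constant
    tau / (1 + tau * f(x, I)) varies with the input ("liquid").
    """
    f = 1.0 / (1.0 + np.exp(-(W @ x + U @ I + b)))  # sigmoid gate
    # The semi-implicit step keeps the state stable and bounded.
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

rng = np.random.default_rng(0)
n, m = 8, 3                      # hidden units, input dims (illustrative)
W, U = rng.normal(size=(n, n)), rng.normal(size=(n, m))
b, tau, A = np.zeros(n), np.ones(n), rng.normal(size=n)

x = np.zeros(n)
for t in range(100):             # unroll over a toy input sequence
    x = ltc_step(x, np.sin(0.1 * t) * np.ones(m), W, U, b, tau, A)
```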

15 votes
Authors: Ren, Jie, et al.
Published: Jan 2021
Large-scale model training has been a playing ground for a limited few, requiring complex model refactoring and access to prohibitively expensive GPU clusters. ZeRO-Offload changes the large model training landscape by making large model training accessible to nearly everyone. It can train models with over 13 billion parameters on a single GPU, a 10x increase in size compared to popular frameworks such as PyTorch, and it does so without requiring any model change from the data scientists or sacrificing computational efficiency. ZeRO-Offload enables large model training by offloading data and compute to the CPU. To preserve compute efficiency, it is designed to minimize data movement to/from the GPU and reduce CPU compute time while maximizing memory savings on the GPU. As a result, ZeRO-Offload can achieve 40 TFlops/GPU on a single NVIDIA V100 GPU for a 10B-parameter model, compared to 30 TFlops using PyTorch alone for a 1.4B-parameter model, the largest that can be trained without running out of memory. ZeRO-Offload is also designed to scale on multiple GPUs when available, offering near-linear speedup on up to 128 GPUs. Additionally, it can work together with model parallelism to train models with over 70 billion parameters on a single DGX-2 box, a 4.5x increase in model size compared to using model parallelism alone. By combining compute and memory efficiency with ease of use, ZeRO-Offload democratizes large-scale model training, making it accessible even to data scientists with access to just a single GPU.
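
A hedged sketch of the offload pattern the abstract describes, written in plain PyTorch rather than the ZeRO-Offload implementation itself: parameters compute on the GPU, while gradients and the Adam optimizer state live on the CPU, where the update runs. The real system overlaps these copies with computation; this sketch does them synchronously.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(1024, 1024).to(device)

# CPU mirror of the parameters; Adam allocates its state on the CPU.
cpu_params = [p.detach().cpu().requires_grad_(True) for p in model.parameters()]
opt = torch.optim.Adam(cpu_params, lr=1e-4)

for _ in range(10):
    x = torch.randn(32, 1024, device=device)
    loss = model(x).pow(2).mean()
    model.zero_grad()
    loss.backward()
    # Offload gradients to the CPU and run the optimizer step there.
    for cp, p in zip(cpu_params, model.parameters()):
        cp.grad = p.grad.detach().cpu()
    opt.step()
    # Copy updated parameters back to the GPU for the next forward pass.
    with torch.no_grad():
        for cp, p in zip(cpu_params, model.parameters()):
            p.copy_(cp.to(device))
```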

17 votes · Published: Jan 2021
The nature of the bulk hydrated electron has been a challenge for both experiment and theory due to its short lifetime and high reactivity, and the need for a high level of electronic structure theory to achieve predictive accuracy. The lack of a classical atomistic structural formula makes it exceedingly difficult to model the solvated electron using conventional empirical force fields, which describe the system in terms of interactions between point particles associated with atomic nuclei. Here we overcome this problem using a machine-learning model that is sufficiently flexible to describe the effect of the excess electron on the structure of the surrounding water, without including the electron in the model explicitly. The resulting potential is not only able to reproduce the stable cavity structure but also recovers the correct localization dynamics that follow the injection of an electron into neat water. The machine learning model achieves the accuracy of the state-of-the-art correlated wave function method it is trained on. It is sufficiently inexpensive to afford a full quantum statistical and dynamical description and allows us to achieve an accurate determination of the structure, diffusion mechanisms, and vibrational spectroscopy of the solvated electron.
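
As a generic, hedged sketch of the machine-learning-potential idea (not the authors' architecture or descriptors), a small network can map structural descriptors to an energy, with forces recovered by differentiating that energy with respect to positions:

```python
import torch

# The inverse-pairwise-distance descriptor below is a placeholder for
# illustration only, not the descriptors used in the paper.
energy_net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

def potential(positions):
    """Sum learned pair energies over all atom pairs."""
    n = positions.shape[0]
    e = positions.new_zeros(())
    for i in range(n):
        for j in range(i + 1, n):
            r = torch.linalg.norm(positions[i] - positions[j])
            e = e + energy_net((1.0 / r).reshape(1, 1)).squeeze()
    return e

pos = torch.randn(5, 3, requires_grad=True)   # 5 toy atoms in 3D
E = potential(pos)
forces = -torch.autograd.grad(E, pos)[0]      # F = -dE/dr via autograd
```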

17 votes · Published: Dec 2020
Screening for prostate cancer relies on the serum prostate-specific antigen test, which has a high false-positive rate (80%). This results in a large number of unnecessary biopsies and subsequent overtreatment. Given the frequency of the test, there is a critical unmet need for precision screening for prostate cancer. Here, we introduce a urinary multimarker biosensor with a capacity to learn to achieve this goal. The correlation of clinical state with the sensing signals from urinary multimarkers was analyzed by two common machine learning algorithms. As the number of biomarkers was increased, both algorithms provided a monotonic increase in screening performance. Under the best combination of biomarkers, the machine learning algorithms screened prostate cancer patients with more than 99% accuracy using 76 urine specimens. A urinary multimarker biosensor leveraged by machine learning analysis can be an important strategy for precision screening of cancers using a drop of bodily fluid.
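
A hedged sketch of the multimarker idea on synthetic data: concatenate several biomarker signals into one feature vector and let a standard classifier learn the decision boundary. A random forest is one plausible choice; the abstract does not name the two algorithms used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for sensor readouts: the paper used real signals
# from 76 urine specimens.
rng = np.random.default_rng(0)
n, n_markers = 76, 4
X = rng.normal(size=(n, n_markers))                    # biomarker signals
y = (X @ rng.normal(size=n_markers) + 0.3 * rng.normal(size=n)) > 0

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y.astype(int), cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")

# Repeating this with 1, 2, 3, ... markers would show the monotonic
# improvement with marker count that the abstract reports.
```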

260 votes
From Paper: Reconciling modern machine learning practice and the bias-variance trade-off
Authors: Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal
Published: Dec 2018
The "double descent" regime, in which additional model capacity improves performance again after an intermediate range where it hurts, appears across several model classes, including neural nets.
The test-error peak typically sits near the interpolation threshold, where the number of parameters roughly matches the number of training examples; error descends a second time as parameters grow well beyond that point (sketched below).
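
A minimal sketch reproducing this shape with random ReLU features and minimum-norm least squares (illustrative, not the paper's exact experiments; exact numbers depend on noise and scaling). Test error typically peaks as the feature count p crosses the sample count n, then falls again:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_test = 100, 10, 1000
w_true = rng.normal(size=d)
X, Xt = rng.normal(size=(n, d)), rng.normal(size=(n_test, d))
y = X @ w_true + 0.5 * rng.normal(size=n)   # noisy training labels
yt = Xt @ w_true                            # clean test targets

for p in [20, 50, 90, 100, 110, 200, 1000]:
    V = rng.normal(size=(d, p)) / np.sqrt(d)              # random projection
    F, Ft = np.maximum(X @ V, 0), np.maximum(Xt @ V, 0)   # ReLU features
    beta = np.linalg.pinv(F) @ y            # minimum-norm least squares
    err = np.mean((Ft @ beta - yt) ** 2)
    print(f"p = {p:5d}  test MSE = {err:.3f}")
```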

17 votes
Authors: Hennigh, Oliver, et al.
Published: Dec 2020
We present SimNet, an AI-driven multi-physics simulation framework, to accelerate simulations across a wide range of disciplines in science and engineering. Compared to traditional numerical solvers, SimNet addresses a wide range of use cases: coupled forward simulations without any training data, inverse problems, and data assimilation problems. SimNet offers fast turnaround time by enabling a parameterized system representation that solves for multiple configurations simultaneously, as opposed to traditional solvers that solve for one configuration at a time. SimNet is integrated with parameterized constructive solid geometry as well as STL modules to generate point clouds. Furthermore, it is customizable with APIs that enable user extensions to geometry, physics, and network architecture. It has advanced network architectures that are optimized for high-performance GPU computing, and offers scalable performance for multi-GPU and multi-node implementations with accelerated linear algebra as well as FP32, FP64, and TF32 computations. In this paper we review the neural network solver methodology, the SimNet architecture, and the various features that are needed for effective solution of the PDEs. We present real-world use cases that range from challenging forward multi-physics simulations with turbulence and complex 3D geometries, to industrial design optimization and inverse problems that are not addressed efficiently by traditional solvers. Extensive comparisons of SimNet results with open source and commercial solvers show good correlation.
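
A minimal sketch of the underlying neural-network-solver idea (physics-informed training, not SimNet's actual API): fit a network to minimize a differential-equation residual plus boundary terms, with no training data. The toy problem here is u'' + u = 0 on [0, pi] with u(0) = 0 and u'(0) = 1, whose solution is sin(x):

```python
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    # Residual loss at random collocation points (no training data).
    x = torch.rand(256, 1) * torch.pi
    x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    residual = (d2u + u).pow(2).mean()          # enforce u'' + u = 0

    # Boundary/initial conditions: u(0) = 0, u'(0) = 1.
    x0 = torch.zeros(1, 1, requires_grad=True)
    u0 = net(x0)
    du0 = torch.autograd.grad(u0.sum(), x0, create_graph=True)[0]
    bc = u0.pow(2).mean() + (du0 - 1).pow(2).mean()

    loss = residual + bc
    opt.zero_grad()
    loss.backward()
    opt.step()
```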

17 votes
The authors present a method for modeling irregularly sampled time series using machine learning (a generic sketch of the idea follows below).
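
A hedged sketch of one common approach to irregular sampling: decay the hidden state over the elapsed time gap, in the spirit of GRU-D and ODE-RNNs. This is a generic illustration, not necessarily the specific method in the paper above.

```python
import numpy as np

def decay_rnn_step(h, x, dt, Wx, Wh, b, lam=1.0):
    """One step of a time-aware RNN for irregularly sampled series.

    The hidden state decays toward zero according to the elapsed time
    dt since the previous observation, so the model sees the sampling
    gaps rather than assuming a fixed step.
    """
    h = h * np.exp(-lam * dt)            # continuous-time decay over the gap
    return np.tanh(Wx @ x + Wh @ h + b)  # standard recurrent update

rng = np.random.default_rng(0)
n, m = 16, 1
Wx = rng.normal(size=(n, m))
Wh = rng.normal(size=(n, n)) / np.sqrt(n)
b = np.zeros(n)

h, t_prev = np.zeros(n), 0.0
for t, x in [(0.1, 0.5), (0.35, 0.1), (1.2, -0.4)]:   # (time, value) pairs
    h = decay_rnn_step(h, np.array([x]), t - t_prev, Wx, Wh, b)
    t_prev = t
```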

15 votes · Published: Sep 2020
The electronic Schrödinger equation can only be solved analytically for the hydrogen atom, and the numerically exact full configuration-interaction method is exponentially expensive in the number of electrons. Quantum Monte Carlo methods are a possible way out: they scale well for large molecules, they can be parallelized and their accuracy has, as yet, been only limited by the flexibility of the wavefunction ansatz used. Here we propose PauliNet, a deep-learning wavefunction ansatz that achieves nearly exact solutions of the electronic Schrödinger equation for molecules with up to 30 electrons. PauliNet has a multireference Hartree–Fock solution built in as a baseline, incorporates the physics of valid wavefunctions and is trained using variational quantum Monte Carlo. PauliNet outperforms previous state-of-the-art variational ansatzes for atoms, diatomic molecules and a strongly correlated linear H10 chain, and matches the accuracy of highly specialized quantum chemistry methods on the transition-state energy of cyclobutadiene, while being computationally efficient.
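
A hedged sketch of variational quantum Monte Carlo, the training principle behind PauliNet, on the simplest possible system: a hydrogen atom with the one-parameter trial wavefunction psi(r) = exp(-alpha r). For this ansatz the local energy is E_L = -alpha^2/2 + (alpha - 1)/r, exactly -1/2 Ha at alpha = 1; PauliNet replaces the hand-written ansatz with a deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

def vmc_energy(alpha, n_steps=20000, step=0.5):
    """Metropolis sampling of |psi|^2, averaging the local energy."""
    r = np.ones(3)                     # electron position
    energies = []
    for _ in range(n_steps):
        r_new = r + step * rng.normal(size=3)
        # Accept with probability |psi(r_new)/psi(r)|^2.
        ratio = np.exp(-2 * alpha * (np.linalg.norm(r_new) - np.linalg.norm(r)))
        if rng.random() < ratio:
            r = r_new
        d = np.linalg.norm(r)
        energies.append(-0.5 * alpha**2 + (alpha - 1.0) / d)
    return np.mean(energies[2000:])    # discard burn-in

for alpha in [0.8, 1.0, 1.2]:
    print(f"alpha = {alpha:.1f}  E = {vmc_energy(alpha):.3f} Ha")
```

Minimizing this energy estimate over the ansatz parameters (here just alpha) is the variational step; in PauliNet the parameters are network weights.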

281 votes
From Paper: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Published: Nov 2019
When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.
When evaluated on 57 different Atari games - the canonical video game environment for testing AI techniques, in which model-based planning approaches have historically struggled - MuZero achieved a new state of the art.
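
A hedged structural sketch of MuZero's core idea (illustrative sizes and plain linear layers, not DeepMind's implementation): a representation function maps observations to a latent state, a dynamics function steps that state forward given an action, and a prediction function outputs policy and value, so planning can run entirely in the learned latent space without the game rules.

```python
import torch

obs_dim, act_dim, latent = 8, 4, 32   # illustrative sizes
h = torch.nn.Linear(obs_dim, latent)               # representation
g = torch.nn.Linear(latent + act_dim, latent + 1)  # dynamics (+ reward)
f = torch.nn.Linear(latent, act_dim + 1)           # policy logits + value

def unroll(obs, actions):
    """Unroll the learned model K steps from a real observation."""
    s = torch.tanh(h(obs))
    outputs = []
    for a in actions:                              # one-hot action tensors
        out = f(s)
        policy_logits, value = out[:act_dim], out[act_dim]
        nxt = g(torch.cat([s, a]))
        s, reward = torch.tanh(nxt[:latent]), nxt[latent]
        outputs.append((policy_logits, value, reward))
    return outputs

# Training would regress these K-step predictions against
# search-improved policies, observed rewards, and bootstrapped values.
preds = unroll(torch.randn(obs_dim), [torch.eye(act_dim)[0]] * 3)
```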