Published: Nov 2020

A method for machine learning and serving of discrete field theories in physics is developed. The learning algorithm trains a discrete field theory from a set of observational data on a spacetime lattice, and the serving algorithm uses the learned discrete field theory to predict new observations of the field for new boundary and initial conditions. The approach of learning discrete field theories overcomes the difficulties associated with learning continuous theories by artificial intelligence. The serving algorithm of discrete field theories belongs to the family of structure-preserving geometric algorithms, which have been proven to be superior to the conventional algorithms based on discretization of differential equations. The effectiveness of the method and algorithms developed is demonstrated using the examples of nonlinear oscillations and the Kepler problem. In particular, the learning algorithm learns a discrete field theory from a set of data of planetary orbits similar to what Kepler inherited from Tycho Brahe in 1601, and the serving algorithm correctly predicts other planetary orbits, including parabolic and hyperbolic escaping orbits, of the solar system without learning or knowing Newton’s laws of motion and universal gravitation. The proposed algorithms are expected to be applicable when the effects of special relativity and general relativity are important.
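The serving algorithm's advantage comes from structure-preserving geometric integration. As a minimal illustration of the idea on the paper's Kepler example (this is standard semi-implicit Euler, not the learned discrete field theory), the energy error of a symplectic step stays bounded over many orbits, where a naive explicit scheme would drift:

```python
import math

def kepler_accel(qx, qy, mu=1.0):
    """Acceleration from an inverse-square central force (Newtonian gravity)."""
    r = math.hypot(qx, qy)
    return -mu * qx / r**3, -mu * qy / r**3

def symplectic_euler(q, v, dt, steps):
    """Semi-implicit Euler: a simple structure-preserving (symplectic) integrator."""
    (qx, qy), (vx, vy) = q, v
    traj = [(qx, qy)]
    for _ in range(steps):
        ax, ay = kepler_accel(qx, qy)
        vx += dt * ax          # update velocity first...
        vy += dt * ay
        qx += dt * vx          # ...then position with the *new* velocity
        qy += dt * vy
        traj.append((qx, qy))
    return traj, (vx, vy)

def energy(q, v, mu=1.0):
    r = math.hypot(*q)
    return 0.5 * (v[0]**2 + v[1]**2) - mu / r

q0, v0 = (1.0, 0.0), (0.0, 1.0)   # circular orbit with mu = 1
traj, v1 = symplectic_euler(q0, v0, dt=0.01, steps=10000)
drift = abs(energy(traj[-1], v1) - energy(q0, v0))
```

After roughly sixteen orbits the energy error remains a small bounded oscillation rather than a secular drift, which is the property the serving algorithm exploits.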

Authors: Ramin Hasani, Mathias Lechner, Alexander Amini, Daniela Rus, Radu Grosu

Published: Jun 2020

We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities,
we construct networks of linear first-order dynamical systems modulated via
nonlinear interlinked gates. The resulting models represent dynamical systems
with varying (i.e., liquid) time-constants coupled to their hidden state, with
outputs being computed by numerical differential equation solvers. These neural
networks exhibit stable and bounded behavior, yield superior expressivity
within the family of neural ordinary differential equations, and give rise to
improved performance on time-series prediction tasks. To demonstrate these
properties, we first take a theoretical approach to find bounds over their
dynamics and compute their expressive power by the trajectory length measure in
latent trajectory space. We then conduct a series of time-series prediction
experiments to manifest the approximation capability of Liquid Time-Constant
Networks (LTCs) compared to classical and modern RNNs. Code and data are
available at https://github.com/raminmh/liquid_time_constant_networks
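The paper computes each state update with a fused ODE solver step. The scalar sketch below shows the shape of that update; the sigmoid gate, weight values, and parameters are illustrative assumptions, not the paper's trained model:

```python
import math

def ltc_step(x, I, dt, tau=1.0, A=1.0, w=1.0, b=0.0):
    """One fused implicit-Euler step of a liquid time-constant (LTC) cell.

    The underlying ODE is dx/dt = -(1/tau + f(x, I)) * x + f(x, I) * A,
    so the effective ("liquid") time constant tau / (1 + tau * f(x, I))
    varies with the input rather than being fixed.
    """
    f = 1.0 / (1.0 + math.exp(-(w * I + b)))   # bounded nonlinear gate (assumed sigmoid)
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

x = 0.0
for _ in range(100):                            # drive the cell with a constant input
    x = ltc_step(x, I=1.0, dt=0.1)
```

With constant input the state converges to f·A / (1/τ + f) and remains bounded for any input, matching the stability claim in the abstract.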

Authors: Ren, Jie, et al

Published: Jan 2021

Large-scale model training has been a playing ground for a limited few
requiring complex model refactoring and access to prohibitively expensive GPU
clusters. ZeRO-Offload changes the large model training landscape by making
large model training accessible to nearly everyone. It can train models with
over 13 billion parameters on a single GPU, a 10x increase in size compared to
popular frameworks such as PyTorch, and it does so without requiring any model
change from the data scientists or sacrificing computational efficiency.
ZeRO-Offload enables large model training by offloading data and compute to
the CPU. To preserve compute efficiency, it is designed to minimize data
movement to and from the GPU and reduce CPU compute time while maximizing
memory savings on the GPU. As a result, ZeRO-Offload can achieve 40 TFlops/GPU
on a single NVIDIA V100 GPU for a 10B-parameter model, compared to 30 TFlops
using PyTorch alone for a 1.4B-parameter model, the largest that can be
trained without running out of memory. ZeRO-Offload is also designed to scale
on multiple GPUs when available, offering near-linear speedup on up to 128
GPUs. Additionally, it can work together with model parallelism to train
models with over 70 billion parameters on a single DGX-2 box, a 4.5x increase
in model size compared to using model parallelism alone. By combining compute
and memory efficiency with ease-of-use, ZeRO-Offload democratizes large-scale
model training, making it accessible even to data scientists with access to
just a single GPU.
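The core mechanism, keeping optimizer state in host memory so the GPU holds only parameters and gradients, can be sketched with a toy optimizer. This is not the DeepSpeed implementation (which uses an optimized CPU-Adam, pinned memory, and overlapped CUDA transfers); plain Python lists stand in for device and host tensors:

```python
class CPUAdamOffload:
    """Toy Adam whose state lives on the 'CPU'; only params/grads touch the 'GPU'."""
    def __init__(self, params, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, b1, b2, eps
        self.t = 0
        # Optimizer state stays host-side: master copy plus both moments.
        self.master = [float(p) for p in params]
        self.m = [0.0] * len(params)
        self.v = [0.0] * len(params)

    def step(self, gpu_params, gpu_grads):
        self.t += 1
        cpu_grads = list(gpu_grads)            # device -> host transfer
        for i, g in enumerate(cpu_grads):      # the update itself runs on the CPU
            self.m[i] = self.b1 * self.m[i] + (1 - self.b1) * g
            self.v[i] = self.b2 * self.v[i] + (1 - self.b2) * g * g
            mhat = self.m[i] / (1 - self.b1 ** self.t)
            vhat = self.v[i] / (1 - self.b2 ** self.t)
            self.master[i] -= self.lr * mhat / (vhat ** 0.5 + self.eps)
        gpu_params[:] = self.master            # host -> device transfer

params = [1.0, -2.0]
opt = CPUAdamOffload(params)
for _ in range(500):
    grads = [2 * p for p in params]            # gradient of the toy loss sum(p^2)
    opt.step(params, grads)
```

The GPU-side memory cost is reduced to the parameters and gradients alone, while the moment buffers and master weights, the bulk of Adam's footprint, live in the far larger CPU memory.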

Published: Jan 2021

The nature of the bulk hydrated electron has been a challenge for both experiment and theory due to its short lifetime and high reactivity, and the need for a high level of electronic structure theory to achieve predictive accuracy. The lack of a classical atomistic structural formula makes it exceedingly difficult to model the solvated electron using conventional empirical force fields, which describe the system in terms of interactions between point particles associated with atomic nuclei. Here we overcome this problem using a machine-learning model that is sufficiently flexible to describe the effect of the excess electron on the structure of the surrounding water, without including the electron in the model explicitly. The resulting potential is not only able to reproduce the stable cavity structure but also recovers the correct localization dynamics that follow the injection of an electron into neat water. The machine-learning model achieves the accuracy of the state-of-the-art correlated wave-function method it is trained on. It is sufficiently inexpensive to afford a full quantum statistical and dynamical description, allowing an accurate determination of the structure, diffusion mechanisms, and vibrational spectroscopy of the solvated electron.

Published: Dec 2020

Screening for prostate cancer relies on the serum prostate-specific antigen test, which has a high false-positive rate (80%). This results in a large number of unnecessary biopsies and subsequent overtreatment. Given the frequency of the test, there is a critical unmet need for precision screening for prostate cancer. Here, we introduce a urinary multimarker biosensor with a capacity to learn to achieve this goal. The correlation of clinical state with the sensing signals from urinary multimarkers was analyzed by two common machine learning algorithms. As the number of biomarkers was increased, both algorithms provided a monotonic increase in screening performance. Under the best combination of biomarkers, the machine learning algorithms screened prostate cancer patients with more than 99% accuracy using 76 urine specimens. A urinary multimarker biosensor leveraged by machine-learning analysis can be an important strategy for precision screening of cancers using a drop of bodily fluid.

From Paper: Reconciling modern machine learning practice and the bias-variance
trade-off

Authors: Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal

Published: Dec 2018

The "double descent" regime, where additional model capacity improves performance after a period of harming it, appears in several model classes, including neural nets

Double descent often occurs when the number of parameters exceeds the number of examples

Authors: Hennigh, Oliver, et al

Published: Dec 2020

We present SimNet, an AI-driven multi-physics simulation framework, to
accelerate simulations across a wide range of disciplines in science and
engineering. Compared to traditional numerical solvers, SimNet addresses a wide
range of use cases: coupled forward simulations without any training data, and
inverse and data assimilation problems. SimNet offers fast turnaround time by
enabling parameterized system representation that solves for multiple
configurations simultaneously, as opposed to the traditional solvers that solve
for one configuration at a time. SimNet is integrated with parameterized
constructive solid geometry as well as STL modules to generate point clouds.
Furthermore, it is customizable with APIs that enable user extensions to
geometry, physics and network architecture. It has advanced network
architectures that are optimized for high-performance GPU computing, and offers
scalable performance for multi-GPU and multi-node implementations with
accelerated linear algebra as well as FP32, FP64 and TF32 computations. In this
paper we review the neural network solver methodology, the SimNet architecture,
and the various features that are needed for effective solution of the PDEs. We
present real-world use cases that range from challenging forward multi-physics
simulations with turbulence and complex 3D geometries, to industrial design
optimization and inverse problems that are not addressed efficiently by the
traditional solvers. Extensive comparisons of SimNet results with open source
and commercial solvers show good correlation.
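Neural network solvers of this kind train against the PDE residual as a loss rather than discretizing the equations. The residual-as-loss idea can be sketched on a grid, with grid values standing in for the network and least squares standing in for training (a deliberate simplification, not SimNet's method):

```python
import numpy as np

# Solve u''(x) = -sin(x) on [0, pi] with u(0) = u(pi) = 0 (exact: u = sin x)
# by minimizing the squared PDE residual over the grid values -- the same
# residual-as-loss objective that neural network PDE solvers train on.
n = 101
x = np.linspace(0.0, np.pi, n)
h = x[1] - x[0]

# Second-difference operator on all n grid values, evaluated at interior points.
D2 = np.zeros((n - 2, n))
for i in range(n - 2):
    D2[i, i], D2[i, i + 1], D2[i, i + 2] = 1 / h**2, -2 / h**2, 1 / h**2

f = -np.sin(x[1:-1])

# Stack the residual rows with heavily weighted boundary-condition rows,
# the least-squares analogue of a boundary-loss term.
bc = np.zeros((2, n))
bc[0, 0] = bc[1, -1] = 1e6
A = np.vstack([D2, bc])
rhs = np.concatenate([f, [0.0, 0.0]])

u, *_ = np.linalg.lstsq(A, rhs, rcond=None)
err = np.max(np.abs(u - np.sin(x)))
```

A neural network solver replaces the grid values with a network evaluated at sampled points and the least-squares solve with gradient descent, which is what makes parameterized geometries and inverse problems tractable in the same framework.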

The authors present a machine learning method for modeling irregularly sampled time series

Published: Sep 2020

The electronic Schrödinger equation can only be solved analytically for the hydrogen atom, and the numerically exact full configuration-interaction method is exponentially expensive in the number of electrons. Quantum Monte Carlo methods are a possible way out: they scale well for large molecules, they can be parallelized and their accuracy has, as yet, been only limited by the flexibility of the wavefunction ansatz used. Here we propose PauliNet, a deep-learning wavefunction ansatz that achieves nearly exact solutions of the electronic Schrödinger equation for molecules with up to 30 electrons. PauliNet has a multireference Hartree–Fock solution built in as a baseline, incorporates the physics of valid wavefunctions and is trained using variational quantum Monte Carlo. PauliNet outperforms previous state-of-the-art variational ansatzes for atoms, diatomic molecules and a strongly correlated linear H10, and matches the accuracy of highly specialized quantum chemistry methods on the transition-state energy of cyclobutadiene, while being computationally efficient.
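PauliNet's training objective, variational quantum Monte Carlo, can be sketched on a textbook problem. The example below uses a 1D harmonic oscillator with a Gaussian trial wavefunction (not PauliNet's ansatz) to show the two ingredients: Metropolis sampling of |ψ|² and averaging the local energy:

```python
import math
import random

def local_energy(x, a):
    """E_L = -psi''/(2 psi) + V for the trial psi_a(x) = exp(-a x^2), V = x^2/2."""
    return a + x * x * (0.5 - 2.0 * a * a)

def vmc_energy(a, n_steps=20000, step=1.0, seed=0):
    """Metropolis sampling of |psi_a|^2, averaging the local energy."""
    rng = random.Random(seed)
    x, total = 0.0, 0.0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        # Acceptance ratio |psi(x_new) / psi(x)|^2
        if rng.random() < math.exp(-2.0 * a * (x_new**2 - x**2)):
            x = x_new
        total += local_energy(x, a)
    return total / n_steps

e_exact = vmc_energy(0.5)   # a = 0.5 is the exact ground state
e_off = vmc_energy(0.3)     # any other a must give a higher energy
```

At the exact ground state the local energy is constant (zero variance), while any other parameter yields a higher average energy by the variational principle; PauliNet minimizes exactly this kind of sampled energy over the parameters of a deep wavefunction ansatz.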

From Paper: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

Published: Nov 2019

When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.

When evaluated on 57 different Atari games - the canonical video game environment for testing AI techniques, in which model-based planning approaches have historically struggled - MuZero achieved a new state of the art.
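Planning through a learned model of dynamics and reward, rather than the real rules, can be sketched with a toy rollout planner. MuZero itself plans with MCTS over learned latent states; everything below, including the one-dimensional chain environment, is an illustrative assumption:

```python
import random

ACTIONS = [-1, +1]

def learned_model(state, action):
    """Stand-in for a learned dynamics/reward model: predicts the next state
    and reward without access to the real rules. (Assumed form: a 1-D chain
    with a rewarding goal at state 5.)"""
    nxt = state + action
    return nxt, (1.0 if nxt == 5 else 0.0)

def plan(state, depth=6, rollouts=200, rng=random.Random(0)):
    """Pick the first action whose random rollouts *through the learned
    model* accumulate the most predicted reward."""
    best_a, best_v = None, float("-inf")
    for a in ACTIONS:
        total = 0.0
        for _ in range(rollouts):
            s, ret = learned_model(state, a)
            for _ in range(depth - 1):
                s, r = learned_model(s, rng.choice(ACTIONS))
                ret += r
            total += ret
        if total > best_v:
            best_a, best_v = a, total
    return best_a

action = plan(state=0)   # moving toward the goal scores best in simulation
```

The planner never queries the environment during search, only its model of it, which is the property that lets MuZero plan in domains like Atari where the rules are unknown.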
