# The Separation Capacity of Random Neural Networks

@article{Dirksen2021TheSC, title={The Separation Capacity of Random Neural Networks}, author={Sjoerd Dirksen and Martin Genzel and Laurent Jacques and Alexander Stollenwerk}, journal={ArXiv}, year={2021}, volume={abs/2108.00207} }

Neural networks with random weights appear in a variety of machine learning applications, most prominently as the initialization of many deep learning algorithms and as a computationally cheap alternative to fully learned neural networks. In the present article we enhance the theoretical understanding of random neural nets by addressing the following data separation problem: under what conditions can a random neural network make two classes X−,X+ ⊂ Rd (with positive distance) linearly separable… Expand

#### References

SHOWING 1-10 OF 47 REFERENCES

Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?

- Computer Science, Mathematics
- IEEE Transactions on Signal Processing
- 2016

It is formally proved that these networks with random Gaussian weights perform a distance-preserving embedding of the data, with a special treatment for in-class and out-of-class data. Expand

Random Vector Functional Link Networks for Function Approximation on Manifolds

- Computer Science, Mathematics
- ArXiv
- 2020

A (corrected) rigorous proof that the Igelnik and Pao construction is a universal approximator for continuous functions on compact domains, with approximation error decaying asymptotically like $O(1/\sqrt{n})$ for the number of network nodes, is provided. Expand

On the Power and Limitations of Random Features for Understanding Neural Networks

- Computer Science, Mathematics
- NeurIPS
- 2019

This paper rigorously show that random features cannot be used to learn even a single ReLU neuron with standard Gaussian inputs, unless the network size is exponentially large, and concludes that a single neuron is learnable with gradient-based methods. Expand

Learning Polynomials with Neural Networks

- Mathematics, Computer Science
- ICML
- 2014

This paper shows that for a randomly initialized neural network with sufficiently many hidden units, the generic gradient descent algorithm learns any low degree polynomial, assuming the authors initialize the weights randomly, and shows that if they use complex-valued weights, there are no "robust local minima". Expand

Understanding deep learning requires rethinking generalization

- Computer Science
- ICLR
- 2017

These experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data, and confirm that simple depth two neural networks already have perfect finite sample expressivity. Expand

Understanding deep learning (still) requires rethinking generalization

- Computer Science
- Commun. ACM
- 2021

These experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data, and corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity. Expand

Extreme learning machine: Theory and applications

- Computer Science
- Neurocomputing
- 2006

A new learning algorithm called ELM is proposed for feedforward neural networks (SLFNs) which randomly chooses hidden nodes and analytically determines the output weights of SLFNs which tends to provide good generalization performance at extremely fast learning speed. Expand

Towards a Unified Analysis of Random Fourier Features

- Computer Science, Mathematics
- ICML
- 2019

This work provides the first unified risk analysis of learning with random Fourier features using the squared error and Lipschitz continuous loss functions and devise a simple approximation scheme which provably reduces the computational cost without loss of statistical efficiency. Expand

The Benefits of Over-parameterization at Initialization in Deep ReLU Networks

- Computer Science, Mathematics
- ArXiv
- 2019

This paper proves some desirable theoretical properties at initialization of over-parameterized ReLU networks and shows novel properties that hold under He initialization, including the aforementioned hidden activation norm property, and shows that this property holds for a finite width network even when the number of data samples is infinite. Expand

On the Approximation Properties of Random ReLU Features

- Mathematics
- 2018

We study the approximation properties of random ReLU features through their reproducing kernel Hilbert space (RKHS). We first prove a universality theorem for the RKHS induced by random features… Expand