Abstract
We connect the information flow in a neural network to sufficient statistics, and show how techniques rooted in information theory, such as the source-coding-based information bottleneck method, can lead to improved architectures as well as a better understanding of the theoretical foundations of neural networks viewed as cascaded compression networks. We illustrate our results and viewpoint with numerical examples.