
Event

Markos Katsoulakis (University of Massachusetts-Amherst)

Monday, April 3, 2023, 16:30 to 17:30
Burnside Hall Room 1104, 805 rue Sherbrooke Ouest, Montreal, QC, H3A 0B9, CA

Title:

Function-space regularized information divergences and optimal transport for enhanced generative modeling

Abstract:

We present recent work on new variational representations for probability divergences and metrics, with applications to machine learning and uncertainty quantification (UQ). The newly constructed information-theoretic divergences interpolate between f-divergences (e.g., the KL divergence) and integral probability metrics (IPMs) such as the Wasserstein and MMD distances.
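For orientation, the two families being interpolated have standard variational forms, sketched below. The symbols P, Q, g and Γ are notation chosen here for illustration; the abstract does not state the talk's own construction, so this shows only the two classical endpoints.

```latex
% KL divergence (Donsker-Varadhan formula): supremum over bounded measurable g
D_{\mathrm{KL}}(Q \,\|\, P)
  = \sup_{g} \left\{ \mathbb{E}_Q[g] - \log \mathbb{E}_P\!\left[ e^{g} \right] \right\}

% Integral probability metric over a function class \Gamma
W^{\Gamma}(Q, P)
  = \sup_{g \in \Gamma} \left\{ \mathbb{E}_Q[g] - \mathbb{E}_P[g] \right\}
```

Taking Γ to be the 1-Lipschitz functions gives the Wasserstein-1 distance, and taking Γ to be the unit ball of a reproducing kernel Hilbert space gives the MMD. Both objectives are suprema of an expectation under Q minus a functional of g under P, which is the shared structure that makes an interpolation between them plausible.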

These divergences show improved convergence and stability properties in statistical learning applications, in particular for generative adversarial networks (GANs), as well as tighter uncertainty regions in UQ.

These divergences also provide new mathematical and computational insights into Lipschitz regularization methods (e.g., spectral normalization in neural networks), which have recently emerged as an important algorithmic tool in deep learning. A version of the Data Processing Inequality allows flexibility in selecting the function space optimized over in the variational representation of the divergences.

This feature is particularly useful when learning distributions that preserve additional structure, such as group symmetries or more general constraints.

Combining our new divergences with recent advances in invariant and equivariant neural networks allowed us to introduce Structure-Preserving GANs (SP-GAN) as a data-efficient approach for learning distributions with symmetries.

Our theoretical insights lead to a reduced invariant discriminator space, as well as to carefully constructed equivariant generators, avoiding flawed designs that can easily lead to a catastrophic “mode collapse” for the learned distribution.
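As a rough illustration of what an invariant discriminator means, the sketch below averages a function over the four 90-degree rotations of a square image, so that its output no longer depends on which rotation of the input it sees. This is a generic group-averaging construction assumed here for illustration, not necessarily the design used in SP-GAN; `toy_discriminator`, `weights`, and the choice of the rotation group C4 are hypothetical.

```python
import numpy as np


def c4_orbit(x):
    """Return the four 90-degree rotations of a square 2D array x."""
    return [np.rot90(x, k) for k in range(4)]


def symmetrize(discriminator):
    """Wrap a scalar-valued function so its output is averaged over the C4 orbit."""
    def invariant_discriminator(x):
        return float(np.mean([discriminator(g_x) for g_x in c4_orbit(x)]))
    return invariant_discriminator


# Hypothetical stand-in for a learned discriminator; it is not rotation-invariant.
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8))


def toy_discriminator(x):
    return float(np.tanh(np.sum(weights * x)))


invariant_d = symmetrize(toy_discriminator)

x = rng.normal(size=(8, 8))
# The symmetrized function agrees on an image and on its rotation.
print(np.isclose(invariant_d(x), invariant_d(np.rot90(x))))
```

Averaging over the group is only one way to obtain invariance; building the symmetry into the network layers is another, and the abstract does not say which is used.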

Our experimental and theoretical results show a drastic improvement in sample fidelity and diversity and, importantly, a marked reduction in the amount of data needed to learn invariant distributions.
