Effectively Unbiased FID and Inception Score and where to find them

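The paper shows that FID and IS computed from finite sample sets are biased, with a bias that depends on the model being evaluated, and proposes extrapolating the scores to an infinite sample size to obtain effectively bias-free estimates. Quasi-Monte Carlo integration improves the finite-sample estimates this extrapolation relies on, and low-discrepancy sequences offer small improvements in GAN training.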

Year
2019
Venue
effectively-unbiased-fid-and-inception-score-1
Authors
2
Hosting
Abstract only

Attribution

Abstract & full text
arxiv.org/abs/1911.07023v3
TL;DR
Semantic Scholar

Abstract

This paper shows that two commonly used evaluation metrics for generative models, the Fréchet Inception Distance (FID) and the Inception Score (IS), are biased -- the expected value of the score computed for a finite sample set is not the true value of the score. Worse, the paper shows that the bias term depends on the particular model being evaluated, so model A may get a better score than model B simply because model A's bias term is smaller. This effect cannot be fixed by evaluating at a fixed number of samples. This means all comparisons using FID or IS as currently computed are unreliable. We then show how to extrapolate the score to obtain an effectively bias-free estimate of scores computed with an infinite number of samples, which we term $\overline{\textrm{FID}}_\infty$ and $\overline{\textrm{IS}}_\infty$. In turn, this effectively bias-free estimate requires good estimates of scores with a finite number of samples. We show that using Quasi-Monte Carlo integration notably improves estimates of FID and IS for finite sample sets. Our extrapolated scores are simple, drop-in replacements for the finite sample scores. Additionally, we show that using low-discrepancy sequences in GAN training offers small improvements in the resulting generator.
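
The extrapolation idea in the abstract can be illustrated with a minimal sketch. It assumes the finite-sample score is approximately linear in 1/N, so fitting a line to scores measured at several sample sizes and reading off the intercept gives an estimate of the score at N → ∞. The function name `extrapolate_to_infinity` and the synthetic scores are hypothetical and chosen for illustration only; this is not the authors' implementation, and a real evaluation would substitute FID or IS values computed on actual generated samples.

```python
import numpy as np


def extrapolate_to_infinity(sample_sizes, scores):
    """Estimate the infinite-sample score from finite-sample scores.

    Assumption (not stated in the abstract): the score is roughly
    linear in 1/N, so the intercept of a line fitted against 1/N
    approximates the score as N goes to infinity.
    """
    inv_n = 1.0 / np.asarray(sample_sizes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # np.polyfit returns coefficients from highest degree to lowest:
    # [slope, intercept] for a degree-1 fit.
    slope, intercept = np.polyfit(inv_n, scores, deg=1)
    return float(intercept)


# Hypothetical, synthetic scores: a "true" value of 10.0 plus a bias
# term that shrinks as 1/N. Real FID/IS values would go here instead.
sample_sizes = np.array([5000, 10000, 20000, 50000])
synthetic_scores = 10.0 + 50000.0 / sample_sizes

estimate = extrapolate_to_infinity(sample_sizes, synthetic_scores)
print(f"extrapolated score: {estimate:.4f}")
```

On this noise-free synthetic data the intercept recovers a value close to the chosen "true" score. With real scores each finite-sample value is itself noisy, which is why the abstract stresses the need for good finite-sample estimates before extrapolating.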
