Natural scenes, like most natural data sets, show considerable redundancy. Although many forms of redundancy have been investigated (e.g., pixel distributions, power spectra, and contour relationships), estimating the true entropy of natural scenes has largely been considered intractable. We describe a technique for estimating the entropy and relative dimensionality of image patches based on a function we call the proximity distribution (a nearest-neighbor technique). The advantage of this function over simple statistics such as the power spectrum is that the proximity distribution depends on all forms of redundancy. We demonstrate that this function can be used to estimate the entropy (redundancy) of patches of known entropy as well as patches of Gaussian white noise, natural scenes, and noise with the same power spectrum as natural scenes. The techniques are based on assumptions regarding the intrinsic dimensionality of the data, and although the estimates depend on an extrapolation model for images larger than , we argue that this approach provides the best current estimates of the entropy and compressibility of natural-scene patches and that it provides insights into the efficiency of any coding strategy that aims to reduce redundancy. We show that the sample of patches of natural scenes used in this study has less than half the entropy of white noise and less than 60% of the entropy of noise with the same power spectrum. In addition, given a finite number of samples drawn randomly from the space of patches, the subspace of natural-scene patches shows a dimensionality that depends on the sampling density and that, for low densities, is significantly lower than the dimensionality of the space of patches of white noise and of noise with the same power spectrum.
© 2007 Optical Society of America
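
As a rough illustration of how nearest-neighbor distances can carry entropy information, the sketch below uses the standard Kozachenko-Leonenko estimator. This is a stand-in chosen for brevity, not the proximity-distribution method described in the abstract, and it omits the extrapolation model entirely. The function names (`digamma_int`, `knn_entropy`), the sample count, the dimensionality, and the toy correlated Gaussian are all assumptions made for the example; no natural-scene data are involved.

```python
import math
import numpy as np

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function evaluated at a positive integer n."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_entropy(samples, k=1):
    """Kozachenko-Leonenko estimate of differential entropy, in nats.

    samples: array of shape (n, d), one d-dimensional sample per row.
    k: which nearest neighbor to use (k=1 is the closest other sample).
    """
    x = np.asarray(samples, dtype=float)
    n, d = x.shape
    # Brute-force squared Euclidean distances between all pairs (O(n^2) memory).
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.maximum(d2, 0.0, out=d2)      # clip tiny negative rounding errors
    np.fill_diagonal(d2, np.inf)     # a sample is not its own neighbor
    # Distance from each sample to its k-th nearest neighbor.
    kth = np.sqrt(np.partition(d2, k - 1, axis=1)[:, k - 1])
    # Log volume of the d-dimensional unit ball.
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    return (digamma_int(n) - digamma_int(k) + log_unit_ball
            + d * float(np.mean(np.log(kth))))


# Toy comparison (hypothetical sizes): white noise vs. a redundant signal.
rng = np.random.default_rng(0)
n, d = 2000, 3
white = rng.standard_normal((n, d))
# Correlated Gaussian with unit marginal variances, standing in for redundant data.
cov = np.full((d, d), 0.9) + 0.1 * np.eye(d)
correlated = rng.multivariate_normal(np.zeros(d), cov, size=n)

h_white = knn_entropy(white)
h_corr = knn_entropy(correlated)
# Closed-form entropies of the two Gaussians, for reference.
h_true = 0.5 * d * math.log(2.0 * math.pi * math.e)
h_corr_true = h_true + 0.5 * math.log(np.linalg.det(cov))

print(f"white noise: estimate {h_white:.3f} nats, exact {h_true:.3f} nats")
print(f"correlated:  estimate {h_corr:.3f} nats, exact {h_corr_true:.3f} nats")
```

The correlated samples sit closer together than white-noise samples with the same per-dimension variance, so their nearest-neighbor distances shrink and the estimated entropy drops. The brute-force distance matrix is only practical for a few thousand low-dimensional samples; real image patches would need a far larger sample and a proper neighbor search.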