Abstract

Traditional superresolution techniques employ optimizers, priors, and regularizers to deliver stable, appealing restorations, even at the cost of deviating from the real, ground-truth scene. We have developed a non-regularized superresolution algorithm that directly solves a fully characterized multi-shift imaging reconstruction problem to achieve realistic restorations without being penalized by improper assumptions made in the inverse problem. An adaptive frequency-based filtering scheme is introduced to upper-bound the reconstruction errors while still producing finer details than previous methods when inaccurate shift estimation, noise, and blurring scenarios are considered.

© 2015 Optical Society of America

1. Introduction

Many commercialized imaging systems do not capture the high-resolution (HR) features of scenes due to the additional monetary costs of high-quality optics and dense sensors, the longer acquisition time needed to collect sufficient photons at smaller imaging units, and the physical restrictions imposed by miniature imaging applications such as endoscopic systems and industrial fiberscopes, which limit the number of detectors on the focal plane array (FPA). Superresolution (SR) techniques [1–3] are a powerful way to bypass the diffraction limits mandated by the imaging optics and the sensing medium so that the HR details within the scene can be retrieved. Many SR techniques integrate multiple low-resolution (LR) images, each contributing additional information, through an inverse problem to output a relatively richer scene of greater resolving power with less aliasing.

The SR inverse problem, in general, is an ill-posed problem as the solution is not guaranteed to be existent, stable, or unique [4]. Therefore, SR techniques approximate the HR image by optimizing weights for non-uniform interpolation [5]; implementing Tikhonov regularization to overcome noise [6]; employing stochastic reconstruction methods such as maximum likelihood (ML) [7] or maximum a posteriori (MAP) estimation [8]; combining deterministic and stochastic regularizers such as sparse and non-sparse priors [9, 10] to constrain the restorations (i.e., the reconstructed images); iterating projections onto convex sets (POCS) of priors to ensure convergence while satisfying the assumed priors [11, 12]; adopting back-projection schemes [13] that iteratively minimize the differences between the measured LR images and the synthesized ones; or alternating the optimization using deconvolution techniques [14] to reverse the acquisition process. These optimization schemes may ensure a converging solution, but at the cost of deviating from the real ground-truth restoration, since they are inherently biased by the type of priors, the utilized regularizers, and the optimized objective functions. For instance, regularizing to enhance smoothness and reduce aliasing can make the reconstructions visually better but may not necessarily resolve the actual HR details critical for accurate image analyses (e.g., medical diagnostics).

We propose a fast SR technique that delivers an unconstrained, physically accurate HR solution without being biased by regularizing terms or optimization schemes. A forward observation model is first characterized using information such as an estimate of the blurring kernels (resulting from the imaging optics and the sensor’s finite size) and the relative shifts between the acquired LR images to find a full-rank matrix representation of the multi-shift imaging point spread function (PSF), and by characterizing the noise statistics. The HR reconstruction is then found by directly solving a set of sufficient linear equations to reconstruct a rich HR image. This technique produces a unique exact restoration under ideal shift estimates and in-focus noiseless measurements, making the inverse a well-posed problem. In realistic scenarios, inaccurate shift estimates and blurred, noisy measurements may in some cases result in more unknowns than equations, or in small perturbations of some variables that hurt the stability of the solution. For such scenarios, an adaptive frequency-based filtering scheme is introduced to upper-bound the reconstruction errors while still producing finer details than other regularized SR techniques.

2. Multi-shift imaging systems

In multi-shift imaging systems, the LR images are linearly related to the unknown HR object through blurring, sampling, shifting, and addition of noise. To assure diversity in the collected LR measurements, consecutive non-integer (subpixel or large) shifts at the acquisition stage can be introduced by shifting the sensor array itself or the camera as a whole. Shifting the sensor may be accomplished by small piezoelectric actuators while maintaining the optics and camera body fixed. Having actuators shifting only the sensor may be advantageous, for example, in endoscopic imaging since it enables automating accurate shifts within the limited physical space while maintaining the stability of the optics and reducing the translated load. Camera shift is a second possible case. For general imaging purposes, shifting the camera may be simpler as successive measurements are captured in motion (keeping in mind that multi-head systems are a special case of camera shift model). A distinction between sensor shift and camera shift, presented in Fig. 1, is the captured content itself; shifting the sensor resembles spatial resampling within the same angular bandwidth captured by the optics whereas shifting the camera represents spatial sampling of shifted angular bandwidth as new rays propagate through the system.

Fig. 1 Illustration of the shift impact on spatial sampling and angular bandwidth: sensor shift (left) and camera shift (right) as compared with the non-shifted case (center).

Consider a vector notation such that the 2D image is rasterized block-wise lexicographically into a 1D vector. For the sensor shift (SS) case, the imaged HR object f is first optically blurred by the kernel B, then undergoes K consecutive lateral non-integer-pixel shifts S_k, where k = 1, …, K, then is subsampled by the FPA operator D (which includes the sensor blur as well), and is finally combined with additive thermal noise n_k, resulting in a captured image. The acquisition model for the SS case can be described by g_k^SS = D(S_k(B ⊛ f)) + n_k, where g_k^SS is the kth image captured using sensor shift, n_k is the kth additive noise, and ⊛ is the convolution operator. For the camera shift (CS) case, the shift and blurring order is reversed, so the observation model becomes g_k^CS = D(B ⊛ S_k(f)) + n_k, where g_k^CS is the kth image captured using camera shift. Note that the S_k operator can be generalized to incorporate rotation as well, where the origin and the amount of such consecutive rotations should be chosen to ensure capturing new information over all image sensors. The global shift vectors and the blurring and subsampling kernels are employed to generate the PSF matrix H, so the multi-shift imaging system incorporating all K measurements simplifies to the linear system g = Hf + n.
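As an illustrative sketch (not the authors' implementation), the SS acquisition chain g_k^SS = D(S_k(B ⊛ f)) + n_k can be simulated with standard NumPy/SciPy operators, here using a Gaussian optical blur, bilinear subpixel shifting, and simple decimation as stand-ins for B, S_k, and D:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, shift as nd_shift

def capture_ss(f, dxdy, d, sigma_blur=0.5, sigma_noise=0.0, rng=None):
    """Sensor-shift acquisition sketch: g_k = D(S_k(B * f)) + n_k.

    f          : HR image (2D array)
    dxdy       : (dy, dx) lateral shift in HR pixels (assumed convention)
    d          : integer subsampling factor of the FPA
    sigma_blur : std of the Gaussian stand-in for the optical blur B
    """
    blurred = gaussian_filter(f, sigma_blur, mode="nearest")     # B * f
    shifted = nd_shift(blurred, dxdy, order=1, mode="nearest")   # S_k
    g = shifted[::d, ::d].copy()                                 # D (decimation)
    if sigma_noise > 0:                                          # n_k
        rng = rng or np.random.default_rng(0)
        g = g + rng.normal(0.0, sigma_noise, g.shape)
    return g

f = np.arange(64, dtype=float).reshape(8, 8)
g = capture_ss(f, (0.5, 0.5), d=2)       # one 4x4 LR measurement
```

The CS model would simply swap the blur and shift steps in the same sketch.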

2.1 Parameters estimate from the LR images

The relative shifts between the captured images can be estimated using spatial- or frequency-based registration techniques [15–17] with subpixel accuracy of about 1/10 to 1/100 pixel. Yet this accuracy is not adequate for solving the linear system, as a small variation in any of the parameters can result in a large deviation from the correct solution; therefore, a more accurate approach, which may involve running an iterative simulated annealing search [18], may be used to find the best shift combinations in the XY directions for all the LR images.
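For the initial coarse registration, a frequency-based estimate can be obtained from the peak of the normalized cross-power spectrum; the minimal sketch below recovers integer-pixel shifts (subpixel refinement around the peak, as in [15–17], would follow in practice):

```python
import numpy as np

def phase_correlation(ref, moved):
    """Estimate the (row, col) translation of `moved` relative to `ref`
    from the peak of the normalized cross-power spectrum.
    Integer-pixel accuracy only; this is an illustrative sketch."""
    F_ref, F_mov = np.fft.fft2(ref), np.fft.fft2(moved)
    R = F_mov * np.conj(F_ref)
    R /= np.abs(R) + 1e-12                      # whiten the spectrum
    corr = np.real(np.fft.ifft2(R))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # wrap shifts larger than half the image size into negative values
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, ref.shape))

rng = np.random.default_rng(1)
ref = rng.random((32, 32))
moved = np.roll(ref, (3, -2), axis=(0, 1))      # known translation
est = phase_correlation(ref, moved)
```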

The blurring kernel is assumed to be a spatially invariant operator (ignoring aberrations) representing the optical blur due to the finite aperture of the imaging optics. The blurring profile (shape and strength) may be estimated through blind deconvolution techniques [14, 19].

The additive noise is assumed to follow the normal distribution (additive white Gaussian noise) where the noise statistics (mean and variance) can be estimated using the pixel-wise adaptive Wiener method [20].

3. Direct superresolution algorithm

Assuming N pixels at the HR level and a dimensional subsampling factor d, the minimum number of LR images (each having N/d² pixels) required to make the linear imaging system a well-posed SR problem is K = d². Figure 2 illustrates the direct SR algorithm with a simple case: a 4×4 object f_j (j indexing the HR pixels) divided into 2×2 blocks, subsampling factor d = 2, four captured LR images g_i^k with noise n_i^k (k indexing the LR image and i indexing its pixels), and equal non-integer-pixel shifts in a noisy environment, yielding a full-rank system H. Each column in H is found by setting the associated object pixel to one and the rest of the pixels to zero and recording the response through the observation model on the FPA. The weighted linear combinations of HR pixels, as shown in Fig. 2, form the measured LR images; solving this set of unique linear equations grants the exact restoration of the HR object under ideal imaging conditions.
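The column-by-column construction of H described above can be sketched by probing each HR pixel with a unit impulse through a linear forward model; the Gaussian blur, bilinear shift, and decimation used here are illustrative stand-ins for the paper's operators:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, shift as nd_shift

def forward(f, dxdy, d, sigma_blur=0.5):
    """Illustrative sensor-shift forward model (noiseless): D(S_k(B * f))."""
    blurred = gaussian_filter(f, sigma_blur, mode="nearest")
    return nd_shift(blurred, dxdy, order=1, mode="nearest")[::d, ::d]

def build_H(n, d, shifts, sigma_blur=0.5):
    """Stack the FPA response of each HR unit impulse as a column of H."""
    K, m = len(shifts), (n // d) ** 2              # K images, m LR pixels each
    H = np.zeros((K * m, n * n))
    for j in range(n * n):
        e = np.zeros(n * n)
        e[j] = 1.0                                 # probe the j-th HR pixel
        H[:, j] = np.concatenate(
            [forward(e.reshape(n, n), s, d, sigma_blur).ravel() for s in shifts])
    return H

shifts = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.0), (0.5, 0.5)]
H = build_H(4, 2, shifts)        # 4x4 object, d=2, K=4 -> 16x16 system
```

Because the forward model is linear, H @ f.ravel() reproduces the stacked measurements exactly for any object f.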

Fig. 2 Simple imaging scheme illustrating the linear combinations of HR pixels including the boundaries forming the LR images (top) and the resulting vector-matrix representation that direct SR solves (bottom) in a noisy environment.

However, there is an inherent dependency between adjacent LR pixels (including the boundary conditions involved in the subsampling process, shown as f_j' in Fig. 2), which forces us to solve the whole system globally (not individually at the block level). This may result in a giant H matrix for large images; nevertheless, imposing block-wise lexicographical ordering in both the object and the measurement domains yields a diagonal-like sparse matrix, shown in Fig. 3, that can be directly and efficiently solved using Gaussian elimination [21] (without the need to compute its inverse). In the event that the number of measurements is not sufficient, i.e., K < d², the PSF matrix becomes rank deficient and the linear least-squares solution [22] is computed using the Moore-Penrose pseudoinverse or linear minimum mean square error techniques. For the boundary pixels, the assumption made is that the HR objects vary smoothly at the boundaries, which allows us to approximate their values by the closest HR pixels, f_j' ≈ f_j, so no additional unknowns are introduced; the related boundary weights h_(i−1)K+k,j^{ul,ur,bl,br} can then be compacted into h_(i−1)K+k,j^* (with superscript *) in the H matrix; see the boundary weights of block 4 in Fig. 2 for clarification.
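A minimal sketch of the solution step, using a toy well-conditioned system in place of the paper's block-sparse H: the full-rank case is solved directly with a sparse solver (no explicit inverse), and the rank-deficient case K < d² falls back to the Moore-Penrose pseudoinverse:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import spsolve

# Toy well-conditioned system standing in for the block-sparse H of the paper.
rng = np.random.default_rng(0)
f_true = rng.random(16)
H = np.eye(16) + 0.1 * rng.random((16, 16))
g = H @ f_true

# Full-rank case: direct sparse solve, never forming H^-1 explicitly.
f_hat = spsolve(csr_matrix(H), g)

# Rank-deficient case (K < d^2): minimum-norm least-squares solution
# via the Moore-Penrose pseudoinverse.
H_def = H[:12, :]                                # fewer equations than unknowns
f_ls = np.linalg.pinv(H_def) @ (H_def @ f_true)
```

For the consistent underdetermined system, f_ls reproduces the measurements exactly even though it need not equal f_true.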

Fig. 3 Structure of the H matrix (right) showing the sparse diagonal-like distribution (locations of nonzero elements) and the color-coded repetitive blocks for a 16×16 object (left) using 4×4 blocks and 4 LR images. The colored blocks below the horizontal axis of the H matrix are highlighting a group of columns that can be copied (with circular shifts) from any other block’s columns belonging to the same colored group.

An additional advantage of adopting the block approach is the repetitive pattern found in the H matrix for similar blocks (inner blocks, boundary blocks, corner blocks), where repetitive blocks are color coded in Fig. 3. For instance, when generating H, the response of the object’s points is first computed for a single inner block (the columns of H associated with that block) and then copied to all other columns of H associated with the other inner blocks while maintaining the proper shifts representing the spatial location of these blocks within the HR image. This means that only the impulse response of a single block within each of the nine color-encoded groups, displayed in Fig. 3 (left), needs to be computed and then copied (with the relative circular shifts) to the other blocks belonging to the same group. The size of the blocks is also set adaptively according to the subsampling rate and the size of the HR object so that the computational loads of finding and copying the blocks’ responses are balanced. This makes generating the sparse H matrix computationally efficient (highly parallelizable), highly scalable (for large images and subsampling factors), and memory efficient. For instance, superresolving to an HR image of size 512×512 using 2× and 4× subsampling factors takes 3.25 min (T_PSF = 1.54 min, T_Recon = 1.43 min) and 20.8 min (T_PSF = 4.43 min, T_Recon = 15.83 min), respectively, on a computer with 12 cores, each with a 2 GHz clock rate, and 16 GB of RAM.

Finally, it should be noted that the mechanism of simulating the H matrix may require accurate knowledge of the shifting vectors (it can work with random non-integer large shifts, too), fast acquisition of the LR images (so that global shifts are sufficient, as the captured scene can be assumed static), and smooth variations at the boundaries (since we consider replications when finding the responses at boundaries). Also, to deliver a stable solution, the direct SR technique favors acquisition of the LR images with good focus and low noise. The steps and a block diagram summarizing all the operations in the direct SR technique are presented in Fig. 4.

Fig. 4 The steps (top) and a block diagram (bottom) summarizing the direct SR technique.

4. Adaptive frequency-based filtering scheme

A closer look at the impact of non-idealities on the linear imaging equations, given in Fig. 2, reveals challenges that may prevent any robust solution: 1) misestimates of the shifting vectors or the blurring kernel result in global deviations in the H weights; 2) out-of-focus or noisy measurements may prevent the uniqueness or existence of the reconstruction; and 3) nonconformity with the replication assumption at the boundaries may affect the starred weights h^*, altering some equations and hence affecting the whole system.

To enable the direct SR technique to function in such realistic scenarios while maintaining its merit by not incorporating priors and optimizers directly, an adaptive frequency-based filtering scheme (AFFS) has been developed to adaptively combine the direct SR reconstruction with reference SR reconstructions in a frequency-by-frequency manner based on trained masks to output an improved HR reconstruction having bounded errors. At least one reference SR technique (indexed r=1,,R) is chosen to upper bound the reconstruction errors due to non-idealities.

At the training stage, shown at the top of Fig. 5, a set of HR images is used to generate noisy, blurred LR images through the observation model at predefined shifting vectors, blurring kernel, and subsampling factor. This can be repeated to generate multiple realizations of LR images. The LR images are then fed into multiple SR techniques to generate the corresponding HR reconstructed images, and the discrete Fourier transform (DFT) of each reconstructed image is calculated and ordered so that the DC frequency component is at the center of the DFT array. Next, masking information is generated to associate each frequency with a preferred SR technique, either the direct SR technique or a reference SR technique. An outward spiral scanning path can be chosen to step through the frequency locations in the mask, starting with the zero frequency as the current frequency component. The calculation of the mask starts by taking only the currently selected frequency components of the reference reconstructed image DFT and the direct reconstructed image DFT while setting the other frequency components to zero, computing the inverse discrete Fourier transform (IDFT), and then calculating the root mean square error (RMSE) with respect to the original HR image in the spatial domain. The current frequency location in the mask is then associated with the SR technique whose frequency component results in the lowest RMSE, and a new search for the best next frequency component (along with the already selected components) is conducted until all frequency locations have been associated with a preferred SR technique. The outward spiral pattern can be beneficial since, for natural scenes, lower frequencies usually carry more energy than higher frequencies. The masks resulting from each realization in the training stage represent binary decision maps for the given HR image, subsampling factor, translating information, blurring kernel, noise statistics, and noise realization.
By averaging masks over many different realizations, the training stage outputs multiple filtering masks (wRefr for each reference SR technique and wDirect for the direct SR technique) scaled to have real values (optimized weights) from zero to one.
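Because the DFT is unitary, Parseval's theorem lets the per-frequency RMSE comparison be evaluated directly on the DFT coefficients rather than through repeated IDFTs; under that assumption, a single-realization binary mask for two competing techniques can be sketched as:

```python
import numpy as np

def train_mask(hr, recon_a, recon_b):
    """Per-frequency binary decision mask: 1 where technique A's DFT
    coefficient is closer to the ground truth than B's.  By Parseval this
    matches a spatial-RMSE criterion evaluated frequency by frequency
    (an assumed shortcut for the paper's spiral-scan search)."""
    F_hr = np.fft.fftshift(np.fft.fft2(hr))      # DC at the array center
    F_a = np.fft.fftshift(np.fft.fft2(recon_a))
    F_b = np.fft.fftshift(np.fft.fft2(recon_b))
    return (np.abs(F_a - F_hr) <= np.abs(F_b - F_hr)).astype(float)

# Averaging such masks over many realizations yields the real-valued
# weights w in [0, 1] used at the testing stage.
rng = np.random.default_rng(0)
hr = rng.random((16, 16))
good = hr + 0.01 * rng.standard_normal((16, 16))   # stand-in "direct SR" output
bad = hr + 0.50 * rng.standard_normal((16, 16))    # stand-in reference output
w = train_mask(hr, good, bad)                      # mostly 1: A wins most bins
```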

Fig. 5 Block diagram illustrating the adaptive frequency-based filtering scheme in the training stage (top) and the testing stage (bottom).

Similarly, at the testing stage, shown at the bottom of Fig. 5, the captured LR images are used to compute reconstructed images using both the reference SR techniques and the direct SR technique. The DFT of each reconstructed image is computed, yielding F_Ref_r and F_Direct. The filtering masks w_Ref_r and w_Direct (averaged over all training realizations) are then used to weight the frequency components, i.e., perform filtering, to obtain the DFT of the final reconstructed image: F_AFFS = Σ_r w_Ref_r ⊙ F_Ref_r + w_Direct ⊙ F_Direct, where ⊙ denotes element-wise multiplication. The IDFT of F_AFFS delivers the final reconstructed image.
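The testing-stage combination F_AFFS = Σ_r w_Ref_r ⊙ F_Ref_r + w_Direct ⊙ F_Direct can be sketched as an element-wise weighted sum in the frequency domain (assuming, as in the training stage, masks that sum to one at every frequency):

```python
import numpy as np

def affs_combine(recons, masks):
    """Weight each reconstruction's (centered) DFT by its trained mask and
    invert: F_AFFS = sum_r w_r * F_r.  Masks are assumed to sum to 1."""
    F = sum(np.fft.fftshift(np.fft.fft2(r)) * w for r, w in zip(recons, masks))
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))

rng = np.random.default_rng(0)
img = rng.random((8, 8))
w = rng.random((8, 8))
# Sanity check: identical inputs with masks summing to 1 return the input.
out = affs_combine([img, img], [w, 1.0 - w])
```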

Example training images to be used in this report are presented in Fig. 6. For validation purposes, any test image may be selected from this set, and the training procedure will then use masks averaged over the other images (not including the test image) with noise realizations for a given noise strength (NS), blurring kernel of standard deviation σ, and subsampling factor d.

Fig. 6 Training images used in adaptive frequency-based filtering scheme.

Note that the AFFS will always select the best frequencies, in the RMSE sense, restored by any of the implemented SR techniques, as long as the same techniques used at the testing stage are utilized in the training stage (e.g., the direct SR technique may be replaced by any other desired SR technique).

5. Simulation results

A simulation study has been conducted to evaluate the performance of the direct SR technique described herein and the AFFS, visually and numerically, in idealistic and realistic scenarios. Unless noted otherwise in the evaluated scenarios, the observation model assumes a sensor shift with a 2× subsampling factor, equal known non-integer-pixel shifts along the XY directions, in-focus capture (B has a Gaussian profile with standard deviation σ_blur = 0.5 pixel), replicated boundary conditions, and a noiseless environment. As case studies, we vary the imaging parameters in g_k = D(S_k(B ⊛ f)) + n_k one at a time and study their impact on the restored images: we either vary the boundary conditions in f, modify the blurring variance in B, introduce shift misestimates in S_k, alter the order of B and S_k, change the subsampling in D, or apply different noise strengths in n_k.

5.1 Evaluating various SR techniques under ideal imaging conditions

Four different SR techniques are implemented to visualize the impact of biasing priors and optimization schemes on the reconstructions at 2× and 4× subsampling factors, shown in Fig. 7. The multi-image SR techniques utilized here are non-uniform bicubic interpolation [5], SR with mixed priors [9] (a sparse ℓ1-norm of the horizontal and vertical first-order differences and a non-sparse simultaneous autoregressive prior with λ = 0.5), multi-channel blind deconvolution (MCBD) [14], and the direct SR technique. To quantify the pixel-level reconstruction errors between different SR techniques for various objects, the relative numerical error is calculated as RMSE = sqrt((1/N) Σ_{j=1}^{N} (f_j − f̂_j)²) / L, where f_j and f̂_j are the jth pixels of the HR image and the reconstructed image, respectively, N is the number of pixels, and L is the maximum allowed intensity (e.g., 255 for an 8-bit image); see Fig. 7.
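The relative error metric can be written compactly as follows (a direct transcription of the formula above, with L = 255 assumed for 8-bit images):

```python
import numpy as np

def relative_rmse(f, f_hat, L=255.0):
    """RMSE = sqrt((1/N) * sum_j (f_j - f_hat_j)^2) / L, as a fraction of
    the maximum allowed intensity (multiply by 100 for RMSE%)."""
    f = np.asarray(f, dtype=float)
    f_hat = np.asarray(f_hat, dtype=float)
    return np.sqrt(np.mean((f - f_hat) ** 2)) / L

f = np.full((4, 4), 100.0)
err = relative_rmse(f, f + 25.5)   # uniform error of 25.5/255 gray levels
```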

Fig. 7 Visual comparison of various SR techniques at 2× and 4× subsampling factors using the sensor shift model, known non-integer-pixel shifts, σ_blur = 0.5 pixel, and a noiseless environment for the U.S. Air Force (USAF) target and the cameraman image. The first row shows a sample of the LR images and the HR images. Rows 2–5 display, from left to right: bicubic interpolation, SR with mixed priors, MCBD, and direct SR. The associated RMSE% is shown on their left.

The visual quality of the restorations (especially at the resolved HR features of the USAF target, the camera body and its aligning stick) and the numerical errors of the implemented SR techniques in idealistic scenarios confirm that the direct SR technique described herein resolves the true features with fewer artifacts than the other SR techniques.

5.2 Studying the adaptive frequency-based filtering scheme at noisy environments

Two reference SR techniques, bicubic and MCBD, are implemented alongside the direct SR technique in the training stage to increase the degrees of freedom in selecting the best frequencies and the weighting masks that minimize the reconstruction errors. The resulting masks are averaged over 11 training images (the test image is excluded) and 20 noise realizations drawn from the normal distribution N(0, σ_noise²) at the given noise strength NS, where σ_noise = NS · L and L is the maximum allowed intensity. The sharpness circle is used as the test image, and Fig. 8 shows the reconstruction RMSE vs. NS, averaged over 20 noise realizations, for the direct SR technique, bicubic, MCBD, and the AFFS.
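Generating the noise realizations is straightforward; a minimal sketch with σ_noise = NS · L, where NS is given as a fraction:

```python
import numpy as np

def add_noise(g, NS, L=255.0, rng=None):
    """Additive white Gaussian noise with sigma_noise = NS * L
    (e.g., NS = 0.005 corresponds to the paper's NS = 0.5%)."""
    rng = rng or np.random.default_rng(0)
    return g + rng.normal(0.0, NS * L, g.shape)

g = np.zeros((64, 64))
noisy = add_noise(g, 0.01)   # sigma_noise = 2.55 gray levels for an 8-bit image
```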

Fig. 8 Reconstruction error vs. noise of various SR techniques and the AFFS results for the sharpness circle.

The RMSE curves, shown in Fig. 8, demonstrate that while the bicubic and MCBD SR techniques experience insignificant changes, the reconstructed images using direct SR suffer as NS increases. Yet, the AFFS succeeded in picking the best frequencies to suppress any artifacts while maintaining the fine structures leading to the lowest RMSE values; see also Fig. 9. Note that in the noiseless scenario (NS = 0), due to the spiral path utilized in the AFFS, the filtering masks are not picking all frequencies from the direct SR technique, leading to slightly larger errors (black curve) than the direct SR results (red curve).

Fig. 9 Visual comparison of the reconstructed sharpness circle (zoomed) at NS = 0.1% and 0.5% for various SR techniques, averaged over 20 noise realizations. A sample LR image and the ground-truth HR image are shown on the left of the first row. The three trained multi-gray masks (w_Direct, w_Bicubic, and w_MCBD) are color encoded as red for direct, green for bicubic, and blue for MCBD in the composite RGB masks displayed on the right of the first row.

A visual assessment of the reconstruction results, presented in Fig. 9, shows that despite the noisy restorations, there are still additional fine details resolved by the direct SR technique as compared with the bicubic and MCBD SR techniques. The pixelated artifacts reside in the high-frequency regions while washed-out areas exist in the low-frequency regions of the direct SR reconstructed image; thus, the trained filtering masks succeeded in assigning the low- and high-frequency locations to the reference SR techniques instead, leading to the donut shape in the direct mask w_Direct encoded in the red channel of the displayed masks. As the noise increases, more frequencies are selected from, or given greater weight toward, the reference SR techniques. Note that the noise has no direct impact on generating the point spread function matrix H; rather, it distorts the captured images g_k and drives the solution, i.e., the reconstructed image, away from the optimal answer.

5.3 Assessing the impact of blurring when imaging through different observation models

Simulating out-of-focus imaging can be done by widening the blurring kernel (i.e., increasing σ_blur), which in turn affects H globally besides blurring the LR measurements. As a test case, the boat image (bottom-left image in Fig. 6) was imaged through two different observation models, sensor shift (SS) and camera shift (CS), at various σ_blur values (in LR pixel units).

The general trend in all RMSE curves, plotted in Fig. 10, is an increase in RMSE as more blurring is introduced. Comparing the RMSE curves of the two observation models, we see an insignificant impact on the results of the bicubic and MCBD SR techniques, as they do not rely directly on H when computing the reconstructed image. However, with the direct SR or the AFFS, the CS model outperforms the SS model since in SS the blurring precedes, and hence impacts, the shifting, introducing an additional source of error.

Fig. 10 Reconstruction errors vs. σ_blur when imaging the boat image through two observation models, sensor shift (SS) and camera shift (CS), using various SR techniques.

The blurring through SS results in dappled artifacts in the direct SR reconstruction, yet many fine details survive and are later chosen by the filtering masks (trained for different σ_blur values and observation models and averaged over the other training images), leading to much richer content (e.g., see the outrigger poles of the boats) than both the bicubic and MCBD SR techniques (see the center row in Fig. 11). The same amount of blurring in the CS model has little detrimental effect; thus, the masks favored the direct SR output, leading to a sharp reconstruction.

Fig. 11 Visual comparison of the reconstructed boat at σ_blur = 1 pixel for various SR techniques using sensor shift (middle row) and camera shift (bottom row). A sample LR image and the original HR image are shown on the left of the first row. The three trained averaged masks (w_Direct, w_Bicubic, and w_MCBD) are color encoded as red for direct, green for bicubic, and blue for MCBD in the composite RGB masks displayed on the right of the first row.

5.4 Validating the replicating boundary assumption made in the direct SR technique

The subsampling operator D inherently makes the boundary values part of the generation of H through the starred PSF values h^*; see Fig. 2. A replication boundary condition is assumed in order to avoid having more unknowns than equations, preserving the full-rank property of H. To verify the impact of this assumption, we draw the boundary values from normal distributions N(nearest image values, σ_bound²) while varying the boundary tolerance BTol, related to the standard deviation by σ_bound = BTol · L, where L is the maximum allowed intensity. Again, the filtering masks used here are trained for each BTol and averaged over 20 boundary realizations.

The RMSE curves and visual reconstructions in Figs. 12 and 13 show that the direct SR technique is still capable of resolving fine details even with a large BTol, which validates the boundary assumption made earlier. The additional filtering step, AFFS, indeed assists in eliminating many of the boundary artifacts. Note that both the bicubic and MCBD SR techniques are totally insensitive to the boundary conditions, and their HR reconstructions are either too smooth (in the bicubic case) or have artificial ripples (in the MCBD case; see the edges of the objects in the peppers image).

Fig. 12 Reconstruction errors vs. BTol when reconstructing the peppers image with various SR techniques.

Fig. 13 Visual comparison of the reconstructed peppers image (zoomed) at BTol = 10% (middle row) and 20% (bottom row) for various SR techniques. A sample LR image and the original HR image are shown on the left of the first row. The three trained multi-gray masks (w_Direct, w_Bicubic, and w_MCBD) are color encoded as red for direct, green for bicubic, and blue for MCBD in the composite RGB masks displayed on the right of the first row.

5.5 Evaluating the impact of the shift misestimates on the direct SR restorations

The direct SR technique requires accurate subpixel shift estimation for all captured LR images, as any tiny error in the estimated shift vectors affects the generation of H globally. For the 2× subsampling case, there are 6 random variables (in the XY directions) to be estimated with respect to one of the captured LR images selected as a reference. To test how accurate the shift estimates need to be, we draw the estimated shifts from normal distributions N(true shifts, σ_shift²) while varying the shift tolerance STol, related to the standard deviation by σ_shift = STol · DetectorPitch. Similarly, the filtering masks are trained for each STol and averaged over 20 shift-misestimate realizations.

The RMSE curves and visual reconstructions in Figs. 14 and 15 show that the direct SR technique is highly sensitive to STol, even at values as low as 0.1%. Yet, some fine details still survive (see the blood and black edges in the reconstructed endoscopy images) and are successfully picked up by the additional AFFS (see the red channels in the filtering masks) without propagating any of the shift artifacts.

Fig. 14 Reconstruction errors vs. STol when restoring endoscopy image with various SR techniques.

Fig. 15 Visual comparison of the reconstructed endoscopy image (zoomed) at STol = 0.1% (middle row) and 1% (bottom row) for various SR techniques. A sample LR image and the original HR image are shown on the left of the first row. The three trained multi-gray masks (w_Direct, w_Bicubic, and w_MCBD) are color encoded as red for direct, green for bicubic, and blue for MCBD in the composite RGB masks displayed on the right of the first row.

6. Conclusion and future work

We developed a non-regularized direct superresolution (SR) technique that uniquely solves a set of linear equations representing the multi-shift image reconstruction problem with sufficient measurements to deliver realistic reconstructions without any inherent bias imposed by priors, regularizers, or optimization schemes. An adaptive frequency-based filtering scheme (AFFS) is introduced to gain robustness against out-of-focus capture, shift misestimates, boundary variations, and noisy scenarios while maintaining the merit of direct SR to achieve optimal restorations with minimal artifacts.

As future work, we will continue to address physical imaging issues by considering a combination of all imperfections, including additional challenges such as motion blur, chromatic and off-axis aberrations, and dynamic scenes, in the observation models. The point spread function will be calibrated experimentally to accurately super-resolve various test objects. We will also investigate scanning paths other than the current outward spiral and training with additional reference SR techniques in the AFFS to keep improving the restorations under various realistic imaging conditions.

References

1. Z. Zalevsky and D. Mendlovic, Optical Superresolution (Springer, 2003).

2. T. Lukeš, Super-Resolution Methods for Digital Image and Video Processing (Czech Technical University, 2013).

3. Z. Zalevsky, Super-Resolved Imaging: Geometrical and Diffraction Approaches (Springer, 2011).

4. J. Hadamard, Lectures on Cauchy’s Problem in Linear Partial Differential Equations (Dover, 1923).

5. A. Gilman and D. G. Bailey, “Near optimal non-uniform interpolation for image super-resolution from multiple images,” in Image and Vision Computing, (2006), pp. 31–35.

6. A. N. Tikhonov and V. I. Arsenin, Solutions of Ill-Posed Problems (Winston, 1977).

7. B. C. Tom and A. K. Katsaggelos, “Reconstruction of a high resolution image from multiple degraded mis-registered low resolution images,” in Proceedings of IEEE Intl. Conf. on Image Processing (IEEE, 1994), pp. 553–557. [CrossRef]  

8. R. C. Hardie, K. J. Barnard, and E. E. Armstrong, “Joint MAP registration and high-resolution image estimation using a sequence of undersampled images,” IEEE Trans. Image Process. 6(12), 1621–1633 (1997). [CrossRef]   [PubMed]  

9. S. Villena, M. Vega, D. Babacan, R. Molina, and A. Katsaggelos, “Bayesian combination of sparse and non-sparse priors in image super-resolution,” Digit. Signal Process. 23(2), 530–541 (2013). [CrossRef]  

10. P. Milanfar, Super-Resolution Imaging (CRC Press, 2010).

11. H. Stark and P. Oskoui, “High-resolution image recovery from image-plane arrays, using convex projections,” J. Opt. Soc. Am. A 6(11), 1715–1726 (1989). [CrossRef]   [PubMed]  

12. F. W. Wheeler, R. T. Hoctor, and E. B. Barrett, “Super-resolution image synthesis using projections onto convex sets in the frequency domain,” Proc. SPIE 5674, 479–490 (2005). [CrossRef]  

13. A. Zomet, A. Rav-Acha, and S. Peleg, “Robust super-resolution,” in Proceedings of IEEE Intl. Conf. on Computer Vision and Pattern Recognition (IEEE, 2001), pp. 645–650.

14. F. Šroubek and P. Milanfar, “Robust multichannel blind deconvolution via fast alternating minimization,” IEEE Trans. Image Process. 21(4), 1687–1700 (2012). [CrossRef]   [PubMed]  

15. P. Vandewalle, S. Süsstrunk, and M. Vetterli, “A frequency domain approach to registration of aliased images with application to super-resolution,” EURASIP J. Appl. Signal Process. 2006, 71459 (2006). [CrossRef]  

16. D. Keren, S. Peleg, and R. Brada, “Image sequence enhancement using sub-pixel displacement,” in Proceedings of IEEE Intl. Conf. on Computer Vision and Pattern Recognition (IEEE, 1988), pp. 742–746.

17. J. N. Sarvaiya, S. Patnaik, and S. Bombaywala, “Image registration by template matching using normalized cross-correlation,” in Proceedings of IEEE Intl. Conf. on Advances in Computing, Control, and Telecommunication Technologies (IEEE, 2009), pp. 819–822. [CrossRef]  

18. M. Johansson, Image Registration with Simulated Annealing and Genetic Algorithms (Royal Institute of Technology in Sweden, 2006).

19. E. Y. Lam and J. W. Goodman, “Iterative statistical approach to blind image deconvolution,” J. Opt. Soc. Am. A 17(7), 1177–1184 (2000). [CrossRef]   [PubMed]  

20. J. S. Lim, Two-Dimensional Signal and Image Processing (Prentice Hall, 1990).

21. A. K. Kaw, E. E. Kalu, and D. Nguyen, Numerical Methods with Applications (Univ. of South Florida, 2010), Chap. 4.

22. R. Farebrother, Linear Least Squares Computations (CRC Press, 1988).




