## Abstract

The theory of compressed sensing (CS) shows that signals can be acquired at sub-Nyquist rates if they are sufficiently sparse or compressible. Since many images exhibit this property, several acquisition models have been proposed for optical CS. An interesting approach is random convolution (RC). In contrast with single-pixel CS approaches, RC allows for the parallel capture of visual information on a sensor array as in conventional imaging approaches. Unfortunately, the RC strategy is difficult to implement *as is* in practical settings due to important contrast-to-noise-ratio (CNR) limitations. In this paper, we introduce a modified RC model circumventing such difficulties by considering measurement matrices involving sparse non-negative entries. We then implement this model based on a slightly modified microscopy setup using incoherent light. Our experiments demonstrate the suitability of this approach for dealing with distinct CS scenarios, including 1-bit CS.

© 2016 Optical Society of America

## 1. Introduction

Since the 1970s, the redundant nature of real-world signals has been widely exploited in distinct applications such as seismology, computed tomography, magnetic resonance imaging, and radar imaging. More recently, the theory of compressed sensing (CS) has explicitly shown that, when a signal **x** ∈ ℝ^{N} is sufficiently *sparse*, it can be reconstructed from *M* ≪ *N* measurements of the form **Φx**, where **Φ** is a suitable measurement matrix [1]. By sparsity, we understand that, when expressed in some appropriate basis **Ψ**, the signal **x** consists of few non-zero entries. This is mathematically expressed as ‖**Ψx**‖_{0} ≪ *N*, where ‖ · ‖_{0} is the *l*_{0} pseudo-norm. Such a property holds at least approximately for large classes of natural signals: sound and seismic waves are sparse in the Fourier domain, whereas natural images are sparse in the wavelet domain, for instance.
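As a toy illustration of this sparsity notion, the following Python sketch (our own, not part of the paper) builds a signal that is exactly *K*-sparse in an orthonormal DCT basis—standing in for **Ψ**—and counts its non-zero transform coefficients:

```python
import numpy as np
from scipy.fft import dct, idct

# Build a synthetic signal that is exactly K-sparse in an orthonormal
# DCT basis (our stand-in for the sparsifying basis Psi).
rng = np.random.default_rng(0)
N, K = 256, 8
coeffs = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
coeffs[support] = rng.standard_normal(K)
x = idct(coeffs, norm="ortho")          # signal in the sample domain

# ||Psi x||_0 counts the non-zero transform coefficients.
c = dct(x, norm="ortho")
sparsity = int(np.sum(np.abs(c) > 1e-8))
print(f"||Psi x||_0 = {sparsity} out of N = {N}")   # expect 8 out of 256
```

Real images are only approximately sparse, but the same principle applies: most transform coefficients are negligibly small.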

During the last decade, a formal development of the CS theory and of related algorithms took place. In particular, algorithmic tools exploiting the (conditional) equivalence between *l*_{0}- and *l*_{1}-norm minimization have been developed. This equivalence is the key to computational tractability because it allows one to recast CS reconstruction—which initially corresponds to an NP-hard *l*_{0}-pseudo-norm-minimization problem—as a convex *l*_{1}-norm-minimization problem for which several efficient resolution techniques are readily available in the literature.

## 2. Random convolution

Conventional imaging systems (*e.g.*, digital cameras) are designed in such a way that the acquired images map to the observed scene of interest as accurately as possible. Accordingly, the components of such a system are optimized to compensate for aberrations. Now, when stemming from real-world scenes or objects, the samples or pixels of an acquired image tend to exhibit high spatial redundancy, *i.e.*, several areas in the field of view (FOV) can be associated with similar colors or patterns. This qualitatively explains why compression algorithms based on sparsifying transforms—such as the discrete cosine transform in JPEG or the wavelet transform in JPEG2000—are consistently effective when dealing with digital photographs or similar imaging information.

The use of CS in optical imaging can allow to capture less image pixels by exploiting the sparsity of visual datasets. Specifically, in the CS framework, the measurements are acquired on-the-fly in compressed form, therefore not requiring further compression methods such as JPEG to be applied. An additional reconstruction step is then required for the acquisition to be visualized by an observer. Specific acquisition and reconstruction schemes for optical CS have been proposed in the literature and also implemented. Some of these implementations are able to capture scenes in compressed form based on sequential single-pixel acquisitions. This is the case of the one-pixel camera from Rice University [2] and of the Bell Lab’s lenseless camera [3]. By contrast, the MIT random lens camera [4] uses a sensor array as in conventional settings. It thus allows to perform optical CS in parallel mode. Similarly, the work in [5] proposes to capture multi-channel images with a single-channel sensor array by applying distinct transformations on every channel followed by a multiplexing step. A particularly promising parallel-acquisition approach for optical CS is called *random convolution* (RC) [6–9]. Unlike the aforementioned methods, RC can be implemented using a conventional optical device. The principle of RC is to modify the point-spread function (PSF) of the device using a coded aperture [10], thus realizing an optical convolution operation that is suitable for CS-type measurements [9]. The RC model provides similar recovery guarantees as the independent and identically distributed (i.i.d.) Gaussian measurement matrices that are used in classical CS models. Furthermore, unlike such Gaussian matrices, convolution matrices are associated with efficient O(*n* log*n*) reconstruction algorithms by exploiting the fast Fourier transform [9, 11].
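The computational advantage of convolution matrices can be checked numerically: multiplying by the circulant matrix built from a filter *h* is equivalent to a circular convolution computed in O(*n* log *n*) through the FFT. A minimal numpy sketch (ours, with arbitrary random data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 512
x = rng.standard_normal(n)   # vectorized signal
h = rng.standard_normal(n)   # random convolution filter

# O(n log n): circular convolution computed through the FFT.
y_fft = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

# O(n^2): the equivalent circulant measurement matrix Phi,
# whose column k is h circularly shifted by k samples.
Phi = np.column_stack([np.roll(h, k) for k in range(n)])
y_mat = Phi @ x

print("max deviation:", np.max(np.abs(y_fft - y_mat)))  # ~ 1e-13
```

The same identity underlies the fast reconstruction algorithms cited above, where the forward and adjoint operators are applied via FFTs instead of explicit matrices.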

Several imaging settings in photography and microscopy involve illumination with spatially incoherent light. This is associated with an intensity-determined PSF and, accordingly, with measurement matrices composed of non-negative-only components [12]. Besides the fact that their practical implementations are commonplace, a major advantage of incoherent-light acquisition models lies in their simplicity—relative to coherent-light settings, where the acquisition model is often more complex and prone to speckle—and in the tractability of the corresponding CS-reconstruction operations. An RC model adapted to the spatially incoherent setting has been investigated by Bourquard et al. [7], with promising results in several noiseless settings. Unfortunately, the realization of a practical RC implementation in an incoherent-light setting is challenging. In particular, the PSF non-negativity that is intrinsic to such a setting causes the image contrast to decrease proportionally to the PSF footprint, the latter being as large as the original image in the RC model as proposed in [7].

In this work, we thus devise a new CS acquisition model that is based on the RC concept. Besides constituting an extension of RC to spatially-incoherent illumination, this model is also suitable for a straightforward optical implementation. In Section 3, we present the proposed RC-based optical model. In Section 4, we describe a practical implementation of this model based on a modified microscopy setup. In Section 5, we present distinct classes of CS problems potentially to be addressed with our optical setup, including 1-bit CS. Experiments related to these problems are presented in Section 6, with a conclusion in Section 7.

## 3. Proposed optical model

The proposed optical system performs CS-type image acquisitions following a modified RC model. As described in more detail below, this model can be implemented in an existing standard optical device through the mere introduction of a specifically designed phase-shifting mask. Compared to classical imaging, one main advantage of our CS-based approach is its ability to increase the space-bandwidth product, *i.e.*, to reconstruct and visualize more object pixels than the number of pixels of the image sensor [13]. As demonstrated in Section 6, this implies that visual data can be sampled at sub-Nyquist rates and super-resolved.

The generic form of our optical model follows a diffraction-limited setting and employs monochromatic spatially incoherent light. The forward model proposed by Goodman [14] describes a convolutional relationship between a 2D object centered on and perpendicular to the optical axis and a corresponding image as

*y*(**u**) = (*x* ∗ *h*)(**u**), (1)

where *x* and *y* denote the unknown-object and measured-image intensities, *h* is the intensity PSF, ∗ denotes the continuous-domain convolution operator, and **u** denotes 2D physical coordinates parallel to the object and image planes. In this incoherent-imaging model, *h* corresponds to the inverse Fourier transform of the autocorrelation of the generalized pupil function [7, 14]. Specifically,

*h*(**u**) = 𝒦 |ℱ{*q*}((NA/*λ*_{0})**u**)|^{2}, (2)

where 𝒦 is an energy-normalization constant, NA is the numerical aperture of the system, *λ*_{0} is the vacuum illumination wavelength, **ξ** denotes 2D normalized coordinates, *q*(**ξ**) is the generalized pupil function, and ℱ is the Fourier transform defined as

ℱ{*q*}(**u**) = ∫_{ℝ²} *q*(**ξ**) exp(−j2π⟨**ξ**, **u**⟩) d**ξ**. (3)

The optical system producing the image from the object *x* can be described as a 4F-type system [7, 15, 16]. Following the direction *d* of light propagation, the input light intensities are transformed into the image *y* = (*x* ∗ *h*), the latter being captured by a sensor array. The actual intensity profile of the propagated image *y* depends on *q* according to the Fourier-transforming property of the lenses. The form of Eq. (2) implies that the pupil function *q* is located at the so-called *Fourier plane* of the optical system.

In practice, we are dealing with discrete measurements and operations, and thus consider equivalent sequences *x*[·], *y*[·], *h*[·] expressed in vectorized form as **x**, **y**, **h**. The entries of the measurement matrix **Φ** are determined by the values of *h*, the exact structure of **Φ** reflecting the discrete convolution operation.

#### 3.1. Sparse random convolution

The RC filters *h* associated with convolution-type measurement matrices **Φ** have been defined in the literature based on distinct properties, leading to correspondingly distinct definitions [11]. For instance, the work of Romberg [9] imposes discrete-Fourier-transform (DFT) coefficients *ĥ*[·] that have unit amplitude and random phase with imposed conjugate symmetry, *i.e.*, *h*[·] ∈ ℝ. Alternative definitions in incoherent-light settings include the pointwise-squared magnitude of a similar filter without conjugate symmetry [7], which is consistent with Eq. (2). The filters can also be defined in the spatial domain following some random i.i.d. distribution, which can be Gaussian-type, Bernoulli-type, or Rademacher-type as in [8], where *h*[·] ∈ {−1, 1}.

Our contribution is to propose a spatially sparse filter *h* for RC that not only complies with Eq. (2), but also allows for a straightforward implementation in an optical setup with spatially incoherent illumination. We design this filter as a 2D pseudo-random sequence containing few non-negative impulses in the spatial domain, resulting in a PSF that consists of randomly located impulse functions following a spatially uniform distribution [17]. Specifically, we define *h* as stemming from a Bernoulli-type distribution, where the values of *h* correspond—up to a multiplicative constant—to *h*[·] ∈ {0, 1}, and where the non-zero components of *h* appear with low probability *p* ≪ 1. Firstly, given its compliance with Eq. (2), this definition matches the physical intensity PSF that can be generated in an incoherent-light setting. Secondly, the spatial sparsity of *h* allows—despite its non-negativity—for the acquisition of visual information with a sufficiently high contrast-to-noise ratio (CNR). While non-negative random measurement matrices consisting only of 0s and 1s have been shown to be bad performers in terms of the restricted-isometry property (RIP) for CS [18], they can reach a performance comparable to Gaussian matrices when sparse [19, 20], which is the case in our definition. Experiments in Section 6 will further assess the practical suitability of our filter. Note that, while sparse-convolution-type models have been considered for multiplexed imaging [5, 17], the filters that we propose make it possible to perform CS-type acquisitions in the context of single-channel imaging. Moreover, our approach can be implemented in a conventional optical device, as demonstrated in Section 6.
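A filter of the Bernoulli type described above can be sketched in a few lines of Python; the grid shape, probability *p*, and energy normalization below are illustrative assumptions rather than the exact parameters of our setup:

```python
import numpy as np

def sparse_psf(shape=(128, 128), p=0.002, rng=None):
    """Sparse non-negative RC filter: i.i.d. Bernoulli(p) impulses in {0, 1},
    normalized here so that the PSF sums to 1 (the multiplicative constant
    is arbitrary in the model)."""
    rng = np.random.default_rng(rng)
    h = (rng.random(shape) < p).astype(float)
    if h.sum() > 0:
        h /= h.sum()                 # energy normalization
    return h

h = sparse_psf(rng=0)
print("non-zero impulses:", int(np.count_nonzero(h)))  # ~ p * 128 * 128, i.e. about 33
```

With *p* ≪ 1, only a handful of the 16384 PSF samples are active, which is what preserves the measurement contrast in the incoherent setting.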

The class of filters *h* specified above can be produced using a mask *m* placed in the Fourier plane of the optical system, which specifies the generalized pupil function *q* in Eq. (2) as

*q*(**ξ**) = circ(**ξ**) *m*(**ξ**), (4)

where circ(**ξ**) is a unit-radius circular function defining the system aperture, and where *m*(**ξ**) is the mask function. Given Eq. (2) and the properties of the Fourier transform, the extent of the PSF *h* that is produced is inversely proportional to the characteristic size of the corresponding mask profile *m*. The mask pitch size can thus be adjusted to obtain the desired PSF dimensions.

#### 3.2. Phase-only approximation

According to Eq. (2), and given the property *h*[·] ∈ {0, 1}, the mask profile *m* can be determined as a scaled inverse Fourier transform of *h*. In general, such a mask is expected to induce simultaneous changes in light amplitude and phase. Indeed, the Fourier transform in Eq. (2) implies that every impulse in *h* corresponds to a 2D complex-exponential profile in *m*. Manufacturing such a mask thus entails the creation of a surface profile with varying absorbances and refractive indices, associated with the desired amplitudes and phases, respectively. Complex-valued masks are therefore difficult to fabricate, besides having the drawback of lowering the light throughput due to their non-negligible absorptive component [21].

As an alternative solution, we thus propose to follow a simplified phase-only mask design that is amenable to implementation while still allowing to produce a filter similar to *h*. Qualitatively, our mask can be seen as a phase-only diffraction grating producing several diffracted modes of the input field. As in the approach described in [21], we choose our approximate mask *m̃* to be unit-amplitude, *i.e.* transparent, which maximizes the light throughput, and to only induce two possible phase shifts of 0 or *π* radians, which further simplifies the mask design. As specified below, this specific pair of phase-shift values allows for our approximation to be consistent with the zero-crossings of the non-approximated profile *m*. Accordingly, we define *m̃* as

*m̃*(**ξ**) = exp(j*φ*(**ξ**)), (5)

where the phase shifts *φ*(**ξ**) ∈ {0, *π*} are determined from the initial exact mask profile as

*φ*(**ξ**) = 0 if ℜ{*m*(**ξ**)} ≥ 0, and *φ*(**ξ**) = *π* otherwise. (6)

The filter *h̃* generated by *m̃* consists in a central symmetrization of the impulses in *h*. In Fig. 2, we illustrate this property based on a synthetic example where *h* is represented as a 128 × 128 sequence containing 20 non-zero impulses placed within a centered 64 × 64 window. One can observe that, despite the quantization noise associated with our approximation, the filter *h̃* produced by our approximate mask still consists of sparse impulses. Note that, while relative phase shifts departing from {0, *π*} cause unwanted 0^{th}-order-component artifacts and are thus suboptimal, we have found empirically that our simplified-mask model tolerates imprecisions of up to 0.1*π*. This is further illustrated in Fig. 3. Such phase non-idealities are typically due to manufacturing imprecisions or to changes in the illumination wavelength during acquisition.

The symmetrization property associated with our approximation is explained as follows. While the {0, *π*} set of phases captures all zero-crossings of the real part of the complex-exponential sum in *m*, it remains insensitive to all imaginary components. This amounts to discarding the latter, thus adding conjugate complex exponentials—each of which corresponds to the addition of a symmetric impulse—to the original mask profile, unless they are already present in *m*.
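A small 1D numpy experiment (our own illustration, not the paper's code) makes this symmetrization concrete: binarizing the phase of the exact mask by the sign of its real part yields a real-valued mask, so the intensity profile it generates is necessarily centrally symmetric:

```python
import numpy as np

N = 128
pos = np.array([5, 17, 38])            # impulse locations of the target PSF h
h = np.zeros(N)
h[pos] = 1.0

# Exact mask: each impulse maps to a complex exponential (inverse DFT of h).
m = np.fft.ifft(h)

# Phase-only binary approximation: phase 0 where Re(m) >= 0, pi elsewhere,
# i.e. a mask of unit amplitude taking values +1 / -1.
m_tilde = np.where(m.real >= 0, 1.0, -1.0)

# Intensity PSF produced by the approximate mask.
h_tilde = np.abs(np.fft.fft(m_tilde)) ** 2

# m_tilde is real-valued, so h_tilde is centrally symmetric: the impulses of h
# reappear together with mirrored counterparts, plus quantization noise.
sym = np.allclose(h_tilde[1:], h_tilde[1:][::-1])
print("centrally symmetric:", sym)
top = np.argsort(h_tilde[1:])[::-1][:6] + 1   # strongest non-DC components
print("strongest components at:", sorted(top))
```

The hard phase quantization acts like clipping a sum of cosines, so additional harmonic components appear alongside the intended impulses; this is the quantization noise visible in Fig. 2.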

Our PSF has to comply with two essential yet competing properties: (1) measurement pseudo-randomness [15] and (2) a sufficient measurement signal-to-noise ratio (SNR). On the one hand, as a necessary condition, measurement pseudo-randomness requires the PSF to be composed of enough pseudo-randomly scattered impulses spread over a large enough FOV, so that every single measurement combines information from several unknowns [20]. On the other hand, maximizing the measurement SNR is achieved by maximizing the amount of optical energy inside the FOV as well as the CNR. This requires the PSF impulses to be limited in number and confined within a restricted area. Ultimately, the practical criterion under which the suitability of our PSF can be assessed empirically is the reconstruction quality.

As demonstrated below, the reconstruction quality is maximized when a suitable compromise between the aforementioned requirements (1) and (2) is reached. In the sequel, we analyze the effect of the number of impulses as a free parameter towards reaching that compromise. In the particular example considered, both the object and PSF sizes are kept fixed, most of their energy being confined within a centered square whose side length is half the FOV size. This allows the acquired image to be contained within the FOV of the sensor array.

#### 3.3. 1D example

We consider a 1D case to visualize how an approximately sparse PSF can be constructed using only a binary phase mask in the Fourier domain. In Figs. 4(a)–4(d), we show the result of binarizing a very sparse PSF in 1D, namely, a beam splitter. The PSF encoded by the binary phase mask exhibits some noise but is still nearly sparse. This is because the square wave that we use as a phase grating, shown in red in Fig. 4(c), captures most of the low-frequency information of the original signal shown in Fig. 4(b). Figures 4(e)–4(h) show the same procedure, but for an 8-peaked PSF.

We now proceed with the 1D example in Figs. 4(e)–4(h) to provide an intuition of why the type of PSF we propose can be useful for compressed sensing.

In Fig. 5(a), we show a sparse signal that we would like to accurately capture. While this can be achieved directly by taking 80 regular samples across the signal range, capturing all signal features becomes infeasible when the number of samples is reduced to 40. Figure 5(b) illustrates two possible sampling results, each of which misses an important part of the signal.

Now, instead of directly sampling in this way, we can first apply a convolution with the sparse 8-spike PSF shown in Fig. 4(h), which yields Fig. 5(c). In this setting, it turns out that, based on either of the two sets of 40 convolved samples, one can reconstruct a fairly close approximation of the signal, as shown in Fig. 5(d).
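In the spirit of this 1D example, the following sketch (our construction, not the paper's reconstruction code) recovers a sparse signal from every-other-sample measurements of its convolution with a sparse 8-impulse PSF, using basis pursuit cast as a linear program; exact recovery is typical for such random draws but not guaranteed:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(7)
N, M, K = 80, 40, 4                     # signal length, measurements, sparsity

# Sparse 8-impulse PSF and the corresponding circulant convolution matrix.
psf = np.zeros(N)
psf[rng.choice(N, size=8, replace=False)] = 1.0
C = np.column_stack([np.roll(psf, k) for k in range(N)])

# Keep every other convolved sample: A is the M x N measurement matrix.
A = C[::2]

# Ground-truth K-sparse non-negative signal and its measurements.
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.uniform(1.0, 2.0, K)
y = A @ x

# Basis pursuit, min ||x||_1 s.t. Ax = y, as an LP with x = u - v, u, v >= 0.
c = np.ones(2 * N)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x[:N] - res.x[N:]
print("recovery error:", np.linalg.norm(x_hat - x))
```

Without the convolution step, 40 direct samples would simply miss some of the spikes; the sparse-PSF convolution spreads each unknown across several measurements, which is what makes *l*₁ recovery possible.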

#### 3.4. Analysis

Based on simulations, we start by analyzing how—in the context of our PSF model—the number of randomly scattered impulses affects the PSF suitability from the perspective of the final reconstruction quality. Qualitatively speaking, one extreme case is the use of a single impulse, which causes the image intensities to match the acquired-object profile. This case essentially makes our framework degenerate into the classical acquisition scenario, whose measurements cannot withstand the losses considered in CS scenarios. At the other extreme, the use of PSFs composed of too many impulses would cause the measurement SNR to decrease in our incoherent-light setting, due to a contrast-lowering effect similar to the one mentioned in Section 2.

Figure 6 illustrates the decrease in image contrast occurring in the incoherent-light setting with our PSFs. It also shows the same convolution results using a hypothetical, non-physical but contrast-preserving PSF whose first moment (*i.e.*, mean value) is set to zero. From a visual standpoint, the image obtained by convolution with 2 impulses lacks pseudo-randomness (Fig. 6(d)), while the one obtained by convolution with 1000 impulses lacks contrast (Fig. 6(f)). The latter result actually looks similar to a convolution with a Gaussian kernel. Visually, the result with 30 impulses (Fig. 6(e)) appears more appropriate from both the contrast and pseudo-randomness standpoints.
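The contrast-lowering effect can be quantified with a quick 1D simulation (ours; the sizes and impulse counts are illustrative): convolving a non-negative signal with a normalized *k*-impulse PSF averages *k* shifted copies, driving the result toward its mean and lowering the Michelson contrast as *k* grows:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4096
x = rng.uniform(0.0, 1.0, n)     # non-negative "object" intensities

def michelson(v):
    # Michelson contrast: (max - min) / (max + min).
    return (v.max() - v.min()) / (v.max() + v.min())

contrasts = []
for k in (1, 10, 100, 1000):
    h = np.zeros(n)
    h[rng.choice(n, size=k, replace=False)] = 1.0 / k   # k impulses, sum 1
    y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))
    contrasts.append(michelson(y))
    print(f"{k:5d} impulses -> Michelson contrast {contrasts[-1]:.3f}")
```

The fluctuations of the convolved signal shrink roughly as 1/√k, so the measured contrast collapses for large impulse counts, mirroring the visual trend of Fig. 6.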

Ultimately, for any particular CS problem, the optimal number of impulses can be determined objectively based on the reconstruction quality. Accordingly, we propose to study the behavior of the reconstruction quality as a function of the number of impulses in the framework of 1-bit CS. More specifically, we consider the optical CS framework proposed in [15], where pixels are acquired as compressed binary values. While that work tackles the noiseless case, we study the behavior of the reconstruction under distinct amounts of additive Gaussian noise on the acquisition side (the noise degrading the measurements and thus the reconstructed object profile). The noise parameter *σ* is defined as the ratio between the noise standard deviation and the maximum noiseless image intensity (the noise being applied before the finite-differentiation and binarization operations described in [15]). Results are shown in Fig. 7. In Fig. 7(e), we observe that the reconstruction quality reaches an optimum at a specific number of impulses. We also observe that this optimum decreases with the noise level, which is expected because the image contrast must increase when more noise is present, so as to maintain the same CNR. Furthermore, the same plot shows that, near optimal numbers of impulses, our approximated mask profiles are virtually as effective as the original complex-valued profiles. These mask approximations—whose goal is to simplify the manufacturing process—are thus satisfactory within our optical framework according to these simulations. Finally, we show that the extent of the region within which the PSF impulses are scattered does influence the reconstruction quality. In that regard, the results in Fig. 7(f) indicate that the optimal PSF size lies around half of the total FOV of the system. This result corresponds to the largest possible peak spread that still allows the acquired convolved image to be confined within the sensor's FOV.

Given the above simulation-based analysis, we chose 30 impulses for our real-world device implementation described below, since this number is near the optimum for the range of noise levels that we expected in the real images, with *σ* generally not exceeding 0.1%. Note that, while this optimum only represents an ideal solution (for some arguably acceptable noise-level range) in the framework of a particular acquisition and (algorithmic) reconstruction setting, we found experimentally that it also yielded satisfactory results with other CS approaches than that of [15]. In that sense, even though our analysis is specific, the trend as a function of the number of impulses may be a general one (which one may re-investigate using Cramér–Rao bounds or similar tools).

## 4. Implementation

#### 4.1. Materials

All phase masks were etched onto the same 5″ chrome wafer, shown in Fig. 8(a). The specific phase-shifting profiles were defined on a layer of Bayfol 102 HX photopolymer film (Bayer AG). The latter—which is typically used to store interference fringe patterns—converts the exposure to certain light-intensity levels into refractive-index variations. This property was used to imprint the particular phase-shifting profiles corresponding to our filters *h*. Film photoexposure was performed directly on the chrome layer using a 682 nm laser source with 2.5 · 10^{−4}W/cm^{2} irradiance and 35 s duration. Subsequent curing occurred for several days under white light. These settings produce phase shifts closest to {0, *π*} when using the red-light channel during acquisition. In Fig. 8(b) we show a photograph of a red LED taken through one such mask.

Distinct binary phase masks corresponding to distinct PSF sizes were produced, with a total of 9 masks on the wafer, ranging from 100%, for which the PSF impulses were scattered over the whole canvas, down to 0.1%, for which all impulses were confined within a square whose side length is 0.1% of that of the PSF canvas. Each PSF contains 30 pseudo-randomly placed peaks, the placement corresponding to one particular realization. In each mask, the binary phase-shifting values of interest are organized into squares composed of 1024 × 1024 pixels of size 25 × 25 *μ*m².

In view of performing CS image acquisition, the phase mask corresponding to the filter *h* of interest was integrated in a simple optical system. Specifically, the mask was introduced between the objective and the camera of a polarizing microscope, thus replacing the initially-placed polarizing filter. The microscope was equipped with a Jenoptik ProgRes SpeedXT 5 camera, having a 2/3″ sensor with 3.4*μm*^{2} pixel size. Note that the position of the phase mask corresponds (at least approximately) to the Fourier plane in the microscope, as required in our Fourier-optics model. The diaphragm of the light source was set to its minimum size in order to limit the effective size of the object profile, see Fig. 9(a). Every image was taken as the average of 15 acquisitions, Figs. 9(a)–9(b), for noise reduction. Each acquisition was performed in raw 12-bit format, keeping only the content of the red channel.

Note that the effect of the phase mask is wavelength-dependent, in the sense that the phase shifts are exact only for a specific acquisition wavelength. Accordingly, the relative 0 and *π* phase-shift values (approximately) hold only when employing a tight band around the required wavelength. In our case, we tuned the laser intensity and exposure time until we found the parameters producing good results in the camera's red channel. Still, most of the wavelengths in this band will be shifted by values approximately between 0.9*π* and 1.1*π*, assuming a 100 nm wide band centered at 600 nm. As discussed above, the main effect of this is that part of the energy of the image passes through the mask unaltered, adding an impulse at the center of the PSF, as well as some amount of radial blur to the impulse response.

#### 4.2. Calibration

The CS reconstruction algorithms considered in this work all assume—and thus require—the knowledge of the PSF of the system. While a PSF could ideally be computed from the corresponding phase-mask profile—based on our optical model—the many nonidealities that are intrinsic to practical implementation favor calibration approaches with real data. Accordingly, we proposed to deduce the PSF from distinct image pairs, each of these pairs being composed of two image acquisitions with and without phase mask. In total, we used 6 such pairs for calibration.

Now, the particular structure of our PSF not only allows to perform RC-type acquisition with acceptable contrast in incoherent-light settings, but also has the potential to improve calibration accuracy, when properly taken into account. Indeed, the fact that our PSFs are both non-negative and sparse in the spatial domain—especially sparse are the 30-peak PSFs corresponding to our phase masks—constitutes substantial prior knowledge. In order to exploit the latter, we define the calibration step itself as a convex-optimization problem where the PSF *h*_{sol} to be estimated is defined as

*h*_{sol} = arg min_{*h* ∈ Ω^{+}} ∑_{i} ‖*y*_{i} − *x*_{i} ★ *h*‖_{2}^{2} + Λ‖*h*‖_{1}, (7)

where *y*_{i} is the image of the *i*th object acquired with a phase mask, and *x*_{i} is the image of the *i*th object acquired without any mask.

The set Ω^{+} corresponds to the set of all non-negative PSFs. The regularization term enforcing the sparsity of the recovered filter *h*_{sol} is the *l*_{1}-norm denoted by ‖ · ‖_{1}. The real parameter Λ weights its influence with respect to the left term that enforces fidelity of the PSF to the (noisy) calibration data. The close relationship between Problem (7) and *l*_{0}-pseudonorm penalization explains the generation of sparse solutions [22]. The above calibration problem is actually itself a particular instance of a CS problem.

We solved (7) using the YALL1 1.4 basic solver [23]. Specifically, we first ran this solver for 5000 iterations, and then ran it again for 5000 additional iterations while enforcing *h* ∈ Ω^{+} by clipping at the end of each iteration. One instance of our calibration problem is shown with a corresponding solution in Fig. 9. Note that the optical system and the calibration technique that we use can potentially suffer from non-idealities. For instance, in this example involving a real phase-mask implementation, one can clearly observe some asymmetries and a radial pattern on the (outermost) PSF impulses, unlike in the simulated case. The radial pattern is due to the fact that the red filter selected for acquisition passes several distinct light wavelengths, the phase-shifting properties of the phase mask thus being non-ideal and superimposed. The impulse in the center, which was not encoded in the mask, is caused by a phase shift slightly different from *π* radians (see Fig. 3 in Section 3.2). While our practical device is suitable in our context despite such non-idealities, as shown in Section 6, their characterization remains a topic for further investigation.
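For readers without access to YALL1, a problem of the same form as (7) can also be approximated with a simple proximal-gradient (ISTA) iteration. The following self-contained sketch uses synthetic data and assumed parameters (regularization weight, iteration count, Gaussian calibration patterns chosen for conditioning); it is not the exact procedure used in our experiments:

```python
import numpy as np

def calibrate_psf(xs, ys, lam=1e-3, n_iter=5000):
    """Proximal-gradient (ISTA) sketch of the calibration problem:
    min_{h >= 0}  sum_i ||y_i - x_i (*) h||_2^2 + lam * ||h||_1,
    with circular convolutions evaluated via the FFT. This is a simple
    stand-in for the YALL1-based procedure used in the paper."""
    n = xs[0].size
    Xf = [np.fft.fft(x) for x in xs]
    Yf = [np.fft.fft(y) for y in ys]
    L = sum(np.max(np.abs(xf)) ** 2 for xf in Xf)   # Lipschitz bound (for f/2)
    h = np.zeros(n)
    for _ in range(n_iter):
        Hf = np.fft.fft(h)
        grad = sum(np.real(np.fft.ifft(np.conj(xf) * (xf * Hf - yf)))
                   for xf, yf in zip(Xf, Yf))       # gradient of half the fidelity
        h = h - grad / L                            # gradient step
        h = np.maximum(h - lam / (2 * L), 0.0)      # soft-threshold, then h >= 0
    return h

# Synthetic check: recover a 3-impulse PSF from two calibration pairs.
rng = np.random.default_rng(5)
n = 256
h_true = np.zeros(n)
h_true[[20, 75, 130]] = [1.0, 0.7, 0.5]
xs = [rng.standard_normal(n) for _ in range(2)]
ys = [np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h_true))) for x in xs]
h_est = calibrate_psf(xs, ys)
print("max abs error:", np.abs(h_est - h_true).max())
```

The non-negativity clip plays the role of the projection onto Ω⁺, while the soft-threshold implements the *l*₁ penalty; with real microscope data, the conditioning is worse and a solver such as YALL1 is preferable.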

## 5. Reconstruction problems

We have defined several types of CS reconstruction problems involving our optical acquisition device. The goal of each of these problems is to assess to what extent the acquisitions obtained from our practical optical system in the presence of a phase mask comply with the properties of CS. Specifically, we propose to assess the quality of reconstruction compared to the original object when the measurements are subjected either to loss—as in classical CS-recovery problems—or to quantization effects. These effects are intrinsic to the 4 reconstruction problems that we define below.

- **Super-Resolution:** A lower-resolution version of the measurements is generated by downsampling the 2D acquisition onto a coarser Cartesian grid using nearest-neighbor interpolation. The goal is thus to retrieve the original high-resolution version of the measurements.
- **Unmasking:** The measurements corresponding to one of the image halves are removed and substituted with zeros. This removed image half thus has to be recovered.
- **Desaturation:** All measurements whose brightness exceeds 60% of the brightest pixel value are discarded, thus simulating a sensor-saturation effect. These saturated values must be recovered.
- **1-Bit CS:** All measurements are quantized to one single bit based on local comparisons of the intensity values. This amounts to applying finite-differentiation and zero-thresholding steps to the original measurements, as described in [15]. The goal is then to retrieve the original non-quantized measurement values up to a common scaling factor.
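The 1-bit acquisition of the last problem can be sketched as follows (our own 1D illustration); the two-tap difference filter `f = [1, -1]` is an assumed stand-in for the finite-differentiation filter of [15]:

```python
import numpy as np

def one_bit_measurements(x, h, f=np.array([1.0, -1.0])):
    """Sketch of the 1-bit forward model: convolve the object with the PSF h,
    apply a finite-difference filter f, then keep only the sign of the result."""
    n = x.size
    y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, n)))     # x convolved with h
    d = np.real(np.fft.ifft(np.fft.fft(y) * np.fft.fft(f, n)))     # finite differences
    return np.where(d >= 0, 1, -1)                                 # 1-bit quantization

rng = np.random.default_rng(11)
x = rng.uniform(0, 1, 64)                       # non-negative object intensities
h = np.zeros(64)
h[rng.choice(64, 8, replace=False)] = 1.0       # sparse 8-impulse PSF
bits = one_bit_measurements(x, h)
print(bits[:16])
```

Because the sign is invariant to any positive scaling of the input intensities, the object can only be retrieved up to a common scaling factor, as stated above.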

The overall discrete acquisition model that is common to the distinct CS problems defined above is expressed in the generic form

*y* = *𝒟*{*𝒩*{*x* ★ *h*}}, (8)

where *x* corresponds to the unknown object-intensity profile—as specified on the original sensor grid—to be retrieved, where *h* is the PSF of the optical system, where *𝒟* denotes the 1-bit quantization and/or sample-loss degradation to which the measurements are subjected, and where *𝒩* corresponds to the noise by which the light intensities *x* ★ *h* measured by the sensors are affected. Note that, in our incoherent-light setting, the relevance and practicality of our acquisition system for CS rely on the design, implementation, and calibration of the optical filters *h* in the presence of noise. In the current version of our system, the subsequent sample-loss and/or binary-quantization operations in *𝒟* are carried out numerically based on the optically acquired intensities *y* = *𝒩*{*x* ★ *h*}. This allows us to introduce our novel phase masks in an existing optical system that is endowed with a conventional sensor array.

In Problems (1), (2), and (3), the forward model corresponding to Eq. (8) consists of a convolution operation followed by downsampling, which, assuming additive Gaussian noise, can be written in vector notation as

**y** = **D**(**Hx** + **n**), (9)

where **H** is a convolution matrix associated with the unknown PSF *h*, where **D** is a downsampling matrix associated with the sample loss specified in *𝒟*, and where **y** and **n** are vectors corresponding to lexicographically-ordered versions of the components of *y* and *n*, respectively. In Problem (4), the acquisition involves finite-differentiation and quantization operations following convolution with *h*. The finite-differentiation operation corresponds to a discrete convolution with a filter *f*, and the quantization operation corresponds to a zero-thresholding step. Accordingly, the forward model corresponding to Problem (4) can be written in the form

**y** = *𝒬*{**F**(**Hx** + **n**)}, (10)

where the convolution matrix **F** implements the finite-differentiation filter *f* applied on the measurements before zero-thresholding, and where the operator *𝒬* quantizes any vector argument **v** componentwise as

*𝒬*{**v**} = sign(**v**). (11)

A hardware implementation of the degradation operations in *𝒟* could further improve the acquisition speed and/or bandwidth of our system for all types of forward models under consideration. In principle, efficient and accurate implementations could be realized electronically using randomly addressable arrays for downsampling and/or additional binary-comparator stages for 1-bit quantization [7, 15]. Note that it should remain possible to disable the hardware-quantization stage when required, *e.g.*, during PSF calibration. While the development of such a modified sensor array requires further investigation, its manufacturing could exploit currently existing sensor technology. Indeed, optical sensors with random access can be manufactured using CMOS technology [24]. Furthermore, even though, to our knowledge, no 1-bit-comparator stage has been designed for optical acquisition, closely related optical CMOS-type sensors with embedded finite-differentiation capabilities have been proposed [25].
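The forward models above can be sketched numerically as follows (a simplified illustration assuming periodic boundary conditions for the convolution; the profile `x`, the PSF `h`, the noise level, the sampling pattern, and the plain horizontal difference standing in for **F** are all placeholder assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
x = rng.random((N, N))                 # placeholder intensity profile
h = rng.random((N, N)); h /= h.sum()   # placeholder PSF, normalized

# Hx via circular convolution in the Fourier domain (approximates x * h)
Hx = np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(h)))
n = 0.01 * rng.standard_normal((N, N))  # additive Gaussian noise

# Problems (1)-(3): y = D(Hx + n), with D a sampling/selection operator
keep = np.zeros((N, N), dtype=bool)
keep[::5, ::5] = True                   # e.g., regular 5x5 downsampling
y = (Hx + n)[keep]

# Problem (4): y = Q{F(Hx + n)}, with Q the componentwise sign
FHx = np.diff(Hx + n, axis=0)           # placeholder finite differentiation
y_bin = np.sign(FHx)
```

The actual device involves no periodic boundary conditions, so this sketch only mirrors the structure of Eqs. (9) and (10), not the exact operators.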

Now, according to Eq. (8), the common goal of our 4 distinct CS-reconstruction problems is to find a suitable solution *x̃* that satisfies

*y* ≈ *𝒟*{*x̃* ★ *h*_{sol}}, (12)

where *h*_{sol} corresponds to the known calibrated PSF defined in Eq. (7), and where the operator *𝒟* is defined according to the specific forward model under consideration. By *suitable solution*, we mean a solution that not only complies with the constraint (12), but is also sparse in the CS sense. In Problems (1)–(3), the solution *x̃* expressed in generic form in Eq. (12) can be formulated explicitly in vector notation as

**x̃** = arg min_{**x**} ½‖**y** − **DH**_{sol}**x**‖_{2}^{2} + Λ_{1}‖**x**‖_{TV}, (13)

where **H**_{sol} is a convolution matrix that implements the PSF *h*_{sol}, where **D** is the downsampling matrix defined above, where ‖ · ‖_{TV} denotes the total-variation (TV) semi-norm, and where Λ_{1} balances the influence of the fidelity to the known measurements and of the TV regularization term. In order to solve Eq. (13), corresponding to Problems (1), (2), and (3), we used the NESTA solver described in [26]. In Problem (4), the solution defined implicitly in Eq. (12) is computed based on the 1-bit-CS-reconstruction formulation and algorithm proposed in [15]. Compared to Eq. (13), the solution formulation proposed in that case involves a data-fidelity term whose form is tailored to better ensure consistency with 1-bit-type measurements than an *ℓ*_{2}-norm. Following the above notations and definitions, we have

**x̃** = arg min_{**x**} (1/*M*) ∑_{m=1}^{M} *c*(*y*_{m}, [**FH**_{sol}**x**]_{m}) + Λ_{2}*ℛ*(**x**), (14)

where *c* is the convex per-measurement cost of [15] that penalizes sign inconsistencies between the binary measurement *y*_{m} and the corresponding re-estimated intensity difference, and where *M* denotes the number of acquired measurements. Besides penalizing sign inconsistencies of potential solutions with respect to the existing binary measurements, the rationale behind this definition is to yield nontrivial solutions while ensuring the convexity of the 1-bit-CS optimization problem [15]. The regularization term *ℛ* is a Huber-based convex functional whose sparsity-promoting effect is essentially similar to that of TV. Finally, the parameter Λ_{2} balances the respective influences of the data and regularization terms.
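As a rough numerical illustration of the TV-regularized formulation of Eq. (13), the following sketch runs plain gradient descent on a smoothed TV term. It stands in for the NESTA solver [26] used in practice, and the PSF, sampling pattern, step size, and parameter values are placeholder assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 32
h = np.zeros((N, N)); h[0, 0] = h[0, 1] = h[1, 0] = 1 / 3  # placeholder PSF

def H(v):   # circular convolution with h (stand-in for the matrix H_sol)
    return np.real(np.fft.ifft2(np.fft.fft2(v) * np.fft.fft2(h)))

def Ht(v):  # adjoint of H
    return np.real(np.fft.ifft2(np.fft.fft2(v) * np.conj(np.fft.fft2(h))))

keep = np.zeros((N, N), dtype=bool); keep[::2, ::2] = True  # operator D
x_true = rng.random((N, N))
y = H(x_true)[keep]               # noiseless measurements for this sketch

lam, eps, step = 1e-3, 1e-3, 0.5  # assumed parameters

def grad_tv(x):
    # gradient of an eps-smoothed isotropic TV term
    dx = np.diff(x, axis=0, append=x[-1:, :])
    dy = np.diff(x, axis=1, append=x[:, -1:])
    mag = np.sqrt(dx**2 + dy**2 + eps)
    px, py = dx / mag, dy / mag
    return -((px - np.roll(px, 1, axis=0)) + (py - np.roll(py, 1, axis=1)))

x = np.zeros((N, N))
for _ in range(200):              # plain gradient descent on Eq. (13)
    r = np.zeros((N, N)); r[keep] = H(x)[keep] - y   # D^T(D H x - y)
    x -= step * (Ht(r) + lam * grad_tv(x))
```

In practice, NESTA handles the nonsmooth TV term and the measurement fidelity far more efficiently than this naive descent, but the objective being minimized has the same structure.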

## 6. Experiments

In this section, we perform several experiments involving the general acquisition and reconstruction framework described above. In these experiments, a set of 12 distinct object profiles is acquired both with and without the phase mask. As mentioned above, 6 of these acquired profiles are employed for calibration. The remaining acquisitions are used to assess the suitability of our optical CS system based on the resolution approaches formalized in Eqs. (13) and (14). The results of our reconstruction experiments are evaluated visually. Note that, in Problem (4), the reconstructed solution is only known up to scaling. In that case, the scaling factor of the reconstruction is adjusted as in [15] to yield meaningful results.

The 3 objects used for visual evaluation are shown in Fig. 10.

#### 6.1. Super-resolution

In this set of experiments, we consider the resolution of Problem (1) using a downsampling matrix **D** corresponding to a regularly subsampled grid that is 5-fold coarser than the original sensor grid in each dimension. Visual results on the USAF-target object are shown in Fig. 11. The reconstruction obtained with our method contains details that are lost to aliasing when no phase mask is employed.

#### 6.2. Unmasking

Following the definition of Problem (2), the goal in this second set of experiments is to reconstruct images from their masked versions where all pixels from the left half of the FOV are missing. The masking operation makes the overall acquisition setting equivalent to using a sensor array with a horizontally narrower FOV. From that perspective, our reconstruction problem relates to FOV extension. Note that Problem (2) is structurally similar to Problem (1), except for the definition of **D**, which is associated with masking instead of 5 × 5 regular downsampling. Visual results on the USAF-target object are shown in Fig. 12. In this figure, our CS technique is shown to recover the whole image from its masked version, unlike in the conventional-acquisition setting.

#### 6.3. Desaturation

In conventional imaging, saturation causes the loss of all detailed visual information in the related areas. In this experiment, we assess the robustness of our CS-based acquisition device to saturation effects. The desaturation problem (3) that we consider is again structurally similar to Problem (1) except for the specification of **D**. Here, the latter is associated with the removal of all saturated image pixels. We simulate the effect of saturation by removing all pixels in the image that have a value of more than 60% of the brightest pixel. Visual results on the object Salt crystal are shown in Fig. 13. We observe that our technique allows for object-profile recovery even in the presence of saturation. This result is consistent with those of our above unmasking and downsampling experiments. In all these problems, the original object profile is recovered by exploiting the non-locality of the pixel information that is intrinsic to CS acquisitions.

#### 6.4. 1-bit CS

In this last set of experiments, we investigate the possibility of reconstructing object profiles from CS measurements that are quantized to one single bit per pixel, based on our practical optical implementation. These experiments correspond to Problem (4) and follow the overall acquisition-and-reconstruction framework that was proposed in [15] for 1-bit CS based on a simulated optical model. In our setting, we employ the same reconstruction algorithm as in [15], the important difference being that our acquired data stem from a real-world optical device where no periodic boundary conditions are assumed. Our practical setting also implies that measurements are subjected to noise and that model mismatch, *i.e.*, potential discrepancies between the actual and estimated kernels *h* and *h*_{sol}, can occur. The acquisition and reconstruction parameters are the same as in [15], except for the regularization parameter Λ, which was set to 7 · 10^{−3}. The number of iterations was set to 30. The finite-differentiation filter *f* that we use for acquisition corresponds to a *bent 1D Laplacian*, which was empirically found to provide better reconstructions than 1D-gradient or 2D-Laplacian finite differentiation.

It is defined as

*f*[**k**] = 2*δ*[**k**] − *δ*[**k** − (1, 0)] − *δ*[**k** − (0, 1)],

where *δ* denotes the unit impulse, and where **k** ∈ ℤ^{2} denotes the discrete 2D coordinate whose origin is set at the upper-left image corner. Visual results on the objects circuit and salt crystal are shown in Fig. 14. In this figure, we observe that the quality of the object profile reconstructed with our method is acceptable despite the losses due to the 1-bit quantization of our measurements.
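Under one plausible reading of the *bent 1D Laplacian* (the 1D taps (−1, 2, −1) bent at a right angle across the two image axes), the acquisition side of Problem (4) can be sketched as follows; the kernel and test image are illustrative assumptions:

```python
import numpy as np

# One plausible bent-1D-Laplacian kernel: taps (-1, 2, -1) bent at a
# right angle across the two image axes (illustrative reading).
f = np.array([[ 2.0, -1.0],
              [-1.0,  0.0]])

def conv2_valid(img, ker):
    # small 'valid'-mode 2D correlation, sufficient for this sketch
    kh, kw = ker.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * ker)
    return out

rng = np.random.default_rng(3)
img = rng.random((16, 16))             # placeholder measured intensities
y_1bit = np.sign(conv2_valid(img, f))  # finite differences, then zero-thresholding
```

Since the kernel taps sum to zero, constant (DC) regions yield a zero response before thresholding, which is consistent with a Laplacian-type finite difference.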

In order to get an idea of how far these results are from the ideal case in which the PSF of the system is perfectly known, we simulated a new measurement by applying the calibrated PSF to the original object circuit. The result of applying the above 1-bit-CS reconstruction procedure to this image is shown in Fig. 15(c). The grayscale levels appear somewhat better recovered than in the actual reconstructed image of Fig. 15(b), but most features of both images are visually comparable.

Figure 15(d) shows the result of attempting to reconstruct Fig. 15(b) from 1-bit measurements taken directly from the original image, without using any phase mask and with the same reconstruction parameters.

## 7. Conclusions

We proposed a novel strategy and implementation for the parallel acquisition of optical CS-type measurements under spatially-incoherent illumination. Based on an adaptation of the initial random-convolution paradigm, our approach consisted in the modeling and design of specifically tailored phase masks ensuring satisfactory contrast-to-noise ratios in practical settings. Our optical implementation was based on an existing microscope, the phase masks being introduced in the aperture of the system.

Based on our implementation, we demonstrated the possibility of reconstructing several types of object profiles from the acquired measurements following downsampling or binarization. Our approach was proven successful in several distinct problems, including super-resolution and 1-bit CS, even though our practical setting involved noise and potential model mismatches during calibration. At this stage, our joint acquisition and reconstruction framework thus constitutes a proof of concept of the practical viability of optical random convolution in general, as well as of binary compressed imaging in the sense of [15].

Further investigations are worth conducting from both theoretical and technological standpoints. First, our overall acquisition and reconstruction strategies could be refined to better take potential sources of noise or model mismatch into account. Another line of research is the optimization of the phase-mask design in our incoherent-light setting based on sound mathematical grounds, and the determination of associated CS-recovery guarantees. Such an optimization could either consider the number and position of pulses as parameters or follow a distinct type of design. Finally, the development of an optical acquisition system comprising CMOS-type sensors with embedded 1-bit-quantization capabilities can be envisioned.

## Acknowledgments

We would like to thank all the people at EPFL who, very patiently, helped us set up the experiments: everyone at the Center of MicroNanoTechnology, with special mentions to Laszlo Petho and Kaspar Suter for their assistance in the clean room; everyone at the Laboratory of Applied Photonics Devices and, in particular, Mickäel Guillaumée and Volker Zagolla, for their help with the fabrication of the phase masks and with instrumentation; as well as Hadrien Michaud and Robin Nigon for helping with the image acquisition. We would also like to thank Emrah Bostan, at the Laboratoire d’Imagerie Biomédicale, for his very useful ideas.

## References and links

**1. **D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory **52**, 1289–1306 (2006).

**2. **R. G. Baraniuk, “Single-pixel imaging via compressive sampling,” IEEE Signal Process. Mag. **25**, 83–91 (2008).

**3. **G. Huang, H. Jiang, K. Matthews, and P. Wilford, “Lensless imaging by compressive sensing,” in Proc. IEEE Int. Conf. on Image Processing, 2101–2105 (2013).

**4. **R. Fergus, A. Torralba, and W. T. Freeman, “Random lens imaging,” preprint (2006).

**5. **R. Horisaki and J. Tanida, “Multi-channel data acquisition using multiplexed imaging with spatial encoding,” Opt. Express **18**, 23041–23053 (2010).

**6. **A. Ashok and M. A. Neifeld, “Pseudorandom phase masks for superresolution imaging from subpixel shifting,” Appl. Opt. **46**, 2256–2268 (2007).

**7. **A. Bourquard, F. Aguet, and M. Unser, “Optical imaging using binary sensors,” Opt. Express **18**, 4876–4888 (2010).

**8. **L. Jacques, P. Vandergheynst, A. Bibet, V. Majidzadeh, A. Schmid, and Y. Leblebici, “CMOS compressed imaging by random convolution,” in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 1113–1116 (2009).

**9. **J. Romberg, “Compressive sensing by random convolution,” SIAM J. Imaging Sci. **2**, 1098–1128 (2009).

**10. **R. F. Marcia, Z. T. Harmany, and R. M. Willett, “Compressive coded aperture imaging,” Proc. SPIE **7246**, Computational Imaging VII, 72460G (2009).

**11. **W. Yin, S. Morgan, J. Yang, and Y. Zhang, “Practical compressive sensing with Toeplitz and circulant matrices,” Proc. SPIE **7744**, 77440K (2010).

**12. **Z. T. Harmany, R. F. Marcia, and R. M. Willett, “Compressive coded aperture keyed exposure imaging with optical flow reconstruction,” arXiv:1306.6281 (2013).

**13. **A. Stern and B. Javidi, “Random projections imaging with extended space-bandwidth product,” J. Disp. Technol. **3**, 315–320 (2007).

**14. **J. W. Goodman, *Introduction to Fourier Optics* (McGraw-Hill, 1960).

**15. **A. Bourquard and M. Unser, “Binary compressed imaging,” IEEE Trans. Image Process. **22**, 1042–1055 (2013).

**16. **A. Bourquard, “Compressed optical imaging,” Ph.D. thesis, Ecole Polytechnique Fédérale de Lausanne (2013).

**17. **R. Horisaki and J. Tanida, “Preconditioning for multiplexed imaging with spatially coded PSFs,” Opt. Express **19**, 12540–12550 (2011).

**18. **V. Chandar, “A negative result concerning explicit matrices with the restricted isometry property,” preprint (2008).

**19. **R. Berinde, A. C. Gilbert, P. Indyk, H. Karloff, and M. J. Strauss, “Combining geometry and combinatorics: a unified approach to sparse signal recovery,” in Proc. Allerton Conf. on Communication, Control, and Computing, 798–805 (IEEE, 2008).

**20. **R. Berinde and P. Indyk, “Sparse recovery using sparse random matrices,” preprint (2008).

**21. **E. Ben-Eliezer, N. Konforti, B. Milgrom, and E. Marom, “An optimal binary amplitude-phase mask for hybrid imaging systems that exhibit high resolution and extended depth of field,” Opt. Express **16**, 20540–20561 (2008).

**22. **D. L. Donoho, “For most large underdetermined systems of linear equations the minimal l_{1}-norm solution is also the sparsest solution,” Commun. Pure Appl. Math. **59**, 797–829 (2006).

**23. **Y. Zhang, J. Yang, and W. Yin, “YALL1: Your ALgorithms for L1,” online at yall1.blogs.rice.edu (2011).

**24. **B. Dierickx, D. Scheffer, G. Meynants, W. Ogiers, and J. Vlummens, “Random addressable active pixel image sensors,” Proc. SPIE **2905**, 2 (1996).

**25. **N. Massari, M. Gottardi, L. Gonzo, D. Stoppa, and A. Simoni, “A CMOS image sensor with programmable pixel-level analog processing,” IEEE Trans. Neural Netw. **16**, 1673–1684 (2005).

**26. **S. Becker, J. Bobin, and E. J. Candès, “NESTA: a fast and accurate first-order method for sparse recovery,” SIAM J. Imaging Sci. **4**, 1–39 (2011).