## Abstract

Acousto-optic imaging (AOI) enables optical-contrast imaging deep inside scattering samples via localized ultrasound-modulation of scattered light. While AOI allows optical investigations at depth, its imaging resolution is inherently limited by the ultrasound wavelength, prohibiting microscopic investigations. Here, we propose a computational imaging approach that allows optical diffraction-limited imaging using a conventional AOI system. We achieve this by extracting diffraction-limited imaging information from speckle correlations in the conventionally detected ultrasound-modulated scattered-light fields. Specifically, we identify that since “memory-effect” speckle correlations allow estimation of the Fourier magnitude of the field inside the ultrasound focus, scanning the ultrasound focus enables robust diffraction-limited reconstruction of extended objects using ptychography (i.e., we exploit the ultrasound focus as the scanned spatial-gate probe required for ptychographic phase retrieval). Moreover, we exploit the short speckle decorrelation-time in dynamic media, which is usually considered a hurdle for wavefront-shaping-based approaches, for improved ptychographic reconstruction. We experimentally demonstrate noninvasive imaging of targets that extend well beyond the memory-effect range, with a 40-times resolution improvement over conventional AOI.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. INTRODUCTION

Optical microscopy through scattering media is a long-standing challenge with great implications for biomedicine. Since scattered light limits the penetration depth of diffraction-limited optical imaging techniques to approximately 1 mm, finding a solution for deep high-resolution imaging is the focus of many recent works [1]. Modern techniques that are based on using unscattered, “ballistic” light, such as optical coherence tomography and two-photon microscopy, have proven very useful, but are inherently limited to shallow depths where a measurable amount of unscattered photons is present [2–7].

The current leading approaches for deep-tissue imaging, where no ballistic components are present, are based on the combination of light and ultrasound [1], such as acousto-optic tomography (AOT) [8–10] and photoacoustic tomography (PAT) [8,11]. PAT relies on the generation of ultrasonic waves by the absorption of light in a target structure under pulsed optical illumination. In PAT, images of absorbing structures are reconstructed by recording the propagated ultrasonic waves with detectors placed outside the sample. In contrast to PAT, AOT does not require optical absorption but is based on the acousto-optic (AO) effect: In AOT a focused ultrasound spot is used to locally modulate light at chosen positions inside the sample. The ultrasound spot is generated and scanned inside the sample by an external ultrasonic transducer. The modulated, frequency-shifted, light is detected outside the sample using, e.g., interferometry-based approaches [8,10]. This enables the measurement of the total optical intensity traversing through the localized acoustic focus inside the sample. Light also can be focused back into the ultrasound focus via optical phase conjugation of the tagged light in time-reversed ultrasonically encoded (TRUE) [12] optical focusing, or via iterative optimization, which can be used for fluorescence imaging [13,14]. AOT and PAT combine the advantages of optical contrast with the near scatter-free propagation of ultrasound in soft tissues. However, they suffer from low spatial resolution that is limited by the dimensions of the ultrasound focus, dictated by acoustic diffraction. This resolution is several orders of magnitude lower than the optical diffraction limit. 
For example, for an ultrasound frequency of 50 MHz, the acoustic wavelength is 30 µm, while the optical diffraction limit for coherent imaging is $\frac{\lambda}{{\rm NA}}$, where NA is the numerical aperture of the imaging system and $\lambda$ is the optical wavelength (i.e., a 100-fold difference in resolution). This results in a very significant gap and a great challenge for deep cellular and subcellular imaging.

In recent years, several novel approaches to overcome the acoustic resolution limit of AOT based on wavefront shaping have been developed. These include iterative TRUE (iTRUE) [15,16], time reversal of variance-encoded (TROVE) optical focusing [17], and the measurement of the acousto-optic transmission matrix (AOTM) [18]. Both iTRUE and TROVE rely on a digital optical phase conjugation (DOPC) system [19], a complex apparatus that conjugates a high-resolution spatial light modulator to a camera. In AOTM, an identical resolution increase as in TROVE is obtained without the use of a DOPC system, by measuring the transmission matrix of the ultrasound modulated light and using its singular value decomposition (SVD) for sub-acoustic optical focusing. A major drawback of these state-of-the-art super-resolution AOT approaches is that they require a large number of measurements and wavefront-shaping operations to be performed in a time shorter than the sample speckle decorrelation time. In addition, in practice, these techniques do not allow a resolution improvement of more than a factor of $\times 3-\times 6$ over the acoustic diffraction limit [18]. Recently, approaches that do not rely on wavefront shaping, and exploit the dynamic fluctuations to enable improved resolution [20] or fluorescent imaging [21], have been demonstrated, but these do not practically allow a resolution increase of more than a factor of 2–3. Thus, closing the two-orders-of-magnitude gap between the ultrasound resolution and the optical diffraction limit is still an open challenge.

Diffraction-limited resolution imaging through highly scattering samples without relying on ballistic light is currently possible only by relying on the optical memory effect for speckle correlations [22–26]. These techniques retrieve the scene behind a scattering layer by analyzing the correlations within the scattered light speckle patterns. Unfortunately, the memory effect has a very narrow angular range, which limits these techniques to isolated objects that are contained within the memory-effect field of view (FoV). For example, at a depth of 1 mm, the memory-effect range is on the order of ten microns [27–29], making it inapplicable for imaging extended objects using current approaches.

Here, we present acousto-optic ptychographic imaging (AOPI), an approach that allows optical diffraction-limited imaging over a wide FoV that is not limited by the memory-effect range, by combining acousto-optic imaging (AOI) with speckle-correlation imaging. Specifically, we use the ultrasound focus as a controlled spatial-gate probe that is scanned across the wide imaging FoV, and use speckle correlations to retrieve optical diffraction-limited information from within the ultrasound focus. We develop a reliable and robust computational reconstruction framework that is based on ptychography [30–32] that exploits the intentional partial overlap between the ultrasound foci. We demonstrate in proof-of-principle experiments a ${\gt}\times 40$ increase in resolution over the ultrasound diffraction limit, providing a resolution of 3.65 µm using a modulating ultrasound frequency of 25 MHz.

## 2. METHODS

#### A. Principle

The principle of our approach is presented in Fig. 1 along with a numerical example. Our approach is based on a conventional pulsed AOI setup, employing a camera-based holographic detection of the ultrasound modulated light [8,18]. In this setup [Fig. 1(a)], the sample is illuminated by a pulsed quasi-monochromatic light beam at a frequency ${f_{\rm{opt}}}$. The diffused spatially coherent light is ultrasonically tagged at a chosen position inside the sample by a focused ultrasound pulse at a central frequency ${f_{\rm{US}}}$. The acousto-optic modulated (ultrasound-tagged) light field at frequency ${f_{\rm{AO}}} = {f_{\rm{opt}}} + {f_{\rm{US}}}$ is measured by a camera placed outside the sample using a pulsed reference beam that is synchronized with the ultrasound pulses, via off-axis phase-shifting interferometry [20,33], as shown in Supplement 1, Section 1. The spatially coherent illumination and the holographic detection are required to separate the frequency-shifted light from the strong unmodulated background. Alternative approaches to filter the ultrasound-tagged signal may also be considered [8,9].

In conventional AOI, the ultrasound focus position, $r_m^{\rm{US}}$, is scanned over $m = 1 \cdots M$ positions along the target object inside the sample [Figs. 1(b) and 1(c)], and the AOI image, ${I^{\rm{AOI}}}({\textbf{r}})$, is formed by summing the total power of the detected ultrasound-modulated light at each scanned position, $r_m^{\rm{US}}$:

$${I^{\rm{AOI}}}(r_m^{\rm{US}}) = \int {| {{E_m}({{\textbf{r}}_{{\textbf{cam}}}})} |^2}\,{\rm d}{{\textbf{r}}_{{\textbf{cam}}}}. \qquad (1)$$
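As an illustrative sketch (not part of the original experimental code; array shapes and names are ours), the conventional AOI image formation described above reduces to summing the tagged-light power over all camera pixels, once per ultrasound focus position:

```python
import numpy as np

def conventional_aoi_trace(fields):
    """Conventional AOI: for each of the M ultrasound focus positions,
    integrate the total power of the detected ultrasound-tagged field
    E_m(r_cam) over all camera pixels.

    fields: complex array of shape (M, H, W).
    Returns an M-element vector, one AOI value per focus position.
    """
    return np.sum(np.abs(fields) ** 2, axis=(1, 2))

# Toy usage: two focus positions; the second tags twice the optical power.
rng = np.random.default_rng(0)
E = rng.normal(size=(2, 8, 8)) + 1j * rng.normal(size=(2, 8, 8))
E[1] = np.sqrt(2) * E[0]
trace = conventional_aoi_trace(E)   # trace[1] is twice trace[0]
```

Since only the total power per position is kept, all spatial structure finer than the ultrasound focus is discarded, which is exactly the information the ptychographic approach below recovers.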

Our approach to overcome the acoustic diffraction limit relies on the same data acquisition scheme as in conventional camera-based AOI, as shown in Figs. 1(b)–1(d). However, instead of integrating the total power of the camera-detected field at each ultrasound focus position, we use the spatial information in the detected field, ${E_m}({{\textbf{r}}_{{\textbf{cam}}}})$, to reconstruct the diffraction-limited target features *inside* the ultrasound focus via speckle-correlation computational imaging [25,26,34]. Specifically, we estimate the autocorrelation of the hidden target inside each ultrasound focus position, as shown in Fig. 1(e), from the detected speckle autocorrelation, and then use a ptychography-based algorithm [30,31] to jointly reconstruct the entire target from its autocorrelation estimates, as shown in Fig. 1(f). Our approach exploits the richness of information in the ultrasound-modulated speckle fields, which contain a number of speckle grains limited only by the camera pixel count.

Beyond improving the resolution of AOI by several orders of magnitude, our approach tackles a fundamental requirement of speckle-correlation imaging that is generally very difficult to fulfill: The entire imaged object area must be contained within the memory-effect correlation range [23,25,26] (i.e., all object points must produce correlated speckle patterns [26]). This requirement usually limits speckle-correlation imaging to small and unnaturally isolated objects.

Recently, ptychography-based approaches were used to overcome the memory-effect FoV [35,36]. However, the implementations of all FoV-extending approaches to date required direct access to the target to control the illumination area, a requirement that is impossible to fulfill in noninvasive imaging. Our approach relaxes this requirement by effectively replacing the scanned illumination aperture with noninvasive scanning of an ultrasound focus. Another conceptually similar speckle-correlation approach that exploits direct access to the target was suggested to improve the resolution of fluorescence microscopy by scanning a high-resolution speckle pattern [37].

Our approach overcomes the obstacle of direct access to the target by relying on noninvasive ultrasound tagging to spatially limit the detected light to originate only from a small controlled volume, determined by the ultrasound focus position. The remaining requirement for speckle-correlation imaging is that the ultrasound focus [Fig. 1(b), dashed yellow circle] be smaller than the memory-effect range [Fig. 1(b), dashed green circle], such that all the tagged light originates from within the memory-effect range. Our approach thus allows the imaging of objects that extend well beyond the memory-effect FoV, without a limit on their total dimensions.

Mathematically, our approach can be described as follows: Consider a target object located inside a scattering sample [Fig. 1(a)]. For simplicity, we consider an object that is described by a thin 2D amplitude and phase mask, whose complex field transmission is given by $O({\textbf{r}})$. Our goal is to reconstruct the object's 2D intensity transmission $|O({\textbf{r}}{)|^2}$ by noninvasive measurements of the scattered light distributions outside the sample. A monochromatic spatially coherent laser beam illuminates the object through the scattering sample [Fig. 1(a)]. The light propagation through the scattering sample to the object results in a speckle illumination pattern at the object plane. Considering a dynamic scattering sample, such as biological tissues, the illuminating speckle pattern on the object is time-varying. We denote the speckle field illuminating the object at a time ${t_n}$ by ${S_n}({\textbf{r}})$.

The field distribution of the light that traverses the object at time ${t_n}$ is thus given by ${O_n}({\textbf{r}}) = O({\textbf{r}}){S_n}({\textbf{r}})$. This field is ultrasound modulated by the ultrasound focus, whose central position, $r_m^{\rm{US}}$, is sequentially scanned over $m = 1 \cdots M$ positions inside the sample. We denote the ultrasound focus pressure distribution at the m-th position by $U({\textbf{r}} - r_m^{\rm{US}})$. The shift-invariance of the ultrasound focus is assumed here for simplicity of the derivation, and is not a necessary requirement [38].

The ultrasound modulated light field is given by the product of ${O_n}({\textbf{r}})$ and $U({\textbf{r}} - r_m^{\rm{US}})$: ${O_{m,n}}({\textbf{r}}) = {O_n}({\textbf{r}})U({\textbf{r}} - r_m^{\rm{US}})$, where $r_m^{\rm{US}}$ is the center position of the ultrasound focus. The ultrasound modulated light field propagates to the camera through the scattering sample and produces a random speckle field at the camera plane: ${E_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}})$. When the ultrasound focus dimensions are considerably smaller than the memory-effect range, i.e., ${D_{\rm{US}}} \lt \Delta {r_{\rm{mem}}} \approx L\Delta {\theta _{\rm{mem}}}$, where $L$ is the depth of the object inside the scattering sample from the camera side [Fig. 1(a)] and $\Delta {\theta _{\rm{mem}}}$ is the angular range of the memory effect [23,24], the detected field, ${E_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}})$, originates entirely from points that are within the memory-effect range. Thus, the focused ultrasound modulation reduces the challenge of imaging an extended object through the scattering medium to the challenge of imaging an object that is contained within the memory-effect range.
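Under the shift-invariant-probe assumption above, a single tagged measurement can be simulated as the product of the object field, the speckle illumination, and the shifted ultrasound probe, propagated to the camera. This is a minimal sketch only: a plain 2D FFT stands in for the propagation through the output scattering layer, and all names are ours:

```python
import numpy as np

def tagged_far_field(obj, speckle, probe, shift):
    """Simulate one ultrasound-tagged measurement E_{m,n}:
    O_{m,n}(r) = O(r) S_n(r) U(r - r_m), propagated by a 2D FFT
    (an idealized far-field stand-in for the scattering medium)."""
    probe_m = np.roll(probe, shift, axis=(0, 1))   # shift-invariant probe U(r - r_m)
    exit_wave = obj * speckle * probe_m            # tagged object field O_{m,n}
    return np.fft.fftshift(np.fft.fft2(exit_wave))

# Toy check: uniform object, illumination, and probe give a single
# central diffraction peak carrying all the power.
n = 8
E = tagged_far_field(np.ones((n, n)), np.ones((n, n)), np.ones((n, n)), (0, 0))
```

With random `speckle` phases and a localized `probe`, the same routine produces the random tagged speckle fields $E_{m,n}$ that the reconstruction pipeline consumes.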

A single-shot solution to this challenge exists for the case of spatially incoherent illumination [26], when the camera is placed at a distance from the scattering medium facet [23]. It is based on estimating the object intensity autocorrelation from the autocorrelation of the detected speckle intensity, followed by phase-retrieval reconstruction [26]. The exact same approach can be applied to the detected ultrasound-modulated speckle fields by incoherently averaging several ($n = 1 \cdots N$) camera frames captured under different speckle illuminations for a single ultrasound focus position; i.e., by calculating the autocorrelation of the incoherently compounded intensity pattern: ${I_m}({{\textbf{r}}_{{\textbf{cam}}}}) = {\langle |{E_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}}{)|^2}\rangle _n} = \frac{1}{N}\sum\nolimits_{n = 1}^N |{E_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}}{)|^2}$. If a sufficiently large number of speckle grains is captured by the camera, the autocorrelation of ${I_m}({{\textbf{r}}_{{\textbf{cam}}}})$ provides a good approximation of the autocorrelation of the object intensity transmission under the modulation by the m-th ultrasound focus position: ${I_m}({{\textbf{r}}_{{\textbf{cam}}}}) \star {I_m}({{\textbf{r}}_{{\textbf{cam}}}}) \approx |{O_m}({\textbf{r}}{)|^2} \star |{O_m}({\textbf{r}}{)|^2} + C$, where $C$ is a constant background term [23,26]. For a dynamic sample, the spatially coherent speckle illuminations naturally vary in time. For a static sample, the varying speckle realizations can be easily obtained, e.g., by placing a rotating diffuser in the illumination path [Fig. 1(a)].
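In code, the incoherent compounding and the autocorrelation estimate can be sketched as follows (NumPy; the autocorrelation is computed via the Wiener–Khinchin theorem, and the constant background term $C$ is suppressed by mean subtraction — an implementation choice of ours, not a step specified in the text):

```python
import numpy as np

def compounded_autocorrelation(E_stack):
    """Estimate the tagged-object autocorrelation at one ultrasound
    position from N coherent fields E_{m,n} (shape (N, H, W)):
    incoherently compound the intensities, then autocorrelate."""
    I_m = np.mean(np.abs(E_stack) ** 2, axis=0)   # I_m = <|E_{m,n}|^2>_n
    I_m = I_m - I_m.mean()                        # suppress the constant term C
    # Wiener-Khinchin: autocorrelation = F^{-1}{ |F{I_m}|^2 }
    ac = np.fft.ifft2(np.abs(np.fft.fft2(I_m)) ** 2)
    return np.real(np.fft.fftshift(ac))

# Toy usage: any pattern's autocorrelation peaks at zero shift.
rng = np.random.default_rng(1)
E = rng.normal(size=(4, 8, 8)) + 1j * rng.normal(size=(4, 8, 8))
ac = compounded_autocorrelation(E)
```

The zero-shift peak lands at the array center after the `fftshift`, mirroring the centered autocorrelation peaks shown in Fig. 1(e).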

Since the Fourier transform of the object autocorrelation is the squared Fourier magnitude of $|{O_m}({\textbf{r}}{)|^2}$, for each ultrasound focus position, ${r_m}$, the ultrasound-modulated portion of the object can be reconstructed from its estimated autocorrelation via phase retrieval [26,39]. Repeating the phase-retrieval process for the $M$ positions of the ultrasound focus thus allows the reconstruction of extended objects by mosaicing $M$ independent phase-retrieval reconstructions, in a similar fashion to keyhole coherent diffractive imaging [40]. However, since speckle autocorrelation estimates usually suffer from statistical speckle noise due to the finite averaging statistics, phase retrieval does not always provide a stable high-fidelity reconstruction [41]. In addition, the ultrasound focus “probe” does not provide a sharp-edged finite support, but rather a soft-edged support, which contains weak object features beyond the memory-effect range. Nonetheless, if a partial overlap between the scanned ultrasound foci exists, the reconstruction problem can be solved more reliably using ptychography, an advanced joint phase-retrieval technique [30–32,42], which was recently shown to be very successful in stable, high-fidelity, robust reconstruction of complex objects. Ptychographic reconstruction is able to outperform conventional “keyhole” phase retrieval by using the joint information of overlapping object parts, at the price of an increased number of scan positions.

To leverage the potential of overlapping ultrasound foci for improved reconstruction, we have adapted a ptychographic reconstruction algorithm to be applied to autocorrelation estimates calculated from ultrasound-modulated speckle fields. The ptychographic reconstruction algorithm (Supplement 1, Section 7) receives as input the $M$ measured Fourier amplitude estimates of the object at the $M$ ultrasound focus positions, the $M$ ultrasound focus positions, and an estimate of the acoustic focus spatial distribution (the “probe” in ptychography jargon). The iterative algorithm begins with a random initial guess of the target object. At each iteration, the algorithm updates the object and probe estimates by performing a Gerchberg–Saxton-type process over each of the $m = 1 \cdots M$ probe positions (Supplement 1, Section 7). For each probe position, the algorithm multiplies the current object estimate by the ultrasound probe estimate, Fourier transforms the result, enforces the measured Fourier amplitude, and inverse Fourier transforms back to the object plane. The result of this calculation is used to update the object and probe estimates in a manner similar to hybrid input-output (HIO) phase retrieval [39]. The update rule may also include various constraints on the reconstructed object and probe. In our approach, since the object intensity pattern is reconstructed, we also enforce a non-negativity constraint on the reconstructed object. The iterative process is repeated until convergence.
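The iteration just described can be sketched as follows. This is a minimal ePIE-style loop under simplifying assumptions of ours: the probe is held fixed (the actual algorithm, rPIE [31], also refines the probe estimate), propagation is an ideal 2D FFT, and the non-negativity constraint is applied as in the text:

```python
import numpy as np

def ptycho_reconstruct(fourier_mags, probe, shifts, n_iter=50, alpha=1.0):
    """Ptychographic phase retrieval from Fourier-magnitude estimates.
    Simplified ePIE-style object update; the probe is not updated here.

    fourier_mags: M measured Fourier magnitudes |F{O(r) U(r - r_m)}|
    probe:        estimate of the ultrasound focus distribution U
    shifts:       M probe positions, in pixels
    """
    rng = np.random.default_rng(2)
    obj = rng.random(probe.shape)                  # random initial guess
    for _ in range(n_iter):
        for mag, sh in zip(fourier_mags, shifts):
            U_m = np.roll(probe, sh, axis=(0, 1))
            psi = obj * U_m                        # exit wave at position m
            Psi = np.fft.fft2(psi)
            Psi = mag * np.exp(1j * np.angle(Psi))  # enforce measured magnitude
            psi_new = np.fft.ifft2(Psi)
            # ePIE object update, weighted by the conjugate probe
            obj = obj + alpha * np.conj(U_m) * (psi_new - psi) / (np.abs(U_m) ** 2).max()
            obj = np.maximum(np.real(obj), 0.0)    # non-negativity constraint
    return obj

# Toy usage: a Gaussian "ultrasound" probe scanned over a two-bar object.
n = 16
y, x = np.mgrid[:n, :n]
probe = np.exp(-((x - n // 2) ** 2 + (y - n // 2) ** 2) / (2 * 3.0 ** 2))
truth = np.zeros((n, n))
truth[5, 3:13] = 1.0
truth[10, 3:13] = 1.0
shifts = [(dy, dx) for dy in (-4, 0, 4) for dx in (-4, 0, 4)]
mags = [np.abs(np.fft.fft2(truth * np.roll(probe, s, axis=(0, 1)))) for s in shifts]
recon = ptycho_reconstruct(mags, probe, shifts)
```

On such noiseless synthetic data with heavily overlapping probe positions, the iteration should converge toward the two-bar pattern; real autocorrelation-derived magnitudes are noisier, which is exactly why the overlap constraint helps.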

While the object autocorrelation can be estimated from the autocorrelation of a single captured image obtained under incoherent illumination [26,35], or equivalently by incoherent averaging of several realizations of spatially coherent illumination, an alternative approach for improved estimation of the object incoherent intensity autocorrelation, $A{C_m}({\textbf{r}}) = |{O_m}({\textbf{r}}{)|^2} \star |{O_m}({\textbf{r}}{)|^2}$, from $N$ captured coherent speckle fields is via correlography [34,36,43]. In correlography, the incoherent intensity autocorrelation of the object at the m-th ultrasound focus position, $\hat A{C_m}({\textbf{r}})$, is calculated by averaging the Fourier transforms of the speckle intensity distribution captured at the scattering medium facet, after subtracting their mean value [36,43]:

$$\hat A{C_m}({\textbf{r}}) = {\left\langle {{{\left| {{\cal F}\{ {I_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}})\}} \right|}^2}} \right\rangle _n} - {\left| {{{\left\langle {{\cal F}\{ {I_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}})\}} \right\rangle }_n}} \right|^2}, \qquad (2)$$

where ${I_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}}) = |{E_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}}{)|^2}$ is the intensity of the n-th detected speckle field.

The estimated autocorrelation, $\hat A{C_m}({\textbf{r}})$, is the autocorrelation of the object, convolved with the autocorrelation of a single speckle grain, which is diffraction-limited [23,26,34].

Compared to incoherent averaging, correlography offers a zero-background estimate of the autocorrelation and a $\sqrt N$ improvement in statistical SNR [43]. The SNR improvement can be intuitively understood by considering that instead of computing a single autocorrelation, in correlography $N$ field autocorrelations are effectively computed and then averaged, after subtracting the coherent diffraction-limited autocorrelation peak (the negative term in Eq. (2)).
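A correlographic estimator of this form can be sketched in a few lines (NumPy; names are ours). It computes the per-pixel variance of the Fourier transforms of the $N$ intensity frames, i.e., the averaged squared magnitudes minus the squared magnitude of the average, which subtracts the coherent peak term:

```python
import numpy as np

def correlography_autocorrelation(intensity_stack):
    """Correlographic autocorrelation estimate from N speckle intensity
    frames I_{m,n} (shape (N, H, W)): average |F{I_n}|^2 over the
    realizations and subtract |<F{I_n}>_n|^2 (the coherent peak term)."""
    F = np.fft.fft2(intensity_stack, axes=(-2, -1))
    ac = np.mean(np.abs(F) ** 2, axis=0) - np.abs(np.mean(F, axis=0)) ** 2
    return np.fft.fftshift(ac)   # real and nonnegative: a per-pixel variance

# Toy usage on random speckle intensities.
rng = np.random.default_rng(3)
E = rng.normal(size=(16, 8, 8)) + 1j * rng.normal(size=(16, 8, 8))
ac = correlography_autocorrelation(np.abs(E) ** 2)
```

Because the estimate is a variance, it is nonnegative everywhere and carries no additive background, in contrast to the constant term $C$ of the incoherent-averaging estimate.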

Correlography was originally put forward for imaging speckle-illuminated objects from their far-field coherent diffraction patterns [43]. The use of correlography for imaging through scattering media is appropriate under two conditions [34]: The first is that the diffraction patterns measured at the output facet of the scattering medium are approximately the same diffraction patterns that would have been measured without the scattering medium present. Such a “shower-curtain effect” is exactly the case when the scattering medium is an (infinitely thin) random phase mask. Most importantly, this is approximately the case when the object is within the memory-effect range [34], since the memory-effect angular correlations are a result of the finite spatial extent of the scattering medium impulse response [23,27]. The second condition for applying correlography to imaging through scattering media is that the object distance from the medium facet, $L$, is sufficiently large so that the Fourier transform of its diffraction pattern provides a good estimate of its field autocorrelation [34,44]. Edrei and Scarcelli have shown that this condition is satisfied when $L \gt 2D{r_c}/\lambda$, where $D$ is the object size and ${r_c}$ is the illumination speckle grain size [34]. Considering ${r_c} \approx \lambda /2$, this condition becomes $L \gt D$; i.e., the imaging depth should be larger than the object size, which in our case is the ultrasound focus diameter, a condition naturally fulfilled in acousto-optic imaging.
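The depth condition can be checked numerically; the values below are illustrative, taken from this paper's setup (532 nm illumination, a ~145 µm ultrasound focus):

```python
# Correlography depth condition (Edrei & Scarcelli): L > 2 * D * r_c / wavelength.
wavelength = 532e-9          # m, illumination wavelength used in this work
r_c = wavelength / 2         # m, diffraction-limited speckle grain size, r_c ~ lambda/2
D = 145e-6                   # m, "object" size = ultrasound focus FWHM
L_min = 2 * D * r_c / wavelength   # reduces to L_min = D for r_c = lambda/2
```

Here `L_min` equals the 145 µm focus diameter, so any millimeter-scale imaging depth satisfies the condition with large margin.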

A numerically simulated example of ptychographic reconstruction of an extended object using scanned acousto-optic modulation is presented in Fig. 1(f). The details of the simulations can be found in Supplement 1, Section 10. The result of the ptychographic AOI is compared to a conventional AOI image of the same object using the same measurements [Fig. 1(g)]. A resolution increase of ${\gt}\times\! 25$ over conventional AOI is apparent in the high-fidelity reconstruction [Fig. 1(f)]. A detailed explanation of the data processing for correlography and the ptychographic reconstruction algorithm can be found in Supplement 1, Sections 4 and 7.

## 3. RESULTS

#### A. Experimental Setup

To demonstrate our approach, we constructed a proof-of-principle setup schematically shown in Fig. 1(a). It is a conventional AOI setup with holographic detection based on phase-shifting off-axis holography (see Supplement 1, Section 1), with the addition of a controlled rotating diffuser before the sample (two 1° light-shaping diffusers, Newport Corp., Irvine, CA, USA), used to generate dynamic random speckle illumination. While the holographic detection is used as a phase-sensitive method to measure the weak frequency-shifted, ultrasound-tagged field, the ptychographic reconstruction uses only the intensity distribution of the ultrasound-tagged field at the sample facet, which can be detected by alternative means [8]. The illumination is provided by a pulsed long-coherence laser at a wavelength of 532 nm (Standa Ltd., Vilnius, Lithuania). An ultrasound transducer with a central frequency of ${f_{\rm{US}}} = 25 \;{\rm MHz}$ and ultrasound focus dimensions of ${D_X} = 149\,\,\unicode{x00B5}{\rm m}$ and ${D_Y} = 140\,\,\unicode{x00B5}{\rm m}$ (FWHM in the horizontal (transverse) and vertical (ultrasound axial) directions, respectively) is used for acousto-optic modulation. The ultrasound focus position was scanned laterally by a motorized stage, and axially by electronically varying the time delay between the laser pulses and the ultrasound excitation pulses. The full description is given in Supplement 1, Section 1. As controlled scattering media and targets for our proof-of-principle experiments, we used a sample comprising a transmission-plate target placed in water between two scattering layers, each composed of several 5° scattering diffusers that have no ballistic component (see Supplement 1, Section 6). An sCMOS camera (Zyla 4.2 PLUS, Andor Technology, Belfast, UK) was used to holographically record the ultrasound-modulated scattered light fields using a frequency-shifted reference beam.
To minimize distortions in the recorded fields, no optical elements were present between the camera and the diffuser. The field at the diffuser plane, ${E_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}}/\lambda L)$, was calculated from the field recorded on the camera by digital back propagation (Supplement 1, Section 1).

#### B. Imaging an Extended Object Beyond the Memory Effect

As a first demonstration, we imaged a transmissive target composed of nine digits [Fig. 2(a)] that extends ${\sim}3.5$ times beyond the memory-effect range of the scattering sample, which is $\Delta {r_{\rm{mem}}} \sim\! 280\,\,\unicode{x00B5}{\rm m}$ [Fig. 2(a), dashed green circle; Supplement 1, Fig. S3].

For 2D imaging, the ultrasound focus [Fig. 2(b)] was scanned over the target with step sizes of $\Delta X = 44.7\,\,\unicode{x00B5}{\rm m}$ and $\Delta Y = 37.3\,\,\unicode{x00B5}{\rm m}$ in the horizontal and vertical directions, respectively. These steps (along with the ultrasound spot size) define a probe overlap of ${\sim}88\%$ between neighboring positions [Fig. 2(a)]. A study of the effect of the probe overlap on the reconstruction fidelity is presented in Supplement 1, Section 8. For each ultrasound focus position, $r_m^{\rm{US}}$ ($m = 1 \cdots 224$), we recorded $N = 150$ different ultrasound-modulated light fields, ${E_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}})$, each with a different (unknown) speckle realization, ${S_n}({\textbf{r}})$. The target was reconstructed from the $M = 224$ autocorrelations using the rPIE ptychographic algorithm [31] (see Supplement 1, Section 7, for details). Figure 2(c) presents an example of one of the autocorrelations used as input to the ptychographic reconstruction. The AOPI reconstruction [Fig. 2(d)] provides an image with a resolution well beyond that of a conventional AOI reconstruction [Fig. 2(e)], and also well beyond the improved resolution of recent super-resolution AOI techniques such as AO–SOFI [20] (Supplement 1, Section 11). Importantly, since the target extends beyond the memory-effect range, as is the case in many practical imaging scenarios, conventional speckle-correlation imaging without AO modulation [26,34] fails to reconstruct the object [Fig. 2(f)], as expected.

When only a small portion of the overlapping scanned positions is used [Fig. 3(a)], a high-fidelity reconstruction is still obtained over the smaller scanned area [Fig. 3(b)], confirming that the ultrasound focus serves as an effective spatial gating probe.

#### C. Imaging Resolution Verification

To demonstrate the resolution increase of AOPI, we performed an additional experiment where the target of Fig. 2 was replaced by elements 3–4 of group 6 of a negative USAF-1951 resolution test chart [Fig. 4(a)]. For 2D imaging, the ultrasound focus [Fig. 4(d)] was scanned over the target with a step size of $\Delta X = \Delta Y = 29\,\,\unicode{x00B5}{\rm m}$. These steps (along with the ultrasound spot size) define a probe overlap of ${\sim}93\%$ between neighboring positions [Fig. 4(b)]. For each ultrasound focus position, $r_m^{\rm{US}}$ ($m = 1 \cdots 72$), we recorded $N = 150$ different ultrasound-modulated light fields, ${E_{m,n}}({{\textbf{r}}_{{\textbf{cam}}}})$ [Fig. 4(c)], each with a different (unknown) speckle realization, ${S_n}({\textbf{r}})$. The reconstruction of the target from the $M = 72$ autocorrelations using the rPIE ptychographic algorithm [31] is presented in Figs. 4(g) and 4(h). A study of the effect of probe overlap on the reconstruction is presented in Supplement 1, Section 8. The AOPI reconstructed image resolves resolution-target features with a separation as small as 5.52 µm [Figs. 4(g) and 4(h)], which is ${\sim}\times\! 30$ smaller than the acoustic focus FWHM (${\sim}145\,\,\unicode{x00B5}{\rm m}$).

The cross sections of the reconstructed image [Fig. 4(h)] allow estimating the imaging resolution by fitting the result to a convolution of the known sample structure with a Gaussian PSF. This quantification yields an estimated resolution of 3.65 µm (FWHM), a 40-fold increase in resolution compared to the acoustic resolution of conventional AOI [Fig. 4(e)]. Interestingly, although the target dimensions in this experiment are contained within the memory-effect range, conventional speckle-correlation imaging based on phase retrieval without AO modulation [Fig. 4(f)] results in a considerably lower reconstruction fidelity than the AOPI reconstruction [Fig. 4(g)]. This improvement is attributed to the larger input data set and improved algorithmic stability of ptychographic reconstruction compared to single-shot phase retrieval [30,42].
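The Gaussian-PSF fitting step can be sketched as a simple grid search over trial FWHMs (NumPy only; a hypothetical 1D example of ours, not the paper's actual fitting code):

```python
import numpy as np

def fit_psf_fwhm(known_profile, measured, trial_fwhms, dx=1.0):
    """Estimate resolution by convolving the known sample cross section
    with Gaussian PSFs of trial FWHMs and keeping the least-squares fit."""
    x = (np.arange(known_profile.size) - known_profile.size // 2) * dx
    best_fwhm, best_err = None, np.inf
    for fwhm in trial_fwhms:
        sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM -> sigma
        psf = np.exp(-x ** 2 / (2.0 * sigma ** 2))
        psf /= psf.sum()
        model = np.convolve(known_profile, psf, mode="same")
        err = np.sum((model - measured) ** 2)
        if err < best_err:
            best_fwhm, best_err = fwhm, err
    return best_fwhm

# Toy check: blur a two-bar pattern with a FWHM-4 Gaussian and recover it.
bars = np.zeros(64)
bars[20:26] = 1.0
bars[38:44] = 1.0
sigma = 4.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
x = (np.arange(64) - 32) * 1.0
psf = np.exp(-x ** 2 / (2.0 * sigma ** 2))
psf /= psf.sum()
blurred = np.convolve(bars, psf, mode="same")
fwhm = fit_psf_fwhm(bars, blurred, trial_fwhms=[2.0, 4.0, 6.0, 8.0])
```

In practice, the grid search would be replaced by a continuous least-squares fit, but the principle of comparing blurred models of the known USAF structure against the measured cross section is the same.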

## 4. DISCUSSION AND CONCLUSION

We proposed and demonstrated an approach to wide-FoV diffraction-limited optical imaging through highly scattering media, using speckle correlations and acousto-optic tagging. In contrast to previous approaches for super-resolved acousto-optic imaging [15–18], the resolution of our approach is optically diffraction-limited and is independent of the ultrasound probe dimensions. In addition, it requires a much smaller number of measurements and does not require a large speckle grain size [18]. This allowed us to demonstrate a $\times 40$ improvement in resolution over the acoustic diffraction limit, an order-of-magnitude larger resolution gain than state-of-the-art approaches such as iTRUE [15,16], TROVE [17], and AOTM [18]. Another important advantage of our approach is that, unlike transmission-matrix and wavefront-shaping-based approaches [17,18], it is not limited by short speckle decorrelation times. Similar to recent approaches that use random fluctuations [20,36], our approach benefits from the natural speckle decorrelation to generate independent speckle realizations, improving the estimation of the object autocorrelation [34].

AOPI relies on the existence of a sufficiently large memory-effect correlation range. However, unlike all other noninvasive memory-effect-based techniques [25,26], the imaged FoV in AOPI is not limited by the memory-effect range. The FoV is dictated by the scanning range of the ultrasound focus, which is practically limited only by the total acquisition time. Such an extension of the FoV in speckle-correlation imaging has until today only been obtained by invasive access to the target object [35,45–47]. While the spatial coherence of the illumination is a requirement for the holographic detection of the ultrasound-modulated light, it allowed us to use correlography to improve the speckle autocorrelation estimation over the spatial-incoherent method [26,35]. Finally, the adaptation of a ptychographic image reconstruction significantly improves the reconstruction fidelity and stability compared to single-shot phase-retrieval reconstruction [Figs. 4(f)–4(g)].

Our super-resolution AOPI approach does not rely on wavefront-shaping [15–18] or nonlinear effects [48], and it can be applied to any AOI system employing camera-based coherent detection.

The main limitation of our approach is the requirement for a memory-effect range on the order of the ultrasound probe dimensions. This condition can be satisfied by relying on a small ultrasound focus, achieved by the use of high-frequency ultrasound (see Supplement 1, Section 9), and by the use of a long laser wavelength, which increases the memory-effect range [23,24]. We note that while at very large imaging depths, deep within the diffusive light propagation regime, the memory-effect angular range is ${\theta _{\rm{mem}}} \approx \frac{\lambda}{L}$ [23,24], at millimeter-scale depths, which are of the order of the transport mean free path (TMFP), the memory-effect range has been shown to be orders of magnitude larger [28]. In addition, the requirement for a sufficiently large memory effect may be alleviated by relying on translation correlations [27] or the generalized memory effect [29]. Applying the above improvements is the next step toward bringing our proof-of-principle demonstrations to practical biomedical imaging applications, which remains a standing challenge.

Another technical improvement necessary for biomedical applications of our approach is the acquisition speed. As with all super-resolution techniques that do not rely on object priors, our approach requires a large number of measurements to reconstruct a single image. The required number of frames is the product of (number of probe positions) × (number of realizations per probe position) × (number of phase-shifting frames). In our proof-of-principle system, which was not optimized for acquisition speed, we used 150 realizations per probe position and 16 phase-shifting frames. A diffuser mounted on a slowly rotating motor was used to change the realizations, and a conventional sCMOS camera was used to capture the images, resulting in an acquisition time of ${\sim}1.6 \;{\rm{s}}$ per realization. This time can be reduced by orders of magnitude using single-shot, off-axis, fast-camera-based detection [18,21,49,50], a fast MEMS-based dynamic wavefront randomizer [51], a laser with higher average power, and a 2D electronically scanned ultrasound array instead of a mechanically scanned single-element transducer [52]. Assuming the illumination intensity is limited by laser safety limits, the time required to acquire each speckle realization with ${\rm SNR} = 1$ at 1 mm target depth is approximately 1.5 ms (see Supplement 1, Section 9). The acquisition time per realization is reduced by two orders of magnitude if the SNR requirement is set on the correlographic autocorrelation estimate (which used 150 realizations in our experiment) rather than on each single realization (Supplement 1, Section 9). Thus, assuming a camera frame rate of ${\sim}2000$ frames per second [18], the acquisition of 100 realizations for 48 probe positions (as in Fig. 3) would take on the order of ${\sim}2.4{\rm{\;s}}$, excluding off-line data processing.
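The frame-budget arithmetic can be sketched in a few lines; the fast-system numbers below are the projections quoted above (single-shot off-axis holography, i.e., one frame per realization, at a ${\sim}2000$ fps frame rate):

```python
def acquisition_time(n_probes, n_realizations, n_phase_frames, frame_rate_hz):
    """Total frames = probe positions x realizations x phase-shifting frames."""
    n_frames = n_probes * n_realizations * n_phase_frames
    return n_frames / frame_rate_hz

# Projected fast system: 48 probe positions, 100 realizations each,
# 1 frame per realization (single-shot off-axis detection), 2000 fps.
t_fast = acquisition_time(48, 100, 1, 2000.0)  # 2.4 s
```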
The number of required realizations may be reduced significantly by advanced correlation-based reconstruction schemes such as those provided by deep neural networks (DNNs), which have recently been shown to significantly improve the estimation of the intensity autocorrelation from only a few coherent realizations [41].

The combination of the state-of-the-art optical, ultrasound, and computational imaging approaches has the potential to significantly impact imaging deep inside complex samples.

## Funding

H2020 European Research Council (677909); Azrieli Foundation; Israel Science Foundation (1361/18); National Science Foundation (1813848); Israeli Ministry of Science and Technology.

## Acknowledgment

The authors thank Prof. Hagai Eisenberg for the Q-switched laser and thank the Nanocenter at the Hebrew University, with special thanks to Dr. Itzik Shweky and Galina Chechelinsky, for the fabrication of the target samples.

## Disclosures

The authors declare no conflicts of interest.

## Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

## Supplemental document

See Supplement 1 for supporting content.

## REFERENCES

**1. **V. Ntziachristos, “Going deeper than microscopy: the optical imaging frontier in biology,” Nat. Methods **7**, 603–614 (2010).

**2. **J. Pawley, *Handbook of Biological Confocal Microscopy* (Springer, 2006), Vol. 236.

**3. **R. H. Webb, “Confocal optical microscopy,” Rep. Prog. Phys. **59**, 427 (1996).

**4. **D. Huang, E. A. Swanson, C. P. Lin, J. S. Schuman, W. G. Stinson, W. Chang, M. R. Hee, T. Flotte, K. Gregory, C. A. Puliafito, and J. G. Fujimoto, “Optical coherence tomography,” Science **254**, 1178–1181 (1991).

**5. **A. F. Fercher, W. Drexler, C. K. Hitzenberger, and T. Lasser, “Optical coherence tomography-principles and applications,” Rep. Prog. Phys. **66**, 239 (2003).

**6. **W. Drexler and J. G. Fujimoto, *Optical Coherence Tomography: Technology and Applications* (Springer, 2008).

**7. **M. Jang, H. Ko, J. H. Hong, W. K. Lee, J.-S. Lee, and W. Choi, “Deep tissue space-gated microscopy via acousto-optic interaction,” Nat. Commun. **11**, 710 (2020).

**8. **D. S. Elson, R. Li, C. Dunsby, R. Eckersley, and M.-X. Tang, “Ultrasound-mediated optical tomography: a review of current methods,” Interface Focus **1**, 632–648 (2011).

**9. **L. V. Wang, “Ultrasound-mediated biophotonic imaging: a review of acousto-optical tomography and photo-acoustic tomography,” Dis. Markers **19**, 123–138 (2004).

**10. **S. G. Resink, W. Steenbergen, and A. C. Boccara, “State-of-the art of acousto-optic sensing and imaging of turbid media,” J. Biomed. Opt. **17**, 040901 (2012).

**11. **L. V. Wang and S. Hu, “Photoacoustic tomography: in vivo imaging from organelles to organs,” Science **335**, 1458–1462 (2012).

**12. **Y. Liu, P. Lai, C. Ma, X. Xu, A. A. Grabar, and L. V. Wang, “Optical focusing deep inside dynamic scattering media with near-infrared time-reversed ultrasonically encoded (true) light,” Nat. Commun. **6**, 5904 (2015).

**13. **Y. M. Wang, B. Judkewitz, C. A. DiMarzio, and C. Yang, “Deep-tissue focal fluorescence imaging with digitally time-reversed ultrasound-encoded light,” Nat. Commun. **3**, 928 (2012).

**14. **K. Si, R. Fiolka, and M. Cui, “Fluorescence imaging beyond the ballistic regime by ultrasound-pulse-guided digital phase conjugation,” Nat. Photonics **6**, 657–661 (2012).

**15. **K. Si, R. Fiolka, and M. Cui, “Breaking the spatial resolution barrier via iterative sound-light interaction in deep tissue microscopy,” Sci. Rep. **2**, 748 (2012).

**16. **H. Ruan, M. Jang, B. Judkewitz, and C. Yang, “Iterative time-reversed ultrasonically encoded light focusing in backscattering mode,” Sci. Rep. **4**, 7156 (2014).

**17. **B. Judkewitz, Y. M. Wang, R. Horstmeyer, A. Mathy, and C. Yang, “Speckle-scale focusing in the diffusive regime with time reversal of variance-encoded light (TROVE),” Nat. Photonics **7**, 300–305 (2013).

**18. **O. Katz, F. Ramaz, S. Gigan, and M. Fink, “Controlling light in complex media beyond the acoustic diffraction-limit using the acousto-optic transmission matrix,” Nat. Commun. **10**, 1–10 (2019).

**19. **M. Cui and C. Yang, “Implementation of a digital optical phase conjugation system and its application to study the robustness of turbidity suppression by phase conjugation,” Opt. Express **18**, 3444–3455 (2010).

**20. **D. Doktofsky, M. Rosenfeld, and O. Katz, “Acousto optic imaging beyond the acoustic diffraction limit using speckle decorrelation,” Commun. Phys. **3**, 5 (2020).

**21. **H. Ruan, Y. Liu, J. Xu, Y. Huang, and C. Yang, “Fluorescence imaging through dynamic scattering media with speckle-encoded ultrasound-modulated light correlation,” Nat. Photonics **14**, 511–516 (2020).

**22. **S. Feng, C. Kane, P. A. Lee, and A. D. Stone, “Correlations and fluctuations of coherent wave transmission through disordered media,” Phys. Rev. Lett. **61**, 834 (1988).

**23. **I. Freund, “Looking through walls and around corners,” Physica A **168**, 49–65 (1990).

**24. **I. Freund, M. Rosenbluh, and S. Feng, “Memory effects in propagation of optical waves through disordered media,” Phys. Rev. Lett. **61**, 2328 (1988).

**25. **J. Bertolotti, E. G. Van Putten, C. Blum, A. Lagendijk, W. L. Vos, and A. P. Mosk, “Non-invasive imaging through opaque scattering layers,” Nature **491**, 232–234 (2012).

**26. **O. Katz, P. Heidmann, M. Fink, and S. Gigan, “Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations,” Nat. Photonics **8**, 784–790 (2014).

**27. **B. Judkewitz, R. Horstmeyer, I. M. Vellekoop, I. N. Papadopoulos, and C. Yang, “Translation correlations in anisotropically scattering media,” Nat. Phys. **11**, 684–689 (2015).

**28. **S. Schott, J. Bertolotti, J.-F. Léger, L. Bourdieu, and S. Gigan, “Characterization of the angular memory effect of scattered light in biological tissues,” Opt. Express **23**, 13505–13516 (2015).

**29. **G. Osnabrugge, R. Horstmeyer, I. N. Papadopoulos, B. Judkewitz, and I. M. Vellekoop, “Generalized optical memory effect,” Optica **4**, 886–892 (2017).

**30. **A. M. Maiden and J. M. Rodenburg, “An improved ptychographical phase retrieval algorithm for diffractive imaging,” Ultramicroscopy **109**, 1256–1262 (2009).

**31. **A. Maiden, D. Johnson, and P. Li, “Further improvements to the ptychographical iterative engine,” Optica **4**, 736–745 (2017).

**32. **M. Pham, A. Rana, J. Miao, and S. Osher, “Semi-implicit relaxed Douglas-Rachford algorithm (SDR) for ptychography,” Opt. Express **27**, 31246–31260 (2019).

**33. **M. Gross, P. Goy, B. Forget, M. Atlan, F. Ramaz, A. Boccara, and A. Dunn, “Heterodyne detection of multiply scattered monochromatic light with a multipixel detector,” Opt. Lett. **30**, 1357–1359 (2005).

**34. **E. Edrei and G. Scarcelli, “Optical imaging through dynamic turbid media using the Fourier-domain shower-curtain effect,” Optica **3**, 71–74 (2016).

**35. **D. F. Gardner, S. Divitt, and A. T. Watnik, “Ptychographic imaging of incoherently illuminated extended objects using speckle correlations,” Appl. Opt. **58**, 3564–3569 (2019).

**36. **Z. Li, D. Wen, Z. Song, T. Jiang, W. Zhang, G. Liu, and X. Wei, “Imaging correlography using ptychography,” Appl. Sci. **9**, 4377 (2019).

**37. **H. Yilmaz, E. G. van Putten, J. Bertolotti, A. Lagendijk, W. L. Vos, and A. P. Mosk, “Speckle correlation resolution enhancement of wide-field fluorescence imaging,” Optica **2**, 424–429 (2015).

**38. **I. Peterson, R. Harder, and I. Robinson, “Probe-diverse ptychography,” Ultramicroscopy **171**, 77–81 (2016).

**39. **J. R. Fienup, “Reconstruction of an object from the modulus of its Fourier transform,” Opt. Lett. **3**, 27–29 (1978).

**40. **B. Abbey, K. A. Nugent, G. J. Williams, J. N. Clark, A. G. Peele, M. A. Pfeifer, M. De Jonge, and I. McNulty, “Keyhole coherent diffractive imaging,” Nat. Phys. **4**, 394–398 (2008).

**41. **C. A. Metzler, F. Heide, P. Rangarajan, M. M. Balaji, A. Viswanath, A. Veeraraghavan, and R. G. Baraniuk, “Deep-inverse correlography: towards real-time high-resolution non-line-of-sight imaging,” Optica **7**, 63–71 (2020).

**42. **J. M. Rodenburg and H. M. Faulkner, “A phase retrieval algorithm for shifting illumination,” Appl. Phys. Lett. **85**, 4795–4797 (2004).

**43. **P. S. Idell, J. R. Fienup, and R. S. Goodman, “Image synthesis from nonimaged laser-speckle patterns,” Opt. Lett. **12**, 858–860 (1987).

**44. **G. Stern and O. Katz, “Noninvasive focusing through scattering layers using speckle correlations,” Opt. Lett. **44**, 143–146 (2019).

**45. **G. Li, W. Yang, H. Wang, and G. Situ, “Image transmission through scattering media using ptychographic iterative engine,” Appl. Sci. **9**, 849 (2019).

**46. **M. Zhou, R. Li, T. Peng, A. Pan, J. Min, C. Bai, D. Dan, and B. Yao, “Retrieval of non-sparse objects through scattering media beyond the memory effect,” J. Opt. **22**, 085606 (2020).

**47. **S. Divitt, D. F. Gardner, and A. T. Watnik, “Imaging around corners in the mid-infrared using speckle correlations,” Opt. Express **28**, 11051–11064 (2020).

**48. **J. Selb, L. Pottier, and A. C. Boccara, “Nonlinear effects in acousto-optic imaging,” Opt. Lett. **27**, 918–920 (2002).

**49. **Y. Liu, C. Ma, Y. Shen, J. Shi, and L. V. Wang, “Focusing light inside dynamic scattering media with millisecond digital optical phase conjugation,” Optica **4**, 280–288 (2017).

**50. **Y. Liu, Y. Shen, C. Ma, J. Shi, and L. V. Wang, “Lock-in camera based heterodyne holography for ultrasound-modulated optical tomography inside dynamic scattering media,” Appl. Phys. Lett. **108**, 231106 (2016).

**51. **F. Shevlin, “Phase randomization for spatiotemporal averaging of unwanted interference effects arising from coherence,” Appl. Opt. **57**, E6–E10 (2018).

**52. **J.-B. Laudereau, A. A. Grabar, M. Tanter, J.-L. Gennisson, and F. Ramaz, “Ultrafast acousto-optic imaging with ultrasonic plane waves,” Opt. Express **24**, 3774–3789 (2016).