## Abstract

Occlusion is one of the most important issues in light-field depth estimation. In this paper, we propose a light-field multi-occlusion model based on an analysis of light transmission. With this model, occlusions in different views are treated separately. An adaptive anti-occlusion algorithm for the central view is proposed to obtain more precise consistency regions (unoccluded views) in the angular domain, and a subpatch anti-occlusion approach for the other views is presented to optimize the initial depth maps, so that depth boundaries are better preserved. We then propose a curvature-based confidence analysis that makes depth evaluation more accurate, and incorporate it into an energy model to regularize the depth maps. Experimental results demonstrate that the proposed algorithm achieves better subjective and objective quality in depth maps than state-of-the-art algorithms.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

With 3D and VR (virtual reality) technology growing quickly, depth estimation has become
one of the most popular research topics in recent years. Because
light-field cameras provide rich information captured from multiple
viewpoints in a single shot [1–4], various
approaches [5–9] (e.g. epipolar
plane image, focal stack, and angular patch) have been proposed for
light-field depth estimation. Wanner *et al.*
[10] utilized the
structure tensor to obtain dominant directions on epipolar plane images.
Tao *et al.* [11, 12] first
combined the defocus cue and the correspondence cue, and later added the
shading cue to make the depth map more precise. Jeon *et al.*
[13] made use of
the phase shift theorem in the Fourier domain to estimate the sub-pixel
shifts of sub-aperture images. These algorithms fail around occlusions because
angular pixels may be observed from different objects, so the
photo-consistency assumption no longer holds.

Recently, Wang *et al.* [14, 15] explicitly modeled occlusions by using edge orientation to separate the angular patch into two equal regions with a straight line. The method improved depth results for single-occluder situations, but a straight line is not enough to deal with multi-occlusion situations. Zhu *et al.* [16] used k-means to separate the patch for selecting unoccluded views, which improves performance in single-occlusion situations. However, the algorithm depends on the clustering quality of the existing k-means strategy, and it cannot handle more complex clustering situations. Williem *et al.* [17,18] proposed a novel angular entropy and an adaptive defocus response to estimate depth, but some regions are still over-smoothed because angular entropy is too random for complex details. There are also other algorithms handling occlusions [19–21], and each method has its own advantages and disadvantages.

In this paper, we propose a novel algorithm to solve the multi-occlusion problem. Our depth estimation algorithm distinguishes occlusion in different views. For occlusion in the central view, an adaptive connected-domain selection algorithm is proposed to accurately separate the spatial patch. Through the light-field multi-occlusion model relating the spatial domain and the angular domain, the divided spatial patch can be mapped to the angular patch and the consistency regions (unoccluded views) can be obtained more precisely. The first-moment and second-moment cues are applied on the consistency region to estimate depth. For occlusion in other views, we present a subpatch-based estimation algorithm, which obtains a more accurate initial depth than methods that do not handle this case. In order to evaluate the initial depth more precisely, we present a new confidence analysis, which considers the curvature factor for the first time. Finally, a Markov Random Field (MRF) energy model is built to regularize the depth maps. As shown in Fig. 1, the proposed algorithm provides much sharper boundaries and more precise details. Experimental results demonstrate that, compared with state-of-the-art light-field algorithms, our method obtains more accurate depth. The main contributions of this paper are as follows:

- We analyze the light transmission model and derive the corresponding relationship on different occlusion situations. Then we extend it to multi-occlusion so that it can handle more complex situations.
- We take different occlusions into consideration and propose corresponding algorithms to obtain unoccluded views precisely. A novel confidence analysis is designed in energy model to regularize depth maps better.

## 2. Light-field multi-occlusion model

In this section, we first analyze different occlusion cases and then introduce the proposed light-field multi-occlusion model. There is a consensus that edge pixels and edges can be treated as candidate occlusion pixels and candidate occlusion boundaries respectively, and many algorithms start from this observation. However, some points near edges are not occluded in the central view but are occluded in some other views, as Fig. 2 shows. So, different from previous work, we analyze different multi-occlusion situations from the perspective of light transmission. The analysis extends to occluders of any shape and handles different occlusion situations.

#### One occluder analysis:

For analyzing multi-occlusion better, we first take one occluder as an example, as shown in Fig. 3. A point (*X*_{0}, *Y*_{0}, *F*) is located on the focal plane at depth *F*, and an occluder at the *Z*_{1} plane has a straight edge containing the point (*X*_{1}, *Y*_{1}, *Z*_{1}). (*X*_{0}, *Y*_{0}, *F*) is projected to the *Z*_{1} plane at (*X*_{0}, *Y*_{0}, *Z*_{1}), and the occluder point (*X*_{1}, *Y*_{1}, *Z*_{1}) is projected to the *F* plane at (*x*_{1}, *y*_{1}, *F*). Assume that the line connecting the two points is perpendicular to the edge, so its length is the distance between the center point and the edge.

In Fig. 3(a), for the central view (*u*_{0}, *v*_{0}), the normal vector of the edge at the *Z*_{1} plane can be expressed as (*X*_{1} − *X*_{0}, *Y*_{1} − *Y*_{0}), and the actual distance *D*_{gt} is the modulus of this normal vector. When the light at the *Z*_{1} plane is projected to the *F* plane, these relationships can be obtained by the projection principle. Similarly, by the reversed light model the corresponding relationships can be obtained.

Then, considering the distance, we first assume that (*X*_{0}, *Y*_{0}) is the edge pixel in the central view, so *D*_{gt}, *D*_{spatial} and *D*_{angular} are zero. From the analysis above, it can be seen that the vector (*x*_{0} − *x*_{1}, *y*_{0} − *y*_{1}) of the edge in the spatial domain is consistent with the vector (*x*_{0} − *x*_{1}, *y*_{0} − *y*_{1}) of the boundary between occluded views and unoccluded views in the angular domain, as shown in Fig. 4(a). We call this property boundary-consistency. Secondly, we assume that (*X*_{0}, *Y*_{0}) is near the edge pixel in the central view. In this case *D*_{gt} is non-zero, so *D*_{spatial} differs from *D*_{angular} because of the distance, as shown in the right of Fig. 4(b).

#### Multi-occluders analysis:

Then we extend the model to the multi-occlusion case, as shown in Fig. 5. Taking the case of two occluders, a point (*X*_{0}, *Y*_{0}, *F*) is located on the focal plane at depth *F*. An occluder at the *Z*_{1} plane has a straight edge containing the point (*X*_{1}, *Y*_{1}, *Z*_{1}), and another occluder at the *Z*_{2} plane has an oblique edge containing the point (*X*_{2}, *Y*_{2}, *Z*_{2}). Let the point (*X*_{0}, *Y*_{0}) be the origin; any point (*X*, *Y*) then forms a vector (*X* − *X*_{0}, *Y* − *Y*_{0}) with the origin.

And for the central view (*u*_{0}, *v*_{0}), the two edges can be expressed by

As shown in Fig. 5(a), if the point is not occluded in the spatial domain, its vector must lie between the vectors of the two edges. Assuming that (*X*_{1}, *Y*_{1}, *Z*_{1}) is on the upper boundary and (*X*_{2}, *Y*_{2}, *Z*_{2}) is on the lower boundary, the vector of an unoccluded point is $\overrightarrow{{e}_{\mathit{up}}}=\left({x}_{\mathit{up}}-{x}_{0},{y}_{\mathit{up}}-{y}_{0}\right)$, and it should satisfy the following constraints.
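The sector constraint above can be sketched with a 2D cross-product test. This is only an illustration of the geometric idea: the function name and sign convention are assumptions, not the paper's notation for Eqs. (4)–(5).

```python
def in_unoccluded_sector(e, e_up, e_low):
    """Return True when vector e lies inside the unoccluded sector
    bounded by the upper-edge vector e_up and the lower-edge vector
    e_low (a sketch of the sector constraints; the exact inequalities
    in the paper may be written differently).

    A 2D cross product tells on which side of a boundary vector e lies;
    e is inside the sector when both cross products agree in sign with
    the cross product of the two boundary vectors themselves."""
    cross = lambda a, b: a[0] * b[1] - a[1] * b[0]
    c = cross(e_up, e_low)
    return cross(e_up, e) * c >= 0 and cross(e, e_low) * c >= 0

# A view (or spatial point) whose vector points into the sector is unoccluded.
print(in_unoccluded_sector((1.0, 0.0), (1.0, 1.0), (1.0, -1.0)))   # True: inside
print(in_unoccluded_sector((-1.0, 0.0), (1.0, 1.0), (1.0, -1.0)))  # False: outside
```

The same test applies unchanged in the angular domain, with the view vector $\overrightarrow{{e}_{uv}}$ substituted for the spatial vector.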

Then, considering Fig. 5(b) in the angular domain, the main lens is refocused on depth *F*. Similarly, an unoccluded view (*u*_{uo}, *v*_{uo}) can form the vector $\overrightarrow{{e}_{uv}}=({u}_{uo}-{u}_{0},{v}_{uo}-{v}_{0})$; if the point (*x*_{0}, *y*_{0}, *F*) can be observed by the view, this vector also obeys the rule.

From the analysis above, when *D*1_{gt} and *D*2_{gt} are zero, Eq. (4) and Eq. (5) have the same constraints with one-to-one correspondence, so (*u*_{0}, *v*_{0}) and (*x*_{0}, *y*_{0}) share the same separation line. Thus, for occlusion in the central view, boundary-consistency holds for any occlusion, which helps in obtaining the consistency region (unoccluded views) in the angular domain. When *D*1_{gt} and *D*2_{gt} are non-zero, the situation becomes very complex: occluders at varying depths have different offsets from the central point. Unfortunately, since depth estimates are not yet available, the offset distance cannot be obtained. So, for occlusion in other views, an alternative approach is designed in our algorithm to obtain the consistency region approximately.

## 3. Initial depth estimation

The angular patch of each unoccluded pixel exhibits photo-consistency in all regions, while the angular patch of each occluded pixel exhibits photo-consistency in only part of its regions. The key issue is selecting the consistency region (unoccluded views vs. occluded views) in the angular patch for each occluded pixel. In this section, we present how to select the consistency region in the angular domain, and introduce how to obtain an initial depth map from the consistency region based on the different properties of the central view and the other views.

#### 3.1. Anti-occlusion in the central view

##### 3.1.1. Adaptive connected-domain selection

Edge detection is applied to the central (pinhole) view image to
obtain an edge map. The edge pixels and edges are considered
candidate occlusion pixels and candidate occlusion boundaries,
respectively. For each edge pixel we extract an edge patch centered
on pixel *p*, with the same size as the angular
resolution of the light field. A four-connected components labeling
algorithm [22,23] is applied to the edge patch to label the spatial
patch, and pixels with the same label compose a region. The patch is
thus divided into several regions according to the labels.

In addition, the connected components labeling algorithm cannot label the pixels on the edge line itself, so in order to label them we design a method that fuses the color distance in Eq. (6) and the space distance in Eq. (7), which are shown as follows.

where *I*_{e} and *I*_{i} denote respectively the intensity of pixels on the edge line and of pixels labeled *i*; *N*_{i} is the number of pixels labeled *i*; *x*_{e} and *x*_{ci} are respectively the x-coordinate of pixels on the edge line and of the center pixel of the region labeled *i*; *y*_{e} and *y*_{ci} have the analogous meaning for the y-coordinate.

According to the boundary-consistency described in Section 2,
angular patches have the same labels as edge patches. The region
including the center pixel *p* is selected as the
consistency region Ω* _{p}*.
Compared with state-of-the-art algorithms, our method divides
the patch into several regions adaptively and correctly under
multi-occlusion. As an example, Fig. 6 shows the processing result
for a multi-occlusion point; the consistency region selected by
our method is more accurate than those of other algorithms.
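The selection procedure of this subsection can be sketched as follows. The labeling is a plain BFS four-connected components pass, and the edge-line pixels are then assigned by the fused color/space distance; the weights `w_color` and `w_space` are illustrative assumptions, since Eqs. (6)–(7) fix their own balance.

```python
import numpy as np
from collections import deque

def label_4connected(edge_patch):
    """4-connected components labeling of non-edge pixels; edge-line
    pixels keep label 0 and are assigned later."""
    h, w = edge_patch.shape
    labels = np.zeros((h, w), dtype=int)
    cur = 0
    for sy in range(h):
        for sx in range(w):
            if edge_patch[sy, sx] or labels[sy, sx]:
                continue
            cur += 1
            labels[sy, sx] = cur
            q = deque([(sy, sx)])
            while q:
                y, x = q.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not edge_patch[ny, nx] and not labels[ny, nx]):
                        labels[ny, nx] = cur
                        q.append((ny, nx))
    return labels

def assign_edge_pixels(labels, intensity, w_color=1.0, w_space=1.0):
    """Assign each edge-line pixel to the region minimizing a fused cost:
    color distance to the region's mean intensity (cf. Eq. (6)) plus
    space distance to the region's center (cf. Eq. (7))."""
    out = labels.copy()
    regions = {}
    for i in range(1, labels.max() + 1):
        ys, xs = np.nonzero(labels == i)
        regions[i] = (intensity[ys, xs].mean(), ys.mean(), xs.mean())
    for y, x in zip(*np.nonzero(labels == 0)):
        costs = {i: w_color * abs(intensity[y, x] - m)
                    + w_space * np.hypot(y - cy, x - cx)
                 for i, (m, cy, cx) in regions.items()}
        out[y, x] = min(costs, key=costs.get)
    return out

# Toy patch: a vertical edge line at column 2; the consistency region
# is the divided region containing the central pixel p.
edge = np.zeros((5, 5), dtype=bool)
edge[:, 2] = True
inten = np.where(np.arange(5)[None, :] < 2, 0.1, 0.9) * np.ones((5, 5))
lab = assign_edge_pixels(label_4connected(edge), inten)
omega_p = lab == lab[2, 2]   # region of the center pixel p
```

By boundary-consistency, `omega_p` maps directly onto the angular patch, where it marks the unoccluded views.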

##### 3.1.2. Depth estimation

We refocus the light-field data to various depths; the 4D light-field shearing is performed as follows [2].

where *L* is the input 4D light-field data, *α* is the refocused depth, and *L*_{α} is the refocused light-field image at depth *α*; (*x*, *y*) is the spatial coordinate, (*u*, *v*) is the angular coordinate, and the central image of the light-field data is defined as *L*(*x*, *y*, 0, 0).
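A minimal sketch of the 4D shearing, assuming the common refocusing relation *L*_{α}(*x*, *y*, *u*, *v*) = *L*(*x* + *u*(1 − 1/*α*), *y* + *v*(1 − 1/*α*), *u*, *v*) [2]; nearest-pixel shifts stand in for the sub-pixel interpolation a real implementation would use, and the sign convention is an assumption.

```python
import numpy as np

def refocus_shear(L, alpha):
    """Shear the 4D light field L[u, v, y, x] to refocus at depth alpha.

    Each sub-aperture image is shifted in (y, x) proportionally to its
    angular offset from the central view. Nearest-pixel np.roll shifts
    are used here for brevity; sub-pixel (e.g. phase-shift) interpolation
    would replace them in practice."""
    U, V, H, W = L.shape
    u0, v0 = U // 2, V // 2            # central view index
    s = 1.0 - 1.0 / alpha              # shear factor
    La = np.empty_like(L)
    for u in range(U):
        for v in range(V):
            dy = int(round((v - v0) * s))
            dx = int(round((u - u0) * s))
            La[u, v] = np.roll(np.roll(L[u, v], -dy, axis=0), -dx, axis=1)
    return La
```

At *α* = 1 the shear factor is zero and the light field is returned unchanged; for other depths, the angular patch of a pixel is read out of `La` at a fixed spatial location across all (*u*, *v*).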

In order to analyze regional pixel consistency, we calculate the first moment and the second moment of the region Ω* _{p}*. For pixel *p*, Ω* _{p}* is selected by the adaptive connected-domain selection algorithm if *p* is an edge point; otherwise Ω* _{p}* is the whole patch. Specific calculation methods are as follows. The first moment of error is performed by

where *L*_{α} is the refocused light-field image at depth *α*, (*u*_{i}, *v*_{i}) indexes the consistency region of pixel *p*, and *N*_{i} is the number of pixels in the consistency region. Then the second moment at depth *α* is performed by

The initial depth *α* is determined as
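The moment-based cost can be sketched as follows. The per-pixel error (absolute difference from the central-view intensity) and the weight `w` combining the two moments are illustrative assumptions, since the exact forms of the paper's equations are not reproduced here.

```python
import numpy as np

def moment_costs(patch, omega, center_val):
    """First- and second-moment consistency costs over the consistency
    region omega of an angular patch (a sketch; the paper's equations
    may weight or normalize these terms differently)."""
    vals = patch[omega]
    m1 = np.abs(vals - center_val).mean()   # first moment of error
    m2 = vals.var()                         # second moment (variance)
    return m1, m2

def initial_depth(patches_by_alpha, omega, center_val, w=0.5):
    """Pick the depth index minimizing a weighted sum of the two moments;
    the weight w is an illustrative assumption."""
    costs = [w * m1 + (1 - w) * m2
             for m1, m2 in (moment_costs(p, omega, center_val)
                            for p in patches_by_alpha)]
    return int(np.argmin(costs))

# The depth whose refocused angular patch is photo-consistent on omega wins.
omega = np.ones((3, 3), dtype=bool)
good = np.full((3, 3), 0.5)                      # consistent at the true depth
bad = np.arange(9, dtype=float).reshape(3, 3) / 9.0
print(initial_depth([bad, good, bad], omega, 0.5))  # 1
```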

#### 3.2. Anti-occlusion in other views

For pixels around the occlusion edge, previous work [15,16] simply dilated edges with a rough size, so depth edges become blurry because of the uncertain size. In our method, we design a filter to identify these points: since the consistency regions of points occluded in other views are imprecise, these points have a larger cost than other points. We therefore calculate the mean and variance of all points' costs in the scene and use them as the parameters of a filter to identify and mark these points.

For the marked points, instead of measuring the cost over the consistency region, which is affected by occlusion in other views, we search a subregion to avoid the influence of occluders in other views. In our method, the angular patch (9 × 9) is divided into 9 subpatches (3 × 3). The total cost *C*_{α}(*p*) of each subpatch *p* is performed by

where *p* is the index of the subpatch and *N*_{p} is the number of pixels in the subpatch. The subpatch with the minimum cost is selected as the new consistency region at depth *α*, and the depth is then estimated accordingly.

As an example, Fig. 7 shows the process results. The initial depth map in Fig. 7(b) is obtained by Eq. (12), and the areas occluded in other views (close to the edge) obtain imprecise depth in the initial map. In Fig. 7(c) these areas are detected by our method, and in Fig. 7(d) most of them are corrected to the accurate depth by our algorithm, with sharper boundaries well preserved.
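The subpatch search can be sketched as below; the per-pixel cost (absolute difference against the central-view intensity) is an assumption standing in for the paper's cost term.

```python
import numpy as np

def min_subpatch_cost(angular_patch, center_val, sub=3):
    """Divide a 9x9 angular patch into nine 3x3 subpatches and return the
    minimum average subpatch cost, used for points occluded in other
    views: at least one subpatch should avoid the occluder entirely."""
    A = angular_patch.shape[0]
    best = np.inf
    for by in range(0, A, sub):
        for bx in range(0, A, sub):
            block = angular_patch[by:by + sub, bx:bx + sub]
            cost = np.abs(block - center_val).mean()   # C_alpha(p) / N_p
            best = min(best, cost)
    return best

# One clean 3x3 subpatch is enough to recover a zero cost at the true depth,
# even when the rest of the angular patch is contaminated by an occluder.
patch = np.ones((9, 9))
patch[0:3, 0:3] = 0.2        # the only unoccluded subpatch
print(min_subpatch_cost(patch, 0.2))   # 0.0
```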

## 4. Depth regularization

Given the initial depth obtained in Section 3, we present in this section the approach of depth refinement with global regularization. More specifically, we incorporate curvature-based confidence analysis into the data cost, making the depth evaluation more accurate.

#### 4.1. Curvature based confidence analysis

In order to analyze the confidence, we select two pixels as an example, marked in red and yellow in the initial depth map in Fig. 8(a). Their Depth-Cost (D-C) curves, with the horizontal axis being depth and the vertical axis being total cost, are shown in Fig. 8(b). The D-C curve of the red point is very different from that of the yellow point: near the minimum value, the curve of the red point changes more sharply than that of the yellow point, and the minimum value of the red curve is smaller.

For each pixel, if the consistency region is precisely selected, the pixel exhibits photo-consistency, and when reaching the lowest cost, the total cost is very small and sharply focused at its minimum. Therefore, such pixels have higher confidence and more accurate initial depth. For example, the depth of the red point is more reliable than that of the yellow point. Based on this finding, we propose a method to estimate the confidence of the initial depth as follows.

where we take each (*D*, *C*_{min}) point and sort the points in ascending order, so that the second smallest point (*D*_{second}, *C*_{second}) is obtained. Then the curvature at (*D*_{second}, *C*_{second}) is calculated by Eq. (17). At last, the confidence of pixel (*x*, *y*) is obtained by

where *k* is a weight; we set *k* = 10 for all data in our experiments. Figure 9 shows the result: Fig. 9(c) shows the difference between the ground truth and the initial depth map, where brighter pixels have a larger difference, and Fig. 9(d) is the corresponding confidence map, where darker pixels have larger confidence. The confidence map is mostly consistent with the difference map.
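A sketch of the curvature-based confidence. The discrete curvature uses central differences, and the exponential mapping from curvature to confidence is an illustrative assumption, since Eqs. (17)–(18) are not reproduced here; only the weight *k* = 10 comes from the paper.

```python
import numpy as np

def curvature_at(depths, costs, idx):
    """Discrete curvature |C''| / (1 + C'^2)^(3/2) of the D-C curve at
    index idx, via central differences (assumes a uniform depth grid)."""
    d1 = (costs[idx + 1] - costs[idx - 1]) / (depths[idx + 1] - depths[idx - 1])
    d2 = (costs[idx + 1] - 2 * costs[idx] + costs[idx - 1]) \
        / (depths[idx + 1] - depths[idx]) ** 2
    return abs(d2) / (1.0 + d1 ** 2) ** 1.5

def confidence(depths, costs, k=10.0):
    """Confidence of a pixel's initial depth from its D-C curve: sort the
    (D, C) points by cost, evaluate the curvature at the second-smallest
    one, and map it through an exponential with weight k.
    The exponential mapping is an illustrative assumption."""
    order = np.argsort(costs)
    idx = int(np.clip(order[1], 1, len(costs) - 2))  # second smallest, clipped for the stencil
    kappa = curvature_at(depths, costs, idx)
    return 1.0 - np.exp(-k * kappa)     # sharper valley -> higher confidence
```

A sharply curved valley (like the red point's curve) yields a confidence near 1, while a gently sloped valley (like the yellow point's) yields a value near 0.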

#### 4.2. Final depth regularization

Finally, given the initial depth and the confidence cue, we refine the result with global regularization using a data cost and a smoothness cost. More specifically, the initial depth map is regularized with a Markov Random Field (MRF) model, and the problem is cast as minimizing the following energy.

where *α* is the final depth, *p* and *q* are neighboring pixels, and *λ* is a weight which we set to 5.

Based on the confidence, the data cost is defined as a Gaussian function, as follows.

where *con*(*p*) controls the sensitivity of the function and is given by the confidence analysis in Section 4.1.

The smoothness cost controls the smoothness constraint between two neighboring pixels, which is defined as

where ∇*I* is the gradient of the central pinhole image, *ω*_{c} is a weight that balances the gradient term and the edge term, and *δ* is set to 0.1 to avoid an infinite smoothness term. If two pixels are very similar, the gradient term and the edge term are very small, so the pixels compose a region; in contrast, if two pixels are very different, or an occlusion may lie between them, the gradient term and the edge term preserve a sharp boundary. The minimization of *E* is solved by a standard graph cut algorithm [24–26].
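The regularization step can be sketched as follows. Iterated conditional modes (ICM) stands in for the graph-cut solver of [24–26], and the Gaussian data cost and truncated-difference smoothness cost are simplified stand-ins for the paper's terms, so this is only a sketch.

```python
import numpy as np

def regularize_depth(init_depth, conf, labels, lam=5.0, iters=10):
    """Refine the initial depth map by minimizing E = data + lam * smoothness.

    The data cost follows a Gaussian form around the initial depth, with
    the confidence con(p) controlling its sensitivity; the smoothness
    cost is a truncated label difference over 4-neighbors. ICM is used
    instead of graph cuts purely to keep the sketch short."""
    H, W = init_depth.shape
    depth = init_depth.copy()
    alphas = np.arange(labels)
    # Data cost per pixel and label: 1 - exp(-(alpha - alpha_init)^2 * con(p)).
    data = 1.0 - np.exp(-((alphas[None, None, :] - init_depth[:, :, None]) ** 2)
                        * conf[:, :, None])
    for _ in range(iters):
        for y in range(H):
            for x in range(W):
                nb = [depth[ny, nx]
                      for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                      if 0 <= ny < H and 0 <= nx < W]
                smooth = np.minimum(
                    np.abs(alphas[:, None] - np.array(nb)[None, :]), 2).sum(axis=1)
                depth[y, x] = int(np.argmin(data[y, x] + lam * smooth))
    return depth
```

A low-confidence outlier is pulled toward its neighbors (its Gaussian data cost is nearly flat), while high-confidence pixels keep their initial depth, which matches the intended effect of weighting the data term by `con(p)`.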

## 5. Experimental results

In order to evaluate the performance of the proposed method, we test it on both synthetic and real light-field datasets. The synthetic datasets are from the public 4D Light Field Dataset [27], which contains light-field images and ground truth for comparison. Using the stratified, test, and training images of the 4D Light Field Dataset, we compare all results with recent algorithms ([28], [29], [13], [30], [14,15], [16] and [17,18]) in various aspects. Details of the resulting disparity maps of these existing methods can be found in [27]. The real light-field database was created by Stanford University [31]. In the experiments, the depth labels of all methods are set to [0,100] for fair comparison, and the disparity range for each image is provided by the synthetic dataset or set to [−1,1] for the real dataset. Our proposed algorithm is implemented in MATLAB and VS2015 on a PC with a 3.2 GHz CPU.

#### 5.1. Evaluation on synthetic datasets

Table 1 shows the averaged evaluation metrics in general, stratified and photorealistic performance evaluation on the synthetic images from 4D Light Field Dataset [27]. Lower scores indicate better performance for all metrics. It can be seen that our method outperforms these state-of-the-art algorithms in many aspects and shows a better comprehensive performance.

Among these metrics, Mean Squared Error (MSE) is a more reasonable one. So, for better evaluation, we show more details on each synthetic dataset compared with these algorithms in Table 2, where our algorithm reports the lowest scores among all algorithms and gives more reliable depth results. In addition, the proposed algorithm consumes less time than most of these methods.

Figure 10 shows the detailed results on the dino dataset of the 4D Light Field Dataset. As shown in the figures, especially for the tiny hole highlighted with red boxes, only our method obtains an accurate result in this complex occlusion case, while other algorithms produce blur. As for the blue and green boxes covering object edges, [28], [29] and [13] over-smooth and miss them. [30], [15] and [16] lose some line boundaries, and many object boundaries are blurred because of pixels occluded in other views. [18] obtains rough boundaries and misses complex occlusions, whereas our results give sharper boundaries and better details.

Figure 11 shows the results on the 4D Light Field Dataset [27]. As shown in the figures, for the net part in the first row, the proposed method not only provides good details but also avoids being over-sharp. For the region around the statue in the second row, which is occluded in other views, [29] and [13] simply smooth it without handling it explicitly, and [28] and [30] are blurry in these parts. [15] and [16] adopt a coarse method, so the boundaries of the statue are blurred in many places and these depths are incorrect; [18] performs better than the other algorithms but still over-smooths some parts. Our algorithm handles the occlusion problem, so the boundary around the statue is clear and precise. For the last two rows, our results also show better performance. In short, our algorithm outperforms these algorithms for multi-occluder occasions, complex details, and occlusions in other views; both objectively and subjectively, the proposed algorithm outperforms the other algorithms.

#### 5.2. Evaluation on real datasets

Figure 12 compares results on the real-scene database created by Stanford University; these images were captured by a Lytro Illum [31]. It can be seen that our results still preserve the details of the scenes well and avoid being over-sharp, consistent with the results on the synthetic datasets. Only our method well preserves the structure of the near plant and the net (first row) and the precise details of the wooden net (third row). Only our method captures the little objects; for example, the little plant in the bottom right corner is well recovered by our method (second row). Moreover, for complex scenes (fourth row), our method gives a more detailed depth map than the other methods, reproduces the thin structure of the branch without burrs, and keeps the details without expanding the boundaries (final row).

## 6. Conclusion

In this paper, a depth estimation algorithm is proposed which is robust not only in single-occluder occasions but also in multi-occluder scenarios. We build a light-field model for the multi-occlusion situation and prove that boundaries between occluded views and unoccluded views in the angular domain correspond to edges of the occluders in the spatial domain. Based on this fact, an adaptive connected-domain selection algorithm is proposed to obtain more accurate consistency regions in the angular domain for occlusion in the central view. Considering occlusion in different views, we develop a subpatch approach for anti-occlusion in other views to keep sharper boundaries. A novel confidence analysis which considers the curvature factor is proposed to obtain more precise confidence values for better depth evaluation. The final depth estimation is optimized using an MRF framework which fuses the confidence analysis and the initial depth map. Our algorithm outperforms other algorithms on synthetic datasets and real-world scenes, and can be used in a range of applications such as 3D reconstruction, VR, and AR.

## Funding

National Natural Science Foundation of China (61871437, 61702384).

## References

**1. **T. Georgiev, Z. Yu, and A. Lumsdaine, “Lytro camera
technology: theory, algorithms, performance
analysis,” Int. Soc. Opt.
Eng. **8667**,
1–10
(2013).

**2. **R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan, “Light field photography with a hand-held plenoptic camera,” Comput. Sci. Tech. Rep. **2**, 1–11 (2005).

**3. **M. Harris, “Focusing on everything,” IEEE Spectr. **49**(5), 44–50 (2012). [CrossRef]

**4. **C. Hahne, A. Aggoun, S. Haxha, V. Velisavljevic, and J. C. J. Fernández, “Light field geometry of a standard plenoptic camera,” Opt. Express **22**, 26659–26673 (2014). [CrossRef] [PubMed]

**5. **Z. Ma, Z. Cen, and X. Li, “Depth estimation algorithm for light field data by epipolar image analysis and region interpolation,” Appl. Opt. **56**, 6603–6610 (2017). [CrossRef] [PubMed]

**6. **Y. Qin, X. Jin, Y. Chen, and Q. Dai, “Enhanced depth
estimation for hand-held light field cameras,”
in *Proceedings of IEEE International Conference on Acoustics,
Speech and Signal Processing*,
(IEEE, 2017), pp.
2032–2036.

**7. **C. Kim, H. Zimmer, Y. Pritch, A. Sorkine-Hornung, and M. Gross, “Scene reconstruction from high spatio-angular resolution light fields,” ACM Trans. Graph. **32**, 1–12 (2013). [CrossRef]

**8. **Z. Cai, X. Liu, X. Peng, Y. Yin, A. Li, J. Wu, and B. Z. Gao, “Structured light field 3d imaging,” Opt. Express **24**, 20324–20334 (2016). [CrossRef] [PubMed]

**9. **T. Tao, Q. Chen, S. Feng, Y. Hu, and C. Zuo, “Active depth estimation from defocus using a camera array,” Appl. Opt. **57**, 4960–4967 (2018). [CrossRef] [PubMed]

**10. **S. Wanner and B. Goldluecke, “Globally consistent
depth labeling of 4d light fields,” in
*Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition*, (IEEE,
2012), pp.
41–48.

**11. **M. W. Tao, S. Hadap, J. Malik, and R. Ramamoorthi, “Depth from combining
defocus and correspondence using light-field
cameras,” in *Proceedings of IEEE
Conference on Computer Vision and Pattern Recognition*,
(IEEE, 2013), pp.
673–680.

**12. **M. W. Tao, P. P. Srinivasan, J. Malik, and R. Ramamoorthi, “Depth from shading,
defocus, and correspondence using light-field angular
coherence,” in *Proceedings of IEEE
Conference on Computer Vision and Pattern Recognition*,
(IEEE, 2015), pp.
1940–1948.

**13. **H.-G. Jeon, J. Park, G. Choe, J. Park, Y. Bok, Y.-W. Tai, and I. S. Kweon, “Accurate depth map estimation from a lenslet light field camera,” in *Proceedings of IEEE Conference on Computer Vision and Pattern Recognition*, (IEEE, 2015), pp. 1547–1555.

**14. **T. C. Wang, A. A. Efros, and R. Ramamoorthi, “Occlusion-aware depth
estimation using light-field cameras,” in
*Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition*, (IEEE,
2015), pp.
3487–3495.

**15. **T. C. Wang, A. A. Efros, and R. Ramamoorthi, “Depth estimation with occlusion modeling using light-field cameras,” IEEE Trans. Pattern Anal. Mach. Intell. **38**, 2170–2181 (2016). [CrossRef] [PubMed]

**16. **H. Zhu, Q. Wang, and J. Y. Yu, “Occlusion-model guided anti-occlusion depth estimation in light field,” IEEE J. Sel. Top. Signal Process. **11**, 965–978 (2017). [CrossRef]

**17. **W. Williem and I. K. Park, “Robust light field
depth estimation for noisy scene with
occlusion,” in *Proceedings of IEEE
Conference on Computer Vision and Pattern Recognition*,
(IEEE, 2016), pp.
4396–4404.

**18. **W. Williem, I. K. Park, and K. M. Lee, “Robust light field depth estimation using occlusion-noise aware data costs,” IEEE Trans. Pattern Anal. Mach. Intell. **40**, 2484–2497 (2018). [CrossRef]

**19. **T. Ryu, B. Lee, and S. Lee, “Mutual constraint using partial occlusion artifact removal for computational integral imaging reconstruction,” Appl. Opt. **54**, 4147–4153 (2015). [CrossRef]

**20. **M. Ghaneizad, Z. Kavehvash, and H. Aghajan, “Human detection in
occluded scenes through optically inspired multi-camera image
fusion,” J. Opt. Soc. Am. A **34**, 856–869
(2017). [CrossRef] [PubMed]

**21. **S. Xie, P. Wang, X. Sang, Z. Chen, N. Guo, B. Yan, K. Wang, and C. Yu, “Profile preferentially
partial occlusion removal for three-dimensional integral
imaging,” Opt. Express **24**, 23519–23530
(2016). [CrossRef] [PubMed]

**22. **A. L. Dulmage and N. S. Mendelsohn, “Coverings of bipartite graphs,” Can. J. Math. **10**, 516–534 (1958). [CrossRef]

**23. **A. Pothen and C.-J. Fan, “Computing the block triangular form of a sparse matrix,” ACM Trans. Math. Softw. **16**, 303–324 (1990). [CrossRef]

**24. **Y. Boykov and V. Kolmogorov, “An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision,” IEEE Trans. Pattern Anal. Mach. Intell. **26**, 1124–1137 (2004). [CrossRef]

**25. **Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Trans. Pattern Anal. Mach. Intell. **23**, 1222–1239 (2001). [CrossRef]

**26. **V. Kolmogorov and R. Zabin, “What energy functions can be minimized via graph cuts?” IEEE Trans. Pattern Anal. Mach. Intell. **26**, 147–159 (2004). [CrossRef] [PubMed]

**27. **K. Honauer, O. Johannsen, D. Kondermann, and B. Goldluecke, “A dataset and
evaluation methodology for depth estimation on 4d light
fields,” in *Proceedings of Asian
Conference on Computer Vision*,
(Springer, 2016), pp.
19–34.

**28. **O. Johannsen, A. Sulc, and B. Goldluecke, “What sparse light
field coding reveals about scene structure,”
in *Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition*, (IEEE,
2016), pp.
3262–3270.

**29. **L. Si and Q. Wang, “Dense depth-map
estimation and geometry inference from light fields via global
optimization,” in *Proceedings of Asian
Conference on Computer Vision*,
(Springer, 2016), pp.
83–98.

**30. **S. Zhang, H. Sheng, C. Li, J. Zhang, and X. Zhang, “Robust depth estimation for light field via spinning parallelogram operator,” Comput. Vis. Image Underst. **145**, 148–159 (2016). [CrossRef]

**31. **A. S. Raj, M. Lowney, and R. Shah, “Light-field database
creation and depth estimation,”
Tech. Rep., Department of Computer Science, Stanford University (2016).