A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint arXiv:1704.04861 (2017).

Y. Li, J.-B. Huang, N. Ahuja, and M.-H. Yang, “Deep joint image filtering,” in Computer Vision – ECCV 2016, B. Leibe, J. Matas, N. Sebe, and M. Welling, eds. (Springer International Publishing, Cham, 2016), pp. 154–169.

I. Alhashim and P. Wonka, “High quality monocular depth estimation via transfer learning,” (2018).

Y. Altmann, R. Aspden, M. Padgett, and S. McLaughlin, “A Bayesian approach to denoising of single-photon binary images,” IEEE Trans. Comput. Imaging 3(3), 460–471 (2017).

A. Gordon, H. Li, R. Jonschkowski, and A. Angelova, “Depth from videos in the wild: Unsupervised monocular depth learning from unknown cameras,” CoRR abs/1904.04998 (2019).

F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size,” arXiv preprint arXiv:1602.07360 (2016).

D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014).

H. Fu, M. Gong, C. Wang, K. Batmanghelich, and D. Tao, “Deep ordinal regression network for monocular depth estimation,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2018).

D. Ferstl, C. Reinbacher, R. Ranftl, M. Rüther, and H. Bischof, “Image guided depth upsampling using anisotropic total generalized variation,” in Proceedings of the IEEE International Conference on Computer Vision, (2013), pp. 993–1000.

C. Godard, O. Mac Aodha, and G. J. Brostow, “Unsupervised monocular depth estimation with left-right consistency,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017).

J. Uhrig, N. Schneider, L. Schneider, U. Franke, T. Brox, and A. Geiger, “Sparsity invariant CNNs,” in 2017 International Conference on 3D Vision (3DV), (2017), pp. 11–20.

O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, (Springer, 2015), pp. 234–241.

S. Burri, C. Bruschini, and E. Charbon, “LinoSPAD: a compact linear SPAD camera system with 64 FPGA-based TDC modules for versatile 50 ps resolution time-resolved imaging,” Instruments 1(1), 6 (2017).

S. Burri, H. Homulle, C. Bruschini, and E. Charbon, “LinoSPAD: a time-resolved 256×1 CMOS SPAD line sensor system featuring 64 FPGA-based TDC channels running at up to 8.5 giga-events per second,” in Optical Sensing and Detection IV, vol. 9899 (International Society for Optics and Photonics, 2016), p. 98990D.

A. M. Pawlikowska, A. Halimi, R. A. Lamb, and G. S. Buller, “Single-photon three-dimensional imaging at up to 10 kilometers range,” Opt. Express 25(10), 11919–11931 (2017).

A. McCarthy, N. J. Krichel, N. R. Gemmell, X. Ren, M. G. Tanner, S. N. Dorenbos, V. Zwiller, R. H. Hadfield, and G. S. Buller, “Kilometer-range, high resolution depth imaging via 1560 nm wavelength single-photon detection,” Opt. Express 21(7), 8904–8915 (2013).

C. Cadena, A. R. Dick, and I. D. Reid, “Multi-modal auto-encoders as joint estimators for robotics scene understanding,” in Robotics: Science and Systems, vol. 5 (2016), p. 1.

Z.-P. Li, X. Huang, Y. Cao, B. Wang, Y.-H. Li, W. Jin, C. Yu, J. Zhang, Q. Zhang, and C.-Z. Peng, “Single-photon computational 3D imaging at 45 km,” arXiv preprint arXiv:1904.10341 (2019).

F. Ma, G. V. Cavalheiro, and S. Karaman, “Self-supervised sparse-to-dense: Self-supervised depth completion from lidar and monocular camera,” (2018).

E. Charbon, “Single-photon imaging in complementary metal oxide semiconductor processes,” Phil. Trans. R. Soc. A 372(2012), 20130100 (2014).

M. Jaritz, R. D. Charette, E. Wirbel, X. Perrotton, and F. Nashashibi, “Sparse and dense data with CNNs: Depth completion and semantic segmentation,” in 2018 International Conference on 3D Vision (3DV), (2018), pp. 52–60.

L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs,” IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018).

I. Eichhardt, D. Chetverikov, and Z. Jankó, “Image-guided ToF depth upsampling: a survey,” Mach. Vis. Appl. 28(3-4), 267–282 (2017).

J. Kopf, M. F. Cohen, D. Lischinski, and M. Uyttendaele, “Joint bilateral upsampling,” ACM Trans. Graph. 26(3), 96 (2007).

A. Kirmani, D. Venkatraman, D. Shin, A. Colaço, F. N. Wong, J. H. Shapiro, and V. K. Goyal, “First-photon imaging,” Science 343(6166), 58–61 (2014).

C. Hazirbas, L. Ma, C. Domokos, and D. Cremers, “FuseNet: Incorporating depth into semantic segmentation via fusion-based CNN architecture,” in Computer Vision – ACCV 2016, S.-H. Lai, V. Lepetit, K. Nishino, and Y. Sato, eds. (Springer International Publishing, Cham, 2017), pp. 213–228.

J. Qiu, Z. Cui, Y. Zhang, X. Zhang, S. Liu, B. Zeng, and M. Pollefeys, “DeepLiDAR: Deep surface normal guided depth prediction for outdoor scene from sparse LiDAR data and single color image,” (2018).

C. Ti, R. Yang, J. Davis, and Z. Pan, “Simultaneous time-of-flight sensing and photometric stereo with a single ToF sensor,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2015).

F. Heide, S. Diamond, D. B. Lindell, and G. Wetzstein, “Sub-picosecond photon-efficient 3D imaging using single-photon sensors,” Sci. Rep. 8(1), 17726 (2018).

M. O’Toole, F. Heide, D. B. Lindell, K. Zang, S. Diamond, and G. Wetzstein, “Reconstructing transient images from single-photon sensors,” in Proc. CVPR, (2019).

J. Diebel and S. Thrun, “An application of Markov random fields to range sensing,” in Advances in Neural Information Processing Systems, (2006), pp. 291–298.

J. P. S. do Monte Lima, F. P. M. Simões, H. Uchiyama, V. Teichrieb, and E. Marchand, “Depth-assisted rectification for real-time object detection and pose estimation,” Mach. Vis. Appl. 27(2), 193–219 (2016).

D. Eigen, C. Puhrsch, and R. Fergus, “Depth map prediction from a single image using a multi-scale deep network,” in Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, eds. (Curran Associates, Inc., 2014), pp. 2366–2374.

A. Eldesokey, M. Felsberg, and F. S. Khan, “Confidence propagation through CNNs for guided sparse depth regression,” IEEE Trans. Pattern Anal. Mach. Intell. (2019).

Y. Wen, B. Sheng, P. Li, W. Lin, and D. D. Feng, “Deep color guided coarse-to-fine convolutional network cascade for depth image super-resolution,” IEEE Trans. Image Process. 28(2), 994–1006 (2019).

N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, “Indoor segmentation and support inference from RGBD images,” in Computer Vision – ECCV 2012, A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid, eds. (Springer Berlin Heidelberg, Berlin, Heidelberg, 2012), pp. 746–760.

P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox, “RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments,” The Int. J. Robotics Res. 31(5), 647–663 (2012).

Y. Zhang and T. Funkhouser, “Deep depth completion of a single RGB-D image,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2018).

A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The KITTI dataset,” The Int. J. Robotics Res. 32(11), 1231–1237 (2013).

S. Xin, S. Nousias, K. N. Kutulakos, A. C. Sankaranarayanan, S. G. Narasimhan, and I. Gkioulekas, “A theory of Fermat paths for non-line-of-sight shape reconstruction,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2019), pp. 6800–6809.

J. Rapp and V. K. Goyal, “A few photons among many: Unmixing signal and noise for photon-efficient active imaging,” IEEE Trans. Comput. Imaging 3(3), 445–459 (2017).

D. Shin, F. Xu, D. Venkatraman, R. Lussana, F. Villa, F. Zappa, V. K. Goyal, F. N. Wong, and J. H. Shapiro, “Photon-efficient imaging with a single-photon camera,” Nat. Commun. 7(1), 12046 (2016).

D. Shin, A. Kirmani, V. K. Goyal, and J. H. Shapiro, “Photon-efficient computational 3-D and reflectivity imaging with single-photon detectors,” IEEE Trans. Comput. Imaging 1(2), 112–125 (2015).

A. Gupta, A. Ingle, A. Velten, and M. Gupta, “Photon-flooded single-photon 3D cameras,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2019).

A. Ingle, A. Velten, and M. Gupta, “High flux passive imaging with single-photon sensors,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2019).

K. He, X. Zhang, S. Ren, and J. Sun, “Identity mappings in deep residual networks,” in European Conference on Computer Vision, (Springer, 2016), pp. 630–645.

D. Scharstein, H. Hirschmüller, Y. Kitajima, G. Krathwohl, N. Nešić, X. Wang, and P. Westling, “High-resolution stereo datasets with subpixel-accurate ground truth,” in Pattern Recognition, X. Jiang, J. Hornegger, and R. Koch, eds. (Springer International Publishing, Cham, 2014), pp. 31–42.

T.-W. Hui, C. C. Loy, and X. Tang, “Depth map super-resolution by deep multi-scale guidance,” in Computer Vision – ECCV 2016, B. Leibe, J. Matas, N. Sebe, and M. Welling, eds. (Springer International Publishing, Cham, 2016), pp. 353–369.

D. B. Lindell, M. O’Toole, and G. Wetzstein, “Single-photon 3D imaging with deep sensor fusion,” ACM Trans. Graph. 37(4), 1–12 (2018).

[Crossref]

F. Heide, S. Diamond, D. B. Lindell, and G. Wetzstein, “Sub-picosecond photon-efficient 3d imaging using single-photon sensors,” Sci. Rep. 8(1), 17726 (2018).

[Crossref]

M. O’Toole, F. Heide, D. B. Lindell, K. Zang, S. Diamond, and G. Wetzstein, “Reconstructing transient images from single-photon sensors,” in Proc. CVPR, (2019).

J. Kopf, M. F. Cohen, D. Lischinski, and M. Uyttendaele, “Joint bilateral upsampling,” ACM Trans. Graph. 26(3), 96 (2007).

[Crossref]

J. Qiu, Z. Cui, Y. Zhang, X. Zhang, S. Liu, B. Zeng, and M. Pollefeys, “Deeplidar: Deep surface normal guided depth prediction for outdoor scene from sparse lidar data and single color image,” (2018).

T.-W. Hui, C. C. Loy, and X. Tang, “Depth map super-resolution by deep multi-scale guidance,” in Computer Vision – ECCV 2016, B. Leibe, J. Matas, N. Sebe, and M. Welling, eds. (Springer International Publishing, Cham, 2016), pp. 353–369.

D. Shin, F. Xu, D. Venkatraman, R. Lussana, F. Villa, F. Zappa, V. K. Goyal, F. N. Wong, and J. H. Shapiro, “Photon-efficient imaging with a single-photon camera,” Nat. Commun. 7(1), 12046 (2016).

[Crossref]

F. Ma, G. V. Cavalheiro, and S. Karaman, “Self-supervised sparse-to-dense: Self-supervised depth completion from lidar and monocular camera,” (2018).

C. Hazirbas, L. Ma, C. Domokos, and D. Cremers, “Fusenet: Incorporating depth into semantic segmentation via fusion-based cnn architecture,” in Computer Vision – ACCV 2016, S.-H. Lai, V. Lepetit, K. Nishino, and Y. Sato, eds. (Springer International Publishing, Cham, 2017), pp. 213–228

C. Godard, O. Mac Aodha, and G. J. Brostow, “Unsupervised monocular depth estimation with left-right consistency,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017).

J. P. S. do Monte Lima, F. P. M. Simões, H. Uchiyama, V. Teichrieb, and E. Marchand, “Depth-assisted rectification for real-time object detection and pose estimation,” Mach. Vis. Appl. 27(2), 193–219 (2016).

[Crossref]

A. McCarthy, N. J. Krichel, N. R. Gemmell, X. Ren, M. G. Tanner, S. N. Dorenbos, V. Zwiller, R. H. Hadfield, and G. S. Buller, “Kilometer-range, high resolution depth imaging via 1560 nm wavelength single-photon detection,” Opt. Express 21(7), 8904–8915 (2013).

[Crossref]

Y. Altmann, R. Aspden, M. Padgett, and S. McLaughlin, “A bayesian approach to denoising of single-photon binary images,” IEEE Trans. Comput. Imaging 3(3), 460–471 (2017).

[Crossref]

F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “Squeezenet: Alexnet-level accuracy with 50x fewer parameters and< 0.5 mb model size,” arXiv preprint arXiv:1602.07360 (2016).

L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs,” IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018).

[Crossref]

S. Xin, S. Nousias, K. N. Kutulakos, A. C. Sankaranarayanan, S. G. Narasimhan, and I. Gkioulekas, “A theory of fermat paths for non-line-of-sight shape reconstruction,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2019), pp. 6800–6809.

M. Jaritz, R. D. Charette, E. Wirbel, X. Perrotton, and F. Nashashibi, “Sparse and dense data with cnns: Depth completion and semantic segmentation,” in 2018 International Conference on 3D Vision (3DV), (2018), pp. 52–60.

D. Scharstein, H. Hirschmüller, Y. Kitajima, G. Krathwohl, N. Nešić, X. Wang, and P. Westling, “High-resolution stereo datasets with subpixel-accurate ground truth,” in Pattern Recognition, X. Jiang, J. Hornegger, and R. Koch, eds. (Springer International Publishing, Cham, 2014), pp. 31–42.

S. Xin, S. Nousias, K. N. Kutulakos, A. C. Sankaranarayanan, S. G. Narasimhan, and I. Gkioulekas, “A theory of fermat paths for non-line-of-sight shape reconstruction,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2019), pp. 6800–6809.

D. B. Lindell, M. O’Toole, and G. Wetzstein, “Single-photon 3d imaging with deep sensor fusion,” ACM Trans. Graph. 37(4), 1–12 (2018).

[Crossref]

M. O’Toole, F. Heide, D. B. Lindell, K. Zang, S. Diamond, and G. Wetzstein, “Reconstructing transient images from single-photon sensors,” in Proc. CVPR, (2019).

Y. Altmann, R. Aspden, M. Padgett, and S. McLaughlin, “A bayesian approach to denoising of single-photon binary images,” IEEE Trans. Comput. Imaging 3(3), 460–471 (2017).

[Crossref]

C. Ti, R. Yang, J. Davis, and Z. Pan, “Simultaneous time-of-flight sensing and photometric stereo with a single tof sensor,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR),, (2015).

L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs,” IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018).

[Crossref]

Z.-P. Li, X. Huang, Y. Cao, B. Wang, Y.-H. Li, W. Jin, C. Yu, J. Zhang, Q. Zhang, and C.-Z. Peng, “Single-photon computational 3d imaging at 45 km,” arXiv preprint arXiv:1904.10341 (2019).

M. Jaritz, R. D. Charette, E. Wirbel, X. Perrotton, and F. Nashashibi, “Sparse and dense data with cnns: Depth completion and semantic segmentation,” in 2018 International Conference on 3D Vision (3DV), (2018), pp. 52–60.

J. Qiu, Z. Cui, Y. Zhang, X. Zhang, S. Liu, B. Zeng, and M. Pollefeys, “Deeplidar: Deep surface normal guided depth prediction for outdoor scene from sparse lidar data and single color image,” (2018).

D. Eigen, C. Puhrsch, and R. Fergus, “Depth map prediction from a single image using a multi-scale deep network,” in Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, eds. (Curran Associates, Inc., 2014), pp. 2366–2374.

J. Qiu, Z. Cui, Y. Zhang, X. Zhang, S. Liu, B. Zeng, and M. Pollefeys, “Deeplidar: Deep surface normal guided depth prediction for outdoor scene from sparse lidar data and single color image,” (2018).

D. Ferstl, C. Reinbacher, R. Ranftl, M. Rüther, and H. Bischof, “Image guided depth upsampling using anisotropic total generalized variation,” in Proceedings of the IEEE International Conference on Computer Vision, (2013), pp. 993–1000.

J. Rapp and V. K. Goyal, “A few photons among many: Unmixing signal and noise for photon-efficient active imaging,” IEEE Trans. Comput. Imaging 3(3), 445–459 (2017).

[Crossref]

C. Cadena, A. R. Dick, and I. D. Reid, “Multi-modal auto-encoders as joint estimators for robotics scene understanding,” in Robotics: Science and Systems, vol. 5 (2016), p. 1.

D. Ferstl, C. Reinbacher, R. Ranftl, M. Rüther, and H. Bischof, “Image guided depth upsampling using anisotropic total generalized variation,” in Proceedings of the IEEE International Conference on Computer Vision, (2013), pp. 993–1000.

K. He, X. Zhang, S. Ren, and J. Sun, “Identity mappings in deep residual networks,” in European conference on computer vision, (Springer, 2016), 630–645.

A. McCarthy, N. J. Krichel, N. R. Gemmell, X. Ren, M. G. Tanner, S. N. Dorenbos, V. Zwiller, R. H. Hadfield, and G. S. Buller, “Kilometer-range, high resolution depth imaging via 1560 nm wavelength single-photon detection,” Opt. Express 21(7), 8904–8915 (2013).

P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox, “Rgb-d mapping: Using kinect-style depth cameras for dense 3d modeling of indoor environments,” The Int. J. Robotics Res. 31(5), 647–663 (2012).

O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical image computing and computer-assisted intervention, (Springer, 2015), pp. 234–241.

S. Xin, S. Nousias, K. N. Kutulakos, A. C. Sankaranarayanan, S. G. Narasimhan, and I. Gkioulekas, “A theory of fermat paths for non-line-of-sight shape reconstruction,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2019), pp. 6800–6809.

D. Scharstein, H. Hirschmüller, Y. Kitajima, G. Krathwohl, N. Nešić, X. Wang, and P. Westling, “High-resolution stereo datasets with subpixel-accurate ground truth,” in Pattern Recognition, X. Jiang, J. Hornegger, and R. Koch, eds. (Springer International Publishing, Cham, 2014), pp. 31–42.

J. Uhrig, N. Schneider, L. Schneider, U. Franke, T. Brox, and A. Geiger, “Sparsity invariant cnns,” in 2017 International Conference on 3D Vision (3DV), (2017), pp. 11–20.

D. Shin, F. Xu, D. Venkatraman, R. Lussana, F. Villa, F. Zappa, V. K. Goyal, F. N. Wong, and J. H. Shapiro, “Photon-efficient imaging with a single-photon camera,” Nat. Commun. 7(1), 12046 (2016).

D. Shin, A. Kirmani, V. K. Goyal, and J. H. Shapiro, “Photon-efficient computational 3-d and reflectivity imaging with single-photon detectors,” IEEE Trans. Comput. Imaging 1(2), 112–125 (2015).

A. Kirmani, D. Venkatraman, D. Shin, A. Colaço, F. N. Wong, J. H. Shapiro, and V. K. Goyal, “First-photon imaging,” Science 343(6166), 58–61 (2014).

Y. Wen, B. Sheng, P. Li, W. Lin, and D. D. Feng, “Deep color guided coarse-to-fine convolutional network cascade for depth image super-resolution,” IEEE Trans. on Image Process. 28(2), 994–1006 (2019).

N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, “Indoor segmentation and support inference from rgbd images,” in Computer Vision – ECCV 2012, A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid, eds. (Springer Berlin Heidelberg, Berlin, Heidelberg, 2012), pp. 746–760.

J. P. S. do Monte Lima, F. P. M. Simões, H. Uchiyama, V. Teichrieb, and E. Marchand, “Depth-assisted rectification for real-time object detection and pose estimation,” Mach. Vis. Appl. 27(2), 193–219 (2016).

A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The kitti dataset,” The Int. J. Robotics Res. 32(11), 1231–1237 (2013).

T.-W. Hui, C. C. Loy, and X. Tang, “Depth map super-resolution by deep multi-scale guidance,” in Computer Vision – ECCV 2016, B. Leibe, J. Matas, N. Sebe, and M. Welling, eds. (Springer International Publishing, Cham, 2016), pp. 353–369.

H. Fu, M. Gong, C. Wang, K. Batmanghelich, and D. Tao, “Deep ordinal regression network for monocular depth estimation,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2018).

J. Diebel and S. Thrun, “An application of markov random fields to range sensing,” in Advances in neural information processing systems, (2006), pp. 291–298.

C. Ti, R. Yang, J. Davis, and Z. Pan, “Simultaneous time-of-flight sensing and photometric stereo with a single tof sensor,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2015).

J. Kopf, M. F. Cohen, D. Lischinski, and M. Uyttendaele, “Joint bilateral upsampling,” ACM Trans. Graph. 26(3), 96 (2007).

A. Gupta, A. Ingle, A. Velten, and M. Gupta, “Photon-flooded single-photon 3d cameras,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2019).

A. Ingle, A. Velten, and M. Gupta, “High flux passive imaging with single-photon sensors,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2019).

Z.-P. Li, X. Huang, Y. Cao, B. Wang, Y.-H. Li, W. Jin, C. Yu, J. Zhang, Q. Zhang, and C.-Z. Peng, “Single-photon computational 3d imaging at 45 km,” arXiv preprint arXiv:1904.10341 (2019).

A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” (2017).

F. Heide, S. Diamond, D. B. Lindell, and G. Wetzstein, “Sub-picosecond photon-efficient 3d imaging using single-photon sensors,” Sci. Rep. 8(1), 17726 (2018).

D. B. Lindell, M. O’Toole, and G. Wetzstein, “Single-photon 3d imaging with deep sensor fusion,” ACM Trans. Graph. 37(4), 1–12 (2018).

M. O’Toole, F. Heide, D. B. Lindell, K. Zang, S. Diamond, and G. Wetzstein, “Reconstructing transient images from single-photon sensors,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2019).

I. Alhashim and P. Wonka, “High quality monocular depth estimation via transfer learning,” (2018).

Y. Li, J.-B. Huang, N. Ahuja, and M.-H. Yang, “Deep joint image filtering,” in Computer Vision – ECCV 2016, B. Leibe, J. Matas, N. Sebe, and M. Welling, eds. (Springer International Publishing, Cham, 2016), pp. 154–169.

L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs,” IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018).

Y. Zhang and T. Funkhouser, “Deep depth completion of a single rgb-d image,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2018).

Y. Altmann, R. Aspden, M. Padgett, and S. McLaughlin, “A bayesian approach to denoising of single-photon binary images,” IEEE Trans. Comput. Imaging 3(3), 460–471 (2017).

S. Burri, C. Bruschini, and E. Charbon, “Linospad: a compact linear spad camera system with 64 fpga-based tdc modules for versatile 50 ps resolution time-resolved imaging,” Instruments 1(1), 6 (2017).

I. Eichhardt, D. Chetverikov, and Z. Jankó, “Image-guided tof depth upsampling: a survey,” Mach. Vis. Appl. 28(3-4), 267–282 (2017).

A. M. Pawlikowska, A. Halimi, R. A. Lamb, and G. S. Buller, “Single-photon three-dimensional imaging at up to 10 kilometers range,” Opt. Express 25(10), 11919–11931 (2017).

E. Charbon, “Single-photon imaging in complementary metal oxide semiconductor processes,” Phil. Trans. R. Soc. A 372(2012), 20130100 (2014).

A. Gordon, H. Li, R. Jonschkowski, and A. Angelova, “Depth from videos in the wild: Unsupervised monocular depth learning from unknown cameras,” CoRR abs/1904.04998 (2019).

C. Godard, O. Mac Aodha, and G. J. Brostow, “Unsupervised monocular depth estimation with left-right consistency,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017).

C. Hazirbas, L. Ma, C. Domokos, and D. Cremers, “Fusenet: Incorporating depth into semantic segmentation via fusion-based cnn architecture,” in Computer Vision – ACCV 2016, S.-H. Lai, V. Lepetit, K. Nishino, and Y. Sato, eds. (Springer International Publishing, Cham, 2017), pp. 213–228.

F. Ma, G. V. Cavalheiro, and S. Karaman, “Self-supervised sparse-to-dense: Self-supervised depth completion from lidar and monocular camera,” (2018).

A. Eldesokey, M. Felsberg, and F. S. Khan, “Confidence propagation through cnns for guided sparse depth regression,” IEEE Trans. Pattern Anal. Mach. Intell. 1 (2019).

F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5 mb model size,” arXiv preprint arXiv:1602.07360 (2016).

S. Burri, H. Homulle, C. Bruschini, and E. Charbon, “Linospad: a time-resolved 256x1 cmos spad line sensor system featuring 64 fpga-based tdc channels running at up to 8.5 giga-events per second,” in Optical Sensing and Detection IV, vol. 9899 (International Society for Optics and Photonics, 2016), p. 98990D.

D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” (2014).
