M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng, “Tensorflow: A system for large-scale machine learning,” in 12th Symposium on Operating Systems Design and Implementation, (2016), pp. 265–283.

T. Hou, Y. An, Q. Chang, P. Ma, J. Li, L. Huang, D. Zhi, J. Wu, R. Su, Y. Ma, and P. Zhou, “Deep learning-based phase control method for coherent beam combining and its application in generating orbital angular momentum beams,” arXiv preprint arXiv:1903.03983 (2019).

D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” Nature 529, 484–489 (2016).

V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing atari with deep reinforcement learning,” arXiv preprint arXiv:1312.5602 (2013).

T. Schaul, J. Quan, I. Antonoglou, and D. Silver, “Prioritized experience replay,” arXiv preprint arXiv:1511.05952 (2015).

T. M. Shay, V. Benham, J. T. Baker, A. D. Sanchez, D. Pilkington, and C. A. Lu, “Self-synchronous and self-referenced coherent beam combination for large optical arrays,” IEEE J. Sel. Top. Quantum Electron. 13, 480–486 (2007).

D. G. Sandler, T. K. Barrett, D. A. Palmer, R. Q. Fugate, and W. J. Wild, “Use of a neural network to control an adaptive optics system for an astronomical telescope,” Nature 351, 300–302 (1991).

R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction (MIT, 2018).

J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” arXiv preprint arXiv:1412.3555 (2014).

A. Y. Ng, A. Coates, M. Diel, V. Ganapathi, J. Schulte, B. Tse, E. Berger, and E. Liang, “Autonomous inverted helicopter flight via reinforcement learning,” in Experimental Robotics IX, (Springer, 2006), pp. 363–372.

T. Hansch and B. Couillaud, “Laser frequency stabilization by polarization spectroscopy of a reflecting reference cavity,” Opt. Commun. 35, 441–444 (1980).

N. Wahlström, T. B. Schön, and M. P. Deisenroth, “From pixels to torques: Policy learning with deep dynamical models,” arXiv preprint arXiv:1502.02251 (2015).

T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, “Continuous control with deep reinforcement learning,” arXiv preprint arXiv:1509.02971 (2015).

T. Y. Fan, “Laser beam combining for high-power, high-radiance sources,” IEEE J. Sel. Top. Quantum Electron. 11, 567–577 (2005).

E. Liang, R. Liaw, P. Moritz, R. Nishihara, R. Fox, K. Goldberg, J. E. Gonzalez, M. I. Jordan, and I. Stoica, “Rllib: Abstractions for distributed reinforcement learning,” arXiv preprint arXiv:1712.09381 (2017).

M. Müller, M. Kienel, A. Klenke, T. Gottschall, E. Shestaev, M. Plötner, J. Limpert, and A. Tünnermann, “1 kW 1 mJ eight-channel ultrafast fiber laser,” Opt. Lett. 41, 3439–3442 (2016).

F. C. Hoppensteadt and E. M. Izhikevich, “Pattern recognition via synchronization in phase-locked loop neural networks,” IEEE Transactions on Neural Networks 11, 734–738 (2000).

A. Klenke, M. Müller, H. Stark, M. Kienel, C. Jauregui, A. Tünnermann, and J. Limpert, “Coherent beam combination of ultrafast fiber lasers,” IEEE J. Sel. Top. Quantum Electron. 24, 1–9 (2018).

J. B. Rawlings, “Tutorial overview of model predictive control,” IEEE Control. Syst. Mag. 20, 38–52 (2000).

H. Tünnermann and A. Shirakawa, “Reinforcement learning for coherent beam combining,” in Pacific Rim Conference on Lasers and Electro-Optics (CLEO-PR), (2018). W1A.2.

H. Tünnermann and A. Shirakawa, “End-to-end reinforcement learning for coherent beam combination,” in 8th EPS-QEOD Europhoton Conference, (2018). TuP.11.

H. Tünnermann, J. H. Pöld, J. Neumann, D. Kracht, B. Willke, and P. Weßels, “Beam quality and noise properties of coherently combined ytterbium doped single frequency fiber amplifiers,” Opt. Express 19, 19600–19606 (2011).
