Abstract

In large-scale multi-processor computing systems, global communications are typically supported by an auxiliary network (e.g., IBM Blue Gene) or with hardware support in the network (e.g., NEC Earth Simulator). We explore the potential for realizing efficient global communications that can scale beyond a million processors by harnessing the unique parallelism and wavelength routing properties of optical devices. Specifically, we use an arrayed waveguide grating router (AWGR) as the basic building block for scalable global communication. The AWGR is a passive switch fabric (wavelength router) that uses multiple wavelengths to interconnect its inputs and outputs according to a specific cyclic wavelength routing (permutation) pattern. We analyze different network topologies that use AWGR devices for barrier synchronization and propose techniques for choosing the network parameters for a given number of processors. We compare the performance and energy consumption of barrier synchronization with what is achievable with state-of-the-art electrical networks.
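
The cyclic wavelength routing pattern mentioned above can be illustrated with a small sketch. It assumes the commonly used idealized model in which light entering input port i on wavelength index w leaves output port (i + w) mod N; the abstract does not state the exact convention, and real devices may use a different sign or offset. The function names are hypothetical and chosen only for illustration.

```python
# Idealized N x N AWGR model (assumption: output = (input + wavelength) mod N).
# Function names are illustrative and do not come from the paper.

def awgr_output_port(input_port: int, wavelength: int, n_ports: int) -> int:
    """Return the output port reached from input_port on the given wavelength index."""
    return (input_port + wavelength) % n_ports


def wavelength_for(input_port: int, output_port: int, n_ports: int) -> int:
    """Return the wavelength index an input must use to reach output_port."""
    return (output_port - input_port) % n_ports


def routing_table(n_ports: int) -> list[list[int]]:
    """Row i lists the output port reached from input i on each wavelength index."""
    return [
        [awgr_output_port(i, w, n_ports) for w in range(n_ports)]
        for i in range(n_ports)
    ]


if __name__ == "__main__":
    for row in routing_table(4):
        print(row)
```

Under this model each row and each column of the table is a permutation of the ports, so every input can reach every output simultaneously without contention, and no two inputs share an output on the same wavelength. That property is what allows a passive device to carry an all-to-all exchange such as the signalling needed for a barrier.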

© 2012 OSA

[CrossRef]

Minkenberg, C.

C. Minkenberg, F. Abel, P. Muller, R. Krishnamurthy, M. Gusat, P. Dill, I. Iliadis, R. Luijten, R. R. Hemenway, R. Grzybowski, and E. Schiattarella, “Designing a crossbar scheduler for HPC applications,” IEEE Micro, vol. 26, pp. 58–71, 2006.
[CrossRef]

Mukherjee, B.

D. Banerjee, J. Frank, and B. Mukherjee, “Passive optical network architecture based on waveguide grating routers,” IEEE J. Sel. Areas Commun., vol. 16, no. 7, pp. 1040–1050, 1998.
[CrossRef]

Muller, P.

C. Minkenberg, F. Abel, P. Muller, R. Krishnamurthy, M. Gusat, P. Dill, I. Iliadis, R. Luijten, R. R. Hemenway, R. Grzybowski, and E. Schiattarella, “Designing a crossbar scheduler for HPC applications,” IEEE Micro, vol. 26, pp. 58–71, 2006.
[CrossRef]

Ni, N.

B. Arimilli, R. Arimilli, V. Chung, S. Clark, W. Denzel, B. Drerup, T. Hoefler, J. Joyner, J. Lewis, J. Li, N. Ni, and R. Rajamony, “The PERCS high-performance interconnect,” in Proc. of the 2010 18th IEEE Symp. on High Performance Interconnects, 2010, pp. 75–82.

Oh, J.

J. Oh, M. Prvulovic, and A. Zajic, “TLSync: support for multiple fast barriers using on-chip transmission lines,” in Proc. of the 38th Int. Symp. on Computer Architecture, 2011, pp. 105–116.

Ohmori, Y.

A. Sugita, A. Kaneko, K. Okamoto, M. Itoh, A. Himeno, and Y. Ohmori, “Very low insertion loss arrayed-waveguide grating with vertically tapered waveguides,” IEEE Photon. Technol. Lett., vol. 12, pp. 1180–1182, 2000.
[CrossRef]

K. Okamoto, T. Hasegawa, O. Ishida, A. Himeno, and Y. Ohmori, “32 × 32 arrayed-waveguide grating multiplexer with uniform loss and cyclic frequency characteristics,” Electron. Lett., vol. 33, pp. 1865–1866, 1997.
[CrossRef]

Okamoto, K.

A. Sugita, A. Kaneko, K. Okamoto, M. Itoh, A. Himeno, and Y. Ohmori, “Very low insertion loss arrayed-waveguide grating with vertically tapered waveguides,” IEEE Photon. Technol. Lett., vol. 12, pp. 1180–1182, 2000.
[CrossRef]

K. Okamoto, T. Hasegawa, O. Ishida, A. Himeno, and Y. Ohmori, “32 × 32 arrayed-waveguide grating multiplexer with uniform loss and cyclic frequency characteristics,” Electron. Lett., vol. 33, pp. 1865–1866, 1997.
[CrossRef]

Palmer, R.

J. Poulton, R. Palmer, A. M. Fuller, T. Greer, J. Eyles, W. J. Dally, and M. Horowitz, “A 14-mW 6.25-Gb/s transceiver in 90-nm CMOS,” IEEE J. Solid-State Circuits, vol. 42, pp. 2745–2757, 2007.
[CrossRef]

Papen, G.

N. Farrington, G. Porter, S. Radhakrishnan, H. H. Bazzaz, V. Subramanya, Y. Fainman, G. Papen, and A. Vahdat, “Helios: a hybrid electrical/optical switch architecture for modular data centers,” in Proc. of the ACM SIGCOMM 2010 Conf. on Data Communication, 2010, pp. 339–350.

Patel, B.

F. Soares, J. H. Baek, N. Fontaine, X. Zhou, Y. Wang, R. Scott, J. Heritage, C. Junesand, S. Lourdudoss, K. Y. Liou, R. Hamm, W. Wang, B. Patel, S. Vatanapradit, L. Gruezke, W. T. Tsang, and S. J. B. Yoo, “Monolithically integrated InP wafer-scale 100-channel × 10-GHz AWG and Michelson interferometers for 1-THz-bandwidth optical arbitrary waveform generation,” in Optical Fiber Communication Conf. (OFC), 2010, OThS1.

Pattavina, A.

S. Bregni, A. Pattavina, and G. Vegetti, “Architectures and performance of AWG-based optical switching nodes for IP networks,” IEEE J. Sel. Areas Commun., vol. 21, no. 7, pp. 1113–1121, 2003.
[CrossRef]

Pinckney, N.

Pinguet, T.

Pollnau, M.

N. Ismail, A. C. Baclig, P. J. Caspers, F. Sun, K. Worhoff, R. M. d. Ridder, M. Pollnau, and A. Driessen, “Design of low-loss arrayed waveguide gratings for applications in integrated Raman spectroscopy,” in 2010 Conf. on Lasers and Electro-Optics (CLEO) and Quantum Electronics and Laser Science Conf. (QELS), 2010, pp. 1–2.

Porter, G.

N. Farrington, G. Porter, S. Radhakrishnan, H. H. Bazzaz, V. Subramanya, Y. Fainman, G. Papen, and A. Vahdat, “Helios: a hybrid electrical/optical switch architecture for modular data centers,” in Proc. of the ACM SIGCOMM 2010 Conf. on Data Communication, 2010, pp. 339–350.

Poulton, J.

J. Poulton, R. Palmer, A. M. Fuller, T. Greer, J. Eyles, W. J. Dally, and M. Horowitz, “A 14-mW 6.25-Gb/s transceiver in 90-nm CMOS,” IEEE J. Solid-State Circuits, vol. 42, pp. 2745–2757, 2007.
[CrossRef]

Proietti, R.

X. Ye, P. Mejia, Y. Yin, R. Proietti, S. J. B. Yoo, and V. Akella, “DOS—A scalable optical switch for datacenters,” in Proc. of ACM/IEEE Symp. on Architectures for Networking and Communications Systems, 2010, pp. 1–12.

Prvulovic, M.

J. Oh, M. Prvulovic, and A. Zajic, “TLSync: support for multiple fast barriers using on-chip transmission lines,” in Proc. of the 38th Int. Symp. on Computer Architecture, 2011, pp. 105–116.

Rabenseifner, R.

R. Thakur, R. Rabenseifner, and W. Gropp, “Optimizing of collective communication operations in MPICH,” Int. J. High Perform. Comput. Appl., vol. 19, pp. 49–66, 2005.
[CrossRef]

R. Rabenseifner, “Optimization of collective reduction operations,” Lect. Notes Comput. Sci., vol. 3036, pp. 1–9, 2004.

Radhakrishnan, S.

N. Farrington, G. Porter, S. Radhakrishnan, H. H. Bazzaz, V. Subramanya, Y. Fainman, G. Papen, and A. Vahdat, “Helios: a hybrid electrical/optical switch architecture for modular data centers,” in Proc. of the ACM SIGCOMM 2010 Conf. on Data Communication, 2010, pp. 339–350.

Raj, K.

Rajamony, R.

B. Arimilli, R. Arimilli, V. Chung, S. Clark, W. Denzel, B. Drerup, T. Hoefler, J. Joyner, J. Lewis, J. Li, N. Ni, and R. Rajamony, “The PERCS high-performance interconnect,” in Proc. of the 2010 18th IEEE Symp. on High Performance Interconnects, 2010, pp. 75–82.

Ramaswami, R.

R. Ramaswami, K. Sivarajan, and G. Sasaki, Optical Networks: A Practical Perspective. 3rd ed.Morgan Kaufmann, 2009.

Ridder, R. M. d.

N. Ismail, A. C. Baclig, P. J. Caspers, F. Sun, K. Worhoff, R. M. d. Ridder, M. Pollnau, and A. Driessen, “Design of low-loss arrayed waveguide gratings for applications in integrated Raman spectroscopy,” in 2010 Conf. on Lasers and Electro-Optics (CLEO) and Quantum Electronics and Laser Science Conf. (QELS), 2010, pp. 1–2.

Sartori, J.

J. Sartori and R. Kumar, “Low-overhead, high-speed multi-core barrier synchronization,” in 5th Int. Conf. on High-Performance Embedded Architectures and Compilers (HiPEAC), 2010, pp. 18–34.

Sasaki, G.

R. Ramaswami, K. Sivarajan, and G. Sasaki, Optical Networks: A Practical Perspective. 3rd ed.Morgan Kaufmann, 2009.

Schiattarella, E.

C. Minkenberg, F. Abel, P. Muller, R. Krishnamurthy, M. Gusat, P. Dill, I. Iliadis, R. Luijten, R. R. Hemenway, R. Grzybowski, and E. Schiattarella, “Designing a crossbar scheduler for HPC applications,” IEEE Micro, vol. 26, pp. 58–71, 2006.
[CrossRef]

Schreiber, R.

N. Binkert, A. Davis, M. Lipastiy, R. Schreiber, and D. Van-trease, “Nanophotonic barriers,” in Workshop on Photonic Interconnects & Computer Architecture (in conjunction with MICRO 41), 2009, pp. 1–4.

Scott, M. L.

J. M. Mellor-Crummey and M. L. Scott, “Algorithms for scalable synchronization on shared-memory multiprocessors,” ACM Trans. Comput. Syst., vol. 9, pp. 21–65, 1991.
[CrossRef]

Scott, R.

F. Soares, J. H. Baek, N. Fontaine, X. Zhou, Y. Wang, R. Scott, J. Heritage, C. Junesand, S. Lourdudoss, K. Y. Liou, R. Hamm, W. Wang, B. Patel, S. Vatanapradit, L. Gruezke, W. T. Tsang, and S. J. B. Yoo, “Monolithically integrated InP wafer-scale 100-channel × 10-GHz AWG and Michelson interferometers for 1-THz-bandwidth optical arbitrary waveform generation,” in Optical Fiber Communication Conf. (OFC), 2010, OThS1.

Scott, S.

J. Kim, W. J. Dally, S. Scott, and D. Abts, “Technology-driven, highly-scalable dragonfly topology,” in 35th Int. Symp. on Computer Architecture, 2008, pp. 77–88.

E. Anderson, J. Brooks, C. Grassl, and S. Scott, “Performance of the CRAY T3E multiprocessor,” in 1997 ACM/IEEE Conf. on Supercomputing, 1997, pp. 1–17.

Scott, S. L.

S. L. Scott, “Synchronization and communication in the T3E multiprocessor,” ACM SIGOPS Oper. Syst. Rev., vol. 30, pp. 26–36, 1996.
[CrossRef]

Shali, A.

R. Karmani, N. Chen, A. Shali, and R. Johnson, “Barrier synchronization pattern,” 2009 [Online]. Available: http://parlab.eecs.berkeley.edu/wiki/_media/patterns/paraplop_g1_3.pdf.

Shi, J.

Shi, L.

C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu, “Dcell: a scalable and fault-tolerant network structure for data centers,” in Proc. of the ACM SIGCOMM 2008 Conf. on Data Communication, 2008, pp. 75–86.

Singh, S.

M. Blumrich, D. Chen, P. Coteus, A. Gara, M. Giampapa, P. Heidelberger, S. Singh, B. Steinmacher-Burow, T. Takken, and P. Vranas, “Design and analysis of the BlueGene/L torus inter-connection network,” IBM Research Report RC23025 (W0312–022), 2003.

Sivarajan, K.

R. Ramaswami, K. Sivarajan, and G. Sasaki, Optical Networks: A Practical Perspective. 3rd ed.Morgan Kaufmann, 2009.

Soares, F.

F. Soares, J. H. Baek, N. Fontaine, X. Zhou, Y. Wang, R. Scott, J. Heritage, C. Junesand, S. Lourdudoss, K. Y. Liou, R. Hamm, W. Wang, B. Patel, S. Vatanapradit, L. Gruezke, W. T. Tsang, and S. J. B. Yoo, “Monolithically integrated InP wafer-scale 100-channel × 10-GHz AWG and Michelson interferometers for 1-THz-bandwidth optical arbitrary waveform generation,” in Optical Fiber Communication Conf. (OFC), 2010, OThS1.

Steinmacher-Burow, B.

M. Blumrich, D. Chen, P. Coteus, A. Gara, M. Giampapa, P. Heidelberger, S. Singh, B. Steinmacher-Burow, T. Takken, and P. Vranas, “Design and analysis of the BlueGene/L torus inter-connection network,” IBM Research Report RC23025 (W0312–022), 2003.

Subramanya, V.

N. Farrington, G. Porter, S. Radhakrishnan, H. H. Bazzaz, V. Subramanya, Y. Fainman, G. Papen, and A. Vahdat, “Helios: a hybrid electrical/optical switch architecture for modular data centers,” in Proc. of the ACM SIGCOMM 2010 Conf. on Data Communication, 2010, pp. 339–350.

Sugita, A.

A. Sugita, A. Kaneko, K. Okamoto, M. Itoh, A. Himeno, and Y. Ohmori, “Very low insertion loss arrayed-waveguide grating with vertically tapered waveguides,” IEEE Photon. Technol. Lett., vol. 12, pp. 1180–1182, 2000.
[CrossRef]

Sun, F.

N. Ismail, A. C. Baclig, P. J. Caspers, F. Sun, K. Worhoff, R. M. d. Ridder, M. Pollnau, and A. Driessen, “Design of low-loss arrayed waveguide gratings for applications in integrated Raman spectroscopy,” in 2010 Conf. on Lasers and Electro-Optics (CLEO) and Quantum Electronics and Laser Science Conf. (QELS), 2010, pp. 1–2.

Sung, H.

A. Louri and H. Sung, “An optical multi-mesh hypercube: a scalable optical interconnection network for massively parallel computing,” J. Lightwave Technol., vol. 12, pp. 704–716, 1994.
[CrossRef]

Suzuki, T.

T. Suzuki and H. Tsuda, “Ultrasmall arrowhead arrayed-waveguide grating with V-shaped bend waveguides,” IEEE Photon. Technol. Lett., vol. 17, pp. 810–812, 2005.
[CrossRef]

Takken, T.

M. Blumrich, D. Chen, P. Coteus, A. Gara, M. Giampapa, P. Heidelberger, S. Singh, B. Steinmacher-Burow, T. Takken, and P. Vranas, “Design and analysis of the BlueGene/L torus inter-connection network,” IBM Research Report RC23025 (W0312–022), 2003.

Tan, K.

C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu, “Dcell: a scalable and fault-tolerant network structure for data centers,” in Proc. of the ACM SIGCOMM 2008 Conf. on Data Communication, 2008, pp. 75–86.

D. Li, C. Guo, H. Wu, K. Tan, Y. Zhang, and S. Lu, “FiConn: using backup port for server interconnection in data centers,” in Proc. of INFOCOM, 2009, pp. 2276–2285.

Thacker, H.

Thakur, R.

R. Thakur, R. Rabenseifner, and W. Gropp, “Optimizing of collective communication operations in MPICH,” Int. J. High Perform. Comput. Appl., vol. 19, pp. 49–66, 2005.
[CrossRef]

Tsang, W. T.

F. Soares, J. H. Baek, N. Fontaine, X. Zhou, Y. Wang, R. Scott, J. Heritage, C. Junesand, S. Lourdudoss, K. Y. Liou, R. Hamm, W. Wang, B. Patel, S. Vatanapradit, L. Gruezke, W. T. Tsang, and S. J. B. Yoo, “Monolithically integrated InP wafer-scale 100-channel × 10-GHz AWG and Michelson interferometers for 1-THz-bandwidth optical arbitrary waveform generation,” in Optical Fiber Communication Conf. (OFC), 2010, OThS1.

Tsuda, H.

T. Suzuki and H. Tsuda, “Ultrasmall arrowhead arrayed-waveguide grating with V-shaped bend waveguides,” IEEE Photon. Technol. Lett., vol. 17, pp. 810–812, 2005.
[CrossRef]

Tucker, R. S.

Umezawa, K.

S. Habata, K. Umezawa, M. Yokokawa, and S. Kitawaki, “Hardware system of the Earth Simulator,” Parallel Comput., vol. 30, pp. 1287–1313, 2004.
[CrossRef]

Vahdat, A.

N. Farrington, G. Porter, S. Radhakrishnan, H. H. Bazzaz, V. Subramanya, Y. Fainman, G. Papen, and A. Vahdat, “Helios: a hybrid electrical/optical switch architecture for modular data centers,” in Proc. of the ACM SIGCOMM 2010 Conf. on Data Communication, 2010, pp. 339–350.

M. Al-Fares, A. Loukissas, and A. Vahdat, “A scalable, commodity data center network architecture,” in Proc. of the ACM SIGCOMM 2008 Conf. on Data Communication, 2008, pp. 63–74.

Van-trease, D.

N. Binkert, A. Davis, M. Lipastiy, R. Schreiber, and D. Van-trease, “Nanophotonic barriers,” in Workshop on Photonic Interconnects & Computer Architecture (in conjunction with MICRO 41), 2009, pp. 1–4.

Vatanapradit, S.

F. Soares, J. H. Baek, N. Fontaine, X. Zhou, Y. Wang, R. Scott, J. Heritage, C. Junesand, S. Lourdudoss, K. Y. Liou, R. Hamm, W. Wang, B. Patel, S. Vatanapradit, L. Gruezke, W. T. Tsang, and S. J. B. Yoo, “Monolithically integrated InP wafer-scale 100-channel × 10-GHz AWG and Michelson interferometers for 1-THz-bandwidth optical arbitrary waveform generation,” in Optical Fiber Communication Conf. (OFC), 2010, OThS1.

Vegetti, G.

S. Bregni, A. Pattavina, and G. Vegetti, “Architectures and performance of AWG-based optical switching nodes for IP networks,” IEEE J. Sel. Areas Commun., vol. 21, no. 7, pp. 1113–1121, 2003.
[CrossRef]

Vranas, P.

M. Blumrich, D. Chen, P. Coteus, A. Gara, M. Giampapa, P. Heidelberger, S. Singh, B. Steinmacher-Burow, T. Takken, and P. Vranas, “Design and analysis of the BlueGene/L torus inter-connection network,” IBM Research Report RC23025 (W0312–022), 2003.

Wang, W.

F. Soares, J. H. Baek, N. Fontaine, X. Zhou, Y. Wang, R. Scott, J. Heritage, C. Junesand, S. Lourdudoss, K. Y. Liou, R. Hamm, W. Wang, B. Patel, S. Vatanapradit, L. Gruezke, W. T. Tsang, and S. J. B. Yoo, “Monolithically integrated InP wafer-scale 100-channel × 10-GHz AWG and Michelson interferometers for 1-THz-bandwidth optical arbitrary waveform generation,” in Optical Fiber Communication Conf. (OFC), 2010, OThS1.

Wang, Y.

F. Soares, J. H. Baek, N. Fontaine, X. Zhou, Y. Wang, R. Scott, J. Heritage, C. Junesand, S. Lourdudoss, K. Y. Liou, R. Hamm, W. Wang, B. Patel, S. Vatanapradit, L. Gruezke, W. T. Tsang, and S. J. B. Yoo, “Monolithically integrated InP wafer-scale 100-channel × 10-GHz AWG and Michelson interferometers for 1-THz-bandwidth optical arbitrary waveform generation,” in Optical Fiber Communication Conf. (OFC), 2010, OThS1.

Worhoff, K.

N. Ismail, A. C. Baclig, P. J. Caspers, F. Sun, K. Worhoff, R. M. d. Ridder, M. Pollnau, and A. Driessen, “Design of low-loss arrayed waveguide gratings for applications in integrated Raman spectroscopy,” in 2010 Conf. on Lasers and Electro-Optics (CLEO) and Quantum Electronics and Laser Science Conf. (QELS), 2010, pp. 1–2.

Wu, H.

D. Li, C. Guo, H. Wu, K. Tan, Y. Zhang, and S. Lu, “FiConn: using backup port for server interconnection in data centers,” in Proc. of INFOCOM, 2009, pp. 2276–2285.

C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu, “Dcell: a scalable and fault-tolerant network structure for data centers,” in Proc. of the ACM SIGCOMM 2008 Conf. on Data Communication, 2008, pp. 75–86.

Ye, X.

X. Ye, P. Mejia, Y. Yin, R. Proietti, S. J. B. Yoo, and V. Akella, “DOS—A scalable optical switch for datacenters,” in Proc. of ACM/IEEE Symp. on Architectures for Networking and Communications Systems, 2010, pp. 1–12.

Yin, Y.

X. Ye, P. Mejia, Y. Yin, R. Proietti, S. J. B. Yoo, and V. Akella, “DOS—A scalable optical switch for datacenters,” in Proc. of ACM/IEEE Symp. on Architectures for Networking and Communications Systems, 2010, pp. 1–12.

Yokokawa, M.

S. Habata, K. Umezawa, M. Yokokawa, and S. Kitawaki, “Hardware system of the Earth Simulator,” Parallel Comput., vol. 30, pp. 1287–1313, 2004.
[CrossRef]

Yoo, S. J. B.

X. Ye, P. Mejia, Y. Yin, R. Proietti, S. J. B. Yoo, and V. Akella, “DOS—A scalable optical switch for datacenters,” in Proc. of ACM/IEEE Symp. on Architectures for Networking and Communications Systems, 2010, pp. 1–12.

F. Soares, J. H. Baek, N. Fontaine, X. Zhou, Y. Wang, R. Scott, J. Heritage, C. Junesand, S. Lourdudoss, K. Y. Liou, R. Hamm, W. Wang, B. Patel, S. Vatanapradit, L. Gruezke, W. T. Tsang, and S. J. B. Yoo, “Monolithically integrated InP wafer-scale 100-channel × 10-GHz AWG and Michelson interferometers for 1-THz-bandwidth optical arbitrary waveform generation,” in Optical Fiber Communication Conf. (OFC), 2010, OThS1.

Yoshikuni, Y.

Y. Yoshikuni, “Semiconductor arrayed waveguide gratings for photonic integrated devices,” IEEE J. Sel. Top. Quantum Electron., vol. 8, pp. 1102–1114, 2002.
[CrossRef]

Zajic, A.

J. Oh, M. Prvulovic, and A. Zajic, “TLSync: support for multiple fast barriers using on-chip transmission lines,” in Proc. of the 38th Int. Symp. on Computer Architecture, 2011, pp. 105–116.

Zhang, Y.

C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu, “Dcell: a scalable and fault-tolerant network structure for data centers,” in Proc. of the ACM SIGCOMM 2008 Conf. on Data Communication, 2008, pp. 75–86.

D. Li, C. Guo, H. Wu, K. Tan, Y. Zhang, and S. Lu, “FiConn: using backup port for server interconnection in data centers,” in Proc. of INFOCOM, 2009, pp. 2276–2285.

Zheng, X.

Zhong, W. D.

Zhou, X.

F. Soares, J. H. Baek, N. Fontaine, X. Zhou, Y. Wang, R. Scott, J. Heritage, C. Junesand, S. Lourdudoss, K. Y. Liou, R. Hamm, W. Wang, B. Patel, S. Vatanapradit, L. Gruezke, W. T. Tsang, and S. J. B. Yoo, “Monolithically integrated InP wafer-scale 100-channel × 10-GHz AWG and Michelson interferometers for 1-THz-bandwidth optical arbitrary waveform generation,” in Optical Fiber Communication Conf. (OFC), 2010, OThS1.

Figures (14)

Fig. 1. (Color online) Wavelength routing of a 5 × 5 AWGR.

Fig. 2. (Color online) The processor node diagram.

Fig. 3. (Color online) An all-to-all connection is realized by connecting k nodes to a k × k AWGR.

Fig. 4. The one-step barrier scheme (S1BS).

Fig. 5. (Color online) Examples of using 8 × 8 AWGRs to connect 20 nodes: (a) using 4 CNs, (b) using 5 CNs w/WCMs, (c) wavelength conversion at each WCM, (d) a WCM example, (e) using 5 CNs without WCMs.

Fig. 6. The two-step barrier scheme (S2BS-I).

Fig. 7. (Color online) An example of the AGCNet-I with two levels of hierarchy by using the proposed connection scheme to avoid the placement of WCMs. CN(i, j): the jth CN in CN 1 (i).

Fig. 8. (Color online) Using one 5 × 5 AWGR “switch” to connect five CNs.

Fig. 9. The three-step barrier scheme (S3BS-II).

Fig. 10. (Color online) An example of a two-level AGCNet-II with k = 5.

Fig. 11. (Color online) A modified two-level AGCNet-II based on an SCN.

Fig. 12. (Color online) An example of (a) parallel reduce operations and (b) the corresponding tree construction on a two-level AGCNet-I network: processors 7 and 14 are assigned to output the reduce results for key 1 and key 2, respectively; the intra-CN and inter-CN connections at different hierarchies are shown in (c).

Fig. 13. (Color online) The estimated latency to complete a global barrier synchronization for N processors. k: the size of AWGRs, s: the number of steps to complete a global barrier, h_1: the number of levels of the hierarchy following the AGCNet-I structure, h_2: the number of levels of the hierarchy following the AGCNet-II structure.

Fig. 14. (Color online) The per-processor energy consumption estimation for the G-AGCNet. k: the size of AWGRs, s: the number of steps to complete a global barrier, h_1: the number of levels of the hierarchy following the AGCNet-I structure, h_2: the number of levels of the hierarchy following the AGCNet-II structure.

Tables (2)

Table I. The Best Parameters Under Which Eq. (9.1) is Maximized Given Different s and k

Table II. The Number of Steps to Complete One Global Barrier Synchronization in G-AGCNet, Flattened Butterfly, Dragonfly, and DCell

Equations (13)

P mod ( 2 j 1 , m 1 ) j P mod ( 2 j 1 , m 1 ) m P mod ( i , m 1 ) j P mod ( i , m 1 ) i j + 1 ,
P mod ( i , m ) j P mod ( i , m ) i j + 1 ,
$$N = \prod_{i=1}^{h} (m_i + 1) \left( k - \sum_{i=1}^{h} m_i \right).$$
$$N = 2^{h} a^{h - h_{11}} (a + 1)^{h_{11}} \left( k - 2ah + h - 2h_{11} \right),$$
N = ( k h + 1 ) h h ( k h ) ( h 2 ) .
h 2 = i = 1 h 20 ( h 2 i + 1 ) ,
N = k i = 1 h 20 h 2 i + 1 h 2 h 2 k i = 1 h 20 h 2 i .
s = ( h 20 + 1 ) + ( h 20 + 2 ) i = 1 h 20 ( h 2 i + 1 ) .
s = 1 + h 1 ( h 2 = 0 ) 3 + h 1 ( h 2 = 1 ) h 1 + ( h 20 + 1 ) + ( h 20 + 2 ) h 2 ( h 2 > 1 ) .
N = ( n + 1 ) h 2 i = 1 h 1 ( m i + 1 ) h 2 n n = k j = 1 h 20 h 2 j i = 1 h 1 m i , h 2 = j = 1 h 20 ( h 2 j + 1 ) ,
i = 1 h 1 m i + j = 1 h 20 h 2 j < k , h 1 0 , m i 0 ( 1 i h 1 ) , h 20 1 , h 2 j 1 ( 1 j h 20 ) ,
N = ( k h 2 ( 2 a 1 ) h 1 2 h 11 + 1 ) h 2 2 h 1 a h 1 h 11 ( a + 1 ) h 11 h 2 ( k h 2 ( 2 a 1 ) h 1 2 h 11 ) ,
h 2 + ( 2 a 1 ) h 1 + 2 h 11 < k a = ( k h 2 2 ) / ( 2 ( h 1 + 1 ) ) 0 h 1 h , 0 h 11 h 1 h 2 = 0 , 1 ,  or  2 b ( b N , b h / 2 ) .
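
The third and fourth equations in the list above appear to give the node count N of a hierarchy built from k × k AWGRs, first in general form and then in closed form. The sketch below is a hedged numerical check of that reading, not the authors' code. It assumes the general form is N = prod(m_i + 1) * (k - sum(m_i)) over h levels, and that the closed form follows from setting m_i = 2a + 1 on h_11 levels and m_i = 2a - 1 on the remaining h - h_11 levels. The function names and the sample values (k = 32, h = 2, a = 3, h_11 = 1) are illustrative assumptions.

```python
from math import prod


def nodes_general(k, m):
    """Assumed general form: N = prod(m_i + 1) * (k - sum(m_i))."""
    return prod(mi + 1 for mi in m) * (k - sum(m))


def nodes_closed_form(k, h, a, h11):
    """Assumed closed form: N = 2^h * a^(h - h11) * (a + 1)^h11 * (k - 2ah + h - 2*h11)."""
    return 2**h * a**(h - h11) * (a + 1)**h11 * (k - 2 * a * h + h - 2 * h11)


def level_sizes(h, a, h11):
    """Hypothetical helper: m_i = 2a + 1 on h11 levels, 2a - 1 on the other h - h11 levels."""
    return [2 * a + 1] * h11 + [2 * a - 1] * (h - h11)


if __name__ == "__main__":
    k, h, a, h11 = 32, 2, 3, 1  # illustrative values only
    m = level_sizes(h, a, h11)
    print(nodes_general(k, m), nodes_closed_form(k, h, a, h11))  # both print 960
```

Under these assumptions the two forms agree term by term: prod(m_i + 1) collapses to 2^h a^(h - h_11) (a + 1)^(h_11), and sum(m_i) equals 2ah - h + 2h_11.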