Abstract

We evaluate the performance of three-dimensional optoelectronic computer architectures using basic database operations and parallel benchmark algorithms for numerical computation. We show that the select and join database operations can be performed much faster with an optical interconnection network. Optoelectronic architectures can also perform the fast Fourier transform and sorting benchmarks orders of magnitude faster than electronic supercomputers. An architecture with a sufficiently fast reconfigurable interconnection network can perform the conjugate-gradient benchmark faster than all parallel supercomputers, but its performance is less impressive when a fixed network is used. On the multigrid benchmark, the three-dimensional optoelectronic architecture can also outperform the best parallel supercomputers.

© 1998 Optical Society of America


Figures (14)

Fig. 1. Main OEC architecture for database operations.

Fig. 2. Relative data transfer time for selection operations.

Fig. 3. Relative data transfer time for join operations.

Fig. 4. Three-dimensional OEC architecture when log₂ N is odd.

Fig. 5. Three-dimensional OEC architecture when log₂ N is even.

Fig. 6. Performance comparison of a 3-D OEC and the best parallel supercomputer on an FFT. FPU, floating-point unit.

Fig. 7. Three-dimensional OEC architecture for the bitonic-sorting network.

Fig. 8. Performance comparison of four types of 3-D OEC and the best parallel supercomputer on integer sorting. SP indicates the architecture featuring serial data bit transfer and parallel bit comparison; SS, the architecture featuring serial data bit transfer and serial bit comparison; PS, the architecture featuring parallel data bit transfer and serial bit comparison; and PP, the architecture featuring parallel data bit transfer and parallel bit comparison.

Fig. 9. Three-dimensional OEC architecture for the CG benchmark.

Fig. 10. Performance comparison of the 3-D OEC with the fixed interconnection network and the best parallel electronic supercomputers on the CG benchmark.

Fig. 11. Reconfigurable 3-D OEC for the CG algorithm.

Fig. 12. Performance of the reconfigurable 3-D OEC compared with the best parallel supercomputers on the CG benchmark for several values of the network reconfiguration time t_r.

Fig. 13. Schematic representation of the 3-D OEC for the MG benchmark.

Fig. 14. Comparison of the performance of the 3-D OEC and the best parallel supercomputers on the MG benchmark.

Tables (2)

Table 1. Performance Per One 1-D FFT Stage of the Best Parallel Electronic Supercomputers

Table 2. All the Basic Operations in the CG Algorithm and the Total Number of Times They Are Executed

Equations (24)

$$t_{\mathrm{comm}} = \frac{\text{bits}}{\text{rate}} = \frac{128\ \text{bits}}{r\ \text{bits/s}} = \frac{128}{r}\ \text{s},$$
$$t_b = f\,t_f,$$
$$t_{\mathrm{stage}} = \frac{\log_2 N\, t_f f + (\log_2 N + 1.5)\, t_{\mathrm{comm}}}{N \log_2 N},$$
$$t_{ss} = \left[(m^2 - m + 1)\,w + \frac{m^2 + m - 1}{2}\right] t_u,$$
$$t_{sp} = \left[\frac{3m(m+1)}{2} + \frac{m(m-3)+2}{2}\right] t_{op} + m(m-1)\,w\,t_c = (2m^2+1)\,t_{op} + m(m-1)\,w\,t_c,$$
$$t_{ps} = m(m+1)\,t_c + (m^2 - m + 1)\,w\,t_{op},$$
$$t_{pp} = m(m+1)\,t_c + (2m^2+1)\,t_{op}.$$
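As a rough illustration, the sorting-time expressions for the SP, PS, and PP architectures can be evaluated numerically. The sketch below reads them as t_sp = (2m² + 1) t_op + m(m − 1) w t_c, t_ps = m(m + 1) t_c + (m² − m + 1) w t_op, and t_pp = m(m + 1) t_c + (2m² + 1) t_op. The parameter meanings are assumptions inferred from the Fig. 8 caption, not definitions taken from this page: m is taken to be log₂ of the number of keys, w the key width in bits, t_c the time of one transfer step, and t_op the time of one comparison step. The function names are hypothetical.

```python
def sort_times(m, w, t_c, t_op):
    """Evaluate the SP, PS, and PP sorting-time expressions.

    Assumed meanings (not defined on this page): m = log2 of the number
    of keys, w = key width in bits, t_c = one transfer step,
    t_op = one comparison step.
    """
    t_sp = (2 * m**2 + 1) * t_op + m * (m - 1) * w * t_c   # serial transfer, parallel compare
    t_ps = m * (m + 1) * t_c + (m**2 - m + 1) * w * t_op   # parallel transfer, serial compare
    t_pp = m * (m + 1) * t_c + (2 * m**2 + 1) * t_op       # parallel transfer, parallel compare
    return {"SP": t_sp, "PS": t_ps, "PP": t_pp}


def sp_op_count_unsimplified(m):
    """Operation count of t_sp before simplification:
    3m(m+1)/2 + (m(m-3)+2)/2, which should equal 2m^2 + 1."""
    return (3 * m * (m + 1) + m * (m - 3) + 2) // 2


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    print(sort_times(m=20, w=32, t_c=1e-9, t_op=1e-9))
```

The helper `sp_op_count_unsimplified` is there only to check the simplification step in the t_sp line: the sum 3m(m + 1) + m(m − 3) + 2 equals 4m² + 2, so halving it gives 2m² + 1.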
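$$S = \sum_{i=1}^{n} x_i,$$
$$t_{\mathrm{comm}} = \frac{64}{r}\ \text{s},$$
$$t_{cv} = 2\,t_{\mathrm{comm}} + 2\,t_f.$$
$$t_{mv} = 14{,}000\,t_{cv}.$$
$$t_f + 64\,t_{\mathrm{comm}} + t_f = 2\,t_f + 64\,t_{\mathrm{comm}},$$
$$t_v = 2\,t_f + 64\,t_{\mathrm{comm}} + 64\,t_f + 2\,t_{\mathrm{comm}} + t_f = 67\,t_f + 66\,t_{\mathrm{comm}}.$$
$$t_{sv} = t_{\mathrm{comm}} + t_f.$$
$$t_{cg} = 10{,}977{,}845\,t_f + 10{,}974{,}600\,t_{\mathrm{comm}}.$$
$$t_{mv} = t_r + 294\,(t_{\mathrm{comm}} + t_f + t_r) + t_{\mathrm{comm}} + t_f = 295\,(t_{\mathrm{comm}} + t_f + t_r).$$
$$t_v = t_r + t_{\mathrm{comm}} + t_f + t_r + 13\,(t_{\mathrm{comm}} + t_f) = 2\,t_r + 14\,(t_{\mathrm{comm}} + t_f).$$
$$t_{sv} = t_r + t_{\mathrm{comm}} + t_r + t_{\mathrm{comm}} + t_f = 2\,(t_r + t_{\mathrm{comm}}) + t_f.$$
$$t_{cg} = 128{,}670\,t_{\mathrm{comm}} + 129{,}965\,t_f + 118{,}950\,t_r.$$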
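The two conjugate-gradient totals listed above (10,977,845 t_f + 10,974,600 t_comm for the fixed network; 128,670 t_comm + 129,965 t_f + 118,950 t_r for the reconfigurable one) can be compared directly. The sketch below is a minimal illustration: it evaluates both expressions and solves for the reconfiguration time t_r at which they are equal. The function names and the sample values of r and t_f are assumptions chosen for illustration, not figures from the paper.

```python
def cg_time_fixed(t_f, t_comm):
    """CG benchmark time with the fixed interconnection network."""
    return 10_977_845 * t_f + 10_974_600 * t_comm


def cg_time_reconfigurable(t_f, t_comm, t_r):
    """CG benchmark time with the reconfigurable network;
    t_r is the time needed to reconfigure the network once."""
    return 128_670 * t_comm + 129_965 * t_f + 118_950 * t_r


def break_even_reconfig_time(t_f, t_comm):
    """Value of t_r at which both networks take the same time.
    Below this value the reconfigurable network is faster."""
    saved = cg_time_fixed(t_f, t_comm) - cg_time_reconfigurable(t_f, t_comm, 0.0)
    return saved / 118_950


if __name__ == "__main__":
    # Assumed sample values, for illustration only.
    r = 1e9             # link rate in bits/s (assumption)
    t_comm = 64 / r     # time to send one 64-bit word, as in t_comm = 64/r
    t_f = 10e-9         # time of one floating-point operation (assumption)
    print("break-even t_r [s]:", break_even_reconfig_time(t_f, t_comm))
```

Because the fixed-network total has roughly 85 times more communication and floating-point steps, the reconfigurable design wins whenever 118,950 reconfigurations cost less than the steps they save, which is the condition the break-even function expresses.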
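$$\nabla^2 u = v$$
$$r = v - A u \quad (\text{evaluate residual}), \qquad u = u + M^k r \quad (\text{apply correction}),$$
$$
z_k = M^k r_k:\quad
\begin{cases}
\text{if } k > 1: &
\begin{aligned}
r_{k-1} &= P\,r_k && (\text{restrict residual}), \\
z_{k-1} &= M^{k-1} r_{k-1} && (\text{recursive solve}), \\
z_k &= Q\,z_{k-1} && (\text{prolongate}), \\
r_k &= r_k - A\,z_k && (\text{evaluate residual}), \\
z_k &= z_k + S\,r_k && (\text{apply smoother}),
\end{aligned} \\[1ex]
\text{otherwise:} & z_1 = S\,r_1 \quad (\text{apply smoother}).
\end{cases}
$$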
$$u_{i,j,k} = W_0\,u_{i,j,k} + W_1\left(u_{i\pm1,j,k} + u_{i,j\pm1,k} + u_{i,j,k\pm1}\right) + W_2\left(u_{i\pm1,j\pm1,k} + u_{i\pm1,j,k\pm1} + u_{i,j\pm1,k\pm1}\right) + W_3\,u_{i\pm1,j\pm1,k\pm1},$$
$$t_{mg} = t_{er}(\log_2 N) + \sum_{i=1}^{4}\left\{\sum_{k=1}^{\log_2 N} t_{rr}(k) + t_{er}(1) + \sum_{k=1}^{\log_2 N}\left[t_p(k) + 2\,t_{er}(k+1)\right] + t_{er}(\log_2 N) + t_n\right\},$$
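The multigrid recurrence listed above (z_k = M^k r_k, with the steps restrict residual, recursive solve, prolongate, evaluate residual, apply smoother) has the shape of a standard V-cycle. The following is a structural sketch only, assuming that reading: the operators A, P, Q, and S are passed in as hypothetical callables, and the toy operators in the usage lines act on plain numbers rather than on 3-D grids. It is not the authors' implementation.

```python
def v_cycle(k, r, A, P, Q, S):
    """Return z_k = M^k r_k by following the listed recurrence.

    k: grid level (1 is the coarsest); r: residual on level k.
    A, P, Q, S: callables standing in for the operator, restriction,
    prolongation, and smoother (hypothetical stand-ins).
    """
    if k == 1:
        return S(r)                                   # z_1 = S r_1 (apply smoother)
    r_coarse = P(r)                                   # r_{k-1} = P r_k (restrict residual)
    z_coarse = v_cycle(k - 1, r_coarse, A, P, Q, S)   # z_{k-1} = M^{k-1} r_{k-1} (recursive solve)
    z = Q(z_coarse)                                   # z_k = Q z_{k-1} (prolongate)
    r = r - A(z)                                      # r_k = r_k - A z_k (evaluate residual)
    return z + S(r)                                   # z_k = z_k + S r_k (apply smoother)


if __name__ == "__main__":
    # Toy scalar operators, assumed purely for illustration:
    # A, P, Q are the identity and the smoother halves the residual.
    identity = lambda x: x
    half = lambda x: 0.5 * x
    for level in (1, 2, 3):
        print(level, v_cycle(level, 1.0, identity, identity, identity, half))
```

With these toy operators each extra level halves the remaining error of the scalar problem A z = r, so the result approaches r as the number of levels grows. This shows only the control flow of the recursion, not the cost model in the t_mg expression.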