The paper presents the architecture, implementation and evaluation of the flexible and finely granular Time Shared Optical Network (TSON) metro node. It focuses on the FPGA-based Layer 2 TSON metro node system. The experimentally measured results show a throughput of up to 8.68Gbps per 10Gbps port (95.38% of the theoretical maximum throughput), latency of less than 160μsec and jitter of less than 25μsec. The topology-agnostic TSON node/network also delivers differentiated QoS latency levels, all guaranteed (contention-free), by deploying diverse time-slice allocation schemes.
© 2013 OSA
The increasing demand for emerging applications such as rich media content and cloud services, which utilize distributed data centers, is stimulating the growth of requests for highly dynamic and efficient networking solutions in the metro network. Transparent optical networking is the preferred solution for next-generation networks to support FTTx, mobile backhauling, and intra/inter data center networks. Optical sub-wavelength switching is a promising candidate: it sets up a number of lightpaths over a single wavelength, introducing statistical multiplexing. Optical Packet Switching (OPS) and Optical Burst Switching (OBS) are among the first sub-wavelength switching approaches. OPS, as a close imitation of electronic packet switching, tries to switch and route optical packets, but it is practically hard to implement cost-efficiently, although prototypes have been built. OBS, on the other hand, has proven to be a more realistic approach to implementation than OPS, because it avoids the technological complexities and bottlenecks of OPS, such as optical buffering and packet processing, and instead uses out-of-band resource allocation. However, OBS performance has always suffered from data transport uncertainty or high blocking due to contention among the user traffic.
To address the abovementioned issues, Time Shared Optical Network (TSON) was proposed as the first topology-agnostic sub-wavelength switching architecture to deliver a flexible, statistically multiplexed optical network infrastructure and on-demand, guaranteed contention-free, time-shared multi-granular services. TSON provides multi-granular all-optical services over dynamically established sub-wavelength lightpaths. These advanced and rich features led us to propose and design a complete multi-layer architecture consisting of control, electronic, and optical layers, coordinated by dynamic and intelligent reconfigurable logic (FPGA) between layers. To generate and transmit the time-shared burst traffic and carry out O/E/O operation, a high-performance, custom-programmed electronic line card is needed. The fast development of FPGA technology has made this kind of implementation faster and more flexible. Xilinx has announced what it describes as the industry’s highest-bandwidth FPGA chipsets, featuring up to 16 high-performance 28Gbps transceivers and enabling a single-chip 400G line card implementation with 2.8Tbps of full-duplex serial bandwidth. This can support ultra-high-speed uncompressed 8K video, which requires a 24 Gbps data rate. The latest commercially available Xilinx Virtex-6 HXT FPGA platform was used to operate 10Gbps O/E/O conversions at domain border nodes. Multiple 10Gbps channels with time-slice data set aggregation and segregation are designed and implemented. Figure 1(a) shows the TSON topology, including servers that host virtual machines (VMs) for control and management, four FPGA-based TSON edge nodes, one FPGA-based TSON core node, and optical switches.
This paper, for the first time, delivers an FPGA-based TSON Layer 2 metro node implementation prototype. The paper describes the TSON metro node architecture and the functions of its different layers, and focuses on the implementation of mapping different Ethernet flows (i.e. based on MAC address) to dynamically allocated optical time-slices over one or two wavelengths. As such, the system and testbed are able to deliver sub-wavelength lightpaths in a flexible (100 Mbps minimum bandwidth with equal step size) and high-throughput (up to 8.68Gbps per 10G port) manner while guaranteeing very high QoS (<160 μsec latency and <25 μsec jitter). The rest of the paper is organized as follows: section 2 describes the TSON metro node architecture, section 3 presents the TSON node FPGA implementation details, section 4 sets up the testbed and performs the TSON network-level evaluation, and section 5 concludes the paper.
2. TSON metro node architecture
To deliver dynamic control, Layer 2 O/E/O and optical transport functionalities, the TSON metro node architecture, as shown in Fig. 1(a), consists of the TSON node controller, which hosts the control and management functions; the FPGA, which runs Layer 2 operations; and the physical layer with active and passive optical components. The TSON network offers a hierarchy of three levels of resource granularity: connections, frames, and time-slices, as illustrated in Fig. 1(c). A connection refers to a sub-wavelength lightpath established between any two end points in the TSON domain. In order to improve statistical multiplexing of data units, each connection lasts for a number of fixed-size frames, each 1 msec long. Each frame is in turn divided into time-slices, the smallest units of network resource, i.e. the actual sub-lambda resources. The frame length and the number of time-slices inside a frame define the minimum granularity achievable by the TSON network. TSON can establish connections as fine as 100Mbps over 10G links, using frames of 1 msec length and 100 time-slices per frame.
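The relation between line rate, frame length, number of time-slices and minimum granularity can be checked with a few lines of arithmetic. The following is a minimal sketch only; the function names are illustrative and are not part of the TSON implementation. It uses integer units (bits per second, nanoseconds) to avoid rounding.

```python
def slice_granularity_bps(line_rate_bps: int, slices_per_frame: int) -> int:
    """Bandwidth carried by one time-slice per frame (the minimum granularity)."""
    if line_rate_bps % slices_per_frame:
        raise ValueError("line rate is not an integer multiple of the slice count")
    return line_rate_bps // slices_per_frame


def slice_duration_ns(frame_ns: int, slices_per_frame: int) -> int:
    """Nominal duration of one time-slice when the frame is split evenly."""
    return frame_ns // slices_per_frame


def connection_rate_bps(line_rate_bps: int, slices_per_frame: int, allocated: int) -> int:
    """Nominal rate of a connection that owns `allocated` time-slices per frame."""
    if not 0 <= allocated <= slices_per_frame:
        raise ValueError("allocated slices must lie within the frame")
    return allocated * slice_granularity_bps(line_rate_bps, slices_per_frame)


if __name__ == "__main__":
    # Values quoted in the text: 10G link, 1 msec frame, 100 time-slices per frame.
    print(slice_granularity_bps(10_000_000_000, 100))  # bits per second
    print(slice_duration_ns(1_000_000, 100))           # nanoseconds
```

With the values quoted above, one time-slice corresponds to 100 Mbps and lasts 10 μsec, and connection bandwidth grows in equal 100 Mbps steps with each additional time-slice.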
The TSON node controller hosts virtual machines to support an extended generalized multi-protocol label switching (GMPLS) stack that consists of sub-wavelength-enabled RSVP-TE and a multi-layer path computation engine (PCE) that interoperates with TSON’s sub-lambda assignment entity (SLAE) for the TSON dynamic resource assignment. The controller also supports the FPGA control function and an FPGA programming facility that provides dynamic in-system reconfiguration. The FPGA-based Layer 2 system acts as a core node or an edge node. As a core node, it controls the fast PLZT switches (10 nsec) to bypass time-sliced traffic; as an edge node, it communicates with the TSON node controller to get the time-slice allocation and PLZT switch instructions, handles the Ethernet-to-TSON or TSON-to-Ethernet traffic flows, performs O/E/O operation, and controls the PLZT switches. The Layer 1 part of the TSON node consists of a number of passive and active optical components. The PLZT switches (10 nsec) are the main enablers for time-shared sub-wavelength switching and transport. Other components, such as EDFAs, couplers, and multiplexers, were used to establish sub-wavelength time-shared lightpaths (Fig. 1(b)).
TSON edge nodes operate as the TSON domain gateway. Upon a client request for data transmission, the route, wavelength(s) and time-slice(s) are computed and allocated by the PCE. The controller then forwards the time-slice allocation information (a Look-Up Table (LUT)) over the southbound 10GE-based interface to the FPGA, and the server informs the clients that they can transmit data. Finally, based on the LUT information, the FPGA aggregates the Ethernet frames into time-slice data sets, controls the PLZT switches and transmits the data in the allocated time-slices. TSON core nodes do not carry out any data processing operation but switch the traffic optically. Therefore, upon a client’s request, the FPGA-based Layer 2 controls the PLZT switches to set up the path following the PLZT control LUT, which is computed and transmitted by the node controller.
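To make the role of the time-slice allocation LUT concrete, the sketch below models it as a table that maps each (wavelength, time-slice) pair to at most one flow. This is an illustrative data structure only: the class name `TimeSliceLUT`, its methods, and the use of the destination MAC address as the key are assumptions made for the example, not the LUT format used on the FPGA or by the controller. The single-owner rule is what makes an allocation contention-free in this model.

```python
class TimeSliceLUT:
    """Hypothetical in-memory model of a time-slice allocation look-up table."""

    def __init__(self, slices_per_frame=100):
        self.slices_per_frame = slices_per_frame
        self._owner = {}  # (wavelength, slice_index) -> destination MAC

    def allocate(self, dst_mac, wavelength, slice_indices):
        """Reserve time-slices for a flow; refuse any slice that is already owned."""
        slots = [(wavelength, s) for s in slice_indices]
        for w, s in slots:
            if not 0 <= s < self.slices_per_frame:
                raise ValueError(f"slice index {s} is outside the frame")
            if (w, s) in self._owner:
                raise ValueError(f"slice {s} on wavelength {w} is already allocated")
        for slot in slots:
            self._owner[slot] = dst_mac

    def owner(self, wavelength, slice_index):
        """Return the flow that owns a slice, or None if it is unallocated."""
        return self._owner.get((wavelength, slice_index))

    def slices_for(self, dst_mac, wavelength):
        """List the slice indices a flow owns on one wavelength."""
        return sorted(
            s for (w, s), mac in self._owner.items()
            if w == wavelength and mac == dst_mac
        )


if __name__ == "__main__":
    lut = TimeSliceLUT(slices_per_frame=100)
    lut.allocate("00:11:22:33:44:55", wavelength=0, slice_indices=[0, 10, 20])
    print(lut.slices_for("00:11:22:33:44:55", 0))
```

In this model an edge node would consult `owner` to decide which buffer to serve in each time-slice, while a second allocation request for an occupied slice is rejected rather than allowed to collide.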
3. TSON metro node FPGA-based design and implementation
The FPGA-based Layer 2 node is required to deliver 10Gbps networking speed in a fine-granular fashion (100 Mbps per time-slice), with low latency/jitter and reliable performance. The aim of the implementation is to map Ethernet traffic to dynamically allocated optical time-slices over one or two wavelengths and vice versa. To meet these critical requirements of the TSON metro node, the high-performance FPGA-based design uses the on-chip RAM to store the Ethernet frames and time-slice data sets so as to minimize the processing latency. However, due to the limited capacity of the on-chip RAM, the system can hold up to 6.5 time-slices (10µsec each) of data in the FPGA fabric. Furthermore, a preamble needs to be placed in front of the time-slice data set for RX alignment and clock data recovery. Due to the preamble overhead, we experimentally use 91 (out of 100) time-slices (10µsec each) per 1msec frame. The switching gap between time-slices is therefore the 9 unused time-slices of 10µsec each spread over the 91 used time-slices, i.e. (9 × 10µsec)/91 ≈ 1µsec. It is important to note that, after accounting for the FPGA platform interfaces and FPGA on-chip resources, 3 TSON metro nodes can fit into a single-chip FPGA-based Layer 2 platform (Xilinx Virtex6 XC6VHX-380T-2FF1923 based line card).
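The frame budget described above can be reproduced numerically. The sketch below is a minimal illustration under stated assumptions: the function names are ours, the time not occupied by the 91 time-slices is assumed to be shared equally as inter-slice gaps, and the per-slice byte capacity assumes a slice is filled at the full 10 Gbps line rate.

```python
from fractions import Fraction

LINE_RATE_BPS = 10_000_000_000  # 10 Gbps port
FRAME_US = 1000                 # 1 msec frame
SLICE_US = 10                   # 10 usec time-slice
USED_SLICES = 91                # time-slices actually used per frame


def switching_gap_us(frame_us, slice_us, used_slices):
    """Idle time per frame shared equally between the used time-slices."""
    idle_us = frame_us - used_slices * slice_us
    return Fraction(idle_us, used_slices)


def max_rate_bps(line_rate_bps, frame_us, slice_us, used_slices):
    """Line rate scaled by the fraction of the frame occupied by time-slices."""
    return Fraction(line_rate_bps * used_slices * slice_us, frame_us)


def slice_capacity_bytes(line_rate_bps, slice_us):
    """Bytes that fit in one time-slice at line rate (assumption: no other overhead)."""
    bits = Fraction(line_rate_bps * slice_us, 1_000_000)
    return bits / 8


if __name__ == "__main__":
    print(float(switching_gap_us(FRAME_US, SLICE_US, USED_SLICES)))   # gap in usec
    print(float(max_rate_bps(LINE_RATE_BPS, FRAME_US, SLICE_US, USED_SLICES)))
    print(float(slice_capacity_bytes(LINE_RATE_BPS, SLICE_US)))
```

Under these assumptions the gap is 90/91 μsec, i.e. roughly 1 μsec, and 91 time-slices of 10 μsec per 1 msec frame bound the rate at 9.1 Gbps on a 10 Gbps port.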
Figure 2 shows the architecture of the FPGA-based TSON metro node Layer 2 design with 2 nodes on a single chip. The data flow follows the direction of the arrows, crossing a number of clock domains. The link to the node controller is used to update the time-slice allocation LUT and the PLZT switch LUT. Each transceiver of the node works in its own clock domain. In total, there are 13 clock domains in the FPGA design, and signals that cross clock domains need special care to avoid loss of signaling or transport data.
For the ingress function of the node, the transceiver receives Ethernet frames, which are passed to the 10Gbps Ethernet MAC block to discard the preambles and FCS. The processed frames are sorted into different buffers based on their destination MAC address. The aggregation block waits until a burst-length worth of Ethernet frames is ready in the buffers and a time-slice allocation is available, then transmits the bursts into the transmitter FIFOs of the different wavelengths. The transmitter FIFOs adjust the timing to synchronize with the PLZT switch controller and then transmit the burst out through the 10 Gbps link. For the egress function, the 10Gbps receiver receives time-slice data sets and places them in the receiver FIFO; the bursts are then segregated into Ethernet frames, which are transmitted out. The FPGA in the core node controls the PLZT switch following the PLZT switch LUT.
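The ingress behaviour described above (sorting frames by destination MAC address and packing them into time-slice data sets) can be sketched in software. This is a simplified model and not the FPGA logic: the class `IngressAggregator` and its methods are hypothetical, the 12,500-byte slice capacity assumes a 10 μsec slice filled at 10 Gbps with the preamble ignored, frames are never fragmented, and a burst is released as soon as a time-slice is allocated rather than waiting for a full burst length.

```python
from collections import defaultdict, deque

SLICE_CAPACITY_BYTES = 12_500  # assumption: 10 usec at 10 Gbps, preamble ignored


class IngressAggregator:
    """Hypothetical software model of the ingress sorting and aggregation steps."""

    def __init__(self, capacity=SLICE_CAPACITY_BYTES):
        self.capacity = capacity
        self.buffers = defaultdict(deque)  # destination MAC -> queued frames

    def enqueue(self, dst_mac, frame):
        """Sort an incoming frame (bytes) into the buffer of its destination."""
        self.buffers[dst_mac].append(frame)

    def buffered_bytes(self, dst_mac):
        """Total payload currently waiting for one destination."""
        return sum(len(f) for f in self.buffers[dst_mac])

    def next_burst(self, dst_mac, slice_allocated):
        """Return whole frames for one time-slice, or [] if nothing may be sent."""
        queue = self.buffers[dst_mac]
        if not slice_allocated or not queue:
            return []
        burst, used = [], 0
        # Pack frames in arrival order until the next one would overflow the slice.
        while queue and used + len(queue[0]) <= self.capacity:
            frame = queue.popleft()
            burst.append(frame)
            used += len(frame)
        return burst


if __name__ == "__main__":
    agg = IngressAggregator()
    for _ in range(10):
        agg.enqueue("00:11:22:33:44:55", bytes(1500))
    burst = agg.next_burst("00:11:22:33:44:55", slice_allocated=True)
    print(len(burst), agg.buffered_bytes("00:11:22:33:44:55"))
```

Frames that do not fit stay queued for the next allocated time-slice, which is why the spacing of allocated time-slices bounds the buffering the node must provide.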
4. TSON metro node testbed setup and experimental results
The FPGA-based TSON metro node L2 evaluation testbed, including the data generation and evaluation platform, is shown in Fig. 3. The testbed places the FPGA ingress/egress nodes at the edge of the TSON network. At ingress, these nodes carry out the electronic processing of the incoming Ethernet data to form optical time-slices; at egress, they extract the Ethernet frames from the optical time-slices and send them out. The ingress and egress TSON nodes are connected in the core by active and passive optical components, i.e. sets of PLZT switches per wavelength, EDFAs, couplers, AWGs, etc. The TSON node controller/server deploys VMs for network-wide control as well as node control, to dynamically update the PLZT switch LUT and the ingress time-slice allocation LUT in the FPGA over Gigabit Ethernet. A 10Gbps Ethernet/IP network data analyser is used as the Ethernet frame generator, and it also measures and analyses the throughput, frame error rate (FER), latency and jitter of the received Ethernet frames. The FPGA platform consists of the main FPGA board HTG-V6HXT-100G and an extender card HTG-SFP-PLUS-MDL that supports 12 SFP+ optical transceivers.
To evaluate and analyse the characteristics of the implemented TSON Layer 2 metro node, the experiment uses three representative time-slice allocation patterns. Figure 4 shows an example of a 1Gbps time-slice allocation pattern over 91 time-slices (10 μsec each and ~1 μsec inter-time-slice gap) of a 1msec frame over a single wavelength with 1 . Bit ‘1’ represents an allocated time-slice and bit ‘0’ an unallocated time-slice. Pattern 1 over-provisions time-slices (all slices allocated); Pattern 2 evenly spreads the just-enough allocated time-slices for 1Gbps; Pattern 3 also allocates just-enough time-slices but groups as many allocated time-slices together as possible. For patterns 2 and 3, the time-slice allocation is limited by the maximum number of contiguous unallocated time-slices that can be tolerated without any Ethernet frame loss.
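The three allocation styles can be illustrated by generating bit patterns of the kind described above. The sketch below is hypothetical: the function names are ours, the choice of 10 allocated time-slices for 1Gbps assumes a nominal 100 Mbps per time-slice with no overhead, and the number of groups used for the grouped pattern is a free parameter rather than the value used in the experiment. The last helper measures the longest run of unallocated time-slices, treating the pattern as cyclic because frames repeat.

```python
def over_provisioned(n_slices):
    """Pattern 1 style: every time-slice allocated."""
    return [1] * n_slices


def evenly_spread(n_slices, n_alloc):
    """Pattern 2 style: n_alloc time-slices spaced as evenly as possible."""
    pattern = [0] * n_slices
    for k in range(n_alloc):
        pattern[(k * n_slices) // n_alloc] = 1
    return pattern


def grouped(n_slices, n_alloc, n_groups):
    """Pattern 3 style: allocated slices packed into n_groups contiguous blocks."""
    pattern = [0] * n_slices
    base, extra = divmod(n_alloc, n_groups)
    for g in range(n_groups):
        start = (g * n_slices) // n_groups
        size = base + (1 if g < extra else 0)
        for i in range(start, min(start + size, n_slices)):
            pattern[i] = 1
    if sum(pattern) != n_alloc:
        raise ValueError("blocks overlap; use fewer slices or fewer groups")
    return pattern


def max_unallocated_run(pattern):
    """Longest run of 0s, wrapping around because the frame repeats."""
    if 1 not in pattern:
        return len(pattern)
    best = run = 0
    for bit in pattern + pattern:
        run = run + 1 if bit == 0 else 0
        best = max(best, run)
    return best


if __name__ == "__main__":
    for name, p in [("pattern 1", over_provisioned(91)),
                    ("pattern 2", evenly_spread(91, 10)),
                    ("pattern 3", grouped(91, 10, 2))]:
        print(name, "".join(map(str, p)), max_unallocated_run(p))
```

Comparing `max_unallocated_run` across the generated patterns shows why grouping is constrained: the more the allocated time-slices are packed together, the longer the node must buffer traffic between them.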
The TSON metro node experimental results are shown in Fig. 5, which includes the throughput, latency and jitter. No packet loss occurred in any of the experiments reported. Figure 5(a) shows the measured throughput for TSON and Ethernet with 1500B and 64B Ethernet frame sizes. The maximum measured throughput for TSON is 8.68 Gbps (1500B Ethernet frame size). With 91 time-slices in use, the maximum theoretical throughput is 9.1 Gbps, so the utilization is 95.38%. Figure 5(b) shows the maximum number of contiguous unallocated time-slices that can be tolerated without any frame loss. This limit exists because the on-chip RAM and the FPGA processing latency to generate the bursts are fixed. As the bit rate increases, the proportion of FPGA processing latency to the input stream latency increases, and the maximum number of contiguous unallocated time-slices decreases. Figures 5(c) and 5(d) show the latency results (data plane delays caused mainly by the TSON electronics; for delay measurements that also include the control plane, please refer to ) for different time-slice allocations and different Ethernet frame sizes. Allocation pattern 2 resulted in less than 160μs latency. The latency goes down as the bit rate goes up, because the aggregation delay decreases. The latency remains the same when the just-enough time-slices are allocated across two wavelengths using the same patterns. Patterns 1, 2 and 3, with their different distributions of allocated time-slices, affect the latency results. When a burst is ready, pattern 1 needs less time to wait for an allocated time-slice, so pattern 1 performs better than patterns 2 and 3. The same holds for pattern 2 relative to pattern 3. This clearly shows that the TSON node/network can deliver different QoS levels (latency) to match the traffic profile by using the most suitable resource allocation. Pattern 3 delivers 450 μs worst-case latency whereas pattern 2 achieves 160 μs, and the clear QoS latency separation is evident across both frame sizes (Fig. 5(c) and 5(d)) and throughput levels. The jitter CDF results in Fig. 5(e) and 5(f) were measured with time-slice allocation pattern 2, and the jitter for both Ethernet frame sizes is less than 25μs. The jitter improves as the bit rate goes up. Figure 5(e) illustrates that over 99% of the frames are received within 1μs, while Fig. 5(f) shows that over 87.5% of the frames are received within 2μs.
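Two of the quantities discussed above lend themselves to a quick numerical check: the utilization figure, and the time a ready burst may have to wait for its next allocated time-slice. The sketch below is a simplified model under stated assumptions. The function names are ours, the slot period is taken as the frame length divided by the number of time-slices in the pattern, and only the waiting component is modelled; aggregation, processing and propagation delays are ignored, so the output is not a reproduction of the measured latencies.

```python
from fractions import Fraction


def utilization_percent(measured_gbps, theoretical_gbps):
    """Measured throughput as a percentage of the theoretical maximum."""
    return round(100 * measured_gbps / theoretical_gbps, 2)


def longest_unallocated_run(pattern):
    """Longest run of unallocated slices (0s), wrapping across frame boundaries."""
    if 1 not in pattern:
        raise ValueError("pattern has no allocated time-slice")
    best = run = 0
    for bit in pattern + pattern:
        run = run + 1 if bit == 0 else 0
        best = max(best, run)
    return best


def worst_case_wait_us(pattern, frame_us=1000):
    """Upper bound on the wait for the next allocated slice, in microseconds."""
    slot_us = Fraction(frame_us, len(pattern))  # slice plus inter-slice gap
    return longest_unallocated_run(pattern) * slot_us


if __name__ == "__main__":
    print(utilization_percent(8.68, 9.1))      # throughput figures quoted in the text
    print(float(worst_case_wait_us([1] * 91)))  # all slices allocated: no waiting
```

In this model the waiting bound is zero when every time-slice is allocated and grows with the longest gap between allocated time-slices, which is consistent with the ordering of the three patterns reported above.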
This paper reports on the design and implementation of the innovative and flexible topology-agnostic TSON metro node architecture using a high-performance FPGA platform. It is based on a three-layer structure that hosts and supports a dynamic and extended GMPLS-PCE control plane as well as a high-performance FPGA-based Layer 2 and an optical fast-switched (10ns) Layer 1 data plane. The paper mainly focuses on the FPGA-based Layer 2 functionalities, implementation and evaluation. The experimental results show a highly flexible system able to achieve very high QoS (<160μs latency, <25μs jitter) and throughput (8.68 Gbps per 10Gbps port) performance. Three time-slice allocation schemes, all delivering contention-free services, have been benchmarked to demonstrate their differentiated QoS latency levels.
This work is supported by the EC through IST STREP project MAINS (INFSO-ICT-247706) and PIANO+ ADDONAS.
References and links
1. J. Berthold, “Optical Networking for Data Centers Across Wide Area Networks,” in Optical Fiber Communication Conference (OFC), paper OW1J.1, March 2012.
2. H. Furukawa, T. Miyazawa, K. Fujikawa, N. Wada, and H. Harai, “First Development of Integrated Optical Packet and Circuit Switching Node for New-Generation networks,” in European Conference on Optical Communication (ECOC), We.8.A.4, September 2010.
3. S. Yoo, “Optical packet and burst switching technologies for the future photonic internet,” J. Lightwave Technol. 24(12), 4468–4492 (2006).
4. G. S. Zervas, J. Triay, N. Amaya, Y. Qin, C. Cervelló-Pastor, and D. Simeonidou, “Time Shared Optical Network (TSON): a novel metro architecture for flexible multi-granular services,” Opt. Express 19(26), B509–B514 (2011).
5. G. Lara, “Industry’s highest bandwidth FPGA enables world’s first single-FPGA solution for 400G communications line cards,” Xilinx (Nov 2010), http://www.xilinx.com/support/documentation/white_papers/wp385_V7_28G_for_400G_Comm_Line_Cards.pdf
6. D. Simeonidou, “Optical network services for ultra high definition digital media distribution,” in conference of Broadband Communications, Networks and Systems, (2008).
7. G. S. Zervas, B. R. Rofoee, Y. Yan, D. Simeonidou, G. Bernini, G. Carrozzo, and N. Ciulli, “Control and Transport of Time Shared Optical Networks (TSON) in Metro Areas,” Future Network & Mobile Summit 2012, Berlin, Germany, July 2012.
8. K. Nashimoto, D. Kudzuma, and H. Han, “High-speed switching and filtering using PLZT waveguide devices,” in conference of OECC2010, 5–9 July 2010, pp. 540–542.
9. B. R. Rofoee, G. Zervas, Y. Yan, D. Simeonidou, et al., “First Demonstration of ultra-low latency Intra/Inter Data-Centre heterogeneous optical Sub-lambda network using extended GMPLS-PCE Control-Plane,” in European Conference on Optical Communication (ECOC), Th.3.D, (2012).