Abstract

Outline
- Introduction
- Overview of Recovery Approaches
- Multilayer Networks
- Multilayer Recovery
- Enabling Technologies for Multilayer Recovery
- Conclusion

Failures in Networks
- Planned vs. unplanned outages
  - Planned: intentionally caused by operational or maintenance procedures
  - Unplanned: difficult to predict
- Internal vs. external causes
  - Internal: caused by a network-internal imperfection (e.g., design error, battery breakdown, component defect)
  - External: caused by a surrounding event (e.g., electricity breakdown, storm, earthquake, sabotage, vandalism)
- Commonly occurring failures
  - Cable cuts: related to link length; time between cuts is between 50 and 200 days per 1000 km of cable
  - Equipment failures

Failures in Optical IP Networks
- Ports on the client equipment, or connections between client equipment and optical-layer equipment
- Optical-layer hardware
- The fiber facility between sites → the least reliable component of optical networks
- IP node failures (operator errors, power outages, software errors)

Recovery Phases
- Fault detection
  - Loss of Signal, Loss of Light, Loss of Lambda, Loss of Modulation, Loss of Clock…
  - Signal degradation (Bit Error Rate, Optical Signal-to-Noise Ratio…)
- Fault localization
  - Link Management Protocol (LMP)
- Fault notification
  - Alarm Indication Signal, Remote Defect Indication
  - LMP: Failure Indication Signal
- Recovery
- Reversion
  - Make-before-break

Overview of Recovery Approaches

Basic Recovery Methods
- Protection: switching to a pre-established recovery path or path segment after the occurrence of a fault
- Restoration: establishing new paths or path segments on demand to restore traffic after the occurrence of a fault

Classification of Recovery Procedures

Example: 1+1 Line Protection (Rx: receiver)
- Fast protection
- No signaling required between end-points
- High cost (duplication of wavelengths and corresponding transponders)

Example: 1:1 Line Protection
- Fast protection (slower than 1+1)
- Signaling required between end-points
- High cost (duplication of wavelengths and corresponding transponders), BUT: possibility of carrying low-priority pre-emptable traffic
- SDH (today) and transparent (economical) alternatives

Example: Restoration

Multilayer Networks

Multilayer Networks: IP/WDM Example
- GFP: Generic Framing Procedure
- IP/MPLS: Internet Protocol / Multiprotocol Label Switching
- OTN: Optical Transport Network
- RPR: Resilient Packet Ring
- SDH/SONET: Synchronous Digital Hierarchy / Synchronous Optical Network
- WDM: Wavelength Division Multiplexing

Multilayer Networks: EU Project ‘NOBEL’ Example

Two-layer Network
Single-Layer Networks
IP Network over WDM

Multilayer Recovery

Multilayer Recovery: Outline
- Single-layer recovery in multilayer networks
- Multilayer recovery
  - Rationale behind the multilayer approach to recovery
  - Multilayer recovery: no coordination
  - Multilayer recovery: escalation strategies
  - Integrated multilayer recovery
  - Multilayer recovery: guidelines
  - Common pool survivability
  - Dynamic multilayer recovery

Single-Layer Recovery in Multilayer Networks: at the Bottom Layer
- Advantages
  - Only a simple root failure has to be treated (efficient for cable cuts)
  - Recovery actions are performed at the coarsest granularity
  - Failures in the bottom layer do not need to propagate through multiple layers before they trigger any recovery action
- Drawbacks
  - Inability to recover from failures in the layers above
  - Inability to restore transit traffic in an isolated client node

Single-Layer Recovery at the Bottom Layer

Single-Layer Recovery in Multilayer Networks: at the Top Layer
- Advantages
  - Simple to recover from higher-layer failures
  - Simple to recover from node failures
  - Possibility to differentiate between flows at the higher layer depending on their importance
- Drawbacks
  - Many recovery actions due to the finer granularity of flows (slow recovery process)
  - Recovery is often complicated by multiple secondary failures (one physical cut can expand to tens of thousands of simultaneous logical link failures at the IP layer)

Single-Layer Recovery at the Top Layer

Secondary Failures (Failure Propagation)

Single-Layer Recovery in Multilayer Networks
- Optical protection
  - Large granularity → few recovery actions
  - Close to the root failure
  - No delay due to failure propagation
  - No need to deal with complex secondary failures
  - Known to be fast (at least protection)
  - BUT: cannot recover from all failures
- IP(-MPLS) recovery
  - Better failure coverage
  - MPLS protection (making use of pre-established backup LSPs) can also be fast
  - BUT:
    - Can be confronted with complex secondary failure scenarios
    - Fine granularity → many recovery actions
    - Increased usage of capacity during recovery → decreased QoS
- Conclusion: combine recovery at both layers

Rationale Behind the Multilayer Approach to Recovery
- Avoid contention between different single-layer recovery schemes
- Promote cooperation and sharing of spare capacity (prevent double protection)
- However, multilayer recovery brings new challenges
- Key questions
  - In which layer or layers should recovery schemes be provided?
  - If multiple layers are chosen for recovery, how are the procedures coordinated?
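
The failure-propagation point above (one physical cut surfacing as many logical link failures at the IP layer) can be illustrated with a minimal Python sketch. The topology, router names, span names, and both helper functions are hypothetical and chosen only for illustration; they are not taken from the tutorial.

```python
# Hypothetical mapping: each IP-layer logical link (router pair) is carried
# by a lightpath that traverses one or more fiber spans.
LOGICAL_LINK_ROUTES = {
    ("R1", "R2"): ["A-B"],
    ("R1", "R3"): ["A-B", "B-C"],
    ("R2", "R3"): ["B-C"],
    ("R1", "R4"): ["A-D"],
    ("R2", "R4"): ["A-B", "A-D"],
}


def secondary_failures(routes, cut_span):
    """Return the logical links that go down when a single fiber span is cut."""
    return sorted(link for link, spans in routes.items() if cut_span in spans)


def failures_per_span(routes):
    """For every fiber span, count the logical links a cut of that span affects."""
    all_spans = {span for spans in routes.values() for span in spans}
    return {span: len(secondary_failures(routes, span)) for span in sorted(all_spans)}


if __name__ == "__main__":
    # One root failure in the optical layer ...
    affected = secondary_failures(LOGICAL_LINK_ROUTES, "A-B")
    # ... appears as several simultaneous failures in the IP layer.
    print("Cut of span A-B affects logical links:", affected)
    print("Logical link failures per span:", failures_per_span(LOGICAL_LINK_ROUTES))
```

In this toy topology a recovery scheme in the optical layer would deal with one cut span, whereas a scheme in the IP layer would have to react to every affected logical link. Real networks carry far more logical links per fiber, which is why the gap in the number of recovery actions grows so large.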
Multilayer Recovery: No Coordination
- Deploy recovery schemes in several layers without any coordination

Multilayer Recovery: Time Charts (No Coordination)
- Instant response to alarms by the higher layer causes unnecessary routing activity, routing instability, and traffic congestion

Multilayer Recovery: Escalation Strategy
- After detecting a failure, a protocol layer waits for a hold-off time before initiating its own protection/restoration process
- The target restoration time at a protocol layer determines the hold-off time for the layer above
- Hold-off time increases as one moves to higher layers
- Protection/restoration at any layer is faster than at the layers above
- Option: a recovery token is sent to the upper layer when recovery in the lower layer fails

Multilayer Recovery: Bottom-Up Escalation

Bottom-Up Escalation
- Advantages
  - Recovery actions are taken at the appropriate granularity (the coarse granularities are handled first; actions at higher layers then have to recover only a small fraction of the traffic)
  - Complex secondary failures are handled only if needed

Top-Down Escalation
- Advantage
  - The higher layer can more easily differentiate traffic, restoring higher-priority traffic first
- Drawbacks
  - The lower layer has no easy way to detect whether a higher layer was able to restore traffic, which leads to complex solutions
  - Efficiency issues (the lower layer “recovers” part of the traffic already recovered in the higher layer)

Integrated Multilayer Recovery
- Coordination between recovery mechanisms in one integrated multilayer recovery scheme
- Problems
  - Need for a global view
  - High intelligence required from the relevant recovery algorithms
  - Implementation complexity

Multilayer Network Recovery: General Recommendations
- High-cost, redundant recovery mechanisms at different layers should be avoided
- Complementary mechanisms at different layers may work together to enhance network reliability
- A network failure should be handled as close as possible to where it happens
- Recovery should be handled at the largest pipe size possible
- IP traffic with a certain QoS may be transported over optical pipes with matching QoS support

PANEL Project Guidelines
- Recovery in the highest layer is recommended when:
  - Multiple reliability grades need to be provided with fine granularity
  - Recovery interworking cannot be implemented
  - Survivability schemes in the highest layer are more mature than in the lowest layer
- Recovery in the lowest layer is recommended when:
  - The number of entities to recover has to be limited/reduced
  - The lowest layer supports multiple client layers and it is appropriate to provide survivability to all services in a homogeneous way
  - Survivability schemes in the lowest layer are more mature than in the highest layer
  - It is difficult to ensure the physical diversity of working and backup paths in the higher layer
- Using unprotected or preemptible server (lower) paths to carry the client (upper) layer spare capacity is recommended to alleviate redundant protection and remain cost-effective

Common Pool Survivability
- Multilayer survivability implies multiple spare capacity pools allocated separately in each layer
- Traditional planning approaches result in poor utilization of resources (redundant or double protection)
- Common pool approach: leave spare client resources unprotected in server layers
- Spare capacity of the higher layer is treated as extra traffic in the lower layer
- A simultaneous failure in both layers cannot be survived by this approach

Dynamic Multilayer Recovery
- Based on modifying the upper-layer topology for recovery purposes
- Requires the ability to modify connections in the lower layer in real time
- Advantage: spare resources are not needed in the upper layer
- But spare capacity is needed in the lower layer to deal with failures in that layer and to allow reconfiguration of the upper layer

Dynamic Multilayer Recovery: Example

Enabling Technologies for Multilayer Recovery
- Decoupling of transport and control planes
- Possible technologies
  - ASON (Automatically Switched Optical Networks)
  - GMPLS (Generalized Multi-Protocol Label Switching)

Definition of ASON
- An automatically switched optical network (ASON) is an optical transport network that has dynamic connection capability
- ASON enables:
  - Improved support for end-to-end provisioning, re-routing, and restoration
  - New transport services, such as bandwidth on demand, rapid service restoration for disaster recovery, switched connections within a private network, etc.
  - Support for a wide range of client signals, e.g., SDH/SONET, IP, ATM, Ethernet, Frame Relay

Logical View of ASON Architecture

Interconnection Models for IP over WDM
- Overlay model: no routing information is shared between the IP and optical domains
- Peer (integrated) model: a single routing domain for all routers and optical crossconnects
- Augmented (hybrid) model: the IP and optical domains can be functionally separated, each running its own routing protocol, but exchanging full reachability information across the UNI

Overlay Model
Peer Model
Augmented Model

Conclusion
- We have to deal with multiple layers
- Integration of IP and optical networks controlled by a common control plane offers new opportunities and challenges related to survivability
- Simple escalation strategies are available
- Integrated multilayer recovery seems to be impractical
- Dynamic multilayer recovery is very promising

References
- P. Demeester, M. Gryseels, A. Autenrieth, C. Brianza, L. Castagna, G. Signorelli, R. Clemente, M. Ravera, A. Jajszczyk, D. Janukowicz, K. Van Doorselaere, Y. Harada, “Resilience in multilayer networks”, IEEE Communications Magazine, vol. 37, no. 8, August 1999, pp. 70-76
- J.-P. Vasseur, M. Pickavet, P. Demeester, Network Recovery: Protection and Restoration of Optical, SONET-SDH, IP, and MPLS, Morgan Kaufmann, 2004

Biography of the Instructor
Andrzej Jajszczyk is a Professor at AGH University of Science and Technology in Krakow, Poland. He received M.S., Ph.D., and Dr Hab. degrees from Poznan University of Technology in 1974, 1979, and 1986, respectively. He spent a year at the University of Adelaide in Australia and two years at Queen’s University in Kingston, Ontario, Canada, as a visiting scientist. He is the author or co-author of seven books and more than 240 papers, as well as 19 patents in the areas of telecommunications switching, high-speed networking, and network management. His current research interests focus on control plane architectures for transport networks, quality of service, and network reliability. He has been a consultant to industry, telecommunications operators, and government agencies in Poland, Australia, Canada, France, Germany, India, and the USA. He was the founding editor of the IEEE Global Communications Newsletter (1994 – 1996), editor of IEEE Transactions on Communications (1993 – 1997), and editor-in-chief of IEEE Communications Magazine (1998 – 2000). In 2004 – 2005 he was ComSoc’s Director of Magazines. In 2006 – 2007 he was Director of the Europe, Africa, and Middle East Region of ComSoc. Since January 2008 he has been Vice-President – Technical Activities of the same Society. He has been involved in the organization of numerous technical and scientific conferences. He was an IEEE Communications Society Distinguished Lecturer. He is a member of the Association of Polish Electrical Engineers and a Fellow of IEEE.

© 2009 Optical Society of America
