Design and Implementation of TWAREN Hybrid Network Management System



Presentation Transcript


  1. Design and Implementation of TWAREN Hybrid Network Management System National Center for High-Performance Computing Speaker: Ming-Chang Liang & Li-Chi Ku

  2. Outline • Introduction • Motivation • Issues • Design • Implementation • Future works

  3. Introduction

  4. About TWAREN • Construction of TWAREN (TaiWan Advanced Research & Education Network) was completed at the end of 2003, and the network began operation and service at the beginning of 2004. • In its initial phase, IP routing was the main service provided. • The network management programs that came with the purchased network equipment included CIC, Webtop, CW2K, HP OpenView, HP NNM and other solutions.

  5. Initial phase of TWAREN [Network diagram. Link types: 10GE, STM-64/OC-192, STM-16/OC-48, GE. Device labels: C6509, C7609, GSR, EBT10GE. Site labels: Taipei, Hsinchu, Taichung, Tainan, MOECC, NTU, NCCU, ASCC, NDHU, NCU, CCU, NHLTC, NCTU, NTHU, NTTU, NCHU, NYSU.]

  6. Initial phase of NMS [Architecture diagram. Component labels: Remedy Help Desk, WebTop, Notification Gateway, API, CLI, ISM, Cisco Info Center, Probe, SMTP, HTTP, FTP, DNS, CW2K (DFM), NNM, CTM. Flow labels: Trap, PING, Polling. Device labels: 12416, 7609, 3750, 2522, 2600, 15600, 15454, NAM.]

  7. Phase 2 of TWAREN • At the end of 2006, TWAREN was upgraded with more protection methods and better availability; the result is called TWAREN phase 2. • Tens of optical switches and hundreds of lightpaths then served as the foundation of the layer 2 VLAN services and the layer 3 IP routing services. • In 2008, tens of VPLS switches were added to provide an additional multipoint VPLS VPN service. • The layer 1 lightpaths can be protected by SNCP, the layer 2 VLANs by spanning tree recalculation, and the layer 2 VPLS by fast reroute technology. • All these improvements transformed TWAREN phase 2 into a true hybrid network capable of providing multiple layers of services and high availability.

  8. Architecture of TWAREN phase 2 [Network diagram. Link types: STM64, STM16, 10GE, GE. Device labels: 6509, 7609, 7609C, 3750, 12816, 15454, 15600. Site labels: Taipei, Hsinchu, Taichung, Tainan, NCCU, NIU, ASCC, NTU, NDHU, NCU, NCNU, MOEcc, NHLTC, NCTU, NCHC, NCHU, NTTU, NTHU, NSYSU, CCU, NCKU.]

  9. Motivation

  10. Why do we need a new NMS? • The architecture of TWAREN phase 2 became more and more complicated. • Since TWAREN phase 2 has more protection methods, a single hardware or circuit failure will not interrupt the service provided to end users. • The initial NMS was no longer adequate for the hybrid network because it is hard to determine and predict the correlation between failures and affected services.

  11. Requirements for the new NMS • Automatically determine the correlation between failures, affected services, affected customers and severity level on this highly protected network. • Provide a single integrated visual user interface. • Use an integrated database, logs, message flows and exchange protocols. • After several surveys, we decided to develop a new NMS suitable for monitoring all services provided by TWAREN phase 2.

  12. Issues

  13. Uncertainty of SNMP implementation • SNMP TRAP/MIB implementations differ even among equipment of the same brand. • The SNMP OIDs or the returned values may change after an OS upgrade on the same equipment, and such changes are usually hard to discover beforehand. • Therefore, the system must be designed so that these changes can be accommodated with minimal modifications.
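
One way to keep such changes cheap is to hold the OIDs in data rather than in code, so that an OS upgrade only requires a new table entry. The following is a minimal sketch of that idea, not the TWAREN implementation: the profile table, the function name `resolve_oid`, the metric name `cpu_load`, and the OIDs themselves (under a placeholder enterprise number) are all assumptions made for illustration.

```python
# Hypothetical profile table: one entry per (model, OS version prefix).
# The OIDs are placeholders for illustration, not real vendor OIDs.
OID_PROFILES = [
    {"model": "7609", "os_prefix": "12.2", "oids": {"cpu_load": "1.3.6.1.4.1.99999.1.1"}},
    {"model": "7609", "os_prefix": "15.", "oids": {"cpu_load": "1.3.6.1.4.1.99999.2.1"}},
]


def resolve_oid(model, os_version, metric):
    """Return the OID for a logical metric, or None if no profile matches."""
    for profile in OID_PROFILES:
        if profile["model"] == model and os_version.startswith(profile["os_prefix"]):
            return profile["oids"].get(metric)
    return None
```

Collectors would then ask for a logical metric such as `cpu_load` and never hard-code an OID; a `None` result signals that a new profile entry is needed after an upgrade.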

  14. The lack of skilled programmers • Our programmers are the same people as the members of the operations team. • We are not professional programmers and do not share a common programming language. • The system must be partially available and operational during the early phase of its development so that it can evolve along with the real needs. • So, a unified standard of communication between different modules is necessary.
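
As an illustration of what a unified communication standard between modules could look like, the sketch below uses a small JSON envelope that any programming language can read and write. The slides do not describe the actual format; the field names (`source`, `type`, `timestamp`, `payload`) and both function names are hypothetical.

```python
import json

# Hypothetical envelope fields; the slides do not specify the real format.
REQUIRED_FIELDS = ("source", "type", "timestamp", "payload")


def make_message(source, msg_type, timestamp, payload):
    """Serialize one inter-module message as a JSON string."""
    return json.dumps(
        {"source": source, "type": msg_type, "timestamp": timestamp, "payload": payload},
        sort_keys=True,
    )


def parse_message(text):
    """Parse a message and reject it if a required field is missing."""
    msg = json.loads(text)
    missing = [field for field in REQUIRED_FIELDS if field not in msg]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return msg
```

A text envelope of this kind lets a module written in one language be replaced or extended without touching the modules that consume its messages.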

  15. Huge historical data and computing • To minimize the false positive and false negative rates, baseline thresholds are of much better quality when they are generated dynamically from historical data. • Therefore, we need to store sufficiently large historical data sets and to retrieve them very efficiently when calculating those thresholds.
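
The slides do not say how the dynamic thresholds are calculated. As one common approach, assumed here purely for illustration, a baseline band can be derived from the mean and sample standard deviation of the historical values. The function names and the default multiplier `k` are hypothetical.

```python
from statistics import mean, stdev


def baseline_threshold(history, k=3.0):
    """Return (low, high) bounds as mean +/- k sample standard deviations."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def is_anomalous(value, history, k=3.0):
    """Flag a new measurement that falls outside the baseline band."""
    low, high = baseline_threshold(history, k)
    return value < low or value > high
```

A wider band (larger `k`) lowers the false positive rate at the cost of more false negatives, which is the trade-off the slide refers to. In practice the history would typically be selected per object and per time window.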

  16. Automatically determine affected services and customers • TWAREN phase 2 inherently guards against a single hardware or circuit failure, so such a failure is less likely to affect actual service provisioning. • An intelligent management system that can determine which services a failure actually affects will reduce the management cost.
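
The reasoning on this slide can be sketched as follows, under a simplified model that is an assumption and not the TWAREN design: each service has one or more alternative paths, each path is a list of components, and a service is down only when every path contains a failed component. All names in the example are hypothetical.

```python
def service_status(paths, failed):
    """Classify one service given its alternative paths and the failed components.

    paths  -- list of paths, each a list of component names
    failed -- set of component names currently failed
    """
    if not paths:
        raise ValueError("a service needs at least one path")
    broken = [bool(set(path) & failed) for path in paths]
    if all(broken):
        return "down"       # every alternative path is hit
    if any(broken):
        return "degraded"   # protection absorbed the failure
    return "ok"


def affected_services(services, failed):
    """Map each service name to its status, omitting unaffected services."""
    result = {}
    for name, paths in services.items():
        status = service_status(paths, failed)
        if status != "ok":
            result[name] = status
    return result
```

Under this model a single failure on a protected service is reported as `degraded` rather than `down`, which separates alarms that need urgent action from those that only reduce redundancy.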

  17. Design

  18. 1st Stage System Architecture [Architecture diagram. Blocks: Control API, GUI & Ticket System, Monitor Objs, Fault Detection, Fault Location, Data Collectors, Threshold Analyzer, Auto Action, Mirror, Report System. Data sources: Traps, MIBs, Syslogs, Net flows, Telnet/SSH, TL1. Databases: Current Status DB, Threshold DB, Long Term DB, Case/Action DB. Legend: Interactive, Passive.]

  19. Relationship of Data Tables • Basic Data Tables: Component, People, Location, Unit, Vendor, etc. • Relationship Tables: Circuit, VLAN Services, VPLS Services, ONS Light Path, ONS Cross Connection, etc.

  20. Basic Data Tables • Component Data Table • Vendor Data Table • People Data Table • Unit Data Table • Location Data Table

  21. Relationship Data Tables • Circuit Data Table • ONS Topology Link Table • ONS Light Path Table • ONS Cross Connection Table
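
To show how basic tables and relationship tables can work together, here is a minimal sketch using an in-memory SQLite database. Only the table names Location, Component and Circuit come from the slides; every column, every row and both function names are assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create two basic tables and one relationship table with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE component (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            location_id INTEGER REFERENCES location(id)
        );
        -- Relationship table: a circuit joins two components (hypothetical columns).
        CREATE TABLE circuit (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            a_end INTEGER REFERENCES component(id),
            z_end INTEGER REFERENCES component(id)
        );
        """
    )
    conn.executemany("INSERT INTO location VALUES (?, ?)", [(1, "Taipei"), (2, "Hsinchu")])
    conn.executemany(
        "INSERT INTO component VALUES (?, ?, ?)",
        [(1, "demo-a", 1), (2, "demo-b", 2), (3, "demo-c", 2)],
    )
    conn.executemany(
        "INSERT INTO circuit VALUES (?, ?, ?, ?)",
        [(1, "ckt-ab", 1, 2), (2, "ckt-bc", 2, 3)],
    )
    return conn


def circuits_touching(conn, component_name):
    """List the circuits that terminate on the named component."""
    rows = conn.execute(
        """
        SELECT c.name FROM circuit AS c
        JOIN component AS k ON k.id IN (c.a_end, c.z_end)
        WHERE k.name = ? ORDER BY c.name
        """,
        (component_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

A query of this kind is a first step from a failed component to the circuits, and then the services, that depend on it.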

  22. Implementation

  23. Current monitor objects • Trap monitor: used interfaces, BGP, etc. • Environment of equipment room: temperature (auto threshold), voltage • Status of equipment: temperature, CPU, RAM, fans, power supply • BGP peering with other networks: status, number of exchanged routes (auto threshold), utilization analysis • Performance monitor: end-to-end RTT (auto threshold), end-to-end packet loss rate (auto threshold), end-to-end availability • Throughput: backbone (auto threshold), designated interfaces • Top N: bytes, flows, packets • Routes monitor: the routes of customers (exact comparison) • VPLS VPN: throughput of CE side, MACs of VPN • Optical network: current topology of lightpaths • VLAN: current topology of VLANs
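
The "exact comparison" used by the routes monitor can be pictured as a set difference between the routes a customer is expected to announce and the routes actually observed. This is a minimal sketch under that assumption; the function name is hypothetical and the prefixes in the usage example are documentation ranges, not TWAREN routes.

```python
def compare_routes(expected, observed):
    """Return the prefixes that are missing and the ones that are unexpected."""
    expected_set = set(expected)
    observed_set = set(observed)
    return {
        "missing": sorted(expected_set - observed_set),
        "unexpected": sorted(observed_set - expected_set),
    }


if __name__ == "__main__":
    diff = compare_routes(
        ["192.0.2.0/24", "198.51.100.0/24"],
        ["192.0.2.0/24", "203.0.113.0/24"],
    )
    print(diff)
```

Because the comparison is exact, any non-empty list is an alarm condition, so this monitor needs no threshold.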

  24. Future works • Combine all developed monitor objects into a single integrated visual user interface. • Enhance the monitoring of the optical, VPLS and VLAN networks. • Automatically determine the fault location, root cause and affected scope. • Minimize the false positive and false negative rates.
