40G Signal Tap (sniffer) – Yearly Project
40G Signal Tap
Intel: LAN Access Division
Technion: High Speed Digital Systems Lab
By: Leonid Yuhananov & Asaad Malshy
Supervised by: Dr. David Bar-On
Goal
“Tracing 40Gbit Ethernet on a logic analyzer”
We want to tap onto 40G traffic and present it in a useful way.
• Tap: Listen to the link.
  • Sniff the data transmitted on the line.
• Present: View the data on a logic analyzer.
  • Parse the data into Ethernet II frames.
• Useful: Easy to read and good for debug.
  • Only the frames we are interested in are presented.
• Versatile: Highly configurable for debug purposes.
  • We can configure our Rx path at the bit level to suit our needs.
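The "parse into Ethernet II frames" and "only the frames we are interested in" goals can be illustrated in software. The following is a minimal Python sketch, not the project's FPGA logic: it assumes the frame has already been stripped of preamble, SFD and FCS, and the names `parse_ethernet_ii` and `keep_frame` as well as the sample addresses are hypothetical.

```python
import struct


def parse_ethernet_ii(frame: bytes) -> dict:
    """Split a raw frame (no preamble/SFD/FCS) into Ethernet II fields."""
    if len(frame) < 14:
        raise ValueError("frame shorter than the 14-byte Ethernet II header")
    # 6-byte destination MAC, 6-byte source MAC, 2-byte EtherType (big endian)
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "ethertype": ethertype,
        "payload": frame[14:],
    }


def keep_frame(fields: dict, wanted_ethertype: int = 0x0800) -> bool:
    """Hypothetical filter: keep only frames with the EtherType of interest."""
    return fields["ethertype"] == wanted_ethertype


# Made-up sample frame: broadcast destination, arbitrary source, IPv4 EtherType
sample = bytes.fromhex("ffffffffffff" "001b21aabbcc" "0800") + b"payload"
fields = parse_ethernet_ii(sample)
```

Here `keep_frame(fields)` stands in for the idea of presenting only selected frames; the actual selection criteria used in the project are not given in the slides.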
Project Definition – Preliminary
• In the ingress direction:
  • 4 x 10G optical lines in differential operation mode.
  • Representing an IEEE 802.3 40GbE link.
• In the egress direction:
  • 34x4 channels to the logic analyzer.
• Display:
  • Output will be displayed on the logic analyzer in Ethernet II frame structure.
  • Trigger indication.
Project Definition – Revised
• Due to HW limitations, some requirements were revised while maintaining the project's purpose and quality.
• In the ingress direction:
  • 10GBASE-R optical line in differential operation mode.
  • Representing an IEEE 802.3 10GbE link.
  • Highly configurable PHY.
  • Dual pipelined data path: one for the frames and one for the trigger.
  • Low latency.
• In the egress direction:
  • Output at 625 MHz, the top clock speed we could generate on the Altera FPGA.
  • 18-bit-wide bus.
  • External LA sync.
• Display:
  • Output will be displayed on the logic analyzer in Ethernet II frame structure.
  • Trigger indication.
High Level Block Diagram – Initial
[Block diagram. Blocks shown: SFP+ optical modules (x2, twice); AEL2005 and AEL2006 transceiver channels (4 x XAUI x 3.125G; 10.3125G x2); FPGA containing Altera AltGX, Altera 10GBASE-R PHY, 10G word aligner (x4), 40G word aligner, and Altera-to-DDR frequency multipliers (72 lines x 156.25M, x4); Logic Analyzer.]
DCRs
• Due to HW limitations and optimization requirements, we made several Design Change Requests (DCRs).
• We used only one transceiver.
• We used only the faster, more configurable AEL2006.
• The implementation uses the more up-to-date 10GBASE-R protocol.
• There was no need for a 40G aligner, but our 10G aligner was implemented to support future expansion.
• We expanded our trigger mechanism.
• Our logic analyzer output is more elegant and supplies more information.
High Level Block Diagram – Revised
[Block diagram. Blocks shown: SFP+ optical modules (x2); AEL2006 transceiver channels (10.3125G x2); FPGA containing Altera 10GBASE-R PHY, 10G word aligner, trigger detection, MDIO writer / PHY configuration block, and Altera-to-DDR frequency multipliers (72 lines x 156.25M); SYNC signal; Logic Analyzer.]
Description of main blocks
• SFP+ (optical module): Converts the optical signal to an electrical one.
• Transceiver channel:
  • AEL2006: converts data to 10GBASE-R 10.3125G traffic (detailed information is internal).
  • MDIO writer: we use it to write configurations to the AEL, which makes our PHY highly configurable.
• ALTERA 10GBASE-R PHY: converts 10.3125G traffic to 72 lines (64 data and 8 control).
• 10G word aligner: our logic to align data and generate triggers (as defined at the midterm presentation).
• Trigger detection block: detects the word we want to trigger on and captures the data following it.
• Altera to DDR frequency block: multipliers that reduce the number of lines to the logic analyzer by increasing speed.
• Logic Analyzer:
  • Our device for viewing the captured data.
  • A sync signal supplied by the Altera is used to sync our device.
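The trigger detection idea (match a wanted word, then capture what follows) can be modelled in a few lines of Python. This is only a behavioural sketch: the function name, the `mask` parameter for "don't care" bits, and the capture depth are assumptions for illustration and are not taken from the project's HDL.

```python
def capture_after_trigger(words, trigger_word, mask=0xFFFFFFFFFFFFFFFF, depth=4):
    """Find the first 64-bit word matching trigger_word under mask.

    Returns (index_of_match, list_of_up_to_depth_following_words),
    or (None, []) when the trigger word never appears.
    """
    for index, word in enumerate(words):
        if (word & mask) == (trigger_word & mask):
            return index, list(words[index + 1 : index + 1 + depth])
    return None, []


# Made-up stream of 64-bit words; 0xDEADBEEF00000000 plays the trigger word.
stream = [0x1, 0x2, 0xDEADBEEF00000000, 0xA, 0xB, 0xC, 0xD, 0xE]
hit, captured = capture_after_trigger(stream, 0xDEADBEEF00000000)
```

In hardware the same comparison would run on every clock in parallel with the frame path; the sequential loop here only mirrors the behaviour.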
Transceiver channels
• Puma AEL2006: 10GbE dual CDR with EDC.
• Passes 10G HSRXDATA from the SFP+ on as 10G RXDATA for the 10GBASE-R PHY.
• NetLogic Microsystems' Puma AEL2006 device is a dual physical layer retimer, compliant with the IEEE 802.3aq specification.
• The device consolidates the receiver and transmitter SerDes functions on a single chip, along with on-chip clock drivers, multiple loopback features, and PRBS generation and verification for both the line side and the system side.
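Since the AEL2006 is configured over MDIO, it may help to see what an MDIO write looks like on the wire. The sketch below builds IEEE 802.3 Clause 45 management frames as bit strings, assuming the device is accessed with Clause 45 framing. The AEL2006 register map is internal, so the port address, device address, register address and data value used here are placeholders, and the function names are hypothetical.

```python
def mdio_c45_frame(op: str, prtad: int, devad: int, value: int) -> str:
    """Build one 64-bit Clause 45 MDIO frame as a string of '0'/'1'.

    Layout: 32-bit preamble of ones, ST=00, OP, 5-bit port address,
    5-bit device address, TA=10, 16-bit address or data field.
    """
    opcodes = {"address": "00", "write": "01"}
    if op not in opcodes:
        raise ValueError("op must be 'address' or 'write'")
    if not (0 <= prtad < 32 and 0 <= devad < 32):
        raise ValueError("prtad and devad are 5-bit fields")
    if not 0 <= value <= 0xFFFF:
        raise ValueError("address/data field is 16 bits")
    return (
        "1" * 32
        + "00"
        + opcodes[op]
        + format(prtad, "05b")
        + format(devad, "05b")
        + "10"
        + format(value, "016b")
    )


def mdio_write(prtad: int, devad: int, reg: int, data: int) -> list:
    """A Clause 45 register write is an address frame followed by a write frame."""
    return [
        mdio_c45_frame("address", prtad, devad, reg),
        mdio_c45_frame("write", prtad, devad, data),
    ]


# Placeholder values only: port 0, device 1, register 0xC001, data 0x00A5
frames = mdio_write(prtad=0, devad=1, reg=0xC001, data=0x00A5)
```

Each string in `frames` is shifted out on MDIO most significant bit first, one bit per MDC cycle.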
ALTERA 10GBASE-R PHY
• 10GBASE-R PHY: a block from the Altera megafunction library, used to convert 10G RXDATA to 8 bytes of data and 8 bits of control.
• SDR XGMII = single data rate XGMII, 72 bits @ 156.25 MHz.
• The block includes:
  • 10GBASE-R PCS
  • 10.3125-Gbps physical medium attachment (PMA)
  • PHY management functions
• 10GBASE-R PHY functions:
  • 64b/66b encoding/decoding
  • scrambling/descrambling
  • 66b/16b gear-boxing, and data serialization/deserialization
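The numbers on this slide fit together arithmetically, which the short Python check below illustrates. It assumes only the figures quoted in the slides (64 data bits plus 8 control bits at 156.25 MHz, and 64b/66b line coding); the variable names are ours.

```python
from fractions import Fraction

XGMII_CLOCK_MHZ = Fraction(15625, 100)  # 156.25 MHz, kept exact
DATA_BITS = 64
CONTROL_BITS = 8

# Payload rate carried by the 64 data lines, in Gbps
payload_gbps = DATA_BITS * XGMII_CLOCK_MHZ / 1000

# 64b/66b coding adds 2 bits to every 64, giving the serial line rate
line_gbps = payload_gbps * Fraction(66, 64)

# Raw rate of the full 72-line bus (data plus control), in Gbps
bus_gbps = (DATA_BITS + CONTROL_BITS) * XGMII_CLOCK_MHZ / 1000

print(float(payload_gbps), float(line_gbps), float(bus_gbps))
# prints 10.0 10.3125 11.25
```

The 11.25 Gbps figure is the bandwidth the logic analyzer interface has to carry when all 72 lines are forwarded.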
Alignment FPGA blocks
• 10G alignment logic (detailed description in part 1)
  • Rearranges the data coming from the 10GBASE-R PHY
  • Aligns the data to the beginning of the packet
  • Triggers on a matched packet (hard coded)
  • Contains an FSM, rewiring blocks and trigger-capturing FSMs
[Diagram: 10GBASE-R PHY and alignment logic (x2); 72 bits at 156.25M not aligned in, 72 bits at 156.25M aligned out.]
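To make "align the data to the beginning of the packet" concrete, here is a behavioural Python sketch. It assumes the standard XGMII control characters (Start = 0xFB, Idle = 0x07) and models each 72-bit word as 8 lanes of (byte, is_control) pairs. The real block is an FSM with rewiring logic; this model, including the name `align_to_start`, is only an illustration.

```python
START, IDLE = 0xFB, 0x07  # XGMII Start and Idle control characters


def align_to_start(words):
    """Re-chunk 8-lane words so the Start character lands in lane 0.

    words: list of 8-element lists of (byte, is_control) tuples, lane 0 first.
    Returns a new list of 8-lane words beginning at the first Start character,
    padded with Idle characters at the end. Returns [] if no Start is found.
    """
    lanes = [lane for word in words for lane in word]
    try:
        first = lanes.index((START, True))
    except ValueError:
        return []
    lanes = lanes[first:]
    while len(lanes) % 8:
        lanes.append((IDLE, True))
    return [lanes[i : i + 8] for i in range(0, len(lanes), 8)]


# Made-up input: Start arrives in lane 4 of the first word.
idle = (IDLE, True)
word0 = [idle] * 4 + [(START, True)] + [(0x55, False)] * 3
word1 = [(n, False) for n in range(8)]
aligned = align_to_start([word0, word1])
```

After alignment the Start character sits in lane 0 of the first output word, so every frame appears at the same position on the logic analyzer.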
Alignment FPGA blocks
• 40G alignment logic: done, though not verified on hardware due to HW limitations.
  • Contains four 10G alignment blocks
  • Logic that determines the alignment pattern
  • FSMs that align the output according to the 40G protocol
  • Redirection of the trigger signals
  • Arranges the data for the DDR to logic analyzer block
[Diagram: 10GBASE-R PHY and alignment logic (x2), 72 bits at 156.25M aligned; 72x4 bits at 156.25M aligned for the 40G protocol; DDR multipliers.]
DDR – Double Data Rate
• The double data rate block is our output to the outside world (the logic analyzer).
• Since we want to use fewer LA pins at higher speeds, a double data rate is required.
• It should be considered a serializer: from 2 or more lines at a certain data rate to a single line at double (or a higher multiple of) that data rate.
• The operation is based on a high-speed multiplexer, with a wrap-around counter driving its select bits.
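The serializer described above can be modelled as a multiplexer stepped by a wrap-around counter. The Python sketch below uses the widths quoted elsewhere in the slides (72 bits in at 156.25 MHz, 18 bits out); sending the least significant slice first is an assumption, and the function names are hypothetical.

```python
def serialize(wide_words, in_width=72, out_width=18):
    """Turn each in_width-bit word into in_width/out_width narrower words.

    A wrap-around counter 'select' picks which slice the multiplexer outputs;
    the least significant slice goes out first (an assumption of this sketch).
    """
    if in_width % out_width:
        raise ValueError("in_width must be a multiple of out_width")
    ratio = in_width // out_width
    mask = (1 << out_width) - 1
    out = []
    select = 0
    for word in wide_words:
        for _ in range(ratio):
            out.append((word >> (select * out_width)) & mask)
            select = (select + 1) % ratio  # wraps back to 0 after the last slice
    return out


def output_rate_mhz(input_rate_mhz, in_width=72, out_width=18):
    """Word rate needed on the narrow bus to keep up with the wide bus."""
    return input_rate_mhz * in_width / out_width


word = 0x0123456789ABCDEF01  # arbitrary 72-bit test value
slices = serialize([word])
rate = output_rate_mhz(156.25)  # 625.0
```

With a 4:1 ratio the narrow bus must run at 625 MHz, which matches the output speed quoted in the revised project definition.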
DDR Interposer
• A SODIMM to 4 x SoftTouch interposer.
• Market price: over $40k.
• Our price: 3 gray hairs on Leonid's head.
• The interposer was designed by a member of the team.
• It was a distinct effort that makes our project unique.
• It was implemented from scratch with several purposes in mind:
  • Supports the maximum bandwidth of current and future LAs.
  • High connectivity, low latency.
Logic Analyzer
• A logic analyzer is an electronic instrument that displays signals in a digital circuit. It may convert the captured data into timing diagrams, protocol decodes, or state machine traces.
• TLA7000 Series
  • 6,528 logic analyzer channels
  • 500 ps (2 GHz) – serial data
  • 312.5 ps (3.2 GHz) – signal integrity
  • 625 ps (1.6 GHz) – MIPI
40G IXIA
• Our 40G IXIA should have been the hammer and chisel of our debug process.
• IXIA is an industry-leading supplier of networking test equipment.
• We leased our 40G IXIA for 2 weeks.
• It had many features but lacked some that were key to our debug process.
• It was a good experience and allowed us to determine the functionality and compatibility of our design.
Setup
• Our setup was a real system setup.
• Our system sniffed both a production NIC by Intel and an industry-grade IXIA.
• We verified that our system worked under all the conditions we tested.
• Our LA was connected via our interposer.
• The results were shown on the LA screen in vivid and vibrant colors.
Project Constraints and HW limitations
• Our Stratix 4 dev board proved to lack the features needed to enable a 40G link and its processing.
• The DDR to SoftTouch connector is an imperative piece of hardware. Unfortunately, it was not designed as requested.
• A 40G link partner: even though we had the IXIA for 2 weeks, it lacked many debug features.
• One of our dev boards was fried during testing.
• We use MegaCore functions provided by Altera, which are black boxes.
• Our design runs only while connected to a computer. This is not an issue, since the device is coupled with a computer, like most test equipment today.
Why it didn’t work – and what did!
• When we look at the spec, we see the reason for this behavior.
• The spec states that the alignment words do not undergo encoding, while everything else does.
• This exposed the hardware limitation and stopped our 40G effort.
• In our debugging we proved that this was the actual issue.
Why it didn’t work – and what did! • Here is the spec snippet.
Why it didn’t work – and what did!
• When debugging why the 40G link wasn’t working, we started by configuring various loopback modes.
• The 40G link worked when we tried a loopback that bypassed the decoding blocks.
• This proves that the transceivers were working properly, but the decoding mechanism wasn’t.
• The limitation was in the HW, not in our design.
Project deliverables
• We have a working 10GbE tap.
  • Alignment mechanisms.
  • Detection of special IEEE 802.3 words.
• We have a wide array of configurations.
  • A versatile MDIO writer.
  • A triggering mechanism.
• A logic analyzer interface.
  • A working DDR for a generic bit width.
  • A 625 MHz output at a chosen width.
• The ground is ready for a 40GbE tap, assuming a Stratix 5 board.
Project deliverables • Here is a run of our project using the TLA.
Next – expectations for the next project
• Get a 40Gbit link.
• A whole new world: 40G is still young in the industry.
• Get a new dev board that supports QSFP.
• On our side, the ground is laid for the 40G effort: most of the required blocks are working and debugged in HDL simulation.
• Our DDR provides enough frequency multiplication for 40Gbit traffic to be shown on the LA.
Thank you all Don’t Stay tapped for more