
Detailed Results from measurements and simulation


Presentation Transcript


  1. Detailed Results from measurements and simulation
  Status Report on the Combined L1&DAQ implementation
  Wednesday, April 16
  Niko Neufeld

  2. Key Technical issues
  • Gigabit Ethernet Bit Error Rate (BER)
  • Context switching latency
  • Latency due to event queuing in the sub-farm
  • Latency due to the L1 decision sorter
  • Performance of event merging in the NP
  • Performance of the L1 decision sorter (if NP)
  Work presented here has been done by BJ, JPD, AB and NN

  3. Context Switching Latency
  • What is it?
  • On a multi-tasking OS, whenever the OS switches from one process to another, it needs a certain time to do so
  • Why do we worry?
  • Because we run the L1 and the HLT algorithms concurrently on each CPU node
  • Why do we want this concurrency?
  • We want to minimise the idle time of the CPUs
  • We cannot use double buffering in L1 (the latency budget would be halved!)

  4. Priority and Latency
  • Using Linux 2.5.55 we have established two facts about the scheduler:
  • Real-time priorities work: the L1 task will never be interrupted until it finishes
  • The context-switch latency is low: 10.1 ± 0.2 µs
  • Measurements were done on a high-end server (2.4 GHz PIV Xeon, 400 MHz FSB); we should have machines at least 2x faster in 2007
  • Conclusion: the scheme of running both tasks concurrently is sound; a sketch of the measurement set-up follows below
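For context, this is roughly how such a measurement can be set up on Linux: two processes ping-pong a single byte over a pair of pipes under real-time priority, and half the average round-trip time approximates one context switch. This is an illustrative sketch, not the benchmark actually used; the SCHED_FIFO priority of 50, the iteration count, and pinning to CPU 0 are arbitrary choices (build with gcc -O2, run as root for SCHED_FIFO):

    #define _GNU_SOURCE            /* for sched_setaffinity / CPU_SET */
    #include <sched.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    #define ITERS 100000           /* arbitrary; enough to average out noise */

    int main(void)
    {
        int p1[2], p2[2];
        char b = 0;
        struct sched_param sp = { .sched_priority = 50 };  /* assumed RT prio */
        cpu_set_t cpus;
        struct timespec t0, t1;

        /* pin both processes to one CPU so they really have to switch */
        CPU_ZERO(&cpus);
        CPU_SET(0, &cpus);
        sched_setaffinity(0, sizeof cpus, &cpus);

        if (pipe(p1) || pipe(p2)) { perror("pipe"); return 1; }

        if (fork() == 0) {         /* child: echo every byte straight back */
            sched_setscheduler(0, SCHED_FIFO, &sp);
            for (int i = 0; i < ITERS; i++) {
                read(p1[0], &b, 1);
                write(p2[1], &b, 1);
            }
            _exit(0);
        }

        sched_setscheduler(0, SCHED_FIFO, &sp);   /* parent drives the loop */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < ITERS; i++) {
            write(p1[1], &b, 1);   /* wakes the child ...         */
            read(p2[0], &b, 1);    /* ... and waits for its reply */
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9
                  + (double)(t1.tv_nsec - t0.tv_nsec);
        /* each round trip contains two context switches (plus pipe overhead) */
        printf("~%.2f us per switch\n", ns / ITERS / 2.0 / 1000.0);
        return 0;
    }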

  5. Bit Error Rate (BER)
  • Gigabit Ethernet is specified to work over UTP CAT5e cables (1000BaseT)
  • The BER is specified to be < 10^-11, i.e. at full rate one bad packet per 100 s (checked numerically below). Real equipment is much better.
  • Re-transmission (a.k.a. TCP/IP) does not cure the problem: a 1% packet-loss rate costs about 80% of the effective bandwidth
  • BER depends not only on the cable, but particularly also on the end-point (MAC/PHY)
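A quick back-of-the-envelope check of the numbers above (a sketch; the full-size 1500-byte frame is an assumption): at a BER of 10^-11 and full Gigabit line rate, one expects one bit error, and hence roughly one bad packet, every 100 s:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double ber        = 1e-11;    /* specified upper bound for the link  */
        double line_rate  = 1e9;      /* Gigabit Ethernet, bits per second   */
        double frame_bits = 1500 * 8; /* assumed full-size 1500-byte frames  */

        /* a frame is bad if any one of its bits is corrupted */
        double p_bad_frame = 1.0 - pow(1.0 - ber, frame_bits);

        /* mean time between bit errors at full line rate */
        double t_between = 1.0 / (ber * line_rate);

        printf("P(bad frame)            = %.3e\n", p_bad_frame);
        printf("time between bit errors = %.0f s\n", t_between);  /* -> 100 s */
        return 0;
    }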

  6. Is BER a problem?
  • LHCb is based on 1000BaseT for cost reasons: fibre NICs and switch ports are still 3x more expensive
  • The Marvell PHY on the GigeFE card works at cable lengths of up to 160 m
  • Preliminary tests show ...

  7. Latencies: system overview
  [Diagram: the combined L1&DAQ architecture. Front-end electronics emit Level-1 traffic (133-235 links, 1.1 MHz, 9.5-17.5 GB/s) and HLT traffic (333 links, 40 kHz, 2 GB/s) into a multiplexing layer of 28 edge switches. The Gb Ethernet readout network carries the mixed traffic through 63-111 NPs (Level-1, 64-126 links, 7-13 GB/s) and 19 NPs (HLT, 19 links, 1.2 GB/s) to an event builder of 38-66 NPs (57-99 links, 6.2-11 GB/s) and on to 57-99 SFCs, which feed ~1200 farm CPUs over Fast/Gb Ethernet. The TFC system and the L1-decision sorter/TRM handle the L1 decision path (75-131 links, 8.2-14.2 GB/s). Queuing latency arises in the sub-farms.]

  8. “Local” latencies
  • Latencies which arise as a feature of an isolated component of the system: an event/fragment takes a certain time to pass through the component, independent of other fragments in the system
  • Examples: forwarding latency in the switch, event-building latency in the NP
  • They will be covered by a global budget of a few ms
  • They will be measured as soon as final software and candidate hardware are available

  9. Global latencies
  • Latencies which arise from the architecture of the system itself, where an event has to wait because of other events
  • When an event arrives in the sub-farm and finds all nodes busy, it is “punished” with extra latency
  • When a decision arrives in the L1 decision sorter, it must wait for all previous decisions (except those in time-out) to arrive before it can go out; a toy sketch of this in-order release follows below
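A minimal sketch of the in-order release logic just described (purely illustrative: the window size, function names, and the example arrival order are invented here, arrivals are assumed to stay within one window of the dispatch pointer, and the time-out path is omitted):

    #include <stdbool.h>
    #include <stdio.h>

    #define WINDOW 16                 /* assumed reorder-window size */

    static bool pending[WINDOW];      /* buffered decisions, not yet sent */
    static unsigned next_out = 0;     /* next event number to dispatch    */

    static void dispatch(unsigned evt)
    {
        printf("dispatch decision for event %u\n", evt);
    }

    /* accept one (possibly out-of-order) decision, then drain everything
     * that is now in sequence */
    static void decision_arrived(unsigned evt)
    {
        pending[evt % WINDOW] = true;
        while (pending[next_out % WINDOW]) {
            pending[next_out % WINDOW] = false;
            dispatch(next_out++);
        }
    }

    int main(void)
    {
        /* example: decisions arrive out of order, go out as 0,1,2,3,4 */
        unsigned arrival_order[] = { 2, 0, 1, 4, 3 };
        for (size_t i = 0; i < sizeof arrival_order / sizeof arrival_order[0]; i++)
            decision_arrived(arrival_order[i]);
        return 0;
    }

A real sorter must also evict decisions whose predecessors have timed out, as noted above; otherwise a single lost decision would stall the whole output stream.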

  10. Latency due to decision sorting
  [Plot: additional time an event spends in the RS before it is dispatched [ns], versus the processing time assumed for the L1 trigger ~ 1/x [ns]]
