Discussing SEU and radiation concerns for new chips, calculation challenges, neutron issues, SEU cross-sections, and potential solutions for DCH Electronics.
Upgrade Radiation Issues
Christopher O’Grady
For the DCH Electronics Upgrade Group
Based on work by Jerry Va’vra
Introduction
• In principle, we need to study SEU and radiation “death” issues for all new chips (FPGA and flash are perhaps the most worrisome).
• Worked on this in parallel with development.
• Had hoped a “back of the envelope” calculation would show that everything is OK, but that didn’t happen.
• Neutrons appear to be a problem.
• This presentation will be “fuzzy”: the situation is in flux.
Complications
• Only have a guess for the neutron rate (depends on some assumptions).
• Only have a guess for the neutron spectrum.
• Only have an approximate answer for the cross section of our device.
• Only have an approximate answer for the number of configuration bits we rely on.
• Rate scales with luminosity!! (Jerry believes the source is radiative Bhabhas slamming into the beam wall.)
Proton Cross Section Per Bit
• VirtexII proton cross section from Xilinx.
• Have read that the cross section for neutrons > 20 MeV is similar.
Simulation of Neutron Production
• Alberto Fasso’s FLUKA simulation of an iron slab. Copper is similar.
• 10% of neutrons are > 20 MeV.
Va’vra’s Detection Method
• Uses a boron detector, which detects “thermal” (low energy) neutrons.
• Moderates “all” neutrons to thermal energies using a polyethylene wrapper (if anything, this undercounts).
• Without the polyethylene, no rate is observed (good evidence that he is really seeing neutrons).
• 1 MeV neutrons act like a gas and fall off as 1/r**2 (not true for 20 MeV neutrons).
• Assuming a single point source of 1 MeV neutrons, one can calculate the position of the source and then the rates everywhere.
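The point-source idea in the last bullet can be sketched in a few lines of Python. The slides do not say how the source position was actually calculated, so this is only an illustration under stated assumptions: a 2D geometry, a brute-force grid search, and entirely synthetic detector positions and rates (none of the numbers below come from Va’vra’s logbook).

```python
import math

def rate_at(source_xy, strength, point_xy):
    """Rate at point_xy from an isotropic point source: strength / (4*pi*r**2)."""
    r2 = (point_xy[0] - source_xy[0]) ** 2 + (point_xy[1] - source_xy[1]) ** 2
    return strength / (4.0 * math.pi * r2)

def fit_point_source(measurements, candidates):
    """Grid search over candidate source positions.

    For a fixed position the model is linear in the strength, so the best
    strength has a closed form; keep the candidate with the smallest
    squared residual.
    """
    best = None
    for pos in candidates:
        g = [rate_at(pos, 1.0, p) for p, _ in measurements]
        num = sum(gi * ri for gi, (_, ri) in zip(g, measurements))
        den = sum(gi * gi for gi in g)
        strength = num / den
        resid = sum((strength * gi - ri) ** 2
                    for gi, (_, ri) in zip(g, measurements))
        if best is None or resid < best[0]:
            best = (resid, pos, strength)
    return best[1], best[2]

# Synthetic test case (hypothetical numbers, arbitrary length units).
true_source, true_strength = (1.0, 2.0), 1000.0
detectors = [(0.0, 0.0), (4.0, 0.0), (0.0, 5.0), (4.0, 5.0)]
measurements = [(d, rate_at(true_source, true_strength, d)) for d in detectors]

# Candidate grid with 0.5 spacing; it never coincides with a detector position.
candidates = [(0.5 * i, 0.5 * j) for i in range(1, 8) for j in range(1, 10)]

fitted_pos, fitted_strength = fit_point_source(measurements, candidates)
# With the source fitted, the rate anywhere else follows from the same 1/r**2 law.
predicted = rate_at(fitted_pos, fitted_strength, (2.0, 2.0))
print(fitted_pos, fitted_strength, predicted)
```

As the slide notes, the 1/r**2 assumption is not expected to hold for 20 MeV neutrons, so a fit like this says nothing about the high-energy component.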
Rates from Va’vra’s Logbook
Jerry says that KEK sees a neutron rate of about 1 kHz/cm**2, not luminosity dependent.
Neutron SEU Cross-Sections
[Table: neutron SEU cross-sections grouped by device family (4000 series, Spartan3 series, VirtexII series)]
SEU Estimates with XC3S1000
• Xilinx claims to have a study showing that only 2-10% of configuration bits are used in a typical design. In addition, we are currently using only about 30% of the CLBs.
• (2000 n/cm**2/s) * (2.6E-8 cm**2/device) * (48 devices) * 0.1 = 0.00025/s (XILINX number)
• (2000 n/cm**2/s) * (1E-7 cm**2/device) * (48 devices) * 0.1 = 0.00096/s (ATMEL number)
• The XILINX number corresponds to about 1 SEU per hour; the ATMEL number corresponds to about 1 SEU every 17 minutes.
• We’re using the XC3S1500, which has a factor of 2 more configuration bits.
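The arithmetic on this slide can be checked with a short script. The inputs are the slide’s own numbers; the function and variable names are made up for illustration, and the 0.1 factor is called `derating` here because the slide does not spell out what it stands for.

```python
def seu_rate_per_s(flux_n_cm2_s, xsec_cm2_per_device, n_devices, derating):
    """Expected upsets per second: flux * cross-section * devices * derating."""
    return flux_n_cm2_s * xsec_cm2_per_device * n_devices * derating

def hours_between_seu(rate_per_s):
    """Mean time between upsets, in hours, for a given rate per second."""
    return 1.0 / rate_per_s / 3600.0

FLUX = 2000.0     # n/cm**2/s, from the slide
N_DEVICES = 48    # from the slide
DERATING = 0.1    # the slide's 0.1 factor; its meaning is an open assumption

rate_xilinx = seu_rate_per_s(FLUX, 2.6e-8, N_DEVICES, DERATING)
rate_atmel = seu_rate_per_s(FLUX, 1.0e-7, N_DEVICES, DERATING)

print(f"XILINX number: {rate_xilinx:.2e}/s, "
      f"one SEU every {hours_between_seu(rate_xilinx):.2f} h")
print(f"ATMEL number:  {rate_atmel:.2e}/s, "
      f"one SEU every {hours_between_seu(rate_atmel) * 60.0:.0f} min")
```

Since every input is described in the talk as a guess or an approximation, the result is an order-of-magnitude estimate at best.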
Are We Making Things Worse?
• Assume the XC4010E bit cross-section is the same as the XC3000 series bit cross-section.
• Coincidentally, XC4010E (178 kbit) ≈ 1*XC3190 (64 kbit) + 3*XC3142 (30 kbit) + 2*XC3120 (14 kbit)
• i.e. one current box “equals” one XC4010E:
• (2000 n/cm**2/s) * (4.0E-10 cm**2/device) * (48 devices) * 0.1 = 3.8E-6/s (XILINX number)
• This is one SEU every 72 hours.
• We could be making things ~100 times worse.
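A quick check of where the “~100 times worse” figure comes from, using only numbers quoted on the slides. The variable names are illustrative, and the doubling for the XC3S1500 assumes the rate scales linearly with the number of configuration bits, which the slides imply but do not state.

```python
FLUX = 2000.0     # n/cm**2/s, from the slides
N_DEVICES = 48
DERATING = 0.1    # the slides' 0.1 factor, meaning not spelled out

# Current electronics: one box treated as one XC4010E-equivalent device.
old_rate = FLUX * 4.0e-10 * N_DEVICES * DERATING
old_hours_between_seu = 1.0 / old_rate / 3600.0

# Upgrade: XC3S1000 cross-section (XILINX number), then scaled by the
# "factor 2 more configuration bits" quoted for the XC3S1500.
new_rate_xc3s1000 = FLUX * 2.6e-8 * N_DEVICES * DERATING
new_rate_xc3s1500 = 2.0 * new_rate_xc3s1000

ratio_xc3s1000 = new_rate_xc3s1000 / old_rate
ratio_xc3s1500 = new_rate_xc3s1500 / old_rate

# Bit-count comparison behind "one current box equals one XC4010E" (kbit).
current_box_kbit = 1 * 64 + 3 * 30 + 2 * 14

print(f"old: {old_rate:.2e}/s, one SEU every {old_hours_between_seu:.0f} h")
print(f"ratio new/old: {ratio_xc3s1000:.0f} (XC3S1000), "
      f"{ratio_xc3s1500:.0f} (XC3S1500)")
print(f"current box: {current_box_kbit} kbit vs 178 kbit for one XC4010E")
```

The two ratios bracket the slide’s “~100”; note that the flux, device count and 0.1 factor cancel, so the ratio depends only on the cross-sections.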
Some Existing DCH Corruption
• Don’t think we see errors every 72 hours, but …
• See two sources of data corruption:
(1) Elefant “lock” problem. Typically breaks in a whole quadrant.
(2) Illegal board addresses, seen regularly.
• Not understood. Karl attempted to reproduce (1) using a high rate of 1 MeV neutrons. No success in 1 week.
What should we do?
• If present, this problem affects more than just FPGAs: Elefant, atom chip SRAM and configurations, for example.
• Seems like a bigger “experiment” problem.
• Configuration corruption is the worst.
• Perhaps move to ATMEL flash based FPGAs? (Cross-sections lower by a factor of >100.)
• ATMEL downsides: not as much block RAM (Matt), more difficult to synthesize (Herbst).
• TMR + partial reconfiguration?
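For readers unfamiliar with TMR (triple modular redundancy), the core idea is a 2-of-3 majority vote over three copies of the same logic or data, so that an upset in any single copy is outvoted. The sketch below is only a software illustration of the voting step; in an FPGA the voter would be implemented in logic, and nothing here reflects a specific design considered by the group.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three redundant words."""
    return (a & b) | (a & c) | (b & c)

good = 0b10110010
upset = good ^ 0b00001000   # one bit flipped in one of the three copies

voted = majority_vote(good, upset, good)
print(f"{voted:08b}", voted == good)
```

Voting masks the error but does not repair the corrupted copy, which is why the slide pairs TMR with partial reconfiguration.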
Other Ideas (that don’t fix problem)
• Monte-Carlo the directionality of 20 MeV neutrons (hard to really believe)
• Checksum FPGA every event
• Understand existing DCH corruption
• Work with Lockman/Bower on G4 neutron simulation (long time scale)
• Put a neutron detector in DCH electronics region
• Use SEUPI (Xilinx tool) to understand how many critical configuration bits we have
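The “checksum FPGA every event” idea amounts to comparing a checksum of the configuration against a known-good value. A minimal sketch follows, assuming the configuration can be read back as a byte string; the readback step is hardware specific and not shown, and the data here is a synthetic stand-in. CRC-32 is chosen only because it is in the Python standard library; the slides do not name a checksum.

```python
import zlib

def config_checksum(config_bytes):
    """CRC-32 of a configuration image, as an unsigned 32-bit value."""
    return zlib.crc32(config_bytes) & 0xFFFFFFFF

def is_corrupted(readback_bytes, golden_crc):
    """True if the readback no longer matches the known-good checksum."""
    return config_checksum(readback_bytes) != golden_crc

# Synthetic stand-in for a configuration image (hypothetical content).
golden_image = bytes(range(256)) * 4
golden_crc = config_checksum(golden_image)

# Simulate a single-bit upset in the configuration.
upset_image = bytearray(golden_image)
upset_image[100] ^= 0x01

print(is_corrupted(golden_image, golden_crc))        # clean readback
print(is_corrupted(bytes(upset_image), golden_crc))  # after a bit flip
```

As the slide title says, this detects corruption but does not fix it; it would only flag events taken with a bad configuration.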