Design for Test of Systems on Chip: Digital Test. Basic Principles of Bio-Inspired Approaches to Fault Tolerance. Vladimír Drábek and Lukáš Sekanina, {drabek, sekanina}@fit.vutbr.cz, Faculty of Information Technology, Brno University of Technology, Czech Republic. TUTORIAL
Tutorial outline • Introduction • Bio-inspired models in computer science • Reconfigurable devices • New trends in fault tolerance • Cellular systems/Embryonics • Evolvable hardware • Immunotronics • Conclusions
Hardware and biology: Why? • People require powerful systems. • These systems are complex: assume 10^x computing elements (x = 2, 3, 6, 12, 24). • Adaptation to changes, self-diagnosis, self-repair, self-assembly, autonomous control, … are needed. • Nature has a lot of experience with ... • Now we can use it at the HW level.
Hardware + Biology = Three crucial factors • Development of reconfigurable circuits • starting with Xilinx FPGAs in 1985 • continued with reconfigurable computing • Development of soft computing • Goldberg’s popularization of evolutionary algorithms • evolutionary design (Bentley) • THE AGE OF NANOTECHNOLOGY • 10^x computing elements • how to ensure reliability
10 years ago… only a few people in the world were involved • 1992 – Higuchi (ETL Japan), Hugo de Garis (now with Utah State U.) – multiplexer evolution in PLA • 1993 – Mange (LSL, Switzerland) – self-repairing and self-replicating HW • 1994 – CAM (Cellular Automata Machine) Brain Project (de Garis) • 1995 – Thompson (U. of Sussex) – intrinsic evolution in FPGA XC6216 • 1995 – Towards Evolvable Hardware (1st conference, LSL, Lausanne, Switzerland)
Nowadays: conferences, journals • Main conferences • Evolvable systems: From biology to hardware (1996 - Japan, 1998 - Switzerland, 2000 - UK, 2001 - Japan, 2003 - Norway) • NASA/DoD Workshops on Evolvable Hardware (1999, 2000, 2001, 2002 – in USA) • Workshops on Information Processing in Cells and Tissues • partially at GECCO, CEC, FPL, DDECS, … • Journals • Genetic Programming and Evolvable Machines • IEEE Transactions on Evolutionary Computation
Nowadays: 52 research groups (see A. Thompson’s links) • UK 16 • USA 14 • Germany 5 • Italy 3 • Canada 2 • Japan 2 • Norway 2 • Czech R. 1 • Denmark 1 • Switzerland 1 • The Netherlands 1 • Romania 1 • Brazil 1 • Australia 1 • Mexico 1
Resources • Evolutionary Electronics Web Links (A. Thompson) • http://www.cogs.susx.ac.uk/users/adrianth/EHW_groups.html • EvoELEC at EvoNet • http://evonet.dcs.napier.ac.uk/evoweb/working_groups/evoelec/index.html • Reconfigurable POEtic Tissue • http://www.poetictissue.org • Brno University of Technology, Czech Rep. • http://www.fee.vutbr.cz/~sekanina/ehw/index.html • Selected papers related to this tutorial • [1] Sekanina, L., Drabek, V.: Relation Between Fault Tolerance and Reconfiguration in Cellular Systems. In: 6th IEEE Int. On-Line Testing Workshop, Palma de Mallorca, Spain, 2000, pp. 25-30 • [2] Sekanina, L., Drabek, V.: Automatic Design of Image Operators Using Evolvable Hardware. In: 5th IEEE Design and Diagnostics of Electronic Circuits and Systems Workshop 2002, Czech Rep., pp. 132-139 • [3] Sekanina, L., Drabek, V.: A Survey of Bioinspired Methods for Design of Fault Tolerant Reconfigurable Architectures. In: BEC'02, Tallinn
POE model for classification of bio-inspired HW (LSL, Switzerland, 1997) • is adopted for the classification of FT systems in this presentation. • P – Phylogeny – Evolution – the circuit connection is subject to evolution (evolvable hardware) • O – Ontogeny – Development – the circuit connection is understood as a multicellular organism developed from a mother cell (embryonic electronics, i.e. embryonics) • E – Epigenesis – Learning – neural machines, immunotronics • Combined as PO, PE, OE, POE hardware
Some models from computer science • O axis: cellular automaton • Von Neumann and Ulam in 1940s • Non-uniform CA, Vichniac in 1986 • Cell Matrix, Macias in 1999 • P axis: evolutionary algorithms • Genetic algorithm - Holland in 1960s • Evolutionary strategy – Bienert, Rechenberg, Schwefel in 1960s • Evolutionary programming – Fogel in 1960s • Genetic programming – Koza in 1992 • E axis: artificial neural network (ANN) • McCulloch and Pitts in 1940s
Math models and nature: O-axis Cellular automaton (CA) • an array of simple cells • only local interaction • (a)synchronous operations • uniform/nonuniform • computational as well as constructional universality • emergent computation • Problems: • How to define rules for a given task. • What behavior is generated using the given rules. • Other models: Lindenmayer systems
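As a rough illustration of the CA properties listed above (an array of simple cells, purely local interaction, synchronous operation, a uniform rule), here is a minimal Python sketch of a one-dimensional binary CA. The function name `step`, the wrap-around boundary and the choice of rule 90 are assumptions made for this example; they do not come from the tutorial.

```python
def step(cells, rule):
    """One synchronous update of a uniform 1D binary CA.

    Each cell sees only itself and its two neighbours (local interaction).
    `rule` is an 8-bit Wolfram rule number; the array wraps around at the
    edges (an assumption of this sketch).
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right  # neighbourhood as a 3-bit number
        nxt.append((rule >> index) & 1)              # look up the new state in the rule table
    return nxt


# Demo: a single active cell evolving under rule 90 (new state = left XOR right).
cells = [0, 0, 0, 1, 0, 0, 0]
for _ in range(3):
    print("".join(".#"[c] for c in cells))
    cells = step(cells, 90)
```

Both open problems from the slide show up even here: choosing the rule number for a desired task, and predicting the global pattern from a given rule, are not obvious from the local update alone.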
Math models and nature: P-axis. Evolutionary algorithm (EA) • bio-inspired robust search - an iterative procedure • population of chromosomes (candidate solutions) • Selection - select promising chromosomes • Crossover - exchange parts of chromosomes • Mutation - change a part of the chromosome • Fitness calculation - evaluate chromosomes
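The loop described on this slide can be sketched as follows. This is only a toy under stated assumptions: the fitness function (counting ones in a bit string), tournament selection, one-point crossover, elitism and all parameter values are choices made for this example, not details given in the tutorial.

```python
import random

GENES = 16  # chromosome length (hypothetical toy size)


def fitness(chrom):
    """Fitness calculation: number of ones in the chromosome (toy objective)."""
    return sum(chrom)


def tournament(pop, rng, k=2):
    """Selection: pick the fitter of k randomly chosen chromosomes."""
    return max(rng.sample(pop, k), key=fitness)


def crossover(a, b, rng):
    """Crossover: exchange parts of two chromosomes at one random point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chrom, rng, rate=0.05):
    """Mutation: flip each gene with a small probability."""
    return [1 - g if rng.random() < rate else g for g in chrom]


def evolve(pop_size=20, generations=40, seed=0):
    """Run the iterative procedure; return the best chromosome and the
    best fitness seen in each generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(pop, key=fitness)
        history.append(fitness(best))
        children = [best]  # elitism: the best chromosome survives unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling `evolve()` returns one candidate solution plus the fitness history. Because of elitism the history never decreases, but nothing guarantees that the optimum is reached within the given number of generations.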
EAs continued • Evolutionary optimization vs. evolutionary design • Adaptation on population level • Advantages: • provide many alternative solutions • can generate innovative solutions • widely applicable • Disadvantages: • no guarantee of an optimal solution within finite time • weak theoretical basis • can be computationally expensive
Math models and nature: E-axis. Artificial neural nets for learning • Adaptation on individual level (learning) • Other models: the artificial immune system
Examples of PO, OE, PE, POE • PO – cellular programming • CA rules are evolved (Sipper). • PE – evolutionary design of ANN • Architecture/weights/… of an ANN are evolved. • OE – development of an ANN • ANN is built from a mother cell • POE – evolution of CA rules, CA defines the structure of an ANN. Then the ANN is trained. (CAM Brain Project, de Garis)
Implementation platform: Reconfigurable devices • = an array of programmable elements in: • ASIC • FPGA • Virtual reconfigurable devices in FPGAs • Cell Matrix • Application-given function – mainly deals with P and E axes • Static • Dynamic – configurable computing • Dynamic – adaptive – evolvable hardware • Re/Configuration system – mainly deals with O axis • Internal/External • Partial/Full • Controlled/Autonomous • Serial/Parallel
New trends in fault tolerance (FT) • O axis: Cellular systems/Embryonics • P axis: Evolvable hardware • E axis: Immunotronics
Principles of Fault Tolerance • Hardware redundancy: spare cells/columns/rows • Always needed. • If a cell detects a fault => reconfiguration • Fault detection is needed. • Reconfiguration scenario is CRUCIAL!!! • 10^x computing elements must be managed effectively.
Cellular systems: 10^x cells, 3 scenarios of reconfiguration • Traditional approach • Embryology-based (Embryonics) • Macias’s cell-based (in Cell Matrix)
(1) FT: Traditional Approach [1] • Serial configuration • Column redundancy • When a cell fails, full reconfiguration is initiated. • Slow and inefficient
(2) FT: Based on Embryology: Levels of abstraction • population level (virtual) • a set of multicellular organisms • multicellular organismic level (virtual) • a set of cells • cellular level (virtual) • a set of molecules • molecular level (basic FPGA element) • Similar principles (fault detection, self-repair, self-replication, reconfiguration) are applied at all levels.
Example: POEtic Tissue (www.poetictissue.org) • Hierarchical FT: population, organism, cell, molecule = FPGA element (FT: duplication/memory testing)
FT • Cellular division - the circuit is developed from the mother cell • Cellular differentiation - the cell sets up its function according to its coordinates • The implementation is based on coordinate registers placed in each cell.
Implementation 1: Fixed coordinates. The genome is known at design time • Development of the multicellular organism • Cellular division - each cell gets the entire genetic program (stored in the Configuration Register, CR) • Cellular differentiation - only some parts of the CR (given by the position of the cell stored in the coordinate registers) define the function of the cell • [Figure: two cells at X=3, Y=2 and X=4, Y=1; labels: genome, instructions, execution]
In case of a fault • FT mechanism: When a cell fails, the other cells only recalculate their positions to activate the appropriate functions. • Modification: incomplete genome in CR • entire system reconfiguration from an external device in the case of a major fault • coordinate recalculation in the case of a minor fault
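The fixed-coordinate mechanism (every cell stores the whole genome, its coordinate selects the active gene, and a fault only shifts coordinates) can be sketched for a single row of cells. The genome contents, the names `differentiate` and `organism_ok`, and the rule that a faulty cell becomes transparent to coordinate counting are assumptions made for this illustration.

```python
# Hypothetical genome: one gene (cell function) per X coordinate.
GENOME = {1: "AND", 2: "OR", 3: "XOR"}
SPARE = "spare"  # healthy cell whose coordinate lies outside the genome
DEAD = "dead"    # faulty cell, skipped when coordinates are computed


def differentiate(faulty):
    """Return the function expressed by each physical cell in a row.

    `faulty` holds one boolean per physical cell. Every healthy cell keeps
    the full genome and picks its gene from its X coordinate; a faulty cell
    does not increment the coordinate, so the cells to its right shift by one.
    """
    functions = []
    x = 0
    for is_faulty in faulty:
        if is_faulty:
            functions.append(DEAD)
            continue
        x += 1
        functions.append(GENOME.get(x, SPARE))
    return functions


def organism_ok(functions):
    """True if every gene is still expressed somewhere (minor fault at most)."""
    return set(GENOME.values()) <= set(functions)


print(differentiate([False] * 5))                        # no fault, two spares unused
print(differentiate([False, True, False, False, False])) # one fault absorbed by a spare
```

With five physical cells and three genes, up to two faults are absorbed by coordinate recalculation alone. A third fault leaves a gene unexpressed, which corresponds to the major-fault case that needs reconfiguration from outside.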
Implementation 2: Relative coordinates. The genome is unknown at design time • In contrast to the previous approach, the positions of the cells are determined using several artificial diffusers (inspired by distributed diffusers that release a given protein into the system). • A cell’s coordinate depends on its distance from the diffuser. • This models a dynamic environment: diffusers can change their positions dynamically.
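A small sketch of the relative-coordinate idea follows. It assumes a rectangular grid, Manhattan distance as the distance measure, and one coordinate component per diffuser; the slide only says that a coordinate depends on the distance from the diffuser, so all of these details are illustrative.

```python
def relative_coordinates(width, height, diffusers):
    """Give each grid cell a coordinate tuple: its Manhattan distance to
    every diffuser (distance measure and grid shape are assumptions)."""
    coords = {}
    for y in range(height):
        for x in range(width):
            coords[(x, y)] = tuple(abs(x - dx) + abs(y - dy) for dx, dy in diffusers)
    return coords


# Moving a diffuser changes the coordinates seen by the same physical cell.
before = relative_coordinates(3, 2, [(0, 0)])
after = relative_coordinates(3, 2, [(2, 1)])
print(before[(2, 1)], after[(2, 1)])
```

Because the coordinates are derived from the current diffuser positions rather than stored at design time, relocating a diffuser re-labels the whole array without touching any individual cell.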
BioWatch project (http://lslwww.epfl.ch/pages/embryonics/home.html) • BioWatch is an example of a system based on principles of embryology. It is a giant artificial organism operating as a wall watch that is able to self-repair in case of a minor fault or to self-replicate in case of a major fault. In case of extensive damage, the BioWatch dies. • Implementation: bio-inspired electronic wall, fixed coordinates, LED, XCS10XL, touch-sensitive elements, 5000 molecules
PO: The Firefly machine (http://lslwww.epfl.ch/pages/research/papers/firefly/home.html) • The machine is based on the cellular programming approach, in which parallel cellular machines evolve to solve computational tasks. The Firefly system operates with no reference to an external device, such as a computer that carries out genetic operators, thereby exhibiting online autonomous evolution.
(3) Macias’s cell sends its configuration! • Assume 10^x cells. • A cell can send its table to the neighboring cells. • Internal, distributed reconfiguration • if all C=0 => asynchronous data mode • if one of C=1 => synchronous configuration mode
Distributed and internal reconfiguration of HW • Cell X configures cell Z by first configuring cell Y to act as a router (with table T1) and then passing table T2 into Z via Y. • This can be done in parallel in many regions.
Example: An expanding adder (figure) • If the adder overflows, it builds a new stage autonomously!
Potential applications (the results are from simulators, only an 8x8 cell chip exists) • Cell Matrix is a platform for nanocomputing. • 56bit DES cracker at 256 Kbaud (1023 cells) • DNA sequence alignment • Image processing • Self-assembling and self-repairing circuits for space applications • Supercell – 72900 cells (270x270), implements a single two-input, one-output functional block • Supercells find defect-free regions. • Supercells “copy” correct circuits into these regions. • Supercells work autonomously after a “self-test” command is supplied.
New trends in FT • O axis: Cellular systems/Embryonics • P axis: Evolvable hardware • E axis: Immunotronics
Evolvable hardware • EHW = EA + reconfigurable HW • The circuit connection is encoded in the chromosome. • Fitness = # correct outputs for all input combinations (in case of small combinational circuits)
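To make the fitness definition concrete, the sketch below counts correct outputs over all input combinations for a small combinational circuit. The chromosome encoding (a list of `(function, input_a, input_b)` gates that may read primary inputs or earlier gates), the gate set, and the names `simulate` and `fitness` are assumptions for illustration only; the tutorial does not specify an encoding.

```python
from itertools import product

# Hypothetical gate library available to the reconfigurable circuit.
FUNCS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}


def simulate(chromosome, output, inputs):
    """Evaluate the encoded circuit for one input vector.

    Signals 0..n-1 are the primary inputs; each gate appends one new
    signal and may read any earlier signal by index.
    """
    signals = list(inputs)
    for func, i, j in chromosome:
        signals.append(FUNCS[func](signals[i], signals[j]))
    return signals[output]


def fitness(chromosome, output, n_inputs, target):
    """Number of input combinations for which the circuit matches the target."""
    return sum(
        simulate(chromosome, output, bits) == target(*bits)
        for bits in product((0, 1), repeat=n_inputs)
    )


# XOR built from four NAND gates; signal 5 is the circuit output.
xor_from_nands = [("NAND", 0, 1), ("NAND", 0, 2), ("NAND", 1, 2), ("NAND", 3, 4)]
print(fitness(xor_from_nands, 5, 2, lambda a, b: a ^ b))
```

The number of input combinations doubles with every added input, which is why the slide restricts this exhaustive fitness to small combinational circuits.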
Example: Image filter design [2] • Fitness measure: Mean Difference Per Pixel • [Figure: corrupted image (Gaussian noise) -> digital circuit (given by the chromosome) -> filtered image -> comparator (against the original image) -> fitness] • N x N pixels, N = 256
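One plausible reading of the mean difference per pixel is the average absolute difference between the filtered image and the original image. The sketch below assumes that reading, averages over every pixel of the image, and represents images as nested lists of grey levels; the exact definition used in [2] (for example, whether border pixels are excluded) may differ.

```python
def mean_difference_per_pixel(filtered, original):
    """Average absolute grey-level difference between two equally sized images.

    Images are lists of rows of integers. Lower values mean the filtered
    image is closer to the original (assumed definition, see lead-in).
    """
    n_pixels = sum(len(row) for row in original)
    total = sum(
        abs(f - o)
        for frow, orow in zip(filtered, original)
        for f, o in zip(frow, orow)
    )
    return total / n_pixels


original = [[10, 20], [30, 40]]
filtered = [[10, 22], [27, 40]]
print(mean_difference_per_pixel(filtered, original))
```

For N = 256, one evaluation compares 65536 pixel pairs per candidate circuit, so the comparator in the figure dominates the cost of each fitness calculation.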
Evolutionary design of shot noise filters: Some results • [Figure: quality vs. HW cost for shot noise, conventional filters and evolved filters] • RA3P5: 1702 gates • Median filter: 4740 gates • F57: 441 gates • IF-THEN-ELSE: 123 gates • Other evolved circuits: Gaussian noise filters, edge detectors, …
An evolved circuit (edge detector) • [Figure: evolved circuit with redundant (inactive) elements marked] • Intron: A and A = A
Redundancy and inherent FT • Redundancy is beneficial for HW evolution. • Mutations can be considered as faults. • Inherent FT: A perfect circuit could appear in a few generations after a fault (because of redundancy). • Neutral mutation – does not change fitness of the circuit (= inherent FT). • Intron – a part of chromosome which does not affect the fitness. • Rigidity of the circuit – the evolved circuit depends on mutations only minimally.
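The notions of intron and neutral mutation can be illustrated on a tiny gate list. Everything here is hypothetical: the encoding, the target function (a AND b) and the particular circuit were chosen only to show a mutation that leaves fitness unchanged next to one that does not.

```python
from itertools import product


def evaluate(gates, output, a, b):
    """Evaluate a gate list; signals 0 and 1 are the inputs a and b."""
    signals = [a, b]
    for op, i, j in gates:
        x, y = signals[i], signals[j]
        signals.append(x & y if op == "AND" else x | y)
    return signals[output]


def fitness(gates, output):
    """Count input combinations where the circuit equals the target a AND b."""
    return sum(
        evaluate(gates, output, a, b) == (a & b)
        for a, b in product((0, 1), repeat=2)
    )


# Signal 2 = a AND b, signal 3 = a OR b (never used), signal 4 = s2 AND s2 = s2.
original = [("AND", 0, 1), ("OR", 0, 1), ("AND", 2, 2)]

# Neutral mutation: the inactive gate (signal 3) changes, fitness does not.
neutral = [("AND", 0, 1), ("AND", 1, 1), ("AND", 2, 2)]

# Harmful mutation: the active gate (signal 2) changes, fitness drops.
harmful = [("OR", 0, 1), ("OR", 0, 1), ("AND", 2, 2)]

print(fitness(original, 4), fitness(neutral, 4), fitness(harmful, 4))
```

The last gate is an instance of "A and A = A": it passes its input through unchanged. The unused OR gate is the kind of redundant material in which a fault or a mutation has no effect on the evaluated behaviour.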
Explicit FT in evolvable hardware • The requirements for FT are included in the fitness function (e.g. some critical cases are tested during evaluation of the circuit). • Disadvantage: time-consuming fitness calculation
New trends in FT • O axis: Cellular systems/Embryonics • P axis: Evolvable hardware • E axis: Immunotronics
Immunotronics = immunological electronics • The immune system • recognizes all cells (or molecules) within the body and categorizes those cells as self or nonself. • From an engineering viewpoint it is a multi-layer, parallel and distributed adaptive system that uses learning and memory to perform pattern recognition tasks in a decentralized fashion.
Immunotronics as FSM (Univ. of York, 2001) • The automaton of the system consists of valid and invalid states and transitions, which allows the extraction of the self and nonself conditions required for fault detection. Faults can be detected by monitoring the transitions. • The hardware immune system is created in four steps: • A test bench is used to collect self data from the finite state machine undergoing the immunisation process. • A set of tolerance conditions is extracted from the self data. • The selected tolerance conditions are then downloaded into the hardware immune system. • During operation, the inputs and current state of the finite state machine are extracted and passed through to the immune system. The immune system searches through all tolerance conditions at the same time to determine the validity of the extracted string. If a match is found, a potential fault is indicated.
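The four steps can be mimicked in software on a toy state machine. The sketch assumes a hypothetical 2-bit counter with an enable input, strings of the form (state, input, next state), tolerance conditions built as the exhaustive complement of the self data, and exact matching. A real hardware immune system would store a limited detector set and search it in parallel, which is not modelled here.

```python
from itertools import product

# Hypothetical protected system: a 2-bit counter with an enable input.
STATES = range(4)
INPUTS = (0, 1)


def next_state(state, enable):
    """Fault-free behaviour of the protected FSM."""
    return (state + enable) % 4


def collect_self():
    """Step 1: exercise the FSM (test bench) and record all valid transitions."""
    return {(s, e, next_state(s, e)) for s, e in product(STATES, INPUTS)}


def extract_tolerance_conditions(self_set):
    """Step 2: keep every possible string that never occurs in the self data."""
    universe = set(product(STATES, INPUTS, STATES))
    return universe - self_set


def is_faulty(transition, tolerance_conditions):
    """Step 4: a match with any tolerance condition flags a potential fault."""
    return transition in tolerance_conditions


# Step 3 corresponds to loading `conditions` into the monitor.
conditions = extract_tolerance_conditions(collect_self())
print(is_faulty((1, 1, 2), conditions))  # valid transition
print(is_faulty((1, 1, 3), conditions))  # counter skipped a state
```

The exhaustive complement is only feasible because this FSM has 32 possible strings. For larger machines the choice of which tolerance conditions to store becomes the main design question.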
Bradley and Tyrrell’s schema (ICES’01) • [Figure labels: hardware enclosure; environmental control; complexity; hardware immune system; immune system control; protected system; memory; tolerance conditions]
Conclusions • Bio-inspired FT: a new and topical, but still emerging, branch of HW design. • We have to add some redundancy in all cases. Bio-inspired approaches should exploit this redundancy in a much better way than engineers usually do. • Practical results in industrial applications remain an open problem. • We are waiting for suitable reconfigurable platforms. Maybe nanotechnology?