
Origin System Architecture



Presentation Transcript


  1. Origin System Architecture Hardware and Software Environment

  2. Scalar Architecture
  Diagram: processor with register file and functional units (mult, add), backed by a cache and by memory; cache access ~2 GB/s at ~10 cycles, memory access ~500 MB/s at ~100 cycles.
  • Reduced Instruction Set (RISC) architecture:
  • load/store instructions refer to memory
  • functional units operate on items in the register file
  • Memory hierarchy in the scalar architecture:
  • most recently used items are captured in the cache
  • access to cache is much faster than access to memory
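
The cache effect above can be seen directly in loop order. A minimal Fortran sketch (array name and size are illustrative, not from the slides): the first loop nest walks the matrix in the order it is stored (stride 1), so each cache line fetched from memory is fully reused; the second walks it with a large stride, so nearly every access misses the cache and runs at memory speed.

C     Stride-1 vs. strided traversal of a column-major Fortran array.
      PROGRAM STRIDE
      INTEGER N
      PARAMETER (N = 1000)
      DOUBLE PRECISION A(N,N), S
      INTEGER I, J
      S = 0.0D0
C     cache-friendly: the inner index I is the fastest-varying
C     subscript, so consecutive iterations touch consecutive words
      DO J = 1, N
         DO I = 1, N
            A(I,J) = 1.0D0
         ENDDO
      ENDDO
C     cache-hostile: the inner index J jumps N*8 bytes per iteration,
C     so almost every access brings in a new cache line
      DO I = 1, N
         DO J = 1, N
            S = S + A(I,J)
         ENDDO
      ENDDO
      PRINT *, S
      END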

  3. Vector Architecture
  Diagram: processor with vector registers and functional units (mult, add) attached to memory; C = A x B, with i indexing rows and k the summation.
  • Vectors are loaded from memory with a vector load (loadv) instruction
  • The performance is determined by memory bandwidth
  • Optimization takes the vector length (64 words) into account
  Vector operation - C(i,1:n) is accumulated in a vector register:
  DO i=1,n
    DO k=1,n
      C(i,1:n) = C(i,1:n) + A(i,k)*B(k,1:n)
    ENDDO
  ENDDO
  loadf f2,(r3)    load scalar A(i,k)
  loadv v3,(r3)    load vector B(k,1:n)
  mpyvs v3,v3,v2   calculate A(i,k)*B(k,1:n)
  addvv v4,v4,v3   update C(i,1:n)

  4. Multiprocessor Architecture
  Diagram: two processors, each with a register file, functional units (mult, add), a cache, and a cache coherency unit, sharing a single memory.
  • The cache coherency unit intervenes if two or more processors attempt to update the same cache line
  • All memory (and I/O) is shared by all processors
  • Read/write conflicts between processors on the same memory location are resolved by the cache coherency unit
  • The programming model is an extension of the single-processor programming model
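
Because the programming model extends the single-processor model, a loop can be spread over the processors with a directive while the loop body stays unchanged. A minimal sketch using the parallel-do directive syntax that appears later in these slides (subroutine and array names are illustrative):

C     Shared-memory sketch: iterations of the loop are divided among
C     the processors; all of them read and write the shared arrays.
      SUBROUTINE SCALE(Y, X, A, N)
      INTEGER N, I
      DOUBLE PRECISION Y(N), X(N), A
C$OMP PARALLEL DO PRIVATE(I)
      DO I = 1, N
         Y(I) = A * X(I)
      ENDDO
      RETURN
      END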

  5. Multicomputer Architecture
  Diagram: two nodes, each with its own main memory, register file, functional units (mult, add), cache, and processor, connected by an interconnect.
  • All memory and I/O paths are independent
  • Data movement across the interconnect is "slow"
  • The programming model is based on message passing
  • Processors explicitly engage in communication by sending and receiving data
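
In the message-passing model the data exchange has to be coded explicitly. A minimal sketch, assuming MPI as the message-passing library (the slides name the model, not a specific interface): rank 0 sends a buffer from its own memory and rank 1 receives it into its own, independent memory.

C     Message-passing sketch (MPI assumed, names illustrative).
      PROGRAM EXCHANGE
      INCLUDE 'mpif.h'
      INTEGER N
      PARAMETER (N = 1000)
      DOUBLE PRECISION BUF(N)
      INTEGER RANK, IERR, I, STATUS(MPI_STATUS_SIZE)
      CALL MPI_INIT(IERR)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, RANK, IERR)
      IF (RANK .EQ. 0) THEN
         DO I = 1, N
            BUF(I) = DBLE(I)
         ENDDO
C        explicit send: the data leaves this node's local memory
         CALL MPI_SEND(BUF, N, MPI_DOUBLE_PRECISION, 1, 0,
     &                 MPI_COMM_WORLD, IERR)
      ELSE IF (RANK .EQ. 1) THEN
C        explicit receive into this node's local memory
         CALL MPI_RECV(BUF, N, MPI_DOUBLE_PRECISION, 0, 0,
     &                 MPI_COMM_WORLD, STATUS, IERR)
      ENDIF
      CALL MPI_FINALIZE(IERR)
      END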

  6. Origin 2000 Node Board
  The basic building block. Diagram: two R1xK processors with caches, a Hub, main memory with directory (plus an extra directory for >32P systems), and XIO and CrayLink ports.
  • 2 x R12000 processors
  • 64 MB to 4 GB main memory
  • Hub bandwidth peaks:
  • 780 MB/s [625] --- CPUs
  • 780 MB/s [683] --- memory
  • 1.56 GB/s [1.25] -- XIO link
  • 1.56 GB/s [1.25] -- CrayLink

  7. O2000 Node Board HUB
  Crossbar ASIC (950K gates, 100 MHz, 64 bit): a single chip integrates all 4 interfaces:
  • Processor interface; two R1x000 processors (each with a 1-4-8 MB L2 cache) multiplex on the same bus
  • Memory interface, integrating the memory controller and (directory) cache coherency; directory and main memory in SDRAM, up to 4 GB/node (144 bits @ 50 MHz = 800 MB/s)
  • Interface to the CrayLink interconnect to the other nodes in the system (duplex connection, 2x23 bits @ 400 MHz, 2x800 MB/s)
  • Interface to the I/O devices with XIO-to-PCI bridges; input/output on every node, 2x800 MB/s
  Memory access characteristics:
  • Read bandwidth, single processor: 460 MB/s sustained
  • Average access latency: 315 ns to restart the processor pipeline
  Also on the HUB: BTE, 64 counters per (4 KB) page.

  8. Origin 2000 Switch Technology
  Diagram: a ccNUMA hypercube of node boards (N) connected through routers (R), each router linking to other node boards; each node board holds two processors with caches, a Hub, main memory with directory (plus an extra directory for >32P systems), and an XBOW crossbar providing 6 ports to XIO.

  9. O2000 Scalability Principle
  • The distributed switch does scale:
  • a network of crossbars allows for full remote bandwidth
  • the switch components are distributed and modular
  Diagram: two node boards, each with two R1x000 processors (1-4-8 MB L2 caches), SDRAM main memory with directory, and a HUB (processor, memory, I/O, and link interfaces), joined by the crossbar router network.

  10. Origin 2000 Module
  The system building block. Module features:
  • Up to 8 R12000 CPUs (1-4 nodes)
  • Up to 16 GB physical memory
  • Up to 12 XIO slots
  • 2 XBOW switches
  • 2 router switches
  • 64-bit internal PCI bus (optional)
  • Up to 2.5 [3.1] GB/s system bandwidth
  • Up to 5.0 [6.2] GB/s I/O bandwidth

  11. Origin 2000 Module
  Diagram: 4 nodes (N) and 2 routers (R).
  • Deskside system
  • 2-8 CPUs
  • 16 GB memory
  • 12 XIO slots
  • SGI 2100 / 2200

  12. Origin 2000 Single Rack
  Diagram: 8 nodes (N) and 4 routers (R).
  • Single-rack system
  • 2-16 CPUs
  • 32 GB memory
  • 24 XIO slots
  • SGI 2400

  13. Origin 2000 Multi-Rack
  Diagram: 16 nodes (N) and 8 routers (R).
  • Multi-rack system
  • 17-32 CPUs
  • 64 GB memory
  • 48 XIO slots
  • 32-processor hypercube building block

  14. Origin 2000 Large Systems
  Diagram: several multi-rack building blocks combined into one system.
  • Large multi-rack systems
  • Up to 512 CPUs
  • Up to 1 TB memory
  • 384+ XIO slots
  • SGI 2800

  15. ScalableNode Product Concept
  • Address diverse customer requirements
  • Independent scaling of CPU, I/O, and storage; tailor ratios to suit the application
  • Large dynamic range of product configurations
  • RAS via component isolation
  • Independent evolution and upgrade of system components
  • Maximize leverage of engineering and technology development efforts
  Diagram: a modular architecture of processor subsystems, interconnect subsystems, and I/O subsystems tied together by interface and form factor standards.

  16. Origin 3000 Hardware Modules (BRICKS)
  • G-brick: graphics expansion
  • C-brick: CPU module
  • R-brick: router interconnect
  • I-brick: base I/O module
  • P-brick: PCI expansion
  • X-brick: XIO expansion
  • D-brick: disk storage

  17. Origin 3000 MIPS Node
  Diagram: four R1x000 processors, each with its own L2 cache, on two SysAD buses into the Bedrock ASIC with its memory/directory.
  • Two independent SysAD interfaces, each 2x the O2K bandwidth: 200 MHz, 1600 MB/s each
  • Memory interface at 4x the O2K bandwidth: 200 MHz, 3200 MB/s DDR SDRAM; 8 GB/node (max)
  • 60% of O2K latency: 180 ns local
  • NUMALink3 network port at 2x the O2K bandwidth: 800 MHz, 1600 MB/s bi-directional
  • XIO+ port at 1.5x the O2K bandwidth: 600 MHz, 1200 MB/s bi-directional
  • 128 nodes / 512 CPUs per system (max)

  18. Origin 3000 CPU Brick (C-brick)
  • 3U high x 28" deep
  • Four MIPS or IA64 CPUs
  • 1-4 DIMM pairs: 256 MB, 512 MB, 1024 MB (premium)
  • 48V DC power input
  • N+1 redundant, hot-plug cooling
  • Independent power on/off
  • Each CPU module can support one I/O brick

  19. Origin 3000 BEDROCK Chip

  20. SGI Origin 3000 Bandwidth: Theoretical vs. Measured (MB/s)
  Diagram of one node, theoretical vs. measured: 1600 vs. 900 per CPU, 1600 vs. 1150 per SysAD bus, 3200 vs. 2100 Hub to memory, and 2x1600 vs. 2x1250 on the node-to-node links.

  21. STREAMS Copy Benchmark SGI Confidential

  22. Origin 3000 Router Brick (r/R-brick)
  • 2U high x 25" deep
  • Replaces the system mid-plane
  • Multiple implementations:
  • r-brick: 6-port (up to 32 CPUs)
  • R-brick: 8-port (up to 128 CPUs)
  • metarouter: 128 to 512 CPUs
  • 48V DC power input
  • N+1 redundant, hot-plug cooling
  • Independent power on/off
  • NUMAlink™ 3 router with 8 NUMAlink™ 3 network ports; each port 3.2 GB/s (2x O2K bandwidth)
  • 45 ns round-trip latency (50% of the O2K router latency)

  23. SGI Origin 3000 Measured Bandwidth
  Diagram: 5000 MB/s measured through a router, 2500 MB/s on each of the links feeding it.

  24. SGI NUMA 3 Scalable Architecture (16p - 1 hop)
  Diagram: four nodes of four R1x000 processors each, every node attached to its Bedrock ASIC, all four connected through a single 8-port router whose remaining ports lead to other routers.

  25. Origin 3000 I/O Bricks
  • I-brick (base I/O module): base system I/O - system disk, CD-ROM, 5 PCI slots; no need to duplicate the starting I/O infrastructure
  • P-brick (PCI expansion): 12 industry-standard, 64-bit, 66 MHz slots; supports almost all system peripherals; all slots are hot-swap
  • X-brick (XIO expansion): highest-performance I/O expansion; supports HIPPI, GSN, VME, HDTV; 4 XIO slots per brick
  • New I/O bricks (e.g., PCI-X) can be attached via the same XIO+ port

  26. Types of Computer Architecture, characterised by memory access
  MIMD (Multiple Instructions, Multiple Data) systems divide into:
  • Multiprocessors - single address space, shared memory:
  • UMA, central memory: PVP (SGI/Cray T90); SMP (Intel SHV, SUN E10000, DEC 8400, SGI Power Challenge, IBM R60, etc.)
  • NUMA, distributed memory: CC-NUMA (SGI Origin2000, Origin3000, Cray T3E, HP Exemplar, Sequent NUMA-Q, Data General); NCC-NUMA (Cray T3D, IBM SP3)
  • COMA (KSR-1, DDM)
  • Multicomputers - multiple address spaces, NORMA (no-remote memory access):
  • Cluster (IBM SP2, DEC TruCluster, Microsoft Wolfpack, "Beowulf", etc.): loosely coupled, multiple OS
  • "MPP" (Intel TFLOPS, TM-5): tightly coupled and single OS
  Abbreviations: MIMD Multiple Instructions Multiple Data; PVP Parallel Vector Processor; UMA Uniform Memory Access; SMP Symmetric Multi-Processor; NUMA Non-Uniform Memory Access; COMA Cache Only Memory Architecture; NORMA No-Remote Memory Access; CC-NUMA Cache-Coherent NUMA; NCC-NUMA Non-Cache Coherent NUMA; MPP Massively Parallel Processor

  27. Origin DSM-ccNUMA Architecture: Distributed Shared Memory
  Diagram: nodes of processors with caches attached to a Bedrock ASIC with XIO+ and to main memory with a directory, joined by NUMALink3 and R-bricks; together the node memories form one distributed shared memory.

  28. Distributed Shared Memory Architecture (DSM)
  Diagram: two nodes, each with main memory, a register file, functional units (mult, add), a cache, a cache coherency unit, and a processor, joined by an interconnect.
  • Local memory and an independent path to memory, as in the multicomputer architecture
  • The memory of all nodes is organized as one logical "shared memory"
  • Non-uniform memory access (NUMA): "local memory" access is faster than "remote memory" access
  • The programming model is (almost) the same as for the shared memory architecture; data distribution is available for optimization
  • Scalability properties similar to the multicomputer architecture
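
Since local memory is faster than remote memory, the data distribution optimization mentioned above usually amounts to allocating each page of an array on the node that will work on it. A minimal sketch, assuming a first-touch placement policy (whether that is the default depends on system configuration; names are illustrative): the data is initialized by the same parallel loop structure that later computes on it, so each thread's pages end up in its own node's memory.

C     First-touch placement sketch: each thread touches (and thereby
C     places) the part of X and Y that it will update later.
      SUBROUTINE PLACE(Y, X, N)
      INTEGER N, I
      DOUBLE PRECISION Y(N), X(N)
C$OMP PARALLEL DO PRIVATE(I)
      DO I = 1, N
         Y(I) = 0.0D0
         X(I) = 0.0D0
      ENDDO
C     the compute loop uses the same static iteration-to-thread
C     mapping, so most accesses stay node-local
C$OMP PARALLEL DO PRIVATE(I)
      DO I = 1, N
         Y(I) = Y(I) + 2.0D0*X(I)
      ENDDO
      RETURN
      END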

  29. Origin DSM-ccNUMA Architecture: Directory-Based Scalable Cache Coherence
  Diagram: the same node structure as slide 27, emphasizing the directory attached to each node's main memory as the basis of scalable cache coherence.

  30. Origin Cache Coherency
  Diagram: each directory entry covers one data block (cache line) of 128 bytes (32 words) and holds presence bits (64 bits) and state bits (8 bits).
  • A memory page is divided into data blocks of 32 words (128 bytes) each, the L2 cache line size
  • Each data request transfers one data block (128 bytes)
  • Each data block has associated presence and state information
  • If a node (HUB) requests a data block, the corresponding presence bit is set and the state of that cache line is recorded
  • The HUB runs the cache coherency protocol, updating the state of the data block and notifying the nodes for which the presence bit is set
  • States: Unowned - no copies; Shared - read-only copies; Exclusive - one read-write copy; Busy - state in transition
  • Each L2 cache line contains 4 data blocks of 8 words (32 bytes) each, the L1 data cache line size
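
A practical consequence of coherence at 128-byte granularity is false sharing: if two processors keep updating different words that happen to share one cache line, the line ping-pongs between their caches even though no word is actually shared. A minimal sketch (names and the padding factor are illustrative): each thread's partial sum is padded out to 16 doublewords (128 bytes) so that no two threads write into the same cache line.

C     Padding per-thread partial sums to the 128-byte L2 line size
C     so concurrent updates never fall in the same cache line.
      SUBROUTINE ACCUM(X, N, TOTAL)
      INTEGER N, I, T, NT
      INTEGER PAD, MAXT
      PARAMETER (PAD = 16, MAXT = 128)
      DOUBLE PRECISION X(N), TOTAL
      DOUBLE PRECISION PART(PAD, MAXT)
      INTEGER OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
      TOTAL = 0.0D0
      NT = 1
C$OMP PARALLEL PRIVATE(I, T) SHARED(X, PART, NT)
      T = OMP_GET_THREAD_NUM() + 1
C$OMP MASTER
      NT = OMP_GET_NUM_THREADS()
C$OMP END MASTER
      PART(1, T) = 0.0D0
C$OMP DO
      DO I = 1, N
         PART(1, T) = PART(1, T) + X(I)
      ENDDO
C$OMP END PARALLEL
C     serial reduction of the padded partial sums
      DO T = 1, NT
         TOTAL = TOTAL + PART(1, T)
      ENDDO
      RETURN
      END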

  31. CC-NUMA Architecture: Programming
  Diagram: C = A x B with the columns (index j) of the matrices divided among processors 1, 2, 3.
  • All data is shared
  • Additional optimization: place data close to the processor that will do most of the computations on that data
  • Automatic (compiler) optimizations for single-processor and parallel performance
  • The data access (data exchange) is implicit in the algorithm
  • Except for the additional data placement directives, the source is the same as for single-processor programming (the SMP principle)
  C     every processor holds a column of each matrix:
  C$distribute A(*,block), B(*,block), C(*,block)
  C$omp parallel do
        DO i=1,n
          DO j=1,n
            DO k=1,n
              C(i,j) = C(i,j) + A(i,k)*B(k,j)
            ENDDO
          ENDDO
        ENDDO

  32. Problems of the CC-NUMA Architecture
  • Programming is the SMP programming style plus data placement techniques (directives)
  • The "SMP programming cliff": remote memory latency jumps by a factor of ~3-5, which requires correct data placement
  • Based on a 1 GB/s SCI link with a latency of ~500 ns per hop, a 64-128 processor O2000 has ta(remote)/ta(local) ~ 3-5 -> correct data placement is required

  33. DSM-ccNUMA
  Diagram: shared-memory systems (SMP) are easy to program but hard to scale; massively parallel systems (MPP) are easy to scale but hard to program; distributed shared memory systems (ccNUMA) aim to be both easy to program and easy to scale.

  34. SGI 3200 (2-8p)
  Router-less configurations in a deskside form factor / short rack (17U configuration space).
  Diagram (system topology): two C-bricks (Bedrock ASICs) linked directly through their network ports, with their XIO+ ports feeding P-, I-, or X-bricks; the minimum (2p) system is one C-brick, an I-brick, and a power bay, and the maximum (8p) system is two C-bricks with an I-brick, further P-, I-, or X-bricks, and power bays.

  35. SGI 3400 (4-32p)
  Full-size rack (39U configuration space).
  Diagram (system topology): C-bricks of four processors each connected through 6-port r-bricks, with P-, I-, or X-bricks on the XIO+ ports; the minimum (4p) system is a single C-brick with an I-brick, and the maximum (32p) system fills the rack with eight C-bricks, r-bricks, I/O bricks, and power bays.

  36. SGI 3800 (16-128p)
  Diagram (128p system topology): four racks, each holding C-bricks, 8-port R-bricks, P-, I-, or X-bricks, and power bays; the minimum (16p) and maximum (128p) configurations are shown.

  37. SGI 3800 System: 128 processors
  Diagram: eight groups of 16 processors.

  38. SGI 3800 (32-512p)
  Diagram: one quadrant of a 512p system, built from C-bricks, R-bricks, P-, I-, or X-bricks, and power bays.
  512p power estimates (no I/O or storage included; premium memory required):
  • MIPS = 77 KW
  • Itanium™ = 150 KW
  • McKinley = 231 KW

  39. Router-to-Router Connections for 256 Processor Systems

  40. 512 Processor Systems

  41. R1xK Family of Processors
  MIPS R1x000 is an out-of-order, dynamically scheduled superscalar processor with non-blocking caches.
  • Supports the 64-bit MIPS IV ISA
  • 4-way superscalar
  • Five separate execution units
  • 2 floating-point results / cycle
  • 4-way deep speculative execution of branches
  • Out-of-order execution (48-instruction window)
  • Register re-naming
  • Two-way set-associative non-blocking caches
  • Up to 4 outstanding memory read requests
  • Prefetching of data
  • 1 MB to 8 MB secondary data cache
  • Four user-accessible event counters

  42. Origin 3000 MIPS Processor Roadmap (1999-2003)
  • R10000: 250 MHz, 500 MFlops, 4 MB L2 @ 250 MHz
  • R12000: 300 MHz, 600 MFlops, 8 MB L2 @ 200 MHz
  • R12000A: 400 MHz, 800 MFlops, 8 MB L2 @ 266 MHz
  • R14000(A): 500+ MHz, 1000+ MFlops, 8 MB DDR SRAM @ 250+ MHz
  • R16000: xxx MHz, xxx GFlops
  • R18000: xxx MHz, xxx GFlops
  The chart spans Origin 2000 and O3K-MIPS systems over the 1999-2003 timeline.

  43. R14000 Cache Interfaces

  44. Memory Hierarchy
  Pyramid: 64 registers (speed of access ~1/clock), 32 KB L1 cache, 8 MB L2 cache, memory, and disk (~1 to 100s of GB); access costs marked down the hierarchy are ~2-3 cycles, ~10 cycles, ~100-300 cycles (NUMA), and ~4000 cycles, while device capacity grows at each level.
  Chart: remote latency (ns) versus system size (2p to 512p) for Origin 2000 and Origin 3000; Origin 3000 stays lower at every size, with plotted values ranging from 175 ns on the smallest systems up to 1169 ns on the largest Origin 2000 configuration.
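
Getting performance out of this hierarchy means reusing data while it is still in a fast level. A minimal sketch (block size and names are illustrative, not from the slides): a blocked matrix multiply that works on sub-blocks small enough to stay resident in cache, instead of streaming whole rows and columns from memory on every pass.

C     Blocked matrix multiply sketch: three NB x NB blocks of
C     double precision (about 100 KB for NB = 64) stay cache-resident
C     while the inner three loops run.
      SUBROUTINE MATBLK(A, B, C, N)
      INTEGER N, NB
      PARAMETER (NB = 64)
      DOUBLE PRECISION A(N,N), B(N,N), C(N,N)
      INTEGER I, J, K, II, JJ, KK
      DO JJ = 1, N, NB
         DO KK = 1, N, NB
            DO II = 1, N, NB
C              update the C(II..,JJ..) block using A(II..,KK..) and
C              B(KK..,JJ..); all three blocks are reused many times
C              before being evicted
               DO J = JJ, MIN(JJ+NB-1, N)
                  DO K = KK, MIN(KK+NB-1, N)
                     DO I = II, MIN(II+NB-1, N)
                        C(I,J) = C(I,J) + A(I,K)*B(K,J)
                     ENDDO
                  ENDDO
               ENDDO
            ENDDO
         ENDDO
      ENDDO
      RETURN
      END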

  45. Effects of Memory Hierarchy
  Chart: performance as a function of working-set size, with curves labelled for a 32 KB L1 cache and L2 cache sizes of 1 MB, 2 MB, and 4 MB.
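
Steps like these can be reproduced with a simple sweep that times repeated passes over arrays of growing size; once the working set no longer fits in L1 (32 KB) and later in L2, the time per element jumps. A rough sketch (the ETIME timing routine and the sizes are illustrative assumptions):

C     Working-set sweep sketch: time MAXN accesses for array sizes
C     from 16 KB up to 16 MB and watch the cost per element rise as
C     each cache level is exceeded.
      PROGRAM SWEEP
      INTEGER MAXN
      PARAMETER (MAXN = 2097152)
      DOUBLE PRECISION A(MAXN), S
      REAL ETIME, TARR(2), T0, T1
      INTEGER N, I, R, REPS
      S = 0.0D0
      DO I = 1, MAXN
         A(I) = 1.0D0
      ENDDO
      N = 2048
 10   CONTINUE
      REPS = MAXN / N
      T0 = ETIME(TARR)
      DO R = 1, REPS
         DO I = 1, N
            S = S + A(I)
         ENDDO
      ENDDO
      T1 = ETIME(TARR)
      PRINT *, N*8, ' bytes ', (T1-T0)/REAL(MAXN), ' s/element'
      N = N * 2
      IF (N .LE. MAXN) GOTO 10
C     print S so the summation loop is not optimized away
      PRINT *, S
      END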

  46. Instruction Latencies (R12K)
  Integer units (latency, repeat rate in cycles):
  • ALU 1 - add, sub, logic ops, shift, branch: 1, 1
  • ALU 2 - add, sub, logic ops: 1, 1
  • ALU 2 - signed multiply (32/64 bit): 6/10, 6/10 (unsigned multiply: +1 cycle)
  • ALU 2 - divide (32/64 bit): 35/67, 35/67
  • Address unit - load integer: 2, 1; load floating point: 3, 1; store: -, 1; atomic LL,ADD,SC sequence: 6, 6
  Floating point units (latency, repeat rate in cycles):
  • FPU 1 - add, sub, compare, convert: 2, 1
  • FPU 2 - multiply: 2, 1; multiply-add (madd): 4, 1
  • FPU 3 - divide, reciprocal (32/64 bit): 12/19, 14/21; sqrt (32/64 bit): 18/33, 20/35; rsqrt (32/64 bit): 30/52, 34/56
  A repeat rate of 1 means that, with pipelining, the processor can complete 1 such operation per cycle. Thus the peak rates are 2 integer operations/cycle and 2 fp operations/cycle; for the R14000 @ 500 MHz that is 4 x 500 MHz = 2000 MIPS and 2 x 500 MHz = 1000 Mflop/s.
  The compiler has this table built in. The goal of compiler scheduling is to find instructions that can be executed in parallel to fill all slots: ILP - Instruction Level Parallelism.

  47. Instruction Latencies: DAXPY Example
  DO I=1,n
    Y(I) = Y(I) + A*X(I)
  ENDDO
  Loop parallelism (per loop iteration): 2 loads, 1 store, 1 multiply-add (madd), 2 address increments, 1 loop-end test, 1 branch.
  Processor parallelism (per processor cycle): 1 load or store, 1 ALU1 instruction, 1 ALU2 instruction, 1 FP add, 1 FP multiply.
  • There are 2 loads (x, y) and 1 store (y) = 3 memory ops
  • There are 2 fp operations (+, *), which can be done with 1 madd
  • 3 memory ops require at least 3 cycles (the processor can do 1 memory op per cycle)
  • Theoretically, in 3 cycles the processor can do 6 fp operations
  • Only 2 fp operations are available in the code
  • So the maximum processor speed on this code is 2fp/6fp = 1/3 of peak, i.e. for the R12000 @ 300 MHz: 600/3 = 200 Mflop/s

  48. DAXPY Example: Schedules
  Simple schedule (x load delay 3 cycles, madd delay 4 cycles):
  DO I=1,n
    Y(I) = Y(I) + A*X(I)
  ENDDO
  cycle 0: ld x, x++
  cycle 1: ld y
  cycle 3: madd
  cycle 7: st y, y++, br
  2fp / (8 cycles * 2fp/cy) = 1/8 of peak; R12000 @ 300 MHz ~ 75 Mflop/s
  Unrolled by 2 (x load delay 3 cycles, madd delay 4 cycles):
  DO I=1,n-1,2
    Y(I+0) = Y(I+0) + A*X(I+0)
    Y(I+1) = Y(I+1) + A*X(I+1)
  ENDDO
  cycle 0: ld x0
  cycle 1: ld x1
  cycle 2: ld y0, x+=4
  cycle 3: ld y1, madd0
  cycle 4: madd1
  cycle 7: st y0
  cycle 8: st y1, y+=4, br
  4fp / (9 cycles * 2fp/cy) = 2/9 of peak; ~133 Mflop/s

  49. DAXPY Example: Software Pipelining
  • Software pipelining is the way to fill all processor slots by mixing iterations
  • The number of replications gives how many iterations are mixed
  • The number of replications depends on the distance (in cycles) between the load and the calculation
  • DAXPY: a 6-cycle schedule with 4 fp ops: 4fp / (6 cy * 2fp/cy) = 1/3 of peak
  #<swp> replication 0                          #cy
  ld x0    ldc1   $f0,0($1)                     #[0]
  ld x1    ldc1   $f1,-8($1)                    #[1]
  st y2    sdc1   $f3,-8($3)                    #[2]
  st y3    sdc1   $f5,0($3)                     #[3]
  y+=2     addiu  $3,$2,16                      #[3]
           madd.d $f5,$f2,$f0,$f4               #[4]
  ld y0    ldc1   $f0,-8($2)                    #[4]
           madd.d $f3,$f0,$f1,$f4               #[5]
  x+=2     addiu  $1,$1,16                      #[5]
           beq    $2,$4,.BB21.daxpy             #[5]
  ld y3    ldc1   $f2,0($3)                     #[5]
  #<swp> replication 1                          #cy
  ld x3    ldc1   $f1,0($1)                     #[0]
  ld x2    ldc1   $f0,-8($1)                    #[1]
  st y1    sdc1   $f3,-8($2)                    #[2]
  st y0    sdc1   $f5,0($2)                     #[3]
  y+=2     addiu  $2,$3,16                      #[3]
           madd.d $f5,$f2,$f1,$f4               #[4]
  ld y3    ldc1   $f1,-8($3)                    #[4]
           madd.d $f3,$f1,$f0,$f4               #[5]
  x+=2     addiu  $1,$1,16                      #[5]
  ld y0    ldc1   $f2,0($2)                     #[5]

  50. DAXPY SWP: Compiler Messages
  f77 -mips4 -O3 -LNO:prefetch=0 -S daxpy.f
  • With the -S switch the compiler produces a file daxpy.s with assembler instructions and comments about the software pipelining schedule:
  #<swps> Pipelined loop line 6 steady state
  #<swps>    50 estimated iterations before pipelining
  #<swps>     2 unrolling before pipelining
  #<swps>     6 cycles per 2 iterations
  #<swps>     4 flops        ( 33% of peak) (madds count 2fp)
  #<swps>     2 flops        ( 16% of peak) (madds count 1fp)
  #<swps>     2 madds        ( 33% of peak)
  #<swps>     6 mem refs     (100% of peak)
  #<swps>     3 integer ops  ( 25% of peak)
  #<swps>    11 instructions ( 45% of peak)
  #<swps>     2 short trip threshold
  #<swps>     7 ireg registers used
  #<swps>     6 fgr registers used
  • The schedule reaches the maximum of 1/3 of peak processor performance, as expected
  • Note: it is necessary to switch off prefetching to attain the maximal schedule
