Status – Week 281

Status – Week 281. Victor Moya. Objectives. Research in future GPUs for 3D graphics. Simulate current and future 3D graphic hardware. Finish (someday) the PhD ;). Problems. Information. Choice of the simulation target: Current GPUs. Near future GPUs. Absolutely new GPU designs.

Presentation Transcript


  1. Status – Week 281 Victor Moya

  2. Objectives • Research in future GPUs for 3D graphics. • Simulate current and future 3D graphics hardware. • Finish (someday) the PhD ;).

  3. Problems • Information. • Choice of the simulation target: • Current GPUs. • Near future GPUs. • Absolutely new GPU designs. • Future is hard to predict. • But GPUs change very fast. • Fierce competition between ATI and NVidia. Matrox and 3DLabs follow (3DLabs can rule the workstation market). SIS and VIA as OEM.

  4. Status • Designing a hardware 3D graphics pipeline: • Command processors. • Vertex Shader.  • Divide by w, Clip, Culling and Triangle Setup. • Rasterization. • Pixel shaders. • Antialiasing. • Designing the simulator.

  5. 3D Graphics Pipeline

  6. Geometry • Vertex operations: • (1) Transform coordinates and normals • Model => World. • World => Eye. • (2) Normalize the length of the normal. • (3) Compute vertex lighting. • (4) Transform texture coordinates. • (5) Transform coordinates to clip coordinates (projection). • (8) Divide coordinates by w. • (9) Apply affine viewport transform (x, y, z).
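A minimal sketch of the position path through steps (1), (5), (8) and (9) may help here. It assumes row-major 4x4 matrices, an OpenGL-style NDC range of [-1, 1] and a depth range of [0, 1]; all function names are hypothetical, and normals, lighting and texture coordinates are left out.

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def perspective_divide(clip):
    """Step (8): divide x, y, z by w (assumes w != 0)."""
    x, y, z, w = clip
    return [x / w, y / w, z / w]

def viewport(ndc, width, height, near=0.0, far=1.0):
    """Step (9): affine map from NDC [-1, 1] to window coordinates."""
    x, y, z = ndc
    return [(x + 1.0) * 0.5 * width,
            (y + 1.0) * 0.5 * height,
            near + (z + 1.0) * 0.5 * (far - near)]

def transform_vertex(position, model_view, projection, width, height):
    eye = mat_vec(model_view, position)    # steps (1): model => world => eye
    clip = mat_vec(projection, eye)        # step (5): projection to clip space
    ndc = perspective_divide(clip)         # step (8)
    return viewport(ndc, width, height)    # step (9)

# Identity matrices keep the example easy to check by hand.
IDENTITY = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
window = transform_vertex([1.0, -1.0, 0.0, 2.0], IDENTITY, IDENTITY, 640, 480)
```

Steps (6) and (7), primitive assembly and clipping, sit between the projection and the divide, which is why the slide numbers them out of order.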

  7. Geometry • Primitive operations: • (6) Primitive assembly • (7) Clipping: • (10) Backface cull: eliminate back-facing triangles. • Primitive generation: new pipeline stage (ATI TruForm).
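Backface culling (10) is commonly decided from the sign of the triangle's area in window coordinates. The sketch below assumes a y-up coordinate system and counter-clockwise front faces; both conventions and the function names are assumptions, not something the slides specify.

```python
def signed_area2(a, b, c):
    """Twice the signed area of triangle abc; positive when counter-clockwise (y up)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])

def is_back_facing(a, b, c, front_is_ccw=True):
    """True when the winding is opposite to the front-face convention.

    Zero-area (degenerate) triangles are reported as not back-facing here;
    a real pipeline would usually discard them separately.
    """
    area = signed_area2(a, b, c)
    return area < 0 if front_is_ccw else area > 0

front = is_back_facing((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
back = is_back_facing((0.0, 0.0), (0.0, 1.0), (1.0, 0.0))
```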

  8. Vertex Shader • VS 1.0, 1.1 and 1.2 (current technology) for Direct3D 8 and 8.1. OpenGL extensions: ARB_vertex_program (finally in OpenGL v1.4), NV_vertex_program1_1 (NVidia), EXT_vertex_shader (ATI). • No branching. • Single cycle execution latency (?). • Single issue instruction each cycle. • Simple in order pipeline (?).

  9. Vertex Shader • 16 input registers (read only). • 15 output registers (write only). • 12 temporary registers (read/write). • 96 constant registers (read only or read/write?). • 256 instructions max

  10. Vertex Shader

      Opcode  Inputs              Output (vector or   Operation
              (scalar or vector)  replicated scalar)
      ------  ------------------  ------------------  --------------------------
      ARL     s                   address register    address register load
      MOV     v                   v                   move
      MUL     v,v                 v                   multiply
      ADD     v,v                 v                   add
      MAD     v,v,v               v                   multiply and add
      RCP     s                   ssss                reciprocal
      RSQ     s                   ssss                reciprocal square root
      DP3     v,v                 ssss                3-component dot product
      DP4     v,v                 ssss                4-component dot product
      DST     v,v                 v                   distance vector
      MIN     v,v                 v                   minimum
      MAX     v,v                 v                   maximum
      SLT     v,v                 v                   set on less than
      SGE     v,v                 v                   set on greater equal than
      EXP     s                   v                   exponential base 2
      LOG     s                   v                   logarithm base 2
      LIT     v                   v                   light coefficients
      DPH     v,v                 ssss                homogeneous dot product
      RCC     s                   ssss                reciprocal clamped
      SUB     v,v                 v                   subtract
      ABS     v                   v                   absolute value
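To make the "ssss" (replicated scalar) outputs concrete, here is a rough emulation of DP4 and MAD, used to transform a position with four DP4 instructions, one per output component. The helper names and the register layout (matrix rows in four constant registers) are assumptions made for illustration.

```python
def dp4(a, b):
    """DP4: 4-component dot product, result replicated to all four components."""
    s = sum(x * y for x, y in zip(a, b))
    return [s, s, s, s]

def mad(a, b, c):
    """MAD: component-wise multiply and add."""
    return [x * y + z for x, y, z in zip(a, b, c)]

def write_masked(dst, value, mask):
    """Write only the components named in mask, e.g. "x" or "xy"."""
    return [value[i] if name in mask else dst[i] for i, name in enumerate("xyzw")]

def transform_position(c, v):
    """Roughly: DP4 out.x, c[0], v;  DP4 out.y, c[1], v;  and so on."""
    out = [0.0, 0.0, 0.0, 0.0]
    for row, name in enumerate("xyzw"):
        out = write_masked(out, dp4(c[row], v), name)
    return out

# Translation by (2, 3, 4), stored row by row in four constant registers.
constants = [[1.0, 0.0, 0.0, 2.0],
             [0.0, 1.0, 0.0, 3.0],
             [0.0, 0.0, 1.0, 4.0],
             [0.0, 0.0, 0.0, 1.0]]
hpos = transform_position(constants, [1.0, 1.0, 1.0, 1.0])
```

Because each DP4 replicates its result, the write mask is what selects the single component each instruction contributes.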

  11. Clipping • Clip geometry primitives with the view frustum (6 planes). • Clip geometry primitives with the user clip planes. • Techniques used: • Guard-Band Clipping. • Homogeneous rasterization avoids clipping in the geometry stage.
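One common way to decide whether a triangle needs clipping at all is to compute per-vertex outcodes in clip space. The sketch below assumes the OpenGL-style volume -w <= x, y, z <= w and models the guard band as a hypothetical scale factor on the x and y planes only; user clip planes are not covered.

```python
def outcode(v, guard=1.0):
    """Bit mask of the planes that clip-space vertex v lies outside of."""
    x, y, z, w = v
    code = 0
    if x < -guard * w: code |= 1
    if x > guard * w:  code |= 2
    if y < -guard * w: code |= 4
    if y > guard * w:  code |= 8
    if z < -w:         code |= 16   # near and far planes get no guard band
    if z > w:          code |= 32
    return code

def classify(triangle, guard=1.0):
    """Return "reject", "accept" or "clip" for a triangle of clip-space vertices."""
    codes = [outcode(v) for v in triangle]
    if codes[0] & codes[1] & codes[2]:
        return "reject"             # all vertices outside the same frustum plane
    guard_codes = [outcode(v, guard) for v in triangle]
    if (guard_codes[0] | guard_codes[1] | guard_codes[2]) == 0:
        return "accept"             # inside the guard band: the rasterizer scissors the rest
    return "clip"

crossing = [(0.0, 0.0, 0.0, 1.0), (1.5, 0.0, 0.0, 1.0), (0.0, 0.5, 0.0, 1.0)]
without_guard = classify(crossing)             # crosses the right frustum plane
with_guard = classify(crossing, guard=2.0)     # still inside a 2x guard band
```

With a guard band, most triangles that straddle a screen edge skip geometric clipping, leaving it mainly for the near and far planes.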

  12. Guard-Band Clipping

  13. Homogeneous coordinates • “Triangle Scan Conversion using 2D Homogeneous Coordinates”, Olano and Greer.

  14. Rasterization • Setup (per-triangle). • Sampling (triangle = {fragments}). • Interpolation (interpolate colors and coordinates).

  15. Rasterization • Converts primitives to fragments. • Primitive: point, line, polygon, … • Fragment: transient data structure short x, y; long depth; short r, g, b, a; • Fragment selection. • Parameter Assignment (color, depth ...).
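The three steps (setup, fragment selection, parameter assignment) can be sketched with edge functions and barycentric interpolation. This is only one possible scheme, not necessarily the one the simulator uses: it samples at pixel centers, applies no fill rule for shared edges, and uses floats where the slide's fragment structure uses shorts and longs.

```python
import math
from dataclasses import dataclass

@dataclass
class Fragment:
    x: int
    y: int
    depth: float
    color: tuple  # (r, g, b, a)

def edge(a, b, p):
    """Edge function: twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(v0, v1, v2, attrs):
    """v0..v2 are (x, y) window positions; attrs holds one (depth, r, g, b, a) per vertex."""
    area = edge(v0, v1, v2)                       # setup (per triangle)
    if area == 0:
        return []
    xs, ys = (v0[0], v1[0], v2[0]), (v0[1], v1[1], v2[1])
    fragments = []
    for y in range(math.floor(min(ys)), math.ceil(max(ys)) + 1):
        for x in range(math.floor(min(xs)), math.ceil(max(xs)) + 1):
            p = (x + 0.5, y + 0.5)                # sample at the pixel center
            w = (edge(v1, v2, p), edge(v2, v0, p), edge(v0, v1, p))
            inside = all(e >= 0 for e in w) if area > 0 else all(e <= 0 for e in w)
            if not inside:
                continue                          # fragment selection
            b = [e / area for e in w]             # barycentric weights
            values = [b[0] * a0 + b[1] * a1 + b[2] * a2 for a0, a1, a2 in zip(*attrs)]
            fragments.append(Fragment(x, y, values[0], tuple(values[1:])))
    return fragments

white = (1.0, 1.0, 1.0, 1.0)
fragments = rasterize((0.0, 0.0), (4.0, 0.0), (0.0, 4.0),
                      [(0.0,) + white, (1.0,) + white, (0.0,) + white])
```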

  16. Programmable Pipeline

  17. Vertex Program

  18. Vertex Program

  19. NV_vertex_program2 • ARL (new support for four-component A0 and A1 instead of just A0.x) • ARR (similar to ARL, but rounds instead of truncating before storing the integer result in an address register) • BRA, CAL, RET (branching instructions) • COS, SIN (high-precision trigonometric functions) • FLR, FRC (floor and fraction of floating-point values) • EX2, LG2 (high-precision exponentiation and logarithm functions) • ARA (adds pairs of components of an address register; useful for looping and other operations) • SEQ, SFL, SGT, SLE, SNE, STR (“set on” instructions similar to SLT, SGE) • SSG (“set sign” operation; generates a vector holding –1.0 for negative operand components, 0 for zero-value components, and +1.0 for positive components)

  20. NV_vertex_program2 Overview • 1. Condition codes • 2. Branching & subroutines • 3. Even faster performance • 4. Nineteen new instructions • 5. New source modifiers • 6. Clip plane support • 7. More registers & instructions

  21. NV_vertex_program2 Resource Limits • 256 vertex program parameters (up from 96). • 16 temporary registers (up from 12). • Two 4-component address registers (up from one single-component address register). • 256 static instructions per program (up from 128). • Given branching, up to 65536 dynamic instructions can execute before the program is terminated, which guards against infinite loops.

  22. NV_vertex_program2 Source Modifiers • Source operand absolute value • Example: MOV R0, |R1|; • In addition to source negation & swizzling • Example: MAD R0, -|R1|.yzwy, |R2|, -R3.w; • Swizzle, negate, & absolute value operations are “free” source modifiers
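The source modifiers can be emulated as a small operand-fetch step that runs before the instruction itself. The sketch below evaluates the slide's MAD example, reading its last operand as -R3.w; the helper names and register values are made up for illustration.

```python
def src(reg, swizzle="xyzw", absolute=False, negate=False):
    """Fetch a source operand: swizzle, then absolute value, then negation."""
    if len(swizzle) == 1:
        swizzle = swizzle * 4          # a scalar swizzle such as ".w" is replicated
    out = [reg["xyzw".index(c)] for c in swizzle]
    if absolute:
        out = [abs(v) for v in out]
    if negate:
        out = [-v for v in out]
    return out

def mad(a, b, c):
    """MAD: component-wise multiply and add."""
    return [x * y + z for x, y, z in zip(a, b, c)]

R1 = [1.0, -2.0, 3.0, -4.0]
R2 = [-1.0, 2.0, -3.0, 4.0]
R3 = [0.0, 0.0, 0.0, 5.0]

# MAD R0, -|R1|.yzwy, |R2|, -R3.w;
R0 = mad(src(R1, "yzwy", absolute=True, negate=True),
         src(R2, absolute=True),
         src(R3, "w", negate=True))
```

Since the modifiers only touch the operand fetch, the MAD itself is unchanged, which is consistent with the slide calling them "free".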

  23. NV_vertex_program2 Condition Codes (1) • Condition code state • 4-component register stores condition code values • Four possible values • LT – less than zero • EQ – equal to zero • GT – greater than zero • UN – unordered, for comparisons involving NaN • Most instructions optionally update condition code state • Indicated with “C” suffix: DP4C, MOVC, etc. • “CC” pseudo-register used to just update condition codes

  24. NV_vertex_program2 Condition Codes (2) • Optional condition code based destination masking • Example: MOV R1.xy(NE.z), R0; • Copy R0 components to R1’s X & Y components except when condition code’s Z component is EQ • Condition code rules: EQ, equal; GE, greater or equal; GT, greater than; LE, less or equal; LT, less than; NE, not equal; FL, false; and TR, true • Note that condition code masking rule can swizzle condition code components
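A small emulation of the condition code register and of the masked move from the example may make the rules easier to follow. The names are hypothetical, and treating UN as satisfying NE is an assumption of this sketch rather than something stated on the slides.

```python
import math

def condition_codes(result):
    """Per-component condition code for a result vector (what a "C" suffix would update)."""
    def code(v):
        if math.isnan(v):
            return "UN"
        return "LT" if v < 0 else "GT" if v > 0 else "EQ"
    return [code(v) for v in result]

RULES = {
    "EQ": lambda c: c == "EQ",
    "NE": lambda c: c != "EQ",            # assumption: UN counts as not equal
    "LT": lambda c: c == "LT",
    "LE": lambda c: c in ("LT", "EQ"),
    "GT": lambda c: c == "GT",
    "GE": lambda c: c in ("GT", "EQ"),
    "FL": lambda c: False,
    "TR": lambda c: True,
}

def masked_move(dst, source, write_mask, rule, cc, cc_swizzle="xyzw"):
    """MOV dst.write_mask(rule.cc_swizzle), source;"""
    if len(cc_swizzle) == 1:
        cc_swizzle = cc_swizzle * 4       # ".z" tests the Z code for every component
    out = list(dst)
    for i, name in enumerate("xyzw"):
        tested = cc["xyzw".index(cc_swizzle[i])]
        if name in write_mask and RULES[rule](tested):
            out[i] = source[i]
    return out

R0 = [9.0, 8.0, 7.0, 6.0]
R1 = [0.0, 0.0, 0.0, 0.0]

# MOV R1.xy(NE.z), R0;  with the Z condition code equal to EQ: nothing is written.
blocked = masked_move(R1, R0, "xy", "NE", condition_codes([1.0, -1.0, 0.0, 1.0]), "z")
# Same instruction with the Z condition code equal to GT: X and Y are copied.
copied = masked_move(R1, R0, "xy", "NE", condition_codes([0.0, 0.0, 2.0, 0.0]), "z")
```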

  25. ATI R300. Vertex Shader.

  26. 3DLabs P10. Pipeline.

  27. Matrox Parhelia. Pipeline.
