
Sony Playstation: Graphics in a Crunch



Presentation Transcript


  1. Sony Playstation: Graphics in a Crunch • By Paul Zimmons

  2. PSX Sources: • http://www.classicgaming.com/aec/css/html/psx_development.html • http://psx.rules.org/ • http://dev.paradogs.com/ • http://www.eetimes.com • More sources listed for PS2

  3. History • Play Station was announced in June 1991 at the CES • Part of a deal with Nintendo to add a CD-ROM attachment to the Super Nintendo system • The deal fell through • 1993 - PlayStation-X announced • December 3, 1994 - Japan release - $387 • 300,000 units in 30 days • September 9, 1995 - US release - $299

  4. December 1994 • Onyx RealityEngine 2 was king ~ $100K • Indigo just starting ~ $15-30K • PC - Pentium 90, 16 MB RAM, 1 MB video, 500 MB HD • Mac - 100 MHz PPC604, 16 MB RAM, 2 MB video, 500 MB HD, announced for Q2 1995 • Doom II came out October 10, 1994

  5. Your PC in 1994

  6. Your Playstation in 1994

  7. People of PSX • Ken Kutaragi • ‘Father’ of the Playstation • Designed audio chip for initial defunct SNES Nintendo deal • Headed Playstation project • Masakazu Suzuoki (suzu@rd.scei.sony.co.jp) • More technical • Design started in May 1992 (CD, 1M transistors goals)

  8. PSX HW Overview • MIPS R3000A main CPU at 33.8688 MHz • 32 bit processor, 4 KB I cache, 1 KB D cache • 2 MB of main system memory • 1 MB of video memory, 2 KB texture cache • 32 bit bus, 132 MB/sec • 2x CD-ROM (300 KB/sec) • 512 KB sound buffer, 24 simultaneous sounds • ~300K unlit, untextured polygons/sec

  9. CPU Notes • Some ranges of logical memory space benefit from the I cache, some don’t • D cache is accessible by programmers • Between the CPU and main memory are the I cache, the R buffer (reads) and the W buffer (writes) • Buffers speed up operations but make exact prediction of program timing nearly impossible

  10. Sound Notes • Pitch can be changed on the fly • Pitch of one sound can be varied by the volume of another sound • Pseudo random noise built in (changed by clock changes) • Attack, decay, sustain, release • Separate reverb unit • Buffer is used for reverb, SPU, xfers to MM

  11. Graphics Overview

  12. Geometry Transform Engine • Coordinate Transforms • Light Source Calculations • Coprocessor for the CPU (separate chip) • Fixed point matrix and vector ops • Sin, Cos on R3000 (rot matrices) • Operates in parallel • 1 sign bit, 3 integer bits, 12 fraction bits
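
The 1 sign / 3 integer / 12 fraction layout on slide 12 is a 16-bit fixed-point number in which 4096 stands for 1.0. A minimal Python sketch of that format follows. The helper names (`to_fixed`, `fixed_mul`, `to_float`) and the saturating conversion are illustrative assumptions, not the GTE's actual interface.

```python
FRAC_BITS = 12            # 12 fraction bits, as on the slide
ONE = 1 << FRAC_BITS      # 4096 represents 1.0
INT16_MIN, INT16_MAX = -32768, 32767  # 1 sign + 3 integer + 12 fraction = 16 bits


def to_fixed(x: float) -> int:
    """Convert a float to 4.12 fixed point, saturating to the 16-bit range (assumed)."""
    value = int(round(x * ONE))
    return max(INT16_MIN, min(INT16_MAX, value))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two 4.12 values; the raw product has 24 fraction bits, so shift back."""
    return (a * b) >> FRAC_BITS


def to_float(v: int) -> float:
    """Convert a 4.12 fixed-point value back to a float."""
    return v / ONE


half = to_fixed(0.5)
quarter = fixed_mul(half, half)
```

With 3 integer bits the representable range is roughly -8.0 up to just under 8.0, which suits rotation-matrix entries in [-1, 1].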

  13. GPU • Receives drawing instructions from the CPU • Addresses from the CPU map to the frame buffer • Textures and palettes are sent from the CPU to the frame buffer • GPU uses that info with the coordinates and color info (lighting) from the GTE

  14. Frame Buffer • Dual ported, can also go direct MM to FB • Can draw while you display

  15. Frame Buffer • Supports 256x240 up to 640x480 • Can do interlaced or not • 15 bit color or 24 bit color (no drawing operations in 24 bit mode)

  16. Frame Buffer • 640x480x24 is really done as 960x480x16 • Virtual bits per pixel • GPU only works on 16 bits so no drawing while showing a movie sequence • Frame buffer is used for current output, drawing area, textures and color tables • Color palette = color table = CLUT = color look up table = color index mode • No separate texture memory
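
The "640x480x24 is really 960x480x16" statement on slide 16 is plain bit arithmetic: the same number of bits per scanline, counted in 16-bit units. A short sketch, with hypothetical function names, also checks the result against the 1 MB of video memory from slide 8:

```python
FRAME_BUFFER_BYTES = 1024 * 1024  # 1 MB of video memory (slide 8)


def width_in_16bit_pixels(width: int, bits_per_pixel: int) -> int:
    """Scanline width when the same bits are viewed as 16-bit frame-buffer pixels."""
    total_bits = width * bits_per_pixel
    if total_bits % 16:
        raise ValueError("scanline does not fill a whole number of 16-bit pixels")
    return total_bits // 16


def footprint_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Bytes occupied by one display area of the given size and depth."""
    return width_in_16bit_pixels(width, bits_per_pixel) * height * 2


virtual_width = width_in_16bit_pixels(640, 24)
size_24bit = footprint_bytes(640, 480, 24)
```

Under these assumptions a single 640x480 24-bit image takes 921,600 bytes, which leaves little of the 1 MB frame buffer for a second buffer, textures or color tables.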

  17. Typical Frame Buffer Layout • Can also change drawing environment properties while drawing.

  18. Primitive Types • Data is in Main Memory • Drawing primitives - seen on screen • Special primitives - change drawing parameters while drawing occurs • Polygon Primitive • 3 or 4 sides, flat or Gouraud, textured or not • Line Primitive • Line (A,B), (A,B,C), (A,B,C,D), gradient or no

  19. Primitive Types • Sprite Primitive - rectangle • Sprite - tex map, Tile - no tex map • free, 1x1, 8x8, 16x16 pixels • Special Primitive • Change window, window clipping, texture window, drawing offset

  20. Drawing Primitives • Draw on pixel centers, pixel center inside OK • Pixel center outside has rules • If pixel to right is inside -> draw • If pixel to left is inside -> no draw • If pixel above is inside -> no draw • If pixel below is inside -> draw • Don’t draw boundaries more than once

  21. Ordering Tables • Like Word diagrams (grouping, order) • Similar to a linked list (draw1->draw2) • GPU renders while CPU goes on

  22. Z Sorting with Ordering Tables • Calculate the primitive’s position in the table based on its Z value • GTE creates ordering table while converting (x,y,z) coordinates to (xs,ys), zw/4 returned • 256 entries: AddPoly(ot+256-z, poly0) • i.e. Painter’s algorithm style back to front • Last -> Next to Last -> Before that -> ..closest
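
The ordering-table idea on slides 21 and 22 amounts to a bucket sort on depth followed by a back-to-front walk. The sketch below models it with Python lists rather than the linked list the hardware follows; `make_ot`, `add_prim` and `draw_order` are hypothetical names, and indexing buckets directly by a quantized depth is an assumption made for illustration.

```python
OT_SIZE = 256  # number of ordering-table entries, as on the slide


def make_ot():
    """Create an empty ordering table: one bucket per quantized depth."""
    return [[] for _ in range(OT_SIZE)]


def add_prim(ot, z, prim):
    """File a primitive under its depth, clamping to the table range."""
    index = max(0, min(OT_SIZE - 1, int(z)))
    ot[index].append(prim)


def draw_order(ot):
    """Walk the table from the farthest bucket to the nearest (painter's algorithm)."""
    order = []
    for bucket in reversed(ot):
        order.extend(bucket)
    return order


ot = make_ot()
add_prim(ot, 10, "near")
add_prim(ot, 200, "far")
add_prim(ot, 100, "middle")
order = draw_order(ot)
```

Insertion is constant time per primitive, which is why this scheme is cheaper than a full sort; the cost is that primitives sharing a bucket are not ordered relative to each other.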

  23. Reverse OT • Normal OT approaches 0 close to viewer (!) • Let’s try 1/z • Index into OT with 1/z • But then we are indexing in reverse • Reverse the order of the table • OT: ot[0]->ot[1]->… ot[SIZE-1] • ROT: ot[SIZE-1]->ot[SIZE-2] … ->ot[0]

  24. Transform to Screen • (Wx,Wy,Wz) - World, (Sx,Sy,Sz) - Screen • (m00, m01.. m22) - rotation matrix

  25. Now Perspective Xform • Project onto screen • Screen is at distance h from the viewer (figure labels: Sy, Sz, h)
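
Slides 24 and 25 describe a rotation into screen space followed by a projection onto a screen at distance h. A minimal sketch under the usual similar-triangles assumption (projected coordinate = h * S / Sz) is given here; the function names and the added translation vector `t` (suggested by the "3x3 mult and vec add" remark on slide 34) are illustrative, not taken from the slides' missing figures.

```python
def rotate_translate(m, t, w):
    """Apply a 3x3 rotation matrix m and translation t to world point w."""
    return tuple(
        sum(m[row][col] * w[col] for col in range(3)) + t[row]
        for row in range(3)
    )


def project(s, h):
    """Project screen-space point s onto a screen at distance h from the viewer."""
    sx, sy, sz = s
    if sz <= 0:
        raise ValueError("point is at or behind the viewer")
    return (h * sx / sz, h * sy / sz)


identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
screen_point = rotate_translate(identity, (0, 0, 0), (2, 4, 8))
projected = project(screen_point, h=4)
```

Doubling Sz halves both projected coordinates, which is the foreshortening the divide provides.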

  26. Packet Buffers • PB=Area in memory for OT and primitives • CPU and graphics are not in parallel • Have two sets of Packet Buffers

  27. Texture Mapping • Textures are stored in the frame buffer • Textures are in Texture Pages (256 x 256) • X coord multiple of 64, Y multiple of 256 • 4, 8, 16 bit (15+1) • 4, 8 use CLUT (palette of 16 or 256 colors)
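
Because textures live in the 16-bit frame buffer, a 256-texel-wide page occupies fewer frame-buffer pixels at lower bit depths. The sketch below works out that width; the function name is hypothetical, and tying the result to the "X coord multiple of 64" rule is an inference from the arithmetic rather than something the slide states.

```python
FB_PIXEL_BITS = 16  # the frame buffer is addressed in 16-bit pixels


def page_width_in_fb_pixels(bits_per_texel: int, page_texels: int = 256) -> int:
    """Width of one texture page measured in 16-bit frame-buffer pixels."""
    return page_texels * bits_per_texel // FB_PIXEL_BITS


widths = {bpp: page_width_in_fb_pixels(bpp) for bpp in (4, 8, 16)}
```

The narrowest case (4 bit) comes out at 64 frame-buffer pixels, which is consistent with texture pages being placed at X coordinates that are multiples of 64.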

  28. Texture Cache • GPU has 2 KB of texture cache • Faster than frame buffer reads • Texture reads fill the texture cache • Subsequent reads hit the cache • Capacity depends on your bits per pixel • 4 bpp ==> 64x64 pixels saved • 8 bpp ==> 64x32 pixels saved • 16 bpp ==> 32x32 pixels saved
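
The three cache sizes on slide 28 follow directly from dividing 2 KB by the texel size. A short check, with a hypothetical helper name:

```python
CACHE_BYTES = 2 * 1024  # 2 KB texture cache


def cached_texels(bits_per_texel: int) -> int:
    """Number of texels that fit in the texture cache at a given depth."""
    return CACHE_BYTES * 8 // bits_per_texel


capacity = {bpp: cached_texels(bpp) for bpp in (4, 8, 16)}
```

The totals (4096, 2048 and 1024 texels) match the 64x64, 64x32 and 32x32 regions listed on the slide.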

  29. Performance • Large polygons directly mean longer render time • Semi-transparent takes longer (R+W vs. W) • # of reads and writes is frame time • 4 bit means 4 texels per read

  30. Performance Calculation • Cycles for 100x100 texture to 1/2 size 50x100 in 4 bit mode, cache misses always • Read: 100x100/4 = 2500 • Write: 50x100 = 5000 • Total: 7500 • So half texture size does not mean half the time • Ratio gets bad if texture is oblique to viewer
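
The cycle count on slide 30 can be reproduced with a small cost model: every source texel is read once (cache always missing), each 16-bit read delivers several texels, and every destination pixel costs one write. This is a sketch of the slide's arithmetic only; the function name and the one-cycle-per-access assumption are illustrative.

```python
def draw_cycles(src_w, src_h, dst_w, dst_h, bits_per_texel):
    """Estimate read + write cycles for drawing a texture, assuming no cache hits."""
    texels_per_read = 16 // bits_per_texel           # 4 bit mode: 4 texels per read
    texels = src_w * src_h
    reads = -(-texels // texels_per_read)            # ceiling division
    writes = dst_w * dst_h
    return reads + writes


half_size = draw_cycles(100, 100, 50, 100, bits_per_texel=4)
full_size = draw_cycles(100, 100, 100, 100, bits_per_texel=4)
ratio = half_size / full_size
```

Under this model the half-size draw costs 7500 cycles against 12500 for the full-size one, so halving the drawn area saves only 40% of the time because the read cost does not shrink.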

  31. More Performance • Enlarging textures is faster than reducing • Texture cache receives multiple hits per pixel • No filtering • If you always hit the cache (never read from the frame buffer), 4, 8, 16 bits all take same time • Or if you make a repeated pattern small enough • Clipping generates empty cycles • Above and below OK but not side to side

  32. Texture Mapping • No Z used, no perspective: U = a0*x + a1*y + a2, V = b0*x + b1*y + b2 • Perspective correct is: U = (a0*x + a1*y + a2*z + a3) / (c0*x + c1*y + c2*z + c3), V = (b0*x + b1*y + b2*z + b3) / (c0*x + c1*y + c2*z + c3)

  33. Texture Mapping • If z value doesn’t vary much then OK • Otherwise you get: • Diagonal distortions of textures (split polys)
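
The distortion described on slides 32 and 33 can be seen by interpolating a texture coordinate along one edge in both ways. The sketch below compares linear (affine) interpolation with the standard perspective-correct form, which interpolates u/z and 1/z and then divides. The function names and sample values are illustrative assumptions.

```python
def affine_u(u0, u1, t):
    """Screen-space linear interpolation, ignoring depth."""
    return (1 - t) * u0 + t * u1


def perspective_u(u0, z0, u1, z1, t):
    """Perspective-correct interpolation: interpolate u/z and 1/z, then divide."""
    numerator = (1 - t) * u0 / z0 + t * u1 / z1
    denominator = (1 - t) / z0 + t / z1
    return numerator / denominator


# Edge running from depth 1 to depth 3, sampled at the screen-space midpoint.
affine_mid = affine_u(0.0, 1.0, 0.5)
correct_mid = perspective_u(0.0, 1.0, 1.0, 3.0, 0.5)

# Same edge with constant depth: the two methods agree.
flat_mid = perspective_u(0.0, 2.0, 1.0, 2.0, 0.5)
```

With depth varying from 1 to 3 the affine midpoint is 0.5 while the correct value is about 0.25; with constant depth both give 0.5, which is why the error is acceptable when z does not vary much across a polygon.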

  34. Geometry Transforms • Local coord -> World coord -> Screen • Local has rot and trans to world • World has rot and trans to viewer • Can concatenate to 3x3 mult and vec add • Then divide to do perspective correct • Lighting is similar (loc normal to world)

  35. Normal Line Clipping • Back face culling (in screen space) • Order of vertices in screen space seems to determine whether they are back face or not (figure: triangle with vertices 0, 1, 2)
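
A common way to test vertex order in screen space is the sign of the 2D cross product of two triangle edges. The sketch below uses that technique; which sign counts as front-facing depends on the winding convention and on the direction of the Y axis, so treating a positive value as front-facing is an assumption here, as are the function names.

```python
def signed_area2(p0, p1, p2):
    """Twice the signed area of the screen-space triangle (2D cross product)."""
    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])


def is_back_face(p0, p1, p2):
    """Cull triangles whose vertex order gives a non-positive area (assumed convention)."""
    return signed_area2(p0, p1, p2) <= 0


front = is_back_face((0, 0), (1, 0), (0, 1))
flipped = is_back_face((0, 0), (0, 1), (1, 0))
```

Swapping any two vertices flips the sign, so the same triangle seen from behind is rejected without needing a 3D normal.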

  36. Depth Cueing • Use vertex colors (and blend) • Only done with black or if texture and back are close together • Use texture • Use MIP MAP but farther maps are darker • MIP MAP based on ‘size’ of polygon => depth • Can also change CLUT based on drawing order

  37. MIP MAP • Must be within the same texture page

  38. Meshing • Strip Mesh • Round Mesh

  39. <PS-X OS> • Environment for developer and program control • Takes up max of 64 KB of RAM • OS system tables are not hidden, for speed • Requires careful programming • Multi-tasking OS (background music + drawing) • Multiple file systems (through driver)

  40. PS2 • Totally different

  41. PS 2 Sources • 2 Main Papers: • “A Microprocessor with a 128-Bit CPU, Ten Floating-Point MAC’s, Four Floating-Point Dividers, and an MPEG-2 Decoder” by Masakazu Suzuoki et al. In IEEE Journal of Solid-State Circuits, Vol. 34, No. 11, November 1999. Pages 1608-1618 • “Designing and Programming the Emotion Engine” by Masaaki Oka and Masakazu Suzuoki. IEEE Micro, November-December 1999. Pages 20-28. • Chip Images • http://fuji.stanford.edu/seminars/spring99/slides/may13/sld001.html, Slides 29, 30, 36 • GPU Info: http://www.g-o-l.com/ck/speciali/altro/graphics-synthesizer.pdf

  42. CPU (Emotion Engine) • 250 MHz • 32 MB RAMBUS • 2 GB/sec bus • MIPS core with vector coprocessors • 128 bit internal and external pathways • VLIW • 10.5 M transistors

  43. CPU

  44. New Design Goals • Behavior synthesis • Dynamics (distance, Newton iterations) • Geometry • More surface processing • Texture compression

  45. Three in One • RISC core with floating point • 128 bit registers • Two floating point vector units • VPU0 for behavior and physics • VPU1 for geometry • Independent physics and geometry • Because of penalties with vector processors

  46. Processor Organization

  47. Acronyms • IPU = MPEG2 decoder • DMAC = direct memory access • EFU = elementary function unit • SPR = scratch pad RAM • Used for communicating between subprocs • GIF = graphics synthesizer interface unit • Interprets display lists

  48. Vector Processing Unit

  49. VPU • 4D quantities (x,y,z,w), (r,g,b,a) • 4 multiply accumulators (FMAC) • Big penalties for branches and context switches and interrupts • Cache • Swap out big chunks at a time • VLIW bad efficiency • Break into two parts
