
Status – Week 207

Status – Week 207. Victor Moya. Summary. Z Test box. Z Compression. Z Cache. Stencil. HZ Box. HZ Test. Traces. Z Test box. Z Test box includes: Z cache. Z encoder (compress and reference value). Z decoder (decompress). Z test. Z update. Stencil test. Stencil update.


Presentation Transcript


  1. Status – Week 207 Victor Moya

  2. Summary • Z Test box. • Z Compression. • Z Cache. • Stencil. • HZ Box. • HZ Test. • Traces.

  3. Z Test box • Z Test box includes: • Z cache. • Z encoder (compress and reference value). • Z decoder (decompress). • Z test. • Z update. • Stencil test. • Stencil update.

  4. [Diagram: Z Test box datapath, labels as extracted: Fragments/Stamps Reference Z value Fetch Enc Z Cache Read Compressed Z Line/Block Stencil Test Dec Z Test Stencil Update Write Fragments/Stamps]

  5. Z Compression. • ATI HOT 3D in Eurographics 2000. • 8x8 pixel block (Z cache line). • DDPCM: differential differential pulse code modulation. • Two modes: • ½ of original size. • ¼ of original size. • Entropy encoder. • Entropy encoders? • Huffman. • Arithmetic encoder.

  6. 1D Z Compression [Diagram: 8 input z values, 13 subtraction ("-") nodes, Entropy Encoder.]
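
As a rough illustration of the differential-differential idea on slides 5 and 6, the sketch below takes first and then second differences of 8 Z values (7 + 6 = 13 subtractions). For Z values that lie on a plane the second differences are all zero, which is what would make a following entropy encoder effective. The function names, and the choice to keep the first value and first slope verbatim, are assumptions for illustration and not the ATI scheme itself.

```python
def ddpcm_encode(z):
    """Return (first value, first difference, list of second differences)."""
    d1 = [b - a for a, b in zip(z, z[1:])]      # first differences
    d2 = [b - a for a, b in zip(d1, d1[1:])]    # second differences
    return z[0], d1[0], d2


def ddpcm_decode(z0, slope, d2):
    """Rebuild the Z values by integrating the second differences twice."""
    out = [z0, z0 + slope]
    for d in d2:
        slope += d                  # recover the next first difference
        out.append(out[-1] + slope)
    return out


# Hypothetical input: 8 Z values sampled along a planar surface.
plane = [100 + 7 * i for i in range(8)]
z0, slope, d2 = ddpcm_encode(plane)
```

The entropy encoder stage is left out; only the residuals it would consume are produced here.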

  7. 2D Z Compression [Diagram: 64 pixels, 2D DDPCM, Entropy Encoder, Packer.]

  8. Z Compression • ATI patent application 20030038803. • Two reference values MAX and MIN. • Offset values. • Windows. • Other method I don’t understand yet … • S3 patent 6,411,295. • Similar approach. • Others.

  9. Z Compression • Method 1: • MIN and MAX per cache line/block. • 1 bit flag per pixel/Z value telling which reference value to use. • The offsets from the MIN or MAX reference value are stored in the compressed output. • The offsets must be inside a window of T values (log2 T = bits per offset) from MIN and MAX.
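
Method 1 can be sketched roughly as follows, assuming integer Z values and a block that is rejected (left uncompressed) as soon as one value falls outside both windows. The function names and the `None` return for an incompressible block are assumptions for illustration, not details from the patent.

```python
def compress_minmax(block, t):
    """Return (zmin, zmax, flags, offsets), or None if the block does not fit."""
    zmin, zmax = min(block), max(block)
    flags, offsets = [], []
    for z in block:
        if z - zmin < t:            # inside [zmin, zmin + t - 1]
            flags.append(0)
            offsets.append(z - zmin)
        elif zmax - z < t:          # inside [zmax - t + 1, zmax]
            flags.append(1)
            offsets.append(zmax - z)
        else:
            return None             # outside both windows: not compressible
    return zmin, zmax, flags, offsets


def decompress_minmax(zmin, zmax, flags, offsets):
    """Rebuild each Z value from its 1-bit flag and its offset."""
    return [zmin + o if f == 0 else zmax - o for f, o in zip(flags, offsets)]


# Hypothetical block with two depth clusters and T = 4 (2 bits per offset).
block = [1000, 1003, 1001, 5000, 4998, 4999, 1002, 5000]
packed = compress_minmax(block, 4)
```

With T = 4 each value costs 1 flag bit plus 2 offset bits, on top of the two reference values stored once per block.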

  10. [Diagram: Z axis from Z = 0 to Z = 1 with Zmin and Zmax marked; window limits z = Zmin + T - 1 and z = Zmax - T + 1.]

  11. [Diagram: labels MAX, MIN.]

  12. Z Compression • Method 2: • Z values are divided into upper and lower bits. • Keep UMAX and UMIN. • Calculate A = UMIN - 1, B = UMAX + 1. • 2-bit flag per pixel/Z value references the upper bits from {UMAX, UMIN, A, B}. • Lower bits per pixel/Z value are stored in the compressed output.
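
A minimal sketch of the flag mechanism in Method 2 follows. The slide does not say how UMIN and UMAX are chosen, so here they are simply passed in by the caller; that, the function names, the parameter `a` (number of lower bits) and the `None` return for an incompressible block are all assumptions for illustration.

```python
def compress_split(block, a, umin, umax):
    """Return (flags, lows), or None if some upper part matches no reference."""
    refs = [umax, umin, umin - 1, umax + 1]     # {UMAX, UMIN, A, B}
    mask = (1 << a) - 1
    flags, lows = [], []
    for z in block:
        upper = z >> a
        if upper not in refs:
            return None                         # not compressible this way
        flags.append(refs.index(upper))         # 2-bit flag
        lows.append(z & mask)                   # lower bits stored verbatim
    return flags, lows


def decompress_split(flags, lows, a, umin, umax):
    """Rebuild each Z value from its 2-bit flag and its lower bits."""
    refs = [umax, umin, umin - 1, umax + 1]
    return [(refs[f] << a) | lo for f, lo in zip(flags, lows)]


# Hypothetical block: a = 8 lower bits, caller-chosen UMIN = 5 and UMAX = 9.
a, umin, umax = 8, 5, 9
block = [(4 << a) | 17, (5 << a) | 200, (9 << a) | 0, (10 << a) | 255]
packed = compress_split(block, a, umin, umax)
```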

  13. [Diagram: Z axis from Z = 0 to Z = 1 with Zmin, Zmax, A, B, Umin << a and Umax << a marked.]

  14. Umin Umin

  15. Z Compression • Reference values in the compressed output. • Compression flags on die. • Useful for fast clear too.

  16. Z Cache • Normal cache? • Or ‘fetch’ cache? • Normal cache that supports a large number of active misses (miss on miss, miss on hit). • Or prefetching?

  17. Z Cache • Fetch vs Prefetch. • Fetch needs additional state (bits) per cache line. • Fetch needs additional port to the cache tag file. • Fetch implies a large queue or stalls somewhere. • Prefetch requires a predictor. • Prefetch may request data that won’t be used (failed predictions).

  18. Z Cache • Prefetching. • Very easy to predict next data inside a triangle (large). • Quite common (middle-small triangles). • Easy to predict next data inside a tristrip or triangle list batch. • Very common. • Hard to predict next data between batches (or meshes)? • But will happen rarely.

  19. Z Cache • “Fetch cache” • In fact prefetching. • Texture Prefetching Architecture. • Akeley course. • Igehy, Eldridge, Proudfoot, Prefetching in a texture cache architecture. • Not read yet. • Slightly different concept: • Our fetch cache accesses the tag file twice. • But in simulation it is the same, as we are not charging for tag file accesses!! • Change the mechanism so that fetch returns a pointer to the cache line.
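
The last bullet above (fetch returns a pointer to the cache line) could look roughly like the sketch below: `fetch` does the only tag lookup and reserves the line, and the later `read` uses the returned line index directly. The direct-mapped organisation, the `FetchCache` class, the `reserved` counters and the `tag_lookups` statistic are assumptions for illustration; memory traffic is not modelled and `read` returns the tag as a stand-in for the line data.

```python
class FetchCache:
    def __init__(self, num_lines, line_size):
        self.line_size = line_size
        self.tags = [None] * num_lines
        self.reserved = [0] * num_lines     # extra per-line state needed by fetch
        self.tag_lookups = 0

    def fetch(self, address):
        """Single tag lookup; returns a line index, or None if the line is busy."""
        self.tag_lookups += 1
        tag = address // self.line_size
        line = tag % len(self.tags)
        if self.tags[line] != tag:
            if self.reserved[line]:
                return None                 # would evict a line still in use: stall
            self.tags[line] = tag           # miss: a memory request would be issued
        self.reserved[line] += 1
        return line

    def read(self, line):
        """Use the pointer returned by fetch(); no second tag lookup."""
        self.reserved[line] -= 1
        return self.tags[line]


cache = FetchCache(num_lines=4, line_size=64)
first = cache.fetch(128)        # maps to line 2 and reserves it
blocked = cache.fetch(384)      # also maps to line 2, which is still reserved
```

The reservation counter is the kind of additional per-line state that slide 17 attributes to the fetch approach.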

  20. [Diagram: texture prefetching architecture, labels as extracted: Rasterizer Texture Memory Request FIFO FIFO Cache Tags Reorder Buffer Stall Cache Data Texture Filter Texture Apply]

  21. Stencil • Stencil and Z share a 32 bit word per pixel: • 8/24. • 0/32. • 2x16 (Z only!!).
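
For the 8/24 layout, packing and unpacking the shared 32-bit word might look like the sketch below. Placing the stencil in the top 8 bits is an assumption; the slide only gives the bit split, not the field order, and the function names are hypothetical.

```python
def pack_z24s8(z, stencil):
    """Pack a 24-bit Z value and an 8-bit stencil value into one 32-bit word."""
    assert 0 <= z < (1 << 24) and 0 <= stencil < (1 << 8)
    return (stencil << 24) | z


def unpack_z24s8(word):
    """Return (z, stencil) from a packed 32-bit word."""
    return word & 0xFFFFFF, (word >> 24) & 0xFF


word = pack_z24s8(0x123456, 0xAB)
```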

  22. Stencil • Stencil compression: • If stencil is not active and is cleared: • Remove stencil field from compressed data. • If stencil is active or not cleared: • Compress stencil? • Independent of Z compression. • Needs more compression flag bits. • What is the average stencil value? Or log2 of the value? • How much can be saved? 8b to 1b, 2b, 4b? Is it worth it?

  23. HZ Box • Hierarchical Z buffer. • Number of levels? • Size? • On die? • Includes: • Memory for storing the different levels. • Update mechanism. • Process requests and updates.

  24. HZ Box • ATI model (from patents XXX, and XXX). • 2 levels. • 1st level is from original 8x8 blocks (z cache line). • 2nd level is 2x2 (?) values from level 1. • Update mechanism: • Z Max (or Z Min) from the Z encoder (compressor) for an 8x8 block (cache line). • Combining cache for level 2 (?). • Write and update on eviction from combining cache (?).
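
A two-level structure of this kind might be sketched as below, assuming level 1 keeps one Z Max per 8x8 block and level 2 keeps the maximum over each 2x2 group of level 1 entries. The `HZBuffer` class is hypothetical, the 2x2 grouping follows the slide's own "(?)", and the combining cache is not modelled: level 2 is simply recomputed on every update.

```python
class HZBuffer:
    def __init__(self, blocks_x, blocks_y, far=1.0):
        # Level 1: one Z Max per 8x8 block, initialised to the far plane.
        self.l1 = [[far] * blocks_x for _ in range(blocks_y)]
        # Level 2: one Z Max per 2x2 group of level 1 entries.
        self.l2 = [[far] * ((blocks_x + 1) // 2)
                   for _ in range((blocks_y + 1) // 2)]

    def update(self, bx, by, block_zmax):
        """Store the Z Max reported for block (bx, by) and refresh level 2."""
        self.l1[by][bx] = block_zmax
        x0, y0 = bx - bx % 2, by - by % 2   # top-left block of the 2x2 group
        group = [self.l1[y][x]
                 for y in (y0, y0 + 1) if y < len(self.l1)
                 for x in (x0, x0 + 1) if x < len(self.l1[0])]
        self.l2[by // 2][bx // 2] = max(group)


hz = HZBuffer(blocks_x=4, blocks_y=4)
hz.update(0, 0, 0.5)
after_one = hz.l2[0][0]         # the other three blocks are still at the far plane
for bx, by, z in [(1, 0, 0.3), (0, 1, 0.4), (1, 1, 0.2)]:
    hz.update(bx, by, z)
```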

  25. HZ Test • Compares the incoming Z value from a graphic object to the reference Z value stored in one or more of the Hierarchical Z levels. • What can be tested: • Triangle Z (or 3 vertex Z). • Cull a whole triangle. • Blocks of fragments: • Good for recursive descent or tiled!!. • Large blocks to level 2. • 8x8 (or less) blocks to level 1. • Stamps (2x2) or fragments: • Against level 1 (slow access? fast update?). • Against level 2 (fast access? slow update?).
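
The comparison itself can be sketched as below, assuming a "less than" depth test (smaller Z is nearer) and a stored reference that is the Z Max of the covered region. Under those assumptions an object can be culled when its nearest Z is not in front of the stored Z Max. The function names are hypothetical.

```python
def hz_cull(object_zmin, stored_zmax):
    """True if nothing in the object can pass a 'less than' depth test."""
    return object_zmin >= stored_zmax


def triangle_culled(vertex_z, stored_zmax):
    """Whole-triangle test using the three vertex Z values."""
    return hz_cull(min(vertex_z), stored_zmax)


hidden = triangle_culled((0.8, 0.9, 0.7), stored_zmax=0.5)
visible = triangle_culled((0.4, 0.9, 0.7), stored_zmax=0.5)
```

The same comparison would apply to a block of fragments or a 2x2 stamp, using the minimum Z of that block or stamp in place of the triangle's.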

  26. [Diagram: labels HZ L2, HZ L1.]

  27. Traces • I stalled Carlos's work, so this is delayed until next week.

  28. Web • I’m writing my web page. • GPU3D page? • Public/private.
