
Understanding Floating Point Representation of Real Numbers in Numerical Analysis

This introduction to numerical analysis focuses on how computers represent and operate with real numbers, specifically through the IEEE 754 floating point standard. We explore the concepts of rounding errors, machine representation of numbers, and the various formats of floating point representation, including normalized scientific notation. Key topics include the definition of machine epsilon, the significance of mantissa and exponent, and the details of rounding rules for double precision. This foundational knowledge is crucial for understanding numerical computations and their potential errors in computer science.

edric




Presentation Transcript


  1. Introduction to Numerical Analysis I: Floating Point Representation of Real Numbers (MATH/CMPSC 455)

  2. Floating Point Representation of Real Numbers • This is about how computers represent and operate on real numbers. • It helps us understand rounding errors. • We consider the IEEE 754 Floating Point Standard. • Representing binary numbers in a computer: • format • machine representation

  3. Floating Point Format • Formats for the decimal system: • Standard notation • Scientific notation • Normalized scientific notation

  4. Floating Point Format • Format for a floating point number (binary representation). Normalized IEEE floating point standard: $\pm 1.bbb\ldots b \times 2^{p}$ • sign (+ or -) • mantissa $1.bbb\ldots b$, which contains the significant bits ($N$ b's) • exponent ($p$, an $M$-bit binary number)
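The normalized form on this slide can be recovered for any Python float, which is an IEEE double on common platforms. The sketch below is illustrative only: the helper name `normalized_form` is hypothetical, and it assumes a finite, nonzero input.

```python
import math

def normalized_form(x: float):
    """Return (sign, mantissa, p) with x = sign * mantissa * 2**p and 1 <= mantissa < 2.

    Assumes x is finite and nonzero (zero has no normalized form).
    """
    sign = "-" if x < 0 else "+"
    m, e = math.frexp(abs(x))      # abs(x) == m * 2**e with 0.5 <= m < 1
    return sign, 2.0 * m, e - 1    # rescale so the leading bit is 1

print(normalized_form(9.4))   # mantissa in [1, 2), exponent p = 3
print(normalized_form(-0.5))  # mantissa 1.0, exponent p = -1
```

Scaling by a power of two is exact in binary floating point, so the mantissa returned here carries the same significant bits as the input.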

  5. Definition (machine epsilon, $\epsilon_{\text{mach}}$): The distance between 1 and the smallest floating point number greater than 1. It gives a bound on the relative error due to rounding. For the IEEE double precision floating point standard: $\epsilon_{\text{mach}} = 2^{-52}$
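Machine epsilon can be found experimentally by halving a candidate gap until adding half of it to 1 no longer changes the result. This is a minimal sketch, assuming Python floats are IEEE doubles; the function name `machine_epsilon` is hypothetical.

```python
import sys

def machine_epsilon() -> float:
    """Gap between 1.0 and the next larger double, found by repeated halving."""
    eps = 1.0
    # Keep halving while 1 + eps/2 is still distinguishable from 1.
    while 1.0 + eps / 2.0 > 1.0:
        eps /= 2.0
    return eps

eps = machine_epsilon()
print(eps == 2.0 ** -52)               # True for IEEE double precision
print(eps == sys.float_info.epsilon)   # agrees with the value Python reports
```

The loop stops at $2^{-52}$ because $1 + 2^{-53}$ lies exactly halfway between two doubles and rounds back to 1.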

  6. Rounding • How do we fit a given binary number into a finite number of bits? • IEEE Rounding to Nearest Rule: For double precision, if the 53rd bit to the right of the binary point is 0, round down (truncate after the 52nd bit). If the 53rd bit is 1, round up (add 1 to bit 52), unless all known bits to the right of that 1 are 0's, in which case 1 is added to bit 52 if and only if bit 52 is 1.
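The rounding rule can be written out with exact rational arithmetic and compared against what the hardware does. This is a sketch under stated assumptions: the helper `round_to_nearest` is hypothetical, and it only handles a value already normalized to the range [1, 2), so that "bit 52" is the last of 52 fractional bits.

```python
from fractions import Fraction

def round_to_nearest(x: Fraction, bits: int = 52) -> Fraction:
    """Round x in [1, 2) to `bits` fractional binary digits, ties to even."""
    scaled = x * 2 ** bits
    kept = scaled.numerator // scaled.denominator   # truncate after bit 52
    remainder = scaled - kept                       # what bits 53, 54, ... contribute
    half = Fraction(1, 2)
    if remainder > half:                            # bit 53 is 1 and a later bit is 1
        kept += 1
    elif remainder == half and kept % 2 == 1:       # exact tie: round up only if bit 52 is 1
        kept += 1
    # remainder < half means bit 53 is 0: keep the truncated value
    return Fraction(kept, 2 ** bits)

x = Fraction(47, 40)                                # 1.175, i.e. 9.4 / 2**3
print(round_to_nearest(x) == Fraction(47 / 40))     # matches the double Python stores
print(round_to_nearest(1 + Fraction(1, 2 ** 53)))   # tie with bit 52 = 0 rounds down to 1
```

The first check relies on Python's integer division `47 / 40` producing the correctly rounded double.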

  7. Rounding • Notation: Denote the IEEE double precision floating point number associated with $x$, using the Rounding to Nearest Rule, by $fl(x)$. • Definition (absolute error & relative error): Let $x_c$ be a computed version of the exact quantity $x$. Then the absolute error is $|x_c - x|$ and the relative error is $|x_c - x| / |x|$ (for $x \neq 0$).
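Absolute and relative error can be computed exactly when the true value is a known rational number. The sketch below assumes the usual definitions (absolute error as the size of the difference, relative error as that difference divided by the size of the exact value); the function and argument names are hypothetical.

```python
from fractions import Fraction

def absolute_error(computed, exact) -> Fraction:
    """|computed - exact|, evaluated without any further rounding."""
    return abs(Fraction(computed) - Fraction(exact))

def relative_error(computed, exact) -> Fraction:
    """|computed - exact| / |exact|; the exact value must be nonzero."""
    return absolute_error(computed, exact) / abs(Fraction(exact))

exact = Fraction(1, 10)   # the real number 0.1
stored = 0.1              # the double nearest to 0.1
print(absolute_error(stored, exact))
print(relative_error(stored, exact) == Fraction(1, 2 ** 54))
```

`Fraction(0.1)` converts the stored double to its exact rational value, so the errors printed here are the true rounding errors, not approximations of them.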

  8. Rounding • Example: • Example: • Relative rounding error: $\frac{|fl(x) - x|}{|x|} \le \frac{1}{2}\epsilon_{\text{mach}}$
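Under round-to-nearest, the relative rounding error of a double is at most half of machine epsilon. The sketch below checks this for a few sample values chosen purely for illustration; the helper name `relative_rounding_error` is hypothetical.

```python
import sys
from fractions import Fraction

def relative_rounding_error(x: Fraction) -> Fraction:
    """Exact value of |fl(x) - x| / |x| for a nonzero rational x."""
    fl_x = Fraction(float(x))   # float(x) rounds x to the nearest double
    return abs(fl_x - x) / abs(x)

half_eps = Fraction(sys.float_info.epsilon) / 2   # 2**-53 for IEEE doubles
samples = [Fraction(1, 3), Fraction(1, 10), Fraction(47, 5), Fraction(22, 7)]
for x in samples:
    err = relative_rounding_error(x)
    print(x, float(err), err <= half_eps)
```

Values that are exactly representable, such as 1/2, give an error of zero.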

  9. Machine Representation • Sign: 1 bit, 0 for positive, 1 for negative; • Mantissa: 52 bits, … • Exponent: 11 bits, so $0 \le e \le 2^{11} - 1 = 2047$ and $p = e - 1023$ • $e = 1$ to $2046$ → $p = -1022$ to $1023$ • 2 values of $e$ are reserved, for infinity / NaN and for 0 • $e = 2047$ → infinity if the mantissa is all zeros, NaN otherwise; • $e = 0$ → small numbers, including 0
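The three fields of a double can be inspected directly by reinterpreting its 64 bits as an integer. This is a minimal sketch using Python's standard `struct` module; the helper name `fields` is hypothetical.

```python
import struct

def fields(x: float):
    """Split a double into (sign bit, 11-bit exponent field e, 52-bit mantissa field)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))  # 64 raw bits as an unsigned int
    sign = bits >> 63
    e = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    return sign, e, mantissa

print(fields(1.0))            # (0, 1023, 0): p = e - 1023 = 0
print(fields(-2.0))           # (1, 1024, 0): negative, p = 1
print(fields(float("inf")))   # (0, 2047, 0): reserved exponent, mantissa all zeros
print(fields(0.0))            # (0, 0, 0): reserved exponent for zero and small numbers
```

For NaN the exponent field is also 2047, but the mantissa field is nonzero.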

  10. Addition and Rounding of Floating Point Numbers • Step 1: line up the two numbers (double precision) • Step 2: add them (higher precision) • Step 3: store the result as a floating point number (double precision)
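The three steps can be imitated in software. In the sketch below, exact rational arithmetic stands in for the higher-precision intermediate result; that substitution is an assumption made for illustration, not a description of how hardware works internally, and the helper name `add_and_round` is hypothetical.

```python
from fractions import Fraction

def add_and_round(a: float, b: float) -> float:
    """Add two doubles exactly, then round the sum to the nearest double."""
    exact_sum = Fraction(a) + Fraction(b)   # steps 1-2: align and add with no rounding
    return float(exact_sum)                 # step 3: store the result as a double

tiny = 2.0 ** -53
print(add_and_round(1.0, tiny) == 1.0)           # the sum is a tie and rounds back to 1
print(add_and_round(1.0, tiny) == 1.0 + tiny)    # agrees with hardware addition
print((1.0 + tiny) - 1.0)                        # 0.0, although tiny itself is not 0
```

The last line shows how a small addend can vanish entirely once the sum is stored in double precision.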

  11. Example : Example :
