Lesson 4 Reals

• Examples of real numbers: 123.456, 5, 2/3, π, √2
• Reals in daily life: height, weight, speed, distance, interest rate, …
• Reals include the integers. There are infinitely many reals between any two different reals, e.g., between 1.1 and 1.2.
• The set of floating-point numbers is a subset of the reals. A floating-point number consists of an integer part and a fractional part, e.g., 12.125, -0.625, 0.0, 33.0937.

• There are infinitely many floating-point numbers.
• Only a finite subset of the floating-point numbers is represented in computers. An arbitrary real is approximated by a nearby representable floating-point number, e.g.,
    0.6666666666666666666666 ≈ 0.6666666666666667
    1/3 ≈ 0.3333333333333333
    π ≈ 3.141592653589793
    √2 ≈ 1.4142135623730951

To compute the surface area and volume of a sphere:

    const double PI = 3.141592653589793;
    double radius, area, volume;

    cout << "Enter the radius of a sphere in cm: ";
    cin >> radius;
    area = 4.0 * PI * radius * radius;
    volume = 4.0 * PI * radius * radius * radius / 3.0;
    cout << "Radius is " << radius << " cm\n";
    cout << "Surface area is " << area << " cm^2\n";
    cout << "Volume is " << volume << " cm^3\n";

Sample run:

    Enter the radius of a sphere in cm: 2.0
    Radius is 2 cm
    Surface area is 50.2655 cm^2
    Volume is 33.5103 cm^3

• A floating-point number can be represented in scientific notation, where a value is written as fraction × 10^exponent with 1 ≤ abs(fraction) < 10. (The fraction is called the mantissa in some books.)

Fixed-point notation                       Scientific notation
120.0                                      1.2 × 10^2
0.0010004                                  1.0004 × 10^-3
0.0                                        0.0 × 10^0
12345600000                                1.23456 × 10^10
0.000000005                                5.0 × 10^-9
300000…000 (3 followed by 102 zeros)       3.0 × 10^102
0.000000…09 (200 zeros before the 9)       9.0 × 10^-201

Consider a decimal computer that uses 4 digits to represent the fraction and 2 digits for the exponent.

Scientific notation      In the decimal computer
1.2 × 10^2               +1200+02
-1.0004 × 10^-3          -1000-03     (Rounding error)
0.0 × 10^0               +0000+00
-1.23456 × 10^11         -1234+11     (Rounding error)
5.0 × 10^-10             +5000-10
-3.0 × 10^102            -INFINITY    (Overflow)
9.0 × 10^-200            +0000+00     (Underflow)
1.0 × 10^100             INFINITY     (Overflow)

• The double values in a computer are represented in scientific notation, except that the base is 2 instead of 10.
• Eight bytes (64 bits) are used to represent a double value: 1 bit for the sign of the value, 52 bits for the fraction, and 11 bits for the exponent (including the sign of the exponent).
• The total number of representable values is at most 2^64.
• The minimum and maximum of double values in a computer are defined as symbolic constants in the header <cfloat>: DBL_MIN, DBL_MAX. (DBL_MIN is the smallest positive normalized value, not the most negative value.)

A very special double “value” is the constant Infinity.
• An overflow occurs if the magnitude of the result of an operation is larger than the maximum. If the result is positive, it is set to +Infinity. A negative one is set to -Infinity.
• A loss of precision occurs if the magnitude of the result is somewhat less than the minimum: the result is still stored, but with fewer significant digits.
• An underflow occurs if the magnitude of the result is smaller than the minimum by a factor of about 10^16. In such a case, the result is set to 0.

Double overflow and underflow

    #include <cfloat>
    . . .
    cout << "DBL_MIN is " << DBL_MIN << endl
         << "DBL_MIN/10 is " << DBL_MIN/10 << endl
         << "DBL_MIN/1e15 is " << DBL_MIN/1e15 << endl
         << "DBL_MIN/1e16 is " << DBL_MIN/1e16 << endl;
    cout << "DBL_MAX is " << DBL_MAX << endl
         << "DBL_MAX*2 is " << DBL_MAX*2.0 << endl;

Output:

    DBL_MIN is 2.2250738585072014e-308
    DBL_MIN/10 is 2.2250738585072034e-309        (Loss of precision)
    DBL_MIN/1e15 is 2.4703282292062327e-323      (Loss of precision)
    DBL_MIN/1e16 is 0.0000000000000000e+000      (Underflow)
    DBL_MAX is 1.7976931348623157e+308
    DBL_MAX*2 is 1.#INF000000000000e+000         (Overflow)

• Examples of double constants:
    123.456 , 5. , 0.0 , 0. , .0 , +9. , .123456789999999999999 , 1.09e-3 , -1E10 , 0.1e5
• Examples of invalid double constants:
    123 , 0 , 1d3 , . , .e-2 , -e5 , 2.0e0.5
• Declaration of double variables
    double area;            //initial value undefined
    double x, y, z;         //initial value undefined
    double price = 100.5;   //price is initialized to 100.5

Advanced
• The syntax rules of double constants
    <sign> → + | - | ε
    <digit> → 0 | 1 | … | 9
    <digits> → <digit> | <digit> <digits>
    <fixed> → <digits> . | . <digits> | <digits> . <digits>
    <expn> → {e | E} <sign> <digits> | ε
    <double const> → <sign> <fixed> <expn> | <sign> <digits> <expn>
• ε denotes an empty string (nothing); | denotes “or”.
• In the second form of <double const> (no decimal point), the exponent must not be empty; otherwise the constant is an int constant such as 123.

double operations
• Unary operations
    +x      //Identity
    -x      //Negation
• Binary operations
    x + y   //Addition
    x - y   //Subtraction
    x * y   //Multiplication
    x / y   //Division, y ≠ 0
  Note that the same symbols are used to denote int and double operators. These operators are ____________.
• The precedence and associativity rules are the same as those of int.

• In a pure-mode operation, all the operands are of type double and the result is of type double, e.g., 2.0 * PI
• In a mixed-mode binary operation, e.g., 1 + 3.0, one operand is of type int and the other is of type double. The int value is first cast (converted) to the equivalent double value (automatically) before the double operation is carried out. The answer is of type double.
• Mixed-mode assignment: <double var> = <int expr>;
  When an int value is assigned to a double variable, the int value is cast automatically to the equivalent double value before the assignment. E.g.,
    double d;   //declare d as a double variable
    d = 13;     //convert 13 to 13.0 before the assignment

• No precision is lost when an int value is cast into a double value.
• When a double value is assigned to an int variable, the value is cast automatically to an int value equal to the integer portion of the double value. The fraction is lost. A warning is issued for the possible loss of precision. E.g.,
    int i;        //declare i as an int variable
    i = 13.999;   //i is equal to 13. A warning is issued.
    i = d;        //i = integer portion of d (truncation toward zero, which differs from floor(d) when d is negative). A warning is issued.
  The warning has the form
    <line #> <program name> [Warning] assignment to `int' from `double'

• (int) is the casting operator that converts a value of another type to an int value. E.g.,
    (int) 3.9     equals the integer value 3
    (int) -3.9    equals the integer value -3
  Note that there is no rounding; the fraction of the operand is merely discarded (truncation).
• The casting operator has higher priority than all the binary operators. E.g.,
    (int)3.5 / 0.5      equals ____
    (int)(3.5 / 0.5)    equals ____

• Cast a double value to an int value before assigning it to an int variable (highly recommended):
    <int var> = (int)(<double expr>);
  E.g.,
    i = (int) d;
    i = (int) (d / 13.3);
• Similarly, you may convert a value of another type into the equivalent double value using the casting operator (double). What are the results of the following?
    int i = 7;
    d = (double) i / 2;
    d = (double) (i / 2);

Let j be an int variable and d be a double. Give the value of j or d for each statement below. Give an X for a statement with syntax errors. Mark a statement with a ! if it may trigger a warning in compilation.
    j = "555";
    j = 5 / 2;
    j = 5 % 2;
    j = 5.0;
    j = (int) 3.4 / 1.1;
    j = (int) (3.4 / 1.1);
    d = 5 / 2;
    d = (double) 5 / 2;
    d = (int) (12.34567 * 100.0) / 100.;
    d = (int) (12.34567 * 1000.0 + 0.5) / 1000.;

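The last example shows a standard trick to perform rounding.
• The following rounds to the nearest integer (rounding at the digit right after the decimal point):
    (int) (123.4567 + 0.5)
• The following rounds to one digit after the decimal point (rounding at the second digit):
    (int) (123.4567 * 10.0 + 0.5) / 10.0;
• The following rounds to two digits after the decimal point (rounding at the third digit):
    (int) (123.4567 * 100.0 + 0.5) / 100.0;
• The trick as written assumes a non-negative value, because the cast truncates toward zero.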

cmath library
• This library provides functions for computing many common mathematical functions, e.g., sin(x), sqrt(x), ...
• To look up the description of a function in cmath:
    • Start from http://www.cplusplus.com/ref/
    • Click cmath or math.h
    • Click the function you want to look up

Some useful functions in the library cmath
    double fabs(double x)     //abs(x) is for int values!!!
        Returns the absolute value of a double value.
    double cos(double x)      //sin(x), tan(x), …
        Returns the trigonometric cosine of an angle.
    double atan(double x)     //asin(x), acos(x)
        Returns the arc tangent of a value, as an angle in the range of -pi/2 through pi/2.
    double floor(double x)    //Round down
        Returns the largest integer that is less than or equal to x.
        floor of 2.3 is 2.0; floor of 3.8 is 3.0
        floor of -2.3 is -3.0; floor of -3.8 is -4.0

More useful methods in cmath

Assuming that n is of type int and x, y are of type double, write a C++ statement for each of the following. Make use of the functions pow(x, y) and sqrt(x) in cmath.

Many new versions of math.h contain the definitions of e and π:
    const double M_E = 2.718281828459045;
        The double value that is closest to e, the base of the natural logarithms.
    const double M_PI = 3.141592653589793;
        The double value that is closest to pi, the ratio of the circumference of a circle to its diameter.

    cout << "M_E = " << M_E << endl;
    cout << "M_PI = " << M_PI << endl;
    cout << "4xatan(1) = " << 4.*atan(1.) << endl;
    cout << "sqrt(2) = " << sqrt(2.) << endl;
    cout << "log(10) = " << log(10.) << endl;
    cout << "2^(-4) = " << pow(2., -4.) << endl;
    cout << "exp(log(10)) = " << exp(log(10.)) << endl;
    cout << "tan(M_PI/4) = " << tan(M_PI/4.) << endl;

Output:

    M_E = 2.7182818284590451e+000
    M_PI = 3.1415926535897931e+000
    4xatan(1) = 3.1415926535897931e+000
    sqrt(2) = 1.4142135623730951e+000
    log(10) = 2.3025850929940459e+000
    2^(-4) = 6.2500000000000000e-002
    exp(log(10)) = 1.0000000000000002e+001     (Why not 10?)
    tan(M_PI/4) = 9.9999999999999989e-001      (Why not 1?)

Advanced
• When double overflow occurs, the result is set to Infinity or -Infinity.
• The result of 1.0/0.0 is undefined in mathematics. On systems that follow IEEE Standard 754, it is set to Infinity.
• The result of 0.0/0.0 is undefined in mathematics. On systems that follow IEEE Standard 754, it is set to NaN. NaN stands for “Not a Number”.
• A unique feature of NaN is that it is NOT equal to itself. If a variable is not equal to itself, we can conclude that its value is NaN.
• Reference: IEEE Standard 754 on floating-point number representation.

Advanced
• A program that shows overflow and the checking of Infinity and -Infinity.

    #include <cmath>
    #include <limits>    //needed for numeric_limits
    . . .
    double d1, d2;
    d1 = exp( 1000.);
    cout << "exp( 1000.) is " << d1;
    if (d1 == numeric_limits<double>::infinity())
        cout << " (Infinity)" << endl;
    d2 = -1./0.;
    cout << "-1./0. is " << d2;
    if (d2 == -numeric_limits<double>::infinity())
        cout << " (-Infinity)" << endl;

Output:

    exp( 1000.) is 1.#INF (Infinity)
    -1./0. is -1.#INF (-Infinity)

Advanced
• A program that shows results of NaN in computation and the checking of NaN.

    d3 = log( -1.0);
    cout << "log( -1.0) is " << d3;
    if (d3 != d3) cout << " (NaN)" << endl;
    d4 = sqrt( -1.0);
    cout << "sqrt( -1.0) is " << d4;
    if (d4 != d4) cout << " (NaN)" << endl;
    d5 = 0.0 / 0.0;
    cout << "0.0/0.0 is " << d5;
    if (d5 != d5) cout << " (NaN)" << endl;

Output:

    log( -1.0) is -1.#IND (NaN)
    sqrt( -1.0) is -1.#IND (NaN)
    0.0/0.0 is 1.#QNAN (NaN)

Specify a data type (int, double, or string) for each of the constants below. Put down X for any invalid constants.
    (i) INT_MIN
    (ii) TicTacToe
    (iii) 4.
    (iv) INFINITY
    (v) DBL_MAX
    (vi) 1E-.5
    (vii) NaN
    (viii) 1.5E-3
    (ix) "12345.678"

Round-off errors
The errors that are introduced by the inexactness of the floating-point representations of reals. These errors may accumulate and be magnified during extensive computation.
• Inexactness example (in a machine holding 4 decimal digits): 1/3 ≈ 0.3333 (Error is 0.0000333…)
• Accumulation example: 1/3 + 1/3 ≈ 0.6666 (Error is 0.0000666…)
• Magnifying example: 1000 * (1/3 - 0.333) = 1000 * (0.3333 - 0.333) = 0.3 (Error is 0.0333…)

The following shows a substantial error made in a calculator that holds 4 decimal digits.

• Give an example of syntax errors in a program.
• Give an example of run-time errors during the execution of a program.
• Will a computer issue an error message for a logic error in a program?
• The statement below prints False. Explain.
    if ( sqrt(2.0) * sqrt(2.0) == 2.0 ) cout << "True";
    else cout << "False";

Simple input of double values
• The operation is similar to the input of an int value. To input a value from the standard input device for the double variable d, write
    cin >> d;
  When this statement is executed, the program is suspended until a user keys in a sequence of characters and eventually presses the <Enter> key. The program then looks for a string that represents a valid double value in the input sequence. When it finds the string, it converts the string into a double value and assigns it to d. Leading blanks and newline characters are skipped. If extra characters are typed, the next input operation starts from the character immediately after the string.

    double d;
    do {
        cout << "Enter a double value for d: ";
        cin >> d;
        cout << "d = " << d << endl;
    } while (d != 0.);

Sample run:

    Enter a double value for d: 123
    d = 123
    Enter a double value for d: 123.456
    d = 123.456
    Enter a double value for d: 5.
    d = 5
    Enter a double value for d: .5
    d = 0.5
    Enter a double value for d: 123e-2
    d = 1.23
    Enter a double value for d: 12e4
    d = 120000
    Enter a double value for d: -123e4
    d = -1.23e+006
    Enter a double value for d: 0.0
    d = 0

Simple output of double values
• To print a double value in the dialogue window, write
    cout << <double expr>;
  The computer will evaluate the expression, then print the result in the dialogue window.

    Enter a double value for d: 123
    d = 123
    Enter a double value for d: 123.456
    d = 123.456
    Enter a double value for d: 123e-2
    d = 1.23
    Enter a double value for d: 12e4
    d = 120000
    Enter a double value for d: -123456
    d = -123456
    Enter a double value for d: -1234567
    d = -1.23457e+006

    Enter a double value for d: -1e-4
    d = -0.0001
    Enter a double value for d: -1e-5
    d = -1e-005
    Enter a double value for d: 0.123456789
    d = 0.123457
    Enter a double value for d: -12345678900
    d = -1.23457e+010

• The ordinary form, e.g., 123.456, is called fixed-point notation. The other one is called scientific notation, e.g., 1.23456e2. The default width, precision and notation used are machine dependent. (In our system, the default format, fixed-point or scientific, is determined by the magnitude of an output value.)

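• To specify an output notation for double values, write
    cout.setf(ios::fixed);
  or
    cout.setf(ios::scientific);
• To specify the precision of the output values, write
    cout.precision( <int value> );
  In the default notation, the precision is the number of significant digits. In fixed or scientific notation, it is the number of digits after the decimal point.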

• To insist on showing the decimal point and the trailing zeros, write
    cout.setf(ios::showpoint);

Example with scientific notation:

    cout.setf(ios::scientific);
    cout.precision(16);
    cout.setf(ios::showpoint);

    Enter a double value for d: 1234
    d = 1.2340000000000000e+003
    Enter a double value for d: 0.12345678
    d = 1.2345678000000000e-001
    Enter a double value for d: -1234.5678
    d = -1.2345678000000000e+003
    Enter a double value for d: 1234e100
    d = 1.2340000000000000e+103
    Enter a double value for d: 1234567890123456789
    d = 1.2345678901234568e+018
    Enter a double value for d: -0.00000000001234567890123456789
    d = -1.2345678901234568e-011
    Enter a double value for d: 0e0
    d = 0.0000000000000000e+000

Example with fixed-point notation:

    cout.setf(ios::fixed);
    cout.precision(4);
    cout.setf(ios::showpoint);

    Enter a double value for d: 1234
    d = 1234.0000
    Enter a double value for d: 0.12345678
    d = 0.1235
    Enter a double value for d: -1234.5678
    d = -1234.5678
    Enter a double value for d: 1234e20
    d = 123400000000000000000000.0000
    Enter a double value for d: 1234567890123456789
    d = 1234567890123456800.0000
    Enter a double value for d: -0.00000000001234567890123456789
    d = -0.0000
    Enter a double value for d: 0e0
    d = 0.0000

• Another data type for representing reals in C++ is float.
• A float value occupies 32 bits (4 bytes). These values are described as single precision in some books.
• float data are more storage efficient and probably faster in computation, but the results are far less accurate due to round-off errors.

Reading Assignment
• Chapter 2, p. 60-72, 93-108
• Chapter 5, p. 247-252
