IT 252 Computer Organization and Architecture: Number Representation. Chia-Chi Teng
Where Are We Now? CS142 & 124 IT344
Review (do you remember from 124/104?) • 8 bit signed 2’s complement binary # -> decimal # • 0111 1111 = ? • 1000 0000 = ? • 1111 1111 = ? • Decimal # -> 8 bit signed 2’s complement binary # • 32 = ? • -2 = ? • 200 = ?
Decimal Numbers: Base 10 • Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 • Example: 3271 = (3 × 10^3) + (2 × 10^2) + (7 × 10^1) + (1 × 10^0)
Numbers: positional notation • Number Base B ⇒ B symbols per digit: • Base 10 (Decimal): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 • Base 2 (Binary): 0, 1 • Number representation: • d31 d30 ... d1 d0 is a 32-digit number • value = d31 × B^31 + d30 × B^30 + ... + d1 × B^1 + d0 × B^0 • Binary: 0, 1 (in binary, digits are called “bits”; binary numbers are often written 0b…) • 0b11010 = 1×2^4 + 1×2^3 + 0×2^2 + 1×2^1 + 0×2^0 = 16 + 8 + 2 = 26 • Here a 5-digit binary number turns into a 2-digit decimal number • Can we find a base that converts to binary easily?
Hexadecimal Numbers: Base 16 • Hexadecimal: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F • Normal digits + 6 more from the alphabet • In C, written as 0x… (e.g., 0xFAB5) • Conversion: Binary ⇔ Hex • 1 hex digit represents 16 decimal values • 4 binary digits represent 16 decimal values • ⇒ 1 hex digit replaces 4 binary digits • One hex digit is a “nibble”. Two is a “byte” • 2 bits is a “half-nibble”. Shave and a haircut… • Example: • 1010 1100 0011 (binary) = 0x_____ ?
Decimal vs. Hexadecimal vs. Binary
  Decimal  Hex  Binary
  00       0    0000
  01       1    0001
  02       2    0010
  03       3    0011
  04       4    0100
  05       5    0101
  06       6    0110
  07       7    0111
  08       8    1000
  09       9    1001
  10       A    1010
  11       B    1011
  12       C    1100
  13       D    1101
  14       E    1110
  15       F    1111
Examples: • 1010 1100 0011 (binary) = 0xAC3 • 10111 (binary) = 0001 0111 (binary) = 0x17 • 0x3F9 = 11 1111 1001 (binary) • How do we convert between hex and decimal? MEMORIZE!
Precision and Accuracy • Don’t confuse these two terms! • Precision is a count of the number of bits in a computer word used to represent a value. • Accuracy is a measure of the difference between the actual value of a number and its computer representation. • High precision permits high accuracy but doesn’t guarantee it. It is possible to have high precision but low accuracy. • Example: float pi = 3.14; • pi will be represented using all bits of the significand (highly precise), but is only an approximation (not accurate).
What to do with representations of numbers? • Just what we do with numbers! • Add them • Subtract them • Multiply them • Divide them • Compare them • Example: 10 + 7 = 17
      1 1          (carries)
      1 0 1 0
  +   0 1 1 1
  -----------
    1 0 0 0 1
• …so simple to add in binary that we can build circuits to do it! • Subtraction just as you would in decimal • Comparison: How do you tell if X > Y ?
Visualizing (Mathematical) Integer Addition • Integer Addition • 4-bit integers u, v • Compute true sum Add4(u, v) • Values increase linearly with u and v • Forms planar surface [Figure: surface plot of Add4(u, v) over axes u and v]
Visualizing Unsigned Addition • Wraps Around • If true sum ≥ 2^w • At most once [Figure: surface plot of UAdd4(u, v) over axes u and v; the true sum ranges from 0 to 2^(w+1), and sums of 2^w or more overflow and wrap to the modular sum]
BIG IDEA: Bits can represent anything!! • Characters? • 26 letters ⇒ 5 bits (2^5 = 32) • upper/lower case + punctuation ⇒ 7 bits (in 8) (“ASCII”) • standard code to cover all the world’s languages ⇒ 8, 16, 32 bits (“Unicode”) www.unicode.com • Logical values? • 0 ⇒ False, 1 ⇒ True • Colors? Ex: Red (00), Green (01), Blue (11) • Locations / addresses? Commands? • MEMORIZE: N bits ⇔ at most 2^N things
How to Represent Negative Numbers? • So far, unsigned numbers • Obvious solution: define leftmost bit to be sign! • 0 ⇒ +, 1 ⇒ - • Rest of bits can be numerical value of number • Representation called sign and magnitude • MIPS uses 32-bit integers. +1_ten would be: 0000 0000 0000 0000 0000 0000 0000 0001 • And -1_ten in sign and magnitude would be: 1000 0000 0000 0000 0000 0000 0000 0001
Shortcomings of sign and magnitude? • Arithmetic circuit complicated • Special steps depending on whether signs are the same or not • Also, two zeros • 0x00000000 = +0_ten • 0x80000000 = -0_ten • What would two 0s mean for programming? • Therefore sign and magnitude abandoned
Another try: complement the bits • Example: 7_10 = 00111_2, -7_10 = 11000_2 • Called One’s Complement • Note: positive numbers have leading 0s, negative numbers have leading 1s. • What is -00000? Answer: 11111 • How many positive numbers in N bits? • How many negative numbers? [Figure: number line of the 5-bit patterns 00000, 00001, ..., 01111 and 10000, ..., 11110, 11111]
Standard Negative Number Representation • What is the result for unsigned numbers if we try to subtract a large number from a small one? • Would try to borrow from string of leading 0s, so result would have a string of leading 1s • 3 - 4 ⇒ 00…0011 - 00…0100 = 11…1111 • With no obvious better alternative, pick representation that made the hardware simple • As with sign and magnitude, leading 0s ⇒ positive, leading 1s ⇒ negative • 000000...xxx is ≥ 0, 111111...xxx is < 0 • except 1…1111 is -1, and there is no -0 (unlike sign and magnitude and one’s complement) • This representation is Two’s Complement
2’s Complement Number “line”: N = 5 • 2^(N-1) non-negatives • 2^(N-1) negatives • one zero • how many positives? [Figure: circular number line pairing 5-bit patterns with values: 00000 = 0, 00001 = 1, 00010 = 2, ..., 01111 = 15, 10000 = -16, 10001 = -15, ..., 11100 = -4, 11101 = -3, 11110 = -2, 11111 = -1]
Numeric Ranges • Unsigned Values • UMin = 0 (000…0) • UMax = 2^w - 1 (111…1) • Two’s Complement Values • TMin = -2^(w-1) (100…0) • TMax = 2^(w-1) - 1 (011…1) • Other Values • Minus 1 (111…1) • Values for W = 16
Values for Different Word Sizes • Observations • |TMin | = TMax + 1 • Asymmetric range • UMax = 2 * TMax + 1 • C Programming • #include <limits.h> • Declares constants, e.g., • ULONG_MAX • LONG_MAX • LONG_MIN • Values platform specific
Unsigned & Signed Numeric Values
  X     B2U(X)  B2T(X)
  0000  0       0
  0001  1       1
  0010  2       2
  0011  3       3
  0100  4       4
  0101  5       5
  0110  6       6
  0111  7       7
  1000  8       -8
  1001  9       -7
  1010  10      -6
  1011  11      -5
  1100  12      -4
  1101  13      -3
  1110  14      -2
  1111  15      -1
• Equivalence • Same encodings for nonnegative values • Uniqueness • Every bit pattern represents unique integer value • Each representable integer has unique bit encoding • ⇒ Can Invert Mappings • U2B(x) = B2U^-1(x) • Bit pattern for unsigned integer • T2B(x) = B2T^-1(x) • Bit pattern for two’s comp integer
Two’s Complement Formula • Can represent positive and negative numbers in terms of the bit value times a power of 2: d31 × -(2^31) + d30 × 2^30 + ... + d2 × 2^2 + d1 × 2^1 + d0 × 2^0 • Example: 1101_two = 1 × -(2^3) + 1 × 2^2 + 0 × 2^1 + 1 × 2^0 = -2^3 + 2^2 + 0 + 2^0 = -8 + 4 + 0 + 1 = -8 + 5 = -3_ten
Two’s Complement shortcut: Negation • Change every 0 to 1 and 1 to 0 (invert or complement), then add 1 to the result • Proof*: Sum of number and its (one’s) complement must be 111...111_two • However, 111...111_two = -1_ten • Let x’ ⇒ one’s complement representation of x • Then x + x’ = -1 ⇒ x + x’ + 1 = 0 ⇒ -x = x’ + 1 • Example: -3 to +3 to -3
  x :   1111 1111 1111 1111 1111 1111 1111 1101_two
  x’:   0000 0000 0000 0000 0000 0000 0000 0010_two
  +1:   0000 0000 0000 0000 0000 0000 0000 0011_two
  ()’:  1111 1111 1111 1111 1111 1111 1111 1100_two
  +1:   1111 1111 1111 1111 1111 1111 1111 1101_two
• You should be able to do this in your head… • *Check out www.cs.berkeley.edu/~dsw/twos_complement.html
What if too big? • Binary bit patterns above are simply representatives of numbers. Strictly speaking they are called “numerals”. • Numbers really have an infinite number of digits • with almost all being the same (00…0 or 11…1) except for a few of the rightmost digits • We just don’t normally show the leading digits • If the result of an add (or -, *, /) cannot be represented by these rightmost HW bits, overflow is said to have occurred. [Figure: unsigned number line through 11110, 11111, 00000, 00001, 00010]
Peer Instruction Question • X = 1111 1111 1110 1100_two • Y = 0011 1010 0000 0000_two • A: X > Y (if signed) • B: X > Y (if unsigned) • C: X = -19 (if signed) • Answer choices (ABC): 0: FFF 1: FFT 2: FTF 3: FTT 4: TFF 5: TFT 6: TTF 7: TTT
Peer Instruction Answer • X = 1111 1111 1110 1100_two • Y = 0011 1010 0000 0000_two • A: X > Y (if signed): False (X negative) • B: X > Y (if unsigned): True • C: X = -19 (if signed): False (X = -20) • Answer choices (ABC): 0: FFF 1: FFT 2: FTF 3: FTT 4: TFF 5: TFT 6: TTF 7: TTT
Number summary... • META: We often make design decisions to make HW simple • We represent “things” in computers as particular bit patterns: N bits ⇒ 2^N things • Decimal for human calculations, binary for computers, hex to write binary more easily • 1’s complement: mostly abandoned • 2’s complement: universal in computing, cannot avoid, so learn • Overflow: numbers are infinite; computers are finite, so errors occur! [Figure: two number lines of the 5-bit patterns 00000, 00001, ..., 01111, 10000, ..., 11110, 11111]
Information units • Basic unit is the bit(has value 0 or 1) • Bits are grouped together in units and operated on together: • Byte = 8 bits • Word = 4 bytes • Double word = 2 words • etc.
Encoding Byte Values • Byte = 8 bits • Binary: 00000000_2 to 11111111_2 • Decimal: 0_10 to 255_10 • First digit must not be 0 in C (a leading 0 marks an octal constant) • Hexadecimal: 00_16 to FF_16 • Base 16 number representation • Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’ • Write FA1D37B_16 in C as 0xFA1D37B • Or 0xfa1d37b
  Hex  Decimal  Binary
  0    0        0000
  1    1        0001
  2    2        0010
  3    3        0011
  4    4        0100
  5    5        0101
  6    6        0110
  7    7        0111
  8    8        1000
  9    9        1001
  A    10       1010
  B    11       1011
  C    12       1100
  D    13       1101
  E    14       1110
  F    15       1111
Memory addressing • Memory is an array of information units • Each unit has the same size • Each unit has its own address • The address of a unit and the contents of the unit at that address are different [Figure: memory cells at addresses 0, 1, 2 with contents 123, -17, 0]
Addressing • In most of today’s computers, the basic unit that can be addressed is a byte (how many bits are in a byte?) • MIPS (and pretty much every CPU today) is byte addressable • The address space is the set of all memory units that a program can reference • The address space is usually tied to the length of the registers • MIPS has 32-bit registers. Hence its address space is 4G bytes • Older micros (minis) had 16-bit registers, hence 64 KB address space (too small) • Some current (Alpha, Itanium, Sparc, Athlon) machines have 64-bit registers, hence an enormous address space
Machine Words • Machine Has “Word Size” • Nominal size of integer-valued data • Including addresses • Most current machines use 32-bit (4-byte) words • Limits addresses to 4GB • Becoming too small for memory-intensive applications • High-end systems use 64-bit (8-byte) words • Potential address space ≈ 1.8 × 10^19 bytes • x86-64 machines support 48-bit addresses: 256 Terabytes • Machines support multiple data formats • Fractions or multiples of word size • Always integral number of bytes
Addressing words • Although machines are byte-addressable, 4 byte integers are the most commonly used units • Every 32-bit integer starts at an address divisible by 4 int at address 0 int at address 4 int at address 8
Word-Oriented Memory Organization • Addresses Specify Byte Locations • Address of first byte in word • Addresses of successive words differ by 4 (32-bit) or 8 (64-bit) [Figure: byte addresses 0000 to 0015 in three columns (32-bit Words, 64-bit Words, Bytes); the 32-bit words start at 0000, 0004, 0008, 0012 and the 64-bit words start at 0000 and 0008]
Data Representations • Sizes of C Objects (in Bytes)
  C Data Type   Typical 32-bit   Intel IA32   x86-64
  char          1                1            1
  short         2                2            2
  int           4                4            4
  long          4                4            8
  long long     8                8            8
  float         4                4            4
  double        8                8            8
  long double   8                10/12        10/16
  char *        4                4            8
• char * stands for any pointer type
Byte Ordering How should bytes within multi-byte word be ordered in memory? Conventions • Big Endian: Sun, PPC Mac, Internet • Least significant byte has highest address • Little Endian: x86 • Least significant byte has lowest address
Byte Ordering Example • Big Endian • Least significant byte has highest address • Big End First • Little Endian • Least significant byte has lowest address • Little End First • Example • Variable x has 4-byte representation 0x01234567 • Address given by &x is 0x100
  Address        0x100  0x101  0x102  0x103
  Big Endian     01     23     45     67
  Little Endian  67     45     23     01
Reading Byte-Reversed Listings • Disassembly • Text representation of binary machine code • Generated by program that reads the machine code • Example Fragment
  Address    Instruction Code        Assembly Rendition
  8048365:   5b                      pop    %ebx
  8048366:   81 c3 ab 12 00 00       add    $0x12ab,%ebx
  804836c:   83 bb 28 00 00 00 00    cmpl   $0x0,0x28(%ebx)
• Deciphering Numbers • Value: 0x12ab • Pad to 32 bits: 0x000012ab • Split into bytes: 00 00 12 ab • Reverse: ab 12 00 00
Examining Data Representations • Code to Print Byte Representation of Data • Casting pointer to unsigned char * creates byte array
    #include <stdio.h>

    typedef unsigned char *pointer;

    /* Print the address and value of each of the len bytes at start. */
    void show_bytes(pointer start, int len)
    {
        int i;
        for (i = 0; i < len; i++)
            /* %p expects a void *; its exact output format is implementation-defined */
            printf("0x%p\t0x%.2x\n", (void *)(start + i), start[i]);
        printf("\n");
    }
• Printf directives: • %p: Print pointer • %x: Print Hexadecimal
show_bytes Execution Example
    int a = 15213;
    printf("int a = 15213;\n");
    show_bytes((pointer) &a, sizeof(int));
Result (Linux):
    int a = 15213;
    0x11ffffcb8   0x6d
    0x11ffffcb9   0x3b
    0x11ffffcba   0x00
    0x11ffffcbb   0x00
Representing & Manipulating Sets • Representation • Width w bit vector represents subsets of {0, …, w-1} • a_j = 1 if j ∈ A
    01101001    { 0, 3, 5, 6 }
    76543210
    01010101    { 0, 2, 4, 6 }
    76543210
• Operations
    &   Intersection           01000001    { 0, 6 }
    |   Union                  01111101    { 0, 2, 3, 4, 5, 6 }
    ^   Symmetric difference   00111100    { 2, 3, 4, 5 }
    ~   Complement             10101010    { 1, 3, 5, 7 }
Bit-Level Operations in C • Operations &, |, ~, ^ Available in C • Apply to any “integral” data type • long, int, short, char, unsigned • View arguments as bit vectors • Arguments applied bit-wise • Examples (char data type) • ~0x41 --> 0xBE (~01000001_2 --> 10111110_2) • ~0x00 --> 0xFF (~00000000_2 --> 11111111_2) • 0x69 & 0x55 --> 0x41 (01101001_2 & 01010101_2 --> 01000001_2) • 0x69 | 0x55 --> 0x7D (01101001_2 | 01010101_2 --> 01111101_2)
Contrast: Logic Operations in C • Contrast to Logical Operators • &&, ||, ! • View 0 as “False” • Anything nonzero as “True” • Always return 0 or 1 • Early termination • Examples (char data type) • !0x41 --> 0x00 • !0x00 --> 0x01 • !!0x41 --> 0x01 • 0x69 && 0x55 --> 0x01 • 0x69 || 0x55 --> 0x01 • p && *p (avoids null pointer access)
Shift Operations • Left Shift: x << y • Shift bit-vector x left y positions • Throw away extra bits on left • Fill with 0’s on right • Right Shift: x >> y • Shift bit-vector x right y positions • Throw away extra bits on right • Logical shift: fill with 0’s on left • Arithmetic shift: replicate most significant bit on left • Undefined Behavior • Shift amount < 0 or ≥ word size
  Argument x    01100010    10100010
  << 3          00010000    00010000
  Log. >> 2     00011000    00101000
  Arith. >> 2   00011000    11101000
The CPU - Instruction Execution Cycle • The CPU executes a program by repeatedly following this cycle 1. Fetch the next instruction, say instruction i 2. Execute instruction i 3. Compute address of the next instruction, say j 4. Go back to step 1 • Of course we’ll optimize this but it’s the basic concept
What’s in an instruction? • An instruction tells the CPU • the operation to be performed via the OPCODE • where to find the operands (source and destination) • For a given instruction, the ISA specifies • what the OPCODE means (semantics) • how many operands are required and their types, sizes etc.(syntax) • Operand is either • register (integer, floating-point, PC) • a memory address • a constant
Reference slides You ARE responsible for the material on these slides (they’re just taken from the reading anyway) ; we’ve moved them to the end and off-stage to give more breathing room to lecture!
Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta • physics.nist.gov/cuu/Units/binary.html • Common use prefixes (all SI, except K [= k in SI]) • Confusing! Common usage of “kilobyte” means 1024 bytes, but the “correct” SI value is 1000 bytes • Hard disk manufacturers & telecommunications are the only computing groups that use SI factors, so what is advertised as a 30 GB drive will actually only hold about 28 × 2^30 bytes, and a 1 Mbit/s connection transfers 10^6 bps.
kibi, mebi, gibi, tebi, pebi, exbi, zebi, yobi • en.wikipedia.org/wiki/Binary_prefix • New IEC Standard Prefixes [only to exbi officially] • International Electrotechnical Commission (IEC) in 1999 introduced these to specify binary quantities. • Names come from shortened versions of the original SI prefixes (same pronunciation) and bi is short for “binary”, but pronounced “bee” :-( • Now SI prefixes only have their base-10 meaning and never have a base-2 meaning. • As of this writing, this proposal has yet to gain widespread use…
MEMORIZE! The way to remember #s • What is 2^34? How many bits are needed to address (i.e., what’s ceil log2 = lg of) 2.5 TiB? • Answer! 2^XY means…
  X    Prefix   Approx.
  0    ---      ---
  1    kibi     ~10^3
  2    mebi     ~10^6
  3    gibi     ~10^9
  4    tebi     ~10^12
  5    pebi     ~10^15
  6    exbi     ~10^18
  7    zebi     ~10^21
  8    yobi     ~10^24

  Y    Factor
  0    1
  1    2
  2    4
  3    8
  4    16
  5    32
  6    64
  7    128
  8    256
  9    512
Which base do we use? • Decimal: great for humans, especially when doing arithmetic • Hex: if a human is looking at long strings of binary numbers, it’s much easier to convert to hex and look at 4 bits/symbol • Terrible for arithmetic on paper • Binary: what computers use; you will learn how computers do +, -, *, / • To a computer, numbers are always binary • Regardless of how the number is written: • 32_ten == 32_10 == 0x20 == 100000_2 == 0b100000 • Use subscripts “ten”, “hex”, “two” in book and slides when it might be confusing