TK 2123 COMPUTER ORGANISATION & ARCHITECTURE

Lecture 4: Data in the Computer
Dr. Masri Ayob

Contents
• This lecture will address:
  • Several different number systems.
  • Data formats:
    • Alphanumeric characters.
    • Image data.
    • Audio data.
    • Data compression.
  • Internal computer data format:
    • Representing integer data.
    • Floating point numbers.

Number Systems
• Computers perform all of their operations using binary (base 2).
• Program code and data are stored and manipulated in binary.
• Each digit in a binary number is known as a bit (value 0 or 1).
• Bits are commonly stored and manipulated in groups of:
  • 8 bits: byte.
  • 16 bits: halfword.
  • 32 bits: word.
  • 64 bits: doubleword.

Number Systems
• The number of bits used in calculations affects the accuracy and the size limits of the values that can be represented.
• In a programming language, the programmer can define a signed integer variable to be, for example:
  • short (16 bits)
  • int (32 bits)
  • long (64 bits)
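The link between bit width and size limits can be sketched as follows. This is a minimal illustration, assuming signed values are stored in 2's complement form; the function names are hypothetical and chosen only for this example.

```python
def signed_range(bits):
    """Return (min, max) for a signed integer of the given width,
    assuming 2's complement storage."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def unsigned_range(bits):
    """Return (min, max) for an unsigned integer of the given width."""
    return 0, (1 << bits) - 1


# Widths taken from the slide: short, int and long.
for name, bits in [("short", 16), ("int", 32), ("long", 64)]:
    low, high = signed_range(bits)
    print(f"{name:5s} ({bits} bits): {low} .. {high}")

print(signed_range(16))   # (-32768, 32767)
print(unsigned_range(8))  # (0, 255)
```

Each extra bit doubles the number of values that can be represented, which is why the choice of width matters for both range and storage.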

Number Systems
• Common number systems used when working with computers include:
  • base 2 (binary)
  • base 10 (decimal)
  • base 8 (octal)
  • base 16 (hexadecimal)

Number Systems: Counting in Different Bases
• Base 10: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, … 99, 100, …
• Base 8: 0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, … 17, 20, … 77, 100, …
• Base 16: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10, 11, … FF, 100, …
• Base 2: 0, 1, 10, 11, 100, 101, 110, 111, …

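Numeric Conversion between Number Bases
• Convert the number to base 10.
• E.g. 13754_8 = ???
    (1 × 8^4) + (3 × 8^3) + (7 × 8^2) + (5 × 8^1) + (4 × 8^0) = 6124_10
• Other method (multiply the running total by 8, then add the next digit):
    1 × 8 = 8
    (8 + 3) × 8 = 88
    (88 + 7) × 8 = 760
    (760 + 5) × 8 = 6120
    6120 + 4 = 6124_10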
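Both methods on this slide can be sketched in a few lines. This is an illustrative example only; the function names are hypothetical, and the digits are passed as a list with the most significant digit first.

```python
def to_decimal_positional(digits, base):
    """Sum digit * base**position, with positions counted from the right."""
    n = len(digits)
    return sum(d * base ** (n - 1 - i) for i, d in enumerate(digits))


def to_decimal_running_total(digits, base):
    """The 'other method': multiply the running total by the base,
    then add the next digit."""
    total = 0
    for d in digits:
        total = total * base + d
    return total


digits = [1, 3, 7, 5, 4]  # 13754 in base 8
print(to_decimal_positional(digits, 8))     # 6124
print(to_decimal_running_total(digits, 8))  # 6124
print(int("13754", 8))                      # 6124, cross-check with the built-in
```

The second method avoids computing powers of the base, which is why it is convenient for hand calculation.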

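Numeric Conversion between Number Bases
• Convert the number from base 10.
• E.g. 6124_10 = ???_5
• Divide repeatedly by the new base and record the remainders:
    6124 / 5 = 1224   remainder 4
    1224 / 5 = 244    remainder 4
    244 / 5 = 48      remainder 4
    48 / 5 = 9        remainder 3
    9 / 5 = 1         remainder 4
    1 / 5 = 0         remainder 1
• Reading the remainders from last to first gives 143444_5.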
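The repeated-division procedure can be sketched as below. This is a minimal example, assuming a non-negative input and a target base of at most 16; the function name is hypothetical.

```python
def from_decimal(value, base):
    """Convert a non-negative integer to a digit string in the given base
    (2 to 16) by repeated division, collecting the remainders."""
    symbols = "0123456789ABCDEF"
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, base)
        remainders.append(symbols[remainder])
    # The first remainder is the least significant digit, so reverse the list.
    return "".join(reversed(remainders))


print(from_decimal(6124, 5))  # 143444
print(from_decimal(6124, 8))  # 13754
```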

Convert Binary Number to Hex
• E.g. 0011 0101 1101 1000
• Group the bits into fours, starting from the right, and convert each group to one hex digit:
    0011 0101 1101 1000
      3    5    D    8     = 35D8_16
• Most computer manufacturers prefer to use hexadecimal, since a 16-bit or 32-bit number can be represented exactly by a four- or eight-digit hex number.
• Conversions between binary and hex are used frequently.
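The grouping step can be sketched as follows. This is an illustrative example; the function name is hypothetical, and the input is assumed to be a string of 0s and 1s, optionally separated by spaces.

```python
def binary_to_hex(bits):
    """Convert a binary string to hex by grouping the bits into fours."""
    bits = bits.replace(" ", "")
    # Pad on the left so the length is a multiple of 4.
    width = (len(bits) + 3) // 4 * 4
    bits = bits.zfill(width)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in groups)


print(binary_to_hex("0011 0101 1101 1000"))  # 35D8
```

Because 16 is a power of 2, each hex digit corresponds to exactly four bits, so no arithmetic across groups is needed.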

Data Formats
• Since all data and code in the computer are binary, it is almost always necessary to convert our words, numbers, images and sounds into a different form in order to store and process them in the computer.
• Original data (characters, images, etc.) must first be brought into the computer and converted into an appropriate computer representation so that it can be processed, stored and used within the computer system.

Data Formats

Data Formats
• Different input devices are used for converting original data into computer format.
  • Keyboard: generates a binary code for each key.
  • Microphone: converts analog sound into binary data using an analog-to-digital converter (ADC).
  • Camera: converts an analog picture into binary data using an ADC.
  • Etc.

Data Formats
• There must be agreement between input and output devices, so that the data is displayed correctly.
• If necessary, translation programs can be used to translate from one representation to another.
  • Example: data from the keyboard enters the computer in the form of a character stream.
• For storage and transmission of data, a representation different from that used for internal processing is often necessary.
  • In addition to the actual data (for example, the points in an image), the system must also store and pass along information that describes or interprets the meaning of the data.
  • This information is known as metadata.
  • E.g. for a graphic image: the type of graphical image, the colour format, etc.

Data Formats
• Individual programs can store and process data in any format that they want.
• The formats used by individual programs are known as proprietary formats.
• However, standard data representations exist to be used as interfaces between different programs, between a program and I/O devices, between interconnected hardware, and between systems that share data.

Data Formats
• Many different standards are in use for different types of data.
• Some common data representations are:

Alphanumeric Character Data
• Characters, number digits and punctuation are known as alphanumeric data.
• Since there is no processing capability in the keyboard itself, numeric data must be entered into the computer just like other characters, one digit at a time.
  • Conversion to a numeric value is done by software.
• Alphanumeric data must be stored and processed within the computer in binary form, which requires translating each character into a code.
• The choice of code used is arbitrary.
• Three common alphanumeric codes:
  • Unicode
  • ASCII (American Standard Code for Information Interchange)
  • EBCDIC (Extended Binary Coded Decimal Interchange Code), pronounced "ebb-see-dick"
• Many computers and terminals use Unicode or ASCII.

ASCII Code Table
• The codes are in hex.
• This is a 7-bit code, giving 128 entries.

ASCII
• Note that ASCII is designed so that the letters are in order, so a simple numerical sort on the codes can be used within the computer to perform alphabetization.
• The order of the codes in the representation table is known as its collating sequence.
• There are two classes of codes:
  • Printing characters – produce output on the screen or printer.
  • Control characters – used to control the position of the output on the screen or paper, to cause some action to occur (e.g. ringing a bell, deleting a character), etc.
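The collating sequence and the two classes of codes can be illustrated with a short sketch. The word list and the function name are made up for this example; the classification assumes the standard 7-bit ASCII layout, in which codes 0 to 31 and 127 are control characters.

```python
def ascii_class(code):
    """Classify a 7-bit ASCII code as a control or printing character."""
    if not 0 <= code <= 127:
        raise ValueError("not a 7-bit ASCII code")
    return "control" if code < 32 or code == 127 else "printing"


# Codes of a few characters, in hex as in the ASCII table.
for ch in ["0", "9", "A", "Z", "a", "z"]:
    print(ch, hex(ord(ch)), ascii_class(ord(ch)))

# Python compares strings by character code, so sorted() follows the
# collating sequence: all upper-case letters sort before lower-case ones.
words = ["pear", "Apple", "banana", "apple", "Zebra"]
print(sorted(words))  # ['Apple', 'Zebra', 'apple', 'banana', 'pear']
```

The result shows one consequence of a purely numerical sort: it alphabetizes correctly only when the words use the same letter case.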

Control Code Definitions
• Except for the position control characters, the control characters are struck by holding down the Control key and striking a character.
• The code produced corresponds in table position to the position of the same alphabetic character.
• E.g. "Ctrl A" produces SOH.
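The relationship between a letter and its control code can be sketched as below, assuming the standard ASCII layout in which a control code equals the letter's code with only the low five bits kept. The function name is hypothetical.

```python
def control_code(letter):
    """Return the ASCII control code produced by Ctrl + letter,
    assuming the standard ASCII table layout."""
    return ord(letter.upper()) & 0x1F  # keep only the low five bits


print(control_code("A"))  # 1, SOH
print(control_code("G"))  # 7, BEL (rings the bell)
print(control_code("M"))  # 13, CR (carriage return)
```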

ASCII vs Unicode
• Because of the limitations of the 7-bit ASCII code, the American National Standards Institute (ANSI) also extended it to an 8-bit code, known as Latin-1.
• Latin-1 is an ISO standard (ISO 8859-1).
• However, an 8-bit code is still not adequate for representing all the characters in use, which led to Unicode.
• 16-bit Unicode can represent 65,536 characters, of which approximately 49,000 have been defined.
• A more recent standard, Unicode 3.1, supports more than a million different characters.
• Unicode is multilingual in the most global sense.
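The progression from 7-bit to 8-bit to 16-bit codes can be sketched by checking how large a code each character needs. This is a simplified illustration: the function name is hypothetical, and it looks only at the numeric code point, not at any particular storage encoding.

```python
def smallest_code(ch):
    """Name the smallest of the codes discussed above that can hold ch."""
    code_point = ord(ch)
    if code_point < 0x80:
        return "7-bit ASCII"
    if code_point < 0x100:
        return "8-bit Latin-1"
    if code_point < 0x10000:
        return "16-bit Unicode"
    return "beyond 16 bits"


samples = [
    "A",           # Latin capital letter A
    "\u00e9",      # e with acute accent
    "\u20ac",      # euro sign
    "\U0001F600",  # an emoji outside the 16-bit range
]
for ch in samples:
    print(hex(ord(ch)), smallest_code(ch))
```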

Two-byte Unicode Table

Keyboard Input
• When a key is struck on the keyboard, the circuitry in the keyboard generates a binary code, called a scan code.
• When the key is released, a different code is generated.
• The scan codes are converted to Unicode, ASCII or EBCDIC codes by software within the terminal or PC to which the keyboard is connected.
• Advantage of software conversion: the keyboard can easily be changed to correspond to a different language or keyboard layout.

Keyboard Operation

Alternative Sources of Alphanumeric Input
• Optical character recognition:
  • Scan text with an image scanner and convert the image into alphanumeric data using optical character recognition (OCR) software.
• Bar code readers:
  • Bar codes represent alphanumeric data.
  • Bar codes are read optically using a device called a wand, which converts a visual scan of the code into electrical binary signals that a bar code translation module can read.

Alternative Sources of Alphanumeric Input
• Magnetic stripe reader:
  • Reads alphanumeric data from credit cards and other similar devices.
• Voice input:
  • It is currently possible and practical to digitise audio for use as input data.
  • However, the technology to interpret audio data as voice input and to translate it into alphanumeric form is still primitive.

Image Data
• Images used in computers fall into two categories: bitmap images and object images. Different computer representations and processing techniques are used for each category.
• Bitmap image (raster image): e.g. photographs and paintings.
  • Produced by: scanner, digital camera, video camera frame grabber, or software such as a paint program.
  • To maintain and reproduce the detail of these images, it is necessary to represent and store each individual point within the image.
  • GIF and JPEG are common bitmap image formats used on the Web.

Image Data
• Object image (vector image): made up of graphical shapes such as lines, circles, etc. that can be defined geometrically.
  • Produced using a drawing or design package.
  • Example: the movies Shrek and Toy Story were made from object images.

Image Input
• Image scanner.
• Digital camera.
• Video capture devices.
• Graphical input using pointing devices.

Audio Data
• A few different formats are used for storing audio waveforms, e.g.:
  • .MOD
  • .MIDI
  • .VOC
  • .WAV
  • .MP3

Data Compression
• Because of the volume of multimedia data (particularly video, but also sound and images), data compression is usually desirable.
• Two categories of data compression:
  • Lossless – allows complete recovery of the original uncompressed data.
  • Lossy – does not allow complete recovery, but is designed so that the result is perceived as sufficient by the user.
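The idea of lossless compression can be illustrated with run-length encoding, one simple scheme chosen here only as an example; the slides do not name a specific algorithm, and the function names and sample data are hypothetical.

```python
def rle_encode(data):
    """Encode a string as a list of (symbol, run_length) pairs."""
    runs = []
    for symbol in data:
        if runs and runs[-1][0] == symbol:
            runs[-1] = (symbol, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((symbol, 1))              # start a new run
    return runs


def rle_decode(runs):
    """Rebuild the original string from its (symbol, run_length) pairs."""
    return "".join(symbol * count for symbol, count in runs)


# One row of a black-and-white image: 6 white, 3 black, 9 white points.
row = "W" * 6 + "B" * 3 + "W" * 9
encoded = rle_encode(row)
print(encoded)                     # [('W', 6), ('B', 3), ('W', 9)]
print(rle_decode(encoded) == row)  # True: the original is fully recovered
```

Decoding returns exactly the original data, which is what distinguishes a lossless scheme from a lossy one.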

Data Formats
• Internally, all data, regardless of use, are stored as binary numbers.
• Instructions in the computer support interpretation of these numbers as characters, integers, pointers and floating point numbers.
• No special provision is made for the storage of the algebraic sign or decimal point that might be associated with a number.

Representing Integer Data
• An unsigned integer can be stored using unsigned binary or binary-coded decimal (BCD).
• Unsigned binary: the range of integers that can be stored is determined by the number of bits available, e.g. 8 bits can store an unsigned integer between 0 and 255.
  • To store larger numbers, multiple 8-bit storage locations are used.
• BCD: the number is stored as a digit-by-digit binary representation of the original decimal integer. Each decimal digit is individually converted to 4-bit binary.
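The difference between the two representations can be seen by encoding the same value both ways. This is a minimal sketch; the function names and the sample value 68 are chosen only for illustration.

```python
def to_unsigned_binary(value, bits=8):
    """Return the unsigned binary pattern of value in the given width."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{value} does not fit in {bits} bits")
    return format(value, f"0{bits}b")


def to_bcd(value):
    """Return the BCD pattern of value: each decimal digit as 4 bits."""
    return " ".join(format(int(digit), "04b") for digit in str(value))


print(to_unsigned_binary(68))  # 01000100
print(to_bcd(68))              # 0110 1000
```

Both patterns here happen to use 8 bits, but 8-bit BCD can hold only 0 to 99, whereas 8-bit unsigned binary holds 0 to 255.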

Storage of a 32-bit Data Word

Representation for Signed Integers
• The most common method of representing signed numbers is 2’s complement representation.
• The 2’s complement of a number can be found in one of two ways:
  • Subtract the value from the modulus (2^n for an n-bit number), or
  • Find the 1’s complement by inverting all the 1s and 0s, then add 1 to the result (the common method used in computers).
Figure: Two’s Complement Representation

2’s Complement Representation
• Example: the number +2 as an 8-bit number is 0000 0010.
• The number -2 as an 8-bit number is found as follows:
    +2:               0000 0010
    1’s complement:   1111 1101
    add 1:           +        1
    2’s complement:   1111 1110
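Both ways of finding the 2's complement can be sketched with bit operations. This is an illustrative example; the function names are hypothetical, and the input is assumed to be a non-negative value that fits in the given width.

```python
def twos_complement_invert(value, bits=8):
    """Invert every bit (1's complement), then add 1."""
    mask = (1 << bits) - 1
    ones_complement = ~value & mask
    return (ones_complement + 1) & mask


def twos_complement_modulus(value, bits=8):
    """Subtract the value from the modulus 2**bits."""
    mask = (1 << bits) - 1
    return ((1 << bits) - value) & mask


print(format(2, "08b"))                           # 00000010
print(format(twos_complement_invert(2), "08b"))   # 11111110
print(format(twos_complement_modulus(2), "08b"))  # 11111110
```

Applying either function twice returns the original value, which matches the fact that negating a number twice gives the number back.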

Floating Point Numbers
• The usual floating point number format consists of:
  • A sign bit.
  • An exponent.
  • A mantissa.
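The three fields can be extracted from a stored number as sketched below. The slide does not name a specific format, so this example assumes the widely used IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits); the function name is hypothetical.

```python
import struct


def decompose_float32(x):
    """Split x, stored as an IEEE 754 single-precision value (an assumption
    of this sketch), into its sign, exponent and mantissa fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # 23 bits, fraction after the leading 1
    return sign, exponent, mantissa


# -6.5 = -1.625 x 2^2, so the stored exponent is 2 + 127 = 129.
print(decompose_float32(-6.5))  # (1, 129, 5242880)
print(decompose_float32(1.0))   # (0, 127, 0)
```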

Thank you
Q & A
