
L12-Algorithm timing

ECE 2560. L12-Algorithm timing. Department of Electrical and Computer Engineering The Ohio State University. Execution time. Execution time for instructions Algorithm execution time Dumb code Smarter code Constant time

rowena

Presentation Transcript


  1. ECE 2560 L12-Algorithm timing Department of Electrical and Computer Engineering The Ohio State University ECE 3561 - Lecture 1

  2. Execution time
     • Execution time for instructions
     • Algorithm execution time
     • Dumb code
     • Smarter code
     • Constant time
     • Information relevant to this lecture can be found at the end of Chapter 3 of the MSP430 User's Guide.

  3. The first code
     • Computing A*B
     • Choose A as the number of times to add B to SUM
         WHILE A > 0 LOOP
           SUM = SUM + B;
           A = A - 1;
         END LOOP;
     • How long does this take to execute?
     • A and B are integers from 0 to 127
     • On average the loop is repeated 64 times; the minimum is 0 times and the maximum is 127 times.
     • Need to look at instruction execution cycles.
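The loop above can be sketched in Python to check the iteration count. This is only an illustration of the pseudocode; the function name `multiply_by_repeated_addition` is hypothetical and does not come from the lecture.

```python
def multiply_by_repeated_addition(a, b):
    """Return (a * b, iterations) using the WHILE loop from the slide."""
    total = 0
    iterations = 0
    while a > 0:
        total += b       # SUM = SUM + B
        a -= 1           # A = A - 1
        iterations += 1
    return total, iterations


# Worst case for 7-bit operands: the loop runs 127 times.
print(multiply_by_repeated_addition(127, 127))
```

The iteration count depends only on A, which is why the execution time of this approach grows linearly with the value of A.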

  4. The code
     • Code of subroutine
         srmult  mov  2(SP),R6    ;B to R6
                 mov  4(SP),R5    ;A to R5
                 clr  R7          ;clear sum
         mlp     add  R6,R7
                 dec  R5
                 jne  mlp         ;at end R7 is A*B
                 mov  R7,4(SP)    ;R7 for rtn
                 ret              ;return from sr
     • Need to look at each instruction for the cycles it takes to execute.

  5. Time for instructions
     • Tables 3-14, 3-15, and 3-16 of the user's guide give the number of cycles each instruction takes to execute.
     • Table 3-14 is for interrupts and will be discussed later.
     • Table 3-15 is for single-operand instructions: RRA, RRC, SWPB, SXT, PUSH, and CALL.
     • PUSH is the only one of note, and it is not used in the code on the previous slide.

  6. Table 3-16
     • Note that it is organized by src and dst addressing modes.
     • Examples:
         Instruction      Cycles   Length
         mov R5,R6        1        1 word
         mov 2(SP),Rx     3        2 words
         mov R7,4(SP)     4        2 words

  7. Side note
     • This processor has instructions that occupy 1 to 3 words in memory and take 1 to 6 cycles to execute.
     • Such a processor is considered to be a CISC processor even though it may have many RISC features.
     • A true RISC processor has instructions that are 1 or 2 words (machine instruction and operand address) and take 1 or 2 cycles to execute.

  8. Analyze our program
     • Instructions and cycles
         srmult  mov  2(SP),R6    3
                 mov  4(SP),R5    3
                 clr  R7          1
         mlp     add  R6,R7       1
                 dec  R5          1
                 jne  mlp         2
                 mov  R7,4(SP)    4
                 ret              3
     • Note: Timing for a few instructions such as ret is not listed directly, although the timing for Format-II instructions is. ret is emulated as mov @SP+,PC; most likely it takes 3 cycles.

  9. Time for routine
     • Setup time: 7 cycles (two mov at 3 cycles each, plus clr at 1 cycle)
     • Loop: 4 cycles per pass
     • Cleanup/return: 7 cycles
     • Total cycles = setup + cleanup + n*loop = 14 + 4n
     • What is n? It is the loop count, which is A in A*B.
     • What is the time?
     • The largest product is 127 x 127, so max cycles = 14 + 4*127 = 522
     • The minimum case, 1 x 1, takes just 14 + 4*1 = 18 cycles
     • On average the loop runs 64 times, so average cycles = 14 + 4*64 = 270
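As a rough check of the cycle count, the sketch below tallies the routine instruction by instruction in Python. The per-instruction figures are the ones quoted in the lecture; `clr R7` is counted at 1 cycle as part of setup, and `ret` is assumed to take 3 cycles. The names `SETUP`, `LOOP`, `CLEANUP`, and `total_cycles` are illustrative only.

```python
# Cycle counts per instruction for the repeated-addition subroutine.
# Values follow the Table 3-16 figures quoted in the lecture;
# the 3 cycles for ret is an assumption, as in the slides.
SETUP = [("mov 2(SP),R6", 3), ("mov 4(SP),R5", 3), ("clr R7", 1)]
LOOP = [("add R6,R7", 1), ("dec R5", 1), ("jne mlp", 2)]
CLEANUP = [("mov R7,4(SP)", 4), ("ret", 3)]


def total_cycles(n):
    """Cycles for n passes through the loop (n >= 1)."""
    fixed = sum(c for _, c in SETUP) + sum(c for _, c in CLEANUP)
    per_pass = sum(c for _, c in LOOP)
    return fixed + per_pass * n


for n in (1, 64, 127):
    print(n, total_cycles(n))
```

Keeping the counts in lists makes it easy to see how the total changes if one instruction's timing is revised.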

  10. Modification to routine
      • To shorten the time, the code could be modified so that A, the loop count, is the smaller of the two values.
      • However, this would not change the equation just derived.
      • A better algorithm is needed.

  11. Multiplication
      • Consider multiplication in base 10
              1006
            x   32
            ------
              2012
             3018
            ------
             32192
      • Binary is much the same, if not easier

  12. Binary multiplication
      • Each binary digit is either 0 or 1
                 1100111   multiplicand
             x   0100101   multiplier
            ------------
                 1100111
               1100111
            1100111
            ------------
            111011100011
      • An algorithm can be developed to do essentially this.
      • The multiplicand is shifted as each bit of the multiplier is examined. If the bit is a 1, the shifted multiplicand is added to the running sum.
      • This is a finite-time algorithm: the number of steps is fixed by the number of bits in the multiplier.
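The shift-and-add idea can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's code; the function name `shift_and_add` and the default of 7 multiplier bits are assumptions chosen to match the 0 to 127 operand range.

```python
def shift_and_add(multiplicand, multiplier, bits=7):
    """Multiply by examining one multiplier bit per step."""
    total = 0
    for i in range(bits):
        if (multiplier >> i) & 1:        # is bit i of the multiplier a 1?
            total += multiplicand << i   # add the shifted multiplicand
    return total


# The operands from the slide: 1100111 (103) times 0100101 (37).
product = shift_and_add(0b1100111, 0b0100101)
print(product, bin(product))
```

The loop always runs `bits` times, whatever the operand values are, which is what bounds the execution time.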

  13. The code
      • Put arguments in R5 and R6
      • Sum will be in R7
          srmult  mov  2(SP),R5     ;A multiplier
                  mov  4(SP),R6     ;B multiplicand
                  clr  R7           ;R7 for sum
                  mov  #0x0001,R9   ;the mask for testing
                  mov  #8,R8        ;will execute loop 7 times
          tol     dec  R8
                  jeq  done
                  bit  R9,R5        ;test bit of multiplier
                  jeq  nxtbit       ;jump if zero

  14. What is bit test
      • The BIT instruction logically ANDs the source and destination. If only 1 bit of one operand is set, the result tells you whether that bit of the other operand is a 0 or a 1.
      • Set up a mask in a register. This will be used to check each bit of the multiplier.
      • After a BIT test, the CCR bits are set as follows: N and Z reflect the result, C = Z' (the complement of Z), and V is reset.
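To make the flag behaviour concrete, here is a small Python model of the status bits after a BIT test. It is a sketch under the description on this slide (AND the operands, then set N, Z, C = Z', and V = 0); the function name `bit_test` and the 16-bit default width are assumptions for illustration.

```python
def bit_test(src, dst, width=16):
    """Model the status flags after BIT src,dst (operands are not changed)."""
    result = src & dst & ((1 << width) - 1)
    z = int(result == 0)
    return {
        "N": (result >> (width - 1)) & 1,  # most significant bit of the AND
        "Z": z,                            # set when the AND result is zero
        "C": 1 - z,                        # carry is the complement of Z
        "V": 0,                            # always reset
    }


multiplier = 0b0100101
mask = 0x0001
print(bit_test(mask, multiplier))        # bit 0 of the multiplier is 1
print(bit_test(mask << 1, multiplier))   # bit 1 of the multiplier is 0
```

With a single-bit mask, testing Z (or C) after the operation is enough to decide whether the multiplicand should be added.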

  15. The overall scheme
      • Setup
          • Register 5 has the multiplier
          • Register 6 has the multiplicand
          • Register 9 has the mask
          • Clear sum
      • Loop 7 times
          • BIT R9 and R5
          • IF not zero THEN add_multiplicand_to_sum
          • Shift R9 left 1 position
          • Shift R6 left 1 position (*2)
          • Back to Loop
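The scheme above can be followed step by step in a Python model that uses variables named after the registers. This is an illustrative sketch, not the subroutine itself; the function name `srmult_model` is hypothetical, and the `& 0xFFFF` masks stand in for 16-bit registers.

```python
def srmult_model(a, b):
    """Follow the scheme with 16-bit register values; returns the sum."""
    r5, r6 = a, b         # R5 = multiplier, R6 = multiplicand
    r7 = 0                # R7 = sum, cleared during setup
    r9 = 0x0001           # R9 = mask for the bit under test
    for _ in range(7):    # one pass per bit of a 7-bit multiplier
        if r5 & r9:                     # BIT R9,R5 gives a nonzero result
            r7 = (r7 + r6) & 0xFFFF     # add multiplicand to sum
        r9 = (r9 << 1) & 0xFFFF         # shift mask left 1 position
        r6 = (r6 << 1) & 0xFFFF         # shift multiplicand left (x 2)
    return r7


print(srmult_model(103, 37))
```

For operands from 0 to 127 neither the shifted multiplicand nor the sum exceeds 16 bits, so the masks never discard anything in that range.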

  16. The complete code
      • Put arguments in R5 and R6
      • Sum will be in R7
          srmult  mov  2(SP),R5     ;A multiplier
                  mov  4(SP),R6     ;B multiplicand
                  clr  R7           ;R7 for sum
                  mov  #0x0001,R9   ;the mask for testing
                  mov  #8,R8        ;will execute loop 7 x
          tol     dec  R8
                  jeq  done         ;exit after 7 passes
                  bit  R9,R5        ;test bit of multiplier
                  jz   nxtbit       ;jump if zero
                  add  R6,R7        ;add multiplicand to sum
          nxtbit  clrc              ;need to do rotate
                  rlc  R9           ;mask to next bit
                  rlc  R6           ;multiplicand x2
                  jmp  tol
          done    mov  R7,4(SP)     ;sum (R7) for rtn
                  ret               ;return from sr

  17. Time analysis
      • Before the loop
      • 2 mov using x(Rn) to Rm: 3 cycles each
      • 1 clr instruction: 1 cycle
      • 2 mov immediate to Rm: 2 cycles each (the Table 3-16 figure for an immediate source)
      • Total setup time = 2*3 + 1 + 2*2 = 11 CPU cycles

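  18. Time analysis: the loop
      • Within the loop
      • 1 dec instruction: 1 cycle
      • 1 conditional jump (jeq done): 2 cycles
      • NOTE: all jump instructions take 2 CPU cycles to execute regardless of whether the jump is taken or not.
      • The BIT instruction (R,R): 1 cycle
      • Conditional jump (jz nxtbit): 2 cycles
      • If the jump is not taken, the add instruction (R,R): 1 cycle
      • The clrc instruction: 1 cycle
      • 2 rlc instructions: 1 cycle each
      • Unconditional jump: 2 cycles
      • Loop cycles = 11, + 1 when the jump is not taken
      • Total loop time = 7 * loop cycles
      • Max loop time = 7 * 12 = 84 cycles; min loop time = 7 * 11 = 77 cycles
      • Setup is 11 cycles (two mov x(Rn),Rm at 3 each, one clr at 1, two mov immediate at 2 each), so setup + max loop time = 11 + 84 = 95 cycles
      • The final dec/jeq that exits the loop (3 cycles) and the mov to 4(SP) at done (4 cycles) add 7 more; with a 3-cycle ret, as in the first routine, the worst case is 105 cycles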
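The tally can be cross-checked with a short Python model. It is a sketch under stated assumptions: each rlc and clrc is taken as 1 cycle, each mov immediate as 2 cycles, every jump as 2 cycles, and the count includes the final dec/jeq exit test plus a 4-cycle mov to 4(SP) and a 3-cycle ret, so it will differ from any tally that leaves those out. The function name `shift_add_cycles` is hypothetical.

```python
def shift_add_cycles(multiplier, passes=7):
    """Estimated cycles for the shift-and-add routine for a given multiplier."""
    ones = bin(multiplier & ((1 << passes) - 1)).count("1")
    setup = 3 + 3 + 1 + 2 + 2                    # 2 mov x(SP), clr, 2 mov #imm
    per_pass = 1 + 2 + 1 + 2 + 1 + 1 + 1 + 2     # dec, jeq, bit, jz, clrc, rlc, rlc, jmp
    exit_test = 1 + 2                            # last dec + jeq that leaves the loop
    cleanup = 4 + 3                              # mov to 4(SP), ret (assumed 3)
    # One extra cycle for each pass in which the add is executed.
    return setup + passes * per_pass + ones + exit_test + cleanup


for m in (0, 0b0100101, 127):
    print(m, shift_add_cycles(m))
```

Under these assumptions the time varies by only 7 cycles between the best and worst multiplier, in contrast to the repeated-addition routine, whose time grows by 4 cycles for every increment of the loop count.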

  19. Summary - Assignment
      • No new assignment.
      • But try coding up the loop.
