Chapter 10

Chapter 10: The Stack • Stack data structure • Interrupt I/O • Arithmetic using a stack

Stack Data Structure • Abstract Data Structures are defined simply by the rules for inserting and extracting data • The rule for a Stack is LIFO (Last In - First Out) • Operations: • Push (enter item at top of stack) • Pop (remove item from top of stack) • Error conditions: • Underflow (trying to pop from empty stack) • Overflow (trying to push onto full stack) • We just have to keep track of the address of top of stack (TOS)
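The LIFO rule and the two error conditions can be sketched in a few lines of Python. This is only an illustration of the abstract data structure: the names `BoundedStack` and `StackError` and the fixed `capacity` are assumptions made for the example, not part of the slides.

```python
class StackError(Exception):
    """Raised on overflow or underflow (hypothetical name for this sketch)."""


class BoundedStack:
    """A LIFO stack with a fixed capacity, so that both error cases can occur."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []          # the end of the list is the top of stack

    def push(self, item):
        if len(self.items) == self.capacity:
            raise StackError("overflow: push onto full stack")
        self.items.append(item)

    def pop(self):
        if not self.items:
            raise StackError("underflow: pop from empty stack")
        return self.items.pop()  # last in, first out
```

Pushing "a" and then "b" and popping twice returns "b" first and "a" second.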

A “physical” stack • A coin holder as a stack

A hardware stack • Implemented in hardware (e.g. registers) • Previous data entries move up to accommodate each new data entry • Note that the Top Of Stack is always in the same place.

A software stack • Implemented in memory • The Top Of Stack moves as new data is entered • Here R6 is the TOS register, a pointer to the Top Of Stack
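The difference between the two implementations can be sketched as follows. The function names, the three-slot register file, and the starting pointer value x4000 are assumptions chosen for illustration; the point is only that the hardware version moves the data while the software version moves the pointer.

```python
def hardware_push(slots, value):
    """Hardware-style stack: TOS is always slots[0]; older entries shift along.
    In this sketch the entry in the last slot is simply dropped when full."""
    return [value] + slots[:-1]


def software_push(memory, tos, value):
    """Software-style stack: stored data stays put; only the TOS pointer moves."""
    tos -= 1                 # the stack grows toward lower addresses
    memory[tos] = value
    return tos


slots = hardware_push([None, None, None], 18)
slots = hardware_push(slots, 31)      # 31 is at slots[0]; 18 moved to slots[1]

memory = {}
r6 = 0x4000                           # assumed pointer value for an empty stack
r6 = software_push(memory, r6, 18)
r6 = software_push(memory, r6, 31)    # 18 still sits at x3FFF; R6 is now x3FFE
```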

Push & Pop • Push • Decrement TOS pointer (our stack is moving down) • then write data in R0 to new TOS • Pop • Read data at current TOS into R0 • then increment TOS pointer

PUSH    ADD  R6, R6, #-1
        STR  R0, R6, #0
POP     LDR  R0, R6, #0
        ADD  R6, R6, #1

Push & Pop (cont.) • Push • Decrement TOS pointer (our stack is moving down) • then write data in R0 to new TOS • Pop • Read data at current TOS into R0 • then increment TOS pointer • What if stack is already full or empty? • Before pushing, we have to test for overflow • Before popping, we have to test for underflow • In both cases, we use R5 to report success or failure

PUSH & POP in LC-3
Stack region: x3FFB (MAX) … x3FFF (BASE)

PUSH    ST    R2, Sv2          ; save R2, needed by PUSH
        ST    R1, Sv1          ; save R1, needed by PUSH
        LD    R1, MAX          ; MAX has -x3FFB
        ADD   R2, R6, R1       ; compare SP to x3FFB
        BRz   fail_exit        ; branch if stack is full
        ADD   R6, R6, #-1      ; adjust stack pointer
        STR   R0, R6, #0       ; the actual 'push'
        BRnzp success_exit
…
BASE    .FILL xC001            ; BASE has -x3FFF
MAX     .FILL xC005            ; MAX has -x3FFB
Sv1     .FILL x0000
Sv2     .FILL x0000

PUSH & POP in LC-3 (cont.)

POP     ST    R2, Sv2          ; save R2, needed by POP
        ST    R1, Sv1          ; save R1, needed by POP
        LD    R1, BASE         ; BASE contains -x3FFF
        ADD   R1, R1, #-1      ; R1 now has -x4000
        ADD   R2, R6, R1       ; compare SP to x4000
        BRz   fail_exit        ; branch if stack is empty
        LDR   R0, R6, #0       ; the actual 'pop'
        ADD   R6, R6, #1       ; adjust stack pointer
        BRnzp success_exit
…
BASE    .FILL xC001            ; BASE has -x3FFF
MAX     .FILL xC005            ; MAX has -x3FFB
Sv1     .FILL x0000
Sv2     .FILL x0000

PUSH & POP in LC-3 (cont.)

success_exit LD   R1, Sv1      ; restore register values
             LD   R2, Sv2
             AND  R5, R5, #0   ; R5 <-- 0 (success)
             RET
fail_exit    LD   R1, Sv1      ; restore register values
             LD   R2, Sv2
             AND  R5, R5, #0
             ADD  R5, R5, #1   ; R5 <-- 1 (fail)
             RET
BASE         .FILL xC001       ; BASE has -x3FFF
MAX          .FILL xC005       ; MAX has -x3FFB
Sv1          .FILL x0000
Sv2          .FILL x0000
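The checked PUSH and POP routines can be modelled in Python to see how the bounds tests behave. This is a sketch rather than a translation: memory is a dictionary, the class name `Lc3Stack` is made up for the example, and the attributes `r5` and `r6` merely stand in for the LC-3 registers of the same names. The limits x3FFB and x3FFF are the ones used on the slides.

```python
MAX_ADDR = 0x3FFB    # lowest address the stack may occupy (full when R6 == MAX)
BASE_ADDR = 0x3FFF   # first location used by the stack


class Lc3Stack:
    """Models the slides' stack: R6 is the stack pointer, R5 reports 0/1."""

    def __init__(self):
        self.mem = {}
        self.r6 = BASE_ADDR + 1   # empty stack: R6 = x4000
        self.r5 = 0

    def push(self, r0):
        if self.r6 == MAX_ADDR:   # overflow test, as in "ADD R2, R6, R1 / BRz"
            self.r5 = 1           # R5 <-- fail
            return
        self.r6 -= 1              # adjust stack pointer
        self.mem[self.r6] = r0    # the actual push
        self.r5 = 0               # R5 <-- success

    def pop(self):
        if self.r6 == BASE_ADDR + 1:   # underflow test: R6 == x4000
            self.r5 = 1
            return None
        r0 = self.mem[self.r6]    # the actual pop
        self.r6 += 1              # adjust stack pointer
        self.r5 = 0
        return r0
```

With these limits the stack holds five words (x3FFF down to x3FFB); a sixth push leaves R6 unchanged and sets R5 to 1.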

Memory-mapped I/O revisited

Interrupt-driven I/O [Figure: CPU, Memory, and one I/O device; the device is connected to the CPU by the IRQ and IACK lines] • Just one device • When IRQ goes active, the CPU jumps to a special memory location: the ISR, or interrupt service routine. For now, let’s say it exists at address x1000. • The CPU activates IACK to tell the device that the interrupt is being serviced, and it can stop activating the IRQ line.

Generating the Interrupt • Using the Status Register • The peripheral sets a Ready bit in SR[15] (as with polling) • The CPU sets an Interrupt Enable bit in SR[14] • These two bits are anded to set the Interrupt. • In this way, the CPU has the final say in who gets to interrupt it!
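The AND of the two status bits can be written out directly. The bit positions come from the slide; the constant and function names are assumptions for this sketch.

```python
READY_BIT = 1 << 15        # SR[15], set by the peripheral
INT_ENABLE_BIT = 1 << 14   # SR[14], set by the CPU


def interrupt_requested(status_register):
    """The interrupt is raised only when Ready AND Interrupt Enable are both set."""
    ready = bool(status_register & READY_BIT)
    enabled = bool(status_register & INT_ENABLE_BIT)
    return ready and enabled
```

A device that is ready (x8000) but not enabled does not interrupt; only xC000 has both bits set.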

Processing an interrupt: one device • Device generates an IRQ • CPU signals IACK – “OK, I’m on it.” • Switch to Supervisor Mode • CPU saves its current state • What and how? • Address of the ISR is loaded into the PC • x1000 • Continue – process the interrupt • When finished, return to running program • How?

Supervisor Mode • PSR layout: bit 15 = Priv, bits 10–8 = Priority, bit 2 = N, bit 1 = Z, bit 0 = P • Only the Operating System can access device addresses • Why? • Bit 15 of the PSR selects the privilege mode (0 = Supervisor, 1 = User)

Interrupts and program state • We need to save the PC, the PSR, and all Registers • We could require that ISRs save all relevant registers (callee save) • The callee would ALWAYS have to save the contents of the PC and PSR • In most computers these values (and possibly all register contents) are stored on a stack • Remember, there might be nested interrupts, so simply saving them to a register or reserved memory location might not work.

The Supervisor Stack • The LC-3 has two stacks • The User stack • Used by the programmer for subroutine calls and other stack functions • The Supervisor stack • Used by programs in supervisor mode (interrupt processing) • Each stack is in a separate region of memory • The stack pointer for the current stack is always R6. • If the current program is in privileged mode, R6 points to the Supervisor stack; otherwise it points to the User stack. • Two special purpose registers, Saved.SSP and Saved.USP, store the pointer that is currently not in use, so the two stacks can share push/pop subroutines without interfering with each other.

Saving State • When the CPU receives an INT signal … • If the system was previously in User mode, the User stack pointer is saved & the Supervisor stack pointer is loaded • Saved.USP <= (R6) • R6 <= (Saved.SSP) • PC and PSR are pushed onto the Supervisor Stack • Set the system to Supervisor mode • PSR[15] <= 0 • Jump to the interrupt service routine
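The state-saving steps can be sketched as one Python function. The CPU is modelled as a dictionary whose keys (`pc`, `psr`, `r6`, `saved_usp`, `saved_ssp`) are names invented for this example, and memory is a dictionary from address to word. The push order (PSR first, then PC, so that the PC ends up on top) follows the LC-3 ISA; the slide itself does not state an order.

```python
def enter_interrupt(cpu, mem, isr_addr):
    """Model of the hardware steps taken when an interrupt is accepted."""
    old_psr, old_pc = cpu["psr"], cpu["pc"]
    if old_psr & 0x8000:                 # PSR[15] = 1: we were in User mode
        cpu["saved_usp"] = cpu["r6"]     # Saved.USP <= (R6)
        cpu["r6"] = cpu["saved_ssp"]     # R6 <= (Saved.SSP)
    for word in (old_psr, old_pc):       # push PSR, then PC
        cpu["r6"] -= 1
        mem[cpu["r6"]] = word
    cpu["psr"] = old_psr & 0x7FFF        # PSR[15] <= 0 (Supervisor mode)
    cpu["pc"] = isr_addr                 # jump to the interrupt service routine
```

The example values used to exercise it (user stack at xFE00, supervisor stack at x3000, ISR at x1000) are arbitrary.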

Processing an interrupt: details • Device generates an IRQ • CPU signals IACK – “OK, I’m on it.” • CPU saves its current state • PC and PSR are saved on the Supervisor Stack • Switch to Supervisor Mode • Change the S bit in the PSR to 0. • Address of the ISR is loaded into the PC • For now we assume just one ISR – x1000 • Continue – process the interrupt • When finished, return to running program • Pop the PC and PSR from the Supervisor Stack

More than one device [Figure: CPU and Memory with four devices, I/O 1 to I/O 4, sharing one IRQ line] • Who sent the interrupt? • One way is to have a unified ISR that checks the status bits of every device in the system • This is a hybrid method between interrupt-driven I/O and polling • Requires every new device to modify the ISR • The ISR will be large and complex

Vectored Interrupts • If we have multiple devices, we need a very large ISR that knows how to deal with all of them! • Using vectored interrupts, we can have a different ISR for each device. • Each I/O device has a special register where it keeps a special number called the interrupt vector. • The vector tells the CPU where to look for the ISR.

A vectored-interrupt device
Device Controller registers:
x8000  Input register
x8002  Output register
x8004  Status register
x8006  Interrupt Vector Register (holds 67)
• When I trigger an interrupt, look up entry number 67 in the vector table, and jump to the address stored there.

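Getting the interrupt vector • INTA (the interrupt acknowledge signal, called IACK on the other slides) tells a device to put the interrupt vector on the bus • INTA is daisy chained so only one device will respond [Figure: CPU and Memory with I/O 1 to I/O 4 sharing one IRQ line; INTA is chained from one device to the next]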
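Daisy chaining can be pictured as passing the acknowledge down a list until a device with a pending request claims it. In this sketch each device is a `(pending, vector)` pair ordered by its position on the chain; that representation and the function name are assumptions for illustration.

```python
def daisy_chain_ack(devices):
    """Return the vector of the first device on the chain that has a pending
    request. That device keeps the acknowledge instead of passing it on, so
    no device further down the chain can respond."""
    for pending, vector in devices:
        if pending:
            return vector
    return None   # nobody was requesting


# Devices 2 and 4 are both requesting; device 2 is closer to the CPU.
chain = [(False, 0x10), (True, 0x20), (False, 0x30), (True, 0x40)]
vector = daisy_chain_ack(chain)
```

One consequence of this scheme is that position on the chain acts as a fixed priority: the device nearest the CPU always wins a tie.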

Initial state of the ISR • Vectored interrupts • Along with the INT signal, the I/O device transmits an 8-bit vector (INTV). • If the interrupt is accepted, INTV is expanded to a 16-bit address: • The Interrupt Vector Table resides in locations x0100 to x01FF and holds the starting addresses of the various Interrupt Service Routines (similar to the Trap Vector Table and the Trap Service Routines). • INTV is an index into the Interrupt Vector Table, i.e. the starting address of the relevant ISR is read from location ( x0100 + Zext(INTV) ) • The address of the ISR is loaded into the PC • The PSR is set as follows: • PSR[15] <= 0 (Supervisor mode) • PSR[2:0] <= 000 (no condition codes set) • Now we wait while the interrupt is processed
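The vector lookup is a single indexed memory read. In the sketch below memory is a dictionary, and the function name, the vector x80, and the ISR address x1000 are example values chosen for illustration.

```python
IVT_BASE = 0x0100   # the Interrupt Vector Table occupies x0100 to x01FF


def isr_start_address(mem, intv):
    """Zero-extend the 8-bit vector, add the table base, and read the entry."""
    entry_addr = IVT_BASE + (intv & 0xFF)   # x0100 + Zext(INTV)
    return mem[entry_addr]                  # this value is loaded into the PC


mem = {0x0180: 0x1000}           # table entry for vector x80 points at x1000
pc = isr_start_address(mem, 0x80)
```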

Interrupt sequence: >1 device • Device generates an IRQ • CPU switches to SSP if necessary (hardware) • Current PC and PSR are saved to the supervisor stack (hardware) • Switch to supervisor mode (S = 0; hardware) • CPU sends IACK , which is daisy chained to device (hardware) • Device sends its vector number (hardware) • Vector is looked up in the interrupt vector table, and address of the ISR is loaded into the PC (hardware) • ISR saves any registers that it will use (software) • ISR runs, then restores register values (software) • ISR executes RTI instruction, which restores PSR and PC (software) • Note that this restores previous supervisor/user mode

Multiple devices: priority What happens if another interrupt occurs while the system is processing an interrupt? Can devices be “starved” in this system?

Priority • Each task has an assigned priority level • LC-3 has levels PL0 (lowest) to PL7 (highest). • If a higher priority task requests access, then a lower priority task will be suspended. • Likewise, each device has an assigned priority • The highest priority interrupt is passed on to the CPU only if it has higher priority than the currently executing task. • If an INT is present at the start of the instruction cycle, then an extra step is inserted: • The CPU saves its state information so that it can later return to the current task. • The PC is loaded with the starting address of the Interrupt Service Routine • The FETCH phase of the cycle continues as normal.
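The priority test performed at the start of the instruction cycle can be sketched as a comparison against PSR[10:8]. The function names are assumptions for this example.

```python
def current_priority(psr):
    """Extract the priority level PL0..PL7 of the running task from PSR[10:8]."""
    return (psr >> 8) & 0x7


def should_interrupt(psr, device_pl):
    """An interrupt is passed to the CPU only if the device's priority is
    strictly higher than that of the currently executing task."""
    return device_pl > current_priority(psr)
```

For a user program running at PL3 (PSR = x8300), a PL4 device interrupts it, while PL3 and PL2 devices must wait.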

Priority of the current program • PSR layout: bit 15 = Priv, bits 10–8 = Priority, bit 2 = N, bit 1 = Z, bit 0 = P • Remember those extra bits in the PSR?

Device Priority

Returning from the Interrupt • The last instruction of an ISR is RTI (ReTurn from Interrupt) • Return from Interrupt (opcode 1000) • Pops PSR and PC from the Supervisor stack • Restores the condition codes from PSR • If necessary (i.e. if the restored PSR indicates User mode) restores the user stack pointer to R6 from Saved.USP • Essentially this restores the state of our program to exactly the state it had prior to the interrupt • Continues running the program as if nothing had happened! • How does this enable multiprogramming environments?
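RTI can be sketched as the mirror image of interrupt entry. As before, the CPU is a dictionary with made-up key names and memory is a dictionary of words. The pop order (PC first, then PSR) follows the LC-3 ISA, where the PC is the word on top of the Supervisor stack.

```python
def rti(cpu, mem):
    """Model of RTI: restore PC and PSR, then switch stacks if returning to User."""
    cpu["pc"] = mem[cpu["r6"]]           # pop PC
    cpu["r6"] += 1
    cpu["psr"] = mem[cpu["r6"]]          # pop PSR (condition codes, priority, mode)
    cpu["r6"] += 1
    if cpu["psr"] & 0x8000:              # restored mode is User
        cpu["saved_ssp"] = cpu["r6"]     # remember the Supervisor stack pointer
        cpu["r6"] = cpu["saved_usp"]     # R6 <= (Saved.USP)
```

If the restored PSR still has bit 15 clear (a nested interrupt returning to an outer ISR), R6 keeps pointing at the Supervisor stack.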

The entire interrupt sequence • Device generates an IRQ at a specific PL • IF requesting PL > current process priority: • CPU switches to SSP if necessary (hardware) • Current PC and PSR are saved to the supervisor stack (hardware) • Switch to supervisor mode (S = 0; hardware) • Set process priority to requested interrupt PL • CPU sends IACK , which is daisy chained to device (hardware) • Device sends its vector number (hardware) • Vector is looked up in the interrupt vector table, and address of the ISR is loaded into the PC (hardware) • ISR saves any registers that it will use (software) • ISR runs, then restores register values (software) • ISR executes RTI instruction, which restores PSR and PC (software) • Note that this restores previous supervisor/user mode and process priority

Execution flow for a nested interrupt

Supervisor Stack & PC during INT

Interrupts: Not just for I/O • Interrupts are also used for: • Errors (divide by zero, etc.) • TRAPs • Operating system events (quanta for multitasking, etc.) • User generated events (Ctrl-C, Ctrl-Z, Ctrl-Alt-Del, etc.) • …and more.

DMA (Direct Memory Access): a DMA controller is a device specialized in transferring data between memory and an I/O device (e.g. a disk). The CPU writes the source and destination addresses and the size of the region to be copied to the DMA controller, which then does the transfer in the background. The controller accesses memory only when the CPU is not accessing it (cycle stealing). [Figure: CPU, memory, DMA controller, and I/O device]

Stack-based instruction sets • Three-address vs zero-address • The LC-3 explicitly specifies the location of each operand: it is a three-address machine • e.g. ADD R0, R1, R2 • Some machines use a stack data structure for all temporary data storage: these are zero-address machines • the instruction ADD would simply pop the top two values from the stack, add them, and push the result back on the stack • Many calculators use a stack to do arithmetic; most general purpose microprocessors use a register bank • Two-address machines • This has nothing to do with stacks… but the x86 is a two-address machine • The DR is always SR1 • So a two-address ADD R0, R1 (x86 style, written here with LC-3 register names) is equivalent to ADD R0, R0, R1 in LC-3 • Implications?
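A zero-address machine can be sketched as a tiny interpreter in which ADD and MUL name no operands at all. The instruction encoding (tuples such as `("PUSH", 2)`) and the function name `run` are assumptions made for this example.

```python
def run(program):
    """Execute a list of zero-address instructions and return the final stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":
            stack.append(instr[1])       # PUSH is the only op with an operand
        elif op == "ADD":
            b = stack.pop()              # pop the top two values...
            a = stack.pop()
            stack.append(a + b)          # ...and push the result back
        elif op == "MUL":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError("unknown instruction: " + op)
    return stack


# (2 + 3) * 4 written for a stack machine
result = run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)])
```

The same computation on the three-address LC-3 would have to name a destination and two sources in every ADD.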

Practice problems 10.8, 10.10 (this is a good one!), 10.12 (long, but good), 10.13 (also long, but good)
