CS 452/652 Spring 2025 - Lecture 8
Caching, Interrupts
May 29, 2025
BCM - Broadcom BCM2711 data sheet
GIC - ARM Generic Interrupt Controller
REF - Architecture Reference Manual
Note: Per-Task Information Registers
- TPIDR_EL0: per-thread pointer (RW in EL0)
- TPIDR_EL1: per-thread pointer (RW in EL1)
- can be used to store pointer to task context
- TPIDRRO_EL0: per-thread pointer (RO in EL0)
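A minimal sketch of how a kernel might keep the current task-context pointer in TPIDR_EL1. The macro KERNEL_EL1_BUILD and the function names are hypothetical; the fallback branch only exists so the code also compiles and runs on a hosted machine, where EL1 registers are not accessible.

```c
#include <stdint.h>

/* KERNEL_EL1_BUILD is a hypothetical macro: define it only when
 * building the bare-metal kernel that actually runs at EL1. */
#if defined(__aarch64__) && defined(KERNEL_EL1_BUILD)

/* Store the pointer to the running task's context in TPIDR_EL1. */
static inline void set_current_task(void *ctx) {
    __asm__ volatile("msr tpidr_el1, %0" : : "r"(ctx));
}

/* Read the pointer back, e.g. on entry to the exception handler. */
static inline void *get_current_task(void) {
    void *ctx;
    __asm__ volatile("mrs %0, tpidr_el1" : "=r"(ctx));
    return ctx;
}

#else

/* Hosted fallback: a plain variable stands in for the register. */
static void *fake_tpidr_el1;

static inline void set_current_task(void *ctx) { fake_tpidr_el1 = ctx; }
static inline void *get_current_task(void) { return fake_tpidr_el1; }

#endif
```

The kernel would call set_current_task() just before returning to a task and get_current_task() when it is re-entered, so the context save code knows where to store registers.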
CPU Caching
- memory types & caches (REF, Section B2.7, etc.)
- L1 instruction cache; L1 data cache; L2 unified/shared cache
- branch target cache (branch predictors)
- TLB (page translation cache)
- write buffer: mediate fast CPU vs. slow RAM; coalesce writes
- memory modes: Device vs Normal
- caching modes: Non-Cacheable, Write-Through, Write-Back
Cache Setup & Maintenance
- Cache Setup
- CTR_EL0 for cache information (REF, Section D7.5)
- CCSIDR_EL1/CSSELR_EL1 for cache size (REF, Section D7.5)
- per core: L1d 32KB, L1i 48KB (4 cores: 128KB and 192KB in total); L2 1MB (unified, shared)
- SCTLR_EL1
- I and C flags for instruction and data cache respectively
- control caching of Normal memory accesses at EL0 and EL1
- memory barrier to enforce changes
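The setup items above can be sketched in plain C as follows. This assumes the 32-bit CCSIDR_EL1 layout without FEAT_CCIDX (LineSize in bits [2:0], Associativity in [12:3], NumSets in [27:13]); the function names are made up for illustration. Reading and writing the actual system registers (mrs/msr, followed by an isb) is left out so that the sketch stays portable.

```c
#include <stdint.h>

/* SCTLR_EL1 control bits */
#define SCTLR_M (1ull << 0)   /* MMU enable */
#define SCTLR_C (1ull << 2)   /* data/unified cache enable */
#define SCTLR_I (1ull << 12)  /* instruction cache enable */

/* Decode a CCSIDR_EL1 value (for the cache selected through
 * CSSELR_EL1) into the total cache size in bytes. */
uint32_t ccsidr_cache_bytes(uint32_t ccsidr) {
    uint32_t line_bytes = 1u << ((ccsidr & 0x7u) + 4);   /* log2(bytes) - 4 */
    uint32_t ways = ((ccsidr >> 3) & 0x3FFu) + 1;        /* stored as ways - 1 */
    uint32_t sets = ((ccsidr >> 13) & 0x7FFFu) + 1;      /* stored as sets - 1 */
    return line_bytes * ways * sets;
}

/* Compute the new SCTLR_EL1 value with both caches enabled.
 * The caller writes it back and issues an isb afterwards. */
uint64_t sctlr_enable_caches(uint64_t sctlr) {
    return sctlr | SCTLR_C | SCTLR_I;
}
```

For example, a cache with 64-byte lines, 2 ways and 256 sets decodes to 32KB, and one with 3 ways and 256 sets to 48KB.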
- Cache Maintenance
- clean: make changes "permanent"
- invalidate: avoid stale reads
- status changes
- enabled → disable: clean cache?
- enabled → enable: nothing
- disabled → disable: nothing
- disabled → enable: depends on previous state
- enabled → disable → enable
information in cache might be stale
invalidate cache before disable or before re-enable?
- experiment!
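Clean and invalidate operations by address work on whole cache lines, so a maintenance routine has to walk a buffer line by line. The sketch below shows only that walk; the callback type and function names are hypothetical, and the line size must be a power of two. On the real hardware the callback would issue an instruction such as dc cvac (clean), dc ivac (invalidate) or dc civac (both), followed by a dsb once the loop is done.

```c
#include <stddef.h>
#include <stdint.h>

/* Operation applied to one cache line, identified by its address. */
typedef void (*line_op)(uintptr_t line_addr);

/* Smallest data cache line size in bytes, from CTR_EL0.DminLine
 * (bits [19:16], log2 of the number of 4-byte words). */
unsigned ctr_dminline_bytes(uint32_t ctr) {
    return 4u << ((ctr >> 16) & 0xFu);
}

/* Apply op to every cache line overlapping [start, start + len).
 * Returns the number of lines visited; op may be NULL to only count. */
size_t for_each_cache_line(uintptr_t start, size_t len, size_t line, line_op op) {
    if (len == 0) {
        return 0;
    }
    uintptr_t addr = start & ~(uintptr_t)(line - 1);  /* round down to line start */
    uintptr_t end = start + len;
    size_t visited = 0;
    for (; addr < end; addr += line) {
        if (op) {
            op(addr);
        }
        visited++;
    }
    return visited;
}
```

Note that an unaligned buffer touches one more line than its length suggests, which matters when invalidating: neighbouring data sharing that line is discarded as well.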
Interrupts
- Motivation
- avoid busy waiting for devices
- instead, device signals to get attention
- level vs. edge semantics
- Interrupts start with signals asserted by devices
- each device defines the signals it can generate
- example: system timer
- four signals, corresponding to 4 System Timer Compare registers (C0-C3)
- when CLO matches value in Cx, signal is asserted
- example of what ARM refers to as a Shared Peripheral Interrupt (SPI)
- generated by a device, routed to some processor(s)
- also Private Peripheral Interrupts (PPI) and Software Generated Interrupts (SGIs)
- PPI specific to a single core (not used here)
- SGI generated by software - can be used to signal other cores
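A small sketch of the two ideas above: classifying a GIC interrupt ID as SGI, PPI or SPI (using the GICv2 ID ranges), and computing the next value for a System Timer Compare register from CLO. Function names are illustrative. Since the hardware asserts the signal only on an exact match with the 32-bit CLO, the last helper is a software-side check, under the assumption that deadlines are always less than half the counter range away.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { INT_SGI, INT_PPI, INT_SPI, INT_SPECIAL } int_kind;

/* GICv2 interrupt ID ranges: 0-15 SGI, 16-31 PPI, 32-1019 SPI. */
int_kind gic_classify(uint32_t intid) {
    if (intid < 16) return INT_SGI;
    if (intid < 32) return INT_PPI;
    if (intid < 1020) return INT_SPI;
    return INT_SPECIAL;  /* 1020-1023 are reserved, e.g. spurious */
}

/* Value to write into Cx so the signal fires interval_ticks from now.
 * Unsigned arithmetic wraps around just like the 32-bit counter. */
uint32_t timer_next_compare(uint32_t clo, uint32_t interval_ticks) {
    return clo + interval_ticks;
}

/* True if the counter has reached or passed cmp, tolerating wrap-around. */
bool timer_deadline_passed(uint32_t clo, uint32_t cmp) {
    return (uint32_t)(clo - cmp) < 0x80000000u;
}
```

Computing the next compare value from the previous compare value rather than from the current CLO avoids accumulating drift across periodic ticks.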
- Interrupts result in an asynchronous exception at a processor
- mapped to one of two signals (FIQ and IRQ) at each processor core
- Reminder: exception vector
- groups identify execution state when exception occurred:
- Current EL with SP0/SPx, Lower EL using AArch64/AArch32
- within each group, entries for different types of exceptions
- synchronous, IRQ, FIQ, SError
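The layout of the vector table can be written down as a small calculation: four groups of 0x200 bytes, each holding four entries of 0x80 bytes, relative to the base address in VBAR_EL1. The enum and function names below are only for illustration.

```c
#include <stdint.h>

/* Groups: where the exception was taken from. */
typedef enum {
    FROM_CURRENT_SP0,  /* current EL, using SP_EL0 */
    FROM_CURRENT_SPX,  /* current EL, using SP_ELx */
    FROM_LOWER_A64,    /* lower EL, AArch64 */
    FROM_LOWER_A32     /* lower EL, AArch32 */
} vec_group;

/* Entries within each group. */
typedef enum { EXC_SYNC, EXC_IRQ, EXC_FIQ, EXC_SERROR } vec_type;

/* Byte offset of a vector entry from the table base. */
uint32_t vector_offset(vec_group group, vec_type type) {
    return (uint32_t)group * 0x200u + (uint32_t)type * 0x80u;
}
```

An IRQ arriving while a 64-bit user task runs at EL0 therefore enters the kernel at offset 0x480.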
- If IRQ signal is asserted
- CPU transfers control to exception handler after execution of current instruction
- pipeline flush
- Handler/Kernel
- saves current application context
- figures out the reason for the interrupt
- handle the interrupt
- includes device control and interrupt acknowledgement (more later)
- choose next task to run
- restore chosen task context, return from exception (eret)
- Masking
- Interrupts can be masked at processor
- IRQ/FIQ still get asserted, but processor ignores them
- DAIF bits in PSTATE
- I is mask for IRQ, F is mask for FIQ
- When exception occurs, interrupts get masked automatically
- kernel unmasks by ensuring I and F bits are cleared in SPSR_EL1 prior to context switch
- possible to unmask within the kernel, but we don't do this!
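The mask bits sit at fixed positions in PSTATE and in the saved copy in SPSR_EL1 (D = bit 9, A = bit 8, I = bit 7, F = bit 6). A minimal sketch of the unmasking step, with hypothetical helper names, operating on a saved SPSR value before it is written back and eret is executed:

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask bit positions shared by PSTATE.DAIF and SPSR_EL1. */
#define PSR_F (1u << 6)  /* FIQ mask */
#define PSR_I (1u << 7)  /* IRQ mask */
#define PSR_A (1u << 8)  /* SError mask */
#define PSR_D (1u << 9)  /* debug exception mask */

/* Clear I and F so the task runs with interrupts enabled after eret. */
uint64_t spsr_unmask_interrupts(uint64_t spsr) {
    return spsr & ~(uint64_t)(PSR_I | PSR_F);
}

/* True if IRQs would be ignored with this PSTATE/SPSR value. */
bool irq_masked(uint64_t psr) {
    return (psr & PSR_I) != 0;
}
```

Because the kernel itself keeps running with I and F set, the change only takes effect once eret copies SPSR_EL1 into PSTATE.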
- Interrupt handling also presents overheads
- direct: pipeline flush, handler execution
- indirect: cache disturbance
- high rate of interrupts? use a hybrid approach (Interrupt Mitigation):
- poll device and deliver event
- only enable the interrupt if a poll (or several polls) is unsuccessful
- disable interrupt after it is triggered
- real-world example: Linux NAPI
- similar strategy useful for CS 452 microkernel
- excursion: Linux IRQ Suspend Mechanism
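The hybrid approach can be modelled with a toy device, as sketched below. Everything here (the struct, the function names, the poll budget) is hypothetical and only illustrates the control flow: poll first, enable the interrupt only when polling finds nothing, and disable it again once it fires. In this single-threaded model the device state cannot change between polls; on real hardware each poll would re-read a device status register.

```c
#include <stdbool.h>

/* Toy device model: a count of pending events plus an interrupt enable flag. */
typedef struct {
    int pending;
    bool irq_enabled;
} device;

typedef enum { GOT_EVENT, MUST_BLOCK } wait_result;

/* Called when a task asks for the next event from the device. */
wait_result await_event(device *dev, int max_polls) {
    for (int i = 0; i < max_polls; i++) {
        if (dev->pending > 0) {
            dev->pending--;         /* deliver the event without an interrupt */
            return GOT_EVENT;
        }
    }
    dev->irq_enabled = true;        /* polling failed: fall back to the interrupt */
    return MUST_BLOCK;
}

/* Called from the interrupt handler when the device signals. */
void on_interrupt(device *dev) {
    dev->irq_enabled = false;       /* disable until polling fails again */
    if (dev->pending > 0) {
        dev->pending--;             /* deliver the event to the blocked task */
    }
}
```

Under high load most calls return GOT_EVENT and no interrupt is taken at all; under low load the behaviour degrades gracefully to ordinary interrupt-driven I/O.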