CS 452/652 Spring 2025 - Lecture 8

Caching, Interrupts

May 29, 2025

BCM - Broadcom BCM2711 data sheet
GIC - ARM Generic Interrupt Controller
REF - Architecture Reference Manual

Note: Per-Task Information Registers

TPIDR_EL0: per-thread pointer (RW in EL0)
TPIDR_EL1: per-thread pointer (RW in EL1)
- can be used to store pointer to task context
TPIDRRO_EL0: per-thread pointer (RO in EL0)
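
A minimal sketch of using TPIDR_EL1 to hold the pointer to the current task context. The `struct task` layout, the function names, and the build flag `KERNEL_EL1` are assumptions made for illustration. Without the flag, a plain variable stands in for the register so the logic can be tried on a host machine.

```c
#include <stdint.h>

/* Hypothetical task descriptor; field names are illustrative only. */
struct task {
    int      tid;
    uint64_t regs[31];   /* saved x0..x30 */
    uint64_t sp_el0;
    uint64_t elr_el1;
    uint64_t spsr_el1;
};

#if defined(__aarch64__) && defined(KERNEL_EL1)
/* Real register access; only valid when executing at EL1 or higher. */
static inline void set_current_task(struct task *t) {
    __asm__ volatile("msr tpidr_el1, %0" : : "r"(t) : "memory");
}
static inline struct task *get_current_task(void) {
    struct task *t;
    __asm__ volatile("mrs %0, tpidr_el1" : "=r"(t));
    return t;
}
#else
/* Host stand-in for the register (assumption, for experimentation only). */
static struct task *fake_tpidr_el1;
static inline void set_current_task(struct task *t) { fake_tpidr_el1 = t; }
static inline struct task *get_current_task(void) { return fake_tpidr_el1; }
#endif
```

With this in place, an exception handler could find the save area for the interrupted task via `get_current_task()` without consulting any global scheduler state.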

CPU Caching

memory types & caches (REF, Section B2.7, etc.)
- L1 instruction cache; L1 data cache; L2 unified/shared cache
- branch target cache (branch predictors)
- TLB (page translation cache)
- write buffer: mediate fast CPU vs. slow RAM; coalesce writes
  - memory modes: Device vs Normal
  - caching modes: Non-Cacheable, Write-Through, Write-Back

Cache Setup & Maintenance

Cache Information
- CTR_EL0 for cache information (REF, Section D7.5)
- CCSIDR_EL1/CSSELR_EL1 for cache size (REF, Section D7.5)
- L1d 128KB, L1i 192KB in total (32KB and 48KB per core across the four Cortex-A72 cores), L2 1MB (unified, shared)
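
A sketch of how a raw CCSIDR_EL1 value could be turned into a cache size, assuming the 32-bit field layout (LineSize in bits [2:0], Associativity in [12:3], NumSets in [27:13]; no FEAT_CCIDX). The struct and function names are hypothetical. On hardware, the value would be read after selecting the cache through CSSELR_EL1 and issuing an isb. Here the input is just an integer, so the decoding can be checked anywhere.

```c
#include <stdint.h>

struct cache_geometry {
    uint32_t line_bytes;
    uint32_t ways;
    uint32_t sets;
};

/* CSSELR_EL1 value: Level field (bits [3:1]) holds level-1, InD (bit 0)
 * selects the instruction cache when set. */
static inline uint32_t csselr_value(unsigned level, int instruction) {
    return ((level - 1u) << 1) | (instruction ? 1u : 0u);
}

/* Decode the geometry fields of a 32-bit-format CCSIDR_EL1 value. */
static inline struct cache_geometry ccsidr_decode(uint32_t ccsidr) {
    struct cache_geometry g;
    g.line_bytes = 1u << ((ccsidr & 0x7u) + 4u);     /* log2(bytes) - 4 */
    g.ways       = ((ccsidr >> 3) & 0x3FFu) + 1u;    /* ways - 1 */
    g.sets       = ((ccsidr >> 13) & 0x7FFFu) + 1u;  /* sets - 1 */
    return g;
}

static inline uint32_t cache_bytes(struct cache_geometry g) {
    return g.line_bytes * g.ways * g.sets;
}
```

For example, a hand-built value describing 64-byte lines, 2 ways and 256 sets (0x001FE00A) decodes to 32768 bytes, which matches the per-core L1d size.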
SCTLR_EL1
- I and C flags for instruction and data cache respectively
- controls EL0 and EL1 accesses to Normal memory
  - memory barrier to enforce changes
- Cache Maintenance
  - clean: make changes "permanent"
  - invalidate: avoid stale reads
- status changes
  - enabled → disable: clean cache?
  - enabled → enable: nothing
  - disabled → disable: nothing
  - disabled → enable: depends on previous state
    - enabled → disable → enable
      information in cache might be stale
      invalidate cache before disable or before re-enable?
experiment!
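
As a starting point for such experiments, a sketch of toggling the I and C flags. The bit positions (C is bit 2, I is bit 12 of SCTLR_EL1) come from the architecture. The helper names and the `KERNEL_EL1` build flag are assumptions. The sketch deliberately performs no clean or invalidate, since when to do so is exactly the open question above.

```c
#include <stdint.h>

#define SCTLR_EL1_C (UINT64_C(1) << 2)   /* data/unified cache enable */
#define SCTLR_EL1_I (UINT64_C(1) << 12)  /* instruction cache enable */

/* Compute a new SCTLR_EL1 value with the cache flags set as requested;
 * all other bits are preserved. */
static inline uint64_t sctlr_set_caches(uint64_t sctlr, int dcache_on, int icache_on) {
    sctlr &= ~(SCTLR_EL1_C | SCTLR_EL1_I);
    if (dcache_on) sctlr |= SCTLR_EL1_C;
    if (icache_on) sctlr |= SCTLR_EL1_I;
    return sctlr;
}

#if defined(__aarch64__) && defined(KERNEL_EL1)
/* Read-modify-write of the real register, followed by a barrier so that
 * subsequent instructions observe the new setting. EL1 only. */
static inline void caches_apply(int dcache_on, int icache_on) {
    uint64_t v;
    __asm__ volatile("mrs %0, sctlr_el1" : "=r"(v));
    v = sctlr_set_caches(v, dcache_on, icache_on);
    __asm__ volatile("msr sctlr_el1, %0\n\tisb" : : "r"(v) : "memory");
}
#endif
```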

Interrupts

Motivation
- avoid busy waiting for devices
- instead, device signals to get attention
- level vs. edge semantics
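
The difference between level-triggered and edge-triggered semantics can be illustrated with a small software model. This is a sketch with hypothetical names, not a description of any particular controller. A level-triggered interrupt is pending for as long as the line is high, while an edge-triggered interrupt is latched once per low-to-high transition and stays pending until acknowledged.

```c
#include <stdbool.h>

/* Level-triggered: pending exactly while the device holds the line high. */
static inline bool level_pending(bool line) {
    return line;
}

/* Edge-triggered: remember the previous sample and latch rising edges. */
struct edge_latch {
    bool prev;
    bool pending;
};

static inline void edge_sample(struct edge_latch *e, bool line) {
    if (line && !e->prev)
        e->pending = true;   /* rising edge observed */
    e->prev = line;
}

static inline void edge_ack(struct edge_latch *e) {
    e->pending = false;      /* stays clear until the next rising edge */
}
```

In this model, acknowledging an edge-triggered interrupt while the line is still high does not raise it again, whereas a level-triggered interrupt remains pending until the device itself deasserts the line.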
Interrupts start with signals asserted by devices
- each device defines the signals it can generate
- example: system timer
  - four signals, corresponding to 4 System Timer Compare registers (C0-C3)
  - when CLO matches value in Cx, signal is asserted
- example of what ARM refers to as a Shared Peripheral Interrupt (SPI)
  - generated by a device, routed to some processor(s)
  - also Private Peripheral Interrupts (PPI) and Software Generated Interrupts (SGIs)
    - PPI specific to a single core (not used here)
    - SGI generated by software - can be used to signal other cores
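
A sketch of the system timer example, assuming the register order CS, CLO, CHI, C0-C3 at consecutive 32-bit offsets as in the BCM data sheet. The struct and helper names are hypothetical, and mapping the struct onto the peripheral's address is left out. The compare registers hold 32-bit values, so the next deadline is computed with wrapping unsigned arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

struct systimer_regs {
    volatile uint32_t cs;    /* 0x00: status, bit x set when Cx matched */
    volatile uint32_t clo;   /* 0x04: counter, lower 32 bits */
    volatile uint32_t chi;   /* 0x08: counter, upper 32 bits */
    volatile uint32_t c[4];  /* 0x0C..0x18: compare registers C0..C3 */
};

/* Next compare value; unsigned overflow wraps, matching the 32-bit counter. */
static inline uint32_t next_compare(uint32_t now, uint32_t interval) {
    return now + interval;
}

/* Program channel ch to match 'interval' ticks from now. */
static inline void timer_arm(struct systimer_regs *t, unsigned ch, uint32_t interval) {
    t->c[ch] = next_compare(t->clo, interval);
}

/* Has channel ch matched (i.e. is its signal asserted)? */
static inline int timer_matched(const struct systimer_regs *t, unsigned ch) {
    return (int)((t->cs >> ch) & 1u);
}

/* On hardware, writing 1 to a CS bit clears that match flag. */
static inline void timer_ack(struct systimer_regs *t, unsigned ch) {
    t->cs = 1u << ch;
}
```

Re-arming from the previous compare value instead of from CLO would avoid drift for periodic ticks; which variant fits is a design choice.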
Interrupts result in an asynchronous exception at a processor
- mapped to one of two signals (FIQ and IRQ) at each processor core
- Reminder: exception vector
  - groups identify execution state when exception occurred:
    - Current EL with SP0/SPx, Lower EL using 64/32 bit mode
  - within each group, entries for different types of exceptions
    - synchronous, IRQ, FIQ, SError
- If IRQ signal is asserted
  - CPU transfers control to exception handler after execution of current instruction
  - pipeline flush
- Handler/Kernel
  - saves current application context
  - figures out the reason for the interrupt
  - handle the interrupt
    - includes device control and interrupt acknowledgement (more later)
  - choose next task to run
  - restore chosen task context, return from exception (eret)
- Masking
  - Interrupts can be masked at processor
    - IRQ/FIQ still get asserted, but processor ignores them
    - DAIF bits in PSTATE
      - I is mask for IRQ, F is mask for FIQ
  - When exception occurs, interrupts get masked automatically
    - kernel unmasks by ensuring I and F bits are cleared in SPSR_EL1 prior to context switch
    - possible to unmask within the kernel, but we don't do this!
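
Two details from the list above can be sketched in code: where an exception lands in the vector table, and how the kernel unmasks interrupts for a task through SPSR_EL1. The table layout (four groups of four entries, 0x80 bytes per entry) and the D/A/I/F bit positions (9 down to 6) are architectural. The enum and function names are hypothetical.

```c
#include <stdint.h>

enum vec_group { CUR_EL_SP0 = 0, CUR_EL_SPX = 1, LOWER_EL_A64 = 2, LOWER_EL_A32 = 3 };
enum vec_type  { VEC_SYNC = 0, VEC_IRQ = 1, VEC_FIQ = 2, VEC_SERROR = 3 };

/* Byte offset of a vector entry relative to VBAR_EL1. */
static inline uint32_t vector_offset(enum vec_group g, enum vec_type t) {
    return (uint32_t)g * 0x200u + (uint32_t)t * 0x80u;
}

#define PSTATE_F (UINT64_C(1) << 6)  /* FIQ mask */
#define PSTATE_I (UINT64_C(1) << 7)  /* IRQ mask */
#define PSTATE_A (UINT64_C(1) << 8)  /* SError mask */
#define PSTATE_D (UINT64_C(1) << 9)  /* debug exception mask */

/* Value to store in SPSR_EL1 so that IRQ and FIQ are unmasked once
 * eret restores PSTATE for the chosen task. */
static inline uint64_t spsr_unmask_irq_fiq(uint64_t spsr) {
    return spsr & ~(PSTATE_I | PSTATE_F);
}

static inline int irq_masked(uint64_t pstate) {
    return (pstate & PSTATE_I) != 0;
}
```

An IRQ arriving while a 64-bit task runs at EL0 therefore enters at offset 0x480, and the kernel itself keeps running with I and F set until the eret.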
Interrupt handling also presents overheads
- direct: pipeline flush, handler execution
- indirect: cache disturbance
- high rate of interrupts? use hybrid approach (Interrupt Mitigation):
  - poll device and deliver event
  - only enable interrupt if poll (or several polls) unsuccessful
  - disable interrupt after it is triggered
  - real-world example: Linux NAPI
- similar strategy useful for CS 452 microkernel
- excursion: Linux IRQ Suspend Mechanism
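
The hybrid approach can be sketched with a toy device model. The `struct device` and all function names are hypothetical, and the model ignores timing. Events are taken by polling while they are available; the interrupt is enabled only after an unsuccessful poll, and disabled again as soon as it fires.

```c
#include <stdbool.h>

/* Toy device: a count of undelivered events and an interrupt-enable flag. */
struct device {
    int  pending;
    bool irq_enabled;
};

/* Poll once: consume one event if there is one. */
static inline bool dev_poll(struct device *d) {
    if (d->pending > 0) {
        d->pending--;
        return true;
    }
    return false;
}

/* Consumer side: returns true if an event was delivered by polling.
 * On false, the interrupt is now enabled and the caller should block.
 * (With a level-triggered source, an event arriving between the failed
 * poll and the enable still raises the interrupt.) */
static inline bool next_event(struct device *d) {
    if (dev_poll(d))
        return true;
    d->irq_enabled = true;
    return false;
}

/* Interrupt side: disable the interrupt first, then deliver by polling. */
static inline bool on_interrupt(struct device *d) {
    d->irq_enabled = false;
    return dev_poll(d);
}
```

Under a high event rate, `next_event` keeps succeeding and the interrupt path is never taken; under a low rate, the cost is one failed poll per interrupt.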