CS 452/652 Winter 2026 - Lecture 20
Latency and Priorities
Mar 19, 2026
Latency Analysis
- compute latency (vs. Märklin latency)
- think of program as graph
- computation & termination vs. control/service loop
- control/service loop: throughput vs. latency
- timing of edges: busy vs. block
- throughput: compare average busy cost of paths to offered load / arrival rate
- latency: take into account busy and blocked cost
- cf. real-time scheduling work?
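The busy/blocked edge model can be sketched as follows; the `edge` struct, function names, and microsecond units are illustrative, not from the lecture. Throughput compares only the busy cost against the arrival rate; latency must also charge blocked time:

```c
#include <assert.h>

/* Hypothetical model: each edge of the program graph carries a busy
 * (CPU) cost and a blocked (waiting) cost, in microseconds. */
struct edge { unsigned busy_us; unsigned blocked_us; };

/* Throughput analysis: only the busy cost of the path competes with
 * the offered load / arrival rate. */
unsigned path_busy_us(const struct edge *path, int n) {
    unsigned sum = 0;
    for (int i = 0; i < n; i++) sum += path[i].busy_us;
    return sum;
}

/* Latency analysis: busy and blocked time both delay the response. */
unsigned path_latency_us(const struct edge *path, int n) {
    unsigned sum = 0;
    for (int i = 0; i < n; i++) sum += path[i].busy_us + path[i].blocked_us;
    return sum;
}
```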
Average Latency?
- no balancing between lower and higher latencies
- service loop: outliers change mean - potentially skews utility
- late response is useless, regardless of how late
- control loop: mean masks outliers - potentially hides catastrophic problem
- how often a train enters a critical track section matters more than how much it overshoots
- analysis must work with latency distribution
- service loop: high order percentile (tail latency)
- control loop: worst-case latency
- exceptions where average latency is relevant
- sizing of streaming playout buffer
Worst-Case Latency
- worst-case execution time (WCET)
- identify (safety-)critical paths or worst-case execution path (WCEP)
- high order percentile vs. actual worst-case?
- uncertainty
- busy: memory pipelining/caching/sharing effects?
- cache: no miss, one miss, all misses → reality?
- block: contributes to latency, but little throughput impact
- internal loops - bounds?
- queueing
- standing queue - contributes to latency, but does not affect throughput
- queueing theory: latency increases as arrivals approach capacity
- cannot "save" capacity during low-arrival periods for later high-arrival periods
- unbounded arrivals, average over infinity
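The "latency increases as arrivals approach capacity" point can be illustrated with the simplest queueing model, M/M/1, where the mean time in system is W = 1/(mu - lambda) for lambda < mu; this model is an illustration, not something the course requires:

```c
/* M/M/1 mean time in system: W = 1 / (mu - lambda), valid for
 * lambda < mu.  As the arrival rate lambda approaches the service
 * rate mu, mean latency grows without bound, even though throughput
 * (= lambda) still keeps up, and the standing queue grows with it. */
double mm1_mean_latency(double lambda, double mu) {
    return 1.0 / (mu - lambda);
}
```

For example, at mu = 100 requests/s, raising lambda from 50 to 99 grows the mean latency from 0.02 s to 1 s, a 50x increase for a 2x load increase.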
Automatic Analysis
- static analysis: NP-hard
- experimental measurement: no proof, no worst-case guarantee!
- hybrid: measure single feasible paths (SFP) & analyze combination
- automatic analysis: loop counts, recursion depth? needs bounds/annotations
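A loop-bound annotation for such an analyzer might look like the sketch below; `LOOP_BOUND` is a hypothetical no-op macro that only informs the tool, since the compiler cannot derive the bound from the code alone:

```c
/* Hypothetical annotation macro: tells a WCET analyzer the maximum
 * trip count of the loop it precedes; a no-op at run time. */
#define LOOP_BOUND(n)

/* Without the annotation, a static analyzer cannot bound this loop:
 * len is an input.  The protocol (an assumption here) caps it at 64. */
int checksum(const unsigned char *buf, int len) {
    int sum = 0;
    LOOP_BOUND(64)
    for (int i = 0; i < len; i++) sum += buf[i];
    return sum;
}
```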
Train Control: Task Dependencies
- critical path: sensor event → compute → action → effect
- small tasks: shorter program graph edges
- preemptive scheduling: shortcut to loop start in multi-tasking graph
- preemption: keep uninterruptible kernel execution short!
- priorities: resource management via scheduling
Example: Command Execution Latency
- assume CommandWriter is highest priority server
- latency includes
- submit request (includes a context switch)
- request queueing?
- request execution
- variability:
- assume CommandWriter runs immediately, because it is highest priority
- otherwise, we need to consider additional delay due to higher priority task(s)
- interrupts might occur while CommandWriter is running?
- can try to estimate/bound number
- this is why we want interrupt handling to be fast and predictable!
- might need to measure primitive steps
- have already measured SRR, including context switch
- measure/compute CANbus writing/sending time
- measure interrupt overhead?
- in the kernel, using always-asserted interrupt, count and measure N occurrences
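The pieces above can be added into a simple worst-case budget; the function name and all parameters are illustrative, and the interrupt term assumes we can bound how many interrupts fire while CommandWriter runs:

```c
/* Illustrative latency bound, in microseconds:
 *   submit (SRR incl. context switches) + time queued at the server
 *   + CANbus write/send time
 *   + a bound on interrupts that may preempt CommandWriter,
 *     each costing roughly the measured per-interrupt overhead. */
unsigned command_latency_bound_us(unsigned srr_us, unsigned queue_us,
                                  unsigned can_write_us,
                                  unsigned max_irqs, unsigned irq_us) {
    return srr_us + queue_us + can_write_us + max_irqs * irq_us;
}
```

With measured SRR and CAN-write times and the interrupt overhead from the always-asserted-interrupt experiment, this gives a concrete number to compare against the deadline.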
Task Priorities
- critical path: sensor → mcp2515 → supervisor → engineer → track → mcp2515 → command
- plus requisite notifiers, couriers, etc.
- critical path? everything is important! what can be bumped down?
- computation such as routing and path planning?
- terminal processing?
- at low utilization: priorities not as critical
- priority determines worst-case latency
- low-level tasks: low priority could lose events or input
- use higher priority and/or buffer
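The "buffer" option can be sketched as a minimal byte ring buffer; sizes and names are illustrative. A high-priority producer (e.g. a notifier) banks input so a lower-priority consumer does not lose bytes, as long as it drains the buffer before it fills:

```c
#define RB_SIZE 64   /* power of two so index masking works */

struct ring { unsigned char buf[RB_SIZE]; unsigned head, tail; };

/* Producer side: returns 0 if the buffer is full (the byte would be
 * lost), which is exactly the event this buffer exists to prevent. */
int rb_put(struct ring *r, unsigned char c) {
    if (r->head - r->tail == RB_SIZE) return 0;
    r->buf[r->head++ & (RB_SIZE - 1)] = c;
    return 1;
}

/* Consumer side: returns 0 if the buffer is empty. */
int rb_get(struct ring *r, unsigned char *c) {
    if (r->head == r->tail) return 0;
    *c = r->buf[r->tail++ & (RB_SIZE - 1)];
    return 1;
}
```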
How to Set Priorities?
- prioritize application-meaningful operations
- sensor-command activation cycle
- route planning, route setup
- user interface
- mapping operations to tasks
- problem: same task doing high- and low-priority operations
- route planning vs re-routing
- clock services for low- and high-priority tasks
- train server handling high (stop) and low (go) priority commands
- static solution: split tasks, e.g., high- and low-priority train server
- not always practical: ensure atomicity - one command at a time to Märklin
- servers often have to handle requests from processes with different priorities
- prioritize messages or operations?
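One way to prioritize operations inside a single server is two internal FIFO queues, drained high-priority first, so a stop command never waits behind queued go commands while one server still issues one command at a time. This is a sketch; types are illustrative and the queues omit overflow checks for brevity:

```c
#define QCAP 8

/* Two FIFO queues inside one server: commands tagged high priority
 * (e.g. stop) are always dispatched before low-priority ones (e.g. go).
 * Keeping a single dispatcher preserves one-command-at-a-time atomicity
 * toward the Märklin box.  No overflow check, for brevity. */
struct cmdq { int cmds[QCAP]; int head, tail; };

void q_push(struct cmdq *q, int cmd) { q->cmds[q->tail++ % QCAP] = cmd; }
int  q_empty(const struct cmdq *q)   { return q->head == q->tail; }
int  q_pop(struct cmdq *q)           { return q->cmds[q->head++ % QCAP]; }

/* Next command to issue: high queue first, then low; -1 if idle. */
int next_command(struct cmdq *high, struct cmdq *low) {
    if (!q_empty(high)) return q_pop(high);
    if (!q_empty(low))  return q_pop(low);
    return -1;
}
```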
Priority Inversion
- a lower-priority task indirectly keeps a higher-priority task from running
- latency problem: performance and/or correctness
- unintended starvation, if system fully loaded
- examples: assume lower numbers are higher priorities
- Scenario A
- Server T3 running on behalf of T1, then T2 preempts
→ T1 waits for T2
- Scenario B
- Server T3 running on behalf of T5, then T4 wants to run
→ T4 waits for T5
- Scenario C
- Server T3 idle or preempted, T2 running, then T1 sends to T3
→ T1 waits for T2
- dynamic prioritization
- promote server priority to priority of active request (Scenario A)
- demote server priority to priority of active request (Scenario B)
- promote server priority to highest priority queued request (Scenario C)
- kernel can support dynamic prioritization
- carry sender's priority in internal message
- also: priority queue of waiting senders?
- further reading: Priority Inheritance Protocols: An Approach to Real-Time Synchronization
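The three dynamic-prioritization rules can be collapsed into one effective-priority computation the kernel could apply; this is a sketch under the notes' convention that lower numbers mean higher priority, and the function name and arguments are illustrative:

```c
/* Effective priority of a server under dynamic prioritization
 * (lower number = higher priority):
 *   - serving a request: run at the client's priority
 *     (Scenario A promotes, Scenario B demotes)
 *   - a more urgent queued sender promotes further (Scenario C)
 *   - idle with no senders: run at the server's base priority.
 * active < 0 means no request is being served. */
int effective_priority(int base, int active, const int *queued, int n) {
    int p = (active >= 0) ? active : base;
    for (int i = 0; i < n; i++)
        if (queued[i] < p) p = queued[i];
    return p;
}
```

Carrying the sender's priority in the internal message is what makes `active` and `queued` available to the kernel at Receive time.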
TC2 Demo
- recall TC1 advice...
- aspire to "gold standard" described in TC2