CS 452/652 Winter 2023 - Lecture 18
Mar 21, 2023
TC1 Demo Review
- importance of robustness increases between TC1 and final project
- use headlights to (manually) detect train's positioning during initialization
- use sensors to detect train's location and travel direction
- double-check kernel mechanisms: message passing & context switch
- double-check UART servers and interrupt handling (including CTS)
- double-check priorities
- double-check main bottleneck: serialization at half-duplex COM1
- be able to quit your program and restart it without reload or reboot
- good for demos, but also helps development
- initialize BSS
- reset track & trains at start and stop
- use emergency stop (speed level 15) at end or failure
- always/periodically check for exit key
- computation / loop with easily verifiable results
- high register coverage
- use subroutine, recursion (to a limit)
- trigger frequent timer interrupt
- randomized timeout
- interrupt handler (kernel): perform similar computation
- clobber as much state as possible
- test with and without optimization
- repeat speed measurements as kernel testing method
- should see similar measurements
- test reliability of train commands
- speed variation and timing
- change direction and verify using sensors
- add lights on/off to increase command intensity
- travel at constant speed level
- can recalibrate velocity estimate
- optional: slow down before stopping
- travel at specific velocity
- timed adjustment of speed level, verify via sensors
- travel in cohort: multiple trains at specific distance
- adjust one or both trains, verify via sensors
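The velocity items above can be sketched as follows. This is a minimal illustration, not the course's method: it assumes the velocity estimate is an exponentially weighted average of sensor-to-sensor measurements, and that a target velocity between two speed levels is approximated by time-slicing between them while ignoring acceleration. The names `update_velocity`, `high_level_fraction`, and the smoothing factor `alpha` are hypothetical.

```python
def update_velocity(estimate_mm_s, distance_mm, elapsed_s, alpha=0.25):
    """Blend one sensor-to-sensor measurement into the running estimate.

    alpha is an assumed smoothing factor: larger values react faster
    but are more sensitive to a single noisy measurement.
    """
    if elapsed_s <= 0:
        return estimate_mm_s  # ignore degenerate samples
    measured = distance_mm / elapsed_s
    return (1 - alpha) * estimate_mm_s + alpha * measured


def high_level_fraction(target_mm_s, v_low_mm_s, v_high_mm_s):
    """Fraction of time to spend at the higher speed level so that the
    average velocity matches the target (acceleration is ignored)."""
    if not v_low_mm_s <= target_mm_s <= v_high_mm_s:
        raise ValueError("target must lie between the two calibrated velocities")
    if v_high_mm_s == v_low_mm_s:
        return 0.0
    return (target_mm_s - v_low_mm_s) / (v_high_mm_s - v_low_mm_s)
```

For a cohort, the follower could apply the same time-slicing and compare its sensor timestamps with the leader's to verify that the gap stays near the intended distance.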
- sensor hit is either
- train - which one?
- keep track of expected sensor, expected time of trigger
- train N, time T +/- error margin
- error margin not necessarily symmetric
- velocity estimate - symmetric
- sensor reporting latency - asymmetric
- error: unexpected sensor is triggered (or timeout)
- error detection critically depends on assumptions
- here: assume at most one failure, not two in a row
- can observation be explained by a single turnout failure or a single sensor failure?
- spurious sensor hit?
- testing: treat as critical error
- demo: ignore and hope for the best
- enhanced error model?
- one error → (detected by attribution) recover
- two errors → (detected by timeout) abort
- anything else → spurious/ignore?
- broken turnout complicates matters...
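The attribution scheme above can be sketched as follows. This is a hedged illustration only: it assumes each train carries a small set of expectations (the primary next sensor plus alternatives that a single sensor or turnout failure would explain), and that the acceptance window is wider on the late side to allow for sensor reporting latency. The class `Expectation`, the margin constants, and the sensor names in the usage are hypothetical.

```python
from dataclasses import dataclass

EARLY_MARGIN = 0.20  # assumed: velocity estimate error only (seconds)
LATE_MARGIN = 0.35   # assumed: velocity error plus reporting latency


@dataclass
class Expectation:
    train: int
    sensor: str
    time: float             # predicted trigger time in seconds
    kind: str = "primary"   # or "sensor_failure" / "turnout_failure"


def attribute(sensor, now, expectations):
    """Return the best-matching expectation for a sensor hit, or None."""
    candidates = [
        e for e in expectations
        if e.sensor == sensor
        and e.time - EARLY_MARGIN <= now <= e.time + LATE_MARGIN
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda e: abs(now - e.time))


def classify(match):
    """Map an attribution result onto the simple error model."""
    if match is None:
        return "spurious"   # testing: critical error; demo: ignore
    return "ok" if match.kind == "primary" else "recover"


def overdue_trains(now, expectations):
    """Trains whose every expectation window has passed: abort candidates."""
    deadline = {}
    for e in expectations:
        late = e.time + LATE_MARGIN
        deadline[e.train] = max(deadline.get(e.train, late), late)
    return sorted(t for t, d in deadline.items() if now > d)
```

With one expectation for train 1 at sensor `C13` and time 10.0, a hit at 10.3 is accepted while a hit at 9.7 is rejected, even though both are 0.3 s off: the window is deliberately asymmetric.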
Train Control Robustness
- fault testing? example: Chaos Monkey
- fake or suppress reports/commands: speed, turnout, sensor
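A fault-testing layer in the spirit of the items above might look like the sketch below. It assumes all reports and commands pass through a single delivery function that can be wrapped; the class `FaultInjector` and its parameters are hypothetical. A seeded generator keeps a failing run reproducible.

```python
import random


class FaultInjector:
    """Wraps a delivery function and randomly suppresses or fakes events."""

    def __init__(self, deliver, drop_prob=0.0, fake_prob=0.0, seed=0):
        self.deliver = deliver
        self.drop_prob = drop_prob
        self.fake_prob = fake_prob
        self.rng = random.Random(seed)  # seeded for reproducible runs
        self.dropped = []               # record of suppressed events

    def send(self, event):
        """Forward the event, unless it is chosen for suppression."""
        if self.rng.random() < self.drop_prob:
            self.dropped.append(event)
            return False
        self.deliver(event)
        return True

    def maybe_fake(self, make_event):
        """Occasionally deliver a fabricated event built by make_event(rng)."""
        if self.rng.random() < self.fake_prob:
            self.deliver(make_event(self.rng))
            return True
        return False
```

The same wrapper can sit in front of speed commands, turnout commands, or sensor reports; only the event type changes.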
Collision Avoidance - Detection
- direct: check other trains
- indirect: use track representation
- full-path vs. on-demand reservation and release
- planning horizon?
- consider speed and stopping distance
- topology (potential for evasion)
- trade-off: safety vs. efficiency
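On-demand reservation with a planning horizon tied to stopping distance can be sketched as below. The assumptions are that the track is divided into named segments with known lengths, that a train reserves enough segments ahead to cover its stopping distance plus a safety margin, and that reservation is all-or-nothing. The function names and the margin value are hypothetical.

```python
def segments_needed(path, stopping_distance_mm, margin_mm=100):
    """Segments ahead that must be held so the train can always stop.

    path is a list of (segment_id, length_mm) starting at the train.
    """
    needed, covered = [], 0
    for segment, length_mm in path:
        if covered >= stopping_distance_mm + margin_mm:
            break
        needed.append(segment)
        covered += length_mm
    return needed


def try_reserve(table, train, segments):
    """Reserve all segments for the train, or none if any is taken."""
    if any(table.get(s, train) != train for s in segments):
        return False
    for s in segments:
        table[s] = train
    return True


def release(table, train, segment):
    """Release a segment behind the train, if this train holds it."""
    if table.get(segment) == train:
        del table[segment]
```

A larger margin is safer but holds more track, which is the safety versus efficiency trade-off noted above: faster trains need longer horizons and block other trains earlier.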
Collision Avoidance - Reaction
- avoid deadlock
- direction of conflict?
- stationary obstacle → reroute
- same direction → slowing or stopping is easy
- head-on → likely reroute
- shortest-path routing is often presented as optimal, but it is not always the best choice → bottlenecks!
- common optimisation criterion required for consistent path computation
- from the perspective of different nodes, especially in distributed setup
- run Dijkstra's algorithm only for branch nodes
- use overlay data structure for reservations and faults
- mid-travel reverse - relevant for merge nodes (and begin/end); model as extra link → augment track data
- online: on-demand routing?
- compute path using track \ reservations (track graph minus reserved or faulty segments)
- single-node Dijkstra (overlay) is probably doable online
- offline: determine loop-free alternative path with loop threshold!
- loop-free: take other edge, then remove turnout from consideration
- or take loop to get out of the way
- offline: precompute (some) routing information at system startup
- including knowledge about track deficiencies
- finish path computation online
- e.g. Dijkstra previous hop tracing
- e.g. find alternative routes
- on-demand routing: compute entire end-to-end path online
- including knowledge about other planned routes
- single-node Dijkstra feasible
- on-demand conflict resolution
- avoid switching turnout while train passes
- integrate with track reservation system?
- or independent switch protection layer?
- block turnout during travel, treat as (transient) turnout error
- integrate everything via reservation system or combine independent components?
- two time horizons: immediate (prediction) and longer-term (reservation)
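Routing on "track \ reservations" with an overlay can be sketched as below. It assumes the track is a directed graph stored as nested dictionaries of edge lengths, and that reservations and known faults are kept in a separate set of blocked edges rather than by modifying the track data. The function `shortest_path` and the tiny `TRACK` graph are hypothetical; a mid-travel reverse could be modelled by adding extra edges to the graph.

```python
import heapq


def shortest_path(graph, source, target, blocked=frozenset()):
    """Dijkstra on graph minus the directed edges listed in blocked.

    Returns (distance, path), or (None, []) if the target is unreachable.
    """
    dist = {source: 0}
    prev = {}
    done = set()
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if node == target:
            break
        for nxt, length in graph.get(node, {}).items():
            if (node, nxt) in blocked:
                continue  # overlay: reserved or faulty edge
            nd = d + length
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return None, []
    path = [target]
    while path[-1] != source:  # previous-hop tracing
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical four-node track with two routes from A to D.
TRACK = {"A": {"B": 1, "C": 2}, "B": {"D": 1}, "C": {"D": 2}, "D": {}}
```

With no overlay, the route is A, B, D; blocking the edge (B, D) yields the longer route through C without touching `TRACK` itself.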