Most lectures will be done on the black/white board. Supplemental materials
such as diagrams and simulations will be posted here.
| Topic | Section | Text Reference |
| System Verilog (notes) |
| 0) Intro | |
| 1) Gate-Level Modelling | |
| 1.1) Hierarchical Design | |
| 2) Dataflow Modelling | |
| 3) Behavioural Modelling | |
| 3.1) Synchronous Design | |
| 3.2) Discrete Event Simulation (skipped) | |
| 4) Miscellaneous Verilog | |
| 5) Design Example: Complex Multiplier | |
| Processor Design (notes) | ISA, μarch, ILP, TLP, DLP | |
| Scalar Pipelines (notes) | 1) MIPS Datapath | PCOD 3.3.1, H+P C.3 |
| 2) Dependencies | PCOD 3.3.1 |
| 3) Static Scheduling | PCOD 3.3.5, H+P 3.2, C.5 |
| 3.1) Local Scheduling | PCOD 3.3.5 |
| 3.2) Global Scheduling | PCOD 3.3.5 |
| 3.2.1) Loop Unrolling | |
| 4) Exceptions | H+P C.4 |
| VLIW Pipelines (notes) | 1) Intro | PCOD 3.5.2, H+P 3.7 |
| 2) Local Scheduling | PCOD 3.5.2 |
| 3) Loop Unrolling | PCOD 3.5.3, H+P 3.7 |
| 4) Software Pipelining | PCOD 3.5.4 |
| 5) Trace Scheduling | PCOD 3.5.5, H+P H.3 |
| 6) Speculative Loads | PCOD 3.5.7, H+P H.5 |
| 7) Deferred Exceptions | PCOD 3.5.8, H+P H.5 |
| 8) Predicated Execution | PCOD 3.5.6, H+P 4.4 |
| 9) IA-64 ISA | PCOD 3.6, H+P 3.6 |
| Out-of-Order Pipelines (notes 1, notes 2, notes 3) | 1) Introduction | |
| 2) Tomasulo's Algorithm | PCOD 3.4.1, H+P 3.4, 3.5 |
| 3) Pipeline Structure | PCOD 3.4.4 |
| 3.1) Re-Order Buffer | |
| 4) Data Dependencies | |
| 5) Register Renaming | PCOD 3.4.6 |
| 5.1) Rename Register File | |
| 5.2) Physical Register File | PCOD 3.4.6 |
| 5.3) Delayed Register Read | PCOD 3.4.7 |
| 5.4) Speculative Register Renaming | |
| 6) Intel Skylake Pipeline | |
| 7) Memory Data Flow | |
| 7.1) Speculative Loads | PCOD 3.4.5 |
| 7.2) Data Prefetch | |
| 7.3) Caches | |
| 7.4) Virtual Memory | |
| 8) Control Flow | |
| 8.1) Target Speculation | PCOD 3.4.3 |
| 8.1.1) Return Address Stack | |
| 8.1.2) Polymorphic Branches | |
| 8.2) Condition Speculation | PCOD 3.4.3 |
| 8.2.1) Static Methods | |
| 8.2.2) Dynamic Methods | |
| 8.2.3) Correlating Predictors | PCOD 5.1.5, H+P 3.3 |
| 8.3) Validation and Recovery | PCOD 3.4.3 |
| Thread-Level Parallelism (notes 1, notes 2) | 1) Introduction | PCOD 5.2 |
| 2) Synchronization | PCOD 7.5 |
| 2.1) Pthreads | |
| 2.2) OpenMP | |
| 3) Shared Memory Multiprocessors | PCOD 5.4.1 |
| 3.1) UMA | |
| 3.2) NUMA | |
| 4) Coherent Shared Memory | PCOD 5.4.2, H+P 5.2 |
| 4.1) MESI Protocol | PCOD 5.4.3 |
| 4.2) Snooping Implementation | PCOD 5.4.3 |
| 4.2.1) MESIF and MOESI | H+P 5.2 |
| 4.2.2) Locks and Coherence | H+P 5.5 |
| 4.3) Directory Implementation | PCOD 5.5.2, H+P 5.4 |
| 4.4) Multilevel Caches | PCOD 5.5.2 |
| 4.5) Transactional Memory | |
| 4.5.1) HTM on POWER8 | |
| 4.5.2) HTM Application | |
| 5) Memory Consistency | PCOD 7.4, 7.6.1, H+P 5.6, 5.7 |
| 6) Multithreaded Processors | PCOD 8.3, 11.4.2, 11.4.3, 11.4.4 |
| Data-Level Parallelism (notes) | 1) GPU | |
| 2) CUDA Programming Model | |
| 3) GPU Hardware | |
| 4) Many-Core Architectures | |