Topic | Section | Text Reference |
SystemVerilog
(notes) |
0) Intro | |
1) Gate-Level Modelling | |
1.1) Hierarchical Design | |
2) Dataflow Modelling | |
3) Behavioural Modelling | |
3.1) Synchronous Design | |
3.2) Discrete Event Simulation (skipped) | |
4) Miscellaneous Verilog | |
5) Design Example: Complex Multiplier | |
Processor Design
(notes) |
ISA, μarch, ILP, TLP, DLP | |
Scalar Pipelines
(notes) |
1) MIPS Datapath | PCOD 3.3.1, H+P C.3 |
2) Dependencies | PCOD 3.3.1 |
3) Static Scheduling | PCOD 3.3.5, H+P 3.2,C.5 |
3.1) Local Scheduling | PCOD 3.3.5 |
3.2) Global Scheduling | PCOD 3.3.5 |
3.2.1) Loop Unrolling | |
4) Exceptions | H+P C.4 |
VLIW Pipelines
(notes) |
1) Intro | PCOD 3.5.2, H+P 3.7 |
2) Local Scheduling | PCOD 3.5.2 |
3) Loop Unrolling | PCOD 3.5.3, H+P 3.7 |
4) Software Pipelining | PCOD 3.5.4 |
5) Trace Scheduling | PCOD 3.5.5, H+P H.3 |
6) Speculative Loads | PCOD 3.5.7, H+P H.5 |
7) Deferred Exceptions | PCOD 3.5.8, H+P H.5 |
8) Predicated Execution | PCOD 3.5.6, H+P 4.4 |
9) IA-64 ISA | PCOD 3.6, H+P 3.6 |
Out-of-Order Pipelines
(notes 1, notes 2, notes 3) |
1) Introduction | |
2) Tomasulo's Algorithm | PCOD 3.4.1, H+P 3.4,3.5 |
3) Pipeline Structure | PCOD 3.4.4 |
3.1) Re-Order Buffer | |
4) Data Dependencies | |
5) Register Renaming | PCOD 3.4.6 |
5.1) Rename Register File | |
5.2) Physical Register File | PCOD 3.4.6 |
5.3) Delayed Register Read | PCOD 3.4.7 |
5.4) Speculative Register Renaming | |
6) Intel Skylake Pipeline | |
7) Memory Data Flow | |
7.1) Speculative Loads | PCOD 3.4.5 |
7.2) Data Prefetch | |
7.3) Caches | |
7.4) Virtual Memory | |
8) Control Flow | |
8.1) Target Speculation | PCOD 3.4.3 |
8.1.1) Return Address Stack | |
8.1.2) Polymorphic Branches | |
8.2) Condition Speculation | PCOD 3.4.3 |
8.2.1) Static Methods | |
8.2.2) Dynamic Methods | |
8.2.3) Correlating Predictors | PCOD 5.1.5, H+P 3.3 |
8.3) Validation and Recovery | PCOD 3.4.3 |
Thread-Level Parallelism
(notes 1, notes 2) |
1) Introduction | PCOD 5.2 |
2) Synchronization | PCOD 7.5 |
2.1) Pthreads | |
2.2) OpenMP | |
3) Shared Memory Multiprocessors | PCOD 5.4.1 |
3.1) UMA | |
3.2) NUMA | |
4) Coherent Shared Memory | PCOD 5.4.2, H+P 5.2 |
4.1) MESI Protocol | PCOD 5.4.3 |
4.2) Snooping Implementation | PCOD 5.4.3 |
4.2.1) MESIF and MOESI | H+P 5.2 |
4.2.2) Locks and Coherence | H+P 5.5 |
4.3) Directory Implementation | PCOD 5.5.2, H+P 5.4 |
4.4) Multilevel Caches | PCOD 5.5.2 |
4.5) Transactional Memory | |
4.5.1) HTM on POWER8 | |
4.5.2) HTM Application | |
5) Memory Consistency | PCOD 7.4,7.6.1, H+P 5.6,5.7 |
6) Multithreaded Processors | PCOD 8.3, 11.4.2, 11.4.3, 11.4.4 |
Data-Level Parallelism
(notes) |
1) GPU | |
2) CUDA Programming Model | |
3) GPU Hardware | |
4) Many-Core Architectures | |