
Modern Computer Architecture


Lecturer: Prof. Zhang Gang (張鋼), School of Computer Science, Tianjin University
Contact email:
Assignment submission email:
2022

The Main Contents (course outline)

• Chapter 1. Fundamentals of Quantitative Design and Analysis
• Chapter 2. Memory Hierarchy Design
• Chapter 3. Instruction-Level Parallelism and Its Exploitation
• Chapter 4. Data-Level Parallelism in Vector, SIMD, and GPU Architectures
• Chapter 5. Thread-Level Parallelism
• Chapter 6. Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism
• Appendix A. Pipelining: Basic and Intermediate Concepts

Class discussion

2022/8/17

Advantages of Dynamic Scheduling

• Dynamic scheduling
  – Hardware rearranges instruction execution to reduce stalls while maintaining data flow and exception behavior.
• What does it mean to maintain data flow and exception behavior?
• Advantages
  – It handles cases where dependences are unknown at compile time.
    • It allows the processor to tolerate unpredictable delays, such as cache misses, by executing other code while waiting for the miss to resolve.
  – It allows code that was compiled for one pipeline to run efficiently on a different pipeline.
  – It simplifies the compiler.
• Why?

HW Schemes: Instruction Parallelism

• Key idea: allow instructions behind a stall to proceed.

    DIVD  F0,F2,F4
    ADDD  F10,F0,F8
    SUBD  F12,F8,F14

• Enables out-of-order execution and allows out-of-order completion (e.g., SUBD).
  – In a dynamically scheduled pipeline, all instructions still pass through the issue stage in order (in-order issue).
• What do in-order issue, out-of-order execution, and out-of-order completion mean?

[...]

• Buffers distributed with the Functional Units (FU)
  – FU buffers are called "reservation stations"
• [...] called register renaming

Three Stages of Tomasulo Algorithm

1. [...] sends operands (renames registers).
2. Execute: operate on operands (EX). When both operands are ready, then execute [...]
3. [...] mark reservation station available.

• Normal data bus: data + destination ("go to" bus)
• Common data bus: data + source ("come from" bus)
  – 64 bits of data + 4 bits of Functional Unit source address
  – Write if it matches the expected Functional Unit (the one that produces the result)
  – Does the broadcast
• Example speed:
  – 3 clocks for floating-point +, [...]
  – 40 clocks for floating-point /

Tomasulo Example

Figure labels: instruction stream; 3 load buffers; 3 FP adder reservation stations; 2 FP multiplier reservation stations; FU count down (Time column); clock cycle counter. In the reservation-station table, Vj and Vk hold the source operand values (S1, S2), and Qj and Qk name the reservation station (RS) that will produce them. Empty cells are shown as "-".

Instruction status:

Instruction  j    k    Issue  Exec Comp  Write Result
LD    F6     34+  R2   -      -          -
LD    F2     45+  R3   -      -          -
MULTD F0     F2   F4   -      -          -
SUBD  F8     F6   F2   -      -          -
DIVD  F10    F0   F6   -      -          -
ADDD  F6     F8   F2   -      -          -

Load buffers:

Name   Busy  Address
Load1  No    -
Load2  No    -
Load3  No    -

Reservation stations:

Time  Name   Busy  Op  Vj  Vk  Qj  Qk
-     Add1   No    -   -   -   -   -
-     Add2   No    -   -   -   -   -
-     Add3   No    -   -   -   -   -
-     Mult1  No    -   -   -   -   -
-     Mult2  No    -   -   -   -   -

Register result status (clock 0):

      F0  F2  F4  F6  F8  F10  F12  ...  F30
FU    -   -   -   -   -   -    -         -

Tomasulo Example Cycle 1

Instruction status:

Instruction  j    k    Issue  Exec Comp  Write Result
LD    F6     34+  R2   1      -          -
LD    F2     45+  R3   -      -          -
MULTD F0     F2   F4   -      -          -
SUBD  F8     F6   F2   -      -          -
DIVD  F10    F0   F6   -      -          -
ADDD  F6     F8   F2   -      -          -

Load buffers:

Name   Busy  Address
Load1  Yes   34+R2
Load2  No    -
Load3  No    -

Reservation stations:

Time  Name   Busy  Op  Vj  Vk  Qj  Qk
-     Add1   No    -   -   -   -   -
-     Add2   No    -   -   -   -   -
-     Add3   No    -   -   -   -   -
-     Mult1  No    -   -   -   -   -
-     Mult2  No    -   -   -   -   -

Register result status (clock 1):

      F0  F2  F4  F6     F8  F10  F12  ...  F30
FU    -   -   -   Load1  -   -    -         -

Tomasulo Example Cycle 2

Instruction status:

Instruction  j    k    Issue  Exec Comp  Write Result
LD    F6     34+  R2   1      -          -
LD    F2     45+  R3   2      -          -
MULTD F0     F2   F4   -      -          -
SUBD  F8     F6   F2   -      -          -
DIVD  F10    F0   F6   -      -          -
ADDD  F6     F8   F2   -      -          -

Load buffers:

Name   Busy  Address
Load1  Yes   34+R2
Load2  Yes   45+R3
Load3  No    -

Reservation stations:

Time  Name   Busy  Op  Vj  Vk  Qj  Qk
-     Add1   No    -   -   -   -   -
-     Add2   No    -   -   -   -   -
-     Add3   No    -   -   -   -   -
-     Mult1  No    -   -   -   -   -
-     Mult2  No    -   -   -   -   -

Register result status (clock 2):

      F0  F2     F4  F6     F8  F10  F12  ...  F30
FU    -   Load2  -   Load1  -   -    -         -

Note: there can be multiple loads outstanding.

Tomasulo Example Cycle 3

Instruction status:

Instruction  j    k    Issue  Exec Comp  Write Result
LD    F6     34+  R2   1      3          -
LD    F2     45+  R3   2      -          -
MULTD F0     F2   F4   3      -          -
SUBD  F8     F6   F2   -      -          -
DIVD  F10    F0   F6   -      -          -
ADDD  F6     F8   F2   -      -          -

Load buffers:

Name   Busy  Address
Load1  Yes   34+R2
Load2  Yes   45+R3
Load3  No    -

Reservation stations:

Time  Name   Busy  Op     Vj  Vk     Qj     Qk
-     Add1   No    -      -   -      -      -
-     Add2   No    -      -   -      -      -
-     Add3   No    -      -   -      -      -
-     Mult1  Yes   MULTD  -   R(F4)  Load2  -
-     Mult2  No    -      -   -      -      -

Register result status (clock 3):

      F0     F2     F4  F6     F8  F10  F12  ...  F30
FU    Mult1  Load2  -   Load1  -   -    -         -

• Note: register names are removed ("renamed") in the reservation stations.

[...]

Reservation stations:

Time  Name   Busy  Op     Vj     Vk     Qj     Qk
-     Add1   Yes   SUBD   M(A1)  -      -      Load2
-     Add2   No    -      -      -      -      -
-     Add3   No    -      -      -      -      -
-     Mult1  Yes   MULTD  -      R(F4)  Load2  -
-     Mult2  No    -      -      -      -      -

Register result status (clock 4):

      F0     F2     F4  F6     F8    F10  F12  ...  F30
FU    Mult1  Load2  -   M(A1)  Add1  -    -         -

• Load2 completing [...]
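The DIVD/ADDD/SUBD sequence on the HW Schemes slide can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: `Instr` and `raw_dependences` are hypothetical helpers introduced here for illustration, and only true (read-after-write) dependences are tracked. Under those assumptions it shows that ADDD depends on DIVD through F0, while SUBD depends on neither earlier instruction, which is why SUBD may execute and complete out of order.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    """Hypothetical three-operand FP instruction: op dest, src1, src2."""
    op: str
    dest: str
    src1: str
    src2: str


def raw_dependences(program):
    """Return (consumer_index, producer_index, register) for each RAW dependence."""
    last_writer = {}  # register -> index of the most recent instruction writing it
    deps = []
    for i, ins in enumerate(program):
        for src in (ins.src1, ins.src2):
            if src in last_writer:
                deps.append((i, last_writer[src], src))
        last_writer[ins.dest] = i
    return deps


program = [
    Instr("DIVD", "F0", "F2", "F4"),
    Instr("ADDD", "F10", "F0", "F8"),
    Instr("SUBD", "F12", "F8", "F14"),
]

deps = raw_dependences(program)
# Instructions that read no pending result can proceed past a stalled DIVD.
independent = [ins.op for i, ins in enumerate(program)
               if all(consumer != i for consumer, _, _ in deps)]
print(deps)
print(independent)
```

Here `independent` contains DIVD (it has no earlier producer) and SUBD; ADDD is the only instruction that must wait for the long-running divide.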
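The bookkeeping behind the Tomasulo example (reservation stations, the register result status, and the common data bus broadcast) can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the slides' implementation: the class and method names (`Station`, `Tomasulo`, `issue_load`, `issue_fp`, `broadcast`) are hypothetical, values are kept as symbolic strings such as `R(F4)` and `M(A1)` in the slides' notation, and execution latencies and cycle counting are not modeled at all.

```python
class Station:
    """One reservation station or load buffer entry."""

    def __init__(self, name):
        self.name = name
        self.clear()

    def clear(self):
        self.busy = False
        self.op = None
        self.address = None       # used by load buffers only
        self.vj = self.vk = None  # operand values that are already available
        self.qj = self.qk = None  # stations that will produce missing operands


class Tomasulo:
    def __init__(self):
        names = ["Load1", "Load2", "Load3",
                 "Add1", "Add2", "Add3", "Mult1", "Mult2"]
        self.stations = {n: Station(n) for n in names}
        self.reg_status = {}  # register -> station that will write it
        self.reg_value = {}   # register -> value delivered by a broadcast

    def _allocate(self, prefix):
        for st in self.stations.values():
            if st.name.startswith(prefix) and not st.busy:
                st.busy = True
                return st
        return None  # structural hazard: the caller would have to stall

    def _read(self, reg):
        """Return (value, producer); exactly one of the two is not None."""
        if reg in self.reg_status:
            return None, self.reg_status[reg]
        return self.reg_value.get(reg, f"R({reg})"), None

    def issue_load(self, dest, address):
        st = self._allocate("Load")
        if st is None:
            return None
        st.op, st.address = "LD", address
        self.reg_status[dest] = st.name
        return st

    def issue_fp(self, op, dest, src1, src2):
        st = self._allocate("Mult" if op in ("MULTD", "DIVD") else "Add")
        if st is None:
            return None
        st.op = op
        # Sources are read before the destination is renamed.
        st.vj, st.qj = self._read(src1)
        st.vk, st.qk = self._read(src2)
        self.reg_status[dest] = st.name
        return st

    def broadcast(self, producer, value):
        """Common data bus: deliver (value, source station) to every waiter."""
        for st in self.stations.values():
            if st.qj == producer:
                st.vj, st.qj = value, None
            if st.qk == producer:
                st.vk, st.qk = value, None
        for reg in [r for r, p in self.reg_status.items() if p == producer]:
            self.reg_value[reg] = value
            del self.reg_status[reg]
        self.stations[producer].clear()  # mark the station available


cpu = Tomasulo()
cpu.issue_load("F6", "34+R2")                    # cycle 1
cpu.issue_load("F2", "45+R3")                    # cycle 2
mult1 = cpu.issue_fp("MULTD", "F0", "F2", "F4")  # cycle 3
cpu.broadcast("Load1", "M(A1)")                  # Load1 writes its result
add1 = cpu.issue_fp("SUBD", "F8", "F6", "F2")    # cycle 4
```

After these calls the model agrees with the cycle-4 tables of the example: Mult1 holds `R(F4)` in Vk and waits on Load2 in Qj, Add1 holds `M(A1)` in Vj and waits on Load2 in Qk, and the register result status maps F0 to Mult1, F2 to Load2, and F8 to Add1. Keeping the producer's station name instead of the register name in Qj/Qk is what the slides call register renaming.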
