Advanced Computer Architectures (計算機アーキテクチャ特論)

4. Register Renaming and Memory Dataflow
Kenji Kise (吉瀬 謙二), Department of Computer Science
kise _at_ cs.titech.ac.jp www.arch.cs.titech.ac.jp
Lecture room W832, Fridays 13:20 – 14:50

Hazards
There are situations in which an instruction cannot execute in its proper clock cycle. Such a situation is called a hazard.
- Structural hazard
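
Instruction sequence with no data dependences
[Figure: four assignment instructions (1) to (4); each reads one of the registers R3, R4, R5, R6, adds 1, and assigns (:=) the result to a register]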
Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005

Instruction sequence used in the dependence examples:
(1) R3 := R3 x R5
(2) R4 := R3 + 1
(3) R3 := R5 + 1
(4) R7 := R3 x R4
If R3 = 20 and R5 = 3, execution in program order gives:
(1) 60 := 20 x 3
(2) 61 := 60 + 1
(3) 4 := 3 + 1
(4) 244 := 4 x 61

- Control hazard
- Data hazard
- Structural hazard: occurs when the hardware does not support the combination of instructions that are to be executed in overlap. It is caused by a lack of resources.

True data dependence, RAW (read after write)
Instruction (4) must not execute before instruction (3) completes. If it does (with R3 = 20, R5 = 3), instruction (4) reads the old value of R3:
(1) 60 := 20 x 3
(2) 61 := 60 + 1
(3) 4 := 3 + 1
(4) XXX := 60 x 61

Output dependence, WAW (write after write)
The assignment of instruction (1) must not complete after the assignment of instruction (3).
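
The RAW and WAW constraints can be checked numerically. The following Python sketch is not part of the lecture. It runs the four-instruction example with R3 = 20 and R5 = 3 in program order, then with instruction (4) moved ahead of (3), and finally with the write of (1) landing after the write of (3). The names `PROGRAM`, `execute`, and `write_back` are hypothetical, and the model is deliberately simplified (no pipeline timing).

```python
# (dst, src_a, op, src_b); integer operands are immediates.
PROGRAM = [
    ("R3", "R3", "*", "R5"),  # (1) R3 := R3 x R5
    ("R4", "R3", "+", 1),     # (2) R4 := R3 + 1
    ("R3", "R5", "+", 1),     # (3) R3 := R5 + 1
    ("R7", "R3", "*", "R4"),  # (4) R7 := R3 x R4
]

def value(operand, regs):
    """Read a register by name, or return an immediate unchanged."""
    return regs[operand] if isinstance(operand, str) else operand

def execute(order, regs):
    """Run whole instructions (read, compute, write) in the given 0-based order."""
    regs = dict(regs)
    for i in order:
        dst, a, op, b = PROGRAM[i]
        x, y = value(a, regs), value(b, regs)
        regs[dst] = x * y if op == "*" else x + y
    return regs

def write_back(completion_order, regs):
    """Compute results in program order, then apply the writes in completion_order."""
    work = dict(regs)
    results = []
    for dst, a, op, b in PROGRAM:
        x, y = value(a, work), value(b, work)
        work[dst] = x * y if op == "*" else x + y
        results.append((dst, work[dst]))
    final = dict(regs)
    for i in completion_order:
        dst, v = results[i]
        final[dst] = v
    return final

init = {"R3": 20, "R5": 3}
in_order = execute([0, 1, 2, 3], init)
raw_violation = execute([0, 1, 3, 2], init)     # (4) runs before (3) completes
waw_violation = write_back([1, 2, 0, 3], init)  # write of (1) lands after write of (3)
print(in_order["R7"], raw_violation["R7"], waw_violation["R3"])  # 244 3660 60
```

Under these assumptions the RAW violation leaves 60 x 61 = 3660 in R7 instead of 244, and the WAW violation leaves 60 in R3 instead of 4.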

Antidependence, WAR (write after read)
Instruction (3) must not complete before instruction (2) begins execution. If it does (with R3 = 20, R5 = 3), instruction (2) reads the new value of R3:
(1) 60 := 20 x 3
(2) XX := 4 + 1
(3) 4 := 3 + 1
(4) XXX := 4 x XX

Data dependence
- True data dependence: RAW, read after write
- False data dependence
  - Output dependence: WAW, write after write
  - Antidependence: WAR, write after read
- RAR (read after read)?
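
The three dependence types can be found mechanically by tracking, for each register, its most recent writer and the instructions that have read it since. The sketch below is an illustration, not the lecture's method. The function name `find_dependences` and the `(dst, srcs)` tuple encoding are assumptions made for this example, and immediates are simply left out of `srcs`.

```python
def find_dependences(program):
    """Return (kind, earlier, later, register) for a list of (dst, srcs) in program order.

    Instructions are numbered from 1, as on the slides.
    """
    last_writer = {}  # register -> number of the latest instruction that wrote it
    readers = {}      # register -> instructions that read it since that write
    deps = []
    for j, (dst, srcs) in enumerate(program, start=1):
        unique_srcs = list(dict.fromkeys(srcs))
        for r in unique_srcs:
            if r in last_writer:                      # read after write
                deps.append(("RAW", last_writer[r], j, r))
        for i in readers.get(dst, []):                # write after read
            deps.append(("WAR", i, j, dst))
        if dst in last_writer:                        # write after write
            deps.append(("WAW", last_writer[dst], j, dst))
        for r in unique_srcs:
            readers.setdefault(r, []).append(j)
        last_writer[dst] = j
        readers[dst] = []                             # the new value has no readers yet
    return deps

original = [
    ("R3", ("R3", "R5")),  # (1) R3 := R3 x R5
    ("R4", ("R3",)),       # (2) R4 := R3 + 1
    ("R3", ("R5",)),       # (3) R3 := R5 + 1
    ("R7", ("R3", "R4")),  # (4) R7 := R3 x R4
]
for dep in find_dependences(original):
    print(dep)
```

For the example sequence this reports RAW from (1) to (2), from (3) to (4), and from (2) to (4), WAR from (2) to (3), and WAW from (1) to (3), which matches the three constraints stated on the slides.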

Pipelining
Processor datapath (pipelined)

Superscalar processors and instruction-level parallelism
- Use multiple pipelines to raise IPC to 1 or above
  - n-way superscalar (example: 2-way superscalar)
- Aggressively resolving hazards and hiding stalls is important

Register renaming
Original sequence:
(1) R3 := R3 x R5
(2) R4 := R3 + 1
(3) R3 := R5 + 1
(4) R7 := R3 x R4
After renaming:
(1) R8 := R3 x R5
(2) R9 := R8 + 1
(3) R10 := R5 + 1
(4) R11 := R10 x R9
What is the instruction-level parallelism?
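
One way to approach the question is to schedule both versions of the sequence on a 2-way machine and compare cycle counts. The sketch below is not the lecture's answer. It assumes unit latency, assumes that every dependence (RAW, WAW, or WAR) pushes the later instruction into a later cycle, and takes the dependence edges as given. The function name `schedule` and the edge lists are written out by hand for this illustration.

```python
def schedule(n, edges, width):
    """Place instructions 1..n, in program order, into the earliest legal cycle.

    edges: (earlier, later) pairs; the later instruction must start in a later cycle.
    width: maximum number of instructions issued per cycle.
    """
    cycle = {}
    used = {}  # cycle number -> instructions already placed there
    for j in range(1, n + 1):
        c = 1 + max((cycle[i] for i, k in edges if k == j), default=0)
        while used.get(c, 0) >= width:  # issue slots full, try the next cycle
            c += 1
        cycle[j] = c
        used[c] = used.get(c, 0) + 1
    return cycle

# Original sequence: RAW 1->2, 3->4, 2->4; WAW 1->3; WAR 2->3.
original_edges = [(1, 2), (3, 4), (2, 4), (1, 3), (2, 3)]
# Renamed sequence (R8..R11): only the RAW edges remain.
renamed_edges = [(1, 2), (3, 4), (2, 4)]

for name, edges in (("original", original_edges), ("renamed", renamed_edges)):
    cycles = schedule(4, edges, width=2)
    total = max(cycles.values())
    print(name, cycles, "cycles:", total, "ILP:", round(4 / total, 2))
```

Under these assumptions the original sequence needs 4 cycles (ILP 1.0), while the renamed one needs 3 cycles because (1) and (3) can issue together (ILP about 1.33).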

Register renaming by hardware
- Logical register
  - A register visible to the programmer and the compiler
  - A register visible at the level of the processor architecture
  - Specified in a field of the machine instruction
  - The MIPS instruction set uses the logical registers R0 – R31
- Physical register
  - A register that physically exists
  - Need not be explicitly visible to the programmer or the compiler
[Figure: Logical Register File (MIPS Register File): 32 locations of 32 bits, R0 – R31; 5-bit src1, src2, and dst addresses; 32-bit src1 data, src2 data, and write data]
[Figure: Physical Registers: 128 locations of 32 bits, P0 – P127; 7-bit src and dst addresses; 32-bit src data and write data ports]
[Figure: register map table with rows R0 – R7, free tag buffer with head and tail pointers, and entries P1, P2, P4, P6, P7, P8, P9, P10; a field is labeled 8 bits]

Register renaming (exercise), Cycle 1
I0: sub $5,$1,$2
I1: add $9,$5,$4
I2: or $5,$5,$2
I3: and $2,$9,$1
Register map table: entries 0 – 31
Free tag buffer: 13 12 11 10 9 (head at 9)
I0: dst = $5, src1 = $1, src2 = $2

Register renaming (exercise), Cycles 1 to 4
I0: sub $5,$1,$2
I1: add $9,$5,$4
I2: or $5,$5,$2
I3: and $2,$9,$1
Each slide repeats the register map table (entries 0 – 31) and the free tag buffer with its head pointer.
The one filled-in result, shown at Cycle 2, is I0: dst = p9, src1 = p1, src2 = p2, that is, I0: sub p9,p1,p2.
The fields for I1: add, I2: or, and I3: and are left blank (dst = , src1 = , src2 = ).

Exercise
(1) R3 := R3 x R5
(2) R4 := R3 + 1
(3) R3 := R5 + 1
(4) R7 := R3 x R4
Free tag buffer: 36 35 34 33 32
dst = , src1 = , src2 = (left blank)

Register renaming
With the destinations renamed to R8 – R11 (R8 := R3 x R5, R9 := R8 + 1, R10 := R5 + 1, R11 := R10 x R9), instruction-level parallelism improves.
[Figure: execution schedule of instructions (1) to (4) after renaming]
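
The sub/add/or/and exercise can be traced with a small model of the register map table and the free tag buffer. This is a sketch under stated assumptions, not the lecture's solution. It assumes that logical register $i initially maps to physical register p_i and that the free tag buffer holds 9, 10, 11, 12, 13 with the head at 9, as the Cycle 1 slide suggests. The function name `rename` and the tuple encoding `(op, dst, src1, src2)` are hypothetical.

```python
from collections import deque

def rename(instrs, map_table, free_tags):
    """Rename one instruction at a time; returns a list of (op, pdst, psrc1, psrc2)."""
    renamed = []
    for op, dst, src1, src2 in instrs:
        # Sources are looked up before the destination mapping changes,
        # so an instruction such as "or $5,$5,$2" reads the old mapping of $5.
        p1, p2 = map_table[src1], map_table[src2]
        pd = free_tags.popleft()  # take the tag at the head of the buffer
        map_table[dst] = pd       # later readers of dst now see the new tag
        renamed.append((op, pd, p1, p2))
    return renamed

map_table = {r: r for r in range(32)}   # assumption: $i -> p_i initially
free_tags = deque([9, 10, 11, 12, 13])  # assumption: head at 9
instrs = [
    ("sub", 5, 1, 2),  # I0: sub $5,$1,$2
    ("add", 9, 5, 4),  # I1: add $9,$5,$4
    ("or", 5, 5, 2),   # I2: or  $5,$5,$2
    ("and", 2, 9, 1),  # I3: and $2,$9,$1
]
result = rename(instrs, map_table, free_tags)
for op, pd, p1, p2 in result:
    print(f"{op} p{pd},p{p1},p{p2}")
```

The first two printed lines, sub p9,p1,p2 and add p10,p9,p4, agree with the renamed instructions that appear on the slides. The remaining two lines are one possible check for the blank fields, valid only under the initial state assumed here.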

Problems in superscalar processors
- Register renaming
  - Multiple (N) instructions must be renamed in one cycle.
  - Run the renaming logic N times faster, or
  - build the circuit so that it can handle N instructions.
[Figure: state elements separated by combinational logic; one version has state element 1, combinational logic, and state element 2 in one clock cycle, the other has state elements 1, 2, and 3 with combinational logic between them and shorter clock cycles]
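
Handling N instructions in one cycle means that every instruction in the group reads the map table as it stood at the start of the cycle, so a source that is written by an earlier instruction in the same group has to be patched by a comparison. The sketch below illustrates that idea with the sub/add pair from this lecture's two-instruction renaming example. It is not a description of any particular hardware. The function name `rename_group`, the initial identity mapping, and the free tag list are assumptions.

```python
def rename_group(group, map_table, free_tags):
    """Rename all instructions of one group as if in a single cycle.

    group: list of (op, dst, src1, src2) in program order.
    """
    # Step 1: every lookup sees the map table from the start of the cycle.
    looked_up = [(map_table[s1], map_table[s2]) for _, _, s1, s2 in group]
    # Step 2: allocate one free tag per instruction, starting at the head.
    new_tags = [free_tags.pop(0) for _ in group]
    renamed = []
    for k, (op, dst, s1, s2) in enumerate(group):
        p1, p2 = looked_up[k]
        # Step 3: compare sources with the destinations of earlier group members;
        # a match selects the newly allocated tag (the role of the MUX).
        # Scanning in program order lets the closest earlier writer win.
        for e in range(k):
            if group[e][1] == s1:
                p1 = new_tags[e]
            if group[e][1] == s2:
                p2 = new_tags[e]
        renamed.append((op, new_tags[k], p1, p2))
    # Step 4: update the map table in program order so the last writer wins.
    for k, (_, dst, _, _) in enumerate(group):
        map_table[dst] = new_tags[k]
    return renamed

map_table = {r: r for r in range(32)}  # assumption: $i -> p_i initially
free_tags = [9, 10, 11, 12, 13]        # assumption: head at 9
pair = [("sub", 5, 1, 2), ("add", 9, 5, 4)]
print(rename_group(pair, map_table, free_tags))
# [('sub', 9, 1, 2), ('add', 10, 9, 4)]
```

Without the comparison in step 3, the add would read the stale mapping p5 for $5 instead of p9. The number of comparisons grows with the square of the group size, which is one reason wide renaming is costly.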

Register renaming of two instructions
Cycle 1
I1: sub $5,$1,$2
I2: add $9,$5,$4
Register map table: entries 0 – 31
Free tag buffer: 13 12 11 10 9 (head at 9)
I1: dst = $5, src1 = $1, src2 = $2 becomes dst = p9, src1 = p1, src2 = p2
I2: dst = $9, src1 = $5, src2 = $4 becomes dst = p10, src1 = p9, src2 = p4
Register map table updates: entry 5 changes 5->9, entry 9 changes ->10
Check: I1 dst == I2 src1 ? The result of this comparison controls a MUX on I2's source.
Result:
sub p9,p1,p2
add p10,p9,p4

Out-of-order superscalar processor
[Figure: out-of-order superscalar processor with three flows labeled instruction flow, register dataflow, and memory dataflow; blocks labeled instruction cache, branch handler, fetch, decode, rename, register file, operand fetch, RS, integer, floating-point, memory, ALU, FP ALU, Adr gen., reorder buffer, store queue, data cache]

True data dependences
[Figure: a producer and a consumer instruction linked by a data dependency along the time axis]
- [...] find.
- "Avoid" violations of true data dependences.
- "Detect" violations of true data dependences.
- "Resolve" true data dependences.
- How far?

Announcements
- Lecture slides and lecture schedule
- www.arch.cs.titech.ac.jp