kl800.com省心范文网

深入理解计算机系统


2010 秋季

计算机组成原理
Zhang, Youhui (张悠慧)

zyh02@tsinghua.edu.cn

课程回顾
Topics
? ? ? ?

计算机系统结构等相关概念与范畴 数的表示 汇编语言与C语言 代码优化

计算机系统结构等相关概念与范畴

概念——计算机系统结构
? 编写出能够在机器上正确运行的系统程序所 必须了解到的 计算机系统的属性 ? 研究计算机系统软件与硬件的功能分配,确 定计算机系统 软件与硬件的分界面 ? 研究计算机系统的外部特性,即程序员所看 到的计算机系 统属性

? 程序员看到的计算机系统属性
数据表示:硬件直接认别和处理的数据类型
寻址技术:编址方式、寻址方式和定位方式 寄存器定义:寄存器定义、数量和使用规则

指令系统:指令的操作类型、格式、排序等
存储系统:要求速度高、容量大、价格便宜 中断系统:中断类型、中断级别和响应方式

输入输出系统:数据交换方式、交换过程控制
机器工作状态:定义和切换方式,如内核态、执行态、管理态和用户 态等

概念——计算机组成
? 计算机系统的逻辑实现
设计功能部件:处理器,主存储器等 数据通路的宽度 各种操作对功能部件的共享程度 确定功能部件的并行度 设计缓冲和排队策略 设计控制机构 采用何种可靠性技术

概念——汇编语言
? 用符号表示的机器语言, 可包括宏构造

概念——冯诺依曼计算机

? 特点: 存储程序、运算器为中心、集中控制
存储器是字长固定的、顺序线性编址的一维结构,每个地址是唯一定 义的
由指令形式的低级机器语言驱动 指令顺序执行,一般按照指令在存储器中存放的顺序执行,程序分支 由转移指令实现 运算器为中心,输入输出设备与存储器之间的数据传送都途经运算器

集中控制,运算器、存储器、输入输出设备的操作以及它们之间的联 系都由控制器控制

概念——处理器运算速度
? 现代处理器运算速度计算公式:
P= Fz X IPC X TPC
其中: Fz为处理机的工作主频
IPC (Instruction Per Cycle) 指令级并行度 TPC (Threading Per Cycle) 线程级并行度 例如:主频3GHz,4核Pentium4处理器的最高运算速度为: P=??3GHz X 4IPC X 4TPC = 48GIPS

即:每秒钟480亿次

? 提高处理器性能的主要途径
(1) 提高主频Fz: 增加流水线级数,依靠计算机系统结构 缩短门电路延迟时间,依靠电子技术

(2) 提高指令级并行度IPC 依靠并行算法和计算机系统结构

(3) 提高线程级并行度TPC

依靠并行算法、程序设计和计算机系统结构

? 近期出现的新问题:
线延迟大于门延迟 漏电流很大 功耗惊人

? 近期提高计算机性能的途径
只能依靠并行算法、程序设计和计算机系统结构,不能指望电子技术

? 不仅对计算机系统结构,而且对并行算法、 软件技术和计 算机应用技术都将产生深远的 影响

概念——指令执行速度

? 平均速度

概念——Amdahl定律

数的表示

Bits, Bytes, and Integers
Sizes of C Objects (in Bytes)
?

C Data Type
? char ? short ? int

Typical 32-bit
1 2 4 4 8 4 8 8 4

Intel IA32
1 2 4 4 8 4 8 10/12 4

x86-64
1 2 4 8 8 4 8 10/16 8

? long
? long long ? float ? double

? long double
? char *

? Or any other pointer

Bit-Level Operations in C
?Operations &, |, ~, ^ Available in C

Logic Operations in C
?&&, ||, !
View 0 as ―False‖ Anything nonzero as ―True‖ Always return 0 or 1 Early termination

Shift Operations
?Logical vs. Arithmetic ?Shift amount < 0 or ? word size

Signed vs. Unsigned in C
Constants
? ?

By default are considered to be signed integers Unsigned if have ―U‖ as suffix
0U, 4294967259U

Casting
?

Explicit casting between signed & unsigned same as U2T and T2U
int tx, ty; unsigned ux, uy; tx = (int) ux; uy = (unsigned) ty;

?

Implicit casting also occurs via assignments and procedure calls
tx = ux; uy = ty;

Unsigned is dangerous!

Integer C Puzzles Revisited
? ? ? ? ? ? ? ? ? ? ? ? ? x<0 ux >= 0 x & 7 == 7 ux > -1 x>y x * x >= 0 x > 0 && y > 0 x >= 0 x <= 0 (x|-x)>>31 == -1 ux >> 3 == ux/8 x >> 3 == x/8 x & (x-1) != 0 ??? ((x*2) < 0)

??? (x<<30) < 0
??? -x < -y

Initialization int x = foo(); int y = bar(); unsigned ux = x; unsigned uy = y;

??? x + y > 0 ?? -x <= 0 ?? -x >= 0

Floating Point
2i 2i–1

???
bi bi–1 ? ? ?

4 2 1 b2 b1 b0 . b–1 b–2 b–3 ? ? ? b–j 1/2 1/4 1/8 2–j

Representation
? ?

Bits to right of ―binary point‖ represent fractional powers of 2 i Represents rational number: k
k ?? j

? bk ?2

???

Floating Point Representation
Numerical Form
?

–1s M 2E
? Sign bit s determines whether number is negative or positive ? Significand M normally a fractional value in range [1.0,2.0). ? Exponent E weights value by power of two

Encoding
s
? ?

exp

frac

?

MSB is sign bit exp field encodes E frac field encodes M

Sizes

? Single precision: 8 exp bits, 23 frac bits

? Double precision: 11 exp bits, 52 frac bits ? Extended precision: 15 exp bits, 63 frac bits

“Normalized” Numeric Values
Condition
?

exp ? 000…0 and exp ? 111…1

Exponent coded as biased value
E = Exp – Bias
? Exp : unsigned value denoted by exp ? Bias : Bias value

?Single precision: 127 (Exp: 1…254, E: -126…127) ?Double precision: 1023 (Exp: 1…2046, E: -1022…1023) ?in general: Bias = 2e-1 - 1, where e is number of exponent bits

Significand coded with implied leading 1
M = 1.xxx…x2
? xxx…x: bits of frac ? Minimum when 000…0 (M = 1.0) ? Maximum when 111…1 (M = 2.0 – ?) ? Get extra leading bit for ―free‖

Denormalized Values
Condition
?

Condition
?

exp = 000…0

exp = 111…1

Value
? ?

Cases
?

Exponent value E = –Bias +1 Significand value M = 0.xxx…x2
? xxx…x: bits of frac

exp = 111…1, frac = 000…0
? Represents

value???(infinity)
? Operation that overflows ? Both positive and

Cases
?

exp = 000…0, frac = 000…0
? Note that have distinct

negative
?

exp = 111…1, frac ? 000…0
? Not-a-Number (NaN)

values +0 and –0
?

exp = 000…0, frac ? 000…0

s exp 0 0 0 … 0 0 0 0 … 0 0 0 0 0 … 0 0 0

frac

E -6 -6 -6 -6 -6 -6 -6 -1 -1 0 0 0 7 7 n/a

Value 0 1/8*1/64 = 1/512 2/8*1/64 = 2/512 6/8*1/64 7/8*1/64 8/8*1/64 9/8*1/64 14/8*1/2 15/8*1/2 8/8*1 9/8*1 10/8*1 = = = = = = = = = 6/512 7/512 8/512 9/512 14/16 15/16 1 9/8 10/8

Denormalized numbers

0000 000 0000 001 0000 010 0000 0000 0001 0001 0110 0110 0111 0111 0111 110 111 000 001 110 111 000 001 010

closest to zero

largest denorm smallest norm

closest to 1 below

Normalized numbers

closest to 1 above

1110 110 1110 111 1111 000

14/8*128 = 224 15/8*128 = 240 inf

largest norm

Round-To-Even
Binary Fractional Numbers
? ?

―Even‖ when least significant bit is 0 Half way when bits to right of rounding position = 100…2

Examples
Round to nearest 1/4 (2 bits right of binary point) Value Binary Rounded Action 2 3/32 10.000112 10.002 (<1/2—down) 2 3/16 10.001102 10.012 (>1/2—up) 2 7/8 10.111002 11.002 (1/2—up) 2 5/8 10.101002 10.102 (1/2—down)
?

Rounded Value 2 2 1/4 3 2 1/2

Floating Point in C
C Guarantees Two Levels
float double single precision double precision

Conversions
? ?

Casting between int, float, and double changes numeric values Double or float to int
? Truncates fractional part ? Like rounding toward zero ? Not defined when out of range or NaN

? Generally sets to Tmin or Tmax
?

int to double
? Exact conversion, as long as int has ≤ 53 bit word size

?

int to float
? Will round according to rounding mode

Floating Point Puzzles
?

For each of the following C expressions, either:
? Argue that it is true for all argument values ? Explain why not true

? x == (int)(float) x

int x = …;
float f = …; double d = …;

? x == (int)(double) x ? f == (float)(double) f ? d == (float) d ? f == -(-f); ? 2/3 == 2/3.0

Assume neither d nor f is NaN

? d < 0.0
? d > f

???
???

((d*2) < 0.0)
-f > -d

? d * d >= 0.0 ? (d+f)-d == f

汇编与C语言

movl Operand Combinations
Source Imm movl Dest Src,Dest C Analog
temp = 0x4; *p = -147;

Reg movl $0x4,%eax Mem movl $-147,(%eax)

Reg

Reg movl %eax,%edx Mem movl %eax,(%edx) Reg
movl (%eax),%edx

temp2 = temp1;
*p = temp;

Mem

temp = *p;

Cannot do memory-memory transfer with a single instruction

Indexed Addressing Modes
Most General Form D(Rb,Ri,S)
? ? ?

Mem[Reg[Rb]+S*Reg[Ri]+ D]

D: Constant “displacement” Rb: Base register: Any of 8 integer registers Ri: Index register: Any, except for %esp ?Unlikely you’d use %ebp, either

?

S:

Scale: 1, 2, 4, or 8

Special Cases (Rb,Ri) Mem[Reg[Rb]+Reg[Ri]]

D(Rb,Ri)
(Rb,Ri,S)

Mem[Reg[Rb]+Reg[Ri]+D]
Mem[Reg[Rb]+S*Reg[Ri]]

Address Computation Instruction
leal Src,Dest
? Src

is address mode expression ? Set Dest to address denoted by expression

Uses
? Computing
? Computing

addresses without a memory reference
arithmetic expressions of the form x + k*y

? E.g., translation of p = &x[i];

? k = 1, 2, 4, or 8.

x86-64 General Purpose Registers
%rax %eax %r8 %r8d

%rdx
%rcx

%edx
%ecx

%r9
%r10

%r9d
%r10d

%rbx
%rsi %rdi %rsp %rbp
? ?

%ebx
%esi %edi %esp %ebp

%r11
%r12 %r13 %r14 %r15

%r11d
%r12d %r13d %r14d %r15d

Extend existing registers. Add 8 new ones. Make %ebp/%rbp general purpose

Swap in 64-bit Mode
void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; }

swap: movl movl movl movl retq

(%rdi), %edx (%rsi), %eax %eax, (%rdi) %edx, (%rsi)

?

Operands passed in registers
? First (xp) in %rdi, second (yp) in %rsi ? 64-bit pointers

? ?

No stack operations required 32-bit data
? Data held in registers %eax and %edx ? movl operation

Reading Condition Codes
SetX Instructions
?

Set single byte based on combinations of condition codes

SetX
sete setne sets setns setg setge setl setle seta setb

Condition
ZF ~ZF SF ~SF ~(SF^OF)&~ZF ~(SF^OF) (SF^OF) (SF^OF)|ZF ~CF&~ZF CF

Description
Equal / Zero Not Equal / Not Zero Negative Nonnegative Greater (Signed) Greater or Equal (Signed) Less (Signed) Less or Equal (Signed) Above (unsigned) Below (unsigned)

Conditional Branch Example
int absdiff( int x, int y) { int result; if (x > y) { result = x-y; } else { result = y-x; } return result; } absdiff: pushl movl movl movl cmpl jle subl movl .L8: leave ret .L7: subl jmp %ebp %esp, %ebp 8(%ebp), %edx 12(%ebp), %eax %eax, %edx .L7 %eax, %edx %edx, %eax

Set Up

Body1

Finish

%edx, %eax .L8

Body2

New Conditional Branch Example
pushl %ebp

movl %esp, %ebp
pushl %ebx

movl 8(%ebp), %ecx

movl 12(%ebp), %edx
movl %ecx, %ebx subl %edx, %ebx movl %edx, %eax

subl %ecx, %eax
cmpl %edx, %ecx cmovg %ebx, %eax

int absdiff( int x, int y) { int result; if (x > y) { result = x-y; } else { result = y-x; } return result; }

Gcc 4.3.4

popl %ebx popl %ebp ret

Implementing Loops
IA32
?

All loops translated into form based on “do-while”
Also make use of “jump to middle”

x86-64
?

Why the Difference
? ?

IA32 compiler developed for machine where all operations costly x86-64 compiler developed for machine where unconditional branches incur (almost) no overhead

“For”? “While”? “Do-While”
For Version
for (Init; Test; Update ) Body

While Version
Init; while (Test ) { Body Update ; }

Do-While Version
Init; if (!Test) goto done; do { Body Update ; } while (Test) done:

Goto Version
Init; if (!Test) goto done; loop: Body Update ; if (Test) goto loop; done:

“For”? “While” (Jump-to-Middle)
For Version
for (Init; Test; Update ) Body

While Version
Init; while (Test ) { Body Update ; }

Goto Version
Init; goto middle; loop: Body Update ; middle: if (Test) goto loop; done:

Switch Statements
Implementation Options
?

Series of conditionals
? Organize in tree structure ? Logarithmic performance

?

Jump Table
? Lookup branch target ? Constant time ? Possible when cases are small integer constants

?

GCC
? Picks one based on case structure

IA32-Stack Frames
Contents
? ?

?

Local variables Return information Temporary space

yoo who amI

Management
?

Space allocated when enter procedure
? “Set-up” code

?

Deallocated when return
? “Finish” code

Frame Pointer %ebp proc Stack Pointer %esp Stack ―Top‖

Pointers
? ?

Stack pointer %esp indicates stack top Frame pointer %ebp indicates start of current frame

IA32/Linux Stack Frame
Current Stack Frame (“Top” to Bottom)
?

Parameters for function about to call
? “Argument build”

Caller Frame
Arguments Frame Pointer (%ebp) Return Addr Old %ebp Saved Registers + Local Variables Stack Pointer (%esp)

?

Local variables
? If can’t keep in registers

?
?

Saved register context Old frame pointer

Caller Stack Frame
?

Return address
? Pushed by call instruction

?

Arguments for this call

Argument Build

IA32/Linux Register Usage
Integer Registers
?

Two have special uses
%ebp, %esp

%eax

?

Three managed as callee-save
%ebx, %esi, %edi ? Old values saved on stack prior to using

Caller-Save Temporaries

%edx %ecx %ebx

Callee-Save Temporaries

%esi %edi %esp

?

Three managed as callersave
%eax, %edx, %ecx ? Do what you please, but expect any callee to do so, as well Special

%ebp

?

Register %eax also stores returned value

x86-64 Register Conventions
%rax Return Value Callee Saved Argument #4 Argument #3 Argument #2 Argument #1 Stack Pointer Callee Saved %r8

Argument #5
Argument #6 Callee Saved Used for linking C: Callee Saved Callee Saved

%rbx
%rcx

%r9
%r10

%rdx
%rsi %rdi %rsp %rbp

%r11
%r12 %r13 %r14 %r15

Callee Saved
Callee Saved

x86-64 Registers
Arguments passed to functions via registers
? ?

If more than 6 integral parameters, then pass rest on stack These registers can be used as caller-saved as well Eliminates need to update %ebp

All References to Stack Frame via Stack Pointer
?

Other Registers
? ?

6+1 callee saved 2 or 3 have special uses

x86-64 Locals in the Red Zone
/* Swap, using local array */ void swap_a(long *xp, long *yp) { volatile long loc[2]; loc[0] = *xp; loc[1] = *yp; *xp = loc[1]; *yp = loc[0]; }

Avoiding Stack Pointer Change
?

swap_a: movq movq movq movq movq movq movq movq ret

(%rdi), %rax %rax, -24(%rsp) (%rsi), %rax %rax, -16(%rsp) -16(%rsp), %rax %rax, (%rdi) -24(%rsp), %rax %rax, (%rsi)

Can hold all information within small window beyond stack pointer

rtn Ptr ?8 unused ?16 loc[1] ?24 loc[0]

%rsp

x86-64 NonLeaf without Stack Frame
long scount = 0; /* Swap a[i] & a[i+1] */ void swap_ele_se (long a[], int i) { swap(&a[i], &a[i+1]); scount++; }
?

?

No values held while swap being invoked No callee save registers needed

swap_ele_se: movslq %esi,%rsi # Sign extend i leaq (%rdi,%rsi,8), %rdi # &a[i] leaq 8(%rdi), %rsi # &a[i+1] call swap # swap() incq scount(%rip) # scount++; ret

x86-64 Call using Jump
long scount = 0; /* Swap a[i] & a[i+1] */ void swap_ele (long a[], int i) { swap(&a[i], &a[i+1]); }
?

When swap executes ret, it will return from swap_ele Possible since swap is a “tail call”

?

swap_ele: movslq %esi,%rsi # Sign extend i leaq (%rdi,%rsi,8), %rdi # &a[i] leaq 8(%rdi), %rsi # &a[i+1] jmp swap # swap()

Interesting Features of Stack Frame
Allocate Entire Frame at Once
? ? ?

All stack accesses can be relative to %rsp Do by decrementing stack pointer Can delay allocation, since safe to temporarily use red zone

Simple Deallocation
?

Increment stack pointer

Basic Data Types
Integral
? ?

Stored & operated on in general registers Signed vs. unsigned depends on instructions used
Intel byte word double word quad word GAS b w l q Bytes 1 2 4 8 C [unsigned] [unsigned] [unsigned] [unsigned] char short int long int (x86-64)

Floating Point
?

Stored & operated on in floating point registers
Intel Single Double Extended GAS s l t Bytes 4 8 10/12/16 C float double long double

Array Allocation
Basic Principle
T A[L];
?

?
?

Array of data type T and length L Contiguously allocated region of L * sizeof(T) bytes Identifier A can be used as a pointer to array element 0 Type T*

Viewing as Multidimensional Array
Declaration
T A[R][C];
? ? ?

A[0][0] ? ? ?

? ? ?

A[0][C-1] ? ? ?

2D array of data type T R rows, C columns Type T element requires K bytes

A[R-1][0] ? ? ? A[R-1][C-1]

Array Size
?

R * C * K bytes
int A[R][C]; A [1] ? [C-1] 4*R*C Bytes A ? ? ? [R-1] [0] A [R-1] [C-1]

Arrangement
?

Row-Major Ordering
A A [0] [1] ? ? ? [C-1] [0]

A ? ? [0] ? [0]

?

?

Nested Array Row Access
Row Vectors
? ? ?

A[i] is array of C elements Each element of type T requires K bytes Starting address A + i * (C * K)

int A[R][C];
A[0] A [0] ??? [0] A A [0] ? ? ? [C-1] A [i] ??? [0] A+i*C*4 A[i] A [i] ? ? ? [C-1] A[R-1] A [R-1] ??? [0] A+(R-1)*C*4 A [R-1] [C-1]

Nested Array Element Access
Array Elements
? ?

A[i][j] is element of type T Address A + i * (C * K) + j * K
= A + (i * C + j)* K

A [i] [j]

int A[R][C]; A[0] A [0] ??? [0] A A [0] ? ? ? [C-1] A[i] A [i] [j] A [R-1] ??? [0] A+(R-1)*C*4 A[R-1] A [R-1] [C-1]

???

???

? ? ?

A+i*C*4 A+(i*C+j)*4

Nested Array Element Access Code
Array Elements
? ?

pgh[index][dig] is int Address:
pgh + 20*index + 4*dig

IA32 Code
?
?

int get_pgh_digit (int index, int dig) { return pgh[index][dig]; }

Computes address
pgh + 4*dig + 4*(index+4*index)

movl performs memory reference

# %ecx = dig # %eax = index leal 0(,%ecx,4),%edx leal (%eax,%eax,4),%eax movl pgh(%edx,%eax,4),%eax

# 4*dig # 5*index # *(pgh + 4*dig + 20*index)

Array Element Accesses
?

Similar C references

?

Different address computation

Nested Array
int get_pgh_digit (int index, int dig) { return pgh[index][dig]; } Element at Mem[pgh+20*index+4*dig]
?

Multi-Level Array
int get_univ_digit (int index, int dig) { return univ[index][dig]; } Element at Mem[Mem[univ+4*index]+4*dig]
?
cmu univ 160 164 168 16 0 40 9 60 4 64 1 20 2 44 7 68 5 24 1 48 2 72 2 28 3 52 0 76 1 32 9 56 3 36

1 5 2 0 6 1 5 2 1 3 1 5 2 1 7 1 5 2 2 1 76 96 116 136 156

36 16 56

mit

ucb 36 56

Using Nested Arrays
Strengths
? ?

#define N 16 typedef int fix_matrix[N][N]; /* Compute element i,k of fixed matrix product */ int fix_prod_ele (fix_matrix a, fix_matrix b, int i, int k) { int j; int result = 0; for (j = 0; j < N; j++) result += a[i][j]*b[j][k]; return result; }

C compiler handles doubly subscripted arrays Generates very efficient code
? Avoids multiply in index

computation

Limitation
?

Only works if have fixed array size
(*,k) (i,*)

Row-wise

A

B

Column-wise

Dynamic Nested Arrays
Strength
int * new_var_matrix(int n) { ? Can create matrix of arbitrary return (int *) size calloc(sizeof(int), n*n); }
?

Programming

Must do index computation explicitly

Performance
? ?

Accessing single element costly Must do multiplication
movl 12(%ebp),%eax movl 8(%ebp),%edx imull 20(%ebp),%eax addl 16(%ebp),%eax movl (%edx,%eax,4),%eax

int var_ele (int *a, int i, int j, int n) { return a[i*n+j]; } # # # # # i a n*i n*i+j Mem[a+4*(i*n+j)]

Optimizing Dynamic Array Mult.
{

Optimizations
?

Performed when set optimization level to -O2

Code Motion
?

int j; int result = 0; for (j = 0; j < n; j++) result += a[i*n+j] * b[j*n+k]; return result; }

Expression i*n can be computed outside loop Incrementing j has effect of incrementing j*n+k by n

{
int j; int result = 0; int iTn = i*n; int jTnPk = k; for (j = 0; j < n; j++) { result += a[iTn+j] * b[jTnPk]; jTnPk += n; } return result; }

Strength Reduction
?

Performance
?

Compiler can optimize regular access patterns

Satisfying Alignment with Structures
Offsets Within Structure
?

Must satisfy element’s alignment requirement

Overall Structure Placement
?

Each structure has alignment requirement K
? Largest alignment of any element

struct S1 { char c; int i[2]; double v; } *p;

?

Initial address & structure length must be multiples of K K = 8, due to double element
c p+0 i[0] p+4 i[1] p+8 v p+16 p+24

Example (under Windows or x86-64):
?

Multiple of 4

Multiple of 8

Multiple of 8

Multiple of 8

Specific Cases of Alignment (IA32)
Size of Primitive Data Type:
? ? ? ?

1 byte (e.g., char)
? no restrictions on address

2 bytes (e.g., short)
? lowest 1 bit of address must be 02

4 bytes (e.g., int, float, char *, etc.)
? lowest 2 bits of address must be 002

8 bytes (e.g., double)
? Windows (and most other OS’s & instruction sets):

?lowest 3 bits of address must be 0002 ? Linux: ?lowest 2 bits of address must be 002 ?i.e., treated the same as a 4-byte primitive data type
?

12 bytes (long double)
? Windows, Linux:

?lowest 2 bits of address must be 002 ?i.e., treated the same as a 4-byte primitive data type

Specific Cases of Alignment (x86-64)
Size of Primitive Data Type:
? ? ? ?

1 byte (e.g., char)
? no restrictions on address

2 bytes (e.g., short)
? lowest 1 bit of address must be 02

4 bytes (e.g., int, float)
? lowest 2 bits of address must be 002

8 bytes (e.g., double, char *)
? Windows & Linux:

?lowest 3 bits of address must be 0002
?

16 bytes (long double)
? Linux:

?lowest 3 bits of address must be 0002 ?i.e., treated the same as a 8-byte primitive data type

Union Allocation
Principles
? ?

?

Overlay union elements Allocate according to largest element Can only use one field at a time
union U1 { char c; int i[2]; double v; } *up;

c i[0] v
up+0 up+4

i[1] up+8

struct S1 { char c; int i[2]; double v; } *sp; c sp+0 sp+4 i[0]

(Windows alignment) i[1] sp+8 v sp+16 sp+24

Advanced Topics
? ? ?

Linux Memory Layout Understanding Pointers Buffer Overflow

BF

Stack

IA32 Linux Memory Layout
Stack
?

Runtime stack (8MB limit) Dynamically allocated storage When call malloc(), calloc(), new() Statically allocated data E.g., arrays & strings declared in code Executable machine instructions Read-only

Heap
?

?

Data
? ?

Upper 2 hex digits of address

Text
? ?

08 00

Heap Data Text

C pointer declarations
int *p int *p[13] int *(p[13]) int **p p is a pointer to int p is an array[13] of pointer to int p is an array[13] of pointer to int p is a pointer to a pointer to an int

int (*p)[13]
int *f()

p is a pointer to an array[13] of int
f is a function returning a pointer to int

int (*f)()
int (*(*f())[13])()

f is a pointer to a function returning int
f is a function returning ptr to an array[13] of pointers to functions returning int

int (*(*x[3])())[5]

x is an array[3] of pointers to functions returning pointers to array[5] of ints

Malicious Use of Buffer Overflow
Stack after call to gets()
void foo(){ bar(); ... } int bar() { char buf[64]; gets(buf); ... return ...; }
? ?

return address A

foo stack frame data written by gets()
B

B

pad
exploit code bar stack frame

?

Input string contains byte representation of executable code Overwrite return address with address of buffer When bar() executes ret, will jump to exploit code


深入理解计算机系统LAB1实验报告

深入理解计算机系统LAB1实验报告_院校资料_高等教育_教育专区。LAB1 实验报告语法检查: 正确性检查: 1. bitAnd 源代码: return ~(~x|~y); 思路: 可以直接...

深入理解计算机系统(第二版) 家庭作业答案

深入理解计算机系统(第二版) 家庭作业 第三章 3.54 int decode2(int x, int y, int z) { int ret; z -= y; //line 2 ret = z; //line 3 ret...

《深入理解计算机系统》-读后感

[《深入理解计算机系统》-读后感] 本书的最大优点是为程序员描述计算机系统的实现细节,帮助其在大脑中构造一个层次型的 计算机系统,从最底层的数据在内存中的表示...

深入理解计算机系统(第二版)-家庭作业答案

深入理解计算机系统(第二版) 家庭作业 第三章 3.54 int decode2(int x, int y, int z) { int ret; z -= y; //line 2 ret = z; //line 3 ret...

深入理解计算机系统LAB2

深入理解计算机系统LAB2_电脑基础知识_IT/计算机_专业资料。LAB1 实验报告实验目的: 使用课程知识拆除一个“Binary Bombs”来增强对程序的机器级表示、汇编 语言、...

深入理解计算机系统配套练习卷

深入理解计算机系统配套练习卷_理学_高等教育_教育专区。《深入》题目 S141000825 李永伟 第一章题目 1.1.1_25_1 我们通常所说的“字节”由___个二进制位构成...

《深入了解计算机系统》读后感

深入了解计算机系统》读后感_计算机软件及应用_IT/计算机_专业资料。《深入了解计算机系统》读后感 13 级电商 1 班 梁小嵘 《深入理解计算机系统》一书是由美国...

深入理解计算机系统勘误_图文

深入理解计算机系统,第二版 Computer Systems: A Programmer's Perspective, Second Edition (CS:APP2e) 发现日期 页码行号 原文 修改后 分时(time-sharing) 张岩...

深入理解计算机系统(第二版)-家庭作业答案

深入理解计算机系统(第二版)-家庭作业答案_电脑基础知识_IT/计算机_专业资料。深入理解计算机系统(第二版) 家庭作业 2.55-2.57 略 2.58 int is_little_endian...

深入理解计算机操作系统笔记

深入理解计算机操作系统笔记_电脑基础知识_IT/计算机_专业资料。自己整理的深入理解计算机系统基础知识 深入理解计算机系统第一章 计算机系统漫游 信息就是位+上下文 源...