Allocators in C is provided by standard C library, “malloc package”;
#includevoid*malloc(size_t size);voidfree(void*p);void*calloc(size_t nmemb, size_t size);void*realloc(void*ptr, size_t size);void*reallocarray(void*ptr, size_t nmemb, size_t size);#includeintbrk(void*addr);void*sbrk(intptr_t increment);/********** annotation **********/brk()andsbrk() change the location of the program break, which defines
the end of the process's data segment(i.e., the program break is the first
location after the end of the uninitialized data segment). Increasing the
program break has the effect of allocating memory to the process; decreasing
the break deallocates memory.brk() sets the end of the data segment to the value specified by addr, when
that value is reasonable, the system has enough memory,and the process does
not exceed its maximum data size(see setrlimit(2)).sbrk() increments the program's data space by increment bytes. Calling sbrk()
with an increment of 0 can be used to find the current location of
the program break.
Returns the block pointed at by p to pool of available memory;
p must come from a previous call to malloc、calloc ( 公开课上发音 “c” “alloc” ) or realloc;
11.1.2 How to implement malloc and free ?
Assumption
内存 以 word 为最小单元编址,( on x86 ) word = 4 bytes;
malloc、free 入参单位为 word;
Constraints
Application 调用 malloc 和 free 的顺序随机;
free 的入参必须是 指向已分配内存块的 指针;
请求的 块大小 或 数量 不可控;
请求 必须立即响应 ( 不可以 囤一堆请求一次性响应,或 重排请求的先后顺序 ) ;
只能用 free memory 响应请求,一旦分配 allocators 再也无权访问;
allocator can’t compress or smoosh them all together to create larger free blocks;
大小对齐 8-byte (x86) or 16-byte (x86-64);
Performance Goal Allocators combine a trade-off of both running time ( speed ) and space. So these speed and memory efficiency metrics are defined;
Troughput Number of completed requests per unit time,eg:5000 malloc calls + 5000 free calls in 10 sec,throughput is 1000 ops/sec;
Peak Memory Utilization (
U
k
U_k
Uk )
U
k
=
m
a
x
(
∑
i
<
k
P
i
)
H
k
U_k=\frac{max( \sum\limits_{iUk=Hkmax(i<k∑Pi) 【Payload】malloc ( p ),result in a block with a payload of p bytes; 【Aggregate payload】
P
k
P_k
Pk,the sum of currently allocated payload after request
R
1
,
R
2
.
.
.
R
k
R_1,R_2 ... R_k
R1,R2...Rk has completed; 【Current heap size】
H
k
H_k
Hk;
How to deal with extra space when requiring blocks smaller than free blocks it is placed in ?
How to choose a specific block among so many free blocks ?
How to reinsert freed blocks ?
11.1.3 Fragmentation
Internal Fragmentation
payload is smaller than block size;
like padding for alignment,overhead for maintaining,explicit policy ( don’t want to splinter blocks up into little chunks );
depends only on the pattern of previous requests,easy to measure;
External Fragmentation
aggregate heap memory is enough,but no single free block satisfied request;
depends on the pattern of future requests,difficult to measure;
11.2 Implicit free lists
Knowing How Much to Free
standard method 把 size 藏在 block 开头的的 word 中,这个 word 被称为 header field 或者 header;
Keeping Track of Free Blocks
【1】Implicit list Traversing all free blocks in heap by traversing all blocks,ignoring allocated blocks; There is no list of free blocks,but the headers in blocks can tell; 与 blocks 的总数线性相关;
【2】Explicit list Use some of the words in the block to create singly or doubly linked lists; 与 free blocks 的总数线性相关,效率略有提升;
【3】Segregated free list Different free lists for different size classes;
【4】Blocks sorted by size Use a balanced tree (e.g. Red-Black tree) with pointers and the length used as a key
11.2.1 Implicit List
Each block needs size and allocation status,standard trick:
Blocks has to be aligned,low-order bits are always 0,use it as a allocated / free flag;
【First Fit】Search from beginning,choose first free block that fits;It can cause “splinters” at beginning of list;Take linear time in total number of blocks;
【Next Fit】Search from where previous search finished;Obviously faster than first fit ( avoid re-scanning unhelpful blocks );fragmentation maybe worse;
【Best Fit】Search list and choose the best free block ( closest to size required ) ;less fragmentation but slower than first fit;infinite segregated free list is one way to approach best fit;
Allocating in Free Block
Split the block if size required is smaller than free block,or just return the whole block ?
voidaddblock(void*p,int len){// arithmetic shift, round up to even(远零取整),两字节对齐int newsize =((len +1)>>1)<<1;int oldsize =*p &-2;// mask out status bit*p = newsize |1;// set length and allocted status of new blockif(newsize < oldsize)// set length of remaining free block*(p + newsize)= oldsize - newsize;}void*malloc(int len){void*p =NULL;
p =search_free_block();if(p)addblock(p,len +4);// header size = 4 Bytesreturn p +1;// (p + 4 bytes) addresse of payload }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
Freeing a Block
only clear “allocated” status ? Generating external fragmentation !!
Adjacent free blocks need to be coalesced
All decent allocator is that contiguous free blocks never exist;
voidfree_block(ptr p){*p =*p &-2;// clear allocated flag
next = p +(*p)/4;// find next block headerif((*next &1)==0)// if not allocated*p =*p +*next;// add to this block}
1
2
3
4
5
6
7
How to Coalescing previous block
remember the previous block while traverse the list,very inefficient !
Bidirectional Coalescing
Boundary tags [ Don Kunth 1973 ]; Replicate header at end of free blocks ( a.k.a. “footer” ),这样 free§ 的时候,p-1 便是 前一个 block 的 “footer”(即 “header”),从而判断是否与前一个 block 进行 Coalescing;缺点是 overhead 变大;
Common in many dynamic languages; Python, Ruby, Java, Perl, ML, Lisp, Mathematica;
Variants (“conservative” garbage collectors) exist for C and C++ Allocator can’t determine whether these blocks are indeed garbage;
voidfoo(){int*p =malloc(128);// garbagereturn;}
1
2
3
4
11.5.1 Garbage Collection
Which memory can be freed?
扫描内存,辨别指针及被其指向的块,没有被指向的块,认为是 Garbage;
Assumptions about pointers
Memory Manager 需要能分辨出哪些变量是指针;
假设 指针都指向 块的开头(反例如 指向数组中元素的指针);
指针必须是静态不可变的?
Classical GC Algorithms
Mark-and-sweep collection (McCarthy, 1960)
Reference counting (Collins, 1960)
Copying collection (Minsky, 1963)
Generational Collectors (Lieberman and Hewitt, 1983)
Collection based on lifetimes;
Most allocations become garbage very soon;
So focus reclamation work on zones of memory recently allocated;
More information “Garbage Collection: Algorithms for Automatic Dynamic Memory”, John Wiley & Sons, 1996.
Mark-and-sweep collection
Each block is a node in graph;
Each pointer is a edge in the graph;
Locations not in the heap that contain pointers into the heap are calld root nodes (e.g. registers, locations on the stack, global variables);
A node is reachable if there is a path from any root to that node; Non-reachable nodes are garbage;
Implement on top of malloc / free package
Allocate using malloc until you “run out of space”;
When out of space,use extra mark bit in the head of each block;
Sub phase 1,mark,start from all roots and set mark bit on each reachable block( depth-first traversal );
Sub phase 2,scan all blocks,and free blocks that are not marked;
A Simple Implementation
ptr mark(ptr p)// pointer to the payload{if(false==is_ptr(p))return;// do nothing if not pointerif(true==markBitSet(p))return;// check if already markedsetMarkBit(p);// set the mark bitfor(i=0; i <length(p); i++)// mark() recursivelymark(p[i]);return;}
ptr sweep(ptr p, ptr end){while(p < end)// scan the whole heap{if(true==markBitSet(p))clearMarkBit();// 正在被引用的块elseif(true==allocateBitSet(p))free(p);// 没有被引用的块
p +=length(p);// 将 p 指向 next block}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
How to find the beginning of the block
In C,pointers can point to the middle of a block;
Use a balanced binary tree to keep track of all allocated blocks (key is start-of-block);
Balanced-tree pointers can be stored in header (use two additional words);
Search the binary tree,a pointer should fall in the beginning and end of some allocated block,and that block is reachable;
It’s conservative,because it maybe just an integer,but purportedly points to non-reachable blocks;
11.6 Memory-related perils and pitfalls
Errors involving memory are the worst kind of bugs to try to find out,because they are distant in both space and time;You only find out about those errors when you try to reference that data;
Dereferencing bad pointers
Reading uninitialized memory
Overwriting memory
Referencing nonexistent variables
Freeing blocks multiple times
Referencing freed blocks
Failing to free blocks
C pointers,只要根据声明时 运算符的优先级,就不会认错类型;
declarations
type of p
int *p
pointer to int
int *p[13]
an array[13] of pointer to int
int *(p[13])
an array[13] of pointer to int
int (*p)[13]
a pointer to an array[13] of int
int (*(*f())[13])()
a function returning ptr to an array[13]of pointers to functions returning int
int (*(*x[3])())[5]
an array[3] of pointers to functions returning pointers to array[5] of ints
*x[3],[]优先级高,x 是 size 为 3 的数组,其左是*,数组的元素是指针; *(*x[3])(),数组元素(指针)指向函数,函数的返回值也是指针; int (*(*x[3])())[5],函数返回的指针,指向 size 为 5 的 int 数组;