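[Linux Basics] Processes (Part 3)

9. Address Space

9.1 Program Address Space

The program address space as seen by C/C++

The value is modified, yet the address stays the same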

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int g_val = 100;

int main(void)
{
    pid_t id = fork();
    if (id == 0)
    {
        // child: changes g_val partway through the loop
        int cnt = 5;
        while (cnt--)
        {
            printf("I am child, times: %d, g_val = %d, &g_val = %p\n", cnt, g_val, (void *)&g_val);
            sleep(1);
            if (cnt == 3)
            {
                printf("############child change val##############\n");
                g_val = 200;
                printf("##################done####################\n");
            }
        }
    }
    else
    {
        // parent: keeps printing its own view of g_val (stop it with Ctrl + C)
        while (1)
        {
            printf("I am parent, g_val = %d, &g_val = %p\n", g_val, (void *)&g_val);
            sleep(1);
        }
    }

    return 0;
}

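After the child modifies the variable, the parent still prints the old value, yet both processes print the same address for two variables that hold different values.

If these were real physical addresses, this could not happen.

The addresses a program uses are virtual addresses, not physical ones. The kernel describes each process's virtual addresses with a struct mm_struct, which is called the process address space.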

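9.2 Process Address Space

The address space lets every process see memory from one uniform point of view.

An address space is essentially a kernel data type, struct mm_struct { // process address space }. On a 32-bit machine it covers 4 GB.

Every process assumes the address space is laid out over the full 4 GB, that is, every process believes it owns 4 GB.

When the address space is divided into regions, each linear position within it is called a virtual address.

Page table + MMU: the MMU translates a virtual address into a physical address by looking it up in the page table.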

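const char* str = "hello world";

*str = 'H'; ❌

Data stored in the string-constant region cannot be modified; fundamentally, this is because the OS grants only read (r) permission on that region.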

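Why have an address space?

By adding a software layer, the OS can manage the risk (permissions) of every memory access a process makes; the underlying goal is to protect physical memory and the data of each process
It cleanly separates, in time, requesting memory from using memory: the virtual address space hides the underlying allocation, so that a process reading and writing memory and the OS managing memory are decoupled in software
From the point of view of the CPU and the application layer, every process can be treated as uniformly using a 4 GB space, and the relative position of each region within it is fairly fixed

The OS is designed this way to achieve one goal: every process believes it has the system's resources to itself (processes are independent)

This explains the behavior seen in 9.1

Before g_val is changed, both page tables point to the same physical memory, the parent's g_val
Once it is changed, copy-on-write takes place: a new piece of physical memory is allocated and the child's page table now points to the child's own g_val
The virtual address, however, does not change

A child process is created using the parent as a template, and parent and child generally share the same code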

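#include <stdio.h>

int main(void)
{
    const char *p = "hello";
    const char *q = "hello";

    // typically prints the same address twice: compilers usually merge identical literals
    printf("%p\n%p\n", (const void *)p, (const void *)q);

    return 0;
}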

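So read-only data generally exists as a single copy, because maintaining one copy costs the operating system the least

9.3 Layout of the Address Space

[Figure: layout of the process address space]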

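10. Process Control

10.1 Process Creation: fork()

A first look at fork

When a process calls fork and control passes to the fork code in the kernel, the kernel:

Allocates new memory blocks and kernel data structures for the child
Copies part of the parent's data structures into the child
Adds the child to the system's process list
Returns from fork, after which the scheduler starts scheduling

Note that after fork, execution continues from the statement following fork; the code before fork is not executed again

So before fork the parent runs alone, and after fork parent and child run as two separate flows of execution. Which of them runs first after fork is decided entirely by the scheduler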

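Copy-on-write

Normally parent and child share code, and as long as neither of them writes, they share data too. When either side attempts to write, a private copy is made by copy-on-write

When a process attempts to write to a shared page, a page fault occurs: the writing process is paused while the kernel performs the copy, and then it continues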

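10.2 Process Termination

Process exit scenarios

The code runs to completion and the result is correct
The code runs to completion and the result is incorrect
The code terminates abnormally

Common ways for a process to exit

Normal termination (the exit code can be viewed with echo $?)

Returning from main (the return value of main is the process's exit code; 0: correct result, non-zero: incorrect result)
A return from main means the process exits
A return from any other function only means that function returns

Calling void exit(int status) from stdlib.h
Calling it anywhere terminates the process; the argument status is the exit code

Calling void _exit(int status) from unistd.h
As mentioned earlier, data printed by printf without a '\n' is held temporarily in the output buffer; exit and returning from main both cause the buffers to be flushed
_exit terminates the process immediately and skips the usual cleanup, such as flushing user-space buffers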

In C, the function that converts an error code into a descriptive string is strerror, declared in string.h

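Abnormal exit

The program crashes (for example, division by zero)
Ctrl + C: terminated by a signal

After an abnormal exit, the program's exit code is no longer meaningful

When a process exits, the system has one fewer process: the PCB, the mm_struct, the page tables and their mappings, the code and data, and any memory the process had allocated are all released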

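10.3 Process Waiting

After calling fork, the parent needs to wait for the child to exit using wait/waitpid

Why the parent should wait:

By collecting the child's exit information, the parent learns the result of the child's execution
It guarantees ordering: the child exits first and the parent exits afterwards
An exiting process first enters the zombie state, which causes a memory leak if left alone, so the parent must call wait to release the resources the child still holds
Moreover, once a process is a zombie it is invulnerable: even kill -9 cannot help, because nobody can kill a process that is already dead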

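The `wait` function

Headers: #include <sys/types.h> and #include <sys/wait.h>

pid_t wait(int *status);

Return value:

On success, returns the pid of the child that was waited for; on failure, returns -1.

Parameter:

An output parameter that receives the child's exit status; pass NULL if you do not care about it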

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t id = fork();
    if (id == 0)
    {
        // child
        int cnt = 5;
        while (cnt)
        {
            printf("child[%d] is running, cnt is: %d\n", (int)getpid(), cnt);
            cnt--;
            sleep(1);
        }
        exit(0);
    }

    sleep(10);
    printf("father begin wait\n");
    // parent
    pid_t ret = wait(NULL);
    if (ret > 0)
    {
        printf("father wait: %d, success\n", (int)ret);
    }
    else
    {
        printf("father wait failed\n");
    }
    sleep(10);

    return 0;
}

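Before it starts waiting, the parent sleeps for 10 s in total. During the first 5 s the child runs; during the last 5 s the child has already exited but the parent has not started waiting, so the child is a zombie. Once the parent starts waiting, the child is reaped. The parent then sleeps for another 10 s, so only one process is left running, until everything finishes 10 s later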

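The `waitpid` function

Headers: #include <sys/types.h> and #include <sys/wait.h>

pid_t waitpid(pid_t pid, int *status, int options);

Return value:

On a normal return, waitpid returns the process ID of the child it collected
If the WNOHANG option is set and waitpid finds no exited child to collect, it returns 0
If the call fails, it returns -1 and errno is set to indicate the error

Parameters:

pid
- pid = -1: wait for any child process, equivalent to wait
- pid > 0: wait for the child whose process ID equals pid
status (an output parameter that reflects how the child exited and gives the parent the child's result)
- It has 32 bits in total, but only the low 16 bits are used
- Passing NULL means the parent does not care about the child's exit status
- When the code exits normally, its exit code is the exit status stored here (bits 8 to 15)
- When the code terminates abnormally, the process has in fact received a signal because of the fault; that is the termination signal stored here (low 7 bits)
- The core dump flag can be ignored for now
- WIFEXITED(status): tests whether the process exited normally
- WEXITSTATUS(status): extracts the process's exit code
options
- 0: the default, a blocking wait: the call returns only once the child has exited and otherwise stays blocked here
  
  What blocking really means: the process's PCB is put on a wait queue and the process state is changed to S
  
  What returning really means: the process's PCB is moved from the wait queue to the run (R) queue, so the CPU can schedule it again
- WNOHANG: makes the wait non-blocking; several checks may be needed, which gives a polling scheme based on non-blocking waits
  - With WNOHANG, note the case where the child has not exited yet; it needs to be handled explicitly
  When an application or the OS itself is stuck and unresponsive for a long time, we say the application or program has "hung"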

Example 1: getting the exit status and the termination signal:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t id = fork();
    if (id == 0)
    {
        // child
        int cnt = 3;
        while (cnt)
        {
            printf("child[%d] is running, cnt is: %d\n", (int)getpid(), cnt);
            cnt--;
            sleep(1);
        }
        exit(11);
    }

    printf("father begin wait\n");
    // parent
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if (ret > 0)
        // bits 8 to 15 hold the exit status, the low 7 bits hold the termination signal
        printf("father wait: %d, success, status exit code: %d, status terminate signal: %d\n", (int)ret, (status >> 8) & 0xff, status & 0x7f);
    else
        printf("father wait failed\n");

    return 0;
}

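On a normal exit, the program reports exit code 11 and termination signal 0
On an abnormal exit (for example, the child is killed by a signal), the termination signal is non-zero and the exit code is not meaningful

A more concise way to get the exit code: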

int status = 0;
pid_t ret = waitpid(id, &status, 0);
if (ret > 0 && WIFEXITED(status)) // no termination signal was received
    // the child ended normally, so fetch its exit code
    printf("exit code: %d\n", WEXITSTATUS(status));
else
    printf("error, got a terminate signal\n");

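Note: bash is the parent of every process started from the command line, and bash obtains each child's exit result by waiting for it, which is why echo $? can show the child's exit code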

Example 2: non-blocking wait:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t id = fork();
    if (id == 0)
    {
        int cnt = 5;
        while (cnt)
        {
            printf("child[%d] is running, cnt: %d\n", (int)getpid(), cnt);
            cnt--;
            sleep(1);
        }
        exit(0);
    }

    int status = 0;
    while (1) // polling loop
    {
        pid_t ret = waitpid(id, &status, WNOHANG);
        if (ret == 0)
            // the child has not exited yet, but waitpid itself succeeded; the parent must check again later
            printf("Do father's things\n");
        else if (ret > 0)
        {
            // the child has exited and waitpid succeeded, so the result is available
            printf("father wait: %d, success, status exit code: %d, status terminate signal: %d\n", (int)ret, (status >> 8) & 0xff, status & 0x7f);
            break;
        }
        else
        {
            // waitpid failed (ret == -1)
            perror("waitpid");
            break;
        }
        sleep(1);
    }

    return 0;
}

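10.4 Process Program Replacement

So far, a child process has been created to run part of the parent's code. When the child needs to run an entirely new program, process program replacement is used

Process program replacement: the process stays the same; only its code and data are replaced

In essence, program replacement loads a program's code and data into the context of a particular process

A C/C++ program must be loaded into memory before it can run

How is it loaded? By a loader (exec*)

Program replacement changes the code in the code region, so copy-on-write has to happen for the code as well

As long as the replacement succeeds, the code after the call is never executed, which means a successful exec* call needs no return-value check

If an exec* function returns at all, the call must have failed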

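Understanding the exec* function names

l (list): the arguments are passed as a list
v (vector): the arguments are passed as an array
p (path): the PATH environment variable is searched automatically
e (env): the caller supplies the environment variables itself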

The `execl` function

int execl(const char *path, const char *arg, ...);

path: the full path of the target program, i.e. directory/filename

...: a variable-length argument list

arg, ...: pass the arguments one by one, exactly as the target program would be invoked on the command line; the list must end with NULL

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    pid_t id = fork();
    if (id == 0)
    {
        // child
        printf("I am child, pid: %d\n", (int)getpid());
        sleep(5);
        execl("/usr/bin/ls", "ls", "-a", "-l", (char *)NULL);

        // never printed if execl succeeds: the child's code has been replaced
        printf("hahahahaha\n");
        exit(1);
    }

    while (1)
    {
        printf("I am father\n");
        sleep(1);
    }

    return 0;
}

The `execv` function

int execv(const char *path, char *const argv[]);

It simply takes the argument list of execl as an array; everything else is the same

char* argv[] = {
    "ls",
    "-a",
    "-l",
    NULL
};
execv("/usr/bin/ls", argv);

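The `execlp` function

int execlp(const char *file, const char *arg, ...);

The first argument may look identical to the second, but the two mean entirely different things

The first argument says which program to run, i.e. the name of the file to execute
The second argument says how to run it

execlp("ls", "ls", "-a", "-l", (char *)NULL);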

The `execle` function

int execle(const char *path, const char *arg, ..., char * const envp[]);

model.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    if (fork() == 0)
    {
        // child: replace the program and hand it a custom environment
        char *envs[] = {
            "MYENV1=env test",
            "MYENV2=haha",
            "MYENV3=xixi",
            NULL
        };
        execle("./myexe", "myexe", (char *)NULL, envs);

        exit(1); // only reached if execle failed
    }

    // parent
    waitpid(-1, NULL, 0);
    printf("wait success\n");

    return 0;
}

myexe.c

#include <stdio.h>

int main(void)
{
    printf("this new process is printing envs\n");

    // environ points to the environment passed in by the exec* call
    extern char** environ;
    for (int i = 0; environ[i]; i++)
        printf("%s\n", environ[i]);
    return 0;
}

The `execve` function (system call)

int execve(const char *filename, char *const argv[], char *const envp[]);

model.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    if (fork() == 0)
    {
        // child: arguments and environment are both passed as arrays
        char* argv[] = {
            "myexe",
            NULL
        };
        char *envs[] = {
            "MYENV1=env test",
            "MYENV2=haha",
            "MYENV3=xixi",
            NULL
        };
        execve("./myexe", argv, envs);

        exit(1); // only reached if execve failed
    }

    // parent
    waitpid(-1, NULL, 0);
    printf("wait success\n");

    return 0;
}

myexe.c is the same as in the `execle` example above.

The `execvp` function

int execvp(const char *file, char *const argv[]);

char* argv[] = {
    "ls",
    "-a",
    "-l",
    NULL
};
execvp("ls", argv);

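The `execvpe` function

int execvpe(const char *file, char *const argv[], char *const envp[]);

execvpe is a GNU extension; with glibc, _GNU_SOURCE must be defined before including unistd.h.

Summary

Only execve is a system call; all the others are library functions that are ultimately implemented on top of execve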

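Writing a simple shell

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM 128
#define CMD_NUM 64

int main(void)
{
    char command[NUM];
    for (; ; )
    {
        char* argv[CMD_NUM] = {NULL};
        command[0] = '\0'; // a C string ends at '\0', so this empties the buffer
        printf("[who@hostname dir]# ");
        fflush(stdout);

        if (fgets(command, NUM, stdin) == NULL)
            break; // EOF (Ctrl + D) or read error: leave the shell
        command[strcspn(command, "\n")] = '\0'; // strip the trailing '\n'

        const char* sep = " ";
        argv[0] = strtok(command, sep); // split the input on spaces
        if (argv[0] == NULL)
            continue; // empty line: prompt again
        int i = 1;
        while (i < CMD_NUM - 1 && (argv[i] = strtok(NULL, sep)) != NULL)
            i++;

        // built-in commands must run in the shell itself: cd has to change
        // the shell's own working directory, not that of a child process
        if (strcmp(argv[0], "cd") == 0)
        {
            if (argv[1] != NULL && chdir(argv[1]) == -1)
                perror("cd");
            continue;
        }

        // run everything else in a child through program replacement
        pid_t id = fork();
        if (id == 0)
        {
            // child
            execvp(argv[0], argv);
            perror(argv[0]); // only reached if execvp failed
            exit(1);
        }

        if (id > 0)
            waitpid(id, NULL, 0);
    }
    return 0;
}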

Original article: https://blog.csdn.net/weixin_52665939/article/details/126635532
