• easyRL Study Notes: Reinforcement Learning Basics


    https://datawhalechina.github.io/easy-rl/#/chapter1/chapter1

    pip install gym
    

    Setting up the development environment
    https://book.douban.com/subject/35043939/
    https://zhuanlan.zhihu.com/reinforce

    Reference: Project 2

    python train.py
    visualdl --logdir=train_log/train --host=172.30.159.168
    
    


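    What do these three peaks mean?
    Occasional sudden spikes.
    Training finished in about 6 minutes, so let's look at the result.
    For some reason the agent seems to get worse the longer it trains; I will debug this later.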

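    Notes:
    Sarsa, covered earlier, is on-policy: it always follows the policy π. Q-learning is off-policy: every update takes max Q. The deep Q-network (DQN) in Chapter 6 is a deep algorithm that belongs only to the off-policy family.
    The value function approximation at the start of Chapter 6 covers only Q-function approximation. Does this mean that the max-Q lookup in the Q-table during policy iteration is replaced by an approximating function, that value iteration does not need an approximate V-function, and that the approximated Q and the unapproximated V are then trained with a deep network?
    DQN also has a target network. Are Chapters 6 through 9 all set in the off-policy setting?
    Reference link: https://datawhalechina.github.io/easy-rl/#/chapter1/chapter1
    For the Actor-Critic algorithms, you could say so. PPO could arguably be called off-policy as well, since it reuses samples through importance sampling, although it is usually classified as on-policy.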
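The on-policy versus off-policy distinction in the notes above comes down to which next-step value the TD target bootstraps from. The following is a minimal tabular sketch, assuming a Q-table stored as a list of lists indexed as `q[state][action]`. The function names and the default `gamma` and `alpha` values are illustrative assumptions, not taken from the easy-rl code.

```python
def sarsa_target(q, reward, next_state, next_action, gamma=0.9):
    """On-policy target: bootstrap from the action the policy actually takes next."""
    return reward + gamma * q[next_state][next_action]


def q_learning_target(q, reward, next_state, gamma=0.9):
    """Off-policy target: bootstrap from the greedy (max-Q) action,
    regardless of which action the behavior policy takes next."""
    return reward + gamma * max(q[next_state])


def td_update(q, state, action, target, alpha=0.1):
    """Move Q(state, action) a fraction alpha toward the TD target, in place."""
    q[state][action] += alpha * (target - q[state][action])


if __name__ == "__main__":
    # Hypothetical 2-state, 2-action table, for illustration only.
    q = [[0.0, 0.0], [1.0, 2.0]]
    print("Sarsa target:", sarsa_target(q, 1.0, next_state=1, next_action=0))
    print("Q-learning target:", q_learning_target(q, 1.0, next_state=1))
```

The two targets differ only when the next action chosen by the behavior policy is not the greedy one, which is exactly when exploration happens.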

    At this point we can refer to https://github.com/sfujim/TD3
    and implement TD3 ourselves.
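Before writing a full implementation, it can help to look at the critic target of TD3 in isolation. The sketch below is a scalar, framework-free simplification rather than the code from the repository above: `actor_target`, `critic1_target`, and `critic2_target` are hypothetical callables standing in for the target networks, and the default hyperparameter values are assumptions chosen for illustration.

```python
import random


def clip(x, low, high):
    """Clamp x to the closed interval [low, high]."""
    return max(low, min(high, x))


def td3_target(reward, done, next_state,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5, max_action=1.0):
    """Compute the TD3 critic target for a single transition.

    Two of TD3's ideas appear here:
    - target policy smoothing: clipped Gaussian noise is added to the target action;
    - clipped double Q-learning: the smaller of the two target critics is used.
    (The third idea, delayed policy updates, belongs to the training loop.)
    """
    noise = clip(random.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    next_action = clip(actor_target(next_state) + noise, -max_action, max_action)
    q1 = critic1_target(next_state, next_action)
    q2 = critic2_target(next_state, next_action)
    # No bootstrapping past a terminal state.
    return reward + (0.0 if done else gamma * min(q1, q2))


if __name__ == "__main__":
    # Hypothetical stand-ins for the target networks.
    target = td3_target(
        reward=1.0, done=False, next_state=0.0,
        actor_target=lambda s: 0.5,
        critic1_target=lambda s, a: 2.0 * a,
        critic2_target=lambda s, a: 3.0 * a,
    )
    print("TD3 target:", target)
```

Taking the minimum of the two critics counteracts the overestimation that a single max-based target tends to accumulate.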
    When starting Jupyter, you can specify the port number:

    jupyter notebook --no-browser --port 8889 --ip=192.168.1.103
    

    References
    Mushroom Book (easy-rl) code: https://github.com/datawhalechina/easy-rl/tree/master/projects
    Personal development version of the code: https://github.com/johnjim0816/rl-tutorials

    But I ran into a problem:
    Failed to build mujoco_py

    Fix: install an older version of mujoco_py
    pip install mujoco_py==2.0.2.8
    MuJoCo itself also has to be installed:
    You appear to be missing MuJoCo. We expected to find the file here: /home/kewei/.mujoco/mujoco200

    This package only provides python bindings, the library must be installed separately.

    Please follow the instructions on the README to install MuJoCo

    https://github.com/openai/mujoco-py#install-mujoco
    

    Which can be downloaded from the website

    https://www.roboti.us/index.html
    

    So we download the Linux version.

    mkdir ~/.mujoco
    mv mujoco200_linux ~/.mujoco/mujoco200
    
    

    I set the environment variables as the official site describes, but still got an error:
    Exception:
    Missing path to your environment variable.
    Current values LD_LIBRARY_PATH=/usr/local/openmpi-4.0.3/lib:/usr/local/cuda-11.1/lib64:
    Please add following line to .bashrc:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/kewei/.mujoco/mujoco200/bin

    The configuration is as follows:

    export LD_LIBRARY_PATH=~/.mujoco/mujoco200/bin${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
    export MUJOCO_KEY_PATH=~/.mujoco
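To confirm that the shell actually picked up the setting, it can be useful to inspect `LD_LIBRARY_PATH` from Python, since that is the environment the failing import sees. This is a small sketch: the helper name `has_library_dir` is made up for illustration, and the path assumes the `~/.mujoco/mujoco200` layout used above.

```python
import os


def has_library_dir(ld_library_path, wanted_dir):
    """Return True if wanted_dir is one of the colon-separated entries."""
    wanted = os.path.normpath(wanted_dir)
    entries = [os.path.normpath(p) for p in ld_library_path.split(":") if p]
    return wanted in entries


if __name__ == "__main__":
    mujoco_bin = os.path.expanduser("~/.mujoco/mujoco200/bin")
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH =", current)
    print("MuJoCo bin directory present:", has_library_dir(current, mujoco_bin))
```

If this prints `False` in a fresh terminal, the `export` lines were probably not loaded by the shell that launches Python.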
    

    When in doubt, restart: I shut WSL down and tried again.
    Sure enough, after restarting WSL things started running.

    /home/kewei/miniconda3/lib/python3.9/site-packages/mujoco_py/gl/osmesashim.c:1:10: fatal error: GL/osmesa.h: No such file or directory
    #include <GL/osmesa.h>
    ^~~~~~~~~~~~~
    compilation terminated.

    Fix:

    sudo apt-get install mesa-common-dev
    sudo apt-get install libgl1-mesa-dev libglu1-mesa-dev
    

    But apt reported that these packages were already installed.

    Trying another approach:

    sudo apt-get install libglew-dev
    sudo gedit ~/.bashrc   # open ~/.bashrc and append the next line to it
    export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so
    source ~/.bashrc
    
    

    That still did not work, so I tried yet another approach:

    sudo apt-get install libosmesa6-dev
    
    

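    That fixed it, but then I hit another problem,
    an error at import time:
    PermissionError: [Errno 13] Permission denied: 'patchelf'
    Attempted fix: this was supposedly a problem caused by the build lock, which can be worked around as follows: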
    cd /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/mujoco_py   # example path for a macOS install
    cd ~/miniconda3/lib/python3.9/site-packages/mujoco_py   # the corresponding path on this machine
    sudo chmod -R 777 ./

    This approach failed as well, so I decided to install the latest version,
    which blew up too:
    Building wheels for collected packages: mujoco-py
    Building wheel for mujoco-py (pyproject.toml) ... error
    error: subprocess-exited-with-error

    × Building wheel for mujoco-py (pyproject.toml) did not run successfully.
    │ exit code: 1
    ╰─> [20 lines of output]
    running bdist_wheel
    running build
    Removing old mujoco_py cext /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/cymj_2.0.2.13_39_linuxcpuextensionbuilder_39.so
    Compiling /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/cymj.pyx because it depends on /tmp/pip-build-env-t5spnhlg/overlay/lib/python3.9/site-packages/Cython/Includes/libc/string.pxd.
    [1/1] Cythonizing /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/cymj.pyx
    running build_ext
    building 'mujoco_py.cymj' extension
    creating /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder
    creating /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39
    creating /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39/tmp
    creating /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39/tmp/pip-install-p35bq4mq
    creating /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39/tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487
    creating /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39/tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py
    creating /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39/tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/gl
    gcc -pthread -B /home/kewei/miniconda3/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/kewei/miniconda3/include -I/home/kewei/miniconda3/include -fPIC -O2 -isystem /home/kewei/miniconda3/include -fPIC -Imujoco_py -I/tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py -I/home/kewei/.mujoco/mujoco200/include -I/tmp/pip-build-env-t5spnhlg/overlay/lib/python3.9/site-packages/numpy/core/include -I/home/kewei/miniconda3/include/python3.9 -c /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/cymj.c -o /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39/tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/cymj.o -fopenmp -w
    gcc -pthread -B /home/kewei/miniconda3/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/kewei/miniconda3/include -I/home/kewei/miniconda3/include -fPIC -O2 -isystem /home/kewei/miniconda3/include -fPIC -Imujoco_py -I/tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py -I/home/kewei/.mujoco/mujoco200/include -I/tmp/pip-build-env-t5spnhlg/overlay/lib/python3.9/site-packages/numpy/core/include -I/home/kewei/miniconda3/include/python3.9 -c /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/gl/osmesashim.c -o /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39/tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/gl/osmesashim.o -fopenmp -w
    creating /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/lib.linux-x86_64-cpython-39
    creating /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/lib.linux-x86_64-cpython-39/mujoco_py
    gcc -pthread -B /home/kewei/miniconda3/compiler_compat -shared -Wl,-rpath,/home/kewei/miniconda3/lib -Wl,-rpath-link,/home/kewei/miniconda3/lib -L/home/kewei/miniconda3/lib -L/home/kewei/miniconda3/lib -Wl,-rpath,/home/kewei/miniconda3/lib -Wl,-rpath-link,/home/kewei/miniconda3/lib -L/home/kewei/miniconda3/lib /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39/tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/cymj.o /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/temp.linux-x86_64-cpython-39/tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/gl/osmesashim.o -L/home/kewei/.mujoco/mujoco200/bin -Wl,--enable-new-dtags,-R/home/kewei/.mujoco/mujoco200/bin -lmujoco200 -lglewosmesa -lOSMesa -lGL -o /tmp/pip-install-p35bq4mq/mujoco-py_5714e77de65c4408a54ec041c0e44487/mujoco_py/generated/_pyxbld_2.0.2.13_39_linuxcpuextensionbuilder/lib.linux-x86_64-cpython-39/mujoco_py/cymj.cpython-39-x86_64-linux-gnu.so -fopenmp
    error: [Errno 13] Permission denied: 'patchelf'
    [end of output]

    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed building wheel for mujoco-py
    Failed to build mujoco-py
    ERROR: Could not build wheels for mujoco-py, which is required to install pyproject.toml-based projects

    Fix:
    sudo apt-get install patchelf
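The `Permission denied: 'patchelf'` message in the log above went away once the `patchelf` executable was installed. A quick pre-flight check before rerunning `pip install` could look like the sketch below. The helper name `missing_tools` and the tool list are illustrative assumptions (`gcc` is included only because it appears in the build log), not an official list of mujoco-py requirements.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the function can be exercised without
    depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["gcc", "patchelf"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All build tools found.")
```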

    After that the build went smoothly.

    Then importing the package raised an error:
    ImportError: cannot import name 'MISSING_KEY_MESSAGE' from 'mujoco_py.utils' (/home/kewei/miniconda3/lib/python3.9/site-packages/mujoco_py/utils.py)

    First, test MuJoCo on its own:
    cd ~/.mujoco/mujoco200/bin
    ./simulate ../model/humanoid.xml

    I downloaded the GLFW zip file and unpacked it.
    The CMake configure step then hit a small error:
    -- Looking for remove - found
    -- Looking for shmat
    -- Looking for shmat - found
    -- Looking for IceConnectionNumber in ICE
    -- Looking for IceConnectionNumber in ICE - found
    CMake Error at CMakeLists.txt:218 (message):
    RandR headers not found; install libxrandr development package

    Fix:

    sudo apt-get install libxrandr-dev
    
    

    ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

    Fix:

    pip install --upgrade numpy
    
    

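    Finally, the installation succeeded.
    Now we can properly try out TD3.
    Training then ran for a little over 2 hours,
    but it seems to have already converged to its limit at 9966.

    After the 1,000,000 iterations finished there was a phase of rapid updates.
    On closer inspection, though, it looked as if everything was being trained all over again, which was painful, although this run seemed somewhat faster than the previous one.
    With a single environment, the one million iterations ran to completion; not bad.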

  • Original article: https://blog.csdn.net/weixin_54227557/article/details/126395876