C/C++ Development, OpenCV ML Library: Random Forest (RTrees) Application


Contents

1. Random Forest Algorithm

1.1 Algorithm Overview

1.2 Random Forest in OpenCV

2. Applying cv::ml::RTrees

2.1 Preparing the Dataset

2.2 Applying RTrees

2.3 Building the Program

2.4 Full main.cpp Code


1. Random Forest Algorithm

1.1 Algorithm Overview

Random forest is an ensemble learning method built from many decision trees. It combines the efficiency of decision trees with the accuracy of ensembling and generalizes very well.

The algorithm rests on two ideas: randomness and ensembling.

1. Randomness: applied to both the samples and the features. Each decision tree is built from a bootstrap sample of the training data and a randomly chosen subset of the features. This randomness effectively reduces the risk of overfitting and improves the model's ability to generalize.
2. Ensembling: the final classification or regression result combines the predictions of all the trees, usually by majority vote. Because every tree is grown independently on different data and feature subsets, the ensemble has lower variance and is more stable (a minimal voting sketch follows this list).
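To make the two ideas a bit more concrete, here is a minimal, self-contained C++ sketch (not from the original post; all names and the vote values are illustrative) that draws a bootstrap sample of indices and combines hypothetical per-tree predictions by majority vote:

#include <iostream>
#include <map>
#include <random>
#include <vector>

// Draw a bootstrap sample: n indices drawn with replacement from [0, n).
std::vector<int> bootstrapIndices(int n, std::mt19937& rng) {
    std::uniform_int_distribution<int> pick(0, n - 1);
    std::vector<int> idx(n);
    for (int& i : idx) i = pick(rng);
    return idx;
}

// Combine the class labels predicted by the individual trees by majority vote.
int majorityVote(const std::vector<int>& treeVotes) {
    std::map<int, int> counts;
    for (int v : treeVotes) ++counts[v];
    int best = -1, bestCount = -1;
    for (const std::pair<const int, int>& c : counts)
        if (c.second > bestCount) { best = c.first; bestCount = c.second; }
    return best;
}

int main() {
    std::mt19937 rng(42);
    // In a real forest every tree is trained on its own bootstrap sample and a
    // random feature subset; here only the sampling and the vote aggregation are shown.
    std::vector<int> sample = bootstrapIndices(10, rng);
    std::vector<int> votes = {7, 7, 1, 7, 9};  // hypothetical per-tree predictions
    std::cout << "bootstrap indices:";
    for (int i : sample) std::cout << ' ' << i;
    std::cout << "\nmajority vote: " << majorityVote(votes) << std::endl;
    return 0;
}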
1.2 Random Forest in OpenCV

In OpenCV, the random forest is an ensemble learning method that builds several decision trees and combines them to make predictions; it is implemented by the cv::ml::RTrees class. A single decision tree used on its own tends to overfit; training multiple trees on different subsets avoids this, and a random forest obtains the final result by voting across those trees. The main interface looks like this (a fuller, hedged configuration sketch follows the listing):

//Create the RTrees object
cv::Ptr<cv::ml::RTrees> rf = cv::ml::RTrees::create();
//Main setters of the RTrees class:
//setMaxDepth()               maximum depth of each decision tree
//setMinSampleCount()         minimum number of samples in a leaf node
//setRegressionAccuracy()     optional, termination accuracy for regression problems
//setPriors()                 optional, prior class probabilities (class weights)
//setCalculateVarImportance() optional, whether to compute variable importance
//setActiveVarCount()         optional, number of randomly selected features tested at each split
//setTermCriteria()           termination criteria for training
//.....
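As a hedged illustration of how these setters fit together (a sketch only, assuming an already prepared samples matrix of type CV_32F with one row per sample and a labels matrix of type CV_32S; the names samples, labels and configureAndTrain are hypothetical), note that getVarImportance() returns a non-empty matrix only if setCalculateVarImportance(true) was set before training:

#include <opencv2/ml.hpp>
#include <iostream>

// Minimal configuration sketch; "samples" and "labels" are assumed to exist already.
void configureAndTrain(const cv::Mat& samples, const cv::Mat& labels) {
    cv::Ptr<cv::ml::RTrees> rf = cv::ml::RTrees::create();
    rf->setMaxDepth(10);                  // maximum depth of each tree
    rf->setMinSampleCount(2);             // minimum samples per leaf
    rf->setCalculateVarImportance(true);  // collect variable importance during training
    rf->setActiveVarCount(0);             // 0 = sqrt(total number of features) per split
    rf->setTermCriteria(cv::TermCriteria(
        cv::TermCriteria::MAX_ITER + cv::TermCriteria::EPS, 100, 0.01));

    rf->train(samples, cv::ml::ROW_SAMPLE, labels);

    // Variable importance: one value per input feature.
    cv::Mat importance = rf->getVarImportance();
    std::cout << "feature importance rows: " << importance.rows << std::endl;
}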

A random forest builds each tree from randomly selected features and sample subsets and then aggregates their predictions. This usually lowers the risk of overfitting and improves predictive performance.

2. Applying cv::ml::RTrees

2.1 Preparing the Dataset

To keep the validation quick, this article uses the MNIST handwritten-digit dataset. Download and unpack it as described in the earlier post in this column, "C/C++开发,opencv-ml库学习,支持向量机(SVM)应用" (CSDN blog).

Likewise, follow section "2.4 SVM(支持向量机)实时识别应用" of that post and use its Python script to unpack t10k-images.idx3-ubyte into individual image files (a C++ alternative is sketched below).
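If you prefer to stay in C++ for this step, the hedged sketch below (not part of the original post; dumpMnistPngs and its parameters are hypothetical) parses the same IDX header layout used later in main.cpp and writes the first few digits out as PNG files with cv::imwrite:

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>

// Big-endian -> little-endian conversion for the 32-bit IDX header fields.
static int swapInt(int v) {
    unsigned u = static_cast<unsigned>(v);
    return static_cast<int>((u >> 24) | ((u >> 8) & 0xFF00) |
                            ((u << 8) & 0xFF0000) | (u << 24));
}

// Dump the first "count" images of an idx3-ubyte file as PNGs (hypothetical helper).
void dumpMnistPngs(const std::string& idxFile, const std::string& outDir, int count) {
    std::ifstream file(idxFile, std::ios::binary);
    if (!file.is_open()) { std::cerr << "cannot open " << idxFile << std::endl; return; }

    int magic = 0, num = 0, rows = 0, cols = 0;
    file.read(reinterpret_cast<char*>(&magic), 4);
    file.read(reinterpret_cast<char*>(&num), 4);
    file.read(reinterpret_cast<char*>(&rows), 4);
    file.read(reinterpret_cast<char*>(&cols), 4);
    num = swapInt(num); rows = swapInt(rows); cols = swapInt(cols);

    for (int i = 0; i < std::min(count, num); ++i) {
        cv::Mat img(rows, cols, CV_8UC1);
        file.read(reinterpret_cast<char*>(img.data), rows * cols);
        cv::imwrite(outDir + "/image_" + std::to_string(i) + ".png", img);
    }
}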

2.2 Applying RTrees

A cv::ml::RTrees object is created and given its training parameters and termination criteria; train is then called to fit the model, and finally the trained model is used to predict the classes of new samples.

// 3. Configure and train the random forest model
cv::Ptr<cv::ml::RTrees> rf = cv::ml::RTrees::create();
rf->setMaxDepth(30);        // maximum depth of each decision tree
rf->setMinSampleCount(2);   // minimum number of samples in a leaf node
rf->setTermCriteria(cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 10, 0.1)); // termination criteria
rf->train(trainingData, cv::ml::ROW_SAMPLE, labelsMat);
......
cv::Mat testResp;
float response = rf->predict(testData, testResp);
......
rf->save("mnist_svm.xml");
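As a side note before saving the model: instead of comparing predictions against labels by hand, evaluation can also be delegated to OpenCV through cv::ml::TrainData and StatModel::calcError. Below is a hedged sketch under the assumption that trainingData (CV_32F, one row per sample) and labelsMat (CV_32S) are prepared as in the snippet above; evaluateWithCalcError is a hypothetical helper, not code from the original post:

#include <opencv2/ml.hpp>
#include <iostream>

// Wrap samples/labels in a TrainData object, hold out 10% for testing,
// and let OpenCV compute the classification error directly.
float evaluateWithCalcError(const cv::Mat& trainingData, const cv::Mat& labelsMat) {
    cv::Ptr<cv::ml::TrainData> td =
        cv::ml::TrainData::create(trainingData, cv::ml::ROW_SAMPLE, labelsMat);
    td->setTrainTestSplitRatio(0.9, /*shuffle=*/true);   // 90% train, 10% test

    cv::Ptr<cv::ml::RTrees> rf = cv::ml::RTrees::create();
    rf->setMaxDepth(30);
    rf->setMinSampleCount(2);
    rf->setTermCriteria(cv::TermCriteria(
        cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 10, 0.1));
    rf->train(td);

    cv::Mat resp;  // per-sample predictions on the test split
    float errorPercent = rf->calcError(td, /*test=*/true, resp);
    std::cout << "test error: " << errorPercent << "%" << std::endl;
    return errorPercent;
}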

The trained and tested model is saved to a file and can then be loaded again for prediction:

cv::Ptr<cv::ml::RTrees> rf = cv::ml::StatModel::load<cv::ml::RTrees>("mnist_svm.xml");
//predict a single image
float ret = rf->predict(image);
std::cout << "predict val = " << ret << std::endl;
2.3 Building the Program

As in the earlier post on the support vector machine (SVM) application, the program is built with OpenCV + MinGW + makefile:

#/bin/sh
#win32
CX= g++ -DWIN32
#linux
#CX= g++ -Dlinux
BIN := ./
TARGET := opencv_ml03.exe
FLAGS := -std=c++11 -static
SRCDIR := ./
#INCLUDES
INCLUDEDIR := -I"../../opencv_MinGW/include" -I"./"
#-I"$(SRCDIR)"
staticDir := ../../opencv_MinGW/x64/mingw/staticlib/
#LIBDIR := $(staticDir)/libopencv_world460.a\
#          $(staticDir)/libade.a \
#          $(staticDir)/libIlmImf.a \
#          $(staticDir)/libquirc.a \
#          $(staticDir)/libzlib.a \
#          $(wildcard $(staticDir)/liblib*.a) \
#          -lgdi32 -lComDlg32 -lOleAut32 -lOle32 -luuid
# opencv_world comes first, then the third-party libraries OpenCV depends on; the remaining libraries belong to the MinGW toolchain
LIBDIR := -L $(staticDir) -lopencv_world460 -lade -lIlmImf -lquirc -lzlib \
          -llibjpeg-turbo -llibopenjp2 -llibpng -llibprotobuf -llibtiff -llibwebp \
          -lgdi32 -lComDlg32 -lOleAut32 -lOle32 -luuid
source := $(wildcard $(SRCDIR)/*.cpp)

$(TARGET) :
	$(CX) $(FLAGS) $(INCLUDEDIR) $(source) -o $(BIN)/$(TARGET) $(LIBDIR)

clean:
	rm $(BIN)/$(TARGET)

Run make to build; make clean removes the binary so the project can be rebuilt.

On the same data samples, the accuracy improves noticeably compared with the decision-tree training result; readers can try adjusting the parameters to verify this.

2.4 Full main.cpp Code

The main.cpp source below was quickly adapted from the two previous posts on the SVM and decision-tree (DTrees) applications, so many traces of the SVM and DTrees code remain and the dataset is not a particularly good fit for the method; it is only meant to illustrate how to use the OpenCV random forest (RTrees) from C++.

#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>

// Reverse the byte order of a big-endian 32-bit integer from the IDX header
int intReverse(int num)
{
    return (num>>24|((num&0xFF0000)>>8)|((num&0xFF00)<<8)|((num&0xFF)<<24));
}

std::string intToString(int num)
{
    char buf[32]={0};
    itoa(num,buf,10);
    return std::string(buf);
}

cv::Mat read_mnist_image(const std::string fileName) {
    int magic_number = 0;
    int number_of_images = 0;
    int img_rows = 0;
    int img_cols = 0;
    cv::Mat DataMat;
    std::ifstream file(fileName, std::ios::binary);
    if (file.is_open())
    {
        std::cout << "open images file: "<< fileName << std::endl;
        file.read((char*)&magic_number, sizeof(magic_number));          //format
        file.read((char*)&number_of_images, sizeof(number_of_images)); //images number
        file.read((char*)&img_rows, sizeof(img_rows));                  //img rows
        file.read((char*)&img_cols, sizeof(img_cols));                  //img cols
        magic_number = intReverse(magic_number);
        number_of_images = intReverse(number_of_images);
        img_rows = intReverse(img_rows);
        img_cols = intReverse(img_cols);
        std::cout << "format:" << magic_number
                  << " img num:" << number_of_images
                  << " img row:" << img_rows
                  << " img col:" << img_cols << std::endl;
        std::cout << "read img data" << std::endl;
        DataMat = cv::Mat::zeros(number_of_images, img_rows * img_cols, CV_32FC1);
        unsigned char temp = 0;
        for (int i = 0; i < number_of_images; i++) {
            for (int j = 0; j < img_rows * img_cols; j++) {
                file.read((char*)&temp, sizeof(temp));
                //svm data is CV_32FC1
                float pixel_value = float(temp);
                DataMat.at<float>(i, j) = pixel_value;
            }
        }
        std::cout << "read img data finish!" << std::endl;
    }
    file.close();
    return DataMat;
}

cv::Mat read_mnist_label(const std::string fileName) {
    int magic_number;
    int number_of_items;
    cv::Mat LabelMat;
    std::ifstream file(fileName, std::ios::binary);
    if (file.is_open())
    {
        std::cout << "open label file: "<< fileName << std::endl;
        file.read((char*)&magic_number, sizeof(magic_number));
        file.read((char*)&number_of_items, sizeof(number_of_items));
        magic_number = intReverse(magic_number);
        number_of_items = intReverse(number_of_items);
        std::cout << "format:" << magic_number << " ;label_num:" << number_of_items << std::endl;
        std::cout << "read Label data" << std::endl;
        //data type:CV_32SC1,channel:1
        LabelMat = cv::Mat::zeros(number_of_items, 1, CV_32SC1);
        for (int i = 0; i < number_of_items; i++) {
            unsigned char temp = 0;
            file.read((char*)&temp, sizeof(temp));
            LabelMat.at<int>(i, 0) = (int)temp;
        }
        std::cout << "read label data finish!" << std::endl;
    }
    file.close();
    return LabelMat;
}

//change path for real paths
std::string trainImgFile = "D:\\workForMy\\OpenCVLib\\opencv_demo\\opencv_ml01\\train-images.idx3-ubyte";
std::string trainLabeFile = "D:\\workForMy\\OpenCVLib\\opencv_demo\\opencv_ml01\\train-labels.idx1-ubyte";
std::string testImgFile = "D:\\workForMy\\OpenCVLib\\opencv_demo\\opencv_ml01\\t10k-images.idx3-ubyte";
std::string testLabeFile = "D:\\workForMy\\OpenCVLib\\opencv_demo\\opencv_ml01\\t10k-labels.idx1-ubyte";

void train_SVM()
{
    //read train images, data type CV_32FC1
    cv::Mat trainingData = read_mnist_image(trainImgFile);
    //images data normalization
    trainingData = trainingData/255.0;
    std::cout << "trainingData.size() = " << trainingData.size() << std::endl;
    std::cout << "trainingData.type() = " << trainingData.type() << std::endl;
    std::cout << "trainingData.rows = " << trainingData.rows << std::endl;
    std::cout << "trainingData.cols = " << trainingData.cols << std::endl;
    //read train label, data type CV_32SC1
    cv::Mat labelsMat = read_mnist_label(trainLabeFile);
    std::cout << "labelsMat.size() = " << labelsMat.size() << std::endl;
    std::cout << "labelsMat.type() = " << labelsMat.type() << std::endl;
    std::cout << "labelsMat.rows = " << labelsMat.rows << std::endl;
    std::cout << "labelsMat.cols = " << labelsMat.cols << std::endl;
    std::cout << "trainingData & labelsMat finish!" << std::endl;

    // //create SVM model
    // cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
    // //set svm args,type and KernelTypes
    // svm->setType(cv::ml::SVM::C_SVC);
    // svm->setKernel(cv::ml::SVM::POLY);
    // //KernelTypes POLY is need set gamma and degree
    // svm->setGamma(3.0);
    // svm->setDegree(2.0);
    // //Set iteration termination conditions, maxCount is importance
    // svm->setTermCriteria(cv::TermCriteria(cv::TermCriteria::EPS | cv::TermCriteria::COUNT, 1000, 1e-8));
    // std::cout << "create SVM object finish!" << std::endl;
    // std::cout << "trainingData.rows = " << trainingData.rows << std::endl;
    // std::cout << "trainingData.cols = " << trainingData.cols << std::endl;
    // std::cout << "trainingData.type() = " << trainingData.type() << std::endl;
    // // svm model train
    // svm->train(trainingData, cv::ml::ROW_SAMPLE, labelsMat);
    // std::cout << "SVM training finish!" << std::endl;

    // // create the decision-tree object
    // cv::Ptr<cv::ml::DTrees> dtree = cv::ml::DTrees::create();
    // dtree->setMaxDepth(30);      // maximum tree depth
    // dtree->setCVFolds(0);
    // dtree->setMinSampleCount(1); // minimum number of samples required to split an internal node
    // std::cout << "create dtree object finish!" << std::endl;
    // // train the decision tree -- trainingData: training samples, labelsMat: training labels
    // cv::Ptr<cv::ml::TrainData> td = cv::ml::TrainData::create(trainingData, cv::ml::ROW_SAMPLE, labelsMat);
    // std::cout << "create TrainData object finish!" << std::endl;
    // if(dtree->train(td))
    // {
    //     std::cout << "dtree training finish!" << std::endl;
    // }else{
    //     std::cout << "dtree training fail!" << std::endl;
    // }

    // 3. Configure and train the random forest model
    cv::Ptr<cv::ml::RTrees> rf = cv::ml::RTrees::create();
    rf->setMaxDepth(30);        // maximum depth of each decision tree
    rf->setMinSampleCount(2);   // minimum number of samples in a leaf node
    rf->setTermCriteria(cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 10, 0.1)); // termination criteria
    rf->train(trainingData, cv::ml::ROW_SAMPLE, labelsMat);

    // model test
    cv::Mat testData = read_mnist_image(testImgFile);
    //images data normalization
    testData = testData/255.0;
    std::cout << "testData.rows = " << testData.rows << std::endl;
    std::cout << "testData.cols = " << testData.cols << std::endl;
    std::cout << "testData.type() = " << testData.type() << std::endl;
    //read test label, data type CV_32SC1
    cv::Mat testlabel = read_mnist_label(testLabeFile);
    cv::Mat testResp;
    // float response = svm->predict(testData,testResp);
    // float response = dtree->predict(testData,testResp);
    float response = rf->predict(testData,testResp);
    // std::cout << "response = " << response << std::endl;
    testResp.convertTo(testResp,CV_32SC1);
    int map_num = 0;
    for (int i = 0; i < testResp.rows; i++)
    {
        if (testResp.at<int>(i, 0) == testlabel.at<int>(i, 0))
        {
            map_num++;
        }
        // else{
        //     std::cout << "testResp.at<int>(i, 0) " << testResp.at<int>(i, 0) << std::endl;
        //     std::cout << "testlabel.at<int>(i, 0) " << testlabel.at<int>(i, 0) << std::endl;
        // }
    }
    float proportion = float(map_num) / float(testResp.rows);
    std::cout << "map rate: " << proportion * 100 << "%" << std::endl;
    std::cout << "SVM testing finish!" << std::endl;
    //save model
    // svm->save("mnist_svm.xml");
    // dtree->save("mnist_svm.xml");
    rf->save("mnist_svm.xml");
}

void prediction(const std::string fileName, cv::Ptr<cv::ml::DTrees> dtree)
// void prediction(const std::string fileName, cv::Ptr<cv::ml::SVM> svm)
{
    //read img 28*28 size
    cv::Mat image = cv::imread(fileName, cv::IMREAD_GRAYSCALE);
    //uchar->float32
    image.convertTo(image, CV_32F);
    //image data normalization
    image = image / 255.0;
    //28*28 -> 1*784
    image = image.reshape(1, 1);
    //predict the image
    float ret = dtree->predict(image);
    std::cout << "predict val = "<< ret << std::endl;
}

std::string imgDir = "D:\\workForMy\\OpenCVLib\\opencv_demo\\opencv_ml01\\t10k-images\\";
std::string ImgFiles[5] = {"image_0.png","image_10.png","image_20.png","image_30.png","image_40.png",};

void predictimgs()
{
    //load svm model
    // cv::Ptr<cv::ml::SVM> svm = cv::ml::StatModel::load<cv::ml::SVM>("mnist_svm.xml");
    //load DTrees model
    // cv::Ptr<cv::ml::DTrees> dtree = cv::ml::StatModel::load<cv::ml::DTrees>("mnist_svm.xml");
    cv::Ptr<cv::ml::RTrees> rf = cv::ml::StatModel::load<cv::ml::RTrees>("mnist_svm.xml");
    for (size_t i = 0; i < 5; i++)
    {
        prediction(imgDir+ImgFiles[i],rf);
    }
}

int main()
{
    train_SVM();
    predictimgs();
    return 0;
}

Original article: https://blog.csdn.net/py8105/article/details/138200564