• 300 Lines of Java a Day: day54-55


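    Notes

    Link to Prof. Min's article: 日撸 Java 三百行(总述)_minfanphd的博客-CSDN博客 ("300 Lines of Java a Day (Overview)", minfanphd's blog on CSDN)
    I also keep the code I typed by hand on GitHub: https://github.com/fulisha-ok/sampledata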

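    day54 M-distance-based recommendation

    1. Understanding M-distance

    In the KNN of day51-53 we predicted the class of an object: we used the distance between the test sample and the training samples to find the k most similar neighbors, and used those k neighbors to predict the class of the test sample. M-distance instead computes the distance between two users (or two items) from their average ratings.
    The example here is item-based prediction. Suppose we want to predict user u0's rating of m2: how do we find the neighbors? We compute the average rating of every item, take the items whose average lies within a given radius of m2's average, and predict the rating as the average of u0's ratings of those items.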
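The idea can be tried on a toy example. The following is a minimal sketch, not the article's MBR class: the 4x4 matrix, the class name ItemBasedSketch and its methods are all made up for illustration, and 0 is assumed to mean "not rated".

```java
/** Toy item-based M-distance prediction on a small dense matrix (0 means "not rated"). */
final class ItemBasedSketch {
    /** Hypothetical toy data: rows are users u0..u3, columns are items m0..m3. */
    static final int[][] RATINGS = {
        {4, 2, 0, 5},
        {4, 1, 3, 0},
        {0, 3, 4, 5},
        {2, 0, 2, 4},
    };

    /** Average of the non-zero ratings of one item; NaN if nobody rated it. */
    static double itemAverage(int[][] paraRatings, int paraItem) {
        double tempSum = 0;
        int tempCount = 0;
        for (int[] tempRow : paraRatings) {
            if (tempRow[paraItem] != 0) {
                tempSum += tempRow[paraItem];
                tempCount++;
            }
        }
        return tempCount == 0 ? Double.NaN : tempSum / tempCount;
    }

    /**
     * Predict the rating of paraUser for paraItem: average the user's ratings of
     * all other items whose average lies within paraRadius of the target item's average.
     */
    static double predict(int[][] paraRatings, int paraUser, int paraItem,
            double paraRadius, double paraDefault) {
        double tempTargetAverage = itemAverage(paraRatings, paraItem);
        double tempTotal = 0;
        int tempNeighbors = 0;
        for (int j = 0; j < paraRatings[paraUser].length; j++) {
            if (j == paraItem || paraRatings[paraUser][j] == 0) {
                continue; // Skip the target itself and items the user has not rated.
            }
            if (Math.abs(tempTargetAverage - itemAverage(paraRatings, j)) < paraRadius) {
                tempTotal += paraRatings[paraUser][j];
                tempNeighbors++;
            }
        }
        return tempNeighbors > 0 ? tempTotal / tempNeighbors : paraDefault;
    }

    public static void main(String[] args) {
        // Item averages: m0 = 3.33, m1 = 2.0, m2 = 3.0, m3 = 4.67.
        System.out.println(predict(RATINGS, 0, 2, 0.5, 3.0)); // Only m0 is a neighbor.
        System.out.println(predict(RATINGS, 0, 2, 0.1, 3.0)); // No neighbor, default is used.
    }
}
```

With radius 0.5 only m0 lies close enough to m2, so the prediction is u0's rating of m0. With radius 0.1 no item qualifies and the default rating is returned.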

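    2. Understanding the code

    1. Variables in the code

    The data file contains 100000 records.
    Each line is a (user, item, rating) triple; for example, 0,0,5 means that user 0 rated item 0 with 5 points. There are 943 users, 1682 movies and 100000 ratings. Some of the variables are explained below.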

    • Totals (number of items, number of users, number of ratings)
    private int numItems;
    private int numUsers;
    private int numRatings;

    • compressedRatingMatrix (the compressed rating matrix; in practice it is simply the file contents read into memory)
      For this data file it holds 100000 records.
    private int[][] compressedRatingMatrix;

    • userDegrees (the number of items each user has rated)
    private int[] userDegrees;

    • userStartingIndices (the starting index of each user's records; for example, the starting index of user 1 is 272)
    private int[] userStartingIndices;

    • userAverageRatings (the average of the ratings each user has given)
    private double[] userAverageRatings;

    • itemDegrees (the number of times each item has been rated, i.e. how many users have rated it)
    private int[] itemDegrees;

    • itemAverageRatings (the average rating of each item)
    private double[] itemAverageRatings;

    The MBR constructor assigns and initializes the variables above.
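To make the role of userDegrees and userStartingIndices concrete, here is a small sketch on made-up triples. It assumes, like the data file, that the records are sorted by user. The class CompressedMatrixSketch and its method are hypothetical, and the prefix-sum construction is a variant rather than the constructor's own logic (the constructor records an index whenever the user id changes).

```java
import java.util.Arrays;

/** Sketch: deriving per-user starting indices from user-sorted (user, item, rating) triples. */
final class CompressedMatrixSketch {
    /** Hypothetical toy data: six ratings by three users, sorted by user id. */
    static final int[][] TRIPLES = {
        {0, 0, 5}, {0, 1, 3}, {0, 3, 4},
        {1, 0, 4},
        {2, 1, 2}, {2, 2, 5},
    };

    /**
     * Returns an array of length paraNumUsers + 1 where entry u is the row of the
     * first rating of user u and the last entry is the total number of ratings.
     */
    static int[] startingIndices(int[][] paraTriples, int paraNumUsers) {
        int[] tempDegrees = new int[paraNumUsers];
        for (int[] tempTriple : paraTriples) {
            tempDegrees[tempTriple[0]]++; // Count the ratings of each user.
        }
        int[] resultIndices = new int[paraNumUsers + 1];
        for (int u = 0; u < paraNumUsers; u++) {
            resultIndices[u + 1] = resultIndices[u] + tempDegrees[u]; // Prefix sum.
        }
        return resultIndices;
    }

    public static void main(String[] args) {
        int[] tempIndices = startingIndices(TRIPLES, 3);
        System.out.println(Arrays.toString(tempIndices)); // [0, 3, 4, 6]
        // The ratings of user 2 occupy rows tempIndices[2] .. tempIndices[3] - 1.
        for (int j = tempIndices[2]; j < tempIndices[3]; j++) {
            System.out.println(Arrays.toString(TRIPLES[j]));
        }
    }
}
```

Because the array has one extra entry at the end, the loop over a user's ratings can always be written as `j = indices[u]; j < indices[u + 1]`, which is the pattern used in leaveOneOutPrediction.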

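    2. Leave-one-out testing

    We already met this in KNN: every sample in the data set is used once as the test set on its own, while the remaining samples serve as the training set.
    Take one test sample as an example. Say the first line, (0,0,5), is the test sample: user 0 rated item 0 with 5 points, and we now predict that rating from user 0's other items. The steps are:

    • First remove user 0's rating of item 0 and recompute the average rating of item 0.
    • Find the neighbors. The prediction is item-based: among the other items the user has rated, every item whose average rating differs from the current item's average by less than the radius counts as a neighbor, and the user's rating of it is accumulated. For example, user 0 rated 272 movies; excluding item 0, we look for neighbors among the remaining 271 movies.
    • If neighbors are found, take the average of their ratings. For example, if user 0 has 94 neighbors with a total of 387 points, the predicted rating of user 0 for item 0 is 387 / 94 = 4.117021276595745. If no neighbor is found, the code falls back to DEFAULT_RATING.
    • Full code: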
        public void leaveOneOutPrediction() {
            double tempItemAverageRating;
            // Make each line of the code shorter.
            int tempUser, tempItem, tempRating;
            System.out.println("\r\nLeaveOneOutPrediction for radius " + radius);
    
            numNonNeighbors = 0;
            for (int i = 0; i < numRatings; i++) {
                tempUser = compressedRatingMatrix[i][0];
                tempItem = compressedRatingMatrix[i][1];
                tempRating = compressedRatingMatrix[i][2];
    
                // Step 1. Recompute average rating of the current item.
                tempItemAverageRating = (itemAverageRatings[tempItem] * itemDegrees[tempItem] - tempRating)
                        / (itemDegrees[tempItem] - 1);
    
                // Step 2. Recompute neighbors, at the same time obtain the ratings
                // Of neighbors.
                int tempNeighbors = 0;
                double tempTotal = 0;
                int tempComparedItem;
                for (int j = userStartingIndices[tempUser]; j < userStartingIndices[tempUser + 1]; j++) {
                    tempComparedItem = compressedRatingMatrix[j][1];
                    if (tempItem == tempComparedItem) {
                        continue;// Ignore itself.
                    } // Of if
    
                    if (Math.abs(tempItemAverageRating - itemAverageRatings[tempComparedItem]) < radius) {
                        tempTotal += compressedRatingMatrix[j][2];
                        tempNeighbors++;
                    }
                }
                // Step 3. Predict as the average value of neighbors.
                if (tempNeighbors > 0) {
                    predictions[i] = tempTotal / tempNeighbors;
                } else {
                    predictions[i] = DEFAULT_RATING;
                    numNonNeighbors++;
                }
            }
        }
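Step 1 of the method does not re-scan the data: it recovers the item's total from average times count, subtracts the left-out rating, and divides by count minus one. A minimal sketch of that arithmetic, with a hypothetical helper name and made-up numbers:

```java
/** Sketch: recomputing an average after leaving one value out, without re-scanning the data. */
final class LeaveOneOutAverageSketch {
    /**
     * Average of the remaining values after removing paraRemoved from a set of
     * paraCount values whose average is paraAverage. NaN if nothing remains.
     */
    static double averageWithout(double paraAverage, int paraCount, double paraRemoved) {
        if (paraCount <= 1) {
            return Double.NaN; // The left-out rating was the only one.
        }
        return (paraAverage * paraCount - paraRemoved) / (paraCount - 1);
    }

    public static void main(String[] args) {
        // Ratings 5, 3, 4 have average 4.0; leaving out the 5 gives (3 + 4) / 2.
        System.out.println(averageWithout(4.0, 3, 5.0)); // 3.5
        // The worked example from the text: 94 neighbors with 387 points in total.
        System.out.println(387.0 / 94);
    }
}
```

In the article's method the same situation (an item with a single rating) yields NaN from the division, the comparison with the radius is then false for every candidate, and the prediction falls back to DEFAULT_RATING.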
    
    

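    3. Computing MAE (mean absolute error)

    The mean absolute deviation between the predicted and the actual values. The smaller the MAE, the smaller the deviation between predictions and actual values, and the more accurate the prediction model.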

        public double computeMAE() throws Exception {
            double tempTotalError = 0;
            for (int i = 0; i < predictions.length; i++) {
                tempTotalError += Math.abs(predictions[i] - compressedRatingMatrix[i][2]);
            } 
    
            return tempTotalError / predictions.length;
        }
    
    

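    4. Computing RMSE (root mean squared error)

    The square root of the mean squared difference between the predicted and the actual values. The smaller the RMSE, the smaller the squared deviation between predictions and actual values, and the more accurate the prediction model. (In the code the method is named computeRSME.)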

     public double computeRSME() throws Exception {
            double tempTotalError = 0;
            for (int i = 0; i < predictions.length; i++) {
                tempTotalError += (predictions[i] - compressedRatingMatrix[i][2])
                        * (predictions[i] - compressedRatingMatrix[i][2]);
            } 
            
            double tempAverage = tempTotalError / predictions.length;
            return Math.sqrt(tempAverage);
        }
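Both measures can be checked by hand on a few values. The sketch below is standalone: the class ErrorMetricsSketch and the four made-up predictions are illustrative and are not part of MBR.

```java
/** Sketch: MAE and RMSE on a handful of made-up predictions. */
final class ErrorMetricsSketch {
    /** Mean absolute error: the average of |prediction - actual|. */
    static double mae(double[] paraPredictions, int[] paraActual) {
        double tempTotalError = 0;
        for (int i = 0; i < paraPredictions.length; i++) {
            tempTotalError += Math.abs(paraPredictions[i] - paraActual[i]);
        }
        return tempTotalError / paraPredictions.length;
    }

    /** Root mean squared error: the square root of the average of (prediction - actual)^2. */
    static double rmse(double[] paraPredictions, int[] paraActual) {
        double tempTotalError = 0;
        for (int i = 0; i < paraPredictions.length; i++) {
            double tempDifference = paraPredictions[i] - paraActual[i];
            tempTotalError += tempDifference * tempDifference;
        }
        return Math.sqrt(tempTotalError / paraPredictions.length);
    }

    public static void main(String[] args) {
        double[] tempPredictions = {4.0, 3.0, 5.0, 2.0};
        int[] tempActual = {5, 3, 4, 4};
        // Errors are -1, 0, 1, -2: MAE = 4 / 4, RMSE = sqrt(6 / 4).
        System.out.println("MAE = " + mae(tempPredictions, tempActual));
        System.out.println("RMSE = " + rmse(tempPredictions, tempActual));
    }
}
```

RMSE is never smaller than MAE on the same data, and the single error of 2 pulls it further up here because squaring weights large errors more heavily.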
    

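    day55 M-distance-based recommendation (continued)

    1. User-based and item-based recommendation

    • Item-based recommendation (day 54)
      To predict the rating of m2: compute the average rating of every item (m0~m5), find the neighbors (m1, m3), and take the average of the user's ratings of them as the predicted score for m2. (Here the score is estimated from neighboring items.)
    • User-based recommendation
      To predict the rating of m2: compute the average rating of every user (u0~u4), find the neighbors (u1), and take the average of their ratings of m2 as the predicted score for m2.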
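The user-based direction can be sketched on a toy matrix as well. Everything here is illustrative: the matrix, the class UserBasedSketch and its methods are assumptions, and 0 is taken to mean "not rated". The neighbors are now other users with a similar average, and what gets averaged is their rating of the target item.

```java
/** Toy user-based M-distance prediction on a small dense matrix (0 means "not rated"). */
final class UserBasedSketch {
    /** Hypothetical toy data: rows are users u0..u3, columns are items m0..m3. */
    static final int[][] RATINGS = {
        {4, 2, 0, 5},
        {4, 1, 3, 0},
        {0, 3, 4, 5},
        {2, 0, 2, 4},
    };

    /** Average of the non-zero ratings given by one user; NaN if the user rated nothing. */
    static double userAverage(int[][] paraRatings, int paraUser) {
        double tempSum = 0;
        int tempCount = 0;
        for (int tempRating : paraRatings[paraUser]) {
            if (tempRating != 0) {
                tempSum += tempRating;
                tempCount++;
            }
        }
        return tempCount == 0 ? Double.NaN : tempSum / tempCount;
    }

    /**
     * Predict the rating of paraUser for paraItem: average the ratings of paraItem given
     * by all other users whose average lies within paraRadius of the target user's average.
     */
    static double predict(int[][] paraRatings, int paraUser, int paraItem,
            double paraRadius, double paraDefault) {
        double tempTargetAverage = userAverage(paraRatings, paraUser);
        double tempTotal = 0;
        int tempNeighbors = 0;
        for (int u = 0; u < paraRatings.length; u++) {
            if (u == paraUser || paraRatings[u][paraItem] == 0) {
                continue; // Skip the target user and users who did not rate the item.
            }
            if (Math.abs(tempTargetAverage - userAverage(paraRatings, u)) < paraRadius) {
                tempTotal += paraRatings[u][paraItem];
                tempNeighbors++;
            }
        }
        return tempNeighbors > 0 ? tempTotal / tempNeighbors : paraDefault;
    }

    public static void main(String[] args) {
        // User averages: u0 = 3.67, u1 = 2.67, u2 = 4.0, u3 = 2.67.
        System.out.println(predict(RATINGS, 0, 2, 0.5, 3.0)); // Only u2 is a neighbor.
        System.out.println(predict(RATINGS, 0, 2, 1.5, 3.0)); // u1, u2 and u3 are neighbors.
    }
}
```

With radius 0.5 only u2 is close to u0, so the prediction is u2's rating of m2. With radius 1.5 all three other users count and the prediction is the plain average of their ratings of m2.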

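    2. Approach for the user-based code

    My first idea was to reassign compressedRatingMatrix so that the roles of users and items in the array are swapped. In the end I chose to code it with lists instead. My rough approach is as follows:

    2.1 Abstracting the file contents

    The three fields of each record are abstracted into an object Text, where userNum is the user's number, itemNum is the item's number, and score is the user's rating of the item.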

        class Text{
            private Integer userNum;
            private Integer itemNum;
            private Integer score;
    
            public Integer getUserNum() {
                return userNum;
            }
    
            public void setUserNum(Integer userNum) {
                this.userNum = userNum;
            }
    
            public Integer getItemNum() {
                return itemNum;
            }
    
            public void setItemNum(Integer itemNum) {
                this.itemNum = itemNum;
            }
    
            public Integer getScore() {
                return score;
            }
    
            public void setScore(Integer score) {
                this.score = score;
            }
    
            public Text(Integer userNum, Integer itemNum, Integer score) {
                this.userNum = userNum;
                this.itemNum = itemNum;
                this.score = score;
            }
        }
    

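    2.2 Constructor: initializing the MBR object (using the Stream API introduced in JDK 1.8)

    Streams are well documented online and are usually combined with lambda expressions; they support fairly complex searching, filtering and selection on collections.

    My rough idea is to put the file contents into a list, List<Text> textList, and to group this textList with streams: by movie number into Map<Integer, List<Text>> textGroupByItem, and by user number into Map<Integer, List<Text>> textGroupByUser. The corresponding arrays are then filled from these groups.

    Stream usage:

    // Group by movie number
    textGroupByItem = textList.stream().collect(Collectors.groupingBy(Text::getItemNum));
    // Group by user number
    textGroupByUser = textList.stream().collect(Collectors.groupingBy(Text::getUserNum));
    // Sum one attribute over a list
    tempUserTotalScore[i] = textsByUser.stream().mapToDouble(Text::getScore).sum();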
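For readers new to Collectors.groupingBy, the following standalone sketch shows the two forms on a few made-up records. The nested Rating class is a hypothetical stand-in for Text. Passing a downstream collector such as Collectors.averagingInt is an alternative that yields the averages directly; it is shown only for comparison and is not what the constructor below does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Sketch: grouping rating records with streams. */
final class GroupingSketch {
    /** Hypothetical stand-in for the Text class: one (user, item, score) record. */
    static final class Rating {
        private final int user;
        private final int item;
        private final int score;

        Rating(int paraUser, int paraItem, int paraScore) {
            user = paraUser;
            item = paraItem;
            score = paraScore;
        }

        int getUser() { return user; }
        int getItem() { return item; }
        int getScore() { return score; }
    }

    /** Made-up sample data: five ratings by three users on two items. */
    static List<Rating> sample() {
        return Arrays.asList(new Rating(0, 0, 5), new Rating(0, 1, 3),
                new Rating(1, 0, 4), new Rating(1, 1, 1), new Rating(2, 0, 3));
    }

    /** One list of records per user id. */
    static Map<Integer, List<Rating>> groupByUser(List<Rating> paraRatings) {
        return paraRatings.stream().collect(Collectors.groupingBy(Rating::getUser));
    }

    /** Average score per item id, computed by a downstream collector. */
    static Map<Integer, Double> averageByItem(List<Rating> paraRatings) {
        return paraRatings.stream().collect(
                Collectors.groupingBy(Rating::getItem, Collectors.averagingInt(Rating::getScore)));
    }

    public static void main(String[] args) {
        System.out.println(groupByUser(sample()).get(0).size()); // User 0 has 2 ratings.
        System.out.println(averageByItem(sample()));             // Item 0 -> 4.0, item 1 -> 2.0.
    }
}
```

Note that a key that never occurs in the data is simply absent from the resulting map, so get returns null for a user or item without ratings.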
    

    Code:

        public MBR(String paraFileName, int paraNumUsers, int paraNumItems, int paraNumRatings, boolean basedUser) throws Exception {
            if (basedUser){
                // User-based computation.
                // Step 1. Initialize these arrays.
                numItems = paraNumItems;
                numUsers = paraNumUsers;
                numRatings = paraNumRatings;

                userDegrees = new int[numUsers];
                userAverageRatings = new double[numUsers];
                itemDegrees = new int[numItems];
                itemAverageRatings = new double[numItems];
                predictions = new double[numRatings];

                System.out.println("Reading " + paraFileName);

                // Step 2. Read the data file.
                File tempFile = new File(paraFileName);
                if (!tempFile.exists()) {
                    System.out.println("File " + paraFileName + " does not exists.");
                    System.exit(0);
                }
                BufferedReader tempBufReader = new BufferedReader(new FileReader(tempFile));
                String tempString;
                String[] tempStrArray;
                while ((tempString = tempBufReader.readLine()) != null) {
                    // Each line has three values.
                    tempStrArray = tempString.split(",");
                    // Read the record into textList.
                    Text text = new Text(Integer.parseInt(tempStrArray[0]), Integer.parseInt(tempStrArray[1]), Integer.parseInt(tempStrArray[2]));
                    textList.add(text);
                }
                tempBufReader.close();

                // Step 3. Group the records by movie number and by user number.
                textGroupByItem = textList.stream().collect(Collectors.groupingBy(Text::getItemNum));
                textGroupByUser = textList.stream().collect(Collectors.groupingBy(Text::getUserNum));
                double[] tempUserTotalScore = new double[numUsers];
                double[] tempItemTotalScore = new double[numItems];
                for (int i = 0; i < numUsers; i++) {
                    // Total score, degree and average of the user.
                    List<Text> textsByUser = textGroupByUser.get(i);
                    tempUserTotalScore[i] = textsByUser.stream().mapToDouble(Text::getScore).sum();
                    userDegrees[i] = textsByUser.size();
                    userAverageRatings[i] = tempUserTotalScore[i] / userDegrees[i];
                }

                for (int i = 0; i < numItems; i++) {
                    try {
                        // Total score, degree and average of the movie.
                        List<Text> textsByItem = textGroupByItem.get(i);
                        tempItemTotalScore[i] = textsByItem.stream().mapToDouble(Text::getScore).sum();
                        itemDegrees[i] = textsByItem.size();
                        itemAverageRatings[i] = tempItemTotalScore[i] / itemDegrees[i];
                    } catch (Exception e) {
                        // A movie without any rating has no group, so get(i) returns null.
                        System.out.println(e.getMessage());
                    }

                }
            }
        }
    

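    2.3 Leave-one-out testing

    In the leave-one-out test, take the first record of the file, (0,0,5), as an example: we want to predict user 0's rating of item 0.
    (1) Step 1: exclude user 0's rating of item 0 (user 0 rated 272 items) and recompute user 0's average rating over the remaining 271 items.
    (2) Step 2: go through the users who rated item 0 (452 of them) and check, for each, whether the difference between that user's average rating and the recomputed average is within the radius, accumulating the number of neighbors and the total of their ratings of item 0.
    (3) Step 3: predict user 0's rating of item 0 as the average of the neighbors' ratings.

    Stream usage:

    // Filter the list
    textsByUser = textsByUser.stream().filter(e -> !e.getItemNum().equals(outItem)).collect(Collectors.toList());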
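The filter builds a new list for every one of the 100000 test records. The sketch below compares it with the arithmetic shortcut used in the item-based method, on made-up data. The class and method names are hypothetical, each rating is represented as an {item, score} pair for brevity, and in a real implementation the user's sum would be precomputed once rather than recomputed per call.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: two ways to get a user's average with one item left out. */
final class LeaveOneOutFilterSketch {
    /** Hypothetical ratings of one user as {item, score} pairs. */
    static final List<int[]> USER_RATINGS = Arrays.asList(
            new int[] {0, 5}, new int[] {1, 3}, new int[] {3, 4});

    /** Filter the left-out item away, then average what remains. */
    static double averageByFilter(List<int[]> paraUserRatings, int paraOutItem) {
        return paraUserRatings.stream()
                .filter(e -> e[0] != paraOutItem)
                .mapToInt(e -> e[1])
                .average()
                .orElse(Double.NaN); // Nothing left after filtering.
    }

    /** Subtract the left-out score from the total and divide by count - 1. */
    static double averageByArithmetic(List<int[]> paraUserRatings, int paraOutScore) {
        int tempSum = paraUserRatings.stream().mapToInt(e -> e[1]).sum();
        int tempCount = paraUserRatings.size();
        if (tempCount <= 1) {
            return Double.NaN;
        }
        return (double) (tempSum - paraOutScore) / (tempCount - 1);
    }

    public static void main(String[] args) {
        // Leaving out item 0 (score 5) leaves the scores 3 and 4.
        System.out.println(averageByFilter(USER_RATINGS, 0));     // 3.5
        System.out.println(averageByArithmetic(USER_RATINGS, 5)); // 3.5
    }
}
```

Both give the same value as long as the user rated the left-out item exactly once, which holds for rating data where each (user, item) pair appears at most once.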
    

    Code:

        public void leaveOneOutPredictionByUser() {
            double tempUserAverageRating;
            // Make each line of the code shorter.
            int tempUser, tempItem, tempRating;
            System.out.println("\r\nLeaveOneOutPredictionUser for radius " + radius);

            numNonNeighbors = 0;
            for (int i = 0; i < numRatings; i++) {
                Text text = textList.get(i);
                tempUser = text.getUserNum();
                tempItem = text.getItemNum();
                tempRating = text.getScore();

                // Step 1. Recompute average rating of the current user.
                List<Text> textsByUser = textGroupByUser.get(tempUser);
                Integer outItem = tempItem;
                textsByUser = textsByUser.stream().filter(e -> !e.getItemNum().equals(outItem)).collect(Collectors.toList());
                tempUserAverageRating = textsByUser.stream().mapToDouble(Text::getScore).sum() / textsByUser.size();

                // Step 2. Recompute neighbors, at the same time obtain the ratings
                // Of neighbors.
                int tempNeighbors = 0;
                double tempTotal = 0;
                List<Text> texts = textGroupByItem.get(tempItem);
                for (int j = 0; j < texts.size(); j++) {
                    Text userText = texts.get(j);
                    // Compare user ids, not the list position j.
                    if (tempUser == userText.getUserNum()) {
                        continue;// Ignore itself.
                    }

                    if (Math.abs(tempUserAverageRating - userAverageRatings[userText.getUserNum()]) < radius) {
                        tempTotal += userText.getScore();
                        tempNeighbors++;
                    }
                }
                // Step 3. Predict as the average value of neighbors.
                if (tempNeighbors > 0) {
                    predictions[i] = tempTotal / tempNeighbors;
                } else {
                    predictions[i] = DEFAULT_RATING;
                    numNonNeighbors++;
                }
            }

        }
    
    

    2.4 Results

    [Figure: program output]

    • day54-55 code
    package machinelearing.knn;
    
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;
    
    public class MBR {
        /**
         * Default rating for 1-5 points
         */
        public static final double DEFAULT_RATING = 3.0;
    
        /**
         * The total number of users who took part in rating.
         */
        private int numUsers;


        /**
         * The total number of rated items.
         */
        private  int numItems;

        /**
         * The total number of ratings (non-zero values).
         */
        private int numRatings;


        /**
         * The predictions.
         */
        private double[] predictions;

        /**
         * Compressed rating matrix: user-item-rating triples.
         */
        private int[][] compressedRatingMatrix;

        /**
         * The degree of users (how many items each user has rated).
         */
        private int[] userDegrees;

        /**
         * The average rating of each user.
         */
        private double[] userAverageRatings;

        /**
         * The degree of items (how many times each item has been rated).
         */
        private int[] itemDegrees;

        /**
         * The average rating of each item.
         */
        private double[] itemAverageRatings;

        /**
         * The first user starts from 0. If the first user has x ratings, the second user starts from x.
         * Used to locate a user's ratings in compressedRatingMatrix.
         */
        private int[] userStartingIndices;

        /**
         * Number of non-neighbor objects, i.e. predictions for which no neighbor was found within the radius.
         */
        private int numNonNeighbors;

        /**
         * The radius (delta) for determining the neighborhood. Objects within this radius are treated as neighbors.
         */
        private double radius;

        List<Text> textList = new ArrayList<>();

        private Map<Integer, List<Text>> textGroupByItem = new HashMap<>();

        private Map<Integer, List<Text>> textGroupByUser = new HashMap<>();
    
        class Text{
            private Integer userNum;
            private Integer itemNum;
            private Integer score;
    
            public Integer getUserNum() {
                return userNum;
            }
    
            public void setUserNum(Integer userNum) {
                this.userNum = userNum;
            }
    
            public Integer getItemNum() {
                return itemNum;
            }
    
            public void setItemNum(Integer itemNum) {
                this.itemNum = itemNum;
            }
    
            public Integer getScore() {
                return score;
            }
    
            public void setScore(Integer score) {
                this.score = score;
            }
    
            public Text(Integer userNum, Integer itemNum, Integer score) {
                this.userNum = userNum;
                this.itemNum = itemNum;
                this.score = score;
            }
        }
    
    
        public MBR(String paraFileName, int paraNumUsers, int paraNumItems, int paraNumRatings) throws Exception{
            //step1. initialize these arrays
            numItems = paraNumItems;
            numUsers = paraNumUsers;
            numRatings = paraNumRatings;
    
            userDegrees = new int[numUsers];
            userStartingIndices = new int[numUsers + 1];
            userAverageRatings = new double[numUsers];
            itemDegrees = new int[numItems];
            compressedRatingMatrix = new int[numRatings][3];
            itemAverageRatings = new double[numItems];
            predictions = new double[numRatings];
    
            System.out.println("Reading " + paraFileName);
    
            //step2. Read the data file
            File tempFile = new File(paraFileName);
            if (!tempFile.exists()) {
                System.out.println("File " + paraFileName + " does not exists.");
                System.exit(0);
            }
            BufferedReader tempBufReader = new BufferedReader(new FileReader(tempFile));
            String tempString;
            String[] tempStrArray;
            int tempIndex = 0;
            userStartingIndices[0] = 0;
            userStartingIndices[numUsers] = numRatings;
            while ((tempString = tempBufReader.readLine()) != null) {
                // Each line has three values
                tempStrArray = tempString.split(",");
                compressedRatingMatrix[tempIndex][0] = Integer.parseInt(tempStrArray[0]);
                compressedRatingMatrix[tempIndex][1] = Integer.parseInt(tempStrArray[1]);
                compressedRatingMatrix[tempIndex][2] = Integer.parseInt(tempStrArray[2]);
    
                userDegrees[compressedRatingMatrix[tempIndex][0]]++;
                itemDegrees[compressedRatingMatrix[tempIndex][1]]++;
    
                if (tempIndex > 0) {
                    // Starting to read the data of a new user.
                    if (compressedRatingMatrix[tempIndex][0] != compressedRatingMatrix[tempIndex - 1][0]) {
                        userStartingIndices[compressedRatingMatrix[tempIndex][0]] = tempIndex;
                    }
                }
                tempIndex++;
            }
            tempBufReader.close();
    
            double[] tempUserTotalScore = new double[numUsers];
            double[] tempItemTotalScore = new double[numItems];
            for (int i = 0; i < numRatings; i++) {
                tempUserTotalScore[compressedRatingMatrix[i][0]] += compressedRatingMatrix[i][2];
                tempItemTotalScore[compressedRatingMatrix[i][1]] += compressedRatingMatrix[i][2];
            }
    
            for (int i = 0; i < numUsers; i++) {
                userAverageRatings[i] = tempUserTotalScore[i] / userDegrees[i];
            }
    
            for (int i = 0; i < numItems; i++) {
                itemAverageRatings[i] = tempItemTotalScore[i] / itemDegrees[i];
            }
    
        }
    
        public MBR(String paraFileName, int paraNumUsers, int paraNumItems, int paraNumRatings, boolean basedUser) throws Exception {
            if (basedUser){
                // User-based computation.
                // Step 1. Initialize these arrays.
                numItems = paraNumItems;
                numUsers = paraNumUsers;
                numRatings = paraNumRatings;

                userDegrees = new int[numUsers];
                userAverageRatings = new double[numUsers];
                itemDegrees = new int[numItems];
                itemAverageRatings = new double[numItems];
                predictions = new double[numRatings];

                System.out.println("Reading " + paraFileName);

                // Step 2. Read the data file.
                File tempFile = new File(paraFileName);
                if (!tempFile.exists()) {
                    System.out.println("File " + paraFileName + " does not exists.");
                    System.exit(0);
                }
                BufferedReader tempBufReader = new BufferedReader(new FileReader(tempFile));
                String tempString;
                String[] tempStrArray;
                while ((tempString = tempBufReader.readLine()) != null) {
                    // Each line has three values.
                    tempStrArray = tempString.split(",");
                    // Read the record into textList.
                    Text text = new Text(Integer.parseInt(tempStrArray[0]), Integer.parseInt(tempStrArray[1]), Integer.parseInt(tempStrArray[2]));
                    textList.add(text);
                }
                tempBufReader.close();

                // Step 3. Group the records by movie number and by user number.
                textGroupByItem = textList.stream().collect(Collectors.groupingBy(Text::getItemNum));
                textGroupByUser = textList.stream().collect(Collectors.groupingBy(Text::getUserNum));
                double[] tempUserTotalScore = new double[numUsers];
                double[] tempItemTotalScore = new double[numItems];
                for (int i = 0; i < numUsers; i++) {
                    // Total score, degree and average of the user.
                    List<Text> textsByUser = textGroupByUser.get(i);
                    tempUserTotalScore[i] = textsByUser.stream().mapToDouble(Text::getScore).sum();
                    userDegrees[i] = textsByUser.size();
                    userAverageRatings[i] = tempUserTotalScore[i] / userDegrees[i];
                }

                for (int i = 0; i < numItems; i++) {
                    try {
                        // Total score, degree and average of the movie.
                        List<Text> textsByItem = textGroupByItem.get(i);
                        tempItemTotalScore[i] = textsByItem.stream().mapToDouble(Text::getScore).sum();
                        itemDegrees[i] = textsByItem.size();
                        itemAverageRatings[i] = tempItemTotalScore[i] / itemDegrees[i];
                    } catch (Exception e) {
                        // A movie without any rating has no group, so get(i) returns null.
                        System.out.println(e.getMessage());
                    }

                }
            }
        }
    
        public void setRadius(double paraRadius) {
            if (paraRadius > 0) {
                radius = paraRadius;
            } else {
                radius = 0.1;
            }
        }
    
        public void leaveOneOutPrediction() {
            double tempItemAverageRating;
            // Make each line of the code shorter.
            int tempUser, tempItem, tempRating;
           // System.out.println("\r\nLeaveOneOutPrediction for radius " + radius);
    
            numNonNeighbors = 0;
            for (int i = 0; i < numRatings; i++) {
                tempUser = compressedRatingMatrix[i][0];
                tempItem = compressedRatingMatrix[i][1];
                tempRating = compressedRatingMatrix[i][2];
    
                // Step 1. Recompute average rating of the current item.
                tempItemAverageRating = (itemAverageRatings[tempItem] * itemDegrees[tempItem] - tempRating)
                        / (itemDegrees[tempItem] - 1);
    
                // Step 2. Recompute neighbors, at the same time obtain the ratings
                // Of neighbors.
                int tempNeighbors = 0;
                double tempTotal = 0;
                int tempComparedItem;
                for (int j = userStartingIndices[tempUser]; j < userStartingIndices[tempUser + 1]; j++) {
                    tempComparedItem = compressedRatingMatrix[j][1];
                    if (tempItem == tempComparedItem) {
                        continue;// Ignore itself.
                    } // Of if
    
                    if (Math.abs(tempItemAverageRating - itemAverageRatings[tempComparedItem]) < radius) {
                        tempTotal += compressedRatingMatrix[j][2];
                        tempNeighbors++;
                    }
                }
                // Step 3. Predict as the average value of neighbors.
                if (tempNeighbors > 0) {
                    predictions[i] = tempTotal / tempNeighbors;
                } else {
                    predictions[i] = DEFAULT_RATING;
                    numNonNeighbors++;
                }
            }
        }
    
    
        public void leaveOneOutPredictionByUser() {
            double tempUserAverageRating;
            // Make each line of the code shorter.
            int tempUser, tempItem, tempRating;
            // System.out.println("\r\nLeaveOneOutPredictionUser for radius " + radius);

            numNonNeighbors = 0;
            for (int i = 0; i < numRatings; i++) {
                Text text = textList.get(i);
                tempUser = text.getUserNum();
                tempItem = text.getItemNum();
                tempRating = text.getScore();

                // Step 1. Recompute average rating of the current user.
                List<Text> textsByUser = textGroupByUser.get(tempUser);
                Integer outItem = tempItem;
                textsByUser = textsByUser.stream().filter(e -> !e.getItemNum().equals(outItem)).collect(Collectors.toList());
                tempUserAverageRating = textsByUser.stream().mapToDouble(Text::getScore).sum() / textsByUser.size();

                // Step 2. Recompute neighbors, at the same time obtain the ratings
                // Of neighbors.
                int tempNeighbors = 0;
                double tempTotal = 0;
                List<Text> texts = textGroupByItem.get(tempItem);
                for (int j = 0; j < texts.size(); j++) {
                    Text userText = texts.get(j);
                    // Compare user ids, not the list position j.
                    if (tempUser == userText.getUserNum()) {
                        continue;// Ignore itself.
                    }

                    if (Math.abs(tempUserAverageRating - userAverageRatings[userText.getUserNum()]) < radius) {
                        tempTotal += userText.getScore();
                        tempNeighbors++;
                    }
                }
                // Step 3. Predict as the average value of neighbors.
                if (tempNeighbors > 0) {
                    predictions[i] = tempTotal / tempNeighbors;
                } else {
                    predictions[i] = DEFAULT_RATING;
                    numNonNeighbors++;
                }
            }

        }
    
    
        public double computeMAE() throws Exception {
            double tempTotalError = 0;
            for (int i = 0; i < predictions.length; i++) {
                tempTotalError += Math.abs(predictions[i] - compressedRatingMatrix[i][2]);
            } // Of for i
    
            return tempTotalError / predictions.length;
        }
    
        public double computeMAE_User() throws Exception {
            double tempTotalError = 0;
            for (int i = 0; i < predictions.length; i++) {
                tempTotalError += Math.abs(predictions[i] - textList.get(i).getScore());
            } // Of for i
    
            return tempTotalError / predictions.length;
        }
    
    
        public double computeRSME() throws Exception {
            double tempTotalError = 0;
            for (int i = 0; i < predictions.length; i++) {
                tempTotalError += (predictions[i] - compressedRatingMatrix[i][2])
                        * (predictions[i] - compressedRatingMatrix[i][2]);
            } // Of for i
    
            double tempAverage = tempTotalError / predictions.length;
    
            return Math.sqrt(tempAverage);
        }
    
        public double computeRSME_User() throws Exception {
            double tempTotalError = 0;
            for (int i = 0; i < predictions.length; i++) {
                tempTotalError += (predictions[i] - textList.get(i).getScore())
                        * (predictions[i] - textList.get(i).getScore());
            } // Of for i
    
            double tempAverage = tempTotalError / predictions.length;
    
            return Math.sqrt(tempAverage);
        }
    
        public static void main(String[] args) {
            try {
                MBR tempRecommender = new MBR("C:/Users/Desktop/sampledata/movielens-943u1682m.txt", 943, 1682, 100000);
                MBR tempRecommender1 = new MBR("C:/Users/Desktop/sampledata/movielens-943u1682m.txt", 943, 1682, 100000, true);
                for (double tempRadius = 0.2; tempRadius < 0.6; tempRadius += 0.1) {
                    tempRecommender.setRadius(tempRadius);
                    tempRecommender1.setRadius(tempRadius);
    
                    tempRecommender.leaveOneOutPrediction();
                    double tempMAE = tempRecommender.computeMAE();
                    double tempRSME = tempRecommender.computeRSME();
    
                    tempRecommender1.leaveOneOutPredictionByUser();
                    double tempMAE1 = tempRecommender1.computeMAE_User();
                    double tempRSME1 = tempRecommender1.computeRSME_User();
    
                    System.out.println("Radius_item = " + tempRadius + ", MAE_item = " + tempMAE + ", RSME_item = " + tempRSME
                            + ", numNonNeighbors_item = " + tempRecommender.numNonNeighbors);
    
                    System.out.println("Radius_user = " + tempRadius + ", MAE_user = " + tempMAE1 + ", RSME_user = " + tempRSME1
                            + ", numNonNeighbors_user = " + tempRecommender1.numNonNeighbors);
                }
            } catch (Exception ee) {
                System.out.println(ee);
            }
        }
    
    
    }
    
    
  • Original article: https://blog.csdn.net/fulishafulisha/article/details/130836579