• Machine learning week 9 (Andrew Ng)


    Recommender systems

    1. Collaborative filtering

    1.1 Making recommendations

    Introduction

    1.2 Using per-item features

    Cost function:
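    With the item features $x^{(i)}$ given, the parameters of user $j$ are fit by regularized squared error over the movies that user has rated. A sketch of that objective, written in the notation of the code in section 1.3 (the constant $1/m^{(j)}$ is dropped because it does not change the minimizer, and $n$ is assumed to denote the number of features):

```latex
J\left(w^{(j)}, b^{(j)}\right)
  = \frac{1}{2} \sum_{i \,:\, r(i,j)=1}
      \left( w^{(j)} \cdot x^{(i)} + b^{(j)} - y^{(i,j)} \right)^{2}
  + \frac{\lambda}{2} \sum_{k=1}^{n} \left( w^{(j)}_{k} \right)^{2}
```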

    1.3 Collaborative Filtering algorithm

    If the item features $x$ are not given, they can be estimated from the ratings using known parameters $w$ and $b$.
    The cost function is similar to before.
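    Written out, the joint objective that the code in this section computes looks roughly as follows. This is a reconstruction from that code rather than a copy of the missing figure; $n_u$, $n_m$ and $n$ are assumed to denote the number of users, movies and features.

```latex
J(w, b, x)
  = \frac{1}{2} \sum_{(i,j) \,:\, r(i,j)=1}
      \left( w^{(j)} \cdot x^{(i)} + b^{(j)} - y^{(i,j)} \right)^{2}
  + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( w^{(j)}_{k} \right)^{2}
  + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x^{(i)}_{k} \right)^{2}
```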

    import numpy as np
    import tensorflow as tf

    # Y (4778, 443) R (4778, 443)
    # X (4778, 10)
    # W (443, 10)
    # b (1, 443)
    # num_features 10
    # num_movies 4778
    # num_users 443
    def cofi_cost_func(X, W, b, Y, R, lambda_):
        """
        Returns the cost for collaborative filtering
        Args:
          X (ndarray (num_movies,num_features)): matrix of item features
          W (ndarray (num_users,num_features)) : matrix of user parameters
          b (ndarray (1, num_users))           : vector of user parameters
          Y (ndarray (num_movies,num_users))   : matrix of user ratings of movies
          R (ndarray (num_movies,num_users))   : matrix, where R(i, j) = 1 if the i-th movie was rated by the j-th user
          lambda_ (float): regularization parameter
        Returns:
          J (float) : Cost
        """
        nm, nu = Y.shape
        J = 0
        for j in range(nu):
            w = W[j,:]        # parameters of user j
            b_j = b[0,j]
            for i in range(nm):
                x = X[i,:]    # features of movie i
                # squared error, counted only where movie i was rated by user j
                J += R[i,j] * ((np.dot(w,x) + b_j - Y[i,j])**2)
        J /= 2
        # L2 regularization on both the user parameters and the item features
        J += (lambda_ / 2) * (np.sum(np.square(W)) + np.sum(np.square(X)))
    
        return J
    
    
    def cofi_cost_func_v(X, W, b, Y, R, lambda_):
        """
        Returns the cost for collaborative filtering
        Vectorized for speed. Uses tensorflow operations to be compatible with custom training loop.
        Args:
          X (ndarray (num_movies,num_features)): matrix of item features
          W (ndarray (num_users,num_features)) : matrix of user parameters
          b (ndarray (1, num_users))           : vector of user parameters
          Y (ndarray (num_movies,num_users))   : matrix of user ratings of movies
          R (ndarray (num_movies,num_users))   : matrix, where R(i, j) = 1 if the i-th movie was rated by the j-th user
          lambda_ (float): regularization parameter
        Returns:
          J (float) : Cost
        """
        # prediction error for every (movie, user) pair; b broadcasts over rows,
        # and multiplying by R zeroes out the pairs that have no rating
        err = (tf.linalg.matmul(X, tf.transpose(W)) + b - Y)*R
        J = 0.5 * tf.reduce_sum(err**2) + (lambda_/2) * (tf.reduce_sum(X**2) + tf.reduce_sum(W**2))
        return J
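    As a quick sanity check, the sketch below re-implements the same cost in plain NumPy, once with explicit loops and once vectorized, and compares the two on a small random problem. The names `cofi_cost_loop` and `cofi_cost_np` and the toy sizes are assumptions made for this illustration; they are not part of the course lab.

```python
import numpy as np

def cofi_cost_loop(X, W, b, Y, R, lambda_):
    """Loop version: sum squared errors over rated (movie, user) pairs."""
    nm, nu = Y.shape
    J = 0.0
    for j in range(nu):
        for i in range(nm):
            J += R[i, j] * (np.dot(W[j], X[i]) + b[0, j] - Y[i, j]) ** 2
    J /= 2
    J += (lambda_ / 2) * (np.sum(W ** 2) + np.sum(X ** 2))
    return J

def cofi_cost_np(X, W, b, Y, R, lambda_):
    """Vectorized version: same cost using matrix operations."""
    err = (X @ W.T + b - Y) * R  # b has shape (1, num_users) and broadcasts
    return 0.5 * np.sum(err ** 2) + (lambda_ / 2) * (np.sum(X ** 2) + np.sum(W ** 2))

# Toy problem (sizes are arbitrary, chosen only for the check)
rng = np.random.default_rng(0)
num_movies, num_users, num_features = 5, 4, 3
X = rng.normal(size=(num_movies, num_features))
W = rng.normal(size=(num_users, num_features))
b = rng.normal(size=(1, num_users))
Y = rng.integers(0, 6, size=(num_movies, num_users)).astype(float)
R = rng.integers(0, 2, size=(num_movies, num_users)).astype(float)

J_loop = cofi_cost_loop(X, W, b, Y, R, 1.5)
J_vec = cofi_cost_np(X, W, b, Y, R, 1.5)
print(np.isclose(J_loop, J_vec))
```

    Agreement between the two versions on random inputs is a cheap way to catch indexing or broadcasting mistakes before moving to the full data set.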
    
    1.4 Binary labels: favs, likes and clicks

    With binary labels, the task is to predict whether the user likes (or favorites, or clicks on) an item, rather than to predict a star rating.
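    For binary labels the linear prediction is passed through a sigmoid, and the squared error is replaced by binary cross-entropy over the observed pairs. A minimal NumPy sketch, assuming the same `X`, `W`, `b`, `Y`, `R` layout as in section 1.3 with `Y` holding 0/1 labels; the function name `binary_cofi_cost` and the small `eps` guard are illustrative choices, and regularization is left out for brevity:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cofi_cost(X, W, b, Y, R):
    """Binary cross-entropy summed over the (movie, user) pairs where R == 1."""
    p = sigmoid(X @ W.T + b)   # predicted probability that the label is 1
    eps = 1e-12                # guard against log(0); an assumption of this sketch
    loss = -(Y * np.log(p + eps) + (1 - Y) * np.log(1 - p + eps))
    return np.sum(loss * R)

# With all-zero parameters every prediction is 0.5, so each observed pair costs ln 2.
X = np.zeros((2, 3))
W = np.zeros((2, 3))
b = np.zeros((1, 2))
Y = np.array([[1.0, 0.0], [0.0, 1.0]])
R = np.ones((2, 2))
cost = binary_cofi_cost(X, W, b, Y, R)
print(cost)
```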

    1.5 Mean normalization

    In effect, mean normalization makes the initial predictions for a new user such as Eve, who has not rated anything yet, equal to the mean rating that other users gave each of the five movies. Predicting the average rating is more reasonable than predicting that Eve will rate every movie zero.
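    The effect can be sketched in NumPy. The rating matrix below is toy data invented for the illustration (five movies, four users with ratings, plus Eve in the last column with none), and `normalize_ratings` is a hypothetical helper name. The sketch assumes Eve's parameters are zero, which is what regularization tends to produce for a user with no ratings.

```python
import numpy as np

def normalize_ratings(Y, R):
    """Subtract each movie's mean rating, computed over rated entries only."""
    counts = np.sum(R, axis=1, keepdims=True)
    Ymean = np.sum(Y * R, axis=1, keepdims=True) / np.maximum(counts, 1)
    Ynorm = (Y - Ymean) * R   # unrated entries stay at 0
    return Ynorm, Ymean

# Toy ratings: rows are movies, columns are users; R marks which entries are real ratings.
Y = np.array([[5, 5, 0, 0, 0],
              [5, 0, 0, 0, 0],
              [0, 4, 0, 0, 0],
              [0, 0, 5, 4, 0],
              [0, 0, 5, 0, 0]], dtype=float)
R = np.array([[1, 1, 1, 1, 0],
              [1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0],
              [1, 1, 1, 1, 0],
              [1, 1, 1, 0, 0]], dtype=float)

Ynorm, Ymean = normalize_ratings(Y, R)

# Eve has no ratings; assume her learned parameters are w = 0 and b = 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # any movie features
w_eve = np.zeros(3)
b_eve = 0.0
eve_pred = X @ w_eve + b_eve + Ymean[:, 0]   # add the mean back at prediction time
print(eve_pred)
```

    Because the model is trained on `Ynorm`, the mean has to be added back when predicting, which is why Eve's predictions come out as the per-movie means.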

    1.6 TensorFlow
    1.7 Finding related items


    2. Content-based filtering

    2.1. Collaborative filtering vs Content-based filtering

    The goal is to predict the rating, so the user features and the movie features are each transformed into vectors $v_u$ and $v_m$ of the same dimension; the prediction is then the dot product $v_u \cdot v_m$.

    2.2. Deep learning for content-based filtering

    Both $v_u$ and $v_m$ have 32 numbers, although the inputs $x_u$ and $x_m$ can have different sizes.
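    The idea of two separate networks that end in vectors of the same length can be sketched with a plain NumPy forward pass. Everything specific here is an assumption for illustration: the input sizes (14 and 16), the hidden layer sizes, the random untrained weights, and the L2 normalization before the dot product. Only the 32-dimensional output comes from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights and zero biases for a small dense network."""
    return [(rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """ReLU on hidden layers, linear output layer."""
    a = x
    for k, (Wk, bk) in enumerate(params):
        a = a @ Wk + bk
        if k < len(params) - 1:
            a = np.maximum(a, 0)
    return a

def l2_normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Different input sizes, same 32-dimensional output.
user_net = init_mlp([14, 128, 64, 32])
movie_net = init_mlp([16, 256, 128, 32])

x_u = rng.normal(size=(1, 14))   # hypothetical user features
x_m = rng.normal(size=(1, 16))   # hypothetical movie features

v_u = l2_normalize(forward(user_net, x_u))
v_m = l2_normalize(forward(movie_net, x_m))
prediction = np.sum(v_u * v_m, axis=-1)   # dot product of the two vectors
print(v_u.shape, v_m.shape, prediction.shape)
```

    In a real system both networks would be trained jointly so that the dot product matches the observed ratings; the sketch only shows how the shapes line up.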
    The movie $k$ whose vector minimizes the squared distance $\|v_m^{(k)} - v_m^{(i)}\|^2$ is the movie most similar to movie $i$.

    2.3. Recommending from a large catalog

    When the catalog is very large, running the network thousands or millions of times every time a user shows up on your website becomes computationally infeasible, so as much as possible should be computed in advance.
    Having precomputed the most similar movies to every movie, we can pull up the results from a look-up table. Recommendation then proceeds in two steps: retrieval and ranking.
    Rank the retrieved candidates by predicted value, from high to low.
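    The two steps can be sketched as follows. This is a simplified illustration, not the course's implementation: retrieval here uses only a precomputed table of similar movies (`similar_table`, a hypothetical dict), and ranking scores each candidate with the dot product of a user vector `v_u` and the precomputed movie vectors `V_m`.

```python
import numpy as np

def recommend(last_watched, watched, similar_table, v_u, V_m, top_n=3):
    # Retrieval: collect similar movies for each recently watched movie,
    # merge duplicates, and drop anything the user has already seen.
    candidates = set()
    for m in last_watched:
        candidates.update(similar_table.get(m, []))
    candidates = sorted(candidates - set(watched))
    if not candidates:
        return []
    # Ranking: score only the retrieved candidates and sort from high to low.
    scores = V_m[candidates] @ v_u
    order = np.argsort(-scores)
    return [candidates[k] for k in order[:top_n]]

# Toy data: five movies with 2-dimensional vectors
V_m = np.array([[1.0, 0.0],
                [0.9, 0.1],
                [0.0, 1.0],
                [0.5, 0.5],
                [-1.0, 0.0]])
similar_table = {0: [1, 3], 2: [3, 4]}   # precomputed offline
v_u = np.array([1.0, 0.0])

top = recommend(last_watched=[0, 2], watched=[0, 2],
                similar_table=similar_table, v_u=v_u, V_m=V_m, top_n=2)
print(top)   # → [1, 3]
```

    Retrieving more candidates generally improves the recommendations at the cost of a slower ranking step, so the size of the candidate list is a tuning choice.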

  • Original article: https://blog.csdn.net/weixin_62012485/article/details/126581928