• Session-based Recommendation with Graph Neural Networks: paper reading notes


    1. Abstract

            (1) Session-based recommendation aims to predict user actions based on anonymous sessions.

    The problem of session-based recommendation aims to predict user actions based on anonymous sessions.

           (2) Shortcomings of previous methods: they cannot obtain accurate user vectors within a session, and they neglect complex transitions between items.

    Previous methods model a session as a sequence and estimate user representations besides item representations to make recommendations. Though achieved promising results, they are insufficient to obtain accurate user vectors in sessions and neglect complex transitions of items.

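            (3) To obtain accurate item embeddings and take complex item transitions into account, the paper proposes a new method: Session-based Recommendation with Graph Neural Networks (SR-GNN).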

    To obtain accurate item embedding and take complex transitions of items into account, we propose a novel method, i.e. Session-based Recommendation with Graph Neural Networks, SR-GNN for brevity.

    2. Introduction

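    2.1 Drawbacks of common session-based recommendation methods

            (1) When a session contains only a few user actions, it is hard to capture a representation of the user's behavior;

    without adequate user behavior in one session, these methods have difficulty in estimating user representations.

            (2) Only one-way transitions between consecutive items are modeled; the rest of the user's behavior in the session is ignored, so the representation is not expressive enough.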

    complex transitions among distant items are often overlooked by these methods.

            Note: the common session-based recommendation methods referred to here are recurrent neural networks and Markov chains.

    2.2 Main contributions of the paper

            (1) Session sequence -> graph structure -> graph neural network (GNN) to capture complex item transitions

    We model separated session sequences into graph-structured data and use graph neural networks to capture complex item transitions. To best of our knowledge, it presents a novel perspective on modeling in the session-based recommendation scenario.

            (2) Use of session embeddings

    To generate session-based recommendations, we do not rely on user representations, but use the session embedding, which can be obtained merely based on latent vectors of items involved in each single session.

    3. Related Work

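            A review of existing session-based recommendation approaches: conventional methods, sequential methods based on Markov chains, and RNN-based methods.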

    3.1 Conventional recommendation methods

    3.1.1 Matrix factorization(MF)

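            (1) Method: factorize the user-item rating matrix into two low-rank matrices whose inner product approximates the ratings

            (2) Drawback: in a session, user preference is only expressed through a few positive clicks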

    3.1.2 The item-based neighborhood methods

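            (1) Method: compute item similarity from co-occurrence within the same session

            (2) Drawback: the order of items is hard to take into account, and predictions are generated from the last click only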

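    3.1.3 Sequential methods based on Markov chains

            (1) Method: treat recommendation generation as a sequential optimization problem

            (2) Drawback: the independence assumption is too strong, which limits prediction accuracy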
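To make the two drawbacks above concrete, here is a minimal sketch of a first-order Markov chain recommender. The toy sessions and the function names `fit_transitions` and `recommend` are assumptions made for illustration and do not come from the paper. The prediction depends only on the last click, so two sessions with different histories but the same last item get the same recommendations.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count item -> next-item transitions over all sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current_item, next_item in zip(session, session[1:]):
            counts[current_item][next_item] += 1
    return counts


def recommend(counts, session, k=2):
    """Rank candidates using only the last click of the session."""
    last_item = session[-1]
    total = sum(counts[last_item].values())
    if total == 0:
        return []  # the last item was never followed by anything in training
    ranked = counts[last_item].most_common(k)
    return [(item, n / total) for item, n in ranked]


# Hypothetical toy sessions (lists of item ids).
sessions = [[1, 2, 3], [1, 2, 4], [5, 2, 3]]
counts = fit_transitions(sessions)

# Both sessions end in item 2, so they receive identical recommendations
# even though the earlier clicks differ.
print(recommend(counts, [1, 2]))
print(recommend(counts, [5, 2]))
```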

    3.2 Deep-learning-based methods

            (1) RNN

            (2) Enhanced variants derived from RNNs

    3.3 Neural network on graphs

            (1) LINE, an unsupervised network embedding algorithm

            (2) Convolutional neural networks on graph-structured data, building on RNNs and CNNs

            (3)GNN

    4. The Proposed Method

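    4.1 Model architecture of the paper

    [Figure: SR-GNN model architecture]

            (1) Input: the user's behavior sequence (the list of item ids the user interacted with);

            (2) Build a session graph from the behavior sequence;

            (3) Run a GNN over the session graph to extract features and obtain a vector representation for every item;

            (4) After the GNN, fuse the item vectors with an attention mechanism and a linear layer to obtain the user (session) vector representation;

            (5) Apply a softmax to obtain the top-k items the user is most likely to click next.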

    4.2 Constructing Session Graphs

          (1) An example of a session graph

    [Figure: example of a session graph]

           (2) The connection matrix A_s

    [Figure: the connection matrix A_s]
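The construction of the session graph and its connection matrix can be sketched as follows. This is a minimal illustration, assuming a toy session `[1, 2, 3, 2, 4]` and binary edges (an edge is 1 if the transition occurs at all, which is also what the code later in these notes does). The function name `build_session_graph` is hypothetical. The connection matrix A_s is the concatenation of the two normalized matrices returned here.

```python
def build_session_graph(session):
    """Return (nodes, A_in, A_out) for one session of item ids."""
    nodes = sorted(set(session))                      # unique items = graph nodes
    index = {item: i for i, item in enumerate(nodes)}
    n = len(nodes)

    # Binary adjacency: adj[u][v] = 1 if the user clicked v right after u.
    adj = [[0.0] * n for _ in range(n)]
    for src, dst in zip(session, session[1:]):
        adj[index[src]][index[dst]] = 1.0

    # Degrees; a degree of 0 is replaced by 1 to avoid division by zero.
    out_deg = [sum(row) or 1.0 for row in adj]
    in_deg = [sum(adj[u][v] for u in range(n)) or 1.0 for v in range(n)]

    # Row i of A_out: edges leaving node i, normalized by its out-degree.
    a_out = [[adj[i][j] / out_deg[i] for j in range(n)] for i in range(n)]
    # Row i of A_in: edges entering node i, normalized by its in-degree.
    a_in = [[adj[j][i] / in_deg[i] for j in range(n)] for i in range(n)]
    return nodes, a_in, a_out


nodes, a_in, a_out = build_session_graph([1, 2, 3, 2, 4])
for name, matrix in (("A_out", a_out), ("A_in", a_in)):
    print(name)
    for row in matrix:
        print(["%.1f" % x for x in row])
```

Item 2 has two outgoing edges (to 3 and to 4), so each gets weight 0.5 in its row of `a_out`; it also has two incoming edges (from 1 and from 3), so each gets weight 0.5 in its row of `a_in`.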

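    4.3 Learning Item Embeddings on Session Graphs

            Long-term preference + current session interest = session embedding

            Derivation of the formulas behind the session embedding, starting with the gated update of the item (node) vectors: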

    $$a^t_{s,i} = A_{s,i:}\,[v^{t-1}_1, \ldots, v^{t-1}_n]^\top H + b$$

    $$z^t_{s,i} = \sigma(W_z a^t_{s,i} + U_z v^{t-1}_i)$$

    $$r^t_{s,i} = \sigma(W_r a^t_{s,i} + U_r v^{t-1}_i)$$

    $$\widetilde{v}^t_i = \tanh(W_o a^t_{s,i} + U_o (r^t_{s,i} \odot v^{t-1}_i))$$

    $$v^t_i = (1 - z^t_{s,i}) \odot v^{t-1}_i + z^t_{s,i} \odot \widetilde{v}^t_i$$

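            Note: when the first formula is implemented in code, the product cannot be computed as a single matrix multiplication. A_s is the concatenation of two matrices (outgoing and incoming), so the dimensions do not line up. In the actual code, the outgoing and incoming parts are multiplied separately and the two results are then concatenated.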
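One propagation step of the gated update can be sketched in NumPy as below. This is an illustration under stated assumptions, not the authors' implementation: the weights are random, the parameter names in the dict `p` are hypothetical, and vectors are stored as rows, so `W a` in the formulas becomes `a @ W` in the code. It follows the formulas term by term, including the separate incoming and outgoing products followed by a concatenation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gnn_step(a_in, a_out, v, p):
    """One propagation step on a single session graph.

    a_in, a_out: [n, n] normalized incoming / outgoing adjacency matrices.
    v:           [n, d] node vectors from step t-1.
    p:           dict of weights (hypothetical names, random values).
    """
    # The [n, 2n] connection matrix cannot be multiplied with [n, d] directly,
    # so each half is propagated separately and the results are concatenated.
    msg_in = a_in @ (v @ p["H_in"]) + p["b_in"]        # [n, d]
    msg_out = a_out @ (v @ p["H_out"]) + p["b_out"]    # [n, d]
    a = np.concatenate([msg_in, msg_out], axis=1)      # [n, 2d]

    z = sigmoid(a @ p["W_z"] + v @ p["U_z"])           # update gate
    r = sigmoid(a @ p["W_r"] + v @ p["U_r"])           # reset gate
    v_tilde = np.tanh(a @ p["W_o"] + (r * v) @ p["U_o"])  # candidate state
    return (1.0 - z) * v + z * v_tilde                 # [n, d]


rng = np.random.default_rng(0)
n, d = 4, 8
shapes = {"H_in": (d, d), "H_out": (d, d), "b_in": (d,), "b_out": (d,),
          "W_z": (2 * d, d), "W_r": (2 * d, d), "W_o": (2 * d, d),
          "U_z": (d, d), "U_r": (d, d), "U_o": (d, d)}
params = {name: rng.normal(scale=0.1, size=shape) for name, shape in shapes.items()}

# Normalized adjacency of a toy 4-node session graph (edges 1->2, 2->3, 3->2, 2->4).
a_out = np.array([[0, 1, 0, 0], [0, 0, .5, .5], [0, 1, 0, 0], [0, 0, 0, 0]], dtype=float)
a_in = np.array([[0, 0, 0, 0], [.5, 0, .5, 0], [0, 1, 0, 0], [0, 1, 0, 0]], dtype=float)

v0 = rng.normal(size=(n, d))
v1 = gnn_step(a_in, a_out, v0, params)
print(v1.shape)
```

The Paddle code in section 5 fuses the three gate projections into one matrix and applies the reset gate after the hidden projection, so it differs slightly from this literal transcription of the formulas.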

    4.4 Generating Session Embedding

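            (1) Use an attention mechanism to compute, for every item in the sequence, an attention score with respect to the last item in the sequence, then take the weighted sum.

    $$a_i = q^\top \sigma(W_1 v_n + W_2 v_i + c) \in \mathbb{R}^1$$

    $$s_g = \sum_{i=1}^{n} a_i v_i \in \mathbb{R}^d$$

            (2) Combine s_g with the last item of the sequence (the local embedding s_l = v_n) to obtain the final session embedding.

    $$s_h = W_3 [s_l; s_g] \in \mathbb{R}^d$$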
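The attention readout can be sketched in NumPy as follows. It is a minimal illustration with random weights; the function name `session_embedding` and the parameter dict are hypothetical, and vectors are stored as rows. Note that the attention scores are used as they are, without a softmax normalization, as in the formulas above.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def session_embedding(v, p):
    """v: [n, d] item vectors in click order; the last row is the last click."""
    v_n = v[-1]                                                # local embedding s_l
    # a_i = q^T sigmoid(W_1 v_n + W_2 v_i + c): one scalar score per item.
    hidden = sigmoid(v_n @ p["W_1"] + v @ p["W_2"] + p["c"])   # [n, d]
    a = hidden @ p["q"]                                        # [n]
    s_g = (a[:, None] * v).sum(axis=0)                         # global embedding, [d]
    s_h = np.concatenate([v_n, s_g]) @ p["W_3"]                # [2d] -> [d]
    return s_h, a


rng = np.random.default_rng(0)
n, d = 5, 8
params = {
    "W_1": rng.normal(scale=0.1, size=(d, d)),
    "W_2": rng.normal(scale=0.1, size=(d, d)),
    "c": np.zeros(d),
    "q": rng.normal(scale=0.1, size=d),
    "W_3": rng.normal(scale=0.1, size=(2 * d, d)),
}
v = rng.normal(size=(n, d))
s_h, a = session_embedding(v, params)
print(s_h.shape, a.shape)
```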

    4.5 Making Recommendation and Model Training

            (1) Formula derivation

    [Images: formulas for the item scores and the softmax output]

             (2) Loss function: cross-entropy loss

    [Image: cross-entropy loss formula]
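The scoring and training objective can be sketched as below, assuming the same choices as the code in section 5: each candidate item is scored by the inner product of the session embedding with its item embedding, and the loss is a softmax cross-entropy against the true next item. The function name `next_item_loss` and the random data are hypothetical.

```python
import numpy as np


def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def next_item_loss(s_h, item_emb, target):
    """s_h: [d] session embedding, item_emb: [m, d], target: index of the true next item."""
    scores = item_emb @ s_h      # one inner-product score per candidate item
    probs = softmax(scores)      # predicted click probabilities
    loss = -np.log(probs[target])  # cross-entropy with a one-hot label
    return probs, loss


rng = np.random.default_rng(0)
m, d = 6, 4
item_emb = rng.normal(size=(m, d))
s_h = rng.normal(size=d)

probs, loss = next_item_loss(s_h, item_emb, target=2)
top_k = np.argsort(-probs)[:3]   # indices of the 3 most probable next items
print(probs.sum(), loss, top_k)
```

At inference time only the ranking matters, so the top-k items can be taken from the raw scores without computing the softmax.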

    5. Model code in practice

              SR-GNN model definition based on Paddle:

    import numpy as np
    import paddle
    import paddle.nn as nn
    import paddle.nn.functional as F


    class GNN(nn.Layer):
        """Gated graph neural network that propagates item vectors over the session graph."""

        def __init__(self, embedding_size, step=1):
            super(GNN, self).__init__()
            self.step = step
            self.embedding_size = embedding_size
            self.input_size = embedding_size * 2
            self.gate_size = embedding_size * 3
            # Fused weights for the reset, update and candidate projections
            self.w_ih = self.create_parameter(shape=[self.input_size, self.gate_size])
            self.w_hh = self.create_parameter(shape=[self.embedding_size, self.gate_size])
            self.b_ih = self.create_parameter(shape=[self.gate_size])
            self.b_hh = self.create_parameter(shape=[self.gate_size])
            self.b_iah = self.create_parameter(shape=[self.embedding_size])
            self.b_ioh = self.create_parameter(shape=[self.embedding_size])
            self.linear_edge_in = nn.Linear(self.embedding_size, self.embedding_size)
            self.linear_edge_out = nn.Linear(self.embedding_size, self.embedding_size)

        def GNNCell(self, A, hidden):
            # A is [A_in, A_out] concatenated on the last axis, so the two halves
            # are multiplied separately and the results are concatenated.
            input_in = paddle.matmul(A[:, :, :A.shape[1]], self.linear_edge_in(hidden)) + self.b_iah
            input_out = paddle.matmul(A[:, :, A.shape[1]:], self.linear_edge_out(hidden)) + self.b_ioh
            # [batch_size, max_session_len, embedding_size * 2]
            inputs = paddle.concat([input_in, input_out], 2)
            # gi and gh have the same shape: [batch_size, max_session_len, embedding_size * 3]
            gi = paddle.matmul(inputs, self.w_ih) + self.b_ih
            gh = paddle.matmul(hidden, self.w_hh) + self.b_hh
            # Each chunk: [batch_size, max_session_len, embedding_size]
            i_r, i_i, i_n = gi.chunk(3, 2)
            h_r, h_i, h_n = gh.chunk(3, 2)
            reset_gate = F.sigmoid(i_r + h_r)
            input_gate = F.sigmoid(i_i + h_i)
            new_gate = paddle.tanh(i_n + reset_gate * h_n)
            hy = (1 - input_gate) * hidden + input_gate * new_gate
            return hy

        def forward(self, A, hidden):
            for i in range(self.step):
                hidden = self.GNNCell(A, hidden)
            return hidden


    class SRGNN(nn.Layer):
        def __init__(self, config):
            super(SRGNN, self).__init__()
            # load parameters info
            self.config = config
            self.embedding_size = config['embedding_dim']
            self.step = config['step']
            self.n_items = self.config['n_items']
            # item embedding (index 0 is reserved for padding)
            self.item_emb = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0)
            # define layers and loss
            self.gnn = GNN(self.embedding_size, self.step)
            self.linear_one = nn.Linear(self.embedding_size, self.embedding_size)
            self.linear_two = nn.Linear(self.embedding_size, self.embedding_size)
            self.linear_three = nn.Linear(self.embedding_size, 1, bias_attr=False)
            self.linear_transform = nn.Linear(self.embedding_size * 2, self.embedding_size)
            self.loss_fun = nn.CrossEntropyLoss()
            # parameters initialization
            self.reset_parameters()

        def gather_indexes(self, output, gather_index):
            """Gathers the vectors at the specific positions over a minibatch"""
            # Expand the index to [batch_size, 1, embedding_size] so it selects a full vector
            gather_index = gather_index.reshape([-1, 1, 1])
            gather_index = paddle.repeat_interleave(gather_index, output.shape[-1], 2)
            output_tensor = paddle.take_along_axis(output, gather_index, 1)
            return output_tensor.squeeze(1)

        def calculate_loss(self, user_emb, pos_item):
            # Score every item by its inner product with the session embedding
            all_items = self.item_emb.weight
            scores = paddle.matmul(user_emb, all_items.transpose([1, 0]))
            return self.loss_fun(scores, pos_item)

        def output_items(self):
            return self.item_emb.weight

        def reset_parameters(self, initializer=None):
            # NOTE: as written, this only constructs an initializer object per parameter
            # and does not apply it, so the parameters keep their default initialization.
            for weight in self.parameters():
                paddle.nn.initializer.KaimingNormal(weight)

        def _get_slice(self, item_seq):
            # Mask matrix, shape of [batch_size, max_session_len]
            mask = (item_seq > 0).astype('int32')
            items, n_node, A, alias_inputs = [], [], [], []
            max_n_node = item_seq.shape[1]
            item_seq = item_seq.cpu().numpy()
            for u_input in item_seq:
                node = np.unique(u_input)
                items.append(node.tolist() + (max_n_node - len(node)) * [0])
                u_A = np.zeros((max_n_node, max_n_node))
                for i in np.arange(len(u_input) - 1):
                    if u_input[i + 1] == 0:
                        break  # the rest of the sequence is padding
                    u = np.where(node == u_input[i])[0][0]
                    v = np.where(node == u_input[i + 1])[0][0]
                    u_A[u][v] = 1
                # Normalize by in-degree and out-degree (0 is replaced by 1 to avoid division by zero)
                u_sum_in = np.sum(u_A, 0)
                u_sum_in[np.where(u_sum_in == 0)] = 1
                u_A_in = np.divide(u_A, u_sum_in)
                u_sum_out = np.sum(u_A, 1)
                u_sum_out[np.where(u_sum_out == 0)] = 1
                u_A_out = np.divide(u_A.transpose(), u_sum_out)
                u_A = np.concatenate([u_A_in, u_A_out]).transpose()
                A.append(u_A)
                alias_inputs.append([np.where(node == i)[0][0] for i in u_input])
            # The relative coordinates of the item node, shape of [batch_size, max_session_len]
            alias_inputs = paddle.to_tensor(alias_inputs)
            # The connecting matrix, shape of [batch_size, max_session_len, 2 * max_session_len];
            # cast to float32 explicitly so it matches the dtype of the embeddings
            A = paddle.to_tensor(np.array(A), dtype='float32')
            # The unique item nodes, shape of [batch_size, max_session_len]
            items = paddle.to_tensor(items)
            return alias_inputs, A, items, mask

        def forward(self, item_seq, mask, item, train=True):
            # The mask is recomputed from item_seq, so the `mask` argument is not used.
            alias_inputs, A, items, mask = self._get_slice(item_seq)
            hidden = self.item_emb(items)
            hidden = self.gnn(A, hidden)
            # Map node vectors back to click order: [batch_size, max_session_len, embedding_size]
            alias_inputs = alias_inputs.reshape([-1, alias_inputs.shape[1], 1])
            alias_inputs = paddle.repeat_interleave(alias_inputs, self.embedding_size, 2)
            seq_hidden = paddle.take_along_axis(hidden, alias_inputs, 1)
            # fetch the hidden state of the last clicked item
            item_seq_len = paddle.sum(mask, axis=1)
            ht = self.gather_indexes(seq_hidden, item_seq_len - 1)
            # attention of every item with respect to the last item
            q1 = self.linear_one(ht).reshape([ht.shape[0], 1, ht.shape[1]])
            q2 = self.linear_two(seq_hidden)
            alpha = self.linear_three(F.sigmoid(q1 + q2))
            # cast the mask to float32 before multiplying with float tensors
            mask_f = mask.reshape([mask.shape[0], -1, 1]).astype('float32')
            a = paddle.sum(alpha * seq_hidden * mask_f, 1)
            user_emb = self.linear_transform(paddle.concat([a, ht], axis=1))
            output_dict = {'user_emb': user_emb}
            if train:
                # the loss is only needed during training
                output_dict['loss'] = self.calculate_loss(user_emb, item)
            return output_dict

     

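    Reference content and code from: 手把手教你实现序列召回推荐模型 (a hands-on tutorial on implementing sequential recall recommendation models)
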
  • Original article: https://blog.csdn.net/qq_51167531/article/details/127990196