• Wilson-smoothed click-through rate


    I noticed a colleague's SQL using the following expression to compute a click-through-rate feature:
    $$
    \left( \frac{click\_cnt\_in\_30d}{show\_cnt\_in\_30d} + \frac{1.96^2}{2 * show\_cnt\_in\_30d} \right) / \left( 1 + \frac{1.96^2}{show\_cnt\_in\_30d} \right) - \frac{1.96}{1 + \frac{1.96^2}{show\_cnt\_in\_30d}} * \sqrt{ \frac{click\_cnt\_in\_30d}{show\_cnt\_in\_30d} * \frac{1 - \frac{click\_cnt\_in\_30d}{show\_cnt\_in\_30d}}{show\_cnt\_in\_30d} + \frac{1.96^2}{4 * show\_cnt\_in\_30d^2} }
    $$
    Tidying up the notation:
    $$
    \frac{ \frac{click_{30d}}{show_{30d}} + \frac{1.96^2}{2 * show_{30d}} }{1 + \frac{1.96^2}{show_{30d}}} - \frac{1.96}{1 + \frac{1.96^2}{show_{30d}}} * \sqrt{ \frac{\frac{click_{30d}}{show_{30d}} * \left(1 - \frac{click_{30d}}{show_{30d}}\right)}{show_{30d}} + \frac{1.96^2}{4 * show_{30d}^2} }
    $$



    Let

    $$
    \hat{p} = \frac{click}{show}, \qquad n = show, \qquad z = 1.96
    $$

    Substituting these gives

    $$
    \frac{ \hat{p} + \frac{z^2}{2 * n} }{1 + \frac{z^2}{n}} - \frac{z}{1 + \frac{z^2}{n}} * \sqrt{ \frac{\hat{p} * \left(1 - \hat{p}\right)}{n} + \frac{z^2}{4 * n^2} }
    $$

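    This is the so-called Wilson-smoothed lower bound, that is, the lower bound of the Wilson score interval for the click-through rate. With z = 1.96 it corresponds to a 95% confidence level.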
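
The formula in terms of p̂, n and z can be sketched in Python as follows. This is a minimal illustration, not the colleague's SQL: the function name `wilson_lower_bound` and the choice to return 0.0 when there are no impressions are assumptions made here, and the inputs are assumed to satisfy 0 <= click <= show.

```python
import math


def wilson_lower_bound(click: float, show: float, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for click / show."""
    if show <= 0:
        # Assumption: with no impressions there is no evidence, so return 0.0.
        return 0.0

    p_hat = click / show          # observed click-through rate
    n = show                      # number of impressions
    z2 = z * z

    denom = 1 + z2 / n
    center = (p_hat + z2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n))
    return center - margin


if __name__ == "__main__":
    # Same raw-rate ordering would put 1/1 first; the lower bound does not.
    print(wilson_lower_bound(1, 1))
    print(wilson_lower_bound(100, 1000))
```

The point of the smoothing shows up with small samples: one click out of one impression has a raw rate of 1.0 but a lower bound of only about 0.21, while 100 clicks out of 1000 impressions has a raw rate of 0.1 and a lower bound of roughly 0.083, much closer to its raw value.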

    Reference:
    https://blog.csdn.net/hero_myself/article/details/116264111

  • Original article: https://blog.csdn.net/wkl7123/article/details/127019466