• Gradient descent derivations for classification and regression


    1. Related function formulas and their derivatives

    • Chain rule
      $$\frac{\partial J(\theta)}{\partial \theta} = \frac{\partial J(h)}{\partial h}\cdot\frac{\partial h(z)}{\partial z}\cdot\frac{\partial z(\theta)}{\partial \theta}$$
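    As a quick sanity check, the chain rule can be verified numerically. The sketch below (illustrative functions chosen for this post, not part of the original derivation) composes three simple functions and compares the analytic product of the three partial derivatives with a central finite-difference estimate.

```python
import numpy as np

# Illustrative composition J(h(z(theta))):
#   z(theta) = 3*theta,  h(z) = sin(z),  J(h) = h**2
def J_of_theta(theta):
    return np.sin(3.0 * theta) ** 2

theta = 0.7

# Chain rule: dJ/dtheta = dJ/dh * dh/dz * dz/dtheta
h = np.sin(3.0 * theta)
analytic = (2.0 * h) * np.cos(3.0 * theta) * 3.0

# Central finite-difference estimate of dJ/dtheta
eps = 1e-6
numeric = (J_of_theta(theta + eps) - J_of_theta(theta - eps)) / (2.0 * eps)

print(analytic, numeric)  # the two values agree to about 1e-9
```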

    1.1. Linear regression function

    • Function
      $$z(\theta) = \theta^{T} x$$
    • Derivation (a numerical check follows below)
      $$\frac{\partial z(\theta)}{\partial \theta} = \frac{\partial\,\theta^{T} x}{\partial \theta} = x$$
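    A minimal numerical confirmation of this identity (array shapes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=4)
x = rng.normal(size=4)

# Finite-difference gradient of z(theta) = theta^T x, one coordinate at a time
eps = 1e-6
grad = np.zeros_like(theta)
for j in range(theta.size):
    e = np.zeros_like(theta)
    e[j] = eps
    grad[j] = ((theta + e) @ x - (theta - e) @ x) / (2.0 * eps)

print(np.allclose(grad, x))  # True: the gradient of theta^T x is x itself
```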

    1.2. Sigmoid function

    • Function
      $$h(z) = \frac{1}{1+e^{-z}}$$
    • Derivation (a numerical check of the resulting identity follows below)
      $$\frac{\partial h(z)}{\partial z} = \frac{\partial}{\partial z}\left(\frac{1}{1+e^{-z}}\right) = \frac{0-\frac{\partial}{\partial z}\bigl(1+e^{-z}\bigr)}{\bigl(1+e^{-z}\bigr)^{2}} = \frac{e^{-z}}{\bigl(1+e^{-z}\bigr)^{2}} = \frac{1+e^{-z}-1}{\bigl(1+e^{-z}\bigr)\bigl(1+e^{-z}\bigr)} = \frac{1+e^{-z}-1}{1+e^{-z}}\cdot\frac{1}{1+e^{-z}} = \left[1-\frac{1}{1+e^{-z}}\right]\cdot\frac{1}{1+e^{-z}} = \bigl(1-h(z)\bigr)\cdot h(z)$$
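    The identity h′(z) = h(z)(1 − h(z)) is easy to confirm numerically; a minimal sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2.0 * eps)  # finite differences
analytic = sigmoid(z) * (1.0 - sigmoid(z))                     # h(z) * (1 - h(z))

print(np.allclose(numeric, analytic))  # True
```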

    2. Logistic regression

    2.1 Loss function (cross-entropy)

    • Function
      $$Loss(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\Bigl(y^{i}\log h(z) + \bigl(1-y^{i}\bigr)\log\bigl(1-h(z)\bigr)\Bigr)\right]$$
    • Partial derivative with respect to h
      $$\begin{aligned}
      \frac{\partial Loss(h)}{\partial h} &= -\frac{1}{m}\,\frac{\partial}{\partial h}\left[\sum_{i=1}^{m}\Bigl(y^{i}\log h(z) + \bigl(1-y^{i}\bigr)\log\bigl(1-h(z)\bigr)\Bigr)\right] \\
      &= -\frac{1}{m}\left[\sum_{i=1}^{m}\Bigl(y^{i}\cdot\frac{1}{h(z)} + \bigl(1-y^{i}\bigr)\cdot\frac{1}{1-h(z)}\cdot(-1)\Bigr)\right] \\
      &= -\frac{1}{m}\left[\sum_{i=1}^{m}\Bigl(\frac{y^{i}}{h(z)} - \frac{1-y^{i}}{1-h(z)}\Bigr)\right] \\
      &= -\frac{1}{m}\left[\sum_{i=1}^{m}\frac{y^{i}\bigl(1-h(z)\bigr) + \bigl(y^{i}-1\bigr)h(z)}{h(z)\bigl(1-h(z)\bigr)}\right] \\
      &= -\frac{1}{m}\left[\sum_{i=1}^{m}\frac{y^{i}-h(z)}{h(z)\bigl(1-h(z)\bigr)}\right] \\
      &= \frac{1}{m}\left[\sum_{i=1}^{m}\frac{h(z)-y^{i}}{h(z)\bigl(1-h(z)\bigr)}\right]
      \end{aligned}$$

    2.2 Full derivative via the chain rule

    $$\frac{\partial Loss(\theta)}{\partial \theta_j} = \frac{\partial Loss(h)}{\partial h}\cdot\frac{\partial h(z)}{\partial z}\cdot\frac{\partial z(\theta)}{\partial \theta_j} = \frac{1}{m}\left[\sum_{i=1}^{m}\frac{h(z)-y^{i}}{h(z)\bigl(1-h(z)\bigr)}\right]\cdot\bigl(1-h(z)\bigr)\cdot h(z)\cdot x_j^{i} = \frac{1}{m}\sum_{i=1}^{m}\bigl(h(z)-y^{i}\bigr)\,x_j^{i}$$
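    In vectorized form the simplified result reads ∇Loss(θ) = (1/m) Xᵀ(h − y). The sketch below (synthetic data and illustrative names) checks this analytic gradient against a finite-difference estimate of the cross-entropy loss.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

rng = np.random.default_rng(0)
m, n = 50, 3
X = rng.normal(size=(m, n))
y = (rng.random(m) < 0.5).astype(float)
theta = rng.normal(size=n)

# Simplified analytic gradient: (1/m) * X^T (h - y)
h = sigmoid(X @ theta)
analytic = X.T @ (h - y) / m

# Finite-difference gradient of the loss, one coordinate at a time
eps = 1e-6
numeric = np.zeros(n)
for j in range(n):
    e = np.zeros(n)
    e[j] = eps
    numeric[j] = (cross_entropy(theta + e, X, y) - cross_entropy(theta - e, X, y)) / (2.0 * eps)

print(np.allclose(analytic, numeric))  # True
```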

    2.3 Logistic regression gradient descent formula

    • Update rule for θ (a training-loop sketch follows)
      $$\theta_j := \theta_j - \alpha\cdot\frac{\partial Loss(\theta)}{\partial \theta_j} := \theta_j - \alpha\cdot\frac{1}{m}\sum_{i=1}^{m}\bigl(h(z)-y^{i}\bigr)\,x_j^{i}$$
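    Plugging the update rule into a loop gives batch gradient descent for logistic regression. A minimal runnable sketch, assuming synthetic data and illustrative hyperparameters (learning rate α and iteration count):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary classification data
rng = np.random.default_rng(0)
m, n = 200, 2
X = rng.normal(size=(m, n))
true_theta = np.array([2.0, -1.0])
y = (rng.random(m) < sigmoid(X @ true_theta)).astype(float)  # Bernoulli labels

theta = np.zeros(n)
alpha = 0.5                        # learning rate (illustrative)
for _ in range(1000):
    h = sigmoid(X @ theta)
    grad = X.T @ (h - y) / m       # (1/m) * sum_i (h - y^i) x_j^i
    theta -= alpha * grad          # theta_j := theta_j - alpha * dLoss/dtheta_j

print(theta)  # roughly recovers true_theta, up to sampling noise
```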

    3. Linear regression

    3.1 Loss function (MSE)

    • Function
      $$Loss(\theta) = \frac{1}{2}\bigl(z(\theta)-y\bigr)^{2}$$
    • Derivation
      $$\frac{\partial Loss(\theta)}{\partial \theta_j} = \frac{\partial\,\frac{1}{2}\bigl(z(\theta)-y^{i}\bigr)^{2}}{\partial \theta_j} = \frac{1}{2}\cdot 2\cdot\bigl(z(\theta)-y^{i}\bigr)\cdot\frac{\partial z(\theta)}{\partial \theta_j} = \bigl(z(\theta)-y^{i}\bigr)\,x_j^{i}$$

    3.2 Linear regression gradient descent formula

    $$\theta_j := \theta_j - \alpha\cdot\frac{\partial Loss(\theta)}{\partial \theta_j} := \theta_j - \alpha\cdot\bigl(z(\theta)-y^{i}\bigr)\,x_j^{i}$$
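    The same loop with the MSE gradient trains linear regression; the per-sample update above is applied here in batch form, averaged over the m samples. A minimal sketch with synthetic data and illustrative hyperparameters:

```python
import numpy as np

# Synthetic linear data: y = X @ true_theta + noise
rng = np.random.default_rng(0)
m, n = 100, 2
X = rng.normal(size=(m, n))
true_theta = np.array([1.5, -3.0])
y = X @ true_theta + 0.1 * rng.normal(size=m)

theta = np.zeros(n)
alpha = 0.1                                # learning rate (illustrative)
for _ in range(500):
    residual = X @ theta - y               # z(theta) - y^i for each sample
    theta -= alpha * (X.T @ residual) / m  # batch average of (z(theta) - y^i) * x_j^i

print(theta)  # close to [1.5, -3.0]
```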

  • Original article: https://blog.csdn.net/m0_46926492/article/details/128074635