• Gradient Descent Derivations for Classification and Regression


    1. Relevant Functions and Their Derivatives

    • Chain rule
      $$\frac{\partial J(\theta)}{\partial \theta} = \frac{\partial J(h)}{\partial h} \cdot \frac{\partial h(z)}{\partial z} \cdot \frac{\partial z(\theta)}{\partial \theta}$$
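    As a quick sanity check of this decomposition, here is a minimal SymPy sketch (my illustration, not part of the original post; the composed functions are the ones derived below) confirming that the direct derivative equals the chain-rule product:

```python
import sympy as sp

theta, x = sp.symbols('theta x')
z = theta * x                       # z(theta): a 1-D stand-in for theta^T x
h = 1 / (1 + sp.exp(-z))            # h(z): the sigmoid
J = -sp.log(h)                      # J(h): a simple illustrative loss

# Direct derivative vs. the product dJ/dh * dh/dz * dz/dtheta
direct = sp.diff(J, theta)
hs, zs = sp.symbols('h_s z_s')
chain = (sp.diff(-sp.log(hs), hs).subs(hs, h)
         * sp.diff(1 / (1 + sp.exp(-zs)), zs).subs(zs, z)
         * sp.diff(z, theta))
assert sp.simplify(direct - chain) == 0   # the two expressions agree
```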

    1.1. Linear Regression Formula

    • Original function
      $$z(\theta) = \theta^T x$$
    • Derivation
      $$\frac{\partial z(\theta)}{\partial \theta} = \frac{\partial\, \theta^T x}{\partial \theta} = x$$
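    A one-line numerical check of this identity (a NumPy sketch of my own, not from the original): perturbing each component of $\theta$ changes $\theta^T x$ by exactly the corresponding component of $x$.

```python
import numpy as np

theta, x = np.random.randn(3), np.random.randn(3)
eps = 1e-6
# Finite-difference gradient of theta^T x with respect to each theta_j
grad = np.array([(np.dot(theta + eps * e, x) - np.dot(theta, x)) / eps
                 for e in np.eye(3)])
assert np.allclose(grad, x, atol=1e-4)    # d(theta^T x)/d(theta) = x
```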

    1.2. The Sigmoid Function

    • Original function
      $$h(z) = \frac{1}{1 + e^{-z}}$$
    • Derivation

      $$\begin{aligned}
      \frac{\partial h(z)}{\partial z} &= \frac{\partial}{\partial z}\left(\frac{1}{1+e^{-z}}\right) = \frac{0 - \frac{\partial}{\partial z}(1+e^{-z})}{(1+e^{-z})^2} = \frac{e^{-z}}{(1+e^{-z})^2} \\
      &= \frac{1+e^{-z}-1}{(1+e^{-z})(1+e^{-z})} = \left[1 - \frac{1}{1+e^{-z}}\right] \cdot \frac{1}{1+e^{-z}} = \bigl(1-h(z)\bigr) \cdot h(z)
      \end{aligned}$$
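    The identity $h'(z) = h(z)(1-h(z))$ is easy to verify numerically; the following NumPy sketch (mine, not the author's) compares it against a centered finite difference:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 101)
analytic = sigmoid(z) * (1.0 - sigmoid(z))              # h(z) * (1 - h(z))
eps = 1e-5
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
assert np.allclose(analytic, numeric, atol=1e-8)
```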

    2. Logistic Regression

    2.1 Loss Function [Cross-Entropy]

    • Original function
      $$Loss(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\Bigl(y^i \log h(z) + (1 - y^i)\log\bigl(1 - h(z)\bigr)\Bigr)\right]$$
    • Partial derivative with respect to h
      $$\begin{aligned}
      \frac{\partial Loss(h)}{\partial h} &= -\frac{1}{m}\,\frac{\partial}{\partial h}\left[\sum_{i=1}^{m}\Bigl(y^i \log h(z) + (1-y^i)\log\bigl(1-h(z)\bigr)\Bigr)\right] \\
      &= -\frac{1}{m}\left[\sum_{i=1}^{m}\Bigl(y^i \cdot \frac{1}{h(z)} + (1-y^i) \cdot \frac{1}{1-h(z)} \cdot (-1)\Bigr)\right] \\
      &= -\frac{1}{m}\left[\sum_{i=1}^{m}\Bigl(\frac{y^i}{h(z)} - \frac{1-y^i}{1-h(z)}\Bigr)\right] \\
      &= -\frac{1}{m}\left[\sum_{i=1}^{m}\frac{y^i\bigl(1-h(z)\bigr) + (y^i-1)\,h(z)}{h(z)\bigl(1-h(z)\bigr)}\right] \\
      &= -\frac{1}{m}\left[\sum_{i=1}^{m}\frac{y^i - h(z)}{h(z)\bigl(1-h(z)\bigr)}\right] \\
      &= \frac{1}{m}\left[\sum_{i=1}^{m}\frac{h(z) - y^i}{h(z)\bigl(1-h(z)\bigr)}\right]
      \end{aligned}$$

    2.2 Gradient via the Chain Rule

    $$\frac{\partial Loss(\theta)}{\partial \theta_j} = \frac{\partial Loss(h)}{\partial h} \cdot \frac{\partial h(z)}{\partial z} \cdot \frac{\partial z(\theta)}{\partial \theta_j} = \frac{1}{m}\left[\sum_{i=1}^{m}\frac{h(z)-y^i}{h(z)\bigl(1-h(z)\bigr)}\right] \cdot \bigl(1-h(z)\bigr) \cdot h(z) \cdot x_j^i = \frac{1}{m}\sum_{i=1}^{m}\bigl(h(z)-y^i\bigr)\, x_j^i$$
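    To make the final form concrete, here is a vectorized NumPy sketch (variable names and the finite-difference check are my own additions): the gradient $\frac{1}{m}\sum_i (h(z)-y^i)\,x_j^i$ becomes $\frac{1}{m}X^T(h - y)$ in matrix form.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    # (1/m) * X^T (h(z) - y): vectorized form of (1/m) sum_i (h - y^i) x_j^i
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.integers(0, 2, size=100).astype(float)
theta = rng.normal(size=3)

# Check the analytic gradient against centered finite differences
eps = 1e-6
numeric = np.array([(cross_entropy(theta + eps * e, X, y)
                     - cross_entropy(theta - eps * e, X, y)) / (2 * eps)
                    for e in np.eye(3)])
assert np.allclose(gradient(theta, X, y), numeric, atol=1e-6)
```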

    2.3 Logistic Regression Gradient Descent Formula

    • Update rule for $\theta$
      $$\theta_j := \theta_j - \alpha \cdot \frac{\partial Loss(\theta)}{\partial \theta_j} = \theta_j - \alpha \cdot \frac{1}{m}\sum_{i=1}^{m}\bigl(h(z) - y^i\bigr)\, x_j^i$$
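    Iterating this update gives a complete training loop. A minimal NumPy sketch follows (the learning rate, iteration count, and toy data are illustrative assumptions, not from the original):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gd(X, y, alpha=0.1, iters=2000):
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m   # (1/m) sum_i (h - y^i) x^i
        theta -= alpha * grad                       # theta_j := theta_j - alpha * grad_j
    return theta

# Toy usage: linearly separable 1-D data with a bias column
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = logistic_gd(X, y)
print(sigmoid(X @ theta))   # predictions move toward 0, 0, 1, 1
```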

    3. Linear Regression

    3.1 Loss Function [MSE]

    • Original function
      $$Loss(\theta) = \frac{1}{2}\bigl(z(\theta) - y\bigr)^2$$
    • Derivation
      $$\frac{\partial Loss(\theta)}{\partial \theta_j} = \frac{\partial}{\partial \theta_j}\,\frac{1}{2}\bigl(z(\theta)-y^i\bigr)^2 = \frac{1}{2} \cdot 2 \cdot \bigl(z(\theta)-y^i\bigr) \cdot \frac{\partial z(\theta)}{\partial \theta_j} = \bigl(z(\theta)-y^i\bigr)\, x_j^i$$
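    The same finite-difference technique verifies this gradient (a NumPy sketch of mine, not the author's): the derivative of $\frac{1}{2}(z(\theta)-y)^2$ with respect to $\theta$ is $(z(\theta)-y)\,x$.

```python
import numpy as np

def mse_loss(theta, x, y):
    return 0.5 * (theta @ x - y) ** 2

theta = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 3.0, -2.0])
y = 0.7

analytic = (theta @ x - y) * x            # (z(theta) - y) * x
eps = 1e-6
numeric = np.array([(mse_loss(theta + eps * e, x, y)
                     - mse_loss(theta - eps * e, x, y)) / (2 * eps)
                    for e in np.eye(3)])
assert np.allclose(analytic, numeric, atol=1e-6)
```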

    3.2 Linear Regression Gradient Descent Formula

    $$\theta_j := \theta_j - \alpha \cdot \frac{\partial Loss(\theta)}{\partial \theta_j} = \theta_j - \alpha \cdot \bigl(z(\theta)-y^i\bigr)\, x_j^i$$
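    Because the loss above is written for a single sample, this update is the stochastic (per-sample) form of gradient descent. A minimal NumPy sketch (hyperparameters and toy data are my own assumptions):

```python
import numpy as np

def linear_sgd(X, y, alpha=0.01, epochs=1000):
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):                       # one sample at a time
            theta -= alpha * (x_i @ theta - y_i) * x_i   # (z(theta) - y^i) * x^i
    return theta

# Toy usage: fit y = 2x + 1 with a bias column
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
print(linear_sgd(X, y))   # approaches [1.0, 2.0]
```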

  • Original post: https://blog.csdn.net/m0_46926492/article/details/128074635