
Linear algebra: SVD | derivatives - explained

Note: this post quotes English source material on "Linear algebra · SVD | derivatives"; the machine translation has not been proofread.


Singular value decomposition derivatives

Published Nov 10, 2023

The singular value decomposition (SVD) is a matrix decomposition that is used in many applications. It is defined as:

$$
\begin{align*}
J &= U \Sigma V^T
\end{align*}
$$

where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix with non-negative entries. The diagonal entries of $\Sigma$ are called the singular values of $J$ and are denoted $\sigma_1,\dots,\sigma_n$. The singular values are the square roots of the eigenvalues of $J^TJ$. In this post we're going to go over how to differentiate the elements of the SVD under the assumption that the singular values are all distinct and non-zero.

Einstein notation

Before proceeding, we need to understand Einstein notation. Einstein notation is an alternative way of writing matrix equations where we undo the matrix notation and remove the summation symbol. For example, let's write the equation $Ax = b$ in Einstein notation:

$$
\begin{align*}
b_i &= \sum_j A_{ij} x_j \\
\implies b_i &= A_{ij} x_j
\end{align*}
$$

And that's it! All we did was remove the summation symbol. When we see Einstein notation in practice, we implicitly assume that there is a summation over every index that appears on only one side of the equality. We will also make use of the Kronecker delta $\delta_{ij}$, which is $1$ when $i=j$ and $0$ otherwise.
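
NumPy's `np.einsum` follows the same index convention, so the $Ax = b$ example can be sketched in code. This is only an illustration: the values of `A` and `x` are arbitrary choices made here, not taken from the post.

```python
import numpy as np

# Arbitrary example data (assumed for illustration only).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([5.0, 6.0])

# b_i = A_ij x_j: the index j appears only on the right-hand side, so it is summed.
b = np.einsum("ij,j->i", A, x)

# The Kronecker delta delta_ij is the identity matrix written in index form.
delta = np.eye(2)

# delta_ij x_j = x_i: contracting with the delta simply renames the index.
x_again = np.einsum("ij,j->i", delta, x)

print(b)        # same values as A @ x
print(x_again)  # same values as x
```

The subscript string `"ij,j->i"` reads exactly like the formula: indices missing from the output side are the ones being summed.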

Orthogonal matrices

Next, we need to know how to differentiate orthogonal matrices. Let $Q$ be an orthogonal matrix; then by definition $Q_{ki}Q_{kj} = \delta_{ij}$. Taking the derivative yields:

$$
\begin{align*}
\partial (Q_{ki}Q_{kj}) &= \partial \delta_{ij} \\
\implies \partial Q_{ki} Q_{kj} + Q_{ki} \partial Q_{kj} &= 0 \\
\implies \partial Q_{ki} Q_{kj} &= -Q_{ki} \partial Q_{kj}
\end{align*}
$$

To make this equation clearer, we can undo some of the Einstein notation by letting $q_i := Q_{:,i}$ be the $i$th column of $Q$. Then we have:

$$
\begin{align*}
\partial q_i \cdot q_j = -\partial q_j \cdot q_i
\end{align*}
$$

Note that when $i=j$ this reads $\partial q_i \cdot q_i = -\partial q_i \cdot q_i$, so $\partial q_i \cdot q_i = 0$. This will be useful when we differentiate the SVD.
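
As a quick sanity check of this identity, here is a small sketch that uses a 2x2 rotation matrix as the orthogonal matrix. The rotation family, the angle, and the helper names `rotation` and `rotation_derivative` are assumptions made for this example.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix: a simple one-parameter family of orthogonal matrices."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

def rotation_derivative(theta):
    """Elementwise derivative of rotation(theta) with respect to theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[-s, -c],
                     [ c, -s]])

theta = 0.7  # arbitrary angle
Q = rotation(theta)
dQ = rotation_derivative(theta)

# A[i, j] = dq_i . q_j = dQ_ki Q_kj
A = np.einsum("ki,kj->ij", dQ, Q)

# dq_i . q_j = -dq_j . q_i says that A is antisymmetric,
# which forces its diagonal (the i = j case) to vanish.
print(np.allclose(A, -A.T))
print(np.allclose(np.diag(A), 0.0))
```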

Singular value decomposition derivatives

Let's start by writing the SVD using Einstein notation:

$$
\begin{align*}
J_{ij} &= U_{iu} \sigma_u V_{ju}
\end{align*}
$$

Contracting both sides with $U$ or $V$ and using their orthogonality, we can write two equations:

$$
\begin{align*}
J_{ij}U_{iu} &= \sigma_u V_{ju} \\
J_{ij}V_{ju} &= \sigma_u U_{iu}
\end{align*}
$$
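
These two relations can be checked numerically. The sketch below assumes a random square test matrix; note that `np.linalg.svd` returns $V^T$ rather than $V$, so it is transposed back before use.

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.normal(size=(3, 3))  # arbitrary square test matrix

# np.linalg.svd returns V transposed, so transpose it back to get V.
U, sigma, Vt = np.linalg.svd(J)
V = Vt.T

# J_ij = U_iu sigma_u V_ju  (u appears only on the right, so it is summed)
J_rebuilt = np.einsum("iu,u,ju->ij", U, sigma, V)

# J_ij U_iu = sigma_u V_ju  (i is summed, u is a free index)
lhs1 = np.einsum("ij,iu->ju", J, U)
rhs1 = sigma * V  # broadcasting scales column u of V by sigma_u

# J_ij V_ju = sigma_u U_iu  (j is summed, u is a free index)
lhs2 = np.einsum("ij,ju->iu", J, V)
rhs2 = sigma * U

print(np.allclose(J_rebuilt, J), np.allclose(lhs1, rhs1), np.allclose(lhs2, rhs2))
```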

Applying a derivative to both sides of each equation, we get:

$$
\begin{align*}
\partial J_{ij}U_{iu} + J_{ij}\partial U_{iu} &= \partial \sigma_u V_{ju} + \sigma_u \partial V_{ju} \\
\partial J_{ij}V_{ju} + J_{ij}\partial V_{ju} &= \partial \sigma_u U_{iu} + \sigma_u \partial U_{iu}
\end{align*}
$$

We'll call the first equation "equation 1" and the second "equation 2".

Singular value derivatives

To get the derivatives of the singular values, we multiply both sides of equation 2 by $U_{iu}$ and sum over $i$:

$$
\begin{align*}
\partial J_{ij}V_{ju} U_{iu} + \underbrace{J_{ij}\partial V_{ju} U_{iu}}_{\sigma_u V_{ju} \partial V_{ju}=0} &= \partial \sigma_u \underbrace{U_{iu} U_{iu}}_{1} + \sigma_u \underbrace{\partial U_{iu} U_{iu}}_{0} \\
\implies \partial \sigma_u &= \partial J_{ij}U_{iu} V_{ju}
\end{align*}
$$

Note: the derivation uses two facts:

  1. Since $J_{ij}U_{iu} = \sigma_u V_{ju}$, we have $J_{ij}\partial V_{ju} U_{iu} = \sigma_u V_{ju} \partial V_{ju}$; combined with the orthogonal-matrix result $\partial V_{ju} V_{ju}=0$ (no sum over $u$), this term vanishes.
  2. The columns of an orthogonal matrix have unit norm, so $U_{iu} U_{iu}=1$ and $\partial U_{iu} U_{iu}=0$.
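
The singular value formula is easy to test against finite differences. The following is a minimal sketch, assuming a test matrix built to have the well-separated singular values 3, 2, 1 and an arbitrary perturbation direction `dJ`; neither comes from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Test matrix with known singular values 3, 2, 1 (an arbitrary choice).
U0, _ = np.linalg.qr(rng.normal(size=(3, 3)))
V0, _ = np.linalg.qr(rng.normal(size=(3, 3)))
J = U0 @ np.diag([3.0, 2.0, 1.0]) @ V0.T
dJ = rng.normal(size=(3, 3))  # arbitrary perturbation direction

U, sigma, Vt = np.linalg.svd(J)
V = Vt.T  # np.linalg.svd returns V transposed

# Formula from the text: d sigma_u = dJ_ij U_iu V_ju
dsigma_formula = np.einsum("ij,iu,ju->u", dJ, U, V)

# Central finite difference of the singular values along dJ.
eps = 1e-6
sigma_plus = np.linalg.svd(J + eps * dJ, compute_uv=False)
sigma_minus = np.linalg.svd(J - eps * dJ, compute_uv=False)
dsigma_fd = (sigma_plus - sigma_minus) / (2 * eps)

# The difference should be small, limited only by finite-difference error.
print(np.max(np.abs(dsigma_formula - dsigma_fd)))
```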

Singular vector derivatives

Next, to isolate the derivatives of the singular vectors, we first multiply both sides of equation 1 by $V_{jv}$, where $v \neq u$, and sum over $j$:

$$
\begin{align*}
\partial J_{ij}U_{iu} V_{jv} + \underbrace{J_{ij}\partial U_{iu} V_{jv}}_{\sigma_v \partial U_{iu} U_{iv}} &= \partial \sigma_u \underbrace{V_{ju} V_{jv}}_{0} + \sigma_u \partial V_{ju} V_{jv} \\
\partial J_{ij}U_{iu} V_{jv} &= -\sigma_v \partial U_{iu} U_{iv} + \sigma_u \partial V_{ju} V_{jv}
\end{align*}
$$

Note:

  1. Since $J_{ij}V_{jv} = \sigma_v U_{iv}$, we have $J_{ij}\partial U_{iu} V_{jv} = \sigma_v \partial U_{iu} U_{iv}$.
  2. The columns of an orthogonal matrix are mutually orthogonal, so $V_{ju} V_{jv}=0$ (because $v \neq u$).

Similarly, we can do the same with equation 2, but multiply by $U_{iv}$, where $v \neq u$, and sum over $i$:

$$
\begin{align*}
\partial J_{ij}V_{ju} U_{iv} + \underbrace{J_{ij}\partial V_{ju} U_{iv}}_{\sigma_v \partial V_{ju} V_{jv}} &= \partial \sigma_u \underbrace{U_{iu} U_{iv}}_{0} + \sigma_u \partial U_{iu} U_{iv} \\
\partial J_{ij}U_{iv} V_{ju} &= \sigma_u \partial U_{iu} U_{iv} - \sigma_v \partial V_{ju} V_{jv}
\end{align*}
$$

Note:

  1. Since $J_{ij}U_{iv} = \sigma_v V_{jv}$, we have $J_{ij}\partial V_{ju} U_{iv} = \sigma_v \partial V_{ju} V_{jv}$.
  2. The columns of an orthogonal matrix are mutually orthogonal, so $U_{iu} U_{iv}=0$ (because $v \neq u$).

So we're left with the pair of equations

$$
\begin{align*}
\partial J_{ij}U_{iu} V_{jv} &= -\sigma_v \partial U_{iu} U_{iv} + \sigma_u \partial V_{ju} V_{jv} \\
\partial J_{ij}U_{iv} V_{ju} &= \sigma_u \partial U_{iu} U_{iv} - \sigma_v \partial V_{ju} V_{jv}
\end{align*}
$$

Left singular vectors

Let's multiply the above equations by $\sigma_v$ and $\sigma_u$ respectively:

$$
\begin{align*}
\sigma_v \partial J_{ij}U_{iu} V_{jv} &= -{\sigma_v}^2 \partial U_{iu} U_{iv} + \sigma_v \sigma_u \partial V_{ju} V_{jv} \\
\sigma_u \partial J_{ij}U_{iv} V_{ju} &= {\sigma_u}^2 \partial U_{iu} U_{iv} - \sigma_v \sigma_u \partial V_{ju} V_{jv}
\end{align*}
$$

If we sum the equations, the last terms on the right-hand side cancel and we're left with

$$
\begin{align*}
\partial J_{ij}\left(\sigma_u U_{iv} V_{ju} + \sigma_v U_{iu} V_{jv}\right) &= ({\sigma_u}^2 - {\sigma_v}^2) \partial U_{iu} U_{iv} \\
\implies \partial U_{iu} U_{iv} &= \frac{1}{\sigma_u^2 - \sigma_v^2} \partial J_{ij}\left(\sigma_u U_{iv} V_{ju} + \sigma_v U_{iu} V_{jv}\right)
\end{align*}
$$
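
One way to check this result without finite differences is to start from a known tangent direction. The sketch below assumes hand-picked example data: an SVD with singular values 3, 2, 1, and perturbations `dU = U K_U`, `dV = V K_V` with antisymmetric `K_U`, `K_V` (which keep $U$ and $V$ orthogonal to first order). The formula should then recover $\partial U_u \cdot U_v$, which equals `K_U[v, u]` by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# A valid SVD chosen by hand (assumed example data).
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
sigma = np.array([3.0, 2.0, 1.0])  # distinct and non-zero

# Known tangent direction: antisymmetric generators for U and V.
K_U = rng.normal(size=(n, n))
K_U = K_U - K_U.T
K_V = rng.normal(size=(n, n))
K_V = K_V - K_V.T
dU, dV = U @ K_U, V @ K_V
dsigma = rng.normal(size=n)

# Product rule applied to J_ij = U_iu sigma_u V_ju
dJ = (np.einsum("iu,u,ju->ij", dU, sigma, V)
      + np.einsum("iu,u,ju->ij", U, dsigma, V)
      + np.einsum("iu,u,ju->ij", U, sigma, dV))

# proj[v, u] = dU_u . U_v, evaluated pair by pair from the formula in the text.
proj = np.zeros((n, n))
for u in range(n):
    for v in range(n):
        if v == u:
            continue
        # weights[i, j] = sigma_u U_iv V_ju + sigma_v U_iu V_jv
        weights = (sigma[u] * np.outer(U[:, v], V[:, u])
                   + sigma[v] * np.outer(U[:, u], V[:, v]))
        proj[v, u] = np.sum(dJ * weights) / (sigma[u] ** 2 - sigma[v] ** 2)

# Compare with the projections we started from: (U^T dU)[v, u] = K_U[v, u].
print(np.max(np.abs(proj - U.T @ dU)))
```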

Right singular vectors

Similarly, if we multiply by $\sigma_u$ and $\sigma_v$ respectively, we get

$$
\begin{align*}
\sigma_u \partial J_{ij}U_{iu} V_{jv} &= -{\sigma_v} \sigma_u \partial U_{iu} U_{iv} + {\sigma_u}^2 \partial V_{ju} V_{jv} \\
\sigma_v\partial J_{ij}U_{iv} V_{ju} &= \sigma_v\sigma_u \partial U_{iu} U_{iv} - {\sigma_v}^2 \partial V_{ju} V_{jv}
\end{align*}
$$

If we sum the equations, the first terms on the RHS cancel and we're left with

$$
\begin{align*}
\partial J_{ij}\left(\sigma_v U_{iv} V_{ju} + \sigma_u U_{iu} V_{jv}\right) &= ({\sigma_u}^2 - {\sigma_v}^2) \partial V_{ju} V_{jv} \\
\implies \partial V_{ju} V_{jv} &= \frac{1}{\sigma_u^2 - \sigma_v^2} \partial J_{ij}\left(\sigma_v U_{iv} V_{ju} + \sigma_u U_{iu} V_{jv}\right)
\end{align*}
$$
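
The right singular vector formula can also be compared against finite differences of `np.linalg.svd`. This is a sketch under a few assumptions: the test matrix has singular values 3, 2, 1, the direction `dJ` is arbitrary, and the hypothetical helper `right_vectors` fixes the sign ambiguity of the SVD by aligning each column with a reference $V$.

```python
import numpy as np

rng = np.random.default_rng(1)
U0, _ = np.linalg.qr(rng.normal(size=(3, 3)))
V0, _ = np.linalg.qr(rng.normal(size=(3, 3)))
J = U0 @ np.diag([3.0, 2.0, 1.0]) @ V0.T  # assumed test matrix
dJ = rng.normal(size=(3, 3))              # arbitrary perturbation direction

U, sigma, Vt = np.linalg.svd(J)
V = Vt.T

# P[a, b] = dJ_ij U_ia V_jb
P = np.einsum("ij,ia,jb->ab", dJ, U, V)

# R[v, u] = dV_u . V_v from the formula in the text (only v != u is meaningful).
denom = sigma[None, :] ** 2 - sigma[:, None] ** 2  # denom[v, u] = sigma_u^2 - sigma_v^2
np.fill_diagonal(denom, 1.0)                        # placeholder, avoids 0/0
R = (sigma[:, None] * P + sigma[None, :] * P.T) / denom
np.fill_diagonal(R, 0.0)                            # dV_u . V_u = 0

def right_vectors(M):
    """Right singular vectors of M, with column signs aligned to the reference V."""
    Vm = np.linalg.svd(M)[2].T
    return Vm * np.sign(np.sum(Vm * V, axis=0))

eps = 1e-6
dV_fd = (right_vectors(J + eps * dJ) - right_vectors(J - eps * dJ)) / (2 * eps)
R_fd = V.T @ dV_fd  # R_fd[v, u] = dV_u . V_v

print(np.max(np.abs(R - R_fd)))
```

Writing the formula with the matrix `P` evaluates all pairs $(u, v)$ at once; the result is antisymmetric, as expected for the derivative of an orthogonal matrix.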

Summary

To simplify the expressions, we'll use the notation $U_i := U_{:,i}$ and $V_i := V_{:,i}$ to denote the $i$th column of $U$ and $V$ respectively. Then returning to matrix notation yields:

$$
\begin{align*}
\partial \sigma_u &= \partial J_{ij}U_{iu} V_{ju} \\
\partial U_u \cdot U_{v\neq u} &= \frac{1}{\sigma_u^2 - \sigma_v^2} \partial J_{ij}\left(\sigma_u U_{iv} V_{ju} + \sigma_v U_{iu} V_{jv}\right) \\
\partial V_u \cdot V_{v\neq u} &= \frac{1}{\sigma_u^2 - \sigma_v^2} \partial J_{ij}\left(\sigma_v U_{iv} V_{ju} + \sigma_u U_{iu} V_{jv}\right)
\end{align*}
$$

Note that to isolate the derivatives of $U$ and $V$, we can write them as a linear combination of the other singular vectors:

$$
\begin{align*}
\partial U_u &= \sum_{v \neq u} (\partial U_u \cdot U_v)\, U_v \\
\partial V_u &= \sum_{v \neq u} (\partial V_u \cdot V_v)\, V_v
\end{align*}
$$

The $v = u$ term is absent because $U$ and $V$ are orthogonal, so $\partial U_u \cdot U_u = \partial V_u \cdot V_u = 0$; that is, the derivative of each singular vector is orthogonal to the vector itself.
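
Putting the three formulas together gives a small routine for the directional derivative of the whole SVD. The sketch below is one possible arrangement, not the author's code: the function name `svd_derivatives` is hypothetical, and it assumes a square $J$ with distinct, non-zero singular values so that the columns of $U$ and $V$ each span the full space.

```python
import numpy as np

def svd_derivatives(J, dJ):
    """Directional derivatives (dU, dsigma, dV) of the SVD of a square J along dJ.

    Assumes all singular values of J are distinct and non-zero.
    """
    U, sigma, Vt = np.linalg.svd(J)
    V = Vt.T
    P = np.einsum("ij,ia,jb->ab", dJ, U, V)  # P[a, b] = dJ_ij U_ia V_jb
    dsigma = np.diag(P).copy()               # d sigma_u = dJ_ij U_iu V_ju

    denom = sigma[None, :] ** 2 - sigma[:, None] ** 2  # [v, u] -> sigma_u^2 - sigma_v^2
    np.fill_diagonal(denom, 1.0)                        # the diagonal is unused
    L = (sigma[None, :] * P + sigma[:, None] * P.T) / denom  # L[v, u] = dU_u . U_v
    R = (sigma[:, None] * P + sigma[None, :] * P.T) / denom  # R[v, u] = dV_u . V_v
    np.fill_diagonal(L, 0.0)
    np.fill_diagonal(R, 0.0)

    # dU_u = sum over v != u of (dU_u . U_v) U_v, and likewise for V.
    dU = U @ L
    dV = V @ R
    return (U, sigma, V), (dU, dsigma, dV)

rng = np.random.default_rng(0)
U0, _ = np.linalg.qr(rng.normal(size=(3, 3)))
V0, _ = np.linalg.qr(rng.normal(size=(3, 3)))
J = U0 @ np.diag([3.0, 2.0, 1.0]) @ V0.T  # assumed test matrix
dJ = rng.normal(size=(3, 3))              # arbitrary direction

(U, sigma, V), (dU, dsigma, dV) = svd_derivatives(J, dJ)

# Consistency check: the product rule on J = U diag(sigma) V^T must give back dJ.
dJ_rebuilt = (dU * sigma) @ V.T + (U * dsigma) @ V.T + (U * sigma) @ dV.T
print(np.max(np.abs(dJ_rebuilt - dJ)))
```

The denominators $\sigma_u^2 - \sigma_v^2$ are the reason for the distinct-singular-values assumption: the routine divides by zero as soon as two singular values coincide.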

Time derivative

We can also see how the singular vectors and singular values evolve when we flow along the vector field:

$$
\begin{align*}
\frac{dx_t}{dt} = X_t(x_t)
\end{align*}
$$

To do this, recall that we can write the time derivative of $J$ as:

$$
\begin{align*}
\frac{dJ}{dt} = \nabla X_t J
\end{align*}
$$

or, in components, $\frac{dJ_{ij}}{dt} = (\nabla X_t)_{ik} J_{kj}$. Then we can look at the time derivative of each element of the SVD.

Singular value derivatives

$$
\begin{align*}
\frac{d\sigma_u}{dt} &= \frac{dJ_{ij}}{dt}U_{iu} V_{ju} \\
&= (\nabla X_t)_{ik} J_{kj} U_{iu} V_{ju} \\
&= (\nabla X_t)_{ik} U_{iu} \sigma_u U_{ku}
\end{align*}
$$

This is more simply expressed using the log of the singular values:

$$
\begin{align*}
\frac{d\log \sigma_u}{dt} = (\nabla X_t)_{ik} U_{iu} U_{ku}
\end{align*}
$$

Note: dividing both sides of $\frac{d\sigma_u}{dt} = (\nabla X_t)_{ik} U_{iu} \sigma_u U_{ku}$ by $\sigma_u$ and using $\frac{1}{\sigma_u}\frac{d\sigma_u}{dt} = \frac{d\log \sigma_u}{dt}$ gives the logarithmic form above, which is more convenient for analyzing the relative rate of change of the singular values.
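
To make the log form concrete, here is a small sketch in which a constant matrix `A` stands in for $\nabla X_t$ at a single point (an assumption for this example, as are the test matrix and the helper name `dlog_sigma`). It checks the log form against the general singular value formula with $dJ/dt = AJ$, and then looks at a uniform expansion $X(x) = c\,x$, for which every log singular value should grow at rate $c$.

```python
import numpy as np

rng = np.random.default_rng(0)
U0, _ = np.linalg.qr(rng.normal(size=(3, 3)))
V0, _ = np.linalg.qr(rng.normal(size=(3, 3)))
J = U0 @ np.diag([3.0, 2.0, 1.0]) @ V0.T  # assumed test matrix

U, sigma, Vt = np.linalg.svd(J)
V = Vt.T

def dlog_sigma(A):
    """d log(sigma_u)/dt = (grad X)_ik U_iu U_ku, with A standing in for grad X_t."""
    return np.einsum("ik,iu,ku->u", A, U, U)

# General case: compare with d sigma_u = dJ_ij U_iu V_ju using dJ/dt = A J.
A = rng.normal(size=(3, 3))
dsigma = np.einsum("ij,iu,ju->u", A @ J, U, V)
rates_general = dlog_sigma(A)

# Uniform expansion X(x) = c x has grad X = c I.
c = 0.5
rates_uniform = dlog_sigma(c * np.eye(3))

print(rates_general, dsigma / sigma)
print(rates_uniform)
```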

Left singular vector derivatives

$$
\begin{align*}
\frac{dU_u}{dt} \cdot U_{v\neq u} &= \frac{1}{\sigma_u^2 - \sigma_v^2} \frac{dJ_{ij}}{dt}\left(\sigma_u U_{iv} V_{ju} + \sigma_v U_{iu} V_{jv}\right) \\
&= \frac{1}{\sigma_u^2 - \sigma_v^2} (\nabla X_t)_{ik} J_{kj}\left(\sigma_u U_{iv} V_{ju} + \sigma_v U_{iu} V_{jv}\right) \\
&= \frac{1}{\sigma_u^2 - \sigma_v^2} (\nabla X_t)_{ik}\left(\sigma_u^2 U_{iv} U_{ku} + \sigma_v^2 U_{iu} U_{kv}\right) \\
&= \frac{\sigma_u^2}{\sigma_u^2 - \sigma_v^2} U_v^T(\nabla X_t)U_u + \frac{\sigma_v^2}{\sigma_u^2 - \sigma_v^2} U_u^T(\nabla X_t)U_v \\
&= U_v^T\left(\frac{\sigma_u^2}{\sigma_u^2 - \sigma_v^2}\nabla X_t + \frac{\sigma_v^2}{\sigma_u^2 - \sigma_v^2} \nabla X_t^T\right)U_u
\end{align*}
$$

Note:

  1. The first two steps replace $\partial$ in the singular vector derivative formula with the time derivative $\frac{d}{dt}$ and substitute $\frac{dJ_{ij}}{dt} = (\nabla X_t)_{ik} J_{kj}$.
  2. The third step uses $J_{kj} V_{ju} = \sigma_u U_{ku}$ and $J_{kj} V_{jv} = \sigma_v U_{kv}$, so that $J_{kj} U_{iv} V_{ju} = \sigma_u U_{ku} U_{iv}$ and $J_{kj} U_{iu} V_{jv} = \sigma_v U_{kv} U_{iu}$.
  3. The last two steps convert the Einstein notation to matrix products: $U_v^T(\nabla X_t)U_u$ corresponds to $(\nabla X_t)_{ik} U_{iv} U_{ku}$ and $U_u^T(\nabla X_t)U_v$ to $(\nabla X_t)_{ik} U_{iu} U_{kv}$. Since $U_u^T(\nabla X_t)U_v = U_v^T(\nabla X_t^T)U_u$, the two terms combine into the final result.
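
The chain of equalities above can be checked by evaluating its first and last lines for every pair $u \neq v$. In this sketch the constant matrix `A` again stands in for $\nabla X_t$ at one point, and the test matrix is an arbitrary choice; both are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
U0, _ = np.linalg.qr(rng.normal(size=(n, n)))
V0, _ = np.linalg.qr(rng.normal(size=(n, n)))
J = U0 @ np.diag([3.0, 2.0, 1.0]) @ V0.T  # assumed test matrix
A = rng.normal(size=(n, n))               # stands in for grad X_t

U, sigma, Vt = np.linalg.svd(J)
V = Vt.T
dJ = A @ J  # dJ/dt = (grad X_t) J

generic = np.zeros((n, n))  # first line of the derivation
closed = np.zeros((n, n))   # last line of the derivation
for u in range(n):
    for v in range(n):
        if v == u:
            continue
        gap = sigma[u] ** 2 - sigma[v] ** 2
        # dJ_ij (sigma_u U_iv V_ju + sigma_v U_iu V_jv) / gap
        generic[v, u] = (sigma[u] * (U[:, v] @ dJ @ V[:, u])
                         + sigma[v] * (U[:, u] @ dJ @ V[:, v])) / gap
        # U_v^T (sigma_u^2 A + sigma_v^2 A^T) U_u / gap
        closed[v, u] = (U[:, v] @ (sigma[u] ** 2 * A + sigma[v] ** 2 * A.T) @ U[:, u]) / gap

print(np.max(np.abs(generic - closed)))
```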

Right singular vector derivatives

$$
\begin{align*}
\frac{dV_u}{dt} \cdot V_{v\neq u} &= \frac{1}{\sigma_u^2 - \sigma_v^2} \frac{dJ_{ij}}{dt}\left(\sigma_v U_{iv} V_{ju} + \sigma_u U_{iu} V_{jv}\right) \\
&= \frac{1}{\sigma_u^2 - \sigma_v^2} (\nabla X_t)_{ik} J_{kj}\left(\sigma_v U_{iv} V_{ju} + \sigma_u U_{iu} V_{jv}\right) \\
&= \frac{1}{\sigma_u^2 - \sigma_v^2} (\nabla X_t)_{ik}\left(\sigma_u\sigma_v U_{iv} U_{ku} + \sigma_u\sigma_v U_{iu} U_{kv}\right) \\
&= \frac{\sigma_u \sigma_v}{\sigma_u^2 - \sigma_v^2}\left( U_v^T(\nabla X_t)U_u + U_u^T(\nabla X_t)U_v \right) \\
&= \frac{\sigma_u \sigma_v}{\sigma_u^2 - \sigma_v^2} U_v^T(\nabla X_t + \nabla X_t^T)U_u
\end{align*}
$$

Note:

  1. As with the left singular vectors, the first two steps replace the derivative symbol and substitute the expression for $\frac{dJ}{dt}$.

  2. The third step uses $J_{kj} V_{ju} = \sigma_u U_{ku}$ and $J_{kj} V_{jv} = \sigma_v U_{kv}$, so the two terms become $\sigma_v U_{iv} (\sigma_u U_{ku})$ and $\sigma_u U_{iu} (\sigma_v U_{kv})$, which share the factor $\sigma_u \sigma_v$.

  3. The last step merges the two terms into $U_v^T(\nabla X_t + \nabla X_t^T)U_u$ using $U_u^T(\nabla X_t)U_v = U_v^T(\nabla X_t^T)U_u$, which holds because the product is a scalar and a scalar equals its own transpose.
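
One consequence of the final expression is that only the symmetric part of $\nabla X_t$ moves the right singular vectors. The sketch below illustrates this under the same assumptions as before (a constant matrix standing in for $\nabla X_t$, an arbitrary test matrix, and a hypothetical helper `right_rotation_rates`): for a general `A` the closed form matches the general formula, and for an antisymmetric `A` (a rigid rotation field) the rates vanish.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
U0, _ = np.linalg.qr(rng.normal(size=(n, n)))
V0, _ = np.linalg.qr(rng.normal(size=(n, n)))
J = U0 @ np.diag([3.0, 2.0, 1.0]) @ V0.T  # assumed test matrix

U, sigma, Vt = np.linalg.svd(J)
V = Vt.T

denom = sigma[None, :] ** 2 - sigma[:, None] ** 2  # denom[v, u] = sigma_u^2 - sigma_v^2
np.fill_diagonal(denom, 1.0)                        # the diagonal is unused

def right_rotation_rates(A):
    """R[v, u] = dV_u/dt . V_v from the closed form, with A standing in for grad X_t."""
    S = U.T @ (A + A.T) @ U  # S[v, u] = U_v^T (A + A^T) U_u
    R = np.outer(sigma, sigma) * S / denom
    np.fill_diagonal(R, 0.0)
    return R

# General A: compare with the general formula using dJ/dt = A J.
A = rng.normal(size=(n, n))
P = U.T @ (A @ J) @ V  # P[a, b] = dJ_ij U_ia V_jb
R_generic = (sigma[:, None] * P + sigma[None, :] * P.T) / denom
np.fill_diagonal(R_generic, 0.0)
R_general = right_rotation_rates(A)

# Antisymmetric A (rigid rotation field): A + A^T = 0, so V does not rotate.
A_rigid = A - A.T
R_rigid = right_rotation_rates(A_rigid)

print(np.max(np.abs(R_general - R_generic)))
print(np.max(np.abs(R_rigid)))
```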


via:

http://www.jsqmd.com/news/27409/
