エンジニアを目指す浪人のブログ

情報系に役立ちそうな応用数理をゆるめにメモします

いくつかの行列の公式を証明するその2

応用上よく使われるいくつかの行列の公式を証明しておくことにしました.


ベクトルと(正方)行列を準備します.

{\displaystyle \;\;\; x = \begin{bmatrix} x_{1} & x_{2} & \cdots & x_{n} \end{bmatrix}^T  }

{\displaystyle \;\;\; y = \begin{bmatrix} y_{1} & y_{2} & \cdots & y_{n} \end{bmatrix}^T  }

{\displaystyle \;\;\; A =  \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots  & \vdots & \vdots & \vdots \\  a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}  }

{\displaystyle \;\;\; B =  \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots  & \vdots & \vdots & \vdots \\  b_{n1} & b_{n2} & \cdots & b_{nn} \end{bmatrix}  }


本記事の目的に進みます.行列のトレースや微分についての以下の公式を証明します.

'--------------------------------------------------------------------------------------------------------------------------------------------
事実1.

{\displaystyle  A,B,C } が同じサイズの正方行列のとき以下が成り立つ.

(1.1){\displaystyle \;\;\;  \mathrm{Tr} \left( A B \right) = \mathrm{Tr} \left( B A \right) }

(1.2){\displaystyle \;\;\;  \mathrm{Tr} \left( A B C \right) = \mathrm{Tr} \left( B C A \right) = \mathrm{Tr} \left( C A B \right)
}


事実2.

(2.1){\displaystyle \;\;\;  x^T y = \mathrm{Tr} \left( x y^T \right)  }

(2.2){\displaystyle \;\;\;  x^T A x = \mathrm{Tr} \left( x^T A x \right) = \mathrm{Tr} \left( x x^T A \right) \;\;\;\;\;\; \mathrm{if} \ A = A^T }


事実3.

(3.1){\displaystyle \;\;\; \frac{ \partial }{ \partial A } \mathrm{Tr} \left( A B \right) = \frac{ \partial }{ \partial A } \mathrm{Tr} \left( B A \right)  =  B^T }

(3.2){\displaystyle \;\;\; \frac{ \partial }{ \partial A } x^T A x  =  x x^T \;\;\;\;\;\; \mathrm{if} \ A = A^T }



事実1.証明.

(1.1)を示す.

{\displaystyle \;\;\;  \mathrm{Tr} \left( A B \right) = \mathrm{Tr} \left( \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots  & \vdots & \vdots & \vdots \\  a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots  & \vdots & \vdots & \vdots \\  b_{n1} & b_{n2} & \cdots & b_{nn} \end{bmatrix} \right) }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \sum_{k=1}^n a_{1k} b_{k1} + \sum_{k=1}^n a_{2k} b_{k2} + \cdots + \sum_{k=1}^n a_{nk} b_{kn} \;\;\; \because } 文献[2] {\displaystyle \;\;\;\;\;\;\;\;\;} (※)

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \sum_{l=1}^n \sum_{k=1}^n a_{lk} b_{kl} }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \sum_{k=1}^n \sum_{l=1}^n a_{lk} b_{kl} }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \sum_{k=1}^n \sum_{l=1}^n b_{kl} a_{lk} }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \sum_{l=1}^n b_{1l} a_{l1} + \sum_{l=1}^n b_{2l} a_{l2} + \cdots + \sum_{l=1}^n b_{nl} a_{ln}  }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \mathrm{Tr} \left( \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots  & \vdots & \vdots & \vdots \\  b_{n1} & b_{n2} & \cdots & b_{nn} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots  & \vdots & \vdots & \vdots \\  a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}  \right)  \;\;\; \because } (※)の逆

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \mathrm{Tr} \left( B A \right) }


(1.2)を示す.

(1.1)で {\displaystyle B = B C } とすると {\displaystyle \mathrm{Tr} \left( A B C \right) = \mathrm{Tr} \left( B C A \right) } を得る.{\displaystyle A = C A } とすると {\displaystyle \mathrm{Tr} \left( C A B \right) = \mathrm{Tr} \left( B C A \right) } を得る.

(証明終わり)


事実2.証明.

(2.1)を示す.

{\displaystyle \;\;\;  \mathrm{Tr} \left( x y^T \right) = \mathrm{Tr} \left( \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix} \begin{bmatrix} y_{1} & y_{2} & \cdots & y_{n} \end{bmatrix} \right)  }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;  = \mathrm{Tr} \left(  \begin{bmatrix} x_1 y_1 & * & \cdots & * \\ * & x_2 y_2 & \cdots & * \\ \vdots  & \vdots & \vdots & \vdots \\  * & * & \cdots & x_n y_n \end{bmatrix}  \right)  }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n  }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \begin{bmatrix} x_{1} & x_{2} & \cdots & x_{n} \end{bmatrix} \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix}  }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = x^T y  }


(2.2)を示す.

{\displaystyle \;\;\;  \mathrm{Tr} \left( x^T A x \right) = x^T A x \;\;\;\;\;\; \because  } スカラー

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = x^T \left( A x \right)  }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \mathrm{Tr} \left( x \left( A x \right)^T \right)  \;\;\; \because  } (2.1)

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \mathrm{Tr} \left( x x^T A^T \right)  }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \mathrm{Tr} \left( x x^T A \right) \;\;\;\;\; \because \ A = A^T  }

(証明終わり)


事実3.証明.

(3.1)を示す.

{\displaystyle \;\;\; \frac{ \partial }{ \partial a_{ij} } \mathrm{Tr} \left( A B \right) = \frac{ \partial }{ \partial a_{ij} } \left( \sum_{k=1}^n a_{1k} b_{k1} + \sum_{k=1}^n a_{2k} b_{k2} + \cdots + \sum_{k=1}^n a_{nk} b_{kn} \right)  \;\;\; \because } (※)

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \frac{ \partial }{ \partial a_{ij} } \left( \sum_{k=1}^n a_{ik} b_{ki} \right) }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \frac{ \partial }{ \partial a_{ij} } \left( a_{ij} b_{ji} \right) }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = b_{ji} \;\;\;\;\;\; } (※※)


{\displaystyle \;\;\; \frac{ \partial }{ \partial A } \mathrm{Tr} \left( B A \right) = \frac{ \partial }{ \partial A } \mathrm{Tr} \left( A B \right) \;\;\; \because } (1.1)

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \begin{bmatrix} \frac{ \partial }{ \partial a_{11} } \mathrm{Tr} \left( A B \right) & \frac{ \partial }{ \partial a_{12} } \mathrm{Tr} \left( A B \right) & \cdots & \frac{ \partial }{ \partial a_{1n} } \mathrm{Tr} \left( A B \right) \\ \frac{ \partial }{ \partial a_{21} } \mathrm{Tr} \left( A B \right) & \frac{ \partial }{ \partial a_{22} } \mathrm{Tr} \left( A B \right) & \cdots & \frac{ \partial }{ \partial a_{2n} } \mathrm{Tr} \left( A B \right) \\ \vdots  & \vdots & \vdots & \vdots \\  \frac{ \partial }{ \partial a_{n1} } \mathrm{Tr} \left( A B \right) & \frac{ \partial }{ \partial a_{n2} } \mathrm{Tr} \left( A B \right) & \cdots & \frac{ \partial }{ \partial a_{nn} } \mathrm{Tr} \left( A B \right) \end{bmatrix}  }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \begin{bmatrix} b_{11} & b_{21} & \cdots & b_{n1} \\ b_{12} & b_{22} & \cdots & b_{n2} \\ \vdots  & \vdots & \vdots & \vdots \\  b_{1n} & b_{2n} & \cdots & b_{nn} \end{bmatrix}  \;\;\;\;\;\; \because } (※※)

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = B^T }


(3.2)を示す.

{\displaystyle \;\;\; \frac{ \partial }{ \partial A } x^T A x  = \frac{ \partial }{ \partial A } \mathrm{Tr} \left( x x^T A \right) \;\;\; \because } (2.2)

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \frac{ \partial }{ \partial A } \mathrm{Tr} \left( \left( x x^T \right) A \right) }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = \left( x x^T \right)^T \;\;\;\;\;\; \because } (3.1)

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = (x^T)^T x^T }

{\displaystyle \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; = x x^T  }

(証明終わり)
'--------------------------------------------------------------------------------------------------------------------------------------------


以上,いくつかの行列の公式を証明しました.



参考文献
[1] Petersen, K.B., and Pedersen, M.S., The Matrix Cookbook https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf
[2] Wikipedia Matrix multiplication のページ https://en.wikipedia.org/wiki/Matrix_multiplication
[3] Zhanwen Huang様の(University of Minnesota在籍時の)ノート http://www.d.umn.edu/~jgreene/masters_reports/Zhanwen%20Huang_s%20Thesis.pdf
[4] Cross Validated https://stats.stackexchange.com/questions/239373/proving-that-xtax-trxxta
[5] University of California, Berkeley Michael I. Jordan先生のノート https://people.eecs.berkeley.edu/~jordan/courses/260-spring10/other-readings/chapter13.pdf