Convex Optimization

Convex Function

Definition

A function $f:\mathbb{R}^n \to \mathbb{R}$ is convex if for all $x, y \in \operatorname{dom} f$ and $\theta \in [0, 1]$, the following holds:

$$f(\theta x + (1-\theta)y) \le \theta f(x) + (1-\theta) f(y)$$
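The defining inequality can be checked numerically on sampled points. A minimal sketch, assuming the illustrative convex function $f(x) = \|x\|^2$ (any convex function would do):

```python
import numpy as np

# Numerically check the convexity inequality
#   f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y)
# for the illustrative convex function f(x) = ||x||^2.
def f(x):
    return float(np.dot(x, x))

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.normal(size=3)
    y = rng.normal(size=3)
    theta = rng.uniform()
    lhs = f(theta * x + (1 - theta) * y)
    rhs = theta * f(x) + (1 - theta) * f(y)
    assert lhs <= rhs + 1e-12  # small tolerance for floating-point error
```

Passing such a sampled check does not prove convexity, but a single violating triple $(x, y, \theta)$ disproves it.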

First-order Condition

A differentiable function $f$ is convex if and only if for all $x, y \in \operatorname{dom} f$, the following holds:

$$f(y) \ge f(x) + \nabla f(x)^T (y - x)$$

where $\nabla f(x)$ is the gradient vector of $f$ at $x$. Geometrically, the first-order Taylor approximation at any point is a global underestimator of a convex function.
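The underestimator property can likewise be spot-checked. A sketch, again assuming the illustrative choice $f(x) = \|x\|^2$ with gradient $\nabla f(x) = 2x$:

```python
import numpy as np

# Check the first-order condition f(y) >= f(x) + grad_f(x)^T (y - x)
# for the illustrative convex function f(x) = ||x||^2, grad 2x.
def f(x):
    return float(np.dot(x, x))

def grad_f(x):
    return 2.0 * x

rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.normal(size=4)
    y = rng.normal(size=4)
    # tangent plane at x evaluated at y never exceeds f(y)
    assert f(y) >= f(x) + grad_f(x) @ (y - x) - 1e-12
```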

Second-order Condition

A twice-differentiable function $f$ is convex if and only if for all $x \in \operatorname{dom} f$, the Hessian matrix $\nabla^2 f(x)$ is positive semidefinite:

$$\nabla^2 f(x) \succeq 0$$
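Positive semidefiniteness of a symmetric matrix is equivalent to all its eigenvalues being nonnegative, which gives a direct numerical test. A sketch, assuming the illustrative quadratic $f(x) = \tfrac{1}{2} x^T Q x$ with $Q = A^T A$ (PSD by construction), whose Hessian is the constant matrix $Q$:

```python
import numpy as np

# The Hessian of f(x) = 0.5 * x^T Q x is Q itself.
# Q = A^T A is PSD by construction (illustrative assumption).
rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
Q = A.T @ A

# eigvalsh exploits symmetry and returns real eigenvalues in
# ascending order; PSD <=> smallest eigenvalue >= 0
eigvals = np.linalg.eigvalsh(Q)
assert np.all(eigvals >= -1e-10), "Hessian is not PSD"
```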

Hessian Matrix

Definition

The Hessian matrix is a square matrix of the second-order partial derivatives of a function $f:\mathbb{R}^n \to \mathbb{R}$, denoted $\nabla^2 f(x)$ (or $H(x)$), with entries

$$[\nabla^2 f(x)]_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$$

Properties

  1. Symmetry: if $f$ is twice continuously differentiable, then the Hessian matrix $\nabla^2 f(x)$ is symmetric, i.e. $\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$ (Schwarz's theorem).
  2. Positive semidefiniteness: if $f$ is a convex function, then $\nabla^2 f(x)$ is positive semidefinite.
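The entrywise definition can be turned into a finite-difference approximation and compared against a hand-computed Hessian. A sketch, assuming the illustrative function $f(x) = x_0^2 x_1 + x_1 x_2$ (not part of the notes), which also exhibits the symmetry property:

```python
import numpy as np

# Illustrative function: f(x) = x0^2 * x1 + x1 * x2
def f(x):
    return x[0]**2 * x[1] + x[1] * x[2]

def numerical_hessian(f, x, h=1e-4):
    """H[i, j] ~ d^2 f / (dx_i dx_j) via central differences."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.eye(n)[i] * h
            e_j = np.eye(n)[j] * h
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h**2)
    return H

x = np.array([1.0, 2.0, 3.0])
H = numerical_hessian(f, x)
# Analytic Hessian at x = (1, 2, 3):
# [[2*x1, 2*x0, 0], [2*x0, 0, 1], [0, 1, 0]]
H_exact = np.array([[4.0, 2.0, 0.0],
                    [2.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0]])
assert np.allclose(H, H_exact, atol=1e-5)
```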

Matrix Calculus

Gradient

The gradient of a function $f:\mathbb{R}^n \to \mathbb{R}$ with respect to a vector $x$ is a column vector containing all partial derivatives:

$$\nabla f(x) = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right)^T$$
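Each partial derivative can be approximated by a central difference, which is a standard sanity check for hand-derived gradients. A sketch, assuming the illustrative function $f(x) = \sin x_0 + x_1^2$:

```python
import numpy as np

# Illustrative function: f(x) = sin(x0) + x1^2,
# with exact gradient (cos(x0), 2*x1)^T.
def f(x):
    return np.sin(x[0]) + x[1]**2

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of each partial derivative."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

x = np.array([0.5, 2.0])
g = numerical_gradient(f, x)
g_exact = np.array([np.cos(0.5), 4.0])
assert np.allclose(g, g_exact, atol=1e-6)
```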

Jacobian Matrix

The Jacobian matrix of a vector-valued function $f:\mathbb{R}^n \to \mathbb{R}^m$ is an $m \times n$ matrix containing all first-order partial derivatives:

$$J_{ij} = \frac{\partial f_i}{\partial x_j}$$
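A column-by-column finite-difference check works for Jacobians too: perturbing $x_j$ recovers column $j$. A sketch, assuming the illustrative map $f(x) = (x_0 x_1,\; x_0 + x_2,\; x_1^2)$ with $m = n = 3$:

```python
import numpy as np

# Illustrative vector-valued map f: R^3 -> R^3.
def f(x):
    return np.array([x[0] * x[1], x[0] + x[2], x[1]**2])

def numerical_jacobian(f, x, h=1e-6):
    """J[:, j] ~ df/dx_j via central differences; J is m x n."""
    m, n = len(f(x)), len(x)
    J = np.zeros((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
    return J

x = np.array([1.0, 2.0, 3.0])
J = numerical_jacobian(f, x)
J_exact = np.array([[2.0, 1.0, 0.0],   # d(x0*x1)   = (x1, x0, 0)
                    [1.0, 0.0, 1.0],   # d(x0+x2)   = (1, 0, 1)
                    [0.0, 4.0, 0.0]])  # d(x1^2)    = (0, 2*x1, 0)
assert np.allclose(J, J_exact, atol=1e-5)
```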

Derivatives of Matrices

If $A$ is an $m \times n$ constant matrix and $x$ is an $n$-dimensional variable vector, then the derivative of $Ax$ with respect to $x$ is

$$\frac{\partial (Ax)}{\partial x} = A^T$$

(in denominator layout, consistent with writing the gradient as a column vector; in Jacobian/numerator layout the same derivative is written as $A$).
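Since $x \mapsto Ax$ is linear, its finite-difference Jacobian should reproduce $A$ exactly up to rounding; the denominator-layout derivative is then its transpose. A sketch with a randomly generated $2 \times 3$ matrix (an illustrative assumption):

```python
import numpy as np

# Linear map x -> Ax: its Jacobian is A itself, so the
# denominator-layout derivative d(Ax)/dx is A^T.
rng = np.random.default_rng(3)
A = rng.normal(size=(2, 3))

def f(x):
    return A @ x

x = rng.normal(size=3)
h = 1e-6
# Jacobian columns via central differences
J = np.column_stack([(f(x + h * e) - f(x - h * e)) / (2 * h)
                     for e in np.eye(3)])
assert np.allclose(J, A, atol=1e-6)      # Jacobian layout: A
assert np.allclose(J.T, A.T, atol=1e-6)  # denominator layout: A^T
```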

Rules for Matrix and Vector Derivatives

Matrix Derivative Rules

  1. For a scalar function $f:\mathbb{R}^n \to \mathbb{R}$, the gradient $\nabla f(x)$ is a column vector.
  2. For a matrix $A$ and a vector $x$, the derivative of $Ax$ with respect to $x$ is $\frac{\partial (Ax)}{\partial x} = A^T$.
  3. For a matrix $A$ and a vector $x$, the derivative of $x^T A x$ with respect to $x$ is $\frac{\partial (x^T A x)}{\partial x} = (A + A^T)x$.
  4. For a scalar function $f(x) = x^T A x$, where $A$ is symmetric ($A = A^T$), the derivative is $\frac{\partial f}{\partial x} = 2Ax$.
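The quadratic-form rules above can be verified against finite differences; the code below uses a randomly generated $4 \times 4$ matrix (an illustrative assumption) and its symmetric part:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 4))
x = rng.normal(size=4)
h = 1e-6

def quad(M, x):
    """Quadratic form x^T M x."""
    return float(x @ M @ x)

# Rule: grad of x^T A x is (A + A^T) x (general A)
g_num = np.array([(quad(A, x + h * e) - quad(A, x - h * e)) / (2 * h)
                  for e in np.eye(4)])
assert np.allclose(g_num, (A + A.T) @ x, atol=1e-5)

# Rule: grad of x^T S x is 2 S x when S is symmetric
S = 0.5 * (A + A.T)  # symmetric part of A
g_sym = np.array([(quad(S, x + h * e) - quad(S, x - h * e)) / (2 * h)
                  for e in np.eye(4)])
assert np.allclose(g_sym, 2 * S @ x, atol=1e-5)
```

Note that the symmetric case is just the general rule specialized: $(S + S^T)x = 2Sx$ when $S = S^T$.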