The only unknown term in the reverse processes is the data score $\nabla_{x_t} \log q_t(x_t)$, which can be approximated by a time-dependent score model $s^t_\theta(x_t)$ (or with other model parametrizations). $s^t_\theta(x_t)$ is typically learned via score matching (SM) (Hyvärinen, 2005).
Problem statement: the unknown term in the reverse processes is the data score $\nabla_{x_t} \log q_t(x_t)$. It is obtained approximately through a score model $s^t_\theta(x_t)$, and this model is trained with score matching.
In this work, we observe that the stochastic process of the scaled data score $\alpha_t \nabla_{x_t} \log q_t(x_t)$ is a martingale w.r.t. the reverse-time process of $x_t$ from $T$ to $0$, where the timestep $t$ can be either continuous or discrete. Along the reverse-time sampling path, this martingale property leads to concentration bounds for scaled data scores. Moreover, a martingale satisfies the optional stopping theorem that the expected value at a stopping time is equal to its initial expected value.
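The authors state that the scaled data score process mentioned above is a martingale. A martingale is a process whose expected future value, given the entire past, equals its current value. They find that the scaled score is a martingale with respect to the process $x_t$ running from $T$ to $0$. In reverse-time sampling, this martingale property leads to concentration bounds for the scaled data scores. In addition, a martingale satisfies the optional stopping theorem: the expected value at a stopping time equals the initial expected value.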
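As a sanity check of the martingale claim, the sketch below verifies numerically, for a toy 1-D Gaussian data distribution, that the conditional expectation of the scaled score at an earlier time $s$ given $x_t$ matches the scaled score at time $t$. The schedule ($\alpha_t = e^{-t}$, $\sigma_t^2 = 1 - e^{-2t}$), the data parameters, and all function names are assumptions chosen for illustration and are not taken from the paper. Because everything is Gaussian in this setup, the conditional expectation is linear in $x_t$, so a least-squares fit is enough to estimate it.

```python
import math
import random

# Hypothetical toy setup: 1-D Gaussian data and an assumed noise schedule.
DATA_MEAN, DATA_VAR = 2.0, 0.25

def alpha(t):
    return math.exp(-t)

def sigma2(t):
    return 1.0 - math.exp(-2.0 * t)

def scaled_score(x, t):
    """alpha_t * d/dx log q_t(x); q_t is Gaussian because the data is Gaussian."""
    mean_t = alpha(t) * DATA_MEAN
    var_t = alpha(t) ** 2 * DATA_VAR + sigma2(t)
    return -alpha(t) * (x - mean_t) / var_t

def fit_line(xs, ys):
    """Ordinary least squares fit ys ~ slope * xs + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

rng = random.Random(0)
s, t = 0.3, 1.0  # two times with s < t
ratio = alpha(t) / alpha(s)
step_var = sigma2(t) - ratio ** 2 * sigma2(s)  # noise added between s and t

xt_samples, scaled_scores_s = [], []
for _ in range(50_000):
    x0 = rng.gauss(DATA_MEAN, math.sqrt(DATA_VAR))
    x_s = alpha(s) * x0 + math.sqrt(sigma2(s)) * rng.gauss(0.0, 1.0)
    x_t = ratio * x_s + math.sqrt(step_var) * rng.gauss(0.0, 1.0)
    xt_samples.append(x_t)
    scaled_scores_s.append(scaled_score(x_s, s))

# Estimate of E[alpha_s * score_s(x_s) | x_t] as a linear function of x_t.
est_slope, est_intercept = fit_line(xt_samples, scaled_scores_s)

# The martingale property predicts that this equals alpha_t * score_t(x_t),
# which is itself linear in x_t for Gaussian data.
ref_intercept = scaled_score(0.0, t)
ref_slope = scaled_score(1.0, t) - ref_intercept

print("slope:    ", est_slope, "vs", ref_slope)
print("intercept:", est_intercept, "vs", ref_intercept)
```

The fitted and reference coefficients should agree up to Monte Carlo noise. This only checks the one-step conditional expectation in a Gaussian toy case; it is not a proof of the general statement.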
Based on the martingale property of data scores, for any $t \in [0, T]$ and any pretrained score model $s^t_\theta(x_t)$ (or with other model parametrizations), we can calibrate the model by subtracting its expectation over $q_t(x_t)$, i.e., $\mathbb{E}_{q_t(x_t)}[s^t_\theta(x_t)]$. We formally demonstrate that the calibrated score model $s^t_\theta(x_t) - \mathbb{E}_{q_t(x_t)}[s^t_\theta(x_t)]$ achieves lower values of SM objectives.
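Building on the martingale property of the data score, for any $t$ and any pretrained score model, the model can be calibrated by subtracting its expectation over $q_t(x_t)$. The authors write the calibrated score model as $s^t_\theta(x_t) - \mathbb{E}_{q_t(x_t)}[s^t_\theta(x_t)]$.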
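The calibration step can be illustrated with a minimal sketch on a toy problem where the true score is known in closed form. Everything below is an assumption made for illustration: the 1-D Gaussian data, the values of $\alpha_t$ and $\sigma_t$, and in particular `model_score`, a hypothetical stand-in for a pretrained model that is a shrunken true score plus a constant offset. The expectation over $q_t(x_t)$ is estimated by a plain sample mean, and the loss is the squared error against the true score, which is only computable in a toy case like this one.

```python
import random

def true_score(x, mean, var):
    """Score of a 1-D Gaussian N(mean, var)."""
    return -(x - mean) / var

def model_score(x, mean, var):
    """Hypothetical imperfect model: shrunken true score plus a constant bias."""
    return 0.8 * true_score(x, mean, var) + 0.3

def sm_loss(scores, targets):
    """Mean squared error between model scores and true scores."""
    return sum((s - g) ** 2 for s, g in zip(scores, targets)) / len(scores)

rng = random.Random(0)

# Assumed toy values; q_t is Gaussian because the data is Gaussian.
alpha_t, sigma_t = 0.6, 0.8
data_mean, data_std = 2.0, 1.5
mean_t = alpha_t * data_mean
var_t = alpha_t ** 2 * data_std ** 2 + sigma_t ** 2

xs = [rng.gauss(mean_t, var_t ** 0.5) for _ in range(20_000)]
targets = [true_score(x, mean_t, var_t) for x in xs]
raw = [model_score(x, mean_t, var_t) for x in xs]

# Calibration: subtract the Monte Carlo estimate of E_{q_t}[s_theta(x_t)].
score_mean = sum(raw) / len(raw)
calibrated = [s - score_mean for s in raw]

loss_raw = sm_loss(raw, targets)
loss_cal = sm_loss(calibrated, targets)
print("estimated mean of model score:", score_mean)
print("loss before calibration:", loss_raw)
print("loss after calibration: ", loss_cal)
```

In this setup the loss should drop by roughly the squared mean of the model score, since the true score has zero mean under $q_t$. With a real network the mean would have to be estimated per timestep from noised training data, which this sketch does not cover.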
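We consider a $k$-dimensional random variable $x \in \mathbb{R}^k$ and define a forward diffusion process on $x$ as $\{x_t\}_{t \in [0,T]}$ with $T > 0$, which satisfies, $\forall t \in [0, T]$,
$$q_{0t}(x_t \mid x_0) = \mathcal{N}(x_t \mid \alpha_t x_0, \sigma_t^2 I). \tag{1}$$
Here $q_0(x_0)$ is the data distribution; $\alpha_t$ and $\sigma_t$ are two positive real-valued functions that are differentiable w.r.t. $t$ with bounded derivatives. The marginal distribution of $x_t$ is $q_t(x_t) = \int q_{0t}(x_t \mid x_0)\, q_0(x_0)\, dx_0$.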
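It can be shown that there exists a stochastic differential equation (SDE) satisfying the forward transition distribution in Eq. (1), and this SDE can be written as
$$dx_t = f(t)\, x_t\, dt + g(t)\, d\omega_t, \tag{2}$$
where $\omega_t \in \mathbb{R}^k$ is the standard Wiener process, $f(t) = \frac{d \log \alpha_t}{dt}$, and $g(t)^2 = \frac{d \sigma_t^2}{dt} - 2 \frac{d \log \alpha_t}{dt} \sigma_t^2$.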
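The forward SDE in Eq. (2) corresponds to a reverse SDE constructed as
$$dx_t = \left[ f(t)\, x_t - g(t)^2 \nabla_{x_t} \log q_t(x_t) \right] dt + g(t)\, d\bar{\omega}_t, \tag{3}$$
where $f(t)$ and $g(t)$ are the drift and diffusion coefficients of Eq. (2) and $\bar{\omega}_t$ is the standard Wiener process in reverse time.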
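Starting from $q_T(x_T)$, the marginal distribution of the reverse SDE process is also $q_t(x_t)$ for $t \in [0, T]$. There also exists a deterministic process described by an ordinary differential equation (ODE) as
$$\frac{dx_t}{dt} = f(t)\, x_t - \frac{1}{2} g(t)^2 \nabla_{x_t} \log q_t(x_t), \tag{4}$$
which, when started from $q_T(x_T)$, shares the same marginal distributions $q_t(x_t)$ as the reverse SDE.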
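To make the deterministic reverse process concrete, here is a minimal sketch that integrates the standard probability-flow form $\frac{dx_t}{dt} = f(t)\, x_t - \frac{1}{2} g(t)^2 \nabla_{x_t} \log q_t(x_t)$ backward in time with explicit Euler steps. It assumes a toy 1-D Gaussian data distribution, so the score is available in closed form, and an assumed schedule $\alpha_t = e^{-t}$, $\sigma_t^2 = 1 - e^{-2t}$, for which $f(t) = -1$ and $g(t)^2 = 2$. The function names, step count, and parameter values are illustrative choices, not the paper's sampler.

```python
import math

DATA_MEAN, DATA_VAR = 2.0, 0.25  # hypothetical 1-D Gaussian data

def mean_t(t):
    """Mean of q_t under the assumed schedule alpha_t = exp(-t)."""
    return math.exp(-t) * DATA_MEAN

def var_t(t):
    """Variance of q_t: alpha_t^2 * data variance + sigma_t^2."""
    return math.exp(-2.0 * t) * DATA_VAR + 1.0 - math.exp(-2.0 * t)

def score(x, t):
    """Closed-form data score of the Gaussian marginal q_t."""
    return -(x - mean_t(t)) / var_t(t)

def ode_drift(x, t):
    # For the assumed schedule: f(t) = d log(alpha_t)/dt = -1 and g(t)^2 = 2.
    f, g2 = -1.0, 2.0
    return f * x - 0.5 * g2 * score(x, t)

def integrate_reverse(x_T, T=3.0, n_steps=4000):
    """Explicit Euler integration of the ODE from t = T down to t = 0."""
    dt = T / n_steps
    x, t = x_T, T
    for _ in range(n_steps):
        x -= dt * ode_drift(x, t)  # step backward in time
        t -= dt
    return x

T = 3.0
x_T = 1.5
x_0 = integrate_reverse(x_T, T)

# For Gaussian marginals the exact ODE solution is an affine map that
# rescales deviations from the mean by sqrt(var_0 / var_T).
exact = mean_t(0.0) + math.sqrt(var_t(0.0) / var_t(T)) * (x_T - mean_t(T))
print("Euler:", x_0, "exact:", exact)
```

The Euler result should match the closed-form map up to discretization error. In practice the closed-form `score` would be replaced by a learned score model, and a higher-order solver is often preferred.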
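Moreover, by Tweedie's formula (Efron, 2011), we know that
$$\alpha_t\, \mathbb{E}_{q_{t0}(x_0 \mid x_t)}[x_0] = x_t + \sigma_t^2 \nabla_{x_t} \log q_t(x_t),$$
where $q_{t0}(x_0 \mid x_t)$ is the posterior distribution of $x_0$ given $x_t$.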
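Tweedie's formula can be rearranged as $\mathbb{E}[x_0 \mid x_t] = (x_t + \sigma_t^2 \nabla_{x_t} \log q_t(x_t)) / \alpha_t$, which turns a score into a denoised estimate. The sketch below checks this against the closed-form posterior mean for a toy 1-D Gaussian data distribution. The data parameters, the values of $\alpha_t$ and $\sigma_t$, and the helper names are assumptions made for illustration.

```python
DATA_MEAN, DATA_VAR = 2.0, 0.25  # hypothetical 1-D Gaussian data

def tweedie_denoise(x_t, score_value, alpha_t, sigma_t):
    """Posterior mean E[x_0 | x_t] computed from the data score."""
    return (x_t + sigma_t ** 2 * score_value) / alpha_t

def gaussian_score(x_t, alpha_t, sigma_t):
    """Closed-form score of q_t = N(alpha_t * mean, alpha_t^2 * var + sigma_t^2)."""
    var_t = alpha_t ** 2 * DATA_VAR + sigma_t ** 2
    return -(x_t - alpha_t * DATA_MEAN) / var_t

def gaussian_posterior_mean(x_t, alpha_t, sigma_t):
    """E[x_0 | x_t] from standard Gaussian conditioning, without the score."""
    var_t = alpha_t ** 2 * DATA_VAR + sigma_t ** 2
    gain = alpha_t * DATA_VAR / var_t
    return DATA_MEAN + gain * (x_t - alpha_t * DATA_MEAN)

alpha_t, sigma_t = 0.6, 0.8
results = []
for x_t in (-1.0, 0.5, 3.0):
    via_tweedie = tweedie_denoise(
        x_t, gaussian_score(x_t, alpha_t, sigma_t), alpha_t, sigma_t
    )
    direct = gaussian_posterior_mean(x_t, alpha_t, sigma_t)
    results.append((x_t, via_tweedie, direct))
    print(x_t, via_tweedie, direct)
```

The two columns should agree to floating-point precision. With a learned model, `gaussian_score` would be replaced by the model's score estimate, and the output would then only approximate the posterior mean.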