
Commit 76a067f: update language support
Parent: 4d4d086

10 files changed, 1487 additions and 303 deletions. Two of the changed files are shown below.

app/(private)/chinese/page.md

Lines changed: 1 addition & 1 deletion
@@ -426,7 +426,7 @@ Imagen 修改了 U-Net 的几个设计,使其成为*高效 U-Net*。
 
 引用为:
 
-> Weng, Lilian. (2021年7月). 什么是扩散模型? LilLog. https://lilianweng.github.io/posts/2021-07-11-diffusion-models/.
+> Weng, Lilian. (2021年7月). 什么是扩散模型? Lil'Log. https://lilianweng.github.io/posts/2021-07-11-diffusion-models/.

app/(private)/english/page.md

Lines changed: 7 additions & 7 deletions
@@ -69,7 +69,7 @@ $$
 q(\mathbf{x}_{t-1} \vert \mathbf{x}_t, \mathbf{x}_0) = \mathcal{N}(\mathbf{x}_{t-1}; \tilde{\boldsymbol{\mu}}(\mathbf{x}_t, \mathbf{x}_0), \tilde{\beta}_t \mathbf{I})
 $$
 
-Using Bayes rule, we have:
+Using Bayes' rule, we have:
 
 $$
 \begin{aligned}
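For context on the Bayes'-rule step this hunk touches: in the post's notation, the tractable posterior is obtained by factoring the forward-process Gaussians, which can be sketched as

```latex
q(\mathbf{x}_{t-1} \vert \mathbf{x}_t, \mathbf{x}_0)
= q(\mathbf{x}_t \vert \mathbf{x}_{t-1}, \mathbf{x}_0)
  \frac{q(\mathbf{x}_{t-1} \vert \mathbf{x}_0)}{q(\mathbf{x}_t \vert \mathbf{x}_0)}
```

All three factors on the right are known Gaussians, which is why the posterior above is Gaussian with mean $\tilde{\boldsymbol{\mu}}(\mathbf{x}_t, \mathbf{x}_0)$ and variance $\tilde{\beta}_t \mathbf{I}$.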
@@ -120,7 +120,7 @@ $$
 \end{aligned}
 $$
 
-It is also straightforward to get the same result using Jensens inequality. Say we want to minimize the cross entropy as the learning objective,
+It is also straightforward to get the same result using Jensen's inequality. Say we want to minimize the cross entropy as the learning objective,
 
 $$
 \begin{aligned}
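The Jensen's-inequality step named in this hunk can be written per-sample as (a sketch in the post's notation; taking the expectation over $q(\mathbf{x}_0)$ gives the cross entropy):

```latex
-\log p_\theta(\mathbf{x}_0)
= -\log \mathbb{E}_{q(\mathbf{x}_{1:T} \vert \mathbf{x}_0)}
  \Big[ \frac{p_\theta(\mathbf{x}_{0:T})}{q(\mathbf{x}_{1:T} \vert \mathbf{x}_0)} \Big]
\leq \mathbb{E}_{q(\mathbf{x}_{1:T} \vert \mathbf{x}_0)}
  \Big[ \log \frac{q(\mathbf{x}_{1:T} \vert \mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T})} \Big]
= L_\text{VLB}
```

The inequality holds because $-\log$ is convex, so $-\log \mathbb{E}[\cdot] \leq \mathbb{E}[-\log(\cdot)]$.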
@@ -151,7 +151,7 @@ L_\text{VLB}
 \end{aligned}
 $$
 
-Lets label each component in the variational lower bound loss separately:
+Let's label each component in the variational lower bound loss separately:
 
 $$
 \begin{aligned}
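The labeling this hunk refers to decomposes the bound term by term; in the post's notation it reads:

```latex
\begin{aligned}
L_\text{VLB} &= L_T + L_{T-1} + \dots + L_0 \\
\text{where } L_T &= D_\text{KL}(q(\mathbf{x}_T \vert \mathbf{x}_0) \parallel p_\theta(\mathbf{x}_T)) \\
L_t &= D_\text{KL}(q(\mathbf{x}_t \vert \mathbf{x}_{t+1}, \mathbf{x}_0) \parallel p_\theta(\mathbf{x}_t \vert \mathbf{x}_{t+1})) \quad \text{for } 1 \leq t \leq T-1 \\
L_0 &= -\log p_\theta(\mathbf{x}_0 \vert \mathbf{x}_1)
\end{aligned}
```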
@@ -209,7 +209,7 @@ where $C$ is a constant not depending on $\theta$.
 
 #### Connection with noise-conditioned score networks (NCSN)
 
-[Song & Ermon (2019)](https://arxiv.org/abs/1907.05600) proposed a score-based generative modeling method where samples are produced via [Langevin dynamics](#connection-with-stochastic-gradient-langevin-dynamics) using gradients of the data distribution estimated with score matching. The score of each sample $\mathbf{x}$s density probability is defined as its gradient $\nabla_{\mathbf{x}} \log q(\mathbf{x})$. A score network $\mathbf{s}_\theta: \mathbb{R}^D \to \mathbb{R}^D$ is trained to estimate it, $\mathbf{s}_\theta(\mathbf{x}) \approx \nabla_{\mathbf{x}} \log q(\mathbf{x})$.
+[Song & Ermon (2019)](https://arxiv.org/abs/1907.05600) proposed a score-based generative modeling method where samples are produced via [Langevin dynamics](#connection-with-stochastic-gradient-langevin-dynamics) using gradients of the data distribution estimated with score matching. The score of each sample $\mathbf{x}$'s density probability is defined as its gradient $\nabla_{\mathbf{x}} \log q(\mathbf{x})$. A score network $\mathbf{s}_\theta: \mathbb{R}^D \to \mathbb{R}^D$ is trained to estimate it, $\mathbf{s}_\theta(\mathbf{x}) \approx \nabla_{\mathbf{x}} \log q(\mathbf{x})$.
 
 To make it scalable with high-dimensional data in the deep learning setting, they proposed to use either *denoising score matching* ([Vincent, 2011](http://www.iro.umontreal.ca/~vincentp/Publications/smdae_techreport.pdf)) or *sliced score matching* (use random projections; [Song et al., 2019](https://arxiv.org/abs/1905.07088)). Denoising score matching adds a pre-specified small noise to the data $q(\tilde{\mathbf{x}} \vert \mathbf{x})$ and estimates $q(\tilde{\mathbf{x}})$ with score matching.
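To illustrate the Langevin-dynamics sampling this hunk describes (not the post's code; a minimal numpy sketch that uses the analytic score of a standard Gaussian, $\nabla_\mathbf{x} \log q(\mathbf{x}) = -\mathbf{x}$, in place of a learned score network):

```python
import numpy as np

def langevin_sample(score, x0, step=0.1, n_steps=1000, rng=None):
    """Unadjusted Langevin dynamics:
    x <- x + (step / 2) * score(x) + sqrt(step) * z,  z ~ N(0, I)."""
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    for _ in range(n_steps):
        z = rng.standard_normal(x.shape)
        x = x + 0.5 * step * score(x) + np.sqrt(step) * z
    return x

# Score of a standard Gaussian target: grad log N(0, I) = -x.
# 5000 independent chains, all started far from the mode at x = 5.
samples = langevin_sample(lambda x: -x, x0=np.full((5000,), 5.0))
```

Despite starting far from the mode, the chains drift to approximately $\mathcal{N}(0, 1)$, up to an $O(\text{step})$ discretization bias; a learned $\mathbf{s}_\theta$ would replace the analytic score.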

@@ -317,7 +317,7 @@ It is very slow to generate a sample from DDPM by following the Markov chain of
 
 One simple way is to run a strided sampling schedule ([Nichol & Dhariwal, 2021](https://arxiv.org/abs/2102.09672)) by taking the sampling update every $\lceil T/S \rceil$ steps to reduce the process from $T$ to $S$ steps. The new sampling schedule for generation is $\{\tau_1, \dots, \tau_S\}$ where $\tau_1 < \tau_2 < \dots < \tau_S \in [1, T]$ and $S < T$.
 
-For another approach, lets rewrite $q_\sigma(\mathbf{x}_{t-1} \vert \mathbf{x}_t, \mathbf{x}_0)$ to be parameterized by a desired standard deviation $\sigma_t$ according to the nice property:
+For another approach, let's rewrite $q_\sigma(\mathbf{x}_{t-1} \vert \mathbf{x}_t, \mathbf{x}_0)$ to be parameterized by a desired standard deviation $\sigma_t$ according to the nice property:
 
 $$
 \begin{aligned}
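The strided schedule in the hunk above can be sketched as follows (the values of $T$ and $S$ are illustrative, not from the post):

```python
import math

def strided_schedule(T, S):
    """Pick a subsequence of [1, T] by taking every ceil(T/S)-th timestep,
    giving an increasing schedule tau_1 < ... < tau_S for generation."""
    stride = math.ceil(T / S)
    return list(range(1, T + 1, stride))

taus = strided_schedule(T=1000, S=50)  # e.g. [1, 21, 41, ..., 981]
```

Sampling then runs the reverse update only at the timesteps in `taus`, reducing the chain from $T$ to $S$ steps.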
@@ -340,7 +340,7 @@ $$
 
 Let $\sigma_t^2 = \eta \cdot \tilde{\beta}_t$ such that we can adjust $\eta \in \mathbb{R}^+$ as a hyperparameter to control the sampling stochasticity. The special case of $\eta = 0$ makes the sampling process _deterministic_. Such a model is named the _denoising diffusion implicit model_ (**DDIM**; [Song et al., 2020](https://arxiv.org/abs/2010.02502)). DDIM has the same marginal noise distribution but deterministically maps noise back to the original data samples.
 
-During generation, we dont have to follow the whole chain $t=1,\dots,T$, but rather a subset of steps. Lets denote $s < t$ as two steps in this accelerated trajectory. The DDIM update step is:
+During generation, we don't have to follow the whole chain $t=1,\dots,T$, but rather a subset of steps. Let's denote $s < t$ as two steps in this accelerated trajectory. The DDIM update step is:
 
 $$
 q_{\sigma, s < t}(\mathbf{x}_s \vert \mathbf{x}_t, \mathbf{x}_0)
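A minimal numpy sketch of one deterministic DDIM update ($\eta = 0$, hence $\sigma_t = 0$). The noise predictor `eps_model` and the cumulative-product array `alpha_bar` are illustrative stand-ins, not the post's code:

```python
import numpy as np

def ddim_step(x_t, t, s, eps_model, alpha_bar):
    """One DDIM update x_t -> x_s for s < t with sigma = 0 (deterministic).
    alpha_bar[t] is the cumulative product of (1 - beta) up to step t."""
    eps = eps_model(x_t, t)  # predicted noise at step t
    # Predict the clean sample, then re-noise it to the earlier step s.
    x0_pred = (x_t - np.sqrt(1 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])
    return np.sqrt(alpha_bar[s]) * x0_pred + np.sqrt(1 - alpha_bar[s]) * eps

# Toy check: with a decreasing schedule and a dummy predictor that returns
# zeros, the update simply rescales the predicted clean sample.
alpha_bar = np.linspace(0.999, 0.01, 1001)
x = np.ones(4)
x_prev = ddim_step(x, t=1000, s=900,
                   eps_model=lambda x, t: np.zeros_like(x),
                   alpha_bar=alpha_bar)
```

With $s < t$ drawn from the accelerated trajectory (e.g. the strided schedule), repeating this step maps noise back to data in far fewer than $T$ updates.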
@@ -426,7 +426,7 @@ They found that noise conditioning augmentation, dynamic thresholding and effici
 
 Cited as:
 
-> Weng, Lilian. (Jul 2021). What are diffusion models? LilLog. https://lilianweng.github.io/posts/2021-07-11-diffusion-models/.
+> Weng, Lilian. (Jul 2021). What are diffusion models? Lil'Log. https://lilianweng.github.io/posts/2021-07-11-diffusion-models/.
 
 Or
