Generator Matching: Generative Modeling with Arbitrary Markov Processes

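1. Introduction

Existing generative modeling approaches such as Diffusion Models and Flow Matching are each built on one specific Markov process¹:

- Diffusion Models: a fixed forward/reverse diffusion process
- Flow Matching: a deterministic flow along a fixed probability path

These choices restrict the design space of generative models. Generator Matching proposes a **modality-agnostic** framework that allows generative modeling with arbitrary Markov processes.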

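2. Definition of the Generator

2.1 Markov Processes: A Review

The defining property of a Markov process is memorylessness: given the present state, the future is independent of the past. In discrete-time notation:

$$P(X_{t+h} \mid X_{t}, X_{t-1}, \dots, X_{0}) = P(X_{t+h} \mid X_{t})$$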

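2.2 Formalizing the Generator

For a continuous-time Markov process, the generator $G$ acts on a test function $f$ and is defined as:

$$G f(x) = \lim_{h \to 0} \frac{E[f(X_{t+h}) \mid X_{t} = x] - f(x)}{h}$$

The generator describes the infinitesimal change of the Markov process: it is the rate at which the expected value of $f$ changes when the process starts from $x$. For a time-inhomogeneous process, $G$ also depends on $t$.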
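The limit in the definition above can be checked numerically. The following is a minimal sketch, not part of the original text, that assumes a one-dimensional process with drift and Gaussian noise, discretized by a single Euler-Maruyama step. The function names, the choice $f = \sin$, and the drift $-x$ are illustrative assumptions. For such a process the generator has the standard closed form $G f(x) = \text{drift}(x) f'(x) + \frac{\sigma^{2}}{2} f''(x)$, which the finite-difference estimate should approach as $h$ shrinks.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def generator_estimate(f, x, drift, sigma, h=1e-4, n_nodes=40):
    """Finite-difference estimate of (E[f(X_{t+h}) | X_t = x] - f(x)) / h.

    Assumes one Euler-Maruyama step X_{t+h} = x + drift(x) * h + sigma * sqrt(h) * Z
    with Z ~ N(0, 1); the expectation over Z uses Gauss-Hermite quadrature.
    """
    nodes, weights = hermegauss(n_nodes)        # weight function exp(-z^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)    # normalise to the N(0, 1) density
    x_next = x + drift(x) * h + sigma * np.sqrt(h) * nodes
    expected_f = np.sum(weights * f(x_next))
    return (expected_f - f(x)) / h

def generator_exact(x, sigma):
    """Closed form for f = sin and drift(x) = -x: -x cos(x) - (sigma^2 / 2) sin(x)."""
    return -x * np.cos(x) - 0.5 * sigma**2 * np.sin(x)

x0, sigma = 0.3, 1.0
estimate = generator_estimate(np.sin, x0, drift=lambda x: -x, sigma=sigma)
exact = generator_exact(x0, sigma)
print(f"estimate={estimate:.5f}  exact={exact:.5f}")
```

Quadrature is used instead of random sampling so that the only error left in the estimate is the discretization error from the finite step $h$.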

2.3 The Generator Matching Objective

Given the generator $G^{*}$ of a target Markov process, Generator Matching learns a parameterized generator $G_{\theta}$ such that:

$$G_{\theta} \approx G^{*}$$

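3. Unifying Existing Methods

3.1 Existing Methods as Special Cases of Generator Matching

The Generator Matching framework covers the following methods:

| Method | Corresponding generator | Characteristics |
| --- | --- | --- |
| Diffusion Models | Generator of the reverse diffusion process | Stochastic, gradual |
| Flow Matching | Velocity field of an ODE | Deterministic, smooth |
| Discrete Diffusion | Generator of a discrete jump process | Discrete state space |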

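3.2 Proof of Unification

Theorem: Diffusion Models, Flow Matching, and discrete Diffusion Models can each be expressed as Generator Matching for a particular Markov process.

Proof sketch: each method defines a conditional generator $G(x' \mid x)$ that describes the transition from state $x$ to $x'$ over an infinitesimal time step. Generator Matching reconstructs the whole process by learning this generator.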
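To make the proof sketch more concrete, the standard textbook generators of the three process classes can be written out side by side. The symbols $u$ (velocity field), $\sigma$ (a scalar diffusion coefficient), and $Q$ (a rate matrix) are notation assumed here for illustration and are not taken from the text above. Time dependence is suppressed.

```latex
\begin{align*}
\text{Flow } \big(\dot{X}_t = u(X_t)\big): \quad
  & G f(x) = \nabla f(x)^{\top} u(x) \\
\text{Diffusion } \big(dX_t = \sigma\, dW_t\big): \quad
  & G f(x) = \frac{\sigma^{2}}{2}\, \Delta f(x) \\
\text{Jump process on a discrete space:} \quad
  & G f(x) = \sum_{y \neq x} Q(x, y)\, \big[f(y) - f(x)\big]
\end{align*}
```

An SDE with both drift and noise has the sum of the first two expressions as its generator. In each case, learning the generator amounts to learning the object that parameterizes it: $u$, $\sigma$, or $Q$.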

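4. Extending to Arbitrary Markov Processes

4.1 Jump Processes

A key extension enabled by Generator Matching is the jump process, whose generator is:

$$G f(x) = \int [f(y) - f(x)]\, \lambda(x, y)\, dy$$

where $\lambda(x, y)$ is the rate of jumping from $x$ to $y$.

Unlike continuous diffusion, a jump process allows discontinuous trajectories, which may be more natural for some modalities such as text.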
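On a finite state space the integral in the jump generator becomes a sum over target states, which makes it easy to compute directly. The sketch below is illustrative only: the function names, the 3-state rate table, and the Euler sampling step are assumptions, not part of the original text. It evaluates $G f$ from the rates, builds the equivalent rate matrix $Q$ (rows summing to zero), and takes one approximate sampling step.

```python
import numpy as np

def jump_generator(rates, f):
    """Apply G f(x) = sum_y [f(y) - f(x)] * rates[x, y] on a finite state space."""
    rates = np.asarray(rates, dtype=float)
    f = np.asarray(f, dtype=float)
    diff = f[None, :] - f[:, None]          # diff[x, y] = f(y) - f(x)
    return (diff * rates).sum(axis=1)

def rate_matrix(rates):
    """Turn jump rates into a generator matrix Q whose rows sum to zero."""
    Q = np.array(rates, dtype=float)
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))     # diagonal = minus the total exit rate
    return Q

def euler_jump_step(x, rates, h, rng):
    """One Euler step: jump to y != x with probability h * rates[x, y], else stay.

    Only valid when h times the total exit rate of x is at most 1.
    """
    probs = h * rate_matrix(rates)[x]
    probs[x] = 0.0
    probs[x] = 1.0 - probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Hypothetical rates lambda(x, y) between three states, and a test function f.
rates = np.array([[0.0, 1.0, 2.0],
                  [0.5, 0.0, 0.5],
                  [3.0, 1.0, 0.0]])
f = np.array([1.0, 2.0, 4.0])

Gf = jump_generator(rates, f)
Q = rate_matrix(rates)
rng = np.random.default_rng(0)
next_state = euler_jump_step(0, rates, h=0.01, rng=rng)
print(Gf, Q @ f, next_state)
```

Applying the generator to $f$ and multiplying $f$ by the rate matrix give the same vector, which is why discrete jump models are often specified by $Q$ alone.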

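4.2 Potential for Multimodal Generation

The design space of arbitrary Markov processes includes:

- Continuous vs. discrete: continuous or discrete state spaces
- Diffusion vs. jump: smooth transitions or abrupt changes
- Multi-scale: mixed dynamics operating on different time scales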

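4.3 Superposition of Markov Generative Models

Generator Matching allows models to be combined by superposition, a convex combination of two generators with weight $\alpha \in [0, 1]$:

$$G_{\mathrm{super}} = \alpha G_{1} + (1 - \alpha) G_{2}$$

Such a superposition can:

- Combine the strengths of different processes
- Create multimodal generators
- Support several generation modes in a single model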
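The superposition formula can be illustrated with two small rate matrices. This is a sketch under stated assumptions: the two 3-state jump processes and the helper names are hypothetical, and the stationary (time-independent) setting is a simplification chosen to keep the example short. Both processes leave the uniform distribution unchanged, and because the combination is linear, so does their mixture.

```python
import numpy as np

def is_generator_matrix(Q, tol=1e-12):
    """A rate matrix has non-negative off-diagonal entries and rows summing to zero."""
    off_diag = Q - np.diag(np.diag(Q))
    rows_ok = np.allclose(Q.sum(axis=1), 0.0, atol=tol)
    return bool(np.all(off_diag >= -tol) and rows_ok)

def superpose(Q1, Q2, alpha):
    """Convex combination alpha * Q1 + (1 - alpha) * Q2 of two generators."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * Q1 + (1.0 - alpha) * Q2

# Two hypothetical 3-state jump processes that both keep the uniform distribution fixed.
Q_cycle = np.array([[-1.0, 1.0, 0.0],
                    [0.0, -1.0, 1.0],
                    [1.0, 0.0, -1.0]])
Q_symmetric = np.array([[-2.0, 1.0, 1.0],
                        [1.0, -2.0, 1.0],
                        [1.0, 1.0, -2.0]])
pi = np.full(3, 1.0 / 3.0)

Q_super = superpose(Q_cycle, Q_symmetric, alpha=0.3)
print(is_generator_matrix(Q_super))   # whether the mixture is still a valid generator
print(pi @ Q_super)                   # rate of change of pi under the mixed process
```

The same linearity argument is what allows generators of different types to be mixed, as long as they target the same distribution.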

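5. Training Objectives

5.1 The Generator Matching Loss

Given empirical samples from a Markov process, Generator Matching uses the following loss:

$$L(\theta) = E_{x \sim p,\; y \sim P(\cdot \mid x)} \left[ \lVert G_{\theta}(y) - G^{*}(y) \rVert^{2} \right]$$

Here $x$ is drawn from the state distribution $p$ and $y$ from the transition kernel $P(\cdot \mid x)$.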
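A Monte Carlo version of this loss can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the generator is represented by a one-dimensional velocity field (the flow special case), the target field $u^{*}(y) = -y$ is taken to be known and directly evaluable, and the model has a single parameter $\theta$. In realistic settings $G^{*}$ is not available in closed form, so this only illustrates the shape of the objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def target_generator(y):
    """Hypothetical ground-truth velocity field u*(y) = -y standing in for G*."""
    return -y

def model_generator(theta, y):
    """One-parameter model u_theta(y) = theta * y standing in for G_theta."""
    return theta * y

def gm_loss(theta, y):
    """Monte Carlo estimate of E[ ||G_theta(y) - G*(y)||^2 ] over samples y."""
    residual = model_generator(theta, y) - target_generator(y)
    return float(np.mean(residual**2))

# Sample x ~ p = N(0, 1), then y ~ P(. | x) as a small Gaussian perturbation of x.
x = rng.standard_normal(10_000)
y = x + 0.1 * rng.standard_normal(10_000)

# The loss is quadratic in theta, so the minimiser has a least-squares closed form.
theta_hat = float(np.sum(y * target_generator(y)) / np.sum(y * y))
print(theta_hat, gm_loss(theta_hat, y), gm_loss(0.0, y))
```

With a neural network in place of the one-parameter model, the closed-form solve would be replaced by stochastic gradient descent on the same squared residual.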

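5.2 Scalable Training Objectives

Generator Matching provides a family of scalable training objectives, which is where the framework gets its name:

$$L_{GM}(\theta) = E_{t,\, x_{t},\, x_{t+h}} \left[ \lVert \mathrm{Forward}(x_{t}, t) - \mathrm{Backward}(x_{t+h}, t+h) \rVert^{2} \right]$$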

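5.3 Conditional Generators

For conditional generation, define a conditional generator:

$$G_{\theta}(\cdot \mid c)$$

where $c$ is the conditioning information, such as a class label or a text description.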

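6. Practical Applications

6.1 Image Generation

Experiments on CIFAR-10 and ImageNet show that:

- Generator Matching with jump processes can achieve performance comparable to standard diffusion
- Superposing multiple generators can improve generation diversity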

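6.2 Multimodal Generation

Because the framework is modality-agnostic, Generator Matching is particularly well suited to:

- Joint image-text generation: different generators handle different modalities
- Cross-modal translation: using a suitable superposition of generators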

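6.3 Distinctive Advantages of Jump Processes

Experiments find that jump-process generators perform well on the following tasks:

- Text generation: a discrete state space matches tokens naturally
- Combinatorial structure: jumps are well suited to capturing combinatorial changes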

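7. Theoretical Significance

7.1 A Larger Design Space

Generator Matching opens up a very large design space:

- Existing methods are particular points in this space
- In principle, arbitrarily many new Markov processes can be explored
- Each process may have advantages for particular data modalities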

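7.2 A Unified Theoretical Foundation

Generator Matching provides a unified theoretical foundation that helps explain why:

- Diffusion Models perform well on images
- Flow Matching is more efficient on some tasks
- Combining processes may bring additional gains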

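8. Conclusion

By recasting generative modeling as learning the generator of an arbitrary Markov process, Generator Matching unifies existing methods and opens up new research directions.

Key contributions:

- Modality-agnostic framework: no assumption of one specific Markov process
- Unification of existing methods: a single view of Diffusion, Flow Matching, and discrete Diffusion
- Exploration of new processes: jump processes, multimodal superposition, and other new directions
- Empirical validation: effectiveness on image and multimodal tasks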

References

1. “Generator Matching: Generative Modeling with Arbitrary Markov Processes.” ICLR 2025. https://arxiv.org/pdf/2410.20587
