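Overview
Deep learning models deployed in real-world applications face a key challenge: reliably estimating how confident the model is in its predictions. Uncertainty quantification aims to give deep learning models probabilistic predictions, so that a model can recognize its own limitations, for example by "admitting it does not know" when it encounters out-of-distribution (OOD) data. [1]
Uncertainty quantification is essential in safety-critical applications such as autonomous driving, medical diagnosis, and scientific prediction.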
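Types of Uncertainty
1. Aleatoric Uncertainty
Definition: randomness inherent in the data itself, which cannot be removed by collecting more data.
Characteristics:
- Arises from observation noise in the data collection process
- Inherent statistical fluctuation
- Does not decrease as the training set grows
Examples:
- Sensor measurement noise
- Subjective disagreement between annotators
- Randomness in the data itself
Modeling approach: predict the variance at the network's output layer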
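import torch
import torch.nn as nn

# Homoscedastic uncertainty: one learnable log-variance shared by all inputs
log_var = nn.Parameter(torch.zeros(1))
mean = model(x)
var = torch.exp(log_var)

# Heteroscedastic uncertainty: the variance depends on the input
mean, log_var = model(x)  # here the model also predicts log_var as a function of x
var = torch.exp(log_var)

2. Epistemic Uncertainty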
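Definition: uncertainty about the model parameters, caused by limited training data.
Characteristics:
- Can be reduced by adding training data
- Reflects how much the model "knows" about its prediction
- Grows when the model encounters OOD data
Modeling approach: place a distribution over the network parameters instead of a point estimate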
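To make the contrast with aleatoric uncertainty concrete, the sketch below uses a conjugate Gaussian model of an unknown mean. This is an illustrative assumption, not a neural network, and the function name `posterior_variance_of_mean` as well as the prior and noise variances are hypothetical choices. The posterior variance of the mean (the epistemic part) shrinks as observations are added, while the noise variance (the aleatoric part) stays fixed.

```python
def posterior_variance_of_mean(n, noise_var=1.0, prior_var=10.0):
    """Posterior variance of an unknown mean after n Gaussian observations.

    Conjugate model (assumed for illustration): prior N(0, prior_var) on the
    mean, observations drawn from N(mean, noise_var).
    """
    return 1.0 / (1.0 / prior_var + n / noise_var)


noise_var = 1.0  # aleatoric part: fixed by the data-generating process
for n in (0, 10, 100, 1000):
    epistemic = posterior_variance_of_mean(n, noise_var)
    print(f"n={n:5d}  epistemic={epistemic:.4f}  aleatoric={noise_var:.4f}")
```

With no data the epistemic variance equals the prior variance; it then falls roughly as `noise_var / n`, whereas no amount of data changes `noise_var`.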
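3. Decomposing Total Uncertainty
The total predictive uncertainty decomposes as:
$$
\underbrace{\mathbb{V}[y \mid x]}_{\text{total uncertainty}} = \underbrace{\mathbb{E}_{\theta}[\sigma^2(y \mid x, \theta)]}_{\text{aleatoric uncertainty}} + \underbrace{\mathbb{V}_{\theta}[\mathbb{E}[y \mid x, \theta]]}_{\text{epistemic uncertainty}}
$$
Bayesian Neural Networks (BNN)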
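Formal Definition
A Bayesian neural network treats the network weights $\theta$ as random variables with a prior $p(\theta)$. Given training data $\mathcal{D}$, the posterior is:
$$
p(\theta \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid \theta)\, p(\theta)}{p(\mathcal{D})}
$$
At prediction time, all possible weights are integrated out:
$$
p(y \mid x, \mathcal{D}) = \int p(y \mid x, \theta)\, p(\theta \mid \mathcal{D})\, d\theta
$$
Predictive Mean and Variance
With $T$ samples $\theta_t \sim p(\theta \mid \mathcal{D})$ and $\mu_t = \mathbb{E}[y \mid x, \theta_t]$, $\bar{\mu} = \frac{1}{T} \sum_{t=1}^{T} \mu_t$:
$$
\mathbb{E}[y \mid x] \approx \bar{\mu}, \qquad \mathbb{V}[y \mid x] \approx \frac{1}{T} \sum_{t=1}^{T} \sigma^2(y \mid x, \theta_t) + \frac{1}{T} \sum_{t=1}^{T} (\mu_t - \bar{\mu})^2
$$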
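In practice the predictive mean and variance are estimated from a finite set of weight samples. The sketch below assumes each sampled weight vector yields a predicted mean and a predicted noise variance for a single input; the function name `decompose_predictive_variance` and the simulated numbers are hypothetical. It splits the variance into the two terms of the decomposition and cross-checks the total against the empirical variance of draws from the mixture.

```python
import random
import statistics


def decompose_predictive_variance(member_means, member_vars):
    """Split the predictive variance for one input into its two parts.

    member_means[t] and member_vars[t] are the mean and variance predicted
    under the t-th weight sample.
    """
    predictive_mean = statistics.fmean(member_means)
    aleatoric = statistics.fmean(member_vars)        # average predicted noise variance
    epistemic = statistics.pvariance(member_means)   # spread of the predicted means
    return predictive_mean, aleatoric, epistemic, aleatoric + epistemic


rng = random.Random(0)
# Simulated weight samples: means scatter around 2.0, noise variance fixed at 0.25
means = [rng.gauss(2.0, 0.5) for _ in range(5000)]
variances = [0.25 for _ in means]
mean, aleatoric, epistemic, total = decompose_predictive_variance(means, variances)

# Cross-check: draw one y per weight sample and measure the variance directly
ys = [rng.gauss(m, v ** 0.5) for m, v in zip(means, variances)]
print(f"decomposed total={total:.3f}  empirical={statistics.pvariance(ys):.3f}")
```

The two printed numbers should agree up to sampling noise, which is the law of total variance at work.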
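Approximate Inference Methods
Exact Bayesian inference is intractable for high-dimensional neural networks, so approximate methods are needed.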
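1. MC Dropout
Proposed by Gal & Ghahramani (2016), MC Dropout is a simple and effective approximation to Bayesian inference. [2]
Theoretical Basis
Dropout can be interpreted as a form of variational inference. For training data $\mathcal{D}$, training a dropout network minimizes the variational free energy:
$$
\mathcal{F}(q) = -\mathbb{E}_{q(\theta)}[\log p(\mathcal{D} \mid \theta)] + \mathrm{KL}(q(\theta) \,\|\, p(\theta))
$$
where $q(\theta)$ is the approximate posterior induced by the random dropout masks.
MC Dropout Inference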
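import torch

def mc_dropout_predict(model, x, num_samples=50):
    """
    MC Dropout prediction: run several stochastic forward passes.

    Returns (mean, variance, predictions).
    """
    # train() keeps dropout active; note that it also puts BatchNorm layers in training mode
    model.train()
    predictions = []
    with torch.no_grad():
        for _ in range(num_samples):
            predictions.append(model(x))
    predictions = torch.stack(predictions)  # [num_samples, batch, output_dim]
    # Predictive mean
    mean = predictions.mean(dim=0)
    # Predictive variance across passes (epistemic uncertainty)
    variance = predictions.var(dim=0)
    return mean, variance, predictions

Uncertainty Estimation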
def estimate_uncertainty(model, x, y_true=None, num_samples=50):
    mean, variance, _ = mc_dropout_predict(model, x, num_samples)
    # Epistemic uncertainty: spread of the stochastic forward passes
    result = {'mean': mean, 'epistemic_uncertainty': variance}
    # Optionally compute the test log-likelihood
    if y_true is not None:
        # Gaussian approximation to the marginal predictive distribution;
        # the clamp avoids a zero standard deviation
        std = variance.clamp(min=1e-8).sqrt()
        result['test_log_likelihood'] = (
            torch.distributions.Normal(mean, std).log_prob(y_true).mean()
        )
    return result

Limitations of MC Dropout
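- Approximation quality: the dropout approximation to the posterior can be inaccurate
- Train/inference coupling: dropout must be used during training and must also stay active at test time
- Variance estimation: tends to underestimate the true uncertainty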
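The multiple-forward-pass procedure behind MC Dropout can be sketched without any framework. The tiny network, its random weights, and the names `forward` and `mc_dropout` below are hypothetical; the sketch only aims to show that keeping dropout active at prediction time turns one input into a distribution of outputs, at the cost of `num_samples` passes.

```python
import math
import random
import statistics

# Hypothetical fixed weights of a 1-input, 8-hidden-unit, 1-output network
_init = random.Random(0)
W1 = [_init.gauss(0, 1) for _ in range(8)]
B1 = [_init.gauss(0, 1) for _ in range(8)]
W2 = [_init.gauss(0, 1) for _ in range(8)]


def forward(x, rng, dropout_prob=0.5, stochastic=True):
    """One forward pass; hidden units are dropped at random when stochastic."""
    keep = 1.0 - dropout_prob
    out = 0.0
    for w1, b1, w2 in zip(W1, B1, W2):
        h = math.tanh(w1 * x + b1)
        if stochastic:
            # Inverted dropout: keep the unit (rescaled by 1/keep) or zero it
            h = h / keep if rng.random() < keep else 0.0
        out += w2 * h
    return out


def mc_dropout(x, num_samples=200, seed=1):
    """Mean and variance of the output over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [forward(x, rng) for _ in range(num_samples)]
    return statistics.fmean(samples), statistics.pvariance(samples)


mean, var = mc_dropout(0.3)
deterministic = forward(0.3, random.Random(0), stochastic=False)
print(f"MC mean={mean:.3f}  MC variance={var:.3f}  single pass={deterministic:.3f}")
```

The variance printed here plays the role of the epistemic estimate; a single deterministic pass gives no such spread.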
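2. Deep Ensembles
Lakshminarayanan et al. (2017) proposed estimating uncertainty with several independently trained models. [3]
Method
Train $M$ independent models, each starting from a different initialization: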
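import copy
import torch
import torch.nn as nn

class EnsembleModel(nn.Module):
    def __init__(self, base_model, num_models=5):
        super().__init__()
        self.models = nn.ModuleList([
            copy.deepcopy(base_model) for _ in range(num_models)
        ])
        # deepcopy yields identical weights, so re-initialize every member
        for model in self.models:
            for layer in model.modules():
                if hasattr(layer, 'reset_parameters'):
                    layer.reset_parameters()

    def forward(self, x):
        predictions = [model(x) for model in self.models]
        return torch.stack(predictions)

    def predict(self, x):
        preds = self.forward(x)  # [num_models, batch, output]
        mean = preds.mean(dim=0)
        variance = preds.var(dim=0)
        return mean, variance

Uncertainty Decomposition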
import torch

def ensemble_uncertainty(predictions, y_true=None, pred_vars=None):
    """
    Uncertainty quantification for a deep ensemble.
    predictions: predicted means, [num_models, batch_size, output_dim]
    pred_vars:   optional per-member predicted variances, same shape
    """
    # Predictive mean
    mean = predictions.mean(dim=0)
    # Epistemic uncertainty: variance between members
    epistemic = predictions.var(dim=0)
    # Aleatoric uncertainty: average of the members' predicted variances.
    # Without variance outputs it cannot be estimated and is set to zero.
    if pred_vars is not None:
        aleatoric = pred_vars.mean(dim=0)
    else:
        aleatoric = torch.zeros_like(epistemic)
    # Total uncertainty
    total_uncertainty = aleatoric + epistemic
    result = {
        'mean': mean,
        'total_uncertainty': total_uncertainty,
        'epistemic': epistemic,
        'aleatoric': aleatoric,
    }
    if y_true is not None:
        # Test NLL under a Gaussian with the total variance
        std = total_uncertainty.clamp(min=1e-8).sqrt()
        result['test_nll'] = -torch.distributions.Normal(mean, std).log_prob(y_true).mean()
    return result

3. Diversity-Inducing Methods
The quality of an ensemble depends on the diversity of its members.
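One way to see why diversity matters is the variance of an averaged prediction. The sketch below assumes an idealized ensemble in which every member has the same variance and every pair of members has the same correlation; the function name `ensemble_mean_variance` is hypothetical. Under that assumption the variance of the average is `member_var * (correlation + (1 - correlation) / num_models)`.

```python
def ensemble_mean_variance(member_var, correlation, num_models):
    """Variance of the average of num_models equally correlated predictors.

    Idealization: all members share member_var and all pairs share the
    same correlation coefficient.
    """
    return member_var * (correlation + (1.0 - correlation) / num_models)


for rho in (0.0, 0.5, 1.0):
    row = [round(ensemble_mean_variance(1.0, rho, m), 3) for m in (1, 5, 50)]
    print(f"correlation={rho}: variance for M=1, 5, 50 -> {row}")
```

With uncorrelated members the variance falls as 1/M, while perfectly correlated members gain nothing from averaging, so adding more identical models does not help.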
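Random Weight Initialization
import torch

def train_diverse_ensembles(model_class, train_loader, num_models=5):
    ensembles = []
    for i in range(num_models):
        # A different random seed per member
        seed = 42 + i * 17
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Constructing the model after seeding gives each member its own initialization
        model = model_class()
        # Training; train_model is assumed to be defined elsewhere
        train_model(model, train_loader, epochs=100)
        ensembles.append(model)
    return ensembles

Data Augmentation Diversity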
Each ensemble member is trained with a different data augmentation strategy:
class AugmentedEnsemble:
    def __init__(self, augmentations_list):
        # One augmentation callable per ensemble member
        self.augmentations = augmentations_list

    def train(self, model, x, y, model_idx):
        # Apply the augmentation assigned to this member
        aug = self.augmentations[model_idx]
        x_aug = aug(x)
        # Training...

Uncertainty Evaluation Metrics
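1. Spearman Rank Correlation Coefficient
Measures the correlation between uncertainty and error:
$$
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
$$
where $d_i$ is the difference between the prediction-error rank and the uncertainty rank of the $i$-th sample, and $n$ is the number of samples.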
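The rank-difference formula for Spearman's coefficient can be sketched in plain Python as follows. The helper names `rank` and `spearman` and the example numbers are hypothetical; the formula is exact only when there are no tied values, although the ranking helper assigns average ranks to ties so the function still runs.

```python
def rank(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(errors, uncertainties):
    """Spearman's rho from squared rank differences (exact without ties)."""
    n = len(errors)
    d2 = sum((a - b) ** 2 for a, b in zip(rank(errors), rank(uncertainties)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


errors = [0.1, 0.4, 0.2, 0.9, 0.5]
uncertainties = [0.2, 0.5, 0.1, 0.8, 0.6]
print(spearman(errors, uncertainties))
```

A value near 1 means samples with larger uncertainty also tend to have larger error, which is the behavior a useful uncertainty estimate should show.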
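2. Out-of-Distribution Detection
Uncertainty can serve directly as the OOD detection score, with AUROC measuring how well it separates in-distribution from OOD inputs.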
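The detection routine in this section summarizes its result as an AUROC. A minimal sketch of that metric, assuming OOD samples are the positive class and a higher score means "more likely OOD", could look like the following. The function name `auroc` and the example scores are hypothetical, and the pairwise loop is quadratic, which is acceptable for illustration only.

```python
def auroc(labels, scores):
    """Probability that a random positive (OOD) outscores a random negative (ID).

    Ties count as half a win. Runs in O(n_pos * n_neg).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


id_unc = [0.05, 0.10, 0.20, 0.30]   # uncertainties on in-distribution inputs
ood_unc = [0.25, 0.40, 0.60]        # uncertainties on OOD inputs
labels = [0] * len(id_unc) + [1] * len(ood_unc)
print(auroc(labels, id_unc + ood_unc))
```

An AUROC of 0.5 corresponds to an uncertainty score that carries no information about OOD status, and 1.0 to perfect separation.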
def ood_detection(model, id_data, ood_data, method='softmax'):
    """
    Out-of-distribution detection using uncertainty.
    get_softmax_probs and compute_auroc are assumed to be defined elsewhere.
    """
    if method == 'softmax':
        # Based on the maximum softmax probability
        id_unc = 1 - get_softmax_probs(model, id_data).max(dim=-1)[0]
        ood_unc = 1 - get_softmax_probs(model, ood_data).max(dim=-1)[0]
    elif method == 'epistemic':
        # Based on epistemic uncertainty
        _, id_unc, _ = mc_dropout_predict(model, id_data, num_samples=50)
        _, ood_unc, _ = mc_dropout_predict(model, ood_data, num_samples=50)
        id_unc = id_unc.mean(dim=-1)
        ood_unc = ood_unc.mean(dim=-1)
    else:
        raise ValueError(f"unknown method: {method}")
    # Compute AUROC with OOD as the positive class
    labels = torch.cat([torch.zeros(len(id_unc)), torch.ones(len(ood_unc))])
    scores = torch.cat([id_unc, ood_unc])
    auroc = compute_auroc(labels, scores)
    return auroc

3. Calibration Curve
A calibration curve assesses how well predicted probabilities agree with observed accuracy.
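The agreement between confidence and accuracy is often condensed into a single number, the expected calibration error (ECE). The sketch below assumes equal-width confidence bins and hypothetical inputs; the function name `expected_calibration_error` is an illustration, not part of any library used in this document.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """ECE: bin-size-weighted average of |accuracy - confidence| per bin."""
    n = len(confidences)
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 belongs to the last bin
        idx = min(int(conf * num_bins), num_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece


confidences = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
correct = [1, 1, 1, 0, 1, 0]
print(expected_calibration_error(confidences, correct))
```

In this example the model claims 95% confidence on four samples but gets only three right, which dominates the resulting error; a perfectly calibrated model would score 0.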
import numpy as np
import matplotlib.pyplot as plt
import torch

def plot_calibration_curve(model, data_loader, num_bins=10):
    """
    Plot a reliability diagram.
    """
    confidences = []
    accuracies = []
    model.eval()
    with torch.no_grad():
        for x, y in data_loader:
            probs = torch.softmax(model(x), dim=-1)
            max_probs, preds = probs.max(dim=-1)
            confidences.extend(max_probs.cpu().numpy())
            accuracies.extend((preds == y).cpu().numpy())
    confidences = np.array(confidences)
    accuracies = np.array(accuracies)
    # Binning; the clip keeps a confidence of exactly 1.0 in the last bin
    bins = np.linspace(0, 1, num_bins + 1)
    bin_indices = np.clip(np.digitize(confidences, bins) - 1, 0, num_bins - 1)
    bin_confidences = []
    bin_accuracies = []
    for i in range(num_bins):
        mask = bin_indices == i
        if mask.sum() > 0:
            bin_confidences.append(confidences[mask].mean())
            bin_accuracies.append(accuracies[mask].mean())
    # Draw the calibration curve
    plt.figure(figsize=(8, 8))
    plt.plot([0, 1], [0, 1], 'k--', label='Perfect calibration')
    plt.plot(bin_confidences, bin_accuracies, 'o-', label='Model')
    plt.xlabel('Confidence')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.title('Calibration Curve')

Uncertainty in Active Learning
Bayesian Active Learning
Uncertainty is used to select the most valuable samples to label:
import numpy as np

def bayesian_active_learning(model, unlabeled_pool, batch_size=10, num_samples=50):
    """
    Uncertainty-based active learning.
    """
    uncertainties = []
    for x in unlabeled_pool:
        _, variance, _ = mc_dropout_predict(model, x.unsqueeze(0), num_samples)
        uncertainty = variance.mean()  # epistemic uncertainty from MC Dropout
        uncertainties.append(uncertainty.item())
    # Select the samples with the highest uncertainty
    selected_indices = np.argsort(uncertainties)[-batch_size:]
    return selected_indices, np.array(uncertainties)[selected_indices]

Bayesian Query By Committee
Agreement among several models (the committee) is used to select samples:
import torch

def query_by_committee(models, unlabeled_pool, batch_size=10):
    """
    Query By Committee (QBC) active learning.
    """
    all_predictions = []
    with torch.no_grad():
        for model in models:
            # squeeze(0) drops the batch dimension added for the forward pass
            preds = torch.stack([
                torch.softmax(model(x.unsqueeze(0)), dim=-1).squeeze(0)
                for x in unlabeled_pool
            ])
            all_predictions.append(preds)
    all_predictions = torch.stack(all_predictions)  # [num_models, pool_size, num_classes]
    # Disagreement: KL from each member to the committee mean, averaged over members
    mean_pred = all_predictions.mean(dim=0)
    kl_divs = torch.distributions.kl_divergence(
        torch.distributions.Categorical(probs=all_predictions),
        torch.distributions.Categorical(probs=mean_pred),
    ).mean(dim=0)  # [pool_size]
    selected_indices = kl_divs.topk(batch_size)[1]
    return selected_indices, kl_divs[selected_indices]

Uncertainty in Safety-Critical Applications
Autonomous Driving
class UncertaintyAwareDetector:
    def __init__(self, detector, uncertainty_threshold=0.3):
        self.detector = detector
        self.threshold = uncertainty_threshold

    def detect(self, image):
        mean, uncertainty, _ = mc_dropout_predict(self.detector, image)
        # High uncertainty triggers the safety policy:
        # slow down or request human takeover
        if uncertainty.mean() > self.threshold:
            action = 'reduce_speed_or_takeover'
        else:
            action = 'continue'
        return {
            'detections': mean,
            'uncertainty': uncertainty,
            'action': action
        }

Medical Diagnosis
import torch

class BayesianMedicalClassifier:
    def __init__(self, model):
        self.model = model

    def diagnose(self, patient_data):
        mean, variance, _ = mc_dropout_predict(self.model, patient_data)
        # Return the diagnosis with an approximate 95% interval
        # computed on the raw model outputs
        std = torch.sqrt(variance)
        return {
            'diagnosis': mean.argmax(dim=-1),
            'probability': torch.softmax(mean, dim=-1),
            'confidence_interval': (mean - 1.96 * std, mean + 1.96 * std),
            'high_uncertainty': variance.mean() > 0.1
        }

Method Comparison
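| Method | Epistemic uncertainty | Aleatoric uncertainty | Compute cost | Implementation complexity |
|---|---|---|---|---|
| MC Dropout | ✓ | Requires modification | Low | Low |
| Deep ensembles | ✓ | Requires modification | Medium to high | Low |
| Bayesian inference | ✓ | ✓ | High | High |
| SWAG | ✓ | Requires modification | Medium | Medium |
| MC Dropout + heteroscedastic output | ✓ | ✓ | Low | Medium |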
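The "Requires modification" entries refer to giving the network a variance output, as in the heteroscedastic setup described earlier. Such an output is commonly trained with a Gaussian negative log-likelihood; the sketch below assumes a scalar target and a predicted log-variance, and the function name `gaussian_nll` is hypothetical.

```python
import math


def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under N(mean, exp(log_var))."""
    return 0.5 * (math.log(2 * math.pi) + log_var
                  + (y - mean) ** 2 / math.exp(log_var))


# Same residual (y - mean = 1), three different claimed variances
for log_var in (-2.0, 0.0, 2.0):
    print(log_var, round(gaussian_nll(1.0, 0.0, log_var), 3))
```

The loss is smallest when the claimed variance matches the squared residual: claiming too little variance is penalized by the residual term, claiming too much by the `log_var` term.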
Recent Advances
1. SWAG (Stochastic Weight Averaging-Gaussian)
Models uncertainty in weight space as a Gaussian distribution:
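import torch

class SWAG:
    """Diagonal SWAG: a Gaussian over weights fitted to SGD iterates.
    The low-rank deviation term of full SWAG is omitted in this sketch."""

    def __init__(self, model):
        state = model.state_dict()
        self.keys = [k for k, v in state.items() if v.is_floating_point()]
        self.mean = {k: torch.zeros_like(state[k]) for k in self.keys}
        self.sq_mean = {k: torch.zeros_like(state[k]) for k in self.keys}
        self.n = 0

    def collect_weights(self, model):
        # Update the running means of the weights and of their squares
        self.n += 1
        state = model.state_dict()
        for k in self.keys:
            w = state[k].detach()
            self.mean[k] += (w - self.mean[k]) / self.n
            self.sq_mean[k] += (w ** 2 - self.sq_mean[k]) / self.n

    def sample(self):
        # Sample weights from the Gaussian N(mean, diag(var))
        sampled = {}
        for k in self.keys:
            var = (self.sq_mean[k] - self.mean[k] ** 2).clamp(min=0.0)
            sampled[k] = self.mean[k] + var.sqrt() * torch.randn_like(var)
        return sampled

2. LUQ (Learning Uncertainty Quantities)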
Learns uncertainty end to end:
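import torch.nn as nn

class LUQ(nn.Module):
    def __init__(self, backbone, feature_dim, hidden_dim, output_dim):
        super().__init__()
        # Shared feature extractor
        self.backbone = backbone
        self.mean_head = nn.Linear(feature_dim, output_dim)
        self.var_head = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, output_dim),
            nn.Softplus()  # keeps the variance positive
        )

    def forward(self, x):
        features = self.backbone(x)
        return self.mean_head(features), self.var_head(features)

3. Evidential Deep Learning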
Models uncertainty through an evidential distribution:
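import torch
import torch.nn as nn

class EvidentialNetwork(nn.Module):
    def __init__(self, backbone, num_classes):
        super().__init__()
        # backbone is assumed to output [batch, num_classes + 1] values
        self.backbone = backbone
        self.num_classes = num_classes

    def forward(self, x):
        outputs = self.backbone(x)
        # Dirichlet parameters (evidence)
        gamma = torch.softmax(outputs[:, :self.num_classes], dim=-1)
        log_lambda = outputs[:, self.num_classes:]
        alpha = gamma * torch.exp(log_lambda) + 1
        # The predictive distribution is a Dirichlet
        return {'alpha': alpha}

def evidential_loss(y_true, y_pred, kl_weight=1.0):
    # Evidential classification loss; y_true is one-hot (float), [batch, num_classes]
    alpha = y_pred['alpha']
    strength = alpha.sum(dim=-1, keepdim=True)
    # Data loss: expected cross-entropy under the Dirichlet
    data_loss = torch.sum(
        y_true * (torch.digamma(strength) - torch.digamma(alpha)),
        dim=-1
    )
    # Regularization loss: KL to the uniform Dirichlet after removing
    # the evidence assigned to the true class
    alpha_tilde = y_true + (1 - y_true) * alpha
    reg_loss = torch.distributions.kl_divergence(
        torch.distributions.Dirichlet(alpha_tilde),
        torch.distributions.Dirichlet(torch.ones_like(alpha_tilde)),
    )
    return (data_loss + kl_weight * reg_loss).mean()

References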
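Related Topics
- bayesian-neural-networks - Bayesian neural network fundamentals
- bayesian-neural-networks-advanced-inference - Advanced inference for Bayesian neural networks
- variational-inference-advanced - Advanced variational inference
- mcmc-methods - MCMC methods
- mc-dropout - MC Dropout in detail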
Footnotes
1. Kendall & Gal (2017). "What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?" NeurIPS 2017.
2. Gal & Ghahramani (2016). "Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning". ICML 2016.
3. Lakshminarayanan et al. (2017). "Simple and Scalable Predictive Uncertainty Estimation Using Deep Ensembles". NeurIPS 2017.