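Recent Advances in Knowledge Graph Embedding

Knowledge graph embedding (KGE) maps entities and relations into a continuous vector space and underpins tasks such as knowledge graph completion and reasoning. This document surveys notable developments in knowledge graph embedding from 2025 to 2026.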

1. Background

1.1 Problem Formalization

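Given a knowledge graph $\mathcal{G} = (\mathcal{E}, \mathcal{R}, \mathcal{T})$, where:

  • $\mathcal{E}$: the set of entities
  • $\mathcal{R}$: the set of relations
  • $\mathcal{T} \subseteq \mathcal{E} \times \mathcal{R} \times \mathcal{E}$: the set of triples

Goal: learn an embedding function $f: \mathcal{E} \cup \mathcal{R} \to \mathbb{R}^d$ such that a scoring function

$$s(h, r, t) = g\big(f(h), f(r), f(t)\big)$$

can distinguish positive triples from negative triples.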
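The objective above can be illustrated with a minimal sketch: a TransE-style score combined with a margin ranking loss. The toy embeddings, the helper names `transe_score` and `margin_loss`, and the margin value are assumptions chosen for illustration, not part of any specific method in this document.

```python
import math


def transe_score(h, r, t):
    """Negative L2 distance between h + r and t; higher means more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss that reaches zero once the positive beats the negative by `margin`."""
    return max(0.0, margin - pos_score + neg_score)


# Toy 2-d embeddings (hypothetical values).
paris = [0.0, 1.0]
capital_of = [1.0, 0.0]
france = [1.0, 1.0]
japan = [-2.0, 3.0]

pos = transe_score(paris, capital_of, france)  # true triple
neg = transe_score(paris, capital_of, japan)   # corrupted tail
loss = margin_loss(pos, neg)
```

Here the positive triple already outscores the corrupted one by more than the margin, so the loss vanishes and this pair would contribute no gradient.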

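1.2 Taxonomy of Classical Methods

Category             Representative models        Core idea
Translation models   TransE, TransR, TransD       Relation as a translation: $h + r \approx t$
Bilinear models      DistMult, ComplEx, RotatE    Multiplicative interaction between $h$, $r$, and $t$
Neural networks      ConvE, CompGCN               CNN/GCN encoders
Transformer          KEPLER, CoSLM                Pretrained language models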
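To make the multiplicative family in the table concrete, here is a small dependency-free sketch of DistMult, ComplEx, and RotatE scoring. The function names and toy vectors are assumptions; the RotatE distance uses the sum of element-wise moduli, which is one common choice.

```python
import cmath
import math


def distmult_score(h, r, t):
    """DistMult: sum_i h_i * r_i * t_i (symmetric in h and t)."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def complex_score(h, r, t):
    """ComplEx: Re(sum_i h_i * r_i * conj(t_i)); can model asymmetric relations."""
    return sum(hi * ri * ti.conjugate() for hi, ri, ti in zip(h, r, t)).real


def rotate_score(h, phases, t):
    """RotatE: each relation component is a unit-modulus rotation e^{i * phase}."""
    return -sum(abs(hi * cmath.exp(1j * p) - ti) for hi, p, ti in zip(h, phases, t))


# DistMult cannot tell (h, r, t) from (t, r, h).
dm_forward = distmult_score([1.0, 2.0], [0.5, -1.0], [2.0, 1.0])
dm_backward = distmult_score([2.0, 1.0], [0.5, -1.0], [1.0, 2.0])

# ComplEx can: swapping head and tail flips the sign in this toy case.
cx_forward = complex_score([1 + 0j], [0 + 1j], [0 + 1j])
cx_backward = complex_score([0 + 1j], [0 + 1j], [1 + 0j])

# RotatE: rotating the head by the relation phases lands exactly on the tail.
rot = rotate_score([1 + 0j, 0 + 1j], [math.pi / 2, math.pi], [0 + 1j, 0 - 1j])
```

The symmetry of DistMult is the usual motivation for moving to complex-valued models when a graph contains directed relations.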

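2. MAYPL: Hyper-Relational Knowledge Graph Embedding

2.1 Core Idea

MAYPL (Multi-Aspect Yielding Pyramidal Learning) [1] handles hyper-relational knowledge graphs (hyper-relational KGs) through pyramid-style attentive message passing, which learns structural representations efficiently.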

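2.2 Hyper-Relational Knowledge Graphs

Traditional KG vs. hyper-relational KG:

Traditional KG:
  (Paris, capitalOf, France)

Hyper-relational KG:
  (Paris, capitalOf, France, {qualifier: {year: 2020, status: official}})
                             ↑
                  qualifiers attached to the triple

Advantages:
- Expresses richer semantics
- Distinguishes different instances of the same relation
- Represents knowledge more precisely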
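One way to hold such facts in memory is sketched below. The class name `HyperRelationalFact`, its fields, and the second fact (with year 1950) are hypothetical and serve only to show how qualifiers separate two instances of the same base triple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperRelationalFact:
    """A base triple plus a tuple of (qualifier_relation, qualifier_value) pairs."""
    head: str
    relation: str
    tail: str
    qualifiers: tuple = ()

    def triple(self):
        """Return the base triple with the qualifiers dropped."""
        return (self.head, self.relation, self.tail)


facts = [
    HyperRelationalFact("Paris", "capitalOf", "France",
                        (("year", "2020"), ("status", "official"))),
    # Hypothetical second instance of the same relation with a different qualifier.
    HyperRelationalFact("Paris", "capitalOf", "France", (("year", "1950"),)),
]

same_triple = facts[0].triple() == facts[1].triple()
distinct_facts = facts[0] != facts[1]
```

A traditional KG would collapse both entries into one triple, while the hyper-relational form keeps them apart.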

2.3 Architecture

┌─────────────────────────────────────────────────────────────┐
│                    MAYPL Architecture                       │
│                                                             │
│  Input: (h, r, t, qualifiers)                               │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │        Pyramidal Attention Layers                     │  │
│  │                                                       │  │
│  │     Level 4: [global context aggregation]             │  │
│  │           ↑                                           │  │
│  │     Level 3: [relation context]                       │  │
│  │           ↑                                           │  │
│  │     Level 2: [qualifier aggregation]                  │  │
│  │           ↑                                           │  │
│  │     Level 1: [base triple]                            │  │
│  │                                                       │  │
│  └───────────────────────────────────────────────────────┘  │
│                           │                                 │
│                           ▼                                 │
│  ┌───────────────────────────────────────────────────────┐  │
│  │          Entity/Relation Embedding Layer              │  │
│  │                                                       │  │
│  │  h_entity = EntityEncoder(entity)                     │  │
│  │  h_relation = RelationEncoder(relation)               │  │
│  │  h_qualifiers = QualifierEncoder(qualifiers)          │  │
│  └───────────────────────────────────────────────────────┘  │
│                           │                                 │
│                           ▼                                 │
│  ┌───────────────────────────────────────────────────────┐  │
│  │          Scoring Function                             │  │
│  │                                                       │  │
│  │  score = f(h_entity, h_relation, h_qualifiers)        │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘

2.4 Pyramidal Attention Mechanism

import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class PyramidalAttention(nn.Module):
    """Pyramid-style attention over a base triple and its qualifiers."""

    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.dim = dim
        self.num_heads = num_heads
        self.head_dim = dim // num_heads

        # One projection per pyramid level
        self.base_attn = nn.Linear(3 * dim, dim)   # Level 1: input is [h; r; t]
        self.qualifier_attn = nn.Linear(dim, dim)  # Level 2
        self.context_attn = nn.Linear(dim, dim)    # Level 3
        self.global_attn = nn.Linear(dim, dim)     # Level 4

    def forward(self, base_features, qualifiers, entity_context=None):
        """
        Pyramid-style forward pass.

        base_features: [B, 3D]; qualifiers: [B, N, D].
        entity_context is accepted for interface compatibility but is unused here.
        """
        # === Level 1: encode the base triple ===
        base_out = self.base_attn(base_features)  # [B, D]

        # === Level 2: aggregate qualifiers ===
        # Attention of the encoded triple over the projected qualifiers
        keys = self.qualifier_attn(qualifiers)  # [B, N, D]
        qualifier_scores = torch.matmul(
            base_out.unsqueeze(1),   # [B, 1, D]
            keys.transpose(1, 2)     # [B, D, N]
        ) / math.sqrt(self.head_dim)  # [B, 1, N]

        qualifier_weights = F.softmax(qualifier_scores, dim=-1)
        qualifier_out = torch.matmul(qualifier_weights, qualifiers)  # [B, 1, D]

        # === Level 3: relation context aggregation ===
        combined = base_out + qualifier_out.squeeze(1)
        context_out = self.context_attn(combined)

        # === Level 4: global context ===
        global_out = self.global_attn(context_out)

        return global_out  # [B, D]


class MAYPLModel(nn.Module):
    """Full MAYPL-style model (illustrative sketch)."""

    def __init__(self, num_entities, num_relations, dim=256, num_qualifiers=10000):
        super().__init__()

        # Embedding layers
        self.entity_emb = nn.Embedding(num_entities, dim)
        self.relation_emb = nn.Embedding(num_relations, dim)
        # The default assumes at most 10000 qualifier types
        self.qualifier_emb = nn.Embedding(num_qualifiers, dim)

        # Pyramidal attention
        self.pyramidal_attn = PyramidalAttention(dim)

        # Scoring layers
        self.score_net = nn.Sequential(
            nn.Linear(dim * 3, dim),
            nn.ReLU(),
            nn.Linear(dim, 1)
        )

        # Normalization
        self.entity_norm = nn.LayerNorm(dim)
        self.relation_norm = nn.LayerNorm(dim)

    def forward(self, head, relation, tail, qualifiers=None):
        """
        Forward pass.

        head, relation, tail: LongTensor [B]; qualifiers: LongTensor [B, N] or None.
        """
        # 1. Look up embeddings
        h_e = self.entity_norm(self.entity_emb(head))
        r = self.relation_norm(self.relation_emb(relation))
        t_e = self.entity_norm(self.entity_emb(tail))

        # 2. Encode qualifiers (if any)
        if qualifiers is not None:
            q_emb = self.qualifier_emb(qualifiers)  # [B, N, D]
        else:
            q_emb = torch.zeros_like(h_e).unsqueeze(1)  # [B, 1, D]

        # 3. Pyramidal attention
        base_features = torch.cat([h_e, r, t_e], dim=-1)  # [B, 3D]
        pyramid_out = self.pyramidal_attn(
            base_features,
            q_emb,
            entity_context=h_e
        )  # [B, D]

        # 4. Scoring
        combined = torch.cat([h_e, pyramid_out, t_e], dim=-1)
        score = self.score_net(combined)  # [B, 1]

        return score
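The Level 2 step above attends over every qualifier slot, so padded slots in a batch with variable-length qualifier lists would also receive weight. A minimal sketch of masked scaled dot-product attention for a single query follows, written in plain Python to stay dependency-free. The function name `masked_attention` and the toy vectors are assumptions.

```python
import math


def masked_attention(query, keys, mask):
    """Attend from one query over `keys`, ignoring slots where mask is False.

    query: list of D floats; keys: list of N vectors of length D;
    mask: list of N bools (True marks a real qualifier, False marks padding).
    Returns (weights, pooled) where pooled is the weighted sum of the keys.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    valid = [s for s, m in zip(scores, mask) if m]
    if not valid:
        # No real qualifiers: return zero weights and a zero vector.
        return [0.0] * len(keys), [0.0] * d
    top = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * key[j] for w, key in zip(weights, keys)) for j in range(d)]
    return weights, pooled


# Third slot is padding; its large values must not influence the result.
weights, pooled = masked_attention(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]],
    [True, True, False],
)
```

In a tensor implementation the same effect is usually obtained by setting masked scores to negative infinity before the softmax.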

2.5 Experimental Results

Method        FB15k-237-HR MRR   WN18RR MRR   Inductive setting
MAYPL         0.42               0.52
NaLP          0.38               0.45
HyperRel      0.39               0.48
Improvement   +7.9%              +8.3%        -
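For readers unfamiliar with the metric in the table, MRR (mean reciprocal rank) averages 1/rank of the correct entity over all test queries. A short sketch with hypothetical ranks follows; `hits_at_k` is included because it is commonly reported alongside MRR.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1 / rank, where rank 1 means the correct entity was scored highest."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct tail entity for four test queries.
ranks = [1, 2, 4, 10]
mrr = mean_reciprocal_rank(ranks)   # (1 + 0.5 + 0.25 + 0.1) / 4
hits3 = hits_at_k(ranks, 3)         # two of the four ranks are <= 3
```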

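3. CKRHE: Hierarchical Embedding for Complex Knowledge Graphs

3.1 Core Idea

CKRHE (Complex Knowledge Graph Hierarchical Representation Embedding) [2] is a hierarchical embedding method for complex knowledge graphs that jointly models hierarchy at the entity level, the relation level, and the graph level.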

3.2 Modeling Multi-Level Structure

Hierarchy of a complex KG:

Graph level
  │
  ├── Schema level
  │     │
  │     ├── Entity types
  │     └── Relation types
  │
  └── Instance level
        │
        ├── Concrete entities
        └── Concrete relations

3.3 Hierarchical Embedding Mechanism

import torch
import torch.nn as nn
import torch.nn.functional as F


class HierarchicalEmbedding(nn.Module):
    """Hierarchical embedding with instance-level and type-level vectors."""

    def __init__(self, num_entities, num_relations, num_types, dim, lam=0.1):
        super().__init__()
        self.dim = dim
        self.lam = lam  # weight of the type-level score (default value is an assumption)

        # Instance-level embeddings
        self.entity_emb = nn.Embedding(num_entities, dim)
        self.relation_emb = nn.Embedding(num_relations, dim)

        # Type-level embeddings
        self.entity_type_emb = nn.Embedding(num_types, dim)
        self.relation_type_emb = nn.Embedding(num_types, dim)

        # Projections that inject type information into instance embeddings
        self.type_entity_proj = nn.Linear(dim, dim)
        self.type_relation_proj = nn.Linear(dim, dim)

    def get_entity_embedding(self, entity_id, entity_type_id):
        """
        Return the type-constrained entity embedding.
        """
        # Base entity embedding
        base_emb = self.entity_emb(entity_id)

        # Type embedding
        type_emb = self.entity_type_emb(entity_type_id)

        # Project the type into the entity space
        projected_type = self.type_entity_proj(type_emb)

        # Combine: entity embedding + type constraint
        final_emb = base_emb + projected_type

        return F.normalize(final_emb, p=2, dim=-1)

    def get_relation_embedding(self, relation_id, relation_type_id):
        """
        Return the type-constrained relation embedding.
        """
        base_emb = self.relation_emb(relation_id)
        type_emb = self.relation_type_emb(relation_type_id)
        projected_type = self.type_relation_proj(type_emb)

        # Unlike entities, the relation embedding is returned without normalization
        return base_emb + projected_type

    def hierarchical_score(self, h_emb, r_emb, t_emb, h_type_id, r_type_id, t_type_id):
        """
        Hierarchical score: instance-level term plus a weighted type-level term.

        Type ids are passed explicitly because tensors carry no metadata.
        """
        # Instance-level score (TransE-style)
        instance_score = -torch.norm(h_emb + r_emb - t_emb, p=2, dim=-1)

        # Type-level consistency: head and tail types should fit the relation's pattern
        h_type = self.entity_type_emb(h_type_id)
        t_type = self.entity_type_emb(t_type_id)
        r_type = self.relation_type_emb(r_type_id)
        type_score = -torch.norm(h_type + r_type - t_type, p=2, dim=-1)

        return instance_score + self.lam * type_score

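4. Scalable Methods for Large-Scale Knowledge Graphs

4.1 Challenges

Challenges of large-scale KGs:

Scale:
- Wikidata: 100 million+ triples
- Freebase: 3 billion+ triples
- DBpedia: 500 million+ triples

Problems:
- The full embedding table does not fit in memory
- Training takes too long
- Storage and computation must be distributed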
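A quick back-of-the-envelope calculation shows why memory is the first obstacle. The entity count and dimension below are hypothetical (the list above gives triple counts, not entity counts), and the factor of three assumes an Adam-style optimizer that keeps two extra moment buffers per parameter.

```python
def embedding_table_bytes(num_rows, dim, bytes_per_value=4):
    """Memory needed for a dense embedding table (float32 by default)."""
    return num_rows * dim * bytes_per_value


# Hypothetical setting: 100 million entities with 256-dimensional float32 embeddings.
table_bytes = embedding_table_bytes(100_000_000, 256)
table_gib = table_bytes / 2**30

# Parameters plus two optimizer moment buffers of the same shape.
training_gib = 3 * table_gib
```

At roughly 95 GiB for the table alone, and close to three times that during training, a single accelerator cannot hold the model, which motivates the sharding approach in the next section.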

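4.2 Sharded Embedding

import math

import torch
import torch.nn as nn


class RoutingModule(nn.Module):
    """Maps an entity id to a (shard, offset) pair.

    A learned argmax router is not differentiable and yields no stable offset,
    so this sketch routes with a fixed modulo hash instead.
    """

    def __init__(self, num_shards):
        super().__init__()
        self.num_shards = num_shards

    def get_shard(self, entity_ids):
        """Shard index of each entity."""
        return entity_ids % self.num_shards

    def get_offset(self, entity_ids):
        """Row index of each entity inside its shard."""
        return entity_ids // self.num_shards


class ShardedKGEmbedding(nn.Module):
    """Sharded knowledge graph embedding."""

    def __init__(self, num_entities, num_relations, num_shards, dim_per_shard,
                 cross_shard=False):
        super().__init__()
        self.num_shards = num_shards
        self.dim_per_shard = dim_per_shard
        self.cross_shard = cross_shard

        # One independent embedding table per shard
        shard_size = math.ceil(num_entities / num_shards)
        self.entity_embs = nn.ModuleList([
            nn.Embedding(shard_size, dim_per_shard)
            for _ in range(num_shards)
        ])
        self.relation_emb = nn.Embedding(num_relations, dim_per_shard)

        # Shard routing
        self.router = RoutingModule(num_shards)

    def get_entity_embedding(self, entity_ids):
        """
        Look up entity embeddings through the router. entity_ids: LongTensor [B].
        """
        # Determine each entity's shard and its index inside that shard
        shard_ids = self.router.get_shard(entity_ids)
        offsets = self.router.get_offset(entity_ids)

        # Gather rows shard by shard
        rows = torch.zeros(entity_ids.shape[0], self.dim_per_shard,
                           device=entity_ids.device)
        for s, table in enumerate(self.entity_embs):
            mask = shard_ids == s
            if mask.any():
                rows[mask] = table(offsets[mask])

        # Optional cross-shard attention over per-shard mean vectors
        if self.cross_shard:
            summaries = torch.stack(
                [table.weight.mean(dim=0) for table in self.entity_embs]
            )  # [S, D]
            attn = torch.softmax(
                rows @ summaries.T / math.sqrt(self.dim_per_shard), dim=-1
            )
            rows = rows + attn @ summaries

        return rows

    def compute_score(self, h_emb, r_emb, t_emb):
        """TransE-style score, used here because the scorer is left unspecified."""
        return -torch.norm(h_emb + r_emb - t_emb, p=2, dim=-1)

    def forward(self, triples):
        """
        Score a batch of triples. triples: LongTensor [B, 3] of (h, r, t) ids.
        """
        h, r, t = triples[:, 0], triples[:, 1], triples[:, 2]
        return self.compute_score(
            self.get_entity_embedding(h),
            self.relation_emb(r),
            self.get_entity_embedding(t),
        )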

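4.3 Negative Sampling Optimization

import random
from collections import Counter

import torch


class AdaptiveNegativeSampler:
    """Adaptive negative sampling.

    Assumes kg.entities is a list of entity ids, kg.triples is an iterable of
    (h, r, t) tuples, and model.score(h, r, t) returns a scalar score.
    """

    def __init__(self, kg, model):
        self.kg = kg
        self.model = model
        self.known = set(kg.triples)  # used to filter out false negatives
        self.entity_freq = self.compute_entity_frequency()

    def compute_entity_frequency(self):
        """Count how often each entity appears as a head or a tail."""
        freq = Counter()
        for h, _, t in self.kg.triples:
            freq[h] += 1
            freq[t] += 1
        return freq

    def sample_negatives(self, positive_triple, num_negatives, strategy='adversarial'):
        """
        Adaptive negative sampling.
        """
        h, r, t = positive_triple

        if strategy == 'adversarial':
            # Adversarial sampling: pick hard negatives according to the current model
            return self.adversarial_sample(h, r, t, num_negatives)

        elif strategy == 'frequency':
            # Frequency-aware sampling: frequent entities are sampled less often
            return self.frequency_sample(h, r, t, num_negatives)

        elif strategy == 'type':
            # Type-aware sampling: keep negatives type-consistent
            return self.type_constrained_sample(h, r, t, num_negatives)

        raise ValueError(f"unknown strategy: {strategy}")

    def adversarial_sample(self, h, r, t, num_negatives):
        """Pick entities the model currently scores as plausible but that are not known facts."""
        # Note: this scans every entity; subsample candidates first on large KGs
        candidates = []
        for e in self.kg.entities:
            if e in (h, t) or (h, r, e) in self.known or (e, r, t) in self.known:
                continue
            score_hr = float(self.model.score(h, r, e))  # e as a corrupted tail
            score_et = float(self.model.score(e, r, t))  # e as a corrupted head
            candidates.append((e, score_hr + score_et))

        # Keep the top-k hardest negatives
        candidates.sort(key=lambda x: x[1], reverse=True)
        return [e for e, _ in candidates[:num_negatives]]

    def frequency_sample(self, h, r, t, num_negatives):
        """Frequency-aware sampling: the rarer the entity, the higher its probability."""
        entities = list(self.kg.entities)
        probs = torch.tensor([
            0.0 if e in (h, t) else 1.0 / (self.entity_freq[e] + 1)
            for e in entities
        ])
        probs = probs / probs.sum()

        # Without replacement, num_negatives must not exceed the number of candidates
        indices = torch.multinomial(probs, num_negatives, replacement=False)
        return [entities[i] for i in indices.tolist()]

    def type_constrained_sample(self, h, r, t, num_negatives):
        """Type-aware sampling; assumes kg.entity_types maps each entity to its type."""
        same_type = [
            e for e in self.kg.entities
            if e != t
            and self.kg.entity_types[e] == self.kg.entity_types[t]
            and (h, r, e) not in self.known
        ]
        return random.sample(same_type, min(num_negatives, len(same_type)))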
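A softer alternative to picking only the top-k hardest negatives is to keep all sampled negatives and weight them by a softmax over their current scores, in the style of self-adversarial negative sampling. The sketch below assumes higher scores mean "more plausible"; the function names and the temperature `alpha` are illustrative.

```python
import math


def self_adversarial_weights(neg_scores, alpha=1.0):
    """Softmax over negative scores; harder negatives (higher score) get more weight."""
    top = max(neg_scores)  # subtract the max for numerical stability
    exps = [math.exp(alpha * (s - top)) for s in neg_scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_negative_loss(neg_scores, alpha=1.0):
    """Weighted sum of -log(sigmoid(-s)) = log(1 + exp(s)) over the negatives.

    The weights are treated as constants, so in an autograd framework they
    would be detached from the computation graph.
    """
    weights = self_adversarial_weights(neg_scores, alpha)
    return sum(w * math.log1p(math.exp(s)) for w, s in zip(weights, neg_scores))


neg_scores = [2.0, 0.0, -1.0]                              # hypothetical model scores
weights = self_adversarial_weights(neg_scores)             # emphasizes the hardest negative
uniform = self_adversarial_weights(neg_scores, alpha=0.0)  # alpha = 0 gives uniform weights
loss = weighted_negative_loss(neg_scores)
```

Compared with the exhaustive scan in `adversarial_sample`, this only needs scores for the negatives already in the batch.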

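5. Hyperbolic Knowledge Graph Embedding

5.1 Core Idea

Hyperbolic space has negative curvature, so its volume grows exponentially with radius. This lets it embed tree-like hierarchies with low distortion in far fewer dimensions than Euclidean space, which makes it well suited to the hierarchical structure found in knowledge graphs.

5.2 Hyperbolic Space Basics

Euclidean space vs. hyperbolic space:

Euclidean space (curvature = 0):
- Ball volume grows polynomially with radius
- Capacity is fixed once the dimension is fixed

Hyperbolic space (curvature < 0):
- Ball volume grows exponentially with radius
- Capacity grows exponentially toward the boundary

For example:
- Tree structures embed naturally in hyperbolic space
- Hierarchical relations need no extra dimensions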
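The extra room near the boundary can be seen numerically. The sketch below computes the Poincaré ball distance for curvature -1 in plain Python and compares two point pairs with the same Euclidean separation; the specific points are arbitrary choices for illustration.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance in the Poincaré ball of curvature -1."""
    sq = lambda v: sum(a * a for a in v)
    diff = sq([a - b for a, b in zip(x, y)])
    return math.acosh(1 + 2 * diff / ((1 - sq(x)) * (1 - sq(y))))


# Both pairs are 0.1 apart in Euclidean terms.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.89, 0.0], [0.99, 0.0])
```

The pair near the boundary is more than ten times farther apart in hyperbolic terms, which is why the many leaves of a tree can all be placed near the boundary without crowding.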

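5.3 Poincaré Ball Model

import math

import torch
import torch.nn as nn


class PoincareBallEmbedding(nn.Module):
    """Embeddings in the Poincaré ball of curvature -c, where c = self.curvature > 0."""

    def __init__(self, num_entities, dim, curvature=1.0, eps=1e-5):
        super().__init__()
        self.dim = dim
        self.curvature = curvature
        self.eps = eps

        # Initialize close to the origin, well inside the ball
        self.emb = nn.Embedding(num_entities, dim)
        nn.init.uniform_(self.emb.weight, -1e-4, 1e-4)

        # Riemannian gradient rescaling is left to the optimizer

    def mobius_add(self, x, y, c):
        """Möbius addition x ⊕ y, the ball's analogue of vector addition."""
        xy = torch.sum(x * y, dim=-1, keepdim=True)
        x2 = torch.sum(x * x, dim=-1, keepdim=True)
        y2 = torch.sum(y * y, dim=-1, keepdim=True)
        num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
        denom = 1 + 2 * c * xy + c ** 2 * x2 * y2
        return num / denom.clamp(min=self.eps)

    def exponential_map(self, x, v, c=None):
        """
        Exponential map: move from x along tangent vector v onto the manifold.

        Args:
            x: point in the Poincaré ball
            v: tangent vector at x
            c: curvature magnitude (defaults to self.curvature)
        """
        if c is None:
            c = self.curvature

        sqrt_c = math.sqrt(c)
        v_norm = torch.norm(v, dim=-1, keepdim=True).clamp(min=self.eps)

        # Conformal factor lambda_x = 2 / (1 - c * ||x||^2)
        x2 = torch.sum(x * x, dim=-1, keepdim=True)
        lambda_x = 2 / (1 - c * x2).clamp(min=self.eps)

        # exp_x(v) = x ⊕ (tanh(sqrt(c) * lambda_x * ||v|| / 2) * v / (sqrt(c) * ||v||))
        step = torch.tanh(sqrt_c * lambda_x * v_norm / 2) * v / (sqrt_c * v_norm)
        return self.mobius_add(x, step, c)

    def distance(self, x, y, c=None):
        """
        Poincaré distance for curvature -c.
        """
        if c is None:
            c = self.curvature

        sqrt_c = math.sqrt(c)

        x2 = torch.sum(x * x, dim=-1, keepdim=True)
        y2 = torch.sum(y * y, dim=-1, keepdim=True)

        numerator = 2 * c * torch.sum((x - y) ** 2, dim=-1, keepdim=True)
        denominator = (1 - c * x2) * (1 - c * y2)

        ratio = 1 + numerator / denominator.clamp(min=self.eps)

        # acosh is only defined for arguments >= 1; divide by sqrt(c) to rescale
        dist = torch.acosh(ratio.clamp(min=1 + self.eps)) / sqrt_c

        return dist

    def score(self, head_emb, rel_emb, tail_emb):
        """
        Score based on hyperbolic distance.
        """
        # Translate the head by the relation inside hyperbolic space
        translated = self.exponential_map(head_emb, rel_emb)

        # Distance from the translated head to the tail
        dist = self.distance(translated, tail_emb)

        # Smaller distance means a more plausible triple
        return -dist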
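Poincaré embeddings must stay strictly inside the ball, since the distance is undefined on or beyond the boundary. When plain Euclidean update steps are used, a common remedy is to project points back after each step. The sketch below shows that projection in plain Python; the function name `project_to_ball` and the tolerance `eps` are assumptions.

```python
import math


def project_to_ball(x, c=1.0, eps=1e-5):
    """Rescale x so that ||x|| <= (1 - eps) / sqrt(c); interior points are unchanged."""
    max_norm = (1.0 - eps) / math.sqrt(c)
    norm = math.sqrt(sum(a * a for a in x))
    if norm <= max_norm:
        return list(x)
    scale = max_norm / norm
    return [a * scale for a in x]


inside = project_to_ball([0.3, 0.4])    # norm 0.5, already inside the ball
clipped = project_to_ball([3.0, 4.0])   # norm 5.0, pulled back just inside the boundary
clipped_norm = math.sqrt(sum(a * a for a in clipped))
```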

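5.4 Lorentz Model

import math

import torch
import torch.nn as nn


class LorentzEmbedding(nn.Module):
    """Embeddings on the Lorentz (hyperboloid) model with curvature -c."""

    def __init__(self, num_entities, dim, curvature=1.0):
        super().__init__()
        self.dim = dim
        self.curvature = curvature

        # Store only the dim spatial coordinates; the time coordinate x_0 is
        # recomputed in project_to_hyperboloid and placed first
        self.emb = nn.Embedding(num_entities, dim)

        # Initialize near the origin of the hyperboloid
        nn.init.uniform_(self.emb.weight, -1e-4, 1e-4)

    def lorentz_inner(self, x, y):
        """Lorentzian inner product <x, y>_L = -x_0 * y_0 + sum_i x_i * y_i."""
        return -x[..., 0] * y[..., 0] + torch.sum(x[..., 1:] * y[..., 1:], dim=-1)

    def project_to_hyperboloid(self, x):
        """Lift spatial coordinates onto the hyperboloid."""
        # Constraint: <x, x>_L = -1/c with x_0 > 0
        x0 = torch.sqrt(1 / self.curvature + torch.sum(x ** 2, dim=-1, keepdim=True))
        return torch.cat([x0, x], dim=-1)

    def lorentz_distance(self, x, y):
        """Geodesic distance on the hyperboloid."""
        # -c * <x, y>_L = cosh(sqrt(c) * d), where d is the hyperbolic distance
        inner = -self.curvature * self.lorentz_inner(x, y)
        return torch.acosh(torch.clamp(inner, min=1 + 1e-6)) / math.sqrt(self.curvature)

    def riemannian_grad(self, grad, x):
        """Convert a Euclidean gradient at x into a Riemannian gradient."""
        # Apply the inverse metric (negate the time component), then project
        # the result onto the tangent space at x
        h = torch.cat([-grad[..., :1], grad[..., 1:]], dim=-1)
        return h + self.curvature * self.lorentz_inner(x, h).unsqueeze(-1) * x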
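The Poincaré ball and the Lorentz model describe the same geometry in different coordinates. As a sanity check, the sketch below (plain Python, curvature -1) maps two ball points onto the hyperboloid and confirms that both distance formulas agree. The helper names and the sample points are illustrative.

```python
import math


def poincare_distance(p, q):
    """Distance in the Poincaré ball of curvature -1."""
    sq = lambda v: sum(a * a for a in v)
    diff = sq([a - b for a, b in zip(p, q)])
    return math.acosh(1 + 2 * diff / ((1 - sq(p)) * (1 - sq(q))))


def poincare_to_lorentz(p):
    """Map a ball point to hyperboloid coordinates (time coordinate first)."""
    sq = sum(a * a for a in p)
    return [(1 + sq) / (1 - sq)] + [2 * a / (1 - sq) for a in p]


def lorentz_inner(x, y):
    """Lorentzian inner product: -x_0 * y_0 + sum_i x_i * y_i."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lorentz_distance(x, y):
    """Distance on the hyperboloid; max() guards against rounding below 1."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))


p, q = [0.3, 0.4], [-0.5, 0.2]
x, y = poincare_to_lorentz(p), poincare_to_lorentz(q)

on_hyperboloid = lorentz_inner(x, x)   # should equal -1 up to rounding
d_ball = poincare_distance(p, q)
d_lorentz = lorentz_distance(x, y)
```

The Lorentz form avoids the division by (1 - ||x||^2), which is one reason it is often preferred for numerical stability near the boundary.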

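6. Contrastive Learning for KGE

6.1 Core Idea

Contrastive learning serves as an auxiliary objective that strengthens entity and relation representations and helps them capture richer semantic information.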

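6.2 Structural Contrastive Learning

import torch
import torch.nn as nn
import torch.nn.functional as F


class ContrastiveKGE(nn.Module):
    """KGE augmented with a contrastive objective."""

    def __init__(self, num_entities, num_relations, dim, lambda_c=0.1, lambda_r=1e-3):
        super().__init__()
        self.dim = dim
        self.lambda_c = lambda_c  # contrastive weight (default value is an assumption)
        self.lambda_r = lambda_r  # regularization weight (default value is an assumption)

        # Base embeddings
        self.entity_emb = nn.Embedding(num_entities, dim)
        self.relation_emb = nn.Embedding(num_relations, dim)

        # Projection heads for contrastive learning
        self.project_entity = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim)
        )

        self.project_relation = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim)
        )

        # Dropout produces the two views that are contrasted (active in training mode)
        self.dropout = nn.Dropout(0.1)

        # Learnable temperature
        self.temp = nn.Parameter(torch.tensor(0.07))

    def score(self, h, r, t):
        """TransE-style triple score, used here because the scorer is left unspecified."""
        return -torch.norm(
            self.entity_emb(h) + self.relation_emb(r) - self.entity_emb(t),
            p=2, dim=-1
        )

    def regularization(self, h, r, t):
        """Mean squared L2 norm of the embeddings used in the batch."""
        embs = (self.entity_emb(h), self.relation_emb(r), self.entity_emb(t))
        return sum(e.pow(2).sum(dim=-1).mean() for e in embs)

    def contrastive_loss(self, embeddings):
        """
        InfoNCE between two dropout views of the same embeddings.

        Comparing a single view with itself would put each sample's own
        similarity on the diagonal and make the task trivial.
        """
        # Normalized projections of the two views
        z1 = F.normalize(self.project_entity(self.dropout(embeddings)), p=2, dim=-1)
        z2 = F.normalize(self.project_entity(self.dropout(embeddings)), p=2, dim=-1)

        # Similarity matrix; row i should match column i
        sim = torch.matmul(z1, z2.T) / self.temp
        targets = torch.arange(sim.shape[0], device=sim.device)

        # InfoNCE loss
        return F.cross_entropy(sim, targets)

    def forward(self, triples, negatives=None):
        """
        Combined loss. triples and negatives are (h, r, t) tuples of LongTensors.
        """
        h, r, t = triples

        # 1. Link prediction loss
        pos_score = self.score(h, r, t)

        if negatives is not None:
            neg_h, neg_r, neg_t = negatives
            neg_score = self.score(neg_h, neg_r, neg_t)
            link_loss = F.margin_ranking_loss(
                pos_score,
                neg_score,
                torch.ones_like(pos_score),
                margin=1.0
            )
        else:
            link_loss = pos_score.new_zeros(())

        # 2. Contrastive loss over the unique entities in the batch
        #    (duplicates would otherwise act as false negatives)
        all_entities = torch.unique(torch.cat([h, t], dim=0))
        contrastive_loss = self.contrastive_loss(self.entity_emb(all_entities))

        # 3. Regularization
        reg_loss = self.regularization(h, r, t)

        return link_loss + self.lambda_c * contrastive_loss + self.lambda_r * reg_loss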

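7. Summary and Outlook

7.1 Method Comparison

Method                 Key innovation                Best suited for                  Strength
MAYPL                  Pyramidal attention           Hyper-relational KGs             Strong expressiveness
CKRHE                  Hierarchical embedding        Complex hierarchical structure   Type-aware
Sharding               Distributed embedding         Large-scale KGs                  Scalable
Hyperbolic embedding   Negative-curvature space      Hierarchical structure           Dimension-efficient
Contrastive learning   Self-supervised enhancement   Any KG                           Better embedding quality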

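7.2 Future Trends

  1. Pretrained KGE: pretrain knowledge graph embeddings the way NLP pretrains language models
  2. Multimodal KGE: integrate text, images, and other modalities
  3. Dynamic KGE: handle temporal knowledge graphs
  4. Neuro-symbolic KGE: combine embeddings with symbolic reasoning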

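7.3 Selection Guide

Scenario                            Recommended method
Hyper-relational KG                 MAYPL
Pronounced hierarchical structure   Hyperbolic embedding
Very large scale                    Sharding + negative sampling
Resource-constrained                CKRHE + contrastive learning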

Footnotes

  1. [ICML 2025] MAYPL: Multi-Aspect Pyramidal Learning for Hyper-Relational Knowledge Graphs

  2. [KAIS 2025] CKRHE: Complex Knowledge Graph Hierarchical Representation Embedding