besskge.sharding.PartitionedTripleSet
- class besskge.sharding.PartitionedTripleSet(sharding, inverse_triples, partition_mode, dummy, triples, triple_counts, triple_offsets, triple_sort_idx, types, neg_heads, neg_tails)
A partitioned collection of triples.
If partition_mode = 'h_shard', each triple is assigned to one of n_shard partitions based on the shard where the head entity is stored. Similarly, if partition_mode = 't_shard', each triple is assigned to one of n_shard partitions based on the shard where the tail entity is stored.
If partition_mode = 'ht_shardpair', each triple is assigned to one of n_shard^2 partitions based on the shard-pair (shard_h, shard_t). Shard-pairs are ordered as: (0,0), (0,1), …, (0, n_shard-1), (1,0), …, (n_shard-1, n_shard-1).
- Parameters:
- classmethod create_from_dataset(dataset, part, sharding, partition_mode='ht_shardpair', add_inverse_triples=False)
Create a partitioned triple set from a KGDataset part.
- Parameters:
- Return type:
- Returns:
Partitioned set of triples.
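The three partition modes can be illustrated with a small NumPy sketch. This is not the library's implementation: `partition_ids` and the `entity_to_shard` lookup array are hypothetical names introduced here, and the shard-pair index assumes the row-major ordering (0,0), (0,1), … stated in the class description.

```python
import numpy as np


def partition_ids(triples, entity_to_shard, n_shard, partition_mode):
    """Return the partition index of each (h, r, t) triple.

    Hypothetical helper for illustration; `entity_to_shard[e]` is assumed
    to give the shard on which entity `e` is stored.
    """
    shard_h = entity_to_shard[triples[:, 0]]
    shard_t = entity_to_shard[triples[:, 2]]
    if partition_mode == "h_shard":
        return shard_h
    if partition_mode == "t_shard":
        return shard_t
    if partition_mode == "ht_shardpair":
        # Row-major order over shard-pairs: (0,0), (0,1), ..., (n_shard-1, n_shard-1)
        return shard_h * n_shard + shard_t
    raise ValueError(f"Unknown partition_mode: {partition_mode}")


# Toy data: 5 entities spread over 2 shards, 4 triples (h, r, t).
entity_to_shard = np.array([0, 1, 0, 1, 1])
triples = np.array([[0, 0, 1], [3, 1, 2], [4, 0, 4], [2, 1, 0]])

pair_ids = partition_ids(triples, entity_to_shard, n_shard=2, partition_mode="ht_shardpair")
head_ids = partition_ids(triples, entity_to_shard, n_shard=2, partition_mode="h_shard")
tail_ids = partition_ids(triples, entity_to_shard, n_shard=2, partition_mode="t_shard")
```

With 'ht_shardpair' the partition index ranges over n_shard^2 values, while 'h_shard' and 't_shard' produce n_shard partitions each.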
- classmethod create_from_queries(dataset, sharding, queries, query_mode, ground_truth=None, negative=None, negative_type=None)
Create a partitioned triple set from a set of (h,r,?) or (?,r,t) queries. Pairs are completed to triples by adding dummy entities.
- Parameters:
dataset (KGDataset) – Knowledge graph dataset.
sharding (Sharding) – The entity sharding to use.
queries (ndarray[Any, dtype[int32]]) – shape: (n_query, 2). The set of (h, r) or (r, t) queries. Global IDs for entities/relations.
query_mode (str) – "hr" for (h,r,?) queries, "rt" for (?,r,t) queries.
ground_truth (Optional[ndarray[Any, dtype[int32]]]) – shape: (n_query,). If known, the global ID of the ground truth tail/head.
negative (Optional[ndarray[Any, dtype[int32]]]) – shape: (N, n_negative). Global IDs of negative entities to score against each query. This can be query-specific (N=n_query) or the same for all queries (N=1). Default: None (namely score each query against all entities in the graph).
negative_type (Optional[str]) – Score each query only against entities of a specific type. Default: None (namely score each query against entities of any type).
- Return type:
- Returns:
Partitioned set of queries (with dummy h/t completion).
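As a rough illustration of the array shapes expected by create_from_queries, the sketch below builds the inputs for three (h,r,?) queries and checks them against the shapes listed in the parameter descriptions. The helper `check_query_inputs` and all ID values are made up for this example and are not part of besskge; the final call is left commented out because it needs a real KGDataset and Sharding.

```python
import numpy as np


def check_query_inputs(queries, ground_truth=None, negative=None):
    """Hypothetical shape check mirroring the documented parameter shapes."""
    n_query = queries.shape[0]
    assert queries.ndim == 2 and queries.shape[1] == 2, "queries: (n_query, 2)"
    if ground_truth is not None:
        assert ground_truth.shape == (n_query,), "ground_truth: (n_query,)"
    if negative is not None:
        # Either one shared row of negatives or one row per query.
        assert negative.ndim == 2 and negative.shape[0] in (1, n_query)
    return n_query


# (h, r) pairs for query_mode="hr"; global entity/relation IDs (made up).
queries = np.array([[0, 2], [5, 1], [7, 2]], dtype=np.int32)
# Known tails for each query, if available.
ground_truth = np.array([3, 8, 1], dtype=np.int32)
# Same negatives for every query: shape (1, n_negative).
shared_negative = np.array([[10, 11, 12, 13]], dtype=np.int32)
# Query-specific negatives: shape (n_query, n_negative).
per_query_negative = np.repeat(shared_negative, queries.shape[0], axis=0)
per_query_negative[1] = [20, 21, 22, 23]

n_query = check_query_inputs(queries, ground_truth, shared_negative)
check_query_inputs(queries, ground_truth, per_query_negative)

# With a real dataset and sharding, the call would then look like:
# triple_set = PartitionedTripleSet.create_from_queries(
#     dataset, sharding, queries, query_mode="hr",
#     ground_truth=ground_truth, negative=shared_negative,
# )
```

Passing negative=None instead scores each query against all entities in the graph, as described above.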
- dummy: Optional[str]
If the set is constructed from (h,r,?) (resp. (?,r,t)) queries, dummy tails (resp. heads) are added to make pairs into triples. Possible values: "head", "tail", "none".
- inverse_triples: bool
Whether the collection contains inverse triples (t,r_inv,h) for each regular triple (h,r,t).
- neg_heads: Optional[ndarray[Any, dtype[int32]]]
Global IDs of (possibly triple-specific) negative heads; int32[n_triple or 1, n_neg_heads]
- neg_tails: Optional[ndarray[Any, dtype[int32]]]
Global IDs of (possibly triple-specific) negative tails; int32[n_triple or 1, n_neg_tails]
- triple_counts: ndarray[Any, dtype[int64]]
Number of triples in each partition; int64[n_shard] or int64[n_shard, n_shard]
- triple_offsets: ndarray[Any, dtype[int64]]
Delimiting indices of ordered partitions; int64[n_shard] or int64[n_shard, n_shard]
- triple_sort_idx: ndarray[Any, dtype[int64]]
Sorting indices to order triples by partition; int64[n_triple]
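To show how triple_counts, triple_offsets and triple_sort_idx fit together, here is a minimal NumPy sketch for the 'ht_shardpair' case. It assumes that each offset is the start position of its partition in the sorted order; the `part_id` values are made up, and the library's own construction may differ in detail.

```python
import numpy as np

n_shard = 2
# Hypothetical partition index of each triple, in [0, n_shard**2).
part_id = np.array([1, 2, 3, 0, 1, 1])

# Indices that reorder the triples so that partitions are contiguous.
triple_sort_idx = np.argsort(part_id, kind="stable")

# Number of triples per partition, and where each partition starts
# once the triples are sorted (assumed meaning of the offsets).
counts = np.bincount(part_id, minlength=n_shard**2)
offsets = np.concatenate([[0], np.cumsum(counts)[:-1]])

# For 'ht_shardpair' both arrays are viewed as [n_shard, n_shard].
triple_counts = counts.reshape(n_shard, n_shard)
triple_offsets = offsets.reshape(n_shard, n_shard)

# Original positions of the triples belonging to shard-pair (0, 1).
start = triple_offsets[0, 1]
stop = start + triple_counts[0, 1]
pair_01 = triple_sort_idx[start:stop]
```

For 'h_shard' or 't_shard' the same computation applies with `minlength=n_shard` and without the reshape.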