besskge.sharding.PartitionedTripleSet
- class besskge.sharding.PartitionedTripleSet(sharding, inverse_triples, partition_mode, dummy, triples, triple_counts, triple_offsets, triple_sort_idx, types, neg_heads, neg_tails)
A partitioned collection of triples. If partition_mode = 'h_shard', each triple is assigned to one of n_shard partitions based on the shard where the head entity is stored. Similarly, if partition_mode = 't_shard', each triple is assigned to one of n_shard partitions based on the shard where the tail entity is stored.
If partition_mode = 'ht_shardpair', each triple is assigned to one of n_shard^2 partitions based on the shard-pair (shard_h, shard_t). Shard-pairs are ordered as: (0,0), (0,1), …, (0, n_shard-1), (1,0), …, (n_shard-1, n_shard-1).
- Parameters:
- classmethod create_from_dataset(dataset, part, sharding, partition_mode='ht_shardpair', add_inverse_triples=False)
Create a partitioned triple set from a KGDataset part.
- Parameters:
- Return type:
- Returns:
Partitioned set of triples.
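The effect of partition_mode can be illustrated with a minimal NumPy sketch of the assignment rule described in the class docstring. It does not call besskge: the helper `partition_ids`, the `entity_to_shard` map and the toy triples are hypothetical names and data introduced only for illustration, and the shard-pair index `shard_h * n_shard + shard_t` is simply the row-major reading of the ordering (0,0), (0,1), …, (n_shard-1, n_shard-1).

```python
import numpy as np


def partition_ids(shard_h, shard_t, n_shard, partition_mode):
    """Hypothetical helper: map per-triple head/tail shards to a partition index."""
    if partition_mode == "h_shard":
        return shard_h
    if partition_mode == "t_shard":
        return shard_t
    if partition_mode == "ht_shardpair":
        # Row-major order of shard-pairs: (0,0), (0,1), ..., (0,n_shard-1), (1,0), ...
        return shard_h * n_shard + shard_t
    raise ValueError(f"Unknown partition_mode: {partition_mode}")


n_shard = 2
# Illustrative assumption: entity i is stored on shard entity_to_shard[i]
entity_to_shard = np.array([0, 1, 1, 0, 1])
# Illustrative (h, r, t) triples with global entity/relation IDs
triples = np.array([[0, 0, 1], [2, 1, 3], [4, 0, 2], [3, 1, 0]])

shard_h = entity_to_shard[triples[:, 0]]  # shard of each head
shard_t = entity_to_shard[triples[:, 2]]  # shard of each tail

print(partition_ids(shard_h, shard_t, n_shard, "h_shard"))       # [0 1 1 0]
print(partition_ids(shard_h, shard_t, n_shard, "t_shard"))       # [1 0 1 0]
print(partition_ids(shard_h, shard_t, n_shard, "ht_shardpair"))  # [1 2 3 0]
```

With 'ht_shardpair' the partition index ranges over n_shard^2 values, whereas the other two modes yield n_shard partitions.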
- classmethod create_from_queries(dataset, sharding, queries, query_mode, ground_truth=None, negative=None, negative_type=None)
Create a partitioned triple set from a set of (h,r,?) or (?,r,t) queries. Pairs are completed to triples by adding dummy entities.
- Parameters:
dataset (KGDataset) – Knowledge graph dataset.
sharding (Sharding) – The entity sharding to use.
queries (ndarray[Any, dtype[int32]]) – shape: (n_query, 2) The set of (h, r) or (r, t) queries. Global IDs for entities/relations.
query_mode (str) – “hr” for (h,r,?) queries, “rt” for (?,r,t) queries.
ground_truth (Optional[ndarray[Any, dtype[int32]]]) – shape: (n_query,) If known, the global ID of the ground truth tail/head.
negative (Optional[ndarray[Any, dtype[int32]]]) – shape: (N, n_negative) Global IDs of negative entities to score against each query. This can be query-specific (N=n_query) or the same for all queries (N=1). Default: None (namely score each query against all entities in the graph).
negative_type (Optional[str]) – Score each query only against entities of a specific type. Default: None (namely score each query against entities of any type).
- Return type:
- Returns:
Partitioned set of queries (with dummy h/t completion).
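The completion of query pairs into triples, and the two accepted shapes of negative, can be sketched in plain NumPy. This is an illustration of the idea only, not the library's implementation: `complete_queries` and the placeholder `dummy_id=0` are assumptions made for this example, and the actual dummy entity IDs chosen by besskge may differ.

```python
import numpy as np


def complete_queries(queries, query_mode, dummy_id=0):
    """Hypothetical helper: pad (h, r) or (r, t) pairs to (h, r, t) triples.

    Returns the padded triples and which side holds the dummy entity.
    dummy_id is a placeholder assumption, not the value used by besskge.
    """
    n_query = queries.shape[0]
    dummy_col = np.full((n_query, 1), dummy_id, dtype=queries.dtype)
    if query_mode == "hr":
        # (h, r, ?) queries: append a dummy tail
        return np.concatenate([queries, dummy_col], axis=1), "tail"
    if query_mode == "rt":
        # (?, r, t) queries: prepend a dummy head
        return np.concatenate([dummy_col, queries], axis=1), "head"
    raise ValueError(f"Unknown query_mode: {query_mode}")


# Two (h, r) queries with global IDs
queries = np.array([[3, 0], [7, 1]], dtype=np.int32)
triples, dummy = complete_queries(queries, "hr")
print(triples)  # [[3 0 0], [7 1 0]] with the placeholder dummy tail
print(dummy)    # tail

# Negatives shared by all queries have shape (1, n_negative);
# broadcasting gives the equivalent query-specific view (n_query, n_negative).
negative = np.array([[1, 2, 5]], dtype=np.int32)
per_query = np.broadcast_to(negative, (queries.shape[0], negative.shape[1]))
print(per_query.shape)  # (2, 3)
```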
- dummy: Optional[str]
If the set is constructed from (h,r,?) (resp. (?,r,t)) queries, dummy tails (resp. heads) are added to make pairs into triples. One of “head”, “tail”, “none”.
- inverse_triples: bool
Whether the collection contains inverse triples (t,r_inv,h) for each regular triple (h,r,t).
- neg_heads: Optional[ndarray[Any, dtype[int32]]]
Global IDs of (possibly triple-specific) negative heads; int32[n_triple or 1, n_neg_heads]
- neg_tails: Optional[ndarray[Any, dtype[int32]]]
Global IDs of (possibly triple-specific) negative tails; int32[n_triple or 1, n_neg_tails]
- triple_counts: ndarray[Any, dtype[int64]]
Number of triples in each partition; int64[n_shard] or int64[n_shard, n_shard]
- triple_offsets: ndarray[Any, dtype[int64]]
Delimiting indices of ordered partitions; int64[n_shard] or int64[n_shard, n_shard]
- triple_sort_idx: ndarray[Any, dtype[int64]]
Sorting indices to order triples by partition; int64[n_triple]
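How triple_sort_idx, triple_counts and triple_offsets relate to each other can be illustrated with a small NumPy sketch. It rests on two assumptions that the text above does not state: that each offset is the starting index of its partition in the sorted order, and that the sort is stable. The array `part_id` is made-up data for an 'h_shard' or 't_shard' partitioning.

```python
import numpy as np

n_shard = 2
# Illustrative partition index of 5 triples (e.g. shard of each head)
part_id = np.array([1, 0, 1, 0, 0])

# Indices that reorder the triples so that partitions are contiguous
triple_sort_idx = np.argsort(part_id, kind="stable")

# Number of triples falling in each partition
triple_counts = np.bincount(part_id, minlength=n_shard)

# Assumed convention: offset p is where partition p starts in the sorted order
triple_offsets = np.concatenate([[0], np.cumsum(triple_counts)[:-1]])

# Recover the original indices of the triples belonging to partition p
p = 1
start = triple_offsets[p]
idx = triple_sort_idx[start : start + triple_counts[p]]

print(triple_sort_idx)  # [1 3 4 0 2]
print(triple_counts)    # [3 2]
print(triple_offsets)   # [0 3]
print(idx)              # [0 2]
```

For 'ht_shardpair', the same computation over n_shard^2 partition indices followed by a reshape to (n_shard, n_shard) would give arrays of the second documented shape.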