besskge.bess.TopKQueryBessKGE

class besskge.bess.TopKQueryBessKGE(k, candidate_sampler, score_fn, evaluation=None, return_scores=False, window_size=100)[source]

Distributed scoring of (h, r, ?) or (?, r, t) queries (against all entities in the knowledge graph, or a query-specific candidate set), returning the top-k most likely completions, based on the BESS [CJM+22] inference scheme. To be used in combination with a batch sampler based on a "h_shard"/"t_shard"-partitioned triple set. If the correct tail/head is known, it can be passed as an input in order to compute metrics on the final predictions.

This class is recommended over BessKGE when the number of negatives is large, for example when scoring queries against all entities in the knowledge graph: instead of scoring all candidates at once, it slides a fixed-size window over the candidate set via an on-device for-loop, which bounds peak memory use.
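The sliding-window idea can be sketched in plain Python. This is an illustrative sketch only, not part of the besskge API: `topk_sliding_window` and `score_query` are hypothetical names, and on device the per-window scoring is batched rather than per-entity.

```python
import heapq

def topk_sliding_window(score_query, n_entities, k, window_size=100):
    """Score a query against all entities in chunks of `window_size`,
    keeping only the running top-k (score, entity) pairs between steps,
    so at most `window_size` fresh scores are materialized at a time."""
    best = []  # min-heap of (score, entity_id), size <= k
    for start in range(0, n_entities, window_size):
        for ent in range(start, min(start + window_size, n_entities)):
            s = score_query(ent)
            if len(best) < k:
                heapq.heappush(best, (s, ent))
            elif s > best[0][0]:
                # New candidate beats the current k-th best: swap it in.
                heapq.heapreplace(best, (s, ent))
    # Highest score first.
    return sorted(best, reverse=True)
```

A smaller `window_size` lowers the memory high-water mark at the cost of more loop iterations, which is why the class exposes it as a tunable parameter.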

Only to be used for inference.

Initialize TopK BESS-KGE module.

Parameters:
  • k (int) – For each query return the top-k most likely predictions.

  • candidate_sampler (Union[TripleBasedShardedNegativeSampler, PlaceholderNegativeSampler]) – Sampler of candidate entities to score against queries. Use besskge.negative_sampler.PlaceholderNegativeSampler to score queries against all entities in the knowledge graph, avoiding unnecessary loading of negative entities on device.

  • score_fn (BaseScoreFunction) – Scoring function.

  • evaluation (Optional[Evaluation]) – Evaluation module, for computing metrics on device. Default: None.

  • return_scores (bool) – If True, return scores of the top-k best completions. Default: False.

  • window_size (int) – Size of the sliding window, i.e. the number of negative entities scored against each query at each step of the on-device for-loop. Decrease it for large batch sizes to avoid OOM errors. Default: 100.

forward(relation, head=None, tail=None, negative=None, triple_mask=None, negative_mask=None)[source]

Forward step.

As in ScoreMovingBessKGE, candidates are scored on the device where they are gathered; scores for the same query against candidates residing in different shards are then collected together via an AllToAll collective. At each iteration of the for-loop, only the top-k best query responses and their scores are kept to be used in the next iteration; the rest are discarded.
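The cross-shard collection step amounts to merging per-shard top-k lists for a query into one global top-k. A minimal sketch of that merge, using a hypothetical `merge_shard_topk` helper (not a besskge function):

```python
def merge_shard_topk(shard_results, k):
    """Combine per-shard top-k (score, entity) lists for one query into a
    single global top-k, as happens after the AllToAll exchange.

    shard_results: list of n_shard lists of (score, entity_id) pairs.
    """
    merged = [pair for shard in shard_results for pair in shard]
    merged.sort(key=lambda pair: pair[0], reverse=True)
    return merged[:k]
```

Because each shard already contributes at most k candidates, the merge touches only n_shard * k pairs per query, independently of the total number of entities scored.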

Parameters:
  • relation (Tensor) – shape: (1, shard_bs,) Relation indices.

  • head (Optional[Tensor]) – shape: (1, shard_bs,) Head indices, if known. Default: None.

  • tail (Optional[Tensor]) – shape: (1, shard_bs,) Tail indices, if known. Default: None.

  • negative (Optional[Tensor]) – shape: (1, n_shard, B, padded_negative) Candidates to score against the queries. It can be the same set for all queries (B=1), or specific for each query in the batch (B=shard_bs). If None, score each query against all entities in the knowledge graph. Default: None.

  • triple_mask (Optional[Tensor]) – shape: (1, shard_bs,) Mask to filter the triples in the micro-batch before computing metrics. Default: None.

  • negative_mask (Optional[Tensor]) – shape: (1, n_shard, B, padded_negative) If candidates are provided, mask to discard padding negatives when computing best completions. Requires the use of mask_on_gather=True in the candidate sampler (see besskge.negative_sampler.TripleBasedShardedNegativeSampler). Default: None.

Return type:

Dict[str, Any]
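The effect of negative_mask on the top-k selection can be sketched in plain Python: padding candidates get a score of -inf, so they can never displace a real completion. `apply_negative_mask` is an illustrative helper, not part of the besskge API:

```python
NEG_INF = float("-inf")

def apply_negative_mask(scores, mask):
    """Replace the scores of padding negatives with -inf so they are
    never selected among the top-k completions.

    scores: per-candidate scores for one query.
    mask:   True for genuine candidates, False for padding.
    """
    return [s if keep else NEG_INF for s, keep in zip(scores, mask)]

def topk_indices(scores, k):
    """Indices of the k highest-scoring candidates, best first."""
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
```

With the mask applied before selection, a padded slot is only ever returned if fewer than k genuine candidates exist for the query.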