besskge.metric.Evaluation

class besskge.metric.Evaluation(metric_list, mode='average', worst_rank_infty=False, reduction='none', return_ranks=False)

A module for computing link prediction metrics.

Initialize evaluation module.

Parameters:
  • metric_list (List[str]) – List of metrics to compute. Currently supports “mrr” and “hits@K”.

  • mode (str) – Mode used for metrics. Can be “optimistic”, “pessimistic” or “average”. Default: “average”.

  • worst_rank_infty (bool) – If True, assign a prediction rank of infinity as the worst possible rank. If False, assign a prediction rank of n_negative + 1 as the worst possible rank. Default: False.

  • reduction (str) – Method used to reduce metrics along the batch dimension. Currently supports “none” (no reduction) and “sum”. Default: “none”.

  • return_ranks (bool) – If True, return prediction ranks alongside metrics. Default: False.

dict_metrics_from_ranks(batch_rank, triple_mask=None)

Compute the required metrics starting from the prediction ranks of the elements in the batch.

Parameters:
  • batch_rank (Tensor) – shape: (batch_size,) Prediction rank for each element in the batch.

  • triple_mask (Optional[Tensor]) – shape: (batch_size,) Boolean mask. If provided, all metrics for the elements where ~triple_mask are set to 0.0.

Return type:

Dict[str, Tensor]

Returns:

The dictionary of (reduced) batch metrics.
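
The following is a minimal sketch of the computation described above, written with plain Python lists instead of tensors. It is not the besskge implementation: the helper name `metrics_from_ranks` is hypothetical, and it assumes the usual definitions, where “mrr” is the reciprocal of the rank and “hits@K” is 1.0 when the rank is at most K.

```python
import math
from typing import Dict, List, Optional, Union


def metrics_from_ranks(
    ranks: List[float],
    metric_list: List[str],
    mask: Optional[List[bool]] = None,
    reduction: str = "none",
) -> Dict[str, Union[float, List[float]]]:
    """Hypothetical helper: per-element metrics from prediction ranks."""
    if mask is None:
        mask = [True] * len(ranks)
    out: Dict[str, Union[float, List[float]]] = {}
    for name in metric_list:
        if name == "mrr":
            # Reciprocal rank; an infinite rank contributes 0.0.
            values = [1.0 / r for r in ranks]
        elif name.startswith("hits@"):
            k = int(name.split("@")[1])
            values = [1.0 if r <= k else 0.0 for r in ranks]
        else:
            raise ValueError(f"Unsupported metric: {name}")
        # Elements excluded by the mask are set to 0.0.
        values = [v if keep else 0.0 for v, keep in zip(values, mask)]
        out[name] = sum(values) if reduction == "sum" else values
    return out


ranks = [1.0, 2.0, 10.0, math.inf]
per_item = metrics_from_ranks(ranks, ["mrr", "hits@3"])
summed = metrics_from_ranks(
    ranks, ["mrr", "hits@3"], mask=[True, False, True, True], reduction="sum"
)
```

With reduction “sum” the caller still has to divide by the number of unmasked elements to obtain a batch mean.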

ranks_from_indices(ground_truth, candidate_indices)

Compute the prediction rank from the ground truth ID and ORDERED candidate IDs.

Parameters:
  • ground_truth (Tensor) – shape: (batch_size,) Indices of ground truth entities for each query.

  • candidate_indices (Tensor) – shape: (batch_size, n_candidates) Indices of top n_candidates predicted entities, ordered by decreasing likelihood. The indices on each row are assumed to be distinct.

Return type:

Tensor

Returns:

The rank of the ground truth among the predictions.
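
As an illustration, the rank for a single query can be sketched as the 1-based position of the ground truth in the ordered candidate list. The helper `rank_from_indices` is hypothetical and works on plain Python lists. It assumes that a ground truth missing from the candidates receives the worst possible rank as described for worst_rank_infty, with the number of candidates playing the role of n_negative.

```python
import math
from typing import List


def rank_from_indices(
    ground_truth: int,
    candidates: List[int],
    worst_rank_infty: bool = False,
) -> float:
    """Hypothetical helper: rank of one ground truth among ordered candidates."""
    if ground_truth in candidates:
        # Candidates are ordered by decreasing likelihood and are distinct,
        # so the rank is the 1-based position of the match.
        return float(candidates.index(ground_truth) + 1)
    # Assumption: a miss is assigned the worst possible rank.
    return math.inf if worst_rank_infty else float(len(candidates) + 1)


top_candidates = [42, 7, 19]
hit_rank = rank_from_indices(7, top_candidates)
miss_rank = rank_from_indices(3, top_candidates)
miss_rank_infty = rank_from_indices(3, top_candidates, worst_rank_infty=True)
```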

ranks_from_scores(pos_score, candidate_score)

Compute the prediction rank from the score of the positive triple (ground truth) and the scores of triples corrupted with the candidate entities.

Parameters:
  • pos_score (Tensor) – shape: (batch_size,) Scores of positive triples.

  • candidate_score (Tensor) – shape: (batch_size, n_candidate) Scores of candidate triples.

Return type:

Tensor

Returns:

The rank of the positive score among the ordered scores of the candidate triples.
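
A rough sketch of how the three modes could differ when candidate scores tie with the positive score. This is an assumption based on the common convention in link prediction evaluation, not a copy of the besskge code: higher scores are taken to be better, “optimistic” places the positive ahead of all ties, “pessimistic” places it behind them, and “average” takes the midpoint. The helper `rank_from_scores` is hypothetical.

```python
from typing import List


def rank_from_scores(
    pos_score: float,
    candidate_scores: List[float],
    mode: str = "average",
) -> float:
    """Hypothetical helper: rank of one positive score among candidate scores."""
    # Assumption: a higher score means a more plausible triple.
    n_better = sum(s > pos_score for s in candidate_scores)
    n_equal = sum(s == pos_score for s in candidate_scores)
    if mode == "optimistic":
        return 1.0 + n_better
    if mode == "pessimistic":
        return 1.0 + n_better + n_equal
    if mode == "average":
        return 1.0 + n_better + n_equal / 2
    raise ValueError(f"Unknown mode: {mode}")


candidates = [0.9, 0.7, 0.7, 0.2]  # two candidates tie with the positive
optimistic = rank_from_scores(0.7, candidates, mode="optimistic")
pessimistic = rank_from_scores(0.7, candidates, mode="pessimistic")
average = rank_from_scores(0.7, candidates, mode="average")
```

Without ties, all three modes give the same rank, so the choice only matters when scores coincide exactly.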

stacked_metrics_from_ranks(batch_rank, triple_mask=None)

Like Evaluation.dict_metrics_from_ranks(), but the outputs for different metrics are returned stacked in a single tensor, according to the ordering of Evaluation.metrics.

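Parameters:
  • batch_rank (Tensor) – see Evaluation.dict_metrics_from_ranks().

  • triple_mask (Optional[Tensor]) – see Evaluation.dict_metrics_from_ranks().
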
Return type:

Tensor

Returns:

shape: (1, n_metrics, batch_size) if reduction = “none”, else (1, n_metrics). The stacked (reduced) metrics for the batch.
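
To make the output layout concrete, here is a small sketch using nested Python lists in place of a tensor. The helper `stack_metrics` and its `metric_order` argument are hypothetical. `metric_order` stands in for the ordering of Evaluation.metrics.

```python
from typing import Dict, List, Union

Metric = Union[float, List[float]]


def stack_metrics(metric_dict: Dict[str, Metric], metric_order: List[str]) -> List[List[Metric]]:
    """Hypothetical helper: stack metrics row by row, with a leading axis of size 1."""
    return [[metric_dict[name] for name in metric_order]]


order = ["mrr", "hits@3"]

# reduction = "none": each metric is a list of length batch_size,
# so the result has shape (1, n_metrics, batch_size).
unreduced = stack_metrics({"mrr": [1.0, 0.5], "hits@3": [1.0, 1.0]}, order)

# reduction = "sum": each metric is a scalar, so the shape is (1, n_metrics).
reduced = stack_metrics({"mrr": 1.5, "hits@3": 2.0}, order)
```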