besskge.negative_sampler.TripleBasedShardedNegativeSampler

class besskge.negative_sampler.TripleBasedShardedNegativeSampler(negative_heads, negative_tails, sharding, corruption_scheme, seed, mask_on_gather=False, return_sort_idx=False)

Return (possibly triple-specific) predetermined negative entities.

Initialize triple-based negative sampler.

Parameters:
  • negative_heads (Optional[ndarray[Any, dtype[int32]]]) – shape: (N, n_negative). Global entity IDs of negative heads, either specific to each triple (N = n_triple) or shared by all triples (N = 1).

  • negative_tails (Optional[ndarray[Any, dtype[int32]]]) – shape: (N, n_negative). Global entity IDs of negative tails, either specific to each triple (N = n_triple) or shared by all triples (N = 1).

  • sharding (Sharding) – see RandomShardedNegativeSampler.__init__()

  • corruption_scheme (str) – see RandomShardedNegativeSampler.__init__()

  • seed (int) – see RandomShardedNegativeSampler.__init__()

  • mask_on_gather (bool) – If True, shape the negative mask to be applied on the device where negative entities are gathered, instead of the one where they are scored. Set to True only when using besskge.bess.TopKQueryBessKGE. Default: False.

  • return_sort_idx (bool) – If True, also return, for each triple in the batch, the sorting indices needed to recover the ordering of negatives used in negative_heads and negative_tails. Default: False.
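
The two accepted layouts for negative_heads and negative_tails can be illustrated with plain NumPy arrays. This is a minimal sketch: the sizes are arbitrary, and check_negatives is a hypothetical helper written for this example, not part of besskge.

```python
import numpy as np

n_entity, n_triple, n_negative = 1000, 8, 4  # illustrative sizes (assumption)
rng = np.random.default_rng(0)

# Triple-specific negatives: one row of entity IDs per triple (N = n_triple).
negative_heads = rng.integers(n_entity, size=(n_triple, n_negative), dtype=np.int32)

# Shared negatives: a single row reused for every triple (N = 1).
negative_tails = np.arange(n_negative, dtype=np.int32)[None, :]


def check_negatives(negatives, n_triple):
    """Hypothetical helper: check the documented shape (N, n_negative) and dtype."""
    return (
        negatives.ndim == 2
        and negatives.shape[0] in (1, n_triple)
        and negatives.dtype == np.int32
    )
```

Arrays built this way have the shape and dtype the constructor documents; the sharding, corruption_scheme and seed arguments still need to be supplied as described for RandomShardedNegativeSampler.__init__().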

corruption_scheme: str

Which entity to corrupt: “h”, “t” or “ht”.

flat_negative_format: bool

Sample negatives per triple partition, instead of per triple

local_sampling: bool

Sample negatives only from processing device

pad_negatives(negatives, shard_counts, padded_shard_length)

Divide negatives based on shard and pad lists to same length.

Parameters:
  • negatives (ndarray[Any, dtype[int32]]) – shape: (N, n_negative). Negative entities, each row already sorted in shard order (N = 1 or N = n_triple).

  • shard_counts (ndarray[Any, dtype[int64]]) – shape: (N, n_shard) Number of negatives per shard.

  • padded_shard_length (int) – The size to which each shard list is to be padded.

Return padded_negatives:

shape: (N, n_shard, padded_shard_length) The padded shard lists of negatives.

Return mask:

shape: (N, n_negative) Indices of true negatives in padded_negatives.view(N,-1).

Return type:

Tuple[ndarray[Any, dtype[int32]], ndarray[Any, dtype[bool_]]]
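
The padding step can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the library's implementation: pad_negatives_sketch and its pad_value argument are hypothetical, and the mask is returned here as a boolean array over the padded slots, which may differ from the layout the method actually returns.

```python
import numpy as np


def pad_negatives_sketch(negatives, shard_counts, padded_shard_length, pad_value=0):
    """Hypothetical re-implementation of the padding idea, for illustration only."""
    n_rows, n_shard = shard_counts.shape
    # True for slots that hold a real negative, False for padding slots.
    slot_mask = np.arange(padded_shard_length)[None, None, :] < shard_counts[..., None]
    padded = np.full((n_rows, n_shard, padded_shard_length), pad_value, dtype=np.int32)
    # Boolean assignment fills slots in row-major order, which matches
    # rows of negatives that are already sorted in shard order.
    padded[slot_mask] = negatives.reshape(-1)
    return padded, slot_mask


# One row of four negatives, already in shard order: 1 in shard 0, 2 in shard 1, 1 in shard 2.
negatives = np.array([[0, 1, 3, 4]], dtype=np.int32)
shard_counts = np.array([[1, 2, 1]], dtype=np.int64)
padded, slot_mask = pad_negatives_sketch(negatives, shard_counts, 2, pad_value=-1)

# Positions of the real negatives in the flattened (N, n_shard * padded_shard_length) view.
flat_idx = np.nonzero(slot_mask.reshape(1, -1))[1].reshape(1, -1)
```

In this example padded is [[[0, -1], [1, 3], [4, -1]]] and flat_idx is [[0, 2, 3, 4]], so indexing the flattened padded array with flat_idx recovers the original row of negatives.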

rng: Generator

Random number generator.

shard_negatives(negatives)

Split negative entities into corresponding shards.

Parameters:
  • negatives (ndarray[Any, dtype[int32]]) – shape: (N, n_negatives). Negative entities to shard (N = 1 or N = n_triple).

Return shard_neg_counts:

shape: (N, n_shard) Number of negative entities per shard.

Return sort_neg_idx:

shape: (N, n_negatives) Sorting index to cluster negatives in shard order.

Return type:

Tuple[ndarray[Any, dtype[int64]], ndarray[Any, dtype[int32]]]
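
The two return values can be illustrated with a short NumPy sketch. It assumes a lookup array entity_to_shard that maps each global entity ID to its shard; that array, the function name and the use of a stable sort are assumptions made for this example and are not taken from the library.

```python
import numpy as np


def shard_negatives_sketch(negatives, entity_to_shard, n_shard):
    """Hypothetical illustration: count negatives per shard and sort them by shard."""
    shard_idx = entity_to_shard[negatives]  # shape (N, n_negatives)
    # Number of negatives falling in each shard, computed row by row.
    shard_neg_counts = np.stack(
        [np.bincount(row, minlength=n_shard) for row in shard_idx]
    ).astype(np.int64)
    # Stable sort: negatives in the same shard keep their relative order.
    sort_neg_idx = np.argsort(shard_idx, axis=-1, kind="stable").astype(np.int32)
    return shard_neg_counts, sort_neg_idx


# Six entities spread over three shards (assumed mapping).
entity_to_shard = np.array([0, 1, 0, 1, 2, 2], dtype=np.int32)
negatives = np.array([[4, 1, 0, 3]], dtype=np.int32)

counts, order = shard_negatives_sketch(negatives, entity_to_shard, n_shard=3)
# Applying the sorting index clusters the negatives in shard order.
sorted_negatives = np.take_along_axis(negatives, order, axis=-1)
```

Here counts is [[1, 2, 1]], order is [[2, 1, 3, 0]] and sorted_negatives is [[0, 1, 3, 4]], that is, the shard-ordered rows and per-shard counts that pad_negatives expects as input.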