semipy.sampler.DistributedJointSampler
Warning
This section is under construction.
This class is the distributed version of semipy.sampler.JointSampler. It must be used in place of JointSampler when training on multiple GPUs. It is based on torch's DistributedSampler.
Parameters
- dataset - A map-style dataset containing labelled and unlabelled data. Unlabelled samples must carry the label -1.
- batch_size (int) - Size of batches for labelled data.
- proportion (float) - Proportion of labelled/unlabelled data used in each batch.
- num_replicas (int, optional) - Number of processes participating in distributed training. By default, world_size is retrieved from the current distributed group. Default: None
- rank (int, optional) - Rank of the current process within num_replicas. By default, rank is retrieved from the current distributed group. Default: None
- shuffle (bool) - If True, the sampler will shuffle the indices when it reaches the end of either the labelled set or the unlabelled set. Default: True
- seed (int) - Random seed used to shuffle the sampler if shuffle=True. This number should be identical across all processes in the distributed group. Default: 0
- drop_last (bool) - If True, the sampler will drop the tail of the data to make it evenly divisible across the number of replicas. If False, the sampler will add extra indices to make the data evenly divisible across the replicas. Default: False
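Example
A minimal usage sketch follows. The process-group initialisation, the toy TensorDataset, and the choice of passing the sampler to DataLoader through batch_sampler (rather than sampler) are assumptions made for illustration only; refer to the JointSampler documentation for how batches are actually consumed.

```python
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, TensorDataset

from semipy.sampler import DistributedJointSampler

# Normally done by torchrun / the launcher; "env://" reads MASTER_ADDR,
# MASTER_PORT, RANK and WORLD_SIZE from the environment.
dist.init_process_group(backend="nccl", init_method="env://")

# Toy map-style dataset: 80 labelled samples (classes 0/1) followed by
# 120 unlabelled samples, which must carry the label -1.
features = torch.randn(200, 16)
labels = torch.cat([torch.randint(0, 2, (80,)), torch.full((120,), -1)])
dataset = TensorDataset(features, labels)

# One sampler per process; num_replicas and rank are retrieved from the
# current distributed group when left as None.
sampler = DistributedJointSampler(
    dataset,
    batch_size=32,
    proportion=0.5,
    shuffle=True,
    seed=0,        # must be identical on every process
    drop_last=False,
)

# Assumption: like JointSampler, the sampler yields whole batches of
# indices, so it is passed as batch_sampler. If it yields single indices
# instead, pass it through the sampler argument.
loader = DataLoader(dataset, batch_sampler=sampler)

for epoch in range(2):
    # If the sampler exposes DistributedSampler's set_epoch(), call it
    # here so shuffling differs across epochs.
    for inputs, targets in loader:
        ...  # labelled entries have targets >= 0, unlabelled ones -1
```

The seed must match across all processes so that every replica draws from the same shuffled order and the batches stay disjoint between replicas.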