pupil.sampling package
pupil.sampling.cluster_based module
- class pupil.sampling.cluster_based.ClusteringSampler(clustering_model: pupil.models.clustering.Clustering)
Bases:
object
Clustering sampling:
1. Get the data points closest to the centroids
2. Get the outliers in each cluster
3. Randomly sample from each cluster
4. Combine them all
- fit(X: NDArray2D) None
- predict(X: NDArray2D) Tuple[numpy.ndarray, numpy.ndarray]
- Parameters
X (NDArray2D) – Input data, one row per sample.
- Returns
tuple(distances , cluster_ids)
- Return type
Tuple[NDArray2D, NDArray2D]
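The four sampling steps listed for ClusteringSampler can be sketched with plain NumPy. This is an illustration only, not pupil's implementation: the helper name `cluster_sample`, its `n_random` and `seed` parameters, and the assumption that `distances[i]` is the distance from sample i to the centroid of its own cluster are all hypothetical.

```python
import numpy as np


def cluster_sample(distances, cluster_ids, n_random=1, seed=0):
    """Pick central, outlying, and random members of each cluster.

    Hypothetical helper: assumes distances[i] is the distance from sample i
    to the centroid of its own cluster, cluster_ids[i].
    """
    distances = np.asarray(distances, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    rng = np.random.default_rng(seed)
    selected = []
    for cid in np.unique(cluster_ids):
        members = np.flatnonzero(cluster_ids == cid)
        # Members of this cluster, ordered from nearest to farthest.
        order = members[np.argsort(distances[members])]
        selected.append(order[0])    # step 1: closest to the centroid
        selected.append(order[-1])   # step 2: farthest point (outlier)
        rest = order[1:-1]           # step 3: random draw from the rest
        k = min(n_random, rest.size)
        if k > 0:
            selected.extend(rng.choice(rest, size=k, replace=False))
    # Step 4: combine everything, dropping duplicate indices.
    return np.unique(np.array(selected, dtype=int))


distances = np.array([0.1, 0.5, 0.9, 0.4, 0.2, 0.3, 1.5, 0.7])
cluster_ids = np.array([0, 0, 0, 0, 1, 1, 1, 1])
picked = cluster_sample(distances, cluster_ids)
```

With these toy inputs, indices 0 and 4 are the most central points, 2 and 6 are the outliers, and one further index per cluster is drawn at random.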
pupil.sampling.model_based module
- class pupil.sampling.model_based.LinearInterpolationTransformer
Bases:
sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
- fit(X, y=None)
- fit_transform(X, y=None)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters.
- Returns
X_new – Transformed array.
- Return type
ndarray array of shape (n_samples, n_features_new)
- transform(X)
- class pupil.sampling.model_based.ModelBasedSampler(ranker)
Bases:
object
- fit(X: NDArray2D)
- classmethod from_strategy(strategy: Literal['rank', 'quantile', 'linear'] = 'linear') pupil.sampling.model_based.ModelBasedSampler
- predict(X: NDArray2D)
- class pupil.sampling.model_based.RankTransformer
Bases:
sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
- fit(X, y=None)
- fit_transform(X, y=None)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters.
- Returns
X_new – Transformed array.
- Return type
ndarray array of shape (n_samples, n_features_new)
- transform(X)
pupil.sampling.uncertainty module
- class pupil.sampling.uncertainty.UncertaintySampler(sampling_strategy: Callable[[numpy.ndarray], numpy.ndarray])
Bases:
object
Uncertainty sampling is a set of techniques for identifying unlabeled items that are near a decision boundary in your current machine learning model.
- fit(prob_dist: NDArray2D) None
Takes the 2D numpy array of model predictions and returns an array of indices ordered from highest to lowest uncertainty.
- Parameters
prob_dist (NDArray2D) –
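The ordering described for fit (indices from highest to lowest uncertainty) can be sketched as follows. This is a minimal illustration, not pupil's code: `rank_by_uncertainty` is a hypothetical helper, and the strategy used here (one minus the top probability) is only a stand-in for whichever scoring callable the sampler was built with.

```python
import numpy as np


def rank_by_uncertainty(prob_dist, strategy):
    """Return row indices sorted from most to least uncertain.

    Hypothetical helper: `strategy` maps a 2D probability array to one
    uncertainty score per row, where a higher score means more uncertain.
    """
    scores = strategy(np.asarray(prob_dist, dtype=float))
    # Negate the scores so that an ascending argsort yields descending order.
    return np.argsort(-scores, kind="stable")


prob_dist = np.array([[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]])
order = rank_by_uncertainty(prob_dist, lambda p: 1.0 - p.max(axis=1))
```

Here the second row (a 50/50 split) comes first and the confident first row comes last.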
- classmethod from_strategy(strategy: str) pupil.sampling.uncertainty.UncertaintySampler
Class method that helps pick the sampling strategy.
- Parameters
strategy (str) – Should be one of: ['least_confidence', 'margin_confidence', 'ratio_confidence', 'entropy_based']
- Raises
ValueError – If strategy is not in the valid list
- Return type
pupil.sampling.uncertainty.UncertaintySampler
- pupil.sampling.uncertainty.entropy_based(prob_dist: NDArray2D) numpy.ndarray
Returns the uncertainty score of an array using entropy-based sampling in a 0-1 range where 1 is most uncertain.
Example:
Assumes probability distribution is a numpy array, like:
np.array([[0.0321, 0.6439, 0.0871, 0.2369]])
The result will be:
0 - SUM(P(y|x) log2(P(y|x))) = 0 - SUM(-0.159, -0.409, -0.307, -0.492) = 1.367
1.367 / log2(n_classes = 4) = 0.684
- Parameters
prob_dist (NDArray2D) – a 2D numpy array of real numbers between 0 and 1. Each row is a data point, and each column shows the probability of that class.
- Returns
shape(n_rows)
- Return type
np.ndarray
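The entropy example above can be reproduced with a few lines of NumPy. This is a sketch under assumptions, not the library's source: `entropy_score` is a hypothetical name, and treating 0 · log2(0) as 0 is a common convention that the original does not state.

```python
import numpy as np


def entropy_score(prob_dist):
    """Normalized Shannon entropy per row (hypothetical helper)."""
    prob_dist = np.asarray(prob_dist, dtype=float)
    n_classes = prob_dist.shape[1]
    # Replace zeros by 1 inside the log so that 0 * log2(0) contributes 0.
    safe = np.where(prob_dist > 0, prob_dist, 1.0)
    raw_entropy = -np.sum(prob_dist * np.log2(safe), axis=1)
    # Divide by the maximum possible entropy to land in the 0-1 range.
    return raw_entropy / np.log2(n_classes)


entropy_scores = entropy_score(np.array([[0.0321, 0.6439, 0.0871, 0.2369]]))
uniform_scores = entropy_score(np.array([[0.25, 0.25, 0.25, 0.25]]))
```

A uniform distribution reaches the maximum score of 1, which is why the raw entropy is divided by log2(n_classes).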
- pupil.sampling.uncertainty.least_confidence(prob_dist: NDArray2D) numpy.ndarray
Returns the uncertainty score of an array using least confidence sampling in a 0-1 range where 1 is most uncertain.
Example:
Assumes probability distribution is a numpy array, like:
np.array([[0.0321, 0.6439, 0.0871, 0.2369]])
The result will be:
(1 - 0.6439) × (4 / 3) = 0.4748
- Parameters
prob_dist (NDArray2D) – a 2D numpy array of real numbers between 0 and 1. Each row is a data point, and each column shows the probability of that class.
- Returns
shape(n_rows)
- Return type
np.ndarray
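A minimal NumPy sketch of the least confidence calculation in the example above. The helper name `least_confidence_score` is hypothetical, and the code is inferred from the worked example rather than taken from pupil.

```python
import numpy as np


def least_confidence_score(prob_dist):
    """Least confidence score per row, scaled to 0-1 (hypothetical helper)."""
    prob_dist = np.asarray(prob_dist, dtype=float)
    n_classes = prob_dist.shape[1]
    top = prob_dist.max(axis=1)
    # (1 - top) is at most (n_classes - 1) / n_classes, so rescale it to 1.
    return (1.0 - top) * (n_classes / (n_classes - 1))


lc_scores = least_confidence_score(np.array([[0.0321, 0.6439, 0.0871, 0.2369]]))
lc_uniform = least_confidence_score(np.array([[0.25, 0.25, 0.25, 0.25]]))
```

The factor n_classes / (n_classes - 1) is the 4 / 3 in the example; it makes a uniform distribution score exactly 1.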
- pupil.sampling.uncertainty.margin_confidence(prob_dist: NDArray2D) numpy.ndarray
Returns the uncertainty score of an array using margin of confidence sampling in a 0-1 range where 1 is most uncertain.
Example:
Assumes probability distribution is a numpy array, like:
np.array([[0.0321, 0.6439, 0.0871, 0.2369]])
The result will be:
1.0 - (0.6439 - 0.2369) = 0.5930
- Parameters
prob_dist (NDArray2D) – a 2D numpy array of real numbers between 0 and 1. Each row is a data point, and each column shows the probability of that class.
- Returns
shape(n_rows)
- Return type
np.ndarray
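The margin example above subtracts the gap between the two largest probabilities from 1. A small sketch, assuming a hypothetical helper name `margin_confidence_score` and at least two classes per row:

```python
import numpy as np


def margin_confidence_score(prob_dist):
    """One minus the gap between the top two probabilities (hypothetical helper)."""
    prob_dist = np.asarray(prob_dist, dtype=float)
    # np.sort is ascending, so the last two columns hold the two largest values.
    sorted_probs = np.sort(prob_dist, axis=1)
    top, second = sorted_probs[:, -1], sorted_probs[:, -2]
    return 1.0 - (top - second)


margin_scores = margin_confidence_score(np.array([[0.0321, 0.6439, 0.0871, 0.2369]]))
```

A tie between the two most likely classes gives the maximum score of 1.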
- pupil.sampling.uncertainty.ratio_confidence(prob_dist: NDArray2D) numpy.ndarray
Returns the uncertainty score of an array using ratio of confidence sampling in a 0-1 range where 1 is most uncertain.
Example:
Assumes probability distribution is a numpy array, like:
np.array([[0.0321, 0.6439, 0.0871, 0.2369]])
The result will be:
0.6439 / 0.2369 ≈ 2.718
- Parameters
prob_dist (NDArray2D) – a 2D numpy array of real numbers between 0 and 1. Each row is a data point, and each column shows the probability of that class.
- Returns
shape(n_rows)
- Return type
np.ndarray
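The ratio in the example above (largest probability divided by the second largest) can be sketched as below. The helper name `ratio_confidence_score` is hypothetical, and the direction of the ratio follows the worked example only. Computed this way the value is always at least 1, and values closer to 1 indicate more uncertainty; whether pupil inverts or rescales the ratio is not stated in the text.

```python
import numpy as np


def ratio_confidence_score(prob_dist):
    """Ratio of the top probability to the runner-up (hypothetical helper)."""
    prob_dist = np.asarray(prob_dist, dtype=float)
    # np.sort is ascending, so the last two columns hold the two largest values.
    sorted_probs = np.sort(prob_dist, axis=1)
    top, second = sorted_probs[:, -1], sorted_probs[:, -2]
    # Assumes second > 0; a zero runner-up would produce inf.
    return top / second


ratio_scores = ratio_confidence_score(np.array([[0.0321, 0.6439, 0.0871, 0.2369]]))
```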