yall.ActiveLearningModel
class yall.activelearning.ActiveLearningModel(classifier, query_strategy, eval_metric='auc', U_proportion=0.9, init_L='random', random_state=None)
Bases: object
Parameters:
- classifier (sklearn.base.BaseEstimator) – Classifier used to build the model.
- query_strategy (QueryStrategy) – QueryStrategy instance to use.
- eval_metric (str) – One of “auc”, “accuracy”.
- U_proportion (float) – Proportion of the training data to assign to the unlabeled set.
- init_L (str) – How to initialize the labeled set L: “random” or “LDS”.
- random_state (int) – Sets the random_state parameter of train_test_split.
partial_train(new_x, new_y)
Trains incrementally on a subset of training examples by calling partial_fit.
Parameters:
- new_x (numpy.ndarray) – Feature array.
- new_y (numpy.ndarray) – Label array.
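As an illustration of the incremental-update pattern that partial_train describes, here is a minimal sketch. It is not yall's code: `RunningCentroidClassifier` is a hypothetical toy class, and the free function `partial_train` below only mirrors the documented method name. The toy class follows the scikit-learn convention of exposing a `partial_fit` method that updates the model from a new batch without revisiting earlier data.

```python
import numpy as np


class RunningCentroidClassifier:
    """Hypothetical toy classifier exposing a scikit-learn-style partial_fit."""

    def __init__(self):
        self.sums_ = {}    # per-class running sum of feature vectors
        self.counts_ = {}  # per-class number of examples seen so far

    def partial_fit(self, X, y):
        # Update the running statistics using only the new batch.
        for row, label in zip(np.asarray(X, dtype=float), np.asarray(y)):
            key = label.item()
            self.sums_[key] = self.sums_.get(key, 0.0) + row
            self.counts_[key] = self.counts_.get(key, 0) + 1
        return self

    def predict(self, X):
        labels = sorted(self.sums_)
        centroids = np.stack([self.sums_[k] / self.counts_[k] for k in labels])
        # Distance from every row to every class centroid; pick the nearest.
        diffs = np.asarray(X, dtype=float)[:, None, :] - centroids[None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        return np.array(labels)[nearest]


def partial_train(clf, new_x, new_y):
    """Sketch of a partial_train-style helper: forward the batch to partial_fit."""
    return clf.partial_fit(new_x, new_y)
```

With a real scikit-learn estimator, only classifiers that implement `partial_fit` would work in this role, and they typically also need the full list of classes on the first call.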
prepare_data(train_X, test_X, train_y, test_y)
Splits data into unlabeled, labeled, and test sets according to self.U_proportion.
Parameters:
- train_X (numpy.ndarray) – Training data features.
- test_X (numpy.ndarray) – Test data features.
- train_y (numpy.ndarray) – Training data labels.
- test_y (numpy.ndarray) – Test data labels.
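The effect of U_proportion can be sketched as follows. This is an assumption-laden illustration, not the library's implementation: the helper name `split_labeled_unlabeled` is hypothetical, and it shuffles with a NumPy permutation rather than scikit-learn's train_test_split. It also covers only the "random" style of initialization.

```python
import numpy as np


def split_labeled_unlabeled(train_X, train_y, U_proportion=0.9, random_state=None):
    """Shuffle the training set and hold out U_proportion of it as the unlabeled pool.

    Returns ((L_X, L_y), (U_X, U_y)). Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(random_state)
    order = rng.permutation(len(train_X))
    n_unlabeled = int(round(U_proportion * len(train_X)))
    u_idx, l_idx = order[:n_unlabeled], order[n_unlabeled:]
    return (train_X[l_idx], train_y[l_idx]), (train_X[u_idx], train_y[u_idx])
```

With the default U_proportion of 0.9, a 20-example training set would leave 2 examples in the initial labeled set and 18 in the unlabeled pool; the test set is passed through unchanged.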
run(train_X, test_X, train_y, test_y, ndraws=None, verbose=0)
Runs the active learning model. Saves AUC scores for each sampling iteration.
Parameters:
- train_X (numpy.ndarray) – Training data features.
- test_X (numpy.ndarray) – Test data features.
- train_y (numpy.ndarray) – Training data labels.
- test_y (numpy.ndarray) – Test data labels.
- ndraws (int) – Number of times to query the unlabeled set. If None, queries the entire unlabeled set.
- verbose (int) – If > 0, prints information.
Returns: AUC scores for each sampling iteration.
Return type: numpy.ndarray of shape (ndraws,)
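The general shape of a pool-based loop that yields one score per draw can be sketched as below. Everything here is a stand-in chosen for illustration: `run_pool_loop`, `one_nn_predict`, and `first_item_query` are hypothetical names, accuracy is used in place of AUC to keep the sketch dependency-free, and the exact order of querying, training, and scoring inside yall's run may differ.

```python
import numpy as np


def one_nn_predict(L_X, L_y, test_X):
    """Stand-in model: label each test point with its nearest labeled neighbor."""
    dists = np.linalg.norm(test_X[:, None, :] - L_X[None, :, :], axis=2)
    return L_y[dists.argmin(axis=1)]


def first_item_query(L_X, L_y, U_X):
    """Stand-in query strategy: always pick the first unlabeled example."""
    return 0


def run_pool_loop(L_X, L_y, U_X, U_y, test_X, test_y,
                  fit_predict=one_nn_predict, query=first_item_query, ndraws=None):
    """Move one queried example per draw from U to L, then score on the test set."""
    if ndraws is None:
        ndraws = len(U_X)  # query the entire unlabeled pool
    scores = np.zeros(ndraws)
    for i in range(ndraws):
        pick = query(L_X, L_y, U_X)                  # index into the unlabeled pool
        L_X = np.vstack([L_X, U_X[pick:pick + 1]])   # add the queried features to L
        L_y = np.append(L_y, U_y[pick])              # the oracle reveals the label
        U_X = np.delete(U_X, pick, axis=0)
        U_y = np.delete(U_y, pick)
        preds = fit_predict(L_X, L_y, test_X)
        scores[i] = np.mean(preds == test_y)         # accuracy; AUC would slot in here
    return scores
```

The returned array has one entry per draw, matching the documented return shape of (ndraws,). When ndraws is None, its length equals the size of the unlabeled pool.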