learner Module¶
Provides an easy-to-use wrapper around scikit-learn.
author: Michael Heilman (mheilman@ets.org)
author: Nitin Madnani (nmadnani@ets.org)
author: Dan Blanchard (dblanchard@ets.org)
author: Aoife Cahill (acahill@ets.org)
organization: ETS

class skll.learner.Densifier[source]¶
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
A custom pipeline stage that is inserted into the learner pipeline attribute when SKLL needs to manually convert feature arrays from sparse to dense, e.g., when features are being hashed but we are also centering using the feature means.

fit_transform(X, y=None)[source]¶
Fit to data, then transform it. Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters:
 X (numpy array of shape [n_samples, n_features]) – Training set.
 y (numpy array of shape [n_samples]) – Target values.
Returns: X_new – Transformed array.
Return type: numpy array of shape [n_samples, n_features_new]
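As a minimal sketch of what such a densifying stage looks like in plain scikit-learn (the class and names below are illustrative, not SKLL's internals):

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse
from sklearn.base import BaseEstimator, TransformerMixin

class DenseTransformer(BaseEstimator, TransformerMixin):
    """Convert a sparse feature matrix to a dense numpy array."""

    def fit(self, X, y=None):
        # Nothing to learn; this stage exists purely for conversion.
        return self

    def transform(self, X):
        return X.toarray() if issparse(X) else np.asarray(X)

X_sparse = csr_matrix(np.eye(3))
X_dense = DenseTransformer().fit_transform(X_sparse)
```

Because the transformer is stateless, fit_transform simply chains fit and transform, as in the method above.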


class skll.learner.FilteredLeaveOneGroupOut(keep, example_ids, logger=None)[source]¶
Bases: sklearn.model_selection._split.LeaveOneGroupOut
Version of the LeaveOneGroupOut cross-validation iterator that only outputs indices of instances with IDs in a pre-specified set.
Parameters:
 keep (set of str) – A set of IDs to keep.
 example_ids (list of str, of length n_samples) – A list of example IDs.

split(X, y, groups)[source]¶
Generate indices to split data into training and test sets.
Parameters:
 X (array-like, with shape (n_samples, n_features)) – Training data, where n_samples is the number of samples and n_features is the number of features.
 y (array-like, of length n_samples) – The target variable for supervised learning problems.
 groups (array-like, with shape (n_samples,)) – Group labels for the samples used while splitting the dataset into train/test set.
Yields:
 train_index (np.array) – The training set indices for that split.
 test_index (np.array) – The testing set indices for that split.
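The filtering idea can be sketched with plain scikit-learn and numpy (an illustrative re-implementation, not SKLL's code): run LeaveOneGroupOut as usual, then drop any indices whose IDs are not in the keep set.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

def filtered_logo_split(X, y, groups, example_ids, keep):
    """Yield LeaveOneGroupOut splits restricted to IDs in `keep`."""
    keep_mask = np.array([eid in keep for eid in example_ids])
    for train_index, test_index in LeaveOneGroupOut().split(X, y, groups):
        # Retain only the positions whose example ID was kept.
        yield (train_index[keep_mask[train_index]],
               test_index[keep_mask[test_index]])

X = np.arange(12).reshape(6, 2)
y = np.array([0, 1, 0, 1, 0, 1])
groups = np.array([1, 1, 2, 2, 3, 3])
ids = ["a", "b", "c", "d", "e", "f"]
splits = list(filtered_logo_split(X, y, groups, ids, keep={"a", "c", "e"}))
```

With three groups this yields three splits, each containing only indices of the kept IDs.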

class skll.learner.Learner(model_type, probability=False, pipeline=False, feature_scaling='none', model_kwargs=None, pos_label_str=None, min_feature_count=1, sampler=None, sampler_kwargs=None, custom_learner_path=None, logger=None)[source]¶
Bases: object
A simpler learner interface around many scikit-learn classification and regression estimators.
Parameters:
 model_type (str) – Name of the estimator to create (e.g., 'LogisticRegression'). See the skll package documentation for valid options.
 probability (bool, optional) – Should the learner return probabilities of all labels (instead of just the label with the highest probability)? Defaults to False.
 pipeline (bool, optional) – Should the learner contain a pipeline attribute that holds a scikit-learn Pipeline object composed of all steps, including the vectorizer, the feature selector, the sampler, the feature scaler, and the actual estimator? Note that this will increase the size of the learner object in memory and also when it is saved to disk. Defaults to False.
 feature_scaling (str, optional) – How to scale the features, if at all. Options are: 'with_std' – scale features using the standard deviation; 'with_mean' – center features using the mean; 'both' – do both scaling and centering; 'none' – do neither. Defaults to 'none'.
 model_kwargs (dict, optional) – A dictionary of keyword arguments to pass to the initializer for the specified model. Defaults to None.
 pos_label_str (str, optional) – A string denoting the label of the class to be treated as the positive class in a binary classification setting. If None, the class represented by the label that appears second when sorted is chosen as the positive class. For example, if the two labels in the data are "A" and "B" and pos_label_str is not specified, "B" will be chosen as the positive class. Defaults to None.
 min_feature_count (int, optional) – The minimum number of examples in which a feature must have a non-zero value to be included. Defaults to 1.
 sampler (str, optional) – The sampler to use for kernel approximation, if desired. Valid values are 'AdditiveChi2Sampler', 'Nystroem', 'RBFSampler', and 'SkewedChi2Sampler'. Defaults to None.
 sampler_kwargs (dict, optional) – A dictionary of keyword arguments to pass to the initializer for the specified sampler. Defaults to None.
 custom_learner_path (str, optional) – Path to the module where a custom classifier is defined. Defaults to None.
 logger (logging object, optional) – A logging object. If None is passed, a logger is obtained from __name__. Defaults to None.
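When pipeline is True, the learner exposes a scikit-learn Pipeline broadly like the one below. This is an illustrative sketch built directly with scikit-learn; the stage names and the exact set of steps are assumptions, not SKLL's internals.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Vectorizer -> feature scaler -> estimator, roughly the shape of the
# pipeline attribute described above (illustrative stage names).
pipe = Pipeline([
    ("vectorizer", DictVectorizer(sparse=False)),
    ("scaler", StandardScaler()),          # akin to feature_scaling='both'
    ("estimator", LogisticRegression(solver="liblinear")),
])
train = [{"f1": 1.0, "f2": 0.0}, {"f1": 0.0, "f2": 1.0},
         {"f1": 1.1, "f2": 0.1}, {"f1": 0.1, "f2": 1.2}]
labels = ["A", "B", "A", "B"]
pipe.fit(train, labels)
preds = pipe.predict([{"f1": 1.0, "f2": 0.0}])
```

Storing such a pipeline makes the learner self-contained for prediction, at the cost of a larger object in memory and on disk, as noted above.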

cross_validate(examples, stratified=True, cv_folds=10, grid_search=True, grid_search_folds=3, grid_jobs=None, grid_objective=None, output_metrics=[], prediction_prefix=None, param_grid=None, shuffle=False, save_cv_folds=False, save_cv_models=False, use_custom_folds_for_grid_search=True)[source]¶
Cross-validates a given model on the training examples.
Parameters:
 examples (skll.FeatureSet) – The FeatureSet instance to cross-validate learner performance on.
 stratified (bool, optional) – Should we stratify the folds to ensure an even distribution of labels for each fold? Defaults to True.
 cv_folds (int, optional) – The number of folds to use for cross-validation, or a mapping from example IDs to folds. Defaults to 10.
 grid_search (bool, optional) – Should we do grid search when training each fold? Note: this will take much longer. Defaults to True.
 grid_search_folds (int or dict, optional) – The number of folds to use when doing the grid search, or a mapping from example IDs to folds. Defaults to 3.
 grid_jobs (int, optional) – The number of jobs to run in parallel when doing the grid search. If None or 0, the number of grid search folds will be used. Defaults to None.
 grid_objective (str, optional) – The name of the objective function to use when doing the grid search. Must be specified if grid_search is True. Defaults to None.
 output_metrics (list of str, optional) – List of additional metric names to compute in addition to the metric used for grid search. Defaults to an empty list.
 prediction_prefix (str, optional) – If saving the predictions, this is the prefix that will be used for the filename. It will be followed by "_predictions.tsv". Defaults to None.
 param_grid (list of dicts, optional) – The parameter grid to traverse. Defaults to None.
 shuffle (bool, optional) – Shuffle the examples before splitting into folds for CV. Defaults to False.
 save_cv_folds (bool, optional) – Whether to save the CV fold IDs. Defaults to False.
 save_cv_models (bool, optional) – Whether to save the CV models. Defaults to False.
 use_custom_folds_for_grid_search (bool, optional) – If cv_folds is a custom dictionary but grid_search_folds is not, perhaps due to user oversight, should the same custom dictionary automatically be used for the inner grid-search cross-validation? Defaults to True.
Returns:
 results (list of 6-tuples) – The confusion matrix, overall accuracy, per-label PRFs, model parameters, objective function score, and evaluation metrics (if any) for each fold.
 grid_search_scores (list of floats) – The grid search scores for each fold.
 grid_search_cv_results_dicts (list of dicts) – A list of dictionaries of grid search CV results, one per fold, with keys such as "params", "mean_test_score", etc., that are mapped to lists of values associated with each hyperparameter set combination.
 skll_fold_ids (dict) – A dictionary containing the test-fold number for each ID if save_cv_folds is True, otherwise None.
 models (list of skll.learner.Learner) – A list of skll.learner.Learner instances, one for each fold if save_cv_models is True, otherwise None.
Raises: ValueError – If labels are not encoded as strings.
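The core of this method corresponds to scikit-learn's own cross-validation machinery. A minimal sketch with plain scikit-learn on synthetic data (SKLL adds per-fold grid search, fold bookkeeping, and TSV prediction output on top of this):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

# Stratified folds with an extra output metric, analogous to
# stratified=True plus output_metrics above (illustrative data).
X, y = make_classification(n_samples=60, n_features=5, random_state=42)
cv = StratifiedKFold(n_splits=5, shuffle=False)
scores = cross_validate(LogisticRegression(solver="liblinear"), X, y,
                        cv=cv, scoring=["accuracy", "f1"])
mean_acc = scores["test_accuracy"].mean()
```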

evaluate(examples, prediction_prefix=None, append=False, grid_objective=None, output_metrics=[])[source]¶
Evaluates a given model on a given dev or test FeatureSet.
Parameters:
 examples (skll.FeatureSet) – The FeatureSet instance to evaluate the performance of the model on.
 prediction_prefix (str, optional) – If saving the predictions, this is the prefix that will be used for the filename. It will be followed by "_predictions.tsv". Defaults to None.
 append (bool, optional) – Should we append the current predictions to the file if it exists? Defaults to False.
 grid_objective (function, optional) – The objective function that was used when doing the grid search. Defaults to None.
 output_metrics (list of str, optional) – List of additional metric names to compute in addition to the grid objective. Defaults to an empty list.
Returns: res – The confusion matrix, the overall accuracy, the per-label PRFs, the model parameters, the grid search objective function score, and the additional evaluation metrics, if any.
Return type: 6-tuple
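The quantities packed into the returned 6-tuple can each be computed directly with sklearn.metrics; a sketch with hand-written labels (illustrative, not SKLL's implementation):

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

y_true = ["A", "A", "B", "B", "A", "B"]
y_pred = ["A", "B", "B", "B", "A", "A"]

# Confusion matrix (rows = true labels), overall accuracy,
# and per-label precision/recall/F-score.
conf = confusion_matrix(y_true, y_pred, labels=["A", "B"])
accuracy = accuracy_score(y_true, y_pred)
prf = precision_recall_fscore_support(y_true, y_pred, labels=["A", "B"])
```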

classmethod from_file(learner_path, logger=None)[source]¶
Load a saved Learner instance from a file path.
Parameters:
 learner_path (str) – The path to a saved Learner instance file.
 logger (logging object, optional) – A logging object. If None is passed, a logger is obtained from __name__. Defaults to None.
Returns: learner – The Learner instance loaded from the file.
Return type: skll.learner.Learner
Raises:
 ValueError – If the pickled object is not a Learner instance.
 ValueError – If the pickled version of the Learner instance is out of date.

learning_curve(examples, metric, cv_folds=10, train_sizes=array([0.1, 0.325, 0.55, 0.775, 1.]))[source]¶
Generates learning curves for a given model on the training examples via cross-validation. Adapted from the scikit-learn code for learning curve generation (cf. sklearn.model_selection.learning_curve).
Parameters:
 examples (skll.FeatureSet) – The FeatureSet instance to generate the learning curve on.
 metric (str) – The name of the metric function to use when computing the train and test scores for the learning curve.
 cv_folds (int, optional) – The number of folds to use for cross-validation, or a mapping from example IDs to folds. Defaults to 10.
 train_sizes (list of float or int, optional) – Relative or absolute numbers of training examples that will be used to generate the learning curve. If the type is float, it is regarded as a fraction of the maximum size of the training set (determined by the selected validation method), i.e., it has to be within (0, 1]. Otherwise it is interpreted as absolute sizes of the training sets. Note that for classification the number of samples usually has to be big enough to contain at least one sample from each class. Defaults to np.linspace(0.1, 1.0, 5).
Returns:
 train_scores (list of float) – The scores for the training set.
 test_scores (list of float) – The scores on the test set.
 num_examples (list of int) – The numbers of training examples used to generate the curve.
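The underlying scikit-learn function this method is adapted from can be called directly; a sketch on synthetic data (illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Five relative train sizes, matching the default train_sizes above.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
train_sizes, train_scores, test_scores = learning_curve(
    LogisticRegression(solver="liblinear"), X, y,
    cv=5, train_sizes=np.linspace(0.1, 1.0, 5))
```

Each row of train_scores/test_scores corresponds to one training size; each column to one CV fold.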

load(learner_path)[source]¶
Replace the current learner instance with a saved learner.
Parameters: learner_path (str) – The path to a saved learner object file to load.

model¶
The underlying scikit-learn model.

model_kwargs¶
A dictionary of the underlying scikit-learn model's keyword arguments.

model_params¶
Model parameters (i.e., weights) for a LinearModel (e.g., Ridge) regression and liblinear models. If the model was trained using feature hashing, then names of the form hashed_feature_XX are used instead.
Returns:
 res (dict) – A dictionary of labeled weights.
 intercept (dict) – A dictionary of intercept(s).
Raises: ValueError – If the instance does not support model parameters.

model_type¶
The model type (i.e., the class).

predict(examples, prediction_prefix=None, append=False, class_labels=False)[source]¶
Uses a given model to generate predictions on a given FeatureSet.
Parameters:
 examples (skll.FeatureSet) – The FeatureSet instance to predict labels for.
 prediction_prefix (str, optional) – If saving the predictions, this is the prefix that will be used for the filename. It will be followed by "_predictions.tsv". Defaults to None.
 append (bool, optional) – Should we append the current predictions to the file if it exists? Defaults to False.
 class_labels (bool, optional) – For a classifier, should we convert class indices to their (str) labels for the returned array? Note that class labels are always written out to disk. Defaults to False.
Returns: yhat – The predictions returned by the Learner instance.
Return type: array-like
Raises: MemoryError – If the process runs out of memory when converting to dense.

probability¶
Should the learner return probabilities of all labels (instead of just the label with the highest probability)?

save(learner_path)[source]¶
Save the Learner instance to a file.
Parameters: learner_path (str) – The path to save the Learner instance to.
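save(), load(), and from_file() boil down to pickling the learner object and reading it back. A minimal sketch of the same round trip with a bare scikit-learn estimator (SKLL additionally records a version for the out-of-date check mentioned under from_file):

```python
import pickle
import tempfile
from pathlib import Path
from sklearn.linear_model import Ridge

model = Ridge(alpha=0.01).fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0])

path = Path(tempfile.mkdtemp()) / "model.pkl"
with open(path, "wb") as f:
    pickle.dump(model, f)          # analogous to Learner.save()
with open(path, "rb") as f:
    restored = pickle.load(f)      # analogous to Learner.from_file()

pred = restored.predict([[3.0]])
```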

train(examples, param_grid=None, grid_search_folds=3, grid_search=True, grid_objective=None, grid_jobs=None, shuffle=False, create_label_dict=True)[source]¶
Train a classification model and return the model, score, feature vectorizer, scaler, label dictionary, and inverse label dictionary.
Parameters:
 examples (skll.FeatureSet) – The FeatureSet instance to use for training.
 param_grid (list of dicts, optional) – The parameter grid to search through for grid search. If None, a default parameter grid will be used. Defaults to None.
 grid_search_folds (int or dict, optional) – The number of folds to use when doing the grid search, or a mapping from example IDs to folds. Defaults to 3.
 grid_search (bool, optional) – Should we do grid search? Defaults to True.
 grid_objective (str, optional) – The name of the objective function to use when doing the grid search. Must be specified if grid_search is True. Defaults to None.
 grid_jobs (int, optional) – The number of jobs to run in parallel when doing the grid search. If None or 0, the number of grid search folds will be used. Defaults to None.
 shuffle (bool, optional) – Shuffle the examples (e.g., for grid search CV). Defaults to False.
 create_label_dict (bool, optional) – Should we create the label dictionary? This dictionary is used to map between string labels and their corresponding numerical values. This should only be done once per experiment, so when cross_validate calls train, create_label_dict gets set to False. This option is only for internal use. Defaults to True.
Returns: tuple – 1) The best grid search objective function score, or 0 if we're not doing grid search, and 2) a dictionary of grid search CV results with keys such as "params", "mean_test_score", etc., that are mapped to lists of values associated with each hyperparameter set combination, or None if not doing grid search.
Return type: (float, dict)
Raises:
 ValueError – If grid_objective is not a valid grid objective or if one is not specified when necessary.
 MemoryError – If the process runs out of memory converting training data to dense.
 ValueError – If FeatureHasher is used with MultinomialNB.
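The grid search inside this method corresponds to a scikit-learn GridSearchCV run, with the returned tuple mapping roughly to best_score_ and cv_results_. A sketch on synthetic data (illustrative, not SKLL's exact call):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=60, n_features=5, random_state=1)

# param_grid as a list of dicts, scoring as the grid objective,
# cv as grid_search_folds -- mirroring the parameters above.
search = GridSearchCV(LogisticRegression(solver="liblinear"),
                      param_grid=[{"C": [0.1, 1.0, 10.0]}],
                      scoring="f1", cv=3)
search.fit(X, y)
best_score, cv_results = search.best_score_, search.cv_results_
```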

class skll.learner.RescaledAdaBoostRegressor(base_estimator=None, n_estimators=50, learning_rate=1.0, loss='linear', random_state=None)[source]¶
Bases: sklearn.ensemble.weight_boosting.AdaBoostRegressor

fit(X, y, sample_weight=None)¶
Build a boosted regressor from the training set (X, y).
Parameters:
 X ({array-like, sparse matrix} of shape = [n_samples, n_features]) – The training input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK, and LIL are converted to CSR.
 y (array-like of shape = [n_samples]) – The target values (real numbers).
 sample_weight (array-like of shape = [n_samples], optional) – Sample weights. If None, the sample weights are initialized to 1 / n_samples.
Returns: self
Return type: object

predict(X)¶
Predict regression value for X. The predicted regression value of an input sample is computed as the weighted median prediction of the regressors in the ensemble.
Parameters: X ({array-like, sparse matrix} of shape = [n_samples, n_features]) – The input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK, and LIL are converted to CSR.
Returns: y – The predicted regression values.
Return type: array of shape = [n_samples]
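The Rescaled* wrappers below all share one idea: after the underlying regressor is fit, its raw predictions are mapped onto the scale (mean and standard deviation) of the training targets. A numpy sketch of that transformation, inferred from the class names (SKLL's actual implementation may differ in details, e.g., clipping to the observed target range):

```python
import numpy as np

def rescale_predictions(preds, y_train):
    """Map raw predictions onto the training targets' mean/std scale.

    Illustrative sketch of the rescaling idea, not SKLL's code.
    """
    p_mean, p_std = preds.mean(), preds.std()
    y_mean, y_std = y_train.mean(), y_train.std()
    # Standardize the raw predictions, then re-express them on the
    # training targets' scale.
    return (preds - p_mean) / p_std * y_std + y_mean

y_train = np.array([10.0, 20.0, 30.0, 40.0])
raw = np.array([0.1, 0.2, 0.3, 0.4])
rescaled = rescale_predictions(raw, y_train)
```

After rescaling, the predictions have the same mean and standard deviation as the training targets.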


class skll.learner.RescaledBayesianRidge(n_iter=300, tol=0.001, alpha_1=1e-06, alpha_2=1e-06, lambda_1=1e-06, lambda_2=1e-06, compute_score=False, fit_intercept=True, normalize=False, copy_X=True, verbose=False)[source]¶
Bases: sklearn.linear_model.bayes.BayesianRidge

fit(X, y, sample_weight=None)¶
Fit the model.
Parameters:
 X (numpy array of shape [n_samples, n_features]) – Training data.
 y (numpy array of shape [n_samples]) – Target values. Will be cast to X's dtype if necessary.
 sample_weight (numpy array of shape [n_samples]) – Individual weights for each sample. New in version 0.20: parameter sample_weight support to BayesianRidge.
Returns: self
Return type: returns an instance of self.

predict(X, return_std=False)¶
Predict using the linear model. In addition to the mean of the predictive distribution, its standard deviation can also be returned.
Parameters:
 X ({array-like, sparse matrix}, shape = (n_samples, n_features)) – Samples.
 return_std (boolean, optional) – Whether to return the standard deviation of the posterior prediction.
Returns:
 y_mean (array, shape = (n_samples,)) – Mean of the predictive distribution of the query points.
 y_std (array, shape = (n_samples,)) – Standard deviation of the predictive distribution of the query points.


class skll.learner.RescaledDecisionTreeRegressor(criterion='mse', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, presort=False)[source]¶
Bases: sklearn.tree.tree.DecisionTreeRegressor

fit(X, y, sample_weight=None, check_input=True, X_idx_sorted=None)¶
Build a decision tree regressor from the training set (X, y).
Parameters:
 X (array-like or sparse matrix, shape = [n_samples, n_features]) – The training input samples. Internally, it will be converted to dtype=np.float32 and, if a sparse matrix is provided, to a sparse csc_matrix.
 y (array-like, shape = [n_samples] or [n_samples, n_outputs]) – The target values (real numbers). Use dtype=np.float64 and order='C' for maximum efficiency.
 sample_weight (array-like, shape = [n_samples] or None) – Sample weights. If None, then samples are equally weighted. Splits that would create child nodes with net zero or negative weight are ignored while searching for a split in each node.
 check_input (boolean, (default=True)) – Allows bypassing several input checks. Don't use this parameter unless you know what you are doing.
 X_idx_sorted (array-like, shape = [n_samples, n_features], optional) – The indices of the sorted training input samples. If many trees are grown on the same dataset, this allows the ordering to be cached between trees. If None, the data will be sorted here. Don't use this parameter unless you know what you are doing.
Returns: self
Return type: object

predict(X, check_input=True)¶
Predict class or regression value for X. For a classification model, the predicted class for each sample in X is returned. For a regression model, the predicted value based on X is returned.
Parameters:
 X (array-like or sparse matrix of shape = [n_samples, n_features]) – The input samples. Internally, it will be converted to dtype=np.float32 and, if a sparse matrix is provided, to a sparse csr_matrix.
 check_input (boolean, (default=True)) – Allows bypassing several input checks. Don't use this parameter unless you know what you are doing.
Returns: y – The predicted classes, or the predicted values.
Return type: array of shape = [n_samples] or [n_samples, n_outputs]


class skll.learner.RescaledElasticNet(alpha=1.0, l1_ratio=0.5, fit_intercept=True, normalize=False, precompute=False, max_iter=1000, copy_X=True, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')[source]¶
Bases: sklearn.linear_model.coordinate_descent.ElasticNet

fit(X, y, check_input=True)¶
Fit the model with coordinate descent.
Parameters:
 X (ndarray or scipy.sparse matrix, (n_samples, n_features)) – Data.
 y (ndarray, shape (n_samples,) or (n_samples, n_targets)) – Target. Will be cast to X's dtype if necessary.
 check_input (boolean, (default=True)) – Allows bypassing several input checks. Don't use this parameter unless you know what you are doing.
Notes
Coordinate descent is an algorithm that considers each column of data at a time, so it will automatically convert the X input to a Fortran-contiguous numpy array if necessary. To avoid memory reallocation, it is advised to allocate the initial data in memory directly in that format.

predict(X)¶
Predict using the linear model.
Parameters: X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples.
Returns: C – Returns predicted values.
Return type: array, shape (n_samples,)


class skll.learner.RescaledGradientBoostingRegressor(loss='ls', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, min_impurity_split=None, init=None, random_state=None, max_features=None, alpha=0.9, verbose=0, max_leaf_nodes=None, warm_start=False, presort='auto', validation_fraction=0.1, n_iter_no_change=None, tol=0.0001)[source]¶
Bases: sklearn.ensemble.gradient_boosting.GradientBoostingRegressor

fit(X, y, sample_weight=None, monitor=None)¶
Fit the gradient boosting model.
Parameters:
 X ({array-like, sparse matrix}, shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float32 and, if a sparse matrix is provided, to a sparse csr_matrix.
 y (array-like, shape (n_samples,)) – Target values (strings or integers in classification, real numbers in regression). For classification, labels must correspond to classes.
 sample_weight (array-like, shape (n_samples,) or None) – Sample weights. If None, then samples are equally weighted. Splits that would create child nodes with net zero or negative weight are ignored while searching for a split in each node. In the case of classification, splits are also ignored if they would result in any single class carrying a negative weight in either child node.
 monitor (callable, optional) – The monitor is called after each iteration with the current iteration, a reference to the estimator, and the local variables of _fit_stages as keyword arguments: callable(i, self, locals()). If the callable returns True, the fitting procedure is stopped. The monitor can be used for various things such as computing held-out estimates, early stopping, model introspection, and snapshotting.
Returns: self
Return type: object

predict(X)¶
Predict regression target for X.
Parameters: X ({array-like, sparse matrix}, shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float32 and, if a sparse matrix is provided, to a sparse csr_matrix.
Returns: y – The predicted values.
Return type: array, shape (n_samples,)


class skll.learner.RescaledHuberRegressor(epsilon=1.35, max_iter=100, alpha=0.0001, warm_start=False, fit_intercept=True, tol=1e-05)[source]¶
Bases: sklearn.linear_model.huber.HuberRegressor

fit(X, y, sample_weight=None)¶
Fit the model according to the given training data.
Parameters:
 X (array-like, shape (n_samples, n_features)) – Training vector, where n_samples is the number of samples and n_features is the number of features.
 y (array-like, shape (n_samples,)) – Target vector relative to X.
 sample_weight (array-like, shape (n_samples,)) – Weight given to each sample.
Returns: self
Return type: object

predict(X)¶
Predict using the linear model.
Parameters: X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples.
Returns: C – Returns predicted values.
Return type: array, shape (n_samples,)


class skll.learner.RescaledKNeighborsRegressor(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, n_jobs=None, **kwargs)[source]¶
Bases: sklearn.neighbors.regression.KNeighborsRegressor

fit(X, y)¶
Fit the model using X as training data and y as target values.
Parameters:
 X ({array-like, sparse matrix, BallTree, KDTree}) – Training data. If array or matrix, shape [n_samples, n_features], or [n_samples, n_samples] if metric='precomputed'.
 y ({array-like, sparse matrix}) – Target values, array of float values, shape = [n_samples] or [n_samples, n_outputs].

predict(X)¶
Predict the target for the provided data.
Parameters: X (array-like, shape (n_query, n_features), or (n_query, n_indexed) if metric == 'precomputed') – Test samples.
Returns: y – Target values.
Return type: array of float, shape = [n_samples] or [n_samples, n_outputs]


class skll.learner.RescaledLars(fit_intercept=True, verbose=False, normalize=True, precompute='auto', n_nonzero_coefs=500, eps=2.220446049250313e-16, copy_X=True, fit_path=True, positive=False)[source]¶
Bases: sklearn.linear_model.least_angle.Lars

fit(X, y, Xy=None)¶
Fit the model using X, y as training data.
Parameters:
 X (array-like, shape (n_samples, n_features)) – Training data.
 y (array-like, shape (n_samples,) or (n_samples, n_targets)) – Target values.
 Xy (array-like, shape (n_samples,) or (n_samples, n_targets), optional) – Xy = np.dot(X.T, y) that can be precomputed. It is useful only when the Gram matrix is precomputed.
Returns: self – returns an instance of self.
Return type: object

predict(X)¶
Predict using the linear model.
Parameters: X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples.
Returns: C – Returns predicted values.
Return type: array, shape (n_samples,)


class skll.learner.RescaledLasso(alpha=1.0, fit_intercept=True, normalize=False, precompute=False, copy_X=True, max_iter=1000, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')[source]¶
Bases: sklearn.linear_model.coordinate_descent.Lasso

fit(X, y, check_input=True)¶
Fit the model with coordinate descent.
Parameters:
 X (ndarray or scipy.sparse matrix, (n_samples, n_features)) – Data.
 y (ndarray, shape (n_samples,) or (n_samples, n_targets)) – Target. Will be cast to X's dtype if necessary.
 check_input (boolean, (default=True)) – Allows bypassing several input checks. Don't use this parameter unless you know what you are doing.
Notes
Coordinate descent is an algorithm that considers each column of data at a time, so it will automatically convert the X input to a Fortran-contiguous numpy array if necessary. To avoid memory reallocation, it is advised to allocate the initial data in memory directly in that format.

predict(X)¶
Predict using the linear model.
Parameters: X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples.
Returns: C – Returns predicted values.
Return type: array, shape (n_samples,)


class skll.learner.RescaledLinearRegression(fit_intercept=True, normalize=False, copy_X=True, n_jobs=None)[source]¶
Bases: sklearn.linear_model.base.LinearRegression

fit(X, y, sample_weight=None)¶
Fit linear model.
Parameters:
 X (array-like or sparse matrix, shape (n_samples, n_features)) – Training data.
 y (array_like, shape (n_samples, n_targets)) – Target values. Will be cast to X's dtype if necessary.
 sample_weight (numpy array of shape [n_samples]) – Individual weights for each sample. New in version 0.17: parameter sample_weight support to LinearRegression.
Returns: self
Return type: returns an instance of self.

predict(X)¶
Predict using the linear model.
Parameters: X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples.
Returns: C – Returns predicted values.
Return type: array, shape (n_samples,)


class skll.learner.RescaledLinearSVR(epsilon=0.0, tol=0.0001, C=1.0, loss='epsilon_insensitive', fit_intercept=True, intercept_scaling=1.0, dual=True, verbose=0, random_state=None, max_iter=1000)[source]¶
Bases: sklearn.svm.classes.LinearSVR

fit(X, y, sample_weight=None)¶
Fit the model according to the given training data.
Parameters:
 X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Training vector, where n_samples is the number of samples and n_features is the number of features.
 y (array-like, shape = [n_samples]) – Target vector relative to X.
 sample_weight (array-like, shape = [n_samples], optional) – Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
Returns: self
Return type: object

predict(X)¶
Predict using the linear model.
Parameters: X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples.
Returns: C – Returns predicted values.
Return type: array, shape (n_samples,)


class skll.learner.RescaledMLPRegressor(hidden_layer_sizes=(100,), activation='relu', solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, power_t=0.5, max_iter=200, shuffle=True, random_state=None, tol=0.0001, verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True, early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-08, n_iter_no_change=10)[source]¶
Bases: sklearn.neural_network.multilayer_perceptron.MLPRegressor

fit(X, y)¶
Fit the model to data matrix X and target(s) y.
Parameters:
 X (array-like or sparse matrix, shape (n_samples, n_features)) – The input data.
 y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values (class labels in classification, real numbers in regression).
Returns: self
Return type: returns a trained MLP model.

predict(X)¶
Predict using the multi-layer perceptron model.
Parameters: X ({array-like, sparse matrix}, shape (n_samples, n_features)) – The input data.
Returns: y – The predicted values.
Return type: array-like, shape (n_samples, n_outputs)


class skll.learner.RescaledRANSACRegressor(base_estimator=None, min_samples=None, residual_threshold=None, is_data_valid=None, is_model_valid=None, max_trials=100, max_skips=inf, stop_n_inliers=inf, stop_score=inf, stop_probability=0.99, loss='absolute_loss', random_state=None)[source]¶
Bases: sklearn.linear_model.ransac.RANSACRegressor

fit(X, y, sample_weight=None)¶
Fit estimator using the RANSAC algorithm.
Parameters:
 X (array-like or sparse matrix, shape [n_samples, n_features]) – Training data.
 y (array-like, shape = [n_samples] or [n_samples, n_targets]) – Target values.
 sample_weight (array-like, shape = [n_samples]) – Individual weights for each sample. Raises an error if sample_weight is passed and the base_estimator fit method does not support it.
Raises: ValueError – If no valid consensus set could be found. This occurs if is_data_valid and is_model_valid return False for all max_trials randomly chosen sub-samples.

predict(X)¶
Predict using the estimated model. This is a wrapper for estimator_.predict(X).
Parameters: X (numpy array of shape [n_samples, n_features]) – Samples.
Returns: y – Returns predicted values.
Return type: array, shape = [n_samples] or [n_samples, n_targets]


class
skll.learner.
RescaledRandomForestRegressor
(n_estimators='warn', criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False)[source]¶ Bases:
sklearn.ensemble.forest.RandomForestRegressor

fit
(X, y, sample_weight=None)¶ Build a forest of trees from the training set (X, y).
Parameters:  X (array-like or sparse matrix of shape = [n_samples, n_features]) – The training input samples. Internally, its dtype will be converted to dtype=np.float32. If a sparse matrix is provided, it will be converted into a sparse csc_matrix.
 y (array-like, shape = [n_samples] or [n_samples, n_outputs]) – The target values (class labels in classification, real numbers in regression).
 sample_weight (array-like, shape = [n_samples] or None) – Sample weights. If None, then samples are equally weighted. Splits that would create child nodes with net zero or negative weight are ignored while searching for a split in each node. In the case of classification, splits are also ignored if they would result in any single class carrying a negative weight in either child node.
Returns: self
Return type: object

predict
(X)¶ Predict regression target for X.
The predicted regression target of an input sample is computed as the mean predicted regression targets of the trees in the forest.
Parameters: X (array-like or sparse matrix of shape = [n_samples, n_features]) – The input samples. Internally, its dtype will be converted to dtype=np.float32. If a sparse matrix is provided, it will be converted into a sparse csr_matrix. Returns: y – The predicted values. Return type: array of shape = [n_samples] or [n_samples, n_outputs]


class
skll.learner.
RescaledRidge
(alpha=1.0, fit_intercept=True, normalize=False, copy_X=True, max_iter=None, tol=0.001, solver='auto', random_state=None)[source]¶ Bases:
sklearn.linear_model.ridge.Ridge

fit
(X, y, sample_weight=None)¶ Fit Ridge regression model.
Parameters:  X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Training data.
 y (array-like, shape = [n_samples] or [n_samples, n_targets]) – Target values.
 sample_weight (float or numpy array of shape [n_samples]) – Individual weights for each sample.
Returns: self
Return type: returns an instance of self.

predict
(X)¶ Predict using the linear model
Parameters: X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples. Returns: C – Returns predicted values. Return type: array, shape (n_samples,)


class
skll.learner.
RescaledSGDRegressor
(loss='squared_loss', penalty='l2', alpha=0.0001, l1_ratio=0.15, fit_intercept=True, max_iter=1000, tol=0.001, shuffle=True, verbose=0, epsilon=0.1, random_state=None, learning_rate='invscaling', eta0=0.01, power_t=0.25, early_stopping=False, validation_fraction=0.1, n_iter_no_change=5, warm_start=False, average=False)[source]¶ Bases:
sklearn.linear_model.stochastic_gradient.SGDRegressor

fit
(X, y, coef_init=None, intercept_init=None, sample_weight=None)¶ Fit linear model with Stochastic Gradient Descent.
Parameters:  X ({array-like, sparse matrix}, shape (n_samples, n_features)) – Training data.
 y (numpy array, shape (n_samples,)) – Target values.
 coef_init (array, shape (n_features,)) – The initial coefficients to warm-start the optimization.
 intercept_init (array, shape (1,)) – The initial intercept to warm-start the optimization.
 sample_weight (array-like, shape (n_samples,), optional) – Weights applied to individual samples (1. for unweighted).
Returns: self
Return type: returns an instance of self.

predict
(X)¶ Predict using the linear model
Parameters: X ({array-like, sparse matrix}, shape (n_samples, n_features)) – Returns: Predicted target values per element in X. Return type: array, shape (n_samples,)
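The coef_init/intercept_init parameters let a second fit start from a previously learned solution; a minimal sketch with the base `SGDRegressor` (data and hyperparameters are ours):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])

sgd = SGDRegressor(max_iter=1000, tol=1e-3, random_state=0)
sgd.fit(X, y)

# Warm-start a second fit from the previously learned coefficients
sgd.fit(X, y,
        coef_init=sgd.coef_.copy(),            # shape (n_features,)
        intercept_init=sgd.intercept_.copy())  # shape (1,)
preds = sgd.predict(X)  # array, shape (n_samples,)
```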


class
skll.learner.
RescaledSVR
(kernel='rbf', degree=3, gamma='auto_deprecated', coef0=0.0, tol=0.001, C=1.0, epsilon=0.1, shrinking=True, cache_size=200, verbose=False, max_iter=-1)[source]¶ Bases:
sklearn.svm.classes.SVR

fit
(X, y, sample_weight=None)¶ Fit the SVM model according to the given training data.
Parameters:  X ({array-like, sparse matrix}, shape (n_samples, n_features)) – Training vectors, where n_samples is the number of samples and n_features is the number of features. For kernel=”precomputed”, the expected shape of X is (n_samples, n_samples).
 y (array-like, shape (n_samples,)) – Target values (class labels in classification, real numbers in regression).
 sample_weight (array-like, shape (n_samples,)) – Per-sample weights. Rescale C per sample. Higher weights force the classifier to put more emphasis on these points.
Returns: self
Return type: object
Notes
If X and y are not C-ordered and contiguous arrays of np.float64 and X is not a scipy.sparse.csr_matrix, X and/or y may be copied.
If X is a dense array, then the other methods will not support sparse matrices as input.

predict
(X)¶ Perform regression on samples in X.
For a one-class model, +1 (inlier) or -1 (outlier) is returned.
Parameters: X ({array-like, sparse matrix}, shape (n_samples, n_features)) – For kernel=”precomputed”, the expected shape of X is (n_samples_test, n_samples_train). Returns: y_pred Return type: array, shape (n_samples,)


class
skll.learner.
RescaledTheilSenRegressor
(fit_intercept=True, copy_X=True, max_subpopulation=10000.0, n_subsamples=None, max_iter=300, tol=0.001, random_state=None, n_jobs=None, verbose=False)[source]¶ Bases:
sklearn.linear_model.theil_sen.TheilSenRegressor

fit
(X, y)¶ Fit linear model.
Parameters:  X (numpy array of shape [n_samples, n_features]) – Training data.
 y (numpy array of shape [n_samples]) – Target values.
Returns: self
Return type: returns an instance of self.

predict
(X)¶ Predict using the linear model
Parameters: X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples. Returns: C – Returns predicted values. Return type: array, shape (n_samples,)


class
skll.learner.
SelectByMinCount
(min_count=1)[source]¶ Bases:
sklearn.feature_selection.univariate_selection.SelectKBest
Select features occurring in more (and/or fewer) than a specified number of examples in the training data (or a CV training fold).
Parameters: min_count (int, optional) – The minimum feature count to select. Defaults to 1.
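The selection rule can be illustrated with a short sketch (this is our own reimplementation of the idea, not SKLL's actual code): keep only the feature columns that are nonzero in at least `min_count` training examples.

```python
import numpy as np

def select_by_min_count(X, min_count=1):
    """Illustrative sketch: keep columns of X that are nonzero
    in at least `min_count` training examples."""
    X = np.asarray(X)
    counts = np.count_nonzero(X, axis=0)  # examples per feature
    mask = counts >= min_count
    return X[:, mask], mask

X = np.array([[1, 0, 2],
              [0, 0, 3],
              [4, 0, 0]])
# Column 1 is nonzero in zero examples, so it is dropped
X_new, kept = select_by_min_count(X, min_count=2)
```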

skll.learner.
rescaled
(cls)[source]¶ Decorator to create regressors that store a min and a max for the training data and make sure that predictions fall within that range. It also stores the means and SDs of the gold standard and the predictions on the training set to rescale the predictions (e.g., as in e-rater).
Parameters: cls (BaseEstimator) – An estimator class to add rescaling to. Returns: cls – Modified version of the estimator class with rescaled functions added. Return type: BaseEstimator Raises: ValueError
– If the estimator cannot be rescaled (i.e., it is not a regressor).
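The rescaling behavior described above can be sketched as a subclass of a plain scikit-learn regressor (a hypothetical illustration of the idea, not SKLL's actual decorator): store the training targets' range and the mean/SD of both the gold standard and the training-set predictions, then map new predictions onto the gold-standard scale and clip them to the training range.

```python
import numpy as np
from sklearn.linear_model import Ridge

class RescaledRidgeSketch(Ridge):
    """Illustrative sketch of what the `rescaled` decorator adds."""

    def fit(self, X, y, **kwargs):
        super().fit(X, y, **kwargs)
        train_preds = super().predict(X)
        # Statistics used to map predictions onto the gold-standard scale
        self._y_mean, self._y_sd = np.mean(y), np.std(y)
        self._p_mean, self._p_sd = np.mean(train_preds), np.std(train_preds)
        self._y_min, self._y_max = np.min(y), np.max(y)
        return self

    def predict(self, X):
        preds = super().predict(X)
        # Rescale to the gold-standard mean/SD, then clip to the training range
        preds = (preds - self._p_mean) / self._p_sd * self._y_sd + self._y_mean
        return np.clip(preds, self._y_min, self._y_max)

rng = np.random.RandomState(0)
X = rng.rand(100, 3)
y = 10.0 * X[:, 0] + rng.randn(100)
preds = RescaledRidgeSketch(alpha=1.0).fit(X, y).predict(X)
```

By construction, every prediction falls within the observed range of the training targets.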