DWD Classifiers

class dwd.socp_dwd.DWD(C=1.0, solver_kws=None)
property direction

The separating hyperplane is of the form ‘p.d = d.i’, where ‘.’ is the dot product. If ‘p.d < d.i’, then ‘p’ is classified label 0. If ‘p.d > d.i’ then it is classified label 1.

Returns

  • direction (np.ndarray) – The DWD separating direction; normal to the hyperplane.

  • intercept (float) – The intercept of the separating hyperplane.

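The decision rule described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: `classify` and its `threshold` argument are hypothetical names that are not part of dwd, `threshold` stands in for the right-hand side of the hyperplane equation, and the numbers are made up.

```python
import numpy as np

def classify(points, direction, threshold):
    """Label each row of `points` 0 or 1 by its projection onto `direction`.

    Hypothetical helper, not part of dwd. `threshold` stands in for the
    right-hand side of the hyperplane equation described above.
    """
    projections = np.asarray(points) @ np.asarray(direction)
    # Projection below the threshold -> label 0, above -> label 1.
    return (projections > threshold).astype(int)

direction = np.array([1.0, 0.0])                 # made-up normal vector
points = np.array([[-2.0, 3.0], [4.0, -1.0]])    # two made-up samples
labels = classify(points, direction, threshold=0.5)
```

Only the projection onto the direction matters here: the second coordinate of each point has no effect because the made-up direction is zero there.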

fit(X, y, sample_weight=None)

Fit the model according to the given training data.

Parameters
  • X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Training vector, where n_samples is the number of samples and n_features is the number of features.

  • y (array-like, shape = [n_samples]) – Target vector relative to X.

  • sample_weight (array-like, shape = [n_samples], optional) – Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.

Returns

self

Return type

object
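The shape requirements and the unit-weight default for sample_weight can be illustrated with a small helper. This is a sketch for dense inputs only: `resolve_sample_weight` is a hypothetical name and does not claim to mirror how dwd validates its arguments internally.

```python
import numpy as np

def resolve_sample_weight(X, y, sample_weight=None):
    """Return weights of shape (n_samples,), defaulting to all ones.

    Hypothetical helper for dense inputs; not part of dwd.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must have the same number of samples")
    if sample_weight is None:
        # No weights given: every sample gets unit weight.
        return np.ones(X.shape[0])
    sample_weight = np.asarray(sample_weight, dtype=float)
    if sample_weight.shape != (X.shape[0],):
        raise ValueError("sample_weight must have shape (n_samples,)")
    return sample_weight

X = np.zeros((4, 2))            # 4 samples, 2 features
y = np.array([0, 1, 0, 1])
default_weights = resolve_sample_weight(X, y)
custom_weights = resolve_sample_weight(X, y, [1, 2, 1, 2])
```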

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

dict

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.

  • sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns

score – Mean accuracy of self.predict(X) with respect to y.

Return type

float
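To make the metric concrete, the sketch below computes a (weighted) mean accuracy and the multi-label subset accuracy by hand. `mean_accuracy` is a hypothetical helper written for illustration; it takes predictions directly instead of calling self.predict(X).

```python
import numpy as np

def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of correctly predicted samples (illustrative)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    correct = y_true == y_pred
    if correct.ndim > 1:
        # Multi-label case: a sample counts only if its whole label set matches.
        correct = correct.all(axis=1)
    return float(np.average(correct.astype(float), weights=sample_weight))

plain = mean_accuracy([0, 1, 1, 0], [0, 1, 0, 0])
weighted = mean_accuracy([0, 1, 1, 0], [0, 1, 0, 0], sample_weight=[1, 1, 2, 1])
subset = mean_accuracy([[1, 0], [0, 1]], [[1, 0], [1, 1]])
```

In the weighted call the misclassified sample carries weight 2 out of a total of 5, so it lowers the score more than in the unweighted call.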

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

estimator instance
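The `<component>__<parameter>` convention can be illustrated with a toy container. The classes below are hypothetical stand-ins written for this sketch; they belong to neither dwd nor scikit-learn, and only mimic how nested parameter names are exposed and routed.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator with one parameter."""

    def __init__(self, C=1.0):
        self.C = C

    def get_params(self, deep=True):
        return {"C": self.C}

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self


class ToyPipeline:
    """Hypothetical container holding named estimators."""

    def __init__(self, **steps):
        self.steps = steps

    def get_params(self, deep=True):
        params = dict(self.steps)
        if deep:
            # Expose each component's parameters as <component>__<parameter>.
            for name, step in self.steps.items():
                for key, value in step.get_params().items():
                    params[f"{name}__{key}"] = value
        return params

    def set_params(self, **params):
        for key, value in params.items():
            component, sep, parameter = key.partition("__")
            if sep:
                # Route "clf__C" to the component named "clf".
                self.steps[component].set_params(**{parameter: value})
            else:
                self.steps[key] = value
        return self


pipe = ToyPipeline(clf=ToyEstimator(C=1.0))
returned = pipe.set_params(clf__C=10.0)   # returns the container itself
nested = pipe.get_params(deep=True)       # includes the key "clf__C"
```

Returning self from set_params is what allows calls to be chained, for example a set_params call followed directly by fit.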

class dwd.gen_dwd.GenDWD(lambd=1.0, q=1, implicit_P=True)

Generalized Distance Weighted Discrimination

Solves the gDWD problem using the MM algorithm derived in Wang and Zou, 2017.

Primary reference: Another look at distance-weighted discrimination by Boxiang Wang and Hui Zou, 2017

Note that the tuning parameter lambd is on a different scale from the parameter C used in the SOCP formulation.

Parameters
  • lambd (float) – Tuning parameter for DWD.

  • q (float) – Tuning parameter for generalized DWD (the exponent on the margin terms). When q = 1, gDWD is equivalent to DWD.

  • implicit_P (bool) – Whether to use the implicit P^{-1} gamma formulation (in the publication) or the explicit computation (in the arXiv version).
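The role of q as an exponent is easiest to see in the loss itself. The sketch below writes out the generalized DWD loss in the form given in the cited Wang and Zou paper, to the best of our reading; `gdwd_loss` is a hypothetical function name, and the formula should be checked against the reference before relying on it.

```python
def gdwd_loss(u, q=1.0):
    """Generalized DWD loss V_q(u) for a margin value u (requires q > 0).

    Illustrative only; not taken from the dwd source code.
    """
    threshold = q / (q + 1.0)
    if u <= threshold:
        # Linear part, like a hinge loss.
        return 1.0 - u
    # Smooth tail that decays like u ** (-q).
    return (q ** q) / ((q + 1.0) ** (q + 1.0)) / (u ** q)

at_threshold = gdwd_loss(0.5, q=1.0)   # the two branches meet at u = q / (q + 1)
dwd_tail = gdwd_loss(2.0, q=1.0)       # q = 1 gives the DWD tail 1 / (4 * u)
```

Larger values of q make the tail decay faster, so well-separated points contribute less to the objective.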

class dwd.gen_kern_dwd.KernGDWD(lambd=1.0, q=1.0, kernel='linear', kernel_kws={}, implicit_P=True)

Kernel Generalized Distance Weighted Discrimination

Solves the kernel gDWD problem using the MM algorithm derived in Wang and Zou, 2017.

Primary reference: Another look at distance-weighted discrimination by Boxiang Wang and Hui Zou, 2017

Note that the tuning parameter lambd is on a different scale from the parameter C used in the SOCP formulation.

Parameters
  • lambd (float) – Tuning parameter for DWD.

  • q (float) – Tuning parameter for generalized DWD (the exponent on the margin terms). When q = 1, gDWD is equivalent to DWD.

  • kernel (str, callable(X, Y, **kwargs)) – The kernel to use.

  • kernel_kws (dict) – Any keyword arguments for the kernel.

  • implicit_P (bool) – Whether to use the implicit P^{-1} gamma formulation (in the publication) or the explicit computation (in the arXiv version).
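A callable kernel with the documented signature callable(X, Y, **kwargs) might look like the Gaussian kernel below. This is a sketch: `rbf_kernel` and its `gamma` argument are illustrative names, and forwarding kernel_kws as keyword arguments is an assumption based on the signature above, not a confirmed detail of the dwd implementation.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Pairwise squared distances via broadcasting: shape (n_X, n_Y).
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

kernel_kws = {"gamma": 0.5}    # assumed to be forwarded as **kwargs
X = np.array([[0.0, 0.0], [1.0, 0.0]])
K = rbf_kernel(X, X, **kernel_kws)
```

Under that assumption, the pair would be supplied as kernel=rbf_kernel and kernel_kws={"gamma": 0.5}.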

Cross-validation

class dwd.gen_dwd.GenDWDCV(lambd_vals=array([0.01, 0.027825594, 0.0774263683, 0.215443469, 0.59948425, 1.66810054, 4.64158883, 12.9154967, 35.9381366, 100.0]), q_vals=array([0.01, 0.1, 1.0, 10.0, 100.0]), cv=5, scoring='accuracy')

Fits Generalized DWD with cross-validation. gDWD cross-validation can be significantly faster if certain quantities are precomputed.

Parameters
  • lambd_vals (list of floats) – The lambda values to cross-validate over.

  • q_vals (list of floats) – The q values to cross-validate over.

  • cv – How to perform cross-validation. See documentation in sklearn.model_selection.GridSearchCV.

  • scoring – What metric to use to score cross-validation. See documentation in sklearn.model_selection.GridSearchCV.
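The default grids in the signature above match logarithmically spaced values between 0.01 and 100, which the sketch below reproduces with np.logspace. The fit count is a rough estimate that assumes an exhaustive grid search with one fit per fold and no final refit; the precomputation mentioned above may make the actual cost lower.

```python
import numpy as np
from itertools import product

# Reproduce the default grids shown in the class signature.
lambd_vals = np.logspace(-2, 2, 10)   # 0.01, 0.0278..., ..., 100
q_vals = np.logspace(-2, 2, 5)        # 0.01, 0.1, 1, 10, 100

# Every (lambd, q) pair is one candidate setting.
grid = list(product(lambd_vals, q_vals))

# Assumed cost model: one fit per candidate per fold, with cv=5.
n_fits = len(grid) * 5
```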

class dwd.gen_kern_dwd.KernGDWDCV(lambd_vals=array([0.01, 0.027825594, 0.0774263683, 0.215443469, 0.59948425, 1.66810054, 4.64158883, 12.9154967, 35.9381366, 100.0]), q_vals=array([0.01, 0.1, 1.0, 10.0, 100.0]), kernel='linear', kernel_kws_vals={}, cv=5, scoring='accuracy')

Fits kernel gDWD with cross-validation. gDWD cross-validation can be significantly faster if certain quantities are precomputed.

Parameters
  • lambd_vals (list of floats) – The lambda values to cross-validate over.

  • q_vals (list of floats) – The q values to cross-validate over.

  • kernel (str, callable) – The kernel to use.

  • kernel_kws_vals (list of dicts) – The kernel parameters to cross-validate over.

  • cv – How to perform cross-validation. See documentation in sklearn.model_selection.GridSearchCV.

  • scoring – What metric to use to score cross-validation. See documentation in sklearn.model_selection.GridSearchCV.
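With a list of kernel keyword dictionaries, the search space becomes the product of three lists. The sketch below enumerates the candidates; the `gamma` entries and the short value lists are made up for illustration, and the enumeration does not claim to reflect how dwd organizes its search internally.

```python
from itertools import product

lambd_vals = [0.1, 1.0]
q_vals = [1.0, 10.0]
# Hypothetical kernel settings: one dict per candidate kernel configuration.
kernel_kws_vals = [{"gamma": 0.1}, {"gamma": 1.0}]

# One candidate per (lambd, q, kernel keyword dict) combination.
candidates = [
    {"lambd": lambd, "q": q, "kernel_kws": kws}
    for lambd, q, kws in product(lambd_vals, q_vals, kernel_kws_vals)
]
```

The number of candidates grows multiplicatively, so short lists of kernel settings keep the search manageable.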