DWD Classifiers
- class dwd.socp_dwd.DWD(C=1.0, solver_kws=None)
- property direction
The separating hyperplane is of the form ‘p.d = d.i’, where ‘.’ is the dot product. If ‘p.d < d.i’, then ‘p’ is classified label 0. If ‘p.d > d.i’ then it is classified label 1.
- Returns
direction (np.ndarray) – The DWD separating direction; normal to the hyperplane.
intercept (float) – The intercept of the separating hyperplane.
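The decision rule for this property amounts to projecting a point onto the direction and comparing the result against a threshold. A minimal sketch in plain Python follows. It assumes the right-hand side ‘d.i’ has already been reduced to a scalar, here called threshold; that name and the classify helper are hypothetical, and the text does not spell out how the scalar is formed from direction and intercept.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def classify(p, direction, threshold):
    """Return 0 if p.d < threshold, 1 if p.d > threshold, None on the hyperplane."""
    projection = dot(p, direction)
    if projection < threshold:
        return 0
    if projection > threshold:
        return 1
    return None  # the stated rule leaves points exactly on the hyperplane unassigned


# Illustrative values only; not taken from a fitted model.
direction = [1.0, 0.0]
threshold = 0.5
labels = [classify(p, direction, threshold) for p in ([0.0, 3.0], [2.0, -1.0])]
```

Only the component of a point along the direction matters, so the second coordinate of each point above has no effect on its label.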
- fit(X, y, sample_weight=None)
Fit the model according to the given training data.
- Parameters
X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Training vector, where n_samples is the number of samples and n_features is the number of features.
y (array-like, shape = [n_samples]) – Target vector relative to X
sample_weight (array-like, shape = [n_samples], optional) – Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
- Returns
self
- Return type
object
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
- score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- Parameters
X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns
score – Mean accuracy of self.predict(X) with respect to y.
- Return type
float
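Mean accuracy with optional sample weights can be sketched as follows. This is a plain-Python illustration of the quantity described, not the library's code. The helper name weighted_accuracy is hypothetical, and unit weights are assumed when none are given.

```python
def weighted_accuracy(y_true, y_pred, sample_weight=None):
    """Fraction of total sample weight that falls on correctly predicted samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)  # assumed default: unit weights
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
unweighted = weighted_accuracy(y_true, y_pred)
weighted = weighted_accuracy(y_true, y_pred, [1.0, 1.0, 2.0, 1.0])
```

Doubling the weight of the one misclassified sample lowers the score from 3/4 to 3/5.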
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
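The <component>__<parameter> naming convention can be illustrated by splitting each name at its first double underscore. The sketch below only demonstrates the naming scheme and is not the library's implementation. The helper route_params and the component name clf are hypothetical.

```python
def route_params(params):
    """Split parameters into the estimator's own and those addressed to sub-components."""
    own, nested = {}, {}
    for name, value in params.items():
        component, sep, sub_name = name.partition("__")
        if sep:
            # "clf__q" targets parameter "q" of the component named "clf"
            nested.setdefault(component, {})[sub_name] = value
        else:
            own[name] = value
    return own, nested


own, nested = route_params({"C": 10.0, "clf__lambd": 0.1, "clf__q": 2})
```

Here own holds the top-level parameter C, while nested groups lambd and q under the component clf.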
- class dwd.gen_dwd.GenDWD(lambd=1.0, q=1, implicit_P=True)
Generalized Distance Weighted Discrimination
Solves the gDWD problem using the MM algorithm derived in Wang and Zou, 2017.
Primary reference: Another look at distance-weighted discrimination by Boxiang Wang and Hui Zou, 2017
Note that the tuning parameter lambd is on a different scale than the parameter C used in the SOCP formulation.
- Parameters
lambd (float) – Tuning parameter for DWD.
q (float) – Tuning parameter for generalized DWD (the exponent on the margin terms). When q = 1, gDWD is equivalent to DWD.
implicit_P (bool) – Whether to use the implicit P^{-1} gamma formulation (in the publication) or the explicit computation (in the arxiv version).
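For orientation, the role of q as an exponent on the margin terms can be seen in the generalized DWD loss. The sketch below writes the loss as it is commonly stated for the Wang and Zou formulation. Treat the exact form as an assumption to check against the paper; gdwd_loss is a hypothetical helper and not part of this package's API.

```python
def gdwd_loss(u, q=1.0):
    """Generalized DWD loss at margin u = y * f(x), for q > 0.

    Linear for u <= q / (q + 1), then decaying like u ** (-q) beyond that point.
    """
    if q <= 0:
        raise ValueError("q must be positive")
    knot = q / (q + 1.0)
    if u <= knot:
        return 1.0 - u
    return (q ** q) / ((q + 1.0) ** (q + 1.0)) / (u ** q)


# With q = 1 the two pieces meet at u = 0.5, where both equal 0.5.
losses_q1 = [gdwd_loss(u, q=1.0) for u in (0.0, 0.5, 1.0, 2.0)]
```

The two pieces agree in value and slope at u = q / (q + 1), so the loss is smooth there for any positive q.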
- class dwd.gen_kern_dwd.KernGDWD(lambd=1.0, q=1.0, kernel='linear', kernel_kws={}, implicit_P=True)
Kernel Generalized Distance Weighted Discrimination
Solves the kernel gDWD problem using the MM algorithm derived in Wang and Zou, 2017.
Primary reference: Another look at distance-weighted discrimination by Boxiang Wang and Hui Zou, 2017
Note that the tuning parameter lambd is on a different scale than the parameter C used in the SOCP formulation.
- Parameters
lambd (float) – Tuning parameter for DWD.
q (float) – Tuning parameter for generalized DWD (the exponent on the margin terms). When q = 1, gDWD is equivalent to DWD.
kernel (str, callable(X, Y, **kwargs)) – The kernel to use.
kernel_kws (dict) – Any key word arguments for the kernel.
implicit_P (bool) – Whether to use the implicit P^{-1} gamma formulation (in the publication) or the explicit computation (in the arxiv version).
Cross-validation
- class dwd.gen_dwd.GenDWDCV(lambd_vals=array([0.01, 0.027825594, 0.0774263683, 0.215443469, 0.59948425, 1.66810054, 4.64158883, 12.9154967, 35.9381366, 100.0]), q_vals=array([0.01, 0.1, 1.0, 10.0, 100.0]), cv=5, scoring='accuracy')
Fits generalized DWD with cross-validation. gDWD cross-validation can be significantly faster if certain quantities are precomputed.
- Parameters
lambd_vals (list of floats) – The lambda values to cross-validate over.
q_vals (list of floats) – The q values to cross-validate over.
cv – How to perform cross-validation. See documentation in sklearn.model_selection.GridSearchCV.
scoring – What metric to use to score cross-validation. See documentation in sklearn.model_selection.GridSearchCV.
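The default grids in the GenDWDCV signature are evenly spaced on a log scale from 0.01 to 100, with ten points for lambd_vals and five for q_vals. The sketch below rebuilds them with plain Python. The helper log_grid is hypothetical, and how the package itself constructs its defaults is not stated here.

```python
def log_grid(low_exp, high_exp, num):
    """Return num values evenly spaced on a log10 scale from 10**low_exp to 10**high_exp."""
    if num < 2:
        raise ValueError("num must be at least 2")
    step = (high_exp - low_exp) / (num - 1)
    return [10.0 ** (low_exp + k * step) for k in range(num)]


lambd_vals = log_grid(-2, 2, 10)  # 0.01, 0.0278..., ..., 35.9..., 100.0
q_vals = log_grid(-2, 2, 5)       # 0.01, 0.1, 1.0, 10.0, 100.0
```

A custom grid of the same shape, for example a narrower range around a promising lambd, can be passed in place of the defaults.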
- class dwd.gen_kern_dwd.KernGDWDCV(lambd_vals=array([0.01, 0.027825594, 0.0774263683, 0.215443469, 0.59948425, 1.66810054, 4.64158883, 12.9154967, 35.9381366, 100.0]), q_vals=array([0.01, 0.1, 1.0, 10.0, 100.0]), kernel='linear', kernel_kws_vals={}, cv=5, scoring='accuracy')
Fits kernel gDWD with cross-validation. gDWD cross-validation can be significantly faster if certain quantities are precomputed.
- Parameters
lambd_vals (list of floats) – The lambda values to cross-validate over.
q_vals (list of floats) – The q values to cross-validate over.
kernel (str, callable) – The kernel to use.
kernel_kws_vals (list of dicts) – The kernel parameters to validate over.
cv – How to perform cross-validation. See documentation in sklearn.model_selection.GridSearchCV.
scoring – What metric to use to score cross-validation. See documentation in sklearn.model_selection.GridSearchCV.
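A list of dicts for the kernel parameters can be built from per-parameter candidate values. The sketch below uses itertools.product from the standard library. The helper kwargs_grid is hypothetical, and the parameter names gamma and degree are placeholders that depend on the kernel chosen.

```python
from itertools import product


def kwargs_grid(candidates):
    """Expand {name: [values, ...]} into a list of keyword-argument dicts."""
    names = list(candidates)
    value_lists = [candidates[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Two candidate values per parameter give four keyword-argument dicts.
kernel_kws_vals = kwargs_grid({"gamma": [0.1, 1.0], "degree": [2, 3]})
```

Each dict in the resulting list is one candidate set of kernel keyword arguments.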