deepod.models.DeepSVDD

class deepod.models.DeepSVDD(epochs=100, batch_size=64, lr=0.001, rep_dim=128, hidden_dims='100,50', act='ReLU', bias=False, epoch_steps=-1, prt_steps=10, device='cuda', verbose=2, random_state=42)
Deep One-class Classification for Anomaly Detection (ICML’18)

[BRVG+18]

Parameters:
  • epochs (int, optional (default=100)) – Number of training epochs

  • batch_size (int, optional (default=64)) – Number of samples in a mini-batch

  • lr (float, optional (default=1e-3)) – Learning rate

  • rep_dim (int, optional (default=128)) – Dimensionality of the representation space

  • hidden_dims (list, str or int, optional (default='100,50')) –

    Number of units in each hidden layer
    • If list, each item is the number of units in one hidden layer

    • If str, the comma-separated numbers of units of the hidden layers

    • If int, the number of units in a single hidden layer

  • act (str, optional (default='ReLU')) – Name of the activation layer; one of ['ReLU', 'LeakyReLU', 'Sigmoid', 'Tanh']

  • bias (bool, optional (default=False)) – Additive bias in linear layer

  • epoch_steps (int, optional (default=-1)) –

    Maximum steps in an epoch
    • If -1, all the batches will be processed

  • prt_steps (int, optional (default=10)) – Number of epochs between progress printouts

  • device (str, optional (default='cuda')) – Torch device to run on, e.g. 'cuda' or 'cpu'

  • verbose (int, optional (default=2)) – Verbosity mode

  • random_state (int, optional (default=42)) – The seed used by the random number generator
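
The accepted forms of hidden_dims and the meaning of epoch_steps can be illustrated with a small sketch. The helpers parse_hidden_dims and steps_per_epoch below are hypothetical and are not part of deepod; they only mirror the parameter descriptions above, and the sketch assumes that a final partial batch counts as one step.

```python
import math


def parse_hidden_dims(hidden_dims):
    """Hypothetical helper: normalise hidden_dims to a list of layer widths."""
    if isinstance(hidden_dims, int):
        # A single int means one hidden layer of that width.
        return [hidden_dims]
    if isinstance(hidden_dims, str):
        # A string holds comma-separated widths, e.g. '100,50'.
        return [int(part) for part in hidden_dims.split(",") if part.strip()]
    # Otherwise treat it as a list with one width per hidden layer.
    return [int(d) for d in hidden_dims]


def steps_per_epoch(n_samples, batch_size, epoch_steps=-1):
    """Hypothetical helper: number of mini-batches processed in one epoch."""
    # Assumption: the last, possibly smaller, batch is kept.
    n_batches = math.ceil(n_samples / batch_size)
    if epoch_steps == -1:
        # -1 means every batch is processed.
        return n_batches
    # Otherwise epoch_steps is an upper bound on the number of steps.
    return min(epoch_steps, n_batches)


print(parse_hidden_dims("100,50"))
print(parse_hidden_dims([100, 50]))
print(parse_hidden_dims(64))
print(steps_per_epoch(1000, 64))
print(steps_per_epoch(1000, 64, epoch_steps=5))
```

Under these assumptions, the default '100,50' describes two hidden layers of 100 and 50 units, and the default epoch_steps=-1 places no cap on the number of batches per epoch.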

decision_scores_

The outlier scores of the training data. Higher scores indicate more abnormal samples. This value is available once the detector is fitted.

Type:

numpy array of shape (n_samples,)

threshold_

The threshold is derived from contamination: it is the score that separates the n_samples * contamination most abnormal samples in decision_scores_ from the rest. It is used to generate binary outlier labels.

Type:

float

labels_

The binary labels of the training data. 0 stands for inliers and 1 for outliers/anomalies. They are generated by applying threshold_ to decision_scores_.

Type:

numpy array of shape (n_samples,), each entry an int, either 0 or 1
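
The relationship between decision_scores_, threshold_ and labels_ described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: threshold_and_labels is a hypothetical helper, the contamination value is passed in explicitly because it does not appear in the constructor signature shown here, and the library may instead use an interpolated percentile or a strict comparison.

```python
def threshold_and_labels(scores, contamination=0.1):
    """Hypothetical helper: derive a threshold and binary labels from scores."""
    # Number of training samples to flag as outliers (at least one).
    n_outliers = max(1, int(round(len(scores) * contamination)))
    # Rank scores from most to least abnormal.
    ranked = sorted(scores, reverse=True)
    # Assumption: the threshold is the score of the least abnormal flagged sample.
    threshold = ranked[n_outliers - 1]
    # 1 marks an outlier, 0 an inlier.
    labels = [1 if s >= threshold else 0 for s in scores]
    return threshold, labels


scores = [0.1, 0.2, 0.15, 0.9, 0.12, 0.3, 0.11, 0.14, 0.13, 0.16]
threshold, labels = threshold_and_labels(scores, contamination=0.2)
print(threshold)
print(labels)
```

With ten scores and a contamination of 0.2, the two highest-scoring samples are labelled 1 and the rest 0.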

Methods

__init__([epochs, batch_size, lr, rep_dim, ...])

decision_function(X[, return_rep])

Predict raw anomaly scores of X using the fitted detector.

decision_function_update(z, scores)

Hook for any update operation after the decision function.

epoch_update()

Hook for any update operation after each training epoch.

fit(X[, y])

Fit detector.

fit_auto_hyper(X[, y, X_test, y_test, ...])

Fit detector with automatic hyper-parameter tuning.

inference_forward(batch_x, net, criterion)

Define the forward step in inference.

inference_prepare(X)

Define test_loader.

load_model(path)

load_ray_checkpoint(best_config, best_checkpoint)

predict(X[, return_confidence])

Predict if a particular sample is an outlier or not.

save_model(path)

set_seed(seed)

set_tuned_net(config)

set_tuned_params()

training_forward(batch_x, net, criterion)

Define the forward step in training.

training_prepare(X, y)

Define train_loader, net, and criterion.

decision_function(X, return_rep=False)

Predict raw anomaly scores of X using the fitted detector.

The anomaly score of an input sample is computed based on the fitted detector. For consistency, outliers are assigned with higher anomaly scores.

Parameters:
  • X (numpy array of shape (n_samples, n_features)) – The input samples. Sparse matrices are accepted only if they are supported by the base estimator.

  • return_rep (boolean, optional, default=False) – Whether to also return the learned representations

Returns:

anomaly_scores – The anomaly score of the input samples.

Return type:

numpy array of shape (n_samples,)
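
As background for how such scores arise, the one-class objective in the cited Deep SVDD paper scores a sample by the squared Euclidean distance between its learned representation and a fixed center. The sketch below illustrates that idea in plain Python; svdd_scores, the center, and the toy representations are all hypothetical, and this page does not confirm that decision_function is implemented exactly this way.

```python
def svdd_scores(representations, center):
    """Hypothetical helper: squared Euclidean distance of each row to the center."""
    return [
        sum((z_i - c_i) ** 2 for z_i, c_i in zip(z, center))
        for z in representations
    ]


# Toy 2-dimensional representations; in practice these would come from the
# trained network and have rep_dim entries each.
center = [0.0, 0.0]
reps = [[0.1, -0.1], [0.0, 0.2], [3.0, 4.0]]

scores = svdd_scores(reps, center)
print(scores)
```

The third sample lies far from the center and therefore receives by far the largest score, consistent with the convention that outliers get higher anomaly scores.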

decision_function_update(z, scores)

Hook for any update operation after the decision function.

epoch_update()

Hook for any update operation after each training epoch.

fit(X, y=None)

Fit detector. y is ignored in unsupervised methods.

Parameters:
  • X (numpy array of shape (n_samples, n_features)) – The input samples.

  • y (numpy array of shape (n_samples, )) – Not used in unsupervised methods; present for API consistency by convention. Used in (semi-/weakly-) supervised methods.

Returns:

self – Fitted estimator.

Return type:

object

fit_auto_hyper(X, y=None, X_test=None, y_test=None, n_ray_samples=5, time_budget_s=None)

Fit detector with automatic hyper-parameter tuning via Ray. y is ignored in unsupervised methods.

Parameters:
  • X (numpy array of shape (n_samples, n_features)) – The input samples.

  • y (numpy array of shape (n_samples, )) – Not used in unsupervised methods; present for API consistency by convention. Used in (semi-/weakly-) supervised methods.

  • X_test (numpy array of shape (n_samples, n_features), default=None) – The input testing samples for hyper-parameter tuning.

  • y_test (numpy array of shape (n_samples, ), default=None) – Label of input testing samples for hyper-parameter tuning.

  • n_ray_samples (int, default=5) – Number of times to sample from the hyperparameter space

  • time_budget_s (int, default=None) – Global time budget in seconds after which all trials of Ray are stopped.

Returns:

config – The tuned hyper-parameters.

Return type:

dict

inference_forward(batch_x, net, criterion)

Define the forward step in inference.

inference_prepare(X)

Define test_loader.

predict(X, return_confidence=False)

Predict if a particular sample is an outlier or not.

Parameters:
  • X (numpy array of shape (n_samples, n_features)) – The input samples.

  • return_confidence (boolean, optional (default=False)) – If True, also return the confidence of prediction.

Returns:

  • outlier_labels (numpy array of shape (n_samples,)) – For each observation, tells whether it should be considered as an outlier according to the fitted model. 0 stands for inliers and 1 for outliers.

  • confidence (numpy array of shape (n_samples,)) – Returned only if return_confidence is set to True.

training_forward(batch_x, net, criterion)

Define the forward step in training.

training_prepare(X, y)

Define train_loader, net, and criterion.