deepod.models.NCAD

class deepod.models.NCAD(epochs=100, batch_size=64, lr=0.0003, seq_len=100, stride=1, suspect_win_len=10, coe_rate=0.5, mixup_rate=2.0, hidden_dims='32,32,32,32', rep_dim=128, act='ReLU', bias=False, kernel_size=5, dropout=0.0, epoch_steps=-1, prt_steps=10, device='cuda', verbose=2, random_state=42)

Neural Contextual Anomaly Detection for Time Series. (IJCAI’22)

It extends the BaseDeepAD class to implement anomaly detection specifically for time series data.

Parameters:
  • epochs – (int, optional) The number of epochs to train the model (default is 100).

  • batch_size – (int, optional) The number of samples per batch to load (default is 64).

  • lr – (float, optional) Learning rate for the optimizer (default is 3e-4).

  • seq_len – (int, optional) Length of the input sequences for the model (default is 100).

  • stride – (int, optional) The stride of the window during training (default is 1).

  • suspect_win_len – (int, optional) The length of the window considered as suspect for anomaly (default is 10).

  • coe_rate – (float, optional) Rate at which contextual outlier exposure is applied (default is 0.5).

  • mixup_rate – (float, optional) Rate at which mixup is applied (default is 2.0).

  • hidden_dims – (list, str, or int, optional) The hidden dimensions of the neural network layers (default is ‘32,32,32,32’).

    • If list, each item is the number of units in one layer

    • If str, the units of the hidden layers are separated by commas

    • If int, the number of units in a single hidden layer

  • rep_dim – (int, optional) The size of the representation layer (default is 128).

  • act – (str, optional) The activation function to use in the neural network (default is ‘ReLU’). Choices: ‘ReLU’, ‘LeakyReLU’, ‘Sigmoid’, ‘Tanh’.

  • bias – (bool, optional) Whether to use bias in the layers (default is False).

  • kernel_size – (int, optional) The size of the kernel for convolutional layers (default is 5).

  • dropout – (float, optional) The dropout rate (default is 0.0).

  • epoch_steps – (int, optional) The maximum number of steps per epoch (default is -1, which processes all batches).

  • prt_steps – (int, optional) The interval for printing the training progress (default is 10).

  • device – (str, optional) The device to use for training the model (‘cuda’ or ‘cpu’) (default is ‘cuda’).

  • verbose – (int, optional) Verbosity mode (default is 2).

  • random_state – (int, optional) Seed used by the random number generator (default is 42).
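The sketch below illustrates two of the parameters above: how the three accepted forms of hidden_dims could be normalized into a list of layer widths, and how seq_len and stride determine the number of windows cut from a series. Both helpers (parse_hidden_dims, count_windows) are hypothetical names written for this illustration, not DeepOD functions, and the window count assumes that only full-length windows are kept.

```python
def parse_hidden_dims(hidden_dims):
    """Normalize hidden_dims (list, comma-separated str, or int) to a list of ints."""
    if isinstance(hidden_dims, int):
        return [hidden_dims]  # single hidden layer
    if isinstance(hidden_dims, str):
        return [int(part) for part in hidden_dims.split(",") if part.strip()]
    return [int(d) for d in hidden_dims]  # already a sequence of layer widths


def count_windows(n_timestamps, seq_len=100, stride=1):
    """Number of full windows of length seq_len taken every `stride` steps."""
    if n_timestamps < seq_len:
        return 0
    return (n_timestamps - seq_len) // stride + 1
```

Under these assumptions, a larger stride reduces the number of training windows roughly in proportion, which trades training time against coverage of the series.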

Methods

__init__([epochs, batch_size, lr, seq_len, ...])

Initializes NCAD with specified hyperparameters.

coe_batch(x, y, coe_rate, suspect_window_length[, random_start_end])

Generates a batch of data with contextual outlier exposure (COE) augmentations.

decision_function(X[, return_rep])

Predict raw anomaly scores of X using the fitted detector.

decision_function_update(z, scores)

Hook for any update operation to run after the decision function.

epoch_update()

Hook for any update operation to run after each training epoch.

fit(X[, y])

Fit detector.

fit_auto_hyper(X[, y, X_test, y_test, ...])

Fit detector with automatic hyper-parameter tuning.

inference_forward(batch_x, net, criterion)

Conducts a forward pass during inference to calculate logits for anomaly scores.

inference_prepare(X)

Prepares the inference process by creating a DataLoader for the test data.

load_model(path)

load_ray_checkpoint(best_config, best_checkpoint)

mixup_batch(x, y, mixup_rate)

Generates a batch of data with mixup augmentations.

predict(X[, return_confidence])

Predict if a particular sample is an outlier or not.

save_model(path)

set_seed(seed)

set_tuned_net(config)

set_tuned_params()

training_forward(batch_x, net, criterion)

Conducts a forward pass during training, including data augmentation strategies like COE and mixup.

training_prepare(X, y)

Prepares the training process by creating data loaders and initializing the network and loss criterion.

static coe_batch(x, y, coe_rate, suspect_window_length, random_start_end=True)

Generates a batch of data with contextual outlier exposure (COE) augmentations.

Parameters:
  • x (torch.Tensor) – Input batch of data with dimensions (batch, ts channels, time).

  • y (torch.Tensor) – Target labels for the batch.

  • coe_rate (float) – The proportion of the batch to augment with COE.

  • suspect_window_length (int) – The length of the window considered as suspect for anomaly.

  • random_start_end (bool, optional) – Whether to permute a random subset within the suspect segment. Defaults to True.

Returns:

A tuple containing the augmented data and corresponding labels.

Return type:

tuple
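A minimal sketch of the idea behind contextual outlier exposure, written with plain nested lists instead of torch.Tensor so it runs without dependencies. It is a simplification under stated assumptions, not the DeepOD implementation: the function name coe_batch_sketch and the rng argument are hypothetical, the suspect window is taken to be the last suspect_window_length time steps, the whole suspect window is always replaced (the random_start_end option is not modeled), and the batch must hold at least two windows.

```python
import random


def coe_batch_sketch(x, coe_rate, suspect_window_length, rng=None):
    """Simplified COE on nested lists shaped (batch, channels, time).

    Each generated sample is a copy of one window whose suspect window
    (the last suspect_window_length steps) is overwritten, for a random
    subset of channels, with values from a different window.
    All generated samples are labeled 1 (anomalous).
    """
    rng = rng or random.Random()
    batch_size, n_channels = len(x), len(x[0])
    seq_len = len(x[0][0])
    start = seq_len - suspect_window_length  # first index of the suspect window
    x_oe, y_oe = [], []
    for _ in range(int(batch_size * coe_rate)):
        i, j = rng.sample(range(batch_size), 2)  # two distinct windows
        sample = [list(channel) for channel in x[i]]  # copy of window i
        n_swap = rng.randint(1, n_channels)  # how many channels to corrupt
        for c in rng.sample(range(n_channels), n_swap):
            sample[c][start:] = x[j][c][start:]
        x_oe.append(sample)
        y_oe.append(1.0)
    return x_oe, y_oe
```

The context part of every generated window is left untouched, so the resulting samples are anomalous only relative to their own context.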

decision_function(X, return_rep=False)

Predict raw anomaly scores of X using the fitted detector.

The anomaly score of an input sample is computed based on the fitted detector. For consistency, outliers are assigned higher anomaly scores.

Parameters:
  • X (numpy array of shape (n_samples, n_features)) – The input samples. Sparse matrices are accepted only if they are supported by the base estimator.

  • return_rep (boolean, optional, default=False) – Whether to also return the learned representations.

Returns:

anomaly_scores – The anomaly score of the input samples.

Return type:

numpy array of shape (n_samples,)
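Because the model scores fixed-length windows while the return value has one score per input sample, per-window scores have to be mapped back to timestamps. The sketch below shows one plausible alignment, assuming a stride of 1, that each window's score is attributed to its last timestamp, and that the first seq_len - 1 positions are padded with a constant. This page does not confirm that DeepOD uses this scheme; align_window_scores and pad_value are hypothetical names.

```python
def align_window_scores(window_scores, n_samples, seq_len, pad_value=0.0):
    """Left-pad per-window scores so the result has one score per timestamp."""
    expected = n_samples - seq_len + 1  # number of stride-1 windows
    if len(window_scores) != expected:
        raise ValueError(
            f"expected {expected} window scores, got {len(window_scores)}"
        )
    # The first seq_len - 1 timestamps never end a full window, so pad them.
    return [pad_value] * (seq_len - 1) + list(window_scores)
```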

decision_function_update(z, scores)

Hook for any update operation to run after the decision function.

epoch_update()

Hook for any update operation to run after each training epoch.

fit(X, y=None)

Fit detector. y is ignored in unsupervised methods.

Parameters:
  • X (numpy array of shape (n_samples, n_features)) – The input samples.

  • y (numpy array of shape (n_samples, )) – Not used in unsupervised methods; present for API consistency by convention. Used in semi-supervised and weakly supervised methods.

Returns:

self – Fitted estimator.

Return type:

object

fit_auto_hyper(X, y=None, X_test=None, y_test=None, n_ray_samples=5, time_budget_s=None)

Fit detector with automatic hyper-parameter tuning. y is ignored in unsupervised methods.

Parameters:
  • X (numpy array of shape (n_samples, n_features)) – The input samples.

  • y (numpy array of shape (n_samples, )) – Not used in unsupervised methods; present for API consistency by convention. Used in semi-supervised and weakly supervised methods.

  • X_test (numpy array of shape (n_samples, n_features), default=None) – The input testing samples for hyper-parameter tuning.

  • y_test (numpy array of shape (n_samples, ), default=None) – Label of input testing samples for hyper-parameter tuning.

  • n_ray_samples (int, default=5) – Number of times to sample from the hyper-parameter space.

  • time_budget_s (int, default=None) – Global time budget in seconds after which all trials of Ray are stopped.

Returns:

config – The tuned hyper-parameters.

Return type:

dict
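To make the roles of n_ray_samples and time_budget_s concrete, here is a plain random-search sketch that stands in for the Ray-based tuning. It is not how Ray schedules trials: random_search, evaluate, and the search space passed in are all hypothetical, and trials run sequentially. The budget is only checked before each trial starts.

```python
import random
import time


def random_search(evaluate, space, n_samples=5, time_budget_s=None, seed=42):
    """Sample n_samples configs from `space` and return the best one.

    `evaluate(config)` returns a score where higher is better; `space` maps
    each hyper-parameter name to a list of candidate values.
    """
    rng = random.Random(seed)
    deadline = None if time_budget_s is None else time.monotonic() + time_budget_s
    best_config, best_score = None, float("-inf")
    for _ in range(n_samples):
        if deadline is not None and time.monotonic() >= deadline:
            break  # global budget exhausted: stop launching trials
        config = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config
```

In this sketch the returned dict plays the role of the config return value: one value per tuned hyper-parameter.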

inference_forward(batch_x, net, criterion)

Conducts a forward pass during inference to calculate logits for anomaly scores.

Parameters:
  • batch_x (torch.Tensor) – The input batch of data.

  • net (NCADNet) – The neural network for NCAD.

  • criterion (torch.nn.modules.loss) – The loss function used for inference.

Returns:

A tuple containing the input batch and the logits representing anomaly scores.

Return type:

tuple

inference_prepare(X)

Prepares the inference process by creating a DataLoader for the test data.

Parameters:

X (numpy.ndarray) – Input data array for inference.

Returns:

The DataLoader containing the test data.

Return type:

DataLoader

static mixup_batch(x, y, mixup_rate)

Generates a batch of data with mixup augmentations.

Parameters:
  • x (torch.Tensor) – Input batch of data with dimensions (batch, ts channels, time).

  • y (torch.Tensor) – Target labels for the batch.

  • mixup_rate (float) – The proportion of the batch to augment with mixup.

Returns:

A tuple containing the mixup-augmented data and corresponding labels.

Return type:

tuple
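A minimal sketch of mixup on plain nested lists: each generated sample is a convex combination of two windows from the batch, and its label is the same combination of their labels. The details are assumptions, not taken from this page: the function name mixup_batch_sketch, the Beta-distributed mixing weight with a hypothetical alpha, and the reading of mixup_rate as the number of generated samples relative to the batch size (so a rate of 2.0 yields twice as many mixed samples as the batch holds).

```python
import random


def mixup_batch_sketch(x, y, mixup_rate, alpha=0.05, rng=None):
    """Simplified mixup on nested lists shaped (batch, channels, time)."""
    rng = rng or random.Random()
    batch_size = len(x)
    x_mix, y_mix = [], []
    for _ in range(int(batch_size * mixup_rate)):
        i, j = rng.sample(range(batch_size), 2)  # two distinct windows
        lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
        x_mix.append([
            [lam * a + (1.0 - lam) * b for a, b in zip(ch_i, ch_j)]
            for ch_i, ch_j in zip(x[i], x[j])
        ])
        # Soft label: the same convex combination of the two labels.
        y_mix.append(lam * y[i] + (1.0 - lam) * y[j])
    return x_mix, y_mix
```

Because the labels are mixed along with the inputs, the generated targets are soft values between 0 and 1 rather than hard classes.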

predict(X, return_confidence=False)

Predict if a particular sample is an outlier or not.

Parameters:
  • X (numpy array of shape (n_samples, n_features)) – The input samples.

  • return_confidence (boolean, optional, default=False) – If True, also return the confidence of prediction.

Returns:

  • outlier_labels (numpy array of shape (n_samples,)) – For each observation, tells whether it should be considered as an outlier according to the fitted model. 0 stands for inliers and 1 for outliers.

  • confidence (numpy array of shape (n_samples,)) – The confidence of each prediction. Returned only if return_confidence is set to True.
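Binary labels are typically derived by comparing anomaly scores against a threshold. The sketch below shows one common convention, assuming the threshold is a quantile of the training scores controlled by a contamination fraction. Both labels_from_scores and contamination are hypothetical here: contamination is not among the NCAD parameters listed on this page, and this page does not state how the threshold is chosen.

```python
def labels_from_scores(train_scores, test_scores, contamination=0.1):
    """Label test scores as outliers (1) if they exceed a training-score quantile."""
    ordered = sorted(train_scores)
    # Index of the (1 - contamination) quantile, clamped to a valid position.
    k = min(len(ordered) - 1, int(len(ordered) * (1.0 - contamination)))
    threshold = ordered[k]
    labels = [1 if s > threshold else 0 for s in test_scores]
    return labels, threshold
```

The returned labels follow the convention described above: 0 for inliers and 1 for outliers.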

training_forward(batch_x, net, criterion)

Conducts a forward pass during training, including data augmentation strategies like COE and mixup.

Parameters:
  • batch_x (torch.Tensor) – The input batch of data.

  • net (NCADNet) – The neural network for NCAD.

  • criterion (torch.nn.modules.loss) – The loss function used for training.

Returns:

The computed loss for the batch.

Return type:

torch.Tensor

training_prepare(X, y)

Prepares the training process by creating data loaders and initializing the network and loss criterion.

Parameters:
  • X (numpy.ndarray) – Input data array for training.

  • y (numpy.ndarray) – Target labels for training.

Returns:

A tuple containing the DataLoader for training data, the initialized neural network, and the loss criterion.

Return type:

tuple