Glossary

Alert

An informational flag showing that drift has reached a pre-defined threshold in a policy. Alert levels include critical (needs immediate attention) and warning (may need attention).

Baseline window

The point of reference for comparing metrics when monitoring model performance or drift. A baseline can be training data or a prior period of production data, such as the prior day, the same weekdays of the last three weeks, the prior week, the prior month, the prior quarter, and so on.
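
As an illustration of two of the baseline choices above, the following sketch lists the calendar days that a "prior day" baseline and a "same weekdays of the last three weeks" baseline would cover for a given target day. The function names are hypothetical and are not part of any VIANOPS API.

```python
from datetime import date, timedelta


def prior_day(target_day: date) -> list[date]:
    """Baseline made of the single day before the target day."""
    return [target_day - timedelta(days=1)]


def same_weekday_last_n_weeks(target_day: date, weeks: int = 3) -> list[date]:
    """Baseline made of the same weekday in each of the previous `weeks` weeks."""
    return [target_day - timedelta(weeks=k) for k in range(1, weeks + 1)]


# Example: a Wednesday target day compared with the three previous Wednesdays.
baseline_days = same_weekday_last_n_weeks(date(2024, 5, 15))
```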

Drift

A change in the value distribution of an input feature or prediction output between the target window and the baseline window, or a change in model performance.

Feature

An input to a model that represents a measurable piece of data. For example, a feature may be distance, fare, location, age, and so on. A model typically has tens to thousands of features.

Feature drift

Changes in the value distribution of a feature in a target time window with respect to the baseline window. For example, the feature drift of trip_distance this month compared to the same month last year.

Overall feature drift is the aggregation of drift for all features that are selected in a policy. This represents overall drift at the policy level and is used to trigger alerts.
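
The sketch below shows one common way to compute the two distance-based drift metrics named in this glossary, PSI and JSD (read here as Jensen-Shannon divergence), from binned proportions of a baseline and a target window. It is a minimal illustration, not the VIANOPS implementation. The glossary does not say how per-feature drift is aggregated into overall feature drift, so the simple mean used in `overall_drift` is an assumption, as are all function names.

```python
import math


def psi(baseline, target, eps=1e-6):
    """Population Stability Index between two lists of bin proportions.

    Proportions are floored at `eps` so empty bins do not cause log(0).
    """
    total = 0.0
    for b, t in zip(baseline, target):
        b, t = max(b, eps), max(t, eps)
        total += (t - b) * math.log(t / b)
    return total


def jsd(baseline, target):
    """Jensen-Shannon divergence (log base 2), which lies between 0 and 1."""
    def kl(p, q):
        return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

    mixture = [(b + t) / 2 for b, t in zip(baseline, target)]
    return 0.5 * kl(baseline, mixture) + 0.5 * kl(target, mixture)


def overall_drift(per_feature_drift):
    """Aggregate per-feature drift values. Using the mean is an assumption."""
    return sum(per_feature_drift.values()) / len(per_feature_drift)


# Hypothetical bin proportions for trip_distance in each window.
baseline_bins = [0.25, 0.50, 0.25]
target_bins = [0.10, 0.50, 0.40]
trip_distance_psi = psi(baseline_bins, target_bins)
```

Both metrics are 0 when the two distributions are identical and grow as the distributions diverge.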

Ground truth

The actual, observed outcome of an event, used as the reference for evaluating a model's prediction.

Metric

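A measure that a policy uses to quantify drift or model performance. Metric types are:

  • Distance-based drift metrics: Population Stability Index (PSI), Jensen-Shannon Divergence (JSD)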
  • Performance metrics:
    • Classification:
      • Accuracy
      • Precision
      • Recall
      • F1 score
      • Area Under the Curve (AUC)
      • Balanced accuracy
    • Regression:
      • Mean Absolute Error (MAE)
      • Mean Absolute Percentage Error (MAPE)
      • Mean Squared Error (MSE)
      • Root Mean Squared Error (RMSE)

Model

A model that makes predictions. It can be a binary classification, multi-class classification, regression, ranking, recommendation, or other type of model. Different types of models use different metrics to measure model performance.

Performance drift

Changes in the performance metrics of a model between the target window and the baseline window.
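
As a rough sketch for a regression model, the code below computes MAE and RMSE on each window from ground truth and predictions, then takes the difference between the target and baseline values. The glossary does not specify how the change is quantified, so the plain difference is an assumption, and the function names and sample numbers are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def performance_change(metric, baseline, target):
    """Metric on the target window minus the metric on the baseline window.

    `baseline` and `target` are (ground_truth, predictions) pairs.
    """
    return metric(*target) - metric(*baseline)


# Hypothetical fare values: (ground truth, predictions) for each window.
baseline_window = ([10.0, 20.0], [12.0, 18.0])
target_window = ([10.0, 20.0], [15.0, 25.0])
mae_change = performance_change(mae, baseline_window, target_window)
```

For error metrics such as MAE and RMSE, a positive change means the model performs worse in the target window than in the baseline window.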

Policy

A set of rules that defines how to monitor a model and alert users when drift occurs. In this release, VIANOPS supports feature drift, prediction drift, and performance drift policies. You can define multiple drift policies with different settings to monitor a model from different dimensions. Settings include the target and baseline windows, drift metric, alert thresholds, schedule for running the policy, selected segments, and selected features.

Prediction

The output of a model. Each type of model gives a different output. For example, a binary classification model can output 0 or 1, while a regression model outputs a numeric value.

Prediction drift

Changes in the distribution of predictions made by the model between the target window and the baseline window.

Project

A collection of models with the same purpose.

Segment

A subsection of a data set used to narrow the scope and uncover patterns that may occur only in parts of the population but may affect overall model behavior. For example, you can define a segment where the location is Manhattan, or a segment where the state is California and, at the same time, the age group is Senior. VIANOPS allows users to view and compare performance and drift across multiple segments at the same time.
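
Conceptually, a segment acts as a filter over records in which every condition must hold at once. The sketch below illustrates this with plain dictionaries. The field names, sample records, and the `in_segment` helper are hypothetical and do not reflect how VIANOPS stores or defines segments.

```python
def in_segment(record, conditions):
    """Return True when the record matches every field/value condition."""
    return all(record.get(field) == value for field, value in conditions.items())


# Hypothetical records.
trips = [
    {"location": "Manhattan", "state": "New York", "age_group": "Adult"},
    {"location": "San Diego", "state": "California", "age_group": "Senior"},
    {"location": "Fresno", "state": "California", "age_group": "Adult"},
]

# Segment: state is California and, at the same time, age group is Senior.
california_seniors = {"state": "California", "age_group": "Senior"}
selected = [trip for trip in trips if in_segment(trip, california_seniors)]
```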

Schedule

Determines how frequently a policy runs. A policy can run daily, weekly on a specific weekday, monthly on a specific day of the month, and so on.

Target window

The time window of data being monitored. The target window can be a day (last 24 hours), week to date (first day of the week up to the current day), month to date (first day of the month up to the current day), and so on.
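
The three example windows above can be expressed as start and end timestamps, as in the sketch below. The `target_window` function and its window names are hypothetical, and treating Monday as the first day of the week is an assumption that the glossary does not state.

```python
from datetime import datetime, timedelta


def target_window(kind: str, now: datetime) -> tuple[datetime, datetime]:
    """Return the (start, end) of a target window that ends at `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if kind == "day":
        start = now - timedelta(hours=24)  # last 24 hours
    elif kind == "week_to_date":
        # Assumes the week starts on Monday (weekday() == 0).
        start = midnight - timedelta(days=now.weekday())
    elif kind == "month_to_date":
        start = midnight.replace(day=1)
    else:
        raise ValueError(f"unknown window kind: {kind}")
    return start, now


now = datetime(2024, 5, 15, 9, 30)  # a Wednesday
week_start, _ = target_window("week_to_date", now)
```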

Threshold

Determines the severity level and triggers alerts or other actions. VIANOPS defines two threshold levels: Critical (severe and needs immediate attention) and Warning (less than critical but may need attention).
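
The sketch below shows how two thresholds might map a drift value to an alert level. It assumes that larger values mean more drift and that an alert fires once the value reaches a threshold. The function name and the threshold values are hypothetical examples, not VIANOPS defaults.

```python
def alert_level(drift_value, warning, critical):
    """Return "critical", "warning", or None for a measured drift value."""
    if critical < warning:
        raise ValueError("critical threshold must not be below warning threshold")
    if drift_value >= critical:
        return "critical"
    if drift_value >= warning:
        return "warning"
    return None


# Hypothetical thresholds for a drift metric.
level = alert_level(0.18, warning=0.10, critical=0.25)
```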

Value distribution

The number of times the values of a feature fall into each bin (value range) during a target or baseline window. For example, the estimated time for a taxicab ride may range from 5 minutes to 37 minutes for a specific target or baseline. VIANOPS allows users to customize how they view a range of values so that it is easy to understand in a business context. For example, instead of standardizing the grouping of values (<5, 5-10, 10-15, and so on), users can create custom groups to better view and expose patterns, such as <5, 5-15, 15-45, 45-60, and >60 minutes. By default, VIANOPS groups the values into 10 bins for a continuous feature. For a categorical feature, the categories are the bins.
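
To make the binning concrete, the sketch below counts values per bin using either equal-width bins or the custom groups from the example above. The glossary does not say how the 10 default bins are chosen, so equal-width binning is an assumption. The helper names and sample durations are hypothetical.

```python
from bisect import bisect_right


def equal_width_edges(values, bins=10):
    """Interior edges that split [min, max] into `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(1, bins)]


def bin_counts(values, interior_edges):
    """Count values per bin; a value equal to an edge goes to the upper bin."""
    counts = [0] * (len(interior_edges) + 1)
    for value in values:
        counts[bisect_right(interior_edges, value)] += 1
    return counts


# Hypothetical ride durations in minutes.
durations = [3, 5, 12, 20, 37, 50, 75]

# Custom groups: <5, 5-15, 15-45, 45-60, and 60 or more minutes.
custom_counts = bin_counts(durations, [5, 15, 45, 60])

# Default-style grouping into 10 bins.
default_counts = bin_counts(durations, equal_width_edges(durations))
```

Dividing each count by the total number of values gives the proportions that drift metrics such as PSI compare between windows.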
