Metrics

VIANOPS supports the metrics described below.

All metrics are available through the API; a subset is also available through the UI.

Performance metrics - classification models

VIANOPS supports the following performance metrics for classification models. When you access them in the UI (e.g., when creating a performance drift policy or visualizing model performance in the Model Dashboard), you can see more details for each metric.

For more on using these with the API, see the API documentation for performance drift policies. When calling the endpoint, specify the API Value exactly as shown.

| Metric | Description | In UI | In API | API Value |
|---|---|---|---|---|
| Accuracy | Fraction of correct predictions. | Yes | Yes | accuracy |
| Area under the curve (AUC) | Area under the ROC curve, which quantifies the overall performance of a binary classifier compared to a random classifier. | Yes | Yes | rocauc |
| Balanced accuracy | Average of the recall obtained on each class. | Yes | Yes | balanced_accuracy |
| Bookmaker informedness (BM) | Sum of sensitivity and specificity minus 1. | | Yes | bm |
| Diagnostic Odds Ratio (DOR) | Ratio of the odds of a positive prediction when the subject has the condition to the odds of a positive prediction when the subject does not have the condition. | | Yes | dor |
| F1 Score (F1) | Harmonic mean of precision and recall. | Yes | Yes | f1 |
| False Discovery Rate (FDR) | Fraction of incorrect predictions among the predicted positive instances. | | Yes | fdr |
| Fowlkes-Mallows index (FM) | Geometric mean of precision and recall. | | Yes | fm |
| False Negatives (FN) | Incorrectly predicted negatives. | Yes | Yes | fn |
| False Negative Rate (FNR) | Fraction of positives incorrectly identified as negative. | | Yes | fnr |
| False Omission Rate (FOR) | Fraction of incorrect predictions among the predicted negative instances. | | Yes | for |
| False Positives (FP) | Incorrectly predicted positives. | Yes | Yes | fp |
| False Positive Rate (FPR) | Fraction of negatives incorrectly identified as positive. | | Yes | fpr |
| Gini coefficient | A measure of inequality or impurity in a set of values, often used in decision trees. | Yes | Yes | modelgini |
| Lift | A measure of the effectiveness of a predictive model, calculated as the ratio between the results obtained with and without the model. | Yes | Yes | lift |
| Log loss | A performance metric for classification where the model prediction is a probability between 0 and 1. Measures the divergence of the predicted probability from the actual value; a log loss of 0 indicates a perfect classifier. | Yes | Yes | logloss |
| Matthews Correlation Coefficient (MCC) | (TP * TN - FP * FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)). Ranges from -1 to 1 (1 is a perfect binary classifier, 0 is random, -1 is completely wrong). | | Yes | mcc |
| Markedness (MK) | Sum of the positive and negative predictive values minus 1. | | Yes | mk |
| Negative Likelihood Ratio (NLR) | Ratio of the false negative rate to the true negative rate. | | Yes | nlr |
| Negative Predictive Value (NPV) | Fraction of identified negatives that are correct. | | Yes | npv |
| Positive Likelihood Ratio (PLR) | Ratio of the true positive rate to the false positive rate. | | Yes | plr |
| Precision | Fraction of identified positives that are correct. | | Yes | precision |
| Probability calibration curve | A plot that compares the predicted probabilities of a model to the actual outcome frequencies, used to understand whether a model's probabilities can be taken at face value. | | Yes | prob_calib |
| Prevalence Threshold (PT) | Point where the positive predictive value equals the negative predictive value. | | Yes | pt |
| Recall | Sensitivity, or true positive rate: fraction of positives correctly identified. (Higher recall minimizes false negatives.) | Yes | Yes | recall |
| Rate of Negative Predictions (RNP) | Fraction of all instances predicted as negative. | Yes | Yes | rnp |
| Receiver Operating Characteristic (ROC) | A plot that illustrates the diagnostic ability (TPR vs. FPR) of a binary classifier as its decision threshold is varied. | | Yes | auc_score |
| Specificity | Fraction of negatives correctly identified. | | Yes | specificity |
| True Negatives (TN) | Correctly predicted negatives. | Yes | Yes | tn |
| True Positives (TP) | Correctly predicted positives. | Yes | Yes | tp |
| Threat score (TS) | TP / (TP + FN + FP). Also known as the Critical Success Index. | | Yes | ts |
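
Many of the metrics above are simple functions of the confusion-matrix counts (TP, TN, FP, FN). The following sketch is plain Python for illustration only (it is not VIANOPS code, and the counts are made up); it shows how several of the metrics relate:

    import math

    # Hypothetical confusion-matrix counts for a binary classifier.
    tp, tn, fp, fn = 90, 850, 40, 20

    precision = tp / (tp + fp)           # fraction of predicted positives that are correct
    recall = tp / (tp + fn)              # sensitivity / true positive rate
    specificity = tn / (tn + fp)         # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (recall + specificity) / 2
    bm = recall + specificity - 1        # bookmaker informedness
    ts = tp / (tp + fn + fp)             # threat score / critical success index
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} mcc={mcc:.3f}")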

Additional classification metrics available from the API

Several classification metrics are calculated using averaging techniques and are accessible via the API. (See the API documentation for performance drift policies.)

| Metric | Description | API Value |
|---|---|---|
| F1, Micro-Averaged | Global F1 computed from the total TP/TN/FP/FN across all classes. Better for imbalanced data. | micro[f1] |
| F1, Macro-Averaged | Average of the per-class F1 scores, without regard to class size. | macro[f1] |
| F1, Weighted-Averaged | Average of the per-class F1 scores weighted by the support of each class. | weighted[f1] |
| Precision, Micro-Averaged | Global precision computed from the total TP/TN/FP/FN across all classes. Better for imbalanced data. Equal to micro-averaged F1, micro-averaged recall, and accuracy. | micro[precision] |
| Precision, Macro-Averaged | Average of the per-class precision scores, without regard to class size. | macro[precision] |
| Precision, Weighted-Averaged | Average of the per-class precision scores weighted by the support of each class. | weighted[precision] |
| Recall, Micro-Averaged | Global recall computed from the total TP/TN/FP/FN across all classes. Better for imbalanced data. Equal to micro-averaged F1, micro-averaged precision, and accuracy. | micro[recall] |
| Recall, Macro-Averaged | Average of the per-class recall scores, without regard to class size. | macro[recall] |
| Recall, Weighted-Averaged | Average of the per-class recall scores weighted by the support of each class. | weighted[recall] |
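
The three averaging modes follow the usual definitions. A quick scikit-learn sketch (illustrative only, not VIANOPS code; the labels are made up) shows how they differ on a small, imbalanced multi-class example:

    from sklearn.metrics import f1_score, precision_score, recall_score

    # Made-up multi-class labels with imbalanced class sizes.
    y_true = [0, 0, 0, 0, 1, 1, 2]
    y_pred = [0, 0, 1, 0, 1, 2, 2]

    for avg in ("micro", "macro", "weighted"):
        print(
            avg,
            round(f1_score(y_true, y_pred, average=avg), 3),
            round(precision_score(y_true, y_pred, average=avg, zero_division=0), 3),
            round(recall_score(y_true, y_pred, average=avg, zero_division=0), 3),
        )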

Performance metrics - regression models

VIANOPS supports the following performance metrics for regression models. When you access them in the UI (e.g., when creating a performance drift policy or visualizing model performance in the Model Dashboard), you can see more details for each metric.

For more on using these with the API, see API - Performance drift. When calling the endpoint, specify the API Value exactly as shown.

| Metric | Description | In UI | In API | API Value |
|---|---|---|---|---|
| Mean Absolute Error (MAE) | Average of the absolute prediction errors (error = predicted value - actual value). | Yes | Yes | mae |
| Mean Absolute Percentage Error (MAPE) | Average of the absolute percentage prediction errors: abs((predicted value - actual value) / actual value). | Yes | Yes | mape |
| Mean Squared Error (MSE) | Average of the squared prediction errors. | Yes | Yes | mse |
| Negative Mean Squared Error (NMSE) | Negative of MSE. | Yes | Yes | mse |
| Negative Root Mean Squared Error (NRMSE) | Negative of RMSE. | Yes | Yes | rmse |
| Negative Mean Absolute Error (NMAE) | Negative of MAE. | Yes | Yes | mae |
| Negative Mean Absolute Percentage Error (NMAPE) | Negative of MAPE. | Yes | Yes | mape |
| R-Squared (R2) | The proportion of the variance in the dependent variable that is predictable from the independent variable(s). | Yes | Yes | r2_score |
| Root Mean Squared Error (RMSE) | Square root of MSE. | Yes | Yes | rmse |
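
As a reference for the definitions above, the following sketch (plain NumPy, illustrative only, not VIANOPS code) computes the core regression errors from made-up predictions:

    import numpy as np

    # Made-up actuals and predictions.
    y_true = np.array([3.0, 5.0, 2.5, 7.0])
    y_pred = np.array([2.5, 5.0, 4.0, 8.0])

    errors = y_pred - y_true
    mae = np.mean(np.abs(errors))
    mse = np.mean(errors ** 2)
    rmse = np.sqrt(mse)
    mape = np.mean(np.abs(errors / y_true))
    r2 = 1 - np.sum(errors ** 2) / np.sum((y_true - y_true.mean()) ** 2)

    # The "negative" variants simply flip the sign, e.g. NMAE = -mae.
    print(mae, mse, rmse, mape, r2)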

Drift metrics

VIANOPS supports the following drift metrics.

For more on using these with the API, see API - Performance drift.

| Metric | Description | In UI | In API |
|---|---|---|---|
| Jensen-Shannon Divergence for prediction drift | Square root of the J-S divergence, which measures the average divergence of the baseline and target distributions from the mean of the two distributions. | Yes | Yes |
| Population Stability Index (PSI) for prediction drift | Sum of (baseline frequency - target frequency) x log(baseline / target) across all defined bins. | Yes | Yes |
| Jensen-Shannon Divergence for feature drift | Square root of the J-S divergence, which measures the average divergence of the baseline and target distributions from the mean of the two distributions. | Yes | Yes |
| Population Stability Index (PSI) for feature drift | Sum of (baseline frequency - target frequency) x log(baseline / target) across all defined bins. | Yes | Yes |
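
As a rough illustration of the two formulas (not VIANOPS code; the binned frequencies are made up, and the choice of bins is an assumption), given binned frequency distributions for a baseline and a target window:

    import numpy as np
    from scipy.spatial.distance import jensenshannon

    # Made-up binned frequencies (each sums to 1) for the baseline and target windows.
    baseline = np.array([0.30, 0.40, 0.20, 0.10])
    target = np.array([0.25, 0.35, 0.25, 0.15])

    # PSI: sum of (baseline - target) * log(baseline / target) across the bins.
    psi = np.sum((baseline - target) * np.log(baseline / target))

    # Jensen-Shannon distance: square root of the J-S divergence.
    js = jensenshannon(baseline, target)

    print(f"PSI={psi:.4f}  JS={js:.4f}")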

Data profiling metrics

VIANOPS supports numerical and categorical data profiling metrics.

Numerical data profiling metrics

| Metric | Description | In UI | In API |
|---|---|---|---|
| Count | Total number of observations or records. | Yes | Yes |
| Min/Mean/Max | The smallest, average, and largest values, respectively. | Yes | Yes |
| 1%/99% | The 1st and 99th percentiles of the data, respectively (values below which the given percentage of observations fall). | Yes | Yes |
| 5%/95% | The 5th and 95th percentiles of the data, respectively. | Yes | Yes |
| 10%/90% | The 10th and 90th percentiles of the data, respectively. | Yes | Yes |
| 25%/50%/75% | The 25th (1st quartile, Q1), 50th (median, Q2), and 75th (3rd quartile, Q3) percentiles of the data, respectively. | Yes | Yes |
| Mean-Std/Mean/Mean+Std | The mean minus one standard deviation, the mean, and the mean plus one standard deviation, respectively. Provides a sense of the spread of the data around the mean. | Yes | Yes |
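
A quick sketch of how these profile values can be derived for a numerical column (plain NumPy, illustrative only, not VIANOPS code):

    import numpy as np

    # Made-up numerical column.
    values = np.random.default_rng(0).normal(loc=50, scale=10, size=1_000)

    count = values.size
    minimum, mean, maximum = values.min(), values.mean(), values.max()
    p1, p5, p10, q1, median, q3, p90, p95, p99 = np.percentile(
        values, [1, 5, 10, 25, 50, 75, 90, 95, 99]
    )
    std = values.std()
    spread = (mean - std, mean, mean + std)  # Mean-Std / Mean / Mean+Std

    print(count, minimum, q1, median, q3, maximum)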

Categorical data profiling metrics

VIANOPS supports the following categorical metrics.

| Metric | Description | In UI | In API |
|---|---|---|---|
| Count | Total count. | Yes | Yes |
| Unique values | Number of unique values. | Yes | Yes |
| Value counts | Dictionary of {value: value_count}. | Yes | Yes |
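
The categorical profile corresponds to simple counts over the column values; for example, with pandas (illustrative only, not VIANOPS code):

    import pandas as pd

    # Made-up categorical column.
    column = pd.Series(["red", "blue", "red", "green", "red", "blue"])

    count = int(column.count())                     # total count
    unique_values = int(column.nunique())           # number of unique values
    value_counts = column.value_counts().to_dict()  # {value: value_count}

    print(count, unique_values, value_counts)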

Custom metrics

You can define custom metrics based on the standard performance metrics listed above, for both regression and classification models.

  1. Define the custom metric with either the /v1/model-metrics REST API or the vianops_client.models.riskstore.model-metrics SDK endpoint. (To access REST API docs, see APIs.)
    • Define the metric with a Python function or a py_statement.
    • Define your custom metric based on any of the VIANOPS standard metrics or datasources.
    • Custom metrics appear in italics in all the UI elements where standard metrics appear.
    • The following is an example of the body for a model-metrics endpoint defining a custom metric with language set to py_statement:

      [
          {
              "metric_name": "relative_squared_error3",
              "description": "Compare RMSE to MAE to determine the distribution of errors.",
              "language": "py_statement",
              "metric_type": "model_performance",
              "experiment_type": "regression",
              "definition": "rmse / mae",
              "abbreviation": "RSE",
              "alt_names": ["rmm"],
              "full_name": "Relative Squared Error",
              "status": "active",
              "metric_category": "custom",
              "metric_tags": {
                  "for_display": true,
                  "for_custom_definition": true,
                  "is_metric_of_interest": true,
                  "is_percent": false,
                  "lower_is_better": false,
                  "needs_predict_proba": false,
                  "needs_class_of_interest": false
              }
          }
      ]
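
      As a rough sketch of submitting this body (not taken from the VIANOPS documentation; the host, authentication, and use of HTTP POST are assumptions), the custom metric could be created against the /v1/model-metrics REST endpoint as follows:

        import requests

        # Placeholders: substitute your deployment's host and token.
        base_url = "https://<your-vianops-host>"
        headers = {"Authorization": "Bearer <token>", "Content-Type": "application/json"}

        # The payload is the list shown above (abbreviated here).
        payload = [
            {
                "metric_name": "relative_squared_error3",
                "language": "py_statement",
                "metric_type": "model_performance",
                "experiment_type": "regression",
                "definition": "rmse / mae",
                # ... remaining fields as in the example body above ...
            }
        ]

        response = requests.post(f"{base_url}/v1/model-metrics", json=payload, headers=headers)
        print(response.status_code, response.json())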
      
    • The following is an example of notebook code to define a custom metric with language set to python:

      load_api = ModelMetricsV1API()  # SDK client for the custom metrics endpoint

      model = V1ModelMetricsModel(
        metric_name=custom_metrics[2],
        description="topXauc refers to the area under the ROC curve when only the top 'X' predicted probabilities are considered for evaluation.",
        full_name="topXAUC",
        abbreviation="topX AUC",
        language="python",
        metric_type="model_performance",
        experiment_type="binary_classification",
        # The definition points to the Python method calculateTopXauc in /source/custom.py.
        definition={
            "method": "calculateTopXauc",
            "init_params": {"filtered_df": "drifter_df", "top_x": 10},
            "module": "custom",
            "classname": "",
        },
      )

      models = V1ModelMetricsModelList(__root__=[])
      models.__root__.append(model)
      create_res = load_api.create(models)
      print(create_res)
      

      This example assumes the following:

      • custom_metrics has been defined previously as a list of at least three items giving the names of the custom metrics. In this example, the third item in the list is the name of this custom metric.
      • definition.method calls calculateTopXauc, which is defined in the file /source/custom.py (a hypothetical sketch of this file follows this list).
      • definition.module, set to "custom", points to the file /source/custom.py.
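
      For illustration only, a hypothetical /source/custom.py might look like the following. The column names ("score", "label"), the interpretation of top_x as a row count, the assumption that filtered_df is a pandas DataFrame, and the exact signature VIANOPS calls are all assumptions, not documented behavior:

        # Hypothetical /source/custom.py (sketch only).
        from sklearn.metrics import roc_auc_score

        def calculateTopXauc(filtered_df, top_x=10):
            """AUC computed over the rows with the top_x highest predicted probabilities."""
            top = filtered_df.nlargest(top_x, "score")
            return roc_auc_score(top["label"], top["score"])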
  2. Add the following tags to the model_tags_payload section of your notebook.
  • custom_metrics

    {"name": "custom_metrics", "value": custom_metrics, "status": "active"}

    This object assumes you have assigned the variable custom_metrics previously in the notebook, as shown in the example below. This example assumes you have created custom metrics named “custom_reg_metric” and “twice_error” with the custom metric endpoint.

  • class_of_interest

    {"name": "class_of_interest", "value": custom_metrics_class, "status": "active"}

    This object assumes you have assigned the variable custom_metrics_class previously in the notebook, as shown in the example below. You only need to provide this tag for binary classification models. For binary classification models, the class of interest can be 0 or 1; if neither applies, assign it to "None".
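
  For example, the assignments referenced above might look like this in the notebook (a sketch only; the exact shape of model_tags_payload is an assumption):

    # Names of the custom metrics created earlier with the custom metric endpoint.
    custom_metrics = ["custom_reg_metric", "twice_error"]

    # Class of interest for a binary classification model: 0, 1, or "None" if neither.
    custom_metrics_class = "None"

    model_tags_payload = [
        {"name": "custom_metrics", "value": custom_metrics, "status": "active"},
        {"name": "class_of_interest", "value": custom_metrics_class, "status": "active"},
    ]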

