Policies (APIs)
Overview
Policies track and report the health and trustworthiness of model inputs (features), outputs (predictions), and performance. By comparing model inferences against specified thresholds and actual values (ground truth), the platform detects when the model has drifted or is no longer performing optimally. For example, using a policy for a demand forecasting model trained on older data, the platform may detect drift due to recent supply chain issues. When drift is discovered, the model likely should be retrained on recent data to continue making trustworthy predictions.
- Feature and prediction drift policies: These policies compare the probability distribution of a target dataframe to the probability distribution of a baseline dataframe using a variety of distance measures. The platform can detect distance-based drift using either Population Stability Index (`PSI`) or Jensen-Shannon distance (`distance_JS`).
- Performance drift policies: These policies perform a percent drift comparison of metrics between the selected baseline and target data. For example, a policy may compare yesterday’s accuracy metrics with today’s accuracy metrics.
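As a rough illustration of the measures named above, the following Python sketch computes PSI, Jensen-Shannon distance, and a percent drift value from pre-binned counts. It is not the platform's implementation: the function names, the epsilon floor for empty bins, the base-2 logarithm, and the percent drift formula are all assumptions made for this example.

```python
import math


def _to_probs(counts, eps=1e-6):
    """Convert bin counts to probabilities, flooring at eps to avoid log(0).

    The eps floor is an assumption of this sketch, not a documented setting.
    """
    total = sum(counts)
    floored = [max(c / total, eps) for c in counts]
    norm = sum(floored)
    return [p / norm for p in floored]


def psi(baseline_counts, target_counts):
    """Population Stability Index over matching bins."""
    b = _to_probs(baseline_counts)
    t = _to_probs(target_counts)
    return sum((ti - bi) * math.log(ti / bi) for bi, ti in zip(b, t))


def _kl(p, q):
    """Kullback-Leibler divergence in bits (base-2 logarithm)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))


def js_distance(baseline_counts, target_counts):
    """Jensen-Shannon distance: square root of the JS divergence."""
    b = _to_probs(baseline_counts)
    t = _to_probs(target_counts)
    m = [(bi + ti) / 2 for bi, ti in zip(b, t)]
    divergence = 0.5 * _kl(b, m) + 0.5 * _kl(t, m)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def percent_drift(baseline_metric, target_metric):
    """Signed percent change of a metric relative to its baseline value."""
    return (target_metric - baseline_metric) / baseline_metric * 100.0
```

With the base-2 logarithm the Jensen-Shannon distance stays between 0 and 1, which makes it easier to reason about thresholds; PSI has no upper bound.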
When analyzing performance and drift issues configured for a policy, the platform looks at all data received in the target window since the last run. If you want to investigate specific portions of the data, you can configure the policy with segments that filter the features and data to just what you want to analyze. For example, you may have one policy that looks for drift issues across all the features as well as in specific portions of the data (selected features and values) defined in configured segments.
Detected drift and performance issues are available to view in visualizations, such as VIANOPS monitoring graphs. You can configure alerts for your policies that identify when specified thresholds are surpassed. The VIANOPS monitoring visualizations show the important details related to an alert and detected conditions. When getting information about the discovered risk conditions to the right people is critical, you’ll want to assign users to alerts to send them notifications via email or communications integrations such as Slack.
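To make the threshold behavior concrete, here is a minimal sketch of how a measured drift value could be mapped to an alert level using the `warning_level` and `critical_level` fields that appear in the policy payloads in this document. The function name and the choice of a strict "greater than" comparison are assumptions; the platform's exact comparison rule is not stated here.

```python
def alert_level(drift_value, warning_level, critical_level):
    """Return "critical", "warning", or None for a measured drift value.

    Assumes a threshold is surpassed only when the value is strictly greater.
    """
    if drift_value > critical_level:
        return "critical"
    if drift_value > warning_level:
        return "warning"
    return None


# Hypothetical PSI values checked against thresholds of 0.1 and 0.25.
for value in (0.05, 0.18, 0.30):
    print(value, alert_level(value, warning_level=0.1, critical_level=0.25))
```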
Hotspot analysis
To get additional insight into the causes for model drift, configure hotspot analysis for your drift policies. When alerts indicate a policy detects drift conditions, hotspot analysis calculates and surfaces the features causing the greatest drift issues (which are determined using a weighted average of the impact score of all the drift features in the policy). The resulting hotspots represent areas in the data that have significant impact on the detected drift; use the values to help understand the model and then retrain as needed.
VIANOPS automatically sets categorical features that are configured to support drift detection and be used within segments as available for hotspot analysis, based on their settings in Inference mapping.
You can add hotspot analysis to a policy by specifying the following in the policy payload:
- method: whether the most impactful features are calculated as a flat number of features (for example, top 50 features in time window) or a percent of features (for example, platform-calculated top 50% from total traffic)
- value: actual number or percentage
- features: columns enabled for hotspot analysis; hotspots are created for the values for these columns
For the following example, if there are 200 hotspot groupings within the time window, then hotspot analysis runs on the top 50. It calculates the impact, traffic, and PSI for three features: `feature_x`, `feature_y`, and `feature_z`.
"hotspot_analysis":{
"method":"flat",
"value":50,
"features":[
"feature_x",
"feature_y",
"feature_z"
]
},
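The selection step described above can be sketched in Python as follows. This is an illustration only: the `select_hotspots` function, the `(name, impact)` tuple shape, and the literal `"percent"` method value are assumptions for the example (only `"flat"` appears in the payloads in this document).

```python
import math


def select_hotspots(groupings, method, value):
    """Pick the most impactful hotspot groupings.

    groupings: list of (name, impact_score) tuples (hypothetical shape).
    method:    "flat" keeps the top `value` groupings;
               "percent" (assumed name) keeps the top `value` percent.
    """
    ranked = sorted(groupings, key=lambda g: g[1], reverse=True)
    if method == "flat":
        limit = int(value)
    elif method == "percent":
        limit = math.ceil(len(ranked) * value / 100)
    else:
        raise ValueError(f"unsupported method: {method!r}")
    return ranked[:limit]


# 200 hypothetical groupings with made-up impact scores.
groupings = [(f"grouping_{i}", float(i)) for i in range(200)]
top = select_hotspots(groupings, method="flat", value=50)
print(len(top))
```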
View results of hotspot analysis in the Hotspot Analysis window (accessible in the UI from the Risk Analysis > Alerts pane).
Policies object
{
"deployment": "string",
"model_name": "string",
"model_version": "1",
"model_stage": "string",
"name": "string",
"description": "Description of policy",
"type": "drift",
"policy": {
"window_parameters": {
"target": {
"offset": "1",
"offset_type": "day",
"window_type": "day"
},
"baseline": {
"window_method": "prior",
"window_type": "day"
}
},
"warning_level": 1,
"critical_level": 2,
"schedule": "0 0 0 ? * *",
"deployment_name": "string",
"method": "preprocess",
"hotspot_analysis": {
"method": "flat",
"value": 50,
"features": [
"feature_x",
"feature_y"
]
},
"type": "feature-drift",
"drift_type": "distance",
"select_features_type": "custom",
"feature_weightage": "equal",
"feature_weights": {
"feature_a": 50,
"feature_b": 50
},
"drift_measure": "PSI",
"baseline_bins": null
},
"status": "active",
"segments": [],
"uuid": "123-abc-456",
"created_ts": 1678111202290.127,
"modified_ts": 1678119666010.7341,
"created_by": "user1",
"modified_by": "user2"
}
| Property | Type | Description |
| :--- | :--- | :--- |
| deployment | string | Name for the deployment to which this policy is assigned. |
| model_name | string | Name for the model to which this policy is assigned. |
| model_version | string | Version for the model. |
| model_stage | string | Stage for the model. |
| name | string | Name of the policy. Must be unique to the deployment, model_name, model_version, and model_stage. |
| description | string | Description of the policy, if defined. |
| type | string | The model may be configured to detect feature, prediction, or performance drift. |
| policy | JSON string | JSON object specifying the policy configuration: type of policy and what the policy is detecting, data to analyze (target data and baseline comparison data), metrics and/or algorithms, features to include in policy, etc. See documentation for each type of policy: [Distance-based drift on feature data](api-dist-drift-feat.html), [Distance-based drift on prediction data](api-dist-drift-pred.html), [Performance drift](api-drift-pred.html). |
| status | string | Current status of the policy. Supported values: `active`, `inactive`. If `active` the policy is run based on its schedule. |
| segments | array | Configured [segments](api-segments.html) for the policy. |
| uuid | string | Unique ID created by the platform for this policy. |
| created_ts | dateTime (Unix time in milliseconds) | Timestamp (Unix time in milliseconds) identifying when the policy was created. |
| modified_ts | dateTime (Unix time in milliseconds) | Timestamp (Unix time in milliseconds) identifying when the policy was last modified. |
| created_by | string | User who created the policy. |
| modified_by | string | User who last modified the policy. |
Supported values for fields are case-insensitive.
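Before sending a policy payload, a client could run a light sanity check like the sketch below. The set of fields treated as required and the `check_policy` helper are assumptions for illustration; only the `active` and `inactive` status values and their case-insensitivity come from this page.

```python
# Fields this sketch treats as required (an assumption, not a documented rule).
REQUIRED_FIELDS = (
    "deployment", "model_name", "model_version",
    "model_stage", "name", "type", "policy",
)
SUPPORTED_STATUS = {"active", "inactive"}


def check_policy(payload):
    """Return a list of human-readable problems found in a policy payload."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in payload]
    status = payload.get("status")
    # Supported values are compared case-insensitively.
    if status is not None and str(status).lower() not in SUPPORTED_STATUS:
        problems.append(f"unsupported status: {status!r}")
    return problems


example = {
    "deployment": "deployment_xyz",
    "model_name": "deployment_xyz",
    "model_version": "1",
    "model_stage": "primary",
    "name": "policy-name",
    "type": "drift",
    "policy": {"type": "feature-drift"},
    "status": "Active",
}
print(check_policy(example))
```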
Example policies
Feature drift with defined segment
This policy detects feature drift (using the Jensen-Shannon distance metric, `distance_JS`) on banking data. It compares daily inferences to those collected from the previous day, looking for drift across six features (defined with equal importance).
In addition to looking at all inferences, the policy includes a simple static segment that filters to include three specific branch locations.
{
"deployment": "deployment_xyz",
"model_name": "deployment_xyz",
"model_version": "1",
"model_stage": "model_stage123",
"name": "Accounts policy 1",
"description": "Detect drift in account data",
"type": "drift",
"policy": {
"type": "feature-drift",
"drift_type": "distance",
"window_parameters": {
"target": {
"window_type": "day"
},
"baseline": {
"window_method": "prior",
"window_type": "day"
}
},
"select_features_type": "custom",
"feature_weightage": "equal",
"feature_weights": {
"AccountID": 16.66,
"TransactionID": 16.66,
"AccountBalance": 16.66,
"Timestamp": 16.66,
"BranchLocation": 16.66,
"TransactionAmount": 16.66
},
"drift_measure": "distance_JS",
"warning_level": 1,
"critical_level": 2,
"schedule": "0 0 0 ? * *",
"method": "preprocess"
},
"segments": [
{
"name": "BranchLocation Segment",
"description": "Filter to look at data related to Rodeo Drive, Hollywood Boulevard, and Sunset Boulevard branches.",
"filters": [
{
"feature_name": "BranchLocation",
"value": [
"Rodeo Drive",
"Hollywood Boulevard",
"Sunset Boulevard"
],
"operator": "=",
"conjuction": null,
"grouped_filters": null
}
]
}
],
"status": "active",
"created_ts": 1678393009340.0012,
"modified_ts": 1678393009340.0012,
"created_by": "user1",
"modified_by": "user2"
}
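To show what the static segment in this example selects, the following sketch applies a segment's filters to a few in-memory rows. It is a simplified reading of the payload, not the platform's logic: the helper names are hypothetical, only the `=` operator is handled, a list-valued `value` is assumed to mean "matches any of these", and multiple filters are assumed to combine with AND.

```python
def matches_filter(row, flt):
    """Check one row (a dict) against one segment filter."""
    actual = row.get(flt["feature_name"])
    expected = flt["value"]
    if flt["operator"] == "=":
        if isinstance(expected, list):
            return actual in expected  # assumed: match any listed value
        return actual == expected
    raise NotImplementedError(f"operator {flt['operator']!r} not covered here")


def apply_segment(rows, segment):
    """Keep rows that satisfy every filter in the segment (assumed AND)."""
    return [row for row in rows
            if all(matches_filter(row, f) for f in segment["filters"])]


segment = {
    "name": "BranchLocation Segment",
    "filters": [{
        "feature_name": "BranchLocation",
        "value": ["Rodeo Drive", "Hollywood Boulevard", "Sunset Boulevard"],
        "operator": "=",
    }],
}
rows = [
    {"AccountID": 1, "BranchLocation": "Rodeo Drive"},
    {"AccountID": 2, "BranchLocation": "Venice Beach"},
    {"AccountID": 3, "BranchLocation": "Sunset Boulevard"},
]
selected = apply_segment(rows, segment)
print([row["AccountID"] for row in selected])
```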
Prediction drift for specific historical time period
The following policy detects prediction drift using the Population Stability Index metric (`PSI`). It compares prediction data for the target month ending 2023-02-15 against a baseline of the prediction data received in the three months prior.
{
"deployment": "deployment_xyz",
"model_name": "deployment_xyz",
"model_version": "1",
"model_stage": "primary",
"name": "policy-name",
"description": "Description of policy",
"type": "drift",
"policy": {
"type": "prediction-drift",
"drift_type": "distance",
"window_parameters": {
"target": {
"window_type": "month",
"process_date": "2023-02-15"
},
"baseline": {
"window_method": "prior",
"window_type": "month",
"last_amount": 3
}
},
"drift_measure": "PSI",
"warning_level": 0.1,
"critical_level": 0.25,
"schedule": "0 0 5 ? * *",
"deployment_name": "deployment_xyz",
"method": "preprocess",
"status": "active"
},
"status": "inactive",
"segments": null,
"uuid": "123-abc-456",
"created_ts": 1679596471053.271,
"modified_ts": 1679596471053.271,
"created_by": "user1",
"modified_by": "user2"
}
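One way to read the window settings in this example is as concrete date ranges. The sketch below derives them under stated assumptions: a `month` window is taken to span one calendar month ending on `process_date`, the baseline is taken to cover the `last_amount` months immediately before the target window, and day-of-month values are clamped for shorter months. The helper names are hypothetical, and the platform may define window boundaries differently.

```python
import calendar
from datetime import date


def months_back(d, n):
    """Return the date n calendar months before d, clamping the day."""
    index = d.year * 12 + (d.month - 1) - n
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def month_windows(process_date, last_amount):
    """Compute (start, end) date pairs for the target and baseline windows."""
    target_end = date.fromisoformat(process_date)
    target_start = months_back(target_end, 1)
    baseline_start = months_back(target_start, last_amount)
    return {
        "target": (target_start, target_end),
        "baseline": (baseline_start, target_start),
    }


windows = month_windows("2023-02-15", last_amount=3)
for name, (start, end) in windows.items():
    print(name, start.isoformat(), end.isoformat())
```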