Data profiling (APIs)
Get data profiling metrics
Seeing and understanding changes in data patterns is critical to effective model monitoring. Data profiling metrics provide insight into data degradation that could lead to drops in model performance, and they support root cause analysis for drift policies.
The API provides endpoints for accessing data profiling metrics. By default, metrics are calculated as part of running policies, and the resulting metrics are stored in the Risk Store database, where they can be retrieved. The Monitor placeholder notebook (accessible from the platform’s edahub service) shows an example of running a drift detection job, during which data profiling runs automatically.
REST API
Use v1/dataprofiling/submit to run the data profiling job for your model. Returns V1JobModel.
Note: Currently, this endpoint supports running a data profiling job only as a sub-job of a drift detection job, so it is typically not called directly.
Example payload:
{
  "deployment": "deployment_abc",
  "model_name": "deployment_abc",
  "model_version": "1",
  "model_stage": "primary",
  "policy": {
    "window_parameters": {
      "target": {
        "window_type": "day",
        "process_date": "2023-04-19"
      },
      "baseline": {
        "window_method": "prior",
        "window_type": "day",
        "process_date": "2023-04-19"
      }
    }
  },
  "status": "inactive",
  "segments": [
    {
      "model_uuid": "123-abc-456",
      "name": "segment-1",
      "description": "Segment to filter some data",
      "filters": [
        {
          "feature_name": "feature-a",
          "value": ["string1", "string2"],
          "operator": "=",
          "conjunction": "None",
          "grouped_filters": "None"
        }
      ],
      "id": 2,
      "status": "active",
      "created_ts": "1683132722980.102",
      "modified_ts": "1683132722980.102",
      "created_by": "user1",
      "modified_by": "user2"
    }
  ]
}
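As a rough illustration, the example payload above could be assembled and attached to an HTTP request along the following lines. This is a minimal sketch using only the Python standard library. The host name, the use of POST, the bearer-token header, and the helper names build_profiling_payload and build_request are assumptions made for illustration and are not taken from the platform documentation. The request is prepared but never sent.

```python
import json
from urllib.request import Request

# Hypothetical host: replace with your platform's address (assumption).
BASE_URL = "https://vianops.example.com"
# Endpoint path as named in the text above.
ENDPOINT = "v1/dataprofiling/submit"


def build_profiling_payload(deployment, model_name, model_version, process_date,
                            model_stage="primary", segments=None):
    """Return a dict shaped like the example payload (hypothetical helper)."""
    return {
        "deployment": deployment,
        "model_name": model_name,
        "model_version": model_version,
        "model_stage": model_stage,
        "policy": {
            "window_parameters": {
                "target": {
                    "window_type": "day",
                    "process_date": process_date,
                },
                "baseline": {
                    "window_method": "prior",
                    "window_type": "day",
                    "process_date": process_date,
                },
            }
        },
        "status": "inactive",
        "segments": segments or [],
    }


def build_request(payload, token):
    """Prepare, but do not send, a request carrying the JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/{ENDPOINT}",
        data=body,
        method="POST",  # HTTP method is an assumption
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # auth scheme is an assumption
        },
    )


payload = build_profiling_payload(
    "deployment_abc", "deployment_abc", "1", "2023-04-19"
)
request = build_request(payload, token="YOUR_TOKEN")

# Inspect what would be sent; pass `request` to urllib.request.urlopen to send it.
print(request.get_method(), request.full_url)
print(json.dumps(payload, indent=2))
```

Serializing with json.dumps guarantees the body is valid JSON, which avoids problems such as trailing commas in hand-written payloads. Since the endpoint is normally invoked as part of a drift detection job, a direct call like this is mainly useful for testing.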
Python SDK
Support for data profiling is provided by vianops_client.models.riskstore.data_profiling.V1DataProfilingModel.
Import and initialize client
V1DataProfilingModel