{ "cells": [ { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "58fab4bb-231e-48cf-8ed4-fc15a1b22845", "showTitle": false, "title": "" } }, "source": [ "

Databricks-ML-professional-S04b-Drift-Tests-and-Monitoring

" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": {}, "inputWidgets": {}, "nuid": "70d28f6a-8dc1-4a86-952f-ebf6ac82479b", "showTitle": false, "title": "" } }, "source": [ "
\n", "
\n", "

This notebook covers the following requirements:


\n", "Drift Tests and Monitoring:\n", "\n", "
\n", "

Download this notebook in ipynb format here.

\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "2d6aaf81-c559-44bd-bc70-25852c40193d", "showTitle": false, "title": "" } }, "source": [ "\n", "
\n", "1. Describe summary statistic monitoring as a simple solution for numeric feature drift

Summary statistic monitoring is a simple way to detect drift in numeric features. Compute summary statistics (mean, standard deviation, minimum, maximum, and so on) for each numeric feature in the training data, then compute the same statistics on incoming production data and compare the two. A statistic that deviates from its training value by more than a chosen tolerance can indicate feature drift.
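
The comparison described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a Databricks API: the helper names `summarize` and `drifted_statistics`, the 10% relative tolerance `rel_tol`, and the sample price values are all hypothetical. On a real table the same statistics would typically be computed with Spark or pandas.

```python
import math
import statistics

def summarize(values):
    # Summary statistics for one numeric feature; None and NaN count as missing.
    present = [v for v in values if v is not None and not math.isnan(v)]
    return {
        'mean': statistics.fmean(present),
        'stddev': statistics.stdev(present),
        'min': min(present),
        'max': max(present),
        'missing_rate': 1 - len(present) / len(values),
    }

def drifted_statistics(reference, production, rel_tol=0.10):
    # Return the statistics whose relative change exceeds rel_tol
    # (rel_tol is a hypothetical threshold, to be tuned per feature).
    ref, prod = summarize(reference), summarize(production)
    flagged = {}
    for name, ref_value in ref.items():
        # Scale by the reference value; fall back to 1.0 when it is zero.
        scale = abs(ref_value) if ref_value != 0 else 1.0
        change = abs(prod[name] - ref_value) / scale
        if change > rel_tol:
            flagged[name] = (ref_value, prod[name])
    return flagged

# Made-up sample data: a price feature at training time and in production.
training_price = [100.0, 102.0, 98.0, 101.0, 99.0]
production_price = [120.0, 125.0, 118.0, 122.0, 121.0]

flagged = drifted_statistics(training_price, production_price)
print(sorted(flagged))
```

A fixed relative tolerance is only one possible decision rule. It is easy to explain, but it ignores sample size, so small production batches may trigger false alarms.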

" ] }, { "cell_type": "markdown", "metadata": { "application/vnd.databricks.v1+cell": { "cellMetadata": { "byteLimit": 2048000, "rowLimit": 10000 }, "inputWidgets": {}, "nuid": "18e681ce-93ed-4c38-814e-6d851bb56281", "showTitle": false, "title": "" } }, "source": [ "\n", "
\n", "2. Describe mode, unique values, and missing values as simple solutions for categorical\n", "feature drift
\n", "