Concept Drift Detection
OpenMOA provides a complete concept drift detection framework covering 12 detectors, drift stream construction, drift-aware data generators, and evaluation tools for detector performance — all integrated with the broader stream learning pipeline.
1. What is Concept Drift?
In data streams, the underlying data distribution can change over time — a phenomenon known as concept drift. When drift occurs, a learner's previously acquired knowledge may become outdated, leading to degraded prediction performance.
OpenMOA addresses this through a dedicated drift detection module (openmoa.drift) that provides:
12 Drift Detectors
From classic statistical tests (ADWIN, DDM, CUSUM) to deep learning-based multivariate methods (ABCD).
Drift Stream Builders
Compose any sequence of sub-streams connected by abrupt or gradual drift transitions.
Drift-Aware Generators
Synthetic streams with built-in drift — RBF centroids that drift, LED and waveform attributes that change.
Detector Evaluation
Measure detection delay, missed detections, and false alarm rate against known ground truth.
2. Common Detector Interface
All detectors share the same base interface, defined in BaseDriftDetector. Call add_element() on each new data point, then query the detection state.
Base Class Attributes
| Attribute | Type | Description |
|---|---|---|
| in_concept_change | bool | Whether drift is currently detected |
| in_warning_zone | bool | Whether the detector is in warning state |
| detection_index | list[int] | History of all instance indices where drift was signaled |
| warning_index | list[int] | History of all instance indices where warning was signaled |
| idx | int | Total number of instances processed |
Base Class Methods
| Method | Description |
|---|---|
| add_element(element: float) | Update detector with new input value (typically prediction error: 0 or 1) |
| detected_change() → bool | Returns True if drift is currently detected |
| detected_warning() → bool | Returns True if warning zone is currently active |
| reset(clean_history=False) | Reset detector state; optionally clear detection history |
| get_params() → dict | Return the detector's hyperparameter configuration |
Basic Usage Pattern
from openmoa.drift.detectors import ADWIN

detector = ADWIN(delta=0.001)

# `stream` yields (x, y) pairs; `learner` is any incremental classifier
for i, (x, y) in enumerate(stream):
    y_pred = learner.predict(x)
    error = int(y_pred != y)  # 1 if misclassified, 0 otherwise
    detector.add_element(error)
    if detector.detected_change():
        print(f"Drift detected at instance {i}")
        learner.reset()
    elif detector.detected_warning():
        print(f"Warning at instance {i}")
    learner.train(x, y)

print(detector.detection_index)  # e.g. [1011, 3025, ...]
print(detector.warning_index)    # e.g. [985, 2990, ...]
3. Drift Detector Catalogue
3.1 ADWIN — Adaptive Windowing
Import: from openmoa.drift.detectors import ADWIN
Maintains an adaptive window that automatically shrinks when it detects that two sub-windows have statistically different means. Non-parametric and parameter-light.
ADWIN(delta=0.002)
| Parameter | Type | Default | Description |
|---|---|---|---|
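| delta | float | 0.002 | Confidence parameter that bounds the false-alarm rate. Smaller values make the test more conservative (fewer false alarms, longer detection delay) |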
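To make the role of delta concrete, the sketch below computes the cut threshold given in the original ADWIN paper (Bifet and Gavaldà, 2007) for a single split of the window into sub-windows of sizes n0 and n1. It is a stand-alone illustration, not OpenMOA's implementation, and the function names `adwin_cut_threshold` and `should_cut` are hypothetical.

```python
import math


def adwin_cut_threshold(n0: int, n1: int, delta: float) -> float:
    """Textbook epsilon_cut for one split of an ADWIN window."""
    n = n0 + n1
    m = 1.0 / (1.0 / n0 + 1.0 / n1)  # combines the two sub-window sizes
    delta_prime = delta / n          # spreads delta over the candidate splits
    return math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))


def should_cut(mean0: float, mean1: float, n0: int, n1: int, delta: float) -> bool:
    """True when the sub-window means differ by more than the threshold."""
    return abs(mean0 - mean1) > adwin_cut_threshold(n0, n1, delta)


strict = adwin_cut_threshold(500, 500, delta=0.002)
loose = adwin_cut_threshold(500, 500, delta=0.2)
print(f"threshold at delta=0.002: {strict:.4f}")
print(f"threshold at delta=0.2:   {loose:.4f}")
```

Under this formula the threshold grows as delta shrinks, so a smaller delta demands a larger gap between the sub-window means before the window is cut.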
3.2 DDM — Drift Detection Method
Import: from openmoa.drift.detectors import DDM
Monitors the error rate of a classifier. Drift is detected when the error rate rises significantly above the minimum observed error rate, using statistical control limits.
DDM(min_n_instances=30, warning_level=2.0, out_control_level=3.0)
| Parameter | Type | Default | Description |
|---|---|---|---|
| min_n_instances | int | 30 | Minimum instances before monitoring begins |
| warning_level | float | 2.0 | Standard deviations above minimum for warning |
| out_control_level | float | 3.0 | Standard deviations above minimum for drift |
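The control-limit rule described above can be sketched in a few lines. This is the textbook DDM rule written from scratch, assuming a 0/1 error stream; the class `SimpleDDM` is hypothetical and only mirrors the parameter names in the table, so OpenMOA's own implementation may differ in details.

```python
import math


class SimpleDDM:
    """Illustrative DDM: compare p + s against p_min + k * s_min."""

    def __init__(self, min_n_instances=30, warning_level=2.0, out_control_level=3.0):
        self.min_n = min_n_instances
        self.warning_level = warning_level
        self.out_control_level = out_control_level
        self.reset()

    def reset(self):
        self.n = 0
        self.p = 0.0            # running error rate
        self.p_min = math.inf   # error rate at the lowest p + s seen so far
        self.s_min = math.inf   # standard deviation at that same point

    def update(self, error: int) -> str:
        self.n += 1
        self.p += (error - self.p) / self.n
        s = math.sqrt(max(0.0, self.p * (1.0 - self.p)) / self.n)
        if self.n < self.min_n:
            return "stable"
        if self.p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        if self.p + s > self.p_min + self.out_control_level * self.s_min:
            self.reset()
            return "drift"
        if self.p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"


# Deterministic toy stream: 10% errors for 300 instances, then 50% errors.
errors = [1 if i % 10 == 9 else 0 for i in range(300)]
errors += [1 if j % 2 == 0 else 0 for j in range(300)]

ddm = SimpleDDM()
drifts = [i for i, e in enumerate(errors) if ddm.update(e) == "drift"]
print(drifts)
```

The detector stays quiet during the first 300 instances and signals shortly after the error rate jumps, because p + s then exceeds the stored minimum by more than three standard deviations.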
3.3 CUSUM — Cumulative Sum Test
Import: from openmoa.drift.detectors import CUSUM
Accumulates deviations from a target mean. When the cumulative sum exceeds the threshold lambda_, drift is flagged and the counter resets.
CUSUM(min_n_instances=30, delta=0.005, lambda_=50)
| Parameter | Type | Default | Description |
|---|---|---|---|
| min_n_instances | int | 30 | Minimum instances before monitoring |
| delta | float | 0.005 | Allowance parameter (sensitivity) |
| lambda_ | float | 50 | Detection threshold |
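As a rough illustration of the accumulation rule, the following sketch implements a one-sided CUSUM that tracks the running mean as its target. The class `SimpleCUSUM` is hypothetical and uses the table's parameter names only for readability; it should not be read as OpenMOA's code.

```python
class SimpleCUSUM:
    """Illustrative one-sided CUSUM for upward shifts in the mean."""

    def __init__(self, min_n_instances=30, delta=0.005, lambda_=50.0):
        self.min_n = min_n_instances
        self.delta = delta
        self.lambda_ = lambda_
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum_sum = 0.0

    def update(self, x: float) -> bool:
        self.n += 1
        self.mean += (x - self.mean) / self.n
        # Deviations smaller than delta are absorbed; the sum never drops below 0.
        self.cum_sum = max(0.0, self.cum_sum + x - self.mean - self.delta)
        if self.n >= self.min_n and self.cum_sum > self.lambda_:
            self.reset()
            return True
        return False


# Toy signal: 200 zeros followed by 200 ones (abrupt mean shift at index 200).
signal = [0.0] * 200 + [1.0] * 200

cusum = SimpleCUSUM()
detections = [i for i, x in enumerate(signal) if cusum.update(x)]
print(detections)
```

With lambda_=50 the sum needs several dozen shifted observations before it crosses the threshold, which shows how lambda_ trades detection delay against robustness to short bursts.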
3.4 EWMA Chart — Exponentially Weighted Moving Average
Import: from openmoa.drift.detectors import EWMAChart
Applies exponential smoothing to the error stream. Drift is detected when the smoothed value exceeds a control limit.
EWMAChart(min_n_instances=30, lambda_=0.2)
| Parameter | Type | Default | Description |
|---|---|---|---|
| min_n_instances | int | 30 | Minimum instances before monitoring |
| lambda_ | float | 0.2 | Smoothing factor (0 < λ ≤ 1) |
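A minimal sketch of an EWMA control chart on a 0/1 error stream follows. The control-limit multiplier (`control_limit=3.0`) and the Bernoulli variance estimate are assumptions made for this illustration; they are not parameters documented for EWMAChart, and the class `SimpleEWMAChart` is hypothetical.

```python
import math


class SimpleEWMAChart:
    """Illustrative EWMA chart: signal when z exceeds mean + L * sigma_z."""

    def __init__(self, min_n_instances=30, lambda_=0.2, control_limit=3.0):
        self.min_n = min_n_instances
        self.lambda_ = lambda_
        self.control_limit = control_limit  # assumed multiplier L
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0  # running estimate of the error rate
        self.z = 0.0     # exponentially smoothed error

    def update(self, x: int) -> bool:
        lam = self.lambda_
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.z = (1.0 - lam) * self.z + lam * x
        sigma_x = math.sqrt(max(0.0, self.mean * (1.0 - self.mean)))
        sigma_z = sigma_x * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * self.n))
        )
        if self.n >= self.min_n and self.z > self.mean + self.control_limit * sigma_z:
            self.reset()
            return True
        return False


# Toy stream: 10% errors for 300 instances, then every prediction is wrong.
errors = [1 if i % 10 == 9 else 0 for i in range(300)] + [1] * 100

chart = SimpleEWMAChart()
detections = [i for i, e in enumerate(errors) if chart.update(e)]
print(detections)
```

A larger lambda_ makes z follow recent errors more closely, so the chart reacts faster but also fluctuates more between alarms.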
3.5 GeometricMovingAverage
Import: from openmoa.drift.detectors import GeometricMovingAverage
Uses a geometric moving average with a forgetting factor to track the error rate. Drift is flagged when the average exceeds a threshold.
GeometricMovingAverage(min_n_instances=30, lambda_=1.0, alpha=0.99)
| Parameter | Type | Default | Description |
|---|---|---|---|
| min_n_instances | int | 30 | Minimum instances before monitoring |
| lambda_ | float | 1.0 | Detection threshold |
| alpha | float | 0.99 | Decay factor (forgetting rate) |
3.6 HDDM_A — Hoeffding's Bound DDM (Average)
Import: from openmoa.drift.detectors import HDDM_A
Applies Hoeffding's inequality to bound the probability that the error rate has increased significantly from its historical minimum, using cumulative average statistics.
HDDM_A(drift_confidence=0.001, warning_confidence=0.005, test_type="Two-sided")
| Parameter | Type | Default | Description |
|---|---|---|---|
| drift_confidence | float | 0.001 | Significance level for drift detection |
| warning_confidence | float | 0.005 | Significance level for warning |
| test_type | str | "Two-sided" | "Two-sided" or "One-sided" |
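The sketch below shows only the basic one-sided Hoeffding bound for values in [0, 1], to illustrate how the two confidence parameters translate into thresholds. HDDM_A's full test statistic is more involved and is not reproduced here; the function name `hoeffding_bound` is hypothetical.

```python
import math


def hoeffding_bound(n: int, confidence: float) -> float:
    """Deviation of an n-sample mean (values in [0, 1]) exceeded with
    probability at most `confidence`."""
    return math.sqrt(math.log(1.0 / confidence) / (2.0 * n))


drift_eps = hoeffding_bound(100, 0.001)  # drift_confidence default
warn_eps = hoeffding_bound(100, 0.005)   # warning_confidence default
print(f"drift bound:   {drift_eps:.4f}")
print(f"warning bound: {warn_eps:.4f}")
```

Because warning_confidence is larger than drift_confidence, the warning bound is the smaller of the two, so a growing error rate crosses the warning threshold before it crosses the drift threshold.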
3.7 HDDM_W — Hoeffding's Bound DDM (Weighted)
Import: from openmoa.drift.detectors import HDDM_W
Weighted variant of HDDM_A. More recent observations receive higher weight via an exponential decay factor lambda_.
HDDM_W(drift_confidence=0.001, warning_confidence=0.005, lambda_=0.05, test_type="Two-sided")
| Parameter | Type | Default | Description |
|---|---|---|---|
| drift_confidence | float | 0.001 | Significance level for drift |
| warning_confidence | float | 0.005 | Significance level for warning |
| lambda_ | float | 0.05 | Decay factor weighting recent instances higher |
| test_type | str | "Two-sided" | "Two-sided" or "One-sided" |
Extra attributes:
.estimation (current mean estimate), .delay (detection delay)
3.8 Page-Hinkley Test
Import: from openmoa.drift.detectors import PageHinkley
Sequential change-point detection using cumulative sums with both upper and lower bounds. Drift is flagged when the difference between the maximum cumulative sum and the current value exceeds the threshold.
PageHinkley(min_n_instances=30, delta=0.005, lambda_=50.0, alpha=0.9999)
| Parameter | Type | Default | Description |
|---|---|---|---|
| min_n_instances | int | 30 | Minimum instances before monitoring |
| delta | float | 0.005 | Allowance parameter |
| lambda_ | float | 50.0 | Detection threshold |
| alpha | float | 0.9999 | Forgetting factor (decay) |
3.9 RDDM — Reactive Drift Detection Method
Import: from openmoa.drift.detectors import RDDM
An extension of DDM that resets when a long sequence of instances passes without drift detection, preventing false adaptation to temporary fluctuations.
RDDM(
    min_n_instances=129,
    warning_level=1.773,
    drift_level=2.258,
    max_size_concept=40000,
    min_size_concept=7000,
    warning_limit=1400
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| min_n_instances | int | 129 | Minimum instances before monitoring |
| warning_level | float | 1.773 | Standard deviations for warning |
| drift_level | float | 2.258 | Standard deviations for drift |
| max_size_concept | int | 40000 | Max concept length before reset |
| min_size_concept | int | 7000 | Min concept length to retain |
| warning_limit | int | 1400 | Max consecutive warnings before reset |
3.10 SEED — Statistical Entropy-based Detector
Import: from openmoa.drift.detectors import SEED
Detects drift by monitoring volatility shifts in the error stream using block-level entropy. Suited to detecting changes in error variability rather than only shifts in the mean.
SEED(delta=0.05, block_size=32, epsilon_prime=0.01, alpha=0.8, compress_term=75)
| Parameter | Type | Default | Description |
|---|---|---|---|
| delta | float | 0.05 | Confidence threshold |
| block_size | int | 32 | Size of blocks for entropy computation |
| epsilon_prime | float | 0.01 | Error tolerance |
| alpha | float | 0.8 | Decay factor |
| compress_term | int | 75 | Compression frequency |
3.11 STEPD — Statistical Test of Equal Proportions Drift
Import: from openmoa.drift.detectors import STEPD
Uses a statistical test comparing the proportion of correct predictions in a recent window with the overall accuracy to date.
STEPD(window_size=30, alpha_drift=0.003, alpha_warning=0.05)
| Parameter | Type | Default | Description |
|---|---|---|---|
| window_size | int | 30 | Size of the recent comparison window |
| alpha_drift | float | 0.003 | Significance level for drift detection |
| alpha_warning | float | 0.05 | Significance level for warning |
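The comparison can be sketched with the classic test of equal proportions with continuity correction, as used in the original STEPD paper (Nishida and Yamauchi, 2007). The function names and the one-sided normal approximation are choices made for this illustration, not a description of OpenMOA's internals.

```python
import math


def stepd_p_value(correct_old, n_old, correct_recent, n_recent):
    """One-sided p-value for the difference between two accuracies."""
    p_hat = (correct_old + correct_recent) / (n_old + n_recent)
    inv = 1.0 / n_old + 1.0 / n_recent
    denom = math.sqrt(p_hat * (1.0 - p_hat) * inv)
    if denom == 0.0:
        return 1.0
    diff = abs(correct_old / n_old - correct_recent / n_recent)
    stat = (diff - 0.5 * inv) / denom              # continuity-corrected statistic
    return 0.5 * math.erfc(stat / math.sqrt(2.0))  # upper tail of N(0, 1)


def stepd_state(correct_old, n_old, correct_recent, n_recent,
                alpha_drift=0.003, alpha_warning=0.05):
    # Only a drop in recent accuracy is treated as evidence of drift.
    if correct_recent / n_recent >= correct_old / n_old:
        return "stable"
    p = stepd_p_value(correct_old, n_old, correct_recent, n_recent)
    if p < alpha_drift:
        return "drift"
    if p < alpha_warning:
        return "warning"
    return "stable"


print(stepd_state(900, 1000, 27, 30))  # recent accuracy unchanged
print(stepd_state(900, 1000, 23, 30))  # moderate drop
print(stepd_state(900, 1000, 15, 30))  # large drop
```

With a recent window of only 30 instances, a few extra mistakes are enough to move the p-value across alpha_warning, which is why window_size and the two significance levels should be tuned together.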
3.12 ABCD — Adaptive Bernstein Change Detector (Multivariate)
Import: from openmoa.drift.detectors import ABCD
Unlike the other OpenMOA detectors, ABCD operates on raw feature vectors rather than prediction errors. It trains a dimensionality reduction model (AutoEncoder, PCA, or Kernel PCA) and monitors reconstruction loss over time using the Bernstein statistical test. It can also identify which features are drifting.
ABCD(
    delta_drift=0.002,
    delta_warn=0.01,
    model_id="ae",  # "ae", "pca", or "kpca"
    split_type="ed",
    encoding_factor=0.5,
    update_epochs=50,
    num_splits=20,
    max_size=np.inf,
    subspace_threshold=2.5,
    n_min=100,
    maximum_absolute_value=1.0,
    bonferroni=False
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| delta_drift | float | 0.002 | Confidence level for drift detection |
| delta_warn | float | 0.01 | Confidence level for warning |
| model_id | str | "ae" | Feature extraction model: "ae" (AutoEncoder), "pca", "kpca" |
| encoding_factor | float | 0.5 | Bottleneck size as fraction of input size |
| update_epochs | int | 50 | Training epochs after confirmed drift |
| num_splits | int | 20 | Number of time split points to evaluate |
| n_min | int | 100 | Minimum instances before pre-training |
| bonferroni | bool | False | Apply Bonferroni correction for multiple testing |
Additional Methods (ABCD only)
| Method | Returns | Description |
|---|---|---|
| get_drift_dims() | np.ndarray | Feature indices where drift was detected |
| get_dims_p_values() | np.ndarray | Per-feature p-values |
| get_severity() | float | Magnitude of drift as a z-score |
| loss() | float | Current reconstruction loss |
from openmoa.drift.detectors import ABCD
import numpy as np

detector = ABCD(model_id="ae", delta_drift=0.002)

for instance in stream:
    detector.add_element(instance.x)  # full feature vector, not error
    if detector.detected_change():
        print(f"Drift at feature dims: {detector.get_drift_dims()}")
        print(f"Severity: {detector.get_severity():.3f}")
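To illustrate why reconstruction loss is a useful drift signal, the sketch below fits a PCA model (via NumPy's SVD) on reference data and compares the loss on data from the same concept with the loss on data from a changed concept. It covers only the loss signal; the Bernstein test that ABCD applies on top of it is omitted, and all names and the synthetic data are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_data(mixing: np.ndarray, n: int = 500, noise: float = 0.05) -> np.ndarray:
    """Samples lying near the 2-D subspace spanned by the rows of `mixing`."""
    latent = rng.normal(size=(n, mixing.shape[0]))
    return latent @ mixing + noise * rng.normal(size=(n, mixing.shape[1]))


mixing_before = rng.normal(size=(2, 5))  # concept before drift
mixing_after = rng.normal(size=(2, 5))   # concept after drift

# Fit PCA with two components on reference data from the first concept.
reference = make_data(mixing_before)
mean = reference.mean(axis=0)
_, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
components = vt[:2]


def reconstruction_loss(x: np.ndarray) -> np.ndarray:
    """Per-instance mean squared error after projecting onto the components."""
    centered = x - mean
    reconstructed = centered @ components.T @ components
    return np.mean((centered - reconstructed) ** 2, axis=1)


loss_before = reconstruction_loss(make_data(mixing_before)).mean()
loss_after = reconstruction_loss(make_data(mixing_after)).mean()
print(f"loss on old concept: {loss_before:.4f}")
print(f"loss on new concept: {loss_after:.4f}")
```

Data from the new concept no longer lies near the learned subspace, so its reconstruction loss rises sharply; a detector in the style of ABCD then tests whether such a rise is statistically significant.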
4. Choosing a Detector
| Detector | Input | Warning Zone | Distribution-free | Speed | Best For |
|---|---|---|---|---|---|
| ADWIN | Error (0/1) | Yes | Yes | Fast | General purpose |
| DDM | Error (0/1) | Yes | No | Fast | Binary classification |
| CUSUM | Continuous | No | No | Fast | Process monitoring |
| EWMA Chart | Error (0/1) | No | No | Fast | Smoothed error tracking |
| Geometric MA | Error (0/1) | No | No | Fast | Adaptive smoothing |
| HDDM_A | Error (0/1) | Yes | Yes | Medium | Error rate bounds |
| HDDM_W | Error (0/1) | Yes | Yes | Medium | Recency-weighted |
| Page-Hinkley | Continuous | No | No | Fast | Mean shift detection |
| RDDM | Error (0/1) | Yes | No | Medium | Avoids false resets |
| SEED | Error (0/1) | Yes | Yes | Medium | Volatility shifts |
| STEPD | Error (0/1) | Yes | No | Fast | Proportion comparison |
| ABCD | Feature vector | Yes | Yes | Slower | Multivariate / raw features |
5. Evaluating Drift Detectors
When the true drift positions are known, OpenMOA provides EvaluateDetector to measure the quality of a detector's output.
Import: from openmoa.drift.eval_detector import EvaluateDetector
EvaluateDetector(max_delay: int)
| Parameter | Description |
|---|---|
| max_delay | Maximum instances to wait after a drift before it is counted as "missed" |
results = evaluator.calc_performance(preds, trues)
| Parameter | Type | Description |
|---|---|---|
| preds | array-like | Instance indices where the detector signaled drift (from detector.detection_index) |
| trues | array-like | Ground-truth drift positions |
Returned Metrics (pd.Series)
| Metric | Description |
|---|---|
| mean_time_to_detect | Average instances between true drift and first detection (NaN if missed) |
| missed_detection_ratio | Fraction of true drifts not detected within max_delay (0.0 = perfect) |
| mean_time_btw_false_alarms | Average gap between consecutive false alarms (NaN if none) |
| no_alarms_per_episode | Average number of false alarms per drift episode |
import numpy as np
from openmoa.drift.detectors import ADWIN
from openmoa.drift.eval_detector import EvaluateDetector

# Synthetic signal with an abrupt mean shift at index 1000
data = np.concatenate([
    np.random.randint(0, 2, 1000),  # Before drift: values in {0, 1}
    np.random.randint(3, 8, 1000),  # After drift: values in {3, ..., 7}
])

detector = ADWIN(delta=0.001)
for val in data:
    detector.add_element(float(val))

evaluator = EvaluateDetector(max_delay=200)
results = evaluator.calc_performance(
    preds=detector.detection_index,
    trues=np.array([1000])
)
print(results)
# Example output (exact values vary because the data is not seeded):
# mean_time_to_detect            11.0
# missed_detection_ratio          0.0
# mean_time_btw_false_alarms      NaN
# no_alarms_per_episode           0.0
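For intuition, two of the metrics can be computed by hand as sketched below. This follows one plausible reading of the definitions in the table (a true drift counts as detected if any alarm falls within max_delay instances after it); the function `detection_metrics` is hypothetical and EvaluateDetector's exact conventions may differ.

```python
import math


def detection_metrics(preds, trues, max_delay):
    """Mean detection delay and missed-detection ratio for known drifts."""
    preds = sorted(preds)
    delays = []
    missed = 0
    for t in trues:
        # Alarms that fall inside the acceptance window of this true drift.
        hits = [p for p in preds if t <= p <= t + max_delay]
        if hits:
            delays.append(hits[0] - t)
        else:
            missed += 1
    mean_delay = sum(delays) / len(delays) if delays else math.nan
    return {
        "mean_time_to_detect": mean_delay,
        "missed_detection_ratio": missed / len(trues),
    }


metrics = detection_metrics(
    preds=[1011, 3025, 4500],
    trues=[1000, 3000, 6000],
    max_delay=200,
)
print(metrics)
```

In this toy case the drifts at 1000 and 3000 are detected after 11 and 25 instances, the drift at 6000 is missed, and the alarm at 4500 matches no true drift and would count as a false alarm.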
6. DriftStream — Composing Streams with Drift
Import: from openmoa.stream.drift import DriftStream, AbruptDrift, GradualDrift
AbruptDrift
AbruptDrift(position: int, random_seed: int = 1)
Instantaneous switch at a given instance index.
GradualDrift
# Specify by center position + width
GradualDrift(position=10000, width=2000, random_seed=1)
# Or specify by start + end
GradualDrift(start=9000, end=11000, random_seed=1)
Instances are probabilistically drawn from either concept over the drift window.
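One common way to model this probabilistic hand-over is the sigmoid used by MOA's ConceptDriftStream, sketched below. Whether OpenMOA's GradualDrift uses exactly this function is an assumption; the helper names are hypothetical.

```python
import math
import random


def new_concept_probability(t: int, position: int, width: int) -> float:
    """Probability that instance t is drawn from the new concept."""
    return 1.0 / (1.0 + math.exp(-4.0 * (t - position) / width))


def pick_concept(t: int, position: int, width: int, rng: random.Random) -> int:
    """Return 1 for the old concept or 2 for the new one."""
    return 2 if rng.random() < new_concept_probability(t, position, width) else 1


rng = random.Random(1)
position, width = 10000, 2000

for t in (8000, 9000, 10000, 11000, 12000):
    print(t, round(new_concept_probability(t, position, width), 3))

# Share of draws from the new concept well before and well after the centre.
early = sum(pick_concept(8000, position, width, rng) == 2 for _ in range(1000))
late = sum(pick_concept(12000, position, width, rng) == 2 for _ in range(1000))
```

The probability is exactly 0.5 at the centre position and changes most quickly there, so most of the mixing happens inside the window of the given width.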
from openmoa.stream.drift import DriftStream, AbruptDrift, GradualDrift
from openmoa.stream.generator import SEA

stream = DriftStream(stream=[
    SEA(function=1),                           # Concept 1
    AbruptDrift(position=5000),                # Sudden switch at instance 5000
    SEA(function=2),                           # Concept 2
    GradualDrift(position=10000, width=2000),  # Gradual drift from 9000–11000
    SEA(function=3),                           # Concept 3
])

print(stream.get_num_drifts())  # 2
print(stream.get_drifts())      # List of Drift objects
RecurrentConceptDriftStream
from openmoa.stream.drift import RecurrentConceptDriftStream, AbruptDrift

# concept1, concept2, concept3 are placeholders for stream objects defined elsewhere
stream = RecurrentConceptDriftStream(
    concept_list=[concept1, concept2, concept3],
    max_recurrences_per_concept=2,
    transition_type_template=AbruptDrift(position=2000),
)
7. Built-in Drifting Generators
| Generator | Key Parameters | Drift Mechanism |
|---|---|---|
| RandomRBFGeneratorDrift | number_of_drifting_centroids=2, magnitude_of_change=0.0 | RBF centroids slowly move over time |
| LEDGeneratorDrift | number_of_attributes_with_drift=7 | LED display attributes drift |
| WaveformGeneratorDrift | number_of_attributes_with_drift=10 | Waveform attributes drift |
from openmoa.stream.generator import RandomRBFGeneratorDrift

stream = RandomRBFGeneratorDrift(
    model_random_seed=1,
    number_of_classes=2,
    number_of_centroids=50,
    number_of_drifting_centroids=5,
    magnitude_of_change=0.5,
)
8. Drift Benchmark Datasets
| Dataset | Instances | Features | Drift Events | Description |
|---|---|---|---|---|
| CovtFD | 581,011 | 104 | 2 (at 193,669 & 387,338) | Covertype with synthetic feature drift: dummy features swap with real features |
| RTG_2abrupt | 100,000 | 30 | 2 abrupt | Random Tree Generator with 2 injected abrupt concept drifts, 5 classes |
from openmoa.datasets import CovtFD, RTG_2abrupt
stream = CovtFD()
stream = RTG_2abrupt()
9. Visualizing Drift in Evaluation Results
When a DriftStream is passed to prequential_evaluation, the resulting plot automatically marks drift locations.
from openmoa.classifier import HoeffdingTree
from openmoa.evaluation import prequential_evaluation
from openmoa.evaluation.visualization import plot_windowed_results
from openmoa.stream.drift import DriftStream, AbruptDrift, GradualDrift
from openmoa.stream.generator import SEA

stream = DriftStream(stream=[
    SEA(function=1),
    AbruptDrift(position=5000),
    SEA(function=2),
    GradualDrift(position=10000, width=2000),
    SEA(function=3),
])

learner = HoeffdingTree(schema=stream.get_schema())
results = prequential_evaluation(stream=stream, learner=learner, window_size=500)
plot_windowed_results(results, metric="accuracy")
Abrupt Drifts
Rendered as vertical dashed lines at the exact drift position.
Gradual Drifts
Rendered as shaded bands (70% transparency) showing the full transition window.