API Reference
openmoa.stream
Complete API reference for all stream classes, instance types, schemas, and stream wrappers in OpenMOA. Covers every public class, method, parameter, and attribute.
§0 Module Exports
from openmoa.stream import (
Stream, MOAStream,
ARFFStream, NumpyStream, CSVStream, ConcatStream,
LibsvmStream, BagOfWordsStream,
OpenFeatureStream, EvolvingFeatureStream, # EvolvingFeatureStream is an alias
TrapezoidalStream, CapriciousStream, EvolvableStream,
ShuffledStream,
stream_from_file,
)
from openmoa.stream.drift import (
DriftStream, Drift, AbruptDrift, GradualDrift,
RecurrentConceptDriftStream,
)
from openmoa.stream.generator import (
SEA, RandomTreeGenerator,
RandomRBFGenerator, RandomRBFGeneratorDrift,
LEDGenerator, LEDGeneratorDrift,
WaveformGenerator, WaveformGeneratorDrift,
AgrawalGenerator, HyperplaneGenerator,
STAGGERGenerator, MixedGenerator,
)
from openmoa.datasets import (
Electricity, ElectricityTiny,
Covtype, CovtypeNorm, CovtypeTiny, CovtFD,
RBFm_100k, RTG_2abrupt, Hyper100k, Sensor,
Fried, FriedTiny, Bike,
# ... full list in §7
)
§1 Core Data Structures
Schema
Describes the structure of a stream: attribute names, data types, number of classes, and label values. Required by all learners and evaluators.
Schema(moa_header)
signature
Schema(moa_header: InstancesHeader) -> None
| Parameter | Type | Description |
| moa_header | InstancesHeader | Java MOA header object. Typically not called directly — use Schema.from_custom() or stream.get_schema(). |
Schema.from_custom (class method)
signature
Schema.from_custom(
feature_names: Sequence[str],
values_for_nominal_features: Dict[str, Sequence[str]] = {},
values_for_class_label: Sequence[str] = None,
dataset_name: str = "No_Name",
target_attribute_name: Optional[str] = None,
target_type: Optional[str] = None,
) -> Schema
| Parameter | Type | Default | Description |
| feature_names | Sequence[str] | required | List of feature attribute names |
| values_for_nominal_features | Dict[str, Sequence[str]] | {} | Maps feature name → list of possible values for nominal features |
| values_for_class_label | Sequence[str] | None | Possible class label strings; if None → regression schema |
| dataset_name | str | "No_Name" | Name of the dataset |
| target_attribute_name | Optional[str] | None | Name of the target/class attribute |
| target_type | Optional[str] | None | 'categorical', 'numeric', or None (auto-detect) |
Task type methods
| Method | Returns | Description |
| is_classification() | bool | True if classification task |
| is_regression() | bool | True if regression task |
Attribute info methods
| Method | Returns | Description |
| get_num_attributes() | int | Number of input features (excluding target) |
| get_num_numeric_attributes() | int | Count of numeric attributes |
| get_num_nominal_attributes() | int | Count of nominal (categorical) attributes |
| get_numeric_attributes() | Optional[list] | List of numeric attribute names |
| get_nominal_attributes() | Optional[dict] | Dict of {name: [values]} for nominal attributes |
Class / label info methods (classification only)
| Method | Returns | Description |
| get_num_classes() | int | Number of possible classes (1 for regression) |
| get_label_values() | Sequence[str] | List of possible class label strings |
| get_label_indexes() | Sequence[int] | List of class indices [0, 1, ..., n-1] |
| get_value_for_index(y_index) | Optional[str] | Class label string for index; None if y_index is None |
| get_index_for_label(y) | int | Class index for label string; raises KeyError if not found |
| is_y_index_in_range(y_index) | bool | Whether y_index is valid for this schema |
MOA access & special methods
| Method / Property | Returns | Description |
| get_moa_header() | InstancesHeader | Underlying Java MOA header (advanced use) |
| dataset_name | str | Property — name of the dataset |
| __str__() / __repr__() | str | Returns ARFF header representation |
| __eq__(other) | bool | Compares number of attributes and classes |
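The label/index methods above behave like a simple bidirectional mapping between class-label strings and 0-based indices. A pure-Python sketch of those semantics (illustrative only — the real Schema delegates to the MOA header; LabelMap is a hypothetical helper, not part of the API):

```python
# Sketch of Schema's label <-> index mapping semantics (illustrative only).
class LabelMap:
    def __init__(self, label_values):
        self._labels = list(label_values)
        self._index = {label: i for i, label in enumerate(self._labels)}

    def get_num_classes(self):
        return len(self._labels)

    def get_label_indexes(self):
        return list(range(len(self._labels)))

    def get_value_for_index(self, y_index):
        # None passes through, mirroring Schema.get_value_for_index
        return None if y_index is None else self._labels[y_index]

    def get_index_for_label(self, y):
        # Unknown labels raise KeyError, matching the documented behavior
        return self._index[y]

    def is_y_index_in_range(self, y_index):
        return 0 <= y_index < len(self._labels)

labels = LabelMap(["UP", "DOWN"])
print(labels.get_index_for_label("DOWN"))  # 1
print(labels.get_value_for_index(None))    # None
```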
Instance
Base class representing a single data point with a feature vector and schema reference.
signature
Instance(schema: Schema, instance: Union[InstanceExample, FeatureVector]) -> None
Instance.from_array (class method)
Instance.from_array(schema: Schema, instance: FeatureVector) -> Instance
Creates an Instance from a NumPy feature array (no label).
Properties
| Property | Type | Description |
| x | NDArray[float64] | Feature vector as 1D NumPy array |
| schema | Schema | The stream schema |
| java_instance | InstanceExample | Java representation |
LabeledInstance
Instance with a class label for classification tasks.
LabeledInstance.from_array (class method)
LabeledInstance.from_array(schema: Schema, x: FeatureVector, y_index: int) -> LabeledInstance
| Parameter | Type | Description |
| schema | Schema | Classification schema |
| x | NDArray[float64] | Feature vector |
| y_index | int | Class index (0-based) |
Properties
| Property | Type | Description |
| x | NDArray[float64] | Feature vector |
| y_index | int | Class index (0-based integer) |
| y_label | str | Class label string (via schema.get_value_for_index) |
| schema | Schema | Stream schema |
RegressionInstance
Instance with a continuous target value for regression tasks.
RegressionInstance.from_array (class method)
RegressionInstance.from_array(schema: Schema, x: FeatureVector, y_value: float) -> RegressionInstance
Properties
| Property | Type | Description |
| x | NDArray[float64] | Feature vector |
| y_value | float | Continuous target value |
| schema | Schema | Stream schema |
Type Aliases
| Alias | Underlying Type | Description |
| FeatureVector | NDArray[float64] | 1D NumPy float64 array of feature values |
| LabelIndex | int | Non-negative class index integer |
| Label | str | Class label string |
| LabelProbabilities | NDArray[float64] | 1D array of prediction probabilities |
| TargetValue | float | Continuous target value for regression |
§2 Stream Base Classes
Stream
Abstract base class for all streams. Implements the Python iterator protocol. All subclasses must implement the four abstract methods below.
Abstract methods
| Method | Signature | Description |
| has_more_instances() | () → bool | True if stream has more instances |
| next_instance() | () → _AnyInstance | Returns the next instance |
| get_schema() | () → Schema | Returns the stream schema |
| restart() | () → None | Resets the stream to the beginning |
Concrete methods
| Method | Returns | Description |
| __iter__() | Iterator | Returns self; does NOT restart the stream |
| __next__() | _AnyInstance | Returns next instance; raises StopIteration if exhausted |
| get_moa_stream() | Optional[InstanceStream] | Returns underlying Java MOA stream, or None |
| CLI_help() | str | Returns MOA option documentation (if MOA stream available) |
| __str__() | str | Returns dataset name |
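Because __iter__() returns self without restarting, a partially consumed stream resumes where it left off when iterated again. A minimal sketch of the documented protocol (ListStream is a hypothetical illustration, not part of the API):

```python
# Minimal stream obeying the documented iterator protocol (illustrative).
class ListStream:
    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def has_more_instances(self):
        return self._pos < len(self._items)

    def next_instance(self):
        item = self._items[self._pos]
        self._pos += 1
        return item

    def restart(self):
        self._pos = 0

    def __iter__(self):
        return self  # NOTE: does not restart the stream

    def __next__(self):
        if not self.has_more_instances():
            raise StopIteration
        return self.next_instance()

s = ListStream([10, 20, 30])
first = next(iter(s))   # 10
rest = list(s)          # [20, 30] -- iteration resumed, not restarted
s.restart()
everything = list(s)    # [10, 20, 30]
```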
MOAStream
Wraps any MOA Java stream. Used internally by all built-in generators and dataset streams.
signature
MOAStream(
moa_stream: Optional[InstanceStream] = None,
schema: Optional[Schema] = None,
CLI: Optional[str] = None,
) -> None
| Parameter | Type | Default | Description |
| moa_stream | Optional[InstanceStream] | None | MOA stream Java object |
| schema | Optional[Schema] | None | Schema; inferred from moa_stream if None |
| CLI | Optional[str] | None | Additional MOA CLI arguments |
Raises: ValueError if no schema and no moa_stream; ValueError if CLI provided without moa_stream.
§3 File-Based Streams
ARFFStream
Reads a stream from an ARFF file (Attribute-Relation File Format).
ARFFStream(
path: Union[str, Path],
CLI: Optional[str] = None,
class_index: int = -1,
) -> None
| Parameter | Type | Default | Description |
| path | Union[str, Path] | required | Path to .arff file |
| CLI | Optional[str] | None | Additional MOA CLI arguments |
| class_index | int | -1 | Index of class column (-1 = last column) |
from openmoa.stream import ARFFStream
stream = ARFFStream("data/covtype.arff")
instance = stream.next_instance()
print(instance.x) # feature vector
print(instance.y_index) # class index
NumpyStream
Creates a stream directly from NumPy arrays. Useful for integrating existing datasets.
NumpyStream(
X: np.ndarray,
y: np.ndarray,
dataset_name: str = "No_Name",
feature_names: Optional[Sequence[str]] = None,
target_name: Optional[str] = None,
target_type: Optional[str] = None,
) -> None
| Parameter | Type | Default | Description |
| X | np.ndarray | required | Feature matrix, shape (n_samples, n_features) |
| y | np.ndarray | required | Target vector, shape (n_samples,) |
| dataset_name | str | "No_Name" | Name of the dataset |
| feature_names | Optional[Sequence[str]] | None | Feature names; auto-generated as attrib_0, attrib_1, … if None |
| target_name | Optional[str] | None | Name of the target attribute |
| target_type | Optional[str] | None | 'categorical', 'numeric', or None (auto-detect) |
Attributes & Methods
| Name | Type / Returns | Description |
| current_instance_index | int | Current position in the array |
| has_more_instances() | bool | True if current_instance_index < len(X) |
| next_instance() | LabeledInstance or RegressionInstance | Next instance from arrays |
| restart() | None | Resets current_instance_index to 0 |
| __len__() | int | Total number of instances |
CSVStream
Reads a stream from a CSV file line by line.
CSVStream(
csv_file_path: str,
dtypes: Optional[list] = None,
values_for_nominal_features: Dict = {},
class_index: int = -1,
values_for_class_label: Optional[list] = None,
target_attribute_name: Optional[str] = None,
target_type: Optional[str] = None,
skip_header: bool = False,
delimiter: str = ",",
dataset_name: Optional[str] = None,
) -> None
| Parameter | Type | Default | Description |
| csv_file_path | str | required | Path to CSV file |
| dtypes | Optional[list] | None | List of (column_name, dtype) tuples; auto-inferred if None |
| values_for_nominal_features | Dict | {} | Maps column index → list of possible nominal values |
| class_index | int | -1 | Index of class/target column (-1 = last) |
| values_for_class_label | Optional[list] | None | Possible class values; auto-detected if None |
| target_attribute_name | Optional[str] | None | Name of target attribute |
| target_type | Optional[str] | None | 'categorical', 'numeric', or None |
| skip_header | bool | False | Skip the first line |
| delimiter | str | "," | Field delimiter character |
| dataset_name | Optional[str] | None | Defaults to "CSVStream({path})" |
Attributes
| Attribute | Type | Description |
| csv_file_path | str | Path to file |
| total_number_of_lines | int | Total lines in file (set at init) |
LibsvmStream
Reads sparse data in LIBSVM format (label feat_id:value feat_id:value …).
LibsvmStream(
path: Union[str, Path],
dataset_name: str = "LibsvmDataset",
target_type: str = "categorical",
) -> None
| Parameter | Type | Default | Description |
| path | Union[str, Path] | required | Path to LIBSVM file |
| dataset_name | str | "LibsvmDataset" | Dataset name |
| target_type | str | "categorical" | 'categorical' or 'numeric' |
Note: Instances have a _sparse_x attribute (dict {feature_id: value}) alongside the standard x array. Raises FileNotFoundError if file does not exist.
| Method | Returns | Description |
| has_more_instances() | bool | True if more lines available |
| next_instance() | LabeledInstance or RegressionInstance | Next sparse instance |
| restart() | None | Resets position and clears cache |
| __len__() | int | Total number of instances |
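Each LIBSVM line stores only non-zero features, which is why instances carry both the sparse _sparse_x dict and a densified x array. A sketch of how one line might be parsed (assumed format with conventional 1-based feature ids; this is not the library's actual parser):

```python
def parse_libsvm_line(line, num_features):
    """Parse 'label id:value id:value ...' into (label, sparse dict, dense list)."""
    parts = line.split()
    label = parts[0]
    sparse = {}
    for token in parts[1:]:
        feat_id, value = token.split(":")
        sparse[int(feat_id)] = float(value)
    # LIBSVM feature ids are conventionally 1-based; absent features are 0.0
    dense = [sparse.get(i + 1, 0.0) for i in range(num_features)]
    return label, sparse, dense

label, sparse, dense = parse_libsvm_line("+1 1:0.5 3:2.0", num_features=4)
print(label, sparse, dense)  # +1 {1: 0.5, 3: 2.0} [0.5, 0.0, 2.0, 0.0]
```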
BagOfWordsStream
Reads text data from bag-of-words .review files for binary classification (positive vs. negative).
BagOfWordsStream(
positive_file: Path,
negative_file: Path,
dataset_name: str = "BagOfWords",
normalize: bool = True,
shuffle_seed: Optional[int] = None,
) -> None
| Parameter | Type | Default | Description |
| positive_file | Path | required | File containing positive examples |
| negative_file | Path | required | File containing negative examples |
| dataset_name | str | "BagOfWords" | Dataset name |
| normalize | bool | True | Normalize feature vectors to unit length |
| shuffle_seed | Optional[int] | None | Shuffle seed; None = no shuffle |
Note: Instances have a _sparse_x attribute (dict {word: count}). Class 0 = negative, class 1 = positive.
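With normalize=True, each bag-of-words count vector is presumably scaled to unit Euclidean length so that documents of different lengths become comparable. The normalization itself is standard (unit_normalize is a hypothetical helper shown for illustration):

```python
import math

def unit_normalize(counts):
    """Scale a {word: count} dict so the vector has Euclidean norm 1."""
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return dict(counts)
    return {word: c / norm for word, c in counts.items()}

vec = unit_normalize({"good": 3, "movie": 4})
print(vec)  # {'good': 0.6, 'movie': 0.8}
```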
ConcatStream
Concatenates multiple streams into one, switching to the next stream when the current one is exhausted.
ConcatStream(streams: Sequence[Stream]) -> None
Raises: ValueError if schemas are not equal across streams.
| Method | Returns | Description |
| has_more_instances() | bool | True if any remaining stream has instances |
| next_instance() | _AnyInstance | Next instance; advances to next sub-stream when exhausted |
| get_schema() | Schema | Schema of the current stream |
| restart() | None | Restarts all sub-streams and resets index |
| __len__() | int | Total length (only if all sub-streams support len()) |
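The switching behaviour is equivalent to reading the sub-streams in order and advancing past any that are exhausted. A simplified sketch of that logic (schema checks omitted; ListSource and SimpleConcat are illustrative stand-ins, not the actual classes):

```python
class ListSource:
    """Tiny stand-in for a sub-stream."""
    def __init__(self, items):
        self._items, self._pos = list(items), 0
    def has_more_instances(self):
        return self._pos < len(self._items)
    def next_instance(self):
        self._pos += 1
        return self._items[self._pos - 1]

class SimpleConcat:
    def __init__(self, streams):
        self._streams = list(streams)
        self._idx = 0

    def has_more_instances(self):
        return any(s.has_more_instances() for s in self._streams[self._idx:])

    def next_instance(self):
        # Advance past exhausted sub-streams before reading
        while not self._streams[self._idx].has_more_instances():
            self._idx += 1
        return self._streams[self._idx].next_instance()

c = SimpleConcat([ListSource([1, 2]), ListSource([]), ListSource([3])])
out = []
while c.has_more_instances():
    out.append(c.next_instance())
print(out)  # [1, 2, 3]
```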
stream_from_file
Auto-detects the file type and returns the appropriate stream object (ARFFStream for .arff, CSVStream for .csv).
stream_from_file(
path_to_csv_or_arff: Union[str, Path],
dataset_name: str = "NoName",
class_index: int = -1,
target_type: Optional[str] = None,
) -> Stream
| Parameter | Type | Default | Description |
| path_to_csv_or_arff | Union[str, Path] | required | Path to .arff or .csv file |
| dataset_name | str | "NoName" | Dataset name |
| class_index | int | -1 | Class column index |
| target_type | Optional[str] | None | 'categorical', 'numeric', or None (CSV only) |
Raises: FileNotFoundError · IsADirectoryError · ValueError (unsupported extension).
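The dispatch is by file extension alone. A sketch of the selection logic (the FileNotFoundError/IsADirectoryError checks and the actual class construction are elided; pick_stream_class is a hypothetical helper):

```python
from pathlib import Path

def pick_stream_class(path):
    """Return the stream class name for a path, mirroring the documented dispatch."""
    suffix = Path(path).suffix.lower()
    if suffix == ".arff":
        return "ARFFStream"
    if suffix == ".csv":
        return "CSVStream"
    # Mirrors the documented ValueError for unsupported extensions
    raise ValueError(f"Unsupported file extension: {suffix!r}")

print(pick_stream_class("data/covtype.arff"))  # ARFFStream
```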
§4 Feature Evolution Wrappers
These wrappers simulate dynamic feature spaces where the set of active features changes over time.
OpenFeatureStream
Wraps a stream to shrink/grow the active feature set over time. Each returned instance carries a feature_indices attribute (a NumPy array) indicating which of the original features are active.
OpenFeatureStream(
base_stream: Stream,
d_min: int = 2,
d_max: Optional[int] = None,
evolution_pattern: Literal["pyramid","incremental","decremental","tds","cds","eds"] = "pyramid",
total_instances: int = 10000,
feature_selection: Literal["prefix","suffix","random"] = "prefix",
missing_ratio: float = 0.0,
random_seed: int = 42,
tds_mode: Literal["random","ordered"] = "random",
n_segments: int = 2,
overlap_ratio: float = 1.0,
) -> None
| Parameter | Type | Default | Description |
| base_stream | Stream | required | Stream to wrap |
| d_min | int | 2 | Minimum number of active features |
| d_max | Optional[int] | None | Maximum features; defaults to original feature count |
| evolution_pattern | str | "pyramid" | Pattern of feature evolution (see table below) |
| total_instances | int | 10000 | Total stream length |
| feature_selection | str | "prefix" | Which features to keep when dimension is reduced |
| missing_ratio | float | 0.0 | Per-feature absence probability (only used by "cds" pattern) |
| random_seed | int | 42 | RNG seed for reproducibility |
| tds_mode | str | "random" | "random" or "ordered" birth assignment (only used by "tds") |
| n_segments | int | 2 | Number of sequential partitions (only used by "eds") |
| overlap_ratio | float | 1.0 | Overlap length relative to stable period (only used by "eds") |
Evolution patterns
| Value | Description |
| "pyramid" | Feature count grows linearly from d_min to d_max, then shrinks back to d_min |
| "incremental" | Monotonic growth from d_min to d_max |
| "decremental" | Monotonic shrinkage from d_max to d_min |
| "tds" | Trapezoidal: each feature has an independent birth time assigned across 10 stages |
| "cds" | Capricious: each feature is independently present at each step with probability 1 − missing_ratio |
| "eds" | Evolvable: n_segments sequential partitions with configurable overlap windows |
Feature selection modes
| Value | Description |
| "prefix" | Keep the first d features from the active set |
| "suffix" | Keep the last d features |
| "random" | Randomly select d features (reproducible per time step via seed) |
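The "pyramid" pattern combined with "prefix" selection can be sketched as a schedule mapping instance index t to an active-feature count that rises linearly to d_max at the midpoint and falls back to d_min. This is an illustration of the described behaviour, not the wrapper's exact schedule:

```python
def pyramid_active_count(t, total, d_min, d_max):
    """Active-feature count at step t: d_min -> d_max at the midpoint -> d_min."""
    half = total / 2
    frac = t / half if t <= half else (total - t) / half
    return round(d_min + frac * (d_max - d_min))

def prefix_features(d):
    """'prefix' selection: keep the first d original feature indices."""
    return list(range(d))

counts = [pyramid_active_count(t, 100, 2, 10) for t in (0, 50, 100)]
print(counts)              # [2, 10, 2]
print(prefix_features(3))  # [0, 1, 2]
```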
TrapezoidalStream
Similar to OpenFeatureStream, but keeps the feature vector fixed-size: inactive features are filled with np.nan rather than omitted.
TrapezoidalStream(
base_stream: Stream,
d_min: int = 2,
d_max: Optional[int] = None,
evolution_mode: Literal["random","ordered","pyramid"] = "random",
total_instances: int = 10000,
random_seed: int = 42,
) -> None
| Parameter | Type | Default | Description |
| base_stream | Stream | required | Stream to wrap |
| d_min | int | 2 | Minimum active features |
| d_max | Optional[int] | None | Maximum features (defaults to original count) |
| evolution_mode | str | "random" | "random": random order; "ordered": index order; "pyramid": grow then shrink |
| total_instances | int | 10000 | Stream length |
| random_seed | int | 42 | RNG seed |
CapriciousStream
Each feature is independently and randomly absent at each time step with probability missing_ratio. Inactive features are filled with np.nan.
CapriciousStream(
base_stream: Stream,
d_max: Optional[int] = None,
missing_ratio: float = 0.5,
total_instances: int = 10000,
min_features: int = 1,
random_seed: int = 42,
) -> None
| Parameter | Type | Default | Description |
| base_stream | Stream | required | Stream to wrap |
| d_max | Optional[int] | None | Feature dimension (defaults to original) |
| missing_ratio | float | 0.5 | Probability that each feature is missing per time step |
| total_instances | int | 10000 | Stream length |
| min_features | int | 1 | Guaranteed minimum features per instance |
| random_seed | int | 42 | RNG seed |
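A sketch of the per-step masking CapriciousStream describes: each feature is independently missing with probability missing_ratio and filled with NaN, with a guaranteed minimum number of surviving features (capricious_mask is a hypothetical helper; the wrapper's seed handling is simplified here):

```python
import math
import random

def capricious_mask(x, missing_ratio, min_features, rng):
    """Return a copy of x with features independently replaced by NaN."""
    keep = [i for i in range(len(x)) if rng.random() >= missing_ratio]
    if len(keep) < min_features:
        # Guarantee at least min_features active features per instance
        keep = rng.sample(range(len(x)), min_features)
    keep_set = set(keep)
    return [x[i] if i in keep_set else math.nan for i in range(len(x))]

rng = random.Random(42)
masked = capricious_mask([1.0, 2.0, 3.0, 4.0], missing_ratio=0.5, min_features=1, rng=rng)
active = sum(not math.isnan(v) for v in masked)
print(len(masked), active >= 1)  # 4 True
```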
EvolvableStream
Divides features into n_segments sequential partitions. Features transition from one segment to the next with configurable overlap windows. Inactive features are filled with np.nan.
EvolvableStream(
base_stream: Stream,
d_max: Optional[int] = None,
n_segments: int = 2,
overlap_ratio: float = 1.0,
total_instances: int = 10000,
random_seed: int = 42,
) -> None
| Parameter | Type | Default | Description |
| base_stream | Stream | required | Stream to wrap |
| d_max | Optional[int] | None | Feature dimension |
| n_segments | int | 2 | Number of sequential feature partitions (≥ 2) |
| overlap_ratio | float | 1.0 | Overlap window length relative to stable period |
| total_instances | int | 10000 | Stream length |
| random_seed | int | 42 | RNG seed |
ShuffledStream
Buffers the entire base stream into memory and serves instances in a randomly shuffled order.
Warning: Loads the full dataset into memory. Suitable for MB-scale datasets; use caution with GB-scale data.
ShuffledStream(base_stream: Stream, random_seed: int = 42) -> None
| Method / Attribute | Returns | Description |
| n_instances | int | Total buffered instances |
| get_num_instances() | int | Total buffered instances |
| has_more_instances() | bool | True if pointer not exhausted |
| next_instance() | _AnyInstance | Next shuffled instance |
| restart() | None | Re-shuffles and resets pointer |
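Since restart() re-shuffles, two passes over the same buffer contain the same instances but generally in different orders. A sketch of that buffer-and-shuffle behaviour (ShuffleBuffer is an illustrative stand-in, not the actual class):

```python
import random

class ShuffleBuffer:
    def __init__(self, items, random_seed=42):
        self._items = list(items)
        self._rng = random.Random(random_seed)
        self._pos = 0
        self.restart()

    def restart(self):
        # Re-shuffle and reset the read pointer
        self._rng.shuffle(self._items)
        self._pos = 0

    def has_more_instances(self):
        return self._pos < len(self._items)

    def next_instance(self):
        self._pos += 1
        return self._items[self._pos - 1]

buf = ShuffleBuffer(range(5), random_seed=42)
first_pass = [buf.next_instance() for _ in range(5)]
buf.restart()
second_pass = [buf.next_instance() for _ in range(5)]
print(sorted(first_pass) == sorted(second_pass))  # True: same items, possibly new order
```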
§5 Synthetic Stream Generators
Module: openmoa.stream.generator. All generators inherit from MOAStream.
SEA
Classic SEA (Streaming Ensemble Algorithm) binary classification with three numeric features.
SEA(function: int = 1, instance_random_seed: int = 1, noise_percentage: int = 10)
| Parameter | Type | Default | Description |
| function | int | 1 | Concept function (1, 2, 3, or 4) |
| instance_random_seed | int | 1 | RNG seed for instance generation |
| noise_percentage | int | 10 | Percentage of noisy labels |
RandomTreeGenerator
Generates instances from a random decision tree concept.
RandomTreeGenerator(
instance_random_seed: int = 1, tree_random_seed: int = 1,
num_classes: int = 2, num_nominals: int = 5, num_numerics: int = 5,
num_vals_per_nominal: int = 5, max_tree_depth: int = 5,
first_leaf_level: int = 3, leaf_fraction: float = 0.15,
)
| Parameter | Default | Description |
| instance_random_seed | 1 | RNG seed for instances |
| tree_random_seed | 1 | RNG seed for tree structure |
| num_classes | 2 | Number of classes |
| num_nominals | 5 | Number of nominal attributes |
| num_numerics | 5 | Number of numeric attributes |
| num_vals_per_nominal | 5 | Possible values per nominal attribute |
| max_tree_depth | 5 | Maximum tree depth |
| first_leaf_level | 3 | Level at which leaves start appearing |
| leaf_fraction | 0.15 | Fraction of internal nodes converted to leaves |
RandomRBFGenerator
Generates instances from randomly placed RBF (Radial Basis Function) centroids.
RandomRBFGenerator(
model_random_seed: int = 1, instance_random_seed: int = 1,
number_of_classes: int = 2, number_of_attributes: int = 10,
number_of_centroids: int = 50,
)
RandomRBFGeneratorDrift
RBF generator whose centroids drift over time, simulating continuous concept drift.
RandomRBFGeneratorDrift(
model_random_seed: int = 1, instance_random_seed: int = 1,
number_of_classes: int = 2, number_of_attributes: int = 10,
number_of_centroids: int = 50,
number_of_drifting_centroids: int = 2,
magnitude_of_change: float = 0.0,
)
| Parameter | Default | Description |
| number_of_drifting_centroids | 2 | Number of centroids that drift |
| magnitude_of_change | 0.0 | Speed of centroid movement |
LEDGenerator
Generates the digit shown on a seven-segment LED display, with a configurable percentage of noisy attributes.
LEDGenerator(instance_random_seed: int = 1, noise_percentage: int = 10, reduce_data: bool = False)
LEDGeneratorDrift
LED generator with a configurable number of drifting attributes.
LEDGeneratorDrift(
instance_random_seed: int = 1, noise_percentage: int = 10,
reduce_data: bool = False, number_of_attributes_with_drift: int = 7,
)
AgrawalGenerator
Generates loan-eligibility data using one of ten predefined classification functions (Agrawal et al.).
AgrawalGenerator(
instance_random_seed: int = 1, function: int = 1,
balance_classes: bool = False, peturbation: float = 0.05,
)
HyperplaneGenerator
Binary classification defined by which side of a slowly rotating hyperplane an instance falls on.
HyperplaneGenerator(
instance_random_seed: int = 1,
number_of_attributes: int = 10,
number_of_drifting_attributes: int = 2,
magnitude_of_change: float = 0.0,
noise_percentage: int = 5,
sigma_percentage: int = 10,
)
STAGGERGenerator
Generates the classic STAGGER concepts: boolean functions over three nominal attributes (size, shape, color).
STAGGERGenerator(instance_random_seed: int = 1, function: int = 1, balance_classes: bool = False)
MixedGenerator
Generates a binary classification problem from a mix of boolean and numeric attributes.
MixedGenerator(instance_random_seed: int = 1, function: int = 1, balance_classes: bool = False)
§6 Drift Streams
Module: openmoa.stream.drift
Drift
Base class describing a single concept-drift event.
Drift(position: int, width: int = 0, alpha: float = 0.0, random_seed: int = 1)
| Parameter | Type | Default | Description |
| position | int | required | Instance index at which drift occurs |
| width | int | 0 | Transition window size (0 or 1 = abrupt) |
| alpha | float | 0.0 | Grade of change |
| random_seed | int | 1 | RNG seed for the transition |
AbruptDrift
Instantaneous concept switch at a specific instance.
AbruptDrift(position: int, random_seed: int = 1)
GradualDrift
Gradual transition where instances are probabilistically drawn from either the old or the new concept.
# Specify by center + width
GradualDrift(position=10000, width=2000, random_seed=1)
# Or specify by start + end
GradualDrift(start=9000, end=11000, random_seed=1)
| Parameter | Type | Default | Description |
| position | Optional[int] | None | Center of the drift window |
| width | Optional[int] | None | Length of transition window |
| start | Optional[int] | None | Start of transition window |
| end | Optional[int] | None | End of transition window |
| alpha | float | 0.0 | Grade of change |
| random_seed | int | 1 | RNG seed |
Note: Either (position + width) or (start + end) must be provided. Internally: width = end − start, position = (start + end) / 2.
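The two parameterizations are interchangeable via the conversion stated in the note. A quick sketch (integer division assumed; the helper names are illustrative, not part of the API):

```python
def window_from_start_end(start, end):
    """Convert (start, end) to the internal (position, width)."""
    width = end - start
    position = (start + end) // 2
    return position, width

def window_from_position_width(position, width):
    """Inverse conversion, e.g. for reading back get_drifts() output."""
    start = position - width // 2
    end = position + width // 2
    return start, end

print(window_from_start_end(9000, 11000))       # (10000, 2000)
print(window_from_position_width(10000, 2000))  # (9000, 11000)
```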
DriftStream
Composes a list of sub-streams connected by Drift objects into a single stream with concept drift.
DriftStream(
schema: Optional[Schema] = None,
CLI: Optional[str] = None,
moa_stream: Optional[InstanceStream] = None,
stream: Optional[list] = None,
)
The stream parameter takes an alternating list: [Stream, Drift, Stream, Drift, Stream, …]
Methods
| Method | Returns | Description |
| get_num_drifts() | int | Number of drift transitions |
| get_drifts() | list[Drift] | List of Drift objects with their positions and widths |
from openmoa.stream.drift import DriftStream, AbruptDrift, GradualDrift
from openmoa.stream.generator import SEA
stream = DriftStream(stream=[
SEA(function=1),
AbruptDrift(position=5000),
SEA(function=2),
GradualDrift(position=10000, width=2000),
SEA(function=3),
])
print(stream.get_num_drifts()) # 2
RecurrentConceptDriftStream
Generates recurrent (periodic) concepts by cycling through a list of concepts multiple times.
RecurrentConceptDriftStream(
concept_list: Sequence[Stream],
max_recurrences_per_concept: int = 2,
transition_type_template: Drift = AbruptDrift(position=2000),
concept_name_list: Optional[Sequence[str]] = None,
)
| Parameter | Type | Default | Description |
| concept_list | Sequence[Stream] | required | List of concepts to cycle through |
| max_recurrences_per_concept | int | 2 | How many times each concept reappears |
| transition_type_template | Drift | AbruptDrift(2000) | Template for transitions between concepts |
| concept_name_list | Optional[Sequence[str]] | None | Names for each concept |
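One plausible resulting order is each concept repeated cyclically up to max_recurrences_per_concept times. The sketch below shows only that scheduling idea; the actual class also inserts a Drift transition (built from transition_type_template) between consecutive concepts, and its ordering may differ:

```python
def recurrent_schedule(concepts, max_recurrences_per_concept=2):
    """Cycle through the concept list max_recurrences_per_concept times."""
    return [c for _ in range(max_recurrences_per_concept) for c in concepts]

print(recurrent_schedule(["A", "B", "C"], 2))  # ['A', 'B', 'C', 'A', 'B', 'C']
```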
§7 Built-in Datasets
Module: openmoa.datasets. All datasets auto-download on first use and are stored locally. All inherit from DownloadARFFGzip and expose the standard Stream interface.
Classification Datasets
| Class | Instances | Attributes | Classes | Description |
| Electricity | 45,312 | 8 | 2 | Electricity demand (UP/DOWN) |
| ElectricityTiny | 2,000 | 8 | 2 | Tiny version for testing |
| Covtype | 581,012 | 54 | 7 | Forest cover type |
| CovtypeNorm | 581,012 | 54 | 7 | Covtype with normalized features |
| CovtypeTiny | 1,001 | 54 | 7 | Tiny version for testing |
| CovtFD | 581,011 | 104 | 7 | Covtype with 2 synthetic feature drifts at instances 193,669 and 387,338 |
| RBFm_100k | 100,000 | 10 | 5 | Synthetic RBF |
| RTG_2abrupt | 100,000 | 30 | 5 | Random Tree with 2 abrupt drifts |
| Hyper100k | 100,000 | 10 | 2 | Hyperplane |
| Sensor | 2,219,803 | 5 | 54 | Indoor sensor readings |
| RCV1 | 20,242 | ~47,236 (sparse) | 2 | Text classification |
| W8a | 49,749 | 300 | 2 | Web page classification |
| Adult | 32,561 | 123 | 2 | Census income |
| Magic04 | 19,020 | 10 | 2 | MAGIC gamma telescope |
| Spambase | 4,601 | 57 | 2 | Email spam |
| Musk | 6,598 | 166 | 2 | Musk molecules |
| SVMGuide3 | 1,243 | 21 | 2 | SVM benchmark |
| German | 1,000 | 24 | 2 | Credit risk |
| Australian | 690 | 14 | 2 | Credit approval |
| Ionosphere | 351 | 34 | 2 | Radar returns |
| InternetAds | 2,359 | 1,558 | 2 | Internet advertisements |
| DryBean | 13,611 | 16 | 7 | Dry bean classification |
| Optdigits | 5,620 | 64 | 10 | Optical digit recognition |
| Frogs | 7,195 | 22 | 4 | Frog species |
| Wine | 178 | 13 | 3 | Wine cultivars |
| Splice | 3,190 | 60 | 3 | DNA splice junctions |
| SeagateBinary | 49,999 | 94 | 2 | Seagate binary |
| SeagateMulti | 11,800 | 94 | 11 | Seagate multi-class |
Regression Datasets
| Class | Instances | Attributes | Description |
| Fried | 40,768 | 10 | Friedman regression |
| FriedTiny | 1,000 | 10 | Tiny version for testing |
| Bike | 17,379 | 12 | Bike sharing demand |
from openmoa.datasets import Electricity, Fried
stream = Electricity()
print(stream.get_schema().get_num_classes()) # 2
print(stream.get_schema().get_num_attributes()) # 8
reg_stream = Fried()
print(reg_stream.get_schema().is_regression()) # True