Installation

Install OpenMOA and set up your environment for streaming machine learning.

Prerequisites

Python

  • Minimum: Python ≥ 3.9
  • Recommended: Python 3.11
  • Tested: Python 3.10, 3.11, 3.12
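The version bounds above can be checked from the interpreter before installing. This is a minimal sketch, not part of OpenMOA: the helper name `meets_minimum` is hypothetical, and only the 3.9 minimum is taken from the list above.

```python
import sys

# Minimum version taken from the prerequisites list (Python >= 3.9).
MINIMUM = (3, 9)


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current}:", "OK" if meets_minimum() else "too old")
```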

Java (Required)

  • OpenJDK or Oracle JDK
  • Verify: java -version
  • Set JAVA_HOME if needed
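The Java checks above can also be scripted. The sketch below looks for a `java` executable under `JAVA_HOME` first and then on the `PATH`. That lookup order is an assumption made for illustration and may differ from how OpenMOA or JPype locate the JVM. The function names are hypothetical.

```python
import os
import shutil
import subprocess


def find_java():
    """Return the path to a `java` executable, or None if none is found.

    Checks $JAVA_HOME/bin first, then falls back to the PATH.
    (Illustrative lookup order, not necessarily what OpenMOA does.)
    """
    java_home = os.environ.get("JAVA_HOME")
    if java_home:
        exe = "java.exe" if os.name == "nt" else "java"
        candidate = os.path.join(java_home, "bin", exe)
        if os.path.isfile(candidate):
            return candidate
    return shutil.which("java")


def java_version_banner(java_path):
    """Run `java -version` and return its output (the JVM prints it to stderr)."""
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True, check=False
    )
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("Java not found: install a JDK or set JAVA_HOME")
    else:
        print(java)
        print(java_version_banner(java))
```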

PyTorch (Optional)

  • Only for deep learning algorithms
  • Not needed for Hoeffding Tree, FOBOS, etc.
  • Required for TorchClassifyStream
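Because PyTorch is optional, it can help to test for it without importing it, since importing `torch` is slow. A minimal sketch using only the standard library follows; `has_module` is a hypothetical helper name.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("torch"):
        print("PyTorch found")
    else:
        print("PyTorch not found (only needed for deep learning algorithms)")
```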

Note: OpenMOA uses the MOA (Massive Online Analysis) Java framework under the hood, bridged via JPype1. This is why Java is required.

Quick Install (User)

# Step 1: Verify Java environment
java -version

# Step 2: (Optional) Create virtual environment
conda create -n openmoa python=3.11
conda activate openmoa

# Step 3: (Optional) Install PyTorch CPU version
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu

# Step 4: Install OpenMOA
pip install openmoa

# Step 5: Verify installation
python -c "import openmoa; print(openmoa.__version__)"

During installation, two build scripts run automatically.

Developer Install

# Step 1: Install Java and PyTorch (same as above)

# Step 2: Install Pandoc (required for documentation)
# Ubuntu:
sudo apt-get install -y pandoc
# macOS:
brew install pandoc
# conda:
conda install -c conda-forge pandoc

# Step 3: Clone repository
git clone https://github.com/ZW-SIYUAN/OpenMOA.git
cd OpenMOA

# Step 4: Editable install with dev dependencies
pip install --editable ".[dev,doc]"

Available invoke commands for developers:

invoke build.download-moa    Manually download MOA JAR
invoke build.stubs           Regenerate type stubs
invoke test.pytest           Run unit tests
invoke test.nb               Run notebook tests
invoke docs.build            Build Sphinx documentation
invoke lint                  Code quality checks

Docker Install

cd docker
docker compose build
docker compose up
# Access Jupyter at http://localhost:8888

The Docker image is based on jupyter/base-notebook and includes OpenJDK 8, PyTorch CPU, and OpenMOA pre-installed.

Environment Variables

Variable                Default                  Description
OPENMOA_DATASETS_DIR    ./data                   Dataset auto-download/storage directory
OPENMOA_JVM_ARGS        -Xmx8g -Xss10M           JVM memory and stack size parameters
OPENMOA_MOA_JAR         <package>/jar/moa.jar    Custom MOA JAR path (advanced)
JAVA_HOME               Auto-detected            Java installation path
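These variables can also be set from Python for the current process. The sketch below assumes that OpenMOA reads them when it is imported, so they would need to be set before `import openmoa`. That timing is an assumption, not something stated here. `configure_openmoa_env` is a hypothetical helper.

```python
import os


def configure_openmoa_env(datasets_dir=None, jvm_args=None, moa_jar=None):
    """Set OpenMOA-related environment variables for the current process.

    Only arguments that are not None are written; returns the values that were set.
    """
    requested = {
        "OPENMOA_DATASETS_DIR": datasets_dir,
        "OPENMOA_JVM_ARGS": jvm_args,
        "OPENMOA_MOA_JAR": moa_jar,
    }
    applied = {key: str(value) for key, value in requested.items() if value is not None}
    os.environ.update(applied)
    return applied


if __name__ == "__main__":
    print(configure_openmoa_env(datasets_dir="./data", jvm_args="-Xmx8g -Xss10M"))
    # Assumption: the variables must be in place before this import.
    # import openmoa
```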

Verify Installation

import openmoa
print(openmoa.__version__)      # Print version

# Quick test: load built-in dataset and read one instance
from openmoa.datasets import ElectricityTiny
stream = ElectricityTiny()
instance = stream.next_instance()
print(f"Features: {instance.x}")
print(f"Label: {instance.y_label}")

If the code above runs without errors, the installation succeeded.
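For scripted checks (for example in CI), the import test can be wrapped so that a failure is reported instead of raising. This is a sketch under the assumption that any exception during import means a broken installation. `check_install` is a hypothetical helper, and the only OpenMOA attribute it relies on is `__version__`, as used above.

```python
import importlib


def check_install(package="openmoa"):
    """Try to import `package`; return (ok, detail).

    `detail` is the version string on success, or the error text on failure.
    """
    try:
        module = importlib.import_module(package)
    except Exception as exc:  # broad on purpose: report any startup failure
        return False, f"{type(exc).__name__}: {exc}"
    return True, str(getattr(module, "__version__", "unknown version"))


if __name__ == "__main__":
    ok, detail = check_install()
    print(("OK, version " if ok else "FAILED, ") + detail)
```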

Core Dependencies

jpype1 >= 1.5.1         Python-Java bridge for MOA
numpy                   Numerical computation
pandas                  Data processing and output
scikit-learn            ML tools and sklearn wrappers
matplotlib / seaborn    Visualization
tqdm                    Progress bars
click                   CLI tools

Troubleshooting

OpenmoaImportError: Could not find Java

Install a JDK and confirm that java -version runs correctly. If Java is installed but still not found, set the JAVA_HOME environment variable to its installation path.

JVM out of memory

Increase memory allocation: export OPENMOA_JVM_ARGS="-Xmx16g -Xss10M"
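When the variable already holds other JVM options, only the heap flag needs to change. The sketch below rewrites the `-Xmx` option in an argument string and leaves the rest untouched. `with_max_heap` is a hypothetical helper, and setting the variable before importing OpenMOA is an assumption.

```python
import os


def with_max_heap(jvm_args, heap):
    """Return `jvm_args` with its -Xmx option set to `heap` (e.g. "16g").

    An existing -Xmx option is replaced; otherwise one is prepended.
    """
    new_flag = f"-Xmx{heap}"
    tokens = jvm_args.split()
    if any(token.startswith("-Xmx") for token in tokens):
        tokens = [new_flag if token.startswith("-Xmx") else token for token in tokens]
    else:
        tokens.insert(0, new_flag)
    return " ".join(tokens)


if __name__ == "__main__":
    current = os.environ.get("OPENMOA_JVM_ARGS", "-Xmx8g -Xss10M")
    os.environ["OPENMOA_JVM_ARGS"] = with_max_heap(current, "16g")
    print(os.environ["OPENMOA_JVM_ARGS"])
```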

Dataset download failed

Check your network connection, or manually download the dataset and place it in the OPENMOA_DATASETS_DIR directory.
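After placing a file manually, it can be useful to confirm which directory is in effect and what it contains. A minimal sketch follows, using the `./data` default from the Environment Variables table. The helper names are hypothetical, and the file names OpenMOA expects are not covered here.

```python
import os
from pathlib import Path


def datasets_dir():
    """Resolve the dataset directory: $OPENMOA_DATASETS_DIR, else ./data."""
    return Path(os.environ.get("OPENMOA_DATASETS_DIR", "./data")).resolve()


def list_datasets(directory=None):
    """Return the sorted file names in the dataset directory ([] if it is missing)."""
    directory = Path(directory) if directory is not None else datasets_dir()
    if not directory.is_dir():
        return []
    return sorted(path.name for path in directory.iterdir() if path.is_file())


if __name__ == "__main__":
    print("Dataset directory:", datasets_dir())
    for name in list_datasets():
        print(" ", name)
```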