autoreject.autoreject_epochs

autoclean.functions.advanced.autoreject.autoreject_epochs(epochs, n_interpolate=None, consensus=None, n_jobs=1, cv=4, random_state=None, picks=None, thresh_method='bayesian_optimization', verbose=None)

Apply AutoReject for automatic epoch cleaning and channel interpolation.

This function applies the AutoReject algorithm to clean epochs by identifying and removing bad epochs and interpolating bad channels within epochs. AutoReject is a machine learning-based method that automatically determines optimal thresholds for artifact rejection, reducing the need for manual inspection.

The method uses a cross-validation approach to determine the optimal parameters for artifact rejection, including the number of channels to interpolate and the consensus threshold. These parameters can be customized through the function arguments or determined automatically by the algorithm.

AutoReject works by:

1. Creating a grid of rejection thresholds and interpolation parameters
2. Using cross-validation to find optimal parameters for each channel
3. Applying the learned thresholds to identify bad epochs and channels
4. Interpolating bad channels and rejecting bad epochs
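Steps 3 and 4 can be illustrated with a toy sketch of the decision made for each epoch once per-channel thresholds are known. This is not the autoreject implementation: the helper name `classify_epoch`, the peak-to-peak inputs, and the "worst channels first" ordering are assumptions made for illustration, and the rule follows the convention on this page that a higher consensus value means fewer rejections.

```python
def classify_epoch(ptp_by_channel, thresholds, consensus, n_interpolate):
    """Toy per-epoch decision (hypothetical helper, not part of autoclean).

    ptp_by_channel: dict mapping channel name -> peak-to-peak amplitude
    thresholds:     dict mapping channel name -> rejection threshold
    consensus:      fraction of channels (0.0-1.0) that must be bad to reject
    n_interpolate:  maximum number of channels to repair in a kept epoch
    """
    # A channel is "bad" in this epoch if it exceeds its own threshold.
    bad = [ch for ch, ptp in ptp_by_channel.items() if ptp > thresholds[ch]]
    frac_bad = len(bad) / len(ptp_by_channel)

    # Too many bad channels: drop the whole epoch.
    if frac_bad > consensus:
        return "reject", []

    # Otherwise keep the epoch and repair the worst offenders first.
    worst_first = sorted(
        bad, key=lambda ch: ptp_by_channel[ch] / thresholds[ch], reverse=True
    )
    return "keep", worst_first[:n_interpolate]


# Illustrative values only (volts); all thresholds set to 100 microvolts.
thresholds = {"Fz": 100e-6, "Cz": 100e-6, "Pz": 100e-6, "Oz": 100e-6}
epoch_a = {"Fz": 40e-6, "Cz": 150e-6, "Pz": 60e-6, "Oz": 30e-6}
epoch_b = {"Fz": 300e-6, "Cz": 150e-6, "Pz": 220e-6, "Oz": 30e-6}

print(classify_epoch(epoch_a, thresholds, consensus=0.5, n_interpolate=1))
print(classify_epoch(epoch_b, thresholds, consensus=0.5, n_interpolate=1))
```

With one of four channels bad, `epoch_a` is kept and `Cz` is marked for interpolation; with three of four bad, `epoch_b` is rejected at a consensus of 0.5 but would be kept at 0.9.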

Parameters:

epochs (mne.Epochs): The epoched EEG data to clean. Must have at least 4 epochs for cross-validation to work properly.
n_interpolate (list of int or None, default None): List of number of channels to interpolate for parameter search. If None, uses [1, 4, 8] as default values. Higher values allow more channel interpolation but may reduce data quality.
consensus (list of float or None, default None): List of consensus percentages for parameter search (0.0-1.0). If None, uses [0.1, 0.25, 0.5, 0.75, 0.9] as default values. Higher values are more conservative (fewer rejections).
n_jobs (int, default 1): Number of parallel jobs to run for cross-validation. Set to -1 to use all available CPU cores. Higher values speed up computation but use more memory.
cv (int, default 4): Number of cross-validation folds for parameter optimization. Must be at least 2. Higher values provide more robust parameter estimates but increase computation time.
random_state (int or None, default None): Random seed for reproducible results in cross-validation splits. Set to an integer for reproducible results across runs.
picks (list of str or None, default None): Channel names to include in the analysis. If None, uses all EEG channels. Non-EEG channels are automatically excluded.
thresh_method (str, default 'bayesian_optimization'): Method for threshold optimization. Options:
- 'bayesian_optimization': Uses Bayesian optimization (recommended)
- 'random_search': Uses random search (faster but less optimal)
verbose (bool or None, default None): Control verbosity of output during processing.

Returns:

epochs_clean (mne.Epochs): The cleaned epochs object with bad epochs removed and bad channels interpolated. May contain fewer epochs than the input.
metadata (dict): Dictionary containing detailed information about the cleaning process:
- 'initial_epochs': Number of epochs before cleaning
- 'final_epochs': Number of epochs after cleaning
- 'rejected_epochs': Number of epochs rejected
- 'rejection_percent': Percentage of epochs rejected
- 'epoch_duration': Duration of each epoch in seconds
- 'samples_per_epoch': Number of time samples per epoch
- 'total_duration_sec': Total duration of cleaned data
- 'total_samples': Total number of samples in cleaned data
- 'channel_count': Number of channels
- 'interpolated_channels': Channels that were interpolated
- 'n_interpolate': Parameter values used
- 'consensus': Parameter values used
- 'cv_scores': Cross-validation scores for parameter selection
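The counting fields of the metadata dictionary are related by simple arithmetic. The sketch below shows one plausible way they fit together. The helper `summarize_cleaning` is hypothetical, and it assumes that the "total" fields describe the cleaned data (final epochs only) and that the percentage is taken relative to the initial epoch count.

```python
def summarize_cleaning(initial_epochs, final_epochs, epoch_duration, samples_per_epoch):
    """Hypothetical helper: derive the summary fields from basic counts."""
    rejected = initial_epochs - final_epochs
    return {
        "initial_epochs": initial_epochs,
        "final_epochs": final_epochs,
        "rejected_epochs": rejected,
        # Percentage relative to the number of epochs before cleaning.
        "rejection_percent": 100.0 * rejected / initial_epochs,
        "epoch_duration": epoch_duration,
        "samples_per_epoch": samples_per_epoch,
        # Totals are assumed to describe the cleaned data only.
        "total_duration_sec": final_epochs * epoch_duration,
        "total_samples": final_epochs * samples_per_epoch,
    }


# Illustrative numbers: 120 two-second epochs of 1000 samples each, 18 rejected.
summary = summarize_cleaning(120, 102, epoch_duration=2.0, samples_per_epoch=1000)
print(f"Rejected {summary['rejection_percent']:.1f}% of epochs")
```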

Raises:

TypeError: If epochs is not an MNE Epochs object.
ValueError: If parameters are outside valid ranges or insufficient data for CV.
ImportError: If AutoReject package is not installed.
RuntimeError: If AutoReject processing fails.
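Some of these errors can be anticipated before starting a long run. The following pre-flight check is a sketch only: `check_autoreject_inputs` is a hypothetical helper, not part of autoclean, and it encodes just the constraints stated on this page (at least 4 epochs, `cv` of at least 2, consensus values between 0.0 and 1.0). The function's own validation may differ.

```python
def check_autoreject_inputs(n_epochs, cv=4, consensus=None):
    """Hypothetical pre-flight check mirroring the documented constraints."""
    if n_epochs < 4:
        raise ValueError(f"Need at least 4 epochs, got {n_epochs}")
    if cv < 2:
        raise ValueError(f"cv must be at least 2, got {cv}")
    # consensus=None means the defaults are used, so there is nothing to check.
    for value in consensus or []:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"consensus values must be in [0.0, 1.0], got {value}")


# Passes silently for valid settings.
check_autoreject_inputs(n_epochs=120, cv=4, consensus=[0.1, 0.5, 0.9])

# Invalid settings raise ValueError before any computation starts.
try:
    check_autoreject_inputs(n_epochs=120, cv=1)
except ValueError as err:
    print(f"Invalid configuration: {err}")
```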

See also

autoreject.AutoReject: Underlying AutoReject implementation
mne.preprocessing.ICA: Alternative artifact removal method
autoclean.detect_outlier_epochs: Simpler statistical epoch rejection

Notes

Algorithm Overview: AutoReject uses a cross-validation approach to learn optimal rejection thresholds for each channel individually. It creates a grid search over possible numbers of channels to interpolate and consensus thresholds, then uses CV to find the best combination.
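To get a feel for the size of this search, the sketch below enumerates the default grid. It assumes that every (n_interpolate, consensus) combination is scored once per fold; how the underlying library actually schedules the work is not specified here.

```python
from itertools import product

# Default candidate values documented for this function.
n_interpolate = [1, 4, 8]
consensus = [0.1, 0.25, 0.5, 0.75, 0.9]
cv = 4

# Every pairing of an interpolation count with a consensus value.
grid = list(product(n_interpolate, consensus))

# Assumption: one evaluation per combination per cross-validation fold.
n_evaluations = len(grid) * cv

print(f"{len(grid)} combinations, {n_evaluations} evaluations")
```

Adding a fourth `n_interpolate` candidate grows the grid from 15 to 20 combinations, so longer candidate lists increase run time roughly in proportion.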

Parameter Guidelines:

- n_interpolate: Start with [1, 4, 8]. For high-density arrays, consider [1, 4, 8, 16]. For low-density arrays, use [1, 2, 4].
- consensus: [0.1, 0.25, 0.5, 0.75, 0.9] covers the range from liberal to conservative rejection. Lower values mean more aggressive rejection.
- n_jobs: Use -1 for maximum speed on multi-core systems.
- cv: 4-5 folds are typical. Higher values are more robust but slower.

Memory and Performance:

- Memory usage scales with (n_epochs × n_channels × n_times × cv)
- For large datasets, consider reducing cv or chunking epochs
- Processing time: ~1-10 minutes for typical datasets (64 channels, 100+ epochs)
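The scaling rule can be turned into a rough back-of-the-envelope estimate. The sketch below assumes 8 bytes per sample (float64) and a constant factor of 1, neither of which is stated in this documentation, so the absolute numbers are only indicative; the helper `estimate_memory_gb` is hypothetical. The ratio between two configurations is the more useful output.

```python
def estimate_memory_gb(n_epochs, n_channels, n_times, cv, bytes_per_sample=8):
    """Hypothetical estimate from the documented scaling rule (float64 assumed)."""
    return n_epochs * n_channels * n_times * cv * bytes_per_sample / 1e9


# Illustrative dataset: 200 epochs, 64 channels, 1000 samples per epoch.
with_cv4 = estimate_memory_gb(200, 64, 1000, cv=4)
with_cv2 = estimate_memory_gb(200, 64, 1000, cv=2)

print(f"cv=4: ~{with_cv4:.2f} GB, cv=2: ~{with_cv2:.2f} GB")
```

Under this rule, halving `cv` halves the estimate, as does processing half of the epochs at a time.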

Quality Considerations:

- Requires minimum 20-30 epochs for reliable parameter estimation
- Best results with 100+ epochs for robust cross-validation
- Interpolated channels maintain spatial relationships
- Aggressive rejection (>50% epochs) may indicate poor data quality
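These rules of thumb can be checked automatically from the returned metadata. The helper below is a hypothetical sketch, not part of autoclean. It takes the lower bound of the 20-30 range as its cutoff, which is a choice made here for illustration.

```python
def quality_warnings(initial_epochs, rejection_percent):
    """Hypothetical check of the rules of thumb listed above."""
    warnings = []
    if initial_epochs < 20:
        warnings.append("fewer than 20 epochs: parameter estimates may be unreliable")
    elif initial_epochs < 100:
        warnings.append("fewer than 100 epochs: cross-validation may be less robust")
    if rejection_percent > 50:
        warnings.append("more than 50% of epochs rejected: check data quality")
    return warnings


# Values as they might appear in metadata['initial_epochs'] and
# metadata['rejection_percent'] (illustrative numbers only).
for message in quality_warnings(initial_epochs=60, rejection_percent=62.5):
    print(f"Warning: {message}")
```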

References

Jas, M., Engemann, D. A., Bekhti, Y., Raimondo, F., & Gramfort, A. (2017). Autoreject: Automated artifact rejection for MEG and EEG data. NeuroImage, 159, 417-429.

Jas, M., Engemann, D. A., Raimondo, F., Bekhti, Y., & Gramfort, A. (2016). Automated rejection and repair of bad trials in MEG/EEG. In 2016 international workshop on pattern recognition in neuroimaging (PRNI) (pp. 1-4). IEEE.

Examples

Basic usage with default parameters:

>>> from autoclean import autoreject_epochs
>>> clean_epochs, metadata = autoreject_epochs(epochs)
>>> print(f"Rejected {metadata['rejection_percent']:.1f}% of epochs")

Conservative cleaning for high-quality data:

>>> clean_epochs, metadata = autoreject_epochs(
...     epochs,
...     n_interpolate=[1, 2, 4],
...     consensus=[0.5, 0.75, 0.9],
...     n_jobs=4
... )

Aggressive cleaning for noisy data:

>>> clean_epochs, metadata = autoreject_epochs(
...     epochs,
...     n_interpolate=[1, 4, 8, 16],
...     consensus=[0.1, 0.25, 0.5],
...     random_state=42
... )

Processing specific channels only:

>>> clean_epochs, metadata = autoreject_epochs(
...     epochs,
...     picks=['Fp1', 'Fp2', 'F3', 'F4', 'C3', 'C4'],
...     n_jobs=-1
... )