segment_rejection.annotate_noisy_segments

autoclean.functions.segment_rejection.segment_rejection.annotate_noisy_segments(raw, epoch_duration=2.0, epoch_overlap=0.0, picks=None, quantile_k=3.0, quantile_flag_crit=0.2, annotation_description='BAD_noisy_segment', verbose=None)

Identify and annotate noisy segments in continuous EEG data.

This function temporarily epochs the continuous data, calculates channel-wise standard deviations for each epoch, and then identifies epochs where a significant proportion of channels exhibit outlier standard deviations. The outlier detection is based on the interquartile range (IQR) method, similar to what’s used in the pylossless pipeline.

The method works by:

1. Creating temporary fixed-length epochs from continuous data
2. Calculating the standard deviation for each channel in each epoch
3. Using IQR-based outlier detection to identify abnormal standard deviations
4. Flagging epochs where too many channels show outlier behavior
5. Adding annotations to mark these problematic time periods
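A rough sketch of steps 1-2 using MNE's public epoching helpers (illustrative only, not the function's internal implementation; raw is assumed to be a loaded mne.io.Raw object):

>>> import numpy as np
>>> import mne
>>> tmp_epochs = mne.make_fixed_length_epochs(raw, duration=2.0, overlap=0.0, preload=True)
>>> data = tmp_epochs.get_data(picks='eeg')  # shape (n_epochs, n_channels, n_times)
>>> epoch_stds = data.std(axis=-1)           # per-channel std within each epoch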

Parameters:
raw : mne.io.Raw

The continuous EEG data to analyze for noisy segments.

epoch_duration : float, default 2.0

Duration of epochs in seconds for noise detection. Shorter epochs provide finer temporal resolution but may be less stable for outlier detection.

epoch_overlap : float, default 0.0

Overlap between epochs in seconds. Non-zero overlap provides smoother detection but increases computation time.

picks : list of str, str, or None, default None

Channels to include in analysis. If None, defaults to 'eeg'. Can be channel names (e.g., ['EEG 001', 'EEG 002']) or channel types (e.g., 'eeg', 'grad').

quantile_k : float, default 3.0

Multiplier for the IQR when defining outlier thresholds for channel standard deviations. A channel's std in an epoch is an outlier if it lies more than k IQRs above Q3 or more than k IQRs below Q1, relative to that channel's own distribution of stds across all epochs. Higher values = more conservative detection.

quantile_flag_crit : float, default 0.2

Proportion threshold (0.0-1.0). If more than this proportion of the picked channels have an outlier std within an epoch, that epoch is flagged as noisy. Lower values = more sensitive detection.

annotation_description : str, default "BAD_noisy_segment"

The description to use for MNE annotations marking noisy segments. Should start with “BAD_” to be recognized by MNE as artifact annotations.

verbose : bool or None, default None

Control verbosity of output during processing.

Returns:
raw_annotated : mne.io.Raw

Copy of input Raw object with added annotations for noisy segments. Original data is not modified.

Raises:
TypeError

If raw is not an MNE Raw object.

ValueError

If parameters are outside valid ranges or no epochs can be created.

RuntimeError

If processing fails due to insufficient data or other errors.

See also

annotate_uncorrelated_segments

Detect segments with poor channel correlations

mne.Annotations

MNE annotations system

autoclean.detect_outlier_epochs

Statistical outlier detection for epochs

Notes

Detection Algorithm:

1. For each channel, its standard deviation is calculated within each epoch.
2. For each channel, the distribution of its standard deviations across all epochs is analyzed using quartiles (Q1, Q3) and the IQR.
3. Outlier thresholds are Q1 - k*IQR and Q3 + k*IQR.
4. An epoch is marked as noisy if the proportion of channels whose standard deviation falls outside their respective outlier bounds exceeds the quantile_flag_crit threshold.
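Steps 2-4 can be sketched with NumPy, where epoch_stds is an (n_epochs, n_channels) array of per-epoch channel standard deviations as in the earlier sketch (illustrative only; the thresholds mirror the defaults quantile_k=3.0 and quantile_flag_crit=0.2):

>>> q1, q3 = np.quantile(epoch_stds, [0.25, 0.75], axis=0)  # per-channel quartiles across epochs
>>> iqr = q3 - q1
>>> lower, upper = q1 - 3.0 * iqr, q3 + 3.0 * iqr           # quantile_k = 3.0
>>> outlier = (epoch_stds < lower) | (epoch_stds > upper)   # boolean (n_epochs, n_channels)
>>> bad_epochs = outlier.mean(axis=1) > 0.2                 # flagged if proportion > quantile_flag_crit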

Parameter Guidelines:

- epoch_duration: 1-4 seconds typical. Shorter for transient artifacts, longer for stable outlier detection.
- quantile_k: 2-4 typical. Higher values = fewer false positives.
- quantile_flag_crit: 0.1-0.3 typical. Lower = more sensitive.

Performance Considerations:

- Processing time scales with (data_length / epoch_duration).
- Memory usage depends on the number of epochs and channels.
- Overlap increases computation but may improve detection continuity.

References

This implementation adapts concepts from the PREP pipeline and pylossless:

Bigdely-Shamlo, N., Mullen, T., Kothe, C., Su, K. M., & Robbins, K. A. (2015). The PREP pipeline: standardized preprocessing for large-scale EEG analysis. Frontiers in Neuroinformatics, 9, 16.

Examples

Basic noise detection with default parameters:

>>> from autoclean import annotate_noisy_segments
>>> raw_clean = annotate_noisy_segments(raw)
>>> noisy_annotations = [ann for ann in raw_clean.annotations
...                     if 'noisy' in ann['description']]
>>> print(f"Found {len(noisy_annotations)} noisy segments")

Conservative detection for high-quality data:

>>> raw_clean = annotate_noisy_segments(
...     raw,
...     epoch_duration=3.0,
...     quantile_k=4.0,
...     quantile_flag_crit=0.3
... )

Sensitive detection for noisy data:

>>> raw_clean = annotate_noisy_segments(
...     raw,
...     epoch_duration=1.0,
...     quantile_k=2.0,
...     quantile_flag_crit=0.1,
...     annotation_description="BAD_very_noisy"
... )

Detection restricted to a selected subset of channels:

>>> raw_clean = annotate_noisy_segments(
...     raw,
...     picks=['Fp1', 'Fp2', 'F3', 'F4', 'C3', 'C4', 'P3', 'P4'],
...     epoch_duration=2.0
... )
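
Excluding annotated segments during downstream epoching relies on MNE's standard reject_by_annotation behavior, since the descriptions start with "BAD_" (illustrative; assumes events has already been created, e.g. with mne.find_events):

>>> raw_clean = annotate_noisy_segments(raw)
>>> epochs = mne.Epochs(raw_clean, events, tmin=-0.2, tmax=0.5,
...                     reject_by_annotation=True, preload=True)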