segment_rejection.annotate_noisy_segments

autoclean.functions.segment_rejection.segment_rejection.annotate_noisy_segments(raw, epoch_duration=2.0, epoch_overlap=0.0, picks=None, quantile_k=3.0, quantile_flag_crit=0.2, annotation_description='BAD_noisy_segment', verbose=None)

Identify and annotate noisy segments in continuous EEG data.

This function temporarily epochs the continuous data, calculates channel-wise standard deviations for each epoch, and then identifies epochs where a significant proportion of channels exhibit outlier standard deviations. The outlier detection is based on the interquartile range (IQR) method, similar to what’s used in the pylossless pipeline.

The method works by:

1. Creating temporary fixed-length epochs from continuous data
2. Calculating the standard deviation for each channel in each epoch
3. Using IQR-based outlier detection to identify abnormal standard deviations
4. Flagging epochs where too many channels show outlier behavior
5. Adding annotations to mark these problematic time periods
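A rough sketch of steps 1-2 using MNE's public epoching helpers (illustrative only, not the function's internal implementation; raw is assumed to be a loaded mne.io.Raw object):

>>> import numpy as np
>>> import mne
>>> tmp_epochs = mne.make_fixed_length_epochs(raw, duration=2.0, overlap=0.0, preload=True)
>>> data = tmp_epochs.get_data(picks='eeg')  # shape (n_epochs, n_channels, n_times)
>>> epoch_stds = data.std(axis=-1)           # per-channel std within each epoch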

Parameters:
raw : mne.io.Raw

The continuous EEG data to analyze for noisy segments.

epoch_duration : float, default 2.0

Duration of epochs in seconds for noise detection. Shorter epochs provide finer temporal resolution but may be less stable for outlier detection.

epoch_overlap : float, default 0.0

Overlap between epochs in seconds. Non-zero overlap provides smoother detection but increases computation time.

picks : list of str, str, or None, default None

Channels to include in analysis. If None, defaults to 'eeg'. Can be channel names (e.g., ['EEG 001', 'EEG 002']) or channel types (e.g., 'eeg', 'grad').

quantile_k : float, default 3.0

Multiplier for the IQR when defining outlier thresholds for channel standard deviations. A channel's std in an epoch is an outlier if it lies more than k IQRs above Q3 or more than k IQRs below Q1, relative to that channel's own distribution of stds across all epochs. Higher values = more conservative detection.

quantile_flag_crit : float, default 0.2

Proportion threshold (0.0-1.0). If more than this proportion of the picked channels have an outlier std within an epoch, that epoch is flagged as noisy. Lower values = more sensitive detection.

annotation_description : str, default "BAD_noisy_segment"

The description to use for MNE annotations marking noisy segments. Should start with “BAD_” to be recognized by MNE as artifact annotations.

verbose : bool or None, default None

Control verbosity of output during processing.

Returns:
raw_annotated : mne.io.Raw

Copy of input Raw object with added annotations for noisy segments. Original data is not modified.

Raises:
TypeError

If raw is not an MNE Raw object.

ValueError

If parameters are outside valid ranges or no epochs can be created.

RuntimeError

If processing fails due to insufficient data or other errors.

See also

annotate_uncorrelated_segments

Detect segments with poor channel correlations

mne.Annotations

MNE annotations system

autoclean.detect_outlier_epochs

Statistical outlier detection for epochs

Notes

Detection Algorithm:

1. For each channel, its standard deviation is calculated within each epoch.
2. For each channel, the distribution of its standard deviations across all epochs is analyzed using quartiles (Q1, Q3) and the IQR.
3. Outlier thresholds are Q1 - k*IQR and Q3 + k*IQR.
4. An epoch is marked as noisy if the proportion of channels whose standard deviation falls outside their respective outlier bounds exceeds the quantile_flag_crit threshold.
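Steps 2-4 can be sketched with NumPy, where epoch_stds is an (n_epochs, n_channels) array of per-epoch channel standard deviations as in the earlier sketch (illustrative only; the thresholds mirror the defaults quantile_k=3.0 and quantile_flag_crit=0.2):

>>> q1, q3 = np.quantile(epoch_stds, [0.25, 0.75], axis=0)  # per-channel quartiles across epochs
>>> iqr = q3 - q1
>>> lower, upper = q1 - 3.0 * iqr, q3 + 3.0 * iqr           # quantile_k = 3.0
>>> outlier = (epoch_stds < lower) | (epoch_stds > upper)   # boolean (n_epochs, n_channels)
>>> bad_epochs = outlier.mean(axis=1) > 0.2                 # flagged if proportion > quantile_flag_crit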

Parameter Guidelines:

- epoch_duration: 1-4 seconds typical. Shorter for transient artifacts, longer for stable outlier detection.
- quantile_k: 2-4 typical. Higher values = fewer false positives.
- quantile_flag_crit: 0.1-0.3 typical. Lower = more sensitive.

Performance Considerations:

- Processing time scales with (data_length / epoch_duration).
- Memory usage depends on the number of epochs and channels.
- Overlap increases computation but may improve detection continuity.

References

This implementation adapts concepts from the PREP pipeline and pylossless:

Bigdely-Shamlo, N., Mullen, T., Kothe, C., Su, K. M., & Robbins, K. A. (2015). The PREP pipeline: standardized preprocessing for large-scale EEG analysis. Frontiers in Neuroinformatics, 9, 16.

Examples

Basic noise detection with default parameters:

>>> from autoclean import annotate_noisy_segments
>>> raw_clean = annotate_noisy_segments(raw)
>>> noisy_annotations = [ann for ann in raw_clean.annotations
...                     if 'noisy' in ann['description']]
>>> print(f"Found {len(noisy_annotations)} noisy segments")

Conservative detection for high-quality data:

>>> raw_clean = annotate_noisy_segments(
...     raw,
...     epoch_duration=3.0,
...     quantile_k=4.0,
...     quantile_flag_crit=0.3
... )

Sensitive detection for noisy data:

>>> raw_clean = annotate_noisy_segments(
...     raw,
...     epoch_duration=1.0,
...     quantile_k=2.0,
...     quantile_flag_crit=0.1,
...     annotation_description="BAD_very_noisy"
... )

Detection restricted to a selected subset of channels:

>>> raw_clean = annotate_noisy_segments(
...     raw,
...     picks=['Fp1', 'Fp2', 'F3', 'F4', 'C3', 'C4', 'P3', 'P4'],
...     epoch_duration=2.0
... )
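
Excluding annotated segments during downstream epoching relies on MNE's standard reject_by_annotation behavior, since the descriptions start with "BAD_" (illustrative; assumes events has already been created, e.g. with mne.find_events):

>>> raw_clean = annotate_noisy_segments(raw)
>>> epochs = mne.Epochs(raw_clean, events, tmin=-0.2, tmax=0.5,
...                     reject_by_annotation=True, preload=True)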