
Classification of Infant Sleep–Wake States from Natural Overnight In-Crib Sleep Videos

Shayda Moezzi and ACLab Associates

Abstract

Infant sleep is critical for healthy development, and disruptions in sleep patterns can have profound implications for infant brain maturation and overall well-being. Traditional methods for monitoring infant sleep often rely on intrusive equipment or time-intensive manual annotations, which hinder their scalability in clinical and research applications. We present our dataset, SmallSleeps, which includes 152 hours of overnight recordings of 17 infants aged 4–11 months captured in real-world home environments. Using this dataset, we train a deep learning algorithm for classification of infant sleep–wake states from short 90 s video clips drawn from natural, overnight, in-crib baby monitor footage, based on a two-stream spatiotemporal model that integrates rich RGB frames with optical flow features. Our binary classification algorithm was trained and tested on "pure" state clips featuring a single state dominating the timeline (i.e., over 90% sleep or over 90% wake) and achieves over 80% precision and recall. We also perform a careful experimental study of the results of training and testing on "mixed" clips featuring specified levels of heterogeneity, with a view towards applications to infant sleep segmentation and sleep quality classification in longer, overnight videos, where local behavior is often mixed. This local-to-global approach allows deep learning to be effectively deployed on the strength of tens of thousands of video clips, despite a relatively modest sample size of 17 infants.

Why monitor infant sleep?

  • Infant sleep plays a vital role in early brain development, cognitive function, and overall well-being
  • Disruptions in sleep patterns during infancy can signal developmental issues and have long-term implications for health and neurodevelopment
[Images: polysomnography setup, EEG traces, and in-crib baby sleep video]

Towards a computer vision-based approach

Problem: Traditional methods are intrusive, expensive, and uncomfortable, limiting their widespread use
  • Gold standard for sleep monitoring is polysomnography (PSG)
  • PSG involves recording physiological signals using multiple contact sensors
  • PSG can cause discomfort and disrupt natural sleep cycles
Solution: Using computer vision techniques, we aim to develop a non-invasive, scalable system for infant sleep–wake classification from naturalistic overnight videos
  • Spatiotemporal models provide a scalable and accessible alternative for analyzing infant sleep–wake states
  • Home in on the strengths of computer vision to address the complexities of real-world infant behavior, moving toward a future of seamless, data-driven infant health monitoring

Overall Pipeline

[Figure: overall infant sleep–wake classification pipeline]

Two-Stream Network Architecture

  • Processes 90-second video clips through parallel RGB and optical flow streams.
  • Each stream consists of two main stages:

1. Feature Extraction

  • Uses a 3D ConvNet to extract spatiotemporal features from K sequential frames.

2. Classification

  • Incorporates self-attention mechanisms for temporal dependency modeling.
  • Applies temporal averaging and fully connected layers with ReLU activation.

Modality Selection/Fusion

  • Allows flexible stream combination: RGB alone, flow alone, or RGB + flow.
  • Final classification into Sleep/Wake states.

The architecture highlights two main processing stages: feature extraction (green blocks) and classification (purple blocks).
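
To make the architecture concrete, below is a minimal PyTorch sketch of a two-stream clip classifier with this structure: a 3D ConvNet feature extractor per stream, self-attention over the temporal dimension, temporal averaging, fully connected layers with ReLU, and flexible fusion of RGB and optical flow. The specific backbone (torchvision's r3d_18), hidden sizes, and attention settings are illustrative assumptions, not the exact configuration used in the paper.

# Minimal sketch of the two-stream sleep-wake classifier described above.
# Illustrative reconstruction only: backbone choice, layer sizes, and fusion
# details are assumptions, not the released implementation.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18


class StreamEncoder(nn.Module):
    """One stream (RGB or optical flow): 3D ConvNet features -> self-attention
    over time -> temporal averaging -> fully connected layer with ReLU."""

    def __init__(self, in_channels=3, feat_dim=512):
        super().__init__()
        backbone = r3d_18(weights=None)
        if in_channels != 3:
            # Optical flow clips have 2 channels (x/y displacement).
            backbone.stem[0] = nn.Conv3d(in_channels, 64, kernel_size=(3, 7, 7),
                                         stride=(1, 2, 2), padding=(1, 3, 3), bias=False)
        # Keep everything up to (but not including) the global pooling / fc head.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.attn = nn.MultiheadAttention(embed_dim=feat_dim, num_heads=4, batch_first=True)
        self.head = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU())

    def forward(self, clip):                      # clip: (B, C, T, H, W)
        f = self.features(clip)                   # (B, 512, T', H', W')
        f = f.mean(dim=[3, 4]).transpose(1, 2)    # spatial pooling -> (B, T', 512)
        f, _ = self.attn(f, f, f)                 # self-attention over time steps
        f = f.mean(dim=1)                         # temporal averaging -> (B, 512)
        return self.head(f)                       # (B, 256)


class TwoStreamSleepWakeNet(nn.Module):
    """Flexible modality selection: RGB alone, flow alone, or RGB + flow."""

    def __init__(self, use_rgb=True, use_flow=True):
        super().__init__()
        self.rgb = StreamEncoder(in_channels=3) if use_rgb else None
        self.flow = StreamEncoder(in_channels=2) if use_flow else None
        fused_dim = 256 * sum([use_rgb, use_flow])
        self.classifier = nn.Linear(fused_dim, 2)  # sleep vs. wake logits

    def forward(self, rgb_clip=None, flow_clip=None):
        feats = []
        if self.rgb is not None and rgb_clip is not None:
            feats.append(self.rgb(rgb_clip))
        if self.flow is not None and flow_clip is not None:
            feats.append(self.flow(flow_clip))
        return self.classifier(torch.cat(feats, dim=1))


# Example: a 90 s clip at 10 fps, subsampled here to K = 32 frames of size 112 x 112.
model = TwoStreamSleepWakeNet()
rgb = torch.randn(1, 3, 32, 112, 112)
flow = torch.randn(1, 2, 32, 112, 112)
logits = model(rgb_clip=rgb, flow_clip=flow)      # shape (1, 2)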

SmallSleeps Dataset Creation

Over 152 hours of overnight recordings, captured at 10 frames per second, of 17 infants aged 4–11 months

  • Video cameras were sent to infants' homes and baby monitors were set up in cribs by caregivers and activated for overnight recordings
  • Behavioral coding was employed to annotate sleep and wake states, allowing for non-invasive data collection
    • Each video was reviewed in near real-time by two research assistants, who placed time markers at the start and end of waking-like states
    • True wake states were then inferred from the aggregation of both sets of codes (see the sketch below)
[Figure: SmallSleeps dataset creation process]
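
To illustrate how such codes can be turned into clip-level labels of the kind described in the abstract (over 90% sleep or over 90% wake for "pure" clips), the sketch below merges the two coders' wake intervals by union and thresholds the wake coverage of each 90 s window. The merging rule and the helper names (merge_intervals, wake_coverage, label_clip) are illustrative assumptions, not the released annotation pipeline.

# Illustrative sketch (not the released pipeline): turning coded wake intervals
# into per-clip sleep / wake / mixed labels for 90 s windows.

def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals, e.g. the union of two coders' marks."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def wake_coverage(clip_start, clip_end, wake_intervals):
    """Fraction of the clip [clip_start, clip_end) covered by wake intervals (seconds)."""
    covered = sum(max(0.0, min(clip_end, e) - max(clip_start, s))
                  for s, e in wake_intervals)
    return covered / (clip_end - clip_start)

def label_clip(clip_start, wake_intervals, clip_len=90.0, purity=0.9):
    """'wake' or 'sleep' for pure clips (one state covers > 90%), 'mixed' otherwise."""
    frac_wake = wake_coverage(clip_start, clip_start + clip_len, wake_intervals)
    if frac_wake >= purity:
        return "wake"
    if frac_wake <= 1.0 - purity:
        return "sleep"
    return "mixed"

# Example: wake marks from two coders (seconds from recording start), merged by union.
coder_a = [(120.0, 210.0), (3600.0, 3680.0)]
coder_b = [(125.0, 215.0)]
wake = merge_intervals(coder_a + coder_b)
print(label_clip(0.0, wake))      # 'sleep'  (no wake overlap in the first 90 s)
print(label_clip(120.0, wake))    # 'wake'   (wake covers all of 120-210 s)
print(label_clip(180.0, wake))    # 'mixed'  (wake covers roughly 39% of 180-270 s)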

Feature Space Analysis


Our t-SNE visualization of the feature space reveals distinct patterns in how RGB and optical flow features capture sleep-wake states. The optical flow feature space shows clear separation between sleep and wake states, with subject-specific clusters that maintain their distinctiveness. This visualization helps explain the superior performance of optical flow features in our classification tasks, particularly in capturing the subtle movement patterns that distinguish wake states from sleep.
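
As a rough illustration of how such a plot can be produced, assuming per-clip feature vectors have already been extracted by a stream encoder, the following scikit-learn/matplotlib sketch projects the features with t-SNE and colors them by sleep/wake label; the t-SNE settings and the helper name plot_feature_space are assumptions, not the exact ones used.

# Illustrative t-SNE projection of per-clip features (e.g. the 256-D stream
# embeddings), colored by sleep/wake label. Settings are assumptions.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_feature_space(features, labels, title="Optical flow feature space"):
    """features: (N, D) array of clip embeddings; labels: (N,) array of 0=sleep, 1=wake."""
    emb = TSNE(n_components=2, perplexity=30, init="pca", random_state=0).fit_transform(features)
    for value, name in [(0, "sleep"), (1, "wake")]:
        mask = labels == value
        plt.scatter(emb[mask, 0], emb[mask, 1], s=4, label=name)
    plt.title(title)
    plt.legend()
    plt.show()

# Example with random stand-in features (replace with real extracted embeddings).
features = np.random.randn(2000, 256).astype(np.float32)
labels = np.random.randint(0, 2, size=2000)
plot_feature_space(features, labels)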

Impact

  • Contribution: Developed a spatiotemporal deep-learning model for infant sleep–wake classification trained on our own annotated real-world overnight infant sleep dataset
  • Impact: Provides a non-invasive, scalable solution for infant sleep monitoring, bridging the gap between research and real-world applications

CV4Smalls 2025 Presentation

BibTeX

@InProceedings{Moezzi_2025_WACV,
    author    = {Moezzi, Shayda and Wan, Michael and Manne, Sai Kumar Reddy and Mathew, Amal and Zhu, Shaotong and Galoaa, Bishoy and Hatamimajoumerd, Elaheh and Grace, Emma and Rowan, Cassandra B and Zimmerman, Emily and Taylor, Briana J and Hayes, Marie J and Ostadabbas, Sarah},
    title     = {Classification of Infant Sleep-Wake States from Natural Overnight In-Crib Sleep Videos},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {February},
    year      = {2025},
    pages     = {42-51}
}