Shayda Moezzi and ACLab Associates
Infant sleep is critical for healthy development, and disruptions in sleep patterns can have profound implications for infant brain maturation and overall well-being. Traditional methods for monitoring infant sleep often rely on intrusive equipment or time-intensive manual annotations, which hinder their scalability in clinical and research applications. We present our dataset, SmallSleeps, which includes 152 hours of overnight recordings of 17 infants aged 4–11 months captured in real-world home environments. Using this dataset, we train a deep learning algorithm for classification of infant sleep–wake states from short 90 s video clips drawn from natural, overnight, in-crib baby monitor footage, based on a two-stream spatiotemporal model that integrates rich RGB frames with optical flow features. Our binary classification algorithm was trained and tested on "pure" state clips featuring a single state dominating the timeline (i.e., over 90% sleep or over 90% wake) and achieves over 80% precision and recall. We also perform a careful experimental study of the effects of training and testing on "mixed" clips featuring specified levels of heterogeneity, with a view towards applications to infant sleep segmentation and sleep quality classification in longer, overnight videos, where local behavior is often mixed. This local-to-global approach allows deep learning to be deployed effectively on the strength of tens of thousands of video clips, despite a relatively modest sample size of 17 infants.
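The "pure" versus "mixed" clip distinction above can be made concrete with a small labeling helper. This is an illustrative sketch, not the authors' code: the `label_clip` function, its `purity` threshold parameter, and the per-frame sleep/wake annotation sequence it consumes are our own hypothetical naming.

```python
def label_clip(frame_states, purity=0.9):
    """Label a clip from per-frame states ("sleep"/"wake").

    Returns "sleep" or "wake" when a single state covers more than
    `purity` of the frames (the paper's 90% criterion), else "mixed".
    """
    if not frame_states:
        raise ValueError("empty clip")
    sleep_frac = frame_states.count("sleep") / len(frame_states)
    if sleep_frac > purity:
        return "sleep"
    if sleep_frac < 1 - purity:
        return "wake"
    return "mixed"

# A 90 s clip at 10 fps has 900 frames; ~94% sleep here.
clip = ["sleep"] * 850 + ["wake"] * 50
print(label_clip(clip))  # sleep
```

Clips labeled "mixed" can then be binned by their sleep fraction to construct the heterogeneity levels used in the mixed-clip experiments.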
Two-Stream Network Architecture
[Figure: two-stream pipeline with two stages, (1) feature extraction and (2) classification, connected by modality selection/fusion.]
The architecture highlights two main processing stages: feature extraction (green blocks) and classification (purple blocks).
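As a rough sketch of the two-stage design, the toy pipeline below mirrors the figure in NumPy: each stream (RGB, optical flow) yields a feature vector, the vectors are fused by concatenation, and a linear layer stands in for the classifier. The pooling, random-projection "backbone", and fusion-by-concatenation choices are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def extract_features(stream, dim=16):
    # Stand-in for a learned CNN backbone: temporal average pooling
    # followed by a fixed random projection to `dim` dimensions.
    # stream: (num_frames, height, width) array for one modality.
    pooled = stream.mean(axis=0).ravel()
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((pooled.size, dim))
    return pooled @ proj

def fuse_and_classify(rgb_feat, flow_feat, weights, bias):
    # Late fusion: concatenate per-stream features, then apply a
    # linear classifier and threshold the logit at zero.
    fused = np.concatenate([rgb_feat, flow_feat])
    logit = fused @ weights + bias
    return "wake" if logit > 0 else "sleep"

# Toy clip: 900 frames (90 s at 10 fps) of 8x8 "video" per stream.
rgb = np.zeros((900, 8, 8))
flow = np.ones((900, 8, 8))  # uniform motion stand-in for optical flow
rgb_feat = extract_features(rgb)
flow_feat = extract_features(flow)
weights = np.ones(rgb_feat.size + flow_feat.size)
print(fuse_and_classify(rgb_feat, flow_feat, weights, bias=0.0))
```

In practice each stream would be a trained spatiotemporal network and the fusion/classifier weights would be learned, but the data flow (per-modality features, fusion, classification) matches the two stages in the figure.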
Over 152 hours of overnight recordings, captured at 10 frames per second, of 17 infants aged 4–11 months
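At these rates the scale of the clip dataset is easy to estimate: a 90 s clip at 10 fps spans 900 frames, and 152 hours of footage yields roughly six thousand non-overlapping clips, with overlapping sampling pushing the count into the tens of thousands cited in the abstract. The 10-second stride below is our own assumption for illustration, not a figure from the paper.

```python
FPS = 10
CLIP_SECONDS = 90
TOTAL_HOURS = 152

frames_per_clip = FPS * CLIP_SECONDS             # 900 frames per clip
total_seconds = TOTAL_HOURS * 3600
non_overlapping = total_seconds // CLIP_SECONDS  # 6080 clips
stride_seconds = 10                              # assumed sampling stride
overlapping = (total_seconds - CLIP_SECONDS) // stride_seconds + 1

print(frames_per_clip, non_overlapping, overlapping)  # 900 6080 54712
```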
Our t-SNE visualization of the feature space reveals distinct patterns in how RGB and optical flow features capture sleep-wake states. The optical flow feature space shows clear separation between sleep and wake states, with subject-specific clusters that maintain their distinctiveness. This visualization helps explain the superior performance of optical flow features in our classification tasks, particularly in capturing the subtle movement patterns that distinguish wake states from sleep.
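A t-SNE embedding like the one described can be produced with scikit-learn. The sketch below embeds synthetic per-clip feature vectors (random stand-ins for the real RGB/optical flow features, with wake clips shifted to mimic motion-heavy features) into 2-D; the sample counts, dimensions, and perplexity are illustrative choices.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Synthetic stand-ins for per-clip feature vectors: 20 sleep, 20 wake.
sleep_feats = rng.normal(loc=0.0, scale=1.0, size=(20, 64))
wake_feats = rng.normal(loc=4.0, scale=1.0, size=(20, 64))
feats = np.vstack([sleep_feats, wake_feats])

# Perplexity must be smaller than the number of samples; 2-D output
# gives the scatter-plot coordinates for the visualization.
embedding = TSNE(n_components=2, perplexity=10, init="random",
                 random_state=0).fit_transform(feats)
print(embedding.shape)  # (40, 2)
```

The resulting 40×2 array can be scattered with the first 20 rows colored as sleep and the last 20 as wake to reproduce the kind of state separation described above.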
@InProceedings{Moezzi_2025_WACV,
author = {Moezzi, Shayda and Wan, Michael and Manne, Sai Kumar Reddy and Mathew, Amal and Zhu, Shaotong and Galoaa, Bishoy and Hatamimajoumerd, Elaheh and Grace, Emma and Rowan, Cassandra B and Zimmerman, Emily and Taylor, Briana J and Hayes, Marie J and Ostadabbas, Sarah},
title = {Classification of Infant Sleep-Wake States from Natural Overnight In-Crib Sleep Videos},
booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops},
month = {February},
year = {2025},
pages = {42-51}
}