BSTT: A Bayesian Spatial-Temporal Transformer for Sleep Staging
Yuchen Liu, Ziyu Jia
Institute of Computing Technology, ICT; Institute of Automation, CAS
Overview
Sleep plays an important role in human health, so automated sleep staging has significant practical value. Most current state-of-the-art approaches consider only one aspect, either the spatial or the temporal characteristics of the brain during sleep, and therefore model the spatio-temporal dynamics of the sleep process inadequately. A few works do account for spatio-temporal characteristics, but they lack an explicit representation of the brain's spatial-temporal relationships, which makes the resulting models less interpretable. The Bayesian Spatio-Temporal Transformer (BSTT) proposed in this paper introduces a relational inference method that automatically infers the brain's spatio-temporal relationships during sleep and generates spatio-temporal relationship inference maps to support spatio-temporal feature extraction. The model also adopts a Transformer architecture to capture spatio-temporal features, further improving its effectiveness. Extensive experiments show that our model achieves state-of-the-art results while retaining a degree of interpretability.
Architecture
The overall architecture of the proposed Bayesian spatial-temporal transformer for sleep staging. BSTT includes two Bayesian transformer modules: a spatial Bayesian transformer and a temporal Bayesian transformer. In each transformer module, the input features pass through a position embedding and a LayerNorm layer. A multi-head Bayesian relation inference component then infers the objects' spatial or temporal relations and captures the spatial-temporal features. Residual connections are used to mitigate overfitting and vanishing gradients.
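The paper's exact formulation is not reproduced on this page, but the relation-inference step described above can be sketched as follows. This is a minimal single-head NumPy illustration, assuming one plausible reading of "Bayesian relation inference": each pairwise relation is treated as a latent Gaussian whose mean and log-variance are predicted from the pair's features, sampled with the reparameterization trick, and normalized into a relation inference map. All names (`bayesian_relation_inference`, `w_mu`, `w_logvar`) are illustrative assumptions, not the authors' code.

```python
import numpy as np

def bayesian_relation_inference(x, w_mu, w_logvar, rng):
    """Sketch of one Bayesian relation-inference head (an assumption,
    not the paper's exact formulation).

    x: (n, d) object features -- e.g. EEG channels for the spatial
       module, or sleep-epoch time steps for the temporal module.
    Each pairwise relation is a latent Gaussian whose mean and
    log-variance are linear functions of the concatenated pair
    features; a sample is drawn via the reparameterization trick.
    """
    n, d = x.shape
    # Build all (i, j) feature pairs: rows repeat i, tiles cycle j.
    pairs = np.concatenate(
        [np.repeat(x, n, axis=0), np.tile(x, (n, 1))], axis=1)  # (n*n, 2d)
    mu = pairs @ w_mu          # (n*n, 1) relation means
    logvar = pairs @ w_logvar  # (n*n, 1) relation log-variances
    eps = rng.standard_normal(mu.shape)
    rel = (mu + np.exp(0.5 * logvar) * eps).reshape(n, n)  # sampled relations
    # Row-wise softmax turns sampled relations into an inference map.
    rel = np.exp(rel - rel.max(axis=1, keepdims=True))
    rel /= rel.sum(axis=1, keepdims=True)
    # Aggregate features under the inferred relations (attention-style).
    return rel @ x, rel

rng = np.random.default_rng(0)
n, d = 4, 8
x = rng.standard_normal((n, d))
w_mu = rng.standard_normal((2 * d, 1)) * 0.1
w_logvar = rng.standard_normal((2 * d, 1)) * 0.1
out, rel_map = bayesian_relation_inference(x, w_mu, w_logvar, rng)
print(out.shape, rel_map.shape)  # (4, 8) (4, 4)
```

In the full model, several such heads would run in parallel (multi-head), and the `rel_map` matrices are what the paper visualizes as spatio-temporal relationship inference maps.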
Results
We used eight baseline methods for comparison with our approach.
Compared to the other baseline methods, BSTT achieves the best performance on both datasets. Specifically, MCNN and MMCNN use CNN models to automatically extract sleep features, while RNN-based methods such as DeepSleepNet and TinySleepNet focus on the temporal context in sleep data and model multi-level temporal features for sleep staging. GraphSleepNet and ST-Transformer model the spatial-temporal relationships during sleep simultaneously, with satisfactory results; however, they cannot adequately reason about these relationships, which limits their classification performance to a certain extent. Our Bayesian ST-Transformer uses a multi-head Bayesian relational inference component to reason about spatial-temporal relationships and thus models spatial and temporal relations better. As a result, the proposed model achieves the best classification performance on both datasets.
We visualize the spatial and temporal relationship graphs on the MASS-SS3 dataset. Previous work has shown that during light sleep, cerebral blood flow (CBF) and cerebral metabolic rate (CMR) are only about 3% to 10% lower than during wakefulness, with a larger reduction during deep sleep (Madsen & Vorstrup, 1991). Synaptic connection activity is directly correlated with CBF and CMR, which is consistent with our spatial relationship intensity graphs. In addition, previous studies have shown that stability within an unchanging sleep period is stronger, and that sleep instability underlies sleep stage transitions (Bassi et al., 2009), which is consistent with our experimental results.
BibTeX
@inproceedings{liu2023bstt,
  title     = {BSTT: A Bayesian Spatial-Temporal Transformer for Sleep Staging},
  author    = {Liu, Yuchen and Jia, Ziyu},
  booktitle = {The Eleventh International Conference on Learning Representations},
  year      = {2023}
}