Feasibility of Eye Tracking Assisted Vestibular Rehabilitation Strategy Using Immersive Virtual Reality
Abstract
Objectives
Even though vestibular rehabilitation therapy (VRT) using a head-mounted display (HMD) has been highlighted recently as a popular virtual reality platform, we should consider that the HMD itself does not provide an interactive environment for VRT. This study aimed to test the feasibility of interactive components using an eye tracking assisted strategy through neurophysiologic evidence.
Methods
An HMD implemented with an infrared-based eye tracker was used to generate a virtual environment for VRT. Eighteen healthy subjects participated in our experiment, wherein they performed a saccadic eye exercise (SEE) under two conditions: feedback-on (F-on, visualization of eye position) and feedback-off (F-off, non-visualization of eye position). Eye position was continuously monitored in real time under both conditions, but the participants were not informed that their eyes were being tracked. Electroencephalogram recordings were used to estimate neural dynamics and attention during SEE, in which only valid trials (correct responses) were included in electroencephalogram analysis.
Results
SEE accuracy was higher in the F-on than in the F-off condition (P=0.039). The power spectral density of the beta band was higher in the F-on condition over the frontal (P=0.047), central (P=0.042), and occipital areas (P=0.045). Beta–event-related desynchronization was significantly more pronounced in the F-on (–0.19 on frontal and –0.22 on central clusters) than in the F-off condition (0.23 on frontal and 0.05 on central) during the preparatory phase (P=0.005 for frontal and P=0.024 for central). In addition, more abundant functional connectivity was revealed under the F-on condition.
Conclusion
Considering that substantial gain may come from goal-directed attention and activation of the brain network while performing VRT, our preclinical study of SEE suggests that eye tracking algorithms may work efficiently in vestibular rehabilitation using HMD.
INTRODUCTION
Vestibular rehabilitation (VR) is known to be a safe and effective treatment modality for patients with dizziness. Earlier works have demonstrated that patients who are referred to VR programs earlier show less disability and better performance after vestibular deafferentation [1-3]. However, a few significant concerns have also been raised, for example, that the VR program is tedious and requires active effort from patients to achieve better performance [4]. Recently, virtual reality has been highlighted as a supplemental platform for overcoming these concerns, wherein a head-mounted display (HMD) that provides a fully immersive experience is commonly adopted for clinical applications because of its affordability and portability.
Numerous previous studies have suggested that the attractive components of virtual reality may contribute to successful outcomes [5-8]. In addition, when we focus on the theoretical background that virtual reality enables visual-vestibular interactions by abundant visual stimuli [9,10], virtual reality likely provides an optimized environment for better performance in VR exercises.
However, findings on the clinical efficacy of virtual reality for VR are inconsistent [5,11]. Furthermore, despite growing interest in fully immersive wearable devices such as HMDs as a popular virtual reality platform for VR, the clinical optimization for dizzy patients is still in its beginning stages. To this end, we first developed HMD-based virtual reality content on the basis of classical Cawthorne-Cooksey exercises [12,13], which has not been previously reported to our knowledge (Supplementary Materials 1-3, Video clips 1-3). However, we paradoxically noticed that the HMD itself was not able to provide a desirable environment for VR. This finding is based on the theoretical conflict between the nature of a VR program, which requires repeated performance of a task over time with intervention by a supervisor, and the difficulty of this type of intervention within the fully immersive environment of an HMD. Therefore, we hypothesized that the addition of an interactive component may be critical to optimize HMD for VR; thus, we added visual feedback via an eye tracker so that participants would be able to conduct VR exercises in a correct and interactive manner by perceiving their eye-gaze position without intervention by the supervisor.
Here, we attempted to measure neurophysiological responses in the context of attention and brain dynamics during saccadic eye exercise (SEE) as a part of VR in HMD. To date, previous investigations into the clinical implications of virtual reality have been carried out only with clinical parameters such as a functional scale after VR [11,14]. Thus, even though the present study investigated only the neurophysiologic response for SEE, our results may provide fundamental insights into the clinical optimization of fully immersive virtual reality for all neurorehabilitation therapy with an interactive aspect, including VR.
MATERIALS AND METHODS
Development of digital content and experimental environment
An HMD (Oculus developers kit 2; Oculus, Irvine, CA, USA) implemented with an infrared-based eye tracker (SMI, Teltow, Germany) was used to generate an immersive virtual environment for the VR exercise. Digital contents for VR were designed with the Unity 3D program (Unity Technologies, San Francisco, CA, USA), wherein we developed (1) a smooth pursuit eye exercise, (2) SEE, (3) a gaze stabilization exercise, and (4) a visuo-vestibular head-eyes exercise (Supplementary Materials 1-3, Video clips 1-3). From those contents, we selected the saccadic exercise as our experimental environment to avoid difficulty of analysis due to significant artifacts from head movement. The environment was revised to minimize visual distractors for our experiment (Fig. 1).
Subjects and experiment paradigm
Eighteen healthy subjects participated in our experiment (six women; mean age, 26.5 years; range, 21 to 36 years) in accordance with the ethics guidelines established by the Institutional Review Board of the Hallym University College of Medicine (IRB No. 2016-I102). Inclusion criteria were (1) normal vestibular function on conventional vestibular testing, (2) less than 40 years old, (3) no active or prior history of vertigo, and (4) no history of psychiatric problems. Electroencephalograms (EEGs) were used to obtain electrophysiological evidence. The participants wore an HMD in a sitting position while an EEG cap was fitted over the head, and the electrodes were connected to an analytic computer through wires. All subjects were asked to fixate their eyes on a fixation point for 1.5±0.5 seconds and to then move their eyes quickly from one target to the other presented in the horizontal plane (saccadic eye movement for 1.25 seconds). The experimental environment consisted of two blocks of feedback-on (F-on, visualization of eye position) and feedback-off (F-off, non-visualization of eye position) conditions, the order of which was randomly allocated among subjects (80 trials per block) (Fig. 1). There were no differences in the behavioral outcome and neurophysiologic response between the subjects who underwent F-on first (F-off later) and those who underwent F-off first (F-on later) (P>0.05). Eye and gaze position were continuously recorded in real time during all experimental exercises regardless of the visualization of eye position, but the participants were not informed that their eyes were being tracked. The time from −1,250 to −1,000 ms (fixation period) relative to target presentation onset was used as a reference period to estimate event-related neural activity. StimTracker (Quad model; Cedrus Corp., San Pedro, CA, USA) was used to avoid operating system delays.
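The trial structure described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the Unity implementation used in the study: the function names, the uniform distribution of the fixation jitter, and the use of a seeded random generator are all hypothetical.

```python
import random

FIXATION_MEAN_S = 1.5     # from the text: fixation for 1.5 +/- 0.5 seconds
FIXATION_JITTER_S = 0.5
SACCADE_WINDOW_S = 1.25   # from the text: saccadic eye movement for 1.25 seconds
TRIALS_PER_BLOCK = 80     # from the text: 80 trials per block


def make_block(condition, rng):
    """Build the trial list for one block ("F-on" or "F-off")."""
    trials = []
    for index in range(TRIALS_PER_BLOCK):
        # Assumption: the jitter is drawn uniformly within +/- 0.5 s.
        fixation_s = rng.uniform(FIXATION_MEAN_S - FIXATION_JITTER_S,
                                 FIXATION_MEAN_S + FIXATION_JITTER_S)
        trials.append({
            "condition": condition,
            "trial": index,
            "fixation_s": fixation_s,
            "saccade_s": SACCADE_WINDOW_S,
        })
    return trials


def make_session(seed):
    """Randomly order the two blocks for one subject and concatenate them."""
    rng = random.Random(seed)
    order = ["F-on", "F-off"]
    rng.shuffle(order)
    return [trial for condition in order for trial in make_block(condition, rng)]
```

Calling `make_session(seed)` with a different seed per subject would give each subject a randomly allocated block order, with every fixation duration between 1.0 and 2.0 seconds.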
Signal acquisition and preprocessing
EEG recordings were used to infer functional neural dynamics, including the visual-vestibular interaction at the cortical level during VR. The detailed methods were described previously [15]. Briefly, the EEG signal was recorded via a BrainAmp DC amplifier with a 32-channel actiCAP (Brain Products, Munich, Germany), with electrodes distributed along the scalp according to the international 10-10 system, at a sampling rate of 5,000 Hz and with FCz as the reference. Electrode impedances were maintained below 5 kΩ during the recordings. The raw EEG data were resampled offline at 512 Hz and band-pass filtered using a 1 Hz high-pass filter and a 50 Hz low-pass filter implemented in Brain Vision Analyzer software (Brain Products). After re-referencing to the common average reference, visual inspection and interpolation were applied to identify and reject bad intervals and malfunctioning channels. Artifacts from eye movements such as blinks and from muscular activity were removed using independent component analysis [16]. Only valid trials (correct responses) were included. A semiautomatic artifact rejection algorithm was used to reject epochs with peak signal amplitudes of more than 80 μV. We also tested whether the HMD would result in an elevation of the noise level that could compromise stable analysis, and the results showed no significant noise elevation compared with baseline, consistent with previous works [17,18]. After EEG data preprocessing, three subjects (all male) were excluded from the final analysis because of excessive artifacts in the raw EEG data.
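Two of the preprocessing steps above, common average re-referencing and amplitude-based epoch rejection, can be illustrated with a short sketch. It is not the Brain Vision Analyzer procedure used in the study; the function names are hypothetical, and the rejection rule assumes that "peak signal amplitude" means the absolute value of any sample in the epoch.

```python
REJECTION_THRESHOLD_UV = 80.0  # from the text: peak amplitudes above 80 microvolts


def common_average_reference(samples):
    """Subtract the mean across channels from every channel at each time point.

    samples: list of channels, each a list of amplitudes (microvolts)
    of equal length.
    """
    n_channels = len(samples)
    n_times = len(samples[0])
    means = [sum(channel[t] for channel in samples) / n_channels
             for t in range(n_times)]
    return [[channel[t] - means[t] for t in range(n_times)]
            for channel in samples]


def is_epoch_clean(epoch, threshold_uv=REJECTION_THRESHOLD_UV):
    """Return True when no sample in the epoch exceeds the threshold.

    Assumption: the criterion is applied to the absolute amplitude.
    """
    return all(abs(value) <= threshold_uv
               for channel in epoch for value in channel)
```

An epoch for which `is_epoch_clean` returns False would be dropped before the spectral analysis.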
EEG power spectrum and time-frequency analysis
The power spectral density (PSD) and event-related spectral perturbation (ERSP) as functions of frequency and time-frequency were processed by EEGLAB [19] using custom functions/scripts running in MATLAB R2016a (MathWorks, Natick, MA, USA). The PSD was converted to a logarithmic scale (dB power) in the frequency domain and was calculated for each EEG epoch using a Hanning window with a length of 512 data points (1 second at the 512 Hz sampling rate), which resulted in a frequency resolution of 1 Hz. For statistical analysis of the PSD, we pooled EEG channels into four separate symmetrical clusters over the scalp: frontal (F3, Fz, F4), central (C3, Cz, C4), parietal (P3, Pz, P4) and occipital (O1, Oz, and O2) in the frequency bands theta (4–7 Hz), alpha (8–12 Hz), and beta (13–30 Hz). The PSD data were averaged over the entire time course and over each frequency band in the cluster.
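The pooling of PSD values into clusters and frequency bands can be sketched as below. This is a hedged illustration rather than the EEGLAB/MATLAB scripts used in the study. The data layout (a dictionary of channel to frequency to dB value) and the function names are hypothetical, the frontal cluster is assumed to be the symmetric set F3, Fz, F4, and the resolution check assumes a 512-sample window at the 512 Hz resampled rate.

```python
CLUSTERS = {
    "frontal": ("F3", "Fz", "F4"),   # assumed symmetric frontal cluster
    "central": ("C3", "Cz", "C4"),
    "parietal": ("P3", "Pz", "P4"),
    "occipital": ("O1", "Oz", "O2"),
}
BANDS_HZ = {"theta": (4, 7), "alpha": (8, 12), "beta": (13, 30)}


def frequency_resolution_hz(sampling_rate_hz, window_samples):
    """Spacing between FFT bins for a window of the given length."""
    return sampling_rate_hz / window_samples


def cluster_band_power_db(psd_db, cluster, band):
    """Average dB power over the channels of a cluster and the band's bins.

    psd_db: dict mapping channel name -> dict mapping integer frequency
    (Hz) -> power in dB.
    """
    low, high = BANDS_HZ[band]
    values = [psd_db[channel][freq]
              for channel in CLUSTERS[cluster]
              for freq in range(low, high + 1)]
    return sum(values) / len(values)
```

With 1 Hz bins, the beta band (13–30 Hz) contributes 18 bins per channel, so each frontal beta value is the mean of 54 numbers.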
In the time-frequency domain, we decomposed the EEG signal using a short-time fast Fourier transform with a window size of 512 data points and a pad ratio of 2 for ERSP measures [20]. The results are presented as both event-related synchronization (ERS) and event-related desynchronization (ERD) over time relative to the specified baseline (pre-saccade fixation time). The statistics for the ERSP were obtained after dividing the entire time length from −1,000 to 0 ms into four consecutive windows of 250 ms. The statistical analyses for ERSP and PSD were performed using t-tests between conditions. All statistics were calculated using IBM SPSS ver. 20.0 (IBM Corp., Armonk, NY, USA). We did not investigate the neural activity of the gamma band during saccadic eye movement after target onset in the power spectrum and time-frequency domains because rapid ocular muscle movements such as saccades impair the EEG signal in that frequency range [21].
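The baseline normalization behind ERD/ERS and the division into 250 ms windows can be sketched as follows. The sketch assumes the common convention of expressing ERSP as a dB ratio to baseline power, so that negative values indicate ERD and positive values ERS; the source does not state the exact normalization, and the function names are hypothetical.

```python
import math


def ersp_db(power, baseline_power):
    """Event-related power change in dB relative to baseline power.

    Negative values correspond to ERD, positive values to ERS.
    """
    return 10.0 * math.log10(power / baseline_power)


def window_means(times_ms, values, start_ms=-1000, stop_ms=0, width_ms=250):
    """Average values inside consecutive half-open windows [lo, hi).

    Returns a dict keyed by (lo, hi); a window with no samples maps to None.
    """
    result = {}
    for lo in range(start_ms, stop_ms, width_ms):
        hi = lo + width_ms
        inside = [v for t, v in zip(times_ms, values) if lo <= t < hi]
        result[(lo, hi)] = sum(inside) / len(inside) if inside else None
    return result
```

With the default arguments, the second window returned by `window_means` is the one spanning −750 to −500 ms.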
Functional connectivity analysis
The directed transfer function was applied to map and image the functional connectivity at the source level using a modified MATLAB-based open-source toolbox, eConnectome (Biomedical Functional Imaging and Neuroengineering Laboratory, University of Minnesota, Minneapolis, MN, USA) [22,23]. A more detailed protocol has been described in our previous study [15]. In the present study, we manually created four bilateral regions of interest: (1) frontal eye field (FEF; right: x=22, y=26, z=45; left: x=–23, y=24, z=44); (2) parieto-insular vestibular cortex (PIVC; right: x=49, y=–35, z=18; left: x=–50, y=–38, z=18); (3) primary visual cortex (V1; right: x=11, y=–78, z=9; left: x=–11, y=–78, z=10), and (4) primary somatosensory cortex (S1; right: x=41, y=–27, z=47; left: x=–40, y=–27, z=47) based on the standard brains from the Montreal Neurological Institute [24]; these regions are related to visual-vestibular and multisensory processing. Dynamic mean functional connectivity of the theta band during the saccadic period and of the gamma band (30–50 Hz) during the fixation period are presented, in which statistical assessment of the connectivity was performed using a surrogate approach (1,000 surrogate data sets, P<0.05).
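For readers unfamiliar with the method, the normalized directed transfer function and a surrogate-based significance check can be sketched as below. This is not the eConnectome code: the transfer matrix is taken as given (in practice it is derived from a fitted multivariate autoregressive model), the empirical P-value convention of adding one to numerator and denominator is an assumption, and all function names are hypothetical.

```python
def normalized_dtf(transfer_matrix):
    """Normalized DTF at one frequency.

    transfer_matrix[i][j] is the complex transfer value H_ij from source j
    to sink i. Each squared magnitude is divided by the sum of squared
    magnitudes in its row, so the inflows to every sink sum to 1.
    """
    result = []
    for row in transfer_matrix:
        total = sum(abs(h) ** 2 for h in row)
        result.append([abs(h) ** 2 / total for h in row])
    return result


def surrogate_p_value(observed, surrogate_values):
    """Empirical one-sided P-value of an observed connectivity value.

    Assumption: P = (1 + number of surrogates >= observed) / (1 + n).
    """
    exceed = sum(1 for value in surrogate_values if value >= observed)
    return (1 + exceed) / (1 + len(surrogate_values))


def is_significant(observed, surrogate_values, alpha=0.05):
    """True when the observed value is significant against the surrogates."""
    return surrogate_p_value(observed, surrogate_values) < alpha
```

A connection would be reported only when its observed value passes `is_significant` against the values obtained from the 1,000 surrogate data sets.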
RESULTS
Behavioral data
Reaction time (RT) for each condition was computed from eye movement onset, defined as the time when a participant’s eyes were exactly placed on the visual targets. The RTs for the F-off and F-on conditions were 306.21±95.36 and 342.29±62.24 ms, respectively, and were not significantly different. We defined corrective eye movement as participants exactly focusing their eyes on a visual target during a SEE, and accuracy was calculated as the number of corrective eye movements divided by the total number of trials (the number of corrective eye movements/the number of total trials×100): the accuracy was 81.82%±14.13% in the F-on condition and 65.71%±24.32% in the F-off condition, which showed a significant difference (P=0.039) (Fig. 2).
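The accuracy measure defined above can be sketched in a few lines. The gaze-on-target criterion is a hypothetical stand-in, since the source does not state the spatial tolerance used to decide that the eyes were "exactly placed" on a target; the tolerance value and function names are assumptions for illustration only.

```python
import math
import statistics


def is_on_target(gaze_deg, target_deg, tolerance_deg=1.0):
    """True when gaze lies within tolerance_deg of the target.

    gaze_deg and target_deg are (horizontal, vertical) positions in
    degrees. The 1.0 degree tolerance is a hypothetical value.
    """
    dx = gaze_deg[0] - target_deg[0]
    dy = gaze_deg[1] - target_deg[1]
    return math.hypot(dx, dy) <= tolerance_deg


def accuracy_percent(n_corrective, n_total):
    """Number of corrective eye movements / number of total trials x 100."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_corrective / n_total


def mean_and_sd(values):
    """Mean and sample standard deviation across participants."""
    return statistics.mean(values), statistics.stdev(values)
```

For example, a participant with 60 corrective eye movements in an 80-trial block would score 75% under this definition.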
Power spectral and time-frequency analysis
Beta band-PSD under the F-on condition was 0.45±0.42 dB for the frontal area, 0.42±0.44 dB for the central area and 0.51±0.41 dB for the occipital area, which were significantly higher than the 0.15±0.11 dB for the frontal area, 0.13±0.11 dB for the central area and 0.29±0.21 dB for the occipital area under the F-off condition (P=0.047 for the frontal, P=0.042 for the central, and P=0.045 for the occipital area; upper panels of Fig. 3A, B, and D). However, we did not find any significant difference in theta or alpha oscillatory power between the two conditions.
In the grand average waveform of event-related power modulations, beta-ERD was more pronounced before the onset of target presentation in the F-on than in the F-off condition. Mean beta-ERDs during the preparatory period from –750 to –500 ms were calculated to be –0.19 on frontal and –0.22 on central clusters in the F-on condition, whereas event-related power modulations during the same time period showed an ERS (0.231 on frontal and 0.05 on central) rather than an ERD pattern in the F-off condition; the difference between conditions was statistically significant (P=0.005 for frontal and P=0.024 for central). From this result, we concluded that beta-ERD was more pronounced under the F-on condition than under the F-off condition during the middle of the preparatory phase for the saccadic exercise (Fig. 4).
Functional connectivity for VR exercise
The network of connectivity under the F-on condition was compared with that in the F-off condition. During the saccadic exercise period, participants exhibited significantly more abundant dynamic connectivity across the FEF, S1, PIVC and V1 areas in the F-on condition compared with the F-off condition (Fig. 5A, B). Interestingly, contrary to the general expectation that FEF modulates saccadic eye movement, we detected no significant functional connectivity (defined as approximately >50% of interactions within the network) between the FEF and the visual-vestibular areas during saccadic exercise in the F-off condition, suggesting that participants may not have actively followed the target with their eyes.
In the fixation period, we investigated the functional connectivity of the gamma band to estimate attention, in which we found significant differences in functional connectivity between the two conditions (Fig. 5C, D). More specifically, participants exhibited significant functional connectivity from the right PIVC to the contralateral PIVC and to both V1s, as well as between both S1s, in the F-on condition. However, we did not find significant connectivity besides that between both PIVCs in the F-off condition.
DISCUSSION
Virtual reality has been used for numerous clinical applications in recent years, especially in rehabilitation or medical simulation. Furthermore, recent advances in virtual reality technology and the release of low-cost HMDs that provide fully immersive experience have yielded a paradigm shift in VR strategies. The underlying theoretical background is based on the idea that the realistic visual condition provided by virtual reality may enhance adaptation by retinal slip [25]. A recent meta-analysis clinically supported that rehabilitation exercises using virtual reality can be a very valuable treatment modality for functional gain [11].
However, our concern is that a fully immersive environment such as HMD may lead to potentially adverse consequences. This idea arises from the concept that VR exercises should be performed in an interactive manner involving a supervisor for better performance [10], yet the fully immersive environment of an HMD is completely isolated from the real world and may make it impossible for a supervisor to intervene. For instance, the supervisor may not even be aware of a participant’s scattering of attention during a given session due to the enclosed space created by the HMD.
We did not inform participants that their eyes were being tracked and recorded in real time. Electrophysiological responses were measured using EEGs during VR with HMD to investigate the neuro-dynamics of the brain relating to goal-directed attention and functional connectivity. Even though these are preliminary results from healthy individuals, they may provide in-depth neural evidence for optimizing the implementation of HMD because the neuro-dynamics tested are essential components for enhancing the performance of VR [9,10].
The corrective eye exercise rate was 81.80% under the F-on condition, which showed a statistically higher accuracy than the 65.67% under the F-off condition (P=0.039). This result is largely expected because participants would attempt to correctively move their gaze towards a moving target for saccadic exercise based on the perception of their eye position. If patients tend to not exactly focus their eyes on a given visual target, intervention by the supervisor is required. However, the supervisor would not recognize whether patients correctly performed the VR exercise until after the session, given the immersive virtual reality environment, which suggests that the fully immersive virtual reality itself may lead to undesirable consequences in VR performance like viewing television because the participant is isolated from the outside while wearing it.
Our EEG power spectral analysis supported this hypothesis. The beta band-PSD was significantly higher in the F-on than in the F-off condition in the frontal (P=0.047), central (P=0.042) and occipital areas (P=0.045). Decreased EEG beta band activity in the occipital area has been related to attentional deficiency [26], and beta band activity is assumed to facilitate long-range interactions on a cortical network level [27,28]. Accordingly, observation of a higher beta neural oscillatory power under the F-on condition indicates that participants may perform the exercise with sustained attention (Fig. 3).
Here, we further investigated ERSP to evaluate signal processing in the time domain. During preparation, execution, and mental simulation of movement, the spectral power in the beta rhythms decreases bilaterally over the sensorimotor cortex [29-31], providing a mechanism for selective activation of sensorimotor ensembles. We found that beta-ERD patterns were usually exhibited before the actual saccadic movement, which is consistent with previous works [29-31]. However, we noted that beta-ERD was significantly more pronounced in the F-on condition than in the F-off condition during the time period from −750 to −500 ms in the frontal and central areas (Fig. 4). We speculated that certainty of the movement direction and expected loading of attention about the upcoming saccadic movement may result in a larger beta-ERD under the F-on condition [32,33]. Therefore, we suggest that participants may perform VR exercises with goal-directed attention under the F-on rather than the F-off condition, even though it remains unclear whether these rhythms contribute independently to motor behavior.
We investigated differences in neural connectivity between the two conditions during VR. Cortical structural changes have been linked to functional recovery after vestibular neuritis [34,35], and VR performance is dependent on mechanisms related to neuronal plasticity of the central nervous system [36,37]. During the saccadic exercise period, we noticed more abundant dynamic connectivity in the theta band across the FEF, S1, PIVC and V1 areas in the F-on condition than in the F-off condition. Interestingly, the FEF was not among the functional areas with more than 50% of interactions within the network under the F-off condition even during saccadic exercise, contrary to the general expectation that FEF modulates saccadic eye movement. We speculated that participants did not actively move their eyes in correspondence with the saccadic target, which is consistent with our behavioral results. In the fixation period, participants exhibited significant functional connectivity from the right PIVC to the contralateral PIVC and both V1s, as well as between both S1s, under the F-on condition. However, no dynamic functional connectivity besides that between both PIVCs was observed under the F-off condition.
Taken together, even though immersive virtual reality such as HMD may be ideal for promoting habituation [25], our neural evidence suggests that the immersive environment, completely isolated from outside intervention, would not work for better VR performance. Thus, interactive functions must be integrated into immersive platforms, or another platform such as a projection-based virtual reality display should be considered. However, several aspects of these conclusions should be substantiated with a more rigorous study design because our electrophysiology results were obtained for SEE from healthy subjects and did not consider long-term clinical efficacy, which was not our main concern. In addition, we did not conduct a comparison study including a conventional VR exercise condition due to methodological limitations.
Finally, it should be noted that our approach concerns attention and functional connectivity for SEE using HMD. Our electrophysiological measurements did not include a head rotation protocol because of the difficulty of artifact rejection. SEE should be a part of an adapted strategy to enhance the decreased slow phase component of the vestibulo-ocular reflex in patients with vertigo [38]. However, considering that the main components of VR are head and eye exercises, electrophysiological measurements from SEE may not be representative of the whole VR exercise. Nevertheless, the neurophysiological evidence suggests that an eye tracking assisted strategy may also enhance the performance of VR with HMD through active engagement of patients.
HIGHLIGHTS
▪ Interactive environments play a critical role in the performance of neurorehabilitation, but a head-mounted display alone does not provide such an ideal environment for vestibular rehabilitation.
▪ Eighteen subjects performed a saccadic eye exercise with and without visual feedback in the virtual space.
▪ An eye tracking strategy enhanced goal-directed attention and brain connectivity while performing a saccadic exercise, which may be effective for improving the performance of rehabilitation exercises.
Notes
No potential conflict of interest relevant to this article was reported.
Acknowledgements
This research was supported by the Bio & Medical Technology Development Program (grant No. NRF-2017M3A9G-1027932) and the Basic Research Program (grant No. NRF-2016R1D1A1B03930234) of the National Research Foundation funded by the Korean government, and by the Hallym University research fund, Korea.
SUPPLEMENTARY MATERIALS
Supplementary materials can be found via https://doi.org/10.21053/ceo.2018.01592.