
handle: 11583/2996076
Augmented and Virtual Reality (AR/VR) technologies are gaining popularity as a means to improve the training of healthcare professionals, with precise eye tracking playing a crucial role in enhancing performance. However, these systems need to be both low-latency and low-power to operate in real-time scenarios on resource-constrained devices. Event-based cameras can be employed to address these requirements, as they provide data with high temporal resolution in an energy-efficient way, with minimal battery drain. However, their sparse data format necessitates specialized processing algorithms. In this work, we propose a data preprocessing technique that improves the performance of non-recurrent Deep Neural Networks (DNNs) for pupil position estimation. With this approach, we integrate multiple time surfaces of events over time, applying a leakage factor, so that the input data is enriched with information from past events. Additionally, in order to better distinguish between recent and old information, we generate multiple memory channels characterized by different leakage/forgetting rates. These memory channels are fed to well-known non-recurrent neural estimators to predict the position of the pupil. As an example, by using time surfaces only and feeding them to a MobileNet-V3L model to track the pupil in DVS recordings, we achieve a P10 accuracy (the fraction of predictions with a Euclidean error lower than ten pixels) of 85.40%, whereas by using memory channels we achieve a P10 accuracy of 94.37% with a negligible time overhead.
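The abstract does not spell out the update rule, so the following is only a minimal sketch of how leaky integration into multiple memory channels could look, together with the P10 metric. It assumes that each time surface is a 2D array, that each channel is updated as `memory = leak * memory + surface`, and that one leakage rate is used per channel. The function names `leaky_memory_channels` and `p10_accuracy`, the recurrence, and the example leak values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def leaky_memory_channels(surfaces, leak_rates):
    """Integrate a sequence of time surfaces into leaky memory channels.

    ASSUMPTION: the recurrence M_c[t] = leak_c * M_c[t-1] + S[t] is an
    illustrative choice; the abstract does not give the exact formula.

    surfaces:   array of shape (T, H, W), one time surface per step.
    leak_rates: one leakage factor in [0, 1) per memory channel;
                a smaller value forgets past events faster.
    Returns an array of shape (T, C, H, W).
    """
    surfaces = np.asarray(surfaces, dtype=np.float32)
    # Shape (C, 1, 1) so each leak rate broadcasts over its own channel.
    leaks = np.asarray(leak_rates, dtype=np.float32).reshape(-1, 1, 1)
    memory = np.zeros((leaks.shape[0],) + surfaces.shape[1:], dtype=np.float32)
    out = np.empty((surfaces.shape[0],) + memory.shape, dtype=np.float32)
    for t, surface in enumerate(surfaces):
        # Decay the old content, then add the newest time surface.
        memory = leaks * memory + surface
        out[t] = memory
    return out


def p10_accuracy(pred, target, threshold=10.0):
    """Fraction of predictions whose Euclidean error is below `threshold` pixels."""
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    err = np.linalg.norm(diff, axis=-1)
    return float(np.mean(err < threshold))


if __name__ == "__main__":
    # A single event at pixel (0, 0) in the first of three time steps.
    surfaces = np.zeros((3, 2, 2), dtype=np.float32)
    surfaces[0, 0, 0] = 1.0
    channels = leaky_memory_channels(surfaces, leak_rates=[0.0, 0.5, 0.9])
    print(channels.shape)        # (time steps, channels, height, width)
    print(channels[:, :, 0, 0])  # how the event fades in each channel
```

Under these assumptions, a channel with a leak rate of 0 reduces to the plain time surface, while channels with higher rates retain older events for longer, which gives a non-recurrent network a way to tell recent activity from old activity within a single stacked input.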
Augmented Reality (AR); Virtual Reality (VR); Eye tracking; Healthcare training; Low-latency; Low-power; Event-based cameras; High temporal resolution; Sparse data format; Data preprocessing; Deep Neural Networks (DNNs); Pupil position estimation; Time surfaces; Leakage factor; Memory channels; Forgetting rates; Non-recurrent neural estimators; MobileNet-V3L; DVS recordings; P10 accuracy; Euclidean error; Resource-constrained devices; Real-time scenarios; Energy-efficient
