Spatiotemporal deep learning

As spatiotemporal sensors become cheaper, spatiotemporal data become more widespread. At the same time, deep learning continues to be the de facto method for extracting good representations across multiple application domains. However, several challenges are specific to spatiotemporal data. First, spatiotemporal data are often noisy, and general domain-agnostic denoising techniques do not always achieve the best results. Second, most approaches assume that data from nearby locations have similar dynamics. This assumption, called homophily, does not always hold, yet deep representation learning in other domains relies heavily on it. Third, spatiotemporal data have complex local interactions, but most approaches borrow deep learning architectures that suit other domains and have insufficient expressive power for spatiotemporal data, which degrades the latent representation. Finally, most approaches are trained end-to-end, requiring expensive retraining whenever the road network changes. This research addresses these challenges. To study the extraction of the underlying representation from noisy spatiotemporal data, we chose the task of map inference from Global Positioning System (GPS) trajectories, where the goal is to infer the underlying road map from a collection of GPS trajectories. The challenge is that GPS trajectories are noisy, the noise distribution changes from place to place, and the labels are unbalanced, making it difficult to separate the less frequently travelled streets from the noise. Existing studies used modifications of generic denoising techniques such as k-means, skeletonisation, mean shift, and SIFT features. However, they failed to capture all the relevant features in their representations.
To tackle this, we developed a novel clustering algorithm called Iterated Trajectory Mean Shift (ITMS) to locate the road centre line points, and we used a convolutional neural network (CNN) on trajectory descriptors to separate real connections between the centre line points from noisy ones. ITMS builds upon the generic mean shift algorithm as well as the trajectory-specific traj-mean shift algorithm. It not only shifts the centroid towards the mean location of the neighbouring GPS points, but also rotates the centroid towards their mean heading and takes headings into consideration when determining neighbours. We replaced the manual feature aggregation with deep learning on our novel trajectory descriptors to ensure that no relevant features were lost in the intermediate representations. The result was a state-of-the-art map inference pipeline called Convolutional Trajectory Network (COLTRANE), which we evaluated empirically on real-world city road data as well as airport tarmac data, which have a higher spatial complexity. Second, most deep learning approaches assume homophily and thus rely heavily on translational invariance in their representations. For example, a patch of pixels representing a cat does not change its semantic meaning when placed at a different position in a picture or in an earlier or later frame of a video. However, this is not true for some spatiotemporal data, such as road network traffic. For example, the recurring traffic pattern at one point in a road network differs from that at another point, because both are influenced by nearby physical features such as entry and exit ramps, interchanges, and accidents, which are not always present in the data. Furthermore, we showed with real-world data that the traffic pattern at every location is unique, and that the correlation between any two locations is also unique.
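The heading-aware shift described for ITMS above can be illustrated with a minimal sketch. This is not the ITMS implementation: the flat kernel, the function names, and the `radius` and `max_heading_diff` parameters are assumptions made for illustration only. It shows a single idea from the text, namely that a centroid moves to the mean location of its neighbours, rotates to their mean heading, and only counts points with a similar heading as neighbours.

```python
import math

def angle_diff(a, b):
    """Smallest signed difference a - b between two headings, in radians."""
    return math.atan2(math.sin(a - b), math.cos(a - b))

def heading_aware_shift(centroid, points, radius, max_heading_diff):
    """One shift step for a centroid (x, y, heading) over GPS points (x, y, heading).

    Hypothetical flat kernel: a point is a neighbour only if it lies within
    `radius` of the centroid AND its heading is within `max_heading_diff`.
    """
    cx, cy, ch = centroid
    neighbours = [
        (x, y, h) for x, y, h in points
        if math.hypot(x - cx, y - cy) <= radius
        and abs(angle_diff(h, ch)) <= max_heading_diff
    ]
    if not neighbours:
        return centroid  # nothing nearby: leave the centroid where it is
    n = len(neighbours)
    mean_x = sum(p[0] for p in neighbours) / n
    mean_y = sum(p[1] for p in neighbours) / n
    # Circular mean, so that headings near -pi and +pi average correctly.
    mean_h = math.atan2(sum(math.sin(p[2]) for p in neighbours),
                        sum(math.cos(p[2]) for p in neighbours))
    return (mean_x, mean_y, mean_h)

def iterate_shift(centroid, points, radius=10.0,
                  max_heading_diff=math.pi / 4, tol=1e-6, max_iter=100):
    """Repeat the shift step until the centroid stops moving."""
    for _ in range(max_iter):
        new = heading_aware_shift(centroid, points, radius, max_heading_diff)
        moved = math.hypot(new[0] - centroid[0], new[1] - centroid[1])
        centroid = new
        if moved < tol:
            break
    return centroid

# Toy data (assumed): an eastbound lane around y = 0 and a westbound lane at y = 5.
eastbound = [(-2.0, 1.0, 0.0), (0.0, -1.0, 0.0), (2.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
westbound = [(-1.0, 5.0, math.pi), (1.0, 5.0, math.pi)]
lane_centre = iterate_shift((0.0, 2.0, 0.1), eastbound + westbound)
```

With the heading filter in place, the westbound points do not pull the eastbound centroid towards them, which is the motivation for taking headings into account when two opposing lanes lie close together.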
We then introduced latent spatial feature vectors (LSFVs) to represent the spatial distributions of the temporal traffic patterns. We utilized LSFVs in two modules of our deep learning pipeline, called Graph Self-attention WaveNet (G-SWaN): first in the node encoding, and second in the spatial graph transformer. This allowed the model to adapt to the data at each location. We empirically evaluated G-SWaN on four real-world datasets and achieved state-of-the-art performance. Next, we established that the correlations between two locations are complex as well as unique. However, existing studies borrowed simpler graph neural networks (GNNs) popular in other domains that do not take this complexity into account. The underlying assumption in popular GNN architectures, such as convolutional and attentional architectures, is that the node representation is simply the aggregate of the neighbours’ representations. However, we showed that traffic forecasting is different: the correct node representation is the aggregate of the neighbourhood interactions. To ensure that our representations capture these neighbourhood interactions, we used a more expressive message-passing GNN. We then empirically demonstrated the superiority of the representations learned by message-passing neural networks (MPNNs) on synthetic and real-world datasets. Finally, new roads and sensors are continually being built and deployed. Learning LSFVs end-to-end means that they require retraining every time a sensor is added to the network. We therefore proposed a novel framework called Spatial Contrastive Pre-Training (SCPT), in which we pre-trained a spatial encoder using contrastive learning to allow the forecasting model to adapt to new sensors that were not seen in the training data. We empirically evaluated the proposed framework on two real-world datasets, demonstrating an efficiency increase of up to 10% in the low-data regime.
Our framework is also robust to the selection of sensors included in the training set. The proposed framework has the potential to significantly improve traffic forecasting by making it more scalable and adaptable to changes in the road network.
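The distinction drawn earlier between aggregating the neighbours' representations and aggregating the neighbourhood interactions can be made concrete with a toy example. The sketch below is an assumption-laden illustration, not the model from this work: features are single scalars, learned weights and self-loops are omitted, and the `excess_inflow` message function is hypothetical. It shows two neighbourhoods that a plain sum over neighbours cannot tell apart, while a sum over pairwise messages can.

```python
def aggregate_neighbours(h, neighbours, i):
    """Aggregation-style update: sum the neighbours' features, ignoring h[i]."""
    return sum(h[j] for j in neighbours[i])

def aggregate_interactions(h, neighbours, i, message):
    """Message-passing update: sum messages computed from each (receiver, sender) pair."""
    return sum(message(h[i], h[j]) for j in neighbours[i])

def excess_inflow(h_receiver, h_sender):
    """Hypothetical interaction: only neighbours busier than the receiver contribute."""
    return max(0.0, h_sender - h_receiver)

# Node 0 is connected to nodes 1 and 2 (assumed toy graph).
neighbours = {0: [1, 2]}

# Two traffic states with the same neighbour total (10.0) but different structure.
h_uneven = [5.0, 3.0, 7.0]   # one quieter and one busier neighbour
h_uniform = [5.0, 5.0, 5.0]  # both neighbours match the receiver

sum_uneven = aggregate_neighbours(h_uneven, neighbours, 0)
sum_uniform = aggregate_neighbours(h_uniform, neighbours, 0)

msg_uneven = aggregate_interactions(h_uneven, neighbours, 0, excess_inflow)
msg_uniform = aggregate_interactions(h_uniform, neighbours, 0, excess_inflow)
```

Here `sum_uneven` and `sum_uniform` are equal, so any update that depends only on the summed neighbour features treats the two states identically, whereas `msg_uneven` and `msg_uniform` differ because each message depends on the receiver and the sender together.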

Related Organizations

RMIT University
Australia

Keywords

Stream and sensor data, Deep learning, Spatial data and applications
