Large Scale GPS Trajectory Generation Using Map Based on Two Stage GAN
Volume 19, Issue 1 (2021), pp. 126–141
Pub. online: 10 February 2021
Type: Data Science In Action
Received
1 October 2020
Accepted
1 January 2021
Published
10 February 2021
Abstract
Large volumes of trajectory data collected from human and vehicle mobility are highly sensitive because of privacy concerns. Generating synthetic yet plausible trajectory data is therefore pivotal in many location-based studies and applications. However, existing LSTM-based methods are not suitable for modeling large-scale sequences because of the vanishing gradient problem, and existing GAN-based methods are coarse-grained. Considering both the geographical and the sequential features of trajectories, we propose a map-based Two-Stage GAN method (TSG) to tackle these challenges and generate fine-grained, plausible large-scale trajectories. In the first stage, we convert GPS point data into a discrete grid representation, which serves as the input to a modified deep convolutional generative adversarial network that learns the general pattern. In the second stage, inside each grid cell, we design an encoder-decoder network as the generator, which extracts road information from the map image and embeds it into two parallel Long Short-Term Memory networks to generate GPS point sequences. A discriminator conditioned on the encoded map image constrains the generated point sequences so that they do not deviate from the corresponding road networks. Experiments on real-world data demonstrate the effectiveness of our model in preserving geographical features and hidden mobility patterns. Moreover, the generated trajectories are not only similar in distribution to the real data but also achieve satisfactory road network matching accuracy.
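The first-stage preprocessing, converting GPS points into a discrete grid, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gps_to_grid`, the bounding box, the 32 x 32 grid size, and the use of visit counts as cell values are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def gps_to_grid(points, bounds, shape=(32, 32)):
    """Rasterise (lat, lon) points into a grid of visit counts.

    points: array-like of shape (n, 2) holding (lat, lon) pairs.
    bounds: (lat_min, lat_max, lon_min, lon_max) of the study area (assumed).
    shape:  (rows, cols) of the grid; 32 x 32 is an illustrative choice.
    Row 0 corresponds to the minimum latitude, column 0 to the minimum longitude.
    """
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    lat_min, lat_max, lon_min, lon_max = bounds
    rows, cols = shape

    # Discard points that fall outside the bounding box.
    inside = (
        (pts[:, 0] >= lat_min) & (pts[:, 0] <= lat_max)
        & (pts[:, 1] >= lon_min) & (pts[:, 1] <= lon_max)
    )
    pts = pts[inside]

    # Map each coordinate to a cell index; clipping places points that lie
    # exactly on the upper boundary into the last row or column.
    r = ((pts[:, 0] - lat_min) / (lat_max - lat_min) * rows).astype(int)
    c = ((pts[:, 1] - lon_min) / (lon_max - lon_min) * cols).astype(int)
    r = np.clip(r, 0, rows - 1)
    c = np.clip(c, 0, cols - 1)

    # Accumulate counts; np.add.at handles repeated (r, c) pairs correctly.
    grid = np.zeros(shape, dtype=np.int64)
    np.add.at(grid, (r, c), 1)
    return grid
```

Calling `gps_to_grid(points, bounds)` on the points of one trajectory returns a 2-D count array that could be normalised and fed to a convolutional network as a single-channel image; whether TSG uses counts, binary occupancy, or another encoding is not stated in the abstract.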
Supplementary material
The Porto trajectory data are available on Kaggle (http://www.kaggle.com/c/pkdd-15-predict-taxi-service-trajectory-i). The Python code for the experiment section can be found at https://github.com/XingruiWang/Two-Stage-Gan-in-trajectory-generation.