The attention mechanism has become an almost ubiquitous component of deep learning architectures. One of its distinctive features is that it computes a non-negative probability distribution to re-weight input representations. This work reconsiders attention weights as bidirectional coefficients rather than probabilistic measures, for potential benefits in interpretability and representational capacity. After analyzing how attention scores are updated during backward gradient propagation, we propose a novel activation function, TanhMax, which possesses several favorable properties that satisfy the requirements of bidirectional attention. We conduct a battery of experiments on both text and image datasets to validate our analyses and the advantages of the proposed method. The results show that bidirectional attention is effective in revealing the semantics of input units, presenting more interpretable explanations, and increasing the expressive power of attention-based models.
Previous abstractive methods apply sequence-to-sequence structures to generate summaries without a module that helps the system detect vital mentions and relationships within a document. To address this problem, we use a semantic graph to boost generation performance. First, we extract important entities from each document and then build a graph inspired by the idea of distant supervision (Mintz et al., 2009). Then, we combine a Bi-LSTM with a graph encoder to obtain the representation of each graph node. We present a novel neural decoder that leverages the information in these entity graphs. Automatic and human evaluations show the effectiveness of our technique.
Large volumes of trajectory data collected from human and vehicle mobility are highly sensitive because of privacy concerns. Therefore, generating synthetic yet plausible trajectory data is pivotal in many location-based studies and applications. However, existing LSTM-based methods are not suitable for modeling large-scale sequences because of the vanishing gradient problem, and existing GAN-based methods are coarse-grained. Considering the geographical and sequential features of trajectories, we propose a map-based Two-Stage GAN method (TSG) to tackle these challenges and generate fine-grained, plausible large-scale trajectories. In the first stage, we convert GPS point data into a discrete grid representation, which serves as the input to a modified deep convolutional generative adversarial network that learns the general pattern. In the second stage, inside each grid cell, we design an effective encoder-decoder network as the generator, which extracts road information from the map image and embeds it into two parallel Long Short-Term Memory networks to generate GPS point sequences. A discriminator conditioned on the encoded map image constrains the generated point sequences so that they do not deviate from the corresponding road networks. Experiments on real-world data demonstrate the effectiveness of our model in preserving geographical features and hidden mobility patterns. Moreover, our generated trajectories not only exhibit distribution similarity to real data but also achieve satisfactory road network matching accuracy.
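As an illustration of the first-stage preprocessing, the sketch below shows one plausible way to convert GPS points into a discrete grid representation. It is not the implementation from the paper: the function name `to_grid`, the bounding-box format, the uniform grid, and the choice to drop out-of-bounds points are all assumptions made for this example.

```python
def to_grid(points, bounds, rows, cols):
    """Map (lat, lon) points onto a rows x cols grid (hypothetical helper).

    bounds is (lat_min, lat_max, lon_min, lon_max), an assumed format.
    Returns the (row, col) cell of each in-bounds point and a matrix of
    per-cell point counts.
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    counts = [[0] * cols for _ in range(rows)]
    cells = []
    for lat, lon in points:
        # Assumption: points outside the bounding box are simply dropped.
        if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
            continue
        # Scale each coordinate to [0, rows] or [0, cols], then clamp so that
        # points exactly on the upper boundary fall into the last cell.
        r = min(int((lat - lat_min) / (lat_max - lat_min) * rows), rows - 1)
        c = min(int((lon - lon_min) / (lon_max - lon_min) * cols), cols - 1)
        cells.append((r, c))
        counts[r][c] += 1
    return cells, counts


# Toy coordinates chosen for illustration only.
points = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (2.0, 2.0)]
cells, counts = to_grid(points, (0.0, 1.0, 0.0, 1.0), rows=2, cols=2)
```

A count matrix of this kind can be treated as a single-channel image, which is one way such a grid could be fed to a convolutional network; the actual input encoding used by TSG is not specified in the abstract.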