<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">JDS</journal-id>
      <journal-title-group>
        <journal-title>Journal of Data Science</journal-title>
      </journal-title-group>
      <issn pub-type="epub">1680-743X</issn>
      <issn pub-type="ppub">1680-743X</issn>
      <publisher>
        <publisher-name>SOSRUC</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="publisher-id">130306</article-id>
      <article-id pub-id-type="doi">10.6339/JDS.201504_13(2).0006</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Research Article</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Topic Model Kernel Classification with Probabilistically Reduced Features</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Nguyen</surname>
            <given-names>Vu</given-names>
          </name>
          <xref ref-type="aff" rid="j_JDS_aff_000"/>
        </contrib>
        <aff id="j_JDS_aff_000">Centre for Pattern Recognition and Data Analytics (PRaDA), Deakin University, Melbourne, Australia</aff>
        <contrib contrib-type="author">
          <name>
            <surname>Phung</surname>
            <given-names>Dinh</given-names>
          </name>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Venkatesh</surname>
            <given-names>Svetha</given-names>
          </name>
        </contrib>
      </contrib-group>
      <volume>13</volume>
      <issue>2</issue>
      <fpage>323</fpage>
      <lpage>340</lpage>
      <permissions>
        <ali:free_to_read xmlns:ali="http://www.niso.org/schemas/ali/1.0/"/>
      </permissions>
      <abstract>
        <p>Probabilistic topic models have become a standard tool in modern machine learning, with a wide range of applications. Representing data by the reduced-dimensional mixture proportions extracted from topic models not only offers a richer semantic interpretation but can also be informative for classification tasks. In this paper, we describe the Topic Model Kernel (TMK), a topic-based kernel for Support Vector Machine classification of data processed by probabilistic topic models. The applicability of the proposed kernel is demonstrated in several classification tasks on real-world datasets. TMK outperforms existing kernels on distributional features and gives comparable results on non-probabilistic data types.</p>
      </abstract>
      <kwd-group>
        <label>Keywords</label>
        <kwd>Topic Models</kwd>
        <kwd>Bayesian Nonparametric</kwd>
        <kwd>Support Vector Machine</kwd>
        <kwd>Kernel Method</kwd>
        <kwd>Classification</kwd>
        <kwd>Dimensionality Reduction</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
