Kuntarto, Guson P. and Moechtar, Fahmi Lutfiansyah and Santoso, Berkah I. and Gunawan, Irwan Prasetya (2015) Comparative study between Part-of-Speech and statistical methods of text extraction in the tourism domain. In: 2015 International Conference on Information Technology Systems and Innovation (ICITSI), 16-19 Nov. 2015, Bandung.
Abstract
This paper compares two text extraction methods: a linguistic method (Part-of-Speech, POS) and a statistical method (Term Frequency-Inverse Document Frequency, TF-IDF). Text extraction was performed as part of ontology population in the Indonesian tourism domain. The paper also contributes a multimedia corpus built from three different resources (websites) in the Balinese tourism domain. The performance of each method is evaluated by means of several relevance measures. The statistical method was found to give higher relevance than the linguistic method. Our analysis attributes this to the limited set of reference terms used in the initial ontology from our previous research.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Uncontrolled Keywords: | Bali tourism; linguistic method; ontology population; part of speech; statistical method; TF-IDF |
Subjects: | Computer Science; Computer Science > Database management; Computer Science > Web-Based Group Decision Support System (WGDSS) > Web-Based; Computer Science > Web-Based |
Divisions: | Fakultas Teknik dan Ilmu Komputer > Program Studi Informatika |
Date Deposited: | 22 Jul 2016 03:26 |
Last Modified: | 10 Feb 2022 02:00 |
URI: | https://repository.bakrie.ac.id/id/eprint/125 |