Location-Based Parallel Sequential Pattern Mining Algorithm

Byoungwook Kim, Gangman Yi

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Sequential pattern mining, which finds frequent sequential patterns in a set of data sequences, is an important data mining problem. However, existing sequential pattern mining considers only the order in which items are purchased, not the location where each item is purchased. In this paper, we develop a sequential pattern mining algorithm using Apache Spark. The proposed algorithm finds frequent sequential patterns in parallel by distributing the data across several machines. We conduct a comprehensive performance study of the proposed algorithm on synthetic data sets, varying several parameter values. The experimental results show that the algorithm's speed improves linearly with the number of machines.
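To make the role of location concrete, the following is a minimal single-machine sketch of prefix-projection mining (in the style of PrefixSpan, which appears among the keywords). It is not the paper's Spark implementation. The encoding of each event as a `(location, item)` pair, the function name `prefixspan`, and the toy data are all assumptions made for illustration only.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every frequent sequential pattern.

    Assumption (not from the paper): each sequence is a list of events and
    each event is a (location, item) tuple, so the same item bought at two
    different locations counts as two different events.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence_index, start_offset) pairs: the suffixes
        # that remain after matching the current prefix.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            # A sequence contributes at most once to an event's support.
            for event in set(sequences[seq_idx][start:]):
                counts[event] += 1

        for event, support in counts.items():
            if support < min_support:
                continue
            pattern = prefix + (event,)
            results[pattern] = support

            # Project each suffix past the first occurrence of the event.
            new_projected = []
            for seq_idx, start in projected:
                try:
                    pos = sequences[seq_idx].index(event, start)
                except ValueError:
                    continue
                new_projected.append((seq_idx, pos + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy data: three customers, purchases tagged with a store.
sequences = [
    [("storeA", "milk"), ("storeB", "bread"), ("storeA", "eggs")],
    [("storeA", "milk"), ("storeA", "eggs")],
    [("storeB", "milk"), ("storeB", "bread")],
]

patterns = prefixspan(sequences, min_support=2)
for pattern, support in sorted(patterns.items()):
    print(support, pattern)
```

In this toy data, "milk then bread" occurs in two sequences if location is ignored, but the milk is bought at different stores, so no location-tagged "milk then bread" pattern reaches the support threshold of 2. A distributed version would presumably partition the sequences or the projected databases across workers; how the paper does this is not stated in the abstract.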

Original language: English
Article number: 8826432
Pages (from-to): 128651-128658
Number of pages: 8
Journal: IEEE Access
Volume: 7
State: Published - 2019

Keywords

  • Big data
  • MapReduce
  • PrefixSpan
  • sequential pattern mining
