Identification of target clusters by using the restricted normal mixture model

Seung Gu Kim, Jeong Soo Park, Yung Seop Lee

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

This paper addresses the problem of identifying groups that satisfy specified conditions on the means of feature variables. We refer to the identified groups as "target clusters" (TCs). To identify TCs, we propose a method based on a normal mixture model (NMM) whose means are restricted through a linear combination. We provide an expectation-maximization (EM) algorithm that fits the restricted NMM by maximum likelihood, and we present the convergence property of the algorithm together with a reasonable set of initial estimates. We demonstrate the usefulness and validity of the method through a simulation study and two well-known data sets. The proposed method yields several types of useful clusters that would be difficult to obtain with conventional clustering or with exploratory data analysis methods based on the ordinary NMM. A simple comparison with another target clustering approach suggests that the proposed method is promising for identifying TCs.
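To make the idea of a mean-restricted mixture concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a simplified setting that the abstract does not specify: spherical covariances, a single restricted component (component 0, standing in for a target cluster), and an equality restriction of the form a'mu_0 = c. The function name `fit_restricted_spherical_gmm` and all parameter names are hypothetical. Under these assumptions the restricted M-step for the mean reduces to a Euclidean projection of the weighted mean onto the restriction.

```python
import numpy as np

def fit_restricted_spherical_gmm(X, K, a, c, n_iter=100, seed=0):
    """EM for a K-component spherical normal mixture (illustrative only).

    Assumption, not taken from the paper: component 0 plays the role of
    the target cluster and its mean must satisfy a @ mu_0 == c.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    a = np.asarray(a, dtype=float)
    # Start from K distinct data points and make component 0 feasible.
    mu = X[rng.choice(n, size=K, replace=False)].copy()
    mu[0] -= a * (a @ mu[0] - c) / (a @ a)
    var = np.full(K, X.var())
    pi = np.full(K, 1.0 / K)
    loglik = []
    for _ in range(n_iter):
        # E-step: log(pi_k * N(x_i | mu_k, var_k * I)), then responsibilities.
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        logp = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - 0.5 * sq / var
        m = logp.max(axis=1, keepdims=True)
        lse = m + np.log(np.exp(logp - m).sum(axis=1, keepdims=True))
        loglik.append(float(lse.sum()))
        r = np.exp(logp - lse)
        # M-step: weights, then weighted means.
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        # Restricted mean: project the weighted mean onto {mu : a @ mu = c}.
        mu[0] -= a * (a @ mu[0] - c) / (a @ a)
        # Spherical variances around the (possibly restricted) means.
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        var = np.maximum((r * sq).sum(axis=0) / (d * nk), 1e-8)
    return pi, mu, var, loglik

# Synthetic demo: two groups, one of which has equal means on both features.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal([2.0, 2.0], 0.5, size=(150, 2)),
    rng.normal([0.0, 4.0], 0.5, size=(150, 2)),
])
# Restriction: mean of feature 1 minus mean of feature 2 equals 0.
pi, mu, var, loglik = fit_restricted_spherical_gmm(X, K=2, a=[1.0, -1.0], c=0.0)
print("restricted mean:", mu[0])
print("restriction residual:", np.array([1.0, -1.0]) @ mu[0])
```

In this simplified setting the projection does not depend on the variance, so the M-step is exact and the recorded log-likelihood should not decrease between iterations. With full covariance matrices or other forms of restriction the update would differ from this sketch.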

Original language: English
Pages (from-to): 941-960
Number of pages: 20
Journal: Journal of Applied Statistics
Volume: 40
Issue number: 5
State: Published - May 2013

Keywords

  • EM algorithm
  • maximum-likelihood method
  • mean restrictions
  • microarray gene expression data
  • restricted normal mixture model
  • target clustering
