EPPADS: An enhanced phase-based performance-aware dynamic scheduler for high job execution performance in large scale clusters

  • Prince Hamandawana
  • , Ronnie Mativenga
  • , Se Jin Kwon
  • , Tae Sun Chung

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

2 Scopus citations

Abstract

The way in which jobs are scheduled is critical to achieve high job processing performance in large scale data clusters. Most existing scheduling mechanism employs a First-In First-Out, serialized approach encompassed with task straggler hunting techniques which launches speculative tasks after detecting slow tasks. This is often achieved through the instrumentation of processing nodes. Such node instrumentation incurs frequent communication overheads as the number of processing nodes increase. Moreover the sequential scheduling of job tasks and the straggler hunting approach fails to meet optimal performance as they increase job waiting time in queue and incurs delayed speculative execution of straggling tasks respectively. In this paper we propose an Enhanced Phase based Performance Aware Dynamic Scheduler (EPPADS), which schedules job tasks without additional instrumentation modules. EPPADS uses a two staged scheduling approach, that is, the slow start phase (SSP) and accelerate phase (AccP). The SSP schedules the initial task in the queue in the normal FIFO way and records the initial execution times of the processing nodes. The AccP uses the initial execution times to compute the processing nodes task distribution ratio of the remaining tasks and schedules them using a single scheduling I/O. We implement EPPADS scheduler in Hadoop’s MapReduce framework. Our evaluation shows that EPPADS can achieve a performance improvement on FIFO scheduler of 30%. Compared with existing Dynamic scheduling approach which uses node instrumentation, EPPADS achieves a better performance of 22%.

Original languageEnglish
Title of host publicationDatabase Systems for Advanced Applications - 24th International Conference, DASFAA 2019, Proceedings
EditorsJun Yang, Juggapong Natwichai, Yongxin Tong, Joao Gama, Guoliang Li
PublisherSpringer Verlag
Pages140-156
Number of pages17
ISBN (Print)9783030185756
DOIs
StatePublished - 2019
Event24th International Conference on Database Systems for Advanced Applications, DASFAA 2019 - Chiang Mai, Thailand
Duration: 22 Apr 201925 Apr 2019

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume11446 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference24th International Conference on Database Systems for Advanced Applications, DASFAA 2019
Country/TerritoryThailand
CityChiang Mai
Period22/04/1925/04/19

Keywords

  • Distributed processing
  • MapReduce
  • Scheduling

Fingerprint

Dive into the research topics of 'EPPADS: An enhanced phase-based performance-aware dynamic scheduler for high job execution performance in large scale clusters'. Together they form a unique fingerprint.

Cite this