Publication:
Hybrid job scheduling for improved cluster utilization

Type

bookPart

Access

restrictedAccess

Publication Status

published

Abstract

In this paper, we investigate the models and issues, as well as the performance benefits, of hybrid job scheduling over shared physical clusters. Clustering technologies that are currently supported include MPI, Hadoop-MapReduce, and NoSQL systems. Our proposed scheduling model sits above the cluster-specific middleware and OS-level schedulers and is complementary to them. First, we demonstrate that we can effectively schedule MPI, Hadoop, and NoSQL jobs together by profiling them and then co-scheduling them. Second, we find that it is better to co-schedule cluster jobs with different characteristics (CPU-intensive vs. I/O-intensive) than to co-schedule two CPU-intensive jobs. Third, we use this principle to design a greedy sort-merge scheduler. Savings of up to 37% in total job completion time are demonstrated. These savings are directly proportional to the cluster utilization improvements.
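
The abstract does not spell out the greedy sort-merge scheduler, so the following is only a minimal sketch of one way the pairing principle could be read: sort profiled jobs by how CPU-bound they are, then repeatedly merge the most CPU-intensive remaining job with the most I/O-intensive one. The `Job` type, the `cpu_share` profile metric, the `pair_jobs` function, and the sample job names and values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    # Hypothetical profiling result: fraction of runtime that is CPU-bound
    # (0.0 = purely I/O-intensive, 1.0 = purely CPU-intensive).
    cpu_share: float


def pair_jobs(jobs):
    """Sort jobs by CPU share, then greedily pair opposite ends of the list.

    Assumption: two jobs are co-scheduled at a time, and a larger gap in
    cpu_share stands in for "different job characteristics".
    """
    ordered = sorted(jobs, key=lambda job: job.cpu_share)
    pairs = []
    low, high = 0, len(ordered) - 1
    while low < high:
        # Most CPU-intensive remaining job with most I/O-intensive one.
        pairs.append((ordered[high], ordered[low]))
        low += 1
        high -= 1
    if low == high:
        # Odd number of jobs: the middle job runs on its own.
        pairs.append((ordered[low],))
    return pairs


# Illustrative input only; names and profile values are made up.
jobs = [
    Job("mpi_solver", 0.9),
    Job("hadoop_sort", 0.3),
    Job("nosql_load", 0.2),
    Job("mpi_fft", 0.8),
]
schedule = pair_jobs(jobs)
for group in schedule:
    print(" + ".join(job.name for job in group))
```

Pairing from opposite ends of the sorted list is a simple greedy choice that avoids placing two CPU-intensive jobs together; a real scheduler would also need to account for job durations and resource limits, which this sketch ignores.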

Date

2014

Publisher

Springer Science+Business Media

Description

Due to copyright restrictions, the full text of this article is available only via subscription.
