Handle:
http://hdl.handle.net/10545/621407
Title:
Data Intensive and Network Aware (DIANA) grid scheduling
Authors:
McClatchey, Richard; Anjum, Ashiq; Stockinger, Heinz; Ali, Arshad; Willers, Ian; Thomas, Michael
Abstract:
In Grids scheduling decisions are often made on the basis of jobs being either data or computation intensive: in data intensive situations jobs may be pushed to the data and in computation intensive situations data may be pulled to the jobs. This kind of scheduling, in which there is no consideration of network characteristics, can lead to performance degradation in a Grid environment and may result in large processing queues and job execution delays due to site overloads. In this paper we describe a Data Intensive and Network Aware (DIANA) meta-scheduling approach, which takes into account data, processing power and network characteristics when making scheduling decisions across multiple sites. Through a practical implementation on a Grid testbed, we demonstrate that queue and execution times of data-intensive jobs can be significantly improved when we introduce our proposed DIANA scheduler. The basic scheduling decisions are dictated by a weighting factor for each potential target location which is a calculated function of network characteristics, processing cycles and data location and size. The job scheduler provides a global ranking of the computing resources and then selects an optimal one on the basis of this overall access and execution cost. The DIANA approach considers the Grid as a combination of active network elements and takes network characteristics as a first class criterion in the scheduling decision matrix along with computations and data. The scheduler can then make informed decisions by taking into account the changing state of the network, locality and size of the data and the pool of available processing cycles.
Affiliation:
University of West England; Swiss Institute of Bioinformatics; National University of Sciences and Technology; CERN; California Institute of Technology
Citation:
McClatchey, R. et al (2007) 'Data Intensive and Network Aware (DIANA) Grid Scheduling', Journal of Grid Computing, 5 (1):43
Publisher:
Springer
Journal:
Journal of Grid Computing
Issue Date:
27-Jan-2007
URI:
http://hdl.handle.net/10545/621407
DOI:
10.1007/s10723-006-9059-z
Additional Links:
http://link.springer.com/10.1007/s10723-006-9059-z
Type:
Article
Language:
en
ISSN:
15707873
EISSN:
15729184
Sponsors:
CERN
Appears in Collections:
Department of Electronics, Computing & Maths
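
Illustrative sketch (not part of the repository record): the abstract describes a scheduler that computes a cost for each candidate site from network characteristics, processing cycles, and data location and size, then ranks the sites and picks the cheapest. The Python below is only a minimal sketch of that general idea. The record does not give the paper's cost formula, so the `Site` fields, the `site_cost` function, its weights `w_net` and `w_comp`, and the sample numbers are all hypothetical assumptions rather than the DIANA implementation.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical description of a candidate execution site."""
    name: str
    bandwidth_mbps: float  # assumed bandwidth from the data source to this site
    queue_length: int      # jobs already waiting at the site
    cpu_power: float       # relative processing capacity (higher is faster)
    holds_data: bool       # True if the input data is already local


def site_cost(site, data_size_mb, job_work, w_net=1.0, w_comp=1.0):
    """Assumed cost model: weighted sum of a transfer estimate and a compute estimate."""
    # Network term: rough transfer time in seconds, zero when the data is local.
    transfer = 0.0 if site.holds_data else data_size_mb * 8 / site.bandwidth_mbps
    # Computation term: job work scaled up by the queue and down by site capacity.
    compute = job_work * (site.queue_length + 1) / site.cpu_power
    return w_net * transfer + w_comp * compute


def rank_sites(sites, data_size_mb, job_work):
    """Return the sites ordered from lowest to highest estimated cost."""
    return sorted(sites, key=lambda s: site_cost(s, data_size_mb, job_work))


# Sample data (invented for illustration only).
sites = [
    Site("A", bandwidth_mbps=100.0, queue_length=9, cpu_power=1.0, holds_data=True),
    Site("B", bandwidth_mbps=1000.0, queue_length=0, cpu_power=2.0, holds_data=False),
    Site("C", bandwidth_mbps=10.0, queue_length=0, cpu_power=2.0, holds_data=False),
]
ranked = rank_sites(sites, data_size_mb=1000.0, job_work=10.0)
best = ranked[0]
```

With these invented numbers, site B ranks first even though site A already holds the data: A's long queue outweighs the cost of moving the data over B's fast link. That is the kind of trade-off a purely "push the job to the data" policy would miss.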

Full metadata record

DC Field | Value | Language
dc.contributor.author | McClatchey, Richard | en
dc.contributor.author | Anjum, Ashiq | en
dc.contributor.author | Stockinger, Heinz | en
dc.contributor.author | Ali, Arshad | en
dc.contributor.author | Willers, Ian | en
dc.contributor.author | Thomas, Michael | en
dc.date.accessioned | 2017-02-17T12:06:54Z | -
dc.date.available | 2017-02-17T12:06:54Z | -
dc.date.issued | 2007-01-27 | -
dc.identifier.citation | McClatchey, R. et al (2007) 'Data Intensive and Network Aware (DIANA) Grid Scheduling', Journal of Grid Computing, 5 (1):43 | en
dc.identifier.issn | 15707873 | -
dc.identifier.doi | 10.1007/s10723-006-9059-z | -
dc.identifier.uri | http://hdl.handle.net/10545/621407 | -
dc.description.abstract | In Grids scheduling decisions are often made on the basis of jobs being either data or computation intensive: in data intensive situations jobs may be pushed to the data and in computation intensive situations data may be pulled to the jobs. This kind of scheduling, in which there is no consideration of network characteristics, can lead to performance degradation in a Grid environment and may result in large processing queues and job execution delays due to site overloads. In this paper we describe a Data Intensive and Network Aware (DIANA) meta-scheduling approach, which takes into account data, processing power and network characteristics when making scheduling decisions across multiple sites. Through a practical implementation on a Grid testbed, we demonstrate that queue and execution times of data-intensive jobs can be significantly improved when we introduce our proposed DIANA scheduler. The basic scheduling decisions are dictated by a weighting factor for each potential target location which is a calculated function of network characteristics, processing cycles and data location and size. The job scheduler provides a global ranking of the computing resources and then selects an optimal one on the basis of this overall access and execution cost. The DIANA approach considers the Grid as a combination of active network elements and takes network characteristics as a first class criterion in the scheduling decision matrix along with computations and data. The scheduler can then make informed decisions by taking into account the changing state of the network, locality and size of the data and the pool of available processing cycles. | en
dc.description.sponsorship | CERN | en
dc.language.iso | en | en
dc.publisher | Springer | en
dc.relation.url | http://link.springer.com/10.1007/s10723-006-9059-z | en
dc.rights | Archived with thanks to Journal of Grid Computing | en
dc.subject | Data intensive | en
dc.subject | Network aware | en
dc.subject | Scheduling algorithm | en
dc.subject | Peer-to-peer architectures | en
dc.subject | Meta scheduling | en
dc.title | Data Intensive and Network Aware (DIANA) grid scheduling | en
dc.type | Article | en
dc.identifier.eissn | 15729184 | -
dc.contributor.department | University of West England | en
dc.contributor.department | Swiss Institute of Bioinformatics | en
dc.contributor.department | National University of Sciences and Technology | en
dc.contributor.department | CERN | en
dc.contributor.department | California Institute of Technology | en
dc.identifier.journal | Journal of Grid Computing | en
All Items in UDORA are protected by copyright, with all rights reserved, unless otherwise indicated.