• Adapting scientific workflow structures using multi-objective optimization strategies

      Habib, Irfan; Anjum, Ashiq; McClatchey, Richard; Rana, Omer; University of Derby, UK (Association for Computing Machinery, 2013-04-01)
      Scientific workflows have become the primary mechanism for conducting analyses on distributed computing infrastructures such as grids and clouds. In recent years, optimization within scientific workflows has focused primarily on computational tasks and workflow makespan. However, as workflow-based analysis becomes ever more data intensive, data optimization is becoming a prime concern. Moreover, scientific workflows can scale along several dimensions: (i) the number of computational tasks, (ii) the heterogeneity of computational resources, and (iii) the size and type (static versus streamed) of the data involved. Adapting workflow structure in response to these scalability challenges remains an important research objective. This work explores how a workflow graph can be restructured in an automated manner (through task merging, for instance) to address the constraints of a particular execution environment, using a multi-objective evolutionary approach. Our approach attempts to adapt the workflow structure to achieve both compute and data optimization. The question of when to terminate the evolutionary search in order to conserve computation is tackled with a novel termination criterion. The results presented in this article demonstrate the feasibility of the termination criterion and show that significant optimization can be achieved with a multi-objective approach.
    • Intelligent grid enabled services for neuroimaging analysis

      McClatchey, Richard; Habib, Irfan; Anjum, Ashiq; Munir, Kamran; Branson, Andrew; Bloodsworth, Peter; Kiani, Saad Liaquat; University of the West of England; University of Derby; National University of Science and Technology (Elsevier, 2013-12)
      This paper reports our work in the context of the neuGRID project on the development of intelligent services for a robust and efficient neuroimaging analysis environment. neuGRID is an EC-funded project driven by the needs of the Alzheimer's disease research community that aims to facilitate the collection and archiving of large amounts of imaging data coupled with a set of services and algorithms. Taking Alzheimer's disease as an exemplar, the neuGRID project has developed a set of intelligent services and a Grid infrastructure to enable the European neuroscience community to carry out the research required for the study of degenerative brain diseases. We have investigated the use of machine learning approaches, especially evolutionary multi-objective meta-heuristics, for optimising scientific analysis on distributed infrastructures. The salient features of the services and the functionality of a planning and execution architecture based on evolutionary multi-objective meta-heuristics to achieve analysis efficiency are presented. We also describe implementation details of the services that will form an intelligent analysis environment and present results on the optimisation that has been achieved as a result of this investigation.
    • Providing traceability for neuroimaging analyses

      McClatchey, Richard; Branson, Andrew; Anjum, Ashiq; Bloodsworth, Peter; Habib, Irfan; Munir, Kamran; Shamdasani, Jetendr; Soomro, Kamran; University of the West of England (Elsevier, 2013-09)
      Introduction: With the increasingly digital nature of biomedical data and the growing complexity of analyses in medical research, accurate information capture, traceability and accessibility have become crucial to medical researchers in the pursuit of their research goals. Grid- or Cloud-based technologies, often based on so-called Service Oriented Architectures (SOA), are increasingly being seen as viable solutions for managing distributed data and algorithms in the biomedical domain. For neuroscientific analyses, especially those centred on complex image analysis, traceability of processes and datasets is essential, but up to now it has not been captured in a manner that facilitates collaborative study. Purpose and method: Few examples exist of deployed medical systems based on Grids that provide the traceability of research data needed to facilitate complex analyses, and none have been evaluated in practice. Over the past decade, we have been working with mammographers, paediatricians and neuroscientists in three generations of projects to provide the data management and provenance services now required for 21st century medical research. This paper outlines the findings of a requirements study and a resulting system architecture for the production of services to support neuroscientific studies of biomarkers for Alzheimer's disease. Results: The paper proposes a software infrastructure and services that provide the foundation for such support. It introduces the use of the CRISTAL software to provide provenance management as one of a number of services delivered on a SOA, deployed to manage neuroimaging projects that have been studying biomarkers for Alzheimer's disease. Conclusions: In the neuGRID and N4U projects, a Provenance Service has been delivered that captures and reconstructs the workflow information needed to facilitate researchers in conducting neuroimaging analyses. The software enables neuroscientists to track the evolution of workflows and datasets. It also tracks the outcomes of various analyses and provides provenance traceability throughout the lifecycle of their studies. As the Provenance Service has been designed to be generic, it can be applied across the medical domain as a reusable tool for supporting medical researchers, thus providing communities of researchers for the first time with the necessary tools to conduct widely distributed collaborative programmes of medical analysis.