Representing variant calling format as directed acyclic graphs to enable the use of cloud computing for efficient and cost effective genome analysis
Name:
dags_revised_submit.pdf
Size:
384.0Kb
Format:
PDF
Description:
Author Accepted Manuscript
Abstract
Since the completion of the Human Genome Project in 2003, the human genome has been represented as a linear sequence of 3.2 billion base pairs, referred to as the "Reference Genome". Rapid advances in technology have since made it easier to sequence the genomes of individuals, which in turn has created a need for a different way to represent the new information. Several attempts have been made to represent the genome sequence as a graph, albeit for different purposes. Here we take a look at the Variant Calling Format (VCF) file, which carries information about variations within genomes and is the primary format of choice for genome analysis tools. This short paper aims to motivate work on representing the VCF file as Directed Acyclic Graphs (DAGs) to run on a cloud, in order to exploit the high-performance capabilities provided by cloud computing.
Citation
Aizad, S. et al. (2017) 'Representing variant calling format as directed acyclic graphs to enable the use of cloud computing for efficient and cost effective genome analysis', Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 14-17 May. DOI: 10.1109/CCGRID.2017.116
Publisher
IEEE
Journal
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
DOI
10.1109/CCGRID.2017.116
Additional Links
http://ieeexplore.ieee.org/document/7973781/
Type
Meetings and Proceedings
Language
en
ISBN
9781509066117