Representing variant calling format as directed acyclic graphs to enable the use of cloud computing for efficient and cost effective genome analysis
Abstract: Ever since the completion of the Human Genome Project in 2003, the human genome has been represented as a linear sequence of 3.2 billion base pairs and is referred to as the "Reference Genome". Since then it has become easier to sequence genomes of individuals due to rapid advancements in technology, which in turn has created a need to represent the new information using a different representation. Several attempts have been made to represent the genome sequence as a graph albeit for different purposes. Here we take a look at the Variant Calling Format (VCF) file which carries information about variations within genomes and is the primary format of choice for genome analysis tools. This short paper aims to motivate work in representing the VCF file as Directed Acyclic Graphs (DAGs) to run on a cloud in order to exploit the high performance capabilities provided by cloud computing.
Citation: Aizad, S. et al. (2017) 'Representing variant calling format as directed acyclic graphs to enable the use of cloud computing for efficient and cost effective genome analysis', Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 14-17 May. DOI: 10.1109/CCGRID.2017.116
Journal: Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
Type: Meetings and Proceedings