dc.contributor.author: Zada, Muhammad Sadiq Hassan
dc.contributor.author: Yuan, Bo
dc.contributor.author: Anjum, Ashiq
dc.contributor.author: Azad, Muhammad Ajmal
dc.contributor.author: Khan, Wajahat Ali
dc.contributor.author: Reiff-Marganiec, Stephan
dc.date.accessioned: 2021-02-08T15:54:00Z
dc.date.available: 2021-02-08T15:54:00Z
dc.date.issued: 2020-12-28
dc.identifier.citation: Zada, M.S.H., Yuan, B., Anjum, A., Azad, M.A., Khan, W.A. and Reiff-Marganiec, S. (2020). ‘Large-scale Data Integration Using Graph Probabilistic Dependencies (GPDs)’. IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, Leicester, 7-10 December. New York: IEEE, pp. 27-36. [en_US]
dc.identifier.isbn: 9780738123967
dc.identifier.doi: 10.1109/bdcat50828.2020.00028
dc.identifier.uri: http://hdl.handle.net/10545/625607
dc.description.abstract: The diversity and proliferation of knowledge bases have made data integration one of the key challenges in the data science domain. The imperfect representations of entities, particularly in graphs, pose further challenges for data integration. Graph dependencies (GDs) were investigated in existing studies for the integration and maintenance of data quality on graphs. However, the majority of graphs contain plenty of duplicates with high diversity. Consequently, the existence of dependencies over these graphs becomes highly uncertain. In this paper, we propose graph probabilistic dependencies (GPDs), a novel class of dependencies for graphs, to address the issue of uncertainty over these large-scale graphs. GPDs can provide a probabilistic explanation for dealing with uncertainty while discovering dependencies over graphs. Furthermore, a case study is provided to verify the correctness of the data integration process based on GPDs. Preliminary results demonstrate the effectiveness of GPDs in terms of reducing redundancies and inconsistencies over the benchmark datasets. [en_US]
dc.description.sponsorship: Data Science Research Centre (DSRC) at the University of Derby [en_US]
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.relation.url: https://ieeexplore.ieee.org/abstract/document/9302543 [en_US]
dc.rights: Attribution-NonCommercial-ShareAlike 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [*]
dc.source: 2020 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT)
dc.subject: data integration [en_US]
dc.subject: information retrieval [en_US]
dc.subject: graph probabilistic dependencies [en_US]
dc.title: Large-scale Data Integration Using Graph Probabilistic Dependencies (GPDs) [en_US]
dc.type: Meetings and Proceedings [en_US]
dc.contributor.department: University of Derby [en_US]
dc.contributor.department: University of Leicester [en_US]
dcterms.dateAccepted: 2020-10-30
dc.author.detail: STF1867 [en_US]

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 4.0 International