Preservation of Manual Changes and Provenance for Data Quality using the Nano Version Control Repo

Lukasz MacHowski, Tshilidzi Marwala

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

A new data structure called the Nano Version Control (NanoVC) repo emerges from computer science and the software industry. This data structure efficiently encodes entities at the nano-scale of the modelling spectrum and stores the provenance for that entity. The repo provides an intuitive representation of the history and data-lineage of the entity. Some provenance information can be computed on demand because of the repo structure. A simple algorithm for preservation of manual changes in the light of new data and changing algorithms utilizes the commit history in the repo to give us a sustainable way to merge information while keeping the provenance intact.
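The merge idea summarised in the abstract can be illustrated with a toy sketch. This is not the authors' NanoVC implementation or API: the names `ToyRepo`, `Commit`, `blame`, and `merge_preserving_manual`, the flat field-to-value snapshot, and the `user:` author prefix that marks a manual commit are all assumptions made for illustration. The sketch only shows how a per-entity commit history could let per-field provenance be computed on demand, and how manual edits might be kept when a newer algorithm run produces fresh output.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Commit:
    author: str               # hypothetical label, e.g. "algorithm-v1" or "user:alice"
    snapshot: Dict[str, str]  # full state of the entity after this commit


@dataclass
class ToyRepo:
    """Toy linear history for a single entity (an assumption, not NanoVC itself)."""
    commits: List[Commit] = field(default_factory=list)

    def commit(self, author: str, snapshot: Dict[str, str]) -> None:
        self.commits.append(Commit(author, dict(snapshot)))

    def head(self) -> Dict[str, str]:
        return dict(self.commits[-1].snapshot) if self.commits else {}

    def blame(self) -> Dict[str, str]:
        """Compute, on demand, which author last changed each field in the head."""
        last_author: Dict[str, str] = {}
        previous: Dict[str, str] = {}
        for c in self.commits:
            for key, value in c.snapshot.items():
                if previous.get(key) != value:
                    last_author[key] = c.author
            previous = c.snapshot
        # Report only fields that still exist in the latest snapshot.
        return {k: a for k, a in last_author.items() if k in previous}


def merge_preserving_manual(
    repo: ToyRepo, author: str, new_output: Dict[str, str]
) -> Dict[str, str]:
    """Commit new algorithm output, keeping fields last changed by a manual commit."""
    head = repo.head()
    merged = dict(new_output)
    for key, who in repo.blame().items():
        if who.startswith("user:"):  # assumed convention for manual commits
            merged[key] = head[key]
    repo.commit(author, merged)
    return merged


repo = ToyRepo()
repo.commit("algorithm-v1", {"name": "ACME Ltd", "city": "Joburg"})
repo.commit("user:alice", {"name": "ACME Ltd", "city": "Johannesburg"})
merged = merge_preserving_manual(
    repo, "algorithm-v2", {"name": "ACME Limited", "city": "Joburg"}
)
print(merged)        # the manual correction to "city" survives the new run
print(repo.blame())  # provenance per field, derived from the history
```

Because the merged commit leaves the manually edited field unchanged, a later `blame` still attributes it to the manual author, so the provenance stays intact across repeated algorithm runs. Manual deletions and branching histories are deliberately left out of this sketch.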

Original language: English
Title of host publication: Proceedings - 2021 International Conference on Computational Science and Computational Intelligence, CSCI 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1905-1911
Number of pages: 7
ISBN (Electronic): 9781665458412
Publication status: Published - 2021
Event: 2021 International Conference on Computational Science and Computational Intelligence, CSCI 2021 - Las Vegas, United States
Duration: 15 Dec 2021 to 17 Dec 2021

Publication series

Name: Proceedings - 2021 International Conference on Computational Science and Computational Intelligence, CSCI 2021

Conference

Conference: 2021 International Conference on Computational Science and Computational Intelligence, CSCI 2021
Country/Territory: United States
City: Las Vegas
Period: 15/12/21 to 17/12/21

Keywords

  • data quality
  • data-lineage
  • provenance

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Safety, Risk, Reliability and Quality
