LabelFlow Framework for Annotating Workflow Provenance

Journal Title: Informatics - Year 2018, Vol 5, Issue 1

Abstract

Scientists routinely analyse and share data for others to use. Successful data (re)use relies on metadata that describes the context in which the data were analysed. In many disciplines the creation of such contextual metadata is referred to as reporting. One method of implementing analyses is with workflows. A stand-out feature of workflows is their ability to record provenance from executions. Provenance is useful when analyses are executed with changing parameters (changing contexts) and results need to be traced to their respective parameters. In this paper we investigate whether provenance can be exploited to support reporting. Specifically, we outline a case study based on a real-world workflow and a set of reporting queries. We observe that provenance, as collected from workflow executions, is of limited use for reporting, as it supports the queries only partially. We identify that this is due to the generic nature of provenance, that is, its lack of domain-specific contextual metadata. We observe that the required information is available in implicit form, embedded in data. We describe LabelFlow, a framework comprising four Labelling Operators for decorating provenance with domain-specific Labels. LabelFlow can be instantiated for a domain by plugging in domain-specific metadata extractors. We provide a tool that takes a workflow as input and produces as output a Labelling Pipeline for that workflow, composed of Labelling Operators. We revisit the case study and show how Labels provide a more complete implementation of the reporting queries.

Authors and Affiliations

Pinar Alper, Khalid Belhajjame, Vasa Curcin and Carole A. Goble

  • EP ID EP44120
  • DOI https://doi.org/10.3390/informatics5010011

How To Cite

Pinar Alper, Khalid Belhajjame, Vasa Curcin and Carole A. Goble (2018). LabelFlow Framework for Annotating Workflow Provenance. Informatics, 5(1), -. https://europub.co.uk./articles/-A-44120