ETL Best Practices for Data Quality Checks in RIS Databases
Journal Title: Informatics - Year 2019, Vol 6, Issue 1
Abstract
Data integration from external data sources or independent IT systems has recently received increasing attention in IT departments as well as at management level, in particular with regard to federated database systems. One example is commercial research information systems (RIS), which regularly import, cleanse, transform, and prepare for analysis the research information of institutions from a variety of databases. All of these steps must be carried out at an assured level of quality. As several internal and external data sources are loaded for integration into the RIS, ensuring information quality is becoming increasingly challenging for research institutions. Before research information is transferred to a RIS, it must be checked and cleaned. Data quality is therefore always an important factor for successful data integration. Removing data errors (such as duplicates, inconsistent data, and outdated data) and harmonizing the data structure are essential tasks of data integration using extract, transform, and load (ETL) processes. Data is extracted from the source systems, transformed, and loaded into the RIS. During this process, conflicts between different data sources are identified and resolved, and data quality issues are eliminated. Against this background, our paper presents the process of data transformation in the context of RIS, giving an overview of the quality of research information in an institution’s internal and external data sources during its integration into the RIS. In addition, it addresses the question of how to control and improve data quality during the integration process in RIS.
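The ETL steps named in the abstract (extract, harmonize, remove outdated and duplicate records, load) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record fields (`title`, `doi`, `updated`), the sample data, the cutoff date, and the rule "the newest record wins a conflict" are all assumptions made for illustration, and a plain list stands in for the RIS.

```python
from datetime import date

# Hypothetical source records; field names and values are illustrative only.
INTERNAL = [
    {"title": "ETL in  RIS ", "doi": "10.1000/ABC", "updated": "2018-11-02"},
    {"title": "Old report", "doi": "10.1000/old", "updated": "2009-01-15"},
]
EXTERNAL = [
    {"Title": "ETL in RIS", "DOI": " 10.1000/abc", "updated": "2019-01-20"},
    {"Title": "Data quality", "DOI": "10.1000/xyz", "updated": "2018-06-30"},
]


def extract(*sources):
    """Extract: read the records of every source system in turn."""
    for source in sources:
        yield from source


def harmonize(record):
    """Map differing field names and formats onto one common structure."""
    lowered = {key.lower(): value for key, value in record.items()}
    return {
        "title": " ".join(lowered["title"].split()),  # collapse whitespace
        "doi": lowered["doi"].strip().lower(),        # normalized match key
        "updated": date.fromisoformat(lowered["updated"]),
    }


def transform(records, cutoff):
    """Harmonize records, reject outdated ones, and resolve duplicates.

    Assumed conflict rule: when two records share a DOI, keep the one
    with the later `updated` date. Returns (clean, rejected), where
    rejected is a list of (record, reason) pairs.
    """
    newest = {}
    rejected = []
    for raw in records:
        rec = harmonize(raw)
        if rec["updated"] < cutoff:
            rejected.append((rec, "outdated"))
            continue
        current = newest.get(rec["doi"])
        if current is None:
            newest[rec["doi"]] = rec
            continue
        if rec["updated"] > current["updated"]:
            keep, drop = rec, current
        else:
            keep, drop = current, rec
        newest[rec["doi"]] = keep
        rejected.append((drop, "duplicate"))
    return list(newest.values()), rejected


def load(records, target):
    """Load: append the cleaned records to the target store."""
    target.extend(records)
    return len(records)


ris = []  # stand-in for the RIS database
clean, rejected = transform(extract(INTERNAL, EXTERNAL), cutoff=date(2015, 1, 1))
loaded = load(clean, ris)
print(f"loaded {loaded} records, rejected {len(rejected)}")
for rec, reason in rejected:
    print(f"  rejected {rec['doi']}: {reason}")
```

Keeping the rejected records together with a reason, rather than silently dropping them, is one way to obtain the kind of quality overview per source that the abstract describes.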
Authors and Affiliations
Otmane Azeroual, Gunter Saake and Mohammad Abuosba