Academic literature on the topic 'Data quality (DQ)'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Data quality (DQ).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Data quality (DQ)"

1. Aljumaili, Mustafa, Ramin Karim, and Phillip Tretten. "Metadata-based data quality assessment." VINE Journal of Information and Knowledge Management Systems 46, no. 2 (2016): 232–50. http://dx.doi.org/10.1108/vjikms-11-2015-0059.

Abstract:
Purpose: The purpose of this paper is to develop a data quality (DQ) assessment model based on content analysis and metadata analysis. Design/methodology/approach: A literature review of DQ assessment models was conducted, along with a study of DQ key performance indicators (KPIs). Finally, the proposed model was developed and applied in a case study. Findings: The results of this study show that metadata carry important information about the DQ of a database and can be used to assess DQ and provide decision support for decision makers. Originality/value: Many DQ assessment models exist in the literature; however, they do not consider metadata. The model developed in this study uses metadata in addition to content analysis to arrive at a quantitative DQ assessment.
2. Syed, Rehan, Rebekah Eden, Tendai Makasi, et al. "Digital Health Data Quality Issues: Systematic Review." Journal of Medical Internet Research 25 (March 31, 2023): e42615. http://dx.doi.org/10.2196/42615.

Abstract:
Background: The promise of digital health is principally dependent on the ability to electronically capture data that can be analyzed to improve decision-making. However, the ability to effectively harness data has proven elusive, largely because of the quality of the data captured. Despite the importance of data quality (DQ), an agreed-upon DQ taxonomy is still missing from the literature. When consolidated frameworks are developed, the dimensions are often fragmented, without consideration of the interrelationships among the dimensions or their resultant impact. Objective: The aim of this study was to develop a consolidated digital health DQ dimension and outcome (DQ-DO) framework to provide insights into 3 research questions: What are the dimensions of digital health DQ? How are the dimensions of digital health DQ related? And what are the impacts of digital health DQ? Methods: Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, a developmental systematic literature review was conducted of peer-reviewed literature focusing on digital health DQ in predominantly hospital settings. A total of 227 relevant articles were retrieved and inductively analyzed to identify digital health DQ dimensions and outcomes. The inductive analysis was performed through open coding, constant comparison, and card sorting with subject matter experts. Subsequently, a computer-assisted analysis was performed and verified by DQ experts to identify the interrelationships among the DQ dimensions and the relationships between DQ dimensions and outcomes. The analysis resulted in the development of the DQ-DO framework. Results: The digital health DQ-DO framework consists of 6 dimensions of DQ, namely accessibility, accuracy, completeness, consistency, contextual validity, and currency; interrelationships among the dimensions of digital health DQ, with consistency being the most influential dimension, impacting all other digital health DQ dimensions; 5 digital health DQ outcomes, namely clinical, clinician, research-related, business process, and organizational outcomes; and relationships between the digital health DQ dimensions and DQ outcomes, with the consistency and accessibility dimensions impacting all DQ outcomes. Conclusions: The DQ-DO framework developed in this study demonstrates the complexity of digital health DQ and the necessity of reducing digital health DQ issues. The framework further provides health care executives with holistic insights into DQ issues and their resultant outcomes, which can help them prioritize which DQ-related problems to tackle first.
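For readers who want to operationalize such a taxonomy, the following is a minimal sketch of how the six DQ-DO dimensions and five outcome categories named in this abstract could be encoded as a typed structure. The dimension and outcome names come from the abstract; the example influence edges are hypothetical illustrations, not the study's reported relationships.

```typescript
// Sketch: the DQ-DO taxonomy from the abstract, encoded as data.
type DQDimension =
  | "accessibility" | "accuracy" | "completeness"
  | "consistency" | "contextual validity" | "currency";

type DQOutcome =
  | "clinical" | "clinician" | "research-related"
  | "business process" | "organizational";

interface DQInfluence {
  from: DQDimension;
  to: DQDimension | DQOutcome;
}

// The abstract reports that consistency influences all other dimensions
// and (with accessibility) all outcomes; a few of those edges as data:
const influences: DQInfluence[] = [
  { from: "consistency", to: "accuracy" },
  { from: "consistency", to: "completeness" },
  { from: "consistency", to: "clinical" },
  { from: "accessibility", to: "organizational" },
];

console.log(`encoded ${influences.length} example influence edges`);
```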
3. Batini, C., T. Blaschke, S. Lang, et al. "Data Quality in Remote Sensing." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W7 (September 12, 2017): 447–53. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w7-447-2017.

Abstract:
The issue of data quality (DQ) is of growing importance in remote sensing (RS), due to the widespread use of digital services (including apps) that exploit remote sensing data. In this position paper, a body of experts from the ISPRS Inter-commission Working Group III/IVb "DQ" identifies, categorises, and reasons about issues considered crucial for an RS research and application agenda. This ISPRS initiative builds on earlier work by other organisations such as IEEE, CEOS, and GEO, in particular the meritorious work of the Quality Assurance Framework for Earth Observation (QA4EO), which was established and endorsed by the Committee on Earth Observation Satellites (CEOS), but aims to broaden the view by including experts from computer science and particularly database science. The main activities and outcomes include: providing a taxonomy of DQ dimensions in the RS domain, achieving a global approach to DQ for heterogeneous-format RS data sets, investigating DQ dimensions in use, conceiving a methodology for managing cost-effective solutions to DQ in RS initiatives, and addressing future challenges for RS DQ dimensions arising in the new era of big Earth data.
4. Anantharama, Nandini, Wray Buntine, and Andrew Nunn. "A Systematic Approach to Reconciling Data Quality Failures: Investigation Using Spinal Cord Injury Data." ACI Open 05, no. 02 (2021): e94–e103. http://dx.doi.org/10.1055/s-0041-1735975.

Abstract:
Background: Secondary use of electronic health record (EHR) data requires evaluation of data quality (DQ) for fitness of use. While multiple frameworks exist for quantifying DQ, there are no guidelines for the evaluation of DQ failures identified through such frameworks. Objectives: This study proposes a systematic approach to evaluating DQ failures through an understanding of data provenance, in support of exploratory modeling in machine learning. Methods: Our study is based on the EHR of spinal cord injury inpatients in a state spinal care center in Australia, admitted between 2011 and 2018 (inclusive) and aged over 17 years. As a prerequisite step, DQ was measured by applying a DQ framework to the EHR data through rules that quantified DQ dimensions. DQ was measured as the percentage of values per field that met the criteria, or as Krippendorff's α for agreement between variables. These failures were then assessed using semistructured interviews with purposively sampled domain experts. Results: The DQ of the fields in our dataset ranged from 0% to 100% adherence. Understanding the data provenance of fields with DQ failures enabled us to ascertain whether each DQ failure was fatal, recoverable, or not relevant to the field's inclusion in our study. We also identify the themes of data provenance from a DQ perspective as systems, processes, and actors. Conclusion: A systematic approach to understanding data provenance through the context of data generation helps in the reconciliation or repair of DQ failures and is a necessary step in the preparation of data for secondary use.
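For reference, the two measurements named in this abstract have standard textbook forms; the notation below is a sketch of those general definitions, not formulas reproduced from the paper. Field-level adherence is the share of values passing a rule, and Krippendorff's α compares observed with expected disagreement.

```latex
% Field-level adherence: share of values v in field f satisfying rule r
\mathrm{DQ}(f) \;=\; \frac{\lvert \{\, v \in f : r(v) \,\} \rvert}{\lvert f \rvert} \times 100\%

% Krippendorff's alpha: observed (D_o) vs. expected (D_e) disagreement
\alpha \;=\; 1 - \frac{D_o}{D_e}
```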
5. Nazaire, Mare. "Integrating Data Quality Feedback: A Data Provider's Perspective." Biodiversity Information Science and Standards 2 (June 13, 2018): e26007. https://doi.org/10.3897/biss.2.26007.

Abstract:
The Herbarium of Rancho Santa Ana Botanic Garden [RSA-POM] is the third largest herbarium in California and holds more than 1.2 million specimens, of which roughly 50% are digitized. As a data provider, RSA-POM publishes its data with several aggregators, including the Consortium of California Herbaria, JSTOR, and Symbiota (from which the data are subsequently pulled into iDigBio and GBIF), as well as through its own local web portal. Each submission of data needs to be prepared and formatted according to the aggregator's specifications for publication. Feedback on data quality (DQ) ranges from individual users (often only a few records at a time) to large aggregators (frequently in large batches). While some DQ items are easy fixes requiring little time and effort, others can be more challenging and often require expertise beyond the skill set of curatorial staff. In other instances, there are issues concerning an aggregator's ability to provide updated data for repatriation. This talk will discuss the efforts of the RSA-POM Herbarium to provide data to various aggregators, as well as perspectives on the challenges, limitations, and constraints of integrating DQ items from an aggregator back into the local database.
6. Paul, Deborah, and Nicole Fisher. "Challenges for Implementing Collections Data Quality Feedback: Synthesizing the Community Experience." Biodiversity Information Science and Standards 2 (June 13, 2018): e26003. https://doi.org/10.3897/biss.2.26003.

Abstract:
Much data quality (DQ) feedback is now available to data providers from aggregators of collections specimen and related data. Similarly, transcription centres and crowdsourcing platforms provide data that must be assessed and often manipulated before being uploaded to a local database and subsequently published with aggregators. In order to facilitate broader use of DQ information, aggregators (GBIF, ALA, iDigBio, VertNet) and others, through the TDWG BDQ Interest Group, are harmonizing the DQ information provided, transforming part of the DQ feedback process. But collections sharing data face challenges when trying to evaluate and integrate the changes offered by aggregators for given records in local collection management systems and collection databases. Sharing DQ integration experiences can help reveal risks and opportunities. Discovering that others face the same conundrums helps develop a sense of community and may help remove duplication of effort. It is important to leverage the knowledge and experience of those who are currently validating data to improve the efficiency and effectiveness of the process. Documenting and classifying these challenges also supports motivation and community building by informing those who would tackle them. In this case, talks from aggregators and data providers give all of us a chance to learn from their stories of implementing and integrating DQ feedback. Following the symposium, a special interest group (SIG at SPNHC) meeting offers everyone an opportunity to add their experiences with aggregator DQ feedback; see the SIG meeting "Add Your Input to Challenges for Implementing Collections Data Quality Feedback: synthesizing the community experience" for details. For tractable issues, we plan to assemble and note the expected ways in which these barriers can be overcome. Where possible, we can tap into existing community resources (SPNHC, TDWG, biology.stackexchange.com, iDigBio, etc.) to help our data providers implement future data updates and track changes. At the same time, we plan to analyze the intractable issues, documenting why they remain challenging and what potential solutions, if any, are available or likely to become available. This information provides future projects such as DiSSCo (Distributed System of Scientific Collections), BCoN (Biodiversity Collections Network), and others worldwide with the information required to plan more effectively for cyber/human infrastructure. Synthesizing this input helps visionaries better understand, anticipate, and support DQ management and data mobilization efforts going forward by informing the design of future proposals and global projects structured with these outcomes in mind. At the end of the workshop, we intend to publish our findings and merge them with the results of a global survey on the same topic.
7. Veiga, Allan, and Antonio Saraiva. "Defining a Data Quality (DQ) profile and DQ report using a prototype of Node.js module of the Fitness for Use Backbone (FFUB)." Biodiversity Information Science and Standards 1 (August 14, 2017): e20275. https://doi.org/10.3897/tdwgproceedings.1.20275.

Abstract:
Despite the increasing availability of biodiversity data, determining the quality of data and informing would-be data consumers and users remains a significant issue. In order for data users and data owners to perform a satisfactory assessment and management of data fitness for use, they require a Data Quality (DQ) report, which presents a set of relevant DQ measures, validations, and amendments assigned to data. Determining the meaning of "fitness for use" is essential to best manage and assess DQ. To tackle the problem, the TDWG Biodiversity Data Quality (BDQ) Interest Group (IG) (https://github.com/tdwg/bdq) has proposed a conceptual framework that defines the components necessary to describe DQ needs, DQ solutions, and DQ reports (Fig. 1). It supports, in a global and collaborative environment, a consistent description of: (1) the meaning of data fitness for use in specific contexts, using the concept of a DQ profile; (2) DQ solutions, using the concepts of specifications and mechanisms; and (3) the status of quality of data according to a DQ profile, using the concept of a DQ report (Veiga 2016, Veiga et al. 2017). Based on this conceptual framework, we implemented a prototype of a Fitness for Use Backbone (FFUB) as a Node.js module (https://nodejs.org/api/modules.html) for registering and retrieving instances of the framework concepts. This prototype was built using Node.js, an asynchronous event-driven JavaScript runtime, which uses a non-blocking I/O model that makes it lightweight and efficient for building scalable network applications (https://nodejs.org). We registered our module in the npm package manager (https://www.npmjs.com) in order to facilitate its reuse, and we made our source code available on GitHub (https://github.com) in order to foster collaborative development. To test the module, we developed a simple mechanism for measuring, validating, and amending the quality of datasets and records, called BDQ-Toolkit, available in the FFUB module. The source code of the FFUB module can be found at https://github.com/BioComp-USP/ffub. Installing and using the module requires Node.js version 6 or higher; instructions can be found at https://www.npmjs.com/package/ffub (Veiga and Saraiva 2017). Using the FFUB module, we defined a simple DQ profile describing the meaning of data fitness for use in a specific context by registering a hypothetical use case. Then, we registered a set of valuable information elements for the context of the use case. For measuring the quality of each valuable information element, we registered a set of DQ dimensions. To validate whether the DQ measures are good enough, a set of DQ criteria was defined and registered. Lastly, a set of DQ enhancements for amending quality in the use case context was also defined and registered. In order to describe the DQ solution used to meet those DQ needs, we registered the BDQ-Toolkit mechanism and all the specifications implemented by it. Using these specifications and this mechanism, we generated and assigned to a dataset and its records a set of DQ assertions, according to the DQ dimensions, criteria, and enhancements defined in the DQ profile. Based on those assertions, we can build DQ reports by composing all the assertions assigned to the dataset or to a specific record. Such a DQ report describes the status of the DQ of a dataset or record according to the context of the DQ profile.
This module provides an interface to the proposed conceptual framework, allowing others to register instances of its concepts. Future work will include creating a RESTful API using more sophisticated methods of data retrieval.
8. Bian, Jiang, Tianchen Lyu, Alexander Loiacono, et al. "Assessing the practice of data quality evaluation in a national clinical data research network through a systematic scoping review in the era of real-world data." Journal of the American Medical Informatics Association 27, no. 12 (2020): 1999–2010. http://dx.doi.org/10.1093/jamia/ocaa245.

Abstract:
Objective: To synthesize the data quality (DQ) dimensions and assessment methods for real-world data, especially electronic health records, through a systematic scoping review, and to assess the practice of DQ assessment in the national Patient-Centered Clinical Research Network (PCORnet). Materials and Methods: We started with 3 widely cited DQ publications (2 reviews, from Chan et al [2010] and Weiskopf et al [2013a], and 1 DQ framework, from Kahn et al [2016]) and expanded our review systematically to cover relevant articles published up to February 2020. We extracted DQ dimensions and assessment methods from these studies, mapped their relationships, and organized a synthesized summary of existing DQ dimensions and assessment methods. We reviewed the data checks employed by PCORnet and mapped them to the synthesized DQ dimensions and methods. Results: We analyzed a total of 3 reviews, 20 DQ frameworks, and 226 DQ studies, and extracted 14 DQ dimensions and 10 assessment methods. We found that completeness, concordance, and correctness/accuracy were commonly assessed. Element presence, validity checks, and conformance were commonly used DQ assessment methods and were the main focus of the PCORnet data checks. Discussion: Definitions of DQ dimensions and methods were not consistent in the literature, and DQ assessment practice was not evenly distributed (eg, usability and ease of use were rarely discussed). Challenges in DQ assessment exist, given the complex and heterogeneous nature of real-world data. Conclusion: The practice of DQ assessment is still limited in scope. Future work is warranted to generate understandable, executable, and reusable DQ measures.
9. Veiga, Allan, and Antonio Saraiva. "Toward a Biodiversity Data Fitness for Use Backbone (FFUB): A Node.js module prototype." Biodiversity Information Science and Standards 1 (August 14, 2017): e20300. https://doi.org/10.3897/tdwgproceedings.1.20300.

Abstract:
Introduction: The biodiversity informatics community has made important achievements in digitizing, integrating, and publishing standardized data about global biodiversity. However, assessing the quality of such data and determining their fitness for use in different contexts remain a challenge. To tackle this problem using a common approach and conceptual base, the TDWG Biodiversity Data Quality Interest Group (BDQ-IG) (https://github.com/tdwg/bdq) has proposed a conceptual framework defining the components necessary to describe Data Quality (DQ) needs, DQ solutions, and DQ reports. It supports a consistent description of the meaning of DQ in specific contexts and of how to assess and manage DQ in a global and collaborative environment (Veiga 2016, Veiga et al. 2017). Based on the common ground provided by this conceptual framework, we implemented a prototype of a Fitness for Use Backbone (FFUB) as a Node.js module (https://nodejs.org/api/modules.html) for registering and retrieving instances of the framework concepts. Material and methods: This prototype was built using Node.js, an asynchronous event-driven JavaScript runtime, which uses a non-blocking I/O model that makes it lightweight and efficient for building scalable network applications (https://nodejs.org). In order to facilitate the reusability of the module, we registered it in the npm package manager (https://www.npmjs.com). To foster collaboration on the development of the module, the source code was made available in the GitHub (https://github.com) version control system. To test the module, we developed a simple mechanism for measuring, validating, and amending the quality of datasets and records, called BDQ-Toolkit. The source code of the FFUB module can be found at https://github.com/BioComp-USP/ffub. Installing and using the module requires Node.js version 6 or higher; instructions can be found at https://www.npmjs.com/package/ffub. Results: The implemented prototype is organized into three main types of functions: register, retrieve, and print. Register functions enable the creation of instances of the concepts of the conceptual framework, as illustrated in Fig. 1, such as use cases, information elements, dimensions, criteria, enhancements, specifications, mechanisms, assertions (measure, validation, and amendment), and DQ profiles. As a prototype, these instances are not persisted but are stored in an in-memory JSON object. Retrieve functions are used to get instances of the framework concepts, such as DQ reports, from the in-memory JSON object. Print functions write the concepts stored in the in-memory JSON object to the console in a formatted way. Inside the FFUB module, we implemented a test that registers a set of instances of the framework concepts, including a simple DQ profile, specifications and mechanisms, and a set of assertions applied to a sample dataset and its records. Based on these registrations, it is possible to retrieve and print DQ reports presenting the current status of the DQ of the sample dataset and its records according to the defined DQ profile. Final remarks: This module provides a practical interface to the proposed conceptual framework. It allows the input of instances of concepts and generates, as output, information that supports DQ assessment and management.
Future work includes creating a RESTful API, based on the functions developed in this prototype, using more sophisticated data retrieval methods backed by NoSQL databases.
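A hedged sketch of how such a module might be driven from Node.js follows. The package name ("ffub"), the in-memory JSON storage, and the three function families (register, retrieve, print) come from the abstract; the exact exported identifiers below (registerUseCase, retrieveDQReport, printDQReport, etc.) are illustrative assumptions, not the module's documented API.

```typescript
// Hypothetical usage sketch of an FFUB-style module; consult
// https://www.npmjs.com/package/ffub for the actual instructions.
declare function require(name: string): any;
const ffub: any = require("ffub");

// 1. Register instances of the framework concepts (held in an
//    in-memory JSON object; nothing is persisted in the prototype).
const useCase = ffub.registerUseCase("species distribution modelling");
const element = ffub.registerInformationElement(useCase, "decimalLatitude");
const criterion = ffub.registerCriterion(element, "value within [-90, 90]");

// 2. Retrieve a DQ report composed of the assertions (measures,
//    validations, amendments) assigned to a dataset under the profile.
const report = ffub.retrieveDQReport("sample-dataset");

// 3. Print the report to the console in a formatted way.
ffub.printDQReport(report);
```

In the actual prototype, per the abstract, the register calls populate the in-memory JSON object from which the retrieve and print functions operate.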
10. Blake, Roger, and Ganesan Shankaranarayanan. "Discovering Data and Information Quality Research Insights Gained through Latent Semantic Analysis." International Journal of Business Intelligence Research 3, no. 1 (2012): 1–16. http://dx.doi.org/10.4018/jbir.2012010101.

Abstract:
Over the past decade, the field of data and information quality (DQ) has grown into a research area spanning multiple disciplines. The motivation here is to help understand the core topics and themes that constitute this area and to determine how those DQ topics and themes relate to business intelligence (BI). To do so, the authors present the results of a study that mines the abstracts of DQ articles published over the last decade. Using Latent Semantic Analysis (LSA), six core themes of DQ research are identified, as well as twelve dominant topics comprising them. Five of these topics (decision support, database design and data mining, data querying and cleansing, data integration, and DQ for analytics) relate to BI, emphasizing the importance of research that combines DQ with BI. The DQ topics from these results are profiled against BI and used to suggest several opportunities for researchers.
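As background on the method named here (standard LSA, not notation from the paper): LSA forms a term-document matrix from the abstracts and applies a truncated singular value decomposition, reading themes off the retained latent dimensions.

```latex
% X is the t x d term-document matrix; k latent dimensions are retained
X \;\approx\; U_k \,\Sigma_k\, V_k^{\top}, \qquad k \ll \min(t, d)
```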

Books on the topic "Data quality (DQ)"

1. Morbey, Guilherme. Data Quality for Decision Makers: A Dialog Between a Board Member and a DQ Expert. Springer Gabler, Springer Fachmedien Wiesbaden GmbH, 2013.

2. Morbey, Guilherme. Data Quality for Decision Makers: A dialog between a board member and a DQ expert. Springer Gabler, 2013.


Book chapters on the topic "Data quality (DQ)"

1. Rivas, Bibiano, Jorge Merino, Manuel Serrano, Ismael Caballero, and Mario Piattini. "I8K|DQ-BigData: I8K Architecture Extension for Data Quality in Big Data." In Lecture Notes in Computer Science. Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-25747-1_17.

2. Sáez, Carlos, Juan Martínez-Miranda, Montserrat Robles, and Juan Miguel García-Gómez. "Organizing Data Quality Assessment of Shifting Biomedical Data." In Studies in Health Technology and Informatics. IOS Press, 2012. https://doi.org/10.3233/978-1-61499-101-4-721.

Abstract:
Low biomedical Data Quality (DQ) leads to poor decisions that may affect the care process or the results of evidence-based studies. Most current approaches to DQ leave unattended the shifting behaviour of the concepts underlying the data and its relation to DQ. There is also no agreement on a common set of DQ dimensions, on how they interact, or on how they relate to these shifts. In this paper we propose an organization of biomedical DQ assessment based on these concepts, identifying characteristics and requirements that will facilitate future research. As a result, we define the Data Quality Vector, compiling a unified set of DQ dimensions (completeness, consistency, duplicity, correctness, timeliness, spatial stability, contextualization, predictive value, and reliability) as the foundation for the further development of DQ assessment algorithms and platforms.
3. Weber, Jens H., Morgan Price, and Iryna Davies. "Taming the Data Quality Dragon – A Theory and Method for Data Quality by Design." In Studies in Health Technology and Informatics. IOS Press, 2015. https://doi.org/10.3233/978-1-61499-564-7-928.

Abstract:
A lack of data quality (DQ) is often a significant inhibitor impeding the realization of the cost and quality benefits expected from Clinical Information Systems (CIS). Attaining and sustaining DQ in CIS has been a multi-faceted and elusive goal. The current literature on DQ in health informatics consists mainly of empirical studies and practitioners' reports, but often lacks a holistic approach to addressing DQ 'by design'. This paper presents a general framework for clinical DQ, which blends foundational engineering theories with concepts and methods from health informatics. We define an architectural viewpoint for designing and reasoning about DQ. We introduce the notion of DQ Probes for monitoring and assuring DQ during system operation. The concepts presented have been validated in a real-world case study.
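The notion of a DQ Probe lends itself to a small interface sketch: a named check that observes data in a running system and emits a measurement. Everything below, including the interface shape and the example field, is a hypothetical illustration of the idea, not the authors' specification.

```typescript
// Hypothetical sketch of a "DQ Probe": a named check that samples
// live data and reports a measurement for continuous DQ monitoring.
interface DQProbeResult {
  probe: string;
  passed: boolean;
  value: number; // e.g. fraction of conformant records
  observedAt: Date;
}

interface DQProbe<T> {
  name: string;
  run(sample: T[]): DQProbeResult;
}

// Example probe: completeness of a (hypothetical) required field.
const probeName = "allergy-status-completeness";
const allergyProbe: DQProbe<{ allergyStatus?: string }> = {
  name: probeName,
  run(sample) {
    const filled = sample.filter((r) => r.allergyStatus != null).length;
    const value = sample.length > 0 ? filled / sample.length : 1;
    return { probe: probeName, passed: value >= 0.95, value, observedAt: new Date() };
  },
};

console.log(allergyProbe.run([{ allergyStatus: "none known" }, {}]));
```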
4. Sundararaman, Arun Thotapalli. "Data Quality for Data Mining in Business Intelligence Applications." In Advances in Business Strategy and Competitive Advantage. IGI Global, 2015. http://dx.doi.org/10.4018/978-1-4666-6477-7.ch003.

Abstract:
Data Quality (DQ) in data mining refers to the quality of the patterns or results of the models built using mining algorithms. DQ for data mining in Business Intelligence (BI) applications should be aligned with the objectives of the BI application. Objective measures, training/modeling approaches, and subjective measures are the three major approaches that exist to measure DQ for data mining. However, there is no agreement yet on definitions, measurements, or interpretations of DQ for data mining. Defining the factors of DQ for data mining, and their measurement for a BI system, has been one of the major challenges for researchers as well as practitioners. This chapter provides an overview of existing research in the area of DQ definition and measurement for data mining for BI, analyzes the gaps therein, reviews proposed solutions, and provides a direction for future research and practice in this area.
5. Rahimi, Alireza, Siaw-Teng Liaw, Pradeep Kumar Ray, Jane Taggart, and Hairong Yu. "Ontology for Data Quality and Chronic Disease Management." In Healthcare Informatics and Analytics. IGI Global, 2015. http://dx.doi.org/10.4018/978-1-4666-6316-9.ch016.

Abstract:
Improved Data Quality (DQ) can improve the quality of decisions and lead to better policy in health organizations. Ontologies can support automated tools to assess DQ. This chapter examines ontology-based approaches to the conceptualization and specification of DQ based on "fitness for purpose" within the health context. English-language studies that addressed DQ, fitness for purpose, ontology-based approaches, and implementations were included. The authors screened 315 papers and excluded 36 duplicates, 182 on abstract review, and 46 on full-text review, leaving 52 papers. These were appraised with a realist "context-mechanism-impacts/outcomes" template. The authors found a lack of consensus frameworks or definitions for DQ and of comprehensive ontological approaches to DQ or fitness for purpose. The majority of papers described the process of developing DQ tools. Some assessed the impact of implementing ontology-based specifications for DQ. There were few evaluative studies of the performance of the DQ assessment tools developed; none compared ontological with non-ontological approaches.
6. Tute, Erik. "Striving for Use Case Specific Optimization of Data Quality Assessment for Health Data." In Studies in Health Technology and Informatics. IOS Press, 2018. https://doi.org/10.3233/978-1-61499-880-8-113.

Abstract:
Data quality (DQ) assessment is advisable before (re)using datasets. Besides supporting DQ assessment, DQ tools can indicate data integration issues. The objective of this contribution is to put up for discussion the identified current state of scientific knowledge in DQ assessment for health data, together with the work planned on the basis of that state of knowledge. The state of scientific knowledge is based on a continuous literature survey and on tracking the activities of related working groups. Ninety-five full-text publications constitute the considered state of scientific knowledge, from which a representative selection of six DQ tools and frameworks is presented. The delineated future work explores multi-institutional machine learning on the DQ measurement results of an interoperable DQ tool, with the goal of optimizing combinations of DQ measurement methods and reference values for DQ issue recognition.
7. Tahar, Kais, Raphael Verbuecheln, Tamara Martin, Holm Graessner, and Dagmar Krefting. "Local Data Quality Assessments on EHR-Based Real-World Data for Rare Diseases." In Caring is Sharing – Exploiting the Value in Data for Health and Innovation. IOS Press, 2023. http://dx.doi.org/10.3233/shti230121.

Abstract:
The project "Collaboration on Rare Diseases" (CORD-MI) connects various university hospitals in Germany in order to collect sufficient harmonized electronic health record (EHR) data to support clinical research in the field of rare diseases (RDs). However, the integration and transformation of heterogeneous data into an interoperable standard through Extract-Transform-Load (ETL) processes is a complex task that may influence data quality (DQ). Local DQ assessments and control processes are needed to ensure and improve the quality of RD data. We therefore aim to investigate the impact of ETL processes on the quality of transformed RD data. Seven DQ indicators covering three independent DQ dimensions were evaluated. The resulting reports show the correctness of the calculated DQ metrics and the detected DQ issues. Our study provides the first comparison of the DQ of RD data before and after ETL processes. We found that ETL processes are challenging tasks that influence the quality of RD data. We have demonstrated that our methodology is useful and capable of evaluating the quality of real-world data stored in different formats and structures. Our methodology can therefore be used to improve the quality of RD documentation and to support clinical research.
8. Triefenbach, Lucas, Ronny Otto, Jonas Bienzeisler, et al. "Establishing a Data Quality Baseline in the AKTIN Emergency Department Data Registry – A Secondary Use Perspective." In Studies in Health Technology and Informatics. IOS Press, 2022. http://dx.doi.org/10.3233/shti220439.

Abstract:
Secondary use of clinical data is a growing application area that is affected by the data quality (DQ) of the source systems. Techniques for controlling DQ, such as audits and risk-based monitoring, often rely on source data verification (SDV), which requires access to the data-generating systems. We present an approach to targeted SDV based on manual input and synthetic data that is applicable in low-resource settings with restricted system access. We deployed the protocol in the DQ management of the AKTIN Emergency Department Data Registry. Our targeted approach has proven feasible for forming a DQ baseline that can be used for different DQ monitoring processes, such as the identification of different error sources.
9. Sundararaman, Arun Thotapalli. "Big Data Quality for Data Mining in Business Intelligence Applications." In Advances in Business Information Systems and Analytics. IGI Global, 2021. http://dx.doi.org/10.4018/978-1-7998-5781-5.ch004.

Abstract:
The study of data quality for data mining applications has always been a complex topic; in recent years, it has gained further complexity with the advent of big data as the source for data mining and business intelligence (BI) applications. In a big data environment, data is consumed in various states and various forms, serving as input for data mining, and this is the main source of added complexity. These new complexities and challenges arise from the underlying dimensions of big data (volume, variety, velocity, and value) together with the ability to consume data at various stages of transition from raw data to standardized datasets. These have created a need for expanding the traditional data quality (DQ) factors into big data quality (BDQ) factors, as well as a need for new BDQ assessment and measurement frameworks for data mining and BI applications. However, very limited advancement has been made in research and industry on the topic of BDQ and its relevance and criticality for data mining and BI applications. Data quality in data mining refers to the quality of the patterns or results of the models built using mining algorithms. DQ for data mining in business intelligence applications should be aligned with the objectives of the BI application. Objective measures, training/modeling approaches, and subjective measures are the three major approaches that exist to measure DQ for data mining. However, there is no agreement yet on definitions, measurements, or interpretations of DQ for data mining. Defining the factors of DQ for data mining, and their measurement for a BI system, has been one of the major challenges for researchers as well as practitioners. This chapter provides an overview of existing research in the area of BDQ definitions and measurement for data mining for BI, analyzes the gaps therein, and provides a direction for future research and practice in this area.
10. Sundararaman, Arun Thotapalli. "Effective Measurement of DQ/IQ for BI." In Information Quality and Governance for Business Intelligence. IGI Global, 2014. http://dx.doi.org/10.4018/978-1-4666-4892-0.ch012.

Abstract:
DQ/IQ measurement in general, and in the specific context of BI, has always been a topic of high interest for researchers. The topic of Data Quality (DQ) in the field of Information Management has been well researched, published, and studied. Despite such research advances, there has been very little understanding, from either a theoretical or a practical perspective, of DQ/IQ measurement for BI. Assessing the quality of data for a BI system has been one of the major challenges for researchers as well as practitioners, leading to the need for frameworks to measure DQ for BI. The objective of this chapter is to provide an overview of the existing frameworks for measuring DQ for BI, analyze the gaps therein, review proposed solutions, and provide a direction for future research and practice in this area.

Conference papers on the topic "Data quality (DQ)"

1. Veiga, Allan Koch, and Antonio Mauro Saraiva. "Biodiversity Data Quality Profiling: A practical guideline." In VIII Workshop de Computação Aplicada à Gestão do Meio Ambiente e Recursos Naturais. Sociedade Brasileira de Computação - SBC, 2017. http://dx.doi.org/10.5753/wcama.2017.3441.

Abstract:
The growing availability of biodiversity data worldwide, provided by a growing number of institutions, and the increasing use of those data for a variety of purposes have raised concerns about the "fitness for use" of the data and the impact on the results of those uses. To address these issues, a conceptual framework was defined in the context of Biodiversity Information Standards (TDWG) to serve as a consistent approach for assessing and managing Data Quality (DQ) in biodiversity data. Based on this framework, we propose a method for defining DQ profiles that describe the meaning of "fitness for use" in a given context and consequently enable the assessment and improvement of DQ.
2. Ehrlinger, Lisa, Alexander Gindlhumer, Lisa-Marie Huber, and Wolfram Wöß. "DQ-MeeRKat: Automating Data Quality Monitoring with a Reference-Data-Profile-Annotated Knowledge Graph." In 10th International Conference on Data Science, Technology and Applications. SCITEPRESS - Science and Technology Publications, 2021. http://dx.doi.org/10.5220/0010546202150222.

3. Ehrlinger, Lisa, Alexander Gindlhumer, Lisa-Marie Huber, and Wolfram Wöß. "DQ-MeeRKat: Automating Data Quality Monitoring with a Reference-Data-Profile-Annotated Knowledge Graph." In 10th International Conference on Data Science, Technology and Applications. SCITEPRESS - Science and Technology Publications, 2021. http://dx.doi.org/10.5220/0010546200002993.

4. Parody, Luisa, Maria Teresa Gomez-Lopez, Isabel Bermejo, Ismael Caballero, Rafael M. Gasca, and Mario Piattini. "PAIS-DQ: Extending process-aware information systems to support data quality in PAIS life-cycle." In 2016 IEEE Tenth International Conference on Research Challenges in Information Science (RCIS). IEEE, 2016. http://dx.doi.org/10.1109/rcis.2016.7549342.

5. Pei, Shuiqiang, Xiaoguang Hu, Guofeng Zhang, and Li Fu. "Improved Voltage Sag Detection Method and Optimal Design for the Digital Low-Pass Filter in Small UAVs." In ASME 2015 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2015. http://dx.doi.org/10.1115/detc2015-46401.

Abstract:
Real-time and accurate detection of voltage sag characteristics is the premise for achieving dynamic voltage restorer compensation. An improved αβ-dq transformation detection method is presented to overcome the limitations of traditional detection methods. In this method, the α-axis component of the αβ static coordinate system is derived from the single-phase voltage, and the virtual β-axis component is constructed from the derivative of the α-axis component. The magnitude, duration, and phase-angle jump of the voltage sag are detected quickly and accurately by the αβ-dq transformation and a low-pass filter. The original data are processed in real time, which ensures a faster detection response and greatly reduces the computation. In addition, an optimized design method for the digital low-pass filter is presented to address the contradiction between real-time performance and filtering effect in a common low-pass filter. It adopts an inertial filter to improve the characteristics of the Butterworth low-pass filter, better adapting it to the needs of voltage sag detection and thus improving the real-time performance and precision of the dynamic voltage restorer.
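For orientation, a sketch of the underlying single-phase αβ-dq construction in one common textbook convention (these are not the paper's exact equations; ω is the nominal supply angular frequency): the measured voltage serves as the α component, a 90°-shifted copy obtained from the derivative serves as the virtual β component, and a rotation at ω maps the pair into the dq frame.

```latex
v_\alpha(t) = v(t), \qquad
v_\beta(t) = -\frac{1}{\omega}\,\frac{\mathrm{d}v(t)}{\mathrm{d}t}

\begin{pmatrix} v_d \\ v_q \end{pmatrix}
=
\begin{pmatrix} \cos\omega t & \sin\omega t \\ -\sin\omega t & \cos\omega t \end{pmatrix}
\begin{pmatrix} v_\alpha \\ v_\beta \end{pmatrix},
\qquad
V = \sqrt{v_d^2 + v_q^2}, \quad \varphi = \operatorname{atan2}(v_q, v_d)
```

Under this convention, for v(t) = V cos(ωt + φ) the low-pass-filtered v_d and v_q are constant, so a sag appears as a step change in the estimated magnitude V and phase φ.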
6. Zeng, Y., S. Ryu, C. G. Chaney, et al. "Finding the Right Concept Via a Decision Quality Framework with Rapid Generation of Multiple Deepwater Conceptual Alternatives." In SPE Annual Technical Conference and Exhibition. SPE, 2022. http://dx.doi.org/10.2118/210288-ms.

Abstract:
The key to finding the highest-value concept in deepwater full-field development is making high-quality decisions during the Concept Select stage of a project. One of the critical elements in achieving this is considering a broad range of conceptual alternatives and evaluating them rapidly, providing timely feedback and facilitating an exploratory learning process. However, concept-select decisions are challenged by competing objectives, significant uncertainties, and many possible concepts. Further, deepwater full-field developments require strong connectivity and interfaces across multiple disciplines, including reservoir, wells, drilling, flow assurance, subsea, flowlines, risers, topsides, metocean, geotechnical, marine, costing, and project economics. Key challenges to the current methodology include a lack of capacity to consider multiple concepts, slow evaluation turnaround for each concept generated, continuous evaluation and revision as new data and information arrive, a lack of ability to integrate processes across multiple disciplines, and poor risk management driven by technical/commercial uncertainties and unavailable data. This paper addresses these challenges by combining concepts from the Decision Quality (DQ) framework and FLOCO® (Field Layout Concept Optimizer), a metaheuristic model-based systems-engineering software package, to efficiently identify the highest-value field development concepts among several possible alternatives. This novel approach applies a new framework to an offshore deepwater full-field development. Specifically, we explore the trade space, evaluate the trade-offs between risk and reward, perform integrated techno-economic analysis, and identify the best concepts. Key outputs are the identification of development concepts that meet the given constraints and functional requirements for further optimization, while eliminating those that do not meet such requirements. The results demonstrate that the challenges in the current Concept Select phase can be simplified and that the proposed approach offers a quick, logical, and insightful means of selecting the highest-value concept. The case study demonstrates that the proposed improvement to the concept-select stage of the deepwater full-field development process can lead to significantly improved project economics, as it fully explores the decision space, key uncertainties, multiple technically feasible concepts, and key performance indicators such as net present value (NPV) and capital expenditure (CAPEX). This paper addresses the development of economic oil and gas projects through decision making enhanced by rapid digital prototyping and analysis. The integration of Decision Quality methodologies with systems-engineering decision-support tools is novel and is likely to become more important as the industry explores and develops more complicated targets in the future.
7. Patel, Harsh, and Jonathan Chong. "How to Design a Modular, Effective, and Interpretable Machine Learning-Based Real-Time System: Lessons from Automated Electrical Submersible Pump Surveillance." In ADIPEC. SPE, 2023. http://dx.doi.org/10.2118/216761-ms.

Abstract:
Many machine learning (ML) projects do not progress beyond the proof-of-concept phase into real-world operations that remain economical at scale. Commonly discussed challenges revolve around digitalization, data, and infrastructure/tooling. However, there are other, non-ML aspects that are equally if not more important to building a successful system. This paper presents a general framework and lessons learned for building a robust, practical, and modular domain-centric ML-based system, in contrast to purely "data-centric" or "model-centric" approaches. The paper presents the case study of a sophisticated "plug-and-play" real-time surveillance system for electrical submersible pumps (ESPs) that has been successfully serving hundreds of wells of various configurations. The system has also been successfully tested to expand beyond advisory surveillance to include closed-loop control for autonomous response to events. We discuss some of the intelligent design strategies that allow us to address various requirements and practical constraints while still ensuring effective performance in the field. The paper also presents general learnings, design suggestions, and components that can be adapted for building similar ML-based systems for multivariate time-series problems. It demonstrates with examples how building an artificial intelligence (AI) system from modular, independent components can be more practical and effective than training large end-to-end deep learning models. These components can be independently tested, refactored, and even repurposed as libraries for other applications. We explain the significance of the first component, the data quality (DQ) engine, which is critical for any real-world engineering application dealing with the challenges of streaming field data. We discuss a second component, the reference engine, covering smart and practical ways in which first principles, subject-matter-expert knowledge, and memory-like features can be embedded into the system. Through the example of ESP surveillance, we differentiate the performance of ML models from the performance of the overall system and explain the complexities and trade-offs that go into tuning and evaluating such AI systems. We highlight the importance of designing observability into the core engine so that each decision step within the system can be analyzed and explained. Such transparency is necessary for critical applications needing actionable insights and continuous improvement. The paper also shares a few other lessons from implementing surveillance at scale, and from the ability to reuse components for robust, reliable closed-loop edge automation. Literature on ML system design is dominated by experiences from the technology sector, particularly consumer-facing applications, which are often challenging to adopt directly for high-risk, resource-constrained, and dynamic applications in upstream oil and gas. The insights from this paper will benefit data/ML professionals and help promote a greater appreciation by business leaders of what is required to build realistic real-time systems that incorporate ML models.