Journal articles on the topic 'EHR data mining'

Consult the top 50 journal articles for your research on the topic 'EHR data mining.'

1

Khanal, Rajesh. "The Role of Open Standard Electronic Health Record in Medical Data Mining." European Journal of Business Management and Research 2, no. 2 (April 25, 2017): 1–7. http://dx.doi.org/10.24018/ejbmr.2017.2.2.9.

Abstract:
The Electronic Health Record (EHR) has received significant attention from health service providers around the world. An EHR contains electronic patient information such as demographics, medical history, family medical history, lab tests and results, and prescribed drugs. There is no consistency in the type of EHR software implemented by hosting organizations, so the EHR is currently vendor dependent and is not transferable to another health service provider. An open standard electronic health record makes it publicly available to both vendors and patients. It can further aid in creating a universal EHR database for medical data mining. Mining the EHR helps in developing the best standard of care and clinical practice. This paper proposes a universal EHR database and medical data mining. The benefits and challenges of implementing a database system are also discussed, and the different application areas of EHR data mining are analyzed.
2

Sarwar, Tabinda, Sattar Seifollahi, Jeffrey Chan, Xiuzhen Zhang, Vural Aksakalli, Irene Hudson, Karin Verspoor, and Lawrence Cavedon. "The Secondary Use of Electronic Health Records for Data Mining: Data Characteristics and Challenges." ACM Computing Surveys 55, no. 2 (March 31, 2023): 1–40. http://dx.doi.org/10.1145/3490234.

Abstract:
The primary objective of implementing Electronic Health Records (EHRs) is to improve the management of patients’ health-related information. However, these records have also been extensively used for the secondary purpose of clinical research and to improve healthcare practice. EHRs provide a rich set of information that includes demographics, medical history, medications, laboratory test results, and diagnosis. Data mining and analytics techniques have extensively exploited EHR information to study patient cohorts for various clinical and research applications, such as phenotype extraction, precision medicine, intervention evaluation, disease prediction, detection, and progression. But the presence of diverse data types and associated characteristics poses many challenges to the use of EHR data. In this article, we provide an overview of information found in EHR systems and their characteristics that could be utilized for secondary applications. We first discuss the different types of data stored in EHRs, followed by the data transformations necessary for data analysis and mining. Later, we discuss the data quality issues and characteristics of the EHRs along with the relevant methods used to address them. Moreover, this survey also highlights the usage of various data types for different applications. Hence, this article can serve as a primer for researchers to understand the use of EHRs for data mining and analytics purposes.
3

Sundermann, Alexander J., James K. Miller, Jane W. Marsh, Melissa I. Saul, Kathleen A. Shutt, Marissa Pacey, Mustapha M. Mustapha, et al. "Automated data mining of the electronic health record for investigation of healthcare-associated outbreaks." Infection Control & Hospital Epidemiology 40, no. 3 (February 18, 2019): 314–19. http://dx.doi.org/10.1017/ice.2018.343.

Abstract:
Background: Identifying routes of transmission among hospitalized patients during a healthcare-associated outbreak can be tedious, particularly among patients with complex hospital stays and multiple exposures. Data mining of the electronic health record (EHR) has the potential to rapidly identify common exposures among patients suspected of being part of an outbreak. Methods: We retrospectively analyzed 9 hospital outbreaks that occurred during 2011–2016 and that had previously been characterized both according to transmission route and by molecular characterization of the bacterial isolates. We determined (1) the ability of data mining of the EHR to identify the correct route of transmission, (2) how early the correct route was identified during the timeline of the outbreak, and (3) how many cases in the outbreaks could have been prevented had the system been running in real time. Results: Correct routes were identified for all outbreaks at the second patient, except for one outbreak involving >1 transmission route that was detected at the eighth patient. Up to 40 or 34 infections (78% or 66% of possible preventable infections, respectively) could have been prevented if data mining had been implemented in real time, assuming the initiation of an effective intervention within 7 or 14 days of identification of the transmission route, respectively. Conclusions: Data mining of the EHR was accurate for identifying routes of transmission among patients who were part of the outbreak. Prospective validation of this approach using routine whole-genome sequencing and data mining of the EHR for both outbreak detection and route attribution is ongoing.
4

Grando, M. Adela, Vaishak Vellore, Benjamin J. Duncan, David R. Kaufman, Stephanie K. Furniss, Bradley N. Doebbeling, Karl A. Poterack, Timothy Miksch, and Richard A. Helmers. "Study of EHR-mediated workflows using ethnography and process mining methods." Health Informatics Journal 27, no. 2 (April 2021): 146045822110082. http://dx.doi.org/10.1177/14604582211008210.

Abstract:
Rapid ethnography and data mining approaches have been used individually to study clinical workflows, but have seldom been used together to overcome the limitations inherent in either type of method. For rapid ethnography, how reliable are the findings drawn from small samples? For data mining, how accurate are the discoveries drawn from automatic analysis of big data, when compared with observable data? This paper explores the combined use of rapid ethnography and process mining, aka ethno-mining, to study and compare metrics of a typical clinical documentation task, vital signs charting. The task was performed with different electronic health records (EHRs) used in three different hospital sites. The individual methods revealed substantial discrepancies in task duration between sites. Specifically, means of 159.6(78.55), 38.2(34.9), and 431.3(283.04) seconds were captured with rapid ethnography. When process mining was used, means of 518.6(3,808), 345.5(660.6), and 119.74(210.3) seconds were found. When ethno-mining was applied instead, outliers could be identified, explained and removed. Without outliers, mean task duration was similar between sites (78.1(66.7), 72.5(78.5), and 71.7(75) seconds). Results from this work suggest that integrating rapid ethnography and data mining into a single process may provide more meaningful results than a siloed approach when studying workflow.
5

Lee, Wu, Yuliang Shi, Hongfeng Sun, Lin Cheng, Kun Zhang, Xinjun Wang, and Zhiyong Chen. "MSIPA: Multi-Scale Interval Pattern-Aware Network for ICU Transfer Prediction." ACM Transactions on Knowledge Discovery from Data 16, no. 1 (February 28, 2022): 1–17. http://dx.doi.org/10.1145/3458284.

Abstract:
Accurate prediction of patients’ ICU transfer events is of great significance for improving ICU treatment efficiency. ICU transition prediction task based on Electronic Health Records (EHR) is a temporal mining task like many other health informatics mining tasks. In the EHR-based temporal mining task, existing approaches are usually unable to mine and exploit patterns used to improve model performance. This article proposes a network based on Interval Pattern-Aware, Multi-Scale Interval Pattern-Aware (MSIPA) network. MSIPA mines different interval patterns in temporal EHR data according to the short, medium, and long intervals. MSIPA utilizes the Scaled Dot-Product Attention mechanism to query the contexts corresponding to the three scale patterns. Furthermore, Transformer will use all three types of contextual information simultaneously for ICU transfer prediction. Extensive experiments on real-world data demonstrate that an MSIPA network outperforms state-of-the-art methods.
6

Liang, Chen, Sharon Weissman, Bankole Olatosi, Eric G. Poon, Michael E. Yarrington, and Xiaoming Li. "Curating a knowledge base for individuals with coinfection of HIV and SARS-CoV-2: a study protocol of EHR-based data mining and clinical implementation." BMJ Open 12, no. 9 (September 2022): e067204. http://dx.doi.org/10.1136/bmjopen-2022-067204.

Abstract:
Introduction: Despite a higher risk of severe COVID-19 disease in individuals with HIV, the interactions between SARS-CoV-2 and HIV infections remain unclear. To delineate these interactions, multicentre Electronic Health Records (EHR) hold existing promise to provide full-spectrum and longitudinal clinical data, demographics and sociobehavioural data at individual level. Presently, a comprehensive EHR-based cohort for the HIV/SARS-CoV-2 coinfection has not been established; EHR integration and data mining methods tailored for studying the coinfection are urgently needed yet remain underdeveloped. Methods and analysis: The overarching goal of this exploratory/developmental study is to establish an EHR-based cohort for individuals with HIV/SARS-CoV-2 coinfection and perform large-scale EHR-based data mining to examine the interactions between HIV and SARS-CoV-2 infections and systematically identify and validate factors contributing to the severe clinical course of the coinfection. We will use a nationwide EHR database in the USA, namely, National COVID Cohort Collaborative (N3C). Ultimately, collected clinical evidence will be implemented and used to pilot test a clinical decision support prototype to assist providers in screening and referral of at-risk patients in real-world clinics. Ethics and dissemination: The study was approved by the institutional review boards at the University of South Carolina (Pro00121828) as non-human subject study. Study findings will be presented at academic conferences and published in peer-reviewed journals. This study will disseminate urgently needed clinical evidence for guiding clinical practice for individuals with the coinfection at Prisma Health, a healthcare system in collaboration.
7

Madhavan, Ramesh, Chi Tang, Pratik Bhattacharya, Fadi Delly, and Maysaa M. Basha. "Evaluation of Documentation Patterns of Trainees and Supervising Physicians Using Data Mining." Journal of Graduate Medical Education 6, no. 3 (September 1, 2014): 577–80. http://dx.doi.org/10.4300/jgme-d-13-00267.1.

Abstract:
Background: The electronic health record (EHR) includes a rich data set that may offer opportunities for data mining and natural language processing to answer questions about quality of care, key aspects of resident education, or attributes of the residents' learning environment. Objective: We used data obtained from the EHR to report on inpatient documentation practices of residents and attending physicians at a large academic medical center. Methods: We conducted a retrospective observational study of deidentified patient notes entered over 7 consecutive months by a multispecialty university physician group at an urban hospital. A novel automated data mining technology was used to extract patient note–related variables. Results: A sample of 26 802 consecutive patient notes was analyzed using the data mining and modeling tool Healthcare Smartgrid. Residents entered most of the notes (33%, 8178 of 24 787) between noon and 4 pm and 31% (7718 of 24 787) of notes between 8 am and noon. Attending physicians placed notes about teaching attestations within 24 hours in only 73% (17 843 of 24 443) of the records. Surgical residents were more likely to place notes before noon (P < .001). Nonsurgical faculty were more likely to provide attestation of resident notes within 24 hours (P < .001). Conclusions: Data related to patient note entry was successfully used to objectively measure current work flow of resident physicians and their supervising faculty, and the findings have implications for physician oversight of residents' clinical work. We were able to demonstrate the utility of a data mining model as an assessment tool in graduate medical education.
8

Ross, M. K., Wei Wei, and L. Ohno-Machado. "“Big Data” and the Electronic Health Record." Yearbook of Medical Informatics 23, no. 01 (August 2014): 97–104. http://dx.doi.org/10.15265/iy-2014-0003.

Abstract:
Summary Objectives: Implementation of Electronic Health Record (EHR) systems continues to expand. The massive number of patient encounters results in high amounts of stored data. Transforming clinical data into knowledge to improve patient care has been the goal of biomedical informatics professionals for many decades, and this work is now increasingly recognized outside our field. In reviewing the literature for the past three years, we focus on “big data” in the context of EHR systems and we report on some examples of how secondary use of data has been put into practice. Methods: We searched PubMed database for articles from January 1, 2011 to November 1, 2013. We initiated the search with keywords related to “big data” and EHR. We identified relevant articles and additional keywords from the retrieved articles were added. Based on the new keywords, more articles were retrieved and we manually narrowed down the set utilizing predefined inclusion and exclusion criteria. Results: Our final review includes articles categorized into the themes of data mining (pharmacovigilance, phenotyping, natural language processing), data application and integration (clinical decision support, personal monitoring, social media), and privacy and security. Conclusion: The increasing adoption of EHR systems worldwide makes it possible to capture large amounts of clinical data. There is an increasing number of articles addressing the theme of “big data”, and the concepts associated with these articles vary. The next step is to transform healthcare big data into actionable knowledge.
9

Patel, J., Z. Siddiqui, A. Krishnan, and T. Thyvalikakath. "Leveraging Electronic Dental Record Data to Classify Patients Based on Their Smoking Intensity." Methods of Information in Medicine 57, no. 05/06 (November 2018): 253–60. http://dx.doi.org/10.1055/s-0039-1681088.

Abstract:
Background: Smoking is an established risk factor for oral diseases and, therefore, dental clinicians routinely assess and record their patients' detailed smoking status. Researchers have successfully extracted smoking history from electronic health records (EHRs) using text mining methods. However, they could not retrieve patients' smoking intensity due to its limited availability in the EHR. The presence of detailed smoking information in the electronic dental record (EDR) often under a separate section allows retrieving this information with less preprocessing. Objective: To determine patients' detailed smoking status based on smoking intensity from the EDR. Methods: First, the authors created a reference standard of 3,296 unique patients’ smoking histories from the EDR that classified patients based on their smoking intensity. Next, they trained three machine learning classifiers (support vector machine, random forest, and naïve Bayes) using the training set (2,176) and evaluated performances on test set (1,120) using precision (P), recall (R), and F-measure (F). Finally, they applied the best classifier to classify smoking status from an additional 3,114 patients’ smoking histories. Results: Support vector machine performed best to classify patients into smokers, nonsmokers, and unknowns (P, R, F: 98%); intermittent smoker (P: 95%, R: 98%, F: 96%); past smoker (P, R, F: 89%); light smoker (P, R, F: 87%); smokers with unknown intensity (P: 76%, R: 86%, F: 81%), and intermediate smoker (P: 90%, R: 88%, F: 89%). It performed moderately to differentiate heavy smokers (P: 90%, R: 44%, F: 60%). EDR could be a valuable source for obtaining patients’ detailed smoking information. Conclusion: EDR data could serve as a valuable source for obtaining patients' detailed smoking information based on their smoking intensity that may not be readily available in the EHR.
10

Hernandez-Boussard, Tina, Suzanne Tamang, James D. Brooks, Douglas W. Blayney, and Nigam Shah. "Measurement of urinary incontinence after prostate surgery from data-mining electronic health records (EHR)." Journal of Clinical Oncology 32, no. 15_suppl (May 20, 2014): 6612. http://dx.doi.org/10.1200/jco.2014.32.15_suppl.6612.

11

Sheets, Lincoln, Gregory Petroski, Yan Zhuang, Michael Phinney, Bin Ge, Jerry Parker, and Chi-Ren Shyu. "Combining Contrast Mining with Logistic Regression To Predict Healthcare Utilization in a Managed Care Population." Applied Clinical Informatics 08, no. 02 (April 2017): 430–46. http://dx.doi.org/10.4338/aci-2016-05-ra-0078.

Abstract:
Background: Because 5% of patients incur 50% of healthcare expenses, population health managers need to be able to focus preventive and longitudinal care on those patients who are at highest risk of increased utilization. Predictive analytics can be used to identify these patients and to better manage their care. Data mining permits the development of models that surpass the size restrictions of traditional statistical methods and take advantage of the rich data available in the electronic health record (EHR), without limiting predictions to specific chronic conditions. Objective: The objective was to demonstrate the usefulness of unrestricted EHR data for predictive analytics in managed healthcare. Methods: In a population of 9,568 Medicare and Medicaid beneficiaries, patients in the highest 5% of charges were compared to equal numbers of patients with the lowest charges. Contrast mining was used to discover the combinations of clinical attributes frequently associated with high utilization and infrequently associated with low utilization. The attributes found in these combinations were then tested by multiple logistic regression, and the discrimination of the model was evaluated by the c-statistic. Results: Of 19,014 potential EHR patient attributes, 67 were found in combinations frequently associated with high utilization, but not with low utilization (support>20%). Eleven of these attributes were significantly associated with high utilization (p<0.05). A prediction model composed of these eleven attributes had a discrimination of 84%. Conclusions: EHR mining reduced an unusably high number of patient attributes to a manageable set of potential healthcare utilization predictors, without conjecturing on which attributes would be useful. Treating these results as hypotheses to be tested by conventional methods yielded a highly accurate predictive model. This novel, two-step methodology can assist population health managers to focus preventive and longitudinal care on those patients who are at highest risk for increased utilization. Citation: Sheets L, Petroski GF, Zhuang Y, Phinney MA, Ge B, Parker JC, Shyu C-R. Combining contrast mining with logistic regression to predict healthcare Appl Clin Inform 2017; 8: 430–446 https://doi.org/10.4338/ACI-2016-05-RA-0078
12

Hur, Cinyoung, JeongA Wi, and YoungBin Kim. "Facilitating the Development of Deep Learning Models with Visual Analytics for Electronic Health Records." International Journal of Environmental Research and Public Health 17, no. 22 (November 10, 2020): 8303. http://dx.doi.org/10.3390/ijerph17228303.

Abstract:
Electronic health record (EHR) data are widely used to perform early diagnoses and create treatment plans, which are key areas of research. We aimed to increase the efficiency of iteratively applying data-intensive technology and verifying the results for complex and big EHR data. We used a system entailing sequence mining, interpretable deep learning models, and visualization on data extracted from the MIMIC-III database for a group of patients diagnosed with heart disease. The results of sequence mining corresponded to specific pathways of interest to medical staff and were used to select patient groups that underwent these pathways. An interactive Sankey diagram representing these pathways and a heat map visually representing the weight of each variable were developed for temporal and quantitative illustration. We applied the proposed system to predict unplanned cardiac surgery using clinical pathways determined by sequence pattern mining to select cardiac surgery from complex EHRs to label subject groups and deep learning models. The proposed system aids in the selection of pathway-based patient groups, simplification of labeling, and exploratory interpretation of the modeling results. The proposed system can help medical staff explore various pathways that patients have undergone and further facilitate the testing of various clinical hypotheses using big data in the medical domain.
13

Ostherr, Kirsten. "Privacy, Data Mining, and Digital Profiling in Online Patient Narratives." Catalyst: Feminism, Theory, Technoscience 4, no. 1 (May 7, 2018): 1–24. http://dx.doi.org/10.28968/cftt.v4i1.288.

Abstract:
Practices of health datafication and inadequate privacy policies are redefining the meaning of online patient narratives. This article compares patient-driven illness narratives and clinic-driven illness narratives to uncover a set of unrecognized assumptions about trust and privacy in health discourses. Specifically, I show how the open sharing of patient stories in social media, blogs, and other public domains collides with privacy regulations and normative assumptions in the US health care system that prevent integration of those stories into electronic health record (EHR) systems. I argue that publicly told stories based on personal experiences of illness are valuable sources of health care information in part because they are subjective, richly detailed, and open ended. Yet, precisely because of their public nature, these patient stories are unprotected sources of data that are barred from integration into health care data ecologies where clinical action takes place. Consequently, an impermeable barrier exists between the officially sanctioned accounts in the clinical record and the contextual richness of patient stories on the social web. The tensions between these two approaches to narrative and data create an opening for exploitative digital profiling practices that can, and already do, harm patients. Examples are drawn from Hugo Campos and Medtronic, PatientsLikeMe, Apple Health Records, Google Health, Microsoft Health Vault, IBM Watson Health, and OpenNotes.
14

Ostherr, Kirsten. "Privacy, Data Mining, and Digital Profiling in Online Patient Narratives." Catalyst: Feminism, Theory, Technoscience 4, no. 1 (May 7, 2018): 1–24. http://dx.doi.org/10.28968/cftt.v4i1.29628.

15

Davis, Lea, and Jessica Dennis. "BEYOND BIOMARKERS: MINING CLINICAL LAB DATA FROM THE EHR FOR USE IN PSYCHIATRIC GENOMIC ANALYSIS." European Neuropsychopharmacology 29 (2019): S1052. http://dx.doi.org/10.1016/j.euroneuro.2018.07.065.

16

Jane, Nancy, Kannan Arputharaj, and Khanna Nehemiah. "A Temporal Mining Framework for Classifying Un-Evenly Spaced Clinical Data." Applied Clinical Informatics 07, no. 01 (January 2016): 1–21. http://dx.doi.org/10.4338/aci-2015-08-ra-0102.

Abstract:
Clinical time-series data acquired from electronic health records (EHR) are liable to temporal complexities such as irregular observations, missing values and time constrained attributes that make the knowledge discovery process challenging. This paper presents a temporal rough set induced neuro-fuzzy (TRiNF) mining framework that handles these complexities and builds an effective clinical decision-making system. TRiNF provides two functionalities namely temporal data acquisition (TDA) and temporal classification. In TDA, a time-series forecasting model is constructed by adopting an improved double exponential smoothing method. The forecasting model is used in missing value imputation and temporal pattern extraction. The relevant attributes are selected using a temporal pattern based rough set approach. In temporal classification, a classification model is built with the selected attributes using a temporal pattern induced neuro-fuzzy classifier. For experimentation, this work uses two clinical time series dataset of hepatitis and thrombosis patients. The experimental result shows that with the proposed TRiNF framework, there is a significant reduction in the error rate, thereby obtaining the classification accuracy on an average of 92.59% for hepatitis and 91.69% for thrombosis dataset. The obtained classification results prove the efficiency of the proposed framework in terms of its improved classification accuracy.
17

Amster, Andy, Joseph Jentzsch, Ham Pasupuleti, and K. G. Subramanian. "Completeness, accuracy, and computability of National Quality Forum-specified eMeasures." Journal of the American Medical Informatics Association 22, no. 2 (October 17, 2014): 409–16. http://dx.doi.org/10.1136/amiajnl-2014-002865.

Abstract:
Objective: To analyze the completeness, computability, and accuracy of specifications for five National Quality Forum-specified (NQF) eMeasures spanning ambulatory, post-discharge, and emergency care within a comprehensive, integrated electronic health record (EHR) environment. Materials and methods: To evaluate completeness, we assessed eMeasure logic, data elements, and value sets. To evaluate computability, we assessed the translation of eMeasure algorithms to programmable logic constructs and the availability of EHR data elements to implement specified data criteria, using a de-identified clinical data set from Kaiser Permanente Northwest. To assess accuracy, we compared eMeasure results with those obtained independently by existing audited chart abstraction methods used for external and internal reporting. Results: One measure specification was incomplete; missing applicable LOINC codes rendered it non-computable. For three of four computable measures, data availability issues occurred; the literal specification guidance for a data element differed from the physical implementation of the data element in the EHR. In two cases, cross-referencing specified data elements to EHR equivalents allowed variably accurate measure computation. Substantial data availability issues occurred for one of the four computable measures, producing highly inaccurate results. Discussion: Existing clinical workflows, documentation, and coding in the EHR were significant barriers to implementing eMeasures as specified. Implementation requires redesigning business or clinical practices and, for one measure, systemic EHR modifications, including clinical text search capabilities. Conclusions: Five NQF eMeasures fell short of being machine-consumable specifications. Both clinical domain and technological expertise are required to implement manually intensive steps from data mapping to text mining to EHR-specific eMeasure implementation.
18

Hartley, David M., Susannah Jonas, Daniel Grossoehme, Amy Kelly, Cassandra Dodds, Shannon M. Alford, Elizabeth Shenkman, et al. "Use of EHR-Based Pediatric Quality Measures: Views of Health System Leaders and Parents." American Journal of Medical Quality 35, no. 2 (May 22, 2019): 177–85. http://dx.doi.org/10.1177/1062860619850322.

Abstract:
Measures of health care quality are produced from a variety of data sources, but often, physicians do not believe these measures reflect the quality of provided care. The aim was to assess the value to health system leaders (HSLs) and parents of benchmarking on health care quality measures using data mined from the electronic health record (EHR). Using in-context interviews with HSLs and parents, the authors investigated what new decisions and actions benchmarking using data mined from the EHR may enable and how benchmarking information should be presented to be most informative. Results demonstrate that although parents may have little experience using data on health care quality for decision making, they affirmed its potential value. HSLs expressed the need for high-confidence, validated metrics. They also perceived barriers to achieving meaningful metrics but recognized that mining data directly from the EHR could overcome those barriers. Parents and HSLs need high-confidence health care quality data to support decision making.
19

Park, Kangah, Minsu Cho, Minseok Song, Sooyoung Yoo, Hyunyoung Baek, Seok Kim, and Kidong Kim. "Exploring the potential of OMOP common data model for process mining in healthcare." PLOS ONE 18, no. 1 (January 3, 2023): e0279641. http://dx.doi.org/10.1371/journal.pone.0279641.

Abstract:
Background and objective: Electronic Health Records (EHRs) are increasingly being converted to Common Data Models (CDMs), a database schema designed to provide standardized vocabularies to facilitate collaborative observational research. To date, however, few attempts have been made to leverage CDM data for healthcare process mining, a technique to derive process-related knowledge (e.g., process model) from event logs. This paper presents a method to extract, construct, and analyze event logs from the Observational Medical Outcomes Partnership (OMOP) CDM for process mining and demonstrates CDM-based healthcare process mining with several real-life study cases while answering frequently posed questions in process mining in the CDM environment. Methods: We propose a method to extract, construct, and analyze event logs from the OMOP CDM for process types including inpatient, outpatient, emergency room processes, and patient journey. Using the proposed method, we extract the retrospective data of several surgical procedure cases (i.e., Total Laparoscopic Hysterectomy (TLH), Total Hip Replacement (THR), Coronary Bypass (CB), Transcatheter Aortic Valve Implantation (TAVI), Pancreaticoduodenectomy (PD)) from the CDM of a Korean tertiary hospital. Patient data are extracted for each of the operations and analyzed using several process mining techniques. Results: Using process mining, the clinical pathways, outpatient process models, emergency room process models, and patient journeys are demonstrated using the extracted logs. The result shows CDM’s usability as a novel and valuable data source for healthcare process analysis, yet with a few considerations. We found that CDM should be complemented by different internal and external data sources to address the administrative and operational aspects of healthcare processes, particularly for outpatient and ER process analyses. Conclusion: To the best of our knowledge, we are the first to exploit CDM for healthcare process mining. Specifically, we provide step-by-step guidance by demonstrating process analysis from locating relevant CDM tables to visualizing results using process mining tools. The proposed method can be widely applicable across different institutions. This work can contribute to bringing a process mining perspective to the existing CDM users in the changing Hospital Information Systems (HIS) environment and also to facilitating CDM-based studies in the process mining research community.
20

van Laar, Sylvia A., Ellen Kapiteijn, Kim B. Gombert-Handoko, Henk-Jan Guchelaar, and Juliette Zwaveling. "Application of Electronic Health Record Text Mining: Real-World Tolerability, Safety, and Efficacy of Adjuvant Melanoma Treatments." Cancers 14, no. 21 (November 3, 2022): 5426. http://dx.doi.org/10.3390/cancers14215426.

Abstract:
Introduction: Nivolumab (N), pembrolizumab (P), and dabrafenib plus trametinib (D + T) have been registered as adjuvant treatments for resected stage III and IV melanoma since 2018. Electronic health records (EHRs) are a real-world data source that can be used to review treatments in clinical practice. In this study, we applied EHR text-mining software to evaluate the real-world tolerability, safety, and efficacy of adjuvant melanoma treatments. Methods: Adult melanoma patients receiving adjuvant treatment between January 2019 and October 2021 at the Leiden University Medical Center, the Netherlands, were included. CTcue text-mining software (v3.1.0, CTcue B.V., Amsterdam, The Netherlands) was used to construct rule-based queries and perform context analysis for patient inclusion and data collection from structured and unstructured EHR data. Results: In total, 122 patients were included: 54 patients treated with nivolumab, 48 with pembrolizumab, and 20 with D + T. Significantly more patients discontinued treatment due to toxicity on D + T (N: 16%, P: 6%, D + T: 40%; χ²(6, n = 122) = 14.6, p = 0.024). Immune checkpoint inhibitors (ICIs) mainly showed immune-related treatment-limiting adverse events (AEs), and chronic thyroid-related AE occurred frequently (hyperthyroidism: N: 15%, P: 13%, hypothyroidism: N: 20%, P: 19%). Treatment-limiting toxicity from D + T was primarily a combination of reversible AEs, including pyrexia and fatigue. The 1-year recurrence-free survival was 70.3% after nivolumab, 72.4% after pembrolizumab, and 83.0% after D + T. Conclusions: Text-mining EHR is a valuable method to collect real-world data to evaluate adjuvant melanoma treatments. ICIs were better tolerated than D + T, in line with RCT results. For BRAF+ patients, physicians must weigh the higher risk of reversible treatment-limiting AEs of D + T against the risk of long-term immune-related AEs.
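As a reader's aid, the reported test result in this abstract (a chi-square statistic of 14.6 with 6 degrees of freedom) can be checked against its stated p value. The sketch below is not from the paper; it uses the closed-form upper-tail probability that holds for even degrees of freedom, and the function name `chi2_sf_even_df` is an illustrative assumption.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed form valid only for positive even df:
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires positive even degrees of freedom")
    half = x / 2.0
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


# Values taken from the abstract: statistic 14.6, 6 degrees of freedom.
p_value = chi2_sf_even_df(14.6, 6)
print(round(p_value, 3))  # 0.024
```

The result agrees with the p = 0.024 given in the abstract. For odd degrees of freedom a statistics library would be needed instead of this closed form.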
21

Kim, Yong-Mi, and Dursun Delen. "Medical informatics research trend analysis: A text mining approach." Health Informatics Journal 24, no. 4 (December 1, 2016): 432–52. http://dx.doi.org/10.1177/1460458216678443.

Abstract:
The objective of this research is to identify major subject areas of medical informatics and explore the time-variant changes therein. As such it can inform the field about where medical informatics research has been and where it is heading. Furthermore, by identifying subject areas, this study identifies the development trends and the boundaries of medical informatics as an academic field. To conduct the study, first we identified 26,307 articles in PubMed archives which were published in the top medical informatics journals within the timeframe of 2002 to 2013. Then, employing a text mining-based semi-automated analytic approach, we clustered major research topics by analyzing the most frequently appearing subject terms extracted from the abstracts of these articles. The results indicated that some subject areas, such as biomedical, are declining, while other research areas such as health information technology (HIT), Internet-enabled research, and electronic medical/health records (EMR/EHR), are growing. The changes within the research subject areas can largely be attributed to the increasing capabilities and use of HIT. The Internet, for example, has changed the way medical research is conducted in the health care field. While discovering new medical knowledge through clinical and biological experiments is important, the utilization of EMR/EHR enabled the researchers to discover novel medical insight buried deep inside massive data sets, and hence, data analytics research has become a common complement in the medical field, rapidly growing in popularity.
22

Durojaiye, Ashimiyu B., Scott Levin, Matthew Toerper, Hadi Kharrazi, Harold P. Lehmann, and Ayse P. Gurses. "Evaluation of multidisciplinary collaboration in pediatric trauma care using EHR data." Journal of the American Medical Informatics Association 26, no. 6 (March 19, 2019): 506–15. http://dx.doi.org/10.1093/jamia/ocy184.

Abstract:
Objectives: The study sought to identify collaborative electronic health record (EHR) usage patterns for pediatric trauma patients and determine how the usage patterns are related to patient outcomes. Materials and Methods: A process mining–based network analysis was applied to EHR metadata and trauma registry data for a cohort of pediatric trauma patients with minor injuries at a Level I pediatric trauma center. The EHR metadata were processed into an event log that was segmented based on gaps in the temporal continuity of events. A usage pattern was constructed for each encounter by creating edges among functional roles that were captured within the same event log segment. These patterns were classified into groups using graph kernel and unsupervised spectral clustering methods. Demographics, clinical and network characteristics, and emergency department (ED) length of stay (LOS) of the groups were compared. Results: Three distinct usage patterns that differed by network density were discovered: fully connected (clique), partially connected, and disconnected (isolated). Compared with the fully connected pattern, encounters with the partially connected pattern had an adjusted median ED LOS that was significantly longer (242.6 [95% confidence interval, 236.9–246.0] minutes vs 295.2 [95% confidence interval, 289.2–297.8] minutes), were more frequently seen among day shift and weekday arrivals, and involved otolaryngology, ophthalmology services, and child life specialists. Discussion: The clique-like usage pattern was associated with decreased ED LOS for the study cohort, suggesting that a greater degree of collaboration resulted in a shorter stay. Conclusions: Further investigation to understand and address causal factors can lead to improvement in multidisciplinary collaboration.
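The two construction steps named in this abstract (segmenting an event log at gaps in temporal continuity, then linking functional roles that appear in the same segment) can be illustrated with a small sketch. This is not the authors' code: the 30-minute gap threshold, the toy events, and the function names are assumptions made for illustration, and the graph kernel and spectral clustering steps are not shown.

```python
from itertools import combinations


def segment_events(events, max_gap_minutes):
    """Split (minute, role) events into segments wherever the time gap
    between consecutive events exceeds max_gap_minutes."""
    segments = []
    current = []
    last_time = None
    for minute, role in sorted(events):
        if last_time is not None and minute - last_time > max_gap_minutes:
            segments.append(current)
            current = []
        current.append((minute, role))
        last_time = minute
    if current:
        segments.append(current)
    return segments


def role_edges(segments):
    """Create an undirected edge between every pair of distinct roles
    that occur within the same segment."""
    edges = set()
    for segment in segments:
        roles = sorted({role for _, role in segment})
        edges.update(combinations(roles, 2))
    return edges


# Toy encounter; the 30-minute threshold is a hypothetical choice.
events = [(0, "nurse"), (3, "physician"), (5, "nurse"),
          (60, "surgeon"), (62, "nurse")]
segments = segment_events(events, max_gap_minutes=30)
edges = role_edges(segments)
print(len(segments), sorted(edges))
```

In this toy encounter the physician and surgeon never share a segment, so the resulting network is partially connected rather than a clique.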
23

Andry, Johanes Fernandes, Fabio Mangatas Silaen, Hendy Tannady, and Kevin Hadi Saputra. "Electronic health record to predict a heart attack used data mining with Naïve Bayes method." International Journal of Informatics and Communication Technology (IJ-ICT) 10, no. 3 (December 1, 2021): 182. http://dx.doi.org/10.11591/ijict.v10i3.pp182-187.

Abstract:
A heart attack is a medical emergency. A heart attack usually occurs when a blood clot blocks the flow of blood to the heart. Cardiovascular disease (CVD) is a group of diseases that affect the body's cardiovascular system, including the heart and blood vessels. CVDs include angina, arrhythmia, heart attack, heart failure, atherosclerosis, stroke, and so on. Addressing CVD involves evaluating large numbers of datasets and comparing them for any information that can be used for forecasting and for organizing care. The method used is Naïve Bayes classification, because it can determine a target that answers questions such as whether a patient has the potential for heart disease. After data analysis, the authors can apply the data to electronic health records (EHR).
24

Wu, Yonghui, Jeremy L. Warner, Liwei Wang, Min Jiang, Jun Xu, Qingxia Chen, Hui Nian, et al. "Discovery of Noncancer Drug Effects on Survival in Electronic Health Records of Patients With Cancer: A New Paradigm for Drug Repurposing." JCO Clinical Cancer Informatics, no. 3 (December 2019): 1–9. http://dx.doi.org/10.1200/cci.19.00001.

Abstract:
PURPOSE: Drug development is becoming increasingly expensive and time consuming. Drug repurposing is one potential solution to accelerate drug discovery. However, limited research exists on the use of electronic health record (EHR) data for drug repurposing, and most published studies have been conducted in a hypothesis-driven manner that requires a predefined hypothesis about drugs and new indications. Whether EHRs can be used to detect drug repurposing signals is not clear. We want to demonstrate the feasibility of mining large, longitudinal EHRs for drug repurposing by detecting candidate noncancer drugs that can potentially be used for the treatment of cancer. PATIENTS AND METHODS: By linking cancer registry data to EHRs, we identified 43,310 patients with cancer treated at Vanderbilt University Medical Center (VUMC) and 98,366 treated at the Mayo Clinic. We assessed the effect of 146 noncancer drugs on cancer survival using VUMC EHR data and sought to replicate significant associations (false discovery rate < .1) using the identical approach with Mayo Clinic EHR data. To evaluate replicated signals further, we reviewed the biomedical literature and clinical trials on cancers for corroborating evidence. RESULTS: We identified 22 drugs from six drug classes (statins, proton pump inhibitors, angiotensin-converting enzyme inhibitors, β-blockers, nonsteroidal anti-inflammatory drugs, and α-1 blockers) associated with improved overall cancer survival (false discovery rate < .1) from VUMC; nine of the 22 drug associations were replicated at the Mayo Clinic. Literature and cancer clinical trial evaluations also showed very strong evidence to support the repurposing signals from EHRs. CONCLUSION: Mining of EHRs for drug exposure–mediated survival signals is feasible and identifies potential candidates for antineoplastic repurposing. This study sets up a new model of mining EHRs for drug repurposing signals.
25

Baowaly, Mrinal Kanti, Chia-Ching Lin, Chao-Lin Liu, and Kuan-Ta Chen. "Synthesizing electronic health records using improved generative adversarial networks." Journal of the American Medical Informatics Association 26, no. 3 (December 7, 2018): 228–41. http://dx.doi.org/10.1093/jamia/ocy142.

Abstract:
Objective: The aim of this study was to generate synthetic electronic health records (EHRs). The generated EHR data will be more realistic than those generated using the existing medical Generative Adversarial Network (medGAN) method. Materials and Methods: We modified medGAN to obtain two synthetic data generation models, designated as medical Wasserstein GAN with gradient penalty (medWGAN) and medical boundary-seeking GAN (medBGAN), and compared the results obtained using the three models. We used 2 databases: MIMIC-III and National Health Insurance Research Database (NHIRD), Taiwan. First, we trained the models and generated synthetic EHRs by using these three models. We then analyzed and compared the models’ performance by using a few statistical methods (Kolmogorov–Smirnov test, dimension-wise probability for binary data, and dimension-wise average count for count data) and 2 machine learning tasks (association rule mining and prediction). Results: We conducted a comprehensive analysis and found our models were adequately efficient for generating synthetic EHR data. The proposed models outperformed medGAN in all cases, and among the 3 models, boundary-seeking GAN (medBGAN) performed the best. Discussion: To generate realistic synthetic EHR data, the proposed models will be effective in the medical industry and related research from the viewpoint of providing better services. Moreover, they will eliminate barriers including limited access to EHR data and thus accelerate research on medical informatics. Conclusion: The proposed models can adequately learn the data distribution of real EHRs and efficiently generate realistic synthetic EHRs. The results show the superiority of our models over the existing model.
26

Stachel, Anna, Julie Klock, Dan Ding, Jennifer Lighter, Kwesi Daniel, and Levi Waldron. "Data Mining to Guide a Program to Prevent Infection Related Readmissions From Skilled Nursing Facilities." Infection Control & Hospital Epidemiology 41, S1 (October 2020): s29–s30. http://dx.doi.org/10.1017/ice.2020.507.

Abstract:
Background: Readmissions to hospitals are common, costly and often preventable, notably readmissions due to infections. A 30-day readmission analysis following hospital discharges found that much of the variation in Medicare spending between hospitals was related to readmissions and skilled nursing facility (SNF) care. Although some readmissions of patients with advanced disease are not preventable, efforts to decrease readmission are most effectively directed towards those patients with intermediate levels of a specific risk. A prediction model to identify patients at highest (or intermediate) risk of infection readmission will help healthcare administrators and providers to allocate appropriate resources. Hospitals should use electronic health record (EHR) data with modern data mining techniques to create more curated, sophisticated models as part of a comprehensive transitional care program. We propose using the risk estimates of a validated prediction model to notify stakeholders and develop readmission rate reports by SNF or discharging physician. Methods: We applied machine learning (ML) methods to predict the risk of 30-day readmission due to sepsis and pneumonia of patients discharged to SNF. We used our EHR data from 2012–2017 for training and data from 2018 for validation. We applied ML algorithms, including logistic regression, random forest, gradient boosting trees, and support vector machine, to the data. Data from EDW and EPIC clarity tables were extracted and managed using SAS Base 9.4 and Enterprise Miner 14.3 (SAS Institute, Cary, NC). We assessed discrimination and calibration to select the most effective prediction model. Using the resulting risk estimates, we created a notification system and reports for key stakeholders. Results: Figures 1 and 2 show the discrimination and calibration results of the final selected gradient boosting model (GBM). For predicting unplanned readmissions with sepsis and with pneumonia within 30 days after discharge to SNF, the c-statistics for the final GBM models with 140 features and with 73 features were 0.69 (95% CI 0.65-0.73) and 0.71 (95% CI 0.66-0.75), respectively. Table 1 lists features important to the validation set of the prediction model. We used estimates from these models to develop a daily email notification of patients discharged to SNF stratified into low, medium, and high risk groups for sepsis and pneumonia. We additionally created reports with case-mix adjustments to benchmark SNFs and discharging physicians to monitor and understand performance. Conclusions: Hospitals should leverage the plethora of data found in EHRs to curate readmission prediction models, and promote collaboration among transitional care teams and IPC to ultimately reduce readmissions due to sepsis and pneumonia. Funding: None. Disclosures: None.
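For readers unfamiliar with the discrimination measure quoted in this abstract, the c-statistic is the probability that a randomly chosen readmitted patient receives a higher predicted risk than a randomly chosen non-readmitted patient, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the labels and scores are made-up toy values, not data from the study, and the function name is an assumption.

```python
def c_statistic(labels, scores):
    """Concordance (c) statistic by exhaustive pairwise comparison.

    labels: 1 for an event (e.g. readmission), 0 otherwise.
    scores: predicted risks, higher meaning more likely to have the event.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(positives) * len(negatives))


# Toy example with hypothetical risk estimates.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1]
print(round(c_statistic(toy_labels, toy_scores), 3))  # 0.833
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; a rank-based computation would be preferable on cohorts of realistic size.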
27

Nowakowski, S., J. Razjouyan, A. D. Naik, R. Agrawal, K. Velamuri, S. Singh, and A. Sharafkhaneh. "1180 The Use Of Natural Language Processing To Extract Data From Psg Sleep Study Reports Using National Vha Electronic Medical Record Data." Sleep 43, Supplement_1 (April 2020): A450–A451. http://dx.doi.org/10.1093/sleep/zsaa056.1174.

Abstract:
Introduction: In 2007, Congress asked the Department of Veteran Affairs to pay closer attention to the incidence of sleep disorders among veterans. We aimed to use natural language processing (NLP), a method that applies algorithms to understand the meaning and structure of sentences within Electronic Health Record (EHR) patient free-text notes, to identify the number of attended polysomnography (PSG) studies conducted in the Veterans Health Administration (VHA) and to evaluate the performance of NLP in extracting sleep data from the notes. Methods: We identified 481,115 sleep studies using CPT code 95810 from 2000-19 in the national VHA. We used a rule-based regular expression method (phrases: “sleep stage” and “arousal index”) to identify attended PSG reports in the patient free-text notes in the EHR, of which 69,847 records met the rule-based criteria. We randomly selected 178 notes to compare the accuracy of the algorithm in mining sleep parameters (total sleep time (TST), sleep efficiency (SE), and sleep onset latency (SOL)) with human manual chart review. Results: The number of documented PSG studies increased each year from 963 in 2000 to 14,209 in 2018. System performance of NLP compared to the manually annotated reference standard in detecting sleep parameters was 83% for TST, 87% for SE, and 81% for SOL (accuracy benchmark ≥ 80%). Conclusion: This study showed that NLP is a useful technique to mine EHR and extract data from patients’ free-text notes. Reasons that NLP is not 100% accurate included note authors using different phrasing (e.g., “recording duration”), which the NLP algorithm did not detect or extract, and authors omitting sleep continuity variables from the notes. Nevertheless, this automated strategy to identify and extract sleep data can serve as an effective tool in large health care systems to be used for research and evaluation to improve sleep medicine patient care and outcomes. Support: This material is based upon work supported in part by the Department of Veteran Affairs, Veterans Health Administration, Office of Research and Development, and the Center for Innovations in Quality, Effectiveness and Safety (CIN 13-413). Dr. Nowakowski is also supported by a National Institutes of Health (NIH) Grant (R01NR018342).
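A rough sketch of the kind of rule-based regular expression step this abstract describes is given below. Only the two trigger phrases come from the abstract; requiring both phrases together, the total sleep time pattern, the function names, and the sample notes are all assumptions made for illustration and are not the study's actual rules.

```python
import re

# The two phrases named in the abstract.
SLEEP_STAGE = re.compile(r"sleep\s+stage", re.IGNORECASE)
AROUSAL_INDEX = re.compile(r"arousal\s+index", re.IGNORECASE)

# Hypothetical pattern for total sleep time (TST) reported in minutes.
TST_PATTERN = re.compile(
    r"total\s+sleep\s+time\D{0,20}?(\d+(?:\.\d+)?)\s*(?:min|minutes)",
    re.IGNORECASE,
)


def looks_like_psg_report(note: str) -> bool:
    """Flag a note as an attended PSG report if both phrases occur
    (assumption: the abstract does not say whether both are required)."""
    return bool(SLEEP_STAGE.search(note) and AROUSAL_INDEX.search(note))


def extract_tst_minutes(note: str):
    """Return TST in minutes if the hypothetical pattern matches, else None."""
    match = TST_PATTERN.search(note)
    return float(match.group(1)) if match else None


# Made-up notes for illustration only.
psg_note = ("Sleep stage summary: N1 5%. Arousal index 12.4/hr. "
            "Total sleep time: 382.5 minutes.")
other_note = "Recording duration 400 minutes. Patient tolerated study."

print(looks_like_psg_report(psg_note), extract_tst_minutes(psg_note))
print(looks_like_psg_report(other_note), extract_tst_minutes(other_note))
```

The second note shows the failure mode the abstract mentions: alternative phrasing such as "recording duration" is missed by fixed patterns, so extraction returns nothing.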
28

de Lusignan, Simon, Ana Correa, Gaël Dos Santos, Nadia Meyer, François Haguinet, Rebecca Webb, Christopher McGee, et al. "Enhanced Safety Surveillance of Influenza Vaccines in General Practice, Winter 2015-16: Feasibility Study." JMIR Public Health and Surveillance 5, no. 4 (November 14, 2019): e12016. http://dx.doi.org/10.2196/12016.

Abstract:
Background: The European Medicines Agency (EMA) requires vaccine manufacturers to conduct enhanced real-time surveillance of seasonal influenza vaccination. The EMA has specified a list of adverse events of interest to be monitored. The EMA sets out 3 different ways to conduct such surveillance: (1) active surveillance, (2) enhanced passive surveillance, or (3) electronic health record data mining (EHR-DM). English general practice (GP) is a suitable setting to implement enhanced passive surveillance and EHR-DM. Objective: This study aimed to test the feasibility of conducting enhanced passive surveillance in GP using the yellow card scheme (adverse events of interest reporting cards) to determine if it has any advantages over EHR-DM alone. Methods: A total of 9 GPs in England participated, of which 3 tested the feasibility of enhanced passive surveillance and the other 6 EHR-DM alone. The 3 that tested EPS provided patients with yellow (adverse events) cards to report any adverse events. Data were extracted from all 9 GPs’ EHRs between weeks 35 and 49 (08/24/2015 to 12/06/2015), the main period of influenza vaccination. We conducted weekly analysis and end-of-study analyses. Results: Our GPs were largely distributed across England with a registered population of 81,040. In the week 49 report, 15,863/81,040 people (19.57% of the registered practice population) were vaccinated. In the EPS practices, staff managed to hand out the cards to 61.25% (4150/6776) of the vaccinees, and of these cards, 1.98% (82/4150) were returned to the GP offices. Adverse events of interest were reported by 113/7223 people (1.56%) in the enhanced passive surveillance practices, compared with 322/8640 people (3.73%) in the EHR-DM practices. Conclusions: Overall, we demonstrated that GP EHR-DM was an appropriate method of enhanced surveillance. However, the use of yellow cards in enhanced passive surveillance practices did not enhance the collection of adverse events of interest, as demonstrated in this study. Their return rate was poor, data entry from them was not straightforward, and there were issues with data reconciliation. We concluded that customized cards prespecifying the EMA’s adverse events of interest, combined with EHR-DM, were needed to maximize data collection. International Registered Report Identifier (IRRID) RR2-10.1136/bmjopen-2016-015469
29

Li, Kevin, Christopher J. Magnani, Selen Bozkurt, Tina Seto, Douglas W. Blayney, James D. Brooks, and Tina Hernandez-Boussard. "Practice-based evidence for factors associated with urinary incontinence following prostate cancer care." Journal of Clinical Oncology 36, no. 6_suppl (February 20, 2018): 106. http://dx.doi.org/10.1200/jco.2018.36.6_suppl.106.

Abstract:
106 Background: Urinary incontinence (UI) is a common complication following treatment for localized prostate cancer. Past studies evaluating UI risk factors use surveys or chart abstraction, which may be costly and lack generalizability. Electronic health records (EHR) allow us to examine UI at a population level. We applied data mining methods to EHR data to: (1) evaluate rates of UI following prostate cancer treatment; and (2) evaluate potential risk factors for posttreatment UI. Methods: We conducted a retrospective analysis of patients undergoing prostatectomy or radiation therapy for localized prostate cancer between 2009-2016, and who received follow-up care at our medical center. Our cohort was constructed from the institutional EHR and the California Cancer Registry. The primary outcome was the presence of UI, measured in three-month intervals from the start of first-line treatment. The secondary outcome was UI 12-24 months following treatment (“late UI”). UI was assessed using natural language processing of EHR clinician notes. UI was also assessed with the EPIC-26 quality of life survey, which a subset of patients had prospectively completed. Results: Our cohort consisted of 2783 men, of whom 1907 (69%) underwent surgery and the remainder received radiation; of this cohort, 609 (22%) had data on late UI status. UI prevalence was higher among surgery than radiation patients across all posttreatment time points, and 278 of 434 (64%) surgery patients had late UI compared to 78 of 175 (45%) radiation patients (p < 0.001). Univariable analyses showed an association between pretreatment and late UI among surgery patients as measured in the EHR (OR 2.5, 95% CI 1.0-6.5, p = 0.05) and by EPIC-26 (OR 8.1, 95% CI 1.8-36.5, p = 0.01). Only surgery (compared to radiation) was a significant predictor of late UI (OR 5.8, 95% CI 1.1-32.3, p = 0.05) in multivariable regression with EHR data. 
Conclusions: Using EHR data, we found that treatment modality was a significant predictor of late UI among prostate cancer patients who underwent prostatectomy or radiation therapy. These results suggest the utility of EHRs in patient-centered outcomes research in prostate cancer care, and should be validated at other sites.
30

T., Sunitha, Shyamala J., and Annie Jesus Suganthi Rani A. "Prognostication Stereotype of Patients Morbidity and Mortality by Extraction of E-Health Records." International Journal of Emerging Research in Management and Technology 6, no. 6 (June 29, 2018): 215. http://dx.doi.org/10.23956/ijermt.v6i6.271.

Abstract:
Data mining suggests an innovative way of building a prognostication stereotype of patients' health risks. The large volume of Electronic Health Records (EHRs) collected over the years has provided a rich base for risk analysis and prediction. An EHR contains digitally stored healthcare information about an individual, such as observations, laboratory tests, diagnostic reports, medications, procedures, patient identifying information and allergies. A special type of EHR is the Health Examination Record (HER) from annual general health check-ups. Identifying participants at risk based on their current and past HERs is important for early warning and preventive intervention. By “risk”, we mean unwanted outcomes such as mortality and morbidity. This approach is limited by the classification problem, and consequently it is not informative about the specific disease area in which a person is at risk. A limited amount of data extracted from the health record is not sufficient to provide accurate risk prediction. The main aim of this project is risk prediction that classifies progressively developing situations when the majority of the data are unlabeled.
31

Noyd, David H., Nigel B. Neely, Claire Howell, Kevin C. Oeffinger, and Susan Kreissman. "Integration of EHR and Cancer Registry Data to Construct a Childhood Cancer Survivorship Cohort to Improve Long-Term Follow-up Care for Leukemia and Lymphoma Survivors." Blood 136, Supplement 1 (November 5, 2020): 8. http://dx.doi.org/10.1182/blood-2020-142402.

Full text
Abstract:
Background and aims: Marked improvements in pediatric oncology care for children with acute lymphoblastic leukemia (ALL) and leukemia emphasize the importance of appropriate follow-up care for childhood cancer survivors (CCS) to monitor for late effects. The electronic health record (EHR) sparks unique approaches to construct, maintain, and leverage CCS cohorts to improve care. The primary purpose of this study is to utilize a novel approach to integrate EHR and cancer registry data to construct a CCS cohort for ALL and lymphoma survivors. Data mining in the EHR facilitates stratification of patients into risk cohorts and identification of high risk patients with inadequate subspecialty survivorship care. Methods: Cancer registry data from January 1, 1994 to November 30, 2012 provided a base cohort from which pediatric oncology clinic visits were extracted from the EHR. Explanatory variables included gender, race/ethnicity, risk stratification, distance to the medical center, rural-urban commuting area code, and area deprivation index as a measure of socioeconomic status. Primary outcomes included date of last clinic visit to determine appropriate follow-up, defined as being seen in a subspecialty clinic between five and seven years after initial diagnosis. Patients who died or relapsed during the seven year follow-up period were excluded from the analysis. Results: Between January 1, 1994 and November 30, 2012, there were 262 pediatric oncology patients evaluated at our institution and reported in the cancer registry who were alive and without evidence of recurrence seven years after initial diagnosis. Of these patients, 22% of patients (n=57) were considered to have inadequate follow-up care. In univariate analysis, younger age (p<0.001), primary diagnosis of ALL compared to lymphoma (p=0.007), low risk strata (p=0.014) and closer proximity to primary treatment center or Children's Oncology Group-affiliate site (p=0.013 and 0.026, respectively) were associated with a higher likelihood of follow-up care. Multivariable logistic regression modeling is currently ongoing. Conclusions: Integration of EHR and cancer registry data represents a feasible and novel approach to construct a cohort of childhood ALL and lymphoma survivors, risk stratify patients based on treatment exposures, and assess adequate subspecialty survivorship care. Future applications include refinement of treatment exposures through the EHR, adherence to guideline recommendations, and other late effects outcomes. Disclosures: No relevant conflicts of interest to declare.
32

Campbell, Elizabeth A., Mitchell G. Maltenfort, Justine Shults, Christopher B. Forrest, and Aaron J. Masino. "Characterizing clinical pediatric obesity subtypes using electronic health record data." PLOS Digital Health 1, no. 8 (August 4, 2022): e0000073. http://dx.doi.org/10.1371/journal.pdig.0000073.

Full text
Abstract:
In this work, we present a study of electronic health record (EHR) data that aims to identify pediatric obesity clinical subtypes. Specifically, we examine whether certain temporal condition patterns associated with childhood obesity incidence tend to cluster together to characterize subtypes of clinically similar patients. In a previous study, the sequence mining algorithm, SPADE was implemented on EHR data from a large retrospective cohort (n = 49 594 patients) to identify common condition trajectories surrounding pediatric obesity incidence. In this study, we used Latent Class Analysis (LCA) to identify potential subtypes formed by these temporal condition patterns. The demographic characteristics of patients in each subtype are also examined. An LCA model with 8 classes was developed that identified clinically similar patient subtypes. Patients in Class 1 had a high prevalence of respiratory and sleep disorders, patients in Class 2 had high rates of inflammatory skin conditions, patients in Class 3 had a high prevalence of seizure disorders, and patients in Class 4 had a high prevalence of Asthma. Patients in Class 5 lacked a clear characteristic morbidity pattern, and patients in Classes 6, 7, and 8 had a high prevalence of gastrointestinal issues, neurodevelopmental disorders, and physical symptoms respectively. Subjects generally had high membership probability for a single class (>70%), suggesting shared clinical characterization within the individual groups. We identified patient subtypes with temporal condition patterns that are significantly more common among obese pediatric patients using a Latent Class Analysis approach. Our findings may be used to characterize the prevalence of common conditions among newly obese pediatric patients and to identify pediatric obesity subtypes. 
The identified subtypes align with prior knowledge on comorbidities associated with childhood obesity, including gastro-intestinal, dermatologic, developmental, and sleep disorders, as well as asthma.
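As a rough illustration of the class membership probabilities mentioned in the abstract above, the following Python sketch computes posterior class probabilities under a textbook latent class model with binary indicators. The two classes, the priors, and the item probabilities are made-up values, not estimates from the study, and the function name `class_posteriors` is hypothetical.

```python
def class_posteriors(pattern, class_priors, item_probs):
    """Posterior P(class | pattern) for a latent class model with binary items.

    pattern      -- tuple of 0/1 indicators (e.g., condition pattern present or absent)
    class_priors -- prior probability of each class
    item_probs   -- item_probs[c][j] = P(item j = 1 | class c)
    """
    joint = []
    for prior, probs in zip(class_priors, item_probs):
        likelihood = prior
        for x, p in zip(pattern, probs):
            # Items are assumed conditionally independent given the class.
            likelihood *= p if x else (1.0 - p)
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]

# Illustrative values only: two classes, two binary condition indicators.
priors = [0.5, 0.5]
item_probs = [[0.9, 0.8], [0.1, 0.2]]
posteriors = class_posteriors((1, 1), priors, item_probs)
```

With these toy inputs the first class receives most of the posterior mass, which is the kind of "high membership probability for a single class" the abstract refers to.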
33

Kimura, M. "Health IT in Asia-Pacific Region." Methods of Information in Medicine 50, no. 04 (2011): 378–79. http://dx.doi.org/10.1055/s-0038-1625136.

Abstract:
Summary APAMI 2009 was held in Hiroshima, Japan, on November 22–24, 2009. This issue includes two selected papers recommended by the programming committee co-chairs. They are “Lessons Learned from Data Mining of WHO Mortality Database” and “Survey on Medical Records and EHR in Asia-Pacific Region – Languages, Purposes, IDs and Regulations”. The theme of APAMI 2009 was "What are the medical records for?" A Hiroshima episode; Medical Records at Dr. Ban’s Clinic, 9 km from the Epicenter of A-Bomb, is included, which lets us think of the fundamental purpose of medical records.
34

Noyd, David, Claire Howell, Kevin Oeffinger, Daniel Landi, and Kristin Schroeder. "EPID-16. INTEGRATION OF EHR AND CANCER REGISTRY DATA TO CONSTRUCT A PEDIATRIC NEURO-ONCOLOGY SURVIVORSHIP COHORT AND IMPROVE LONG-TERM FOLLOW-UP CARE." Neuro-Oncology 22, Supplement_3 (December 1, 2020): iii322. http://dx.doi.org/10.1093/neuonc/noaa222.202.

Abstract:
Abstract BACKGROUND Pediatric neuro-oncology (PNO) survivors suffer long-term physical and neurocognitive morbidity. Comprehensive care addressing late effects of brain tumors and treatment in these patients is important. Clinical guidelines offer a framework for evaluating late effects, yet lack of extended follow-up is a significant barrier. The electronic health record (EHR) allows novel and impactful opportunities to construct, maintain, and leverage survivorship cohorts for health care delivery and as a platform for research. METHODS This survivorship cohort includes all PNO cases ≤18-years-old reported to the state-mandated cancer registry by our institution. Data mining of the EHR for exposures, demographic, and clinical data identified patients with lack of extended follow-up (>1000 days since last visit). Explanatory variables included age, race/ethnicity, and language. Primary outcome included date of last clinic visit. RESULTS Between January 1, 2013 and December 31, 2018, there were 324 PNO patients reported to our institutional registry with ongoing analysis to identify the specific survivorship cohort. Thirty patients died with an overall mortality of 9.3%. Two-hundred-and-sixteen patients were seen in PNO clinic, of which 18.5% (n=40) did not receive extended follow-up. Patients without extended follow-up were an average of 3.5 years older (p<0.01); however, there was no significant difference in preferred language (p=0.97) or race/ethnicity (p=0.57). CONCLUSION Integration of EHR and cancer registry data represents a feasible, timely, and novel approach to construct a PNO survivorship cohort to identify and re-engage patients without extended follow-up. Future applications include analysis of exposures and complications during therapy on late effects outcomes.
35

Kirola, Madhu, Minakshi Memoria, Ankur Dumka, Amrendra Tripathi, and Kapil Joshi. "A Comprehensive Review Study on: Optimized Data Mining, Machine Learning and Deep Learning Techniques for Breast Cancer Prediction in Big Data Context." Biomedical and Pharmacology Journal 15, no. 1 (March 31, 2022): 13–25. http://dx.doi.org/10.13005/bpj/2339.

Abstract:
In recent years, big data in health care has commonly been used for disease prediction. Breast cancer is the most common cancer among urban Indian women and among women worldwide, with incidence varying widely across nations and regions. According to the WHO, breast cancer accounts for 14% of all cancers in women and is also a well-known cancer among women in India. Little research has been done on breast cancer prediction using big data. Big data is now triggering a revolution in healthcare, resulting in better and more optimized outcomes. Rapid technological advancements have increased data generation; EHR (Electronic Health Record) systems produce a massive amount of patient-level data. In the healthcare industry, applications of big data will help to improve outcomes. However, traditional prediction models are less efficient in terms of accuracy and error rate. This review article presents a comparative assessment of complex data mining, machine learning, and deep learning models used for identifying breast cancer, because the accuracy of any particular algorithm depends on various factors such as the implementation framework, the datasets (small or large), and the types of dataset used (attribute-based or image-based). The aim of this review article is to help choose appropriate breast cancer prediction techniques, specifically in the big data environment, to produce effective and efficient results, because “early detection is the key to prevention in case of any cancer”.
36

Wang, Liqin, Suzanne V. Blackley, Kimberly G. Blumenthal, Sharmitha Yerneni, Foster R. Goss, Ying-Chih Lo, Sonam N. Shah, et al. "A dynamic reaction picklist for improving allergy reaction documentation in the electronic health record." Journal of the American Medical Informatics Association 27, no. 6 (May 17, 2020): 917–23. http://dx.doi.org/10.1093/jamia/ocaa042.

Abstract:
Abstract Objective Incomplete and static reaction picklists in the allergy module led to free-text and missing entries that inhibit the clinical decision support intended to prevent adverse drug reactions. We developed a novel, data-driven, “dynamic” reaction picklist to improve allergy documentation in the electronic health record (EHR). Materials and Methods We split 3 decades of allergy entries in the EHR of a large Massachusetts healthcare system into development and validation datasets. We consolidated duplicate allergens and those with the same ingredients or allergen groups. We created a reaction value set via expert review of a previously developed value set and then applied natural language processing to reconcile reactions from structured and free-text entries. Three association rule-mining measures were used to develop a comprehensive reaction picklist dynamically ranked by allergen. The dynamic picklist was assessed using recall at top k suggested reactions, comparing performance to the static picklist. Results The modified reaction value set contained 490 reaction concepts. Among 4 234 327 allergy entries collected, 7463 unique consolidated allergens and 469 unique reactions were identified. Of the 3 dynamic reaction picklists developed, the 1 with the optimal ranking achieved recalls of 0.632, 0.763, and 0.822 at the top 5, 10, and 15, respectively, significantly outperforming the static reaction picklist ranked by reaction frequency. Conclusion The dynamic reaction picklist developed using EHR data and a statistical measure was superior to the static picklist and suggested proper reactions for allergy documentation. Further studies might evaluate the usability and impact on allergy documentation in the EHR.
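The evaluation metric named in the abstract above, recall at the top k suggested reactions, can be sketched in a few lines of Python. This is a minimal illustration that assumes recall at k means the fraction of documented reactions found among the first k picklist suggestions; the function name, the ranking, and the reactions below are hypothetical and are not taken from the study.

```python
def recall_at_k(ranked_reactions, documented_reactions, k):
    """Fraction of documented reactions that appear among the top-k suggestions."""
    if not documented_reactions:
        raise ValueError("documented_reactions must not be empty")
    top_k = set(ranked_reactions[:k])
    hits = sum(1 for reaction in documented_reactions if reaction in top_k)
    return hits / len(documented_reactions)

# Hypothetical picklist ranking for one allergen and the reactions actually documented.
ranked = ["rash", "hives", "itching", "nausea", "swelling", "anaphylaxis"]
documented = {"hives", "anaphylaxis"}

r_at_3 = recall_at_k(ranked, documented, 3)  # only "hives" is in the top 3
r_at_6 = recall_at_k(ranked, documented, 6)  # both reactions are in the top 6
```

Averaging this value over all allergy entries in a validation set would give figures comparable in kind to the recalls at 5, 10, and 15 reported in the abstract.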
37

Tung, Tsan-Hua, Poching DeLaurentis, and Yuehwern Yih. "Uncovering Discrepancies in IV Vancomycin Infusion Records between Pump Logs and EHR Documentation." Applied Clinical Informatics 13, no. 04 (August 2022): 891–900. http://dx.doi.org/10.1055/s-0042-1756428.

Abstract:
Abstract Background Infusion start time, completion time, and interruptions are the key data points needed in both area under the concentration–time curve (AUC)- and trough-based vancomycin therapeutic drug monitoring (TDM). However, little is known about the accuracy of documented times of drug infusions compared with automated recorded events in the infusion pump system. A traditional approach of direct observations of infusion practice is resource intensive and impractical to scale. We need a new methodology to leverage the infusion pump event logs to understand the prevalence of timestamp discrepancies as documented in the electronic health records (EHRs). Objectives We aimed to analyze timestamp discrepancies between EHR documentation (the information used for clinical decision making) and pump event logs (actual administration process) for vancomycin treatment as it may lead to suboptimal data used for therapeutic decisions. Methods We used process mining to study the conformance between pump event logs and EHR data for a single hospital in the United States from July to December 2016. An algorithm was developed to link records belonging to the same infusions. We analyzed discrepancies in infusion start time, completion time, and interruptions. Results Of the 1,858 infusions, 19.1% had infusion start time discrepancy more than ± 10 minutes. Of the 487 infusion interruptions, 2.5% lasted for more than 20 minutes before the infusion resumed. 24.2% (312 of 1,287) of 1-hour infusions and 32% (114 of 359) of 2-hour infusions had over 10-minute completion time discrepancy. We believe those discrepancies are an inherent part of the current EHR documentation process commonly found in hospitals, not unique to the care facility under study. Conclusion We demonstrated that pump event logs and EHR data can be utilized to study time discrepancies in infusion administration at scale.
Such discrepancies should be further investigated at different hospitals to address the prevalence of the problem and improvement effort.
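The discrepancy measure reported in the abstract above (the share of infusions whose documented and pump-logged times differ by more than 10 minutes) can be illustrated with a short Python sketch. It assumes the EHR and pump records have already been linked into pairs; the linking algorithm itself is not described in the abstract, and the function name and timestamps below are hypothetical.

```python
from datetime import datetime, timedelta

def share_over_threshold(linked_pairs, threshold=timedelta(minutes=10)):
    """Fraction of (ehr_time, pump_time) pairs that differ by more than threshold."""
    if not linked_pairs:
        raise ValueError("linked_pairs must not be empty")
    over = sum(1 for ehr_time, pump_time in linked_pairs
               if abs(ehr_time - pump_time) > threshold)
    return over / len(linked_pairs)

# Hypothetical linked records: (EHR-documented start, pump-logged start).
pairs = [
    (datetime(2016, 7, 1, 8, 0), datetime(2016, 7, 1, 8, 4)),     # 4 min apart
    (datetime(2016, 7, 1, 12, 0), datetime(2016, 7, 1, 12, 25)),  # 25 min apart
    (datetime(2016, 7, 2, 8, 0), datetime(2016, 7, 2, 7, 55)),    # 5 min apart
    (datetime(2016, 7, 2, 20, 0), datetime(2016, 7, 2, 20, 10)),  # exactly 10 min
]
share = share_over_threshold(pairs)
```

The comparison is strict, so a difference of exactly 10 minutes is not counted as a discrepancy; whether the study treated the boundary this way is not stated.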
38

Tutuko, Bambang, Siti Nurmaini, Muhammad Naufal Rachmatullah, Annisa Darmawahyuni, and Firdaus Firdaus. "A Deep Learning Approach to Integrate Medical Big Data for Improving Health Services in Indonesia." Computer Engineering and Applications Journal 9, no. 1 (February 1, 2020): 17–28. http://dx.doi.org/10.18495/comengapp.v9i1.328.

Abstract:
Medical informatics to support health services in Indonesia is proposed in this paper. The paper focuses on the analysis of big data for health care purposes, with the aim of improving and developing clinical decision support systems (CDSS) or assessing medical data for both quality assurance and accessibility of health services. Electronic health records (EHRs) are very rich in medical data sourced from patients. All the data can be aggregated to produce information, which includes medical history details such as diagnostic tests, medicines and treatment plans, immunization records, allergies, radiological images, multivariate sensor devices, laboratories, and test results. All of this information provides a valuable understanding of disease management systems. In Indonesia, which has many rural areas with a limited number of doctors, this is an important case to investigate. Data mining of large-scale individual and population data through EHRs can be combined with mobile networks and social media to inform health and public policy. To support this research, many researchers have applied the deep learning (DL) approach to data-mining problems related to health informatics. However, in practice, the use of DL is still questionable because achieving optimal performance requires relatively large data and resources, while other learning algorithms are relatively fast, produce comparable performance with fewer resources and less parameterization, and have better interpretability. In this paper, the advantages of deep learning for designing medical informatics are described, because such an approach is needed to build a good CDSS for health services.
39

Amato, Michael S., Sherine El-Toukhy, Lorien C. Abroms, Henry Goodfellow, Alex T. Ramsey, Tracey Brown, Helena Jopling, and Zarnie Khadjesari. "Mining Electronic Health Records to Promote the Reach of Digital Interventions for Cancer Prevention Through Proactive Electronic Outreach: Protocol for the Mixed Methods OptiMine Study." JMIR Research Protocols 9, no. 12 (December 31, 2020): e23669. http://dx.doi.org/10.2196/23669.

Abstract:
Background Digital behavior change interventions have demonstrated effectiveness for smoking cessation and reducing alcohol intake, which ultimately reduce cancer risk. Leveraging electronic health records (EHR) to identify at-risk patients and increasing the reach of digital interventions through proactive electronic outreach provide a novel approach that may increase the number of individuals who engage with evidence-based treatment. Objective This study aims to increase the reach of digital behavior change interventions by implementing a proactive electronic message system for smoking cessation and alcohol reduction among a large, at-risk population identified through an acute hospital EHR. Methods This protocol describes a 3-phase, mixed-methods implementation study to assess the acceptability, feasibility, and reach of a proactive electronic message system to digital interventions using a hospital’s EHR system to identify eligible patients. In Phase 1, we will conduct focus group discussions with patients and hospital staff to assess the overall acceptability of the electronic message system. In Phase 2, we will conduct a descriptive analysis of the patient population in the hospital EHR regarding target risk behaviors and other person-level characteristics to determine the project’s feasibility and potential reach. In Phase 3, we will send proactive messages to patients identified as smokers or risky drinkers. Messages will encourage and provide access to behavior change mobile apps via an embedded link; the primary outcome will be the proportion of participants who click on the link to access information about the apps. Results At the time of initial protocol submission, data collection was complete, but analysis had not begun. This study was funded by Cancer Research UK from April 2019 to March 2020. Health Research Authority approval was granted in June 2019. 
Conclusions Increasing the reach of digital behavior change interventions can improve population health by reducing the burden of preventable death and disease. International Registered Report Identifier (IRRID) DERR1-10.2196/23669
40

Yeatman, Timothy Joseph, Mark Watson, and Adam Chasse. "Leveraging the integrated EHR for trial matching across a nationwide network." Journal of Clinical Oncology 38, no. 4_suppl (February 1, 2020): 165. http://dx.doi.org/10.1200/jco.2020.38.4_suppl.165.

Abstract:
165 Background: The Guardian Research Network (GRN) is a nationwide consortium of integrated health systems, who share their electronic health records (EHR) to democratize clinical trial access through improvements in “process”. The GRN is a unique-in-class, free-to-join, non-exclusive consortium that mines the digital EHR (including labs, medications, demographics, and non-discrete, all-text data) nightly for clinical trial candidates. Using a suite of NLP and AI tools, the GRN dramatically improves the efficiency of the clinical research staff, by electronically searching all records for the I/E criteria for trials. The GRN uses a central IRB, one contract and legal review, promising to revolutionize the trial accrual process and speed drug development. Methods: With a database of > 1.0M patients, the GRN reviews all active cancer records nightly from > 110 member hospitals to produce lists of trial candidates. Comprehensive electronic screens were filtered by manual reviews to rapidly find the best candidates. Results: Our data collected over 10 mo suggest comprehensive electronic queries examining hundreds of thousands of records daily eliminate > 90% of ineligible patients in minutes. Manual review further refines the eligible list. This is vastly different from current opportunistic screening approaches that examine only a tiny fraction of potential trial candidates (last week's new patients). Conclusions: The GRN has executed an all-inclusive approach to trial accrual, embedding a scalable database search technology within an integrated trial network. The novel approach seeks to exponentially expand operational capabilities of CTOs with limited staff, review all eligible patients, and solve for a large unmet need for democratizing trial access in the community. [Table: see text]
41

Hatef, Elham, Masoud Rouhizadeh, Iddrisu Tia, Elyse Lasser, Felicia Hill-Briggs, Jill Marsteller, and Hadi Kharrazi. "Assessing the Availability of Data on Social and Behavioral Determinants in Structured and Unstructured Electronic Health Records: A Retrospective Analysis of a Multilevel Health Care System." JMIR Medical Informatics 7, no. 3 (August 2, 2019): e13802. http://dx.doi.org/10.2196/13802.

Abstract:
Background Most US health care providers have adopted electronic health records (EHRs) that facilitate the uniform collection of clinical information. However, standardized data formats to capture social and behavioral determinants of health (SBDH) in structured EHR fields are still evolving and not adopted widely. Consequently, at the point of care, SBDH data are often documented within unstructured EHR fields that require time-consuming and subjective methods to retrieve. Meanwhile, collecting SBDH data using traditional surveys on a large sample of patients is infeasible for health care providers attempting to rapidly incorporate SBDH data in their population health management efforts. A potential approach to facilitate targeted SBDH data collection is applying information extraction methods to EHR data to prescreen the population for identification of immediate social needs. Objective Our aim was to examine the availability and characteristics of SBDH data captured in the EHR of a multilevel academic health care system that provides both inpatient and outpatient care to patients with varying SBDH across Maryland. Methods We measured the availability of selected patient-level SBDH in both structured and unstructured EHR data. We assessed various SBDH including demographics, preferred language, alcohol use, smoking status, social connection and/or isolation, housing issues, financial resource strains, and availability of a home address. EHR’s structured data were represented by information collected between January 2003 and June 2018 from 5,401,324 patients. EHR’s unstructured data represented information captured for 1,188,202 patients between July 2016 and May 2018 (a shorter time frame because of limited availability of consistent unstructured data). We used text-mining techniques to extract a subset of SBDH factors from EHR’s unstructured data. Results We identified a valid address or zip code for 5.2 million (95.00%) of approximately 5.4 million patients. 
Ethnicity was captured for 2.7 million (50.00%), whereas race was documented for 4.9 million (90.00%) and a preferred language for 2.7 million (49.00%) patients. Information regarding alcohol use and smoking status was coded for 490,348 (9.08%) and 1,728,749 (32.01%) patients, respectively. Using the International Classification of Diseases–10th Revision diagnoses codes, we identified 35,171 (0.65%) patients with information related to social connection/isolation, 10,433 (0.19%) patients with housing issues, and 3543 (0.07%) patients with income/financial resource strain. Of approximately 1.2 million unique patients with unstructured data, 30,893 (2.60%) had at least one clinical note containing phrases referring to social connection/isolation, 35,646 (3.00%) included housing issues, and 11,882 (1.00%) had mentions of financial resource strain. Conclusions Apart from demographics, SBDH data are not regularly collected for patients. Health care providers should assess the availability and characteristics of SBDH data in EHRs. Evaluating the quality of SBDH data can potentially enable health care providers to modify underlying workflows to improve the documentation, collection, and extraction of SBDH data from EHRs.
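The abstract above mentions text-mining clinical notes for phrases referring to social needs but does not list the lexicon or the tooling. As a loose sketch of the general idea only, the Python below flags notes by whole-phrase matching against small, hypothetical phrase lists; the domains, phrases, and function name are illustrative assumptions, not the study's method.

```python
import re

# Hypothetical phrase lists; the study's actual lexicon is not given in the abstract.
SBDH_PHRASES = {
    "housing": ["homeless", "eviction", "unstable housing"],
    "financial_strain": ["cannot afford", "financial difficulty"],
    "social_isolation": ["lives alone", "socially isolated"],
}

def flag_note(note):
    """Return the set of SBDH domains whose phrases occur in the note text."""
    text = note.lower()
    return {
        domain
        for domain, phrases in SBDH_PHRASES.items()
        # Word boundaries avoid matching a phrase inside a longer word.
        if any(re.search(r"\b" + re.escape(phrase) + r"\b", text) for phrase in phrases)
    }

flags = flag_note("Patient lives alone and reports she cannot afford her medications.")
no_flags = flag_note("No acute distress. Follow up in two weeks.")
```

Plain phrase matching like this ignores negation ("denies being homeless") and context, which is one reason real pipelines typically add further processing.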
42

Gilbertson-White, Stephanie, Sanvesh Srivastava, Yunyi Li, Elyse Laures, Seyedehtanaz Saeidzadeh, Chi Yeung, and Sena Chae. "Multimorbidity, cancer, and symptoms: Using electronic health record data to cluster patients in multimorbidity phenotypes." Journal of Clinical Oncology 37, no. 31_suppl (November 1, 2019): 130. http://dx.doi.org/10.1200/jco.2019.37.31_suppl.130.

Abstract:
130 Background: Cancer-related symptoms are associated with decreased quality of life, increased health care utilization, and shorter life expectancy. There is limited understanding of how multiple chronic conditions (MCC) contribute to variability in symptoms experienced in the context of cancer. Data mining the EHR will allow us to use real clinical data to identify multimorbidity phenotypes based on the clinical similarity of patients. The purpose of this study is to identify distinct subgroups of patients based on the MCC and cancer diagnoses and describe differences across these subgroups. Methods: EHR data was extracted from adult patients (n=2977) newly diagnosed with cancer in 2017 at one academic medical center. The SEER cancer site/histology list was used to group cancer diagnosis. MCC present for >6 months on the problem list or ICD-10 billing data were used. K-Means and K-Modes clustering procedures, with K equaling 7, were used to cluster patients based on MCC. Results: The sample consisted of 58% women, 93% white, with mean age of 62.4 (16.1) years. The most frequent cancers were GI (17%), gynecological (14%), and pulmonary (10%). The most frequent MCC were hypertension (33%), anemia (24%), and metabolic diseases (21%). Seven clusters correspond to the following primary cancer sites: GI, pulmonary, urinary, gynecological, breast, endocrine, and skin. The MCC rates varied significantly across different primary sites with hypertension being present in all clusters, but anemia was present only in GI and urinary system cancers clusters. Conclusions: K-Means and K-Modes clustering procedures, with K equaling 7, produced similar clusters of cancer primary sites and MCCs, indicating our findings are stable and replicable. Our data extraction methods and clustering techniques worked well and can be expanded upon. Our next step is to repeat the data extraction and clustering analysis with the full data from the data warehouse (>30,000 records).
The identified multimorbidity phenotypes will be used as inclusion criteria for prospective research with patients to explore the relationships among MCCs and symptoms in the context of cancer.
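For readers unfamiliar with K-Modes, the categorical counterpart of K-Means named in the abstract above, here is a minimal pure-Python sketch of the general algorithm. It uses K = 2 and six made-up patient records for brevity (the study used K = 7 on 2977 patients), takes the initial modes as an explicit argument so the run is deterministic, and is not the authors' implementation.

```python
from collections import Counter

def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def k_modes(records, initial_modes, max_iter=20):
    """Cluster categorical records; returns (labels, modes)."""
    modes = [tuple(m) for m in initial_modes]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest mode by Hamming distance (ties go to the lowest index).
        new_labels = [min(range(len(modes)), key=lambda j: hamming(r, modes[j]))
                      for r in records]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each mode becomes the most frequent value per attribute.
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = tuple(Counter(col).most_common(1)[0][0]
                                 for col in zip(*members))
    return labels, modes

# Hypothetical records: (hypertension, anemia, metabolic disease) recorded as yes/no.
records = [
    ("yes", "yes", "no"), ("yes", "yes", "no"), ("yes", "no", "no"),
    ("no", "no", "yes"), ("no", "no", "yes"), ("no", "yes", "yes"),
]
labels, modes = k_modes(records, initial_modes=[records[0], records[3]])
```

In practice the initial modes are usually chosen at random or by a seeding heuristic, and the algorithm is rerun several times because the result depends on initialization.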
43

Savova, G. K., K. C. Kipper-Schuler, J. F. Hurdle, and S. M. Meystre. "Extracting Information from Textual Documents in the Electronic Health Record: A Review of Recent Research." Yearbook of Medical Informatics 17, no. 01 (August 2008): 128–44. http://dx.doi.org/10.1055/s-0038-1638592.

Abstract:
Summary Objectives We examine recent published research on the extraction of information from textual documents in the Electronic Health Record (EHR). Methods Literature review of the research published after 1995, based on PubMed, conference proceedings, and the ACM Digital Library, as well as on relevant publications referenced in papers already included. Results 174 publications were selected and are discussed in this review in terms of methods used, pre-processing of textual documents, contextual features detection and analysis, extraction of information in general, extraction of codes and of information for decision-support and enrichment of the EHR, information extraction for surveillance, research, automated terminology management, and data mining, and de-identification of clinical text. Conclusions Performance of information extraction systems with clinical text has improved since the last systematic review in 1995, but they are still rarely applied outside of the laboratory they have been developed in. Competitive challenges for information extraction from clinical text, along with the availability of annotated clinical text corpora, and further improvements in system performance are important factors to stimulate advances in this field and to increase the acceptance and usage of these systems in concrete clinical and biomedical research contexts.
44

Enayati, Moein, Mustafa Sir, Xingyu Zhang, Sarah J. Parker, Elizabeth Duffy, Hardeep Singh, Prashant Mahajan, and Kalyan S. Pasupathy. "Monitoring Diagnostic Safety Risks in Emergency Departments: Protocol for a Machine Learning Study." JMIR Research Protocols 10, no. 6 (June 14, 2021): e24642. http://dx.doi.org/10.2196/24642.

Abstract:
Background Diagnostic decision making, especially in emergency departments, is a highly complex cognitive process that involves uncertainty and susceptibility to errors. A combination of factors, including patient factors (eg, history, behaviors, complexity, and comorbidity), provider-care team factors (eg, cognitive load and information gathering and synthesis), and system factors (eg, health information technology, crowding, shift-based work, and interruptions) may contribute to diagnostic errors. Using electronic triggers to identify records of patients with certain patterns of care, such as escalation of care, has been useful to screen for diagnostic errors. Once errors are identified, sophisticated data analytics and machine learning techniques can be applied to existing electronic health record (EHR) data sets to shed light on potential risk factors influencing diagnostic decision making. Objective This study aims to identify variables associated with diagnostic errors in emergency departments using large-scale EHR data and machine learning techniques. Methods This study plans to use trigger algorithms within EHR data repositories to generate a large data set of records that are labeled trigger-positive or trigger-negative, depending on whether they meet certain criteria. Samples from both data sets will be validated using medical record reviews, upon which we expect to find a higher number of diagnostic safety events in the trigger-positive subset. Machine learning will be used to evaluate relationships between certain patient factors, provider-care team factors, and system-level risk factors and diagnostic safety signals in the statistically matched groups of trigger-positive and trigger-negative charts. Results This federally funded study was approved by the institutional review board of 2 academic medical centers with affiliated community hospitals. 
Trigger queries are being developed at both organizations, and sample cohorts will be labeled using the triggers. Machine learning techniques such as association rule mining, chi-square automated interaction detection, and classification and regression trees will be used to discover important variables that could be incorporated within future clinical decision support systems to help identify and reduce risks that contribute to diagnostic errors. Conclusions The use of large EHR data sets and machine learning to investigate risk factors (related to the patient, provider-care team, and system-level) in the diagnostic process may help create future mechanisms for monitoring diagnostic safety. International Registered Report Identifier (IRRID) DERR1-10.2196/24642
45

Abedi, Vida, Jiang Li, Manu K. Shivakumar, Venkatesh Avula, Durgesh P. Chaudhary, Matthew J. Shellenberger, Harshit S. Khara, et al. "Increasing the Density of Laboratory Measures for Machine Learning Applications." Journal of Clinical Medicine 10, no. 1 (December 30, 2020): 103. http://dx.doi.org/10.3390/jcm10010103.

Abstract:
Background. The imputation of missingness is a key step in Electronic Health Records (EHR) mining, as it can significantly affect the conclusions derived from the downstream analysis in translational medicine. The missingness of laboratory values in EHR is not at random, yet imputation techniques tend to disregard this key distinction. Consequently, the development of an adaptive imputation strategy designed specifically for EHR is an important step in improving the data imbalance and enhancing the predictive power of modeling tools for healthcare applications. Method. We analyzed the laboratory measures derived from Geisinger’s EHR on patients in three distinct cohorts—patients tested for Clostridioides difficile (Cdiff) infection, patients with a diagnosis of inflammatory bowel disease (IBD), and patients with a diagnosis of hip or knee osteoarthritis (OA). We extracted Logical Observation Identifiers Names and Codes (LOINC) from which we excluded those with 75% or more missingness. The comorbidities, primary or secondary diagnosis, as well as active problem lists, were also extracted. The adaptive imputation strategy was designed based on a hybrid approach. The comorbidity patterns of patients were transformed into latent patterns and then clustered. Imputation was performed on a cluster of patients for each cohort independently to show the generalizability of the method. The results were compared with imputation applied to the complete dataset without incorporating the information from comorbidity patterns. Results. We analyzed a total of 67,445 patients (11,230 IBD patients, 10,000 OA patients, and 46,215 patients tested for C. difficile infection). We extracted 495 LOINC and 11,230 diagnosis codes for the IBD cohort, 8160 diagnosis codes for the Cdiff cohort, and 2042 diagnosis codes for the OA cohort based on the primary/secondary diagnosis and active problem list in the EHR. 
Overall, the most improvement from this strategy was observed when the laboratory measures had a higher level of missingness. The best root mean square error (RMSE) difference for each dataset was recorded as −35.5 for the Cdiff, −8.3 for the IBD, and −11.3 for the OA dataset. Conclusions. An adaptive imputation strategy designed specifically for EHR that uses complementary information from the clinical profile of the patient can be used to improve the imputation of missing laboratory values, especially when laboratory codes with high levels of missingness are included in the analysis.
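The core idea in the abstract above, imputing within groups of clinically similar patients rather than over the whole dataset, can be sketched as follows. This Python example uses a simple within-cluster mean as a stand-in; the abstract does not say which imputation algorithm was used, and the cluster labels, lab values, and function name here are hypothetical.

```python
from statistics import mean

def impute_by_cluster(values, clusters):
    """Fill missing values (None) with the mean of observed values in the same cluster.

    Falls back to the overall observed mean when a cluster has no observed values.
    Raises statistics.StatisticsError if every value is missing.
    """
    observed = {}
    for value, cluster in zip(values, clusters):
        if value is not None:
            observed.setdefault(cluster, []).append(value)
    overall = mean(v for v in values if v is not None)
    return [
        value if value is not None
        else (mean(observed[cluster]) if cluster in observed else overall)
        for value, cluster in zip(values, clusters)
    ]

# Hypothetical lab values with gaps, and a comorbidity-based cluster label per patient.
lab_values = [1.0, None, 3.0, 10.0, None, 14.0]
clusters = ["a", "a", "a", "b", "b", "b"]
imputed = impute_by_cluster(lab_values, clusters)
```

A global mean would fill both gaps with 7.0, whereas the cluster-wise version fills them with 2.0 and 12.0, which is the kind of difference the abstract's RMSE comparison is meant to capture.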
46

Hong, Zhen, Qin Xu, Xin Yan, Ran Zhang, Yuanfang Ren, and Qian Tong. "Analysis of Signs and Effects of Surgical Breast Cancer Patients Based on Big Data Technology." Computational Intelligence and Neuroscience 2022 (September 23, 2022): 1–8. http://dx.doi.org/10.1155/2022/3373553.

Abstract:
Big data in health care has gained popularity in recent years for disease prediction. Breast cancer is the most common cancer among urban Indian women, as well as among women internationally, and its incidence varies across countries and regions. Breast cancer is a notable disease among Indian women. According to the WHO, it represents 14% of all cancers in women. A few studies have used big data to predict breast cancer. Big data is transforming healthcare, with better and more optimal results. Massive volumes of patient-level data are generated by EHR (Electronic Health Record) systems because of rapid technological advances. Big data applications in the healthcare industry will help improve outcomes. Conventional prediction models, on the other hand, are less efficient in terms of accuracy and error rate. This review article examines the accuracy of complex data mining, machine learning, and deep learning models used for detecting breast cancer, because the accuracy of a specific algorithm depends on various factors such as the implementation framework, the datasets (small or large), and the types of datasets used (attribute-based or image-based).
The purpose of this review article is to support the selection of suitable breast cancer prediction algorithms, specifically in the big data environment, to deliver effective and efficient results, since “early detection is the key to prevention in case of any cancer.”
47

Mazzotti, Diego, Bethany Staley, Brendan Keenan, Allan Pack, Richard Schwab, and Mary Regina Boland. "399 Using Machine Learning to Inform Extraction of Clinical Data from Sleep Study Reports." Sleep 44, Supplement_2 (May 1, 2021): A158–A159. http://dx.doi.org/10.1093/sleep/zsab072.398.

Abstract:
Introduction: In-laboratory and home sleep studies are important tools for diagnosing sleep disorders. However, only a limited number of measurements are used to inform disease severity, and only specific measures, if any, are stored as structured fields in electronic health records (EHR). We propose a sleep study data extraction approach based on supervised machine learning to facilitate the development of specialized format-specific parsers for large-scale automated sleep data extraction. Methods: Using retrospective data from the Penn Medicine Sleep Center, we identified 64,100 sleep study reports stored in Microsoft Word documents of varying formats, recorded from 2001–2018. A random sample of 200 reports was selected for manual annotation of format (e.g., layout) and type (e.g., baseline, split-night, home sleep apnea tests). Using text mining tools, we extracted 71 document property features (e.g., section dimensions, paragraph and table elements, regular expression matches). We identified 14 different formats and 7 study types. We used these manual annotations as multiclass outcomes in a random forest classifier to evaluate prediction of sleep study format and type using document property features. Out-of-bag (OOB) error rates and the multiclass area under the receiver operating characteristic curve (mAUC) were estimated to evaluate training and testing performance of each model. Results: We successfully predicted sleep study format and type using random forest classifiers. Training OOB error rate was 5.6% for study format and 8.1% for study type. When evaluating these models in independent testing data, the mAUC for classification of study format was 0.85 and for study type was 1.00. When applied to the large universe of diagnostic sleep study reports, we successfully extracted hundreds of discrete fields in 38,252 reports representing 33,696 unique patients.
Conclusion: We accurately classified a sample of sleep study reports according to their format and type, using a random forest multiclass classification method. This informed the development and successful deployment of custom data extraction tools for sleep study reports. The ability to leverage these data can improve understanding of sleep disorders in the clinical setting and facilitate implementation of large-scale research studies within the EHR. Support (if any): American Heart Association (20CDA35310360).
48

Kalenderian, Elsbeth, Enihomo Obadan-Udoh, Alfa Yansane, Karla Kent, Nutan Hebballi, Veronique Delattre, Krisna Kookal, Oluwabunmi Tokede, Joel White, and Muhammad Walji. "Feasibility of Electronic Health Record–Based Triggers in Detecting Dental Adverse Events." Applied Clinical Informatics 09, no. 03 (July 2018): 646–53. http://dx.doi.org/10.1055/s-0038-1668088.

Abstract:
Background: We can now quantify and characterize the harm patients suffer in the dental chair by mining data from electronic health records (EHRs). Most dental institutions currently deploy a random audit of charts using locally developed definitions to identify such patient safety incidents. Instead, selection of patient charts using triggers and assessment through calibrated reviewers may more efficiently identify dental adverse events (AEs). Objective: Our goal was to develop and test EHR-based triggers at four academic institutions and find dental AEs, defined as moderate or severe physical harm due to dental treatment. Methods: We used an iterative and consensus-based process to develop 11 EHR-based triggers to identify dental AEs. Two dental experts at each institution independently reviewed a sample of triggered charts using a common AE definition and classification system. An expert panel provided a second level of review to confirm AEs identified by site reviewers. We calculated the performance of each trigger and identified strategies for improvement. Results: A total of 100 AEs were identified by 10 of the 11 triggers. Pain was the most common AE identified (57% of cases), followed by infection and hard tissue damage. Positive predictive value (PPV) for the triggers ranged from 0 to 0.29. The best performing triggers were those developed to identify infections (PPV = 0.29), allergies (PPV = 0.23), failed implants (PPV = 0.21), and nerve injuries (PPV = 0.19). Most AEs (90%) were categorized as temporary moderate-to-severe harm (E2) and the remainder as permanent moderate-to-severe harm (G2). Conclusion: EHR-based triggers are a promising approach to unearth AEs among dental patients compared with a manual audit of random charts. Data in dental EHRs appear to be sufficiently structured to allow the use of triggers. Pain was the most common AE type, followed by infection and hard tissue damage.
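Editorial illustration: the abstract above reports a positive predictive value (PPV) per trigger. As a minimal sketch, PPV for a trigger can be read as the share of charts flagged by that trigger in which reviewers confirmed an adverse event. The function name, trigger labels, and counts below are hypothetical and are not taken from the study, which does not report per-trigger chart counts in its abstract.

```python
def positive_predictive_value(confirmed: int, flagged: int) -> float:
    """Return confirmed / flagged, the share of flagged charts with a confirmed AE."""
    if flagged <= 0:
        raise ValueError("flagged must be a positive number of charts")
    if not 0 <= confirmed <= flagged:
        raise ValueError("confirmed must be between 0 and flagged")
    return confirmed / flagged


# Hypothetical (confirmed AEs, flagged charts) per trigger; illustrative only.
trigger_counts = {
    "trigger_a": (29, 100),
    "trigger_b": (0, 40),
}

ppv_by_trigger = {
    name: positive_predictive_value(confirmed, flagged)
    for name, (confirmed, flagged) in trigger_counts.items()
}

# Rank triggers from best to worst PPV.
ranked = sorted(ppv_by_trigger, key=ppv_by_trigger.get, reverse=True)
```

A trigger with a PPV of 0 flagged charts but none of them contained a confirmed event, which corresponds to the lower end of the range reported in the abstract.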
49

Li, Dingwen, Patrick G. Lyons, Chenyang Lu, and Marin Kollef. "DeepAlerts: Deep Learning Based Multi-Horizon Alerts for Clinical Deterioration on Oncology Hospital Wards." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 01 (April 3, 2020): 743–50. http://dx.doi.org/10.1609/aaai.v34i01.5417.

Abstract:
Machine learning and data mining techniques are increasingly being applied to electronic health record (EHR) data to discover underlying patterns and make predictions for clinical use. For instance, these data may be evaluated to predict clinical deterioration events such as cardiopulmonary arrest or escalation of care to the intensive care unit (ICU). In clinical practice, early warning systems with multiple time horizons could indicate different levels of urgency, allowing clinicians to make decisions regarding triage, testing, and interventions for patients at risk of poor outcomes. These different horizon alerts are related and have intrinsic dependencies, which elicit multi-task learning. In this paper, we investigate approaches to properly train deep multi-task models for predicting clinical deterioration events via generating multi-horizon alerts for hospitalized patients outside the ICU, with particular application to oncology patients. Prior knowledge is used as a regularization to exploit the positive effects from the task relatedness. Simultaneously, we propose task-specific loss balancing to reduce the negative effects when optimizing the joint loss function of deep multi-task models. In addition, we demonstrate the effectiveness of the feature-generating techniques from prediction outcome interpretation. To evaluate the model performance of predicting multi-horizon deterioration alerts in a real world scenario, we apply our approaches to the EHR data from 20,700 hospitalizations of adult oncology patients. These patients' baseline high-risk status provides a unique opportunity: the application of an accurate model to an enriched population could produce improved positive predictive value and reduce false positive alerts. With our dataset, the model applying all proposed learning techniques achieves the best performance compared with common models previously developed for clinical deterioration warning.
50

Hoffman, Sharona, and Andy Podgurski. "The Use and Misuse of Biomedical Data: Is Bigger Really Better?" American Journal of Law & Medicine 39, no. 4 (December 2013): 497–538. http://dx.doi.org/10.1177/009885881303900401.

Abstract:
Very large biomedical research databases, containing electronic health records (EHR) and genomic data from millions of patients, have been heralded recently for their potential to accelerate scientific discovery and produce dramatic improvements in medical treatments. Research enabled by these databases may also lead to profound changes in law, regulation, social policy, and even litigation strategies. Yet, is “big data” necessarily better data? This paper makes an original contribution to the legal literature by focusing on what can go wrong in the process of biomedical database research and what precautions are necessary to avoid critical mistakes. We address three main reasons for approaching such research with care and being cautious in relying on its outcomes for purposes of public policy or litigation. First, the data contained in biomedical databases is surprisingly likely to be incorrect or incomplete. Second, systematic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of research results, especially in answering causal questions. Third, data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policymakers. In short, this paper sheds much-needed light on the problems of credulous and uninformed acceptance of research results derived from biomedical databases. An understanding of the pitfalls of big data analysis is of critical importance to anyone who will rely on or dispute its outcomes, including lawyers, policymakers, and the public at large. The Article also recommends technical, methodological, and educational interventions to combat the dangers of database errors and abuses.