Contents

Journal articles
Dissertations
Book chapters
Conference papers
Organization reports

A selection of scholarly literature on the topic "Approximate record matching"

Author: Grafiati

Published: 30 May 2022

Updated: 19 July 2025

Browse lists of current articles, books, dissertations, conference papers, and other scholarly sources on the topic "Approximate record matching".

Journal articles on the topic "Approximate record matching"

Verykios, Vassilios S., Ahmed K. Elmagarmid, and Elias N. Houstis. "Automating the approximate record-matching process." Information Sciences 126, no. 1-4 (2000): 83–98. http://dx.doi.org/10.1016/s0020-0255(00)00013-x.

Seleznjev, Oleg, and Bernhard Thalheim. "Random Databases with Approximate Record Matching." Methodology and Computing in Applied Probability 12, no. 1 (2008): 63–89. http://dx.doi.org/10.1007/s11009-008-9092-4.

Rozinek, Ondřej, Jaroslav Marek, Jan Panuš, and Jan Mareš. "Real-Time Fuzzy Record-Matching Similarity Metric and Optimal Q-Gram Filter." Algorithms 18, no. 3 (2025): 150. https://doi.org/10.3390/a18030150.

Abstract:

In this paper, we introduce an advanced Fuzzy Record Similarity Metric (FRMS) that improves approximate record matching and models human perception of record similarity. The FRMS utilizes a newly developed similarity space with favorable properties combined with a metric space, employing a bag-of-words model with general applications in text mining and cluster analysis. To optimize the FRMS, we propose a two-stage method for approximate string matching and search that outperforms baseline methods in terms of average time complexity and F measure on various datasets. In the first stage, we cons

Essex, Aleksander. "Secure Approximate String Matching for Privacy-Preserving Record Linkage." IEEE Transactions on Information Forensics and Security 14, no. 10 (2019): 2623–32. http://dx.doi.org/10.1109/tifs.2019.2903651.

J, Ujwala Rekha, and Shahu Chatrapati K. "Probabilistic multiple correlation based term weighting scheme for measuring similarity of unstructured text records." Indian Journal of Science and Technology 13, no. 11 (2020): 1276–82. https://doi.org/10.17485/IJST/v13i11.2020-31.

Abstract:

Background/Objectives: In this study, a term weighting scheme derived from probabilistic multiple correlation is defined for measuring similarity between unstructured text records. Methods: While the intra-correlation is the correlation of terms in the same record, inter-correlation is the correlation of terms that exist in different records. Probabilistic multiple correlation-based term weighting calculates the weight or relevance of a term by considering its intra-correlation with one or more terms simultaneously. Subsequently, the te

Vasylenko, Oleh. "ANALYSIS OF KEY METADATA FOR IDENTIFYING DUPLICATES IN BIBLIOGRAPHIC RECORDS." Cybersecurity: Education, Science, Technique 3, no. 27 (2025): 87–99. https://doi.org/10.28925/2663-4023.2025.27.700.

Abstract:

This study addresses the issue of duplicate bibliographic records in library information systems, a problem that is becoming increasingly relevant with the growth of digital catalogs. It specifically examines the key metadata fields used for comparing records and identifying duplicate entries. The analysis includes critical metadata fields such as title, ISBN, publisher, place of publication, publication date, pagination, series, and additional attributes used for identifying editions. Special attention is given to the variability of data within these fields, including issues arising from misp

Hanrath, Scott, and Erik Radio. "User search terms and controlled subject vocabularies in an institutional repository." Library Hi Tech 35, no. 3 (2017): 360–67. http://dx.doi.org/10.1108/lht-11-2016-0133.

Abstract:

Purpose: The purpose of this paper is to investigate the search behavior of institutional repository (IR) users in regard to subjects as a means of estimating the potential impact of applying a controlled subject vocabulary to an IR. Design/methodology/approach: Google Analytics data were used to record cases where users arrived at an IR item page from an external web search and subsequently downloaded content. Search queries were compared against the Faceted Application of Subject Terminology (FAST) schema to determine the topical nature of the queries. Queries were also compared against the it

Williams, Richard, David Jenkins, Thomas Bolton, et al. "Replicating a COVID-19 study in a national England database to assess the generalisability of research with regional electronic health record data." BMJ Open 15, no. 4 (2025): e093080. https://doi.org/10.1136/bmjopen-2024-093080.

Abstract:

Objectives: To assess the degree to which we can replicate a study between a regional and a national database of electronic health record data in the UK. The original study examined the risk factors associated with hospitalisation following COVID-19 infection in people with diabetes. Design: A replication of a retrospective cohort study. Setting: Observational electronic health record data from primary and secondary care sources in the UK. The original study used data from a large, urbanised region (Greater Manchester Care Record, Greater Manchester, UK; 2.8 m patients). This replication study used a n

Bianchi Santiago, Josie D., Héctor Colón Jordán, and Didier Valdés. "Record Linkage of Crashes with Injuries and Medical Cost in Puerto Rico." Transportation Research Record: Journal of the Transportation Research Board 2674, no. 10 (2020): 739–48. http://dx.doi.org/10.1177/0361198120935439.

Abstract:

Cost considerations are critical in the analysis and prevention of traffic crashes. Integration of cost data into crash datasets facilitates the crash-cost analyses with all their related attributes. It is, however, a challenging task because of the lack of availability of unique identifiers across the databases and because of privacy and confidentiality regulations. This study performed a record linkage comparison between the deterministic and probabilistic approaches using attributes matching techniques with numerical distance and weight patterns under the Fellegi–Sunter approach. As a resul

Douglas, M. M., D. Gardner, D. Hucker, and S. W. Kendrick. "Best-Link Matching of Scottish Health Data Sets." Methods of Information in Medicine 37, no. 01 (1998): 64–68. http://dx.doi.org/10.1055/s-0038-1634494.

Abstract:

Methods are described used to link the Community Health Index and the National Health Service Central Register (NHSCR) in Scotland to provide a basis for a national patient index. The linkage used a combination of deterministic and probability matching techniques. A best-link principle was used by which each Community Health Index record was allowed to link only to the NHSCR record with which it achieved the highest match weight. This strategy, applied in the context of two files which each covered virtually the entire population of Scotland, increased the accuracy of linkage approxim

Dissertations on the topic "Approximate record matching"

Jupin, Joseph. "Temporal Graph Record Linkage and k-Safe Approximate Match." Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/412419.

Abstract:

Computer and Information Science, Ph.D.

Since the advent of electronic data processing, organizations have accrued vast amounts of data contained in multiple databases with no reliable global unique identifier. These databases were developed by different departments for different purposes at different times. Organizing and analyzing these data for human services requires linking records from all sources. RL (Record Linkage) is a process that connects records that are related to the identical or a sufficiently similar entity from multiple heterogeneous databases. RL is a data and compute i

Tam, Siu-lung. "Linear-size indexes for approximate pattern matching and dictionary matching." Click to view the E-thesis via HKUTO, 2010. http://sunzi.lib.hku.hk/hkuto/record/B44205326.

Todoriko, Olha Oleksiivna. "Моделі та методи очищення та інтеграції текстових даних в інформаційних системах" [Models and methods for cleaning and integrating text data in information systems]. Thesis, Zaporizhzhia National University, 2016. http://repository.kpi.kharkov.ua/handle/KhPI-Press/21856.

Abstract:

Dissertation for the degree of Candidate of Technical Sciences, specialty 05.13.06 (Information Technologies). National Technical University "Kharkiv Polytechnic Institute", Kharkiv, 2016. The dissertation solves a relevant scientific and practical problem: improving the efficiency and quality of the technology for cleaning and integrating text data in reference and search information systems by using models of the inflectional paradigm and a method for building a lexeme index when organizing similarity search. Models of the inflectional paradigm have been developed that include represent

Todoriko, Olha Oleksiivna. "Моделі та методи очищення та інтеграції текстових даних в інформаційних системах" [Models and methods for cleaning and integrating text data in information systems]. Thesis, NTU "KhPI", 2016. http://repository.kpi.kharkov.ua/handle/KhPI-Press/21853.

Vatsalan, Dinusha. "Scalable and approximate privacy-preserving record linkage." PhD thesis, 2014. http://hdl.handle.net/1885/12370.

Abstract:

Record linkage, the task of linking multiple databases with the aim to identify records that refer to the same entity, is occurring increasingly in many application areas. Generally, unique entity identifiers are not available in all the databases to be linked. Therefore, record linkage requires the use of personal identifying attributes, such as names and addresses, to identify matching records that need to be reconciled to the same entity. Often, it is not permissible to exchange personal identifying data across different organizations due to privacy and confidentiality concerns or regulatio

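Dobiášovský, Jan. "Přibližná shoda znakových řetězců a její aplikace na ztotožňování metadat vědeckých publikací" [Approximate string matching and its application to matching the metadata of scientific publications]. Master's thesis, 2020. http://www.nusl.cz/ntk/nusl-415121.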

Abstract:

The thesis explores the application of approximate string matching in scientific publication record linkage process. An introduction to record matching along with five commonly used metrics for string distance (Levenshtein, Jaro, Jaro-Winkler, Cosine distances and Jaccard coefficient) are provided. These metrics are applied on publication metadata from V3S current research information system of the Czech Technical University in Prague. Based on the findings, optimal thresholds in the F1, F2 and F3-measures are determined for each metric.
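The threshold selection described in this abstract can be illustrated with a small sketch. It is not the thesis's code: it assumes a normalised Levenshtein similarity (one of the five metrics named), a hand-made list of labelled title pairs, and an arbitrary grid of candidate thresholds. The names `similarity`, `f_beta`, `best_threshold`, and `PAIRS` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalised Levenshtein similarity in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def f_beta(tp: int, fp: int, fn: int, beta: float) -> float:
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def best_threshold(pairs, beta, candidates):
    """Return (threshold, score) maximising F-beta over labelled pairs.

    pairs is a list of (title_a, title_b, is_match) tuples.
    """
    scored = [(similarity(a, b), is_match) for a, b, is_match in pairs]
    best_t, best_score = None, -1.0
    for t in candidates:
        tp = sum(1 for s, m in scored if s >= t and m)
        fp = sum(1 for s, m in scored if s >= t and not m)
        fn = sum(1 for s, m in scored if s < t and m)
        score = f_beta(tp, fp, fn, beta)
        if score > best_score:  # ties keep the lowest candidate
            best_t, best_score = t, score
    return best_t, best_score


# Hypothetical labelled pairs, invented for illustration only.
PAIRS = [
    ("record matching", "record matchng", True),
    ("data linkage", "data linkage", True),
    ("fuzzy search", "graph theory", False),
]
threshold, score = best_threshold(PAIRS, beta=1.0, candidates=[0.5, 0.7, 0.9])
```

Calling `best_threshold` with `beta=2.0` or `beta=3.0` would correspond to the F2 and F3 measures mentioned in the abstract, which favour recall over precision.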

Book chapters on the topic "Approximate record matching"

Dong, Boxiang, and Hui Wendy Wang. "Efficient Authentication of Approximate Record Matching for Outsourced Databases." In Advances in Intelligent Systems and Computing. Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-319-98056-0_6.

Grannis, Shaun J., J. Marc Overhage, and Clement McDonald. "Real World Performance of Approximate String Comparators for use in Patient Matching." In Studies in Health Technology and Informatics. IOS Press, 2004. https://doi.org/10.3233/978-1-60750-949-3-43.

Abstract:

Medical record linkage is becoming increasingly important as clinical data is distributed across independent sources. To improve linkage accuracy we studied different name comparison methods that establish agreement or disagreement between corresponding names. In addition to exact raw name matching and exact phonetic name matching, we tested three approximate string comparators. The approximate comparators included the modified Jaro-Winkler method, the longest common substring, and the Levenshtein edit distance. We also calculated the combined root-mean square of all three. We tested each name comparison method using a deterministic record linkage algorithm. Results were consistent across both hospitals. At a threshold comparator score of 0.8, the Jaro-Winkler comparator achieved the highest linkage sensitivities of 97.4% and 97.7%. The combined root-mean square method achieved sensitivities higher than the Levenshtein edit distance or longest common substring while sustaining high linkage specificity. Approximate string comparators increase deterministic linkage sensitivity by up to 10% compared to exact match comparisons and represent an accurate method of linking to vital statistics data.
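The idea of combining comparator scores by root-mean square and applying a 0.8 agreement threshold can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the modified Jaro-Winkler comparator is omitted, the normalisation of both scores by the longer name is an assumption, and the function names are hypothetical.

```python
import math


def levenshtein_similarity(a: str, b: str) -> float:
    """1 minus the edit distance divided by the longer length."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def lcs_similarity(a: str, b: str) -> float:
    """Longest common substring length divided by the longer length."""
    if not a or not b:
        return 0.0
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            # Extend the common run ending at the previous characters.
            curr.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, curr[j])
        prev = curr
    return best / max(len(a), len(b))


def root_mean_square(scores) -> float:
    """Combine several comparator scores into one value in [0, 1]."""
    return math.sqrt(sum(s * s for s in scores) / len(scores))


def names_agree(a: str, b: str, comparators, threshold: float = 0.8) -> bool:
    """Treat two names as agreeing when the combined score meets the threshold."""
    scores = [compare(a, b) for compare in comparators]
    return root_mean_square(scores) >= threshold


# Only two comparators are combined here; the study combined three.
COMPARATORS = [levenshtein_similarity, lcs_similarity]
```

In a deterministic linkage rule, a check such as `names_agree(first_a, first_b, COMPARATORS)` would replace the exact-equality test on a name field.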

Margaritis, Dimitris, Christos Faloutsos, and Sebastian Thrun. "NetCube." In Database Technologies. IGI Global, 2009. http://dx.doi.org/10.4018/978-1-60566-058-5.ch120.

Abstract:

We present a novel method for answering count queries from a large database approximately and quickly. Our method implements an approximate DataCube of the application domain, which can be used to answer any conjunctive count query that can be formed by the user. The DataCube is a conceptual device that in principle stores the number of matching records for all possible such queries. However, because its size and generation time are inherently exponential, our approach uses one or more Bayesian networks to implement it approximately. Bayesian networks are statistical graphical models that can succinctly represent the underlying joint probability distribution of the domain, and can therefore be used to calculate approximate counts for any conjunctive query combination of attribute values and “don’t cares.” The structure and parameters of these networks are learned from the database in a preprocessing stage. By means of such a network, the proposed method, called NetCube, exploits correlations and independencies among attributes to answer a count query quickly without accessing the database. Our preprocessing algorithm scales linearly on the size of the database, and is thus scalable; it is also parallelizable with a straightforward parallel implementation. We give an algorithm for estimating the count result of arbitrary queries that is fast (constant) on the database size. Our experimental results show that NetCubes have fast generation and use, achieve excellent compression and have low reconstruction error. Moreover, they naturally allow for visualization and data mining, at no extra cost.

Margaritis, Dimitris, Christos Faloutsos, and Sebastian Thrun. "NetCube." In Bayesian Network Technologies. IGI Global, 2007. http://dx.doi.org/10.4018/978-1-59904-141-4.ch004.

Conference papers on the topic "Approximate record matching"

Gollapalli, Mohammed, Xue Li, Ian Wood, and Guido Governatori. "Approximate Record Matching Using Hash Grams." In 2011 IEEE International Conference on Data Mining Workshops (ICDMW). IEEE, 2011. http://dx.doi.org/10.1109/icdmw.2011.33.

Dong, Boxiang, and Wendy Wang. "ARM: Authenticated Approximate Record Matching for Outsourced Databases." In 2016 IEEE 17th International Conference on Information Reuse and Integration (IRI). IEEE, 2016. http://dx.doi.org/10.1109/iri.2016.86.

Gonçalves, Marcos A. "Session details: Record linkage and approximate matching (DB)." In CIKM07: Conference on Information and Knowledge Management. ACM, 2007. http://dx.doi.org/10.1145/3250795.

Schraagen, Marijn. "Complete Coverage for Approximate String Matching in Record Linkage Using Bit Vectors." In 2011 IEEE 23rd International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, 2011. http://dx.doi.org/10.1109/ictai.2011.116.

Jia, Dan, Yong-Yi Wang, and Steve Rapp. "Material Properties and Flaw Characteristics of Vintage Girth Welds." In 2020 13th International Pipeline Conference. American Society of Mechanical Engineers, 2020. http://dx.doi.org/10.1115/ipc2020-9658.

Abstract:

Vintage pipelines, which in the context of this paper refer to pipelines built before approximately 1970, account for a large portion of the energy pipeline systems in North America. Integrity assessment of these pipelines can sometimes present challenges due to incomplete records and lack of material property data. When material properties for the welds of interest are not available, conservative estimates based on past experience are typically used for the unknown material property values. Such estimates can be overly conservative, potentially leading to unnecessary remedial actions

Hitz, Arne, Anja Konzept, Benedikt Reick, and Klaus Rheinberger. "Efficient GPS Route Matching Method for Battery Electric Bus Fleets." In Conference on Sustainable Mobility. SAE International, 2024. http://dx.doi.org/10.4271/2024-24-0026.

Abstract:

A challenge of public transportation GPS data is the frequent utilization of monitoring systems with low sampling rates, primarily driven by the high costs associated with cellular data transmission of large datasets. Altitude data is often imprecise or not recorded at all in regions without large elevation changes. The low data quality limits the use of the data for further detailed investigations like a realistic energy consumption forecast for assessing the electrical grid load resulting from charging the vehicle flee

Ramakrishnan, Kishore Ranganath, Shoaib Ahmed, Benjamin Wahls, et al. "Gas Turbine Combustor Liner Wall Heat Load Characterization for Different Gaseous Fuels." In ASME 2019 International Mechanical Engineering Congress and Exposition. American Society of Mechanical Engineers, 2019. http://dx.doi.org/10.1115/imece2019-11283.

Abstract:

The knowledge of detailed distribution of heat load on swirl stabilized combustor liner wall is imperative in the development of liner-specific cooling arrangements, aimed towards maintaining uniform liner wall temperatures for reduced thermal stress levels. Heat transfer and fluid flow experiments have been conducted on a swirl stabilized lean premixed combustor to understand the behavior of Methane-, Propane-, and Butane-based flames. These fuels were compared at different equivalence ratios for a matching adiabatic flame temperature of Methane at 0.65 equivalence ratio. Above exper

Cummings, Scott M. "Prediction of Rolling Contact Fatigue Using Instrumented Wheelsets." In ASME 2008 Rail Transportation Division Fall Technical Conference. ASMEDC, 2008. http://dx.doi.org/10.1115/rtdf2008-74013.

Abstract:

The measured wheel/rail forces from four wheels in the leading truck of a coal hopper car during one revenue service roundtrip were used by the Wheel Defect Prevention Research Consortium (WDPRC) to predict rolling contact fatigue (RCF) damage. The data was recorded in March 2005 by TTCI for an unrelated Strategic Research Initiatives project funded by the Association of American Railroads (AAR). RCF damage was predicted in only a small portion of the approximately 4,000 km (2,500 miles) for which data was analyzed. The locations where RCF damage was predicted to occur were examined careful

Organization reports on the topic "Approximate record matching"

Day, Christopher M., Howell Li, Sarah M. L. Hubbard, and Darcy M. Bullock. Observations of Trip Generation, Route Choice, and Trip Chaining with Private-Sector Probe Vehicle GPS Data. Purdue University, 2022. http://dx.doi.org/10.5703/1288284317368.

Abstract:

This paper presents an exploratory study of GPS data from a private-sector data provider for analysis of trip generation, route choice, and trip chaining. The study focuses on travel to and from the Indianapolis International Airport. GPS data consisting of nearly 1 billion waypoints for 12 million trips were collected over a 6-week period in the state of Indiana. Within this data, there were approximately 10,000 trip records indicating travel to facilities associated with the Indianapolis airport. The analysis is based on the matching of waypoints to geographic areas that define the extents of roadwa
