To see the other types of publications on this topic, follow the link: Edit distance.

Journal articles on the topic 'Edit distance'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 50 journal articles for your research on the topic 'Edit distance.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Petty, Taylor, Jan Hannig, Tunde I. Huszar, and Hari Iyer. "A New String Edit Distance and Applications." Algorithms 15, no. 7 (2022): 242. http://dx.doi.org/10.3390/a15070242.

Full text
Abstract:
String edit distances have been used for decades in applications ranging from spelling correction and web search suggestions to DNA analysis. Most string edit distances are variations of the Levenshtein distance and consider only single-character edits. In forensic applications polymorphic genetic markers such as short tandem repeats (STRs) are used. At these repetitive motifs the DNA copying errors consist of more than just single base differences. More often the phenomenon of “stutter” is observed, where the number of repeated units differs (by whole units) from the template. To adapt the Levenshtein distance to be suitable for forensic applications where DNA sequence similarity is of interest, a generalized string edit distance is defined that accommodates the addition or deletion of whole motifs in addition to single-nucleotide edits. A dynamic programming implementation is developed for computing this distance between sequences. The novelty of this algorithm is in handling the complex interactions that arise between multiple- and single-character edits. Forensic examples illustrate the purpose and use of the Restricted Forensic Levenshtein (RFL) distance measure, but applications extend to sequence alignment and string similarity in other biological areas, as well as dynamic programming algorithms more broadly.
APA, Harvard, Vancouver, ISO, and other styles
2

Akutsu, Tatsuya, Daiji Fukagawa, and Atsuhiro Takasu. "Approximating Tree Edit Distance through String Edit Distance." Algorithmica 57, no. 2 (2008): 325–48. http://dx.doi.org/10.1007/s00453-008-9213-z.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Hyyrö, Heikki, and Shunsuke Inenaga. "Dynamic RLE-Compressed Edit Distance Tables Under General Weighted Cost Functions." International Journal of Foundations of Computer Science 29, no. 04 (2018): 623–45. http://dx.doi.org/10.1142/s0129054118410083.

Full text
Abstract:
Kim and Park [A dynamic edit distance table, J. Disc. Algo., 2:302–312, 2004] proposed a method (KP) based on a “dynamic edit distance table” that allows one to efficiently maintain unit cost edit distance information between two strings [Formula: see text] of length [Formula: see text] and [Formula: see text] of length [Formula: see text] when the strings can be modified by single-character edits to their left or right ends. This type of computation is useful e.g. in cyclic string comparison. KP uses linear time, [Formula: see text], to update the distance representation after each single edit. Recently Hyyrö et al. [Incremental string comparison, J. Disc. Algo., 34:2-17, 2015] presented an efficient method for maintaining the dynamic edit distance table under general weighted edit distance, running in [Formula: see text] time per single edit, where [Formula: see text] is the maximum weight of the cost function. The work noted that the [Formula: see text] space requirement, and not the running time, may be the main bottleneck in using the dynamic edit distance table. In this paper we take the first steps towards reducing the space usage of the dynamic edit distance table by RLE compressing [Formula: see text] and [Formula: see text]. Let [Formula: see text] and [Formula: see text] be the lengths of RLE compressed versions of [Formula: see text] and [Formula: see text], respectively. We propose how to store the dynamic edit distance table using [Formula: see text] space while maintaining the same time complexity as the previous methods for uncompressed strings.
APA, Harvard, Vancouver, ISO, and other styles
4

Jie Wei. "Markov edit distance." IEEE Transactions on Pattern Analysis and Machine Intelligence 26, no. 3 (2004): 311–21. http://dx.doi.org/10.1109/tpami.2004.1262315.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Kim, HyunJin. "A k-mismatch string matching for generalized edit distance using diagonal skipping method." PLOS ONE 16, no. 5 (2021): e0251047. http://dx.doi.org/10.1371/journal.pone.0251047.

Full text
Abstract:
This paper proposes an approximate string matching with k-mismatches when calculating the generalized edit distance. When the edit distance is generalized, more sophisticated string matching can be provided. However, the execution time increases because of the bundle of complex computations for calculating complicated edit distances. The computational costs for finding which steps or edit distances are over k-mismatches cannot be significant in the generalized edit distance metric. Therefore, we can reduce the execution time by determining steps over k-mismatches and then skipping them. The diagonal step calculations using the pruning register skips unnecessary distance calculations over k-mismatches. The overhead of control statements and reordered memory accesses can be amortized by skipping multiple steps. Even though the proposed skipping method requires additional overhead, the proposed scheme’s practical embodiments show that the execution time of string matching is reduced significantly when k is small.
APA, Harvard, Vancouver, ISO, and other styles
6

McGrane, Martin, and Michael A. Charleston. "Biological Network Edit Distance." Journal of Computational Biology 23, no. 9 (2016): 776–88. http://dx.doi.org/10.1089/cmb.2016.0062.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Ristad, E. S., and P. N. Yianilos. "Learning string-edit distance." IEEE Transactions on Pattern Analysis and Machine Intelligence 20, no. 5 (1998): 522–32. http://dx.doi.org/10.1109/34.682181.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Myers, R., R. C. Wison, and E. R. Hancock. "Bayesian graph edit distance." IEEE Transactions on Pattern Analysis and Machine Intelligence 22, no. 6 (2000): 628–35. http://dx.doi.org/10.1109/34.862201.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Aratsu, Taku, Kouichi Hirata, and Tetsuji Kuboyama. "Approximating Tree Edit Distance through String Edit Distance for Binary Tree Codes." Fundamenta Informaticae 101, no. 3 (2010): 157–71. http://dx.doi.org/10.3233/fi-2010-282.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Cortés, Xavier, Donatello Conte, and Hubert Cardot. "Learning edit cost estimation models for graph edit distance." Pattern Recognition Letters 125 (July 2019): 256–63. http://dx.doi.org/10.1016/j.patrec.2019.05.001.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Kari, Lila, Stavros Konstantinidis, Steffen Kopecki, and Meng Yang. "Efficient Algorithms for Computing the Inner Edit Distance of a Regular Language via Transducers." Algorithms 11, no. 11 (2018): 165. http://dx.doi.org/10.3390/a11110165.

Full text
Abstract:
The concept of edit distance and its variants has applications in many areas such as computational linguistics, bioinformatics, and synchronization error detection in data communications. Here, we revisit the problem of computing the inner edit distance of a regular language given via a Nondeterministic Finite Automaton (NFA). This problem relates to the inherent maximal error-detecting capability of the language in question. We present two efficient algorithms for solving this problem, both of which execute in time O ( r 2 n 2 d ) , where r is the cardinality of the alphabet involved, n is the number of transitions in the given NFA, and d is the computed edit distance. We have implemented one of the two algorithms and present here a set of performance tests. The correctness of the algorithms is based on the connection between word distances and error detection and the fact that nondeterministic transducers can be used to represent the errors (resp., edit operations) involved in error-detection (resp., in word distances).
APA, Harvard, Vancouver, ISO, and other styles
12

Kovács, Gábor, Gábor Árpád Németh, Zoltán Pap, and Mahadevan Subramaniam. "Deriving Compact Test Suites for Telecommunication Software Using Distance Metrics." Journal of Communications Software and Systems 5, no. 2 (2009): 57. http://dx.doi.org/10.24138/jcomss.v5i2.205.

Full text
Abstract:
This paper proposes a string edit distance based test selection method to generate compact test sets for telecommunications software. Following the results of previous research, a trace in a test set is considered to be redundant if its edit distance from others is less than a given parameter. The algorithm first determines the minimum cardinality of the target test set inaccordance with the provided parameter, then it selects the test set with the highest sum of internal edit distances. The selection problem is reduced to an assignment problem in bipartite graphs.
APA, Harvard, Vancouver, ISO, and other styles
13

Winter, Felix, Nysret Musliu, and Peter Stuckey. "Explaining Propagators for String Edit Distance Constraints." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 02 (2020): 1676–83. http://dx.doi.org/10.1609/aaai.v34i02.5530.

Full text
Abstract:
The computation of string similarity measures has been thoroughly studied in the scientific literature and has applications in a wide variety of different areas. One of the most widely used measures is the so called string edit distance which captures the number of required edit operations to transform a string into another given string. Although polynomial time algorithms are known for calculating the edit distance between two strings, there also exist NP-hard problems from practical applications like scheduling or computational biology that constrain the minimum edit distance between arrays of decision variables. In this work, we propose a novel global constraint to formulate restrictions on the minimum edit distance for such problems. Furthermore, we describe a propagation algorithm and investigate an explanation strategy for an edit distance constraint propagator that can be incorporated into state of the art lazy clause generation solvers. Experimental results show that the proposed propagator is able to significantly improve the performance of existing exact methods regarding solution quality and computation speed for benchmark problems from the literature.
APA, Harvard, Vancouver, ISO, and other styles
14

Shapira, Dana, and James A. Storer. "Edit Distance with Block Deletions." Algorithms 4, no. 1 (2011): 40–60. http://dx.doi.org/10.3390/a4010040.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Sridharamurthy, Raghavendra, Talha Bin Masood, Adhitya Kamakshidasan, and Vijay Natarajan. "Edit Distance between Merge Trees." IEEE Transactions on Visualization and Computer Graphics 26, no. 3 (2020): 1518–31. http://dx.doi.org/10.1109/tvcg.2018.2873612.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Kim, Sung-Ryul, and Kunsoo Park. "A dynamic edit distance table." Journal of Discrete Algorithms 2, no. 2 (2004): 303–12. http://dx.doi.org/10.1016/s1570-8667(03)00082-0.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Shapira, Dana, and James A. Storer. "Edit distance with move operations." Journal of Discrete Algorithms 5, no. 2 (2007): 380–92. http://dx.doi.org/10.1016/j.jda.2005.01.010.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Touzet, Hélène. "Tree edit distance with gaps." Information Processing Letters 85, no. 3 (2003): 123–29. http://dx.doi.org/10.1016/s0020-0190(02)00369-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Amir, Amihood, Estrella Eisenberg, and Ely Porat. "Swap and mismatch edit distance." Algorithmica 45, no. 1 (2006): 109–20. http://dx.doi.org/10.1007/s00453-005-1192-8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Axenovich, Maria, and Ryan R. Martin. "Multicolor and directed edit distance." Journal of Combinatorics 2, no. 4 (2011): 525–56. http://dx.doi.org/10.4310/joc.2011.v2.n4.a4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Dzido, Tomasz, and Krzysztof Krzywdziński. "Edit distance measure for graphs." Czechoslovak Mathematical Journal 65, no. 3 (2015): 829–36. http://dx.doi.org/10.1007/s10587-015-0211-4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Akutsu, Tatsuya. "A relation between edit distance for ordered trees and edit distance for Euler strings." Information Processing Letters 100, no. 3 (2006): 105–9. http://dx.doi.org/10.1016/j.ipl.2006.06.002.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

SOLÉ-RIBALTA, ALBERT, FRANCESC SERRATOSA, and ALBERTO SANFELIU. "ON THE GRAPH EDIT DISTANCE COST: PROPERTIES AND APPLICATIONS." International Journal of Pattern Recognition and Artificial Intelligence 26, no. 05 (2012): 1260004. http://dx.doi.org/10.1142/s021800141260004x.

Full text
Abstract:
We model the edit distance as a function in a labeling space. A labeling space is an Euclidean space where coordinates are the edit costs. Through this model, we define a class of cost. A class of cost is a region in the labeling space that all the edit costs have the same optimal labeling. Moreover, we characterize the distance value through the labeling space. This new point of view of the edit distance gives us the opportunity of defining some interesting properties that are useful for a better understanding of the edit distance. Finally, we show the usefulness of these properties through some applications.
APA, Harvard, Vancouver, ISO, and other styles
24

HAN, YO-SUB, SANG-KI KO, and KAI SALOMAA. "THE EDIT-DISTANCE BETWEEN A REGULAR LANGUAGE AND A CONTEXT-FREE LANGUAGE." International Journal of Foundations of Computer Science 24, no. 07 (2013): 1067–82. http://dx.doi.org/10.1142/s0129054113400315.

Full text
Abstract:
The edit-distance between two strings is the smallest number of operations required to transform one string into the other. The distance between languages L1and L2is the smallest edit-distance between string wi∈ Li, i = 1, 2. We consider the problem of computing the edit-distance of a given regular language and a given context-free language. First, we present an algorithm that finds for the languages an optimal alignment, that is, a sequence of edit operations that transforms a string in one language to a string in the other. The length of the optimal alignment, in the worst case, is exponential in the size of the given grammar and finite automaton. Then, we investigate the problem of computing only the edit-distance of the languages without explicitly producing an optimal alignment. We design a polynomial time algorithm that calculates the edit-distance based on unary homomorphisms.
APA, Harvard, Vancouver, ISO, and other styles
25

Kalenkova, Anna A., and Danil A. Kolesnikov. "Application of Genetic Algorithms for Finding Edit Distance between Process Models." Modeling and Analysis of Information Systems 25, no. 6 (2018): 711–25. http://dx.doi.org/10.18255/1818-1015-2018-6-711-725.

Full text
Abstract:
Finding graph-edit distance (graph similarity) is an important task in many computer science areas, such as image analysis, machine learning, chemicalinformatics. Recently, with the development of process mining techniques, it became important to adapt and apply existing graph analysis methods to examine process models (annotated graphs) discovered from event data. In particular, finding graph-edit distance techniques can be used to reveal patterns (subprocesses), compare discovered process models. As it was shown experimentally and theoretically justified, exact methods for finding graph-edit distances between discovered process models (and graphs in general) are computationally expensive and can be applied to small models only. In this paper, we present and assess accuracy and performance characteristics of an inexact genetic algorithm applied to find distances between process models discovered from event logs. In particular, we find distances between BPMN (Business Process Model and Notation) models discovered from event logs by using different process discovery algorithms. We show that the genetic algorithm allows us to dramatically reduce the time of comparison and produces results which are close to the optimal solutions (minimal graph edit distances calculated by the exact search algorithm).
APA, Harvard, Vancouver, ISO, and other styles
26

JIA, Nan, Xiao-dong FU, Yuan HUANG, Xiao-yan LIU, and Zhi-hua DAI. "Workflow distance metric based on tree edit distance." Journal of Computer Applications 32, no. 12 (2013): 3529–33. http://dx.doi.org/10.3724/sp.j.1087.2012.03529.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Khalid, Madiha, Muhammad Murtaza Yousaf, and Muhammad Umair Sadiq. "Toward Efficient Similarity Search under Edit Distance on Hybrid Architectures." Information 13, no. 10 (2022): 452. http://dx.doi.org/10.3390/info13100452.

Full text
Abstract:
Edit distance is the most widely used method to quantify similarity between two strings. We investigate the problem of similarity search under edit distance. Given a collection of sequences, the goal of similarity search under edit distance is to find sequences in the collection that are similar to a given query sequence where the similarity score is computed using edit distance. The canonical method of computing edit distance between two strings uses a dynamic programming-based approach that runs in quadratic time and space, which may not provide results in a reasonable amount of time for large sequences. It advocates for parallel algorithms to reduce the time taken by edit distance computation. To this end, we present scalable parallel algorithms to support efficient similarity search under edit distance. The efficiency and scalability of the proposed algorithms is demonstrated through an extensive set of experiments on real datasets. Moreover, to address the problem of uneven workload across different processing units, which is mainly caused due to the significant variance in the size of the sequences, different data distribution schemes are discussed and empirically analyzed. Experimental results have shown that the speedup achieved by the hybrid approach over inter-task and intra-task parallelism is 18 and 13, respectively.
APA, Harvard, Vancouver, ISO, and other styles
28

Garcia-Hernandez, Carlos, Alberto Fernández, and Francesc Serratosa. "Learning the Edit Costs of Graph Edit Distance Applied to Ligand-Based Virtual Screening." Current Topics in Medicinal Chemistry 20, no. 18 (2020): 1582–92. http://dx.doi.org/10.2174/1568026620666200603122000.

Full text
Abstract:
Background: Graph edit distance is a methodology used to solve error-tolerant graph matching. This methodology estimates a distance between two graphs by determining the minimum number of modifications required to transform one graph into the other. These modifications, known as edit operations, have an edit cost associated that has to be determined depending on the problem. Objective: This study focuses on the use of optimization techniques in order to learn the edit costs used when comparing graphs by means of the graph edit distance. Methods: Graphs represent reduced structural representations of molecules using pharmacophore-type node descriptions to encode the relevant molecular properties. This reduction technique is known as extended reduced graphs. The screening and statistical tools available on the ligand-based virtual screening benchmarking platform and the RDKit were used. Results: In the experiments, the graph edit distance using learned costs performed better or equally good than using predefined costs. This is exemplified with six publicly available datasets: DUD-E, MUV, GLL&GDD, CAPST, NRLiSt BDB, and ULS-UDS. Conclusion: This study shows that the graph edit distance along with learned edit costs is useful to identify bioactivity similarities in a structurally diverse group of molecules. Furthermore, the target-specific edit costs might provide useful structure-activity information for future drug-design efforts.
APA, Harvard, Vancouver, ISO, and other styles
29

Chuzaimah Zulkifli, Umi. "Pengembangan Modul PreprocessingTeks untuk Kasus Formalisasi dan Pengecekan Ejaan Bahasa Indonesia pada Aplikasi Web Mining Simple Solution (WMSS)." Jurnal Matematika Statistika dan Komputasi 15, no. 2 (2018): 95. http://dx.doi.org/10.20956/jmsk.v15i2.5718.

Full text
Abstract:
Abstract Data of social media currently has been much used to analyze both sentiment analysis and another analysis. In fact, data that is obtained from the social media in generally has some mistakes which can influence the spelling in writing of words. The solution offered is word formalization and spelling check. Based on the problem, it will be built a preprocessing model to overcome two the mistakes. The method that will be used in formalization is to change the words to be formal form based on KBBI, while the method used for spelling check is spelling correction. Spelling correction method consists of distance edit, bigram and distance edit rule. In this study, in addition the application of both methods, also it will be analyzed comparing the result of spelling correction. From the result of analysis shows that distance edit rule has higher accuracy, namely 83.39% than using both edit distance and bigram method. In addition, edit distance rule method also has faster performance than another both methods. Overall, method to change word to formal word were based on KBBI and spelling correction has been able to overcome the problem of two cases, such that it can increase accuracy of the result of the analysis. Keywords: preprocessing, spelling correction, edit distance, bigram AbstrakData media sosial saat ini telah banyak digunakan untuk melakukan analisis baik analisis sentimen maupun analisis terkait lainnya. Nyatanya, data yang diperoleh dari media sosial tersebut pada umumnya memiliki kesalahan yang akan mempengaruhi hasil analisis. Kesalahan tersebut berupa penggunaan kata yang tidak baku dan adanya kesalahan ejaan dalam penulisan kata. Solusi yang ditawarkan berupa formalisasi kata dan pengecekan ejaan. Berdasarkan masalah tersebut, akan dibangun modul preprocessing untuk mengatasi dua kesalahan di atas. Metode yang digunakan pada formalisasi adalah mengubah kata ke bentuk formal berdasarkan KBBI sedangkan metode yang digunakan pada pengecekan ejaan adalah spelling correction. Metode spelling correction tersebut terdiri dari tiga yaitu edit distance, bigram dan edit distance + rule. Pada penelitian ini, selain penerapan kedua metode juga akan dilakukan analisis untuk melihat perbandingan hasil pada metode spelling correction. Dari hasil analisis tersebut, diketahui bahwa metode edit distance + rule memiliki akurasi yang lebih tinggi yaitu sebesar 83,39% dibandingkan dengan kedua metode lainnya yaitu edit distance dan bigram. Selain itu, metode edit distance + rule juga memiliki performa tercepat dibandingkan kedua metode lainnya. Secara keseluruhan, metode mengubah kata ke bentuk formal berdasarkan KBBI dan spelling correction telah mampu mengatasi masalah pada dua kasus di atas sehingga dapat meningkatkan akurasi hasil analisis. Kata Kunci:preprocessing, spelling correction, edit distance, bigram
APA, Harvard, Vancouver, ISO, and other styles
30

Robles-Kelly, A., and E. R. Hancock. "Graph edit distance from spectral seriation." IEEE Transactions on Pattern Analysis and Machine Intelligence 27, no. 3 (2005): 365–78. http://dx.doi.org/10.1109/tpami.2005.56.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Micheli, Anne, and Dominique Rossin. "Edit distance between unlabeled ordered trees." RAIRO - Theoretical Informatics and Applications 40, no. 4 (2006): 593–609. http://dx.doi.org/10.1051/ita:2006043.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Andoni, Alexandr, and Robert Krauthgamer. "The smoothed complexity of edit distance." ACM Transactions on Algorithms 8, no. 4 (2012): 1–25. http://dx.doi.org/10.1145/2344422.2344434.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Ebrahimpour Boroojeny, Ali, Akash Shrestha, Ali Sharifi-Zarchi, Suzanne Renick Gallagher, S. Cenk Sahinalp, and Hamidreza Chitsaz. "Graph Traversal Edit Distance and Extensions." Journal of Computational Biology 27, no. 3 (2020): 317–29. http://dx.doi.org/10.1089/cmb.2019.0511.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Ostrovsky, Rafail, and Yuval Rabani. "Low distortion embeddings for edit distance." Journal of the ACM 54, no. 5 (2007): 23. http://dx.doi.org/10.1145/1284320.1284322.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Abu-Aisheh, Zeina, Romain Raveaux, Jean-Yves Ramel, and Patrick Martineau. "A parallel graph edit distance algorithm." Expert Systems with Applications 94 (March 2018): 41–57. http://dx.doi.org/10.1016/j.eswa.2017.10.043.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Sokol, D., G. Benson, and J. Tojeira. "Tandem repeats over the edit distance." Bioinformatics 23, no. 2 (2007): e30-e35. http://dx.doi.org/10.1093/bioinformatics/btl309.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Ferraro and Godin. "An Edit Distance between Quotiented Trees." Algorithmica 36, no. 1 (2003): 1–39. http://dx.doi.org/10.1007/s00453-002-1002-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Gonen, Mira, Dana Shapira, and James A. Storer. "Edit Distance with Multiple Block Operations†." Computer Journal 62, no. 5 (2018): 657–69. http://dx.doi.org/10.1093/comjnl/bxy066.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Gao, Xinbo, Bing Xiao, Dacheng Tao, and Xuelong Li. "A survey of graph edit distance." Pattern Analysis and Applications 13, no. 1 (2009): 113–29. http://dx.doi.org/10.1007/s10044-008-0141-y.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Medvedev, Paul. "Theoretical Analysis of Edit Distance Algorithms." Communications of the ACM 66, no. 12 (2023): 64–71. http://dx.doi.org/10.1145/3582490.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Boroujeni, Mahdi, Mohammad Ghodsi, and Saeed Seddighin. "Improved MPC Algorithms for Edit Distance and Ulam Distance." IEEE Transactions on Parallel and Distributed Systems 32, no. 11 (2021): 2764–76. http://dx.doi.org/10.1109/tpds.2021.3076534.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Yagahara, Ayako, Masahito Uesugi, and Hideto Yokoi. "Identification of Synonyms Using Definition Similarities in Japanese Medical Device Adverse Event Terminology." Applied Sciences 11, no. 8 (2021): 3659. http://dx.doi.org/10.3390/app11083659.

Full text
Abstract:
Japanese medical device adverse events terminology, published by the Japan Federation of Medical Devices Associations (JFMDA terminology), contains entries for 89 terminology items, with each of the terminology entries created independently. It is necessary to establish and verify the consistency of these terminology entries and map them efficiently and accurately. Therefore, developing an automatic synonym detection tool is an important concern. Such tools for edit distances and distributed representations have achieved good performance in previous studies. The purpose of this study was to identify synonyms in JFMDA terminology and evaluate the accuracy using these algorithms. A total of 125 definition sentence pairs were created from the terminology as baselines. Edit distances (Levenshtein and Jaro–Winkler distance) and distributed representations (Word2vec, fastText, and Doc2vec) were employed for calculating similarities. Receiver operating characteristic analysis was carried out to evaluate the accuracy of synonym detection. A comparison of the accuracies of the algorithms showed that the Jaro–Winkler distance had the highest sensitivity, Doc2vec with DM had the highest specificity, and the Levenshtein distance had the highest value in area under the curve. Edit distances and Doc2vec makes it possible to obtain high accuracy in predicting synonyms in JFMDA terminology.
APA, Harvard, Vancouver, ISO, and other styles
43

Al-Bakry, Abbas, and Marwa Al-Rikaby. "Enhanced Levenshtein Edit Distance Method functioning as a String-to-String Similarity Measure." Iraqi Journal for Computers and Informatics 42, no. 1 (2016): 48–54. http://dx.doi.org/10.25195/ijci.v42i1.83.

Full text
Abstract:
Levenshtein is a Minimum Edit Distance method; it is usually used in spell checking applications for generatingcandidates. The method computes the number of the required edit operations to transform one string to another and it canrecognize three types of edit operations: deletion, insertion, and substitution of one letter. Damerau modified the Levenshteinmethod to consider another type of edit operations, the transposition of two adjacent letters, in addition to theconsidered three types. However, the modification suffers from the time complexity which was added to the original quadratictime complexity of the original method. In this paper, we proposed a modification for the original Levenshtein toconsider the same four types using very small number of matching operations which resulted in a shorter execution timeand a similarity measure is also achieved to exploit the resulted distance from any Edit Distance method for finding the amountof similarity between two given strings.
APA, Harvard, Vancouver, ISO, and other styles
44

MOHRI, MEHRYAR. "EDIT-DISTANCE OF WEIGHTED AUTOMATA: GENERAL DEFINITIONS AND ALGORITHMS." International Journal of Foundations of Computer Science 14, no. 06 (2003): 957–82. http://dx.doi.org/10.1142/s0129054103002114.

Full text
Abstract:
The problem of computing the similarity between two sequences arises in many areas such as computational biology and natural language processing. A common measure of the similarity of two strings is their edit-distance, that is the minimal cost of a series of symbol insertions, deletions, or substitutions transforming one string into the other. In several applications such as speech recognition or computational biology, the objects to compare are distributions over strings, i.e., sets of strings representing a range of alternative hypotheses with their associated weights or probabilities. We define the edit-distance of two distributions over strings and present algorithms for computing it when these distributions are given by automata. In the particular case where two sets of strings are given by unweighted automata, their edit-distance can be computed using the general algorithm of composition of weighted transducers combined with a single-source shortest-paths algorithm. In the general case, we show that general weighted automata algorithms over the appropriate semirings can be used to compute the edit-distance of two weighted automata exactly. These include classical algorithms such as the composition and ∊-removal of weighted transducers and a new and simple synchronization algorithm for weighted transducers which, combined with ∊-removal, can be used to normalize weighted transducers with bounded delays. Our algorithm for computing the edit-distance of weighted automata can be used to improve the word accuracy of automatic speech recognition systems. It can also be extended to provide an edit-distance automaton useful for re-scoring and other post-processing purposes in the context of large-vocabulary speech recognition.
APA, Harvard, Vancouver, ISO, and other styles
45

Chuzaimah Zulkifli, Umi Chuzaimah. "Pengembangan Modul Preprocessing Teks untuk Kasus Formalisasi dan Pengecekan Ejaan Bahasa Indonesia pada Aplikasi Web Mining Simple Solution (WMSS)." Jurnal Matematika Statistika dan Komputasi 15, no. 2 (2018): 92. http://dx.doi.org/10.20956/jmsk.v15i2.5574.

Full text
Abstract:
Data media sosial saat ini telah banyak digunakan untuk melakukan analisis baik analisis sentimen maupun analisis terkait lainnya. Nyatanya, data yang diperoleh dari media sosial tersebut pada umumnya memiliki kesalahan yang akan mempengaruhi hasil analisis. Kesalahan tersebut berupa penggunaan kata yang tidak baku dan adanya kesalahan ejaan dalam penulisan kata. Solusi yang ditawarkan berupa formalisasi kata dan pengecekan ejaan. Berdasarkan masalah tersebut, akan dibangun modul preprocessing untuk mengatasi dua kesalahan di atas. Metode yang digunakan pada formalisasi adalah mengubah kata ke bentuk formal berdasarkan KBBI sedangkan metode yang digunakan pada pengecekan ejaan adalah spelling correction. Metode spelling correction tersebut terdiri dari tiga yaitu edit distance, bigram dan edit distance + rule. Pada penelitian ini, selain penerapan kedua metode juga akan dilakukan analisis untuk melihat perbandingan hasil pada metode spelling correction. Dari hasil analisis tersebut, diketahui bahwa metode edit distance + rule memiliki akurasi yang lebih tinggi yaitu sebesar 83,39% dibandingkan dengan kedua metode lainnya yaitu edit distance dan bigram. Selain itu, metode edit distance + rule juga memiliki performa tercepat dibandingkan kedua metode lainnya. Secara keseluruhan, metode mengubah kata ke bentuk formal berdasarkan KBBI dan spelling correction telah mampu mengatasi masalah pada dua kasus di atas sehingga dapat meningkatkan akurasi hasil analisis.
APA, Harvard, Vancouver, ISO, and other styles
46

ARSLAN, ABDULLAH N., and ÖMER EĞECIOĞLU. "DICTIONARY LOOK-UP WITHIN SMALL EDIT DISTANCE." International Journal of Foundations of Computer Science 15, no. 01 (2004): 57–71. http://dx.doi.org/10.1142/s0129054104002303.

Full text
Abstract:
Let [Formula: see text] be a dictionary consisting of n binary strings of length m each, represented as a trie. The usual d-query asks if there exists a string in [Formula: see text] within Hamming distance d of a given binary query string q. We present a simple algorithm to determine if there is a member in [Formula: see text] within edit distanced of a given query string q of length m. The method takes time O(dmd+1) in the RAM model, independent of n, and requires O(dm) additional space. We also generalize the results for the case of the problem over a larger alphabet.
APA, Harvard, Vancouver, ISO, and other styles
47

Ma, Hai Tao, Chang Yong Yu, Chang Ming Xu, and Miao Fang. "Efficiently Subtree Matching between XML and Probabilistic XML Documents." Applied Mechanics and Materials 571-572 (June 2014): 575–79. http://dx.doi.org/10.4028/www.scientific.net/amm.571-572.575.

Full text
Abstract:
We explored the subtree matching problem of probabilistic XML documents: finding the matches of an XML query tree over a probabilistic XML document, using the canonical tree edit distance as a similarity measure between subtrees. Probabilistic XML is a probability distribution model capturing uncertainty of both value and structure. Query over probabilistic XML documents is difficult: an naivie algorithm has exponential complexity by directly compute the tree edit distance between the query tree and each certain XML tree represented by the probabilistic XML document. Based on the method of tree edit distance computation over certain XML subtrees, we defined a minimum-solution to the edit distance computation, which means the minimum cost to translate the query tree to the probabilistic XML tree. Furthermore, we developed an algorithm---ASM (Algorithm of Subtree Matching) to compute the minimum solution. Finally, we proved the complexity of ASM is linear in the size of the probabilistic XML document.
APA, Harvard, Vancouver, ISO, and other styles
48

Sadiq, Muhammad Umair, and Muhammad Murtaza Yousaf. "Distributed Algorithm for Parallel Edit Distance Computation." Computing and Informatics 39, no. 4 (2020): 757–79. http://dx.doi.org/10.31577/cai_2020_4_757.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

HUANG, Liang, Ze-mao ZHAO, and Xing-kai LIANG. "Web data extraction based on edit distance." Journal of Computer Applications 32, no. 6 (2013): 1662–65. http://dx.doi.org/10.3724/sp.j.1087.2012.01662.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Hu, Yumei, Yongtang Shi, and Yarong Wei. "The edit distance function of some graphs." Discussiones Mathematicae Graph Theory 40, no. 3 (2020): 807. http://dx.doi.org/10.7151/dmgt.2154.

Full text
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography