Academic literature on the topic 'SUMMARIZATION ALGORITHMS'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'SUMMARIZATION ALGORITHMS.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "SUMMARIZATION ALGORITHMS"

1

Chang, Hsien-Tsung, Shu-Wei Liu, and Nilamadhab Mishra. "A tracking and summarization system for online Chinese news topics." Aslib Journal of Information Management 67, no. 6 (2015): 687–99. http://dx.doi.org/10.1108/ajim-10-2014-0147.

Full text
Abstract:
Purpose – The purpose of this paper is to design and implement new tracking and summarization algorithms for Chinese news content. Based on the proposed methods and algorithms, the authors extract the important sentences that are contained in topic stories and list those sentences according to timestamp order to ensure ease of understanding and to visualize multiple news stories on a single screen.

Design/methodology/approach – This paper encompasses an investigational approach that implements a new Dynamic Centroid Summarization algorithm in addition to a Term Frequency (TF)-Density algorithm to empirically compute three target parameters, i.e., recall, precision, and F-measure.

Findings – The proposed TF-Density algorithm is implemented and compared with the well-known algorithms Term Frequency-Inverse Word Frequency (TF-IWF) and Term Frequency-Inverse Document Frequency (TF-IDF). Three test data sets are configured from Chinese news web sites for use during the investigation, and two important findings are obtained that help the authors provide more precision and efficiency when recognizing the important words in the text. First, the authors evaluate three topic tracking algorithms, i.e., TF-Density, TF-IDF, and TF-IWF, with the said target parameters and find that the recall, precision, and F-measure of the proposed TF-Density algorithm are better than those of the TF-IWF and TF-IDF algorithms. In the context of the second finding, the authors implement a blind test approach to obtain the results of topic summarizations and find that the proposed Dynamic Centroid Summarization process can more accurately select topic sentences than the LexRank process.

Research limitations/implications – The results show that the tracking and summarization algorithms for news topics can provide more precise and convenient results for users tracking the news. The analysis and implications are limited to Chinese news content from Chinese news web sites such as Apple Library, UDN, and well-known portals like Yahoo and Google.

Originality/value – The research provides an empirical analysis of Chinese news content through the proposed TF-Density and Dynamic Centroid Summarization algorithms. It focusses on improving the means of summarizing a set of news stories to appear for browsing on a single screen and carries implications for innovative word measurements in practice.
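The evaluation the abstract relies on reduces to recall, precision, and F-measure over the items a tracking algorithm selects versus a human reference. A minimal Python sketch of that computation, with hypothetical sentence IDs standing in for real tracking output (not the paper's TF-Density algorithm itself):

```python
def precision_recall_f1(selected, reference):
    """Compare system-selected items against a human reference set."""
    selected, reference = set(selected), set(reference)
    true_positives = len(selected & reference)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical tracking output vs. a human-built reference.
system_sentences = {"s1", "s3", "s4", "s7"}
gold_sentences = {"s1", "s2", "s3", "s7", "s9"}
print(precision_recall_f1(system_sentences, gold_sentences))
```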
APA, Harvard, Vancouver, ISO, and other styles
2

Jayaraman, Tamilselvan, et al. "Brainstorm optimization for multi-document summarization." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 10 (2021): 7607–19. http://dx.doi.org/10.17762/turcomat.v12i10.5670.

Full text
Abstract:
Document summarization is one of the solutions to mine the appropriate information from a huge number of documents. In this study, a brainstorm optimization (BSO) based multi-document summarizer (MDSBSO) is proposed to solve the problem of multi-document summarization. The proposed MDSBSO is compared with two other multi-document summarization algorithms based on particle swarm optimization (PSO) and bacterial foraging optimization (BFO). To evaluate the performance of the proposed multi-document summarizer, two well-known benchmark Document Understanding Conference (DUC) datasets are used. The performance of the compared algorithms is evaluated using ROUGE evaluation metrics. The experimental analysis clearly shows that the proposed MDSBSO summarization algorithm produces significant enhancement when compared with the other summarization algorithms.
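ROUGE, the metric named in this abstract, scores n-gram overlap between a candidate summary and a reference. A minimal ROUGE-1 (unigram) sketch in Python, independent of any particular optimizer and intended only as an illustration:

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Unigram-overlap ROUGE-1: recall, precision, and F1."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}

print(rouge_1("the cat sat on the mat", "a cat was sitting on the mat"))
```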
APA, Harvard, Vancouver, ISO, and other styles
3

Mall, Shalu, Avinash Maurya, Ashutosh Pandey, and Davain Khajuria. "Centroid Based Clustering Approach for Extractive Text Summarization." International Journal for Research in Applied Science and Engineering Technology 11, no. 6 (2023): 3404–9. http://dx.doi.org/10.22214/ijraset.2023.53542.

Full text
Abstract:
Extractive text summarization is the process of identifying the most important information from a large text and presenting it in a condensed form. One popular approach to this problem is the use of centroid-based clustering algorithms, which group together similar sentences based on their content and then select representative sentences from each cluster to form a summary. In this research, we present a centroid-based clustering algorithm for email summarization that combines the use of word embeddings with a clustering algorithm. We compare our algorithm to existing summarization techniques. Our results show that our approach stands close to existing methods in terms of summary quality, while also being computationally efficient. Overall, our work demonstrates the potential of centroid-based clustering algorithms for extractive text summarization and suggests avenues for further research in this area.
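The centroid-based idea described above can be sketched in a few lines: vectorize the sentences, cluster them, and keep the sentence nearest each cluster centroid. The paper combines word embeddings with a clustering algorithm; the sketch below substitutes TF-IDF vectors and k-means as assumptions, purely for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "The project deadline moved to Friday.",
    "Please send the revised report by Friday.",
    "Lunch options near the office are limited.",
    "The cafeteria closes early on weekends.",
]

# Represent sentences as TF-IDF vectors (the paper uses word embeddings instead).
vectors = TfidfVectorizer().fit_transform(sentences)

# Group similar sentences, then keep one representative per cluster.
k = 2
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(vectors)
summary = []
for c in range(k):
    members = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(vectors[members].toarray() - km.cluster_centers_[c], axis=1)
    summary.append(sentences[members[np.argmin(dists)]])

print(summary)
```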
APA, Harvard, Vancouver, ISO, and other styles
4

Yadav, Divakar, Naman Lalit, Riya Kaushik, et al. "Qualitative Analysis of Text Summarization Techniques and Its Applications in Health Domain." Computational Intelligence and Neuroscience 2022 (February 9, 2022): 1–14. http://dx.doi.org/10.1155/2022/3411881.

Full text
Abstract:
For the better utilization of the enormous amount of data available to us on the Internet and in different archives, summarization is a valuable method. Manual summarization by experts is an almost impossible and time-consuming activity. People could not access, read, or use such a big pile of information for their needs. Therefore, summary generation is essential and beneficial in the current scenario. This paper presents an efficient qualitative analysis of the different algorithms used for text summarization. We implemented five different algorithms, namely, term frequency-inverse document frequency (TF-IDF), LexRank, TextRank, BertSum, and PEGASUS, for summary generation. These algorithms were chosen, after reviewing the state-of-the-art literature, because they generate good summary results. The performance of these algorithms is compared on two different datasets, i.e., Reddit-TIFU and MultiNews, and their results are measured using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) measure to decide the best algorithm among them. After performing a qualitative analysis of the above algorithms, we observe that for both datasets, i.e., Reddit-TIFU and MultiNews, PEGASUS had the best average F-score among the abstractive text summarization algorithms and TextRank had the best average F-score among the extractive ones.
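Of the five algorithms compared, TextRank is the simplest to sketch: build a sentence-similarity graph and rank sentences with PageRank. A minimal illustration using TF-IDF cosine similarity and networkx; the library choices are assumptions, not the paper's exact setup:

```python
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = [
    "Summarization condenses long documents into short texts.",
    "Extractive methods copy the most important sentences verbatim.",
    "Abstractive methods generate new sentences with a language model.",
    "Football scores were announced after the match.",
]

# Sentence-similarity graph weighted by TF-IDF cosine similarity.
sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
graph = nx.Graph()
for i in range(len(sentences)):
    for j in range(i + 1, len(sentences)):
        if sim[i, j] > 0:
            graph.add_edge(i, j, weight=sim[i, j])

# PageRank scores act as sentence importance; keep the top two sentences.
scores = nx.pagerank(graph, weight="weight")
top = sorted(scores, key=scores.get, reverse=True)[:2]
print([sentences[i] for i in sorted(top)])
```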
APA, Harvard, Vancouver, ISO, and other styles
5

Bokaei, Mohammad Hadi, Hossein Sameti, and Yang Liu. "Extractive summarization of multi-party meetings through discourse segmentation." Natural Language Engineering 22, no. 1 (2015): 41–72. http://dx.doi.org/10.1017/s1351324914000199.

Full text
Abstract:
In this article we tackle the problem of multi-party conversation summarization. We investigate the role of discourse segmentation of a conversation on meeting summarization. First, an unsupervised function segmentation algorithm is proposed to segment the transcript into functionally coherent parts, such as Monologue_i (which indicates a segment where speaker i is the dominant speaker, e.g., lecturing all the other participants) or Discussion_{x1, x2, ..., xn} (which indicates a segment where speakers x1 to xn are involved in a discussion). Then the salience score for a sentence is computed by leveraging the score of the segment containing the sentence. Performance of our proposed segmentation and summarization algorithms is evaluated using the AMI meeting corpus. We show better summarization performance over other state-of-the-art algorithms according to different metrics.
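The central computation stated in the abstract, that a sentence's salience leverages the score of the discourse segment containing it, can be sketched with made-up numbers; both the base scores and segment scores below are hypothetical:

```python
# Hypothetical per-sentence base scores (e.g., term-frequency based)
# and per-segment scores (e.g., how informative the segment type is).
sentences = [
    {"id": "s1", "segment": "monologue_A", "base": 0.40},
    {"id": "s2", "segment": "discussion_AB", "base": 0.35},
    {"id": "s3", "segment": "discussion_AB", "base": 0.50},
]
segment_score = {"monologue_A": 0.6, "discussion_AB": 1.0}

# Salience leverages the score of the segment containing the sentence.
for s in sentences:
    s["salience"] = s["base"] * segment_score[s["segment"]]

ranked = sorted(sentences, key=lambda s: s["salience"], reverse=True)
print([(s["id"], round(s["salience"], 2)) for s in ranked])
```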
APA, Harvard, Vancouver, ISO, and other styles
6

Mademlis, Ioannis, Anastasios Tefas, and Ioannis Pitas. "A salient dictionary learning framework for activity video summarization via key-frame extraction." Information Sciences 432 (January 2, 2018): 319–31. https://doi.org/10.1016/j.ins.2017.12.020.

Full text
Abstract:
Recently, dictionary learning methods for unsupervised video summarization have surpassed traditional video frame clustering approaches. This paper addresses static summarization of videos depicting activities, which possess certain recurrent properties. In this context, a flexible definition of an activity video summary is proposed, as the set of key-frames that can both reconstruct the original, full-length video and simultaneously represent its most salient parts. Both objectives can be jointly optimized across several information modalities. The two criteria are merged into a “salient dictionary” learning task that is proposed as a strict definition of the video summarization problem, encapsulating many existing algorithms. Three specific, novel video summarization methods are derived from this definition: the Numerical, the Greedy and the Genetic Algorithm. In all formulations, the reconstruction term is modeled algebraically as a Column Subset Selection Problem (CSSP), while the saliency term is modeled as an outlier detection problem, a low-rank approximation problem, or a summary dispersion maximization problem. In quantitative evaluation, the Greedy Algorithm seems to provide the best balance between speed and overall performance, with the faster Numerical Algorithm a close second. All the proposed methods outperform a baseline clustering approach and two competing state-of-the-art static video summarization algorithms.
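The reconstruction term mentioned above is modeled as a Column Subset Selection Problem: choose the key-frames (columns) whose span best reconstructs the full feature matrix. A greedy CSSP sketch with NumPy, using random features as a stand-in for real frame descriptors and omitting the paper's saliency term:

```python
import numpy as np

def greedy_cssp(X, k):
    """Greedily pick k columns of X minimizing least-squares reconstruction error."""
    selected = []
    for _ in range(k):
        best_j, best_err = None, np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            cols = X[:, selected + [j]]
            # Residual of reconstructing X from the chosen columns.
            coef, *_ = np.linalg.lstsq(cols, X, rcond=None)
            err = np.linalg.norm(X - cols @ coef)
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
    return selected

rng = np.random.default_rng(0)
frames = rng.normal(size=(64, 30))   # 30 hypothetical frames, 64-dim descriptors each
print(greedy_cssp(frames, k=3))      # indices of the selected key-frames
```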
APA, Harvard, Vancouver, ISO, and other styles
7

Dutta, Soumi, Vibhash Chandra, Kanav Mehra, Asit Kumar Das, Tanmoy Chakraborty, and Saptarshi Ghosh. "Ensemble Algorithms for Microblog Summarization." IEEE Intelligent Systems 33, no. 3 (2018): 4–14. http://dx.doi.org/10.1109/mis.2018.033001411.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Chu, Deming, Fan Zhang, Wenjie Zhang, Ying Zhang, and Xuemin Lin. "Graph Summarization: Compactness Meets Efficiency." Proceedings of the ACM on Management of Data 2, no. 3 (2024): 1–26. http://dx.doi.org/10.1145/3654943.

Full text
Abstract:
As the volume and ubiquity of graphs increase, a compact graph representation becomes essential for enabling efficient storage, transfer, and processing of graphs. Given a graph, the graph summarization problem asks for a compact representation that consists of a summary graph and the corrections, such that we can recreate the original graph from the representation exactly. Although this problem has been studied extensively, the existing works either trade summary compactness for efficiency, or vice versa. In particular, a well-known greedy method provides the most compact summary but incurs prohibitive time cost, while the state-of-the-art algorithms with practical overheads are more than 20% behind in summary compactness in our comparison with the greedy method. This paper presents Mags and Mags-DM, two algorithms that aim to bridge the compactness and efficiency in graph summarization. Mags adopts the existing greedy paradigm that provides state-of-the-art compactness, but significantly improves its efficiency with a novel algorithm design. Meanwhile, Mags-DM follows a different paradigm with practical efficiency and overcomes its limitations in compactness. Moreover, both algorithms can support parallel computing environments. We evaluate Mags and Mags-DM on graphs up to billion-scale and demonstrate that they achieve state-of-the-art in both compactness and efficiency, rather than in one of them. Compared with the method that offers state-of-the-art compactness, Mags and Mags-DM have a small difference (< 0.1% and < 2.1%) in compactness. For efficiency, Mags is on average 11.1x and 4.2x faster than the two state-of-the-art algorithms with practical overheads, while Mags-DM can further reduce the running time by 13.4x compared with Mags. This shows that graph summarization algorithms can be made practical while still offering a compact summary.
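The representation targeted here is a summary graph over supernodes plus edge corrections, from which the original graph can be rebuilt exactly. A toy encode/decode sketch of that representation (not the Mags or Mags-DM algorithms, just the data structure they produce):

```python
from itertools import combinations

# Original graph as an undirected edge set.
edges = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")}

# A hypothetical summary: supernode S = {a, b, c} with an internal superedge
# (i.e., a clique over S), a superedge S-T, and corrections removing two edges.
supernodes = {"S": {"a", "b", "c"}, "T": {"d"}}
superedges = {("S", "S"), ("S", "T")}
corrections = {"add": set(), "remove": {("a", "d"), ("b", "d")}}

def expand(supernodes, superedges, corrections):
    """Recreate the original edge set from the compact representation."""
    out = set()
    for u, v in superedges:
        if u == v:  # internal superedge: all pairs inside the supernode
            out |= {tuple(sorted(p)) for p in combinations(supernodes[u], 2)}
        else:       # cross superedge: all pairs between the two supernodes
            out |= {tuple(sorted((x, y))) for x in supernodes[u] for y in supernodes[v]}
    out |= {tuple(sorted(e)) for e in corrections["add"]}
    out -= {tuple(sorted(e)) for e in corrections["remove"]}
    return out

assert expand(supernodes, superedges, corrections) == {tuple(sorted(e)) for e in edges}
```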
APA, Harvard, Vancouver, ISO, and other styles
9

Han, Kai, Shuang Cui, Tianshuai Zhu, et al. "Approximation Algorithms for Submodular Data Summarization with a Knapsack Constraint." ACM SIGMETRICS Performance Evaluation Review 49, no. 1 (2022): 65–66. http://dx.doi.org/10.1145/3543516.3453922.

Full text
Abstract:
Data summarization, a fundamental methodology aimed at selecting a representative subset of data elements from a large pool of ground data, has found numerous applications in big data processing, such as social network analysis [5, 7], crowdsourcing [6], clustering [4], network design [13], and document/corpus summarization [14]. Moreover, it is well acknowledged that the "representativeness" of a dataset in data summarization applications can often be modeled by submodularity - a mathematical concept abstracting the "diminishing returns" property in the real world. Therefore, a lot of studies have cast data summarization as a submodular function maximization problem (e.g., [2]).
APA, Harvard, Vancouver, ISO, and other styles
10

Han, Kai, Shuang Cui, Tianshuai Zhu, et al. "Approximation Algorithms for Submodular Data Summarization with a Knapsack Constraint." Proceedings of the ACM on Measurement and Analysis of Computing Systems 5, no. 1 (2021): 1–31. http://dx.doi.org/10.1145/3447383.

Full text
Abstract:
Data summarization, i.e., selecting representative subsets of manageable size out of massive data, is often modeled as a submodular optimization problem. Although there exist extensive algorithms for submodular optimization, many of them incur large computational overheads and hence are not suitable for mining big data. In this work, we consider the fundamental problem of (non-monotone) submodular function maximization with a knapsack constraint, and propose simple yet effective and efficient algorithms for it. Specifically, we propose a deterministic algorithm with approximation ratio 6 and a randomized algorithm with approximation ratio 4, and show that both of them can be accelerated to achieve nearly linear running time at the cost of weakening the approximation ratio by an additive factor of ε. We then consider a more restrictive setting without full access to the whole dataset, and propose streaming algorithms with approximation ratios of 8+ε and 6+ε that make one pass and two passes over the data stream, respectively. As a by-product, we also propose a two-pass streaming algorithm with an approximation ratio of 2+ε when the considered submodular function is monotone. To the best of our knowledge, our algorithms achieve the best performance bounds compared to the state-of-the-art approximation algorithms with efficient implementation for the same problem. Finally, we evaluate our algorithms in two concrete submodular data summarization applications for revenue maximization in social networks and image summarization, and the empirical results show that our algorithms outperform the existing ones in terms of both effectiveness and efficiency.
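A standard building block for knapsack-constrained submodular maximization is the cost-benefit greedy rule: repeatedly add the element with the largest marginal gain per unit cost that still fits the budget. The paper's algorithms are more sophisticated (and carry approximation guarantees), so the sketch below is only an assumption-level illustration of the rule on a toy coverage objective:

```python
def cost_benefit_greedy(elements, cover, cost, budget):
    """Greedy by marginal-gain-per-cost for a coverage (submodular) objective."""
    chosen, covered, spent = [], set(), 0.0
    while True:
        best, best_ratio = None, 0.0
        for e in elements:
            if e in chosen or spent + cost[e] > budget:
                continue
            gain = len(cover[e] - covered)        # marginal coverage gain
            ratio = gain / cost[e]
            if ratio > best_ratio:
                best, best_ratio = e, ratio
        if best is None:
            return chosen, covered
        chosen.append(best)
        covered |= cover[best]
        spent += cost[best]

# Toy summarization instance: each element "covers" some topics at some cost.
cover = {"x": {1, 2, 3}, "y": {3, 4}, "z": {5}, "w": {1, 2, 3, 4, 5}}
cost = {"x": 1.0, "y": 1.0, "z": 0.5, "w": 3.0}
print(cost_benefit_greedy(cover.keys(), cover, cost, budget=2.0))
```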
APA, Harvard, Vancouver, ISO, and other styles
More sources

Dissertations / Theses on the topic "SUMMARIZATION ALGORITHMS"

1

Kolla, Maheedhar. "Automatic text summarization using lexical chains: algorithms and experiments." Thesis, Lethbridge, Alta.: University of Lethbridge, Faculty of Arts and Science, 2004. http://hdl.handle.net/10133/226.

Full text
Abstract:
Summarization is a complex task that requires understanding of the document content to determine the importance of the text. Lexical cohesion is a method to identify connected portions of the text based on the relations between the words in the text. Lexical cohesive relations can be represented using lexical chains. Lexical chains are sequences of semantically related words spread over the entire text, and they are used in a variety of Natural Language Processing (NLP) and Information Retrieval (IR) applications. In this thesis, we propose a lexical chaining method that includes glossary relations in the chaining process. These relations enable us to identify topically related concepts, for instance dormitory and student, and thereby enhance the identification of cohesive ties in the text. We then present methods that use the lexical chains to generate summaries by extracting sentences from the document(s). Headlines are generated by filtering out the portions of the extracted sentences that do not contribute to the meaning of the sentence. Generated headlines can be used in real-world applications to skim through document collections in a digital library. Multi-document summarization is gaining demand with the explosive growth of online news sources. It requires identification of the several themes present in the collection to attain good compression and avoid redundancy. In this thesis, we propose methods to group portions of the texts of a document collection into meaningful clusters. Clustering enables us to extract the various themes of the document collection. Sentences from the clusters can then be extracted to generate a summary for the multi-document collection. Clusters can also be used to generate summaries with respect to a given query. We designed a system to compute lexical chains for a given text and use them to extract the salient portions of the document. Some specific tasks considered are: headline generation, multi-document summarization, and query-based summarization. Our experimental evaluation shows that efficient summaries can be extracted for the above tasks.
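The chaining idea itself can be sketched compactly: group content words whose WordNet senses (here, synsets plus direct hypernyms) overlap. This is a crude stand-in for the richer relations, including the glossary relations, used in the thesis; it assumes NLTK with the WordNet corpus downloaded:

```python
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def related_synsets(word):
    """Noun synsets of a word plus their direct hypernyms."""
    synsets = set(wn.synsets(word, pos=wn.NOUN))
    return synsets | {h for s in synsets for h in s.hypernyms()}

def build_lexical_chains(words):
    """Greedy chaining: a word joins the first chain sharing a related synset."""
    chains = []  # each chain: {"words": [...], "synsets": set(...)}
    for w in words:
        syns = related_synsets(w)
        if not syns:
            continue
        for chain in chains:
            if chain["synsets"] & syns:   # crude semantic-relatedness test
                chain["words"].append(w)
                chain["synsets"] |= syns
                break
        else:
            chains.append({"words": [w], "synsets": syns})
    return [c["words"] for c in chains]

print(build_lexical_chains(["car", "automobile", "vehicle", "student", "pupil"]))
```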
APA, Harvard, Vancouver, ISO, and other styles
2

Hodulik, George M. "Graph Summarization: Algorithms, Trained Heuristics, and Practical Storage Application." Case Western Reserve University School of Graduate Studies / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=case1482143946391013.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Hamid, Fahmida. "Evaluation Techniques and Graph-Based Algorithms for Automatic Summarization and Keyphrase Extraction." Thesis, University of North Texas, 2016. https://digital.library.unt.edu/ark:/67531/metadc862796/.

Full text
Abstract:
Automatic text summarization and keyphrase extraction are two interesting areas of research which extend along natural language processing and information retrieval. They have recently become very popular because of their wide applicability. Devising generic techniques for these tasks is challenging due to several issues. Yet we have a good number of intelligent systems performing the tasks. As different systems are designed with different perspectives, evaluating their performances with a generic strategy is crucial. It has also become immensely important to evaluate the performances with minimal human effort. In our work, we focus on designing a relativized scale for evaluating different algorithms. This is our major contribution, which challenges the traditional approach of working with an absolute scale. We consider the impact of some of the environment variables (length of the document, references, and system-generated outputs) on the performance. Instead of defining some rigid lengths, we show how to adjust to their variations. We prove a mathematically sound baseline that should work for all kinds of documents. We emphasize automatically determining the syntactic well-formedness of the structures (sentences). We also propose defining an equivalence class for each unit (e.g., word) instead of the exact string matching strategy. We show an evaluation approach that considers the weighted relatedness of multiple references to adjust to the degree of disagreement between the gold standards. We publish the proposed approach as a free tool so that other systems can use it. We have also accumulated a dataset (scientific articles) with a reference summary and keyphrases for each document. Our approach is applicable not only for evaluating single-document based tasks but also for evaluating multiple-document based tasks. We have tested our evaluation method for three intrinsic tasks (taken from the DUC 2004 conference), and in all three cases, it correlates positively with ROUGE. Based on our experiments for the DUC 2004 Question-Answering task, it correlates with the human decision (extrinsic task) with 36.008% accuracy. In general, we can state that the proposed relativized scale performs as well as the popular technique (ROUGE), with flexibility for the length of the output. As part of the evaluation we have also devised a new graph-based algorithm focusing on sentiment analysis. The proposed model can extract units (e.g., words or sentences) from the original text belonging either to the positive sentiment pole or to the negative sentiment pole. It embeds both (positive and negative) types of sentiment flow into a single text graph. The text graph is composed of words or phrases as nodes and their relations as edges. By recursively calling two mutually exclusive relations, the model builds the final rank of the nodes. Based on the final rank, it splits two segments from the article: one with highly positive sentiment and the other with highly negative sentiment. The output of this model was tested against the non-polar TextRank output to quantify how much of the polar summaries actually cover the facts along with the sentiment.
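One concrete ingredient of the proposal, scoring against equivalence classes rather than exact strings and weighting multiple references by their relatedness, can be sketched as follows; the equivalence map and the reference weights are invented for illustration:

```python
# Hypothetical equivalence classes (e.g., derived from synonyms or stems).
EQUIV = {"car": "vehicle", "automobile": "vehicle", "vehicle": "vehicle",
         "quick": "fast", "fast": "fast"}

def canon(word):
    return EQUIV.get(word, word)

def weighted_match_score(candidate, references):
    """Score a candidate against weighted references using equivalence classes."""
    cand = {canon(w) for w in candidate.lower().split()}
    score, total_weight = 0.0, 0.0
    for text, weight in references:       # weight reflects inter-reference agreement
        ref = {canon(w) for w in text.lower().split()}
        score += weight * (len(cand & ref) / len(ref) if ref else 0.0)
        total_weight += weight
    return score / total_weight if total_weight else 0.0

references = [("the automobile was fast", 0.7), ("a quick vehicle", 0.3)]
print(weighted_match_score("the car was quick", references))
```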
APA, Harvard, Vancouver, ISO, and other styles
4

Chiarandini, Luca. "Characterizing and modeling web sessions with applications." Doctoral thesis, Universitat Pompeu Fabra, 2014. http://hdl.handle.net/10803/283414.

Full text
Abstract:
This thesis focuses on the analysis and modeling of web sessions, groups of requests made by a single user for a single navigation purpose. Understanding how people browse through websites is important, helping us to improve interfaces and provide better content. After first conducting a statistical analysis of web sessions, we go on to present algorithms to summarize and model web sessions. Finally, we describe applications that use novel browsing methods, in particular parallel browsing. We observe that people tend to browse images in sequences and that those sequences can be considered as units of content in their own right. The session summarization algorithm presented in this thesis tackles a novel pattern mining problem, and this algorithm can also be applied to other fields, such as information propagation. From the statistical analysis and the models presented, we show that contextual information, such as the referrer domain and the time of day, plays a major role in the evolution of sessions. To understand browsing one should therefore take into account the context in which it takes place.
APA, Harvard, Vancouver, ISO, and other styles
5

Santos, Joelson Antonio dos. "Algoritmos rápidos para estimativas de densidade hierárquicas e suas aplicações em mineração de dados." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-25102018-174244/.

Full text
Abstract:
Clustering is an unsupervised learning task able to describe a set of objects in clusters, so that objects of the same cluster are more similar to each other than to objects of other clusters. Clustering techniques are divided into two main categories: partitional and hierarchical. Partitional techniques divide a dataset into a number of distinct clusters, while hierarchical techniques provide a nested sequence of partitional clusterings separated by different levels of granularity. Furthermore, hierarchical density-based clustering is a particular clustering paradigm that detects clusters with different concentrations or densities of objects. One of the most popular techniques of this paradigm is known as HDBSCAN*. In addition to providing hierarchies, HDBSCAN* is a framework that provides outlier detection, semi-supervised clustering, and visualization of results. However, most hierarchical techniques, including HDBSCAN*, have a high computational complexity, a fact that makes them prohibitive for the analysis of large datasets. In this work, two approximate variations of HDBSCAN* that are computationally more scalable for clustering large amounts of data have been proposed. The first variation follows the concept of parallel and distributed computing known as MapReduce. The second follows the context of parallel computing using shared memory. Both variations are based on a concept of efficient data division, known as Recursive Sampling, which allows parallel processing of the data. Similarly to HDBSCAN*, the proposed variations are also capable of providing a complete unsupervised analysis of patterns in data, including outlier detection. Experiments were carried out to evaluate the quality of the variations proposed in this work; specifically, the variation based on MapReduce was compared to a parallel and exact version of HDBSCAN* known as Random Blocks, while the shared-memory parallel version was compared to the state of the art (HDBSCAN*). In terms of clustering quality and outlier detection, both the MapReduce-based and the shared-memory-based variations showed results close to the exact parallel version of HDBSCAN* and to the state of the art, respectively. In terms of computational time, the proposed variations showed greater scalability and speed for processing large amounts of data than the compared versions.
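For orientation, the sketch below shows what the underlying HDBSCAN* framework provides on a small synthetic dataset (cluster labels plus outlier scores), using the widely used hdbscan Python package, which is assumed to be installed; the scalable MapReduce and shared-memory variants proposed in the thesis are not reproduced here:

```python
import numpy as np
import hdbscan  # assumes the hdbscan package is installed

rng = np.random.default_rng(0)
# Two dense blobs with different spreads plus sparse noise points.
data = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(100, 2)),
    rng.normal(loc=5.0, scale=1.0, size=(100, 2)),
    rng.uniform(low=-5, high=10, size=(20, 2)),
])

clusterer = hdbscan.HDBSCAN(min_cluster_size=10)
labels = clusterer.fit_predict(data)          # -1 marks noise/outliers
print("clusters found:", sorted(set(labels) - {-1}))
print("top outlier scores:", np.sort(clusterer.outlier_scores_)[-5:])
```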
APA, Harvard, Vancouver, ISO, and other styles
6

Krübel, Monique. "Analyse und Vergleich von Extraktionsalgorithmen für die Automatische Textzusammenfassung." Master's thesis, Universitätsbibliothek Chemnitz, 2006. http://nbn-resolving.de/urn:nbn:de:swb:ch1-200601180.

Full text
Abstract:
Although research in automatic text summarization has been conducted since the 1950s, the usefulness and necessity of these systems were only properly recognized with the boom of the Internet. The World Wide Web provides a daily growing amount of information on almost every topic. To minimize the time needed to find, and to find again, the right information, search engines began their triumphant rise. But to get an overview of a selected topic, a simple listing of all candidate pages is no longer adequate. Additional mechanisms such as extraction algorithms for the automatic generation of summaries can help to optimize search engines or web catalogs, reducing the time spent on research and making searching simpler and more comfortable. In this diploma thesis, an analysis of extraction algorithms that can be used for automatic text summarization was carried out. Algorithms rated as promising on the basis of this analysis were implemented in Java, and the summaries produced with these algorithms were compared in an evaluation.
APA, Harvard, Vancouver, ISO, and other styles
7

Maaloul, Mohamed. "Approche hybride pour le résumé automatique de textes : Application à la langue arabe." Thesis, Aix-Marseille, 2012. http://www.theses.fr/2012AIXM4778.

Full text
Abstract:
This thesis falls within the framework of Natural Language Processing. The problem of automatic summarization of Arabic documents addressed in this thesis crystallizes around two points. The first point concerns the criteria used to decide which essential content to extract. The second point focuses on the means of expressing the extracted essential content in the form of a text targeting the potential needs of a user. In order to show the feasibility of our approach, we developed the "L.A.E" system, based on a hybrid approach which combines a symbolic analysis with numerical processing. The evaluation results are encouraging and prove the performance of the proposed hybrid approach. These results showed, first of all, the applicability of the approach in the context of mono-documents without restriction as to their topic (Education, Sport, Science, Politics, Reportage, etc.), their content, and their volume. They also showed the importance of machine learning in the phase of classifying and selecting the sentences that form the final extract.
APA, Harvard, Vancouver, ISO, and other styles
8

Pokorný, Lubomír. "Metody sumarizace textových dokumentů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236443.

Full text
Abstract:
This thesis deals with one-document summarization of text data. Part of it is devoted to data preparation, mainly to normalization. Some of the stemming algorithms are listed, and a description of lemmatization is also included. The main part is devoted to Luhn's method for summarization and its extension using the WordNet dictionary. The Oswald summarization method is described and applied as well. The designed and implemented application performs automatic generation of abstracts using these methods. A set of experiments was developed which verified the correct functionality of the application and of the extension of Luhn's summarization method.
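Luhn's method, the core of the thesis, scores each sentence by its densest run of "significant" (high-frequency, non-stopword) words. A compact sketch of that scoring step, without the WordNet extension; the stopword list and thresholds are arbitrary choices:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "on", "from"}

def luhn_scores(sentences, top_n=5, max_gap=4):
    """Luhn-style sentence scores: (significant words in window)^2 / window length."""
    words = [w for s in sentences for w in re.findall(r"[a-z']+", s.lower())]
    freq = Counter(w for w in words if w not in STOPWORDS)
    significant = {w for w, _ in freq.most_common(top_n)}

    scores = []
    for s in sentences:
        tokens = re.findall(r"[a-z']+", s.lower())
        positions = [i for i, w in enumerate(tokens) if w in significant]
        if not positions:
            scores.append(0.0)
            continue
        best, start, count = 0.0, positions[0], 1
        for prev, cur in zip(positions, positions[1:]):
            if cur - prev <= max_gap:       # still inside the same word cluster
                count += 1
            else:                           # close the cluster and start a new one
                best = max(best, count ** 2 / (prev - start + 1))
                start, count = cur, 1
        best = max(best, count ** 2 / (positions[-1] - start + 1))
        scores.append(best)
    return scores

sentences = [
    "Automatic summarization selects important sentences from a document.",
    "Frequent content words mark the important parts of the document.",
    "The weather today is sunny.",
]
print(luhn_scores(sentences))
```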
APA, Harvard, Vancouver, ISO, and other styles
9

Hassanlou, Nasrin. "Probabilistic graph summarization." Thesis, 2012. http://hdl.handle.net/1828/4403.

Full text
Abstract:
We study group-summarization of probabilistic graphs that naturally arise in social networks, semistructured data, and other applications. Our proposed framework groups the nodes and edges of the graph based on a user-selected set of node attributes. We present methods to compute useful graph aggregates without the need to create all of the possible graph instances of the original probabilistic graph. Also, we present an algorithm for graph summarization based on pure relational (SQL) technology. We analyze our algorithm and practically evaluate its efficiency using an extended Epinions dataset as well as synthetic datasets. The experimental results show the scalability of our algorithm and its efficiency in producing highly compressed summary graphs in reasonable time.
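The key aggregate in this kind of attribute-based summarization is the expected number of edges between each pair of node groups, i.e., the sum of edge probabilities across group pairs. A small pandas sketch of that grouping, with made-up node attributes and edge probabilities (not the relational/SQL algorithm of the thesis):

```python
import pandas as pd

nodes = pd.DataFrame({
    "node": ["u1", "u2", "u3", "u4"],
    "city": ["Victoria", "Victoria", "Vancouver", "Vancouver"],
})
edges = pd.DataFrame({
    "src":  ["u1", "u1", "u2", "u3"],
    "dst":  ["u2", "u3", "u4", "u4"],
    "prob": [0.9, 0.4, 0.7, 0.2],   # edge existence probabilities
})

# Map each endpoint to its group (the user-selected attribute is "city"), then
# sum edge probabilities per group pair = expected number of inter-group edges.
grp = nodes.set_index("node")["city"]
summary = (edges.assign(src_grp=edges["src"].map(grp), dst_grp=edges["dst"].map(grp))
                .groupby(["src_grp", "dst_grp"], as_index=False)["prob"].sum()
                .rename(columns={"prob": "expected_edges"}))
print(summary)
```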
APA, Harvard, Vancouver, ISO, and other styles
10

Singh, Swati. "Analysis for Text Summarization Algorithms for Different Datasets." Thesis, 2017. http://dspace.dtu.ac.in:8080/jspui/handle/repository/15975.

Full text
Abstract:
With the exponential increase in the data available on the internet for a single domain, it is difficult to understand the gist of a whole document without reading it in full. Automatic text summarization reduces the content of the document by presenting important key points from the data. Extracting the major points from the document is easier and requires less machinery than forming new sentences from the available data. Research in this domain started nearly 50 years ago with identifying key features to rank important sentences in a text document. The main aim of text summarization is to obtain human-quality summaries, which is still a distant dream. Abstractive summarization techniques use a dynamic WordNet corpus to produce coherent and succinct summaries. Automatic text summarization has applications in various domains, including medical research, the legal domain, doctoral research, and documents available on the internet. To serve the need of text summarization, numerous algorithms based on different content selection and features, using different methodologies, have been developed in the last half century. Research that started with single-document summarization has shifted to multi-document summarization in the last few decades in order to save more time by compressing documents from the same domain at once. Here, an analysis is presented of single-document and multi-document summarization algorithms on datasets from different domains.
APA, Harvard, Vancouver, ISO, and other styles
More sources

Book chapters on the topic "SUMMARIZATION ALGORITHMS"

1

Tian, Yuanyuan, and Jignesh M. Patel. "Interactive Graph Summarization." In Link Mining: Models, Algorithms, and Applications. Springer New York, 2010. http://dx.doi.org/10.1007/978-1-4419-6515-8_15.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Javed, Hira, M. M. Sufyan Beg, and Nadeem Akhtar. "Multimodal Summarization: A Concise Review." In Algorithms for Intelligent Systems. Springer Singapore, 2022. http://dx.doi.org/10.1007/978-981-16-6893-7_54.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Varghese, Tiju George, and C. V. Priya. "Automatic Text Summarization: Methods, Metrics and Datasets." In Algorithms for Intelligent Systems. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-99-8398-8_6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Komorowski, Artur, Lucjan Janowski, and Mikołaj Leszczuk. "Evaluation of Multimedia Content Summarization Algorithms." In Cryptology and Network Security. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-98678-4_43.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Zhao, Yu, Songping Huang, Dongsheng Zhou, Zhaoyun Ding, Fei Wang, and Aixin Nian. "CNsum: Automatic Summarization for Chinese News Text." In Wireless Algorithms, Systems, and Applications. Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19214-2_45.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Sharma, Arjun Datt, and Shaleen Deep. "Too Long-Didn’t Read: A Practical Web Based Approach towards Text Summarization." In Applied Algorithms. Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-04126-1_17.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Gokul Amuthan, S., and S. Chitrakala. "CESumm: Semantic Graph-Based Approach for Extractive Text Summarization." In Algorithms for Intelligent Systems. Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-16-3246-4_8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Chen, Chen, Cindy Xide Lin, Matt Fredrikson, Mihai Christodorescu, Xifeng Yan, and Jiawei Han. "Mining Large Information Networks by Graph Summarization." In Link Mining: Models, Algorithms, and Applications. Springer New York, 2010. http://dx.doi.org/10.1007/978-1-4419-6515-8_18.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Tsitovich, Aliaksei, Natasha Sharygina, Christoph M. Wintersteiger, and Daniel Kroening. "Loop Summarization and Termination Analysis." In Tools and Algorithms for the Construction and Analysis of Systems. Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-19835-9_9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Rehman, Tohida, Suchandan Das, Debarshi Kumar Sanyal, and Samiran Chattopadhyay. "An Analysis of Abstractive Text Summarization Using Pre-trained Models." In Algorithms for Intelligent Systems. Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-1657-1_21.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "SUMMARIZATION ALGORITHMS"

1

More, Mukesh, Pallavi Yevale, Abhang Mandwale, Kundan Agrawal, Om Mahale, and Sahilsing Rajput. "Hindi Text Summarization: Using BERT." In 2024 International Conference on Intelligent Algorithms for Computational Intelligence Systems (IACIS). IEEE, 2024. http://dx.doi.org/10.1109/iacis61494.2024.10721619.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Ganguly, Sayam, Sourav Mandal, Nabanita Das, Bikash Sadhukhan, Sagarika Sarkar, and Swagata Paul. "WhisperSum: Unified Audio-to-Text Summarization." In 2024 International Conference on Intelligent Algorithms for Computational Intelligence Systems (IACIS). IEEE, 2024. http://dx.doi.org/10.1109/iacis61494.2024.10721926.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Wang, Kang, Guohua Shen, Zhiqiu Huang, and Xinbo Zhang. "Exploring ChatGPT's code summarization capabilities: an empirical study." In International Conference on Algorithms, High Performance Computing and Artificial Intelligence, edited by Pavel Loskot and Liang Hu. SPIE, 2024. http://dx.doi.org/10.1117/12.3051717.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Sun, Dan, Jacky He, Hanlu Zhang, Zhen Qi, Hongye Zheng, and Xiaokai Wang. "A LongFormer-Based Framework for Accurate and Efficient Medical Text Summarization." In 2025 8th International Conference on Advanced Algorithms and Control Engineering (ICAACE). IEEE, 2025. https://doi.org/10.1109/icaace65325.2025.11019176.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Chen, Yongming. "Data Pyramid for Enterprise News Summarization: A Three-Stage Knowledge-Enhanced Approach." In 2025 8th International Conference on Advanced Algorithms and Control Engineering (ICAACE). IEEE, 2025. https://doi.org/10.1109/icaace65325.2025.11019188.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Nugroho, Galang Setia, Esmeralda Contessa Djamal, and Ridwan Ilyas. "Summarization of Scientific Articles using Hybrid SciBERT and Graph-Based Algorithms." In 2024 11th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI). IEEE, 2024. https://doi.org/10.1109/eecsi63442.2024.10776308.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Chakraborti, Rounak, Romit Banerjee, and Soma Das. "Evaluating the Efficacy of Text Summarization Models: A Comparison of NLP Algorithms." In 2025 8th International Conference on Electronics, Materials Engineering & Nano-Technology (IEMENTech). IEEE, 2025. https://doi.org/10.1109/iementech65115.2025.10959463.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Muhamediyeva, Dilnoz, Nilufar Niyozmatova, Sanjar Ungalov, Mamatov Abduvali, and Turgunova Nafisa. "Evaluating the effectiveness of text summarization algorithms based on recall-oriented understudy for Gisting evaluation metrics." In Fourth International Conference on Digital Technologies, Optics, and Materials Science (DTIEE 2025), edited by Arthur Gibadullin and Khamza Eshankulov. SPIE, 2025. https://doi.org/10.1117/12.3072740.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

R, Sandhya B., Shusank Chaudhary, Basant Pandit, Khem Raj Seth, and Aditya Jha. "Unleashing the potential of Natural Language Processing for News Link Article Summarization: Comparing TF-IDF and Text Rank Algorithms." In 2024 International Conference on Distributed Systems, Computer Networks and Cybersecurity (ICDSCNC). IEEE, 2024. https://doi.org/10.1109/icdscnc62492.2024.10939237.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Ren, Zheng. "Balancing role contributions: a novel approach for role-oriented dialogue summarization." In 4th International Conference on Automation Control, Algorithm and Intelligent Bionics, edited by Jing Na and Shuping He. SPIE, 2024. http://dx.doi.org/10.1117/12.3039616.

Full text
APA, Harvard, Vancouver, ISO, and other styles