Journal articles on the topic 'Summarization'

Author: Grafiati

Published: 4 June 2021

Last updated: 9 June 2025

Consult the top 50 journal articles for your research on the topic 'Summarization.'

1

da Cunha, Iria, Leo Wanner, and Teresa Cabré. "Summarization of specialized discourse." Terminology 13, no. 2 (November 19, 2007): 249–86. http://dx.doi.org/10.1075/term.13.2.07cun.

Abstract:

In this article, we present the current state of our work on a linguistically-motivated model for automatic summarization of medical articles in Spanish. The model takes into account the results of an empirical study which reveals that, on the one hand, domain-specific summarization criteria can often be derived from the summaries of domain specialists, and, on the other hand, adequate summarization strategies must be multidimensional, i.e., cover various types of linguistic clues. We take into account the textual, lexical, discursive, syntactic and communicative dimensions. This is novel in the field of summarization. The experiments carried out so far indicate that our model is suitable to provide high quality summarizations.

2

Sirohi, Neeraj Kumar, Dr Mamta Bansal, and Dr S. N. Rajan Rajan. "Text Summarization Approaches Using Machine Learning & LSTM." Revista Gestão Inovação e Tecnologias 11, no. 4 (September 1, 2021): 5010–26. http://dx.doi.org/10.47059/revistageintec.v11i4.2526.

Abstract:

Social media, the web, and other information-centric applications generate a massive amount of online textual data. To select the vital information from a large text, one must read the full article and produce a summary that does not lose the critical information in the document; this process is called summarization. Text summarization can be done by a human, which requires expertise in the subject area and is tedious and time-consuming, or by a system, which is known as automatic text summarization (ATS) and generates the summary automatically. There are two main categories of automatic text summarization: abstractive and extractive. An extractive summary is produced by picking important, highly ranked sentences and words from the text document, whereas the sentences and words in a summary generated by an abstractive method may not be present in the original text. This article focuses on the different ATS techniques that have been introduced to date. The paper begins with a concise introduction to automatic text summarization, then discusses recent developments in extractive and abstractive text summarization methods, moves on to a literature survey, and finally concludes with a proposed technique that uses an LSTM with an encoder-decoder for abstractive text summarization, along with some directions for future work.
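
The extractive approach described in this abstract (picking highly ranked sentences from the source) can be illustrated with a minimal sketch. The scoring rule below, average word frequency per sentence, is an assumption chosen for brevity and is not taken from the cited paper; the function name `extractive_summary` and the naive regex sentence splitter are likewise hypothetical.

```python
import re
from collections import Counter


def extractive_summary(text: str, num_sentences: int = 2) -> list[str]:
    """Return the highest-scoring sentences of `text`, in document order."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole document (assumed importance signal).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Restore document order for the selected sentences.
    return [sentences[i] for i in sorted(ranked[:num_sentences])]
```

Every returned sentence is copied verbatim from the input, which is the defining property of an extractive summary; an abstractive system would instead generate new wording.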

3

Blekanov, Ivan S., Nikita Tarasov, and Svetlana S. Bodrunova. "Transformer-Based Abstractive Summarization for Reddit and Twitter: Single Posts vs. Comment Pools in Three Languages." Future Internet 14, no. 3 (February 23, 2022): 69. http://dx.doi.org/10.3390/fi14030069.

Abstract:

Abstractive summarization is a technique that allows for extracting condensed meanings from long texts, with a variety of potential practical applications. Nonetheless, today’s abstractive summarization research is limited to testing the models on various types of data, which brings only marginal improvements and does not lead to massive practical employment of the method. In particular, abstractive summarization is not used for social media research, where it would be very useful for opinion and topic mining due to the complications that social media data create for other methods of textual analysis. Of all social media, Reddit is most frequently used for testing new neural models of text summarization on large-scale datasets in English, without further testing on real-world smaller-size data in various languages or from various other platforms. Moreover, for social media, summarizing pools of texts (one-author posts, comment threads, discussion cascades, etc.) may bring crucial results relevant for social studies, which have not yet been tested. However, the existing methods of abstractive summarization are not fine-tuned for social media data and have next-to-never been applied to data from platforms beyond Reddit, nor for comments or non-English user texts. We address these research gaps by fine-tuning the newest Transformer-based neural network models LongFormer and T5 and testing them against BART, and on real-world data from Reddit, with improvements of up to 2%. Then, we apply the best model (fine-tuned T5) to pools of comments from Reddit and assess the similarity of post and comment summarizations. Further, to overcome the 500-token limitation of T5 for analyzing social media pools that are usually bigger, we apply LongFormer Large and T5 Large to pools of tweets from a large-scale discussion on the Charlie Hebdo massacre in three languages and prove that pool summarizations may be used for detecting micro-shifts in agendas of networked discussions. 
Our results show, however, that additional learning is definitely needed for German and French, as the results for these languages are unsatisfactory, and more fine-tuning is needed even in English for Twitter data. Thus, we show that a ‘one-for-all’ neural-network summarization model is still impossible to reach, while fine-tuning for platform affordances works well. We also show that fine-tuned T5 works best for small-scale social media data, but LongFormer is helpful for larger-scale pool summarizations.

4

Pei, Jisheng, and Xiaojun Ye. "Information-Balance-Aware Approximated Summarization of Data Provenance." Scientific Programming 2017 (September 12, 2017): 1–11. http://dx.doi.org/10.1155/2017/4504589.

Abstract:

Extracting useful knowledge from data provenance information has been challenging because provenance information is often overwhelmingly enormous for users to understand. Recently, it has been proposed that we may summarize data provenance items by grouping semantically related provenance annotations so as to achieve concise provenance representation. Users may provide their intended use of the provenance data in terms of provisioning, and the quality of provenance summarization could be optimized for smaller size and closer distance between the provisioning results derived from the summarization and those from the original provenance. However, apart from the intended provisioning use, we notice that more dedicated and diverse user requirements can be expressed and considered in the summarization process by assigning importance weights to provenance elements. Moreover, we introduce information balance index (IBI), an entropy based measurement, to dynamically evaluate the amount of information retained by the summary to check how it suits user requirements. An alternative provenance summarization algorithm that supports manipulation of information balance is presented. Case studies and experiments show that, in summarization process, information balance can be effectively steered towards user-defined goals and requirement-driven variants of the provenance summarizations can be achieved to support a series of interesting scenarios.

5

Li, Chih-Yuan, Soon Ae Chun, and James Geller. "Perspective-Based Microblog Summarization." Information 16, no. 4 (April 1, 2025): 285. https://doi.org/10.3390/info16040285.

Abstract:

Social media allows people to express and share a variety of experiences, opinions, beliefs, interpretations, or viewpoints on a single topic. Summarizing a collection of social media posts (microblogs) on one topic may be challenging and can result in an incoherent summary due to multiple perspectives from different users. We introduce a novel approach to microblog summarization, the Multiple-View Summarization Framework (MVSF), designed to efficiently generate multiple summaries from the same social media dataset depending on chosen perspectives and deliver personalized and fine-grained summaries. The MVSF leverages component-of-perspective computing, which can recognize the perspectives expressed in microblogs, such as sentiments, political orientations, or unreliable opinions (fake news). The perspective computing can filter social media data to summarize them according to specific user-selected perspectives. For the summarization methods, our framework implements three extractive summarization methods: Entity-based, Social Signal-based, and Triple-based. We conduct comparative evaluations of MVSF summarizations against state-of-the-art summarization models, including BertSum, SBert, T5, and Bart-Large-CNN, by using a gold-standard BBC news dataset and Rouge scores. Furthermore, we utilize a dataset of 18,047 tweets about COVID-19 vaccines to demonstrate the applications of MVSF. Our contributions include the innovative approach of using user perspectives in summarization methods as a unified framework, capable of generating multiple summaries that reflect different perspectives, in contrast to prior approaches of generating one-size-fits-all summaries for one dataset. The practical implication of MVSF is that it offers users diverse perspectives from social media data. Our prototype web application is also implemented using ChatGPT to show the feasibility of our approach.

6

M., Nafees Muneera, and P.Sriramya. "Extractive Text Summarization for Social News using Hybrid Techniques in Opinion Mining." International Journal of Engineering and Advanced Technology (IJEAT) 9, no. 3 (February 29, 2020): 2109–15. https://doi.org/10.35940/ijeat.B3356.029320.

Abstract:

At present, almost all enterprises accumulate text data in abundance to enjoy the benefits of big data, but in reality it is not practical to go through all of these documents for decision making because of time constraints. There is therefore an intense need for an approach that summarizes the complete textual content and can serve as an alternative to the actual content. With such summarization approaches, the accuracy of retrieving summarized content via search queries can be improved compared to searching over the broad range of original text. Many text summarization techniques have been formulated, each with its own pros and cons. The present work offers a comprehensive review of extractive text summarization methods for news, also taking into account data that is appended dynamically. It recommends a hybrid text summarization technique that blends CRF (conditional random fields) and LSA (latent semantic analysis), yielding a highly cohesive, low-redundancy summary with coherent and in-depth information. The hybrid technique extracts five types of content: positive and negative, statements, questions, suggestions, and comments. LSA extracts hidden semantic structures within words and sentences and is commonly used in summarization. CRF, a statistical modeling technique, uses ML (machine learning) to provide structured detection and multiple options for evaluating opinion summarization, thereby identifying the most appropriate algorithm for news text summarization given the heavy volume of datasets.

7

Bhatia, Neelima, and Arunima Jaiswal. "Literature Review on Automatic Text Summarization: Single and Multiple Summarizations." International Journal of Computer Applications 117, no. 6 (May 20, 2015): 25–29. http://dx.doi.org/10.5120/20560-2948.

8

Zhang, Qianjin, Dahai Jin, Yawen Wang, and Yunzhan Gong. "Statement-Grained Hierarchy Enhanced Code Summarization." Electronics 13, no. 4 (February 15, 2024): 765. http://dx.doi.org/10.3390/electronics13040765.

Abstract:

Code summarization plays a vital role in aiding developers with program comprehension by generating corresponding textual descriptions for code snippets. While recent approaches have concentrated on encoding the textual and structural characteristics of source code, they often neglect the global hierarchical features, causing limited code representation. Addressing this gap, our paper introduces the statement-grained hierarchy enhanced Transformer model (SHT), a novel framework that integrates global hierarchy, syntax, and token sequences to automatically generate summaries for code snippets. SHT is distinctively designed with two encoders to learn both hierarchical and sequential features of code. One relational attention encoder processes the statement-grained hierarchical graph, producing hierarchical embeddings. Subsequently, another sequence encoder integrates these hierarchical structures with token sequences. The resulting enriched representation is then fed into a vanilla Transformer decoder, which effectively generates concise and informative summarizations. Our extensive experiments demonstrate that SHT significantly outperforms state-of-the-art approaches on two widely used Java benchmarks. This underscores the effectiveness of incorporating global hierarchical information in enhancing the quality of code summarizations.

9

S, Sai Shashank, Sindhu S, Vineeth V, and Pranathi C. "VIDEO SUMMARIZATION." International Research Journal of Computer Science 9, no. 8 (August 13, 2022): 277–80. http://dx.doi.org/10.26562/irjcs.2022.v0908.24.

Abstract:

The general public now has access to a vast amount of multimedia information thanks to recent technological advancements and the quick expansion of consumer electronics, making it challenging to effectively consume video material among the thousands of options accessible. By choosing and presenting the most educational or fascinating materials for users, we provide a method to quickly summarize the content of a lengthy video document. The practice of condensing a raw video into a more manageable form without losing much information is known as video summarizing. Either a comprehensive analysis of the full movie or the local differences between neighboring frames are used to achieve this. The majority of such approaches rely on universal characteristics like color, texture, motion data, etc. Video summaries are evaluated depending on the sort of content they are formed from (object, event, perception, or feature-based) and the functionality made available to the user for consumption (interactive or static, personalized or generic). The suggested system analyses each frame of a video as input before producing a summary. Each frame receives a score that is used to compare it to a threshold value in the final phase. Every frame whose frame score exceeds the threshold is chosen as a key frame and is represented in the final movie summary. This technique enables us to condense video information of various lengths while guaranteeing that the key moments are included. The purpose of video summary is to facilitate quick access, speed up browsing through a sizable video database, and offer a condensed video representation while maintaining the core activities of the original video.
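
The threshold-based key-frame selection described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: frames are represented as flat lists of pixel intensities, and the score is the mean absolute difference from the previous frame. The abstract does not specify the scoring function, so `frame_difference_scores` is a hypothetical stand-in; only the final thresholding step mirrors the description.

```python
def frame_difference_scores(frames: list[list[float]]) -> list[float]:
    """Score each frame by its mean absolute pixel difference from the previous frame.

    Assumed scoring rule for illustration; the first frame has no predecessor
    and is given a score of 0.0.
    """
    if not frames:
        return []
    scores = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        scores.append(diff)
    return scores


def select_key_frames(scores: list[float], threshold: float) -> list[int]:
    """Return the indices of frames whose score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]
```

For example, `select_key_frames(frame_difference_scores(frames), threshold=5.0)` yields the indices of frames that differ strongly from their predecessor; the threshold value itself is a tuning parameter that the abstract leaves unspecified.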

10

Nenkova, Ani. "Automatic Summarization." Foundations and Trends® in Information Retrieval 5, no. 2 (2011): 103–233. http://dx.doi.org/10.1561/1500000015.

11

Larson, Martha. "Automatic Summarization." Foundations and Trends® in Information Retrieval 5, no. 3 (2012): 235–422. http://dx.doi.org/10.1561/1500000020.

12

D, Manju, Radhamani V, Dhanush Kannan A, Kavya B, Sangavi S, and Swetha Srinivasan. "TEXT SUMMARIZATION." YMER Digital 21, no. 07 (July 7, 2022): 173–82. http://dx.doi.org/10.37896/ymer21.07/13.

Abstract:

In the last few years, a huge amount of text data from different sources has been created every day. The enormous volume of data that needs to be processed contains valuable detail, which must be summarized efficiently to serve a purpose. Summarizing and classifying large numbers of documents manually is very tedious, and it becomes cumbersome to develop a summary that takes every semantic aspect into consideration. Automatic text summarization therefore acts as a solution. Text summarization can help in understanding a huge corpus by providing its gist, enabling comprehension in a timely manner. This paper studies the development and deployment of a web application that summarizes a given input text using different models. Keywords: Text summarization, NLP, AWS, Text mining

13

Maña-López, Manuel J., Manuel De Buenaga, and José M. Gómez-Hidalgo. "Multidocument summarization." ACM Transactions on Information Systems 22, no. 2 (April 2004): 215–41. http://dx.doi.org/10.1145/984321.984323.

14

Vikas, A., Pradyumna G.V.N, and Tahir Ahmed Shaik. "Text Summarization." International Journal of Engineering and Computer Science 9, no. 2 (February 3, 2020): 24940–45. http://dx.doi.org/10.18535/ijecs/v9i2.4437.

Abstract:

In this new era, where tremendous amounts of information are available on the internet, it is important to provide an improved mechanism for extracting information quickly and efficiently. It is very difficult for human beings to manually summarize large text documents. Plenty of text material is available on the internet, so there are two problems: searching for relevant documents among the many available, and absorbing the relevant information from them. To solve these two problems, automatic text summarization is necessary. Text summarization is the process of identifying the most important, meaningful information in a document or set of related documents and compressing it into a shorter version that preserves the overall meaning.

15

Balaji, J., T. V. Geetha, and Ranjani Parthasarathi. "Abstractive Summarization." International Journal on Semantic Web and Information Systems 12, no. 2 (April 2016): 76–99. http://dx.doi.org/10.4018/ijswis.2016040104.

Abstract:

Customization of information from web documents is an immense job that mainly involves shortening the original texts. This task is carried out using summarization techniques. In general, an automatically generated summary is of one of two types: extractive or abstractive. Extractive methods use surface-level and statistical features to select important sentences, without considering the meaning conveyed by those sentences. In contrast, abstractive methods need a formal semantic representation, where the selection of important components and the rephrasing of the selected components are carried out using the semantic features associated with the words as well as the context. Furthermore, a deep linguistic analysis is needed for generating summaries. However, the bottleneck of abstractive summarization is that it requires semantic representation, inference rules, and natural language generation. In this paper, the authors propose a semi-supervised bootstrapping approach for identifying the important components for abstractive summarization. The input to the proposed approach is a fully connected semantic graph of a document: semantic graphs are constructed for individual sentences and then connected by synonym concepts and co-referring entities to form a complete semantic graph. The direction of node traversal is determined by a modified spreading activation algorithm, in which the importance of nodes and edges is decided based on the node under consideration and its connected edges. The summary obtained using the proposed approach is compared with extractive and template-based summaries, and is also evaluated using ROUGE scores.
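
For readers unfamiliar with spreading activation, the following is a minimal sketch of the plain, textbook form of the algorithm on a weighted graph. It is not the authors' modified variant, whose details the abstract does not give; the function name, the `decay` and `threshold` parameters, and the adjacency-dictionary graph format are all assumptions made for illustration.

```python
def spread_activation(
    graph: dict[str, dict[str, float]],
    seeds: list[str],
    decay: float = 0.5,
    threshold: float = 0.1,
    max_steps: int = 10,
) -> dict[str, float]:
    """Generic spreading activation.

    graph maps each node to {neighbor: edge weight in [0, 1]}.
    Seed nodes start with activation 1.0; each node fires at most once.
    """
    activation = {node: 0.0 for node in graph}
    for node in seeds:
        activation[node] = 1.0
    frontier, fired = set(seeds), set()
    for _ in range(max_steps):
        # Snapshot so that firing order within a step does not affect the result.
        snapshot = dict(activation)
        next_frontier = set()
        for node in frontier - fired:
            fired.add(node)
            for neighbor, weight in graph.get(node, {}).items():
                boosted = activation.get(neighbor, 0.0) + snapshot[node] * weight * decay
                activation[neighbor] = min(1.0, boosted)
                # Only sufficiently activated, not-yet-fired nodes propagate further.
                if activation[neighbor] >= threshold and neighbor not in fired:
                    next_frontier.add(neighbor)
        if not next_frontier:
            break
        frontier = next_frontier
    return activation
```

In a summarization setting, the nodes with the highest final activation would be candidates for the "important components" of the document graph; how the cited work ranks and rephrases them is beyond what this sketch covers.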

16

Indu Nair, Dr V. "YouTube Summarization." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 05 (May 3, 2025): 1–9. https://doi.org/10.55041/ijsrem46776.

Abstract:

The YouTube Summarizer is an AI-powered web application designed to enhance digital content consumption by providing concise summaries of YouTube videos. Developed using the Next.js framework, the platform integrates state-of-the-art language models such as GPT, Gemini, and LLaMA to generate context-aware summaries from extracted video transcripts. It supports multilingual outputs and offers summary customization, such as video-style or podcast-style formats, tailored to user preferences. The application features a sleek, responsive UI with session history tracking, making it accessible and efficient for students, researchers, and content creators aiming to quickly grasp video content without full viewing. Keywords: YouTube Summarizer, Natural Language Processing, GPT, Gemini, LLaMA, Next.js, AI-based Tool, Multilingual Summarization. This research presents an AI-powered web application that addresses the growing challenge of processing lengthy YouTube videos by generating customizable, context-aware summaries. The tool leverages cutting-edge transformer-based language models (GPT, Gemini, and LLaMA), capable of multilingual processing and summary style adjustments. Key parameters such as transcript quality, model selection, and summarization goals are evaluated and optimized. This approach enhances accessibility, efficiency, and content comprehension for diverse user groups including students, researchers, and educators.

17

Jha, Nitesh Kumar, and Arnab Mitra. "Introducing Word's Importance Level-Based Text Summarization Using Tree Structure." International Journal of Information Retrieval Research 10, no. 1 (January 2020): 13–33. http://dx.doi.org/10.4018/ijirr.2020010102.

Abstract:

Text-summarization plays a significant role in quick knowledge acquisition from any text-based knowledge resource. To enhance the text-summarization process, this article presents a new approach to automatic text-summarization that is based on levels (word importance factors). An equivalent tree is produced from the directed graph during the input text processing with WordNet. Detailed investigations show that the execution time of the proposed automatic text-summarization follows a strictly linear relationship with the volume of input. Further investigation of the performance of the proposed approach indicates its superiority over several other existing text-summarization approaches.

18

D., K. Kanitha, Muhammad Noorul Mubarak D., and A. Shanavas S. "ISSUES IN MALAYALAM TEXT SUMMARIZATION." International Journal of Applied and Advanced Scientific Research 3, no. 1 (March 21, 2018): 201–4. https://doi.org/10.5281/zenodo.1205085.

Abstract:

Text summarization is the process of creating an abridged version of the original text that covers the overall idea of the document. Human summarization requires a lot of time and effort, whereas a summarization system produces a summary within a short span of time. It generates summaries or abstracts of large documents. Many techniques have been developed for summarizing text in various languages. The techniques may be language dependent or independent, and some vary with the discourse structure of the text. Summarization methods can be classified as extractive and abstractive. The abstractive method requires language processing tools; extractive summarization depends on statistical and linguistic tools. This paper concentrates on some of the issues faced in Malayalam text summarization, which make it difficult to create a useful summary.

19

Lucky, Henry, and Derwin Suhartono. "Investigation of Pre-Trained Bidirectional Encoder Representations from Transformers Checkpoints for Indonesian Abstractive Text Summarization." Journal of Information and Communication Technology 21, no. 1 (November 11, 2021): 71–94. http://dx.doi.org/10.32890/jict2022.21.1.4.

Abstract:

Text summarization aims to reduce text by removing less useful information so that information can be obtained quickly and precisely. In Indonesian abstractive text summarization, research mostly focuses on multi-document summarization, whose methods do not work optimally in single-document summarization. Because the public summarization datasets and works in English focus on single-document summarization, this study emphasized Indonesian single-document summarization. Abstractive text summarization studies in English frequently use Bidirectional Encoder Representations from Transformers (BERT), and since an Indonesian BERT checkpoint is available, it was employed in this study. This study investigated the use of Indonesian BERT in abstractive text summarization on the IndoSum dataset using the BERTSum model. The investigation proceeded by using various combinations of model encoders, model embedding sizes, and model decoders. Evaluation results showed that models with a larger embedding size and a Generative Pre-Training (GPT)-like decoder could improve the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score and BERTScore of the model results.
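
Since ROUGE is the evaluation metric named here and in several other abstracts on this page, a minimal sketch of ROUGE-N may help. It counts the n-grams shared by a candidate summary and a reference summary, with clipped counts, and reports recall, precision, and F1. The whitespace tokenizer and the function name `rouge_n` are simplifying assumptions; published evaluations normally rely on an established ROUGE package that adds stemming and other preprocessing, so scores from this sketch are not directly comparable.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict[str, float]:
    """ROUGE-N via clipped n-gram overlap (lowercased, whitespace-tokenized)."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}
```

Recall is measured against the reference, which is why a short candidate that copies part of the reference can reach perfect precision while still scoring low recall.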

20

Hua, Hang, Yunlong Tang, Chenliang Xu, and Jiebo Luo. "V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 4 (April 11, 2025): 3599–607. https://doi.org/10.1609/aaai.v39i4.32374.

Abstract:

Video summarization aims to create short, accurate, and cohesive summaries of longer videos. Despite the existence of various video summarization datasets, a notable limitation is their limited amount of source videos, which hampers the effective training of advanced large vision-language models (VLMs). Additionally, most existing datasets are created for video-to-video summarization, overlooking the contemporary need for multimodal video content summarization. Recent efforts have been made to expand from unimodal to multimodal video summarization, categorizing the task into three sub-tasks based on the summary's modality: video-to-video (V2V), video-to-text (V2T), and a combination of video and text summarization (V2VT). However, the textual summaries in previous multimodal datasets are inadequate. To address these issues, we introduce Instruct-V2Xum, a cross-modal video summarization dataset featuring 30,000 diverse videos sourced from YouTube, with lengths ranging from 40 to 940 seconds and an average summarization ratio of 16.39%. Each video summary in Instruct-V2Xum is paired with a textual summary that references specific frame indexes, facilitating the generation of aligned video and textual summaries. In addition, we propose a new video summarization framework named V2Xum-LLM. V2Xum-LLM, specifically V2Xum-LLaMA in this study, is the first framework that unifies different video summarization tasks into one large language model's (LLM) text decoder and achieves task-controllable video summarization with temporal prompts and task instructions. Experiments show that V2Xum-LLaMA outperforms strong baseline models on multiple video summarization tasks. Furthermore, we propose an enhanced evaluation metric for V2V and V2VT summarization tasks.

21

Parimoo, Rohit, Rohit Sharma, Naleen Gaur, Nimish Jain, and Sweeta Bansal. "Applying Text Rank to Build an Automatic Text Summarization Web Application." International Journal for Research in Applied Science and Engineering Technology 10, no. 4 (April 30, 2022): 865–67. http://dx.doi.org/10.22214/ijraset.2022.40766.

Abstract:

Automatic Text Summarization is one of the most trending research areas in the field of Natural Language Processing. The main aim of text summarization is to reduce the size of a text without losing any important information. Various techniques can be used for automatic summarization of text. In this paper we focus on the automatic summarization of text using graph-based methods. In particular, we discuss the implementation of a general-purpose web application which performs automatic summarization on the text entered using the Text Rank Algorithm. Summarization of text using graph-based approaches involves pre-processing and cleansing of text, tokenizing the sentences present in the text, representing the tokenized text in the form of numerical vectors, creating a similarity matrix which shows the semantic similarity between different sentences present in the text, representing the similarity matrix as a graph, scoring and ranking the sentences and extracting the summary. Keywords: Text Summarization, Unsupervised Learning, Text Rank, Page Rank, Web Application, Graph Based Summarization, Extractive Summarization
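
The graph-based pipeline listed in this abstract (tokenize sentences, vectorize, build a similarity matrix, rank with a PageRank-style iteration, extract the top sentences) can be sketched in plain Python. This is an illustration under assumptions, not the paper's implementation: bag-of-words cosine similarity stands in for the unspecified "semantic similarity", and the function name, damping factor, and iteration count are hypothetical choices.

```python
import math
import re
from collections import Counter


def textrank_summary(text: str, num_sentences: int = 2,
                     damping: float = 0.85, iterations: int = 50) -> list[str]:
    """Rank sentences with a PageRank-style iteration over a similarity graph."""
    # 1. Tokenize into sentences (naive split) and build bag-of-words vectors.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    n = len(sentences)
    if n == 0:
        return []
    vectors = [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[w] * b[w] for w in a if w in b)
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    # 2. Similarity matrix = weighted adjacency matrix of the sentence graph.
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out_weight = [sum(row) for row in sim]

    # 3. Power iteration: each sentence receives score from similar sentences.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] * scores[j] / out_weight[j]
                            for j in range(n) if out_weight[j])
            for i in range(n)
        ]

    # 4. Extract the top-ranked sentences, restored to document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]
```

A sentence that shares no words with the rest of the text has no edges in the graph and keeps only the baseline score, so it is the first to be dropped from the summary.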

22

ÖZKAN, Adem. "Özetleme Tekniğinin Dil Öğretiminde Kullanımı Üzerine Kapsamli Bir İnceleme ve PQRST Tekniği" [A Comprehensive Review of the Use of the Summarization Technique in Language Teaching and the PQRST Technique]. International Journal of Social Sciences 9, no. 38 (March 21, 2025): 188–223. https://doi.org/10.52096/usbd.9.38.11.

Abstract:

This study examines the importance of teaching summarization techniques and different application strategies. Summarization is a significant method that enables students to comprehend texts and enhance their written communication skills. The research focuses on how summarization strategies can be effectively taught to students and implemented in instructional processes. Techniques such as generalization, deletion, and restructuring are used to help students make texts shorter, more concise, and easier to understand. Additionally, the study emphasizes that summarization techniques can enhance not only students' language skills but also their critical thinking abilities. The PQRST technique involves making summarization a structured part of the lesson. Developing summarization skills is highlighted as contributing to the improvement of students' speaking, understanding, and storytelling abilities. For educators, the effective use of summarization methods in the classroom is highlighted as benefiting students both individually and in group work. Key Words: Summarization technique, PQRST, speaking, listening, writing.

23

Peronikolis, Michail, and Costas Panagiotakis. "Personalized Video Summarization: A Comprehensive Survey of Methods and Datasets." Applied Sciences 14, no. 11 (May 22, 2024): 4400. http://dx.doi.org/10.3390/app14114400.

Abstract:

In recent years, the scientific and technological developments have led to an explosion of available videos on the web, increasing the necessity of fast and effective video analysis and summarization. Video summarization methods aim to generate a synopsis by selecting the most informative parts of the video content. The user’s personal preferences, often involved in the expected results, should be taken into account in the video summaries. In this paper, we provide the first comprehensive survey on personalized video summarization relevant to the techniques and datasets used. In this context, we classify and review personalized video summary techniques based on the type of personalized summary, on the criteria, on the video domain, on the source of information, on the time of summarization, and on the machine learning technique. Depending on the type of methodology used by the personalized video summarization techniques for the summary production process, we classify the techniques into five major categories, which are feature-based video summarization, keyframe selection, shot selection-based approach, video summarization using trajectory analysis, and personalized video summarization using clustering. We also compare personalized video summarization methods and present 37 datasets used to evaluate personalized video summarization methods. Finally, we analyze opportunities and challenges in the field and suggest innovative research lines.

24

Diedrichsen, Elke. "Linguistic challenges in automatic summarization technology." Journal of Computer-Assisted Linguistic Research 1, no. 1 (June 26, 2017): 40. http://dx.doi.org/10.4995/jclr.2017.7787.

Abstract:

Automatic summarization is a field of Natural Language Processing that is increasingly used in industry today. The goal of the summarization process is to create a summary of one document or a multiplicity of documents that will retain the sense and the most important aspects while reducing the length considerably, to a size that may be user-defined. One differentiates between extraction-based and abstraction-based summarization. In an extraction-based system, the words and sentences are copied out of the original source without any modification. An abstraction-based summary can compress, fuse or paraphrase sections of the source document. As of today, most summarization systems are extractive. Automatic document summarization technology presents interesting challenges for Natural Language Processing. It works on the basis of coreference resolution, discourse analysis, named entity recognition (NER), information extraction (IE), natural language understanding, topic segmentation and recognition, word segmentation and part-of-speech tagging. This study will overview some current approaches to the implementation of auto summarization technology and discuss the state of the art of the most important NLP tasks involved in them. We will pay particular attention to current methods of sentence extraction and compression for single and multi-document summarization, as these applications are based on theories of syntax and discourse and their implementation therefore requires a solid background in linguistics. Summarization technologies are also used for image collection summarization and video summarization, but the scope of this paper will be limited to document summarization.

25

Howlader, Prottyee, Prapti Paul, Meghana Madavi, Laxmi Bewoor, and V. S. Deshpande. "Fine Tuning Transformer Based BERT Model for Generating the Automatic Book Summary." International Journal on Recent and Innovation Trends in Computing and Communication 10, no. 1s (December 15, 2022): 347–52. http://dx.doi.org/10.17762/ijritcc.v10i1s.5902.

Abstract:

Most text summarization research focuses on summarizing short documents, and very little work addresses long-document summarization. Additionally, extractive summarization has received more attention than abstractive summarization. Abstractive summarization, unlike extractive summarization, does not merely copy essential words from the original text but requires paraphrasing to get close to a human-generated summary. Machine learning and deep learning models are being adapted to contemporary pre-trained models such as transformers. Transformer-based language models are gaining a lot of attention because of their self-supervised training and fine-tuning for downstream Natural Language Processing (NLP) tasks such as text summarization. The proposed work is an attempt to investigate the use of transformers for abstractive summarization. It is tested on books, as long documents, to evaluate the performance of the model.

26

Tahseen, Rabia, Uzma Omer, Muhammad Shoaib Farooq, and Faiqa Adnan. "Text Summarization Techniques Using Natural Language Processing: A Systematic Literature Review." VFAST Transactions on Software Engineering 9, no. 4 (December 31, 2021): 102–8. http://dx.doi.org/10.21015/vtse.v9i4.856.

Abstract:

In recent years, data has been growing rapidly in almost every domain. Because of this excess of data, there is a need for automatic text summarizers that summarize long and numerical data, especially textual data, without losing content. Text summarization has been researched for decades, and researchers have used different summarization methods based on natural language processing, combining various algorithms. This paper presents a systematic literature review that surveys text summarization methods and reports the accuracy of these methods. The paper first introduces some concepts of extractive and abstractive text summarization and describes how deep learning models can be used to improve text summarization. It aims to identify the current utilization of text summarization in different application domains. Different methodologies for text summarization are discussed. To carry out this SLR, twenty-four published articles were carefully chosen from this domain. Moreover, the paper discusses issues and challenges investigated in different application domains using text summarization methods. Lastly, the existing work of different researchers is presented for further discussion.

27

R., Priyadharshini, and Marikkannan M. "A Study Analysis based on Text Summarization Methods." Journal of Advancement in Parallel Computing 4, no. 1 (May 26, 2021): 1–6. https://doi.org/10.5281/zenodo.4802978.

Abstract:

In today's busy world, everybody wants things that are simple, easy to use, and ready to serve. The same expectation grows day by day in computing, where the concept of text summarization attracts considerable attention. Text summarization is a crucial and timely tool for reducing text data. This paper presents the main idea of text summarization, explaining what text summarization is and describing the techniques currently used in this field. Text summarization reduces a huge amount of data to a simple text document. The paper overviews the methods used in earlier years for text summarization and looks at the principal ideas behind them with a view to obtaining the best outcome.

28

Kundan Chaudhari, Raj Mahale, Fardeen Khan, Shradha Gaikwad, and Vita Jadhav. "Comprehensive Survey of Abstractive Text Summarization Techniques." International Research Journal on Advanced Engineering and Management (IRJAEM) 6, no. 07 (July 15, 2024): 2217–31. http://dx.doi.org/10.47392/irjaem.2024.0323.

Abstract:

Text summarization using pre-trained encoders has become a crucial technique for efficiently managing large volumes of text data. The rise of automatic summarization systems addresses the need to process ever-increasing data while meeting user-specific requirements. Recent scientific research highlights significant advancements in abstractive summarization, with a particular focus on neural network-based methods. A detailed review of various neural network models for abstractive summarization identifies five key components essential to their design: encoder-decoder architecture, mechanisms, training strategies and optimization algorithms, dataset selection, and evaluation metrics. Each of these elements is pivotal in enhancing the summarization process. This study aims to provide a thorough understanding of the latest developments in neural network-based abstractive summarization models, offering insights into the evolving field and underscoring the associated challenges. Qualitative analysis using a concept matrix reveals common design trends in contemporary neural abstractive summarization systems. Notably, BERT-based encoder-decoder models have emerged as leading innovations, representing the most recent progress in the field. Based on the insights from this review, the study recommends integrating pre-trained language models with neural network techniques to achieve optimal performance in abstractive summarization tasks. As the volume of online information continues to surge, the field of automatic text summarization has garnered significant attention within the Natural Language Processing (NLP) community. Spanning over five decades, researchers have approached this problem from diverse angles, exploring various domains and employing a multitude of paradigms. 
This survey aims to delve into some of the most pertinent methodologies, focusing on both single-document and multiple-document summarization techniques, with a particular emphasis on empirical methods and extractive approaches. Additionally, the survey explores promising strategies that target specific intricacies of the summarization task. Notably, considerable attention is dedicated to the automatic evaluation of summarization systems, recognizing its pivotal role in guiding future research endeavors.

29

Thomas, Sinnu Susan, Sumana Gupta, and Venkatesh K. Subramanian. "Perceptual Video Summarization—A New Framework for Video Summarization." IEEE Transactions on Circuits and Systems for Video Technology 27, no. 8 (August 2017): 1790–802. http://dx.doi.org/10.1109/tcsvt.2016.2556558.

30

Rahamat Basha, S., J. Keziya Rani, and J. J. C. Prasad Yadav. "A Novel Summarization-based Approach for Feature Reduction Enhancing Text Classification Accuracy." Engineering, Technology & Applied Science Research 9, no. 6 (December 1, 2019): 5001–5. http://dx.doi.org/10.48084/etasr.3173.

Abstract:

Automatic summarization is the process of shortening one (in single document summarization) or multiple documents (in multi-document summarization). In this paper, a new feature selection method for the nearest neighbor classifier by summarizing the original training documents based on sentence importance measure is proposed. Our approach for single document summarization uses two measures for sentence similarity: the frequency of the terms in one sentence and the similarity of that sentence to other sentences. All sentences were ranked accordingly and the sentences with top ranks (with a threshold constraint) were selected for summarization. The summary of every document in the corpus is taken into a new document used for the summarization evaluation process.

31

Rahamat, Basha S., Rani J. Keziya, and Yadav J. J. C. Prasad. "A Novel Summarization-based Approach for Feature Reduction Enhancing Text Classification Accuracy." Engineering, Technology & Applied Science Research 9, no. 6 (December 1, 2019): 5001–5. https://doi.org/10.5281/zenodo.3566535.

Abstract:

Automatic summarization is the process of shortening one (in single document summarization) or multiple documents (in multi-document summarization). In this paper, a new feature selection method for the nearest neighbor classifier by summarizing the original training documents based on sentence importance measure is proposed. Our approach for single document summarization uses two measures for sentence similarity: the frequency of the terms in one sentence and the similarity of that sentence to other sentences. All sentences were ranked accordingly and the sentences with top ranks (with a threshold constraint) were selected for summarization. The summary of every document in the corpus is taken into a new document used for the summarization evaluation process.

32

Karunamurthy, Dr A., R. Ramakrishnan, J. Nivetha, and S. Varsha. "Auto Synopsis: An Intelligent Web-Based Application for Automating Content Summarization Using Advanced NLP Techniques." International Scientific Journal of Engineering and Management 03, no. 12 (December 19, 2024): 1–6. https://doi.org/10.55041/isjem02157.

Abstract:

Auto Synopsis introduces an efficient web-based application designed to automate text summarization using advanced natural language processing (NLP) techniques. Built with Flask, the system extracts and processes textual content, transforming it into concise, meaningful summaries. The text undergoes preprocessing steps, including tokenization, lemmatization, and stemming, to prepare it for analysis. Auto Synopsis supports both extractive and abstractive summarization. Extractive summarization selects and extracts important sentences or segments from the original text, while abstractive summarization generates new sentences that convey core ideas in a more natural, human-like form. For smaller documents, a sentence similarity approach using cosine distance ranks sentences based on relevance. For larger documents, the PageRank algorithm evaluates sentence importance to select the most significant content. Auto Synopsis features a secure user authentication system, allowing individuals to create accounts, log in, and access personalized summaries. Designed for students, researchers, and professionals, this tool aims to streamline the summarization process, helping users quickly extract essential information from lengthy text. By reducing reading time and enhancing productivity, Auto Synopsis provides an invaluable solution for efficiently processing large volumes of information, ensuring that users gain quick and meaningful insights from complex documents. Keywords: Text Summarization, Automatic Summarization, Extractive Summarization, Abstractive Summarization, Natural Language Processing, Flask Web Application, PageRank Algorithm

33

Kopeć, Mateusz. "Three-step coreference-based summarizer for Polish news texts." Poznan Studies in Contemporary Linguistics 55, no. 2 (June 26, 2019): 397–443. http://dx.doi.org/10.1515/psicl-2019-0015.

Abstract:

This article addresses the problem of automatic summarization of press articles in Polish. The main novelty of this research lies in the proposal of a three-step summarization algorithm which benefits from using coreference information. In the related work section, all coreference-based approaches to summarization are presented. Then we describe in detail all publicly available summarization tools developed for the Polish language. We state the problem of single-document press article summarization for Polish, describing the training and evaluation dataset: the POLISH SUMMARIES CORPUS. Next, a new coreference-based extractive summarization system, NICOLAS, is introduced. Its algorithm utilises advanced third-party preprocessing tools to extract the coreference information from the text to be summarized. This information is transformed into a complex set of features related to coreference concepts (mentions and coreference clusters) that are used for training the summarization system (on the basis of a manually prepared gold summaries corpus). The proposed solution is compared to the best publicly available summarization systems for the Polish language and to two state-of-the-art tools developed for English but adapted to Polish for this article. The NICOLAS summarization system obtains the best scores, for selected metrics outperforming other systems in a statistically significant way. The evaluation also includes the calculation of interesting upper bounds: human performance and a theoretical upper bound.

34

Chang, Hsien-Tsung, Shu-Wei Liu, and Nilamadhab Mishra. "A tracking and summarization system for online Chinese news topics." Aslib Journal of Information Management 67, no. 6 (November 16, 2015): 687–99. http://dx.doi.org/10.1108/ajim-10-2014-0147.

Abstract:

Purpose – The purpose of this paper is to design and implement new tracking and summarization algorithms for Chinese news content. Based on the proposed methods and algorithms, the authors extract the important sentences that are contained in topic stories and list those sentences according to timestamp order to ensure ease of understanding and to visualize multiple news stories on a single screen. Design/methodology/approach – This paper encompasses an investigational approach that implements a new Dynamic Centroid Summarization algorithm in addition to a Term Frequency (TF)-Density algorithm to empirically compute three target parameters, i.e., recall, precision, and F-measure. Findings – The proposed TF-Density algorithm is implemented and compared with the well-known algorithms Term Frequency-Inverse Word Frequency (TF-IWF) and Term Frequency-Inverse Document Frequency (TF-IDF). Three test data sets are configured from Chinese news web sites for use during the investigation, and two important findings are obtained that help the authors provide more precision and efficiency when recognizing the important words in the text. First, the authors evaluate three topic tracking algorithms, i.e., TF-Density, TF-IDF, and TF-IWF, with the said target parameters and find that the recall, precision, and F-measure of the proposed TF-Density algorithm is better than those of the TF-IWF and TF-IDF algorithms. In the context of the second finding, the authors implement a blind test approach to obtain the results of topic summarizations and find that the proposed Dynamic Centroid Summarization process can more accurately select topic sentences than the LexRank process. Research limitations/implications – The results show that the tracking and summarization algorithms for news topics can provide more precise and convenient results for users tracking the news. 
The analysis and implications are limited to Chinese news content from Chinese news web sites such as Apple Library, UDN, and well-known portals like Yahoo and Google. Originality/value – The research provides an empirical analysis of Chinese news content through the proposed TF-Density and Dynamic Centroid Summarization algorithms. It focusses on improving the means of summarizing a set of news stories to appear for browsing on a single screen and carries implications for innovative word measurements in practice.
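
The three target parameters named in this abstract (recall, precision, and F-measure) follow standard definitions, sketched below for a set of selected items compared against a gold-standard set. The function name `precision_recall_f1` and the example identifiers are hypothetical; the paper's own evaluation setup is not described in enough detail here to reproduce it.

```python
def precision_recall_f1(predicted, relevant):
    """Compute precision, recall, and F-measure for two collections of items.

    predicted: items the system selected (e.g. words or sentences it flagged)
    relevant:  items a human annotator marked as correct
    """
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: the system picks four sentences, two of which are correct.
p, r, f = precision_recall_f1({"s1", "s2", "s3", "s4"}, {"s2", "s4", "s5"})
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

With two of four selections correct and two of three gold items found, precision is 1/2, recall is 2/3, and the F-measure is 4/7.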

35

ber, Bam, and Micah Jason. "News Filtering and Summarization System Architecture for Recognition and Summarization of News Pages." Bonfring International Journal of Data Mining 7, no. 2 (May 31, 2017): 11–15. http://dx.doi.org/10.9756/bijdm.8339.

36

Riahi Samani, Zahra, and Mohsen Ebrahimi Moghaddam. "Image Collection Summarization Method Based on Semantic Hierarchies." AI 1, no. 2 (May 18, 2020): 209–28. http://dx.doi.org/10.3390/ai1020014.

Abstract:

The size of internet image collections is increasing drastically. As a result, new techniques are required to facilitate users in browsing, navigation, and summarization of these large volume collections. Image collection summarization methods present users with a set of exemplar images as the most representative ones from the initial image collection. In this study, an image collection summarization technique was introduced according to semantic hierarchies among them. In the proposed approach, images were mapped to the nodes of a pre-defined domain ontology. In this way, a semantic hierarchical classifier was used, which finally mapped images to different nodes of the ontology. We made a compromise between the degree of freedom of the classifier and the goodness of the summarization method. The summarization was done using a group of high-level features that provided a semantic measurement of information in images. Experimental outcomes indicated that the introduced image collection summarization method outperformed the recent techniques for the summarization of image collections.

37

Pivovarov, Rimma, and Noémie Elhadad. "Automated methods for the summarization of electronic health records." Journal of the American Medical Informatics Association 22, no. 5 (April 15, 2015): 938–47. http://dx.doi.org/10.1093/jamia/ocv032.

Abstract:

Objectives: This review examines work on automated summarization of electronic health record (EHR) data and in particular, individual patient record summarization. We organize the published research and highlight methodological challenges in the area of EHR summarization implementation. Target audience: The target audience for this review includes researchers, designers, and informaticians who are concerned about the problem of information overload in the clinical setting as well as both users and developers of clinical summarization systems. Scope: Automated summarization has been a long-studied subject in the fields of natural language processing and human–computer interaction, but the translation of summarization and visualization methods to the complexity of the clinical workflow is slow moving. We assess work in aggregating and visualizing patient information with a particular focus on methods for detecting and removing redundancy, describing temporality, determining salience, accounting for missing data, and taking advantage of encoded clinical knowledge. We identify and discuss open challenges critical to the implementation and use of robust EHR summarization systems.

38

Ahuir, Vicent, José-Ángel González, Lluís-F. Hurtado, and Encarna Segarra. "Abstractive Summarizers Become Emotional on News Summarization." Applied Sciences 14, no. 2 (January 15, 2024): 713. http://dx.doi.org/10.3390/app14020713.

Abstract:

Emotions are central to understanding contemporary journalism; however, they are overlooked in automatic news summarization. Summaries are an entry point to the source article and could favor some emotions to captivate the reader. Nevertheless, the emotional content of summarization corpora and the emotional behavior of summarization models are still unexplored. In this work, we explore the usage of established methodologies to study the emotional content of summarization corpora and the emotional behavior of summarization models. Using these methodologies, we study the emotional content of two widely used summarization corpora, CNN/DailyMail and XSum, and the capabilities of three state-of-the-art transformer-based abstractive systems for eliciting emotions in the generated summaries: BART, PEGASUS, and T5. The main findings are as follows: (i) emotions are persistent in the two summarization corpora, (ii) summarizers approximate the emotions of the reference summaries moderately well, and (iii) more than 75% of the emotions introduced by novel words in generated summaries are present in the reference ones. The combined use of these methodologies has allowed us to conduct a satisfactory study of the emotional content in news summarization.

39

Sanchez-Gomez, Jesús Manuel, Miguel Ángel Vega-Rodríguez, and Carlos Javier Pérez Sánchez. "Automatic update summarization by a multi-objective number-one-selection genetic approach." IEEE Transactions on Cybernetics 53, no. 12 (June 7, 2023): 7443–54. https://doi.org/10.1109/TCYB.2022.3223163.

Abstract:

Currently, the explosive growth of the information available on the internet makes automatic text summarization systems increasingly important. A particularly relevant challenge is the update summarization task. Update summarization differs from traditional summarization in its dynamic nature. While traditional summarization is static, i.e., the document collections about a specific topic remain unchanged, update summarization addresses dynamic document collections based on a specific topic. Therefore, update summarization consists of summarizing the new document collection under the assumption that the user has already read a previous summary and only the new information is of interest. The Multi-Objective Number-One selection Genetic Algorithm (MONOGA) has been designed and implemented to address this problem. The proposed algorithm produces a summary that is relevant to the user's given query and that also contains update information. Experiments were conducted on Text Analysis Conference (TAC) datasets, and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics were considered to assess the model performance. The results obtained by the proposed approach outperform those from the existing approaches in the scientific literature, obtaining average percentage improvements between 12.74% and 55.03% in the ROUGE scores.

40

Zhu, Junnan, Lu Xiang, Yu Zhou, Jiajun Zhang, and Chengqing Zong. "Graph-based Multimodal Ranking Models for Multimodal Summarization." ACM Transactions on Asian and Low-Resource Language Information Processing 20, no. 4 (May 26, 2021): 1–21. http://dx.doi.org/10.1145/3445794.

Abstract:

Multimodal summarization aims to extract the most important information from multimedia input. It is becoming increasingly popular due to the rapid growth of multimedia data in recent years. Various studies focus on different multimodal summarization tasks. However, the existing methods can only generate single-modal output or multimodal output. In addition, most of them need a lot of annotated samples for training, which makes them difficult to generalize to other tasks or domains. Motivated by this, we propose a unified framework for multimodal summarization that can cover both single-modal output summarization and multimodal output summarization. In our framework, we consider three different scenarios and propose the respective unsupervised graph-based multimodal summarization models without the requirement of any manually annotated document-summary pairs for training: (1) generic multimodal ranking, (2) modal-dominated multimodal ranking, and (3) non-redundant text-image multimodal ranking. Furthermore, an image-text similarity estimation model is introduced to measure the semantic similarity between image and text. Experiments show that our proposed models outperform the single-modal summarization methods on both automatic and human evaluation metrics. Besides, our models can also improve single-modal summarization with the guidance of the multimedia information. This study can serve as a benchmark for further study on the multimodal summarization task.

41

Zhang, Mengli, Gang Zhou, Wanting Yu, Ningbo Huang, and Wenfen Liu. "A Comprehensive Survey of Abstractive Text Summarization Based on Deep Learning." Computational Intelligence and Neuroscience 2022 (August 1, 2022): 1–21. http://dx.doi.org/10.1155/2022/7132226.

Abstract:

With the rapid development of the Internet, the massive amount of web textual data has grown exponentially, which has brought considerable challenges to downstream tasks, such as document management, text classification, and information retrieval. Automatic text summarization (ATS) is becoming an extremely important means to solve this problem. The core of ATS is to mine the gist of the original text and automatically generate a concise and readable summary. Recently, to better balance and develop these two aspects, deep learning (DL)-based abstractive summarization models have been developed. At present, for ATS tasks, almost all state-of-the-art (SOTA) models are based on DL architecture. However, a comprehensive literature survey is still lacking in the field of DL-based abstractive text summarization. To fill this gap, this paper provides researchers with a comprehensive survey of DL-based abstractive summarization. We first give an overview of abstractive summarization and DL. Then, we summarize several typical frameworks of abstractive summarization. After that, we also give a comparison of several popular datasets that are commonly used for training, validation, and testing. We further analyze the performance of several typical abstractive summarization systems on common datasets. Finally, we highlight some open challenges in the abstractive summarization task and outline some future research trends. We hope that these explorations will provide researchers with new insights into DL-based abstractive summarization.

42

Kartamanah, Fatih Fauzan, Aldy Rialdy Atmadja, and Ichsan Budiman. "Analyzing PEGASUS Model Performance with ROUGE on Indonesian News Summarization." sinkron 9, no. 1 (January 6, 2025): 31–42. https://doi.org/10.33395/sinkron.v9i1.14303.

Abstract:

Text summarization technology has been rapidly advancing, playing a vital role in improving information accessibility and reducing reading time within Natural Language Processing (NLP) research. There are two primary approaches to text summarization: extractive and abstractive. Extractive methods focus on selecting key sentences or phrases directly from the source text, while abstractive summarization generates new sentences that capture the essence of the content. Abstractive summarization, although more flexible, poses greater challenges in maintaining coherence and contextual relevance due to its complexity. This study aims to enhance automated abstractive summarization for Indonesian-language online news articles by employing the PEGASUS (Pre-training with Extracted Gap-sentences Sequences for Abstractive Summarization) model, which leverages an encoder-decoder architecture optimized for summarization tasks. The dataset utilized consists of 193,883 articles from Liputan6, a prominent Indonesian news platform. The model was fine-tuned and evaluated using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric, focusing on F-1 scores for ROUGE-1, ROUGE-2, and ROUGE-L. The results demonstrated the model's ability to generate coherent and informative summaries, achieving ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.439, 0.183, and 0.406, respectively. These findings underscore the potential of the PEGASUS model in addressing the challenges of abstractive summarization for low-resource languages such as Indonesian, offering a significant contribution to summarization quality for online news content.
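
To make the reported F-1 scores for ROUGE-1, ROUGE-2, and ROUGE-L concrete, the sketch below computes simplified versions of those metrics for one candidate summary against one reference. It assumes lowercased whitespace tokenization and no stemming, so its numbers will not match an official ROUGE toolkit exactly; the function names `rouge_n` and `rouge_l` are illustrative, not a library API.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(matches, candidate_total, reference_total):
    """Harmonic mean of precision (vs. candidate) and recall (vs. reference)."""
    if matches == 0:
        return 0.0
    precision = matches / candidate_total
    recall = matches / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N F1: clipped n-gram overlap between two strings."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Simplified sentence-level ROUGE-L F1 based on the LCS length."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


candidate = "the cat sat on the mat"
reference = "the cat lay on the mat"
print(rouge_n(candidate, reference, 1), rouge_n(candidate, reference, 2),
      rouge_l(candidate, reference))
```

For this toy pair, five of six unigrams and three of five bigrams overlap, and the longest common subsequence has five tokens, so ROUGE-2 is lower than ROUGE-1 and ROUGE-L. The scores in the abstract show the same ordering.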

43

D., K. Kanitha, Muhammad Noorul Mubarak D., and A. Shanavas S. "COMPARISON OF TEXT SUMMARIZER IN INDIAN LANGUAGES." International Journal of Advanced Trends in Engineering and Technology 3, no. 1 (March 21, 2018): 79–82. https://doi.org/10.5281/zenodo.1205087.

Abstract:

Text summarization is the process of extracting the relevant information from a source text while keeping its significant information. There are two main types of text summarization methods: abstractive and extractive. Extractive summarization ranks all sentences and selects the highest-scoring ones as the summary. Abstractive summarization interprets the content of a document and restates it in a few words. This paper discusses the text summarization methods applied to Indian languages. The existing algorithms are explained, and their merits and demerits are discussed. The paper also investigates which method is suitable for summarizing documents in Indian languages.

44

Bai, Jiyang, and Peixiang Zhao. "Poligras: Policy-Based Graph Summarization." Proceedings of the VLDB Endowment 17, no. 10 (June 2024): 2432–44. http://dx.doi.org/10.14778/3675034.3675037.

Abstract:

Large graphs are ubiquitous. Their sizes, rates of growth, and complexity, however, have significantly outpaced human capabilities to ingest and make sense of them. As a cost-effective graph simplification technique, graph summarization is aimed to reduce large graphs into concise, structure-preserving, and quality-enhanced summaries readily available for efficient graph storage, processing, and visualization. Concretely, given a graph G, graph summarization condenses G into a succinct representation comprising (1) a supergraph with supernodes representing disjoint sets of vertices of G and superedges depicting aggregate-level connections between supernodes, and (2) a set of correction edges that help reconstruct G losslessly from the supergraph. Existing graph summarization solutions offer non-optimal graph summaries and are time-demanding in real-world large graphs. In this paper, we propose a learning-enhanced graph summarization approach, Poligras (Policy-based graph summarization), to model the most critical computational component in graph summarization: supernode selection and merging. Specifically, we design a probabilistic policy learned and optimized by neural networks for efficient optimal supernode pair selection. As the first learning-enhanced, scalable graph summarization method, Poligras achieves significantly improved performance over state-of-the-art graph summarization solutions in real-world large graphs.
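
The supergraph-plus-corrections representation described in this abstract can be sketched as follows. This is only an illustration of the lossless encoding, not Poligras itself: the supernode partition is taken as given (the paper's contribution is learning how to choose and merge supernodes), and the rule "add a superedge when more than half of the possible vertex pairs are connected" is an assumed heuristic. All function names are hypothetical.

```python
from itertools import combinations


def norm(u, v):
    """Store an undirected edge with its endpoints in sorted order."""
    return (u, v) if u <= v else (v, u)


def candidate_pairs(members_a, members_b):
    """All vertex pairs a superedge between two supernodes would imply."""
    if members_a == members_b:
        return {norm(u, v) for u, v in combinations(members_a, 2)}
    return {norm(u, v) for u in members_a for v in members_b}


def summarize(edges, supernodes):
    """Encode edges as superedges plus correction edges to add or remove.

    supernodes maps a supernode name to a set of vertices; the sets are
    assumed to be disjoint and to cover every vertex of the graph.
    """
    edges = {norm(u, v) for u, v in edges}
    names = sorted(supernodes)
    superedges, to_add, to_remove = set(), set(), set()
    for i, a in enumerate(names):
        for b in names[i:]:
            possible = candidate_pairs(supernodes[a], supernodes[b])
            present = possible & edges
            if 2 * len(present) > len(possible):  # assumed density rule
                superedges.add((a, b))
                to_remove |= possible - present   # implied but absent edges
            else:
                to_add |= present                 # edges the supergraph misses
    return superedges, to_add, to_remove


def reconstruct(supernodes, superedges, to_add, to_remove):
    """Recover the original edge set from the summary."""
    edges = set()
    for a, b in superedges:
        edges |= candidate_pairs(supernodes[a], supernodes[b])
    return (edges - to_remove) | to_add


graph = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
partition = {"A": {1, 2, 3}, "B": {4, 5}}
summary = summarize(graph, partition)
print(summary)
print(reconstruct(partition, *summary) == {norm(u, v) for u, v in graph})
```

Because every vertex pair falls under exactly one pair of supernodes, the correction sets record precisely where the supergraph over- or under-states the original graph, which is what makes the reconstruction lossless.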

45

Jain, Rekha, Linesh Raja, Sandeep Kumar Sharma, and Devershi Pallavi Bhatt. "Particle swarm optimization model for Hindi text summarization." Journal of Information and Optimization Sciences 45, no. 4 (2024): 839–50. http://dx.doi.org/10.47974/jios-1609.

Abstract:

Text summarization shortens the original text without losing its information or meaning. Many algorithms exist for text summarization, and two approaches are commonly used: abstractive and extractive. In abstractive text summarization, the entire document is regenerated in a few lines, whereas in extractive text summarization, sentences are filtered based on ranks assigned to them by a specific algorithm. A lot of work has already been done in languages such as English and Chinese. In this paper, the authors propose the summarization of Hindi text using the Particle Swarm Optimization (PSO) model. Initially, the Hindi text is summarized using a ranking-based technique, and then PSO is applied to obtain an optimized summary of the text. One ranking-based technique, TF-IDF, is introduced. Implementation of the proposed system is discussed in five steps: preprocessing, feature extraction, rank generation, post-processing, and optimized summarization using PSO. Finally, results are shown in terms of an optimized summary of text in a specific language. The system can be implemented for any standard language, but Hindi was selected for the practical implementation because very little research has been done on the Hindi language.
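
The ranking stage mentioned in this abstract can be sketched with a simple TF-IDF sentence scorer. This is an assumption-laden illustration, not the authors' system: each sentence is treated as its own document, tokens are split on whitespace, the PSO optimization step is omitted entirely, and the English placeholder sentences and function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_sentence_scores(sentences):
    """Score each sentence by the sum of the TF-IDF weights of its terms."""
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        scores.append(sum(
            (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        ))
    return scores


def extract_summary(sentences, k):
    """Keep the k highest-scoring sentences in their original order."""
    scores = tfidf_sentence_scores(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


sentences = ["the cat sat", "the dog barked loudly", "the the the"]
print(tfidf_sentence_scores(sentences))
print(extract_summary(sentences, 2))
```

A term that occurs in every sentence gets an inverse document frequency of zero, so sentences made up only of common words rank last; an optimizer such as PSO could then search over which ranked sentences to keep.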

46

Zhang, Xinyuan, Ruiyi Zhang, Manzil Zaheer, and Amr Ahmed. "Unsupervised Abstractive Dialogue Summarization for Tete-a-Tetes." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 16 (May 18, 2021): 14489–97. http://dx.doi.org/10.1609/aaai.v35i16.17703.

Abstract:

High-quality dialogue-summary paired data is expensive to produce and domain-sensitive, making abstractive dialogue summarization a challenging task. In this work, we propose the first unsupervised abstractive dialogue summarization model for tete-a-tetes (SuTaT). Unlike standard text summarization, a dialogue summarization method should consider the multi-speaker scenario where the speakers have different roles, goals, and language styles. In a tete-a-tete, such as a customer-agent conversation, SuTaT aims to summarize for each speaker by modeling the customer utterances and the agent utterances separately while retaining their correlations. SuTaT consists of a conditional generative module and two unsupervised summarization modules. The conditional generative module contains two encoders and two decoders in a variational autoencoder framework where the dependencies between two latent spaces are captured. With the same encoders and decoders, two unsupervised summarization modules equipped with sentence-level self-attention mechanisms generate summaries without using any annotations. Experimental results show that SuTaT is superior on unsupervised dialogue summarization for both automatic and human evaluations, and is capable of dialogue classification and single-turn conversation generation.

47

Ashwini Mandale-Jadhav. "Text Summarization Using Natural Language Processing." Journal of Electrical Systems 20, no. 11s (January 31, 2025): 3410–17. https://doi.org/10.52783/jes.8095.

Abstract:

Text summarization is a crucial task in natural language processing (NLP) that aims to condense large volumes of text into concise and informative summaries. This paper presents a comprehensive study of text summarization techniques using advanced NLP methods. The research focuses on extractive summarization, where key sentences or phrases are extracted from the original text to form a coherent summary. Various approaches such as graph-based algorithms, deep learning models, and hybrid methods combining linguistic features and neural networks are explored and evaluated. The paper also investigates the impact of domain-specific summarization techniques for specialized content areas. Experimental results on benchmark datasets demonstrate the effectiveness and scalability of the proposed methods compared to baseline summarization techniques. The findings contribute to advancing the state of the art in text summarization, with implications for applications in information retrieval, document analysis, and automated content generation.

48

Dr. Geetanjali Vinayak Kale. "A Comprehensive Study of Text Summarization with Advent of Large Language Models." Communications on Applied Nonlinear Analysis 32, no. 9s (March 4, 2025): 1073–88. https://doi.org/10.52783/cana.v32.4113.

Abstract:

Introduction: Communication is at the heart of the human race. With the growth of social media and other communication platforms, the globe is now connected at a single click. People communicate and share information through these platforms, and a massive amount of data is generated and analysed every second. Analysing Big Data and drawing insights from it is a difficult task. Text summarization is the process of representing textual data concisely so as to extract the most important information from it. It plays a major role in analysing big data and making decisions based on the insights drawn. With the advent of Large Language Models, the techniques used for summarization have been enhanced to a large extent. This paper surveys older techniques for text summarization and studies the new methodologies that use Large Language Models. It aims to deliver an up-to-date survey of text summarization and of the improvements Large Language Models bring to it.
Objectives: The objective of this paper is to provide a comprehensive study on text summarization, focusing on its evolution and the advancements brought by Large Language Models (LLMs). It aims to analyse the development of summarization techniques, including both extractive and abstractive methods, while exploring the mathematical algorithms and machine learning models that underpin these approaches. Additionally, the paper discusses the impact of natural language processing (NLP) advancements, particularly LLMs, in enhancing the accuracy and efficiency of summarization. A comparative analysis of traditional and modern approaches is presented, evaluating their effectiveness using various datasets. Furthermore, the study highlights key research contributions in the field and identifies current challenges, paving the way for future innovations in text summarization.
Methods: Text summarization research has evolved significantly, exploring extractive and abstractive methods using machine learning and LLMs. Mark Dredze et al. used LSA and LDA for email summarization, while Mohamed Abdel Fattah et al. trained ML models for sentence extraction. Jan Ulrich et al. applied regression-based learning for email thread summaries. Pete Burnap et al. focused on summarizing real-world events from social media. Derek Miller et al. used BERT and K-Means for lecture summarization, and Rahim Khan et al. leveraged K-Means and TF-IDF for news summarization. Mingxi Zhang et al. optimized TextRank for keyword extraction. Jingqing Zhang et al. introduced PEGASUS for abstractive summarization with advanced pre-training techniques. Zhang et al. also explored long-dialogue summarization using retrieval-based and hierarchical encoding methods.
Conclusions: The advent of Large Language Models (LLMs) has significantly advanced text summarization by addressing the shortcomings of earlier extractive and abstractive techniques. Traditional extractive methods often produced fragmented summaries, while early abstractive approaches struggled with coherence and redundancy. LLMs, powered by transformers and self-attention mechanisms, have enabled more fluent, contextually aware, and human-like summaries. These models can effectively capture long-range dependencies, rephrase content, and generate concise yet meaningful summaries. As a result, LLMs have expanded the scope of text summarization, making it more applicable and reliable across various domains, including news, research, and automated content generation.

49

Thanh, Tam Doan, Tan Minh Nguyen, Thai Binh Nguyen, Hoang Trung Nguyen, Hai Long Nguyen, Mai Vu Tran, Quang Thuy Ha, and Ha Thanh Nguyen. "Graph-based and generative approaches to multi-document summarization." Journal of Computer Science and Cybernetics 40, no. 3 (August 23, 2024): 203–17. https://doi.org/10.15625/1813-9663/18353.

Abstract:

Multi-document summarization is a challenging problem in the Natural Language Processing field that has drawn a lot of interest from the research community. In this paper, we propose a two-phase pipeline to tackle the Vietnamese abstractive multi-document summarization task. The initial phase of the pipeline involves an extractive summarization stage including two different systems. The first system employs a hybrid model based on the TextRank algorithm and a text correlation consideration mechanism. The second system is a modified version of SummPip - an unsupervised graph-based method for multi-document summarization. The second phase of the pipeline is abstractive summarization models. Particularly, generative models are applied to produce abstractive summaries from previous phase outputs. The proposed method achieves competitive results as we surpassed many strong research teams to finish the first rank in the AbMusu task - Vietnamese abstractive multi-document summarization, organized in the VLSP 2022 workshop.
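
The graph-based extractive stage mentioned in this abstract builds on TextRank, which can be sketched as a PageRank-style iteration over a sentence similarity graph. This is a generic illustration, not the authors' hybrid system or SummPip: Jaccard word overlap is swapped in as the similarity measure, and the damping factor, iteration count, and function name are assumptions.

```python
def textrank(sentences, damping=0.85, iterations=50):
    """Return one centrality score per sentence via power iteration."""
    tokens = [set(s.lower().split()) for s in sentences]
    n = len(tokens)
    if n == 0:
        return []
    # Symmetric edge weights: Jaccard overlap between sentence word sets.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            union = tokens[i] | tokens[j]
            if i != j and union:
                weights[i][j] = len(tokens[i] & tokens[j]) / len(union)
    out_weight = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives score from neighbours in proportion
        # to the weight of the connecting edge.
        scores = [
            (1 - damping) / n + damping * sum(
                weights[j][i] * scores[j] / out_weight[j]
                for j in range(n) if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


sentences = [
    "graph methods rank sentences",
    "graph methods rank nodes",
    "completely unrelated words here",
]
print(textrank(sentences))
```

Sentences that share vocabulary with many others accumulate score, so the top-ranked ones can be passed on as the extractive output that a generative model then rewrites.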

50

Timalsina, Bipin, Nawaraj Paudel, and Tej Bahadur Shahi. "Attention based Recurrent Neural Network for Nepali Text Summarization." Journal of Institute of Science and Technology 27, no. 1 (June 30, 2022): 141–48. http://dx.doi.org/10.3126/jist.v27i1.46709.

Abstract:

Automatic text summarization has been a challenging topic in natural language processing (NLP), as it demands preserving important information while condensing a large text into a summary. Extractive and abstractive text summarization are widely investigated approaches. In extractive summarization, important sentences are extracted from the large text and combined to create a summary, whereas abstractive summarization creates a summary that focuses on meaning rather than on reusing the original content. Abstractive summarization has therefore gained more attention from researchers in the recent past. However, text summarization is still an untouched topic in the Nepali language. To this end, we propose an abstractive text summarization approach for Nepali text. First, we create a Nepali text dataset by scraping Nepali news from online news portals. Second, we design a deep learning-based text summarization model based on an encoder-decoder recurrent neural network with attention. More precisely, Long Short-Term Memory (LSTM) cells are used in the encoder and decoder layers. Third, we build nine different models by selecting various hyper-parameters such as the number of hidden layers and the number of nodes. Finally, we report the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score for each model to evaluate its performance. Among the nine models created by adjusting the numbers of layers and hidden states, the model with a single-layer encoder and 256 hidden states outperformed all other models, with F-score values of 15.74, 3.29, and 15.21 for ROUGE-1, ROUGE-2, and ROUGE-L, respectively.
