
Journal articles on the topic 'Document Summarization'

Consult the top 50 journal articles for your research on the topic 'Document Summarization.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Rahamat Basha, S., J. Keziya Rani, and J. J. C. Prasad Yadav. "A Novel Summarization-based Approach for Feature Reduction Enhancing Text Classification Accuracy." Engineering, Technology & Applied Science Research 9, no. 6 (December 1, 2019): 5001–5. http://dx.doi.org/10.48084/etasr.3173.

Abstract:
Automatic summarization is the process of shortening one document (single-document summarization) or several documents (multi-document summarization). In this paper, a new feature selection method for the nearest-neighbor classifier is proposed, which summarizes the original training documents based on a sentence importance measure. Our approach to single-document summarization uses two measures for sentence scoring: the frequency of the terms in a sentence and the similarity of that sentence to the other sentences. All sentences are ranked accordingly, and the top-ranked sentences (subject to a threshold constraint) are selected for the summary. The summary of every document in the corpus is collected into a new document used in the summarization evaluation process.
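The two-measure ranking this abstract describes (term frequency within a sentence plus similarity to the other sentences) can be illustrated roughly as follows; the naive period-based sentence splitting, the cosine measure over bags of words, and the plain sum of the two scores are assumptions of this sketch, not the paper's exact formulation:

```python
import math
from collections import Counter

def summarize(text, top_k=2):
    """Rank sentences by (a) average term frequency and (b) mean cosine
    similarity to the other sentences; keep top_k in original order."""
    sentences = [s.strip() for s in text.split('.') if s.strip()]
    bags = [Counter(s.lower().split()) for s in sentences]

    def cosine(a, b):
        dot = sum(a[w] * b[w] for w in set(a) & set(b))
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    scores = []
    for i, bag in enumerate(bags):
        tf = sum(bag.values()) / len(bag)  # frequency of terms in the sentence
        sim = sum(cosine(bag, other)       # similarity to the other sentences
                  for j, other in enumerate(bags) if j != i) / max(len(bags) - 1, 1)
        scores.append(tf + sim)

    top = sorted(sorted(range(len(sentences)),
                        key=lambda i: scores[i], reverse=True)[:top_k])
    return '. '.join(sentences[i] for i in top) + '.'
```

Selecting top-ranked sentences subject to a length threshold, as the paper does, would replace the fixed `top_k` cutoff.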
2

Singh, Sandhya, Kevin Patel, Krishnanjan Bhattacharjee, Hemant Darbari, and Seema Verma. "Towards Better Single Document Summarization using Multi-Document Summarization Approach." International Journal of Computer Sciences and Engineering 7, no. 5 (May 31, 2019): 695–703. http://dx.doi.org/10.26438/ijcse/v7i5.695703.

3

Kongara, Srinivasa Rao, Dasika Sree Rama Chandra Murthy, and Gangadhara Rao Kancherla. "An Automatic Text Summarization Method with the Concern of Covering Complete Formation." Recent Advances in Computer Science and Communications 13, no. 5 (November 5, 2020): 977–86. http://dx.doi.org/10.2174/2213275912666190716105347.

Abstract:
Background: Text summarization is the process of generating a short description of an entire document that would otherwise take considerable time to read; it provides a convenient way of extracting the most useful information along with a short summary of the document. Existing work addressed this with the Fuzzy Rule-based Automated Summarization Method (FRASM), but that method has limitations which restrict its applicability to real-world applications: it is suitable only for single-document summarization, whereas many applications, such as research industries, need to summarize information from multiple documents. Methods: This paper proposes the Multi-document Automated Summarization Method (MDASM), a framework that produces an accurate summary from multiple documents, whereas the existing system performed only single-document summarization. First, documents are clustered using a modified k-means algorithm to group documents with similar meaning, identified through frequent-term measurement. After clustering, pre-processing with a hybrid TF-IDF and singular value decomposition technique eliminates the irrelevant content and retains the required content. Sentence scoring is then done by adding a title measurement to the metrics of the existing work, so that the most similar sentences are retrieved more accurately. Finally, a fuzzy rule system performs the text summarization. Results: The evaluation, conducted in the MATLAB simulation environment, shows that the proposed method yields a better outcome than the existing method in terms of summarization accuracy: MDASM produces 89.28% increased accuracy, 89.28% increased precision, 89.36% increased recall, and a 70% increase in f-measure, performing better than FRASM. Conclusion: The summarization process carried out in this work provides an accurate summarized outcome.
4

Diedrichsen, Elke. "Linguistic challenges in automatic summarization technology." Journal of Computer-Assisted Linguistic Research 1, no. 1 (June 26, 2017): 40. http://dx.doi.org/10.4995/jclr.2017.7787.

Abstract:
Automatic summarization is a field of Natural Language Processing that is increasingly used in industry today. The goal of the summarization process is to create a summary of one document or a multiplicity of documents that will retain the sense and the most important aspects while reducing the length considerably, to a size that may be user-defined. One differentiates between extraction-based and abstraction-based summarization. In an extraction-based system, the words and sentences are copied out of the original source without any modification. An abstraction-based summary can compress, fuse or paraphrase sections of the source document. As of today, most summarization systems are extractive. Automatic document summarization technology presents interesting challenges for Natural Language Processing. It works on the basis of coreference resolution, discourse analysis, named entity recognition (NER), information extraction (IE), natural language understanding, topic segmentation and recognition, word segmentation and part-of-speech tagging. This study will overview some current approaches to the implementation of auto summarization technology and discuss the state of the art of the most important NLP tasks involved in them. We will pay particular attention to current methods of sentence extraction and compression for single and multi-document summarization, as these applications are based on theories of syntax and discourse and their implementation therefore requires a solid background in linguistics. Summarization technologies are also used for image collection summarization and video summarization, but the scope of this paper will be limited to document summarization.
5

D’Silva, Suzanne, Neha Joshi, Sudha Rao, Sangeetha Venkatraman, and Seema Shrawne. "Improved Algorithms for Document Classification &Query-based Multi-Document Summarization." International Journal of Engineering and Technology 3, no. 4 (2011): 404–9. http://dx.doi.org/10.7763/ijet.2011.v3.261.

6

Vikas, A., Pradyumna G.V.N, and Tahir Ahmed Shaik. "Text Summarization." International Journal of Engineering and Computer Science 9, no. 2 (February 3, 2020): 24940–45. http://dx.doi.org/10.18535/ijecs/v9i2.4437.

Abstract:
In this new era, in which tremendous amounts of information are available on the internet, it is important to provide improved mechanisms for extracting information quickly and efficiently. It is very difficult for human beings to manually extract a summary of a large text document. Since there is plenty of textual material available on the internet, there is the problem of searching for relevant documents among the many available, and of absorbing the relevant information from them. To solve these two problems, automatic text summarization is necessary. Text summarization is the process of identifying the most important, meaningful information in a document or set of related documents and compressing it into a shorter version while preserving its overall meaning.
7

Sirohi, Neeraj Kumar, Mamta Bansal, and S. N. Rajan. "Text Summarization Approaches Using Machine Learning & LSTM." Revista Gestão Inovação e Tecnologias 11, no. 4 (September 1, 2021): 5010–26. http://dx.doi.org/10.47059/revistageintec.v11i4.2526.

Abstract:
A massive amount of online textual data is generated by a diversity of social media, web, and other information-centric applications. Selecting the vital content from a large text requires studying the full article and generating a summary without losing critical information from the text document; this process is called summarization. Text summarization done by humans requires expertise in the subject area and is tedious and time-consuming; the second type, known as automatic text summarization, generates the summary with a system. There are two main categories of automatic text summarization: abstractive and extractive. An extractive summary is produced by picking important, highly ranked sentences and words from the text document, whereas the sentences and words present in a summary generated by an abstractive method may not appear in the original text. This article focuses on the different automatic text summarization (ATS) techniques that have been investigated to date. The paper begins with a concise introduction to automatic text summarization, then discusses recent developments in extractive and abstractive text summarization methods, moves to a literature survey, and finally presents the proposed technique, using an LSTM encoder-decoder for abstractive text summarization, along with some directions for future work.
8

Manju, K., S. David Peter, and Sumam Idicula. "A Framework for Generating Extractive Summary from Multiple Malayalam Documents." Information 12, no. 1 (January 18, 2021): 41. http://dx.doi.org/10.3390/info12010041.

Abstract:
Automatic extractive text summarization retrieves a subset of sentences that represents the most notable content of the entire document. In the era of digital explosion, in which most data is unstructured text, users need to understand huge amounts of text in a short time; hence the need for an automatic text summarizer. From summaries, users get an idea of the entire content of a document and can decide whether or not to read it in full. This work mainly focuses on generating a summary from multiple news documents; in this case, the summary helps to reduce redundant news across different newspapers. A multi-document summary is more challenging than a single-document summary, since it has to solve the problem of overlapping information among sentences from different documents. Extractive text summarization retains the essential parts of the document while neglecting irrelevant and redundant sentences. In this paper, we propose a framework for extracting a summary from multiple documents in the Malayalam language. Since multi-document summarization data sets are sparse, methods based on deep learning are difficult to apply; the proposed work therefore discusses the performance of existing standard algorithms in multi-document summarization of the Malayalam language. We propose a sentence extraction algorithm that selects the top-ranked sentences with maximum diversity. The system is found to perform well in terms of precision, recall, and F-measure on multiple input documents.
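One standard way to realize "top-ranked sentences with maximum diversity" is greedy maximal marginal relevance (MMR); the paper's own extraction algorithm is not given here, so the trade-off parameter `lam` and the pluggable `similarity` function below are assumptions of this sketch:

```python
def mmr_select(candidates, relevance, similarity, k=3, lam=0.7):
    """Greedy maximal-marginal-relevance selection: at each step pick the
    sentence that balances relevance against redundancy with respect to
    the sentences already selected."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def mmr(i):
            redundancy = max((similarity(candidates[i], candidates[j])
                              for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]
```

With `lam` near 1 the selection is purely relevance-driven; lowering it pushes the summary toward diversity.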
9

Mamidala, Kishore Kumar, and Suresh Kumar Sanampudi. "A Novel Framework for Multi-Document Temporal Summarization (MDTS)." Emerging Science Journal 5, no. 2 (April 1, 2021): 184–90. http://dx.doi.org/10.28991/esj-2021-01268.

Abstract:
The Internet consists of a massive amount of information, and handling it is a tedious task. Summarization plays a crucial role in extracting or abstracting the key content from multiple sources while retaining its meaning, thereby reducing the complexity of handling the information. Multi-document summarization gives the gist of content collected from multiple documents, while temporal summarization concentrates on temporally related events. This paper proposes a Multi-Document Temporal Summarization (MDTS) technique that generates a summary based on temporally related events extracted from multiple documents. The technique extracts events with their time stamps, using TimeML-standard tags for events and times; these event-times are stored in a structured database for easier processing. Sentence-ranking methods are built on the frequency of event occurrences in each sentence, and sentence-similarity measures are computed to eliminate redundant sentences from the extracted summary. Depending on the required summary length, the top-ranked sentences are selected to form the summary. Experiments are conducted on the DUC 2006 and DUC 2007 datasets released for the multi-document summarization task, and the extracted summaries are evaluated using ROUGE to determine the precision, recall, and F-measure of the generated summaries. The performance of the proposed method is compared with particle swarm optimization-based summarization (PSOS), cat swarm optimization-based summarization (CSOS), and Cuckoo Search-based multi-document summarization (MDSCSA); MDTS is found to perform better than these methods.
10

Yadav, Avaneesh Kumar, Ashish Kumar Maurya, Ranvijay, and Rama Shankar Yadav. "Extractive Text Summarization Using Recent Approaches: A Survey." Ingénierie des systèmes d information 26, no. 1 (February 28, 2021): 109–21. http://dx.doi.org/10.18280/isi.260112.

Abstract:
In this era of growing digital media, the volume of text data increases day by day from various sources and may comprise entire documents, books, articles, etc. This amount of text is a source of information that may be insignificant, redundant, and sometimes may not carry any meaningful representation. We therefore require techniques and tools that can automatically summarize enormous amounts of text data and help us decide whether they are useful. Text summarization is a process that generates a brief version of a document in the form of a meaningful summary; it can be classified into abstractive and extractive text summarization. Abstractive text summarization generates an abstract-type summary from the given document, whereas in extractive text summarization a summary is created from the crucial sentences of the given document. Many authors have proposed techniques for both types of text summarization. This paper presents a survey of extractive text summarization based on graph-based techniques, focusing specifically on unsupervised and supervised methods. It shows recent works and advances on them, examines the strengths and weaknesses of previous works in tabular form, and finally concentrates on techniques for evaluating summaries.
11

Jeong, Hyoungil, Youngjoong Ko, and Jungyun Seo. "Statistical Text Summarization Using a Category-Based Language Model on a Bootstrapping Framework." International Journal on Artificial Intelligence Tools 27, no. 03 (May 2018): 1850014. http://dx.doi.org/10.1142/s0218213018500148.

Abstract:
Traditional text summarization systems have not used the category information of the documents to be summarized. However, the estimated weights of each word can often be biased on small data such as a single document. We therefore propose an effective feature-weighting method for document summarization that utilizes category information and solves the biased-probability problem. The method uses category-based smoothing and a bootstrapping framework. As a result, in our experiments, the proposed summarization method achieves better performance than other statistical sentence-extraction methods.
12

Rautray, Rasmita, Rakesh Chandra Balabantaray, and Anisha Bhardwaj. "Document Summarization Using Sentence Features." International Journal of Information Retrieval Research 5, no. 1 (January 2015): 36–47. http://dx.doi.org/10.4018/ijirr.2015010103.

Abstract:
With the exponential growth of information available electronically, there is an increasing demand for text summarization. Text summarization is the process of extracting the contents of the original text into a shorter form that provides useful information to the user. This paper presents a summarizer that produces summaries while reducing redundant information and maximizing summary relevancy. The proposed model takes several features into account, including a title feature, sentence weight, term weight, sentence position, inter-sentence similarity, proper nouns, thematic words, and numerical data. The score of each feature for the model is obtained from the document sets. The models are evaluated on the F-score of extracted sentences at a 20% compression rate on the C-50 data corpus. In experimental studies on the C-50 corpus, the PSO-based summarizer shows significantly better performance than the other summarizers.
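A minimal sketch of this kind of feature-based sentence scoring, using three of the listed features (title overlap, sentence position, numerical data); the feature subset and the equal weighting are illustrative choices, not the paper's tuned model:

```python
def score_sentence(sentence, index, n_sentences, title):
    """Combine title overlap, position, and numerical-data features
    into a single score in [0, 1] (equal weights, illustrative only)."""
    words = set(sentence.lower().split())
    title_words = set(title.lower().split())
    title_f = len(words & title_words) / len(title_words) if title_words else 0.0
    position_f = 1.0 - index / max(n_sentences - 1, 1)  # earlier sentences rank higher
    numeric_f = 1.0 if any(w.isdigit() for w in sentence.split()) else 0.0
    return (title_f + position_f + numeric_f) / 3.0
```

A full system would score every sentence this way and keep the highest-scoring ones until the target compression rate is reached.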
13

Liu, Mei-Ling, De-Quan Zheng, Tie-Jun Zhao, and Yang Yu. "Dynamic Multi-Document Summarization Model." Journal of Software 23, no. 2 (March 6, 2012): 289–98. http://dx.doi.org/10.3724/sp.j.1001.2012.03999.

14

Kumar. "Automatic Multi Document Summarization Approaches." Journal of Computer Science 8, no. 1 (January 1, 2012): 133–40. http://dx.doi.org/10.3844/jcssp.2012.133.140.

15

dos Santos Marujo, Luís Carlos. "Event-based Multi-document Summarization." ACM SIGIR Forum 49, no. 2 (January 29, 2016): 148–49. http://dx.doi.org/10.1145/2888422.2888448.

16

Li, Jingxuan, Lei Li, and Tao Li. "Multi-document summarization via submodularity." Applied Intelligence 37, no. 3 (February 9, 2012): 420–30. http://dx.doi.org/10.1007/s10489-012-0336-1.

17

Yao, Jin-ge, Xiaojun Wan, and Jianguo Xiao. "Recent advances in document summarization." Knowledge and Information Systems 53, no. 2 (March 28, 2017): 297–336. http://dx.doi.org/10.1007/s10115-017-1042-4.

18

Wang, Dingding, and Tao Li. "Weighted consensus multi-document summarization." Information Processing & Management 48, no. 3 (May 2012): 513–23. http://dx.doi.org/10.1016/j.ipm.2011.07.003.

19

Atkinson, John, and Ricardo Munoz. "Rhetorics-based multi-document summarization." Expert Systems with Applications 40, no. 11 (September 2013): 4346–52. http://dx.doi.org/10.1016/j.eswa.2013.01.017.

20

G. El Barbary, O., and Radwan Abu Gdairi. "Neutrosophic Logic-Based Document Summarization." Journal of Mathematics 2021 (July 24, 2021): 1–7. http://dx.doi.org/10.1155/2021/9938693.

Abstract:
Nowadays, a rich quantity of information is offered on the Web, which makes it hard for users to find the information they need. Automated techniques are needed to effectively filter and search for useful data on the Web. The purpose of text summarization is to produce satisfactory content while handling the variety of information. A key factor in document summarization is extracting useful features. In this paper, we extract word features in three groups, called important words, and we also extract sentence features based on the extracted words. With the growing amount of knowledge on the Internet, it has become an extremely time-consuming, exhausting, and tedious task to read whole papers in full to obtain the relevant information on a precise topic.
21

Dief, Nada A., Ali E. Al-Desouky, Amr Aly Eldin, and Asmaa M. El-Said. "An Adaptive Semantic Descriptive Model for Multi-Document Representation to Enhance Generic Summarization." International Journal of Software Engineering and Knowledge Engineering 27, no. 01 (February 2017): 23–48. http://dx.doi.org/10.1142/s0218194017500024.

Abstract:
Due to the increasing accessibility of online data and the availability of thousands of documents on the Internet, it becomes very difficult for a human to review and analyze each document manually. The sheer size of such documents and data presents a significant challenge for users. Providing automatic summaries of specific topics helps the users to overcome this problem. Most of the current extractive multi-document summarization systems can successfully extract summary sentences; however, many limitations exist which include the degree of redundancy, inaccurate extraction of important sentences, low coverage and poor coherence among the selected sentences. This paper introduces an adaptive extractive multi-document generic (EMDG) methodology for automatic text summarization. The framework of this methodology relies on a novel approach for sentence similarity measure, a discriminative sentence selection method for sentence scoring and a reordering technique for the extracted sentences after removing the redundant ones. Extensive experiments are done on the summarization benchmark datasets DUC2005, DUC2006 and DUC2007. This proves that the proposed EMDG methodology is more effective than the current extractive multi-document summarization systems. Rouge evaluation for automatic summarization is used to validate the proposed EMDG methodology, and the experimental results showed that it is more effective and outperforms the baseline techniques, where the generated summary is characterized by high coverage and cohesion.
22

Pera, Maria Soledad, and Yiu-Kai Ng. "A Naïve Bayes Classifier for Web Document Summaries Created by Using Word Similarity and Significant Factors." International Journal on Artificial Intelligence Tools 19, no. 04 (August 2010): 465–86. http://dx.doi.org/10.1142/s0218213010000285.

Abstract:
Text classification categorizes web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consuming and users are still required to spend considerable amount of time scanning through the classified web documents to identify the ones with contents that satisfy their information needs. In solving this problem, we first introduce CorSum, an extractive single-document summarization approach, which is simple and effective in performing the summarization task, since it only relies on word similarity to generate high-quality summaries. We further enhance CorSum by considering the significance factor of sentences in documents, in addition to using word-correlation factors, for document summarization. We denote the enhanced approach CorSum-SF and use the summaries generated by CorSum-SF to train a Multinomial Naïve Bayes classifier for categorizing web document summaries into predefined classes. Experimental results on the DUC-2002 and 20 Newsgroups datasets show that CorSum-SF outperforms other extractive summarization methods, and classification time (accuracy, respectively) is significantly reduced (compatible, respectively) using CorSum-SF generated summaries compared with using the entire documents. More importantly, browsing summaries, instead of entire documents, which are assigned to predefined categories, facilitates the information search process on the Web.
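The classification side of this pipeline can be illustrated with a minimal multinomial Naïve Bayes over bag-of-words counts; whitespace tokenization and Laplace smoothing are simplifying assumptions of this sketch, not details taken from the paper:

```python
import math
from collections import Counter, defaultdict

class MultinomialNB:
    """Minimal multinomial Naive Bayes with Laplace smoothing, trained on
    (summary, label) pairs as in summary-based classification."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            words = doc.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, doc):
        words = doc.lower().split()
        total = sum(self.class_counts.values())
        best, best_lp = None, float('-inf')
        for c in self.class_counts:
            # log prior + sum of smoothed log likelihoods
            lp = math.log(self.class_counts[c] / total)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for w in words:
                lp += math.log((self.word_counts[c][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = c, lp
        return best
```

Training on generated summaries instead of full documents, as the abstract describes, simply means passing the summaries as `docs`.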
23

Kore, Rahul C., Prachi Ray, Priyanka Lade, and Amit Nerurkar. "Legal Document Summarization Using Nlp and Ml Techniques." International Journal of Engineering and Computer Science 9, no. 05 (May 20, 2020): 25039–46. http://dx.doi.org/10.18535/ijecs/v9i05.4488.

Abstract:
Reading legal documents is tedious and sometimes requires domain knowledge related to the document, and it is hard to read a full legal document without missing the key important sentences. With the increasing number of legal documents, it would be convenient to get the essential information from a document without having to go through the whole of it. The purpose of this study is to make a large legal document understandable within a short duration of time; summarization gives flexibility and convenience to the reader. Using vector representations of words, text-ranking algorithms, and similarity techniques, this study gives a way to produce the highest-ranked sentences. Summarization produces a result that covers the most vital information of the document in a concise manner. The paper proposes how different natural language processing concepts can be used to produce the desired result and relieve readers from going through the whole complex document. The study presents the steps required to achieve this aim and elaborates on the algorithms used at each step of the process.
24

Naserasadi, Ali, Hamid Khosravi, and Faramarz Sadeghi. "Extractive multi-document summarization based on textual entailment and sentence compression via knapsack problem." Natural Language Engineering 25, no. 1 (October 31, 2018): 121–46. http://dx.doi.org/10.1017/s1351324918000414.

Abstract:
As the amount of data in computer networks increases, searching for and finding suitable information becomes harder for users. One of the most widespread forms of information on such networks is the textual document, and exploring these documents to learn about their content is difficult and sometimes impossible. Multi-document text summarization systems help by producing a summary with a fixed, predefined length while covering the maximum content of the input documents. This paper presents a novel method for multi-document extractive summarization based on textual entailment relations and sentence compression, formulating the task as a knapsack problem. In this approach, the sentences of the documents are ranked according to an extended Tf-Idf method, then entailment scores of the selected sentences are computed, and from these scores the final score of each sentence is calculated. Finally, after reducing sentence lengths via sentence compression, the problem is solved with greedy and dynamic programming approaches to the knapsack problem. Experiments on standard summarization datasets, evaluated with the Rouge system, show that the suggested method, to the best of our knowledge, has increased the F-measure of query-based summarization systems by two per cent and that of general summarization systems by five per cent.
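The knapsack formulation itself (choose sentences to maximize total score subject to a summary-length budget) can be sketched with standard 0/1 dynamic programming; the word-count lengths and the given scores below stand in for the paper's compressed-sentence lengths and combined Tf-Idf/entailment scores:

```python
def knapsack_summary(sentences, scores, budget):
    """0/1 knapsack: choose a subset of sentences maximizing total score
    with total word length within the budget (dynamic programming)."""
    lengths = [len(s.split()) for s in sentences]
    # dp[w] = (best score, chosen indices) achievable with capacity w
    dp = [(0.0, [])] * (budget + 1)
    for i in range(len(sentences)):
        for w in range(budget, lengths[i] - 1, -1):  # backwards: use each item once
            cand = dp[w - lengths[i]][0] + scores[i]
            if cand > dp[w][0]:
                dp[w] = (cand, dp[w - lengths[i]][1] + [i])
    return [sentences[i] for i in dp[budget][1]]
```

The greedy alternative mentioned in the abstract would instead sort sentences by score-per-word and add them until the budget is exhausted.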
25

Kolte, Shilpa G., and Jagdish W. Bakal. "Big Data Summarization Using Novel Clustering Algorithm and Semantic Feature Approach." International Journal of Rough Sets and Data Analysis 4, no. 3 (July 2017): 108–17. http://dx.doi.org/10.4018/ijrsda.2017070108.

Abstract:
This paper proposes a big data (i.e., documents, texts) summarization method using a proposed clustering algorithm and semantic features. The paper proposes a novel clustering algorithm that is used for big data summarization. The proposed system works in four phases and provides a modular implementation of multi-document summarization. Experimental results on the Iris dataset show that the proposed clustering algorithm performs better than the K-means and K-medoids algorithms. The performance of big data summarization is evaluated using Australian legal cases from the Federal Court of Australia (FCA) database. The experimental results demonstrate that the proposed method summarizes big data documents better than existing systems.
26

Han, Xu-Wang, Hai-Tao Zheng, Jin-Yuan Chen, and Cong-Zhi Zhao. "Diverse Decoding for Abstractive Document Summarization." Applied Sciences 9, no. 3 (January 23, 2019): 386. http://dx.doi.org/10.3390/app9030386.

Abstract:
Recently, neural sequence-to-sequence models have made impressive progress in abstractive document summarization. Unfortunately, as neural abstractive summarization research is in a primitive stage, the performance of these models is still far from ideal. In this paper, we propose a novel method called Neural Abstractive Summarization with Diverse Decoding (NASDD). This method augments the standard attentional sequence-to-sequence model in two aspects. First, we introduce a diversity-promoting beam search approach in the decoding process, which alleviates the serious diversity issue caused by standard beam search and hence increases the possibility of generating summary sequences that are more informative. Second, we creatively utilize the attention mechanism combined with the key information of the input document as an estimation of the salient information coverage, which aids in finding the optimal summary sequence. We carry out the experimental evaluation with state-of-the-art methods on the CNN/Daily Mail summarization dataset, and the results demonstrate the superiority of our proposed method.
27

Shahana Bano, B. Divyanjali, A. K M L R V Virajitha, and M. Tejaswi. "Document Summarization Using Clustering and Text Analysis." International Journal of Engineering & Technology 7, no. 2.32 (May 31, 2018): 456. http://dx.doi.org/10.14419/ijet.v7i2.32.15740.

Abstract:
Document summarization is a procedure for shortening a text document with software, so as to produce a summary containing the significant parts of the original record. Nowadays, users are pressed for time and cannot spend it reading large amounts of information; they want the most accurate information that describes everything while occupying minimum space. This paper discusses an approach to document summarization using clustering and text analysis. We apply clustering and text-analytic techniques to reduce data redundancy and to identify similar sentences in the text of documents, grouping them into clusters based on the term-frequency values of their words. These techniques mainly help to reduce the data, so that documents are generated with high efficiency.
28

Sakhare, Dipti Yashodhan, and Rajkumar Rajkumar. "Effect of Feature Selection on Small and Large Document Summarization." IAES International Journal of Artificial Intelligence (IJ-AI) 3, no. 3 (September 1, 2014): 112. http://dx.doi.org/10.11591/ijai.v3.i3.pp112-120.

Abstract:
As the amount of textual information increases, we experience a need for automatic text summarizers. In automatic summarization, a text document or a larger corpus of multiple documents is reduced to a short set of words or a paragraph that conveys the main meaning of the text. Summarization can be classified into two approaches: extraction and abstraction. This paper focuses on the extraction approach, whose goal is sentence selection. The first step in summarization by extraction is the identification of important features. In our approach, short stories and biographies are used as test documents. Each document is prepared by a pre-processing stage: sentence segmentation, tokenization, stop-word removal, case folding, lemmatization, and stemming. Then, using the important features, sentence filtering, data compression, and finally the calculation of a score for each sentence are performed. In this paper we propose various features for summary extraction and also analyze which features should be applied depending on the size of the document. The experimentation is performed with the DUC 2002 dataset, and comparative results of the proposed approach and of MS Word are also presented. The concept-based features are given more weight, and from these results we propose that using concept-based features helps improve the quality of the summary in the case of large documents.
APA, Harvard, Vancouver, ISO, and other styles
29

Bewoor, M. S., and S. H. Patil. "Empirical Analysis of Single and Multi Document Summarization using Clustering Algorithms." Engineering, Technology & Applied Science Research 8, no. 1 (February 20, 2018): 2562–67. http://dx.doi.org/10.48084/etasr.1775.

Full text
Abstract:
The availability of various digital sources has created a demand for text mining mechanisms. Effective summary generation mechanisms are needed in order to exploit relevant information from often overwhelming digital data sources. With this in view, this paper surveys various single- and multi-document text summarization techniques. It also analyzes the effect of treating a query sentence as an ordinary sentence segmented from the documents during text summarization. Experimental results show the degree of effectiveness in text summarization across different clustering algorithms.
APA, Harvard, Vancouver, ISO, and other styles
30

Mustamiin, Muhamad, Ahmad Lubis Ghozali, and Muhammad Lukman Sifa. "Peringkasan Multi-dokumen menggunakan Metode Pengelompokkan berbasis Hirarki dengan Multi-level Divisive Coefficient." Jurnal Teknologi Informasi dan Ilmu Komputer 5, no. 6 (November 22, 2018): 697. http://dx.doi.org/10.25126/jtiik.2018561149.

Full text
Abstract:
Summarization is one part of information retrieval that aims to obtain information quickly and efficiently by extracting the essence of a document. Documents, especially report documents, grow in number every day as more activities and events are carried out. As the need for fast information grows and the number of documents keeps increasing, the need for document summarization rises accordingly. Summarization used to summarize more than one document is called multi-document summarization. To prevent repetitive information in multi-document summarization, a grouping process is necessary to ensure that the delivered information varies and covers all parts of the documents. Hierarchical clustering with a multi-level divisive coefficient can be used to group the parts/sentences of the documents with a variety and depth adjusted to the user's level of information need. Across different summarization compression levels, summarization using hierarchical clustering with a multi-level divisive coefficient can produce fairly good summaries, with an f-measure of 0.398, while summarization with a one-level divisive coefficient only reaches an f-measure of 0.335.
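The top-down (divisive) clustering idea behind this approach can be sketched as follows. This shows only the general divisive scheme, seeded by the farthest pair of points; it does not reproduce the paper's multi-level divisive-coefficient criterion:

```python
import math

def dist(a, b):
    # Euclidean distance between two points (e.g. sentence feature vectors).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def split(cluster):
    # Divisive step: seed two groups with the farthest pair of points,
    # then assign every point to the nearer seed.
    s0, s1 = max(((p, q) for p in cluster for q in cluster),
                 key=lambda pq: dist(*pq))
    left, right = [], []
    for p in cluster:
        (left if dist(p, s0) <= dist(p, s1) else right).append(p)
    return left, right

def divisive_clusters(points, k):
    # Top-down hierarchical clustering: repeatedly split the widest
    # remaining cluster until k clusters exist. Assumes at least k
    # distinct points.
    clusters = [list(points)]
    while len(clusters) < k:
        widest = max((c for c in clusters if len(c) > 1),
                     key=lambda c: max(dist(p, q) for p in c for q in c))
        clusters.remove(widest)
        clusters.extend(split(widest))
    return clusters
```

Taking one representative sentence per resulting cluster then yields a summary whose variety grows with the clustering depth, which is the intuition behind varying the level of the divisive coefficient.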
APA, Harvard, Vancouver, ISO, and other styles
31

Garg, Srashti, and Dr Akash Saxena. "Novel Algorithm for Multi-document Summarization using Lexical Concept." International Journal of Trend in Scientific Research and Development 2, no. 3 (April 30, 2018): 2115–19. http://dx.doi.org/10.31142/ijtsrd11644.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Hariharan, Shanmugasundaram, and Rengaramanujam Srinivasan. "Enhancements to Graph Based Methods for Single Document Summarization." International Journal of Engineering and Technology 2, no. 1 (2010): 101–11. http://dx.doi.org/10.7763/ijet.2010.v2.107.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Sarkar, Kamal. "Automatic Single Document Text Summarization Using Key Concepts in Documents." Journal of Information Processing Systems 9, no. 4 (December 31, 2013): 602–20. http://dx.doi.org/10.3745/jips.2013.9.4.602.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Wei, Furu, Wenjie Li, Qin Lu, and Yanxiang He. "A document-sensitive graph model for multi-document summarization." Knowledge and Information Systems 22, no. 2 (March 3, 2009): 245–59. http://dx.doi.org/10.1007/s10115-009-0194-2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Kim, Minyoung. "Document Summarization via Convex-Concave Programming." International Journal of Fuzzy Logic and Intelligent Systems 16, no. 4 (December 25, 2016): 293–98. http://dx.doi.org/10.5391/ijfis.2016.16.4.293.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

ZHANG, Zhi-qing. "Single-document summarization based on semantics." Journal of Computer Applications 30, no. 6 (June 25, 2010): 1673–75. http://dx.doi.org/10.3724/sp.j.1087.2010.01673.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Jeswani, Aditya, Shruti More, Kabir Kapoor, Sifat Sheikh, and Ramchandra Mangrulkar. "Document Summarization using Graph Based Methodology." International Journal of Computer Applications Technology and Research 9, no. 8 (August 5, 2020): 240–45. http://dx.doi.org/10.7753/ijcatr0908.1005.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

AB, Archana, and Sunitha C. "An Overview on Document Summarization Techniques." International Journal Of Recent Advances in Engineering & Technology 08, no. 03 (March 30, 2020): 31–36. http://dx.doi.org/10.46564/ijraet.2020.v08i03.007.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Qiang, Ji-Peng, Ping Chen, Wei Ding, Fei Xie, and Xindong Wu. "Multi-document summarization using closed patterns." Knowledge-Based Systems 99 (May 2016): 28–38. http://dx.doi.org/10.1016/j.knosys.2016.01.030.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

R K Rao, Pattabhi, and Sobha Lalitha Devi. "Patent Document Summarization Using Conceptual Graphs." International Journal on Natural Language Computing 6, no. 3 (June 30, 2017): 15–32. http://dx.doi.org/10.5121/ijnlc.2017.6302.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Bhagat, Kalyani. "Multi Document summarization using EM Clustering." IOSR Journal of Engineering 4, no. 5 (May 2014): 45–50. http://dx.doi.org/10.9790/3021-04564550.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Neduncheli, R., R. Muthucumar, and E. Saranathan. "Evaluation of Multi Document Summarization Techniques." Research Journal of Applied Sciences 7, no. 4 (April 1, 2012): 229–33. http://dx.doi.org/10.3923/rjasci.2012.229.233.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Na, Liu, Tang Di, Lu Ying, Tang Xiao-Jun, and Wang Hai-Wen. "Topic-sensitive multi-document summarization algorithm." Computer Science and Information Systems 12, no. 4 (2015): 1375–89. http://dx.doi.org/10.2298/csis140815060n.

Full text
Abstract:
Latent Dirichlet Allocation (LDA) has recently been used to discover topics in text corpora. However, not all the estimated topics are of equal importance or correspond to genuine themes of the domain. Some topics can be collections of irrelevant words or represent insignificant themes. This paper proposes a topic-sensitive algorithm for multi-document summarization. The algorithm uses an LDA model and a weighted-linear-combination strategy to identify significant topics, which are then used in sentence weight calculation. Each topic is measured by three different LDA criteria, and topic significance is evaluated by combining the criteria through a weighted linear combination. In addition to topic features, the proposed approach also considers statistical features such as term frequency, sentence position, and sentence length. It not only exploits the advantages of statistical features but also cooperates with the topic model. The experiments showed that the proposed algorithm achieves better performance than other state-of-the-art algorithms on the DUC2002 corpus.
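How topic significance might be scored by a weighted linear combination of several criteria can be illustrated with toy numbers. The topic-word distributions, the three criteria chosen, and the 0.5/0.3/0.2 combination weights below are all assumptions for illustration, not the paper's actual criteria:

```python
import math

# Toy topic-word distributions (rows: topics, columns: vocabulary words).
# These stand in for an LDA estimate and are illustrative only.
topics = [
    [0.40, 0.35, 0.15, 0.05, 0.05],  # concentrated: likely a genuine theme
    [0.21, 0.20, 0.20, 0.20, 0.19],  # near-uniform: likely an insignificant topic
]
topic_weights = [0.6, 0.4]           # assumed corpus-level topic proportions

def entropy(dist):
    return -sum(p * math.log(p) for p in dist if p > 0)

def significance(topic, weight):
    # Three illustrative criteria combined linearly:
    #   1. concentration: low entropy means a coherent word distribution
    #   2. peak probability of the topic's top word
    #   3. corpus-level prevalence of the topic
    concentration = 1.0 - entropy(topic) / math.log(len(topic))
    peak = max(topic)
    return 0.5 * concentration + 0.3 * peak + 0.2 * weight

scores = [significance(t, w) for t, w in zip(topics, topic_weights)]
significant = max(range(len(topics)), key=lambda i: scores[i])
```

Sentences would then be weighted by how strongly they load on the significant topics, combined with the statistical features (term frequency, position, length) mentioned in the abstract.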
APA, Harvard, Vancouver, ISO, and other styles
44

Carenini, Giuseppe, Jackie Chi Kit Cheung, and Adam Pauls. "MULTI-DOCUMENT SUMMARIZATION OF EVALUATIVE TEXT." Computational Intelligence 29, no. 4 (April 23, 2012): 545–76. http://dx.doi.org/10.1111/j.1467-8640.2012.00417.x.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Wang, Dingding, Shenghuo Zhu, Tao Li, Yun Chi, and Yihong Gong. "Integrating Document Clustering and Multidocument Summarization." ACM Transactions on Knowledge Discovery from Data 5, no. 3 (August 2011): 1–26. http://dx.doi.org/10.1145/1993077.1993078.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Alguliev, Rasim M., Ramiz M. Aliguliyev, and Nijat R. Isazade. "CDDS: Constraint-driven document summarization models." Expert Systems with Applications 40, no. 2 (February 2013): 458–65. http://dx.doi.org/10.1016/j.eswa.2012.07.049.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Mohd, Mudasir, Rafiya Jan, and Muzaffar Shah. "Text document summarization using word embedding." Expert Systems with Applications 143 (April 2020): 112958. http://dx.doi.org/10.1016/j.eswa.2019.112958.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Rautray, Rasmita, Rakesh Chandra Balabantaray, Rasmita Dash, and Rajashree Dash. "CSMDSE-Cuckoo Search Based Multi Document Summary Extractor." International Journal of Cognitive Informatics and Natural Intelligence 13, no. 4 (October 2019): 56–70. http://dx.doi.org/10.4018/ijcini.2019100103.

Full text
Abstract:
Managing the web's wealth of information has become challenging because large amounts of information from many fields are now online. Text summarization is considered one solution for extracting pertinent text from vast document collections. Hence, a novel Cuckoo Search-based multi-document summary extractor (CSMDSE) is presented to handle the multi-document summarization (MDS) problem. The proposed CSMDSE is compared with several other swarm-based summary extractors: the Cat Swarm Optimization-based Extractor (CSOE), Particle Swarm Optimization-based Extractor (PSOE), Improved Particle Swarm Optimization-based Extractor (IPSOE), and Ant Colony Optimization-based Extractor (ACOE). Finally, CSMDSE is evaluated against these techniques on traditional benchmark summarization datasets. The experimental analysis clearly indicates that CSMDSE performs better than the other summary extractors discussed in this study.
APA, Harvard, Vancouver, ISO, and other styles
49

Canhasi, Ercan. "Fast document summarization using locality sensitive hashing and memory access efficient node ranking." International Journal of Electrical and Computer Engineering (IJECE) 6, no. 3 (June 1, 2016): 945. http://dx.doi.org/10.11591/ijece.v6i3.9030.

Full text
Abstract:
Text modeling and sentence selection are the fundamental steps of a typical extractive document summarization algorithm. The common text modeling method connects a pair of sentences based on their similarity. Even though this effectively represents the sentence similarity graph of the given document(s), its big drawback is a large time complexity of $O(n^2)$, where n is the number of sentences. The quadratic time complexity makes it impractical for large documents. In this paper we propose fast approximation algorithms for text modeling and sentence selection. Our text modeling algorithm reduces the time complexity to near-linear time by rapidly finding the most similar sentences to form the sentence similarity graph. In doing so we utilized Locality-Sensitive Hashing, a fast algorithm for approximate nearest-neighbor search. For the sentence selection step we propose a simple memory-access-efficient node ranking method based on the idea of sequentially scanning only the neighborhood arrays. Experimentally, we show that sacrificing a rather small percentage of recall and precision in the quality of the produced summary can reduce the quadratic time complexity to sub-linear. We see great potential for the proposed method in text summarization on mobile devices and in big text data summarization for the Internet of Things in the cloud. In our experiments, besides evaluating the presented method on the standard general and query multi-document summarization tasks, we also tested it on a few alternative summarization tasks, including general and query, timeline, and comparative summarization.
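The near-linear-time neighbor search via Locality-Sensitive Hashing can be sketched with MinHash signatures and banding. The word-bigram shingles, signature length, and one-row-per-band layout below are illustrative choices, not the paper's parameters (real deployments use multi-row bands to cut false positives):

```python
import random
import zlib

random.seed(0)
N_HASHES = 20   # MinHash signature length
BANDS = 20      # one row per band here, for simplicity
# Deterministic salted crc32 hashes stand in for random hash functions.
SALTS = [str(random.getrandbits(32)) for _ in range(N_HASHES)]

def shingles(sentence):
    # Word-bigram shingles; fall back to unigrams for one-word sentences.
    words = sentence.lower().split()
    return {" ".join(words[i:i + 2]) for i in range(len(words) - 1)} or set(words)

def minhash(sh):
    # Signature: the minimum salted hash of the shingle set, per salt.
    return [min(zlib.crc32((salt + s).encode()) for s in sh) for salt in SALTS]

def candidate_pairs(sentences):
    # LSH banding: sentences whose signatures agree on all rows of some
    # band share a bucket; only bucket-mates are compared, avoiding O(n^2).
    rows = N_HASHES // BANDS
    buckets = {}
    for idx, sent in enumerate(sentences):
        sig = minhash(shingles(sent))
        for b in range(BANDS):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets.setdefault(key, []).append(idx)
    pairs = set()
    for members in buckets.values():
        for i in members:
            for j in members:
                if i < j:
                    pairs.add((i, j))
    return pairs
```

Only candidate pairs are scored exactly, so the sentence similarity graph is built in roughly linear time rather than by comparing all pairs.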
APA, Harvard, Vancouver, ISO, and other styles
50

Canhasi, Ercan. "Fast document summarization using locality sensitive hashing and memory access efficient node ranking." International Journal of Electrical and Computer Engineering (IJECE) 6, no. 3 (June 1, 2016): 945. http://dx.doi.org/10.11591/ijece.v6i3.pp945-954.

Full text
Abstract:
Text modeling and sentence selection are the fundamental steps of a typical extractive document summarization algorithm. The common text modeling method connects a pair of sentences based on their similarity. Even though this effectively represents the sentence similarity graph of the given document(s), its big drawback is a large time complexity of $O(n^2)$, where n is the number of sentences. The quadratic time complexity makes it impractical for large documents. In this paper we propose fast approximation algorithms for text modeling and sentence selection. Our text modeling algorithm reduces the time complexity to near-linear time by rapidly finding the most similar sentences to form the sentence similarity graph. In doing so we utilized Locality-Sensitive Hashing, a fast algorithm for approximate nearest-neighbor search. For the sentence selection step we propose a simple memory-access-efficient node ranking method based on the idea of sequentially scanning only the neighborhood arrays. Experimentally, we show that sacrificing a rather small percentage of recall and precision in the quality of the produced summary can reduce the quadratic time complexity to sub-linear. We see great potential for the proposed method in text summarization on mobile devices and in big text data summarization for the Internet of Things in the cloud. In our experiments, besides evaluating the presented method on the standard general and query multi-document summarization tasks, we also tested it on a few alternative summarization tasks, including general and query, timeline, and comparative summarization.
APA, Harvard, Vancouver, ISO, and other styles
