Academic literature on the topic 'Summarization evaluation'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Summarization evaluation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Journal articles on the topic "Summarization evaluation"

1

Fabbri, Alexander R., Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, and Dragomir Radev. "SummEval: Re-evaluating Summarization Evaluation." Transactions of the Association for Computational Linguistics 9 (2021): 391–409. http://dx.doi.org/10.1162/tacl_a_00373.

Abstract:
The scarcity of comprehensive up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress. We address the existing shortcomings of summarization evaluation methods along five dimensions: 1) we re-evaluate 14 automatic evaluation metrics in a comprehensive and consistent fashion using neural summarization model outputs along with expert and crowd-sourced human annotations; 2) we consistently benchmark 23 recent summarization models using the aforementioned automatic evaluation metrics; 3) we assemble the largest collection of summaries generated by models trained on the CNN/DailyMail news dataset and share it in a unified format; 4) we implement and share a toolkit that provides an extensible and unified API for evaluating summarization models across a broad range of automatic metrics; and 5) we assemble and share the largest and most diverse, in terms of model types, collection of human judgments of model-generated summaries on the CNN/Daily Mail dataset annotated by both expert judges and crowd-source workers. We hope that this work will help promote a more complete evaluation protocol for text summarization as well as advance research in developing evaluation metrics that better correlate with human judgments.
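The re-evaluation protocol described here reduces, at its core, to correlating each automatic metric's scores with human judgments. A minimal sketch of that computation follows, using scipy; all scores below are invented placeholders, not SummEval data.

```python
# A minimal sketch of metric/human-judgment correlation, the core protocol
# behind benchmarks like SummEval. All numbers below are invented placeholders.
from scipy.stats import kendalltau, spearmanr

# Hypothetical per-summary scores from an automatic metric (e.g., ROUGE-L)
metric_scores = [0.31, 0.42, 0.27, 0.55, 0.48, 0.36]
# Hypothetical expert annotations for the same summaries (e.g., 1-5 consistency)
human_scores = [2.0, 3.5, 2.5, 4.5, 4.0, 3.0]

tau, tau_p = kendalltau(metric_scores, human_scores)
rho, rho_p = spearmanr(metric_scores, human_scores)
print(f"Kendall tau = {tau:.3f} (p = {tau_p:.3f})")
print(f"Spearman rho = {rho:.3f} (p = {rho_p:.3f})")
```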
2

Murray, Gabriel, Thomas Kleinbauer, Peter Poller, Tilman Becker, Steve Renals, and Jonathan Kilgour. "Extrinsic summarization evaluation." ACM Transactions on Speech and Language Processing 6, no. 2 (2009): 1–29. http://dx.doi.org/10.1145/1596517.1596518.

3

Mani, Inderjeet, Gary Klein, David House, Lynette Hirschman, Therese Firmin, and Beth Sundheim. "SUMMAC: a text summarization evaluation." Natural Language Engineering 8, no. 1 (2002): 43–68. http://dx.doi.org/10.1017/s1351324901002741.

Abstract:
The TIPSTER Text Summarization Evaluation (SUMMAC) has developed several new extrinsic and intrinsic methods for evaluating summaries. It has established definitively that automatic text summarization is very effective in relevance assessment tasks on news articles. Summaries as short as 17% of full text length sped up decision-making by almost a factor of 2 with no statistically significant degradation in accuracy. Analysis of feedback forms filled in after each decision indicated that the intelligibility of present-day machine-generated summaries is high. Systems that performed most accurately in the production of indicative and informative topic-related summaries used term frequency and co-occurrence statistics, and vocabulary overlap comparisons between text passages. However, in the absence of a topic, these statistical methods do not appear to provide any additional leverage: in the case of generic summaries, the systems were indistinguishable in accuracy. The paper discusses some of the tradeoffs and challenges faced by the evaluation, and also lists some of the lessons learned, impacts, and possible future directions. The evaluation methods used in the SUMMAC evaluation are of interest to both summarization evaluation as well as evaluation of other ‘output-related’ NLP technologies, where there may be many potentially acceptable outputs, with no automatic way to compare them.
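The "vocabulary overlap comparisons between text passages" credited to the strongest SUMMAC systems can be illustrated with a simple Jaccard-style measure. This is a generic sketch of the idea, assuming naive whitespace tokenization, not the participating systems' code.

```python
# A generic sketch of vocabulary-overlap scoring between text passages,
# one of the statistical techniques the abstract attributes to the most
# accurate SUMMAC systems. Tokenization here is deliberately naive.
def vocabulary_overlap(passage_a: str, passage_b: str) -> float:
    """Jaccard overlap between the word sets of two passages."""
    words_a = set(passage_a.lower().split())
    words_b = set(passage_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

print(vocabulary_overlap("the summary covers the topic",
                         "the topic is covered in the summary"))
```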
4

Giannakopoulos, George, Vangelis Karkaletsis, George Vouros, and Panagiotis Stamatopoulos. "Summarization system evaluation revisited." ACM Transactions on Speech and Language Processing 5, no. 3 (2008): 1–39. http://dx.doi.org/10.1145/1410358.1410359.

5

Dumont, Emilie, and Bernard Mérialdo. "Rushes video summarization and evaluation." Multimedia Tools and Applications 48, no. 1 (2009): 51–68. http://dx.doi.org/10.1007/s11042-009-0374-9.

6

Conroy, John M., Judith D. Schlesinger, and Dianne P. O'Leary. "Nouveau-ROUGE: A Novelty Metric for Update Summarization." Computational Linguistics 37, no. 1 (2011): 1–8. http://dx.doi.org/10.1162/coli_a_00033.

Abstract:
An update summary should provide a fluent summarization of new information on a time-evolving topic, assuming that the reader has already reviewed older documents or summaries. In 2007 and 2008, an annual summarization evaluation included an update summarization task. Several participating systems produced update summaries indistinguishable from human-generated summaries when measured using ROUGE. However, no machine system performed near human-level performance in manual evaluations such as pyramid and overall responsiveness scoring. We present a metric called Nouveau-ROUGE that improves correlation with manual evaluation metrics and can be used to predict both the pyramid score and overall responsiveness for update summaries. Nouveau-ROUGE can serve as a less expensive surrogate for manual evaluations when comparing existing systems and when developing new ones.
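Conceptually, Nouveau-ROUGE fits a linear combination of ROUGE-style features to predict manual scores such as the pyramid score. The sketch below illustrates that idea with scikit-learn; the feature layout and all values are invented placeholders, not the coefficients from the paper.

```python
# A sketch of the Nouveau-ROUGE idea: learn a linear combination of ROUGE
# features that predicts a manual score such as the pyramid score.
# All training values below are invented placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row: [ROUGE-2 vs. update references, ROUGE-2 vs. background documents]
# for one system summary; novelty shows up as high update / low background.
X = np.array([[0.12, 0.30], [0.18, 0.22], [0.25, 0.15], [0.08, 0.35]])
y = np.array([0.35, 0.48, 0.61, 0.28])  # hypothetical pyramid scores

model = LinearRegression().fit(X, y)
print("coefficients:", model.coef_, "intercept:", model.intercept_)
print("predicted pyramid score:", model.predict([[0.20, 0.18]])[0])
```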
7

Okumura, Manabu, Takahiro Fukusima, Hidetsugu Nanba, and Tsutomu Hirao. "Text Summarization Challenge 2: text summarization evaluation at NTCIR Workshop 3." ACM SIGIR Forum 38, no. 1 (2004): 29–38. http://dx.doi.org/10.1145/986278.986284.

8

Neduncheli, R., R. Muthucumar, and E. Saranathan. "Evaluation of Multi Document Summarization Techniques." Research Journal of Applied Sciences 7, no. 4 (2012): 229–33. http://dx.doi.org/10.3923/rjasci.2012.229.233.

9

Polepalli Ramesh, Balaji, Ricky J. Sethi, and Hong Yu. "Figure-Associated Text Summarization and Evaluation." PLOS ONE 10, no. 2 (2015): e0115671. http://dx.doi.org/10.1371/journal.pone.0115671.

10

Kopeć, Mateusz. "Three-step coreference-based summarizer for Polish news texts." Poznan Studies in Contemporary Linguistics 55, no. 2 (2019): 397–443. http://dx.doi.org/10.1515/psicl-2019-0015.

Abstract:
This article addresses the problem of automatic summarization of press articles in Polish. The main novelty of this research lies in the proposal of a three-step summarization algorithm which benefits from using coreference information. In the related work section, all coreference-based approaches to summarization are presented. Then we describe in detail all publicly available summarization tools developed for the Polish language. We state the problem of single-document press article summarization for Polish, describing the training and evaluation dataset: the POLISH SUMMARIES CORPUS. Next, a new coreference-based extractive summarization system, NICOLAS, is introduced. Its algorithm utilises advanced third-party preprocessing tools to extract the coreference information from the text to be summarized. This information is transformed into a complex set of features related to coreference concepts (mentions and coreference clusters) that are used for training the summarization system (on the basis of a manually prepared gold summaries corpus). The proposed solution is compared to the best publicly available summarization systems for Polish and two state-of-the-art tools developed for English but adapted to Polish for this article. The NICOLAS summarization system obtains the best scores, outperforming the other systems on selected metrics in a statistically significant way. The evaluation also includes the calculation of informative upper bounds: human performance and a theoretical upper bound.
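To make the coreference-feature idea concrete, here is a simplified sketch of turning coreference clusters into per-sentence features for an extractive scorer. The cluster format and feature set are invented for illustration and are not the NICOLAS implementation.

```python
# A simplified sketch of coreference-based sentence features, in the spirit of
# NICOLAS: sentences that touch large coreference clusters are treated as more
# central. The cluster format below is invented for illustration.

# Each cluster: list of (sentence_index, mention) pairs, as a third-party
# coreference resolver might produce for a short news text.
clusters = [
    [(0, "the prime minister"), (1, "she"), (3, "the PM")],
    [(1, "the budget"), (2, "it")],
]

def coreference_features(clusters, n_sentences):
    """Per sentence: mention count and size of the largest cluster it touches."""
    features = {i: {"mentions": 0, "max_cluster": 0} for i in range(n_sentences)}
    for cluster in clusters:
        for sent_idx, _mention in cluster:
            features[sent_idx]["mentions"] += 1
            features[sent_idx]["max_cluster"] = max(
                features[sent_idx]["max_cluster"], len(cluster))
    return features

print(coreference_features(clusters, n_sentences=4))
```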

Dissertations / Theses on the topic "Summarization evaluation"

1

Nahnsen, Thade. "Automation of summarization evaluation methods and their application to the summarization process." Thesis, University of Edinburgh, 2011. http://hdl.handle.net/1842/5278.

Abstract:
Summarization is the process of creating a more compact textual representation of a document or a collection of documents. In view of the vast increase in electronically available information sources in the last decade, filters such as automatically generated summaries are becoming ever more important to facilitate the efficient acquisition and use of required information. Different methods using natural language processing (NLP) techniques are being used to this end. One of the shallowest approaches is the clustering of available documents and the representation of the resulting clusters by one of the documents; an example of this approach is the Google News website. It is also possible to augment the clustering of documents with a summarization process, which would result in a more balanced representation of the information in the cluster, NewsBlaster being an example. However, while some systems are already available on the web, summarization is still considered a difficult problem in the NLP community. One of the major problems hampering the development of proficient summarization systems is the evaluation of the (true) quality of system-generated summaries. This is exemplified by the fact that the current state-of-the-art evaluation method to assess the information content of summaries, the Pyramid evaluation scheme, is a manual procedure. In this light, this thesis has three main objectives:
1. The development of a fully automated evaluation method. The proposed scheme is rooted in the ideas underlying the Pyramid evaluation scheme and makes use of deep syntactic information and lexical semantics. Its performance improves notably on previous automated evaluation methods.
2. The development of an automatic summarization system which draws on the conceptual idea of the Pyramid evaluation scheme and the techniques developed for the proposed evaluation system. The approach features the algorithm for determining the pyramid and bases importance on the number of occurrences of the variable-sized contributors of the pyramid, as opposed to word-based methods exploited elsewhere.
3. The development of a text coherence component that can be used for obtaining the best ordering of the sentences in a summary.
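Since objectives 1 and 2 both build on the Pyramid scheme, a minimal sketch of the underlying score is useful context. Content units are weighted by how many reference summaries express them, and a candidate is scored against the best achievable total; the hard part the thesis automates, extracting the units, is abstracted away here as precomputed sets.

```python
# A minimal sketch of Pyramid-style scoring. Each Summary Content Unit (SCU)
# is weighted by the number of reference summaries that express it; a candidate
# scores the weights of the SCUs it contains, normalized by the best total an
# equally sized set of top-weighted SCUs could achieve. SCU extraction, which
# the thesis automates, is abstracted away here as precomputed sets.
def pyramid_score(candidate_scus: set, scu_weights: dict) -> float:
    observed = sum(scu_weights.get(scu, 0) for scu in candidate_scus)
    # Ideal score: the same number of SCUs, taken from the top of the pyramid.
    top_weights = sorted(scu_weights.values(), reverse=True)
    ideal = sum(top_weights[:len(candidate_scus)])
    return observed / ideal if ideal else 0.0

weights = {"A": 4, "B": 3, "C": 2, "D": 1}  # SCU -> number of references expressing it
print(pyramid_score({"A", "C"}, weights))   # 6 / 7, roughly 0.857
```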
2

Jonsson, Fredrik. "Evaluation of the Transformer Model for Abstractive Text Summarization." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-263325.

Abstract:
Being able to generate summaries automatically could speed up the spread and retention of information and potentially increase productivity in several fields. RNN-based encoder-decoder models with attention have been successful on a variety of language-related tasks such as automatic summarization, but also in the field of machine translation. Lately, the Transformer model has been shown to outperform RNN-based models with attention in the related field of machine translation. This study compares the Transformer model to an LSTM-based encoder-decoder model with attention on the task of abstractive summarization. Evaluation is done both automatically, using ROUGE scores, and with human evaluators who estimate the grammar and readability of the generated summaries. The results show that the Transformer model produces better summaries both in terms of ROUGE score and when evaluated by human evaluators.
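ROUGE, the automatic measure used in this comparison, can be computed with the rouge-score package; a minimal usage sketch follows, with invented example texts.

```python
# A minimal sketch of computing ROUGE with the `rouge-score` package
# (pip install rouge-score). The texts are invented examples.
from rouge_score import rouge_scorer

reference = "the transformer outperformed the lstm baseline on summarization"
candidate = "the transformer beat the lstm baseline at summarization"

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)
for name, score in scores.items():
    print(f"{name}: precision={score.precision:.3f} "
          f"recall={score.recall:.3f} f1={score.fmeasure:.3f}")
```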
3

Lehto, Niko, and Mikael Sjödin. "Automatic text summarization of Swedish news articles." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-159972.

Abstract:
With an increasing amount of textual information available, there is also an increased need to make this information more accessible. Our paper describes a modified TextRank model and investigates the different methods available for using automatic text summarization to create summaries of Swedish news articles. To evaluate our model we focused on intrinsic evaluation methods, in part through content evaluation, in the form of measuring referential clarity and non-redundancy, and in part through text quality evaluation measures, in the form of keyword retention and ROUGE evaluation. The results acquired indicate that stemming and improved stop word capabilities can have a positive effect on the ROUGE scores. The addition of redundancy checks also seems to have a positive effect on avoiding repetition of information. Keyword retention decreased somewhat, however. Lastly, all methods had some trouble with dangling anaphora, showing a need for further work on anaphora resolution.
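For context, a bare-bones TextRank extractor looks like the sketch below: sentences become graph nodes, TF-IDF cosine similarities become edge weights, and PageRank ranks the sentences. This illustrates the base model the thesis modifies, not the authors' implementation.

```python
# A bare-bones sketch of extractive TextRank: build a sentence-similarity
# graph and rank sentences with PageRank. This illustrates the base model
# the thesis modifies, not the authors' own implementation.
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def textrank_summary(sentences, n_select=2):
    tfidf = TfidfVectorizer().fit_transform(sentences)
    sim = cosine_similarity(tfidf)          # sentence-by-sentence similarity
    graph = nx.from_numpy_array(sim)        # weighted similarity graph
    ranks = nx.pagerank(graph)              # centrality of each sentence
    best = sorted(ranks, key=ranks.get, reverse=True)[:n_select]
    return [sentences[i] for i in sorted(best)]  # keep original order

sentences = [
    "The government presented a new budget on Monday.",
    "The budget increases spending on schools.",
    "Critics say the spending increase is too small.",
    "In other news, a local team won its match.",
]
print(textrank_summary(sentences))
```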
4

Jaykumar, Nishita. "ResQu: A Framework for Automatic Evaluation of Knowledge-Driven Automatic Summarization." Wright State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1464628801.

5

Rigouste, Lois. "Evolution of a text summarization system in an automatic evaluation framework." Thesis, University of Ottawa (Canada), 2003. http://hdl.handle.net/10393/26535.

Abstract:
CALLISTO is a text summarizer that searches through a space of possible configurations for the best one. This differs from other systems in that it allows CALLISTO (1) to choose adequate components based on results obtained on the training data (and thus to choose a configuration better adapted to the problem) and (2) to summarize different texts in different ways. The purpose of this thesis is to find out how the initial space CALLISTO explores can be modified to improve the overall quality of the summaries produced. The thesis reviews and evaluates the first arbitrary design choices made in the system through a fully automated framework based on a content measure proposed by Lin and Hovy. We tried different modifications to CALLISTO, such as replacing the internal evaluation measure, testing other discretization processes, changing the learning algorithm, and adding new features to characterize the input text. We found that Naive Bayes outperformed the current learner, C5.0, by identifying one configuration that works satisfactorily for all texts.
6

Hamid, Fahmida. "Evaluation Techniques and Graph-Based Algorithms for Automatic Summarization and Keyphrase Extraction." Thesis, University of North Texas, 2016. https://digital.library.unt.edu/ark:/67531/metadc862796/.

Abstract:
Automatic text summarization and keyphrase extraction are two interesting areas of research which extend across natural language processing and information retrieval. They have recently become very popular because of their wide applicability. Devising generic techniques for these tasks is challenging due to several issues, yet we have a good number of intelligent systems performing the tasks. As different systems are designed with different perspectives, evaluating their performance with a generic strategy is crucial. It has also become immensely important to evaluate the performance with minimal human effort. In our work, we focus on designing a relativized scale for evaluating different algorithms. This is our major contribution, which challenges the traditional approach of working with an absolute scale. We consider the impact of some of the environment variables (length of the document, references, and system-generated outputs) on the performance. Instead of defining rigid lengths, we show how to adjust to their variations. We prove a mathematically sound baseline that should work for all kinds of documents. We emphasize automatically determining the syntactic well-formedness of the structures (sentences). We also propose defining an equivalence class for each unit (e.g. word) instead of the exact string matching strategy. We show an evaluation approach that considers the weighted relatedness of multiple references to adjust to the degree of disagreement between the gold standards. We publish the proposed approach as a free tool so that other systems can use it. We have also accumulated a dataset (scientific articles) with a reference summary and keyphrases for each document. Our approach is applicable not only for evaluating single-document tasks but also for evaluating multi-document tasks. We have tested our evaluation method on three intrinsic tasks (taken from the DUC 2004 conference), and in all three cases it correlates positively with ROUGE. Based on our experiments on the DUC 2004 question-answering task, it correlates with the human decision (extrinsic task) with an accuracy of 36.008%. In general, we can state that the proposed relativized scale performs as well as the popular technique (ROUGE), with flexibility for the length of the output. As part of the evaluation we have also devised a new graph-based algorithm focusing on sentiment analysis. The proposed model can extract units (e.g. words or sentences) from the original text belonging either to the positive sentiment pole or to the negative sentiment pole. It embeds both (positive and negative) types of sentiment flow into a single text graph. The text graph is composed of words or phrases as nodes and their relations as edges. By recursively calling two mutually exclusive relations, the model builds the final ranking of the nodes. Based on the final ranking, it extracts two segments from the article: one with highly positive sentiment and the other with highly negative sentiment. The output of this model was compared with non-polar TextRank output to quantify how much of the polar summaries actually covers the facts along with the sentiment.
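The equivalence-class idea, matching units by class membership rather than exact strings, can be sketched with stemming standing in for the full class definition; the functions below are invented for illustration, not the thesis's tool.

```python
# A simplified sketch of equivalence-class matching, the alternative to exact
# string matching proposed in the thesis: two words match if they fall into
# the same equivalence class. Stemming stands in for the full class definition.
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

def equivalence_class(word: str) -> str:
    """Map a word to its class representative (here: its lowercase stem)."""
    return stemmer.stem(word.lower())

def overlap(reference_words, candidate_words) -> float:
    ref = {equivalence_class(w) for w in reference_words}
    cand = {equivalence_class(w) for w in candidate_words}
    return len(ref & cand) / len(ref) if ref else 0.0

# Exact matching would score 0 here; equivalence-class matching scores 1.0.
print(overlap(["summaries", "evaluated", "automatically"],
              ["summary", "evaluation", "automatic"]))
```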
7

Gonzalez-Gallardo, Carlos. "Automatic Multilingual Multimedia Summarization and Information Retrieval." Thesis, Avignon, 2019. http://www.theses.fr/2019AVIG0234.

Abstract:
As multimedia sources have become massively available online, helping users to understand the large amount of information they generate has become a major issue. One way to approach this is by summarizing multimedia content, thus generating abridged and informative versions of the original sources. This PhD thesis addresses the subject of text and audio-based multimedia summarization in a multilingual context. It has been conducted within the framework of the Access Multilingual Information opinionS (AMIS) CHISTERA-ANR project, whose main objective is to make information easy to understand for everybody. Text-based multimedia summarization uses transcripts to produce summaries that may be presented either as text or in their original format. The transcription of multimedia sources can be done manually or automatically by an Automatic Speech Recognition (ASR) system. The transcripts produced using either method differ from well-formed written language, given their source is mostly spoken language. In addition, ASR transcripts lack syntactic information: for example, capital letters and punctuation marks are unavailable, which means sentences are nonexistent. To deal with this problem, we propose a Sentence Boundary Detection (SBD) method for ASR transcripts which uses textual features to separate the Semantic Units (SUs) within an automatic transcript in a multilingual context. Our approach, based on subword-level information vectors and Convolutional Neural Networks (CNNs), outperforms baselines by correctly identifying SU borders for French, English and Modern Standard Arabic (MSA). We then study the impact of cross-domain datasets over MSA, showing that tuning a model that was originally trained with a big out-of-domain dataset with a small in-domain dataset normally improves SBD performance. Finally, we extend ARTEX, a state-of-the-art extractive text summarization method, to process documents in MSA by adapting its preprocessing modules. The resulting summaries can be presented as plain text or in their original multimedia format by aligning the selected SUs.
Concerning audio-based summarization, we introduce an extractive method which represents the informativeness of the source based on its audio features to select the segments that are most pertinent to the summary. During the training phase, our method uses available transcripts of the audio documents to create an informativeness model which maps a set of audio features to a divergence value. Subsequently, when summarizing new audio documents, transcripts are not needed anymore. Results over a multi-evaluator scheme show that our approach provides understandable and informative summaries. We also deal with evaluation measures: we develop Window-based Sentence Boundary Evaluation (WiSeBE), a semi-supervised metric based on multi-reference (dis)agreement, which questions whether evaluating an automatic SBD system against a single reference is enough to conclude how well the system is performing. We also explore the possibility of measuring the quality of an automatic transcript based on its informativeness. In addition, we study to what extent automatic summarization may compensate for the problems raised during the transcription phase. Lastly, we study how text informativeness evaluation measures may be extended to passage interestingness evaluation.
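As a hedged illustration of the divergence idea in this abstract, the sketch below scores a candidate segment by the Jensen-Shannon distance between its word distribution and the whole document's (lower means more representative). It illustrates the textual analogue only; the thesis's own model maps audio features to divergence values.

```python
# A generic sketch of divergence-based informativeness: a candidate segment is
# scored by the Jensen-Shannon distance between its word distribution and that
# of the whole document (closer to 0 = more representative). This illustrates
# the divergence idea only, not the thesis's audio-feature model.
from collections import Counter
from scipy.spatial.distance import jensenshannon

def distribution(text, vocab):
    """Relative frequencies of the vocabulary words in the text."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) or 1
    return [counts[w] / total for w in vocab]

document = "the council approved the new transport plan after a long debate"
segment = "the council approved the transport plan"

vocab = sorted(set(document.lower().split()))
js = jensenshannon(distribution(document, vocab), distribution(segment, vocab))
print(f"Jensen-Shannon distance: {js:.3f}")
```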
8

Hobson, Stacy F. "Text summarization evaluation: correlating human performance on an extrinsic task with automatic intrinsic metrics." College Park, Md.: University of Maryland, 2007. http://hdl.handle.net/1903/7623.

9

Marinone, Emilio. "Evaluation of New Features for Extractive Summarization of Meeting Transcripts: Improvement of meeting summarization based on functional segmentation, introducing topic model, named entities and domain specific frequency measure." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-249560.

Abstract:
Automatic summarization of meeting transcripts has been widely studied in the last two decades, achieving continuous improvements in terms of the standard summarization metric (ROUGE). A user study has shown that people noticeably prefer abstractive summarization over the extractive approach. However, a fluent and informative abstract depends heavily on the performance of the Information Extraction method(s) applied. In this work, basic concepts useful for understanding meeting summarization methods, such as Parts-of-Speech (POS), Named Entity Recognition (NER), frequency and similarity measures, and topic models, are introduced together with a broad literature analysis. The proposed method takes inspiration from the current unsupervised extractive state of the art and introduces new features that improve the baseline. It is based on functional segmentation, meaning that it first aims to divide the preprocessed source transcript into monologues and dialogues. Then, two different approaches are used to extract the most important sentences from each segment, whose concatenation, together with redundancy reduction, creates the final summary. Results show that a topic model trained on an extended corpus, some variations in the proposed parameters, and the consideration of word tags improve the performance in terms of ROUGE Precision, Recall and F-measure. It outperforms the currently best performing unsupervised extractive summarization method in terms of ROUGE-1 Precision and F-measure. A subjective evaluation of the generated summaries demonstrates that the current unsupervised framework is not yet accurate enough for commercial use, but the newly introduced features can help supervised methods achieve acceptable performance. A much larger, non-artificially constructed meeting dataset with reference summaries is also needed for training supervised methods, as well as a more accurate algorithm evaluation. The source code is available on GitHub: https://github.com/marinone94/ThesisMeetingSummarization
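The redundancy-reduction step mentioned above is commonly implemented with Maximal Marginal Relevance (MMR): greedily pick sentences that are relevant to the document but dissimilar to what has already been selected. The following is a generic MMR sketch, not the thesis code.

```python
# A generic sketch of Maximal Marginal Relevance (MMR), a standard way to do
# the redundancy reduction step mentioned in the abstract. This is not the
# thesis's own implementation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def mmr_select(sentences, n_select=2, lam=0.7):
    # Last row of the TF-IDF matrix represents the whole "document".
    tfidf = TfidfVectorizer().fit_transform(sentences + [" ".join(sentences)])
    sent_vecs, doc_vec = tfidf[:-1], tfidf[-1]
    relevance = cosine_similarity(sent_vecs, doc_vec).ravel()
    sim = cosine_similarity(sent_vecs)
    selected, remaining = [], list(range(len(sentences)))
    while remaining and len(selected) < n_select:
        def mmr(i):
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return [sentences[i] for i in sorted(selected)]

meeting = [
    "We agreed to move the release date to June.",
    "The release will now happen in June, as agreed.",
    "Marketing will prepare a new campaign for the launch.",
]
print(mmr_select(meeting))  # avoids selecting the two near-duplicate sentences
```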
10

Aguiar, Luís Henrique Gonçalves de. "Modelo Cassiopeia como avaliador de sumários automáticos: aplicação em um corpus educacional." UFVJM, 2017. http://acervo.ufvjm.edu.br/jspui/handle/1/1644.

Abstract:
Considering the large amount of textual information currently available, especially on the web, it is becoming increasingly difficult for users to access and assimilate this content. In this context, it becomes necessary to search for tasks that can transform this large amount of information into useful and organized knowledge. One alternative for mitigating this problem is to reduce the volume of information available by producing abstracts of the original texts through automatic summarization (AS). Automatic text summarization consists of the automatic production of abstracts from one or more source texts, such that the summary contains the most relevant information of the source text. The evaluation of abstracts is an important task in the field of automatic text summarization; the most intuitive approach is human evaluation, but it is costly and unproductive. Another alternative is automatic evaluation; several evaluators have been proposed, and the most widely used is ROUGE (Recall-Oriented Understudy for Gisting Evaluation). A limiting factor in ROUGE's evaluation is the use of a human reference summary, which implies a restriction of language and domain, as well as requiring time-consuming and expensive human work. In view of the difficulties encountered in the evaluation of automatic summaries, this work presents the Cassiopeia model as a new evaluation method. The model is a hierarchical text clusterer that uses summarization in the preprocessing stage, where the quality of the clustering is positively influenced by the quality of the summarization. The simulations performed in this work showed that the evaluations performed by the Cassiopeia model are similar to those of the ROUGE tool. On the other hand, the use of the Cassiopeia model as an automatic summarization evaluator showed some advantages, the main ones being the non-use of a human abstract in the evaluation process and independence from domain and language.
Dissertation (Professional Master's), Programa de Pós-Graduação em Educação, Universidade Federal dos Vales do Jequitinhonha e Mucuri, 2017.
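The core intuition, that summaries which preserve content should cluster as well as the texts they stand for, can be illustrated with a reference-free proxy: cluster the summaries and read off an internal quality index. The sketch below uses TF-IDF, k-means, and the silhouette score purely as an illustration; Cassiopeia itself is a hierarchical clusterer with its own quality measure.

```python
# A hedged illustration of the Cassiopeia intuition: if summaries preserve the
# content of their source texts, clustering the summaries should yield
# well-separated groups, so internal cluster quality can serve as a
# reference-free evaluation proxy. Cassiopeia itself is hierarchical and uses
# its own measure; this shows only the general idea.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

summaries = [
    "the team won the football match",        # sports
    "the striker scored twice in the game",   # sports
    "parliament passed the new tax law",      # politics
    "the senate debated the tax reform",      # politics
]

X = TfidfVectorizer().fit_transform(summaries)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("silhouette (higher = better-separated clusters):",
      silhouette_score(X, labels))
```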

Books on the topic "Summarization evaluation"

1

Hovy, Eduard. Text Summarization. Edited by Ruslan Mitkov. Oxford University Press, 2012. http://dx.doi.org/10.1093/oxfordhb/9780199276349.013.0032.

Abstract:
This article describes research and development on the automated creation of summaries of one or more texts. It defines the concept of a summary and presents an overview of the principal approaches in summarization. It describes the design, implementation, and performance of various summarization systems. The stages of automated text summarization are topic identification, interpretation, and summary generation, each having its own sub-stages. Due to the challenges involved, multi-document summarization is much less developed than single-document summarization. This article reviews particular techniques used in several summarization systems. Finally, it assesses the methods of evaluating summaries, reviewing evaluation strategies from previous evaluation studies to the two-basic-measures method. Summaries are highly task- and genre-specific; therefore, no single measure covers all cases of evaluation.
2

Hirschman, Lynette, and Inderjeet Mani. Evaluation. Edited by Ruslan Mitkov. Oxford University Press, 2012. http://dx.doi.org/10.1093/oxfordhb/9780199276349.013.0022.

Abstract:
The commercial success of natural language (NL) technology has raised the technical criticality of evaluation. Choices of evaluation methods depend on software life cycles, typically charting four stages: research, advanced prototype, operational prototype, and product. At the prototype stage, embedded evaluation can prove helpful. Analysis components can be loosely grouped into segmentation, tagging, information extraction, and document threading. Output technologies such as text summarization can be evaluated in terms of intrinsic and extrinsic measures, the former checking for quality and informativeness, the latter for efficiency and acceptability in some task. 'Post-edit measures', commonly used in machine translation, determine the amount of correction required to obtain a desirable output. Evaluation of interactive systems typically evaluates the system and the user as one team and must contend with subject variability, which requires running enough subjects to obtain statistical validity, hence incurring substantial costs. Evaluation, being a social activity, creates a community for internal technical comparison via shared evaluation criteria.

Book chapters on the topic "Summarization evaluation"

1

Nanba, Hidetsugu, Tsutomu Hirao, Takahiro Fukushima, and Manabu Okumura. "Text Summarization Challenge: An Evaluation Program for Text Summarization." In Evaluating Information Retrieval and Access Tasks. Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-5554-1_3.

2

Khari, Manju, Renu Dalal, Arush Sharma, and Avinash Dubey. "Evaluation of Text-Summarization Technique." In Multimodal Biometric Systems. CRC Press, 2021. http://dx.doi.org/10.1201/9781003138068-7.

3

Komorowski, Artur, Lucjan Janowski, and Mikołaj Leszczuk. "Evaluation of Multimedia Content Summarization Algorithms." In Cryptology and Network Security. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-98678-4_43.

4

Mehta, Parth, and Prasenjit Majumder. "Corpora and Evaluation for Text Summarisation." In From Extractive to Abstractive Summarization: A Journey. Springer Singapore, 2019. http://dx.doi.org/10.1007/978-981-13-8934-4_3.

5

Pawling, Alec, Nitesh V. Chawla, and Amitabh Chaudhary. "Evaluation of Summarization Schemes for Learning in Streams." In Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11871637_34.

6

Koch, T., Jinlong Ma, Liyuan Ma, and Dan Fang. "Summarization of Credibility Evaluation of Missile Simulation Data." In Lecture Notes in Electrical Engineering. Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-10-6571-2_175.

7

Yagunova, Elena, Olga Makarova, and Ekaterina Pronoza. "Data-Driven Unsupervised Evaluation of Automatic Text Summarization Systems." In Advances in Artificial Intelligence and Its Applications. Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-27101-9_3.

8

Sharma, Neeraj, and Vaibhav Jain. "Evaluation and Summarization of Student Feedback Using Sentiment Analysis." In Advances in Intelligent Systems and Computing. Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-3383-9_35.

9

Crossley, Scott A., Minkyung Kim, Laura Allen, and Danielle McNamara. "Automated Summarization Evaluation (ASE) Using Natural Language Processing Tools." In Lecture Notes in Computer Science. Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-23204-7_8.

10

Khilji, Abdullah Faiz Ur Rahman, Utkarsh Sinha, Pintu Singh, Adnan Ali, and Partha Pakray. "Abstractive Text Summarization Approaches with Analysis of Evaluation Techniques." In Communications in Computer and Information Science. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-75529-4_19.


Conference papers on the topic "Summarization evaluation"

1

Bhandari, Manik, Pranav Narayan Gour, Atabak Ashfaq, Pengfei Liu, and Graham Neubig. "Re-evaluating Evaluation in Text Summarization." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.emnlp-main.751.

2

Gao, Yanjun, Chen Sun, and Rebecca J. Passonneau. "Automated Pyramid Summarization Evaluation." In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/k19-1038.

3

Litvak, Marina, Natalia Vanetik, and Itzhak Eretz Kdosha. "HEvAS: Headline Evaluation and Analysis System." In MultiLing 2019: Summarization Across Languages, Genres and Sources. Incoma Ltd., Shoumen, Bulgaria, 2019. http://dx.doi.org/10.26615/978-954-452-058-8_010.

4

Radev, Dragomir R., and Daniel Tam. "Summarization evaluation using relative utility." In Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM '03). ACM Press, 2003. http://dx.doi.org/10.1145/956863.956960.

5

Bommasani, Rishi, and Claire Cardie. "Intrinsic Evaluation of Summarization Datasets." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.emnlp-main.649.

6

Koto, Fajri, Jey Han Lau, and Timothy Baldwin. "Evaluating the Efficacy of Summarization Evaluation across Languages." In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.findings-acl.71.

7

Litvak, Marina, Natalia Vanetik, and Yael Veksler. "EASY-M: Evaluation System for Multilingual Summarizers." In MultiLing 2019: Summarization Across Languages, Genres and Sources. Incoma Ltd., Shoumen, Bulgaria, 2019. http://dx.doi.org/10.26615/978-954-452-058-8_008.

8

Indu, M., and K. V. Kavitha. "Review on text summarization evaluation methods." In 2016 International Conference on Research Advances in Integrated Navigation Systems (RAINS). IEEE, 2016. http://dx.doi.org/10.1109/rains.2016.7764406.

9

Khandelwal, Vikash, Rahul Gupta, and James Allan. "An evaluation corpus for temporal summarization." In Proceedings of the First International Conference on Human Language Technology Research (HLT '01). Association for Computational Linguistics, 2001. http://dx.doi.org/10.3115/1072133.1072174.

10

Scanlon, Liam, Shiwei Zhang, Xiuzhen Zhang, and Mark Sanderson. "Evaluation of Cross Domain Text Summarization." In SIGIR '20: The 43rd International ACM SIGIR conference on research and development in Information Retrieval. ACM, 2020. http://dx.doi.org/10.1145/3397271.3401285.


Reports on the topic "Summarization evaluation"

1

President, Stacy F., and Bonnie J. Dorr. Text Summarization Evaluation: Correlating Human Performance on an Extrinsic Task with Automatic Intrinsic Metrics. Defense Technical Information Center, 2006. http://dx.doi.org/10.21236/ada455670.
