Dissertations / Theses on the topic 'Texte informatif'
Consult the top 50 dissertations / theses for your research on the topic 'Texte informatif.'
Boulanger, Chantale Lamontagne Anne. "Pour une étude du texte informatif /." Thèse, Chicoutimi : Université du Québec à Chicoutimi, 1989. http://theses.uqac.ca.
Boulanger, Chantale. "Pour une étude du texte informatif." Thèse, Université du Québec à Trois-Rivières, 1989. http://constellation.uqac.ca/1564/1/1461671.pdf.
Donnelly, Karen. "Le développement du texte informatif en classe d'immersion au primaire." Thesis, Université Laval, 2013. http://www.theses.ulaval.ca/2013/29400/29400.pdf.
Mélançon, Julie. "Effets des habiletés métaphonologiques et métasyntaxiques sur la compréhension d'un texte informatif en 1re année du primaire." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/mq25676.pdf.
Boganika, Luciane. "Le défi de l'éducation au Brésil et en France : Le processus de lecture des jeunes et des adultes en situation de réinsertion scolaire dans la perspective d'une reprise d'études." Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAL023/document.
To support the school reintegration of young people and adults, Brazil set up the educational program Education for Young People and Adults [Educação de Jovens e Adultos – EJA], while France set up the Diploma for Access to University Studies [Diplôme d’Accès aux Études Universitaires - DAEU]. Each constitutes a complete educational scheme within the national education system, alongside other official programs, with its own courses, teaching staff, and exams leading to a degree. Graduates of both programs may apply for public service positions. In Brazil they can also apply to public and private universities, and in France they can enroll in institutions open to people with a university degree. However, in Brazil the pedagogical training is oriented towards the job market and aims to promote professional reintegration and mobility, while in France it focuses on the pursuit of further studies.
In our thesis, we are interested in the teaching of reading and seek, from a contrastive perspective, to evaluate the effects of each of these two educational systems on the learning of written comprehension; we also discuss the effects of the school model in which the readers did their first learning and, in particular, their first reading.
Our research method, set up over several study sessions, invites readers to answer a reading quiz about informative and journalistic texts taken from the internet. To address our research problem, we formulate the following questions: (1) How do these readers read? (2) Which elements of the text do they mobilize? (3) How are these elements taken up in their answers? (4) What strategies are implemented during reading?
This method and its theoretical foundations are inspired by the work on reading and written comprehension conducted in the LIDILEM laboratory of the University of Grenoble-Alpes, following the studies initiated by Michel DABÈNE (DABÈNE, FRIER & VISOZ, 1992), and by the work of the Brazilian research team led by Lúcia CHEREM and Rosa NERY on the matrix of questions and the progress of reading. The intersection of these different indicators (part of the text mobilized, strategy for appropriating the text, identification of the argumentative path, value judgments and enunciative polyphony, formulation of the answer) allows us to reconstruct the coherence of the reading path, from the retrieval of information to the understanding of textual complexity through argumentation analysis, and to identify types of reading in terms of both the degree of development and the reading-comprehension process.
Furthermore, we ask whether the concern with reading in social thought (FREIRE, 1967) influences the quality of reading and written comprehension as well as the quality of the pedagogical training of readers. We develop our answers by referring to researchers who, like Paulo Freire and Pierre Bourdieu, thought about the social question. We think, with them, that when reading is meaningful to the reader, the learning work is more motivated and more coherent.
The special attention given to the online press article, which we take as the basic document for our research, is motivated by this perspective, which establishes a dialogue between the social practices of writing and the teaching and learning of reading and written comprehension.
Svanberg, Kerstin. "Les sensations sensorielles qu’évoquent les vins - pénibles à exprimer et encore plus à traduire ? : À propos de la richesse des couleurs et des arômes du roi bordelais et de ses confrères." Thesis, Linnéuniversitetet, Institutionen för språk (SPR), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-89214.
Shrestha, Prajol. "Alignement inter-modalités de corpus comparable monolingue." Phd thesis, Université de Nantes, 2013. http://tel.archives-ouvertes.fr/tel-00909179.
Kavanagh, Judith. "The Text Analyzer: A tool for knowledge acquisition from texts." Thesis, University of Ottawa (Canada), 1995. http://hdl.handle.net/10393/10149.
Rosell, Magnus. "Text Clustering Exploration : Swedish Text Representation and Clustering Results Unraveled." Doctoral thesis, KTH, Numerisk Analys och Datalogi, NADA, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-10129.
Saint-Germain, Isabelle. "Le passage de l'article scientifique au texte vulgarisé : analyse de la structure, du contenu et de la rhétorique des textes." Sherbrooke : Université de Sherbrooke, 2004.
Malanga, Paul R. "Using instruction and practice to improve recall of thematic and text-based information from orally presented texts /." The Ohio State University, 1997. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487946776022526.
Öhrman, Frida. "Texter om QUASI-projektet : Att popularisera forskningsinformation." Thesis, Mälardalen University, Department of Innovation, Design and Product Development, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-343.
The QUASI project is an interdisciplinary research project involving universities and university colleges in five European countries. The QUASI collaboration focuses on research on yeast cells. The project ended in May 2007, and the scientific results are to be explained to the general public. The aim of my degree project was to produce easily understandable texts about the QUASI project in order to inform the public about the project and its results. The texts are to be published on a web page. My research question was: how does one simplify research information so that people without background knowledge of the subject understand it well? I used several different methods to arrive at a final result. By collecting literature and carrying out a questionnaire survey, I obtained material to start from. I then produced my texts, tested them on intended users, revised them, and tested them again. I also tested images in the texts and then had an illustrator make new images. I have developed texts that, according to the tests, are easy for the target group to understand. I finished my texts on time and they were approved by the client. I have thereby achieved the aim of my degree project. I have learned that how one popularizes research is very personal. Writing texts is a craft that takes time to learn. I have developed an approach that works well for me, but I believe everyone has their own way of working. The most important thing is to test one's material in several rounds on intended end users, to really listen to what they have to say, and to change one's texts accordingly.
Moysset, Bastien. "Détection, localisation et typage de texte dans des images de documents hétérogènes par Réseaux de Neurones Profonds." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSEI044/document.
Being able to automatically read the texts written in documents, both printed and handwritten, makes it possible to access the information they convey. To transcribe the text of a full page, detecting and localizing the text lines is a crucial step. Traditional methods tend to rely on image processing, but they generalize poorly to very heterogeneous datasets. In this thesis, we propose an approach based on deep neural networks. We first propose a one-dimensional segmentation of text paragraphs into lines that uses a technique inspired by text recognition models. The connectionist temporal classification (CTC) method is used to implicitly align the sequences. Then, we propose a neural network that directly predicts the coordinates of the boxes bounding the text lines. Adding a confidence prediction to these hypothesis boxes makes it possible to locate a varying number of objects. We propose to predict the objects locally in order to share the network parameters between locations and to increase the number of different objects that each box predictor sees during training. This compensates for the rather small size of the available datasets. To recover the contextual information that carries knowledge about the document layout, we add multi-dimensional LSTM recurrent layers between the convolutional layers of our networks. We propose three full-page text recognition strategies that address the need for highly precise text line position predictions. We show on the heterogeneous Maurdor dataset how our methods perform on documents that can be printed or handwritten, in French, English, or Arabic, and we compare favourably with other state-of-the-art methods. Visualizing the concepts learned by our neurons highlights the ability of the recurrent layers to convey contextual information.
Sætre, Rune. "GeneTUC: Natural Language Understanding in Medical Text." Doctoral thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2006. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-545.
Natural Language Understanding (NLU) is a 50-year-old research field, but its application to molecular biology literature (BioNLU) is less than 10 years old. After the complete human genome sequence was published by the Human Genome Project and Celera in 2001, there has been an explosion of research, shifting the NLU focus from domains like news articles to the domain of molecular biology and medical literature. BioNLU is needed because almost 2000 new articles are published and indexed every day, and biologists need to know about existing knowledge regarding their own research. So far, BioNLU results are not as good as in other NLU domains, so more research is needed to solve the challenges of creating useful NLU applications for biologists.
The work in this PhD thesis is a “proof of concept”. It is the first to show that an existing Question Answering (QA) system can be successfully applied in the hard BioNLU domain, after the essential challenge of unknown entities is solved. The core contribution is a system that discovers and classifies unknown entities and relations between them automatically. The World Wide Web (through Google) is used as the main resource, and the performance is almost as good as other named entity extraction systems, but the advantage of this approach is that it is much simpler and requires less manual labor than any of the other comparable systems.
The first paper in this collection gives an overview of the field of NLU and shows how the Information Extraction (IE) problem can be formulated with Local Grammars. The second paper uses Machine Learning to automatically recognize protein names based on features from the GSearch Engine. In the third paper, GSearch is replaced with Google, and the task is to extract all unknown names belonging to one of 273 biomedical entity classes, such as genes, proteins, and processes. After promising results were obtained with Google, the fourth paper shows that this approach can also be used to retrieve interactions or relationships between the named entities. The fifth paper describes an online implementation of the system and shows that the method scales well to a larger set of entities.
The final paper concludes the “proof of concept” research and shows that the performance of the original GeneTUC NLU system has increased from handling 10% of the sentences in a large collection of abstracts in 2001 to 50% in 2006. This is still not good enough to create a commercial system, but it is believed that another 40% performance gain can be achieved by importing more verb templates into GeneTUC, just as nouns were imported during this work. Work on this has already begun, in the form of a local Master's thesis.
Biedert, Ralf [Verfasser]. "Gaze-Based Human-Text Interaction/Text 2.0 / Ralf Biedert." München : Verlag Dr. Hut, 2014. http://d-nb.info/1050331605/34.
Richards, Eric D. "Goal information and text comprehension." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape10/PQDD_0003/MQ41764.pdf.
Ciravegna, Fabio. "User-defined information extraction from texts." Thesis, University of East Anglia, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.273293.
Adámek, Tomáš. "Metody stemmingu používané při dolování textu." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-235547.
Pospíšil, Milan. "Extrakce sémantických vztahů z textu." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-412824.
Popescu, Ana-Maria. "Information extraction from unstructured web text /." Thesis, Connect to this title online; UW restricted, 2007. http://hdl.handle.net/1773/6935.
Hodgson, Grant Michael. "Suggesting Missing Information in Text Documents." BYU ScholarsArchive, 2018. https://scholarsarchive.byu.edu/etd/7296.
Viana, Hugo Henrique Amorim. "Automatic information retrieval through text-mining." Master's thesis, Faculdade de Ciências e Tecnologia, 2013. http://hdl.handle.net/10362/11308.
Nowadays, a very large number of firms in the European Union are classified as Small and Medium Enterprises (SMEs), and they employ a large portion of the active workforce in Europe. Nonetheless, SMEs cannot afford to implement methods or tools that systematically integrate innovation into their business processes. Innovation is the engine of competitiveness in a globalized environment, especially in the current socio-economic situation. This thesis provides a platform that, when integrated with the ExtremeFactories (EF) project, helps SMEs become more competitive by means of a scheduled monitoring functionality. The thesis presents a text-mining platform able to schedule the gathering of information by keyword. In developing the platform, several implementation choices were made; the one that requires particular emphasis is the framework, Apache Lucene Core 2, which supplies an efficient text-mining tool and is used heavily throughout the thesis.
Staab, Steffen. "Grading knowledge extracting degree information from texts /." Berlin ; Heidelberg : Springer, 2000. http://deposit.ddb.de/cgi-bin/dokserv?idn=965576841.
Cabral, Loni Grimm. "The role of metaphor in informative texts." reponame:Repositório Institucional da UFSC, 1994. https://repositorio.ufsc.br/xmlui/handle/123456789/157849.
Informative texts in Portuguese and in English are analysed in order to observe the cohesive role of metaphor. The analysis is carried out from the perspective of matching relations (Winter, 1986), labels (Francis, 1994), lexical repetition (Hoey, 1991), textual patterns (Winter and Hove, 1986), retrospection and prospection (Sinclair, 1992), and the macrofunctions of language (Halliday, 1985). Textual markers differentiating interpersonal and ideational metaphors were observed. An examination of metaphor in the translation of informative texts confirms its cohesive role and its differentiation according to function. Implications of the research for the teaching of reading and text analysis are presented.
Tabassum, Binte Jafar Jeniya. "Information Extraction From User Generated Noisy Texts." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1606315356821532.
Nguyen, Minh Tien. "Détection de textes générés automatiquement." Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAM025/document.
Automatically generated text has been used on numerous occasions with distinct intentions, ranging from generated comments in an online discussion to far more mischievous tasks such as manipulating bibliographic information. This thesis first introduces different methods of generating free text on a given topic and shows how such text can be used. We then tackle several research questions. The first is how best to detect a fully generated document. Next, we go one step further and address the possibility of detecting a couple of sentences or a short paragraph of automatically generated text, proposing a new method that computes sentence similarity from grammatical structure. The last question is how to detect an automatically generated document without any samples, which covers the case of a new generator or a generator from which samples cannot be collected.
This thesis also deals with the industrial aspect of development. A simple overview of the publishing workflow of a high-profile publisher is presented, followed by an analysis of how best to incorporate our detection method into the production workflow.
In conclusion, this thesis sheds light on several important research questions about the possibility of detecting automatically generated text in different settings. Besides the research itself, substantial engineering work was carried out in a real-life industrial environment to demonstrate the importance of having real applications alongside hypothetical research.
Carlehed, Claes. "Kognitiva belastningar vid läsning och navigering i elektronisk text." Thesis, University of Skövde, Department of Computer Science, 1998. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-207.
Today, enormous amounts of electronic text are stored in databases at companies, institutions, and similar places. The bodies of text are usually easily accessible, easily edited, and simple to distribute. Unfortunately, problems arise when one has to read and navigate large documents. Cognitive load on short-term memory is a consequence of the difficulty of getting an overview of large documents. Problems also arise with navigation within the documents. This usually leads the reader to print the document in order to escape these and similar problems.
The present work was carried out in cooperation with Familjen Dafgård AB in Källby. An intranet is being developed there, and it includes large text-based databases. Studies were carried out with the help of employees at the company to capture their views on navigation and reading in electronic documents.
The method used to clarify how readers experience different ways of navigating electronic text consisted of a qualitative interview and a quantitative efficiency measurement of two ways of navigating documents: scrolling through the text or linking within it.
The results show that the employees mainly prefer linking to scrolling in long text documents. The time study shows a tendency for linking to be somewhat faster than scrolling.
Reffle, Ulrich [Verfasser]. "Algorithmen und Methoden zur dokumentenspezifischen Analyse historischer und OCR-erfasster Texte / Ulrich Reffle." München : Verlag Dr. Hut, 2011. http://d-nb.info/1017353417/34.
Chétrit, Héloèise. "Ett verktyg för konstruktion av ontologier från text." Thesis, Linköping University, Department of Computer and Information Science, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2228.
With the growth of information stored on the Internet, especially in the biological field, and with discoveries being made daily in this domain, scientists are faced with an overwhelming number of articles. Reading all published articles is a tedious and time-consuming process, so a way to summarise the information in the articles is needed. One solution is to derive an ontology that represents the knowledge contained in the set of articles and allows users to browse through them.
In this thesis we present the tool Ontolo, which makes it possible to build an initial ontology of a domain by inserting a set of articles related to that domain into the system. The quality of the ontology construction was tested by comparing our ontology results for keywords to those provided by the Gene Ontology for the same keywords.
The results obtained are quite promising for a first prototype of the system, as it finds many terms common to both ontologies from just a few hundred inserted articles.
Staab, Steffen [Verfasser]. "Grading knowledge : extracting degree information from texts / Steffen Staab." Berlin, 2000. http://d-nb.info/965576841/34.
Seidel, Christian [Verfasser]. "System zur dynamischen Erfassung domänenspezifischer Texteinheiten und Texte mit Hilfe von Eigenvektormethoden / Christian Seidel." München : Verlag Dr. Hut, 2010. http://d-nb.info/1008331384/34.
Romsdorfer, Harald [Verfasser]. "Polyglot Text-to-Speech Synthesis : Text Analysis & Prosody Control / Harald Romsdorfer." Aachen : Shaker, 2009. http://d-nb.info/1156517354/34.
Leidner, Jochen Lothar. "Toponym resolution in text." Thesis, University of Edinburgh, 2007. http://hdl.handle.net/1842/1849.
Krishnan, Sharenya. "Text-Based Information Retrieval Using Relevance Feedback." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-53603.
Lanquillon, Carsten. "Enhancing text classification to improve information filtering." [S.l. : s.n.], 2001. http://deposit.ddb.de/cgi-bin/dokserv?idn=963801805.
Foster, Mary Ellen. "Automatically generating text to accompany information graphics." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0001/MQ45946.pdf.
Teufel, Simone. "Argumentative zoning : information extraction from scientific text." Thesis, University of Edinburgh, 1999. http://hdl.handle.net/1842/11456.
Murad, Masrah Azrifah Azmi. "Fuzzy text mining for intelligent information retrieval." Thesis, University of Bristol, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.416830.
Kyriakides, Alexandros 1977. "Supervised information retrieval for text and images." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/28426.
Includes bibliographical references (leaves 73-74).
We present a novel approach to choosing an appropriate image for a news story. Our method uses the caption of the image to retrieve a suitable image. We have developed a word-extraction engine called WordEx. WordEx uses supervised learning to predict which words in the text of a news story are likely to be present in the caption of an appropriate image. The words extracted by WordEx are then used to retrieve the image from a collection of images. On average, the number of words extracted by WordEx is 10% of the original story text. Therefore, this word-extraction engine can also be applied to text documents for feature reduction.
by Alexandros Kyriakides.
M.Eng.
Smail, Nabila. "Contribution à l'analyse et à la recherche d'information en texte intégral : application de la transformée en ondelettes pour la recherche et l'analyse de textes." Phd thesis, Université Paris-Est, 2009. http://tel.archives-ouvertes.fr/tel-00504368.
Jessop, David M. "Information extraction from chemical patents." Thesis, University of Cambridge, 2011. https://www.repository.cam.ac.uk/handle/1810/238302.
Bundschus, Markus. "From Text to Knowledge." Diss., lmu, 2010. http://nbn-resolving.de/urn:nbn:de:bvb:19-118841.
Song, Min Song Il-Yeol. "Robust knowledge extraction over large text collections /." Philadelphia, Pa. : Drexel University, 2005. http://dspace.library.drexel.edu/handle/1860/495.
Käter, Thorsten. "Evaluierung des Text-Retrievalsystems "Intelligent Miner for Text" von IBM : eine Studie im Vergleich zur Evaluierung anderer Systeme /." [S.l. : s.n.], 1999. http://www.bsz-bw.de/cgi-bin/xvms.cgi?SWB8230685.
Hasegawa, Satoshi, Kumi Sato, Shohei Matsunuma, Masaru Miyao, and Kohei Okamoto. "Multilingual Disaster Information System : Information Delivery Using Graphic Text for Mobile Phones." Springer, 2005. http://hdl.handle.net/2237/8651.
Brifkany, Jan, and Yasini Anass El. "Text Recognition in Natural Images : A study in Text Detection." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-282935.
In recent years, a growing number of computer vision methods and solutions have been developed to solve the computer vision problem. By combining methods from different areas of computer vision, computer scientists have been able to develop more advanced and complex models to solve these problems. This report covers two categories, text detection and text recognition. These areas are defined, described, and analysed in the results and discussion chapter. The report covers a very interesting and challenging topic, text recognition in natural images. It aims to assess the improvement in OCR results after three image segmentation methods have been applied to images. The methods used are "Maximally stable extremal regions" and geometric filtering based on geometric properties. The results showed that OCR with segmentation methods performed better overall than OCR without segmentation methods. They also showed that images with horizontal text orientation had better accuracy when OCR with segmentation methods was applied than images with multi-oriented text.
Boynuegri, Akif. "Cross-lingual Information Retrieval On Turkish And English Texts." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12611903/index.pdf.
Sabir, Ahmed. "Enhancing scene text recognition with visual context information." Doctoral thesis, Universitat Politècnica de Catalunya, 2020. http://hdl.handle.net/10803/670286.
This thesis addresses the problem of improving text recognition systems, which detect and recognize text in unconstrained images (for example, a street sign, an advertisement, a bus destination, etc.). The goal is to improve the performance of existing vision systems by exploiting semantic information derived from the image itself. The main idea is that knowing the content of the image, or the visual context in which a text appears, can help decide which words are correct. For example, the fact that an image shows a coffee shop makes it more likely that a word on a sign reads Dunkin rather than unkind. We address this problem by drawing on advances in natural language processing and machine learning, in particular by learning re-rankers and neural networks, to present post-processing solutions that improve state-of-the-art text recognition systems without the need for costly retraining or fine-tuning procedures that require large amounts of data. Discovering the degree of semantic relatedness between candidate words and their image context is a task related to assessing semantic similarity between words or text fragments. However, determining the existence of a semantic relation is a more general task than assessing similarity (for example, car, road, and traffic light are related but not similar), so existing methods require certain adaptations. To meet the requirements of these broader perspectives on semantic relatedness, we develop two approaches to learning the semantic relatedness between the recognized word and its context: word-to-word (with the objects in the image) or word-to-sentence (the image caption). The word-to-word approach uses re-rankers based on word embeddings.
The re-ranker takes the words proposed by the baseline system and re-orders them according to the visual context provided by the object classifier. For the second case, an end-to-end neural approach was designed to exploit the image description (caption) at both sentence and word level and to re-order the candidate words based on both the visual context and co-occurrences with the caption. As an additional contribution, to meet the requirements of data-driven approaches such as neural networks, we present a visual context dataset for this task, in which the publicly available COCO-text dataset [Veit et al. 2016] has been extended with information about the scene (including the objects and places appearing in the image), to enable researchers to include semantic relations between texts and scene in their text recognition systems and to offer a common evaluation baseline for such approaches.
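The word-to-word re-ranking idea in the abstract above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the three-dimensional toy vectors, the function names `cosine` and `rerank`, the use of the maximum cosine similarity as the relatedness score, and the linear mix controlled by `weight`.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load pretrained word embeddings instead.
TOY_EMBEDDINGS = {
    "dunkin": [0.9, 0.1, 0.0],
    "unkind": [0.0, 0.2, 0.9],
    "coffee": [0.8, 0.3, 0.1],
    "cup": [0.7, 0.4, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank(candidates, context_words, embeddings, weight=0.5):
    """Re-order (word, baseline_score) pairs using visual context words.

    Relatedness is the highest cosine similarity between the candidate
    and any context word (hypothetical choice); the final score mixes
    it linearly with the baseline recognizer score.
    """
    rescored = []
    for word, base_score in candidates:
        vec = embeddings.get(word)
        sims = [
            cosine(vec, embeddings[ctx])
            for ctx in context_words
            if vec is not None and ctx in embeddings
        ]
        relatedness = max(sims) if sims else 0.0
        rescored.append((word, (1 - weight) * base_score + weight * relatedness))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# The baseline recognizer slightly prefers "unkind"; the object
# classifier reports coffee-shop objects in the image.
candidates = [("unkind", 0.55), ("dunkin", 0.45)]
ranked = rerank(candidates, ["coffee", "cup"], TOY_EMBEDDINGS)
print([word for word, _ in ranked])
```

With an empty context list the relatedness term is zero for every candidate, so the baseline order is preserved; the context only changes the ranking when it carries evidence.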
Tarczyńska, Anna. "Methods of Text Information Extraction in Digital Videos." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-2656.
The huge number of existing digital video files must be indexed to make them accessible to customers (easier searching). Indexing can be provided by text information extraction. In this thesis we analyse and compare methods of text information extraction in digital videos. Furthermore, we evaluate them in a new context that we propose, namely their usefulness for indexing sports news and for information retrieval.
Odd, Joel, and Emil Theologou. "Utilize OCR text to extract receipt data and classify receipts with common Machine Learning algorithms." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-148350.