Dissertations / Theses on the topic 'Text detection and recognition'
Consult the top 50 dissertations / theses for your research on the topic 'Text detection and recognition.'
Brifkany, Jan, and Yasini Anass El. "Text Recognition in Natural Images : A study in Text Detection." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-282935.
In recent years, a growing number of computer vision methods and solutions have been developed to address computer vision problems. By combining different methods from different areas of computer vision, computer scientists have been able to develop more advanced and complex models to solve these problems. This report covers two categories, text detection and text recognition. These areas are defined, described and analysed in the results and discussion chapters. The report addresses a very interesting and challenging topic: text recognition in natural images. It aims to assess the improvement in OCR results after three image segmentation methods have been applied to the images. The methods used are Maximally Stable Extremal Regions and geometric filtering based on geometric properties. The results showed that OCR with segmentation methods performed better overall than OCR without segmentation methods. It was also shown that images with horizontally oriented text achieved better accuracy when applying OCR with segmentation methods than images with multi-oriented text.
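To make the pipeline this abstract describes concrete (extract MSER candidate regions, filter them by simple geometric properties, then run OCR on what survives), here is a minimal Python sketch. It assumes OpenCV and pytesseract (with a Tesseract install); the helper names and every threshold are illustrative assumptions, not the parameters used in the thesis.

```python
import cv2
import pytesseract

def detect_text_regions(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    mser = cv2.MSER_create()
    _, boxes = mser.detectRegions(gray)          # boxes: one (x, y, w, h) per stable region
    keep = []
    for (x, y, w, h) in boxes:
        aspect = w / float(h)
        area = w * h
        # Simple geometric filtering: discard regions unlikely to be characters or words.
        if 0.1 < aspect < 3.0 and 50 < area < 0.3 * gray.size:
            keep.append((x, y, w, h))
    return gray, keep

def ocr_regions(image_bgr):
    gray, boxes = detect_text_regions(image_bgr)
    words = []
    for (x, y, w, h) in boxes:
        crop = gray[y:y + h, x:x + w]
        text = pytesseract.image_to_string(crop, config="--psm 7").strip()
        if text:
            words.append(text)
    return words

# Usage: words = ocr_regions(cv2.imread("scene.jpg"))
```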
Khiari, El Hebri. "Text Detection and Recognition in the Automotive Context." Thesis, Université d'Ottawa / University of Ottawa, 2015. http://hdl.handle.net/10393/32458.
Yousfi, Sonia. "Embedded Arabic text detection and recognition in videos." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEI069/document.
This thesis focuses on Arabic embedded text detection and recognition in videos. Different approaches robust to Arabic text variability (fonts, scales, sizes, etc.) as well as to environmental and acquisition condition challenges (contrast, degradation, complex background, etc.) are proposed. We introduce different machine learning-based solutions for robust text detection that do not rely on any pre-processing. The first method is based on Convolutional Neural Networks (ConvNets), while the others use a specific boosting cascade to select relevant hand-crafted text features. For text recognition, our methodology is segmentation-free. Text images are transformed into sequences of features using a multi-scale scanning scheme. Standing out from the dominant methodology of hand-crafted features, we propose to learn relevant text representations from data using different deep learning methods, namely Deep Auto-Encoders, ConvNets and unsupervised learning models. Each one leads to a specific OCR (Optical Character Recognition) solution. Sequence labeling is performed without any prior segmentation using a recurrent connectionist learning model. The proposed solutions are compared to other methods based on non-connectionist and hand-crafted features. In addition, we propose to enhance the recognition results using Recurrent Neural Network-based language models that are able to capture long-range linguistic dependencies. Both OCR and language model probabilities are incorporated in a joint decoding scheme where additional hyper-parameters are introduced to boost recognition results and reduce the response time. Given the lack of public multimedia Arabic datasets, we propose novel annotated datasets built from Arabic videos. The OCR dataset, called ALIF, is publicly available for research purposes; to the best of our knowledge, it is the first public dataset dedicated to Arabic video OCR. Our proposed solutions were extensively evaluated. The obtained results highlight the genericity and efficiency of our approaches, reaching a word recognition rate of 88.63% on the ALIF dataset and outperforming a well-known commercial OCR engine by more than 36%.
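As a rough illustration of the segmentation-free recognition scheme the abstract outlines (features extracted over the text image, recurrent sequence labelling, no prior character segmentation), here is a minimal PyTorch sketch of a convolutional-recurrent model trained with CTC. The layer sizes, class names and alphabet size are assumptions made for the example; the thesis' own models (Deep Auto-Encoders, ConvNets, unsupervised feature learning) differ in detail.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, n_classes, img_height=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        feat_dim = 128 * (img_height // 4)
        self.rnn = nn.LSTM(feat_dim, 256, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(512, n_classes + 1)   # +1 for the CTC blank label

    def forward(self, x):                  # x: (batch, 1, H, W) grayscale text-line image
        f = self.conv(x)                   # (batch, C, H/4, W/4)
        b, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(b, w, c * h)   # one feature vector per image column
        out, _ = self.rnn(f)
        return self.fc(out).log_softmax(2)                # (batch, W/4, n_classes + 1)

# Training step with CTC loss (targets are label-index sequences):
# logits = model(images).permute(1, 0, 2)   # CTCLoss expects (T, batch, classes)
# loss = nn.CTCLoss(blank=n_classes)(logits, targets, input_lengths, target_lengths)
```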
Olsson, Oskar, and Moa Eriksson. "Automated system tests with image recognition : focused on text detection and recognition." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160249.
Chen, Datong. "Text detection and recognition in images and video sequences /." [S.l.] : [s.n.], 2003. http://library.epfl.ch/theses/?display=detail&nr=2863.
Mešár, Marek. "Svět kolem nás jako hyperlink." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236204.
Fraz, Muhammad. "Video content analysis for intelligent forensics." Thesis, Loughborough University, 2014. https://dspace.lboro.ac.uk/2134/18065.
Wigington, Curtis Michael. "End-to-End Full-Page Handwriting Recognition." BYU ScholarsArchive, 2018. https://scholarsarchive.byu.edu/etd/7099.
Jaderberg, Maxwell. "Deep learning for text spotting." Thesis, University of Oxford, 2015. http://ora.ox.ac.uk/objects/uuid:e893c11e-6b6b-4d11-bb25-846bcef9b13e.
Lu, Hsin-Min. "Surveillance in the Information Age: Text Quantification, Anomaly Detection, and Empirical Evaluation." Diss., The University of Arizona, 2010. http://hdl.handle.net/10150/193893.
Zhu, Winstead Xingran. "Hotspot Detection for Automatic Podcast Trailer Generation." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444887.
Minetto, Rodrigo, 1983. "Reconhecimento de texto e rastreamento de objetos 2D/3D." [s.n.], 2012. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275708.
Full textTese (doutorado) - Universidade Estadual de Campinas, Instituto de Computação
Abstract: In this thesis we address three computer vision problems: (1) the detection and recognition of flat text objects in images of real scenes; (2) the tracking of such text objects in a digital video; and (3) the tracking of an arbitrary three-dimensional rigid object with known markings in a digital video. For each problem we developed innovative algorithms, which are at least as accurate and robust as other state-of-the-art algorithms. Specifically, for text classification we developed (and extensively evaluated) a new HOG-based descriptor specialized for Roman script, which we call T-HOG, and showed its value as a post-filter for an existing text detector (SNOOPERTEXT). We also improved the SNOOPERTEXT algorithm by using the multi-scale technique to handle widely different letter sizes while limiting the sensitivity of the algorithm to various artifacts. For text tracking, we describe four basic ways of combining a text detector and a text tracker, and we developed a specific tracker based on a particle filter that exploits the T-HOG recognizer. For rigid object tracking we developed a new accurate and robust algorithm (AFFTRACK) that combines the KLT feature tracker with an improved camera calibration procedure. We extensively tested our algorithms on several benchmarks well known in the literature. We also created benchmarks (publicly available) for the evaluation of text detection and tracking and rigid object tracking algorithms.
Doctorate
Computer Science
Doctor of Computer Science
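As an illustration of the kind of HOG-based text descriptor discussed in the abstract above, here is a simplified Python sketch: a candidate text line is height-normalised and described by gradient-orientation histograms computed over horizontal slices, which captures the vertical structure typical of Roman script. It assumes scikit-image; the slice count, bin count and normalisation are placeholders, not the actual T-HOG configuration.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize

def thog_like_descriptor(region_gray, height=24, orientations=9, n_slices=3):
    # Normalise the region height while roughly preserving the aspect ratio.
    h, w = region_gray.shape
    region = resize(region_gray, (height, max(1, int(w * height / h))))
    slice_h = height // n_slices
    feats = []
    for i in range(n_slices):
        band = region[i * slice_h:(i + 1) * slice_h, :]
        # One orientation histogram per horizontal band (top / middle / bottom of the line).
        feats.append(hog(band, orientations=orientations,
                         pixels_per_cell=band.shape,     # a single cell covering the band
                         cells_per_block=(1, 1), feature_vector=True))
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-8)
```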
Day, Adam C. "Designing a face detection CAPTCHA." Morgantown, W. Va. : [West Virginia University Libraries], 2010. http://hdl.handle.net/10450/11036.
Title from document title page. Document formatted into pages; contains viii, 80 p. : ill. Includes abstract. Includes bibliographical references (p. 78-80).
Moysset, Bastien. "Détection, localisation et typage de texte dans des images de documents hétérogènes par Réseaux de Neurones Profonds." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSEI044/document.
Being able to automatically read the text written in documents, both printed and handwritten, makes it possible to access the information it conveys. In order to achieve full-page text transcription, the detection and localization of the text lines is a crucial step. Traditional methods tend to use image-processing-based approaches, but they hardly generalize to very heterogeneous datasets. In this thesis, we propose to use a deep neural network-based approach. We first propose a one-dimensional segmentation of text paragraphs into lines that uses a technique inspired by text recognition models: the Connectionist Temporal Classification (CTC) method is used to implicitly align the sequences. Then, we propose a neural network that directly predicts the coordinates of the boxes bounding the text lines. Adding a confidence prediction to these hypothesis boxes makes it possible to locate a varying number of objects. We propose to predict the objects locally in order to share the network parameters between locations and to increase the number of different objects that each single box predictor sees during training; this compensates for the rather small size of the available datasets. In order to recover the contextual information that carries knowledge of the document layout, we add multi-dimensional LSTM recurrent layers between the convolutional layers of our networks. We propose three full-page text recognition strategies that address the need for highly precise text-line position predictions. We show on the heterogeneous Maurdor dataset how our methods perform on documents that can be printed or handwritten, in French, English or Arabic, and we compare favourably with other state-of-the-art methods. Visualizing the concepts learned by our neurons highlights the ability of the recurrent layers to convey contextual information.
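A minimal sketch of the idea of predicting text-line boxes locally, each grid position emitting box coordinates plus a confidence score so that thresholding the confidences keeps a variable number of lines. It is written in PyTorch with purely illustrative channel sizes and threshold; the networks described in the thesis additionally use multi-dimensional LSTM layers, which are omitted here.

```python
import torch
import torch.nn as nn

class LocalBoxPredictor(nn.Module):
    """At every coarse grid position, predict a candidate text-line box and a confidence."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(64, 5, kernel_size=1)   # (dx, dy, w, h, confidence) per position

    def forward(self, page):                  # page: (batch, 1, H, W) grayscale document image
        f = self.features(page)               # (batch, 64, H/4, W/4)
        out = self.head(f)                    # (batch, 5, H/4, W/4)
        return out[:, :4], torch.sigmoid(out[:, 4])

# Keeping a variable number of hypotheses by thresholding the confidences:
# boxes, conf = LocalBoxPredictor()(torch.randn(1, 1, 256, 256))
# kept = boxes[0][:, conf[0] > 0.5]           # shape (4, number_of_boxes_kept)
```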
Karagol, Yusuf. "Event Ordering In Turkish Texts." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12612623/index.pdf.
Westberg, Michael. "Time of Flight Based Teat Detection." Thesis, Linköping University, Department of Electrical Engineering, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-19292.
Time of flight (ToF) is an imaging technique that uses depth information to capture 3D information in a scene. Recent developments in the technology have made ToF cameras more widely available and practical to work with. The cameras now enable real-time 3D imaging and positioning in a compact unit, making the technology suitable for a variety of object recognition tasks.
An object recognition system for locating teats is at the center of the DeLaval VMS, a fully automated system for milking cows. By implementing ToF technology as part of the visual detection procedure, it would be possible to locate and track the positions of all four teats in real time and potentially provide an improvement over the current system.
The developed algorithm for teat detection is able to locate teat-shaped objects in scenes and extract information about their position, width and orientation. These parameters are determined with millimetre accuracy. The algorithm also shows promising results when tested on real cows: although it detects many false positives, it correctly detected 171 out of 232 visible teats in a test set of real cow images. This result is a satisfying proof of concept and shows the potential of ToF technology in the field of automated milking.
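A rough sketch of the kind of depth-image processing such a system could use: threshold the time-of-flight depth map to the expected working range, label connected blobs, and keep the elongated ones while estimating position, width and orientation. It assumes OpenCV; every threshold below is an illustrative guess, not a value from the thesis.

```python
import cv2
import numpy as np

def find_teat_candidates(depth_mm, near=400, far=900):
    # Keep only surfaces within the expected working distance of the camera.
    mask = ((depth_mm > near) & (depth_mm < far)).astype(np.uint8) * 255
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    candidates = []
    for i in range(1, n):                        # label 0 is the background
        x, y, w, h, area = stats[i]
        if area < 200 or w == 0:
            continue
        if 1.5 < h / float(w) < 6.0:             # teats are roughly vertical and elongated
            # Orientation and width from the minimum-area rectangle of the blob.
            pts = np.column_stack(np.where(labels == i))[:, ::-1].astype(np.float32)
            (cx, cy), (rw, rh), angle = cv2.minAreaRect(pts)
            candidates.append({"centre": (cx, cy), "width": min(rw, rh), "angle": angle})
    return candidates
```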
Raymondi, Luis Guillermo Antezana, Fabricio Eduardo Aguirre Guzman, Jimmy Armas-Aguirre, and Paola Agonzalez. "Technological solution for the identification and reduction of stress level using wearables." IEEE Computer Society, 2020. http://hdl.handle.net/10757/656578.
In this article, a technological solution is proposed to identify and reduce a person's level of mental stress through a wearable device. The proposal identifies a physiological variable, heart rate, through the integration of a wearable and a mobile application using text recognition with the back camera of a smartphone. As part of the process, the solution shows a list of guidelines depending on the stress level obtained at a given time. Once completed, the measurement can be taken again in order to confirm the evolution of the stress level. This proposal allows patients to keep their stress level under control in an effective and accessible way, in real time. The proposal consists of four phases: 1. Collection of parameters through the wearable; 2. Data reception by the mobile application; 3. Data storage in a cloud environment; and 4. Data collection and processing. This last phase is divided into four sub-phases: 4.1. Stress-level analysis; 4.2. Recommendations to decrease the level obtained; 4.3. Comparison between measurements; and 4.4. Measurement history per day. The proposal was validated in a workplace with people aged 20 to 35 in Lima, Peru. Preliminary results showed that 80% of patients managed to reduce their stress level with the proposed solution.
Peer reviewed.
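To illustrate sub-phase 4.1 (stress-level analysis), here is a toy Python sketch that maps a heart-rate reading to a coarse stress level and a recommendation. The cut-off values, level names and messages are assumptions made for the example, not the thresholds validated in the study.

```python
from dataclasses import dataclass

@dataclass
class StressReading:
    heart_rate_bpm: int
    level: str
    recommendation: str

def analyse_stress(heart_rate_bpm: int) -> StressReading:
    # Illustrative cut-offs; a real deployment would calibrate these per patient.
    if heart_rate_bpm < 80:
        level, tip = "low", "No action needed; keep monitoring."
    elif heart_rate_bpm < 100:
        level, tip = "moderate", "Take a short break and do breathing exercises."
    else:
        level, tip = "high", "Step away from the task and repeat the measurement in 10 minutes."
    return StressReading(heart_rate_bpm, level, tip)

# Usage: analyse_stress(95) -> StressReading(heart_rate_bpm=95, level='moderate', ...)
```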
Packer, Thomas L. "Scalable Detection and Extraction of Data in Lists in OCRed Text for Ontology Population Using Semi-Supervised and Unsupervised Active Wrapper Induction." BYU ScholarsArchive, 2014. https://scholarsarchive.byu.edu/etd/4258.
Full textOscanoa1, Julio, Marcelo Mena, and Guillermo Kemper. "A Detection Method of Ectocervical Cell Nuclei for Pap test Images, Based on Adaptive Thresholds and Local Derivatives." Science and Engineering Research Support Society, 2015. http://hdl.handle.net/10757/624843.
Full textRevisón por pares
Karvir, Hrishikesh. "Design and Validation of a Sensor Integration and Feature Fusion Test-Bed for Image-Based Pattern Recognition Applications." Wright State University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=wright1291753291.
Akra, Mohamad A. (Mohamad Ahmad). "Automated text recognition." Thesis, Massachusetts Institute of Technology, 1993. http://hdl.handle.net/1721.1/11109.
Includes bibliographical references (leaves 92-96).
by Mohamad A. Akra.
Ph.D.
Wachenfeld, Steffen. "Recognition of screen-rendered text /." Münster, 2009. http://opac.nebis.ch/cgi-bin/showAbstract.pl?sys=000252284.
Ben-Haim, Nadav. "Task specific image text recognition." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2008. http://wwwlib.umi.com/cr/ucsd/fullcit?p1450595.
Title from first page of PDF file (viewed June 16, 2008). Available via ProQuest Digital Dissertations. Includes bibliographical references (p. 37-39).
Goraine, Habib. "Machine recognition of Arabic text." Thesis, University of Reading, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.278135.
Savkov, Aleksandar Dimitrov. "Deciphering clinical text : concept recognition in primary care text notes." Thesis, University of Sussex, 2017. http://sro.sussex.ac.uk/id/eprint/68232/.
Full textGiménez, Pastor Adrián. "Bernoulli HMMs for Handwritten Text Recognition." Doctoral thesis, Universitat Politècnica de València, 2014. http://hdl.handle.net/10251/37978.
Full textGiménez Pastor, A. (2014). Bernoulli HMMs for Handwritten Text Recognition [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37978
Alkhoury, Ihab. "Arabic Text Recognition and Machine Translation." Doctoral thesis, Universitat Politècnica de València, 2015. http://hdl.handle.net/10251/53029.
Handwritten text recognition (HTR) in Arabic and machine translation (MT) from Arabic into English have usually been treated as two independent areas of study. Indeed, the idea of creating a system that combines the two areas, directly generating English text from images containing Arabic text, remains a difficult task. This process can be interpreted as the translation of Arabic text images. In this thesis, we propose a system that recognizes images of Arabic handwritten text and translates the recognized text into English. The system is built from the combination of an HTR system and an MT system. Regarding the HTR system, our work focuses on the use of Bernoulli Hidden Markov Models (BHMMs). BHMMs have previously been tested on Latin-alphabet tasks with good results; indeed, empirical results have been published on well-known corpora such as IAM and RIMES. In this thesis, these results are extended to Arabic handwritten text, in particular to the IfN/ENIT and NIST OpenHaRT databases. In real applications, the transcription of Arabic text is not limited to handwritten text but also includes printed text, which can be viewed as a simplified form of handwriting; we therefore also propose the use of BHMMs for this type of text, and compare these models with state-of-the-art technology based on neural networks. A key idea that has proven very effective in the application of BHMMs is the use of a sliding window of appropriate width during feature extraction. This idea has yielded very competitive results in the recognition of both handwritten and printed Arabic text. In fact, a system based on this type of feature extraction ranked first in the ICDAR 2011 Arabic recognition competition on the Arabic Printed Text Image (APTI) database. The idea was further refined by applying repositioning techniques to the extracted windows, leading to additional improvements in Arabic text recognition. For handwritten text, this refinement improved on the system that ranked first in the ICFHR 2010 Arabic handwriting recognition competition on IfN/ENIT. For printed text, it led to a better system that ranked second in the ICDAR 2013 Competition on Multi-font and Multi-size Digitally Represented Arabic Text on APTI. The technique was also evaluated with neural network-based technology, leading to state-of-the-art results. Regarding machine translation, the system is based on the combination of three types of state-of-the-art statistical models: standard phrase-based models, hierarchical phrase-based models and N-gram phrase-based models, combined using the Recognizer Output Voting Error Reduction (ROVER) method. Finally, three methods are proposed to combine the HTR and MT systems in order to develop a system that translates Arabic text images into English. The system was evaluated on the NIST OpenHaRT database, where competitive results were obtained.
Alkhoury, I. (2015). Arabic Text Recognition and Machine Translation [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/53029
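A minimal sketch of the sliding-window feature extraction that the abstract identifies as a key ingredient for the Bernoulli HMMs: a binarised text-line image is scanned column by column with a fixed-width window, and each window becomes a binary observation vector. The window width and binarisation threshold are illustrative assumptions, and the repositioning refinement is omitted.

```python
import numpy as np

def sliding_window_features(line_gray, window_width=9, threshold=128):
    binary = (line_gray < threshold).astype(np.uint8)          # ink = 1, background = 0
    h, w = binary.shape
    half = window_width // 2
    padded = np.pad(binary, ((0, 0), (half, half)))
    features = []
    for col in range(w):
        window = padded[:, col:col + window_width]              # h x window_width patch
        features.append(window.flatten())                       # binary vector for this frame
    return np.stack(features)                                   # shape: (w, h * window_width)

# Each row can then be modelled as an observation of a Bernoulli mixture inside an HMM state.
```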
Guthrie, David. "Unsupervised detection of anomalous text." Thesis, University of Sheffield, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.500287.
Orizu, Udochukwu. "Implicit emotion detection in text." Thesis, Aston University, 2018. http://publications.aston.ac.uk/37693/.
Gupta, Smita. "Modelling Deception Detection in Text." Thesis, Kingston, Ont. : [s.n.], 2007. http://hdl.handle.net/1974/922.
Lu, Su. "DCT coefficient based text detection." Access to citation, abstract and download form provided by ProQuest Information and Learning Company; downloadable PDF file, 57 p, 2008. http://proquest.umi.com/pqdweb?did=1605147371&sid=4&Fmt=2&clientId=8331&RQT=309&VName=PQD.
O'Shea, Kieran. "Roadsign detection & recognition /." Leeds : University of Leeds, School of Computer Studies, 2008. http://www.comp.leeds.ac.uk/fyproj/reports/0708/OShea.pdf.
Young-Lai, Matthew. "Text structure recognition using a region algebra." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/NQ60576.pdf.
Keenan, Francis Gerard. "Large vocabulary syntactic analysis for text recognition." Thesis, Nottingham Trent University, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.334311.
Abuhaiba, Ibrahim S. I. "Recognition of off-line handwritten cursive text." Thesis, Loughborough University, 1996. https://dspace.lboro.ac.uk/2134/7331.
Rose, Tony Gerard. "Large vocabulary semantic analysis for text recognition." Thesis, Nottingham Trent University, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.333961.
Zhang, Yaxi. "Named Entity Recognition for Social Media Text." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-395978.
Dahlstedt, Olle. "Automatic Handwritten Text Detection and Classification." Thesis, Uppsala universitet, Avdelningen för visuell information och interaktion, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-453809.
Uren, Victoria Susannah. "Combining text categorizers." Thesis, University of Portsmouth, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.343389.
Červenec, Radek. "Rozpoznávání emocí v česky psaných textech." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2011. http://www.nusl.cz/ntk/nusl-218962.
Full textKozlovski, Nikolai. "TEXT-IMAGE RESTORATION AND TEXT ALIGNMENT FOR MULTI-ENGINE OPTICAL CHARACTER RECOGNITION SYSTEMS." Master's thesis, University of Central Florida, 2006. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/3607.
M.S.E.E.
Department of Electrical and Computer Engineering
Engineering and Computer Science
Electrical Engineering
Bashir, Sulaimon A. "Change detection for activity recognition." Thesis, Robert Gordon University, 2017. http://hdl.handle.net/10059/3104.
Full textSabir, Ahmed. "Enhancing scene text recognition with visual context information." Doctoral thesis, Universitat Politècnica de Catalunya, 2020. http://hdl.handle.net/10803/670286.
This thesis addresses the problem of improving text recognition systems, which detect and recognize text in unconstrained images (for example, a street sign, an advertisement, a bus destination, etc.). The goal is to improve the performance of existing vision systems by exploiting semantic information derived from the image itself. The main idea is that knowing the content of the image, or the visual context in which a text appears, can help decide which words are correct. For example, the fact that an image shows a coffee shop makes it more likely that a word on a sign reads Dunkin rather than unkind. We address this problem by drawing on advances in natural language processing and machine learning, in particular learned re-rankers and neural networks, to present post-processing solutions that improve state-of-the-art text recognition systems without the need for costly retraining or fine-tuning procedures requiring large amounts of data. Determining the degree of semantic relatedness between candidate words and their image context is a task related to assessing the semantic similarity between words or text fragments. However, determining the existence of a semantic relation is a more general task than assessing similarity (for example, car, road and traffic light are related but not similar), and existing methods therefore require certain adaptations. To meet the requirements of these broader notions of semantic relatedness, we develop two approaches to learn the semantic relation between the recognized word and its context: word-to-word (with the objects in the image) and word-to-sentence (the image caption). In the word-to-word approach, re-rankers based on word embeddings are used: the re-ranker takes the words proposed by the baseline system and reorders them according to the visual context provided by the object classifier. For the second case, an end-to-end neural approach is designed to exploit the image description (caption) at both the sentence and word level and to re-rank candidate words based on both the visual context and their co-occurrences with the caption. As an additional contribution, to meet the requirements of data-driven approaches such as neural networks, we present a visual context dataset for this task, in which the publicly available COCO-text dataset [Veit et al. 2016] has been extended with information about the scene (including the objects and places appearing in the image) to allow researchers to include text-scene semantic relations in their text recognition systems and to offer a common evaluation baseline for such approaches.
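As an illustration of the word-to-word re-ranking idea described above, here is a small Python sketch that re-scores the candidate words of a scene text recognition baseline by their embedding similarity to the object labels detected in the same image. The interpolation weight, function names and embedding source are assumptions; the thesis also explores end-to-end neural re-rankers that are not shown here.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def rerank(candidates, context_objects, embeddings, alpha=0.7):
    """candidates: list of (word, spotter_score); context_objects: e.g. ['coffee', 'shop'];
    embeddings: dict mapping words to numpy vectors."""
    rescored = []
    for word, score in candidates:
        if word in embeddings and context_objects:
            sims = [cosine(embeddings[word], embeddings[obj])
                    for obj in context_objects if obj in embeddings]
            relatedness = max(sims) if sims else 0.0
        else:
            relatedness = 0.0
        # Blend the recognizer's score with the visual-context relatedness.
        rescored.append((word, alpha * score + (1 - alpha) * relatedness))
    return sorted(rescored, key=lambda t: t[1], reverse=True)

# Usage: rerank([("dunkin", 0.52), ("unkind", 0.55)], ["coffee", "shop"], embeddings)
```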
Saracoglu, Ahmet. "Localization And Recognition Of Text In Digital Media." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/2/12609028/index.pdf.
Full textUzuner, Halil. "Robust text-independent speaker recognition over telecommunications systems." Thesis, University of Surrey, 2006. http://epubs.surrey.ac.uk/843391/.
Full textWildermoth, Brett Richard, and n/a. "Text-Independent Speaker Recognition Using Source Based Features." Griffith University. School of Microelectronic Engineering, 2001. http://www4.gu.edu.au:8080/adt-root/public/adt-QGU20040831.115646.
Full textGreenhalgh, Jack. "Driver assistance using automated symbol and text recognition." Thesis, University of Bristol, 2015. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.685967.
Full textBertolami, Roman. "Ensemble methods for offline handwritten text line recognition /." [S.l.] : [s.n.], 2008. http://www.zb.unibe.ch/download/eldiss/08bertolami_r.pdf.
Full textCalarasanu, Stefania Ana. "Improvement of a text detection chain and the proposition of a new evaluation protocol for text detection algorithms." Thesis, Paris 6, 2015. http://www.theses.fr/2015PA066524/document.
The growing number of text detection approaches proposed in the literature requires a rigorous performance evaluation and ranking. An evaluation protocol relies on three elements: a reliable text reference, a matching strategy and finally a set of metrics. The few existing evaluation protocols often lack accuracy, either due to inconsistent matching or due to unrepresentative metrics. In this thesis we propose a new evaluation protocol that tackles most of the drawbacks faced by currently used evaluation methods. This work is focused on three main contributions: firstly, we introduce a complex text reference representation that does not constrain text detectors to adopt a specific detection granularity level or annotation representation; secondly, we propose a set of matching rules capable of evaluating any type of scenario that can occur between a text reference and a detection; and finally we show how we can analyze a set of detection results, not only through a set of metrics, but also through an intuitive visual representation. A frequent challenge for many text understanding systems is to tackle the variety of text characteristics in born-digital and natural scene images, for which current OCRs are not well adapted. For example, texts in perspective are frequently present in real-world images because the camera capture angle is not normal to the plane containing the text regions. Despite the ability of some detectors to accurately localize such text objects, the recognition stage fails most of the time. In this thesis we also propose a rectification procedure capable of correcting highly distorted texts, evaluated on a very challenging dataset.
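To make the role of a matching strategy concrete, here is a small Python sketch of a basic evaluation step: greedy one-to-one matching of detections to reference boxes by intersection-over-union, followed by precision and recall. The 0.5 threshold is an illustrative choice, and the protocol proposed in the thesis goes well beyond this by also handling one-to-many and many-to-one scenarios.

```python
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def evaluate(detections, references, thr=0.5):
    """Boxes are (x1, y1, x2, y2) tuples; each reference is matched at most once."""
    matched_refs = set()
    tp = 0
    for det in detections:
        best, best_iou = None, 0.0
        for i, ref in enumerate(references):
            score = iou(det, ref)
            if i not in matched_refs and score > best_iou:
                best, best_iou = i, score
        if best is not None and best_iou >= thr:
            matched_refs.add(best)
            tp += 1
    precision = tp / len(detections) if detections else 0.0
    recall = tp / len(references) if references else 0.0
    return precision, recall
```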
Namane, Abderrahmane. "Degraded printed text and handwritten recognition methods : Application to automatic bank check recognition." Université Louis Pasteur (Strasbourg) (1971-2008), 2007. http://www.theses.fr/2007STR13048.
Character recognition is a significant stage in all document recognition systems. It can be seen as a problem of assigning and deciding the class of a given character, and it is an active research subject in many disciplines. This thesis mainly concerns the recognition of degraded printed and handwritten characters, and brings new solutions to the field of document image analysis (DIA). The first contribution is the development of two recognition methods for handwritten numerals: one based on the Fourier-Mellin transform (FMT) and the self-organizing map (SOM), and one based on the parallel combination of HMM-based classifiers that uses a new projection technique for feature extraction. The second contribution is a new holistic recognition method for handwritten words, applied to French legal amounts. The third presents two neural network-based recognition methods for degraded printed characters, applied to Algerian postal checks: the first relies on a sequential combination of classifiers, while the second uses a serial combination based mainly on a relative distance that measures the quality of the degraded character. During the development of this thesis, preprocessing methods were also developed, in particular handwritten numeral slant correction and the detection of the central zone and slope of handwritten words.
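A toy sketch of the serial combination strategy mentioned in the abstract: the first classifier's decision is accepted when the relative distance between its two best scores suggests a clean, unambiguous character, and otherwise the decision is deferred to the second classifier. The rejection threshold and function names are assumptions made for illustration only.

```python
import numpy as np

def serial_combine(scores_clf1, scores_clf2, reject_threshold=0.2):
    """scores_clf1, scores_clf2: 1-D arrays of per-class scores for one character."""
    order = np.argsort(scores_clf1)[::-1]
    best, second = scores_clf1[order[0]], scores_clf1[order[1]]
    # Relative distance between the two best scores acts as a quality measure.
    relative_distance = (best - second) / (best + 1e-8)
    if relative_distance >= reject_threshold:
        return int(order[0])                    # confident: keep the first classifier's decision
    return int(np.argmax(scores_clf2))          # degraded/ambiguous: defer to the second classifier

# Usage: serial_combine(np.array([0.42, 0.40, 0.10]), np.array([0.1, 0.7, 0.2]))
```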