Journal articles on the topic 'Tesseract ocr engine'

Author: Grafiati

Published: 4 June 2025

Last updated: 15 July 2025

1

Chesley, Emily, Jillian Marcantonio, and Abigail Pearson. "Towards Syriac Digital Corpora: Evaluation of Tesseract 4.0 for Syriac OCR." Hugoye: Journal of Syriac Studies 22, no. 1 (2019): 109–92. http://dx.doi.org/10.31826/hug-2019-220105.

Abstract:

This paper summarizes the results of an extensive test of Tesseract 4.0, an open-source Optical Character Recognition (OCR) engine with Syriac capabilities, and ascertains the current state of Syriac OCR technology. Three popular print types (S14, W64, and E22) representing the Syriac type styles Estrangela, Serto, and East Syriac were OCRed using Tesseract’s two different OCR modes (Syriac Language and Syriac Script). Handwritten manuscripts were also preliminarily tested for OCR. The tests confirm that Tesseract 4.0 may be relied upon for printed Estrangela texts but should be used

2

Mubeen, Dr Suraya, Jally Brahmani, Datha Pavan Kalyan, Ayesha Jagirdar, and A. Praveen Kumar. "Optical Character Recognition Using Tesseract." International Journal for Research in Applied Science and Engineering Technology 10, no. 11 (2022): 672–75. http://dx.doi.org/10.22214/ijraset.2022.47414.

Abstract:

Optical Character Recognition (OCR) is a process or technology in which text within a digital image is recognized. With rapid pace of technology, people want quicker, handy and reliable tools, which can fulfil their daily needs. With this motto we had gone forward and analyzed the existing tools and made up this Android App, which provides seamless experience (No ads and easy-to-use), and great accuracy. The main objective of this project is to allow automatic extraction of the information that a user wants from the paper document and using it wherever it is needed. In this project, O

3

Benaissa, Ali, Abdelkhalak Bahri, Ahmad El Allaoui, and My Abdelouahab Salahddine. "Build a Trained Data of Tesseract OCR engine for Tifinagh Script Recognition." Data and Metadata 2 (December 9, 2023): 185. http://dx.doi.org/10.56294/dm2023185.

Abstract:

This article introduces a methodology for constructing a trained dataset to facilitate Tifinagh script recognition using the Tesseract OCR engine. The Tifinagh script, widely used in North Africa, poses a challenge due to the lack of built-in recognition capabilities in Tesseract. To overcome this limitation, our approach focuses on image generation, box generation, manual editing, charset extraction, and dataset compilation. By leveraging Python scripting, specialized software tools, and Tesseract's training utilities, we systematically create a comprehensive dataset for Tifinagh script recog

4

Tiwari, Anurag. "Data Extraction from Images through OCR." International Journal for Research in Applied Science and Engineering Technology 9, no. VIII (2021): 435–37. http://dx.doi.org/10.22214/ijraset.2021.37377.

Abstract:

The paperwork used in maintaining various types of documents in our daily lives is tiresome and inefficient, it consumes a lot of time and it is difficult to maintain and remember the concerned documents. This project provides a solution to these problems by introducing Optical Character Recognition Technology (OCR) which runs on Tesseract OCR Engine. The project specifically aims at increasing data accessibility, usability and improving customer experience by decreasing the time spent to process, save, and maintain user data. Another objective of this project is to nullify the human error, wh

5

Patience, Okechukwu Ogochukwu, Eziechina Malachy Amaechi, Onyemachi George, and Onuwa Nnachi Isaac. "Enhanced Text Recognition in Images Using Tesseract OCR within the Laravel Framework." Asian Journal of Research in Computer Science 17, no. 9 (2024): 58–69. http://dx.doi.org/10.9734/ajrcos/2024/v17i9499.

Abstract:

This research explores the integration of Tesseract OCR (Optical Character Recognition) within the Laravel framework to enhance text recognition capabilities in images. Tesseract OCR, an open-source OCR engine, is renowned for its accuracy and efficiency in converting various image formats into editable and searchable text. However, leveraging its full potential within a robust web application framework presents unique challenges and opportunities. This implementation focuses on creating a seamless, user-friendly application that processes images uploaded by users and accurately extracts text

6

Joshi, Kartik. "Study of Tesseract OCR." GLS KALP: Journal of Multidisciplinary Studies 1, no. 2 (2024): 41–50. http://dx.doi.org/10.69974/glskalp.01.02.54.

Abstract:

In the current Internet and Digitization era, a huge amount of information is available in different forms like books, newspapers, etc. To preserve the contents of such documents, these documents are converted to a digital format by scanning them as images. Detection of text from the scanned images and correct identification of characters is a challenging problem in such cases. Tesseract is a recognition engine based upon open source license which uses some novel techniques for optical character recognition. Tesseract has been designed to recognize more than 100 languages. Few of these languag

7

Clausner, Christian, Apostolos Antonacopoulos, and Stefan Pletschacher. "Efficient and effective OCR engine training." International Journal on Document Analysis and Recognition (IJDAR) 23, no. 1 (2019): 73–88. http://dx.doi.org/10.1007/s10032-019-00347-8.

Abstract:

We present an efficient and effective approach to train OCR engines using the Aletheia document analysis system. All components required for training are seamlessly integrated into Aletheia: training data preparation, the OCR engine’s training processes themselves, text recognition, and quantitative evaluation of the trained engine. Such a comprehensive training and evaluation system, guided through a GUI, allows for iterative incremental training to achieve best results. The widely used Tesseract OCR engine is used as a case study to demonstrate the efficiency and effectiveness of th

8

Alan Jiju, Shaun Tuscano, and Chetana Badgujar. "OCR Text Extraction." International Journal of Engineering and Management Research 11, no. 2 (2021): 83–86. http://dx.doi.org/10.31033/ijemr.11.2.11.

Abstract:

This research tries to find out a methodology through which any data from the daily-use printed bills and invoices can be extracted. The data from these bills or invoices can be used extensively later on – such as machine learning or statistical analysis. This research focuses on extraction of final bill-amount, itinerary, date and similar data from bills and invoices as they encapsulate an ample amount of information about the users purchases, likes or dislikes etc. Optical Character Recognition (OCR) technology is a system that provides a full alphanumeric recognition of printed or handwritt

9

Sporici, Dan, Elena Cușnir, and Costin-Anton Boiangiu. "Improving the Accuracy of Tesseract 4.0 OCR Engine Using Convolution-Based Preprocessing." Symmetry 12, no. 5 (2020): 715. http://dx.doi.org/10.3390/sym12050715.

Abstract:

Optical Character Recognition (OCR) is the process of identifying and converting texts rendered in images using pixels to a more computer-friendly representation. The presented work aims to prove that the accuracy of the Tesseract 4.0 OCR engine can be further enhanced by employing convolution-based preprocessing using specific kernels. As Tesseract 4.0 has proven great performance when evaluated against a favorable input, its capability of properly detecting and identifying characters in more realistic, unfriendly images is questioned. The article proposes an adaptive image preprocessing step
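As an illustration of what convolution-based preprocessing can look like, here is a minimal pure-Python sketch. The abstract is cut off before it names the kernels the authors use, so the 3x3 sharpening kernel `SHARPEN`, the function name `convolve2d`, and the edge-replication border handling are all assumptions chosen for illustration, not the method of the paper.

```python
def convolve2d(image, kernel):
    """Apply a 2D kernel to a grayscale image given as a list of rows.

    Out-of-range coordinates are clamped to the nearest edge pixel
    (an assumed border policy). The kernel is applied without flipping,
    which is equivalent to convolution for symmetric kernels.
    """
    height, width = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    off_y, off_x = k_h // 2, k_w // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            acc = 0.0
            for ky in range(k_h):
                for kx in range(k_w):
                    # Replicate edge pixels for out-of-range coordinates.
                    sy = min(max(y + ky - off_y, 0), height - 1)
                    sx = min(max(x + kx - off_x, 0), width - 1)
                    acc += image[sy][sx] * kernel[ky][kx]
            # Clip the result to the 8-bit intensity range.
            row.append(int(min(max(round(acc), 0), 255)))
        result.append(row)
    return result


# Hypothetical example kernel: a common 3x3 sharpening filter.
SHARPEN = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]

if __name__ == "__main__":
    page = [[0, 0, 0], [0, 100, 0], [0, 0, 0]]
    print(convolve2d(page, SHARPEN))
```

In practice the filtered image would then be passed to the OCR engine; that step is omitted here because it depends on the Tesseract binding in use.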

10

Ibrahim, Ahmed. "Dhivehi OCR: Character Recognition of Thaana Script using Machine-Generated Text and Tesseract OCR Engine." International Journal of Social Research and Innovation 1, no. 1 (2018): 83–94. http://dx.doi.org/10.55712/ijsri.v1i1.23.

Abstract:

This paper provides technical aspects and the context of recognising Dhivehi characters using Tesseract OCR Engine, which is a freely available OCR engine with remarkable accuracy and support for multiple languages. The experiments that were conducted showed promising results with 69.46% accuracy and, more importantly, highlighted limitations that are unique to Dhivehi. These issues have been discussed in detail and possible directions for future research are presented.

11

Joshi, Kalpesh. "Handwritten Text Recognition from Image." International Journal for Research in Applied Science and Engineering Technology 11, no. 6 (2023): 1528–30. http://dx.doi.org/10.22214/ijraset.2023.53364.

Abstract:

A computer vision program called Handwritten Text Recognition (HTR) attempts to recognize and translate handwritten text from scanned or photographed images. In this project, we suggest implementing an HTR system using Tesseract and OpenCV. English, Chinese, and Arabic are all supported by the popular open-source optical character recognition (OCR) engine known as Tesseract. It is employed to find and identify printed text within photographs. On the other hand, OpenCV is a well-liked computer vision library that offers several tools for processing and analyzing images. The pre-proces

12

Tsimpiris, Alkiviadis, Dimitrios Varsamis, and George Pavlidis. "Tesseract OCR Evaluation on Greek Food Menus Datasets." International Journal of Computing and Optimization 9, no. 1 (2022): 13–32. https://doi.org/10.12988/ijco.2022.9829.

Abstract:

This article presents a procedure for optical character recognition (OCR) improvement, after image preprocessing of Greek food menus images. To achieve this goal, many well-known and other more sophisticated techniques for image preprocessing have been used. The performance of the Tesseract OCR engine has been studied for selected binarization, thresholding, noise and morphological filtering methods that applied to menu images before OCR feeding. The output text is compared to the reference text of each image (ground text) and the values of evaluation indices indicate the appropriate prepr
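Comparing OCR output against a reference text is commonly done with an edit-distance metric. The abstract is truncated before it names its evaluation indices, so the sketch below assumes character error rate (CER) based on Levenshtein distance; the function names and the sample strings are hypothetical.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(recognized, reference):
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(recognized, reference) / len(reference)


if __name__ == "__main__":
    # Hypothetical OCR output versus ground truth for one menu line.
    print(character_error_rate("Mousaka 8.5O", "Moussaka 8.50"))
```

A lower CER for one preprocessing variant than another, measured on the same images, would suggest that variant suits the OCR engine better.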

13

Indrawan, Gede, Ahmad Asroni, Luh Joni Erawati Dewi, I. Gede Aris Gunadi, and I. Ketut Paramarta. "Balinese Script Recognition Using Tesseract Mobile Framework." Lontar Komputer : Jurnal Ilmiah Teknologi Informasi 13, no. 3 (2022): 160. http://dx.doi.org/10.24843/lkjiti.2022.v13.i03.p03.

Abstract:

One of the main factors causing the decline in the use of Balinese Script is that Balinese people are less interested in reading Balinese Script because of their reluctance to learn Balinese Script, which is relatively complicated in the recognition process. The development of computer technology has now been used to help by performing character recognition or known as Optical Character Recognition (OCR). Developing the OCR application for Balinese Script is an effort to help preserve, from the technology side, as a means of education related to Balinese Script. In this study, that development

14

Oudah, Nabeel, Maher Faik Esmaile, and Estabraq Abdulredaa. "Optical Character Recognition Using Active Contour Segmentation." Journal of Engineering 24, no. 1 (2018): 146–58. http://dx.doi.org/10.31026/j.eng.2018.01.10.

Abstract:

Document analysis of images snapped by camera is a growing challenge. These photos are often poor-quality compound images, composed of various objects and text; this makes automatic analysis complicated. OCR is one of the image processing techniques which is used to perform automatic identification of texts. Existing image processing techniques need to manage many parameters in order to clearly recognize the text in such pictures. Segmentation is regarded one of these essential parameters. This paper discusses the accuracy of segmentation process and its effect over the recognition process. Ac

15

Muthusundari, Muthusundari, A. Velpoorani, S. Venkata Kusuma, Trisha L, and Om k. Rohini. "Optical character recognition system using artificial intelligence." LatIA 2 (August 13, 2024): 98. http://dx.doi.org/10.62486/latia202498.

Abstract:

A technique termed optical character recognition, or OCR, is used to extract text from images. An OCR system's primary goal is to transform already present paper-based paperwork or picture data into usable papers. Character as well as word detection are the two main phases of an OCR, which is designed using many algorithms. An OCR also maintains a document's structure by focusing on sentence identification, which is a more sophisticated approach. Research has demonstrated that despite the efforts of numerous scholars, no error-free Bengali OCR has been produced. This issue is addr

16

Makhmudov, Fazliddin, Mukhriddin Mukhiddinov, Akmalbek Abdusalomov, Kuldoshbay Avazov, Utkir Khamdamov, and Young Im Cho. "Improvement of the end-to-end scene text recognition method for “text-to-speech” conversion." International Journal of Wavelets, Multiresolution and Information Processing 18, no. 06 (2020): 2050052. http://dx.doi.org/10.1142/s0219691320500526.

Abstract:

Methods for text detection and recognition in images of natural scenes have become an active research topic in computer vision and have obtained encouraging achievements over several benchmarks. In this paper, we introduce a robust yet simple pipeline that produces accurate and fast text detection and recognition for the Uzbek language in natural scene images using a fully convolutional network and the Tesseract OCR engine. First, the text detection step quickly predicts text in random orientations in full-color images with a single fully convolutional neural network, discarding redundant inte

17

Sharmin, Sabrina, Tasauf Mim, and Mohammad Rahman. "Bangla Optical Character Recognition for Mobile Platforms: A Comprehensive Cross-Platform Approach." American Journal of Electrical and Computer Engineering 8, no. 2 (2024): 31–42. http://dx.doi.org/10.11648/j.ajece.20240802.12.

Abstract:

The development of Optical Character Recognition (OCR) systems for Bangla script has been an area of active research since the 1980s. This study presents a comprehensive analysis and development of a cross-platform mobile application for Bangla OCR, leveraging the Tesseract OCR engine. The primary objective is to enhance the recognition accuracy of Bangla characters, achieving rates between 90% and 99%. The application is designed to facilitate the automatic extraction of text from images selected from the device's photo library, promoting the preservation and accessibility of Bangla

18

Ircham Aji Nugroho, Bety Hayat Susanti, Mareta Wahyu Ardyani, and Nadia Paramita R.A. "The Design of a C1 Document Data Extraction Application Using a Tesseract-Optical Character Recognition Engine." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 8, no. 1 (2024): 42–53. http://dx.doi.org/10.29207/resti.v8i1.5151.

Abstract:

The 2019 election process used the Vote Counting Information System, also known as Sistem Informasi Penghitungan Suara (Situng), to provide transparency in the recapitulation process. The data displayed in Situng is from document C1 for 813,336 voting stations in Indonesia. The data collected from the C1 document is entered and uploaded into Situng by the officers of the Municipal General Election Commission (GEC). Since this process is performed by humans, it is not immune to errors. In the recapitulation process of the 2019 election results, there were 269 data entry errors, and the data ent

19

Umesh, Hengaju, and Bal Krishna Bal Dr. "Improving the Recognition Accuracy of Tesseract-OCR Engine on Nepali Text Images via Preprocessing." Advancement in Image Processing and Pattern Recognition 3, no. 3 (2020): 1–13. https://doi.org/10.5281/zenodo.4361896.

Abstract:

Image documents scanned or captured by digital cameras on mobile phones suffer from a number of limitations like geometric distortions, focus loss, uneven lighting conditions, low scanning resolution etc. Because of these limitations, the quality of image documents is often degraded and because of this, the recognition accuracy of OCR engines gets affected. This work focuses on improving the recognition of Tesseract-OCR engine for Nepali image documents via preprocessing. For this purpose, we developed an image preprocessing pipeline consisting of 8 steps and tested with several Nepali te

20

B, Rithik, Raghav G, Harshith M, Rahul Patwadi, and Aravind H S. "Licence Plate Recognition System Using Open-CV and Tesseract OCR Engine." International Journal of Engineering Research in Computer Science and Engineering 9, no. 9 (2022): 8–12. http://dx.doi.org/10.36647/ijercse/09.09.art003.

Abstract:

As the technology has taken a leap to make sure human lives get easier, it has also come with certain consequences. One of them being traffic control and vehicle owner identification has really become a serious issue in the 21st century. Due to the advancements in automobile technology, it is very easy for a person to violate traffic rules and it is practically not possible for humans to stop or have a track record of the vehicles’ number plate travelling at higher speeds. This is a major problem which is being faced by developing countries and our paper will discuss an implementable solution

21

Silfverberg, Miikka, and Jack Rueter. "Can Morphological Analyzers Improve the Quality of Optical Character Recognition?" Septentrio Conference Series, no. 2 (June 17, 2015): 45. http://dx.doi.org/10.7557/5.3467.

Abstract:

Optical Character Recognition (OCR) can substantially improve the usability of digitized documents. Language modeling using word lists is known to improve OCR quality for English. For morphologically rich languages, however, even large word lists do not reach high coverage on unseen text. Morphological analyzers offer a more sophisticated approach, which is useful in many language processing applications. This paper investigates language modeling in the open-source OCR engine Tesseract using morphological analyzers. We present experiments on two Uralic languages Finnish and Erzya. According to

22

Idrees, Saman, and Hossein Hassani. "Exploiting Script Similarities to Compensate for the Large Amount of Data in Training Tesseract LSTM: Towards Kurdish OCR." Applied Sciences 11, no. 20 (2021): 9752. http://dx.doi.org/10.3390/app11209752.

Abstract:

Applications based on Long-Short-Term Memory (LSTM) require large amounts of data for their training. Tesseract LSTM is a popular Optical Character Recognition (OCR) engine that has been trained and used in various languages. However, its training becomes obstructed when the target language is not resourceful. This research suggests a remedy for the problem of scant data in training Tesseract LSTM for a new language by exploiting a training dataset for a language with a similar script. The target of the experiment is Kurdish. It is a multi-dialect language and is considered less-resourced. We

23

Mishra, Nitin, C. Patvardhan, C. Vasantha Lakshmi, and Sarika Singh. "Shirorekha Chopping Integrated Tesseract OCR Engine for Enhanced Hindi Language Recognition." International Journal of Computer Applications 39, no. 6 (2012): 19–23. http://dx.doi.org/10.5120/4824-7076.

24

Kamble, Prathamesh, Rohit Pisal, Hrutik Khade, Vishal Sole, and Prof S. R. Bhujbal. "Automated Vehicle Number Plate Detection and Recognition." International Journal for Research in Applied Science and Engineering Technology 11, no. 1 (2023): 1307–11. http://dx.doi.org/10.22214/ijraset.2023.48785.

Abstract:

In this project, a Digital Image Processing-based prototype is developed. Actions such as Image Acquisition, enhancement that is pre-processing, Segmentation of the license plate and then application of OCR (Optical Character Recognition) is applied to store the number on text form. The plate number is displayed as text on the terminal using the principle of OCR with help of Tesseract engine. It is seen that the security forces and authorities face problems whenever security forces chase a vehicle or they can’t catch a vehicle which broke traffic rules. Authorities find it very hecti

25

Saputra, The Manuel Eric, Ajib Susanto, and Bastiaans Jessica Carmelita. "Implementation of Tesseract OCR and Bounding Box for Text Extraction on Food Nutrition Labels." Building of Informatics, Technology and Science (BITS) 6, no. 3 (2024): 1403–12. https://doi.org/10.47065/bits.v6i3.6107.

Abstract:

This study focuses on implementing Optical Character Recognition (OCR) using the Tesseract engine, integrated with bounding box detection, to extract nutritional information from food nutrition labels. The research addresses the challenge of limited consumer access to and understanding of nutritional data, a factor contributing to health issues such as obesity and related metabolic disorders. Studies indicate that although Indonesian consumers generally have a good level of knowledge and positive attitudes toward nutritional labels, the actual behavior of reading and understanding these labels

26

S., Andrew, Jepthah Yankey, and Ernest O. "An Automatic Number Plate Recognition System using OpenCV and Tesseract OCR Engine." International Journal of Computer Applications 180, no. 43 (2018): 1–5. http://dx.doi.org/10.5120/ijca2018917150.

27

Wydyanto. "Converting Image Text to Speech Using Raspberry Pi." Journal of Information Systems Engineering and Management 10, no. 34s (2025): 651–58. https://doi.org/10.52783/jisem.v10i34s.5860.

Abstract:

This project aims to develop a system that can convert image text into speech using Raspberry Pi. This system will use optical character recognition (OCR) technology to extract text from images, which will then be converted into speech using text-to-speech (TTS) software. This will provide a valuable tool for individuals with visual impairments or those who have difficulty reading text in images. This research explores a Raspberry Pi-based device that can translate English into 53 dialects, using a camera module, OCR motor, Google Speech API, and Microsoft Translator. This feature is a

28

AIRCC. "Information Extraction from Product Labels: A Machine Vision Approach." International Journal of Artificial Intelligence & Applications (IJAIA) 15, no. 2 (2024): 57–76. https://doi.org/10.5121/ijaia.2024.15204.

Abstract:

This research tackles the challenge of manual data extraction from product labels by employing a blend of computer vision and Natural Language Processing (NLP). We introduce an enhanced model that combines Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) in a Convolutional Recurrent Neural Network (CRNN) for reliable text recognition. Our model is further refined by incorporating the Tesseract OCR engine, enhancing its applicability in Optical Character Recognition (OCR) tasks. The methodology is augmented by NLP techniques and extended through the Open Food Facts API (Applica

29

Seitaj, Hansi, and Vinayak Elangovan. "Information Extraction from Product Labels: A Machine Vision Approach." International Journal of Artificial Intelligence & Applications 15, no. 2 (2024): 57–76. http://dx.doi.org/10.5121/ijaia.2024.15204.

Abstract:

This research tackles the challenge of manual data extraction from product labels by employing a blend of computer vision and Natural Language Processing (NLP). We introduce an enhanced model that combines Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) in a Convolutional Recurrent Neural Network (CRNN) for reliable text recognition. Our model is further refined by incorporating the Tesseract OCR engine, enhancing its applicability in Optical Character Recognition (OCR) tasks. The methodology is augmented by NLP techniques and extended through the Open Food Facts API (A

30

Mudiarta, I. M. D. R., I. M. D. S. Atmaja, I. K. Suharsana, et al. "Balinese character recognition on mobile application based on tesseract open source OCR engine." Journal of Physics: Conference Series 1516 (April 2020): 012017. http://dx.doi.org/10.1088/1742-6596/1516/1/012017.

31

Esteves Coelho, Gustavo, Alexandre Pinheiro, Álvaro Silva Ribeiro, Catarina Simões, and Luís Lages Martins. "Automating flowmeter calibration process: Digital measurements from numerical displays using open-source optical character recognition tools." Acta IMEKO 13, no. 3 (2024): 1–6. http://dx.doi.org/10.21014/actaimeko.v13i3.1767.

Abstract:

This paper presents a methodology for obtaining digital machine-readable measurements from images of numerical displays. The proposed method provides means to digitalize and automate a previously manual and labour-intensive laboratory procedure for flowmeter calibration. The proposed method makes it possible to obtain machine-readable readings from remote numerical displays with off-the-shelf hardware and open-source software. By using smartphones for remote image capture and streaming and the Tesseract open-source OCR engine, it is possible to leverage the infrastructure’s digital transition, improv

32

Utami, Anisa Eka, Oky Dwi Nurhayati, and Kurniawan Teguh Martono. "Aplikasi Penerjemah Bahasa Inggris – Indonesia dengan Optical Character Recognition Berbasis Android." Jurnal Teknologi dan Sistem Komputer 4, no. 1 (2016): 167. http://dx.doi.org/10.14710/jtsiskom.4.1.2016.167-177.

Abstract:

Character recognition software for smartphones, particularly Android-based ones, is developed with an emphasis on mobility, portability, storage space, hardware, and range limitations, so that these constraints can be overcome. However, because the performance of an Android smartphone differs from that of a computer, the speed of character recognition is also affected. This problem points to a solution, namely an innovation applied to Android devices using OCR (Optical Character Recognition) technology. The system design uses

33

Munawaroh, Anisatul, and Eko Rudiawan Jamzuri. "Automatic optical inspection for detecting keycaps misplacement using Tesseract optical character recognition." International Journal of Electrical and Computer Engineering (IJECE) 13, no. 5 (2023): 5147. http://dx.doi.org/10.11591/ijece.v13i5.pp5147-5155.

Abstract:

This research study aims to develop automatic optical inspection (AOI) for detecting keycaps misplacement on the keyboard. The AOI hardware has been designed using an industrial camera with an additional mechanical jig and lighting system. Optical character recognition (OCR) using the Tesseract OCR engine is the proposed method to detect keycaps misplacement. In addition, captured images were cropped using a predefined region of interest (ROI) during the setup. Subsequently, the cropped ROIs were processed to acquire binary images. Furthermore, Tesseract processed thes

34

Munawaroh, Anisatul, and Eko Rudiawan Jamzuri. "Automatic optical inspection for detecting keycaps misplacement using Tesseract optical character recognition." International Journal of Electrical and Computer Engineering (IJECE) 13, no. 5 (2023): 5147–55. https://doi.org/10.11591/ijece.v13i5.pp5147-5155.

Abstract:

This research study aims to develop automatic optical inspection (AOI) for detecting keycaps misplacement on the keyboard. The AOI hardware has been designed using an industrial camera with an additional mechanical jig and lighting system. Optical character recognition (OCR) using the Tesseract OCR engine is the proposed method to detect keycaps misplacement. In addition, captured images were cropped using a predefined region of interest (ROI) during the setup. Subsequently, the cropped ROIs were processed to acquire binary images. Furthermore, Tesseract processed these binary images to recogn

35

Buoy, Rina, Nguonly Taing, Sovisal Chenda, and Sokchea Kor. "Khmer printed character recognition using attention-based Seq2Seq network." HO CHI MINH CITY OPEN UNIVERSITY JOURNAL OF SCIENCE - ENGINEERING AND TECHNOLOGY 12, no. 1 (2022): 3–16. http://dx.doi.org/10.46223/hcmcoujs.tech.en.12.1.2217.2022.

Abstract:

This paper presents an end-to-end deep convolutional recurrent neural network solution for Khmer optical character recognition (OCR) task. The proposed solution uses a sequence-to-sequence (Seq2Seq) architecture with attention mechanism. The encoder extracts visual features from an input text-line image via layers of convolutional blocks and a layer of gated recurrent units (GRU). The features are encoded in a single context vector and a sequence of hidden states which are fed to the decoder for decoding one character at a time until a special end-of-sentence (EOS) token is reached. The attent

36

MALLISHWARI, N. "Implementation of the Image Text to Speech Conversion in the Desired Language by Translating with Raspberry Pi." International Scientific Journal of Engineering and Management 04, no. 06 (2025): 1–9. https://doi.org/10.55041/isjem04635.

Abstract:

The main problem in communication is language bias between the communicators. This device can be used by people who do not know English and want it translated into their native language. The novelty of this research work is the speech output, which is available in 53 different languages translated from English. This paper is based on a prototype which helps the user hear the contents of text images in the desired language. It involves extracting text from the image and converting the text to translated speech in the user's desired language. This is done wit

37

Nazdryukhin, A. S., I. N. Khramtsov, and A. N. Tushev. "PROCESSING IMAGES OF SALES RECEIPTS FOR ISOLATING AND RECOGNISING TEXT INFORMATION." Herald of Dagestan State Technical University. Technical Sciences 46, no. 4 (2020): 113–22. http://dx.doi.org/10.21822/2073-6185-2019-46-4-113-122.

Abstract:

Objectives. This article presents an application for the processing of scanned images of sales receipts for subsequent extraction of text information using the Tesseract OCR Engine. Such an application is useful for maintaining a family budget or for accounting in small companies. The main problem of receipt recognition is the low quality of ink and printing paper, which results in creasing and tears, as well as the rapid fading of printed characters. Methods. The study is based on a number of algorithms based on mathematical morphology methods for opening, closing and morphological gradient op

38

Susanty, Meredita, and Herminarto Nugroho. "OPTICAL CHARACTER RECOGNITION IMPLEMENTATION FOR ADMISSION SYSTEM IN UNIVERSITAS PERTAMINA." Simetris: Jurnal Teknik Mesin, Elektro dan Ilmu Komputer 11, no. 1 (2020): 165–70. http://dx.doi.org/10.24176/simet.v11i1.3838.

Abstract:

Starting in 2019, prospective college students are required to take the Computer-Based Writing Exam (UTBK) to register for state universities in Indonesia. Some private universities also adopt this exam as a requirement for admission. One of the private universities that adopts it is Universitas Pertamina. The UTBK consists of several exam group scores printed on a digital certificate in image format (jpg). The university admission team must download the UTBK certificates uploaded by applicants, read and record the score for each exam group, then make a calculation to decide whether the appl

39

Voncilă, Mihai-Lucian, Nicolae Tarbă, Cosmin-Dumitru Oprea, Costin Anton Boiangiu, and Nicolae Goga. "Pixel-Wise Method for Enhanced Tesseract OCR Accuracy Using Colour and Spatial Distances." BRAIN. Broad Research in Artificial Intelligence and Neuroscience 16, no. 2 (2025): 409. https://doi.org/10.70594/brain/16.2/29.

Abstract:

Digital images often contain noise introduced during acquisition, storage, or transmission, which can hinder the performance of Optical Character Recognition systems. Effective noise reduction is essential for improving the accuracy of these systems, as noise can obscure text and reduce recognition rates. The problem of removing noise from images is widely studied in computer vision but remains challenging due to the variety of noise types and the risk of introducing artifacts or blurring. In this work, we pro

40

Singh, Shailendra. "Model for Converting PDF to Audio Format (Listen Your Book)." International Journal for Research in Applied Science and Engineering Technology 9, no. VII (2021): 3203–6. http://dx.doi.org/10.22214/ijraset.2021.36522.

Abstract:

The present paper introduces an innovative and efficient technique that enables users to hear the contents of text images instead of reading through them. In the current world, there is a great increase in the utilization of digital technology, and multiple methods are available for people to capture images. Such images may contain important textual content that the user may need to edit or store digitally. It merges the concept of Optical Character Recognition (OCR) and Text to Speech Synthesizer (TTS). This can be done using Optical Character Recognition with the use of Tesseract OCR E

41

Nieva de la Hidalga, Abraham, David Owen, Irena Spacic, Paul Rosin, and Xianfang Sun. "Use of Semantic Segmentation for Increasing the Throughput of Digitisation Workflows for Natural History Collections." Biodiversity Information Science and Standards 3 (June 18, 2019): e37161. https://doi.org/10.3897/biss.3.37161.

Abstract:

The need to increase global accessibility to specimens while preserving the physical specimens by reducing their handling motivates digitisation. Digitisation of natural history collections has evolved from recording of specimens' catalogue data to including digital images and 3D models of specimens. The sheer size of the collections requires developing high throughput digitisation workflows, as well as novel acquisition systems, image standardisation, curation, preservation, and publishing. For instance, herbarium sheet digitisation workflows (and fast digitisation stations) can digitise up t

42

Sawalkar, Prof Meera. "Text Recognition Using Image Processing Technology for Visiting Card." International Journal for Research in Applied Science and Engineering Technology 11, no. 5 (2023): 6374–78. http://dx.doi.org/10.22214/ijraset.2023.53081.

Abstract:

Image recognition and optical character recognition technologies have become an integral part of our daily lives due to increasing computing power and the proliferation of scanning devices. A printed document can be quickly converted to a digital text file using optical character recognition and edited by the user. The time required to digitize documents is therefore minimal. This is especially useful when archiving large print volumes. In this study, we show how image processing techniques can be used in combination with optical character recognition to improve recognition accuracy

43

Singh, Ripunjay, Sarthak Goyal, Shivam Agarwal, Divyansh Saxena, and Subho Upadhyay. "Evaluating the Performance of Ensembled YOLOv8 Variants in Smart Parking Applications Under Varying Lighting Conditions." International Journal of Mathematics and Computer Research 13, no. 04 (2025): 5026–32. https://doi.org/10.5281/zenodo.15124044.

Abstract:

With an emphasis on performance under various ambient illumination conditions, this paper explores the effectiveness of YOLOv8 variants for vehicle and license plate detection. The suggested method captures entire video frames, identifies areas of interest containing cars, and feeds those regions into two distinct, pre-trained YOLOv8 models: one for license plate recognition and the other for vehicle detection. To make the images easier for the Tesseract OCR engine to read, they are pre-processed using the OpenCV and Pillow libraries to increase their brightness and DPI. The four YOLOv8 mode

44

Shiravale, Sankirti Sandeep, Jayadevan R, and Sanjeev S. Sannakki. "Recognition of Devanagari Scene Text Using Autoencoder CNN." ELCVIA Electronic Letters on Computer Vision and Image Analysis 20, no. 1 (2021): 55–69. http://dx.doi.org/10.5565/rev/elcvia.1344.

Abstract:

Scene text recognition is a well-rooted research domain covering a diverse application area. Recognition of scene text is challenging due to the complex nature of scene images. Various structural characteristics of the script also influence the recognition process. Text and background segmentation is a mandatory step in the scene text recognition process. A text recognition system produces the most accurate results if the structural and contextual information is preserved by the segmentation technique. Therefore, an attempt is made here to develop a robust foreground/background segmentation(se

45

Navarro, Juan J., and Carolina Reta. "Digital image processing in the creation of an intelligent system prototype for text detection and recognition in the labeling process of electrical cable." Latin-American Journal of Computing 7, no. 2 (2020): 92–107. https://doi.org/10.5281/zenodo.5745180.

Abstract:

Cable labeling allows identifying different configurations and batches of cables, as well as their characteristics. In the cable labeling process, different types of errors or defects can occur in the printed text, such as the absence of the label, ink bleeding, text parts missing, illegible text, and ink drops. In this paper, a prototype of a system for text validation in electrical cables using image processing techniques is presented. The proposed system consists of two stages. In the first stage, a method based on K-means clustering was proposed, allowing preprocessing images to condition

46

You, Choon En, Wai Leong Pang, and Kah Yoong Chan. "AI-Based Low-Cost Real-Time Face Mask Detection and Health Status Monitoring System for COVID-19 Prevention." WSEAS TRANSACTIONS ON INFORMATION SCIENCE AND APPLICATIONS 19 (November 4, 2022): 256–63. http://dx.doi.org/10.37394/23209.2022.19.26.

Abstract:

The outbreak of COVID-19 brought a great challenge for the World Health Organization (WHO) in preventing the spread of SARS-CoV-2. The Ministry of Health (MOH) of Malaysia introduced the MySejahtera mobile application for health monitoring and contact tracing. Wearing a face mask in public areas was made compulsory by the Government. Overhead costs are incurred in hiring extra manpower to ensure that all visitors wear a face mask, check in through MySejahtera, and have a healthy status in MySejahtera before entering a premise. A low-cost solution is urgently needed to reduce the h

47

Belot, Margot, Leonardo Preuss, Joël Tuberosa, et al. "High Throughput Information Extraction of Printed Specimen Labels from Large-Scale Digitization of Entomological Collections using a Semi-Automated Pipeline." Biodiversity Information Science and Standards 7 (September 11, 2023): e112466. https://doi.org/10.3897/biss.7.112466.

Abstract:

Insects account for half of the total described living organisms on Earth, with a vast number of species awaiting description. Insects play a major role in ecosystems but are yet threatened by habitat destruction, intensive farming, and climate change. Museum collections around the world house millions of insect specimens and large-scale digitization initiatives, such as the digitization street digitize! at the Museum für Naturkunde, have been undertaken recently to unlock this data. Accurate and efficient extraction of insect specimen label information is vital for building comprehensive data

48

B. Talirongan, Florence Jean, Hidear Talirongan, Roseclaremath A. Caroro, Rolysent K. Paredes, Kenneth Largo, and Jasmine C. Dico. "Allocation and Scheduling System of Emission Testing using Optical Character Recognition." International Journal of Applied Science and Research 07, no. 05 (2024): 39–47. https://doi.org/10.56293/ijasr.2024.6106.

Abstract:

The scheduling system helped businesses in prioritizing tasks. However, emission centers are still using the manual scheduling of clients through an on-site visit without guaranteeing any available schedule. Hence, this study developed a system that allows vehicle owners to set an online schedule for their emission testing based on real-time schedule availability. The system used a Tesseract Optical Character Recognition (OCR) engine to scan the Certificate of Registration (CR) and Official Receipt (OR) images to extract CR and OR numbers. This approach assesses the authenticity of a vehicle’s

49

Adedayo, Kayode David, and Ayomide Oluwaseyi Agunloye. "Real-time Automated Detection and Recognition of Nigerian License Plates via Deep Learning Single Shot Detection and Optical Character Recognition." Computer and Information Science 14, no. 4 (2021): 11. http://dx.doi.org/10.5539/cis.v14n4p11.

Abstract:

License plate detection and recognition are critical components of the development of a connected intelligent transportation system, but are underused in developing countries because of the associated costs. Existing license plate detection and recognition systems with high accuracy require the usage of Graphical Processing Units (GPU), which may be difficult to come by in developing nations. Single stage detectors and commercial optical character recognition engines, on the other hand, are less computationally expensive and can achieve acceptable detection and recognition accuracy without the

50

Vizitiu, Alexandru Mădălin, Marius Alexandru Sandu, Lidia Dobrescu, Adrian Focșa, and Cristian Constantin Molder. "Comparative Approach to De-Noising TEMPEST Video Frames." Sensors 24, no. 19 (2024): 6292. http://dx.doi.org/10.3390/s24196292.

Abstract:

Analysis of unintended compromising emissions from Video Display Units (VDUs) is an important topic in research communities. This paper examines the feasibility of recovering the information displayed on the monitor from reconstructed video frames. The study holds particular significance for our understanding of security vulnerabilities associated with the electromagnetic radiation of digital displays. Considering the amount of noise that reconstructed TEMPEST video frames have, the work in this paper focuses on two different approaches to de-noising images for efficient optical character reco
