Journal articles on the topic 'Indonesia speech recognition'


Consult the top 50 journal articles for your research on the topic 'Indonesia speech recognition.'


1

Nasri, Andi. "Konversi Suara Ucapan Bahasa Indonesia Ke Sistem Bahasa Isyarat Indonesia (Sibi)." Ainet : Jurnal Informatika 2, no. 2 (2020): 7–13. http://dx.doi.org/10.26618/ainet.v2i2.4025.

Abstract:
As speech recognition technology advances, various software tools have been developed to help deaf people communicate with others. These systems translate speech into sign language or, conversely, sign language into speech. Such systems have been developed for many languages, including English, Arabic, Spanish, Mexican, Indonesian, and others. For Indonesian in particular, some researchers have begun building such systems. However, the systems built so far are limited by the Automatic Speech Recognition (ASR) component they use, which has a restricted vocabulary. This study aims to develop a system that translates Indonesian speech into the Indonesian Sign Language System (SIBI), using a larger corpus and continuous speech recognition to improve accuracy. Testing shows an average accuracy of 90.50% and a Word Error Rate (WER) of 9.50%. This accuracy is higher than that of the two earlier studies: 48.75% for the second and 66.67% for the first. The system can also recognize words spoken continuously, that is, full sentences. In performance testing, the system took 0.83 seconds for speech-to-text and 8.25 seconds for speech-to-sign.
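As a reading aid for the figures in this abstract, the following is a minimal sketch of how WER is conventionally computed: the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. It assumes that the reported accuracy is simply 100% minus WER, which is consistent with the 90.50% and 9.50% above but is not stated in the abstract. The function name and the example sentences are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of six reference words is dropped.
wer = word_error_rate("saya pergi ke pasar hari ini",
                      "saya pergi pasar hari ini")
accuracy = 1.0 - wer  # assumption: accuracy is defined as 1 - WER
```

Because insertions are counted as errors, WER computed this way can exceed 100% on very poor output, so the "accuracy = 1 - WER" reading only makes sense when errors are relatively rare.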
2

Sukmawati, Nur Endah, Adhy Satriyo, and Sutikno. "Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition for Indonesian." TELKOMNIKA Telecommunication, Computing, Electronics and Control 15, no. 1 (2017): 292–98. https://doi.org/10.12928/TELKOMNIKA.v15i1.3605.

Abstract:
Speech recognition can be defined as the process of converting voice signals into a sequence of words by applying a specific algorithm implemented in a computer program. Research on speech recognition in Indonesia is relatively limited. This paper studies which feature extraction method, Linear Predictive Coding (LPC) or Mel Frequency Cepstral Coefficients (MFCC), performs better for speech recognition in the Indonesian language. This matters because a method that produces high accuracy for one language does not necessarily produce the same accuracy for other languages, since every language has different characteristics. The authors hope this research can help accelerate the use of automatic speech recognition for Indonesian. Speech recognition has two main processes: feature extraction and recognition. The feature extraction methods compared in this study are LPC and MFCC, while recognition uses a Hidden Markov Model (HMM). The test results showed that MFCC performs better than LPC for Indonesian speech recognition.
3

William, Ezra, and Amalia Zahra. "Speech Recognition Dengan Whisper Dalam Bahasa Indonesia." Action Research Literate 9, no. 2 (2025): 386–97. https://doi.org/10.46799/arl.v9i2.2573.

Abstract:
Advances in artificial intelligence have driven progress in speech recognition, particularly in supporting more efficient digital communication. One of the most widely used recent models is Whisper, developed by OpenAI, a multilingual speech recognition model claimed to have high accuracy. However, the main challenges in deploying this technology in Indonesia are the limited data resources for local languages and the significant variation in accents. This study was therefore conducted to evaluate the performance of the Whisper model in recognizing and transcribing Indonesian speech. It aims to analyze Whisper's accuracy on Indonesian speech recognition in terms of Word Error Rate (WER) and to compare it with the XLS-R and XLSR-53 models. The method is a comparative approach in which the Whisper model is fine-tuned on the Indonesian portion of the Common Voice 13 dataset. The model is evaluated by measuring WER in the training and testing stages. The results show that Whisper performs better than XLS-R and XLSR-53 at recognizing Indonesian speech. The training WER obtained is 22.33505%, and the testing WER is 19.774909%. This indicates that Whisper is better than the other models at handling variation in accents and acoustic conditions. The advantage is attributed mainly to training on larger data and to the model's ability to adapt to many languages. The study contributes to the development of Indonesian speech recognition technology and to improved accessibility for users in sectors such as education, public services, and communication technology.
4

Novela, Martin, and T. Basaruddin. "Dataset Suara dan Teks Berbahasa Indonesia Pada Rekaman Podcast dan Talk show." JURNAL FASILKOM 11, no. 2 (2021): 61–66. http://dx.doi.org/10.37859/jf.v11i2.2628.

Abstract:
One factor in the success of a machine learning or deep learning model is the dataset used. This paper presents a speech dataset drawn from podcast and talk show recordings, together with Indonesian transcriptions. The dataset is presented because no publicly accessible Indonesian dataset was available for training Text-to-Speech or Audio Speech Recognition models. It consists of 3270 recordings that were processed to obtain transcriptions in the form of Indonesian text or sentences. Building the dataset involved several stages: preprocessing, translation, a first validation stage, and a second validation stage. The dataset follows the format of the LJSpeech dataset so that it is easy to process when used as model input. It is expected to help improve training quality for Text-to-Speech models such as Tacotron2 and for Audio Speech Recognition in Indonesian.
5

Suhardiyanto, Totok. "Mengapa Mesin Pencari Suara Gagal Mengenali Bahasa Indonesia? Sebuah Kajian Awal Tentang Asr (automatic speech recognition) Bahasa Indonesia." JURNAL ARBITRER 1, no. 1 (2013): 88. http://dx.doi.org/10.25077/ar.1.1.88-98.2013.

Abstract:
This paper studies Indonesian Automatic Speech Recognition (ASR) designed by the information and communication technology (TIK) industry. Specifically, it aims to describe how this tool operates in recognizing inputs in the Indonesian language. The TIK industry has something to do with Indonesian PWO, as smartphone use is growing massively in Indonesia. The study processes around 10,774 data items in Indonesian in the form of sentences, phrases, and words. Of these, fewer than 20% can be categorized as perfect; the rest contain errors in format or recognition. This is due to several factors that cause ASR to fail to recognize Indonesian input.
6

Muhammad, Syahroni Hidayat, and Ahmad Zuli Amrullah. "Speech Recognition Untuk Aplikasi Kamus Bahasa Indonesia-Sumbawa Berbasis Android." Jurnal Bumigora Information Technology (BITe) 1, no. 2 (2019): 126–37. http://dx.doi.org/10.30812/bite.v1i2.606.

Abstract:
Sumbawa is blessed with diverse tourism potential, which attracts people from outside Sumbawa to visit, work, or study. However, language is sometimes an obstacle for outsiders who want to interact with native Sumbawa people. An instrument is therefore needed so that language differences do not hinder interaction, namely a dictionary. Such a dictionary should be delivered on the technology most widely used by Indonesians, the Android smartphone, because of the features it offers, one of which is speech recognition. The system was designed using the waterfall methodology, which consists of analysis, design, coding, testing, and finally maintenance. The tools used are Android Studio and DB Browser for SQLite (DB4S). Testing uses Black Box testing for application functionality and Word Correct Rate (WCR) for system accuracy, using 30 different words with each word repeated 10 times. The result of this study is an Android-based Indonesian-Sumbawa dictionary application that uses speech recognition technology. The functionality test shows that the application's features work well both offline and online. In the accuracy test, the WCR was 92.67% offline and 95.33% online.
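The accuracy figures in this abstract can be related to the test design with a short sketch. It assumes that WCR is simply the number of correctly recognized utterances divided by the total number of utterances, and that every one of the 30 x 10 = 300 utterances counts equally; neither assumption is stated in the abstract. Under those assumptions the reported percentages correspond to 278 and 286 correct utterances, which are inferred counts rather than figures reported by the authors. The function names are hypothetical.

```python
WORDS = 30        # distinct test words (from the abstract)
REPETITIONS = 10  # repetitions per word (from the abstract)
TOTAL_TRIALS = WORDS * REPETITIONS


def word_correct_rate(correct: int, total: int) -> float:
    """Percentage of utterances recognized correctly (assumed definition)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


def implied_correct(rate_percent: float, total: int) -> int:
    """Nearest whole number of correct utterances implied by a rate."""
    return round(rate_percent * total / 100)


# Inferred counts, not reported in the abstract.
offline_correct = implied_correct(92.67, TOTAL_TRIALS)
online_correct = implied_correct(95.33, TOTAL_TRIALS)

# Recomputing the rate from the inferred counts reproduces the
# reported percentages to two decimal places.
offline_rate = round(word_correct_rate(offline_correct, TOTAL_TRIALS), 2)
online_rate = round(word_correct_rate(online_correct, TOTAL_TRIALS), 2)
```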
 
7

Nurhasanah, Youllia Indrawaty, Irma Amelia Dewi, and Bagus Ade Saputro. "Iqro Reading Learning System through Speech Recognition Using Mel Frequency Cepstral Coefficient (MFCC) and Vector Quantization (VQ) Method." IJAIT (International Journal of Applied Information Technology) 2, no. 01 (2018): 29. http://dx.doi.org/10.25124/ijait.v2i01.1173.

Abstract:
Historically, the study of the Qur'an in Indonesia evolved along with the spread of Islam. Methods for learning to read the Qur'an include al-Baghdadi, al-Barqi, Qiraati, Iqro', Human, Tartila, and others, which make learning to read the Qur'an easier. Speech recognition technology can now be used to detect the pronunciation of Iqro volume 3 readings. Speech recognition consists of two general stages: feature extraction and speech matching. The feature extraction stage derives features from the speech, and the speech matching stage compares the test voice against the training voice. The method used to recognize Iqro readings extracts speech signal features using Mel Frequency Cepstral Coefficient (MFCC) and classifies them using Vector Quantization (VQ) to obtain the matching result. The system was tested on a sample of 30 people, and 6 utterances were reported as failed, so the system has a success rate of 80%.
8

Henry, Henry, and Eryc Eryc. "Speech Recognition Untuk Membantu Pelafalan Hanyu Pinyin Sebagai Bagian Dari Edukasi Bahasa Mandarin." Jurnal Ilmiah Edutic : Pendidikan dan Informatika 10, no. 2 (2024): 117–28. http://dx.doi.org/10.21107/edutic.v10i2.22633.

Abstract:
In the era of globalization, proficiency in international languages is increasingly important in both education and business. One widely used international language is Mandarin. Several countries have begun including Mandarin in their curricula, Indonesia among them. Differences between the characteristics of Indonesian and Mandarin mean that Mandarin learners in Indonesia tend to make pronunciation errors. Speech recognition is a branch of artificial intelligence that allows computers to accept voice input. It can be used to build applications that train Mandarin pronunciation. This study examines the design and evaluation of a mobile application that uses speech recognition to help Mandarin learners practice hanyu pinyin pronunciation. The application was designed using the waterfall SDLC method and written in the Dart programming language with the Flutter framework, using the speech_to_text and flutter_tts packages. The application was tested using black-box testing. Quantitative data were collected by distributing a questionnaire to 33 users of the application, and the results, scored on a Likert interval scale, rated it "Very Effective" in terms of operation, appearance, and content.
9

Amrullah, Ahmad Zuli, and Khurniawan Eko Saputro. "Analisis dan Perancangan Kamus Interaktif Bahasa Isyarat Indonesia dengan Speech Recognition." Jurnal Bumigora Information Technology (BITe) 1, no. 2 (2019): 110–15. http://dx.doi.org/10.30812/bite.v1i2.604.

Abstract:
According to the 2012 National Socio-Economic Survey (Susenas), around 9.9 million Indonesian children have disabilities. Around 7.87% of the total number of people with disabilities are deaf or hard of hearing. Deaf people communicate using sign language. Because not everyone understands sign language, tools or applications are needed to communicate with deaf people; without them, communication between hearing people and deaf people is limited. Therefore, to help students and lecturers communicate with deaf students, a sign language dictionary application with speech recognition is needed. The application is developed using the waterfall method, in which each stage runs in harmony with the others, making system errors easier to find. Testing is carried out by requirements verification to ensure that the resulting software product meets the specified requirements.
Keywords: sign language; dictionary; speech recognition
10

Hilman, F. Pardede, Adhi Purwoko, Zilvan Vicky, Ramdan Ade, and Krisnandi Dikdik. "Deep convolutional neural networks-based features for Indonesian large vocabulary speech recognition." International Journal of Artificial Intelligence (IJ-AI) 12, no. 2 (2023): 610–17. https://doi.org/10.11591/ijai.v12.i2.pp610-617.

Abstract:
There is great interest in developing speech recognition using deep learning technologies because they model the complexity of pronunciations, syntax, and language rules of speech data better than the traditional hidden Markov model (HMM) does. However, deep learning-based speech recognition needs a large amount of data to be effective. While this is not a problem for mainstream languages such as English or Chinese, it is for non-mainstream languages such as Indonesian. To overcome this limitation, this paper presents deep features based on convolutional neural networks (CNN) for Indonesian large vocabulary continuous speech recognition. The CNN is trained discriminatively, unlike usual deep learning implementations, where the networks are trained generatively. The evaluations show that, on Indonesian speech data, the proposed method achieves error reduction rates of 7.26% and 9.01% over the state-of-the-art deep belief networks-deep neural networks (DBN-DNN) for large vocabulary continuous speech recognition (LVCSR), with Mel frequency cepstral coefficients (MFCC) and filterbank (FBANK) used as features, respectively. An error reduction rate of 6.13% is achieved compared to CNN-DNN with generative training.
11

Roissyah Fernanda Khoiroh, Eric Julianto, Safrizal Ardana Ardiyansa, Haidar Ahmad Fajri, Aryaguna Abi Rafdi Yasa, and Brian Sangapta. "Implementasi Speech Recognition Whisper pada Debat Calon Wakil Presiden Republik Indonesia." Explore 14, no. 2 (2024): 67–74. http://dx.doi.org/10.35200/ex.v14i2.115.

Abstract:
The Unitary State of the Republic of Indonesia (NKRI) has a democratic system of government, so Indonesian citizens are free to exercise their right to vote in choosing a vice-presidential candidate. This right is guaranteed by the 1945 Constitution's provisions on constitutional electoral rights. The 2024 election has 204,807,222 voters registered in the Final Voter List (DPT), dominated by millennials and Generation Z. These young voters need sufficient information to understand the candidates' vision, mission, and ideas. The vision and mission must be conveyed well so that voters can choose wisely. Several factors prevent the candidates' ideas from being conveyed well, such as the use of unfamiliar and non-standard phrasing and disturbances from the surrounding environment. One solution is to use an automatic speech recognition model. Whisper is a speech recognition model that detects speech well. Whisper achieved an accuracy of 95.19% on Muhaimin's statements, 96.71% on Gibran's, and 87.16% on Mahfud's. Whisper was also able to identify the frequency of the words used by the three candidates. Based on the frequency of emphasized words, it can be interpreted that Muhaimin wants to build closeness and openness with the Indonesian people and focuses on economic development issues; Gibran emphasizes giving attention and space for young people to develop; and Mahfud wants to focus on facilities, work programs, and resolving economic issues.
12

Hidayat, Syahrul, Yisti Vita Via, and Eka Prakarsa Mandyartha. "Penerapan Model Hybrid Convolutional Neural Network dan Long Short-Term Memory untuk Pengenalan Real-Time Sistem Isyarat Bahasa Indonesia (SIBI)." JURNAL MEDIA INFORMATIKA BUDIDARMA 8, no. 3 (2024): 1586. http://dx.doi.org/10.30865/mib.v8i3.7837.

Abstract:
The Indonesian Sign Language System (SIBI) is an essential means of communication for the deaf and speech-impaired community in Indonesia. However, the limited public understanding of SIBI often hinders effective communication. This study develops a real-time SIBI sign recognition model to facilitate effective communication for the deaf and speech-impaired in Indonesia. The proposed method integrates a hybrid CNN-LSTM model to process the spatial and temporal information from the data. The study evaluates the model's performance on 25 types of SIBI signs. The dataset used consists of image sequences captured in real-time. Training is conducted with various parameters, including batch size, learning rate, and epochs. Model evaluation is carried out using accuracy, precision, recall, and f1-score metrics. The training and validation results show an increase in accuracy with the number of epochs: 87% at 10 epochs, 93% at 25 epochs, and 100% at 50 epochs. In real-time detection tests, the model with the image sequence dataset accurately detected SIBI signs in environments and with objects consistent with the dataset. The real-time detection program generates SIBI sign predictions in text form and sentences. The output of this research is efficient and accurate SIBI sign recognition technology. This research is expected to facilitate more effective communication for the deaf and speech-impaired community in Indonesia.
13

Gafar, Agum Agidtama, and Jayanti Yusmah Sari. "Sistem Pengenalan Bahasa Isyarat Indonesia dengan Menggunakan Metode Fuzzy K-Nearest Neighbor." Jurnal ULTIMATICS 9, no. 2 (2018): 122–28. http://dx.doi.org/10.31937/ti.v9i2.671.

Abstract:
The Indonesian Sign System (SIBI) is one of the most natural languages of communication, especially for deaf and speech-impaired people. Deaf and speech-impaired people can understand and communicate with each other using sign language, but hearing people often have difficulty understanding what they sign. To overcome this problem, a system is needed that can recognize SIBI and that can serve as a learning medium for communication between deaf and hearing people. SIBI recognition consists of three main stages: image acquisition, preprocessing, and recognition. The classification method used in this research is Fuzzy K-Nearest Neighbor (FKNN). Experiments with FKNN classification obtained an accuracy of 88%.
Index Terms: Fuzzy K-Nearest Neighbor, Sistem Isyarat Bahasa Indonesia (SIBI).
14

Wijonarko, Panji, and Amalia Zahra. "Spoken language identification on 4 Indonesian local languages using deep learning." Bulletin of Electrical Engineering and Informatics 11, no. 6 (2022): 3288–93. http://dx.doi.org/10.11591/eei.v11i6.4166.

Abstract:
Language identification is at the forefront of assistance in many applications, including multilingual speech systems, spoken language translation, multilingual speech recognition, and human-machine interaction via voice. Identifying Indonesian local languages with spoken language identification technology has enormous potential to advance tourism and digital content in Indonesia. The goal of this study is to identify four Indonesian local languages (Javanese, Sundanese, Minangkabau, and Buginese) using deep learning classification techniques such as artificial neural network (ANN), convolutional neural network (CNN), and long short-term memory (LSTM). Features are extracted from the audio data using mel-frequency cepstral coefficients (MFCC). The results showed that the LSTM model had the highest accuracy for each speech duration (3 s, 10 s, and 30 s), followed by the CNN and ANN models.
16

Muslim, Bukhori. "PENYIMPANGAN TEORI BROWN DAN LEVINSON DALAM TINDAK TUTUR PESERTA TALK SHOW INDONESIA LAWYERS CLUB (ILC) DI TV ONE DAN RELEVANSINYA TERHADAP PEMBELAJARAN BAHASA INDONESIA DI SMA." RETORIKA: Jurnal Ilmu Bahasa 3, no. 1 (2017): 104–17. http://dx.doi.org/10.22225/jr.3.1.100.104-117.

Abstract:
This research aims to describe the forms of deviation from Brown and Levinson's theory of politeness in the speech acts of participants in Indonesia Lawyers Club on TV One, and their relevance to Indonesian language learning in high school. The theory used to address the research problem is pragmatic theory. The approach is descriptive and qualitative, with data collected through documentation and observation. The results show that the deviations from Brown and Levinson's theory of politeness in the speech acts of Indonesia Lawyers Club participants in the editions of 27 May 2014 and 7 April 2015 consist of threats to positive face and threats to negative face. Utterances that threaten positive face include complaints, accusations, disapproval, criticism, uncooperative expressions, embarrassing the interlocutor, and taboo words. Utterances that threaten negative face include expressions of rejection, suggestions, advice, requests, prohibitions, promises, and praise. The types of speech acts used are directive, declarative, expressive, and representative. The deviations from Brown and Levinson's theory in the ILC participants' speech acts are motivated mainly by a sense of justice, self-defense, group solidarity, power, recognition of self and group, law enforcement, the fight against corruption, and advocacy on behalf of the people. The results can be applied to Indonesian language learning in grade XI of senior high school (SMA), second semester, KD 9.2, with the subject matter of giving comments on a discussion.
17

Asril, Jarin, Santosa Agung, Teduh Uliniansyah Mohammad, Ruslana Aini Lyla, Nurfadhilah Elvira, and Gunarso Gunarso. "Automatic speech recognition for Indonesian medical dictation in cloud environment." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 2 (2024): 1762–72. https://doi.org/10.11591/ijai.v13.i2.pp1762-1772.

Abstract:
This paper introduces Sistem Pengenalan Wicara untuk Pendiktean Medis (SPWPM), an automatic speech recognition (ASR) system designed specifically for Indonesian medical dictation. The main objective of SPWPM is to assist medical professionals in producing medical reports and diagnosing patients. Deployed within a cloud computing service architecture, SPWPM strives to achieve a minimum speech recognition accuracy of 95%. The ASR model of SPWPM is developed using Kaldi and PyChain, with a comprehensive training dataset created in collaboration with Labs247 Company and Harapan Kita Heart and Blood Vessel Hospital. Several optimization techniques were applied, including language modeling with smoothing, lexicon generation using the Grapheme-to-Phoneme Converter, and data augmentation. The readiness of this technology to assist hospital users was assessed through two evaluations: the SPWPM architecture test and the SPWPM speech recognition test. The results demonstrate the system's preparedness in accurately transcribing medical dictation, showcasing its potential to enhance medical reporting for healthcare professionals in hospital environments.
18

Riyanta, Bambang, Henry Ardian Irianta, and Berli Paripurna Kamiel. "Development of Speech Command Control Based TinyML System for Post-Stroke Dysarthria Therapy Device." Journal of Robotics and Control (JRC) 4, no. 4 (2023): 466–78. http://dx.doi.org/10.18196/jrc.v4i4.15918.

Abstract:
Post-stroke dysarthria (PSD) is a widespread outcome of stroke. To support the objective evaluation of dysarthria, the development of pathological voice recognition technology has received a lot of attention. Soft robotic therapy devices have been adopted as an alternative for rehabilitation and hand-grasp assistance to improve activities of daily living (ADL). Despite significant progress in this field, most soft robotic therapy devices rely on a controller that is complex, bulky, and stationary, requires large computational power, and lacks a pathological voice recognition model. This study aims to develop a portable wireless multi-controller that recognizes simulated dysarthric vowel speech in Bahasa Indonesia and non-dysarthric micro speech, using a tiny machine learning (TinyML) system for hardware efficiency. The speech interface uses an INMP441 microphone, and computation is done by a lightweight deep convolutional neural network (DCNN) embedded in an ESP-32. Features are extracted with the Short Time Fourier Transform (STFT) and fed into the CNN. The method proved useful for micro-speech recognition at low computational power, with accuracy above 90% in both speech scenarios. Real-time inference on the ESP-32 with a hand prosthetic, at three household noise levels of 24 dB, 42 dB, and 62 dB, yielded accuracies of 95%, 85%, and 50%, respectively. The wireless connectivity success rate with both controllers is reported as around 0.2 - 0.5 ms.
19

Novianty, Astri, and Fairuz Azmi. "Sign Language Recognition using Principal Component Analysis and Support Vector Machine." IJAIT (International Journal of Applied Information Technology) 4, no. 01 (2021): 49. http://dx.doi.org/10.25124/ijait.v4i01.3015.

Abstract:
The World Health Organization (WHO) estimates that over five percent of the world's population is hearing-impaired. One communication problem that often arises between deaf or speech-impaired people and hearing people is the low level of knowledge and understanding of the sign language that deaf or speech-impaired people use in daily communication. To overcome this problem, we build a sign language recognition system for the Indonesian language. The sign language system for Bahasa Indonesia, called Bisindo, differs from the others. Our work uses two image processing algorithms for pre-processing, namely grayscale conversion and histogram equalization. Subsequently, principal component analysis (PCA) is employed for dimensionality reduction and feature extraction. Finally, a support vector machine (SVM) is applied as the classifier. Results indicate that histogram equalization significantly enhances recognition accuracy. Comprehensive experiments applying different random seeds for the testing data confirm that our method achieves 76.8% accuracy. Accordingly, more robust methods to enhance accuracy in sign language recognition remain an open problem.
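
The histogram equalization step named in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `equalize_histogram`, the list-of-rows image format, and the tiny 2x2 test image are all assumptions made for illustration. It applies the standard CDF-based remapping of gray levels.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as rows of ints in [0, levels)."""
    pixels = [p for row in image for p in row]

    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in image]

    # Map each level so the output CDF is approximately linear.
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
           for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Hypothetical low-contrast image: four nearly identical gray levels.
low_contrast = [[100, 101], [102, 103]]
equalized = equalize_histogram(low_contrast)
print(equalized)  # [[0, 85], [170, 255]]
```

The four input levels, which span only 3 gray values, are spread across the full 0 to 255 range, which is the contrast gain the abstract credits for the accuracy improvement.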
20

Chamidy, Totok. "Metode Mel Frequency Cepstral Coeffisients (MFCC) Pada klasifikasi Hidden Markov Model (HMM) Untuk Kata Arabic pada Penutur Indonesia." MATICS 8, no. 1 (2016): 36. http://dx.doi.org/10.18860/mat.v8i1.3482.

Abstract:
Speech recognition is a system that transforms spoken words into text. Human voice signals have very high variability. Different pronunciations of a text also result in distinctive speech patterns. This is even more pronounced when the text is spoken by a non-native speaker, for example Arabic words spoken by an Indonesian speaker. In this study, Mel Frequency Cepstral Coefficients (MFCC) feature extraction is explored for recognition of Arabic words spoken by Indonesian speakers, with training data from native Arabic speakers. The extracted features are then classified using a Hidden Markov Model (HMM). HMM is an acoustic modeling approach in which the voice signal is analyzed to find the maximum probability value that can be recognized; the modeling yields parameters that are then used in the word recognition process. The recognized word is the word with the best match. The system produces an average accuracy of 83.1% for test data with a sampling frequency of 8,000 Hz, 82.3% for 22,050 Hz, and 82.2% for 44,100 Hz.
21

Suwento, Ronny, Dini Widiarni Widodo, Tri Juda Airlangga, et al. "Clinical Trial for Cartilage Conduction Hearing Aid in Indonesia." Audiology Research 11, no. 3 (2021): 410–17. http://dx.doi.org/10.3390/audiolres11030038.

Abstract:
Hearing improvement represents one of the most valuable outcomes in microtia and aural atresia reconstruction surgery. Most patients with poor development of hearing function have had severe microtia. Conventional methods to improve hearing function are bone conduction and bone-anchored hearing aids. Cartilage conduction hearing aids (CCHA) represent a new amplification method. This study assessed the outcomes and evaluated the impact and safety of CCHA in patients with microtia and aural atresia whose hearing dysfunction did not improve after surgery for ear reconstruction in our hospital. Hearing function was evaluated with pure tone audiometry or sound field testing by behavioral audiometry and speech audiometry before and after CCHA fitting. There was a significant difference between unaided and aided thresholds (p < 0.001). Speech recognition threshold and speech discrimination level also significantly improved with CCHA. The average functional gain of 14 ears was 26.9 ± 2.3 dB. Almost all parents of the patients reported satisfaction with the performance of CCHA, and daily communication in children with hearing loss also became better than usual.
22

Khoirotul Aini, Yulistia, Tri Budi Santoso, and Titon Dutono. "Pemodelan CNN Untuk Deteksi Emosi Berbasis Speech Bahasa Indonesia." Jurnal Komputer Terapan, Vol. 7 No. 1 (2021) (June 2, 2021): 143–52. http://dx.doi.org/10.35143/jkt.v7i1.4623.

Abstract:
Human-computer interaction requires the ability to recognize, interpret, and respond to emotions expressed in speech. To date, research on speech emotion recognition (SER) for Indonesian remains very scarce, owing to the limited availability of Indonesian-language corpora for SER. In this study, an SER system was built using a dataset taken from Indonesian-language TV series. The system is designed to classify emotions into four classes: angry, happy, neutral, and sad. It is implemented with deep learning, specifically a CNN. The system input is a combination of three features: MFCC, fundamental frequency, and RMSE. In the experiments, the best result for the Indonesian SER system was obtained with MFCC + fundamental frequency as input, giving an accuracy of 85%. The lowest accuracy, 72%, was obtained with MFCC + RMSE. This preliminary study is expected to give researchers in the SER field an overview of how to select speech signal features as input for testing and to ease further development of their research.
23

Tamrin, Fikram, Sakina Sudin, and Gamaria Mandar. "ANALISIS SENTIMEN MENGGUNAKAN SPEECH RECOGNITION DENGAN ALGORITMA NAIVE BAYES CLASSIFIER." Jurnal Sains Komputer dan Teknologi Informasi 6, no. 1 (2023): 60–64. http://dx.doi.org/10.33084/jsakti.v6i1.5416.

Abstract:
Sentiment analysis, also called opinion mining, is the process of automatically understanding, extracting, and analyzing textual data to obtain the sentiment information contained in an opinion sentence that someone expresses about an issue, which tends to be either negative or positive. The aim of this study is to classify data from male and female students around the Muhammadiyah Maluku Utara campus into two categories, positive and negative. The text used in this study is in Indonesian and contains responses from the general public and students. A public-service website was also developed to collect opinions from the public and students; it uses a speech recognition feature to convert speech into Indonesian text in real time from microphone input. The public opinions on the website can be analyzed to determine whether the public and students hold a negative attitude toward the Universitas Muhammadiyah Maluku Utara campus. The data consist of 300 items split into 220 training items and 80 test items. Sentiment data are classified using text mining with a Naïve Bayes Classifier. Before classification, several text processing steps are carried out: case folding, cleaning, stopword removal, tokenizing, and stemming. Of the 80 classified test items, the results were 43 positive-sentiment items and 7 negative items. The 80 test items can therefore be interpreted as falling into the positive sentiment category, because the negative items are fewer than the positive ones. The Naïve Bayes Classifier achieved an accuracy of 62.5%.
24

Hermanto, Hermanto, and Tjong Wan Sen. "Syllable-Based Javanese Speech Recognition Using MFCC and CNNs: Noise Impact Evaluation." JURNAL TEKNIK INFORMATIKA 18, no. 1 (2025): 32–42. https://doi.org/10.15408/jti.v18i1.41067.

Abstract:
Javanese, a regional language in Indonesia spoken by over 100 million people, is classified as a low-resource language, which presents significant challenges for developing effective speech recognition systems due to limited linguistic resources and data. Furthermore, noise significantly affects the performance of speech recognition systems. This study aims to develop a speech recognition model for the Javanese language, focusing on a syllable-based approach using Mel Frequency Cepstral Coefficients (MFCC) for audio feature extraction and Convolutional Neural Networks (CNNs) for classification. Additionally, it analyzes how different types of colored noise (white Gaussian, pink, and brown) added to the audio affect the model's accuracy. The results showed that the proposed method reached a peak accuracy of 81% when tested on the original audio (audio without any synthetic noise added). Moreover, in noisy audio, model accuracy improves as noise levels decrease. Interestingly, with brown noise at a 20 dB SNR, the model's accuracy slightly increases to 83%, representing a 2.47% improvement over the original audio. These results demonstrate that the proposed syllable-based method is a promising approach for real-world applications in Javanese speech recognition, and the slight accuracy improvement in noisy conditions suggests potential regularization effects.
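
Adding noise at a chosen SNR, as this abstract describes, can be sketched as follows. This is an illustrative assumption rather than the authors' code: the helper names (`power`, `brown_noise`, `mix_at_snr`), the 440 Hz test tone, and the 8000 Hz sample rate are hypothetical, and pink noise is omitted for brevity. Brown noise is approximated here as a running sum of white Gaussian noise.

```python
import math
import random


def power(x):
    """Mean-square power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def white_noise(n, rng):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def brown_noise(n, rng):
    """Brown noise as a running sum of white noise, with the mean removed."""
    out, acc = [], 0.0
    for _ in range(n):
        acc += rng.gauss(0.0, 1.0)
        out.append(acc)
    mean = sum(out) / n
    return [v - mean for v in out]


def mix_at_snr(signal, noise, snr_db):
    """Scale noise so 10*log10(P_signal / P_noise) equals snr_db, then add it."""
    target_noise_power = power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / power(noise))
    return [s + scale * n for s, n in zip(signal, noise)]


rng = random.Random(0)
sr = 8000  # assumed sample rate for the toy signal
clean = [math.sin(2 * math.pi * 440 * i / sr) for i in range(sr)]
noisy = mix_at_snr(clean, brown_noise(sr, rng), snr_db=20)

# Check: the residual noise should sit 20 dB below the clean signal.
residual = [n - c for n, c in zip(noisy, clean)]
measured_snr = 10 * math.log10(power(clean) / power(residual))
```

Swapping `brown_noise` for `white_noise` gives the white Gaussian condition at the same SNR, since only the noise's total power is constrained by the scaling.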
25

Deslianti, Dwita. "APLIKASI SPEECH TO TEXT BAHASA INDONESIA KE BAHASA BENGKULU MENGGUNAKAN POCKETSPHINX BERBASIS ANDROID." JSAI (Journal Scientific and Applied Informatics) 1, no. 2 (2018): 54–57. http://dx.doi.org/10.36085/jsai.v1i2.14.

Abstract:
Android-based smartphones in particular have become a necessity because they promise many everyday conveniences, one of which is the speech recognition developed by Google. With this feature, users can easily search for locations, articles, and anything else they need while busy driving, using only their voice. When traveling out of town, we also need a smartphone that can accompany us on the journey, for example to communicate in the local language of the region we visit. Based on this problem, the author discusses how to build an Android-based Indonesian-to-Bengkulu speech-to-text application using Pocketsphinx. The aim of this study is to build such an application, so that it can help travelers to the city of Bengkulu communicate in the Bengkulu language.
26

Yusnita, Lita, Rosalina Rosalina, Rusdianto Roestam, and R. B. Wahyu. "Implementation of Real-Time Static Hand Gesture Recognition Using Artificial Neural Network." CommIT (Communication and Information Technology) Journal 11, no. 2 (2017): 85. http://dx.doi.org/10.21512/commit.v11i2.2282.

Abstract:
This paper implements static hand gesture recognition for the alphabetical signs from “A” to “Z”, the numbers from “0” to “9”, and additional marks such as “Period”, “Question Mark”, and “Space” in Sistem Isyarat Bahasa Indonesia (SIBI). Hand gestures are obtained by evaluating the contour representation from image segmentation of the glove worn by the user. The gesture is then classified using an Artificial Neural Network (ANN) based on a training model previously built from 100 images for each gesture. The accuracy rate of hand gesture translation is calculated to be 90%. Moreover, speech translation recognizes NATO phonetic letters as the speech input for translation.
27

KASYIDI, FATAN, RIDWAN ILYAS, and NIDA MUTHI ANNISA. "Peningkatan Kemampuan Pengenalan Emosi Melalui Suara dalam Bahasa Indonesia." MIND Journal 6, no. 2 (2021): 194–204. http://dx.doi.org/10.26760/mindjournal.v6i2.194-204.

Abstract:
Human interaction with computers is a growing phenomenon, accompanied by the increasing use of computers in human social activities. Humans interact with one another by involving emotions to understand each other, and emotions are often conveyed through the way people speak. Plenty of research on speech emotion recognition has been carried out. Nevertheless, there are still efforts to enhance it, especially regarding the corpus problem, which is one of the factors that keep models from reaching optimal accuracy, particularly in relation to imbalanced data. This study was conducted to enhance the performance of emotion recognition for five emotion classes: happiness, anger, sadness, contentment, and neutral. We employed boosting algorithms, with CNN and RNN for comparison, and applied SMOTE to the corpus. After the experiment, accuracy on the test data reached 65% with a 22050 Hz sampling rate, MFCCs, and SMOTE oversampling.
Keywords: Data Imbalance, Boosting Algorithms, CNN, RNN, SMOTE
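
The SMOTE oversampling mentioned in this abstract creates synthetic minority-class samples by interpolating between a sample and one of its nearest minority neighbours. The sketch below illustrates only that core idea; the function name `smote_sample`, the toy 2-D feature vectors, and the choice of `k=2` are assumptions, not details taken from the paper.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point on the segment between a random minority
    sample and one of its k nearest minority-class neighbours."""
    rng = rng or random.Random()
    base = rng.choice(minority)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # k nearest neighbours of the base point within the minority class.
    others = [p for p in minority if p is not base]
    neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]

    neighbour = rng.choice(neighbours)
    gap = rng.random()  # interpolation factor in [0, 1)
    return [b + gap * (n - b) for b, n in zip(base, neighbour)]


# Hypothetical minority-class feature vectors (corners of a unit square).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rng = random.Random(42)
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(4)]
```

With this toy data, each corner's two nearest neighbours are the adjacent corners, so every synthetic point lands on an edge of the square rather than at an arbitrary position.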
28

Fathurrahman, Arisza Zufar, Dea Inesia Sri Utami, and Kartika Aghni Safitri. "Utilization of Augmented Reality as a Solution for Vernacular Language Approaches to Recognize an Object Through Speech Recognition." International Journal of Research and Applied Technology 3, no. 1 (2023): 79–86. http://dx.doi.org/10.34010/injuratech.v3i1.9954.

Abstract:
With the rise of Western culture entering Indonesia, vernacular languages no longer seem essential to learn. People are more concerned with learning foreign languages and cultures so that they stay relevant to the times and can more easily adapt to the professions in demand, whereas preserving culture is no less important for protecting the beloved country of Indonesia. Therefore, a solution relevant to the times is required to solve problems related to vernacular languages. This research aims to create a technology that helps the community learn the vernacular name of a particular object using Augmented Reality and Speech Recognition technology. In supporting the research, we use a qualitative descriptive method and the SDLC waterfall concept in its design. The results indicate that this technology has succeeded in making it easier to find out the vernacular name of an object. Augmented Reality technology gives an exciting impression when using this application, while Speech Recognition technology makes the application easier to use because users can access it through speech alone. With this application, people realize that preserving the vernacular language can be done using today's technology.
29

Kurniadi, Dede, Fitri Nuraeni, Indra Trisna Raharja, and Asri Mulyani. "Perancangan Aplikasi Text To Speech Dalam Bahasa Indonesia Menggunakan Firebase Machine Learning Kit Berbasis Android." Jurnal Teknologi Informasi dan Ilmu Komputer 9, no. 6 (2022): 1281. http://dx.doi.org/10.25126/jtiik.2022965985.

Abstract:
Text-to-speech applications can convert text into voice output using a text-to-speech engine, but the text must be digital text in order to be rendered. So, if the text is on an object, it must be extracted first. The Firebase Machine Learning Kit provides a text recognition API to help extract text. The Firebase Machine Learning Kit (ML-Kit) also provides a language identifier API to detect the language of the text being read, so that the voice output can be optimized by using a specific language dialect. The purpose of this research is to build an Android-based text-to-speech application in Indonesian using the Firebase Machine Learning Kit. The application was built using the extreme programming method, whose stages consist of planning, design, coding, and testing. The result of this study is an application that can be used as a foreign language learning aid and as a tool for text digitization and translation into Indonesian, with 34 language dialects for text-to-speech voice output. In addition, the study measured the accuracy of text recognition for handwriting and machine-printed text, with an average accuracy of 85.25% for handwriting and 87.35% for machine-printed text. With this good accuracy, the application is ready to be used as a tool in the process of learning foreign languages by the Indonesian people.
30

Hatala, Zulkarnaen. "Langkah Praktis Membangun Sistem Pengenalan Suara dengan HTK." JSAI (Journal Scientific and Applied Informatics) 2, no. 2 (2019): 149–53. http://dx.doi.org/10.36085/jsai.v2i2.314.

Abstract:
This paper presents a procedure for developing an Automatic Speech Recognition (ASR) system for the online recognition case. The procedure builds an ASR quickly and efficiently using the Hidden Markov Model Toolkit (HTK). The practical steps are laid out clearly for implementing a small-vocabulary ASR, using Indonesian digit recognition as the example case. Several techniques for improving performance are explained, such as handling noise, multiple spellings, and applying Principal Component Analysis. The final result is a Word Error Rate
31

Armaisya, Dimas Dwi, Panca Dewi Pamungkasari, Achmad Pratama Rifai, Ira Diana Sholihati, and Gopal Sakarkar. "Comparison Of Feature Extraction Techniques For Long Short-Term Memory Models In Indonesian Automatic Speech Recognition." Green Intelligent Systems and Applications 5, no. 1 (2025): 74–92. https://doi.org/10.53623/gisa.v5i1.605.

Abstract:
Automatic Speech Recognition (ASR) faced challenges in accuracy and noise robustness, particularly in Bahasa Indonesia. This research addressed the limitations of single feature extraction methods, such as Mel-Frequency Cepstral Coefficients (MFCC), which were sensitive to noise, and Relative Spectral Transform - Perceptual Linear Predictive (RASTA-PLP), which was less effective in frequency representation, by proposing a hybrid approach that combined both techniques using Long Short-Term Memory (LSTM) models. MFCC enhanced spectral accuracy, while RASTA-PLP improved noise robustness, resulting in a more adaptive and informative acoustic representation. The evaluation demonstrated that the hybrid method outperformed single and non-extraction approaches, achieving a Character Error Rate (CER) of 0.5245 on clean data and 0.8811 on noisy data, as well as a Word Error Rate (WER) of 0.9229 on clean data and 1.0015 on noisy data. Although the hybrid approach required longer training times and higher memory usage, it remained stable and effective in reducing transcription errors. These findings suggested that the hybrid method was an optimal solution for Indonesian speech recognition in various acoustic conditions.
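
The Character Error Rate (CER) and Word Error Rate (WER) reported in this abstract are conventionally computed as an edit distance divided by the reference length. The following is a minimal sketch of that standard definition, not the paper's evaluation code; the function names and the Indonesian example sentences are hypothetical. It also shows why WER can exceed 1, as in the noisy-data figure above: insertions count as errors but do not lengthen the reference.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (substitutions, deletions, and insertions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character-level edit distance divided by the reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


ref = "saya pergi ke pasar"
hyp = "saya pergi pasar"
print(wer(ref, hyp))  # 0.25 (one deleted word out of four)
print(cer(ref, hyp))  # 3 deleted characters out of 19

# Three inserted words against a one-word reference: WER above 1.
print(wer("ya", "ya itu benar sekali"))  # 3.0
```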
32

Priyo Perdana Adati, Parabelem T. D. Rompas, and Olivia Kembuan. "Aplikasi Pengenalan Bahasa Mongondow Dengan Speech Recognition Menggunakan Metode Rapid Application Development (RAD)." Jurnal Penelitian Rumpun Ilmu Teknik 2, no. 2 (2023): 139–59. http://dx.doi.org/10.55606/juprit.v2i2.1931.

Abstract:
Indonesia is a nation with a diversity of languages and cultures. The regions of Indonesia have several different languages as a medium for communication. Language has an important value in the daily life of today's society because it is a means of communication. Language also carries self-identity and culture, so preserving regional languages is important. This research aims to design a mobile-based Mongondow language recognition application that is easy for Android users to use anywhere and anytime. The development method used for designing this application is Rapid Application Development (RAD). The application is designed and developed using the Kodular website, with drag-and-drop block programming features that require no hand-written code. Based on the research carried out, an application for introducing the Mongondow language will help the general public and young people, especially those in Bolaang Mongondow, learn the local language. The community can also prevent the loss of the mother tongue in their area.
33

Tridarma, Panggih, and Sukmawati Nur Endah. "Pengenalan Ucapan Bahasa Indonesia Menggunakan MFCC dan Recurrent Neural Network." JURNAL MASYARAKAT INFORMATIKA 11, no. 2 (2020): 36–44. http://dx.doi.org/10.14710/jmasif.11.2.34874.

Abstract:
Speech recognition is a technological development in the field of audio. It enables software to recognize words spoken by humans and display them as text. However, problems remain in recognizing spoken words, such as differences in voice characteristics, age, health, and gender. This study discusses Indonesian speech recognition using Mel-Frequency Cepstral Coefficients (MFCC) for feature extraction and a Recurrent Neural Network (RNN) for recognition, comparing the Elman RNN and Jordan RNN architectures. The data were split into training and test sets using k-fold cross validation with k=5. The results show that the Elman RNN, with 900 hidden neurons, a target error of 0.0005, a learning rate of 0.01, a maximum of 10000 epochs, and 20 MFCC coefficients, achieved its best accuracy of 72.65%. The Jordan RNN, with 500 hidden neurons, a target error of 0.0005, a learning rate of 0.01, a maximum of 10000 epochs, and 12 MFCC coefficients, achieved its best accuracy of 73.55%. Based on these results, the Jordan RNN architecture performs better than the Elman RNN architecture in recognizing continuous Indonesian speech.
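
The k-fold cross validation with k=5 mentioned in this abstract can be sketched as an index split in which every sample is held out exactly once. The function name `k_fold_indices` and the sample count of 23 are illustrative assumptions; the paper does not state its dataset size here, and real splits are usually shuffled first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs; each sample appears in the
    test set of exactly one fold."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical dataset of 23 utterances split into 5 folds.
folds = list(k_fold_indices(23, k=5))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

A model would be trained and scored once per fold, and the five accuracies averaged, so the reported figure does not depend on a single train/test partition.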
34

Salamun, Sukri, Khairul Amin, Luluk Elvitaria, and Liza Trisnawati. "Artificial Intelligence Automatic Speech Recognition (ASR) untuk pencarian potongan ayat Al-Qu’ran." Jurnal Komputer Terapan, Vol. 8 No. 1 (2022) (May 31, 2022): 36–45. http://dx.doi.org/10.35143/jkt.v8i1.5299.

Abstract:
Indonesia is the country with the largest Muslim population in the world, so recitation of verses of the Qur’an is often heard in public places such as mosques and prayer rooms (musholla), and at various events. Automatic Speech Recognition (ASR) is used here for word recognition in order to identify the Qur’anic verses being recited, to increase knowledge about the verses and related supporting information, and to serve as a means of da'wah in conveying knowledge about the verses of the Qur’an. The ASR system was designed using the Python programming language and the Django framework to display information about the recited verses in a web-based interface. This research aims to create a technique and system for entering voice commands into a machine, so that the machine can understand what a person says and carry out what it is commanded to do. The application converts voice data into text data using a speech recognition system that works automatically by pattern matching the digitized audio of spoken words against a computer model of speech patterns, producing final output in the form of text that is stored in a database.
35

Kurniawan, Tesar, Nursin Nursin, Muhamad Amin Bakrie, and Seta Samsiana. "Rancang Bangun Sistem Kendali Berbasis Googlespeech Untuk Aktivasi Peralatan Listrik Rumah." JREC (Journal of Electrical and Electronics) 5, no. 2 (2018): 83–98. http://dx.doi.org/10.33558/jrec.v5i2.459.

Abstract:
App Inventor is a software development environment for Android that makes it easy for developers to realize their ideas, one of which is an application that controls the activation of home electrical appliances by voice through a smartphone. Google Speech is used for voice recognition, which then provides input to an Arduino to control the activation of home electrical appliances such as lamps, aquarium pump motors, fans, door locks, and servo motors, using relays as drivers. This report covers testing of Google Speech recognition accuracy and testing of Bluetooth connection distance. Among the three languages tested, Google Speech was most accurate for Indonesian, followed by Javanese and lastly Sundanese. The Bluetooth connection could be operated at a maximum distance of 20 m in free space and 13 m in an obstructed space.
36

Isyanto, Haris, Henry Candra, Fadliondi Fadliondi, Riza Samsinar, and Muhammad Fauzi Nur Fajri. "Designing Device of Touchless Smart Lift using Voice Commands with Method of Speech Recognition based on the Internet of Things to Prevent the Spread of COVID-19." Journal of Electrical Technology UMY 7, no. 2 (2024): 65–75. http://dx.doi.org/10.18196/jet.v7i2.21108.

Abstract:
In Indonesia, approximately 1,734,285 people had been confirmed positive for COVID-19 as of May 14, 2021. One indicated cause of the spread of COVID-19 is passengers touching lift push-button panels. To address this problem, a touchless smart lift device was designed that uses voice commands based on speech recognition. Voice commands control hardware using the human voice and are one application of speech recognition methods. The speech recognition method is well suited to controlling a lift, so that lift users can select the intended floor via a smart speaker. In performance testing of the smart lift, the infrared temperature sensor measured 36.65°C at a distance of 5 cm. The fastest response time was 2.14 seconds. The weight sensor test registered 195.5 kg. Accuracy testing of voice commands gave the best results of 100% for the first and second floors and 95% for the third floor. It is hoped that the smart lift device will help reduce the spread of COVID-19 by removing the need to touch the lift push-button panel.
37

Althoff, Mohammad Noval, Affandy Affandy, Ardytha Luthfiarta, Mohammad Wahyu Bagus Dwi Satya, and Halizah Basiron. "Leveraging Label Preprocessing for Effective End-to-End Indonesian Automatic Speech Recognition." sinkron 9, no. 1 (2025): 55–64. https://doi.org/10.33395/sinkron.v9i1.14257.

Abstract:
This research explores the potential of improving low-resource Automatic Speech Recognition (ASR) performance by leveraging label preprocessing techniques in conjunction with the wav2vec2-large Self-Supervised Learning (SSL) model. ASR technology plays a critical role in enhancing educational accessibility for children with disabilities in Indonesia, yet its development faces challenges due to limited labeled datasets. SSL models like wav2vec 2.0 have shown promise by learning rich speech representations from raw audio with minimal labeled data. Still, their dependence on large datasets and significant computational resources limits their application in low-resource settings. This study introduces a label preprocessing technique to address these limitations, comparing three scenarios: training without preprocessing, with the proposed preprocessing method, and with an alternative method. Using only 16 hours of labeled data, the proposed preprocessing approach achieves a Word Error Rate (WER) of 15.83%, significantly outperforming the baseline scenario (33.45% WER) and the alternative preprocessing method (19.62% WER). Further training using the proposed preprocessing technique with increased epochs reduces the WER to 14.00%. These results highlight the effectiveness of label preprocessing in reducing data dependency while enhancing model performance. The findings demonstrate the feasibility of developing robust ASR models for low-resource languages, offering a scalable solution for advancing ASR technology and improving educational accessibility, particularly for underrepresented languages.
38

Widiastuti, Nur Oktaviani. "Desain Konseptual Speech Recognition di Komunikasi Pesawat untuk Mengurangi Kesalahan Komunikasi Penerbangan." WARTA ARDHIA 42, no. 3 (2017): 109. http://dx.doi.org/10.25104/wa.v42i3.244.109-116.

Abstract:
[Speech Recognition Conceptual Design in Aircraft Communications to Reduce Flight Communication Mistake] The main cause of aviation accidents is human error (55%), and one contributing factor is miscommunication. Miscommunication has been involved in some of the world's worst accidents, including the accident between Pan Am and KLM, Garuda flight 152 (the biggest accident in Indonesia), and a recent accident between Batik Air and Transnusa. In addition, the NTSC (KNKT) experienced confusion when investigating the cockpit recorder of Air Asia QZ8501. When miscommunication occurs, instructions are often hard to understand and must be repeated, which shortens the time available to act. This study develops a concept for speech recognition with a voice, sign, and text system for flight crew communication checking procedures. The method combines a literature review with system design. The result is an additional communication procedure that uses signs and text obtained through speech recognition. Speech is converted to text by speech recognition, and the text is then converted into signs. Both the signs and the text are displayed so that the pilot can see them directly. Adding signs and text alongside voice communication is expected to minimize errors, speed up decision making, and help avoid accidents.
39

Purbohadi, Dwijoko, Silvia Afriani, Nicko Rachmanio, and Arlina Dewi. "Developing Medical Virtual Teaching Assistant Based on Speech Recognition Technology." International Journal of Online and Biomedical Engineering (iJOE) 17, no. 04 (2021): 107. http://dx.doi.org/10.3991/ijoe.v17i04.21343.

Abstract:
This paper presents the results of developing the Virtual Teaching Assistant (VTA). The system is an e-learning module that serves as a learning aid for medical students currently pursuing professional medical doctor education in hospitals. In Indonesia, students in professional medical education must study and work in hospitals like an experienced doctor. They interact directly with patients and provide the same services as doctors. Every student has a professional doctor at the hospital as a mentor or companion. However, student meetings with accompanying doctors are minimal. It is not uncommon for students to encounter difficulties when dealing with patients without receiving immediate guidance. As students, it is natural that they sometimes forget the theory. These students need a fast and practical source of theoretical learning that they can use between activities. We developed VTA to meet this need for information and fast learning resources. VTA can run on a computer, laptop, or smartphone by utilizing speech recognition technology. Students only need to ask questions by speaking in their everyday language, and VTA will provide answers. Although VTA's answers are not yet satisfactory, the system has the potential to support question-and-answer-based mobile learning for particular subjects.
40

Saputra, Ramadhan Dwi, Kania Venisa Rachim, and Vicko Taniady. "Empowering Voices: Building an Electronic Petition System for Strengthening Freedom of Speech in Indonesia." Journal of Judicial Review 25, no. 1 (2023): 71. http://dx.doi.org/10.37253/jjr.v25i1.7459.

Abstract:
Currently, the issue of freedom of expression poses a significant challenge in Indonesia. Despite being a democratic nation, the scope of people's freedom of expression is largely confined to electoral processes. In order to advance this fundamental right, the implementation of an electronic petition system has been undertaken as a means to facilitate the exercise of freedom of expression. The primary objective of this research is to examine the status quo of freedom of expression in Indonesia and to analyze the pressing need for the adoption of an electronic petition system. This study employs a normative legal approach and conducts comparative analysis with the United Kingdom and Germany, utilizing secondary data sources. The findings of this research demonstrate that Indonesia would greatly benefit from the adoption of an e-democracy system through the implementation of an electronic petition system. The efficacy of such a system has been successfully demonstrated in the United Kingdom and Germany, where it has served as an effective intermediary between the public and the government, ensuring sustained public participation and influencing governmental decision-making processes. In order to implement the electronic petition system in Indonesia, several crucial steps must be undertaken. These steps include the establishment of a Petition Committee, the formulation of Petition Laws, and the official recognition of a dedicated website serving as the electronic petition platform in Indonesia. Additionally, political will and legislative enforcement will be required to ensure the Indonesian Parliament's commitment to act upon the outcomes of these petitions.
41

Hidayat, Syahroni, Muhammad Tajuddin, Siti Agrippina Alodia Yusuf, Jihadil Qudsi, and Nenet Natasudian Jaya. "WAVELET DETAIL COEFFICIENT AS A NOVEL WAVELET-MFCC FEATURES IN TEXT-DEPENDENT SPEAKER RECOGNITION SYSTEM." IIUM Engineering Journal 23, no. 1 (2022): 68–81. http://dx.doi.org/10.31436/iiumej.v23i1.1760.

Abstract:
Speaker recognition is the process of recognizing a speaker from his or her speech. It can be used in many aspects of life, such as remotely accessing a personal device, securing access to voice control, and carrying out forensic investigations. In speaker recognition, extracting features from the speech is the most critical process. The features represent the speech in a way that distinguishes speech samples from one another. In this research, we propose a combination of Wavelet and Mel Frequency Cepstral Coefficient (MFCC), Wavelet-MFCC, as the feature extraction method, and a Hidden Markov Model (HMM) as the classifier. The speech signal is first decomposed using Wavelet into one level of decomposition, then only the sub-band detail coefficient is used as the input for further feature extraction using MFCC. The modeled system was applied to a dataset of 300 speech samples from 30 speakers uttering "HADIR" in the Indonesian language. K-fold cross-validation was implemented with five folds. For each fold, 80% of the data was used for training and the rest for testing. Based on the testing, the accuracy of the system using the Wavelet-MFCC combination is 96.67%.
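The five-fold protocol described in this abstract (300 utterances, 80% for training and 20% for testing in each fold) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the helper name `k_fold_indices`, the random shuffling, and the fixed seed are all hypothetical, and the abstract does not say whether folds were stratified by speaker.

```python
import random


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: shuffle before splitting

    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]

    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 300 utterances, five folds: each fold trains on 240 and tests on 60.
for fold_number, (train, test) in enumerate(k_fold_indices(300, k=5), start=1):
    print(fold_number, len(train), len(test))
```

Every utterance lands in the test set exactly once, so a figure averaged over the five folds covers the whole dataset.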
42

Intan, Nurma Yulita, Hidayat Akik, Setiawan Abdullah Atje, and Maulana Awangga Rolly. "Feature Extraction Analysis for Hidden Markov Models in Sundanese Speech Recognition." TELKOMNIKA Telecommunication, Computing, Electronics and Control 16, no. 5 (2018): 2191–98. https://doi.org/10.12928/TELKOMNIKA.v16i5.7927.

Abstract:
Sundanese is one of the popular languages in Indonesia, so research on the Sundanese language is important, which is the motivation for this study. The key components for achieving high recognition accuracy are feature extraction and the classifier. The main goal of this study was to analyze the former. Three feature extraction methods were tested: Linear Predictive Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC), and Human Factor Cepstral Coefficients (HFCC). The outputs of the three methods were used as input to the classifier, for which the study applied Hidden Markov Models. Before classification, the features had to be quantized, and in this study quantization was based on clustering. Each result was compared across the number of clusters and hidden states used. The dataset came from four people who spoke the digits zero to nine 60 times. The results showed that all three feature extraction methods produced the same performance on the corpus used.
43

Kusuma, Mandahadi, and Fayyadh Aunilbarr. "Indonesian Word Sound Recognition Using Convolutional Neural Network Method." JOIV : International Journal on Informatics Visualization 9, no. 2 (2025): 789. https://doi.org/10.62527/joiv.9.2.2679.

Abstract:
Access to education, particularly in a university environment, is essential for deaf and hard-of-hearing students as more of them pursue higher education. At UIN Sunan Kalijaga, the current challenges are a limited number of sign language interpreters and the difficulty of translating technical terminology in lectures. Many methods are available for speech recognition, but research on how well they perform in Indonesian has not been published, especially for education-level recognizers. This experimental study investigates whether Indonesian words can be recognized with Convolutional Neural Networks (CNN) and which ratio of training, validation, and test data gives the best performance. The study used a dataset of 4 Indonesian words, each with 50 voice samples from young adults aged 19-23. Audio data is preprocessed into spectrograms, which are fed to a CNN model built with TensorFlow. The CNN model reached 90% accuracy with a 60:20:20 ratio between training, validation, and test data. The other ratios (70:15:15 and 80:10:10) gave accuracies between 80% and 90%. The study concludes that CNNs are the best for Indonesian word recognition and that the 60:20:20 data ratio is optimal. This result has practical benefits, such as using voice-to-text in lectures to make learning easier in Indonesia. Further studies should be conducted using different neural network approaches; a denoising approach is also needed to increase accuracy.
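The three data ratios compared in this abstract can be illustrated with a small splitting helper. This is a sketch under stated assumptions: the total of 200 samples is inferred from 4 words with 50 samples each, and the function name `split_dataset`, the shuffling, and the seed are hypothetical rather than taken from the paper.

```python
import random


def split_dataset(items, ratios=(60, 20, 20), seed=0):
    """Shuffle items and split them into train, validation, and test lists.

    ratios are integer percentages that must sum to 100.
    """
    if sum(ratios) != 100:
        raise ValueError("ratios must sum to 100")

    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = n * ratios[0] // 100
    n_val = n * ratios[1] // 100

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Assumed dataset size: 4 words x 50 samples = 200 samples.
samples = list(range(200))
for ratios in [(60, 20, 20), (70, 15, 15), (80, 10, 10)]:
    train, val, test = split_dataset(samples, ratios)
    print(ratios, len(train), len(val), len(test))
```

With 200 samples, a 60:20:20 split leaves only 40 test samples, so differences of a few percentage points between the ratios correspond to a handful of utterances.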
44

Dwijayanti, Suci, Alvio Yunita Putri, and Bhakti Yudho Suprapto. "Speaker Identification Using a Convolutional Neural Network." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 6, no. 1 (2022): 140–45. http://dx.doi.org/10.29207/resti.v6i1.3795.

Abstract:
Speech, a mode of communication between humans and machines, has various applications, including biometric systems for identifying people who have access to secure systems. Feature extraction is an important factor in achieving high accuracy in speech recognition. Therefore, we implemented a spectrogram, which is a pictorial representation of speech in terms of raw features, to identify speakers. These features were inputted into a convolutional neural network (CNN), and a CNN-visual geometry group (CNN-VGG) architecture was used to recognize the speakers. We used 780 primary data samples from 78 speakers, each of whom uttered a number in Bahasa Indonesia. The proposed architecture, CNN-VGG-f, uses a learning rate of 0.001, a batch size of 256, and 100 epochs. The results indicate that this architecture can generate a suitable model for speaker identification. A spectrogram was used to determine the best features for identifying the speakers. The proposed method exhibited an accuracy of 98.78%, which is significantly higher than the accuracies of the method involving Mel-frequency cepstral coefficients (MFCCs; 34.62%) and the combination of MFCCs and deltas (26.92%). Overall, CNN-VGG-f with the spectrogram can identify 77 speakers from the samples, validating the usefulness of the combination of spectrograms and CNN in speech recognition applications.
45

Nurrizqy, Irfan Muzakky, Barlian Henryranu Prasetio, and Rekyan Regasari Mardi Putri. "Sistem Kontrol Perangkat Inframerah Menggunakan Speech Recognition dengan Spectrogram dan Convolutional Neural Network Berbasis Mikrokontroler." Jurnal Teknologi Informasi dan Ilmu Komputer 10, no. 5 (2023): 955–62. http://dx.doi.org/10.25126/jtiik.20231056909.

Abstract:
According to data from the Central Bureau of Statistics (BPS), around 22.5 million people in Indonesia have disabilities, about five percent of the country's total population. Technology is advancing rapidly around the world, producing many things that can simplify everyday life, especially for people with disabilities. One example is smart devices that can be controlled without the hands, for instance by voice. This research aims to develop a system that controls infrared devices using voice as input. The system is built with a microcontroller and a speech recognition method consisting of a spectrogram and a CNN. The goal is to help people with disabilities control devices around the house. Test results show a CNN model accuracy of 93% and an accuracy of 74.25% in trials with users. The system runs the speech recognition process in 0.105 seconds on average. The optimal distance between the user and the microphone is 30 cm, and the optimal distance between the infrared transmitter and the controlled device is 30 cm.
46

Nurrizqy, Irfan Muzakky, Barlian Henryranu Prasetio, and Rekyan Regasari Mardi Putri. "Sistem Kontrol Perangkat Inframerah Menggunakan Speech Recognition dengan Spectrogram dan Convolutional Neural Network Berbasis Mikrokontroler." Jurnal Teknologi Informasi dan Ilmu Komputer 10, no. 5 (2023): 955–62. https://doi.org/10.25126/jtiik.2023106909.

47

Ashar, Ahmad Kharisudin, Munawarah, and Agus Sifaunajah. "PENERAPAN GAME EDUKASI “SPEAK ENGLISH” PADA SEKOLAH DASAR MENGGUNAKAN TEKNOLOGI SPEECH RECOGNITION." SAINTEKBU 11, no. 2 (2019): 8–20. http://dx.doi.org/10.32764/saintekbu.v11i2.218.

Abstract:
English is a language of communication that plays a major role in connecting countries. Communication can be spoken or written so that information is conveyed without misunderstanding. People from many countries are motivated to learn English as a way to broaden their knowledge and keep up with the times. In Indonesia, English is a compulsory foreign language, taught from elementary school through university. In practice, however, teaching it still faces many problems, particularly at the elementary school level. Students struggle to learn a foreign or new language that is used only during classroom lessons, and individual students differ in how well they pay attention. Learning methods also need to be considered, because many students still make little use of learning media, especially media delivered on Android smartphones. This study aims to build an educational game for learning English at the elementary school level using speech recognition technology. The method used is the System Development Life Cycle (SDLC) with the Waterfall model. Data were collected through a literature study and observation at MI AL IHSAN Jombang. The results show that the educational game helps students learn more effectively and encourages them to study English.
48

Junining, Esti, Sony Alif, and Nuria Setiarini. "Automatic speech recognition in computer-assisted language learning for individual learning in speaking." JEES (Journal of English Educators Society) 5, no. 2 (2020): 193–97. http://dx.doi.org/10.21070/jees.v5i2.867.

Abstract:
This study is intended to help English as a Foreign Language (EFL) learners in Indonesia to reduce their anxiety level while speaking in front of other people. This study helps to develop an atmosphere that encourages students to practice speaking independently. The interesting atmosphere can be obtained by using Automatic Speech Recognition (ASR) where every student can practice speaking individually without feeling anxious or pressurized, because he/she can practice independently in front of a computer or a gadget. This study used research and development design as it tried to develop a product which can create an atmosphere that encourages students to practice their speaking. The instrument used is a questionnaire which is used to analyze the students’ need of learning English. This study developed a product which utilized ASR technology using C# programming language. This study revealed that the product developed using ASR can make students practice speaking individually without feeling anxious and pressurized.
49

Lailatul Hasanah and Budi Yuniarto. "Early Study of LLM Implementation in Survey Interviews." Jurnal Aplikasi Statistika & Komputasi Statistik 17, no. 1 (2025): 12–22. https://doi.org/10.34123/jurnalasks.v17i1.792.

Abstract:
Introduction/Main Objectives: This research aims to conduct a preliminary study into the use of LLMs for extracting information to fill out questionnaires in survey interviews. Background Problems: BPS-Statistics Indonesia used paper-based questionnaires for interviews and has recently adopted the Computer Assisted Personal Interviewing (CAPI) method. However, the CAPI method has some drawbacks. Enumerators must input data into the device, which can be burdensome and prone to errors. Novelty: This study uses a large language model (LLM) to extract information from survey interviews. Research Methods: This study uses a speech-to-text application to transcribe interview recordings into text. Transcription accuracy is measured by the Word Error Rate (WER). Information is then extracted from the text using the ChatGPT 3.5 Turbo model. GPT-3.5 Turbo is part of the GPT family of models developed by OpenAI. Finding/Results: The extraction results are formatted into a JSON file, which is intended to be used for automatic filling into the database, and then evaluated using precision, recall, and F1-score. Using the Speech Recognition API by Google and the ChatGPT 3.5 Turbo model, the study obtained an average WER of 10% in speech recognition and an average accuracy of 76.16% in automatic data extraction.
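The precision, recall, and F1-score evaluation mentioned in this abstract can be illustrated by comparing extracted questionnaire fields against a hand-checked answer. This is only a sketch: the abstract does not describe how the authors matched fields, so exact key-and-value matching is an assumption, and the function name `precision_recall_f1` and the sample fields are hypothetical.

```python
def precision_recall_f1(predicted: dict, expected: dict):
    """Score extracted fields against reference fields by exact match."""
    # A true positive is a field whose key and value both match the reference.
    true_positives = sum(
        1 for key, value in predicted.items()
        if key in expected and expected[key] == value
    )
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical extraction result and reference answer for one interview.
predicted = {"name": "Budi", "age": "34", "city": "Bandung"}
expected = {"name": "Budi", "age": "35", "city": "Bandung", "job": "guru"}

precision, recall, f1 = precision_recall_f1(predicted, expected)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Here two of the three extracted fields are correct and two of the four reference fields are recovered, so precision is 2/3, recall is 1/2, and F1 is 4/7.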
50

Arnapi, Karnaji, Izzah Khalif Raihan Abidin, and Rofadan Mina Arsyada. "Paradigma Hukum Kedudukan Kepolisian Negara Republik Indonesia Dalam Pengamanan Aksi Unjuk Rasa." Media Iuris 7, no. 1 (2024): 31–50. http://dx.doi.org/10.20473/mi.v7i1.43709.

Abstract:
The law will continue to develop to achieve justice. In social life, the Indonesian National Police plays an essential role in ensuring security and order, including during demonstrations by certain groups of people. Indonesia, as a democratic country, absolutely guarantees the existence of public opinion, namely freedom of speech, freedom of the press, intellectual freedom, and freedom of religion, as stated in Article 28E paragraph (3) of the 1945 Constitution of the Republic of Indonesia and detailed in other laws and regulations. Starting from the issue of violence in demonstrations, this article discusses the legal urgency of granting constitutional rights to demonstrations, and the status and criminal responsibility of the Indonesian National Police in maintaining security and state order. The inconsistency in the recognition of constitutional rights at demonstrations can be seen in the criminal sanctions for members of the Indonesian National Police who violate the law, which have not been applied strictly and proportionately. Keywords: Rally; Constitutional Rights; Human Rights; Criminal Liability; The Indonesian National Police.