A collection of scholarly literature on the topic "Low resource language"

Format your source in APA, MLA, Chicago, Harvard, and other citation styles

Select a source type:

Browse lists of current articles, books, dissertations, theses, and other scholarly sources on the topic "Low resource language".

Next to each work in the list there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the selected work in your preferred citation style: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of a publication as a .pdf file and read its abstract online, where these are available in the item's metadata.

Journal articles on the topic "Low resource language"

1

Pakray, Partha, Alexander Gelbukh, and Sivaji Bandyopadhyay. "Natural language processing applications for low-resource languages." Natural Language Processing 31, no. 2 (2025): 183–97. https://doi.org/10.1017/nlp.2024.33.

Full text of the source
Abstract:
Natural language processing (NLP) has significantly advanced our ability to model and interact with human language through technology. However, these advancements have disproportionately benefited high-resource languages with abundant data for training complex models. Low-resource languages, often spoken by smaller or marginalized communities, struggle to realize the full potential of NLP applications. The primary challenges in developing NLP applications for low-resource languages stem from the scarcity of large, well-annotated datasets, standardized tools, and linguistic resources. This scarcity hinders the performance of the data-driven approaches that have excelled in high-resource settings. Further, low-resource languages frequently exhibit complex grammatical structures, diverse vocabularies, and unique social contexts, which pose additional challenges for standard NLP techniques. Innovative strategies are emerging to address these challenges. Researchers are actively collecting and curating datasets, even utilizing community engagement platforms to expand data resources. Transfer learning, where models pre-trained on high-resource languages are adapted to low-resource settings, has shown significant promise. Multilingual models such as Multilingual Bidirectional Encoder Representations from Transformers (mBERT) and XLM-RoBERTa (XLM-R), trained on vast quantities of multilingual data, offer a powerful avenue for cross-lingual knowledge transfer. Additionally, researchers are exploring multimodal approaches, combining textual data with images, audio, or video, to enhance NLP performance in low-resource language scenarios.
This survey covers applications like part-of-speech tagging, morphological analysis, sentiment analysis, hate speech detection, dependency parsing, language identification, discourse annotation guidelines, question answering, machine translation, information retrieval, and predictive authoring for augmentative and alternative communication systems. The review also highlights machine learning approaches, deep learning approaches, Transformers, and cross-lingual transfer learning as practical techniques. Developing practical NLP applications for low-resource languages is crucial for preserving linguistic diversity, fostering inclusion within the digital world, and expanding our understanding of human language. While challenges remain, the strategies outlined in this survey demonstrate the ongoing progress and highlight the potential for NLP to empower communities that speak low-resource languages and contribute to a more equitable landscape within language technology.
Citation styles: APA, Harvard, Vancouver, ISO, etc.
2

Lin, Donghui, Yohei Murakami, and Toru Ishida. "Towards Language Service Creation and Customization for Low-Resource Languages." Information 11, no. 2 (2020): 67. http://dx.doi.org/10.3390/info11020067.

Full text of the source
Abstract:
The most challenging issue with low-resource languages is the difficulty of obtaining enough language resources. In this paper, we propose a language service framework for low-resource languages that enables the automatic creation and customization of new resources from existing ones. To achieve this goal, we first introduce a service-oriented language infrastructure, the Language Grid; it realizes new language services by supporting the sharing and combining of language resources. We then show the applicability of the Language Grid to low-resource languages. Furthermore, we describe how we can now realize the automation and customization of language services. Finally, we illustrate our design concept by detailing a case study of automating and customizing bilingual dictionary induction for low-resource Turkic languages and Indonesian ethnic languages.
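The case study above concerns bilingual dictionary induction. A rough illustration of the pivot-based idea commonly used for such induction (not the Language Grid implementation; all dictionary entries below are invented toy data): compose a source-to-pivot and a pivot-to-target dictionary, then keep only pairs confirmed by the reverse direction.

```python
# Toy sketch of pivot-based bilingual dictionary induction. Hypothetical data,
# not an actual Language Grid service. Compose src->pivot and pivot->tgt
# dictionaries, then keep only pairs confirmed by a round trip through the
# pivot (a common heuristic for filtering wrong transitive translations).

def induce(src_pivot, pivot_tgt, tgt_pivot):
    """Return src->tgt pairs supported by a round trip through the pivot."""
    induced = {}
    for src, pivots in src_pivot.items():
        candidates = set()
        for p in pivots:
            candidates.update(pivot_tgt.get(p, ()))
        # One-to-many composition overgenerates; keep a candidate only if
        # translating it back reaches at least one shared pivot word.
        confirmed = {t for t in candidates
                     if set(tgt_pivot.get(t, ())) & set(pivots)}
        if confirmed:
            induced[src] = sorted(confirmed)
    return induced

# Invented Turkic-English-Indonesian entries, purely for illustration.
src_pivot = {"su": ["water"]}                 # source word -> English pivot
pivot_tgt = {"water": ["air", "banyu"]}       # English -> target candidates
tgt_pivot = {"air": ["water"], "banyu": []}   # reverse dictionary for filtering

print(induce(src_pivot, pivot_tgt, tgt_pivot))  # → {'su': ['air']}
```

The round-trip check is what distinguishes this from naive composition, which would also emit the unconfirmed pair ("su", "banyu").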
Citation styles: APA, Harvard, Vancouver, ISO, etc.
3

Ranasinghe, Tharindu, and Marcos Zampieri. "Multilingual Offensive Language Identification for Low-resource Languages." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 1 (2022): 1–13. http://dx.doi.org/10.1145/3457610.

Full text of the source
Abstract:
Offensive content is pervasive in social media and a reason for concern to companies and government organizations. Several studies have recently been published investigating methods to detect the various forms of such content (e.g., hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English, partially because most available annotated datasets contain English data. In this article, we take advantage of available English datasets by applying cross-lingual contextual word embeddings and transfer learning to make predictions in low-resource languages. We project predictions on comparable data in Arabic, Bengali, Danish, Greek, Hindi, Spanish, and Turkish. We report results of 0.8415 F1 macro for Bengali in the TRAC-2 shared task [23], 0.8532 F1 macro for Danish and 0.8701 F1 macro for Greek in OffensEval 2020 [58], 0.8568 F1 macro for Hindi in the HASOC 2019 shared task [27], and 0.7513 F1 macro for Spanish in SemEval-2019 Task 5 (HatEval) [7], showing that our approach compares favorably to the best systems submitted to recent shared tasks on these languages. Additionally, we report competitive performance on Arabic and Turkish using the training and development sets of the OffensEval 2020 shared task. The results for all languages confirm the robustness of cross-lingual contextual embeddings and transfer learning for this task.
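The F1 macro scores quoted above average per-class F1 without weighting by class frequency. A from-scratch sketch of the metric with toy labels (this is not the shared-task scorer, just the standard definition):

```python
# Minimal macro-averaged F1, computed from scratch for a toy binary
# offensive/not-offensive labeling. Illustrative only; evaluations in
# practice use library implementations of the same formula.

def macro_f1(gold, pred):
    labels = sorted(set(gold) | set(pred))
    f1s = []
    for lab in labels:
        tp = sum(g == lab and p == lab for g, p in zip(gold, pred))
        fp = sum(g != lab and p == lab for g, p in zip(gold, pred))
        fn = sum(g == lab and p != lab for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)  # unweighted mean over classes

gold = ["OFF", "NOT", "NOT", "OFF", "NOT"]
pred = ["OFF", "NOT", "OFF", "OFF", "NOT"]
print(round(macro_f1(gold, pred), 4))  # → 0.8
```

Because every class contributes equally, macro F1 penalizes systems that ignore a rare class, which matters for skewed offensive-language datasets.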
Citation styles: APA, Harvard, Vancouver, ISO, etc.
4

Cassano, Federico, John Gouwar, Francesca Lucchetti, et al. "Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs." Proceedings of the ACM on Programming Languages 8, OOPSLA2 (2024): 677–708. http://dx.doi.org/10.1145/3689735.

Full text of the source
Abstract:
Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. Code LLMs are also emerging as building blocks for research in programming languages and software engineering. However, the quality of code produced by a Code LLM varies significantly by programming language. Code LLMs produce impressive results on high-resource programming languages that are well represented in their training data (e.g., Java, Python, or JavaScript), but struggle with low-resource languages that have limited training data available (e.g., OCaml, Racket, and several others). This paper presents an effective approach for boosting the performance of Code LLMs on low-resource languages using semi-synthetic data. Our approach, called MultiPL-T, generates high-quality datasets for low-resource languages, which can then be used to fine-tune any pretrained Code LLM. MultiPL-T translates training data from high-resource languages into training data for low-resource languages in the following way. 1) We use a Code LLM to synthesize unit tests for commented code from a high-resource source language, filtering out faulty tests and code with low test coverage. 2) We use a Code LLM to translate the code from the high-resource source language to a target low-resource language. This gives us a corpus of candidate training data in the target language, but many of these translations are wrong. 3) We use a lightweight compiler to compile the test cases generated in (1) from the source language to the target language, which allows us to filter out obviously wrong translations. The result is a training corpus in the target low-resource language where all items have been validated with test cases. We apply this approach to generate tens of thousands of new, validated training items for five low-resource languages: Julia, Lua, OCaml, R, and Racket, using Python as the source high-resource language.
Furthermore, we use an open Code LLM (StarCoderBase) with open training data (The Stack), which allows us to decontaminate benchmarks, train models without violating licenses, and run experiments that could not otherwise be done. Using datasets generated with MultiPL-T, we present fine-tuned versions of StarCoderBase and Code Llama for Julia, Lua, OCaml, R, and Racket that outperform other fine-tunes of these base models on the natural language to code task. We also present Racket fine-tunes for two very recent models, DeepSeek Coder and StarCoder2, to show that MultiPL-T continues to outperform other fine-tuning approaches for low-resource languages. The MultiPL-T approach is easy to apply to new languages, and is significantly more efficient and effective than alternatives such as training longer.
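The filtering step at the heart of MultiPL-T, keeping only translation candidates that pass the carried-over tests, can be sketched with stand-in functions in place of real LLM calls (a hypothetical example, not the authors' code):

```python
# Sketch of test-based filtering of translation candidates, in the spirit of
# MultiPL-T. The "candidates" here are ordinary Python callables standing in
# for LLM-translated programs; all examples are invented.

def filter_validated(candidates, tests):
    """Keep candidate functions that pass every carried-over test case."""
    validated = []
    for fn in candidates:
        try:
            ok = all(fn(x) == expected for x, expected in tests)
        except Exception:
            ok = False  # crashing translations are discarded too
        if ok:
            validated.append(fn)
    return validated

# Hypothetical candidates "translated" for a clamp-to-non-negative function.
tests = [(3, 3), (-2, 0)]
good = lambda x: x if x > 0 else 0
wrong = lambda x: abs(x)          # passes (3, 3) but fails (-2, 0)

print(len(filter_validated([good, wrong], tests)))  # → 1
```

The surviving candidates form the validated fine-tuning corpus; everything else, including translations that merely look plausible, is dropped.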
Citation styles: APA, Harvard, Vancouver, ISO, etc.
5

Abigail Rai. "Part-of-Speech (POS) Tagging of Low-Resource Language (Limbu) with Deep learning." Panamerican Mathematical Journal 35, no. 1s (2024): 149–57. http://dx.doi.org/10.52783/pmj.v35.i1s.2297.

Full text of the source
Abstract:
POS tagging is a basic Natural Language Processing (NLP) task that labels the words in an input text with their grammatical categories. Although POS tagging is a fundamental application for well-resourced languages, for languages such as Limbu it remains largely unexplored, owing to the scarcity of tagged datasets and linguistic resources. This research uses deep learning techniques, transfer learning, and the BiLSTM-CRF model to develop an accurate POS-tagging system for the Limbu language. Using annotated and unannotated language data, we build a small yet informative dataset of Limbu text. Pre-trained multilingual models were adapted to improve performance in this low-resource setting. The proposed model attains 90% accuracy, substantially outperforming traditional rule-based and machine learning methods for Limbu POS tagging. The results indicate that deep learning methods can address the linguistic issues facing low-resource languages even with limited data. In turn, this study provides a cornerstone for follow-up NLP applications for Limbu and similar low-resource languages, demonstrating how deep learning can fill the gap where data is scarce.
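In a BiLSTM-CRF tagger, the CRF layer selects the highest-scoring tag sequence with Viterbi decoding. A minimal sketch of that decoding step, with made-up emission and transition scores standing in for the learned BiLSTM and CRF weights:

```python
# Minimal Viterbi decoding, the inference step of a (Bi)LSTM-CRF tagger.
# Emission and transition scores here are toy numbers, not learned weights.

def viterbi(emissions, transitions, tags):
    """emissions: per-token {tag: score}; transitions: {(prev, cur): score}."""
    best = {t: emissions[0][t] for t in tags}   # scores for the first token
    back = []                                   # backpointers per position
    for em in emissions[1:]:
        prev_best, best, ptr = best, {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: prev_best[p] + transitions[(p, cur)])
            best[cur] = prev_best[prev] + transitions[(prev, cur)] + em[cur]
            ptr[cur] = prev
        back.append(ptr)
    last = max(tags, key=best.get)
    path = [last]
    for ptr in reversed(back):                  # follow backpointers home
        path.append(ptr[path[-1]])
    return list(reversed(path))

tags = ("NOUN", "VERB")
emissions = [{"NOUN": 2.0, "VERB": 0.5},
             {"NOUN": 0.4, "VERB": 1.5}]
transitions = {("NOUN", "NOUN"): -0.5, ("NOUN", "VERB"): 1.0,
               ("VERB", "NOUN"): 0.8, ("VERB", "VERB"): -1.0}

print(viterbi(emissions, transitions, tags))  # → ['NOUN', 'VERB']
```

Unlike greedy per-token tagging, the transition scores let the decoder enforce sequence-level constraints (e.g. penalizing unlikely tag bigrams), which is the CRF layer's contribution over a plain BiLSTM classifier.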
Citation styles: APA, Harvard, Vancouver, ISO, etc.
6

Nitu, Melania, and Mihai Dascalu. "Natural Language Processing Tools for Romanian – Going Beyond a Low-Resource Language." Interaction Design and Architecture(s), no. 60 (March 15, 2024): 7–26. http://dx.doi.org/10.55612/s-5002-060-001sp.

Full text of the source
Abstract:
Advances in Natural Language Processing bring innovative instruments to the educational field to improve the quality of the didactic process by addressing challenges like language barriers and creating personalized learning experiences. Most research in the domain is dedicated to high-resource languages, such as English, while languages with limited coverage, like Romanian, are still underrepresented in the field. Operating on low-resource languages is essential to ensure equitable access to educational opportunities and to preserve linguistic diversity. Through continuous investments in developing Romanian educational instruments, we are rapidly going beyond a low-resource language. This paper presents recent educational instruments and frameworks dedicated to Romanian, leveraging state-of-the-art NLP techniques, such as building advanced Romanian language models and benchmarks encompassing tools for language learning, text comprehension, question answering, automatic essay scoring, and information retrieval. The methods and insights gained are transferable to other low-resource languages, emphasizing methodological adaptability, collaborative frameworks, and technology transfer to address similar challenges in diverse linguistic contexts. Two use cases are presented, focusing on assessing student performance in Moodle courses and extracting main ideas from students’ feedback. These practical applications in Romanian academic settings serve as examples for enhancing educational practices in other less-resourced languages.
Citation styles: APA, Harvard, Vancouver, ISO, etc.
7

Zhou, Shuyan, Shruti Rijhwani, John Wieting, Jaime Carbonell, and Graham Neubig. "Improving Candidate Generation for Low-resource Cross-lingual Entity Linking." Transactions of the Association for Computational Linguistics 8 (July 2020): 109–24. http://dx.doi.org/10.1162/tacl_a_00303.

Full text of the source
Abstract:
Cross-lingual entity linking (XEL) is the task of finding referents in a target-language knowledge base (KB) for mentions extracted from source-language texts. The first step of (X)EL is candidate generation, which retrieves a list of plausible candidate entities from the target-language KB for each mention. Approaches based on resources from Wikipedia have proven successful in the realm of relatively high-resource languages, but these do not extend well to low-resource languages with few, if any, Wikipedia pages. Recently, transfer learning methods have been shown to reduce the demand for resources in the low-resource languages by utilizing resources in closely related languages, but the performance still lags far behind their high-resource counterparts. In this paper, we first assess the problems faced by current entity candidate generation methods for low-resource XEL, then propose three improvements that (1) reduce the disconnect between entity mentions and KB entries, and (2) improve the robustness of the model to low-resource scenarios. The methods are simple, but effective: We experiment with our approach on seven XEL datasets and find that they yield an average gain of 16.9% in Top-30 gold candidate recall, compared with state-of-the-art baselines. Our improved model also yields an average gain of 7.9% in in-KB accuracy of end-to-end XEL.
Citation styles: APA, Harvard, Vancouver, ISO, etc.
8

Vargas, Francielle, Wolfgang Schmeisser-Nieto, Zohar Rabinovich, Thiago A. S. Pardo, and Fabrício Benevenuto. "Discourse annotation guideline for low-resource languages." Natural Language Processing 31, no. 2 (2025): 700–743. https://doi.org/10.1017/nlp.2024.19.

Full text of the source
Abstract:
Most existing discourse annotation guidelines have focused on the English language. As a result, there is a significant lack of research and resources concerning computational discourse-level language understanding and generation for other languages. To fill this relevant gap, we introduce the first discourse annotation guideline using rhetorical structure theory (RST) for low-resource languages. Specifically, this guideline provides accurate examples of discourse coherence relations in three Romance languages: Italian, Portuguese, and Spanish. We further discuss theoretical definitions of RST and compare different artificial intelligence discourse frameworks, hence offering a reliable and accessible survey to new researchers and annotators.
Citation styles: APA, Harvard, Vancouver, ISO, etc.
9

Li, Zihao, Yucheng Shi, Zirui Liu, et al. "Language Ranker: A Metric for Quantifying LLM Performance Across High and Low-Resource Languages." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 27 (2025): 28186–94. https://doi.org/10.1609/aaai.v39i27.35038.

Full text of the source
Abstract:
The development of Large Language Models (LLMs) relies on extensive text corpora, which are often unevenly distributed across languages. This imbalance results in LLMs performing significantly better on high-resource languages like English, German, and French, while their capabilities in low-resource languages remain inadequate. Currently, there is a lack of quantitative methods to evaluate the performance of LLMs in these low-resource languages. To address this gap, we propose the Language Ranker, an intrinsic metric designed to benchmark and rank languages based on LLM performance using internal representations. By comparing the LLM's internal representation of various languages against a baseline derived from English, we can assess the model's multilingual capabilities in a robust and language-agnostic manner. Our analysis reveals that high-resource languages exhibit higher similarity scores with English, demonstrating superior performance, while low-resource languages show lower similarity scores, underscoring the effectiveness of our metric in assessing language-specific capabilities. Besides, the experiments show that there is a strong correlation between the LLM’s performance in different languages and the proportion of those languages in its pre-training corpus. These insights underscore the efficacy of the Language Ranker as a tool for evaluating LLM performance across different languages, particularly those with limited resources.
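The core idea, comparing each language's internal representation to an English baseline, can be sketched with cosine similarity over stand-in vectors (the numbers below are invented, not actual model activations, and the function names are ours):

```python
# Sketch of a Language Ranker-style metric: score each language by the
# cosine similarity between its mean-pooled representation and the English
# baseline, then rank. Vectors are made-up stand-ins for hidden states.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank_languages(reps, baseline="en"):
    """Rank languages by representation similarity to the baseline language."""
    base = reps[baseline]
    scores = {lang: cosine(vec, base)
              for lang, vec in reps.items() if lang != baseline}
    return sorted(scores.items(), key=lambda kv: -kv[1])

reps = {  # hypothetical mean-pooled hidden states
    "en": [1.0, 0.0, 1.0],
    "de": [0.9, 0.1, 0.8],   # high-resource: close to English
    "xh": [0.2, 1.0, 0.1],   # low-resource: farther away
}

print([lang for lang, _ in rank_languages(reps)])  # → ['de', 'xh']
```

Because the score is intrinsic (computed from representations, not task accuracy), it needs no labeled evaluation data in the low-resource language, which is the point of the metric.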
Citation styles: APA, Harvard, Vancouver, ISO, etc.
10

Yusup, Azragul, Degang Chen, Yifei Ge, Hongliang Mao, and Nujian Wang. "Resource Construction and Ensemble Learning based Sentiment Analysis for the Low-resource Language Uyghur." 網際網路技術學刊 (Journal of Internet Technology) 24, no. 4 (2023): 1009–16. http://dx.doi.org/10.53106/160792642023072404018.

Full text of the source
Abstract:
To address the scarcity of low-resource sentiment analysis corpora, this paper proposes HTL, a sentence-level sentiment analysis resource conversion method based on the syntactic-semantic knowledge of the low-resource language Uyghur, which converts a high-resource corpus into a low-resource one. In the conversion process, a k-fold cross-filtering method is proposed to reduce the distortion of data samples and select high-quality samples for conversion. The resulting Uyghur sentiment analysis dataset, USD, is constructed, and its baseline is verified under an LSTM model, reaching accuracy and F1 values of 81.07% and 81.13%, respectively, which can serve as a reference for the construction of low-resource language corpora. This paper also proposes SA-LREL, a sentiment analysis model based on logistic regression ensemble learning, which combines the advantages of several lightweight network models (TextCNN, RNN, and RCNN) as base models, with a meta-model built from logistic regression functions for the ensemble. Accuracy and F1 values reach 82.17% and 81.86%, respectively, on the test set, and the experimental results show that the method can effectively improve the performance of the Uyghur sentiment analysis task.
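The ensemble design described, base-model probabilities feeding a logistic-regression meta-model, is standard stacking. A toy sketch with synthetic probabilities and a hand-rolled logistic regression (this is not the SA-LREL implementation; all numbers and names are invented):

```python
# Toy stacking ensemble with a logistic-regression meta-model: base
# classifier probabilities become features for a tiny logistic regression
# trained by plain gradient descent. All data here is synthetic.

import math

def train_meta(features, labels, lr=0.5, epochs=500):
    """Fit w, b for sigmoid(w . x + b) with per-sample gradient descent."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1 / (1 + math.exp(-z))
            err = p - y                      # gradient of log-loss wrt z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z)) >= 0.5

# Columns: positive-class probabilities from three base models
# (think TextCNN, RNN, RCNN); rows: training sentences.
features = [[0.9, 0.8, 0.7], [0.6, 0.9, 0.8], [0.2, 0.1, 0.3], [0.4, 0.2, 0.1]]
labels = [1, 1, 0, 0]
w, b = train_meta(features, labels)

print(predict(w, b, [0.8, 0.7, 0.9]))  # → True
```

The meta-model learns how much to trust each base model rather than simply majority-voting, which is why stacking can beat its strongest member.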
Citation styles: APA, Harvard, Vancouver, ISO, etc.
More sources

Dissertations on the topic "Low resource language"

1

Jansson, Herman. "Low-resource Language Question Answering System with BERT." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-42317.

Full text of the source
Abstract:
The complexity of staying at the forefront of information retrieval systems is constantly increasing. A recent natural language processing technology, BERT, has reached superhuman performance on reading comprehension tasks in high-resource languages. However, several researchers have argued that multilingual models are not enough for low-resource languages, since they lack a thorough understanding of those languages. Recently, a Swedish pre-trained BERT model was introduced, trained on significantly more Swedish data than the multilingual models currently available. This study compares multilingual and Swedish monolingual BERT models for question answering, fine-tuning them on both an English and a Swedish machine-translated SQuADv2 dataset. The models are evaluated on the SQuADv2 benchmark and within an implemented question answering system built upon the classical retriever-reader methodology. This study introduces a naive and a more robust prediction method for the proposed question answering system, as well as finding a sweet spot for each model approach integrated into the system. The question answering system is evaluated against another leading question answering library in the area, using a custom-crafted Swedish evaluation dataset. The results show that the model fine-tuned from the Swedish pre-trained model on the Swedish SQuADv2 dataset was superior in all evaluation metrics except speed. The comparison between the different systems resulted in a higher evaluation score but a slower prediction time for this study's system.
Citation styles: APA, Harvard, Vancouver, ISO, etc.
2

Zhang, Yuan Ph D. Massachusetts Institute of Technology. "Transfer learning for low-resource natural language analysis." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/108847.

Full text of the source
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 131-142).
Expressive machine learning models such as deep neural networks are highly effective when they can be trained with large amounts of in-domain labeled training data. While such annotations may not be readily available for the target task, it is often possible to find labeled data for another related task. The goal of this thesis is to develop novel transfer learning techniques that can effectively leverage annotations in source tasks to improve performance of the target low-resource task. In particular, we focus on two transfer learning scenarios: (1) transfer across languages and (2) transfer across tasks or domains in the same language. In multilingual transfer, we tackle challenges from two perspectives. First, we show that linguistic prior knowledge can be utilized to guide syntactic parsing with little human intervention, by using a hierarchical low-rank tensor method. In both unsupervised and semi-supervised transfer scenarios, this method consistently outperforms state-of-the-art multilingual transfer parsers and the traditional tensor model across more than ten languages. Second, we study lexical-level multilingual transfer in low-resource settings. We demonstrate that only a few (e.g., ten) word translation pairs suffice for an accurate transfer for part-of-speech (POS) tagging. Averaged across six languages, our approach achieves a 37.5% improvement over the monolingual top-performing method when using a comparable amount of supervision.
In the second, monolingual transfer scenario, we propose an aspect-augmented adversarial network that allows aspect transfer over the same domain. We use this method to transfer across different aspects in the same pathology reports, where traditional domain adaptation approaches commonly fail. Experimental results demonstrate that our approach outperforms different baselines and model variants, yielding a 24% gain on this pathology dataset.
by Yuan Zhang. Ph. D.
Citation styles: APA, Harvard, Vancouver, ISO, etc.
3

Zouhair, Taha. "Automatic Speech Recognition for low-resource languages using Wav2Vec2 : Modern Standard Arabic (MSA) as an example of a low-resource language." Thesis, Högskolan Dalarna, Institutionen för information och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:du-37702.

Full text of the source
Abstract:
The need for fully automatic translation at DigitalTolk, a Stockholm-based company providing translation services, leads to exploring Automatic Speech Recognition as a first step for Modern Standard Arabic (MSA). Facebook AI recently released a second version of its Wav2Vec models, dubbed Wav2Vec 2.0, which uses deep neural networks and provides several English pretrained models along with a multilingual model trained on 53 different languages, referred to as the Cross-Lingual Speech Representation (XLSR-53). The small English and the XLSR-53 pretrained models are tested, and the resulting findings discussed, on the Arabic data from Mozilla Common Voice. In this research, the small model did not yield any results and may have needed more unlabelled data to train, whereas the large model proved successful in predicting the audio recordings in Arabic, achieving a Word Error Rate of 24.40%, an unprecedented result. The small model turned out to be unsuitable for training, especially on languages other than English and where unlabelled data is scarce. On the other hand, the large model gave very promising results despite the low amount of data. The large model should be the model of choice for any future training on low-resource languages such as Arabic.
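The Word Error Rate reported above is the token-level edit distance between the hypothesis and the reference transcript, normalized by reference length. A minimal implementation of the standard definition (toy sentences, not the thesis evaluation script):

```python
# Word Error Rate via dynamic-programming edit distance:
# (substitutions + insertions + deletions) / reference length.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j]: edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                            # delete everything
    for j in range(len(hyp) + 1):
        dp[0][j] = j                            # insert everything
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution or match
    return dp[-1][-1] / len(ref)

print(round(wer("the cat sat", "the cat sat down"), 4))  # → 0.3333
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is why a reported 24.40% is meaningful only relative to reference length.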
Citation styles: APA, Harvard, Vancouver, ISO, etc.
4

Packham, Sean. "Crowdsourcing a text corpus for a low resource language." Master's thesis, University of Cape Town, 2016. http://hdl.handle.net/11427/20436.

Full text of the source
Abstract:
Low-resource languages, such as South Africa's isiXhosa, have a limited number of digitised texts, making it challenging to build language corpora and the information retrieval services, such as search and translation, that depend on them. Researchers have been unable to assemble isiXhosa corpora of sufficient size and quality to produce working machine translation systems; it has been acknowledged that there is little to no training data, and sourcing translations from professionals can be a costly process. A crowdsourcing translation game which paid participants for their contributions was proposed as a solution for sourcing original and relevant parallel corpora for low-resource languages such as isiXhosa. The objective of this dissertation is to report on the four experiments that were conducted to assess user motivation and contribution quantity under various scenarios using the developed crowdsourcing translation game. The first experiment was a pilot study to test a custom-built system and to find out whether social network users would volunteer to participate in a translation game for free. The second experiment tested multiple payment schemes with users from the University of Cape Town. The schemes rewarded users with consistent, increasing, or decreasing amounts for subsequent contributions. Experiment 3 tested whether the same users from Experiment 2 would continue contributing if payments were taken away. The last experiment tested a payment scheme that did not offer a direct and guaranteed reward: users were paid based on their leaderboard placement, and only a limited number of the top leaderboard spots were allocated rewards.
From Experiments 1 and 3 we found that people do not volunteer without financial incentives; Experiments 2 and 4 showed that people want increased rewards when putting in increased effort; Experiment 3 also showed that people will not continue contributing if the financial incentives are taken away; and Experiment 4 showed that the possibility of incentives is as attractive as offering guaranteed incentives.
Citation styles: APA, Harvard, Vancouver, ISO, etc.
5

Louvan, Samuel. "Low-Resource Natural Language Understanding in Task-Oriented Dialogue." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/333813.

Full text of the source
Abstract:
Task-oriented dialogue (ToD) systems need to interpret the user's input to understand the user's needs (intent) and corresponding relevant information (slots). This process is performed by a Natural Language Understanding (NLU) component, which maps the text utterance into a semantic frame representation, involving two subtasks: intent classification (text classification) and slot filling (sequence tagging). Typically, new domains and languages are regularly added to the system to support more functionalities. Collecting domain-specific data and performing fine-grained annotation of large amounts of data every time a new domain and language is introduced can be expensive. Thus, developing an NLU model that generalizes well across domains and languages with less labeled data (low-resource) is crucial and remains challenging. This thesis focuses on investigating transfer learning and data augmentation methods for low-resource NLU in ToD. Our first contribution is a study of the potential of non-conversational text as a source for transfer. Most transfer learning approaches assume labeled conversational data as the source task and adapt the NLU model to the target task. We show that leveraging similar tasks from non-conversational text improves performance on target slot filling tasks through multi-task learning in low-resource settings. Second, we propose a set of lightweight augmentation methods that apply data transformation on token and sentence levels through slot value substitution and syntactic manipulation. Despite its simplicity, the performance is comparable to deep learning-based augmentation models, and it is effective on six languages on NLU tasks. Third, we investigate the effectiveness of domain adaptive pre-training for zero-shot cross-lingual NLU. In terms of overall performance, continued pre-training in English is effective across languages. This result indicates that the domain knowledge learned in English is transferable to other languages. 
In addition to that, domain similarity is essential. We show that intermediate pre-training data that is more similar – in terms of data distribution – to the target dataset yields better performance.
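Slot value substitution, one of the lightweight augmentation methods described above, can be sketched as swapping a labeled slot's value for another value of the same slot type (the utterance, slot labels, and value lists below are invented for illustration, not from the thesis):

```python
# Toy slot-value substitution for NLU data augmentation: given a token
# sequence with BIO-style slot labels, generate new labeled utterances by
# swapping each slot value for alternatives of the same slot type.

def substitute(tokens, slots, slot_values):
    """Yield augmented (tokens, slots) pairs by swapping slot values."""
    for i, (tok, slot) in enumerate(zip(tokens, slots)):
        if slot == "O":
            continue                       # only augment slot-bearing tokens
        slot_type = slot.split("-", 1)[1]  # e.g. "B-city" -> "city"
        for value in slot_values.get(slot_type, []):
            if value != tok:
                new_tokens = tokens[:i] + [value] + tokens[i + 1:]
                yield new_tokens, slots    # labels are unchanged by design

tokens = ["book", "a", "flight", "to", "Paris"]
slots  = ["O", "O", "O", "O", "B-city"]
slot_values = {"city": ["Nairobi", "Hanoi"]}

augmented = list(substitute(tokens, slots, slot_values))
print(len(augmented))  # → 2
```

Because the slot labels stay aligned with the swapped tokens, the augmented utterances are valid training data for slot filling without any re-annotation, which is what makes the method cheap in low-resource settings.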
Citation styles: APA, Harvard, Vancouver, ISO, etc.
6

Lakew, Surafel Melaku. "Multilingual Neural Machine Translation for Low Resource Languages." Doctoral thesis, Università degli studi di Trento, 2020. http://hdl.handle.net/11572/257906.

Full text of the source
Abstract:
Machine Translation (MT) is the task of mapping a source language to a target language. The recent introduction of neural MT (NMT) has shown promising results for high-resource languages, but performs poorly in low-resource language (LRL) settings. Furthermore, the vast majority of the 7,000+ languages around the world do not have parallel data, creating a zero-resource language (ZRL) scenario. In this thesis, we present our approach to improving NMT for LRLs and ZRLs, leveraging multilingual NMT modeling (M-NMT), an approach that allows building a single NMT system to translate across multiple source and target languages. This thesis i) analyzes the effectiveness of M-NMT for LRL and ZRL translation tasks, spanning two NMT architectures (Recurrent and Transformer), ii) presents a self-learning approach for improving the zero-shot translation directions of ZRLs, iii) proposes a dynamic transfer-learning approach from a pre-trained (parent) model to an LRL (child) model by tailoring to the vocabulary entries of the latter, iv) extends M-NMT to translate from a source language to specific language varieties (e.g. dialects), and finally, v) proposes an approach that can control the verbosity of an NMT model's output. Our experimental findings show the effectiveness of the proposed approaches in improving NMT for LRLs and ZRLs.
Стилі APA, Harvard, Vancouver, ISO та ін.
8

Mairidan, Wushouer. "Pivot-Based Bilingual Dictionary Creation for Low-Resource Languages." 京都大学 (Kyoto University), 2015. http://hdl.handle.net/2433/199441.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
9

Samson, Juan Sarah Flora. "Exploiting resources from closely-related languages for automatic speech recognition in low-resource languages from Malaysia." Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAM061/document.

Повний текст джерела
Анотація:
Languages in Malaysia are dying at an alarming rate. As of today, 15 languages are endangered while two are already extinct. One way to save languages is to document them, but this is a tedious task when performed manually. An Automatic Speech Recognition (ASR) system could help speed up the process of documenting speech from native speakers. However, building an ASR system for a target language requires a large amount of training data, as current state-of-the-art techniques are based on empirical approaches. Hence, there are many challenges in building ASR for languages with limited available data. The main aim of this thesis is to investigate the effects of using data from closely related languages to build ASR for low-resource languages in Malaysia. Past studies have shown that cross-lingual and multilingual methods can improve the performance of low-resource ASR. In this thesis, we try to answer several questions concerning these approaches: How do we know which language is beneficial for our low-resource language? How does the relationship between source and target languages influence speech recognition performance? Is pooling language data an optimal approach for a multilingual strategy? Our case study is Iban, an under-resourced language spoken on the island of Borneo. We study the effects of using data from Malay, a locally dominant language closely related to Iban, for developing Iban ASR under different resource constraints. We propose several approaches to adapt Malay data to obtain pronunciation and acoustic models for Iban speech. Building a pronunciation dictionary from scratch is time consuming, as one needs to properly define the sound units of each word in a vocabulary. We therefore developed a semi-supervised approach to quickly build a pronunciation dictionary for Iban, based on bootstrapping techniques that improve the match between Malay data and Iban pronunciations. To increase the performance of low-resource acoustic models, we explored two acoustic modelling techniques, Subspace Gaussian Mixture Models (SGMM) and Deep Neural Networks (DNN), and applied cross-lingual strategies in both frameworks to adapt out-of-language data to Iban speech. Results show that using Malay data is beneficial for increasing the performance of Iban ASR. We also tested SGMM and DNN to improve low-resource non-native ASR: we propose a fine merging strategy for obtaining an optimal multi-accent SGMM, and we developed an accent-specific DNN using native speech data. Both methods yield significant improvements in ASR accuracy. From our study, we observe that SGMM and DNN cross-lingual strategies are effective when training data is very limited.
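The semi-supervised bootstrapping of a pronunciation dictionary from a related language, as described above, can be sketched roughly as follows. The grapheme-to-phoneme rules and the manual corrections here are invented placeholders, not actual Malay or Iban phonology:

```python
# Hypothetical Malay-style grapheme-to-phoneme rules (illustrative only).
G2P_RULES = {"ng": "N", "a": "a", "b": "b", "i": "i", "u": "u", "t": "t", "k": "k"}

def g2p(word, rules):
    """Greedy longest-match grapheme-to-phoneme conversion."""
    phones, i = [], 0
    while i < len(word):
        # Try the longest grapheme first (here, at most 2 characters).
        for size in (2, 1):
            chunk = word[i:i + size]
            if chunk in rules:
                phones.append(rules[chunk])
                i += size
                break
        else:
            phones.append("?")  # unknown grapheme flagged for manual review
            i += 1
    return phones

def bootstrap_dictionary(words, rules, manual_fixes):
    """Seed pronunciations with source-language rules, then apply
    target-language corrections supplied by a native speaker."""
    lexicon = {w: g2p(w, rules) for w in words}
    lexicon.update(manual_fixes)  # human-verified entries override the seed
    return lexicon

lexicon = bootstrap_dictionary(["bata", "ngui"], G2P_RULES,
                               {"ngui": ["N", "u", "i"]})
```

The loop in a real bootstrapping setup would iterate: rule-generated entries are reviewed, corrections are folded back in, and the rules are refined until the dictionary converges.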
Стилі APA, Harvard, Vancouver, ISO та ін.
10

Tafreshi, Shabnam. "Cross-Genre, Cross-Lingual, and Low-Resource Emotion Classification." Thesis, The George Washington University, 2021. http://pqdtopen.proquest.com/#viewpdf?dispub=28088437.

Повний текст джерела
Анотація:
Emotions can be defined as a natural, instinctive state of mind arising from one's circumstances, mood, and relationships with others. How and what humans feel has long been a question for psychology, and enabling computers to recognize human emotions has interested researchers since the 1990s (Picard et al., 1995). Since then, this area of research has grown significantly, and emotion detection is becoming an important component of many natural language processing tasks. Several theories exist for defining emotions, and researchers choose among them according to their needs. For instance, according to appraisal theory, a theory from psychology, emotions are produced by our evaluations (appraisals or estimates) of events that cause specific reactions in different people. Some emotions are simple and universal, while others are complex and nuanced. Emotion classification is generally the process of labeling a piece of text with one or more corresponding emotion labels. Psychologists have developed numerous models and taxonomies of emotions; the choice depends on the problem, and thorough study is often required to select the best model. Early studies of emotion classification focused on building computational models to classify basic emotion categories. In recent years, increasing volumes of social media and the digitization of data have opened a new horizon in this area of study, where emotion classification is a key component of applications including mood and behavioral studies as well as disaster relief, among many others. Sophisticated models have been built to detect and classify emotion in text, but few analyses examine how well a model learns emotion cues. The ability to learn emotion cues properly, and to generalize that learning, is very important.
This work investigates the robustness of emotion classification approaches across genres and languages, with a focus on quantifying how well state-of-the-art models learn emotion cues. First, we use multi-task learning and hierarchical models to build emotion models trained on data combined from multiple genres; our hypothesis is that a multi-genre, noisy training environment helps the classifier learn emotion cues that are prevalent across genres. Second, we explore splitting text (i.e., sentences) into clauses and test whether the model's performance improves. Emotion analysis needs fine-grained annotation, and clause-level annotation can inform features that improve emotion detection performance; intuitively, clause-level annotations may help the model focus on emotion cues while ignoring irrelevant portions of the text. Third, we adopt a transfer learning approach for cross-lingual/cross-genre emotion classification to focus the classifier's attention on emotion cues that are consistent across languages. Fourth, we empirically show how to combine different genres to build robust models that can serve as source models for emotion transfer to low-resource target languages. Finally, this study involved curating and re-annotating popular emotion data sets in different genres, annotating a multi-genre corpus of Persian tweets and news, and generating a collection of emotional sentences for a low-resource language, Azerbaijani, spoken in the northwest of Iran.
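The clause-splitting step mentioned in this abstract (dividing a sentence so the classifier can attend to emotion-bearing spans) can be approximated with a simple rule-based splitter. This is a rough illustration with an invented connective list, not the author's annotation scheme:

```python
import re

# Connectives treated as clause boundaries (an illustrative, incomplete list).
CONNECTIVES = r"\b(?:but|because|although|while|and then)\b"

def split_clauses(sentence):
    """Split on punctuation and common connectives, keeping non-empty spans."""
    parts = re.split(rf"[,;]|{CONNECTIVES}", sentence)
    return [p.strip() for p in parts if p and p.strip()]

clauses = split_clauses("I loved the gift, but the party made me anxious")
```

Each resulting clause can then be labeled separately, so a sentence mixing joy and anxiety contributes two focused training examples instead of one ambiguous one.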
Стилі APA, Harvard, Vancouver, ISO та ін.
Більше джерел

Книги з теми "Low resource language"

1

Chakravarthi, Bharathi Raja, Bharathi B, Miguel Ángel García Cumbreras, et al., eds. Speech and Language Technologies for Low-Resource Languages. Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-58495-4.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
2

M, Anand Kumar, Bharathi Raja Chakravarthi, Bharathi B, et al., eds. Speech and Language Technologies for Low-Resource Languages. Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-33231-9.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
3

Canadian Legal Information Centre. Plain Language Centre. Plain Language Resource Centre catalogue. Multiculturalism and Citizenship Canada, 1992.

Знайти повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
4

Cooper, Erica Lindsay. Text-to-Speech Synthesis Using Found Data for Low-Resource Languages. [publisher not identified], 2019.

Знайти повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
5

Bureau, Canada Translation, Promoting Access to Justice in Both Official Languages (Canada), Promotion de l'accès à la justice dans les deux langues officielles (Canada), and Canada. Bureau de la traduction., eds. Lexique du droit des fiducies (common law): Law of trusts glossary (common law) [electronic resource]. Bureau de la traduction = Translation Bureau, 2004.

Знайти повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
6

Library, Canada Multiculturalism and Citizenship Canada Departmental. Plain Language Resource Centre catalogue = Catalogue du centre de ressources sur le langage clair et simple. Multiculturalism and Citizenship Canada, 1992.

Знайти повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
8

Megías, José Manuel Lucía. Literatura románica en Internet: Los textos. Editorial Castalia, 2002.

Знайти повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
9

Juda, Lawrence. International law and ocean use management. Routledge, 1996.

Знайти повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
10

Bundesanstalt für Geowissenschaften und Rohstoffe, ed. Glossary of shared water resources: Technical, socioeconomic and legal terminology : [English-Arabic]. United Nations, 2012.

Знайти повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Більше джерел

Частини книг з теми "Low resource language"

1

Palakodety, Shriphani, Ashiqur R. KhudaBukhsh, and Guha Jayachandran. "Language Identification." In Low Resource Social Media Text Mining. Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-16-5625-5_4.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
2

Deshpande, Pranjali, and Sunita Jahirabadkar. "Low-Resource Language Document Summarization: A Challenge." In Data Science. Chapman and Hall/CRC, 2022. http://dx.doi.org/10.1201/9781003283249-15.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
3

Grießhaber, Daniel, Ngoc Thang Vu, and Johannes Maucher. "Low-Resource Text Classification Using Domain-Adversarial Learning." In Statistical Language and Speech Processing. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-00810-9_12.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
4

Juan, Sarah Samson, Muhamad Fikri Che Ismail, Hamimah Ujir, and Irwandi Hipiny. "Language Modelling for a Low-Resource Language in Sarawak, Malaysia." In Lecture Notes in Electrical Engineering. Springer Singapore, 2019. http://dx.doi.org/10.1007/978-981-15-1289-6_14.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
5

Zhu, ShaoLin, Xiao Li, YaTing Yang, Lei Wang, and ChengGang Mi. "Learning Bilingual Lexicon for Low-Resource Language Pairs." In Natural Language Processing and Chinese Computing. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-73618-1_66.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
6

Rakhimova, Diana, Eşref Adali, and Aidana Karibayeva. "Hybrid Approach Text Generation for Low-Resource Language." In Communications in Computer and Information Science. Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-70248-8_20.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
7

Ouily, Hamed Joseph, Aminata Sabané, Delwende Eliane Birba, Rodrique Kafando, Abdoul-Kader Kabore, and Tégawendé F. Bissyandé. "A Low-Resource Language Translation: French to Mooré." In Communications in Computer and Information Science. Springer Nature Switzerland, 2025. https://doi.org/10.1007/978-3-031-88226-5_30.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
8

Charuka, Kaveesh, Sandareka Wickramanayake, Thanuja D. Ambegoda, Pasan Madhushan, and Dineth Wijesooriya. "Sign Language Recognition for Low Resource Languages Using Few Shot Learning." In Communications in Computer and Information Science. Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-8141-0_16.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
9

Chen, Yaqi, Hao Zhang, Xukui Yang, Wenlin Zhang, and Dan Qu. "Task-Consistent Meta Learning for Low-Resource Speech Recognition." In Natural Language Processing and Chinese Computing. Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-44693-1_28.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
10

Baruah, Rupjyoti, and Anil Kumar Singh. "A Clinical Practice by Machine Translation on Low Resource Languages." In Natural Language Processing in Healthcare. CRC Press, 2022. http://dx.doi.org/10.1201/9781003138013-1.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.

Тези доповідей конференцій з теми "Low resource language"

1

Zou, Shuai, Xuefeng Liang, and Yiyang Huang. "LipReading for Low-resource Languages by Language Dynamic LoRA." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10889645.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
2

Downey, C. M., Terra Blevins, Dhwani Serai, Dwija Parikh, and Shane Steinert-Threlkeld. "Targeted Multilingual Adaptation for Low-resource Language Families." In Findings of the Association for Computational Linguistics: EMNLP 2024. Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.findings-emnlp.918.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
3

Sutherland, Emery M., Melvatha R. Chee, and Marios S. Pattichis. "Navajo Speech Recognition Using Low-Resource Language Models." In 2024 58th Asilomar Conference on Signals, Systems, and Computers. IEEE, 2024. https://doi.org/10.1109/ieeeconf60004.2024.10942828.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
4

Balachandran, Priyatharshan, Uthayasanker Thayasivam, Randil Pushpananda, and Ruvan Weerasinghe. "Towards Effective Emotion Analysis in Low-Resource Tamil Texts." In Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages. Association for Computational Linguistics, 2025. https://doi.org/10.18653/v1/2025.dravidianlangtech-1.101.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
5

Dash, Amulya, and Yashvardhan Sharma. "Towards Improving Translation Ability of Large Language Models on Low Resource Languages." In 14th International Conference on Pattern Recognition Applications and Methods. SCITEPRESS - Science and Technology Publications, 2025. https://doi.org/10.5220/0013319000003905.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
6

Nigatu, Hellina Hailu, Atnafu Lambebo Tonja, Benjamin Rosman, Thamar Solorio, and Monojit Choudhury. "The Zeno’s Paradox of ‘Low-Resource’ Languages." In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.emnlp-main.983.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
7

Kwok, Chin Yuen, Sheng Li, Jia Qi Yip, and Eng Siong Chng. "Low-resource Language Adaptation with Ensemble of PEFT Approaches." In 2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2024. https://doi.org/10.1109/apsipaasc63619.2025.10848814.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
8

Sadat, Mobashir, and Cornelia Caragea. "Co-training for Low Resource Scientific Natural Language Inference." In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.acl-long.139.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
9

Adeyemi, Mofetoluwa, Akintunde Oladipo, Ronak Pradeep, and Jimmy Lin. "Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages." In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.acl-short.59.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
10

Simran, Rishabh Sharma, and Manish Nagpal. "Handwritten Language Detection for Low-Resource Languages Using a CNN-BiLSTM Hybrid Model." In 2024 5th IEEE Global Conference for Advancement in Technology (GCAT). IEEE, 2024. https://doi.org/10.1109/gcat62922.2024.10923881.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.

Звіти організацій з теми "Low resource language"

1

Enimil, Sandra, Rachael Samberg, Erik Limpitlaw, Samantha Teremi, and Katie Zimmerman. e-Resource Licensing Explained: An A–Z Licensing Guidebook for Libraries. Association of Research Libraries, 2024. https://doi.org/10.29242/report.eresourcelicensing2024.

Повний текст джерела
Анотація:
ARL is pleased to publish e-Resource Licensing Explained: An A–Z Licensing Guidebook for Libraries, a practical tool to empower librarians who license electronic resources (e-resources). The guidebook includes easily digestible legal explanations and pragmatic strategies for preserving rights that users already have under US copyright law, particularly in the face of restrictive license terms that would otherwise constrain or eliminate those rights. For each term of an e-resource license agreement, the book explores: essentials of the law, desired results, desired language, tricks and traps, and overall importance and risk. Additional sections and tools will be added over time to reflect changes in best practices, business models, and laws and regulations.
Стилі APA, Harvard, Vancouver, ISO та ін.
2

Pikilnyak, Andrey V., Nadia M. Stetsenko, Volodymyr P. Stetsenko, Tetiana V. Bondarenko, and Halyna V. Tkachuk. Comparative analysis of online dictionaries in the context of the digital transformation of education. [б. в.], 2021. http://dx.doi.org/10.31812/123456789/4431.

Повний текст джерела
Анотація:
The article presents a comparative analysis of popular online dictionaries and an overview of the main tools these resources provide for language study. Using dictionaries is an important step toward understanding a foreign language, and the effectiveness of this process increases with online dictionaries, which offer many tools for improving the educational process. Based on the Alexa Internet rankings, the most popular online dictionaries were identified: Cambridge Dictionary, Wordreference, Merriam-Webster, Wiktionary, TheFreeDictionary, Dictionary.com, Glosbe, Collins Dictionary, Longman Dictionary, and Oxford Dictionary. A close analysis of these dictionaries showed that they share standard functions such as word explanations, transcription, audio pronunciation, semantic connections, and examples of use. We also identified additional tools for learning foreign languages (mostly English) that can be effective. In total, we describe sixteen functions of these online platforms that can be useful for language learning. We compiled a comparison table based on the following functions: machine translation, multilingualism, pronunciation video, word images, discussion, collaborative editing, word rank, hints, learning tools, thesaurus, paid services, content sharing, hyperlinks in definitions, registration, word lists, mobile version, etc. Based on the additional tools of the online dictionaries, we created a diagram showing the functionality of the analyzed platforms.
Стилі APA, Harvard, Vancouver, ISO та ін.
3

Marin, Anabel, and Gabriel Palazzo. Civic Power in Just Transitions: Blocking the Way or Transforming the Future? Institute of Development Studies, 2024. https://doi.org/10.19088/ids.2024.045.

Повний текст джерела
Анотація:
As the global shift towards a low-carbon economy accelerates, demand for critical minerals is projected to soar, intensifying pressures on supply chains and local environments. Policies like the European Union’s Critical Raw Materials Act and the U.S. Inflation Reduction Act are increasingly proposed to secure mineral access while upholding environmental standards and reducing ecological impacts. However, mining activities face significant civil resistance worldwide. This paper reveals that opposition to mineral extraction is a pervasive global phenomenon, spanning diverse sociopolitical contexts and posing major challenges for the political sustainability of the energy transition. Using data from the Global Database of Events, Language and Tone (GDELT) Project, we map patterns of conflict and cooperation in mining regions globally, providing an unprecedented systematic overview of impacts and underlying economic, environmental, and justice-related drivers. Our findings indicate that many conflicts reach high levels of polarisation, which, as case studies show, often lead to costly delays or project cancellations. Although cooperation frequently arises alongside conflict, high-commitment cooperative actions remain limited in impact and uncertain in their ability to drive meaningful change. We argue that a just, sustainable, and democratic transition requires moving beyond traditional Corporate Social Responsibility approaches and recent proposals for public participation in Environmental Impact Assessments. Instead, it demands deeper democratisation of investment decisions through inclusive governance frameworks that can effectively navigate the complexities of mineral resource extraction.
Стилі APA, Harvard, Vancouver, ISO та ін.
4

Sitabkhan, Yasmin, Matthew C. H. Jukes, Eileen Dombrowski, and Indrah Munialo. Differentiated Instruction in Multigrade Preprimary Classrooms in Kenya. RTI Press, 2022. http://dx.doi.org/10.3768/rtipress.2022.op.0084.2212.

Повний текст джерела
Анотація:
There is little evidence of how differentiated instruction is being implemented, if at all, in low- and middle-income contexts, which often have unique challenges such as availability of resources and large class sizes. In this paper, we present the results of a qualitative study in eight multigrade preprimary classrooms in Kenya. We used classroom observations and teacher interviews to understand how teachers approached differentiation during language and mathematics lessons, including understanding why teachers were making the moves we observed. All teachers differentiated instruction to some extent in our findings, and we provide detailed descriptions of the ways that teachers adapted content to fit the needs of their students. We also provide recommendations, including how to support teachers in creating activities that are appropriate for different abilities of students in the same classrooms, and suggest next steps for research in this area.
Стилі APA, Harvard, Vancouver, ISO та ін.
5

Chew, Robert F., Kirsty J. Weitzel, Peter Baumgartner, et al. Improving Text Classification with Boolean Retrieval for Rare Categories: A Case Study Identifying Firearm Violence Conversations in the Crisis Text Line Database. RTI Press, 2023. http://dx.doi.org/10.3768/rtipress.2023.mr.0050.2304.

Повний текст джерела
Анотація:
Advancements in machine learning and natural language processing have made text classification increasingly attractive for information retrieval. However, developing text classifiers is challenging when no prior labeled data are available for a rare category of interest. Finding instances of the rare class using a uniform random sample can be inefficient and costly due to the rare category’s low base rate. This work presents an approach that combines the strengths of text classification and Boolean retrieval to help learn rare concepts of interest. As a motivating example, we use the task of finding conversations that reference firearm injury or violence in the Crisis Text Line database. Identifying rare categories, like firearm injury or violence, can improve crisis lines' abilities to support people with firearm-related crises or provide appropriate resources. Our approach outperforms a set of iteratively refined Boolean queries and results in a recall of 0.91 on a test set generated from a process independent of our study. Our results suggest that text classification with Boolean retrieval initialization can be effective for finding rare categories of interest and improve on the precision of using Boolean retrieval alone.
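The hybrid idea in this abstract, using a Boolean query to seed labels that a text classifier then generalizes beyond, can be sketched with stdlib Python. The query terms, toy messages, and tiny Naive Bayes scorer below are illustrative; the actual study used the Crisis Text Line database and stronger models:

```python
from collections import Counter
import math

def boolean_match(text, any_of):
    """A minimal Boolean OR query: does the text contain any query term?"""
    return bool(set(text.lower().split()) & any_of)

def train_counts(docs, labels):
    """Per-class token counts for a tiny Naive Bayes scorer."""
    counts = {0: Counter(), 1: Counter()}
    for doc, y in zip(docs, labels):
        counts[y].update(doc.lower().split())
    return counts

def predict(doc, counts, alpha=1.0):
    """Label with the higher add-alpha-smoothed log-likelihood."""
    vocab = set(counts[0]) | set(counts[1])
    scores = {}
    for y in (0, 1):
        total = sum(counts[y].values()) + alpha * len(vocab)
        scores[y] = sum(
            math.log((counts[y][tok] + alpha) / total)
            for tok in doc.lower().split()
        )
    return max(scores, key=scores.get)

QUERY = {"gun", "firearm", "shooting"}  # illustrative seed terms
docs = [
    "my brother has a gun at home",
    "worried about a shooting nearby",
    "i failed my exam today",
    "my dog ran away",
]
# Boolean retrieval provides the initial labels; the classifier can then
# score messages that share context with the seeds but lack the exact terms.
seed_labels = [1 if boolean_match(d, QUERY) else 0 for d in docs]
counts = train_counts(docs, seed_labels)
pred = predict("he keeps a firearm in the car", counts)
```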
Стилі APA, Harvard, Vancouver, ISO та ін.
6

Terzyan, Aram. The State of Minority Rights in Uzbekistan: A Comparative Analysis of Tajiks, Russians, and Koreans. Eurasia Institutes, 2023. http://dx.doi.org/10.47669/erd-1-2023.

Повний текст джерела
Анотація:
This paper examines the state of minority rights in Uzbekistan, focusing on three significant ethnic groups: Tajiks, Russians, and Koreans. It explores the historical context of these minorities, the cultural and linguistic challenges they face, socioeconomic issues, and their political representation. Under the authoritarian rule of Islam Karimov, Uzbekistan emphasized a unified Uzbek identity, often marginalizing minority cultures and languages. Despite President Shavkat Mirziyoyev’s reforms aimed at improving human rights, including the establishment of a Human Rights Ombudsman and the Development Strategy for 2017-2021, significant challenges remain. Legislative initiatives such as the draft Law on the Protection of the Rights and Interests of National Minorities and efforts to enhance cultural policies have had mixed success. This analysis highlights the need for comprehensive measures to ensure robust legal protections, equitable resource allocation, and genuine political inclusion for all ethnic minorities in Uzbekistan. The international community’s role in advocating for these rights is also discussed, emphasizing the gap between policy and practice in protecting minority rights in Uzbekistan.
Стилі APA, Harvard, Vancouver, ISO та ін.
7

Avellán, Leopoldo, and Steve Brito. Crossroads in a Fog: Navigating Latin America's Development Challenges with Text Analytics. Inter-American Development Bank, 2023. http://dx.doi.org/10.18235/0005489.

Повний текст джерела
Анотація:
Latin America and the Caribbean are facing challenging times due to a combination of worsening development gaps and limited fiscal space to address them. Furthermore, the region is contending with an unfavorable external environment. Issues such as rising poverty, climate change, inadequate infrastructure, and low-quality education and health services, among others, require immediate attention. Deciding how to prioritize efforts to address these development gaps is challenging due to their complexity and urgency, and setting priorities becomes even more difficult when resources are limited. Therefore, it is crucial to have tools that help policymakers prioritize current development challenges to guide the allocation of financial support from international financial institutions and other development partners. This paper contributes to this discussion by using Natural Language Processing (NLP) to identify the most critical development areas. It applies these techniques to detailed periodic country analysis reports (Country Development Challenges, CDCs) prepared by country economists at the Inter-American Development Bank (IDB) from 2015 to 2021. The study reveals that despite the perception that new development challenges have become more critical lately, the region continues to struggle with the same challenges from the past, particularly those related to the government's institutional capacity, fiscal policy, education, productivity and firms, infrastructure, and poverty.
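The kind of text analytics described above, ranking development areas by their prominence in country reports, can be illustrated with a toy term-frequency sketch. The topic lexicons and reports below are placeholders, not the paper's taxonomy or its actual NLP pipeline:

```python
from collections import Counter

# Illustrative topic lexicons (placeholders, not the paper's taxonomy).
TOPICS = {
    "education": {"education", "schooling", "teachers"},
    "infrastructure": {"infrastructure", "roads", "electricity"},
    "fiscal": {"fiscal", "debt", "deficit"},
}

def rank_topics(reports):
    """Rank topics by how often their lexicon terms appear across reports."""
    counts = Counter()
    for text in reports:
        tokens = text.lower().split()
        for topic, lexicon in TOPICS.items():
            counts[topic] += sum(tok in lexicon for tok in tokens)
    return [topic for topic, _ in counts.most_common()]

ranking = rank_topics([
    "weak fiscal position and rising debt limit spending on education",
    "the fiscal deficit crowds out investment in roads",
])
```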
APA, Harvard, Vancouver, ISO, and other styles
8

Verdugo-Paiva, F., A. Izcovich, M. Ragusa, and G. Rada. Lopinavir/ritonavir for the treatment of COVID-19: A living systematic review protocol. Epistemonikos Interactive Evidence Synthesis, 2024. http://dx.doi.org/10.30846/ies.4f3c02f030.

Full text source
Abstract:
Objective: To assess the efficacy and safety of lopinavir/ritonavir for the treatment of patients with COVID-19. Design: This is the protocol of a living systematic review. Data sources: We will conduct searches in the [L.OVE platform for COVID-19](https://app.iloveevidence.com/loves/5e6fdb9669c00e4ac072701d), a system that maps PICO questions to a repository maintained through regular searches in electronic databases, preprint servers, trial registries and other resources relevant to COVID-19. No date or language restrictions will be applied. Eligibility criteria for selecting studies and methods: We adapted an already published common protocol for multiple parallel systematic reviews to the specificities of this question. We will include randomised trials evaluating the effect of lopinavir/ritonavir, as monotherapy or in combination with other drugs, versus placebo or no treatment in patients with COVID-19. Randomised trials evaluating lopinavir/ritonavir in infections caused by other coronaviruses, such as MERS-CoV and SARS-CoV, and non-randomised studies in COVID-19 will be searched in case no direct evidence from randomised trials is found, or if the direct evidence provides low- or very low-certainty for critical outcomes. Two reviewers will independently screen each study for eligibility, extract data, and assess the risk of bias. We will perform random-effects meta-analyses and use GRADE to assess the certainty of the evidence for each outcome. A living, web-based version of this review will be openly available during the COVID-19 pandemic. We will resubmit it if the conclusions change or there are substantial updates. Ethics and dissemination: No ethics approval is considered necessary. The results of this review will be widely disseminated via peer-reviewed publications, social networks and traditional media.
PROSPERO Registration: [CRD42020179212](https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=179212). Keywords: COVID-19, severe acute respiratory syndrome coronavirus 2, Coronavirus Infections, Systematic review, lopinavir, lopinavir/ritonavir, antivirals
APA, Harvard, Vancouver, ISO, and other styles
9

Rudman, Debbie Laliberte, and Rebecca M. Aldrich. Social Isolation, Third Places, and Precarious Employment Circumstances: A Scoping Review. University of Western Ontario, 2022. http://dx.doi.org/10.5206/otpub.2022.54.

Full text source
Abstract:
Rising rates of social isolation in Canada and other middle- and high-income countries have turned scholarly attention to the kinds of places that facilitate social connections. “Third places” - physical and virtual places beyond home (first places) and work (second places) - are thought to foster social interaction, connection, belonging, and support. This evidence brief reports on an SSHRC-funded knowledge synthesis that linked understandings about “third places” with situations of precarious employment, given that people facing precarious employment circumstances often lack the social opportunities and resources associated with stable workplaces. This scoping review assessed what is known about the types and characteristics of “third places” that help maintain social connectedness and address social isolation for adults experiencing precarious employment circumstances. The project examined English-language research articles published in multidisciplinary academic journals between 2012 and 2022. The review captured diverse forms of employment (i.e., gig work, involuntary part-time work, seasonal work, temporary migrant work) characterized as transient, non-permanent, unpredictable, having few worker protections or rights, and associated with low or unpredictable remuneration, as well as cyclical and long-term unemployment. In addition to synthesizing study results, findings attend to how studies addressed diverse social positions and studies’ geographic locations, methodologies, methods, and quality. The goal of the project was to understand the current state of knowledge on this topic; create dialogue about how social isolation can be addressed through precarious workers’ engagement with “third places”; and identify opportunities for stakeholders to partner on place-based interventions with people experiencing precarious employment circumstances.
APA, Harvard, Vancouver, ISO, and other styles
10

Does your Local Control Accountability (LCAP) Plan deliver on the promise of increased or improved services for English Learners? 10 research aligned rubrics to help answer the question and guide your program. The Center for Equity for English Learners (CEEL), 2015. http://dx.doi.org/10.15365/ceel.lcap2015.1.

Full text source
Abstract:
As California’s Local Control Funding Formula (LCFF) came into effect in 2013, districts were given more flexibility to use state resources and create a new school finance system to improve and increase services for students with greater needs for support, including English Learners (ELs), students from low-income backgrounds, and foster youth. Local Education Agencies (LEAs) were tasked with preparing the Local Control and Accountability Plans (LCAPs) to describe how districts use their plans to meet their annual goals for all students. To aid LEAs in their design and implementation of programs to address the needs of ELs, Californians Together, the California Association for Bilingual Education (CABE), California Rural Legal Assistance (CRLA), and the Center for Equity for English Learners (CEEL) collaboratively developed the rubrics with 10 focus areas that have a high impact on ELs. These areas include: (1) English Language Development, (2) Parent Engagement, (3) Professional Development, (4) Programs and Course Access, (5) Expenditures, (6) District Wide Use of Concentration and Supplemental Grant Funds, (7) School Wide Use of Concentration and Supplemental Grant Funds, (8) Actions and Services, (9) Proportionality, and (10) English Learner Data to Inform Goals. These 10 rubrics and their corresponding indicators are based on research-based principles and practices for English Learners. These rubrics were first employed in the review of first-year LCAPs by the above-mentioned organizations and remain an important analytical instrument for district leaders to gain insights into planning for and improving programs and services for ELs.
APA, Harvard, Vancouver, ISO, and other styles