
Journal articles on the topic 'Language resource'


Consult the top 50 journal articles for your research on the topic 'Language resource.'


1

Lin, Donghui, Yohei Murakami, and Toru Ishida. "Towards Language Service Creation and Customization for Low-Resource Languages." Information 11, no. 2 (January 27, 2020): 67. http://dx.doi.org/10.3390/info11020067.

Abstract:
The most challenging issue with low-resource languages is the difficulty of obtaining enough language resources. In this paper, we propose a language service framework for low-resource languages that enables the automatic creation and customization of new resources from existing ones. To achieve this goal, we first introduce a service-oriented language infrastructure, the Language Grid; it realizes new language services by supporting the sharing and combining of language resources. We then show the applicability of the Language Grid to low-resource languages. Furthermore, we describe how we can now realize the automation and customization of language services. Finally, we illustrate our design concept by detailing a case study of automating and customizing bilingual dictionary induction for low-resource Turkic languages and Indonesian ethnic languages.
2

Zinn, Claus. "The Language Resource Switchboard." Computational Linguistics 44, no. 4 (December 2018): 631–39. http://dx.doi.org/10.1162/coli_a_00329.

Abstract:
The CLARIN research infrastructure gives users access to an increasingly rich and diverse set of language-related resources and tools. Whereas there is ample support for searching resources using metadata-based or full-text search, or for aggregating resources into virtual collections, there is little support to help users process resources in one way or another. In spite of the large number of tools that process texts in many different languages, there is no single point of access where users can find tools to fit their needs and the resources they have. In this squib, we present the Language Resource Switchboard (LRS), which helps users discover tools that can process their resources. For this, the LRS identifies all applicable tools for a given resource, lists the tasks the tools can achieve, and invokes the selected tool in such a way that processing can start immediately with little or no prior tool parameterization.
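The matching step the squib describes, identifying all applicable tools for a given resource, amounts to filtering a tool registry by the resource's media type and language. The sketch below is a minimal illustration of that idea; the tool entries and profile fields are invented for the example and are not the actual LRS registry or API.

```python
# Minimal sketch of resource-to-tool matching, loosely in the spirit of the
# Language Resource Switchboard. Tool names and profile fields are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class Tool:
    name: str
    task: str
    mediatypes: frozenset  # MIME types the tool accepts
    languages: frozenset   # ISO 639-3 codes the tool supports

REGISTRY = [
    Tool("UDPipe", "dependency parsing",
         frozenset({"text/plain"}), frozenset({"eng", "deu", "ces"})),
    Tool("WebLicht-NER", "named entity recognition",
         frozenset({"text/plain", "application/tei+xml"}), frozenset({"deu"})),
    Tool("Frog", "part-of-speech tagging",
         frozenset({"text/plain"}), frozenset({"nld"})),
]

def applicable_tools(mediatype: str, language: str):
    """Return (task, tool-name) pairs for every registered tool that can
    process a resource with the given media type and language."""
    return [(t.task, t.name) for t in REGISTRY
            if mediatype in t.mediatypes and language in t.languages]

print(applicable_tools("text/plain", "deu"))
# → [('dependency parsing', 'UDPipe'), ('named entity recognition', 'WebLicht-NER')]
```

A real switchboard would additionally carry each tool's invocation URL so the selected tool can be launched with the resource pre-loaded.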
3

Santos, André C., Luís D. Pedrosa, Martijn Kuipers, and Rui M. Rocha. "Resource Description Language: A Unified Description Language for Network Embedded Resources." International Journal of Distributed Sensor Networks 8, no. 8 (January 2012): 860864. http://dx.doi.org/10.1155/2012/860864.

4

Rickerson, Earl. "Language Resource Center." IALLT Journal of Language Learning Technologies 29, no. 1 (January 1, 1996): 25–34. http://dx.doi.org/10.17161/iallt.v29i1.9605.

5

Lee, Chanhee, Kisu Yang, Taesun Whang, Chanjun Park, Andrew Matteson, and Heuiseok Lim. "Exploring the Data Efficiency of Cross-Lingual Post-Training in Pretrained Language Models." Applied Sciences 11, no. 5 (February 24, 2021): 1974. http://dx.doi.org/10.3390/app11051974.

Abstract:
Language model pretraining is an effective method for improving the performance of downstream natural language processing tasks. Even though language modeling is unsupervised and thus collecting data for it is relatively less expensive, it is still a challenging process for languages with limited resources. This results in great technological disparity between high- and low-resource languages for numerous downstream natural language processing tasks. In this paper, we aim to make this technology more accessible by enabling data efficient training of pretrained language models. It is achieved by formulating language modeling of low-resource languages as a domain adaptation task using transformer-based language models pretrained on corpora of high-resource languages. Our novel cross-lingual post-training approach selectively reuses parameters of the language model trained on a high-resource language and post-trains them while learning language-specific parameters in the low-resource language. We also propose implicit translation layers that can learn linguistic differences between languages at a sequence level. To evaluate our method, we post-train a RoBERTa model pretrained in English and conduct a case study for the Korean language. Quantitative results from intrinsic and extrinsic evaluations show that our method outperforms several massively multilingual and monolingual pretrained language models in most settings and improves the data efficiency by a factor of up to 32 compared to monolingual training.
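The selective parameter reuse this abstract describes can be illustrated schematically: the transformer body trained on the high-resource language is kept (and initially frozen), while language-specific parameters, here the embedding layer, are re-initialized for the low-resource language. The parameter names and the reuse/re-initialize split below are hypothetical, not the paper's actual implementation.

```python
# Schematic sketch of cross-lingual post-training by selective parameter reuse.
# Parameter names and the reuse/re-initialize split are illustrative only.
import random

def split_for_post_training(pretrained: dict, language_specific_prefixes=("embeddings.",)):
    """Partition a flat parameter dict into weights reused (and frozen) from the
    high-resource model and weights re-initialized for the low-resource language."""
    reused, reinitialized = {}, {}
    for name, weights in pretrained.items():
        if name.startswith(language_specific_prefixes):
            # Language-specific parameters: re-initialize, train from scratch.
            reinitialized[name] = [random.gauss(0.0, 0.02) for _ in weights]
        else:
            # Language-neutral transformer body: reuse as-is, freeze at first.
            reused[name] = weights
    return reused, reinitialized

pretrained = {
    "embeddings.word": [0.1, 0.2, 0.3],
    "encoder.layer0.attention": [0.5, 0.5],
    "encoder.layer0.ffn": [0.7, 0.7],
}
reused, reinit = split_for_post_training(pretrained)
print(sorted(reused), sorted(reinit))
# → ['encoder.layer0.attention', 'encoder.layer0.ffn'] ['embeddings.word']
```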
6

Tune, Kula Kekeba, and Vasudeva Varma. "Building CLIA for Resource-Scarce African Languages." International Journal of Information Retrieval Research 5, no. 1 (January 2015): 48–67. http://dx.doi.org/10.4018/ijirr.2015010104.

Abstract:
Since most of the existing major search engines and commercial Information Retrieval (IR) systems are primarily designed for well-resourced European and Asian languages, little attention has been paid to the development of Cross-Language Information Access (CLIA) technologies for resource-scarce African languages. This paper presents the authors' experience in building CLIA for indigenous African languages, with a special focus on the development and evaluation of Oromo-English CLIR. The authors have adopted a knowledge-based query translation approach to design and implement their initial Oromo-English CLIR (OMEN-CLIR). Apart from designing and building the first OMEN-CLIR from scratch, another major contribution of this study is assessing the performance of the proposed retrieval system at a well-recognized international cross-language evaluation forum, the CLEF campaign. The overall performance of OMEN-CLIR was found to be very promising and encouraging, given the limited amount of linguistic resources available for severely under-resourced African languages like Afaan Oromo.
7

Ranasinghe, Tharindu, and Marcos Zampieri. "Multilingual Offensive Language Identification for Low-resource Languages." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 1 (January 31, 2022): 1–13. http://dx.doi.org/10.1145/3457610.

Abstract:
Offensive content is pervasive in social media and a reason for concern to companies and government organizations. Several studies have recently been published investigating methods to detect the various forms of such content (e.g., hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English, partially because most available annotated datasets contain English data. In this article, we take advantage of available English datasets by applying cross-lingual contextual word embeddings and transfer learning to make predictions in low-resource languages. We project predictions on comparable data in Arabic, Bengali, Danish, Greek, Hindi, Spanish, and Turkish. We report results of 0.8415 F1 macro for Bengali in the TRAC-2 shared task [23], 0.8532 F1 macro for Danish and 0.8701 F1 macro for Greek in OffensEval 2020 [58], 0.8568 F1 macro for Hindi in the HASOC 2019 shared task [27], and 0.7513 F1 macro for Spanish in SemEval-2019 Task 5 (HatEval) [7], showing that our approach compares favorably to the best systems submitted to recent shared tasks on these languages. Additionally, we report competitive performance on Arabic and Turkish using the training and development sets of the OffensEval 2020 shared task. The results for all languages confirm the robustness of cross-lingual contextual embeddings and transfer learning for this task.
8

Rijhwani, Shruti, Jiateng Xie, Graham Neubig, and Jaime Carbonell. "Zero-Shot Neural Transfer for Cross-Lingual Entity Linking." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6924–31. http://dx.doi.org/10.1609/aaai.v33i01.33016924.

Abstract:
Cross-lingual entity linking maps an entity mention in a source language to its corresponding entry in a structured knowledge base that is in a different (target) language. While previous work relies heavily on bilingual lexical resources to bridge the gap between the source and the target languages, these resources are scarce or unavailable for many low-resource languages. To address this problem, we investigate zero-shot cross-lingual entity linking, in which we assume no bilingual lexical resources are available in the source low-resource language. Specifically, we propose pivot-based entity linking, which leverages information from a high-resource “pivot” language to train character-level neural entity linking models that are transferred to the source low-resource language in a zero-shot manner. With experiments on 9 low-resource languages and transfer through a total of 54 languages, we show that our proposed pivot-based framework improves entity linking accuracy 17% (absolute) on average over the baseline systems for the zero-shot scenario. Further, we also investigate the use of language-universal phonological representations, which improves average accuracy (absolute) by 36% when transferring between languages that use different scripts.
9

McGroarty, Mary. "Home language: Refuge, resistance, resource?" Language Teaching 45, no. 1 (January 27, 2011): 89–104. http://dx.doi.org/10.1017/s0261444810000558.

Abstract:
This presentation builds on the concept of orientations to languages other than English in the US first suggested by Ruíz (1984). Using examples from recent ethnographic, sociolinguistic, and policy-related investigations undertaken principally in North America, the discussion explores possible connections between individual and group language identities. It demonstrates that orientations to languages are dynamic inside and outside speech communities, varying across time and according to multiple contextual factors, including the history and size of local bilingual groups along with the impact of contemporary economic and political conditions. Often the conceptions of multiple languages reflected in policy and pedagogy oversimplify the complexity documented by research and raise questions for teaching practice.
10

Rallo, John A. "Foreign Language Resource Center." IALLT Journal of Language Learning Technologies 4, no. 3 (January 17, 2019): 14–22. http://dx.doi.org/10.17161/iallt.v4i3.8760.

11

Lozo, Deborah, and Kathryn Dix. "Language-Reading Resource Model." Perspectives on School-Based Issues 4, no. 1 (April 2003): 52–54. http://dx.doi.org/10.1044/sbi4.1.52.

12

Zhou, Shuyan, Shruti Rijhwani, John Wieting, Jaime Carbonell, and Graham Neubig. "Improving Candidate Generation for Low-resource Cross-lingual Entity Linking." Transactions of the Association for Computational Linguistics 8 (July 2020): 109–24. http://dx.doi.org/10.1162/tacl_a_00303.

Abstract:
Cross-lingual entity linking (XEL) is the task of finding referents in a target-language knowledge base (KB) for mentions extracted from source-language texts. The first step of (X)EL is candidate generation, which retrieves a list of plausible candidate entities from the target-language KB for each mention. Approaches based on resources from Wikipedia have proven successful in the realm of relatively high-resource languages, but these do not extend well to low-resource languages with few, if any, Wikipedia pages. Recently, transfer learning methods have been shown to reduce the demand for resources in the low-resource languages by utilizing resources in closely related languages, but the performance still lags far behind their high-resource counterparts. In this paper, we first assess the problems faced by current entity candidate generation methods for low-resource XEL, then propose three improvements that (1) reduce the disconnect between entity mentions and KB entries, and (2) improve the robustness of the model to low-resource scenarios. The methods are simple, but effective: we experiment with our approach on seven XEL datasets and find that they yield an average gain of 16.9% in Top-30 gold candidate recall, compared with state-of-the-art baselines. Our improved model also yields an average gain of 7.9% in in-KB accuracy of end-to-end XEL.
13

Saee, Suhaila, Ranaivo-Malancon Bali, Lay-Ki Soon, and Tek-Yong Lim. "Crawling Social Media to Create Morphological Resource of Under-Resourced Language: Melanau Language." Advanced Science Letters 23, no. 11 (November 1, 2017): 11503–7. http://dx.doi.org/10.1166/asl.2017.10316.

14

Wang, Pidong, Preslav Nakov, and Hwee Tou Ng. "Source Language Adaptation Approaches for Resource-Poor Machine Translation." Computational Linguistics 42, no. 2 (June 2016): 277–306. http://dx.doi.org/10.1162/coli_a_00248.

Abstract:
Most of the world languages are resource-poor for statistical machine translation; still, many of them are actually related to some resource-rich language. Thus, we propose three novel, language-independent approaches to source language adaptation for resource-poor statistical machine translation. Specifically, we build improved statistical machine translation models from a resource-poor language POOR into a target language TGT by adapting and using a large bitext for a related resource-rich language RICH and the same target language TGT. We assume a small POOR–TGT bitext from which we learn word-level and phrase-level paraphrases and cross-lingual morphological variants between the resource-rich and the resource-poor language. Our work is of importance for resource-poor machine translation because it can provide a useful guideline for people building machine translation systems for resource-poor languages. Our experiments for Indonesian/Malay–English translation show that using the large adapted resource-rich bitext yields 7.26 BLEU points of improvement over the unadapted one and 3.09 BLEU points over the original small bitext. Moreover, combining the small POOR–TGT bitext with the adapted bitext outperforms the corresponding combinations with the unadapted bitext by 1.93–3.25 BLEU points. We also demonstrate the applicability of our approaches to other languages and domains.
15

Kachkou, Dz I. "Applying the language acquisition model to the solution small language processing tasks." Informatics 19, no. 1 (January 5, 2022): 96–110. http://dx.doi.org/10.37661/1816-0301-2022-19-1-96-110.

Abstract:
The paper addresses the problem of building a computer model of a small (low-resource) language. The relevance of this task stems from several considerations: the need to eliminate the information inequality between speakers of different languages; the need for new tools for the study of poorly understood languages, as well as innovative approaches to language modeling in the low-resource context; and the problem of supporting and developing small languages. Solving this problem involves three main objectives: to justify language modeling under resource scarcity as a distinct task within natural language processing, to review the literature on the topic, and to develop the concept of a language acquisition model that requires relatively few resources. Computer modeling techniques using neural networks, semi-supervised learning, and reinforcement learning are employed. The paper reviews the literature on modeling how a child acquires the vocabulary, morphology, and grammar of its native language. Based on the current understanding of language acquisition and existing computer models of this process, an architecture is proposed for a small-language processing system that is taught by modeling ontogenesis. The main components of the system and the principles of their interaction are highlighted. The system is based on a module built on modern dialogue language models and trained in a resource-rich language (e.g., English). During training, an intermediate layer represents utterances in an abstract form, for example, in the symbols of formal semantics. The relationship between the formal recording of utterances and their translation into the target low-resource language is learned by modeling a child's acquisition of the vocabulary and grammar of the language.
One component represents the non-linguistic context in which language learning takes place. The article also gives a detailed substantiation of the relevance of modeling small languages: the social significance of the problem is noted, and the benefits for linguistics, ethnography, ethnology, and cultural anthropology are shown, along with the ineffectiveness, under resource scarcity, of approaches designed for large languages. The proposed model of language learning by means of ontogenesis simulation rests both on results from computer modeling and on psycholinguistic data.
16

Nakov, P., and H. T. Ng. "Improving Statistical Machine Translation for a Resource-Poor Language Using Related Resource-Rich Languages." Journal of Artificial Intelligence Research 44 (May 30, 2012): 179–222. http://dx.doi.org/10.1613/jair.3540.

Abstract:
We propose a novel language-independent approach for improving machine translation for resource-poor languages by exploiting their similarity to resource-rich ones. More precisely, we improve the translation from a resource-poor source language X_1 into a resource-rich language Y given a bi-text containing a limited number of parallel sentences for X_1-Y and a larger bi-text for X_2-Y for some resource-rich language X_2 that is closely related to X_1. This is achieved by taking advantage of the opportunities that vocabulary overlap and similarities between the languages X_1 and X_2 in spelling, word order, and syntax offer: (1) we improve the word alignments for the resource-poor language, (2) we further augment it with additional translation options, and (3) we take care of potential spelling differences through appropriate transliteration. The evaluation for Indonesian->English using Malay, and for Spanish->English using Portuguese while pretending Spanish is resource-poor, shows an absolute gain of up to 1.35 and 3.37 BLEU points, respectively, which is an improvement over the best rivaling approaches, while using much less additional data. Overall, our method cuts the amount of necessary "real" training data by a factor of 2-5.
17

Pashayeva, Gulshan. "Language as a soft power resource." Language Problems and Language Planning 42, no. 2 (June 21, 2018): 132–43. http://dx.doi.org/10.1075/lplp.00016.pas.

Abstract:
The term soft power, developed by Joseph Nye, is a widely popular concept used to describe efforts to attract rather than coerce as a means of persuasion. Language, which is widely viewed as a traditional (not to say extremely important) component of nationhood and a symbol of identity and group consciousness, can serve as a soft power resource within this context. It is apparent that in today’s globalized world, the role of international languages as global means of communication has increased considerably. At the same time, English has become the de facto lingua franca in international trade, academia, technology and many other fields. Against this background, this article examines the impact of language as a soft power resource in the case of the Republic of Azerbaijan, a multi-ethnic state located at the crossroads of Europe and Asia. Owing to its geographic location and the constant migrations of peoples through its territory over the centuries, Azerbaijan has long been a zone of active interaction of languages, cultures and civilizations.
18

Mati, Diellza Nagavci, Mentor Hamiti, Arsim Susuri, Besnik Selimi, and Jaumin Ajdari. "Building Dictionaries for Low Resource Languages: Challenges of Unsupervised Learning." Annals of Emerging Technologies in Computing 5, no. 3 (July 1, 2021): 52–58. http://dx.doi.org/10.33166/aetic.2021.03.005.

Abstract:
The development of natural language processing resources for Albanian has grown steadily in recent years. This paper presents research on unsupervised learning: the challenges of building a dictionary for the Albanian language and of creating part-of-speech tagging models. Most languages have their own dictionary, but low-resource languages suffer from a lack of such lexical resources, which facilitate the sharing of information and services for users and whole communities through natural language processing. The experimentation corpus for the Albanian language includes 250K sentences from different disciplines, with a proposal for a part-of-speech tag set that can adequately represent the underlying linguistic phenomena. Contributing to the development of Albanian language technology is the purpose of this paper. The results of experiments with the Albanian corpus revealed that its use of articles and pronouns resembles that of higher-resource languages. According to this study, total expected frequency as a means of correctly tagging words has proven effective for populating the Albanian language dictionary.
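The frequency-based tagging criterion the abstract mentions can be approximated by the classic most-frequent-tag baseline: each word form is entered into the dictionary with the tag it most often carries in a tagged corpus. A minimal sketch, with an invented toy corpus standing in for the 250K-sentence Albanian corpus:

```python
# Toy sketch of frequency-based tagging for dictionary population:
# each word form is assigned the tag it most often carries in a tagged corpus.
from collections import Counter, defaultdict

tagged_corpus = [  # (word, tag) pairs; the Albanian examples are illustrative
    ("libri", "NOUN"), ("i", "ART"), ("mirë", "ADJ"),
    ("libri", "NOUN"), ("i", "PRON"), ("i", "ART"),
]

counts = defaultdict(Counter)
for word, tag in tagged_corpus:
    counts[word][tag] += 1

# Dictionary entry = word -> its most frequent tag.
dictionary = {word: tags.most_common(1)[0][0] for word, tags in counts.items()}
print(dictionary)
# → {'libri': 'NOUN', 'i': 'ART', 'mirë': 'ADJ'}
```

Ambiguous forms like the clitic "i" above (article twice, pronoun once) resolve to their dominant tag, which is exactly where a frequency criterion helps and where it can also err.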
19

Pavlenko, Olena O., Oksana Ye Bondar, Bae Gi Yon, Choi Kwangoon, Nataliia S. Tymchenko-Mikhailidi, and Darja A. Kassim. "The enhancement of a foreign language competence: free online resources, mobile apps, and other opportunities." CTE Workshop Proceedings 6 (March 21, 2019): 279–93. http://dx.doi.org/10.55056/cte.391.

Abstract:
In this article, we present an overview of free online resources, mobile apps, and other opportunities available for an independent study of a foreign language (based on the examples of English and Korean languages) in group and individual settings, geared towards increasing a foreign language competence. Initially, the authors formulated the criteria for selecting free online resources: the resource should be convenient for independent work; the resource should be available at any convenient time; it should be easy in navigation; it should provide opportunities for improving as many components of a foreign language competence as possible; preferably, the resource should have online as well as offline mobile apps. It is suggested to classify free online resources based on their functional characteristics. Various opportunities of the available resources are highlighted and the expediency of their utilization for specific objectives (i.e., advancement of foreign language competence in listening, reading, writing, speaking; the expansion of the vocabulary, etc.) is substantiated. The authors also emphasize free online opportunities of preparation for international examinations not only in the English language, such as TOEFL or IELTS, but also in the Korean language, such as TOPIK, by using online resources in English.
20

Ataa Allah, Fadoua, and Siham Boulaknadel. "Morpho-Lexicon for standard Moroccan Amazigh." MATEC Web of Conferences 210 (2018): 04024. http://dx.doi.org/10.1051/matecconf/201821004024.

Abstract:
Standardized resources are key components for the development of applications related to human language technology. It is therefore important to adopt standards when designing lexical resources, especially for less commonly resourced languages such as Amazigh. This language is spoken by many North African communities, including in Morocco. Due to historical, geographical and sociolinguistic factors, the Amazigh language is characterized by the proliferation of many intervarieties, which has led to a complex morphology. The latter poses significant challenges for NLP tasks, especially since the Amazigh language belongs to the Afro-Asiatic (Hamito-Semitic) language family, known for its non-concatenative morphology based on roots and patterns. Given the scarcity of Amazigh language resources dealing with morpheme encoding, orthographic changes, and morphotactic variation, the elaboration of a standardized lexical resource will help ensure broad exchange and exploitation. In this context, this paper describes ongoing work on a morphological lexicon, based on inflected forms, for the standard Moroccan Amazigh language.
21

Aysa, Anwar, Mijit Ablimit, Hankiz Yilahun, and Askar Hamdulla. "Chinese-Uyghur Bilingual Lexicon Extraction Based on Weak Supervision." Information 13, no. 4 (March 31, 2022): 175. http://dx.doi.org/10.3390/info13040175.

Abstract:
Bilingual lexicon extraction is useful, especially for low-resource languages that can leverage high-resource languages. The Uyghur language is a derivative language, and its language resources are scarce and noisy. Moreover, it is difficult to find a bilingual resource that can exploit the linguistic knowledge of larger-resource languages such as Chinese or English. There is little related research on unsupervised extraction for the Chinese-Uyghur language pair, and existing methods mainly focus on term extraction from translated parallel corpora. Accordingly, unsupervised knowledge extraction methods are effective, especially for low-resource languages. This paper proposes a method to extract a Chinese-Uyghur bilingual dictionary by combining the inter-word relationship matrix mapped by neural-network cross-language word embeddings. A seed dictionary is used as a weak supervision signal, and a small Chinese-Uyghur parallel data resource is used to map the multilingual word vectors into a unified vector space. As the word units of the two languages are not well aligned, stems are used as the main linguistic units. The strong inter-word semantic relationships of the word vectors are used to associate Chinese-Uyghur semantic information. Two retrieval criteria, nearest-neighbor retrieval and cross-domain similarity local scaling, are used to calculate similarity and extract bilingual dictionaries. The experimental results show that the accuracy of the proposed Chinese-Uyghur bilingual dictionary extraction method improves to 65.06%. This method helps to improve Chinese-Uyghur machine translation, automatic knowledge extraction, and multilingual translation.
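The two retrieval criteria named in the abstract, nearest-neighbour retrieval and cross-domain similarity local scaling (CSLS), can be sketched as follows. The vectors below are toy values standing in for stem embeddings already mapped into a shared space; this illustrates CSLS in general, not the paper's actual code.

```python
# Minimal sketch of CSLS (cross-domain similarity local scaling) retrieval
# over word vectors already mapped into a shared space. Toy vectors only.
import math

def cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mean_topk(x, space, k):
    """Average cosine similarity of x to its k nearest neighbours in `space`."""
    sims = sorted((cos(x, y) for y in space), reverse=True)
    return sum(sims[:k]) / k

def csls(x, y, src_space, tgt_space, k=2):
    # CSLS penalizes hubs: 2*cos(x, y) minus each word's local neighbourhood density.
    return 2 * cos(x, y) - mean_topk(x, tgt_space, k) - mean_topk(y, src_space, k)

src = [[1.0, 0.0], [0.7, 0.7]]               # e.g. Chinese stem vectors (toy)
tgt = [[0.9, 0.1], [0.0, 1.0], [0.6, 0.8]]   # e.g. Uyghur stem vectors (toy)
best = max(range(len(tgt)), key=lambda j: csls(src[0], tgt[j], src, tgt))
print(best)  # index of the translation candidate retrieved for src[0]
```

Plain nearest-neighbour retrieval would rank candidates by `cos` alone; CSLS subtracts each word's local density so that hub vectors, which are near everything, stop dominating the ranking.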
22

Karimi, Samaneh, and Azadeh Shakery. "A language-model-based approach for subjectivity detection." Journal of Information Science 43, no. 3 (April 1, 2016): 356–77. http://dx.doi.org/10.1177/0165551516641818.

Abstract:
The rapid growth of opinionated text on the Web increases the demand for efficient methods for detecting subjective texts. In this paper, a subjectivity detection method is proposed which utilizes a language-model-based structure to define a subjectivity score for each document where the topic relevance of documents does not affect the subjectivity scores. In order to overcome the limited content in short documents, we further propose an expansion method to better estimate the language models. Since the lack of linguistic resources in resource-lean languages like Persian makes subjectivity detection difficult in these languages, the method is proposed in two versions: a semi-supervised version for resource-lean languages and a supervised version. Experimental evaluations on five datasets in two languages, English and Persian, demonstrate that the method performs well in distinguishing subjective documents from objective ones in both languages.
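A language-model-based subjectivity score of the general kind described here can be sketched with two smoothed unigram models and a log-likelihood ratio. The training snippets below are invented, and the smoothing choice (add-one) is an assumption for the sketch, not necessarily the paper's.

```python
# Toy sketch of a language-model-based subjectivity score: a document is scored
# by the log-likelihood ratio of a "subjective" vs an "objective" unigram model.
import math
from collections import Counter

def unigram_lm(docs):
    counts = Counter(w for d in docs for w in d.split())
    total = sum(counts.values())
    vocab = len(counts)
    # Add-one smoothed probability; unseen words get the same fallback mass.
    return lambda w: (counts[w] + 1) / (total + vocab + 1)

subjective = unigram_lm(["i love this wonderful film", "terrible boring plot"])
objective = unigram_lm(["the film was released in 1999", "the plot follows a detective"])

def subjectivity_score(doc):
    """Positive = closer to the subjective model, negative = closer to objective."""
    return sum(math.log(subjective(w)) - math.log(objective(w)) for w in doc.split())

print(subjectivity_score("a wonderful film") > subjectivity_score("the film was released"))
# → True
```

Because the same document is scored under both models, topical words that are equally likely under each roughly cancel in the ratio, which is one simple way to keep topic relevance from dominating the subjectivity score.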
23

Doyle, Robert. "A Digital Language Resource Center." IALLT Journal of Language Learning Technologies 32, no. 1 (April 15, 2000): 17–26. http://dx.doi.org/10.17161/iallt.v32i1.8307.

24

Terdal, Marjorie, Sandra Lee McKay, and Sau-ling Cynthia Wong. "Language Diversity: Problem or Resource?" TESOL Quarterly 23, no. 4 (December 1989): 685. http://dx.doi.org/10.2307/3587539.

25

Combeaud Bonallack, Patricia. "Amnesty International Language Resource Centre." Translation and Interpreting in Non-Governmental Organisations 7, no. 1 (August 10, 2018): 92–105. http://dx.doi.org/10.1075/ts.00005.com.

Abstract:
In the wake of profound structural and organizational changes stemming from an institutional restructure under the Global Transition Programme and the Governance Reform, undertaken to transform Amnesty International into a truly global movement, the Amnesty International Language Resource Centre examines the role it must play to align with the new operating model, and evaluates the position it should reach in order to contribute to organizational aspirations for increased diversity and inclusion, as well as integrated approaches to working practices. This paper presents, in broad terms, the operational impact the structural changes have had on its language activities and reaffirms the Language Resource Centre’s crucial contribution to advancing the human rights agenda as a language service provider and strategic language advisor.
26

Shikali, Casper S., and Refuoe Mokhosi. "Enhancing African low-resource languages: Swahili data for language modelling." Data in Brief 31 (August 2020): 105951. http://dx.doi.org/10.1016/j.dib.2020.105951.

27

Asubiaro, Toluwase, Tunde Adegbola, Robert Mercer, and Isola Ajiferuke. "A word‐level language identification strategy for resource‐scarce languages." Proceedings of the Association for Information Science and Technology 55, no. 1 (January 2018): 19–28. http://dx.doi.org/10.1002/pra2.2018.14505501004.

28

Du, Lian Yan. "Design on Structure and Integrity for Foreign Language Network Learning Resource Library." Applied Mechanics and Materials 543-547 (March 2014): 4581–84. http://dx.doi.org/10.4028/www.scientific.net/amm.543-547.4581.

Abstract:
Network learning resources are a new kind of educational resource that has developed alongside computer, communication, and network technology, providing a variety of teaching materials for foreign language teaching. To address the difficult problems in building foreign language network learning resources, this paper uses the SQL Server database management system to study the structural and integrity design of a resource library. Based on requirements analysis, the system involves five entities: language classes, resource types, resources, resource files, and resource evaluations. The structural design comprises conceptual design and logical design; the integrity design comprises entity integrity design and referential integrity design. This article provides practical solutions for building a foreign language learning resource library and is of great significance for improving the quality of foreign language teaching.
APA, Harvard, Vancouver, ISO, and other styles
29

Laskar, Sahinur Rahman, Abdullah Faiz Ur Rahman Khilji, Partha Pakray, and Sivaji Bandyopadhyay. "Improved neural machine translation for low-resource English–Assamese pair." Journal of Intelligent & Fuzzy Systems 42, no. 5 (March 31, 2022): 4727–38. http://dx.doi.org/10.3233/jifs-219260.

Full text
Abstract:
Language translation is essential to bringing the world closer together and plays a significant part in building community among people of different linguistic backgrounds. Machine translation dramatically helps in removing the language barrier and allows easier communication among linguistically diverse communities. Due to the unavailability of resources, a majority of the world's languages are regarded as low-resource languages, which makes automating translation among them a challenging task whose solution would benefit indigenous speakers. This article investigates neural machine translation for the English–Assamese resource-poor language pair by tackling insufficient data and out-of-vocabulary problems. We also propose a data augmentation-based NMT approach that exploits synthetic parallel data, shows significantly improved translation accuracy for English-to-Assamese and Assamese-to-English translation, and obtains state-of-the-art results.
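The data-augmentation idea above (generating synthetic parallel data for a low-resource pair) can be illustrated with a deliberately tiny sketch. The `reverse_translate` stub and its dictionary entries are invented placeholders standing in for a real target-to-source model:

```python
# Toy sketch of augmenting a low-resource bitext with synthetic pairs.
# `reverse_translate` stands in for a trained target->source model; here it
# is a word-by-word dictionary stub purely for illustration.
REVERSE_DICT = {"mor": "hello", "prithibi": "world"}  # hypothetical entries

def reverse_translate(target_sentence):
    # produce a synthetic source sentence for a monolingual target sentence
    return " ".join(REVERSE_DICT.get(w, w) for w in target_sentence.split())

def augment(bitext, monolingual_target):
    synthetic = [(reverse_translate(t), t) for t in monolingual_target]
    return bitext + synthetic

bitext = [("hello world", "mor prithibi")]
augmented = augment(bitext, ["mor prithibi", "prithibi"])
```

The real gain comes from monolingual target text being far more plentiful than parallel text; the synthetic source side only needs to be good enough to teach the forward model.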
APA, Harvard, Vancouver, ISO, and other styles
30

ZENNAKI, O., N. SEMMAR, and L. BESACIER. "A neural approach for inducing multilingual resources and natural language processing tools for low-resource languages." Natural Language Engineering 25, no. 1 (August 6, 2018): 43–67. http://dx.doi.org/10.1017/s1351324918000293.

Full text
Abstract:
This work focuses on the rapid development of linguistic annotation tools for low-resource languages (languages that have no labeled training data). We experiment with several cross-lingual annotation projection methods using recurrent neural networks (RNN) models. The distinctive feature of our approach is that our multilingual word representation requires only a parallel corpus between source and target languages. More precisely, our approach has the following characteristics: (a) it does not use word alignment information, (b) it does not assume any knowledge about target languages (one requirement is that the two languages (source and target) are not too syntactically divergent), which makes it applicable to a wide range of low-resource languages, (c) it provides authentic multilingual taggers (one tagger for N languages). We investigate both uni and bidirectional RNN models and propose a method to include external information (for instance, low-level information from part-of-speech tags) in the RNN to train higher level taggers (for instance, Super Sense taggers). We demonstrate the validity and genericity of our model by using parallel corpora (obtained by manual or automatic translation). Our experiments are conducted to induce cross-lingual part-of-speech and Super Sense taggers. We also use our approach in a weakly supervised context, and it shows an excellent potential for very low-resource settings (less than 1k training utterances).
APA, Harvard, Vancouver, ISO, and other styles
31

Liaqat, Muhammad Irzam, Muhammad Awais Hassan, Muhammad Shoaib, Syed Khaldoon Khurshid, and Mohamed A. Shamseldin. "Sentiment analysis techniques, challenges, and opportunities: Urdu language-based analytical study." PeerJ Computer Science 8 (August 31, 2022): e1032. http://dx.doi.org/10.7717/peerj-cs.1032.

Full text
Abstract:
Sentiment analysis involves the processing and analysis of sentiments from textual data. Sentiment analysis for high-resource languages such as English and French has been carried out effectively in the past. However, its applications are comparatively few for resource-poor languages due to a lack of textual resources. This systematic literature review explores different aspects of Urdu-based sentiment analysis, a classic case of a resource-poor language; Urdu is a South Asian language understood by 169 million people across the planet. There are various shortcomings in the literature, including the limited availability of large corpora and language parsers and the lack of pre-trained machine learning models, all of which result in poor performance. This article has analyzed and evaluated studies addressing machine learning-based Urdu sentiment analysis. After searching and filtering, forty articles were inspected. Research objectives were proposed that lead to research questions, our searches were organized in digital repositories after selecting and screening relevant studies, and data was extracted from these studies. Our review of the existing literature reflects that sentiment classification performance can be improved by overcoming challenges such as word sense disambiguation and the handling of massive datasets. Furthermore, Urdu-specific language constructs, including language parsers and emoticons, context-level sentiment analysis techniques, pre-processing methods, and lexical resources, can also be improved.
APA, Harvard, Vancouver, ISO, and other styles
32

Chen, Xilun, Yu Sun, Ben Athiwaratkun, Claire Cardie, and Kilian Weinberger. "Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification." Transactions of the Association for Computational Linguistics 6 (December 2018): 557–70. http://dx.doi.org/10.1162/tacl_a_00039.

Full text
Abstract:
In recent years great success has been achieved in sentiment classification for English, thanks in part to the availability of copious annotated resources. Unfortunately, most languages do not enjoy such an abundance of labeled data. To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exist. ADAN has two discriminative branches: a sentiment classifier and an adversarial language discriminator. Both branches take input from a shared feature extractor to learn hidden representations that are simultaneously indicative for the classification task and invariant across languages. Experiments on Chinese and Arabic sentiment classification demonstrate that ADAN significantly outperforms state-of-the-art systems.
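A scalar toy can illustrate the opposing objectives described above: the shared feature extractor lowers the task loss while raising the language discriminator's loss, so that features become language-invariant. This is only a numeric caricature of the idea (ADAN itself trains neural networks with gradient reversal), and all functions and constants below are our own assumptions:

```python
# Toy scalar stand-in: f plays the role of the extracted feature. The task
# wants f near 2; the discriminator finds languages easy to tell apart when
# its loss is low, so the extractor *subtracts* that loss to push it up.
def sentiment_loss(f):
    return (f - 2.0) ** 2

def discriminator_loss(f):
    return f ** 2

def adan_step(f, lr=0.1, lam=0.5):
    # minimize J(f) = sentiment_loss(f) - lam * discriminator_loss(f)
    eps = 1e-6
    j = lambda x: sentiment_loss(x) - lam * discriminator_loss(x)
    grad = (j(f + eps) - j(f - eps)) / (2 * eps)  # finite-difference gradient
    return f - lr * grad

f = 1.0
for _ in range(200):
    f = adan_step(f)
# For these quadratics, J(f) = 0.5*f**2 - 4*f + 4 is minimized at f = 4.
```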
APA, Harvard, Vancouver, ISO, and other styles
33

Mi, Chenggang, Shaolin Zhu, and Rui Nie. "Improving Loanword Identification in Low-Resource Language with Data Augmentation and Multiple Feature Fusion." Computational Intelligence and Neuroscience 2021 (April 8, 2021): 1–9. http://dx.doi.org/10.1155/2021/9975078.

Full text
Abstract:
Loanword identification has been studied in recent years to alleviate data sparseness in several natural language processing (NLP) tasks, such as machine translation and cross-lingual information retrieval. However, recent studies on this topic usually focus on high-resource languages (such as Chinese, English, and Russian); for low-resource languages such as Uyghur and Mongolian, owing to limited resources and a lack of annotated data, loanword identification tends to have lower performance. To overcome this problem, we first propose a lexical constraint-based data augmentation method to generate training data for low-resource language loanword identification; then, a loanword identification model based on a log-linear RNN is introduced to improve the performance of low-resource loanword identification by incorporating features such as word-level embeddings, character-level embeddings, pronunciation similarity, and part-of-speech (POS) tags into one model. Experimental results on loanword identification in Uyghur (in this study, we mainly focus on Arabic, Chinese, Russian, and Turkish loanwords in Uyghur) showed that our proposed method achieves the best performance compared with several strong baseline systems.
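A hedged sketch of the feature-fusion idea: several similarity features are combined in a single hand-weighted score for a loanword candidate. The features and weights below are toy assumptions, not the paper's learned log-linear RNN:

```python
import difflib

# Two toy features: surface similarity and a crude pronunciation-similarity
# proxy (similarity of de-voweled consonant skeletons). A real system would
# learn the weights and use embeddings and POS features as well.
def surface_sim(word, donor_word):
    return difflib.SequenceMatcher(None, word, donor_word).ratio()

def pron_sim(word, donor_word):
    def skeleton(w):
        return "".join(c for c in w if c not in "aeiou")
    return difflib.SequenceMatcher(None, skeleton(word), skeleton(donor_word)).ratio()

def loanword_score(word, donor_word, weights=(0.6, 0.4)):
    feats = (surface_sim(word, donor_word), pron_sim(word, donor_word))
    return sum(w * f for w, f in zip(weights, feats))

# A candidate close to its (hypothetical) donor form should outscore an
# unrelated word.
high = loanword_score("radio", "radiyo")
low = loanword_score("apple", "radiyo")
```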
APA, Harvard, Vancouver, ISO, and other styles
34

Hemphill, Christy, and Aaron Hemphill. "Maximizing Scalability in Literacy Game App Design for Minority Languages." International Journal of Technology in Education 4, no. 4 (October 1, 2021): 668–80. http://dx.doi.org/10.46328/ijte.138.

Full text
Abstract:
Minority language communities lack access to educational technology that facilitates literacy skill building. The approach currently taken by most educational game app developers privileges widely spoken languages and often requires intensive resource investment. In response, a new game app was designed to provide easily localized, pedagogically appropriate games for literacy skill building. Scalability to multiple minority languages was possible through a programming design based on language packs that could be compiled by local implementation teams without specialized technical skills and without significant resource investment. We describe the scalability issues encountered when localizing the app for the initial ten minority language pilot groups and how a language-neutral app design that relies on language packs to specify language-specific content and parameters can adequately address these issues. When it comes to meeting the demands of growing education technology markets in underserved Indigenous and minority communities, localizing an app initially designed for maximum scalability is more feasible than investing significant resources converting apps custom designed for one language into new languages.
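The language-pack design described above can be sketched as a small JSON document plus a loader with friendly validation, so that a local implementation team without specialized technical skills gets a clear error message. All field names here are our own assumptions:

```python
import json

# A hypothetical language pack: the app ships language-neutral game logic,
# and each community supplies language-specific content and parameters.
PACK_JSON = """
{
  "language": "xx",
  "alphabet": ["a", "b", "ch"],
  "words": [{"text": "chab", "image": "boat.png"}],
  "rtl": false
}
"""

def load_pack(raw):
    pack = json.loads(raw)
    # minimal validation so a pack author gets an actionable error
    for field in ("language", "alphabet", "words"):
        if field not in pack:
            raise ValueError(f"language pack missing required field: {field}")
    return pack

pack = load_pack(PACK_JSON)
```

Listing a multigraph such as "ch" as a single alphabet unit is one example of how a pack can carry language-specific parameters without any change to the game code.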
APA, Harvard, Vancouver, ISO, and other styles
35

Tucker, Benjamin V., Matthew C. Kelley, and Charles Redmon. "A place to share teaching resources: Speech and language resource bank." Journal of the Acoustical Society of America 149, no. 4 (April 2021): A147. http://dx.doi.org/10.1121/10.0005365.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Chen, Siqi, Yijie Pei, Zunwang Ke, and Wushour Silamu. "Low-Resource Named Entity Recognition via the Pre-Training Model." Symmetry 13, no. 5 (May 2, 2021): 786. http://dx.doi.org/10.3390/sym13050786.

Full text
Abstract:
Named entity recognition (NER) is an important task in natural language processing: it must determine entity boundaries and classify entities into pre-defined categories. For low-resource languages, most state-of-the-art systems require tens of thousands of annotated sentences to obtain high performance. However, there is minimal annotated data available for Uyghur and Hungarian (UH languages) NER tasks. Each task also has its specificities; differences in words and word order across languages make it a challenging problem. In this paper, we present an effective solution to providing a meaningful and easy-to-use feature extractor for named entity recognition tasks: fine-tuning the pre-trained language model. We propose a fine-tuning method for a low-resource language model, which constructs a fine-tuning dataset through data augmentation; then the dataset of a high-resource language is added; and finally the cross-language pre-trained model is fine-tuned on this dataset. In addition, we propose an attention-based fine-tuning strategy that uses symmetry to better select relevant semantic and syntactic information from pre-trained language models and applies these symmetry features to named entity recognition tasks. We evaluated our approach on Uyghur and Hungarian datasets, and it showed excellent performance compared to some strong baselines. We close with an overview of the available resources for named entity recognition and some of the open research questions.
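One common way to "construct a fine-tuning dataset through data augmentation", as the abstract puts it, is entity replacement: swap each entity mention for another mention of the same type. The sketch below handles only single-token (B-) mentions and uses an invented lexicon, so it is an illustration of the general move rather than the paper's method:

```python
import random

# Hypothetical per-type entity lexicon; BIO-tagged tokens/labels.
LEXICON = {"PER": ["Alim", "Eszter"], "LOC": ["Urumqi", "Budapest"]}

def augment(tokens, labels, rng):
    # Replace each B-<TYPE> token with a random same-type mention; labels
    # are unchanged. Multi-token mentions would need span-aware handling.
    out_tokens = []
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            ent_type = lab[2:]
            tok = rng.choice(LEXICON.get(ent_type, [tok]))
        out_tokens.append(tok)
    return out_tokens, labels

rng = random.Random(0)
toks, labs = augment(["Alim", "visited", "Urumqi"], ["B-PER", "O", "B-LOC"], rng)
```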
APA, Harvard, Vancouver, ISO, and other styles
37

Bari, M. Saiful, Shafiq Joty, and Prathyusha Jwalapuram. "Zero-Resource Cross-Lingual Named Entity Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7415–23. http://dx.doi.org/10.1609/aaai.v34i05.6237.

Full text
Abstract:
Recently, neural methods have achieved state-of-the-art (SOTA) results in Named Entity Recognition (NER) tasks for many languages without the need for manually crafted features. However, these models still require manually annotated training data, which is not available for many languages. In this paper, we propose an unsupervised cross-lingual NER model that can transfer NER knowledge from one language to another in a completely unsupervised way without relying on any bilingual dictionary or parallel data. Our model achieves this through word-level adversarial learning and augmented fine-tuning with parameter sharing and feature augmentation. Experiments on five different languages demonstrate the effectiveness of our approach, outperforming existing models by a good margin and setting a new SOTA for each language pair.
APA, Harvard, Vancouver, ISO, and other styles
38

Winter, Joanne. "Discourse as a resource." Australian Review of Applied Linguistics 15, no. 1 (January 1, 1992): 1–22. http://dx.doi.org/10.1075/aral.15.1.01win.

Full text
Abstract:
Abstract Language attitudes have frequently been included in investigations of language shift, language maintenance, second language acquisition and bilingualism. Speakers’ attitudes about and towards such language issues contribute toward the planning and provision of language services and education in the speech community. The data gathering methods adopted for the collection of speakers’ language attitudes usually consist of sociolinguistic questionnaires and/or social psychological matched guise experiments. In this paper I will present some exploratory ideas about discourse analysis as a method for the collection and analysis of language attitudes. The data for the investigation is a series of group negotiations among female and male speakers from Anglo-Australian and Greek-Australian backgrounds. The speakers were participating in group ‘negotiations’ discussing various issues of language planning and policy in an Australian context.
APA, Harvard, Vancouver, ISO, and other styles
39

Gillis-Webber, Frances. "Conversion of the English-Xhosa Dictionary for Nurses to a Linguistic Linked Data Framework." Information 9, no. 11 (November 6, 2018): 274. http://dx.doi.org/10.3390/info9110274.

Full text
Abstract:
The English-Xhosa Dictionary for Nurses (EXDN) is a bilingual, unidirectional printed dictionary in the public domain, with English and isiXhosa as the language pair. By extending the digitisation efforts of EXDN from a human-readable digital object to a machine-readable state, using Resource Description Framework (RDF) as the data model, semantically interoperable structured data can be created, thus enabling EXDN’s data to be reused, aggregated and integrated with other language resources, where it can serve as a potential aid in the development of future language resources for isiXhosa, an under-resourced language in South Africa. The methodological guidelines for the construction of a Linguistic Linked Data framework (LLDF) for a lexicographic resource, as applied to EXDN, are described, where an LLDF can be defined as a framework: (1) which describes data in RDF, (2) using a model designed for the representation of linguistic information, (3) which adheres to Linked Data principles, and (4) which supports versioning, allowing for change. The result is a bidirectional lexicographic resource, previously bounded and static, now unbounded and evolving, with the ability to extend to multilingualism.
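In miniature, lifting one dictionary entry into RDF triples might look like the sketch below. A real pipeline would use a library such as rdflib and a lexicon model such as OntoLex-Lemon; the URIs and property names here are invented placeholders, not EXDN's actual vocabulary:

```python
# Hand-rolled triples and Turtle serialization for one bilingual entry.
def entry_to_triples(lemma, lang, translation, trans_lang):
    base = "http://example.org/exdn/"  # placeholder namespace
    s = f"<{base}{lemma}>"
    return [
        (s, "rdfs:label", f'"{lemma}"@{lang}'),
        (s, f"<{base}translation>", f'"{translation}"@{trans_lang}'),
    ]

def to_turtle(triples):
    # one "subject predicate object ." statement per line
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

# "umongikazi" is offered as the isiXhosa side of the pair for illustration.
ttl = to_turtle(entry_to_triples("nurse", "en", "umongikazi", "xh"))
```

The language tags (`@en`, `@xh`) are what make the data semantically interoperable: a consumer can aggregate this with other isiXhosa resources without guessing which string belongs to which language.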
APA, Harvard, Vancouver, ISO, and other styles
40

Xiao, Yubei, Ke Gong, Pan Zhou, Guolin Zheng, Xiaodan Liang, and Liang Lin. "Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 16 (May 18, 2021): 14112–20. http://dx.doi.org/10.1609/aaai.v35i16.17661.

Full text
Abstract:
Low-resource automatic speech recognition (ASR) is challenging, as the limited target-language data cannot adequately train an ASR model. To solve this issue, meta-learning formulates ASR for each source language into many small ASR tasks and meta-learns a model initialization on all tasks from different source languages to enable fast adaptation to unseen target languages. However, the quantity and difficulty of tasks vary greatly across source languages because of their different data scales and diverse phonological systems, which leads to task-quantity and task-difficulty imbalance issues and thus a failure of multilingual meta-learning ASR (MML-ASR). In this work, we solve this problem by developing a novel adversarial meta sampling (AMS) approach to improve MML-ASR. When sampling tasks in MML-ASR, AMS adaptively determines the task sampling probability for each source language. Specifically, for each source language, if the query loss is large, its tasks have not been sampled enough, in terms of their quantity and difficulty, to train the ASR model, and should therefore be sampled more frequently for extra learning. Inspired by this fact, we feed the historical task query losses of all source language domains into a network to learn a task sampling policy that adversarially increases the current query loss of MML-ASR. The learnt policy can thus track the learning situation of each language and predict good task sampling probabilities for more effective learning. Finally, experimental results on two multilingual datasets show significant performance improvements when applying our AMS to MML-ASR, and also demonstrate the applicability of AMS to other low-resource speech tasks and transfer-learning ASR approaches.
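The core sampling idea can be caricatured in a few lines: languages whose tasks currently incur high query loss receive a higher sampling probability. A softmax over recent query losses is our simplifying stand-in for the paper's learned adversarial policy network:

```python
import math

# Map each source language's recent query loss to a sampling probability.
# Higher loss -> sampled more often; `temperature` controls how sharply.
def sampling_probs(query_losses, temperature=1.0):
    exps = {lang: math.exp(loss / temperature)
            for lang, loss in query_losses.items()}
    z = sum(exps.values())
    return {lang: e / z for lang, e in exps.items()}

# Toy losses for three hypothetical source languages.
probs = sampling_probs({"sw": 2.0, "uy": 0.5, "hu": 1.0})
```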
APA, Harvard, Vancouver, ISO, and other styles
41

Andrabi, Syed Abdul Basit, et al. "A Review of Machine Translation for South Asian Low Resource Languages." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 5 (April 10, 2021): 1134–47. http://dx.doi.org/10.17762/turcomat.v12i5.1777.

Full text
Abstract:
Machine translation is an application of natural language processing. Humans use natural languages to communicate with one another, whereas programming languages mediate communication between humans and computers. NLP is the field comprising a broad set of techniques for the analysis, manipulation, and automatic generation of human (natural) languages with the help of computers. In the present information age, it is essential to give people access to information for their development, and it is equally necessary to remove the language barriers between different divisions of society. The area of NLP strives to fill this gap through machine translation, in which one natural language is transformed into another with the aid of computers. The first few years of this area were dedicated to the development of rule-based systems; later, due to the increase in computational power, there was a transition towards statistical machine translation. The motive of machine translation is that the meaning of the translated text should be preserved during translation. This research paper aims to analyse the machine translation approaches used for resource-poor languages and to determine the needs and challenges the researchers face. The paper also reviews the machine translation systems that are available for resource-poor languages.
APA, Harvard, Vancouver, ISO, and other styles
42

Weiwei, Sun. "Research on the Integration of Preschool Language Education Resources Based on Metadata Storage." Journal of Mathematics 2022 (January 31, 2022): 1–8. http://dx.doi.org/10.1155/2022/4802381.

Full text
Abstract:
Aiming at the problems of high redundancy and slow integration speed in existing education resource data integration methods, a new preschool language education resource integration method based on a metadata warehouse is designed. The metadata warehouse is designed, and the advantages of the integrated database are analyzed. On this basis, the sample data of preschool language education resources are classified with the help of a cost matrix, and constraints for the different classification types are set. A data collector for preschool language education resources is built using the random forest algorithm to complete data collection. The data are processed for consistency, and their convergence is calculated by an edge function. Redundant data in the preschool language education resources are then identified and removed, completing the data preprocessing. We determine the dimensional distance between preschool language education resource data and complete their clustering integration with the help of the fuzzy mean clustering algorithm. The experimental results show that the integration method designed in this paper reduces redundancy in the integrated data and integrates the data quickly.
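The final clustering-integration step names fuzzy mean clustering; its two alternating updates (soft memberships, then weighted centroids) can be shown on one-dimensional toy data. This sketch covers only the clustering step, not the rest of the pipeline described above:

```python
# Tiny 1-D fuzzy c-means: membership of point x in cluster i is
# 1 / sum_k (d_i / d_k)^(2/(m-1)); centroids are membership-weighted means.
def fuzzy_cmeans(xs, centers, m=2.0, iters=20):
    u = []
    for _ in range(iters):
        u = []
        for x in xs:
            dists = [abs(x - c) + 1e-9 for c in centers]  # avoid divide-by-zero
            u.append([1.0 / sum((di / dk) ** (2 / (m - 1)) for dk in dists)
                      for di in dists])
        centers = [
            sum((u[j][i] ** m) * xs[j] for j in range(len(xs)))
            / sum(u[j][i] ** m for j in range(len(xs)))
            for i in range(len(centers))
        ]
    return centers, u

# Two obvious 1-D groups; initial centers are deliberately poor guesses.
centers, u = fuzzy_cmeans([1.0, 1.2, 0.9, 8.0, 8.3], [0.0, 10.0])
```

Unlike hard k-means, each point keeps a graded membership in every cluster, which is the property "fuzzy" refers to.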
APA, Harvard, Vancouver, ISO, and other styles
43

Jensson, Arnar Thor, Koji Iwano, and Sadaoki Furui. "Language Model Adaptation Using Machine-Translated Text for Resource-Deficient Languages." EURASIP Journal on Audio, Speech, and Music Processing 2008 (2008): 1–7. http://dx.doi.org/10.1155/2008/573832.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Pudelko, Markus, Tomoki Sekiguchi, and Norihiko Takeuchi. "Language and international human resource management." Japanese Journal of Administrative Science 28, no. 2 (2015): 139–49. http://dx.doi.org/10.5651/jaas.28.139.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Farber, Judith, Mary Ellen Denenberg, Susan Klyman, and Patricia Lachman. "Language Resource Room Level of Service." Language, Speech, and Hearing Services in Schools 23, no. 4 (October 1992): 293–99. http://dx.doi.org/10.1044/0161-1461.2304.293.

Full text
Abstract:
This article describes the Philadelphia School District’s approach for providing an intensive level of language treatment by combining aspects of the traditional itinerant pull-out method with instruction received in the classroom. The speech-language pathologist assumes the roles of co-teacher, consultant, and direct treatment provider. This innovative program allows flexibility of programming and adjusts the level of effort to individual and classroom needs. Students with moderate to severe speech-language disorders are selected on a system-wide basis for this level of service. Initial resistance to the presence of speech-language pathologists in classrooms eases as students’ speech-language performance shows marked improvement. Preliminary data analysis indicates that the Language Resource Room model is a successful adjunct to traditional treatment modes.
APA, Harvard, Vancouver, ISO, and other styles
46

Pietikäinen, Sari. "Discussion: Language in nature resource economies." International Journal of the Sociology of Language 2019, no. 258 (August 27, 2019): 171–76. http://dx.doi.org/10.1515/ijsl-2019-2033.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Shi, Xiayang, Xinyi Liu, Zhenqiang Yu, Pei Cheng, and Chun Xu. "Extracting Parallel Sentences from Low-Resource Language Pairs with Minimal Supervision." Journal of Physics: Conference Series 2171, no. 1 (January 1, 2022): 012044. http://dx.doi.org/10.1088/1742-6596/2171/1/012044.

Full text
Abstract:
At present, machine translation on the market depends on parallel sentence corpora, and the number of parallel sentences affects machine translation performance, especially for low-resource corpora. In recent years, using non-parallel corpora to learn cross-lingual word representations, and thereby obtaining bilingual sentence pairs with low resources and minimal supervision, has opened a new direction. In this paper, we propose a new method. First, we create cross-lingual mappings from small amounts of monolingual data. Then a classifier is constructed to extract bilingual parallel sentence pairs. Finally, we demonstrate the effectiveness of our method on the Uyghur–Chinese low-resource language pair using machine translation and achieve good results.
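The extraction step can be sketched as thresholding a similarity score between cross-lingually embedded sentences. The "embeddings" below are hand-made toy vectors and the non-English sentences are invented tokens; a real system would use mapped word or sentence embeddings and a trained classifier rather than a fixed threshold:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def extract_parallel(pairs, embed, threshold=0.8):
    # keep candidate pairs whose cross-lingual similarity clears the bar
    return [(s, t) for s, t in pairs if cosine(embed(s), embed(t)) >= threshold]

# Toy shared embedding space: pretend both languages map into the same 3 dims.
TOY_SPACE = {
    "hello world": [1.0, 0.1, 0.0], "yaxshimusiz dunya": [0.9, 0.2, 0.0],
    "the cat sleeps": [0.0, 1.0, 0.3], "yaxshimusiz": [0.8, 0.0, 0.1],
}
pairs = [("hello world", "yaxshimusiz dunya"), ("the cat sleeps", "yaxshimusiz")]
kept = extract_parallel(pairs, TOY_SPACE.get)
```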
APA, Harvard, Vancouver, ISO, and other styles
48

Marusenko, Michael A. "National-functional paradigm: increase of language capital as a goal of language policy." Philological Sciences. Scientific Essays of Higher Education, no. 3 (May 2019): 60–67. http://dx.doi.org/10.20339/phs.3-19.060.

Full text
Abstract:
The article discusses changes in linguistic ideology in the period of postmodernism and globalization, which led to a shift in the linguistic paradigm from ethnocultural to national-functional, a paradigm that arose on the basis of the developmental ideology. Language policies always have a “hidden agenda” aimed at creating language hierarchies and marginalizing language communities. In the new paradigm, not only the concepts of language and languages have been called into question, but also many related concepts derived from the notion that languages are discrete, such as language rights, mother tongue, bilingualism, multilingualism, and code switching. The resource-oriented approach to language radically changes attitudes towards languages and language groups. Language management is viewed as analogous to natural resource management, and language policy makers control the learning and use of languages in the same way as is done in business resource-allocation models. Earlier, the spread of English was studied within the framework of the theory of linguistic imperialism; since the last third of the 20th century, instead of that concept, which had heuristic value in the early post-colonial period, the concept of linguistic capital has been used. The historic opportunity of the Russian Federation lies in the use, by the absolute majority of its population, of an endogenous world language. Proficiency in Russian constitutes the main share of the linguistic capital of its speakers.
APA, Harvard, Vancouver, ISO, and other styles
49

Kann, Katharina, Samuel R. Bowman, and Kyunghyun Cho. "Learning to Learn Morphological Inflection for Resource-Poor Languages." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8058–65. http://dx.doi.org/10.1609/aaai.v34i05.6316.

Full text
Abstract:
We propose to cast the task of morphological inflection—mapping a lemma to an indicated inflected form—for resource-poor languages as a meta-learning problem. Treating each language as a separate task, we use data from high-resource source languages to learn a set of model parameters that can serve as a strong initialization point for fine-tuning on a resource-poor target language. Experiments with two model architectures on 29 target languages from 3 families show that our suggested approach outperforms all baselines. In particular, it obtains a 31.7% higher absolute accuracy than a previously proposed cross-lingual transfer model and outperforms the previous state of the art by 1.7% absolute accuracy on average over languages.
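A scalar toy in the spirit of Reptile-style meta-learning conveys the idea: nudge an initialization toward each task's adapted parameters so that a few fine-tuning steps suffice on a new task. Each "language" here is just fitting one number; this is our illustrative stand-in, not the paper's seq2seq inflection model:

```python
# Inner loop: a few gradient steps on one task's loss (theta - target)^2.
def adapt(theta, target, lr=0.3, steps=5):
    for _ in range(steps):
        theta -= lr * 2 * (theta - target)  # gradient of (theta - target)^2
    return theta

# Outer loop: move the initialization toward each task's adapted parameters
# (the Reptile update), cycling over the high-resource source tasks.
def meta_train(targets, theta=0.0, meta_lr=0.5, epochs=50):
    for _ in range(epochs):
        for t in targets:
            theta += meta_lr * (adapt(theta, t) - theta)
    return theta

init = meta_train([4.0, 5.0, 6.0])  # source "languages" cluster around 5
```

From this initialization, five fine-tuning steps land very close to an unseen nearby task, which is the low-resource payoff the abstract describes.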
APA, Harvard, Vancouver, ISO, and other styles
50

C. Priya. "The Role of English Language in Human Resource Management and Employability." International Journal of Scientific Research 3, no. 4 (June 1, 2012): 143–44. http://dx.doi.org/10.15373/22778179/apr2014/218.

Full text
APA, Harvard, Vancouver, ISO, and other styles