Journal articles on the topic 'English language — Named Entities'


1

Chen, Yufeng, Chengqing Zong, and Keh-Yih Su. "A Joint Model to Identify and Align Bilingual Named Entities." Computational Linguistics 39, no. 2 (June 2013): 229–66. http://dx.doi.org/10.1162/coli_a_00122.

Abstract:
In this article, an integrated model is derived that jointly identifies and aligns bilingual named entities (NEs) between Chinese and English. The model is motivated by the following observations: (1) whether an NE is translated semantically or phonetically depends greatly on its entity type, (2) entities within an aligned pair should share the same type, and (3) the initially detected NEs can act as anchors and provide further information while selecting NE candidates. Based on these observations, this article proposes a translation mode ratio feature (defined as the proportion of NE internal tokens that are semantically translated), enforces an entity type consistency constraint, and utilizes additional new NE likelihoods (based on the initially detected NE anchors). Experiments show that this novel method significantly outperforms the baseline. The type-insensitive F-score of identified NE pairs increases from 78.4% to 88.0% (12.2% relative improvement) in our Chinese–English NE alignment task, and the type-sensitive F-score increases from 68.4% to 83.0% (21.3% relative improvement). Furthermore, the proposed model demonstrates its robustness when it is tested across different domains. Finally, when semi-supervised learning is conducted to train the adopted English NE recognition model, the proposed model also significantly boosts the English NE recognition type-sensitive F-score.
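The relative improvements quoted in this abstract can be checked from the reported F-scores alone. The following is a minimal sketch; the function and variable names are illustrative and do not come from the article.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a percentage of `baseline`."""
    return (improved - baseline) / baseline * 100.0


# F-scores (in percent) as reported in the abstract.
type_insensitive = relative_improvement(78.4, 88.0)
type_sensitive = relative_improvement(68.4, 83.0)

print(f"type-insensitive: {type_insensitive:.1f}% relative improvement")
print(f"type-sensitive:   {type_sensitive:.1f}% relative improvement")
```

Both values agree with the 12.2% and 21.3% figures given in the abstract when rounded to one decimal place.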
2

Mahmood, Ahsan, Hikmat Ullah Khan, Zahoor Ur Rehman, Khalid Iqbal, and Ch Muhmmad Shahzad Faisal. "KEFST: a knowledge extraction framework using finite-state transducers." Electronic Library 37, no. 2 (April 1, 2019): 365–84. http://dx.doi.org/10.1108/el-10-2018-0196.

Abstract:
Purpose – The purpose of this research study is to extract and identify named entities from Hadith literature. Named entity recognition (NER) refers to the identification of the named entities in a computer readable text having an annotation of categorization tags for information extraction. NER is an active research area in information management and information retrieval systems. NER serves as a baseline for machines to understand the context of a given content and helps in knowledge extraction. Although NER is considered a solved task in major languages such as English, in languages such as Urdu, NER is still a challenging task. Moreover, NER depends on the language and domain of study; thus, it is gaining the attention of researchers in different domains. Design/methodology/approach – This paper proposes a knowledge extraction framework using finite-state transducers (FSTs) – KEFST – to extract the named entities. KEFST consists of five steps: content extraction, tokenization, part of speech tagging, multi-word detection and NER. An extensive empirical analysis using the data corpus of Urdu translation of Sahih Al-Bukhari, a widely known hadith book, reveals that the proposed method effectively recognizes the entities to obtain better results. Findings – The significant performance in terms of f-measure, precision and recall validates that the proposed model outperforms the existing methods for NER in the relevant literature. Originality/value – This research is novel in that no previous work has been proposed in the Urdu language to extract named entities using FSTs, and no previous work has been proposed for NER on Urdu hadith data.
3

Forouzandeh, Aynaz, Mohammad-Reza Feizi-Derakhshi, and Pejman Gholami-Dastgerdi. "Persian Named Entity Recognition by Gray Wolf Optimization Algorithm." Scientific Programming 2022 (December 10, 2022): 1–12. http://dx.doi.org/10.1155/2022/6368709.

Abstract:
Named entity recognition (NER) is a subfield of natural language processing (NLP). It is able to identify proper nouns, such as person names, locations, and organizations, and has been widely used in various tasks. NER can be practical in extracting information from social media data. However, the unstructured and noisy nature of social media (such as grammatical errors and typos) causes new challenges for NER, especially for low-resource languages such as Persian, and existing NER methods mainly focus on formal texts and English social media. To overcome this challenge, we consider Persian NER as an optimization problem and use the binary Gray Wolf Optimization (GWO) algorithm to segment posts into small possible phrases of named entities. Later, named entities are recognized based on their score. Also, we prove that even human opinion can differ in the NER task and compare our method with other systems with the Sep_TD_Tel01 dataset and the results show that our proposed system obtains a higher F1 score in comparison with other methods.
4

Ekbal, Asif, Sudip Kumar Naskar, and Sivaji Bandyopadhyay. "Named Entity Recognition and transliteration in Bengali." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 30, no. 1 (August 10, 2007): 95–114. http://dx.doi.org/10.1075/li.30.1.07ekb.

Abstract:
The paper reports about the development of a Named Entity Recognition (NER) system in Bengali using a tagged Bengali news corpus and the subsequent transliteration of the recognized Bengali Named Entities (NEs) into English. Three different models of the NER have been developed. A semi-supervised learning method has been adopted to develop the first two models, one without linguistic features (Model A) and the other with linguistic features (Model B). The third one (Model C) is based on statistical Hidden Markov Model. A modified joint-source channel model has been used along with a number of alternatives to generate the English transliterations of Bengali NEs and vice-versa. The transliteration models learn the mappings from the bilingual training sets optionally guided by linguistic knowledge in the form of conjuncts and diphthongs in Bengali and their representations in English. The NER system has demonstrated the highest average Recall, Precision and F-Score values of 89.62%, 78.67% and 83.79% respectively in Model C. Evaluation of the proposed transliteration models demonstrated that the modified joint source-channel model performs best in terms of evaluation metrics for person and location names for both Bengali to English (B2E) transliteration and English to Bengali transliteration (E2B). The use of the linguistic knowledge during training of the transliteration models improves performance.
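The Model C figures in this abstract are mutually consistent if the reported F-score is the balanced F-measure, that is, the harmonic mean of precision and recall. That reading is an assumption, since the abstract does not state the formula. A minimal sketch of the check, with illustrative names:

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Average values (in percent) reported for Model C in the abstract.
recall_c = 89.62
precision_c = 78.67

f_c = f_score(precision_c, recall_c)
print(f"F-score: {f_c:.2f}%")
```

Under that assumption, the computed value rounds to the 83.79% stated in the abstract.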
5

Tarmizi, Shasha Arzila, and Saidah Saad. "NAMED ENTITY RECOGNITION FOR QURANIC TEXT USING RULE BASED APPROACHES." Asia-Pacific Journal of Information Technology and Multimedia 11, no. 02 (December 31, 2022): 112–22. http://dx.doi.org/10.17576/apjitm-2022-0101-08.

Abstract:
The variety and difference between domains for textual data require customization in the Natural Language Processing component especially in Named Entity Recognition where different domains contain several types of entities. The current NER model is deemed not fit to accurately extract entities from Quranic text due to its unique content. This paper describes the building of a rule-based Named Entity Recognition method to extract the entities that exist in the English translation to the meaning of the Quranic text and its performance evaluation. Named entity tagging is a common task in text annotation, in which entities (nouns) in the unstructured text are identified and assigned a class. A few rules are built to extract several types of entities such as the name of prophets and people, creation, location, time, and the various names of God. The rules are built mainly using regular expressions and gazetteers. The rules that have been built result in high precision and recall as well as a satisfactory F-score of over 90%. The results from this experiment can be used as annotation in building a machine learning model to extract entities from the same type of domain specifically on the Quranic text or generally in the Islamic domain text.
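The combination of gazetteers and regular expressions mentioned in this abstract can be illustrated with a small sketch. Everything below is hypothetical: the gazetteer entries, the TIME pattern, the label names, and the example sentence are invented for illustration and are not the authors' rules or data.

```python
import re

# Hypothetical toy gazetteers; the paper's actual lists are not given in the abstract.
GAZETTEERS = {
    "PERSON": ["Moses", "Abraham", "Noah"],
    "LOCATION": ["Egypt", "Mount Sinai"],
}

# Hypothetical regular-expression rule for simple time expressions.
REGEX_RULES = {
    "TIME": re.compile(r"\b(?:six|seven|forty)\s+(?:days|nights|years)\b", re.IGNORECASE),
}


def tag_entities(text):
    """Return (start, end, label, surface) tuples, sorted by position in `text`."""
    spans = []
    # Gazetteer lookup: exact whole-word match of every listed name.
    for label, names in GAZETTEERS.items():
        for name in names:
            for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
                spans.append((m.start(), m.end(), label, m.group()))
    # Pattern rules: each compiled regex contributes spans for its label.
    for label, pattern in REGEX_RULES.items():
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), label, m.group()))
    return sorted(spans)


# Invented example sentence, used only to exercise the rules.
example = "Moses stayed on Mount Sinai for forty nights."
entities = tag_entities(example)
for start, end, label, surface in entities:
    print(f"{label:<8} {surface!r} [{start}:{end}]")
```

A system of the kind described would need far larger gazetteers and some way to resolve overlapping matches, which this sketch does not attempt.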
6

Tarmizi, Shasha Arzila, and Saidah Saad. "Named Entity Recognition For Quranic Text Using Rule Based Approaches." Asia-Pacific Journal of Information Technology and Multimedia 11, no. 02 (December 31, 2022): 112–22. http://dx.doi.org/10.17576/apjitm-2022-1102-09.

Abstract:
The variety and difference between domains for textual data require customization in the Natural Language Processing component especially in Named Entity Recognition where different domains contain several types of entities. The current NER model is deemed not fit to accurately extract entities from Quranic text due to its unique content. This paper describes the building of a rule-based Named Entity Recognition method to extract the entities that exist in the English translation to the meaning of the Quranic text and its performance evaluation. Named entity tagging is a common task in text annotation, in which entities (nouns) in the unstructured text are identified and assigned a class. A few rules are built to extract several types of entities such as the name of prophets and people, creation, location, time, and the various names of God. The rules are built mainly using regular expressions and gazetteers. The rules that have been built result in high precision and recall as well as a satisfactory F-score of over 90%. The results from this experiment can be used as annotation in building a machine learning model to extract entities from the same type of domain specifically on the Quranic text or generally in the Islamic domain text.
7

Boudjellal, Nada, Huaping Zhang, Asif Khan, Arshad Ahmad, Rashid Naseem, Jianyun Shang, and Lin Dai. "ABioNER: A BERT-Based Model for Arabic Biomedical Named-Entity Recognition." Complexity 2021 (March 13, 2021): 1–6. http://dx.doi.org/10.1155/2021/6633213.

Abstract:
The web is being loaded daily with a huge volume of data, mainly unstructured textual data, which increases the need for information extraction and NLP systems significantly. Named-entity recognition task is a key step towards efficiently understanding text data and saving time and effort. Being a widely used language globally, English is taking over most of the research conducted in this field, especially in the biomedical domain. Unlike other languages, Arabic suffers from lack of resources. This work presents a BERT-based model to identify biomedical named entities in the Arabic text data (specifically disease and treatment named entities) that investigates the effectiveness of pretraining a monolingual BERT model with a small-scale biomedical dataset on enhancing the model understanding of Arabic biomedical text. The model performance was compared with two state-of-the-art models (namely, AraBERT and multilingual BERT cased), and it outperformed both models with 85% F1-score.
8

Panibog, A. "Fairytale Precedent Names in English-Language Media Discourse." Studia Philologica 1-2, no. 18-19 (2022): 48–57. http://dx.doi.org/10.28925/2311-2425.2022.1894.

Abstract:
The article considers fairytale precedent names selected from English-language media discourse texts based on cognitive linguistics. Coverage of the fairytale precedent names linguocognitive features was carried out within the framework of conceptual analysis that allowed revealing the connection between linguistic and conceptual structures. The study material includes cited statements containing fairytale anthroponyms posted on Internet sites and in the English Web 2020 data corpus (enTenTen20) of the Sketch Engine application. This corpus is an English corpus of texts collected from the Internet between 2019 and 2021. Based on the analysis of this material, a hypothesis has been proposed that the vast majority of fairytale precedent names that function in English-language media discourse are formed on the analogy principle. The study found that characteristic of media texts is the use of precedent names in metaphorical models which are likened to entities belonging to different conceptual spheres. In this case, the comparison of objects is carried out by the feature joint to both compared entities. In the analyzed material, the metaphor is represented by the models “a PERSON-man is like an ANIMAL-mythonym” and “an OBJECT-plant is like the ANIMAL-mythonym”. In the formation of the fairytale precedent names, the principle of analogy is also used in which two entities belonging to the same conceptual sphere are compared. As a rule, such similarity of a comparative (what is compared) and a correlate (what is compared with) occurs according to the full degree of similarity. The ability to characterize other objects of reality is explained in a prototype aspect of fairytale precedent names namely their similarity as an exemplary class representative to the leading property of the primary referent.
The study results indicated that the analog comparison frequency (87.72%) of the fairytale precedent names is much higher than metaphorical (12.28%), which confirms the proposed hypothesis. Thus, we can conclude that in the modern English-language media discourse the fairytale precedent names are formed mainly on the basis of analogy.
9

Wang, Hao, Lekai Zhou, Jianyong Duan, and Li He. "Cross-Lingual Named Entity Recognition Based on Attention and Adversarial Training." Applied Sciences 13, no. 4 (February 16, 2023): 2548. http://dx.doi.org/10.3390/app13042548.

Abstract:
Named entity recognition aims to extract entities with specific meaning from unstructured text. Currently, deep learning methods have been widely used for this task and have achieved remarkable results, but it is often difficult to achieve better results with less labeled data. To address this problem, this paper proposes a method for cross-lingual entity recognition based on an attention mechanism and adversarial training, using resource-rich language annotation data to migrate to low-resource languages for named entity recognition tasks and outputting changing semantic vectors through the attention mechanism to effectively solve the long-sequence semantic dilution problem. To verify the effectiveness of the proposed method, the method in this paper is applied to the English–Chinese cross-lingual named entity recognition task based on the WeiboNER data set and the People-Daily2004 data set. The obtained F1 value of the optimal model is 53.22% (a 6.29% improvement compared to the baseline). The experimental results show that the cross-lingual adversarial named entity recognition method proposed in this paper can significantly improve the results of named entity recognition in low resource languages.
10

Sboev, Alexander, Roman Rybka, Anton Selivanov, Ivan Moloshnikov, Artem Gryaznov, Alexander Naumov, Sanna Sboeva, Gleb Rylkov, and Soyora Zakirova. "Accuracy Analysis of the End-to-End Extraction of Related Named Entities from Russian Drug Review Texts by Modern Approaches Validated on English Biomedical Corpora." Mathematics 11, no. 2 (January 9, 2023): 354. http://dx.doi.org/10.3390/math11020354.

Abstract:
An extraction of significant information from Internet sources is an important task of pharmacovigilance due to the need for post-clinical drugs monitoring. This research considers the task of end-to-end recognition of pharmaceutically significant named entities and their relations in texts in natural language. The meaning of “end-to-end” is that both of the tasks are performed within a single process on the “raw” text without annotation. The study is based on the current version of the Russian Drug Review Corpus, a dataset of 3800 review texts from the Russian segment of the Internet. Currently, this is the only corpus in the Russian language appropriate for research of the mentioned type. We estimated the accuracy of the recognition of the pharmaceutically significant entities and their relations in two approaches based on neural-network language models. The first core approach is to sequentially solve tasks of named-entities recognition and relation extraction (the sequential approach). The second one solves both tasks simultaneously with a single neural network (the joint approach). The study includes a comparison of both approaches, along with the hyperparameters selection to maximize resulting accuracy. It is shown that both approaches solve the target task at the same level of accuracy: 52–53% macro-averaged F1-score, which is the current level of accuracy for “end-to-end” tasks on the Russian language. Additionally, the paper presents the results for English open datasets ADE and DDI based on the joint approach, and hyperparameter selection for the modern domain-specific language models. The result is that the achieved accuracies of 84.2% (ADE) and 73.3% (DDI) are comparable or better than other published results for the datasets.
11

Mitrofan, Maria, Verginica Barbu Mititelu, and Grigorina Mitrofan. "Towards the Construction of a Gold Standard Biomedical Corpus for the Romanian Language." Data 3, no. 4 (November 23, 2018): 53. http://dx.doi.org/10.3390/data3040053.

Abstract:
Gold standard corpora (GSCs) are essential for the supervised training and evaluation of systems that perform natural language processing (NLP) tasks. Currently, most of the resources used in biomedical NLP tasks are mainly in English. Little effort has been reported for other languages including Romanian and, thus, access to such language resources is poor. In this paper, we present the construction of the first morphologically and terminologically annotated biomedical corpus of the Romanian language (MoNERo), meant to serve as a gold standard for biomedical part-of-speech (POS) tagging and biomedical named entity recognition (bioNER). It contains 14,012 tokens distributed in three medical subdomains: cardiology, diabetes and endocrinology, extracted from books, journals and blogposts. In order to automatically annotate the corpus with POS tags, we used a Romanian tag set which has 715 labels, while diseases, anatomy, procedures and chemicals and drugs labels were manually annotated for bioNER with a Cohen Kappa coefficient of 92.8% and revealed the occurrence of 1877 medical named entities. The automatic annotation of the corpus has been manually checked. The corpus is publicly available and can be used to facilitate the development of NLP algorithms for the Romanian language.
12

MARRERO, M., and J. URBANO. "A Semi-automatic and low-cost method to learn patterns for named entity recognition." Natural Language Engineering 24, no. 1 (June 15, 2017): 39–75. http://dx.doi.org/10.1017/s135132491700016x.

Abstract:
Named Entity Recognition is a basic task in Information Extraction that aims at identifying entities of interest within full text documents. The patterns used to recognize entities can be rule based, as in the popular JAPE system. However, hand-crafting effective patterns is often difficult, and yet there is little research devoted to methods capable of learning human-readable patterns, possibly with arbitrary sets of features. In this paper, we present a semi-automatic method to generate both regular expressions and a subset of the JAPE language. It does not need a corpus annotated beforehand. Instead, it employs active learning and combines clustering with an algorithm that finds alignments between symbols present in the entities discovered during the learning process. The method currently supports a fixed set of character features and an arbitrary set of token features, but it can incorporate other kinds of features as well. Through several experiments with an English corpus, we show the ability of the method to generate effective patterns at a low annotation cost, and how it can successfully help in the annotation of brand new corpora.
13

Nasharuddin, Nurul Amelina, Muhamad Taufik Abdullah, Azreen Azman, Rabiah Abdul Kadir, and Enrique Herrera-Viedma. "Feature-based Similarity Method for Aligning the Malay and English News Document." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 11, no. 4 (October 15, 2013): 2410–21. http://dx.doi.org/10.24297/ijct.v11i4.3125.

Abstract:
Corpus-based translation approach can be used to obtain reliable translation knowledge in addition to the use of dictionaries or machine translation. But the availability of such corpus is very limited especially for the low-resource languages. Many works have been reported for the alignments of multilingual documents especially among the European languages, but less focusing on the languages with less linguistics resources. One of the challenges is to align the available multilingual documents for the creation of comparable corpus for these kinds of languages. This article describes an alignment method that utilized the statistical features of the documents such as the documents’ titles, texts of the contents, and also the named entities present in each document. This method will be focusing on the English and Malay news documents, in which the Malay language is considered as a low-resource language. Source and target documents were then compared in a pair. Accuracy, precision, and recall measurements were used in evaluating the results with the inclusion of three relevance scales; Same story, Shared aspect and Unrelated, to assess the alignment pairs. The results indicate that the method performed well in aligning the news documents with the accuracy of 96% and average precision of 81%.
14

Sun, Yu, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, and Haifeng Wang. "ERNIE 2.0: A Continual Pre-Training Framework for Language Understanding." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8968–75. http://dx.doi.org/10.1609/aaai.v34i05.6428.

Abstract:
Recently pre-trained models have achieved state-of-the-art results in various language understanding tasks. Current pre-training procedures usually focus on training the model with several simple tasks to grasp the co-occurrence of words or sentences. However, besides co-occurring information, there exists other valuable lexical, syntactic and semantic information in training corpora, such as named entities, semantic closeness and discourse relations. In order to extract the lexical, syntactic and semantic information from training corpora, we propose a continual pre-training framework named ERNIE 2.0 which incrementally builds pre-training tasks and then learns pre-trained models on these constructed tasks via continual multi-task learning. Based on this framework, we construct several tasks and train the ERNIE 2.0 model to capture lexical, syntactic and semantic aspects of information in the training data. Experimental results demonstrate that ERNIE 2.0 model outperforms BERT and XLNet on 16 tasks including English tasks on GLUE benchmarks and several similar tasks in Chinese. The source codes and pre-trained models have been released at https://github.com/PaddlePaddle/ERNIE.
15

Lefever, Els, Marjan Van de Kauter, and Véronique Hoste. "HypoTerm." Terminology 20, no. 2 (October 31, 2014): 250–78. http://dx.doi.org/10.1075/term.20.2.06lef.

Abstract:
HypoTerm is a data-driven semantic relation finder that starts from a list of automatically extracted domain- and user-specific terms from technical corpora, and generates a list of relations between these terms. This research study focused on the detection of hypernym relations between relevant terms and named entities. In order to detect all relevant hypernym relations in technical texts, we combined a lexico-syntactic pattern-based approach and a morpho-syntactic analyzer. To evaluate our relation finder, we constructed and manually annotated gold standard data for the dredging and financial domain in Dutch and English. The experimental results show that the HypoTerm system achieves high precision and recall figures for technical texts when starting from valid domain-specific terms and named entities. Thanks to this data-driven approach, it is possible to take an important step from terminology to concept extraction without using any external lexico-semantic resources.
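A lexico-syntactic pattern for hypernym detection can be sketched with a single rule of the form "X such as Y1, Y2 and Y3". This rule is a commonly used textbook pattern chosen here for illustration; the abstract does not list HypoTerm's actual patterns, and the regular expression, function name, and example sentence below are all hypothetical.

```python
import re

# One illustrative pattern: "<hypernym> such as <hyponym>[, <hyponym>]* [and|or <hyponym>]".
# Single-word terms only; the Oxford comma ("a, b, and c") is not handled.
SUCH_AS = re.compile(
    r"(\w+),? such as (\w+(?:, \w+)*(?:,? (?:and|or) \w+)?)"
)


def extract_hypernym_pairs(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        # Split the enumerated hyponyms on commas and on "and"/"or".
        hyponyms = re.split(r",\s*|\s+(?:and|or)\s+", match.group(2))
        pairs.extend((hyponym, hypernym) for hyponym in hyponyms if hyponym)
    return pairs


# Invented example sentence in the spirit of the dredging domain.
sentence = "The port uses vessels such as dredgers, barges and tugs."
pairs = extract_hypernym_pairs(sentence)
print(pairs)
```

A pattern like this works on surface strings only; the abstract describes combining patterns with a morpho-syntactic analyzer, which this sketch leaves out.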
16

Shi, Xue, Yingping Yi, Ying Xiong, Buzhou Tang, Qingcai Chen, Xiaolong Wang, Zongcheng Ji, Yaoyun Zhang, and Hua Xu. "Extracting entities with attributes in clinical text via joint deep learning." Journal of the American Medical Informatics Association 26, no. 12 (September 24, 2019): 1584–91. http://dx.doi.org/10.1093/jamia/ocz158.

Abstract:
Objective – Extracting clinical entities and their attributes is a fundamental task of natural language processing (NLP) in the medical domain. This task is typically recognized as 2 sequential subtasks in a pipeline, clinical entity or attribute recognition followed by entity-attribute relation extraction. One problem of pipeline methods is that errors from entity recognition are unavoidably passed to relation extraction. We propose a novel joint deep learning method to recognize clinical entities or attributes and extract entity-attribute relations simultaneously. Materials and Methods – The proposed method integrates 2 state-of-the-art methods for named entity recognition and relation extraction, namely bidirectional long short-term memory with conditional random field and bidirectional long short-term memory, into a unified framework. In this method, relation constraints between clinical entities and attributes and weights of the 2 subtasks are also considered simultaneously. We compare the method with other related methods (ie, pipeline methods and other joint deep learning methods) on an existing English corpus from SemEval-2015 and a newly developed Chinese corpus. Results – Our proposed method achieves the best F1 of 74.46% on entity recognition and the best F1 of 50.21% on relation extraction on the English corpus, and 89.32% and 88.13% on the Chinese corpora, respectively, which outperform the other methods on both tasks. Conclusions – The joint deep learning–based method could improve both entity recognition and relation extraction from clinical text in both English and Chinese, indicating that the approach is promising.
17

Liu, Jingang, Chunhe Xia, Haihua Yan, and Wenjing Xu. "Innovative Deep Neural Network Modeling for Fine-Grained Chinese Entity Recognition." Electronics 9, no. 6 (June 15, 2020): 1001. http://dx.doi.org/10.3390/electronics9061001.

Abstract:
Named entity recognition (NER) is a basic but crucial task in the field of natural language processing (NLP) and big data analysis. The recognition of named entities based on Chinese is more complicated and difficult than English, which makes the task of NER in Chinese more challenging. In particular, fine-grained named entity recognition is more challenging than traditional named entity recognition tasks, mainly because fine-grained tasks have higher requirements for the ability of automatic feature extraction and information representation of deep neural models. In this paper, we propose an innovative neural network model named En2BiLSTM-CRF to improve the effect of fine-grained Chinese entity recognition tasks. This proposed model including the initial encoding layer, the enhanced encoding layer, and the decoding layer combines the advantages of pre-training model encoding, dual bidirectional long short-term memory (BiLSTM) networks, and a residual connection mechanism. Hence, it can encode information multiple times and extract contextual features hierarchically. We conducted sufficient experiments on two representative datasets using multiple important metrics and compared them with other advanced baselines. We present promising results showing that our proposed En2BiLSTM-CRF has better performance as well as better generalization ability in both fine-grained and coarse-grained Chinese entity recognition tasks.
18

Bao, Ping, and Suoling Zhu. "System design for location name recognition in ancient local chronicles." Library Hi Tech 32, no. 2 (June 10, 2014): 276–84. http://dx.doi.org/10.1108/lht-07-2013-0101.

Abstract:
Purpose – The purpose of this paper is to present a system for recognition of location names in ancient books written in languages, such as Chinese, in which proper names are not signaled by an initial capital letter. Design/methodology/approach – Rule-based and statistical methods were combined to develop a set of rules for identification of product-related location names in the local chronicles of Guangdong. A name recognition system, with functions of document management, information extraction and storage, rule management, location name recognition, and inquiry and statistics, was developed using Microsoft's .NET framework, SQL Server 2005, ADO.NET and XML. The system was evaluated with precision ratio, recall ratio and the comprehensive index, F. Findings – The system was quite successful at recognizing product-related location names (F was 71.8 percent), demonstrating the potential for application of automatic named entity recognition techniques in digital collation of ancient books such as local chronicles. Research limitations/implications – Results suffered from limitations in initial digitization of the text. Statistical methods, such as the hidden Markov model, should be combined with an extended set of recognition rules to improve recognition scores and system efficiency. Practical implications – Electronic access to local chronicles by location name saves time for chorographers and provides researchers with new opportunities. Social implications – Named entity recognition brings previously isolated ancient documents together in a knowledge base of scholarly and cultural value. Originality/value – Automatic name recognition can be implemented in information extraction from ancient books in languages other than English. The system described here can also be adapted to modern texts and other named entities.
19

Han, Xiaoyu, Yue Zhang, Wenkai Zhang, and Tinglei Huang. "An Attention-Based Model Using Character Composition of Entities in Chinese Relation Extraction." Information 11, no. 2 (January 31, 2020): 79. http://dx.doi.org/10.3390/info11020079.

Abstract:
Relation extraction is a vital task in natural language processing. It aims to identify the relationship between two specified entities in a sentence. Besides information contained in the sentence, additional information about the entities is verified to be helpful in relation extraction. Additional information such as entity type getting by NER (Named Entity Recognition) and description provided by knowledge base both have their limitations. Nevertheless, there exists another way to provide additional information which can overcome these limitations in Chinese relation extraction. As Chinese characters usually have explicit meanings and can carry more information than English letters. We suggest that characters that constitute the entities can provide additional information which is helpful for the relation extraction task, especially in large scale datasets. This assumption has never been verified before. The main obstacle is the lack of large-scale Chinese relation datasets. In this paper, first, we generate a large scale Chinese relation extraction dataset based on a Chinese encyclopedia. Second, we propose an attention-based model using the characters that compose the entities. The result on the generated dataset shows that these characters can provide useful information for the Chinese relation extraction task. By using this information, the attention mechanism we used can recognize the crucial part of the sentence that can express the relation. The proposed model outperforms other baseline models on our Chinese relation extraction dataset.
20

He, Chunhui, Zhen Tan, Haoran Wang, Chong Zhang, Yanli Hu, and Bin Ge. "Open Domain Chinese Triples Hierarchical Extraction Method." Applied Sciences 10, no. 14 (July 14, 2020): 4819. http://dx.doi.org/10.3390/app10144819.

Abstract:
Open domain relation prediction is an important task in triples extraction. When faced with the task of constructing large-scale knowledge graph systems, with the exception of structured data, it is necessary to automatically extract triples from a large amount of unstructured text to expand entities and relations. Although a large number of English open relation prediction methods have achieved good performance, the high-performance system for open domain Chinese triples extraction remains undeveloped due to the lack of large-scale Chinese annotation corpora and the difficulty of Chinese language processing. In this paper, we propose an integrated open domain Chinese triples hierarchical extraction method (CTHE) to solve this problem, considering the advantages of Bi-LSTM-CRF and Att-Bi-GRU models based on the pre-trained BERT encoding model. This method can recognize the named entities from Chinese sentences to establish entity pairs, and implement hierarchical extraction of specific and open relations based on the user-defined schema library and attention mechanism. The experimental results demonstrate the effectiveness of this method, which achieved stable performance on the test dataset, and better precision and F1-score in comparison with state-of-the-art Chinese open domain triples extraction methods. Furthermore, a large-scale annotated dataset for a Chinese named entity recognition (NER) task is established, which provides support for research on Chinese NER tasks.
21

Gusev, Daniil, and Zinaida Apanovich. "METHODS OF PROCESSING TEXTUAL INFORMATION IN ENTITY ALIGNMENT ALGORITHMS." Bulletin of the Novosibirsk Computing Center. Series: Computer Science, no. 45 (2021): 49–58. http://dx.doi.org/10.31144/bncc.cs.2542-1972.2021.n45.p49-58.

Abstract:
Entity alignment algorithms aim to find equivalent entities in cross-lingual knowledge graphs, which is important for the task of obtaining information about real-world objects. Recently, several studies have been conducted on entity alignment algorithms on various datasets. Algorithms using information about entity names have shown a wide range of results. In this paper, we have conducted a study of this phenomenon. Work has been done to improve the quality of matching cross-language entity names in vector space. Also, experiments with the modern models of processing natural languages have been carried out. The information obtained has led to a significant increase in the accuracy of entity alignment on the English-Russian dataset.
22

Silvestri, Stefano, Francesco Gargiulo, and Mario Ciampi. "Iterative Annotation of Biomedical NER Corpora with Deep Neural Networks and Knowledge Bases." Applied Sciences 12, no. 12 (June 7, 2022): 5775. http://dx.doi.org/10.3390/app12125775.

Abstract:
The large availability of clinical natural language documents, such as clinical narratives or diagnoses, requires the definition of smart automatic systems for their processing and analysis, but the lack of annotated corpora in the biomedical domain, especially in languages different from English, makes it difficult to exploit the state-of-art machine-learning systems to extract information from such kinds of documents. For these reasons, healthcare professionals lose big opportunities that can arise from the analysis of this data. In this paper, we propose a methodology to reduce the manual efforts needed to annotate a biomedical named entity recognition (B-NER) corpus, exploiting both active learning and distant supervision, respectively based on deep learning models (e.g., Bi-LSTM, word2vec FastText, ELMo and BERT) and biomedical knowledge bases, in order to speed up the annotation task and limit class imbalance issues. We assessed this approach by creating an Italian-language electronic health record corpus annotated with biomedical domain entities in a small fraction of the time required for a fully manual annotation. The obtained corpus was used to train a B-NER deep neural network whose performances are comparable with the state of the art, with an F1-Score equal to 0.9661 and 0.8875 on two test sets.
23

Garg, Kamal Deep, Shashi Shekhar, Ajit Kumar, Vishal Goyal, Bhisham Sharma, Rajeswari Chengoden, and Gautam Srivastava. "Framework for Handling Rare Word Problems in Neural Machine Translation System Using Multi-Word Expressions." Applied Sciences 12, no. 21 (October 31, 2022): 11038. http://dx.doi.org/10.3390/app122111038.

Abstract:
Neural machine translation (NMT) is an ongoing technique used to implement machine translation (MT) systems. Natural language processing (NLP) researchers have shown that NMT systems are unable to deal with out-of-vocabulary (OOV) words and multi-word expressions (MWEs) in the text. OOV words are those that are not part of the current vocabulary of the NMT system. MWEs are phrases that consist of a minimum of two terms but are treated as a single unit. MWEs have great importance in NLP, linguistic theory, and MT systems. In this article, OOV words and MWEs are handled for the Punjabi to English NMT system. A parallel corpus for Punjabi to English containing MWEs was developed and used to train the different models of NMT. Punjabi is a low-resource language as it lacks the availability of a large parallel corpus for building various NLP tools, and this is an attempt to improve the accuracy of Punjabi in the English NMT system by using named entities and MWEs in the corpus. The developed NMT models were assessed using human evaluation through adequacy and fluency as well as automated assessment tools such as the bilingual evaluation study (BLEU) and translation error rate (TER) score. Results show that using word embedding (WE) and MWEs corpus increased the accuracy of translation for the Punjabi to English language pair. The best BLEU score obtained was 15.45 for the small test set, 43.32 for the medium test set, and 34.5 for the large test set, respectively. The best TER rate score obtained was 57.34% for the small test set, 37.29% for the medium test set, and 53.79% for the large test set, respectively.
24

Zhang, Yuhao, Yuhui Zhang, Peng Qi, Christopher D. Manning, and Curtis P. Langlotz. "Biomedical and clinical English model packages for the Stanza Python NLP library." Journal of the American Medical Informatics Association 28, no. 9 (June 22, 2021): 1892–99. http://dx.doi.org/10.1093/jamia/ocab090.

Abstract:
Objective: The study sought to develop and evaluate neural natural language processing (NLP) packages for the syntactic analysis and named entity recognition of biomedical and clinical English text. Materials and Methods: We implement and train biomedical and clinical English NLP pipelines by extending the widely used Stanza library originally designed for general NLP tasks. Our models are trained with a mix of public datasets such as the CRAFT treebank as well as with a private corpus of radiology reports annotated with 5 radiology-domain entities. The resulting pipelines are fully based on neural networks, and are able to perform tokenization, part-of-speech tagging, lemmatization, dependency parsing, and named entity recognition for both biomedical and clinical text. We compare our systems against popular open-source NLP libraries such as CoreNLP and scispaCy, state-of-the-art models such as the BioBERT models, and winning systems from the BioNLP CRAFT shared task. Results: For syntactic analysis, our systems achieve much better performance compared with the released scispaCy models and CoreNLP models retrained on the same treebanks, and are on par with the winning system from the CRAFT shared task. For NER, our systems substantially outperform scispaCy, and are better or on par with the state-of-the-art performance from BioBERT, while being much more computationally efficient. Conclusions: We introduce biomedical and clinical NLP packages built for the Stanza library. These packages offer performance that is similar to the state of the art, and are also optimized for ease of use. To facilitate research, we make all our models publicly available. We also provide an online demonstration (http://stanza.run/bio).
25

Sakhovskiy, Andrey Sergeyevich, and Elena Viktorovna Tutubalina. "Cross-lingual transfer learning in drug-related information extraction from user-generated texts." Proceedings of the Institute for System Programming of the RAS 33, no. 6 (2021): 217–28. http://dx.doi.org/10.15514/ispras-2021-33(6)-15.

Abstract:
Aggregating knowledge about drug, disease, and drug reaction entities across a broader range of domains and languages is critical for information extraction (IE) applications. In this work, we present a fine-grained evaluation intended to understand the efficiency of multilingual BERT-based models for biomedical named entity recognition (NER) and multi-label sentence classification tasks. We investigate the role of transfer learning (TL) strategies between two English corpora and a novel annotated corpus of Russian reviews about drug therapy. Labels for sentences include health-related issues or their absence. The sentences with one are additionally labelled at the expression level to identify fine-grained subtypes such as drug names, drug indications, and drug reactions. Evaluation results demonstrate that BERT trained on Russian and English raw reviews (5M in total) shows the best transfer capabilities on evaluation of adverse drug reactions on Russian data. The macro F1 score of 74.85% in the NER task was achieved by our RuDR-BERT model. For the classification task, our EnRuDR-BERT model achieves the macro F1 score of 70%, gaining 8.64% over the score of a general domain BERT model.
26

Silva, Rita, Vera Cabarrão, and Sara Mendes. "Anotação de Entidades Mencionadas na área do Gaming." Revista da Associação Portuguesa de Linguística, no. 9 (October 25, 2022): 223–35. http://dx.doi.org/10.26334/2183-9077/rapln9ano2022a15.

Abstract:
This paper aims to analyse the effects of including gaming entities in the performance of the NER system, for the English language and in a machine translation industrial context of customer support content. To identify and classify gaming entities (by the Named Entity Recognition (NER) model), three new categories were created and added to the already used annotation typology: GAME NAME, GAME FEATURE and GAME CURRENCY. A set of reference annotations (gold standard) was also developed, allowing not only the training of the NER system but also the evaluation of its performance and accuracy in a more objective way, namely by counting the number of entities that the system identifies and categorises correctly. In the scope of this work, 6618 sentences from 7 gaming clients were manually annotated, constituting the gold standard which was then used to train and evaluate the NER system. The objective of the experiments was to assess whether the existing NER system improved its performance when trained with the gold standard created specifically for the gaming domain and if it could handle the new gaming categories added to the typology by identifying and categorizing them correctly. The results of both experiments were auspicious and positive, demonstrating the relevance of greater investment in domain-specific entity recognition, namely in the context of customer service text processing.
27

Maharani, Amalina, and Emy Sudarwati. "“PUBLISH OR PERISH” : JAVANESE LANGUAGE MAINTENANCE ON JAVANESE-ENGLISH CODE SWITCHING SONG." Lire Journal (Journal of Linguistics and Literature) 5, no. 2 (June 30, 2021): 150–67. http://dx.doi.org/10.33019/lire.v5i2.118.

Abstract:
This descriptive qualitative study aims to shed new light on Javanese language maintenance through the practice of English-Javanese code-switching reflected in a song entitled Lathi by Weird Genius feat Sara Fajira. The intrinsic merit of the song 'Lathi,' covering cultural values, song lyrics significance, and the song's moral message, were deliberately discussed here. The data are taken from interview transcripts, observation, and documentation. The data were analyzed by first classifying the Lathi song lyrics into types of code switching, investigating the youths’ perception regarding the used of Javanese English code switching in Lathi song, and analyze the aspects of the songs highlighted the idea of Javanese language maintenance. The findings of this study suggest that the phenomenon of code-switching in Lathi songs is deliberately done to keep maintaining Javanese's existence as one of the popular vernacular in Indonesia. Language maintenance of the Javanese language in a song named Lathi can pique the public's interest in learning Javanese by creating Javanese language maintenance represented in its song lyrics. It makes the Javanese language gain popularity in the community, particularly among students and young people. It is, of course, a good sign of minimizing the threat of language shift. The continuous use of the local language as a language maintenance effort will avoid losing the community's first language.
28

Liang, Li-Xin, Lin Lin, E. Lin, Wu-Shao Wen, and Guo-Yan Huang. "A Joint Learning Model to Extract Entities and Relations for Chinese Literature Based on Self-Attention." Mathematics 10, no. 13 (June 24, 2022): 2216. http://dx.doi.org/10.3390/math10132216.

Abstract:
Extracting structured information from massive and heterogeneous text is a hot research topic in the field of natural language processing. It includes two key technologies: named entity recognition (NER) and relation extraction (RE). However, previous NER models consider less about the influence of mutual attention between words in the text on the prediction of entity labels, and there is less research on how to more fully extract sentence information for relational classification. In addition, previous research treats NER and RE as a pipeline of two separated tasks, which neglects the connection between them, and is mainly focused on the English corpus. In this paper, based on the self-attention mechanism, bidirectional long short-term memory (BiLSTM) neural network and conditional random field (CRF) model, we put forth a Chinese NER method based on BiLSTM-Self-Attention-CRF and a RE method based on BiLSTM-Multilevel-Attention in the field of Chinese literature. In particular, considering the relationship between these two tasks in terms of word vector and context feature representation in the neural network model, we put forth a joint learning method for NER and RE tasks based on the same underlying module, which jointly updates the parameters of the shared module during the training of these two tasks. For performance evaluation, we make use of the largest Chinese data set containing these two tasks. Experimental results show that the proposed independently trained NER and RE models achieve better performance than all previous methods, and our joint NER-RE training model outperforms the independently-trained NER and RE model.
29

Guo, Weiwei. "Correlation between the Dissemination of Classic English Literary Works and Cultural Cognition in the New Media Era." Advances in Multimedia 2022 (July 20, 2022): 1–9. http://dx.doi.org/10.1155/2022/3616432.

Abstract:
With the continuous development of new media technology, the spiritual needs of the masses have been greatly satisfied and the aesthetic ability has also been significantly improved compared with the past. From the current point of view, “literary works,” as the spiritual food of contemporary people, are promoting social spirit. The use of natural language processing and knowledge graph technology can improve cultural cognition to promote the dissemination and development of classic English literature, which has become a necessary means of dissemination of classic English literature. Most of the existing classic English literary works are appreciated based on modern literature datasets. Nowadays, with the continuous development of new media technology, there are fewer studies on the dissemination and cultural cognition of classic English literary works. This makes it impossible for readers to obtain cultural cognition from classic English literary works, making it difficult for the dissemination and development of classic English literary works. In view of the above problems, using natural language processing and knowledge graph technology, taking Shakespeare's play “Hamlet” represented by classic English literary works as an example, the research on the construction method of knowledge graph is carried out and the cultural characteristics in literary works are extracted and analyzed. In parsing, a bidirectional gated recurrent unit network model based on hybrid character embedding is proposed. Based on n-gram embedding, by combining pretraining embedding and radical embedding, it can fully consider the rich semantic information in English literature works to extract. Feature: in terms of named entity recognition, based on the existing iterative atrous convolutional network model, an iterative atrous convolutional network model is proposed. 
To get the best sequence label and get the last labeled entity information, in terms of knowledge graph construction and visual query, a workflow method for building knowledge graph from unstructured text is proposed and a flask-based knowledge graph visual query system is designed, which applies the best model of the above two tasks. We decode the complete “Hamlet” text, extract entities and their semantic links as nodes and relationships in the knowledge graph, store knowledge through the graph database, and finally form a visual query system that combines the front and back end.
30

Al-Hamly, Mashael A., and Mohammed Farghal. "The translation of proper nouns into Arabic." Babel. Revue internationale de la traduction / International Journal of Translation 61, no. 4 (December 31, 2015): 511–26. http://dx.doi.org/10.1075/babel.61.4.04alh.

Abstract:
This paper aims to explore the strategies that translators adopt when rendering English proper nouns into Arabic and, consequently, offer both qualitative and quantitative insights into this process. It is a case study of proper nouns in professional Arabic translation based on one English novel (The White Tiger by Aravind Adiga (2008); translated into Arabic by Taiba Sadeq (2011). Proper nouns are categorized and analyzed in terms of internal syntactic structure (Central, Converted and Extended proper nouns), as well as thematically (e.g. personal names, names of institutions, bodies of water, etc.), with an eye to establishing correlations between the type of proper noun and the translation strategy opted for. The results indicate that the translator’s choice between different strategies is governed by two main factors. Firstly, the translator needs to check whether the proper noun individualizes entities by means of ordinary language predicates (e.g. common nouns), proper nouns proper, or a combination of both, as each type usually requires a different strategy. Secondly, the translator needs to pay attention to the degree of comprehensibility and naturalness of his/her rendering, which may necessitate consolidating the single strategies of transliteration and translation with addition in the form of a generic word or even substitution in the case of idiomatic proper nouns. The paper concludes that proper nouns cannot be treated uniformly in translation between English and Arabic because they belong to different categories and, consequently, they may require different translation strategies including transliteration, complete translation, partial translation, transliteration plus addition, and translation plus addition.
31

Li, Yongbin, Xiaohua Wang, Linhu Hui, Liping Zou, Hongjin Li, Luo Xu, and Weihai Liu. "Chinese Clinical Named Entity Recognition in Electronic Medical Records: Development of a Lattice Long Short-Term Memory Model With Contextualized Character Representations." JMIR Medical Informatics 8, no. 9 (September 4, 2020): e19848. http://dx.doi.org/10.2196/19848.

Abstract:
Background: Clinical named entity recognition (CNER), whose goal is to automatically identify clinical entities in electronic medical records (EMRs), is an important research direction of clinical text data mining and information extraction. The promotion of CNER can provide support for clinical decision making and medical knowledge base construction, which could then improve overall medical quality. Compared with English CNER, and due to the complexity of Chinese word segmentation and grammar, Chinese CNER was implemented later and is more challenging. Objective: With the development of distributed representation and deep learning, a series of models have been applied in Chinese CNER. Different from the English version, Chinese CNER is mainly divided into character-based and word-based methods that cannot make comprehensive use of EMR information and cannot solve the problem of ambiguity in word representation. Methods: In this paper, we propose a lattice long short-term memory (LSTM) model combined with a variant contextualized character representation and a conditional random field (CRF) layer for Chinese CNER: the Embeddings from Language Models (ELMo)-lattice-LSTM-CRF model. The lattice LSTM model can effectively utilize the information from characters and words in Chinese EMRs; in addition, the variant ELMo model uses Chinese characters as input instead of the character-encoding layer of the ELMo model, so as to learn domain-specific contextualized character embeddings. Results: We evaluated our method using two Chinese CNER datasets from the China Conference on Knowledge Graph and Semantic Computing (CCKS): the CCKS-2017 CNER dataset and the CCKS-2019 CNER dataset. We obtained F1 scores of 90.13% and 85.02% on the test sets of these two datasets, respectively. Conclusions: Our results show that our proposed method is effective in Chinese CNER. In addition, the results of our experiments show that variant contextualized character representations can significantly improve the performance of the model.
32

Ong, Kenneth Keng Wee, Jean François Ghesquière, and Stefan Karl Serwe. "Frenglish shop signs in Singapore." English Today 29, no. 3 (August 15, 2013): 19–25. http://dx.doi.org/10.1017/s0266078413000278.

Abstract:
The presence of French in advertising communication within largely non-French speaking communities has been noted by a few linguists. Haarmann (1984, 1989) found that French is used in Japanese advertisements as ethno-cultural hieroglyphs which connote refinement, poshness, style and tastefulness – stereotypes of France and French culture. The unintelligibility of French to Japanese patrons is perceived as a non-issue, as social or symbolic meanings are deemed to be more vital to attract patrons than denotational meanings. A parallel case was found in British advertisements of food, fashion and beauty businesses where French symbolism or linguistic fetish is seen as attractive to largely non-French, English-speaking patrons (Kelly-Holmes, 2005). Notably, French symbolic meanings are sometimes accompanied by elaborative messages in English. Kelly-Holmes (2005) noted that English is used only where message comprehension is important for explicit communication. Curtin (2009) documented the fact that ‘vogue’ or ‘display’ French shop names favored by high-end restaurants and beauty salons in Taipei occurred concomitantly with vogue English. Vogue English is relatively more ubiquitous across the city's linguistic landscape due to its connotations being exploited in a wide span of applications vis-à-vis the chic prestige of French, which is tied to food, beauty and fashion businesses. The Taipei case shows that non-idiomatic French is employed as a socio-commercial accessory, similar to the case of decorative English used in Japan (Dougill, 1987) and in Milan, Italy (Ross, 1997). However, a more recent study on Tokyo shop signs gleaned linguistic patterns other than vogue English and vogue French (MacGregor, 2003), such as French + Japanese and English + French + Japanese. A recent study by Serwe et al. (in press) found that French and French-like shop names are increasingly in currency, with local shop owners keen to stand out and appeal to the increasingly cosmopolitan and sophisticated clientele in Singapore, who are nevertheless overwhelmingly non-French speaking. They further found that French and French-inspired shop signs of food businesses can be classified into four categories, namely, monolingual French, French + another language, French function words + another language, and coinages, noting that there are idiomatic usages and non-idiomatic usages in the first three categories. In this paper, we throw the spotlight on coinages, which we argue are mostly explicable as French-English code-switched blends. We focus on localized nominal concoctions used by shop owners across food and beauty commercial entities within Singapore. We borrowed the term ‘Frenglish’ from Martin's (2007) study to refer to the French-English blends. However, we noted that Martin's study focused on the use of English in advertising communication in France, where English is the minority language that is largely sidelined by the Toubon Law. Contrastively, English in Singapore is de facto the national language, while French is a foreign language with few speakers.
33

Baker, Kathryn, Michael Bloodgood, Bonnie J. Dorr, Chris Callison-Burch, Nathaniel W. Filardo, Christine Piatko, Lori Levin, and Scott Miller. "Modality and Negation in SIMT Use of Modality and Negation in Semantically-Informed Syntactic MT." Computational Linguistics 38, no. 2 (June 2012): 411–38. http://dx.doi.org/10.1162/coli_a_00099.

Abstract:
This article describes the resource- and system-building efforts of an 8-week Johns Hopkins University Human Language Technology Center of Excellence Summer Camp for Applied Language Exploration (SCALE-2009) on Semantically Informed Machine Translation (SIMT). We describe a new modality/negation (MN) annotation scheme, the creation of a (publicly available) MN lexicon, and two automated MN taggers that we built using the annotation scheme and lexicon. Our annotation scheme isolates three components of modality and negation: a trigger (a word that conveys modality or negation), a target (an action associated with modality or negation), and a holder (an experiencer of modality). We describe how our MN lexicon was semi-automatically produced and we demonstrate that a structure-based MN tagger results in precision around 86% (depending on genre) for tagging of a standard LDC data set. We apply our MN annotation scheme to statistical machine translation using a syntactic framework that supports the inclusion of semantic annotations. Syntactic tags enriched with semantic annotations are assigned to parse trees in the target-language training texts through a process of tree grafting. Although the focus of our work is modality and negation, the tree grafting procedure is general and supports other types of semantic information. We exploit this capability by including named entities, produced by a pre-existing tagger, in addition to the MN elements produced by the taggers described here. The resulting system significantly outperformed a linguistically naive baseline model (Hiero), and reached the highest scores yet reported on the NIST 2009 Urdu–English test set. This finding supports the hypothesis that both syntactic and semantic information can improve translation quality.
34

Astri, Zul. "BOOK REVIEW: ENGLISH CURRICULUM AND MATERIAL DEVELOPMENT." LLT Journal: A Journal on Language and Language Teaching 25, no. 2 (October 20, 2022): 758–61. http://dx.doi.org/10.24071/llt.v25i2.5160.

Abstract:
This textbook, entitled "English Curriculum and Material Development," covers a variety of subjects in 11 Chapters. It is good for educational practitioners who are always in touch with the curriculum and syllabus. This book was written by a lecturer at the Ponorogo State Islamic Institute named Pryla Rochmawati. It consists of 4 main parts, namely Curriculum and Syllabus, Component of Curriculum, Curriculum in Indonesian Context, and Material Development. However, here I will briefly explain one by one the chapters contained in this book. Chapter 1 discusses the concept of curriculum and syllabus, including its definitions, the difference, kinds of the syllabus, and its importance in language teaching. Chapter 2 examines a component of the curriculum called Need Analysis. It discusses the definition, purpose, and targets, as well as the steps and techniques for doing a need analysis. Chapter 3 is concerned with the conceptualization of aims, goals, and objectives. Chapter 4 discusses Assessment and Testing, emphasizing the how and why of assessment and testing. Chapter 5 covers materials as a component of the curriculum. This section discusses the basis for material design, the material blueprint, and the origins of materials. Chapter 6 focuses on the teaching concept, which encompasses the roles of institutions, teachers, the teaching and learning process, and the application of curriculum through lesson plans. Chapter 7 examined the concept of evaluation. It discusses the approaches, purpose, and procedures used in conducting curriculum evaluation. Chapter 8 discusses the curriculum and syllabus in the Indonesian context. Chapter 9 discusses the SMA/MA English curriculum, including the syllabus and lesson plans for this grade. Chapter 10 focuses on the SMP/MTs level curriculum, including the syllabus and lesson plans for this grade. Finally, Chapter 11 examines the concept of material development in English language teaching. 
This textbook is intended to augment the teaching and learning processes in the English Curriculum and Material Development course, as well as to encourage students to be active and motivated learners.
35

Nalantha, I. Made Drati, Ni Komang Arie Suwastini, I Gusti Ayu Agung Dian Susanthi, Putu Wiraningsih, and Ni Nyoman Artini. "Intra-Sentential and Intra-Lexical Code Mixing in Nessie Judge’s YouTube Video Entitled “Lagu Populer + Pesan Iblis Tersembunyi”." RETORIKA: Jurnal Ilmu Bahasa 7, no. 2 (October 19, 2021): 166–71. http://dx.doi.org/10.22225/jr.7.2.3748.166-171.

Abstract:
As a linguistic phenomenon, code mixing is commonly observed among language users. Furthermore, YouTube, as one of the online platforms, has become an environment rich in code mixing. Considering that YouTube might influence language use among its audience, this study aimed to identify the use of code mixing by an Indonesian content creator named Nessie Judge. Following the qualitative analysis approach of Miles, Huberman, & Saldana (2014), the study identified the types of code mixing presented by Hoffman, namely intra-sentential and intra-lexical code mixing, in the speaker's utterances. From 114 utterances made by Nessie Judge in her video, code mixing was identified in 86 utterances, where 53 utterances belonged to intra-sentential code mixing and 13 utterances belonged to intra-lexical code mixing. The analysis revealed that the use of code mixing might be rooted in the speaker’s inability to find equivalent words while discussing the video content. Judging by the percentages, intra-sentential code mixing was more common than intra-lexical code mixing. This is because the speaker in the video inserts English words at the end or in the middle of sentences most of the time. The speaker in the video was clearly seen mixing Indonesian words with English words without changing the structure or context of the sentences.
36

Yermolenko, Serhiy, and Maria Ostapenko. "THEMATIC-IDEOGRAPHIC ASPECT OF DESCRIPTION AND ANALYSIS OF EPONYMY." Studia Linguistica, no. 15 (2019): 53–65. http://dx.doi.org/10.17721/studling2019.15.53-65.

Abstract:
The paper focuses on the thematic-ideographic aspect of eponymic derivation and, correspondingly, the eponymic derivation relationship. Eponymy is a three-member structure which consists of an underlying proper name, a derived eponymic lexeme (or phraseme), and the language-internal as well as extralinguistic relationship between the underlying and derived entities. Eponyms are derived from proper names (or onoma propria). From the viewpoint of thematic-ideographic classification, eponyms are categorized into various groups, such as realonyms and mythonyms, in accordance with their real-world status: realonyms denote real objects, whereas mythonyms denote non-existent or fictitious ones. In the group of realonyms, such thematic-ideographic subclasses are distinguished as anthroponyms, ergonyms, ethnonyms, zoonyms, cosmonyms, toponyms, chrononyms and chrematonyms (including ideonyms), and in the mythonym group, mythological anthroponyms, ergonyms, ethnonyms, zoonyms and theonyms (including daemonyms), toponyms and chrematonyms. Various thematic-ideographic groups of proper names differ in the degree of their activity with respect to eponym formation. Illustrating this, the authors draw on relevant etymological and semantic data: each group of Ukrainian, English and French examples is provided with etymological and semantic information. They also consider the problem of the special status of chrononyms and ethnonyms as proper names. Their study shows that in the abovementioned languages, the most productive with respect to eponymic derivation are anthroponyms and toponyms, and the least productive are chrononyms and chrematonyms. No eponyms are found to be derived from onomastic phytonyms.
An important part of eponym study, thematic-ideographic and other, is so-called source critique, which is demonstrated by the analysis of some entries of the Etymologic dictionary of the Ukrainian language (ESUM). The present paper and the materials it contains are of interest for general linguistic theory research in the field of the overall semantic and semiotic potential of the secondary use of proper names as a whole and their individual thematic-ideographic subclasses in particular.
37

Andrássy, Géza, and László Imre Komlósi. "Dimensionality Expressed by Caseendings and Spatial Prepositions." Acta Agraria Debreceniensis, no. 1 (May 12, 2002): 7–15. http://dx.doi.org/10.34101/actaagrar/1/3528.

Abstract:
The purpose of this essay is to investigate some of the uses of English prepositions and Hungarian case endings employed to express spatial relations. The investigation was initiated by the observation of invariant mistakes that Hungarian native speakers make when learning English. The questions raised are: (a) where do the two systems match and where do mismatches lie, (b) how do language users perceive the world, and (c) do speakers observe spatial relations as two-dimensional or three-dimensional cognitive models? Do different languages see the same thing as either three-dimensional or two-dimensional? Abondolo (1988) gives an adequate morphological analysis of ten Hungarian case endings (inessive, illative, elative, superessive, delative, sublative, adessive, ablative, allative and terminative) used in spatial reference, which give a closed set in references made to factors such as (1) location, which can be broken down as interior vs. exterior location, with the latter being further analysable as superficial and proximal, and (2) orientation, which can be analysed as zero orientation (position), source and goal. In addition to those in this list, two other case endings (genitive/dative and locative) are also used for expressing spatial relations, but the last is only a variant of the inessive and superessive case endings and is only used with place names. The set is closed in the sense that the same item is meant to refer to the same sort of spatial relation in every case. Language textbooks, cf. Benkő (1972), seem to suggest a neat match between the above Hungarian case endings and their English prepositional counterparts, e.g. London-ban (inessive) = in London. The picture, however, is far from being so clear-cut. The data, which were taken from various dictionaries and textbooks, show that the choices of both the prepositions and the case endings listed above depend on how the speaker considers factors (1) and (2) and that proximity is very important.
Instead of a one-to-one match between the prepositions and the case endings, we rather find that the above case endings will match a dual, and in some cases a tripartite system of prepositions with the correspondences found in the two languages, which yield the following chart: We suggest that languages may view or map the same physical entities in different ways, for example along surface vs. volume or goal vs. passage, etc. Furthermore, we also find it possible that it is the language-specific, inherent coding of the nominal phrase that decides, in many cases, upon the choice of prepositions and case endings.
38

Rakic, Stanimir. "On metaphorical designation of humans, animals, plants and things in Serbian and English language." Juznoslovenski filolog, no. 60 (2004): 147–76. http://dx.doi.org/10.2298/jfi0460147r.

Abstract:
In this paper I examine compound names of plants, animals, human beings and other things in which at least one nominal component designates a part of the body or clothes, or some basic elements of the household, in Serbian and English. The objects of my analysis are complex derivatives of the type (adjective + noun) + suffix in Serbian and compounds of the type noun's + noun, noun + noun and adjective + noun in English. I try to show that there is a difference in the metaphorical designation of human beings and of other living creatures and things by such compound nouns. My thesis is that the metaphorical designation of human beings by such compounds is based on the symbolic meaning of some words and expressions, while the designation of other things and beings relies on noticed similarity. In Serbian such designation is provided by complex derivatives such as praznoglavac 'empty-headed person', tupoglavac 'dullard', debelokožac 'callous person', golobradac 'young, inexperienced person', žutokljunac 'fledgling' (fig), and in English by chicken liver, beetle brain, birdbrain, bonehead, butterfingers, bigwig, blackleg, blue blood, bluestocking, cat's paw, deadhead, fat-guts, fathead, goldbrick (kol), hardhat, hardhead, greenhorn, redcoat (ist), redneck (sl), thickhead, etc. Polysemous compounds like cat's paw lend support to this thesis because their designation of human beings is based on the symbolic meaning of some words or expressions. I hypothesize that the direction and extent of the possible metaphorization of names may be accounted for by the following hierarchy (11): people - animals - plants - material things. Such a hierarchy is well supported by the observations of Lakoff (1987) and Taylor (1995) about the role of the human body in early experience and perception of reality. Different restrictions which may be imposed on the hierarchy (11) should be the matter of further study; some of them have been noted in this paper.
The compounds of this type denoting people have metaphorical meanings connected with some pejorative uses. These compounds refer to some psychological or character features, and show that for the classification of people such features are much more important than physical properties. While animals and plants are classified according to some characteristics of their body parts, people are usually classified according to psychological characteristics or their social functions. I have also noted a difference in structure between compounds designating animals and those designating plants and other things. The designation of animals relies more on metonymy, and that of plants and other things on metaphor based on comparison of noticed similarities. In the compounds designating animals, the nominal component relatively seldom refers to the parts of plants or other things. I suppose that the cause may be the fact that the anatomy of plants is very different from the anatomy of animals. As a consequence, the structure adjective + noun is much more characteristic of the compounds designating animals in English than the structure noun's + noun, and the same holds, although to a lesser degree, for the compounds designating humans. It is also noticeable that in English compounds whose second component denotes a part of the body or clothes, the first component rarely designates animals. On the other hand, in the compounds (9), in which the nominal head refers to some superordinate species, the first component often designates animal species, but usually of a very different kind. These data seem to lend support to Goldvarg & Glucksberg's thesis (1998) that metaphorical interpretation is favoured if the nominal constituents denote quite different entities.
39

Liepa, Dite. "Īpašvārdu pareizrakstība: problēmas un risinājumi." Valodu apguve: problēmas un perspektīva : zinātnisko rakstu krājums = Language Acquisition: Problems and Perspective : conference proceedings, no. 16 (May 6, 2020): 313–34. http://dx.doi.org/10.37384/va.2020.16.313.

Abstract:
The paper “Spelling of Proper Nouns: Problems and Solutions” focuses on the questions and spelling norms regarding capitalisation. The compiled material for the paper is based on the questions regarding language practice, professional experience as a language consultant (2005–2018) at the Latvian Language Agency, participation in the meetings of the Latvian Language Expert Commission of the State Language Centre, as well as experience as a lecturer at various Latvian institutions of higher education teaching the course “Business Latvian”. The article addresses the contradictory information presented in various sources, including school textbooks, handbooks and even laws issued by the state. Based on these sources, the spelling of names of educational institutions, entities and their divisions, eras and historical events, positions and titles, holidays, memorial and commemorative days, as well as names that include the components pasaules (world) and starptautiskais (international), are ambiguous. The paper gives sources that provide information on the particular problem cases, as well as spelling suggestions, based on the author’s professional experience. Unfortunately, many of the aforementioned issues still lack a solution and they have not been described in any source. In such cases the solution has to be based on the explanations of analogous examples. The issue of capitalisation is somewhat problematic, as developing unified spelling norms has never been easy due to both past spelling norms (including the Soviet heritage) and various views of linguists regarding capitalisation. 
Moreover, spelling of proper nouns in Latvian is also influenced by foreign language (English, Russian, German) spelling norms of proper nouns, subjective points of view of individual pedagogues, proofreaders, editors and other language professionals, as well as the objective reality (signboards in public environment, written media, names of companies registered in the Register of Enterprises etc.). However, attempts to find an explanation are met with (possibly deliberate) textbook authors’ avoidance of describing and tackling problematic issues, as that would take a lot of time and professional analysis. Sometimes linguists, being aware of the complexity of the situation, avoid taking initiative in popularising new spelling norms. The paper draws broadly on “Capitalisation in Latvian: Overview of Historical Research, Problems and Solutions” (Riga: Latvian Language Agency, 2012), which is the most recent and comprehensive work dedicated to this topic, as well as the most recent decisions of the Latvian Language Expert Commission of the State Language Centre.
40

Amalia, Desthia. "A SEMIOTIC ANALYSIS OF SHAKESPEARE’S “O MISTRESS MINE” USING RIFFATERRE’S SEMIOTIC THEORY." Journal of Language and Literature 8, no. 1 (2020): 15–29. http://dx.doi.org/10.35760/jll.2020.v8i1.2707.

Abstract:
Analyzing literary works is widely regarded as complex because word choice and construction in any literary work are unconstrained. However, literature is known as a free instrument for people to express their feelings. This research discusses a song entitled "O Mistress Mine" using Riffaterre's semiotic theory. The song is chosen as a signifier from Shakespeare's play Twelfth Night because it represents the lead characters' situation of unrequited love and carries a message for them to seize the day (carpe diem). O Mistress Mine might appear as a short song in the whole play, but the meaning behind the lyrics conveys the love line of the leads. Shakespeare's literary works are known to be complex because his choice of words and word construction belong to his era. Meanwhile, in this era people might find that grasping meaning from Shakespeare's literary works is complex work. Thus, the researcher chose Riffaterre's semiotic theory, which includes three ways of analyzing a poem or song lyrics, in order to analyze the figurative language further and to understand the meaning of the song. The research used a qualitative method to find symbols in the song, together with Riffaterre's semiotic theory. Purposive sampling was also used to pick the signs that correlate with the message of Twelfth Night. The findings revealed three ways of analysis, named displacing meaning, distorting meaning and creating meaning: 5 data used displacing meaning, 3 data used distorting meaning and 2 data used creating meaning. Furthermore, this research can be used practically and theoretically by readers and students of English literature.
41

Nistiti, Nurul Ulfa. "PHILOSOPHY OF LANGUAGE: PRAGMATIC PRESUPPOSITION IN MOTIVATIONAL SPEECH WITHIN DISCOURSE AND ITS RELEVANCE OF MOTIVATION IN TEACHING LEARNING PROCESS TO REACH GOALS." Prosodi 15, no. 2 (October 11, 2021): 186–202. http://dx.doi.org/10.21107/prosodi.v15i2.12185.

Abstract:
This research was taken from online media in the form of a speech on a YouTube channel called the English Speeches Channel featuring an inspiring woman named Muniba Mazari Baloch. She is a Pakistani artist, model, activist, motivational speaker, singer, social reformer, and television host. Her motivational speech is titled "We All Are Perfectly Imperfect". This research addresses three research questions: analyzing the types of presuppositions contained in Muniba Mazari's speech, determining which types of presupposition in her speech carry the confessional discourse function, and finding out how far her confessions influence her audiences through what she delivers. The research method used is descriptive qualitative, analyzing several utterances in her speech through two theoretical approaches: pragmatic presupposition and confessional discourse analysis. The results showed that Muniba Mazari used all types of pragmatic presuppositions (Existential, Factive, Non-Factive, Lexical, Structural, and Counterfactual). Through these types of presupposition, Muniba Mazari also brings out the function of confessional discourse. The functions of confessional discourse contained in her speech are therapeutic, didactic, and interrogatory. During the research, the researchers found that the main thread arising from the combination of these two theories is the strength of Motivational Assertion. This main thread became the main idea directing Muniba Mazari's speech in motivating her audiences. This main thread also asserts how powerful Muniba Mazari's speech was. In this context, the results bring about optimism, achievable objectives, passion, and confidence. Finally, Muniba Mazari's speech entitled "We All Are Perfectly Imperfect", which contains many moral messages, can be said to be a motivational speech. It can be applied in the learning-teaching process.
The result of combining these two theories produces the main thread that can be applied by several teachers in motivating their students in the learning-teaching process.
42

Mazur, Pawel, and Robert Dale. "Handling conjunctions in named entities." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 30, no. 1 (August 10, 2007): 49–68. http://dx.doi.org/10.1075/li.30.1.05maz.

Abstract:
Although the literature contains reports of very high accuracy figures for the recognition of named entities in text, there are still some named entity phenomena that remain problematic for existing text processing systems. One of these is the ambiguity of conjunctions in candidate named entity strings, an all-too-prevalent problem in corporate and legal documents. In this paper, we distinguish four uses of the conjunction in these strings, and explore the use of a supervised machine learning approach to conjunction disambiguation trained on a very limited set of ‘name internal’ features that avoids the need for expensive lexical or semantic resources. We achieve 84% correctly classified examples using k-fold evaluation on a data set of 600 instances. We argue that further improvements are likely to require the use of wider domain knowledge and name external features.
43

Sunarto, Bambang. "Adangiyah." Dewa Ruci: Jurnal Pengkajian dan Penciptaan Seni 16, no. 1 (May 5, 2021): iii—iv. http://dx.doi.org/10.33153/dewaruci.v16i1.3601.

Abstract:
This edition is the first issue of Dewa Ruci’s Journal in which all articles are in English. We deliberately changed the language of publication to English to facilitate information delivery to a wider audience. We realize that English, more than any other language, is an official language in many countries. The number of people who have literacy awareness and need scientific information about visual and performing arts regarding the archipelago’s cultural arts is also quite large. The decision to change the language of publication to English does not mean that we do not have nationalism or are not in love with the Indonesian language. This change is necessary to foster the intensity of scientific interaction among writers who are not limited to Indonesia’s territory alone. We desire that the scientific ideas outlined in Dewa Ruci’s Journal are read by intellectual circles of the arts internationally. We also want to express our scientific greetings to art experts from New Zealand, the USA, Australia, Europe, especially Britain, and other English-speaking countries such as the Philippines, India, Pakistan, Zimbabwe, the Caribbean, Hong Kong, South Africa, and Canada. Of course, a change to English will also benefit intellectuals from countries that have acquired English as a second language, such as Malaysia, Brunei, Israel, and Sri Lanka. In essence, Dewa Ruci’s Journal editor wants to invite writers to greet the scientific community at large. We are grateful that six writers can greet the international community through their articles. The first are Tunjung Atmadi and Ika Yuni Purnama, who wrote an article entitled “Material Ergonomics on Application of Wooden Floors in the Interior of the Workspace Office.” This article discusses office interiors that are devoted to workspaces.
The purpose of this study is to share knowledge about how to take advantage of space-forming elements in the interior design of a workspace by utilizing wooden floors like parquet. The focus is on choosing the use of wood by paying attention to the elements in its application. This research result has a significant meaning in the aesthetics, comfort, and safety of wooden floors in the workspace’s interior and its advantages and disadvantages. The second group of writers who had the opportunity to greet the Dewa Ruci Journal audience were intellectuals with diverse expertise, namely Taufiq Akbar, Dendi Pratama, Sarwanto, and Sunardi. Together they wrote an article entitled “Visual Adaptation: From Comics to Superhero Creation of Wayang.” This article discusses the fusion and mixing of wayang as a traditional culture with comics and films as contemporary culture products. This melting and mixing have given birth to new wayang creations with sources adapted from the superhero character “Avenger,” which they now call the Avenger Wayang Kreasi. According to them, Wayang Kreasi Avenger’s making maintains technical knowledge of the art of wayang kulit. It introduces young people who are not familiar with wayang kulit to the technique of carving sungging by displaying the attributes in the purwa skin for Wayang Kreasi Avenger. This creativity is an attempt to stimulate and show people’s love for the potential influence of traditional cultural heritage and its interaction with the potential of contemporary culture. The next authors are Sriyadi and RM Pramutomo, with an article entitled “Presentation Style of Bedhaya Bedhah Madiun Dance in Pura Mangkunegaran.” This article reveals a repertoire of Yogyakarta-style dance in Mangkunegaran, Surakarta, namely the Bedhaya Bedhah Madiun. The presence of this dance in Mangkunegaran occurred during the reign of Mangkunegara VII.
However, the basic character of the Mangkunegaran style dance has a significant difference from the Yogyakarta style. This paper aims to examine the Bedhaya Bedhah Madiun dance’s presentation style in Mangkunegaran to determine the formation of its presentation technique. The shape of the Bedhaya Bedhah Madiun dance style in Mangkunegaran did not occur in an event but was a process. The presentation style’s formation is due to a problem in the inheritance system that has undergone significant changes. These problems arise from social, political, cultural, and economic conditions. The responses to these problems have shaped the Bedhaya Bedhah Madiun dance's distinctive features in Mangkunegaran, although not all of them have been positive. Hasbi wrote an article entitled “Sappo: Sulapa Eppa Walasuji as the Ideas of Creation Three Dimensional Painting.” This article reveals Hasbi’s creative process design in creating three-dimensional works of art, named Sappo. He got his inspiration from the ancient manuscripts written in Lontara, namely the manuscripts written in the traditional script of the Bugis-Makassar people on palm leaves, which they still keep until now. Sappo for the Bugis community is a fence that limits (surrounds, isolates) the land and houses. Sappo’s function is to protect herself, her family, and her people. Sulapa Eppa means four sides, is a mystical manifestation, the classical belief of the Bugis-Makassar people, which symbolizes the composition of the universe, wind-fire-water-earth. Walasuji is a kind of bamboo fence in rhombus rituals. Eppa Walasuji’s Sulapa is Hasbi’s concept in creating Sappo in the form of three-dimensional paintings.
The idea is a symbolic expression borrowing the Lontara tradition's idiom to create a symbolic effect called Sappo. Mahdi Bahar and his friends wrote an article entitled “Transformation of Krinok to Bungo Krinok Music: The Innovation Certainty and Digital-Virtual Contribution for Cultural Advancement.” Together, they have made innovations to preserve Krinok music, one of Jambi’s traditional music themes, into new music that they call Bungo Krinok. He said that innovation is a necessity for the development of folk music. In innovating, they take advantage of digital technology. They realize this music’s existence as a cultural wealth that has great potential for developing and advancing art. The musical system, melodic contours, musical grammar, and distinctive interval patterns have formed krinok music’s character. This innovation has given birth to new music as a transformation from Jambi folk music called “Bungo Krinok” music. Finally, Luqman Wahyudi and Sri Hesti Heriwati. They both wrote an article entitled “Social Criticism About the 2019 Election Campaign on the Comic Strip Gump n Hell.” They explained that in 2019 there was an interesting phenomenon regarding the use of comic strips as a means of social criticism, especially in the Indonesian Presidential Election Campaign. The title of the comic is Gump n Hell by Errik Irwan Wibowo. The comic strip was published and viral on social media, describing the political events that took place. In this study, they took three samples of the comic strip Gump n Hell related to the moment of the 2019 election to analyze their meaning. From the results of this study, there is an implicit meaning in the comic strip of pop culture icons' use to represent political figures in the form of parodies. That is the essence of the issue of Volume 16 Number 1 (April Edition), 2021.
Hopefully, the knowledge that has been present in this publication can spur the growth of visual and performing art science in international networks, both in the science of art creation and in scientific research of art in general. We hope that the development of visual and performing art science can reveal the various meanings behind various facts and phenomena of art life. Therefore, the growth of international networks is an indispensable need. Thank you.
44

Lestari, Ni Putu Devi, I. Made Winaya, and I. Gst Ayu Gede Sosiowati. "Translation Procedures in Translating Proper Names from English into Indonesian." Humanis 24, no. 4 (November 23, 2020): 386. http://dx.doi.org/10.24843/jh.2020.v24.i04.p06.

Abstract:
Translation procedure is a procedure or a method to translate the unit of language from the source language to the target language. Every linguistic part needs to be translated. It means including the proper names in the literary work. This study is aimed at identifying and analyzing the types of the proper name and their translation procedures in the novel entitled Pembunuhan di Orient Express. The problems in this study are discussed based on the theory of proper name and the theory of the translation procedure by Newmark (1988). The method used to collect the data was documentation method. This study applied the descriptive qualitative method in analyzing the data. The result of the analysis was presented using an informal method. The analysis showed three types of proper names in the data sources. They are people’s names, the name of an object, and the geographical term. The translator uses seven methods from 18 translation procedures that were proposed by Newmark (1988).
45

Galicia-Haro, Sofía N., and Alexander Gelbukh. "Complex named entities in Spanish texts." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 30, no. 1 (August 10, 2007): 69–94. http://dx.doi.org/10.1075/li.30.1.06gal.

Abstract:
We present a linguistic analysis of Named Entities in Spanish texts. Our work is focused on the determination of the structure of complex proper names: names with coordinated constituents, names with prepositional phrases and names formed by several content words beginning with a capital letter. We present the analysis of circa 49,000 examples obtained from Mexican newspapers. We detail their structure and give some notions about the context surrounding them. Since named entities belong to an open class of words, new ones are being created daily, so the challenge for a named entity recognizer is to precisely determine the boundaries of new entity names in any text and to analyze thoroughly their components for deep semantic analysis. Knowing their general classes of structure, it should be possible to derive useful heuristics or a specific grammar for natural language processing applications.
46

Kauffmann, Alexis, François-Claude Rey, Iana Atanassova, Arnaud Gaudinat, Peter Greenfield, Hélène Madinier, and Sylviane Cardey. "Indirectly Named Entity Recognition." Journal of Computer-Assisted Linguistic Research 5, no. 1 (December 13, 2021): 27–46. http://dx.doi.org/10.4995/jclr.2021.15922.

Abstract:
We define here indirectly named entities, as a term to denote multiword expressions referring to known named entities by means of periphrasis. While named entity recognition is a classical task in natural language processing, little attention has been paid to indirectly named entities and their treatment. In this paper, we try to address this gap, describing issues related to the detection and understanding of indirectly named entities in texts. We introduce a proof of concept for retrieving both lexicalised and non-lexicalised indirectly named entities in French texts. We also show example cases where this proof of concept is applied, and discuss future perspectives. We have initiated the creation of a first lexicon of 712 indirectly named entity entries that is available for future research.
47

Dhore, M. L., S. K. Dixit, and Ruchi M. Dhore. "Issues in Hindi to English and Marathi to English Machine Transliteration of Named Entities." International Journal of Computer Applications 51, no. 14 (August 30, 2012): 37–44. http://dx.doi.org/10.5120/8112-1728.

48

Baksa, Krešimir, Dino Golović, Goran Glavaš, and Jan Šnajder. "Tagging Named Entities in Croatian Tweets." Slovenščina 2.0: empirical, applied and interdisciplinary research 4, no. 1 (February 5, 2017): 20–41. http://dx.doi.org/10.4312/slo2.0.2016.1.20-41.

Abstract:
Named entity extraction tools designed for recognizing named entities in texts written in standard language (e.g., news stories or legal texts) have been shown to be inadequate for user-generated textual content (e.g., tweets, forum posts). In this work, we propose a supervised approach to named entity recognition and classification for Croatian tweets. We compare two sequence labelling models: a hidden Markov model (HMM) and conditional random fields (CRF). Our experiments reveal that CRF is the best model for the task, achieving a very good performance of over 87% micro-averaged F1 score. We analyse the contributions of different feature groups and influence of the training set size on the performance of the CRF model.
49

de Grauwe, Luc. "“In Overlandsche ende in Duytsche sprake” und “Die alghemene Duytsche tael”." Amsterdamer Beiträge zur älteren Germanistik 77, no. 3-4 (October 19, 2017): 637–68. http://dx.doi.org/10.1163/18756719-12340096.

Abstract:
The first printed Dutch grammar was entitled Twe-spraack vande Nederduitsche letterkunst (1584). In many places, the grammar names its own language simply Duytsch, but the book also uses this term – depending on context or audience, not seldom melting one significance into another – for what now is known as ‘Continental (West) Germanic’ (“Ick spreeck int ghemeen vande duytse taal, die zelve voor één taal houdende”, p. 110), referring to the entire complex of linguistic varieties, which nowadays come under the cognate standard languages Dutch (formerly in English Low/Nether Dutch) and German (High Dutch). Many textbooks, grammars, dictionaries etc. in 16th- to 18th-century Netherlands and Flanders strikingly reserved simple Duytsch for their own language (hence Dutch), contrasting it with ‘marked’ Hoogduytsch or even Overland(t)sch (avoiding hyperonymic -duytsch!). In addition to a treatment of the term Duytsch, this article also deals with some other, strongly related cruces in the Twe-spraack.
50

Bilgin, Metin. "A Study on Named Entity Recognition with OpenNLP at English Texts." Journal of Applied Intelligent System 4, no. 1 (July 16, 2019): 1–8. http://dx.doi.org/10.33633/jais.v4i1.2096.

Abstract:
Named entity recognition is a topic within information retrieval, which is a subdomain of natural language processing. It concerns identifying and labeling locations, persons, organizations, etc. in text content. Named entity recognition identifies and classifies persons, areas, etc. in both formal and informal text content, and it can be used for purposes such as question answering systems and extracting relations between events. In this work, named entity recognition is performed on a data set using the OpenNLP library with the help of the KNIME program, a method is suggested for assigning labels to unlabeled named entities, and the results are discussed.