Academic literature on the topic 'English language — Named Entities'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'English language — Named Entities.'

Journal articles on the topic "English language — Named Entities"

1

Chen, Yufeng, Chengqing Zong, and Keh-Yih Su. "A Joint Model to Identify and Align Bilingual Named Entities." Computational Linguistics 39, no. 2 (June 2013): 229–66. http://dx.doi.org/10.1162/coli_a_00122.

Abstract:
In this article, an integrated model is derived that jointly identifies and aligns bilingual named entities (NEs) between Chinese and English. The model is motivated by the following observations: (1) whether an NE is translated semantically or phonetically depends greatly on its entity type, (2) entities within an aligned pair should share the same type, and (3) the initially detected NEs can act as anchors and provide further information while selecting NE candidates. Based on these observations, this article proposes a translation mode ratio feature (defined as the proportion of NE internal tokens that are semantically translated), enforces an entity type consistency constraint, and utilizes additional new NE likelihoods (based on the initially detected NE anchors). Experiments show that this novel method significantly outperforms the baseline. The type-insensitive F-score of identified NE pairs increases from 78.4% to 88.0% (12.2% relative improvement) in our Chinese–English NE alignment task, and the type-sensitive F-score increases from 68.4% to 83.0% (21.3% relative improvement). Furthermore, the proposed model demonstrates its robustness when it is tested across different domains. Finally, when semi-supervised learning is conducted to train the adopted English NE recognition model, the proposed model also significantly boosts the English NE recognition type-sensitive F-score.
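The relative improvements quoted in this abstract can be checked from the absolute F-scores it reports. The sketch below is only an arithmetic check; the function name `relative_improvement` is a hypothetical helper introduced here, not something from the article.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a percentage of `baseline`."""
    return (improved - baseline) / baseline * 100.0


# Absolute F-scores (in percent) as reported in the abstract.
type_insensitive = relative_improvement(78.4, 88.0)
type_sensitive = relative_improvement(68.4, 83.0)

print(f"type-insensitive: {type_insensitive:.1f}% relative improvement")
print(f"type-sensitive:   {type_sensitive:.1f}% relative improvement")
```

Rounded to one decimal place, both values agree with the 12.2% and 21.3% figures given in the abstract.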
2

Mahmood, Ahsan, Hikmat Ullah Khan, Zahoor Ur Rehman, Khalid Iqbal, and Ch Muhmmad Shahzad Faisal. "KEFST: a knowledge extraction framework using finite-state transducers." Electronic Library 37, no. 2 (April 1, 2019): 365–84. http://dx.doi.org/10.1108/el-10-2018-0196.

Abstract:
Purpose: The purpose of this research study is to extract and identify named entities from Hadith literature. Named entity recognition (NER) refers to the identification of the named entities in a computer readable text having an annotation of categorization tags for information extraction. NER is an active research area in information management and information retrieval systems. NER serves as a baseline for machines to understand the context of a given content and helps in knowledge extraction. Although NER is considered as a solved task in major languages such as English, in languages such as Urdu, NER is still a challenging task. Moreover, NER depends on the language and domain of study; thus, it is gaining the attention of researchers in different domains. Design/methodology/approach: This paper proposes a knowledge extraction framework using finite-state transducers (FSTs) – KEFST – to extract the named entities. KEFST consists of five steps: content extraction, tokenization, part of speech tagging, multi-word detection and NER. An extensive empirical analysis using the data corpus of Urdu translation of Sahih Al-Bukhari, a widely known hadith book, reveals that the proposed method effectively recognizes the entities to obtain better results. Findings: The significant performance in terms of f-measure, precision and recall validates that the proposed model outperforms the existing methods for NER in the relevant literature. Originality/value: This research is novel in this regard that no previous work is proposed in the Urdu language to extract named entities using FSTs and no previous work is proposed for Urdu hadith data NER.
3

Forouzandeh, Aynaz, Mohammad-Reza Feizi-Derakhshi, and Pejman Gholami-Dastgerdi. "Persian Named Entity Recognition by Gray Wolf Optimization Algorithm." Scientific Programming 2022 (December 10, 2022): 1–12. http://dx.doi.org/10.1155/2022/6368709.

Abstract:
Named entity recognition (NER) is a subfield of natural language processing (NLP). It is able to identify proper nouns, such as person names, locations, and organizations, and has been widely used in various tasks. NER can be practical in extracting information from social media data. However, the unstructured and noisy nature of social media (such as grammatical errors and typos) causes new challenges for NER, especially for low-resource languages such as Persian, and existing NER methods mainly focus on formal texts and English social media. To overcome this challenge, we consider Persian NER as an optimization problem and use the binary Gray Wolf Optimization (GWO) algorithm to segment posts into small possible phrases of named entities. Later, named entities are recognized based on their score. Also, we prove that even human opinion can differ in the NER task and compare our method with other systems with the Sep_TD_Tel01 dataset and the results show that our proposed system obtains a higher F1 score in comparison with other methods.
4

Ekbal, Asif, Sudip Kumar Naskar, and Sivaji Bandyopadhyay. "Named Entity Recognition and transliteration in Bengali." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 30, no. 1 (August 10, 2007): 95–114. http://dx.doi.org/10.1075/li.30.1.07ekb.

Abstract:
The paper reports about the development of a Named Entity Recognition (NER) system in Bengali using a tagged Bengali news corpus and the subsequent transliteration of the recognized Bengali Named Entities (NEs) into English. Three different models of the NER have been developed. A semi-supervised learning method has been adopted to develop the first two models, one without linguistic features (Model A) and the other with linguistic features (Model B). The third one (Model C) is based on statistical Hidden Markov Model. A modified joint-source channel model has been used along with a number of alternatives to generate the English transliterations of Bengali NEs and vice-versa. The transliteration models learn the mappings from the bilingual training sets optionally guided by linguistic knowledge in the form of conjuncts and diphthongs in Bengali and their representations in English. The NER system has demonstrated the highest average Recall, Precision and F-Score values of 89.62%, 78.67% and 83.79% respectively in Model C. Evaluation of the proposed transliteration models demonstrated that the modified joint source-channel model performs best in terms of evaluation metrics for person and location names for both Bengali to English (B2E) transliteration and English to Bengali transliteration (E2B). The use of the linguistic knowledge during training of the transliteration models improves performance.
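The Model C figures above are consistent with the balanced F-score, the harmonic mean of precision and recall. The sketch below assumes that this is the F-score definition the authors used (the abstract does not say so explicitly); `f1_score` is a hypothetical helper name.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recall and precision (in percent) reported for Model C in the abstract.
model_c_f1 = f1_score(precision=78.67, recall=89.62)
print(f"F1 = {model_c_f1:.2f}%")
```

Rounded to two decimal places, the result matches the 83.79% stated in the abstract.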
5

Tarmizi, Shasha Arzila, and Saidah Saad. "NAMED ENTITY RECOGNITION FOR QURANIC TEXT USING RULE BASED APPROACHES." Asia-Pacific Journal of Information Technology and Multimedia 11, no. 02 (December 31, 2022): 112–22. http://dx.doi.org/10.17576/apjitm-2022-0101-08.

Abstract:
The variety and difference between domains for textual data require customization in the Natural Language Processing component, especially in Named Entity Recognition, where different domains contain several types of entities. The current NER model is deemed not fit to accurately extract entities from Quranic text due to its unique content. This paper describes the building of a rule-based Named Entity Recognition method to extract the entities that exist in the English translation of the meaning of the Quranic text, and its performance evaluation. Named entity tagging is a common text annotation task in which entities (nouns) in unstructured text are identified and assigned a class. A few rules are built to extract several types of entities such as the name of prophets and people, creation, location, time, and the various names of God. The rules are built mainly using regular expressions and gazetteers. The rules that have been built result in high precision and recall as well as a satisfactory F-score of over 90%. The results from this experiment can be used as annotation in building a machine learning model to extract entities from the same type of domain, specifically Quranic text or, more generally, Islamic domain text.
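A rule-based tagger that combines gazetteers with regular expressions, as this abstract describes, might look roughly like the sketch below. Everything in it is an assumption for illustration: the gazetteer entries, the time pattern, the labels, and the function name `tag_entities` are hypothetical and are not the rules built in the paper.

```python
import re

# Hypothetical gazetteers; the paper's actual lists are not given in the abstract.
GAZETTEER = {
    "PROPHET": {"Moses", "Abraham", "Noah"},
    "LOCATION": {"Egypt", "Mecca"},
}

# Hypothetical regular expression for simple time expressions.
TIME_PATTERN = re.compile(
    r"\b(?:six|seven|forty)\s+(?:days|nights|years)\b", re.IGNORECASE
)


def tag_entities(text):
    """Return (surface form, label) pairs in order of appearance in `text`."""
    found = []
    # Gazetteer lookup: match each listed name as a whole word.
    for label, names in GAZETTEER.items():
        for name in names:
            for match in re.finditer(rf"\b{re.escape(name)}\b", text):
                found.append((match.start(), match.group(), label))
    # Pattern rule: time expressions.
    for match in TIME_PATTERN.finditer(text):
        found.append((match.start(), match.group(), "TIME"))
    return [(surface, label) for _, surface, label in sorted(found)]


sample = "Moses stayed forty nights before returning to Egypt."
print(tag_entities(sample))
```

Gazetteers cover closed sets of names, while the pattern handles open-ended expressions that a list cannot enumerate; a real system would also need rules for overlapping matches, which this sketch does not handle.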
6

Tarmizi, Shasha Arzila, and Saidah Saad. "Named Entity Recognition For Quranic Text Using Rule Based Approaches." Asia-Pacific Journal of Information Technology and Multimedia 11, no. 02 (December 31, 2022): 112–22. http://dx.doi.org/10.17576/apjitm-2022-1102-09.

7

Boudjellal, Nada, Huaping Zhang, Asif Khan, Arshad Ahmad, Rashid Naseem, Jianyun Shang, and Lin Dai. "ABioNER: A BERT-Based Model for Arabic Biomedical Named-Entity Recognition." Complexity 2021 (March 13, 2021): 1–6. http://dx.doi.org/10.1155/2021/6633213.

Abstract:
The web is being loaded daily with a huge volume of data, mainly unstructured textual data, which increases the need for information extraction and NLP systems significantly. Named-entity recognition task is a key step towards efficiently understanding text data and saving time and effort. Being a widely used language globally, English is taking over most of the research conducted in this field, especially in the biomedical domain. Unlike other languages, Arabic suffers from lack of resources. This work presents a BERT-based model to identify biomedical named entities in the Arabic text data (specifically disease and treatment named entities) that investigates the effectiveness of pretraining a monolingual BERT model with a small-scale biomedical dataset on enhancing the model understanding of Arabic biomedical text. The model performance was compared with two state-of-the-art models (namely, AraBERT and multilingual BERT cased), and it outperformed both models with 85% F1-score.
8

Panibog, A. "Fairytale Precedent Names in English-Language Media Discourse." Studia Philologica 1-2, no. 18-19 (2022): 48–57. http://dx.doi.org/10.28925/2311-2425.2022.1894.

Abstract:
The article considers fairytale precedent names selected from English-language media discourse texts based on cognitive linguistics. Coverage of the fairytale precedent names' linguocognitive features was carried out within the framework of conceptual analysis, which allowed revealing the connection between linguistic and conceptual structures. The study material includes cited statements containing fairytale anthroponyms posted on Internet sites and in the English Web 2020 data corpus (enTenTen20) of the Sketch Engine application. This corpus is an English corpus of texts collected from the Internet between 2019 and 2021. Based on the analysis of this material, a hypothesis has been proposed that the vast majority of fairytale precedent names that function in English-language media discourse are formed on the analogy principle. The study found that characteristic of media texts is the use of precedent names in metaphorical models which are likened to entities belonging to different conceptual spheres. In this case, the comparison of objects is carried out by the feature common to both compared entities. In the analyzed material, the metaphor is represented by the models “a PERSON-man is like an ANIMAL-mythonym” and “an OBJECT-plant is like the ANIMAL-mythonym”. In the formation of the fairytale precedent names, the principle of analogy is also used in which two entities belonging to the same conceptual sphere are compared. As a rule, such similarity of a comparative (what is compared) and a correlate (what is compared with) occurs according to the full degree of similarity. The ability to characterize other objects of reality is explained in a prototype aspect of fairytale precedent names, namely their similarity as an exemplary class representative to the leading property of the primary referent.
The study results indicated that the frequency of analog comparison (87.72%) of the fairytale precedent names is much higher than that of metaphorical comparison (12.28%), which confirms the proposed hypothesis. Thus, we can conclude that in the modern English-language media discourse the fairytale precedent names are formed mainly on the basis of analogy.
9

Wang, Hao, Lekai Zhou, Jianyong Duan, and Li He. "Cross-Lingual Named Entity Recognition Based on Attention and Adversarial Training." Applied Sciences 13, no. 4 (February 16, 2023): 2548. http://dx.doi.org/10.3390/app13042548.

Abstract:
Named entity recognition aims to extract entities with specific meaning from unstructured text. Currently, deep learning methods have been widely used for this task and have achieved remarkable results, but it is often difficult to achieve better results with less labeled data. To address this problem, this paper proposes a method for cross-lingual entity recognition based on an attention mechanism and adversarial training, using resource-rich language annotation data to migrate to low-resource languages for named entity recognition tasks and outputting changing semantic vectors through the attention mechanism to effectively solve the long-sequence semantic dilution problem. To verify the effectiveness of the proposed method, the method in this paper is applied to the English–Chinese cross-lingual named entity recognition task based on the WeiboNER data set and the People-Daily2004 data set. The obtained F1 value of the optimal model is 53.22% (a 6.29% improvement compared to the baseline). The experimental results show that the cross-lingual adversarial named entity recognition method proposed in this paper can significantly improve the results of named entity recognition in low resource languages.
10

Sboev, Alexander, Roman Rybka, Anton Selivanov, Ivan Moloshnikov, Artem Gryaznov, Alexander Naumov, Sanna Sboeva, Gleb Rylkov, and Soyora Zakirova. "Accuracy Analysis of the End-to-End Extraction of Related Named Entities from Russian Drug Review Texts by Modern Approaches Validated on English Biomedical Corpora." Mathematics 11, no. 2 (January 9, 2023): 354. http://dx.doi.org/10.3390/math11020354.

Abstract:
An extraction of significant information from Internet sources is an important task of pharmacovigilance due to the need for post-clinical drugs monitoring. This research considers the task of end-to-end recognition of pharmaceutically significant named entities and their relations in texts in natural language. The meaning of “end-to-end” is that both of the tasks are performed within a single process on the “raw” text without annotation. The study is based on the current version of the Russian Drug Review Corpus, a dataset of 3800 review texts from the Russian segment of the Internet. Currently, this is the only corpus in the Russian language appropriate for research of the mentioned type. We estimated the accuracy of the recognition of the pharmaceutically significant entities and their relations in two approaches based on neural-network language models. The first core approach is to sequentially solve tasks of named-entities recognition and relation extraction (the sequential approach). The second one solves both tasks simultaneously with a single neural network (the joint approach). The study includes a comparison of both approaches, along with the hyperparameters selection to maximize resulting accuracy. It is shown that both approaches solve the target task at the same level of accuracy: 52–53% macro-averaged F1-score, which is the current level of accuracy for “end-to-end” tasks on the Russian language. Additionally, the paper presents the results for English open datasets ADE and DDI based on the joint approach, and hyperparameter selection for the modern domain-specific language models. The result is that the achieved accuracies of 84.2% (ADE) and 73.3% (DDI) are comparable or better than other published results for the datasets.

Dissertations / Theses on the topic "English language — Named Entities"

1

Ringland, Nicola. "Structured Named Entities." Thesis, The University of Sydney, 2015. http://hdl.handle.net/2123/14558.

Abstract:
The names of people, locations, and organisations play a central role in language, and named entity recognition (NER) has been widely studied, and successfully incorporated, into natural language processing (NLP) applications. The most common variant of NER involves identifying and classifying proper noun mentions of these and miscellaneous entities as linear spans in text. Unfortunately, this version of NER is no closer to a detailed treatment of named entities than chunking is to a full syntactic analysis. NER, so construed, reflects neither the syntactic nor semantic structure of NE mentions, and provides insufficient categorical distinctions to represent that structure. Representing this nested structure, where a mention may contain mention(s) of other entities, is critical for applications such as coreference resolution. The lack of this structure creates spurious ambiguity in the linear approximation. Research in NER has been shaped by the size and detail of the available annotated corpora. The existing structured named entity corpora are either small, in specialist domains, or in languages other than English. This thesis presents our Nested Named Entity (NNE) corpus of named entities and numerical and temporal expressions, taken from the WSJ portion of the Penn Treebank (PTB, Marcus et al., 1993). We use the BBN Pronoun Coreference and Entity Type Corpus (Weischedel and Brunstein, 2005a) as our basis, manually annotating it with a principled, fine-grained, nested annotation scheme and detailed annotation guidelines. The corpus comprises over 279,000 entities over 49,211 sentences (1,173,000 words), including 118,495 top-level entities. Our annotations were designed using twelve high-level principles that guided the development of the annotation scheme and difficult decisions for annotators. 
We also monitored the semantic grammar that was being induced during annotation, seeking to identify and reinforce common patterns to maintain consistent, parsimonious annotations. The result is a scheme of 118 hierarchical fine-grained entity types and nesting rules, covering all capitalised mentions of entities, and numerical and temporal expressions. Unlike many corpora, we have developed detailed guidelines, including extensive discussion of the edge cases, in an ongoing dialogue with our annotators which is critical for consistency and reproducibility. We annotated independently from the PTB bracketing, allowing annotators to choose spans which were inconsistent with the PTB conventions and errors, and only refer back to it to resolve genuine ambiguity consistently. We merged our NNE with the PTB, requiring some systematic and one-off changes to both annotations. This allows the NNE corpus to complement other PTB resources, such as PropBank, and inform PTB-derived corpora for other formalisms, such as CCG and HPSG. We compare this corpus against BBN. We consider several approaches to integrating the PTB and NNE annotations, which affect the sparsity of grammar rules and visibility of syntactic and NE structure. We explore their impact on parsing the NNE and merged variants using the Berkeley parser (Petrov et al., 2006), which performs surprisingly well without specialised NER features. We experiment with flattening the NNE annotations into linear NER variants with stacked categories, and explore the ability of a maximum entropy and a CRF NER system to reproduce them. The CRF performs substantially better, but is infeasible to train on the enormous stacked category sets. The flattened output of the Berkeley parser are almost competitive with the CRF. Our results demonstrate that the NNE corpus is feasible for statistical models to reproduce. We invite researchers to explore new, richer models of (joint) parsing and NER on this complex and challenging task. 
Our nested named entity corpus will improve a wide range of NLP tasks, such as coreference resolution and question answering, allowing automated systems to understand and exploit the true structure of named entities.
2

Radford, William Edward John. "Linking named entities to Wikipedia." Thesis, The University of Sydney, 2014. http://hdl.handle.net/2123/12850.

Abstract:
Natural language is fraught with problems of ambiguity, including name reference. A name in text can refer to multiple entities just as an entity can be known by different names. This thesis examines how a mention in text can be linked to an external knowledge base (KB), in our case, Wikipedia. The named entity linking (NEL) task requires systems to identify the KB entry, or Wikipedia article, that a mention refers to; or, if the KB does not contain the correct entry, return NIL. Entity linking systems can be complex and we present a framework for analysing their different components, which we use to analyse three seminal systems which are evaluated on a common dataset and we show the importance of precise search for linking. The Text Analysis Conference (TAC) is a major venue for NEL research. We report on our submissions to the entity linking shared task in 2010, 2011 and 2012. The information required to disambiguate entities is often found in the text, close to the mention. We explore apposition, a common way for authors to provide information about entities. We model syntactic and semantic restrictions with a joint model that achieves state-of-the-art apposition extraction performance. We generalise from apposition to examine local descriptions specified close to the mention. We add local description to our state-of-the-art linker by using patterns to extract the descriptions and matching against this restricted context. Not only does this make for a more precise match, we are also able to model failure to match. Local descriptions help disambiguate entities, further improving our state-of-the-art linker. The work in this thesis seeks to link textual entity mentions to knowledge bases. Linking is important for any task where external world knowledge is used and resolving ambiguity is fundamental to advancing research into these problems.
3

Perkins, Drew. "Separating the Signal from the Noise: Predicting the Correct Entities in Named-Entity Linking." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-412556.

Abstract:
In this study, I constructed a named-entity linking system that maps between contextual word embeddings and knowledge graph embeddings to predict correct entities. To establish a named-entity linking system, I first applied named-entity recognition to identify the entities of interest. I then performed candidate generation via locality-sensitive hashing (LSH), where a candidate group of potential entities was created for each identified entity. Afterwards, my named-entity disambiguation component selected the most probable candidate. By concatenating contextual word embeddings and knowledge graph embeddings in my disambiguation component, I present a novel approach to named-entity linking. I conducted the experiments with the Kensho-Derived Wikimedia Dataset and the AIDA CoNLL-YAGO Dataset; the former dataset was used for deployment and the latter is a benchmark dataset for entity linking tasks. Three deep learning models were evaluated on the named-entity disambiguation component with different context embeddings. The evaluation was treated as a classification task, where I trained my models to select the correct entity from a list of candidates. By optimizing the named-entity linking through this methodology, this entire system can be used in recommendation engines with a high F1 of 86% using the former dataset. With the benchmark dataset, the proposed method is able to achieve an F1 of 79%.
4

Ruan, Wei. "Topic Segmentation and Medical Named Entities Recognition for Pictorially Visualizing Health Record Summary System." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39023.

Abstract:
Medical Information Visualization makes optimized use of digitized data of medical records, e.g. Electronic Medical Record. This thesis is an extended work of Pictorial Information Visualization System (PIVS) developed by Yongji Jin (Jin, 2016) and Jiaren Suo (Suo, 2017), which is a graphical visualization system that depicts a patient’s medical history summary as pictures in order to help patients and doctors easily capture patients’ past and present conditions. The summary information has been manually entered into the interface where the information can be taken from clinical notes. This study proposes a methodology of automatically extracting medical information from patients’ clinical notes by using the techniques of Natural Language Processing in order to produce medical history summarization from past medical records. We develop a Named Entities Recognition system to extract the information of the medical imaging procedure (performance date, human body location, imaging results and so on) and medications (medication names, frequency and quantities) by applying the model of conditional random fields with three main features and others: word-based, part-of-speech, MetaMap semantic features. Adding MetaMap semantic features is a novel idea which raised the accuracy compared to previous studies. Our evaluation shows that our model has higher accuracy than others on medication extraction as a case study. For enhancing the accuracy of entities extraction, we also propose a methodology of Topic Segmentation to clinical notes using boundary detection by determining the difference of classification probabilities of subsequence sequences, which is different from the traditional Topic Segmentation approaches such as TextTiling, TopicTiling and Beeferman Statistical Model. With Topic Segmentation combined for Named Entities Extraction, we observed higher accuracy for medication extraction compared to the case without the segmentation.
Finally, we also present a prototype of integrating our information extraction system with PIVS by simply building the database of interface coordinates and the terms of human body parts.
5

Hairston, Dorian. "PRETEND THE BALL IS NAMED JIM CROW." UKnowledge, 2018. https://uknowledge.uky.edu/english_etds/78.

Abstract:
The poems that form this collection titled, Pretend the Ball is Named Jim Crow, are written in the persona of Negro League Baseball’s Josh Gibson (1911-1947) and those closest to him. Gibson is credited with hitting over 800 home runs in his career and was the first Negro League Baseball Player to be inducted into Major League Baseball’s Hall of Fame without ever playing an inning of Major League Baseball.
6

Bauer, Christian. "Stereotypical Gender Roles and their Patriarchal Effects in A Streetcar Named Desire." Thesis, Högskolan i Halmstad, Sektionen för humaniora (HUM), 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-17170.

Abstract:
Stereotypical gender roles have probably existed as long as human culture and are such a natural part of our lives that we barely take notice of them. Nevertheless, images of what we perceive as typically masculine and feminine in appearance and behavior depend on the individual’s perception. Within each gender one can find different stereotypes. A commonly assumed idea is that men are hard and tough, while women are soft and vulnerable. I find it interesting how stereotypes function and how they are preserved almost without our awareness. Once I started reading and researching the topic of stereotypes it became clear to me that literature contains many stereotypes. The intention of this essay is to critically examine the stereotypical gender roles in the play A Streetcar Named Desire, written by Tennessee Williams in 1947. It is remarkable how the author portrays the three main characters: Stanley, Stella and Blanche. The sharp contrasts and the dynamics between them are fascinating.
7

Yoshida, Etsuko. "Patterns of use of referring expressions in English and Japanese dialogues." Thesis, University of Edinburgh, 2008. http://hdl.handle.net/1842/4036.

Abstract:
The main aim of the thesis is to investigate how discourse entities are linked with topic chaining and discourse coherence by showing that the choice and distribution of referring expressions are correlated with the center transition patterns in the centering framework. The thesis provides an integrated interpretation of the behaviour of referring expressions in discourse by considering the relation between referential choice and the local and global coherence of discourse. The thesis has three stages: (1) to provide a semantic and pragmatic perspective in a contrastive study of referring expressions in English and Japanese spontaneous dialogues, (2) to analyse the way anaphoric and deictic expressions can contribute to discourse organisation in structuring and focusing the specific discourse segment, and (3) to investigate the choice and distribution of referring expressions in the Map Task Corpus and to clarify the way the participants collaborate to judge the most salient entity in the current discourse against their common ground. Significantly, despite the grammatical differences in the form of reference between the two languages, discourse development in both data sets shows distinctive similarities in the process by which topic entities are introduced, established, and shifted away to subsequent topic entities. A comparison of the choice and distribution of referring expressions across the four center transition patterns shows that the crucial correspondence between English and Japanese referring expressions lies in the finding that topic chains of noun phrases are constructed and are treated like proper names in discourse. This suggests that full noun phrases play a major role when the topic entity is established in the course of discourse.
Since the existing centering model cannot handle the topic chain of noun phrases in anaphoric relations in terms of the local focus of discourse, centering must be integrated with a model of global focus to account for both pronouns and full noun phrases that can be used for continuations across segment boundaries. Based on Walker’s cache model, I argue that the forms of anaphors are not always shorter, and that the focus of attention is maintained by the chain of noun phrases rather than by (zero) pronouns, both within a discourse segment and over discourse segment boundaries. These processes are predicted and likely to underlie other uses of language as well. The result can modify the existing perspectives that the focus of attention is normally represented by attenuated forms of reference and that full noun phrases always show focus-shift. In addition, a necessary extension to the global coherence of discourse can link these anaphoric relations with deictic expressions over discourse segment boundaries. Finally, I argue that the choice and distribution of referring expressions in the Map Task Corpus depend on the way the participants collaborate to judge the most salient entity in the current discourse against their common ground.
8

Ek, Adam. "Extracting social networks from fiction : Imaginary and invisible friends: Investigating the social world of imaginary friends." Thesis, Stockholms universitet, Institutionen för lingvistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-145659.

Abstract:
This thesis develops an approach to extracting the social relations between characters in literary text in order to create a social network. The approach uses co-occurrences of named entities, keywords associated with the named entities, and the dependency relations that exist between the named entities to construct the network. Literary texts contain a large number of pronouns that stand for named entities; to resolve the antecedents of these pronouns, a pronoun resolution system is implemented based on a standard pronoun resolution algorithm. The results indicate that the pronoun resolution system finds the correct named entity in 60.4% of all cases. The social network is evaluated by comparing character importance rankings based on graph properties with independent, human-generated importance rankings. The generated social networks correlate moderately to strongly with the independent character ranking.
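Note: the co-occurrence idea in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it covers only sentence-level co-occurrence of character names, leaves out the keyword, dependency-relation and pronoun-resolution components, and uses weighted degree as one possible graph-based importance measure. The function names, the toy sentences and the character set are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(sentences, characters):
    """Count how often each pair of character names appears in the same sentence."""
    edges = Counter()
    for sentence in sentences:
        # Crude tokenisation (an assumption): strip periods and commas, split on spaces.
        tokens = set(sentence.replace(".", " ").replace(",", " ").split())
        present = sorted(name for name in characters if name in tokens)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


def degree_ranking(edges):
    """Rank characters by weighted degree (sum of the weights of their edges)."""
    degree = Counter()
    for (first, second), weight in edges.items():
        degree[first] += weight
        degree[second] += weight
    # Highest degree first; ties are broken alphabetically for a stable order.
    return [name for name, _ in sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))]


# Toy input, invented for illustration only.
sentences = [
    "Alice met Bob at the station.",
    "Bob and Carol argued, while Alice listened.",
    "Carol left.",
]
characters = {"Alice", "Bob", "Carol"}

edges = cooccurrence_network(sentences, characters)
ranking = degree_ranking(edges)
```

A ranking produced this way could then be compared with a human-made ranking using a rank correlation measure, which is the kind of evaluation the abstract describes.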
9

Tang, Ling-Xiang. "Link discovery for Chinese/English cross-language web information retrieval." Thesis, Queensland University of Technology, 2012. https://eprints.qut.edu.au/58416/1/Ling-Xiang_Tang_Thesis.pdf.

Abstract:
People now rely heavily on the Internet for information and knowledge. Wikipedia is an online multilingual encyclopaedia that contains a very large number of detailed articles covering most written languages. It is often considered to be a treasury of human knowledge. It includes extensive hypertext links between documents of the same language for easy navigation. However, the pages in different languages are rarely cross-linked except for direct equivalent pages on the same subject in different languages. This could pose serious difficulties to users seeking information or knowledge from sources in different languages, or where there is no equivalent page in one language or another. In this thesis, a new information retrieval task, cross-lingual link discovery (CLLD), is proposed to tackle the problem of the lack of cross-lingual anchored links in a knowledge base such as Wikipedia. In contrast to traditional information retrieval tasks, cross-lingual link discovery algorithms actively recommend a set of meaningful anchors in a source document and establish links to documents in an alternative language. In other words, cross-lingual link discovery is a way of automatically finding hypertext links between documents in different languages, which is particularly helpful for knowledge discovery in different language domains. This study is specifically focused on Chinese / English link discovery (C/ELD), a special case of the cross-lingual link discovery task. It involves natural language processing (NLP), cross-lingual information retrieval (CLIR) and cross-lingual link discovery. To justify the effectiveness of CLLD, a standard evaluation framework is also proposed. The evaluation framework includes topics, document collections, a gold standard dataset, evaluation metrics, and toolkits for run pooling, link assessment and system evaluation.
With the evaluation framework, the performance of CLLD approaches and systems can be quantified. This thesis contributes to the research on natural language processing and cross-lingual information retrieval in CLLD: 1) a new simple but effective Chinese segmentation method, n-gram mutual information, is presented for determining the boundaries of Chinese text; 2) a voting mechanism for named entity translation is demonstrated for achieving high precision in English / Chinese machine translation; 3) a link mining approach that mines the existing link structure for anchor probabilities achieves encouraging results in suggesting cross-lingual Chinese / English links in Wikipedia. This approach was examined in the experiments for better, automatic generation of cross-lingual links that were carried out as part of the study. The overall major contribution of this thesis is the provision of a standard evaluation framework for cross-lingual link discovery research. Such a framework is important in CLLD evaluation because it helps in benchmarking the performance of various CLLD systems and in identifying good CLLD realisation approaches. The evaluation methods and the evaluation framework described in this thesis have been utilised to quantify system performance in the NTCIR-9 Crosslink task, which is the first information retrieval track of this kind.
10

Amancio, Marcelo Adriano. "Elaboração textual via definição de entidades mencionadas e de perguntas relacionadas aos verbos em textos simplificados do português." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-31082011-122100/.

Abstract:
This research addresses the topic of Text Elaboration for low-literacy readers, i.e. people at the rudimentary and basic literacy levels according to the National Indicator of Functional Literacy (INAF, 2009). Text Elaboration is defined as a set of techniques that add redundant material to texts, traditionally definitions, synonyms, antonyms, or any external information intended to assist text comprehension. The goal of this master's project was to propose two original methods of Text Elaboration: (1) short definitions of the named entities that appear in a text and (2) wh-questions directed at the verbs of the clauses in a text. The first task used an existing named entity recognition system, Rembrandt, and short definitions from Wikipedia; the method was incorporated into the FACILITA EDUCATIVO web system, one of the tools developed in the PorSimples project. It was preliminarily evaluated with a small group of low-literacy readers, and the results were positive, indicating that the aid made reading easier for the participants. Generating wh-questions for the verbs of a clause is a new task that was defined, studied, implemented and evaluated during this research. Its evaluation was conducted with natural language processing specialists rather than with the target audience; they assessed the method positively and indicated which errors harm the quality of the automatically generated questions. There are good indications that the elaboration methods developed here can be useful in improving reading comprehension for low-literacy readers.

Books on the topic "English language — Named Entities"

1

Lyon, Tammie, illustrator. A groundhog named Grady. New York: Scholastic, 2006.

2

Franco, Betsy. A bat named Pat: -at. New York, N.Y: Scholastic, 2002.

3

Dodd, Philip. The Reverend Guppy's aquarium: From Joseph P. Frisbie to Roy Jacuzzi : how everyday items were named for extraordinary people. New York: Gotham Books, 2008.

4

What's in a name?: From Joseph P. Frisbie to Roy Jacuzzi : how everyday items were named for extraordinary people. New York: Gotham Books, 2007.

5

Tuleja, Tad. Namesakes: An entertaining guide to the origins of more than 300 words named for people. New York: McGraw-Hill, 1987.

6

Lehman, Barbara, ed. A chartreuse leotard in a magenta limousine: And other words named after people and places. New York: Hyperion Books for Children, 1994.

7

Dodd, Philip. The Reverend Guppy's aquarium: From Joseph P. Frisbie to Roy Jacuzzi : how everyday items were named for extraordinary people. New York: Gotham Books, 2008.

8

Krull, Kathleen. One fun day with Lewis Carroll: A celebration of wordplay and a girl named Alice. Boston, MA: Houghton Mifflin Harcourt, 2017.

9

The Reverend Guppy's Aquarium: From Joseph Frisbie to Roy Jacuzzi, How Everyday Items were Named for Extraordinary People. Gotham, 2007.

10

Let's Learn Readers: A Dog Named Opposite. Scholastic, Incorporated, 2014.


Book chapters on the topic "English language — Named Entities"

1

Barrière, Caroline. "Searching for Named Entities." In Natural Language Understanding in a Semantic Web Context, 23–38. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-41337-2_3.

2

Wang, Wei, Romaric Besançon, Olivier Ferret, and Brigitte Grau. "Semantic Clustering of Relations between Named Entities." In Advances in Natural Language Processing, 358–70. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-10888-9_36.

3

Collovini, Sandra, Bolivar Pereira, Henrique D. P. dos Santos, and Renata Vieira. "Annotating Relations Between Named Entities with Crowdsourcing." In Natural Language Processing and Information Systems, 290–97. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-91947-8_29.

4

Przybyła, Piotr. "Gathering Knowledge for Question Answering Beyond Named Entities." In Natural Language Processing and Information Systems, 412–17. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-19581-0_39.

5

Weissenbacher, Davy, and Christian Raymond. "Tree-Structured Named Entities Extraction from Competing Speech Transcriptions." In Natural Language Processing and Information Systems, 249–60. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-19581-0_22.

6

Galicia-Haro, Sofía N., Alexander Gelbukh, and Igor A. Bolshakov. "Identification of Composite Named Entities in a Spanish Textual Database." In Natural Language Processing and Information Systems, 395–400. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004. http://dx.doi.org/10.1007/978-3-540-27779-8_37.

7

Hajnicz, Elżbieta. "Mapping Named Entities from NKJP Corpus to Składnica Treebank and Polish Wordnet." In Language Processing and Intelligent Information Systems, 92–105. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-38634-3_11.

8

Babakov, Nikolay, David Dale, Varvara Logacheva, Irina Krotova, and Alexander Panchenko. "Studying the Role of Named Entities for Content Preservation in Text Style Transfer." In Natural Language Processing and Information Systems, 437–48. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-08473-7_40.

9

Piskorski, Jakub, Marcin Sydow, and Karol Wieloch. "Comparison of String Distance Metrics for Lemmatisation of Named Entities in Polish." In Human Language Technology. Challenges of the Information Society, 413–27. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04235-5_36.

10

Gupta, Rajdeep, and Sivaji Bandyopadhyay. "Testing the Effectiveness of Named Entities in Aligning Comparable English-Bengali Document Pair." In Communications in Computer and Information Science, 102–10. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37463-0_9.


Conference papers on the topic "English language — Named Entities"

1

Laitonjam, Lenin, Loitongbam Gyanendro Singh, and Sanasam Ranbir Singh. "Transliteration of English Loanwords and Named-Entities to Manipuri: Phoneme vs Grapheme Representation." In 2018 International Conference on Asian Language Processing (IALP). IEEE, 2018. http://dx.doi.org/10.1109/ialp.2018.8629141.

2

Maximova, Olga, and Tatiana Maykova. "PROPER NAMES AS TERMINOLOGY IN SOCIAL SCIENCE." In NORDSCI Conference Proceedings. Saima Consult Ltd, 2021. http://dx.doi.org/10.32008/nordsci2021/b1/v4/20.

Abstract:
Proper names reflect the interaction between society and language. They identify unique entities and are used to refer to them. At the same time, it is not uncommon for proper names to serve as a source for word-formation. It should be noted, however, that while in a natural language (notably English) proper names mostly give rise to denominal verbs or adjectives, terminologies are different. Most units that count as terms are nouns, which makes their semantics somewhat special. The paper is one of a series working towards a typology of sociological terminology and endeavors to analyze the terms whose etymology refers to a proper name (that is, eponymic terms). The research poses the following questions: whether this type of term is common in Social Science, what their structural and semantic distinctions and the mechanisms behind their motivation are, and whether they are culture-specific. The terms were manually retrieved from a data set of 2500 terminological units extracted from a number of dictionaries and other sources. They were further grouped by structural criteria and the nature of their eponymous components and subjected to morphological and semantic analyses. The research shows that, structurally, eponymic terms are morphological derivatives or compounds of two or more words, with their prevalence estimated at 2%. The authors come to the conclusion that terms of this type feature substantial diversity with regard to their eponymous components; they are motivated through the combination of encyclopedic knowledge of the entity, represented by the eponym, and the semantics of derivational morphemes or appellative components. Mythology-based eponymous terminology is represented by two groups, the first tracing back to Antiquity or biblical tradition, and the second of later origin, which requires a specific cultural experience for the meaning to be retrieved.
Further analysis shows that the latter type along with toponym-based terminology is culture-specific in relation to American culture.
3

Chinnakotla, Manoj Kumar, and Om P. Damani. "Experiences with English-Hindi, English-Tamil and English-Kannada transliteration tasks at NEWS 2009." In the 2009 Named Entities Workshop: Shared Task. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1699705.1699716.

4

Matsushita, Kyoumoto, Takuya Makino, and Tomoya Iwakura. "Improving Neural Language Processing with Named Entities." In International Conference Recent Advances in Natural Language Processing. INCOMA Ltd. Shoumen, BULGARIA, 2021. http://dx.doi.org/10.26615/978-954-452-072-4_107.

5

Haque, Rejwanul, Sandipan Dandapat, Ankit Kumar Srivastava, Sudip Kumar Naskar, and Andy Way. "English-Hindi transliteration using context-informed PB-SMT." In the 2009 Named Entities Workshop: Shared Task. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1699705.1699732.

6

Hong, Gumwon, Min-Jeong Kim, Do-Gil Lee, and Hae-Chang Rim. "A hybrid approach to English-Korean name transliteration." In the 2009 Named Entities Workshop: Shared Task. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1699705.1699733.

7

Das, Amitava, Asif Ekbal, Tapabrata Mandal, and Sivaji Bandyopadhyay. "English to Hindi machine transliteration system at NEWS 2009." In the 2009 Named Entities Workshop: Shared Task. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1699705.1699726.

8

Ren, Feiliang, Muhua Zhu, Huizhen Wang, and Jingbo Zhu. "Chinese-English organization name translation based on correlative expansion." In the 2009 Named Entities Workshop: Shared Task. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1699705.1699741.

9

Kwong, Oi Yee. "Graphemic approximation of phonological context for English-Chinese transliteration." In the 2009 Named Entities Workshop: Shared Task. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1699705.1699747.

10

Kato, Akihiko, Hiroyuki Shindo, and Yuji Matsumoto. "English Multiword Expression-aware Dependency Parsing Including Named Entities." In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Stroudsburg, PA, USA: Association for Computational Linguistics, 2017. http://dx.doi.org/10.18653/v1/p17-2068.
