Academic literature on the topic 'Named entity recognition legal documents transformer'


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Named entity recognition legal documents transformer.'


Journal articles on the topic "Named entity recognition legal documents transformer"

1

Yulianti, Evi, Naradhipa Bhary, Jafar Abdurrohman, Fariz Wahyuzan Dwitilas, Eka Qadri Nuranti, and Husna Sarirah Husin. "Named entity recognition on Indonesian legal documents: a dataset and study using transformer-based models." International Journal of Electrical and Computer Engineering (IJECE) 14, no. 5 (2024): 5489. http://dx.doi.org/10.11591/ijece.v14i5.pp5489-5501.

Abstract:
The large volume of court decision documents in Indonesia poses a challenge for researchers to assist legal practitioners in extracting useful information from the documents. This information can also benefit the general public by improving legal transparency, law enforcement, and people's understanding of the law implementation in Indonesia. A natural language processing task that extracts important information from a document is called named entity recognition (NER). In this study, the NER task is applied to legal domains, which is then referred to as legal entity recognition (LER) task. In this task, some important legal entities, such as judges, prosecutors, and advocates, are extracted from the decision documents. A new Indonesian LER dataset is built, called IndoLER data, consisting of approximately 1K decision documents with 20 types of fine-grained legal entities. Then, the transformer-based models, such as multilingual bidirectional encoder representations from transformers (BERT) or M-BERT, Indonesian BERT or IndoBERT, Indonesian robustly optimized BERT pretraining approach (RoBERTa) or IndoRoBERTa, XLM (cross lingual language model)-RoBERTa or XLMR, are proposed to solve the Indonesian LER task using this dataset. Our experimental results show that the RoBERTa-based models, such as XLM-R and IndoRoBERTa, can outperform the state-of-the-art deep-learning baselines using BiLSTM (bidirectional long short-term memory) and BiLSTM-conditional random field (BiLSTM-CRF) approaches by 7.2% to 7.9% and 2.1% to 2.6%, respectively. XLM-RoBERTa is shown to be the best-performing model, achieving the F1-score of 0.9295.
2

Dong, Hongsong, Yuehui Kong, Wenlian Gao, and Jihua Liu. "Named Entity Recognition for Public Interest Litigation Based on a Deep Contextualized Pretraining Approach." Scientific Programming 2022 (October 11, 2022): 1–14. http://dx.doi.org/10.1155/2022/7682373.

Abstract:
The named entity recognition (NER) in the field of public interest litigation can assist prosecutors in handling cases and provide them with specific entities in making legal documents. Previously, the context-free deep learning model is used to catch the semantic comprehension, in which the static word vector is obtained without considering the context. Moreover, this kind of method relies on word segmentation technology and cannot solve the error transmission caused by word segmentation inaccuracy, which brings great challenges to the Chinese NER task. To tackle the above issues, an entity recognition method based on pretraining is proposed. First, based on the basic entities, three legal ontologies, NERP, NERCGP, and NERFPP are developed to expand the named entity recognition corpus in the judicial field. Second, a variant of the pretrained model BERT (Bidirectional Encoder Representations from Transformer) called BERT-WWM (whole-word mask)-EXT(extra) is introduced to catch the text character-level word vector hierarchical and the context bidirectional features, which effectively solve the problem of task boundary division of named entities. Then, to further improve the model recognition effect, the general knowledge learned from the pretrained model is used to fit the downstream neural network BiLSTM (bi-long short-term memory), and at the end of the architecture, CRF (conditional random fields) is introduced to restrict the label relationship. Finally, the experimental results show that the proposed method is more effective than the existing methods, which reach 96% and 90% in the F1 index of NER and NERP entities, respectively.
3

Aejas, Bajeela, Abdelhak Belhi, and Abdelaziz Bouras. "Using AI to Ensure Reliable Supply Chains: Legal Relation Extraction for Sustainable and Transparent Contract Automation." Sustainability 17, no. 9 (2025): 4215. https://doi.org/10.3390/su17094215.

Abstract:
Efficient contract management is essential for ensuring sustainable and reliable supply chains; yet, traditional methods remain manual, error-prone, and inefficient, leading to delays, financial risks, and compliance challenges. AI and blockchain technology offer a transformative alternative, enabling the establishment of automated, transparent, and self-executing smart contracts that enhance efficiency and sustainability. As part of AI-driven smart contract automation, we previously implemented contractual clause extraction using question answering (QA) and named entity recognition (NER). This paper presents the next step in the information extraction process, relation extraction (RE), which aims to identify relationships between key legal entities and convert them into structured business rules for smart contract execution. To address RE in legal contracts, we present a novel hierarchical transformer model that captures sentence- and document-level dependencies. It incorporates global and segment-based attention mechanisms to extract complex legal relationships spanning multiple sentences. Given the scarcity of publicly available contractual datasets, we also introduce the contractual relation extraction (ContRE) dataset, specifically curated to support relation extraction tasks in legal contracts, that we use to evaluate the proposed model. Together, these contributions enable the structured automation of legal rules from unstructured contract text, advancing the development of AI-powered smart contracts.
4

Ajay Mukund, S., and K. S. Easwarakumar. "Optimizing Legal Text Summarization Through Dynamic Retrieval-Augmented Generation and Domain-Specific Adaptation." Symmetry 17, no. 5 (2025): 633. https://doi.org/10.3390/sym17050633.

Abstract:
Legal text summarization presents distinct challenges due to the intricate and domain-specific nature of legal language. This paper introduces a novel framework integrating dynamic Retrieval-Augmented Generation (RAG) with domain-specific adaptation to enhance the accuracy and contextual relevance of legal document summaries. The proposed Dynamic Legal RAG system achieves a vital form of symmetry between information retrieval and content generation, ensuring that retrieved legal knowledge is both comprehensive and precise. Using the BM25 retriever with top-3 chunk selection, the system optimizes relevance and efficiency, minimizing redundancy while maximizing legally pertinent content. A key design feature is the compression ratio constraint (0.05 to 0.5), maintaining structural symmetry between the original judgment and its summary by balancing representation and information density. Extensive evaluations establish BM25 as the most effective retriever, striking an optimal balance between precision and recall. A comparative analysis of transformer-based (decoder-only) models (DeepSeek-7B, LLaMA 2-7B, and LLaMA 3.1-8B) demonstrates that LLaMA 3.1-8B, enriched with Legal Named Entity Recognition (NER) and the Dynamic RAG system, achieves superior performance with a BERTScore of 0.89. This study lays a strong foundation for future research in hybrid retrieval models, adaptive chunking strategies, and legal-specific evaluation metrics, with practical implications for case law analysis and automated legal drafting.
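The retrieval step and the compression-ratio constraint mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a plain Okapi BM25 scoring function with whitespace tokenization, and the function names `bm25_top_k` and `within_compression_ratio`, the parameters `k1` and `b`, and the word-count definition of the ratio are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_top_k(query, chunks, k=3, k1=1.5, b=0.75):
    """Return the indices of the k chunks with the highest Okapi BM25 score.

    Hypothetical helper: whitespace tokenization and the k1/b defaults are
    illustrative assumptions, not values taken from the paper.
    """
    if not chunks:
        return []
    tokenized = [chunk.lower().split() for chunk in chunks]
    n = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n

    # Document frequency: in how many chunks each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)

    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def within_compression_ratio(source, summary, low=0.05, high=0.5):
    """Check that summary length / source length (in words) lies in [low, high]."""
    source_len = len(source.split())
    if source_len == 0:
        return False
    ratio = len(summary.split()) / source_len
    return low <= ratio <= high
```

With `k=3` the first function returns the three highest-scoring chunk indices, which would then be passed to the generator; the second function only checks a finished summary against the 0.05 to 0.5 bound and does not enforce it during generation.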
5

Lu, Rui, and Linying Li. "Named Entity Recognition Method of Chinese Legal Documents Based on Parallel Instance Query Network." International Journal of Digital Crime and Forensics 16, no. 1 (2025): 1–19. https://doi.org/10.4018/ijdcf.367470.

Abstract:
Legal Named Entity Recognition (NER) is crucial in intelligent judiciary systems, focusing on identifying case-specific entities in legal texts. It helps convert unstructured legal documents into structured data, improving e-discovery efficiency. However, challenges arise from insufficient understanding of legal terminology, leading to errors in identifying long and nested entity boundaries. To address this, a Legal NER method based on a parallel instance query network is proposed. This method uses learnable instance queries to extract entities in parallel, with a BERT+BiLSTM+attention structure to encode context and query information. Entity prediction is performed using a pointer network to identify span boundaries and entity types. A linear label assignment mechanism aligns legal entities with queries for more accurate labeling. Experimental results show that the model outperforms existing methods, and further validation through ablation experiments and case studies supports its effectiveness, offering valuable insights for advancing legal NER research.
6

Baviskar, Dipali, Swati Ahirrao, and Ketan Kotecha. "Multi-Layout Invoice Document Dataset (MIDD): A Dataset for Named Entity Recognition." Data 6, no. 7 (2021): 78. http://dx.doi.org/10.3390/data6070078.

Abstract:
The day-to-day working of an organization produces a massive volume of unstructured data in the form of invoices, legal contracts, mortgage processing forms, and many more. Organizations can utilize the insights concealed in such unstructured documents for their operational benefit. However, analyzing and extracting insights from such numerous and complex unstructured documents is a tedious task. Hence, the research in this area is encouraging the development of novel frameworks and tools that can automate the key information extraction from unstructured documents. However, the availability of standard, best-quality, and annotated unstructured document datasets is a serious challenge for accomplishing the goal of extracting key information from unstructured documents. This work expedites the researcher’s task by providing a high-quality, highly diverse, multi-layout, and annotated invoice documents dataset for extracting key information from unstructured documents. Researchers can use the proposed dataset for layout-independent unstructured invoice document processing and to develop an artificial intelligence (AI)-based tool to identify and extract named entities in the invoice documents. Our dataset includes 630 invoice document PDFs with four different layouts collected from diverse suppliers. As far as we know, our invoice dataset is the only openly available dataset comprising high-quality, highly diverse, multi-layout, and annotated invoice documents.
7

Nastou, Katerina, Mikaela Koutrouli, Sampo Pyysalo, and Lars Juhl Jensen. "Improving dictionary-based named entity recognition with deep learning." Bioinformatics 40, Supplement_2 (2024): ii45–ii52. http://dx.doi.org/10.1093/bioinformatics/btae402.

Abstract:
Motivation: Dictionary-based named entity recognition (NER) allows terms to be detected in a corpus and normalized to biomedical databases and ontologies. However, adaptation to different entity types requires new high-quality dictionaries and associated lists of blocked names for each type. The latter are so far created by identifying cases that cause many false positives through manual inspection of individual names, a process that scales poorly. Results: In this work, we aim to improve block lists by automatically identifying names to block, based on the context in which they appear. By comparing results of three well-established biomedical NER methods, we generated a dataset of over 12.5 million text spans where the methods agree on the boundaries and type of entity tagged. These were used to generate positive and negative examples of contexts for four entity types (genes, diseases, species, and chemicals), which were used to train a Transformer-based model (BioBERT) to perform entity type classification. Application of the best model (F1-score = 96.7%) allowed us to generate a list of problematic names that should be blocked. Introducing this into our system doubled the size of the previous list of corpus-wide blocked names. In addition, we generated a document-specific list that allows ambiguous names to be blocked in specific documents. These changes boosted text mining precision by ∼5.5% on average, and over 8.5% for chemical and 7.5% for gene names, positively affecting several biological databases utilizing this NER system, like the STRING database, with only a minor drop in recall (0.6%). Availability and implementation: All resources are available through Zenodo https://doi.org/10.5281/zenodo.11243139 and GitHub https://doi.org/10.5281/zenodo.10289360.
8

Mazur, Pawel, and Robert Dale. "Handling conjunctions in named entities." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 30, no. 1 (2007): 49–68. http://dx.doi.org/10.1075/li.30.1.05maz.

Abstract:
Although the literature contains reports of very high accuracy figures for the recognition of named entities in text, there are still some named entity phenomena that remain problematic for existing text processing systems. One of these is the ambiguity of conjunctions in candidate named entity strings, an all-too-prevalent problem in corporate and legal documents. In this paper, we distinguish four uses of the conjunction in these strings, and explore the use of a supervised machine learning approach to conjunction disambiguation trained on a very limited set of ‘name internal’ features that avoids the need for expensive lexical or semantic resources. We achieve 84% correctly classified examples using k-fold evaluation on a data set of 600 instances. We argue that further improvements are likely to require the use of wider domain knowledge and name external features.
9

van Toledo, Chaïm, Friso van Dijk, and Marco Spruit. "Dutch Named Entity Recognition and De-Identification Methods for the Human Resource Domain." International Journal on Natural Language Computing 9, no. 6 (2020): 23–34. http://dx.doi.org/10.5121/ijnlc.2020.9602.

Abstract:
The human resource (HR) domain contains various types of privacy-sensitive textual data, such as e-mail correspondence and performance appraisals. Doing research on these documents brings several challenges, one of them being anonymisation. In this paper, we evaluate the current Dutch text de-identification methods for the HR domain in four steps. First, one of these methods is updated with the latest named entity recognition (NER) models. The result is that the NER model based on the CoNLL 2002 corpus in combination with the BERTje transformer gives the best combination for suppressing persons (recall 0.94) and locations (recall 0.82). For suppressing gender, DEDUCE performs best (recall 0.53). The second evaluation is based on strict de-identification of entities (a person must be suppressed as a person), and the third on a loose sense of de-identification (no matter how a person is suppressed, as long as it is suppressed). In the fourth and last step, a new kind of NER dataset is tested for recognising job titles in texts.
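The difference between the strict and loose evaluations described in this abstract can be made concrete with a small recall calculation. The sketch below is an illustration only: the function name `deid_recall`, the span-to-type dictionaries, and the requirement of an exact span match are assumptions, not details taken from the paper.

```python
def deid_recall(gold, predicted, strict=True):
    """Recall of de-identification over gold entities.

    gold and predicted map a (start, end) character span to an entity type.
    strict=True counts a hit only when the predicted type equals the gold
    type; strict=False counts any suppression of the span as a hit.
    Exact span matching is an assumption made to keep the sketch short.
    """
    if not gold:
        return 0.0
    hits = 0
    for span, gold_type in gold.items():
        predicted_type = predicted.get(span)
        if predicted_type is None:
            continue  # entity was not suppressed at all
        if not strict or predicted_type == gold_type:
            hits += 1
    return hits / len(gold)


# Hypothetical example: four gold entities, three suppressed, one mistyped.
gold = {(0, 5): "PERSON", (10, 19): "LOCATION", (25, 30): "PERSON", (40, 44): "PERSON"}
predicted = {(0, 5): "PERSON", (10, 19): "PERSON", (25, 30): "PERSON"}
strict_recall = deid_recall(gold, predicted, strict=True)   # 2 of 4 entities
loose_recall = deid_recall(gold, predicted, strict=False)   # 3 of 4 entities
```

Loose recall is never lower than strict recall, since every strict hit is also a loose hit.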
10

Zhao, Liupeng. "Legal Impact of Digital Information Technology on the Chain of Evidence in Criminal Cases." Journal of Combinatorial Mathematics and Combinatorial Computing 123, no. 1 (2024): 103–21. https://doi.org/10.61091/jcmcc123-08.

Abstract:
Criminal evidence serves as the foundation for criminal proceedings, with evidence used to ascertain the facts of cases being critical to achieving fairness and justice. This study explores the application of digital information technology in building a data resource base for criminal cases, formulating standard evidence guideline rules, and optimizing evidence verification procedures. A named entity recognition model based on the SVM-BiLSTM-CRF framework is proposed, coupled with an evidence relationship extraction model using the Transformer framework to improve evidence information extraction through sequential features and global feature capturing. Results show that the F1 value for entity recognition in criminal cases reaches 94.19%, and the evidence extraction model achieves an F1 value of 81.83% on the CAIL-A dataset. These results are utilized to construct evidence guidelines, helping case handlers increase case resolution rates to approximately 99%. The application of digital technology enhances evidence collection efficiency, accelerates case closures, and offers a pathway to improving judicial credibility.

Dissertations / Theses on the topic "Named entity recognition legal documents transformer"

1

Andersson-Säll, Tim. "Transforming Legal Entity Recognition." Thesis, Uppsala universitet, Statistiska institutionen, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-447240.

Abstract:
Transformer-based architectures have in recent years advanced state-of-the-art performance in Natural Language Processing. Researchers have successfully adapted such models to downstream tasks within NLP in a domain-specific setting. This thesis examines the application of these models to the legal domain by doing Named Entity Recognition (NER) in a setting of scarce training data. Three different pre-trained BERT models are fine-tuned on a set of 101 court case documents, whereof one model is pre-trained on legal corpora and the other two on general corpora. Experiments are run to evaluate the models’ predictive performance given smaller or larger quantities of data to fine-tune on. Results show that BERT models work reasonably well for NER with legal data. Unlike many other domain-specific BERT models, the BERT model trained on legal corpora does not outperform the base models. Modest amounts of annotated data seem sufficient for reasonably good performance.
2

Constum, Thomas. "Extraction d'information dans des documents historiques à l'aide de grands modèles multimodaux". Electronic Thesis or Diss., Normandie, 2024. http://www.theses.fr/2024NORMR083.

Abstract:
This thesis focuses on automatic information extraction from historical handwritten documents, within the framework of the POPP and EXO-POPP projects. The POPP project focuses on handwritten census tables from Paris (1921-1946), while EXO-POPP deals with marriage records from the Seine department (1880-1940). The main objective is to develop an end-to-end architecture for information extraction from complete documents, avoiding explicit segmentation steps. Initially, a sequential processing pipeline was developed for the POPP project, enabling the automatic extraction of information for 9 million individuals across 300,000 pages. Then, an end-to-end architecture for information extraction was implemented for EXO-POPP, based on a convolutional encoder and a Transformer decoder, with the insertion of special symbols encoding the information to be extracted. Subsequently, the integration of large language models based on the Transformer architecture led to the creation of the DANIEL model, which achieved a new state of the art on several public datasets (RIMES 2009 and M-POPP for handwriting recognition, IAM NER for information extraction), while offering faster inference compared to existing approaches. Finally, two public datasets from the POPP and EXO-POPP projects were made available, along with the code and weights of the DANIEL model.

Book chapters on the topic "Named entity recognition legal documents transformer"

1

Vardhan, Harsh, Nitish Surana, and B. K. Tripathy. "Named-Entity Recognition for Legal Documents." In Advances in Intelligent Systems and Computing. Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-3383-9_43.

2

Leitner, Elena, Georg Rehm, and Julian Moreno-Schneider. "Fine-Grained Named Entity Recognition in Legal Documents." In Lecture Notes in Computer Science. Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-33220-4_20.

3

Zhang, Xinrui, and Xudong Luo. "A Machine-Reading-Comprehension Method for Named Entity Recognition in Legal Documents." In Communications in Computer and Information Science. Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-1645-0_19.

4

Albuquerque, Hidelberg O., Ellen Souza, Adriano L. I. Oliveira, et al. "On the Assessment of Deep Learning Models for Named Entity Recognition of Brazilian Legal Documents." In Progress in Artificial Intelligence. Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-49011-8_8.

5

N., Shyamala Devi, and Grace Hannah J. "Sentiment-Based Summarization of Legal Documents Using Natural Language Processing (NLP) Techniques." In Advances in Information Security, Privacy, and Ethics. IGI Global, 2024. https://doi.org/10.4018/979-8-3693-6665-3.ch001.

Abstract:
This chapter discusses the application of Natural Language Processing (NLP) in the legal domain for entity identification and the generation of hierarchy-based applications. NLP techniques, such as Named Entity Recognition (NER) and information extraction, are utilized to analyse legal documents, extract key information, and understand contextual relationships. The integration of NLP facilitates tasks such as legal document summarization, contract analysis, automated document generation, and legal research, thereby enhancing the efficiency and accuracy of legal processes. The creation of hierarchical structures within documents, language translation, sentiment analysis, and compliance monitoring further contribute to the comprehensive utilization of NLP in the legal field. The abstract highlights the transformative impact of NLP in streamlining legal workflows, improving information retrieval, and ensuring compliance with evolving legal standards. Most judicial systems in the world, centered around the Supreme Court as the highest-level court, grapple with a substantial caseload, receiving over 7,500 cases annually but hearing fewer than 150. The Supreme Court's unique role in interpreting the constitutionality of laws and establishing precedents underscores its significant influence. Challenges arise from the overwhelming caseload, leading to increased use of plea bargains and potential concerns about justice. To address these issues, a proposed research initiative suggests employing machine learning and natural language processing to accelerate Supreme Court decisions, aiming to uncover pivotal aspects that greatly influence the outcomes and to enhance our comprehension of the Indian legal system's functioning and constraints. This chapter presents a methodology for sentiment-based summarization of legal documents using NLP techniques. The process involves pre-processing the text, conducting sentiment analysis to label sentences as positive, negative, or neutral, and then summarizing the document while incorporating the identified sentiments. Special attention is paid to legal-specific considerations such as terminology and context.

Conference papers on the topic "Named entity recognition legal documents transformer"

1

Bhandari, Ayush, Priyanshu Giriyan, Prathamesh Gawade, and Aarti Sahitya. "Evaluating Transformer Models for Named Entity Recognition in Indian Legal Texts." In 2025 3rd International Conference on Disruptive Technologies (ICDT). IEEE, 2025. https://doi.org/10.1109/icdt63985.2025.10986633.

2

Verwer, Nico. "Plain text processing in structured documents." In Declarative Amsterdam. John Benjamins, 2020. http://dx.doi.org/10.1075/da.2020.verwer.plain-text-processing.

Abstract:
Applications that analyze and process natural language can be used for things like named entity recognition, anonymization, topic extraction, and sentiment analysis. In most cases, these applications use the plain text of a document, and may add or change markup. This causes problems when the original document already contains markup that must be preserved. The text to be analyzed may run across markup boundaries, and newly generated markup may lead to unbalanced (non-well-formed) structures. This presentation shows how the Separated Markup API for XML (SMAX) can be used to apply natural language processing to XML documents. It preserves the existing document structure and allows for balanced insertion of new markup. A demonstration will be given of the use of SMAX for extracting and marking references in legal documents. This Link eXtractor was built for the Dutch center for governmental publications. SMAX and Simple Pipelines of Event API Transformers (SPEAT) will be available as open source software at the time of Declarative Amsterdam.
3

Li, Xiaolin, Zhuohao Chen, Gang Xu, and Bowen Huang. "Named entity recognition of legal documents based on cascade model." In 2021 International Symposium on Computer Technology and Information Science (ISCTIS). IEEE, 2021. http://dx.doi.org/10.1109/isctis51085.2021.00073.

4

Yuan, Zhenzhen, and Hong Zhang. "Improving Named Entity Recognition of Chinese Legal Documents by Lexical Enhancement." In 2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA). IEEE, 2021. http://dx.doi.org/10.1109/icaica52286.2021.9498036.

5

da Silva, F. X. B., G. M. C. Guimarães, R. M. Marcacini, et al. "Named Entity Recognition Approaches Applied to Legal Document Segmentation." In Symposium on Knowledge Discovery, Mining and Learning. Sociedade Brasileira de Computação - SBC, 2022. http://dx.doi.org/10.5753/kdmile.2022.227949.

Abstract:
Document Segmentation is a method of dividing a document into smaller parts, known as segments, which share similarities that allow machines to distinguish between them. It might be useful to classify these segments, making it a problem with two steps: (I) the extraction of the segments; and (II) the annotation of these segments. The Named Entity Recognition problem's goal is to identify and classify entities within a text, having also to deal with those two questions: extraction and classification. In this study, we tackle the problem of Document Segmentation and the annotation of these segments through NER approaches, using CRF, CNN-CNN-LSTM and CNN-biLSTM-CRF models. The study is focused on Brazilian legal documents, proposing a data set of 127 annotated Portuguese texts from the Official Gazette of the Federal District, published between 2001 and 2015. The experiments were made using word-based and sentence-based models, with CRF sentence-based model showing the best results.
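The two steps named in this abstract, extracting segments and annotating them, correspond to how token-level NER output is usually decoded into labeled spans. The sketch below assumes a BIO tagging scheme, which the abstract does not specify; the function name `bio_to_segments` and the label names in the example are hypothetical.

```python
def bio_to_segments(tokens, tags):
    """Group tokens into (label, text) segments from BIO tags.

    Assumes tags of the form "B-X", "I-X" or "O". An "I-X" tag that does
    not continue a segment of type X is treated as the start of a new one.
    """
    segments = []
    current_label, current_tokens = None, []

    def flush():
        # Close the segment being built, if any.
        if current_tokens:
            segments.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != current_label):
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            current_tokens.append(token)
        else:  # "O": outside any segment
            flush()
            current_label, current_tokens = None, []
    flush()
    return segments


# Hypothetical gazette-like example; the labels are illustrative only.
tokens = ["DECREE", "No", "12", "of", "2010", "appoints", "Maria", "Silva"]
tags = ["B-ACT", "I-ACT", "I-ACT", "O", "B-DATE", "O", "B-PERSON", "I-PERSON"]
segments = bio_to_segments(tokens, tags)
```

The same decoding applies whether the units being tagged are words or whole sentences, which matches the word-based and sentence-based variants the abstract compares.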
6

Zhang, Xinrui, Xudong Luo, and Jiaye Wu. "A RoBERTa-GlobalPointer-Based Method for Named Entity Recognition of Legal Documents." In 2023 International Joint Conference on Neural Networks (IJCNN). IEEE, 2023. http://dx.doi.org/10.1109/ijcnn54540.2023.10191275.

7

Samarawickrama, Chamodi, Melonie de Almeida, Nisansa de Silva, Gathika Ratnayaka, and Amal Shehan Perera. "Party Identification of Legal Documents using Co-reference Resolution and Named Entity Recognition." In 2020 IEEE 15th International Conference on Industrial and Information Systems (ICIIS). IEEE, 2020. http://dx.doi.org/10.1109/iciis51140.2020.9342720.

8

Shi, Jianwei, Kai Zheng, ZhiHua Zhang, and Qi Liu. "A Named Entity Recognition Method Based on Deep Learning For Chinese Legal Documents." In 2022 7th International Conference on Image, Vision and Computing (ICIVC). IEEE, 2022. http://dx.doi.org/10.1109/icivc55077.2022.9887060.

9

Trias, Fernando, Hongming Wang, Sylvain Jaume, and Stratos Idreos. "Named Entity Recognition in Historic Legal Text: A Transformer and State Machine Ensemble Method." In Proceedings of the Natural Legal Language Processing Workshop 2021. Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.nllp-1.18.

10

Keshavarz, Hossein, Zografoula Vagena, Pigi Kouki, et al. "Named Entity Recognition in Long Documents: An End-to-end Case Study in the Legal Domain." In 2022 IEEE International Conference on Big Data (Big Data). IEEE, 2022. http://dx.doi.org/10.1109/bigdata55660.2022.10020873.
