Journal articles on the topic 'Legal dataset'

Author: Grafiati

Published: 6 September 2023

Last updated: 26 July 2025

Consult the top 50 journal articles for your research on the topic 'Legal dataset.'

1

Kunčič, Aljaž. "Institutional quality dataset." Journal of Institutional Economics 10, no. 1 (2013): 135–61. http://dx.doi.org/10.1017/s1744137413000192.

Abstract:

In this paper, we emphasize the role of institutions as the underlying basis for economic and social activity. We describe and compare different institutional classification systems, which is rarely done in the literature, and show how to empirically operationalize institutional concepts. More than 30 established institutional indicators can be clustered into three homogeneous groups of formal institutions: legal, political and economic, which capture to a large extent the complete formal institutional environment of a country. We compute the latent quality of legal, political and econ

2

Zhong, Haoxi, Chaojun Xiao, Cunchao Tu, Tianyang Zhang, Zhiyuan Liu, and Maosong Sun. "JEC-QA: A Legal-Domain Question Answering Dataset." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (2020): 9701–8. http://dx.doi.org/10.1609/aaai.v34i05.6519.

Abstract:

We present JEC-QA, the largest question answering dataset in the legal domain, collected from the National Judicial Examination of China. The examination is a comprehensive evaluation of professional skills for legal practitioners. College students are required to pass the examination to be certified as a lawyer or a judge. The dataset is challenging for existing question answering methods, because both retrieving relevant materials and answering questions require the ability of logic reasoning. Due to the high demand of multiple reasoning abilities to answer legal questions, the state-of-the-

3

Ratnayaka, Gathika, Nisansa de Silva, Amal Shehan Perera, Gayan Kavirathne, Thirasara Ariyarathna, and Anjana Wijesinghe. "Context Sensitive Verb Similarity Dataset for Legal Information Extraction." Data 7, no. 7 (2022): 87. http://dx.doi.org/10.3390/data7070087.

Abstract:

Existing literature demonstrates that verbs are pivotal in legal information extraction tasks due to their semantic and argumentative properties. However, granting computers the ability to interpret the meaning of a verb and its semantic properties in relation to a given context can be considered as a challenging task, mainly due to the polysemic and domain specific behaviours of verbs. Therefore, developing mechanisms to identify behaviors of verbs and evaluate how artificial models detect the domain specific and polysemic behaviours of verbs can be considered as tasks with significant import

4

Lin, Chun-Hsien, and Pu-Jen Cheng. "LARQS: An Analogical Reasoning Evaluation Dataset for Legal Word Embedding." International Journal on Natural Language Computing 11, no. 3 (2022): 1–16. http://dx.doi.org/10.5121/ijnlc.2022.11301.

Abstract:

Applying natural language processing-related algorithms is currently a popular project in legal applications, for instance, document classification of legal documents, contract review and machine translation. Using the above machine learning algorithms, all need to encode the words in the document in the form of vectors. The word embedding model is a modern distributed word representation approach and the most common unsupervised word encoding method. It facilitates subjecting other algorithms and subsequently performing the downstream tasks of natural language processing vis-à-vis. The most c

5

Chun-Hsien, Lin, and Cheng Pu-Jen. "LARQS: AN ANALOGICAL REASONING EVALUATION DATASET FOR LEGAL WORD EMBEDDING." International Journal on Natural Language Computing (IJNLC) 11, no. 3 (2022): 16. https://doi.org/10.5281/zenodo.6838828.

Abstract:

Applying natural language processing-related algorithms is currently a popular project in legal applications, for instance, document classification of legal documents, contract review and machine translation. Using the above machine learning algorithms, all need to encode the words in the document in the form of vectors. The word embedding model is a modern distributed word representation approach and the most common unsupervised word encoding method. It facilitates subjecting other algorithms and subsequently performing the downstream tasks of natural language processing vis-à-vis. The

6

Topaz, Chad M. "A structured dataset of the federalist society’s public engagements." F1000Research 14 (February 14, 2025): 213. https://doi.org/10.12688/f1000research.161735.1.

Abstract:

Background: The Federalist Society, a leading conservative legal organization, has played a significant role in shaping the American judiciary for decades. Despite its influence, comprehensive empirical data on the organization remains scarce. We address this gap by systematically documenting 20,205 public events hosted by the Society from 1984 to 2024, with substantive coverage from 2007 onward. Methods: Following ethical best practices in data collection and ownership, we gathered event metadata (including titles, dates, locations, sponsors, topics, and speakers) via web scraping from the Federa

7

Zein, Hazem, Samer Chantaf, Régis Fournier, and Amine Nait-Ali. "Generative adversarial networks for anonymous acneic face dataset generation." PLOS ONE 19, no. 4 (2024): e0297958. http://dx.doi.org/10.1371/journal.pone.0297958.

Abstract:

It is well known that the performance of any classification model is effective if the dataset used for the training process and the test process satisfy some specific requirements. In other words, the more the dataset size is large, balanced, and representative, the more one can trust the proposed model’s effectiveness and, consequently, the obtained results. Unfortunately, large-size anonymous datasets are generally not publicly available in biomedical applications, especially those dealing with pathological human face images. This concern makes using deep-learning-based approaches challengin

8

Armona, Luis, and Adam M. Rosenberg. "Measuring the Market for Legal Firearms." AEA Papers and Proceedings 114 (May 1, 2024): 52–57. http://dx.doi.org/10.1257/pandp.20241082.

Abstract:

The Massachusetts Firearms Records Bureau recently published administrative data covering the universe of legal firearm transactions in the state. We use these data to validate state-level background checks as a proxy for firearm transactions and show that historical trends in transactions within Massachusetts align with the rest of the United States. Using auxiliary data from a national survey, we show that the Massachusetts dataset can detect patterns in the demographics of both gun ownership and type of firearm purchased. Our analysis suggests that this dataset is a promising source of info

9

Shaheen, Z., D. I. Mouromtsev, and I. Postny. "RuLegalNER: a new dataset for Russian legal named entities recognition." Scientific and Technical Journal of Information Technologies, Mechanics and Optics 23, no. 4 (2023): 854–57. http://dx.doi.org/10.17586/2226-1494-2023-23-4-854-857.

10

Owsiak, Andrew P., Allison K. Cuttner, and Brent Buck. "The International Border Agreements Dataset." Conflict Management and Peace Science 35, no. 5 (2016): 559–76. http://dx.doi.org/10.1177/0738894216646978.

Abstract:

We introduce a dataset that focuses on the delimitation of interstate borders under international law—the International Border Agreements Dataset (IBAD). This dataset contains information on the agents involved in (e.g. states, third-parties, and colonial powers), methods used during (e.g. negotiation, mediation, arbitration, adjudication, administrative decrees, post-war conferences, and plebiscites), and outcomes of (e.g. full and intermediate agreements) the border settlement process during the period 1816–2001. Our focus on international legal agreements and the process that produces them

11

Longpre, Shayne, Robert Mahari, Anthony Chen, et al. "A large-scale audit of dataset licensing and attribution in AI." Nature Machine Intelligence 6, no. 8 (2024): 975–87. http://dx.doi.org/10.1038/s42256-024-00878-8.

Abstract:

The race to train language models on vast, diverse and inconsistently documented datasets raises pressing legal and ethical concerns. To improve data transparency and understanding, we convene a multi-disciplinary effort between legal and machine learning experts to systematically audit and trace more than 1,800 text datasets. We develop tools and standards to trace the lineage of these datasets, including their source, creators, licences and subsequent use. Our landscape analysis highlights sharp divides in the composition and focus of data licenced for commercial use. Important categ

12

Enas Mohamed Ali Quteishat. "Predictive Modelling in Legal Decision-Making: Leveraging Machine Learning for Forecasting Legal Outcomes." Journal of Electrical Systems 20, no. 3 (2024): 2060–71. http://dx.doi.org/10.52783/jes.4006.

Abstract:

Predictive modelling holds significant promise in enhancing legal decision-making processes, particularly within the realm of the Supreme Court of the United States (SCOTUS). This paper investigates the application of Machine Learning (ML) algorithms to forecast legal outcomes, utilizing a dataset comprising SCOTUS cases. Through rigorous preprocessing and analysis, various ML techniques including Decision Trees, Random Forest, Support Vector Machines (SVM), Naive Bayes, k-Nearest Neighbors (k-NN), and XGBoost are applied. The performance of these models is evaluated using precision, recall, F

13

Vuong, Pham, Huy Le Hoang, Phu Ngo Thinh, Nguyen Binh, Nguyen Diem, and D. Nguyen Hien. "Enhancing legal research through knowledge-infused information retrieval for Vietnamese labor law." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 4 (2024): 3962–73. https://doi.org/10.11591/ijai.v13.i4.pp3962-3973.

Abstract:

The role of intelligent information retrieval systems in legal research optimization has become increasingly recognized. There are many methods for exhibiting advancements in the proficient retrieval of legal documents. However, those methods fail to tackle the specific challenges encountered in real-world labor law searches. This research breaks new ground in Vietnamese labor law retrieval by leveraging a comprehensive dataset of 300,000 documents across diverse categories (20 document types and 27 legal fields) to train and evaluate retrieval models specifically designed for Vietnamese labor

14

Crossfield, Samantha S. R., Kieran Zucker, Paul Baxter, et al. "A data flow process for confidential data and its application in a health research project." PLOS ONE 17, no. 1 (2022): e0262609. http://dx.doi.org/10.1371/journal.pone.0262609.

Abstract:

Background: The use of linked healthcare data in research has the potential to make major contributions to knowledge generation and service improvement. However, using healthcare data for secondary purposes raises legal and ethical concerns relating to confidentiality, privacy and data protection rights. Using a linkage and anonymisation approach that processes data lawfully and in line with ethical best practice to create an anonymous (non-personal) dataset can address these concerns, yet there is no set approach for defining all of the steps involved in such data flow end-to-end. We aimed to

15

Baviskar, Dipali, Swati Ahirrao, and Ketan Kotecha. "Multi-Layout Invoice Document Dataset (MIDD): A Dataset for Named Entity Recognition." Data 6, no. 7 (2021): 78. http://dx.doi.org/10.3390/data6070078.

Abstract:

The day-to-day working of an organization produces a massive volume of unstructured data in the form of invoices, legal contracts, mortgage processing forms, and many more. Organizations can utilize the insights concealed in such unstructured documents for their operational benefit. However, analyzing and extracting insights from such numerous and complex unstructured documents is a tedious task. Hence, the research in this area is encouraging the development of novel frameworks and tools that can automate the key information extraction from unstructured documents. However, the availability of

16

Hidayat, Fahrul, and Rakyan Paksi Nagara. "DATASET BATAS WILAYAH ADMINISTRASI UNTUK PENATAAN RUANG WILAYAH." Seminar Nasional Geomatika 3 (February 15, 2019): 441. http://dx.doi.org/10.24895/sng.2018.3-0.984.

Abstract:

Indonesia's era of political decentralization has been under way for 20 years, yet boundary problems remain a burden for the government at both the central and regional levels. Ministry of Home Affairs data from January 2018 show that the regional administrative boundaries that already have a legal basis amount to 48.47%, or 475 segments. The percentages of segments still in the process of affirmation and not yet affirmed are 34.59% and 16.94%, respectively. Boundaries should be clear and legal before being used in the administrative processes of a region, including spatial planning. The aim of

17

Paul, Shounak, Pawan Goyal, and Saptarshi Ghosh. "LeSICiN: A Heterogeneous Graph-Based Approach for Automatic Legal Statute Identification from Indian Legal Documents." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (2022): 11139–46. http://dx.doi.org/10.1609/aaai.v36i10.21363.

Abstract:

The task of Legal Statute Identification (LSI) aims to identify the legal statutes that are relevant to a given description of facts or evidence of a legal case. Existing methods only utilize the textual content of facts and legal articles to guide such a task. However, the citation network among case documents and legal statutes is a rich source of additional information, which is not considered by existing models. In this work, we take the first step towards utilising both the text and the legal citation network for the LSI task. We curate a large novel dataset for this task, including facts

18

Chen, Zhe, Hongli Zhang, Lin Ye, and Shang Li. "An Approach Based on Multilevel Convolution for Sentence-Level Element Extraction of Legal Text." Wireless Communications and Mobile Computing 2021 (December 24, 2021): 1–12. http://dx.doi.org/10.1155/2021/1043872.

Abstract:

In the judicial field, with the increase of legal text data, the extraction of legal text elements plays a more and more important role. In this paper, we propose a sentence-level model of legal text element extraction based on the structure of multilabel text classification. Our proposed model contains an encoder and an improved decoder. The encoder applies multilevel convolutional neural networks (CNN) and Long Short-Term Memory (LSTM) as feature extraction networks to extract local neighborhood and context information from legal text, and a decoder applies LSTM with multiattention and full

19

Bros, Victor, and Daniel Gatica-Perez. "The Suisse Romande Local News Dataset." Proceedings of the International AAAI Conference on Web and Social Media 19 (June 7, 2025): 2396–401. https://doi.org/10.1609/icwsm.v19i1.35942.

Abstract:

This paper introduces a comprehensive dataset of news articles sourced from ESH Médias, a prominent local press agency in Romandy, the French-speaking region of Switzerland. The dataset encompasses all articles published on their digital platforms from January 2015 through June 2022. With over 130,000 articles written in French, this dataset offers a rich insight into local news from the French-speaking cantons of Switzerland. The articles cover a diverse range of topics and provide valuable material for Natural Language Processing and media studies. To respect privacy and legal considerations

20

Deakin, Simon, Zoe Adams, Parisa Bastani, and Louise Bishop. "The CBR-LRI Dataset: Methods, Properties and Potential of Leximetric Coding of Labour Laws." International Journal of Comparative Labour Law and Industrial Relations 33, Issue 1 (2017): 59–91. http://dx.doi.org/10.54648/ijcl2017004.

Abstract:

Leximetric data coding techniques aim to measure cross-national and inter-temporal variations in the content of legal rules, thereby facilitating statistical analysis of legal systems and their social and economic impacts. In this article we explain how leximetric methods were used to create the CBR Labour Regulation Index (CBR-LRI), an index and related dataset of labour laws from around the world spanning the period from 1970 to 2013. Datasets of this kind must, we suggest, observe certain conventions of transparency and validity if they are to be usable in statistical analysis. The theoreti

21

Yulianti, Evi, Naradhipa Bhary, Jafar Abdurrohman, Fariz Wahyuzan Dwitilas, Eka Qadri Nuranti, and Husna Sarirah Husin. "Named entity recognition on Indonesian legal documents: a dataset and study using transformer-based models." International Journal of Electrical and Computer Engineering (IJECE) 14, no. 5 (2024): 5489. http://dx.doi.org/10.11591/ijece.v14i5.pp5489-5501.

Abstract:

The large volume of court decision documents in Indonesia poses a challenge for researchers to assist legal practitioners in extracting useful information from the documents. This information can also benefit the general public by improving legal transparency, law enforcement, and people's understanding of the law implementation in Indonesia. A natural language processing task that extracts important information from a document is called named entity recognition (NER). In this study, the NER task is applied to legal domains, which is then referred to as legal entity recognition (LER) task. In

22

Leng, Sihan, Xiaojun Kang, Qingzhong Liang, Xinchuan Li, and Yuanyuan Fan. "Accusations and Law Articles Prediction in the Field of Environmental Protection." Applied Sciences 15, no. 1 (2024): 280. https://doi.org/10.3390/app15010280.

Abstract:

Legal judgment prediction is a common basic task in the field of Legal AI, aimed at using deep domain models to predict the outcomes of judicial cases, such as charges, legal provisions, and other related tasks. This task has practical applications in environmental law, including legal decision assistance and legal advice, offering a promising and broad prospect. However, most previous studies focus on using high-quality labeled data for strong supervised training in criminal justice, often neglecting the rich external knowledge contained in various charges and laws. This approach fails to acc

23

Deepak Nair, Prof. "Legal Solutions - GenAI." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem48096.

Abstract:

This paper presents LegalRAG Assistant, an AI-powered legal chatbot platform that leverages Generative AI (GenAI), Large Language Models (LLMs), and Retrieval-Augmented Generation (RAG) to provide accurate, context-aware responses to legal queries. The system integrates Indian legal frameworks including the Bharatiya Nyaya Sanhita (BNS) and RERA guidelines to support real-time legal consultation and document analysis. A structured pipeline was developed using vector embeddings (via Sentence-BERT) and FAISS for efficient semantic retrieval of legal texts, which are then fed into a fine

24

Kuznetsov, M. D. "Recognition and Extraction of Named Entities from the User Agreements Corpus." LETI Transactions on Electrical Engineering & Computer Science 18, no. 3 (2025): 78–86. https://doi.org/10.32603/2071-8985-2025-18-3-78-86.

Abstract:

Data analysis and mining are used to solve a variety of different problems, but their effective use requires high-quality and large datasets. Open publication of such datasets is not always possible in accordance with the law. The presence of personal data in datasets necessitates their processing and cleaning before open publication. In particular, the PPInRussian text dataset created in 2024 for studying aspects of personal data processing cannot be published, but it has the potential to become a useful tool for both computer security researchers and legal scholars. This paper discusses mode

25

Nurbayev, Daniyar. "The rule of law, central bank independence and price stability." Journal of Institutional Economics 14, no. 4 (2017): 659–87. http://dx.doi.org/10.1017/s1744137417000261.

Abstract:

This work empirically investigates the effect of the interaction between the rule of law and legal central bank independence (CBI) on price stability (the level of inflation and inflation volatility), employing a panel dataset that covers up to 124 countries over the period from 1970 to 2013. A new, largely complete legal CBI dataset, covering 182 countries was used for the work. The results indicate that the effect of legal CBI on price stability depends on the strength of the rule of law. Moreover, the results reveal that legal CBI has no significant effect on price stability when th

26

Pham, Vuong, Hoang Huy Le, Thinh Phu Ngo, Binh Nguyen, Diem Nguyen, and Hien D. Nguyen. "Enhancing legal research through knowledge-infused information retrieval for Vietnamese labor law." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 4 (2024): 3962. http://dx.doi.org/10.11591/ijai.v13.i4.pp3962-3973.

Abstract:

The role of intelligent information retrieval systems in legal research optimization has become increasingly recognized. There are many methods for exhibiting advancements in the proficient retrieval of legal documents. However, those methods fail to tackle the specific challenges encountered in real-world labor law searches. This research breaks new ground in Vietnamese labor law retrieval by leveraging a comprehensive dataset of 300,000 documents across diverse categories (20 document types and 27 legal fields) to train and evaluate retrieval models specifically designed for Vietn

27

Munshi, Amr Abdullah, Wesam Hasan AlSabban, Abdullah Tarek Farag, Omar Essam Rakha, Ahmad Al Sallab, and Majid Alotaibi. "Automated Islamic Jurisprudential Legal Opinions Generation Using Artificial Intelligence." Pertanika Journal of Science and Technology 30, no. 2 (2022): 1135–56. http://dx.doi.org/10.47836/pjst.30.2.16.

Abstract:

Islam is the second-largest and fastest-growing religion. The Islamic Law, Sharia, represents a profound component of the day-to-day lives of Muslims. While sources of Sharia are available for anyone, it often requires a highly qualified person, the Mufti, to provide Fatwa. With Islam followers representing almost 25% of the planet earth population, generating many queries, and the sophistication of the Mufti qualification process, creating a shortage in them, we have a supply-demand problem, calling for Automation solutions. This scenario motivates the application of Artificial Intelligence (

28

Kemala, Ade Putera, and Hafizh Ash Shiddiqi. "Analysis of Indonesian Language Dataset for Tax Court Cases: Multiclass Classification of Court Verdicts." Jurnal Riset Informatika 5, no. 3 (2023): 419–24. http://dx.doi.org/10.34288/jri.v5i3.555.

Abstract:

Tax is an obligation that arises due to the existence of laws, creating a duty for citizens to contribute a certain portion of their income to the state. The Tax Court serves as a judicial authority for taxpayers seeking justice in tax disputes, handling various types of taxes daily. This paper analyzes an Indonesian language dataset of tax court cases, aiming to perform multiclass classification to predict court verdicts. The dataset undergoes preprocessing steps, while data augmentation using oversampling and label weighting techniques addresses class imbalance. Two models, bi-LSTM and IndoB

29

Kemala, Ade Putera, and Hafizh Ash Shiddiqi. "Analysis of Indonesian Language Dataset for Tax Court Cases: Multiclass Classification of Court Verdicts." Jurnal Riset Informatika 5, no. 3 (2023): 419–24. http://dx.doi.org/10.34288/jri.v5i3.236.

Abstract:

Tax is an obligation that arises due to the existence of laws, creating a duty for citizens to contribute a certain portion of their income to the state. The Tax Court serves as a judicial authority for taxpayers seeking justice in tax disputes, handling various types of taxes on a daily basis. This paper presents an analysis of an Indonesian language dataset of tax court cases, aiming to perform multiclass classification to predict court verdicts. The dataset undergoes preprocessing steps, while data augmentation using oversampling and label weighting techniques address class imbalance. Two m

30

Shankar, Atreya, Andreas Waldis, Christof Bless, Maria Andueza Rodriguez, and Luca Mazzola. "PrivacyGLUE: A Benchmark Dataset for General Language Understanding in Privacy Policies." Applied Sciences 13, no. 6 (2023): 3701. http://dx.doi.org/10.3390/app13063701.

Abstract:

Benchmarks for general language understanding have been rapidly developing in recent years of NLP research, particularly because of their utility in choosing strong-performing models for practical downstream applications. While benchmarks have been proposed in the legal language domain, virtually no such benchmarks exist for privacy policies despite their increasing importance in modern digital life. This could be explained by privacy policies falling under the legal language domain, but we find evidence to the contrary that motivates a separate benchmark for privacy policies. Consequently, we

31

Louis, Antoine, Gijs Van Dijck, and Gerasimos Spanakis. "Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 20 (2024): 22266–75. http://dx.doi.org/10.1609/aaai.v38i20.30232.

Abstract:

Many individuals are likely to face a legal dispute at some point in their lives, but their lack of understanding of how to navigate these complex issues often renders them vulnerable. The advancement of natural language processing opens new avenues for bridging this legal literacy gap through the development of automated legal aid systems. However, existing legal question answering (LQA) approaches often suffer from a narrow scope, being either confined to specific legal domains or limited to brief, uninformative responses. In this work, we propose an end-to-end methodology designed to genera

32

Liu, Xiaoqin, Chenou Xu, Fei Pu, Berlin Yang, and Shi Meng. "A Knowledge Graph Data Expansion Method Based on Relational Propensity Categories with Legal Applications." SHS Web of Conferences 181 (2024): 03022. http://dx.doi.org/10.1051/shsconf/202418103022.

Abstract:

By introducing the definition of propensity categories of relations, the implicit information in the knowledge graph is mined and the MCBE (Maximum Clique Based Expansion) algorithm is used for data expansion. Experimental results show that with baseline models TransE, RotatE, HAKE and Complex on FB15K dataset, its MRR and Hits@1 metrics are improved by (7.9%, 9.6%), (4.2%, 3.3%), (2.7%, 4.8%) and (1.7%, 2.4%), respectively. Experiments are also conducted on the FB15K, YAGO3-10, NELL-995 and DBpedia50 datasets using the TransE model as a baseline, and its MRR and Hits@1 metrics are improved on

33

Morris, Katherine, and Anita Arzoomanian. "Insights from Regulatory Data on Development Needs of Community Pharmacy Professionals." Pharmacy 8, no. 3 (2020): 111. http://dx.doi.org/10.3390/pharmacy8030111.

Abstract:

The aim of this study was to use data available to a Canadian health professions regulator (Ontario College of Pharmacists) to identify areas of opportunity where practitioners (pharmacists and pharmacy technicians) could benefit from further development, in order to optimize practice and improve the quality of care. Four de-identified datasets were used to extract themes from areas of jurisprudence (1969 exam records), member practice assessments (2610 records), pharmacy assessments (2024 records) and conduct (640 case records). Outcome measures included performance in examinations and assess

34

Feng, Yinchenyi, Feng Fan, Zhehao Xu, and Yiyun Xue. "The relationship between private ICT investment and the strength of legal rights indices." Advances in Operation Research and Production Management 1, no. 1 (2023): 1–4. http://dx.doi.org/10.54254/3029-0880/1/2023001.

Abstract:

Investment in ICT has been identified as an important driver of economic expansion and development. It is yet unclear how ICT affects the legal rights index, which regulates the activities of businesses. The proposal's goal is to examine how private ICT spending compares to the index measuring the robustness of legal rights for companies in Europe and Central Asia. This research proposal will investigate the relationship between private ICT investment and the strength of legal rights indices for companies throughout Europe and Central Asia. Dataset WDI.csv from the World Bank will be utilized

35

Feng, Yinchenyi, Feng Fan, Zhehao Xu, and Yiyun Xue. "The relationship between private ICT investment and the strength of legal rights indices." Advances in Operation Research and Production Management 1, no. 1 (2023): 1–4. http://dx.doi.org/10.54254/3006-1229/1/2023001.

Abstract:

Investment in ICT has been identified as an important driver of economic expansion and development. It is yet unclear how ICT affects the legal rights index, which regulates the activities of businesses. The proposal's goal is to examine how private ICT spending compares to the index measuring the robustness of legal rights for companies in Europe and Central Asia. This research proposal will investigate the relationship between private ICT investment and the strength of legal rights indices for companies throughout Europe and Central Asia. Dataset WDI.csv from the World Bank will be utilized

36

Lonardo, Luigi, and Andrea Palazzi. "A Dataset, Software Toolbox, and Interdisciplinary Research Agenda for the Common Foreign and Security Policy." European Foreign Affairs Review 25, no. 2 (2020): 281–96. http://dx.doi.org/10.54648/eerr2020023.

Abstract:

The article introduces a text corpus containing all the legal acts adopted by the European Union from 1 December 2009 till 30 June 2019; it also provides an open-source built-for-purpose software toolbox that can be used to re-create and manipulate the dataset. The dataset, as well as the software toolbox, are publicly available on GitHub at: https://github.com/ndrplz/eurlextoolbox. The article describes the content and possible uses of the dataset and maps out some potential applications thereof. Since EU law plays a key role in virtually all aspects of European integration, we believe that t

37

Naik, Varsha, and K. Rajeswari. "Indian Legal Judgment Summarization using LEGAL-BERT and BiLSTM model with Adaptive Length." EPJ Web of Conferences 328 (2025): 01043. https://doi.org/10.1051/epjconf/202532801043.

Abstract:

The Indian legal system is vast and complex, and the rapid expansion of legal documentation has created a pressing need for reliable and efficient summarization tools to support legal professionals and researchers. To help reduce the cost and time spent on reading and retrieving critical information from legal judgments, we introduce an automated summarization technique using deep learning models that helps legal professionals extract key rulings, arguments, and case outcomes quickly and efficiently. We compared two summarization techniques using deep neural networks, specifically LEGAL-BERT and bi

38

Bogdanović, Miloš, Jelena Kocić, and Leonid Stoimenov. "SRBerta—A Transformer Language Model for Serbian Cyrillic Legal Texts." Information 15, no. 2 (2024): 74. http://dx.doi.org/10.3390/info15020074.

Abstract:

Language is a unique ability of human beings. Although relatively simple for humans, the ability to understand human language is a highly complex task for machines. For a machine to learn a particular language, it must understand not only the words and rules used in a particular language, but also the context of sentences and the meaning that words take on in a particular context. In the experimental development we present in this paper, the goal was the development of the language model SRBerta—a language model designed to understand the formal language of Serbian legal documents. SRBerta is

39

Zhang, Ke, Yufei Tu, Jun Lu, et al. "Multi-Head Hierarchical Attention Framework with Multi-Level Learning Optimization Strategy for Legal Text Recognition." Electronics 14, no. 10 (2025): 1946. https://doi.org/10.3390/electronics14101946.

Abstract:

Owing to the rapid increase in the amount of legal text data and the increasing demand for intelligent processing, multi-label legal text recognition is becoming increasingly important in practical applications such as legal information retrieval and case classification. However, traditional methods have limitations in handling the complex semantics and multi-label characteristics of legal texts, making it difficult to accurately extract feature and effective category information. Therefore, this study proposes a novel multi-head hierarchical attention framework suitable for multi-label legal

40

Mosteiro, Pablo, Ruilin Wang, Floortje Scheepers, and Marco Spruit. "Investigating De-Identification Methodologies in Dutch Medical Texts: A Replication Study of Deduce and Deidentify." Electronics 14, no. 8 (2025): 1636. https://doi.org/10.3390/electronics14081636.

Abstract:

Deidentifying sensitive information in electronic health records (EHRs) is increasingly important as legal obligations to data privacy evolve along with the need to protect patient and institutional confidentiality. This study aims to comparatively evaluate the performance of two state-of-the-art deidentification systems, Deduce and Deidentify, on both real-world and synthetic Dutch medical texts, thereby providing insights into their relative strengths and limitations in preserving privacy while maintaining data utility. We employ a replication-extension research design, utilizing two distinc

41

Carlotti, Danilo. "TEXT SUMMARIZATION AS AN EMPIRICAL LEGAL RESEARCH TOOL." Revista de Estudos Empíricos em Direito 10 (November 25, 2022): 1–17. http://dx.doi.org/10.19092/reed.v10.600.

Abstract:

This paper uses text summarization techniques as a tool for empirical legal research, creating a summary of the decisions given the phrases' predictive power with regard to the decision outcome. The dataset consists of habeas corpus decisions from various courts in Brazil that explicitly cite the COVID pandemic as a reason for requesting the release of the patients. A predictive model is created, and through this analysis we propose to find the arguments most correlated with the outcome.

42

Mosaher, Quazi Saad-ul, and Mousumi Hasan. "Offline Handwritten Signature Recognition Using Deep Convolution Neural Network." European Journal of Engineering and Technology Research 7, no. 4 (2022): 44–47. http://dx.doi.org/10.24018/ejeng.2022.7.4.2851.

Abstract:

In the modern age, technological advancement reached a new limit where authentication plays a vital role in security management. Biometric-based authentication is the most referenced procedure for authentication where signature verification is a significant part of it for authentication of a person. To prevent the falsification of signatures on important documents & legal transactions it is necessary to recognize a person's signature accurately. This paper focused on recognizing offline handwritten original & forged signatures using a deep convolution neural network. We use a completel

43

Buschmann, Andy. "Introducing the Myanmar Protest Event Dataset: Motivation, Methodology, and Research Prospects." Journal of Current Southeast Asian Affairs 37, no. 2 (2018): 125–42. http://dx.doi.org/10.1177/186810341803700205.

Abstract:

This article presents the Myanmar Protest Event Dataset, a unique dataset on protest assemblies in transitional Myanmar/Burma. The data contents were derived from the most visible forms of assembly – demonstrations, protest marches and labour strikes – and collected through a protest event analysis of local news reports. The coded variables range from information on the actual moment of the protest event, such as participants, issue, duration and location, to the aftermath, including variables related to legal consequences for protesters and the success of protesters’ claims, and many others.

44

Rice, Douglas, Jesse H. Rhodes, and Tatishe Nteta. "Racial bias in legal language." Research & Politics 6, no. 2 (2019): 205316801984893. http://dx.doi.org/10.1177/2053168019848930.

Abstract:

Although racial bias in the law is widely recognized, it remains unclear how these biases are entrenched in the language of the law, judicial opinions. In this article, we build on recent research introducing an approach to measuring the presence of implicit racial bias in large-scale corpora. Utilizing an original dataset of more than one million appellate court opinions from US state and federal courts, we estimate word embeddings for the more than 400,000 most common words found in legal opinions. In a series of analyses, we find strong and consistent evidence of implicit racial bias, as

45

Shankar, Atreya, Andreas Waldis, Christof Bless, Rodriguez Maria Andueza, and Luca Mazzola. "PrivacyGLUE: A Benchmark Dataset for General Language Understanding in Privacy Policies." Applied Sciences / MDPI 13, no. 6 (2023): 3701. https://doi.org/10.5281/zenodo.8082892.

Abstract:

Featured Application: We propose the PrivacyGLUE benchmark to compare and contrast NLP models' general language understanding in the privacy language domain. This will help practitioners in selecting understanding models for applications within the privacy language domain. Abstract: Benchmarks for general language understanding have been rapidly developing in recent years of NLP research, particularly because of their utility in choosing strong-performing models for practical downstream applications. While benchmarks have been proposed in the legal language domain, virtually no such benchma

46

Aejas, Bajeela, Abdelhak Belhi, and Abdelaziz Bouras. "Using AI to Ensure Reliable Supply Chains: Legal Relation Extraction for Sustainable and Transparent Contract Automation." Sustainability 17, no. 9 (2025): 4215. https://doi.org/10.3390/su17094215.

Abstract:

Efficient contract management is essential for ensuring sustainable and reliable supply chains; yet, traditional methods remain manual, error-prone, and inefficient, leading to delays, financial risks, and compliance challenges. AI and blockchain technology offer a transformative alternative, enabling the establishment of automated, transparent, and self-executing smart contracts that enhance efficiency and sustainability. As part of AI-driven smart contract automation, we previously implemented contractual clause extraction using question answering (QA) and named entity recognition (NER). Thi

47

Ch, Aomuerlige, and Siriguleng WANG. "A dataset of the Chinese-Mongolian bilingual question-answer corpus in the legal field." China Scientific Data 9, no. 4 (2024): 1–9. https://doi.org/10.11922/11-6035.csd.2024.0031.

48

Pandey, Radhika, Rajeswari Sengupta, Aatmin Shah, and Bhargavi Zaveri. "Legal restrictions on foreign institutional investors in a large, emerging economy: A comprehensive dataset." Data in Brief 28 (February 2020): 104819. http://dx.doi.org/10.1016/j.dib.2019.104819.

49

Koniaris, Marios, Dimitris Galanis, Eugenia Giannini, and Panayiotis Tsanakas. "Evaluation of Automatic Legal Text Summarization Techniques for Greek Case Law." Information 14, no. 4 (2023): 250. http://dx.doi.org/10.3390/info14040250.

Abstract:

The increasing amount of legal information available online is overwhelming for both citizens and legal professionals, making it difficult and time-consuming to find relevant information and keep up with the latest legal developments. Automatic text summarization techniques can be highly beneficial as they save time, reduce costs, and lessen the cognitive load of legal professionals. However, applying these techniques to legal documents poses several challenges due to the complexity of legal documents and the lack of needed resources, especially in linguistically under-resourced languages, suc

50

Sarretta, A., and M. Minghini. "TOWARDS THE INTEGRATION OF AUTHORITATIVE AND OPENSTREETMAP GEOSPATIAL DATASETS IN SUPPORT OF THE EUROPEAN STRATEGY FOR DATA." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLVI-4/W2-2021 (August 19, 2021): 159–66. http://dx.doi.org/10.5194/isprs-archives-xlvi-4-w2-2021-159-2021.

Abstract:

Digital transformation is at the core of Europe’s future and the importance of data is well highlighted by the recently published European strategy for data, which envisions the establishment of so-called European data spaces enabling seamless data flows across actors and sectors to ultimately boost the economy and generate innovation. Integrating datasets produced by multiple actors, including citizen-generated data, is a key objective of the strategy. This study focuses on OpenStreetMap (OSM), the most popular crowdsourced geographic information project, and is the first step towards a
