Academic literature on the topic 'Triplestore databáze'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Triplestore databáze.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Triplestore databáze"

1

Yamada, Issaku, Matthew P. Campbell, Nathan Edwards, et al. "The glycoconjugate ontology (GlycoCoO) for standardizing the annotation of glycoconjugate data and its application." Glycobiology 31, no. 7 (2021): 741–50. http://dx.doi.org/10.1093/glycob/cwab013.

Abstract:
Recent years have seen great advances in the development of glycoproteomics protocols and methods, resulting in a sustainable increase in the reporting of proteins, their attached glycans and glycosylation sites. However, only very few of these reports find their way into databases or data repositories. One of the major reasons is the absence of a digital standard to represent glycoproteins and the challenging annotations with glycans. Depending on the experimental method, such a standard must be able to represent glycans as complete structures or as compositions, store not just single glycans but also represent glycoforms on a specific glycosylation site, deal with partially missing site information if no site mapping was performed, and store abundances or ratios of glycans within a glycoform of a specific site. To support the above, we have developed the GlycoConjugate Ontology (GlycoCoO) as a standard semantic framework to describe and represent glycoproteomics data. GlycoCoO can be used to represent glycoproteomics data in triplestores and can serve as a basis for data exchange formats. The ontology, database providers and supporting documentation are available online (https://github.com/glycoinfo/GlycoCoO).
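The representation task described above lends itself to a small illustration. The sketch below uses Python's rdflib to express one glycosylation-site annotation as triples; the gco: namespace and all property names are invented for illustration and are not taken from the actual GlycoCoO vocabulary.

```python
from rdflib import Graph, Namespace, Literal, RDF, XSD

# Hypothetical namespaces for illustration only; the real GlycoCoO terms
# are defined at https://github.com/glycoinfo/GlycoCoO.
GCO = Namespace("http://example.org/glycocoo#")
EX = Namespace("http://example.org/data/")

g = Graph()
g.bind("gco", GCO)

# One glycoprotein with a single annotated glycosylation site.
protein = EX["protein/P01857"]
site = EX["protein/P01857/site/297"]
glycan = EX["glycan/G00912UN"]

g.add((protein, RDF.type, GCO.Glycoprotein))
g.add((protein, GCO.hasGlycosylationSite, site))
g.add((site, GCO.sitePosition, Literal(297, datatype=XSD.integer)))
g.add((site, GCO.hasGlycan, glycan))
# Relative abundance of this glycan within the site's glycoform profile.
g.add((site, GCO.glycanAbundance, Literal(0.42, datatype=XSD.decimal)))

print(g.serialize(format="turtle"))
```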
2

Peroni, Silvio, and David Shotton. "OpenCitations, an infrastructure organization for open scholarship." Quantitative Science Studies 1, no. 1 (2020): 428–44. http://dx.doi.org/10.1162/qss_a_00023.

Abstract:
OpenCitations is an infrastructure organization for open scholarship dedicated to the publication of open citation data as Linked Open Data using Semantic Web technologies, thereby providing a disruptive alternative to traditional proprietary citation indexes. Open citation data are valuable for bibliometric analysis, increasing the reproducibility of large-scale analyses by enabling publication of the source data. Following brief introductions to the development and benefits of open scholarship and to Semantic Web technologies, this paper describes OpenCitations and its data sets, tools, services, and activities. These include the OpenCitations Data Model; the SPAR (Semantic Publishing and Referencing) Ontologies; OpenCitations’ open software of generic applicability for searching, browsing, and providing REST APIs over resource description framework (RDF) triplestores; Open Citation Identifiers (OCIs) and the OpenCitations OCI Resolution Service; the OpenCitations Corpus (OCC), a database of open downloadable bibliographic and citation data made available in RDF under a Creative Commons public domain dedication; and the OpenCitations Indexes of open citation data, of which the first and largest is COCI, the OpenCitations Index of Crossref Open DOI-to-DOI Citations, which currently contains over 624 million bibliographic citations and is receiving considerable usage by the scholarly community.
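As a small illustration of consuming such open citation data, the sketch below queries an RDF triplestore for citation links using the cito:cites property from the SPAR ontologies mentioned in the abstract. The endpoint URL is an assumption (OpenCitations documents its current SPARQL endpoints on its website), and the query is deliberately generic.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Assumed endpoint address; check opencitations.net for the current one.
ENDPOINT = "https://opencitations.net/index/sparql"

sparql = SPARQLWrapper(ENDPOINT)
sparql.setReturnFormat(JSON)

# Ask which entities cite something, via the cito:cites property
# from the SPAR (Semantic Publishing and Referencing) ontologies.
sparql.setQuery("""
PREFIX cito: <http://purl.org/spar/cito/>
SELECT ?citing ?cited WHERE {
  ?citing cito:cites ?cited .
} LIMIT 10
""")

results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["citing"]["value"], "->", row["cited"]["value"])
```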
3

Honti, Gergely, and János Abonyi. "Frequent Itemset Mining and Multi-Layer Network-Based Analysis of RDF Databases." Mathematics 9, no. 4 (2021): 450. http://dx.doi.org/10.3390/math9040450.

Abstract:
Triplestores or resource description framework (RDF) stores are purpose-built databases used to organise, store and share data with context. Knowledge extraction from a large amount of interconnected data requires effective tools and methods to address the complexity and the underlying structure of semantic information. We propose a method that generates an interpretable multilayered network from an RDF database. The method utilises frequent itemset mining (FIM) of the subjects, predicates and objects of the RDF data, and automatically extracts informative subsets of the database for the analysis. The results are used to form layers in an analysable multidimensional network. The methodology enables a consistent, transparent, multi-aspect-oriented knowledge extraction from the linked dataset. To demonstrate the usability and effectiveness of the methodology, we analyse how the sciences of sustainability and climate change are structured using the Microsoft Academic Knowledge Graph. In the case study, the FIM forms networks of disciplines to reveal the significant interdisciplinary science communities in sustainability and climate change. The constructed multilayer network then enables an analysis of the significant disciplines and interdisciplinary scientific areas. To demonstrate the proposed knowledge extraction process, we search for interdisciplinary science communities and then measure and rank their multidisciplinary effects. The analysis identifies discipline similarities, pinpointing the similarity between atmospheric science and meteorology as well as between geomorphology and oceanography. The results confirm that frequent itemset mining provides informative sampled subsets of RDF databases which can be simultaneously analysed as layers of a multilayer network.
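The core idea of treating RDF data as transactions for frequent itemset mining can be shown with a toy sketch. The Python code below groups the predicates of each subject into one transaction and counts frequent predicate pairs; this is a simplification for illustration, not the authors' implementation, and the input file name and support threshold are assumptions.

```python
from collections import defaultdict, Counter
from itertools import combinations
from rdflib import Graph

# Toy illustration: treat the set of predicates attached to each subject
# as one "transaction", then count frequent predicate pairs (2-itemsets).
g = Graph()
g.parse("data.ttl", format="turtle")  # assumed local RDF file

transactions = defaultdict(set)
for s, p, o in g:
    transactions[s].add(p)

pair_counts = Counter()
for preds in transactions.values():
    for pair in combinations(sorted(preds), 2):
        pair_counts[pair] += 1

min_support = 0.05 * len(transactions)  # assumed support threshold
frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= min_support}

# Each frequent predicate pair could then define one layer of a multilayer
# network whose nodes are the subjects that share both predicates.
for pair, n in sorted(frequent_pairs.items(), key=lambda kv: -kv[1])[:10]:
    print(n, pair)
```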
4

Hachey, B., C. Grover, and R. Tobin. "Datasets for generic relation extraction." Natural Language Engineering 18, no. 1 (2011): 21–59. http://dx.doi.org/10.1017/s1351324911000106.

Abstract:
A vast amount of usable electronic data is in the form of unstructured text. The relation extraction task aims to identify useful information in text (e.g. PersonW works for OrganisationX, GeneY encodes ProteinZ) and recode it in a format such as a relational database or RDF triplestore that can be more effectively used for querying and automated reasoning. A number of resources have been developed for training and evaluating automatic systems for relation extraction in different domains. However, comparative evaluation is impeded by the fact that these corpora use different markup formats and notions of what constitutes a relation. We describe the preparation of corpora for comparative evaluation of relation extraction across domains based on the publicly available ACE 2004, ACE 2005 and BioInfer data sets. We present a common document type using token standoff and including detailed linguistic markup, while maintaining all information in the original annotation. The subsequent reannotation process normalises the two data sets so that they comply with a notion of relation that is intuitive, simple and informed by the semantic web. For the ACE data, we describe an automatic process that converts many relations involving nested, nominal entity mentions to relations involving non-nested, named or pronominal entity mentions. For example, the first entity is mapped from ‘one’ to ‘Amidu Berry’ in the membership relation described in ‘Amidu Berry, one half of PBS’. Moreover, we describe a comparably reannotated version of the BioInfer corpus that flattens nested relations, maps part-whole to part-part relations and maps n-ary to binary relations. Finally, we summarise experiments that compare approaches to generic relation extraction, a knowledge discovery task that uses minimally supervised techniques to achieve maximally portable extractors. These experiments illustrate the utility of the corpora.
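A minimal sketch of the recoding step mentioned at the start of the abstract, i.e. turning extracted relations into RDF that a triplestore can hold, might look as follows with rdflib. The namespace, property names and URI-minting scheme are illustrative assumptions, not part of the described corpora.

```python
from rdflib import Graph, Namespace, URIRef, Literal, RDF

# Illustrative namespace; not taken from the ACE or BioInfer annotation.
EX = Namespace("http://example.org/relex/")

# Output of a (hypothetical) relation extractor: (subject, relation, object).
extracted = [
    ("Amidu Berry", "memberOf", "PBS"),
    ("PersonW", "worksFor", "OrganisationX"),
]

def mint_uri(label: str) -> URIRef:
    """Very naive URI minting, for the sake of the example only."""
    return EX[label.replace(" ", "_")]

g = Graph()
g.bind("ex", EX)
for subj, rel, obj in extracted:
    s, o = mint_uri(subj), mint_uri(obj)
    g.add((s, EX[rel], o))
    g.add((s, RDF.type, EX.Entity))
    g.add((o, RDF.type, EX.Entity))
    g.add((s, EX.label, Literal(subj)))
    g.add((o, EX.label, Literal(obj)))

# The resulting graph can be loaded into any RDF triplestore and queried
# with SPARQL instead of bespoke text processing.
print(g.serialize(format="turtle"))
```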
5

Piirainen, Esko, Eija-Leena Laiho, Tea von Bonsdorff, and Tapani Lahti. "Managing Taxon Data in FinBIF." Biodiversity Information Science and Standards 3 (June 26, 2019). http://dx.doi.org/10.3897/biss.3.37422.

Abstract:
The Finnish Biodiversity Information Facility, FinBIF (https://species.fi), has developed its own taxon database. This allows FinBIF taxon specialists to maintain their own, expert-validated view of Finnish species. The database covers national needs and can be rapidly expanded by our own development team. Furthermore, in the database each taxon is given a globally unique persistent URI identifier (https://www.w3.org/TR/uri-clarification), which refers to the taxon concept, not just to the name. The identifier doesn’t change if the taxon concept doesn’t change. We aim to ensure compatibility with checklists from other countries by linking taxon concepts as Linked Data (https://www.w3.org/wiki/LinkedData) — a work started as a part of the Nordic e-Infrastructure Collaboration (NeIC) DeepDive project (https://neic.no/deepdive). The database is used as a basis for observation/specimen searches, e-Learning and identification tools, and it is browsable by users of the FinBIF portal. The data is accessible to everyone under CC-BY 4.0 license (https://creativecommons.org/licenses/by/4.0) in machine readable formats. The taxon specialists maintain the taxon data using a web application. Currently, there are 60 specialists. All changes made to the data go live every night. The nightly update interval allows the specialists a grace period to make their changes. Allowing the taxon specialists to modify the taxonomy database themselves leads to some challenges. To maintain the integrity of critical data, such as lists of protected species, we have had to limit what the specialists can do. Changes to critical data is carried out by an administrator. The database has special features for linking observations to the taxonomy. These include hidden species aggregates and tools to override how a certain name used in observations is linked to the taxonomy. Misapplied names remain an unresolved problem. The most precise way to record an observation is to use a taxon concept: Most observations are still recorded using plain names, but it is possible for the observer to pick a concept. Also, when data is published in FinBIF from other information systems, the data providers can link their observations to the concepts using the identifiers of concepts. The ability to use taxon concepts as basis of observations means we have to maintain the concepts over time — a task that may become arduous in the future (Fig. 1). As it stands now, the FinBIF taxon data model — including adjacent classes such as publication, person, image, and endangerment assessments — consists of 260 properties. If the data model were stored in a normalized relational database, there would be approximately 56 tables, which could be difficult to maintain. Keeping track of a complete history of data is difficult in relational databases. Alternatively, we could use document storage to store taxon data. However, there are some difficulties associated with document storages: (1) much work is required to implement a system that does small atomic update operations; (2) batch updates modifying multiple documents usually require writing a script; and (3) they are not ideal for doing searches. We use a document storage for observation data, however, because they are well suited for storing large quantities of complex records. In FinBIF, we have decided to use a triplestore for all small datasets, such as taxon data. More specifically, the data is stored according to the RDF specification (https://www.w3.org/RDF). 
An RDF Schema defines the allowed properties for each class. Our triplestore implementation is an Oracle relational database with two tables (resource and statement), which gives us the ability to do SQL queries and updates. Doing small atomic updates is easy as only a small subset of the triplets can be updated instead of the entire data entity. Maintaining a complete record of history comes without much effort, as it can be done on an individual triplet level. For performance-critical queries, the taxon data is loaded into an Elasticsearch (https://www.elastic.co) search engine.
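The two-table triplestore layout described above can be illustrated with a small sketch. The following uses SQLite rather than Oracle, and the column names, history columns and example URIs are assumptions made for illustration, not FinBIF's actual schema.

```python
import sqlite3

# Illustrative sketch of a "two tables: resource and statement" design.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE resource (
    id   INTEGER PRIMARY KEY,
    uri  TEXT UNIQUE NOT NULL
);
CREATE TABLE statement (
    subject_id   INTEGER NOT NULL REFERENCES resource(id),
    predicate_id INTEGER NOT NULL REFERENCES resource(id),
    object_id    INTEGER REFERENCES resource(id),
    literal      TEXT,            -- used when the object is a literal value
    valid_from   TEXT NOT NULL,   -- per-triplet history, as described above
    valid_to     TEXT
);
""")

def resource_id(uri: str) -> int:
    cur.execute("INSERT OR IGNORE INTO resource (uri) VALUES (?)", (uri,))
    cur.execute("SELECT id FROM resource WHERE uri = ?", (uri,))
    return cur.fetchone()[0]

# Store one triplet: <taxon/MX.1> <scientificName> "Lutra lutra"
cur.execute(
    "INSERT INTO statement (subject_id, predicate_id, literal, valid_from) "
    "VALUES (?, ?, ?, date('now'))",
    (resource_id("http://example.org/taxon/MX.1"),
     resource_id("http://example.org/schema/scientificName"),
     "Lutra lutra"),
)
conn.commit()
```

Updating a single property then touches only the matching statement rows, and superseded rows can simply be closed with valid_to instead of being deleted, which is the history-keeping behaviour the abstract describes.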
6

Günther, Taras, Matthias Filter, and Fernanda Dórea. "Making Linked Data accessible for One Health Surveillance with the 'One Health Linked Data Toolbox'." ARPHA Conference Abstracts 4 (May 28, 2021). http://dx.doi.org/10.3897/aca.4.e68821.

Abstract:
In times of emerging diseases, data sharing and data integration are of particular relevance for One Health Surveillance (OHS) and decision support. Furthermore, there is an increasing demand to provide governmental data in compliance with the FAIR (Findable, Accessible, Interoperable, Reusable) data principles. Semantic web technologies are key facilitators for providing data interoperability, as they allow explicit annotation of data with their meaning, enabling reuse without loss of the data collection context. Among these, we highlight ontologies as a tool for modeling knowledge in a field, which simplify the interpretation and mapping of datasets in a computer readable medium; and the Resource Description Framework (RDF), which allows data to be shared among human and computer agents following this knowledge model. Despite their potential for enabling cross-sectoral interoperability and data linkage, the use and application of these technologies is often hindered by their complexity and the lack of easy-to-use software applications. To overcome these challenges, the OHEJP Project ORION developed the Health Surveillance Ontology (HSO). This knowledge model forms a foundation for semantic interoperability in the domain of One Health Surveillance. It provides a solution to add data from the target sectors (public health, animal health and food safety) in compliance with the FAIR principles of findability, accessibility, interoperability, and reusability, supporting interdisciplinary data exchange and usage. To provide use cases and facilitate the accessibility to HSO, we developed the One Health Linked Data Toolbox (OHLDT), which consists of three new and custom-developed web applications with specific functionalities. The first web application allows users to convert surveillance data available in Excel files online into HSO-RDF and vice versa. The web application demonstrates that data provided in well-established data formats can be automatically translated into the linked data format HSO-RDF. The second application is a demonstrator of the usage of HSO-RDF in an HSO triplestore database. In the user interface of this application, the user can select HSO concepts based on which to search and filter among surveillance datasets stored in an HSO triplestore database. The service then provides automatically generated dashboards based on the context of the data. The third web application demonstrates the use of data interoperability in the OHS context by using HSO-RDF to annotate meta-data, and in this way link datasets across sectors. The web application provides a dashboard to compare public data on zoonosis surveillance provided by EFSA and ECDC. The first solution enables linked data production, while the second and third provide examples of linked data consumption, and their value in enabling data interoperability across sectors. All described solutions are based on the open-source software KNIME and are deployed as a web service via a KNIME Server hosted at the German Federal Institute for Risk Assessment. The semantic web extension of KNIME, which is based on the Apache Jena Framework, allowed rapid and easy development within the project. The underlying open source KNIME workflows are freely available and can be easily customized by interested end users. With our applications, we demonstrate that the use of linked data has great potential for strengthening the use of FAIR data in OHS and interdisciplinary data exchange.
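The first application's conversion of tabular surveillance data into RDF can be sketched as follows. The real toolbox is built as KNIME workflows; this Python version is only an illustration, and the hso: terms and example rows are invented stand-ins, not actual Health Surveillance Ontology IRIs.

```python
from rdflib import Graph, Namespace, Literal, RDF, XSD

# Illustrative namespaces; the real HSO IRIs are published by the ORION project.
HSO = Namespace("http://example.org/hso#")
EX = Namespace("http://example.org/surveillance/")

rows = [  # stand-in for rows read from an Excel sheet
    {"id": "S001", "hazard": "Salmonella", "sector": "food safety", "count": 12},
    {"id": "S002", "hazard": "Salmonella", "sector": "animal health", "count": 7},
]

g = Graph()
g.bind("hso", HSO)
for row in rows:
    obs = EX[row["id"]]
    g.add((obs, RDF.type, HSO.SurveillanceObservation))
    g.add((obs, HSO.hazard, Literal(row["hazard"])))
    g.add((obs, HSO.sector, Literal(row["sector"])))
    g.add((obs, HSO.sampleCount, Literal(row["count"], datatype=XSD.integer)))

# The serialized graph could then be loaded into a triplestore and filtered
# by concept, as in the second application described above.
print(g.serialize(format="turtle"))
```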
7

Heikkinen, Mikko, Ville-Matti Riihikoski, Anniina Kuusijärvi, Dare Talvitie, Tapani Lahti, and Leif Schulman. "Kotka - A national multi-purpose collection management system." Biodiversity Information Science and Standards 3 (June 18, 2019). http://dx.doi.org/10.3897/biss.3.37179.

Abstract:
Many natural history museums share a common problem: a multitude of legacy collection management systems (CMS) and the difficulty of finding a new system to replace them. Kotka is a CMS created by the Finnish Museum of Natural History (Luomus) to solve this problem. Its development started in late 2011 and was put into operational use in 2012. Kotka was first built to replace dozens of in-house systems previously used at Luomus, but eventually grew into a national system, which is now used by 10 institutions in Finland. Kotka currently holds c. 1.7 million specimens from zoological, botanical, paleontological, microbial and botanic garden collections, as well as data from genomic resource collections. Kotka is designed to fit the needs of different types of collections and can be further adapted when new needs arise. Kotka differs in many ways from traditional CMS's. It applies simple and pragmatic approaches. This has helped it to grow into a widely used system despite limited development resources – on average less than one full-time equivalent developer (FTE). The aim of Kotka is to improve collection management efficiency by providing practical tools. It emphasizes the quantity of digitized specimens over completeness of the data. It also harmonizes collection management practices by bringing all types of collections under one system. Kotka stores data mostly in a denormalized free text format using a triplestore and a simple hierarchical data model (Fig. 1). This allows greater flexibility of use and faster development compared to a normalized relational database. New data fields and structures can easily be added as needs arise. Kotka does some data validation, but quality control is seen as a continuous process and is mostly done after the data has been recorded into the system. The data model is loosely based on the ABCD (Access to Biological Collection Data) standard, but has been adapted to support practical needs. Kotka is a web application and data can be entered, edited, searched and exported through a browser-based user interface. However, most users prefer to enter new data in customizable MS-Excel templates, which support the hierarchical data model, and upload these to Kotka. Batch updates can also be done using Excel. Kotka stores all revisions of the data to avoid any data loss due to technical or human error. Kotka also supports designing and printing specimen labels, annotations by external users, as well as handling accessions, loan transactions, and the Nagoya protocol. Taxonomy management is done using a separate system provided by the Finnish Biodiversity Information Facility (FinBIF). This decoupling also allows entering specimen data before the taxonomy is updated, which speeds up specimen digitization. Every specimen is given a persistent unique HTTP-URI identifier (CETAF stable identifiers). Specimen data is accessible through the FinBIF portal at species.fi, and will later be shared to GBIF according to agreements with data holders. Kotka is continuously developed and adapted to new requirements in close collaboration with curators and technical collection staff, using agile software development methods. It is available as open source, but is tightly integrated with other FinBIF infrastructure, and currently only offered as an online service (Software as a Service) hosted by FinBIF.
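As a rough illustration of the denormalized, triple-based specimen records described above, the sketch below hangs a few statements off a stable HTTP-URI identifier, with one hierarchical child unit. The property names and URIs are invented for the example and do not reflect Kotka's actual data model.

```python
from rdflib import Graph, Namespace, URIRef, Literal, RDF

# Illustrative vocabulary only; Kotka's real fields follow its own
# ABCD-inspired model.
EX = Namespace("http://example.org/kotka#")

specimen = URIRef("http://example.org/specimens/GX.1234")  # stand-in stable URI
prep = URIRef("http://example.org/specimens/GX.1234/unit/1")

g = Graph()
g.bind("ex", EX)
g.add((specimen, RDF.type, EX.Specimen))
g.add((specimen, EX.collection, Literal("Botanical collections")))
g.add((specimen, EX.verbatimLocality, Literal("Helsinki, Kaisaniemi")))
# Hierarchical child unit, e.g. one preparation of the same gathering:
g.add((specimen, EX.hasUnit, prep))
g.add((prep, EX.preparationType, Literal("herbarium sheet")))

print(g.serialize(format="turtle"))
```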

Dissertations / Theses on the topic "Triplestore databáze"

1

Hanuš, Jiří. "Porovnání přístupů k ukládání otevřených propojených dat" [A comparison of approaches to storing linked open data]. Master's thesis, Vysoká škola ekonomická v Praze, 2015. http://www.nusl.cz/ntk/nusl-262273.

Abstract:
The aim of this diploma thesis is a detailed description of current possibilities and ways of storing open data. It focuses on tools and database systems used for storing linked open data as well as on the selection of such systems for subsequent analysis and comparison. The practical part of the thesis then focuses on the comparison of selected systems based on a selected use case. This thesis introduces the fundamental terms and concepts concerning linked open data. Besides that, various approaches and formats for storing linked open data (namely file-oriented approaches and database approaches) are analyzed. The thesis also focuses on the RDF format and database systems. Ten triplestore database solutions (solutions for storing data in the RDF format) are introduced and described briefly. Out of these, three are chosen for a detailed analysis by which they are compared with one another and with a relational database system. The core of the detailed analysis lies in performance benchmarks. Existing performance-oriented benchmarks of triplestore systems are described and analyzed. In addition to that, the thesis introduces a newly developed benchmark as a collection of database queries. The benchmark is then used for the performance testing. The following systems have been tested: Apache Jena TDB/Fuseki, OpenLink Virtuoso, Oracle Spatial and Graph, and Microsoft SQL Server. The main contribution of this thesis consists in a comprehensive presentation of current possibilities of storing linked open data.
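A small sketch of the kind of query-timing harness such a benchmark requires is given below. It assumes a local Apache Jena Fuseki dataset named "ds" at the default port; the queries shown are generic placeholders, not the thesis's benchmark collection.

```python
import time
from statistics import mean
from SPARQLWrapper import SPARQLWrapper, JSON

# Assumed local Fuseki endpoint; adjust to the triplestore under test.
ENDPOINT = "http://localhost:3030/ds/sparql"

QUERIES = {
    "count_triples": "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
    "list_classes": "SELECT DISTINCT ?c WHERE { ?s a ?c } LIMIT 100",
}

def time_query(query: str, repeats: int = 5) -> float:
    """Run one SPARQL query several times and return the mean wall-clock time."""
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(query)
    runs = []
    for _ in range(repeats):
        start = time.perf_counter()
        sparql.query().convert()
        runs.append(time.perf_counter() - start)
    return mean(runs)

for name, q in QUERIES.items():
    print(f"{name}: {time_query(q):.3f} s (mean of 5 runs)")
```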