Journal articles on the topic 'Document databases'

Consult the top 50 journal articles for your research on the topic 'Document databases.'

1

Shichkina, Yulia, and Van Muon Ha. "Method for Creating Collections with Embedded Documents for Document-oriented Databases Taking into Account Executable Queries." SPIIRAS Proceedings 19, no. 4 (2020): 829–54. http://dx.doi.org/10.15622/sp.2020.19.4.5.

Abstract:
In recent decades, NoSQL databases have grown steadily more popular. Increasingly, developers and database administrators have to migrate databases from the relational model to a NoSQL model such as that of the document-oriented database MongoDB. This article discusses an approach to such data migration based on set theory. It presents a new formal method for determining collections with embedded documents, for key-document NoSQL databases, that are optimal with respect to query execution time. The method takes the attributes of the database objects and the search queries into account when optimizing the number of collections and their structures. The initial data are the properties of the objects about which information is stored in the database (attributes and the relationships between attributes) and the properties of the queries that are executed most often or whose speed should be maximal. The article considers the basic relationship types typical of the relational model (1-1, 1-M, M-M). The proposed method is the next step after the method of creating collections without embedded documents. The article also provides a methodology for determining which method should be used in which cases to make work with databases more efficient. Finally, the article reports the results of testing the proposed method on databases with different initial schemas. The experimental results show that the proposed method helps reduce query execution time and can also significantly reduce the amount of memory required to store the data in the new database.
2

Shichkina, Yulia, and Muon Ha. "Creating Collections with Embedded Documents for Document Databases Taking into Account the Queries." Computation 8, no. 2 (2020): 45. http://dx.doi.org/10.3390/computation8020045.

Abstract:
In this article, we describe a new formalized method for constructing a MongoDB NoSQL document database that takes into account the structure of the queries planned for execution against the database. The method is based on set theory. The initial data are the properties of the objects about which information is stored in the database and the set of queries that are executed most often or whose execution speed should be maximal. To determine whether embedded documents need to be created, our method uses the type of relationship between tables in a relational database. Our studies have shown that this method complements the method of creating collections without embedded documents. In the article, we also describe a methodology for determining which method should be used in which cases to make working with databases more efficient. This approach can be used for translating data from MySQL to MongoDB and for the consolidation of these databases.
3

Ait El Mouden, Zakariyaa, and Abdeslam Jakimi. "A New Algorithm for Storing and Migrating Data Modelled by Graphs." International Journal of Online and Biomedical Engineering (iJOE) 16, no. 11 (2020): 137. http://dx.doi.org/10.3991/ijoe.v16i11.15545.

Abstract:
NoSQL databases have moved from theoretical solutions for exceeding the limits of relational databases to a practical and indisputable application for storing and manipulating big data. In terms of variety, NoSQL databases store heterogeneous data without being obliged to respect a predefined schema, as is the case for relational and object-relational databases. NoSQL solutions surpass traditional databases in storage capacity; consider MongoDB, for example, a document-oriented database capable of storing an unlimited number of documents with a maximal size of 32TB, depending on the machine that runs the database and on the operating system. In terms of velocity, many studies have compared the execution time of different transactions and shown that NoSQL databases are the perfect solution for real-time applications. This paper presents an algorithm for storing data modeled by graphs as NoSQL documents. The purpose of this study is to exploit the large amount of data stored in SQL databases and to make such data usable by recent clustering algorithms and other data science tools. The study links relational data to document datastores by defining an effective algorithm for reading relational data, modelling those data as graphs, and storing them as NoSQL documents.
4

Falk, Howard. "Building and using document databases." Electronic Library 16, no. 1 (1998): 55–59. http://dx.doi.org/10.1108/eb045616.

5

Bunin, M. S., I. A. Kolenchenko, and L. N. Pirumova. "Digital agriculture informational resources in local and international databases." Dokuchaev Soil Bulletin, no. 108 (October 19, 2021): 157–74. http://dx.doi.org/10.19047/0136-1694-2021-108-157-174.

Abstract:
The article reviews informational resources on precision and digital agriculture in international cross-disciplinary databases and in sectoral international and local databases. The databases Web of Science, Scopus, AGRIS (by FAO UN) and Engineering infrastructure of Agriculture, Rosinformagrotech, AGROS (by Central Scientific Agricultural Library) were analyzed using the search queries “Digital agriculture” and “Precision Agriculture”. The authors estimated the dynamics of document flows into the AGROS database and confirmed strong growth in the volume of local publications on precision agriculture, to a level that demonstrates adoption of precision agriculture technology. Meanwhile, the document flow on digital agriculture is still at a starting level. Analysis of the most frequent publication venues showed that there are no local specialized journals on precision agriculture; publications appeared most frequently in local journals such as “Machinery and equipment for rural areas”, “Soil science and agrochemistry”, and “Agricultural machinery and technology”. Materials were predominantly published in the specialized foreign journals “Computers and electronics in agriculture” and “Precision agriculture”. Most of the documents were obtained from the WOS and Scopus databases, but many of them are irrelevant. When searching for foreign documents, it makes sense to use all the databases available, but most of the open-access full texts are available in the AGRIS database. Likewise, the AGROS database provides a wide range of full texts in the Russian language. Both AGROS and AGRIS showed high search efficiency, with the most relevant documents in the search results, since both databases use a thesaurus as a linguistic tool.
6

Soni, Pradeep, and Narendra Singh Yadav. "Quantitative Analysis of Document Stored Databases." International Journal of Computer Applications 118, no. 20 (2015): 37–41. http://dx.doi.org/10.5120/20865-3587.

7

Mokropolova, Yuliya E., and Ekaterina M. Smirnova. "DOCUMENT DATABASES ON CULTURE AND ART." Vestnik Tomskogo gosudarstvennogo universiteta. Kul'turologiya i iskusstvovedenie, no. 19(3) (September 1, 2015): 120–27. http://dx.doi.org/10.17223/22220836/19/16.

8

Kaszkiel, Marcin, Justin Zobel, and Ron Sacks-Davis. "Efficient passage ranking for document databases." ACM Transactions on Information Systems 17, no. 4 (1999): 406–39. http://dx.doi.org/10.1145/326440.326445.

9

Lu, Yue, and Chew Lim Tan. "Information retrieval in document image databases." IEEE Transactions on Knowledge and Data Engineering 16, no. 11 (2004): 1398–410. http://dx.doi.org/10.1109/tkde.2004.76.

10

Moffat, A., J. Zobel, and N. Sharman. "Text compression for dynamic document databases." IEEE Transactions on Knowledge and Data Engineering 9, no. 2 (1997): 302–13. http://dx.doi.org/10.1109/69.591454.

11

Nørvåg, Kjetil. "Granularity reduction in temporal document databases." Information Systems 31, no. 2 (2006): 134–47. http://dx.doi.org/10.1016/j.is.2004.10.002.

12

Gallinucci, Enrico, Matteo Golfarelli, and Stefano Rizzi. "Schema profiling of document-oriented databases." Information Systems 75 (June 2018): 13–25. http://dx.doi.org/10.1016/j.is.2018.02.007.

13

Lima, Cláudio, and Ronaldo Santos Mello. "On proposing and evaluating a NoSQL document database logical approach." International Journal of Web Information Systems 12, no. 4 (2016): 398–417. http://dx.doi.org/10.1108/ijwis-04-2016-0018.

Abstract:
Purpose: NoSQL databases do not require a default schema associated with the data. Even so, they are categorized by data models. A model associated with the data can promote better strategies for persistence and manipulation of data in the target database. Based on this motivation, the purpose of this paper is to present an approach for the logical design of NoSQL document databases, consisting of a process that converts a conceptual model into efficient logical representations for a NoSQL document database. The authors also evaluate their approach and demonstrate that the generated NoSQL logical structures reduce the number of data items accessed by queries. Design/methodology/approach: This paper presents an approach for the logical design of NoSQL document database schemas based on a conceptual schema. The authors generate compact and redundancy-free schemas and define appropriate representations in a NoSQL document logical model. The estimated volume of data and workload information can be considered to generate optimized NoSQL document structures. Findings: This approach was evaluated through a case study with an experimental evaluation in the e-commerce application domain. The results demonstrate that the authors’ workload-based conversion process improves query performance on NoSQL documents by reducing the number of database accesses. Originality/value: Unlike related work, the reported approach covers all typical conceptual constructs, details a conversion process between conceptual schemas and logical representations for the NoSQL document database category and, additionally, considers the estimated database workload to perform optimizations in the logical structure. An experimental evaluation shows that the proposed approach is promising.
14

Winaya, I. Gede, and Ahmad Ashari. "Transformasi Skema Basis Data Relasional Menjadi Model Data Berorientasi Dokumen pada MongoDB." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 10, no. 1 (2016): 47. http://dx.doi.org/10.22146/ijccs.11188.

Abstract:
MongoDB is a database that uses a document-oriented data storage model. Migrating from a relational database to a NoSQL database such as MongoDB is not easy, especially if the data are extremely complex. Based on documentation published by several global companies on their use of MongoDB, it can be concluded that migrating from an RDBMS to MongoDB requires quite a long time. One step that takes considerable time is the transformation of the relational database schema into a document-oriented data model in MongoDB. This research discusses the development of a system for transforming a relational database schema into a document-oriented data model in MongoDB. The transformation uses the structure of and relationships between tables in the schema as the main parameters of the modeling algorithm. During document modeling, adjustments to the MongoDB document specifications are necessary so that the resulting document model can be implemented in MongoDB. The document models produced by the transformation can be a single document, an embedded document, a referenced document, or a combination of these. Which document model is formed depends on the type, rules, and cardinality of the relationships between tables in the relational database schema.
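As a rough illustration of the difference between embedded and referenced document models mentioned in this abstract, the sketch below builds both shapes from relational-style rows held as plain Python dictionaries. The abstract does not state the paper's actual transformation rules, so the rule used here (embed one-to-many children, reference many-to-many ones) is only an assumption, and every table, field, and function name is hypothetical.

```python
from collections import defaultdict


def embed_one_to_many(parents, children, fk, field):
    """Embed each child row inside its parent document as an array element."""
    grouped = defaultdict(list)
    for child in children:
        # Drop the foreign key: nesting already expresses the relationship.
        grouped[child[fk]].append({k: v for k, v in child.items() if k != fk})
    return [{**parent, field: grouped.get(parent["id"], [])} for parent in parents]


def reference_many_to_many(parents, links, parent_key, child_key, field):
    """Store only the ids of related rows; those rows stay in their own collection."""
    refs = defaultdict(list)
    for link in links:
        refs[link[parent_key]].append(link[child_key])
    return [{**parent, field: refs.get(parent["id"], [])} for parent in parents]


# Hypothetical 1-M example: customers and their orders (assumed rule: embed).
customers = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Budi"}]
orders = [
    {"id": 10, "customer_id": 1, "total": 25.0},
    {"id": 11, "customer_id": 1, "total": 40.0},
]
customer_docs = embed_one_to_many(customers, orders, "customer_id", "orders")

# Hypothetical M-M example: students and courses via a link table (assumed rule: reference).
students = [{"id": 1, "name": "Citra"}]
enrolments = [{"student_id": 1, "course_id": 7}, {"student_id": 1, "course_id": 9}]
student_docs = reference_many_to_many(
    students, enrolments, "student_id", "course_id", "course_ids"
)
```

The resulting dictionaries have the shape of documents that could be stored in a document database; no database driver is used, so the sketch says nothing about how the cited system writes to MongoDB.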
15

Schmitt, Oliver, and Tim A. Majchrzak. "Document-Based Databases for Medical Information Systems and Crisis Management." International Journal of Information Systems for Crisis Response and Management 5, no. 3 (2013): 63–80. http://dx.doi.org/10.4018/ijiscram.2013070104.

Abstract:
In both healthcare and crisis management, the use of Information Systems (IS) has become routine. In fact, both fields are unthinkable without sophisticated IT support. Virtually all IS rely on data storage. Despite the document-oriented nature of medical datasets, relational databases (RDBMS) prevail. The authors evaluate a document-based database to assess its feasibility for the domain of healthcare and crisis support. To foster the understanding of this technology, the authors present the background of form-originated data storage, introduce document-based databases, and describe a use case relying on document-based databases. Based on their findings, the authors generalize the results with a focus on crisis management. The authors found good indications that document-based databases such as CouchDB are well-suited for IS in medical contexts. They might be a feasible option for the future development of systems in various fields of healthcare, crisis response, and medical research.
16

Smirnov, M. V., and R. S. Tolmasov. "Graphical Notation for Document Database Modeling." Open Education 25, no. 5 (2021): 50–60. http://dx.doi.org/10.21686/1818-4243-2021-5-50-60.

Abstract:
Goals and objectives. Graphical models have proven to be a reliable, clear, and convenient tool for creating sketch models of databases. Most existing notations are designed for the relational data model, the dominant data model of the last thirty years. However, the development of information technologies has increased the popularity of non-relational data models, primarily the document model. One problem with applying it in practice is the lack of suitable tools for graphical modeling of the database at the logical design stage that take the features of the document model into account. Developing appropriate tools is an important and timely task, since applying them in practical research makes it possible to identify, classify, and analyze typical modeling errors, which allows the designer to reduce the risk of their occurrence in the future. The purpose of this article is to develop a graphical notation that, on the one hand, is convenient for the designer and, on the other hand, takes into account the peculiarities of the creation and functioning of the NoSQL document storage model. Materials and methods. The materials for the study were numerous publications devoted to the development of graphical notations and their application to database design for various information systems. The selected materials were analyzed and the main graphical notations used to describe the relational data model were identified. From these, the three notations (sets of graphic stereotypes) that differed most from each other were selected, and their analysis made it possible to identify the main image patterns of the components of the relational model. The resulting patterns were applied to the main elements of the document database, which were obtained by analyzing the documentation of the popular MongoDB DBMS. Results. The result of the research is a new tool for modeling document databases at the logical level, consisting of a set of graphic stereotypes and rules for their application. On the one hand, the tool is familiar to practitioners who have previously worked with relational data models, since its development took into account many years of experience in using graphical models in relational database design; on the other hand, it reflects the structural features of the document model. Conclusion. The practical application of the developed model has shown that it is convenient both for designing document databases and for teaching students in this subject area. Graphical models constructed in the proposed notation will allow researchers to create and illustrate typical patterns of document databases, which will undoubtedly have a positive impact on the development of promising data storage technologies.
17

Jiang, M. F., S. S. Tseng, and C. J. Tsai. "Intelligent query agent for structural document databases." Expert Systems with Applications 17, no. 2 (1999): 105–13. http://dx.doi.org/10.1016/s0957-4174(99)00028-7.

18

Che, Dunren, Karl Aberer, and M. Tamer Özsu. "Query optimization in XML structured-document databases." VLDB Journal 15, no. 3 (2006): 263–89. http://dx.doi.org/10.1007/s00778-005-0172-6.

19

Weiss, E. H. "Of document databases, SGML, and rhetorical neutrality." IEEE Transactions on Professional Communication 36, no. 2 (1993): 58–61. http://dx.doi.org/10.1109/47.222682.

20

D'Souza, Daryl, James A. Thom, and Justin Zobel. "Collection selection for managed distributed document databases." Information Processing & Management 40, no. 3 (2004): 527–46. http://dx.doi.org/10.1016/s0306-4573(03)00008-6.

21

Wang, Jason Tsong-Li, Dennis Shasha, George J. S. Chang, Liam Relihan, Kaizhong Zhang, and Girish Patel. "Structural matching and discovery in document databases." ACM SIGMOD Record 26, no. 2 (1997): 560–63. http://dx.doi.org/10.1145/253262.253406.

22

Moffat, Alistair, Justin Zobel, Ian H. Witten, and Timothy C. Bell. "The role of compression in document databases." ACM SIGWEB Newsletter 4, no. 2 (1995): 20–21. http://dx.doi.org/10.1145/223301.223318.

23

Mayol, Enric, and Maria José Casañ. "Supporting the Genealogical Document Transcription Process." International Journal of Social and Organizational Dynamics in IT 3, no. 4 (2013): 1–18. http://dx.doi.org/10.4018/ijsodit.2013100101.

Abstract:
Genealogy has recently become a popular activity, attracting growing interest thanks to easy access to heritage documentation on the internet and in digital form. The main information sources for genealogy research are different kinds of genealogical documents (census, church vital records, wills, …). In Spain, several projects to digitize heritage and genealogical documentation have been developed recently in order to improve access to it and to preserve it. Such digital information is useful, but it would be even more useful to have its transcription in a persistent and searchable medium such as databases or web repositories. However, there is no standard proposal for what the contents of these database repositories should be. In this paper, the authors describe the main characteristics of a tool that supports the transcription of genealogical documentation. The tool allows easy, intuitive, and fast transcription of genealogical documentation in accordance with the contents of each kind of genealogical document. Given a model describing the structure and contents of a genealogical document, the tool helps the user transcribe the document contents. The authors also propose a conceptual schema to model and describe, in a generic and uniform way, the main contents of genealogical documentation of interest for genealogy and family history research. This model should be a first step towards a reference model for describing heritage documents, facilitating the transcription process, and sharing transcribed data among researchers and databases.
24

Doleschal, Johannes, Benny Kimelfeld, and Wim Martens. "Database Principles and Challenges in Text Analysis." ACM SIGMOD Record 50, no. 2 (2021): 6–17. http://dx.doi.org/10.1145/3484622.3484624.

Abstract:
A common conceptual view of text analysis is that of a two-step process, where we first extract relations from text documents and then apply a relational query over the result. Hence, text analysis shares technical challenges with, and can draw ideas from, relational databases. A framework that formally instantiates this connection is that of the document spanners. In this article, we review recent advances in various research efforts that adapt fundamental database concepts to text analysis through the lens of document spanners. Among others, we discuss aspects of query evaluation, aggregate queries, provenance, and distributed query planning.
25

Yogish, Deepa, T. N. Manjunath, H. K. Yogish, and Ravindra S. Hegadi. "Ranking Top Similar Documents for User Query Based on Normalized Vector Cosine Similarity Model." Journal of Computational and Theoretical Nanoscience 17, no. 9 (2020): 4531–34. http://dx.doi.org/10.1166/jctn.2020.9330.

Abstract:
As technology develops, the volume of information in every field, such as literature, technology, science, and medicine, is also increasing at a high pace. Extracting related documents from a huge collection of documents based on a user query is an interesting problem in the digital world. Document similarity techniques are used in many applications, such as text categorization, plagiarism detection, document clustering, information retrieval, machine translation, and question answering systems. Many algorithms have been developed for this purpose that take a document or input query and match it against document databases. This paper proposes a novel approach that vectorizes each document and query with a normalized TF-IDF method and applies a cosine similarity function to extract the top 3 documents based on the user query.
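The general technique named in this abstract, ranking documents by cosine similarity between normalized TF-IDF vectors, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, and all function names are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def build_idf(docs_tokens):
    """Smoothed inverse document frequency for every term in the corpus (assumed variant)."""
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    """L2-normalized TF-IDF vector as a sparse dict; terms outside the corpus are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(u, v):
    """For unit-length vectors, the dot product equals the cosine similarity."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def top_k(query, docs, k=3):
    """Return (document index, score) pairs for the k most similar documents."""
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    q = tfidf_vector(tokenize(query), idf)
    scored = [(cosine(q, tfidf_vector(toks, idf)), i) for i, toks in enumerate(docs_tokens)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:k]]


docs = [
    "document databases store json documents",
    "relational databases use tables",
    "cats sleep all day",
    "query optimization in document databases",
]
result = top_k("document databases", docs)
```

Here `result` holds the three best-matching document indices with their scores; the unrelated third document scores zero and is left out.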
26

Marrara, Stefania, Mauro Pelucchi, and Giuseppe Psaila. "Blind Queries Applied to JSON Document Stores." Information 10, no. 10 (2019): 291. http://dx.doi.org/10.3390/info10100291.

Abstract:
Social Media, Web Portals and, in general, information systems offer their own Application Programming Interfaces (APIs), used to provide large data sets concerning every aspect of day-by-day life. APIs usually provide data sets as collections of JSON documents. The heterogeneous structure of JSON documents returned by different APIs constitutes a barrier to effectively querying and analyzing these data sets. The adoption of NoSQL document stores, such as MongoDB, is useful for gathering these data sets, but does not solve the problem of querying the final heterogeneous repository. The aim of this paper is to provide analysts with a tool, named HammerJDB, that allows blind querying of collections of JSON documents within a NoSQL document database. The underlying idea is that users may know the application domain but may not be aware of the real structures of the documents stored in the database; the tool for blind querying tries to bridge the gap by adopting a query rewriting mechanism. This paper is an evolution of a technique for blind querying Open Data portals and of its implementation within the Hammer framework, presented in some previous work. In this paper, we evolve that approach in order to query a NoSQL document database by evolving the Hammer framework into the HammerJDB framework, which is able to work on MongoDB databases. The effectiveness of the new approach is evaluated on a data set (derived from a real-life one) containing job-vacancy ads collected from European job portals.
27

Nohuddin et al. "Content analytics based on random forest classification technique: An empirical evaluation using online news dataset." International Journal of ADVANCED AND APPLIED SCIENCES 8, no. 2 (2021): 77–84. http://dx.doi.org/10.21833/ijaas.2021.02.011.

Abstract:
In this paper, a study is presented that exploits a document classification technique to categorize a set of random online documents. The technique aims to assign one or more classes or categories to a document, making it easier to manage and sort. The paper describes an experiment on the proposed method for classifying documents effectively using the decision tree technique. The proposed research framework is Document Analysis based on the Random Forest Algorithm (DARFA). The framework consists of five components: (i) Document dataset, (ii) Data Preprocessing, (iii) Document Term Matrix, (iv) Random Forest classification, and (v) Visualization. The proposed classification method can analyze the content of document datasets and classify documents according to their text content. The framework uses algorithms that include TF-IDF and the Random Forest algorithm. The outcome of this study can enhance document management procedures such as managing documents in daily business operations, consolidating inventory systems, organizing files in databases, and categorizing document folders.
28

Fong, Joseph, and Herbert Shiu. "An Interpreter Approach for Exporting Relational Data into XML Documents with Structured Export Markup Language." Journal of Database Management 23, no. 1 (2012): 49–77. http://dx.doi.org/10.4018/jdm.2012010103.

Abstract:
Almost all enterprises use relational databases to handle real-time business operations, and most need to generate various XML documents for data exchange, internally among departments and externally with business partners. Exporting data from a relational database to an XML document can be considered a data conversion process. Based on the four approaches to data conversion (customized program, interpretive transformer, translator generator, and logical level translation), this paper proposes a new interpretive approach that uses a Structured Export Markup Language (SEML) interpreter to convert relational data into XML documents. The frameworks and languages proposed by other researchers are neither generic nor able to generate arbitrary XML documents. The SEML interpreter is therefore a simple, user-friendly, and complete solution with a new mark-up language, SEML, for data conversion. The solution can be used as a generic tool for extract, transform, and load (ETL) purposes. In other words, the SEML interpreter is a solution for relational databases similar to what X-Query is for XML databases.
29

Doermann, David, Huiping Li, and Omid Kia. "The detection of duplicates in document image databases." Image and Vision Computing 16, no. 12-13 (1998): 907–20. http://dx.doi.org/10.1016/s0262-8856(98)00054-7.

30

Lopresti, Daniel P. "String techniques for detecting duplicates in document databases." International Journal on Document Analysis and Recognition 2, no. 4 (2000): 186–99. http://dx.doi.org/10.1007/s100320050005.

31

Blanco, Carlos, Diego García-Saiz, David G. Rosado, et al. "Security policies by design in NoSQL document databases." Journal of Information Security and Applications 65 (March 2022): 103120. http://dx.doi.org/10.1016/j.jisa.2022.103120.

32

Mekala, A. "An Ontology Approach to Data Integration using Mapping Method." International Journal for Modern Trends in Science and Technology 6, no. 12 (2020): 28–32. http://dx.doi.org/10.46501/ijmtst061206.

Abstract:
Text mining is a technique for discovering meaningful patterns in the available text documents. Pattern discovery from text and the association of documents is a well-known problem in data mining. Analyzing text content and categorizing documents is a composite data mining task. Some approaches to document compilation are supervised and some are unsupervised. The term “Federated Databases” refers to the integration of distributed, autonomous and heterogeneous databases. Nevertheless, a federation can also include information systems, not only databases. When integrating data, several issues must be addressed. Here, we focus on the problem of heterogeneity, more specifically on semantic heterogeneity, that is, problems related to semantically equivalent concepts or semantically related/unrelated concepts. To address this problem, we apply the idea of ontologies as a tool for data integration. In this paper, we explain this concept and briefly describe a technique for constructing an ontology by using a hybrid ontology approach.
33

Fakharaldien, Mohammed Adam Ibrahim, Jasni Mohamed Zain, Norrozila Sulaiman, and Tutut Herawan. "XRecursive." International Journal of Information Retrieval Research 1, no. 4 (2011): 53–65. http://dx.doi.org/10.4018/ijirr.2011100104.

Abstract:
Storing XML documents in a relational database is a promising solution because relational databases are mature and scale very well. They have the advantage that XML data and structured data can coexist in a relational database, making it possible to build applications that involve both kinds of data with little extra effort. This paper proposes an alternative method, named Xrecursive, for mapping XML (eXtensible Markup Language) documents to RDBs (relational databases). The Xrecursive method does not need a DTD (Document Type Definition) or XML schema. Further, it can be applied as a general solution for any XML data. The steps and algorithm of Xrecursive are given in detail to describe how to use the storage structure to store and query XML documents in a relational database. The authors report experimental results on a real database, showing that the Xrecursive algorithm achieves better results in terms of storage size, insertion time, mapping time, and reconstruction time than the SUCXENT and XParent methods. Overall, Xrecursive also performs better in terms of query performance than both methods.
34

Zmaranda, Doina R., Cristian I. Moisi, Cornelia A. Győrödi, Robert Ş. Győrödi, and Livia Bandici. "An Analysis of the Performance and Configuration Features of MySQL Document Store and Elasticsearch as an Alternative Backend in a Data Replication Solution." Applied Sciences 11, no. 24 (2021): 11590. http://dx.doi.org/10.3390/app112411590.

Abstract:
In recent years, with the increase in the volume and complexity of data, choosing a suitable database for storing huge amounts of data is not easy, because the choice must consider aspects such as manageability, scalability, and extensibility. Nowadays, NoSQL databases have gained immense popularity for their efficiency in managing such datasets compared to relational databases. However, relational databases also exhibit some advantages in certain circumstances; therefore, many applications use a combined approach: relational and non-relational. This paper performs a comparative evaluation of two popular open-source DBMSs: MySQL Document Store and Elasticsearch as non-relational DBMSs; this comparison is based on a detailed analysis of CRUD operations for different amounts of data, showing how the databases could be modeled and used in an application. A case-study application was developed for this purpose in the Java programming language and the Spring framework, using both relational MySQL and non-relational Elasticsearch and MySQL Document Store for data storage. To model the real situation encountered in several developed applications that use both relational and non-relational databases, a data replication solution that imports data from the primary relational MySQL database into Elasticsearch and MySQL Document Store as possible alternatives for more efficient data search was proposed and implemented.
35

Győrödi, Cornelia A., Diana V. Dumşe-Burescu, Robert Ş. Győrödi, Doina R. Zmaranda, Livia Bandici, and Daniela E. Popescu. "Performance Impact of Optimization Methods on MySQL Document-Based and Relational Databases." Applied Sciences 11, no. 15 (2021): 6794. http://dx.doi.org/10.3390/app11156794.

Abstract:
Databases are an important part of today’s applications where large amounts of data need to be stored, processed, and accessed quickly. One of the important criteria when choosing to use a database technology is its data processing performance. In this paper, some methods for optimizing the database structure and queries were applied on two popular open-source database management systems: MySQL as a relational DBMS, and document-based MySQL as a non-relational DBMS. The main objective of this paper was to conduct a comparative analysis of the impact that the proposed optimization methods have on each specific DBMS when carrying out CRUD (CREATE, READ, UPDATE, DELETE) requests. To perform the analysis and performance evaluation of CRUD operations for different amounts of data, a case study testing architecture based on Java was developed and used to show how the databases’ proposed optimization methods can influence the performance of the application, and to highlight the differences in response time and complexity. The results obtained show the degree to which the proposed optimization methods contributed to the application’s performance improvement in the case of both databases; based on these, a detailed analysis and several conclusions are presented to support a decision for choosing a specific approach.
36

Ha, Van Muon, Yulia A. Shichkina, and Sergey V. Kostichev. "Determining the Composition of Collections for Key-Document Databases Based on a Given Set of Object Properties and Database Queries." Computer tools in education, no. 3 (September 30, 2019): 15–28. http://dx.doi.org/10.32603/2071-2340-2019-3-15-28.

Abstract:
The task of transforming a database from one format to another arises periodically in different organizations for various reasons. Today, the mechanism for changing the format of relational databases is well developed. However, with the advent of new types of databases, such as NoSQL, this problem has become prevalent because of the radically different ways in which the various databases organize data. This article discusses a formalized method, based on set theory, for choosing the number and composition of collections for a key-value type database. The initial data are the properties of the objects about which information is stored in the database and the set of queries that are most frequently executed. The considered method can be applied not only when creating a new key-value database, but also when transforming an existing one, when moving from relational databases to NoSQL, and when consolidating databases.
37

Horvat, Marko, Alan Jović, and Danko Ivošević. "Lift Charts-Based Binary Classification in Unsupervised Setting for Concept-Based Retrieval of Emotionally Annotated Images from Affective Multimedia Databases." Information 11, no. 9 (2020): 429. http://dx.doi.org/10.3390/info11090429.

Abstract:
Evaluation of document classification is straightforward if complete information on the documents’ true categories exists. In this case, the rank of each document can be accurately determined and evaluated. However, in an unsupervised setting, where the exact document category is not available, lift charts become an advantageous method for evaluation of the retrieval quality and categorization of ranked documents. We introduce lift charts as binary classifiers of ranked documents and explain how to apply them to the concept-based retrieval of emotionally annotated images as one of the possible retrieval methods for this application. Furthermore, we describe affective multimedia databases on a representative example of the International Affective Picture System (IAPS) dataset, their applications, advantages, and deficiencies, and explain how lift charts may be used as a helpful method for document retrieval in this domain. Optimization of lift charts for recall and precision is also described. A typical scenario of document retrieval is presented on a set of 800 affective pictures labeled with an unsupervised glossary. In the lift charts-based retrieval using the approximate matching method, the highest attained accuracy, precision, and recall were 51.06%, 47.41%, 95.89%, and 81.83%, 99.70%, 33.56%, when optimized for recall and precision, respectively.
38

Barillot, Marine J., Bernard Sarrut, and Christian G. Doreau. "Evaluation of Drug Interaction Document Citation in Nine On-Line Bibliographic Databases." Annals of Pharmacotherapy 31, no. 1 (1997): 45–49. http://dx.doi.org/10.1177/106002809703100106.

Abstract:
OBJECTIVE: To compare nine on-line bibliographic databases to obtain bibliographic references on specific drug interactions. DESIGN: Seven bibliographic databases were selected for their ability to provide information concerning drug interactions: EMBASE, MEDLINE, TOXLINE, BIOSIS, Chemical Abstracts (CAS), PHARMLINE, and International Pharmaceutical Abstracts (IPA). Two French on-line bibliographic databases (i.e., PASCAL, BIBLIOGRAPHIF) were also tested to compare them with the other international databases. Twenty drug interactions were selected randomly using the journal Reactions Weekly 1993. MAIN OUTCOMES MEASURES: The total number of references, the number of potentially relevant references, the number of case report references, the number of unique references in the total number of references, and the number of unique references between potentially relevant references were analyzed by using the Friedman two-way ANOVA by ranks. For each database, relevance and relative recall were calculated. RESULTS: For the total number of references, EMBASE was significantly more comprehensive than all other databases (p < 0.05). EMBASE had a significantly greater number of potentially relevant references than IPA, PHARMLINE, CAS, and BIBLIOGRAPHIF (p < 0.05). For the total number of case report references, only one significant difference, between EMBASE and BIBLIOGRAPHIF (p < 0.05), was observed. MEDLINE and TOXLINE had the lowest cost per potentially relevant reference. CONCLUSIONS: To obtain bibliographic references on drug interactions, the first step should be to search MEDLINE or TOXLINE; the second step, for completeness, should be to search EMBASE.
39

Richardson, G. Manning, Janet Bowers, A. John Woodill, Joseph R. Barr, Jean Mark Gawron, and Richard A. Levine. "Topic Models: A Tutorial with R." International Journal of Semantic Computing 08, no. 01 (2014): 85–98. http://dx.doi.org/10.1142/s1793351x14500044.

Abstract:
This tutorial presents topic models for organizing and comparing documents. The technique and corresponding discussion focus on the analysis of short text documents, particularly micro-blogs. However, the base topic model and R implementation are generally applicable to text analytics of document databases.
40

Zhang, Xiao Lin, Wan Li Wang, and Xiao Qi Lv. "Research on Technology of Medical Image Database and its Connection with HIS Database." Advanced Materials Research 267 (June 2011): 119–23. http://dx.doi.org/10.4028/www.scientific.net/amr.267.119.

Abstract:
The medical digital image information storage standard and existing heterogeneous database technology were analyzed with a view to the particularity of the standard for medical images; data in medical images were organized according to the data organizational hierarchy in the DICOM standard, and sophisticated relational databases were utilized to store medical images. XML was used as middleware to connect the medical image database to the HIS, and transformation rules between relational databases and XML were then applied to convert both databases into XML documents. The study indicated that the method is able to connect heterogeneous databases.
41

Suroso, Finna, and Galih Hendro. "A Brief Study of Comparison between Three Document Databases." International Journal of Computer Applications 181, no. 34 (2018): 24–29. http://dx.doi.org/10.5120/ijca2018918251.

42

Chen, H., and K. J. Lynch. "Automatic construction of networks of concepts characterizing document databases." IEEE Transactions on Systems, Man, and Cybernetics 22, no. 5 (1992): 885–902. http://dx.doi.org/10.1109/21.179830.

43

Everett, David. "Full-Text Online Databases as a Document Delivery System." Journal of Interlibrary Loan & Information Supply 3, no. 3 (1993): 17–25. http://dx.doi.org/10.1300/j472v03n03_06.

44

Nørvåg, Kjetil. "Supporting temporal text-containment queries in temporal document databases." Data & Knowledge Engineering 49, no. 1 (2004): 105–25. http://dx.doi.org/10.1016/j.datak.2003.08.006.

45

Zhang, Xueying. "Concept integration of document databases using different indexing languages." Information Processing & Management 42, no. 1 (2006): 121–35. http://dx.doi.org/10.1016/j.ipm.2004.09.003.

46

Amiraslani, Farshad, and Deirdre Dragovich. "A Review of Documentation: A Cross-Disciplinary Perspective." World 3, no. 1 (2022): 126–45. http://dx.doi.org/10.3390/world3010007.

Abstract:
Documents are tools of communication which are changing rapidly in nature and quantity. Prompted by the COVID-19 pandemic, digital formats have become ubiquitous. However, documents and documentation have a long pre-digital history. In seeking to survey document types and features, two major online journal databases from the Web of Science database were analysed over a 30-year period to 2020. Documents were classified into types and the (arbitrary) features of format, dimension, production, administration and distribution. Such tabulation of journal documents has not been undertaken previously. As the sampled journals covered a range of fields, the types and features of documentation in selected specialised areas were included. Digitalisation of documentation, especially of rare documents, has accelerated in recent times, contributing to the retention of knowledge and its rapid dissemination, despite the accompanying disadvantages of the digital age, with its largely unregulated social media. Classifying and describing the diversity of existing documents is a major task and we have initiated this process by analysing two scientific databases.
47

Dinges, Laslo, Ayoub Al-Hamadi, Moftah Elzobi, Sherif El-etriby, and Ahmed Ghoneim. "ASM Based Synthesis of Handwritten Arabic Text Pages." Scientific World Journal 2015 (2015): 1–18. http://dx.doi.org/10.1155/2015/323575.

Abstract:
Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, generating such databases is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods that place individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-Spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever sufficient natural ground-truthed data is not available.
48

Maté, Alejandro, Jesús Peral, Juan Trujillo, Carlos Blanco, Diego García-Saiz, and Eduardo Fernández-Medina. "Improving security in NoSQL document databases through model-driven modernization." Knowledge and Information Systems 63, no. 8 (2021): 2209–30. http://dx.doi.org/10.1007/s10115-021-01589-x.

Abstract:
NoSQL technologies have become a common component in many information systems and software applications. These technologies are focused on performance, enabling scalable processing of large volumes of structured and unstructured data. Unfortunately, most developments over NoSQL technologies consider security as an afterthought, putting the personal data of individuals at risk and potentially causing severe economic losses as well as reputation crises. In order to avoid these situations, companies require an approach that introduces security mechanisms into their systems without scrapping already in-place solutions and restarting the design process all over again. Therefore, in this paper we propose the first modernization approach for introducing security in NoSQL databases, focusing on access control and thereby improving the security of their associated information systems and applications. Our approach analyzes the existing NoSQL solution of the organization, using a domain ontology to detect sensitive information and creating a conceptual model of the database. Together with this model, a series of security issues related to access control are listed, allowing database designers to identify the security mechanisms that must be incorporated into their existing solution. For each security issue, our approach automatically generates a proposed solution, consisting of a combination of privilege modifications, new roles and views to improve access control. In order to test our approach, we apply our process to a medical database implemented using the popular document-oriented NoSQL database, MongoDB.
The great advantages of our approach are that: (1) it takes into account the context of the system thanks to the introduction of domain ontologies, (2) it helps to avoid missing critical access control issues since the analysis is performed automatically, (3) it reduces the effort and costs of the modernization process thanks to the automated steps in the process, (4) it can be used with different NoSQL document-based technologies in a successful way by adjusting the metamodel, and (5) it is lined up with known standards, hence allowing the application of guidelines and best practices.
49

Schulz, S., M. L. Müller, W. Dzeyk, et al. "Subword-based Semantic Retrieval of Clinical and Bibliographic Documents." Methods of Information in Medicine 49, no. 02 (2010): 141–47. http://dx.doi.org/10.3414/me9303.

Abstract:
Summary Objectives: The increasing amount of electronically available documents in bibliographic databases and the clinical documentation requires user-friendly techniques for content retrieval. Methods: A domain-specific approach on semantic text indexing for document retrieval is presented. It is based on a subword thesaurus and maps the content of texts in different European languages to a common interlingual representation, which supports the search across multilingual document collections. Results: Three use cases are presented where the semantic retrieval method has been implemented: a bibliographic database, a department EHR system, and a consumer-oriented Web portal. Conclusions: It could be shown that a semantic indexing and retrieval approach, the performance of which had already been empirically assessed in prior studies, proved useful in different prototypical and routine scenarios and was well accepted by several user groups.
50

Suma, Sugimiyanto, and Fahad Alqurashi. "A comparison study of NoSQL document-oriented database system." International Journal of Applied Mathematical Research 8, no. 1 (2019): 27. http://dx.doi.org/10.14419/ijamr.v8i1.29434.

Abstract:
With data generation increasing these days, stakeholders strongly need a sufficient storage system to store and access huge amounts of data efficiently for fast analysis and decision-making. While RDBMS cannot deal with this challenge, NoSQL has emerged as a solution to address it. There are plenty of NoSQL database engines, each with its own category and characteristics, especially among document-oriented databases. However, this makes it confusing for system developers to choose the appropriate NoSQL database for their system. This paper is our preliminary report providing a comparison of NoSQL databases. The comparison is based on execution time, which is measured by building a simple program. The experiment was done in our local cluster using around 1 million datasets. The result shows that RDB has better performance than CDB in terms of execution time.