
Journal articles on the topic 'Document-oriented Databases'


Consult the top 50 journal articles for your research on the topic 'Document-oriented Databases.'


1

Ji, Lim Fung, and Nurulhuda Firdaus Mohd Azmi. "Migrating data from document-oriented database to graph-oriented database." Multidisciplinary Science Journal 5 (August 10, 2023): 2023ss0105. http://dx.doi.org/10.31893/multiscience.2023ss0105.

Abstract:
In data migration between different types of NoSQL databases, data often cannot be transferred directly to the target database, as it can be between databases of the same type. This is due to the heterogeneity of the storage paradigms of NoSQL databases. For example, when migrating data from a document-oriented database such as MongoDB, which stores data in JSON (JavaScript Object Notation) format, to Neo4j, a graph-oriented database that stores data in nodes, the differences between the storage paradigms require a different representation of the data model in the target graph-oriented database. This paper proposes a sequential approach to migrating data from MongoDB to Neo4j. The approach migrates MongoDB data to Neo4j and verifies the migrated data using a comparative method. The paper discusses the migration algorithm and how complex fields in MongoDB, such as nested documents, are represented in Neo4j.
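To make the migration step concrete, here is a minimal sketch of the kind of document-to-node copy the paper describes, assuming pymongo and the official neo4j Python driver; the connection strings, the `people` collection, the `Person` label, and the flattening rule are hypothetical stand-ins, and the paper's verification step and nested-document handling are omitted.

```python
# Sketch: read documents from MongoDB and create corresponding Neo4j nodes.
from pymongo import MongoClient
from neo4j import GraphDatabase

mongo = MongoClient("mongodb://localhost:27017")
collection = mongo["appdb"]["people"]          # hypothetical source collection

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

def copy_documents():
    with driver.session() as session:
        for doc in collection.find():
            # Flatten top-level scalar fields into node properties;
            # nested documents would need their own nodes and relationships.
            props = {k: v for k, v in doc.items()
                     if k != "_id" and not isinstance(v, (dict, list))}
            props["mongo_id"] = str(doc["_id"])  # keep source id for verification
            session.run("CREATE (n:Person $props)", props=props)

copy_documents()
```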
2

Gallinucci, Enrico, Matteo Golfarelli, and Stefano Rizzi. "Schema profiling of document-oriented databases." Information Systems 75 (June 2018): 13–25. http://dx.doi.org/10.1016/j.is.2018.02.007.

3

Suma, Sugimiyanto, and Fahad Alqurashi. "A comparison study of NoSQL document-oriented database system." International Journal of Applied Mathematical Research 8, no. 1 (2019): 27. http://dx.doi.org/10.14419/ijamr.v8i1.29434.

Abstract:
With the growth of data generation today, stakeholders strongly need adequate storage systems to store and access huge amounts of data efficiently for fast analysis and decision making. While RDBMSs cannot meet this challenge, NoSQL has emerged as a solution to address it. There are plenty of NoSQL database engines, each with its own categories and characteristics, especially among document-oriented databases. This variety, however, makes it confusing for system developers to choose the appropriate NoSQL database for their systems. This paper is our preliminary report providing a comparison of NoSQL databases. The comparison is based on execution-time performance, measured by building a simple program. The experiment was run on our local cluster using a dataset of around 1 million records. The result shows that RDB has better performance than CDB in terms of execution time.
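As an illustration of the kind of "simple program" such execution-time comparisons rely on, a minimal timing harness might look as follows; the database, collection, and filter names are hypothetical.

```python
# Sketch: time one query against a document store under test.
import time
from pymongo import MongoClient

coll = MongoClient("mongodb://localhost:27017")["bench"]["records"]

start = time.perf_counter()
count = coll.count_documents({"status": "active"})   # the benchmarked query
elapsed = time.perf_counter() - start
print(f"{count} matching documents in {elapsed:.3f}s")
```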
4

Moukhi, Nawfal El, Ikram El Azami, and Soufiane Hajbi. "Towards a new hybrid approach for building document-oriented data warehouses." International Journal of Electrical and Computer Engineering (IJECE) 12, no. 6 (2022): 6423. http://dx.doi.org/10.11591/ijece.v12i6.pp6423-6431.

Abstract:
Schemaless databases offer a large storage capacity while guaranteeing high performance in data processing, unlike relational databases, which are rigid and have shown their limitations in managing large amounts of data. However, the absence of a well-defined schema and structure in not only SQL (NoSQL) databases makes the use of data for decision-analysis purposes even more complex and difficult. In this paper, we propose an original approach to building a document-oriented data warehouse from unstructured data. The new approach follows a hybrid paradigm that combines data analysis and user-requirements analysis. The first, data-driven step exploits the fast and distributed processing of the Spark engine to generate a general schema for each collection in the database. The second, requirement-driven step consists of analyzing the semantics of the decisional requirements expressed in natural language and mapping them to the schemas of the collections. At the end of the process, a decisional schema is generated in JavaScript Object Notation (JSON) format and the data loading is performed with the necessary transformations.
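A minimal sketch of the data-driven step, assuming each collection has been exported as JSON lines files; Spark's schema inference then yields a general schema per collection. The paths and collection names are hypothetical, and this illustrates only the idea, not the authors' pipeline.

```python
# Sketch: let Spark infer a general schema for each exported collection.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("schema-inference").getOrCreate()

for name in ["orders", "customers"]:              # hypothetical collections
    df = spark.read.json(f"/data/{name}.jsonl")   # schema is inferred by Spark
    print(f"General schema for collection '{name}':")
    df.printSchema()                              # union of fields across documents
```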
5

Moukhi, Nawfal El, Ikram El Azami, and Soufiane Hajbi. "Towards a new hybrid approach for building document-oriented data warehouses." International Journal of Electrical and Computer Engineering (IJECE) 12, no. 6 (2022): 6423–31. https://doi.org/10.11591/ijece.v12i6.pp6423-6431.

Abstract:
Schemaless databases offer a large storage capacity while guaranteeing high performance in data processing, unlike relational databases, which are rigid and have shown their limitations in managing large amounts of data. However, the absence of a well-defined schema and structure in not only SQL (NoSQL) databases makes the use of data for decision-analysis purposes even more complex and difficult. In this paper, we propose an original approach to building a document-oriented data warehouse from unstructured data. The new approach follows a hybrid paradigm that combines data analysis and user-requirements analysis. The first, data-driven step exploits the fast and distributed processing of the Spark engine to generate a general schema for each collection in the database. The second, requirement-driven step consists of analyzing the semantics of the decisional requirements expressed in natural language and mapping them to the schemas of the collections. At the end of the process, a decisional schema is generated in JavaScript Object Notation (JSON) format and the data loading is performed with the necessary transformations.
6

Winaya, I. Gede, and Ahmad Ashari. "Transformasi Skema Basis Data Relasional Menjadi Model Data Berorientasi Dokumen pada MongoDB." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 10, no. 1 (2016): 47. http://dx.doi.org/10.22146/ijccs.11188.

Abstract:
MongoDB is a database that uses a document-oriented data storage model. In practice, migrating from a relational database to a NoSQL database such as MongoDB is not an easy matter, especially if the data are extremely complex. Based on documentation produced by several global companies on their use of MongoDB, it can be concluded that the migration process from an RDBMS to MongoDB requires quite a long time. One step that takes considerable effort is the transformation of the relational database schema into a document-oriented data model in MongoDB. This research discusses the development of a system for transforming a relational database schema into a document-oriented data model in MongoDB. The transformation uses the structure of, and relationships between, tables in the schema as the main parameters of the modeling algorithm. In modeling the documents, adjustments to MongoDB's document specification are necessary so that the resulting document model can be implemented in MongoDB. The document models produced by the transformation can be single documents, embedded documents, referenced documents, or a combination of these, depending on the type, rules, and cardinality of the relationships between the tables in the relational database schema.
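A toy illustration of the modeling choice described above: depending on relationship cardinality, related rows become an embedded sub-document, an embedded array, or references. The rules shown are simplified assumptions, not the paper's exact algorithm.

```python
# Sketch: choose a MongoDB document model from relationship cardinality.
def model_relationship(parent_row: dict, child_rows: list[dict],
                       cardinality: str) -> dict:
    doc = dict(parent_row)
    if cardinality == "1-1":
        # Embed the single related row as a nested document.
        doc["detail"] = child_rows[0] if child_rows else None
    elif cardinality == "1-M":
        # Embed the "many" side as an array of sub-documents.
        doc["items"] = child_rows
    else:  # "M-M": reference by key to avoid duplicating shared children
        doc["item_ids"] = [c["id"] for c in child_rows]
    return doc

order = {"id": 7, "customer": "Ada"}
lines = [{"id": 1, "sku": "A-100"}, {"id": 2, "sku": "B-200"}]
print(model_relationship(order, lines, "1-M"))
```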
7

Calatrava, Carlos Garcia, Yolanda Becerra Fontal, Fernando M. Cucchietti, and Carla Diví Cuesta. "NagareDB: A Resource-Efficient Document-Oriented Time-Series Database." Data 6, no. 8 (2021): 91. http://dx.doi.org/10.3390/data6080091.

Abstract:
The recent great technological advance has led to a broad proliferation of Monitoring Infrastructures, which typically keep track of specific assets over time, ranging from factory machinery and device location to even people. Gathering this data has become crucial for a wide number of applications, like exploration dashboards or Machine Learning techniques such as Anomaly Detection. Time-Series Databases, designed to handle these data, grew in popularity, becoming the fastest-growing database type since 2019. Consequently, keeping track of and mastering those rapidly evolving technologies became increasingly difficult. This paper introduces the holistic design approach followed for building NagareDB, a Time-Series database built on top of MongoDB, the most popular NoSQL database, typically discouraged in the Time-Series scenario. The goal of NagareDB is to ease access to three of the essential resources needed to build time-dependent systems: Hardware, since it is able to work on commodity machines; Software, as it is built on top of an open-source solution; and Expert Personnel, as its foundation database is considered the most popular NoSQL DB, lowering its learning curve. Concretely, NagareDB is able to outperform MongoDB's recommended implementation by up to 4.7 times when retrieving data, while also offering stream ingestion up to 35% faster than InfluxDB, the most popular Time-Series database. Moreover, by relaxing some requirements, NagareDB is able to reduce disk space usage by up to 40%.
8

Aggoune, Aicha, and Mohamed Sofiane Namoune. "P3 Process for Object-Relational Data Migration to NoSQL Document-Oriented Datastore." International Journal of Software Science and Computational Intelligence 14, no. 1 (2022): 1–20. http://dx.doi.org/10.4018/ijssci.309994.

Abstract:
The exponential growth of complex data in object-relational databases (ORDB) raises the need for efficient storage with scalability, consistency, and partition tolerance. Migration towards NoSQL (Not only SQL) datastores is the best fit for distributed complex data. Unfortunately, very few studies provide solutions for migrating an ORDB to NoSQL. This paper reports on how to migrate complex data from an ORDB to a document-oriented NoSQL database. The proposed approach is based on the P3 process, which involves three major stages: (P1) the preprocessing stage, to access and extract the database features using SQL queries; (P2) the processing stage, to provide the data mapping using a list of mapping rules between the source and target models; and (P3) the post-processing stage, to store and query the migrated data within the NoSQL context. Thorough experiments on two real-life databases verify that the P3 process improves the performance of migrating data with complex schema structures.
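As a rough illustration of the processing stage (P2), one mapping rule might turn an object-relational row containing a composite type and a nested collection into a single document; the rule below is an assumption for illustration, not taken from the paper's rule list.

```python
# Sketch: map one object-relational row to a JSON-style document.
from collections import namedtuple

Address = namedtuple("Address", ["street", "city"])  # composite (object) type

def map_ordb_row(row: dict) -> dict:
    doc = {}
    for attr, value in row.items():
        if hasattr(value, "_fields"):            # composite type -> sub-document
            doc[attr] = value._asdict()
        elif isinstance(value, list):            # nested table -> array
            doc[attr] = [map_ordb_row(v) if isinstance(v, dict) else v
                         for v in value]
        else:                                    # scalar column -> field
            doc[attr] = value
    return doc

row = {"id": 1, "address": Address("5 Rue X", "Guelma"),
       "phones": ["+213 ...", "+213 ..."]}
print(map_ordb_row(row))
```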
9

Gallinucci, Enrico, Matteo Golfarelli, and Stefano Rizzi. "Approximate OLAP of document-oriented databases: A variety-aware approach." Information Systems 85 (November 2019): 114–30. http://dx.doi.org/10.1016/j.is.2019.02.004.

10

Datt, Niteshwar. "Comparative Study of CouchDB and MongoDB – NoSQL Document Oriented Databases." International Journal of Computer Applications 136, no. 3 (2016): 24–26. http://dx.doi.org/10.5120/ijca2016908457.

11

Sytnyk, Oksana, and Nina Pomazun. "COMPARING MODELLING OF DOCUMENT-ORIENTED AND RELATIONAL DATABASES: TEACHING EXPERIENCE." Transactions of Kremenchuk Mykhailo Ostrohradskyi National University, no. 1 (2025): 192–201. https://doi.org/10.32782/1995-0519.2025.1.25.

12

Shichkina, Yulia, and Van Muon Ha. "Method for Creating Collections with Embedded Documents for Document-oriented Databases Taking into Account Executable Queries." SPIIRAS Proceedings 19, no. 4 (2020): 829–54. http://dx.doi.org/10.15622/sp.2020.19.4.5.

Abstract:
In recent decades, NoSQL databases have become more popular day by day, and increasingly developers and database administrators have to solve the problem of migrating a database from the relational model to a NoSQL database such as the document-oriented MongoDB. This article discusses an approach to this data migration based on set theory. A new formal method is presented for determining collections with embedded documents that are optimal with respect to the execution time of search queries in document-oriented NoSQL databases. The attributes of the database objects are taken into account when optimizing the number of collections and their structures for search queries. The initial data are the properties of the objects (attributes and the relationships between attributes) whose information is stored in the database, and the properties of the queries that are executed most often or whose speed should be maximal. The article considers the basic types of relationships (1-1, 1-M, M-M) typical of the relational model. The proposed method is the next step after the authors' method for creating collections without embedded documents. The article also provides guidance on which of the methods should be used in which cases to make work with the database more effective. Finally, the article presents the results of testing the proposed method on databases with different initial schemas. The experimental results show that the proposed method can significantly reduce the execution time of queries as well as the amount of memory required to store the data in the new database.
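For a concrete picture of the restructuring involved, here is a minimal sketch that materializes a collection with embedded documents from two flat collections for a frequent 1-M join query, using MongoDB's aggregation pipeline; database, collection, and field names are hypothetical.

```python
# Sketch: build a collection with embedded documents for a frequent query.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["shop"]

db.orders.aggregate([
    # Embed each order's line items (a 1-M relationship) as an array.
    {"$lookup": {
        "from": "order_items",
        "localField": "_id",
        "foreignField": "order_id",
        "as": "items",
    }},
    # Materialize the result as a new collection tuned for that query.
    {"$out": "orders_with_items"},
])
```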
13

Srai, Aziz, Fatima Guerouate, and Hilal Drissi Lahsini. "The Integration of the MDA Approach in Document-Oriented NoSQL Databases, the case of Mongo DB." International Journal of Engineering and Advanced Technology (IJEAT) 10, no. 3 (2021): 115–22. https://doi.org/10.35940/ijeat.C2235.0210321.

Abstract:
Today, with the growth of the Internet, the use of social networks, mobile telephony, and connected, communicating objects, data has become so big that the need to exploit it has become paramount. In practice, a very large number of companies specializing in the health sector, the banking and financial sector, insurance, manufacturing, and so on rely on traditional databases that are often well organized around customer data, machine data, etc.; but in most cases, the very large volumes of data in these databases, and the speed with which they must be analyzed to meet the business needs of the company, are real challenges. This article addresses the problem of generating NoSQL MongoDB databases by applying an approach based on model-driven engineering (the Model Driven Architecture approach). We provide Model-to-Model transformations (using the QVT model transformation language) and Model-to-Code transformations (using the code generator Acceleo). We also propose vertical and horizontal transformations to demonstrate the validity of our approach on NoSQL MongoDB databases. In this article we have studied the PSM transformations towards the implementation; PIM-to-PSM transformations are the subject of another work.
14

Schmitt, Oliver, and Tim A. Majchrzak. "Document-Based Databases for Medical Information Systems and Crisis Management." International Journal of Information Systems for Crisis Response and Management 5, no. 3 (2013): 63–80. http://dx.doi.org/10.4018/ijiscram.2013070104.

Abstract:
Both for healthcare and crisis management, the usage of Information Systems (IS) has become routine; in fact, these fields are unthinkable without sophisticated IT support. Virtually all IS rely on data storage. Despite the document-oriented nature of medical datasets, relational databases (RDBMS) prevail. The authors evaluate a document-based database to assess its feasibility for the domains of healthcare and crisis support. To foster the understanding of this technology, the authors present the background of form-originated data storage, introduce document-based databases, and describe a use case relying on them. Based on their findings, the authors generalize the results with a focus on crisis management. The authors found good indications that document-based databases such as CouchDB are well suited for IS in medical contexts. They might be a feasible option for the future development of systems in various fields of healthcare, crisis response, and medical research.
15

Gusarenko, A. S. "Ситуационно-ориентированные базы данных: обработка гетерогенных документов микросервисов в документо-ориентированном хранилище". МОДЕЛИРОВАНИЕ, ОПТИМИЗАЦИЯ И ИНФОРМАЦИОННЫЕ ТЕХНОЛОГИИ 10, № 4(39) (2022): 3–4. http://dx.doi.org/10.26102/2310-6018/2022.39.4.003.

Abstract:
The research focuses on a situation-oriented approach to processing heterogeneous data obtained from microservices, which are widespread thanks to the microservice architecture underlying many information systems. Such information systems are sources of heterogeneous data provided to the user on request via the Internet. The data, in the form of documents, are provided by the services that make up the information system. The volume of such data can be large, and processing it requires the specialized technologies available in document-oriented big data stores. Within a situation-oriented database (SODB), a microservice is implemented that provides data in JSON format through its programming interface. This raises the problem of loading and processing large amounts of data in a store where specialized statistical Map-Reduce functions are implemented. The manual method of loading data and obtaining results in an SODB is laborious, because it requires routine operations for loading the data, applying functions to the loaded data, creating functions inside the store, and retrieving the results. This task was not considered within the scope of the project on creating situation-oriented databases, and the possibilities of developing specialized elements and methods for processing large-scale heterogeneous data in a hierarchical situational model with the required tooling had not been studied. The document-processing models developed here make processing heterogeneous data less laborious and help to create data-driven applications by means of situation-oriented databases, within the introduced data-processing model as part of a hierarchical situational model, drawing on the big data processing technologies of specialized document-oriented stores. The proposed tools are examined through the example of an SODB application for solving course-design problems in the educational process, using a developed microservice saturated with heterogeneous data collected during remote course design.
16

Hamaji, Kohei, and Yukikazu Nakamoto. "A MongoDB Document Reconstruction Support System Using Natural Language Processing." Software 3, no. 2 (2024): 206–25. http://dx.doi.org/10.3390/software3020010.

Abstract:
Document-oriented databases, a type of Not Only SQL (NoSQL) database, are gaining popularity owing to their flexibility in data handling and their performance on large-scale data. MongoDB, a typical document-oriented database, stores data in the JSON format, in which an upper field contains lower fields and fields under the same parent are related. One feature of this kind of document-oriented database is that data can be stored dynamically in an arbitrary location without explicitly defining a schema in advance. This flexibility can violate the above property and causes difficulties for application-program readability and database maintenance. To address these issues, we propose a reconstruction support method for document structures in MongoDB. The method uses the strength of the Has-A relationship between parent and child fields, as well as the similarity of field names in the MongoDB documents as determined by natural language processing, to reconstruct the data structure in MongoDB. As a result, the method transforms the parent and child fields into more coherent data structures. We evaluated our method using real-world data and demonstrated its effectiveness.
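A toy rendering of the field-name-similarity ingredient: fields whose names are lexically close become candidates for grouping during reconstruction. The paper uses NLP-based similarity together with Has-A strength; this sketch substitutes a simple difflib ratio and a made-up threshold.

```python
# Sketch: propose sibling fields that may belong under one parent.
from difflib import SequenceMatcher
from itertools import combinations

fields = ["user_name", "user_email", "username", "order_total", "order_date"]

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()

for a, b in combinations(fields, 2):
    score = similarity(a, b)
    if score > 0.7:                       # illustrative threshold
        print(f"group candidates: {a!r} and {b!r} (similarity {score:.2f})")
```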
17

Aguilar Vera, Raul, Andrés Naal Jácome, Julio Díaz Mendoza, and Omar Gómez Gómez. "NoSQL Database Modeling and Management: A Systematic Literature Review." Revista Facultad de Ingeniería 32, no. 65 (2023): e16519. http://dx.doi.org/10.19053/01211129.v32.n65.2023.16519.

Abstract:
The NoSQL databases that emerged this century were created to overcome the limitations of relational database systems in handling the different types of data that have appeared for information processing. In this paper, we present the results of a secondary study carried out to find and synthesize the research done so far on modeling processes, the characteristics of the data types used, and management tools for NoSQL databases. Currently, four types are recognized, classified according to the data model they use: key-value, document-oriented, column-based, and graph-based. This study identified that the most frequently used type of NoSQL database model is the document model, because it offers greater flexibility and versatility than the other three models, although it requires more complex search methods. In terms of data, column and document schemas are the ones that usually describe their characteristics. A trend was also observed towards the column-oriented and document-oriented models in the management tools; although they all provide the basic functionality, the differences lie in the way information is stored and the ways it can be accessed.
18

Wang, Reen-Cheng, David Yang, Ming-Che Hsieh, Yi-Cheng Chen, and Weihsuan Lin. "GenAI-Assisted Database Deployment for Heterogeneous Indigenous–Native Ethnographic Research Data." Applied Sciences 14, no. 16 (2024): 7414. http://dx.doi.org/10.3390/app14167414.

Abstract:
In ethnographic research, data collected through surveys, interviews, or questionnaires in the fields of sociology and anthropology often appear in diverse forms and languages. Building a powerful database system to store and process such data, as well as making good and efficient queries, is very challenging. This paper extensively investigates modern database technology to find out which technologies are best for storing these varied and heterogeneous datasets. The study examines several database categories: traditional relational databases; the NoSQL family of key-value, graph, and document databases; object-oriented databases; and vector databases, crucial for the latest artificial intelligence solutions. The research shows that when it comes to field data, the NoSQL lineup is the most appropriate, especially document and graph databases. The simplicity and flexibility found in document databases, and the advanced ability to deal with complex queries and rich data relationships attainable with graph databases, make these two types of NoSQL databases the ideal choice when a large amount of data has to be processed. Advancements in vector databases that embed custom metadata offer new possibilities for detailed analysis and retrieval. However, converting content into vector data remains challenging, especially in regions with unique oral traditions and languages. Constructing such databases is labor-intensive and requires domain experts to define metadata and relationships, posing a significant burden for research teams with extensive data collections. To this end, this paper proposes using Generative AI (GenAI) to help in the data-transformation process, a recommendation supported by testing in which GenAI proved itself a strong supplement to document and graph databases. It also discusses two methods of vector database support that are currently viable, although each has drawbacks and benefits.
19

Panwar, Avnish. "Data migration from SQL to NoSQL using snapshot- Livestream migration." Mathematical Statistician and Engineering Applications 70, no. 2 (2021): 1600–1608. http://dx.doi.org/10.17762/msea.v70i2.2450.

Abstract:
The process of moving data from a source database to a destination database is known as data migration. For a variety of reasons, including higher data handling capacity, improved speed, and scalability, many businesses are choosing to convert their databases from one kind (e.g., RDBMS) to another (e.g., NoSQL). Sqoop [3], mongoimport [2], and mongify [1] are a few techniques and technologies that have been developed to help with this transition from RDBMS to NoSQL databases. NoSQL databases use different models, as opposed to the relational model employed by RDBMS, including document, graph, and key-value. Large data volumes were the main focus of the design of NoSQL databases. The database migration model we provide in this paper can effectively transfer both real-time and historical data in parallel. Our Java-based model focuses on transferring data from MongoDB, a document-oriented NoSQL database, to MySQL, an RDBMS. The prototype we created can migrate both live data and a snapshot of the database at a particular moment in time simultaneously. Our experimental evaluation shows that, in terms of performance for both snapshot and live data migration, our model beats competing approaches.
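A minimal sketch of the parallel snapshot-plus-live idea: one thread bulk-copies a point-in-time snapshot while another applies live changes from a MongoDB change stream. To keep the sketch self-contained, sqlite3 stands in for MySQL; names are hypothetical, change streams require a replica set, and error handling is omitted.

```python
# Sketch: snapshot copy and live change-stream migration running in parallel.
import sqlite3
import threading
from pymongo import MongoClient

source = MongoClient("mongodb://localhost:27017")["appdb"]["events"]

sql = sqlite3.connect("target.db", check_same_thread=False)
sql.execute("CREATE TABLE IF NOT EXISTS events (id TEXT PRIMARY KEY, payload TEXT)")
lock = threading.Lock()

def upsert(doc):
    with lock:
        sql.execute("INSERT OR REPLACE INTO events VALUES (?, ?)",
                    (str(doc["_id"]), str(doc)))
        sql.commit()

def copy_snapshot():
    for doc in source.find():          # point-in-time bulk copy
        upsert(doc)

def follow_live():
    # Apply inserts/updates arriving while the snapshot is being copied.
    with source.watch(full_document="updateLookup") as stream:
        for change in stream:
            if change["operationType"] in ("insert", "update", "replace"):
                upsert(change["fullDocument"])

threading.Thread(target=follow_live, daemon=True).start()
copy_snapshot()
```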
20

Zhang, Xiaohui, Songkun Jiao, Junfeng Wang, and Cuilei Yang. "Modeling of Security and Privacy Architecture for Protecting Databases in Cloud Computing Infrastructure." Scalable Computing: Practice and Experience 26, no. 3 (2025): 1300–1307. https://doi.org/10.12694/scpe.v26i3.4210.

Abstract:
To prevent data leakage, ensure the security of tenants' private data, and enable tenants to understand precisely the security level of their private data, the author proposes a model of a security and privacy architecture for protecting databases in cloud computing infrastructure. The author proposes a document-database privacy-protection architecture that builds upon the existing architecture by adding a privacy-protection layer between the application layer and the storage layer, forming a new service deployment architecture. The author then introduces the basic methods of privacy protection for document databases. To suit the document-based storage structure of document databases, the author designed a basic privacy-protection method based on segmentation and obfuscation. By exploiting the schema-free nature of document-oriented databases, private data can be protected through appropriate segmentation. For nested document structures, the author designed a document structure tree to retain document-structure information. Comparing the experimental data of the 50w and 100w groups horizontally shows that, at the same cutoff score, query time increases as the amount of data in the database increases; the query time of database A increased by nearly 300 ms compared to database B, and the additional time of the other groups is roughly the same. Comparing the experimental data of the 50w and 100w groups vertically shows that the query time of database C is nearly 200 ms longer than that of database A, and that as the sharding factor increases, the query time also increases, although the rate of increase begins to slow. Comparing data with the same segmentation factor but different data volumes shows that the impact of data volume on query time is positively correlated. The model ensures that, after privacy protection, the database system remains transparent to users at the view layer, and it preserves the correctness and integrity of the private data.
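A toy illustration of privacy protection by segmentation: sensitive and non-sensitive fields are split into fragments linked only by a random join key, so neither fragment alone reveals the full record. The two-way split and field names are assumptions for illustration; the paper's architecture adds obfuscation and a structure tree for nested documents.

```python
# Sketch: split a document into linked public and private fragments.
import secrets

def segment(doc: dict, sensitive: set[str]) -> tuple[dict, dict]:
    link = secrets.token_hex(8)                 # random, meaningless join key
    public = {k: v for k, v in doc.items() if k not in sensitive}
    private = {k: v for k, v in doc.items() if k in sensitive}
    public["seg_id"] = private["seg_id"] = link
    return public, private

patient = {"name": "A. B.", "city": "Xi'an", "diagnosis": "..."}
pub, priv = segment(patient, sensitive={"name", "diagnosis"})
print(pub)   # stored in one collection/shard
print(priv)  # stored in another; recombined at the privacy layer
```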
21

Ait El Mouden, Zakariyaa, and Abdeslam Jakimi. "A New Algorithm for Storing and Migrating Data Modelled by Graphs." International Journal of Online and Biomedical Engineering (iJOE) 16, no. 11 (2020): 137. http://dx.doi.org/10.3991/ijoe.v16i11.15545.

Abstract:
NoSQL databases have moved from theoretical solutions for exceeding the limits of relational databases to a practical and indisputable approach to storing and manipulating big data. In terms of variety, NoSQL databases store heterogeneous data without being obliged to respect a predefined schema, as is the case in relational and object-relational databases. NoSQL solutions also surpass traditional databases in storage capacity; consider MongoDB, for example, a document-oriented database capable of storing an unlimited number of documents, with a maximum size of 32 TB depending on the machine that runs the database and on the operating system. In terms of velocity, many studies have compared the execution times of different transactions and shown that NoSQL databases are a good fit for real-time applications. This paper presents an algorithm for storing data modeled by graphs as NoSQL documents. The purpose of this study is to exploit the large amount of data stored in SQL databases and to make such data usable by recent clustering algorithms and other data science tools. The study links relational data to document datastores by defining an effective algorithm for reading relational data, modeling those data as graphs, and storing them as NoSQL documents.
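One common way to store a graph as NoSQL documents, for orientation, is one document per vertex carrying its adjacency list; the sketch below shows that representation, which is not necessarily the exact one defined by the paper's algorithm.

```python
# Sketch: serialize a small graph as one document per vertex.
graph = {                      # adjacency representation of a toy graph
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
}

documents = [
    {"_id": node, "neighbors": neighbors, "degree": len(neighbors)}
    for node, neighbors in graph.items()
]

# e.g. MongoClient(...)["graphdb"]["nodes"].insert_many(documents)
for d in documents:
    print(d)
```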
22

Maicha, Mohammed ElHabib, Youcef Ouinten, and Benameur Ziani. "UML4NoSQL: A Novel Approach for Modeling NoSQL Document-Oriented Databases Based on UML." Computing and Informatics 41, no. 3 (2022): 813–33. http://dx.doi.org/10.31577/cai_2022_3_813.

23

Саркисян, И. А., and Е. С. Гаврилюк. "ADVANTAGES OF USING DOCUMENT-ORIENTED DATABASES FOR ORGANIZING DATA STORAGE AND PROCESSING IN BUSINESS." Journal of Monetary Economics and Management, no. 4 (August 9, 2024): 22–28. http://dx.doi.org/10.26118/2782-4586.2024.59.78.003.

Abstract:
The article presents the advantages and capabilities of the NoSQL data-storage paradigm. Its key architectural features and the main prospects for practical application in business are considered in detail. This type of database is one of the most widespread alternatives to classical relational models, which are not always able to cover the full range of needs at the stages of organizing storage, ETL processing, and data analysis. The paper carries out a comparative analysis of the main metrics of the two approaches and identifies the industries and areas of business and science where the use of document-oriented databases will be most in demand and will offer more opportunities for effective manipulation of information. In conclusion, a real business data model is presented; testing its functionality confirmed the hypothesis about the advantages and prospects of document-oriented databases.
24

Srai, Aziz, Fatima Guerouate, and Hilal Drissi Lahsini. "The Integration of the MDA Approach in Document-Oriented NoSQL Databases, the case of Mongo DB." International Journal of Engineering and Advanced Technology 10, no. 3 (2021): 115–22. http://dx.doi.org/10.35940/ijeat.c2235.0210321.

Abstract:
Today, with the growth of the Internet, the use of social networks, mobile telephony, and connected, communicating objects, data has become so big that the need to exploit it has become paramount. In practice, a very large number of companies specializing in the health sector, the banking and financial sector, insurance, manufacturing, and so on rely on traditional databases that are often well organized around customer data, machine data, etc.; but in most cases, the very large volumes of data in these databases, and the speed with which they must be analyzed to meet the business needs of the company, are real challenges. This article addresses the problem of generating NoSQL MongoDB databases by applying an approach based on model-driven engineering (the Model Driven Architecture approach). We provide Model-to-Model transformations (using the QVT model transformation language) and Model-to-Code transformations (using the code generator Acceleo). We also propose vertical and horizontal transformations to demonstrate the validity of our approach on NoSQL MongoDB databases. In this article we have studied the PSM transformations towards the implementation; PIM-to-PSM transformations are the subject of another work.
25

Ismael Imran, Inas. "Enhancement of NoSQL Database Performance Using Parallel Processing." Journal of Information Systems Engineering and Management 9, no. 2 (2024): 26126. http://dx.doi.org/10.55267/iadt.07.14670.

Abstract:
In the burgeoning realm of big data, document-oriented NoSQL databases stand out for their flexibility and scalability. This paper delves into the optimization of these databases, specifically through the lens of parallel processing techniques. A comparative study was conducted against the traditional non-parallel approaches, where marked performance enhancements were observed. For instance, the execution time for retrieving movies of a specific year decreased by over 80% when parallel processing was applied, plummeting from 1.578765 seconds to a brisk 0.300000 seconds. Memory usage and CPU utilization were meticulously recorded, revealing up to a 70% reduction in peak memory consumption in certain queries, and a moderate fluctuation in CPU usage between 49.25% to 75.2%. This indicates not only improved efficiency but also a prudent utilization of system capacity, without overtaxing resources. However, the study identified scenarios, such as highly complex queries, where the gains from parallel processing were less pronounced, suggesting a marginal improvement in CPU utilization. While the findings advocate for the adoption of parallel processing in handling intensive data retrieval tasks, it is recommended that future research should further scrutinize the scalability thresholds and explore alternative parallelization strategies to fortify the efficacy of document-oriented NoSQL databases.
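A minimal sketch of the parallelization idea: a query over a movie collection is partitioned by year range and fetched by worker processes concurrently. Database and field names are hypothetical; each worker opens its own MongoClient, since clients are not fork-safe.

```python
# Sketch: fetch partitions of one query in parallel worker processes.
from multiprocessing import Pool
from pymongo import MongoClient

def fetch_range(years):
    lo, hi = years
    coll = MongoClient("mongodb://localhost:27017")["media"]["movies"]
    return list(coll.find({"year": {"$gte": lo, "$lt": hi}}))

if __name__ == "__main__":
    ranges = [(1990, 2000), (2000, 2010), (2010, 2020)]
    with Pool(processes=3) as pool:
        parts = pool.map(fetch_range, ranges)
    movies = [m for part in parts for m in part]
    print(len(movies), "documents retrieved in parallel")
```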
26

Materynska, Sofiia, Vadym Yaremenko, and Walery Rogoza. "A theoretically proposed algorithm in a decision tree format for choosing an efficient storage type of large datasets." Technology audit and production reserves 1, no. 2(63) (2022): 6–9. https://doi.org/10.15587/2706-5448.2022.251281.

Abstract:
The object of research is methods and approaches for improving storage efficiency and optimizing access to large amounts of data. The importance of this study lies in the wide dissemination of big data and the need to select correctly the technologies that will help improve the efficiency of big data processing systems. The choice is complicated by the large number of different data stores and databases now available, so the best decision requires a deep understanding of the advantages, disadvantages, and features of each; the difficulty lies in the lack of a universal algorithm for deciding on the optimal storage. Accordingly, based on experiments and an analysis of existing projects and research papers, a decision-making algorithm is proposed that determines the best way to store large datasets depending on their characteristics and additional system requirements. This is needed to simplify system design in the early stages of big data processing projects. By highlighting the key differences, as well as the disadvantages and advantages of each type of storage and database, a list was compiled of the key characteristics of the data and the future system that should be considered during design. The algorithm is a theoretical proposal based on the research papers studied. Using it at the design stage of a system makes it possible to determine quickly and clearly the optimal type of storage for large datasets. The paper considers column-oriented, document-oriented, graph, and key-value database types, as well as distributed file systems and cloud services.
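A toy rendering of such a decision tree as code; the questions, their order, and the recommendations are illustrative assumptions, not the paper's exact tree.

```python
# Sketch: map a few dataset characteristics to a storage recommendation.
def choose_storage(structured: bool, relationship_heavy: bool,
                   simple_key_lookups: bool, tabular_analytics: bool) -> str:
    if not structured:
        return "distributed file system / cloud object storage"
    if relationship_heavy:
        return "graph database"
    if simple_key_lookups:
        return "key-value store"
    if tabular_analytics:
        return "column-oriented database"
    return "document-oriented database"

print(choose_storage(structured=True, relationship_heavy=False,
                     simple_key_lookups=False, tabular_analytics=False))
```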
27

Martinez-Mosquera, Diana, Rosa Navarrete, and Sergio Lujan-Mora. "Modeling and Management Big Data in Databases—A Systematic Literature Review." Sustainability 12, no. 2 (2020): 634. http://dx.doi.org/10.3390/su12020634.

Abstract:
The work presented in this paper is motivated by the acknowledgement that a complete and updated systematic literature review (SLR) consolidating all the research efforts on Big Data modeling and management is missing. This study answers three research questions. The first question is how the number of published papers about Big Data modeling and management has evolved over time. The second question is whether the research is focused on semi-structured and/or unstructured data and what techniques are applied. Finally, the third question determines what trends and gaps exist according to three key concepts: the data source, the modeling, and the database. As a result, 36 studies, collected from the most important scientific digital libraries and covering the period between 2010 and 2019, were deemed relevant. Moreover, we present a complete bibliometric analysis in order to provide detailed information about the authors and the publication data in a single document. This SLR reveals very interesting facts. For instance, Entity-Relationship and document-oriented are the most researched models at the conceptual and logical abstraction levels, respectively, and MongoDB is the most frequent implementation at the physical level. Furthermore, 2.78% of the studies have proposed approaches oriented to hybrid databases, with a real case for structured, semi-structured, and unstructured data.
28

Martínez-Mosquera, Diana, Rosa Navarrete, and Sergio Luján-Mora. "Modeling and Management Big Data in Databases—A Systematic Literature Review." Sustainability 12, no. 2 (2020): 1–41. https://doi.org/10.3390/su12020634.

Abstract:
The work presented in this paper is motivated by the acknowledgement that a complete and updated systematic literature review (SLR) consolidating all the research efforts on Big Data modeling and management is missing. This study answers three research questions. The first question is how the number of published papers about Big Data modeling and management has evolved over time. The second question is whether the research is focused on semi-structured and/or unstructured data and what techniques are applied. Finally, the third question determines what trends and gaps exist according to three key concepts: the data source, the modeling, and the database. As a result, 36 studies, collected from the most important scientific digital libraries and covering the period between 2010 and 2019, were deemed relevant. Moreover, we present a complete bibliometric analysis in order to provide detailed information about the authors and the publication data in a single document. This SLR reveals very interesting facts. For instance, Entity-Relationship and document-oriented are the most researched models at the conceptual and logical abstraction levels, respectively, and MongoDB is the most frequent implementation at the physical level. Furthermore, 2.78% of the studies have proposed approaches oriented to hybrid databases, with a real case for structured, semi-structured, and unstructured data.
29

Mironov, V. V., A. S. Gusarenko, and N. I. Yusupova. "Ситуационно-ориентированные базы данных: обработка офисных документов". МОДЕЛИРОВАНИЕ, ОПТИМИЗАЦИЯ И ИНФОРМАЦИОННЫЕ ТЕХНОЛОГИИ 10, № 2(37) (2022): 21–22. http://dx.doi.org/10.26102/2310-6018/2022.37.2.021.

Abstract:
This article discusses the application of a situation-oriented approach to the problem of extracting semantic information from office documents. Office documents created by vector graphics editors and word processors are reviewed. The ability to extract semantic information stems from the fact that such documents are based on open XML formats that can be processed by external programs. Processing of documents based on a situation-oriented database, where Word documents are programmatically loaded as XML files extracted from ZIP archives, is considered. In the situation-oriented database, an office document can be presented as a virtual document that is mapped both onto XML files and onto the ZIP archive containing the XML files. This applies not only to text documents but also to graphic documents that have an internal XML representation, which enables the processing of documents in Office Open XML and OpenDocument Format. The article discusses various aspects of identifying and finding the necessary information during document processing by means of standard definitions such as bookmarks, key phrases, and text labels. Models and algorithms for extracting the required information are examined. Examples are given of the practical use of this approach in the distance learning of university students. In addition, an example of extracting the metadata of scientific publications in the Open Journal Systems publishing platform is considered.
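The core enabler, that an office document is a ZIP archive of XML parts, can be illustrated with a few lines of standard-library Python; the file name is hypothetical, and the SODB machinery described in the article is far richer than this plain-text extraction.

```python
# Sketch: treat a DOCX file as a ZIP of XML parts and pull out its text.
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

with zipfile.ZipFile("report.docx") as zf:           # hypothetical document
    root = ET.fromstring(zf.read("word/document.xml"))

# w:t elements hold the visible text runs.
text = " ".join(t.text for t in root.iter(f"{W}t") if t.text)
print(text)
```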
30

Hainebach, Richard. "The EUROCAT project — the integration of European Community multidisciplinary and document‐oriented databases on CDROM." Electronic Library 11, no. 4/5 (1993): 319–26. http://dx.doi.org/10.1108/eb045254.

31

Davardoost, Farnaz, Amin Babazadeh Sangar, and Kambiz Majidzadeh. "An Innovative Model for Extracting OLAP Cubes from NOSQL Database Based on Scalable Naïve Bayes Classifier." Mathematical Problems in Engineering 2022 (April 11, 2022): 1–11. http://dx.doi.org/10.1155/2022/2860735.

Abstract:
Due to large amounts of unstructured data, relational databases are no longer suitable for data management, and new databases known as NoSQL have been introduced. The issue is that such databases are difficult to analyze. Online analytical processing (OLAP) is the foundational technology for data analysis in business intelligence, but because these technologies were designed primarily for relational database systems, performing OLAP on NoSQL is difficult. In this article, we present a model for extracting OLAP cubes from a document-oriented NoSQL database. A scalable Naïve Bayes classifier method is used for this purpose. The proposed solution is divided into three stages: preparation, Naïve Bayes, and NBMR. Our proposed algorithm, NBMR, is based on the Naïve Bayes classifier (NBC) and the MapReduce (MR) programming model. Each NoSQL database document with nearly the same attributes will belong to the same class, and as a result OLAP cubes can be used to perform data analysis. Because the proposed model allows distributed and parallel Naïve Bayes classifier computation, it is appropriate for large-scale data sets. Our proposed model is an efficient approach in terms of speed and the reduced number of required comparisons.
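To illustrate the decomposition (not the authors' full pipeline), Naïve Bayes training statistics can be computed map-reduce style, so per-class attribute counts are gathered shard by shard; the documents and classes below are made up.

```python
# Sketch: map/reduce-style gathering of per-class attribute counts.
from collections import Counter
from functools import reduce

docs = [
    {"class": "sales", "fields": ["amount", "customer", "date"]},
    {"class": "sales", "fields": ["amount", "customer", "region"]},
    {"class": "hr",    "fields": ["employee", "salary", "date"]},
]

def map_phase(doc):
    # Emit ((class, field), 1) pairs, as a mapper would.
    return Counter({(doc["class"], f): 1 for f in doc["fields"]})

def reduce_phase(a, b):
    # Sum counts across shards, as reducers would.
    return a + b

counts = reduce(reduce_phase, map(map_phase, docs))
print(counts)   # per-class field frequencies feeding the NB probabilities
```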
32

Dourhri, Ahmed, Mohamed Hanine, and Hassan Ouahmane. "KVMod—A Novel Approach to Design Key-Value NoSQL Databases." Information 14, no. 10 (2023): 563. http://dx.doi.org/10.3390/info14100563.

Abstract:
The growth of structured, semi-structured, and unstructured data produced by new applications is a result of the development and expansion of social networks, the Internet of Things, web technology, mobile devices, and other technologies. As traditional databases became less suitable for managing the rapidly growing quantity of data and the variety of data structures, a new class of database management systems named NoSQL was required to satisfy the new requirements. Although NoSQL databases are generally schema-less, significant research has been conducted on their design. The literature review presented in this paper supports our claim that modeling techniques are needed to describe how to structure data in NoSQL databases. Key-value is one of the NoSQL families that has received too little attention, especially in terms of design methodology; most studies have focused on the other families, such as column-oriented and document-oriented. This paper presents a design approach named KVMod (key-value modeling) specific to key-value databases. The purpose is to provide the scientific community and engineers with a methodology for the design of key-value stores with maximum automation and therefore minimum human intervention, which means a minimum number of errors. A software tool called KVDesign has been implemented to automate the proposed methodology and, thus, the most time-consuming database modeling tasks. The complexity of the proposed algorithms is also discussed to assess their efficiency.
33

Schulz, S., M. L. Müller, W. Dzeyk, et al. "Subword-based Semantic Retrieval of Clinical and Bibliographic Documents." Methods of Information in Medicine 49, no. 02 (2010): 141–47. http://dx.doi.org/10.3414/me9303.

Abstract:
Objectives: The increasing amount of electronically available documents in bibliographic databases and in clinical documentation requires user-friendly techniques for content retrieval. Methods: A domain-specific approach to semantic text indexing for document retrieval is presented. It is based on a subword thesaurus and maps the content of texts in different European languages to a common interlingual representation, which supports searching across multilingual document collections. Results: Three use cases are presented in which the semantic retrieval method has been implemented: a bibliographic database, a departmental EHR system, and a consumer-oriented Web portal. Conclusions: A semantic indexing and retrieval approach, whose performance had already been empirically assessed in prior studies, proved useful in different prototypical and routine scenarios and was well accepted by several user groups.
34

Fulachier, Jérôme, Jérôme Odier, and Fabian Lambert. "Design principles of the Metadata Querying Language (MQL) implemented in the ATLAS Metadata Interface (AMI) ecosystem." EPJ Web of Conferences 245 (2020): 04044. http://dx.doi.org/10.1051/epjconf/202024504044.

Abstract:
This document describes the design principles of the Metadata Querying Language (MQL) implemented in the ATLAS Metadata Interface (AMI), a metadata-oriented domain-specific language that allows databases to be queried without knowing the relations between tables. With this simplified yet generic grammar, MQL permits writing complex queries more simply than with the Structured Query Language (SQL).
35

Kannan, Rajesh, and R. Mala. "Analysis of Encryption Techniques to Enhance Secure Data Transmission." International Journal of Engineering and Computer Science 7, no. 09 (2018): 24311–18. http://dx.doi.org/10.18535/ijecs/v7i9.04.

Abstract:
With the rapid advance of technology, the amount of data stored and transmitted between client and server has increased tremendously. To provide high security for confidential data, proper encryption techniques need to be followed by the organizations concerned. This paper presents an analysis of various encryption algorithms and their performance in handling private data with authentication, access control, secure configuration, and data encryption. Document-oriented databases such as MongoDB, Cassandra, CouchDB, Redis, and Hypertable are compared on the basis of their security aspects, since they manipulate huge amounts of unstructured data in their databases. It is argued that each database has its own security weaknesses, which emphasizes the need for proper encryption methods to secure the data stored in them.
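As one concrete example of the client-side field encryption such findings call for, a document's sensitive field can be encrypted before it is stored; Fernet from the `cryptography` package is one reasonable symmetric choice, and the record shown is hypothetical.

```python
# Sketch: encrypt a sensitive field before the document reaches the store.
from cryptography.fernet import Fernet

key = Fernet.generate_key()        # in practice: managed by a KMS, not inline
f = Fernet(key)

record = {"user": "alice", "card_number": "4111 1111 1111 1111"}

# Encrypt only the sensitive field; the rest stays queryable in plaintext.
stored = dict(record, card_number=f.encrypt(record["card_number"].encode()))
print(stored)

# Decrypt on read.
print(f.decrypt(stored["card_number"]).decode())
```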
36

Martinez-Mosquera, Diana, Rosa Navarrete, Sergio Luján-Mora, Lorena Recalde, and Andres Andrade-Cabrera. "Integrating OLAP with NoSQL Databases in Big Data Environments: Systematic Mapping." Big Data and Cognitive Computing 8, no. 6 (2024): 64. http://dx.doi.org/10.3390/bdcc8060064.

Abstract:
The growing importance of data analytics is leading to a shift in data management strategy at many companies, moving away from simple data storage towards adopting Online Analytical Processing (OLAP) query analysis. Concurrently, NoSQL databases are gaining ground as the preferred choice for storing and querying analytical data. This article presents a comprehensive, systematic mapping, aiming to consolidate research efforts related to the integration of OLAP with NoSQL databases in Big Data environments. After identifying 1646 initial research studies from scientific digital repositories, a thorough examination of their content resulted in the acceptance of 22 studies. Utilizing the snowballing technique, an additional three studies were selected, culminating in a final corpus of twenty-five relevant articles. This review addresses the growing importance of leveraging NoSQL databases for OLAP query analysis in response to increasing data analytics demands. By identifying the most commonly used NoSQL databases with OLAP, such as column-oriented and document-oriented, prevalent OLAP modeling methods, such as Relational Online Analytical Processing (ROLAP) and Multidimensional Online Analytical Processing (MOLAP), and suggested models for batch and real-time processing, among other results, this research provides a roadmap for organizations navigating the integration of OLAP with NoSQL. Additionally, exploring computational resource requirements and performance benchmarks facilitates informed decision making and promotes advancements in Big Data analytics. The main findings of this review provide valuable insights and updated information regarding the integration of OLAP cubes with NoSQL databases to benefit future research, industry practitioners, and academia alike. This consolidation of research efforts not only promotes innovative solutions but also promises reduced operational costs compared to traditional database systems.
37

Martinez-Mosquera, Diana, Rosa Navarrete, Sergio Luján-Mora, Lorena Recalde, and Andres Andrade-Cabrera. "Integrating OLAP with NoSQL Databases in Big Data Environments: Systematic Mapping." Big Data and Cognitive Computing 8, no. 6 (2024): 1–29. https://doi.org/10.3390/bdcc8060064.

Abstract:
The growing importance of data analytics is leading to a shift in data management strategy at many companies, moving away from simple data storage towards adopting Online Analytical Processing (OLAP) query analysis. Concurrently, NoSQL databases are gaining ground as the preferred choice for storing and querying analytical data. This article presents a comprehensive, systematic mapping, aiming to consolidate research efforts related to the integration of OLAP with NoSQL databases in Big Data environments. After identifying 1646 initial research studies from scientific digital repositories, a thorough examination of their content resulted in the acceptance of 22 studies. Utilizing the snowballing technique, an additional three studies were selected, culminating in a final corpus of twenty-five relevant articles. This review addresses the growing importance of leveraging NoSQL databases for OLAP query analysis in response to increasing data analytics demands. By identifying the most commonly used NoSQL databases with OLAP, such as column-oriented and document-oriented, prevalent OLAP modeling methods, such as Relational Online Analytical Processing (ROLAP) and Multidimensional Online Analytical Processing (MOLAP), and suggested models for batch and real-time processing, among other results, this research provides a roadmap for organizations navigating the integration of OLAP with NoSQL. Additionally, exploring computational resource requirements and performance benchmarks facilitates informed decision making and promotes advancements in Big Data analytics. The main findings of this review provide valuable insights and updated information regarding the integration of OLAP cubes with NoSQL databases to benefit future research, industry practitioners, and academia alike. This consolidation of research efforts not only promotes innovative solutions but also promises reduced operational costs compared to traditional database systems.
APA, Harvard, Vancouver, ISO, and other styles
38

Stetsyk, Oleksii, and Svitlana Terenchuk. "COMPARATIVE ANALYSIS OF NOSQL DATABASES ARCHITECTURE." Management of Development of Complex Systems, no. 47 (September 27, 2021): 78–82. http://dx.doi.org/10.32347/2412-9933.2021.47.78-82.

Full text
Abstract:
This article studies the problems raised by the growing scale of, and requirements for, modern high-load distributed systems. The work is relevant because a database is a key component of every such system. The paper highlights the main problems associated with using relational databases in many high-load distributed systems, focusing on properties such as data consistency, availability, and system stability. Basic information is provided about the architecture and purpose of non-relational wide-column, key-value, and document-oriented databases. The advantages and disadvantages of the different types of non-relational databases are shown; these manifest themselves in different tasks depending on the purpose and features of the system. The choice of non-relational databases of different types for comparative analysis is substantiated. Databases such as Cassandra, Redis, and MongoDB, which have long been used in high-load distributed systems and are well established among users, are studied in detail. The main task addressed in this article was to determine when it is expedient to use the Cassandra, Redis, or MongoDB architecture, depending on the characteristics of the system and on whether it mainly reads or writes information. Based on this analysis, options are proposed for using these databases in systems with a high number of read or write requests.
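As a rough illustration of the access patterns compared in the article, the sketch below touches each of the three systems through their standard Python drivers. Hosts, keyspaces, table names, and fields are placeholders; the article itself does not prescribe this code.

```python
# Minimal sketch of the three access patterns compared in the paper.
# Hosts, keyspaces, and field names are placeholders, not from the article.
import redis
from pymongo import MongoClient
from cassandra.cluster import Cluster

# Redis: key-value store, fastest for simple reads/writes by key.
r = redis.Redis(host="localhost", port=6379)
r.set("session:42", "user=alice")
print(r.get("session:42"))

# MongoDB: document-oriented, flexible nested records and rich queries.
users = MongoClient("mongodb://localhost:27017")["app"]["users"]
users.insert_one({"name": "alice", "roles": ["reader", "writer"]})
print(users.find_one({"name": "alice"}))

# Cassandra: wide-column store, write-optimized and partitioned for heavy ingest.
session = Cluster(["localhost"]).connect("app")
session.execute(
    "INSERT INTO events (user, ts, action) VALUES (%s, toTimestamp(now()), %s)",
    ("alice", "login"),
)
```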
APA, Harvard, Vancouver, ISO, and other styles
39

Wulandari, Fera Tri. "PEMODELAN BASIS DATA AKADEMIK UNIVERSITAS XYZ MENGGUNAKAN PENDEKATAN OBJEK [Academic Database Modeling of XYZ University Using an Object Approach]." JITU : Journal Informatic Technology And Communication 3, no. 1 (2019): 52–57. http://dx.doi.org/10.36596/jitu.v3i1.68.

Full text
Abstract:
Academic database modeling with an object approach is carried out by examining system problems in terms of the real-world objects to which they correspond. Academic database design bridges the requirements specification and the implementation; it supports the development of academic information systems and determines the quality of the information they produce. The academic database of XYZ University is designed using object-oriented database modeling. The methodology relies on document collection and interviews with the parties involved in processing academic data. Once complete data about the object domain has been gathered, the database can be designed; this design is the first step towards building the application. The modeling activities comprise the analysis and design of the academic database. The resulting database can serve as a reference for designing academic information systems so that the information produced is accurate, timely, and relevant.
APA, Harvard, Vancouver, ISO, and other styles
40

Peretiatko, Mariia, Mariia Shirokopetleva, and Natalya Lesna. "RESEARCH OF METHODS TO SUPPORT DATA MIGRATION BETWEEN RELATIONAL AND DOCUMENT DATA STORAGE MODELS." Innovative Technologies and Scientific Solutions for Industries, no. 2 (20) (June 30, 2022): 64–74. http://dx.doi.org/10.30837/itssi.2022.20.064.

Full text
Abstract:
The subject matter of the article is heterogeneous data migration between relational and document-oriented data storage models, the existing strategies and methods that support such migrations, and the use of relational algebra and set theory, in the context of databases, to build a new data migration algorithm. The goal of the work is to examine the features and procedure of data migration, explore methods that support migration between relational and document data models, and construct a mathematical model and algorithm for data migration. The following methods were used: analysis and comparison of existing approaches to data migration; selection of a strategy for subsequent use in the migration algorithm; mathematical modeling of the heterogeneous data migration algorithm; and formalization of that algorithm. The article solves the following tasks: reviewing the concept and types of data migration; justifying the choice of a document-oriented data model as the migration target; analyzing the existing literature on methods and strategies for heterogeneous data migration from the relational to the document-oriented model; highlighting the advantages and disadvantages of existing methods; choosing an approach for constructing the migration algorithm; building and describing a mathematical model of data migration using relational algebra and set theory; and presenting a query-driven data migration algorithm. The following results were obtained: relational algebra and set theory were applied in the context of data models, queries, and model redesign; a migration strategy covering relational and document-oriented data models was chosen; and the algorithm applying this method is described. Conclusions: the main methods supporting migration between different data storage models are analyzed; a mathematical model is built using relational algebra and set theory; and an algorithm for transforming a relational data model into a document-oriented one is presented. The algorithm is suitable for use on real examples and is the subject of further research and possible improvement, including analysis of its efficiency in comparison with other methods.
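A minimal sketch of the core transformation the article formalizes, embedding the rows of a child relation into the documents of a parent relation, might look as follows in plain Python. This is not the authors' algorithm, only the denormalization step it builds on; the table and field names are illustrative.

```python
# Sketch of denormalizing a 1-to-many relational join into nested documents,
# the basic step behind relational-to-document migration. Illustrative data.
orders = [
    {"order_id": 1, "customer_id": 10, "total": 99.5},
    {"order_id": 2, "customer_id": 10, "total": 15.0},
]
customers = [{"customer_id": 10, "name": "ACME"}]

def to_documents(parents, children, key):
    """Embed each parent's matching child rows into one nested document."""
    docs = []
    for p in parents:
        doc = dict(p)
        doc["orders"] = [c for c in children if c[key] == p[key]]
        docs.append(doc)
    return docs

print(to_documents(customers, orders, "customer_id"))
# -> one document per customer with its orders embedded, ready for
#    insertion into a document store such as MongoDB.
```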
APA, Harvard, Vancouver, ISO, and other styles
41

Benmakhlouf, Amine. "An model for structured the NoSQL databases based on machine learning classifiers." International Journal of Informatics and Communication Technology (IJ-ICT) 14, no. 1 (2025): 229. https://doi.org/10.11591/ijict.v14i1.pp229-239.

Full text
Abstract:
Today, the majority of the data generated and processed in organizations is unstructured, and NoSQL database management systems are used to manage it. The problem is that such unstructured databases cannot be analyzed by traditional OLAP treatments, which are mainly applied to structured relational databases. In order to apply OLAP analyses to NoSQL data, this data must first be structured. In this paper, we propose a model for structuring the data of a document-oriented NoSQL database using machine learning (ML). The method comprises three steps: vectorization of the documents, learning via different ML algorithms, and classification, which guarantees that documents with the same structure belong to the same collection. A data warehouse can then be modeled in order to create OLAP cubes. Since the models found by learning allow the classifier to be computed in parallel, our approach offers a speed advantage: it avoids doubly iterative algorithms that rely on textual comparisons (TC). A comparative performance study is carried out in this work to identify the most efficient methods for this type of classification.
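A hedged sketch of the three-step pipeline the abstract describes (vectorization, learning, classification) is shown below using scikit-learn. The sample documents, target collections, and choice of classifier are assumptions made for illustration; the paper evaluates several ML algorithms rather than this particular one.

```python
# Hedged sketch: vectorize each JSON document by the set of field paths it
# contains, then train a classifier so that documents with the same structure
# map to the same collection. Documents, labels, and model are illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def field_paths(doc, prefix=""):
    """Flatten a (possibly nested) document into its list of key paths."""
    paths = []
    for k, v in doc.items():
        path = f"{prefix}{k}"
        paths.append(path)
        if isinstance(v, dict):
            paths.extend(field_paths(v, path + "."))
    return paths

docs = [
    {"name": "a", "price": 3, "stock": {"qty": 5}},   # product-like structure
    {"name": "b", "price": 7, "stock": {"qty": 1}},   # product-like structure
    {"user": "x", "visited": "2024-01-01"},           # log-like structure
]
labels = ["products", "products", "logs"]             # target collections

corpus = [" ".join(field_paths(d)) for d in docs]
vec = CountVectorizer(token_pattern=r"\S+")           # one token per key path
X = vec.fit_transform(corpus)                         # structure vectors
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X[2]))                              # -> ['logs']
```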
APA, Harvard, Vancouver, ISO, and other styles
43

Maté, Alejandro, Jesús Peral, Juan Trujillo, Carlos Blanco, Diego García-Saiz, and Eduardo Fernández-Medina. "Improving security in NoSQL document databases through model-driven modernization." Knowledge and Information Systems 63, no. 8 (2021): 2209–30. http://dx.doi.org/10.1007/s10115-021-01589-x.

Full text
Abstract:
NoSQL technologies have become a common component in many information systems and software applications. These technologies are focused on performance, enabling scalable processing of large volumes of structured and unstructured data. Unfortunately, most developments over NoSQL technologies consider security as an afterthought, putting the personal data of individuals at risk and potentially causing severe economic losses as well as reputation crises. In order to avoid these situations, companies require an approach that introduces security mechanisms into their systems without scrapping already in-place solutions to restart the design process from scratch. Therefore, in this paper we propose the first modernization approach for introducing security in NoSQL databases, focusing on access control and thereby improving the security of the associated information systems and applications. Our approach analyzes the existing NoSQL solution of the organization, using a domain ontology to detect sensitive information and creating a conceptual model of the database. Together with this model, a series of security issues related to access control is listed, allowing database designers to identify the security mechanisms that must be incorporated into their existing solution. For each security issue, our approach automatically generates a proposed solution, consisting of a combination of privilege modifications, new roles, and views to improve access control. In order to test our approach, we apply our process to a medical database implemented using the popular document-oriented NoSQL database MongoDB. The great advantages of our approach are that: (1) it takes into account the context of the system thanks to the introduction of domain ontologies, (2) it helps to avoid missing critical access control issues, since the analysis is performed automatically, (3) it reduces the effort and costs of the modernization process thanks to the automated steps, (4) it can be used successfully with different NoSQL document-based technologies by adjusting the metamodel, and (5) it is aligned with known standards, hence allowing the application of guidelines and best practices.
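The kind of access-control fix the approach generates can be pictured with the following pymongo sketch, which creates a view that hides sensitive fields and a role limited to reading it. The database, collection, field, and role names are hypothetical; only the general mechanism (MongoDB views and custom roles) comes from the paper's setting.

```python
# Minimal sketch of an access-control improvement in MongoDB: a view that
# projects away sensitive fields, plus a role restricted to that view.
# Database, collection, field, and role names are hypothetical.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["medical"]

# View exposing only non-sensitive attributes of the patients collection.
db.command(
    "create", "patients_public",
    viewOn="patients",
    pipeline=[{"$project": {"ssn": 0, "diagnosis": 0}}],
)

# Role whose only privilege is reading that view.
db.command(
    "createRole", "patientReader",
    privileges=[{
        "resource": {"db": "medical", "collection": "patients_public"},
        "actions": ["find"],
    }],
    roles=[],
)
```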
APA, Harvard, Vancouver, ISO, and other styles
44

Trivedi, Mitul Ashvinbhai. "Enhancing legal practice through retrieval-augmented generation." World Journal of Advanced Engineering Technology and Sciences 15, no. 2 (2025): 2703–12. https://doi.org/10.30574/wjaets.2025.15.2.0852.

Full text
Abstract:
Retrieval-Augmented Generation (RAG) technology is transforming legal practice by combining sophisticated information retrieval with contextual content generation. As law firms confront mounting document volumes and rising client expectations, RAG systems provide a precision-oriented approach that maintains accuracy while increasing processing speed. This article examines how RAG's dual-component architecture creates distinctive advantages for legal applications through semantic understanding and contextual generation. The technical framework leverages vector databases and advanced language models to enhance contract analysis, legal research, document drafting, and multilingual document handling. Implementation delivers substantial benefits including time savings, error reduction, cost efficiencies, and strategic workload redistribution. The discussion explores implementation strategies, ethical considerations, and future directions including predictive analytics, evolving lawyer roles, regulatory frameworks, and research priorities. Rather than replacing legal professionals, RAG technology augments human expertise, enabling firms to reimagine service delivery while maintaining the essential human elements of legal counsel.
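A toy sketch of RAG's two stages (dense retrieval followed by grounded generation) is given below, using cosine similarity over stand-in embeddings. A production legal system would use a trained embedding model, a vector database, and an LLM; every name and vector here is a placeholder.

```python
# Toy sketch of retrieval-augmented generation: retrieve the most similar
# passage, then build a grounded prompt. Embeddings here are random stand-ins
# for the output of a real embedding model.
import numpy as np

passages = [
    "Clause 4.2 limits liability to direct damages.",
    "The agreement terminates after 24 months.",
    "Either party may assign this contract with written consent.",
]
rng = np.random.default_rng(0)
index = rng.normal(size=(len(passages), 8))   # stand-in passage embeddings

def retrieve(query_vec, k=1):
    """Return the k passages whose embeddings are most cosine-similar."""
    sims = index @ query_vec / (
        np.linalg.norm(index, axis=1) * np.linalg.norm(query_vec)
    )
    return [passages[i] for i in np.argsort(-sims)[:k]]

query_vec = rng.normal(size=8)                # stand-in for an embedded query
context = retrieve(query_vec)
prompt = f"Answer using only this context:\n{context}\nQuestion: ..."
# `prompt` would then be passed to a language model for the generation stage.
```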
APA, Harvard, Vancouver, ISO, and other styles
45

Bathla, Gourav, Rinkle Rani, and Himanshu Aggarwal. "Comparative study of NoSQL databases for big data storage." International Journal of Engineering & Technology 7, no. 2.6 (2018): 83. http://dx.doi.org/10.14419/ijet.v7i2.6.10072.

Full text
Abstract:
Big data is a large-scale collection of structured, semi-structured, and unstructured data. It is generated by social networks, business organizations, and the interactions and views of socially connected users, and it is used for important decision making in business and research organizations. Storage that can process data at this scale and extract important information with low response time is a necessity in today's competitive environment. Relational databases, which ruled storage technology for a long time, seem unsuitable for such mixed types of data, since the data cannot be represented simply as rows and columns in tables. NoSQL (Not only SQL) is a complement to SQL technology that provides various storage formats compatible with the high velocity, large volume, and wide variety of the data. NoSQL databases fall into four categories: column-oriented, key-value, graph-based, and document-oriented databases. Approximately 120 real solutions exist across these categories; the most commonly used ones are elaborated in the Introduction section. Several research works have analyzed these NoSQL solutions, but they have not described the situations in which a particular data storage technique should be chosen. In this study and analysis, we have tried to help the reader select a technology based on specific requirements. Previous research has compared NoSQL data storage techniques using real examples such as MongoDB and Neo4j. Our observation is that if users have adequate knowledge of the NoSQL categories and their comparison, it is easy for them to choose the most suitable category and then select a concrete solution from it.
APA, Harvard, Vancouver, ISO, and other styles
46

Kim, Bogyeong, Kyoseung Koo, Undraa Enkhbat, Sohyun Kim, Juhun Kim, and Bongki Moon. "M2Bench." Proceedings of the VLDB Endowment 16, no. 4 (2022): 747–59. http://dx.doi.org/10.14778/3574245.3574259.

Full text
Abstract:
As the world becomes increasingly data-centric, the tasks dealt with by a database management system (DBMS) become more complex and diverse. Compared with traditional workloads that typically require only a single data model, modern-day computational tasks often involve multiple data sources and rely on more than one data model. Unfortunately, there is currently no standard benchmark program that can evaluate a DBMS in the various aspects of multi-model databases, especially where the array data model is concerned. In this paper, we propose M2Bench, a new benchmark program capable of evaluating a multi-model DBMS that supports several important data models such as relational, document-oriented, property graph, and array models. M2Bench consists of multi-model workloads that are inspired by real-world problems; each task of the workload mimics a real-life scenario in which at least two different models of data are involved. To demonstrate the efficacy of M2Bench, we evaluated polyglot or multi-model database systems with the M2Bench workloads and revealed the diverse characteristics of these database systems for each data model.
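The shape of such a benchmark run can be sketched as a small timing harness in which each task is a callable that may touch several stores. The task body below is an empty stand-in; M2Bench's real workloads are defined in the paper.

```python
# Hedged sketch of timing a multi-model benchmark task. Each task is a
# callable that may query several data stores; the body here is an empty
# stand-in for an actual M2Bench-style workload.
import time

def run_task(name, fn, repeats=5):
    """Run `fn` several times and report the median latency."""
    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    median = sorted(latencies)[len(latencies) // 2]
    print(f"{name}: median latency {median:.4f}s over {repeats} runs")

def relational_plus_document_task():
    # e.g. filter a relational fact table, then enrich the matching rows
    # with documents fetched from a document store
    pass

run_task("relational+document", relational_plus_document_task)
```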
APA, Harvard, Vancouver, ISO, and other styles
47

Materynska, Sofiia, Vadym Yaremenko, and Walery Rogoza. "A theoretically proposed algorithm in a decision tree format for choosing an efficient storage type of large datasets." Technology audit and production reserves 1, no. 2(63) (2022): 6–9. http://dx.doi.org/10.15587/2706-5448.2022.251281.

Full text
Abstract:
The object of this research is methods and approaches for improving storage efficiency and optimizing access to large amounts of data. The study matters because big data is now ubiquitous and the technologies that improve the efficiency of big-data processing systems must be selected correctly. The choice is complicated by the large number of data stores and databases available today, so the best decision requires a deep understanding of the advantages, disadvantages, and features of each; the difficulty lies in the absence of a universal algorithm for deciding on the optimal repository. Accordingly, based on experiments and on an analysis of existing projects and research papers, a decision-making algorithm is proposed that determines the best way to store large datasets depending on their characteristics and on additional system requirements. This simplifies system design in the early stages of big-data processing projects. By highlighting the key differences, as well as the advantages and disadvantages, of each type of storage and database, a list was compiled of the key characteristics of the data and of the future system that should be considered during design. The algorithm is a theoretical proposal based on the research papers studied; using it at the design stage would make it possible to determine the optimal storage type for large datasets quickly and clearly. The paper considers column-oriented, document-oriented, graph, and key-value databases, as well as distributed file systems and cloud services.
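An illustrative, deliberately simplified rendering of such a decision function in Python follows; the branch conditions are assumptions in the spirit of the proposed tree, not the authors' exact algorithm.

```python
# Illustrative sketch (not the paper's exact decision tree) of mapping dataset
# characteristics to a storage type. The branch conditions are simplified
# assumptions for demonstration.
def choose_storage(structured, relationships_heavy, access_by_key_only,
                   analytical_scans):
    if relationships_heavy:
        return "graph database"
    if access_by_key_only:
        return "key-value store"
    if analytical_scans:
        return "column-oriented database"
    if structured:
        return "document-oriented database"
    return "distributed file system / cloud object storage"

print(choose_storage(structured=True, relationships_heavy=False,
                     access_by_key_only=False, analytical_scans=True))
# -> "column-oriented database"
```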
APA, Harvard, Vancouver, ISO, and other styles
48

Mei, Xusheng. "Research on Performance Optimization Techniques for MongoDB Database Based on Load Balancing." Advances in Engineering Technology Research 13, no. 1 (2025): 1062. https://doi.org/10.56028/aetr.13.1.1062.2025.

Full text
Abstract:
This article examines the challenges traditional relational databases face in dealing with the growing demands of modern digital environments and proposes MongoDB, a NoSQL database, as a powerful alternative. MongoDB's document-oriented model provides flexible schema design, high performance, and easy scalability, making it particularly suitable for applications requiring rapid processing of large amounts of data, such as real-time analytics and the Internet of Things. Through in-depth analysis, we explore the key features of MongoDB, including automatic sharding, the JSON-style BSON data format, and a dynamic load-balancing technique that takes access frequency into account, which significantly improve its efficiency and reliability. We also discuss MongoDB's evolution, technological advancements, and role in promoting agile development and scalable solutions in data-intensive applications. The article concludes by acknowledging the crucial role of MongoDB in modern application development, driven by its ability to meet the dynamic and complex demands of big data and cloud computing.
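The automatic sharding discussed above can be enabled with two admin commands, as in the pymongo sketch below; it assumes a running sharded cluster behind a mongos router, and the database, collection, and shard-key names are illustrative.

```python
# Minimal sketch of enabling MongoDB's automatic sharding for a collection.
# Requires a sharded cluster (connect through mongos); names are illustrative.
from pymongo import MongoClient

admin = MongoClient("mongodb://localhost:27017").admin

# Enable sharding on the database, then shard the collection on a hashed key
# so the balancer can spread chunks evenly across shards.
admin.command("enableSharding", "iot")
admin.command(
    "shardCollection", "iot.readings",
    key={"device_id": "hashed"},
)
```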
APA, Harvard, Vancouver, ISO, and other styles
49

McClay, Wilbert. "A Magnetoencephalographic/Encephalographic (MEG/EEG) Brain-Computer Interface Driver for Interactive iOS Mobile Videogame Applications Utilizing the Hadoop Ecosystem, MongoDB, and Cassandra NoSQL Databases." Diseases 6, no. 4 (2018): 89. http://dx.doi.org/10.3390/diseases6040089.

Full text
Abstract:
In Phase I, we collected data on five subjects, yielding over 90% positive performance on magnetoencephalographic (MEG) mid- and post-movement activity. In addition, a driver was developed that substituted the actions of the brain-computer interface (BCI) for mouse button presses in real-time visual simulations. The process was interfaced to a flight visualization demonstration: using left or right brainwave thought movement, the user experiences the aircraft turning in the chosen direction, or plays the iOS mobile Warfighter videogame application. The BCI's analytics of a subject's MEG brain waves and flight visualization videogame performance were stored and analyzed using the Hadoop ecosystem as a quick-retrieval data warehouse. Phase II of the project involves Emotiv encephalographic (EEG) wireless brain-computer interfaces (BCIs), which allow people to establish a novel communication channel between the human brain and a machine, in this case iOS mobile applications. The EEG BCI utilizes advanced and novel machine learning algorithms, the Spark Directed Acyclic Graph (DAG), the Cassandra NoSQL database environment, and the competing NoSQL MongoDB database for housing BCI analytics of subjects' responses and users' intent, illustrated for both MEG and EEG brainwave signal acquisition. The wireless EEG signals acquired from OpenVibe and the Emotiv EPOC headset can be connected via Bluetooth to an iPhone using a thin-client architecture. NoSQL databases were chosen because of their schema-less architecture and MapReduce computational paradigm for housing the brain signals from each referencing sensor. Thus, if multiple users are playing over an online network connection and an MEG/EEG sensor fails, or the connection between the smartphone and the web server is lost due to low battery power or failed data transmission, it will not nullify the document-oriented (MongoDB) or column-oriented (Cassandra) NoSQL databases. Additionally, NoSQL databases offer fast querying and indexing methodologies, which are well suited to online game analytics and technology. In Phase II, we collected data on five MEG subjects, yielding over 90% positive performance on iOS mobile applications written in Objective-C and C++. However, on EEG signals from three subjects using the Emotiv wireless headsets and from (n < 10) subjects in the OpenVibe EEG database, the Variational Bayesian Factor Analysis (VBFA) algorithm yielded below 60% performance; we are currently extending the VBFA algorithm to work in the time-frequency domain (VBFA-TF) to enhance EEG performance in the near future. The novel use of the NoSQL databases Cassandra and MongoDB was the primary enhancement of the Phase II MEG/EEG brain-signal data acquisition, queries, and rapid analytics, with MapReduce and Spark DAG demonstrating implications for next-generation biometric MEG/EEG NoSQL databases.
APA, Harvard, Vancouver, ISO, and other styles
50

Sheketa, Vasyl, Mykola Pasieka, Svitlana Chupakhina, et al. "Information System for Screening and Automation of Document Management in Oncological Clinics." Open Bioinformatics Journal 14, no. 1 (2021): 39–50. http://dx.doi.org/10.2174/1875036202114010039.

Full text
Abstract:
Introduction: Automation of the business documentation workflow in medical practice substantially accelerates and improves the process and results in better service development. Methods: Efficient use of databases, data banks, and document-oriented storage (data warehouses), including dual-purpose databases, enables specific actions such as adding records, modifying them, performing either ordinary or analytical searches of the data, and processing them efficiently. With a focus on achieving interaction between distributed, heterogeneous applications and devices belonging to independent organizations, a specialized medical client application has been developed; as a result, the quantity and quality of the information streams that can be essential for the effective treatment of patients with breast cancer have increased. Results: The application developed allows the management of patient records to be automated, taking into account the needs of medical staff, especially in managing patients' appointments and creating patients' medical records in accordance with the international standards currently in force. This work is the basis for smoother integration of medical records and genomics data to achieve better prevention, diagnosis, prediction, and treatment of breast cancer (oncology). Conclusion: Since the relevant standards upgrade the functioning of health care information technology and the quality and safety of patient care, we have produced the global architectural scheme of the specific medical automation system by harmonizing the medical services specified by the HL7 international standard.
APA, Harvard, Vancouver, ISO, and other styles