
Journal articles on the topic 'Data warehouse queries'

Consult the top 50 journal articles for your research on the topic 'Data warehouse queries.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Gupta, S. L., Payal Pahwa, and Sonali Mathur. "CLASSIFICATION OF DATA WAREHOUSE TESTING APPROACHES." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 3, no. 3 (December 2, 2012): 381–86. http://dx.doi.org/10.24297/ijct.v3i3a.2942.

Abstract:
A data warehouse is a collection of a large amount of data used by management for making strategic decisions. The data in a data warehouse is gathered from heterogeneous sources and then populated and queried for carrying out analysis. The data warehouse design must support the queries for which it is used. The design is often an iterative process and must be modified a number of times before any model can be stabilized. The design life cycle of any product includes various stages, of which testing is among the most important. Data warehouse design has received considerable attention, whereas data warehouse testing is only now being explored by various researchers. This paper discusses the various categories of testing activities carried out in a data warehouse at different levels.
2

Haxhiu, Valdrin. "Decision making based on data analyses using data warehouses." International Journal of Business & Technology 6, no. 3 (May 1, 2018): 1–6. http://dx.doi.org/10.33107/ijbte.2018.6.3.04.

Abstract:
Data warehouses are collections of several databases whose goal is to help companies and corporations make important decisions about their activities. These decisions are drawn from analyses performed on the data within the data warehouse. These data come from the data that companies and corporations collect on a daily basis from their branches, which may be located in different cities, regions, states and continents. The data entered into a data warehouse are historical data, representing the part of the data that is important for making decisions. These data go through a transformation process in order to conform to the structure of the objects within the databases in the data warehouse. This is necessary because the structure of relational databases is not similar to the structure of the (multidimensional) databases within the data warehouse: the former are optimized for day-to-day transactions such as entering, changing, deleting and retrieving data through simple queries, while the latter are optimized for retrieving data through multidimensional queries, which enable us to extract important information. This information helps in making important decisions by revealing the weak points and the strong points of the company, so that it can invest more in the weak points and reinforce the strong points, increasing its profits. The goal of this paper is to treat data analysis for decision making from a data warehouse using OLAP (online analytical processing). For this treatment we used the Analysis Services of the Microsoft SQL Server 2016 platform. We analyzed the data of an IT store with branches in different cities in Kosovo and came to conclusions about some sales trends. This paper emphasizes the role of data warehouses in decision making.
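The paper itself uses SQL Server 2016 Analysis Services; as a rough, platform-free sketch of the kind of multidimensional aggregation it describes, the snippet below groups a tiny sales fact table by two dimensions with Python's sqlite3 (the store, city and product values are invented for illustration, not taken from the paper):

```python
import sqlite3

# In-memory stand-in for a small sales fact table; cities and amounts invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Prishtina", "laptop", 900.0), ("Prishtina", "mouse", 20.0),
     ("Peja", "laptop", 850.0), ("Peja", "laptop", 950.0)],
)

# A multidimensional query: total sales per (city, product) cell.
cells = {
    (city, product): total
    for city, product, total in conn.execute(
        "SELECT city, product, SUM(amount) FROM sales GROUP BY city, product"
    )
}
print(cells[("Peja", "laptop")])  # 1800.0
```

Each `(city, product)` pair is one cell of the cube an OLAP tool would let an analyst slice and dice.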
3

Atigui, Faten, Franck Ravat, Jiefu Song, Olivier Teste, and Gilles Zurfluh. "Facilitate Effective Decision-Making by Warehousing Reduced Data." International Journal of Decision Support System Technology 7, no. 3 (July 2015): 36–64. http://dx.doi.org/10.4018/ijdsst.2015070103.

Abstract:
The authors' aim is to provide a solution for multidimensional data warehouse reduction based on analysts' needs, which specifies an aggregated schema applicable over a period of time and retains only data useful for decision support. Firstly, they describe a conceptual model for multidimensional data warehouses. A multidimensional data warehouse's schema is composed of a set of states, each defined as a star schema composed of one fact and its related dimensions. The derivation between states is carried out through combinations of reduction operators. Secondly, they present a meta-model which allows managing the different states of a multidimensional data warehouse; reduced and unreduced multidimensional data warehouse schemas can be defined by instantiating the meta-model. Finally, they describe their experimental assessments and discuss the results. Evaluating their solution implies executing different queries in various contexts: an unreduced single fact table, an unreduced relational star schema, a reduced star schema and a reduced snowflake schema. The authors show that queries are computed more efficiently within a reduced star schema.
4

Dehdouh, Khaled, Omar Boussaid, and Fadila Bentayeb. "Big Data Warehouse." International Journal of Decision Support System Technology 12, no. 1 (January 2020): 1–24. http://dx.doi.org/10.4018/ijdsst.2020010101.

Abstract:
In the big data warehouse context, column-oriented NoSQL database systems are considered a storage model highly adapted to data warehouses and online analysis. Indeed, NoSQL models scale easily, and the columnar store is suitable for storing and managing massive data, especially for decisional queries. However, column-oriented NoSQL DBMSs do not offer online analytical processing (OLAP) operators. To build OLAP cubes corresponding to the analysis contexts, the most common way is to integrate other software such as Hive or Kylin, which provide a CUBE operator to build data cubes. In that case, however, the cube is built according to the row-oriented approach, and the benefits of a column-oriented approach are not fully obtained. In this article, the focus is on defining a cube operator called MC-CUBE (MapReduce Columnar CUBE), which builds columnar NoSQL cubes according to the columnar approach, taking into account the non-relational and distributed aspects of how the data warehouses are stored.
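MC-CUBE itself is a distributed MapReduce operator; as a single-machine sketch of what any CUBE operator computes — one aggregate per subset of dimensions — over columnar data (column names and values invented), one might write:

```python
from itertools import combinations

# Columnar layout: each attribute is its own list, as in a column store.
columns = {
    "region":  ["east", "east", "west", "west"],
    "product": ["a", "b", "a", "a"],
}
measure = [10, 20, 30, 40]
dims = list(columns)

# A CUBE computes one aggregate table per grouping set (subset of dims).
cube = {}
for k in range(len(dims) + 1):
    for group in combinations(dims, k):
        agg = {}
        for row in range(len(measure)):
            key = tuple(columns[d][row] for d in group)
            agg[key] = agg.get(key, 0) + measure[row]
        cube[group] = agg

print(cube[()][()])                    # grand total: 100
print(cube[("region",)][("west",)])    # 70
```

In the MapReduce setting each grouping set would be produced by map/shuffle/reduce phases over distributed column files rather than this in-memory loop.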
5

Kumar, Amit, and T. V. Vijay Kumar. "Materialized View Selection Using Self-Adaptive Perturbation Operator-Based Particle Swarm Optimization." International Journal of Applied Evolutionary Computation 11, no. 3 (July 2020): 50–67. http://dx.doi.org/10.4018/ijaec.2020070104.

Abstract:
A data warehouse is a central repository of time-variant and non-volatile data integrated from disparate data sources with the purpose of transforming data into information to support data analysis. Decision support applications access data warehouses to derive information using online analytical processing. The response time of analytical queries against the speedily growing data warehouse is substantially large. View materialization is an effective approach to decrease the response time of analytical queries and expedite the decision-making process in relational implementations of data warehouses. Selecting a suitable subset of views that decreases the response time of analytical queries and also fits within the available storage space is a crucial research concern in data warehouse design. This problem, referred to as view selection, is shown to be NP-hard. Swarm intelligence has been widely and successfully used to solve such problems. In this paper, a discrete variant of the particle swarm optimization algorithm, self-adaptive perturbation operator-based particle swarm optimization (SPOPSO), has been adapted to solve the view selection problem. Accordingly, an SPOPSO-based view selection algorithm (SPOPSOVSA) is proposed. SPOPSOVSA selects the Top-K views in a multidimensional lattice framework. Further, the proposed algorithm is shown to perform better than the view selection algorithm HRUA.
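To make the Top-K view selection problem concrete, here is a greedy sketch in the spirit of the HRUA-style baseline the paper benchmarks against (the tiny lattice, view sizes and cost model are invented for illustration; the paper's SPOPSO explores the same search space with a swarm instead of greedily):

```python
# Cost of a query = size of the smallest materialized view answering it.
# The root view ABC is always materialized.
sizes = {"ABC": 100, "AB": 50, "AC": 80, "A": 20}
ancestors = {                 # views able to answer each query
    "ABC": {"ABC"},
    "AB": {"AB", "ABC"},
    "AC": {"AC", "ABC"},
    "A": {"A", "AB", "AC", "ABC"},
}

def total_cost(materialized):
    return sum(min(sizes[v] for v in ancestors[q] & materialized)
               for q in ancestors)

def greedy_select(k):
    # Add, one at a time, the K views that most reduce total query cost.
    chosen = {"ABC"}
    for _ in range(k):
        best = min((v for v in sizes if v not in chosen),
                   key=lambda v: total_cost(chosen | {v}))
        chosen.add(best)
    return chosen

print(sorted(greedy_select(2)))  # ['A', 'AB', 'ABC']
```

A swarm- or GA-based selector replaces the greedy step with a population of candidate view sets scored by the same cost function under a storage constraint.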
6

M Kirmani, Mudasir. "Dimensional Modeling Using Star Schema for Data Warehouse Creation." Oriental journal of computer science and technology 10, no. 04 (October 13, 2017): 745–54. http://dx.doi.org/10.13005/ojcst/10.04.07.

Abstract:
Data warehouse design requires a radical restructuring of tremendous amounts of information, frequently of questionable or inconsistent quality, drawn from numerous heterogeneous sources. Data warehouse design assimilates business knowledge and technological know-how, and requires a deep and detailed understanding of the business processes. The principal aim of this research paper is to study and investigate a conversion model for transforming E-R diagrams into star schemas for building data warehouses. Dimensional modelling is a logical design technique used for data warehouses. This paper addresses the potential differences between the two techniques and highlights the advantages of using dimensional modelling, along with its disadvantages. Dimensional modelling is a popular technique for databases that are designed with the end-user's queries in mind in a data warehouse. The focus here is on the star schema, which basically comprises a fact table and dimension tables. Each fact table further comprises foreign keys of the various dimensions, measures, and degenerate dimensions, if any. We also discuss the possibilities of deploying a Conversion Model (CM) to provide the details of the fact table and dimension tables according to local needs. The paper also highlights why dimensional modelling is preferred over E-R modelling when creating a data warehouse.
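As a minimal illustration of the star schema the abstract describes — one fact table holding foreign keys and measures, joined to dimension tables for end-user queries — here is a sketch using Python's sqlite3 (table and column names are illustrative, not taken from the paper):

```python
import sqlite3

# Minimal star schema: one fact table referencing two dimension tables.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'laptop'), (2, 'mouse');
INSERT INTO dim_date    VALUES (10, 2016), (11, 2017);
INSERT INTO fact_sales  VALUES (1, 10, 900.0), (2, 10, 20.0), (1, 11, 950.0);
""")

# A typical end-user query: revenue by product name and year.
rows = db.execute("""
    SELECT p.name, d.year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d    ON d.date_id = f.date_id
    GROUP BY p.name, d.year
    ORDER BY p.name, d.year
""").fetchall()
print(rows)
```

The E-R-to-star conversion the paper studies would derive `fact_sales` from the central transactional entity and the dimension tables from its related entities.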
7

Pisano, Valentina Indelli, Michele Risi, and Genoveffa Tortora. "How reduce the View Selection Problem through the CoDe Modeling." Journal on Advances in Theoretical and Applied Informatics 2, no. 2 (December 21, 2016): 19. http://dx.doi.org/10.26729/jadi.v2i2.2090.

Abstract:
Big data visualization is not an easy task due to the sheer amount of information contained in data warehouses, so the accuracy of data relationships in a representation becomes one of the most crucial aspects of business knowledge discovery. A tool that allows modelling and visualizing relationships between data is CoDe, which, by processing several queries on a data mart, generates a visualization of such data. On a large data warehouse, however, the computation of these queries increases the response time with the query complexity. A common approach to speeding up data warehousing is to precompute a set of materialized views, store them in the warehouse, and use them to compute the workload queries. The goal of this paper is to present a new process exploiting CoDe modeling that determines the minimal number of required OLAP queries and mitigates the view selection problem, i.e., selecting the optimal set of views to materialize. In particular, the proposed process determines the minimal number of required OLAP queries, creates an ad hoc lattice structure to represent them, and selects on this structure the views to be materialized, taking into account a heuristic based on processing time cost and view storage space. The results of an experiment on a real data warehouse show an improvement in the range of 36-98% with respect to an approach that does not consider materialized views, and of 7% with respect to an approach that exploits them. Moreover, we show how the results are affected by the lattice structure.
8

Rado, Ratsimbazafy, and Omar Boussaid. "Multiple Decisional Query Optimization in Big Data Warehouse." International Journal of Data Warehousing and Mining 14, no. 3 (July 2018): 22–43. http://dx.doi.org/10.4018/ijdwm.2018070102.

Abstract:
The data warehousing (DW) area has always motivated a plethora of hard optimization problems that cannot be solved in polynomial time. These optimization problems become more complex and interesting when it comes to multiple OLAP queries. In this article, the authors explore the potential of a distributed environment for an established data warehouse and database optimization problem: the problem of multiple query optimization (MQO). In a traditional DW, materializing views is an optimization technique for this problem, storing pre-computed joins or the results of frequently asked queries. In the era of big data, this kind of view materialization is not suitable due to the data size. In this article, the authors tackle the problem of MQO on a distributed DW by using multiple small, shared and easy-to-maintain data structures. The evaluation shows that, compared to the available default execution engine, the authors' approach consumes on average 20% less memory in the map-scan task and is 12% faster in execution time for interactive and reporting queries from TPC-DS.
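The core intuition behind multiple query optimization — sharing work among queries instead of scanning the fact data once per query — can be sketched in a few lines (the data and the two "queries" are invented; the paper does this at scale on a distributed engine):

```python
# Shared scan: one pass over the fact rows feeds several aggregate queries.
rows = [("east", 2020, 10.0), ("east", 2021, 20.0),
        ("west", 2020, 30.0), ("west", 2021, 40.0)]

by_region, by_year = {}, {}
for region, year, amount in rows:      # a single scan
    by_region[region] = by_region.get(region, 0.0) + amount   # query 1
    by_year[year] = by_year.get(year, 0.0) + amount           # query 2

print(by_region)  # {'east': 30.0, 'west': 70.0}
print(by_year)    # {2020: 40.0, 2021: 60.0}
```

Evaluating the two queries independently would cost two scans; the shared plan costs one, which is the kind of saving MQO formalizes.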
9

Bimonte, Sandro, Omar Boussaid, Michel Schneider, and Fabien Ruelle. "Design and Implementation of Active Stream Data Warehouses." International Journal of Data Warehousing and Mining 15, no. 2 (April 2019): 1–21. http://dx.doi.org/10.4018/ijdwm.2019040101.

Abstract:
In the era of big data, more and more stream data is available. At the same time, decision support system (DSS) tools, such as data warehouses and alert systems, are becoming more and more sophisticated, and conceptual modeling tools are consequently mandatory for successful DSS projects. Formalisms such as UML and ER have been widely used in the context of classical information and data warehouse systems, but they have not yet been investigated for stream data warehouses that must deal with alert systems. Therefore, in this article, the authors introduce the notion of an Active Stream Data Warehouse (ASDW) and propose a UML profile for designing one. Indeed, the article extends the ICSOLAP profile to take into account continuous and window OLAP queries. Moreover, it studies the duality of the stream and OLAP decision-making processes, and the authors propose a set of ECA rules to automatically trigger OLAP operators. The UML profile is implemented in a new OLAP architecture and validated using an environmental case study concerning wind monitoring.
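A rough sketch of the stream/alert duality described above: a continuous window query feeding an Event-Condition-Action rule (the window size, threshold and readings are invented, not taken from the paper's wind case study):

```python
from collections import deque

# Sliding-window aggregate with an ECA rule attached.
WINDOW, THRESHOLD = 3, 50.0
window, alerts = deque(maxlen=WINDOW), []

def on_reading(value):
    window.append(value)              # Event: a new stream tuple arrives
    avg = sum(window) / len(window)   # continuous window query
    if avg > THRESHOLD:               # Condition
        alerts.append(avg)            # Action: raise an alert / fire OLAP op

for v in [40.0, 45.0, 50.0, 80.0, 90.0]:
    on_reading(v)
print(alerts)
```

In the paper's architecture the Action side would trigger an OLAP operator rather than merely recording the value.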
10

Chen, Li. "The Study on Indexing Techniques in Data Warehouse." Key Engineering Materials 439-440 (June 2010): 1505–10. http://dx.doi.org/10.4028/www.scientific.net/kem.439-440.1505.

Abstract:
Nowadays, the data warehouse has become a hot spot in database research. Indexes can potentially speed up a variety of operations in a data warehouse. In this paper, we present several relatively mature indexing techniques for data warehouses and compare their performance. The paper focuses on the performance evaluation of three data warehouse queries under three different indexing techniques, observing the impact of variable-size data with respect to time and space complexity.
11

Murthy, Raghotham, and Rajat Goel. "Peregrine: Low-latency queries on Hive warehouse data." XRDS: Crossroads, The ACM Magazine for Students 19, no. 1 (September 2012): 40–43. http://dx.doi.org/10.1145/2331042.2331056.

12

GOLFARELLI, MATTEO, DARIO MAIO, and STEFANO RIZZI. "THE DIMENSIONAL FACT MODEL: A CONCEPTUAL MODEL FOR DATA WAREHOUSES." International Journal of Cooperative Information Systems 07, no. 02n03 (June 1998): 215–47. http://dx.doi.org/10.1142/s0218843098000118.

Abstract:
Data warehousing systems enable enterprise managers to acquire and integrate information from heterogeneous sources and to query very large databases efficiently. Building a data warehouse requires adopting design and implementation techniques completely different from those underlying operational information systems. Though most scientific literature on the design of data warehouses concerns their logical and physical models, an accurate conceptual design is the necessary foundation for building a DW which is well-documented and fully satisfies requirements. In this paper we formalize a graphical conceptual model for data warehouses, called the Dimensional Fact Model, and propose a semi-automated methodology to build it from the pre-existing (conceptual or logical) schemes describing the enterprise relational database. The representation of reality built using our conceptual model consists of a set of fact schemes whose basic elements are facts, measures, attributes, dimensions and hierarchies; other features which may be represented on fact schemes are the additivity of fact attributes along dimensions, the optionality of dimension attributes and the existence of non-dimension attributes. Compatible fact schemes may be overlapped in order to relate and compare data for drill-across queries. Fact schemes should be integrated with information on the conjectured workload, to be used as the input of the logical and physical design phases; to this end, we propose a simple language to denote data warehouse queries in terms of sets of fact instances.
13

Chakraborty, Sonali Ashish. "A Novel Approach Using Non-Synonymous Materialized Queries for Data Warehousing." International Journal of Data Warehousing and Mining 17, no. 3 (July 2021): 22–43. http://dx.doi.org/10.4018/ijdwm.2021070102.

Abstract:
Data from multiple sources are loaded into the organization's data warehouse for analysis. Since some OLAP queries are quite frequently fired on the warehouse data, their execution time is reduced by storing the queries and their results in a relational database, referred to as the materialized query database (MQDB). If the tables, fields, functions, and criteria of an input query and a stored query are the same, but the query criteria specified in the WHERE or HAVING clause do not match, the queries are considered non-synonymous to each other. In the present research, the results of non-synonymous queries are generated by reusing existing stored results after applying UNION or MINUS operations to them, which reduces the execution time of non-synonymous queries. For superset criteria values of the input query, a UNION operation is applied, and for subset values, a MINUS operation is applied. Incremental processing of existing stored results, if required, is performed using data marts.
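A minimal sketch of the reuse rule described above, with invented data and criteria values (only the pure subset and superset cases the abstract names are handled):

```python
# Warehouse "fact data": (year, region) -> measure. All values invented.
warehouse = {("2019", "east"): 10, ("2019", "west"): 20,
             ("2020", "east"): 30, ("2020", "west"): 40,
             ("2021", "east"): 50}

def run_query(years):
    # Stand-in for an expensive scan of the warehouse data.
    return {k: v for k, v in warehouse.items() if k[0] in years}

stored_years = {"2019", "2020"}        # criteria of the stored query
stored_result = run_query(stored_years)

def answer(years):
    if years <= stored_years:          # subset criteria: MINUS
        # Keep only the requested slice of the stored result.
        return {k: v for k, v in stored_result.items() if k[0] in years}
    assert years >= stored_years       # superset criteria: UNION
    extra = run_query(years - stored_years)   # compute only the new slice
    return {**stored_result, **extra}
```

For the subset case no warehouse scan happens at all; for the superset case only the missing years are scanned, which is where the saving comes from.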
14

Chatziantoniou, Damianos, and Theodore Johnson. "Decision support queries on a tape-resident data warehouse." Information Systems 30, no. 2 (April 2005): 133–49. http://dx.doi.org/10.1016/j.is.2003.11.003.

15

DEHURI, S., and R. MALL. "PARALLEL PROCESSING OF OLAP QUERIES USING A CLUSTER OF WORKSTATIONS." International Journal of Information Technology & Decision Making 06, no. 02 (June 2007): 279–99. http://dx.doi.org/10.1142/s0219622007002484.

Abstract:
Online analytical processing (OLAP) queries normally incur enormous processing overheads due to the huge size of data warehouses. This results in unacceptable response times. Parallel processing using a cluster of workstations has of late emerged as a practical solution to many compute and data intensive problems. In this article, we present parallel algorithms for some of the OLAP operators. We have implemented these parallel solutions for a data warehouse implemented on Oracle hosted in a cluster of workstations. Our performance studies show that encouraging speedups are achieved.
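The usual scheme for parallelizing an OLAP aggregate on a cluster of workstations is partition, aggregate locally, then merge the partial results. A sketch of that scheme, simulated with threads on one machine (the data is invented; the paper's implementation runs against Oracle on a real cluster):

```python
from concurrent.futures import ThreadPoolExecutor

# Each worker aggregates one horizontal partition of the fact data.
partitions = [
    [("east", 10.0), ("west", 5.0)],
    [("east", 20.0), ("west", 15.0)],
]

def local_agg(rows):
    acc = {}
    for key, amount in rows:
        acc[key] = acc.get(key, 0.0) + amount
    return acc

with ThreadPoolExecutor(max_workers=2) as pool:
    partials = list(pool.map(local_agg, partitions))

result = {}
for part in partials:                  # merge phase
    for key, value in part.items():
        result[key] = result.get(key, 0.0) + value
print(result)  # {'east': 30.0, 'west': 20.0}
```

SUM, COUNT, MIN and MAX merge this way directly; AVG needs (sum, count) pairs carried through the merge.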
16

Grace Yenin Edwige, Johnson, Adepo Joel, and Oumtanaga Souleymane. "A MECHANISM FOR DETECTING PARTIAL INFERENCES IN DATA WAREHOUSES." International Journal of Advanced Research 9, no. 03 (March 31, 2021): 369–78. http://dx.doi.org/10.21474/ijar01/12593.

Abstract:
Data warehouses are widely used in the fields of big data and business intelligence for statistics on business activity. Their use through multidimensional queries yields aggregated results over the data. The confidential nature of certain data leads malicious people to use means of deducing this information, among them data inference methods. To solve these security problems, researchers have proposed several solutions based on the architecture of the warehouses, the design phase, the cuboids of a data cube and the materialized views of multidimensional queries. In this work, we propose a mechanism for detecting inference in data warehouses. The objective of this approach is to highlight partial inferences during the execution of a SUM-type multidimensional OLAP (Online Analytical Processing) query. The goal is to prevent a data warehouse user from inferring sensitive information to which he or she has no access rights under the access control policy in force. Our study improves the model proposed in a previous study by Triki, which follows an approach based on average deviations; the aim is to propose an optimal threshold to better detect inferences. The results we obtain are better than those of the previous study.
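As a simplified illustration of guarding a SUM query against inference (the thresholds here are classic small-group and dominance checks with invented values; the cited work derives its threshold from a deviation-based model instead):

```python
# Refuse a SUM aggregate when individual values could be inferred from it.
MIN_ROWS = 3          # small-group restriction: too few rows is revealing
DOMINANCE = 0.8       # refuse if one row contributes over 80% of the SUM

def safe_sum(values):
    total = sum(values)
    if len(values) < MIN_ROWS:
        return None                      # near-exact individual values leak
    if total > 0 and max(values) / total > DOMINANCE:
        return None                      # the dominant row is inferable
    return total

print(safe_sum([10.0, 12.0, 11.0]))  # 33.0
print(safe_sum([10.0, 90.0]))        # None (too few rows)
print(safe_sum([1.0, 1.0, 98.0]))    # None (dominant row)
```

A real detector must also consider combinations of queries, since differences of two permitted SUMs can isolate a single row.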
17

Zhou, Li Juan, Hai Jun Geng, and Ming Sheng Xu. "Materialized View Selection in the Data Warehouse." Applied Mechanics and Materials 29-32 (August 2010): 1133–38. http://dx.doi.org/10.4028/www.scientific.net/amm.29-32.1133.

Abstract:
A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. Materialized view selection is one of the crucial decisions in designing a data warehouse for optimal efficiency. The goal is to select an appropriate set of views that minimizes the sum of the query response time and the cost of maintaining the selected views, given a limited amount of resources, e.g., materialization time, storage space, etc. In this article, we present an improved PGA algorithm for the view selection problem; the experiments show that our proposed algorithm is superior.
18

Kerkad, Amira, Ladjel Bellatreche, Pascal Richard, Carlos Ordonez, and Dominique Geniet. "A Query Beehive Algorithm for Data Warehouse Buffer Management and Query Scheduling." International Journal of Data Warehousing and Mining 10, no. 3 (July 2014): 34–58. http://dx.doi.org/10.4018/ijdwm.2014070103.

Abstract:
Analytical queries, like those used in data warehouses and OLAP, are generally interdependent. This is due to the fact that the database is usually modeled with a denormalized star schema or its variants, where most queries pass through a large central fact table. Such interaction has been largely exploited in query optimization techniques such as materialized views. Nevertheless, such approaches usually ignore buffer management and assume queries have a fixed order and are known in advance. We believe such assumptions are too strong and thus they need to be revisited and simplified. In this paper, we study the combination of two problems: buffer management and query scheduling, in both static and dynamic scenarios. We present an NP-hardness study of the joint problem, highlighting its complexity. We then introduce a new and highly efficient algorithm inspired by a beehive. We conduct an extensive experimental evaluation on a real DBMS showing the superiority of our algorithm compared to previous ones as well as its excellent scalability.
19

Khemiri, Rym. "User Profile-Driven Data Warehouse Summary for Adaptive OLAP Queries." International Journal of Database Management Systems 4, no. 6 (December 31, 2012): 69–84. http://dx.doi.org/10.5121/ijdms.2012.4606.

20

CHANG, J. Y. "Processing Aggregate Queries with Materialized Views in Data Warehouse Environment." IEICE Transactions on Information and Systems E88-D, no. 4 (April 1, 2005): 726–38. http://dx.doi.org/10.1093/ietisy/e88-d.4.726.

21

Rahman, Nayem. "SQL Scorecard for Improved Stability and Performance of Data Warehouses." International Journal of Software Innovation 4, no. 3 (July 2016): 22–37. http://dx.doi.org/10.4018/ijsi.2016070102.

Abstract:
Scorecard-based measurement techniques are used by organizations to measure the performance of their business operations. A scorecard approach can be applied to a database system to measure the performance of the SQL (Structured Query Language) being executed and the extent of resources it uses. In a large data warehouse, thousands of jobs run daily via batch cycles to refresh different subject areas. Simultaneously, thousands of queries from business intelligence tools and ad-hoc queries are executed twenty-four-seven. There needs to be a controlling mechanism to ensure these batch jobs and queries are efficient and do not consume more database system resources than optimal. The authors propose measuring SQL query performance via a scorecard tool. The motivation behind using a scorecard tool is to make sure that the resource consumption of SQL queries is predictable and the database system environment is stable. The experimental results show that queries that pass the scorecard evaluation criteria tend to utilize an optimal level of database system computing resources. These queries also show improved parallel efficiency (PE) in using computing resources (CPU, I/O and spool space), which demonstrates the usefulness of the SQL scorecard.
22

Chakraborty, Sonali Ashish, and Jyotika Doshi. "Reducing Query Processing Time for Non-Synonymous Materialized Queries With Differed Criteria." International Journal of Natural Computing Research 8, no. 2 (April 2019): 75–93. http://dx.doi.org/10.4018/ijncr.2019040104.

Abstract:
Results of OLAP queries for strategic decision making are generated using warehouse data. For frequent queries, processing overhead increases as the same results are generated by traversing a huge volume of warehouse data. The authors suggest saving time for frequent queries by storing them in a relational database referred to as MQDB, along with their results and metadata information. Incremental updates for synonymous materialized queries are done using data marts. This article focuses on saving processing time for non-synonymous queries with differing criteria. The criteria are the query conditions specified in a WHERE or HAVING clause, apart from the equijoin condition. Defined rules determine whether new results can be derived from existing stored results. If the criteria of the fired query are a subset of the criteria of the stored query, results are extracted from the existing results using a MINUS operation. When the criteria are a superset of the stored query criteria, new results are appended to the existing results using a UNION operation.
23

LEE, MINSOO, and JOACHIM HAMMER. "SPEEDING UP MATERIALIZED VIEW SELECTION IN DATA WAREHOUSES USING A RANDOMIZED ALGORITHM." International Journal of Cooperative Information Systems 10, no. 03 (September 2001): 327–53. http://dx.doi.org/10.1142/s0218843001000370.

Abstract:
A data warehouse stores information that is collected from multiple, heterogeneous information sources for the purpose of complex querying and analysis. Information in the warehouse is typically stored in the form of materialized views, which represent pre-computed portions of frequently asked queries. One of the most important tasks when designing a warehouse is the selection of materialized views to be maintained in the warehouse. The goal is to select a set of views in such a way as to minimize the total query response time over all queries, given a limited amount of time for maintaining the views (maintenance-cost view selection problem). In this paper, we propose an efficient solution to the maintenance-cost view selection problem using a genetic algorithm for computing a near-optimal set of views. Specifically, we explore the maintenance-cost view selection problem in the context of OR view graphs. We show that our approach represents a dramatic improvement in time complexity over existing search-based approaches using heuristics. Our analysis shows that the algorithm consistently yields a solution that lies within 10% of the optimal query benefit while at the same time exhibiting only a linear increase in execution time. We have implemented a prototype version of our algorithm which is used to simulate the measurements used in the analysis of our approach.
24

Fong, Joseph, Qing Li, and Shi-Ming Huang. "Universal Data Warehousing Based on a Meta-Data Modeling Approach." International Journal of Cooperative Information Systems 12, no. 03 (September 2003): 325–63. http://dx.doi.org/10.1142/s0218843003000772.

Abstract:
A data warehouse contains a vast amount of data to support the complex queries of various decision support systems (DSSs). It needs to store materialized views of data, which must be available consistently and instantaneously. Using a frame metadata model, this paper presents an architecture for universal data warehousing with different data models. The frame metadata model represents the metadata of a data warehouse, which structures an application domain into classes, and integrates schemas of heterogeneous databases by capturing their semantics. A star schema is derived from user requirements based on the integrated schema, catalogued in the metadata, which stores the schemas of the relational database (RDB) and object-oriented database (OODB). Data materialization between RDB and OODB is achieved by unloading the source database into a sequential file and reloading it into the target database, through which an object-relational view can be defined so as to allow users to obtain the same warehouse view in different data models simultaneously. We describe our procedures for building the relational view of the star schema by multidimensional SQL queries, and the object-oriented view of the data warehouse by online analytical processing (OLAP) through method calls, derived from the integrated schema. To validate our work, an application prototype system has been developed in a product-sales data warehousing domain based on this approach.
25

Ibtisam, Ferrahi Ibtisam, Sandro Bimonte, and Kamel Boukhalfa. "Logical and Physical Design of Spatial Non-Strict Hierarchies in Relational Spatial Data Warehouse." International Journal of Data Warehousing and Mining 15, no. 1 (January 2019): 1–18. http://dx.doi.org/10.4018/ijdwm.2019010101.

Abstract:
The emergence of spatial or geographic data in data warehouse systems calls for new models that support the storage and manipulation of such data. The need to build spatial data warehouses (SDWs) and to optimize SOLAP queries has continued to attract the interest of researchers in recent years. Several spatial data models have been investigated to extend classical multidimensional data models with spatial concepts. However, most existing models do not handle non-strict spatial hierarchies. Moreover, the complexity of spatial data makes the execution time of spatial queries very considerable. Spatial indexing methods are often applied to optimize access to large volumes of data and to help reduce the cost of spatial OLAP queries, but most existing indexes support only predefined spatial hierarchies. The authors show in this article that the logical models proposed in the literature and existing indexing techniques are not suitable for non-strict hierarchies. They propose a new logical schema supporting non-strict hierarchies and a bitmap index to optimize queries defined over spatial dimensions with several non-strict hierarchies.
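For readers unfamiliar with the bitmap index mentioned above, here is a minimal sketch of the idea: one bitmask per attribute value, with conjunctive selections answered by bitwise AND (the zone/year columns are invented; the paper's index additionally encodes non-strict spatial hierarchies):

```python
# Two dimension columns of a tiny fact table. All values invented.
zone = ["urban", "rural", "urban", "urban", "rural"]
year = [2019, 2019, 2020, 2019, 2020]

def build_bitmaps(column):
    maps = {}
    for i, v in enumerate(column):
        maps[v] = maps.get(v, 0) | (1 << i)   # set bit i for this value
    return maps

zone_idx, year_idx = build_bitmaps(zone), build_bitmaps(year)

# Rows that are urban AND from 2019: a single bitwise AND of two bitmaps.
mask = zone_idx["urban"] & year_idx[2019]
selected = [i for i in range(len(zone)) if mask & (1 << i)]
print(selected)  # [0, 3]
```

Bitmaps pay off on low-cardinality dimension attributes, which is why they are a staple of warehouse indexing.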
26

Zhou, Li Juan, Hai Jun Geng, and Ming Sheng Xu. "Research on Materialized View Selection in the Data Warehouse." Applied Mechanics and Materials 55-57 (May 2011): 361–66. http://dx.doi.org/10.4028/www.scientific.net/amm.55-57.361.

Abstract:
Materialized views are an effective method for improving the efficiency of queries in a data warehouse system, and materialized view selection is one of the most important design decisions. In this paper, an algorithm is proposed to select a set of materialized views under maintenance cost constraints with the purpose of minimizing the total query processing cost; the algorithm adopts a dynamic penalty function to handle the resource constraints of view selection. The experimental study shows that the algorithm yields better solutions with high efficiency.
27

Chakraborty, Sonali Ashish, and Jyotika Doshi. "An Approach for Retrieving Faster Query Results From Data Warehouse Using Synonymous Materialized Queries." International Journal of Data Warehousing and Mining 17, no. 2 (April 2021): 85–105. http://dx.doi.org/10.4018/ijdwm.2021040105.

Abstract:
The enterprise data warehouse stores an enormous amount of data collected from multiple sources for analytical processing and strategic decision making. The analytical processing is done using online analytical processing (OLAP) queries, where performance in terms of result retrieval time is an important factor. The major existing approaches for retrieving results from a data warehouse are multidimensional data cubes and materialized views, which incur high storage, processing, and maintenance costs. The present study strives to achieve a simpler and faster query result retrieval approach with reduced storage space and minimal maintenance cost. The execution time of frequent queries is saved by storing their results for reuse when the query is fired the next time. The executed OLAP queries are stored along with their results and the necessary metadata information in a relational database referred to as the materialized query database (MQDB). The tables, fields, functions, relational operators, and criteria used in an input query are matched with those of a stored query; if they are found to be the same, the input query and the stored query are considered synonymous queries. Further, the stored query is checked for incremental updates: if no incremental update is required, the existing stored results are fetched from MQDB; otherwise, only the incremental result is processed from the data marts. The performance of the MQDB model is evaluated against the major existing approaches, and a significant reduction in query processing time is observed. The developed model will be useful for organizations keeping their historical records in a data warehouse.
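The query-matching idea summarized in this abstract can be sketched in a few lines of Python. This is an illustrative simplification under invented names (`MaterializedQueryDB`, `signature`), not the authors' implementation; a real system would use a proper SQL parser and the incremental-update check the abstract describes.

```python
# Simplified sketch of matching an incoming OLAP query against stored
# queries, in the spirit of the MQDB approach. The parsing here is
# deliberately naive; a real system would parse tables, fields,
# functions, operators, and criteria separately.

def signature(query):
    """Reduce a query string to a canonical signature: lowercase tokens,
    deduplicated and sorted, so trivially reordered or differently
    cased but synonymous queries match."""
    tokens = query.lower().replace(",", " ").split()
    return tuple(sorted(set(tokens)))

class MaterializedQueryDB:
    def __init__(self):
        self._store = {}  # signature -> cached query result

    def lookup(self, query):
        """Return the stored result if a synonymous query exists, else None."""
        return self._store.get(signature(query))

    def save(self, query, result):
        self._store[signature(query)] = result

mqdb = MaterializedQueryDB()
mqdb.save("SELECT region, SUM(sales) FROM fact GROUP BY region", [("east", 10)])
# A synonymous query (different case) hits the cache instead of the warehouse:
hit = mqdb.lookup("select region, sum(sales) from fact group by region")
```

A lookup for a query with a different signature simply misses the cache, and the system would then execute it against the warehouse and save the result.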
28

Wang, Liyun, Tingting Liu, and Dingfeng Wu. "Research on display system for agricultural science and technology support data based on Microsoft data warehouse." MATEC Web of Conferences 309 (2020): 04014. http://dx.doi.org/10.1051/matecconf/202030904014.

Abstract:
Based on an introduction to the Microsoft data warehouse service and the related software architecture, and in order to solve the problem of slow analytical queries caused by the large volume, complex classification, and wide scope of the original data, a display system for agricultural science and technology support data is presented in this paper. In the proposed system, agricultural science and technology support data are shown clearly and intuitively.
29

Kumar, Amit, and T. V. Vijay Kumar. "Materialized View Selection Using Swap Operator Based Particle Swarm Optimization." International Journal of Distributed Artificial Intelligence 13, no. 1 (January 2021): 58–73. http://dx.doi.org/10.4018/ijdai.2021010103.

Abstract:
The data warehouse is a key data repository of any business enterprise, storing enormous historical data meant for answering analytical queries. These queries need to be processed efficiently in order to make timely decisions. One way to achieve this is by materializing views over the data warehouse. An n-dimensional star schema can be mapped into an n-dimensional lattice, from which the Top-K views can be selected for materialization. Selection of such Top-K views is an NP-Hard problem. Several metaheuristic algorithms have been used to address this view selection problem. In this paper, a swap operator-based particle swarm optimization technique is adapted to address it.
30

Mountantonakis, Michalis, Nikos Minadakis, Yannis Marketakis, Pavlos Fafalios, and Yannis Tzitzikas. "Quantifying the Connectivity of a Semantic Warehouse and Understanding its Evolution over Time." International Journal on Semantic Web and Information Systems 12, no. 3 (July 2016): 27–78. http://dx.doi.org/10.4018/ijswis.2016070102.

Abstract:
In many applications one has to fetch and assemble pieces of information coming from more than one source for building a semantic warehouse offering more advanced query capabilities. In this paper the authors describe the corresponding requirements and challenges, and they focus on the aspects of quality and value of the warehouse. For this reason they introduce various metrics (or measures) for quantifying its connectivity, and consequently its ability to answer complex queries. The authors demonstrate the behaviour of these metrics in the context of a real and operational semantic warehouse, as well as on synthetically produced warehouses. The proposed metrics allow someone to get an overview of the contribution (to the warehouse) of each source and to quantify the value of the entire warehouse. Consequently, these metrics can be used for advancing data/endpoint profiling and for this reason the authors use an extension of VoID (for making them publishable). Such descriptions can be exploited for dataset/endpoint selection in the context of federated search. In addition, the authors show how the metrics can be used for monitoring a semantic warehouse after each reconstruction reducing thereby the cost of quality checking, as well as for understanding its evolution over time.
31

Adnan, Refed, and Talib M. J. Abbas. "MATERIALIZED VIEWS QUANTUM OPTIMIZED PICKING for INDEPENDENT DATA MARTS QUALITY." Iraqi Journal of Information & Communications Technology 3, no. 1 (April 11, 2020): 26–39. http://dx.doi.org/10.31987/ijict.3.1.88.

Abstract:
Precise and timely unified information, along with quick and effective query response times, is the fundamental requirement for the success of any collection of independent data marts (data warehouse) forming a Fact Constellation (Galaxy) Schema. Because of the storage required by materialized views, materializing all views is practically impossible; thus, picking suitable materialized views (MVs) is one of the key decisions in designing a Fact Constellation Schema for optimal efficiency. This study presents a framework for picking the best materialized views using the Quantum Particle Swarm Optimization (QPSO) algorithm, a stochastic algorithm, in order to achieve an effective combination of good query response time, low query handling cost, and low view maintenance cost. The results reveal that the proposed QPSO-based method is better than other techniques, as shown by comparing query response times against the response time of the same queries on the materialized views. Executing a query on the base tables takes five times longer than executing it on the materialized views: the response time of queries through MV access was found to be 0.084 seconds, while direct-access queries took 0.422 seconds. This shows that query performance through materialized view access is 402.38% better than direct access to the logical data warehouse.
32

Bellatreche, Ladjel, and Amira Kerkad. "Query Interaction Based Approach for Horizontal Data Partitioning." International Journal of Data Warehousing and Mining 11, no. 2 (April 2015): 44–61. http://dx.doi.org/10.4018/ijdwm.2015040103.

Abstract:
With the explosion of data, several applications are designed around analytical aspects, with data warehousing technology at the heart of the construction chain. The exploitation of a data warehouse is usually performed through complex queries involving selections, joins and aggregations. These queries have three characteristics: (1) they are routine, (2) they are numerous, and (3) they share many operations. This interaction has been largely identified in the context of multi-query optimization, where graph data structures were proposed to capture it. During physical design, these structures have also been used to select redundant optimization structures such as materialized views and indexes. Horizontal data partitioning (HDP) is another, non-redundant optimization structure that can be selected in the physical design phase. It is a pre-condition for designing extremely large databases in several environments: centralized, distributed, parallel and cloud. It aims to reduce the cost of the above operations. In HDP, the optimization space of potential candidates for partitioning grows exponentially with the problem size, making the problem NP-hard. This paper proposes a new approach based on query interactions that selects a partitioning schema of a data warehouse in a divide-and-conquer manner to achieve an improved trade-off between the optimization algorithm's speed and the quality of the solution. The effectiveness of the approach is proven through a validation using the Star Schema Benchmark (100 GB) on Oracle 11g.
33

Zhu, Xuejun, Xiaona Jin, Dongdong Jia, Naiwei Sun, and Pu Wang. "Application of Data Mining in an Intelligent Early Warning System for Rock Bursts." Processes 7, no. 2 (January 22, 2019): 55. http://dx.doi.org/10.3390/pr7020055.

Abstract:
Given that rock burst accidents occur frequently, a basic framework for an intelligent early warning system for rock bursts (IEWSRB) is constructed based on several big data technologies from the computer industry, including data mining, databases and data warehouses. A data warehouse is then modeled for rock burst monitoring data, and the effective application of data mining technology in this system is discussed in detail. Furthermore, we focus on the K-means clustering algorithm, and a data visualization interface based on the Browser/Server (B/S) mode is developed. It is mainly based on the Java language, supplemented by Cascading Style Sheets (CSS), JavaScript and HyperText Markup Language (HTML), with Tomcat as the server and MySQL as the database of the JavaWeb project for the rock burst monitoring data warehouse. The application of data mining technology in IEWSRB can improve the existing rock burst monitoring system and enhance prediction. It also enables real-time queries and analysis of monitoring data through browsers, which is very convenient. Hence, it can make important contributions to the safe and efficient production of coal mines and the sustainable development of the coal economy.
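The K-means clustering this abstract mentions can be illustrated with a minimal one-dimensional sketch; the monitoring readings below are invented for illustration (the system itself is a JavaWeb application, not this Python code):

```python
# Minimal one-dimensional K-means (k = 2), of the kind applied to
# monitoring readings to separate normal values from anomalous ones.

def kmeans_1d(values, k=2, iters=20):
    # Initialize the two centers at the smallest and largest readings.
    centers = [min(values), max(values)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each reading to its nearest center.
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Recompute each center as the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Invented readings: a low "normal" group and a high "anomalous" group.
centers, clusters = kmeans_1d([1.0, 1.2, 0.9, 8.0, 8.3, 7.9])
```

With well-separated groups like these, the two centers converge to the group means after the first iteration.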
34

Qusyairi, Muhammad, Made Sudarma, and Agus Dharma. "Designing Data Warehouse Model Using Benefit Cost Ratio Analysis Method." Control Systems and Computers, no. 2-3 (292-293) (July 2021): 84–91. http://dx.doi.org/10.15407/csc.2021.02.084.

Abstract:
A data warehouse serves to integrate and condense a company's scattered data, thereby helping executives analyze existing data to reach quick and accurate strategic decisions. This research aims to design a data warehouse that applies the benefit-cost ratio as a measure of the feasibility of the company's business, enabling disparate data to be combined with the results of the company's in-depth analysis. In designing the model, this research succeeded in building a data warehouse that applies the benefit-cost ratio method to an in-depth analysis of the financial sector, providing the feasibility and percentage results of the current business. In summary, source data processed through extraction, transformation, and loading, built on a star schema, affects the quality of the data generated for query processing. In addition, the results of the data warehouse are used for decision making and for formulating a feasible business strategy.
35

Kumar, Amit, and T. V. Vijay Kumar. "Materialized View Selection Using Set Based Particle Swarm Optimization." International Journal of Cognitive Informatics and Natural Intelligence 12, no. 3 (July 2018): 18–39. http://dx.doi.org/10.4018/ijcini.2018070102.

Abstract:
A data warehouse is a central repository of historical data designed primarily to support analytical processing. These analytical queries are exploratory, long and complex in nature. Further, the rapid and continuous growth in the size of a data warehouse increases the response times of such queries. Query response times need to be reduced in order to speed up decision making. This problem, being NP-Complete, can be appropriately dealt with by using swarm intelligence techniques. One such technique, set-based particle swarm optimization (SPSO), has been proposed to address it. Accordingly, an SPSO-based view selection algorithm (SPSOVSA), which selects the Top-K views from a multidimensional lattice, is proposed. An experimental comparison of SPSOVSA with the most fundamental view selection algorithm shows that SPSOVSA is able to select comparatively better quality Top-K views for materialization. The materialization of these selected views would improve the performance of analytical queries and lead to efficient decision making.
36

Ben Messaoud, Ines, Abdulrahman A. Alshdadi, and Jamel Feki. "Building a Document-Oriented Warehouse Using NoSQL." International Journal of Operations Research and Information Systems 12, no. 2 (April 2021): 33–54. http://dx.doi.org/10.4018/ijoris.20210401.oa3.

Abstract:
Traditional data warehousing approaches should adapt to take novel needs and data structures into consideration. In this context, NoSQL technology is progressively gaining a place in the research and industry domains. This paper proposes an approach for building a NoSQL document-oriented warehouse (DocW). The approach has two methods: 1) the document warehouse builder and 2) the NoSQL converter. The first method generates the DocW schema as a galaxy model, whereas the second translates the generated galaxy into a document-oriented NoSQL model, relying on two types of rules: structure rules and hierarchical rules. Furthermore, to help in understanding the textual results of analytical queries on the NoSQL DocW, the authors define two semantic operators, S-Drill-Up and S-Drill-Down, to aggregate/expand the terms of a query. The implementation of the proposals uses MongoDB and Talend. The experiment uses the medical collection CLEF-2007 and two metrics, write request latency and read request latency, to evaluate the loading time and the query response time, respectively.
37

Kumar, Amit, and T. V. Vijay Kumar. "Improved Quality View Selection for Analytical Query Performance Enhancement Using Particle Swarm Optimization." International Journal of Reliability, Quality and Safety Engineering 24, no. 06 (November 17, 2017): 1740001. http://dx.doi.org/10.1142/s0218539317400010.

Abstract:
A data warehouse, which is a central repository of the detailed historical data of an enterprise, is designed primarily for supporting high-volume analytical processing in order to support strategic decision-making. Queries for such decision-making are exploratory, long and intricate in nature and involve the summarization and aggregation of data. Furthermore, the rapidly growing volume of data warehouses makes the response times of queries substantially large. The query response times need to be reduced in order to reduce delays in decision-making. Materializing an appropriate subset of views has been found to be an effective alternative for achieving acceptable response times for analytical queries. This problem, being an NP-Complete problem, can be addressed using swarm intelligence techniques. One such technique, i.e., the similarity interaction operator-based particle swarm optimization (SIPSO), has been used to address this problem. Accordingly, a SIPSO-based view selection algorithm (SIPSOVSA), which selects the Top-K views from a multidimensional lattice, has been proposed in this paper. Experimental comparison with the most fundamental view selection algorithm shows that the former is able to select relatively better quality Top-K views for materialization. As a result, the views selected using SIPSOVSA improve the performance of analytical queries that lead to greater efficiency in decision-making.
38

THEODORATOS, DIMITRI, and MOKRANE BOUZEGHOUB. "DATA CURRENCY QUALITY SATISFACTION IN THE DESIGN OF A DATA WAREHOUSE." International Journal of Cooperative Information Systems 10, no. 03 (September 2001): 299–326. http://dx.doi.org/10.1142/s0218843001000369.

Abstract:
A Data Warehouse (DW) is a large collection of data integrated from multiple distributed autonomous databases and other information sources. A DW can be seen as a set of materialized views defined over the remote source data. Until now, research work on DW design has been restricted to quantitatively selecting view sets for materialization, while quality issues in DW design have been neglected. In this paper we suggest a novel statement of the DW design problem that takes quality factors into account. We design a DW system architecture that supports performance and data consistency quality goals. In this framework we present a high-level approach that allows one to check whether a view selection guaranteeing a data completeness quality goal also satisfies a data currency quality goal. This approach is based on an AND/OR dag representation of multiple queries and views. It also allows determining the minimal change propagation frequencies that satisfy the data currency quality goal, along with the optimal query evaluation and change propagation plans. Our results can be directly used for a quality-driven design of a DW.
39

Chen, Xiqian. "An Efficient Indexing Scheme for Range Aggregate Queries in Spatial Data Warehouse." Journal of Computer Research and Development 43, no. 1 (2006): 75. http://dx.doi.org/10.1360/crad20060112.

40

Raza, Basit, Adeel Aslam, Asma Sher, Ahmad Kamran Malik, and Muhammad Faheem. "Autonomic performance prediction framework for data warehouse queries using lazy learning approach." Applied Soft Computing 91 (June 2020): 106216. http://dx.doi.org/10.1016/j.asoc.2020.106216.

41

Sarika Prakash, Kale, and P. M. Joe Prathap. "Evaluating Aggregate Functions of Iceberg Query Using Priority Based Bitmap Indexing Strategy." International Journal of Electrical and Computer Engineering (IJECE) 7, no. 6 (December 1, 2017): 3745. http://dx.doi.org/10.11591/ijece.v7i6.pp3745-3752.

Abstract:
Aggregate functions and iceberg queries are important and common in many data warehouse applications because users are generally interested in looking for variance or unusual patterns. Typically, the queries executed on a data warehouse combine an aggregate function with a HAVING clause; such queries are known as iceberg queries. Efficient techniques for processing the aggregate functions of iceberg queries are especially important because their processing cost is much higher than that of basic relational operations such as SELECT and PROJECT. Currently available iceberg query processing techniques suffer from empty bitwise AND, OR and XOR operations and require more I/O accesses and time. To overcome these problems, the proposed research provides an efficient algorithm to execute iceberg queries using a priority-based bitmap indexing strategy. The priority-based approach processes bitmap vectors in priority order: intermediate results are evaluated to estimate the probability of a result, and fruitless operations are identified and skipped in advance, which helps reduce I/O accesses and time. The time and iterations required to process a query are reduced by 45-50% compared to the previous strategy. Experimental results prove the superiority of the priority-based approach over the previous bitmap processing approach.
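The core idea of skipping fruitless bitwise operations can be illustrated with a small sketch. Sets of row ids stand in for bit vectors here, and the data and threshold are invented; this is not the authors' algorithm verbatim:

```python
# Iceberg query sketch: find (A, B) value pairs whose joint count meets a
# threshold, using per-value bitmaps. Vectors are processed in priority
# order (most frequent first), and AND operations that cannot possibly
# reach the threshold are skipped in advance.

def bitmaps(column):
    """Build one bitmap (a set of row ids, standing in for a bit vector)
    per distinct value of the column."""
    out = {}
    for row_id, value in enumerate(column):
        out.setdefault(value, set()).add(row_id)
    return out

def iceberg(col_a, col_b, threshold):
    a_maps, b_maps = bitmaps(col_a), bitmaps(col_b)
    # Priority order: values with the largest bitmaps first.
    a_order = sorted(a_maps, key=lambda v: -len(a_maps[v]))
    results = {}
    for va in a_order:
        if len(a_maps[va]) < threshold:
            break  # all remaining vectors are even smaller: fruitless
        for vb, bits_b in b_maps.items():
            if len(bits_b) < threshold:
                continue  # the AND can never reach the threshold: skip it
            count = len(a_maps[va] & bits_b)
            if count >= threshold:
                results[(va, vb)] = count
    return results
```

On a tiny invented table, `iceberg(["x", "x", "x", "y"], ["p", "p", "q", "p"], 2)` evaluates only one AND: the pairs involving "q" and "y" are pruned by the length checks before any intersection is computed.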
42

Arun, Biri, and T. V. Vijay Kumar. "Materialized View Selection using Marriage in Honey Bees Optimization." International Journal of Natural Computing Research 5, no. 3 (July 2015): 1–25. http://dx.doi.org/10.4018/ijncr.2015070101.

Abstract:
A data warehouse is designed to cater to the strategic decision-making needs of an organization. Most queries posed on it are online analytical queries, which are complex and computation-intensive in nature and have high response times when processed against a large data warehouse. This time can be substantially reduced by materializing pre-computed summarized views and storing them in the data warehouse. All possible views cannot be materialized due to storage space constraints, and the optimal selection of a subset of views has been shown to be an NP-Complete problem. This view selection problem is addressed in this paper by selecting a beneficial set of views, from amongst all possible views, using the swarm intelligence technique Marriage in Honey Bees Optimization (MBO). An MBO-based view selection algorithm (MBOVSA), which aims to select views that incur the minimum total cost of evaluating all the views (TVEC), is proposed. In MBOVSA, the search is intensified by incorporating the royal jelly feeding phase into MBO. MBOVSA, when compared with the most fundamental greedy-based view selection algorithm, HRUA, is able to select comparatively better quality views.
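The greedy baseline mentioned here (HRUA) follows the classic benefit-per-view greedy scheme over a view lattice. A toy sketch, with an invented four-view lattice and sizes, might look like this:

```python
# Greedy Top-K view selection over a tiny, invented lattice. A query on a
# view is answered from its smallest materialized ancestor; the root
# (the raw fact table) is always materialized.

# view -> (size, ancestors including the view itself)
LATTICE = {
    "root": (100, ["root"]),
    "a":    (50,  ["a", "root"]),
    "b":    (30,  ["b", "root"]),
    "ab":   (20,  ["ab", "a", "b", "root"]),
}

def eval_cost(view, materialized):
    """Cost of answering a query on `view`: the size of its smallest
    materialized ancestor."""
    return min(LATTICE[a][0] for a in LATTICE[view][1] if a in materialized)

def total_cost(materialized):
    """Total cost of evaluating one query per view (TVEC-style measure)."""
    return sum(eval_cost(v, materialized) for v in LATTICE)

def greedy_top_k(k):
    """Repeatedly materialize the view with the largest cost reduction."""
    materialized = {"root"}
    for _ in range(k):
        best = max((v for v in LATTICE if v not in materialized),
                   key=lambda v: total_cost(materialized)
                               - total_cost(materialized | {v}))
        materialized.add(best)
    return materialized
```

In this toy lattice the first greedy pick is "b" (it also speeds up queries on "ab"), then "a"; metaheuristics such as MBO or PSO search the same solution space globally instead of one view at a time.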
43

Putra, I. Made Suwija, and Dewa Komang Tri Adhitya Putra. "Rancang Bangun Engine ETL Data Warehouse dengan Menggunakan Bahasa Python." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 3, no. 2 (August 1, 2019): 113–23. http://dx.doi.org/10.29207/resti.v3i2.872.

Abstract:
Big companies with many branches in different locations often have difficulty analyzing the transaction processes of each branch. The problem experienced by company management is that the delivery of massive data from the branches to the head office makes the analysis of the company's performance slow and inaccurate. The results of this process are used as input for decision making, which produces the right information only if the data are complete and relevant. The right method for collecting massive data is the data warehouse approach. A data warehouse is a relational database designed to optimize queries for Online Analytical Processing (OLAP) over the transaction processes of various data sources, recording every data change that occurs so that the data become more structured. For data collection, a data warehouse uses extract, transform, and load (ETL) steps to read data from the Online Transaction Processing (OLTP) system, transform the data into a uniform structure, and save it to its final location in the data warehouse. This study presents a solution for implementing ETL, using the Python programming language, that can work automatically or manually as needed, so as to facilitate the ETL process and adapt to the conditions of the database in the company's system.
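A minimal version of the extract-transform-load flow described in this abstract might look like the following; the in-memory "OLTP source" and field names are invented for illustration and are not from the paper's engine:

```python
# Minimal ETL sketch: read rows from an OLTP source, normalize them to a
# uniform structure, and load them into a warehouse fact table.

def extract(source_rows):
    """Read raw transaction rows (here, a list standing in for an
    OLTP query result)."""
    return list(source_rows)

def transform(rows):
    """Make the structure uniform: uppercase the branch code, coerce the
    amount to float, and drop incomplete records."""
    cleaned = []
    for row in rows:
        if row.get("branch") and row.get("amount") is not None:
            cleaned.append({"branch": row["branch"].upper(),
                            "amount": float(row["amount"])})
    return cleaned

def load(rows, warehouse):
    """Append the cleaned rows to the warehouse fact table."""
    warehouse.extend(rows)
    return warehouse

# Invented branch transactions standing in for the OLTP system.
oltp = [{"branch": "jkt", "amount": "120.5"},
        {"branch": None, "amount": "10"},   # incomplete record: dropped
        {"branch": "dps", "amount": 40}]
fact_table = load(transform(extract(oltp)), [])
```

A production engine would read from a real OLTP database, log each step, and schedule runs automatically or on demand, as the abstract describes.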
44

Vijayarajeswari, R., and M. Kannan. "Optimization of Multiple Correlated Queries by Detecting Similar Data Source with Hive Warehouse." Indian Journal of Public Health Research & Development 9, no. 2 (2018): 362. http://dx.doi.org/10.5958/0976-5506.2018.00149.3.

45

Pessina, Francesco, Marco Masseroli, and Arif Canakoglu. "Visual Composition of Complex Queries on an Integrative Genomic and Proteomic Data Warehouse." Engineering 05, no. 10 (2013): 94–98. http://dx.doi.org/10.4236/eng.2013.510b019.

46

Mohseni, Mohsen, and Mohammad Karim Sohrabi. "MVPP-Based Materialized View Selection in Data Warehouses Using Simulated Annealing." International Journal of Cooperative Information Systems 29, no. 03 (August 28, 2020): 2050001. http://dx.doi.org/10.1142/s021884302050001x.

Abstract:
The process of extracting data from different heterogeneous data sources, transforming them into an integrated, unified and cleaned repository, and storing the result as a single entity leads to the construction of a data warehouse (DW), which facilitates access to data for the users of information systems and decision support systems. Due to the enormous volumes of data involved, the processing of analytical queries in decision support systems needs to scan very large amounts of data, which has a negative effect on system response time. Because of the special importance of online analytical processing (OLAP) in these systems, to enhance performance and improve query response time, an appropriate number of views of the DW are selected for materialization and used to answer analytical queries instead of direct access to the base relations. Memory constraints and view maintenance overhead are the two main limitations that make it impossible, in most cases, to materialize all views of the DW. Selecting a proper set of DW views for materialization, called the materialized view selection (MVS) problem, is an important research issue that has been the focus of various papers. In this paper, we propose a method called P-SA to select an appropriate set of views using an improved version of the simulated annealing (SA) algorithm that utilizes a proper neighborhood selection strategy. P-SA uses the multiple view processing plan (MVPP) structure for selecting the views. Data and queries of a benchmark DW have been used in the experiments to evaluate the introduced method. The experimental results show better performance of P-SA compared to other SA-based MVS methods as the number of queries increases, in terms of the total cost of view maintenance and query processing. Moreover, the total cost of queries in P-SA is also better than that of the other important SA-based MVS methods in the literature when the size of the DW is increased.
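The simulated-annealing idea behind P-SA can be sketched in miniature. The cost model, parameters, and neighborhood below (a single bit flip) are invented for illustration; they are not the authors' MVPP-based formulation:

```python
import math
import random

# Toy simulated annealing for materialized view selection: a solution is
# a bit vector over candidate views. The (invented) cost charges the
# query cost when a view is NOT materialized and the maintenance cost
# when it IS.

QUERY_COST = [40, 25, 30, 20]  # query cost if the view is not materialized
MAINT_COST = [8, 6, 7, 5]      # maintenance cost if the view is materialized

def cost(solution):
    return sum(m if pick else q
               for pick, q, m in zip(solution, QUERY_COST, MAINT_COST))

def neighbor(solution):
    """Neighborhood move: flip one randomly chosen view in or out."""
    i = random.randrange(len(solution))
    flipped = list(solution)
    flipped[i] = 1 - flipped[i]
    return flipped

def anneal(steps=2000, temp=50.0, cooling=0.995, seed=0):
    random.seed(seed)
    current = [0, 0, 0, 0]  # start with nothing materialized
    best = current
    for _ in range(steps):
        cand = neighbor(current)
        delta = cost(cand) - cost(current)
        # Accept improvements always; accept worsening moves with a
        # probability that shrinks as the temperature cools.
        if delta <= 0 or random.random() < math.exp(-delta / temp):
            current = cand
            if cost(current) < cost(best):
                best = current
        temp *= cooling
    return best, cost(best)

best_views, best_cost = anneal()
```

In this toy instance every view is cheaper to maintain than to recompute, so the unique optimum materializes all four views at a total cost of 26, which the annealer reaches easily.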
47

Chen, Jianhui, Ning Zhong, and Jianhua Feng. "Developing a Provenance Warehouse for the Systematic Brain Informatics Study." International Journal of Information Technology & Decision Making 16, no. 06 (November 2017): 1581–609. http://dx.doi.org/10.1142/s0219622015500418.

Abstract:
Given the unstructured nature of brain data and the data-driven research process, provenances have become an important component of brain and health big data, rather than a mere accessory to raw experimental data, in the systematic Brain Informatics (BI) study. However, the existing file-based or transaction-database-based provenance queries cannot effectively support the quick understanding of data and the generation of decisions or suppositions in the systematic BI study, which need multi-aspect and multi-granularity provenance information and a process of incremental modification. Inspired by studies on data warehouse and online analytical processing (OLAP) technology, this paper proposes a BI provenance warehouse. The provenance cube and basic OLAP operations are defined, and a complete Data-Brain-based development approach is designed. Such a BI provenance warehouse represents a radically new way of developing a brain big data center, which regards raw experimental data, provenances and domain ontologies as different levels of brain big data for data sharing and mining.
48

Yoon, Seung-Chul, Tae Sung Shin, Kurt Lawrence, and Deana R. Jones. "Development of Online Egg Grading Information Management System with Data Warehouse Technique." Applied Engineering in Agriculture 36, no. 4 (2020): 589–604. http://dx.doi.org/10.13031/aea.13675.

Abstract:
Highlights A digital data collection and management system is developed for the USDA-AMS's shell-egg grading program. A database system consisting of OLTP, data warehouse and OLAP databases enables online data entry and trend reporting. Data and information management is done through web application servers. Users access the databases via web browsers. Abstract: This paper is concerned with the development of a web-based online data entry and reporting system, capable of centralized data storage and analytics of egg grading records produced by USDA egg graders. The USDA egg grading records are currently managed in paper form. While these records contain useful information for data-driven knowledge discovery and decision making, the paper-based egg grading record system has fundamental limitations in the effective and timely management of such information. Thus, there has been a demand to electronically and digitally store and manage the egg grading records in a database for data analytics and mining, such that the quality trends of eggs observed at various levels (e.g., nation or state) are readily available to decision makers. In this study, we report the design and implementation of a web-based online data entry and reporting information system (called the USDA Egg Grading Information Management System, EGIMS), based on a data warehouse framework. The developed information system consists of web applications for data entry and reporting, and internal databases for data storage, aggregation, and query processing. The internal databases consist of an online transaction processing (OLTP) database for data entry and retrieval, a data warehouse (DW) for centralized data storage, and an online analytical processing (OLAP) database for multidimensional analytical queries. Thus, the key design goal of the system was to build a platform that could provide web-based data entry and reporting capabilities while rapidly updating the OLTP, DW and OLAP databases.
The developed system was evaluated by a simulation study with statistically-modeled egg grading records of one hypothetical year. The study found that the EGIMS could handle approximately up to 600 concurrent users, 32 data entries per second and 164 report requests per second, on average. The study demonstrated the feasibility of an enterprise-level data warehouse system for the USDA and a potential to provide data analytics and data mining capabilities such that the queries about historical and current trends can be reported. Once fully implemented and tested in the field, the EGIMS is expected to provide a solution to modernize the egg grading practice of the USDA and produce the useful information for timely decisions and new knowledge discovery. Keywords: Data warehouse, Database, OLTP, OLAP, Egg grading, Information management, Web application, Information system, Data.
49

Xiahou, Jian Bing, Qian Qian Wei, Xiao Na Deng, and Xiao Wei Liu. "Research and Optimization of Materialized Views Selection Algorithm Based on the Data Warehouse." Advanced Materials Research 926-930 (May 2014): 3165–70. http://dx.doi.org/10.4028/www.scientific.net/amr.926-930.3165.

Abstract:
Materialized views are an effective method for improving the efficiency of queries in a data warehouse system, and the materialized view selection problem is one of the most important decisions in designing a data warehouse. This paper begins with a brief introduction to materialized views and a study of existing materialized view selection algorithms. Then, in order to select an appropriate set of views that minimizes the total query response time and the cost of maintaining the selected views under a limited storage space, a hybrid algorithm combining the advantages of the ant colony algorithm and the immune genetic algorithm is proposed. In addition, the shortcomings of this algorithm are analyzed and some improvement ideas are proposed, which optimize the efficiency of the algorithm to some extent.
50

Grigorev, Yu A., and V. A. Proletarskaya. "MODEL OF QUERIES EXECUTION PROCESSES TO DATA WAREHOUSE ON THE PARALLEL COMPUTING PLATFORM SPARK." Informatika i sistemy upravleniya, no. 59 (2019): 3–17. http://dx.doi.org/10.22250/isu.2019.59.3-17.
