To see the other types of publications on this topic, follow the link: OLAP technology.

Dissertations / Theses on the topic 'OLAP technology'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 34 dissertations / theses for your research on the topic 'OLAP technology.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Chui, Chun-kit, and 崔俊傑. "OLAP on sequence data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2010. http://hub.hku.hk/bib/B45823996.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Rehman, Nafees Ur [Verfasser]. "Extending the OLAP Technology for Social Media Analysis / Nafees Ur Rehman." Konstanz : Bibliothek der Universität Konstanz, 2015. http://d-nb.info/1079478485/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Zhao, Hongyan. "A visualization tool to support Online Analytical Processing." [Gainesville, Fla.] : University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE0000622.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Fu, Lixin. "CubiST++ a new approach to improving the performance of ad-hoc cube queries /." [Gainesville, Fla.] : University of Florida, 2001. http://etd.fcla.edu/etd/uf/2001/ank7110/masterfinal0.pdf.

Full text
Abstract:
Thesis (Ph. D.)--University of Florida, 2001.
Title from first page of PDF file. Document formatted into pages; contains x, 100 p.; also contains graphics. Vita. Includes bibliographical references (p. 95-99).
APA, Harvard, Vancouver, ISO, and other styles
5

Bell, Daniel M. "An evaluative case report of the group decision manager : a look at the communication and coordination issues facing online group facilitation /." free to MU campus, to others for purchase, 1998. http://wwwlib.umi.com/cr/mo/fullcit?p9901215.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Ferreira, André Luiz Nascente. "GESTÃO DO PROCESSO DE RELIGAÇÃO DE ÁGUA TRATADA NA CIDADE DE GOIÂNIA, UTILIZANDO OLAP E DATA WAREHOUSE." Pontifícia Universidade Católica de Goiás, 2013. http://localhost:8080/tede/handle/tede/2436.

Full text
Abstract:
Sanitation minimizes economic, social, and public health problems, resulting in a considerable improvement in quality of life. Companies in this line of work can improve their results by establishing quality in the management of their internal processes. One process specific to these companies deals with reconnections of treated water, which usually become necessary after supply has been cut off. This work applies Information Technology (IT) concepts to improve the management of this process, drawing on theories from Software Engineering, Data Warehousing, and OLAP. Overall, the work presents the development and deployment of an OLAP tool to help control the time spent performing treated-water reconnections in the city of Goiânia, state of Goiás, and analyzes the results of this tool, comparing scenarios before and after its deployment. According to this analysis, reconnection times improved after the developed tool was deployed.
APA, Harvard, Vancouver, ISO, and other styles
7

Баглай, Роман Олегович. "Інформаційна архітектура банку на основі хмарних технологій." Thesis, Національний технічний університет "Харківський політехнічний інститут", 2019. http://repository.kpi.kharkov.ua/handle/KhPI-Press/43523.

Full text
Abstract:
Dissertation for the degree of Candidate of Technical Sciences (PhD), specialty 05.13.06 – information technologies (122 – computer science). – National Technical University "Kharkiv Polytechnic Institute", Kharkiv, 2019. The object of research is the automated management of data flows in a bank's cloud-based information architecture. The subject of research is the models, methods, and information technologies for optimizing the processing of banking information on cloud infrastructure. The dissertation addresses the scientific and applied problem of increasing the efficiency of the bank's end-of-day processing by modernizing the bank's information architecture through the introduction of cloud technologies. It analyzes the feasibility of implementing cloud technologies to support the activities and business processes of banking institutions, and considers the problems and advantages of cloud technologies at different levels of the bank's architectural landscape, taking into account the specific regulatory requirements placed on financial institutions. The introduction substantiates the relevance of the topic, relates the work to broader research themes, formulates the purpose and objectives of the study, identifies the object, subject, and methods of research, presents the scientific novelty and practical significance of the results, and reports on their practical use, validation, and publication. The first chapter analyzes the basic approaches to banking information management and promising areas for applying cloud technologies to banking information systems. In particular, the object of study is decomposed into its components ("information architecture", "cloud technologies", "cloud computing", and "banking information system (IS)") so that methods of analysis and synthesis can then be applied. The problems and benefits of using cloud technologies in Ukrainian banking institutions remain under-researched: banks, which are not professional IT companies, are forced to invest in and maintain a significant amount of IT infrastructure and staff to run their own business processes, and in this situation cloud technologies reduce costs and increase the efficiency of banking information systems. The second chapter explores information technologies for minimizing the security threats that cloud technologies pose to automated banking systems, using single sign-on mechanisms to ensure strong user authentication. The mechanisms for implementing such authentication, and their practical application to securing and streamlining the bank's business processes, are investigated. Criteria are proposed for choosing a provider of identity and access management as a service, single sign-on mechanisms, and federated access scenarios for strong authentication of banking IS users. The author improves the method for assessing a bank's information security threats when implementing cloud technologies; it is based on a qualitative analysis of risk probability and loss volume grounded in the MITRE classification of cyberattacks, which makes it possible to optimize the protection of the bank's information architecture against potential cyberattacks.
The third chapter presents cloud-based IT solutions for the banking system that transfer large computational loads to the cloud while ensuring compliance with the General Data Protection Regulation (GDPR) and national regulators. Anonymization of customer data is described as a solution for avoiding the risks associated with customer-data confidentiality and with the need for customers' consent to place personal data in a cloud environment. The information technology for replicating banking IS data is improved on the basis of customer-data depersonification mechanisms. This strengthens the protection of data confidentiality and fulfills the NBU requirement that personalized banking client data be localized on servers physically located in Ukraine. The developed IT solution architecture combines real-time data processing and batch data loads: unlike the traditional approach, data is not only migrated to a database (DB) deployed on cloud infrastructure but also replicated back to the on-premise infrastructure. The security requirements governed by the standards of data confidentiality, integrity, and availability are fully met by the corresponding cloud technologies. The author also builds a mathematical model of the bank's end-of-day closing process and solves the problem of optimizing the time and cost of information processing for banking ISs deployed in a cloud environment, which makes it possible to determine the optimal configuration of cloud services in the bank's information architecture based on AWS services.
APA, Harvard, Vancouver, ISO, and other styles
8

Kamath, Akash S. "An efficient algorithm for caching online analytical processing objects in a distributed environment." Ohio : Ohio University, 2002. http://www.ohiolink.edu/etd/view.cgi?ohiou1174678903.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Баглай, Роман Олегович. "Інформаційна архітектура банку на основі хмарних технологій." Thesis, Національний технічний університет "Харківський політехнічний інститут", 2019. http://repository.kpi.kharkov.ua/handle/KhPI-Press/43520.

Full text
Abstract:
Dissertation for the degree of Candidate of Technical Sciences, specialty 05.13.06 – information technologies. – Kyiv National University of Trade and Economics, Kyiv, 2019. The dissertation presents a feasibility study of implementing cloud technologies to support the activities and business processes of banking institutions. The problems and advantages of cloud technologies at different levels of the bank's architectural landscape are investigated, taking into account the specifics of the regulatory environment of a financial institution. The purpose of the dissertation is to increase the efficiency of information processing within the end-of-day procedure of the core banking system by modernizing the bank's information architecture on the basis of cloud technologies. Modern approaches to managing the IT security of banking institutions are considered with the aim of minimizing threats, including those introduced by cloud technologies, and a contemporary approach to building systems with IT security mechanisms is proposed. An analysis of information security threats arising from the implementation of cloud computing is conducted to ensure the smooth and efficient operation of banking institutions, and measures to minimize these threats are proposed. The results of the study were validated in projects driven by the challenges and trends of the banking sector and by market and regulatory changes: they were applied in the project portfolio management office of JSC "Raiffeisen Bank Aval" (Kyiv) for modernizing the architecture of banking information systems and developing cloud-based software, and at "IT Innovations Ukraine" LLC (Kyiv) for classifying information security threats on the basis of qualitative risk assessment and improving the efficiency of server resource usage through cloud computing.
APA, Harvard, Vancouver, ISO, and other styles
10

Norguet, Jean-Pierre. "Semantic analysis in web usage mining." Doctoral thesis, Universite Libre de Bruxelles, 2006. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210890.

Full text
Abstract:
With the emergence of the Internet and of the World Wide Web, the Web site has become a key communication channel in organizations. To satisfy the objectives of the Web site and of its target audience, adapting the Web site content to the users' expectations has become a major concern. In this context, Web usage mining, a relatively new research area, and Web analytics, the part of Web usage mining that has gained the most ground in the corporate world, offer many Web communication analysis techniques. These techniques include prediction of the user's behaviour within the site, comparison between expected and actual Web site usage, adjustment of the Web site with respect to the users' interests, and mining and analyzing Web usage data to discover interesting metrics and usage patterns. However, Web usage mining and Web analytics suffer from significant drawbacks when it comes to supporting the decision-making process at the higher levels of the organization.

Indeed, according to organization theory, the higher levels of an organization need summarized and conceptual information to make fast, high-level, and effective decisions. For Web sites, these levels include the organization managers and the Web site chief editors. At these levels, the results produced by Web analytics tools are mostly useless, since most of them target Web designers and Web developers. Summary reports like the number of visitors and the number of page views can be of some interest to the organization manager, but these results are poor. Finally, page-group and directory hits give the Web site chief editor conceptual results, but these are limited by several problems like page synonymy (several pages contain the same topic), page polysemy (a page contains several topics), page temporality, and page volatility.

Web usage mining research projects on their part have mostly left aside Web analytics and its limitations and have focused on other research paths. Examples of these paths are usage pattern analysis, personalization, system improvement, site structure modification, marketing business intelligence, and usage characterization. A potential contribution to Web analytics can be found in research about reverse clustering analysis, a technique based on self-organizing feature maps. This technique integrates Web usage mining and Web content mining in order to rank the Web site pages according to an original popularity score. However, the algorithm is not scalable and does not answer the page-polysemy, page-synonymy, page-temporality, and page-volatility problems. As a consequence, these approaches fail at delivering summarized and conceptual results.

An interesting attempt to obtain such results has been the Information Scent algorithm, which produces a list of term vectors representing the visitors' needs. These vectors provide a semantic representation of the visitors' needs and can be easily interpreted. Unfortunately, the results suffer from term polysemy and term synonymy, are visit-centric rather than site-centric, and are not scalable to produce. Finally, according to a recent survey, no Web usage mining research project has proposed a satisfying solution to provide site-wide summarized and conceptual audience metrics.

In this dissertation, we present our solution to the need for summarized and conceptual audience metrics in Web analytics. We first describe several methods for mining the Web pages output by Web servers. These methods include content journaling, script parsing, server monitoring, network monitoring, and client-side mining. These techniques can be used alone or in combination to mine the Web pages output by any Web site. Then, the occurrences of taxonomy terms in these pages can be aggregated to provide concept-based audience metrics. To evaluate the results, we implement a prototype and run a number of test cases with real Web sites.
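To make the aggregation step concrete, here is a minimal sketch of how taxonomy-term occurrences in served pages might be rolled up into concept-based audience metrics. It is an illustration only, not the author's implementation; the taxonomy, page texts, and view counts are invented.

```python
from collections import Counter

# Hypothetical taxonomy: concept -> terms that signal it.
TAXONOMY = {
    "admissions": {"enroll", "application", "tuition"},
    "research": {"laboratory", "publication", "grant"},
}

def concept_metrics(served_pages):
    """Aggregate term occurrences in served pages into per-concept view counts.

    served_pages: iterable of (page_text, view_count) pairs, as could be
    obtained by any of the page-mining methods listed in the abstract.
    """
    metrics = Counter()
    for text, views in served_pages:
        words = set(text.lower().split())
        for concept, terms in TAXONOMY.items():
            # A viewed page mentioning a concept's terms contributes its
            # view count to that concept's audience metric.
            if words & terms:
                metrics[concept] += views
    return metrics

pages = [("Apply now: tuition and application deadlines", 420),
         ("Our laboratory wins a new grant", 97)]
print(concept_metrics(pages))  # Counter({'admissions': 420, 'research': 97})
```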

According to the first experiments with our prototype and SQL Server Analysis Services, concept-based metrics prove highly summarized and much more intuitive than page-based metrics. As a consequence, concept-based metrics can be exploited at higher levels of the organization. For example, organization managers can redefine the organization strategy according to the visitors' interests. Concept-based metrics also give an intuitive view of the messages delivered through the Web site and make it possible to adapt the Web site communication to the organization objectives. The Web site chief editor, for his part, can interpret the metrics to redefine the publishing orders and the sub-editors' writing tasks. As decisions at higher levels of the organization should be more effective, concept-based metrics should significantly contribute to Web usage mining and Web analytics.


Doctorate in applied sciences

APA, Harvard, Vancouver, ISO, and other styles
11

Malinowski, Gajda Elzbieta. "Designing conventional, spatial, and temporal data warehouses: concepts and methodological framework." Doctoral thesis, Universite Libre de Bruxelles, 2006. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210837.

Full text
Abstract:
Decision support systems are interactive, computer-based information systems that provide data and analysis tools in order to better assist managers at different levels of an organization in the process of decision making. Data warehouses (DWs) have been developed and deployed as an integral part of decision support systems.

A data warehouse is a database that stores the high volumes of historical data required for analytical purposes. This data is extracted from operational databases, transformed into a coherent whole, and loaded into a DW during the extraction-transformation-loading (ETL) process.

DW data can be dynamically manipulated using on-line analytical processing (OLAP) systems. DW and OLAP systems rely on a multidimensional model that includes measures, dimensions, and hierarchies. Measures are usually numeric, additive values used for quantitative evaluation of different aspects of an organization. Dimensions provide different analysis perspectives, while hierarchies allow measures to be analyzed at different levels of detail.
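As a deliberately simplified illustration of these three notions, the sketch below aggregates an additive measure along one level of a dimension hierarchy (a roll-up); the sales rows and the city-to-country mapping are invented.

```python
from collections import defaultdict

# Fact rows: (city, product, amount); 'amount' is the additive measure,
# 'city' and 'product' are members of the location and product dimensions.
facts = [("Brussels", "books", 120.0),
         ("Antwerp", "books", 80.0),
         ("Paris", "music", 200.0)]

# One level of the location hierarchy: city -> country.
city_to_country = {"Brussels": "Belgium", "Antwerp": "Belgium", "Paris": "France"}

def roll_up(rows, hierarchy):
    """Re-aggregate the measure from the city level to the country level."""
    totals = defaultdict(float)
    for city, product, amount in rows:
        totals[(hierarchy[city], product)] += amount
    return dict(totals)

print(roll_up(facts, city_to_country))
# {('Belgium', 'books'): 200.0, ('France', 'music'): 200.0}
```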

Nevertheless, designers as well as users currently find it difficult to specify the multidimensional elements required for analysis. One reason is the lack of conceptual models for DW and OLAP system design that would allow data requirements to be expressed on an abstract level without considering implementation details. Another problem is that many kinds of complex hierarchies arising in real-world situations are not addressed by current DW and OLAP systems.

In order to help designers build conceptual models for decision-support systems and to help users better understand the data to be analyzed, in this thesis we propose the MultiDimER model, a conceptual model for representing multidimensional data for DW and OLAP applications. Our model is mainly based on existing ER constructs, for example entity types, attributes, and relationship types with their usual semantics, allowing the common concepts of dimensions, hierarchies, and measures to be represented. It also includes a conceptual classification of the different kinds of hierarchies existing in real-world situations and proposes graphical notations for them.

On the other hand, users of DW and OLAP systems now also demand the inclusion of spatial data, whose visualization is widely recognized as revealing patterns that are difficult to discover otherwise.

However, although DWs typically include a spatial or location dimension, this dimension is usually represented in an alphanumeric format. Furthermore, there is still a lack of systematic study analyzing the inclusion and management of hierarchies and measures that are represented using spatial data.

With the aim of satisfying the growing requirements of decision-making users, we extend the MultiDimER model to allow spatial data in the different elements composing the multidimensional model. The novelty of our contribution lies in the fact that a multidimensional model is seldom used for representing spatial data. To realize our proposal, we applied research achievements in the field of spatial databases to the specific features of a multidimensional model. The spatial extension of a multidimensional model raises several issues, which we address in this thesis, such as the influence of different topological relationships between the spatial objects forming a hierarchy on the procedures required for measure aggregation, the aggregation of spatial measures, and the inclusion of spatial measures without the presence of spatial dimensions, among others.

Moreover, one of the important characteristics of multidimensional models is the presence of a time dimension for keeping track of changes in measures. However, this dimension cannot be used to model changes in other dimensions.

Therefore, usual multidimensional models are not symmetric in the way they represent changes for measures and dimensions. Further, there is still a lack of analysis indicating which of the concepts already developed for providing temporal support in conventional databases can be applied to, and be useful for, the different elements composing a multidimensional model.

In order to handle temporal changes to all elements of a multidimensional model in a similar manner, we introduce a temporal extension of the MultiDimER model. This extension is based on research in the area of temporal databases, which have been successfully used for modeling time-varying information for several decades. We propose the inclusion of different temporal types, such as valid time and transaction time, which are obtained from source systems, in addition to the DW loading time generated in DWs. We use this temporal support for a conceptual representation of time-varying dimensions, hierarchies, and measures. We also address the specific constraints that should be imposed on time-varying hierarchies and the problem of handling multiple time granularities between source systems and DWs.

Furthermore, the design of DWs is not an easy task. It requires considering all phases from requirements specification to final implementation, including the ETL process. It should also take into account that the inclusion of different data items in a DW depends on both users' needs and data availability in source systems. However, designers must currently rely on their experience, due to the lack of a methodological framework that considers the above-mentioned aspects.

In order to assist developers during the DW design process, we propose a methodology for the design of conventional, spatial, and temporal DWs. We cover the different phases of requirements specification and conceptual, logical, and physical modeling. We include three different methods for requirements specification, depending on whether users, operational data sources, or both are the driving force in the requirements-gathering process, and we show how each method leads to the creation of a conceptual multidimensional model. We also present the logical and physical design phases, which address DW structures and the ETL process.

To ensure the correctness of the proposed conceptual models, i.e., with conventional, spatial, and time-varying data, we formally define their syntax and semantics. With the aim of assessing the usability of our conceptual model, including its representation of different kinds of hierarchies and its spatial and temporal support, we present real-world examples. So that the proposed conceptual solutions can be implemented, we include their logical representations using relational and object-relational databases.


Doctorate in applied sciences

APA, Harvard, Vancouver, ISO, and other styles
12

Ahmed, Usman. "Dynamic cubing for hierarchical multidimensional data space." Phd thesis, INSA de Lyon, 2013. http://tel.archives-ouvertes.fr/tel-00876624.

Full text
Abstract:
Data warehouses have been used in many applications for quite a long time. Traditionally, new data is loaded into these warehouses through offline bulk updates, which implies that the latest data is not always available for analysis. This, however, is not acceptable in many modern applications (such as intelligent buildings, smart grids, etc.) that require the latest data for decision making. These modern applications necessitate real-time, fast, atomic integration of incoming facts into the data warehouse. Moreover, the data defining the analysis dimensions, stored in the dimension tables of these warehouses, also needs to be updated in real time in case of any change. In this thesis, such real-time data warehouses are called dynamic data warehouses. We propose a data model for these dynamic data warehouses and present the concept of the Hierarchical Hybrid Multidimensional Data Space (HHMDS), which consists of both ordered and non-ordered hierarchical dimensions. The axes of the data space are non-ordered, which helps them evolve dynamically without any need for reordering. We define a data grouping structure, called the Minimum Bounding Space (MBS), that enables efficient partitioning of data in the space. Various operators, relations, and metrics are defined for optimizing these data partitions, and the analogies between classical OLAP concepts and the HHMDS are established. We propose efficient algorithms to store summarized or detailed data, in the form of MBS, in a tree structure called the DyTree; algorithms for OLAP queries over the DyTree are also detailed. The nodes of the DyTree, holding MBS with associated aggregated measure values, represent materialized sections of cuboids, and the tree as a whole is a partially materialized and indexed data cube maintained using online atomic incremental updates. We propose a methodology to experimentally evaluate partial data-cubing techniques and develop a prototype implementing this methodology. The prototype lets us experimentally evaluate and simulate the structure and performance of the DyTree against other solutions. An extensive study conducted using this prototype shows that the DyTree is an efficient and effective partial data-cubing solution for a dynamic data warehousing environment.
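As a loose illustration of the bounding idea (our hypothetical reading, not the thesis's formal definition of the MBS), a bounding space can be pictured as recording, per non-ordered axis, the set of members needed to enclose a group of facts:

```python
def bounding_space(points):
    """Hypothetical minimum bounding space over non-ordered axes: for each
    axis, the set of members needed to enclose all grouped facts."""
    dims = len(points[0])
    return [{p[d] for p in points} for d in range(dims)]

def volume(space):
    """Number of cells the bounding space covers (product of axis extents)."""
    cells = 1
    for members in space:
        cells *= len(members)
    return cells

group = [("Paris", "books"), ("Lyon", "books"), ("Paris", "music")]
space = bounding_space(group)  # [{'Paris', 'Lyon'}, {'books', 'music'}]
print(volume(space))           # 4 cells, of which 3 hold actual facts
```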
APA, Harvard, Vancouver, ISO, and other styles
13

Khandelwal, Nileshkumar. "An aggregate navigator for data warehouse." Ohio : Ohio University, 2000. http://www.ohiolink.edu/etd/view.cgi?ohiou1172255887.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Silva, Manoela Camila Barbosa da. "Faça no seu ritmo mas não perca a hora: tomada de decisão sob demandado usuário utilizando dados da Web." Universidade Federal de São Carlos, 2017. https://repositorio.ufscar.br/handle/ufscar/9154.

Full text
Abstract:
No funding was received.
In the current knowledge age, with the continuous growth of Web data volumes and with business decisions having to be made quickly, traditional BI mechanisms become increasingly inadequate for supporting the decision-making process. In response to this scenario arises the concept of BI 2.0, a recent concept based mainly on the evolution of the Web, one of its main characteristics being the use of Web sources in decision making. However, data from the Web tend to be too volatile to be stored in the DW, making them a good option for situational data. Situational data are useful for decision-making queries at a particular time and in a particular scenario, and can be discarded after analysis. Much research has been carried out on BI 2.0, but many points remain to be explored. This work proposes a generic architecture for decision support systems that aims to integrate situational data from the Web into user queries at the right time, that is, when the user needs them for decision making. Its main contribution is a new OLAP operator, called Drill-Conformed, which enables automatic data integration using only the domain of values of the situational data. In addition, the operator contributes to the Semantic Web by making its semantics-related discoveries available. The case study is a streaming provision system. The results of the experiments are presented and discussed, showing that it is possible to integrate the data satisfactorily and with good processing times for the applied scenario.
APA, Harvard, Vancouver, ISO, and other styles
15

Bouadi, Tassadit. "Analyse multidimensionnelle interactive de résultats de simulation : aide à la décision dans le domaine de l'agroécologie." Phd thesis, Université Rennes 1, 2013. http://tel.archives-ouvertes.fr/tel-00933375.

Full text
Abstract:
In this thesis, we are interested in analyzing the simulation data produced by the agro-hydrological model TNT. The objectives were to develop methods for analyzing simulation results that put the user back at the center of the decision process, and that allow large volumes of data to be analyzed and interpreted efficiently. The approach developed relies on interactive multidimensional analysis methods. First, we propose a method for archiving simulation results in a decision-oriented database (i.e., a data warehouse) suited to the spatio-temporal character of the simulation data produced. Next, we suggest analyzing these simulation data with online analytical processing (OLAP) methods in order to give stakeholders strategic information that improves the decision-support process. Finally, we propose two skyline-extraction methods in the data warehouse context, allowing stakeholders to formulate new questions by combining contradictory environmental criteria, to find the compromise solutions matching their expectations, and then to exploit stakeholder preferences to detect and highlight the data likely to interest them. The first method, EC2Sky, enables incremental and efficient skyline computation in the presence of dynamic user preferences, despite large data volumes. The second method, HSky, extends skyline search to hierarchical dimensions, letting users navigate along the hierarchy axes (i.e., specialization/generalization) while guaranteeing online computation of the corresponding skyline points. These contributions were motivated and validated by an application to the management of agricultural practices for improving water quality in farming catchments, and we propose a coupling between the agro-hydrological data warehouse model we built and the proposed skyline-extraction methods.
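For readers unfamiliar with skyline queries, the sketch below computes a plain skyline (the Pareto-optimal points when lower is better on every criterion). It is a naive baseline, independent of EC2Sky and HSky, which add incremental computation and hierarchical dimensions; the candidate tuples are invented.

```python
def dominates(p, q):
    """p dominates q if p is at least as good on every criterion and strictly
    better on at least one (here 'better' means smaller)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Each tuple: (water-quality impact, economic cost) of a candidate practice.
candidates = [(3, 9), (5, 4), (4, 6), (6, 5), (2, 10)]
print(skyline(candidates))
# [(3, 9), (5, 4), (4, 6), (2, 10)] -- (6, 5) is dominated by (5, 4)
```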
APA, Harvard, Vancouver, ISO, and other styles
16

Kroeze, Jan Hendrik. "Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3." Pretoria : [s.n.], 2008. http://upetd.up.ac.za/thesis/available/etd-07282008-121520/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Janoška, Daniel. "Moderní technologie v OLAP." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2008. http://www.nusl.cz/ntk/nusl-235902.

Full text
Abstract:
OLAP systems are sought-after tools deployed in corporate and industrial environments. Their fundamental task is to support executive management. This work deals with OLAP analysis and its capabilities, and it also discusses the FLEX and AIR technologies, their basic principles, and their functions.
APA, Harvard, Vancouver, ISO, and other styles
18

Žižka, Petr. "Implementace nástroje pro analýzu přístupů k webové prezentaci založeného na technologii OLAP." Master's thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-1323.

Full text
Abstract:
The goal of this thesis is to create an application suitable for measuring the traffic of Web presentations, built on a standard OLAP database. In the first chapter, I gather the theoretical background from which the functional requirements for the application are derived. These requirements include a selection of the most important traffic metrics that will be available in the application and a description of the data sources used to derive them. In the second chapter, I describe the technology used, OLAP. The third chapter describes the construction of the application itself. The result of this work is the design of an application, built on open, freely available components, that can be used to collect, store, and analyze the data arising from the interaction between a Web presentation and a Web browser.
APA, Harvard, Vancouver, ISO, and other styles
19

Buyankhishig, Agiimaa. "Využití moderní self-service BI technologie v praxi." Master's thesis, Vysoká škola ekonomická v Praze, 2012. http://www.nusl.cz/ntk/nusl-197265.

Full text
Abstract:
This diploma thesis deals with the latest self-service BI technologies from Microsoft Corporation. Its main goal is to analyze Microsoft's self-service BI solutions, to describe the benefits and advantages of this technology, and to show examples with real data in Microsoft self-service BI tools. To achieve this goal, internet resources, the recommended literature, and the software applications PowerPivot and Power View (Excel 2013) are used. The first part of the thesis describes the basic characteristics and technology of classical BI solutions. The second part examines the current self-service BI solution and its usability, and then analyzes its advantages and benefits compared to conventional technologies. The last section describes self-service BI solutions and the DAX language used in PowerPivot, and shows example reports with real data from the banking sector. The key benefit of this thesis is the verification of the usability and advantages of self-service BI using Microsoft self-service BI products and tools.
APA, Harvard, Vancouver, ISO, and other styles
20

Borchert, Christoph [Verfasser], Olaf [Akademischer Betreuer] Spinczyk, and Wolfgang [Gutachter] Schröder-Preikschat. "Aspect-oriented technology for dependable operating systems / Christoph Borchert ; Gutachter: Wolfgang Schröder-Preikschat ; Betreuer: Olaf Spinczyk." Dortmund : Universitätsbibliothek Dortmund, 2017. http://d-nb.info/1133361919/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Plante, Mathieu. "Vers des cubes matriciels supportant l’analyse spatiale à la volée dans un contexte décisionnel." Thesis, Université Laval, 2014. http://www.theses.ulaval.ca/2014/30586/30586.pdf.

Full text
Abstract:
Since the advent of SOLAP, the problem of producing spatial analyses on the fly has remained open. Previous work turned to visual analysis and precomputation in order to obtain results in under 10 seconds. The integration of raster data into SOLAP cubes has unexplored potential for on-the-fly processing of spatial analyses. This research explores the advantages of, and the considerations involved in, exploiting raster cubes to produce spatial analyses on the fly in a decision-support context. It contributes to the evolution of the theoretical framework for integrating raster data into cubes, notably by adding the notion of raster coverage to the cube in order to better support raster spatial analyses. It identifies causes of the excessive resource consumption of these analyses and proposes optimization paths based on exploiting geometric raster dimensions.
APA, Harvard, Vancouver, ISO, and other styles
22

Hänßler, Olaf C. [Verfasser], Didier [Akademischer Betreuer] Théron, and Sergej [Akademischer Betreuer] Fatikow. "Multimodal sensing and imaging technology by integrated scanning electron, force, and nearfield microwave microscopy and its application to submicrometer studies / Olaf C. Hänßler ; Didier Théron, Sergej Fatikow." Oldenburg : BIS der Universität Oldenburg, 2018. http://d-nb.info/1157010199/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Mabed, Metwaly, and Thomas Köhler. "The Impact of Learning Management System Usage on Cognitive and Affective Performance." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2012. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-101320.

Full text
Abstract:
1 INTRODUCTION Since learning management systems (LMSs) offer a great variety of channels and workspaces to facilitate information sharing and communication among learners during the learning process, many educational organizations have adopted a specific LMS in their educational context. An LMS is software that handles learning tasks such as creating course catalogs, registering students, providing access to course components, tracking students within courses, recording data about students, and providing reports about usage and outcomes to teachers [1]. LMSs include applications such as OLAT, WebCT, Moodle, ATutor, Ilias, and Claroline. They can be used to integrate a wide range of multimedia materials, blogs, forums, quizzes, and wikis. The researchers therefore suggest that studying the influence of technology usage on end users, especially students, is fundamental in a learning and teaching environment. Although educational organizations routinely make decisions regarding the best pedagogical approaches for supporting students' performance, there is very little research on the impact of LMSs on learning outcomes [2]. Indeed, a considerable number of studies have examined the adoption of various LMSs, whereas little research has focused on understanding how educational institutions can enhance the learning and teaching process through a particular LMS [3]. Consistent with this, the researchers found virtually no research investigating the relationship between LMS usage and attitude toward learning. [...]
APA, Harvard, Vancouver, ISO, and other styles
25

Chun, Chang Yi, and 張尹駿. "Demand Planning Hierarchy Software System Based on OLAP Technology." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/14780328150612707306.

Full text
Abstract:
Master's degree
National Taiwan University
Institute of Industrial Engineering
ROC academic year 90 (2001/02)
Demand planning involves a great deal of complicated information, and it is difficult for planners to make decisions efficiently from these data. To satisfy the requirement of quick response, relational databases and spreadsheets are no longer sufficient. This thesis therefore applies a newer information technology, On-Line Analytical Processing (OLAP), and develops a software system to help users determine a Demand Planning Hierarchy quickly. The concept of the Demand Planning Hierarchy (DPH) was first developed by Chen [3] to improve demand planning efficiency. OLAP and a multidimensional database are used to design and implement the DPH software system. The system provides not only a greedy method but also a dynamic programming approach to search for the DPH. In addition, it supports planners in choosing a specific product combination, for which the system provides a middle-out approach to obtain the DPH. Finally, a demand data set from a semiconductor manufacturing company is used to test the developed software system.
APA, Harvard, Vancouver, ISO, and other styles
26

"Materializing views in data warehouse: an efficient approach to OLAP." 2003. http://library.cuhk.edu.hk/record=b5891626.

Full text
Abstract:
Gou Gang.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2003.
Includes bibliographical references (leaves 83-87).
Abstracts in English and Chinese.
Table of contents: Acknowledgement (p. iii).
Chapter 1, Introduction (p. 1): 1.1 Data Warehouse and OLAP (p. 4); 1.2 Computational Model: Dependent Lattice (p. 10); 1.3 Materialized View Selection (p. 12), comprising 1.3.1 Materialized View Selection under a Disk-Space Constraint (p. 13) and 1.3.2 Materialized View Selection under a Maintenance-Time Constraint (p. 16); 1.4 Main Contributions (p. 21).
Chapter 2, A* Search: View Selection under a Disk-Space Constraint (p. 24): 2.1 The Weakness of Greedy Algorithms (p. 25); 2.2 A*-algorithm (p. 29), comprising 2.2.1 An Estimation Function (p. 36), 2.2.2 Pruning Feasible Subtrees (p. 38), 2.2.3 Approaching the Optimal Solution from Two Directions (p. 41), 2.2.4 NIBS Order: Accelerating Convergence (p. 43), 2.2.5 Sliding Techniques: Eliminating Redundant H-Computation (p. 45), and 2.2.6 Examples (p. 50); 2.3 Experiment Results (p. 54), comprising 2.3.1 Analysis of Experiment Results (p. 55) and 2.3.2 Computing for a Series of S Constraints (p. 60); 2.4 Conclusions (p. 62).
Chapter 3, Randomized Search: View Selection under a Maintenance-Time Constraint (p. 64): 3.1 Non-monotonic Property (p. 65); 3.2 A Stochastic-Ranking-Based Evolutionary Algorithm (p. 67), comprising 3.2.1 A Basic Evolutionary Algorithm (p. 68), 3.2.2 The Weakness of the rg-Method (p. 69), 3.2.3 Stochastic Ranking: a Novel Constraint Handling Technique (p. 70), and 3.2.4 View Selection Using the Stochastic-Ranking-Based Evolutionary Algorithm (p. 72); 3.3 Conclusions (p. 74).
Chapter 4, Conclusions (p. 75): 4.1 Thesis Review (p. 76); 4.2 Future Work (p. 78).
Appendix A, My Publications for This Thesis (p. 81). Bibliography (p. 83).
APA, Harvard, Vancouver, ISO, and other styles
27

Li, Yu. "Integrating XML data for OLAP using XML schema and UML /." 2005.

Find full text
Abstract:
Thesis (M.Sc.)--York University, 2005. Graduate Programme in Computer Science and Engineering.
Typescript. Includes bibliographical references (leaves 113-117). Also available on the Internet; mode of access: via web browser at http://gateway.proquest.com/openurl?url%5Fver=Z39.88-2004&res%5Fdat=xri:pqdiss&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft_dat=xri:pqdiss:MR11840
APA, Harvard, Vancouver, ISO, and other styles
28

Mansmann, Svetlana [Verfasser]. "Extending the OLAP technology to handle non-conventional and complex data / vorgelegt von Svetlana Mansmann." 2009. http://d-nb.info/993406726/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Wu, Shih-Jong, and 武士戎. "The Application of OLAP and Association Rule Technology on An Agent-Based E-Commerce Infrastructure." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/44575802190546071732.

Full text
Abstract:
Master's degree
Tamkang University
Department of Computer Science and Information Engineering
ROC academic year 89 (2000/01)
With the rapid development of computer and Internet technology after the Information Revolution, e-commerce has become more and more popular, and demand for it has grown steadily in recent years, so designing and developing e-commerce platforms is very important. This virtual market generates great amounts of data, such as transaction data and customer data. The challenge is to search and collect data in this market completely, and to apply suitable data mining techniques to extract summary information and rules from the very large historical data on the e-commerce platform, from which knowledge and further commercial information can be derived. We therefore use mobile agent techniques to construct an e-commerce infrastructure. Within this infrastructure we apply data warehouse, data mining, and OLAP techniques to derive and extract knowledge and rule information, integrate all useful information for optimal deployment in the market, mine hidden patterns, and find commercially valuable information in the data warehouse on the platform. To support rapid and precise marketing decisions, we apply data mining and OLAP techniques appropriately and construct a complete e-commerce infrastructure. The major contribution of this thesis is the use of mobile agents to construct an e-commerce infrastructure: on this platform we collect data completely to build a data warehouse, and by applying OLAP and data mining techniques we extract data and integrate and manage information. We then connect ERP (Enterprise Resource Planning) and CRM (Customer Relationship Management) to e-commerce marketing to complete our system.
APA, Harvard, Vancouver, ISO, and other styles
30

Tsai, Long-Tsay, and 蔡隆財. "Using OLAP technology and satisfaction survey to analyze the management strategies of the cable TV system - take Pingtung area as an example." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/99999440603370370568.

Full text
Abstract:
Master's degree
National Pingtung University of Science and Technology
Executive MBA in-service master's program
ROC academic year 94 (2005/06)
Cable TV is deemed to play the role of a public utility: it influences people's livelihood in the same way as industries such as gas, water, and electricity, so it must be supervised by the responsible regulatory institution. Moreover, cable TV is also a local, community medium, so fee standards should be suited to local conditions. On-line analytical processing (OLAP) is a convenient tool with user-friendly interface functions; it allows users to pick and fetch multidimensional information quickly and to derive an overview when analyzing the data. In this thesis, working from the real management conditions of cable TV, we applied on-line analytical processing technology, using SQL Server 2000 Analysis Services to support OLAP functions for data transformation and processing. Satisfaction with cable TV is mainly determined by the gap between expectation and performance regarding the system operator's signal service, customer service quality, fees, and so on. We used five statistical methods (descriptives, crosstabs, the chi-square test, the t test, and logistic regression) to explain people's satisfaction with two cable TV systems and to complete the analysis of the satisfaction survey. This thesis is based on the management and administration data of the cable television system operators, analyzed and processed with OLAP technology to reflect the real state of their operations. Subsequently, combining the results of a telephone-questionnaire satisfaction survey with the statistical methods above, we analyzed the audience's satisfaction and household behavior in depth to sketch the relationship between the system operators' management and the audience's satisfaction. The quality of the cable TV system operator affects the audience's rights. We hope the integrated conclusions on the interaction between operators' management strategies in the Pingtung area and the audience's rights, analyzed using OLAP technology and the satisfaction survey, can provide operators with information for their management strategies and the government with suggestions for safeguarding the audience's rights, creating an advantageous situation for operators' management strategies, the government's administrative performance, and the audience's rights.
APA, Harvard, Vancouver, ISO, and other styles
31

Sanghi, Anupam. "HYDRA: A Dynamic Approach to Database Regeneration." Thesis, 2022. https://etd.iisc.ac.in/handle/2005/5959.

Full text
Abstract:
Database software vendors often need to generate synthetic databases for a variety of applications, including (a) Testing database engines and applications, (b) Data masking, (c) Benchmarking, (d) Creating what-if scenarios, and (e) Assessing performance impacts of planned engine upgrades. The synthetic databases are targeted toward capturing the desired schematic properties (e.g., keys, referential constraints, functional dependencies, domain constraints), as well as the statistical data profiles (e.g., value distributions, column correlations, data skew, output volumes) hosted on these schemas. Several data generation frameworks have been proposed for OLAP over the past three decades. The early efforts focused on ab initio generation based on standard mathematical distributions. Subsequently, there was a shift to database-dependent regeneration, which aims to create a database with similar statistical properties to a specific client database. However, these mechanisms could not mimic the customer query-processing environments satisfactorily. The contemporary school of thought generates workload-aware data that uses query execution plans from the customer workloads as input and guarantees volumetric similarity. That is, the intermediate row cardinalities obtained at the client and vendor sites are very similar when matching query plans are executed. This similarity helps to preserve the multi-dimensional layout and flow of the data, a prerequisite for achieving similar performance on the client’s workload. However, even in this category, the existing frameworks are hampered by limitations such as the inability to (a) provide a comprehensive algorithm to handle the queries based on core relational algebra operators, namely, Select, Project, and Join; (b) scale to big data volumes; (c) scale to large input workloads; and (d) provide high accuracy on unseen queries. In this work, motivated by the above lacunae, we present HYDRA, a data regeneration tool that materially addresses the above challenges by adding functionality, dynamism, scale, and robustness. Firstly, extended workload coverage is provided through a comprehensive solution for modeling select-project-join relational algebra operators. Specifically, the constraints are represented as a linear feasibility problem, in which each variable represents the volume of a partitioned region of the data space. Our partitioning scheme for filter constraints permits the regions to be non-convex and ensures the minimum number of regions, thereby hugely reducing the problem complexity as compared to the rectangular grid-partitioning advocated in the prior literature. Similarly, our projection subspace division and projection isolation strategies address the critical challenge of capturing unions, as opposed to summations, in incorporating projection constraints. Finally, by creating referential constraints over denormalized equivalents of the tables, Hydra delivers a comprehensive solution that also handles join constraints. Secondly, a unique feature of our data regeneration approach is that it delivers a database summary as the output rather than the static data itself. This summary is of negligible size and depends only on the query workload and not on the database scale. It can be used for dynamically generating data during query execution. Therefore, the enormous time and space overheads incurred by prior techniques in generating and storing the data before initiating analysis are eliminated. 
Our experience is that the summaries for complex Big Data client scenarios comprising over a hundred queries are constructed within just a few minutes, requiring only a few MBs of storage. Thirdly, to improve accuracy towards unseen queries, Hydra additionally exploits metadata statistics maintained by the database engine. Specifically, it adds an objective function to the linear program to pick a solution with improved inter-region tuple distribution. Further, a uniform distribution of tuples within regions is modeled to obtain a spread of values. These techniques facilitate the careful selection of a desirable database from the candidate synthetic databases, and also provide metadata compliance. The proposed ideas have been evaluated on the TPC-DS synthetic benchmark, as well as real-world benchmarks based on the Census and IMDB databases. Further, the Hydra framework has been prototyped in a Java-based tool that provides a visual and interactive demonstration of the data regeneration pipeline. The tool has been warmly received by both academic and industrial communities.
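To make the linear-feasibility formulation concrete: each variable is the tuple count of one region of the partitioned data space, and each cardinality observed in the workload becomes a linear equation over the regions it covers. The toy instance below (hypothetical regions and cardinalities, not Hydra's actual code) finds one feasible assignment with scipy.

```python
from scipy.optimize import linprog

# Toy instance: a table of 1000 rows is split into 3 regions by the
# filter predicates in the workload. Observed cardinalities:
#   |region1| + |region2| = 600             (rows passing predicate P1)
#   |region2| + |region3| = 700             (rows passing predicate P2)
#   |region1| + |region2| + |region3| = 1000  (table size)
A_eq = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]
b_eq = [600, 700, 1000]

# Any feasible point suffices; a zero objective asks for feasibility only.
res = linprog(c=[0, 0, 0], A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 3)
print(res.x)  # [300, 300, 400]: rows to generate per region
```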
APA, Harvard, Vancouver, ISO, and other styles
32

Rajkumar, S. "Enhancing Coverage and Robustness of Database Generators." Thesis, 2021. https://etd.iisc.ac.in/handle/2005/5528.

Full text
Abstract:
Generating synthetic databases that capture essential data characteristics of client databases is a common requirement for enterprise database vendors. This need stems from a variety of use-cases, such as application testing and assessing performance impacts of planned engine upgrades. A rich body of literature exists in this area, spanning from the early techniques that simply generated data ab-initio to the contemporary ones that use a predefined client query workload to guide the data generation. In the latter category, the aim specifically is to ensure volumetric similarity -- that is, assuming a common choice of query execution plans at the client and vendor sites, the output row cardinalities of individual operators in these plans are similar in the original and synthetic databases. Hydra is a recently proposed data regeneration framework that provides volumetric similarity. In addition, it also provides a mechanism to generate data dynamically during query execution, using a minuscule database summary. Notwithstanding its desirable characteristics, Hydra has the following critical limitations: (a) limited scope of SQL operators in the input query workload, (b) poor scalability with respect to the number of queries in the input workload, and (c) poor volumetric similarity on unseen queries. The data generation algorithm internally uses a linear programming (LP) solver that throttles the workload scalability. This not only puts a threshold on the training (seen) workload size but also reduces the accuracy for test (unseen) queries. Robustness towards test queries is further adversely affected by design choices such as a lack of preference among candidate synthetic databases, and artificial skew in the generated data. In this work, we present an enhanced version of Hydra, called High-Fidelity Hydra (HF-Hydra), which attempts to address the above limitations. To start with, we expand the SQL operator coverage to also include the LIKE operator, and, in certain restricted settings, projection-based operators such as GROUP BY and DISTINCT. To sidestep the challenge of workload scalability, HF-Hydra outputs not one, but a suite of database summaries such that they collectively cover the entire input workload. The division of the workload into the associated sub-workloads is governed by heuristics that aim to balance robustness with LP solvability. For generating richer database summaries, HF-Hydra additionally exploits metadata statistics maintained by the database engine. Further, the database query optimizer is leveraged to make the choice among the various candidate databases. The data generation is also augmented to provide greater diversity in the represented values. Finally, when a test query is fired, HF-Hydra directs it to the database summary that is expected to provide the highest volumetric similarity. We have experimentally evaluated HF-Hydra on a customized set of queries based on the TPC-DS decision-support benchmark framework. We first evaluated the specialized case where each training query has its own summary, and here HF-Hydra achieves perfect volumetric similarity. Further, each summary construction took just under a second and the summary sizes were just in the order of a few tens of kilobytes. Also, our dynamic generation technique produced gigabytes of data in just a few seconds. For the general setting of a limited set of summaries representing the training query workload, the data generated by HF-Hydra was compared with that from Hydra. 
We observed that HF-Hydra delivers more than forty percent better accuracy for outputs from filter nodes in the plans, while also achieving an improvement of about twenty percent with regard to join nodes. Further, the degradation in volumetric similarity is minor as compared to the one-summary scenario, while the summary production is significantly more efficient due to reduced overheads on the LP solver. In summary, HF-Hydra takes a substantive step forward with regard to creating expressive, robust, and scalable data regeneration frameworks with immediate relevance to testing deployments.
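Volumetric similarity, the property both of the above frameworks target, compares operator output cardinalities between matching query plans on the original and synthetic databases. One natural way to score it is mean relative error per plan node; the sketch below is an assumed formulation for illustration, not the exact metric used in these theses.

```python
def volumetric_error(original: list[int], synthetic: list[int]) -> float:
    """Mean relative error between per-operator output cardinalities of
    matching query plans run on the original and synthetic databases.
    Lower is better; 0.0 means perfect volumetric similarity."""
    assert len(original) == len(synthetic)
    errors = [abs(o - s) / max(o, 1) for o, s in zip(original, synthetic)]
    return sum(errors) / len(errors)

# Cardinalities at the filter/join nodes of one plan (hypothetical).
print(volumetric_error([1000, 600, 250], [1000, 580, 260]))  # ~0.024
```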
APA, Harvard, Vancouver, ISO, and other styles
33

Banda, Misheck. "A data management and analytic model for business intelligence applications." Diss., 2017. http://hdl.handle.net/10500/23129.

Full text
Abstract:
Most organisations use several data management and business intelligence solutions, on-premise and/or cloud-based, to manage and analyse their constantly growing business data. Challenges faced by organisations nowadays include, but are not limited to, growth limitations, big data, and inadequate analytics, computing, and data storage capabilities. Although these organisations are able to generate reports and dashboards for decision-making in most cases, effective use of their business data with an appropriate business intelligence solution could achieve and sustain informed decision-making and allow competitive reaction to the dynamic external environment. A data management and analytic model is proposed on which organisations could rely for decisive guidance when planning to procure and implement a unified business intelligence solution. To arrive at a sound model, the literature was reviewed by extensively studying business intelligence in general and by exploring and developing various deployment models and architectures (naïve, on-premise, and cloud-based), which revealed their benefits and challenges. The outcome of the literature review was the development of a hybrid business intelligence model and accompanying architecture as the main contribution of the study. In order to assess the state of business intelligence utilisation, and to validate and improve the proposed architecture, two case studies targeting users and experts were conducted using quantitative and qualitative approaches. The case studies established that a decision to procure and implement a successful business intelligence solution rests on a number of crucial elements, such as applications, devices, tools, business intelligence services, data management, and infrastructure. The findings further indicated that the proposed hybrid architecture is a suitable solution for managing complex organisations with serious data challenges.
Computing
M. Sc. (Computing)
APA, Harvard, Vancouver, ISO, and other styles
34

"Short, Medium and Long Term Effects of an Online Learning Activity Based (OLAB) Curriculum on Middle School Students’ Achievement in Mathematics: A Quasi-Experimental Quantitative Study." Doctoral diss., 2016. http://hdl.handle.net/2286/R.I.40291.

Full text
Abstract:
Public mathematics education is not at its best in the United States, and technology is often seen as part of the solution. With high-speed Internet, mobile technologies, ever-improving computer programming and graphing, and the concepts of learning management systems (LMSs) and online learning environments (OLEs), technology-based learning has risen to a whole new level. The new generation of online learning enables multi-modal use and interactivity with instant feedback, among the other valuable characteristics identified in this study. Studies that evaluated the effects of online learning have often measured the immediate impact on student achievement; very few have investigated the longer-term effects in addition to the short-term ones. In this study, the effects of the new-generation Online Learning Activity Based (OLAB) Curriculum on middle school students' achievement in mathematics on the statewide high-stakes testing system were examined. The results indicated that the treatment group performed better than the control group in the short term (immediately after the intervention), medium term (one year after the intervention), and long term (two years after the intervention), and that the results were statistically significant in the short and long terms. Within the context of this study, the researcher also examined factors affecting student achievement while the OLE was used as a supplemental resource, namely the time and frequency of usage, professional development of the facilitators, modes of instruction, and fidelity of implementation. While the researcher detected positive correlations between all of these variables and student achievement, he observed that school culture was a major feature behind the difference attributed to the treatment group teachers. Among the treatment group teachers, those who spent more time on professional development used the OLE with greater fidelity and attained greater gains in student achievement, and, interestingly, they came from the same schools. This verified the importance of school culture in teachers' attitudes toward making the most of the resources made available to them so as to achieve better results in terms of student success in high-stakes tests.
Dissertation/Thesis
Doctoral Dissertation Curriculum and Instruction 2016
APA, Harvard, Vancouver, ISO, and other styles