
Journal articles on the topic 'SQL query performance tuning'



Consult the top 50 journal articles for your research on the topic 'SQL query performance tuning.'


You can also download the full text of each academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles across a wide variety of disciplines and organise your bibliography correctly.

1

Zhao, De Yu. "Research on Improving Oracle Query Performance in MES." Applied Mechanics and Materials 201-202 (October 2012): 39–42. http://dx.doi.org/10.4028/www.scientific.net/amm.201-202.39.

Abstract:
Tuning an Oracle database system is vital to the normal running of the whole system, but it is complicated work. SQL statement tuning is a very critical aspect of database performance tuning. It is an inherently complex activity requiring a high level of expertise in several domains: query optimization, to improve the execution plan selected by the query optimizer; access design, to identify missing access structures; and SQL design, to restructure and simplify the text of a badly written SQL statement. In this paper, the author analyzes the execution procedure of the Oracle optimizer and researches how to improve Oracle database query performance in MES.
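To make the plan analysis concrete, here is a minimal Oracle sketch of the kind of optimizer inspection the paper describes; the table and predicate are hypothetical stand-ins, while EXPLAIN PLAN and DBMS_XPLAN are standard Oracle facilities:

    EXPLAIN PLAN FOR
      SELECT * FROM mes_work_order WHERE order_status = 'OPEN';  -- hypothetical MES table

    -- Display the plan the optimizer selected: access path, join order, and cost
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);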
2

Devi, Chiranjeevi, Pradeep Manivannan, and Radhakrishnan Pachyappan. "Differentially Private Canary Optimization via Thompson Sampling for SQL Performance Fixes." Journal of Artificial Intelligence General Science (JAIGS) 3, no. 1 (2024): 532–50. https://doi.org/10.60087/jaigs.v3i1.383.

Abstract:
SQL performance tuning often involves deploying multiple query fixes in live systems to identify the most effective solution, a process that can risk data exposure and degrade service quality. This paper proposes a novel framework that integrates differential privacy with Thompson Sampling to optimize "canary deployments" of SQL fixes. Our method ensures that experimental testing across user groups maintains statistical efficiency while preserving user data privacy. By leveraging Bayesian exploration strategies, our approach identifies high-performing query modifications under strict privacy constraints, minimizing performance regression and exposure. Experimental evaluations on benchmark SQL workloads demonstrate that our approach achieves near-optimal performance improvements with provable differential privacy guarantees, offering a robust solution for safely and effectively deploying performance fixes in sensitive environments.
3

Souza, Ana Paula dos Santos, Bruno Fidelis Campos, Carla Glênia Guedes Dias, et al. "Tuning em Banco de Dados / Database Tuning." Cadernos UniFOA 4, no. 10 (2017): 19. http://dx.doi.org/10.47385/cadunifoa.v4i10.965.

Abstract:
Due to the large volume of data generated by companies that use information systems, the role of the database (DB) is fundamental. Data generally need to be accessed at all times, yet the availability of results is not always satisfactory. In this context arises the question of performance when obtaining information from a DB, and of how to optimize it. Many performance problems are not related to infrastructure, operating systems, or even hardware; performance losses can be found within the DB itself, with the query being the main cause of these problems. Tuning and optimizing a query, and the DB itself, therefore become important factors that can yield an acceptable performance gain, since each query is treated differently depending on the Database Management System (DBMS). This article evaluates how to improve the performance of Transact-Structured Query Language (T-SQL) queries in a Microsoft SQL Server 2005 environment, suggesting possible changes that can lead to a considerable performance gain.
4

Souza, Ana Paula dos Santos, Bruno Fidelis Campos, Carla Glênia Guedes Dias, et al. "Tuning em Banco de Dados / Database Tuning." Cadernos UniFOA 4, no. 10 (2017): 19–25. http://dx.doi.org/10.47385/cadunifoa.v4.n10.965.

Abstract:
Due to the large volume of data generated by companies that use information systems, the role of the database (DB) is fundamental. Data generally need to be accessed at all times, yet the availability of results is not always satisfactory. In this context arises the question of performance when obtaining information from a DB, and of how to optimize it. Many performance problems are not related to infrastructure, operating systems, or even hardware; performance losses can be found within the DB itself, with the query being the main cause of these problems. Tuning and optimizing a query, and the DB itself, therefore become important factors that can yield an acceptable performance gain, since each query is treated differently depending on the Database Management System (DBMS). This article evaluates how to improve the performance of Transact-Structured Query Language (T-SQL) queries in a Microsoft SQL Server 2005 environment, suggesting possible changes that can lead to a considerable performance gain.
5

Memon, Muhammad Qasim, Jingsha He, Aasma Memon, Khurram Gulzar Rana, and Muhammad Salman Pathan. "Query Processing for Time Efficient Data Retrieval." Indonesian Journal of Electrical Engineering and Computer Science 9, no. 3 (2018): 784–88. https://doi.org/10.11591/ijeecs.v9.i3.pp784-788.

Abstract:
In database management systems (DBMS), retrieving data through Structured Query Language is an essential aspect of finding a better execution plan for performance. In this paper, we incorporate database objects to optimize query execution time and cost by eliminating poorly performing SQL statements. We propose a method of evolving and inserting database constraints as database objects embedded with queries, either adding them for the sake of the transactions required by the user or using them to detect queries that can be improved. We analyzed several databases while processing queries and assimilated a real-time database workload in which batches of transactions are invoked, in comparison with tuning approaches. These database objects are coded in a procedural-language environment with rules that make them worthwhile, and they are merged into queries to offer an improved execution plan.
6

Azra Jabeen, Mohamed Ali. "SQL Server Optimization: Best Practices for Maximizing Performance." International Journal of Innovative Research in Engineering & Multidisciplinary Physical Sciences 8, no. 4 (2020): 1–10. https://doi.org/10.5281/zenodo.14535769.

Abstract:
This paper explores the best practices for SQL Server optimization, offering a comprehensive guide to enhance the performance of database systems. In the data-driven world of today, sustaining high efficiency and responsiveness requires that SQL Server databases operate at their best. By addressing key aspects such as query tuning, indexing strategies, and resource management, it presents effective techniques to minimize latency and improve execution speed. It also highlights the importance of proper configuration, efficient use of memory, and effective database maintenance practices. Through these best practices, database administrators and developers can ensure that SQL Server operates at peak performance, supporting faster queries, reduced downtime, and seamless scalability. This paper serves as an invaluable resource for anyone seeking to optimize their SQL Server environment, ensuring better performance and reliability in real-world applications.
7

Öztürk, Emir. "Improving Text-to-SQL Conversion for Low-Resource Languages Using Large Language Models." Bitlis Eren Üniversitesi Fen Bilimleri Dergisi 14, no. 1 (2025): 163–78. https://doi.org/10.17798/bitlisfen.1561298.

Abstract:
Accurate text-to-SQL conversion remains a challenge, particularly for low-resource languages like Turkish. This study explores the effectiveness of large language models (LLMs) in translating Turkish natural language queries into SQL, introducing a two-stage fine-tuning approach to enhance performance. Three widely used LLMs (Llama2, Llama3, and Phi3) are fine-tuned under two different training strategies: direct SQL fine-tuning, and sequential fine-tuning in which models are first trained on Turkish instruction data before SQL fine-tuning. A total of six model configurations are evaluated using execution accuracy and logical form accuracy. The results indicate that Phi3 models outperform both Llama-based models and previously reported methods, achieving execution accuracy of up to 99.95% and logical form accuracy of 99.95%, exceeding the best scores in the literature by 5–10%. The study highlights the effectiveness of instruction-based fine-tuning in improving SQL query generation. It provides a detailed comparison of Llama-based and Phi-based models in text-to-SQL tasks, introduces a structured fine-tuning methodology designed for low-resource languages, and presents empirical evidence demonstrating the positive impact of strategic data augmentation on model performance. These findings contribute to the advancement of natural language interfaces for databases, particularly in languages with limited NLP resources. The scripts and models used during the training and testing phases of the study are publicly available at https://github.com/emirozturk/TT2SQL.
8

Martani, Marlene, Hanny Juwitasary, and Arya Nata Gani Putra. "Analisis Alat Bantu Tuning Fisikal Basis Data pada SQL Server 2008." ComTech: Computer, Mathematics and Engineering Applications 5, no. 1 (2014): 334. http://dx.doi.org/10.21512/comtech.v5i1.2628.

Abstract:
Nowadays every company is faced with business competition that requires it to survive and be superior to its competitors. One strategy used by many companies is to use information technology to run their business processes. The use of information technology requires storage, commonly referred to as a database, to store and process data into useful information for the company. However, the greater the amount of data in the database, the more the speed of the resulting process decreases, because the time needed to access the data becomes much longer. Slow data processing can cause a decrease in the company's performance and lengthen the time needed to make decisions, which can be an obstacle to achieving the company's competitive advantage. This study analyzes techniques to improve the performance of the database system used by the company by physically tuning the SQL Server 2008 database. The purpose of this study is to improve the performance of the database by speeding up the time needed for query processing. The research methodology comprises literature studies, analysis of the process and workings of the tuning tools that already exist in SQL Server 2008, evaluation of the application that was created, and tuning methods that include query optimization and index creation. The result of this study is an evaluation of a physical tuning tool application that integrates the functionality of other database tuning tools such as SQL Profiler and the Database Tuning Advisor.
9

Vishnupriya, S. Devarajulu. "Key Solutions to Optimize Database SQL Queries." Journal of Scientific and Engineering Research 6, no. 12 (2019): 311–14. https://doi.org/10.5281/zenodo.13753398.

Abstract:
Optimizing SQL Queries is crucial for enhancing the performance and efficiency of database-driven applications. This article explores key solutions to the performance issues in SQL queries, with code samples and detailed explanations. Best practices such as using indexes, avoiding unnecessary columns in SELECT statements, using schema names with object names, and optimizing joins and subqueries and other solutions are discussed. By following these optimization techniques, developers can provide more efficient database with improved application performance.
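As a hedged illustration of the practices this abstract names (column pruning, schema-qualified object names, index support for joins and filters), here is a small T-SQL sketch; the Orders table and its columns are hypothetical:

    -- Before: SELECT * returns unneeded columns and defeats covering indexes
    SELECT * FROM Orders WHERE CustomerId = 42;

    -- After: name only the required columns and schema-qualify the object
    SELECT OrderId, OrderDate, TotalAmount
    FROM dbo.Orders
    WHERE CustomerId = 42;

    -- A covering index lets the rewritten query be answered from the index alone
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
        ON dbo.Orders (CustomerId)
        INCLUDE (OrderDate, TotalAmount);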
10

Peng, Yuchen, Ke Chen, Lidan Shou, Dawei Jiang, and Gang Chen. "AQUA: Automatic Collaborative Query Processing in Analytical Database." Proceedings of the VLDB Endowment 16, no. 12 (2023): 4006–9. http://dx.doi.org/10.14778/3611540.3611607.

Abstract:
Data analysts nowadays are keen to have analytical capabilities involving deep learning (DL). Collaborative queries, which employ relational operations to process structured data and DL models to process unstructured data, provide a powerful facility for DL-based in-database analysis. The classical approach to support collaborative queries in relational databases is to integrate DL models with user-defined functions (UDFs) in a general-purpose language (e.g., C++) to process unstructured data. This approach suffers from suboptimal performance as the opaque UDFs preclude the generation of an optimal query plan. A recent work, DL2SQL, addresses the problem of collaborative query optimization by first converting DL computations into SQL subqueries and then using a classical relational query optimizer to optimize the entire collaborative query. However, the DL2SQL approach compromises usability by requiring data analysts to manually manage DL-related data and tune query performance. To this end, this paper introduces AQUA, an analytical database designed for efficient collaborative query processing. Built on DL2SQL, AQUA automates translations from collaborative queries into SQL queries. To enhance usability, AQUA introduces two techniques: 1) a declarative scheme for DL-related data management, and 2) DL-specific optimizations for collaborative query processing, eliminating the burden of manual data management and performance tuning from the data analysts. We demonstrate the key contributions of AQUA via a web APP that allows the audience to perform collaborative queries on the CIFAR-10 dataset.
11

Siddiqui, Tarique, and Wentao Wu. "ML-Powered Index Tuning: An Overview of Recent Progress and Open Challenges." ACM SIGMOD Record 52, no. 4 (2023): 19–30. http://dx.doi.org/10.1145/3641832.3641836.

Abstract:
The increasing scale and complexity of workloads in modern cloud services highlight a crucial challenge in automated index tuning: recommending high-quality indexes while ensuring scalability. This is further complicated by the need for these automated solutions to minimize query performance regressions in production deployments. This paper directs attention to some of these challenges in automated index tuning and explores ways in which machine learning (ML) techniques provide new opportunities in their mitigation. In particular, we reflect on our recent efforts in developing ML techniques for workload selection, candidate index filtering, speeding up index configuration search, reducing the amount of query optimizer calls, and lowering the chances of performance regressions. We highlight the key takeaways from these efforts and underline the gaps that need to be closed for their effective functioning within the traditional index tuning framework. Additionally, we present a preliminary cross-platform design aimed at democratizing index tuning across multiple SQL-like systems, an imperative in today's continuously expanding data system landscape. We believe our findings will help provide context and impetus to the research and development efforts in automated index tuning.
12

Colley, Derek, Clare Stanier, and Md Asaduzzaman. "Investigating the Effects of Object-Relational Impedance Mismatch on the Efficiency of Object-Relational Mapping Frameworks." Journal of Database Management 31, no. 4 (2020): 1–23. http://dx.doi.org/10.4018/jdm.2020100101.

Abstract:
The object-relational impedance mismatch (ORIM) problem characterises differences between the object-oriented and relational approaches to data access. Queries generated by object-relational mapping (ORM) frameworks are designed to overcome ORIM difficulties and can cause performance concerns in environments which use object-oriented paradigms. The aim of this paper is twofold, first presenting a survey of database practitioners on the effectiveness of ORM tools followed by an experimental investigation into the extent of operational concerns through the comparison of ORM-generated query performance and SQL query performance with a benchmark data set. The results show there are perceived difficulties in tuning ORM tools and distrust around their effectiveness. Through experimental testing, these views are validated by demonstrating that ORMs exhibit performance issues to the detriment of the query and the overall scalability of the ORM-led approach. Future work on establishing a system to support the query optimiser when parsing and preparing ORM-generated queries is outlined.
13

Memon, Muhammad Qasim, Jingsha He, Aasma Memon, Khurram Gulzar Rana, and Muhammad Salman Pathan. "Query Processing for Time Efficient Data Retrieval." Indonesian Journal of Electrical Engineering and Computer Science 9, no. 3 (2018): 784. http://dx.doi.org/10.11591/ijeecs.v9.i3.pp784-788.

Abstract:
In database management systems (DBMS), retrieving data through Structured Query Language is an essential aspect of finding a better execution plan for performance. In this paper, we incorporate database objects to optimize query execution time and cost by eliminating poorly performing SQL statements. We propose a method of evolving and inserting database constraints as database objects embedded with queries, either adding them for the sake of the transactions required by the user or using them to detect queries that can be improved. We analyzed several databases while processing queries and assimilated a real-time database workload in which batches of transactions are invoked, in comparison with tuning approaches. These database objects are coded in a procedural-language environment with rules that make them worthwhile, and they are merged into queries to offer an improved execution plan.
14

Santosh, Vinnakota. "Streamlining Legacy Migrations: A Comparative Analysis of Teradata to Snowflake Transformation." International Journal of Innovative Research in Engineering & Multidisciplinary Physical Sciences 8, no. 4 (2020): 1–7. https://doi.org/10.5281/zenodo.15054566.

Abstract:
The increasing need for scalable, cost-effective, and cloud-native data solutions has propelled organizations to migrate from legacy systems like Teradata to modern platforms such as Snowflake. This paper delves into comprehensive strategies, explores technical challenges, and presents robust solutions for migrating Teradata workloads to Snowflake. It emphasizes advanced techniques for schema conversion, SQL translation, query performance tuning, and cost optimization. Detailed workflows, in-depth technical comparisons, and actionable insights are accompanied by diagrams, flowcharts, and examples to guide data engineers and architects through successful migrations.
15

Koripalli, Madhuri. "Intelligent assistants for data professionals: Copilots and agents." World Journal of Advanced Engineering Technology and Sciences 15, no. 2 (2025): 15–21. https://doi.org/10.30574/wjaets.2025.15.2.0470.

Abstract:
Intelligent assistants, including AI-driven copilots and specialized extensions, transform how data professionals interact with database environments. These tools leverage advanced language models and contextual understanding to automate routine tasks while providing sophisticated recommendations for query optimization, schema design, and performance tuning. By integrating with platforms like SQL Server Management Studio and Azure Data Studio, these assistants offer capabilities ranging from natural language query translation to predictive code completion and error prevention. Success stories across financial services, healthcare, and retail demonstrate their potential to accelerate development cycles, improve code quality, and democratize data access. However, implementation requires careful consideration of adoption frameworks, governance policies, and technical prerequisites. These systems face challenges despite their value, including performance limitations with complex queries, organizational resistance, potential skill erosion, and privacy concerns. The evolution of these intelligent companions represents a significant shift from passive tools to active collaborators in data management.
16

Nida, Bhanu Raju. "SAP Core Data Services (CDS) Views: A Modern Approach to Data Modeling and Performance Optimization in SAP Ecosystem." International Scientific Journal of Engineering and Management 04, no. 03 (2025): 1–7. https://doi.org/10.55041/isjem02351.

Abstract:
SAP Core Data Services (CDS) Views represent a new approach to data modeling in large enterprises. They can outperform regular SQL views by pushing code down to the SAP HANA in-memory database. However, due to their unique syntax and query performance tuning, developers must learn ABAP CDS, CDS annotations, and how to leverage SAP HANA in-memory features. If a CDS view is not properly designed, performance can degrade significantly. Code pushdown, while beneficial, can become a drawback if developers do not follow best practices when creating CDS views. Additionally, debugging CDS views presents challenges. Unlike traditional ABAP programs, ABAP Debugger breakpoints cannot be used, requiring SAP ABAP developers to learn tools like the SAP HANA SQL Analyzer and other performance trace utilities. Another consideration is that CDS views are not fully backward compatible with legacy SAP ERP (ECC) systems. To unlock their full potential, including features like CDS table functions, companies need SAP HANA and SAP S/4HANA. Therefore, organizations must weigh the disadvantages before implementing CDS views. They should carefully assess the cost and effort required to move processing to the database level while ensuring performance gains.
Keywords: SAP Core Data Services, CDS Views, HANA, ABAP, Performance Optimization, Analytics
17

Sethu, Sesha Synam Neeli. "Key Challenges and Strategies in Managing Databases for Data Science and Machine Learning." International Journal of Leading Research Publication 2, no. 3 (2021): 1–9. https://doi.org/10.5281/zenodo.15360136.

Abstract:
The convergence of data science and machine learning (ML) methodologies with enterprise-level data management systems necessitates a paradigm shift in database administration (DBA) practices. This integration presents significant hurdles, including the need for high-throughput data storage solutions (e.g., distributed NoSQL databases, columnar databases), real-time data streaming architectures (e.g., Apache Kafka, Apache Flink), robust data governance frameworks to ensure data quality and compliance (e.g., implementing data lineage tracking, metadata management), efficient management of heterogeneous data sources via ETL/ELT processes, and optimization strategies to mitigate the performance impact of ML model deployment and inference (e.g., model caching, query optimization techniques). Addressing these challenges requires a multi-faceted approach. This includes leveraging scalable database architectures (e.g., sharding, replication), implementing automated data manipulation and transformation processes (e.g., scripting with Python, leveraging cloud-based ETL services), and enforcing stringent security protocols using encryption, access control lists (ACLs), and intrusion detection systems. Furthermore, continuous professional development is crucial, encompassing expertise in areas such as AI-driven database auto-tuning, cloud-native database services (e.g., AWS RDS, Azure SQL Database, Google Cloud SQL), and containerization technologies (e.g., Docker, Kubernetes) for deploying and scaling ML workflows. By adopting these best practices, DBAs can ensure the efficiency, reliability, and scalability of data infrastructures essential for successful data science and ML initiatives.
18

Pandis, Ippokratis. "The evolution of Amazon redshift." Proceedings of the VLDB Endowment 14, no. 12 (2021): 3162–74. http://dx.doi.org/10.14778/3476311.3476391.

Abstract:
In 2013, Amazon Web Services revolutionized the data warehousing industry by launching Amazon Redshift [7], the first fully managed, petabyte-scale enterprise-grade cloud data warehouse. Amazon Redshift made it simple and cost-effective to efficiently analyze large volumes of data using existing business intelligence tools. This launch was a significant leap from the traditional on-premise data warehousing solutions, which were expensive, not elastic, and required significant expertise to tune and operate. Customers embraced Amazon Redshift and it became the fastest growing service in AWS. Today, tens of thousands of customers use Amazon Redshift in AWS's global infrastructure of 25 launched Regions and 81 Availability Zones (AZs), to process exabytes of data daily. The success of Amazon Redshift inspired a lot of innovation in the analytics segment, e.g. [1, 2, 4, 10], which in turn has benefited customers. In the last few years, the use cases for Amazon Redshift have evolved and in response, Amazon Redshift continues to deliver a series of innovations that delight customers. In this paper, we give an overview of Amazon Redshift's system architecture. Amazon Redshift is a columnar MPP data warehouse [7]. As shown in Figure 1, an Amazon Redshift compute cluster consists of a coordinator node, called the leader node , and multiple compute nodes . Data is stored on Redshift Managed Storage , backed by Amazon S3, and cached in compute nodes on locally-attached SSDs in compressed columnar fashion. Tables are either replicated on every compute node or partitioned into multiple buckets that are distributed among all compute nodes. AQUA is a query acceleration layer that leverages FPGAs to improve performance. CaaS is a caching microservice of optimized generated code for the various query fragments executed in the Amazon Redshift fleet. The innovation at Amazon Redshift continues at accelerated pace. Its development is centered around four streams. First, Amazon Redshift strives to provide industry-leading data warehousing performance. Amazon Redshift's query execution blends database operators in each query fragment via code generation. It combines prefetching and vectorized execution with code generation to achieve maximum efficiency. This allows Amazon Redshift to scale linearly when processing from a few terabytes to petabytes of data. Figure 2 depicts the total execution time of the Cloud Data Warehouse Benchmark Derived from TPC-DS 2.13 [6] while scaling dataset size and hardware simultaneously. Amazon Redshift's performance remains nearly flat for a given ratio of data to hardware, as data volume increases from 30TB to 1PB. This linear scaling to the petabyte scale makes it easy, predictable and cost-efficient for customers to on-board new datasets and workloads. Second, customers needed to process more data and wanted to support an increasing number of concurrent users or independent compute clusters that are operating over the Redshift-managed data and the data in Amazon S3. We present Redshift Managed Storage, Redshift's high-performance transactional storage layer, which is disaggregated from the Redshift compute layer and allows a single database to grow to tens of petabytes. We also describe Redshift's compute scaling capabilities. In particular, we present how Redshift can scale up by elastically resizing the size of each cluster, and how Redshift can scale out and increase its throughput via multi-cluster autoscaling, called Concurrency Scaling. 
With Concurrency Scaling, customers can have thousands of concurrent users executing queries on the same Amazon Redshift endpoint. We also talk about data sharing, which allows users to have multiple isolated compute clusters consume the same datasets in Redshift Managed Storage. Elastic resizing, concurrency scaling and data sharing can be combined giving multiple compute scaling options to the Amazon Redshift customers. Third, as Amazon Redshift became the most widely used cloud data warehouse, its users wanted it to be even easier to use. For that, Redshift introduced ML-based autonomics. We present how Redshift automated among others workload management, physical tuning, the refresh of materialized views (MVs), along with automated MVs-based optimization that rewrites queries to use MVs. We also present how we leverage ML to improve the operational health of the service and deal with gray failures [8]. Finally, as AWS offers a wide range of purpose-built services, Amazon Redshift provides seamless integration with the AWS ecosystem and novel abilities in ingesting and ELTing semistructured data (e.g., JSON) using the PartiQL extension of SQL [9]. AWS purpose-built services include the Amazon S3 object storage, transactional databases (e.g., DynamoDB [5] and Aurora [11]) and the ML services of Amazon Sagemaker. We present how AWS and Redshift make it easy for their customers to use the best service for each job and seamlessly take advantage of Redshift's best of class analytics capabilities. For example, we talk about Redshift Spectrum [3] that allows Redshift to query data in open-file formats in Amazon S3. We present how Redshift facilitates both the in-place querying of data in OLTP services, using Redshift's Federated Querying, as well as the copy of data to Redshift, using Glue Elastic Views. We also present how Redshift can leverage the capabilities of Amazon Sagemaker through SQL and without data movement.
19

Raihan Siddik, Muhammad, Mhd Arief Hasan, Andika Fajar Kesuma, Nurmala Sari, Shania Dwi Putri, and Qurrotul Uyun Harahap. "IMPLEMENTASI QUERY TUNING UNTUK PENINGKATAN PERFORMA PADA DATABASE BARANG MINI MARKET NAN." JATI (Jurnal Mahasiswa Teknik Informatika) 9, no. 2 (2025): 3183–87. https://doi.org/10.36040/jati.v9i2.13217.

Abstract:
Query tuning is a database performance optimization step in SQL Server. It aims to improve the efficiency of query execution by minimizing the use of resources such as processing time and memory consumption. In practice, query tuning involves analysis of query plans and indexes, as well as the use of techniques such as updating statistics, restructuring queries, and proper index management. In addition, built-in SQL Server features such as the Database Engine Tuning Advisor and the Query Store provide practical guidance in identifying performance bottlenecks. By applying query tuning effectively, the performance of database-based applications can be improved significantly, ensuring fast and reliable data access. This study aims to explore the main methods of query tuning and their impact on the performance of a SQL Server database system. Applying query tuning in this study showed a significant increase in query execution efficiency: optimization of the goods table reduced execution time from 229 ms to 162 ms (29.26%), while a complex query with an additional index dropped from 223 ms to 140 ms (37.22%). Optimization strategies such as identifying slow queries, applying clustered and non-clustered indexes, and query refactoring had a positive impact on system performance, reducing execution time as well as CPU and memory usage.
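A minimal T-SQL sketch of the workflow the study describes (finding slow statements via the Query Store, then indexing and refreshing statistics); the goods table dbo.Barang and its column are hypothetical stand-ins, while the Query Store catalog views are standard SQL Server objects:

    -- Identify the slowest statements recorded by the Query Store
    SELECT TOP 10 qt.query_sql_text, rs.avg_duration
    FROM sys.query_store_query_text AS qt
    JOIN sys.query_store_query AS q  ON q.query_text_id = qt.query_text_id
    JOIN sys.query_store_plan AS p   ON p.query_id = q.query_id
    JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
    ORDER BY rs.avg_duration DESC;

    -- Support the offending predicate with a non-clustered index and fresh statistics
    CREATE NONCLUSTERED INDEX IX_Barang_NamaBarang ON dbo.Barang (NamaBarang);
    UPDATE STATISTICS dbo.Barang;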
20

Yerra, Srikanth. "Reducing ETL processing time with SSIS optimizations for large-scale data pipelines." International Journal of Data Science and Machine Learning 5, no. 1 (2025): 61–69. https://doi.org/10.55640/ijdsml-05-01-12.

Abstract:
Extract, Transform, Load (ETL) processes form the backbone of data management and consolidation in today's data-driven enterprises with prevalent large-scale data pipelines. One of the most widely used ETL tools is Microsoft SQL Server Integration Services (SSIS), yet optimizing its performance for large-scale data loads remains a challenge. As data volumes grow exponentially, inefficient ETL processes create bottlenecks, increased processing time, and exhaustion of system resources. This work discusses major SSIS optimizations that minimize ETL processing time, allowing for effective and scalable data integration. One of the key areas of optimization is data flow optimization, such as leveraging the Fast Load mode in the OLE DB Destination to perform batch inserts instead of row-by-row inserts. Similarly, Bulk Insert operations can significantly reduce data movement time. Additionally, tuning the buffer size and DefaultBufferMaxRows allows SSIS to process data in memory more efficiently, thereby minimizing disk I/O operations. Another major area of focus is source query optimization. By utilizing indexed views, partitioned tables, and filtering in the WHERE clause, unnecessary data extraction is avoided, restricting the load on the source system. NOLOCK hints also minimize database contention in high-concurrency environments. Parallel execution of multiple operations within SSIS can also accelerate execution, with multithreading and batch processing enabling concurrent data conversion. Lookup transformations, a common performance bottleneck, can be optimized using cache mode, where reference data is pre-loaded instead of querying the database for each row. Furthermore, replacing row-based transformations with set-based operations significantly reduces processing overhead. For incremental data loading, change tracking or CDC (Change Data Capture) enables processing only altered records in place of full loads. This saves processing time and optimizes resource utilization. ETL logging and error-handling mechanisms play an important role as well; selective SSIS logging and event-based error handling can prevent performance degradation due to over-logging. Lastly, SSIS package configurations can be tuned by properly indexing destination tables, turning off unnecessary constraints during loading, and applying table partitioning to maximize parallel data loads. By utilizing these SSIS optimizations, organizations can greatly reduce ETL processing time, optimize data pipelines, and enhance overall enterprise-level data integration performance. These approaches give large-scale data pipelines very low latency, making SSIS a more efficient and scalable solution for enterprise-level data workflows.
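For the incremental-loading idea mentioned above, a hedged T-SQL sketch of enabling Change Data Capture and reading only changed rows follows; dbo.Sales is a hypothetical source table, while the CDC procedures and functions are standard SQL Server features:

    -- Enable Change Data Capture on the database and the source table
    EXEC sys.sp_cdc_enable_db;
    EXEC sys.sp_cdc_enable_table
         @source_schema = N'dbo',
         @source_name   = N'Sales',   -- hypothetical source table
         @role_name     = NULL;

    -- SSIS source query: fetch only rows changed within the captured LSN range
    DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Sales');
    DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();
    SELECT *
    FROM cdc.fn_cdc_get_all_changes_dbo_Sales(@from_lsn, @to_lsn, N'all');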
21

Nuriev, Marat, Rimma Zaripova, Andrey Potapov, and Maxim Kuznetsov. "Achieving new SQL query performance levels through parallel execution in SQL Server." E3S Web of Conferences 460 (2023): 04005. http://dx.doi.org/10.1051/e3sconf/202346004005.

Abstract:
This article provides an in-depth look at implementing parallel SQL query processing using the Microsoft SQL Server database management system. It examines how parallelism can significantly accelerate query execution by leveraging multi-core processors and clustered environments. The article explores SQL Server's sophisticated parallel processing capabilities including automatic query parallelization, intra-query parallelism techniques like parallel joins and parallel data aggregation, as well as inter-query parallelism for concurrent query execution. It covers key considerations around effective parallelization such as managing concurrency and locks, handling data skew, resource governance, and monitoring. Challenges like debugging parallel plans and potential bottlenecks from excessive parallelism are also discussed along with mitigation strategies. Real-world examples demonstrate how judicious application of parallel processing helps optimize complex analytics workloads involving massive datasets. The insights presented provide guidance to database developers and administrators looking to enable parallel SQL query execution in SQL Server environments for substantial performance gains and scalability.
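As a brief, hedged illustration of the controls discussed (per-query degree of parallelism and the server-wide cost threshold): the table below is hypothetical, while the hint and configuration option are standard SQL Server features:

    -- Allow this aggregation to use up to eight schedulers, overriding the server default
    SELECT ProductId, SUM(Quantity) AS TotalQty
    FROM dbo.SalesDetail              -- hypothetical table
    GROUP BY ProductId
    OPTION (MAXDOP 8);

    -- Raise the cost threshold so only genuinely expensive plans go parallel
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'cost threshold for parallelism', 50;
    RECONFIGURE;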
22

Allam, Tahani M. "Estimate the Performance of Cloudera Decision Support Queries." International Journal of Online and Biomedical Engineering (iJOE) 18, no. 01 (2022): 127–38. http://dx.doi.org/10.3991/ijoe.v18i01.27877.

Abstract:
Hive and Impala queries are used to process large amounts of data. The overwhelming amount of information requires an efficient data processing system. Hive is more suitable for long-term batch query and analysis, while Impala is a powerful system suited to real-time interactive Structured Query Language (SQL) queries, adding massively parallel processing to a distributed Hadoop cluster. Data growth creates a problem for SQL Cluster because execution processing time increases. In this paper, a comparison is demonstrated between the performance times of Hive, Impala, and SQL on two different data models, with different queries chosen to test performance. The results demonstrate that Impala outperforms Hive and SQL Cluster when it comes to analyzing data and processing tasks. We compare the performance of Hive, Impala, and SQL clusters using two benchmark datasets: TPC-H and the 2009 Statistical Graphics Data Expo.
23

Ullah, Atta, Muhammad Usman, Muhammad F. Abrar, Najeeb Ullah, Ibrar A. Shah, and Muhammad F. Nadeem. "Systematic performance, and Security evaluation of .NET models for accessing database." VFAST Transactions on Software Engineering 9, no. 4 (2021): 18–24. http://dx.doi.org/10.21015/vtse.v9i4.752.

Abstract:
In .NET, Object Relational Mapping (ORM) is a programming technique used for accessing the database, with many frameworks such as Entity Framework, LINQ to SQL, NHibernate, Telerik Open Access, and LightSpeed. The usability of LINQ to SQL and Entity Framework has increased because full CRUD (Create, Read, Update and Delete) operations can be implemented in these two frameworks in a short time, compared to Transact queries, which require more time. With multiple projects on various models (Transact Query, LINQ to SQL, and Entity Framework), it becomes difficult to decide which model is the best in terms of performance and security. Therefore, in this article, we provide a comprehensive comparison between Entity Framework, LINQ to SQL, and Transact queries in terms of performance and security. For this purpose, we implemented eleven different types of queries on the selected three frameworks. Subsequently, we quantified and evaluated the execution time and memory usage of all the queries. Furthermore, all types of SQL injection attacks were applied to three separate applications for security evaluation. Our results show that the Transact Query is more vulnerable to SQL injection attacks than LINQ to SQL and Entity Framework, yet it outperforms them in terms of memory and CPU usage. Our results also help practitioners adopt a framework on the basis of query-level performance in terms of memory and CPU usage.
24

Aritonang, Gloria Raphaela Mei Lanny Br, Mhd Arief Hasan, Mohammad Khasbulla Ridwan, Muhamad Nur Iman, Ricky Carlo Pratama Silalahi, and Rizky Suranta Sipayung. "OPTIMASI QUERY SQL SERVER DENGAN TEKNIK INDEXING DAN PERFORMANCE MONITORING." JATI (Jurnal Mahasiswa Teknik Informatika) 9, no. 2 (2025): 3094–99. https://doi.org/10.36040/jati.v9i2.13179.

Abstract:
Efficient database management is one of the main challenges in information technology, especially in systems with large data volumes and highly complex queries. This study aims to optimize query performance on Microsoft SQL Server through the application of indexing techniques and performance monitoring. Experiments were conducted by comparing query performance on a database without indexes and with indexes, using dummy data of 1 million rows. The results show that applying indexes reduced logical reads by up to 89.77%, CPU time by up to 68.09%, and elapsed time by up to 90.20%. In addition, performance monitoring through SQL Server Management Studio (SSMS) identified the level of index fragmentation as a factor affecting query efficiency. Indexing techniques are thus proven effective in improving data processing performance on a SQL Server database. This study provides strategic insight into the importance of proper index design and the need for performance monitoring to maintain database system efficiency.
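A small T-SQL sketch of the measurement setup the study reports (logical reads, CPU time, elapsed time, and index fragmentation); the table name and predicate are hypothetical stand-ins for the test table:

    -- Report logical reads, CPU time, and elapsed time for each statement
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    SELECT * FROM dbo.TestTable WHERE SomeColumn = 'value';  -- hypothetical test query

    -- Check index fragmentation, the factor the study monitored through SSMS
    SELECT index_id, avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.TestTable'),
                                        NULL, NULL, 'LIMITED');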
25

Nerić, Vedrana, and Nermin Sarajlić. "A Review on Big Data Optimization Techniques." B&H Electrical Engineering 14, no. 2 (2020): 13–18. http://dx.doi.org/10.2478/bhee-2020-0008.

Abstract:
Analysis of representative tools for SQL query processing on Hadoop (SQL-on-Hadoop systems), such as Hive, Impala, Presto, and Shark, shows that they are still not sufficiently efficient for complex analytical queries and interactive query processing. Existing SQL-on-Hadoop systems can benefit greatly from the application of modern query processing techniques that have been studied extensively for many years in the database community. It is expected that with the application of advanced techniques, the performance of SQL-on-Hadoop systems can be improved. The main idea of this paper is to give a review of big data concepts and technologies, and to summarize big data optimization techniques that can be used to improve performance when processing big data.
26

Chandrashekariah, Yathish Aradhya Bandur, and Dinesha H. A. "Structured Query Language Query Join Optimization by Using Rademacher Averages and MapReduce Algorithms." Bulletin of Electrical Engineering and Informatics 13, no. 3 (2024): 1730–40. http://dx.doi.org/10.11591/eei.v13i3.6837.

Abstract:
Query optimization involves identifying and implementing the most effective and efficient methods and strategies to enhance the performance of queries. This is achieved by intelligently utilizing system resources and considering various performance metrics. Table-join optimization involves optimizing the process of combining two or more tables within a database. Structured query language (SQL) optimization is the process of formulating SQL queries in the best possible way to achieve fast and accurate database results. SQL optimization is critical to decreasing the number of queries over Resource Description Framework (RDF) data and the time needed to process huge amounts of related data. In this paper, four new algorithms are proposed, namely hash-join, sort-merge, Rademacher averages, and MapReduce, for SQL query join optimization. The proposed model is evaluated and tested using the Waterloo SPARQL Diversity Test Suite (WatDiv) and Lehigh University Benchmark (LUBM) datasets in terms of execution time. The results show that the proposed method achieved improved performance with lower execution times for various queries, such as Q3 at 5362, Q8 at 5921, Q9 at 5854, and Q10 at 5691 milliseconds. The proposed method performs better than existing methods such as the hybrid database-MapReduce system (AQUA+) and join query processing (JQPro).
27

Tirupati, Krishna Kishor, S. P. Singh, Sivaprasad Nadukuru, Shalu Jain, and Raghav Agarwal. "Improving Database Performance with SQL Server Optimization Techniques." Modern Dynamics: Mathematical Progressions 1, no. 2 (2024): 450–94. http://dx.doi.org/10.36676/mdmp.v1.i2.32.

Abstract:
Database performance is critical for ensuring efficient data management and retrieval, particularly in environments that utilize SQL Server. As the complexity and size of databases increase, performance optimization becomes essential for maintaining responsiveness and reliability. This paper examines a range of SQL Server optimization techniques that can significantly enhance database performance. Key strategies include effective indexing, which improves query execution speed; query optimization, which refines SQL statements for better efficiency; and proper database configuration, which ensures that SQL Server is tuned for optimal resource utilization. Additionally, we explore the importance of regular maintenance tasks such as updating statistics and monitoring performance metrics to identify bottlenecks. By employing these techniques, database administrators can minimize latency, reduce resource consumption, and enhance overall system performance. Furthermore, this paper highlights the significance of ongoing performance assessments to adapt to evolving application demands and data growth. Ultimately, implementing these optimization strategies not only boosts SQL Server performance but also contributes to improved user experiences and organizational productivity.
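Two of the maintenance and monitoring strategies named here (updating statistics and watching performance metrics for bottlenecks) can be sketched in T-SQL as follows; dbo.Orders is a hypothetical table, while the DMVs are standard SQL Server objects:

    -- Regular maintenance: refresh optimizer statistics on a heavily queried table
    UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

    -- Monitoring: list the most CPU-hungry cached queries to find bottlenecks
    SELECT TOP 10 qs.total_worker_time, qs.execution_count, st.text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_worker_time DESC;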
28

Kusi, Rajan, Kabir Kumar Sinkemana, Sanjeeb Prasad Pandey, and Shashidhar Ram Joshi. "SQL Optimization in Oracle using Hybrid Genetic and Ant Colony Algorithm." Nepal Journal of Science and Technology 21, no. 2 (2022): 21–28. http://dx.doi.org/10.3126/njst.v21i2.62353.

Abstract:
In this paper, an input user Structured Query Language (SQL) query is converted into an optimized SQL query using a hybrid algorithm. The main aim is to reduce query execution time using the PHP language and an Oracle database. Performance has been evaluated using different metrics: cost of individuals and query execution time. The hybrid method combines the evolutionary effect of the Genetic Algorithm (GA) and the cooperative effect of Ant Colony Optimization (ACO). A GA with a high global convergence rate is used to produce an initial optimum for allocating the initial pheromones of the ACO. An ACO with great parallelism and effective feedback is then employed to obtain the optimal solution. The simulation results show that a fused GA and ACO algorithm solves SQL optimization problems in Oracle; it is an innovative solution that presents a clear methodological contribution to optimization algorithms.
29

Kumail, Saifuddin Saif. "SAP HANA Query Optimization Techniques." International Journal of Innovative Research and Creative Technology 6, no. 6 (2020): 1–6. https://doi.org/10.5281/zenodo.14566278.

Abstract:
Query performance is a key aspect of any development done in SAP HANA, and it is very common to encounter performance issues as the data size grows in the system. SAP HANA offers effective tools and techniques for developers to make sure a query is optimized for the hardware of the SAP HANA database. Understanding the details of query processing by the different HANA engines (the calculation engine, join engine, and OLAP engine, to name a few), along with the role of the SQL optimizer, SQL plan cache, PlanViz, and OOM dumps, helps developers write and test code so that performance is optimized.
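As a minimal sketch of one standard entry point into the plan inspection workflow mentioned here, SAP HANA's EXPLAIN PLAN facility can be used as follows; the view and predicate being analyzed are hypothetical:

    -- Capture the optimizer's plan for a statement under a chosen name
    EXPLAIN PLAN SET STATEMENT_NAME = 'demo_query' FOR
    SELECT * FROM sales_cds_view WHERE region = 'EMEA';  -- hypothetical view

    -- Review which operators (and hence which HANA engine) were chosen
    SELECT operator_name, operator_details
    FROM explain_plan_table
    WHERE statement_name = 'demo_query';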
30

Liu, Zheli, Jingwei Li, Jin Li, Chunfu Jia, Jun Yang, and Ke Yuan. "SQL-Based Fuzzy Query Mechanism Over Encrypted Database." International Journal of Data Warehousing and Mining 10, no. 4 (2014): 71–87. http://dx.doi.org/10.4018/ijdwm.2014100104.

Abstract:
With the development of cloud computing and big data, data privacy protection has become an urgent problem to solve. Data encryption is the most effective way to protect privacy; however, it changes the data format, with two consequences: first, the database structure and application software must be changed; second, structured query language (SQL) operations cannot work properly, especially SQL-based fuzzy queries. As a result, it is necessary to provide an SQL-based fuzzy query mechanism over encrypted databases, including traditional databases and cloud-outsourced databases. This paper establishes a secure database system using format-preserving encryption (FPE) as the underlying primitive to protect data privacy without changing the database structure. It further proposes a new SQL-based fuzzy query mechanism supporting direct querying over encrypted data, constructed from FPE and a universal hash function (UHF). The security of the proposed mechanism is analyzed as well. Finally, extensive experiments on the system demonstrate its practical performance.
31

Kavitha, S. N. "Tuning SQL Queries for Better Performance." International Journal of Psychosocial Rehabilitation 24, no. 5 (2020): 7002–5. http://dx.doi.org/10.37200/ijpr/v24i5/pr2020703.

32

Deska, Amat, Ipal Akbar, and Samidi. "Tuning Database Pada Sistem Penerimaan Mahasiswa Baru Menggunakan Optimasi Query dan Indexing." Techno.Com 22, no. 1 (2023): 68–77. http://dx.doi.org/10.33633/tc.v22i1.7047.

Abstract:
Operating a MariaDB database requires an application in the form of a localhost server whose response time for running a query should be as efficient as possible. This study measures the performance of SELECT queries on a MariaDB database installed on a computer or laptop using an application called XAMPP, with roughly 12,000 data records available, of which about 5,000 are used. Query optimization and indexing are then applied to these data. The problem addressed is the long data retrieval time, which calls for a way to speed up the retrieval process. The database optimization method examined here compares the effectiveness of various subqueries and the use of indexing on tables. The queries tested use LEFT JOIN, WHERE, ON, GROUP BY, ORDER BY, and DML (Data Manipulation Language), namely SELECT queries, run in XAMPP. The expected result is faster, more efficient data retrieval that improves database performance after SQL query optimization and table indexing.
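A hedged MariaDB sketch of the tested pattern (inspecting a LEFT JOIN SELECT with EXPLAIN, then indexing); the student-admission tables and columns are hypothetical stand-ins:

    -- Inspect the plan of the slow SELECT before optimizing
    EXPLAIN
    SELECT m.nama, j.nama_jurusan
    FROM mahasiswa AS m
    LEFT JOIN jurusan AS j ON j.id = m.jurusan_id
    WHERE m.tahun_daftar = 2022
    ORDER BY m.nama;

    -- Index the join key and the filtered column so MariaDB avoids full table scans
    CREATE INDEX idx_mahasiswa_jurusan ON mahasiswa (jurusan_id);
    CREATE INDEX idx_mahasiswa_tahun   ON mahasiswa (tahun_daftar);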
33

Yan, Liang, Jinhang Su, Chuanyi Liu, et al. "ExSPIN: Explicit Feedback-Based Self-Play Fine-Tuning for Text-to-SQL Parsing." Entropy 27, no. 3 (2025): 235. https://doi.org/10.3390/e27030235.

Abstract:
Recently, self-play fine-tuning (SPIN) has garnered widespread attention as it enables large language models (LLMs) to iteratively enhance their capabilities through simulated interactions with themselves, transforming a weak LLM into a strong one. However, applying SPIN to fine-tune text-to-SQL models presents substantial challenges. Notably, existing frameworks lack clear signal feedback during the training process and fail to adequately capture the implicit schema-linking characteristics between natural language questions and databases. To address these issues, we propose a novel self-play fine-tuning method for text-to-SQL models, termed ExSPIN, which incorporates explicit feedback. Specifically, during fine-tuning, the SQL query execution results predicted by the LLM are fed back into the model’s parameter update process. This feedback allows both the main player and the opponent to more accurately distinguish between negative and positive samples, thereby improving the fine-tuning outcomes. Additionally, we employ in-context learning techniques to provide explicit schema hints, enabling the LLM to better understand the schema-linking between the database and natural language queries during the self-play process. Evaluations on two real-world datasets show that our method significantly outperforms the state-of-the-art approaches.
34

Pardede, Eric, J. Rahayu, Ramanpreet Aujla, and David Taniar. "SQL/XML Hierarchical Query Performance Analysis in an XML-Enabled Database System." JUCS - Journal of Universal Computer Science 15, no. 10 (2009): 2058–77. https://doi.org/10.3217/jucs-015-10-2058.

Abstract:
The increased utilization of XML structures for data representation, exchange, and integration has strengthened the need for efficient storage and retrieval of XML data. Currently, there are two major streams of XML data repositories. The first stream is Native XML database systems, which are built solely to store and manipulate XML data and are equipped with the standard XML query languages XPath and XQuery. The second stream is XML-Enabled database systems, which are generally existing traditional database systems enhanced with XML storage capabilities. The SQL/XML standard for XML querying is used in this enabled-database stream. The main characteristic of this standard is that XPath and XQuery are embedded within SQL statements. To date, most existing work on XML query analysis has focused on the first stream of Native XML database systems. The focus of this paper is to present a taxonomy of different hierarchical query patterns in an XML-Enabled database environment and to analyze the performance of the different query structures using the SQL/XML standard.
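To illustrate what embedding XPath within SQL looks like in practice, here is a hedged Oracle-style SQL/XML sketch; the book_docs table and its XMLType column doc are hypothetical:

    -- XPath embedded in SQL: shred /catalog/book nodes into relational rows
    SELECT x.title, x.price
    FROM   book_docs b,
           XMLTABLE('/catalog/book'
                    PASSING b.doc
                    COLUMNS title VARCHAR2(100) PATH 'title',
                            price NUMBER        PATH 'price') x;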
35

Raghavender, Maddali. "Quantum Machine Learning for Ultra-Fast Query Execution in High-Dimensional SQL Data Systems." International Journal of Leading Research Publication 3, no. 4 (2022): 1–13. https://doi.org/10.5281/zenodo.15107548.

Abstract:
This work presents a new Quantum Machine Learning (QML) paradigm for ultra-fast query execution in high-dimensional SQL data systems. Conventional database query execution is plagued by performance bottlenecks because of the explosive growth of structured data and intricate query optimization issues. The new QML-based methodology uses quantum algorithms to accelerate query processing by exploiting parallel computation, quantum-aided indexing, and probabilistic data access. With the incorporation of quantum-enhanced optimization methods, the framework achieves remarkable reductions in query execution time, enhanced system scalability, and increased efficiency in managing relational databases at scale. The work compares the framework's performance against traditional SQL query optimizers and shows better performance in terms of execution speed, accuracy in retrieving data, and utilization of computational resources. The results also point to the promise of QML in revolutionizing DBMSs and bringing in next-generation data analytics solutions.
36

Idhaim, Hasan Ali. "Selecting and tuning the optimal query form of different SQL commands." International Journal of Business Information Systems 30, no. 1 (2019): 1. http://dx.doi.org/10.1504/ijbis.2019.097041.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Idhaim, Hasan Ali. "Selecting and tuning the optimal query form of different SQL commands." International Journal of Business Information Systems 30, no. 1 (2019): 1. http://dx.doi.org/10.1504/ijbis.2019.10018101.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Ji, Xuechun, Maoxian Zhao, Mingyu Zhai, and Qingxi Wu. "Query Execution Optimization in Spark SQL." Scientific Programming 2020 (February 7, 2020): 1–12. http://dx.doi.org/10.1155/2020/6364752.

Full text
Abstract:
Spark SQL is a big data processing tool for structured data query and analysis. However, during the execution of Spark SQL, intermediate data is written to disk multiple times, which reduces execution efficiency. Targeting these issues, we design and implement an intermediate data cache layer between the underlying file system and the upper Spark core to reduce the cost of random disk I/O. Using a query pre-analysis module, we can dynamically adjust the capacity of the cache layer for different queries, and an allocation module assigns appropriate memory to each node in the cluster. Based on the sharing of intermediate data in the Spark SQL workflow, this paper proposes a cost-based correlation merging algorithm, which can effectively reduce the cost of reading and writing redundant data. This paper develops the SSO (Spark SQL Optimizer) module and integrates it into the original Spark system to achieve the above functions. We compare query performance with existing Spark SQL on experimental data generated by the TPC-H tool. The experimental results show that the SSO module can effectively improve query efficiency, reduce disk I/O cost, and make full use of cluster memory resources.
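The stock PySpark snippet below illustrates the core idea behind the cache layer, though not the SSO module itself: when two queries share an intermediate result, materializing it in memory once avoids recomputing it and spilling it to disk twice. The data path and column names are hypothetical TPC-H-style placeholders.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("shared-intermediate").getOrCreate()
orders = spark.read.parquet("/data/tpch/orders")  # hypothetical path

# Shared intermediate result reused by both downstream queries.
recent = orders.filter("o_orderdate >= '1997-01-01'").cache()

by_status = recent.groupBy("o_orderstatus").count()
by_priority = recent.groupBy("o_orderpriority").count()
by_status.show()
by_priority.show()

SSO goes further by sizing this cache per query and merging correlated intermediate results, but the caching primitive is the same.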
APA, Harvard, Vancouver, ISO, and other styles
39

Adji, Teguh Bharata, Dwi Retno Puspita Sari, and Noor Akhmad Setiawan. "Relational into Non-Relational Database Migration with Multiple-Nested Schema Methods on Academic Data." IJITEE (International Journal of Information Technology and Electrical Engineering) 3, no. 1 (2019): 16. http://dx.doi.org/10.22146/ijitee.46503.

Full text
Abstract:
The rapid development of internet technology has increased the need for data storage and processing technology. One application is managing academic data records at educational institutions. Along with the massive growth of information, a decline in traditional database performance is inevitable, so many companies choose to migrate to NoSQL, a technology able to overcome the shortcomings of traditional databases. However, existing SQL-to-NoSQL migration tools have not been able to represent SQL data relations in NoSQL without limiting query performance. In this paper, a transformation system migrating the relational database MySQL into the non-relational database MongoDB was developed for academic databases, using the Multiple Nested Schema method. The development began with a transformation scheme design, which was then implemented in the migration process using PDI/Kettle. Testing covered three aspects: query response time, data integrity, and storage requirements. The test results showed that the developed system successfully represented SQL data relations in NoSQL: complex query performance was 13.32 times faster in the migrated database, basic query performance involving SQL transaction tables was 28.6 times faster on the migration result, and basic query performance without SQL transaction tables was 3.91 times faster in the migration source. This supports the premise of the Multiple Nested Schema method, which aims to overcome the poor performance of queries involving many JOIN operations. In addition, the system maintained data integrity in all tested queries. The storage test results indicated that the migrated database required 10.53 times more space than the source database, due to the large amount of data redundancy introduced by the transformation. However, storage is not currently a top priority in data processing technology, so larger storage requirements are an accepted consequence of obtaining efficient query performance, which remains the first priority.
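A toy Python sketch of the nested-schema transformation may help: rows from a child table are embedded inside their parent document so that a later read needs no JOIN. The collection and field names are invented, not taken from the paper's academic schema.

students = [{"id": 1, "name": "Ana"}]
enrollments = [
    {"student_id": 1, "course": "DB101", "grade": "A"},
    {"student_id": 1, "course": "ML201", "grade": "B"},
]

# Embed each student's enrollments inside the student document.
documents = [
    {**s, "enrollments": [e for e in enrollments if e["student_id"] == s["id"]]}
    for s in students
]
print(documents)

A single MongoDB lookup on such a document returns a student together with all of their courses, which is the source of the query speedups, while the duplicated child data explains the 10.53-fold storage growth.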
APA, Harvard, Vancouver, ISO, and other styles
40

Rahman, Nayem. "SQL Scorecard for Improved Stability and Performance of Data Warehouses." International Journal of Software Innovation 4, no. 3 (2016): 22–37. http://dx.doi.org/10.4018/ijsi.2016070102.

Full text
Abstract:
Scorecard-based measurement techniques are used by organizations to measure the performance of their business operations. A scorecard approach can also be applied to a database system to measure the performance of the SQL (Structured Query Language) being executed and the extent of resources it uses. In a large data warehouse, thousands of jobs run daily via batch cycles to refresh different subject areas, while thousands of queries from business intelligence tools and ad-hoc queries execute around the clock. A controlling mechanism is needed to make sure these batch jobs and queries are efficient and do not consume more database system resources than necessary. The authors propose measuring SQL query performance via a scorecard tool, with the motivation of making the resource consumption of SQL queries predictable and keeping the database system environment stable. The experimental results show that queries that pass the scorecard evaluation criteria tend to utilize an optimal level of database computing resources. These queries also show improved parallel efficiency (PE) in using computing resources (CPU, I/O, and spool space), demonstrating the usefulness of the SQL scorecard.
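As a rough illustration of how such a gate might work, the Python sketch below passes a query only if its parallel efficiency and spool usage clear fixed thresholds. The PE formula used (average per-unit CPU divided by maximum per-unit CPU) is one common definition on shared-nothing systems; the thresholds and the paper's exact criteria are assumptions.

def scorecard(cpu_per_unit, spool_gb, pe_floor=0.8, spool_cap_gb=50.0):
    # PE near 1.0 means work is spread evenly across parallel units.
    pe = (sum(cpu_per_unit) / len(cpu_per_unit)) / max(cpu_per_unit)
    return pe >= pe_floor and spool_gb <= spool_cap_gb

print(scorecard([9.5, 10.0, 9.8, 10.2], spool_gb=12.0))  # True: balanced CPU
print(scorecard([1.0, 1.0, 1.0, 40.0], spool_gb=12.0))   # False: heavy skew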
APA, Harvard, Vancouver, ISO, and other styles
41

Basant, Namdeo, and Suman Ugrasen. "A Middleware Model for SQL to NoSQL Query Translation." Indian Journal of Science and Technology 15, no. 16 (2022): 718–28. https://doi.org/10.17485/IJST/v15i16.2250.

Full text
Abstract:
Objectives: To propose a suitable model for RDBMS SQL to NoSQL query translation that works as middleware between legacy applications and a NoSQL database. The model is expected to translate SQL queries into NoSQL queries, forward them to the NoSQL database for execution, and return the result to the legacy application. Methods: The proposed model is implemented in the Java programming language for MySQL (RDBMS) to MongoDB document database query translation. The prototype translates insert, update, delete, and select SQL queries into the equivalent NoSQL query format for the MongoDB document database. Its performance has been evaluated by executing SQL queries such as select, insert, update, and delete (with simple and join queries) in Studio 3T, the UnityJDBC driver for MongoDB, and our model SQL-No-QT. Findings: The study shows that the proposed SQL to NoSQL Query Translation Model (SQL-No-QT) performs better in several cases. It takes 7.5% less time than Studio 3T and 38.19% less time than the UnityJDBC driver when executing select queries, and 78.82% less time than Studio 3T when executing delete queries on a large database. It can also execute join SQL queries for insert, update, and delete, which are not available in the UnityJDBC driver for MongoDB. Novelty: The model works as middleware between a legacy application and a NoSQL database, removing the need to develop entirely new software for the legacy application. Keywords: Database reengineering; database; query translation; NoSQL; RDBMS
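The following deliberately tiny Python sketch conveys the kind of translation such middleware performs, for one query shape only; the real SQL-No-QT prototype is written in Java and also covers inserts, updates, deletes, and joins.

import re

def translate_select(sql):
    # Handles only: SELECT * FROM <coll> WHERE <col> = '<val>'
    m = re.match(r"SELECT \* FROM (\w+) WHERE (\w+) = '([^']*)'", sql.strip())
    if not m:
        raise ValueError("query shape not supported by this sketch")
    collection, column, value = m.groups()
    return collection, {column: value}  # feeds db[collection].find(filter)

print(translate_select("SELECT * FROM users WHERE city = 'Oslo'"))
# -> ('users', {'city': 'Oslo'})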
APA, Harvard, Vancouver, ISO, and other styles
42

Silva-Blancas, Victor Hugo, Hugo Jiménez-Hernández, Ana Marcela Herrera-Navarro, José M. Álvarez-Alvarado, Diana Margarita Córdova-Esparza, and Juvenal Rodríguez-Reséndiz. "A Clustering and PL/SQL-Based Method for Assessing MLP-Kmeans Modeling." Computers 13, no. 6 (2024): 149. http://dx.doi.org/10.3390/computers13060149.

Full text
Abstract:
With new high-performance server technology in data centers and bunkers, it is necessary to optimize search engines to use processing time and resources efficiently. The database query system, upheld by the standard SQL language, has maintained the same functional design since the advent of PL/SQL. This is because recent research has focused on computer resource management, encryption, and security rather than on improving data mining with AI tools, machine learning (ML), and artificial neural networks (ANNs). This work presents a projected methodology integrating a multilayer perceptron (MLP) with Kmeans. The methodology is compared with traditional PL/SQL tools and aims to improve database response time while outlining future advantages of ML and Kmeans in data processing. We propose a new corollary: h_k → H = SSE(C), where k > 0 and ∃X, executed on application software querying data collections with more than 306 thousand records. This study produced a comparative table between PL/SQL and MLP-Kmeans based on three hypotheses: line query, group query, and total query. The results show that the line query increased to 9 ms, the group query increased from 88 to 2460 ms, and the total query from 13 to 279 ms. Testing one methodology against the other shows not only the extra fatigue and time consumption that training adds to database querying, but also that a neural network is capable of producing more precise results than plain PL/SQL instructions, which will matter more in the future for domain-specific problems.
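A loose scikit-learn sketch of the MLP-Kmeans pairing follows: Kmeans partitions the records, and an MLP learns to map a query vector to the right partition so that only one cluster needs scanning. The dimensions, cluster count, and hyperparameters are placeholders rather than the paper's settings.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 8))  # stand-in for the 306-thousand-record collection

kmeans = KMeans(n_clusters=16, n_init=10, random_state=0).fit(X)
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
mlp.fit(X, kmeans.labels_)  # learn the record-to-cluster mapping

query = rng.normal(size=(1, 8))
print("scan cluster", int(mlp.predict(query)[0]), "instead of the full table")

The extra training time the authors report is the price of building this learned routing.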
APA, Harvard, Vancouver, ISO, and other styles
43

Wang, Feng Qin, Yu Liu, and Qing Long Han. "Research on Improving Data Query Speed Methods." Applied Mechanics and Materials 556-562 (May 2014): 5825–28. http://dx.doi.org/10.4028/www.scientific.net/amm.556-562.5825.

Full text
Abstract:
In order to improve data query speed, four methods are researched: optimization of SQL (Structured Query Language) statements and the reasonable use of indexes, temporary tables, and views. Practice demonstrates that these four methods can improve query speed and database performance.
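A runnable SQLite miniature of three of the four techniques is given below (the fourth, SQL statement optimization, is implicit in filtering on the indexed column); the schema and data are invented.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(i, "EU" if i % 2 else "US", i * 1.5) for i in range(1000)])

conn.execute("CREATE INDEX idx_sales_region ON sales(region)")  # index
conn.execute("CREATE TEMP TABLE eu_sales AS "
             "SELECT * FROM sales WHERE region = 'EU'")         # temporary table
conn.execute("CREATE VIEW big_sales AS "
             "SELECT * FROM sales WHERE amount > 1000")         # view

print(conn.execute("SELECT COUNT(*) FROM eu_sales").fetchone())
print(conn.execute("SELECT COUNT(*) FROM big_sales").fetchone())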
APA, Harvard, Vancouver, ISO, and other styles
44

Bayya, Anil Kumar. "Building Robust Fintech Reporting Systems Using JPA with Embedded SQL for Real-Time Data Accuracy and Consistency." Eastasouth Journal of Information System and Computer Science 1, no. 01 (2023): 119–31. https://doi.org/10.58812/esiscs.v1i01.480.

Full text
Abstract:
In the rapidly evolving fintech domain, the need for high-throughput, low-latency, real-time reporting systems has increased due to rising transaction volumes. This paper examines the architectural design and technical implementation of fintech reporting systems integrating Java Persistence API (JPA) with embedded SQL, ensuring data accuracy, efficient query execution, and transaction consistency. JPA, an Object-Relational Mapping (ORM) framework, abstracts interactions between Java objects and relational databases, simplifying development through entity mapping and reducing boilerplate code. Annotations like @Entity, @Table, and @Column define entity relationships and constraints, enabling automated schema synchronization. Embedded SQL complements JPA by allowing direct SQL query injection, optimizing complex, performance-critical queries while maintaining JPA’s portability. A multi-tier design ensures separation of concerns and scalability. The persistence layer, managed by JPA, facilitates database interaction, while connection pooling (e.g., HikariCP) and caching strategies (first-level and second-level) enhance transaction throughput. Transactional integrity is enforced via isolation levels, locking strategies, and batch processing, preventing issues like dirty reads and lost updates. Using native SQL queries via JPA’s createNativeQuery method, the system leverages advanced database optimization features. Profiling tools ensure minimal query latency, supporting real-time financial reporting with strict performance standards. By combining JPA’s abstraction with embedded SQL’s efficiency, this architecture provides a resilient, scalable reporting system, ensuring data integrity and optimized query execution for fintech operations.
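The paper's stack is Java/JPA; purely as a language-neutral illustration of the same pattern (ORM-managed transactions for routine access, embedded native SQL for the performance-critical report), here is a rough Python/SQLAlchemy analogue with an invented table. It is a sketch of the architectural idea, not the paper's implementation.

from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:  # transaction boundary, as JPA would enforce
    conn.execute(text("CREATE TABLE txn (id INTEGER, amount REAL, ts TEXT)"))
    conn.execute(text("INSERT INTO txn VALUES (1, 99.5, '2023-01-01')"))
    # Embedded native SQL: hand-tuned where ORM-generated SQL would be too slow.
    total = conn.execute(
        text("SELECT SUM(amount) FROM txn WHERE ts >= :since"),
        {"since": "2023-01-01"},
    ).scalar()
print(total)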
APA, Harvard, Vancouver, ISO, and other styles
45

Zhou, Xuanhe, Guoliang Li, Jianming Wu, Jiesi Liu, Zhaoyan Sun, and Xinning Zhang. "A Learned Query Rewrite System." Proceedings of the VLDB Endowment 16, no. 12 (2023): 4110–13. http://dx.doi.org/10.14778/3611540.3611633.

Full text
Abstract:
Query rewriting is a challenging task that transforms a SQL query to improve its performance while maintaining its result set. However, it is difficult to rewrite SQL queries, which often involve complex logical structures, and there are numerous candidate rewrite strategies for such queries, making it an NP-hard problem. Existing databases or query optimization engines adopt heuristics to rewrite queries, but these approaches may not be able to judiciously and adaptively apply the rewrite rules and may cause significant performance regression in some cases (e.g., correlated subqueries may not be eliminated). To address these limitations, we introduce LearnedRewrite, a query rewrite system that combines traditional and learned algorithms (i.e., Monte Carlo tree search + hybrid estimator) to rewrite queries. We have implemented the system in Calcite, and experimental results demonstrate LearnedRewrite achieves superior performance on three real datasets.
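The abstract's example failure mode, an uneliminated correlated subquery, is worth seeing spelled out. Below is the kind of semantics-preserving rewrite such a system searches for, written by hand over an invented orders table; a learned rewriter scores many such candidates and keeps the cheapest equivalent form.

# Correlated subquery: re-evaluated per outer row by a naive plan.
BEFORE = """
SELECT o.id FROM orders o
WHERE o.total > (SELECT AVG(o2.total) FROM orders o2
                 WHERE o2.customer_id = o.customer_id)
"""
# Decorrelated form: one aggregation, then a join.
AFTER = """
SELECT o.id FROM orders o
JOIN (SELECT customer_id, AVG(total) AS avg_total
      FROM orders GROUP BY customer_id) a
  ON a.customer_id = o.customer_id
WHERE o.total > a.avg_total
"""
print(BEFORE, "=>", AFTER)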
APA, Harvard, Vancouver, ISO, and other styles
46

Oktavia, Tanty, and Surya Sujarwo. "Evaluation of Sub Query Performance in SQL Server." EPJ Web of Conferences 68 (2014): 00033. http://dx.doi.org/10.1051/epjconf/20146800033.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Zhao, Junyi, Kai Su, Yifei Yang, Xiangyao Yu, Paraschos Koutris, and Huanchen Zhang. "Debunking the Myth of Join Ordering: Toward Robust SQL Analytics." Proceedings of the ACM on Management of Data 3, no. 3 (2025): 1–28. https://doi.org/10.1145/3725283.

Full text
Abstract:
Join order optimization is critical in achieving good query performance. Despite decades of research and practice, modern query optimizers could still generate inferior join plans that are orders of magnitude slower than optimal. Existing research on robust query processing often lacks theoretical guarantees on join-order robustness while sacrificing query performance. In this paper, we rediscover the recent Predicate Transfer technique from a robustness point of view. We introduce two new algorithms, LargestRoot and SafeSubjoin, and then propose Robust Predicate Transfer (RPT) that is provably robust against arbitrary join orders of an acyclic query. We integrated Robust Predicate Transfer with DuckDB, a state-of-the-art analytical database, and evaluated against all the queries in TPC-H, JOB, TPC-DS, and DSB benchmarks. Our experimental results show that RPT improves join-order robustness by orders of magnitude compared to the baseline. With RPT, the largest ratio between the maximum and minimum execution time out of random join orders for a single acyclic query is only 1.6x (the ratio is close to 1 for most evaluated queries). Meanwhile, applying RPT also improves the end-to-end query performance by ≈1.5x (per-query geometric mean). We hope that this work sheds light on solving the practical join ordering problem.
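Predicate transfer in miniature: before the join runs, the keys surviving a filter on one table pre-filter the other, so no join order is forced to process rows that cannot match. Production systems pass Bloom filters along join edges; a plain Python set stands in here, and the tables are invented.

orders = [(1, "EU"), (2, "US"), (3, "EU")]  # (order_id, region)
lineitems = [(1, 10.0), (1, 5.0), (2, 7.0), (3, 2.0), (4, 9.9)]

# Transfer the filter's surviving keys to the other side of the join.
surviving_keys = {oid for oid, region in orders if region == "EU"}
prefiltered = [li for li in lineitems if li[0] in surviving_keys]
print(prefiltered)  # only rows for orders 1 and 3 ever reach the join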
APA, Harvard, Vancouver, ISO, and other styles
48

Bai, Qiushi, Sadeem Alsudais, and Chen Li. "QueryBooster: Improving SQL Performance Using Middleware Services for Human-Centered Query Rewriting." Proceedings of the VLDB Endowment 16, no. 11 (2023): 2911–24. http://dx.doi.org/10.14778/3611479.3611497.

Full text
Abstract:
SQL query performance is critical in database applications, and query rewriting is a technique that transforms an original query into an equivalent query with a better performance. In a wide range of database-supported systems, there is a unique problem where both the application and database layer are black boxes, and the developers need to use their knowledge about the data and domain to rewrite queries sent from the application to the database for better performance. Unfortunately, existing solutions do not give the users enough freedom to express their rewriting needs. To address this problem, we propose QueryBooster, a novel middleware-based service architecture for human-centered query rewriting, where users can use its expressive and easy-to-use rule language (called VarSQL) to formulate rewriting rules based on their needs. It also allows users to express rewriting intentions by providing examples of the original query and its rewritten query. QueryBooster automatically generalizes them to rewriting rules and suggests high-quality ones. We conduct a user study to show the benefits of VarSQL to formulate rewriting rules. Our experiments on real and synthetic workloads show the effectiveness of the rule-suggesting framework and the significant advantages of using QueryBooster for human-centered query rewriting to improve the end-to-end query performance.
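VarSQL's actual rule syntax is not reproduced here; the Python sketch below only conveys the shape of a user-supplied rewrite rule, a pattern with variables plus a replacement. The example rule folds a non-sargable CAST predicate into a range predicate, a classic rewrite that relies on human knowledge of the data; the date() call in the output is SQLite syntax.

import re

PATTERN = r"CAST\((\w+) AS DATE\) = '(\d{4}-\d{2}-\d{2})'"
REPLACEMENT = r"\1 >= '\2' AND \1 < date('\2', '+1 day')"

def apply_rule(sql):
    # Apply one rewrite rule; a middleware would try a whole rule set.
    return re.sub(PATTERN, REPLACEMENT, sql)

print(apply_rule("SELECT * FROM t WHERE CAST(ts AS DATE) = '2023-05-01'"))
# SELECT * FROM t WHERE ts >= '2023-05-01' AND ts < date('2023-05-01', '+1 day')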
APA, Harvard, Vancouver, ISO, and other styles
49

Hazboun, Fadi H., Majdi Owda, and Amani Yousef Owda. "A Natural Language Interface to Relational Databases Using an Online Analytic Processing Hypercube." AI 2, no. 4 (2021): 720–37. http://dx.doi.org/10.3390/ai2040043.

Full text
Abstract:
Structured Query Language (SQL) is commonly used in Relational Database Management Systems (RDBMS) and is currently one of the most popular data definition and manipulation languages. Its core functionality is implemented, with only minor variations, throughout all RDBMS products, and it is an effective tool for managing and querying data in relational databases. This paper describes a method to effectively automate the conversion of a data query from a Natural Language Query (NLQ) to Structured Query Language (SQL) using Online Analytical Processing (OLAP) cube data warehouse objects. To obtain or manipulate data from a relational database, the user must be familiar with SQL and must write an appropriate and valid SQL statement; users who are not familiar with SQL are therefore unable to obtain relevant data. To address this, we propose a Natural Language Processing (NLP) model to convert an NLQ into an SQL query, allowing novice users to obtain the required data without knowing any complicated SQL details. The model is also capable of handling complex queries using the OLAP cube technique, which allows data to be pre-calculated and stored in a multi-dimensional, ready-to-use format. A multi-dimensional cube (hypercube) is connected to the NLP interface, thereby eliminating long-running data queries and enabling self-service business intelligence. The study demonstrated how the use of hypercube technology increases system response speed and the ability to process very complex query sentences, and the system achieved impressive performance in terms of NLP and the accuracy of generating different query sentences. Using OLAP hypercube technology, the study achieved distinguished results compared to previous studies in terms of the speed of the model's response to NLQ analysis, the generation of complex SQL statements, and the dynamic display of results. As future work, it is recommended to use infinite-dimension (n-D) cubes instead of 4-D cubes to enable ingesting as much data as possible in a single object and to facilitate the execution of query statements that may be too complex for query interfaces running in a data warehouse.
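A toy Python sketch of the NLQ-to-cube mapping follows: question tokens are matched against a cube's dimensions and measures, and a SQL statement over the pre-aggregated cube is emitted. The vocabulary, cube layout, and table name are invented, and the paper's NLP pipeline is far richer than this keyword match.

CUBE = {"dimensions": {"year", "region", "product"}, "measures": {"sales", "profit"}}

def nlq_to_sql(question):
    tokens = set(question.lower().replace("?", "").split())
    dims = sorted(tokens & CUBE["dimensions"])
    meas = sorted(tokens & CUBE["measures"]) or ["sales"]
    select = ", ".join(dims + ["SUM(%s)" % m for m in meas])
    group = " GROUP BY " + ", ".join(dims) if dims else ""
    return "SELECT %s FROM sales_cube%s" % (select, group)

print(nlq_to_sql("total sales by region and year?"))
# SELECT region, year, SUM(sales) FROM sales_cube GROUP BY region, year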
APA, Harvard, Vancouver, ISO, and other styles
50

Lian, Xin, and Tianyu Zhang. "The Optimization of Cost-Model for Join Operator on Spark SQL Platform." MATEC Web of Conferences 173 (2018): 01015. http://dx.doi.org/10.1051/matecconf/201817301015.

Full text
Abstract:
Spark SQL needs to use substantial memory, network, and disk I/O resources when executing Join operations, so the Join operation greatly affects the performance of Spark SQL, and improving its performance has become an urgent problem. Spark SQL uses Catalyst as its query optimizer in the latest release. The Catalyst query optimizer implements both a rule-based optimization strategy (RBO) and a cost-based optimization strategy (CBO). There are some problems with the Catalyst CBO module: in the first place, the characteristics of in-memory computing in Spark are not fully considered; in the second place, the cost estimation of network transfer and disk I/O is insufficient. To solve these problems and improve the performance of Spark SQL, this study proposes a cost estimation model for the Join operator that considers cost from four aspects: time complexity, space complexity, network transfer, and disk I/O. The most cost-efficient plan can then be selected, using a hierarchical analysis method, from the equivalent physical plans generated by Spark SQL. The experimental results show that the total amount of network transmission is reduced and processor usage is increased; thus, the performance of Spark SQL is improved.
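A schematic Python version of a four-aspect cost model of this kind is shown below: each candidate physical join plan is scored on time, space, network transfer, and disk I/O, and the cheapest plan wins. The uniform weights are placeholders for what the paper's hierarchical analysis method would actually derive.

def join_cost(plan, weights=(0.25, 0.25, 0.25, 0.25)):
    # Weighted sum over the four cost aspects of a candidate plan.
    aspects = (plan["time"], plan["space"], plan["network"], plan["disk_io"])
    return sum(w * a for w, a in zip(weights, aspects))

candidates = [
    {"name": "broadcast", "time": 3, "space": 8, "network": 2, "disk_io": 1},
    {"name": "shuffle",   "time": 5, "space": 3, "network": 7, "disk_io": 4},
]
print("pick:", min(candidates, key=join_cost)["name"])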
APA, Harvard, Vancouver, ISO, and other styles