
Journal articles on the topic 'Data Mining And KDD'

Author: Grafiati

Published: 5 June 2025

Last updated: 15 July 2025


Consult the top 50 journal articles for your research on the topic 'Data Mining And KDD.'


1

Hegland, Markus. "Data mining techniques." Acta Numerica 10 (May 2001): 313–55. http://dx.doi.org/10.1017/s0962492901000058.


Abstract:

Methods for knowledge discovery in data bases (KDD) have been studied for more than a decade. New methods are required owing to the size and complexity of data collections in administration, business and science. They include procedures for data query and extraction, for data cleaning, data analysis, and methods of knowledge representation. The part of KDD dealing with the analysis of the data has been termed data mining. Common data mining tasks include the induction of association rules, the discovery of functional relationships (classification and regression) and the exploration of groups of similar data objects in clustering. This review provides a discussion of and pointers to efficient algorithms for the common data mining tasks in a mathematical framework. Because of the size and complexity of the data sets, efficient algorithms and often crude approximations play an important role.
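The association rules named in the abstract above are usually scored by support and confidence. The following is a minimal sketch of those two measures, not an algorithm taken from the review; the function names and the toy basket data are assumptions made for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base > 0 else 0.0


# Hypothetical market-basket data, invented for this example.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Score the rule {bread} -> {milk}.
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
```

Each call here rescans every transaction, which is fine for a toy example; the efficient algorithms the review points to exist precisely because this brute-force counting does not scale to large data collections.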


2

Hao, Wu. "On Business-Oriented Knowledge Discovery and Data Mining." Advanced Materials Research 760-762 (September 2013): 2267–71. http://dx.doi.org/10.4028/www.scientific.net/amr.760-762.2267.


Abstract:

This paper will discuss issues in data mining and business processes including Marketing, Finance and Health. In turn, the use of KDD in the complex real-world databases in business and government will push the IT researchers to identify and solve cutting-edge problems in KDD modelling, techniques and processes. From IT perspectives, some issues in economic sciences consist of business modelling and mining, aberrant behavior detection, and health economics. Some issues in KDD include data mining for complex data structures and complex modelling. These novel strategies will be integrated to build a one-stop KDD system.


3

Khoshgoftaar, Taghi M., Edward B. Allen, Wendell D. Jones, and John P. Hudepohl. "Data Mining for Predictors of Software Quality." International Journal of Software Engineering and Knowledge Engineering 09, no. 05 (1999): 547–63. http://dx.doi.org/10.1142/s0218194099000309.


Abstract:

"Knowledge discovery in data bases" (KDD) for software engineering is a process for finding useful information in the large volumes of data that are a byproduct of software development, such as data bases for configuration management and for problem reporting. This paper presents guidelines for extracting innovative process metrics from these commonly available data bases. This paper also adapts the Classification And Regression Trees algorithm, CART, to the KDD process for software engineering data. To our knowledge, this algorithm has not been used previously for empirical software quality modeling. In particular, we present an innovative way to control the balance between misclassification rates. A KDD case study of a very large legacy telecommunications software system found that variables derived from source code, configuration management transactions, and problem reporting transactions can be useful predictors of software quality. The KDD process discovered that for this software development environment, out of forty software attributes, only a few of the predictor variables were significant. This resulted in a model that predicts whether modules are likely to have faults discovered by customers. Software developers need such predictions early in development to target software enhancement techniques to the modules that need improvement the most.


4

Fayyad, Usama, and Paul Stolorz. "Data mining and KDD: Promise and challenges." Future Generation Computer Systems 13, no. 2-3 (1997): 99–115. http://dx.doi.org/10.1016/s0167-739x(97)00015-0.


5

Gansky, S. A. "Dental Data Mining: Potential Pitfalls and Practical Issues." Advances in Dental Research 17, no. 1 (2003): 109–14. http://dx.doi.org/10.1177/154407370301700125.


Abstract:

Knowledge Discovery and Data Mining (KDD) have become popular buzzwords. But what exactly is data mining? What are its strengths and limitations? Classic regression, artificial neural network (ANN), and classification and regression tree (CART) models are common KDD tools. Some recent reports (e.g., Kattan et al., 1998) show that ANN and CART models can perform better than classic regression models: CART models excel at covariate interactions, while ANN models excel at nonlinear covariates. Model prediction performance is examined with the use of validation procedures and by evaluating concordance, sensitivity, specificity, and likelihood ratio. To aid interpretation, various plots of predicted probabilities are utilized, such as lift charts, receiver operating characteristic curves, and cumulative captured-response plots. A dental caries study is used as an illustrative example. This paper compares the performance of logistic regression with the KDD methods of CART and ANN in analyzing data from the Rochester caries study. With careful analysis, such as validation with sufficient sample size and the use of proper competitors, problems of naïve KDD analyses (Schwarzer et al., 2000) can be avoided.
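Sensitivity, specificity, and the likelihood ratios mentioned above can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the label vectors are hypothetical and are not data from the caries study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and likelihood ratios.

    Assumes binary labels (1 = positive, 0 = negative) and that both
    classes occur at least once in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Likelihood ratios; guard against division by zero at the extremes.
    lr_plus = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_minus = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_plus": lr_plus,
        "lr_minus": lr_minus,
    }


# Hypothetical labels: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = diagnostic_metrics(y_true, y_pred)
```

As the abstract stresses, such figures are only meaningful when computed on validation data that the model did not see during fitting.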


6

He, Han, Yuanyuan Hong, Weiwei Liu, and Sung-A. Kim. "Data mining model for multimedia financial time series using information entropy." Journal of Intelligent & Fuzzy Systems 39, no. 4 (2020): 5339–45. http://dx.doi.org/10.3233/jifs-189019.


Abstract:

At present, KDD research covers many aspects and has achieved good results in the discovery of time series rules, association rules, classification rules and clustering rules. KDD has also been widely used in practical work such as OLAP and DW. With the rapid development of network technology, Web-based KDD research has also received more and more attention. The main research content of this paper is to analyze and mine time series data, obtain its inherent regularity, and apply it to financial time series transactions. The financial field holds a huge amount of data, which makes it difficult for traditional processing methods to find the knowledge it contains. New knowledge and new technology are urgently needed to solve this problem. The application of KDD technology in the financial field mainly focuses on customer relationship analysis and management, and the mining of transaction data is rare. Practical work requires a tool that analyzes transaction data and finds its inherent regularity in order to judge the nature and development trend of a transaction. Therefore, this paper studies the application of KDD in financial time series data mining, explores an appropriate pattern mining method, and designs an experimental system that mines trading patterns, analyzes the nature of transactions and predicts their development trend, to promote the application of KDD in the financial field.


7

Ardhi Baskara, Arya, Nurul Maharani Piranti, and Muhammad Fahrury Romdendine. "FRAMEWORK DATA MINING: SEBUAH SURVEI." JATI (Jurnal Mahasiswa Teknik Informatika) 9, no. 3 (2025): 4886–95. https://doi.org/10.36040/jati.v9i3.13803.


Abstract:

Rapid developments in information technology have increased the need for data mining methods to analyze and process large amounts of data. Various methodologies have been developed to support this process, among them Knowledge Discovery in Databases (KDD), Cross-Industry Standard Process for Data Mining (CRISP-DM), and Sample, Explore, Modify, Model, and Assess (SEMMA). This study aims to evaluate the popularity and effectiveness of each methodology through a PRISMA-based systematic literature review. A total of 52 articles from 2021 to 2025 were analyzed to identify trends in the use of these methodologies across various fields, including health, business, technology, and education. The results show that CRISP-DM is the most frequently applied methodology because of its flexibility across sectors. KDD and SEMMA, by contrast, are used mostly in more specific contexts. The study highlights the importance of choosing an appropriate methodology to ensure effective extraction of information from data. The findings are expected to serve as a reference for academics, practitioners, and researchers in determining the most relevant methodology based on data characteristics and analysis objectives.


8

Cao, Longbing, and Chengqi Zhang. "The Evolution of KDD: Towards Domain-Driven Data Mining." International Journal of Pattern Recognition and Artificial Intelligence 21, no. 04 (2007): 677–92. http://dx.doi.org/10.1142/s0218001407005612.


Abstract:

Traditionally, data mining is an autonomous data-driven trial-and-error process. Its typical task is to let data tell a story disclosing hidden information, in which domain intelligence may not be necessary in targeting the demonstration of an algorithm. Often knowledge discovered is not generally interesting to business needs. Comparably, real-world applications rely on knowledge for taking effective actions. In retrospect of the evolution of KDD, this paper briefly introduces domain-driven data mining to complement traditional KDD. Domain intelligence is highlighted towards actionable knowledge discovery, which involves aspects such as domain knowledge, people, environment and evaluation. We illustrate it through mining activity patterns in social security data.


9

N., Thinaharan, Chitradevi B., Malathi P., and Kalpana K. "A Literature Survey on Data Mining Techniques and Concepts." International Journal of Engineering Research and Modern Education 3, no. 2 (2018): 1–3. https://doi.org/10.5281/zenodo.1332042.


Abstract:

Data mining is a multidisciplinary field, drawing work from areas including database technology, machine learning, statistics, pattern recognition, information retrieval, neural networks, knowledge-based systems, artificial intelligence, high-performance computing, and data visualization. Data mining is the process of analyzing data from different views and summarizing it into useful data. “Data mining, also popularly referred to as knowledge discovery from data (KDD), is the automated or convenient extraction of patterns representing knowledge implicitly stored or captured in large databases, data warehouses, the Web, other massive information repositories or data streams.”


10

Rodríguez-Ruiz, Julieta G., Carlos Eric Galván-Tejada, Sodel Vázquez-Reyes, Jorge Issac Galván-Tejada, and Hamurabi Gamboa-Rosales. "Classification of Depressive Episodes Using Nighttime Data: Multivariate and Univariate Analysis." Proceedings of the Institute for System Programming of the RAS 33, no. 2 (2021): 115–24. http://dx.doi.org/10.15514/ispras-2021-33(2)-6.


Abstract:

Mental disorders like depression represent 28% of global disability; it affects around 7.5% of global disability. Depression is a common disorder that affects the state of mind, normal activities, and emotions, and produces sleep disorders. It is estimated that approximately 50% of depressive patients suffer from sleep disturbances. In this paper, a data mining process to classify depressive and non-depressive episodes during nighttime is carried out based on a formal method of data mining called Knowledge Discovery in Databases (KDD). KDD guides the data mining process through well-established stages: Pre-KDD, Selection, Pre-processing, Transformation, Data Mining, Evaluation, and Post-KDD. The dataset used for the classification is the DEPRESJON dataset, which contains the motor activity of 23 unipolar and bipolar depressed patients and 32 healthy controls. The classification is carried out with two different approaches, a multivariate and a univariate analysis, to classify depressive and non-depressive episodes. For the multivariate analysis, the Random Forest algorithm is implemented with a model constructed from 8 features; the classification results are a specificity of 0.9927 and a sensitivity of 0.9991. The univariate analysis shows that the maximum of the activity is the most descriptive characteristic of the model, with an accuracy of 0.908 for the classification of depressive episodes.
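The core KDD stages listed above (Selection, Pre-processing, Transformation, Data Mining, Evaluation) can be read as a chain of functions. The sketch below is illustrative only: the records are invented, the Pre-KDD and Post-KDD stages are omitted, and a one-feature threshold rule stands in for the Random Forest used in the paper.

```python
# Hypothetical records, not taken from the DEPRESJON dataset.
# label 1 = depressive episode, 0 = non-depressive episode.
RAW = [
    {"id": "a", "activity": [12, 30, 18], "label": 1, "note": "x"},
    {"id": "b", "activity": [80, 95, 70], "label": 0, "note": "y"},
    {"id": "c", "activity": None, "label": 0, "note": "z"},
    {"id": "d", "activity": [20, 25, 22], "label": 1, "note": "w"},
    {"id": "e", "activity": [60, 110, 90], "label": 0, "note": "v"},
]


def selection(records):
    """Selection: keep only the fields relevant to the task."""
    return [{"activity": r["activity"], "label": r["label"]} for r in records]


def preprocessing(records):
    """Pre-processing: drop records with missing measurements."""
    return [r for r in records if r["activity"]]


def transformation(records):
    """Transformation: reduce each series to one feature, its maximum."""
    return [(max(r["activity"]), r["label"]) for r in records]


def data_mining(features):
    """Data mining: fit a threshold halfway between the class means.

    Assumes, as in the toy data, that class 1 has the lower feature values.
    """
    pos = [x for x, y in features if y == 1]
    neg = [x for x, y in features if y == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x < threshold else 0


def evaluation(model, features):
    """Evaluation: fraction of examples the rule labels correctly."""
    return sum(model(x) == y for x, y in features) / len(features)


features = transformation(preprocessing(selection(RAW)))
model = data_mining(features)
accuracy = evaluation(model, features)
```

Here the rule is scored on the same records it was fitted to, so the accuracy is optimistic; a real study would evaluate on held-out data.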


11

Melissa, Ira, and Raymond S. Oetama. "Analisis Data Pembayaran Kredit Nasabah Bank Menggunakan Metode Data Mining." Jurnal ULTIMA InfoSys 4, no. 1 (2013): 18–27. http://dx.doi.org/10.31937/si.v4i1.238.


Abstract:

Data mining is the analysis or observation of large data sets with the aim of finding unexpected relationships and summarizing the data in ways that are easier to understand and useful to the data owner. Data mining is the core process in Knowledge Discovery in Databases (KDD). Data mining methods are used to analyze borrowers' credit payment data. From the resulting borrower credit payment patterns, one can see which credit parameters are related to, and most influence, credit installment payments. Keywords: data mining, outlier, multicollinearity, ANOVA


12

Guo, Yu Dong. "Prototype System of Knowledge Management Based on Data Mining." Applied Mechanics and Materials 411-414 (September 2013): 251–54. http://dx.doi.org/10.4028/www.scientific.net/amm.411-414.251.


Abstract:

Knowledge is a crucial resource for promoting economic development and social progress; it includes facts, information, descriptions, or skills acquired through experience or education. As knowledge becomes increasingly prominent, knowledge management has become an important measure for promoting the core competences of a corporation. The paper begins with the definition of knowledge management and studies the process of knowledge discovery from databases (KDD), data mining techniques, and the SECI (Socialization, Externalization, Combination, Internalization) model of knowledge dimensions. Finally, a simple knowledge management prototype system based on KDD and data mining is proposed.


13

Bagga, Dr Simmi. "Conceptual Three Phase KDD Model and Financial Research." International Journal of Computers & Technology 4, no. 2 (2005): 592–95. http://dx.doi.org/10.24297/ijct.v4i2c1.4177.


Abstract:

The KDD model is coming into use in financial processes. Data mining tools can be used to improve the efficiency of professionals. The integration of data mining tools with traditional financial research methods is a relatively new concept. If data mining tools for finance are developed, they will make the process faster, cheaper and much more efficient. In this paper we discuss the three-phase model of KDD in financial research.


14

Vinayak, Jain. "An Overview on Data Mining and Data Fusion." Indian Journal of Data Mining (IJDM) 3, no. 1 (2023): 1–5. https://doi.org/10.54105/ijdm.A1624.053123.


Abstract:

Strong adoption of Internet and communication technologies across industries in the last two decades has led to large-scale digitization of business processes. While this has helped make information instantly available, over time the sources and amount of this information have increased multi-fold, giving rise to Big Data. With the increase in volume, the relevance of data in its raw format continues to decrease over time. According to the HACE theorem, Big Data comes from autonomous sources, with distributed and decentralized data in complex relationships with each other. Making sense of this ever-growing pool of data has become increasingly difficult and has created a new problem, eroding the initial gains made via the digitization of systems and processes. This gave rise to multiple data mining techniques that help classify large volumes of data into relevant segments and derive value from them to provide meaningful information. To extract and discover knowledge from data, Knowledge Discovery in Databases (KDD) helps refine the data. This paper discusses various data mining techniques that help identify patterns and relationships to support business decisions through data analysis. Furthermore, the data fusion method is reviewed, which deals with the joint analysis of multiple inter-related datasets, providing multiple complementary views that further support precise decision-making.


15

Mariscal, Gonzalo, Óscar Marbán, and Covadonga Fernández. "A survey of data mining and knowledge discovery process models and methodologies." Knowledge Engineering Review 25, no. 2 (2010): 137–66. http://dx.doi.org/10.1017/s0269888910000032.


Abstract:

Up to now, many data mining and knowledge discovery methodologies and process models have been developed, with varying degrees of success. In this paper, we describe the most used (in industrial and academic projects) and most cited (in scientific literature) data mining and knowledge discovery methodologies and process models, providing an overview of their evolution through the history of data mining and knowledge discovery and setting down the state of the art in this topic. For every approach, we provide a brief description of the proposed knowledge discovery in databases (KDD) process, discussing its special features and its outstanding advantages and disadvantages. Apart from that, a global comparison of all the presented data mining approaches is provided, focusing on the different steps and tasks into which each approach divides the whole KDD process. As a result of the comparison, we propose a new data mining and knowledge discovery process, named the refined data mining process, for developing any kind of data mining and knowledge discovery project. The refined data mining process is built on specific steps taken from the analyzed approaches.


16

Faiz, Hashmi. "Elementary approach towards Biological Data Mining." International Journal of Trend in Scientific Research and Development 2, no. 1 (2017): 1109–14. https://doi.org/10.31142/ijtsrd7198.


Abstract:

In this paper we provide an overview of interactive and integrative knowledge discovery and data mining. The most important challenges include the need to develop and apply novel methods, algorithms and tools for the integration, fusion, pre-processing, mapping, analysis and interpretation of complex biomedical data, with the aim of identifying testable hypotheses and building realistic models. The HCI-KDD approach, a synergistic combination of the methodologies and approaches of two areas, Human-Computer Interaction (HCI) and Knowledge Discovery and Data Mining (KDD), offers ideal conditions for solving these challenges, with the goal of supporting human intelligence with machine intelligence. There is an urgent need for integrative and interactive machine learning solutions, because no medical doctor or biomedical researcher can keep pace today with the increasingly large and complex data sets, often called "Big Data". The application of data mining in the domain of bioinformatics is explained. The paper also highlights some of the current challenges and opportunities of data mining in bioinformatics. Faiz Hashmi, "Elementary approach towards Biological Data Mining", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-1, December 2017, URL: https://www.ijtsrd.com/papers/ijtsrd7198.pdf


17

Wang, Yan, Kun Yang, Xiang Jing, and Huang Long Jin. "Problems of KDD Cup 99 Dataset Existed and Data Preprocessing." Applied Mechanics and Materials 667 (October 2014): 218–25. http://dx.doi.org/10.4028/www.scientific.net/amm.667.218.


Abstract:

The KDD Cup 99 dataset is not only the most widely used dataset in intrusion detection, but also the de facto benchmark for evaluating the performance of intrusion detection systems. Nevertheless, the dataset has many issues that cannot be ignored. In order to establish good data mining models for intrusion detection and find the features that characterize network intrusion attack types, researchers should have a thorough understanding of this dataset. In this paper, we first make an in-depth analysis of the problems that exist in the dataset and give related solutions. Second, we carry out extensive data preprocessing on the 10% subset of the KDD Cup 99 training set, which gives better results in the subsequent processing. Moreover, by comparing 10 common data mining algorithms in our experiment, we conclude that data preprocessing plays a vital role in the performance of data mining algorithms.


18

Mariñelarena-Dondena, Luciana, Marcelo Luis Errecalde, and Alejandro Castro Solano. "Extracción de conocimiento con técnicas de minería de textos aplicadas a la psicología." Revista Argentina de Ciencias del Comportamiento 9, no. 2 (2017): 65–76. http://dx.doi.org/10.32348/1852.4206.v9.n2.12701.


Abstract:

The knowledge discovery in databases (KDD) is concerned with the non-trivial process of making sense of data. Data mining is only a step in the KDD process that consists in pattern recognition using statistics and machine learning techniques. This literature review focuses on how text mining techniques can be applied in Psychology. In this context, the two main purposes of text mining techniques will be introduced: description and prediction. Finally, this paper highlights the use of text mining techniques as a psychological assessment tool, which differs from the use of standard questionnaires or scales.


19

Sandra, Silva Mitraud Ruas, and Fábio Maciel Rafael. "DESCOBERTA DE CONHECIMENTO E MINERAÇÃO DE DADOS EM SAÚDE PÚBLICA: CONCEITOS E PANORAMA ATUAL." Revistaft 27, no. 118 (2023): 67. https://doi.org/10.5281/zenodo.7585167.


Abstract:

Objective: To describe the main concepts and uses of Knowledge Discovery and Data Mining (KDD) technologies in public health. Method: A literature review study. The search was carried out in the MEDLINE, LILACS, and IBCS databases, covering the last 10 years (2013 to 2023). Results: Eleven studies were reviewed to present the main concepts and 8 different experiences of applying KDD at the national and international levels. Conclusion: KDD can be employed to meet different objectives: clinical management, service management, epidemic detection, studies of the population's perceptions and attitudes, and detection of possible fraud in the payment system. The sources can be electronic health records, whether clinical or administrative, from public and private services, as well as data from social networks and the press. The studies show the enormous potential of KDD for public health.


20

Miranda, Eka. "DATA MINING DENGAN METODE KLASIFIKASI NAÏVE BAYES UNTUK MENGKLASIFIKASIKAN PELANGGAN." Infotech: Journal of Technology Information 4, no. 1 (2018): 6–12. http://dx.doi.org/10.37365/it.v4i1.7.


Abstract:

The aim of this study is to classify customers based on a transaction table, using the knowledge discovery from data (KDD) approach and the Naïve Bayes classifier data mining method, with the benefit of producing knowledge that is useful for decisions related to managing customers. To extract knowledge from this large volume of data, data mining and the Naïve Bayes classifier are used. To classify customers, a transaction table from the motor vehicle purchasing process is used. The research methods comprise a data collection method, used to identify information requirements with the fact-finding techniques of Thomas Connolly and Carolyn Begg, which include interviews and requirements or preferences, and a knowledge discovery process using the KDD approach. The study classifies customers into two classes, potential and non-potential customers, using the predictor attributes Occupation, Payment Type, Tenor, and Age. The results show that the Naïve Bayes classifier was able to classify customers into these two classes with the following performance values: sensitivity 97%, specificity 99.8%, precision 99.8%, recall 97%, accuracy 97%, and error rate 3%.


21

Faiq Hacıyev and Nubar Qəhrəmanlı. "DATA-MİNİNG-DƏ KLASTERİZASİYA ÜSULLARININ TƏDQİQİ." PAHTEI-Procedings of Azerbaijan High Technical Educational Institutions 36, no. 01 (2024): 50–57. http://dx.doi.org/10.36962/pahtei36012024-50.


Abstract:

The term “data mining” describes the process of extracting knowledge from large volumes of data. In other words, it is the art, science, and technique of finding significant patterns in huge and complex data sets. Theorists and practitioners constantly seek better methods to increase the efficiency, cost-effectiveness, and accuracy of the process. Many terms, including knowledge extraction from data, data collection, data analysis, and data searching, have meanings similar to or slightly different from data mining. Knowledge discovery from data, often known as KDD, is another widely used expression that is treated as a synonym for data mining. Others regard data mining as only an essential step in the knowledge discovery process, in which intelligent methods are used to extract data patterns. Application areas of data mining: numerous sectors, including healthcare, retail, banking, government, and manufacturing, use data mining extensively. For example, if a business wants to identify trends or patterns among customers who buy certain goods, it can use data mining techniques to examine past purchases and build models that predict which customers will want to buy the goods based on their characteristics or behavior. Keywords: data mining, boundary, clustering, relevant


22

Dokania, Nilesh Kumar, and Navneet Kaur. "Comparative Study of Various Techniques in Data Mining." International Journal of Engineering Sciences & Research Technology 7, no. 5 (2018): 202–9. https://doi.org/10.5281/zenodo.1241440.


Abstract:

Data mining (knowledge discovery from data) may be viewed as the extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns and models from observed data, or as an analytical process designed to explore data. Data mining is also known as knowledge discovery. Essentially, extraction or “mining” means deriving knowledge from large amounts of data. Data mining is used because of the explosive growth of data, i.e. from terabytes to petabytes. We are drowning in data, but starving for knowledge! Alternative names for data mining include data archeology, data dredging, information harvesting, and business intelligence. Data mining techniques are used to find hidden or new patterns in the data. Data mining can be used in every sector, such as business, agriculture, and marketing. There are many data mining techniques, such as clustering and classification. Various approaches and techniques can be applied to data to improve the performance of existing data and help create new predictions from the data.


23

Baskar, S.S., L. Arockiam, and S. Charles. "A Systematic Approach on Data Pre-processing In Data Mining." COMPUSOFT: An International Journal of Advanced Computer Technology 02, no. 11 (2013): 335–39. https://doi.org/10.5281/zenodo.14613431.


Abstract:

Data pre-processing is an important and critical step in the data mining process, and it has a huge impact on the success of data mining for soil classification. Data pre-processing is the first step of the knowledge discovery in databases (KDD) process; it reduces the complexity of the data and offers better analysis and ANN training. Based on data collected from the field as well as from the soil testing laboratory, data analysis is performed more accurately and efficiently. Data pre-processing is a challenging and tedious task, as it involves extensive manual effort and time in developing the data operation scripts. A number of different tools and methods are used for pre-processing, including: sampling, which selects a representative subset from a large population of data; transformation, which manipulates raw data to produce a single input; denoising, which removes noise from data; normalization, which organizes data for more efficient access; and feature extraction, which pulls out specified data that is significant in some particular context. Pre-processing techniques for soil data sets are also useful for classification in data mining.
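Two of the pre-processing operations listed above, sampling and denoising, can be illustrated in a few lines. This is a minimal sketch under stated assumptions: the abstract does not say which sampling or denoising methods the authors used, so simple random sampling and a moving average are swapped in here, and the pH readings are invented.

```python
import random


def sample(rows, k, seed=0):
    """Sampling: draw k rows without replacement, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(rows, k)


def moving_average(values, window=3):
    """Denoising: smooth a series with a simple moving average."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical soil pH readings; 9.0 plays the role of a noisy spike.
ph = [6.0, 6.3, 9.0, 6.3, 6.0]
smoothed = moving_average(ph, window=3)
subset = sample(ph, k=3)
```

The moving average dampens the spike rather than removing it, and it shortens the series by `window - 1` points; whether that trade-off is acceptable depends on the data.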


24

Rao, R. Bharat, Oksana Yakhnenko, and Balaji Krishnapuram. "KDD cup 2008 and the workshop on mining medical data." ACM SIGKDD Explorations Newsletter 10, no. 2 (2008): 34–38. http://dx.doi.org/10.1145/1540276.1540288.


25

Margineantu, Dragos, Stephen Bay, Philip Chan, and Terran Lane. "Data mining methods for anomaly detection KDD-2005 workshop report." ACM SIGKDD Explorations Newsletter 7, no. 2 (2005): 132–36. http://dx.doi.org/10.1145/1117454.1117473.


26

Važan, Pavel, Pavol Tanuska, Dominika Jurovatá, and Michal Kebisek. "Analysis of Production Process Parameters by Using Data Mining Methods." Applied Mechanics and Materials 309 (February 2013): 342–49. http://dx.doi.org/10.4028/www.scientific.net/amm.309.342.

Abstract:

This article deals with knowledge discovery in databases (KDD) and the methodology of this process. The authors identify production parameters and their influence on a production process. Knowledge discovery in production databases is rarely used for planning and control. Many problems occur in the production process, and it is important for managers to identify the impact of manufacturing parameters on the system. Newly discovered knowledge from production systems helps in making the right decisions to fulfill the objectives. Using KDD in the control of production systems can bring a better understanding of system control and can help predict the future behavior of the system. The authors formulate general knowledge for improving the parameters of the analyzed production process. The objectives, steps and some results of the project are presented in this article.

27

Andrade-Arenas, Laberiano, Inoc Rubio-Paucar, and Cesar Yactayo-Arias. "Data mining for predictive analysis in gynecology: a focus on cervical health." International Journal of Electrical and Computer Engineering (IJECE) 14, no. 3 (2024): 2822–33. http://dx.doi.org/10.11591/ijece.v14i3.pp2822-2833.

Abstract:

Cervical cancer is a problem that affects women aged 24 years and older, and data mining can detect important patterns that support decision-making about it. For this purpose, the RapidMiner Studio tool was used to analyze data according to age. The analysis followed the stages of the knowledge discovery in databases (KDD) methodology: data selection, data preparation, data mining, and evaluation and interpretation. In addition, the methodologies cross-industry standard process for data mining (CRISP-DM), KDD, and sample, explore, modify, model, assess (SEMMA) are compared dimension by dimension. A graph was also created comparing algorithmic models such as naive Bayes, decision tree, and rule induction. It is concluded that the most outstanding result was -1.424, located in cluster 4 for the attribute result date.

28

Zhao, Hong Yan. "Decision Tree Technology in Data Classification." Applied Mechanics and Materials 268-270 (December 2012): 1752–57. http://dx.doi.org/10.4028/www.scientific.net/amm.268-270.1752.

Abstract:

With the development of database technology and the widespread application of database management systems, the capability to collect data has improved rapidly, and large amounts of data have accumulated. Data mining was created to excavate the useful knowledge hidden in these data. Data classification is not only an important issue in data mining but also an effective KDD method. The decision tree, a major technique for data classification, is widely applied. This article summarizes and analyzes the concrete steps of mining data with decision trees, the main algorithms, and the basic idea of the decision tree.

29

Molina Huerta, Carlos, Alan Sotelo Atahua, Jahir Villacrisis Guerrero, and Laberiano Andrade-Arenas. "Data mining: Application of digital marketing in education." Advances in Mobile Learning Educational Research 3, no. 1 (2023): 621–29. http://dx.doi.org/10.25082/amler.2023.01.011.

Abstract:

The excessive cost of inadequately managing stored information resources means a significant loss for companies, causing them to invest more in technology than they should. To avoid further losses, companies must counteract this type of problem. The present work aims to apply sound data mining, through digital business marketing, to order and filter the relevant information in databases with RapidMiner, so that companies' databases hold only the information relevant to the normal development of their functions. For this purpose, the Knowledge Discovery in Databases (KDD) methodology is used to filter and search for hidden information patterns, taking advantage of historical data on investment per student in the educational sector to establish more accurate and efficient data prediction. The results show that expenditure per student increases over the years regardless of area; although not all provinces are allocated the same amount, expenditure maintains an upward trend. The study concludes that the KDD methodology made it possible to graph how the expenditure allocated to the education sector has varied across the different grades of education, providing relevant information that will be useful for future related studies.

30

Paucar, Inoc Rubio, and Laberiano Andrade-Arenas. "Data mining and cardiac health: predicting heart attack risks." Indonesian Journal of Electrical Engineering and Computer Science 38, no. 2 (2025): 1010–23. https://doi.org/10.11591/ijeecs.v38.i2.pp1010-1023.

Abstract:

In a context where heart attacks continue to be a global health concern, the lack of precision in predicting who is at higher risk poses a critical challenge due to the variability of risk factors and complex interactions among them. The research aims to develop predictive models for heart attack risks using data mining techniques, employing the knowledge discovery in databases methodology (KDD) and the k-means algorithm with RapidMiner studio. The primary objective is to identify patterns and risk profiles, allowing for early identification of at-risk individuals, considering factors like obesity, diabetes, alcoholism, and stress, to reduce preventable deaths and improve cardiac healthcare. This innovative approach combines cardiac health, data mining, and KDD methodology to address the challenge of predicting heart attack risks and has the potential to enhance medical care and save lives. The predominant results obtained were that cluster 1 with a fraction of 0.312 and a percentage of 31.2% of the attribute diabetes was one of the most prevalent causes of cardiac risk. Finally, the research concluded that people with diabetes are more likely to have cardiac risk associated with dietary factors or consumption of other substances.

32

Zhong, Ning, Chunnian Liu, and Setsuo Ohsuga. "Dynamically Organizing KDD Processes." International Journal of Pattern Recognition and Artificial Intelligence 15, no. 03 (2001): 451–73. http://dx.doi.org/10.1142/s0218001401000976.

Abstract:

How to increase both autonomy and versatility of a knowledge discovery system is a core problem and a crucial aspect of KDD (Knowledge Discovery and Data Mining). Within the framework of the KDD process and the GLS (Global Learning Scheme) system recently proposed by us, this paper describes a way of increasing both autonomy and versatility of a KDD system by dynamically organizing KDD processes. In our approach, the KDD process is modeled as an organized society of KDD agents with multiple levels. We propose an ontology to describe KDD agents, in the style of OOER (Object Oriented Entity Relationship) data model. Based on this ontology of KDD agents, we apply several AI planning techniques, which are implemented as a meta-agent, so that we might (1) solve the most difficult problem in a multiagent KDD system: how to automatically choose appropriate KDD techniques (KDD agents) to achieve a particular discovery goal in a particular application domain; (2) tackle the complexity of KDD process; and (3) support evolution of KDD data, knowledge and process. The GLS system, as a multistrategy and multiagent KDD system based on the methodology, increases both autonomy and versatility.

33

D N, Ashwini, and Soumya Dass B. "Role of Data Mining Technique: A Boon to Society." International Journal for Research in Applied Science and Engineering Technology 10, no. 6 (2022): 657–60. http://dx.doi.org/10.22214/ijraset.2022.43782.

Abstract:

Data mining is a method of finding interesting patterns in huge volumes of data. Data mining techniques help in making business decisions by analyzing information from multiple sources such as data marts and databases. This paper focuses on data mining tasks and their variety of applications in different fields, which are a boon to society. Keywords: KDD, Decision Tree, OLAP servers, Cube API, ODBC, Frequent patterns

34

San, San Nwe, Khin Lay Khin, and Myint Yee Myint. "Delivery Feet Data using K Mean Clustering with Applied SPSS." International Journal of Trend in Scientific Research and Development 3, no. 5 (2019): 1944–45. https://doi.org/10.5281/zenodo.3591719.

Abstract:

Data mining refers to extracting or mining knowledge from large amounts of data. Many people treat data mining as a synonym for another popularly used term, knowledge discovery from data, or KDD. Data can be mined from sources such as relational databases, data warehouses, transactional databases, advanced data and information systems, and advanced applications. This paper constructs a clustering model for analyzing car driving data using the K-means clustering algorithm. The dataset was downloaded from Google.com.

35

Shukla, Siddhartha, Sandeep Kumar, Navdeep Sharma, and Dr Manjot Kaur Bhatia. "Data Mining Application: Rainfall Predictions." International Journal for Research in Applied Science and Engineering Technology 10, no. 12 (2022): 113–18. http://dx.doi.org/10.22214/ijraset.2022.47833.

Abstract:

Data mining is the method or process of extracting implicit, previously unknown, and potentially useful patterns or information from large amounts of data. It is used to extract relevant knowledge from raw data. Some organizations have used data mining methods and algorithms to enhance their businesses and obtained the results they required. Alan Turing first introduced this idea in 1936. Another name for data mining is Knowledge Discovery in Databases (KDD), because unknown and potentially important data is stored in databases. Rainfall is a prime input for wedding seasons or any occasion in India. The work is designed to provide information about the rainfall season for an occasion using the previous 10 years of data, which is very helpful to primary sector workers for crop planning. Daily rainfall data for a period of 10 years is used to understand usual rainfall, deficit rainfall, excess rainfall, and seasonal rainfall. This analysis provides useful facts for water resources planners, and an available formula is used to evaluate the return period of monthly, seasonal, and annual rainfall.

36

Khirod, Chandra Panda. "Anomaly Detection Techniques in Data Mining." European Journal of Advances in Engineering and Technology 5, no. 12 (2018): 1099–105. https://doi.org/10.5281/zenodo.12737369.

Abstract:

Anomaly detection has emerged as a crucial research area for modern researchers, particularly within the realm of data mining. This field is pivotal for future advancements in data mining. Data mining refers to the use of specific methods and algorithms designed to extract and analyze data, uncovering rules and patterns that describe the fundamental properties of data sets. These techniques can be applied to diverse data types, unveiling hidden structures and relationships. In today’s data-driven world, massive amounts of data are stored and transferred from one place to another, often exposing the data to potential threats. Although various techniques and applications are deployed to safeguard data, vulnerabilities still exist. To mitigate these risks and identify different types of cyber threats, data mining techniques are increasingly utilized to strengthen data security. Anomaly detection leverages data mining methods to identify unusual or unexpected behaviors within data, enhancing security by reducing the likelihood of intrusion or attack. This paper focuses on the application of anomaly detection in data mining, with a specific emphasis on identifying anomalies in time series data using machine learning techniques.

37

Syahpitri Damanik, Nur Afni, Irianto Irianto, and Dahriansah Dahriansah. "Penerapan Metode Clustering Dengan Algoritma K-Means Tindak Kejahatan Pencurian di Kabupaten Asahan." J-Com (Journal of Computer) 1, no. 1 (2021): 7–14. http://dx.doi.org/10.33330/j-com.v1i1.1065.

Abstract:

Theft is the illegal taking of another person's property or belongings without the owner's permission. Theft is the most common crime in Asahan District, and the district police (POLRES) still have trouble determining which areas are most often affected. To address this problem, areas where theft frequently occurs need to be grouped, which is done here with a data mining process. Data mining is one of the processes of Knowledge Discovery from Databases (KDD), an activity that includes collecting and using historical data to find regularities, patterns, or relationships in large data sets. One well-known data mining technique is clustering. K-Means is a clustering method that partitions data into groups so that data with the same characteristics fall into the same group and data with different characteristics fall into other groups. The attributes used for grouping are annual data for 2015, 2016, 2017, 2018, and 2019, in a case study of 9 police sectors (POLSEK) in Asahan District. Keywords: Data Mining, Clustering, K-Means Algorithm, Theft Crimes Grouping.
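
The K-Means partitioning described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the yearly theft counts are hypothetical placeholders, the function name `k_means` is made up for this sketch, and the centroids are naively initialized from the first k rows (real studies usually use random or k-means++ initialization).

```python
from math import dist


def k_means(points, k, max_iterations=100):
    """Partition points into k groups by nearest centroid (Lloyd's algorithm)."""
    # Naive initialization (an assumption of this sketch): first k points.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical theft counts per police sector for 2015..2019 (one row each).
counts = [
    (5, 6, 4, 5, 6),
    (30, 28, 35, 31, 29),
    (6, 5, 5, 4, 7),
    (29, 33, 30, 34, 32),
    (4, 4, 6, 5, 5),
    (31, 30, 29, 33, 35),
]
labels, centroids = k_means(counts, k=2)
```

With these placeholder rows, sectors with similar yearly counts end up sharing a label, which is the grouping behaviour the abstract describes.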

38

G. Karthikeyan, K. Saroja, and S. Prasath. "A Performance Assessment on Various Data mining Tool Using Support Vector Machine." Journal of Information Sciences and Computing Technologies 6, no. 1 (2016): 562–67. https://doi.org/10.5281/zenodo.3968244.

Abstract:

Data mining is essentially the discovery of valuable information and patterns from large volumes of available data. Two indispensable techniques of data mining are clustering and classification: classification uses a set of pre-classified examples to develop a model that can classify the population of records at large, while clustering divides the data into groups of similar objects. This paper proposes a new method for data classification that integrates the two techniques. A comparative study was then carried out between simple classification and the proposed integrated clustering-classification technique. Four popular data mining tools were used for both techniques, with six different classifiers and one clusterer for all sets. Across all the tools, the integrated clustering-classification technique performed better than simple classification, and this result was consistent for all six classifiers. For both techniques, the best classifier was found to be SVM. Of the four tools, KNIME was found to be the best in terms of algorithm flexibility. All comparisons were drawn from the percentage accuracy of each classifier.

39

Shen, Dou, Arun C. Surendran, and Ying Li. "Report on the second KDD workshop on data mining for advertising." ACM SIGKDD Explorations Newsletter 10, no. 2 (2008): 47–50. http://dx.doi.org/10.1145/1540276.1540291.

40

Li, Tao, and Chang-shing Perng. "KDD-2006 workshop report: Theory and Practice of Temporal Data Mining." ACM SIGKDD Explorations Newsletter 8, no. 2 (2006): 96–97. http://dx.doi.org/10.1145/1233321.1233337.

41

Adibi, Jafar, and Christos Faloutsos. "KDD-2002 workshop report fractals and self-similarity in data mining." ACM SIGKDD Explorations Newsletter 4, no. 2 (2002): 115–17. http://dx.doi.org/10.1145/772862.772885.

42

Siddiqui, Mohammad Khubeb, and Shams Naahid. "Analysis of KDD CUP 99 Dataset using Clustering based Data Mining." International Journal of Database Theory and Application 6, no. 5 (2013): 23–34. http://dx.doi.org/10.14257/ijdta.2013.6.5.03.

43

Sofyan, Fazrin Meila Azzahra, Affani Putri Riyandoro, Devi Fitriani Maulana, and Jajam Haerul Jaman. "Penerapan Data Mining dengan Algoritma C5.0 Untuk Prediksi Penyakit Stroke." J-SISKO TECH (Jurnal Teknologi Sistem Informasi dan Sistem Komputer TGD) 6, no. 2 (2023): 619. http://dx.doi.org/10.53513/jsk.v6i2.8578.

Abstract:

Stroke is a condition that affects the nervous system and can have a serious impact on a person's health. The WHO reports 13.7 million cases each year, of which 5.5 million people die from the disease. The aim of this study is to develop a predictive model that can help in the early identification of stroke risk. The method used is Knowledge Discovery in Databases (KDD) with the C5.0 algorithm, a classification algorithm that is effective in processing data with both numeric and categorical attributes. The KDD method consists of several stages carried out in this study: selection, preprocessing, transformation, data mining, and evaluation. C5.0 itself is a classification algorithm in the field of data mining that is used specifically in the decision tree technique. The data used in this study is a dataset containing medical information and risk factors associated with stroke. The result of this study is a decision tree evaluated by accuracy, recall, and precision; with an 80% (training data) / 20% (testing data) split, the values obtained were Accuracy = 95%, Recall = 96%, and Precision = 99%.
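
The evaluation reported in this abstract (an 80/20 split scored by accuracy, recall, and precision) follows a standard recipe that can be sketched as below. The helper names and the label vectors are hypothetical and chosen only for illustration; they do not reproduce the study's data or its C5.0 model.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of rows and cut it into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical true and predicted labels (1 = stroke, 0 = no stroke).
y_true = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)

train, test = train_test_split(range(10))
```

Precision answers "of the cases flagged as stroke, how many were right", while recall answers "of the actual stroke cases, how many were found"; the two can differ widely on imbalanced medical data.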

44

Joseph, Sethunya R., Hlomani Hlomani, and Keletso Letsholo. "Data Mining Algorithms: An Overview." International Journal of Computers & Technology 15, no. 6 (2016): 6806–13. http://dx.doi.org/10.24297/ijct.v15i6.1615.

Abstract:

Research on data mining has yielded numerous tools, algorithms, methods, and approaches for handling large amounts of data for various purposes and for problem solving. Data mining has become an integral part of many application domains such as data warehousing, predictive analytics, business intelligence, bioinformatics, and decision support systems. The prime objective of data mining is to effectively handle large-scale data, extract actionable patterns, and gain insightful knowledge. Data mining is part and parcel of the knowledge discovery in databases (KDD) process. Success and improved decision making normally depend on how quickly one can discover insights from data. These insights can be used to drive better actions in operational processes and even to predict future behaviour. This paper presents an overview of various algorithms necessary for handling large data sets. These algorithms define various structures and methods implemented to handle big data. The review also discusses the general strengths and limitations of these algorithms. This paper can serve as a quick guide or an eye-opener for data mining researchers on which algorithm(s) to select and apply in solving the problems they are investigating.

45

Sembiring, Muhammad Ardiansyah, Raja Andri Tama Agus, and Mustika Fitri Larasati Sibuea. "Analisis Kepuasan Pelanggan Menggunakan Metode Rough Set." Journal of Science and Social Research 4, no. 2 (2021): 227. http://dx.doi.org/10.54314/jssr.v4i2.647.

Abstract:

Data mining is a process within Knowledge Discovery from Databases (KDD). KDD is an activity that includes collecting and using historical data to find regularities, patterns, or relationships in large data sets. The rough set method deals with discrete data and is usually used together with other techniques to discretize the dataset. The main goal of rough set analysis is to synthesize approximations of concepts from the acquired data. This study aims to determine the level of customer satisfaction.

46

Ferrer-Troyano, Francisco, Jesús Aguilar-Ruiz, and José Riquelme. "Connecting Segments for Visual Data Exploration and Interactive Mining of Decision Rules." JUCS - Journal of Universal Computer Science 11, no. 11 (2005): 1835–48. https://doi.org/10.3217/jucs-011-11-1835.

Abstract:

Visualization has become an essential support throughout the KDD process for extracting hidden information from huge amounts of data. Visual data exploration techniques provide the user with graphic views or metaphors that represent potential patterns and data relationships. However, a single image does not always convey the properties of high-dimensional data successfully. For such data sets, visualization techniques have to deal with the curse of dimensionality in a critical way, as the number of examples may be very small with respect to the number of attributes. In this work, we describe a visual exploration technique that automatically extracts relevant attributes and displays their ranges of interest in order to support two data mining tasks: classification and feature selection. Through different metaphors with dynamic properties, the user can re-explore meaningful intervals belonging to the most relevant attributes, building decision rules and increasing the model accuracy interactively.

47

Krasnyuk, Maxim, Yurii Kulynych, and Svitlana Krasniuk. "Knowledge Discovery and Data Mining of Structured and Unstructured Business Data: Problems and Prospects of Implementation and Adaptation in Crisis Conditions." Grail of Science, no. 12-13 (May 24, 2022): 63–70. http://dx.doi.org/10.36074/grail-of-science.29.04.2022.006.

Abstract:

In modern conditions of global economic development and with the emergence of new branches of economic activity in IT, the phenomenon of structured and unstructured Big Data, that is, the use of data science for advanced in-depth analysis of data and knowledge in all possible modes, leads to competitive advantages for corporations and institutions at both the regional and interstate levels, which is especially relevant in the context of the current macroeconomic and military crisis [1]. The article systematically investigates the following topical issues: the current status and prospects for further development of Knowledge Discovery in Databases (KDD), problems and critical issues in the theory and practice of data mining, and the specifics of effective use of KDD in the current crisis in Ukraine. These trends and features of the KDD market should be taken into account in further theoretical research and in the practical implementation or reengineering of KDD systems in Ukraine. The results are relevant and applicable not only to local companies and organizations but also to international applications in the context of global, regional macroeconomic, and current national crisis phenomena.

48

Bay, Stephen D., Dennis Kibler, Michael J. Pazzani, and Padhraic Smyth. "The UCI KDD archive of large data sets for data mining research and experimentation." ACM SIGKDD Explorations Newsletter 2, no. 2 (2000): 81–85. http://dx.doi.org/10.1145/380995.381030.

49

Ma, Ming Lei, Gui Ling Wang, Dong Mei Miao, and Gui Jun Xian. "Applying KDD to a Structure Health Monitoring System Based on a Real Sited Bridge: Model Reshaping Case." Applied Mechanics and Materials 472 (January 2014): 535–38. http://dx.doi.org/10.4028/www.scientific.net/amm.472.535.

Abstract:

The knowledge discovery in databases (KDD) method aims to solve the problem of massive data. In bridge engineering, a structural health monitoring (SHM) system accumulates data over time, but the whole system should be understood in real time. Data mining is used as one step of the KDD process. This article proposes a rule for analyzing the SHM data from a real bridge site. The data are intended to help engineers understand the system degradation of the bridge.

50

Yang, Qiang, and Xindong Wu. "10 Challenging Problems in Data Mining Research." International Journal of Information Technology & Decision Making 05, no. 04 (2006): 597–604. http://dx.doi.org/10.1142/s0219622006002258.

Abstract:

In October 2005, we took an initiative to identify 10 challenging problems in data mining research, by consulting some of the most active researchers in data mining and machine learning for their opinions on what are considered important and worthy topics for future research in data mining. We hope their insights will inspire new research efforts, and give young researchers (including PhD students) a high-level guideline as to where the hot problems are located in data mining. Due to the limited amount of time, we were only able to send out our survey requests to the organizers of the IEEE ICDM and ACM KDD conferences, and we received an overwhelming response. We are very grateful for the contributions provided by these researchers despite their busy schedules. This short article serves to summarize the 10 most challenging problems of the 14 responses we have received from this survey. The order of the listing does not reflect their level of importance.
