Dissertations / Theses on the topic 'Association Mining'

Author: Grafiati

Published: 3 June 2025

Last updated: 23 June 2025

Consult the top 50 dissertations / theses for your research on the topic 'Association Mining.'

1

Vithal, Kadam Omkar. "Novel applications of Association Rule Mining- Data Stream Mining." AUT University, 2009. http://hdl.handle.net/10292/826.

Abstract:

Since its advent, association rule mining has become one of the most researched areas of data exploration. In recent years, applying association rule mining methods to extract rules from a continuous flow of voluminous data, known as a data stream, has generated immense interest because of emerging applications such as network-traffic analysis and sensor-network data analysis. For such application domains, the ability to process an enormous amount of stream data in a single pass is critical.

2

Wong, Wai-kit. "Security in association rule mining." Click to view the E-thesis via HKUTO, 2007. http://sunzi.lib.hku.hk/HKUTO/record/B39558903.

3

Wong, Wai-kit, and 王偉傑. "Security in association rule mining." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2007. http://hub.hku.hk/bib/B39558903.

4

Palanisamy, Senthil Kumar. "Association rule based classification." Link to electronic thesis, 2006. http://www.wpi.edu/Pubs/ETD/Available/etd-050306-131517/.

Abstract:

Thesis (M.S.)--Worcester Polytechnic Institute.
Keywords: Itemset Pruning, Association Rules, Adaptive Minimal Support, Associative Classification, Classification. Includes bibliographical references (p. 70-74).

5

Cai, Chun Hing. "Mining association rules with weighted items." Hong Kong : Chinese University of Hong Kong, 1998. http://www.cse.cuhk.edu.hk/%7Ekdd/assoc%5Frule/thesis%5Fchcai.pdf.

Abstract:

Thesis (M. Phil.)--Chinese University of Hong Kong, 1998.
Description based on contents viewed Mar. 13, 2007; title from title screen. Includes bibliographical references (p. 99-103). Also available in print.

6

Zhou, Zequn. "Maintaining incremental data mining association rules." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp05/MQ62311.pdf.

7

Goulbourne, Graham. "Tree algorithms for mining association rules." Thesis, University of Liverpool, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.250218.

Abstract:

With the increasing reliability of digital communication, the falling cost of hardware and increased computational power, the gathering and storage of data has become easier than at any other time in history. Commercial and public agencies are able to hold extensive records about all aspects of their operations. Witness the proliferation of point of sale (POS) transaction recording within retailing, digital storage of census data and computerized hospital records. Whilst the gathering of such data has uses in terms of answering specific queries and allowing visualisation of certain trends, the volumes of data can hide significant patterns that would be impossible to locate manually. These patterns, once found, could provide an insight into customer behaviour, demographic shifts and patient diagnosis hitherto unseen and unexpected. Remaining competitive in a modern business environment, or delivering services in a timely and cost-effective manner for public services, is a crucial part of modern economics. Analysis of the data held by an organisation, by a system that "learns", can allow predictions to be made based on historical evidence. Users may guide the process but essentially the software is exploring the data unaided. The research described within this thesis develops current ideas regarding the exploration of large data volumes. Particular areas of research are the reduction of the search space within the dataset and the generation of rules which are deduced from the patterns within the data. These issues are discussed within an experimental framework which extracts information from binary data.

8

Zhang, Ya Klein Cerry M. "Association rule mining in cooperative research." Diss., Columbia, Mo. : University of Missouri--Columbia, 2009. http://hdl.handle.net/10355/6540.

Abstract:

The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file. Title from PDF of title page (University of Missouri--Columbia, viewed January 26, 2010). Thesis advisor: Dr. Cerry M. Klein. Includes bibliographical references.

9

Icev, Aleksandar. "DARM distance-based association rule mining." Link to electronic thesis, 2003. http://www.wpi.edu/Pubs/ETD/Available/etd-0506103-132405.

10

HajYasien, Ahmed. "Preserving Privacy in Association Rule Mining." Thesis, Griffith University, 2007. http://hdl.handle.net/10072/365286.

Abstract:

With the development and penetration of data mining within different fields and disciplines, security and privacy concerns have emerged. Data mining technology, which reveals patterns in large databases, could compromise information that an individual or an organization regards as private. The aim of privacy-preserving data mining is to find the right balance between maximizing analysis results (that are useful for the common good) and keeping the inferences that disclose private information about organizations or individuals to a minimum. In this thesis we present a new classification for privacy-preserving data mining problems, and we propose a new heuristic algorithm, called the QIBC algorithm, that improves the privacy of sensitive knowledge (as itemsets) by blocking more inference channels. We demonstrate the efficiency of the algorithm. We propose two techniques (item count and increasing cardinality) based on item restriction that hide sensitive itemsets, and we perform experiments to compare the two techniques. We propose an efficient protocol that allows parties to share data in a private way with no restrictions and without loss of accuracy, and we demonstrate the efficiency of the protocol. Finally, we review the software engineering literature related to the association rule mining domain and suggest a list of considerations for achieving better privacy in software.
Thesis (PhD Doctorate), Doctor of Philosophy (PhD), School of Information and Communication Technology, Faculty of Engineering and Information Technology.

11

Pray, Keith A. "Apriori Sets And Sequences: Mining Association Rules from Time Sequence Attributes." Link to electronic thesis, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0506104-150831/.

Abstract:

Thesis (M.S.) -- Worcester Polytechnic Institute.
Keywords: mining complex data; temporal association rules; computer system performance; stock market analysis; sleep disorder data. Includes bibliographical references (p. 79-85).

12

Zhu, Hua. "On-line analytical mining of association rules." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ37678.pdf.

13

Lin, Weiyang. "Association rule mining for collaborative recommender systems." Link to electronic version, 2000. http://www.wpi.edu/Pubs/ETD/Available/etd-0515100-145926.

14

王漣 and Lian Wang. "A study on quantitative association rules." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1999. http://hub.hku.hk/bib/B31223588.

15

Wang, Lian. "A study on quantitative association rules /." Hong Kong : University of Hong Kong, 1999. http://sunzi.lib.hku.hk/hkuto/record.jsp?B2118561X.

16

Delpisheh, Elnaz, and University of Lethbridge Faculty of Arts and Science. "Two new approaches to evaluate association rules." Thesis, Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science, c2010, 2010. http://hdl.handle.net/10133/2530.

Abstract:

Data mining aims to discover interesting and unknown patterns in large-volume data. Association rule mining is one of the major data mining tasks, which attempts to find inherent relationships among data items in an application domain, such as supermarket basket analysis. An essential post-process in an association rule mining task is the evaluation of association rules by measures for their interestingness. Different interestingness measures have been proposed and studied. Given an association rule mining task, measures are assessed against a set of user-specified properties. However, in practice, given the subjectivity and inconsistencies in property specifications, it is a non-trivial task to make appropriate measure selections. In this work, we propose two novel approaches to assess interestingness measures. Our first approach utilizes the analytic hierarchy process to capture quantitatively domain-dependent requirements on properties, which are later used in assessing measures. This approach not only eliminates any inconsistencies in an end user’s property specifications through consistency checking but also is invariant to the number of association rules. Our second approach dynamically evaluates association rules according to a composite and collective effect of multiple measures. It interactively snapshots the end user’s domain-dependent requirements in evaluating association rules. In essence, our approach uses neural networks along with back-propagation learning to capture the relative importance of measures in evaluating association rules. Case studies and simulations have been conducted to show the effectiveness of our two approaches.
viii, 85 leaves : ill. ; 29 cm
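
As a hedged illustration of what an interestingness measure computes, the sketch below evaluates a single rule with three widely used measures: support, confidence, and lift. It does not reproduce the analytic hierarchy process or the neural network approach of the thesis. The function name `rule_measures` and the basket data are assumptions invented for the example.

```python
from typing import Dict, FrozenSet, List


def rule_measures(transactions: List[FrozenSet[str]],
                  antecedent: FrozenSet[str],
                  consequent: FrozenSet[str]) -> Dict[str, float]:
    """Return support, confidence and lift of the rule antecedent -> consequent."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent <= t)
    count_c = sum(1 for t in transactions if consequent <= t)
    count_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = count_ac / n
    # Confidence: share of antecedent transactions that also hold the consequent.
    confidence = count_ac / count_a if count_a else 0.0
    # Lift: confidence relative to the baseline frequency of the consequent.
    lift = confidence / (count_c / n) if count_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical market-basket data, invented for this example.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
measures = rule_measures(baskets, frozenset({"bread"}), frozenset({"milk"}))
```

A lift below 1, as for bread -> milk in this toy data, indicates that the antecedent makes the consequent slightly less likely than its baseline frequency, which is one reason a single measure such as confidence can be misleading on its own.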

17

Koh, Yun Sing. "Generating sporadic association rules." University of Otago. Department of Computer Science, 2007. http://adt.otago.ac.nz./public/adt-NZDU20070711.115758.

Abstract:

Association rule mining is an essential part of data mining, which tries to discover associations, relationships, or correlations among sets of items. As it was initially proposed for market basket analysis, most of the previous research focuses on generating frequent patterns. This thesis focuses on finding infrequent patterns, which we call sporadic rules. They represent rare itemsets that are scattered sporadically throughout the database but with high confidence of occurring together. As sporadic rules have low support, the minabssup (minimum absolute support) measure was proposed to filter out any rules with low support whose occurrence is indistinguishable from that of coincidence. There are two classes of sporadic rules: perfectly sporadic and imperfectly sporadic rules. Apriori-Inverse was then proposed for perfectly sporadic rule generation. It uses a maximum support threshold and a user-defined minimum confidence threshold. This method is designed to find itemsets which consist only of items falling below a maximum support threshold. However, imperfectly sporadic rules may contain items with a frequency of occurrence over the maximum support threshold. To look for these rules, variations of Apriori-Inverse, namely Fixed Threshold, Adaptive Threshold, and Hill Climbing, were proposed. However, these extensions are heuristic. Thus the MIISR algorithm was proposed to find imperfectly sporadic rules using item constraints, which capture rules with a single-item consequent below the maximum support threshold. A comprehensive evaluation of sporadic rules and current interestingness measures was carried out. Our investigation suggests that current interestingness measures are not suitable for detecting sporadic rules.
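
The idea of perfectly sporadic rules can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Apriori-Inverse algorithm itself: it enumerates item pairs only rather than working level by level, and it takes the minimum absolute support as a plain parameter instead of deriving it as the thesis does. The function name, parameter names, and data are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, List, Tuple


def perfectly_sporadic_rules(transactions: List[FrozenSet[str]],
                             max_support: float,
                             min_abs_support: int,
                             min_confidence: float) -> List[Tuple[str, str, float]]:
    """Find single-item rules x -> y whose items all fall below max_support."""
    n = len(transactions)

    def count(itemset: FrozenSet[str]) -> int:
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({i for t in transactions for i in t})
    item_counts = {i: count(frozenset({i})) for i in items}
    # Keep items that are rare (below max_support) yet occur often enough
    # to pass the absolute support filter.
    rare = [i for i in items
            if item_counts[i] >= min_abs_support and item_counts[i] / n < max_support]

    rules = []
    for a, b in combinations(rare, 2):
        pair_count = count(frozenset({a, b}))
        if pair_count < min_abs_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = pair_count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, confidence))
    return rules


# Hypothetical data: bread and milk are common, caviar and champagne are rare.
db = [frozenset(t) for t in (
    {"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"bread", "milk"},
    {"bread", "caviar", "champagne"}, {"milk", "caviar", "champagne"},
    {"bread", "milk"}, {"bread"}, {"milk"}, {"bread", "champagne"},
)]
sporadic = perfectly_sporadic_rules(db, max_support=0.4,
                                    min_abs_support=2, min_confidence=0.8)
```

With these thresholds only caviar -> champagne survives: both items are rare, they co-occur twice, and every basket with caviar also contains champagne.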

18

魯建江 and Kin-kong Loo. "Efficient mining of association rules using conjectural information." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B31224878.

19

Ahmed, Shakil. "Strategies for partitioning data in association rule mining." Thesis, University of Liverpool, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.415661.

20

Bogorny, Vania. "Enhancing spatial association rule mining in geographic databases." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2006. http://hdl.handle.net/10183/7841.

Abstract:

The association rule mining technique emerged with the objective of finding novel, useful, and previously unknown associations in transactional databases, and a large number of association rule mining algorithms have been proposed in the last decade. Their main drawback, which is a well-known problem, is the generation of large numbers of frequent patterns and association rules. In geographic databases the problem of mining spatial association rules increases significantly. Besides the large number of generated patterns and rules, many patterns are well-known geographic domain associations, normally explicitly represented in geographic database schemas. The majority of existing algorithms do not guarantee the elimination of all well-known geographic dependences. The result is that the same associations represented in geographic database schemas are extracted by spatial association rule mining algorithms and presented to the user. The problem of mining spatial association rules from geographic databases requires at least three main steps: compute spatial relationships, generate frequent patterns, and extract association rules. The first step is the most effort-demanding and time-consuming task in the rule mining process, but has received little attention in the literature.

The second and third steps have been considered the main problem in transactional association rule mining and have been addressed as two different problems: frequent pattern mining and association rule mining. Well-known geographic dependences, which generate well-known patterns, may appear in the three main steps of the spatial association rule mining process. Aiming to eliminate well-known dependences and generate more interesting patterns, this thesis presents a framework with three main methods for mining frequent geographic patterns using knowledge constraints. Semantic knowledge is used to avoid the generation of patterns that are already known to be non-interesting. The first method reduces the input problem, and all well-known dependences that can be eliminated without losing information are removed in data preprocessing. The second method eliminates combinations of pairs of geographic objects with dependences during the frequent set generation. A third method presents a new approach to generate non-redundant frequent sets, the maximal generalized frequent sets without dependences. This method reduces the number of frequent patterns very significantly and, as a consequence, the number of association rules.
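
The second method, pruning pairs with known dependences while frequent sets are generated, might look roughly like the sketch below. It is an assumption-laden illustration rather than the thesis's algorithm: it covers only the pair level, and the function name, the feature names, and the set `known_dependences` are invented for the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def frequent_pairs_without_dependences(
        rows: List[FrozenSet[str]],
        min_support: float,
        known_dependences: Set[FrozenSet[str]]) -> Dict[Tuple[str, str], float]:
    """Return frequent pairs, skipping pairs listed as known dependences."""
    n = len(rows)
    items = sorted({i for r in rows for i in r})
    frequent_items = [i for i in items
                      if sum(1 for r in rows if i in r) / n >= min_support]
    result = {}
    for a, b in combinations(frequent_items, 2):
        if frozenset({a, b}) in known_dependences:
            # A well-known dependence carries no new knowledge,
            # so the pair is dropped before its support is counted.
            continue
        support = sum(1 for r in rows if a in r and b in r) / n
        if support >= min_support:
            result[(a, b)] = support
    return result


# Hypothetical rows: each row lists the feature types related to one district.
districts = [
    frozenset({"gas_station", "street", "river"}),
    frozenset({"gas_station", "street"}),
    frozenset({"street", "river", "slum"}),
    frozenset({"gas_station", "street", "slum"}),
]
# Assumed background knowledge: gas stations are always located on streets.
dependences = {frozenset({"gas_station", "street"})}
pairs = frequent_pairs_without_dependences(districts, 0.5, dependences)
```

Without the constraint, the pair (gas_station, street) would be reported with support 0.75 even though it restates something already known from the schema.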

21

Shrestha, Anuj. "Association Rule Mining of Biological Field Data Sets." Thesis, North Dakota State University, 2017. https://hdl.handle.net/10365/28394.

Abstract:

Association rule mining is an important data mining technique, yet its use in association analysis of biological data sets has been limited. This mining technique was applied on two biological data sets, a genome and a damselfly data set. The raw data sets were pre-processed, and then association analysis was performed with various configurations. The pre-processing task involves minimizing the number of association attributes in genome data and creating the association attributes in damselfly data. The configurations include generation of single/maximal rules and handling single/multiple tier attributes. Both data sets have a binary class label and using association analysis, attributes of importance to each of these class labels are found. The results (rules) from association analysis are then visualized using graph networks by incorporating the association attributes like support and confidence, differential color schemes and features from the pre-processed data.
Bioinformatics Seed Grant Program NIH/UND
National Science Foundation (NSF) Grant IIA-1355466

22

Chudán, David. "Association rule mining as a support for OLAP." Doctoral thesis, Vysoká škola ekonomická v Praze, 2010. http://www.nusl.cz/ntk/nusl-201130.

Abstract:

The aim of this work is to identify the possibilities for the complementary use of two analytical methods of data analysis: OLAP analysis and data mining, represented by GUHA association rule mining. Using these two methods together, in the context of proposed scenarios on one dataset, is expected to produce a synergistic effect that surpasses the knowledge acquired by the two methods independently. This is the main contribution of the work. Another contribution is the original use of GUHA association rules where the mining is performed on aggregated data. In their capabilities, GUHA association rules outperform the classic association rules described in the literature. The experiments on real data demonstrate the discovery of unusual trends in data that would be very difficult to find using standard methods of OLAP analysis, that is, the time-consuming manual browsing of an OLAP cube. On the other hand, using association rules alone loses the general overview of the data. These two methods can therefore be said to complement each other very well. Part of the solution is also the use of the LMCL scripting language, which automates selected parts of the data mining process. The proposed recommender system would shield the user from association rules, thereby enabling ordinary analysts unfamiliar with association rules to benefit from them. The thesis combines quantitative and qualitative research. Quantitative research is represented by experiments on a real dataset, the proposal of a recommender system, and the implementation of selected parts of the association rule mining process in the LISp-Miner Control Language. Qualitative research is represented by structured interviews with selected experts from the fields of data mining and business intelligence, who confirm the meaningfulness of the proposed methods.

23

Loo, Kin-kong. "Efficient mining of association rules using conjectural information." Hong Kong : University of Hong Kong, 2001. http://sunzi.lib.hku.hk/hkuto/record.jsp?B22505544.

24

Wu, Jingtong. "Interpretation of association rules with multi-tier granule mining." Thesis, Queensland University of Technology, 2014. https://eprints.qut.edu.au/71455/1/Jing_Wu_Thesis.pdf.

Abstract:

This study was a step forward to improve the performance for discovering useful knowledge – especially, association rules in this study – in databases. The thesis proposed an approach to use granules instead of patterns to represent knowledge implicitly contained in relational databases; and multi-tier structure to interpret association rules in terms of granules. Association mappings were proposed for the construction of multi-tier structure. With these tools, association rules can be quickly assessed and meaningless association rules can be justified according to the association mappings. The experimental results indicated that the proposed approach is promising.

25

Mahamaneerat, Wannapa Kay Shyu Chi-Ren. "Domain-concept mining an efficient on-demand data mining approach /." Diss., Columbia, Mo. : University of Missouri--Columbia, 2008. http://hdl.handle.net/10355/7195.

Abstract:

Title from PDF of title page (University of Missouri--Columbia, viewed on February 24, 2010). The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file. Dissertation advisor: Dr. Chi-Ren Shyu. Vita. Includes bibliographical references.

26

Liao, Yuan-Fong, and 廖原豐. "Causal Association Rule Mining." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/sy5ufc.

Abstract:

Master's thesis, National Central University, Graduate Institute of Information Management, academic year 94 (ROC calendar).

This thesis investigates causality in stock market investment problems, which serve as its experimental subject. We focus on how to improve investment performance. To improve investment performance, we must understand the causality between the factors that influence performance and the observed performance values. We use association rule mining to look for rules that capture the causality between the technical indicators that influence performance and the observed performance values (e.g., the reversal point of the stock price). We call these rules causal association rules. They can be combined into securities trading strategies. Many association rule methods have been proposed in the past, but they produce a large number of large itemsets, so there are too many rules, it is difficult to assess how interesting the rules are, and the methods are relatively inefficient. We therefore propose the CFP algorithm, which improves on the FP-Growth algorithm by avoiding the mining of unnecessary large itemsets and efficiently producing only the interesting causal association rules. The common data discretization methods are equal-width intervals and equal-frequency intervals. However, when investors enter and leave the stock market to buy or sell stocks, they usually refer to the aggregate value of technical indicators. We therefore propose equal-width aggregate intervals and equal-frequency aggregate intervals. These two discretization methods also support mining causal association rules with level crossing, so that more interesting rules can be mined. According to the results of a t test, our algorithm clearly performs better than the FP-Growth algorithm. We also find that the CFP algorithm is suitable for mining large-scale databases. We rank the causal association rules from different points of view for analysis, to assist investors in arranging investment strategies and to serve as a reference for avoiding losses.
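
The two common interval methods the abstract starts from, equal-width and equal-frequency intervals, can be sketched as follows. The aggregate variants proposed in the thesis are not defined in the abstract and are not reproduced here. The function names and the indicator readings are assumptions made up for the example.

```python
from typing import List


def equal_width_bins(values: List[float], k: int) -> List[int]:
    """Assign each value to one of k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0] * len(values)
    # The maximum value would land in bin k, so it is clamped to k - 1.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values: List[float], k: int) -> List[int]:
    """Assign each value to one of k intervals holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * k // len(values)
    return bins


# Hypothetical technical-indicator readings, invented for this example.
readings = [10, 12, 15, 20, 50, 90]
width_bins = equal_width_bins(readings, 3)
frequency_bins = equal_frequency_bins(readings, 3)
```

On skewed data such as these readings, equal-width intervals put four of the six values in the first bin, while equal-frequency intervals place two values in each bin.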

27

Chen, Wei-Ren, and 陳威任. "Mining Utility Association Rules." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/04865121871313091524.

Abstract:

Master's thesis, Ming Chuan University, Master's Program, Department of Computer Science and Information Engineering, academic year 103 (ROC calendar).

Association rule mining can find which products a customer is likely to purchase once the customer has bought certain products, and association rules can be used to recommend products to customers. High utility itemset mining finds the combinations of products that bring high profit. However, a high utility itemset only tells us which products bring high profit, not whether profit increases when we recommend another product to the customer. We therefore propose definitions and an algorithm for mining utility association rules, which find which product to recommend so as to bring more benefit than the original high utility itemsets. With utility association rules, we know clearly which product should be recommended to a customer to bring more profit.

28

Li, Li-Ya, and 李立雅. "Inter-sequence Association Rules Mining." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/07341599579887641679.

Abstract:

Master's thesis, National Taiwan University, Graduate Institute of Information Management, academic year 91 (ROC calendar).

There are many algorithms proposed to find sequential patterns in sequence databases where each transaction contains one sequence. Previously proposed algorithms treat each sequence as an independent one. This kind of mining belongs to intra-transaction sequential pattern mining. In this paper, we propose an algorithm, ProbSif, to mine inter-sequence association rules. Our proposed algorithm consists of three phases. First, we find all large intra-sequence patterns. For each large pattern found, all the time points at which the pattern occurs are recorded in a time point list. Second, those time point lists are hashed into L-buckets. Third, we use a level-wise candidate generation-and-test method to generate candidate patterns across different sequences and check whether a candidate is large. Once we generate a candidate, we count its support by reading the relevant time point lists from the L-buckets. By using the L-buckets, our proposed algorithm requires fewer database scans than the Apriori-like approach. Therefore, our proposed algorithm is more efficient. The experimental results show that our proposed algorithm outperforms the Apriori-like approach by several orders of magnitude.
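
One plausible reading of support counting from time point lists is sketched below. It is an assumption, not the ProbSif algorithm: the abstract does not specify the L-bucket hashing or the candidate format, so the sketch keeps the lists in a plain dictionary and treats a candidate as "pattern `first` at time t and pattern `second` at time t + offset". All names and data are hypothetical.

```python
from typing import Dict, Set


def inter_sequence_support(time_points: Dict[str, Set[int]],
                           first: str, second: str, offset: int) -> int:
    """Count time points t where `first` occurs at t and `second` at t + offset."""
    later = time_points[second]
    return sum(1 for t in time_points[first] if t + offset in later)


# Hypothetical time point lists for two large intra-sequence patterns.
time_point_lists = {
    "<a b>": {1, 2, 4, 7},
    "<c>": {2, 3, 5, 9},
}
support = inter_sequence_support(time_point_lists, "<a b>", "<c>", offset=1)
```

Because the count is computed from the recorded lists alone, the original sequences do not need to be rescanned for each candidate, which matches the abstract's point about needing fewer database scans.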

29

Chang, Paul C. M., and 張仲銘. "Mining Association Rules by Sorts." Thesis, 1998. http://ndltd.ncl.edu.tw/handle/27186430188696978772.

Abstract:

Master's thesis, National Tsing Hua University, Department of Computer Science and Information Engineering, academic year 86 (ROC calendar).

In this thesis, we use knowledge about the sorts of items and transactions to discover association rules among items in a market transaction database. It is natural to divide items into sorts: milk and bread belong to the sort of food, while gloves and hats belong to the sort of clothing. We sort each transaction according to the sorts of items it contains. Each sort of transaction then forms a subset of the entire database. To discover the association rules within and between these subsets, two kinds of support-constraint models with corresponding algorithms are proposed. We claim that such models not only enrich the semantics of rules compared with the original work but also emphasize customer buying patterns for both intra-sort and inter-sort merchandise. The constraint needed when generating rules based on sorts of items is also discussed. The experiments evaluate the performance of these algorithms on synthetic databases with different inter-sort patterns.

30

Li, Shenzhi. "Higher order association rule mining." 2010. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3389963.

31

Su, Wei-Tu, and 蘇威圖. "Mining Multidimensional Intertransaction Association Rules." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/54923326716084839179.

Abstract:

Master's thesis, National Taiwan University, Graduate Institute of Information Management, academic year 90 (ROC calendar).

Traditionally, association rule mining has mostly focused on finding associations among items within the same transaction. In this thesis, we explore "Multidimensional Intertransaction Association Rules", which find association rules across different transactions and extend them to multidimensional space. We propose the E-Partition algorithm and use the Grid File as our data structure to find the large itemsets in the database. In addition, we propose the E-DELTA algorithm to deal with incremental data mining. The experiments show that the E-Partition algorithm performs better than the E-Apriori algorithm. The algorithm using the Grid File is also more efficient than the one that scans the database.

32

Lin, Ming-Yen, and 林明言. "Efficient Algorithms for Association Rule Mining and Sequential Pattern Mining." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/m8z62p.

Abstract:

Doctoral dissertation, National Chiao Tung University, Department of Computer Science and Information Engineering, academic year 92 (ROC calendar).

The tremendous amount of data being collected by computerized applications around the world is increasing rapidly. The valuable information hidden in this vast data is attracting researchers from multiple disciplines to study effective approaches for deriving useful knowledge from it. Among the various data mining objectives, the mining of frequent patterns has been the focus of knowledge discovery in databases. This thesis aims to investigate efficient algorithms for mining frequent patterns, including association rules and sequential patterns. We propose the LexMiner algorithm to deal with frequent itemset discovery for association rules. To alleviate the drawbacks of hash-tree placement of candidates, some algorithms store candidate patterns according to the prefix order of itemsets. LexMiner utilizes lexicographic features and lexicographic comparisons to further speed up the kernel operation of mining algorithms. A memory indexing approach called MEMISP is proposed for fast sequential pattern mining using a find-then-index technique. MEMISP mines databases of any size, with respect to any support threshold, in just two passes of database scanning. MEMISP outperforms other algorithms in that neither candidate patterns nor intermediate databases are generated. Mining sequential patterns with time constraints, such as time gaps and a sliding time window, may reinforce the accuracy of mining results. However, the capability to mine time-constrained patterns was previously available only within the Apriori framework. Recent studies indicate that the pattern-growth methodology could speed up sequence mining. We integrate the constraints into a divide-and-conquer strategy of sub-database projection and propose the pattern-growth based DELISP algorithm, which outperforms other algorithms in mining time-constrained sequential patterns.

In practice, knowledge discovery is an iterative process. Thus, reducing the response time during user interactions for the desired outcome is crucial. The proposed KISP algorithm utilizes the knowledge acquired from each individual mining process, accumulates the counting information to facilitate efficient counting of patterns, and accelerates the whole interactive sequence mining process. Current approaches for sequential pattern mining usually assume that the mining is performed with respect to a static sequence database. However, databases are not static: updates may invalidate discovered patterns and create new ones. Instead of re-mining from scratch, the proposed IncSP algorithm solves the incremental update problem through effective implicit merging and efficient separate counting over appended sequences. Patterns found in prior stages are incrementally updated rather than re-mined. Comprehensive experiments have been conducted to assess the performance of the proposed algorithms. The empirical results show that these algorithms outperform state-of-the-art algorithms with respect to various mining parameters and datasets of different characteristics. The scale-up experiments also verify that our algorithms successfully mine frequent patterns with good linear scalability.

33

Yang, Chian-Yi, and 楊千儀. "Mining High Utility Quantitative Association Rules." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/2jdtaf.

Abstract:

Master's thesis, Ming Chuan University, Master's Program, Department of Information and Communication Engineering, ROC academic year 94.

Mining weighted association rules takes into account the importance of items in a large transaction database. Mining quantitative association rules finds quantitative itemsets that are purchased frequently, and the relationships among them, in a large transaction database. However, weighted association rules do not consider item quantities, and quantitative association rules do not consider item weights. Economics tells us that quantity affects cost and that a high price does not necessarily yield a profit, so considering only weights or only quantities is not enough. This paper considers both weights and quantities in order to find rules that are useful to decision makers. We define the utility of an item as its weight multiplied by its quantity, and we look for high-utility association rules whose items reach a utility threshold. Our method generates no candidates and scans the database only once to produce sub-databases, which are then used to find profitable association rules.
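
The utility measure described in this abstract (weight multiplied by quantity, compared against a threshold) can be illustrated with a minimal sketch. The item names, weights, quantities, and function names below are hypothetical, and the sketch only evaluates the measure on given itemsets; it does not reproduce the thesis's candidate-free, single-scan sub-database method.

```python
# Hypothetical data: per-item weights (for example unit profit) and
# transactions that map each purchased item to its quantity.
weights = {"bread": 1.0, "milk": 2.0, "wine": 10.0}
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "wine": 1},
    {"milk": 3, "wine": 2},
]


def itemset_utility(itemset, transactions, weights):
    """Sum weight * quantity over every transaction that contains the whole itemset."""
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(weights[item] * t[item] for item in itemset)
    return total


def high_utility_itemsets(itemsets, transactions, weights, threshold):
    """Keep the itemsets whose utility reaches the threshold."""
    return [s for s in itemsets
            if itemset_utility(s, transactions, weights) >= threshold]


if __name__ == "__main__":
    print(itemset_utility(("milk", "wine"), transactions, weights))
```

With a threshold of 20, the itemset ("bread",) is dropped even though bread appears in two of the three transactions, which is the kind of difference from plain frequency counting that the abstract points to.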

34

Yi-Ling, Chen. "Mining Spatial Association Rules in Image." 2005. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-0907200516580400.

35

Su, Po-Ta, and 蘇伯達. "Mining Association Rules by Ant System." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/51532702116386182507.

Abstract:

Master's thesis, National Tsing Hua University, Department of Industrial Engineering and Engineering Management, ROC academic year 90.

Mining association rules means finding relations among large amounts of data so that patterns in the dataset can be discovered. Many companies use association rules to find relations among different items in order to improve customer service quality or expand their market. Many recently developed algorithms consider either non-quantitative data or quantitative data, but not both. In reality, however, most collected data are of mixed types. Ant System can handle both data types, filters out unobvious association rules efficiently to reduce unnecessary output, and makes judgments easier, which improves performance. In this study we therefore adopt the techniques and concepts of Ant System to develop association rules. The developed algorithm is supported by theoretical evidence, and comparative studies are provided for evaluation.

36

Lin, Shih Hsiang, and 林士翔. "DARM: Doughnut-shaped Association Rule Mining." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/54386438560648611106.

Abstract:

Master's thesis, Chang Gung University, Graduate Institute of Information Management, ROC academic year 97.

This is the age of the "information explosion": information is ever easier to obtain and ever more abundant. Information visualization research is valuable for presenting this vast amount of information conveniently. Information visualization products such as maps, signs, and graphs are common in daily life. Information visualization can also be used in data mining, which is often called knowledge discovery. Association rule mining, among the best-known data mining methods, is used to discover all associations among items. However, users cannot grasp the important items quickly and accurately from text alone. We propose an association rule algorithm that uses doughnut shapes to present association rules. DARM (Doughnut-shaped Association Rule Mining) consists of an overview circle and many detail circles generated from the items. DARM lets users understand the mining steps easily and apply their own knowledge and experience during the process. Most importantly, the simple and clear doughnut shapes let users quickly grasp an overview of the database and all associations among items.

37

"Mining association rules with weighted items." 1998. http://library.cuhk.edu.hk/record=b5889513.

Abstract:

by Cai, Chun Hing.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1998.
Includes bibliographical references (leaves 109-114).
Abstract also in Chinese.

Contents:
Acknowledgments
Abstract
Chapter 1: Introduction
1.1 Main Categories in Data Mining
1.2 Motivation
1.3 Problem Definition
1.4 Experimental Setup
1.5 Outline of the thesis
Chapter 2: Literature Survey on Data Mining
2.1 Statistical Approach
2.1.1 Statistical Modeling
2.1.2 Hypothesis testing
2.1.3 Robustness and Outliers
2.1.4 Sampling
2.1.5 Correlation
2.1.6 Quality Control
2.2 Artificial Intelligence Approach
2.2.1 Bayesian Network
2.2.2 Decision Tree Approach
2.2.3 Rough Set Approach
2.3 Database-oriented Approach
2.3.1 Characteristic and Classification Rules
2.3.2 Association Rules
Chapter 3: Background
3.1 Iterative Procedure: Apriori Gen
3.1.1 Binary association rules
3.1.2 Apriori Gen
3.1.3 Closure Properties
3.2 Introduction of Weights
3.2.1 Motivation
3.3 Summary
Chapter 4: Mining weighted binary association rules
4.1 Introduction of binary weighted association rules
4.2 Weighted Binary Association Rules
4.2.1 Introduction
4.2.2 Motivation behind weights and counts
4.2.3 K-support bounds
4.2.4 Algorithm for Mining Weighted Association Rules
4.3 Mining Normalized Weighted association rules
4.3.1 Another approach for normalized weighted case
4.3.2 Algorithm for Mining Normalized Weighted Association Rules
4.4 Performance Study
4.4.1 Performance Evaluation on the Synthetic Database
4.4.2 Performance Evaluation on the Real Database
4.5 Discussion
4.6 Summary
Chapter 5: Mining Fuzzy Weighted Association Rules
5.1 Introduction to the Fuzzy Rules
5.2 Weighted Fuzzy Association Rules
5.2.1 Problem Definition
5.2.2 Introduction of Weights
5.2.3 K-bound
5.2.4 Algorithm for Mining Fuzzy Association Rules for Weighted Items
5.3 Performance Evaluation
5.3.1 Performance of the algorithm
5.3.2 Comparison of unweighted and weighted case
5.4 Note on the implementation details
5.5 Summary
Chapter 6: Mining weighted association rules with sampling
6.1 Introduction
6.2 Sampling Procedures
6.2.1 Sampling technique
6.2.2 Algorithm for Mining Weighted Association Rules with Sampling
6.3 Performance Study
6.4 Discussion
6.5 Summary
Chapter 7: Database Maintenance with Quality Control method
7.1 Introduction
7.1.1 Motivation of using the quality control method
7.2 Quality Control Method
7.2.1 Motivation of using Mil. Std. 105D
7.2.2 Military Standard 105D Procedure [12]
7.3 Mapping the Database Maintenance to the Quality Control
7.3.1 Algorithm for Database Maintenance
7.4 Performance Evaluation
7.5 Discussion
7.6 Summary
Chapter 8: Conclusion and Future Work
8.1 Summary of the Thesis
8.2 Conclusions
8.3 Future Work
Bibliography
Appendix
Appendix A: Generating a random number
Appendix B: Hypergeometric distribution
Appendix C: Quality control tables
Appendix D: Rules extracted from the database

38

Chen, Yi-Ling, and 陳奕伶. "Mining Spatial Association Rules in Image." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/39410529091384817730.

Abstract:

Master's thesis, National Taiwan University, Graduate Institute of Electrical Engineering, ROC academic year 93.

In this paper, we integrate data mining with image processing to discover spatial relationships in images. We present an image mining framework, Spatial Association Rule mining (SAR), to mine spatial associations located at specific locations in images. A rule in SAR refers to the occurrences of image content in a pair of spatial locations. The proposed approach is applied to mine color spatial association rules (color-SAR) in landscape scene images to demonstrate that spatial association rules are applicable to image classification. Our experimental results show that the rule-based classifier achieves a classification accuracy of 86%.

39

Huang, Minghua, and 黃明華. "Algorithms for Parallel Association Rules Mining." Thesis, 1999. http://ndltd.ncl.edu.tw/handle/88224762660139352543.

Abstract:

Master's thesis, National Taiwan University of Science and Technology, Graduate Institute of Management, Information Management Program, ROC academic year 87.

Mining association rules is an important task. Many parallel algorithms have been proposed to expedite the mining process. In this thesis, we propose a parallel algorithm called "PBSM" for shared-disk environments and implement it on an nCUBE parallel computer. In the PBSM algorithm, the mining process is divided into two steps. In the first step, multiple processors are used to generate frequent itemsets. In the second step, a chosen processor generates the related association rules. Through boolean-based table operations, the PBSM algorithm does not need to generate candidate itemsets, which account for the major part of the execution time in previous Apriori-based mining algorithms. Furthermore, each processor works independently in generating frequent itemsets, so there is no need to send messages for itemsets, supports, or counts between processors. As a result, our PBSM algorithm shows superior performance compared to the existing parallel mining algorithms.
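
The abstract does not spell out PBSM's boolean-based table operations, so the following is only a generic sketch of one common boolean technique: each item is represented by a bit vector over transactions, and the support of an itemset is the population count of the AND of its items' vectors. The function names and sample data are assumptions made for illustration and are not taken from the thesis.

```python
def build_bitmaps(transactions):
    """Map each item to an int whose bit k is set when transaction k contains the item."""
    bitmaps = {}
    for k, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << k)
    return bitmaps


def support(itemset, bitmaps):
    """Count the transactions containing every item of a non-empty itemset."""
    items = list(itemset)
    acc = bitmaps.get(items[0], 0)
    for item in items[1:]:
        acc &= bitmaps.get(item, 0)  # boolean AND keeps only the shared transactions
    return bin(acc).count("1")


if __name__ == "__main__":
    # Hypothetical transactions, used only to exercise the functions.
    demo = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
    print(support(("a", "b"), build_bitmaps(demo)))
```

Because each bit vector can be built from a partition of the database independently, this style of counting fits the abstract's description of processors working without exchanging counts, although the thesis may organize its tables differently.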

40

Lin, Chih-Lung, and 林志龍. "Mining Association Words for Document Summarization." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/62018588118824230371.

41

Chang, Chien-Yu, and 張倩瑜. "Privacy-Preserving for Association Patterns Mining." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/76658381270102818626.

Abstract:

Master's thesis, National Dong Hwa University, Department of Computer Science and Information Engineering, ROC academic year 91.

Data mining can be applied in many areas, but it can also pose a threat to privacy. We seek an appropriate balance between the need for privacy and the need for information discovery on association patterns. In this thesis, we propose an innovative technique for hiding patterns. We define a correlation matrix for items and set its entries to appropriate values. Multiplying the original transaction database by the correlation matrix yields a new database that is sanitized for privacy. We also add a probabilistic factor to decide whether the sanitization process is performed. In addition, we introduce a performance metric for measuring efficacy and present experimental results for this methodology.

42

Zhen, Hao, and 振昊. "DPARM: Differential Privacy Association Rules Mining." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/xqj7yw.

Abstract:

Master's thesis, National Taiwan University, Graduate Institute of Electrical Engineering, ROC academic year 107.

In contemporary society, the rapid expansion of data volume has driven the development of data analysis techniques, which make decision automation possible. Association analysis is an important task in data analysis. Its goal is to find all co-occurrence relationships in a transactional dataset, i.e., frequent itemsets or confident association rules. An association rule consists of two parts, the antecedent and the consequent, and states that if the antecedent occurs, the consequent is also likely to occur. Confident association rules are those with a higher probability, and they help people discover patterns and develop corresponding strategies. The process of data analysis can be summarized at a high level as a set of queries, where each query is a real-valued function of the dataset. However, without any restriction or protection, accessing the dataset to answer the queries may disclose individuals' private information. Techniques for privacy-preserving data analysis have therefore received increasing attention, and there is strong demand for a definition of privacy that is strong, mathematically rigorous, and consistent with social cognition. Differential privacy is such a definition: it manages and quantifies the privacy risks faced by individuals in data analysis through a parameter called the privacy level. In general, differential privacy can be achieved by adding carefully calibrated noise to the query results. In this thesis, we focus on differentially private association rule mining with multiple support thresholds and address the challenges in state-of-the-art work. We propose and implement the DPARM algorithm, which uses multiple support thresholds to reduce the number of candidate itemsets while reflecting the real nature of the items, and uses random truncation and uniform partition to lower the dimensionality of the dataset. Both help to reduce the sensitivity of the queries, thereby reducing the scale of the required noise and improving the utility of the mining results. We also stabilize the noise scale by adaptively allocating the privacy levels, and we bound the overall privacy loss. In addition, we prove that the DPARM algorithm satisfies ex post differential privacy and verify its utility through a series of experiments.
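
The general idea mentioned in this abstract, adding noise scaled to a query's sensitivity, can be illustrated with the standard Laplace mechanism applied to a single support count. This is a minimal sketch and not the DPARM algorithm: the function names are hypothetical, the sensitivity of 1 assumes that adding or removing one transaction changes the count by at most one, and DPARM's truncation, partitioning, and adaptive budget allocation are not modeled.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_support(count, epsilon, sensitivity=1.0, rng=None):
    """Return the count plus Laplace noise with scale sensitivity / epsilon."""
    rng = rng or random.Random()
    return count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demonstration repeatable
    print(noisy_support(100, epsilon=1.0, rng=rng))
```

Lowering the sensitivity or raising epsilon shrinks the noise scale, which is why the abstract treats reduced query sensitivity as a route to better utility.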

43

Wu, Chieh-Ming, and 吳界明. "Data mining for generalized association rules and privacy preserving." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/85543535661382633122.

Abstract:

Ph.D. dissertation, National Yunlin University of Science and Technology, Doctoral Program, Graduate School of Engineering Science and Technology, ROC academic year 99.

Data mining is an analysis method used to extract unknown, latent, and usable information hidden in large datasets. In the last few years, data mining models and methods have made considerable progress, and association rule mining is the most often applied. Association rule research has focused on how to discover single-level association rules effectively in large datasets. In recent years, more and more researchers have started to study multiple-level association rules, which are valuable in a modern knowledge-economy society. Enterprises must make flexible use of deeper and more detailed association rules to help managers make decisions in a short time. To reach this objective, this study proposes an efficient data structure, the Frequent Closed Enumerable Table (FCET), to speed up generalized association rule mining. In addition, as enterprise globalization accelerates, the collection, processing, and application of much sensitive personal information falls under personal privacy protection law. The databases managed by enterprises are also growing considerably and store large amounts of sensitive personal data and confidential corporate information. Improper access to such a database creates a security problem and can disclose restricted company secrets and personal data. If the problem is not handled carefully, it may reduce the competitiveness of the enterprise. This study proposes an effective data structure that takes privacy preservation into account during the mining process, and it gives a thorough discussion of issues related to data mining and privacy preservation. A greedy algorithm that considers the hiding cost is proposed. The algorithm includes protection mechanisms for the sanitization procedure and the exposure procedure. It ensures both privacy preservation for public content and the extraction of useful information. Moreover, after sanitization, it effectively balances privacy preservation and knowledge extraction.

44

Wang, Wei-Tse, and 王威澤. "A native XML database association rules mining method and a database compression approach using association rules mining." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/47594741243843634353.

Abstract:

Master's thesis, Chaoyang University of Technology, Master's Program, Department of Information Management, ROC academic year 91.

With the advancement of technology and the popularity of enterprise information systems, ever greater amounts of data are generated every day. To store and access these data properly, database applications have become crucial. The main task of data mining is to help enterprises make decisions by extracting useful information from large amounts of complicated stored data, which is why data mining has recently received more attention than ever. The growing amount of data also requires more storage media, so an efficient data compression technique is a sensible way to reduce cost. This thesis presents research related to data mining. First, unlike data mining work based primarily on relational databases, we propose a data mining method for native XML databases that can extract knowledge from them. Second, we propose semantic association rules: a rule extracted by the data mining method is converted by the proposed procedure into a semantic association rule, which makes it more legible and easier for users to consult. Finally, we propose a database compression approach that uses association rule mining. The method compresses the database to reduce storage cost, and the association rule mining finds the associations among the data. These association rules can further serve as a reference for organizations when planning their strategy.

45

Yang, Nai-Hua, and 楊乃樺. "Mining Multidimensional Association Rules for Market Segmentation." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/78f64c.

Abstract:

Master's thesis, Ming Chuan University, Master's Program, Department of Information Management, ROC academic year 95.

Today's market is customer-oriented, and enterprises need to give every customer appropriate service. More precise information enables more accurate and profitable strategies. Association rules describe correlations between data items in large amounts of data. A further step is to discover the relationship between customer features and customer purchasing behavior. This paper proposes a new method for mining multidimensional association rules for market segmentation. We use conditional databases to discover multidimensional association rules, avoid scanning the target database many times, and incorporate a clustering method to discretize numerical attributes automatically. Our method analyzes CRM data from two points of view: the product combinations associated with different customer features, and the customer features associated with the products customers purchase. These two points of view help decision makers establish customer profiles, segment the market, and formulate strategies more accurately.

46

Jen-Feng, Li. "Mining Association Rules in Time-series Databases." 2005. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-2507200511203800.

47

Ying-Hsiang, Wen. "Parallel Hardware Architecture for Mining Association Rules." 2006. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-2507200610192300.

48

Wang, Hsing-Kai, and 王星凱. "An Efficient Distributed Association Rules Mining System." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/75735815815407828838.

Abstract:

Master's thesis, Tamkang University, Department of Information Management, ROC academic year 91.

Association rule mining can help enterprises capture consumer behavior and develop effective marketing strategies. However, transaction databases grow every day, and obtaining timely mining results becomes a serious problem. In this paper, we propose an Effective Distributed Association rule Mining System, EDAMS, to cope with this problem. Unlike other distributed mining systems, EDAMS uses a dedicated node as a data server to collect the data exchanged among nodes. Point-to-point broadcasts are thus avoided, and the number of messages exchanged is reduced from O(n^2) to O(n). In addition, to reduce the total message volume, the DHP algorithm [2] is used as the base algorithm to reduce the number of candidate 2-itemsets. According to our experimental results, EDAMS achieves a steadily increasing speedup ratio on 100,000 to 700,000 transactions. The speedup ratio is also superior to those reported in previous work [7][9]. This clearly demonstrates the effectiveness of our system.
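
The reduction from O(n^2) to O(n) messages can be made concrete with a small counting sketch. The counting model here is an assumption rather than the thesis's protocol: in the all-to-all case each of n mining nodes sends its local counts to every other node once per round, and in the server case each node sends one message to a separate data server and receives one merged reply. The function names are hypothetical.

```python
def all_to_all_messages(n):
    """Messages per round when each of n nodes sends to every other node."""
    return n * (n - 1)


def server_messages(n):
    """Messages per round when n nodes each send to and hear back from one server."""
    return 2 * n


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, all_to_all_messages(n), server_messages(n))
```

Under this model, eight nodes exchange 56 messages per round point to point but only 16 through the server, and the gap widens quadratically as nodes are added.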

49

Li, Jian-Ming, and 李建明. "Mining Quantitative Association Rules in Disease Databases." Thesis, 2000. http://ndltd.ncl.edu.tw/handle/76433798386921713653.

Abstract:

Master's thesis, National Taiwan University, Graduate Institute of Electrical Engineering, ROC academic year 88.

With the computerization of medical information and the popularity of medical databases, the amount of data is growing more rapidly than ever. A great deal of known and unknown information must be hidden in these data. Traditional statistical approaches are not suited to processing such large amounts of data. Among emerging "data mining" techniques, "association rules" focus on the relationships among data items. Association rule mining was first introduced to find patterns among the items a customer may buy in a supermarket, and it can be extended to mine association rules from a relational database. A relational database has two kinds of attributes: quantitative and categorical. In this thesis, we introduce a statistical method to finely partition the values of a quantitative attribute into a set of intervals. Unlike the previous method, which partitions the range of an attribute into equal parts, our method is based on the observed data distribution and uses the mean and standard deviation of each attribute as the two partition parameters. This choice reflects the bias of the database and therefore improves the effectiveness of analysis on highly skewed data. To demonstrate the feasibility of our method, we combine two effective rule-mining algorithms, the DHP algorithm and the Boolean algorithm, which lets us mine association rules from a relational database. Finally, we apply this approach to two disease databases, present the experimental results, and compare them with previous methods. The results show that our method generates less noise and is easier to execute.
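
A partition driven by the mean and standard deviation, as described in this abstract, might look like the following sketch. The abstract does not give the exact cut-point rule, so placing cuts at the mean plus fixed multiples of the standard deviation (the `ks` parameter) is an assumption, as are the function names and sample values.

```python
import bisect
import statistics


def cut_points(values, ks=(-1.0, 0.0, 1.0)):
    """Place interval boundaries at mean + k * standard deviation for each k in ks."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [mu + k * sigma for k in ks]


def interval_index(x, cuts):
    """Return the index of the interval that x falls into, given sorted cut points."""
    return bisect.bisect_right(cuts, x)


if __name__ == "__main__":
    # Hypothetical attribute values, used only to exercise the functions.
    readings = [2, 4, 4, 4, 5, 5, 7, 9]
    cuts = cut_points(readings)
    print(cuts, [interval_index(x, cuts) for x in readings])
```

Unlike equal-width binning, the boundaries move with the data, so a skewed attribute gets finer intervals where most of its values lie, which is the effect the abstract attributes to its method.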

50

Wen, Ying-Hsiang, and 溫英翔. "Parallel Hardware Architecture for Mining Association Rules." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/84140141294459812790.

Abstract:

Master's thesis, National Taiwan University, Graduate Institute of Electrical Engineering, ROC academic year 94.

Generally speaking, to implement Apriori-based association rule mining in hardware, one has to load candidate itemsets and a database into the hardware. Since the capacity of the hardware architecture is fixed, if the number of candidate itemsets or the number of items in the database is larger than the hardware capacity, the items are loaded into the hardware separately. The time complexity is proportional to the number of candidate itemsets multiplied by the number of items in the database. Too many candidate itemsets and a large database would create a performance bottleneck. In this thesis, we propose a HAsh-based and PiPelIned architecture (abbreviated as HAPPI) for hardware-enhanced association rule mining. We apply the pipeline methodology in the HAPPI architecture to compare itemsets with the database and to collect useful information for reducing the number of candidate itemsets and the number of items in the database simultaneously. When the database is fed into the hardware, candidate itemsets are compared with the items in the database to find frequent itemsets. At the same time, trimming information is collected from each transaction. In addition, itemsets are generated from transactions and hashed into a hash table. The trimming information and the hash table enable us to reduce the number of items in the database and the number of candidate itemsets. We can therefore effectively reduce how often the database is loaded into the hardware. As such, HAPPI solves the bottleneck problem in Apriori-based hardware schemes. We also derive some properties to investigate the performance of this hardware implementation. As shown by the experimental results, HAPPI significantly outperforms the previous hardware approach in terms of execution cycles.
