Log in

Relevant bibliographies by topics / Graph Mining / Journal articles

To see the other types of publications on this topic, follow the link: Graph Mining.

Journal articles on the topic 'Graph Mining'

Author: Grafiati

Published: 4 June 2021

Last updated: 22 February 2025

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 50 journal articles for your research on the topic 'Graph Mining.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Et. al., M. Sailaja,. "Ensemble Distributed Search-FSGM-CRD Compressed Cache Algorithm for Large Datasets." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 2 (April 11, 2021): 2854–58. http://dx.doi.org/10.17762/turcomat.v12i2.2317.

Full text

Abstract:

Frequent sub-graph mining (FSM) is a alternative of frequent pattern mining where patterns are graphs. Among the entities, graph based representation is utilized to effectively represent the complex relationships. Various graph mining techniques are developed from the past many years, most the challenging tasks in graph mining is frequent sub-graph mining (FSM). In FSM many of the existing algorithms consider only graph based structure, the relationships based on entities involved and strength is not considered. It is very important to handle the complex and huge data. There is very huge demand in distributed computational approaches. In this paper, An Ensemble Distributed Search-FSGM-CRD Compressed Cache Algorithm is developed and implemented to find frequent sub graphs

APA, Harvard, Vancouver, ISO, and other styles

2

Chakrabarti, Deepayan, and Christos Faloutsos. "Graph mining." ACM Computing Surveys 38, no. 1 (June 29, 2006): 2. http://dx.doi.org/10.1145/1132952.1132954.

Full text

APA, Harvard, Vancouver, ISO, and other styles

3

Ergenç Bostanoğlu, Belgin, and Nourhan Abuzayed. "Dynamic frequent subgraph mining algorithms over evolving graphs: a survey." PeerJ Computer Science 10 (October 8, 2024): e2361. http://dx.doi.org/10.7717/peerj-cs.2361.

Full text

Abstract:

Frequent subgraph mining (FSM) is an essential and challenging graph mining task used in several applications of the modern data science. Some of the FSM algorithms have the objective of finding all frequent subgraphs whereas some of the algorithms focus on discovering frequent subgraphs approximately. On the other hand, modern applications employ evolving graphs where the increments are small graphs or stream of nodes and edges. In such cases, FSM task becomes more challenging due to growing data size and complexity of the base algorithms. Recently we see frequent subgraph mining algorithms designed for dynamic graph data. However, there is no comparative review of the dynamic subgraph mining algorithms focusing on the discovery of frequent subgraphs over evolving graph data. This article focuses on the characteristics of dynamic frequent subgraph mining algorithms over evolving graphs. We first introduce and compare dynamic frequent subgraph mining algorithms; trying to highlight their attributes as increment type, graph type, graph representation, internal data structure, algorithmic approach, programming approach, base algorithm and output type. Secondly, we introduce and compare the approximate frequent subgraph mining algorithms for dynamic graphs with additional attributes as their sampling strategy, data in the sample, statistical guarantees on the sample and their main objective. Finally, we highlight research opportunities in this specific domain from our perspective. Overall, we aim to introduce the research area of frequent subgraph mining over evolving graphs with the hope that this can serve as a reference and inspiration for the researchers of the field.

APA, Harvard, Vancouver, ISO, and other styles

4

Kashyap, Navneet Kr, B. K. Pandey, H. L. Mandoria, and Ashok Kumar. "Graph Mining Using gSpan: Graph-Based Substructure Pattern Mining." International Journal of Applied Research on Information Technology and Computing 7, no. 2 (2016): 132. http://dx.doi.org/10.5958/0975-8089.2016.00014.2.

Full text

APA, Harvard, Vancouver, ISO, and other styles

5

Chen, Yuzhong, Zhenyu Liu, Yulin Liu, and Chen Dong. "Distributed Attack Modeling Approach Based on Process Mining and Graph Segmentation." Entropy 22, no. 9 (September 14, 2020): 1026. http://dx.doi.org/10.3390/e22091026.

Full text

Abstract:

Attack graph modeling aims to generate attack models by investigating attack behaviors recorded in intrusion alerts raised in network security devices. Attack models can help network security administrators discover an attack strategy that intruders use to compromise the network and implement a timely response to security threats. However, the state-of-the-art algorithms for attack graph modeling are unable to obtain a high-level or global-oriented view of the attack strategy. To address the aforementioned issue, considering the similarity between attack behavior and workflow, we employ a heuristic process mining algorithm to generate the initial attack graph. Although the initial attack graphs generated by the heuristic process mining algorithm are complete, they are extremely complex for manual analysis. To improve their readability, we propose a graph segmentation algorithm to split a complex attack graph into multiple subgraphs while preserving the original structure. Furthermore, to handle massive volume alert data, we propose a distributed attack graph generation algorithm based on Hadoop MapReduce and a distributed attack graph segmentation algorithm based on Spark GraphX. Additionally, we conduct comprehensive experiments to validate the performance of the proposed algorithms. The experimental results demonstrate that the proposed algorithms achieve considerable improvement over comparative algorithms in terms of accuracy and efficiency.

APA, Harvard, Vancouver, ISO, and other styles

6

Acosta-Mendoza, Niusvel, Andrés Gago-Alonso, Jesús Ariel Carrasco-Ochoa, José Fco Martínez-Trinidad, and José E. Medina-Pagola. "Extension of Canonical Adjacency Matrices for Frequent Approximate Subgraph Mining on Multi-Graph Collections." International Journal of Pattern Recognition and Artificial Intelligence 31, no. 08 (May 9, 2017): 1750025. http://dx.doi.org/10.1142/s0218001417500252.

Full text

Abstract:

Into the data mining field, frequent approximate subgraph (FAS) mining has become an important technique with a broad spectrum of real-life applications. This fact is because several real-life phenomena can be modeled by graphs. In the literature, several algorithms have been reported for mining frequent approximate patterns on simple-graph collections; however, there are applications where more complex data structures, as multi-graphs, are needed for modeling the problem. But to the best of our knowledge, there is no FAS mining algorithm designed for dealing with multi-graphs. Therefore, in this paper, a canonical form (CF) for simple-graphs is extended to allow representing multi-graphs and a state-of-the-art algorithm for FAS mining is also extended for processing multi-graph collections by using the extended CF. Our experiments over different synthetic and real-world multi-graph collections show that the proposed algorithm has a good performance in terms of runtime and scalability. Additionally, we show the usefulness of the patterns computed by our algorithm in an image classification problem where images are represented as multi-graphs.

APA, Harvard, Vancouver, ISO, and other styles

7

Rao, Bapuji, and Sarojananda Mishra. "A New Approach to Community Graph Partition Using Graph Mining Techniques." International Journal of Rough Sets and Data Analysis 4, no. 1 (January 2017): 75–94. http://dx.doi.org/10.4018/ijrsda.2017010105.

Full text

Abstract:

Knowledge extraction is very much possible from the community graph using graph mining techniques. The authors have studied the related definitions of graph partition in terms of both mathematical as well as computational aspects. To derive knowledge from a particular sub-community graph of a large community graph, the authors start partitioning the large community graph into smaller sub-community graphs. Thus, the knowledge extraction from the sub-community graph becomes easier and faster. The proposed approach of partition is done by detection of edges among the community members of dissimilar community. By studying existing techniques followed by different researchers, the authors propose a new and simple algorithm for partitioning the community graph into sub-community graphs using graph mining techniques. Finally, the authors have considered a benchmark dataset as example which verifies the strength and easiness of the proposed algorithm.

APA, Harvard, Vancouver, ISO, and other styles

8

Sanei-Mehri, Seyed-Vahid, Apurba Das, Hooman Hashemi, and Srikanta Tirthapura. "Mining Largest Maximal Quasi-Cliques." ACM Transactions on Knowledge Discovery from Data 15, no. 5 (June 26, 2021): 1–21. http://dx.doi.org/10.1145/3446637.

Full text

Abstract:

Quasi-cliques are dense incomplete subgraphs of a graph that generalize the notion of cliques. Enumerating quasi-cliques from a graph is a robust way to detect densely connected structures with applications in bioinformatics and social network analysis. However, enumerating quasi-cliques in a graph is a challenging problem, even harder than the problem of enumerating cliques. We consider the enumeration of top- k degree-based quasi-cliques and make the following contributions: (1) we show that even the problem of detecting whether a given quasi-clique is maximal (i.e., not contained within another quasi-clique) is NP-hard. (2) We present a novel heuristic algorithm K ernel QC to enumerate the k largest quasi-cliques in a graph. Our method is based on identifying kernels of extremely dense subgraphs within a graph, followed by growing subgraphs around these kernels, to arrive at quasi-cliques with the required densities. (3) Experimental results show that our algorithm accurately enumerates quasi-cliques from a graph, is much faster than current state-of-the-art methods for quasi-clique enumeration (often more than three orders of magnitude faster), and can scale to larger graphs than current methods.

APA, Harvard, Vancouver, ISO, and other styles

9

Rehman, Saif Ur, Sohail Asghar, Yan Zhuang, and Simon Fong. "Performance Evaluation of Frequent Subgraph Discovery Techniques." Mathematical Problems in Engineering 2014 (2014): 1–6. http://dx.doi.org/10.1155/2014/869198.

Full text

Abstract:

Due to rapid development of the Internet technology and new scientific advances, the number of applications that model the data as graphs increases, because graphs have highly expressive power to model a complicated structure. Graph mining is a well-explored area of research which is gaining popularity in the data mining community. A graph is a general model to represent data and has been used in many domains such as cheminformatics, web information management system, computer network, and bioinformatics, to name a few. In graph mining the frequent subgraph discovery is a challenging task. Frequent subgraph mining is concerned with discovery of those subgraphs from graph dataset which have frequent or multiple instances within the given graph dataset. In the literature a large number of frequent subgraph mining algorithms have been proposed; these included FSG, AGM, gSpan, CloseGraph, SPIN, Gaston, and Mofa. The objective of this research work is to perform quantitative comparison of the above listed techniques. The performances of these techniques have been evaluated through a number of experiments based on three different state-of-the-art graph datasets. This novel work will provide base for anyone who is working to design a new frequent subgraph discovery technique.

APA, Harvard, Vancouver, ISO, and other styles

10

Kemmar, Amina, Yahia Lebbah, and Samir Loudni. "Interval graph mining." International Journal of Data Mining, Modelling and Management 10, no. 1 (2018): 1. http://dx.doi.org/10.1504/ijdmmm.2018.089629.

Full text

APA, Harvard, Vancouver, ISO, and other styles

11

Lebbah, Yahia, Amina Kemmar, and Samir Loudni. "Interval graph mining." International Journal of Data Mining, Modelling and Management 10, no. 1 (2018): 1. http://dx.doi.org/10.1504/ijdmmm.2018.10010657.

Full text

APA, Harvard, Vancouver, ISO, and other styles

12

Kang, U., and Christos Faloutsos. "Big graph mining." ACM SIGKDD Explorations Newsletter 14, no. 2 (April 30, 2013): 29–36. http://dx.doi.org/10.1145/2481244.2481249.

Full text

APA, Harvard, Vancouver, ISO, and other styles

13

Zhang, Quanshi, Xuan Song, Yu Yang, Haotian Ma, and Ryosuke Shibasaki. "Visual graph mining for graph matching." Computer Vision and Image Understanding 178 (January 2019): 16–29. http://dx.doi.org/10.1016/j.cviu.2018.11.002.

Full text

APA, Harvard, Vancouver, ISO, and other styles

14

Alam, Md Tanvir, Chowdhury Farhan Ahmed, and Md Samiullah. "A Vertex-extension based Algorithm for Frequent Pattern Mining from Graph Databases." Dhaka University Journal of Applied Science and Engineering 7, no. 1 (February 1, 2023): 58–65. http://dx.doi.org/10.3329/dujase.v7i1.62887.

Full text

Abstract:

Frequent pattern mining is a core problem in data mining. Algorithms for frequent pattern mining have been proposed for itemsets, sequences, and graphs. However, existing graph mining frameworks follow an edge-growth approach to building patterns which limits many applications. Motivated by real-life problems, in this work, we define a novel graph mining framework that incorporates vertex-based extensions along with the edge-growth approach. We also propose an efficient algorithm for mining frequent subgraphs. To deal with the exploding search space, we introduce a canonical labeling technique for isomorphic candidates as well as downward closure property-based search space pruning. We present an experimental analysis of our algorithm on real-life benchmark graph datasets to demonstrate the performance in terms of runtime. DUJASE Vol. 7(1) 58-65, 2022 (January)

APA, Harvard, Vancouver, ISO, and other styles

15

Hussein, Rana, Alberto Lerner, Andre Ryser, Lucas David Bürgi, Albert Blarer, and Philippe Cudre-Mauroux. "GraphINC: Graph Pattern Mining at Network Speed." Proceedings of the ACM on Management of Data 1, no. 2 (June 13, 2023): 1–28. http://dx.doi.org/10.1145/3589329.

Full text

Abstract:

Graph Pattern Mining (GPM) is a class of algorithms that identifies given shapes within a graph, e.g., cliques of a certain size. Any area of a graph can contain a shape of interest, but in real-world graphs, these shapes tend to be concentrated in areas deemed skewed. Because mining skewed areas can dominate GPM computations, the overwhelming majority of state-of-the-art GPM techniques break such areas into many small parts and load balance them across servers. This paper takes a diametrically opposite approach: we suggest a framework that concentrates rather than divides the skewed areas. Our framework, called GraphINC, relies on two key innovations. First, it introduces a new graph partitioning scheme capable of separating the skewed area from the rest of the graph. Second, it offloads the skewed part onto a new class of hardware accelerator, a programmable network switch. We implemented our framework to leverage a commercial 100 Gbps switch and obtained results 6.5 to 52.4× faster thanks to our novel offloading technique.

APA, Harvard, Vancouver, ISO, and other styles

16

Lee, Kwangyon, Haemin Jung, June Seok Hong, and Wooju Kim. "Learning Knowledge Using Frequent Subgraph Mining from Ontology Graph Data." Applied Sciences 11, no. 3 (January 20, 2021): 932. http://dx.doi.org/10.3390/app11030932.

Full text

Abstract:

In many areas, vast amounts of information are rapidly accumulating in the form of ontology-based knowledge graphs, and the use of information in these forms of knowledge graphs is becoming increasingly important. This study proposes a novel method for efficiently learning frequent subgraphs (i.e., knowledge) from ontology-based graph data. An ontology-based large-scale graph is decomposed into small unit subgraphs, which are used as the unit to calculate the frequency of the subgraph. The frequent subgraphs are extracted through candidate generation and chunking processes. To verify the usefulness of the extracted frequent subgraphs, the methodology was applied to movie rating prediction. Using the frequent subgraphs as user profiles, the graph similarity between the rating graph and new item graph was calculated to predict the rating. The MovieLens dataset was used for the experiment, and a comparison showed that the proposed method outperformed other widely used recommendation methods. This study is meaningful in that it proposed an efficient method for extracting frequent subgraphs while maintaining semantic information and considering scalability in large-scale graphs. Furthermore, the proposed method can provide results that include semantic information to serve as a logical basis for rating prediction or recommendation, which existing methods are unable to provide.

APA, Harvard, Vancouver, ISO, and other styles

17

Cook, D. J., and L. B. Holder. "Graph-based data mining." IEEE Intelligent Systems 15, no. 2 (March 2000): 32–41. http://dx.doi.org/10.1109/5254.850825.

Full text

APA, Harvard, Vancouver, ISO, and other styles

18

Rao, Bapuji. "An Approach to Opinion Mining in Community Graph Using Graph Mining Techniques." International Journal of Synthetic Emotions 9, no. 2 (July 2018): 94–110. http://dx.doi.org/10.4018/ijse.2018070106.

Full text

Abstract:

Opinions are the central theme to almost all human activities, as well the key influencers of our behaviours. Opinions related to sentiments, evaluations, attitudes, and emotions are the features of studying of opinion mining. It is important to study peoples of various communities sentiments about the schemes implemented by the government agencies as well as NGOs. The opinion mining is about the opinions of various communities of villages of a Panchayat about various social schemes implemented by the government of India. This article proposes an algorithm for opinion mining in a community graph for various social schemes run by the Panchayat using graph mining techniques. The algorithm has been implemented in C++ programming language.

APA, Harvard, Vancouver, ISO, and other styles

19

Ortega-Guzmán, Víctor H., Luis Gutiérrez-Preciado, Francisco Cervantes, and Mildreth Alcaraz-Mejia. "A Methodology for Knowledge Discovery in Labeled and Heterogeneous Graphs." Applied Sciences 14, no. 2 (January 18, 2024): 838. http://dx.doi.org/10.3390/app14020838.

Full text

Abstract:

Graph mining has emerged as a significant field of research with applications spanning multiple domains, including marketing, corruption analysis, business, and politics. The exploration of knowledge within graphs has garnered considerable attention due to the exponential growth of graph-modeled data and its potential in applications where data relationships are a crucial component, and potentially being even more important than the data themselves. However, the increasing use of graphs for data storing and modeling presents unique challenges that have prompted advancements in graph mining algorithms, data modeling and storage, query languages for graph databases, and data visualization techniques. Despite there being various methodologies for data analysis, they predominantly focus on structured data and may not be optimally suited for highly connected data. Accordingly, this work introduces a novel methodology specifically tailored for knowledge discovery in labeled and heterogeneous graphs (KDG), and it presents three case studies demonstrating its successful application in addressing various challenges across different application domains.

APA, Harvard, Vancouver, ISO, and other styles

20

Huang, Jinchao, Zhipu Xie, Han Zhang, Bin Yang, Chong Di, and Runhe Huang. "Enhancing Knowledge-Aware Recommendation with Dual-Graph Contrastive Learning." Information 15, no. 9 (September 2, 2024): 534. http://dx.doi.org/10.3390/info15090534.

Full text

Abstract:

Incorporating knowledge graphs as auxiliary information to enhance recommendation systems can improve the representations learning of users and items. Recommendation methods based on knowledge graphs can introduce user–item interaction learning into the item graph, focusing only on learning the node vector representations within a single graph; alternatively, they can treat user–item interactions and item graphs as two separate graphs and learn from each graph individually. Learning from two graphs has natural advantages in exploring original information and interaction information, but faces two main challenges: (1) in complex graph connection scenarios, how to adequately mine the self-information of each graph, and (2) how to merge interaction information from the two graphs while ensuring that user–item interaction information predominates. Existing methods do not thoroughly explore the simultaneous mining of self-information from both graphs and effective interaction information, leading to the loss of valuable insights. Considering the success of contrastive learning in mining self-information and auxiliary information, this paper proposes a dual-graph contrastive learning recommendation method based on knowledge graphs (KGDC) to explore a more accurate representations of users and items in recommendation systems based on external knowledge graphs. In the learning process within the self-graph, KGDC strengthens and represents the information of different connecting edges in both graphs, and extracts the existing information more fully. In interactive information learning, KGDC reinforces the interaction relationship between users and items in the external knowledge graph, realizing the leading role of the main task. We conducted a series of experiments on three standard datasets, and the results show that the proposed method can achieve better results.

APA, Harvard, Vancouver, ISO, and other styles

21

Rao, Bapuji, and Sarojananda Mishra. "Compression of community graph using graph mining techniques." International Journal of Hybrid Intelligent Systems 15, no. 1 (April 17, 2019): 55–65. http://dx.doi.org/10.3233/his-190260.

Full text

APA, Harvard, Vancouver, ISO, and other styles

22

Wu, Jia, Zhibin Hong, Shirui Pan, Xingquan Zhu, Zhihua Cai, and Chengqi Zhang. "Multi-graph-view subgraph mining for graph classification." Knowledge and Information Systems 48, no. 1 (September 21, 2015): 29–54. http://dx.doi.org/10.1007/s10115-015-0872-1.

Full text

APA, Harvard, Vancouver, ISO, and other styles

23

Gao, Shang, and Mei Mei Li. "Research of Data Graph Mining Based on Telecommunication Customers." Applied Mechanics and Materials 443 (October 2013): 402–6. http://dx.doi.org/10.4028/www.scientific.net/amm.443.402.

Full text

Abstract:

With the rapid development of the number of mobile phone users has accumulated a large number of graph data, graph data mining has gradually become a hot area of research. Traditional data such as clustering, classification, frequent pattern mining gradually extended to the field of graph data mining research. Introduced at this stage graph data mining technology research progress, summarizes the characteristics of the graphical data mining, practical significance, the main problem, and scenarios to discuss and forecast chart data, especially research on uncertain graph data become trends and hot spots.

APA, Harvard, Vancouver, ISO, and other styles

24

Wang, Weiya, Geng Yang, Lin Bao, Ke Ma, Hao Zhou, and Yunlu Bai. "Travel Trajectory Frequent Pattern Mining Based on Differential Privacy Protection." Wireless Communications and Mobile Computing 2021 (August 5, 2021): 1–14. http://dx.doi.org/10.1155/2021/6379530.

Full text

Abstract:

Now, many application services based on location data have brought a lot of convenience to people’s daily life. However, publishing location data may divulge individual sensitive information. Because the location records about location data may be discrete in the database, some existing privacy protection schemes are difficult to protect location data in data mining. In this paper, we propose a travel trajectory data record privacy protection scheme (TMDP) based on differential privacy mechanism, which employs the structure of a trajectory graph model on location database and frequent subgraph mining based on weighted graph. Time series is introduced into the location data; the weighted trajectory model is designed to obtain the travel trajectory graph database. We upgrade the mining of location data to the mining of frequent trajectory graphs, which can discover the relationship of location data from the database and protect location data mined. In particular, to improve the identification efficiency of frequent trajectory graphs, we design a weighted trajectory graph support calculation algorithm based on canonical code and subgraph structure. Moreover, to improve the data utility under the premise of protecting user privacy, we propose double processes of adding noises to the subgraph mining process by the Laplace mechanism and selecting final data by the exponential mechanism. Through formal privacy analysis, we prove that our TMDP framework satisfies ε -differential privacy. Compared with the other schemes, the experiments show that the data availability of the proposed scheme is higher and the privacy protection of the scheme is effective.

APA, Harvard, Vancouver, ISO, and other styles

25

Mustafa, Rashed, Mohammad Sultan Mahmud, and Mahir Shadid. "Random Sampling Method of Large-Scale Graph Data Classification." Jurnal Kejuruteraan 36, no. 2 (March 30, 2024): 525–32. http://dx.doi.org/10.17576/jkukm-2024-36(2)-14.

Full text

Abstract:

Graph data appears in broad real-world applications in modelling complex objects in big data. Effective analysis of graph data provides a deeper understanding of the data in data mining tasks, including classification, clustering, prediction, and recommendation systems. Mining a large number of graphs becomes a challenging task because state-of-the-art methods are not scalable due to the memory limit. To address this issue, we propose a novel approximate random sampling method for large-scale graph data classification. In this approach, we applied a representation method to encode each graph as a record of a vector string and a set of graphs as a set of N records in a file. Then, we partition the set of records into disjoint subsets of data blocks, making each data block a random sample of the data file. After that, we randomly select a subset of data blocks, each being a random sample of the graph dataset, and compute the different graph property distributions. Since the data blocks in this model are much smaller than the entire data set, it is more efficient to analyze them on a standalone small machine, and multiple data blocks can be analyzed on multiple nodes of the cluster in parallel. Finally, we classified the graphs of data blocks using the SVM algorithm. In experimental evaluation, our proposed method outperformed state-of-the-art graph kernels on graph classification datasets in terms of accuracy.

APA, Harvard, Vancouver, ISO, and other styles

26

Akram, Muhammad, Wieslaw A. Dudek, and M. Murtaza Yousaf. "Regularity in Vague Intersection Graphs and Vague Line Graphs." Abstract and Applied Analysis 2014 (2014): 1–10. http://dx.doi.org/10.1155/2014/525389.

Full text

Abstract:

Fuzzy graph theory is commonly used in computer science applications, particularly in database theory, data mining, neural networks, expert systems, cluster analysis, control theory, and image capturing. A vague graph is a generalized structure of a fuzzy graph that gives more precision, flexibility, and compatibility to a system when compared with systems that are designed using fuzzy graphs. In this paper, we introduce the notion of vague line graphs, and certain types of vague line graphs and present some of their properties. We also discuss an example application of vague digraphs.

APA, Harvard, Vancouver, ISO, and other styles

27

Güvenoglu, Büsra, and Belgin Ergenç Bostanoglu. "A qualitative survey on frequent subgraph mining." Open Computer Science 8, no. 1 (December 1, 2018): 194–209. http://dx.doi.org/10.1515/comp-2018-0018.

Full text

Abstract:

AbstractData mining is a popular research area that has been studied by many researchers and focuses on finding unforeseen and important information in large databases. One of the popular data structures used to represent large heterogeneous data in the field of data mining is graphs. So, graph mining is one of the most popular subdivisions of data mining. Subgraphs that are more frequently encountered than the user-defined threshold in a database are called frequent subgraphs. Frequent subgraphs in a database can give important information about this database. Using this information, data can be classified, clustered and indexed. The purpose of this survey is to examine frequent subgraph mining algorithms (i) in terms of frequent subgraph discovery process phases such as candidate generation and frequency calculation, (ii) categorize the algorithms according to their general attributes such as input type, dynamicity of graphs, result type, algorithmic approach they are based on, algorithmic design and graph representation as well as (iii) to discuss the performance of algorithms in comparison to each other and the challenges faced by the algorithms recently.

APA, Harvard, Vancouver, ISO, and other styles

28

Chowdhury, Mohammad Ehsan Shahmi, Chowdhury Farhan Ahmed, and Carson K. Leung. "A New Approach for Mining Correlated Frequent Subgraphs." ACM Transactions on Management Information Systems 13, no. 1 (March 31, 2022): 1–28. http://dx.doi.org/10.1145/3473042.

Full text

Abstract:

Nowadays graphical datasets are having a vast amount of applications. As a result, graph mining—mining graph datasets to extract frequent subgraphs—has proven to be crucial in numerous aspects. It is important to perform correlation analysis among the subparts (i.e., elements) of the frequent subgraphs generated using graph mining to observe interesting information. However, the majority of existing works focuses on complexities in dealing with graphical structures, and not much work aims to perform correlation analysis. For instance, a previous work realized in this regard, operated with a very naive raw approach to fulfill the objective, but dealt only on a small subset of the problem. Hence, in this article, a new measure is proposed to aid in the analysis for large subgraphs, mined from various types of graph transactions in the dataset. These subgraphs are immense in terms of their structural composition, and thus parallel the entire set of graphs in real-world. A complete framework for discovering the relations among parts of a frequent subgraph is proposed using our new method. Evaluation results show the usefulness and accuracy of the newly defined measure on real-life graphical datasets.

APA, Harvard, Vancouver, ISO, and other styles

29

Sonawane, Jayashri A., and Dr Swati A. Bhavsar. "Mining Health Examination Records - A Graph Based Approach." International Journal of Trend in Scientific Research and Development Volume-3, Issue-3 (April 30, 2019): 1496–98. http://dx.doi.org/10.31142/ijtsrd22810.

Full text

APA, Harvard, Vancouver, ISO, and other styles

30

Bruno, Francesco, Luigi Palopoli, and Simona E. Rombo. "New Trends in Graph Mining." International Journal of Knowledge Discovery in Bioinformatics 1, no. 1 (January 2010): 81–99. http://dx.doi.org/10.4018/jkdb.2010100206.

Full text

Abstract:

Searching for repeated features characterizing biological data is fundamental in computational biology. When biological networks are under analysis, the presence of repeated modules across the same network (or several distinct ones) is shown to be very relevant. Indeed, several studies prove that biological networks can be often understood in terms of coalitions of basic repeated building blocks, often referred to as network motifs.This work provides a review of the main techniques proposed for motif extraction from biological networks. In particular, main intrinsic difficulties related to the problem are pointed out, along with solutions proposed in the literature to overcome them. Open challenges and directions for future research are finally discussed.

APA, Harvard, Vancouver, ISO, and other styles

31

Nanopoulos, Alexandros, and Yannis Manolopoulos. "Mining patterns from graph traversals." Data & Knowledge Engineering 37, no. 3 (June 2001): 243–66. http://dx.doi.org/10.1016/s0169-023x(01)00008-8.

Full text

APA, Harvard, Vancouver, ISO, and other styles

32

Lee, Ho, Bin Shao, and U. Kang. "Fast graph mining with HBase." Information Sciences 315 (September 2015): 56–66. http://dx.doi.org/10.1016/j.ins.2015.04.016.

Full text

APA, Harvard, Vancouver, ISO, and other styles

33

Xuan, Junyu, Jie Lu, Guangquan Zhang, and Xiangfeng Luo. "Topic Model for Graph Mining." IEEE Transactions on Cybernetics 45, no. 12 (December 2015): 2792–803. http://dx.doi.org/10.1109/tcyb.2014.2386282.

Full text

APA, Harvard, Vancouver, ISO, and other styles

34

Besta, Maciej, Zur Vonarburg-Shmaria, Yannick Schaffner, Leonardo Schwarz, Grzegorz Kwasniewski, Lukas Gianinazzi, Jakub Beranek, et al. "GraphMineSuite." Proceedings of the VLDB Endowment 14, no. 11 (July 2021): 1922–35. http://dx.doi.org/10.14778/3476249.3476252.

Full text

Abstract:

We propose GraphMineSuite (GMS): the first benchmarking suite for graph mining that facilitates evaluating and constructing high-performance graph mining algorithms. First, GMS comes with a benchmark specification based on extensive literature review, prescribing representative problems, algorithms, and datasets. Second, GMS offers a carefully designed software platform for seamless testing of different fine-grained elements of graph mining algorithms, such as graph representations or algorithm subroutines. The platform includes parallel implementations of more than 40 considered baselines, and it facilitates developing complex and fast mining algorithms. High modularity is possible by harnessing set algebra operations such as set intersection and difference, which enables breaking complex graph mining algorithms into simple building blocks that can be separately experimented with. GMS is supported with a broad concurrency analysis for portability in performance insights, and a novel performance metric to assess the throughput of graph mining algorithms, enabling more insightful evaluation. As use cases, we harness GMS to rapidly redesign and accelerate state-of-the-art baselines of core graph mining problems: degeneracy reordering (by >2X), maximal clique listing (by >9×), k -clique listing (by up to 1.1×), and subgraph isomorphism (by 2.5×), also obtaining better theoretical performance bounds.

APA, Harvard, Vancouver, ISO, and other styles

35

Zhu, Shijie, Jianxin Li, Hao Peng, Senzhang Wang, and Lifang He. "Adversarial Directed Graph Embedding." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 5 (May 18, 2021): 4741–48. http://dx.doi.org/10.1609/aaai.v35i5.16605.

Full text

Abstract:

Node representation learning for directed graphs is critically important to facilitate many graph mining tasks. To capture the directed edges between nodes, existing methods mostly learn two embedding vectors for each node, source vector and target vector. However, these methods learn the source and target vectors separately. For the node with very low indegree or outdegree, the corresponding target vector or source vector cannot be effectively learned. In this paper, we propose a novel Directed Graph embedding framework based on Generative Adversarial Network, called DGGAN. The main idea is to use adversarial mechanisms to deploy a discriminator and two generators that jointly learn each node’s source and target vectors. For a given node, the two generators are trained to generate its fake target and source neighbor nodes from the same underlying distribution, and the discriminator aims to distinguish whether a neighbor node is real or fake. The two generators are formulated into a unified framework and could mutually reinforce each other to learn more robust source and target vectors. Extensive experiments show that DGGAN consistently and significantly outperforms existing state-of-the-art methods across multiple graph mining tasks on directed graphs.

APA, Harvard, Vancouver, ISO, and other styles

36

Buluç, Aydın, and John R. Gilbert. "The Combinatorial BLAS: design, implementation, and applications." International Journal of High Performance Computing Applications 25, no. 4 (May 19, 2011): 496–509. http://dx.doi.org/10.1177/1094342011403516.

Full text

Abstract:

This paper presents a scalable high-performance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of high-performance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse matrix methods. Graph computations are difficult to parallelize using traditional approaches due to their irregular nature and low operational intensity. Many graph computations, however, contain sufficient coarse-grained parallelism for thousands of processors, which can be uncovered by using the right primitives. We describe the parallel Combinatorial BLAS, which consists of a small but powerful set of linear algebra primitives specifically targeting graph and data mining applications. We provide an extensible library interface and some guiding principles for future development. The library is evaluated using two important graph algorithms, in terms of both performance and ease-of-use. The scalability and raw performance of the example applications, using the Combinatorial BLAS, are unprecedented on distributed memory clusters.

APA, Harvard, Vancouver, ISO, and other styles

37

Prakash Dixit, Chandra, and Nilay Khare. "A Survey of Frequent Subgraph Mining Algorithms." International Journal of Engineering & Technology 7, no. 3.8 (July 7, 2018): 58. http://dx.doi.org/10.14419/ijet.v7i3.8.15218.

Full text

Abstract:

Graphs are broadly used data structure. Graphs are very useful in representing/analyzing and processing real world data. Evolving graphs are graphs which are frequently changing in nature. There is either increase or decrease in their size i.e. change in number of edges or/and vertices. Mining is the process done for knowledge discovery in graphs. Detecting specific patterns with their number of repetition more than a predefined threshold in graph is known as frequent subgraph mining or FSM. Real Timed data representing graphs are high volumetric or of very large in size, handling such graphs require processing them with special mechanisms and algorithms. Our review paper discovers present FSM techniques and tries to give their comparative study.

APA, Harvard, Vancouver, ISO, and other styles

38

Koutra, Danai. "The power of summarization in graph mining and learning." Proceedings of the VLDB Endowment 14, no. 13 (September 2021): 3416. http://dx.doi.org/10.14778/3484224.3484238.

Full text

Abstract:

Our ability to generate, collect, and archive data related to everyday activities, such as interacting on social media, browsing the web, and monitoring well-being, is rapidly increasing. Getting the most benefit from this large-scale data requires analysis of patterns it contains, which is computationally intensive or even intractable. Summarization techniques produce compact data representations (summaries) that enable faster processing by complex algorithms and queries. This talk will cover summarization of interconnected data (graphs) [3], which can represent a variety of natural processes (e.g., friendships, communication). I will present an overview of my group's work on bridging the gap between research on summarized network representations and real-world problems. Examples include summarization of massive knowledge graphs for refinement [2] and on-device querying [4], summarization of graph streams for persistent activity detection [1], and summarization within graph neural networks for fast, interpretable classification [5]. I will conclude with open challenges and opportunities for future research.

APA, Harvard, Vancouver, ISO, and other styles

39

Leng, Fangling, Fan Li, Yubin Bao, Tiancheng Zhang, and Ge Yu. "FSM-BC-BSP: Frequent Subgraph Mining Algorithm Based on BC-BSP." Applied Sciences 14, no. 8 (April 9, 2024): 3154. http://dx.doi.org/10.3390/app14083154.

Full text

Abstract:

As graph models become increasingly prevalent in the processing of scientific data, the exploration of effective methods for the mining of meaningful patterns from large-scale graphs has garnered significant research attention. This paper delves into the complexity of frequent subgraph mining and proposes a frequent subgraph mining (FSM) algorithm. This FSM algorithm is developed within a distributed graph iterative system, designed for the Big Cloud (BC) environment of the China Mobile Corp., and is based on the bulk synchronous parallel (BSP) model, named FSM-BC-BSP. Its aim is to address the challenge of mining frequent subgraphs within a single, large graph. This study advocates for the incorporation of a message sending and receiving mechanism to facilitate data sharing across various stages of the frequent subgraph mining algorithm. Additionally, it suggests employing a standard coded subgraph and sending it to the same node for global support calculation on the large graph. The adoption of the rightmost path expansion strategy in generating candidate subgraphs helps to mitigate the occurrence of redundant subgraphs. The use of standard coding ensures the unique identification of subgraphs, thus eliminating the need for isomorphism calculations. Support calculation is executed using the Minimum Image (MNI) measurement method, aligning with the downward closure attribute. The experimental results demonstrate the robust performance of the FSM-BC-BSP algorithm across diverse input datasets and parameter configurations. Notably, the algorithm exhibits exceptional efficacy, particularly in scenarios with low support requirements, showcasing its superior performance under such conditions.

APA, Harvard, Vancouver, ISO, and other styles

40

Xue, Shan, Li Xiong, Zhao Lu, and Jia Wu. "Graph-theoretic node importance mining in world city networks: methods and applications." Information Discovery and Delivery 45, no. 2 (May 15, 2017): 57–65. http://dx.doi.org/10.1108/idd-09-2016-0032.

Full text

Abstract:

Purpose This study aims to review the literature on graph-theoretic mining methods for node importance in both static and dynamic world city networks, which is correspondingly categorised by graph-theoretic node importance mining on network topologies and transmission mechanisms. Design/methodology/approach The authors overview the graph-theoretic indicators of node importance: centrality and power. Then, the methods of graph-theoretic node importance mining on network topologies are assessed with node relevance, centrality- and power-based measurements, heterogeneous fusion and other miscellaneous approaches. The latest progress in transmission mechanisms is also reviewed in this study involving network evolution, node immunisation and robustness in dynamics. Finally, the findings are analysed and future directions in this field are suggested. Findings The method development of node importance mining is driven by complex application-based problems within a transmission mechanism. Fusion measurements, based on centrality and power, are extended by other graph mining techniques in which power has a significant role. In conclusion, the trends of node importance mining focus on power-embedded fusion measurements in the transmission mechanism-based complex applications. Originality/value This is the first systematic literature review of node importance from the view of graph-theoretic mining.

APA, Harvard, Vancouver, ISO, and other styles

41

Zhou, Houquan, Shenghua Liu, Danai Koutra, Huawei Shen, and Xueqi Cheng. "A Provable Framework of Learning Graph Embeddings via Summarization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 4 (June 26, 2023): 4946–53. http://dx.doi.org/10.1609/aaai.v37i4.25621.

Full text

Abstract:

Given a large graph, can we learn its node embeddings from a smaller summary graph? What is the relationship between embeddings learned from original graphs and their summary graphs? Graph representation learning plays an important role in many graph mining applications, but learning em-beddings of large-scale graphs remains a challenge. Recent works try to alleviate it via graph summarization, which typ-ically includes the three steps: reducing the graph size by combining nodes and edges into supernodes and superedges,learning the supernode embedding on the summary graph and then restoring the embeddings of the original nodes. How-ever, the justification behind those steps is still unknown. In this work, we propose GELSUMM, a well-formulated graph embedding learning framework based on graph sum-marization, in which we show the theoretical ground of learn-ing from summary graphs and the restoration with the three well-known graph embedding approaches in a closed form.Through extensive experiments on real-world datasets, we demonstrate that our methods can learn graph embeddings with matching or better performance on downstream tasks.This work provides theoretical analysis for learning node em-beddings via summarization and helps explain and under-stand the mechanism of the existing works.

APA, Harvard, Vancouver, ISO, and other styles

42

Qin, Zheyun, Xiankai Lu, Xiushan Nie, Yilong Yin, and Jianbing Shen. "Exposing the Self-Supervised Space-Time Correspondence Learning via Graph Kernels." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 2 (June 26, 2023): 2110–18. http://dx.doi.org/10.1609/aaai.v37i2.25304.

Full text

Abstract:

Self-supervised space-time correspondence learning is emerging as a promising way of leveraging unlabeled video. Currently, most methods adapt contrastive learning with mining negative samples or reconstruction adapted from the image domain, which requires dense affinity across multiple frames or optical flow constraints. Moreover, video correspondence predictive models require mining more inherent properties in videos, such as structural information. In this work, we propose the VideoHiGraph, a space-time correspondence framework based on a learnable graph kernel. Concerning the video as the spatial-temporal graph, the learning objectives of VideoHiGraph are emanated in a self-supervised manner for predicting unobserved hidden graphs via graph kernel manner. We learn a representation of the temporal coherence across frames in which pairwise similarity defines the structured hidden graph, such that a biased random walk graph kernel along the sub-graph can predict long-range correspondence. Then, we learn a refined representation across frames on the node-level via a dense graph kernel. The self-supervision of the model training is formed by the structural and temporal consistency of the graph. VideoHiGraph achieves superior performance and demonstrates its robustness across the benchmark of label propagation tasks involving objects, semantic parts, keypoints, and instances. Our algorithm implementations have been made publicly available at https://github.com/zyqin19/VideoHiGraph.

APA, Harvard, Vancouver, ISO, and other styles

43

Eddamiri, Siham, Asmaa Benghabrit, and Elmoukhtar Zemmouri. "RDF graph mining for cluster-based theme identification." International Journal of Web Information Systems 16, no. 2 (April 17, 2020): 223–47. http://dx.doi.org/10.1108/ijwis-10-2019-0048.

Full text

Abstract:

Purpose The purpose of this paper is to present a generic pipeline for Resource Description Framework (RDF) graph mining to provide a comprehensive review of each step in the knowledge discovery from data process. The authors also investigate different approaches and combinations to extract feature vectors from RDF graphs to apply the clustering and theme identification tasks. Design/methodology/approach The proposed methodology comprises four steps. First, the authors generate several graph substructures (Walks, Set of Walks, Walks with backward and Set of Walks with backward). Second, the authors build neural language models to extract numerical vectors of the generated sequences by using word embedding techniques (Word2Vec and Doc2Vec) combined with term frequency-inverse document frequency (TF-IDF). Third, the authors use the well-known K-means algorithm to cluster the RDF graph. Finally, the authors extract the most relevant rdf:type from the grouped vertices to describe the semantics of each theme by generating the labels. Findings The experimental evaluation on the state of the art data sets (AIFB, BGS and Conference) shows that the combination of Set of Walks-with-backward with TF-IDF and Doc2vec techniques give excellent results. In fact, the clustering results reach more than 97% and 90% in terms of purity and F-measure, respectively. Concerning the theme identification, the results show that by using the same combination, the purity and F-measure criteria reach more than 90% for all the considered data sets. Originality/value The originality of this paper lies in two aspects: first, a new machine learning pipeline for RDF data is presented; second, an efficient process to identify and extract relevant graph substructures from an RDF graph is proposed. The proposed techniques were combined with different neural language models to improve the accuracy and relevance of the obtained feature vectors that will be fed to the clustering mechanism.

APA, Harvard, Vancouver, ISO, and other styles

44

Jin, Ning, Calvin Young, and Wei Wang. "Discriminative Subgraph Mining for Protein Classification." International Journal of Knowledge Discovery in Bioinformatics 1, no. 3 (July 2010): 36–52. http://dx.doi.org/10.4018/jkdb.2010070103.

Full text

Abstract:

Protein classification can be performed by representing 3-D protein structures by graphs and then classifying the corresponding graphs. One effective way to classify such graphs is to use frequent subgraph patterns as features; however, the effectiveness of using subgraph patterns in graph classification is often hampered by the large search space of subgraph patterns. In this paper, the authors present two efficient discriminative subgraph mining algorithms: COM and GAIA. These algorithms directly search for discriminative subgraph patterns rather than frequent subgraph patterns which can be used to generate classification rules. Experimental results show that COM and GAIA can achieve high classification accuracy and runtime efficiency. Additionally, they find substructures that are very close to the proteins’ actual active sites.

APA, Harvard, Vancouver, ISO, and other styles

45

Nguyen, Lam B. Q., Ivan Zelinka, and Quoc Bao Diep. "CCGraMi: An Effective Method for Mining Frequent Subgraphs in a Single Large Graph." MENDEL 27, no. 2 (December 21, 2021): 90–99. http://dx.doi.org/10.13164/mendel.2021.2.090.

Full text

Abstract:

In modern applications, large graphs are usually applied in the simulation and analysis of large complex systems such as social networks, computer networks, maps, traffic networks. Therefore, graph mining is also an interesting subject attracting many researchers. Among them, frequent subgraph mining in a single large graph is one of the most important branches of graph mining, it is defined as finding all subgraphs whose occurrences in a dataset are greater than or equal to a given frequency threshold. In which, the GraMi algorithm is considered the state of the art approach and many algorithms have been proposed to improve this algorithm. In 2020, the SoGraMi algorithm was proposed to optimize the GraMi algorithm and presented an outstanding performance in terms of runtime and storage space. In this paper, we propose a new algorithm to improve SoGraMi based on connected components, called CCGraMi (Connected Components GraMi). Our experiments on four real datasets (both directed and undirected) show that the proposed algorithm outperforms SoGraMi in terms of running time as well as memory requirements.

APA, Harvard, Vancouver, ISO, and other styles

46

Nasir, Muhammad Anis Uddin, Cigdem Aslay, Gianmarco De Francisci Morales, and Matteo Riondato. "Approximate Mining of Frequent -Subgraph Patterns in Evolving Graphs." ACM Transactions on Knowledge Discovery from Data 15, no. 3 (April 12, 2021): 1–35. http://dx.doi.org/10.1145/3442590.

Full text

Abstract:

“Perhaps he could dance first and think afterwards, if it isn’t too much to ask him.” S. Beckett, Waiting for Godot Given a labeled graph, the collection of -vertex induced connected subgraph patterns that appear in the graph more frequently than a user-specified minimum threshold provides a compact summary of the characteristics of the graph, and finds applications ranging from biology to network science. However, finding these patterns is challenging, even more so for dynamic graphs that evolve over time, due to the streaming nature of the input and the exponential time complexity of the problem. We study this task in both incremental and fully-dynamic streaming settings, where arbitrary edges can be added or removed from the graph. We present TipTap , a suite of algorithms to compute high-quality approximations of the frequent -vertex subgraphs w.r.t. a given threshold, at any time (i.e., point of the stream), with high probability. In contrast to existing state-of-the-art solutions that require iterating over the entire set of subgraphs in the vicinity of the updated edge, TipTap operates by efficiently maintaining a uniform sample of connected -vertex subgraphs, thanks to an optimized neighborhood-exploration procedure. We provide a theoretical analysis of the proposed algorithms in terms of their unbiasedness and of the sample size needed to obtain a desired approximation quality. Our analysis relies on sample-complexity bounds that use Vapnik–Chervonenkis dimension, a key concept from statistical learning theory, which allows us to derive a sufficient sample size that is independent from the size of the graph. The results of our empirical evaluation demonstrates that TipTap returns high-quality results more efficiently and accurately than existing baselines.

APA, Harvard, Vancouver, ISO, and other styles

47

Qin, Hongchao, Rong-Hua Li, Ye Yuan, Guoren Wang, Lu Qin, and Zhiwei Zhang. "Mining Bursting Core in Large Temporal Graphs." Proceedings of the VLDB Endowment 15, no. 13 (September 2022): 3911–23. http://dx.doi.org/10.14778/3565838.3565845.

Full text

Abstract:

Temporal graphs are ubiquitous. Mining communities that are bursting in a period of time is essential for seeking real emergency events in temporal graphs. Unfortunately, most previous studies on community mining in temporal networks ignore the bursting patterns of communities. In this paper, we study the problem of seeking bursting communities in a temporal graph. We propose a novel model, called the ( l , δ)-maximal bursting core, to represent a bursting community in a temporal graph. Specifically, an ( l , δ)-maximal bursting core is a temporal subgraph in which each node has an average degree no less than δ in a time segment with length no less than l. To compute the ( l , δ)-maximal bursting core, we first develop a novel dynamic programming algorithm that can reduce time complexity of calculating the segment density from O (| T |) 2 to O (| T |). Then, we propose an efficient updating algorithm which can update the segment density in O ( l ) time. In addition, we develop an efficient algorithm to enumerate all ( l , δ)-maximal bursting cores that are not dominated by the others in terms of l and δ. The results of extensive experiments on 9 real-life datasets demonstrate the effectiveness, efficiency and scalability of our algorithms.

APA, Harvard, Vancouver, ISO, and other styles

48

D’Emidio, Mattia. "Faster Algorithms for Mining Shortest-Path Distances from Massive Time-Evolving Graphs." Algorithms 13, no. 8 (August 4, 2020): 191. http://dx.doi.org/10.3390/a13080191.

Full text

Abstract:

Computing shortest-path distances is a fundamental primitive in the context of graph data mining, since this kind of information is essential in a broad range of prominent applications, which include social network analysis, data routing, web search optimization, database design and route planning. Standard algorithms for shortest paths (e.g., Dijkstra’s) do not scale well with the graph size, as they take more than a second or huge memory overheads to answer a single query on the distance for large-scale graph datasets. Hence, they are not suited to mine distances from big graphs, which are becoming the norm in most modern application contexts. Therefore, to achieve faster query answering, smarter and more scalable methods have been designed, the most effective of them based on precomputing and querying a compact representation of the transitive closure of the input graph, called the 2-hop-cover labeling. To use such approaches in realistic time-evolving scenarios, when the managed graph undergoes topological modifications over time, specific dynamic algorithms, carefully updating the labeling as the graph evolves, have been introduced. In fact, recomputing from scratch the 2-hop-cover structure every time the graph changes is not an option, as it induces unsustainable time overheads. While the state-of-the-art dynamic algorithm to update a 2-hop-cover labeling against incremental modifications (insertions of arcs/vertices, arc weights decreases) offers very fast update times, the only known solution for decremental modifications (deletions of arcs/vertices, arc weights increases) is still far from being considered practical, as it requires up to tens of seconds of processing per update in several prominent classes of real-world inputs, as experimentation shows. In this paper, we introduce a new dynamic algorithm to update 2-hop-cover labelings against decremental changes. We prove its correctness, formally analyze its worst-case performance, and assess its effectiveness through an experimental evaluation employing both real-world and synthetic inputs. Our results show that it improves, by up to several orders of magnitude, upon average update times of the only existing decremental algorithm, thus representing a step forward towards real-time distance mining in general, massive time-evolving graphs.

APA, Harvard, Vancouver, ISO, and other styles

49

YADA, KATSUTOSHI, HIROSHI MOTODA, TAKASHI WASHIO, and ASUKA MIYAWAKI. "CONSUMER BEHAVIOR ANALYSIS BY GRAPH MINING TECHNIQUE." New Mathematics and Natural Computation 02, no. 01 (March 2006): 59–68. http://dx.doi.org/10.1142/s1793005706000294.

Full text

Abstract:

In this paper, we discuss how graph mining system is applied to sales transaction data so as to understand consumer behavior. First, existing research of consumer behavior analysis for sequential purchase pattern is reviewed. Then we propose to represent the complicated customer purchase behavior by a directed graph retaining temporal information in a purchase sequence and apply a graph mining technique to analyze the frequent occurring patterns. In this paper, we demonstrate through the case of healthy cooking oil analysis how graph mining technology helps us understand complex purchase behavior.

APA, Harvard, Vancouver, ISO, and other styles

50

Jamshidi, Kasra, and Keval Vora. "A Deeper Dive into Pattern-Aware Subgraph Exploration with PEREGRINE." ACM SIGOPS Operating Systems Review 55, no. 1 (June 2, 2021): 1–10. http://dx.doi.org/10.1145/3469379.3469381.

Full text

Abstract:

Graph mining workloads aim to extract structural properties of a graph by exploring its subgraph structures. PEREGRINE is a general-purpose graph mining system that provides a generic runtime to efficiently explore subgraph structures of interest and perform various graph mining analyses. It takes a 'pattern-aware' approach by incorporating a pattern-based programming model along with efficient pattern matching strategies. The programming model enables easier expression of complex graph mining use cases and enables PEREGRINE to extract the semantics of patterns. By analyzing the patterns, PEREGRINE generates efficient exploration plans which it uses to guide its subgraph exploration. In this paper, we present an in-depth view of the patternanalysis techniques powering the matching engine of PEREGRINE. Beyond the theoretical foundations from prior research, we expose opportunities based on how the exploration plans are evaluated, and develop key techniques for computation reuse, enumeration depth reduction, and branch elimination. Our experiments show the importance of patternawareness for scalable and performant graph mining where the presented new techniques speed up the performance by up to two orders of magnitude on top of the benefits achieved from the prior theoretical foundations that generate the initial exploration plans.

APA, Harvard, Vancouver, ISO, and other styles

We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!