Dissertations / Theses on the topic 'Graph Mining'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Graph Mining.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Feng, Jing. "Information-theoretic graph mining." Diss., Ludwig-Maximilians-Universität München, 2015. http://nbn-resolving.de/urn:nbn:de:bvb:19-183384.
Full textKumar, Rohit 1986. "Temporal graph mining and distributed processing." Doctoral thesis, Universitat Politècnica de Catalunya, 2018. http://hdl.handle.net/10803/620623.
Full textCon el reciente crecimiento de las redes sociales y el deseo humano de interactuar con el mundo digital, una gran cantidad de datos de interacción humano-a-humano o humano-a-dispositivo se generan cada segundo. Con el auge de los dispositivos IoT, las interacciones dispositivo-a-dispositivo también están en alza. Todas estas interacciones no son más que una representación de como la red subyacente conecta distintas entidades en el tiempo. Modelar estas interacciones en forma de red de interacciones presenta una gran cantidad de oportunidades únicas para descubrir patrones interesantes y entender la dinamicidad de la red. Entender la dinamicidad de la red es clave ya que encapsula la forma en la que nos comunicamos, socializamos, consumimos información y somos influenciados. Para ello, en esta tesis doctoral, nos centramos en analizar una red de interacciones para entender como la red subyacente es usada. Definimos una red de interacciones como una sequencia de interacciones grabadas en el tiempo E sobre aristas de un grafo estático G=(V, E). Las redes de interacción se pueden usar para modelar gran cantidad de aplicaciones reales, por ejemplo en una red social o de comunicaciones cada interacción sobre una arista representa una interacción entre dos usuarios (correo electrónico, llamada, retweet), o en el caso de una red financiera una interacción entre dos cuentas para representar una transacción. Analizamos las redes de interacción bajo múltiples escenarios. En el primero, estudiamos las redes de interacción bajo un modelo de ventana deslizante. Asumimos que un nodo puede mandar información a otros nodos si estan conectados utilizando aristas presentes en una ventana temporal. En este modelo, estudiamos como la importancia o centralidad de un nodo evoluciona en el tiempo. En el segundo escenario añadimos restricciones adicionales respecto como la información fluye entre nodos. Asumimos que un nodo puede mandar información a otros nodos solo si existe un camino temporal entre ellos. Para restringir la longitud de los caminos temporales también asumimos una ventana temporal. Aplicamos este modelo para resolver este problema de maximización de influencia restringido temporalmente. Analizando los datos de la red de interacción bajo nuestro modelo intentamos descubrir los k nodos más influyentes. Examinamos nuestro modelo en interacciones humano-a-humano, usando datos de redes sociales, como en ubicación-a-ubicación usando datos de redes sociales basades en localización (LBSNs). En el mismo escenario también minamos camínos cíclicos temporales para entender los patrones de comunicación en una red. Existen múltiples aplicaciones para cíclos temporales y aparecen naturalmente en redes de comunicación donde una persona envía un mensaje y después de un tiempo reacciona a una cadena de reacciones de compañeros en el mensaje. En redes financieras, por otro lado, la presencia de un ciclo temporal puede indicar ciertos tipos de fraude. Proponemos algoritmos eficientes para todos nuestros análisis y evaluamos su eficiencia y efectividad en datos reales. Finalmente, dado que muchos de los algoritmos estudiados tienen una gran demanda computacional, también estudiamos los algoritmos de procesado distribuido de grafos. Un aspecto importante de procesado distribuido de grafos es el de correctamente particionar los datos del grafo entre distintas máquinas. Gran cantidad de investigación se ha realizado en estrategias para particionar eficientemente un grafo, pero no existe un particionamento bueno para todos los tipos de grafos y algoritmos. Escoger la mejor estrategia de partición no es trivial y es mayoritariamente un ejercicio de prueba y error. Con tal de abordar este problema, proporcionamos un modelo de costes para dar un mejor entendimiento en como una estrategia de particionamiento actúa dado un grafo y un algoritmo.
Kumar, Rohit. "Temporal Graph Mining and Distributed Processing." Doctoral thesis, Universite Libre de Bruxelles, 2018. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/271527.
Full textDoctorat en Sciences de l'ingénieur et technologie
info:eu-repo/semantics/nonPublished
Niggemann, Oliver. "Visual data mining of graph based data." [S.l. : s.n.], 2001. http://deposit.ddb.de/cgi-bin/dokserv?idn=962400505.
Full textRunelöv, Martin. "Finding seminal scientific publications with graph mining." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-172382.
Full textI detta examensarbete undersöks det huruvida analys av citeringsgrafer kan användas för att finna betydelsefulla vetenskapliga publikationer. Framför allt studeras ”betweenness”-centralitet, den så kallade ”backbone”-grafen samt ”burstiness” av citeringar. Dessa mått utvärderas med hjälp av precisionsmått med avseende på guldstandarder baserade på ’fellow’-program samt via manuell annotering. Antal citeringar, PageRank, och slumpmässigt urval används som jämförelse. Resultaten visar att ”backbone”-grafen kan bidra till att eventuellt upptäcka betydelsefulla publikationer med ett lågt antal citeringar samt att en kombination av ”betweenness” och ”burstiness” ger resultat i nivå med de man får av att räkna antal citeringar.
Diot, Fabien. "Graph mining for object tracking in videos." Thesis, Saint-Etienne, 2014. http://www.theses.fr/2014STET4009/document.
Full textDetecting and following the main objects of a video is necessary to describe its content in order to, for example, allow for a relevant indexation of the multimedia content by the search engines. Current object tracking approaches either require the user to select the targets to follow, or rely on pre-trained classifiers to detect particular classes of objects such as pedestrians or car for example. Since those methods rely on user intervention or prior knowledge of the content to process, they cannot be applied automatically on amateur videos such as the ones found on YouTube. To solve this problem, we build upon the hypothesis that, in videos with a moving background, the main objects should appear more frequently than the background. Moreover, in a video, the topology of the visual elements composing an object is supposed consistent from one frame to another. We represent each image of the videos with plane graphs modeling their topology. Then, we search for substructures appearing frequently in the database of plane graphs thus created to represent each video. Our contributions cover both fields of graph mining and object tracking. In the first field, our first contribution is to present an efficient plane graph mining algorithm, named PLAGRAM. This algorithm exploits the planarity of the graphs and a new strategy to extend the patterns. The next contributions consist in the introduction of spatio-temporal constraints into the mining process to exploit the fact that, in a video, the motion of objects is small from on frame to another. Thus, we constrain the occurrences of a same pattern to be close in space and time by limiting the number of frames and the spatial distance separating them. We present two new algorithms, DYPLAGRAM which makes use of the temporal constraint to limit the number of extracted patterns, and DYPLAGRAM_ST which efficiently mines frequent spatio-temporal patterns from the datasets representing the videos. In the field of object tracking, our contributions consist in two approaches using the spatio-temporal patterns to track the main objects in videos. The first one is based on a search of the shortest path in a graph connecting the spatio-temporal patterns, while the second one uses a clustering approach to regroup them in order to follow the objects for a longer period of time. We also present two industrial applications of our method
Wang, Guan. "Graph-Based Approach on Social Data Mining." Thesis, University of Illinois at Chicago, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=3668648.
Full textPowered by big data infrastructures, social network platforms are gathering data on many aspects of our daily lives. The online social world is reflecting our physical world in an increasingly detailed way by collecting people's individual biographies and their various of relationships with other people. Although massive amount of social data has been gathered, an urgent challenge remain unsolved, which is to discover meaningful knowledge that can empower the social platforms to really understand their users from different perspectives.
Motivated by this trend, my research addresses the reasoning and mathematical modeling behind interesting phenomena on social networks. Proposing graph based data mining framework regarding to heterogeneous data sources is the major goal of my research. The algorithms, by design, utilize graph structure with heterogeneous link and node features to creatively represent social networks' basic structures and phenomena on top of them.
The graph based heterogeneous mining methodology is proved to be effective on a series of knowledge discovery topics, including network structure and macro social pattern mining such as magnet community detection (87), social influence propagation and social similarity mining (85), and spam detection (86). The future work is to consider dynamic relation on social data mining and how graph based approaches adapt from the new situations.
Giatsidis, Christos. "Graph Mining and Community Evaluation with Degeneracy." Palaiseau, Ecole polytechnique, 2013. http://pastel.archives-ouvertes.fr/docs/00/95/96/15/PDF/thesisA.pdf.
Full textThe study and analysis of social networks attract attention from a variety of Sciences (psychology, statistics, sociology). Among them, the field of Data Mining offers tools to automatically extract useful information on properties of those networks. More specifically, Graph Mining serves the need to model and investigate social networks especially in the case of large communities - usually found in online media - where social networks are prohibitively large for non-automated methodologies. The general modeling of a social network is based on graph structures. Nodes of the graph represent individuals and edges signify different actions or types of social connections between them. A community is defined as a subgraph (of a social network) and is characterized by dense connections. Various measures have been proposed to evaluate different quality aspects of such communities - in most cases ignoring various properties of the connections (e. G. Directionality). In the work presented here, the k-core concept is used as a means to evaluate communities and extract information. The k-core structure essentially measures the robustness of an undirected network through degeneracy. Further more extensions of degeneracy are introduced to networks that their edges offer more information than the undirected type. Starting point is the exploration of properties that can be extracted from undirected graphs (of social networks). On this, degeneracy is used to evaluate collaboration features - a property not captured by the single node metrics or by the established community evaluation metrics - of both individuals and the entire community. Next, this process is extended for weighted, directed and signed graphs offering a plethora of novel evaluation metrics for social networks. These new features offer measurement tools for collaboration in social networks where we can assign a weight or a direction to a connection and provide alternative ways to signify the importance of individuals within a community. For signed graphs the extension of degeneracy offers additional metrics that can be used for trust management. Moreover, a clustering approach is introduced which capitalizes on processing the graph in a hierarchical manner provided by its core expansion sequence, an ordered partition of the graph into different levels according to the k-core decomposition The graph theoretical models are then applied in real world graphs to investigate trends and behaviors. The datasets explored include scientific collaboration and citation graphs (DBLP and ARXIV), a snapshot of Wikipedia's inner graph and trust networks (e. G. Epinions and Slashdot). The findings on these datasets are interesting and the proposed models offer intuitive results
Schenker, Adam. "Graph-Theoretic Techniques for Web Content Mining." [Tampa, Fla.] : University of South Florida, 2003. http://purl.fcla.edu/fcla/etd/SFE0000143.
Full textDe, Lara Nathan. "Algorithmic and software contributions to graph mining." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT029.
Full textSince the introduction of Google's PageRank method for Web searches in the late 1990s, graph algorithms have been part of our daily lives. In the mid 2000s, the arrival of social networks has amplified this phenomenon, creating new use-cases for these algorithms. Relationships between entities can be of multiple types: user-user symmetric relationships for Facebook or LinkedIn, follower-followee asymmetric ones for Twitter or even user-content bipartite ones for Netflix or Amazon. They all come with their own challenges and the applications are numerous: centrality calculus for influence measurement, node clustering for knowledge discovery, node classification for recommendation or embedding for link prediction, to name a few.In the meantime, the context in which graph algorithms are applied has rapidly become more constrained. On the one hand, the increasing size of the datasets with millions of entities, and sometimes billions of relationships, bounds the asymptotic complexity of the algorithms for industrial applications. On the other hand, as these algorithms affect our daily lives, there is a growing demand for explanability and fairness in the domain of artificial intelligence in general. Graph mining is no exception. For example, the European Union has published a set of ethics guidelines for trustworthy AI. This calls for further analysis of the current models and even new ones.This thesis provides specific answers via a novel analysis of not only standard, but also extensions, variants, and original graph algorithms. Scalability is taken into account every step of the way. Following what the Scikit-learn project does for standard machine learning, we deem important to make these algorithms available to as many people as possible and participate in graph mining popularization. Therefore, we have developed an open-source software, Scikit-network, which implements and documents the algorithms in a simple and efficient way. With this tool, we cover several areas of graph mining such as graph embedding, clustering, and semi-supervised node classification
Casas, Roma Jordi. "Privacy-preserving and data utility in graph mining." Doctoral thesis, Universitat Autònoma de Barcelona, 2014. http://hdl.handle.net/10803/285566.
Full textIn recent years, an explosive increase of graph-formatted data has been made publicly available. Embedded within this data there is private information about users who appear in it. Therefore, data owners must respect the privacy of users before releasing datasets to third parties. In this scenario, anonymization processes become an important concern. However, anonymization processes usually introduce some kind of noise in the anonymous data, altering the data and also their results on graph mining processes. Generally, the higher the privacy, the larger the noise. Thus, data utility is an important factor to consider in anonymization processes. The necessary trade-off between data privacy and data utility can be reached by using measures and metrics to lead the anonymization process to minimize the information loss, and therefore, to maximize the data utility. In this thesis we have covered the fields of user's privacy-preserving in social networks and the utility and quality of the released data. A trade-off between both fields is a critical point to achieve good anonymization methods for the subsequent graph mining processes. Part of this thesis has focused on data utility and information loss. Firstly, we have studied the relation between the generic information loss measures and the clustering-specific ones, in order to evaluate whether the generic information loss measures are indicative of the usefulness of the data for subsequent data mining processes. We have found strong correlation between some generic information loss measures (average distance, betweenness centrality, closeness centrality, edge intersection, clustering coefficient and transitivity) and the precision index over the results of several clustering algorithms, demonstrating that these measures are able to predict the perturbation introduced in anonymous data. Secondly, two measures to reduce the information loss on graph modification processes have been presented. The first one, Edge neighbourhood centrality, is based on information flow throw 1-neighbourhood of a specific edge in the graph. The second one is based on the core number sequence and it preserves better the underlying graph structure, retaining more data utility. By an extensive experimental set up, we have demonstrated that both methods are able to preserve the most important edges in the network, keeping the basic structural and spectral properties close to the original ones. The other important topic of this thesis has been privacy-preserving methods. We have presented our random-based algorithm, which utilizes the concept of Edge neighbourhood centrality to drive the edge modification process to better preserve the most important edges in the graph, achieving lower information loss and higher data utility on the released data. Our method obtains a better trade-off between data utility and data privacy than other methods. Finally, two different approaches for k-degree anonymity on graphs have been developed. First, an algorithm based on evolutionary computing has been presented and tested on different small and medium real networks. Although this method allows us to fulfil the desired privacy level, it presents two main drawbacks: the information loss is quite large in some graph structural properties and it is not fast enough to work with large networks. Therefore, a second algorithm has been presented, which uses the univariate micro-aggregation to anonymize the degree sequence and reduce the distance from the original one. This method is quasi-optimal and it results in lower information loss and better data utility.
Alkan, Sertan. "A Distributed Graph Mining Framework Based On Mapreduce." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12611588/index.pdf.
Full textOkuno, Shingo. "Parallelization of Graph Mining using Backtrack Search Algorithm." 京都大学 (Kyoto University), 2017. http://hdl.handle.net/2433/225743.
Full textDi, Jinchao. "Gene Co-expression Network Mining Using Graph Sparsification." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1367583964.
Full textFaisal, S. M. "Towards Energy Efficient Data Mining & Graph Processing." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1440364739.
Full textLiu, Haishan, and Haishan Liu. "A Graph-based Approach for Semantic Data Mining." Thesis, University of Oregon, 2012. http://hdl.handle.net/1794/12567.
Full textRossi, Maria. "Graph Mining for Influence Maximization in Social Networks." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLX083/document.
Full textModern science of graphs has emerged the last few years as a field of interest and has been bringing significant advances to our knowledge about networks. Until recently the existing data mining algorithms were destined for structured/relational data while many datasets exist that require graph representation such as social networks, networks generated by textual data, 3D protein structures and chemical compounds. It has become therefore of crucial importance to be able to extract meaningful information from that kind of data and towards this end graph mining and analysis methods have been proven essential. The goal of this thesis is to study problems in the area of graph mining focusing especially on designing new algorithms and tools related to information spreading and specifically on how to locate influential entities in real-world networks. This task is crucial in many applications such as information diffusion, epidemic control and viral marketing. In the first part of the thesis, we have studied spreading processes in social networks focusing on finding topological characteristics that rank entities in the network based on their influential capabilities. We have specifically focused on the K-truss decomposition which is an extension of the core decomposition of the graph. Extensive experimental analysis showed that the nodes that belong to the maximal K-truss subgraph show a better spreading behavior when compared to baseline criteria. Such spreaders can influence a greater part of the network during the first steps of a spreading process but also the total fraction of the influenced nodes at the end of the epidemic is greater. We have also observed that node members of such dense subgraphs are those achieving the optimal spreading in the network.In the second part of the thesis, we focused on identifying a group of nodes that by acting all together maximize the expected number of influenced nodes at the end of the spreading process, formally called Influence Maximization (IM). The IM problem is actually NP-hard though there exist approximation guarantees for efficient algorithms that can solve the problem while obtaining a solution within the 63% of optimal classes of models. As those guarantees propose a greedy approximation which is computationally expensive especially for large graphs, we proposed the MATI algorithm which succeeds in locating the group of users that maximize the influence while also being scalable. The algorithm takes advantage the possible paths created in each node’s neighborhood to precalculate each node’s potential influence and produces competitive results in quality compared to those of baseline algorithms such as the Greedy, LDAG and SimPath. In the last part of the thesis, we study the privacy point of view of sharing such metrics that are good influential indicators in a social network. We have focused on designing an algorithm that addresses the problem of computing through an efficient, correct, secure, and privacy-preserving algorithm the k-core metric which measures the influence of each node of the network. We have specifically adopted a decentralization approach where the social network is considered as a Peer-to-peer (P2P) system. The algorithm is built based on the constraint that it should not be possible for a node to reconstruct partially or entirely the graph using the information they obtain during its execution. While a distributed algorithm that computes the nodes’ coreness is already proposed, dynamic networks are not taken into account. Our main contribution is an incremental algorithm that efficiently solves the core maintenance problem in P2P while limiting the number of messages exchanged and computations. We provide a security and privacy analysis of the solution regarding network de-anonimization and show how it relates to previously defined attacks models and discuss countermeasures
Berlingerio, Michele. "Graph and network data: mining the temporal dimension." Thesis, IMT Alti Studi Lucca, 2009. http://e-theses.imtlucca.it/21/1/Berlingerio_phdthesis.pdf.
Full textKang, U. "Mining Tera-Scale Graphs: Theory, Engineering and Discoveries." Research Showcase @ CMU, 2012. http://repository.cmu.edu/dissertations/160.
Full textIngalalli, Vijay. "Querying and Mining Multigraphs." Thesis, Montpellier, 2017. http://www.theses.fr/2017MONTS080/document.
Full textWith the ever-increasing growth of data and information, extracting the right knowledge has become a real challenge.Further, the advanced applications demand the analysis of complex, interrelated data which cannot be adequately described using a propositional representation. The graph representation is of great interest for the knowledge extraction community, since graphs are versatile data structures and are one of the most general forms of data representation. Among several classes of graphs, textit{multigraphs} have been captivating the attention in the recent times, thanks to their inherent property of succinctly representing the entities by allowing the rich and complex relations among them.The focus of this thesis is streamlined into two themes of knowledge extraction; one being textit{knowledge retrieval}, where we focus on the subgraph query matching aspects in multigraphs, and the other being textit{knowledge discovery}, where we focus on the problem of frequent pattern mining in multigraphs.This thesis makes three main contributions in the field of query matching and data mining.The first contribution, which is very generic, addresses querying subgraphs in multigraphs that yields isomorphic matches, and this problem finds potential applications in the domains of remote sensing, social networks, bioinformatics, chemical informatics. The second contribution, which is focussed on knowledge graphs, addresses querying subgraphs in RDF multigraphs that yield homomorphic matches. In both the contributions, we introduce efficient indexing structures that capture the multiedge information. The query matching processes introduced have been carefully optimized, w.r.t. the time performance and the heuristics employed assure robust performance.The third contribution is in the field of data mining, where we propose an efficient frequent pattern mining algorithm for multigraphs. We observe that multigraphs pose challenges while exploring the search space, and hence we introduce novel optimization techniques and heuristic search methods to swiftly traverse the search space.For each proposed approach, we perform extensive experimental analysis by comparing with the existing state-of-the-art approaches in order to validate the performance and correctness of our approaches.In the end, we perform a case study analysis on a remote sensing dataset. Remote sensing dataset is modelled as a multigraph, and the mining and query matching processes are employed to discover some useful knowledge
Wang, Changliang. "Continuous subgraph pattern search over graph streams /." View abstract or full-text, 2009. http://library.ust.hk/cgi/db/thesis.pl?CSED%202009%20WANG.
Full textFeng, Jing [Verfasser], and Christian [Akademischer Betreuer] Böhm. "Information-theoretic graph mining / Jing Feng. Betreuer: Christian Böhm." München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2015. http://d-nb.info/1073825892/34.
Full textHossain, Md Shakhawat. "Context Specific Module Mining from Multiple Co-Expression Graph." Thesis, North Dakota State University, 2017. https://hdl.handle.net/10365/28664.
Full textLe, Thanh Nam. "Graph representation and mining applied in comic images retrieval." Thesis, La Rochelle, 2019. http://www.theses.fr/2019LAROS008.
Full textIn information retrieval tasks from image databases where content representation is based on graphs, the evaluation of similarity is based both on the appearance of spatial entities and on their mutual relationships. In this thesis we present a novel scheme of Attributed Relational Adjacency Graphs representation and mining, which has been applied in content-based retrieval of comic images. The images used in this thesis are comics images, which have their inherent difficulties in applying content-based retrieval, such as their abstractness, partial occlusion, scale change and shape deformation due to viewpoint changes. We propose a graph representation that yields stable graphs and allow to retain high-level and structural information of objects of interest in comic images. Next, we extend the indexing and matching problem to graph structures representing the comic image, and apply it to the problem of retrieval. The graphs in the graph database representing the whole comic volume are then mined for frequent patterns (or frequent substructures). This step is to overcome the non-repeatability problem caused by the unavoidable errors introduced into the graph structure during the graph construction stage, which ultimately create a semantic gap between the graph and the content of the comic image. Finally, we demonstrate the effectiveness of the system with a database of annotated comic images. Experiments of performance measures is addressed to evaluate the performance of this CBIR system
Kolli, Lakshmi Priya. "Mining for Frequent Community Structures using Approximate Graph Matching." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1623166375110273.
Full textNabhan, Ahmed Ragab. "Graph Pattern Mining Techniques to Identify Potential Model Organisms." ScholarWorks @ UVM, 2014. http://scholarworks.uvm.edu/graddis/4.
Full textHu, Chenhui. "Graph-Based Data Mining in Neuroimaging of Neurological Diseases." Thesis, Harvard University, 2016. http://nrs.harvard.edu/urn-3:HUL.InstRepos:33493468.
Full textEngineering and Applied Sciences - Engineering Sciences
Karunaratne, Thashmee M. "Learning predictive models from graph data using pattern mining." Doctoral thesis, Stockholms universitet, Institutionen för data- och systemvetenskap, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-100713.
Full textHuang, Zan, Wingyan Chung, and Hsinchun Chen. "A Graph Model for E-Commerce Recommender Systems." Wiley Periodicals, Inc, 2004. http://hdl.handle.net/10150/105683.
Full textInformation overload on the Web has created enormous challenges to customers selecting products for online purchases and to online businesses attempting to identify customersâ preferences efficiently. Various recommender systems employing different data representations and recommendation methods are currently used to address these challenges. In this research, we developed a graph model that provides a generic data representation and can support different recommendation methods. To demonstrate its usefulness and flexibility, we developed three recommendation methods: direct retrieval, association mining, and high-degree association retrieval. We used a data set from an online bookstore as our research test-bed. Evaluation results showed that combining product content information and historical customer transaction information achieved more accurate predictions and relevant recommendations than using only collaborative information. However, comparisons among different methods showed that high-degree association retrieval did not perform significantly better than the association mining method or the direct retrieval method in our test-bed.
AlAbed-AlHaq, Abrar Fawwaz. "APPLYING GRAPH MINING TECHNIQUES TO SOLVE COMPLEX SOFTWARE ENGINEERING PROBLEMS." Kent State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=kent1442986844.
Full textPech, Palacio Manuel Alfredo. "Spatial data modeling and mining using a graph-based representation." Lyon, INSA, 2005. http://theses.insa-lyon.fr/publication/2005ISAL0118/these.pdf.
Full textWe propose a unique graph-based model to represent spatial data, non-spatial data and the spatial relations among spatial objects. We will generate datasets composed of graphs with a set of these three elements. We consider that by mining a dataset with these characteristics a graph-based mining tool can search patterns involving all these elements at the same time improving the results of the spatial analysis task. A significant characteristic of spatial data is that the attributes of the neighbors of an object may have an influence on the object itself. So, we propose to include in the model three relationship types (topological, orientation, and distance relations). In the model the spatial data (i. E. Spatial objects), non-spatial data (i. E. Non-spatial attributes), and spatial relations are represented as a collection of one or more directed graphs. A directed graph contains a collection of vertices and edges representing all these elements. Vertices represent either spatial objects, spatial relations between two spatial objects (binary relation), or non-spatial attributes describing the spatial objects. Edges represent a link between two vertices of any type. According to the type of vertices that an edge joins, it can represent either an attribute name or a spatial relation name. The attribute name can refer to a spatial object or a non-spatial entity. We use directed edges to represent directional information of relations among elements (i. E. Object x touches object y) and to describe attributes about objects (i. E. Object x has attribute z). We propose to adopt the Subdue system, a general graph-based data mining system developed at the University of Texas at Arlington, as our mining tool. A special feature named overlap has a primary role in the substructures discovery process and consequently a direct impact over the generated results. However, it is currently implemented in an orthodox way: all or nothing. Therefore, we propose a third approach: limited overlap, which gives the user the capability to set over which vertices the overlap will be allowed. We visualize directly three motivations issues to propose the implementation of the new algorithm: search space reduction, processing time reduction, and specialized overlapping pattern oriented search
Pech, Palacio Manuel Alfredo Laurini Robert Tchounikine Anne Sol Martínez David. "Spatial data modeling and mining using a graph-based representation." Villeurbanne : Doc'INSA, 2006. http://docinsa.insa-lyon.fr/these/pont.php?id=pech_palacio.
Full textThèse soutenue en co-tutelle. Thèse rédigée en français, en anglais et en espagnol. Titre provenant de l'écran-titre. Bibliogr. p. 174-182.
Abdlwafa, Alan, and Henrik Edman. "Distributed Graph Mining : A study of performance advantages in distributed data mining paradigms when processing graphs using PageRank on a single node cluster." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-166449.
Full textFang, Chunsheng. "Novel Frameworks for Mining Heterogeneous and Dynamic Networks." University of Cincinnati / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1321369978.
Full textMaxwell, Evan Kyle. "Graph Mining Algorithms for Memory Leak Diagnosis and Biological Database Clustering." Thesis, Virginia Tech, 2010. http://hdl.handle.net/10919/34008.
Full textMaster of Science
Aridhi, Sabeur. "Distributed frequent subgraph mining in the cloud." Phd thesis, Université Blaise Pascal - Clermont-Ferrand II, 2013. http://tel.archives-ouvertes.fr/tel-00951350.
Full textKumar, Lalit. "Scalable Map-Reduce Algorithms for Mining Formal Concepts and Graph Substructures." University of Cincinnati / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1543996580926452.
Full textWegner, Jörg Kurt. "Data mining und graph mining auf molekularen Graphen - Cheminformatik und molekulare Kodierungen für ADME/Tox-QSAR-Analysen." Berlin Logos, 2006. http://deposit.d-nb.de/cgi-bin/dokserv?id=2865580&prov=M&dok_var=1&dok_ext=htm.
Full textChen, Zhiqian. "Graph Neural Networks: Techniques and Applications." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/99848.
Full textDoctor of Philosophy
Graph data is pervasive throughout most fields, including pandemic spread network, social network, transportation roads, internet, and chemical structure. Therefore, the applications modeled by graph benefit people's everyday life, and graph mining derives insightful opinions from this complex topology. This paper investigates an emerging technique called graph neural newton (GNNs), which is designed for graph data mining. There are two primary goals of this thesis paper: (1) understanding the GNNs in theory, and (2) apply GNNs for unexplored and values real-world scenarios. For the first goal, we investigate spectral theory and approximation theory, and a unified framework is proposed to summarize most GNNs. This direction provides a possibility that existing or newly proposed works can be compared, and the actual process can be measured. Specifically, this result demonstrates that most GNNs are either an approximation for a function of graph adjacency matrix or a function of eigenvalues. Different types of approximations are analyzed in terms of physical meaning, and the advantages and disadvantages are offered. Beyond that, we proposed a new optimization for a highly accurate but low efficient approximation. Evaluation of synthetic data proves its theoretical power, and the tests on two transportation networks show its potentials in real-world graphs. For the second goal, the circuit is selected as a novel application since it is crucial, but there are few works. Specifically, we focus on a security problem, a high-value real-world problem in industry companies such as Nvidia, Apple, AMD, etc. This problem is defined as a circuit graph as apply GNN to learn the representation regarding the prediction target such as attach runtime. Experiment on several benchmark circuits shows its superiority on effectiveness and efficacy compared with competitive baselines. This paper provides exploration in theory and application with GNNs, which shows a promising direction for graph mining tasks. Its potentials also provide a wide range of innovations in graph-based problems.
Boothe, Peter Mattison 1978. "Measuring the Internet AS graph and its evolution." Thesis, University of Oregon, 2009. http://hdl.handle.net/1794/10328.
Full textAs the Internet has evolved over time, the interconnection patterns of the members of this "network of networks" have changed. Can we characterize those changes? Have those changes been good or bad? What does "good" mean in this context? Has market power been centralizing or decentralizing? How certain can we be of our answer? What are the limitations of our data? These are the questions which motivate this dissertation. In this dissertation, we answer these questions and more by carefully taking a long-term quantitative study of the evolution of the topology of the Internet's AS graph. In order to do this study, we spend most of the dissertation developing methods of data processing and data analysis all informed by ideas from networking, data mining, graph theory, and statistics. The contributions are both theoretical and practical. The theoretical contributions include an in-depth analysis of the complexity of AS graph measurement as well as of the difficulty of reconstructing the AS graph from available data. The practical contributions include the design of graph metrics to capture properties of interest, usable approximation algorithms for several AS graph analysis methods, and an analysis of the evolution of the AS graph over time. It is our hope that these methods may prove useful in other domains, and that the conclusions about the evolution of the Internet topology prove useful for Internet operators, network researchers, policy makers, and others.
Committee in charge: Andrzej Proskurowski, Chairperson, Computer & Information Science; Arthur Farley, Member, Computer & Information Science; Jun Li, Member, Computer & Information Science; Anne van den Nouweland, Outside Member, Economics
Hassanzadeh, Reza. "Anomaly detection in online social networks : using data-mining techniques and fuzzy logic." Thesis, Queensland University of Technology, 2014. https://eprints.qut.edu.au/78679/1/Reza_Hassanzadeh_Thesis.pdf.
Full textHohman, Elizabeth Leeds. "A dynamic graph model for representing streaming text documents." Fairfax, VA : George Mason University, 2008. http://hdl.handle.net/1920/3062.
Full textVita: p. 141. Thesis director: Edward J. Wegman. Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computational Sciences and Informatics. Title from PDF t.p. (viewed July 3, 2008). Includes bibliographical references (p. 105-110). Also issued in print.
Cakmak, Ali. "Mining Metabolic Networks and Biomedical Literature." Case Western Reserve University School of Graduate Studies / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=case1223490345.
Full textHirao, Eiji, Takeshi Furuhashi, Tomohiro Yoshikawa, and Daisuke Kobayashi. "A Study of Visualization Method with HK Graph Using Concept Words." 日本知能情報ファジィ学会, 2010. http://hdl.handle.net/2237/20687.
Full textSCIS & ISIS 2010, Joint 5th International Conference on Soft Computing and Intelligent Systems and 11th International Symposium on Advanced Intelligent Systems. December 8-12, 2010, Okayama Convention Center, Okayama, Japan
Gawronski, Alexander. "RiboFSM: Frequent Subgraph Mining for the Discovery of RNA Structures and Interactions." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/26296.
Full textFrisoni, Giacomo. "A New Unsupervised Methodology of Descriptive Text Mining for Knowledge Graph Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20471/.
Full textViaño, Iglesias Adrian. "Graph representation of documents content and its suitability for text mining tasks." Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2011. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-14133.
Full textShaban, Khaled. "A Semantic Graph Model for Text Representation and Matching in Document Mining." Thesis, University of Waterloo, 2006. http://hdl.handle.net/10012/2860.
Full textConventional text representation methodologies consider documents as bags of words and ignore the meanings and ideas their authors want to convey. It is this deficiency that causes similarity measures to fail to perceive contextual similarity of text passages due to the variation of the words the passages contain, or at least perceive contextually dissimilar text passages as being similar because of the resemblance of words the passages have.
This thesis presents a new paradigm for mining documents by exploiting semantic information of their texts. A formal semantic representation of linguistic inputs is introduced and utilized to build a semantic representation scheme for documents. The representation scheme is constructed through accumulation of syntactic and semantic analysis outputs. A new distance measure is developed to determine the similarities between contents of documents. The measure is based on inexact matching of attributed trees. It involves the computation of all distinct similarity common sub-trees, and can be computed efficiently. It is believed that the proposed representation scheme along with the proposed similarity measure will enable more effective document mining processes.
The proposed techniques to mine documents were implemented as vital components in a mining system. A case study of semantic document clustering is presented to demonstrate the working and the efficacy of the framework. Experimental work is reported, and its results are presented and analyzed.
Zulfiqar, Omer. "Detecting Public Transit Service Disruptions Using Social Media Mining and Graph Convolution." Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/103745.
Full textMaster of Science
Millions of people worldwide rely on public transit networks for their daily commutes and day to day movements. With the growth in the number of people using the service, there has been an increase in the number of daily passengers affected by service disruptions. This thesis and research involves proposing and developing three different approaches to help aid in the timely detection of these disruptions. In this work we have developed a pure data mining approach along with two deep learning models using neural networks and live data from Twitter to identify these disruptions. The data mining approach uses a set of dirsuption related input keywords to identify similar keywords within the live Twitter data. By collecting historical data we were able to create deep learning models that represent the vocabulary from the disruptions related Tweets in the form of a graph. A graph is a collection of data values where the data points are connected to one another based on their relationships. A longer chain of connection between two words defines a weak relationship, a shorter chain defines a stronger relationship. In our graph, words with similar contextual meanings are connected to each other over shorter distances, compared to words with different meanings. At the end we use a neural network as a classifier to scan this graph to learn the semantic relationships within our data. Afterwards, this learned information can be used to accurately classify the disruption related Tweets within a pool of random Tweets. Once all the proposed approaches have been developed, a benchmark evaluation is performed against other existing text classification techniques, to justify the effectiveness of the approaches. The final results indicate that the proposed graph based models achieved a higher accuracy, compared to the data mining model, and also outperformed all the other baseline models. Our Tweet-Level GCN had the highest accuracy of 89.9%.
Cederquist, Aaron. "Frequent Pattern Mining among Weighted and Directed Graphs." Case Western Reserve University School of Graduate Studies / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=case1228328123.
Full text