Dissertations / Theses on the topic 'Labeled graph'

Consult the top 39 dissertations / theses for your research on the topic 'Labeled graph.'

1

Park, Noseong. "Top-K Query Processing in Edge-Labeled Graph Data." Thesis, University of Maryland, College Park, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10128677.

Abstract:

Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges and each edge is labeled with a semantic annotation. Hence, a huge single graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with predicates. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed.

Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web and graph queries of other graph DBMS can also be viewed as subgraph matching over large graphs.

Though subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The suggested models consist of a practically important subset of the SPARQL query language augmented with some additional useful features.

The first model, called Substitution Importance Query (SIQ), identifies the top-k answers whose scores are calculated from the matched vertices' properties in each answer in accordance with a user-specified notion of importance. The second model, called Vertex Importance Query (VIQ), identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns the top-k of them according to user-specified approximation terms and scoring functions. In the fourth model, called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of the answers, such as the number of mapped blocks and the vertices' properties in each block, and the top-k most probable answers are returned.

An important distinguishing feature of our work is that we allow the user a huge amount of freedom in specifying: (i) what pattern and approximation the user considers important, (ii) how to score answers, irrespective of whether they are vertices or substitutions, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask.

The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that our algorithms are far more efficient than popular triple stores.
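
The general idea of ranking subgraph-matching answers can be sketched as follows. This is only a brute-force illustration, not the thesis's SIQ algorithm: it does no pruning, and the toy triples, the `IMPORTANCE` scores, and the function names `match` and `top_k` are all assumptions made for the example.

```python
import heapq
from itertools import product

# Edge-labeled graph stored as (subject, predicate, object) triples (toy data, assumed).
TRIPLES = {
    ("alice", "follows", "bob"),
    ("alice", "follows", "carol"),
    ("bob", "follows", "carol"),
    ("carol", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
}
# Hypothetical vertex property used for scoring.
IMPORTANCE = {"alice": 1.0, "bob": 3.0, "carol": 5.0, "acme": 2.0}


def match(pattern, triples):
    """Yield substitutions (dicts) that map ?variables to graph vertices.

    Brute force over all assignments; a real system would prune instead.
    """
    variables = sorted({t for s, _, o in pattern for t in (s, o) if t.startswith("?")})
    vertices = sorted({v for s, _, o in triples for v in (s, o)})
    for values in product(vertices, repeat=len(variables)):
        sub = dict(zip(variables, values))
        if all((sub.get(s, s), p, sub.get(o, o)) in triples for s, p, o in pattern):
            yield sub


def top_k(pattern, triples, k, score):
    """Return the k highest-scoring substitutions."""
    return heapq.nlargest(k, match(pattern, triples), key=score)


# People ?x who follow someone ?y working at acme, ranked by summed importance.
query = [("?x", "follows", "?y"), ("?y", "worksAt", "acme")]
best = top_k(query, TRIPLES, 2, lambda sub: sum(IMPORTANCE[v] for v in sub.values()))
print(best)
```

The scoring function is passed in as a plain callable, which mirrors the abstract's point that the user decides what counts as important.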

2

Li, Jie. "Data integration for biological network databases: MetNetDB labeled graph model and graph matching algorithm." [Ames, Iowa : Iowa State University], 2008.

3

Christensen, Robin. "An Analysis of Notions of Differential Privacy for Edge-Labeled Graphs." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-169379.

Abstract:
The user data in social media platforms is an excellent source of information that is beneficial for both commercial and scientific purposes. However, recent times have shown that user data is not always used for good, which has led to higher demands on user privacy. With accurate statistical research data being just as important as the privacy of the user data, the relevance of differential privacy has increased. Differential privacy allows user data to be accessible under certain privacy conditions at the cost of accuracy in query results, which is caused by noise. The noise is based on a tuneable constant ε and the global sensitivity of a query. The query sensitivity is defined as the greatest possible difference in query result between the queried database and a neighboring database. Whereas the neighboring database is defined to differ by one record in a tabular database, there are multiple neighborhood notions for edge-labeled graphs. This thesis considers the notions of edge neighborhood, node neighborhood, QL-edge neighborhood and QL-outedges neighborhood. To study these notions, a framework was developed in Java to function as a query mechanism for a graph database. ArangoDB was used as storage for the graphs, which were generated by parsing data sets in the RDF format as well as through a graph synthesizer in the developed framework. Querying a database in the framework is done with Apache TinkerPop, and a Laplace distribution is used when generating noise for the query results. The framework was used to study the privacy and utility trade-off of different histogram queries on a number of data sets, while employing the different notions of neighborhood in edge-labeled graphs. The level of privacy is determined by the value of ε, and the utility is defined as a measurement based on the L1-distance between the true and noisy result.
In the general case, the notions of edge neighborhood and QL-edge neighborhood are the better alternatives in terms of privacy and utility. However, there are indications that node neighborhood and QL-outedges neighborhood are viable options for larger graphs, where the level of privacy for edge neighborhood and QL-edge neighborhood appears to be negligible based on utility measurements.
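
The noise mechanism described above can be illustrated with a small sketch. It is written in Python rather than the thesis's Java framework, and the histogram values, the sensitivity of 2.0, and the function names are assumptions chosen for the example; the plain L1 distance stands in for the thesis's utility measurement, which is only said to be based on it.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(true_counts, epsilon, sensitivity, rng):
    """Add Laplace noise with scale sensitivity / epsilon to every bin."""
    scale = sensitivity / epsilon
    return [count + laplace_noise(scale, rng) for count in true_counts]


def l1_error(true_counts, noisy_counts):
    """L1 distance between the true and noisy histograms (lower is better)."""
    return sum(abs(t - n) for t, n in zip(true_counts, noisy_counts))


rng = random.Random(0)
histogram = [40, 25, 10, 5]  # toy histogram query result (assumed)
for eps in (0.1, 1.0, 10.0):
    noisy = noisy_histogram(histogram, eps, sensitivity=2.0, rng=rng)
    print(f"epsilon={eps}: L1 error={l1_error(histogram, noisy):.2f}")
```

A smaller ε gives a larger noise scale, so the L1 error tends to grow as privacy increases. A neighborhood notion with a higher sensitivity has the same effect at a fixed ε.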
4

Shafie, Termeh. "Random Multigraphs : Complexity Measures, Probability Models and Statistical Inference." Doctoral thesis, Stockholms universitet, Statistiska institutionen, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-82697.

Abstract:
This thesis is concerned with multigraphs and their complexity, which is defined and quantified by the distribution of edge multiplicities. Two random multigraph models are considered. The first model is random stub matching (RSM), where the edges are formed by randomly coupling pairs of stubs according to a fixed stub multiplicity sequence. The second model is obtained by independent edge assignments (IEA) according to a common probability distribution over the edge sites. Two different methods for obtaining an approximate IEA model from an RSM model are also presented. In Paper I, multigraphs are analyzed with respect to structure and complexity by using entropy and joint information. The main results include formulae for numbers of graphs of different kinds and their complexity. The local and global structure of multigraphs under RSM is analyzed in Paper II. The distribution of multigraphs under RSM is shown to depend on a single complexity statistic. The distributions under RSM and IEA are used for calculations of moments and entropies, and for comparisons by information divergence. The main results include new formulae for local edge probabilities and a probability approximation for simplicity of an RSM multigraph. In Paper III, statistical tests of a simple or composite IEA hypothesis are performed using goodness-of-fit measures. The results indicate that even for a very small number of edges, the null distributions of the test statistics under IEA are well approximated by their asymptotic χ²-distributions. Paper IV contains the multigraph algorithms that are used for numerical calculations in Papers I-III.
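
The two random multigraph models can be sketched in a few lines. This is a minimal illustration under assumptions, not the algorithms of Paper IV: the function names, the representation of a multigraph as a `Counter` of edge multiplicities, and the example inputs are all hypothetical.

```python
import random
from collections import Counter


def random_stub_matching(stub_counts, rng):
    """RSM sketch: pair stubs uniformly at random and return edge multiplicities.

    stub_counts[v] is the number of stubs (the degree) of vertex v.
    Loops and multiple edges are allowed, so the result is a multigraph.
    """
    stubs = [v for v, d in enumerate(stub_counts) for _ in range(d)]
    if len(stubs) % 2:
        raise ValueError("the total number of stubs must be even")
    rng.shuffle(stubs)
    edges = Counter()
    # Consecutive stubs in the shuffled list are coupled into edges.
    for a, b in zip(stubs[::2], stubs[1::2]):
        edges[(min(a, b), max(a, b))] += 1
    return edges


def independent_edge_assignment(site_probs, m, rng):
    """IEA sketch: assign m edges independently to edge sites (vertex pairs)."""
    sites = list(site_probs)
    drawn = rng.choices(sites, weights=[site_probs[s] for s in sites], k=m)
    return Counter(drawn)


rng = random.Random(0)
print(random_stub_matching([3, 2, 2, 1], rng))
print(independent_edge_assignment({(0, 0): 0.1, (0, 1): 0.6, (1, 1): 0.3}, 5, rng))
```

Under RSM the degree sequence is fixed and only the pairing is random, whereas under IEA only the total number of edges is fixed.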
5

Tahraoui, Mohammed Amin. "Coloring, packing and embedding of graphs." Phd thesis, Université Claude Bernard - Lyon I, 2012. http://tel.archives-ouvertes.fr/tel-00995041.

Abstract:
In this thesis, we investigate some problems in graph theory, namely the graph coloring problem, the graph packing problem and tree pattern matching for XML query processing. The common point between these problems is that they use labeled graphs. In the first part, we study a new coloring parameter of graphs called the gap vertex-distinguishing edge coloring. It consists of an edge coloring of a graph G which induces a vertex-distinguishing labeling of G such that the label of each vertex is given by the difference between the highest and the lowest colors of its incident edges. The minimum number of colors required for a gap vertex-distinguishing edge coloring of G is called the gap chromatic number of G and is denoted by gap(G). We compute this parameter for a large set of graphs G of order n, and we even prove that gap(G) ∈ {n − 1, n, n + 1}. In the second part, we focus on graph packing problems, an area of graph theory that has grown significantly over the past several years. However, the majority of existing works focus on unlabeled graphs. In this thesis, we introduce for the first time the packing problem for a vertex-labeled graph. Roughly speaking, it consists of a graph packing which preserves the labels of the vertices. We study the corresponding optimization parameter on several classes of graphs, and we find general bounds and characterizations. The last part deals with the query processing of a core subset of XML query languages: XML twig queries. An XML twig query, represented as a small query tree, is essentially a complex selection on the structure of an XML document. Matching a twig query means finding all the occurrences of the query tree embedded in the XML data tree. Many holistic twig join algorithms have been proposed to match XML twig patterns. Most of these algorithms find twig pattern matches in two steps. In the first one, a query tree is decomposed into smaller pieces, and solutions against these pieces are found. In the second step, all of these partial solutions are joined together to generate the final solutions. In this part, we propose a novel holistic twig join algorithm, called TwigStack++, which features two main improvements in the decomposition and matching phases. The proposed solutions are shown to be efficient and scalable, and should be helpful for future research on efficient query processing in large XML databases.
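
The labeling induced by a gap coloring can be checked with a short sketch. The function names and the triangle example are assumptions. The abstract does not say how a vertex with a single incident edge is labeled, so the code assumes it receives the color of that edge.

```python
from collections import defaultdict


def gap_labels(edge_colors):
    """Vertex labels induced by an edge coloring {(u, v): color}.

    Label = highest minus lowest incident edge color. A vertex of degree 1
    gets the color of its single edge (an assumed convention).
    """
    incident = defaultdict(list)
    for (u, v), color in edge_colors.items():
        incident[u].append(color)
        incident[v].append(color)
    return {
        v: (cs[0] if len(cs) == 1 else max(cs) - min(cs))
        for v, cs in incident.items()
    }


def is_gap_vertex_distinguishing(edge_colors):
    """True if all induced vertex labels are pairwise distinct."""
    labels = gap_labels(edge_colors)
    return len(set(labels.values())) == len(labels)


# Triangle a-b-c: colors 1, 2, 4 distinguish the vertices; colors 1, 2, 3 do not.
good = {("a", "b"): 1, ("b", "c"): 2, ("c", "a"): 4}
bad = {("a", "b"): 1, ("b", "c"): 2, ("c", "a"): 3}
print(gap_labels(good), is_gap_vertex_distinguishing(good))
print(gap_labels(bad), is_gap_vertex_distinguishing(bad))
```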
6

Mortada, Maidoun. "The b-chromatic number of regular graphs." Thesis, Lyon 1, 2013. http://www.theses.fr/2013LYO10116.

Abstract:
Two problems are considered in this thesis: the b-coloring problem and the graph packing problem. 1. The b-coloring problem: A b-coloring of a graph G is a proper coloring of the vertices of G such that each color class contains a vertex adjacent to at least one vertex in every other color class. The b-chromatic number of G, denoted by b(G), is the maximum number t such that G admits a b-coloring with t colors. El Sahili and Kouider asked whether every d-regular graph G with girth at least 5 satisfies b(G) = d + 1. Blidia, Maffray and Zemir proved that the conjecture is true for d ≤ 6. The question has also been settled for d-regular graphs under supplementary conditions. We study the El Sahili and Kouider conjecture by determining when it is possible and under what supplementary conditions it is true. We prove that b(G) = d + 1 if G is a d-regular graph containing neither a cycle of order 4 nor a cycle of order 6. Then, we provide specific conditions on the vertices of a d-regular graph G with no cycle of order 4 so that b(G) = d + 1. Cabello and Jakovac proved that if v(G) ≥ 2d³ − d² + d, then b(G) = d + 1, where G is a d-regular graph. We improve this bound by proving that if v(G) ≥ 2d³ − 2d² + 2d, then b(G) = d + 1 for a d-regular graph G. 2. The graph packing problem: Graph packing is a classical problem in graph theory and has been extensively studied since the early 1970s. Let G be a graph of order n. Given a permutation σ : V(G) → V(K_n), the function σ* : E(G) → E(K_n) such that σ*(xy) = σ(x)σ(y) is the function induced by σ. We say that there is a packing of k copies of G into the complete graph K_n if there exist k permutations σ_i : V(G) → V(K_n), where i = 1, …, k, such that σ_i*(E(G)) ∩ σ_j*(E(G)) = ∅ for i ≠ j. A packing of k copies of a graph G is called a k-placement of G. The kth power G^k of a graph G is the supergraph of G formed by adding an edge between all pairs of vertices of G at distance at most k. Kheddouci et al. proved that for any non-star tree T there exists a 2-placement σ on V(T). We introduce a new variant of the graph packing problem, called the labeled packing of a graph into its power graph.
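
The definition of a b-coloring can be made concrete with a small checker. This is an illustrative sketch, not a method from the thesis; the function name, the adjacency-set representation, and the 5-cycle example are assumptions.

```python
def is_b_coloring(adj, color):
    """Check whether `color` is a b-coloring of the graph given by `adj`.

    adj: {vertex: set of neighbours}; color: {vertex: color}.
    """
    # Proper coloring: adjacent vertices must receive different colors.
    if any(color[u] == color[v] for u in adj for v in adj[u]):
        return False
    colors = set(color.values())
    for c in colors:
        # Class c needs a vertex whose neighbours show every other color.
        if not any(
            color[v] == c and {color[w] for w in adj[v]} >= colors - {c}
            for v in adj
        ):
            return False
    return True


# The 5-cycle is 2-regular with girth 5; this coloring uses d + 1 = 3 colors.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = {0: "A", 1: "B", 2: "C", 3: "A", 4: "B"}
print(is_b_coloring(c5, coloring))
```

On the path a-b-c colored with three distinct colors the check fails, because neither end vertex sees both of the other colors.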
7

Adamský, Aleš. "Segmentace mluvčích s využitím statistických metod klasifikace." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2011. http://www.nusl.cz/ntk/nusl-219007.

Abstract:
The thesis discusses in detail some concepts of speech and prosody that can contribute to building a speech corpus for speaker segmentation. Moreover, the Elan multimedia annotator used for labeling is described. The theoretical part highlights some frequently used speech features such as MFCC, PLP and LPC and deals with the currently most popular speech segmentation methods. Some classification algorithms are also mentioned. The practical part describes the implementation of the Bayesian information criterion algorithm in a system for automatic speaker segmentation. Different speech features were used to classify speaker change points in speech. The test results were evaluated with the graphical receiver operating characteristic (ROC) method and its quantitative indices. MFCC and HFCC proved to be the best speech features for this system.
8

Fan, Shuangfei. "Deep Representation Learning on Labeled Graphs." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/96596.

Abstract:
We introduce recurrent collective classification (RCC), a variant of the iterative classification algorithm (ICA) analogous to recurrent neural network prediction. RCC accommodates any differentiable local classifier and relational feature functions. We provide gradient-based strategies for optimizing over model parameters to more directly minimize the loss function. In our experiments, this direct loss minimization translates to improved accuracy and robustness on real network data. We demonstrate the robustness of RCC in settings where local classification is very noisy, settings that are particularly challenging for ICA. As a new way to train generative models, generative adversarial networks (GANs) have achieved considerable success in image generation, and this framework has also recently been applied to data with graph structures. We identify the drawbacks of existing deep frameworks for generating graphs, and we propose labeled-graph generative adversarial networks (LGGAN) to train deep generative models for graph-structured data with node labels. We test the approach on various types of graph datasets, such as collections of citation networks and protein graphs. Experiment results show that our model can generate diverse labeled graphs that match the structural characteristics of the training data and outperforms all baselines in terms of quality, generality, and scalability. To further evaluate the quality of the generated graphs, we apply it to a downstream task for graph classification, and the results show that LGGAN can better capture the important aspects of the graph structure.
Doctor of Philosophy
Graphs are one of the most important and powerful data structures for conveying the complex and correlated information among data points. In this research, we aim to provide more robust and accurate models for some graph specific tasks, such as collective classification and graph generation, by designing deep learning models to learn better task-specific representations for graphs. First, we studied the collective classification problem in graphs and proposed recurrent collective classification, a variant of the iterative classification algorithm that is more robust to situations where predictions are noisy or inaccurate. Then we studied the problem of graph generation using deep generative models. We first proposed a deep generative model using the GAN framework that generates labeled graphs. Then in order to support more applications and also get more control over the generated graphs, we extended the problem of graph generation to conditional graph generation which can then be applied to various applications for modeling graph evolution and transformation.
9

Martinsen, Thor. "Refinement composition using doubly labeled transition graphs." Thesis, Monterey, Calif. : Naval Postgraduate School, 2007. http://bosun.nps.edu/uhtbin/hyperion-image.exe/07Sep%5FMartinsen.pdf.

Abstract:
Thesis (M.S. in Computer Science and M.S. in Applied Mathematics)--Naval Postgraduate School, September 2007.
Thesis Advisor(s): Dinolt, George ; Fredricksen, Harold. "September 2007." Description based on title screen as viewed on October 23, 2007. Includes bibliographical references (p.49-51). Also available in print.
10

Johansson, Öjvind. "Graph Decomposition Using Node Labels." Doctoral thesis, KTH, Numerical Analysis and Computer Science, NADA, 2001. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3213.

11

Humphries, Peter John. "Combinatorial Aspects of Leaf-Labelled Trees." Thesis, University of Canterbury. Mathematics and Statistics, 2008. http://hdl.handle.net/10092/1801.

Abstract:
Leaf-labelled trees are used commonly in computational biology and in other disciplines, to depict the ancestral relationships and present-day similarities between both extant and extinct species. Studying these trees from a mathematical perspective provides a foundation for developing tools and techniques that have practical applications. We begin by examining some quartet problems, namely determining the number of quartets that are required to infer the structure of a particular supertree. The quartet graph is introduced as a tool for tackling quartet problems, and is subsequently used to give new characterisations of compatible, definitive and identifying quartet sets. We then turn to investigating some properties of the subtrees induced by a collection of trees. This is motivated in part by the problem of reconstructing two or more trees simultaneously from their combined collection of subtrees. We also use some ideas drawn from Ramsey theory to show the existence of arbitrarily large common subtrees. Finally, we explore some extremal properties of the metric that is induced by the tree bisection and reconnection operation. This includes finding new (asymptotically) tight upper and lower bounds on both the size of the neighbourhoods in the metric space and on the diameter of the corresponding adjacency graph.
12

Willis, Paulette Nicole. "C*-algebras of labeled graphs and *-commuting endomorphisms." Diss., University of Iowa, 2010. https://ir.uiowa.edu/etd/627.

Abstract:
My research lies in the general area of functional analysis. I am particularly interested in C*-algebras and related dynamical systems. From the very beginning of the theory of operator algebras, in the works of Murray and von Neumann dating from the mid 1930's, dynamical systems and operator algebras have led a symbiotic existence. Murray and von Neumann's work grew from a few esoteric, but clearly original and prescient papers, to a major river of contemporary mathematics. My work lies at the confluence of two important tributaries to this river. On the one hand, the operator algebras that I study are C*-algebras that are built from graphs. On the other, the dynamical systems on which I focus are symbolic dynamical systems of various types. My goal is to use dynamical systems theory to construct new and interesting C*-algebras and to use the algebraic invariants of these algebras to reveal properties of the dynamics. My work has two fairly distinct strands: One deals with C*-algebras built from irreversible dynamical systems. The other deals with group actions on graph C*-algebras and their generalizations.
13

Ruan, Da. "Statistical methods for comparing labelled graphs." Thesis, Imperial College London, 2014. http://hdl.handle.net/10044/1/24963.

Abstract:
Due to the availability of the vast amount of graph-structured data generated in various experiment settings (e.g., biological processes, social connections), the need to rapidly identify network structural differences is becoming increasingly prevalent. In many fields, such as bioinformatics, social network analysis and neuroscience, graphs estimated from the same experimental settings are always defined on a fixed set of objects. We formalize such a problem as a labelled graph comparison problem. The main issue in this area, i.e. measuring the distance between graphs, has been extensively studied over the past few decades. Although a large distance value constitutes evidence of difference between graphs, we are more interested in the issue of inferentially justifying whether a distance value as large or larger than the observed distance could have been obtained simply by chance. However, little work has been done to provide the procedures of statistical inference necessary to formally answer this question. Permutation-based inference has been proposed as a theoretically sound approach and a natural way of tackling such a problem. However, the common permutation procedure is computationally expensive, especially for large graphs. This thesis contributes to the labelled graph comparison problem by addressing three different topics. Firstly, we analyse two labelled graphs by inferentially justifying their independence. A permutation-based testing procedure based on Generalized Hamming Distance (GHD) is proposed. We show rigorously that the permutation distribution is approximately normal for a large network, under three graph models with two different types of edge weights. The statistical significance can be evaluated without the need to resort to computationally expensive permutation procedures. Numerical results suggest the validity of this approximation. 
With the Topological Overlap edge weight, we suggest that the GHD test is a more powerful test to identify network differences. Secondly, we tackle the problem of comparing two large complex networks in which only localized topological differences are assumed. By applying the normal approximation for the GHD test, we propose an algorithm that can effectively detect localised changes in the network structure from two large complex networks. This algorithm is quickly and easily implemented. Simulations and applications suggest that it is a useful tool to detect subtle differences in complex network structures. Finally, we address the problem of comparing multiple graphs. For this topic, we analyse two different problems that can be interpreted as corresponding to two distinct null hypotheses: (i) a set of graphs are mutually independent; (ii) graphs in one set are independent of graphs in another set. Applications for the multiple graphs problem are commonly found in social network analysis (i) or neuroscience (ii). However, little work has been done to inferentially address the problem of comparing multiple networks. We propose two different statistical testing procedures for (i) and (ii), by again using a normality approximation for GHD. We extend the normality of GHD for the two graphs case to multiple cases, for hypotheses (i) and (ii), with two different permutation strategies. We further build a link between the test of group independence and an existing method, namely the Multivariate Exponential Random Graph Permutation model (MERGP). We show that by applying asymptotic normality, the maximum likelihood estimate of MERGP can be analytically derived. Therefore, the original, computationally expensive, inferential procedure of MERGP can be abandoned.
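
The computationally expensive permutation procedure that the normal approximation is meant to replace can be sketched as follows. This is an illustration under assumptions: it uses the plain Hamming distance on unweighted adjacency matrices rather than the thesis's Generalized Hamming Distance, and the function names and the `tail` parameter are hypothetical.

```python
import random


def hamming(a, b):
    """Number of vertex pairs on which two adjacency matrices disagree."""
    n = len(a)
    return sum(a[i][j] != b[i][j] for i in range(n) for j in range(i + 1, n))


def permutation_p_value(a, b, n_perm, rng, tail="lower"):
    """Monte Carlo permutation p-value for two graphs on the same vertex set.

    The vertex labels of b are shuffled n_perm times. With tail="lower" the
    p-value estimates how often a random relabeling is at least as close to a
    as the observed b; with tail="upper", at least as far.
    """
    n = len(a)
    observed = hamming(a, b)
    hits = 0
    for _ in range(n_perm):
        p = list(range(n))
        rng.shuffle(p)
        permuted = [[b[p[i]][p[j]] for j in range(n)] for i in range(n)]
        d = hamming(a, permuted)
        hits += (d <= observed) if tail == "lower" else (d >= observed)
    return (hits + 1) / (n_perm + 1)


# Two identical 6-vertex paths: few relabelings reproduce the path exactly.
path = [[1 if abs(i - j) == 1 else 0 for j in range(6)] for i in range(6)]
print(permutation_p_value(path, path, 499, random.Random(0)))
```

Each permutation costs a full pass over all vertex pairs, which is why this approach becomes impractical for large graphs.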
14

Huynh, Tony. "The Linkage Problem for Group-labelled Graphs." Thesis, University of Waterloo, University of Waterloo, 2009. http://hdl.handle.net/10012/4716.

Abstract:
This thesis aims to extend some of the results of the Graph Minors Project of Robertson and Seymour to "group-labelled graphs". Let $\Gamma$ be a group. A $\Gamma$-labelled graph is an oriented graph with its edges labelled from $\Gamma$, and is thus a generalization of a signed graph. Our primary result is a generalization of the main result from Graph Minors XIII. For any finite abelian group $\Gamma$, and any fixed $\Gamma$-labelled graph $H$, we present a polynomial-time algorithm that determines if an input $\Gamma$-labelled graph $G$ has an $H$-minor. The correctness of our algorithm relies on much of the machinery developed throughout the graph minors papers. We therefore hope it can serve as a reasonable introduction to the subject. Remarkably, Robertson and Seymour also prove that for any sequence $G_1, G_2, \dots$ of graphs, there exist indices $i
15

Hong, Hui. "Computing Label-Constraint Reachability in Graph Databases." Kent State University / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=kent1333472725.

16

Gurajada, Sairam. "Distributed querying of large labeled graphs." Advisor: Gerhard Weikum. Saarbrücken : Saarländische Universitäts- und Landesbibliothek, 2017. http://d-nb.info/1125431903/34.

17

Meng, Jinghan. "Flexible and Feasible Support Measures for Mining Frequent Patterns in Large Labeled Graphs." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6900.

Abstract:
In recent years, the popularity of graph databases has grown rapidly. This paper focuses on the single graph as an effective model to represent information, and on its related graph mining techniques. In frequent pattern mining in a single-graph setting, there are two main problems: support measure and search scheme. In this paper, we propose a novel framework for constructing support measures that brings together existing minimum-image-based and overlap-graph-based support measures. Our framework is built on the concept of occurrence / instance hypergraphs. Based on that, we present two new support measures, the minimum instance (MI) measure and the minimum vertex cover (MVC) measure, that combine the advantages of existing measures. In particular, we show that the existing minimum-image-based support measure is an upper bound of the MI measure, which is also linear-time computable and results in counts that are close to the number of instances of a pattern. Although the MVC measure is NP-hard, it can be approximated to a constant factor in polynomial time. We also provide polynomial-time relaxations for both measures and bounding theorems for all presented support measures in the hypergraph setting. We further show that the hypergraph-based framework can unify all support measures studied in this paper. This framework is also flexible in that more variants of support measures can be defined and profiled in it.
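
For orientation, the existing minimum-image-based support measure mentioned above can be sketched in a few lines. This is not the MI or MVC measure proposed in the thesis; the function name, the representation of occurrences as dictionaries, and the star-graph example are assumptions.

```python
def minimum_image_support(occurrences):
    """Minimum-image-based support of a pattern in a single graph.

    occurrences: list of dicts, each mapping every pattern vertex to the
    graph vertex it is matched to. For each pattern vertex, count its
    distinct images over all occurrences and return the smallest count.
    """
    if not occurrences:
        return 0
    pattern_vertices = occurrences[0].keys()
    return min(len({occ[v] for occ in occurrences}) for v in pattern_vertices)


# Pattern x-y matched in a star with centre "c" and leaves 1, 2, 3,
# where vertex labels force x onto the centre (assumed toy data).
star_occurrences = [{"x": "c", "y": 1}, {"x": "c", "y": 2}, {"x": "c", "y": 3}]
print(minimum_image_support(star_occurrences))
```

The star example has three occurrences but a support of 1, because all of them share the centre vertex.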
18

Gouveia da Silva, Thiago. "The Minimum Labeling Spanning Tree and Related Problems." Thesis, Avignon, 2018. http://www.theses.fr/2018AVIG0278.

Abstract:
Soit L un ensemble fini d’éléments appelés étiquettes. On appelle graphe étiqueté simple, un graphe simple dans lequel à chaque arête est associée une étiquette prise dans L. Le problème de l’arbre couvrant de nombre d’étiquettes minimal (en anglais: the minimum labeling spanning tree problem, MLSTP) est un problème d’optimisation combinatoire consistant à trouver un arbre couvrant dans un graphe étiqueté simple en utilisant un nombre minimum d’étiquettes. Le problème est NP-dur. Il a fait l’objet d’un nombre important de recherche au cours des dernières années. L’une de ces directions de recherche a par ailleurs conduit à l’étude d’une généralisation du problème dite problème généralisée de l’arbre couvrant de nombre d’étiquettes minimal(en anglais: the generalized minimum labeling spanning tree problem, GMLSTP). Le problème GMLSTP modélise les situations dans lesquelles plusieurs étiquettes peuvent être assignées à un arête. Les deux problèmes ont plusieurs applications pratiques dans des domaines importants tels que la conception de réseaux informatiques, la conception de réseaux de transport multimodaux et la compression de données. Nous proposons dans cette thèse plusieurs résultats théoriques contribuant à l’implantation de nouveaux schémas de résolution pratique de ces problèmes. En particulier, sur le plan théorique, nous avons introduit de nouveaux concepts, définitions, propriétés et théorèmes utiles, ainsi qu’une étude polyédrale du domaine des points réalisables d’une nouvelle formulation de GMLSTP. Cette formulation et son analyse ont permi le développement d’algorithmes de branchement et de coupe (branch-and-cut) pour la résolution exacte des problèmes. De nouvelles heuristiques ont été également développées — telles que l’algorithme basé sur la métaheuristique MSLB, et l’heuristique constructive pMVCA. Des résultats d’expériences numériques sur des benchmarks du problème MLSTP sont données. 
Elles démontrent la qualité des approches proposées dans cette thèse puisque, aussi bien pour les approches exactes qu’approchées, nous obtenons, comparativement à l’état de l’art du domaine, les meilleurs résultats de la littérature
The minimum labeling spanning tree problem (MLSTP) is a combinatorial optimization problem that consists in finding a spanning tree in a simple edge-labeled graph, i.e., a graph in which each edge has one associated label, using a minimum number of labels. It is an NP-hard problem that has attracted substantial research attention in recent years. In turn, the generalized minimum labeling spanning tree problem (GMLSTP) is a generalization of the MLSTP that allows multiple labels to be assigned to an edge. Both problems have several practical applications in important areas such as computer network design, multimodal transportation network design, and data compression. This thesis addresses several connectivity problems defined over edge-labeled graphs, in particular the minimum labeling spanning tree problem and its generalized version. The contributions of this work are both theoretical and practical. On the theoretical side, we have introduced new useful concepts, definitions, properties and theorems regarding edge-labeled graphs, as well as a polyhedral study of the GMLSTP. On the practical side, we have proposed new heuristics (such as the metaheuristic-based algorithm MSLB and the constructive heuristic pMVCA) and exact methods (such as new mathematical formulations and branch-and-cut algorithms) for solving the GMLSTP. Computational experiments over well-established benchmarks for the MLSTP are reported, showing that the new approaches introduced in this work achieve the best results for both heuristic and exact methods in comparison with the state-of-the-art methods in the literature.
O Problema da Árvore Geradora com Rotulação Mínima (MLSTP, do inglês minimum labeling spanning tree problem) é um problema de otimização combinatória que consiste em encontrar uma árvore de cobertura em um grafo com arestas rotuladas, isto é, um grafo no qual cada aresta possui um rótulo associado, utilizando o menor número de rótulos. Este problema é NP-difícil e tem atraído bastante atenção em pesquisas nos últimos anos. Por sua vez, o Problema Generalizado da Árvore Geradora com Rotulação Mínima (GMLSTP, do inglês generalized minimum labeling spanning tree problem) é uma generalização do MLSTP na qual se permite que múltiplos rótulos sejam associados a uma aresta. Ambos os problemas têm aplicações práticas em áreas importantes, como Projeto de Redes de Computadores, Projeto de Redes de Transporte Multimodais e Compactação de Dados. Esta tese aborda vários problemas de conectividade definidos em grafos com arestas rotuladas, em especial o Problema da Árvore Geradora com Rotulação Mínima e sua versão generalizada. As contribuições neste trabalho podem ser classificadas entre teóricas e práticas. Dentre as contribuições teóricas, introduzimos novos conceitos, definições, propriedades e teoremas úteis em relação a grafos com arestas rotuladas, bem como um estudo poliédrico sobre o GMLSTP. Dentre as contribuições práticas, propusemos novas heurísticas (como o algoritmo baseado na metaheurística MSLB e a heurística construtiva pMVCA) e métodos exatos (como novas formulações matemáticas e algoritmos branch-and-cut) para resolver o GMLSTP. Os experimentos computacionais realizados utilizando conjuntos de instâncias bem estabelecidos para o MLSTP são relatados, mostrando que as novas abordagens introduzidas neste trabalho alcançaram os melhores resultados para métodos heurísticos e exatos em comparação com o estado da arte da literatura.
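To make the MLSTP objective concrete, here is a minimal Python sketch of a greedy label-selection heuristic in the spirit of the classical maximum vertex covering idea: repeatedly add the label whose edges leave the fewest connected components. This is an illustrative assumption, not the pMVCA or MSLB methods of the thesis; the names `count_components` and `greedy_label_cover` are hypothetical.

```python
def count_components(n, edges):
    """Count connected components of a graph on vertices 0..n-1 (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def greedy_label_cover(n, labeled_edges):
    """Greedily pick labels until the edges carrying them connect all vertices.

    labeled_edges is a list of (u, v, label) triples. The result is a
    heuristic label set, not necessarily a minimum one (the problem is NP-hard).
    """
    all_labels = sorted({label for _, _, label in labeled_edges})
    chosen = []

    def components_with(labels):
        allowed = set(labels)
        return count_components(
            n, [(u, v) for u, v, label in labeled_edges if label in allowed]
        )

    current = n
    while current > 1:
        candidates = [label for label in all_labels if label not in chosen]
        if not candidates:
            raise ValueError("graph is not connected")
        best = min(candidates, key=lambda label: components_with(chosen + [label]))
        remaining = components_with(chosen + [best])
        if remaining == current:
            # No label reduces the component count: the graph is disconnected.
            raise ValueError("graph is not connected")
        chosen.append(best)
        current = remaining
    return chosen


edges = [(0, 1, "a"), (1, 2, "a"), (2, 3, "b"), (0, 3, "c"), (1, 3, "c")]
print(greedy_label_cover(4, edges))  # ['a', 'b']
```

Any spanning tree of the subgraph restricted to the returned labels is then a feasible solution; ties between labels are broken alphabetically here only to keep the sketch deterministic.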
19

Chatel, David. "Semi-supervised clustering in graphs." Thesis, Lille 1, 2017. http://www.theses.fr/2017LIL10134/document.

Abstract:
Le partitionnement consiste à rechercher une partition d'éléments, de sorte que les éléments d'un même cluster soient plus similaires que les éléments de différents clusters. Les données proviennent de différentes sources et prennent des formes différentes. L'un des défis consiste à concevoir un système capable de tirer parti des différentes sources de données. Certaines contraintes peuvent être connues sur les données. On peut savoir qu'un objet est d'un certain type ou que deux objets partagent le même type ou sont de types différents. On peut également savoir qu'à l'échelle globale, les différents types d'objets apparaissent avec une fréquence connue. Dans cette thèse, nous nous concentrons sur le partitionnement avec trois types de contraintes: les contraintes d'étiquettes, les contraintes de paires et les contraintes de lois de puissance. Une contrainte d'étiquette spécifie dans quel cluster appartient un objet. Les contraintes par paire spécifient que les paires d'objets doivent ou ne doivent pas partager le même cluster. Enfin, la contrainte de loi de puissance est une contrainte globale qui spécifie que la distribution des tailles de cluster est soumise à une loi de puissance. Nous voulons montrer que l'introduction de la semi-supervision aux algorithmes de clustering peut modifier et améliorer les solutions retournées par des algorithmes de clustering non supervisés. Nous contribuons à cette question en proposant des algorithmes pour chaque type de contraintes. Nos expériences sur les ensembles de données UCI et les jeux de données en langage naturel montrent la bonne performance de nos algorithmes et donnent des indications pour des travaux futurs prometteurs
Clustering is the task of finding a partition of items such that items in the same cluster are more similar than items in different clusters. One challenge consists in designing a system capable of taking advantage of the different sources of data. Among the different forms a piece of data can take, the description of an object can take the form of a feature vector: a list of attributes, each of which takes a value. Objects can also be described by a graph which captures the relationships objects have with each other. In addition, some constraints can be known about the data. It can be known that an object is of a certain type, or that two objects share the same type or are of different types. It can also be known that, on a global scale, the different types of objects appear with a known frequency. In this thesis, we focus on clustering with three different types of constraints: label constraints, pairwise constraints and power-law constraints. A label constraint specifies to which cluster an object belongs. Pairwise constraints specify that pairs of objects should or should not share the same cluster. Finally, the power-law constraint is a cluster-level constraint that specifies that the distribution of cluster sizes is subject to a power law. We want to show that introducing semi-supervision to clustering algorithms can alter and improve the solutions returned by unsupervised clustering algorithms. We contribute to this question by proposing algorithms for each type of constraint. Our experiments on UCI data sets and natural language processing data sets show the good performance of our algorithms and give hints towards promising future work.
20

Okoth, Isaac Owino. "Combinatorics of oriented trees and tree-like structures." Thesis, Stellenbosch : Stellenbosch University, 2015. http://hdl.handle.net/10019.1/96860.

Abstract:
Thesis (PhD)--Stellenbosch University, 2015.
ENGLISH ABSTRACT : In this thesis, a number of combinatorial objects are enumerated. Du and Yin as well as Shin and Zeng (by a different approach) proved an elegant formula for the number of labelled trees with respect to a given in degree sequence, where each edge is oriented from a vertex of lower label towards a vertex of higher label. We refine their result to also take the number of sources (vertices of in degree 0) or sinks (vertices of out degree 0) into account. We find formulas for the mean and variance of the number of sinks or sources in these trees. We also obtain a differential equation and a functional equation satisfied by the generating function for these trees. Analogous results for labelled trees with two marked vertices, related to functional digraphs, are also established. We extend the work to count reachable vertices, sinks and leaf sinks in these trees. Among other results, we obtain a counting formula for the number of labelled trees on n vertices in which exactly k vertices are reachable from a given vertex v and also the average number of vertices that are reachable from a specified vertex in labelled trees of order n. In this dissertation, we also enumerate certain families of set partitions and related tree-like structures. We provide a proof for a formula that counts connected cycle-free families of k set partitions of {1, . . . , n} satisfying a certain coherence condition and then establish a bijection between these families and the set of labelled free k-ary cacti with a given vertex-degree distribution. We then show that the formula also counts coloured Husimi graphs in which there are no blocks of the same colour that are incident to one another. We extend the work to count coloured oriented cacti and coloured cacti. Noncrossing trees and related tree-like structures are also considered in this thesis. 
Specifically, we establish formulas for locally oriented noncrossing trees with a given number of sources and sinks, and also with given indegree and outdegree sequences. The work is extended to obtain the average number of reachable vertices in these trees. We then generalise the concept of noncrossing trees to find formulas for the number of noncrossing Husimi graphs, cacti and oriented cacti. The study is further extended to find formulas for the number of bicoloured noncrossing Husimi graphs and the number of noncrossing connected cycle-free pairs of set partitions.
AFRIKAANSE OPSOMMING : In hierdie tesis word ’n aantal kombinatoriese objekte geenumereer. Du en Yin asook Shin en Zeng (deur middel van ’n ander benadering) het ’n elegante formule vir die aantal geëtiketteerde bome met betrekking tot ’n gegewe ingangsgraadry, waar elke lyn van die nodus met die kleiner etiket na die nodus met die groter etiket toe georiënteer word. Ons verfyn hul resultaat deur ook die aantal bronne (nodusse met ingangsgraad 0) en putte (nodusse met uitgangsgraad 0) in ag te neem. Ons vind formules vir die gemiddelde en variansie van die aantal putte of bronne in hierdie bome. Ons bepaal verder ’n differensiaalvergelyking en ’n funksionaalvergelyking wat deur die voortbringende funksie van hierdie bome bevredig word. Analoë resultate vir geëtiketteerde bome met twee gemerkte nodusse (wat verwant is aan funksionele digrafieke), is ook gevind. Ons gaan verder voort deur ook bereikbare nodusse, bronne en putte in hierdie bome at te tel. Onder andere verkry ons ’n formule vir die aantal geëtiketteerde bome met n nodusse waarin presies k nodusse vanaf ’n gegewe nodus v bereikbaar is asook die gemiddelde aantal nodusse wat bereikbaar is vanaf ’n gegewe nodus. Ons enumereer in hierdie tesis verder sekere families van versamelingsverdelings en soortgelyke boom-vormige strukture. Ons gee ’n bewys vir ’n formule wat die aantal van samehangende siklus-vrye families van k versamelingsverdelings op {1, . . . , n} wat ’n sekere koherensie-vereiste bevredig, en ons beskryf ’n bijeksie tussen hierdie familie en die versameling van geëtiketteerde vrye k-êre kaktusse met ’n gegewe nodus-graad-verdeling. Ons toon ook dat hierdie formule ook gekleurde Husimi-grafieke tel waar blokke van dieselfde kleur nie insident met mekaar mag wees nie. Ons tel verder ook gekleurde georiënteerde kaktusse en gekleurde kaktusse. Nie-kruisende bome en soortgelyke boom-vormige strukture word in hierdie tesis ook beskou. 
On bepaal spesifiek formules vir lokaal georiënteerde nie-kruisende bome wat ’n gegewe aantal bronne en putte het asook nie-kruisende bome met gegewe ingangs- en uitgangsgraadrye. Ons gaan voort deur die gemiddelde aantal bereikbare nodusse in hierdie bome te bepaal. Ons veralgemeen dan die konsep van nie-kruisende bome en vind formules vir die aantal nie-kruisende Husimi-grafieke, kaktusse en georiënteerde kaktusse. Laastens vind ons ’n formule vir die aantaal tweegekleurde nie-kruisende Husimi-grafieke en die aantal nie-kruisende samehangende siklus-vrye pare van versamelingsverdelings.
21

Jönsson, Mattias, and Lucas Borg. "How to explain graph-based semi-supervised learning for non-mathematicians?" Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20339.

Abstract:
Den stora mängden tillgänglig data på internet kan användas för att förbättra förutsägelser genom maskininlärning. Problemet är att sådan data ofta är i ett obehandlat format och kräver att någon manuellt bestämmer etiketter på den insamlade datan innan den kan användas av algoritmen. Semi-supervised learning (SSL) är en teknik där algoritmen använder ett fåtal förbehandlade exempel och därefter automatiskt bestämmer etiketter för resterande data. Ett tillvägagångssätt inom SSL är att representera datan i en graf, vilket kallas för graf-baserad semi-supervised learning (GSSL), och sedan hitta likheter mellan noderna i grafen för att automatiskt bestämma etiketter.Vårt mål i denna uppsatsen är att förenkla de avancerade processerna och stegen för att implementera en GSSL-algoritm. Vi kommer att gå igen grundläggande steg som hur utvecklingsmiljön ska installeras men även mer avancerade steg som data pre-processering och feature extraction. Feature extraction metoderna som uppsatsen använder sig av är bag-of-words (BOW) och term frequency-inverse document frequency (TF-IDF). Slutgiltligen presenterar vi klassificering av dokument med Label Propagation (LP) och Multinomial Naive Bayes (MNB) samt en detaljerad beskrivning över hur GSSL fungerar.Vi presenterar även prestanda för klassificering-algoritmerna genom att klassificera 20 Newsgroup datasetet med LP och MNB. Resultaten dokumenteras genom två olika utvärderingspoäng vilka är F1-score och accuracy. Vi gör även en jämförelse mellan MNB och LP med två olika typer av kärnor, KNN och RBF, på olika mängder av förbehandlade träningsdokument. Resultaten ifrån klassificering-algoritmerna visar att MNB är bättre på att klassificera datasetet än LP.
The large amount of available data on the web can be used to improve the predictions made by machine learning algorithms. The problem is that such data is often in a raw format and needs to be manually labeled by a human before it can be used by a machine learning algorithm. Semi-supervised learning (SSL) is a technique where the algorithm uses a few prepared samples to automatically label the rest of the data. One approach to SSL is to represent the data in a graph, called graph-based semi-supervised learning (GSSL), and find similarities between the nodes for automatic labeling. Our goal in this thesis is to simplify the advanced processes and steps needed to implement a GSSL algorithm. We cover basic tasks such as setting up the development environment and more advanced steps such as data preprocessing and feature extraction. The feature extraction techniques covered are bag-of-words (BOW) and term frequency-inverse document frequency (TF-IDF). Lastly, we present how to classify documents using Label Propagation (LP) and Multinomial Naive Bayes (MNB), with a detailed explanation of the inner workings of GSSL. We showcase the classification performance by classifying documents from the 20 Newsgroup dataset using LP and MNB. The results are documented using two different evaluation scores, F1-score and accuracy. A comparison between MNB and the LP algorithm using two different types of kernels, KNN and RBF, was made on different amounts of labeled documents. The results from the classification algorithms show that MNB is better at classifying the data than LP.
22

Planche, Léo. "Décomposition de graphes en plus courts chemins et en cycles de faible excentricité." Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCB224.

Abstract:
En collaboration avec des chercheurs en biologie à Jussieu, nous étudions des graphes issus de données biologiques afin de d'en améliorer la compréhension. Ces graphes sont constitués à partir de fragments d'ADN, nommés reads. Chaque read correspond à un sommet, et deux sommets sont reliés si les deux séquences d'ADN correspondantes ont un taux de similarité suffisant. Ainsi se forme des graphes ayant une structure bien particulière que nous nommons hub-laminaire. Un graphe est dit hub-laminaire s'il peut être résumé en quelques plus courts chemins dont tous les sommets du graphe soient proche. Nous étudions en détail le cas où le graphe est composé d'un unique plus court chemin d'excentricité faible, ce problème a été initialement défini par Dragan 2017. Nous améliorons la preuve d'un algorithme d'approximation déjà existant et en proposons un nouveau, effectuant une 3-approximation en temps linéaire. De plus, nous analysons le lien avec le problème de k-laminarité défini par Habib 2016, ce dernier consistant en la recherche d'un diamètre de faible excentricité. Nous étudions ensuite le problème du cycle isométrique de plus faible excentricité. Nous montrons que ce problème est NP-complet et proposons deux algorithmes d'approximations. Nous définissons ensuite précisément la structure "hub-laminaire" et présentons un algorithme d'approximation en temps O(nm). Nous confrontons cet algorithme à des graphes générés par une procédure aléatoire et l'appliquons à nos données biologiques. Pour finir nous montrons que le calcul du cycle isométrique d'excentricité minimale permet le plongement d'un graphe dans un cercle avec une distorsion multiplicative faible. Le calcul d'une décomposition hub-laminaire permet quant à lui une représentation compacte des distances avec une distorsion additive bornée
In collaboration with researchers in biology at Université Pierre et Marie Curie, we study graphs derived from biological data in order to improve our understanding of them. These graphs are built from DNA fragments, called reads. Each read is a vertex, and two vertices are linked if the corresponding DNA sequences are similar enough. Such graphs have a particular structure that we call hub-laminar. A graph is said to be hub-laminar if it can be represented as a (small) set of shortest paths such that every vertex of the graph is close to one of those paths. We first study the case where the graph is composed of a unique shortest path of low eccentricity. This problem was first defined by Dragan 2017. We improve the proof of an existing approximation algorithm and propose a new one, a 3-approximation running in linear time. Furthermore, we show its link with the k-laminar problem defined by Habib 2016, which consists in finding a diameter of low eccentricity. We then define and study the problem of the isometric cycle of minimal eccentricity. We show that this problem is NP-complete and propose two approximation algorithms. We then properly define a hub-laminar decomposition and give an approximation algorithm running in O(nm) time. We test this algorithm on randomly generated graphs and apply it to our biological data. Finally, we show that computing an isometric cycle of low eccentricity makes it possible to embed a graph into a cycle with low multiplicative distortion. Computing a hub-laminar decomposition yields a compact representation of distances with low additive distortion.
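The notion of a path of low eccentricity used above can be illustrated with a short Python sketch. It assumes the usual definition (the eccentricity of a path is the largest distance from any vertex of the graph to its nearest path vertex) and computes it with a multi-source breadth-first search. The function name `path_eccentricity` is hypothetical, and the sketch does not check that the given path is actually a shortest path.

```python
from collections import deque


def path_eccentricity(adj, path):
    """Largest distance from any vertex to the nearest vertex of `path`.

    adj maps each vertex to a list of neighbours (undirected, unweighted).
    """
    # Multi-source BFS: every path vertex starts at distance 0.
    dist = {v: 0 for v in path}
    queue = deque(path)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    if len(dist) != len(adj):
        raise ValueError("graph is not connected")
    return max(dist.values())


# A path 0-1-2-3 with a pendant branch 1-4-5.
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1, 5], 5: [4]}
print(path_eccentricity(adj, [0, 1, 2, 3]))  # 2
```

The search visits each vertex and edge once, so a single candidate path is evaluated in linear time; finding the path that minimises this value is the harder problem the abstract addresses.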
23

Dash, Santanu Kumar. "Adaptive constraint solving for information flow analysis." Thesis, University of Hertfordshire, 2015. http://hdl.handle.net/2299/16354.

Abstract:
In program analysis, unknown properties for terms are typically represented symbolically as variables. Bound constraints on these variables can then specify multiple optimisation goals for computer programs and find application in areas such as type theory, security, alias analysis and resource reasoning. Resolution of bound constraints is a problem steeped in graph theory; interdependencies between the variables are represented as a constraint graph. Additionally, constants are introduced into the system as concrete bounds over these variables, and constants themselves are ordered over a lattice which is, once again, represented as a graph. Despite graph algorithms being central to bound constraint solving, most approaches to program optimisation that use bound constraint solving have treated their graph theoretic foundations as a black box. Little has been done to investigate the computational costs or design efficient graph algorithms for constraint resolution. Emerging examples of these lattices and bound constraint graphs, particularly from the domain of language-based security, are showing that these graphs and lattices are structurally diverse and could be arbitrarily large. Therefore, there is a pressing need to investigate the graph theoretic foundations of bound constraint solving. In this thesis, we investigate the computational costs of bound constraint solving from a graph theoretic perspective for Information Flow Analysis (IFA); IFA is a sub-field of language-based security which verifies whether confidentiality and integrity of classified information is preserved as it is manipulated by a program. We present a novel framework based on graph decomposition for solving the (atomic) bound constraint problem for IFA. Our approach enables us to abstract away from connections between individual vertices to those between sets of vertices in both the constraint graph and an accompanying security lattice which defines ordering over constants. 
Thereby, we are able to achieve significant speedups compared to state-of-the-art graph algorithms applied to bound constraint solving. More importantly, our algorithms are highly adaptive in nature and seamlessly adapt to the structure of the constraint graph and the lattice. The computational cost of our approach is a function of the latent scope of decomposition in the constraint graph and the lattice; therefore, we enjoy the fastest runtime for every point in the structure-spectrum of these graphs and lattices. While the techniques in this dissertation are developed with IFA in mind, they can be extended to other applications of the bound constraints problem, such as type inference and program analysis frameworks which use annotated type systems, where constants are ordered over a lattice.
24

Alise, Dario Fioravante. "Algoritmo di "Label Propagation" per il clustering di documenti testuali." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14388/.

Abstract:
In the final years of the last century, the advent of the Internet made countless quantities of text available for online consultation, coming both from books and magazines and from new forms of online communication such as email, forums, newsgroups and chat.
The solutions adopted in the field of Text Mining (hereafter abbreviated TM), the extension of Data Mining aimed at unstructured textual data, rest on foundations from computer science, statistics and linguistics, and are in principle applicable to documents of any size.
With the advent of social networks, the quantity and size of the textual data to be analysed has grown sub-exponentially, and although the available techniques remain valid and applicable, over the last four or five years research has focused on an emerging technique called semantic hashing, which makes it possible to map documents of any kind to binary strings.
Building on this new branch of research, the main aim of this thesis is to define, design and implement a clustering algorithm that, taking these binary data as input, can label them more accurately and in less time than the other approaches in the literature.
After a description of the main TM techniques, there follows a discussion of semantic hashing and of the theoretical foundations on which it rests; the algorithm used for clustering is then introduced, together with its architecture and its implementation.
Finally, the results of running the algorithm, hereafter called Label Propagation (abbreviated LP), are compared and analysed against those obtained with standard techniques.
25

Attal, Jean-Philippe. "Nouveaux algorithmes pour la détection de communautés disjointes et chevauchantes basés sur la propagation de labels et adaptés aux grands graphes." Thesis, Cergy-Pontoise, 2017. http://www.theses.fr/2017CERG0842/document.

Abstract:
Les graphes sont des structures mathématiques capables de modéliser certains systèmes complexes. Une des nombreuses problématiques liées aux graphes concerne la détection de communautés, qui vise à trouver une partition en sommets d'un graphe en vue d'en comprendre la structure. À titre d'exemple, en représentant des contrats d'assurances par des noeuds et leurs degrés de similarité par une arête, détecter des groupes de noeuds fortement connectés conduit à détecter des profils similaires, et donc à voir des profils à risques. De nombreux algorithmes ont essayé de répondre à ce problème. Une des méthodes est la propagation de labels, qui consiste à ce que chaque noeud puisse recevoir un label par un vote majoritaire de ses voisins. Bien que cette méthode soit simple à mettre en oeuvre, elle présente une grande instabilité due au non-déterminisme de l'algorithme et peut dans certains cas ne pas détecter de structures communautaires. La première contribution de cette thèse sera de i) proposer une méthode de stabilisation de la propagation de labels tout en appliquant des barrages artificiels pour limiter les possibles mauvaises propagations. Les réseaux complexes ont également comme caractéristique que certains noeuds puissent appartenir à plusieurs communautés, on parle alors de recouvrements. C'est en ce sens que la seconde contribution de cette thèse portera sur ii) la création d'un algorithme auquel seront adjointes des fonctions d'appartenances pour détecter de possibles recouvrements via des noeuds candidats au chevauchement. La taille des graphes est également une notion à considérer dans la mesure où certains réseaux peuvent contenir plusieurs millions de noeuds et d'arêtes. Nous proposons iii) une version parallèle et distribuée de la détection de communautés en utilisant la propagation de labels par coeur. Une étude comparative sera effectuée pour observer la qualité de partitionnement et de recouvrement des algorithmes proposés.
Graphs are mathematical structures consisting of a set of nodes (objects or persons) in which some pairs are linked by edges. Graphs can be used to model complex systems. One of the main problems in graph theory is the community detection problem, which aims to find a partition of the nodes of a graph in order to understand its structure. For instance, by representing insurance contracts as nodes and their relationships as edges, detecting groups of highly connected nodes leads to detecting similar profiles and to evaluating risk profiles. Several algorithms have been proposed in this currently open research field. One of the fastest methods is label propagation. It is a local method in which each node changes its own label according to its neighbourhood. Unfortunately, this method has two major drawbacks. The first is instability: separate runs rarely give the same result. The second is bad propagation, which can lead to huge communities without meaning (the giant communities problem). The first contribution of the thesis is i) a stabilisation method for label propagation that places artificial dams on the edges of some networks in order to limit bad label propagation. Complex networks are also characterized by nodes that may belong to several communities; this is called a cover. For example, in protein–protein interaction networks, some proteins may have several functions. Detecting these functions according to their communities could help to cure cancers. 
The second contribution of this thesis is ii) the implementation of an algorithm with functions to detect potential overlapping nodes. The size of the graphs must also be considered, because some networks contain several million nodes and edges, such as the Amazon product co-purchasing network. We propose iii) a parallel and distributed version of community detection using core label propagation. A comparative analysis of the proposed algorithms is carried out, based on the quality of the resulting partitions and covers.
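The basic label propagation scheme described in this abstract can be sketched in a few lines of Python. This is a plain, deterministic textbook variant and not the stabilised, overlapping or distributed algorithms of the thesis: each node starts with its own label and repeatedly adopts the most frequent label among its neighbours. Nodes are visited in sorted order and ties go to the largest label, both arbitrary assumptions made here for reproducibility; common formulations randomise the order and the tie-breaking, which is one source of the instability mentioned above. The name `propagate_labels` is hypothetical.

```python
from collections import Counter


def propagate_labels(adj, max_iters=100):
    """Asynchronous label propagation on an undirected graph.

    adj maps each node to a list of neighbours. Returns a dict node -> label.
    """
    labels = {v: v for v in adj}  # every node starts in its own community
    for _ in range(max_iters):
        changed = False
        for v in sorted(adj):
            if not adj[v]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[u] for u in adj[v])
            top = max(counts.values())
            # Deterministic tie-break (assumption): take the largest label.
            best = max(label for label, c in counts.items() if c == top)
            if labels[v] != best:
                labels[v] = best
                changed = True
        if not changed:
            break  # no node changed its label during a full pass
    return labels


# Two triangles joined by the single edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(propagate_labels(adj))
```

On this small graph the two triangles end up with two distinct labels. With a different tie-breaking rule or visiting order, one label can flood across the bridge edge, which is a miniature version of the giant-community problem the abstract describes.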
26

Huang, Sangxia. "Hardness of Constraint Satisfaction and Hypergraph Coloring : Constructions of Probabilistically Checkable Proofs with Perfect Completeness." Doctoral thesis, KTH, Teoretisk datalogi, TCS, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-168576.

Abstract:
A Probabilistically Checkable Proof (PCP) of a mathematical statement is a proof written in a special manner that allows for efficient probabilistic verification. The celebrated PCP Theorem states that for every family of statements in NP, there is a probabilistic verification procedure that checks the validity of a PCP proof by reading only 3 bits from it. This landmark theorem, and the works leading up to it, laid the foundation for many subsequent works in computational complexity theory, the most prominent among them being the study of inapproximability of combinatorial optimization problems. This thesis focuses on a broad class of combinatorial optimization problems called Constraint Satisfaction Problems (CSPs). In an instance of a CSP problem of arity k, we are given a set of variables taking values from some finite domain, and a set of constraints each involving a subset of at most k variables. The goal is to find an assignment that simultaneously satisfies as many constraints as possible. An alternative formulation of the goal that is commonly used is Gap-CSP, where the goal is to decide whether a CSP instance is satisfiable or far from satisfiable, where the exact meaning of being far from satisfiable varies depending on the problem. We first study Boolean CSPs, where the domain of the variables is {0,1}. The main question we study is the hardness of distinguishing satisfiable Boolean CSP instances from those for which no assignment satisfies more than some epsilon fraction of the constraints. Intuitively, as the arity increases, the CSP gets more complex and thus the hardness parameter epsilon should decrease. We show that for Boolean CSPs of arity k, it is NP-hard to distinguish satisfiable instances from those that are at most 2^{~O(k^{1/3})}/2^k-satisfiable. We also study coloring of graphs and hypergraphs. Given a graph or a hypergraph, a coloring is an assignment of colors to vertices, such that all edges or hyperedges are non-monochromatic. 
The gap problem is to distinguish instances that are colorable with a small number of colors, from those that require a large number of colors. For graphs, we prove that there exists a constant K_0>0, such that for any K >= K_0, it is NP-hard to distinguish K-colorable graphs from those that require 2^{Omega(K^{1/3})} colors. For hypergraphs, we prove that it is quasi-NP-hard to distinguish 2-colorable 8-uniform hypergraphs of size N from those that require 2^{(log N)^{1/4-o(1)}} colors. In terms of techniques, all these results are based on constructions of PCPs with perfect completeness, that is, PCPs where the probabilistic proof verification procedure always accepts a correct proof. Not only is this a very natural property for proofs, but it can also be an essential requirement in many applications. It has always been particularly challenging to construct PCPs with perfect completeness for NP statements due to limitations in techniques. Our improved hardness results build on and extend many of the current approaches. Our Boolean CSP result and graph coloring result were proved by adapting the Direct Sum of PCPs idea by Siu On Chan to the perfect completeness setting. Our proof for hypergraph coloring hardness improves and simplifies the recent work by Khot and Saket, in which they proposed the notion of superposition complexity of CSPs.
A probabilistically checkable proof (PCP) of a mathematical statement is a proof written in a special way that allows efficient probabilistic verification. The celebrated PCP theorem says that for every family of statements in NP there is a probabilistic verifier that checks whether a PCP proof is valid by reading only 3 bits from it. This groundbreaking theorem, and the works that led up to it, laid the foundation for many later works in complexity theory, above all in the study of the approximability of combinatorial optimization problems. In this thesis we focus on a broad class of optimization problems in the form of constraint satisfaction problems (CSPs). An instance of a CSP of arity k is given by a set of variables taking values from some finite domain, and a number of constraints, each of which depends on a subset of at most k variables. The goal is to find an assignment to the variables that simultaneously satisfies as many of the constraints as possible. An alternative formulation of the goal that is often used is Gap-CSP, where the goal is to decide whether a CSP instance is satisfiable or far from satisfiable, where the exact meaning of being "far from satisfiable" varies depending on the problem.
First we study Boolean CSPs, where the domain is {0,1}. The question we study is the hardness of distinguishing satisfiable Boolean CSP instances from instances where the best assignment satisfies at most a fraction epsilon of the constraints. Intuitively, as the arity increases, CSPs become more complex, and so the hardness parameter epsilon should decrease with increasing arity. This turns out to be true, and a first result is that for Boolean CSPs of arity k it is NP-hard to distinguish satisfiable instances from those that are at most 2^{~O(k^{1/3})}/2^k-satisfiable. We further study coloring of graphs and hypergraphs. Given a graph or a hypergraph, a coloring is an assignment of colors to the vertices such that no edge or hyperedge is monochromatic. The problem we analyze is to distinguish instances that are colorable with a small number of colors from those that need many colors. For graphs we show that there is a constant K_0>0 such that for all K >= K_0 it is NP-hard to distinguish graphs that are K-colorable from those that require at least 2^{Omega(K^{1/3})} colors. For hypergraphs we show that it is quasi-NP-hard to distinguish 2-colorable 8-uniform hypergraphs with N vertices from those that require at least 2^{(log N)^{1/4-o(1)}} colors. All of these results build on constructions of PCPs with perfect completeness, that is, PCPs where the verifier always accepts a correct proof. Not only is this a very natural property for PCPs, but it can also be a necessary requirement for certain applications. Constructing PCPs with perfect completeness for NP statements raises technical complications and partly requires the development of new methods. Our Boolean CSP result and our coloring result are proved by adapting the "direct sum method" introduced by Siu On Chan to the case of perfect completeness. Our proof of hypergraph coloring hardness improves and simplifies a recent result by Khot and Saket, in which they proposed the notion of superposition complexity of CSPs.
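To make the CSP objective in this abstract concrete, here is a minimal Python sketch that computes, by exhaustive search, the largest fraction of constraints that any Boolean assignment can satisfy. The function name `best_satisfied_fraction`, the constraint encoding, and the triangle instance are illustrative assumptions and are not taken from the thesis. Exhaustive search takes time exponential in the number of variables, so this only illustrates the definition, not a practical algorithm.

```python
from itertools import product


def best_satisfied_fraction(num_vars, constraints):
    """Return the largest fraction of constraints any Boolean assignment satisfies.

    Each constraint is a pair (variables, predicate): `variables` is a tuple of
    variable indices and `predicate` maps their 0/1 values to True or False.
    """
    best = 0
    # Try every assignment in {0,1}^num_vars (illustration only: exponential time).
    for assignment in product((0, 1), repeat=num_vars):
        satisfied = sum(
            1
            for variables, predicate in constraints
            if predicate(*(assignment[v] for v in variables))
        )
        best = max(best, satisfied)
    return best / len(constraints)


def not_equal(a, b):
    return a != b


# Hypothetical arity-2 instance: x0 != x1, x1 != x2, x0 != x2 (an odd cycle).
triangle = [((0, 1), not_equal), ((1, 2), not_equal), ((0, 2), not_equal)]
print(best_satisfied_fraction(3, triangle))
```

On the triangle instance every assignment violates at least one of the three constraints, so the value is 2/3. A gap problem asks only to tell instances with value 1 apart from instances whose value is below some threshold epsilon.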


27

Jia, Wei. "Image analysis and representation for textile design classification." Thesis, University of Dundee, 2011. https://discovery.dundee.ac.uk/en/studentTheses/c667f279-d7a6-4670-b23e-c9dbe2784266.

Full text
Abstract:
A good image representation is vital for image comparison and classification; it may affect the classification accuracy and efficiency. The purpose of this thesis was to explore novel and appropriate image representations. Another aim was to investigate these representations for image classification. Finally, novel features were examined for improving image classification accuracy. Images of interest to this thesis were textile design images. The motivation for analysing textile design images is to help designers browse images, fuel their creativity, and improve their design efficiency. In recent years, the bag-of-words model has been shown to be a good base for image representation, and there have been many attempts to go beyond this representation. Bag-of-words models have been used frequently in the classification of image data, due to good performance and simplicity. “Words” in images can have different definitions and are obtained through steps of feature detection, feature description, and codeword calculation. The model represents an image as an orderless collection of local features. However, discarding the spatial relationships of local features limits the power of this model. This thesis explored two novel image representations, the bag of shapes model and the region label graphs model, both of which were based on the bag-of-words model. In both models, an image was represented by a collection of segmented regions, and each region was described by shape descriptors. In the latter model, graphs were constructed to capture the spatial information between groups of segmented regions, and graph features were calculated from graph-theoretic quantities. Novel elements include use of Markov random fields (MRFs) to extract printed designs and woven patterns from textile images, utilisation of the extractions to form bag of shapes models, and construction of region label graphs to capture the spatial information. The extraction of textile designs was formulated as a pixel labelling problem. 
Algorithms for MRF optimisation and re-estimation were described and evaluated. A method for quantitative evaluation was presented and used to compare the performance of MRFs optimised using alpha-expansion and iterated conditional modes (ICM), both with and without parameter re-estimation. The results were used in the formation of the bag of shapes and region label graphs models. Bag of shapes model was a collection of MRFs' segmented regions, and the shape of each region was described with generic Fourier descriptors. Each image was represented as a bag of shapes. A simple yet competitive classification scheme based on nearest neighbour class-based matching was used. Classification performance was compared to that obtained when using bags of SIFT features. To capture the spatial information, region label graphs were constructed to obtain graph features. Regions with the same label were treated as a group and each group was associated uniquely with a vertex in an undirected, weighted graph. Each region group was represented as a bag of shape descriptors. Edges in the graph denoted either the extent to which the groups' regions were spatially adjacent or the dissimilarity of their respective bags of shapes. Series of unweighted graphs were obtained by removing edges in order of weight. Finally, an image was represented using its shape descriptors along with features derived from the chromatic numbers or domination numbers of the unweighted graphs and their complements. Linear SVM classifiers were used for classification. Experiments were implemented on data from Liberty Art Fabrics, which consisted of more than 10,000 complicated images mainly of printed textile designs and woven patterns. Experimental data was classified into seven classes manually by assigning each image a text descriptor based on content or design type. The seven classes were floral, paisley, stripe, leaf, geometric, spot, and check. 
The results showed that reasonable and interesting regions were obtained from MRF segmentation, in which alpha-expansion with parameter re-estimation performed better than alpha-expansion without parameter re-estimation or ICM. This result was promising not only for textile CAD (Computer-Aided Design), where it can support redesign of the textile image, but also for image representation. It was also found that the bag of shapes model based on MRF segmentation achieved classification accuracy comparable to that of bags of SIFT features in the framework of nearest neighbour class-based matching. Finally, the results indicated that incorporating graph features extracted by constructing region label graphs can improve the classification accuracy compared to both the bag of shapes model and the bag of SIFT features model.
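The graph features described in this abstract (chromatic numbers of a series of unweighted graphs obtained by removing edges in order of weight) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the choice to drop the lightest edge first, and the three-vertex example are hypothetical, and the exact backtracking search is practical only for small graphs. It is not the thesis's implementation.

```python
def chromatic_number(vertices, edges):
    """Exact chromatic number by backtracking; only practical for small graphs."""
    vertices = list(vertices)
    adjacent = {v: set() for v in vertices}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)

    def colorable(k, index, coloring):
        # Try to extend a proper k-coloring to vertices[index:].
        if index == len(vertices):
            return True
        v = vertices[index]
        for color in range(k):
            if all(coloring.get(n) != color for n in adjacent[v]):
                coloring[v] = color
                if colorable(k, index + 1, coloring):
                    return True
                del coloring[v]
        return False

    for k in range(1, len(vertices) + 1):
        if colorable(k, 0, {}):
            return k
    return 0  # graph with no vertices


def chromatic_feature_series(vertices, weighted_edges):
    """Chromatic number of each graph obtained by dropping one edge at a time.

    Assumption: edges are removed from lightest to heaviest.
    """
    remaining = sorted(weighted_edges, key=weighted_edges.get)
    features = [chromatic_number(vertices, remaining)]
    while remaining:
        remaining = remaining[1:]
        features.append(chromatic_number(vertices, remaining))
    return features


# Hypothetical region-group graph: three label groups with pairwise weights.
groups = ["a", "b", "c"]
weights = {("a", "b"): 0.2, ("b", "c"): 0.5, ("a", "c"): 0.9}
print(chromatic_feature_series(groups, weights))
```

For this example the series is [3, 2, 2, 1]: the full triangle needs three colors, and the number drops as edges are removed. A feature vector of this kind could then be concatenated with shape descriptors before classification.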
28

Chu, Sheng-Chih, and 朱聖池. "Efficiently Finding Neighborhood Patterns in a Large Labeled Graph." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/10254639751782455142.

Full text
Abstract:
Master's thesis
National Taiwan Normal University
Department of Computer Science and Information Engineering
103
A graph is a powerful abstraction of structural data and is used to model the various relations among data in the real world. Recently, a new kind of pattern, called the frequent neighborhood pattern, was defined for a large labeled graph. Frequent neighborhood patterns have the downward closure property of the support measure and provide meaningful interpretations of pattern mining. The previous work used an Apriori-like approach to combine the discovered frequent neighborhood patterns into larger candidate patterns, but many of the generated candidates may not appear in the graph. In this thesis, we propose an algorithm, called the gSFNP algorithm, to find frequent neighborhood patterns efficiently. Following a pattern growth approach, a pattern data structure is designed to store the information of the matched sub-graphs, which speeds up the subsequent pattern growth computation. In addition, the minimum DFS code of a pattern is used to avoid finding isomorphic patterns more than once. Moreover, we propose a MapReduce version of the gSFNP algorithm, called the gSFNP_MR algorithm, to solve the problem of insufficient memory in a centralized environment. Finally, we evaluate the performance of gSFNP and gSFNP_MR. The experimental results show that both proposed algorithms have shorter response times than the previous work.
29

Huang, Jiayuan. "Learning from Partially Labeled Data: Unsupervised and Semi-supervised Learning on Graphs and Learning with Distribution Shifting." Thesis, 2007. http://hdl.handle.net/10012/3165.

Full text
Abstract:
This thesis focuses on two fundamental machine learning problems: unsupervised learning, where no label information is available, and semi-supervised learning, where a small number of labels is given in addition to unlabeled data. These problems arise in many real-world applications, such as Web analysis and bioinformatics, where a large amount of data is available, but no or only a small amount of labeled data exists. Obtaining classification labels in these domains is usually quite difficult because it involves either manual labeling or physical experimentation. This thesis approaches these problems from two perspectives: graph based and distribution based. First, I investigate a series of graph based learning algorithms that are able to exploit information embedded in different types of graph structures. These algorithms allow label information to be shared between nodes in the graph, ultimately communicating information globally to yield effective unsupervised and semi-supervised learning. In particular, I extend existing graph based learning algorithms, currently based on undirected graphs, to more general graph types, including directed graphs, hypergraphs and complex networks. These richer graph representations allow one to more naturally capture the intrinsic data relationships that exist, for example, in Web data, relational data, bioinformatics and social networks. For each of these generalized graph structures I show how information propagation can be characterized by distinct random walk models, and then use this characterization to develop new unsupervised and semi-supervised learning algorithms. Second, I investigate a more statistically oriented approach that explicitly models a learning scenario where the training and test examples come from different distributions. 
This is a difficult situation for standard statistical learning approaches, since they typically incorporate an assumption that the distributions for training and test sets are similar, if not identical. To achieve good performance in this scenario, I utilize unlabeled data to correct the bias between the training and test distributions. A key idea is to produce resampling weights for bias correction by working directly in a feature space and bypassing the problem of explicit density estimation. The technique can be easily applied to many different supervised learning algorithms, automatically adapting their behavior to cope with distribution shifting between training and test data.
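As a rough illustration of how label information can spread between nodes of a graph through a random walk, the following Python sketch runs a standard iterative label propagation scheme on an undirected graph: labeled nodes are clamped, and every other node repeatedly takes the average score of its neighbours. This is a generic textbook-style technique, not the thesis's algorithms for directed graphs or hypergraphs; the function name, the +1/-1 encoding, and the path graph are assumptions made for the example.

```python
def propagate_labels(adjacency, seed_labels, iterations=100):
    """Iteratively average neighbour scores while clamping the labeled nodes.

    adjacency: dict mapping each node to a list of neighbours (undirected).
    seed_labels: dict mapping a few nodes to +1.0 or -1.0.
    Returns a dict of real-valued scores; the sign is the predicted class.
    """
    scores = {node: seed_labels.get(node, 0.0) for node in adjacency}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            if node in seed_labels:
                updated[node] = seed_labels[node]  # keep known labels fixed
            elif neighbours:
                # One random-walk step: expected score of a uniformly chosen neighbour.
                updated[node] = sum(scores[n] for n in neighbours) / len(neighbours)
            else:
                updated[node] = 0.0  # isolated unlabeled node stays undecided
        scores = updated
    return scores


# Hypothetical path graph a - b - c - d with only the two end nodes labeled.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = propagate_labels(path, {"a": 1.0, "d": -1.0})
print(scores)
```

On the path example the unlabeled node next to the positive seed ends up with a positive score and the one next to the negative seed with a negative score, which is the sense in which a few labels inform the whole graph.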
30

Pootheri, Sridar Kuttan. "Counting classes of labeled 2-connected graphs." 2000. http://purl.galileo.usg.edu/uga%5Fetd/pootheri%5Fsridar%5Fk%5F200005%5Fms.

Full text
31

Morgan, David. "Gracefully labelled trees from Skolem and related sequences /." 2001.

Find full text
32

Araújo, Miguel Ramos de. "Communities and Anomaly Detection in Large Edged-Labeled Graphs." Tese, 2017. https://hdl.handle.net/10216/105062.

Full text
33

Araújo, Miguel Ramos de. "Communities and Anomaly Detection in Large Edged-Labeled Graphs." Doctoral thesis, 2017. https://hdl.handle.net/10216/105062.

Full text
34

"Graph-based recommendation with label propagation." 2011. http://library.cuhk.edu.hk/record=b5894820.

Full text
Abstract:
Wang, Dingyan.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.
Includes bibliographical references (p. 97-110).
Abstracts in English and Chinese.
Abstract
Acknowledgement
Chapter 1: Introduction
Chapter 1.1: Overview
Chapter 1.2: Motivations
Chapter 1.3: Contributions
Chapter 1.4: Organizations of This Thesis
Chapter 2: Background
Chapter 2.1: Label Propagation Learning Framework
Chapter 2.1.1: Graph-based Semi-supervised Learning
Chapter 2.1.2: Green's Function Learning Framework
Chapter 2.2: Recommendation Methods
Chapter 2.2.1: Traditional Memory-based Methods
Chapter 2.2.2: Traditional Model-based Methods
Chapter 2.2.3: Label Propagation Recommendation Models
Chapter 2.2.4: Latent Feature Recommendation Models
Chapter 2.2.5: Social Recommendation Models
Chapter 2.2.6: Tag-based Recommendation Models
Chapter 3: Recommendation with Latent Features
Chapter 3.1: Motivation and Contributions
Chapter 3.2: Item Graph
Chapter 3.2.1: Item Graph Definition
Chapter 3.2.2: Item Graph Construction
Chapter 3.3: Label Propagation Recommendation Model with Latent Features
Chapter 3.3.1: Latent Feature Analysis
Chapter 3.3.2: Probabilistic Matrix Factorization
Chapter 3.3.3: Similarity Consistency Between Global and Local Views (SCGL)
Chapter 3.3.4: Item-based Green's Function Recommendation Based on SCGL
Chapter 3.4: Experiments
Chapter 3.4.1: Dataset
Chapter 3.4.2: Baseline Methods
Chapter 3.4.3: Metrics
Chapter 3.4.4: Experimental Procedure
Chapter 3.4.5: Impact of Weight Parameter u
Chapter 3.4.6: Performance Comparison
Chapter 3.5: Summary
Chapter 4: Recommendation with Social Network
Chapter 4.1: Limitation and Contributions
Chapter 4.2: A Social Recommendation Framework
Chapter 4.2.1: Social Network
Chapter 4.2.2: User Graph
Chapter 4.2.3: Social-User Graph
Chapter 4.3: Experimental Analysis
Chapter 4.3.1: Dataset
Chapter 4.3.2: Metrics
Chapter 4.3.3: Experiment Setting
Chapter 4.3.4: Impact of Control Parameter u
Chapter 4.3.5: Performance Comparison
Chapter 4.4: Summary
Chapter 5: Recommendation with Tags
Chapter 5.1: Limitation and Contributions
Chapter 5.2: Tag-Based User Modeling
Chapter 5.2.1: Tag Preference
Chapter 5.2.2: Tag Relevance
Chapter 5.2.3: User Interest Similarity
Chapter 5.3: Tag-Based Label Propagation Recommendation
Chapter 5.4: Experimental Analysis
Chapter 5.4.1: Douban Dataset
Chapter 5.4.2: Experiment Setting
Chapter 5.4.3: Metrics
Chapter 5.4.4: Impact of Tag and Rating
Chapter 5.4.5: Performance Comparison
Chapter 5.5: Summary
Chapter 6: Conclusions and Future Work
Chapter 6.0.1: Conclusions
Chapter 6.0.2: Future Work
Bibliography
35

Wang, Chung Han, and 王宗涵. "Unsupervised Image Segmentation using Multi-label Graph Cuts." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/92616521540163396109.

Full text
Abstract:
Master's thesis
National Tsing Hua University
Department of Computer Science
104
Image segmentation is an important issue in image editing and computer vision. Due to the complexity of information in images, efficient extraction of a foreground object is a challenging problem. Recently, several approaches based on optimization by graph cuts have been developed which successfully combine the color feature with the edge information. A problem is that the segmentation results depend heavily on the seed selection, and it is difficult to obtain reliable seeds automatically. To overcome this problem, we propose an automatic scheme for image segmentation. Compared to classical binary-label graph cuts, the results of multi-label graph cuts do not depend heavily on the seed selection. Our method uses multi-label graph cuts to separate an image into multiple segments, and then classifies the segments into the object and the background. We introduce the standard deviation to adapt the relative importance of the properties in our method. Experiments show that the proposed method yields more accurate segmentation results than the previous automatic approach and is comparable to the interactive approach.
36

Dickinson, Peter. "Graph based techniques for measurement of intranet dynamics." 2006. http://arrow.unisa.edu.au:8081/1959.8/45980.

Full text
Abstract:
This thesis develops a number of graph-based techniques that are capable of measuring the dynamic behaviour of a network and discusses their application in network management. By representing a computer network as a time series of uniquely labelled graphs, it is possible to measure the degree of change that has occurred between a pair of graphs, and hence the dynamics in a network. Concepts introduced include the median graph, intra- and inter- graph clustering, and hierarchical graph representations. The focus is on producing efficient algorithms and improved measures of network change. It is believed that these graph-based techniques for measuring network dynamics have great potential in network anomaly detection, and thus will improve reliability of enterprise intranets.
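To illustrate why unique node labels make change measurement cheap, the sketch below compares two network snapshots by counting the nodes and edges that appear in exactly one of them. Because every label is unique, no graph matching is needed and set operations suffice. This particular measure, the function names, and the host names are assumptions chosen for illustration; the abstract does not specify which measures the thesis uses.

```python
def change_between(graph_a, graph_b):
    """Count the nodes and edges that appear in exactly one of two graphs.

    Each graph is a pair (nodes, edges): a set of unique node labels and a set
    of frozenset({u, v}) undirected edges.
    """
    nodes_a, edges_a = graph_a
    nodes_b, edges_b = graph_b
    # Symmetric differences: elements present in one snapshot but not the other.
    return len(nodes_a ^ nodes_b) + len(edges_a ^ edges_b)


def change_series(snapshots):
    """Change between each consecutive pair in a time series of graphs."""
    return [change_between(a, b) for a, b in zip(snapshots, snapshots[1:])]


# Hypothetical snapshots of a small network on consecutive days.
day1 = ({"h1", "h2", "h3"}, {frozenset({"h1", "h2"}), frozenset({"h2", "h3"})})
day2 = ({"h1", "h2", "h4"}, {frozenset({"h1", "h2"}), frozenset({"h2", "h4"})})
day3 = day2
print(change_series([day1, day2, day3]))
```

In the example, host h3 is replaced by h4 between the first two days (two node changes and two edge changes), and nothing changes afterwards, so the series is [4, 0]. A sudden spike in such a series is the kind of signal an anomaly detector could flag.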
37

Humphries, Peter J. "Combinatorial aspects of leaf-labelled trees : a thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Mathematics, University of Canterbury Department of Mathematics and Statistics /." 2008. http://hdl.handle.net/10092/1801.

Full text
38

Tedder, Marc. "Applications of Lexicographic Breadth-first Search to Modular Decomposition, Split Decomposition, and Circle Graphs." Thesis, 2011. http://hdl.handle.net/1807/29888.

Full text
Abstract:
This thesis presents the first sub-quadratic circle graph recognition algorithm, and develops improved algorithms for two important hierarchical decomposition schemes: modular decomposition and split decomposition. The modular decomposition algorithm results from unifying two different approaches previously employed to solve the problem: divide-and-conquer and factorizing permutations. It runs in linear time, and is straightforward to understand, prove correct, and implement. It merely requires a collection of trees and simple traversals of these trees. The split decomposition algorithm is similarly straightforward to understand and prove correct. An efficient implementation of the algorithm is described that uses the union-find data structure. A novel charging argument is used to prove the running time. The algorithm is the first to use the recent reformulation of split decomposition in terms of graph-labelled trees. This facilitates its extension to circle graph recognition. In particular, it allows us to efficiently apply a new lexicographic breadth-first search characterization of circle graphs developed in the thesis. Lexicographic breadth-first search is additionally responsible for the efficiency of the split decomposition algorithm, and contributes to the simplicity of the modular decomposition algorithm.
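For readers unfamiliar with lexicographic breadth-first search, the following Python sketch shows the standard partition-refinement formulation: the unvisited vertices are kept as an ordered list of classes, the first vertex of the first class is visited next, and every class is then split so that neighbours of the visited vertex come before non-neighbours. This is a simple, non-linear-time illustration of the general technique only; the function name and example graph are assumptions, and the sketch is unrelated to the thesis's decomposition algorithms.

```python
def lex_bfs(adjacency, start=None):
    """Return a lexicographic breadth-first search ordering of the vertices.

    adjacency: dict mapping each vertex to a set of neighbours (undirected).
    Uses partition refinement on a list of classes; written for clarity
    rather than linear running time.
    """
    vertices = list(adjacency)
    if start is not None:
        vertices.remove(start)
        vertices.insert(0, start)
    classes = [vertices] if vertices else []
    order = []
    while classes:
        # Visit the first vertex of the first class.
        pivot = classes[0].pop(0)
        if not classes[0]:
            classes.pop(0)
        order.append(pivot)
        refined = []
        for cls in classes:
            inside = [v for v in cls if v in adjacency[pivot]]
            outside = [v for v in cls if v not in adjacency[pivot]]
            # Within each class, neighbours of the pivot move ahead of non-neighbours.
            if inside:
                refined.append(inside)
            if outside:
                refined.append(outside)
        classes = refined
    return order


# Hypothetical graph: a 4-cycle a-b-d-c-a with a pendant vertex e attached to d.
graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
print(lex_bfs(graph, "a"))
```

Every LexBFS ordering is also an ordinary BFS ordering; the refinement step additionally breaks ties in favour of vertices adjacent to earlier-visited vertices.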
39

Shenkenfelder, Warren. "Learning bisimulation." Thesis, 2008. http://hdl.handle.net/1828/1262.

Full text
Abstract:
Computational learning theory is a branch of theoretical computer science that re-imagines the role of an algorithm from an agent of computation to an agent of learning. The operations of computers become those of the human mind, an important step towards illuminating the limitations of artificial intelligence. The central difference between a learning algorithm and a traditional algorithm is that the learner has access to an oracle that, in constant time, can answer queries about the object to be learned. Normally an algorithm would have to discover such information on its own. This subtle change in how we model problem solving results in changes in the computational complexity of some classic problems, allowing us to re-examine them in a new light. Specifically, two known results are examined: one positive, one negative. It is known that one can efficiently learn deterministic finite automata with queries, but not non-deterministic finite automata. We generalize these automata into labeled transition systems and attempt to learn them using a stronger query.