Dissertations / Theses on the topic 'Usage pattern mining'

Consult the top 21 dissertations / theses for your research on the topic 'Usage pattern mining.'

1

Alshehri, Abdullah. "Keyboard usage recognition : a study in pattern mining and prediction in the context of impersonation." Thesis, University of Liverpool, 2018. http://livrepository.liverpool.ac.uk/3022436/.

Abstract:
The research presented in this thesis is directed at an investigation into the use of keystroke dynamics (typing patterns) for the purpose of impersonation detection, especially in the context of online assessments. More specifically, the aim was to research the nature of time series analysis approaches for the purpose of continuous user authentication. The research question to be answered was "Is it possible to continuously authenticate individuals, according to their keyboard usage patterns; and if so what are the most appropriate mechanisms for achieving this?". The main contribution of the thesis is a collection of three time series analysis approaches to continuous user authentication using keystroke dynamics: (i) Once-only Keystroke Continuous Authentication (OKCA), (ii) Iterative Keystroke Continuous Authentication (IKCA) and (iii) Keystroke Continuous Authentication based Spectral Analysis (KCASA). The OKCA approach was a benchmark, proof-of-concept, approach applicable in the static (as opposed to the continuous) context, and directed at establishing the veracity of the time series approach. The IKCA system was the first of two proposed continuous iterative authentication approaches. The IKCA approach was founded on the OKCA approach. A particular novel aspect of the operation of the IKCA approach was that it used the concept of a bespoke similarity threshold. The KCASA approach was then an improvement on the IKCA approach that operated in the spectral domain rather than the temporal domain used in the case of the OKCA, and IKCA approaches. Two spectral transformations were considered: (i) the Discrete Fourier Transform (DFT) and (ii) the Discrete Wavelet Transform (DWT). All three of the proposed approaches used Dynamic Time Warping (DTW) as the time series similarity determination mechanism because this offered advantages over the more standard Euclidean distance similarity measurement. 
The systems were evaluated using a dataset collated by the author and two further datasets taken from the literature. Both Univariate and Multivariate Keystroke Time Series (U-KTS and M-KTS) were considered. The evaluation compared the proposed approaches with one another and with the established feature vector-based approach from the literature. All the proposed time series-based approaches were found to be more accurate than the feature vector-based approach. The most accurate of the three was the KCASA approach, specifically KCASA with DWT coupled with M-KTS. However, DFT was found to be more efficient in terms of run-time complexity.
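As an illustration of the Dynamic Time Warping (DTW) comparison the abstract refers to, the following is a minimal sketch of the textbook DTW recurrence with an absolute-difference cost. It is not the thesis's implementation; the function name `dtw_distance` and the sample timing values are assumptions made for the example. Unlike a point-by-point Euclidean distance, DTW can compare two keystroke timing series of different lengths.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Uses |x - y| as the local cost and runs in O(len(a) * len(b)) time.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again (stretch b)
                cost[i][j - 1],      # b[j-1] is matched again (stretch a)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical key-to-key timings in milliseconds (illustrative only).
    reference = [120, 95, 130]
    probe = [118, 97, 97, 131]
    print(dtw_distance(reference, probe))
```

A continuous authentication scheme of the kind described would compare such a distance against a per-user threshold; how that threshold is derived is specific to the thesis and is not reproduced here.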
2

Tanasa, Doru. "Web usage mining : contributions to intersites logs preprocessing and sequential pattern extraction with low support." Nice, 2005. http://www.theses.fr/2005NICE4019.

Abstract:
Web Usage Mining (WUM) is a rather recent research field that corresponds to the process of knowledge discovery from databases (KDD) applied to Web usage data. It comprises three main stages: the pre-processing of raw data, the discovery of patterns, and the analysis (or interpretation) of results. The quantity of Web usage data to be analysed and its low quality (in particular the absence of structure) are the principal problems in WUM. When applied to these data, classic data mining algorithms generally give disappointing results in terms of the behaviours of Web site users (e.g., obvious sequential patterns of no interest). In this thesis, we make two significant contributions to the WUM process, both implemented in our toolbox, Axislogminer. First, we propose a complete methodology for pre-processing Web logs whose originality lies in taking the intersite (multi-site) aspect of WUM into account. The methodology comprises four distinct steps: log file merging, data cleaning, data structuring and data summarization. Our second contribution aims at discovering, from a large pre-processed log file, the minority behaviours corresponding to sequential patterns with very low support. To this end, we propose a general methodology that divides the pre-processed log file into a series of sub-logs. Based on this methodology, we designed three approaches for extracting sequential patterns with low support (sequential, iterative and hierarchical). These approaches were implemented in concrete hybrid methods that combine clustering and sequential pattern mining algorithms.
3

Adam, Chloé. "Pattern Recognition in the Usage Sequences of Medical Apps." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLC027/document.

Abstract:
Radiologists use medical imaging solutions on a daily basis for diagnosis. Improving user experience is a major line of the continuous effort to enhance the overall quality and usability of software products. Monitoring applications make it possible to record the evolution of various software and system parameters during use and, in particular, the successive actions performed by users in the software interface. These interactions can be represented as sequences of actions. Based on this data, this work deals with two industrial topics: software crashes and software usability. Both topics involve, on the one hand, understanding the patterns of use and, on the other, developing prediction tools either to anticipate crashes or to dynamically adapt the software interface to users' needs. First, we aim to identify the root causes of crashes, which is essential in order to fix the original defects. For this purpose, we propose to use a binomial test to determine which type of pattern is the most appropriate to represent crash signatures. Improving software usability through customization and adaptation of systems to each user's specific needs requires a very good knowledge of how users use the software. In order to highlight usage trends, we propose to group similar sessions into clusters. We compare three session representations as inputs to different clustering algorithms. The second contribution of the thesis concerns the dynamic monitoring of software use. We propose two methods, based on different representations of input actions, to address two distinct industrial issues: next-action prediction and software crash risk detection. Both methodologies take advantage of the recurrent structure of LSTM neural networks to capture dependencies in our sequential data, as well as their capacity to handle potentially different types of input representations for the same data.
4

Singh, Shailendra. "Smart Meters Big Data : Behavioral Analytics via Incremental Data Mining and Visualization." Thesis, Université d'Ottawa / University of Ottawa, 2016. http://hdl.handle.net/10393/35244.

Abstract:
The big data framework applied to smart meters offers an exceptional platform for data-driven forecasting and decision making to achieve sustainable energy efficiency. Gaining consumer confidence by respecting occupants' energy consumption behavior and preferences, so as to improve participation in various energy programs, is imperative but difficult to achieve. The key elements for understanding and predicting household energy consumption are the activities occupants perform, the appliances and the times at which they are used, and inter-appliance dependencies. This information can be extracted from the context-rich big data from smart meters, although this is challenging because: (1) it is not trivial to mine complex interdependencies between appliances from multiple concurrent data streams; (2) it is difficult to derive accurate relationships between interval-based events, where multiple appliance usages persist; (3) continuous generation of energy consumption data can trigger changes in the associations of appliances with time and with other appliances. To overcome these challenges, we propose an unsupervised progressive incremental data mining technique using frequent pattern mining (appliance-appliance associations) and cluster analysis (appliance-time associations) coupled with a Bayesian network-based prediction model. The proposed technique addresses the need to analyze temporal energy consumption patterns at the appliance level, which directly reflect consumers' behaviors and provide a basis for generalizing household energy models. Extensive experiments were performed on the model with real-world datasets, and strong associations were discovered. The accuracy of the proposed model for predicting multiple appliance usage outperformed a support vector machine at every stage, attaining accuracies of 81.65%, 85.90% and 89.58% for 25%, 50% and 75% of the training dataset size, respectively.
Moreover, accuracy results of 81.89%, 75.88%, 79.23%, 74.74%, and 72.81% were obtained for short-term (hours) and long-term (day, week, month, and season) energy consumption forecasts, respectively.
5

Soztutar, Enis. "Mining Frequent Semantic Event Patterns." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/12611007/index.pdf.

Abstract:
Especially with the wide use of dynamic page generation and richer user interaction on the Web, traditional web usage mining methods, which are based on the pageview concept, are of limited usability. To overcome the difficulty of capturing usage behaviour, we define the concept of semantic events. Conceptually, events are higher-level actions of a user on a web site that are technically independent of pageviews. Events are modelled as objects in the domain of the web site, with associated properties. A sample event from a video web site is the 'play video event' with properties 'video', 'length of video', 'name of video', etc. When the event objects belong to the domain model of the web site's ontology, they are referred to as semantic events. In this work, we propose a new algorithm and associated framework for mining patterns of semantic events from usage logs. We present a method for tracking and logging domain-level events of a web site, adding semantic information to events, an ordering of events with respect to the genericity of the event, and an algorithm for computing sequences of frequent events.
6

Özakar, Belgin Püskülcü Halis. "Finding And Evaluating Patterns In Wes Repository Using Database Technology And Data Mining Algorithms/." [s.l.]: [s.n.], 2002. http://library.iyte.edu.tr/tezler/master/bilgisayaryazilimi/T000130.pdf.

7

Nguyen, Hoang Viet Tuan. "Prise en compte de la qualité des données lors de l’extraction et de la sélection d’évolutions dans les séries temporelles de champs de déplacements en imagerie satellitaire." Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAA011.

Abstract:
This PhD thesis deals with knowledge discovery from Displacement Field Time Series (DFTS) obtained by satellite imagery. Such series now occupy a central place in the study and monitoring of natural phenomena such as earthquakes, volcanic eruptions and glacier displacements. They are rich in both spatial and temporal information and can now be produced regularly at lower cost thanks to space programs such as the European Copernicus program and its flagship Sentinel satellites. Our proposals are based on the extraction of grouped frequent sequential patterns. These patterns, originally defined for knowledge extraction from Satellite Image Time Series (SITS), have shown their potential in early work on analyzing a DFTS. Nevertheless, they cannot exploit the confidence indices that come with DFTS, and the swap randomization method used to select the most promising patterns does not take their spatiotemporal complementarities into account, since each pattern is evaluated individually. Our contribution is thus twofold. The first proposal associates a reliability measure with each pattern by using the confidence indices. This measure makes it possible to select patterns whose occurrences in the data are, on average, sufficiently reliable. We propose a corresponding algorithm that performs extraction under a reliability constraint. It relies on an efficient search for the most reliable occurrences by dynamic programming and on pruning of the search space through a partial push strategy, which makes it possible to process large DFTS. This method has been implemented on top of the existing prototype SITS-P2miner, developed by the LISTIC and LIRIS laboratories to extract and rank grouped frequent sequential patterns. The second contribution concerns the selection of the most promising patterns. Based on an informational criterion, it takes into account both the confidence indices and the way the patterns complement each other spatially and temporally. To this end, the confidence indices are interpreted as probabilities, and the DFTS are seen as probabilistic databases whose distributions are only partial. The informational gain associated with a pattern is then defined according to the ability of its occurrences to complete or refine the distributions characterizing the data. On this basis, a heuristic is proposed to select informative and complementary patterns. This method provides a set of weakly redundant patterns that are therefore easier to interpret than those provided by swap randomization. It has been implemented in a dedicated prototype. Both proposals are evaluated quantitatively and qualitatively using a reference DFTS covering Greenland glaciers, constructed from Landsat optical data. Another DFTS, which we built from TerraSAR-X radar data covering the Mont-Blanc massif, is also used. In addition to being constructed from different data and remote sensing techniques, these series differ drastically in terms of confidence indices, the series covering the Mont-Blanc massif having very low confidence levels. For both DFTS, the proposed methods operate under standard conditions of resource consumption (time, space), and experts' knowledge of the studied areas is confirmed and extended.
8

Vollino, Bruno Winiemko. "Descoberta de perfis de uso de web services." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/83669.

Abstract:
During the life cycle of a web service, several changes are made to its interface, which may be incompatible with its current clients and may break their applications. Providers must frequently make decisions about changes to their services, most often without a good understanding of the effect these changes will have on their customers. Existing research and tools fail to give providers proper knowledge about the actual usage of the service interface's features, considering the distinct types of consumers, which makes it impossible to assess the actual impact of changes. This work presents a framework for the discovery of web service usage profiles, which constitute a descriptive model of the usage patterns found in distinct groups of clients with respect to the features of the service interface. The framework supports the knowledge discovery process through semi-automatic and configurable tasks for the preparation and analysis of usage data, minimizing the need for user intervention. It covers the monitoring of web service interactions, the loading of pre-processed usage data into a unified database, and the generation of usage profiles. Data mining techniques are used to group clients according to their feature usage patterns, and these groups are used to build service usage profiles. The entire process is configured via parameters, which allow the user to determine the level of detail of the usage information included in the profiles and the criteria for evaluating the similarity between clients. The proposal is validated through experiments with synthetic data, simulated according to characteristics expected in the behaviour of clients of a real service. The experimental results demonstrate that the proposed framework allows the discovery of useful service usage profiles, and provide evidence about the proper parameterization of the framework.
9

Duck, Geraint. "Extraction of database and software usage patterns from the bioinformatics literature." Thesis, University of Manchester, 2015. https://www.research.manchester.ac.uk/portal/en/theses/extraction-of-database-and-software-usage-patterns-from-the-bioinformatics-literature(fac16cb8-5b5b-4732-b7af-77a41cc64487).html.

Abstract:
Method forms the basis of scientific research, enabling criticism, selection and extension of current knowledge. However, methods are usually confined to the literature, where they are often difficult to find, understand, compare, or repeat. Bioinformatics and computational biology provide a rich opportunity for resource creation and discovery, with a rapidly expanding "resourceome". Many of these resources are difficult to find due to the large choice available, and there are only a limited number of sufficiently populated lists that can help inform resource selection. Text mining has enabled large-scale data analysis and extraction from within the scientific literature, and can therefore help explore the vast wealth of available resources that form the basis of bioinformatics methods. This thesis aims to survey the computational biology literature, using text mining to extract database and software resource name mentions. By evaluating the common pairs and patterns of usage of these resources within such articles, an abstract approximation of the in silico methods employed within the target domain is developed. Specifically, this thesis provides an analysis of the difficulties of resource name extraction from the literature, and then uses this knowledge to develop bioNerDS, a rule-based system that can detect database and software name mentions within full-text documents (with a final F-score of 67%). bioNerDS is then applied to the full-text document corpus from PubMed Central, and the results are explored to identify the differences in resource usage between different domains (bioinformatics, biology and medicine) through time, different journals and different document sections. In particular, the well-established resources (e.g., BLAST, GO and GenBank) remain pervasive throughout the domains, although they are seeing a slight decline in usage.
Statistical programs see high levels of usage, with R in bioinformatics and SPSS in medicine being frequently mentioned throughout the literature. An overview of the common resource pairs has been generated by pairing database and software names which directly co-occur after one another in text. Combining and aggregating these resource pairs together across the literature enables the generation of a network of common resource patterns within computational biology, which provides an abstract representation of the common in silico methods used. For example, sequence alignment tools remain an important part of several computational biology analysis pipelines, and GO is a strong network sink (primarily used for data annotation). The networks also show the emergence of proteomics and next generation sequencing resources, and provide a specialised overview of a typical phylogenetics method. This work performs an analysis of common resource usage patterns, and thus provides an important first step towards in silico method extraction using text-mining. This should have future implications in community best practice, both for resource and method selection.
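The pairing and aggregation step described above can be sketched as follows. This is a minimal illustration, not the thesis's pipeline: it assumes each document has already been reduced to an ordered list of detected resource names (the role bioNerDS plays in the thesis), and the function name `resource_pair_counts`, the choice to skip immediate repeats of the same name, and the sample mention lists are all assumptions made for the example.

```python
from collections import Counter


def resource_pair_counts(documents):
    """Count directed pairs of resource names that occur one after another.

    documents: iterable of lists, each list holding the resource names
    mentioned in one document, in order of appearance.
    Returns a Counter mapping (first, second) -> number of occurrences,
    which can be read as a weighted directed edge list.
    """
    pairs = Counter()
    for mentions in documents:
        for first, second in zip(mentions, mentions[1:]):
            if first != second:  # assumption: ignore a name followed by itself
                pairs[(first, second)] += 1
    return pairs


if __name__ == "__main__":
    # Hypothetical mention sequences, for illustration only.
    corpus = [
        ["BLAST", "GenBank", "GO"],
        ["BLAST", "GenBank"],
        ["R", "GO"],
    ]
    for (src, dst), weight in resource_pair_counts(corpus).most_common():
        print(f"{src} -> {dst}: {weight}")
```

In such an edge list, a resource with many incoming and few outgoing edges would appear as a sink, which is the sense in which the abstract describes GO.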
10

Gandikota, Vijai. "Modeling operating system crash behavior through multifractal analysis, long range dependence and mining of memory usage patterns." Morgantown, W. Va. : [West Virginia University Libraries], 2006. https://eidr.wvu.edu/etd/documentdata.eTD?documentid=4566.

Abstract:
Thesis (M.S.)--West Virginia University, 2006.
Title from document title page. Document formatted into pages; contains xii, 102 p. : ill. (some col.). Vita. Includes abstract. Includes bibliographical references (p. 96-99).
11

Kilic, Sefa. "Clustering Frequent Navigation Patterns From Website Logs Using Ontology And Temporal Information." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12613979/index.pdf.

Abstract:
Given a set of web pages labeled with ontological items, the similarity between two web pages is measured using the similarity between the ontological items with which the pages are labeled. Using this similarity measure between two pages, the degree of similarity between two sequences of web page visits can be calculated as well. Using clustering algorithms, similar frequent sequences are grouped, and representative sequences are selected from these groups. A new sequence is compared with all clusters and assigned to the most similar one. Representatives of the most similar cluster can be used in several real-world cases: for predicting and prefetching the next page the user will visit, or for helping the user navigate the website. They can also be used to improve the structure of the website for easier navigation. In this study, the effect of the time spent on each web page during the session is analyzed.
12

Persson, Pontus. "Identifying Early Usage Patterns That Increase User Retention Rates In A Mobile Web Browser." Thesis, Linköpings universitet, Databas och informationsteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-137793.

Abstract:
One of the major challenges for modern technology companies is user retention management. This work focuses on identifying early usage patterns that signify increased retention rates in a mobile web browser. This is done using a targeted parallel implementation of the association rule mining algorithm FP-Growth. Different item subset selection techniques, including clustering and other statistical methods, have been used in order to reduce the mining time and allow for lower support thresholds. A lot of interesting rules have been mined. The best retention-wise rule implies a retention rate of 99.5%. The majority of the rules analyzed in this work imply a retention rate increase between 150% and 200%.
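To make the notion of a retention-wise rule concrete, the following sketch scores a single candidate rule "early usage pattern → retained" by support, confidence, and lift. It does not implement FP-Growth; the event names, the toy user records, and the reading of "retention rate increase" as lift over the base retention rate are assumptions made for illustration.

```python
def rule_metrics(users, antecedent, consequent="retained"):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    n = len(users)
    with_pattern = [u for u in users if antecedent <= u]
    with_both = [u for u in with_pattern if consequent in u]
    base_rate = sum(consequent in u for u in users) / n
    support = len(with_both) / n
    confidence = len(with_both) / len(with_pattern) if with_pattern else 0.0
    lift = confidence / base_rate if base_rate else 0.0
    return support, confidence, lift

# Hypothetical early-usage events per user; "retained" marks users who stayed.
users = [
    frozenset({"set_bookmark", "used_search", "retained"}),
    frozenset({"set_bookmark", "used_search", "retained"}),
    frozenset({"set_bookmark"}),
    frozenset({"used_search"}),
    frozenset({"opened_settings"}),
    frozenset({"used_search", "retained"}),
    frozenset({"opened_settings"}),
    frozenset({"set_bookmark", "used_search"}),
]
support, confidence, lift = rule_metrics(users, {"set_bookmark", "used_search"})
```

Here the confidence of the rule plays the role of the implied retention rate, and a lift above 1 means that users showing the pattern are retained more often than the average user.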
13

Ammari, Ahmad N. "Transforming user data into user value by novel mining techniques for extraction of web content, structure and usage patterns. The Development and Evaluation of New Web Mining Methods that enhance Information Retrieval and improve the Understanding of User's Web Behavior in Websites and Social Blogs." Thesis, University of Bradford, 2010. http://hdl.handle.net/10454/5269.

Abstract:
The rapid growth of the World Wide Web in the last decade has made it the largest publicly accessible data source in the world and one of the most significant and influential information revolutions of modern times. The Web has affected almost every aspect of human life and activity, causing paradigm shifts and transformational changes in business, governance, and education. Moreover, the rapid evolution of Web 2.0 and the Social Web in the past few years, with social blogs and friendship networking sites, has dramatically transformed the Web from a raw environment for information consumption into a dynamic and rich platform for information production and sharing worldwide. However, this growth and transformation has resulted in an uncontrollable explosion and abundance of textual content, making it a serious challenge for any user to find and retrieve the relevant information they seek on the Web. Finding a relevant Web page in a website easily and efficiently has become very difficult. This has created many challenges for researchers, who must develop new mining techniques to improve the user experience on the Web, as well as for organizations, which must understand the true informational interests and needs of their customers in order to provide the products, services and information that match the requirements of every online customer. With these challenges in mind, Web mining aims to extract hidden patterns and discover useful knowledge from Web page contents, Web hyperlinks, and Web usage logs.
Based on the primary kinds of Web data used in the mining process, Web mining tasks can be categorized into three main types: Web content mining, which extracts knowledge from Web page contents using text mining techniques; Web structure mining, which extracts patterns from the hyperlinks that represent the structure of the website; and Web usage mining, which mines users' Web navigational patterns from Web server logs that record the Web page accesses made by every user, representing the interactions between the users and the Web pages in a website. The main goal of this thesis is to contribute toward addressing the challenges that have resulted from the information explosion and overload on the Web, by proposing and developing novel Web mining-based approaches. Toward achieving this goal, the thesis presents, analyzes, and evaluates three major contributions. First, the development of an integrated Web structure and usage mining approach that recommends a collection of hyperlinks for the surfers of a website to be placed at the homepage of that website. Second, the development of an integrated Web content and usage mining approach to improve the understanding of users' Web behavior and discover user group interests in a website. Third, the development of a supervised classification model based on recent Social Web concepts, such as Tag Clouds, in order to improve the retrieval of relevant articles and posts from Web social blogs.
15

Becker, Mélanie. "L’exploration des pages web : de la caractérisation interindividuelle à l’identification de patterns comportementaux." Thesis, Université de Lorraine, 2016. http://www.theses.fr/2016LORR0344/document.

Abstract:
A widely cited study by Nielsen (2006) indicates that Internet users explore web pages following an "F"-shaped pattern. This result led designers to organize the information on a page according to this behavior, even though no study has replicated these results. Although the conclusions of that study concern visual behavior, the question of which behavioral patterns describe the navigation of Internet users arises more generally. The aim of this thesis was therefore to determine whether patterns could be revealed from various indicators. Three studies were conducted. In the first study, 112 participants had to perform four information search tasks on two different websites. The experimental protocol involved an immediate repetition of the same tasks. A clustering method identified 4 behavioral patterns, which differ in terms of navigation on the homepage and also in terms of performance. During the repetitions, the classification identified 3 of the 4 previous patterns. This implies that individuals do not necessarily repeat the way they search for information, regardless of the task and the website. The second study involved 27 participants, who had to come three times, at 48-hour intervals, to repeat the same tasks. Repeating the tasks, whether in the short or the medium term, improved performance: tasks were completed more quickly and more efficiently. However, the identified patterns differ between the short-term and medium-term repetitions. Another observed result is that the strategies or patterns are not specific to individuals. In other words, an individual can adopt several patterns from one task to another, from one site to another, or from one repetition to another. Finally, in the last study, we asked whether the homogeneity of our previous samples could have influenced the patterns. We therefore conducted an experiment with 47 participants with varied profiles. The individuals tended to fall into the same 4 identified patterns. We observed that, depending on the individual, the same strategy could lead to either success or failure on the task. Furthermore, learning styles did not seem to be related to the observed patterns. The limits and prospects of this work are discussed.
16

Chen, Chien Chung, and 陳建忠. "Pattern Discovery of Web Usage Mining by K-means of Sequence Alignment Methods." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/83712834596186922724.

Abstract:
Master's
Tamkang University
Department of Computer Science and Information Engineering
92
With the growing popularity of the Internet, people increasingly use it to access information and to conduct business. The logs of a web site keep track of users' browsing records and implicitly capture users' information needs. By applying Web Usage Mining techniques to web logs, we can find the patterns in which users access web pages. Going a step further, we can discover patterns of user behavior in order to improve the design of the web site's structure and its performance. In this paper, for the preprocessing stage of Web Usage Mining, we integrate and apply the techniques published by Cooley and Chen; for the pattern discovery stage, we apply K-means clustering together with Sequence Alignment Methods (SAM), which convert a sequence into a score, to discover patterns of user behavior.
17

Saied, Mohamed Aymen. "Inferring API Usage Patterns and Constraints : a Holistic Approach." Thèse, 2016. http://hdl.handle.net/1866/18471.

Abstract:
Software systems increasingly depend on external libraries and frameworks. Software developers reuse the functionality provided by these libraries through their Application Programming Interfaces (APIs). Hence, developers have to cope with the complexity of the existing APIs needed to accomplish their work, and overcome the lack of usage directives in the API documentation. In this thesis, we propose a holistic approach that deals with the library usability problem at three levels of granularity. In the first step, we focus on the method level. We propose to identify usage constraints related to method parameters by analyzing only the library source code. We applied program analysis strategies to detect four critical usage constraint types. In the second step, we change the scale to focus on API usage pattern mining, in order to help developers learn common ways to use complementary API methods. We first propose a client-based technique for mining multilevel API usage patterns, which exhibit the co-usage relationships between API methods across interfering usage scenarios. Then, we propose a library-based technique to overcome the strong constraint of having to select client programs. This technique infers API usage patterns through the analysis of structural and semantic relationships between API methods. Finally, we propose a cooperative usage pattern mining technique that combines client-based and library-based usage pattern mining. It takes advantage of both the precision of the client-based technique and the generalizability of the library-based technique. As a last contribution of this thesis, we target a higher level of library usability. We present a novel approach to automatically identify usage patterns of third-party libraries that are commonly used together. This aims to help developers discover reuse opportunities and pick complementary libraries that may be relevant for their projects.
18

Pan, Yi-Chin, and 潘依琴. "Mining Apps Usage Patterns for Mobile Apps Prediction." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/45570114515872789005.

Abstract:
Master's
National Chiao Tung University
Institute of Network Engineering
100
Due to the proliferation of mobile applications (abbreviated as Apps) on mobile devices, users can download and execute Apps to facilitate their lives. Clearly, Apps usage logs on mobile devices reflect users' behavior. Given Apps usage logs, we intend to mine Apps usage patterns, which describe how and when Apps are used. To save the energy consumed in generating Apps usage logs, the logs usually record only when Apps are executed. In other words, only temporal information is collected. With only temporal information available, the usage pattern of each App consists of three features: global frequency, temporal frequency, and periodicity. Explicitly, the global frequency of an App is its number of executions from the global view of Apps usage, the temporal frequency captures the execution distribution of the App within a pre-defined time slot, and the periodicity identifies whether or not the App is executed periodically. In light of these three features, we address the mobile Apps usage prediction problem: given a query time and a number of Apps, denoted K, generate the top K Apps that are likely to be executed at the query time. Based on Apps usage patterns, we propose two prediction algorithms: a naive prediction algorithm and an adaptive prediction algorithm. In particular, we derive a probability model for each feature in the Apps usage patterns; given a set of Apps with their features, the two algorithms select the top K Apps. To evaluate the proposed methods for mining Apps usage patterns and the two proposed prediction algorithms, two real mobile Apps usage datasets are used. The experimental results show that the proposed methods discover Apps usage patterns effectively, that the proposed prediction algorithms predict Apps accurately, and that using Apps usage patterns is advantageous for mobile Apps prediction.
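A top-K query over purely temporal logs could look roughly like the sketch below. It ranks Apps by their share of launches in the time slot of the query hour and breaks ties by global frequency. The six-hour slot width, the scoring rule, and the toy log are assumptions for illustration; the sketch leaves out periodicity and is not the naive or adaptive algorithm of the thesis.

```python
from collections import Counter, defaultdict

def build_profiles(log, slot_hours=6):
    """log: iterable of (app, hour_of_day) launch records."""
    global_freq = Counter()
    slot_freq = defaultdict(Counter)
    for app, hour in log:
        global_freq[app] += 1
        slot_freq[hour // slot_hours][app] += 1
    return global_freq, slot_freq

def predict_top_k(query_hour, k, global_freq, slot_freq, slot_hours=6):
    """Rank Apps by (temporal frequency in the query slot, global frequency)."""
    total = sum(global_freq.values())
    slot = slot_freq.get(query_hour // slot_hours, Counter())
    slot_total = sum(slot.values())

    def score(app):
        temporal = slot[app] / slot_total if slot_total else 0.0
        overall = global_freq[app] / total
        return (temporal, overall)

    return sorted(global_freq, key=score, reverse=True)[:k]

# Hypothetical launch log: (app name, hour of day).
log = [
    ("mail", 8), ("mail", 9), ("news", 8), ("maps", 18), ("music", 19),
    ("music", 20), ("mail", 13), ("music", 21), ("game", 22),
]
global_freq, slot_freq = build_profiles(log)
morning = predict_top_k(8, 2, global_freq, slot_freq)
evening = predict_top_k(20, 1, global_freq, slot_freq)
```

With this log, a morning query favors the Apps launched between 06:00 and 12:00, while an evening query favors the App that dominates the 18:00 to 24:00 slot.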
19

Ko, Yu-Lun, and 柯宇倫. "Mining Usage Patterns from Appliance Data in Smart Environment." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/17873050757984701191.

Abstract:
Master's
National Chiao Tung University
Institute of Network Engineering
101
In the last decade, considerable concern has arisen over electricity saving, driven by the need to reduce greenhouse gases. However, in daily life, conserving electricity is not an easy task, since residents can only obtain their total electricity consumption from bills or power meters. If more detailed information on appliance usage behavior is available, residents can adopt the right policy for conserving energy according to their frequent usage patterns. In this paper, based on four proposed usage patterns, we develop a system that analyzes detailed appliance usage information in a smart home environment and makes users aware of it. Moreover, when the electricity cost is high, users can easily spot unusual appliance usage with the proposed system and save energy. Furthermore, we apply our system to a real-world dataset to show the practicability of mining usage patterns in a smart home environment.
20

Tu, Chi-Hua, and 杜季樺. "Web Usage Mining: Integrating Traversal Patterns and Purchase Behaviors." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/fpm2k5.

Abstract:
Master's
Ming Chuan University
Department of Information Management, Master's Program
92
Web mining, which applies data mining techniques to the Web, can be used to improve Web services. Based on the kind of web data used, web mining can be divided into three research fields: web content mining, web structure mining, and web usage mining. Web usage mining is the process of extracting interesting patterns from web logs. This thesis proposes an IPA (Integrating Path traversal patterns and Association rules) model for web usage mining in the electronic commerce environment. The IPA model takes both the traversal and the purchasing behaviors of customers into consideration at the same time, to overcome the disadvantages of pure association rule mining and pure path traversal pattern mining. The IPA model considers not only users' forward traversal information but also their backward traversal information. In addition, the web structure is used to prune unnecessary candidates. The experimental results show that the IPA model can correctly capture users' traversing and purchasing behaviors.
21

lo, chen-wei, and 羅振維. "The Study of Motif and Sequential Patterns Mining Based on Hadoop─A case study of Appliances Usage Time Series in Taiwan." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/m66q8s.

Abstract:
Master's
National Taiwan University
Graduate Institute of Information Management
102
With the rise of environmental awareness, power companies have an increasing demand for electricity data mining. In addition, the increasing popularity of smart meters generates large volumes of electricity time series data. Big data confronts researchers with the analysis of large-scale data sets and heavy computation. Hadoop, which provides fault-tolerant parallelized analysis based on a programming style named MapReduce, is a good choice for solving this problem. Motif mining is an important research topic in time series mining and a means of achieving the goal of electricity data mining. In a time series, a motif is a recurring subsequence. Through motif mining, we can discover significant events. Traditional single-processor motif algorithms are inadequate for mining motifs from large-scale time series datasets. Therefore, this study provides two novel motif mining algorithms, "PrefixMotif" and "MR_PrefixMotif", based on the Hadoop platform. Experiments show that, on big data, "PrefixMotif" performs better than the traditional motif mining algorithm "Time Series Projection". Further, the distributed algorithm "MR_PrefixMotif" performs better than the single-processor algorithm "PrefixMotif". MR_PrefixMotif is a novel parallel and distributed algorithm optimized for motif mining of large-scale time series datasets, and it provides superior motif mining performance for electricity data mining researchers.
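As a rough illustration of what a motif is, the sketch below discretizes meter readings into letters and counts recurring fixed-length windows on a single machine. The breakpoints, window length, and readings are made-up values, and this brute-force count is neither PrefixMotif nor its MapReduce variant.

```python
from collections import Counter

def discretize(series, breakpoints):
    """Map each reading to a letter: 'a' below the first breakpoint, 'b' below the next, and so on."""
    symbols = []
    for value in series:
        level = sum(value >= b for b in breakpoints)
        symbols.append(chr(ord("a") + level))
    return "".join(symbols)

def frequent_motifs(symbols, length, min_count):
    """Count every window of the given length and keep those seen at least min_count times."""
    windows = (symbols[i:i + length] for i in range(len(symbols) - length + 1))
    counts = Counter(windows)
    return {motif: n for motif, n in counts.items() if n >= min_count}

# Hypothetical appliance power readings (kW), one per time step.
readings = [0.1, 0.2, 1.5, 2.8, 0.3, 0.1, 1.4, 2.9, 0.2, 0.2, 1.6, 3.0]
symbols = discretize(readings, breakpoints=[1.0, 2.5])
motifs = frequent_motifs(symbols, length=4, min_count=3)
```

The recurring window found here corresponds to a "low, low, medium, high" load cycle. On data of smart-meter scale this exhaustive window count becomes expensive, which is the motivation the abstract gives for a distributed algorithm.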