Dissertations / Theses: 'Clickstream'

1

Kliegr, Tomáš. "Clickstream Analysis." Master's thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-2065.

Abstract:

The thesis introduces current research trends in clickstream analysis and proposes a new heuristic that could be used for dimensionality reduction of semantically enriched data in Web Usage Mining (WUM). Click fraud and conversion fraud are identified as key prospective application areas for WUM. The thesis documents a conversion-fraud vulnerability of Google Analytics and proposes a defense: new clickstream acquisition software that collects data in sufficient granularity and structure to allow data mining approaches to fraud detection. Three variants of the K-means clustering algorithm and three association rule mining systems are evaluated and compared on real-world web usage data.

2

Jamalzadeh, Mohammadamin. "Analysis of clickstream data." Thesis, Durham University, 2011. http://etheses.dur.ac.uk/3366/.

Abstract:

This thesis is concerned with providing further statistical development in the area of web usage analysis to explore web browsing behaviour patterns. We received two data sources: web log files and operational data files for the websites, which contained information on online purchases. There are many research questions regarding web browsing behaviour. Specifically, we focused on the depth-of-visit metric and implemented an exploratory analysis of this feature using clickstream data. Due to the large volume of data available in this context, we chose to present effect size measures along with all statistical analysis of data. We introduced two new robust measures of effect size for two-sample comparison studies in non-normal situations, specifically where the difference between two populations is due to the shape parameter. The proposed effect sizes perform adequately for non-normal data, as well as when two distributions differ in their shape parameters. We then focus on conversion analysis, to investigate the causal relationship between the general clickstream information and online purchasing using a logistic regression approach. The aim is to find a classifier by assigning a probability to the event of online shopping on an e-commerce website. We also develop the application of a mixture of hidden Markov models (MixHMM) to model web browsing behaviour using sequences of web pages viewed by users of an e-commerce website. The mixture of hidden Markov models is fitted in the Bayesian context using Gibbs sampling. We address the slow mixing problem of using Gibbs sampling in high-dimensional models, and use over-relaxed Gibbs sampling, as well as the forward-backward EM algorithm, to obtain an adequate sample of the posterior distributions of the parameters. The MixHMM provides the advantage of clustering users based on their browsing behaviour, and also gives an automatic classification of web pages based on the probability of each web page being observed by visitors to the website.

3

Ekberg, Fredrik. "Jämförelse av analysmetoder för clickstream-data." Thesis, University of Skövde, School of Humanities and Informatics, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-873.

Abstract:

The purpose of this work is to serve as a guide when a method is to be implemented, by comparing different analysis methods for clickstream data. The method used for the comparison is a literature study, since the analysis methods under investigation have already been developed and knowledge about them is obtained by studying the literature in which they appear. A number of criteria are then used in the comparison itself, so that the methods are compared on a common basis.

The methods that best met the requirements of the various criteria were the page events fact model and the subsession fact model. The subsession fact model may appear to be the best choice in all situations, but it is perhaps somewhat excessive if the clickstream data is only to be used to see how visitors use each individual page for design support purposes. It can thus be shown that the purpose determines which method is most suitable.

4

Hotle, Susan Lisa. "Applications of clickstream information in estimating online user behavior." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/53507.

Abstract:

The internet has become a more prominent part of people’s lives. Clickstream and other online data have enabled researchers to better understand consumers’ decision-making behavior in a variety of application areas. This dissertation focuses on using clickstream data in two application areas: the airline industry and the field of education. The first study investigates whether airline passengers departing from or arriving in a multi-airport city actually consider itineraries at airports other than their preferred airport. It was found that customers do consider fares at multiple airports in multi-airport cities. However, other trip characteristics, typically linked to whether a customer is considered business or leisure, were found to have a larger impact on customer behavior than offered fares at competing airports. The second study evaluates airline customer search and purchase behavior near the advance purchase deadlines, which typically signify a price increase. Search and purchase demand models were constructed using instrumented two-stage least squares (2SLS) models with valid instruments to correct for endogeneity. Increased demand was found before each deadline, even though these deadlines are not well known among the general public. It is hypothesized that customers are able to use two methods to unintentionally book right before these price increases: (1) altering their travel dates by one or two days using the flexible dates tools offered by an airline’s or online travel agency’s (OTA) website to receive a lower fare, and (2) booking when the coefficient of variation across competitor fares is high, as the dynamics of one-way and roundtrip pricing differ near these deadlines. The third study uses clickstream data in the field of education to compare the success of the traditional, flipped, and micro-flipped classrooms as well as their impacts on classroom attitudes. Students’ quiz grades were not significantly different between the traditional and flipped classrooms. The flipped classroom reduced the impact of procrastination on success. In the end, it was found that micro-flipped was most preferred by students as it incorporated several benefits of the flipped classroom without the effects of a learning curve.
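The abstract above refers to the coefficient of variation across competitor fares. As a minimal sketch of that statistic only, the snippet below divides the standard deviation of a set of fares by their mean. The fare values and the function name are hypothetical, and the choice of the population standard deviation (rather than the sample one) is an assumption; the dissertation may define it differently.

```python
from statistics import mean, pstdev

def coefficient_of_variation(fares):
    """Population standard deviation of the fares divided by their mean."""
    avg = mean(fares)
    if avg == 0:
        raise ValueError("mean fare is zero; coefficient of variation is undefined")
    return pstdev(fares) / avg

# Hypothetical competitor fares for one itinerary on one search date.
competitor_fares = [200.0, 220.0, 180.0, 200.0]
cv = coefficient_of_variation(competitor_fares)
```

A higher value means competitor fares are more spread out relative to their average level.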

5

Li, Richard D. (Richard Ding) 1978. "Web clickstream data analysis using a dimensional data warehouse." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/86671.

Abstract:

Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2001.
Includes bibliographical references (leaves 83-84).
by Richard D. Li.
M.Eng.

6

Wong, Mark Alan. "Logging clickstream data into a database on a consolidated system /." Full text open access at:, 2002. http://content.ohsu.edu/u?/etd,274.

7

Johansson, Henrik. "Using clickstream data as implicit feedback in information retrieval systems." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233870.

Abstract:

This Master's thesis project aims to investigate whether Wikipedia's clickstream data can be used to improve the retrieval performance of information retrieval systems. The project is conducted under the assumption that a traversal between two articles connects the two articles with regard to content. To extract useful terms from the clickstream data, the data needed to be structured so that, given a Wikipedia article, it is possible to find all of the in-going or out-going article traversals. The project settled on using the clickstream data in an automatic query expansion approach. Two expansion methods were investigated: one expanded with full article titles so that the context would be preserved, and the other expanded with individual terms from the article titles. The structure of the data and the two proposed methods were evaluated using a set of queries and relevance judgments. The results of the evaluation show that the method that expands with individual terms performed better than the full article title expansion method and that the individual term method managed to increase the MAP by 11.24%. The expansion method was evaluated on two different query collections, and it was found that the proposed expansion method only improves the results where the average recall of the original queries is low. The thesis conclusion is that the clickstream can be used to improve retrieval performance for an information retrieval system.
The goal of this degree project is to investigate whether Wikipedia's clickstream data can be used to improve the retrieval performance of information retrieval systems. The work was carried out under the assumption that a transition between two Wikipedia articles connects the articles' content and is of interest to the user. To make use of the clickstream data, it must be structured in a useful way so that, given an article, it is possible to see how readers have moved out from or in toward the article. We chose to exploit the data set through automatic query expansion. Two methods were developed: the first expands the query with full article titles, while the second expands with individual words from an article title. The results of the study show that the word-based expansion method performs better than the method that expands with full article titles. The word-based expansion method achieved an improvement of 11.21% in the MAP measure. The work also shows that the expansion method only improves performance when the recall of the original query is low. Regarding the structure of the clickstream data, the outgoing structure performed better than the incoming one. The conclusion of the degree project is that this clickstream data is well suited to improving the retrieval performance of an information retrieval system.
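The term-based expansion described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation: the `(source, target, clicks)` row layout, the function names, the `top_k` cutoff, and the premise that the query has already been matched to a source article are all hypothetical.

```python
from collections import defaultdict

def build_outgoing_index(rows):
    """Map each source article to its outgoing targets with summed click counts."""
    index = defaultdict(dict)
    for source, target, clicks in rows:
        index[source][target] = index[source].get(target, 0) + clicks
    return index

def expand_query(query, article, index, top_k=2):
    """Append individual title terms of the most-clicked outgoing articles."""
    terms = query.lower().split()
    seen = set(terms)
    # Sort targets by descending click count, then by title for determinism.
    targets = sorted(index.get(article, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    for title, _ in targets[:top_k]:
        for term in title.lower().replace("_", " ").split():
            if term not in seen:
                seen.add(term)
                terms.append(term)
    return terms

# Hypothetical clickstream rows in (source, target, clicks) form.
rows = [
    ("Jaguar", "Jaguar_Cars", 120),
    ("Jaguar", "Big_cat", 300),
    ("Jaguar", "Panthera", 80),
]
index = build_outgoing_index(rows)
expanded = expand_query("jaguar habitat", "Jaguar", index)
```

An in-going index could be built the same way by swapping the roles of source and target.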

8

Collin, Sara, and Ingrid Möllerberg. "Designing an Interactive tool for Cluster Analysis of Clickstream Data." Thesis, Uppsala universitet, Avdelningen för visuell information och interaktion, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-414237.

Abstract:

The purpose of this study was to develop an interactive tool that enables identification of different types of users of an application based on clickstream data. A complex hierarchical clustering algorithm tool called Recursive Hierarchical Clustering (RHC) was used. RHC provides a visualisation of user types as clusters, where each cluster has its own distinguishing action pattern, i.e., one or several consecutive actions made by the user in the application. A case study was conducted on the mobile application Plick, which is an application for selling and buying second-hand clothes. During the course of the project, the analysis and its results were found to be difficult for the operators of the tool to understand. The interactive tool had to be extended to visualise the complex analysis and its results in an intuitive way. A literature study of how humans interpret information, and how to present it to operators, was conducted and led to a redesign of the tool. More information was added to each cluster to enable further understanding of the clustering results. A clustering reconfiguration option was also created, which gave operators of the tool the possibility to interact with the analysis. In the reconfiguration, the operator could change the input file of the cluster analysis and thus the end result. Usability tests showed that the extra information added about the clusters served as an amplification and a verification of the original results presented by RHC. In some cases the original result presented by RHC was used as a verification of user group identification made by the operator solely on the basis of the extra information. The usability tests showed that the complex analysis and its results could be understood and configured without considerable comprehension of the algorithm. Instead, it seemed that the tool could be used successfully to identify user types with the help of visual cues in the interface and default settings in the reconfiguration. The visualisation tool is shown to be successful in identifying and visualising user groups in an intuitive way.

9

Neville, Kevin. "Channel attribution modelling using clickstream data from an online store." Thesis, Linköpings universitet, Statistik och maskininlärning, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139318.

Abstract:

In marketing, the behaviour of users is analysed in order to discover which channels (for instance TV, social media, etc.) are important for increasing the user’s intention to buy a product. The search for better channel attribution models than the common last-click model is of major concern for the marketing industry. In this thesis, a probabilistic model for channel attribution has been developed, and this model is demonstrated to be more data-driven than the conventional last-click model. The modelling includes an attempt to incorporate the time aspect, which has not been done in previous research. Our model is based on studying different sequence lengths and computing conditional probabilities of conversion by using logistic regression models. A clickstream dataset from an online store was analysed using the proposed model. This thesis provides evidence that the last-click model is not optimal for conducting these kinds of analyses.
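For orientation, the last-click baseline that the abstract above argues against can be sketched in a few lines. The snippet is not the thesis's probabilistic model: it shows only the baseline rule (all credit to the final channel) plus a plain empirical conversion rate conditioned on the last channel. The journey data, channel names, and function names are hypothetical.

```python
from collections import Counter

def last_click_attribution(journeys):
    """Give each converting journey's full credit to its final channel."""
    credit = Counter()
    for channels, converted in journeys:
        if converted and channels:
            credit[channels[-1]] += 1
    return credit

def conversion_rate_by_last_channel(journeys):
    """Empirical P(conversion | last channel), a simple conditional probability."""
    seen, won = Counter(), Counter()
    for channels, converted in journeys:
        if not channels:
            continue
        seen[channels[-1]] += 1
        won[channels[-1]] += int(converted)
    return {ch: won[ch] / seen[ch] for ch in seen}

# Hypothetical journeys: (ordered channel touches, whether the user converted).
journeys = [
    (["social", "search", "email"], True),
    (["tv", "search"], True),
    (["social", "search"], False),
    (["email"], True),
]
credit = last_click_attribution(journeys)
rates = conversion_rate_by_last_channel(journeys)
```

Note that "social" and "tv" receive no credit here even though they appear in converting journeys, which is the kind of limitation that motivates models using longer sequences.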

10

Bača, Roman. "Sběr sémanticky obohacených clickstreamů." Master's thesis, Vysoká škola ekonomická v Praze, 2009. http://www.nusl.cz/ntk/nusl-76722.

Abstract:

The aim of this thesis is to introduce readers to the area of web mining and familiarize them with tools for data mining on the web. The main emphasis is placed on the analytical software called Piwik. This tool is compared with other currently available analytical tools. The thesis also aims to create compact documentation for Piwik. The largest part of this documentation is devoted to a newly programmed plugin. The principle of information retrieval based on user behavior on the web is first described in general terms, leading to a more concrete description of information retrieval using this new plugin.

11

MacGibbon, David George. "An investigation into the effects of perceptions of person-team fit during online recruitment; and the uses of clickstream data associated with this medium." Thesis, University of Canterbury. Psychology, 2012. http://hdl.handle.net/10092/7007.

Abstract:

Given the increasing predominance of work teams within organisations, this study aimed to investigate the role that perceptions of person-team fit have in the recruitment process, in addition to other forms of person-environment fit. An experimental design was followed which manipulated the amount of team information made available to participants. It was hypothesised that participants who received more information would exhibit higher perceptions of person-team fit. Results supported this prediction, with levels of person-team fit being successfully manipulated. Results also showed significant correlations between person-team fit and organisational attraction, which is important in the early stages of recruitment. This study was conducted remotely over the internet, with clickstream data associated with this medium being collected. It was hypothesised that viewing order and times may be related to dependent variables. No support for this prediction was found; however, the clickstream data did identify a group of participants that appeared not to engage in the task, which has implications for future research carried out online.

12

El-Gharib, Najah Mary. "Using Process Mining Technology to Understand User Behavior in SaaS Applications." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39963.

Abstract:

Processes are running everywhere. Understanding and analyzing business and software processes and their interactions is critical if we wish to improve them. There are many event logs generated from information systems and applications related to fraud detection, healthcare processes, e-commerce processes, and others. These event logs are the starting point for process mining. Process mining aims to discover, monitor, and improve real processes by extracting knowledge from event logs available in information systems. Process mining provides fact-based insight from real event logs that helps analyze and improve existing business processes by answering, for example, performance or conformance questions. As the number of applications developed in a cloud infrastructure (often called Software as a Service, or SaaS, at the application level) is increasing, it becomes essential and useful to study and discover these processes. However, SaaS applications bring new challenges to the problem of process mining. Using the Design Science Research Methodology, this thesis introduces a new method to study, discover, and analyze cloud-based application processes using process mining techniques. It explores the applications and known challenges related to process mining in cloud applications through a systematic literature review (SLR). It then contributes a new Application Programming Interface (API), with an implementation in R, and a companion method called Cloud Pattern API – Process Mining (CPA-PM), for the preprocessing of event logs in a way that addresses many of the challenges identified in the SLR. A case study involving a SaaS company and real event logs related to the trial process of their online service is used to validate the proposed solution.

13

Olanyk, Luís Roberto Zart. "Um modelo para a implantação de um Data Mart de Clickstream para empresas provedoras de acesso à internet de pequeno e médio porte." Florianópolis, SC, 2002. http://repositorio.ufsc.br/xmlui/handle/123456789/83561.

Abstract:

Dissertation (Master's) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-Graduação em Engenharia de Produção.
Data warehousing is one of the fastest-growing fields of Decision Support Systems (DSS) in recent Information Technology (IT). The Internet, despite its youth, has proven to be an overcrowded information environment with a high degree of competitiveness. With the aim of strengthening relationships with customers who use Web sites, this work seeks to lay the foundations for building a DSS tool that supports this relationship. The work describes the concepts referenced in the literature for building a clickstream data warehouse, setting out the necessary requirements and noting the main points where different solutions apply, so that, on a solid basis, the best options for deploying the project can be identified. Based on the physical structure of the organization under study, a model for deploying a clickstream data mart is proposed. Seeking to solve navigation problems and focusing on improving the service provided to the organization's customers, the prototype was deployed and proved important in supporting these tasks. Some of the results obtained are presented, demonstrating the power of the prototype that was built. Finally, some recommendations for future work are discussed.

14

Wang, Yufei 1981. "An analysis of different data base structures and management systems on Clickstream data collected for advocacy based marketing strategies experiments for Intel and GM." Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/33387.

Abstract:

Thesis (M. Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.
Includes bibliographical references (leaves 82-83).
Marketing on the Internet is the next big field in marketing research. Clickstream data contributes greatly to analyzing the effects of advocacy-based marketing strategies, but handling clickstream data becomes a major issue. This paper looks at the problems caused by clickstream data from a database perspective and considers several approaches to alleviate the difficulties. Applications of modern database optimization techniques are discussed, and the paper details the implementation of these techniques for the Intel and GM project.
by Yufei Wang.
M.Eng. and S.B.

15

Mccart, James A. "Goal Attainment On Long Tail Web Sites: An Information Foraging Approach." Scholar Commons, 2009. http://scholarcommons.usf.edu/etd/3686.

Abstract:

This dissertation sought to explain goal achievement at limited traffic “long tail” Web sites using Information Foraging Theory (IFT). The central thesis of IFT is that individuals are driven by a metaphorical sense of smell that guides them through patches of information in their environment. An information patch is an area of the search environment with similar information. Information scent is the driving force behind why a person makes a navigational selection amongst a group of competing options. As foragers are assumed to be rational, scent is a mechanism by which to reduce search costs by increasing the accuracy of judging which option leads to the information of value. IFT was originally developed to be used in a “production rule” environment, where a user would perform an action when the conditions of a rule were met. However, the use of IFT in clickstream research required conceptualizing the ideas of information scent and patches in a non-production rule environment. To that end, this dissertation asked three research questions regarding (1) how to learn information patches, (2) how to learn trails of scent, and finally (3) how to combine both concepts to create a Clickstream Model of Information Foraging (CMIF). The learning of patches and trails was accomplished by using contrast sets, which distinguished between individuals who achieved a goal and those who did not. A user- and site-centric version of the CMIF, which extended and operationalized IFT, presented and evaluated hypotheses. The user-centric version had four hypotheses and examined product purchasing behavior from panel data, whereas the site-centric version had nine hypotheses and predicted contact form submission using data from a Web hosting company. In general, the results show that patches and trails exist on several Web sites, and the majority of hypotheses were supported in each version of the CMIF. This dissertation contributed to the literature by providing a theoretically-grounded model which tested and extended IFT; introducing a methodology for learning patches and trails; detailing a methodology for preprocessing clickstream data for long tail Web sites; and focusing on traditionally under-studied long tail Web sites.

16

Forslund, John, and Jesper Fahlén. "Predicting customer purchase behavior within Telecom : How Artificial Intelligence can be collaborated into marketing efforts." Thesis, KTH, Skolan för industriell teknik och management (ITM), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279575.

Abstract:

This study aims to investigate the implementation of an AI model that predicts customer purchases in the telecom industry. The thesis also outlines how such an AI model can assist decision-making in marketing strategies. It is concluded that designing the AI model by following a Recurrent Neural Network (RNN) architecture with a Long Short-Term Memory (LSTM) layer allows for a successful implementation with satisfactory model performance. Stepwise instructions for constructing such a model are presented in the methodology section of the study. The RNN-LSTM model further serves as an assisting tool for marketers to assess, in a quantitative way, how a consumer’s website behavior affects their purchase behavior over time, by observing what the authors refer to as the Customer Purchase Propensity Journey (CPPJ). The firm empirical basis of CPPJ can help organizations improve their allocation of marketing resources, as well as benefit the organization’s online presence by allowing for personalization of the customer experience.
This study investigates the implementation of an AI model that predicts customer purchases in the telecom industry. The study also aims to show how such an AI model can support decision-making in marketing strategies. By designing the AI model with a Recurrent Neural Network (RNN) architecture with a Long Short-Term Memory (LSTM) layer, the study concludes that such a design enables a successful implementation with satisfactory model performance. Step-by-step instructions for constructing the model are given in the methodology section of the study. The RNN-LSTM model can advantageously be used as a supporting tool for marketers to assess, in a quantitative way, how a customer's behavior patterns on a website affect their purchase behavior over time, by observing the framework that the authors call Kundköpbenägenhetsresan, in English the Customer Purchase Propensity Journey (CPPJ). The empirical basis of CPPJ can help organizations improve the allocation of marketing resources and benefit their digital presence by enabling more relevant personalization of the customer experience.

17

Berg, Marcus. "Evaluating Quality of Online Behavior Data." Thesis, Stockholms universitet, Statistiska institutionen, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-97524.

Abstract:

This thesis has two purposes: emphasizing the importance of data quality in Big Data, and identifying and evaluating potential error sources in JavaScript tracking (a client-side, on-site method for collecting online-behavior clickstream data that is commonly used in web analytics). The importance of data quality in Big Data is emphasized through the evaluation of JavaScript tracking. The Total Survey Error framework is applied to JavaScript tracking, and 17 nonsampling error sources are identified and evaluated. The bias imposed by these error sources varies from large to small, but the major takeaway is the large number of error sources actually identified. More work is needed. Big Data has much to gain from quality work. Similarly, there is much that can be done with statistics in web analytics.

18

Chen, Ting-Rui, and 陳廷睿. "Extended Clickstream: an analysis of the missing user behaviors in the Clickstream." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/9nrd3r.

Abstract:

Master's
National Central University
Department of Computer Science and Information Engineering
Academic year 107
Nowadays, people often use clickstream to represent the behavior of online users. However, we found that clickstream only represents part of users' browsing behaviors. For instance, clickstream does not include tab switching and browser window switching. We collect these kinds of behaviors and name them "extended clickstream". This thesis builds a service to capture both clickstream and extended clickstream, and also provides an analysis of the differences between the two. We use a multi-task learning model with GRU components to perform multi-objective predictions of "what kind of website the user will go to next" and "how long the interval between clicks will be" for the time series of clickstreams and extended clickstreams. Our experimental results show that combining clickstream and extended clickstream can improve the prediction performance. In addition, this thesis finds that the clickstream will record unintended clicks due to the operation mechanism of certain websites. Moreover, we can differentiate a single user from several devices by combining the clickstream and extended clickstream.

19

Moe, Wendy W., and Peter S. Fader. "Capturing Evolving Visit Behavior in Clickstream Data." 2001. http://hdl.handle.net/10150/105085.

Abstract:

Many online retailers monitor visitor traffic as a measure of their stores' success. However, summary measures such as the total number of visits per month provide little insight about individual-level shopping behavior. Additionally, behavior may evolve over time, especially in a changing environment like the Internet. Understanding the nature of this evolution provides valuable knowledge that can influence how a retail store is managed and marketed. This paper develops an individual-level model for store visiting behavior based on Internet clickstream data. We capture cross-sectional variation in store-visit behavior as well as changes over time as visitors gain experience with the store. That is, as someone makes more visits to a site, her latent rate of visit may increase, decrease, or remain unchanged as in the case of static, mature markets. So as the composition of the customer population changes (e.g., as customers mature or as large numbers of new and inexperienced Internet shoppers enter the market), the overall degree of visitor heterogeneity that each store faces may shift. We also examine the relationship between visiting frequency and purchasing propensity. Previous studies suggest that customers who shop frequently may be more likely to make a purchase on any given shopping occasion. As a result, frequent shoppers often comprise the preferred target segment. We find evidence supporting the fact that people who visit a store more frequently are more likely to buy. However, we also show that changes (i.e., evolution) in an individual's visit frequency over time provide further information regarding which customer segments are more likely to buy. Rather than simply targeting all frequent shoppers, our results suggest that a more refined segmentation approach that incorporates how much an individual's behavior is changing could more efficiently identify a profitable target segment.

20

Su, Ching-Lun, and 蘇敬倫. "Predicting Online Purchasing Behavior Using Clickstream Data." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/nnmd7f.

Abstract:

Master's
National Taiwan University
Graduate Institute of Economics
Academic year 106
Online shopping has been booming over the past ten years. How to make good use of the rich data generated in the process of online shopping is now a critical issue for online retailers. Online retailers cannot observe physical characteristics of their customers, such as gender and age, but they can use browsing data to analyze customers’ preferences and predict purchasing behavior. This study explores the relationships between browsing behavior, customer characteristics, and purchase results using clickstream data from the website of an online wine retailer. I use a K-means model to cluster customers based on the filters they chose when browsing the website. I find that the clustering results are significantly correlated with customers’ location and gender. Also, the more filters a customer chooses before a purchase, the more wines they buy and the higher their order total. The results of logistic regressions show that customers who choose a low price range to filter products are most likely to buy.
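As a rough sketch of the clustering step mentioned in the abstract above, the snippet below runs plain Lloyd's K-means on binary vectors that record which filters a customer used. The filter names, the sample vectors, the starting centroids, and the fixed iteration count are hypothetical assumptions for illustration; the thesis's actual features and K-means configuration are not given in the abstract.

```python
def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm with caller-supplied starting centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members; keep it if empty.
        centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else c
            for members, c in zip(clusters, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids

# Hypothetical vectors: [used_price_filter, used_region_filter, used_grape_filter]
customers = [
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 1],
    [0, 1, 0],
]
labels, centroids = kmeans(customers, centroids=[[1, 0, 0], [0, 1, 0]])
```

Supplying the starting centroids keeps the example deterministic; in practice they are usually chosen at random or with a seeding scheme, and the resulting labels can then be cross-tabulated against attributes such as location.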


21

Teixeira, Ricardo Filipe Fernandes e. Costa Magalhães. "Using clickstream data to analyze online purchase intentions." Dissertação, 2015. https://repositorio-aberto.up.pt/handle/10216/83497.


Abstract:

Nowadays, traditional business techniques are becoming outdated due to the rise of online shopping, the so-called e-commerce. This new and in many ways uncharted territory poses difficult challenges for marketing, because traditional methods are not effective when dealing with online customers. In this context, it is imperative that companies have an in-depth understanding of online behavior in order to succeed within the complex environment in which they compete. Web server logs are the main source of potentially useful information about customers for online stores. These logs contain details on how each customer visited the online store; moreover, they make it possible to reconstruct the exact sequence of pages accessed, the so-called clickstream data. This data is fundamental in depicting each customer's behavior, and analyzing and exploring it is key to improving customer relationship management. The analysis of clickstream data allows for the understanding of customer intentions. One of the most studied measures is customer conversion, that is, the percentage of customers who actually make a purchase during a specific online session. In this dissertation we investigate other relevant intentions, namely customer purchasing engagement and real-time purchase likelihood. Actual data from a major European online grocery retailer are used to support and evaluate different data mining models.


22

Teixeira, Ricardo Filipe Fernandes e. Costa Magalhães. "Using clickstream data to analyze online purchase intentions." Master's thesis, 2015. https://repositorio-aberto.up.pt/handle/10216/83497.




23

Chen, Po Chu, and 陳伯駒. "Predicting Consumers’ Purchase Decision by Clickstream Data: A Machine Learning Approach." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/cdk6y2.


Abstract:

Master's degree
National Taiwan University
Graduate Institute of Economics
Academic year 106 (ROC calendar)
In recent years, numerous businesses have gradually shifted from physical stores to web shops, so-called e-commerce. These online stores keep many log files in the back end, which record the pages accessed by visitors, namely the clickstream data. In this study, we predict consumers’ purchase decisions by analyzing the clickstream data from an online wine retailer. We apply two modern machine learning models, decision tree and random forest, to predict consumers’ final purchase intention. Besides the usual features based on visitors’ activities on the website, we construct a new feature that clusters visitors into groups according to the sequence of page types accessed. After re-sampling to remedy the unbalanced data, both models show high predictive accuracy of up to 90% and provide new insight for retailers to target specific visitors on the website.


24

Chang, Peishih. "Sifting customers from the clickstream behavior pattern discovery in a virtual shopping environment." Thesis, 2007. http://library1.njit.edu/etd/fromwebvoyage.cfm?id=njit-etd2007-043.


25

"Um Modelo Para A Implantação de Um Data Mart de Clickstream Para Empresas Provedoras de Acesso À Internet de Pequeno E Médio Porte." Tese, Programa de Pós Graduação em Engenharia de Produção, 2002. http://teses.eps.ufsc.br/defesa/pdf/7993.pdf.


26

Camacho, Pedro André Freitas. "Sistema de recomendação em real-time para reserva de transfers." Master's thesis, 2020. http://hdl.handle.net/10071/22131.


Abstract:

The continued growth in the number of tourists in recent years is proportional to the increased use of transfer services, and offering this type of service is increasingly a trend. Today’s customers are more demanding and look for a more streamlined and personalized online experience, which can be achieved through techniques that anticipate customer behaviour. In contemporary society, the search for mechanisms that can recommend or assist in choosing products or services is increasingly a trend, fostering the concepts of cross-selling and upselling in companies. The purchase of private transfer services through reservations on websites generates a large amount of data that can be used to segment customers and build recommendation systems that suggest other products or services to the customer. In this dissertation, we present and develop a hybrid classification model for a transfer company based in the Algarve that intends to increase sales of its parallel services (experiences/tours). An exploratory analysis is carried out, and customer segmentation techniques are applied, to identify the behaviour and patterns of the company’s customers. The proposed recommendation system works with a classification model that, in a first stage, identifies potential buyers of experiences and, in a second stage, suggests which of the available experiences is best suited to each customer. Only a low percentage of customers who buy transfer services also buy experiences, and the aim is to increase this percentage.


27

Mota, Gabriel Ivan da Silva Rosa Neco da. "Detection of fraud patterns in electronic commerce environments." Master's thesis, 2014. http://hdl.handle.net/1822/33392.


Abstract:

Master's dissertation in Systems Engineering
Electronic transactions (e-commerce) have revolutionized the way consumers shop, making small and local retailers, which were being affected by the worldwide crisis, accessible to the entire world. As the e-commerce market expands, commercial transactions supported by credit cards, known as Card or Customer Not Present (CNP) transactions, also increase. This growing relationship, quite natural and expected, has clear advantages, facilitating e-commerce transactions and attracting new possibilities for trading. However, at the same time a big and serious problem emerges: the occurrence of fraudulent situations in payments. Fraud imposes severe financial losses, which deeply impact e-commerce companies and their revenue. In order to minimize losses, they spend a lot of effort (and money) trying to establish the most satisfactory solutions to detect and counteract fraud in a timely manner. In the e-commerce domain, fraud analysts are typically interested in subject-oriented customer data, frequently extracted from each order process that occurred on an e-commerce site. Besides transactional data, all customer behavior data, e.g. clickstream data, are traced and recorded, enriching the means of detection with profiling data and providing a way to trace customers' behavior over time. In this work, a signature-based method was used to establish the characteristics of user behavior and detect potential fraud cases. Signatures have already been used successfully for anomaly detection in many areas, such as credit card usage, network intrusion, and in particular telecommunications fraud. A signature is defined by a set of attributes that receive a diverse range of variables (e.g. the average number of orders, time spent per order, number of payment attempts, number of days since last visit, and many others) related to the behavior of a user in an e-commerce application scenario. By analyzing user behavior deviation, detected by comparing the user's recent activity with the user's behavior data as expressed through the user signature, it is possible to detect potential fraud situations (deviant behaviors) in useful time, giving fraud analysts a more robust and accurate decision support system for their daily job.


28

Borges, Eurico Alexandre Teixeira. "Sistemas de Data Webhousing : análise, desenho, implementação e exploração de sistemas reais." Master's thesis, 2004. http://hdl.handle.net/1822/2787.


Abstract:

Master's dissertation in Informatics, specialization in Distributed Systems, Computer Communications and Computer Architecture
The Web is becoming one of the most appealing environments for organisations as a means of promoting their businesses and activities, as well as a commercial channel. However, a Web user can easily leave one organisation’s Web site for a competitor’s if he does not quickly find what he is looking for, or if he finds something unpleasant on the site. Knowing the site’s users and making sure that the products, services or information the site provides are what the users want is nowadays a must. That is why many organisations have started to study how their Web site users browse the site, where they leave the site and why, how frequently their users return, which products and services are most appealing and, in general terms, everything that may be used to improve the Web site and to retain users or attract new ones. Every user move may be tracked by recording the clicks made on the different Web pages during a visit. This flow of clicks is called a clickstream. It is the data logged by the Web server on the user’s selections that enables the organisation to study users’ moves and behaviour. However, the Web server log only keeps the bare bones of the user’s activity. This data has to be enriched with data collected by other systems designed to provide the Web site with content or additional functionality. Traditionally, the gathering and integration of data from heterogeneous data sources is done inside a Data Warehouse. By adding clickstream data to it we create a Data Webhouse. However, Web technology and the volume, heterogeneity and incompleteness of the data create difficulties in the process of extracting, transforming and loading data into the Data Webhouse. In this document we present a dimensional model for a Data Webhouse whose purpose is to analyse a commercial Web site. Several data sources are presented and analysed in detail. Some of the techniques used to eliminate or reduce clickstream data problems are also described. The Data Webhouse extraction, cleaning, transformation and loading process is described, with special attention paid to clickstream processing tasks such as user and robot identification and user session reconstruction. A new decision support system prototype, named Webuts (Web Usage Tracking Statistics), is presented. This system’s purpose is to track and analyse the moves and activities of a Web site’s users, as well as to generate statistical data on the Web site’s operation. Its operation is based on a Data Webhouse, and its development incorporated some of the elements, techniques and best practices studied and described.
Sonae, Indústria Consultoria e Gestão - Departamento de Sistemas de Informação


29

Cavalcanti, Fábio Torres. "Incremental mining techniques." Master's thesis, 2005. http://hdl.handle.net/1822/3965.


Abstract:

Master's dissertation in Data Systems and Analytical Processing.
The increasing need to explore and analyse organizational data, seeking new knowledge that may be implicit in operational systems, has given a huge impulse to the study of data mining techniques. This impulse can be clearly noticed in the e-commerce domain, where the analysis of clients’ past behaviour is extremely valuable and may eventually yield important working instruments for determining their future behaviour. It thus becomes possible to predict what a Web site visitor might be looking for, and to restructure the Web site to meet his needs. The visitor then keeps navigating the Web site for longer, which increases the probability of his being attracted by some product, leading to its purchase. To achieve this goal, Web site adaptation has to be fast enough to change while the visitor navigates, and has to ensure that the adaptation follows the most recent visitors’ navigation behaviour patterns, which requires a mining algorithm with a response time good enough to update the patterns frequently. Typical databases change continuously over time, which can invalidate some patterns or introduce new ones. Conventional data mining techniques have proved inefficient in this setting, as they need to be re-executed to update the mining results with those derived from the latest database changes. Incremental mining techniques emerged to avoid algorithm re-execution and to update mining results when incremental data are added or old data are removed, ensuring better performance in data mining processes. In this work, we analyse some existing incremental mining strategies and models, with particular emphasis on their application to Web sites, in order to develop models that discover Web user behaviour patterns and automatically generate recommendations to restructure sites in useful time. To accomplish this task, we designed and implemented Spottrigger, a system responsible for the whole data life cycle in Web site restructuring work. This life cycle includes tasks specially oriented to extracting the raw data stored in Web servers, passing these data through intermediate phases of cleansing and preparation, performing an incremental data mining technique to extract users’ navigation patterns, and finally suggesting new locations of spots on the Web site according to the patterns found and the profile of the visitor. We applied Spottrigger in our case study, which was based on data gathered from a real online newspaper. Our main goal was to collect, in useful time, information about the users who are consulting the site at a given moment and thus to restructure the Web site in the short term, delivering the scheduled advertisements, activated according to the user’s profile. Basically, our idea is to classify advertisements into levels and restructure the Web site so that the higher-level advertisements appear on the pages the visitor will most probably access. In order to do that, we construct a page ranking for the visitor, based on the results obtained through the incremental mining technique. Since visitors’ navigation behaviour may change over time, the incremental mining algorithm is responsible for catching these behaviour changes and quickly updating the patterns. Using Spottrigger as a decision support system for advertising, a newspaper company may significantly improve the merchandising of its publicity spots, guaranteeing that a given advertisement will reach a higher number of visitors, even if they change their behaviour and visit pages that were not usually visited.
