Dissertations / Theses on the topic 'Web usage log mining'

Consult the top 50 dissertations / theses for your research on the topic 'Web usage log mining.'

1

Khairo-Sindi, Mazin Omar. "Framework for web log pre-processing within web usage mining." Thesis, University of Manchester, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.488456.

Abstract:
Web mining is gaining popularity by the day, and the role of the web in providing invaluable information about users' behaviour and navigational patterns is now highly appreciated by information technology specialists and businesses alike. Nevertheless, given the enormity of the web and the complexities involved in delivering and retrieving electronic information, one can imagine the difficulties involved in extracting a set of minable objects from huge volumes of raw web log data. Added to the fact that web mining is a new science, this may explain why research on data pre-processing is still limited in scope. And although the debate on major issues is still gaining momentum, attempts to establish a coherent and accurate web usage pre-processing framework are still non-existent. As a contribution to the existing debate, this research aims at formulating a workable, reliable, and coherent pre-processing framework. The study addresses the following issues: enhancing and maximising knowledge about every visit made to a given website from multiple web logs, even when they have different schemas; improving the process of eliminating excessive web log data that are not related to users' behaviour; modifying the existing approaches to session identification in order to obtain more accurate results; and eliminating redundant data that result from repeatedly adding cached data to the web logs regardless of whether or not the added page is a frameset. In addition to the suggested improvements, the study introduces a novel task, namely "automatic web log integration", which makes it possible to integrate different web logs with different schemas into a unified data set. Finally, the study incorporates the removal of unnecessary information, particularly that pertaining to malicious website visits, into the non-user request removal task. Put together, the suggested improvements and novel tasks result in a coherent pre-processing framework.
To test the reliability and validity of the framework, a website is created in order to perform the necessary experimental work and a prototype pre-processing tool is devised and employed to support it.
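As an illustration of the session identification task mentioned in this abstract, here is a minimal Python sketch of the common inactivity-timeout heuristic. It is not the thesis's own method: the 30-minute threshold, the `sessionize` function, and the `(visitor_id, timestamp, url)` record shape are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Hypothetical inactivity threshold; the abstract does not specify one.
SESSION_TIMEOUT = timedelta(minutes=30)

def sessionize(requests, timeout=SESSION_TIMEOUT):
    """Group (visitor_id, timestamp, url) tuples into per-visitor sessions.

    A new session starts whenever the gap between two consecutive
    requests from the same visitor exceeds `timeout`.
    """
    sessions = {}   # visitor_id -> list of sessions, each a list of urls
    last_seen = {}  # visitor_id -> timestamp of that visitor's previous request
    for visitor, ts, url in sorted(requests, key=lambda r: (r[0], r[1])):
        if visitor not in last_seen or ts - last_seen[visitor] > timeout:
            sessions.setdefault(visitor, []).append([])  # open a new session
        sessions[visitor][-1].append(url)
        last_seen[visitor] = ts
    return sessions
```

A fixed timeout is only one possible heuristic; referrer-based or navigation-based approaches would split the same requests differently.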
2

Shun, Yeuk Kiu. "Web mining from client side user activity log." M.Phil. thesis, Hong Kong University of Science and Technology, 2002. http://library.ust.hk/cgi/db/thesis.pl?COMP%202002%20SHUN.

Abstract:
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2002.
Includes bibliographical references (leaves 85-90). Also available in electronic version. Access restricted to campus users.
3

Vlk, Vladimír. "Získávání znalostí z webových logů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236196.

Abstract:
This master's thesis deals with the creation of an application whose goal is to pre-process web log data and find association rules in them. The first part deals with the concept of Web mining. The second part is devoted to Web usage mining and related notions. The third part deals with the design of the application. The fourth part describes the implementation of the application. The last part deals with experiments with the application and interpretation of the results.
4

Tanasa, Doru. "Web usage mining : contributions to intersites logs preprocessing and sequential pattern extraction with low support." Nice, 2005. http://www.theses.fr/2005NICE4019.

Abstract:
Web Usage Mining (WUM) is a fairly recent research field that corresponds to the process of knowledge discovery from databases (KDD) applied to Web usage data. It comprises three main stages: the pre-processing of raw data, the discovery of patterns, and the analysis (or interpretation) of results. The quantity of the Web usage data to be analysed and its low quality (in particular the absence of structure) are the principal problems in WUM. When applied to these data, classic data mining algorithms generally give disappointing results in terms of the behaviours of Web site users (e.g. obvious sequential patterns, devoid of interest). In this thesis, we make two significant contributions to the WUM process, both implemented in our toolbox, AxisLogMiner. First, we propose a complete methodology for pre-processing Web logs, whose originality lies in taking the intersite (multi-site) aspect of WUM into account. Our methodology has four distinct steps: log file merging, data cleaning, data structuring, and data aggregation. Our second contribution aims at discovering, from a large pre-processed log file, the minority behaviours corresponding to sequential patterns with very low support. For that, we propose a general methodology that divides the pre-processed log file into a series of sub-logs. Based on this methodology, we designed three approaches for extracting sequential patterns with low support (the sequential, iterative, and hierarchical approaches). These approaches were implemented in concrete hybrid methods using clustering and sequential pattern mining algorithms.
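The first two pre-processing steps named in this abstract (log file merging and data cleaning) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not AxisLogMiner's implementation: the `(timestamp, site, request)` tuple shape, the function names, and the list of static-resource suffixes are hypothetical.

```python
import heapq

def merge_logs(*logs):
    """Merge several per-site logs into one list ordered by timestamp.

    Each log is an iterable of (timestamp, site, request) tuples that is
    already sorted by timestamp, as server logs normally are.
    """
    return list(heapq.merge(*logs, key=lambda entry: entry[0]))

def drop_resource_requests(entries,
                           suffixes=(".gif", ".jpg", ".png", ".css", ".js")):
    """Cleaning step: keep only requests that do not target a static resource.

    The suffix list is an assumption chosen for illustration.
    """
    return [e for e in entries if not e[2].lower().endswith(suffixes)]
```

Because `heapq.merge` relies on each input already being sorted, logs with out-of-order entries would need sorting first.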
5

Kilic, Sefa. "Clustering Frequent Navigation Patterns From Website Logs Using Ontology And Temporal Information." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12613979/index.pdf.

Abstract:
Given a set of web pages labeled with ontological items, the similarity between two web pages is measured using the similarity between the ontological items with which the pages are labeled. Using this similarity measure between two pages, the degree of similarity between two sequences of web page visits can be calculated as well. Using clustering algorithms, similar frequent sequences are grouped and representative sequences are selected from these groups. A new sequence is compared with all clusters and assigned to the most similar one. Representatives of the most similar cluster can be used in several real-world cases. They can be used for predicting and prefetching the next page the user will visit, or for helping the user navigate the website. They can also be used to improve the structure of the website for easier navigation. In this study, the effect of the time spent on each web page during the session is analyzed.
6

Benkovská, Petra. "Web Usage Mining." Master's thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-3950.

Abstract:
General characteristics of web mining, including the methodology and procedures covered by this term. Relation to other areas (data mining, artificial intelligence, statistics, databases, internet technologies, management, etc.). Web usage mining: data sources, data pre-processing, characterization of analytical methods and tools, interpretation of outputs (results), and possible areas of use, including examples. A proposed solution method, its realization, and interpretation of the outputs of a concrete example using the above-mentioned web usage mining methods.
7

Tanasa, Doru. "Fouille de données d'usage du Web : Contributions au prétraitement de logs Web Intersites et à l'extraction des motifs séquentiels avec un faible support." Phd thesis, Université de Nice Sophia-Antipolis, 2005. http://tel.archives-ouvertes.fr/tel-00178870.

Abstract:
The last fifteen years have been marked by exponential growth of the Web, both in the number of available Web sites and in the number of their users. This growth has generated very large volumes of data on users' Web usage traces, recorded in Web log files. In addition, the owners of these sites have expressed the need to understand their visitors better in order to meet their expectations better. Web Usage Mining (WUM), a fairly recent research field, corresponds precisely to the process of knowledge discovery from databases (KDD) applied to Web usage data. It comprises three main stages: data pre-processing, pattern discovery, and the analysis (or interpretation) of results. A WUM process extracts behaviour patterns from usage data and, possibly, from information about the site (structure and content) and about the site's users (profiles). The quantity of usage data to be analysed and its low quality (in particular the absence of structure) are the main problems in WUM. Classic data mining algorithms applied to these data generally give disappointing results in terms of users' practices (for example, obvious sequential patterns devoid of interest). In this thesis, we make two important contributions to the WUM process, implemented in our AxisLogMiner toolbox. We propose a general methodology for pre-processing Web logs and a general divisive methodology with three approaches (and associated concrete methods) for discovering sequential patterns with low support. Our first contribution concerns the pre-processing of Web usage data, an area still rarely addressed in the literature.
The originality of the proposed pre-processing methodology is that it takes into account the multi-site aspect of WUM, which is essential for understanding the practices of users who navigate transparently, for example, across several Web sites of the same organisation. Besides integrating the main existing work on this topic, our methodology proposes four distinct steps: log file merging, data cleaning, data structuring, and data aggregation. In particular, we propose several heuristics for cleaning out Web robots, aggregated variables describing sessions and visits, and the storage of these data in a relational model. Several experiments were carried out, showing that our methodology allows a strong reduction (up to 10 times) in the number of initial requests and provides richer structured logs for the subsequent data mining step. Our second contribution aims at discovering, from a large pre-processed log file, minority behaviours corresponding to sequential patterns with very low support. For this, we propose a general methodology that divides the pre-processed log file into sub-logs and comes in three approaches for extracting low-support sequential patterns (Sequential, Iterative, and Hierarchical). These were implemented in concrete hybrid methods involving clustering and sequential pattern mining algorithms. Several experiments, carried out on logs from academic sites, allowed us to discover interesting sequential patterns with very low support, whose discovery by a classic Apriori-type algorithm was impossible.
Finally, we propose a toolbox called AxisLogMiner, which supports our pre-processing methodology and, currently, two concrete hybrid methods for discovering sequential patterns in WUM. This toolbox has been used for numerous pre-processing runs on log files and for experiments with our implemented methods.
8

Ngok, Man Chan. "Log mining to support web query expansions." Thesis, University of Macau, 2008. http://umaclib3.umac.mo/record=b1783608.

9

Leibold, Markus. "Web Log Mining als Controllinginstrument der PR." [S.l. : s.n.], 2004. http://www.bsz-bw.de/cgi-bin/xvms.cgi?SWB11675715.

10

Oosthuizen, Craig Peter. "Web usage mining of organisational web sites." Thesis, Nelson Mandela Metropolitan University, 2005. http://hdl.handle.net/10948/399.

Abstract:
Web Usage Mining (WUM) can be used to determine whether the information architecture of a web site is structured correctly. Existing WUM tools, however, do not indicate which web usage mining algorithms are used, nor do they provide effective graphical visualisations of the results obtained. WUM techniques can be used to determine typical navigation patterns of the users of organisational web sites. An organisational web site can be described as a site which has a high level of content. The Computer Science & Information Systems (CS&IS) web site at the Nelson Mandela Metropolitan University (NMMU) is an example of such a web site. The process of combining WUM and information visualisation techniques in order to discover useful information about web usage patterns is called visual web mining. The goal of this research is to discuss the development of a WUM model and a prototype, called WebPatterns, which allows the user to effectively visualise web usage patterns of an organisational web site. This will facilitate determining whether the information architecture of the CS&IS web site is structured correctly. The WUM algorithms used in WebPatterns are association rule mining and sequence analysis. The purpose of association rule mining is to discover relationships between different web pages within a web site. Sequence analysis is used to determine the longest time-ordered paths that satisfy a user-specified minimum frequency. A radial tree layout is used in WebPatterns to visualise the static structure of the organisational web site. The structure of the web site is laid out radially, with the home page in the middle and other pages positioned in circles at various levels around it. Colour and other visual cues are used to show the results of the WUM algorithms. User testing was used to determine the effectiveness and usefulness of WebPatterns for visualising web usage patterns.
The results of the user testing clearly show that the participants were highly satisfied with the visual design and information provided by WebPatterns. All the participants also indicated that they would like to use WebPatterns in the future. Analysis of the web usage patterns presented by WebPatterns was used to determine that the information architecture of the CS&IS web site can be restructured to better facilitate information retrieval. Changes to the CS&IS web site were suggested, including placing embedded hyperlinks on the home page to the frequently accessed sections of the web site.
11

Norguet, Jean-Pierre. "Semantic analysis in web usage mining." Doctoral thesis, Universite Libre de Bruxelles, 2006. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210890.

Abstract:
With the emergence of the Internet and of the World Wide Web, the Web site has become a key communication channel in organizations. To satisfy the objectives of the Web site and of its target audience, adapting the Web site content to the users' expectations has become a major concern. In this context, Web usage mining, a relatively new research area, and Web analytics, the part of Web usage mining that has emerged most strongly in the corporate world, offer many Web communication analysis techniques. These techniques include prediction of the user's behaviour within the site, comparison between expected and actual Web site usage, adjustment of the Web site with respect to the users' interests, and mining and analyzing Web usage data to discover interesting metrics and usage patterns. However, Web usage mining and Web analytics suffer from significant drawbacks when it comes to supporting the decision-making process at the higher levels in the organization.

Indeed, according to organizations theory, the higher levels in the organizations need summarized and conceptual information to take fast, high-level, and effective decisions. For Web sites, these levels include the organization managers and the Web site chief editors. At these levels, the results produced by Web analytics tools are mostly useless. Indeed, most of these results target Web designers and Web developers. Summary reports like the number of visitors and the number of page views can be of some interest to the organization manager but these results are poor. Finally, page-group and directory hits give the Web site chief editor conceptual results, but these are limited by several problems like page synonymy (several pages contain the same topic), page polysemy (a page contains several topics), page temporality, and page volatility.

Web usage mining research projects on their part have mostly left aside Web analytics and its limitations and have focused on other research paths. Examples of these paths are usage pattern analysis, personalization, system improvement, site structure modification, marketing business intelligence, and usage characterization. A potential contribution to Web analytics can be found in research about reverse clustering analysis, a technique based on self-organizing feature maps. This technique integrates Web usage mining and Web content mining in order to rank the Web site pages according to an original popularity score. However, the algorithm is not scalable and does not answer the page-polysemy, page-synonymy, page-temporality, and page-volatility problems. As a consequence, these approaches fail at delivering summarized and conceptual results.

An interesting attempt to obtain such results has been the Information Scent algorithm, which produces a list of term vectors representing the visitors' needs. These vectors provide a semantic representation of the visitors' needs and can be easily interpreted. Unfortunately, the results suffer from term polysemy and term synonymy, are visit-centric rather than site-centric, and are not scalable to produce. Finally, according to a recent survey, no Web usage mining research project has proposed a satisfying solution to provide site-wide summarized and conceptual audience metrics.

In this dissertation, we present our solution to answer the need for summarized and conceptual audience metrics in Web analytics. We first describe several methods for mining the Web pages output by Web servers. These methods include content journaling, script parsing, server monitoring, network monitoring, and client-side mining. These techniques can be used alone or in combination to mine the Web pages output by any Web site. Then, the occurrences of taxonomy terms in these pages can be aggregated to provide concept-based audience metrics. To evaluate the results, we implement a prototype and run a number of test cases with real Web sites.

According to the first experiments with our prototype and SQL Server OLAP Analysis Service, concept-based metrics prove extremely summarized and much more intuitive than page-based metrics. As a consequence, concept-based metrics can be exploited at higher levels in the organization. For example, organization managers can redefine the organization strategy according to the visitors' interests. Concept-based metrics also give an intuitive view of the messages delivered through the Web site and make it possible to adapt the Web site communication to the organization objectives. The Web site chief editor, for his part, can interpret the metrics to redefine the publishing orders and redefine the sub-editors' writing tasks. As decisions at higher levels in the organization should be more effective, concept-based metrics should significantly contribute to Web usage mining and Web analytics.
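The aggregation of taxonomy-term occurrences into concept-based audience metrics, as described in this abstract, might look roughly like the Python sketch below. The taxonomy contents, the `concept_metrics` function, and the simple word tokenisation are assumptions made for illustration, not the dissertation's prototype.

```python
from collections import Counter
import re

# Hypothetical taxonomy: concept -> terms that signal it.
TAXONOMY = {
    "admissions": {"enrolment", "application"},
    "research": {"laboratory", "publication"},
}

def concept_metrics(served_pages, taxonomy=TAXONOMY):
    """Count, per concept, how often its terms occur in the pages output.

    `served_pages` is an iterable of page texts, one item per page view,
    so a page viewed twice contributes twice.
    """
    term_to_concept = {t: c for c, terms in taxonomy.items() for t in terms}
    counts = Counter()
    for text in served_pages:
        # Naive tokenisation into lowercase alphabetic words (an assumption).
        for word in re.findall(r"[a-z]+", text.lower()):
            concept = term_to_concept.get(word)
            if concept:
                counts[concept] += 1
    return counts
```

Counting per page view rather than per distinct page is what makes the result an audience metric rather than a content inventory.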


Doctorate in Applied Sciences

12

Mendoza, Rocha Marcelo Gabriel. "Query log mining in search engines." Tesis, Universidad de Chile, 2007. http://www.repositorio.uchile.cl/handle/2250/102877.

Abstract:
Doctor of Science, Computer Science
The Web is a vast information space where many resources such as documents, images, and other multimedia content can be accessed. In this context, several information technologies have been developed to help users satisfy their search needs on the Web, the most widely used of which are search engines. Search engines allow users to find resources by formulating queries and reviewing a list of answers. One of the main challenges for the Web community is to design search engines that allow users to find resources semantically connected with their queries. The large size of the Web and the vagueness of the terms most commonly used in query formulation are a major obstacle to achieving this goal. In this thesis we propose to explore the user selections recorded in search engine logs in order to learn how users search and to design algorithms that improve the precision of the answers recommended to users. We will begin by exploring the properties of these data. This exploration will allow us to determine the sparse nature of these data. We will also present models that help us understand how users search in search engines. Next, we will explore user selections to find useful associations between queries recorded in the logs. We will concentrate our efforts on designing techniques that allow users to find better queries than the original query. As an application, we will design query reformulation methods that help users find more useful terms, improving the representation of their needs. Using document terms, we will build vector representations for queries. By applying clustering techniques, we will be able to determine groups of similar queries.
Using these groups of queries, we will introduce methods for query and document recommendation that allow us to improve the precision of the recommendations. Finally, we will design query classification techniques that allow us to find concepts semantically related to the original query. To achieve this, we will classify user queries into Web directories. As an application, we will introduce methods for the automatic maintenance of the directories.
13

Ba-Omer, Hafidh Taher. "A framework for educational web usage mining." Thesis, University of Manchester, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.492063.

14

Salin, Suleyman. "Web Usage Mining And Recommendation With Semantic Information." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/12610483/index.pdf.

Abstract:
Web usage mining has become popular in various business areas related to Web site development. In Web usage mining, the commonly visited navigational paths are extracted in terms of Web page addresses from the Web server visit logs, and the patterns are used in various applications. The semantic information of the Web page contents is generally not included in Web usage mining. In this thesis, a framework for integrating semantic information with Web usage mining is implemented. The frequent navigational patterns are extracted in the form of ontology instances instead of Web page addresses, and the result is used for making page recommendations to the visitor. Moreover, an evaluation mechanism is implemented to measure the success of the recommendations. Test results showed that stronger and more accurate recommendations are obtained by including semantic information in Web usage mining instead of using only visited Web page addresses.
15

Pabarškaitė, Židrina. "Enhancements of pre-processing, analysis and presentation techniques in web log mining." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2009. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2009~D_20090713_142203-05841.

Abstract:
As the Internet is becoming an important part of our lives, more attention is paid to the quality of information and how it is displayed to the user. The research area of this work is web data analysis and methods for processing this data. This knowledge can be extracted by gathering web server data, namely log files, in which users' browsing patterns are recorded. The research object of the dissertation is the web log data mining process. General topics related to this object are web log data preparation methods, data mining algorithms for prediction and classification tasks, and web text mining. The key aim of the thesis is to develop methods that improve the knowledge discovery steps in mining web log data and that reveal new opportunities to the data analyst. While performing web log analysis, it was discovered that insufficient attention has been paid to the web log data cleaning process. By reducing the number of redundant records, the data mining process becomes much more effective and faster. Therefore a new, original cleaning framework was introduced which leaves only records that correspond to real user clicks. People tend to understand technical information better if it resembles human language. It is therefore advantageous to use decision trees for mining web log data, as they generate web usage patterns in the form of rules that are understandable to humans. However, it was discovered that the length of users' browsing histories differs, therefore specific data... [to full text]
As the Internet penetrates our lives, increasing attention is paid to the quality of information presentation and to how information is presented. The research area of the dissertation is the mining of data collected by web servers and ways of improving how data is presented to the end user. The knowledge required for this is extracted from web server log records, which capture information about the web pages sent to users. The object of the research is web log mining, and the related topics are: improvement of the web data preparation stages, web text analysis, and data analysis algorithms for prediction and classification tasks. The main aim of the dissertation is to understand the behaviour patterns of website users by studying web logs and to improve the methodologies of the preparation, analysis, and result-interpretation stages. The research revealed new possibilities for web data analysis. It was found that insufficient attention had been paid to the cleaning of web log data. It is shown that reducing the number of irrelevant records makes the data analysis process more efficient. Therefore a new method was developed which, when applied, makes the presented knowledge correspond to users' actual paths. The study established that users' browsing histories differ in length; therefore, after specific data preparation (forming fixed-length vectors), it is appropriate to apply decision tree algorithms, which have not been used in practice until now... [see the full text for more]
16

Khalil, Faten. "Combining web data mining techniques for web page access prediction." University of Southern Queensland, Faculty of Sciences, 2008. http://eprints.usq.edu.au/archive/00004341/.

Abstract:
Web page access prediction has gained importance from the ever-increasing number of e-commerce Web information systems and e-businesses. Web page prediction, which involves personalising Web users' browsing experiences, helps Web masters improve the structure of a Web site and helps Web users navigate the site and access the information they need. The most widely used approach for this purpose is the pattern discovery process of Web usage mining, which entails many techniques such as Markov models, association rules and clustering. Implementing pattern discovery techniques in this way helps predict the next page to be accessed by the Web user based on the user's previous browsing patterns. However, each of these techniques has its own limitations, especially when it comes to accuracy and space complexity. This dissertation achieves better accuracy, lower state space complexity and fewer generated rules by performing the following combinations. First, we combine a low-order Markov model and association rules. Markov model analysis is performed on the data sets. If the Markov model prediction results in a tie or no state, association rules are used for prediction. The outcome of this integration is better accuracy, lower Markov model state space complexity and fewer generated rules than using each of the methods individually. Second, we integrate a low-order Markov model and clustering. The data sets are clustered and Markov model analysis is performed on each cluster instead of on the whole data sets. The outcome of the integration is better accuracy than the first combination, with lower state space complexity than a higher-order Markov model. The last integration model combines all three techniques: clustering, association rules and a low-order Markov model. The data sets are clustered and Markov model analysis is performed on each cluster. If the Markov model prediction results in close accuracies for the same item, association rules are used for prediction. This integration model achieves better Web page access prediction accuracy, lower Markov model state space complexity and fewer generated rules than the previous two models.
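The first combination described in this abstract (Markov model first, association rules on a tie or a missing state) can be sketched roughly as follows. This is a minimal illustration and not the dissertation's implementation: the function names, the toy sessions, and the simplified pairwise "rules" (counts of pages that appear later in the same session) are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_markov(sessions):
    """First-order Markov counts: page -> Counter of the pages that follow it."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            transitions[current][nxt] += 1
    return transitions


def train_rules(sessions):
    """Simplified pairwise rules (an assumption): page -> Counter of pages
    that appear anywhere later in the same session."""
    rules = defaultdict(Counter)
    for session in sessions:
        for i, page in enumerate(session):
            for later in set(session[i + 1:]):
                rules[page][later] += 1
    return rules


def predict_next(history, transitions, rules):
    """Use the Markov model; fall back to the rules on a tie or unseen state."""
    candidates = transitions.get(history[-1])
    tied = None
    if candidates:
        best = max(candidates.values())
        tied = [page for page, count in candidates.items() if count == best]
        if len(tied) == 1:
            return tied[0]  # clear Markov winner
    # Tie or no state: score candidates with rules from the whole history.
    scores = Counter()
    for page in history:
        for later, count in rules.get(page, {}).items():
            if tied is None or later in tied:
                scores[later] += count
    return scores.most_common(1)[0][0] if scores else None


# Hypothetical toy sessions, for illustration only.
sessions = [
    ["home", "products", "cart"],
    ["blog", "products", "reviews"],
    ["home", "about", "cart"],
]
transitions = train_markov(sessions)
rules = train_rules(sessions)
print(predict_next(["home", "products"], transitions, rules))
```

In the toy data, "products" is followed once by "cart" and once by "reviews", so the Markov model ties and the rule counts from the earlier "home" visit decide the prediction.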
17

Lou, Wenwu. "Characterizing Web linking and usage with hierarchical models /." View abstract or full-text, 2005. http://library.ust.hk/cgi/db/thesis.pl?COMP%202005%20LOU.

18

Karlsson, Sophie. "Datainsamling med Web Usage Mining : Lagringsstrategier för loggning av serverdata." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-9467.

Abstract:
The complexity of web applications and the number of advanced services are increasing. Logging activities can improve the understanding of users' behaviour and needs, but logging is often used excessively and without capturing relevant information. More advanced systems bring higher performance requirements, and logging becomes even more demanding for the systems. There is a need for smarter systems and for the development of techniques for performance improvement and data collection. This work investigates how response times are affected when logging server data, following the data collection phase of web usage mining, depending on the storage strategy. The hypothesis is that logging may degrade response times even further. An experiment was conducted in which four different storage strategies were used to store server data with different table and database structures, to see which strategy affects response times the least. The experiment shows a statistically significant difference between the storage strategies according to ANOVA. Storage strategy 4 shows the best effect on average response time, whereas storage strategy 2 shows the most negative effect. Future work would be of interest to strengthen the results.
19

Bayir, Murat Ali. "A New Reactive Method For Processing Web Usage Data." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/12607323/index.pdf.

Abstract:
In this thesis, a new reactive session reconstruction method, 'Smart-SRA', is introduced. Web usage mining is a type of web mining that exploits data mining techniques to discover valuable information from the navigation of Web users. As in classical data mining, data processing and pattern discovery are the main issues in web usage mining. The first phase of web usage mining is the data processing phase, which includes session reconstruction. Session reconstruction is the most important task of web usage mining, since it directly and significantly affects the quality of the frequent patterns extracted at the final step. Session reconstruction methods can be classified into two categories, 'reactive' and 'proactive', with respect to the data source and the data processing time. If the user requests are processed after the server handles them, the technique is called 'reactive', while in 'proactive' strategies this processing occurs during the interactive browsing of the web site. Smart-SRA is a reactive session reconstruction technique that uses web log data and the site topology. In order to compare Smart-SRA with previous reactive methods, a web agent simulator has been developed. Our agent simulator models the behavior of web users and generates web user navigations as well as the log data kept by the web server. In this way, the actual user sessions are known and the success of different techniques can be compared. In this thesis, it is shown that the sessions generated by Smart-SRA are more accurate than the sessions constructed by previous heuristics.
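For orientation, a reactive heuristic of the kind Smart-SRA is compared against can be sketched as a simple time-oriented sessionizer. This is not Smart-SRA itself, which also uses the site topology. The 30-minute gap is a commonly used threshold assumed here, and the function name and the `(user, timestamp, url)` record layout are hypothetical.

```python
from collections import defaultdict


def reconstruct_sessions(requests, max_gap_seconds=1800):
    """Group (user, timestamp, url) log records into per-user sessions.

    A new session starts whenever the gap between two consecutive
    requests of the same user exceeds max_gap_seconds.
    """
    by_user = defaultdict(list)
    for user, timestamp, url in requests:
        by_user[user].append((timestamp, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by time
        current = [hits[0][1]]
        for (prev_time, _), (time, url) in zip(hits, hits[1:]):
            if time - prev_time > max_gap_seconds:
                sessions.append((user, current))  # close the old session
                current = []
            current.append(url)
        sessions.append((user, current))
    return sessions


# Hypothetical log records: (client address, seconds since start, URL).
log = [
    ("10.0.0.1", 0, "/"),
    ("10.0.0.1", 60, "/a"),
    ("10.0.0.2", 30, "/"),
    ("10.0.0.1", 4000, "/b"),
]
for user, pages in reconstruct_sessions(log):
    print(user, pages)
```

Because the heuristic works only on the server log after the requests were handled, it is reactive in the sense used in the abstract.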
20

Wu, Hao-cun, and 吳浩存. "A multidimensional data model for monitoring web usage and optimizing website topology." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B29528215.

21

Yilmaz, Hakan. "Using Ontology Based Web Usage Mining And Object Clustering For Recommendation." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12611902/index.pdf.

Abstract:
Many e-commerce web sites, such as online book retailers, and specialized information hubs, such as online movie databases, make use of recommendation systems in which users are directed to items of interest based on past user interactions. Keyword-based approaches and collaborative and content filtering techniques have been tried and used over the years, each having its own shortcomings. While keyword-based approaches are naive and do not take content or context into account, collaborative and content filtering techniques suffer from biased ratings, first item and first-rater problems. Recent approaches try to incorporate underlying semantic properties of data by employing ontology-based usage mining. This thesis aims to design a recommendation system based on ontological data where web pages are seen as objects with attributes and relations. Instead of relying on users' content ratings, user sessions are clustered on a semantic level to capture different behavioral groups. Since semantic information is used for the clustering distance function, each cluster represents a behavior group instead of simpler data groups. New users are then assigned to the individual clusters that best represent their behavior, and recommendations are generated accordingly. In this thesis we use the recommendation results as a means of measuring the effectiveness of the clusters we have generated. We have compared the results obtained using the ontological data with the results obtained without using it, and shown that integrating semantic knowledge increases both precision and recall.
22

Palmer, Bart C. "Web Usage Mining: Application To An Online Educational Digital Library Service." DigitalCommons@USU, 2012. https://digitalcommons.usu.edu/etd/1215.

Abstract:
This dissertation was situated in the crossroads of educational data mining (EDM), educational digital libraries (such as the National Science Digital Library; http://nsdl.org), and examination of teacher behaviors while creating online learning resources in an end-user authoring system, the Instructional Architect (IA; http://ia.usu.edu). The knowledge from data/database (KDD) framework for preparing data and finding patterns in large amounts of data served as the process framework in which a latent class analysis (LCA) was applied to IA user data. Details of preprocessing challenges for web usage data are included. A meaningful IA activity framework provided four general areas of user behavior features that assisted in the interpretation of the LCA results: registration and usage, resource collection, project authoring, and project usage. Four clusters were produced on two samples (users with 5–90 logins and those with 10–90 logins) from 22 months of data collection. The analyses produced nearly identical models with both samples. The clusters were named according to their usage behaviors: one-hit wonders who came, did, and left and we are left to wonder where they went; focused functionaries who appeared to produce some content, but in only small numbers and they did not share many of their projects; popular producers who produced small but very public projects that received a lot of visitors; and prolific producers who were very verbose, created many projects, and published a lot to their students with many hits, but they did not publish much for the public. Information about EDM within the context of digital libraries is discussed and implications for the IA, its professional development workshop, and the larger context of educational digital libraries are presented.
23

Dvořák, Jan Bc. "Web Usage Mining - popis, metody a nástroje, možné aplikace, konkrétní řešení." Master's thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-1979.

Abstract:
A general description of Web mining. Characteristics and use of Web usage mining techniques. A detailed description of the methods and tools covered by the term "Web Usage Mining". Software tools and existing solutions for Web usage mining. A practical design of a concrete solution using the described Web usage mining methods: an analysis of the web server log files of the Faculty of Management of the University of Economics, Prague.
24

Özakar, Belgin Püskülcü Halis. "Finding And Evaluating Patterns In Wes Repository Using Database Technology And Data Mining Algorithms/." [s.l.]: [s.n.], 2002. http://library.iyte.edu.tr/tezler/master/bilgisayaryazilimi/T000130.pdf.

25

Nagi, Mohamad. "Integrating Network Analysis and Data Mining Techniques into Effective Framework for Web Mining and Recommendation. A Framework for Web Mining and Recommendation." Thesis, University of Bradford, 2015. http://hdl.handle.net/10454/14200.

Abstract:
The main motivation for the study described in this dissertation is to benefit from the development in technology and the huge amount of available data which can be easily captured, stored and maintained electronically. We concentrate on Web usage (i.e., log) mining and Web structure mining. Analysing Web log data reveals valuable feedback on how effective the current structure of a web site is and helps the owner of the web site understand the behaviour of its visitors. We developed a framework that integrates statistical analysis, frequent pattern mining, clustering, classification, and network construction and analysis. We concentrated on the statistical data related to the visitors and how they surf and pass through the various pages of a given web site to land at some target pages. Further, the frequent pattern mining technique was used to study the relationship between the various pages constituting a given web site. Clustering is used to study the similarity of users and pages. Classification suggests a target class for a given new entity by comparing the characteristics of the new entity to those of the known classes. Network construction and analysis is also employed to identify and investigate the links between the various pages constituting a Web site by constructing a network based on the frequency of access to the Web pages, such that pages are linked in the network if the frequent pattern mining process identifies them as frequently accessed together. The knowledge discovered by analysing a web site and its related data should be considered valuable for online shoppers and commercial web site owners. Benefitting from the outcome of the study, a recommendation system was developed to suggest pages to visitors based on their profiles as compared to similar profiles of other visitors.
The conducted experiments using popular datasets demonstrate the applicability and effectiveness of the proposed framework for Web mining and recommendation. As a by-product of the proposed method, we demonstrate how it is effective in another domain for feature reduction by concentrating on gene expression data analysis as an application, with some interesting results reported in Chapter 5.
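The network construction step described in this abstract, where pages are linked if they are frequently accessed together, can be illustrated with a small sketch. It is a simplification that counts only page pairs rather than running a full frequent pattern mining algorithm. The function name, the `min_support` threshold and the toy sessions are assumptions made for the example and do not come from the thesis.

```python
from collections import Counter
from itertools import combinations


def build_page_network(sessions, min_support=2):
    """Link two pages if they occur together in at least min_support sessions.

    Returns an adjacency mapping: page -> {neighbour: co-access count}.
    """
    pair_counts = Counter()
    for session in sessions:
        # Count each unordered page pair once per session.
        for a, b in combinations(sorted(set(session)), 2):
            pair_counts[(a, b)] += 1

    network = {}
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            network.setdefault(a, {})[b] = count
            network.setdefault(b, {})[a] = count
    return network


# Hypothetical sessions, for illustration only.
sessions = [
    ["/home", "/shop", "/cart"],
    ["/home", "/shop"],
    ["/home", "/blog"],
]
print(build_page_network(sessions))
```

With these toy sessions only "/home" and "/shop" co-occur in two sessions, so they form the single edge of the network.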
26

Wang, Hui. "Mining novel Web user behavior models for access prediction /." View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?COMP%202003%20WANG.

Abstract:
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2003.
Includes bibliographical references (leaves 83-91). Also available in electronic version. Access restricted to campus users.
27

Kong, Wei. "EXPLORING HEALTH WEBSITE USERS BY WEB MINING." Thesis, Universal Access in Human-Computer Interaction. Applications and Services Lecture Notes in Computer Science, 2011, Volume 6768/2011, 376-383, DOI: 10.1007/978-3-642-21657-2_40, 2011. http://hdl.handle.net/1805/2810.

Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
With the continuous growth of health information on the Internet, providing user-oriented health services online has become a great challenge for health providers. Understanding the information needs of users is the first step towards providing tailored health services. The purpose of this study is to examine the navigation behavior of different user groups by extracting their search terms, and to make suggestions for restructuring a website to offer a more customized Web service. This study analyzed five months of daily access weblog files from one local health provider's website, discovered the most popular general topics and health-related topics, and compared the information search strategies of the patient/consumer and doctor groups. Our findings show that users are not searching for health information as much as was thought. The top two health topics that patients are concerned about are children's health and occupational health. Another topic that both user groups are interested in is medical records. Patients and doctors also have different search strategies when looking for information on this website. Patients return to the previous page more often, while doctors usually go to the final page directly and then leave the page without coming back. As a result, some suggestions to redesign and improve the website are discussed; a more intuitive portal and more customized links for both user groups are suggested.
28

Zhao, Hongkun. "Automatic wrapper generation for the extraction of search result records from search engines." Diss., Online access via UMI:, 2007.

29

Soztutar, Enis. "Mining Frequent Semantic Event Patterns." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/12611007/index.pdf.

Abstract:
Especially with the wide use of dynamic page generation and richer user interaction on the Web, traditional web usage mining methods, which are based on the pageview concept, are of limited usability. To overcome the difficulty of capturing usage behaviour, we define the concept of semantic events. Conceptually, events are higher-level actions of a user in a web site that are technically independent of pageviews. Events are modelled as objects in the domain of the web site, with associated properties. A sample event from a video web site is the 'play video event', with properties 'video', 'length of video', 'name of video', etc. When the event objects belong to the domain model of the web site's ontology, they are referred to as semantic events. In this work, we propose a new algorithm and an associated framework for mining patterns of semantic events from the usage logs. We present a method for tracking and logging domain-level events of a web site, adding semantic information to events, an ordering of events with respect to the genericity of the event, and an algorithm for computing sequences of frequent events.
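To make the event model concrete, the following sketch represents events as objects with properties and counts frequent contiguous event sequences across sessions. It is an illustration under stated assumptions, not the algorithm proposed in the thesis: it ignores the ontology and the genericity ordering of events, and the `Event` class, the function name and the sample sessions are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A domain-level user action, independent of pageviews."""
    name: str
    properties: tuple = ()  # (key, value) pairs, kept as a tuple so events are hashable


def frequent_event_sequences(sessions, length=2, min_support=2):
    """Return contiguous event-name sequences of the given length whose
    support (number of sessions containing them) is at least min_support."""
    support = Counter()
    for session in sessions:
        names = [event.name for event in session]
        # Use a set so each sequence counts once per session.
        seen = {
            tuple(names[i:i + length])
            for i in range(len(names) - length + 1)
        }
        support.update(seen)
    return {seq: n for seq, n in support.items() if n >= min_support}


# Hypothetical sessions from a video web site.
sessions = [
    [Event("search"),
     Event("play video", (("length of video", 120), ("name of video", "intro"))),
     Event("rate video")],
    [Event("search"),
     Event("play video", (("length of video", 45), ("name of video", "demo")))],
    [Event("play video"), Event("rate video")],
]
print(frequent_event_sequences(sessions))
```

Mining over event names rather than URLs is what makes the patterns independent of how individual pages are generated.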
30

Mužík, Zbyněk. "Web Analytics." Master's thesis, Vysoká škola ekonomická v Praze, 2006. http://www.nusl.cz/ntk/nusl-295.

Abstract:
The thesis deals with measuring indicators related to the operation of websites and web applications, and with the technological means that serve this purpose: Web Analytics (WA). The main goal of the thesis is to test and compare selected representatives of these tools against objective criteria, and also to critically assess the capabilities of WA tools in general. The first part describes the various ways of measuring traffic on the WWW and defines the related metrics. It also provides an overview of available WA tools. An evaluation model for WA tools is then created, and six representative tools are rated against it. The evaluation takes the form of user testing on data from two real websites. The owners of these two websites are given a recommendation for choosing a suitable WA tool based on their preferences. A further output of the thesis is the set of reports generated by the tested tools, describing activity on the websites under study.
31

Jiang, Hao, and 江浩. "Personalized web search re-ranking and content recommendation." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2013. http://hdl.handle.net/10722/197548.

Abstract:
In this thesis, I propose a method for establishing a personalized recommendation system for re-ranking web search results and recommending web content. The method is based on personal reading interest, which can be reflected by the user's dwell time on each document or webpage. I acquire document-level dwell times via a customized web browser or a mobile device. To obtain better precision, I also explore the possibility of tracking gaze position and facial expression, from which I can determine the attractiveness of different parts of a document. Inspired by the idea of the Google Knowledge Graph, I also establish a graph-based ontology to maintain a user profile that describes the user's personal reading interest. Each node in the graph is a concept, which represents the user's potential interest in this concept. I also use the dwell time to measure concept-level interest, which can be inferred from document-level user dwell times. The graph is generated based on Wikipedia. According to the estimated concept-level user interest, my algorithm can estimate a user's potential dwell time over a new document, based on which personalized webpage re-ranking can be carried out. I compare the rankings produced by my algorithm with rankings generated by popular commercial search engines and a recently proposed personalized ranking algorithm. The results clearly show the superiority of my method. I also use my personalized recommendation framework in other applications. A good example is personalized document summarization. The same knowledge graph is employed to estimate the weight of every word in a document; combining this with a traditional document summarization algorithm focused on text mining, I can generate a personalized summary that emphasizes the user's interest in the document.
To deal with images and videos, I present a new image search and ranking algorithm for retrieving unannotated images by collaboratively mining online search results, which consist of online image and text search results. The online image search results are leveraged as reference examples to perform content-based image search over unannotated images. The online text search results are used to estimate individual reference images' relevance to the search query, as not all the online image search results are closely related to the query. Overall, the key contribution of my method lies in its ability to deal with unreliable online image search results through jointly mining visual and textual aspects of online search results. Through such collaborative mining, my algorithm infers the relevance of an online search result image to a text query. Once I estimate a query relevance score for each online image search result, I can selectively use query-specific online search result images as reference examples for retrieving and ranking unannotated images. To explore the performance of my algorithm, I tested it both on a standard public image dataset and on several modestly sized personal photo collections. I also compared the performance of my method with that of two peer methods. The results are very positive, indicating that my algorithm is superior to existing content-based image search algorithms for retrieving and ranking unannotated images. Overall, the main advantage of my algorithm comes from its collaborative mining over online search results in both the visual and the textual domains.
Computer Science
Doctoral
Doctor of Philosophy
32

Vollino, Bruno Winiemko. "Descoberta de perfis de uso de web services." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/83669.

Abstract:
During the life cycle of a web service, several changes are made to its interface, which may introduce incompatibilities with its clients and break their applications. Providers frequently have to make decisions about changes to their services, often without a good understanding of the effect these changes will have on their clients. Existing research and tools do not give the provider adequate knowledge about the actual usage of the features of a service interface across the different types of consumers, which makes it impossible to assess the impact of changes. This work presents a framework for the discovery of web service usage profiles, which constitute a descriptive model of the usage patterns of the different groups of clients of the service with respect to the features of its interface. The framework supports the knowledge discovery process through semi-automatic, configurable tasks for the preparation and analysis of usage data, minimising the need for user intervention. The framework covers the monitoring of web service interactions, the loading of pre-processed usage data into a unified database, and the generation of usage profiles. Data mining techniques are used to group clients according to their feature usage patterns, and these groups are used to build service usage profiles. The entire process is configured through parameters, allowing the user to determine the level of detail of the usage information included in the profiles and the criteria for evaluating the similarity between clients. The proposal is validated through experiments with synthetic data, simulated according to characteristics expected in the behaviour of clients of a real service. The experimental results demonstrate that the proposed framework allows the discovery of useful service usage profiles, and provide evidence about the appropriate parameterisation of the framework.
33

Khasawneh, Natheer Yousef. "Toward Better Website Usage: Leveraging Data Mining Techniques and Rough Set Learning to Construct Better-to-use Websites." Akron, OH : University of Akron, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=akron1120534472.

Abstract:
Dissertation (Ph. D.)--University of Akron, Dept. of Electrical and Computer Engineering, 2005.
"August, 2005." Title from electronic dissertation title page (viewed 01/14/2006) Advisor, John Durkin; Committee members, John Welch, James Grover, Yueh-Jaw Lin, Yingcai Xiao, Chien-Chung Chan; Department Chair, Alex Jose De Abreu-Garcia; Dean of the College, George Haritos; Dean of the Graduate School, George R. Newkome. Includes bibliographical references.
34

Villar, Escobar Osvaldo Pablo. "Minería y Personalización de un Sitio Web para Celulares." Tesis, Universidad de Chile, 2007. http://www.repositorio.uchile.cl/handle/2250/104823.

35

Agarwal, Khushbu. "A partition based approach to approximate tree mining a memory hierarchy perspective /." Columbus, Ohio : Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1196284256.

36

Kliegr, Tomáš. "Clickstream Analysis." Master's thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-2065.

Abstract:
The thesis introduces current research trends in clickstream analysis and proposes a new heuristic that could be used for dimensionality reduction of semantically enriched data in Web Usage Mining (WUM). Click fraud and conversion fraud are identified as key prospective application areas for WUM. The thesis documents a conversion fraud vulnerability of Google Analytics and proposes a defense: new clickstream acquisition software that collects data in sufficient granularity and structure to allow for data mining approaches to fraud detection. Three variants of the K-means clustering algorithm and three association rule data mining systems are evaluated and compared on real-world web usage data.
37

Nenadić, Oleg. "An implementation of correspondence analysis in R and its application in the analysis of web usage /." Göttingen : Cuvillier, 2007. http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&doc_number=016229974&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA.

38

Khrouf, Laamouri Lilia. "Vers une meilleure compréhension des réactions des internautes à l'atmosphère des sites web marchands : rôle de l'imagerie mentale." Nantes, 2012. http://www.theses.fr/2012NANT4022.

Abstract:
The purpose of this research was to improve the understanding of web surfers' reactions to the atmosphere of commercial websites by taking into account the role of mental imagery. Drawing on a review of the literature from several research areas (Internet, advertising persuasion, psychology, etc.) and on two qualitative studies, research hypotheses were formulated and a conceptual model was constructed. An experiment conducted with a sample of 400 web surfers validated the proposed conceptual model. The results showed that the mental imagery conveyed by commercial websites mediates the impact of the website's atmosphere on web surfers' reactions. Specifically, websites that are predominantly visual and perceived as interactive enhance mental imagery. Compared with predominantly red websites, predominantly blue websites lead to mental images that are more vivid and positive but less numerous and less self-related. However, the links between the website's dominant colour and mental imagery are reversed when the web surfer is highly involved with the product sold. Finally, the vividness/clarity and valence dimensions of mental imagery positively affect web surfers' affective, attitudinal and conative reactions, whereas the quantity/ease of mental images and their self-relatedness enhance only some of these reactions. These results led to managerial recommendations and suggestions for future research.
39

Jadhav, Ashutosh. "Knowledge Driven Search Intent Mining." Wright State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1464464707.

40

Falchi, Cecilia. "Monitoring di un portale web: modello e implementazione." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2014. http://amslaurea.unibo.it/7993/.

Abstract:
Every day, enormous quantities of data are produced as detailed records of Web usage behaviour, yet extracting knowledge from them remains a challenge. This thesis describes EOP (Eye-On-Portal), a monitoring framework intended as a tool for capturing detailed information about the components of the page visited by the user and about the user's interactions with the portal. The collected data could be useful for optimising the portal's layout and usability.
41

Charrad, Malika. "Une approche générique pour l'analyse croisant contenu et usage des sites Web par des méthodes de bipartitionnement." PhD thesis, Conservatoire national des arts et metiers - CNAM, 2010. http://tel.archives-ouvertes.fr/tel-00516367.

Abstract:
In this thesis, we propose a new approach, WCUM (Web Content and Usage Mining based approach), that links content analysis with usage analysis of a website in order to better understand the general behaviour of the site's visitors. This work relies on the CROKI2 co-clustering algorithm, implemented with two different optimisation strategies that we compare through experiments on artificially generated data. To address the problem of determining the number of clusters on the rows and on the columns, we propose generalising certain indices, originally proposed for evaluating partitions obtained by one-way clustering algorithms, to simultaneous (two-way) clustering algorithms. To evaluate the performance of these indices, we propose an algorithm that generates artificial biclusters for running simulations and validating the results. Experiments on artificial data and an application to real data were carried out to evaluate the effectiveness of the proposed approach.
42

Rigo, Sandro Jose. "Integração de recursos da web semântica e mineração de uso para personalização de sites." Biblioteca Digital de Teses e Dissertações da UFRGS, 2008. http://hdl.handle.net/10183/15324.

Abstract:
One reason for the growing development of the data mining field is the increase in the quantity of documents, structured or not, generated and stored in digital format. The Web contributes greatly to this context and, accordingly, specific techniques have emerged for this area, such as structure, content, and usage mining. This growing supply of information on the Web creates the problem of cognitive overload. Adaptive Hypermedia mitigates this problem by adapting hyperdocuments and hypermedia to users according to their needs, preferences, and objectives. Briefly put, this adaptation is carried out by relating information about the application domain to information about the user profile. One important research topic in Adaptive Hypermedia systems is the generation and maintenance of user profiles. Known approaches span a continuum of options, ranging from manually completed registration forms and interviews to the automatic acquisition of information by tracking Web usage. Another fundamental research point concerns the construction of applications, where Semantic Web resources such as domain ontologies and semantic content annotations can be observed in the development of Adaptive Hypermedia systems. The main reasons for this can be associated with the inherent flexibility, shareability, and extensibility of these resources. This work describes an architecture for the automatic acquisition of user class profiles from Web usage mining and the application of domain ontologies. The main objective is to integrate semantic information, obtained from a domain ontology describing the website in question, with usage tracking information obtained by processing user session data. In this way it is possible to identify more precisely the interests and needs of a typical user. The work includes the implementation of an Adaptive Hypermedia application based on concepts of semantic application modelling and using Web services resources, for experimental validation of the proposal.
43

Lee, Jong Gun. "User behavior modeling of content generation and consumption in online social networks." Paris 6, 2011. http://www.theses.fr/2011PA066032.

Abstract:
Online social networks are computer systems that react to a load generated by their users. Understanding how users behave in these networks is of great importance for evaluating the generated load. Nevertheless, the spectrum of behavioural choices open to users in a social network is very wide, and few studies have addressed it. In this thesis, I studied the behaviour of social network users from three angles: (1) the consumption of online digital content, (2) preference choices, and (3) the production of online digital content. In the first part of this thesis, I analyse the consumption of online digital content, and in particular its popularity. I modelled the popularity of a content item with a Cox proportional hazards regression model, which makes it possible to predict that popularity. In the second part, I studied the preference choices of social network users: in other words, how users choose to favour a content item and how other users in turn favour their content. In the third part of this thesis, I studied how users produce their digital content; more specifically, I modelled this behaviour using stretched exponential distributions.
44

Judge, John Thomas. "A new model for the marginal distribution of HTTP request rate." School of Electrical, Computer and Telecommunications Engineering - Faculty of Informatics, 2004. http://ro.uow.edu.au/theses/265.

Abstract:
This thesis proposes a new model for the marginal distribution of HTTP request rate. The model applies to aggregate network traffic generated by a population of users accessing the Web on the Internet. The new model is relatively simple and allows for both the accurate estimation of peak HTTP request rate and the development of two new rules of thumb concerning HTTP request rate. Previous models of HTTP request rate have generally been single-user models of a form that is complex both to transform into a model of aggregate traffic and to apply to the estimation of peak aggregate HTTP request rate. One comparable model of aggregate HTTP traffic models HTTP request inter-arrival time rather than HTTP request rate and is shown to overestimate peak HTTP request rate. There are few existing rules of thumb concerning HTTP request rate. The two rules proposed here are the first for the estimation of either standard deviation or peak HTTP request rate at the time scale of one second. The new model for the marginal distribution of aggregate per-second HTTP request rate is based on the Pólya-Aeppli probability distribution. The selection of the Pólya-Aeppli distribution can be justified from observed distributions of HTTP request rate of individual Web users and the number of active users per second in a population of Web users. The results are based on the analysis of five independent traces of Web traffic. One trace, collected by the candidate, is of per-user Web traffic generated in a postgraduate research laboratory at the University of Wollongong (UOW) between 1994 and 1997. The other four traces are large independent traces of aggregate Web traffic collected between 1996 and 2002.
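As a minimal sketch of the distribution named in this abstract, the Pólya-Aeppli (geometric Poisson) distribution can be written as a Poisson number of terms, each geometrically distributed on 1, 2, .... Reading the Poisson count as the number of active users in a second and each geometric term as one user's requests is an assumption made here for illustration; the function names and the parameters `lam` and `theta` are hypothetical and are not taken from the thesis.

```python
import math


def polya_aeppli_pmf(n, lam, theta):
    """P(X = n) for X = Y_1 + ... + Y_N, where N ~ Poisson(lam), lam > 0,
    and each Y_i is geometric on {1, 2, ...} with
    P(Y = y) = (1 - theta) * theta ** (y - 1), 0 <= theta < 1."""
    if n < 0:
        return 0.0
    if n == 0:
        # No terms at all: the Poisson count is zero.
        return math.exp(-lam)
    total = 0.0
    for k in range(1, n + 1):
        # Poisson probability of exactly k terms (computed in log space).
        poisson_k = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        # Probability that k geometric terms sum to exactly n.
        neg_bin = math.comb(n - 1, k - 1) * theta ** (n - k) * (1 - theta) ** k
        total += poisson_k * neg_bin
    return total


def polya_aeppli_mean(lam, theta):
    """Mean of the compound distribution: lam * E[Y]."""
    return lam / (1 - theta)


def polya_aeppli_variance(lam, theta):
    """Variance of the compound distribution: lam * E[Y^2]."""
    return lam * (1 + theta) / (1 - theta) ** 2


if __name__ == "__main__":
    lam, theta = 3.0, 0.4  # illustrative values only
    print("mean:", polya_aeppli_mean(lam, theta))
    print("std dev:", math.sqrt(polya_aeppli_variance(lam, theta)))
    print("P(X = 0):", polya_aeppli_pmf(0, lam, theta))
```

With `theta = 0` every term equals 1 and the distribution reduces to a plain Poisson; larger `theta` makes the request rate burstier for the same `lam`.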
45

Gomes, João Fernando dos Anjos. "Recomendação de navegação em portais da internet como um serviço suportado em ferramentas Web Analytics." Master's thesis, Instituto Politécnico de Setúbal. Escola Superior de Ciências Empresariais, 2016. http://hdl.handle.net/10400.26/17292.

Abstract:
Dissertation submitted in fulfilment of the requirements for the degree of Master in Organizational Information Systems
As Internet usage keeps growing, the number of websites and web pages keeps growing too, so there is a need to align the user experience with a website's overall goals. To meet this need, the proposed recommendation system suggests pages that may interest the user, based on navigation profiles of overall site usage. Most existing recommendation systems are based on association rules or on keywords (when content is considered). However, when usage data are scarce or sparse and sequential order is to be considered, such traditional approaches may be unsuitable. Meanwhile, under a different paradigm, the field of Web Analytics has grown considerably through mature tools that allow internet data to be collected and analysed in order to understand and optimize website efficiency and effectiveness. This work proposes the development of a recommendation system based on the Google Analytics tool. The prototype consists of two main components: 1) a service responsible for building recommendations and the associated logic; 2) a library, embeddable in any website, that provides a configurable recommendation widget. Preliminary evaluations found that the implementation follows the logic of the proposed model.
46

Persson, Pontus. "Identifying Early Usage Patterns That Increase User Retention Rates In A Mobile Web Browser." Thesis, Linköpings universitet, Databas och informationsteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-137793.

Abstract:
One of the major challenges for modern technology companies is user retention management. This work focuses on identifying early usage patterns that signify increased retention rates in a mobile web browser. This is done using a targeted parallel implementation of the association rule mining algorithm FP-Growth. Different item subset selection techniques, including clustering and other statistical methods, have been used in order to reduce the mining time and allow for lower support thresholds. A lot of interesting rules have been mined. The best retention-wise rule implies a retention rate of 99.5%. The majority of the rules analyzed in this work imply a retention rate increase between 150% and 200%.
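To make the rule measures in this abstract concrete, here is a small sketch that scores rules of the form "early usage pattern implies retained". It is not FP-Growth: it enumerates candidate itemsets by brute force, which is only workable for tiny examples. The function name, the data layout, and the action labels are hypothetical, and "support" here means the share of users whose early actions contain the pattern.

```python
from itertools import combinations


def retention_rules(users, min_support=0.2, max_size=2):
    """Score rules "itemset -> retained" by support, confidence and lift.

    users: list of (set_of_early_actions, retained) pairs.
    Returns (itemset, support, confidence, lift) tuples, best confidence first.
    """
    n = len(users)
    base_rate = sum(1 for _, retained in users if retained) / n
    items = sorted({action for actions, _ in users for action in actions})
    rules = []
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            pattern = set(itemset)
            # Retention flags of the users whose actions contain the pattern.
            matching = [retained for actions, retained in users if pattern <= actions]
            support = len(matching) / n
            if not matching or support < min_support:
                continue
            # Confidence is the retention rate among matching users.
            confidence = sum(1 for r in matching if r) / len(matching)
            # Lift compares that rate with the overall retention rate.
            lift = confidence / base_rate if base_rate else float("nan")
            rules.append((itemset, support, confidence, lift))
    return sorted(rules, key=lambda rule: rule[2], reverse=True)


if __name__ == "__main__":
    # Toy data with made-up action names.
    sample = [
        ({"bookmark", "sync"}, True),
        ({"bookmark"}, True),
        ({"sync"}, False),
        ({"search"}, False),
        ({"bookmark", "search"}, True),
        ({"search", "sync"}, False),
    ]
    for itemset, support, confidence, lift in retention_rules(sample):
        print(itemset, round(support, 2), round(confidence, 2), round(lift, 2))
```

A real miner such as FP-Growth avoids enumerating every candidate itemset, which is what makes the lower support thresholds mentioned in the abstract feasible on large logs.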
47

Mayer, Thomas. "Personalisierungsstrategien im E-Commerce : die Webloganalyse als Instrument der Personalisierung im Rahmen des eCRM /." Frankfurt am Main [u.a.] : Lang, 2007. http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&doc_number=015055243&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA.

48

Pabarškaitė, Židrina. "Žiniatinklio įrašų gavybos paruošimo, analizės ir rezultatų pateikimo naudotojui tobulinimas." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2009. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2009~D_20090713_142146-18729.

Abstract:
Relevance of the research problem: growing market competition drives the search for new ways of working, so a large share of business and non-profit organisations is moving onto the Internet. This covers relationships of various types: business-to-customer, business-to-business (between different business entities), and others. In addition, the number of websites of public institutions, libraries, and individuals has grown over the last decade. Offering goods, providing business services, or publishing relevant information online is very convenient because it does not depend on geographical or time-zone differences. A user located elsewhere than the business or information provider can browse the company's website and make a decision concerning that business. This virtual connection between websites and their visitors leaves traces: records, also called web log entries, which accumulate on the server hosting the site. Advancing technology has made it possible to store and analyse large volumes of data, and so, more than a decade ago, a new research area emerged: web log mining. The knowledge discovery process here is similar to that for other kinds of data (e.g. financial, medical), but certain stages of the process are different and unique. The practical benefits of analysing users' navigation paths on a website include examining the relationships between related pages and discovering the most frequently chosen page sequences as well as page sequences that are browsed in a certain... [see the full text for more]
Topicality of the problem: the Internet is becoming an important part of our lives; therefore, more attention is being paid to the quality of information on the web and how it is displayed to the user. This knowledge can be extracted from web server data, namely log files, in which all users' navigational patterns are recorded. The research area of this work is web log data analysis aimed at enhancing information presentation on the web. The steps of web log data analysis are similar to those for other kinds of data (e.g. financial, medical), but some processes are different and unique. The research objects of the dissertation are web log data cleaning methods, data mining algorithms, and web text mining. The key aim of the work is to improve the pattern discovery steps in mining web log data in order to: 1. improve the quality of the data for researchers who analyse users' behaviour; 2. improve the way information is presented and speed up its display to the end user.
49

Bousbia, Nabila. "Analyse des traces de navigation des apprenants dans un environnement de formation dans une perspective de détection automatique des styles d'apprentissage." Paris 6, 2011. http://www.theses.fr/2011PA066011.

Abstract:
Many Technology-Enhanced Learning environments (Environnements Informatiques d'Apprentissage Humain, EIAH) have relied mainly on detecting learners' individual characteristics to support monitoring and content adaptation. Identifying these characteristics is a difficult problem in distance education. Research has turned towards analysing learner behaviour. This analysis is based on interpreting information collected during the learning session, called traces. These traces provide knowledge about the activity through computed variables that we call indicators. The objective of this research is to propose indicators, as independent as possible of the design of the learning environment, that give teachers a view of their learners' behaviour and identify their learning styles (the set of behaviours and strategies involved in managing and organising information, and the way they are put into practice). To validate this approach, we proposed a trace-based system and detailed the method for computing one of the basic indicators describing learners' navigation behaviour, namely "navigation type". The proposed computation method was validated by three experiments, the last of which confirmed that learning styles can be identified automatically from navigation behaviour. We are therefore convinced of the validity of our approach, which will allow us to continue work on monitoring and adapting EIAH based on learning styles.
50

Murgue, Thierry. "Extraction de données et apprentissage automatique pour les sites web adaptatifs." PhD thesis, Ecole Nationale Supérieure des Mines de Saint-Etienne, 2006. http://tel.archives-ouvertes.fr/tel-00366586.

Abstract:
The work presented here falls within the field of knowledge discovery from data. A topical and interesting study context was chosen: adaptive websites. To build such user-adapted sites as automatically as possible, we decide to learn models of users or, more precisely, of their types of navigation on a given website. These models are learned by grammatical inference. The available data in the Web context are particularly difficult to collect cleanly. We choose to focus on server log files, removing the noise inherent in them. Grammatical inference can generalise its input data to obtain good language models. We work on similarity measures between languages to evaluate the quality of the learned models. Introducing a Euclidean measure between language models represented as automata overcomes the problems of existing metrics. Theoretical results show that this measure has the properties of a true distance. Finally, we present various experimental results on web data, which we pre-process before using them to learn user models by stochastic grammatical inference. The results obtained are noticeably better than those in the state of the art, particularly on the task of predicting the next page in a user's navigation.
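The idea of a Euclidean measure between language models can be illustrated on a much simpler object than the automata used in the thesis. The sketch below assumes each model has already been reduced to a finite table mapping navigation sequences to probabilities; that reduction, the function name, and the page labels are assumptions for illustration and do not reproduce the thesis's computation over automata.

```python
import math


def euclidean_distance(p, q):
    """Euclidean (L2) distance between two distributions over sequences.

    p, q: dicts mapping a sequence (tuple of page identifiers) to its
    probability; a sequence missing from a dict has probability 0.
    """
    support = set(p) | set(q)
    return math.sqrt(sum((p.get(w, 0.0) - q.get(w, 0.0)) ** 2 for w in support))


if __name__ == "__main__":
    # Two toy navigation models with made-up page names.
    model_a = {("home", "cart"): 0.5, ("home",): 0.5}
    model_b = {("home", "cart"): 0.5, ("home", "help"): 0.5}
    print(euclidean_distance(model_a, model_b))
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric and stays finite when one model assigns zero probability to a sequence the other can produce.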