Dissertations / Theses on the topic 'Web-logs'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 33 dissertations / theses for your research on the topic 'Web-logs.'
Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.
Rao, Rashmi Jayathirtha. "Modeling learning behaviour and cognitive bias from web logs." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1492560600002105.
Lam, Yin-wan, and 林燕雲. "Senior secondary students' use of web-logs in writing Chinese." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2006. http://hub.hku.hk/bib/B37198361.
Chiara, Ramon. "Aplicação de técnicas de data mining em logs de servidores web." Universidade de São Paulo, 2003. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-19012004-093205/.
Holmes, Ashley Joyce. "Web logs in the Post-Secondary Writing Classroom: A Study of Purposes." NCSU, 2005. http://www.lib.ncsu.edu/theses/available/etd-03222005-205901/.
Villalobos Luengo, César Alexis. "Análisis de archivos Logs semi-estructurados de ambientes Web usando tecnologías Big-Data." Tesis, Universidad de Chile, 2016. http://repositorio.uchile.cl/handle/2250/140417.
Currently, the volume of data that companies generate is much larger than what they can actually process, so a large universe of information implicit in these data is lost. This thesis project implemented Big Data technologies capable of extracting information from these large, previously unused volumes of data in the organization, so as to transform them into value for the business. The company chosen for this project is dedicated to the electronic payment of social security contributions over the internet. Its function is to be the channel through which the contributions of the country's workers are collected. Each of these contributions is reported, settled and published to the corresponding social security institutions (Mutuales, Cajas de Compensación, AFPs, etc.). To carry out its function, the organization has built, over its 15 years, a large high-performance infrastructure oriented to web services. This service architecture currently generates a large number of log files that record the events of the different applications and web portals. These log files are characterized by their large size and, at the same time, by the lack of a rigorously defined structure. As a result, the organization has not been able to process these data efficiently, since the relational database technologies it currently owns do not allow it. Consequently, this thesis project sought to design, develop, implement and validate methods capable of efficiently processing these log files in order to answer business questions that deliver value to the company. The Big Data technology used was Cloudera, which fits the requirements imposed by the organization, for example: having support in the country, being within the year's budget, etc. Likewise, Cloudera is a leader in the market for open-source Big Data solutions, which gives confidence of working with a quality tool. The methods developed with this technology are based on the MapReduce processing framework over the HDFS distributed file system. This thesis project showed that the implemented methods are able to scale horizontally as processing nodes are added to the architecture, so the organization can be confident that in the future, when the log files have a larger volume or a higher generation rate, the architecture will keep delivering the same or better processing performance; everything will depend on the number of nodes it decides to add.
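The methods described above rest on MapReduce jobs running over HDFS. As a rough, hypothetical illustration of that processing pattern only (not the thesis's actual Cloudera jobs; the sample log lines and the requests-per-path metric are invented for the example), a pure-Python sketch of the map, shuffle and reduce phases over Apache-style access-log lines could look like this:

```python
import re
from itertools import groupby

# Hypothetical sample of Apache combined-log lines; real jobs would stream
# these from HDFS instead of an in-memory list.
LOG_LINES = [
    '10.0.0.1 - - [01/Mar/2016:10:00:01 -0300] "GET /portal/login HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Mar/2016:10:00:03 -0300] "POST /ws/cotizaciones HTTP/1.1" 500 87',
    '10.0.0.1 - - [01/Mar/2016:10:00:07 -0300] "GET /portal/login HTTP/1.1" 200 512',
]

LOG_RE = re.compile(r'"(?P<method>\w+) (?P<path>\S+) \S+" (?P<status>\d{3})')

def map_phase(line):
    """Emit (request path, 1) pairs, mimicking a MapReduce mapper."""
    m = LOG_RE.search(line)
    if m:                       # semi-structured input: skip unparsable lines
        yield m.group("path"), 1

def reduce_phase(pairs):
    """Sum counts per key, mimicking the reducer after the shuffle/sort step."""
    for path, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield path, sum(count for _, count in group)

if __name__ == "__main__":
    mapped = [pair for line in LOG_LINES for pair in map_phase(line)]
    for path, hits in reduce_phase(mapped):
        print(path, hits)
```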
Vasconcelos, Leandro Guarino de. "Uma abordagem para mineração de logs para apoiar a construção de aplicações web adaptativas." Instituto Nacional de Pesquisas Espaciais (INPE), 2017. http://urlib.net/sid.inpe.br/mtc-m21b/2017/07.24.15.06.
Currently, there are more than 1 billion websites available. In this huge hyperspace, many websites provide exactly the same content or service. Therefore, when the user does not find what she is looking for easily, or faces difficulties during the interaction, she tends to search for another website. In order to fulfill the needs and preferences of today's web users, adaptive websites have been proposed. Existing adaptation approaches usually adapt the content of pages according to the user's interest. However, adapting the interface structure to meet user needs and preferences is still incipient. In this thesis, an approach called RUM (Real-time Usage Mining) is proposed to analyze the behavior of Web application users during navigation by mining client logs. In this approach, user actions are collected in the application's interface and processed synchronously. Thus, RUM is able to detect usability problems and behavioral patterns for the current application user while she is browsing the application. In order to facilitate its deployment, RUM provides a toolkit which allows the application to consume information about the user's behavior. By using this toolkit, developers are able to code adaptations that are automatically triggered in response to the data provided by the toolkit. Experiments were conducted on different websites to demonstrate the efficiency of the approach in supporting interface adaptations that improve the user experience.
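RUM's core idea is to analyze client-side interaction events synchronously, while the user is still on the page. The sketch below is only an illustration of that kind of on-the-fly analysis, assuming a made-up event format and a simple repeated-click heuristic; it is not the RUM toolkit's actual API.

```python
from collections import deque

def watch_events(events, repeat_threshold=3, window_seconds=5.0):
    """Flag elements that receive many clicks in a short window, a common
    sign of a confusing or unresponsive interface element."""
    recent = {}   # element id -> deque of click timestamps
    alerts = []
    for event in events:  # events arrive in timestamp order
        if event["type"] != "click":
            continue
        clicks = recent.setdefault(event["target"], deque())
        clicks.append(event["time"])
        while clicks and event["time"] - clicks[0] > window_seconds:
            clicks.popleft()
        if len(clicks) >= repeat_threshold:
            alerts.append((event["target"], event["time"]))
    return alerts

# Hypothetical event stream collected in the page interface.
stream = [
    {"type": "click", "target": "#submit", "time": 1.0},
    {"type": "click", "target": "#submit", "time": 1.8},
    {"type": "click", "target": "#submit", "time": 2.4},
    {"type": "scroll", "target": "window", "time": 3.0},
]
print(watch_events(stream))  # -> [('#submit', 2.4)]
```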
Tanasa, Doru. "Web usage mining : contributions to intersites logs preprocessing and sequential pattern extraction with low support." Nice, 2005. http://www.theses.fr/2005NICE4019.
Web usage mining (WUM) is a rather recent research field corresponding to the process of knowledge discovery from databases (KDD) applied to Web usage data. It comprises three main stages: the pre-processing of raw data, the discovery of patterns and the analysis (or interpretation) of results. The quantity of Web usage data to be analysed and its low quality (in particular the absence of structure) are the principal problems in WUM. When applied to these data, the classic data mining algorithms generally give disappointing results in terms of the behaviour of Web site users (e.g., obvious sequential patterns, devoid of interest). In this thesis, we make two significant contributions to the WUM process, both implemented in our toolbox, AxisLogMiner. First, we propose a complete methodology for pre-processing Web logs whose originality consists in its intersite aspect. Our methodology comprises four distinct steps: data fusion, data cleaning, data structuration and data summarization. Our second contribution aims at discovering, from a large pre-processed log file, the minority behaviours corresponding to sequential patterns with low support. For that, we propose a general methodology aiming at dividing the pre-processed log file into a series of sub-logs. Based on this methodology, we designed three approaches for extracting sequential patterns with low support (the sequential, iterative and hierarchical approaches). We implemented these approaches in concrete hybrid methods that combine clustering algorithms with sequential pattern mining.
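The first contribution is a four-step intersite preprocessing of raw logs (data fusion, cleaning, structuration, summarization). The sketch below illustrates just the first two steps on invented records; the field names, robot list and filtering rules are assumptions, not the AxisLogMiner implementation.

```python
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")
KNOWN_ROBOTS = ("googlebot", "slurp", "bingbot")  # illustrative list only

def fuse(logs_by_site):
    """Data fusion: merge the logs of several sites, tagging each record
    with its site of origin so the intersite dimension is preserved."""
    merged = []
    for site, records in logs_by_site.items():
        for rec in records:
            merged.append(dict(rec, site=site))
    return sorted(merged, key=lambda r: r["time"])

def clean(records):
    """Data cleaning: drop static resources and requests from known robots."""
    return [
        r for r in records
        if not r["url"].lower().endswith(STATIC_SUFFIXES)
        and not any(bot in r["agent"].lower() for bot in KNOWN_ROBOTS)
    ]

logs = {
    "www.site-a.org": [
        {"time": 10, "ip": "1.2.3.4", "url": "/index.html", "agent": "Mozilla/5.0"},
        {"time": 11, "ip": "1.2.3.4", "url": "/logo.png", "agent": "Mozilla/5.0"},
    ],
    "shop.site-a.org": [
        {"time": 12, "ip": "66.249.66.1", "url": "/catalog", "agent": "Googlebot/2.1"},
    ],
}
print(clean(fuse(logs)))  # only the /index.html request survives
```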
Allam, Amir Ali. "Measuring the use of online corporate annual reports through the analysis of web server logs." Thesis, University of Birmingham, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.633067.
Tanasa, Doru. "Fouille de données d'usage du Web : Contributions au prétraitement de logs Web Intersites et à l'extraction des motifs séquentiels avec un faible support." Phd thesis, Université de Nice Sophia-Antipolis, 2005. http://tel.archives-ouvertes.fr/tel-00178870.
Mantella, Dana G. ""Pro-ana" Web-log uses and gratifications towards understanding the pro-anorexia paradox." unrestricted, 2007. http://etd.gsu.edu/theses/available/etd-04182007-194043/.
Cynthia Hoffner, committee chair; Jaye Atkinson, Mary Ann Romski, committee members. Electronic text (90 p.) : digital, PDF file. Title from file title page. Description based on contents viewed Dec. 14, 2007. Includes bibliographical references (p. 67-74).
Stomeo, Carlo. "Applying Machine Learning to Cyber Security." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/17303/.
Vàllez Letrado, Mari. "Exploración de procedimientos semiautomáticos para el proceso de indexación en el entorno web." Doctoral thesis, Universitat Pompeu Fabra, 2015. http://hdl.handle.net/10803/359393.
The vast amount of information that currently exists necessitates the development of tools, methods and processes that facilitate access to it. In particular, information systems that are efficient and accurate are required. Indexing techniques have a long tradition of improving such systems. However, their application on a large scale and in the context of the Web is not always feasible because of the magnitude and diversity of the information it contains. This thesis presents two proposals to facilitate the process of indexing documents on the Internet. The first is characterized by the use of semi-automatic indexing techniques based on aspects of SEO, applied through a proprietary tool called DigiDoc MetaEdit. The second proposes a model for updating controlled vocabularies by processing the logs of searches made by users on search engines.
Belaud, Lydie. "Une approche ergonomique des sites marchands sur internet : de la perception au comportement des consommateurs." Phd thesis, Université de Bourgogne, 2011. http://tel.archives-ouvertes.fr/tel-00681182.
Lam, Yin-wan. "Senior secondary students' use of web-logs in writing Chinese: a case study = Xianggang gao zhong xue sheng zhong wen wang shang ri zhi xie zuo ge an yan jiu /." Click to view the E-thesis via HKUTO, 2006. http://sunzi.lib.hku.hk/hkuto/record/B37198361.
Pettersson, Albin, and Robin Rogne. "Webbplats för översikt av loggar." Thesis, Karlstads universitet, Fakulteten för hälsa, natur- och teknikvetenskap (from 2013), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-42974.
This is a dissertation on how we carried out a project that gives our customer Ninetech a simple overview of logs. Searching for specific information, such as logs, in large systems can be time-consuming. This was the case for the company Ninetech. The company therefore wanted a website that presents the logs in a simple format. The website's purpose was to make searching for logs easier and faster. Shortening this time is important because the logs contain valuable information that the company uses to resolve support issues. The website created is a complete product delivered to the company and is currently running in production. During the project we kept close contact with the customer and hence applied an agile method. As a result, the website has met the need to find logs quickly and easily and has become a part of Ninetech's everyday work. The main focus of the dissertation is devoted to the view that presents the logs.
Kilic, Sefa. "Clustering Frequent Navigation Patterns From Website Logs Using Ontology And Temporal Information." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12613979/index.pdf.
Nunes, José Manuel Rodrigues. "Visualização de interação em cenários de comunicação humano-computador." Doctoral thesis, Universidade de Aveiro, 2017. http://hdl.handle.net/10773/23153.
Technologically mediated info-communicational scenarios are becoming more and more pervasive in the day-to-day activity of a growing number of individuals and institutions. Specifically, internet/web technologies and services have a strong presence in institutions worldwide. Internal web sites (also known as intranets) are developed in compliance with internal communication strategies, reflecting internal information, workflow and related communication services. An emerging problem concerns the management of these constantly growing internal info-communicational platforms (intranets) and its external counterparts (extranets). Organizational communication specialists lack efficient tools to analyze (activity and behavioral patterns) and understand what is really going on inside the institutions. In fact, these instruments tend to be based on classical technical metrics, in most situations, for technical tuning and not for organizational communication and information analysis. This thesis is focused on the conception and evaluation of these diagnostic tools in order to contribute to the development of these sophisticated infrastructures and, consequently, improve the efficiency of their internal info-communicational processes. One of the issues lies in identifying user-system mismatch at the human-computer interaction level, which must be thoroughly identified, and the problems pinpointed to the design team. The system must serve the organization and adapt perfectly to its internal communication strategies, sustaining efficiently its information and workflow patterns. Efficient feedback instruments are fundamental to identify info-communicational platform problems inside an institution. The offered proposals demonstrate the ability to diagnose structural and content issues at two levels: at the level of its own info-communication services interface, and at the level of the internal structure or relational layout of information. The presented diagnostic services are based upon assumed contextual analysis, strongly supported in visual assessment methods, and manage to provide a response to the challenge issued by this thesis, through some empirical experiments.
Nassopoulos, Georges. "Deducing Basic Graph Patterns from Logs of Linked Data Providers." Thesis, Nantes, 2017. http://www.theses.fr/2017NANT4110/document.
Following the principles of Linked Data, data providers have published billions of facts as RDF data. Executing SPARQL queries over SPARQL endpoints or Triple Pattern Fragments (TPF) servers allows Linked Data to be consumed easily. However, federated SPARQL query processing and TPF query processing decompose the initial query into subqueries. Consequently, the data providers only see subqueries, and the initial query is known only to end users. Knowing the executed SPARQL queries is fundamental for data providers: to ensure usage control, to optimize the costs of query answering, to justify return on investment, to improve the user experience or to create business models from usage trends. In this thesis, we focus on analyzing execution logs of TPF servers and SPARQL endpoints to extract the Basic Graph Patterns (BGP) of executed SPARQL queries. The main challenge in extracting BGPs is the concurrent execution of SPARQL queries. We propose two algorithms: LIFT and FETA. LIFT extracts BGPs of executed queries from a single TPF server log. FETA extracts BGPs of federated queries from the logs of a set of SPARQL endpoints. For the experiments, we ran LIFT and FETA on synthetic logs and real logs. LIFT and FETA are able to extract BGPs with good precision and recall under certain conditions.
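As a very rough intuition for how BGPs might be pieced together from a log of concurrent subqueries, the toy sketch below groups logged triple patterns by client and time gap and keeps groups connected by a shared variable. The log format, gap threshold and joining rule are assumptions and are far simpler than the published LIFT and FETA algorithms.

```python
import re

# Hypothetical log: (client IP, seconds, triple pattern as text).
LOG = [
    ("10.0.0.5", 0.0, "?film <http://example.org/director> ?dir"),
    ("10.0.0.5", 0.4, "?dir <http://example.org/birthPlace> ?city"),
    ("10.0.0.9", 0.5, "?s <http://example.org/label> ?o"),
    ("10.0.0.5", 9.0, "?actor <http://example.org/name> ?n"),
]

def candidate_bgps(log, max_gap=2.0):
    """Split each client's stream at silent gaps, then keep the groups whose
    triple patterns are connected through at least one shared variable."""
    by_client = {}
    for ip, t, pattern in sorted(log, key=lambda e: (e[0], e[1])):
        groups = by_client.setdefault(ip, [])
        if groups and t - groups[-1][-1][0] <= max_gap:
            groups[-1].append((t, pattern))
        else:
            groups.append([(t, pattern)])
    bgps = []
    for groups in by_client.values():
        for group in groups:
            patterns = [p for _, p in group]
            variables = [set(re.findall(r"\?\w+", p)) for p in patterns]
            if len(patterns) == 1 or any(
                v1 & v2 for i, v1 in enumerate(variables) for v2 in variables[i + 1:]
            ):
                bgps.append(patterns)
    return bgps

for bgp in candidate_bgps(LOG):
    print(bgp)
```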
Soto Muñoz, Leonardo Humberto. "Desarrollo de modelo de negocio para un gestor de logs para aplicaciones desarrolladas en la nube (cloud)." Tesis, Universidad de Chile, 2014. http://repositorio.uchile.cl/handle/2250/131997.
This work presents the development of a business model for BeautifulLogs, a log manager for applications (mobile or web) built for the infrastructure known as the cloud (cloud computing). The development of the business model follows a methodology based on the Lean Startup and Customer Development methods, focused on validating the assumptions of a business model with real customers and reflecting the progress and state of that model in a Business Model Canvas. Each validation consists of an experiment with an expected result in case the assumption is correct. During this process for the BeautifulLogs business model, a space was found in the (already existing) market for this type of tool, a niche based on advanced needs that is not being covered by current solutions. Among these needs are storing information for longer than current solutions do (which average two weeks), more powerful search functions, and metrics and analytics inferred from the logs themselves. Based on these needs, value propositions were defined and packaged into plans with prices validated by the surveyed users themselves. The business model follows the SaaS scheme: software as a service for which the customer pays a monthly (or yearly) fee in exchange for its use. Although the variable costs of delivering the solutions offered in the plans turned out to be relatively high (for the SaaS market), there is a healthy margin between the average variable revenue per user (USD 143.33) and the projected variable cost per user (USD 68.13). On the other hand, acquisition costs, user conversion and net user growth emerge as the key variables that will determine the profitability of the business, which in a moderate scenario projects sales of almost one million dollars in the eighth quarter of the business's life. In its current form, the model requires an initial investment of USD 350,000, to be spent mainly on developing the platform, as well as on working capital. The recommendation is to raise a fraction of that investment to carry out customer validation through a minimum viable product, which would lower the risk posed by the variation of the key variables mentioned above.
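The abstract quotes an average variable revenue per user of USD 143.33 against a projected variable cost of USD 68.13. A quick check of the unit economics implied by those two figures (the acquisition-cost value below is a made-up placeholder, not a number from the thesis):

```python
arpu = 143.33            # average variable revenue per user, USD / month
variable_cost = 68.13    # projected variable cost per user, USD / month

contribution = arpu - variable_cost
margin_pct = 100 * contribution / arpu
print(f"Contribution per user: USD {contribution:.2f} ({margin_pct:.1f}% margin)")

# Hypothetical acquisition cost, only to show how payback would be derived.
cac = 300.0
print(f"Months to recover CAC: {cac / contribution:.1f}")
```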
Bednář, Martin. "Automatické testování projektu JavaScript Restrictor." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-432879.
Hsiao, Kuang-Yu, and 蕭廣佑. "Fuzzy Data Mining on Web Logs." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/74688406448388466191.
南台科技大學
資訊管理系
92
With the improvement of technology, the Internet has become an important part of everyday life. Governmental institutions and enterprises seek to advertise and market through the Web. From users' traversal records, one can analyze their preferences, better understand the demands of consumers, and improve advertising and marketing. In this study, we utilize the Maximum Forward Reference algorithm to find users' traversal patterns from web logs. At the same time, experts are asked to assess fuzzy importance weightings for the different web pages. Finally, we employ a fuzzy data mining technique that combines the Apriori algorithm with fuzzy weights to derive association rules. From the resulting association rules, one can accurately identify the information consumers need and which pages they prefer. This is important to governmental institutions and enterprises. Enterprises can find commercial opportunities and improve the design of their websites by means of this study. Governmental institutions can learn the needs of the public from the obtained association rules, promote policies more efficiently, and provide better service quality.
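The first step described above turns raw click sequences into maximal forward traversal paths: a backward move ends the current path. A minimal sketch of that extraction, with an invented click sequence, is shown below; the fuzzy-weighted Apriori stage is not reproduced here.

```python
def maximal_forward_paths(clicks):
    """Split a user's click sequence into maximal forward references:
    whenever the user revisits a page already on the current path (a
    backward reference), the path accumulated so far is emitted."""
    paths, current = [], []
    for page in clicks:
        if page in current:                  # backward reference
            if len(current) > 1:
                paths.append(list(current))
            current = current[: current.index(page) + 1]
        else:
            current.append(page)
    if len(current) > 1:
        paths.append(current)
    return paths

# Hypothetical session: A -> B -> C, back to A, then A -> D.
print(maximal_forward_paths(["A", "B", "C", "A", "D"]))
# [['A', 'B', 'C'], ['A', 'D']]
```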
Weng, Hong-yang, and 翁弘彥. "Exploring Web Logs on Internet Communications." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/68202977758871794235.
世新大學
資訊管理學研究所(含碩專班)
101
In recent years, the global Internet population and penetration rate have grown continuously with the rapid development of the Internet. Not only is the number of Internet users growing, but the proportion of daily life that people spend online is also increasing. With the development of the Internet, the Web-based business market is growing and companies keep investing in it; with the rise of e-commerce, online stores, online advertising and similar activities continue to increase. Since the rise of social networking sites, sharing spreads everything faster, so many people want to use the convenience of this distribution channel to spread valuable information or products to consumers quickly. This research therefore applies sequential pattern mining, a data mining technique, to users' website browsing data. By examining which sites users visit after a given site, it identifies heavily visited reference nodes and the opinion-leader sites with the widest spread, and it uses a visitor matrix to find the relevance between websites. These methods are then combined with social network analysis so that, through the results of this analysis, companies can find new channels for advertising placement.
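One building block described above is a matrix of which sites are visited together, from which heavily connected "opinion leader" sites can be spotted. A small sketch of such a co-visit matrix, using invented browsing data rather than the study's dataset:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical per-user browsing histories (sites in visit order).
sessions = {
    "user1": ["facebook.com", "news.com", "shop.com"],
    "user2": ["facebook.com", "shop.com"],
    "user3": ["facebook.com", "news.com"],
}

co_visits = defaultdict(int)
for sites in sessions.values():
    for a, b in combinations(sorted(set(sites)), 2):
        co_visits[(a, b)] += 1           # how many users visited both sites

# Degree-like score: sites that co-occur with many others are candidate
# "opinion leader" nodes in the site network.
score = defaultdict(int)
for (a, b), count in co_visits.items():
    score[a] += count
    score[b] += count

for site, s in sorted(score.items(), key=lambda kv: -kv[1]):
    print(site, s)
```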
Lin, Ching-Nan, and 林慶南. "Enhancement of Web Sites Security Utilizing Web Logs Mining." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/34333759103494653266.
中原大學
電子工程研究所
90
The problem of information security on the Web has recently become an important research issue. Backdoors or information leaks in Common Gateway Interface (CGI) scripts, hidden inadvertently or deliberately by programmers, allow enterprise information to be obtained illegally and cannot easily be detected by security tools. In addition, the rapid growth of the Internet has encouraged important research on Web mining. Therefore, in order to detect backdoors or information leaks in CGI scripts that some security tools cannot detect, and to avoid damage to enterprises, we propose a log data mining approach to enhance the security of Web servers. First, we combine Web application log data with Web log data to overcome the limitations of Web logs alone. Then our method uses a density-based clustering algorithm to mine abnormal Web log and Web application log data. The obtained information helps system administrators detect backdoors or information leakage in programs more easily. Moreover, the mined information helps system administrators detect problems in CGI scripts from on-line Web site log data.
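The core of the method is density-based clustering of combined Web and application log records, with sparse points treated as abnormal. A rough sketch of that idea using scikit-learn's DBSCAN on per-request features invented for the example (not the thesis's actual features or parameters):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical per-request features extracted from merged Web / application
# logs: [response size in KB, number of URL parameters, error flag].
X = np.array([
    [2.1, 1, 0], [2.3, 1, 0], [1.9, 2, 0], [2.0, 1, 0],   # normal CGI calls
    [2.2, 2, 0], [1.8, 1, 0], [2.4, 1, 0],
    [48.0, 9, 1],                                          # suspicious outlier
])

labels = DBSCAN(eps=1.5, min_samples=3).fit_predict(X)

# DBSCAN marks low-density points with the label -1; these are the log
# entries an administrator would inspect for backdoors or data leakage.
for features, label in zip(X, labels):
    flag = "ABNORMAL" if label == -1 else f"cluster {label}"
    print(features, flag)
```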
Wang, Tseng-Pu, and 王曾甫. "Clustering Customers Based on Web-Browsing Logs." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/76727886739977180393.
世新大學
資訊管理學研究所(含碩專班)
100
In the current Internet industry, advertising makes up most of the income of many enterprises. Traditionally, promotional activities classify target user groups using basic audience information (demographics). However, classifying target groups by demographics alone ignores differences in people's hobbies and interests, which drive dynamic changes in their behavior. Classifying users by their behavioral characteristics instead allows enterprises to understand Internet users' tendencies and raise their overall marketing coverage. This research uses a database of website browsing behavior recorded by Genesis Market Research Consulting Corporation, together with the seven kinds of consumer behavior researched and analyzed by Yahoo U.S.A., to establish group models of Internet users.
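The segmentation idea is to group users by behavioral features rather than demographics alone. Purely as an illustration (the features, sample values and choice of three clusters are assumptions, not the study's model based on Yahoo's seven behavior types), a k-means sketch over simple per-user browsing features:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-user features: [visits per week, pages per visit,
# share of visits that go to shopping sites].
users = np.array([
    [20, 12, 0.70], [18, 10, 0.65], [22, 14, 0.75],   # heavy shoppers
    [5, 3, 0.10], [4, 2, 0.05], [6, 4, 0.15],         # light news readers
    [12, 30, 0.20], [11, 28, 0.25],                   # deep browsers
])

X = StandardScaler().fit_transform(users)               # put features on one scale
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

for features, label in zip(users, labels):
    print(features, "-> segment", label)
```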
Caldera, Amithalal, University of Western Sydney, College of Science, Technology and Environment, and School of Computing and Information Technology. "Effectively capturing user sessions on the Web using Web server logs." 2005. http://handle.uws.edu.au:8081/1959.7/11206.
Doctor of Philosophy (PhD)
林逸塵. "Collection and retrieval of suicide information in web logs." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/6wm3pm.
Full textTsai, Tsung-chou, and 蔡聰洲. "Integrating Data Warehousing and Data Mining for Web Logs Analysis." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/27623832729147888571.
國立交通大學
資訊管理所
89
Because of highly mature network technologies and the explosive growth of the Internet, the number of Internet users has increased substantially. In order to keep good relationships with customers and raise customer satisfaction, more and more companies provide their customers with easy and fast retrieval of business information and services through the Internet, such as the WWW. Consequently, the Web logs on Web servers keep growing as time goes by. These Web logs are not useless data wasting storage space; instead, they are important sources from which useful information can be retrieved. Effective analysis of Web logs helps promote enterprise competitiveness. In this thesis, an integrated system architecture for analyzing Web logs is proposed. It integrates data mining technology to provide recommendations for Web users and data warehousing technology to provide decision information for managers. A test Web site simulating a company selling computer peripherals has been set up to verify the proposed architecture. The results are analyzed and presented in the thesis. Finally, some comparisons and discussions with respect to other related papers are presented.
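The architecture loads Web logs into a data warehouse for managerial analysis and feeds a mining component for recommendations. The compressed sketch below illustrates only the warehousing side, with an invented star-schema-like fact table in SQLite; the schema and sample rows are assumptions, not the thesis's design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Minimal fact table of page views plus one dimension, standing in for a
# fuller star schema (date, page, visitor, referrer dimensions, etc.).
cur.execute("CREATE TABLE dim_page (page_id INTEGER PRIMARY KEY, url TEXT)")
cur.execute("""CREATE TABLE fact_pageview (
                   page_id INTEGER, visit_date TEXT, visitor TEXT,
                   FOREIGN KEY (page_id) REFERENCES dim_page(page_id))""")

cur.executemany("INSERT INTO dim_page VALUES (?, ?)",
                [(1, "/products/mouse"), (2, "/products/keyboard")])
cur.executemany("INSERT INTO fact_pageview VALUES (?, ?, ?)", [
    (1, "2001-05-01", "v1"), (1, "2001-05-01", "v2"), (2, "2001-05-02", "v1"),
])

# A typical decision-support query: page views per URL per day.
for row in cur.execute("""SELECT p.url, f.visit_date, COUNT(*)
                          FROM fact_pageview f JOIN dim_page p USING (page_id)
                          GROUP BY p.url, f.visit_date"""):
    print(row)
```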
""Aplicação de técnicas de data mining em logs de servidores web"." Tese, Biblioteca Digital de Teses e Dissertações da USP, 2003. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-19012004-093205/.
Hong, Rong Zong, and 洪榮宗. "A Study on Security Enhancement of Web Sites Utilizing Logs Analysis." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/00011799081038754784.
長庚大學
資訊管理學研究所
97
Internet applications have become popular, and the services provided by websites cover food, clothing, daily life, transportation, education and entertainment. However, we often hear that websites have been hacked or attacked, leading to information security incidents. The research issue of this thesis is: can we use server logs to enhance website security? This thesis presents a method that uses server logs as the source of analysis to enhance website security. The server logs analyzed in the thesis include the Fedora Core operating system log, the Apache server log, the Snort log and the IPTABLES log. As a result, we can prevent some potential known attacks; for unknown attacks, we can identify potential threats and thus protect the security of the web server.
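Part of such log analysis is matching known attack patterns against the collected logs. The fragment below shows only that simplest piece, checking a few well-known attack signatures against Apache-style access-log lines; the signatures and sample lines are illustrative assumptions, not the thesis's rule set.

```python
import re

# A few classic signatures of known attacks (directory traversal, SQL
# injection, remote file inclusion); real rule sets are much larger.
SIGNATURES = {
    "directory traversal": re.compile(r"\.\./"),
    "sql injection": re.compile(r"union\s+select|or\s+1=1", re.IGNORECASE),
    "remote file inclusion": re.compile(r"=https?://", re.IGNORECASE),
}

SAMPLE_LOG = [
    '1.2.3.4 - - [10/May/2009:08:01:02 +0800] "GET /index.php?id=3 HTTP/1.1" 200 1043',
    '5.6.7.8 - - [10/May/2009:08:01:05 +0800] "GET /download.php?f=../../etc/passwd HTTP/1.1" 404 210',
    '9.9.9.9 - - [10/May/2009:08:01:09 +0800] "GET /list.php?id=1 UNION SELECT password FROM users HTTP/1.1" 200 87',
]

for line in SAMPLE_LOG:
    hits = [name for name, rx in SIGNATURES.items() if rx.search(line)]
    if hits:
        print("ALERT:", ", ".join(hits), "->", line.split('"')[1])
```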
Tang, Ran. "AN APPROACH FOR IDENTIFYING SERVICE COMPOSITION PATTERNS FROM EXECUTION LOGS." Thesis, 2010. http://hdl.handle.net/1974/6114.
Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2010-09-29 18:08:07.55
Cordes, Christopher Sean. "Blogging the future: Theory and use of web logs to enhance library information services." 2004. http://hdl.handle.net/10150/105509.
Jiang, Jyun-Yu, and 姜俊宇. "Improving Ranking Consistency for Web Search by Leveraging a Knowledge Base and Search Logs." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/67605146002402152486.
國立臺灣大學
資訊工程學研究所
103
In this paper, we propose a new idea called ranking consistency in web search. Relevance ranking is one of the biggest problems in creating an effective web search system. Given some queries with similar search intents, conventional approaches typically only optimize ranking models by each query separately. Hence, there are inconsistent rankings in modern search engines. It is expected that the search results of different queries with similar search intents should preserve ranking consistency. The aim of this paper is to learn consistent rankings in search results for improving the relevance ranking in web search. We then propose a re-ranking model aiming to simultaneously improve relevance ranking and ranking consistency by leveraging knowledge bases and search logs. To the best of our knowledge, our work offers the first solution to improving relevance rankings with ranking consistency. Extensive experiments have been conducted using the Freebase knowledge base and the large-scale query-log of a commercial search engine. The experimental results show that our approach significantly improves relevance ranking and ranking consistency. Two user surveys on Amazon Mechanical Turk also show that users are sensitive and prefer the consistent ranking results generated by our model.
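As a toy illustration of how ranking consistency between two queries with a similar intent could be quantified (the queries, result lists and the use of Kendall's tau are assumptions for the example, not the paper's actual re-ranking model):

```python
from scipy.stats import kendalltau

# Hypothetical ranked results for two queries with the same search intent.
results_q1 = ["docA", "docB", "docC", "docD", "docE"]
results_q2 = ["docB", "docA", "docD", "docC", "docF"]

shared = [d for d in results_q1 if d in results_q2]
rank_q1 = [results_q1.index(d) for d in shared]
rank_q2 = [results_q2.index(d) for d in shared]

# Kendall's tau close to 1 means the two queries rank the shared documents
# consistently; lower values reveal ranking inconsistency.
tau, _ = kendalltau(rank_q1, rank_q2)
print(f"Ranking consistency over {len(shared)} shared results: tau = {tau:.2f}")
```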
Borges, Eurico Alexandre Teixeira. "Sistemas de Data Webhousing : análise, desenho, implementação e exploração de sistemas reais." Master's thesis, 2004. http://hdl.handle.net/1822/2787.
The Web is becoming one of the most appealing environments for many organisations as a means of promoting their businesses and activities, as well as a commercialisation channel. However, a Web user can easily leave one organisation's Web site for its competitors if he doesn't find what he is looking for or if he finds something unpleasant on the organisation's site. Knowing the site's users and making sure that the products, services or information the site provides are what the users want is nowadays a must. That is why many organisations have started to study how their web site users browse the site, where they leave the site and why, how frequently their users return, which products and services are most appealing and, in general terms, everything that may be used to improve the Web site and attract new users. Every user move may be tracked by retaining the click selections they make on the different Web pages during their visit. This flow of clicks is now called a clickstream. It is the data logged by the Web server on the user's selections that will enable the organisation to study their moves and behaviour. However, the Web server log only keeps the bare bones of the user's activity. This data will have to be enriched with data collected by other systems designed to provide the Web site with contents or additional functionalities. Traditionally, the gathering and integration of data from heterogeneous data sources is done inside a Data Warehouse; by adding clickstream data to it we create a Data Webhouse. However, Web technology, the data volume, its heterogeneity and incompleteness create difficulties in the process of extracting, transforming and loading data into the Data Webhouse. In this document we present a dimensional model for a Data Webhouse whose purpose is to analyse a commercial Web site. Several data sources are presented and analysed in detail. Some of the techniques used to eliminate or reduce clickstream data problems are also described. The Data Webhouse extraction, cleaning, transformation and loading process is described, with special attention paid to clickstream processing tasks such as user and robot identification and user session reconstruction. A new decision support system prototype, named Webuts - Web Usage Tracking Statistics, is presented. This system's purpose is to track and analyse a Web site's users' moves and activities as well as generate some statistical data on the Web site's operation. Its operation is based on a Data Webhouse and its development incorporated some of the elements, techniques and best practices studied and described.
Sonae, Indústria Consultoria e Gestão - Departamento de Sistemas de Informação
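Two of the clickstream preparation tasks highlighted in the abstract above are robot identification and user session reconstruction. The compact sketch below applies the customary 30-minute inactivity timeout to an invented sample of log records; the field names and the robot heuristic are assumptions, not Webuts internals.

```python
SESSION_TIMEOUT = 30 * 60  # the customary 30-minute inactivity threshold

def is_robot(record):
    """Very naive robot test: known crawler strings or robots.txt fetches."""
    return "bot" in record["agent"].lower() or record["url"] == "/robots.txt"

def sessionize(records):
    """Group each visitor's page views into sessions separated by
    more than SESSION_TIMEOUT seconds of inactivity."""
    sessions = {}
    for rec in sorted(records, key=lambda r: (r["visitor"], r["time"])):
        if is_robot(rec):
            continue
        visitor_sessions = sessions.setdefault(rec["visitor"], [])
        if (not visitor_sessions
                or rec["time"] - visitor_sessions[-1][-1]["time"] > SESSION_TIMEOUT):
            visitor_sessions.append([rec])
        else:
            visitor_sessions[-1].append(rec)
    return sessions

log = [
    {"visitor": "v1", "time": 0,    "url": "/home",       "agent": "Mozilla/5.0"},
    {"visitor": "v1", "time": 600,  "url": "/prices",     "agent": "Mozilla/5.0"},
    {"visitor": "v1", "time": 4000, "url": "/home",       "agent": "Mozilla/5.0"},
    {"visitor": "v2", "time": 10,   "url": "/robots.txt", "agent": "SomeBot/1.0"},
]
for visitor, sess in sessionize(log).items():
    print(visitor, [len(s) for s in sess])   # v1 -> [2, 1]; v2 filtered out as a robot
```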