
Dissertations / Theses on the topic 'Wikipedia'


Consult the top 50 dissertations / theses for your research on the topic 'Wikipedia.'


1

Huang, F. (Feiran). "What motivates and restricts Chinese Wikipedians to contribute to English Wikipedia?" Master's thesis, University of Oulu, 2016. http://urn.fi/URN:NBN:fi:oulu-201605221860.

Abstract:
Wikipedia, the world's biggest and most popular online encyclopedia, contains more than 26 million articles in over 280 languages, behind which are contributors voluntarily dedicating their time and effort. Hence, the motivations of Wikipedia contributors have been a popular topic in academic research. According to prior studies, people who contribute to Wikipedia entries are motivated by altruism, reputation and enjoyment. However, research on the motivations and restrictions of Chinese Wikipedia users contributing to Wikipedia articles in English remains blank. To bridge this gap, this study aims to explore and address the motivations and restrictions of Chinese Wikipedians contributing to the English version of Wikipedia. The study was an explorative case study, with data and interviews contributed by four Chinese Wikipedians. The main findings were divided into two domains: motivations and restrictions. In more detail, Chinese Wikipedians are motivated by altruism, reputation, self-development and improvement of content quality. Meanwhile, they face restrictions such as the blocking of access to Chinese-language Wikipedia from mainland China and the limited sources available for articles in Chinese. The findings contribute to research on cross-linguistic participation: people contributing to Wikipedia in a language other than their mother tongue. In addition, the findings could be helpful for future research on Internet blocking in China.
2

Cöster-Ahl, Pelle. "Wikipedia i undervisning." Thesis, Malmö högskola, Fakulteten för lärande och samhälle (LS), 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-32934.

Abstract:
The purpose of this study is to examine how teachers of Swedish and of social sciences relate to Wikipedia, and thereby shed light on some of the challenges that schools face in the current media landscape. The research questions focus on teachers' attitudes towards and reasoning about Wikipedia, and on what consequences the teachers believe its use has for their teaching. Possible differences between the two groups of teachers in their attitudes towards Wikipedia are also addressed. The method used is qualitative interviews, conducted with four teachers, two teachers of Swedish and two of social sciences, from the same upper secondary school. The results give a picture of what the discourse around Wikipedia can look like in school and how the different teachers reason about it. The teachers see various advantages and disadvantages of Wikipedia in teaching, and these are treated in the analysis. The analysis shows a conflict with the traditions that previously gave schools and teachers the power to control and distribute knowledge. New cultural conditions, and the fact that we live in a participatory culture, mean that more people can spread knowledge and information, for example via Wikipedia. One of the teachers in particular points out shortcomings in Wikipedia's reliability, while another regards its democratic aspect as important. Various pros and cons of Wikipedia's form are also discussed, for example the pedagogical consequences of its hypertextuality. The conclusion is that the teachers have no uniform attitude towards Wikipedia. Moreover, despite being a new phenomenon, Wikipedia seems to harmonise well with traditional school assignments that revolve around reproducing facts, which also points to a relatively narrow view of knowledge.
3

Famiglietti, Andrew A. "Hackers, Cyborgs, and Wikipedians: The Political Economy and Cultural History of Wikipedia." Bowling Green State University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1300717552.

4

Johansson, Henrik, and Johan Stiel. "Kognitiv auktoritet och Wikipedia : En analys av gymnasieelevers källkritiska granskning av Wikipedia." Thesis, Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:hb:diva-19012.

Abstract:
The main purpose of this master's thesis is to examine how high school students evaluate the quality of the information available on the online encyclopedia Wikipedia. By doing quantitative research based on questionnaires, we expected to find that the means of judgment students use in this situation are most frequently based on a text's intrinsic qualities from the point of view of its content. It was also our presupposition that 18- and 19-year-old high school graduates have received some education in information research and authoritative sources on the Internet. From this assumption we assessed that, while not always using methods recommended by experts, the students use Wikipedia with a high level of healthy suspicion. We considered Patrick Wilson's theory of cognitive authority to be valid for these assumptions. Though generalizations based on a large sample reduce an understanding of each individual's expected behavior during information quality evaluation, conclusions could be drawn about which criteria most individuals of this age use in a situation like this. The results show that though intrinsic plausibility has great significance in the students' evaluation of quality, looking for outside sources is still the main criterion for credibility. It was further discovered that the students in the study used a wide variety of criteria apart from intrinsic plausibility and examining outside sources. This can be of certain significance for future research because it gives us valuable information about the behavior of high school students in terms of information research.
5

Pang, Cheong Iao. "Map-like Wikipedia visualization." Thesis, University of Macau, 2011. http://umaclib3.umac.mo/record=b2550683.

6

Holaker, Martin Rudi, and Eirik Emanuelsen. "Event Detection using Wikipedia." Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2013. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-22972.

Abstract:
The amount of information on the web is ever growing, and analysing and making sense of this is difficult without specialised tools. The concept of Big Data is the field of processing and storing these enormous quantities of data. Wikipedia is an example of a source that can benefit from improvements provided by this concept. It is an online, user-driven encyclopedia that contains vast amounts of information on nearly all aspects known to man. Every hour, Wikipedia releases logs showing the number of views each page had, and it is also possible to gain access to all edits, as Wikipedia frequently releases the entire encyclopedia, containing all revisions. This makes it a great source when studying trends of recent years. In order to systematise these page views and edits, we design a scalable database and implement a number of analysis jobs to process this information. From the page views and edit counts, we perform burst and event detection, and compare the usefulness of two methods used for this purpose. We test the validity of our system by examining the case of the football transfer window in August 2011. Our system generates admirable and accurate results with regard to detecting these football transfers as events. In order to visualise the information we gather, we design a web application that gives users structured access to our findings.
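(To make the burst-detection step concrete: one common technique, sketched below in Python, flags hours whose page-view count exceeds a trailing mean by several standard deviations. The function name, window size, threshold and data are invented for illustration and are not the thesis's implementation.)

    # Toy burst detector over hourly page-view counts (illustrative only).
    # An hour is flagged when its views exceed the trailing mean by k
    # standard deviations.
    from statistics import mean, stdev

    def detect_bursts(hourly_views, window=24, k=3.0):
        bursts = []
        for i in range(window, len(hourly_views)):
            history = hourly_views[i - window:i]
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and hourly_views[i] > mu + k * sigma:
                bursts.append(i)  # hour index of a suspected event
        return bursts

    # Example: the spike in the final hour is reported as a burst.
    views = [100, 95, 110, 105, 98] * 5 + [900]
    print(detect_bursts(views))  # -> [25]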
7

Maier, Gunther. "OpenStreetMap, the Wikipedia Map." ERSA (European Regional Science Association), 2014. http://dx.doi.org/10.18335/region.v1i1.70.

Abstract:
This paper presents OpenStreetMap and closely related software as a resource for spatial economic research. The paper demonstrates how information can be extracted from OpenStreetMap, how it can be used as a geographical interface in web-based communication, and illustrates the value of the tools by use of a specific application, the WU campus GIS.
8

Chen, Mike. "Taxonomy Extraction from Wikipedia." Ohio University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1320342712.

9

Wotschka, Marco. "Experiments in Compressing Wikipedia." Ohio University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1376909207.

10

Ferron, Michela. "Collective Memories in Wikipedia." Doctoral thesis, Università degli studi di Trento, 2012. https://hdl.handle.net/11572/368923.

Abstract:
Collective memories are precious resources for society, because they contribute to strengthening the emotional bonds between community members, maintaining group cohesion, and directing future behavior. Understanding the formation of collective memories of emotional upheavals is important for a better comprehension of people's reactions and of the consequences for their psychological health. Previous studies investigated the effects of single traumatizing events, but few of them applied a quantitative approach to analyze, on a large scale, the different psychological processes associated with the formation of collective memories of upheavals. This thesis explores the opportunities of applying quantitative methods to the study of collective memories in a collaborative environment such as the English Wikipedia. First, the presence of commemoration processes in Wikipedia articles and talk pages about traumatic events is investigated through the analysis of edit activity patterns. Second, natural language processing techniques are applied to detect differences in the collective representations of traumatic and non-traumatic events, in the temporal focus of old and recent traumatic events, and in the representations of natural and human-made disasters. Third, the temporal evolution of language related to emotional, cognitive and social processes is analyzed in the talk pages of two different emotional upheavals, the 2005 London bombings and the 2011 Egyptian revolution. The results confirm the interpretation of Wikipedia as a global memory place, and highlight specific psychological processes related to the formation of collective memories of different types of traumatic events, opening the way to the quantitative study of collective memory formation in digital collaborative environments.
11

Ferron, Michela. "Collective Memories in Wikipedia." Doctoral thesis, University of Trento, 2012. http://eprints-phd.biblio.unitn.it/830/1/Thesis.pdf.

12

Sauper, Christina (Christina Joan). "Automated creation of Wikipedia articles." Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/47824.

Abstract:
This thesis describes an automatic approach for producing Wikipedia articles. The wealth of information present on the Internet is currently untapped for many topics of secondary concern. Creating articles requires a great deal of time spent collecting information and editing. This thesis presents a solution. The proposed algorithm creates a new article by querying the Internet, selecting relevant excerpts from the search results, and synthesizing the best excerpts into a coherent document. This work builds on previous work in document summarization, web question answering, and Integer Linear Programming. At the core of our approach is a method for using existing human-authored Wikipedia articles to learn a content selection mechanism. Articles in the same category often present similar types of information; we can leverage this to create content templates for new articles. Once a template has been created, we use classification and clustering techniques to select a single best excerpt for each section. Finally, we use Integer Linear Programming techniques to eliminate any redundancy over the complete article. We evaluate our system for both individual sections and complete articles, using both human and automatic evaluation methods. The results indicate that articles created by our system are close to human-authored Wikipedia entries in quality of content selection. We show that both human and automatic evaluation metrics are in agreement; therefore, automatic methods are a reasonable evaluation tool for this task. We also empirically demonstrate that explicit modeling of content structure is essential for improving the quality of an automatically-produced article.
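(The content selection described above can be pictured with a small integer linear program. The sketch below, a hypothetical example with invented section names, excerpt scores and one redundant pair, picks the highest-scoring excerpt per template section while forbidding excerpt pairs judged too similar; it illustrates the general technique only, not the thesis's actual model, which also uses classification and clustering to score excerpts.)

    # Illustrative ILP: one excerpt per section, no redundant pairs.
    # Requires: pip install pulp
    import pulp

    scores = {("habitat", "e1"): 0.9, ("habitat", "e2"): 0.6,
              ("diet", "e3"): 0.8, ("diet", "e4"): 0.7}
    sections = {"habitat": ["e1", "e2"], "diet": ["e3", "e4"]}
    redundant_pairs = [("e1", "e3")]  # too similar to co-occur

    x = {e: pulp.LpVariable(f"x_{e}", cat="Binary")
         for es in sections.values() for e in es}

    prob = pulp.LpProblem("article_assembly", pulp.LpMaximize)
    prob += pulp.lpSum(scores[(s, e)] * x[e]
                       for s, es in sections.items() for e in es)
    for s, es in sections.items():   # exactly one excerpt per section
        prob += pulp.lpSum(x[e] for e in es) == 1
    for e1, e2 in redundant_pairs:   # redundant excerpts are exclusive
        prob += x[e1] + x[e2] <= 1

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print({e: int(v.value()) for e, v in x.items()})  # e1 and e4 chosen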
13

Chan, Chi Kit. "Wikipedia recent changes information visualization." Thesis, University of Macau, 2015. http://umaclib3.umac.mo/record=b3335700.

14

Helgeson, Björn. "The Swedish Wikipedia Gender Gap." Thesis, KTH, Medieteknik och interaktionsdesign, MID, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-177493.

Abstract:
The proportion of women editors on the English language Wikipedia has for years been known to be very low. The purpose of this thesis is to see if this gender gap exists on the Swedish language Wikipedia as well, and to investigate the reasons behind it. To do this, three methods are used. Firstly, a literature review is conducted, looking at women in computing and at how Wikipedia works and how it was founded. Secondly, user behavior and activity levels are measured through a database analysis of editors and edits. And thirdly, a survey is distributed, aimed at both readers and editors of Swedish Wikipedia, gathering some 2700 respondents. The results indicate that there is indeed a big disproportion, and that only between 13-19% of editors are women. The findings did not indicate that readers of the encyclopedia have any strong negative preconceptions about Wikipedia or its community. However, when looking at reasons for not contributing, women were significantly more likely to perceive themselves as not competent enough to edit. Computer skills were found to be an important factor for trying out editing in the first place, and Wikipedia's connection to a male-dominated computing/programming culture is put forth as a reason for the resilience of the gender gap. The difference in men's and women's communication styles in relation to the climate created by Wikipedia's policies and guidelines is also discussed.
15

Tadakamala, Anirudh. "Analysis of PageRank on Wikipedia." Kansas State University, 2014. http://hdl.handle.net/2097/17609.

Abstract:
With the massive explosion of data in recent times and people depending more and more on search engines to get all kinds of information they want, it has become increasingly difficult for search engines to produce the most relevant data for users. PageRank is one algorithm that has revolutionized the way search engines work. Developed by Google's Larry Page and Sergey Brin, it is used by Google to rank websites and display them in order of ranking in its search engine results. PageRank is a link analysis algorithm that assigns a weight to each document in a corpus and measures its relative importance within the corpus. The purpose of my project is to extract all the English Wikipedia data using the MediaWiki API and JWPL (Java Wikipedia Library), build the PageRank algorithm, and analyze its performance on this data set. Since the data set is too big to run in a single-node Hadoop cluster, the analysis is done in a high-computation cluster called Beocat, provided by the Kansas State University Computing and Information Sciences Department.
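(For readers unfamiliar with the algorithm, a toy power-iteration version of PageRank is sketched below over an invented three-page graph; plain Python, not the Hadoop pipeline the thesis describes.)

    # Minimal power-iteration PageRank (illustrative sketch).
    def pagerank(links, damping=0.85, iters=50):
        pages = list(links)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iters):
            new = {p: (1 - damping) / len(pages) for p in pages}
            for p, outs in links.items():
                share = rank[p] / len(outs) if outs else 0.0
                for q in outs:
                    new[q] += damping * share
            rank = new
        return rank

    # Toy graph: both articles link to "Main_Page"; it links back to "A".
    toy = {"A": ["Main_Page"], "B": ["Main_Page"], "Main_Page": ["A"]}
    print(pagerank(toy))  # "Main_Page" accumulates the highest rank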
16

Radford, William Edward John. "Linking named entities to Wikipedia." Thesis, The University of Sydney, 2014. http://hdl.handle.net/2123/12850.

Abstract:
Natural language is fraught with problems of ambiguity, including name reference. A name in text can refer to multiple entities, just as an entity can be known by different names. This thesis examines how a mention in text can be linked to an external knowledge base (KB), in our case Wikipedia. The named entity linking (NEL) task requires systems to identify the KB entry, or Wikipedia article, that a mention refers to; or, if the KB does not contain the correct entry, return NIL. Entity linking systems can be complex, and we present a framework for analysing their different components, which we use to analyse three seminal systems evaluated on a common dataset; we show the importance of precise search for linking. The Text Analysis Conference (TAC) is a major venue for NEL research. We report on our submissions to the entity linking shared task in 2010, 2011 and 2012. The information required to disambiguate entities is often found in the text, close to the mention. We explore apposition, a common way for authors to provide information about entities. We model syntactic and semantic restrictions with a joint model that achieves state-of-the-art apposition extraction performance. We generalise from apposition to examine local descriptions specified close to the mention. We add local description to our state-of-the-art linker by using patterns to extract the descriptions and matching against this restricted context. Not only does this make for a more precise match, we are also able to model failure to match. Local descriptions help disambiguate entities, further improving our state-of-the-art linker. The work in this thesis seeks to link textual entity mentions to knowledge bases. Linking is important for any task where external world knowledge is used, and resolving ambiguity is fundamental to advancing research into these problems.
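(The basic decision the NEL task requires, returning the best KB entry or NIL, can be shown in a few lines. The sketch below is a deliberately crude stand-in: string similarity plus a token-overlap context bonus, with invented KB entries and threshold; the thesis's linker uses far richer search and features.)

    # Minimal shape of an entity linker with a NIL option (toy sketch).
    from difflib import SequenceMatcher

    KB = ["Michael Jordan (basketball)",
          "Michael I. Jordan (machine learning)"]

    def link(mention, context, threshold=0.5):
        def score(entry):
            name_sim = SequenceMatcher(None, mention.lower(),
                                       entry.lower()).ratio()
            # Crude context signal: any KB-entry token in the context?
            bonus = 0.3 if any(t.strip("()").lower() in context.lower()
                               for t in entry.split()) else 0.0
            return name_sim + bonus
        best = max(KB, key=score)
        return best if score(best) >= threshold else "NIL"

    print(link("Michael Jordan", "professor of machine learning"))
    print(link("Zorblax", "no such entity"))  # -> NIL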
17

Fallis, Don. "Toward an Epistemology of Wikipedia." Wiley Periodicals, 2008. http://hdl.handle.net/10150/105728.

Abstract:
Wikipedia (the "free online encyclopedia that anyone can edit") is having a huge impact on how a great many people gather information about the world. So, it is important for epistemologists and information scientists to ask whether or not people are likely to acquire knowledge as a result of having access to this information source. In other words, is Wikipedia having good epistemic consequences? After surveying the various concerns that have been raised about the reliability of Wikipedia, this paper argues that the epistemic consequences of people using Wikipedia as a source of information are likely to be quite good. According to several empirical studies, the reliability of Wikipedia compares favorably to the reliability of traditional encyclopedias. Furthermore, the reliability of Wikipedia compares even more favorably to the reliability of those information sources that people would be likely to use if Wikipedia did not exist (viz., websites that are as freely and easily accessible as Wikipedia). In addition, Wikipedia has a number of other epistemic virtues (e.g., power, speed, and fecundity) that arguably outweigh any deficiency in terms of reliability. Even so, epistemologists and information scientists should certainly be trying to identify changes (or alternatives) to Wikipedia that will bring about even better epistemic consequences. This paper suggests that, in order to improve Wikipedia, we need to clarify what our epistemic values are and we need a better understanding of why Wikipedia works as well as it does.
18

Rendina, Natalia. "Wikipedia, il crowdsourcing e la traduzione." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/10804/.

Abstract:
This thesis sets out to investigate Wikipedia's relationship with the translation of its articles. It examines the Wikipedia phenomenon as a paradox and as a globally efficient method of spreading information thanks to the NPOV (neutral point of view) policy; crowdsourcing, with its ethical implications for volunteers, professionals and clients; and the translation of the site's articles by volunteers. It discusses the translator's relationship with Wikipedia's vast network and with the 280 languages in which the site can be consulted, and offers a proposed Italian translation of the "Recursos Humanos" section of the Spanish Wikipedia page "El Corte Inglés".
19

Wöhner, Thomas. "Automatic Editing Rights Management in Wikipedia." Universitätsbibliothek Leipzig, 2012. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-84131.

Abstract:
The free online encyclopedia Wikipedia is one of the most successful collaborative web projects. It is based on an open editing model, which allows everyone to edit the articles directly in the web browser. As a result of the open concept, undesirable contributions like vandalism cannot be ruled out. These contributions reduce article quality temporarily, consume system resources and cause effort for correcting. To address these problems, this paper introduces an approach for automatic editing rights management in Wikipedia that assigns editing rights according to the reputation of the author and the quality of the article to be edited. The analysis shows that this approach reduces undesirable contributions significantly while valuable contributions are nearly unaffected.
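(The assignment rule described can be illustrated with a toy gate: the higher the article's quality, the more author reputation is required for a direct edit. The thresholds and scales below are invented for the example; the paper's actual model is more elaborate.)

    # Toy editing-rights gate (illustrative only).
    def may_edit(author_reputation, article_quality, base=0.2):
        # Higher-quality articles demand more reputation to edit.
        required = base + 0.6 * article_quality
        return author_reputation >= required

    print(may_edit(author_reputation=0.9, article_quality=0.8))  # True
    print(may_edit(author_reputation=0.3, article_quality=0.8))  # False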
20

Mederake, Nathalie [Verfasser]. "Wikipedia: Palimpseste der Gegenwart / Nathalie Mederake." Frankfurt : Peter Lang GmbH, Internationaler Verlag der Wissenschaften, 2016. http://d-nb.info/1090773838/34.

21

El, Zant Samer. "Google matrix analysis of Wikipedia networks." Thesis, Toulouse, INPT, 2018. http://www.theses.fr/2018INPT0046/document.

Abstract:
This thesis concentrates on the analysis of the large directed network representation of Wikipedia. Wikipedia stores valuable fine-grained dependencies among articles by linking webpages together for diverse types of interactions. Our focus is to capture fine-grained and realistic interactions between a subset of webpages in this Wikipedia network. Therefore, we propose to leverage a novel Google matrix representation of the network called the reduced Google matrix. This reduced Google matrix (GR) is derived for the subset of webpages of interest (i.e. the reduced network). As for the regular Google matrix, one component of GR captures the probability of two nodes of the reduced network being directly connected in the full network. But unique to GR, another component accounts for the probability of having both nodes indirectly connected through all possible paths in the full network. In this thesis, we demonstrate with several case studies that GR offers a reliable and meaningful representation of direct and indirect (hidden) links of the reduced network. We show that GR analysis is complementary to the well-known PageRank analysis and can be leveraged to study the influence of a link variation on the rest of the network structure. Case studies are based on Wikipedia networks originating from different language editions. Interactions between several groups of interest are studied in detail: painters, countries and terrorist groups. For each study, a reduced network is built, and direct and indirect interactions are analyzed and confronted with historical, geopolitical or scientific facts. A sensitivity analysis is conducted to understand the influence of the ties in each group on other nodes (e.g. countries in our case). From our analysis, we show that it is possible to extract valuable interactions between painters, countries or terrorist groups. The network of painters obtained with GR captures art-historical facts such as the classification of painting movements. Well-known interactions between major EU countries or worldwide are underlined as well in our results. Similarly, networks of terrorist groups show relevant ties in line with their ideology or their historical or geopolitical relationships. We conclude this study by showing that reduced Google matrix analysis is a novel, powerful analysis method for large directed networks. We argue that this approach can also find useful application for different types of datasets constituted by the exchange of dynamic content. This approach offers new possibilities for the effective analysis of interactions in a group of nodes embedded in a large directed network.
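(In the reduced Google matrix literature, the two components mentioned above correspond to the block formula GR = Grr + Grs (I - Gss)^-1 Gsr, where r indexes the reduced set and s the rest of the network. A minimal numpy sketch under that assumption, with a toy column-stochastic matrix rather than data from the thesis:)

    # Reduced Google matrix: direct block plus all indirect paths via s.
    import numpy as np

    def reduced_google_matrix(G, r):
        n = G.shape[0]
        s = [i for i in range(n) if i not in set(r)]
        Grr = G[np.ix_(r, r)]              # direct links inside r
        Grs, Gsr = G[np.ix_(r, s)], G[np.ix_(s, r)]
        Gss = G[np.ix_(s, s)]
        indirect = Grs @ np.linalg.inv(np.eye(len(s)) - Gss) @ Gsr
        return Grr + indirect

    # Toy 4-node column-stochastic G; nodes 0 and 1 form the reduced set.
    G = np.array([[0.1, 0.4, 0.3, 0.2],
                  [0.3, 0.1, 0.3, 0.3],
                  [0.3, 0.3, 0.2, 0.2],
                  [0.3, 0.2, 0.2, 0.3]])
    print(reduced_google_matrix(G, [0, 1]))  # columns sum to 1 again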
22

Dandala, Bharath. "Multilingual Word Sense Disambiguation Using Wikipedia." Thesis, University of North Texas, 2013. https://digital.library.unt.edu/ark:/67531/metadc500036/.

Abstract:
Ambiguity is inherent to human language. In particular, word sense ambiguity is prevalent in all natural languages, with a large number of the words in any given language carrying more than one meaning. Word sense disambiguation is the task of automatically assigning the most appropriate meaning to a polysemous word within a given context. Generally, the problem of resolving ambiguity in the literature has revolved around the famous quote "you shall know the meaning of the word by the company it keeps." In this thesis, we investigate the role of context in resolving ambiguity through three different approaches. Instead of using a predefined monolingual sense inventory such as WordNet, we use a language-independent framework where the word senses and sense-tagged data are derived automatically from Wikipedia. Using Wikipedia as a source of sense annotations provides the much-needed solution for the knowledge acquisition bottleneck. In order to evaluate the viability of Wikipedia-based sense annotations, we cast the task of disambiguating polysemous nouns as a monolingual classification task and experimented on lexical samples from four different languages (viz. English, German, Italian and Spanish). The experiments confirm that the Wikipedia-based sense annotations are reliable and can be used to construct accurate monolingual sense classifiers. It is a long-held belief that exploiting multiple languages helps in building accurate word sense disambiguation systems. Subsequently, we developed two approaches that recast the task of disambiguating polysemous nouns as a multilingual classification task. The first approach for multilingual word sense disambiguation attempts to effectively use a machine translation system to leverage two relevant multilingual aspects of the semantics of text. First, the various senses of a target word may be translated into different words, which constitute a unique, yet highly salient signal that effectively expands the target word's feature space. Second, the translated context words themselves embed co-occurrence information that a translation engine gathers from very large parallel corpora. The second approach for multilingual word sense disambiguation attempts to reduce the reliance on the machine translation system during training by using the multilingual knowledge available in Wikipedia through its interlingual links. Finally, the experiments on a lexical sample from four different languages confirm that the multilingual systems perform better than the monolingual system and significantly improve the disambiguation accuracy.
23

McCurdy, Helena Brooke. "WikiMatcher: Leveraging Wikipedia for Ontology Alignment." Wright State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1461710743.

24

Dandala, Bharath. "Graph-Based Keyphrase Extraction Using Wikipedia." Thesis, University of North Texas, 2010. https://digital.library.unt.edu/ark:/67531/metadc67939/.

Abstract:
Keyphrases describe a document in a coherent and simple way, giving the prospective reader a way to quickly determine whether the document satisfies their information needs. With the pervasion of huge amounts of information on the Web, and only a small fraction of documents having keyphrases assigned, there is a definite need for automatic keyphrase extraction systems. Typically, a document written by a human develops around one or more general concepts or sub-concepts. These concepts or sub-concepts should be structured and semantically related to each other, so that they can form a meaningful representation of the document. Considering the fact that the phrases or concepts in a document are related to each other, a new approach for keyphrase extraction is introduced that exploits the semantic relations in the document. For measuring the semantic relations between concepts or sub-concepts in the document, I present a comprehensive study aimed at using collaboratively constructed semantic resources like Wikipedia and its link structure. In particular, I introduce a graph-based keyphrase extraction system that exploits the semantic relations in the document and features such as term frequency. I evaluated the proposed system using novel measures, and the results obtained compare favorably with previously published results on established benchmarks.
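(A minimal version of such a graph-based ranker, in the TextRank family, is sketched below: candidate terms become nodes of a co-occurrence graph and PageRank scores them. The toy sentences and uniform edge weights are invented; the system described above would weight edges with Wikipedia-derived semantic relatedness instead.)

    # Toy graph-based keyphrase ranking (illustrative sketch).
    # Requires: pip install networkx
    import itertools
    import networkx as nx

    doc = ["keyphrase extraction ranks candidate phrases",
           "candidate phrases form a graph",
           "graph ranking finds important phrases"]

    g = nx.Graph()
    for sentence in doc:
        for w1, w2 in itertools.combinations(set(sentence.split()), 2):
            # One unit of weight per co-occurrence; a Wikipedia-based
            # relatedness score would replace this constant.
            w = g.get_edge_data(w1, w2, {"weight": 0})["weight"] + 1
            g.add_edge(w1, w2, weight=w)

    scores = nx.pagerank(g, weight="weight")
    print(sorted(scores, key=scores.get, reverse=True)[:3])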
25

QURESHI, MUHAMMAD ATIF. "Utilizing Wikipedia for Text Mining Applications." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2015. http://hdl.handle.net/10281/91081.

Abstract:
Text is a popular form of communication; however, textual data is complex in nature as it is produced by humans. Given the huge amount of textual data currently available, it is essential to be able to mine this data automatically. Recent text mining efforts make extensive use of knowledge bases, and this thesis pursues a similar effort. We, however, make use of Wikipedia to solve complex text mining tasks, exploiting the category-article structure within Wikipedia, which current approaches do not use effectively. In particular, we solve the problem of determining the various topical threads in a document, together with the contextualization of social media content to disambiguate its various aspects. Experimental evaluations demonstrate the superiority of our proposed methods when compared with the state of the art.
26

Kopf, Susanne. "Debating the European Union transnationally : Wikipedians' construction of the EU on a Wikipedia talk page (2001-2015)." Thesis, Lancaster University, 2018. http://eprints.lancs.ac.uk/126749/.

Abstract:
This thesis deals with the construction of the European Union (EU) as negotiated among contributors to the English Wikipedia between 2001 and 2015. It focuses on the Talk Page (TP) which accompanies the Wikipedia article on the EU and provides a space for Wikipedia contributors to discuss controversial issues regarding the article. The EU has received considerable attention in Critical Discourse Studies (CDS), addressing e.g. questions regarding language policy and discourses surrounding topics connected to the EU (e.g. Muntigl, Weiss, & Wodak, 2000; Unger, Krzyżanowski, & Wodak, 2014; Wodak, 2007a). However, private individuals’ attempts to make sense of the EU when facing the task of defining it have hardly been touched upon. In this context, Wikipedia constitutes an ideal repository of data as it has recorded debates on the institution since 2001. Taking a corpus-assisted approach (cf. Baker, 2006), I examine how contributors from various backgrounds have grappled with their understanding of the EU. Additionally, this study explores aspects of Wikipedia since this collaboratively created encyclopaedia has received little research attention. Taking the EU on Wikipedia as a starting point, this thesis presents a foray into how Wikipedia can be approached from a CDS perspective. That is, on the one hand, it identifies central aspects of this website’s structure and addresses policies that guide Wikipedia operations and thus shape Wikipedia data. On the other hand, it examines the site’s societal impact/relevance and evaluates to what extent it can function as a transnational public sphere. Findings suggest that a substantial part of discussions amongst Wikipedians addresses the classification of the EU along the continuum between confederation and unified country, depending on different views concerning member states’ sovereignty. Wikipedia’s policies and the nature of the debates further suggest that the TP can, to some extent, serve as a transnational public sphere.
27

Pentzold, Christian. "Wikipedia : Diskussionsraum und Informationsspeicher im neuen Netz /." München : Fischer, 2007. http://www.verlag-reinhard-fischer.de.

28

Niesyto, Johanna [Verfasser]. "Die minimale Politik der Wikipedia / Johanna Niesyto." Siegen : Universitätsbibliothek der Universität Siegen, 2017. http://d-nb.info/1129453286/34.

29

Yau, Cheuk Yin. "Reusing semantic web data in authoring Wikipedia." Thesis, University of Warwick, 2011. http://wrap.warwick.ac.uk/55706/.

Abstract:
This thesis presents research conducted at the University of Warwick in improving Web 2.0 (or Social Web) authoring, by semi-adaptively recommending data via Information Retrieval methods. Thus this research integrates two main areas, Semantic Web and Information Retrieval. The research has led to a tool for retrieving semantic web data and reusing the retrieved data in authoring Wikipedia pages. The goal of this tool is firstly to enable semantic web crawling with a user friendly interface, by applying a semantic web framework API to an existing web archiving system and, secondly, to find an easy way for reusing the data in authoring, by implementing a delivery engine for the retrieved data. To demonstrate this new tool, this thesis presents example scenarios for retrieving semantic data from a specified domain and utilizing the drag and drop method in authoring. The presented paradigm is evaluated via two case studies, following the two versions of the prototype, and future work is discussed.
30

Kliegr, Tomáš. "Unsupervised Entity Classification with Wikipedia and WordNet." Doctoral thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-126861.

Abstract:
This dissertation addresses the problem of classification of entities in text represented by noun phrases. The goal of this thesis is to develop a method for automated classification of entities appearing in datasets consisting of short textual fragments. The emphasis is on unsupervised and semi-supervised methods that allow for the fine-grained character of the assigned classes and require no labeled instances for training. The set of target classes is either user-defined or determined automatically. Our initial attempt to address the entity classification problem is the Semantic Concept Mapping (SCM) algorithm. SCM maps the noun phrases representing the entities, as well as the target classes, to WordNet. Graph-based WordNet similarity measures are used to assign the closest class to the noun phrase. If a noun phrase does not match any WordNet concept, a Targeted Hypernym Discovery (THD) algorithm is executed. The THD algorithm extracts a hypernym from a Wikipedia article defining the noun phrase using lexico-syntactic patterns. This hypernym is then used to map the noun phrase to a WordNet synset, but it can also be perceived as the classification result by itself, resulting in an unsupervised classification system. The SCM and THD algorithms were designed for English. While adaptation of these algorithms for other languages is conceivable, we decided to develop the Bag of Articles (BOA) algorithm, which is language-agnostic as it is based on the statistical Rocchio classifier. Since this algorithm utilizes Wikipedia as a source of data for classification, it does not require any labeled training instances. WordNet is used in a novel way to compute term weights. It is also used as a positive term list and for lemmatization. A disambiguation algorithm utilizing global context is also proposed. We consider the BOA algorithm to be the main contribution of this dissertation. Experimental evaluation of the proposed algorithms is performed on the WordSim353 dataset, which is used for evaluation in the Word Similarity Computation (WSC) task, and on the Czech Traveler dataset, the latter being specifically designed for the purpose of our research. BOA performance on WordSim353 achieves a Spearman correlation of 0.72 with human judgment, which is close to the 0.75 correlation of the ESA algorithm, to the author's knowledge the best-performing algorithm for this gold-standard dataset that does not require training data. The advantage of BOA over ESA is that it has smaller requirements on preprocessing of the Wikipedia data. While SCM underperforms on the WordSim353 dataset, it overtakes BOA on the Czech Traveler dataset, which was designed specifically for our entity classification problem. This discrepancy requires further investigation. In a standalone evaluation of THD on the Czech Traveler dataset, the algorithm returned a correct hypernym for 62% of entities.
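(The WordSim353 numbers quoted above are Spearman rank correlations between system similarity scores and human judgments; computing one is a one-liner, shown here with invented toy scores rather than the dissertation's data.)

    # Spearman correlation between gold judgments and system scores.
    # Requires: pip install scipy
    from scipy.stats import spearmanr

    human = [9.8, 8.1, 6.3, 3.2, 1.1]        # gold similarity ratings
    system = [0.91, 0.75, 0.70, 0.30, 0.05]  # hypothetical system scores

    rho, _ = spearmanr(human, system)
    print(f"Spearman correlation: {rho:.2f}")  # 1.00: identical ranking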
31

Metke, Jimenez Alejandro. "Using Wikipedia to improve web service discovery." Thesis, Queensland University of Technology, 2012. https://eprints.qut.edu.au/59632/1/Alejandro_Metke_Jimenez_Thesis.pdf.

Abstract:
Building and maintaining software are not easy tasks. However, thanks to advances in web technologies, a new paradigm is emerging in software development. The Service Oriented Architecture (SOA) is a relatively new approach that helps bridge the gap between business and IT and also helps systems remain flexible. However, there are still several challenges with SOA. As the number of available services grows, developers are faced with the problem of discovering the services they need. Public service repositories such as Programmable Web provide only limited search capabilities. Several mechanisms have been proposed to improve web service discovery by using semantics. However, most of these require manually tagging the services with concepts in an ontology. Adding semantic annotations is a non-trivial process that requires a certain skill-set from the annotator and also the availability of domain ontologies that include the concepts related to the topics of the service. These issues have prevented these mechanisms becoming widespread. This thesis focuses on two main problems. First, to avoid the overhead of manually adding semantics to web services, several automatic methods to include semantics in the discovery process are explored. Although experimentation with some of these strategies has been conducted in the past, the results reported in the literature are mixed. Second, Wikipedia is explored as a general-purpose ontology. The benefit of using it as an ontology is assessed by comparing these semantics-based methods to classic term-based information retrieval approaches. The contribution of this research is significant because, to the best of our knowledge, a comprehensive analysis of the impact of using Wikipedia as a source of semantics in web service discovery does not exist. The main output of this research is a web service discovery engine that implements these methods and a comprehensive analysis of the benefits and trade-offs of these semantics-based discovery approaches.
32

Kjellén, Viktor. "Wik-i-media : Mediala framställningar av uppslagsverket Wikipedia." Thesis, Linköping University, Department for Studies of Social Change and Culture, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-11943.

Abstract:

The study shows the different ways in which the Internet-based encyclopedia Wikipedia was portrayed in the newspapers Dagens Nyheter and Svenska Dagbladet from April 2005 until November 2007. Discourses in editorials, articles and columns are made visible and connected to the larger debate about knowledge of which Wikipedia has become a part.

33

Sunercan, Omer. "Missing Link Discovery In Wikipedia: A Comparative Study." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12611530/index.pdf.

Abstract:
The fast growing online encyclopedia concept presents original and innovative features by taking advantage of information technologies. The links connecting the articles are one of the most important instances of these features. In this thesis, we present our work on discovering missing links in Wikipedia articles. This task is important for both readers and authors of Wikipedia. Readers will benefit from the increased article quality with better navigation support. On the other hand, the system can be employed to support authors during editing. This study combines the strengths of different approaches previously applied for the task, and proposes its own techniques to reach satisfactory results. Because of the subjectivity in the nature of the task, automatic evaluation is hard to apply. Comparing approaches seems to be the best method to evaluate new techniques, and we offer a semi-automatized method for evaluation of the results. The recall is calculated automatically using existing links in Wikipedia. The precision is calculated according to manual evaluations of human assessors. Comparative results for different techniques are presented, showing the success of our improvements. Our system employs Turkish Wikipedia (Vikipedi) and, according to our knowledge, it is the first study on it. We aim to exploit the Turkish Wikipedia as a semantic resource to examine whether it is scalable enough for such purposes.
34

Neppare, Christoffer, and Pontus Blomberg. "Case study in political user behavior on Wikipedia." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-187493.

Abstract:
Wikipedia has since its inception in 2001 become the world's largest encyclopedia, and it continues to enjoy a high degree of support from the general public. This report investigates differences in how users create content in political and unpolitical articles. According to the results, the revisions per user follow a similar pyramid pattern in both the political and the unpolitical articles. When investigating the number of reverts, the results differ between the political and unpolitical articles: more reverts are made by semi-active users in the political articles. The report also establishes that some political categories exhibit a significantly longer edit history than the Wikipedia average, and that these categories contain a disproportionately high number of reverts among their revisions. The conclusions are that there are differences between political and unpolitical articles with regard to who performs reverts and the number of reverts per page.
35

Morsey, Mohamed. "Efficient Extraction and Query Benchmarking of Wikipedia Data." Doctoral thesis, Universitätsbibliothek Leipzig, 2014. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-130593.

Abstract:
Knowledge bases are playing an increasingly important role for integrating information between systems and over the Web. Today, most knowledge bases cover only specific domains, they are created by relatively small groups of knowledge engineers, and it is very cost intensive to keep them up-to-date as domains change. In parallel, Wikipedia has grown into one of the central knowledge sources of mankind and is maintained by thousands of contributors. The DBpedia (http://dbpedia.org) project makes use of this large collaboratively edited knowledge source by extracting structured content from it, interlinking it with other knowledge bases, and making the result publicly available. DBpedia had and has a great effect on the Web of Data and became a crystallization point for it. Furthermore, many companies and researchers use DBpedia and its public services to improve their applications and research approaches. However, the DBpedia release process is heavy-weight and the releases are sometimes based on several months old data. Hence, a strategy to keep DBpedia always in synchronization with Wikipedia is highly required. In this thesis we propose the DBpedia Live framework, which reads a continuous stream of updated Wikipedia articles, and processes it. DBpedia Live processes that stream on-the-fly to obtain RDF data and updates the DBpedia knowledge base with the newly extracted data. DBpedia Live also publishes the newly added/deleted facts in files, in order to enable synchronization between our DBpedia endpoint and other DBpedia mirrors. Moreover, the new DBpedia Live framework incorporates several significant features, e.g. abstract extraction, ontology changes, and changesets publication. Basically, knowledge bases, including DBpedia, are stored in triplestores in order to facilitate accessing and querying their respective data. Furthermore, the triplestores constitute the backbone of increasingly many Data Web applications. It is thus evident that the performance of those stores is mission critical for individual projects as well as for data integration on the Data Web in general. Consequently, it is of central importance during the implementation of any of these applications to have a clear picture of the weaknesses and strengths of current triplestore implementations. We introduce a generic SPARQL benchmark creation procedure, which we apply to the DBpedia knowledge base. Previous approaches often compared relational and triplestores and, thus, settled on measuring performance against a relational database which had been converted to RDF by using SQL-like queries. In contrast to those approaches, our benchmark is based on queries that were actually issued by humans and applications against existing RDF data not resembling a relational schema. Our generic procedure for benchmark creation is based on query-log mining, clustering and SPARQL feature analysis. We argue that a pure SPARQL benchmark is more useful to compare existing triplestores and provide results for the popular triplestore implementations Virtuoso, Sesame, Apache Jena-TDB, and BigOWLIM. The subsequent comparison of our results with other benchmark results indicates that the performance of triplestores is by far less homogeneous than suggested by previous benchmarks. Further, one of the crucial tasks when creating and maintaining knowledge bases is validating their facts and maintaining the quality of their inherent data. 
This task includes several subtasks, and in this thesis we address two of the major ones: fact validation and provenance, and data quality. The fact validation and provenance subtask aims at providing sources for facts in order to ensure the correctness and traceability of the provided knowledge. This subtask is often addressed by human curators in a three-step process: issuing appropriate keyword queries for the statement to check using standard search engines, retrieving potentially relevant documents, and screening those documents for relevant content. The drawbacks of this process are manifold. Most importantly, it is very time-consuming, as the experts have to carry out several search processes and must often read several documents. We present DeFacto (Deep Fact Validation), an algorithm for validating facts by finding trustworthy sources for them on the Web. DeFacto aims to provide an effective way of validating facts by supplying the user with relevant excerpts of webpages as well as useful additional information, including a score for the confidence DeFacto has in the correctness of the input fact. The data quality subtask, on the other hand, aims at evaluating and continuously improving the quality of the knowledge bases' data. We present a methodology for assessing the quality of knowledge bases' data, which comprises a manual and a semi-automatic process. The first phase includes the detection of common quality problems and their representation in a quality problem taxonomy. In the manual process, the second phase comprises the evaluation of a large number of individual resources, according to the quality problem taxonomy, via crowdsourcing. This process is accompanied by a tool wherein a user assesses an individual resource and evaluates each fact for correctness. The semi-automatic process involves the generation and verification of schema axioms. We report the results obtained by applying this methodology to DBpedia.
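(Knowledge bases like DBpedia are queried through SPARQL endpoints of the kind the benchmark above targets. A minimal query against the public DBpedia endpoint might look like the sketch below; the endpoint URL and its default dbo: prefix are as currently published, but availability and data can change.)

    # Minimal SPARQL query against DBpedia (illustrative).
    # Requires: pip install SPARQLWrapper
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery("""
        SELECT ?abstract WHERE {
          <http://dbpedia.org/resource/Wikipedia> dbo:abstract ?abstract .
          FILTER (lang(?abstract) = "en")
        }
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    for row in results["results"]["bindings"]:
        print(row["abstract"]["value"][:200])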
APA, Harvard, Vancouver, ISO, and other styles
36

Almroth, Bodil, and Sofia Tenglin. "Media om Wikipedia : en diskursanalys av nationella facktidskrifter." Thesis, Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:hb:diva-20408.

Full text
Abstract:
The aim of this Master’s thesis is to examine how librarians, information specialists and teachers discuss Wikipedia in national (Swedish) professional journals. The questions asked in the study are: How is Wikipedia perceived in the professional journals? What different positions do writers and commentators take in relation to Wikipedia? 133 articles from 31 different professional journals, published between 2001 and the middle of 2010, were analysed. The theory and method used are Laclau and Mouffe's discourse theory, from which we created our own model with six different steps. The results show that there are several recurring discussions, for example about the credibility of user-generated content. Another example is the discussion about the use of Wikipedia in education and school contexts, and whether the encyclopedia should be seen as sufficiently credible to use in these contexts. From these discussions we have identified three discourses that we have chosen to call the knowledge-liberal discourse, the knowledge-conservative discourse and the pedagogical discourse.
Program: Librarian
APA, Harvard, Vancouver, ISO, and other styles
37

Ford, Heather. "Fact factories : Wikipedia and the power to represent." Thesis, University of Oxford, 2015. http://ora.ox.ac.uk/objects/uuid:b34fdd6c-ec15-4bcd-acba-66a777739b4d.

Full text
Abstract:
Wikipedia is no longer just another source of knowledge about the world. It is fast becoming a central source, used by other powerful knowledge brokers like Google and Bing to offer authoritative answers to search queries about people, places and things and as information infrastructure for a growing number of Web applications and services. Researchers have found that Wikipedia offers a skewed representation of the world that favours some groups at the expense of others so that representations on the platform have repercussions for the subjects of those representations beyond Wikipedia's domain. It becomes critical in this context to understand how exactly Wikipedia's representations come about, what practices give rise to them and what socio-technical arrangements lead to their expression. This ethnographic study of Wikipedia explores the values, principles and practices that guide what knowledge Wikipedia represents. It follows the foundational principles of Wikipedia in its identity both as an encyclopaedia and a product of the free and open source software and internet freedom rhetoric of the early 2000s. Two case studies are analysed against the backdrop of this ideology, illustrating how different sets of actors battle to extend or reject the boundaries of Wikipedia, and in doing so, affect who are defined as the experts, subjects and revolutionaries of the knowledge that is taken up. The findings of this thesis indicate that Wikipedia's process of decision-making is neither hierarchical nor is it egalitarian; rather, the power to represent on Wikipedia is rhizoid: it happens at the edges rather than in the centre of the network. Instead of everyone having the same power to represent their views on Wikipedia, those who understand how to perform and speak according to Wikipedia's complex technical, symbolic and policy vocabulary tend to prevail over those who possess disciplinary knowledge about the subject being represented. Wikipedians are no amateurs as many would have us believe; nor are they passive collectors of knowledge held in sources; Wikipedians are, instead, active co-creators of knowledge in the form of facts that they support using specially chosen sources. The authority of Wikipedia and Wikipedians is garnered through the performative acts of citation, through the ability of individual editors to construct the traces that represent citation, and through the stabilization and destabilization of facts according to the ideological viewpoints of its editors. In venerating and selecting certain sources among others, Wikipedians also serve to reaffirm traditional centres of authority, while at the same time amplifying new centres of knowledge and denying the authority of knowledge that is not codified in practice. As a result, Wikipedia is becoming the site of new centres of expertise and authoritative knowledge creation, and is signalling a move towards the professionalization of the expertise required to produce factual data in the context of digital networks.
APA, Harvard, Vancouver, ISO, and other styles
38

Sáez, Binelli Tomás Andrés. "Construcción automática de cajas de información para Wikipedia." Tesis, Universidad de Chile, 2018. http://repositorio.uchile.cl/handle/2250/152161.

Full text
Abstract:
Civil Engineer in Computing
Infoboxes are summary tables that aim to describe an entity briefly by presenting its main characteristics clearly and in an established format. Unfortunately, these infoboxes are built manually by Wikipedia editors, which means that many articles in less common languages either have no infobox or have one of low quality. Using Wikidata as the information source, the challenge of this work is to order and select properties and values by importance, in order to produce a concise infobox with the information ranked by relevance. With this goal in mind, this work proposes one control strategy and four experimental strategies for building infoboxes automatically. As part of this work, an API was implemented in Django that receives a request indicating the entity, the language and the strategy to use to generate the infobox; the response is a JSON document representing the generated infobox. A graphical interface was additionally built to allow quick use of this API and to support a comparative evaluation of the different strategies. The comparative evaluation was carried out by presenting respondents with a list of 15 entities whose 5 infoboxes (one per strategy) had been precomputed and displayed side by side. Assigning grades from 1 (lowest) to 7, 12 users evaluated each infobox, yielding a total of 728 ratings. The results indicate that the best-rated strategy combines the frequency of a property and the PageRank of its value as indicators of importance.
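The abstract describes the best-rated strategy only at a high level; the sketch below shows one plausible reading of it. The fetch uses the real public Wikidata API, but the prop_frequency and value_pagerank tables are hypothetical placeholders, and none of this is the thesis's actual Django code.

```python
# A minimal sketch, under stated assumptions, of ranking Wikidata properties
# for an infobox by property frequency times the PageRank of the value.
import requests

def fetch_claims(entity_id):
    """Fetch all Wikidata statements for an entity via the public API."""
    url = f"https://www.wikidata.org/wiki/Special:EntityData/{entity_id}.json"
    data = requests.get(url, timeout=30).json()
    return data["entities"][entity_id]["claims"]

def rank_properties(claims, prop_frequency, value_pagerank, top_n=10):
    """Score each property by frequency * PageRank of its first value."""
    scored = []
    for prop, statements in claims.items():
        value = statements[0].get("mainsnak", {}).get("datavalue", {}).get("value")
        # Entity-valued statements carry a Q-id we can look up a PageRank for.
        qid = value.get("id") if isinstance(value, dict) else None
        score = prop_frequency.get(prop, 0.0) * value_pagerank.get(qid, 1.0)
        scored.append((score, prop, value))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]

# Hypothetical frequency table: P31 = "instance of", P569 = "date of birth".
top = rank_properties(fetch_claims("Q42"), {"P31": 0.9, "P569": 0.8}, {})
```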
APA, Harvard, Vancouver, ISO, and other styles
39

Osman, Kim Y. "Wikipedia: Access and participation in an open encyclopaedia." Thesis, Queensland University of Technology, 2015. https://eprints.qut.edu.au/89756/1/Kim_Osman_Thesis.pdf.

Full text
Abstract:
This thesis explores how people and technologies work together to coordinate and shape participation through a case study of the online encyclopaedia Wikipedia. The research found that participation is shaped by different understandings of openness, where it is constructed either as a libertarian ideal where "anyone" is free to edit the encyclopaedia, or as an inclusive concept that enables "everyone" to participate in the platform. The findings therefore problematise the idea of a single user community, and serve to highlight the different and sometimes competing approaches actors employ to enable and constrain participation in Wikipedia.
APA, Harvard, Vancouver, ISO, and other styles
40

Lundberg, Leo. "Uppslagsverkens diskursordning. En diskursanalytisk studie av Nationalencyklopedin och Wikipedia." Thesis, Uppsala University, Department of ALM, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-101792.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Lange, Dustin, Christoph Böhm, and Felix Naumann. "Extracting structured information from Wikipedia articles to populate infoboxes." Universität Potsdam, 2010. http://opus.kobv.de/ubp/volltexte/2010/4571/.

Full text
Abstract:
Roughly every third Wikipedia article contains an infobox - a table that displays important facts about the subject in attribute-value form. The schema of an infobox, i.e., the attributes that can be expressed for a concept, is defined by an infobox template. Often, authors do not specify all template attributes, resulting in incomplete infoboxes. With iPopulator, we introduce a system that automatically populates infoboxes of Wikipedia articles by extracting attribute values from the article's text. In contrast to prior work, iPopulator detects and exploits the structure of attribute values for independently extracting value parts. We have tested iPopulator on the entire set of infobox templates and provide a detailed analysis of its effectiveness. For instance, we achieve an average extraction precision of 91% for 1,727 distinct infobox template attributes.
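To make the extraction idea concrete, here is a toy, pattern-based sketch of populating missing infobox attributes from article text; the two regular expressions are invented stand-ins, far simpler than the attribute-value structures iPopulator actually learns.

```python
# A toy sketch of pattern-based infobox population; the regexes are invented
# stand-ins for the learned value structures described in the abstract.
import re

PATTERNS = {
    "birth_date": re.compile(r"born\s+(\d{1,2}\s+\w+\s+\d{4})"),
    "population": re.compile(r"population\s+of\s+([\d,]+)"),
}

def populate_infobox(article_text, existing_infobox):
    """Fill in missing infobox attributes from the article's prose."""
    infobox = dict(existing_infobox)
    for attribute, pattern in PATTERNS.items():
        if attribute not in infobox:  # only fill attributes the author left empty
            match = pattern.search(article_text)
            if match:
                infobox[attribute] = match.group(1)
    return infobox

text = "Ada Lovelace (born 10 December 1815) was an English mathematician."
print(populate_infobox(text, {}))  # {'birth_date': '10 December 1815'}
```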
APA, Harvard, Vancouver, ISO, and other styles
42

Rönnblom, Olof. "IKT och Wikipedia i skolan- en fallstudie våren 2011." Thesis, Uppsala universitet, Institutionen för pedagogik, didaktik och utbildningsstudier, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-157885.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Ullah, Noor. "ANFIS BASED MODELS FOR ACCESSING QUALITY OF WIKIPEDIA ARTICLES." Thesis, Högskolan Dalarna, Datateknik, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:du-4909.

Full text
Abstract:
Wikipedia is a free, web-based, collaborative, multilingual encyclopedia project supported by the non-profit Wikimedia Foundation. Because Wikipedia is free and open for everyone to edit, article quality can suffer: contributors differ in their levels of knowledge and in their opinions about a topic, so contributions by different authors vary in quality. It is therefore important to classify articles so that good-quality articles can be separated from poor-quality ones, which should be removed from the database. The aim of this study is to classify Wikipedia articles into two classes, class 0 (poor quality) and class 1 (good quality), using an Adaptive Neuro-Fuzzy Inference System (ANFIS) and data mining techniques. Two ANFIS models are built using the Fuzzy Logic Toolbox [1] available in Matlab. The first ANFIS is based on rules obtained from the J48 classifier in WEKA, while the other is built using expert knowledge. The data used in this research comprises 226 article records taken from the German version of Wikipedia. The dataset consists of 19 inputs and one output. The data was preprocessed to remove redundant attributes. The input variables relate to the editors, the contributors, the length of the articles, and the articles' lifecycle. Finally, the different methods implemented in this research are compared to analyse the performance of each classification method used.
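As a rough analogue of the thesis's first step (J48 is WEKA's C4.5 decision tree), the sketch below trains a scikit-learn decision tree on placeholder data shaped like the thesis's dataset (226 records, 19 inputs, one binary output); the random features stand in for the real editor, length, and lifecycle variables.

```python
# A rough scikit-learn analogue of the first step: a decision tree (like
# WEKA's J48) separating class 0 (poor) from class 1 (good). The random
# matrix is placeholder data shaped like the thesis's 226 x 19 dataset.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((226, 19))           # stand-in for editor/length/lifecycle features
y = rng.integers(0, 2, size=226)    # stand-in for the quality labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
tree = DecisionTreeClassifier(max_depth=4).fit(X_train, y_train)
print("held-out accuracy:", tree.score(X_test, y_test))  # ~0.5 on random data
```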
APA, Harvard, Vancouver, ISO, and other styles
44

Kindfält, Jonathan, and Alex Wennberg. "Genus och diskussionsklimatet i Wikipedia : En Liten, Explorativ Studie." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-177581.

Full text
Abstract:
Wikipedia is a collection of digital encyclopedias that anyone can write and edit. The contributor base, however, is heavily male dominated, despite Wikipedia's attempts to attract a contributor base that could be seen as representative of society. To investigate why, a small-scale exploratory study of the Swedish Wikipedia's discussion climate was carried out with a small group of students at KTH. The interviewees read excerpts from discussions taken from the Swedish Wikipedia and answered questions about how they perceived them and how their motivation to participate was affected by the discussion climate. The results show that the women interviewed, to a greater extent than the men, felt unmotivated to take part in discussions they perceived as unpleasant. The study provides suggestions for future studies on the discussion climate of the Swedish Wikipedia.
APA, Harvard, Vancouver, ISO, and other styles
45

Kallass, Kerstin [Verfasser]. "Schreibprozesse in der Wikipedia : eine linguistische Analyse / Kerstin Kallass." Koblenz : Universitätsbibliothek Koblenz, 2013. http://d-nb.info/1034493620/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Loubser, Max. "Organisational mechanisms in peer production : the Case of Wikipedia." Thesis, University of Oxford, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.522764.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Jankowski, Steven J. "Wikipedia and Encyclopaedism: A Genre Analysis of Epistemological Values." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/24160.

Full text
Abstract:
This thesis considers how Wikipedia justifies, structures, and legitimizes its production of knowledge. To do so the thesis positions Wikipedia as a site of conflict over the epistemic values between its wiki and encyclopaedic traditions. Through the literature review, the wiki epistemology is argued to be composed of six values: self-identification, collaboration, co-construction, cooperation, trust in the community, and constructionism. While these values are explicit, encyclopaedism’s were not found to be equally defined. To fill this gap, the thesis conducts a genre analysis of encyclopaedism. It first identifies the genre through its communicative purposes to create a universal system of total knowledge and to use this system to educate the public. Second, an analysis of recurrent social contexts within Chambers’ Cyclopaedia (1728), Diderot & d’Alembert’s Encyclopédie (1751–72), the Encyclopaedia Britannica (1771–), and Wikipedia (2001–) finds that the communicative purposes are achieved through the use of five epistemic values: utility, systematic organization, authority, trust in experts, and consistency. Third, a comparison spanning 240 years between Wikipedia and the Britannica’s article headings finds that the value of systematic organization structures Wikipedia’s articles using seventeenth century categories of knowledge. Having established two sets of values that determine Wikipedia’s production of knowledge, the thesis sets the stage for future research to analyze how Wikipedia’s epistemology is articulated in its different production spaces. Ultimately, such research may not only describe the shifting values of Wikipedia’s epistemology but also explain how knowledge is transformed and produced in the network society.
APA, Harvard, Vancouver, ISO, and other styles
48

Alencar, Rafael Odon de. "Utilizando Evidência da wikipedia para relacionar textos a lugares." Universidade Federal de Minas Gerais, 2011. http://hdl.handle.net/1843/SLSS-8KDPKG.

Full text
Abstract:
Obtaining or approximating a geographic location for search results often motivates users to include place names and other geography-related terms in their queries. Previous work shows that queries including geography-related terms correspond to a significant share of user demand. It is therefore important to recognize the association of documents with places in order to respond adequately to such queries. This dissertation describes strategies for computing geographic scope, using Wikipedia as an alternative source of direct and indirect geographic references. First, we propose to perform a text classification task over geography-related classes, using textual evidence extracted from Wikipedia. We use terms that correspond to article titles, and the connections between articles in Wikipedia's graph, to establish a semantic network from which classification features are generated. Results of experiments using a news dataset, classified over Brazilian states, show that such terms constitute valid evidence for the geographic classification of documents, and demonstrate the potential of this technique for text classification. A second proposal describes a strategy for tagging documents with multiple place names, according to the geographic context of their textual content, using a topic indexing technique that treats Wikipedia articles as a controlled vocabulary. By identifying those topics in the text, we connect documents to Wikipedia's semantic network of articles, allowing us to perform operations on Wikipedia's graph and find related places. We present an experimental evaluation on documents tagged with Brazilian states, demonstrating the feasibility of our proposal and opening the way for further research on geotagging based on semantic networks. Our results demonstrate the feasibility of using Wikipedia as an alternative source of geographic references. The method's main advantage is its use of the free, up-to-date and broad knowledge of the digital encyclopedia. Finally, introducing Wikipedia into geographic text analysis can be seen as both an alternative and an extension to the use of geographic dictionaries (i.e., gazetteers).
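A hedged sketch of the second proposal's core idea follows: Wikipedia article titles act as a controlled vocabulary, and links in Wikipedia's graph lead from spotted topics to places. The title vocabulary and topic-to-place edges below are tiny hand-made stand-ins for the real Wikipedia data.

```python
# A minimal sketch of topic-indexing-based geotagging; the title vocabulary
# and the topic -> place edges are tiny hand-made stand-ins for Wikipedia's
# article set and link graph.
WIKI_TITLES = {"Copacabana", "Christ the Redeemer", "Maracanã Stadium"}
LINKS_TO_PLACE = {
    "Copacabana": "Rio de Janeiro",
    "Christ the Redeemer": "Rio de Janeiro",
    "Maracanã Stadium": "Rio de Janeiro",
}

def geotag(document):
    """Return places reached from Wikipedia topics spotted in the document."""
    topics = {t for t in WIKI_TITLES if t.lower() in document.lower()}
    return {LINKS_TO_PLACE[t] for t in topics}

print(geotag("The final was played at the Maracanã Stadium."))  # {'Rio de Janeiro'}
```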
APA, Harvard, Vancouver, ISO, and other styles
49

Weijand, Sasha. "AUTOMATED GENDER CLASSIFICATION IN WIKIPEDIA BIOGRAPHIES: a cross-lingual comparison." Thesis, Umeå universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-163371.

Full text
Abstract:
The written word plays an important role in the reinforcement of gender stereotypes, especially in texts of a more formal character. Wikipedia biographies contain a lot of information about famous people, but do they describe men and women with different kinds of words? This thesis aims to evaluate and explore a method for gender classification of text. In this study, two machine learning classifiers, Random Forest (RF) and Support Vector Machine (SVM), are applied to the gender classification of Wikipedia biographies in two languages, English and French. Their performance is evaluated and compared, and the 500 most important words (features) are listed for each of the classifiers. A short review is given of the theoretical foundations of text classification, together with a detailed description of how the datasets are built, which tools are used, and why. The datasets are built from the first 5 paragraphs of each biography, keeping only nouns, verbs, adjectives and adverbs. Feature ranking is also applied, where the top tenth of the features is kept. Performance is measured using the F0.5-score. The comparison shows that the RF and SVM classifiers perform similarly to each other, but that both perform worse on the French set than on the English one. Initial performance scores range from 0.82 to 0.86, but they drop drastically when the most important features are removed from the set. A majority of the most important features are nouns related to career and family roles, in both languages. The results show that there are indeed semantic differences in language depending on the gender of the person described. Whether these stem from the writers' biased views, from an unequal gender distribution in real-world contexts such as careers, or from how the datasets were built, is not clear.
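To make the pipeline concrete, here is a hedged sketch of the comparison: TF-IDF features over biography text, then Random Forest and a linear SVM, each scored with F0.5. The two repeated example sentences are invented placeholders echoing the career/family finding, not the thesis data, and the sketch scores on the training set only to stay short.

```python
# A minimal sketch of the RF vs. SVM comparison with an F0.5 score; the texts
# are invented placeholders, and scoring is done on the training set only to
# keep the sketch short (the thesis evaluates on held-out data).
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import fbeta_score
from sklearn.svm import LinearSVC

texts = ["He played football and coached the national team."] * 50 + \
        ["She raised a family and worked as a teacher."] * 50
labels = [0] * 50 + [1] * 50

X = TfidfVectorizer().fit_transform(texts)
for model in (RandomForestClassifier(random_state=0), LinearSVC()):
    pred = model.fit(X, labels).predict(X)
    print(type(model).__name__, fbeta_score(labels, pred, beta=0.5))
```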
APA, Harvard, Vancouver, ISO, and other styles
50

Frost, Ingo. "Zivilgesellschaftliches Engagement in virtuellen Gemeinschaften eine systemwissenschaftliche Analyse des deutschsprachigen Wikipedia-Projektes." München Utz, 2006. http://deposit.ddb.de/cgi-bin/dokserv?id=2798296&prov=M&dok_var=1&dok_ext=htm.

Full text
APA, Harvard, Vancouver, ISO, and other styles