Contents

  1. Theses

Scientific literature on the topic "Mining Source Code Repositories"

Create an accurate reference in APA, MLA, Chicago, Harvard, and several other citation styles

Choose a source:

Consult the thematic lists of journal articles, books, theses, conference reports, and other academic sources on the topic "Mining Source Code Repositories".

Next to each source in the list of references there is an "Add to bibliography" button. Click this button, and we will automatically generate the bibliographic reference for the chosen source in your preferred citation style: APA, MLA, Harvard, Vancouver, Chicago, etc.

You can also download the full text of the scholarly publication in PDF format and read its abstract online when this information is included in the metadata.

Theses on the topic "Mining Source Code Repositories"

1

Kagdi, Huzefa H. "Mining Software Repositories to Support Software Evolution." Kent State University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=kent1216149768.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Bengtsson, Jonathan, and Heidi Hokka. "Analysing Lambda Usage in the C++ Open Source Community." Thesis, Mittuniversitetet, Institutionen för data- och systemvetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-39514.

Full text
Abstract:
Object-oriented languages have made a shift towards incorporating functional concepts such as lambdas. Lambdas are anonymous functions that can be used within the scope of other functions. In C++, lambdas are considered difficult to use for inexperienced developers, which suggests that there may be problems with lambdas in C++. However, studies of lambdas in C++ repositories are scarce compared to other object-oriented languages such as Java. This study aims to address a knowledge gap regarding how lambdas are used by developers in C++ repositories and, furthermore, to examine how developer experience and software engineering practices, such as unit testing and in-code documentation, correlate with the inclusion of lambdas. To achieve this, we create a set of tools that statically analyse repositories to gather results. This study gained insight into the number of repositories utilising lambdas, their usage areas, and their documentation, as well as how these findings compare with the results of similar studies in Java. Further, it is shown that unit testing and developer experience correlate with the usage of lambdas.
APA, Harvard, Vancouver, ISO, and other styles
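
Entry 2 above describes building static analysis tooling to measure lambda usage across C++ repositories. The thesis tooling itself is not reproduced here; the sketch below is only a rough, hypothetical approximation of the idea, using a naive regular expression instead of a real C++ parser, with placeholder repository paths.

```python
import re
from pathlib import Path

# Naive pattern for a C++ lambda introducer such as "[](", "[&](", "[=, &x](".
# A real study would use a proper parser (e.g. libclang) to avoid false matches.
LAMBDA_RE = re.compile(r"\[[^\]\[]*\]\s*\(")

def count_lambdas_in_repo(repo_path: str) -> int:
    """Count apparent lambda expressions in all C++ source files of a checkout."""
    total = 0
    for ext in ("*.cpp", "*.cc", "*.cxx", "*.hpp", "*.h"):
        for path in Path(repo_path).rglob(ext):
            try:
                text = path.read_text(errors="ignore")
            except OSError:
                continue
            total += len(LAMBDA_RE.findall(text))
    return total

if __name__ == "__main__":
    # Hypothetical local checkouts; substitute real repository paths.
    for repo in ["checkouts/project-a", "checkouts/project-b"]:
        print(repo, count_lambdas_in_repo(repo))
```
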
3

Schulte, Lukas. "Investigating topic modeling techniques for historical feature location." Thesis, Karlstads universitet, Institutionen för matematik och datavetenskap (from 2013), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-85379.

Full text
Abstract:
Software maintenance and the understanding of where in the source code features are implemented are two strongly coupled tasks that make up a large portion of the effort spent on developing applications. The concept of feature location investigated in this thesis can serve as a supporting factor in those tasks, as it facilitates the automation of otherwise manual searches for source code artifacts. Challenges in this subject area include the aggregation and composition of a training corpus from historical codebase data for models, as well as the integration and optimization of qualified topic modeling techniques. Building on previous research, this thesis provides a comparison of two different techniques and introduces a toolkit that can be used to reproduce and extend the results discussed. Specifically, in this thesis a changeset-based approach to feature location is pursued and applied to a large open-source Java project. The project is used to optimize and evaluate the performance of Latent Dirichlet Allocation models and Pachinko Allocation models, as well as to compare the accuracy of the two models with each other. As discussed at the end of the thesis, the results do not indicate a clear favorite between the models. Instead, the outcome of the comparison depends on the metric and viewpoint from which it is assessed.
APA, Harvard, Vancouver, ISO, and other styles
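
Entry 3 above evaluates Latent Dirichlet Allocation and Pachinko Allocation models trained on changeset data for feature location. As a rough illustration only, the sketch below trains a tiny LDA model on a toy changeset corpus with gensim; the corpus, preprocessing, and query are invented placeholders and do not reproduce the thesis toolkit.

```python
from gensim import corpora, models

# Toy "changeset documents": each is the bag of identifiers touched by one commit.
changesets = [
    ["login", "password", "session", "auth"],
    ["render", "widget", "layout", "css"],
    ["session", "token", "auth", "logout"],
    ["layout", "widget", "theme", "render"],
]

dictionary = corpora.Dictionary(changesets)
corpus = [dictionary.doc2bow(doc) for doc in changesets]

# Small LDA model; a real study would tune topic counts and hyperparameters carefully.
lda = models.LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
                      passes=10, random_state=0)

# Feature-location style query: which topics (and hence which changesets/files)
# are most associated with the words of a feature description?
query = dictionary.doc2bow(["auth", "session"])
print(lda.get_document_topics(query))
```
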
4

Vu, Duc Ly. "Towards Understanding and Securing the OSS Supply Chain." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/333508.

Full text
Abstract:
Free and Open-Source Software (FOSS) has become an integral part of the software supply chain in the past decade. Various entities (automated tools and humans) are involved at different stages of the software supply chain. Some actions that occur in the chain may result in vulnerabilities or malicious code injected into a published artifact distributed in a package repository. At the end of the software supply chain, developers or end-users may consume the resulting artifacts altered in transit, including benign and malicious injections. This dissertation starts from the first link in the software supply chain: developers. Many developers do not update their vulnerable software libraries, thus exposing the users of their code to security risks. To understand how they choose, manage, and update the libraries, packages, and other Open-Source Software (OSS) that become the building blocks of companies' completed products consumed by end-users, twenty-five semi-structured interviews were conducted with developers of both large and small-medium enterprises in nine countries. All interviews were transcribed, coded, and analyzed according to applied thematic analysis. Although there are many observations about developers' attitudes towards selecting dependencies for their projects, additional quantitative work is needed to validate whether behavior matches these attitudes or whether there is a gap. Therefore, an extensive empirical analysis of twelve quality and popularity factors that should explain the popularity (adoption) of PyPI packages was conducted using our tool py2src. At the end of the software supply chain, software libraries (or packages) are usually downloaded directly from the package registries via package dependency management systems, under the comfortable assumption that no discrepancies are introduced in the last mile between the source code and the respective packages. However, such discrepancies might be introduced by manual or automated build tools (e.g., metadata, Python bytecode files) or for malicious purposes (malicious code injection). To identify differences between the published Python packages on PyPI and the source code stored on GitHub, we developed a new approach called LastPyMile. Our approach has been shown to be promising for integration within current package dependency management systems or company workflows for vetting packages at minimal cost. With the ever-increasing numbers of software bugs and security vulnerabilities, the burden of secure software supply chain management on developers and project owners increases. Although automated program repair (APR) approaches promise to reduce the burden of bug-fixing tasks by suggesting likely correct patches for software bugs, little is known about the practical aspects of using APR tools, such as how long one should wait for a tool to generate a bug fix. To provide a realistic evaluation, five state-of-the-art APR tools were run on 221 bugs from 44 open-source Java projects within a reasonable amount of developer time and effort.
APA, Harvard, Vancouver, ISO, and other styles
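
Entry 4 above introduces LastPyMile, which looks for discrepancies between packages published on PyPI and their source code on GitHub. The sketch below is only a hypothetical, simplified approximation of that idea: it hashes the files of an unpacked package and of a local git checkout and reports files that differ or exist on one side only; the paths are placeholders, and the real approach is considerably more involved.

```python
import hashlib
from pathlib import Path

def file_hashes(root: str) -> dict:
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    hashes = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            rel = str(path.relative_to(root))
            hashes[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes

def diff_package_against_source(sdist_dir: str, repo_dir: str) -> None:
    """Report files present only in the published package, or with differing content."""
    pkg, src = file_hashes(sdist_dir), file_hashes(repo_dir)
    for rel, digest in sorted(pkg.items()):
        if rel not in src:
            print("only in package:", rel)   # possibly a build artifact or an injected file
        elif src[rel] != digest:
            print("content differs:", rel)   # candidate for manual review

if __name__ == "__main__":
    # Hypothetical locations of an unpacked sdist and the corresponding git checkout.
    diff_package_against_source("work/pkg-1.2.3", "work/pkg-git")
```
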
5

Carlsson, Emil. "Mining Git Repositories : An introduction to repository mining." Thesis, Linnéuniversitetet, Institutionen för datavetenskap (DV), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-27742.

Full text
Abstract:
When performing an analysis of the evolution of software quality and software metrics, there is a need to get access to as many versions of the source code as possible. There is a lack of research on how data or source code can be extracted from the source control management system Git. This thesis explores different possibilities to resolve this problem. Lately, there has been a boom in usage of the version control system Git. GitHub alone hosts about 6,100,000 projects. Some well-known projects and organizations that use Git are Linux, WordPress, and Facebook. Even with these figures and clients, there are very few tools able to perform data extraction from Git repositories. A pre-study showed that there is a lack of standardization on how to share mining results and the methods used to obtain them. There are several tools available for older version control systems, such as Concurrent Versions System (CVS), but few for Git. The examined repository mining applications for Git are either poorly documented, or were built to be very purpose-specific to the project for which they were designed. This thesis compiles a list of general issues encountered when using repository mining as a tool for data gathering. A selection of existing repository mining tools were evaluated against a set of prerequisite criteria. The end result of this evaluation is the creation of a new repository mining tool called Doris. This tool also includes a small code metrics analysis library to show how it can be extended.
APA, Harvard, Vancouver, ISO, and other styles
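
Entry 5 above concerns extracting data from Git repositories for software evolution analysis, culminating in the author's tool Doris, which is not shown here. As a minimal, generic illustration of repository mining, the sketch below walks a repository's history with the GitPython library and records the author, date, and number of files touched per commit; the repository path is a placeholder.

```python
from git import Repo  # GitPython

def commit_summary(repo_path: str, branch: str = "master"):
    """Yield (sha, author, date, files_changed) for every commit on a branch."""
    repo = Repo(repo_path)
    for commit in repo.iter_commits(branch):
        yield (
            commit.hexsha[:10],
            commit.author.name,
            commit.committed_datetime.isoformat(),
            len(commit.stats.files),  # files touched by this commit
        )

if __name__ == "__main__":
    # Hypothetical local clone; substitute the repository you want to mine.
    for sha, author, date, n_files in commit_summary("checkouts/some-project"):
        print(sha, author, date, n_files)
```
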
6

Sinha, Vinayak. "Sentiment Analysis On Java Source Code In Large Software Repositories." Youngstown State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1464880227.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Ribeiro, Athos Coimbra. "Ranking source code static analysis warnings for continuous monitoring of free/libre/open source software repositories." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-20082018-170140/.

Full text
Abstract:
While there is a wide variety of both open source and proprietary source code static analyzers available in the market, each of them usually performs better on a small set of problems, making it hard to choose one single tool to rely on when examining a program. Combining the analysis of different tools may reduce the number of false negatives, but yields a corresponding increase in the number of false positives (which is already high for many tools). An interesting solution, then, is to filter these results to identify the issues least likely to be false positives. This work presents kiskadee, a system to support the usage of static analysis during software development by providing carefully ranked static analysis reports. First, it runs multiple static analyzers on the source code. Then, using a classification model, the potential bugs detected by the static analyzers are ranked based on their importance, with critical flaws ranked first and potential false positives ranked last. To train kiskadee's classification model, we post-analyze the reports generated by three tools on synthetic test cases provided by the US National Institute of Standards and Technology. To make our technique as general as possible, we limit our data to the reports themselves, excluding other information such as change histories or code metrics. The features extracted from these reports are used to train a set of decision trees using AdaBoost to create a stronger classifier, achieving 0.8 classification accuracy (the combined false positive rate of the tools used was 0.61). Finally, we use this classifier to rank static analyzer alarms based on the probability of a given alarm being an actual bug. Our experimental results show that, on average, when inspecting warnings ranked by kiskadee, one hits 5.2 times fewer false positives before each bug than when using a randomly sorted warning list.
APA, Harvard, Vancouver, ISO, and other styles
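
Entry 7 above ranks static analysis warnings by training AdaBoost over decision trees on features extracted from analyzer reports and ordering alarms by their predicted probability of being real bugs. The sketch below is a schematic, hypothetical rendering of that pipeline with scikit-learn; the feature matrix and labels are synthetic stand-ins, not the NIST test-case data used in the thesis.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Synthetic stand-in data: each row is a warning described by numeric features
# extracted from analyzer reports; the label says whether it was a real bug.
rng = np.random.default_rng(0)
X_train = rng.random((200, 5))
y_train = rng.integers(0, 2, size=200)

# AdaBoost over shallow decision trees (scikit-learn's default base estimator).
clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Rank unseen warnings: highest probability of being a true positive first,
# so likely false positives sink to the bottom of the list.
X_new = rng.random((10, 5))
bug_probability = clf.predict_proba(X_new)[:, 1]
ranking = np.argsort(-bug_probability)
for idx in ranking:
    print(f"warning {idx}: P(real bug) = {bug_probability[idx]:.2f}")
```
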
8

Hassan, Ahmed. "Mining Software Repositories to Assist Developers and Support Managers." Thesis, University of Waterloo, 2004. http://hdl.handle.net/10012/1017.

Full text
Abstract:
This thesis explores mining the evolutionary history of a software system to support software developers and managers in their endeavors to build and maintain complex software systems. We introduce the idea of evolutionary extractors, which are specialized extractors that can recover the history of software projects from software repositories, such as source control systems. The challenges faced in building C-REX, an evolutionary extractor for the C programming language, are discussed. We examine the use of source control systems in industry and the quality of the recovered C-REX data through a survey of several software practitioners. Using the data recovered by C-REX, we develop several approaches and techniques to assist developers and managers in their activities. We propose Source Sticky Notes, which assist developers in understanding legacy software systems by attaching historical information to the dependency graph. We present the Development Replay approach, which estimates the benefits of adopting new software maintenance tools by reenacting the development history. We propose the Top Ten List, which assists managers in allocating testing resources to the subsystems that are most susceptible to faults. To assist managers in improving the quality of their projects, we present a complexity metric which quantifies the complexity of the changes to the code instead of quantifying the complexity of the source code itself. All presented approaches are validated empirically using data from several large open source systems. The presented work highlights the benefits of transforming software repositories from static record-keeping repositories into active repositories used by researchers to gain an empirically based understanding of software development, and by software practitioners to predict, plan, and understand various aspects of their projects.
APA, Harvard, Vancouver, ISO, and other styles
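
Entry 8 above proposes, among other techniques, a Top Ten List that points managers to the subsystems most in need of testing resources, based on the project's change history. The sketch below is a heavily simplified, hypothetical illustration of that general idea: it ranks top-level directories by the number of recent fix-related commits touching them, using GitPython; the keyword heuristic and repository path are placeholders, not the thesis's actual model.

```python
from collections import Counter
from git import Repo  # GitPython

FIX_KEYWORDS = ("fix", "bug", "fault", "defect")  # naive fix-commit heuristic

def top_ten_subsystems(repo_path: str, branch: str = "master", last_n: int = 500):
    """Rank top-level directories by how many recent fix commits touched them."""
    repo = Repo(repo_path)
    counts = Counter()
    for commit in repo.iter_commits(branch, max_count=last_n):
        message = commit.message.lower()
        if not any(word in message for word in FIX_KEYWORDS):
            continue
        for path in commit.stats.files:
            subsystem = path.split("/")[0]  # treat the top-level directory as the "subsystem"
            counts[subsystem] += 1
    return counts.most_common(10)

if __name__ == "__main__":
    # Hypothetical local clone; substitute a real repository.
    for subsystem, n_fixes in top_ten_subsystems("checkouts/some-project"):
        print(subsystem, n_fixes)
```
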
9

Thummalapenta, Suresh. "Improving Software Productivity and Quality via Mining Source Code." North Carolina State University, 2011. http://pqdtopen.proquest.com/#viewpdf?dispub=3442531.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Delorey, Daniel Pierce. "Observational Studies of Software Engineering Using Data from Software Repositories." Diss., Brigham Young University, 2007. http://contentdm.lib.byu.edu/ETD/image/etd1716.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
More sources
