Dissertations / Theses on the topic 'R packages'


1

Hornik, Kurt. "Are There Too Many R Packages?" Austrian Statistical Society, c/o Bundesanstalt Statistik Austria, 2012. http://epub.wu.ac.at/3814/1/121Hornik.pdf.

Abstract:
The number of R extension packages available from the CRAN repository has tremendously grown over the past 10 years. We look at this phenomenon in more detail, and discuss some of its consequences. In particular, we argue that the statistical computing community needs a more common understanding of software quality, and better domain-specific semantic resources.
2

Rusch, Thomas, Patrick Mair, and Reinhold Hatzinger. "Psychometrics With R: A Review Of CRAN Packages For Item Response Theory." WU Vienna University of Economics and Business, 2013. http://epub.wu.ac.at/4010/1/resrepIRThandbook.pdf.

Abstract:
In this paper we review the current state of R packages for Item Response Theory (IRT). We group the available packages based on their purpose and provide an overview of each package's main functionality. Each of the packages we describe has a peer-reviewed publication associated with it. We also provide a tutorial analysis of data from the 1990 Workplace Industrial Relations Survey to show how the breadth and flexibility of IRT packages in R can be leveraged to conduct even challenging item analyses with versatility and ease. These items relate to the type of consultations that are carried out in a firm when major changes are implemented. We first use unidimensional IRT models, only to discover that they do not fit well. We then use nonparametric IRT to explore the possible causes of the scaling problem. Based on the results from this exploration, we finally use a two-dimensional model on a subset of the original items to achieve a good fit with a sensible interpretation, namely that there are two types of consultations a firm may engage in: consultations with workers/representatives from the firm and with official union representatives. The different items relate mostly to one of these dimensions, and firms can be scaled well along these two dimensions.
Series: Discussion Paper Series / Center for Empirical Research Methods
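The unidimensional models the abstract starts from share a simple functional core. A minimal sketch of the Rasch model's item response function in Python (illustrative only; the function name is hypothetical and this is not code from any of the reviewed packages):

```python
import math

def rasch_prob(theta, b):
    """Rasch model: probability that a person with ability theta
    answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# The probability rises with ability and falls with item difficulty.
p_easy = rasch_prob(theta=1.0, b=-1.0)   # able person, easy item
p_hard = rasch_prob(theta=1.0, b=3.0)    # able person, very hard item
```

When theta equals b, the response probability is exactly one half, which is how item difficulties are located on the ability scale.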
3

Granato, Italo Stefanine Correia. "snpReady and BGGE: R packages to prepare datasets and perform genome-enabled predictions." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/11/11137/tde-21062018-134207/.

Abstract:
The use of molecular markers allows an increase in the efficiency of selection as well as a better understanding of genetic resources in breeding programs. However, as the number of markers increases, the data must be processed before they are ready to use. Also, to explore genotype x environment (GE) interaction in the context of genomic prediction, some covariance matrices need to be set up before the prediction step. Thus, aiming to facilitate the introduction of genomic practices into breeding program pipelines, we developed two R packages. The first, called snpReady, prepares data sets for genomic studies. It offers three functions for this purpose, covering the organization and quality control of marker data, the construction of the genomic relationship matrix, and the estimation of population genetics parameters. Furthermore, we present a new imputation method for missing markers. The second is the BGGE package, built to generate kernels for some GE genomic models and perform predictions. It consists of two functions (getK and BGGE). The former creates kernels for the GE genomic models, and the latter performs genomic predictions with some features for GE kernels that decrease the computational time. The features covered in the two packages present a fast and straightforward option to help introduce genome analysis into the breeding program pipeline.
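The genomic relationship matrix mentioned in the abstract is, at its core, a cross-product of centred marker scores. A hedged numpy sketch of a VanRaden-style computation (illustrative; not snpReady's actual code, and the function name is invented):

```python
import numpy as np

def genomic_relationship(M):
    """VanRaden-style genomic relationship matrix from a marker matrix M
    (individuals x markers, genotypes coded 0/1/2)."""
    p = M.mean(axis=0) / 2.0          # estimated allele frequencies
    Z = M - 2.0 * p                   # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

M = np.array([[0, 1, 2, 1],
              [1, 1, 0, 2],
              [2, 0, 1, 1]], dtype=float)
G = genomic_relationship(M)           # 3 x 3 relationship matrix
```

The resulting matrix is symmetric, with diagonal entries reflecting each individual's homozygosity relative to the allele frequencies.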
4

Kozinski, Lukasz. "Constant protectionism in the financial meltdown: are national stimulus packages new forms of protection alongside currency devaluation?" Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=104530.

Abstract:
Drawing on research on the two North American economies, the study investigated the nature and makeup of the various stimulus policies enacted by the respective governments in response to the recent financial crisis. Specific attention was given to identifying the most significant industry recipients of stimulus offerings. World Trade Organization (WTO) dispute histories for the American and Canadian economies were investigated to find parallels between recent fiscal stimulus offerings amongst various industry beneficiaries and past instances of international trade disputes. Evidence was found of national stimulus packages serving as new sources of protection for traditional recipients. American WTO disputes targeting important stimulus recipients during the crisis period underlined international identification of that country's fiscal policies as protectionist in several economic sectors. Periods of American Dollar depreciation are thought to have exacerbated recent international reactions. No such marked parallel in recent disputes was found for Canada, though the clear appreciation of the Canadian Dollar over the past two years is believed to explain this reserved international reaction. Stimulus in conjunction with depreciation, it would seem, has been accepted to a lesser degree than stimulus alone. Central, however, is the finding that the stimulus packages of both nations focused on industries historically inclined toward protectionism.
5

Dehghannya, Jalal. "Mathematical modeling of airflow, heat and mass transfer during forced convection cooling of produce in ventilated packages." Thesis, McGill University, 2008. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=115663.

Abstract:
Forced convection cooling is the most widely used method of cooling to extend the shelf life of horticultural produce after harvest. However, heterogeneous cooling of produce in different parts of ventilated packages is a serious problem. Therefore, it is essential to design packages that facilitate air circulation throughout the entire package to provide uniform cooling. Selection of appropriate combinations of air temperature and velocity for a given vent design is currently done largely by an experimental trial-and-error approach. A more logical approach to designing new packages that provide uniform cooling is to develop mathematical models able to predict package performance without requiring costly experiments.
In this study, mathematical models of simultaneous airflow, heat and mass transfer during the forced convection cooling process were developed and validated with experimental data. The study showed that produce cooling is strongly influenced by the ventilated package design. Generally, cooling uniformity increased with the number of vents, from 1 (2.4% vent area) to 5 (12.1% vent area). More uniform produce cooling was obtained in less cooling time when vents were distributed uniformly on package walls with at least 4.8% opening area. Aerodynamic studies showed that the heterogeneity of airflow distribution during the process is strongly influenced by the package vent configuration. The highest cooling heterogeneity index (108%) was recorded at 2.4% vent area, whereas the lowest heterogeneity index (0%) was detected in a package with 12.1% vent area.
The magnitudes of produce evaporative cooling (EC) and heat generation by respiration (HG), as well as the interactive effects of EC, HG and package vent design on produce cooling time, were also investigated. Considerable differences in cooling times were obtained with regard to the independent and simultaneous effects of EC and HG in different package vent configurations. Cooling time increased by about 47% in a package with 1 vent compared to packages with 3 and 5 vents when the simultaneous effects of EC and HG were considered. Therefore, the effects of EC and HG can be influential in designing the forced-air precooling system and, consequently, in the accurate determination of cooling time and the corresponding refrigeration load.
6

Lee, Harry B. "A comparative analysis of lectures versus interactive computer-assisted learning packages for the teaching and learning of anatomy by tertiary students." Curtin University of Technology, Faculty of Education, 1996. http://espace.library.curtin.edu.au:80/R/?func=dbin-jump-full&object_id=11232.

Abstract:
The primary aim of this study was to validate interactive computer-assisted learning packages (ICALP) in a self operated computer controlled educational resource (SOCCER) to undergraduate (UG) physiotherapy students of anatomy. The development of ICALP, Test and FeedBack items for SOCCER are described, as well as the mechanism of delivery with continuous positive reinforcement to randomly selected students. To meet this requirement, a computer managed learning environment (CMLE) was established to affirm the value of ICALP and SOCCER materials to replace traditional lectures in anatomy. Quantitative data is given to verify this hypothesis during the education of UG physiotherapy students of anatomy. Throughout 1992, the UG population was randomly divided into Lecture and ICALP groups, with mutual exclusion of each to the other, for ten areas of study. These results were validated by re-application to the succeeding UG population in 1993. The secondary aim of this study was in two parts. Firstly, to verify that ICALP materials can be applied to transfer 2-D cognitive anatomical information in a self-paced format of autonomous learning. Secondly, to investigate a premise that previously acquired 2-D anatomical information may be transferred into a 3-D psycho-motor skill. Ample data is given to verify the first hypothesis, with sufficient evidence to support the second. The subsidiary aim of this study compared the educational and administrative cost-effectiveness of ICALP and SOCCER with traditional lectures used in anatomy. Evidence is given to demonstrate that the time saved in lectures can be replaced by a lecture-seminar approach to problem-based learning to empower UG2 students to achieve at a level beyond that which would normally be expected. Sufficient data is provided to affirm the cost-benefits of ICALP and SOCCER to academic staff, individual students, and administrators. The untested belief held by schools of anatomy that high ranking pre-entrants in English, English Literature, and Human Biology, are more likely to transpose 2-D anatomical information into a 3-D skill than high ranking pre-entrants in Mathematics, Chemistry and Physics was also investigated. Scrutiny of these data could not determine any discriminatory differences of ability to succeed in UG anatomy by either of these two categories.
7

Theußl, Stefan, Uwe Ligges, and Kurt Hornik. "Prospects and Challenges in R Package Development." Institute for Statistics and Mathematics, WU Vienna University of Economics and Business, 2010. http://epub.wu.ac.at/866/1/document.pdf.

Abstract:
R, a software package for statistical computing and graphics, has evolved into the lingua franca of (computational) statistics. One of the cornerstones of R's success is the decentralized and modularized way of creating software using a multi-tiered development model: The R Development Core Team provides the "base system", which delivers basic statistical functionality, and many other developers contribute code in the form of extensions in a standardized format via so-called packages. In order to be accessible by a broader audience, packages are made available via standardized source code repositories. To support such a loosely coupled development model, repositories should be able to verify that the provided packages meet certain formal quality criteria and "work": both relative to the development of the base R system as well as with other packages (interoperability). However, established quality assurance systems and collaborative infrastructures typically face several challenges, some of which we will discuss in this paper.
Series: Research Report Series / Department of Statistics and Mathematics
8

Privé, Florian. "Genetic risk score based on statistical learning Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr Efficient implementation of penalized regression for genetic risk prediction Making the most of Clumping and Thresholding for polygenic scores." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAS024.

Abstract:
Genotyping is becoming cheaper, making genotype data available for millions of individuals. Moreover, imputation enables us to get genotype information at millions of loci, capturing most of the genetic variation in the human genome. Given such large data and the fact that many traits and diseases are heritable (e.g. 80% of the variation of height in the population can be explained by genetics), it is envisioned that predictive models based on genetic information will be part of a personalized medicine. In my thesis work, I focused on improving the predictive ability of polygenic models. Because prediction modeling is part of a larger statistical analysis of datasets, I developed tools to allow flexible exploratory analyses of large datasets, which consist of two R/C++ packages described in the first part of my thesis. Then, I developed an efficient implementation of penalized regression to build polygenic models based on hundreds of thousands of genotyped individuals. Finally, I improved the "clumping and thresholding" method, which is the most widely used polygenic method and is based on summary statistics that are widely available as compared to individual-level data. Overall, I applied many concepts of statistical learning to genetic data. I used extreme gradient boosting for imputing genotyped variants, feature engineering to capture recessive and dominant effects in penalized regression, and parameter tuning and stacked regressions to improve polygenic prediction. Statistical learning is not widely used in human genetics and my thesis is an attempt to change that.
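The clumping and thresholding idea mentioned in the abstract can be sketched in a few lines: keep only SNPs passing a p-value threshold, pruning any SNP too correlated with an already-kept, more significant one, then score individuals with the retained effects. A hedged numpy sketch (conceptual only; not the bigsnpr implementation, and all names are invented):

```python
import numpy as np

def clump_and_threshold(G, beta, pval, r2_max=0.2, p_thresh=0.01):
    """Sketch of a clumping + thresholding polygenic score.
    G: genotypes (n x m); beta, pval: per-SNP effect sizes and p-values."""
    corr = np.corrcoef(G, rowvar=False)
    keep = []
    for j in np.argsort(pval):                 # most significant first
        if pval[j] > p_thresh:
            break                              # thresholding step
        if all(corr[j, k] ** 2 < r2_max for k in keep):
            keep.append(j)                     # clumping step
    score = G[:, keep] @ beta[np.array(keep, dtype=int)]
    return keep, score

# Toy data: SNPs 0 and 1 are perfectly correlated, SNP 2 is independent.
G = np.array([[0, 0, 0],
              [1, 1, 2],
              [2, 2, 0],
              [1, 1, 2]], dtype=float)
pval = np.array([1e-4, 2e-4, 5e-3])
beta = np.array([0.5, 0.5, -0.2])
keep, score = clump_and_threshold(G, beta, pval)
```

Here SNP 1 is clumped away because it duplicates SNP 0, so only SNPs 0 and 2 contribute to the score.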
9

Hornik, Kurt, and Bettina Grün. "topicmodels: An R Package for Fitting Topic Models." American Statistical Association, 2011. http://epub.wu.ac.at/3987/1/topicmodels.pdf.

Abstract:
Topic models allow the probabilistic modeling of term frequency occurrences in documents. The fitted model can be used to estimate the similarity between documents as well as between a set of specified keywords using an additional layer of latent variables which are referred to as topics. The R package topicmodels provides basic infrastructure for fitting topic models based on data structures from the text mining package tm. The package includes interfaces to two algorithms for fitting topic models: the variational expectation-maximization algorithm provided by David M. Blei and co-authors and an algorithm using Gibbs sampling by Xuan-Hieu Phan and co-authors.
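Topic models like these are fitted on document-term count matrices of the kind the tm package produces. A minimal Python sketch of building such a matrix (illustrative helper, not part of topicmodels or tm):

```python
from collections import Counter

def document_term_matrix(docs):
    """Build a dense document-term count matrix from whitespace-tokenised
    documents: one row per document, one column per vocabulary term."""
    vocab = sorted({w for d in docs for w in d.split()})
    rows = []
    for d in docs:
        counts = Counter(d.split())
        rows.append([counts.get(w, 0) for w in vocab])
    return vocab, rows

docs = ["apples and oranges", "oranges and pears and pears"]
vocab, dtm = document_term_matrix(docs)
```

Each cell holds a term frequency; a topic model then explains these counts through a latent layer of topics.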
10

Abel, Guy. "fanplot: An R Package for Visualising Sequential Distributions." The R Foundation for Statistical Computing, 2015. http://epub.wu.ac.at/5910/1/Abel_2015_RJ_fanplot%2DAn%2DR%2DPackage.pdf.

Abstract:
Fan charts, first developed by the Bank of England in 1996, have become a standard method for visualising forecasts with uncertainty. By using shading, fan charts focus attention on the whole distribution rather than on a single central measure. This article describes the basics of plotting fan charts using an R add-on package, alongside some additional methods for displaying sequential distributions. Examples are based on distributions of both estimated parameters from a time series model and future values with uncertainty.
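The quantities a fan chart shades are simply percentile bands of a sequential distribution. A small numpy sketch of computing such bands from simulated forecast paths (plotting omitted; names illustrative):

```python
import numpy as np

def fan_bands(sims, probs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Percentile bands over time from simulated paths
    (array of shape n_sims x n_steps)."""
    return {p: np.quantile(sims, p, axis=0) for p in probs}

rng = np.random.default_rng(42)
paths = np.cumsum(rng.normal(size=(500, 20)), axis=1)  # toy forecast paths
bands = fan_bands(paths)                               # one curve per percentile
```

Shading the regions between successive bands, darkest around the median, reproduces the fan-chart effect.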
11

Kastner, Gregor. "Heavy-Tailed Innovations in the R Package stochvol." WU Vienna University of Economics and Business, 2015. http://epub.wu.ac.at/4918/1/heavytails.pdf.

Abstract:
We document how sampling from a conditional Student's t distribution is implemented in stochvol. Moreover, a simple example using EUR/CHF exchange rates illustrates how to use the augmented sampler. We conclude with results and implications. (author's abstract)
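Student's t innovations are commonly generated through the scale-mixture-of-normals representation. A hedged numpy sketch of that representation (conceptual only, not the stochvol sampler itself):

```python
import numpy as np

def student_t_via_mixture(nu, size, rng):
    """Draw Student t variates as a scale mixture of normals:
    t = z / sqrt(w / nu), with z ~ N(0, 1) and w ~ chi-squared(nu)."""
    z = rng.standard_normal(size)
    w = rng.chisquare(nu, size)
    return z / np.sqrt(w / nu)

rng = np.random.default_rng(1)
draws = student_t_via_mixture(nu=5.0, size=100_000, rng=rng)
```

For nu = 5 the variance is nu / (nu - 2) = 5/3, and the tails are visibly heavier than a standard normal's, which is the point of the heavy-tailed extension.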
12

Randahl, David. "Raoul: An R-Package for Handling Missing Data." Thesis, Uppsala universitet, Statistiska institutionen, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-297051.

13

Karatzoglou, Alexandros, Alex Smola, Kurt Hornik, and Achim Zeileis. "kernlab - An S4 package for kernel methods in R." Institut für Statistik und Mathematik, WU Vienna University of Economics and Business, 2004. http://epub.wu.ac.at/1048/1/document.pdf.

Abstract:
kernlab is an extensible package for kernel-based machine learning methods in R. It takes advantage of R's new S4 object model and provides a framework for creating and using kernel-based algorithms. The package contains dot product primitives (kernels), implementations of support vector machines and the relevance vector machine, Gaussian processes, a ranking algorithm, kernel PCA, kernel CCA, and a spectral clustering algorithm. Moreover it provides a general purpose quadratic programming solver, and an incomplete Cholesky decomposition method. (author's abstract)
Series: Research Report Series / Department of Statistics and Mathematics
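The dot-product primitives such a framework is built around are plain kernel functions. A small numpy sketch of the Gaussian (RBF) kernel matrix (illustrative; not kernlab's code):

```python
import numpy as np

def rbf_kernel_matrix(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)),
    computed from squared pairwise distances."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
K = rbf_kernel_matrix(X)
```

Every kernel-based algorithm in the list above (SVMs, kernel PCA, spectral clustering, ...) consumes such a symmetric positive semi-definite matrix rather than the raw coordinates.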
14

Philipp, Michel, Thomas Rusch, Kurt Hornik, and Carolin Strobl. "Measuring the Stability of Results from Supervised Statistical Learning." WU Vienna University of Economics and Business, 2017. http://epub.wu.ac.at/5398/1/Report131.pdf.

Abstract:
Stability is a major requirement for drawing reliable conclusions when interpreting results from supervised statistical learning. In this paper, we present a general framework for assessing and comparing the stability of results that can be used in real-world statistical learning applications or in benchmark studies. We use the framework to show that stability is a property of both the algorithm and the data-generating process. In particular, we demonstrate that unstable algorithms (such as recursive partitioning) can produce stable results when the functional form of the relationship between the predictors and the response matches the algorithm. Typical uses of the framework in practice would be to compare the stability of results generated by different candidate algorithms for a data set at hand, or to assess the stability of algorithms in a benchmark study. Code to perform the stability analyses is provided in the form of an R package.
Series: Research Report Series / Department of Statistics and Mathematics
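One simple way to operationalise this notion of stability — a hedged sketch, not the paper's exact framework — is to measure the pairwise agreement of predictions across models fitted to resampled training data:

```python
import numpy as np

def stability(fit, X, y, X_eval, n_reps=20, rng=None):
    """Average pairwise agreement of predictions on X_eval across
    models fitted to bootstrap resamples of (X, y)."""
    rng = rng or np.random.default_rng(0)
    preds = []
    for _ in range(n_reps):
        idx = rng.integers(0, len(X), len(X))
        preds.append(fit(X[idx], y[idx])(X_eval))
    preds = np.array(preds)
    pairs = [np.mean(preds[i] == preds[j])
             for i in range(n_reps) for j in range(i + 1, n_reps)]
    return float(np.mean(pairs))

def stump_fit(X, y):
    """Toy learner: threshold on the first feature at its training mean."""
    t = float(X[:, 0].mean())
    return lambda Z: (Z[:, 0] > t).astype(int)

X = np.array([[0.0], [0.1], [0.2], [1.8], [1.9], [2.0]])
y = np.array([0, 0, 0, 1, 1, 1])
s = stability(stump_fit, X, y, X)   # near 1 on well-separated data
```

When the functional form of the data matches the learner, as here, the measured stability is close to its maximum of 1.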
15

de Leeuw, Jan, and Patrick Mair. "Simple and Canonical Correspondence Analysis Using the R Package anacor." Foundation for Open Access Statistics, 2009. http://dx.doi.org/10.18637/jss.v031.i05.

Abstract:
This paper presents the R package anacor for the computation of simple and canonical correspondence analysis with missing values. The canonical correspondence analysis is specified in a rather general way by imposing covariates on the rows and/or the columns of the two-dimensional frequency table. The package allows for scaling methods such as standard, Benzécri, centroid, and Goodman scaling. In addition, along with well-known two- and three-dimensional joint plots including confidence ellipsoids, it offers alternative plotting possibilities in terms of transformation plots, Benzécri plots, and regression plots.
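At its core, simple correspondence analysis is an SVD of the standardized residuals of a two-way table. A numpy sketch of that core (without anacor's missing-value handling, covariates, or alternative scalings):

```python
import numpy as np

def correspondence_analysis(N):
    """Simple CA of a two-way frequency table N: SVD of the matrix of
    standardized residuals, returning principal coordinates and
    singular values (square roots of the principal inertias)."""
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = U * sv / np.sqrt(r)[:, None]      # principal row coordinates
    cols = Vt.T * sv / np.sqrt(c)[:, None]   # principal column coordinates
    return rows, cols, sv

N = np.array([[20.0, 5.0], [5.0, 20.0]])     # toy association table
rows, cols, sv = correspondence_analysis(N)
```

For this symmetric 2x2 table the single non-trivial singular value is 0.6, so the total inertia (chi-squared divided by the table total) is 0.36.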
16

Buder, Thomas, Andreas Deutsch, Michael Seifert, and Anja Voss-Böhme. "CellTrans: An R Package to Quantify Stochastic Cell State Transitions." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-230144.

Abstract:
Many normal and cancerous cell lines exhibit a stable composition of cells in distinct states which can, e.g., be defined on the basis of cell surface markers. There is evidence that such an equilibrium is associated with stochastic transitions between distinct states. Quantifying these transitions has the potential to better understand cell lineage compositions. We introduce CellTrans, an R package to quantify stochastic cell state transitions from cell state proportion data from fluorescence-activated cell sorting and flow cytometry experiments. The R package is based on a mathematical model in which cell state alterations occur due to stochastic transitions between distinct cell states whose rates only depend on the current state of a cell. CellTrans is an automated tool for estimating the underlying transition probabilities from appropriately prepared data. We point out potential analytical challenges in the quantification of these cell transitions and explain how CellTrans handles them. The applicability of CellTrans is demonstrated on publicly available data on the evolution of cell state compositions in cancer cell lines. We show that CellTrans can be used to (1) infer the transition probabilities between different cell states, (2) predict cell line compositions at a certain time, (3) predict equilibrium cell state compositions, and (4) estimate the time needed to reach this equilibrium. We provide an implementation of CellTrans in R, freely available via GitHub (https://github.com/tbuder/CellTrans).
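The underlying model is a discrete-time Markov chain on cell states, so two of the listed computations — the composition at a given time and the equilibrium composition — reduce to standard linear algebra. A numpy sketch with a hypothetical two-state transition matrix (not CellTrans code; the estimation step from FACS data is omitted):

```python
import numpy as np

def propagate(p0, T, steps):
    """Predicted state composition after a number of time steps."""
    p = np.asarray(p0, dtype=float)
    for _ in range(steps):
        p = p @ T
    return p

def equilibrium(T):
    """Stationary composition of a row-stochastic transition matrix T:
    the left eigenvector for eigenvalue 1, normalised to sum to 1."""
    vals, vecs = np.linalg.eig(T.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

T = np.array([[0.9, 0.1],      # hypothetical per-step transition
              [0.2, 0.8]])     # probabilities between two cell states
pi = equilibrium(T)            # long-run composition, here [2/3, 1/3]
```

Propagating any starting composition long enough converges to the same equilibrium, mirroring the stable compositions observed in the cell-line data.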
17

Hahsler, Michael, Kurt Hornik, and Christian Buchta. "Getting Things in Order: An Introduction to the R Package seriation." American Statistical Association, 2008. http://epub.wu.ac.at/4003/1/things.pdf.

Abstract:
Seriation, i.e., finding a suitable linear order for a set of objects given data and a loss or merit function, is a basic problem in data analysis. Caused by the problem's combinatorial nature, it is hard to solve for all but very small sets. Nevertheless, both exact solution methods and heuristics are available. In this paper we present the package seriation which provides an infrastructure for seriation with R. The infrastructure comprises data structures to represent linear orders as permutation vectors, a wide array of seriation methods using a consistent interface, a method to calculate the value of various loss and merit functions, and several visualization techniques which build on seriation. To illustrate how easily the package can be applied for a variety of applications, a comprehensive collection of examples is presented.
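The combinatorial nature of the problem is easy to see in a brute-force sketch: enumerate all permutations of a tiny object set and pick the one minimising a loss, here the sum of adjacent dissimilarities (illustrative Python, not the seriation package's methods):

```python
import numpy as np
from itertools import permutations

def seriate_brute_force(D):
    """Exhaustive seriation of a dissimilarity matrix D: the permutation
    minimising the sum of adjacent dissimilarities. Feasible only for
    very small sets, which is why heuristics are needed in general."""
    n = D.shape[0]
    best, best_loss = None, np.inf
    for perm in permutations(range(n)):
        loss = sum(D[perm[i], perm[i + 1]] for i in range(n - 1))
        if loss < best_loss:
            best, best_loss = perm, loss
    return best, best_loss

# Points on a line, presented out of order: seriation recovers the order.
x = np.array([3.0, 0.0, 1.0, 2.0])
D = np.abs(x[:, None] - x[None, :])
order, loss = seriate_brute_force(D)
```

The optimal loss equals the span of the points (3.0), attained by either the sorted order or its reverse, illustrating that a seriation loss defines a linear order only up to reflection.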
18

Feinerer, Ingo, Christian Buchta, Wilhelm Geiger, Johannes Rauch, Patrick Mair, and Kurt Hornik. "The textcat Package for n-Gram Based Text Categorization in R." American Statistical Association, 2013. http://epub.wu.ac.at/3985/1/textcat.pdf.

Abstract:
Identifying the language used will typically be the first step in most natural language processing tasks. Among the wide variety of language identification methods discussed in the literature, the ones employing the Cavnar and Trenkle (1994) approach to text categorization based on character n-gram frequencies have been particularly successful. This paper presents the R extension package textcat for n-gram based text categorization which implements both the Cavnar and Trenkle approach as well as a reduced n-gram approach designed to remove redundancies of the original approach. A multi-lingual corpus obtained from the Wikipedia pages available on a selection of topics is used to illustrate the functionality of the package and the performance of the provided language identification methods. (authors' abstract)
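The Cavnar and Trenkle approach ranks character n-grams by frequency and compares profiles with an "out-of-place" distance. A hedged Python sketch of that idea on toy profiles (not textcat's implementation or its reduced n-gram variant):

```python
from collections import Counter

def ngram_profile(text, n=3, top=100):
    """Ranked character n-gram profile, Cavnar-Trenkle style."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return [g for g, _ in grams.most_common(top)]

def out_of_place(doc_profile, lang_profile):
    """Sum of rank displacements; n-grams missing from the language
    profile incur the maximum penalty."""
    penalty = len(lang_profile)
    rank = {g: i for i, g in enumerate(lang_profile)}
    return sum(abs(i - rank[g]) if g in rank else penalty
               for i, g in enumerate(doc_profile))

english = ngram_profile("the quick brown fox jumps over the lazy dog "
                        "and the cat sat on the mat with the dog")
german = ngram_profile("der schnelle braune fuchs springt ueber den "
                       "faulen hund und die katze sitzt auf der matte")
doc = ngram_profile("the dog and the cat")
```

Classifying a document then amounts to picking the language profile with the smallest out-of-place distance; for the English test string above, the English profile wins.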
19

Hahsler, Michael, Kurt Hornik, and Christian Buchta. "Getting Things in Order: An Introduction to the R package seriation." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 2007. http://epub.wu.ac.at/852/1/document.pdf.

Abstract:
Seriation, i.e., finding a linear order for a set of objects given data and a loss or merit function, is a basic problem in data analysis. Caused by the problem's combinatorial nature, it is hard to solve for all but very small sets. Nevertheless, both exact solution methods and heuristics are available. In this paper we present the package seriation which provides the infrastructure for seriation with R. The infrastructure comprises data structures to represent linear orders as permutation vectors, a wide array of seriation methods using a consistent interface, a method to calculate the value of various loss and merit functions, and several visualization techniques which build on seriation. To illustrate how easily the package can be applied for a variety of applications, a comprehensive collection of examples is presented.
Series: Research Report Series / Department of Statistics and Mathematics
20

Jot, Sapan. "pcaL1: An R Package of Principal Component Analysis using the L1 Norm." VCU Scholars Compass, 2011. http://scholarscompass.vcu.edu/etd/2488.

Abstract:
Principal component analysis (PCA) is a dimensionality reduction tool which captures the features of a data set in a low-dimensional subspace. Traditional PCA uses the L2 norm and has much-desired orthogonality properties, but is sensitive to outliers. PCA using the L1 norm has been proposed as an alternative to counter the effect of outliers. The R environment for statistical computing already provides the L2-PCA function prcomp(), but there are not many options for L1-norm PCA methods. The goal of this research was to create one R package offering several L1-norm PCA methods. We therefore chose three different L1-PCA algorithms: PCA-L1 proposed by Kwak [10], L1-PCA* by Brooks et al. [1], and L1-PCA by Ke and Kanade [9], to create the package pcaL1 in R, interfacing with C implementations of these algorithms. CLP, an open-source solver for linear programming problems, is used to solve the optimization problems for L1-PCA* and L1-PCA. We use this package on human microbiome data to investigate the relationship between people based on colonizing bacteria.
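Of the three algorithms, Kwak's PCA-L1 is the simplest: a sign-flipping fixed-point iteration that maximises the sum of absolute projections. A hedged numpy sketch of the first component (illustrative, not the pcaL1 package's C code):

```python
import numpy as np

def pca_l1_first_component(X, n_iter=100):
    """First L1-norm principal component via Kwak-style sign-flipping:
    maximise sum_i |w . x_i| subject to ||w|| = 1."""
    X = X - X.mean(axis=0)
    w = X[np.argmax(np.sum(np.abs(X), axis=1))].copy()  # start at largest row
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        s = np.sign(X @ w)
        s[s == 0] = 1.0                  # break ties consistently
        w_new = X.T @ s
        w_new /= np.linalg.norm(w_new)
        if np.allclose(w_new, w):
            break                        # fixed point reached
        w = w_new
    return w

# Data varying mainly along the first axis: the component aligns with it.
X = np.array([[4.0, 0.0], [-4.0, 0.0], [2.0, 0.1],
              [-2.0, -0.1], [0.0, 0.5], [0.0, -0.5]])
w = pca_l1_first_component(X)
```

Because each point contributes its absolute projection rather than its squared projection, a single outlier pulls the component far less than in L2 PCA.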
APA, Harvard, Vancouver, ISO, and other styles
21

Kastner, Gregor. "Dealing with Stochastic Volatility in Time Series Using the R Package stochvol." Foundation for Open Access Statistics, 2016. http://epub.wu.ac.at/4890/1/v69i05.pdf.

Full text
Abstract:
The R package stochvol provides a fully Bayesian implementation of heteroskedasticity modeling within the framework of stochastic volatility. It utilizes Markov chain Monte Carlo (MCMC) samplers to conduct inference by obtaining draws from the posterior distribution of parameters and latent variables which can then be used for predicting future volatilities. The package can straightforwardly be employed as a stand-alone tool; moreover, it allows for easy incorporation into other MCMC samplers. The main focus of this paper is to show the functionality of stochvol. In addition, it provides a brief mathematical description of the model, an overview of the sampling schemes used, and several illustrative examples using exchange rate data. (author's abstract)
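A minimal, stand-alone sketch of the usage pattern described above (assuming the CRAN package stochvol is installed; the function names and the bundled exrates data follow the paper):

```r
library(stochvol)

## Demeaned log returns of EUR/USD exchange rates (data ships with the package)
data("exrates")
ret <- logret(exrates$USD, demean = TRUE)

## Draw from the posterior of the vanilla stochastic volatility model via MCMC
res <- svsample(ret, draws = 5000, burnin = 1000)
summary(res)

## Predict future volatilities from the posterior draws
pred <- predict(res, steps = 5)
```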
APA, Harvard, Vancouver, ISO, and other styles
22

Hornik, Kurt, and Bettina Grün. "movMF: An R Package for Fitting Mixtures of von Mises-Fisher Distributions." American Statistical Association, 2014. http://epub.wu.ac.at/4893/1/v58i10.pdf.

Full text
Abstract:
Finite mixtures of von Mises-Fisher distributions make it possible to apply model-based clustering methods to data of standardized length, i.e., where all data points lie on the unit sphere. The R package movMF contains functionality to draw samples from finite mixtures of von Mises-Fisher distributions and to fit these models using the expectation-maximization algorithm for maximum likelihood estimation. Special features include the possibility of using sparse matrix representations for the input data, different variants of the expectation-maximization algorithm, different methods for determining the concentration parameters in the M-step, and the option to impose constraints on the concentration parameters across the components. In this paper we describe the main fitting function of the package and illustrate its application. In addition we compare the clustering performance of finite mixtures of von Mises-Fisher distributions to spherical k-means. We also discuss the resolution of several numerical issues which occur when estimating the concentration parameters and determining the normalizing constant of the von Mises-Fisher distribution. (authors' abstract)
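The sampling-and-fitting cycle from the abstract can be sketched as follows (a hedged example assuming the CRAN package movMF; argument names are taken from the package interface and may differ across versions):

```r
library(movMF)

## Draw a sample from a two-component vMF mixture on the unit circle
## (rows of theta are mean directions scaled by the concentration kappa)
theta <- rbind(c(10, 0), c(0, 10))
x <- rmovMF(200, theta, alpha = c(0.5, 0.5))

## Fit a mixture of von Mises-Fisher distributions with EM, using restarts
fit <- movMF(x, k = 2, control = list(nruns = 10))
fit

## Posterior component memberships for clustering
head(predict(fit))
```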
APA, Harvard, Vancouver, ISO, and other styles
23

Zhu, Hongxu. "AN R PACKAGE FOR FITTING DIRICHLET PROCESS MIXTURES OF MULTIVARIATE GAUSSIAN DISTRIBUTIONS." Case Western Reserve University School of Graduate Studies / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=case155752396390554.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Downey, Sean S., Guowei Sun, and Peter Norquest. "alineR: an R Package for Optimizing Feature-Weighted Alignments and Linguistic Distances." R FOUNDATION STATISTICAL COMPUTING, 2017. http://hdl.handle.net/10150/625224.

Full text
Abstract:
Linguistic distance measurements are commonly used in anthropology and biology when quantitative and statistical comparisons between words are needed. This is common, for example, when analyzing linguistic and genetic data. Such comparisons can provide insight into historical population patterns and evolutionary processes. However, the most commonly used linguistic distances are derived from edit distances, which do not weight phonetic features that may, for example, represent smaller-scale patterns in linguistic evolution. Thus, computational methods for calculating feature-weighted linguistic distances are needed for linguistic, biological, and evolutionary applications; additionally, the linguistic distances presented here are generic and may have broader applications in fields such as text mining and search, as well as applications in psycholinguistics and morphology. To facilitate this research, we are making available an open-source R software package that performs feature-weighted linguistic distance calculations. The package also includes a supervised learning methodology that uses a genetic algorithm and manually determined alignments to estimate 13 linguistic parameters including feature weights and a skip penalty. Here we present the package and use it to demonstrate the supervised learning methodology by estimating the optimal linguistic parameters for both simulated data and for a sample of Austronesian languages. Our results show that the methodology can estimate these parameters for both simulated and real language data, that optimizing feature weights improves alignment accuracy by approximately 29%, and that optimization significantly affects the resulting distance measurements. Availability: alineR is available on CRAN.
APA, Harvard, Vancouver, ISO, and other styles
25

Chuová, Trang. "Analýza výrobního sortimentu firmy KEB-EGE spol.s r. o." Master's thesis, Vysoká škola ekonomická v Praze, 2011. http://www.nusl.cz/ntk/nusl-113608.

Full text
Abstract:
My diploma thesis analyzes the product assortment of the company KEB-EGE ltd., which concentrates on producing autodiagnostics and steel constructions. The theoretical part covers basic marketing terms, while the practical part introduces the company itself, its customers, competitors and suppliers, the analysis of the company's macro environment, and the analysis of the marketing mix in terms of its four tools: product, price, distribution and marketing communication.
APA, Harvard, Vancouver, ISO, and other styles
26

Hothorn, Torsten, Kurt Hornik, de Wiel Mark A. van, and Achim Zeileis. "Implementing a Class of Permutation Tests: The coin Package." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 2007. http://epub.wu.ac.at/408/1/document.pdf.

Full text
Abstract:
The R package coin implements a unified approach to permutation tests providing a huge class of independence tests for nominal, ordered, numeric, and censored data as well as multivariate data at mixed scales. Based on a rich and flexible conceptual framework that embeds different permutation test procedures into a common theory, a computational framework is established in coin that likewise embeds the corresponding R functionality in a common S4 class structure with associated generic functions. As a consequence, the computational tools in coin inherit the flexibility of the underlying theory and conditional inference functions for important special cases can be set up easily. Conditional versions of classical tests - such as tests for location and scale problems in two or more samples, independence in two- or three-way contingency tables, or association problems for censored, ordered categorical or multivariate data - can easily be implemented as special cases using this computational toolbox by choosing appropriate transformations of the observations. The paper gives a detailed exposition of both the internal structure of the package and the provided user interfaces.
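The "classical test as a special case" idea can be illustrated briefly (a sketch assuming the CRAN package coin; the built-in ToothGrowth data stands in for a two-sample location problem):

```r
library(coin)

## Conditional (permutation) Wilcoxon test for a two-sample location problem
wilcox_test(len ~ supp, data = ToothGrowth, distribution = "exact")

## The same problem expressed through the general independence test,
## recovered by choosing a rank transformation of the response
independence_test(len ~ supp, data = ToothGrowth,
                  ytrafo = function(data) trafo(data, numeric_trafo = rank_trafo))
```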
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
27

Zeileis, Achim, Mark A. van de Wiel, Kurt Hornik, and Torsten Hothorn. "Implementing a Class of Permutation Tests: The coin Package." American Statistical Association, 2008. http://epub.wu.ac.at/4004/1/class.pdf.

Full text
Abstract:
The R package coin implements a unified approach to permutation tests providing a huge class of independence tests for nominal, ordered, numeric, and censored data as well as multivariate data at mixed scales. Based on a rich and flexible conceptual framework that embeds different permutation test procedures into a common theory, a computational framework is established in coin that likewise embeds the corresponding R functionality in a common S4 class structure with associated generic functions. As a consequence, the computational tools in coin inherit the flexibility of the underlying theory and conditional inference functions for important special cases can be set up easily. Conditional versions of classical tests - such as tests for location and scale problems in two or more samples, independence in two- or three-way contingency tables, or association problems for censored, ordered categorical or multivariate data - can easily be implemented as special cases using this computational toolbox by choosing appropriate transformations of the observations. The paper gives a detailed exposition of both the internal structure of the package and the provided user interfaces along with examples on how to extend the implemented functionality. (authors' abstract)
APA, Harvard, Vancouver, ISO, and other styles
28

Zeileis, Achim, Friedrich Leisch, Kurt Hornik, and Christian Kleiber. "strucchange. An R package for testing for structural change in linear regression models." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 2001. http://epub.wu.ac.at/1124/1/document.pdf.

Full text
Abstract:
This paper introduces ideas and methods for testing for structural change in linear regression models and presents how these have been realized in an R package called strucchange. It features tests from the generalized fluctuation test framework as well as from the F test (Chow test) framework. Extending standard significance tests, it contains methods to fit, plot and test empirical fluctuation processes (like CUSUM, MOSUM and estimates-based processes) on the one hand and to compute, plot and test sequences of F statistics with the supF, aveF and expF test on the other. Thus, it makes powerful tools available to display information about structural changes in regression relationships and to assess their significance. Furthermore, it is described how incoming data can be monitored online.
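Both testing frameworks from the abstract can be demonstrated on a classic data set (a sketch assuming the CRAN package strucchange; the Nile series, which ships with base R, contains a well-known level shift around 1898):

```r
library(strucchange)

## Annual flow of the river Nile, with a documented level shift around 1898
data("Nile")

## Generalized fluctuation test: OLS-based CUSUM process and significance test
ocus <- efp(Nile ~ 1, type = "OLS-CUSUM")
plot(ocus)
sctest(ocus)

## F test framework: sequence of F statistics with the supF test
fs <- Fstats(Nile ~ 1)
sctest(fs, type = "supF")
```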
Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
APA, Harvard, Vancouver, ISO, and other styles
29

Kleiber, Christian, Kurt Hornik, Friedrich Leisch, and Achim Zeileis. "strucchange: An R Package for Testing for Structural Change in Linear Regression Models." American Statistical Association, 2002. http://epub.wu.ac.at/4001/1/strucchange.pdf.

Full text
Abstract:
This paper reviews tests for structural change in linear regression models from the generalized fluctuation test framework as well as from the F test (Chow test) framework. It introduces a unified approach for implementing these tests and presents how these ideas have been realized in an R package called strucchange. Enhancing the standard significance test approach, the package contains methods to fit, plot and test empirical fluctuation processes (like CUSUM, MOSUM and estimates-based processes) and to compute, plot and test sequences of F statistics with the supF, aveF and expF test. Thus, it makes powerful tools available to display information about structural changes in regression relationships and to assess their significance. Furthermore, it is described how incoming data can be monitored.
APA, Harvard, Vancouver, ISO, and other styles
30

Hornik, Kurt, Duncan Murdoch, and Achim Zeileis. "Who Did What? The Roles of R Package Authors and How to Refer to Them." WU Vienna University of Economics and Business, 2011. http://epub.wu.ac.at/3269/1/Report114.pdf.

Full text
Abstract:
Computational infrastructure for representing persons and citations has been available in R for several years, but has been restructured through enhanced classes "person" and "bibentry" in recent versions of R. The new features include support for the specification of the roles of package authors (e.g., maintainer, author, contributor, translator, etc.) and more flexible formatting/printing tools among various other improvements. Here, we introduce the new classes and their methods and indicate how this functionality is employed in the management of R packages. Specifically, we show how the authors of R packages can be specified along with their roles in package 'DESCRIPTION' and/or 'CITATION' files and the citations produced from it. (author's abstract)
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
31

Mair, Patrick, Eva Hofmann, Kathrin Gruber, Reinhold Hatzinger, Achim Zeileis, and Kurt Hornik. "Motives for Participation in Open-Source Software Projects: A Survey among R Package Authors." WU Vienna University of Economics and Business, 2014. http://epub.wu.ac.at/4135/1/Report126.pdf.

Full text
Abstract:
One of the cornerstones of the R system for statistical computing is the multitude of contributed packages making an extremely broad range of statistical techniques and other quantitative methods freely available. This study investigates which factors are the crucial determinants responsible for the participation of the package authors in the R project. For this purpose a survey was conducted among R package authors, collecting data on different types of participation in the R project, three psychometric scales (hybrid forms of motivation, work design characteristics, and values), as well as various sociodemographic factors. These data are analyzed using item response theory and generalized linear models, showing that the most important determinants for participation are a hybrid form of motivation and the knowledge characteristics of the work design. Other factors are found to have less impact or influence only specific aspects of participation. (authors' abstract)
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
32

Mair, Patrick, and Reinhold Hatzinger. "Extended Rasch Modeling: The eRm Package for the Application of IRT Models in R." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 2007. http://epub.wu.ac.at/332/1/document.pdf.

Full text
Abstract:
Item response theory (IRT) models are increasingly becoming established in social science research, particularly in the analysis of performance or attitudinal data in psychology, education, medicine, marketing and other fields where testing is relevant. We propose the R package eRm (extended Rasch modeling) for computing Rasch models and several extensions. A main characteristic of some IRT models, the Rasch model being the most prominent, concerns the separation of two kinds of parameters, one that describes qualities of the subject under investigation, and the other relating to qualities of the situation under which the response of a subject is observed. Using conditional maximum likelihood (CML) estimation, both types of parameters may be estimated independently of each other. IRT models are well suited to cope with dichotomous and polytomous responses, where the response categories may be unordered as well as ordered. The incorporation of linear structures allows for modeling the effects of covariates and enables the analysis of repeated categorical measurements. The eRm package fits the following models: the Rasch model, the rating scale model (RSM), and the partial credit model (PCM) as well as linear reparameterizations through covariate structures like the linear logistic test model (LLTM), the linear rating scale model (LRSM), and the linear partial credit model (LPCM). We use a unitary, efficient CML approach to estimate the item parameters and their standard errors. Graphical and numeric tools for assessing goodness-of-fit are provided. (author's abstract)
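A short sketch of the CML workflow described above (assuming the CRAN package eRm is installed; raschdat1 is example data shipped with the package):

```r
library(eRm)

## Dichotomous example data shipped with the package
data("raschdat1")

## Fit the Rasch model via conditional maximum likelihood (CML)
fit <- RM(raschdat1)
summary(fit)              # item parameters with standard errors

## Person parameters and an item-level goodness-of-fit check
pp <- person.parameter(fit)
itemfit(pp)
```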
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
33

Almeida, Bernardo Simons de. "A visualização de dados nas lotas da Docapesca." Master's thesis, Instituto Superior de Economia e Gestão, 2020. http://hdl.handle.net/10400.5/21032.

Full text
Abstract:
Master's in Quantitative Methods for Economic and Business Decision-Making
The second decade of the 21st century saw exponential growth in the amount of data created daily. Consequently, data visualization is increasingly relevant in the business context, proving ever more essential for decision-making and for planning business strategies. This work was carried out together with Docapesca with the aim of creating a dashboard capable of analyzing the data on the fish traded at auctions in Portugal over a given period (2009 to 2018). With a methodological approach oriented towards data visualization, it was possible to build a dashboard with the properties required by the brief. The literature review was put into practice, producing a dashboard in line with what was described above.
The second decade of the 21st century saw an exponential growth in the data created daily. Consequently, data visualization is increasingly relevant in the business context, proving to be increasingly essential for decision making and business strategy planning. In association with Docapesca, this work was carried out with the purpose of creating a dashboard capable of analyzing the specific data related to fish traded at auction in Portugal in a given period (Between 2009 and 2018). With a methodological approach oriented to data visualization, it was possible to elaborate a dashboard with the properties indicated for what was requested. Despite not being implemented, the final assessment was positive and allowed conclusions to be drawn that were correct in relation to the data under study.
info:eu-repo/semantics/publishedVersion
APA, Harvard, Vancouver, ISO, and other styles
34

Hornik, Kurt, Duncan Murdoch, and Achim Zeileis. "Who Did What? The Roles of R Package Authors and How to Refer to Them." The R Foundation for Statistical Computing, 2012. http://epub.wu.ac.at/6395/1/RJ%2D2012%2D009.pdf.

Full text
Abstract:
Computational infrastructure for representing persons and citations has been available in R for several years, but has been restructured through enhanced classes "person" and "bibentry" in recent versions of R. The new features include support for the specification of the roles of package authors (e.g. maintainer, author, contributor, translator, etc.) and more flexible formatting/printing tools among various other improvements. Here, we introduce the new classes and their methods and indicate how this functionality is employed in the management of R packages. Specifically, we show how the authors of R packages can be specified along with their roles in package 'DESCRIPTION' and/or 'CITATION' files and the citations produced from it.
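The role mechanism described in the abstract uses base R's "person" class. A minimal sketch (the names and e-mail address below are hypothetical placeholders):

```r
## Specifying authors and their roles with base R's "person" class
## (MARC relator codes: "aut" = author, "cre" = maintainer, "ctb" = contributor)
authors <- c(
  person("Ada", "Lovelace", email = "ada@example.org",  # hypothetical contact
         role = c("aut", "cre")),
  person("Charles", "Babbage", role = "ctb")
)
format(authors)

## The same vector can be used verbatim in a package's DESCRIPTION file:
## Authors@R: c(person("Ada", "Lovelace", role = c("aut", "cre")),
##              person("Charles", "Babbage", role = "ctb"))
```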
APA, Harvard, Vancouver, ISO, and other styles
35

Hu, Yang. "Extreme Value Mixture Modelling with Simulation Study and Applications in Finance and Insurance." Thesis, University of Canterbury. Mathematics and Statistics, 2013. http://hdl.handle.net/10092/8538.

Full text
Abstract:
Extreme value theory has been used to develop models for describing the distribution of rare events. The extreme value theory based models can be used for asymptotically approximating the behavior of the tail(s) of the distribution function. An important challenge in the application of such extreme value models is the choice of a threshold, beyond which point the asymptotically justified extreme value models can provide good extrapolation. One approach for determining the threshold is to fit all the available data with an extreme value mixture model. This thesis will review most of the existing extreme value mixture models in the literature and implement them in a package for the statistical programming language R to make them more readily usable by practitioners, as they are not commonly available in any software. There are many different forms of extreme value mixture models in the literature (e.g. parametric, semi-parametric and non-parametric), which provide an automated approach for estimating the threshold and taking into account the uncertainties in threshold selection. However, it is not clear how the proportion above the threshold, or tail fraction, should be treated, as there is no consistency in the existing model derivations. This thesis will develop some new models by adapting the existing ones in the literature and placing them all within a more generalized framework that takes into account how the tail fraction is defined in the model. Various new models are proposed by extending some of the existing parametric-form mixture models to have a continuous density at the threshold, which has the advantage of using fewer model parameters and being more physically plausible. The generalized framework all the mixture models are placed within can be used for demonstrating the importance of the specification of the tail fraction. An R package called evmix has been created to enable these mixture models to be more easily applied and further developed.
For every mixture model, the density, distribution, quantile, random number generation, likelihood and fitting functions are provided (Bayesian inference via MCMC is also implemented for the non-parametric extreme value mixture models). A simulation study investigates the performance of the various extreme value mixture models under different population distributions with a representative variety of lower and upper tail behaviors. The results show that the kernel density estimator based non-parametric mixture model is able to provide good tail estimation in general, whilst the parametric and semi-parametric mixture models can give a reasonable fit if the distribution below the threshold is correctly specified. Somewhat surprisingly, it is found that including a constraint of continuity at the threshold does not substantially improve the model fit in the upper tail. The hybrid Pareto model performs poorly as it does not include the tail fraction term. The relevant mixture models are applied to insurance and financial applications which highlight the practical usefulness of these models.
APA, Harvard, Vancouver, ISO, and other styles
36

Ryan, Niamh Margaret. "The design and application of SuRFR : an R package to prioritise candidate functional DNA sequence variants." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/22916.

Full text
Abstract:
Genetic analyses such as linkage and genome wide association studies (GWAS) have been extremely successful at identifying genomic regions that harbour genetic variants contributing to complex disorders. Over 90% of disease-associated variants from GWAS fall within non-coding regions (Maurano et al., 2012). However, pinpointing the causal variants has proven a major bottleneck to genetic research. To address this I have developed SuRFR, an R package for the ranked prioritisation of candidate causal variants by predicted function. SuRFR produces rank orderings of variants based upon functional genomic annotations, including DNase hypersensitivity signal, chromatin state, minor allele frequency, and conservation. The ranks for each annotation are combined into a final prioritisation rank using a weighting system that has been parametrised and tested through ten-fold cross-validation. SuRFR has been tested extensively upon a combination of synthetic and real datasets and has been shown to perform with high sensitivity and specificity. These analyses have provided insight into the extent to which different classes of functional annotation are most useful for the identification of known regulatory variants: the most important factor for identifying a true variant across all classes of regulatory variants is position relative to genes. I have also shown that SuRFR performs at least as well as its nearest competitors whilst benefiting from the advantages that come from being part of the R environment. I have applied SuRFR to several genomics projects, particularly the study of psychiatric illness, including genome sequencing of a large Scottish family with bipolar disorder. This has resulted in the prioritisation of such variants for future study.
APA, Harvard, Vancouver, ISO, and other styles
37

Zhang, Jiaqi. "A Comparison of Propensity Score Matching Methods in R with the MatchIt Package: A Simulation Study." University of Cincinnati / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1367938207.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Konigorski, Stefan. "Development and application of new statistical methods for the analysis of multiple phenotypes to investigate genetic associations with cardiometabolic traits." Doctoral thesis, Humboldt-Universität zu Berlin, 2018. http://dx.doi.org/10.18452/19132.

Full text
Abstract:
The biotechnological advances of recent years have made it possible to investigate genetic and molecular markers with multiple complex traits in ever greater detail. However, existing statistical methods often do not provide valid inference for these complex analyses. The first aim of the present work is to develop two new statistical methods for association studies of genetic markers with multiple phenotypes, to implement them efficiently and robustly, and to evaluate them in comparison with existing statistical methods. The first approach, C-JAMP (Copula-based Joint Analysis of Multiple Phenotypes), makes it possible to investigate the association of genetic variants with multiple traits in a joint copula model. The second approach, CIEE (Causal Inference using Estimating Equations), makes it possible to estimate and test direct genetic effects. In this work, C-JAMP is evaluated for association studies of rare genetic variants with quantitative traits, and CIEE for association studies of common genetic variants with quantitative traits and time-to-event outcomes. The results of extensive simulation studies show that both methods provide unbiased and efficient parameter estimators and can increase the statistical power of association tests compared with existing methods - which, in turn, often fail to provide valid inference. For the second aim of this work, to identify new genetic and transcriptomic markers for cardiometabolic traits, two studies with genome- and transcriptome-wide data are analyzed with C-JAMP and CIEE. The analyses identify several new candidate markers and genes for blood pressure and obesity. This underscores the value of developing, evaluating, and implementing new statistical methods. R packages are available for both methods, enabling their application in future studies.
In recent years, the biotechnological advancements have allowed to investigate associations of genetic and molecular markers with multiple complex phenotypes in much greater depth. However, for the analysis of such complex datasets, available statistical methods often don’t yield valid inference. The first aim of this thesis is to develop two novel statistical methods for association analyses of genetic markers with multiple phenotypes, to implement them in a computationally efficient and robust manner so that they can be used for large-scale analyses, and evaluate them in comparison to existing statistical approaches under realistic scenarios. The first approach, called the copula-based joint analysis of multiple phenotypes (C-JAMP) method, allows investigating genetic associations with multiple traits in a joint copula model and is evaluated for genetic association analyses of rare genetic variants with quantitative traits. The second approach, called the causal inference using estimating equations (CIEE) method, allows estimating and testing direct genetic effects in directed acyclic graphs, and is evaluated for association analyses of common genetic variants with quantitative and time-to-event traits. The results of extensive simulation studies show that both approaches yield unbiased and efficient parameter estimators and can improve the power of association tests in comparison to existing approaches, which yield invalid inference in many scenarios. For the second goal of this thesis, to identify novel genetic and transcriptomic markers associated with cardiometabolic traits, C-JAMP and CIEE are applied in two large-scale studies including genome- and transcriptome-wide data. In the analyses, several novel candidate markers and genes are identified, which highlights the merit of developing, evaluating, and implementing novel statistical approaches. R packages are available for both methods and enable their application in future studies.
APA, Harvard, Vancouver, ISO, and other styles
39

Younkin, Samuel G. "The Linkage Disequilibrium LASSO for SNP Selection in Genetic Association Studies." Case Western Reserve University School of Graduate Studies / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=case1291219489.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Rodriguez, Hefferan Javier. "The Chief Research and Development (R&D) Officer's contribution to innovation : a study in Consumer Packaged Goods multinationals." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/107601.

Full text
Abstract:
Thesis: S.M. in Engineering and Management, Massachusetts Institute of Technology, School of Engineering, System Design and Management Program, Engineering and Management Program, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 136-143).
The Consumer Packaged Goods (CPG) industry faces challenges that are making it increasingly harder for CPG multinational companies to compete. The lack of differentiation in CPG, evolving consumer preferences, and the need to offer the consumer a unique and valuable experience is requiring CPG multinational companies to continuously innovate. The Research and Development (R&D) function helps to overcome the pressing challenges that the CPG industry faces by contributing to the overall innovation that a firm can deliver. The underlying question is how R&D creates innovation in the context of a CPG multinational. Such innovation, in the form of new products and processes, would seemingly require a central R&D executive, defined in this thesis as the Chief R&D Officer, the person who is accountable for creating innovation for the firm in the R&D context. To contribute to innovation, the Chief R&D Officer must not only set the direction for R&D, but also execute this direction in terms of formulating the R&D strategy, and then managing the R&D organizational structure and leading the R&D organizational culture. The proposition is that the evolving role of the Chief R&D Officer is demanding that these senior executives think systematically about these elements to guarantee the short-term and long-term competitiveness of the R&D organization. If the Chief R&D Officer formulates the right strategy but has an inefficient organizational structure or lacks an innovative organizational culture, then the R&D organization will fail in creating innovation for the firm. This thesis also explores how Chief R&D Officers in the CPG multinational companies have coped with these key elements to achieve successful innovation.
by Javier Rodriguez Hefferan.
S.M. in Engineering and Management
APA, Harvard, Vancouver, ISO, and other styles
41

Tremblay, Serge 1961. "A microcomputer software package to design agricultural drainage plans /." Thesis, McGill University, 1987. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=63911.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Mair, Patrick, Eva Hofmann, Kathrin Gruber, Reinhold Hatzinger, Achim Zeileis, and Kurt Hornik. "What Drives Package Authors to Participate in the R Project for Statistical Computing? Exploring Motivation, Values, and Work Design." National Academy of Sciences, 2015. http://epub.wu.ac.at/4702/1/cranpnas.pdf.

Full text
Abstract:
One of the cornerstones of the R system for statistical computing is the multitude of packages contributed by numerous package authors. This makes an extremely broad range of statistical techniques and other quantitative methods freely available. So far no empirical study has investigated psychological factors that drive authors to participate in the R project. This article presents a study of R package authors, collecting data on different types of participation (number of packages, participation in mailing lists, participation in conferences), three psychological scales (types of motivation, psychological values, and work design characteristics), as well as various sociodemographic factors. The data are analyzed using item response models and subsequent generalized linear models, showing that the most important determinants for participation are a hybrid form of motivation and the social characteristics of the work design. Other factors are found to have less impact or influence only specific aspects of participation. (authors' abstract)
APA, Harvard, Vancouver, ISO, and other styles
43

Kherrazi, Soufiane. "Les pratiques de contrôle managérial dans le contexte de l’innovation collaborative : le cas des consortiums de R&D européens sponsorisés." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASI004.

Full text
Abstract:
Cette recherche aborde la question du contrôle managérial (CM) de l'innovation collaborative. Elle se propose de contribuer à la littérature sur le CM en tenant compte à la fois des perspectives de contingence et de consistance interne pour examiner la mise en place du CM, en particulier dans le contexte des consortiums de R&D. L'ouverture des frontières de la R&D n’est pas sans conséquences sur le design et l'efficacité du CM. Elle implique des défis particuliers et soulève des tensions spécifiques entre les exigences de contrôle et les besoins d’innovation. Sur la base d'une enquête quantitative auprès de 232 firmes impliquées dans des consortiums européens de R&D sponsorisés et mobilisant une modélisation par équations structurelles, nous concevons un modèle de CM inter-firme permettant de soutenir l'innovation collaborative. Nos résultats montrent que l'écosystème de l'innovation joue un rôle essentiel en tant qu'élément institutionnel façonnant le design du CM. Nous mettons en évidence également sur la base de nos résultats que l'approche package est plus appropriée que l'approche système dans le contexte de l’innovation collaborative et permet, en outre, d’envisager plusieurs configurations du CM. Ces dernières sont amenées à s’ajuster en fonction des changements de l’environnement et de l’incertitude technologique. Les résultats font ressortir, enfin, les effets modérateurs des risques relationnels qui peuvent renforcer ou affaiblir l’efficacité du package de CM. L’efficacité du package semble être, en conséquence, liée à son « adéquation » avec le contexte de la collaboration plutôt qu'à sa « cohérence interne ».
This research addresses the issue of management control (MC) of collaborative innovation. It attempts to fill a gap in the MC literature by considering both contingency and internal-consistency perspectives to examine the MC setting, especially in the context of R&D consortia. Opening the boundaries of R&D has implications for MC design and effectiveness: it involves particular challenges and raises specific tensions between the competing demands of control and innovation. Based on a quantitative survey of 232 firms involved in sponsored European R&D consortia and using structural equation modeling, we design an inter-firm MC model to support collaborative innovation. Our results show that the innovation ecosystem plays a critical role as an institutional element shaping MC design. We also infer from our findings that the package approach is more suitable than the system approach for setting up control practices in a collaborative innovation context. Thus, the package allows several configurations of MC to face environmental change and technological uncertainty. We also highlight the moderating effects of relational risks, which may strengthen or dampen the benefits of the MC package. Accordingly, the package's effectiveness seems to be related to its "fit" with the collaboration context rather than to its "internal consistency".
APA, Harvard, Vancouver, ISO, and other styles
44

Brett, Craig 1965. "An interval mathematics package for computer-aided design in electromagnetics /." Thesis, McGill University, 1990. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=59532.

Full text
Abstract:
Recent developments in CAD for electromagnetic devices have utilized knowledge-based programming techniques to simplify the task of design process automation. The ultimate goal is to duplicate the role an expert designer performs in synthesizing devices such as transformers, motors, and actuators.
A quality missing from these CAD systems is a convenient representation of design space. Interval mathematics is proposed as a means of dealing with this elusive representation. The present work details an interval mathematics package that allows the user to place practical limits on certain parameters, enabling the program to deduce the valid design space of the device. This facility permits the system to recognize and eliminate impossible designs and to guide the novice designer through the true search space. Intervals may also be used to encode other types of expert knowledge, such as if-then rules and heuristic "rules of thumb" derived from years of design experience.
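The core idea in this abstract — representing design parameters as bounded intervals and intersecting user-imposed limits with physical ones to prune impossible designs — can be sketched compactly. The snippet below is a hedged, minimal illustration in Python, not the thesis's actual package; the class name, operations, and example parameters are hypothetical:

```python
from dataclasses import dataclass

# Minimal interval arithmetic: an interval [lo, hi] bounds a design parameter,
# and operations propagate the bounds conservatively.

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        # Sum of intervals: add the corresponding endpoints.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # Product of intervals: take min/max over all endpoint products.
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

    def intersect(self, other):
        """Tighten a design-space bound; None means no valid design remains."""
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Interval(lo, hi) if lo <= hi else None

# Example: a parameter constrained both by physics and by user-imposed limits.
physical = Interval(0.5, 10.0)
user = Interval(2.0, 4.0)
print(physical.intersect(user))  # Interval(lo=2.0, hi=4.0)
```

An empty intersection is exactly the "impossible design" signal the abstract describes: the system can discard that region of design space before any expensive analysis.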
APA, Harvard, Vancouver, ISO, and other styles
45

Liu, Hangcheng. "EXAMINING THE CONFIRMATORY TETRAD ANALYSIS (CTA) AS A SOLUTION OF THE INADEQUACY OF TRADITIONAL STRUCTURAL EQUATION MODELING (SEM) FIT INDICES." VCU Scholars Compass, 2018. https://scholarscompass.vcu.edu/etd/5565.

Full text
Abstract:
Structural Equation Modeling (SEM) is a framework of statistical methods that allows us to represent complex relationships between variables. SEM is widely used in economics, genetics, and the behavioral sciences (e.g., psychology, psychobiology, sociology, and medicine). Model complexity is defined as a model's ability to fit different data patterns, and it plays an important role in model selection when applying SEM. As in linear regression, the number of free model parameters is typically used in traditional SEM fit indices as a measure of model complexity. However, using only the number of free parameters to indicate SEM model complexity is crude, since other contributing factors, such as the type of constraint or the functional form, are ignored. To address this problem, a special technique, Confirmatory Tetrad Analysis (CTA), is examined. A tetrad refers to the difference of the products of certain covariances (or correlations) among four random variables. A structural equation model often implies that some tetrads should be zero; these model-implied zero tetrads are called vanishing tetrads. In CTA, goodness of fit can be determined by testing the null hypothesis that the model-implied vanishing tetrads are equal to zero. CTA can help improve model selection because different functional forms may affect the number of model-implied vanishing tetrads (t), and models not nested according to the traditional likelihood ratio test may be nested in terms of tetrads. In this dissertation, an R package was created to perform CTA, a two-step method was developed to determine SEM model complexity using simulated data, and it is demonstrated how the number of vanishing tetrads can help indicate SEM model complexity in some situations.
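The vanishing-tetrad idea described in this abstract can be illustrated with a short simulation. The sketch below is a hedged illustration (simulated data in Python, not code or results from the dissertation or its R package): under a one-factor model, the model-implied tetrad comes out close to zero.

```python
import random

# A tetrad is a difference of products of covariances among four variables:
#   tau_1234 = cov(x1, x3) * cov(x2, x4) - cov(x1, x4) * cov(x2, x3)
# A one-factor model implies this tetrad "vanishes" (is zero) in the population.

def cov(a, b):
    """Sample covariance of two equal-length numeric sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)

random.seed(42)
n = 20000
factor = [random.gauss(0, 1) for _ in range(n)]
# Four indicators of the same latent factor, each with independent noise.
x1, x2, x3, x4 = [[f + random.gauss(0, 0.5) for f in factor] for _ in range(4)]

tau = cov(x1, x3) * cov(x2, x4) - cov(x1, x4) * cov(x2, x3)
print(abs(tau) < 0.1)  # the model-implied vanishing tetrad is near zero
```

In CTA, the test statistic is built from sample tetrads like `tau`; a model whose implied vanishing tetrads are far from zero in the data is rejected.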
APA, Harvard, Vancouver, ISO, and other styles
46

Costa, Gildete Fernandes da. "Aspectos linguístico-ergonômicos em rótulos: avaliação da linguagem verbo-visual de rótulos de embalagens para alimentos achocolatados." Universidade Federal do Rio Grande do Norte, 2011. http://repositorio.ufrn.br:8080/jspui/handle/123456789/15021.

Full text
Abstract:
We live in a world of packaged products. Packages and labels help companies communicate with consumers, in addition to providing protection, storage, and convenience as products move along the value chain. Labels in particular add value that helps companies differentiate their products and increase brand equity among final consumers. However, the information given on labels is sometimes unclear, displaying a deficient verbal-visual language that results from poor visibility, legibility, and comprehensibility of verbal and visual signs. The aim of this research is to verify, from the consumers' point of view, the level of clarity of the informative texts and the harmony and ergonomic conformity of the package label of the Claralate powdered chocolate drink, considering the linguistic aspects present on the label. The criteria used to evaluate the selected package were, from the linguistic point of view, the organization and structure of the texts based on textual genre classification, and the clarity and comprehensibility of the language used on the label; and, from the ergonomic point of view, informational ergonomic conformity, based on the requirements of legibility, symbols, characters, reading fields, and line spacing. The field research, carried out in July 2007 and extended in July 2011, used a structured interview questionnaire administered to 118 consumers of powdered chocolate drinks who shop at one of two supermarkets in Floriano, Piauí: São Jorge and/or Super Quaresma. The main results show that the linguistic aspects of the label's informative texts only partially meet consumers' expectations, while attention to the informational ergonomic requirements analyzed could contribute to improving the information, and consequently its visual presentation, on the label of the package investigated.
As recommendations to the manufacturer, the research suggests: harmonizing the proportions of letters and numerals; enlarging the letters; making the visual information delimited by the reading fields easier to understand; and placing the expiry date in a more prominent position.
Vivemos num mundo de produtos embalados. A embalagem e os rótulos ajudam as empresas a se comunicarem com os consumidores e a fornecerem proteção, armazenagem e conveniência, à medida que os produtos se movimentam na cadeia de valor. Especialmente os rótulos adicionam um valor que auxilia as empresas a diferenciar seus produtos e a aumentar o valor da marca entre os consumidores finais. Porém, muitas vezes, as informações contidas nos rótulos das embalagens não são claras, apresentando uma linguagem verbo-visual deficiente resultante da má visibilidade, legibilidade e compreensibilidade de signos verbais e visuais. O objetivo dessa pesquisa é verificar, na visão dos consumidores, o nível de clareza dos textos informativos, harmonia e conformidade ergonômica do rótulo de embalagem do achocolatado em pó Claralate, considerando os aspectos linguísticos e ergonômicos presentes no rótulo do produto. Os critérios para avaliação da embalagem do achocolatado selecionado foram, do ponto de vista linguístico: a organização e estruturação dos textos a partir da classificação do gênero textual; a clareza e compreensão da linguagem utilizada no rótulo. Do ponto de vista da ergonomia, a conformidade ergonômica informacional, com base nos requisitos: legibilidade, símbolos, caracteres, campo de leitura e espaçamento de linhas. Para tanto, a pesquisa de campo realizada em julho de 2007 e, ampliada em julho de 2011, utilizou um questionário estruturado na entrevista a 118 consumidores de achocolatados que realizam suas compras em um dos dois supermercados de Floriano-PI: São Jorge e/ou Super Quaresma.
Os principais resultados da investigação mostram que os aspectos linguísticos presentes nos textos informativos do rótulo atendem parcialmente às expectativas dos consumidores, enquanto a consideração dos requisitos ergonômicos informacionais analisados podem vir a contribuir para o aperfeiçoamento das informações e consequente melhoria visual destas, no rótulo de embalagem do achocolatado investigado. Como recomendação à fabricante do produto, o resultado da pesquisa aponta: harmonizar a proporção das letras e algarismos; aumentar o tamanho das letras; tornar melhor compreensiva a informação visual determinada pelos campos de leitura; colocar a data de validade em local de melhor destaque.
APA, Harvard, Vancouver, ISO, and other styles
47

Bogered, Gustaf, and Christian Rundquist. "Management Control Systems as a Package and the Impact on Organizational Ambidexterity : A Case Study of a R&D Organization in a Swedish Medical Technology Company." Thesis, Linköpings universitet, Företagsekonomi, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-131052.

Full text
Abstract:
Background: The idea of MCSs operating together in a package is not new. However, few empirical studies have examined MCSs as a package, despite research showing the importance of doing so. The assessment of the performance implications of MCSPs has traditionally been limited to financial measures, while theory suggests that the performance of an MCSP ought to be assessed on a broader scale than conventional output measures. Organizational ambidexterity has been positively associated with a broad variety of performance measures, and it is therefore used as an assessment tool in this study in response to the need for performance measurement beyond financial output. Purpose: The purpose of this study is to describe the MCSP in two different phases in the case organization and assess how the MCSP in these two phases promotes the organization's ability to achieve organizational ambidexterity. Methodology: This study uses a qualitative research strategy and is limited to a single case study in an R&D organization within a Swedish medical technology company. Semi-structured qualitative interviews were used to collect empirical data. Conclusion: The MCSP in the two phases is composed of different MCSs that were found to be used differently. Within the MCSP in both phases, several linkages were revealed between control elements, and some MCSs function to achieve the purposes of other MCSs. This study further concludes that the MCSP of the current phase promotes organizational ambidexterity better than the MCSP of the previous phase, owing to a better balance between exploitation and exploration.
APA, Harvard, Vancouver, ISO, and other styles
48

Guitton, Yann. "Diversité des composés terpéniques volatils au sein du genre Lavandula : aspects évolutifs et physiologiques." Phd thesis, Université Jean Monnet - Saint-Etienne, 2010. http://tel.archives-ouvertes.fr/tel-00675866.

Full text
Abstract:
Lavender production contributes to the renown of the Rhône-Alpes region. Applications of lavender essential oil (EO) rely on the cultivation of three species (L. angustifolia, L. latifolia, and L. stoechas) and one hybrid (L. x intermedia) with marked chemotypes. The genus Lavandula is an ideal model for understanding the structure and origin of the diversity of volatile organic compounds (VOCs), in particular terpenes. Lavenders have the advantage of a wide distribution area spanning different bioclimatic regions and a limited number of species (39) with varied morphological and ecological characteristics. To characterize the diversity of the VOCs accumulated in the species of the genus and to consider their evolution, we analyzed (GC-MS) the VOCs of 29 species (some for the first time). As is often the case in plants, VOC production in lavender inflorescences is subject to spatio-temporal regulation. The differential emission of VOCs over time in L. angustifolia has been noted by growers, who observed differing EO quality depending on the maturity of the inflorescences at harvest. To model these variations and correlate them with stages of plant development, we analyzed, at the chemical (GC-FID) and molecular (qPCR) levels, the temporal variations of the main VOCs in leaves and inflorescences (over several years and cultivars). Upstream of this research on the VOCs of the genus Lavandula, various bioinformatics tools were developed, in particular the "MSeasy" module, which automates the retrieval of GC-MS data. This is a prerequisite for using lavender as a model for studying VOCs in the Lamiaceae.
APA, Harvard, Vancouver, ISO, and other styles
49

Dufresne, Isabelle. "Shelf-life and safety studies on rainbow trout fillets packaged under modified atmospheres." Thesis, McGill University, 1999. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=30371.

Full text
Abstract:
The combined effects of packaging atmosphere (air, vacuum, and gas packaging), film oxygen transmission rate (OTR), and storage temperature (4 and 12°C) on the shelf-life and safety of fresh rainbow trout fillets were investigated.
Preliminary studies were done to determine the optimum packaging atmospheres to maintain the bright pink color of trout packaged in a high gas barrier film. Both vacuum and gas packaging (85% CO2:15%N2) resulted in the longest shelf-life (~28 days) in terms of color at 4°C. Based on these optimum gas atmospheres for color, shelf-life studies were performed at both refrigerated and temperature abuse conditions (12°C).
Challenge studies were also done with Listeria monocytogenes and Clostridium botulinum type E, two psychrotrophic pathogens of concern in modified atmosphere packaged (MAP) fish.
Subsequent studies were done to determine the effect of various levels of headspace oxygen (0--100%, balance CO2) or film OTR on the time to toxicity in trout stored at 12°C. (Abstract shortened by UMI.)
APA, Harvard, Vancouver, ISO, and other styles
50

Yan, Donglin. "Bivariate Generalization of the Time-to-Event Conditional Reassessment Method with a Novel Adaptive Randomization Method." UKnowledge, 2018. https://uknowledge.uky.edu/epb_etds/18.

Full text
Abstract:
Phase I clinical trials in oncology aim to evaluate the toxicity risk of new therapies and identify a dose that is safe but also effective for future studies. Traditional Phase I trials of chemotherapies focus on estimating the maximum tolerated dose (MTD). The rationale for finding the MTD is that better therapeutic effects are expected at higher dose levels as long as the risk of severe toxicity is acceptable. With the advent of a new generation of cancer treatments, such as molecularly targeted agents (MTAs) and immunotherapies, higher dose levels no longer guarantee increased therapeutic effects, and the focus has shifted to estimating the optimal biological dose (OBD). The OBD is the dose level with the highest biological activity and acceptable toxicity. The search for the OBD requires joint evaluation of toxicity and efficacy. Although several seamless Phase I/II designs have been published in recent years, there is no consensus regarding an optimal design, and further improvement is needed before some designs can be widely used in practice. In this dissertation, we propose a modification to an existing seamless Phase I/II design by Wages and Tait (2015) for locating the OBD based on binary outcomes, and extend it to time-to-event (TITE) endpoints. While the original design showed promising results, we hypothesized that performance could be improved by replacing the original adaptive randomization stage with a different randomization strategy. We propose to calculate dose-assignment probabilities by averaging all candidate models that fit the observed data reasonably well, as opposed to the original design, which bases all calculations on the single best-fitting model. We propose three different strategies to select and average among candidate models, and simulations are used to compare the proposed strategies to the original design.
Under most scenarios, one of the proposed strategies allocates more patients to the optimal dose while improving accuracy in selecting the final optimal dose, without increasing the overall risk of toxicity. We further extend this design to TITE endpoints to address the issue of delayed outcomes. The original design is most appropriate when both toxicity and efficacy outcomes can be observed shortly after treatment, but delayed outcomes are common, especially for efficacy endpoints. The motivating example for this TITE extension is a Phase I/II study evaluating optimal dosing of all-trans retinoic acid (ATRA) in combination with a fixed dose of daratumumab in the treatment of relapsed or refractory multiple myeloma. The toxicity endpoint is observed within one cycle of therapy (i.e., 4 weeks), while the efficacy endpoint is assessed after 8 weeks of treatment. The difference in endpoint observation windows causes logistical challenges in conducting the trial, since it is not acceptable in practice to wait until both outcomes have been observed for each participant before sequentially assigning the dose of a newly eligible participant. The result would be delayed treatment for patients and an undesirably long trial duration. To address this issue, we generalize the time-to-event continual reassessment method (TITE-CRM) to bivariate outcomes with a potentially non-monotonic dose-efficacy relationship. Simulation studies show that the proposed TITE design maintains a probability of selecting the correct OBD similar to that of the original binary-outcome design, but the number of patients treated at the OBD decreases as the rate of enrollment increases. We also develop an R package for the proposed methods and document the R functions used in this research. The functions in this R package assist implementation of the proposed randomization strategy and design.
The input and output formats of these functions follow those of existing R packages such as "dfcrm" and "pocrm" to allow direct comparison of results. Input parameters include efficacy skeletons, the prior distribution of any model parameters, escalation restrictions, the design method, and observed data. Output includes the recommended dose level for the next patient, the MTD, estimated model parameters, and the estimated probabilities of each set of skeletons. Simulation functions are included in the package so that the proposed methods can be used to design a trial based on given parameters and to assess performance. Scenario parameters include the total sample size, the true dose-toxicity relationship, the true dose-efficacy relationship, the patient recruitment rate, and delays in toxicity and efficacy responses.
APA, Harvard, Vancouver, ISO, and other styles