Dissertations / Theses on the topic 'Test de permutation'

Consult the top 50 dissertations / theses for your research on the topic 'Test de permutation.'

1

Baranzano, Rosa. "Non-parametric kernel density estimation-based permutation test: Implementation and comparisons." Thesis, Uppsala universitet, Matematisk statistik, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-147052.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Eckerdal, Nils. "A permutation evaluation of the robustness of a high-dimensional test." Thesis, Uppsala universitet, Statistiska institutionen, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-352914.

Abstract:
The present thesis is a study of the robustness and performance of a test applicable in the high-dimensional context (𝑝>𝑛) whose components are unbiased statistics (U-statistics). This test (the U-test) has been shown to perform well under a variety of circumstances and can be adapted to any general linear hypothesis. However, the robustness of the test is largely unexplored. Here, a simulation study is performed, focusing particularly on violations of the assumptions the test is based on. For extended evaluation, the performance of the U-test is compared to its permutation counterpart. The simulations show that the U-test is robust, performing poorly only when the permutation test does so as well. It is also discussed that the U-test does not inevitably rest on the assumptions originally imposed on it.
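The permutation counterpart used as a benchmark in this study follows the standard recipe: pool the two samples, reshuffle them many times, and count how often the reshuffled statistic is at least as extreme as the observed one. A minimal sketch in Python, using a generic difference-of-means statistic rather than the thesis's U-statistic (sample sizes and data are illustrative):

```python
import numpy as np

def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Pools the two samples, repeatedly reshuffles the pooled values,
    and counts how often the permuted statistic is at least as
    extreme as the observed one (the +1 correction keeps the
    p-value strictly positive and valid).
    """
    rng = np.random.default_rng(seed)
    observed = abs(x.mean() - y.mean())
    pooled = np.concatenate([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(pooled[:n].mean() - pooled[n:].mean()) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Illustrative data: a clear mean shift between the two groups.
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 50)
y = rng.normal(1.5, 1.0, 50)
p = permutation_test(x, y)
print(p)
```

The test is exact at any sample size under exchangeability, which is why it serves as a natural robustness benchmark for the U-test.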
3

Albert, Mélisande. "Tests d’indépendance par bootstrap et permutation : étude asymptotique et non-asymptotique. Application en neurosciences." Thesis, Nice, 2015. http://www.theses.fr/2015NICE4079/document.

Abstract:
On the one hand, we construct such tests based on bootstrap and permutation approaches. Their asymptotic performance is studied in a point process framework through the analysis of the asymptotic behavior of the conditional distributions of both bootstrapped and permuted test statistics, under the null hypothesis as well as under any alternative. A simulation study verifies the usability of these tests in practice and compares them to classical methods in neuroscience. We then focus on the permutation tests, well known for their non-asymptotic level properties. Their p-values, based on the delayed coincidence count, are implemented in a multiple testing procedure, called the Permutation Unitary Events method, to detect synchronization occurrences between two neurons. The practical validity of the method is verified in a simulation study before being applied to real data. On the other hand, the non-asymptotic performance of the permutation tests is studied in terms of uniform separation rates. A new aggregated procedure based on a wavelet thresholding method is developed in the density framework. Based on Talagrand's fundamental inequalities, we provide a new Bernstein-type concentration inequality for randomly permuted sums. In particular, it allows us to upper bound the uniform separation rate of the aggregated procedure over weak Besov spaces and to deduce that this procedure seems to be optimal and adaptive in the minimax sense.
4

Yu, Li. "Tau-Path Test - A Nonparametric Test For Testing Unspecified Subpopulation Monotone Association." The Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1255657068.

5

Kačkina, Julija. "Svertinių rodiklių agregavimo lygmens parinkimas." Master's thesis, Lithuanian Academic Libraries Network (LABT), 2009. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2008~D_20090908_201759-89606.

Abstract:
In this thesis I summarize the problem of choosing between micro- and macro-models in linear forecasting. Aggregation is understood as sectoral aggregation, and the models belong to the class of univariate linear regressions. I derive a criterion for choosing between macro- and micro-models, and a test of perfect aggregation for linear aggregation with fixed and with random weights. In the latter case I recommend testing perfect aggregation with a permutation test. The results are illustrated with an economic example: the Lithuanian average wage is modelled both with an aggregated model and within individual sectors of economic activity. The analysis shows that the models are equivalent.
This paper focuses on the choice between macro and micro models. I suggest a hypothesis testing procedure for in-sample model selection for such variables as the average wage. Empirical results show that the Lithuanian average wage should be predicted using the aggregated model.
6

Zhang, Yan. "The impact of midbrain cauterize size on auditory and visual responses' distribution." unrestricted, 2009. http://etd.gsu.edu/theses/available/etd-04202009-145923/.

Abstract:
Thesis (M.S.)--Georgia State University, 2009.
Title from file title page. Yu-Sheng Hsu, committee chair; Xu Zhang, Sarah. L. Pallas, committee members. Description based on contents viewed June 12, 2009. Includes bibliographical references (p. 37). Appendix A: SAS code: p. 38-53.
7

Shadrokh, Ali. "Analyse comparative des tests de permutations en régression multiple et application à l'analyse de tableaux de distances." Université Joseph Fourier (Grenoble), 2006. http://www.theses.fr/2006GRE10084.

Abstract:
When the data generation process does not satisfy some of the assumptions founding statistical inference in the classic linear regression model, permutation tests offer a reliable nonparametric alternative for constructing distribution-free tests. The first application of the permutation test methodology to statistical inference on the simple linear regression model can be traced back to papers by Fisher (1935) and Pitman (1937a, b, 1938). This resampling method is founded on assumptions weaker than those of the classic parametric approach and easily checkable in practice: the exchangeability of the observations under the null hypothesis. There is general agreement concerning an appropriate permutation method yielding exact tests of hypotheses in the simple linear regression model. This is not the case, however, for the partial tests needed in multiple linear regression, where testing a null hypothesis concerning one partial regression coefficient becomes much trickier. The required exchangeability properties are no longer satisfied, and thus no exact test exists for that problem. Several asymptotically exact candidate methods have been proposed in that case. The main goal of our work is the comparison of permutation test strategies adapted to hypotheses of nullity of a partial regression coefficient in a linear regression model with p explanatory variables, conditionally on the information contained in the sample at hand. Four permutation test methods are compared, first on simulated data resorting to the double linear regression model, and then on theoretical grounds, in order to explore their unbiasedness properties as well as the hierarchy of their power functions. The results obtained are then extended to the general multiple linear regression setting. A final chapter supplements our research by focusing on inferential problems met when dealing with partial dependence structures between inter-point distance matrices of finite order.
We compared the adaptation of the four candidate permutation test strategies in this context, the specificity of which lies in the complexities induced by the dependence structure existing between elements of a distance matrix. We obtained results that revealed themselves quite different in this case from those obtained in the classic situation of linear regression applied to independent samples, which is the object of the simulations and formal developments presented in the first part of the thesis.
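One widely used candidate strategy for testing a partial regression coefficient, of the kind compared in this thesis, is the Freedman-Lane scheme: permute the residuals of the reduced model, add them back to the reduced-model fit, and refit the full model on each permuted response. A hedged sketch (the double-regression setup, variable names, and sample sizes are illustrative, not taken from the thesis):

```python
import numpy as np

def freedman_lane(y, x, z, n_perm=1000, seed=0):
    """Freedman-Lane style permutation test for the coefficient of x
    in y ~ 1 + x + z: residuals of the reduced model y ~ 1 + z are
    permuted, recombined with the reduced-model fit, and the full
    model is refit on each permuted response."""
    rng = np.random.default_rng(seed)

    def abs_coef_x(resp):
        # Refit the full model and return |coefficient of x|.
        X = np.column_stack([np.ones_like(resp), x, z])
        beta = np.linalg.lstsq(X, resp, rcond=None)[0]
        return abs(beta[1])

    # Reduced model: y regressed on the intercept and z only.
    Z = np.column_stack([np.ones_like(y), z])
    gamma = np.linalg.lstsq(Z, y, rcond=None)[0]
    fitted = Z @ gamma
    resid = y - fitted

    observed = abs_coef_x(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(resid)
        if abs_coef_x(fitted + resid) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Illustrative double regression: x and z are correlated,
# and x has a genuine partial effect on y.
rng = np.random.default_rng(6)
n = 80
z = rng.normal(0.0, 1.0, n)
x = 0.5 * z + rng.normal(0.0, 1.0, n)
y = 1.0 * x + 0.5 * z + rng.normal(0.0, 1.0, n)
p = freedman_lane(y, x, z)
print(p)
```

This test is only asymptotically exact, which is precisely why several competing permutation strategies exist for the partial-coefficient problem.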
8

ZHONG, WEI. "STATISTICAL APPROACHES TO ANALYZE CENSORED DATA WITH MULTIPLE DETECTION LIMITS." University of Cincinnati / OhioLINK, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1130204124.

9

Annica, Ivert. "Determining Attribute Importance Using an Ensemble of Genetic Programs and Permutation Tests : Relevansbestämning av attribut med hjälp av genetiska program och permutationstester." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-185260.

Abstract:
When classifying high-dimensional data, a lot can be gained, in terms of both computational time and precision, by only considering the most important features. Many feature selection methods are based on the assumption that important features are highly correlated with their corresponding classes, but mainly uncorrelated with each other. Often, this assumption can help eliminate redundancies and produce good predictors using only a small subset of features. However, when the predictability depends on interactions between the features, such methods will fail to produce satisfactory results. Also, since the suitability of the selected features depends on the learning algorithm in which they will be used, correlation-based filter methods might not be optimal when using genetic programs as the final classifiers, as they fail to capture the possibly complex relationships that are expressible by the genetic programming rules. In this thesis a method that can find important features, both independently and dependently discriminative, is introduced. This method works by performing two different types of permutation tests that classifies each of the features as either irrelevant, independently predictive or dependently predictive. The proposed method directly evaluates the suitability of the features with respect to the learning algorithm in question. Also, in contrast to computationally expensive wrapper methods that require several subsets of features to be evaluated, a feature classification can be obtained after only one single pass, even though the time required does equal the training time of the classifier. The evaluation shows that the attributes chosen by the permutation tests always yield a classifier at least as good as the one obtained when all attributes are used during training - and often better. The proposed method also fares well when compared to other attribute selection methods such as RELIEFF and CFS.
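The core idea of permuting one attribute's values across examples to measure its contribution can be sketched independently of genetic programming. The following simplified Python sketch uses a nearest-centroid stand-in classifier (an assumption made for illustration; the thesis uses an ensemble of genetic programs and two kinds of permutation tests that additionally separate dependently predictive attributes):

```python
import numpy as np

def centroid_accuracy(train_X, train_y, test_X, test_y):
    """Nearest-centroid accuracy -- a deliberately simple stand-in
    for the genetic-program classifiers used in the thesis."""
    c0 = train_X[train_y == 0].mean(axis=0)
    c1 = train_X[train_y == 1].mean(axis=0)
    d0 = ((test_X - c0) ** 2).sum(axis=1)
    d1 = ((test_X - c1) ** 2).sum(axis=1)
    return ((d1 < d0).astype(int) == test_y).mean()

def permutation_importance(X, y, n_perm=100, seed=0):
    """Average drop in accuracy when one feature's values are
    shuffled across examples, breaking that feature's tie to the
    class while leaving all other features intact."""
    rng = np.random.default_rng(seed)
    base = centroid_accuracy(X, y, X, y)
    drops = []
    for j in range(X.shape[1]):
        acc = 0.0
        for _ in range(n_perm):
            Xp = X.copy()
            col = Xp[:, j].copy()
            rng.shuffle(col)              # permute feature j only
            Xp[:, j] = col
            acc += centroid_accuracy(X, y, Xp, y)  # evaluate, no retraining
        drops.append(base - acc / n_perm)
    return np.array(drops)

# Illustrative data: one informative feature, one pure-noise feature.
rng = np.random.default_rng(3)
y = rng.integers(0, 2, 200)
informative = y + rng.normal(0.0, 0.5, 200)
noise = rng.normal(0.0, 1.0, 200)
X = np.column_stack([informative, noise])
drops = permutation_importance(X, y)
print(drops)
```

A large drop marks an attribute the classifier genuinely relies on; a near-zero drop marks an irrelevant one.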
10

Fu, Min. "A RESAMPLING BASED APPROACH IN EVALUATION OF DOSE-RESPONSE MODELS." Diss., Temple University Libraries, 2014. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/300992.

Abstract:
Statistics
Ph.D.
In this dissertation, we propose a computational approach using a resampling-based permutation test as an alternative to MCP-Mod (a hybrid framework integrating the multiple comparison procedure and the modeling technique) and gMCP-Mod (generalized MCP-Mod) [11], [29] in the step of identifying significant dose-response signals via model selection. We name our proposed approach RMCP-Mod or gRMCP-Mod correspondingly. RMCP-Mod/gRMCP-Mod transforms the drug dose comparisons into a dose-response model selection issue via multiple hypothesis testing, an area where little extended research has been done, and solves it using resampling-based multiple testing procedures [38]. The proposed approach avoids the inclusion of the prior dose-response knowledge known as "guesstimates" used in the model selection step of the MCP-Mod/gMCP-Mod framework, and therefore reduces the uncertainty in identifying significant models. When a new drug is being developed to treat patients with a specified disease, one of the key steps is to discover an optimal drug dose or doses that would produce the desired clinical effect with an acceptable level of toxicity. In order to find such a dose or doses (different doses may be able to produce the same or better clinical effect with similar acceptable toxicity), the underlying dose-response signals need to be identified and thoroughly examined through statistical analyses. A dose-response signal refers to the fact that a drug has different clinical effects at many quantitative dose levels. Statistically speaking, the dose-response signal is a numeric relationship curve (shape) between drug doses and the clinical effects in quantitative measures.
It has often been a challenge to find correct and accurate efficacy and/or safety dose-response signals that would best describe the dose-effect relationship in the drug development process via conventional statistical methods, because the conventional methods tend either to focus on a fixed, small number of quantitative dosages or to evaluate multiple pre-defined dose-response models without Type I error control. In searching for more efficient methods, a framework combining both the multiple comparisons procedure (MCP) and model-based (Mod) techniques, acronymed MCP-Mod, was developed by F. Bretz, J. C. Pinheiro, and M. Branson [11] to handle normally distributed, homoscedastic dose-response observations. Subsequently, a generalized version of MCP-Mod named gMCP-Mod, which can additionally deal with binary, count, or time-to-event dose-response data as well as repeated measurements over time, was developed by J. C. Pinheiro, B. Bornkamp, E. Glimm and F. Bretz [29]. MCP-Mod/gMCP-Mod uses the "guesstimates" in the MCP step to pre-specify parameters of the candidate models; however, in situations where prior knowledge of the dose-response information is difficult to obtain, uncertainty can be introduced into the model selection process, affecting the correctness of the model identification. Throughout the evaluation of its application to hypothetical and real study examples, as well as simulation comparisons to MCP-Mod/gMCP-Mod, our proposed approach RMCP-Mod/gRMCP-Mod appears to be a viable method for use in practice, though further improvements and research are still needed for applications to broader dose-response data types.
Temple University--Theses
11

Somon, Bertille. "Corrélats neuro-fonctionnels du phénomène de sortie de boucle : impacts sur le monitoring des performances." Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAS042/document.

Abstract:
The ongoing technological mutations occurring in aeronautics have profoundly changed the interactions between humans and machines. Systems are more and more complex, automated and opaque. Several tragedies have reminded us that the supervision of those systems by human operators is still a challenge. In particular, evidence shows that automation has driven operators away from the control loop of the system, creating an out-of-the-loop phenomenon (OOL). This phenomenon is characterized by a decrease in situation awareness and vigilance, but also by complacency and over-reliance towards automated systems. These difficulties have been shown to result in a degradation of the operator's performance. The OOL phenomenon is thus a major issue for improving today's human-machine interactions. Even though it has been studied for several decades, the OOL is still difficult to characterize, and even more to predict. The aim of this thesis is to determine how theories from cognitive neuroscience, such as performance monitoring, can be used to better characterize the OOL phenomenon and the operator's state, particularly through physiological measures. Consequently, we used electroencephalographic (EEG) activity to try to identify markers and/or precursors of the supervision activity during system monitoring. In a first step we evaluated the error detection, or performance monitoring, activity through standard laboratory tasks with varying levels of difficulty. We performed two EEG studies showing that: (i) the performance monitoring activity emerges both for the detection of our own errors and during the supervision of another agent, be it a human agent or an automated system, and (ii) the performance monitoring activity is significantly decreased by increasing task difficulty.
These results led us to develop another experiment to assess the brain activity associated with system supervision in an ecological environment, resembling everyday aeronautical system monitoring. Thanks to adapted signal processing techniques (e.g. trial-by-trial time-frequency decomposition), we were able to show that there is: (i) a fronto-central θ activity time-locked to the system's decision, similar to the one obtained in laboratory conditions, (ii) a decrease in overall supervision activity time-locked to the system's decision, and (iii) a specific decrease of monitoring activity for errors. In this thesis, several EEG measures have been used in order to adapt to the context at hand. As a perspective, we have developed a final study aiming at characterizing the evolution of the monitoring activity during the OOL. Finding markers of this degradation would make it possible to monitor its emergence, or even to predict it.
12

Eklund, Anders. "Computational Medical Image Analysis : With a Focus on Real-Time fMRI and Non-Parametric Statistics." Doctoral thesis, Linköpings universitet, Medicinsk informatik, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-76120.

Abstract:
Functional magnetic resonance imaging (fMRI) is a prime example of multi-disciplinary research. Without the beautiful physics of MRI, there would not be any images to look at in the first place. To obtain images of good quality, it is necessary to fully understand the concepts of the frequency domain. The analysis of fMRI data requires understanding of signal processing, statistics and knowledge about the anatomy and function of the human brain. The resulting brain activity maps are used by physicians, neurologists, psychologists and behaviourists, in order to plan surgery and to increase their understanding of how the brain works. This thesis presents methods for real-time fMRI and non-parametric fMRI analysis. Real-time fMRI places high demands on the signal processing, as all the calculations have to be made in real-time in complex situations. Real-time fMRI can, for example, be used for interactive brain mapping. Another possibility is to change the stimulus that is given to the subject, in real-time, such that the brain and the computer can work together to solve a given task, yielding a brain computer interface (BCI). Non-parametric fMRI analysis, for example, concerns the problem of calculating significance thresholds and p-values for test statistics without a parametric null distribution. Two BCIs are presented in this thesis. In the first BCI, the subject was able to balance a virtual inverted pendulum by thinking of activating the left or right hand or resting. In the second BCI, the subject in the MR scanner was able to communicate with a person outside the MR scanner, through a virtual keyboard. A graphics processing unit (GPU) implementation of a random permutation test for single subject fMRI analysis is also presented. The random permutation test is used to calculate significance thresholds and p-values for fMRI analysis by canonical correlation analysis (CCA), and to investigate the correctness of standard parametric approaches.
The random permutation test was verified by using 10 000 noise datasets and 1484 resting state fMRI datasets. The random permutation test is also used for a non-local CCA approach to fMRI analysis.
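The core of a random permutation test for significance thresholds can be sketched on a single simulated voxel: permute the timecourse to break its relation to the paradigm, recompute the statistic, and take a quantile of the resulting null distribution. A simplified sketch using plain correlation instead of CCA, and ignoring the temporal autocorrelation that the thesis treats with far more care (all names and numbers are illustrative):

```python
import numpy as np

def permutation_threshold(signal, design, n_perm=1000, alpha=0.05, seed=0):
    """(1 - alpha) quantile of the permutation null distribution of
    the absolute correlation between a voxel timecourse and the
    paradigm.  Shuffling timepoints ignores temporal autocorrelation,
    which a serious fMRI analysis must model."""
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    sig = signal.copy()
    for i in range(n_perm):
        rng.shuffle(sig)
        null[i] = abs(np.corrcoef(sig, design)[0, 1])
    return np.quantile(null, 1.0 - alpha)

# Illustrative boxcar paradigm and one "active" voxel.
rng = np.random.default_rng(4)
design = np.tile([0.0] * 10 + [1.0] * 10, 5)   # 100 time points
active = 1.5 * design + rng.normal(0.0, 1.0, design.size)
thr = permutation_threshold(active, design)
observed = abs(np.corrcoef(active, design)[0, 1])
print(thr, observed)
```

Run per voxel (or on the maximum statistic over voxels, for family-wise control), this is the computation the thesis offloads to the GPU.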
13

Mu, Zhiqiang. "Comparing the Statistical Tests for Homogeneity of Variances." Digital Commons @ East Tennessee State University, 2006. https://dc.etsu.edu/etd/2212.

Abstract:
Testing the homogeneity of variances is an important problem in many applications since statistical methods of frequent use, such as ANOVA, assume equal variances for two or more groups of data. However, testing the equality of variances is a difficult problem due to the fact that many of the tests are not robust against non-normality. It is known that the kurtosis of the distribution of the source data can affect the performance of the tests for variance. We review the classical tests and their latest, more robust modifications, some other tests that have recently appeared in the literature, and use bootstrap and permutation techniques to test for equal variances. We compare the performance of these tests under different types of distributions, sample sizes and true ratios of variances of the populations. Monte-Carlo methods are used in this study to calculate empirical powers and type I errors under different settings.
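A permutation approach to testing equal variances of the kind compared in this study can be sketched by combining a Brown-Forsythe (median-based Levene) statistic with permutation resampling. A hedged Python sketch; note that permuting the raw observations is exact only under the stronger null of identical group distributions, and the study's specific test variants may differ:

```python
import numpy as np

def levene_statistic(groups):
    """Brown-Forsythe variant: one-way ANOVA F statistic computed on
    absolute deviations from each group's median."""
    devs = [np.abs(g - np.median(g)) for g in groups]
    all_d = np.concatenate(devs)
    grand = all_d.mean()
    k, n = len(devs), len(all_d)
    between = sum(len(d) * (d.mean() - grand) ** 2 for d in devs) / (k - 1)
    within = sum(((d - d.mean()) ** 2).sum() for d in devs) / (n - k)
    return between / within

def permutation_levene(groups, n_perm=1000, seed=0):
    """P-value from reallocating the pooled observations to groups of
    the original sizes and recomputing the statistic each time."""
    rng = np.random.default_rng(seed)
    observed = levene_statistic(groups)
    pooled = np.concatenate(groups)
    splits = np.cumsum([len(g) for g in groups])[:-1]
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if levene_statistic(np.split(pooled, splits)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Illustrative data with clearly unequal variances.
rng = np.random.default_rng(2)
groups = [rng.normal(0.0, 1.0, 30), rng.normal(0.0, 3.0, 30)]
p = permutation_levene(groups)
print(p)
```

Wrapping the statistic in a resampling loop like this is what makes the comparison robust to the kurtosis effects discussed in the abstract.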
14

Bobba, Srinivas. "The Impact of the COVID-19 Lockdown on the Urban Air Quality: A Machine Learning Approach." Thesis, Högskolan Dalarna, Institutionen för information och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:du-37493.

Abstract:
‘‘SARS-CoV-2’’, which is responsible for the current pandemic of COVID-19 disease, was first reported from Wuhan, China, on 31 December 2019. Since then, to prevent its propagation around the world, a set of rapid and strict countermeasures has been taken. While most researchers who studied the effect of the COVID-19 lockdown on air quality concluded that pollution was reduced, the most reliable methods for quantifying the reduction of pollutants in the air are still debated. In this study, we analysed how COVID-19 lockdown procedures impacted the air quality in selected cities around the world, i.e. New Delhi, Diepkloof, Wuhan, and London. The results show that the air quality index (AQI) improved by 43% in New Delhi, 18% in Wuhan, 15% in Diepkloof, and 12% in London during the initial lockdown from 19 March 2020 to 31 May 2020 compared with the four-year pre-lockdown period. Furthermore, the concentrations of four main pollutants, i.e., NO2, CO, SO2, and PM2.5, were analysed before and during the lockdown in India. The quantification of the pollution drop is supported by statistical tests, namely the ANOVA test and the permutation test. Overall, decreases of 58%, 61%, 18%, and 55% are observed in NO2, CO, SO2, and PM2.5 concentrations, respectively. To check whether a change in weather played any role in the reduction of pollution levels, we analysed how weather factors are correlated with pollutants using a correlation matrix. Finally, machine learning regression models are constructed to assess the lockdown impact on air quality in India by incorporating weather data. Gradient Boosting performed well in predicting the drop in PM2.5 concentration for individual cities in India.
Comparing the feature importance rankings of the regression models with the correlation factors for PM2.5 supports these findings. This study concludes that the COVID-19 lockdown had a significant effect on the natural environment and air quality improvement.
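The permutation test used alongside ANOVA in the abstract above can be sketched in a few lines. This is an illustrative Python sketch, not code from the thesis: the hypothetical `before`/`during` samples stand in for pollutant concentrations, and period labels are randomly reassigned to build the null distribution of the mean difference.

```python
import random

def permutation_test(before, during, n_perm=10_000, seed=0):
    """Two-sample permutation test for a drop in mean concentration.

    Returns a one-sided p-value for H1: mean(during) < mean(before),
    estimated by randomly reassigning observations to the two periods.
    """
    rng = random.Random(seed)
    observed = sum(during) / len(during) - sum(before) / len(before)
    pooled = list(before) + list(during)
    n_during = len(during)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_during = pooled[:n_during]
        perm_before = pooled[n_during:]
        diff = sum(perm_during) / n_during - sum(perm_before) / len(perm_before)
        if diff <= observed:
            hits += 1
    # add-one correction keeps the p-value strictly positive
    return (hits + 1) / (n_perm + 1)
```

With a clear drop between the two periods the p-value is small; when both samples come from the same period it hovers around 0.5, as expected for a one-sided test under the null.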
APA, Harvard, Vancouver, ISO, and other styles
15

Zeileis, Achim, and Torsten Hothorn. "Permutation Tests for Structural Change." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 2006. http://epub.wu.ac.at/1182/1/document.pdf.

Full text
Abstract:
The supLM test for structural change is embedded into a permutation test framework for a simple location model. The resulting conditional permutation distribution is compared to the usual (unconditional) asymptotic distribution, showing that the power of the test can be clearly improved in small samples. Furthermore, generalizations are discussed for binary and multivariate dependent variables as well as model-based permutation testing for structural change. The procedures suggested are illustrated using both artificial and real-world data (number of youth homicides, employment discrimination data, structural-change publications, and stock returns).
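The idea of calibrating a structural-change statistic by permutation, as in the abstract above, can be illustrated for a simple location model. The sketch below is ours, not the paper's: a max-type breakpoint statistic is recomputed over random reorderings of the series, which are exchangeable under the null of no change in location.

```python
import random

def max_break_stat(y):
    """Maximum scaled absolute mean difference over candidate breakpoints."""
    n = len(y)
    best = 0.0
    for k in range(5, n - 5):  # trim the extremes of the sample
        left = sum(y[:k]) / k
        right = sum(y[k:]) / (n - k)
        best = max(best, abs(left - right) * (k * (n - k) / n) ** 0.5)
    return best

def perm_structural_change(y, n_perm=2000, seed=1):
    """Permutation p-value for a mean shift at an unknown time.

    Under the null of a constant location model the observations are
    exchangeable, so shuffling the series yields the conditional null
    distribution of the max-type statistic.
    """
    rng = random.Random(seed)
    observed = max_break_stat(y)
    y = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y)
        if max_break_stat(y) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

A series with a genuine level shift yields a small p-value, while a series whose arrangement shows no clustering of high and low values does not.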
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
16

Vadják, Šimon. "Statistické vyhodnocení fylogeneze biologických sekvencí." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2014. http://www.nusl.cz/ntk/nusl-221143.

Full text
Abstract:
The master's thesis provides a comprehensive overview of resampling methods for testing the correctness of the topology of phylogenetic trees, which estimate the process of phylogeny on the basis of the similarity of biological sequences. We focused on how errors can arise in this estimate and on the possibility of detecting and removing them. The methods Bootstrapping, jackknifing, OTU jackknifing, and the PTP test (Permutation tail probability) were implemented in Matlab. The work aims to test their applicability to various biological sequences and to assess the impact of the choice of input analysis parameters on the results of these statistical tests.
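The resampling step behind bootstrap support values can be sketched as follows. This is an illustrative Python version (the thesis implements these methods in Matlab), showing only the column-resampling stage that precedes re-estimating a tree from each pseudo-alignment.

```python
import random

def bootstrap_columns(alignment, n_boot=100, seed=0):
    """Generate bootstrap pseudo-alignments by resampling columns.

    `alignment` is a list of equal-length aligned sequences; each
    replicate keeps the same taxa but draws alignment columns with
    replacement - the standard resampling step behind bootstrap
    support values for tree topologies.
    """
    rng = random.Random(seed)
    n_cols = len(alignment[0])
    replicates = []
    for _ in range(n_boot):
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        replicates.append(["".join(seq[c] for c in cols) for seq in alignment])
    return replicates
```

Every column of a replicate is a copy of some original column, so character covariation across taxa is preserved while column frequencies vary between replicates.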
APA, Harvard, Vancouver, ISO, and other styles
17

Strasser, Helmut, and Christian Weber. "On the Asymptotic Theory of Permutation Statistics." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 1999. http://epub.wu.ac.at/102/1/document.pdf.

Full text
Abstract:
In this paper limit theorems for the conditional distributions of linear test statistics are proved. The assertions are conditioned by the sigma-field of permutation symmetric sets. Limit theorems are proved both for the conditional distributions under the hypothesis of randomness and under general contiguous alternatives with independent but not identically distributed observations. The proofs are based on results on limit theorems for exchangeable random variables by Strasser and Weber. The limit theorems under contiguous alternatives are consequences of an LAN-result for likelihood ratios of symmetrized product measures. The results of the paper have implications for statistical applications. By example it is shown that minimum variance partitions which are defined by observed data (e.g. by LVQ) lead to asymptotically optimal adaptive tests for the k-sample problem. As another application it is shown that conditional k-sample tests which are based on data-driven partitions lead to simple confidence sets which can be used for the simultaneous analysis of linear contrasts. (author's abstract)
Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
APA, Harvard, Vancouver, ISO, and other styles
18

Mahmoud, Hamdy Fayez Farahat. "Some Advanced Semiparametric Single-index Modeling for Spatially-Temporally Correlated Data." Diss., Virginia Tech, 2014. http://hdl.handle.net/10919/76744.

Full text
Abstract:
Semiparametric modeling is a hybrid of the parametric and nonparametric modelings where some function forms are known and others are unknown. In this dissertation, we have made several contributions to semiparametric modeling based on the single index model related to the following three topics: the first is to propose a model for detecting change points simultaneously with estimating the unknown function; the second is to develop two models for spatially correlated data; and the third is to further develop two models for spatially-temporally correlated data. To address the first topic, we propose a unified approach in its ability to simultaneously estimate the nonlinear relationship and change points. We propose a single index change point model as our unified approach by adjusting for several other covariates. We nonparametrically estimate the unknown function using kernel smoothing and also provide a permutation based testing procedure to detect multiple change points. We show the asymptotic properties of the permutation testing based procedure. The advantage of our approach is demonstrated using the mortality data of Seoul, Korea from January, 2000 to December, 2007. On the second topic, we propose two semiparametric single index models for spatially correlated data. One additively separates the nonparametric function and spatially correlated random effects, while the other does not separate the nonparametric function and spatially correlated random effects. We estimate these two models using two algorithms based on Markov Chain Expectation Maximization algorithm. Our approaches are compared using simulations, suggesting that the semiparametric single index nonadditive model provides more accurate estimates of spatial correlation. The advantage of our approach is demonstrated using the mortality data of six cities, Korea from January, 2000 to December, 2007. 
The third topic involves proposing two semiparametric single index models for spatially and temporally correlated data. In our first model, the nonparametric function is separable from the spatially and temporally correlated random effects; we refer to it as the "semiparametric spatio-temporal separable single index model (SSTS-SIM)". The second model does not separate the nonparametric function from the spatially correlated random effects but does separate the time random effects; we refer to it as the "semiparametric nonseparable single index model (SSTN-SIM)". Two algorithms based on the Markov Chain Expectation Maximization algorithm are introduced to simultaneously estimate parameters, spatial effects, and time effects. The proposed models are then applied to the mortality data of six major cities in Korea. Our results suggest that SSTN-SIM is more flexible than SSTS-SIM because it can estimate various nonparametric functions, while SSTS-SIM enforces similar nonparametric curves. SSTN-SIM also provides better estimation and prediction.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
19

Braun, Thomas Michael. "Optimal analysis of group randomized trials with permutation tests /." Thesis, Connect to this title online; UW restricted, 1999. http://hdl.handle.net/1773/9589.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Hothorn, Torsten, Kurt Hornik, de Wiel Mark A. van, and Achim Zeileis. "Implementing a Class of Permutation Tests: The coin Package." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 2007. http://epub.wu.ac.at/408/1/document.pdf.

Full text
Abstract:
The R package coin implements a unified approach to permutation tests providing a huge class of independence tests for nominal, ordered, numeric, and censored data as well as multivariate data at mixed scales. Based on a rich and flexible conceptual framework that embeds different permutation test procedures into a common theory, a computational framework is established in coin that likewise embeds the corresponding R functionality in a common S4 class structure with associated generic functions. As a consequence, the computational tools in coin inherit the flexibility of the underlying theory and conditional inference functions for important special cases can be set up easily. Conditional versions of classical tests - such as tests for location and scale problems in two or more samples, independence in two- or three-way contingency tables, or association problems for censored, ordered categorical or multivariate data - can easily be implemented as special cases using this computational toolbox by choosing appropriate transformations of the observations. The paper gives a detailed exposition of both the internal structure of the package and the provided user interfaces.
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
21

Zeileis, Achim, Mark A. van de Wiel, Kurt Hornik, and Torsten Hothorn. "Implementing a Class of Permutation Tests: The coin Package." American Statistical Association, 2008. http://epub.wu.ac.at/4004/1/class.pdf.

Full text
Abstract:
The R package coin implements a unified approach to permutation tests providing a huge class of independence tests for nominal, ordered, numeric, and censored data as well as multivariate data at mixed scales. Based on a rich and flexible conceptual framework that embeds different permutation test procedures into a common theory, a computational framework is established in coin that likewise embeds the corresponding R functionality in a common S4 class structure with associated generic functions. As a consequence, the computational tools in coin inherit the flexibility of the underlying theory and conditional inference functions for important special cases can be set up easily. Conditional versions of classical tests - such as tests for location and scale problems in two or more samples, independence in two- or three-way contingency tables, or association problems for censored, ordered categorical or multivariate data - can easily be implemented as special cases using this computational toolbox by choosing appropriate transformations of the observations. The paper gives a detailed exposition of both the internal structure of the package and the provided user interfaces along with examples on how to extend the implemented functionality. (authors' abstract)
APA, Harvard, Vancouver, ISO, and other styles
22

Grusea, Simona. "Applications du calcul des probabilités à la recherche de régions génomiques conservées." Phd thesis, Université de Provence - Aix-Marseille I, 2008. http://tel.archives-ouvertes.fr/tel-00377445.

Full text
Abstract:
This thesis focuses on several topics in probability and statistics related to comparative genomics. In the first part we present a compound Poisson approximation for computing probabilities involved in statistical tests of the significance of conserved genomic regions found by a reference-region approach.
An important aspect of our approach is that it takes into account the existence of multigene families. In the second part we propose three measures, based on the transposition distance in the symmetric group, for quantifying the exceptionality of the gene order in conserved genomic regions. We obtain analytic expressions for their distributions in the case of a random permutation. In the third part we study the distribution of the number of cycles in the breakpoint graph of a random signed permutation. We use the "Markov chain imbedding" technique to obtain this distribution in terms of a product of transition matrices of a certain finite Markov chain. Knowledge of this distribution in turn provides a very good approximation to the distribution of the reversal distance.
APA, Harvard, Vancouver, ISO, and other styles
23

Rahnenführer, Jörg. "Multivariate permutation tests for the k-sample problem with clustered data." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 1999. http://epub.wu.ac.at/1364/1/document.pdf.

Full text
Abstract:
The present paper deals with the choice of clustering algorithms before treating a k-sample problem. We investigate multivariate data sets that are quantized by algorithms that define partitions by maximal support planes (MSP) of a convex function. These algorithms belong to a wide class containing as special cases both the well known k-means algorithm and the Kohonen (1985) algorithm and have been profoundly investigated by Pötzelberger and Strasser (1999). For computing the test statistics for the k-sample problem we replace the data points by their conditional expectations with respect to the MSP-partition. We present Monte Carlo simulations of power functions of different tests for the k-sample problem, where the tests are carried out as multivariate permutation tests to ensure that they hold their level. The results presented show that there seems to be a vital and decisive connection between the optimal choice of the clustering algorithm and the tails of the probability distribution of the data. Especially for distributions with heavy tails like the exponential distribution the performance of tests based on a quadratic convex function with k-means type partitions totally breaks down. (author's abstract)
Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
APA, Harvard, Vancouver, ISO, and other styles
24

Hothorn, Torsten, Kurt Hornik, Mark A. van de Wiel, and Achim Zeileis. "A Lego System for Conditional Inference." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 2005. http://epub.wu.ac.at/886/1/document.pdf.

Full text
Abstract:
Conditioning on the observed data is an important and flexible design principle for statistical test procedures. Although generally applicable, permutation tests currently in use are limited to the treatment of special cases, such as contingency tables or K-sample problems. A new theoretical framework for permutation tests opens up the way to a unified and generalized view. We argue that the transfer of such a theory to practical data analysis has important implications in many applications and requires tools that enable the data analyst to compute on the theoretical concepts as closely as possible. We re-analyze four data sets by adapting the general conceptual framework to these non-standard inference procedures and utilizing the coin add-on package in the R system for statistical computing to show what one can gain from going beyond the `classical' test procedures.
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
25

Winkler, Anderson M. "Widening the applicability of permutation inference." Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:ce166876-0aa3-449e-8496-f28bf189960c.

Full text
Abstract:
This thesis is divided into three main parts. In the first, we discuss that, although permutation tests can provide exact control of false positives under the reasonable assumption of exchangeability, there are common examples in which global exchangeability does not hold, such as in experiments with repeated measurements or tests in which subjects are related to each other. To allow permutation inference in such cases, we propose an extension of the well known concept of exchangeability blocks, allowing these to be nested in a hierarchical, multi-level definition. This definition allows permutations that retain the original joint distribution unaltered, thus preserving exchangeability. The null hypothesis is tested using only a subset of all otherwise possible permutations. We do not need to explicitly model the degree of dependence between observations; rather the use of such permutation scheme leaves any dependence intact. The strategy is compatible with heteroscedasticity and can be used with permutations, sign flippings, or both combined. In the second part, we exploit properties of test statistics to obtain accelerations irrespective of generic software or hardware improvements. We compare six different approaches using synthetic and real data, assessing the methods in terms of their error rates, power, agreement with a reference result, and the risk of taking a different decision regarding the rejection of the null hypotheses (known as the resampling risk). In the third part, we investigate and compare the different methods for assessment of cortical volume and area from magnetic resonance images using surface-based methods. 
Using data from young adults born with very low birth weight and coetaneous controls, we show that instead of volume, the permutation-based non-parametric combination (NPC) of thickness and area is a more sensitive option for studying joint effects on these two quantities, giving equal weight to variation in both, and allowing a better characterisation of biological processes that can affect brain morphology.
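The sign-flipping variant of permutation inference mentioned in the first part of the abstract can be sketched for the one-sample case. This is an illustrative Python sketch, not code from the thesis: under the null and an assumption of symmetric errors about zero, each observation's sign is exchangeable, so flipping signs at random generates the null distribution of the mean.

```python
import random

def sign_flip_test(deltas, n_flips=5000, seed=0):
    """One-sample test via random sign flipping.

    Assuming errors symmetric about zero under the null, each sign
    is exchangeable; flipping signs at random gives the conditional
    null distribution of the sum. Returns a two-sided p-value.
    """
    rng = random.Random(seed)
    observed = abs(sum(deltas))
    hits = 0
    for _ in range(n_flips):
        flipped = sum(d if rng.random() < 0.5 else -d for d in deltas)
        if abs(flipped) >= observed:
            hits += 1
    return (hits + 1) / (n_flips + 1)
```

Consistently positive effects produce a small p-value; a sample balanced around zero yields a p-value of one, since every flipped sum is at least as extreme as the observed zero.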
APA, Harvard, Vancouver, ISO, and other styles
26

Mair, Patrick, Ingwer Borg, and Thomas Rusch. "Goodness-of-Fit Assessment in Multidimensional Scaling and Unfolding." Taylor & Francis Group, 2016. http://epub.wu.ac.at/5354/1/mairetal2016.pdf.

Full text
Abstract:
Judging goodness of fit in multidimensional scaling requires a comprehensive set of diagnostic tools instead of relying on stress rules of thumb. This article elaborates on corresponding strategies and gives practical guidelines for researchers to obtain a clear picture of the goodness of fit of a solution. Special emphasis will be placed on the use of permutation tests. The second part of the article focuses on goodness-of-fit assessment of an important variant of multidimensional scaling called unfolding, which can be applied to a broad range of psychological data settings. Two real-life data sets are presented in order to walk the reader through the entire set of diagnostic measures, tests, and plots. R code is provided as supplementary information that makes the whole goodness-of-fit assessment workflow, as presented in this article, fully reproducible.
APA, Harvard, Vancouver, ISO, and other styles
27

Horstman, Benjamin Philip. "Detecting Epistasis Effect in Genome-Wide Association Studies Based on Permutation Tests and Ensemble Approaches." Cleveland, Ohio : Case Western Reserve University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=case1270577390.

Full text
Abstract:
Thesis (Master of Sciences (Engineering))--Case Western Reserve University, 2010
Department of EECS - Computer and Information Sciences; Title from PDF (viewed on 2010-05-25); Includes abstract; Includes bibliographical references and appendices; Available online via the OhioLINK ETD Center
APA, Harvard, Vancouver, ISO, and other styles
28

Gkamas, Theodosios. "Modélisation statistique de tenseurs d'ordre supérieur en imagerie par résonance magnétique de diffusion." Thesis, Strasbourg, 2015. http://www.theses.fr/2015STRAD036/document.

Full text
Abstract:
L'IRMd est un moyen non invasif permettant d'étudier in vivo la structure des fibres nerveuses du cerveau. Dans cette thèse, nous modélisons des données IRMd à l'aide de tenseurs d'ordre 4 (T4). Les problèmes de comparaison de groupes ou d'individu avec un groupe normal sont abordés, et résolus à l'aide d'analyses statistiques sur les T4s. Les approches utilisent des réductions non linéaires de dimension, et bénéficient des métriques non euclidiennes pour les T4s. Les statistiques sont calculées dans l'espace réduit, et permettent de quantifier la dissimilarité entre le groupe (ou l'individu) d'intérêt et le groupe de référence. Les approches proposées sont appliquées à la neuromyélite optique et aux patients atteints de locked in syndrome. Les conclusions tirées sont cohérentes avec les connaissances médicales actuelles
DW-MRI is a non-invasive way to study in vivo the structure of nerve fibers in the brain. In this thesis, fourth order tensors (T4) were used to model DW-MRI data. In addition, the problems of group comparison or individual against a normal group were discussed and solved using statistical analysis on T4s. The approaches use nonlinear dimensional reductions, assisted by non-Euclidean metrics for T4s. The statistics are calculated in the reduced space and allow us to quantify the dissimilarity between the group (or the individual) of interest and the reference group. The proposed approaches are applied to neuromyelitis optica and patients with locked in syndrome. The derived conclusions are consistent with the current medical knowledge
APA, Harvard, Vancouver, ISO, and other styles
29

Indyke, Amy W. "Saint Catherine of Siena permutations of the blood metaphor in written text and painted image /." Diss., Connect to the thesis, 2007. http://hdl.handle.net/10066/993.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Bonnin, Camille. "Le contrôle du set associé à une tâche : étude comportementale du contrôle exécutif dans des épreuves de permutation de tâche et des tâches de type Stroop." Poitiers, 2010. https://tel.archives-ouvertes.fr/tel-01339060.

Full text
Abstract:
Réaliser une tâche implique la mise en œuvre d'une configuration spécifique des processus cognitifs nécessaires à cette réalisation. C'est cette configuration spécifique qui est désignée par le terme de set (Monsell, 1996). Nous étudions les processus de contrôle impliqués dans l'établissement d'un set et dans la gestion du conflit entre sets, au moyen du paradigme de permutation de tâche et par l'utilisation de stimuli ambigus ou conflictuels (type Stroop) pouvant susciter plusieurs tâches. Le coût de permutation et l'effet d'interférence nous permettent d'apprécier l'efficience de ces processus de contrôle. L'objectif de ce travail est de déterminer (i) dans quelle mesure les processus de contrôle sont ajustés en fonction de certaines caractéristiques du contexte, et (ii) comment ces variations peuvent éventuellement impliquer la mise en œuvre de processus différents. Une première étude a examiné comment un contexte de conflit pouvait influencer le contrôle du set. Une seconde étude a porté sur le rôle des mécanismes de contrôle du set dans le maintien d'un équilibre entre persistance et flexibilité en fonction du contexte. Une troisième étude a testé l'hypothèse de mécanismes d'inhibition spécifiques du type de contrôle mis en œuvre. Enfin, en utilisant un paradigme de type Stroop dans une approche neuropsychologique chez des patients souffrant de la maladie de Parkinson, une dernière étude a permis de préciser les substrats neurophysiologiques des modes de contrôle proactif et réactif. Les recherches menées ont permis de discuter de l'intérêt du concept de set dans l'étude du contrôle exécutif ainsi que des intérêts et des limites du paradigme de permutation de tâche
To perform any cognitive task requires an appropriate organization of cognitive processes and mental representations, in order to act in accordance with task requirements. This internal configuration has been called task set (Monsell, 1996). In the present work, we studied control processes involved in establishing a task-set and mechanisms involved in the resolution of conflict between tasks-set, by using the task switching paradigm and ambivalent or conflict stimuli (Stroop-like stimuli) affording several tasks. The efficiency of control processes was indexed by switch cost and interference effects. The aim of this work was to determine (i) how control processes are adjusted according to contextual characteristics and how these adjustments reflect the implementation of different processes. An initial study explored the potential influence of a conflict context on task-set control. The results showed that the proportion of incongruent stimuli modulated the degree of conflict elicited by stroop-like stimuli, but did not influence task switching performance (switch cost). These results suggest that processes involved in the establishment of a new task-set and those involved in the resolution of conflict between task-sets are independent. A second study explored the role of task set control processes in maintaining a context-dependant balance between stability and flexibility. Results of experiment 2 showed that, in a context where the identity of the upcoming task is uncertain, a high frequency of task changes promoted flexibility. This suggests that task-set activation is not an all-or-none process, but rather a gradual process adjusted to context demand…
APA, Harvard, Vancouver, ISO, and other styles
31

Dahlberg, Gunnar. "Implementation and evaluation of a text extraction tool for adverse drug reaction information." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-134063.

Full text
Abstract:
Inom ramen för Världshälsoorganisationens (WHO:s) internationella biverkningsprogram rapporterar sjukvårdspersonal och patienter misstänkta läkemedelsbiverkningar i form av spontana biverkningsrapporter som via nationella myndigheter skickas till Uppsala Monitoring Centre (UMC). Hos UMC lagras rapporterna i VigiBase, WHO:s biverkningsdatabas. Rapporterna i VigiBase analyseras med hjälp av statistiska metoder för att hitta potentiella samband mellan läkemedel och biverkningar. Funna samband utvärderas i flera steg där ett tidigt steg i utvärderingen är att studera den medicinska litteraturen för att se om sambandet redan är känt sedan tidigare (tidigare kända samband filtreras bort från fortsatt analys). Att manuellt leta efter samband mellan ett visst läkemedel och en viss biverkan är tidskrävande. I den här studien har vi utvecklat ett verktyg för att automatiskt leta efter medicinska biverkningstermer i medicinsk litteratur och spara funna samband i ett strukturerat format. I verktyget har vi implementerat och integrerat funktionalitet för att söka efter medicinska biverkningar på olika sätt (utnyttja synonymer,ta bort ändelser på ord, ta bort ord som saknar betydelse, godtycklig ordföljd och stavfel). Verktygets prestanda har utvärderats på manuellt extraherade medicinska termer från SPC-texter (texter från läkemedels bipacksedlar) och på biverkningstexter från Martindale (medicinsk referenslitteratur för information om läkemedel och substanser) där WHO-ART- och MedDRA-terminologierna har använts som källa för biverkningstermer. Studien visar att sofistikerad textextraktion avsevärt kan förbättra identifieringen av biverkningstermer i biverkningstexter jämfört med en ordagrann extraktion.
Background: Initial review of potential safety issues related to the use of medicines involves reading and searching existing medical literature sources for known associations of drugs and adverse drug reactions (ADRs), so that they can be excluded from further analysis. The task is labor demanding and time consuming. Objective: To develop a text extraction tool to automatically identify ADR information from medical adverse effects texts. Evaluate the performance of the tool's underlying text extraction algorithm and identify what parts of the algorithm contributed to the performance. Method: A text extraction tool was implemented on the .NET platform with functionality for preprocessing text (removal of stop words, Porter stemming and use of synonyms) and matching medical terms using permutations of words and spelling variations (Soundex, Levenshtein distance and Longest common subsequence distance). Its performance was evaluated on both manually extracted medical terms (semi-structured texts) from summary of product characteristics (SPC) texts and unstructured adverse effects texts from Martindale (i.e. a medical reference for information about drugs and medicines) using the WHO-ART and MedDRA medical term dictionaries. Results: For the SPC data set, a verbatim match identified 72% of the SPC terms. The text extraction tool correctly matched 87% of the SPC terms while producing one false positive match using removal of stop words, Porter stemming, synonyms and permutations. The use of the full MedDRA hierarchy contributed the most to performance. The sophisticated text algorithms together contributed roughly equally to the performance. Phonetic codes (i.e. Soundex) are evidently inferior to string distance measures (i.e. Levenshtein distance and Longest common subsequence distance) for fuzzy matching in our implementation. The string distance measures increased the number of matched SPC terms, but at the expense of generating false positive matches.
Results from Martindale show that 90% of the identified medical terms were correct. The majority of false positive matches were caused by extracting medical terms not describing ADRs. Conclusion: Sophisticated text extraction can considerably improve the identification of ADR information from adverse effects texts compared to a verbatim extraction.
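The Levenshtein distance used for fuzzy term matching in the abstract above can be sketched in a few lines. The tool described was built on .NET, so this Python version, with a hypothetical `fuzzy_match` helper, is purely illustrative of the string-distance stage of such a pipeline.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def fuzzy_match(term, candidates, max_dist=2):
    """Return dictionary terms within max_dist edits of `term`."""
    return [c for c in candidates
            if levenshtein(term.lower(), c.lower()) <= max_dist]
```

A misspelled reaction term such as "nausia" still matches the dictionary entry "nausea" at edit distance 1, which is exactly the kind of spelling variation a verbatim match misses.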
APA, Harvard, Vancouver, ISO, and other styles
32

Hothorn, Torsten, Kurt Hornik, and Achim Zeileis. "Unbiased Recursive Partitioning: A Conditional Inference Framework." Institut für Statistik und Mathematik, WU Vienna University of Economics and Business, 2004. http://epub.wu.ac.at/676/1/document.pdf.

Full text
Abstract:
Recursive binary partitioning is a popular tool for regression analysis. Two fundamental problems of exhaustive search procedures usually applied to fit such models have been known for a long time: overfitting and a selection bias towards covariates with many possible splits or missing values. While pruning procedures are able to solve the overfitting problem, the variable selection bias still seriously affects the interpretability of tree-structured regression models. For some special cases unbiased procedures have been suggested, however lacking a common theoretical foundation. We propose a unified framework for recursive partitioning which embeds tree-structured regression models into a well defined theory of conditional inference procedures. Stopping criteria based on multiple test procedures are implemented and it is shown that the predictive performance of the resulting trees is as good as the performance of established exhaustive search procedures. It turns out that the partitions and therefore the models induced by both approaches are structurally different, indicating the need for an unbiased variable selection. The methodology presented here is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored as well as multivariate response variables and arbitrary measurement scales of the covariates. Data from studies on animal abundance, glaucoma classification, node positive breast cancer and mammography experience are re-analyzed.
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
33

Mériaux, Sébastien. "Diagnostique d'homogénéité et inférence non-paramétrique pour l'analyse de groupe en imagerie par résonance magnétique fonctionnelle." Phd thesis, Université Paris Sud - Paris XI, 2007. http://tel.archives-ouvertes.fr/tel-00371051.

Full text
Abstract:
One of the main goals of functional magnetic resonance imaging (fMRI) is the in vivo, non-invasive localisation of the brain regions associated with specific cognitive functions. Because the brain exhibits considerable inter-individual anatomo-functional variability, fMRI studies usually include several subjects, and a group analysis summarises the within-subject results into a group activation map representative of the population of interest. The "standard" group analysis relies on a strong assumption that the estimated effects are homogeneous across subjects. We first examine the validity of this assumption with a multivariate diagnostic method and a univariate normality test (Grubbs' test). Applying these methods to some twenty data sets reveals frequent atypical observations that can invalidate the homogeneity assumption. We therefore propose robust decision statistics, calibrated by permutations, to improve the specificity and sensitivity of the statistical tests used in group analysis. We then introduce new mixed-effects decision statistics based on maximum likelihood ratios, which weight the subjects according to the uncertainty in the estimation of their effects. We confirm on several data sets that these new inference methods yield a significant gain in sensitivity, and we make all the tools developed in this thesis available to the neuroimaging community in the DISTANCE software.
APA, Harvard, Vancouver, ISO, and other styles
34

González, Monsalve Jonatan A. "Statistical tests for comparisons of spatial and spatio-temporal point patterns." Doctoral thesis, Universitat Jaume I, 2018. http://hdl.handle.net/10803/462034.

Full text
Abstract:
We introduce a new set of tests for comparing functional descriptors in the context of point processes. First, since the study of spatio-temporal point processes has not been widely covered in the literature, a complete review is made; it serves as a reference for the available techniques and approaches in the spatio-temporal context. Second, a studentized permutation test is developed for the spatio-temporal case, motivated by the locations of tornadoes in the U.S. over a period of 36 years. Several tools are developed along the way, such as a non-separable estimator of the first-order intensity, which allows a realistic analysis of the phenomenon through the new test. Finally, a factorial two-way design is considered, where the observations are spatial point patterns in the presence of replication. This methodology is motivated by a minerals engineering experiment, and we develop statistics to test the influence of the factors and possible interaction effects.
We develop a new set of tests for comparing functional descriptors in the context of point processes. Since the study of spatio-temporal point processes has not been covered exhaustively in the literature, we have written a review paper. We introduce a permutation test for groups of spatio-temporal point patterns, motivated by the locations of tornado occurrences in the U.S. over 36 years. We have developed techniques such as estimation of the first-order intensity without assuming separability, which allows a more realistic treatment of the climatic phenomenon itself through the new test. Finally, we have developed techniques for the analysis of variance of two-factor experiments in the presence of replicates when the observations are spatial point patterns. This methodology is motivated by a minerals engineering experiment. We develop suitable statistics to test the influence of the factors and their possible interaction.
APA, Harvard, Vancouver, ISO, and other styles
35

NICOLAZZI, EZEQUIEL LUIS. "New trends in dairy cattle genetic evaluation." Doctoral thesis, Università Cattolica del Sacro Cuore, 2011. http://hdl.handle.net/10280/966.

Full text
Abstract:
Genetic evaluation systems worldwide are developing rapidly. Currently, "traditional" selection programs based on phenotypes and relationships between animals are being integrated with, and in the future may be replaced by, molecular information. Set in this transition period, this thesis covers research on both types of evaluation: from the assessment of the accuracy of (traditional) international genetic evaluations to the study of statistical methods used to integrate genomic information into breeding (genomic selection). Three chapters evaluate approaches for estimating genetic values from genomic data while reducing the number of independent variables: the Bonferroni correction and the permutation test combined with single-marker regression (Chapter III), principal component analysis combined with BLUP (Chapter IV), and the Fst index across breeds combined with BayesA (Chapter VI). In addition, Chapter V analyzes the accuracy of genomic values obtained with BLUP, BayesA and the Bayesian LASSO including all available variables. The results of this thesis indicate that the genetic progress expected from the analysis of simulated data can indeed be achieved, although further research is needed to optimize the use of molecular information and obtain the best possible results for all traits under selection.
Genetic evaluation systems are in rapid development worldwide. In most countries, “traditional” breeding programs based on phenotypes and relationships between animals are currently being integrated and in the future might be replaced by the introduction of molecular information. This thesis stands in this transition period, therefore it covers research on both types of genetic evaluations: from the assessment of the accuracy of (traditional) international genetic evaluations to the study of statistical methods used to integrate genomic information into breeding (genomic selection). Three chapters investigate and evaluate approaches for the estimation of genetic values from genomic data reducing the number of independent variables. In particular, Bonferroni correction and Permutation test combined with single marker regression (Chapter III), principal component analysis combined with BLUP (Chapter IV) and Fst across breeds combined with BayesA (Chapter VI). In addition, Chapter V analyzes the accuracy of direct genomic values with BLUP, BayesA and Bayesian LASSO including all available variables. The results of this thesis indicate that the genetic gains expected from the analysis of simulated data can be obtained on real data. Still, further research is needed to optimize the use of genome-wide information and obtain the best possible estimates for all traits under selection.
APA, Harvard, Vancouver, ISO, and other styles
36

Oller, Piqué Ramon. "Survival analysis issues with interval-censored data." Doctoral thesis, Universitat Politècnica de Catalunya, 2006. http://hdl.handle.net/10803/6520.

Full text
Abstract:
Survival analysis is used in various fields to analyze data measuring the time elapsed between two events. It is also called event history analysis, lifetime data analysis, reliability analysis or time-to-event analysis. One of the difficulties in this area of statistics is the presence of censored data. An individual's lifetime is censored when it can only be measured partially or inexactly. Different circumstances give rise to different types of censoring. Interval censoring refers to a situation where the event of interest cannot be observed directly and we only know that it occurred within a random interval of time. This type of censoring has generated much research in recent years and usually arises in studies where individuals are inspected or observed intermittently. In this situation we only know that the individual's lifetime lies between two consecutive inspection times.

This doctoral thesis is divided into two parts dealing with two important issues concerning interval-censored data. The first part comprises Chapters 2 and 3, which deal with formal conditions ensuring that the simplified likelihood can be used to estimate the lifetime distribution. The second part comprises Chapters 4 and 5, devoted to the study of statistical procedures for the k-sample problem. The work reproduced here contains several materials that have already been published or submitted for publication.

In Chapter 1 we introduce the basic notation used in the thesis. We also describe the nonparametric approach to estimating the distribution function of the lifetime. Peto (1973) and Turnbull (1976) were the first authors to propose an estimation method based on the simplified version of the likelihood function. Other authors have studied the uniqueness of the solution obtained by this method (Gentleman and Geyer, 1994) or have improved the method with new proposals (Wellner and Zhan, 1997).

Chapter 2 reproduces the paper by Oller et al. (2004). We prove the equivalence between the different characterizations of noninformative censoring found in the literature and define a constant-sum condition analogous to the one obtained in the right-censoring context. We also prove that if the noninformative condition or the constant-sum condition holds, the simplified likelihood can be used to obtain the nonparametric maximum likelihood estimator (NPMLE) of the lifetime distribution function. Finally, we characterize the constant-sum property according to several types of censoring. In Chapter 3 we study the role of the constant-sum property in the identifiability of the lifetime distribution. We prove that the lifetime distribution is not identifiable outside the class of constant-sum models. We also prove that the probability of the lifetime falling in each of the observable intervals is identifiable within the class of constant-sum models. We illustrate all these concepts with several examples.

Chapter 4 has been partially published in the methodological review article by Gómez et al. (2004). It provides an overview of the techniques that have been applied to the nonparametric problem of comparing two or more samples with interval-censored data. We have also developed some S-Plus routines implementing the permutational versions of the Wilcoxon, logrank and Student's t tests for interval-censored data (Fay and Shih, 1998). This part of the thesis is complemented in Chapter 5 with several proposed extensions of the Jonckheere test. With the aim of testing for a trend in the k-sample problem, Abel (1986) gave one of the few generalizations of the Jonckheere test for interval-censored data. We propose other generalizations in line with the results presented in Chapter 4, using permutational and Monte Carlo approaches. We provide computer programs for each proposal and carry out a simulation study to compare the power of each proposal under different parametric models and trend assumptions. As motivation for the methodology, both chapters analyze a dataset from a study of the benefits of zidovudine in patients in the early stages of HIV infection (Volberding et al., 1995).

Finally, Chapter 6 summarizes the results and highlights the aspects that remain to be completed in the future.
Survival analysis is used in various fields for analyzing data involving the duration between two events. It is also known as event history analysis, lifetime data analysis, reliability analysis or time-to-event analysis. One of the difficulties which arises in this area is the presence of censored data. The lifetime of an individual is censored when it cannot be measured exactly but partial information is available. Different circumstances can produce different types of censoring. Interval censoring refers to the situation when the event of interest cannot be directly observed and is only known to have occurred during a random interval of time. This kind of censoring has generated a great deal of research in recent years and typically occurs when individuals in a study are inspected or observed intermittently, so that an individual's lifetime is known only to lie between two successive observation times.

This PhD thesis is divided into two parts which handle two important issues of interval-censored data. The first part is composed of Chapter 2 and Chapter 3 and concerns formal conditions which allow estimation of the lifetime distribution to be based on a well-known simplified likelihood. The second part is composed of Chapter 4 and Chapter 5 and is devoted to the study of test procedures for the k-sample problem. The present work reproduces several materials which have already been published or submitted.

In Chapter 1 we give the basic notation used in this PhD thesis. We also describe the nonparametric approach to estimate the distribution function of the lifetime variable. Peto (1973) and Turnbull (1976) were the first authors to propose an estimation method which is based on a simplified version of the likelihood function. Other authors have studied the uniqueness of the solution given by this method (Gentleman and Geyer, 1994) or have improved it with new proposals (Wellner and Zhan, 1997).

Chapter 2 reproduces the paper of Oller et al. (2004). We prove the equivalence between different characterizations of noninformative censoring appeared in the literature and we define an analogous constant-sum condition to the one derived in the context of right censoring. We prove as well that when the noninformative condition or the constant-sum condition holds, the simplified likelihood can be used to obtain the nonparametric maximum likelihood estimator (NPMLE) of the failure time distribution function. Finally, we characterize the constant-sum property according to different types of censoring. In Chapter 3 we study the relevance of the constant-sum property in the identifiability of the lifetime distribution. We show that the lifetime distribution is not identifiable outside the class of constant-sum models. We also show that the lifetime probabilities assigned to the observable intervals are identifiable inside the class of constant-sum models. We illustrate all these notions with several examples.

Chapter 4 has been partially published in the survey paper of Gómez et al. (2004). It gives a general view of the procedures which have been applied to the nonparametric problem of comparing two or more interval-censored samples. We also develop some S-Plus routines which implement the permutational versions of the Wilcoxon test, the logrank test and the t-test for interval-censored data (Fay and Shih, 1998). This part of the PhD thesis is completed in Chapter 5 by different proposed extensions of Jonckheere's test. In order to test for an increasing trend in the k-sample problem, Abel (1986) gave one of the few generalizations of Jonckheere's test for interval-censored data. We also suggest different Jonckheere-type tests based on the tests presented in Chapter 4, using permutational and Monte Carlo approaches. We give computer programs for each proposal and perform a simulation study in order to compare the power of each proposal under different parametric assumptions and different alternatives. We motivate both chapters with the analysis of a set of data from a study of the benefits of zidovudine in patients in the early stages of HIV infection (Volberding et al., 1995).

Finally, Chapter 6 summarizes the results and addresses those aspects which remain to be completed.
APA, Harvard, Vancouver, ISO, and other styles
37

Ledauphin, Stéphanie. "Analyse statistique d'évaluations sensorielles au cours du temps." Phd thesis, Université de Nantes, 2007. http://tel.archives-ouvertes.fr/tel-00139887.

Full text
Abstract:
In the agri-food industries, as well as in other sectors, sensory analysis is the key to meeting consumer expectations. This discipline is most often based on the establishment of sensory profiles from scores given by trained judges according to a list of descriptors (sensory variables). In this type of study, it is important to assess the performance of the judges and to take it into account when establishing the sensory profiles. With this in mind, we propose an approach that provides performance indicators for the panel and for each judge, and takes this performance into account when determining an average table. Hypothesis tests for assessing the significance of each judge's contribution to the determination of the compromise are also proposed.
For the past twenty years, time-intensity (TI) curves, which describe the evolution of a sensation over the course of the tasting experience, have become increasingly popular among sensory analysis practitioners. The major difficulty in analyzing TI curves comes from a strong judge effect, reflected in the presence of a signature specific to each judge. We propose a functional approach based on B-spline functions that reduces the judge effect by means of a curve alignment procedure.
Other sensory data over time exist, such as the monitoring of the organoleptic degradation of food products. To study them, we propose modeling with hidden Markov chains, so that the course of the degradation can then be visualized graphically.
APA, Harvard, Vancouver, ISO, and other styles
38

Garcia, Luz Mery González. "Modelos baseados no planejamento para análise de populações finitas." Universidade de São Paulo, 2008. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-19062008-183609/.

Full text
Abstract:
We study the problem of obtaining optimal estimators/predictors for linear combinations of responses collected from a finite population via simple random sampling. In this context, we extend the finite population mixed model proposed by Stanek, Singer & Lencina (2004, Journal of Statistical Planning and Inference) to cases that include measurement errors (endogenous and exogenous) and auxiliary information. Assuming that the variances are known, we show that the proposed estimators/predictors have the smallest mean squared error within the class of unbiased linear estimators. Through simulation studies, we compare the performance of these empirical estimators/predictors, i.e., those obtained by replacing the variance components with estimates, with that of traditional competitors. We also extend these models to the analysis of studies with a pretest/posttest structure. Again through simulation, we compare the performance of the empirical estimators with that of the estimator obtained via classical repeated measures techniques and with that of the estimator obtained via analysis of covariance by least squares, concluding that the empirical estimators/predictors showed smaller mean squared error and smaller bias. In general, we suggest using the proposed empirical estimators/predictors for data with asymmetric distributions or for small samples.
We consider optimal estimation of finite population parameters with data obtained via simple random samples. In this context, we extend a finite population mixed model proposed by Stanek, Singer & Lencina (2004, Journal of Statistical Planning and Inference) by including measurement errors (endogenous or exogenous) and auxiliary information. Assuming that variance components are known, we show that the proposed estimators/predictors have the smallest mean squared error in the class of unbiased estimators. Using simulation studies, we compare the performance of the empirical estimators/predictors obtained by replacing variance components with estimates with the performance of a traditional estimator. We also extend the finite population mixed model to data obtained via pretest-posttest designs. Through simulation studies, we compare the performance of the empirical estimator of the difference in gain between groups with the performance of the usual repeated measures estimator and with the performance of the usual analysis of covariance estimator obtained via ordinary least squares. The empirical estimator has smaller mean squared error and bias than the alternative estimators under consideration. In general, we recommend the use of the proposed estimators/ predictors for either asymmetric response distributions or small samples.
APA, Harvard, Vancouver, ISO, and other styles
39

Bienaise, Solène. "Tests combinatoires en analyse géométrique des données - Etude de l'absentéisme dans les industries électriques et gazières de 1995 à 2011 à travers des données de cohorte." Phd thesis, Université Paris Dauphine - Paris IX, 2013. http://tel.archives-ouvertes.fr/tel-00941220.

Full text
Abstract:
The first part of the thesis deals with combinatorial inference in Geometric Data Analysis (GDA). We propose multidimensional tests that make no assumptions about the data-generating process or the distributions. We focus on problems of typicality (comparing a mean point to a reference point, or a group of observations to a reference population) and of homogeneity (comparing several groups). We use combinatorial procedures to construct a reference set against which the data are situated. The chosen test statistics lead to original extensions: a geometric interpretation of the observed significance level and the construction of a compatibility zone. The second part presents a study of absenteeism in the French Electricity and Gas Industries (Industries Electriques et Gazières) from 1995 to 2011, with the construction of an epidemiological cohort. GDA methods are used to identify emerging pathologies and groups of vulnerable employees.
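The typicality problem described in this abstract can be illustrated with a heavily simplified sketch: random same-size subsets of the reference population play the role of the combinatorial reference set, and the group mean point is compared with them. The plain squared Euclidean distance used here is an assumption made for brevity (GDA typicality tests ordinarily work in the geometric, Mahalanobis-type metric of the cloud), so this is not the thesis's exact procedure.

```python
import numpy as np

def typicality_test(group, population, n_draws=9999, seed=0):
    """Combinatorial typicality test (sketch): is the mean point of
    `group` compatible with the reference `population`?  The observed
    squared distance from the group mean to the population mean is
    compared with the same distance computed for random subsets of the
    population of the same size."""
    rng = np.random.default_rng(seed)
    group = np.asarray(group, float)
    population = np.asarray(population, float)
    n = len(group)
    center = population.mean(axis=0)
    d_obs = np.sum((group.mean(axis=0) - center) ** 2)
    count = 1  # the observed configuration belongs to the reference set
    for _ in range(n_draws):
        sub = population[rng.choice(len(population), n, replace=False)]
        if np.sum((sub.mean(axis=0) - center) ** 2) >= d_obs:
            count += 1
    return d_obs, count / (n_draws + 1)
```

A small observed significance level indicates that the group's mean point is atypical of the reference population.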
APA, Harvard, Vancouver, ISO, and other styles
40

Collier, Olivier. "Méthodes statistiques pour la mise en correspondance de descripteurs." Phd thesis, Université Paris-Est, 2013. http://tel.archives-ouvertes.fr/tel-00904686.

Full text
Abstract:
Many applications, notably in computer vision and medicine, aim to identify similarities between several images or signals, making it possible to detect objects, track them, or match different views. In all cases, the algorithmic procedures that process the images use a selection of keypoints which they then try to match in pairs. For each point they compute a descriptor that characterizes it and discriminates it from the others. Among all possible procedures, the most widely used today is SIFT, which selects the keypoints, computes descriptors and proposes a global matching criterion. In a first part, we attempt to improve this algorithm by changing the original descriptor, which requires finding the argmax of a histogram: indeed, its computation is statistically unstable. We must then also change the criterion for matching two descriptors. This results in a nonparametric testing problem in which both the null and the alternative hypotheses are composite, and even nonparametric. We use the generalized likelihood ratio test to exhibit consistent testing procedures, and we propose a minimax study of the problem. In a second part, we study the optimality of a global matching procedure. We state a statistical model in which descriptors appear in a certain order in a first image and in another order in a second image; matching then amounts to estimating a permutation. We give an optimality criterion, in the minimax sense, for the estimators. In particular, we use the likelihood to find several consistent estimators, which are even optimal under certain conditions.
Finally, we address practical aspects by showing that our estimators are computable in reasonable time, which then allows us to illustrate the hierarchy of our estimators through simulations.
APA, Harvard, Vancouver, ISO, and other styles
41

Wang, Hsin-Chung, and 王信忠. "Permutation test on spatial comparison." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/07636027346076198818.

Full text
Abstract:
Doctoral dissertation
National Chengchi University
Graduate Institute of Statistics
94 (academic year)
This thesis proposes the relabel (Fisher's) permutation test, inspired by Fisher's exact test, for comparing the distributions of two (fishery) data sets located on a two-dimensional lattice. We show that the permutation test given by Syrjala (1996) is not exact, whereas our relabel permutation test is exact and, additionally, more powerful. This thesis also studies two spatial models: the spatial multinomial-relative-log-normal model and the spatial Poisson-relative-log-normal model. Both models not only exhibit the characteristics of skewness with a long right-hand tail and a high proportion of zero catches that usually appear in fishery data, but are also able to describe various types of aggregative behavior.
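The kind of lattice-based comparison discussed in this abstract can be sketched with a generic Monte Carlo permutation scheme. Both the statistic (a cumulative-sum, Cramér-von Mises-type criterion averaged over the four corner orientations, in the spirit of Syrjala-style tests) and the cell-wise exchange scheme are illustrative assumptions, not the thesis's exact relabel test.

```python
import numpy as np

def spatial_permutation_test(a, b, n_perm=999, seed=0):
    """Monte Carlo permutation test comparing two density surfaces
    observed on the same two-dimensional lattice (equal-shape arrays).

    Under H0 (same underlying spatial distribution), the two
    observations at each lattice cell are exchangeable, so they are
    swapped cell by cell at random."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, float)
    b = np.asarray(b, float)

    def statistic(x, y):
        d = x / x.sum() - y / y.sum()    # difference of normalized surfaces
        t = 0.0
        for fx in (d, d[::-1]):          # flip rows: average over the four
            for f in (fx, fx[:, ::-1]):  # corner orientations (flip columns)
                t += np.sum(np.cumsum(np.cumsum(f, axis=0), axis=1) ** 2)
        return t / 4.0

    t_obs = statistic(a, b)
    count = 1  # the observed statistic is counted in the null set
    for _ in range(n_perm):
        swap = rng.random(a.shape) < 0.5           # independent cell-wise swaps
        pa, pb = np.where(swap, b, a), np.where(swap, a, b)
        if statistic(pa, pb) >= t_obs:
            count += 1
    return t_obs, count / (n_perm + 1)
```

Two strongly different surfaces (e.g. opposite gradients) yield a small Monte Carlo p-value, while identical surfaces give p = 1.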
APA, Harvard, Vancouver, ISO, and other styles
42

Van, Heerden Liske. "A comparative study of permutation procedures." Diss., 1994. http://hdl.handle.net/10500/16306.

Full text
Abstract:
The unique problems encountered when analyzing weather data sets - that is, measurements taken while conducting a meteorological experiment - have forced statisticians to reconsider conventional analysis methods and investigate permutation test procedures. The problems encountered when analyzing weather data sets are simulated in a Monte Carlo study, and the results of the parametric and permutation t-tests are compared with regard to significance level, power, and average confidence interval length. Seven population distributions are considered - three are variations of the normal distribution, and the others are the gamma, lognormal, rectangular and empirical distributions. The normal distribution contaminated with zero measurements is also simulated. In those simulated situations in which the variances are unequal, the permutation test procedure was performed using other test statistics, namely the Scheffé, Welch and Behrens-Fisher test statistics.
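The permutation counterpart of the two-sample t-test compared in this study can be sketched as follows. This is a minimal Monte Carlo version using the pooled-variance t statistic; the Scheffé, Welch or Behrens-Fisher statistics mentioned in the abstract would simply replace `t_stat`.

```python
import numpy as np

def permutation_t_test(x, y, n_perm=9999, seed=0):
    """Two-sided Monte Carlo permutation test: under H0 of identical
    distributions the group labels are exchangeable, so the pooled
    sample is reshuffled and the t statistic recomputed."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    pooled = np.concatenate([x, y])
    n = len(x)

    def t_stat(a, b):
        # pooled-variance two-sample t statistic
        sp2 = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) \
              / (len(a) + len(b) - 2)
        return (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / len(a) + 1 / len(b)))

    t_obs = abs(t_stat(x, y))
    count = 1  # the observed labeling counts as one permutation
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if abs(t_stat(perm[:n], perm[n:])) >= t_obs:
            count += 1
    return t_obs, count / (n_perm + 1)
```

Unlike the parametric t-test, the significance level of this procedure does not rely on normality, which is what makes it attractive for the skewed and zero-inflated distributions considered in the dissertation.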
Mathematical Sciences
M. Sc. (Statistics)
APA, Harvard, Vancouver, ISO, and other styles
43

Morris, Tracy Lynne. "A permutation test for the structure of a covariance matrix." 2007. http://digital.library.okstate.edu/etd/umi-okstate-2172.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Chan, Timothy. "The application of the permutation test on genome wide expression analysis." Thesis, 2006. http://hdl.handle.net/2429/17660.

Full text
Abstract:
We are now in a new era. The recent completion of the sequencing of the human genome and the advent of high-throughput gene expression technologies have transformed the era of molecular biology into the era of genomics. Already, such technologies are showing great promise in disease classification and the identification of gene targets. However, as with any exciting new technology, great promise and anticipation can lead to wasted resources and false hope. It is critical that we recognize the experimental limitations of these new technologies and, most importantly, that hidden problems be addressed. The primary goal of a high-throughput gene expression experiment is to identify genes of interest that are differentially expressed between two sample groups. This thesis addresses two key issues that have hindered high-throughput gene expression technologies. The first is sample size: small sample sizes reduce statistical confidence and are much more sensitive to outliers. We show that by using a nonparametric statistical test known as the permutation test, we can achieve higher accuracy than conventional parametric tests such as the t-test. The second issue is the use of housekeeping genes for the normalization of mRNA levels. It is well known that many biological experiments require a set of reference genes that are highly expressed and constant from sample to sample. The choice of reference genes is critical, as a wrong choice can have dire effects on subsequent analyses. To address this issue, we developed a methodology based on SAGE, a genome-wide expression technology that does not require normalization. Our results suggest that reference genes chosen by our methodology are more appropriate for mRNA normalization than the standard set of housekeeping genes. Furthermore, our results suggest that reference genes are more effective if chosen in a tissue-specific manner.
Science, Faculty of
Computer Science, Department of
Graduate
APA, Harvard, Vancouver, ISO, and other styles
45

Guetsop, Nangue Aurélien. "Tests de permutation d’indépendance en analyse multivariée." Thèse, 2016. http://hdl.handle.net/1866/18476.

Full text
Abstract:
This thesis is organized as a collection of articles. The articles are written in English and the remainder of the thesis is written in French.
The work establishes an equivalence in terms of power between tests based on the alpha-distance covariance and tests based on the Hilbert-Schmidt independence criterion (HSIC) with the characteristic function of a stable probability distribution of index alpha with a sufficiently small scale parameter. High-dimensional simulations show the superiority of the distance covariance and HSIC tests over certain tests based on copulas. Simulations also show that the very useful and lesser-known Pearson type III distribution approximates the exact permutation distribution of the tests and yields accurate type I error rates. A new method for the adaptive selection of the scale parameters of HSIC tests is proposed. Three simulations, two of which are borrowed from machine learning, show that the new selection method improves the power of the HSIC tests. The problem of testing independence between two vectors is generalized to the problem of testing mutual independence between several vectors. The work also treats a closely related problem, namely testing the serial independence of a stationary multivariate sequence. The Möbius decomposition of characteristic functions is used to characterize independence, and generalized tests based on the Hilbert-Schmidt independence criterion and on the distance covariance are derived from it. An equivalence is also established between the test based on the distance covariance and the HSIC test with the characteristic kernel of a stable distribution with sufficiently small scale parameters. Weak convergence of the HSIC test is obtained. Fast and accurate computation of the p-values of the proposed tests uses a Pearson type III distribution as an approximation to their exact distribution.
A fascinating result is the derivation of the exact first three moments of the permutation distribution of the dependence statistics. A similar methodology was developed for testing the serial independence of a sequence. Applications to real environmental and financial data are carried out.
The main result establishes the equivalence in terms of power between the alpha-distance covariance test and the Hilbert-Schmidt independence criterion (HSIC) test with the characteristic kernel of a stable probability distribution of index alpha with sufficiently small scale parameters. Large-scale simulations reveal the superiority of these two tests over other tests based on the empirical independence copula process. They also establish the usefulness of the lesser known Pearson type III approximation to the exact permutation distribution. This approximation yields tests with more accurate type I error rates than the gamma approximation usually used for HSIC, especially when dimensions of the two vectors are large. A new method for scale parameter selection in HSIC tests is proposed which improves power performance in three simulations, two of which are from machine learning. The problem of testing mutual independence between many random vectors is addressed. The closely related problem of testing serial independence of a multivariate stationary sequence is also considered. The Möbius transformation of characteristic functions is used to characterize independence. A generalization to p vectors of the alpha -distance covariance test and the Hilbert-Schmidt independence criterion (HSIC) test with the characteristic kernel of a stable probability distributionof index alpha is obtained. It is shown that an HSIC test with sufficiently small scale parameters is equivalent to an alpha -distance covariance test. Weak convergence of the HSIC test is established. A very fast and accurate computation of p-values uses the Pearson type III approximation which successfully approaches the exact permutation distribution of the tests. This approximation relies on the exact first three moments of the permutation distribution of any test which can be expressed as the sum of all elements of a componentwise product of p doubly-centered matrices. 
The alpha -distance covariance test and the HSIC test are both of this form. A new selection method is proposed for the scale parameter of the characteristic kernel of the HSIC test. It is shown in a simulation that this adaptive HSIC test has higher power than the alpha-distance covariance test when data are generated from a Student copula. Applications are given to environmental and financial data.
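The statistic described in this abstract — the sum of all elements of a componentwise product of doubly-centered distance matrices — can be illustrated with a minimal permutation-test sketch for two vectors (p = 2). This is an illustrative Python sketch, not code from the thesis; function names and parameters are assumptions:

```python
import numpy as np

def doubly_centered_dist(x):
    """Euclidean distance matrix with row, column and grand means removed."""
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
    return d - d.mean(0, keepdims=True) - d.mean(1, keepdims=True) + d.mean()

def dcov_perm_test(x, y, n_perm=999, seed=0):
    """Permutation p-value for the squared sample distance covariance,
    i.e. the mean of the componentwise product of the two
    doubly-centered distance matrices."""
    rng = np.random.default_rng(seed)
    a, b = doubly_centered_dist(x), doubly_centered_dist(y)
    stat = (a * b).mean()
    perm = np.empty(n_perm)
    for i in range(n_perm):
        p = rng.permutation(len(x))
        # Permuting rows and columns of one matrix breaks the pairing.
        perm[i] = (a * b[np.ix_(p, p)]).mean()
    pval = (1 + (perm >= stat).sum()) / (n_perm + 1)
    return stat, pval
```

The Pearson type III approximation discussed above replaces the permutation loop with a moment-matched distribution, which is what makes the thesis's p-value computation fast.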
APA, Harvard, Vancouver, ISO, and other styles
46

Dragieva, Nataliya. "Construction d'un intervalle de confiance par la méthode bootstrap et test de permutation." Mémoire, 2008. http://www.archipel.uqam.ca/953/1/M10219.pdf.

Full text
Abstract:
This thesis deals with a practical application of two non-parametric statistical methods: the bootstrap and the permutation test. The bootstrap method was proposed by Bradley Efron (1979) as an alternative to traditional mathematical models in complex inference problems; it offers several advantages over traditional inference methods. The idea of the permutation test appeared at the beginning of the 20th century in the work of Neyman, Fisher and Pitman. The permutation test, which is very computationally intensive, is used to build an empirical distribution of the test statistic under one hypothesis in order to compare it with the distribution of the same statistic under the alternative hypothesis. Our objective is to determine the confidence interval for a maximum likelihood estimator of an existing gene-mapping method (MapArg, Larribe et al. 2002) and to test the quality of this estimator, that is, to establish significance thresholds for the likelihood function. Both methods rely on the repeated computation of the same test statistic on samples obtained from the initial sample, either with the bootstrap or with permutations. In a hypothesis test, the two methods are complementary. The goal of this thesis is to propose different variants for constructing the confidence interval, and to test distinct hypotheses, in order to find the solution best suited to the MapArg method. To ease understanding of the decisions made, a review of statistical inference and hypothesis testing is given in Chapters 4 and 5, where the theory of the bootstrap and that of the permutation test are presented.
Since the qualities of an estimator depend on the method used to compute it, Chapters 1 and 2 present the biological and mathematical foundations on which the MapArg method is built, while Chapter 3 explains the MapArg method. ______________________________________________________________________________ AUTHOR'S KEYWORDS: Mutation, Recombination, Coalescence, Gene mapping, Bootstrap, Permutation test.
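The bootstrap half of the approach described here — resampling the initial sample to obtain a confidence interval for an estimator — can be sketched generically. This is a minimal percentile-bootstrap illustration in Python, not the MapArg-specific procedure; names and defaults are assumptions:

```python
import numpy as np

def bootstrap_ci(sample, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic:
    resample with replacement, recompute the statistic, take quantiles."""
    rng = np.random.default_rng(seed)
    n = len(sample)
    reps = np.array([statistic(sample[rng.integers(0, n, n)])
                     for _ in range(n_boot)])
    return np.quantile(reps, [alpha / 2, 1 - alpha / 2])
```

In the thesis, `statistic` would be the maximum likelihood estimate produced by MapArg rather than a simple summary such as the mean.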
APA, Harvard, Vancouver, ISO, and other styles
47

Mirzac, Angela. "Études de bioéquivalence." Mémoire, 2008. http://www.archipel.uqam.ca/1825/1/M10691.pdf.

Full text
Abstract:
In clinical practice, bioequivalence studies are often used to compare two drugs. A first objective of this thesis is to introduce the pharmacokinetic and statistical notions behind this type of study, and to define and present the application of equivalence tests through bioequivalence studies. In practice, statistical inference in bioequivalence studies is carried out with Student's t-test. The second goal of this thesis is the use of the permutation test when the assumptions of the parametric test are not satisfied. Through a simulation study with data from several distributions, we compare the performance of the permutation test with Student's t-test and the two-sample Wilcoxon test. The latter is the non-parametric test used in current bioequivalence practice. ______________________________________________________________________________ AUTHOR'S KEYWORDS: Bioequivalence, Hypothesis tests, Crossover design, Permutation test.
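The two-sample permutation test compared against Student's t-test in this thesis can be sketched as follows. This is an illustrative Python sketch under simplified assumptions (independent groups, difference in means as the test statistic), not the thesis's crossover-design procedure:

```python
import numpy as np

def perm_test_two_sample(x, y, n_perm=4999, seed=0):
    """Two-sided permutation test for a difference in means: pool the two
    groups, reshuffle group labels, and count how often the reshuffled
    difference is at least as extreme as the observed one."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    obs = abs(x.mean() - y.mean())
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(pooled[:len(x)].mean() - pooled[len(x):].mean()) >= obs:
            count += 1
    return (count + 1) / (n_perm + 1)
```

Unlike the t-test, this procedure makes no normality assumption, which is exactly the situation the thesis targets.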
APA, Harvard, Vancouver, ISO, and other styles
48

Chiu, Shih-Ting, and 邱詩婷. "A Study on the Multivariate Permutation Test to Detect the Minimal Fold Changes of Gene Expression Levels." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/01186537248036125914.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Agronomy
95
The traditional hypothesis for the identification of differentially expressed genes fails to take biologically meaningful fold changes into consideration. In biology, however, a gene is considered differentially expressed if its fold change exceeds a threshold value. Compared with the traditional hypothesis of equality, the two one-sided tests procedure based on the interval hypothesis (Liu et al., 2007) not only accounts for the minimal biologically meaningful fold change but also truly identifies the differentially expressed genes. Continuing this research, we apply the multivariate permutation test to the interval hypothesis. Based on this proposed method, we conduct a simulation study to investigate its power, overall type I error, and average type I error when the normality assumption for the expression levels is in doubt. The simulation results indicate that, because of its lower overall and average type I errors and higher average power, the interval hypothesis works better than the traditional hypothesis of equality when there are enough replicates on the array. Moreover, the multivariate permutation test, being a non-parametric approach, can improve the ability to identify differentially expressed genes under the interval hypothesis.
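The interval-hypothesis idea described above — declaring a gene differentially expressed only when the mean log-expression difference exceeds log(fold) in either direction — can be sketched with a simple two one-sided tests rule. This Python sketch uses a normal critical value for simplicity and illustrative names; it is not the exact procedure of Liu et al. (2007) or of the thesis:

```python
import math
from statistics import NormalDist
import numpy as np

def tost_fold_change(log_x, log_y, fold=2.0, alpha=0.05):
    """Interval-hypothesis test on log-expression values: reject
    H0: |mean difference| <= log(fold) when either one-sided
    statistic is beyond the normal critical value."""
    diff = log_x.mean() - log_y.mean()
    se = math.sqrt(log_x.var(ddof=1) / len(log_x)
                   + log_y.var(ddof=1) / len(log_y))
    delta = math.log(fold)
    z = NormalDist().inv_cdf(1 - alpha)
    return bool((diff - delta) / se > z or (diff + delta) / se < -z)
```

The multivariate permutation version studied in the thesis replaces the normal reference distribution with a permutation distribution computed across genes and replicates.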
APA, Harvard, Vancouver, ISO, and other styles
49

Chiu, Shih-Ting. "A Study on the Multivariate Permutation Test to Detect the Minimal Fold Changes of Gene Expression Levels." 2007. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-1707200716363600.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Wu, Mengjiao. "Equivalence testing for identity authentication using pulse waves from photoplethysmograph." Diss., 2019. http://hdl.handle.net/2097/39461.

Full text
Abstract:
Doctor of Philosophy
Department of Statistics
Suzanne Dubnicka
Christopher Vahl
Photoplethysmograph sensors use a light-based technology to sense the rate of blood flow as controlled by the heart’s pumping action. This allows for a graphical display of a patient’s pulse wave form and the description of its key features. A person’s pulse wave has been proposed as a tool in a wide variety of applications. For example, it could be used to diagnose the cause of coldness felt in the extremities or to measure stress levels while performing certain tasks. It could also be applied to quantify the risk of heart disease in the general population. In the present work, we explore its use for identity authentication. First, we visualize the pulse waves from individual patients using functional boxplots, which assess the overall behavior and identify unusual observations. Functional boxplots are also shown to be helpful in preprocessing the data by shifting individual pulse waves to a proper starting point. We then employ functional analysis of variance (FANOVA) and permutation tests to demonstrate that the identities of a group of subjects can be differentiated and compared by their pulse wave forms. One of the primary tasks of the project is to confirm the identity of a person, i.e., we must decide if a given person is whom they claim to be. We used an equivalence test to determine whether the pulse wave of the person under verification and that of the actual person were close enough to be considered equivalent. A nonparametric bootstrap functional equivalence test was applied to evaluate equivalence by constructing point-wise confidence intervals for the metric of identity assurance. We also proposed new testing procedures, including the formulation of the equivalence hypothesis and test statistics and the determination of the evaluation range and equivalence bands, to authenticate identity.
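The point-wise bootstrap confidence intervals described in this abstract can be sketched generically for two groups of sampled curves. This is an illustrative Python sketch, not the thesis's procedure; the equivalence margin and function names are assumptions:

```python
import numpy as np

def pointwise_bootstrap_band(curves_a, curves_b, n_boot=1000, alpha=0.05, seed=0):
    """Point-wise bootstrap band for the mean difference between two groups
    of sampled curves (rows = subjects, columns = evaluation points)."""
    rng = np.random.default_rng(seed)
    diffs = np.empty((n_boot, curves_a.shape[1]))
    for b in range(n_boot):
        ia = rng.integers(0, len(curves_a), len(curves_a))
        ib = rng.integers(0, len(curves_b), len(curves_b))
        diffs[b] = curves_a[ia].mean(0) - curves_b[ib].mean(0)
    return np.quantile(diffs, [alpha / 2, 1 - alpha / 2], axis=0)

def within_band(lo, hi, margin=0.2):
    """Declare equivalence when the whole band lies inside (-margin, margin)."""
    return bool((lo > -margin).all() and (hi < margin).all())
```

In the authentication setting, one group would be enrollment pulse waves and the other the waves of the person under verification, with the margin playing the role of the equivalence band.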
APA, Harvard, Vancouver, ISO, and other styles
