Dissertations / Theses on the topic 'Multinomial distribution'

Consult the top 25 dissertations / theses for your research on the topic 'Multinomial distribution.'


1

Frühwirth-Schnatter, Sylvia, and Rudolf Frühwirth. "Bayesian Inference in the Multinomial Logit Model." Austrian Statistical Society, 2012. http://epub.wu.ac.at/5629/1/186%2D751%2D1%2DSM.pdf.

Abstract:
The multinomial logit model (MNL) possesses a latent variable representation in terms of random variables following a multivariate logistic distribution. Based on multivariate finite mixture approximations of the multivariate logistic distribution, various data-augmented Metropolis-Hastings algorithms are developed for a Bayesian inference of the MNL model.
2

Olteanu, Denisa Anca. "Cumulative Sum Control Charts for Censored Reliability Data." Diss., Virginia Tech, 2010. http://hdl.handle.net/10919/26665.

Abstract:
Companies routinely perform life tests for their products. Typically, these tests involve running a set of products until the units fail. Most often, the data are censored according to different censoring schemes, depending on the particulars of the test. On occasion, tests are stopped at a predetermined time and the units that are yet to fail are suspended. In other instances, the data are collected through periodic inspection and only upper and lower bounds on the lifetimes are recorded. Reliability professionals use a number of non-normal distributions to model the resulting lifetime data with the Weibull distribution being the most frequently used. If one is interested in monitoring the quality and reliability characteristics of such processes, one needs to account for the challenges imposed by the nature of the data. We propose likelihood ratio based cumulative sum (CUSUM) control charts for censored lifetime data with non-normal distributions. We illustrate the development and implementation of the charts, and we evaluate their properties through simulation studies. We address the problem of interval censoring, and we construct a CUSUM chart for censored ordered categorical data, which we illustrate by a case study at Becton Dickinson (BD). We also address the problem of monitoring both of the parameters of the Weibull distribution for processes with right-censored data.
Ph. D.
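As a rough illustration of the likelihood-ratio CUSUM idea in the abstract above, the following Python sketch accumulates log-likelihood ratios for right-censored Weibull lifetimes. It is not the chart design from the dissertation: the function names, the (shape, scale) parameterization, the parameter values, and the threshold are all assumptions chosen for illustration.

```python
import math


def weibull_loglik(t, failed, shape, scale):
    """Log-likelihood contribution of one unit under Weibull(shape, scale).

    A failed unit contributes the log density; a right-censored (suspended)
    unit contributes the log survival function.
    """
    z = (t / scale) ** shape
    if failed:
        return math.log(shape / scale) + (shape - 1) * math.log(t / scale) - z
    return -z


def cusum(observations, in_control, out_of_control, threshold):
    """Upper CUSUM on log-likelihood ratios (hypothetical helper).

    observations: iterable of (time, failed) pairs.
    in_control / out_of_control: (shape, scale) tuples, assumed known.
    Returns the CUSUM path and the 1-based index of the first signal,
    or None if the threshold is never exceeded.
    """
    s = 0.0
    path = []
    for t, failed in observations:
        llr = (weibull_loglik(t, failed, *out_of_control)
               - weibull_loglik(t, failed, *in_control))
        s = max(0.0, s + llr)  # reset at zero, as in a one-sided CUSUM
        path.append(s)
        if s > threshold:
            return path, len(path)
    return path, None


# Illustrative data: early failures at t = 20 when the in-control scale is 100.
path, signal = cusum([(20.0, True)] * 5, (1.5, 100.0), (1.5, 50.0), 2.0)
print(path, signal)
```

In practice the threshold would be calibrated, for example by simulation, to give a target in-control run length; the value used here is arbitrary.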
3

Zong, Yujie. "A Sensitivity Analysis of a Nonignorable Nonresponse Model Via EM Algorithm and Bootstrap." Digital WPI, 2011. https://digitalcommons.wpi.edu/etd-theses/208.

Abstract:
The Slovenian Public Opinion survey (SPOS), which was carried out in 1990, was used by the government of Slovenia as a benchmark to prepare for an upcoming plebiscite, which asked the respondents whether they supported independence from Yugoslavia. However, the sample size was large and it is quite likely that the respondents and nonrespondents had divergent viewpoints. We first develop an ignorable nonresponse model, which is an extension of a bivariate binomial model. In order to accommodate the nonrespondents, we then develop a nonignorable nonresponse model, which is an extension of the ignorable model. Our methodology uses an EM algorithm to fit both the ignorable and nonignorable nonresponse models, and estimation is carried out using the bootstrap mechanism. We also perform a sensitivity analysis to study different degrees of departure of the nonignorable nonresponse model from the ignorable nonresponse model. We found that the nonignorable nonresponse model is mildly sensitive to departures from the ignorable nonresponse model. In fact, our finding based on the nonignorable model is better than an earlier conclusion about another nonignorable nonresponse model fitted to these data.
4

Van Dyk, Hendrik Oostewald. "Classification in high dimensional feature spaces." Thesis, North-West University, 2009. http://hdl.handle.net/10394/4091.

Abstract:
In this dissertation we developed theoretical models to analyse Gaussian and multinomial distributions. The analysis is focused on classification in high dimensional feature spaces and provides a basis for dealing with issues such as data sparsity and feature selection (for Gaussian and multinomial distributions, two frequently used models for high dimensional applications). A Naïve Bayesian philosophy is followed to deal with issues associated with the curse of dimensionality. The core treatment on Gaussian and multinomial models consists of finding analytical expressions for classification error performances. Exact analytical expressions were found for calculating error rates of binary class systems with Gaussian features of arbitrary dimensionality and using any type of quadratic decision boundary (except for degenerate paraboloidal boundaries). Similarly, computationally inexpensive (and approximate) analytical error rate expressions were derived for classifiers with multinomial models. Additional issues with regards to the curse of dimensionality that are specific to multinomial models (feature sparsity) were dealt with and tested on a text-based language identification problem for all eleven official languages of South Africa.
Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
5

Florence, Lindsay Walker. "Skill Evaluation in Women's Volleyball." Diss., 2008. http://contentdm.lib.byu.edu/ETD/image/etd2286.pdf.

6

Allan, Michelle L. "Measuring Skill Importance in Women's Soccer and Volleyball." Diss., 2009. http://contentdm.lib.byu.edu/ETD/image/etd2809.pdf.

7

Huynh, Huy. "Estimating the maximum probability of categorical classes with applications to biological diversity measurements." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/44868.

Abstract:
The study of biological diversity has seen a tremendous growth over the past few decades. Among the commonly used indices capturing both the richness and evenness of a community, the Berger-Parker index, which relates to the maximum proportion of all species, is particularly effective. However, when the number of individuals and species grows without bound this index changes, and it is important to develop statistical tools to measure this change. In this thesis, we introduce two estimators for this maximum: the multinomial maximum and the length of the longest increasing subsequence. In both cases, the limiting distribution of the estimators, as the number of individuals and species simultaneously grows without bound, is obtained. Then, constructing the 95% confidence intervals for the maximum proportion helps improve the comparison of the Berger-Parker index among communities. Finally, we compare the two approaches by examining their associated bias corrected estimators and apply our results to environmental data.
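For orientation, the Berger-Parker index mentioned in the abstract above can be estimated from a sample by the largest observed proportion. The sketch below shows only this plug-in estimate, not the limiting distributions or bias corrections developed in the thesis; the function name and the abundance counts are hypothetical.

```python
def berger_parker(counts):
    """Plug-in Berger-Parker index: the largest observed species proportion."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    return max(counts) / total


# Hypothetical abundance counts for four species.
abundances = [50, 30, 15, 5]
print(berger_parker(abundances))  # 0.5
```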
8

Xue, Huitian, and 薛惠天. "Maximum likelihood estimation of parameters with constraints in normal and multinomial distributions." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2012. http://hub.hku.hk/bib/B47850012.

Abstract:
Motivated by problems in medicine, biology, engineering and economics, constrained parameter problems arise in a wide variety of applications. Among them the application to the dose-response of a certain drug in development has attracted much interest. To investigate such a relationship, we often need to conduct a dose-response experiment with multiple groups associated with multiple dose levels of the drug. The dose-response relationship can be modeled by a shape-restricted normal regression. We develop an iterative two-step ascent algorithm to estimate normal means and variances subject to simultaneous constraints. Each iteration consists of two parts: an expectation-maximization (EM) algorithm that is utilized in Step 1 to compute the maximum likelihood estimates (MLEs) of the restricted means when variances are given, and a newly developed restricted De Pierro algorithm that is used in Step 2 to find the MLEs of the restricted variances when means are given. These constraints include the simple order, tree order, umbrella order, and so on. A bootstrap approach is provided to calculate standard errors of the restricted MLEs. Applications to the analysis of two real datasets on radioimmunological assay of cortisol and bioassay of peptides are presented to illustrate the proposed methods. Liu (2000) discussed the maximum likelihood estimation and Bayesian estimation in a multinomial model with simplex constraints by formulating this constrained parameter problem into an unconstrained parameter problem in the framework of missing data. To utilize the EM and data augmentation (DA) algorithms, he introduced latent variables {Zil; Yil} (to be defined later). However, the proposed DA algorithm in his paper did not provide the necessary individual conditional distributions of Yil given (the observed data and) the updated parameter estimates. Indeed, the EM algorithm developed in his paper is based on the assumption that {Yil} are fixed given values. Fortunately, the EM algorithm is invariant under any choice of the value of Yil, so the final result is always correct. We have derived the aforesaid conditional distributions and hence provide a valid DA algorithm. A real data set is used for illustration.
Statistics and Actuarial Science
Master
Master of Philosophy
9

Petrie, John Eric. "The Accuracy of River Bed Sediment Samples." Thesis, Virginia Tech, 1998. http://hdl.handle.net/10919/30957.

Abstract:
One of the most important factors that influences a stream's hydraulic and ecological health is the streambed's sediment size distribution. This distribution affects streambed stability, sediment transport rates, and flood levels by defining the roughness of the stream channel. Adverse effects on water quality and wildlife can be expected when excessive fine sediments enter a stream. Many chemicals and toxic materials are transported through streams by binding to fine sediments. Increases in fine sediments also seriously impact the survival of fish species present in the stream. Fine sediments fill tiny spaces between larger particles thereby denying fish embryos the necessary fresh water to survive. Reforestation, constructed wetlands, and slope stabilization are a few management practices typically utilized to reduce the amount of sediment entering a stream. To effectively gauge the success of these techniques, the sediment size distribution of the stream must be monitored. Gravel bed streams are typically stratified vertically, in terms of particle size, in three layers, with each layer having its own distinct grain size distribution. The top two layers of the stream bed, the pavement and subpavement, are the most significant in determining the characteristics of the stream. These top two layers are only as thick as the largest particle size contained within each layer. This vertical stratification by particle size makes it difficult to characterize the grain size distribution of the surface layer. The traditional bulk or volume sampling procedure removes a specified volume of material from the stream bed. However, if the bed exhibits vertical stratification, the volume sample will mix different populations, resulting in inaccurate sample results. To obtain accurate results for the pavement size distribution, a surface oriented sampling technique must be employed. The most common types of surface oriented sampling are grid and areal sampling. 
Due to limitations in the sampling techniques, grid samples typically truncate the sample at the finer grain sizes, while areal samples typically truncate the sample at the coarser grain sizes. When combined with an analysis technique, either frequency-by-number or frequency-by-weight, the sample results can be represented in terms of a cumulative grain size distribution. However, the results of different sampling and analysis procedures can lead to biased results, which are not equivalent to traditional volume sampling results. Different conversions, dependent on both the sampling and analysis technique, are employed to remove the bias from surface sample results. The topic of the present study is to determine the accuracy of sediment samples obtained by the different sampling techniques. Knowing the accuracy of a sample is imperative if the sample results are to be meaningful. Different methods are discussed for placing confidence intervals on grid sample results based on statistical distributions. The binomial distribution and its approximation with the normal distribution have been suggested for these confidence intervals in previous studies. In this study, the use of the multinomial distribution for these confidence intervals is also explored. The multinomial distribution seems to best represent the grid sampling process. Based on analyses of the different distributions, recommendations are made. Additionally, figures are given to estimate the grid sample size necessary to achieve a required accuracy for each distribution. This type of sample size determination figure is extremely useful when preparing for grid sampling in the field. Accuracy and sample size determination for areal and volume samples present difficulties not encountered with grid sampling. The variability in number of particles contained in the sample coupled with the wide range of particle sizes present make direct statistical analysis impossible. 
Limited studies have been reported on the necessary volume to sample for gravel deposits. The majority of these studies make recommendations based on empirical results that may not be applicable to different size distributions. Even fewer studies have been published that address the issue of areal sample size. However, using grid sample results as a basis, a technique is presented to estimate the necessary sizes for areal and volume samples. These areal and volume sample sizes are designed to match the accuracy of the original grid sample for a specified grain size percentile of interest. Obtaining grid and areal results with the same accuracy can be useful when considering hybrid samples. A hybrid sample represents a combination of grid and areal sample results that give a final grain size distribution curve that is not truncated. Laboratory experiments were performed on synthetic stream beds to test these theories. The synthetic stream beds were created using both glass beads and natural sediments. Reducing sampling errors and obtaining accurate samples in the field are also briefly discussed. Additionally, recommendations are also made for using the most efficient sampling technique to achieve the required accuracy.
Master of Science
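The abstract above notes that earlier studies placed confidence intervals on grid sample results using the binomial distribution and its normal approximation. A minimal Python sketch of that textbook approximation, together with the matching sample-size calculation, is given below. It does not reproduce the multinomial intervals or the sample-size figures from the thesis, and the function names and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_ci(finer, n, level=0.95):
    """Normal-approximation (Wald) interval for the fraction of grid
    particles finer than a given size, treating each particle as a
    binomial trial."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = finer / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)


def grid_sample_size(p, half_width, level=0.95):
    """Number of particles needed so the Wald interval around an
    anticipated proportion p has the requested half-width."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.ceil(z * z * p * (1 - p) / half_width ** 2)


# Hypothetical grid sample: 50 of 100 particles finer than the size of interest.
print(proportion_ci(50, 100))
print(grid_sample_size(0.5, 0.1))
```

The Wald interval is known to behave poorly for proportions near 0 or 1, which is one reason to compare it against alternatives.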
10

Meister, Kadri. "On Methods for Real Time Sampling and Distributions in Sampling." Doctoral thesis, Umeå : Univ, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-415.

11

Chen, Yen-Han, and 陳衍翰. "The Estimation of Restricted Multinomial Distribution Parameters." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/88908920130765114047.

Abstract:
Master's thesis (碩士)
National Chiao Tung University (國立交通大學)
Institute of Statistics (統計學研究所)
ROC academic year 104
The thesis focuses on parameter estimation for a multinomial distribution in which the maximum number of occurrences of each possible outcome is limited. The motivation is briefly introduced first. We then parameterize the problem and deal with the easier case. We derive the maximum likelihood estimator and then its asymptotic distribution. Next, we examine the estimation bias that arises when the parameters are estimated without knowledge of the occurrence restriction. We run several simulations to observe how the estimation goes wrong under different parameter settings. Finally, we slightly generalize the easier case mentioned before and compare how different sampling orders affect the standard deviations of the maximum likelihood estimators.
12

林哲民. "Selection procedure for a multinomial distribution with inverse sampling." Thesis, 1992. http://ndltd.ncl.edu.tw/handle/84035062673000212612.

13

Zang, Jia-You, and 張嘉祐. "Application Normal Distribution Building Multinomial Tree of Option Pricing Model." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/70331226253035617653.

Abstract:
Master's thesis (碩士)
National Kaohsiung University of Applied Sciences (國立高雄應用科技大學)
Institute of Finance and Information (金融資訊研究所)
ROC academic year 98
Because systemic events occur frequently in financial markets, this study builds a new option pricing model that includes the risk of price movement in order to improve pricing accuracy. Many studies price options by Monte Carlo simulation or trinomial trees; in this study we use a tree model with a multiple joint normal distribution to build the option pricing model. Compared with traditional tree models, the new model provides a wider range of price paths and can be expected to converge faster. We assume that all prices at the tree nodes follow a normal distribution, and we do not consider jump events. According to the simulation results, a model that accounts for the risk of price movement converges faster and prices options more effectively.
14

Sheng-Gang, Wu, and 伍學綱. "The inference of change-point problem in Multinomial Distribution for repeated measures." Thesis, 1998. http://ndltd.ncl.edu.tw/handle/95206019312261562441.

15

Chen, Anamda, and 陳慧珍. "The Analysis of Two-Dimensional Contingency Tables with Incompletely Classified Data by Poisson Distribution and Multinomial Distribution." Thesis, 1996. http://ndltd.ncl.edu.tw/handle/11174997402484033215.

16

Lee, Chung-Han, and 李宗翰. "Sample Size Calculation for Complete Data and Interval Estimation for the Multinomial Distribution." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/m84c6s.

Abstract:
Doctoral dissertation (博士)
National Chiao Tung University (國立交通大學)
Institute of Statistics (統計學研究所)
ROC academic year 107
In this dissertation, we focus on two topics. The first topic is interval estimation for the probabilities of the multinomial distribution. Statistical intervals are widely used in many fields of study. Simultaneous confidence intervals for multinomial proportions have been proposed in many applications, including quality control and clinical data analysis. Because of these wide applications, the multinomial distribution plays an important role in many areas of science. Thus, we propose a method for constructing confidence intervals for the probabilities of the multinomial distribution. A simulation study is conducted to compare the performance of different intervals. The second topic is to derive the parameter estimators after missing data imputation under a misspecified model and to determine the sample size of the complete data. We consider the case in which the misspecified model is underfitting. Finally, we apply the proposed methodology to analyze a stroke data set. The time interval called the pre-hospital delay is important for thrombolytic therapy. Therefore, our study explores the association between pre-hospital delay and mode of arrival, stroke severity, initial symptoms and signs, and stroke risk factors.
17

Xu, Wenjun. "Bayesian analysis of multinomial regression with gamma utilities." Thesis (M.Phil.), Chinese University of Hong Kong, 2012. http://library.cuhk.edu.hk/record=b5549053.

Abstract:
多項式回歸模型可用來模擬賽馬過程。不同研究者對模型中馬匹的效用的分佈採取不同的假設,包括指數分佈,它與Harville 模型(Harville, 1973)相同,伽馬分佈(Stern, 1990)和正態分佈(Henery, 1981)。Harville 模型無法模擬賽馬過程中競爭第二位和第三位等非冠軍位置時增加的隨機性(Benter, 1994)。Stern 模型假設效用服從形狀參數大於一的伽馬分佈,Henery 模型假設效用服從正態分佈。Bacon-Shone,Lo 和 Busche(1992),Lo 和 Bacon-Shone(1994)和 Lo(1994)研究證明了相較於Harville 模型,這兩個模型能更好地模擬賽馬過程。本文利用賽馬歷史數據,採用貝葉斯方法對賽馬結果中馬匹勝出的概率進行預測。本文假設效用服從伽馬分佈。本文針對多項式回歸模型,提出一個在Metropolis-Hastings 抽樣方法中選擇提議分佈的簡便方法。此方法由Scott(2008)首次提出。我們在似然函數中加入服從伽馬分佈的效用作為潛變量。通過將服從伽馬分佈的效用變換成一個服從Mihram(1975)所描述的廣義極值分佈的隨機變量,我們得到一個線性回歸模型。由此線性模型我們可得到最小二乘估計,本文亦討論最小二乘估計的漸進抽樣分佈。我們利用此估計的方差得到Metropolis-Hastings 抽樣方法中的提議分佈。最後,我們可以得到回歸參數的後驗分佈樣本。本文用香港賽馬數據做模擬賽馬投資以檢驗本文提出的估計方法。
In multinomial regression of racetrack betting, different distributions of utilities have been proposed: exponential distribution which is equivalent to Harville’s model (Harville, 1973), gamma distribution (Stern, 1990) and normal distribution (Henery, 1981). Harville’s model has the drawback that it ignores the increasing randomness of the competitions for the second and third place (Benter, 1994). Stern’s model using gamma utilities with shape parameter greater than 1 and Henery’s model using normal utilities have been shown to produce a better fit (Bacon-Shone, Lo and Busche, 1992; Lo and Bacon-Shone, 1994; Lo, 1994). In this thesis, we use the Bayesian methodology to provide prediction on the winning probabilities of horses with the historical observed data. The gamma utility is adopted throughout the thesis. In this thesis, a convenient method of selecting Metropolis-Hastings proposal distributions for multinomial models is developed. A similar method was first exploited by Scott (2008). We augment the gamma distributed utilities in the likelihood as latent variables. The gamma utility is transformed to a variable that follows the generalized extreme value distribution described by Mihram (1975), through which we get a linear regression model. The least squares estimate of the parameters is easily obtained from this linear model. The asymptotic sampling distribution of the least squares estimate is discussed. The Metropolis-Hastings proposal distribution is generated conditioning on the variance of the estimator. Finally, samples from the posterior distribution of regression parameters are obtained. The proposed method is tested through betting simulations using data from the Hong Kong horse racing market.
Detailed summary in vernacular field only.
Xu, Wenjun.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2012.
Includes bibliographical references (leaves 46-48).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstracts also in Chinese.
Chapter 1 --- Introduction
Chapter 2 --- Hong Kong Horse Racing Market and Models in Horse Racing
Chapter 2.1 --- Hong Kong Horse Racing Market
Chapter 2.2 --- Models in Horse Racing
Chapter 3 --- Metropolis-Hastings Algorithm in Multinomial Regression with Gamma Utilities
Chapter 3.1 --- Notations and Posterior Distribution
Chapter 3.2 --- Metropolis-Hastings Algorithm
Chapter 4 --- Application
Chapter 4.1 --- Variables
Chapter 4.2 --- Markov Chain Simulation
Chapter 4.3 --- Model Selection
Chapter 4.4 --- Estimation Result
Chapter 4.5 --- Betting Strategies and Comparisons
Chapter 5 --- Conclusion
Appendix A
Appendix B
Bibliography
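To make the Harville baseline named in this record's abstract concrete, the sketch below computes ordering probabilities under Harville's model: once a horse is placed first, the remaining win probabilities are renormalized to give the chance of finishing second. This covers only the exponential-utility baseline, not the gamma-utility model or the Metropolis-Hastings sampler of the thesis; the function name and the win probabilities are hypothetical.

```python
def harville_exacta(win_probs, first, second):
    """Probability that horse `first` wins and horse `second` is runner-up
    under Harville's model: p_first * p_second / (1 - p_first)."""
    if first == second:
        raise ValueError("first and second must be different horses")
    p_first = win_probs[first]
    return p_first * win_probs[second] / (1.0 - p_first)


# Hypothetical win probabilities for a three-horse race.
probs = [0.5, 0.3, 0.2]
print(harville_exacta(probs, 0, 1))
```

Because the same renormalization is applied regardless of position, the model cannot represent extra randomness in the contest for second and third place, which is the drawback the abstract attributes to it.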
18

WU, CHEN-HSUAN, and 吳晨瑄. "Interval estimation for the odds ratio of a 2 × 2 contingency table from multinomial distribution." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/32n7je.

Abstract:
Master's thesis (碩士)
National Central University (國立中央大學)
Graduate Institute of Statistics (統計研究所)
ROC academic year 96
For a 2 × 2 contingency table sampled from a multinomial distribution, we are interested in measuring the strength of association between two variables by the odds ratio. Constructing a confidence interval for the odds ratio is also of primary concern in practice. Under multinomial sampling, there are two nuisance parameters in addition to the odds ratio. Hence the exact conditional approach is usually taken to obtain a confidence interval for the odds ratio. However, the exact conditional confidence interval can be very conservative because the conditional distribution it relies on may be highly discrete when the sample size is small. On the other hand, the exact unconditional approach eliminates the nuisance parameters by taking the maximal p-value over all possible values of the nuisance parameters. In this paper, we take the unconditional approach to obtain a modified confidence interval. For small to moderate sample sizes, numerical studies show that, compared with other intervals, the modified confidence interval usually has shorter length, and its actual confidence coefficient is at least, and closer to, the nominal confidence coefficient.
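As a point of reference for the quantity discussed above, the sketch below computes the sample odds ratio of a 2 × 2 table with the familiar large-sample log-scale (Woolf) interval. This is a swapped-in textbook method, not the exact conditional or modified unconditional interval studied in the thesis, and the function name and cell counts are hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_woolf_ci(n11, n12, n21, n22, level=0.95):
    """Sample odds ratio with a large-sample interval built on the log scale.

    Assumes all four cell counts are positive.
    """
    or_hat = (n11 * n22) / (n12 * n21)
    se_log = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper


# Hypothetical table with rows (10, 20) and (20, 10).
print(odds_ratio_woolf_ci(10, 20, 20, 10))
```

The large-sample interval can be unreliable for small tables, which is the setting where exact approaches such as those in the abstract are of interest.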
19

Gerber, Eric A. "A Mixed Effects Multinomial Logistic-Normal Model for Forecasting Baseball Performance." Thesis, 2019.

Abstract:
Prediction of player performance is a key component in the construction of baseball team rosters. Traditionally, the problem of predicting seasonal plate appearance outcomes has been approached univariately. That is, focusing on each outcome separately rather than jointly modeling the collection of outcomes. More recently, there has been a greater emphasis on joint modeling, thereby accounting for the correlations between outcomes. However, most of these state of the art prediction models are the proprietary property of teams or industrial sports entities and so little is available in open publications.

This dissertation introduces a joint modeling approach to predict seasonal plate appearance outcome vectors using a mixed-effects multinomial logistic-normal model. This model accounts for positive and negative correlations between outcomes both across and within player seasons. It is also applied to the important, yet unaddressed, problem of predicting performance for players moving between the Japanese and American major leagues.

This work begins by motivating the methodological choices through a comparison of state of the art procedures followed by a detailed description of the modeling and estimation approach that includes model fit assessments. We then apply the method to longitudinal multinomial count data of baseball player-seasons for players moving between the Japanese and American major leagues and discuss the results. Extensions of this modeling framework to other similar data structures are also discussed.
20

Barr, Aila. "New statistical models for discrete uni- and multivariate data sets with special reference to the Dirichlet multinomial distribution." Thesis, 2014. http://hdl.handle.net/10539/15910.

21

theng, hui-ching, and 曾慧菁. "Confidence interval and confidence level for the odds ratio of a multinomial distribution by the family of power divergence statistics." Thesis, 1999. http://ndltd.ncl.edu.tw/handle/85891494256470102477.

Abstract:
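Master's thesis (碩士)
National Chung Hsing University (國立中興大學)
Department of Applied Mathematics (應用數學系)
ROC academic year 87
Let $\{X_{ij}\}$ follow a multinomial distribution $\mathrm{Mul}(N, P_{11}, P_{12}, P_{21}, P_{22})$. The power divergence family of statistics (indexed by $\lambda$) introduced by Cressie and Read (1984) is employed to obtain asymptotic confidence intervals for the odds ratio $\varphi = P_{11}P_{22}/(P_{12}P_{21})$. The actual confidence levels are computed exactly, and the simulated confidence levels are obtained by computer simulation. It is observed that when $0.67 \le \lambda \le 1.5$ and the minimum cell expectation is $\ge 5$, the confidence levels of the power divergence intervals are uniformly larger than, and close to, the nominal confidence level $1-\alpha$.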
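For readers unfamiliar with the Cressie and Read (1984) power divergence family named in the abstract above, the sketch below evaluates the statistic for given observed and expected counts. It only shows the statistic itself, not the confidence interval construction of the thesis; the function name and the example counts are hypothetical, and observed counts are assumed positive when the index is zero or negative.

```python
import math


def power_divergence(observed, expected, lam):
    """Cressie-Read power divergence statistic with index `lam`.

    lam = 1 gives Pearson's chi-square; lam = 0 (taken as a limit) gives
    the likelihood-ratio statistic; lam = -1 is also handled as a limit.
    """
    pairs = list(zip(observed, expected))
    if lam == 0:
        return 2.0 * sum(o * math.log(o / e) for o, e in pairs)
    if lam == -1:
        return 2.0 * sum(e * math.log(e / o) for o, e in pairs)
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in pairs)
    return 2.0 / (lam * (lam + 1.0)) * total


# Hypothetical counts; expected counts have the same total as observed.
observed = [10, 20, 30, 40]
expected = [25, 25, 25, 25]
print(power_divergence(observed, expected, 1))      # Pearson's chi-square
print(power_divergence(observed, expected, 2 / 3))  # index recommended by Cressie and Read
```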
22

Silvestre, Cláudia Marisa Vasconcelos. "Clustering with discrete mixture models: An integrated approach for model selection." Doctoral thesis, 2014. http://hdl.handle.net/10071/9991.

Abstract:
A investigação em analise de agrupamento (cluster analysis) continua em curso. Identificar o número de grupos, bem como seleccionar um subconjunto de variáveis relevantes a partir de dados de uma amostra constituem domínios de investigação ativa em agrupamento. Grande parte dos métodos desenvolvidos para abordar estas temáticas refere-se a dados contínuos, e não podem ser directamente aplicados ao agrupamento de dados categoriais. Este trabalho, pretende ser um contributo nesta área, abordando o agrupamento de dados categoriais.
Research on cluster analysis continues to develop. Identifying the number of clusters and selecting a subset of relevant variables available in the data have been active areas in research on clustering methods. The approaches proposed for addressing these issues are mostly designed to deal with numerical data and cannot be directly applied for clustering categorical data. This work intends to be a contribution to handling categorical data, in this area.
23

Tsai, Pei-Yuan, and 蔡佩洹. "A robust inference of comparing multinomial distributions under paired designs." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/78296465828870685143.

Abstract:
Master's thesis (碩士)
National Central University (國立中央大學)
Graduate Institute of Statistics (統計研究所)
ROC academic year 104
We propose a new robust likelihood approach for inference about the difference between two multinomial distributions in paired designs. The merit of this parametric robust method is illustrated by the robust score statistic for testing the equality of two multinomial distributions. This test accounts for the within-cluster correlation in a data-driven manner and is easy to compute without a full model specification. The robust score test reduces to McNemar’s test in the paired binary data scenario. We provide theoretical justification and use simulations and real data analysis to demonstrate the superiority of the robust procedure.
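The abstract above states that the robust score test reduces to McNemar's test for paired binary data. As a reminder of that special case only, the sketch below computes the classical McNemar statistic (without continuity correction) from the two discordant-pair counts; it does not implement the robust score test itself, and the function name and counts are hypothetical.

```python
import math


def mcnemar(b, c):
    """McNemar chi-square statistic and p-value for paired binary data.

    b and c are the two discordant-pair counts. The statistic is referred
    to a chi-square distribution with 1 degree of freedom, whose upper tail
    equals erfc(sqrt(x / 2)).
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    stat = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical discordant counts.
print(mcnemar(15, 5))
```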
24

Chuang, Wei-en, and 莊瑋恩. "A Method to Setting the Parameters of Prior Distributions on the Multinomial Naïve Bayes Model for Text Classification." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/19361401568414373730.

Abstract:
Master's thesis (碩士)
National Cheng Kung University (國立成功大學)
Department of Industrial and Information Management, master's and doctoral program (工業與資訊管理學系碩博士班)
ROC academic year 96
The naïve Bayes classifier is a popular technique for text classification because it performs well and has low computational complexity. Because the distribution of words in documents varies in type, several probabilistic models have been proposed, such as the binary independence model, the multinomial model, the Poisson model, the negative model, etc. Previous studies have found that the multinomial model usually gives higher classification accuracy than the binary independence model. In this study, we use the multinomial naïve Bayes classifier for text classification and focus on the impact of setting the parameters of prior distributions. In the multinomial naïve Bayes model, we assume the prior distribution to be either a Dirichlet or a generalized Dirichlet distribution. Setting the large number of parameters becomes an issue when we use generalized Dirichlet distributions as priors. In order to reduce the computational complexity and obtain higher accuracy, we separate the parameters into several groups and propose five methods to systematically change the parameters corresponding to a group. We use the data set MDR88 in our analysis. The experimental results show that the concurrent prior setting method does not achieve better classification accuracy because it ignores the influence of the documents in each class. In contrast, taking that influence into account means using the individual prior setting method, which does improve accuracy. Since every word may play an important role in a certain class, it is improper to adjust all parameters in a prior concurrently. We try to relax this restriction by using generalized Dirichlet distributions as priors together with the concept of separating parameters into groups. The experimental results show that individual prior setting by group achieves higher classification accuracy.
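To illustrate the role a Dirichlet prior plays in the multinomial naïve Bayes model described above, the sketch below uses a single symmetric Dirichlet parameter `alpha`, which amounts to additive smoothing of the word counts. It does not implement the generalized Dirichlet priors or the grouped parameter-setting methods of the thesis; the function names, the `alpha` value, and the toy documents are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels, alpha=1.0):
    """Fit multinomial naive Bayes with a symmetric Dirichlet(alpha) prior
    on each class's word probabilities (posterior-mean estimates)."""
    vocab = sorted({w for doc in docs for w in doc})
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, y in zip(docs, labels):
        word_counts[y].update(doc)
    model = {}
    for y, n_docs in class_docs.items():
        total = sum(word_counts[y].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(docs))
        log_like = {w: math.log((word_counts[y][w] + alpha) / total)
                    for w in vocab}
        model[y] = (log_prior, log_like)
    return model


def predict(model, doc):
    """Return the class with the highest posterior log score.
    Words outside the training vocabulary are ignored."""
    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(log_like[w] for w in doc if w in log_like)
    return max(model, key=score)


# Toy corpus, purely illustrative.
docs = [["ball", "goal"], ["goal", "team"],
        ["stock", "market"], ["market", "price"]]
labels = ["sport", "sport", "finance", "finance"]
model = train(docs, labels, alpha=1.0)
print(predict(model, ["goal", "ball"]))
```

Changing `alpha` shifts how strongly rare words are pulled toward the uniform distribution, which is the kind of prior-parameter sensitivity the abstract investigates in a more general form.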
25

Ouimet, Frédéric. "Extremes of log-correlated random fields and the Riemann zeta function, and some asymptotic results for various estimators in statistics." Thèse, 2019. http://hdl.handle.net/1866/22667.
