Dissertations / Theses on the topic 'Item response theory – Statistical methods'
Combs, Adam. "Bayesian Model Checking Methods for Dichotomous Item Response Theory and Testlet Models." Bowling Green State University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1394808820.
Kopf, Julia [author]. "Model-based Recursive Partitioning Meets Item Response Theory. New Statistical Methods for the Detection of Differential Item Functioning and Appropriate Anchor Selection / Julia Kopf." München : Verlag Dr. Hut, 2013. http://d-nb.info/1045988804/34.
Kopf, Julia [author], and Carolin [academic supervisor] Strobl. "Model-based recursive partitioning meets item response theory : new statistical methods for the detection of differential item functioning and appropriate anchor selection / Julia Kopf. Supervisor: Carolin Strobl." München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2013. http://d-nb.info/1046503235/34.
Carter, Nathan T. "APPLICATIONS OF DIFFERENTIAL FUNCTIONING METHODS TO THE GENERALIZED GRADED UNFOLDING MODEL." Bowling Green State University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1290885927.
Ueckert, Sebastian. "Novel Pharmacometric Methods for Design and Analysis of Disease Progression Studies." Doctoral thesis, Uppsala universitet, Institutionen för farmaceutisk biovetenskap, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-216537.
Jiang, Jing. "Regularization Methods for Detecting Differential Item Functioning." Thesis, Boston College, 2019. http://hdl.handle.net/2345/bc-ir:108404.
Differential item functioning (DIF) occurs when examinees of equal ability from different groups have different probabilities of correctly responding to certain items. DIF analysis aims to identify potentially biased items to ensure the fairness and equity of instruments, and has become a routine procedure in developing and improving assessments. This study proposed a DIF detection method using regularization techniques, which allows all items on a test to be investigated simultaneously for both uniform and nonuniform DIF. To evaluate the performance of the proposed DIF detection models and understand the factors that influence it, comprehensive simulation studies and empirical data analyses were conducted. Under various conditions including test length, sample size, sample size ratio, percentage of DIF items, DIF type, and DIF magnitude, the operating characteristics of three kinds of regularized logistic regression models (lasso, elastic net, and adaptive lasso), each characterized by its penalty function, were examined and compared. Selection of the optimal tuning parameter was investigated using two well-known information criteria, AIC and BIC, and cross-validation. The results revealed that BIC outperformed the other model selection criteria: it not only flagged high-impact DIF items precisely, but also prevented over-identification of DIF items, with few false alarms. Among the regularization models, the adaptive lasso outperformed the other two models in most conditions. The performance of the regularized DIF detection model using adaptive lasso was then compared to two commonly used DIF detection approaches, the logistic regression method and the likelihood ratio test. The proposed model was applied to empirical datasets to demonstrate the applicability of the method in real settings.
Thesis (PhD) — Boston College, 2019
Submitted to: Boston College. Lynch School of Education
Discipline: Educational Research, Measurement and Evaluation
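One common way to write the kind of model this abstract describes is sketched below. It is an illustration under stated assumptions, not the thesis's exact specification: it uses the standard logistic regression DIF formulation, in which $\theta_i$ is an ability proxy for examinee $i$ (often the total score) and $g_i$ is a group indicator. The symbols $\beta_{0j},\dots,\beta_{3j}$, $\lambda$, $\alpha$, and $w_{kj}$ are notation introduced here, and penalizing only the DIF terms is likewise an assumption.

```latex
% Item-level logistic regression for item j (assumed standard form)
\operatorname{logit} P(Y_{ij} = 1 \mid \theta_i, g_i)
  = \beta_{0j} + \beta_{1j}\,\theta_i + \beta_{2j}\,g_i + \beta_{3j}\,\theta_i g_i

% Penalized estimation: the DIF coefficients (k = 2, 3) are shrunk toward zero
\min_{\beta}\; -\ell(\beta) + \lambda \sum_{j} \sum_{k \in \{2,3\}} p(\beta_{kj})

% Generic penalty functions for the three models named in the abstract
p_{\text{lasso}}(\beta) = |\beta|, \qquad
p_{\text{elastic net}}(\beta) = \alpha\,|\beta| + (1-\alpha)\,\beta^{2}, \qquad
p_{\text{adaptive lasso}}(\beta_{kj}) = w_{kj}\,|\beta_{kj}|
```

Under this formulation, a nonzero $\beta_{2j}$ indicates uniform DIF and a nonzero $\beta_{3j}$ indicates nonuniform DIF, so the items whose DIF coefficients survive shrinkage at the selected $\lambda$ are the ones flagged. The adaptive lasso weights $w_{kj}$ are typically derived from an initial unpenalized estimate.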
Peterson, Jaime Leigh. "Multidimensional item response theory observed score equating methods for mixed-format tests." Diss., University of Iowa, 2014. https://ir.uiowa.edu/etd/1379.
Morse, Brendan J. "Controlling Type 1 errors in moderated multiple regression: an application of item response theory for applied psychological research." Ohio : Ohio University, 2009. http://www.ohiolink.edu/etd/view.cgi?ohiou1247063796.
Choi, Jiwon. "Comparison of MIRT observed score equating methods under the common-item nonequivalent groups design." Diss., University of Iowa, 2019. https://ir.uiowa.edu/etd/6716.
Chen, Keyu. "A comparison of fixed item parameter calibration methods and reporting score scales in the development of an item pool." Diss., University of Iowa, 2019. https://ir.uiowa.edu/etd/6923.
Pfleger, Phillip Isaac. "Designing Software to Unify Person-Fit Assessment." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8776.
Sheng, Yanyan. "Bayesian analysis of hierarchical IRT models: comparing and combining the unidimensional & multi-unidimensional IRT models." Diss., Columbia, Mo. : University of Missouri-Columbia, 2005. http://hdl.handle.net/10355/4153.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on July 19, 2006). Vita. Includes bibliographical references.
Martin, Dale Frederick Hosking. "Improving the Detection of Narcissistic Transformational Leaders with the Multifactor Leadership Questionnaire: An Item Response Theory Analysis." ScholarWorks, 2011. https://scholarworks.waldenu.edu/dissertations/849.
Meng, Huijuan, Walter P. Vispoel, and Won-Chan Lee. "A comparison study of IRT calibration methods for mixed-format tests in vertical scaling." Iowa City : University of Iowa, 2007. http://ir.uiowa.edu/etd/338.
Whorton, Skyler. "Can a computer adaptive assessment system determine, better than traditional methods, whether students know mathematics skills?" Digital WPI, 2013. https://digitalcommons.wpi.edu/etd-theses/224.
Mair, Patrick, Eva Hofmann, Kathrin Gruber, Reinhold Hatzinger, Achim Zeileis, and Kurt Hornik. "What Drives Package Authors to Participate in the R Project for Statistical Computing? Exploring Motivation, Values, and Work Design." National Academy of Sciences, 2015. http://epub.wu.ac.at/4702/1/cranpnas.pdf.
Hou, Jianlin, and Walter P. Vispoel. "Effectiveness of the hybrid Levine equipercentile and modified frequency estimation equating methods under the common-item nonequivalent groups design." Iowa City : University of Iowa, 2007. http://ir.uiowa.edu/etd/339.
Lopez, Gabriel E. "Detection and Classification of DIF Types Using Parametric and Nonparametric Methods: A comparison of the IRT-Likelihood Ratio Test, Crossing-SIBTEST, and Logistic Regression Procedures." Scholar Commons, 2012. http://scholarcommons.usf.edu/etd/4131.
Duncan, Kristin A. "Case and covariate influence: implications for model assessment." Connect to this title online, 2004. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1095357183.
Title from first page of PDF file. Document formatted into pages; contains xi, 123 p.; also includes graphics (some col.). Includes bibliographical references (p. 120-123).
Santos, José Roberto Silva dos 1984. "Um modelo de resposta ao item para grupos múltiplos com distribuições normais assimétricas centralizadas." [s.n.], 2012. http://repositorio.unicamp.br/jspui/handle/REPOSIP/306791.
Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica
Abstract: A dominant assumption in item response models (IRM) is that the latent traits follow a symmetric normal distribution. However, this assumption has been questioned in several works, for example Micceri (1989) and Bazán et al. (2006). Recently, Azevedo et al. (2011) proposed an IRM with a skew-normal distribution under the centred parametrization for the latent traits, considering a single group of examinees. In the present work, we extend this model to multiple groups. We developed two MCMC algorithms for parameter estimation, using the augmented data structure to represent the item response function (IRF); see Albert (1992). The first is a Gibbs sampler with Metropolis-Hastings steps. In the second, we use stochastic representations (creating a hierarchical structure) of the prior densities of the latent traits and population parameters, thereby obtaining known forms for all full conditional distributions, which enabled us to develop a full Gibbs sampler. We compared these algorithms using effective sample size as the criterion; see Sahu (2002). The full Gibbs sampler performed best. We also evaluated the impact of the number of examinees per group, the number of items per group, the number of common items, the asymmetry of the reference group distribution, and the prior on parameter recovery. The results indicated that our model recovers all parameters well, especially under the Jeffreys prior. Furthermore, the number of items per group and the number of examinees per group had a high impact on the recovery of the latent traits and the item parameters, respectively. We analyzed a real data set that shows evidence of asymmetry in the latent trait distribution of some groups. The results obtained with our model confirmed the presence of asymmetry in most groups. We studied some diagnostic measures based on the predictive distribution of appropriate discrepancy measures. Finally, we compared the symmetric and asymmetric models using the criteria suggested by Spiegelhalter et al. (2002). The asymmetric model fit the data better according to all criteria.
Master's degree
Statistics
Master in Statistics
Padilla, Gómez Juan Leonardo 1989. "Modelos da teoria de resposta ao item multidimensionais assimétricos de grupos múltiplos para respostas dicotômicas sob um enfoque bayesiano." [s.n.], 2014. http://repositorio.unicamp.br/jspui/handle/REPOSIP/306792.
Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica
Abstract: This work proposes new Multidimensional Item Response Theory (MIRT) models for dichotomous or dichotomized responses under a multiple group structure. For the latent trait distribution of each group, it proposes a new parametrization of the centered multivariate skew-normal distribution that combines the proposals of Lachos (2004) and Arellano-Valle et al. (2008). This parametrization not only ensures the identifiability of the models introduced here, but also simplifies the interpretation and estimation of their parameters. The model is therefore an interesting alternative for solving the identifiability problems found by Matos (2010) and Nojosa (2008) in the single-group multidimensional skewed models they developed. Simulation studies covering several scenarios of practical interest were conducted to evaluate the potential of the triad of modeling, estimation methods, and diagnostic tools. The results indicate that models accounting for skewness in the latent traits generally produced more accurate estimates than the traditional symmetric models. For model selection, the deviance information criterion (DIC) and the expected values of the Akaike information criterion (EAIC) and the Bayesian information criterion (EBIC) were used. To assess model fit, posterior predictive checking methods were explored; these make it possible to evaluate both the quality of the measurement instrument and the fit of the model, globally and with respect to specific assumptions such as test dimensionality. Regarding estimation, several MCMC algorithms proposed in the literature for other models were adapted and implemented, including the convergence acceleration proposal of González (2004), and were compared on convergence quality using the effective sample size criterion of Sahu (2002). A real data set from the first stage of the 2013 UNICAMP admission exam was also analyzed.
Master's degree
Statistics
Master in Statistics
Sunny, Cijy Elizabeth. "Stakeholders’ Conceptualization of Students’ Attitudes and Persistence towards STEM: A Mixed Methods Instrument Development and Validation Study." University of Cincinnati / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1521190666039014.
Bush, Joan Spooner. "A Comparison of Traditional Norming and Rasch Quick Norming Methods." Thesis, University of North Texas, 1993. https://digital.library.unt.edu/ark:/67531/metadc277818/.
Silva, Wellington. "Eficácia dos processos de linkagem na avaliação educacional em larga escala." Universidade Federal de Juiz de Fora (UFJF), 2010. https://repositorio.ufjf.br/jspui/handle/ufjf/2699.
In 1997, the proficiency scale for Brazil was defined through the National System of Basic Education Evaluation (SAEB). Since then, almost all large-scale assessments carried out by Brazilian states have sought to keep their results comparable with this scale by means of Item Response Theory (IRT) methodology. However, a variety of situations emerges when the different assessments carried out by Brazilian states, and even within SAEB itself, are analyzed. This work presents some technical aspects needed to ensure comparability in assessment linking procedures, as well as the characteristics of the SAEB assessments and those of some Brazilian states over time.
Azevedo, Caio Lucidius Naberezny. "Modelos longitudinais de grupos múltiplos multiníveis na teoria da resposta ao item: métodos de estimação e seleção estrutural sob uma perspectiva bayesiana." Universidade de São Paulo, 2008. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-15042008-165256/.
In this work we propose a Bayesian framework, using an augmented data scheme, to analyze longitudinal multiple group models (LMGMIRT) in Item Response Theory (IRT). The framework consists of modelling, estimation methods, and diagnostic tools for the LMGMIRT. Concerning the modelling, we exploited multivariate and multilevel structures to represent the hierarchical nature of the longitudinal multiple groups model. This approach accommodates several submodels, such as multiple groups models and longitudinal one-group models. We studied positive and negative aspects of both approaches. The multivariate modelling represents many dependence structures in a straightforward way, and many of them can easily be considered in the estimation process. This makes it possible, for example, to consider an unstructured covariance matrix and thereby obtain information about the most appropriate dependence structure. The multilevel modelling, on the other hand, offers more straightforward interpretations of the model, the construction of univariate full conditional distributions, an easy way to include auxiliary information, and the incorporation of within- and between-subject (group) sources of variability, among others. Concerning the estimation methods, we developed a procedure based on Markov chain Monte Carlo (MCMC) simulation. We showed that the full conditional distributions are known and easy to sample from. Even though this approach demands a considerable amount of time, it circumvents many problems, such as limits on the number of groups that can be considered, limits on the number of observation times, the choice of covariance matrices, latent trait asymmetry, and data imputation, among others. Furthermore, within the MCMC methodology, we developed a procedure to select covariance matrices using Reversible Jump MCMC (RJMCMC). Simulation studies show that the model, the estimation method, and the model selection procedure produce reasonable results. The studies also indicate that the methodology is robust to the choice of prior and of initial values. The estimation methods can be extended to other situations in a straightforward way. Some of the diagnostic techniques studied assess global model fit. Others point to departures from specific assumptions, such as latent trait normality. The methodology also provides ways to assess the quality of the test or questionnaire used to measure the latent traits. Finally, the analysis of a real data set with some of the models developed demonstrated the potential of the methodology. The results of this analysis also indicate advantages in using longitudinal IRT models for repeated educational measurement data instead of assuming independence.
Buatois, Simon. "Novel pharmacometric methods to improve clinical drug development in progressive diseases." Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCC133.
In the mid-1990s, model-based approaches were mainly used as supporting tools for drug development. Restricted to the "rescue mode" in situations of drug development failure, the impact of model-based approaches was relatively limited. Nowadays, the merits of these approaches are widely recognised by stakeholders in healthcare, and they have a crucial role in drug development for progressive diseases. Despite their numerous advantages, model-based approaches present important drawbacks that limit their use in confirmatory trials. Traditional pharmacometric (PMX) analyses rely on model selection and consequently ignore model structure uncertainty when generating statistical inference. Model selection can therefore lead to over-optimistic confidence intervals and to type I error inflation. Two projects of this thesis investigated the value of innovative PMX approaches to address part of these shortcomings in a hypothetical dose-finding study for a progressive disorder. The model averaging approach coupled with a combined likelihood ratio test showed promising results and represents an additional step towards the use of PMX for primary analysis in dose-finding studies. In the learning phase, PMX is a key discipline with applications at every stage of drug development, used to gain insight into drug, mechanism, and disease characteristics with the ultimate goal of aiding efficient drug development. In this thesis, the merits of PMX analysis were evaluated in the context of Parkinson's disease. An item response theory longitudinal model was successfully developed to describe precisely the disease progression of Parkinson's disease patients while acknowledging the composite nature of a patient-reported outcome. To conclude, this thesis enhances the use of PMX to aid efficient drug development and/or regulatory decisions in drug development.
Fugita, Felipe. "Avaliação educacional : um olhar matemático." Repositório Institucional da UFABC, 2018.
Dissertation (master's) - Universidade Federal do ABC, Programa de Pós-Graduação em Mestrado Profissional em Matemática em Rede Nacional - PROFMAT, Santo André, 2017.
One of the goals of this work is to explain Item Response Theory, known as IRT, emphasizing the three-parameter logistic model and describing its main characteristics. Another objective is to show how teachers can use statistical tools in a spreadsheet to: verify the quality of the questions that make up their tests; examine whether there is a correlation between two assessment instruments; use a student's school average to make inferences about his or her performance in entrance examinations; among other possibilities. To explain IRT and its method of parameter estimation by maximum likelihood, the work first presents the mathematical, probabilistic, and statistical models that are the pillars of the theory. It also describes how the large-scale educational assessment programs of various countries use IRT to monitor the performance of their education systems. The work then presents a selection of statistical tools, specifically the correlation coefficient, the least squares method, and the point-biserial correlation, which can contribute to the educational assessments that are part of the school routine. Example spreadsheets are also illustrated, with step-by-step descriptions of their construction and of the commands used. The work thus hopes to contribute to the understanding of IRT and, consequently, of the educational indicators produced by large-scale assessment programs, as well as to teachers' practice and reflection on their methods of educational evaluation.
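Two of the quantities named in this abstract, the three-parameter logistic (3PL) response probability and the point-biserial correlation, can be sketched in a few lines. This is a minimal illustration, not the dissertation's spreadsheet implementation: the function names `p_3pl` and `point_biserial` and the sample data are assumptions made here, and the 3PL is written without the optional scaling constant D = 1.7 that some texts include.

```python
import math
from statistics import mean, pstdev


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing parameter).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1 item and total scores.

    Uses r = (M1 - M0) / s * sqrt(p * (1 - p)), where s is the
    population standard deviation of the total scores.
    """
    n = len(item_scores)
    right = [t for x, t in zip(item_scores, total_scores) if x == 1]
    wrong = [t for x, t in zip(item_scores, total_scores) if x == 0]
    if not right or not wrong:
        raise ValueError("item must have both correct and incorrect responses")
    s = pstdev(total_scores)
    if s == 0:
        raise ValueError("total scores have zero variance")
    p = len(right) / n
    return (mean(right) - mean(wrong)) / s * math.sqrt(p * (1.0 - p))


# Hypothetical data: four examinees, one item, and their total scores.
item = [1, 1, 0, 0]
totals = [4, 3, 2, 1]
print(p_3pl(theta=0.0, a=1.0, b=0.0, c=0.2))
print(point_biserial(item, totals))
```

An item with a low or negative point-biserial value is one that high scorers miss about as often as low scorers, which is the kind of signal a teacher would use to review a question.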
"Stability and sensitivity of a model-based person-fit index in detecting item pre-knowledge in computerized adaptive test." Thesis, 2008. http://library.cuhk.edu.hk/record=b6074790.
Full textItem response theory is a modern test theory. It focuses on the performance of each item. Under this framework, the performance of test takers on a test item can be predicted by a set of abilities. The relationship between the test takers' item performances and the set of abilities underlying item performances can be described by a monotonically increasing function called an item characteristic curve. Due to various personal reasons, the performances of the test takers may depart from the response patterns predicted by the underlying test model. In order to calculate the extent of departure of these aberrant response patterns, a number of methods have been developed under the theme "person-fit statistics". The degree of aberration is calculated as an index called person-fit index. Inside the computerized adaptive testing (CAT), test takers with different abilities will answer different numbers of questions and the difficulties of the items administered to them are usually clustered at the abilities of the test takers. Due to this reason, the application of person-fit indices in the computerized adaptive testing environment to measure misfit is difficult.
The present study also found that FLOR has a much superior sensitivity over other indices in detecting item pre-knowledge. Concerning about the sensitivity over different abilities of test takers, it was found that the sensitivity of FLOR was the highest among low ability test takers and the weakest among strong ability test takers in the fixed length and fixed items tests. However, the sensitivities of FLOR became the same among different abilities of test takers if items with difficulties matching their abilities were used in the tests. The number of beneficiaries among the test takers did not affect the sensitivity of FLOR. Moreover, in a simulation to test the differentiating power of FLOR, it was found that FLOR could differentiate item pre-knowledge from other reasons of personal misfits (test anxiety, player, random response and challenger) effectively.
The present study assessed the stability of FLOR over other variables, which were unrelated to item pre-knowledge. It found that FLOR was stable over the discrimination and difficulty parameters of test items. It was also stable over positions of the exposed items in the test and the initial assignment of prior probability of item pre-knowledge. However, the asymptotes (guessing factor) and the probabilities of item exposure did affect the final values of FLOR seriously.
The present study used the hf plot to assess the sensitivity of the person-fit indices. An hf plot is a plot of hit rate against false alarm rate. A higher hit rate usually comes with a higher false alarm rate. The hf plot provides a good tool for comparing indices by inspecting how quickly the curves rise: a more sensitive index gives a faster rise. In this study, the sensitivity of an index was defined as the speed of rise of its hf plot, represented by a parameter hftau estimated from the hf plot data.
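The points of a hit rate versus false alarm rate plot can be sketched as below. This assumes that a test taker is flagged when the index value reaches a threshold; the function name `hf_points`, the flagging rule, and the toy index values are hypothetical. The abstract does not give the functional form behind hftau, so its estimation is not reproduced here.

```python
def hf_points(normal_scores, aberrant_scores, thresholds):
    """Return (false alarm rate, hit rate) pairs, one per threshold.

    A test taker is flagged when the index value is >= the threshold
    (an assumed decision rule, not taken from the thesis).
    """
    points = []
    for t in thresholds:
        # Hit rate: share of truly aberrant test takers that are flagged.
        hit = sum(s >= t for s in aberrant_scores) / len(aberrant_scores)
        # False alarm rate: share of normal test takers that are flagged.
        false_alarm = sum(s >= t for s in normal_scores) / len(normal_scores)
        points.append((false_alarm, hit))
    return points


# Toy index values for simulated normal and aberrant test takers.
normal = [0.1, 0.2, 0.3, 0.4]
aberrant = [0.35, 0.5, 0.6, 0.7]
thresholds = [0.0, 0.35, 1.0]

points = hf_points(normal, aberrant, thresholds)
for fa, hit in points:
    print(f"false alarm rate={fa:.2f}  hit rate={hit:.2f}")
```

A curve that reaches a high hit rate while the false alarm rate is still low corresponds to the "faster rise" that the abstract associates with a more sensitive index.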
When frequent access to the item bank becomes feasible, test takers may memorize blocks of test items and share them with future test takers. Individuals with prior knowledge of some items may use that information to obtain high scores, so that their test scores are artificially inflated. FLOR is a posterior log-odds ratio index used for detecting the use of item pre-knowledge. It can be applied both in fixed-item, fixed-length tests and in the CAT environment. It is a model-based index in which aberrant models are defined for the situation of item pre-knowledge. FLOR describes the likelihood that a response pattern arises from the aberrant models.
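The general idea of a posterior log-odds index can be sketched with a simple Bayesian update. This is not the thesis's definition of FLOR, which the abstract does not spell out. The aberrant model used here (an exposed item is answered correctly with probability `e + (1 - e) * p`), the function name `posterior_log_odds`, and all numerical values are assumptions for illustration.

```python
import math


def posterior_log_odds(responses, p_normal, exposure, prior=0.1):
    """Log-odds that a response pattern comes from an aberrant model.

    responses: 0/1 item scores.
    p_normal:  probability of a correct answer under the normal model.
    exposure:  assumed probability (< 1) that each item was known in advance.
    prior:     assumed prior probability of item pre-knowledge.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for u, p, e in zip(responses, p_normal, exposure):
        # Hypothetical aberrant model: a known item is answered correctly.
        p_aberrant = e + (1.0 - e) * p
        if u == 1:
            log_odds += math.log(p_aberrant / p)
        else:
            log_odds += math.log((1.0 - p_aberrant) / (1.0 - p))
    return log_odds


prior_log_odds = math.log(0.1 / 0.9)
p_normal = [0.2, 0.3, 0.25]   # hard items for this test taker
exposure = [0.5, 0.5, 0.5]    # hypothetical exposure probabilities

all_correct = posterior_log_odds([1, 1, 1], p_normal, exposure)
all_wrong = posterior_log_odds([0, 0, 0], p_normal, exposure)
print(all_correct, all_wrong)
```

Under these assumptions, unexpected correct answers on exposed items push the log-odds above the prior value, while wrong answers on the same items pull it below.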
Hui Hing-fai.
Adviser: Kit-tai Hau.
Source: Dissertation Abstracts International, Volume: 70-09, Section: A, page: .
Thesis (Ed.D.)--Chinese University of Hong Kong, 2008.
Includes bibliographical references (leaves 108-111).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstracts in English and Chinese.
School code: 1307.
Deng, Nina. "Evaluating IRT- and CTT- based methods of estimating classification consistency and accuracy indices from single administrations." 2011. https://scholarworks.umass.edu/dissertations/AAI3482610.
Meng, Yu. "Comparison of kernel equating and item response theory equating methods." 2012. https://scholarworks.umass.edu/dissertations/AAI3518262.
Kuo, Hsiu-Fen, and 郭秀芬. "The Performance of Diffrernt Estimation Methods under Multidimensional Item Response Theory." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/95824695563257046316.
National Taichung University of Education (國立臺中教育大學)
Graduate Institute of Educational Measurement and Statistics (教育測驗統計研究所)
101
Currently, most research on plausible values is based on unidimensional IRT (UIRT). Research on the influence of auxiliary variables on the estimation of population statistics and item parameters under multidimensional IRT (MIRT) is rare. The purpose of this study is to explore the performance of different estimation methods under MIRT, using simulated data. The study is based on the multidimensional random coefficients multinomial logit model (MRCMLM). EAP, EAP_AV, MLE, WLE, PV_noAV and PV are compared on the estimation efficiency of individual ability and population statistics, both with and without ancillary variables. The results show that as the number of items increases, the estimation error decreases, especially for individual ability estimation. In multidimensional item response theory, when the correlation between dimensions increases, the accuracy of estimation does not improve. When ancillary variables are used, the impact of the correlation between the ancillary variables and ability is limited by the item parameter setting. When ancillary variables are incorporated, the parameters can be estimated well, even if the item parameters and ability parameters have different distributions. When ancillary variables are not incorporated, plausible values still perform well in recovering the standard deviation.
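Of the estimators listed above, EAP (expected a posteriori) is the simplest to illustrate. The sketch below is a deliberate simplification: it uses a unidimensional Rasch model with a standard normal prior and a fixed grid, not the MRCMLM or the ancillary-variable setup of the study. The function name `eap_rasch`, the grid settings, and the item difficulties are assumptions.

```python
import math


def eap_rasch(responses, difficulties, n_points=81, lo=-4.0, hi=4.0):
    """Posterior mean of ability on a grid (assumed Rasch model, N(0, 1) prior)."""
    step = (hi - lo) / (n_points - 1)
    grid = [lo + i * step for i in range(n_points)]
    weights = []
    for theta in grid:
        # Unnormalised standard normal prior density at this grid point.
        w = math.exp(-0.5 * theta * theta)
        # Multiply by the likelihood of the observed 0/1 responses.
        for u, b in zip(responses, difficulties):
            p = 1.0 / (1.0 + math.exp(-(theta - b)))
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total


difficulties = [-1.0, 0.0, 1.0]  # hypothetical item difficulties
high = eap_rasch([1, 1, 1], difficulties)
low = eap_rasch([0, 0, 0], difficulties)
print(high, low)
```

Because the prior pulls estimates towards zero, the EAP estimate stays finite even for all-correct or all-wrong response patterns, where a maximum likelihood estimate does not exist.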
Sabouri, Pooneh 1980. "Alternative estimation approaches for some common Item Response Theory models." Thesis, 2010. http://hdl.handle.net/2152/ETD-UT-2010-08-1841.
Sohn, Youngsoon. "A comparison of methods for item analysis and DIF using classical test theory, item response theory, and generalized linear model." 2009. http://purl.galileo.usg.edu/uga%5Fetd/sohn%5Fyoungsoon%5F200905%5Fma.
Taljaard, Monica. "Non-response error in surveys." Diss., 1997. http://hdl.handle.net/10500/16167.
Mathematical Sciences
M. Com. (Statistics)
Tělupil, Dominik. "Adaptivní testování pro odhad znalostí." Master's thesis, 2018. http://www.nusl.cz/ntk/nusl-386945.