
Journal articles on the topic 'Item response theory – Mathematical models'



Consult the top 50 journal articles for your research on the topic 'Item response theory – Mathematical models.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles across a wide variety of disciplines and organise your bibliography correctly.

1

Kim, Jinho, and Mark Wilson. "Polytomous Item Explanatory Item Response Theory Models." Educational and Psychological Measurement 80, no. 4 (2019): 726–55. http://dx.doi.org/10.1177/0013164419892667.

Abstract:
This study investigates polytomous item explanatory item response theory models under the multivariate generalized linear mixed modeling framework, using the linear logistic test model approach. Building on the original ideas of the many-facet Rasch model and the linear partial credit model, a polytomous Rasch model is extended to the item location explanatory many-facet Rasch model and the step difficulty explanatory linear partial credit model. To demonstrate the practical differences between the two polytomous item explanatory approaches, two empirical studies examine how item properties explain and predict the overall item difficulties or the step difficulties each in the Carbon Cycle assessment data and in the Verbal Aggression data. The results suggest that the two polytomous item explanatory models are methodologically and practically different in terms of (a) the target difficulty parameters of polytomous items, which are explained by item properties; (b) the types of predictors for the item properties incorporated into the design matrix; and (c) the types of item property effects. The potentials and methodological advantages of item explanatory modeling are discussed as well.
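For readers less familiar with the linear logistic test model (LLTM) approach mentioned above, its core idea is to decompose an item difficulty parameter into a weighted sum of item-property effects; in generic notation (a sketch of the general approach, not the authors' exact polytomous specification),

$$\beta_i = \sum_{k=1}^{K} q_{ik}\,\eta_k,$$

where $q_{ik}$ is the value of item property $k$ for item $i$ and $\eta_k$ is the effect of that property. The two models compared in the article apply this kind of decomposition either to the overall item locations or to the step difficulties of polytomous items.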
2

Pacheco, Juliano Anderson, Dalton Francisco de Andrade, and Antonio Cezar Bornia. "Benchmarking by Item Response Theory (BIRTH)." Benchmarking: An International Journal 22, no. 5 (2015): 945–62. http://dx.doi.org/10.1108/bij-03-2013-0035.

Abstract:
Purpose – The purpose of this paper is to present a new method for benchmarking, which allows the construction of scales of competitiveness for the comparison of products using Item Response Theory (IRT). Design/methodology/approach – Theoretically, the method combines classic benchmarking process steps with IRT steps and demonstrates through mathematical models how this technique can measure the competitiveness of products by means of a latent trait. Findings – The IRT method uses the theories of psychometrics to measure the competitiveness of products through qualitative and quantitative interpretation of the tangible and intangible characteristics of those products. To demonstrate the application of the developed method, the items were constructed for teaching staff. Research limitations/implications – The application of the developed method will increase the accuracy of assessments of the competitiveness of a product because this method uses a mathematical model of the IRT to evaluate the product characteristics that reflect market competitiveness. Items must be selected based on theories relevant to the product and/or the opinion of experts or customers. Practical implications – The applicability of the method results in the construction of a scale in which items identify good practice with greater difficulty because they are represented in the same units that index competitiveness. Thus, managers of companies obtain knowledge about their products and the market, which allows them to assess their performance against their competitors and to make decisions regarding the continuous improvement of their production process and expansion of product characteristics. Originality/value – This work presents a new method for benchmarking using a quantitative technique that enables measurement of the latent trait of “competitiveness” through robust mathematical models.
3

Falani, Ilham, Makruf Akbar, and Dali Santun Naga. "Comparison of the Accuracy of Item Response Theory Models in Estimating Student’s Ability." Journal of Educational Science and Technology (EST) 6, no. 2 (2020): 178. http://dx.doi.org/10.26858/est.v6i2.13295.

Abstract:
This study aims to determine which item response theory model is more accurate in estimating students' mathematical abilities. The models compared are the Multiple Choice Model and the Three-Parameter Logistic Model. The data are the responses to a mathematics test of 1,704 eighth-grade junior high school students from six schools in Depok City, West Java, selected with a purposive random sampling technique. The test used for data collection consisted of 30 multiple-choice items. After the data were obtained, the research hypotheses were tested with the variance test method (F-test) to find out which model estimates the ability parameters more accurately. The results showed an F-value of 1.089 against a critical (table) value of 1.087; since the F-value exceeded the table value, H0 was rejected. This means the Multiple Choice Model is more accurate than the Three-Parameter Logistic Model in estimating the parameters of students' mathematical abilities, which makes the Multiple Choice Model the recommended model for estimating mathematical ability with multiple-choice item tests, especially in mathematics and other fields with similar characteristics.
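For context, the Three-Parameter Logistic Model referred to in this abstract has the standard form

$$P(X_{ij}=1 \mid \theta_j) = c_i + (1-c_i)\,\frac{1}{1+\exp\left[-a_i(\theta_j-b_i)\right]},$$

with discrimination $a_i$, difficulty $b_i$, and pseudo-guessing parameter $c_i$, whereas the Multiple Choice Model also models the probability of selecting each response option rather than only the correct/incorrect dichotomy.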
4

Liu, Yang, Ji Seung Yang, and Alberto Maydeu-Olivares. "Restricted Recalibration of Item Response Theory Models." Psychometrika 84, no. 2 (2019): 529–53. http://dx.doi.org/10.1007/s11336-019-09667-4.

5

Holland, Paul W. "On the sampling theory foundations of item response theory models." Psychometrika 55, no. 4 (1990): 577–601. http://dx.doi.org/10.1007/bf02294609.

6

Lipovetsky, Stan. "Handbook of Item Response Theory, Volume 1, Models." Technometrics 63, no. 3 (2021): 428–31. http://dx.doi.org/10.1080/00401706.2021.1945324.

7

Sheng, Yanyan, and Christopher K. Wikle. "Comparing Multiunidimensional and Unidimensional Item Response Theory Models." Educational and Psychological Measurement 67, no. 6 (2007): 899–919. http://dx.doi.org/10.1177/0013164406296977.

8

Shanmugam, Ramalingam. "Handbook of Item Response Theory: Volume one, Models." Journal of Statistical Computation and Simulation 90, no. 10 (2019): 1922. http://dx.doi.org/10.1080/00949655.2019.1628905.

9

da Silva, Marcelo A., Ren Liu, Anne C. Huggins-Manley, and Jorge L. Bazán. "Incorporating the Q-Matrix Into Multidimensional Item Response Theory Models." Educational and Psychological Measurement 79, no. 4 (2018): 665–87. http://dx.doi.org/10.1177/0013164418814898.

Abstract:
Multidimensional item response theory (MIRT) models use data from individual item responses to estimate multiple latent traits of interest, making them useful in educational and psychological measurement, among other areas. When MIRT models are applied in practice, it is not uncommon to see that some items are designed to measure all latent traits while other items may only measure one or two traits. In order to facilitate a clear expression of which items measure which traits and formulate such relationships as a math function in MIRT models, we applied the concept of the Q-matrix commonly used in diagnostic classification models to MIRT models. In this study, we introduced how to incorporate a Q-matrix into an existing MIRT model, and demonstrated benefits of the proposed hybrid model through two simulation studies and an applied study. In addition, we showed the relative ease in modeling educational and psychological data through a Bayesian approach via the NUTS algorithm.
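As a rough sketch of the idea (not necessarily the authors' exact parameterization), a compensatory MIRT model with a Q-matrix restricts each item's slopes to the traits that item is intended to measure:

$$P(X_{ij}=1 \mid \boldsymbol{\theta}_j) = \frac{1}{1+\exp\left[-\left(\sum_{k} q_{ik}\,a_{ik}\,\theta_{jk} + d_i\right)\right]},$$

where $q_{ik}\in\{0,1\}$ indicates whether item $i$ measures trait $k$, so the slopes on non-measured traits are fixed to zero.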
10

Zheng, Xiaohui, and Sophia Rabe-Hesketh. "Estimating Parameters of Dichotomous and Ordinal Item Response Models with Gllamm." Stata Journal: Promoting communications on statistics and Stata 7, no. 3 (2007): 313–33. http://dx.doi.org/10.1177/1536867x0700700302.

Abstract:
Item response theory models are measurement models for categorical responses. Traditionally, the models are used in educational testing, where responses to test items can be viewed as indirect measures of latent ability. The test items are scored either dichotomously (correct–incorrect) or by using an ordinal scale (a grade from poor to excellent). Item response models also apply equally for measurement of other latent traits. Here we describe the one- and two-parameter logit models for dichotomous items, the partial-credit and rating scale models for ordinal items, and an extension of these models where the latent variable is regressed on explanatory variables. We show how these models can be expressed as generalized linear latent and mixed models and fitted by using the user-written command gllamm.
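The mapping to generalized linear mixed models that the authors exploit can be sketched, for the one-parameter (Rasch) case, as a logistic regression with a person random intercept:

$$\operatorname{logit} P(y_{pi}=1 \mid \theta_p) = \theta_p - \beta_i, \qquad \theta_p \sim N(0,\sigma^2),$$

with item difficulties $\beta_i$ as fixed effects; the two-parameter model additionally gives each item its own slope (factor loading) on the latent variable, and the ordinal models replace the single correct/incorrect logit with category threshold structures.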
11

Berger, Martijn P. F., C. Y. Joy King, and Weng Kee Wong. "Minimax D-optimal designs for item response theory models." Psychometrika 65, no. 3 (2000): 377–90. http://dx.doi.org/10.1007/bf02296152.

12

Oliveira da Silva, Cassiano Augusto, Ana Paula Rodrigues Cavalcanti, Kaline da Silva Lima, Carlos André Macêdo Cavalcanti, Tânia Cristina de Oliveira Valente, and Arndt Büssing. "Item Response Theory Applied to the Spiritual Needs Questionnaire (SpNQ) in Portuguese." Religions 11, no. 3 (2020): 139. http://dx.doi.org/10.3390/rel11030139.

Abstract:
Item response theory (IRT), or latent trait theory, is based on a set of mathematical models that complement the qualitative analysis of the items in a given questionnaire. This study analyzes the items of the Spiritual Needs Questionnaire (SpNQ) in the Portuguese version, applied to HIV+ patients, using R Studio 3.4.1 and the mirt statistical package, to find out whether the items of the SpNQ possess appropriate psychometric qualities to discriminate between respondents as to the probability of marking one answer rather than another on the same item, showing whether or not the questionnaire is biased towards a pattern of response desired by the researcher. The parameters of discrimination, difficulty, and information, as well as the characteristic curves of the items, are evaluated. The reliable items for measuring the constructs of each of the five dimensions of the SpNQ in this HIV+ sample (Religious Needs; Inner Peace and Family Support Needs; Existential Needs; Social Recognition Needs; and Time Domain Needs) are presented, as well as the most likely response categories, depending on the latent trait level of the individuals. The questionnaire items showed satisfactory discrimination and variability of difficulty, confirming the good psychometric quality of the SpNQ.
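The discrimination and information properties evaluated here are connected through the usual IRT item information function; for a two-parameter logistic item, for example,

$$I_i(\theta) = a_i^{2}\,P_i(\theta)\left[1 - P_i(\theta)\right],$$

so more discriminating items (larger $a_i$) contribute more information near their difficulty location.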
13

DeMars, Christine E. "'Guessing' Parameter Estimates for Multidimensional Item Response Theory Models." Educational and Psychological Measurement 67, no. 3 (2007): 433–46. http://dx.doi.org/10.1177/0013164406294778.

14

Lee, HyeSun, and Weldon Z. Smith. "A Bayesian Random Block Item Response Theory Model for Forced-Choice Formats." Educational and Psychological Measurement 80, no. 3 (2019): 578–603. http://dx.doi.org/10.1177/0013164419871659.

Abstract:
Based on the framework of testlet models, the current study suggests the Bayesian random block item response theory (BRB IRT) model to fit forced-choice formats where an item block is composed of three or more items. To account for local dependence among items within a block, the BRB IRT model incorporated a random block effect into the response function and used a Markov Chain Monte Carlo procedure for simultaneous estimation of item and trait parameters. The simulation results demonstrated that the BRB IRT model performed well for the estimation of item and trait parameters and for screening those with relatively low scores on target traits. As found in the literature, the composition of item blocks was crucial for model performance; negatively keyed items were required for item blocks. The empirical application showed the performance of the BRB IRT model was equivalent to that of the Thurstonian IRT model. The potential advantage of the BRB IRT model as a base for more complex measurement models was also demonstrated by incorporating gender as a covariate into the BRB IRT model to explain response probabilities. Recommendations for the adoption of forced-choice formats were provided along with the discussion about using negatively keyed items.
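Although the authors' exact specification is not reproduced here, testlet-type models of the kind the BRB IRT model builds on typically augment the response function with a block-specific random effect, for example

$$\operatorname{logit} P(X_{pi}=1) = a_i\left(\theta_p - b_i + \gamma_{p\,d(i)}\right), \qquad \gamma_{p\,d(i)} \sim N\!\left(0, \sigma^2_{d(i)}\right),$$

where $d(i)$ denotes the block containing item $i$ and the random effect $\gamma$ absorbs the local dependence among items within the same block.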
15

Anderson, Carolyn J. "Multidimensional Item Response Theory Models with Collateral Information as Poisson Regression Models." Journal of Classification 30, no. 2 (2013): 276–303. http://dx.doi.org/10.1007/s00357-013-9131-x.

16

Jewsbury, Paul A., and Peter W. van Rijn. "IRT and MIRT Models for Item Parameter Estimation With Multidimensional Multistage Tests." Journal of Educational and Behavioral Statistics 45, no. 4 (2019): 383–402. http://dx.doi.org/10.3102/1076998619881790.

Abstract:
In large-scale educational assessment data consistent with a simple-structure multidimensional item response theory (MIRT) model, where every item measures only one latent variable, separate unidimensional item response theory (UIRT) models for each latent variable are often calibrated for practical reasons. While this approach can be valid for data from a linear test, unacceptable item parameter estimates are obtained when data arise from a multistage test (MST). We explore this situation from a missing data perspective and show mathematically that MST data will be problematic for calibrating multiple UIRT models but not MIRT models. This occurs because some items that were used in the routing decision are excluded from the separate UIRT models, due to measuring a different latent variable. Both simulated and real data from the National Assessment of Educational Progress are used to further confirm and explore the unacceptable item parameter estimates. The theoretical and empirical results confirm that only MIRT models are valid for item calibration of multidimensional MST data.
17

Luo, Yong, and Hong Jiao. "Using the Stan Program for Bayesian Item Response Theory." Educational and Psychological Measurement 78, no. 3 (2017): 384–408. http://dx.doi.org/10.1177/0013164417693666.

Abstract:
Stan is a new Bayesian statistical software program that implements the powerful and efficient Hamiltonian Monte Carlo (HMC) algorithm. To date there is not a source that systematically provides Stan code for various item response theory (IRT) models. This article provides Stan code for three representative IRT models, including the three-parameter logistic IRT model, the graded response model, and the nominal response model. We demonstrate how IRT model comparison can be conducted with Stan and how the provided Stan code for simple IRT models can be easily extended to their multidimensional and multilevel cases.
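The article's Stan code is not reproduced here, but as a language-agnostic illustration of the three response functions it covers, a minimal Python sketch (with purely illustrative parameter values) might look as follows:

    import numpy as np

    def p_3pl(theta, a, b, c):
        """Three-parameter logistic model: P(correct | theta)."""
        return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

    def p_grm(theta, a, thresholds):
        """Graded response model: probabilities of each ordered category."""
        cum = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(thresholds))))
        cum = np.concatenate(([1.0], cum, [0.0]))   # P(X >= k), bounded by 1 and 0
        return cum[:-1] - cum[1:]                   # category probabilities P(X = k)

    def p_nrm(theta, slopes, intercepts):
        """Nominal response model: category probabilities via a softmax."""
        z = np.asarray(slopes) * theta + np.asarray(intercepts)
        ez = np.exp(z - z.max())                    # numerically stabilized softmax
        return ez / ez.sum()

    print(p_3pl(0.5, a=1.2, b=0.0, c=0.2))
    print(p_grm(0.5, a=1.5, thresholds=[-1.0, 0.0, 1.0]))
    print(p_nrm(0.5, slopes=[0.0, 0.7, 1.3], intercepts=[0.0, 0.5, -0.2]))

Bayesian estimation in Stan then amounts to placing priors on these item and person parameters and letting HMC sample the joint posterior.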
18

te Marvelde, Janneke M., Cees A. W. Glas, Georges Van Landeghem, and Jan Van Damme. "Application of Multidimensional Item Response Theory Models to Longitudinal Data." Educational and Psychological Measurement 66, no. 1 (2006): 5–34. http://dx.doi.org/10.1177/0013164405282490.

19

Feuerstahler, Leah M., Niels Waller, and Angus MacDonald. "Improving Measurement Precision in Experimental Psychopathology Using Item Response Theory." Educational and Psychological Measurement 80, no. 4 (2019): 695–725. http://dx.doi.org/10.1177/0013164419892049.

Abstract:
Although item response models have grown in popularity in many areas of educational and psychological assessment, there are relatively few applications of these models in experimental psychopathology. In this article, we explore the use of item response models in the context of a computerized cognitive task designed to assess visual working memory capacity in people with psychosis as well as healthy adults. We begin our discussion by describing how item response theory can be used to evaluate and improve unidimensional cognitive assessment tasks in various examinee populations. We then suggest how computerized adaptive testing can be used to improve the efficiency of cognitive task administration. Finally, we explore how these ideas might be extended to multidimensional item response models that better represent the complex response processes underlying task performance in psychopathological populations.
20

Fujimoto, Ken A. "The Bayesian Multilevel Trifactor Item Response Theory Model." Educational and Psychological Measurement 79, no. 3 (2018): 462–94. http://dx.doi.org/10.1177/0013164418806694.

Abstract:
Advancements in item response theory (IRT) have led to models for dual dependence, which control for cluster and method effects during a psychometric analysis. Currently, however, this class of models does not include one that controls for when the method effects stem from two method sources in which one source functions differently across the aspects of another source (i.e., a nested method–source interaction). For this study, then, a Bayesian IRT model is proposed, one that accounts for such interaction among method sources while controlling for the clustering of individuals within the sample. The proposed model accomplishes these tasks by specifying a multilevel trifactor structure for the latent trait space. Details of simulations are also reported. These simulations demonstrate that this model can identify when item response data represent a multilevel trifactor structure, and it does so in data from samples as small as 250 cases nested within 50 clusters. Additionally, the simulations show that misleading estimates for the item discriminations could arise when the trifactor structure reflected in the data is not correctly accounted for. The utility of the model is also illustrated through the analysis of empirical data.
21

Haberman, Shelby J., Paul W. Holland, and Sandip Sinharay. "Limits on Log Odds Ratios for Unidimensional Item Response Theory Models." Psychometrika 72, no. 4 (2007): 551–61. http://dx.doi.org/10.1007/s11336-007-9009-0.

22

Sueiro, Manuel J., and Francisco J. Abad. "Assessing Goodness of Fit in Item Response Theory With Nonparametric Models." Educational and Psychological Measurement 71, no. 5 (2011): 834–48. http://dx.doi.org/10.1177/0013164410393238.

23

Wang, Wen-Chung, Hui-Fang Chen, and Kuan-Yu Jin. "Item Response Theory Models for Wording Effects in Mixed-Format Scales." Educational and Psychological Measurement 75, no. 1 (2014): 157–78. http://dx.doi.org/10.1177/0013164414528209.

24

Ippel, Lianne, and David Magis. "Efficient Standard Errors in Item Response Theory Models for Short Tests." Educational and Psychological Measurement 80, no. 3 (2019): 461–75. http://dx.doi.org/10.1177/0013164419882072.

Abstract:
In the dichotomous item response theory (IRT) framework, the asymptotic standard error (ASE) is the most common statistic for evaluating the precision of various ability estimators. Easy-to-use ASE formulas are readily available; however, the accuracy of some of these formulas was recently questioned, and new ASE formulas were derived from a general asymptotic theory framework. Furthermore, exact standard errors were suggested to better evaluate the precision of ability estimators, especially with short tests for which the asymptotic framework is invalid. Unfortunately, the accuracy of exact standard errors has so far been assessed only in a very limited setting. The purpose of this article is to perform a global comparison of exact versus (classical and new formulations of) asymptotic standard errors for a wide range of usual IRT ability estimators and IRT models, and with short tests. Results indicate that exact standard errors globally outperform the ASE versions in terms of reduced bias and root mean square error, while the new ASE formulas are also globally less biased than their classical counterparts. Further discussion of the usefulness and practical computation of exact standard errors is outlined.
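For orientation, the classical asymptotic standard error that the article revisits is, for the maximum likelihood ability estimator,

$$\widehat{SE}(\hat{\theta}) = \frac{1}{\sqrt{I(\hat{\theta})}}, \qquad I(\theta) = \sum_{i} a_i^{2}\, P_i(\theta)\left[1 - P_i(\theta)\right] \quad \text{(two-parameter logistic case)},$$

which the article contrasts with exact, finite-test standard errors.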
25

Tijmstra, Jesper, and Maria Bolsinova. "Bayes Factors for Evaluating Latent Monotonicity in Polytomous Item Response Theory Models." Psychometrika 84, no. 3 (2019): 846–69. http://dx.doi.org/10.1007/s11336-019-09661-w.

26

Kohli, Nidhi, Jennifer Koran, and Lisa Henn. "Relationships Among Classical Test Theory and Item Response Theory Frameworks via Factor Analytic Models." Educational and Psychological Measurement 75, no. 3 (2014): 389–405. http://dx.doi.org/10.1177/0013164414559071.

27

Kuravsky, L. S., A. A. Margolis, P. A. Marmalyuk, A. S. Panfilova, and G. A. Yuryev. "Mathematical Aspects of the Concept of Adaptive Training Device." Психологическая наука и образование 21, no. 2 (2016): 84–95. http://dx.doi.org/10.17759/pse.2016210210.

Abstract:
The paper presents a concept of an adaptive training system intended for electronic learning that supports the choice of tasks according to parametric models. This approach is an alternative to adaptive technologies based on item response theory (IRT). The diagnostic methods underlying the choice of tasks to present have two distinctive features: first, they take into account the dynamics of an individual's level of performance and of the time he or she needs to complete the test; second, they require fewer tasks to be presented in the training session.
28

Sun, Jianan, Yunxiao Chen, Jingchen Liu, Zhiliang Ying, and Tao Xin. "Latent Variable Selection for Multidimensional Item Response Theory Models via L1 Regularization." Psychometrika 81, no. 4 (2016): 921–39. http://dx.doi.org/10.1007/s11336-016-9529-6.

29

Houseman, E. Andrés, Carmen Marsit, Margaret Karagas, and Louise M. Ryan. "Penalized Item Response Theory Models: Application to Epigenetic Alterations in Bladder Cancer." Biometrics 63, no. 4 (2007): 1269–77. http://dx.doi.org/10.1111/j.1541-0420.2007.00806.x.

30

Izsák, Andrew, Erik Jacobson, Zandra de Araujo, and Chandra Hawley Orrill. "Measuring Mathematical Knowledge for Teaching Fractions With Drawn Quantities." Journal for Research in Mathematics Education 43, no. 4 (2012): 391–427. http://dx.doi.org/10.5951/jresematheduc.43.4.0391.

Abstract:
Researchers have recently used traditional item response theory (IRT) models to measure mathematical knowledge for teaching (MKT). Some studies (e.g., Hill, 2007; Izsák, Orrill, Cohen, & Brown, 2010), however, have reported subgroups when measuring middle-grades teachers' MKT, and such groups violate a key assumption of IRT models. This study investigated the utility of an alternative called the mixture Rasch model that allows for subgroups. The model was applied to middle-grades teachers' performance on pretests and posttests bracketing a 42-hour professional development course focused on drawn models for fraction arithmetic. Results from psychometric modeling and evidence from video-recorded interviews and professional development sessions suggested that there were 2 subgroups of middle-grades teachers, 1 better able to reason with 3-level unit structures and 1 constrained to 2-level unit structures. Some teachers, however, were easier to classify than others.
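For orientation, the mixture Rasch model referred to here assumes that, within latent class $g$, responses follow a Rasch model with class-specific item difficulties,

$$P(X_{pi}=1 \mid \theta_p, g) = \frac{\exp(\theta_p - \beta_{ig})}{1+\exp(\theta_p - \beta_{ig})},$$

and that each person belongs to class $g$ with probability $\pi_g$; this is what allows qualitatively different subgroups of teachers (here, those reasoning with 3-level versus 2-level unit structures) to emerge from the item responses.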
31

Fujimoto, Ken A., and Sabina R. Neugebauer. "A General Bayesian Multidimensional Item Response Theory Model for Small and Large Samples." Educational and Psychological Measurement 80, no. 4 (2020): 665–94. http://dx.doi.org/10.1177/0013164419891205.

Abstract:
Although item response theory (IRT) models such as the bifactor, two-tier, and between-item-dimensionality IRT models have been devised to confirm complex dimensional structures in educational and psychological data, they can be challenging to use in practice. The reason is that these models are multidimensional IRT (MIRT) models and thus are highly parameterized, making them only suitable for data provided by large samples. Unfortunately, many educational and psychological studies are conducted on a small scale, leaving the researchers without the necessary MIRT models to confirm the hypothesized structures in their data. To address the lack of modeling options for these researchers, we present a general Bayesian MIRT model based on adaptive informative priors. Simulations demonstrated that our MIRT model could be used to confirm a two-tier structure (with two general and six specific dimensions), a bifactor structure (with one general and six specific dimensions), and a between-item six-dimensional structure in rating scale data representing sample sizes as small as 100. Although our goal was to provide a general MIRT model suitable for smaller samples, the simulations further revealed that our model was applicable to larger samples. We also analyzed real data from 121 individuals to illustrate that the findings of our simulations are relevant to real situations.
32

Krylovas, Aleksandras, and Natalja Kosareva. "MATHEMATICAL MODELLING OF FORECASTING THE RESULTS OF KNOWLEDGE TESTING / ŽINIŲ TIKRINIMO REZULTATŲ PROGNOZĖS MATEMATINIS MODELIAVIMAS." Technological and Economic Development of Economy 14, no. 3 (2008): 388–401. http://dx.doi.org/10.3846/1392-8619.2008.14.388-401.

Abstract:
In this paper, a mathematical model for obtaining the probability distribution of knowledge-testing results is proposed. Differences and similarities between this model and the Item Response Theory (IRT) logistic model are discussed. Probability distributions of the results of a 10-item test are obtained for low-, middle- and high-ability populations by selecting characteristic functions for various combinations of item difficulties. Entropy function values for these item combinations are computed. These results make it possible to formulate recommendations for selecting test items for different testing groups according to their attainment level. A method for selecting a suitable item characteristic function, based on the Kolmogorov compatibility test, is proposed and illustrated by applying it to a discrete mathematics test item.
33

Andersson, Björn, and Tao Xin. "Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients." Educational and Psychological Measurement 78, no. 1 (2017): 32–45. http://dx.doi.org/10.1177/0013164417713570.

Abstract:
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability is typically not reported. In this study, the asymptotic variances of the IRT marginal and test reliability coefficient estimators are derived for dichotomous and polytomous IRT models assuming an underlying asymptotically normally distributed item parameter estimator. The results are used to construct confidence intervals for the reliability coefficients. Simulations are presented which show that the confidence intervals for the test reliability coefficient have good coverage properties in finite samples under a variety of settings with the generalized partial credit model and the three-parameter logistic model. Meanwhile, it is shown that the estimator of the marginal reliability coefficient has finite sample bias resulting in confidence intervals that do not attain the nominal level for small sample sizes but that the bias tends to zero as the sample size increases.
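The marginal reliability coefficient at issue is commonly defined (one common convention, given here only for orientation) as

$$\bar{\rho} = \frac{\sigma^{2}_{\theta} - E\!\left[SE^{2}(\hat{\theta})\right]}{\sigma^{2}_{\theta}},$$

that is, the proportion of latent-trait variance not attributable to average estimation error; the article derives asymptotic variances for estimators of such coefficients and builds confidence intervals from them.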
34

Kim, Stella Y., Won-Chan Lee, and Michael J. Kolen. "Simple-Structure Multidimensional Item Response Theory Equating for Multidimensional Tests." Educational and Psychological Measurement 80, no. 1 (2019): 91–125. http://dx.doi.org/10.1177/0013164419854208.

Abstract:
A theoretical and conceptual framework for true-score equating using a simple-structure multidimensional item response theory (SS-MIRT) model is developed. A true-score equating method, referred to as the SS-MIRT true-score equating (SMT) procedure, also is developed. SS-MIRT has several advantages over other complex multidimensional item response theory models including improved efficiency in estimation and straightforward interpretability. The performance of the SMT procedure was examined and evaluated through four studies using different data types. In these studies, results from the SMT procedure were compared with results from four other equating methods to assess the relative benefits of SMT compared with the other procedures. In general, SMT showed more accurate equating results compared with the traditional unidimensional IRT (UIRT) equating when the data were multidimensional. More accurate performance of SMT over UIRT true-score equating was consistently observed across the studies, which supports the benefits of a multidimensional approach in equating for multidimensional data. Also, SMT performed similarly to a SS-MIRT observed score method across all studies.
35

Wang, Chun, Gongjun Xu, and Xue Zhang. "Correction for Item Response Theory Latent Trait Measurement Error in Linear Mixed Effects Models." Psychometrika 84, no. 3 (2019): 673–700. http://dx.doi.org/10.1007/s11336-019-09672-7.

36

Bartolucci, Francesco, Silvia Bacci, and Michela Gnaldi. "MultiLCIRT: An R package for multidimensional latent class item response models." Computational Statistics & Data Analysis 71 (March 2014): 971–85. http://dx.doi.org/10.1016/j.csda.2013.05.018.

37

Tay, Louis, and Fritz Drasgow. "Adjusting the Adjusted χ2/df Ratio Statistic for Dichotomous Item Response Theory Analyses." Educational and Psychological Measurement 72, no. 3 (2011): 510–28. http://dx.doi.org/10.1177/0013164411416976.

Abstract:
Two Monte Carlo simulation studies investigated the effectiveness of the mean adjusted χ2/df statistic proposed by Drasgow and colleagues and, because of problems with the method, a new approach for assessing the goodness of fit of an item response theory model was developed. It has been previously recommended that mean adjusted χ2/df values greater than 3 using a cross-validation data set indicate substantial misfit. The authors used simulations to examine this critical value across different test lengths (15, 30, 45) and sample sizes (500, 1,000, 1,500, 5,000). The one-, two- and three-parameter logistic models were fitted to data simulated from different logistic models, including unidimensional and multidimensional models. In general, a fixed cutoff value was insufficient to ascertain item response theory model–data fit. Consequently, the authors proposed the use of the parametric bootstrap to investigate misfit and evaluated its performance. This new approach produced appropriate Type I error rates and had substantial power to detect misfit across simulated conditions. In a third study, the authors applied the parametric bootstrap approach to LSAT data to determine which dichotomous item response theory model produced the best fit. Future applications of the mean adjusted χ2/df statistic are discussed.
38

Glas, Cees A. W., and Jonald L. Pimentel. "Modeling Nonignorable Missing Data in Speeded Tests." Educational and Psychological Measurement 68, no. 6 (2008): 907–22. http://dx.doi.org/10.1177/0013164408315262.

Abstract:
In tests with time limits, items at the end are often not reached. Usually, the pattern of missing responses depends on the ability level of the respondents; therefore, missing data are not ignorable in statistical inference. This study models data using a combination of two item response theory (IRT) models: one for the observed response data and one for the missing data indicator. The missing data indicator is modeled using a sequential model with linear restrictions on the item parameters. The models are connected by the assumption that the respondents' latent proficiency parameters have a joint multivariate normal distribution. Model parameters are estimated by maximum marginal likelihood. Simulations show that treating missing data as ignorable can lead to considerable bias in parameter estimates. Including an IRT model for the missing data indicator removes this bias. The method is illustrated with data from an intelligence test with a time limit.
39

Jefmański, Bartłomiej, and Adam Sagan. "Item Response Theory Models for the Fuzzy TOPSIS in the Analysis of Survey Data." Symmetry 13, no. 2 (2021): 223. http://dx.doi.org/10.3390/sym13020223.

Abstract:
The fuzzy TOPSIS (The Technique for Order of Preference by Similarity to Ideal Solution) is an attractive tool for measuring complex phenomena based on uncertain data. The original version of the method assumes that the object assessments in terms of the adopted criteria are expressed as triangular fuzzy numbers. One of the crucial stages of the fuzzy TOPSIS is selecting the fuzzy conversion scale, which is used to evaluate objects in terms of the adopted criteria. The choice of a fuzzy conversion scale may influence the results of the fuzzy TOPSIS. There is no uniform approach in constructing and selecting the fuzzy conversion scale for the fuzzy TOPSIS. The choice is subjective and made by researchers. Therefore, the aim of the article is to present a new, objective approach to the construction of fuzzy conversion scales based on Item Response Theory (IRT) models. The following models were used in the construction of fuzzy conversion scales: Polychoric Correlation Model (PM), Polytomous Rasch Model (PRM), Rating Scale Model (RSM), Partial Credit Model (PCM), Generalized Partial Credit Model (GPCM), Graded Response Model (GRM), Nominal Response Model (NRM). The usefulness of the proposed approach is presented on the example of the analysis of a survey’s results on measuring the quality of professional life of inhabitants of selected communes in Poland. The obtained results indicate that the choice of the fuzzy conversion scale has a large impact on the closeness coefficient values. A large difference was also observed in the spreads of triangular fuzzy numbers between scales based on IRT models and those used in the literature on the subject. The use of the fuzzy TOPSIS with fuzzy conversion scales built based on PRM, RSM, PCM, GPCM, and GRM models gives results with a greater range of variability than in the case of fuzzy conversion scales used in empirical research.
40

Santos, Naiara Caroline Aparecido dos, and Jorge Luiz Bazán. "Residual Analysis in Rasch Poisson Counts Models." Revista Brasileira de Biometria 39, no. 1 (2021): 206–20. http://dx.doi.org/10.28951/rbb.v39i1.531.

Abstract:
A Rasch Poisson counts (RPC) model is described to identify individual latent traits and facilities of the items of tests that model the error (or success) count in several tasks over time, instead of modeling the correct responses to items in a test as in the dichotomous item response theory (IRT) model. These types of tests can be more informative than traditional tests. To estimate the model parameters, we consider a Bayesian approach using the integrated nested Laplace approximation (INLA). We develop residual analysis to assess model fit by introducing randomized quantile residuals for items. The data used to illustrate the method come from 228 people who took a selective attention test. The test has 20 blocks (items), with a time limit of 15 seconds for each block. The results of the residual analysis of the RPC were promising and indicated that the studied attention data are not well fitted by the RPC model.
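For reference, the Rasch Poisson counts model referred to in this abstract can be written, in one common parameterization, as

$$y_{pi} \sim \mathrm{Poisson}(\lambda_{pi}), \qquad \log \lambda_{pi} = \theta_p + \beta_i,$$

where $\theta_p$ is the person's latent trait and $\beta_i$ the facility (easiness) of item $i$, so counts of errors or successes over a timed task replace the dichotomous correct/incorrect responses of standard IRT.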
41

Dimitrov, Dimiter M., and Yong Luo. "A Note on the D-Scoring Method Adapted for Polytomous Test Items." Educational and Psychological Measurement 79, no. 3 (2018): 545–57. http://dx.doi.org/10.1177/0013164418786014.

Abstract:
An approach to scoring tests with binary items, referred to as D-scoring method, was previously developed as a classical analog to basic models in item response theory (IRT) for binary items. As some tests include polytomous items, this study offers an approach to D-scoring of such items and parallels the results with those obtained under the graded response model (GRM) for ordered polytomous items in the framework of IRT. The proposed design of using D-scoring with “virtual” binary items generated from polytomous items provides (a) ability scores that are consistent with their GRM counterparts and (b) item category response functions analogous to those obtained under the GRM. This approach provides a unified framework for D-scoring and psychometric analysis of tests with binary and/or polytomous items that can be efficient in different scenarios of educational and psychological assessment.
42

Cai, Li, and Carrie R. Houts. "Longitudinal Analysis of Patient-Reported Outcomes in Clinical Trials: Applications of Multilevel and Multidimensional Item Response Theory." Psychometrika 86, no. 3 (2021): 754–77. http://dx.doi.org/10.1007/s11336-021-09777-y.

Abstract:
With decades of advance research and recent developments in the drug and medical device regulatory approval process, patient-reported outcomes (PROs) are becoming increasingly important in clinical trials. While clinical trial analyses typically treat scores from PROs as observed variables, the potential to use latent variable models when analyzing patient responses in clinical trial data presents novel opportunities for both psychometrics and regulatory science. An accessible overview of analyses commonly used to analyze longitudinal trial data and statistical models familiar in both psychometrics and biometrics, such as growth models, multilevel models, and latent variable models, is provided to call attention to connections and common themes among these models that have found use across many research areas. Additionally, examples using empirical data from a randomized clinical trial provide concrete demonstrations of the implementation of these models. The increasing availability of high-quality, psychometrically rigorous assessment instruments in clinical trials, of which the Patient-Reported Outcomes Measurement Information System (PROMIS®) is a prominent example, provides rare possibilities for psychometrics to help improve the statistical tools used in regulatory science.
43

Liu, Ren, Anne Corinne Huggins-Manley, and Okan Bulut. "Retrofitting Diagnostic Classification Models to Responses From IRT-Based Assessment Forms." Educational and Psychological Measurement 78, no. 3 (2017): 357–83. http://dx.doi.org/10.1177/0013164416685599.

Abstract:
Developing a diagnostic tool within the diagnostic measurement framework is the optimal approach to obtain multidimensional and classification-based feedback on examinees. However, end users may seek to obtain diagnostic feedback from existing item responses to assessments that have been designed under either the classical test theory or item response theory frameworks. Retrofitting diagnostic classification models to existing assessments designed under other psychometric frameworks could be a plausible approach to obtain more actionable scores or understand more about the constructs themselves. This study (a) discusses the possibility and problems of retrofitting, (b) proposes a step-by-step retrofitting framework, and (c) explores the information one can gain from retrofitting through an empirical application example. While retrofitting may not always be an ideal approach to diagnostic measurement, this article aims to invite discussions through presenting the possibility, challenges, process, and product of retrofitting.
44

Noventa, Stefano, Luca Stefanutti, and Giulio Vidotto. "An Analysis of Item Response Theory and Rasch Models Based on the Most Probable Distribution Method." Psychometrika 79, no. 3 (2013): 377–402. http://dx.doi.org/10.1007/s11336-013-9348-y.

45

Lee, Soo, Okan Bulut, and Youngsuk Suh. "Multidimensional Extension of Multiple Indicators Multiple Causes Models to Detect DIF." Educational and Psychological Measurement 77, no. 4 (2016): 545–69. http://dx.doi.org/10.1177/0013164416651116.

Abstract:
A number of studies have found multiple indicators multiple causes (MIMIC) models to be an effective tool in detecting uniform differential item functioning (DIF) for individual items and item bundles. A recently developed MIMIC-interaction model is capable of detecting both uniform and nonuniform DIF in the unidimensional item response theory (IRT) framework. The goal of the current study is to extend the MIMIC-interaction model for detecting DIF in the context of multidimensional IRT modelling and examine the performance of the multidimensional MIMIC-interaction model under various simulation conditions with respect to Type I error and power rates. Simulation conditions include DIF pattern and magnitude, test length, correlation between latent traits, sample size, and latent mean differences between focal and reference groups. The results of this study indicate that power rates of the multidimensional MIMIC-interaction model under uniform DIF conditions were higher than those of nonuniform DIF conditions. When anchor item length and sample size increased, power for detecting DIF increased. Also, the equal latent mean condition tended to produce higher power rates than the different mean condition. Although the multidimensional MIMIC-interaction model was found to be a reasonably useful tool for identifying uniform DIF, the performance of the model in detecting nonuniform DIF appeared to be questionable.
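In outline (a generic sketch rather than the authors' exact multidimensional specification), a MIMIC DIF model lets a grouping covariate $z$ affect both the latent trait and, for a studied item, the response directly:

$$\theta_j = \gamma z_j + \zeta_j, \qquad \operatorname{logit} P(X_{ij}=1) = a_i \theta_j - b_i + \beta_i z_j,$$

so a nonzero direct effect $\beta_i$ signals uniform DIF, and the interaction extension adds a $\theta_j z_j$ term to the item equation to capture nonuniform DIF.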
46

Schwabe, Inga, Dorret I. Boomsma, and Stéphanie M. van den Berg. "Mathematical Ability and Socio-Economic Background: IRT Modeling to Estimate Genotype by Environment Interaction." Twin Research and Human Genetics 20, no. 6 (2017): 511–20. http://dx.doi.org/10.1017/thg.2017.59.

Abstract:
Genotype by environment interaction in behavioral traits may be assessed by estimating the proportion of variance that is explained by genetic and environmental influences conditional on a measured moderating variable, such as a known environmental exposure. Behavioral traits of interest are often measured by questionnaires and analyzed as sum scores on the items. However, statistical results on genotype by environment interaction based on sum scores can be biased due to the properties of a scale. This article presents a method that makes it possible to analyze the actually observed (phenotypic) item data rather than a sum score by simultaneously estimating the genetic model and an item response theory (IRT) model. In the proposed model, the estimation of genotype by environment interaction is based on an alternative parametrization that is uniquely identified and therefore to be preferred over standard parametrizations. A simulation study shows good performance of our method compared to analyzing sum scores in terms of bias. Next, we analyzed data of 2,110 12-year-old Dutch twin pairs on mathematical ability. Genetic models were evaluated and genetic and environmental variance components estimated as a function of a family's socio-economic status (SES). Results suggested that common environmental influences are less important in creating individual differences in mathematical ability in families with a high SES than in creating individual differences in mathematical ability in twin pairs with a low or average SES.
47

Debelak, Rudolf, and Carolin Strobl. "Investigating Measurement Invariance by Means of Parameter Instability Tests for 2PL and 3PL Models." Educational and Psychological Measurement 79, no. 2 (2018): 385–98. http://dx.doi.org/10.1177/0013164418777784.

Abstract:
M-fluctuation tests are a recently proposed method for detecting differential item functioning in Rasch models. This article discusses a generalization of this method to two additional item response theory models: the two-parametric logistic model and the three-parametric logistic model with a common guessing parameter. The Type I error rate and the power of this method were evaluated by a variety of simulation studies. The results suggest that the new method allows the detection of various forms of differential item functioning in these models, which also includes differential discrimination and differential guessing effects. It is also robust against moderate violations of several assumptions made in the item parameter estimation.
48

Liu, Yuan, and Kit-Tai Hau. "Measuring Motivation to Take Low-Stakes Large-Scale Test: New Model Based on Analyses of “Participant-Own-Defined” Missingness." Educational and Psychological Measurement 80, no. 6 (2020): 1115–44. http://dx.doi.org/10.1177/0013164420911972.

Abstract:
In large-scale low-stakes assessments such as the Programme for International Student Assessment (PISA), students may skip items (missingness) that are within their ability to complete. Detecting and accounting for these noneffortful responses, as a measure of test-taking motivation, is an important issue for modern psychometric models. Traditional approaches based on questionnaires and item response theory may have different limitations. In the present research, we proposed a new way of directly using “participant-own-defined” missing item information (user missingness) in a zero-inflated Poisson model. An empirical study using the PISA 2015 data (eight representative economies in two cultures) and another simulation study were conducted to validate our new approach. Results indicated that our model could successfully capture test-taking motivation. We also found that the Confucian students had lower user missingness irrespective of item positions as compared with their Western counterparts.
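The zero-inflated Poisson model that the authors apply to counts of skipped items has the standard form

$$P(Y_p = 0) = \pi_p + (1-\pi_p)\,e^{-\lambda_p}, \qquad P(Y_p = k) = (1-\pi_p)\,\frac{\lambda_p^{k}\, e^{-\lambda_p}}{k!}, \quad k = 1, 2, \ldots,$$

where the inflation component $\pi_p$ separates structural zeros from zeros that arise by chance under the Poisson count process; in this application a zero count corresponds to a respondent with no user missingness.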
49

Falk, Carl F., and Scott Monroe. "On Lagrange Multiplier Tests in Multidimensional Item Response Theory: Information Matrices and Model Misspecification." Educational and Psychological Measurement 78, no. 4 (2017): 653–78. http://dx.doi.org/10.1177/0013164417714506.

Abstract:
Lagrange multiplier (LM) or score tests have seen renewed interest for the purpose of diagnosing misspecification in item response theory (IRT) models. LM tests can also be used to test whether parameters differ from a fixed value. We argue that the utility of LM tests depends on both the method used to compute the test and the degree of misspecification in the initially fitted model. We demonstrate both of these points in the context of a multidimensional IRT framework. Through an extensive Monte Carlo simulation study, we examine the performance of LM tests under varying degrees of model misspecification, model size, and different information matrix approximations. A generalized LM test designed specifically for use under misspecification, which has apparently not been previously studied in an IRT framework, performed the best in our simulations. Finally, we reemphasize caution in using LM tests for model specification searches.
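In generic form, the Lagrange multiplier (score) statistic under study evaluates the score of the extended model at the restricted estimates,

$$LM = s(\hat{\vartheta}_0)^{\top}\, \mathcal{I}(\hat{\vartheta}_0)^{-1}\, s(\hat{\vartheta}_0),$$

where $s(\cdot)$ is the score vector for the constrained parameters and $\mathcal{I}(\cdot)$ is an estimate of the information matrix; the article's central point is that different information-matrix approximations, and a generalized version designed for misspecified models, can change how the test behaves.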
50

Wang, Wen-Chung, and Xue-Lan Qiu. "Multilevel Modeling of Cognitive Diagnostic Assessment: The Multilevel DINA Example." Applied Psychological Measurement 43, no. 1 (2018): 34–50. http://dx.doi.org/10.1177/0146621618765713.

Abstract:
Many multilevel linear and item response theory models have been developed to account for multilevel data structures. However, most existing cognitive diagnostic models (CDMs) are unilevel in nature and become inapplicable when data have a multilevel structure. In this study, using the log-linear CDM as the item-level model, multilevel CDMs were developed based on the latent continuous variable approach and the multivariate Bernoulli distribution approach. In a series of simulations, the newly developed multilevel deterministic input, noisy, and gate (DINA) model was used as an example to evaluate the parameter recovery and consequences of ignoring the multilevel structures. The results indicated that all parameters in the new multilevel DINA were recovered fairly well by using the freeware Just Another Gibbs Sampler (JAGS) and that ignoring multilevel structures by fitting the standard unilevel DINA model resulted in poor estimates for the student-level covariates and underestimated standard errors, as well as led to poor recovery for the latent attribute profiles for individuals. An empirical example using the 2003 Trends in International Mathematics and Science Study eighth-grade mathematical test was provided.
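For readers less familiar with it, the DINA model used here as the item-level example defines, for respondent $j$ with attribute vector $\boldsymbol{\alpha}_j$ and item $i$ with Q-matrix row $\mathbf{q}_i$,

$$\eta_{ij} = \prod_{k} \alpha_{jk}^{\,q_{ik}}, \qquad P(X_{ij}=1 \mid \boldsymbol{\alpha}_j) = (1-s_i)^{\eta_{ij}}\, g_i^{\,1-\eta_{ij}},$$

with slip parameter $s_i$ and guessing parameter $g_i$; the multilevel versions developed in the article add higher-level (e.g., classroom or school) structure on top of this item-level model.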