
Dissertations / Theses on the topic 'Differential Item Functioning'


Consult the top 50 dissertations / theses for your research on the topic 'Differential Item Functioning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Lees, Jared Andrew. "Differential Item Functioning Analysis of the Herrmann Brain Dominance Instrument." Diss., Brigham Young University, 2007. http://contentdm.lib.byu.edu/ETD/image/etd2103.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Chen, Dong Qi Kayla. "Gender-related differential item functioning analysis on the GEPT-kids." Thesis, University of Macau, 2018. http://umaclib3.umac.mo/record=b3953512.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Yildirim, Huseyin Husnu. "The Differential Item Functioning (dif) Analysis Of Mathematics Items In The International Assessment Programs." PhD thesis, METU, 2006. http://etd.lib.metu.edu.tr/upload/12607135/index.pdf.

Full text
Abstract:
Cross-cultural studies, like TIMSS and PISA 2003, have been conducted since the 1960s with the idea that these assessments can provide a broad perspective for evaluating and improving education. In addition, countries can assess their relative positions in mathematics achievement among their competitors in the global world. However, because of the different cultural and language settings of different countries, these international tests may not function as expected across all the countries. Thus, tests may not be equivalent, or fair, linguistically and culturally across the participating countries. In this context, the present study aimed at assessing the equivalence of mathematics items of TIMSS 1999 and PISA 2003 across cultures and languages, to find out if mathematics achievement possesses any culture-specific aspects. For this purpose, the present study assessed Turkish and English versions of TIMSS 1999 and PISA 2003 mathematics items with respect to (a) psychometric characteristics of items, and (b) possible sources of Differential Item Functioning (DIF) between these two versions. The study used Restricted Factor Analysis, Mantel-Haenszel statistics, and Item Response Theory Likelihood Ratio methodologies to determine DIF items. The results revealed that there were adaptation problems in both the TIMSS and PISA studies. However, it was still possible to determine a subtest of items functioning fairly between cultures, to form a basis for a cross-cultural comparison. In PISA, there was a high rate of agreement among the DIF methodologies used. However, in TIMSS, the agreement rate decreased considerably, possibly because the rate of differentially functioning items within TIMSS was higher, and differential guessing and differential discrimination were also issues in the test. The study also revealed that items requiring competencies of reproduction of practiced knowledge, knowledge of facts, performance of routine procedures, and application of technical skills were less likely to be biased against Turkish students with respect to American students at the same ability level. On the other hand, items requiring students to communicate mathematically, items where various results must be compared, and items that had a real-world context were less likely to be in favor of Turkish students.
APA, Harvard, Vancouver, ISO, and other styles
4

Zhang, Mo. "Gender related differential item functioning in mathematics tests: a meta-analysis." Pullman, Wash.: Washington State University, 2009. http://www.dissertations.wsu.edu/Thesis/Summer2009/m_zhang_072109.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Stephens-Bonty, Torie Amelia. "Using Three Different Categorical Data Analysis Techniques to Detect Differential Item Functioning." Digital Archive @ GSU, 2008. http://digitalarchive.gsu.edu/eps_diss/24.

Full text
Abstract:
Diversity in the population, along with the diversity of testing usage, has resulted in smaller identified groups of test takers. In addition, computer adaptive testing sometimes results in a relatively small number of items being used for a particular assessment. Statistical techniques that can effectively detect differential item functioning (DIF) when the population is small and/or the assessment is short are therefore needed. Identification of empirically biased items is a crucial step in creating equitable and construct-valid assessments. Parshall and Miller (1995) compared the conventional asymptotic Mantel-Haenszel (MH) with the exact test (ET) for the detection of DIF with small sample sizes. Several studies have since compared the performance of MH to logistic regression (LR) under a variety of conditions. Both Swaminathan and Rogers (1990) and Hidalgo and López-Pina (2004) demonstrated that MH and LR were comparable in their detection of items with DIF. This study followed up by comparing the performance of MH, the ET, and LR when both the sample size is small and the test length is short. The purpose of this Monte Carlo simulation study was to expand on the research done by Parshall and Miller (1995) by examining power, and power with effect size measures, for each of the three DIF detection procedures. The following variables were manipulated in this study: focal group sample size, percent of items with DIF, and magnitude of DIF. For each condition, a small reference group size of 200 was utilized as well as a short, 10-item test. The results demonstrated that, in general, LR was slightly more powerful in detecting items with DIF. In most conditions, however, power was well below the acceptable rate of 80%. As the size of the focal group and the magnitude of DIF increased, the three procedures were more likely to reach acceptable power. Also, all three procedures demonstrated the highest power for the most discriminating item. Collectively, the results from this research provide information in the area of small sample size and DIF detection.
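For readers unfamiliar with the Mantel-Haenszel procedure compared in this study, the idea can be sketched in a few lines. The helper below is an illustrative implementation, not code from the dissertation: the function name and data layout are invented, and it assumes dichotomous item scores, a 0/1 group indicator (0 = reference, 1 = focal), and the total test score as the matching variable.

import numpy as np

def mantel_haenszel_dif(item, group, total):
    """Mantel-Haenszel DIF statistics for one dichotomous item.

    item  -- 0/1 responses to the studied item
    group -- 0 = reference group, 1 = focal group
    total -- matching variable (e.g., total test score)
    """
    item, group, total = (np.asarray(v) for v in (item, group, total))
    num = den = chi_num = chi_var = 0.0
    for s in np.unique(total):                          # one 2x2 table per score level
        m = total == s
        a = np.sum((group[m] == 0) & (item[m] == 1))    # reference, correct
        b = np.sum((group[m] == 0) & (item[m] == 0))    # reference, incorrect
        c = np.sum((group[m] == 1) & (item[m] == 1))    # focal, correct
        d = np.sum((group[m] == 1) & (item[m] == 0))    # focal, incorrect
        n = a + b + c + d
        if n < 2:
            continue
        num += a * d / n
        den += b * c / n
        chi_num += a - (a + b) * (a + c) / n            # observed minus expected
        chi_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    alpha_mh = num / den                                # common odds ratio
    chi2_mh = (abs(chi_num) - 0.5) ** 2 / chi_var       # with continuity correction
    delta_mh = -2.35 * np.log(alpha_mh)                 # ETS delta metric
    return alpha_mh, chi2_mh, delta_mh

Comparing chi2_mh against a chi-square distribution with one degree of freedom gives the significance test, while delta_mh is the effect-size metric used in the common ETS classification of DIF magnitude.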
APA, Harvard, Vancouver, ISO, and other styles
6

Mehta, Vandhana. "Structural Validity and Item Functioning of the LoTi Digital-Age Survey." Thesis, University of North Texas, 2011. https://digital.library.unt.edu/ark:/67531/metadc68014/.

Full text
Abstract:
The present study examined the structural construct validity of the LoTi Digital-Age Survey, a measure of teacher instructional practices with technology in the classroom. Teacher responses (N = 2840) from across the United States were used to assess the factor structure of the instrument using both exploratory and confirmatory analyses. Parallel analysis suggests retaining a five-factor solution, whereas the MAP test suggests retaining a three-factor solution. Both analyses (EFA and CFA) indicate that changes need to be made to the current factor structure of the survey. The last two factors were composed of items that did not cover or accurately measure the content of the latent trait. Problematic items, such as items with cross-loadings, were discussed. Suggestions were provided to improve the factor structure, items, and scale of the survey.
APA, Harvard, Vancouver, ISO, and other styles
7

Asil, Mustafa. "Differential item functioning (DIF) analysis of the verbal section of the 2003 student selection examination (SSE)." The Ohio State University, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=osu1399553097.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Whitmore, Marjorie Lee Threet. "A Comparison of Two Differential Item Functioning Detection Methods: Logistic Regression and an Analysis of Variance Approach Using Rasch Estimation." Thesis, University of North Texas, 1995. https://digital.library.unt.edu/ark:/67531/metadc278366/.

Full text
Abstract:
Differential item functioning (DIF) detection rates were examined for the logistic regression and analysis of variance (ANOVA) DIF detection methods. The methods were applied to simulated data sets of varying test length (20, 40, and 60 items) and sample size (200, 400, and 600 examinees) for both equal and unequal underlying ability between groups as well as for both fixed and varying item discrimination parameters. Each test contained 5% uniform DIF items, 5% non-uniform DIF items, and 5% combination DIF (simultaneous uniform and non-uniform DIF) items. The factors were completely crossed, and each experiment was replicated 100 times. For both methods and all DIF types, a test length of 20 was sufficient for satisfactory DIF detection. The detection rate increased significantly with sample size for each method. With the ANOVA DIF method and uniform DIF, there was a difference in detection rates between discrimination parameter types, which favored varying discrimination and decreased with increased sample size. The detection rate of non-uniform DIF using the ANOVA DIF method was higher with fixed discrimination parameters than with varying discrimination parameters when relative underlying ability was unequal. In the combination DIF case, there was a three-way interaction among the experimental factors discrimination type, relative ability, and sample size for both detection methods. The error rate for the ANOVA DIF detection method decreased as test length increased and increased as sample size increased. For both methods, the error rate was slightly higher with varying discrimination parameters than with fixed. For logistic regression, the error rate increased with sample size when relative underlying ability was unequal between groups. The logistic regression method detected uniform and non-uniform DIF at a higher rate than the ANOVA DIF method. Because the type of DIF present in real data is rarely known, the logistic regression method is recommended for most cases.
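The logistic regression procedure evaluated in this study (in the spirit of Swaminathan and Rogers, 1990) tests uniform and non-uniform DIF by comparing nested models for a single item. A minimal sketch follows; the function name and inputs are assumptions for illustration, and statsmodels and scipy are assumed to be available.

import numpy as np
import statsmodels.api as sm
from scipy import stats

def logistic_dif(item, group, total):
    """Logistic regression DIF tests for one dichotomous item.
    Compares nested models: score only; + group (uniform DIF);
    + group-by-score interaction (non-uniform DIF)."""
    item, group, total = (np.asarray(v, dtype=float) for v in (item, group, total))
    designs = [
        sm.add_constant(np.column_stack([total])),
        sm.add_constant(np.column_stack([total, group])),
        sm.add_constant(np.column_stack([total, group, total * group])),
    ]
    ll = [sm.Logit(item, X).fit(disp=0).llf for X in designs]
    uniform = 2 * (ll[1] - ll[0])        # 1 df likelihood-ratio statistic
    nonuniform = 2 * (ll[2] - ll[1])     # 1 df likelihood-ratio statistic
    return {
        "uniform_chi2": uniform, "uniform_p": stats.chi2.sf(uniform, 1),
        "nonuniform_chi2": nonuniform, "nonuniform_p": stats.chi2.sf(nonuniform, 1),
    }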
APA, Harvard, Vancouver, ISO, and other styles
9

Ing, Pamela Grace. "An Investigation of the 'White Male Effect' from a Psychometric Perspective." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1338312146.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Liu, Mingyang Liu. "Differential Item Functioning in Large-scale Mathematics Assessments: Comparing the Capabilities of the Rasch Trees Model to Traditional Approaches." University of Toledo / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1513266587329066.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Mâsse, Louise C. "A presentation and comparison of some new statistical techniques in the analysis of polytomous differential item functioning: A Monte Carlo investigation." Thesis, University of Ottawa (Canada), 1994. http://hdl.handle.net/10393/9904.

Full text
Abstract:
There is a need to develop and investigate methods which can assess the Item Response Differences (IRD) found in all the options of an item. In this study, such an investigation was referred to as Polytomous Differential Item Functioning (PDIF). The purpose of this study was to present and investigate the performance of four new approaches in the assessment of PDIF. The four approaches are a MANOVA (MCO) and a MANCOVA (MCA) approach applied to categorical dependent variables, a Polytomous Logistic Regression (PLR) approach, and an ANOVA analysis based on the item responses quantified by Dual Scaling (DS). In this study the effectiveness of these approaches (MCA, MCO, PLR, and DS) as well as the Log-Linear (LOG) approach of Mellenbergh (1982) were assessed under various conditions of test length, sample size, item difficulty, and the amount and location of PDIF. A two-parameter polytomous logistic regression model was used to generate the data. In this study, only uniform PDIF was introduced in the alternatives of the item. The type of PDIF simulated (e.g. uniform) in this study did not allow for a direct comparison of the nonuniform test of hypothesis between the Logistic (LOG and PLR) approaches and the MAN(C)OVA (MCA and MCO) approaches because the Logistic approaches test for a difference in logits while the MAN(C)OVA approaches test for a difference in proportions. It was shown in this study that varying the probability of choosing the alternatives resulted in uniform logit differences which did not only translate into uniform differences in proportions but also translated into nonuniform differences in proportions. These differences affected the interpretation of the PDIF results because the test of nonuniform PDIF for the Logistic procedures corresponded to a valid test of the null hypothesis while the MAN(C)OVA results for nonuniform PDIF had to be adjusted in order to yield a test which approximated a true test of the null hypothesis. The results of this study lend some optimism to the employment of the MCA and PLR approaches. (Abstract shortened by UMI.)
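The polytomous logistic regression (PLR) approach investigated here can be sketched as a likelihood-ratio comparison of multinomial models of the chosen option, matched on the total score. The snippet below is a simplified uniform-DIF version with invented names; it assumes statsmodels is available and that option responses are coded 0..K-1.

import numpy as np
import statsmodels.api as sm
from scipy import stats

def polytomous_lr_dif(option, group, total):
    """Uniform polytomous DIF via a likelihood-ratio test comparing multinomial
    logistic models of the chosen option, with and without a group term."""
    option, group, total = (np.asarray(v) for v in (option, group, total))
    X0 = sm.add_constant(total.astype(float))
    X1 = sm.add_constant(np.column_stack([total, group]).astype(float))
    m0 = sm.MNLogit(option, X0).fit(disp=0)
    m1 = sm.MNLogit(option, X1).fit(disp=0)
    lr = 2 * (m1.llf - m0.llf)
    df = len(np.unique(option)) - 1      # one extra column adds K-1 parameters
    return lr, stats.chi2.sf(lr, df)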
APA, Harvard, Vancouver, ISO, and other styles
12

Cúri, Mariana. "Análise de questionários com itens constrangedores." Universidade de São Paulo, 2006. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-06072009-155633/.

Full text
Abstract:
Scientific research in psychiatry often evaluates subjective characteristics of individuals such as depression, anxiety, and phobias. Data are collected through questionnaires whose items try to identify the presence or absence of certain symptoms associated with the psychiatric morbidity of interest. Some of these items, however, may embarrass some respondents because they address socially questionable or even illegal characteristics or behaviors. An item response theory model is proposed in this work to differentiate the relationship between the probability of symptom presence and the severity of the morbidity for embarrassed and non-embarrassed individuals. Items that require this differentiation are said to show differential item functioning (DIF). Additionally, the model allows for the assumption that individuals embarrassed by an item may lie in their responses, omitting the presence of a symptom. Applications of the proposed model to data simulated for 20-item questionnaires showed that the parameter estimates are close to their true values. The quality of the estimates deteriorates as the sample of individuals decreases, as the number of items with differential functioning increases, and especially as the number of differentially functioning items susceptible to lying increases. The application of the model to a real data set, collected to evaluate depression in adolescents, illustrates the difference in the response pattern of the item 'crying spells' between men and women.
APA, Harvard, Vancouver, ISO, and other styles
13

Cet, Selda. "A Multivariate Analysis In Detecting Differentially Functioning Items Through The Use Of Programme For Internetional Student Assessment (pisa) 2003 Mathematics Literacy Items." PhD thesis, METU, 2006. http://etd.lib.metu.edu.tr/upload/12607157/index.pdf.

Full text
Abstract:
Differential item functioning (DIF) analyses investigate whether individuals with the same ability in different groups show similar performance on an item. In matching individuals of the same ability, most methodologies use total scores on tests which are usually constructed to be unidimensional. The purpose of the present study is to evaluate the PISA 2003 mathematics literacy items using a DIF methodology that matches students with a multidimensional approach instead of a single total score, thereby improving the matching for DIF analyses. In the study, the factor structure of the tests was determined via both exploratory and confirmatory analyses in a complementary fashion; DIF analyses were then conducted using logistic regression (LR) and Mantel-Haenszel methods. Analyses showed that the matching criterion improved when multivariate analyses were used. The number of DIF items decreased when the matching criterion was defined on the basis of multiple criterion scores, such as mathematical literacy and problem-solving scores or two different mathematical literacy subtest scores. In addition, qualitative reviews and examination of the distribution of DIF items by content categories, cognitive demands, item types, item text, visual-spatial factors, and linguistic properties of items were carried out to explain the differential performance. Curriculum, cultural, and translation differences were the main criteria for the qualitative analyses of DIF items. The results imply that curriculum and translation differences in items might be causing the DIF across the Turkish and English versions of the tests.
APA, Harvard, Vancouver, ISO, and other styles
14

AGUIAR, GLAUCO DA SILVA. "A COMPARATIVE STUDY AMONG BRAZIL AND PORTUGAL ABOUT THE DIFFERENCES IN THE CURRICULAR EMPHASES IN MATHEMATICS USING THE ANALYSIS OF THE DIFFERENTIAL ITEM FUNCTIONING (DIF) FROM PISA 2003." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2008. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=12869@1.

Full text
Abstract:
This study compares the differences in the curricular emphases in mathematics in Brazil and Portugal using the results from the Programme for International Student Assessment (PISA) in 2003. The participants of this programme are 15-year-old students from the member countries of the Organisation for Economic Co-operation and Development (OECD) and from partner countries. Its aim is to assess how these students master the essential skills and knowledge needed to meet real-life challenges. Based on the existing literature about the official, taught, and learned curricula, this study assumes that the results of several countries in international surveys constitute a strategy for analysing the learned curriculum and the pedagogical emphases in the mathematical area. The methodology used in this work to identify the curricular differences, as well as the pedagogical and sociocultural approaches, is the analysis of Differential Item Functioning (DIF). An item presents differential functioning when students from different countries, who have the same cognitive ability, do not have the same probability of answering the item correctly. The results show that some mathematics items present differential functioning between Brazilian and Portuguese students. The aspects that explain this differential functioning are related to differential emphases not only on certain mathematics contents but also on the cognitive processes and on the item format.
APA, Harvard, Vancouver, ISO, and other styles
15

Foster, Garett C. "Measurement Invariance of Burnout Inventories across Sex." Bowling Green State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1428162452.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Anderson, Hannah Ruth. "A Psychometric Investigation of a Mathematics Placement Test at a Science, Technology, Engineering, and Mathematics (STEM) Gifted Residential High School." Kent State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=kent1594656968297342.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Roomaney, Rizwana. "Towards establishing the equivalence of the IsiXhosa and English versions of the Woodcok Munoz language survey : an item and construct bias analysis of the verbal analogies scale." Thesis, University of the Western Cape, 2010. http://etd.uwc.ac.za/index.php?module=etd&action=viewtitle&id=gen8Srv25Nme4_7549_1306830207.

Full text
Abstract:
This study formed part of a larger project concerned with the adaptation of a test of cognitive academic language proficiency, the Woodcock Muñoz Language Survey (WMLS). The WMLS has been adapted from English into isiXhosa, and the present study is located within the broader study concerned with establishing overall equivalence between the two language versions of the WMLS. It was primarily concerned with the Verbal Analogies (VA) scale. Previous research on this scale has demonstrated promising results, but continues to find evidence of some inequivalence. This study aimed to cross-validate previous research on the two language versions of the WMLS and improve on methodological issues by employing matched groups. It drew upon an existing dataset from the larger research project. The study employed a monolingual matched two-group design consisting of 150 mainly English-speaking and 149 mainly isiXhosa-speaking learners in grades 6 and 7. This study had two sub-aims. The first was to investigate item bias by identifying DIF items in the VA scale across the isiXhosa and English versions by conducting a logistic regression and Mantel-Haenszel procedure. Five items were identified by both techniques as DIF. The second sub-aim was to evaluate construct equivalence between the isiXhosa and English versions of the WMLS on the VA scale by conducting a factor analysis on the tests after removal of DIF items. Two factors were requested during the factor analysis. The first factor displayed significant loadings across both language versions and was identified as a stable factor. This was confirmed by Tucker's phi and the scatter plot. The second factor was stable for the English version but not for the isiXhosa version. Tucker's phi and the scatter plot indicated that this factor is not structurally equivalent across the two language versions.
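Tucker's phi, used above to judge the congruence of the factors across the two language versions, has a simple closed form. The helper below is an illustrative sketch (the name and input layout are assumptions), taking one vector of factor loadings per language group.

import numpy as np

def tuckers_phi(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two vectors of factor loadings
    for the same items in two groups (e.g., English and isiXhosa versions)."""
    a = np.asarray(loadings_a, dtype=float)
    b = np.asarray(loadings_b, dtype=float)
    return np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))

# Values of roughly .95 or higher are commonly read as indicating factorial similarity.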
APA, Harvard, Vancouver, ISO, and other styles
18

Rodrigues, Edilene Noronha. "O funcionamento diferencial do item de língua portuguesa: análise das causas e conseqüências no contexto do programa Nova Escola-RJ e do PROEB-MG." Universidade Federal de Juiz de Fora (UFJF), 2008. https://repositorio.ufjf.br/jspui/handle/ufjf/5320.

Full text
Abstract:
Differential item functioning occurs when an item is administered to two groups of students with the same cognitive ability and the groups perform differently when responding to that item, leading one group to present better results than the other. The present study considered groups of female and male students from different racial groups, as well as students from distinct regions of the states of Rio de Janeiro and Minas Gerais. The statistical results obtained between two groups of students can show that one group is being favored relative to the other. Analyzing the differential item functioning (DIF) of items across different groups of students is important so that the estimated proficiency is of good quality and the results can be adequately equated. DIF analysis can also point to pedagogical differentiation between school regions, due to greater exposure to a topic or greater curricular emphasis, or to cultural differences associated with socio-economic differences, or it can diagnose existing pedagogical deficiencies. The objective of the research is to identify and analyze the Portuguese Language items, specifically from the 3rd, 4th, and 5th grades of elementary school, in the tests administered in 2005 by the Nova Escola Program and by Simave/Proeb, that showed differential functioning, according to their difficulty, for students from different regions and with differences of gender and race in the states of Rio de Janeiro and Minas Gerais. Initially, a historical review of large-scale assessment in Brazil and of the main assessment programs carried out in the states is presented; without any pretension of exhausting the subject, some measurement methods in large-scale educational assessment and the construction of the proficiency scale are described. Next, Differential Item Functioning (DIF), the main methods for its detection, and its historical evolution are described. Subsequently, DIF was identified in items of the Nova Escola Program and of Simave/Proeb, the flagged items were classified, and hypotheses were generated to explain the DIF. The present study found that not all items diagnosed with DIF are bad for assessment purposes, since they can provide important information for diagnosing the educational systems of the states of Rio de Janeiro and Minas Gerais; through them, curricular deficiencies, indications of racial discrimination, or cultural diversity can be diagnosed. Thus, removing such items from the tests is not always the most appropriate procedure.
APA, Harvard, Vancouver, ISO, and other styles
19

Ismail, Ghouwa. "Towards establishing the equivalence of the English version of the verbal analogies scale of the Woodcock Munuz Language Survey across English and Xhosa first language speakers." Thesis, University of the Western Cape, 2010. http://etd.uwc.ac.za/index.php?module=etd&action=viewtitle&id=gen8Srv25Nme4_9609_1305113932.

Full text
Abstract:
In the majority of schools in South Africa (SA), learners commence education in English. This English milieu poses a considerable challenge for English second-language speakers. In an attempt to bridge the gap between English as the main medium of instruction and the nine indigenous languages of the country, and to assist with the implementation of mother-tongue based bilingual education, this study focuses on the cross-validation of a monolingual English test used in the assessment of multilingual or bilingual learners in the South African context. This test, namely the Woodcock Muñoz Language Survey (WMLS), is extensively used in Additive Bilingual Education in the United States. The present study is a sub-study of a broader study in which the original WMLS (American-English version) was adapted into SA English and Xhosa. For this specific sub-study, the researcher was interested in investigating the scalar equivalence of the adapted English version of the Verbal Analogies (VA) subscale of the WMLS across English first-language speakers and Xhosa first-language speakers. This was achieved by utilising differential item functioning (DIF) and construct bias statistical techniques. The Mantel-Haenszel DIF detection method was employed to detect DIF, while construct equivalence was examined by means of exploratory factor analysis (EFA) utilising an a priori two-factor structure. Tucker's phi coefficient was used to assess the congruence of the construct across the two language groups.
APA, Harvard, Vancouver, ISO, and other styles
20

Backteman-Erlanson, Susann. "Burnout, work, stress of conscience and coping among female and male patrolling police officers." Doctoral thesis, Umeå universitet, Institutionen för omvårdnad, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-71206.

Full text
Abstract:
Background. Police work is a stressful occupation with frequent exposure to traumatic events, and psychological strain from work might increase the risk of burnout. This thesis focuses on patrolling police officers (PPOs), who work most of their time in the community and have daily contact with the public. Since police work traditionally is a male-coded occupation, we assume that there are differences between women and men in burnout as well as in experiences of the psychosocial work environment. Aim. The overall aim of this thesis is to explore burnout, psychosocial and physical work environment, coping strategies, and stress of conscience among patrolling police officers when taking gender into consideration. Methods. This thesis employs both qualitative and quantitative methods. In Paper I a qualitative approach with narrative interviews was used, in which male PPOs described experiences of traumatic situations when caring for victims of traffic accidents. A convenience sample of nine male PPOs from a mid-sized police authority was recruited. Interviews were analyzed using qualitative content analysis. Papers II, III, and IV were based on a cross-sectional survey of a randomly selected sample, stratified for gender, from all 21 local police authorities in Sweden. In the final sample, 1554 PPOs were invited (778 women, 776 men); the response rate was 55% (n=856) in total, 56% for women (n=437) and 53% for men (n=419). The survey included a self-administered questionnaire based on instruments measuring burnout, stress of conscience, psychosocial and physical work environment, and coping. Results. Findings from Paper I were presented in three themes: “being secure with the support system,” “being confident about prior successful actions,” and “being burdened with uncertainty.” Results from Paper II showed high levels of emotional exhaustion (EE), 30% for female PPOs and 26% for male PPOs. High levels of depersonalization (DP) were reported for 52% of female PPOs; the corresponding proportion for male PPOs was 60%. Multiple logistic regression showed that stress of conscience (SCQ-A), high demand, and organizational climate increased the risk of EE for female PPOs. For male PPOs, stress of conscience (SCQ-A), low control, and high demand increased the risk of EE. Independent of gender, stress of conscience (SCQ-A) increased the risk of DP. Psychometric properties of the WOCQ were investigated with exploratory factor analysis and confirmatory factor analysis, and a six-factor solution was confirmed. DIF in relation to gender was detected for a third of the items. In Paper IV a block-wise hierarchical multiple regression analysis was performed investigating the predictive impact of psychological demand, decision latitude, social support, coping strategies, and stress of conscience on EE as well as DP. Findings revealed that, regardless of gender, the risk of EE and DP increased with a troubled conscience among the PPOs. Conclusion. “Being burdened with uncertainty” in this male-dominated context indicates that the PPOs did not feel confident talking about traumatic situations, which might influence their coping strategies when arriving at a similar situation. This finding can be related to Papers II and IV, showing that stress of conscience increased the risk of both EE and DP.
The associations between troubled conscience and the risk of experiencing both emotional exhaustion and depersonalization indicate that stress of conscience should be considered when studying the influence of the psychosocial work environment on burnout. Results from this study show that the psychosocial work environment is not satisfying and needs improvement for patrolling police officers in Sweden. Further studies including both qualitative and quantitative (longitudinal) methods should be used to improve knowledge in this area to increase conditions for preventive and rehabilitative actions.
APA, Harvard, Vancouver, ISO, and other styles
21

O'Brien, Erin L. "USING DIFFERENTIAL FUNCTIONING OF ITEMS AND TESTS (DFIT) TO EXAMINE TARGETED DIFFERENTIAL ITEM FUNCTIONING." Wright State University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=wright1421955213.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Jiang, Jing. "Regularization Methods for Detecting Differential Item Functioning:." Thesis, Boston College, 2019. http://hdl.handle.net/2345/bc-ir:108404.

Full text
Abstract:
Differential item functioning (DIF) occurs when examinees of equal ability from different groups have different probabilities of correctly responding to certain items. DIF analysis aims to identify potentially biased items to ensure the fairness and equity of instruments, and has become a routine procedure in developing and improving assessments. This study proposed a DIF detection method using regularization techniques, which allows for simultaneous investigation of all items on a test for both uniform and nonuniform DIF. In order to evaluate the performance of the proposed DIF detection models and understand the factors that influence the performance, comprehensive simulation studies and empirical data analyses were conducted. Under various conditions including test length, sample size, sample size ratio, percentage of DIF items, DIF type, and DIF magnitude, the operating characteristics of three kinds of regularized logistic regression models: lasso, elastic net, and adaptive lasso, each characterized by their penalty functions, were examined and compared. Selection of the optimal tuning parameter was investigated using two well-known information criteria, AIC and BIC, and cross-validation. The results revealed that BIC outperformed the other model selection criteria: it not only flagged high-impact DIF items precisely, but also prevented over-identification of DIF items with few false alarms. Among the regularization models, the adaptive lasso model achieved superior performance compared with the other two models in most conditions. The performance of the regularized DIF detection model using adaptive lasso was then compared to two commonly used DIF detection approaches, the logistic regression method and the likelihood ratio test. The proposed model was applied to analyzing empirical datasets to demonstrate the applicability of the method in real settings.
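The core idea of the regularized approach, penalizing an item's group and group-by-score coefficients and choosing the tuning parameter by BIC, can be illustrated with a per-item lasso sketch. This is not the dissertation's model: the actual method treats all items simultaneously and also covers elastic net and adaptive lasso, whereas the snippet below penalizes everything (including the matching score) with a plain lasso, and its names and scikit-learn backend are assumptions purely for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

def lasso_dif_screen(item, group, total, Cs=np.logspace(-2, 2, 25)):
    """Lasso-penalized logistic DIF screen for a single item, choosing the
    penalty strength by BIC. Nonzero group / group-by-score coefficients at
    the selected penalty flag uniform / non-uniform DIF respectively."""
    item, group, total = (np.asarray(v, dtype=float) for v in (item, group, total))
    X = np.column_stack([total, group, total * group])
    n = len(item)
    best = None
    for C in Cs:                                    # larger C = weaker penalty
        fit = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X, item)
        p = np.clip(fit.predict_proba(X)[:, 1], 1e-12, 1 - 1e-12)
        loglik = np.sum(item * np.log(p) + (1 - item) * np.log(1 - p))
        k = np.count_nonzero(fit.coef_) + 1         # nonzero slopes + intercept
        bic = -2 * loglik + k * np.log(n)
        if best is None or bic < best[0]:
            best = (bic, fit.coef_.ravel())
    _, coefs = best
    return {"uniform_dif": coefs[1], "nonuniform_dif": coefs[2]}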
APA, Harvard, Vancouver, ISO, and other styles
23

Lee, Yoonsun. "The impact of a multidimensional item on differential item functioning (DIF)." Thesis, University of Washington, 2004. http://hdl.handle.net/1773/7920.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

McBride, Nadine LeBarron. "Differential Item Functioning on the International Personality Item Pool's Neuroticism Scale." Diss., Virginia Tech, 2008. http://hdl.handle.net/10919/29999.

Full text
Abstract:
As use of the public-domain International Personality Item Pool (IPIP) scales has grown significantly over the past decade (Goldberg, Johnson, Eber, Hogan, Ashton, Cloninger, & Gough, 2006), research on the psychometric properties of the items and scales has become increasingly important. This research study examines the IPIP scale constructed to measure the Five Factor Model (FFM) domain of Neuroticism (as measured by the NEO-PI-R) for occurrences of differential functioning at both the item and test level, by gender and by three age ranges, using the DFIT framework (Raju, van der Linden, & Fleer, 1993). This study found six items that displayed differential item functioning by gender and three items that displayed differential item functioning by age. No differential functioning at the test level was found. Items demonstrating DIF and implications for potential scale revision are discussed.
APA, Harvard, Vancouver, ISO, and other styles
25

Li, Zhen. "Impact of differential item functioning on statistical conclusions." Thesis, University of British Columbia, 2009. http://hdl.handle.net/2429/14680.

Full text
Abstract:
Differential item functioning (DIF), sometimes called item bias, has been widely studied in educational and psychological measurement; however, to date, research has focused on the definitions of, and the methods for, detecting DIF. It is well accepted that the presence of DIF may degrade the validity of a test. There is relatively little known, however, about the impact of DIF on later statistical decisions when one uses the observed test scores in data analyses and corresponding statistical hypothesis tests. This dissertation investigated the impact of DIF on later statistical decisions based on the observed total test (or scale) score. Very little is known in the literature about the impact of DIF on the Type I error rate and effect size of, for instance, the independent samples t-test on the observed total test scores. Five studies were conducted: studies one to three investigated the impact of unidirectional DIF (i.e., DIF amplification) on the Type I error rate and effect size of the independent samples t-test; studies four and five investigated the DIF cancellation effects on the Type I error rate and effect size of the independent samples t-test. The Type I error rate and effect size were defined in terms of latent population means rather than observed sample means. The results showed that the amplification and cancellation effects among uniform DIF items did transfer to the test level. Both the Type I error rate and effect size were inflated. The degree of inflation depends on the number of DIF items, magnitude of DIF, sample sizes, and interactions among these factors. These findings highlight the importance of screening DIF before conducting any further statistical analysis. It offers advice to practicing researchers about when and how much the presence of DIF will affect their statistical conclusions based on the total observed test scores.
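The amplification effect reported here is easy to reproduce in a toy Monte Carlo. The sketch below assumes Rasch-generated item responses, identical latent ability distributions in the two groups, and uniform DIF against the focal group on a few items; all names and parameter values are illustrative rather than taken from the dissertation.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)

def dif_ttest_rejection_rate(n_per_group=200, n_items=30, n_dif=6,
                             dif_shift=0.6, reps=1000, alpha=0.05):
    """Toy check of how uniform DIF against the focal group inflates the
    Type I error of a t-test on observed total scores when the latent
    ability distributions of the two groups are identical."""
    b = rng.normal(0.0, 1.0, n_items)                   # item difficulties
    b_focal = b.copy()
    b_focal[:n_dif] += dif_shift                        # uniform DIF items
    rejections = 0
    for _ in range(reps):
        theta_ref = rng.normal(0.0, 1.0, n_per_group)
        theta_foc = rng.normal(0.0, 1.0, n_per_group)   # same latent mean
        p_ref = 1.0 / (1.0 + np.exp(-(theta_ref[:, None] - b)))
        p_foc = 1.0 / (1.0 + np.exp(-(theta_foc[:, None] - b_focal)))
        score_ref = (rng.random(p_ref.shape) < p_ref).sum(axis=1)
        score_foc = (rng.random(p_foc.shape) < p_foc).sum(axis=1)
        if stats.ttest_ind(score_ref, score_foc).pvalue < alpha:
            rejections += 1
    return rejections / reps

# A value well above the nominal 0.05 illustrates the inflation described above.
print(dif_ttest_rejection_rate())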
APA, Harvard, Vancouver, ISO, and other styles
26

Sanguras, Laila Y. "Construct Validation and Measurement Invariance of the Athletic Coping Skills Inventory for Educational Settings." Thesis, University of North Texas, 2017. https://digital.library.unt.edu/ark:/67531/metadc984216/.

Full text
Abstract:
The present study examined the factor structure and measurement invariance of the revised version of the Athletic Coping Skills Inventory (ACSI-28), following adjustment of the wording of items such that they were appropriate to assess Coping Skills in an educational setting. A sample of middle school students (n = 1,037) completed the revised inventory. An initial confirmatory factor analysis led to the hypothesis of a better fitting model with two items removed. Reliability of the subscales and the instrument as a whole was acceptable. Items were examined for sex invariance with differential item functioning (DIF) using item response theory, and five items were flagged for significant sex non-invariance. Following removal of these items, comparison of the mean differences between male and female coping scores revealed that there was no significant difference between the two groups. Further examination of the generalizability of the coping construct and the potential transfer of psychosocial skills between athletic and academic settings are warranted.
APA, Harvard, Vancouver, ISO, and other styles
27

Samuelsen, Karen Marie. "Examining differential item functioning from a latent class perspective." College Park, Md. : University of Maryland, 2005. http://hdl.handle.net/1903/2682.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Bryant, Damon. "THE EFFECTS OF DIFFERENTIAL ITEM FUNCTIONING ON PREDICTIVE BIAS." Doctoral diss., University of Central Florida, 2004. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/2237.

Full text
Abstract:
The purpose of this research was to investigate the relation between measurement bias at the item level (differential item functioning, dif) and predictive bias at the test score level. Dif was defined as a difference in the probability of getting a test item correct for examinees with the same ability but from different subgroups. Predictive bias was defined as a difference in subgroup regression intercepts and/or slopes in predicting a criterion. Data were simulated by computer. Two hypothetical subgroups (a reference group and a focal group) were used. The predictor was a composite score on a dimensionally complex test with 60 items. Sample size (35, 70, and 105 per group), validity coefficient (.3 or .5), and the mean difference on the predictor (0, .33, .66, and 1 standard deviation, sd) and the criterion (0 and .35 sd) were manipulated. The percentage of items showing dif (0%, 15%, and 30%) and the effect size of dif (small = .3, medium = .6, and large = .9) were also manipulated. Each of the 432 conditions in the 3 x 2 x 4 x 2 x 3 x 3 design was replicated 500 times. For each replication, a predictive bias analysis was conducted, and the detection of predictive bias against each subgroup was the dependent variable. The percentage of dif and the effect size of dif were hypothesized to influence the detection of predictive bias; hypotheses were also advanced about the influence of sample size and mean subgroup differences on the predictor and criterion. Results indicated that dif was not related to the probability of detecting predictive bias against any subgroup. Results were inconsistent with the notion that measurement bias and predictive bias are mutually supportive, i.e., the presence (or absence) of one type of bias is evidence in support of the presence (or absence) of the other type of bias. Sample size and mean differences on the predictor/criterion had direct and indirect effects on the probability of detecting predictive bias against both reference and focal groups. Implications for future research are discussed.
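Predictive bias as defined in this abstract (subgroup differences in regression intercepts and/or slopes) is commonly tested with a Cleary-type moderated regression of the criterion on the predictor, a group indicator, and their product. A minimal sketch, with invented names and assuming statsmodels, is shown below.

import numpy as np
import statsmodels.api as sm

def predictive_bias_test(criterion, predictor, group):
    """Cleary-type moderated regression: a significant predictor-by-group term
    indicates slope bias; a significant group term indicates intercept bias."""
    criterion, predictor, group = (np.asarray(v, dtype=float)
                                   for v in (criterion, predictor, group))
    X = sm.add_constant(np.column_stack([predictor, group, predictor * group]))
    fit = sm.OLS(criterion, X).fit()
    return {"intercept_bias_p": fit.pvalues[2],   # group indicator
            "slope_bias_p": fit.pvalues[3]}       # interaction term

In practice the interaction term is usually tested first and dropped if nonsignificant before testing the group term; the sketch simply reports both p-values from the full model.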
APA, Harvard, Vancouver, ISO, and other styles
29

Greenberg, Stuart Elliot. "Differential item functioning on the Myers-Briggs type indicator." Diss., Virginia Tech, 1992. http://hdl.handle.net/10919/38455.

Full text
Abstract:
Differential item functioning on the Myers-Briggs Type Indicator (MBTI) was examined in regard to gender. The Myers-Briggs has a differential scoring system for males and females on its thinking/feeling subscale. This scoring system preserves the 60% thinking male and 30% thinking female proportion that is implied by the Jungian theory underlying the Indicator. The MBTI's authors contended that the sex-based differential scoring system corrects items that subjects at a certain level of a latent trait either incorrectly endorse or leave blank. This reasoning is the classical definition of differential item functioning (DIF); consequently, the non-differentially scored items should exhibit DIF. If these items do not show DIF, then there would be no reason to use a differential scoring system. Although the Indicator has been in use for several decades, no rigorous item response theory (IRT) item-level analysis of the Indicator has been undertaken. IRT analysis allows for mean differences in subgroups to occur, independent of the question of DIF. Linn and Harnisch's (1981) pseudo-IRT analysis was chosen to test for the presence of DIF in the MBTI items because it is best for tests of relatively small length. The Myers-Briggs subscales range from 22 to 26 items, which is relatively small by IRT standards. IRT analyses conducted on N=1887 subjects indicated that no items on the thinking/feeling subscale showed evidence of DIF. Out of 94 items, only one extraversion/introversion item and one judging/perception item showed evidence of DIF; no Thinking/Feeling items showed DIF. It is recommended that sex-based differential MBTI scoring be abandoned, and that the distribution of type in the population be examined in future studies.
APA, Harvard, Vancouver, ISO, and other styles
30

Henderson, Dianne L. "Investigation of differential item functioning in exit examinations across item format and subject area." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape8/PQDD_0019/NQ46848.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Gibson, Shanan Gwaltney IV. "Differential Item Functioning on the Armed Services Vocational Aptitude Battery." Thesis, Virginia Tech, 1998. http://hdl.handle.net/10919/37047.

Full text
Abstract:
Utilizing Item Response Theory (IRT) methodologies, the Armed Services Vocational Aptitude Battery (ASVAB) was examined for differential item functioning (DIF) on the basis of crossed gender and ethnicity variables. Both the Mantel-Haenszel procedure and an IRT area-based technique were utilized to assess the degree of uniform and non-uniform DIF in a sample of ASVAB takers. The analysis was performed such that each subgroup of interest functioned as the focal group to be compared to the male reference group. This type of DIF analysis allowed for comparisons within ethnic group, within gender group, as well as crossed ethnic/gender group. The groups analyzed were: White, Black, and Hispanic males, and White and Black females. It was hypothesized that DIF would be found, at the scale level, on several of the ASVAB sub-tests as a result of unintended latent trait demands of items. In particular, those tests comprised of items requiring specialized jargon, visuospatial ability, or advanced English vocabulary were anticipated to show bias toward white males and/or white females. Findings were mixed. At the item level, DIF fluctuated greatly. Numerous instances of DIF favoring the reference as well as the focal group were found. At the scale level, inconsistencies existed across the forms and versions. Tests varied in their tendency to be biased against the focal group of interest and, at times, performed contrary to expectations.
APA, Harvard, Vancouver, ISO, and other styles
32

Gratias, Melissa B. "Gender and Ethnicity-Based Differential Item Functioning on the Myers-Briggs Type Indicator." Thesis, Virginia Tech, 1997. http://hdl.handle.net/10919/30362.

Full text
Abstract:
Item Response Theory (IRT) methodologies were employed in order to examine the Myers-Briggs Type Indicator (MBTI) for differential item functioning (DIF) on the basis of crossed gender and ethnicity variables. White males were the reference group, and the focal groups were: black females, black males, and white females. The MBTI was predicted to show DIF in all comparisons. In particular, DIF on the Thinking-Feeling scale was hypothesized, especially in the comparisons between white males and black females and between white males and white females. A sample of 10,775 managers who took the MBTI at assessment centers provided the data for the present experiment. The Mantel-Haenszel procedure and an IRT-based area technique were the methods of DIF detection. Results showed several biased items on all scales for all comparisons. Ethnicity-based bias was seen in the white male vs. black female and white male vs. black male comparisons. Gender-based bias was seen particularly in the white male vs. white female comparisons. Consequently, the Thinking-Feeling scale showed the least DIF of all scales across comparisons, and only one of the items differentially scored by gender was found to be biased. Findings indicate that the gender-based differential scoring system is not defensible in managerial samples, and there is a need for further research into the study of differential item functioning with regard to ethnicity.
APA, Harvard, Vancouver, ISO, and other styles
33

Juve, John A. "Assessing differential item functioning and item parameter drift in the college basic academic subjects examination." University of Missouri-Columbia, 2004. http://wwwlib.umi.com/cr/mo/fullcit?p3137717.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Li, Yong "Isaac." "Extending the Model with Internal Restrictions on Item Difficulty (MIRID) to Study Differential Item Functioning." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6724.

Full text
Abstract:
Differential item functioning (DIF) is a psychometric issue routinely considered in educational and psychological assessment. However, it has not been studied in the context of a recently developed componential statistical model, the model with internal restrictions on item difficulty (MIRID; Butter, De Boeck, & Verhelst, 1998). Because the MIRID requires test questions measuring either single or multiple cognitive processes, it creates a complex environment for which traditional DIF methods may be inappropriate. This dissertation sought to extend the MIRID framework to detect DIF at the item-group level and the individual-item level. Such a model-based approach can increase the interpretability of DIF statistics by focusing on item characteristics as potential sources of DIF. In particular, group-level DIF may reveal comparative group strengths in certain secondary constructs. A simulation study was conducted to examine under different conditions parameter recovery, Type I error rates, and power of the proposed approach. Factors manipulated included sample size, magnitude of DIF, distributional characteristics of the groups, and the MIRID DIF models corresponding to discrete sources of differential functioning. The impact of studying DIF using wrong models was investigated. The results from the recovery study of the MIRID DIF model indicate that the four delta (i.e., non-zero value DIF) parameters were underestimated whereas item locations of the four associated items were overestimated. Bias and RMSE were significantly greater when delta was larger; larger sample size reduced RMSE substantially while the effects from the impact factor were neither strong nor consistent. Hypothesiswise and adjusted experimentwise Type I error rates were controlled in smaller delta conditions but not in larger delta conditions as estimates of zero-value DIF parameters were significantly different from zero. Detection power of the DIF model was weak. Estimates of the delta parameters of the three group-level DIF models, the MIRID differential functioning in components (DFFc), the MIRID differential functioning in item families (DFFm), and the MIRID differential functioning in component weights (DFW), were acceptable in general. They had good hypothesiswise and adjusted experimentwise Type I error control across all conditions and overall achieved excellent detection power. When fitting the proposed models to mismatched data, the false detection rates were mostly beyond the Bradley criterion because the zero-value DIF parameters in the mismatched model were not estimated adequately, especially in larger delta conditions. Recovery of item locations and component weights was also not adequate in larger delta conditions. Estimation of these parameters was more or less affected adversely by the DIF effect simulated in the mismatched data. To study DIF in MIRID data using the model-based approach, therefore, more research is necessary to determine the appropriate procedure or model to implement, especially for item-level differential functioning.
APA, Harvard, Vancouver, ISO, and other styles
35

Pagano, Ian S. "Ethnic differential item functioning in the assessment of quality of life." Thesis, University of Hawaii at Manoa, 2003. http://hdl.handle.net/10125/3067.

Full text
Abstract:
Ethnic differential item functioning (DIF) on the QLQ-C30 quality of life (QoL) questionnaire for cancer patients was investigated using item response theory methods. The sample consisted of 359 cancer patients representing four ethnic groups: Caucasian, Filipino, Hawaiian, and Japanese. Results showed the presence of DIF on several items, indicating ethnic differences in the assessment of quality of life. Relative to the Caucasian and Japanese groups, items related to financial difficulties, need for rest, nausea or vomiting, emotional difficulties, and social difficulties exhibited DIF for Filipinos. On these items Filipinos exhibited lower QoL scores, even though their overall QoL was not lower. This evidence may explain why Filipinos have previously been found to have lower overall QoL. Although Filipinos score lower on QoL than other groups, this may not reflect lower QoL, but rather differences in how QoL is defined. Additionally, DIF did not appear to alter the psychometric properties of the QLQ-C30. Thesis (Ph. D.)--University of Hawaii at Manoa, 2003. Includes bibliographical references (leaves 41-48). Mode of access: World Wide Web. Also available by subscription via World Wide Web. vi, 48 leaves, bound, 29 cm.
APA, Harvard, Vancouver, ISO, and other styles
36

Mtsatse, Nangamso. "Exploring Differential Item Functioning on reading achievement between English and isiXhosa." Diss., University of Pretoria, 2017. http://hdl.handle.net/2263/65447.

Full text
Abstract:
Post-Apartheid South Africa has undergone an educational language policy shift from only Afrikaans and English in education to the representation of all 11 official languages: Afrikaans, English, isiZulu, isiXhosa, isiNdebele, siSwati, Sepedi, Sesotho, Setswana, Tshivenda and Xitsonga. The national language policy included the Language in Education Policy (LiEP), which stipulates that learners in Grades 1-3 should, wherever possible, be provided the opportunity to be taught in their home language (HL). With this change, there has been a need to increase access to African languages in education. The 2007 Status of LoLT report released by the Department of Education (DoE) revealed that since 1996 up to 65% of learners in the foundation phase have been taught in their home language. In this respect, the LiEP has been successful in bridging the gap in access to African languages in the basic education system. At the same time, there has been rapid growth of interest in early childhood cross-cultural literacy assessment across the globe. Internationally, South Africa has participated in the Southern and Eastern Africa Consortium for Monitoring Educational Quality as well as the Progress in International Reading Literacy Study. The design of these international studies meant participation in the same assessment but in different languages, calling into question the equivalence of assessments across languages. Assessing across languages should aim to ensure linguistic equivalence, functional equivalence, cultural equivalence as well as metric equivalence. South Africa has taken part in three cycles of the Progress in International Reading Literacy Study (PIRLS). The purpose of the current study is to present a secondary analysis of the prePIRLS 2011 data to investigate any differential item functioning (DIF) in the achievement scores between English and isiXhosa. The Organisation for Economic Co-operation and Development (OECD) developed a framework of input, process and output for the curriculum process. The framework shows the multiple facets that need to be considered when implementing a curriculum in a country, and views curriculum success as a process of measuring how the intended curriculum (input) was implemented (process) and is reflected in the attained curriculum (output). In the adapted framework, the LiEP is the intended curriculum, as learners in prePIRLS 2011 are tested in the LoLT of Grades 1-3; the prePIRLS 2011 assessment is the implemented curriculum, testing the comprehension skills required by Grade 4 in the learners' HL; and the attained curriculum refers to the learners' achievement scores in the prePIRLS 2011 study. A sample of 819 Grade 4 learners (539 English L1-speaking learners and 279 isiXhosa L1-speaking learners) who participated in the prePIRLS 2011 study was included. These learners read a literary passage called The Lonely Giraffe and answered the 15 items accompanying it. The study made use of the Rasch model to investigate any evidence of Differential Item Functioning (DIF) in the learners' reading achievement. The findings showed that the items did not function equivalently across the two language groups. In addition, an item-by-item DIF analysis revealed items favouring one subgroup over the other. A further investigation showed that this differential functioning could be explained by inaccurate linguistic equivalence, attributable to mistranslation and/or dialectal differences. Subsequently, the complexities of dialects in African languages are presented by providing alternative isiXhosa translations of the items. The significance of the current study lies in its potential contribution to a further understanding of language complexities in large-scale assessments, and in its attempt to provide valid, reliable and fair assessment data across sub-groups. Dissertation (MEd)--University of Pretoria, 2017. Science, Mathematics and Technology Education. Centre for Evaluation & Assessment (CEA). MEd. Unrestricted.
APA, Harvard, Vancouver, ISO, and other styles
37

Brown, Paulette C. "An empirical study of the consistency of differential item functioning detection." Thesis, University of Ottawa (Canada), 1992. http://hdl.handle.net/10393/7928.

Full text
Abstract:
Total test scores of examinees on any given standardized test are used to provide reliable and objective information regarding the overall performance of the test takers. When the probability of successfully responding to a test item is not the same for examinees at the same ability levels, but from different groups, the item functions differentially in favour of one group over the other group. This type of problem, defined as differential item functioning (DIF), creates a disadvantage for members of certain subgroups of test takers. Test items need to be accurate and valid measures for all groups because test results may be used to make significant decisions which may have an impact on the future opportunities available to test takers. Thus, DIF is an issue of concern in the field of educational measurement. The purpose of this study was to investigate how well the Mantel-Haenszel (MH) and logistic regression (LR) procedures perform in the identification of items that function differentially across gender groups and regional groups. Research questions to be answered by this study were concerned with three issues: (1) the detection rates for DIF items and items which did not exhibit DIF, (2) the agreement for the MH and LR methods in the detection of DIF items, and (3) the effectiveness of these indices across sample size and over replications. (Abstract shortened by UMI.)
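As a concrete illustration of the Mantel-Haenszel procedure examined in this and several of the following entries, the sketch below computes the continuity-corrected MH chi-square and the ETS delta metric (MH D-DIF) for one dichotomous item, matching examinees on rest score. It is a minimal sketch on simulated Rasch-type data, not code from the cited thesis; all function names, variable names, and parameter values are hypothetical choices.

```python
"""Minimal sketch: Mantel-Haenszel DIF detection for one dichotomous item."""
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, group_shift=0.0, dif=0.0, n_items=25):
    """Rasch-type responses; `dif` raises the studied (last) item's difficulty for this group."""
    theta = rng.normal(group_shift, 1.0, n)
    b = rng.normal(0.0, 1.0, n_items)
    b[-1] += dif
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    return (rng.uniform(size=p.shape) < p).astype(int)

ref = simulate(1000)               # reference group
foc = simulate(1000, dif=0.6)      # focal group: studied item is harder

def mantel_haenszel(ref_resp, foc_resp, item):
    """Return the continuity-corrected MH chi-square and MH D-DIF for `item`."""
    anchor_r = ref_resp.sum(1) - ref_resp[:, item]   # rest score as matching variable
    anchor_f = foc_resp.sum(1) - foc_resp[:, item]
    num = den = sum_a = sum_ea = sum_var = 0.0
    for k in np.union1d(anchor_r, anchor_f):
        r = ref_resp[anchor_r == k, item]
        f = foc_resp[anchor_f == k, item]
        A, B = r.sum(), len(r) - r.sum()             # reference right / wrong
        C, D = f.sum(), len(f) - f.sum()             # focal right / wrong
        N = A + B + C + D
        if N < 2 or (A + C) == 0 or (B + D) == 0:
            continue                                 # stratum carries no information
        num += A * D / N
        den += B * C / N
        sum_a += A
        sum_ea += (A + B) * (A + C) / N
        sum_var += (A + B) * (C + D) * (A + C) * (B + D) / (N**2 * (N - 1))
    chi2 = (abs(sum_a - sum_ea) - 0.5) ** 2 / sum_var
    ddif = -2.35 * np.log(num / den)                 # ETS delta metric
    return chi2, ddif

print(mantel_haenszel(ref, foc, item=24))
```

A large chi-square together with a sizable negative D-DIF would flag the studied item as disadvantaging the focal group after matching on ability.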
APA, Harvard, Vancouver, ISO, and other styles
38

Carter, Nathan T. "APPLICATIONS OF DIFFERENTIAL FUNCTIONING METHODS TO THE GENERALIZED GRADED UNFOLDING MODEL." Bowling Green State University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1290885927.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Thurman, Carol Jenetha. "A Monte Carlo Study Investigating the Influence of Item Discrimination, Category Intersection Parameters, and Differential Item Functioning in Polytomous Items." Digital Archive @ GSU, 2009. http://digitalarchive.gsu.edu/eps_diss/48.

Full text
Abstract:
The increased use of polytomous item formats has led assessment developers to pay greater attention to the detection of differential item functioning (DIF) in these items. DIF occurs when an item performs differently for two contrasting groups of respondents (e.g., males versus females) after controlling for differences in the abilities of the groups. Determining whether the difference in performance on an item between two demographic groups is due to between-group differences in ability or to some form of unfairness in the item is a more complex task for a polytomous item, because of its many score categories, than for a dichotomous item. Effective DIF detection methods must be able to locate DIF within each of these score categories. The Mantel, Generalized Mantel-Haenszel (GMH), and Logistic Regression (LR) procedures are three of several DIF detection methods able to test for DIF in polytomous items. There have been relatively few studies on the effectiveness of polytomous procedures for detecting DIF, and of those studies only a small percentage have examined the efficiency of the Mantel, GMH, and LR procedures when item discrimination magnitudes and category intersection parameters vary and when there are different patterns of DIF (e.g., balanced versus constant) within score categories. This Monte Carlo simulation study compared the Type I error and power of the Mantel, GMH, and OLR (the LR method for ordinal data) procedures when variation occurred in 1) the item discrimination parameters, 2) the category intersection parameters, 3) the DIF patterns within score categories, and 4) the average latent traits of the reference and focal groups. Results of this investigation showed that high item discrimination levels were directly related to increased DIF detection rates. The location of the difficulty parameters was also found to have a direct effect on DIF detection rates. Additionally, depending on item difficulty, DIF magnitudes and patterns within score categories were found to affect DIF detection rates; finally, DIF detection power increased as DIF magnitudes became larger. The GMH outperformed the Mantel and OLR and is recommended for use with polytomous data when item discrimination varies across items.
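The Mantel test referred to above extends the MH idea to ordered score categories. The sketch below, a minimal illustration rather than the study's implementation, computes the one-degree-of-freedom Mantel statistic for a 0-3 scored item using a rest-score matching variable; the toy data, names, and parameter values are hypothetical.

```python
"""Minimal sketch: Mantel chi-square for an ordered polytomous item."""
import numpy as np
from scipy.stats import chi2

def mantel_polytomous(ref_item, ref_rest, foc_item, foc_rest):
    """One-degree-of-freedom Mantel statistic for an item scored 0..m."""
    stat_f = exp_f = var_f = 0.0
    for k in np.union1d(ref_rest, foc_rest):
        r = ref_item[ref_rest == k]
        f = foc_item[foc_rest == k]
        n_r, n_f = len(r), len(f)
        n = n_r + n_f
        if n_r == 0 or n_f == 0 or n < 2:
            continue
        pooled = np.concatenate([r, f])
        s1 = pooled.sum()                 # sum of item scores in the stratum
        s2 = (pooled ** 2).sum()
        stat_f += f.sum()                 # observed focal score sum
        exp_f += n_f * s1 / n             # expectation under no DIF
        var_f += n_r * n_f * (n * s2 - s1 ** 2) / (n ** 2 * (n - 1))
    mantel = (stat_f - exp_f) ** 2 / var_f
    return mantel, chi2.sf(mantel, df=1)

# toy example: focal group pushed toward lower categories on the studied item
rng = np.random.default_rng(1)
theta_r, theta_f = rng.normal(0, 1, 800), rng.normal(0, 1, 800)
ref_rest = rng.poisson(np.clip(10 + 2 * theta_r, 0.1, None)).clip(0, 30)
foc_rest = rng.poisson(np.clip(10 + 2 * theta_f, 0.1, None)).clip(0, 30)
ref_item = np.clip(np.round(theta_r + rng.normal(0, .7, 800) + 1.5), 0, 3)
foc_item = np.clip(np.round(theta_f + rng.normal(0, .7, 800) + 1.0), 0, 3)  # shifted: DIF
print(mantel_polytomous(ref_item, ref_rest, foc_item, foc_rest))
```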
APA, Harvard, Vancouver, ISO, and other styles
40

Garrett, Phyllis Lorena. "A Monte Carlo Study Investigating Missing Data, Differential Item Functioning, and Effect Size." Digital Archive @ GSU, 2009. http://digitalarchive.gsu.edu/eps_diss/35.

Full text
Abstract:
The use of polytomous items in assessments has increased over the years, and as a result, the validity of these assessments has been a concern. Differential item functioning (DIF) and missing data are two factors that may adversely affect assessment validity. Both factors have been studied separately, but DIF and missing data are likely to occur simultaneously in real assessment situations. This study investigated the Type I error and power of several DIF detection methods and methods of handling missing data for polytomous items generated under the partial credit model. The Type I error and power of the Mantel and ordinal logistic regression were compared using within-person mean substitution and multiple imputation when data were missing completely at random. In addition to assessing the Type I error and power of DIF detection methods and methods of handling missing data, this study also assessed the impact of missing data on the effect size measure associated with the Mantel, the standardized mean difference effect size measure, and ordinal logistic regression, the R-squared effect size measure. Results indicated that the performance of the Mantel and ordinal logistic regression depended on the percent of missing data in the data set, the magnitude of DIF, and the sample size ratio. The Type I error for both DIF detection methods varied based on the missing data method used to impute the missing data. Power to detect DIF increased as DIF magnitude increased, but there was a relative decrease in power as the percent of missing data increased. Additional findings indicated that the percent of missing data, DIF magnitude, and sample size ratio also influenced the effect size measures associated with the Mantel and ordinal logistic regression. The effect size values for both DIF detection methods generally increased as DIF magnitude increased, but as the percent of missing data increased, the effect size values decreased.
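For readers unfamiliar with the missing-data treatments compared above, the sketch below illustrates within-person mean substitution for item scores missing completely at random. It is a schematic illustration under an assumed 0-3 scoring rule, not the study's code; the rounding rule and array names are hypothetical.

```python
"""Minimal sketch: within-person mean substitution for MCAR polytomous item scores."""
import numpy as np

rng = np.random.default_rng(2)
scores = rng.integers(0, 4, size=(10, 8)).astype(float)   # 10 examinees, 8 items scored 0-3
mask = rng.uniform(size=scores.shape) < 0.15               # ~15% MCAR missingness
scores[mask] = np.nan

def within_person_mean_substitution(x, max_score=3):
    """Replace each missing entry with the examinee's own mean, rounded to a legal category."""
    filled = x.copy()
    person_means = np.nanmean(filled, axis=1, keepdims=True)
    idx = np.where(np.isnan(filled))
    filled[idx] = np.rint(np.broadcast_to(person_means, filled.shape)[idx])
    return np.clip(filled, 0, max_score)

print(within_person_mean_substitution(scores))
```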
APA, Harvard, Vancouver, ISO, and other styles
41

Wood, Scott William. "Differential item functioning procedures for polytomous items when examinee sample sizes are small." Diss., University of Iowa, 2011. https://ir.uiowa.edu/etd/1110.

Full text
Abstract:
As part of test score validity, differential item functioning (DIF) is a quantitative characteristic used to evaluate potential item bias. In applications where a small number of examinees take a test, statistical power of DIF detection methods may be affected. Researchers have proposed modifications to DIF detection methods to account for small focal group examinee sizes for the case when items are dichotomously scored. These methods, however, have not been applied to polytomously scored items. Simulated polytomous item response strings were used to study the Type I error rates and statistical power of three popular DIF detection methods (Mantel test/Cox's β, Liu-Agresti statistic, HW3) and three modifications proposed for contingency tables (empirical Bayesian, randomization, log-linear smoothing). The simulation considered two small sample size conditions, the case with 40 reference group and 40 focal group examinees and the case with 400 reference group and 40 focal group examinees. In order to compare statistical power rates, it was necessary to calculate the Type I error rates for the DIF detection methods and their modifications. Under most simulation conditions, the unmodified, randomization-based, and log-linear smoothing-based Mantel and Liu-Agresti tests yielded Type I error rates around 5%. The HW3 statistic was found to yield higher Type I error rates than expected for the 40 reference group examinees case, rendering power calculations for these cases meaningless. Results from the simulation suggested that the unmodified Mantel and Liu-Agresti tests yielded the highest statistical power rates for the pervasive-constant and pervasive-convergent patterns of DIF, as compared to other DIF method alternatives. Power rates improved by several percentage points if log-linear smoothing methods were applied to the contingency tables prior to using the Mantel or Liu-Agresti tests. Power rates did not improve if Bayesian methods or randomization tests were applied to the contingency tables prior to using the Mantel or Liu-Agresti tests. ANOVA tests showed that statistical power was higher when 400 reference examinees were used versus 40 reference examinees, when impact was present among examinees versus when impact was not present, and when the studied item was excluded from the anchor test versus when the studied item was included in the anchor test. Statistical power rates were generally too low to merit practical use of these methods in isolation, at least under the conditions of this study.
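One of the small-sample modifications examined above is a randomization test applied to the matched contingency tables. The sketch below permutes group labels within score strata and recomputes a Mantel-type statistic to obtain a permutation p-value; it is an illustrative reading of that idea rather than the study's implementation, and the statistic, simulated data, and names are all hypothetical.

```python
"""Minimal sketch: randomization (permutation) test for a stratified DIF statistic."""
import numpy as np

def randomization_pvalue(stat_fn, item, rest, group, n_perm=2000, seed=3):
    """Permutation p-value for a stratified DIF statistic (larger statistic = more DIF).

    stat_fn(item, rest, group) must return a scalar; `group` is 0=reference, 1=focal.
    """
    rng = np.random.default_rng(seed)
    observed = stat_fn(item, rest, group)
    count = 0
    for _ in range(n_perm):
        perm = group.copy()
        for k in np.unique(rest):
            idx = np.where(rest == k)[0]
            perm[idx] = rng.permutation(perm[idx])   # shuffle labels within the stratum
        if stat_fn(item, rest, perm) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

def mantel_stat(item, rest, group):
    """Mantel-type chi-square for an ordered item, used here as the test statistic."""
    t = e = v = 0.0
    for k in np.unique(rest):
        sel = rest == k
        y, g = item[sel], group[sel]
        n, n_f = len(y), g.sum()
        if n_f == 0 or n_f == n or n < 2:
            continue
        s1, s2 = y.sum(), (y ** 2).sum()
        t += y[g == 1].sum()
        e += n_f * s1 / n
        v += (n - n_f) * n_f * (n * s2 - s1 ** 2) / (n ** 2 * (n - 1))
    return (t - e) ** 2 / v if v > 0 else 0.0

# tiny example: 40 reference and 40 focal examinees on a 0-2 scored item
rng = np.random.default_rng(4)
group = np.repeat([0, 1], 40)
rest = rng.integers(0, 6, 80)
item = np.clip(rest // 2 + rng.integers(-1, 2, 80) - group, 0, 2)  # focal slightly lower
print(randomization_pvalue(mantel_stat, item, rest, group))
```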
APA, Harvard, Vancouver, ISO, and other styles
42

Liu, Ruixue. "DIFFERENTIAL ITEM FUNCTIONING AMONG ENGLISH LANGUAGE LEARNERS ON A LARGE-SCALE MATHEMATICS ASSESSMENT." UKnowledge, 2019. https://uknowledge.uky.edu/edsc_etds/50.

Full text
Abstract:
English language learner (ELL) is a term used to describe students who are still acquiring English proficiency. In recent decades, ELLs have been a rapidly growing student group in the United States. In school classrooms, ELLs are learning English and their academic subjects simultaneously, and it is challenging for them to hear lectures, read textbooks, and complete tests in English because of their inadequate English language proficiency (Ilich, 2013). As a result, the increasing number of ELLs in public schools has been paralleled by persistently low mathematics performance among ELLs (NCES, 2016). Given the popularization of international large-scale assessments in the recent decade, it is necessary to analyze their psychometric properties (e.g., reliability, validity) so that the results can provide evidence-based implications for policymakers. Educational researchers need to assess validity for subgroups within each country. The Programme for International Student Assessment (PISA), as one of the influential large-scale assessments, allows researchers to investigate academic achievement and group membership from a variety of different viewpoints. The current study aimed to understand the nature and potential sources of the gaps in mathematics achievement between ELLs and non-ELLs. The nature of the achievement gap was examined at the item level, rather than the total test level, using three DIF methodologies: the Mantel-Haenszel procedure, Rasch analysis, and the Hierarchical Generalized Linear Model (HGLM). Among the three methods, HGLM was used to examine the potential sources of DIF. This method can take into account the nested structure of the data, where items are nested within students and students within schools. At the student level, sources of DIF were investigated through students' variation in mathematics self-efficacy, language proficiency, and socioeconomic status. At the school level, school type and school educational resources were investigated as potential sources of DIF after controlling for the student variables. The U.S. sample from PISA 2012 was used, and 76 dichotomously coded items from the PISA 2012 mathematics assessment were included to detect DIF effects. Results revealed that ten common items were identified as showing DIF effects by the MH procedure, Rasch analysis, and HGLM; these ten items all favored non-ELLs. The decreasing number of items showing DIF effects in HGLM after controlling for student-level variables revealed that mathematics self-efficacy, language proficiency, and SES are potential sources of DIF between ELLs and non-ELLs. In addition, the number of DIF items continued to decrease after controlling for both student- and school-level variables, indicating that school type and school educational resources were also potential sources of DIF between ELLs and non-ELLs. Findings from this study can help educational researchers, administrators, and policymakers understand the nature of the gap at the item level instead of the total test level so that the United States can be competitive in middle school mathematics education. This study can also help guide item writers and test developers in the construction of more linguistically accessible assessments for students who are still learning English. The significance of this study lies in the empirical investigation of the gap between ELLs and non-ELLs in mathematics achievement at the item level and from the perspectives of both students and schools.
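The HGLM analysis described above explains DIF by adding student- and school-level covariates. The single-level sketch below conveys the same logic with an ordinary logistic regression: if the ELL coefficient shrinks once covariates such as language proficiency and SES enter the model, those covariates are candidate sources of the DIF. It is a simplified stand-in for the multilevel model, run on hypothetical simulated data with hypothetical column names.

```python
"""Minimal sketch: covariates as candidate sources of a group (DIF) effect."""
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 2000
ell = rng.integers(0, 2, n)                          # 1 = English language learner
lang = rng.normal(-0.8 * ell, 1.0, n)                # ELLs lower language proficiency
ses = rng.normal(-0.4 * ell, 1.0, n)
theta = rng.normal(0, 1, n)
rest = np.clip(np.round(12 + 4 * theta), 0, 24)      # matching (rest) score
# item success depends on ability and, undesirably, on language proficiency
p = 1 / (1 + np.exp(-(theta + 0.7 * lang - 0.2)))
y = (rng.uniform(size=n) < p).astype(int)
df = pd.DataFrame(dict(y=y, rest=rest, ell=ell, lang=lang, ses=ses))

base = smf.logit("y ~ rest + ell", data=df).fit(disp=0)
full = smf.logit("y ~ rest + ell + lang + ses", data=df).fit(disp=0)
print("ELL effect before covariates:", round(base.params["ell"], 3))
print("ELL effect after covariates: ", round(full.params["ell"], 3))
```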
APA, Harvard, Vancouver, ISO, and other styles
43

Raiford-Ross, Terris. "The Impact of Multidimensionality on the Detection of Differential Bundle Functioning Using SIBTEST." Digital Archive @ GSU, 2008. http://digitalarchive.gsu.edu/eps_diss/14.

Full text
Abstract:
In response to public concern over fairness in testing, conducting a differential item functioning (DIF) analysis is now standard practice for many large-scale testing programs (e.g., Scholastic Aptitude Test, intelligence tests, licensing exams). As highlighted by the Standards for Educational and Psychological Testing manual, the legal and ethical need to avoid bias when measuring examinee abilities is essential to fair testing practices (AERA-APA-NCME, 1999). Likewise, the development of statistical and substantive methods of investigating DIF is crucial to the goal of designing fair and valid educational and psychological tests. Douglas, Roussos and Stout (1996) introduced the concept of item bundle DIF and the implications of differential bundle functioning (DBF) for identifying the underlying causes of DIF. Since then, several studies have demonstrated DIF/DBF analyses within the framework of “unintended” multidimensionality (Oshima & Miller, 1992; Russell, 2005). Russell (2005), in particular, examined the effect of secondary traits on DBF/DTF detection. Like Russell's study, the present study created item bundles by including multidimensional items on a simulated test designed in theory to be unidimensional. Simulating reference group members with a higher mean ability than the focal group on the nuisance secondary dimension resulted in DIF for each of the multidimensional items, which, when examined together, produced differential bundle functioning. The purpose of this Monte Carlo simulation study was to assess the Type I error and power performance of SIBTEST (Simultaneous Item Bias Test; Shealy & Stout, 1993a) for DBF analysis under various conditions with simulated data. The variables of interest included sample size and ratios of reference to focal group sample sizes, correlation between primary and secondary dimensions, magnitude of DIF/DBF, and angular item direction. Results showed SIBTEST to be quite powerful in detecting DBF and controlling Type I error for almost all of the simulated conditions. Specifically, power rates were .80 or above for 84% of all conditions and the average Type I error rate was approximately .05. Furthermore, the combined effect of the studied variables on SIBTEST power and Type I error rates provided much needed information to guide further use of SIBTEST for identifying potential sources of differential item/bundle functioning.
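The DBF mechanism described above rests on a two-dimensional compensatory IRT model in which the groups differ only on the nuisance dimension. The sketch below generates such data and shows the resulting bundle-score gap among examinees matched on total score; the parameter values, item counts, and angular split are illustrative assumptions, not those of the study, and it is a data-generation sketch rather than a SIBTEST implementation.

```python
"""Minimal sketch: inducing DBF via a nuisance second dimension."""
import numpy as np

rng = np.random.default_rng(6)
n_items, n_per_group = 30, 1000
angles = np.full(n_items, np.deg2rad(10.0))   # mostly primary-dimension items
angles[-5:] = np.deg2rad(60.0)                # 5-item bundle loading on the nuisance trait
a = 1.2                                       # common multidimensional discrimination
a1, a2 = a * np.cos(angles), a * np.sin(angles)
d = rng.normal(0.0, 0.8, n_items)             # easiness intercepts

def responses(mu_nuisance, rho=0.4):
    """Simulate one group; theta1 is the intended trait, theta2 the nuisance trait."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    theta = rng.multivariate_normal([0.0, mu_nuisance], cov, size=n_per_group)
    logits = theta[:, [0]] * a1 + theta[:, [1]] * a2 + d
    return (rng.uniform(size=logits.shape) < 1 / (1 + np.exp(-logits))).astype(int)

ref = responses(mu_nuisance=0.5)   # reference group advantaged on the nuisance trait
foc = responses(mu_nuisance=0.0)

# crude descriptive check: bundle-score gap conditional on matched total score
total_r, total_f = ref.sum(1), foc.sum(1)
for k in range(12, 19):
    gap = ref[total_r == k, -5:].sum(1).mean() - foc[total_f == k, -5:].sum(1).mean()
    print(f"total score {k}: bundle-score gap {gap:+.2f}")
```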
APA, Harvard, Vancouver, ISO, and other styles
44

Langer, Michelle M. Thissen David. "A reexamination of Lord's Wald test for differential item functioning using item response theory and modern error estimation." Chapel Hill, N.C. : University of North Carolina at Chapel Hill, 2008. http://dc.lib.unc.edu/u?/etd,2084.

Full text
Abstract:
Thesis (Ph. D.)--University of North Carolina at Chapel Hill, 2008. Title from electronic title page (viewed Feb. 17, 2009). "... in partial fulfillment of the requirements for the degree of Doctor in Philosophy in the Department of Psychology Quantitative." Discipline: Psychology; Department/School: Psychology.
APA, Harvard, Vancouver, ISO, and other styles
45

Bilir, Mustafa Kuzey. "Mixture item response theory-Mimic model simultaneous estimation of differential item functioning for manifest groups and latent classes /." Tallahassee, Florida : Florida State University, 2009. http://etd.lib.fsu.edu/theses/available/etd-08212009-172739/.

Full text
Abstract:
Thesis (Ph. D.)--Florida State University, 2009. Advisor: Akihito Kamata, Florida State University, College of Education, Dept. of Educational Psychology and Learning Systems. Title and description from dissertation home page (viewed on April 29, 2010). Document formatted into pages; contains xvii, 207 pages. Includes bibliographical references.
APA, Harvard, Vancouver, ISO, and other styles
46

Gattamorta, Karina Alvarez. "A Comparison of Adjacent Categories and Cumulative DSF Effect Estimators." Scholarly Repository, 2009. http://scholarlyrepository.miami.edu/oa_dissertations/343.

Full text
Abstract:
The study of measurement invariance in polytomous items that targets individual score levels is known as differential step functioning (DSF; Penfield, 2007, 2008). DSF methods provide specific information describing the manifestation of the invariance effect within particular score levels and therefore serve a diagnostic role in identifying the individual score levels involved in the item's invariance effect. The analysis of DSF requires the creation of a set of dichotomizations of the item response variable. There are two primary approaches for creating the set of dichotomizations used to conduct a DSF analysis. The first approach, known as the adjacent categories approach, is consistent with the dichotomization scheme underlying the generalized partial credit model (GPCM; Muraki, 1992) and considers each pair of adjacent score levels while treating the other score levels as missing. The second approach, known as the cumulative approach, is consistent with the dichotomization scheme underlying the graded response model (GRM; Samejima, 1997) and includes data from every score level in each dichotomization. To date, there is limited research on how the cumulative and adjacent categories approaches compare within the context of DSF, particularly as applied to a real data set. The understanding of how the interpretation and practical outcomes may vary between these two approaches is also limited. The current study addressed these two issues. This study evaluated the results of a DSF analysis using both the adjacent categories and cumulative dichotomization schemes in order to determine whether the two approaches yield similar results and interpretations of DSF. These approaches were applied to data from a polytomously scored alternate assessment administered to children with significant cognitive disabilities. The results of the DSF analyses revealed that the two approaches generally led to consistent results, particularly in the case where DSF effects were negligible. For steps where significant DSF was present, the two approaches generally guided analysts to the same location within the item. However, several aspects of the results raised questions about the use of the adjacent categories dichotomization scheme. First, there seemed to be a lack of independence in the adjacent categories method, since large DSF effects at one step were often paired with large DSF effects in the opposite direction at the previous step. Additionally, when a substantial DSF effect existed, it was more likely to be flagged as significant under the cumulative approach than under the adjacent categories approach. This is likely due to the smaller standard errors, and hence greater stability, of the cumulative approach. In sum, the results indicate that the cumulative approach is preferable to the adjacent categories approach when conducting a DSF analysis.
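The two dichotomization schemes contrasted above can be made concrete with a few lines of code. The sketch below builds the adjacent-categories (GPCM-style) and cumulative (GRM-style) dichotomizations of a 0-3 scored item; the toy score vector and function names are hypothetical.

```python
"""Minimal sketch: the two dichotomization schemes underlying a DSF analysis."""
import numpy as np

def adjacent_dichotomization(scores, step):
    """1 if score == step, 0 if score == step-1, NaN (treated as missing) otherwise."""
    out = np.full(scores.shape, np.nan)
    out[scores == step] = 1.0
    out[scores == step - 1] = 0.0
    return out

def cumulative_dichotomization(scores, step):
    """1 if score >= step, else 0; every examinee contributes to every step."""
    return (scores >= step).astype(float)

scores = np.array([0, 1, 1, 2, 3, 3, 2, 0, 1, 2])
for j in (1, 2, 3):
    print(f"step {j}  adjacent  :", adjacent_dichotomization(scores, j))
    print(f"step {j}  cumulative:", cumulative_dichotomization(scores, j))
```

The adjacent scheme discards examinees outside the two focal categories at each step, which is one plausible reason for the larger standard errors noted in the abstract.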
APA, Harvard, Vancouver, ISO, and other styles
47

Anjorin, Idayatou. "HIGH-STAKES TESTS FOR STUDENTS WITH SPECIFIC LEARNING DISABILITIES: DISABILITY-BASED DIFFERENTIAL ITEM FUNCTIONING." Available to subscribers only, 2009. http://proquest.umi.com/pqdweb?did=1967913321&sid=3&Fmt=2&clientId=1509&RQT=309&VName=PQD.

Full text
Abstract:
Thesis (Ph. D.)--Southern Illinois University Carbondale, 2009. "Department of Educational Psychology and Special Education." Includes bibliographical references (p. 110-126). Also available online.
APA, Harvard, Vancouver, ISO, and other styles
48

Wilson, Ann Wells 1962. "Logistic regression and its use in detecting nonuniform differential item functioning in polytomous items." Diss., The University of Arizona, 1993. http://hdl.handle.net/10150/284324.

Full text
Abstract:
A computer simulation study was conducted to determine the feasibility of using logistic regression procedures to detect nonuniform differential item functioning (DIF) in polytomous items. A simulated test of 25 items was generated, of which the 25th item contained nonuniform DIF. The degree of nonuniform DIF in the 25th item was varied in four ways. Item scores were generated using Muraki's generalized partial credit model and the data were artificially dichotomized in three different ways for the logistic regression procedure. The results indicate that logistic regression is a viable procedure in the detection of most forms of nonuniform DIF; however, it was not sensitive to DIF that is uniform within score categories and nonuniform across score categories. Logistic regression procedures were also quite awkward in the polytomous case, because several regressions must be run per polytomous item and it was difficult to determine an omnibus result in most cases. Some logistic regression procedures, however, may be useful in the post hoc analysis of DIF in polytomous items.
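The logistic regression approach referred to above tests nonuniform DIF through a group-by-matching-score interaction. The sketch below fits the usual three-model hierarchy to one dichotomized response and compares likelihoods; it is an illustrative sketch, not the study's code, and the simulated data and variable names are hypothetical.

```python
"""Minimal sketch: logistic-regression tests for uniform and nonuniform DIF."""
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

rng = np.random.default_rng(7)
n = 1500
group = rng.integers(0, 2, n)                       # 0 = reference, 1 = focal
theta = rng.normal(0, 1, n)
match = np.clip(np.round(12 + 4 * theta), 0, 24)    # matching score on the remaining items
# nonuniform DIF: the studied item discriminates less for the focal group
logit = (1.2 - 0.6 * group) * theta - 0.1
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)
df = pd.DataFrame(dict(y=y, match=match, group=group))

m1 = smf.logit("y ~ match", data=df).fit(disp=0)                        # no DIF
m2 = smf.logit("y ~ match + group", data=df).fit(disp=0)                # uniform DIF
m3 = smf.logit("y ~ match + group + match:group", data=df).fit(disp=0)  # nonuniform DIF

lr_uniform = 2 * (m2.llf - m1.llf)
lr_nonuniform = 2 * (m3.llf - m2.llf)
print("uniform DIF    LR =", round(lr_uniform, 2), " p =", round(chi2.sf(lr_uniform, 1), 4))
print("nonuniform DIF LR =", round(lr_nonuniform, 2), " p =", round(chi2.sf(lr_nonuniform, 1), 4))
```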
APA, Harvard, Vancouver, ISO, and other styles
49

Driana, Elin. "GENDER DIFFERENTIAL ITEM FUNCTIONING ON A NINTH-GRADE MATHEMATICS PROFICIENCY TEST IN APPALACHIAN OHIO." Ohio University / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1181693190.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Clark, Patrick Carl Jr. "An Examination of Type I Errors and Power for Two Differential Item Functioning Indices." Wright State University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=wright1284475420.

Full text
APA, Harvard, Vancouver, ISO, and other styles