Journal articles on the topic 'Educational testing and measurements'

Consult the top 50 journal articles for your research on the topic 'Educational testing and measurements.'

1

Glas, Cees A. W. "Item response theory in educational assessment and evaluation." Mesure et évaluation en éducation 31, no. 2 (May 13, 2014): 19–34. http://dx.doi.org/10.7202/1025005ar.

Abstract:
Item response theory provides a useful and theoretically well-founded framework for educational measurement. It supports such activities as the construction of measurement instruments, linking and equating measurements, and evaluation of test bias and differential item functioning. It further provides underpinnings for item banking and flexible test administration designs, such as multiple matrix sampling, flexi-level testing, and computerized adaptive testing. First, a concise introduction to the principles of IRT models is given. The models discussed pertain to dichotomous items (items that are scored as either correct or incorrect) and polytomous items (items with partial credit scoring, such as most types of open-ended questions and performance assessments). Second, it is shown how an IRT measurement model can be enhanced with a structural model, such as, for instance, an analysis of variance model, to relate data from achievement and ability tests to students’ background variables, such as socio-economic status, intelligence or cultural capital, to school variables, and to features of the schooling system. Two applications are presented. The first one pertains to equating and linking of assessments, and the second one to a combination of an IRT measurement model and a multilevel linear model useful in school effectiveness research.
2

Howie, Sarah J., Pekka Kupari, Martin Goy, Heike Wendt, and Wilfried Bos. "Rasch measurement in educational contexts Special issue 1: Rasch modeling in educational testing." Educational Research and Evaluation 17, no. 5 (October 2011): 301–6. http://dx.doi.org/10.1080/13803611.2011.634578.

3

Anderson, Beverly L. "State Testing and the Educational Measurement Community: Friends or Foes?" Educational Measurement: Issues and Practice 4, no. 2 (June 1985): 22–26. http://dx.doi.org/10.1111/j.1745-3992.1985.tb00876.x.

4

Camara, Wayne J., and Dianne C. Brown. "Educational and Employment Testing: Changing Concepts in Measurement and Policy." Educational Measurement: Issues and Practice 14, no. 1 (October 25, 2005): 5–11. http://dx.doi.org/10.1111/j.1745-3992.1995.tb00845.x.

5

Kim, Eun Sook, Myeongsun Yoon, and Taehun Lee. "Testing Measurement Invariance Using MIMIC." Educational and Psychological Measurement 72, no. 3 (December 6, 2011): 469–92. http://dx.doi.org/10.1177/0013164411427395.

Abstract:
Multiple-indicators multiple-causes (MIMIC) modeling is often used to test a latent group mean difference while assuming the equivalence of factor loadings and intercepts over groups. However, this study demonstrated that MIMIC was insensitive to the presence of factor loading noninvariance, which implies that factor loading invariance should be tested through other measurement invariance testing techniques. MIMIC modeling is also used for measurement invariance testing by allowing a direct path from a grouping covariate to each observed variable. This simulation study with both continuous and categorical variables investigated the performance of MIMIC in detecting noninvariant variables under various study conditions and showed that the likelihood ratio test of MIMIC with Oort adjustment not only controlled Type I error rates below the nominal level but also maintained high power across study conditions.
6

Paap, Muirne C. S., Sebastian Born, and Johan Braeken. "Measurement Efficiency for Fixed-Precision Multidimensional Computerized Adaptive Tests: Comparing Health Measurement and Educational Testing Using Example Banks." Applied Psychological Measurement 43, no. 1 (April 23, 2018): 68–83. http://dx.doi.org/10.1177/0146621618765719.

Abstract:
It is currently not entirely clear to what degree the research on multidimensional computerized adaptive testing (CAT) conducted in the field of educational testing can be generalized to fields such as health assessment, where CAT design factors differ considerably from those typically used in educational testing. In this study, the impact of a number of important design factors on CAT performance is systematically evaluated, using realistic example item banks for two main scenarios: health assessment (polytomous items, small to medium item bank sizes, high discrimination parameters) and educational testing (dichotomous items, large item banks, small- to medium-sized discrimination parameters). Measurement efficiency is evaluated for both between-item multidimensional CATs and separate unidimensional CATs for each latent dimension. In this study, we focus on fixed-precision (variable-length) CATs because it is both feasible and desirable in health settings, but so far most research regarding CAT has focused on fixed-length testing. This study shows that the benefits associated with fixed-precision multidimensional CAT hold under a wide variety of circumstances.
7

Bichi, Ado Abdu, and Rohaya Talib. "Item Response Theory: An Introduction to Latent Trait Models to Test and Item Development." International Journal of Evaluation and Research in Education (IJERE) 7, no. 2 (June 1, 2018): 142. http://dx.doi.org/10.11591/ijere.v7i2.12900.

Abstract:
Testing in educational system perform a number of functions, the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that, testing is an important element of education. To effectively utilize the tests in educational policies and quality assurance its validity and reliability estimates are necessary. There are two generally acceptable frameworks used in evaluating the quality of test in educational and psychological measurements, these are; Classical Test Theory (CTT) and Item Response Theory (IRT). The estimates of test items validity and reliability depend on a particular measurement model used. It is vital for a test developer to be familiar with the different test development and item analysis methods in order to facilitate the development of a new test. The CTT is a traditional approach which was widely criticise in the measurement community for its shortcomings such as sample dependency of coefficient measures and estimates of measurement error. However, the IRT is a modern approach which provides solutions to most of the CTT’s identified shortcomings. This paper therefore, provides a comprehensive overview of the IRT and its procedures as applied to test item development and analysis. The paper concludes with some suggestions for test developers and test specialists at all levels to adopt IRT for its identified crucial theoretical and empirical gains over CTT. IRT based parameter estimates should be superior and reliable than CTT based parameter estimates. With these features, IRT can help resolve the problems associated with test design based on CTT.
8

Steinmetz, Holger, Peter Schmidt, Andrea Tina-Booh, Siegrid Wieczorek, and Shalom H. Schwartz. "Testing measurement invariance using multigroup CFA: differences between educational groups in human values measurement." Quality & Quantity 43, no. 4 (January 5, 2008): 599–616. http://dx.doi.org/10.1007/s11135-007-9143-x.

9

Huseynova, A. A., and K. D. Vashchinnikova. "IMPACT ASSESSMENT ON LEARNING MOTIVATION IN THE FRAMEWORK OF THE COMPETENCE APPROACH IN EDUCATION." Educational bulletin “Consciousness” 22, no. 9 (September 22, 2020): 28–34. http://dx.doi.org/10.26787/nydha-2686-6846-2020-22-9-28-34.

Abstract:
Turning to the new educational paradigm, the research paper considers the conditions for ensuring the effectiveness of authentic assessment of students' achievements within the competence approach and the transition from traditional knowledge control to tests developed on the basis of the theory of pedagogical measurements. Special attention is paid to independent assessment as a tool for stimulating learning activities, as well as to the justification of the pattern design method used in the development of measurement tools. The experimental basis of the study is based on the results of an independent assessment of educational achievements of students of the senior level of secondary vocational education in social studies in several educational organizations. As a result of the survey of participants in independent testing, the formation of a stable positive learning motivation is noted. The relationship with the assessment of the impact on educational motivation is confirmed by the respondents' attitude to the authentic assessment procedure on the part of participants in the assessment process: schoolchildren, teachers, and parents. As a result, it was revealed that all subjects of the educational process evaluate the impact of the proposed method of assessment on educational motivation from a positive side.
10

Coutinho, Martha J. "Book Review: Educational testing and measurement: Classroom application and practice (5th ed.)." Journal of Psychoeducational Assessment 17, no. 3 (September 1999): 269–74. http://dx.doi.org/10.1177/073428299901700306.

11

Cizek, Gregory J. "Some Thoughts on Educational Testing: Measurement Policy Issues Into the Next Millennium." Educational Measurement: Issues and Practice 12, no. 3 (October 25, 2005): 10–16. http://dx.doi.org/10.1111/j.1745-3992.1993.tb00537.x.

12

Trávníček, Marek, Petr Vlček, Jaroslav Vrbas, and Jiří Nykodým. "Pilotní ověření testové baterie pohybových dovedností MOBAK jako součást kurikula sportovních her ve školní tělesné výchově." Studia sportiva 10, no. 2 (December 12, 2016): 164–76. http://dx.doi.org/10.5817/sts2016-2-18.

Abstract:
Current elementary school education is based on the Framework education programmes for elementary education. Each educational area contains the characteristics of the educational area, the objectives of the educational area and its educational content. To evaluate the competencies and achieved goals also in physical education we need to create assessment tools which are applicable in the physical education classes and besides others regard also the sport games educational content of the elementary school curricula. The goal of the presented paper is to introduce an international network based on the MOBAK testing battery research. The battery presents one of the options how to test physical skills in elementary school physical education. In the paper we present some results of the pilot examination of the chosen elementary schools in the South-Moravian district of the Czech Republic. Based on the preliminary measurements (n = 357) we can conclude that our results are similar to the other network countries. We were also able to observe positive attitude toward using the MOBAK 1 testing battery by the participating teachers. As we find important to verify the battery on larger number of respondents we are working on the intensifying the international cooperation.
13

Curren, Randall R. "Educational measurement and knowledge of other minds." Theory and Research in Education 2, no. 3 (November 2004): 235–53. http://dx.doi.org/10.1177/1477878504046517.

Abstract:
This article addresses the capacity of high stakes tests to measure the most significant kinds of learning. It begins by examining a set of philosophical arguments pertaining to construct validity and alleged conceptual obstacles to attributing specific knowledge and skills to learners. The arguments invoke philosophical doctrines of holism and radical interpretation and the theory of situated learning, and they are found to be unsound. The article goes on to examine the difficulties involved in combining adequate validity and reliability in one test. The literature on test item formats is brought to bear on the potential validity of multiple-choice items, and the rater reliability of constructed-response items is addressed through discussion of the methods used by the Educational Testing Service (USA) and a summary report of alternative methods developed by the author and others in cooperation with the California Golden State Examination.
14

Mason, Emanuel J. "Measurement Issues in High Stakes Testing." Journal of Applied School Psychology 23, no. 2 (July 24, 2007): 27–46. http://dx.doi.org/10.1300/j370v23n02_03.

15

Hamre, Bjørn, and Christian Ydesen. "The Ascent of Educational Psychology in Denmark in the Interwar Years." Nordic Journal of Educational History 1, no. 2 (November 24, 2014): 87–111. http://dx.doi.org/10.36368/njedh.v1i2.40.

Abstract:
In this article, we argue that an understanding of the interwar years and the ascent of educational psychology contribute valuable knowledge about the inner workings of modern-day education with regard to the links between society and education in general and the boundary between normality and deviation in particular. The establishment of the educational psychologist’s office at Frederiksberg in Denmark, the introduction of IQ testing, and the related psychological files of students provide an image of a period of measurement in schools during which IQ testing was decisive in decisions to transfer students to the remedial school. The testing and filing were the foremost important technologies of the period. We draw on sources that allow us to view educational psychology and testing in their local, national, and political context. The sources applied are primarily obtained from Frederiksberg City Archive that contains archives from the Educational Psychology Office.
16

Norcini, John J., and David B. Swanson. "Factors influencing testing time requirements for measurements using written simulations." Teaching and Learning in Medicine 1, no. 2 (January 1989): 85–91. http://dx.doi.org/10.1080/10401338909539387.

17

Frey, Andreas, and Nicki-Nils Seitz. "Multidimensional adaptive testing in educational and psychological measurement: Current state and future challenges." Studies in Educational Evaluation 35, no. 2-3 (June 2009): 89–94. http://dx.doi.org/10.1016/j.stueduc.2009.10.007.

18

Balluerka, Nekane, Ian Plewis, Arantxa Gorostiaga, and José-Luis Padilla. "Examining Sources of DIF in Psychological and Educational Assessment Using Multilevel Logistic Regression." Methodology 10, no. 2 (September 1, 2014): 71–79. http://dx.doi.org/10.1027/1614-2241/a000076.

Abstract:
In the last three decades, important progress has been made toward more efficient statistical techniques for detecting Differential Item Functioning (DIF). However, the findings are scant when it comes to explaining DIF. Multilevel regression models can expand the knowledge of DIF causes, specifying a DIF parameter that varies randomly over items and testing hypotheses on sources of DIF shared by item bundles. The present study uses multilevel logistic regression to identify the item characteristics that could explain the presence of DIF in short tests or questionnaires, which are usually used in psychological and educational assessment. The usefulness of the approach is tested on measurements of the attitudes toward science of Spanish and English pupils obtained from the OECD Programme for International Student Assessment database.
19

Laurier, Michel. "Can computerised testing be authentic?" ReCALL 12, no. 1 (May 2000): 93–104. http://dx.doi.org/10.1017/s0958344000001014.

Abstract:
The concept of authenticity first appeared with the development of the communicative approach. More recently, in the field of educational measurement, authentic assessment methods have been proposed. Although adaptive testing seems to be the most important application of computers in language assessment, these tests are usually not authentic. Since many real world tasks are accomplished with computers, these may be used for authentic direct testing. Computers may be also used in semi-direct testing as a way to enhance the context. Finally in authentic assessment, computers may be used as a tool to process the data when the learners use them to organise their portfolio. Using the computer, test developers can also create better authentic tests.
20

Zaini, S. S., N. Rossli, T. A. Majid, S. N. C. Deraman, and N. A. Razak. "Wind Directional Effect on a Single Storey House Using Educational Wind Tunnel." MATEC Web of Conferences 206 (2018): 01006. http://dx.doi.org/10.1051/matecconf/201820601006.

Abstract:
Wind tunnel testing of single-storey isolated building with 1:100 scale down model was carried out in an open circuit wind tunnel without roughness elements facilities. The gable roof building model with 30° roof pitch was studied for wind directions of 0°, 30°, 45°, 60° and 90°. Pressure measurements were performed on all the walls and the roof (Zone 1, 2, 3, 4 and 5) of the building model with wind speed of 12 m/s. The results showed that the high suctions were generally induced by the 90° wind direction for Zone 1 and 60° and 90° wind directions for Zone 2. Mostly, high suction was also observed in case of 45° and 60° wind direction in Zone 3. In zone 4 and zone 5, high suction was generally induced by the 0° wind direction.
21

Hocevar, Dennis, Ali-Maher Khattab, and William B. Michael. "Significance Testing and Efficiency in Lisrel Measurement Models." Educational and Psychological Measurement 47, no. 1 (March 1987): 45–49. http://dx.doi.org/10.1177/0013164487471006.

22

Egberink, Iris J. L., Rob R. Meijer, and Jorge N. Tendeiro. "Investigating Measurement Invariance in Computer-Based Personality Testing." Educational and Psychological Measurement 75, no. 1 (February 3, 2014): 126–45. http://dx.doi.org/10.1177/0013164414520965.

23

Bursal, Murat. "A COMPARISON OF STANDARD AND RETROSPECTIVE PRE-POST TESTING FOR MEASURING THE CHANGES IN SCIENCE TEACHING EFFICACY BELIEFS." Journal of Baltic Science Education 14, no. 2 (April 25, 2015): 275–83. http://dx.doi.org/10.33225/jbse/15.14.275.

Abstract:
Thirty-nine American and 78 Turkish preservice elementary teachers’ personal science teaching efficacy (PSTE) beliefs were investigated during science methods courses with standard and retrospective pre-post testing methods. Significant differences in the PSTE gain scores, which indicate the changes in the mean PSTE scores from standard/retrospective pretests to the posttest, were found between the standard and retrospective measurements in both samples. Significant differences between the standard and retrospectively measured gain scores were detected among all subgroups under study, which were formed by participants’ PSTE levels and gender. It has been concluded that the differences between the standard and retrospectively measured PSTE gain scores are due to the difference in the nature of these measurement methods and can be seen in most research samples in educational studies around the world. The findings of this study suggest that the response-shift bias should be considered as a common threat to validity for research studies measuring self-efficacy beliefs with the standard pre-post testing method. Key words: personal science teaching efficacy, preservice elementary teacher, response-shift bias, retrospective pretest.
24

Лагодинський, Олександр Сергійович, Олексій Васильович Буяло, and Сергій Васильович Хамула. "APPLICATION OF ACHIEVEMENT TESTING SOFTWARE OF CADETS IN HIGHER MILITARY EDUCATIONAL INSTITUTIONS." Information Technologies and Learning Tools 81, no. 1 (February 23, 2021): 222–34. http://dx.doi.org/10.33407/itlt.v81i1.3675.

Abstract:
The article presents description, advantages, and use of the Achievement Testing Software (ATS) developed by the authors – the teaching staff of the Military Diplomatic Academy named after Yevheniy Bereznyak – for cadets’ achievement testing in the educational process of higher military educational institutions. The authors prove the necessity of the ATS introduction into the educational process of such institutions due to the inability of existing computer testing software to fully satisfy their needs (high costs of technical maintenance, closed exit codes, and constant reliance on the Internet connection making it impossible to provide sensitive information security). Unlike other systems, the ATS is a reliable instrument of military educational control, capable of operating off-line. It was developed based on fundamental works in test theory, measurement and evaluation by Ukrainian and foreign scholars. It can be widely used for different types of achievement testing in higher military educational institutions: classroom; entrance or summative; and in any type of military course. The ATS rational use allows saving learning time and teachers’ effort, simultaneous engagement of many cadets in the training process, as well as objective measurement and evaluation through its automation. The ATS also provides capabilities of control over the educational process which allow curricula correction due to the constant feedback from cadets. Basically, the ATS performs two interrelated functions: test development and editing (by teachers); and academic achievement measuring (by cadets through the developed test items). The system can be easily installed on personal computers with Windows XP Professional SP2 operational system. The article describes in greater detail the procedure of operating in two modules: the Teacher’s Module and the Testing Module. 
Here, the ATS provides a user-friendly menu that can be easily navigated by pressing on buttons and selecting necessary options according to the instructions. The test entrance is password-protected, the test is encrypted, and the test score can be quickly viewed by cadets and teachers immediately after its completion. The ATS efficiency was proved through an experiment involving cadets of the Military Diplomatic Academy named after Yevheniy Bereznyak by demonstrating improvement in their performance.
25

Morra, Sergio. "Issues in Working Memory Measurement: Testing for M Capacity." International Journal of Behavioral Development 17, no. 1 (March 1994): 143–59. http://dx.doi.org/10.1177/016502549401700109.

Abstract:
Two studies on measurement of M capacity are reported. Study 1, with 191 subjects aged 6-11, found factor-analytical and correlational evidence that five M capacity tests share a common source of variance, and that, with age, they increase at a similar rate. Study 2, with 124 subjects aged 6-10 years, replicated the previous findings. It is suggested that, in this age range, M capacity can be measured with a battery of tests.
26

Demir, Ergül. "Testing the Cultural Differences of School Characteristics with Measurement Invariance." Journal of Education and Learning 5, no. 2 (April 26, 2016): 337. http://dx.doi.org/10.5539/jel.v5n2p337.

Abstract:
In this study, it was aimed to model the school characteristics in multivariate structure, and according to this model, aimed to test the invariance of this model across five randomly selected countries and economies from PISA 2012 sample. It is thought that significant differences across group in the context of school characteristics have the potential to explain the effectiveness of schools and educational systems. This study was conducted with correlational model as a basic research. Secondary level analyses were conducted on PISA 2012 School Questionnaire data. To construct “school characteristics model”, whole data from 65 participant countries and economies were considered. One country from each proficiency level and totally 5 countries were randomly selected for the research sample. These countries and economies are Shanghai-China, Korea, Ireland, Turkey and Uruguay. In this way sample was composed of totally 835 schools. Multi-group confirmatory factor analysis was used to test the invariance of school characteristics across countries. According to the results, Shanghai and Uruguay differed from each other and other countries. Across Korea, Ireland and Turkey, School characteristics provide strong invariance. These three cultures were more similar. Main result of this study is that school characteristics cannot be invariant across some cultural groups or sub-groups. In order to provide equal opportunity to all stakeholders of the educational system, and also provide school effectiveness, such kinds of differences are considered carefully.
27

Geisinger, Kurt F. "Empirical Considerations on Intelligence Testing and Models of Intelligence: Updates for Educational Measurement Professionals." Applied Measurement in Education 32, no. 3 (June 17, 2019): 193–97. http://dx.doi.org/10.1080/08957347.2019.1619564.

28

Zueva, I. I., and M. E. Laishcheva. "TO THE QUESTION OF THE FEASIBILITY OF USING TEST TECHNOLOGIES TO CONTROL THE QUALITY OF EDUCATIONAL ACTIVITY OF STUDENTS." Construction and Geotechnics 10, no. 3 (December 15, 2019): 117–23. http://dx.doi.org/10.15593/2224-9826/2019.3.12.

Abstract:
The article discusses the feasibility of using computer test technologies to monitor student learning in a technical college. The main purpose of the federal state educational standards is the result of the educational process. The effectiveness of education is considered not as a set of knowledge and skills acquired by students from different academic disciplines, but as the ability of a learner to apply knowledge and skills in practical professional activities, as the formation by students of certain competencies. In assessing educational results, it is necessary to analyze the levels of education that have been achieved by students at a certain stage of study. Along with the traditional methods of assessing students learning of educational material for current and mid-term monitoring of academic achievement, it is advisable to use test technologies to form students' knowledge component of competences. The article discusses the advantages, place and limitations of test methods in the system of control and assessment of the formation of students' competencies. The use of test tasks for technical disciplines in order to assess the formation of basic concepts of the discipline among students, assimilation by students of educational material is very effective and useful. But to evaluate the student's ability to concretize his answer with examples, the ability to logically and convincingly express his thoughts, analyze the depth of knowledge and practical skills of solving the student's problems using test techniques are unlikely to work. Despite these shortcomings of testing as a method of pedagogical control, its positive aspects (efficiency, objectivity, systematic conducting, teaching testing function, computer implementation of testing) make this form of knowledge diagnostics a promising direction for the development of pedagogical measurements in the system of training qualified personnel of various levels in the construction industry.
29

Zhang, Tianjiao, Xiuzhi Guo, Li’an Hou, Haijian Zhao, Rong Ma, Liangyu Xia, Honglei Li, Tingting You, Ling Qiu, and Chuanbao Zhang. "Effects of calcium dobesilate (CaD) interference on serum creatinine measurements: a national External Quality Assessment (EQA)-based educational survey of drug-laboratory test interactions." Clinical Chemistry and Laboratory Medicine (CCLM) 59, no. 1 (January 26, 2021): 139–45. http://dx.doi.org/10.1515/cclm-2020-0424.

Abstract:
Objectives: Drug-laboratory test interactions (DLTIs) are one of the major sources of laboratory errors. Calcium dobesilate (CaD) interference on serum creatinine testing is a widespread problem that has long been ignored in China. A national EQA-based survey was launched to investigate the current status of CaD interference on creatinine routine methods used in China and enhance the education of CaD interference in clinical laboratories. Methods: A descriptive survey was developed to characterize the status quo of Chinese laboratory professionals’ cognition to CaD interference. Four of survey samples which were spiked with/without interference additive were shipped to 175 participant laboratories. The target reference values from a reference measurement procedure were compared against the results from participating laboratories to evaluate the CaD interference on serum creatinine measurements using enzymatic method or Jaffé method. Results: The lack of knowledge of DLTIs and the barriers to collect information from pharmacological and laboratory data systems had become the main problems on implementing DLTIs education in China. A significant negative influence of CaD on enzymatic method was observed regardless of measurement platforms. Jaffé method was generally free from interaction with CaD but showed poor precision and accuracy at low creatinine concentrations. Conclusions: More efforts should be made to enhance the education of DLTIs in clinical laboratories in China.
30

McLaughlin, J. Patrick, and Jason T. White. "Major Field Achievement Test In Business - Guidelines For Improved Outcome Scores - Part I." College Teaching Methods & Styles Journal (CTMS) 3, no. 2 (July 22, 2011): 11. http://dx.doi.org/10.19030/ctms.v3i2.5276.

Abstract:
Outcomes measurements have always been an important part of proving to outside constituencies how you measure up to other schools with your business programs. A common nationally-normed exam that is used is the Major Field Achievement Test in Business from Educational Testing Services. Our paper discusses some guidelines that we are pilot testing to see if we can improve not only our lowest score, marketing majors in the finance area, but all of our overall outcome scores in the eight (8) segmented areas covered in the exam. If we are going to use the MFAT, let us try to make sure that the input from our students is the best we can get so that our output scores are truly meaningful.
31

Reuterberg, Sven-Eric, and Jan-Eric Gustafsson. "Confirmatory Factor Analysis and Reliability: Testing Measurement Model Assumptions." Educational and Psychological Measurement 52, no. 4 (December 1992): 795–811. http://dx.doi.org/10.1177/0013164492052004001.

32

Rensvold, Roger B., and Gordon W. Cheung. "Testing Measurement Models for Factorial Invariance: A Systematic Approach." Educational and Psychological Measurement 58, no. 6 (December 1998): 1017–34. http://dx.doi.org/10.1177/0013164498058006010.

33

Andreou, Christos, Evridiki Papastavrou, Chryssoula Lemonidou, Kyriacos Mattheou, and Anastasios Merkouris. "Adaptation and Validation of the Learning Style Inventory Version 3.1 in Greek Language: A Methodological Study." Journal of Nursing Measurement 23, no. 2 (2015): 88E–111E. http://dx.doi.org/10.1891/1061-3749.23.2.88.

Abstract:
Background and Purpose: The nursing research on learning differences is currently expanding, suggesting the need for trustful measurements. This study aimed to adapt and cross-culturally validate the Learning Style Inventory. Methods: The first phase involved symmetrical translation and adaptation to the Greek target language. The second phase concerned the psychometric testing. Results: Internal reliability showed satisfactory alpha values. Kappa coefficients supported the test–retest reliability, and paired t test correlations justified the stability. Factor analysis yielded 2 constructs fitted with theory. The internal validity was also evidenced. The nursing students' learning style profile was discussed within their educational field and cultural background. Conclusions: The inventory presented content and construct equivalence to original scales. Certain implications were drawn for nursing supporting the utility of learning styles' measurements.
34

Ellis, Jason. "“Inequalities of Children in Original Endowment”: How Intelligence Testing Transformed Early Special Education in a North American City School System." History of Education Quarterly 53, no. 4 (November 2013): 401–29. http://dx.doi.org/10.1111/hoeq.12035.

Abstract:
“There are few if any more significant events in modern educational history than the developments which have recently taken place in methods of mental measurement,” Lewis Terman wrote in 1923 about the intelligence testing movement he did so much to pioneer in American schools throughout the 1920s. Indeed educational historians, particularly Paul Chapman, have shown that the rise of intelligence testing provoked large and relatively swift changes in public education, enabling school systems to sort and stream their students by ability on an unprecedented scale. “By 1930,” Chapman writes, “both intelligence testing and ability grouping had become central features of the educational system.” Less often talked about are the effects of intelligence testing and the concept of intelligence quotient (IQ) on early special education classes, and on the pupils who attended them. In fact, Terman recognized the significance of IQ testing to special education as well. In 1919, he wrote that IQ tests would help to turn the existing logic of learning problems on its head by proving that “the retardation problem is exactly the reverse of what it is popularly supposed to be.”
35

Granić, Andrina, Jelena Nakić, and Nikola Marangunić. "Scenario-based Group Usability Testing as a Mixed Methods Approach to the Evaluation of Three-Dimensional Virtual Learning Environments." Journal of Educational Computing Research 58, no. 3 (July 5, 2019): 616–39. http://dx.doi.org/10.1177/0735633119859918.

Abstract:
Although virtual reality became popular technology whose application is recognized in various domains, the field generally still lacks a widespread culture of usability. This is also evident when considering environments intended for learning, specifically virtual learning environments (VLEs). According to our findings, it is clear that there is a growing need for systematic evaluation approach to help with the design and development of usable learner-centered VLE solutions. After comprehensive introductory background and state of the art in the field, this article provides an insight into Scenario-based Group Usability Testing (ScerGUT), a mixed methods approach to the evaluation of three-dimensional VLEs which integrates several different methods of usability testing with measurements of educational value. While the majority of the existing work has made use of usually one single usability assessment technique, ScerGUT employs a number of methods putting in focus users and user testing. To examine efficiency and applicability of the approach, empirical validation is conducted as a case study of particular VLE. The contribution of the article is twofold: (a) ScerGUT as a mixed methods approach to the evaluation of VLEs, which brings new scientific value and could help other researchers and (b) ScerGUT’s application to a particular VLE, which brings quantitative and qualitative results, thus providing an insight into ease of use and educational value of specific VLE.
36

Diao, Qi, and Hao Ren. "Constructing Shadow Tests in Variable-Length Adaptive Testing." Applied Psychological Measurement 42, no. 7 (February 20, 2018): 538–52. http://dx.doi.org/10.1177/0146621617753736.

Abstract:
Imposing content constraints is very important in most operational computerized adaptive testing (CAT) programs in educational measurement. Shadow test approach to CAT (Shadow CAT) offers an elegant solution to imposing statistical and nonstatistical constraints by projecting future consequences of item selection. The original form of Shadow CAT presumes fixed test lengths. The goal of the current study was to extend Shadow CAT to tests under variable-length termination conditions and evaluate its performance relative to other content balancing approaches. The study demonstrated the feasibility of constructing Shadow CAT with variable test lengths and in operational CAT programs. The results indicated the superiority of the approach compared with other content balancing methods.
37

Deshaies, Kathryn, Noori Akhtar-Danesh, and Sharon Kaasalainen. "An Evaluation of Chronic Pain Questionnaires in the Adult Population." Journal of Nursing Measurement 23, no. 1 (2015): 22–39. http://dx.doi.org/10.1891/1061-3749.23.1.22.

Abstract:
Background and Purpose: Considering pain’s subjectivity, measurement and its processes are indispensable to clinicians and researchers. Development and testing methods of recently published chronic pain questionnaires were analyzed to determine the state of measurement in chronic pain. Methods: There were 8 questionnaires analyzed against 28 criteria, which combined specific testing standards and commonly accepted reliability statistics. Results: Only 1 questionnaire received a rating of good method quality. The 7 remaining questionnaires received a rating of poor method quality. Conclusions: Newly developed chronic pain self-report questionnaires revealed deficiencies in construction and testing methods. It is proposed that an adapted version of the Standards of Educational and Psychological Testing serves as a useful guide for developing and testing new health questionnaires.
38

Kemper, Han C. G., and Willem Van Mechelen. "Physical Fitness Testing of Children: A European Perspective." Pediatric Exercise Science 8, no. 3 (August 1996): 201–14. http://dx.doi.org/10.1123/pes.8.3.201.

Abstract:
The purpose of this article is to clarify the scientific basis of physical fitness assessment in children and to review the European efforts to develop a EUROFIT fitness test battery for the youth in the countries of the Council of Europe. The development of EUROFIT is based on the efforts made in the United States in the 1950s and in Europe in the 1980s. Physical fitness measurement is not identical to physiological measurement: The EUROFIT tests are aimed at measuring abilities rather than skills. Correlations between physical fitness tests and physiological laboratory tests show varying results and, therefore, need to be continued. Reliability of fitness tests needs to be continually studied. Because of the multipurposes of physical fitness testing, EUROFIT norm- and criterion-referenced scales for EUROFIT have to be developed. Examples of scaling methods are given. Implementation of the EUROFIT fitness tests for educational purposes is urgently needed.
39

Bond, Trevor G. "Ready for school? Ready for learning? An empirical contribution to a perennial debate." Australian Educational and Developmental Psychologist 18, no. 1 (2001): 77–80. http://dx.doi.org/10.1017/s0816512200028303.

Abstract:
The Rasch measurement principles espoused in the Bond & Fox (2001) volume reviewed elsewhere in this journal are routinely adopted by Australia's major educational measurement projects (e.g., by Australian Council for Educational Research, Educational Testing Centre). Yet those ideas are yet to have their full impact in smaller research projects in educational and developmental psychology. A number of quantitative analytical techniques used in our disciplines are able to help us to draw conclusions like “Betty is better than or more developed than Bob”, but Rasch measurement is uniquely placed to help us conclude that “Betty is this much better than or more developed than Bob.” In educational and psychological statistics, we regularly presume the “interval” nature of our research data, but only the Rasch model sets about to ensure that the units of measurement maintain their unit value across the whole achievement or development scale.
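The interval-scale claim in this abstract can be illustrated with a minimal sketch. It assumes the standard dichotomous Rasch model, in which the log-odds of success equal person ability minus item difficulty. The ability and difficulty values are hypothetical and are not taken from the article.

```python
import math

def rasch_probability(theta, b):
    """Probability of success for ability theta on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

# Hypothetical abilities in logits (illustrative values only).
betty, bob = 1.2, 0.4

# On easy, medium, and hard items the log-odds gap between the two
# persons is the same: it equals betty - bob up to rounding.
differences = [
    logit(rasch_probability(betty, b)) - logit(rasch_probability(bob, b))
    for b in (-2.0, 0.0, 2.0)
]
```

Because the gap does not depend on which item is used, the statement "Betty is this much better than Bob" keeps the same meaning across the scale under this model.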
40

Espinoza-Venegas, Maritza, Olivia Sanhueza-Alvarado, Noé Ramírez-Elizondo, and Katia Sáez-Carrillo. "A validation of the construct and reliability of an emotional intelligence scale applied to nursing students." Revista Latino-Americana de Enfermagem 23, no. 1 (February 2015): 139–47. http://dx.doi.org/10.1590/0104-1169.3498.2535.

Abstract:
OBJECTIVE: The current study aimed to validate the construct and reliability of an emotional intelligence scale. METHOD: The Trait Meta-Mood Scale-24 was applied to 349 nursing students. The process included content validation, which involved expert reviews, pilot testing, measurements of reliability using Cronbach's alpha, and factor analysis to corroborate the validity of the theoretical model's construct. RESULTS: Adequate Cronbach coefficients were obtained for all three dimensions, and factor analysis confirmed the scale's dimensions (perception, comprehension, and regulation). CONCLUSION: The Trait Meta-Mood Scale is a reliable and valid tool to measure the emotional intelligence of nursing students. Its use allows for accurate determinations of individuals' abilities to interpret and manage emotions. At the same time, this new construct is of potential importance for measurements in nursing leadership; educational, organizational, and personal improvements; and the establishment of effective relationships with patients.
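For readers unfamiliar with the reliability statistic named in this abstract, the following is a minimal sketch of the usual Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy data are illustrative assumptions and have no connection to the study's data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows, each with k item scores.

    Uses sample variances and requires k >= 2 and non-constant total scores.
    """
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)

# Toy data (hypothetical): two items that rank respondents identically.
toy_scores = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

With perfectly consistent items, as in the toy data, alpha reaches its maximum of 1; less consistent items give lower values.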
41

Himelfarb, Igor. "A primer on standardized testing: History, measurement, classical test theory, item response theory, and equating." Journal of Chiropractic Education 33, no. 2 (June 6, 2019): 151–63. http://dx.doi.org/10.7899/jce-18-22.

Abstract:
Objective: This article presents health science educators and researchers with an overview of standardized testing in educational measurement. The history, theoretical frameworks of classical test theory, item response theory (IRT), and the most common IRT models used in modern testing are presented. Methods: A narrative overview of the history, theoretical concepts, test theory, and IRT is provided to familiarize the reader with these concepts of modern testing. Examples of data analyses using different models are shown using 2 simulated data sets. One set consisted of a sample of 2000 item responses to 40 multiple-choice, dichotomously scored items. This set was used to fit 1-parameter logistic (PL), 2PL, and 3PL IRT models. Another data set was a sample of 1500 item responses to 10 polytomously scored items. The second data set was used to fit a graded response model. Results: Model-based item parameter estimates for 1PL, 2PL, 3PL, and graded response are presented, evaluated, and explained. Conclusion: This study provides health science educators and education researchers with an introduction to educational measurement. The history of standardized testing, the frameworks of classical test theory and IRT, and the logic of scaling and equating are presented. This introductory article will aid readers in understanding these concepts.
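The 1PL, 2PL, and 3PL models named in this abstract are nested, which a short sketch can make concrete. It assumes the common logistic parameterization with discrimination a, difficulty b, and lower asymptote (guessing) c, and omits the optional scaling constant some texts include. The function names are illustrative and the article's own notation may differ.

```python
import math

def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def irf_2pl(theta, a, b):
    """2PL: the 3PL model with the guessing parameter fixed at 0."""
    return irf_3pl(theta, a=a, b=b, c=0.0)

def irf_1pl(theta, b):
    """1PL: the 2PL model with a common discrimination, fixed at 1 here."""
    return irf_3pl(theta, a=1.0, b=b, c=0.0)
```

When ability equals item difficulty, the 1PL and 2PL probabilities are 0.5, while a 3PL item with guessing parameter c gives c + (1 - c) / 2.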
42

Chalhoub-Deville, Micheline, and Craig Deville. "Computer Adaptive Testing in Second Language Contexts." Annual Review of Applied Linguistics 19 (January 1999): 273–99. http://dx.doi.org/10.1017/s0267190599190147.

Abstract:
The widespread accessibility to large, networked computer labs at educational sites and commercial testing centers, coupled with fast-paced advances in both computer technology and measurement theory, along with the availability of off-the-shelf software for test delivery, all help to make the computerized assessment of individuals more efficient and accurate than assessment using traditional paper-and-pencil (P&P) tests. Computer adaptive testing (CAT) is a form of computerized assessment that has achieved a strong foothold in licensure and certification testing and is finding greater application in many other areas as well, including education. A CAT differs from a straightforward, linear test in that an item(s) is selected for each test taker based on his/her performance on previous items. As such, assessment is tailored online to accommodate the test taker's estimated ability and confront the examinee with items that best measure that ability.
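The adaptive step described in this abstract, choosing each item from the test taker's estimated ability, can be sketched as follows. The sketch assumes a 2PL item bank and the maximum-information selection rule, which is one common choice and is not stated in the abstract. The item bank, its parameters, and the function names are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta_hat, item_bank, administered):
    """Pick the unused item that is most informative at the current estimate."""
    candidates = [name for name in item_bank if name not in administered]
    return max(candidates, key=lambda name: item_information(theta_hat, *item_bank[name]))

# Hypothetical bank: item name -> (discrimination a, difficulty b).
item_bank = {"easy": (1.0, -1.5), "medium": (1.0, 0.0), "hard": (1.0, 1.5)}
first_item = select_next_item(0.1, item_bank, administered=set())
```

An operational CAT would also re-estimate ability after every response and apply content and exposure constraints, which this sketch leaves out.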
43

Debelak, Rudolf, and Martin Arendasy. "An Algorithm for Testing Unidimensionality and Clustering Items in Rasch Measurement." Educational and Psychological Measurement 72, no. 3 (January 6, 2012): 375–87. http://dx.doi.org/10.1177/0013164411426005.

Abstract:
A new approach to identify item clusters fitting the Rasch model is described and evaluated using simulated and real data. The proposed method is based on hierarchical cluster analysis and constructs clusters of items that show a good fit to the Rasch model. It thus gives an estimate of the number of independent scales satisfying the postulates of sufficiency of total number of correctly answered items for a person’s proficiency, unidimensionality, and local independence that can be constructed from an item set. The method is also compared with the application of a principal components analysis based on tetrachoric correlations. In general, the proposed method was shown to provide practically usable results especially for large person samples.
44

Karahan, Ali Yavuz, Bugra Kaya, Banu Kuran, Ozlem Altındag, Pelin Yildirim, Sevil Ceyhan Dogan, Aynur Basaran, et al. "Common Mistakes in the Dual-Energy X-ray Absorptiometry (DXA) in Turkey. A Retrospective Descriptive Multicenter Study." Acta Medica (Hradec Kralove, Czech Republic) 59, no. 4 (2016): 117–23. http://dx.doi.org/10.14712/18059694.2017.38.

Abstract:
Background: Osteoporosis is a widespread metabolic bone disease representing a global public health problem currently affecting more than two hundred million people worldwide. The World Health Organization states that dual-energy X-ray absorptiometry (DXA) is the best densitometric technique for assessing bone mineral density (BMD). DXA provides an accurate diagnosis of osteoporosis, a good estimation of fracture risk, and is a useful tool for monitoring patients undergoing treatment. Common mistakes in BMD testing can be divided into four principal categories: 1) indication errors, 2) lack of quality control and calibration, 3) analysis and interpretation errors, and 4) inappropriate acquisition techniques. The aim of this retrospective multicenter descriptive study is to identify the common errors in the application of the DXA technique in Turkey. Methods: All DXA scans performed during the observation period were included in the study if the measurements of both, the lumbar spine and proximal femur were recorded. Forearm measurement, total body measurements, and measurements performed on children were excluded. Each examination was surveyed by 30 consultants from 20 different centers each informed and trained in the principles of and the standards for DXA scanning before the study. Results: A total of 3,212 DXA scan results from 20 different centers in 15 different Turkish cities were collected. The percentage of the discovered erroneous measurements varied from 10.5% to 65.5% in the lumbar spine and from 21.3% to 74.2% in the proximal femur. The overall error rate was found to be 31.8% (n = 1021) for the lumbar spine and 49.0% (n = 1576) for the proximal femur. Conclusion: In Turkey, DXA measurements of BMD have been in use for over 20 years, and examination processes continue to improve. There is no educational standard for operator training, and a lack of knowledge can lead to significant errors in the acquisition, analysis, and interpretation.
45

Chen, Yunxiao, Yang Liu, and Shuangshuang Xu. "Mutual Information Reliability for Latent Class Analysis." Applied Psychological Measurement 42, no. 6 (January 15, 2018): 460–77. http://dx.doi.org/10.1177/0146621617748324.

Abstract:
Latent class models are powerful tools in psychological and educational measurement. These models classify individuals into subgroups based on a set of manifest variables, assisting decision making in a diagnostic system. In this article, based on information theory, the authors propose a mutual information reliability (MIR) coefficient that summarizes the measurement quality of latent class models, where the latent variables being measured are categorical. The proposed coefficient is analogous to a version of the reliability coefficient for item response theory models and meets the general concept of measurement reliability in the Standards for Educational and Psychological Testing. The proposed coefficient can also be viewed as an extension of McFadden's pseudo R-square coefficient, which evaluates the goodness-of-fit of a logistic regression model, to latent class models. Thanks to several information-theoretic inequalities, the MIR coefficient is unitless, lies between 0 and 1, and has a clear interpretation from a measurement point of view. The coefficient can be applied to both fixed and computerized adaptive testing designs. The performance of the MIR coefficient is demonstrated by simulated examples.
46

Wise, Steven L., Dena A. Pastor, and Xiaojing J. Kong. "Correlates of Rapid-Guessing Behavior in Low-Stakes Testing: Implications for Test Development and Measurement Practice." Applied Measurement in Education 22, no. 2 (April 6, 2009): 185–205. http://dx.doi.org/10.1080/08957340902754650.

47

Winter, Sonja D., and Sarah Depaoli. "An illustration of Bayesian approximate measurement invariance with longitudinal data and a small sample size." International Journal of Behavioral Development 44, no. 4 (October 16, 2019): 371–82. http://dx.doi.org/10.1177/0165025419880610.

Abstract:
This article illustrates the Bayesian approximate measurement invariance (MI) approach in Mplus with longitudinal data and small sample size. Approximate MI incorporates zero-mean small variance prior distributions on the differences between parameter estimates over time. Contrary to traditional invariance testing methods, where exact invariance is tested, this method allows for some “wiggle room” in the parameter estimates over time. The procedure is illustrated using longitudinal data on college students’ academic stress as it changes in the period leading up to and right after an important midterm. Results show that traditional invariance testing methods come to a standstill due to the small sample size. Bayesian approximate MI testing was able to identify non-invariant parameters, after which a partially invariant model could be estimated.
48

Torrance, Harry. "Notes: Combining Measurement-Driven Instruction With Authentic Assessment: Some Initial Observations of National Assessment in England and Wales." Educational Evaluation and Policy Analysis 15, no. 1 (March 1993): 81–90. http://dx.doi.org/10.3102/01623737015001081.

Abstract:
Recently assessment has been singled out as a key mechanism for monitoring and intervening in the educational process. In particular, claims are being made that new forms of assessment will be able to drive teaching and learning in more positive ways than was originally associated with narrower testing programs. This note reports on initial evidence from the National Assessment in England and Wales and highlights a number of problems in implementing new approaches to assessment in the context of national testing.
49

Polivyanchuk, A., M. Smirny, S. Romanenko, R. Semenenko, R. Plotnikova, D. Onatsky, and O. Efimov. "Research of Efficiency Ecological Diagnostics System of Heat Engines and Boiler Plants." Municipal economy of cities 6, no. 152 (December 28, 2019): 73–78. http://dx.doi.org/10.33042/2522-1809-2019-6-152-73-78.

Abstract:
A universal, multifunctional system for the environmental diagnostics of heat engines and boiler plants has been created. It determines indicators characterizing the chemical and physical effects of these objects on the environment: concentrations; mass, specific, and average operational emissions of pollutants; noise; thermal pollution; and vibration. The measuring system consists of instrumental, testing, demonstration, and laboratory modules, so it can serve as a diagnostic tool, a training and test bench, and a laboratory, and can be applied in the transport, energy, environmental, and educational fields. The diagnostic system implements methods for monitoring and improving the accuracy of measurements of average operational emissions of pollutants: a method for determining the resulting measurement errors of the average operational emissions of gaseous pollutants and particulate matter (the GAS and PT indicators), which makes it possible to evaluate how the errors of the diagnostic system's measuring equipment affect these data; and a method for increasing the accuracy of measurements of the normalized PT indicator by taking into account the methodological measurement error caused by the influence of the sample temperature in the tunnel on the measured emission of particulate matter, δPTt. Experimental studies of the diagnostic system and of the methods for increasing its accuracy were carried out on full-scale objects: diesel engines (the tractor diesels 4CHN12/14 and D65M and the diesel locomotive engine DEL-01) and boiler units (gas: DKVR-20/13 and AOGV-100E; solid fuel: KCHM-2M-4). The transport diesels were tested according to the cycles established by UNECE Regulations R-49 and R-96 and the international standard ISO 8178. From the tests of these engines, the coefficients KPi, KMgasi, and KMpmi, which are used to assess the accuracy of measurements of the GAS and PT indicators, were determined; the resulting measurement errors of these values were investigated; and the range of variation of the methodological error δPTt was determined.
50

Morris, Anne K., and James Hiebert. "Openness and Measurement: Two Principles for Improving Educational Practice and Shared Instructional Products." Mathematics Teacher Educator 3, no. 2 (March 2015): 130–53. http://dx.doi.org/10.5951/mathteaceduc.3.2.0130.

Abstract:
Two studies were conducted to identify the conditions under which instructors teaching the same mathematics teacher preparation course would continuously improve their shared instructional products (lesson plans for class sessions) using small amounts of data on preservice teacher performance. Findings indicated that when lesson-level student performance data were simply collected, by course section, the instructors could make important changes to the lessons but did not often do so. However, when the instructors were encouraged to compare data across semesters, they generated hypotheses that guided instructional improvements, which then were tested through multiple cycles. The cycles of hypothesis testing helped instructors clarify the goals for improvement, use the performance data to test whether changes were actually improvements, and reduce their tolerance for marginal student performance.