Dissertations / Theses on the topic 'Generalizability theory'

Author: Grafiati

Published: 4 June 2021

Last updated: 29 January 2023


Consult the top 34 dissertations / theses for your research on the topic 'Generalizability theory.'


1

Kayandé, Ujwal Anilchandra. "Theory of generalizability and optimization of marketing measurement." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp02/NQ29053.pdf.

2

Ark, Tavinder K. "Ordinal generalizability theory using an underlying latent variable framework." Thesis, University of British Columbia, 2015. http://hdl.handle.net/2429/53892.

Abstract:

This dissertation introduces a method for estimating the variance components required in the use of generalizability theory (GT) with categorical ratings (e.g., ordinal variables). Traditionally, variance components in GT are estimated using statistical techniques that treat ordinal variables as continuous. This may lead to bias in the estimation of variance components and the resulting reliability coefficients (called G-coefficients). This dissertation demonstrates that variance components can be estimated using a structural equation modeling (SEM) technique called covariance structural modeling (CSM) of a polychoric or tetrachoric correlation matrix, which accounts for the metric of ordinal variables. The dissertation provides a proof of concept of this method, which will be called ordinal GT, using real data in the computation of a relative G-coefficient, and a simulation study presenting the relative merits of ordinal to conventional G-coefficients from ordinal data. The results demonstrate that ordinal GT is viable using CSM of the polychoric matrix of ordinal data. In addition, using a Monte Carlo simulation, the relative G-coefficients when ordinal data are naively treated as continuous are compared to when they are correctly treated as ordinal. The number of response categories, magnitude of the theoretical G-coefficient, and skewness of the item response distributions varied in experimental conditions for: (i) a two-facet crossed G-study design, and (ii) a one-facet partially nested G-study design. The results reveal that when ordinal data were treated as continuous, the empirical G-coefficients consistently underestimated their theoretical values. This was true regardless of the number of response categories, magnitude of the theoretical G-coefficient, and skewness. In contrast, the ordinal G-coefficients performed much better in all conditions.
This dissertation shows that using CSM to model the polychoric correlation matrix provides better estimates of variance components in the GT of ordinal variables. It offers researchers a new statistical avenue for computing relative G-coefficients when using ordinal variables.
Education, Faculty of
Educational and Counselling Psychology, and Special Education (ECPS), Department of
Graduate

3

Wang, Yi. "Decomposing Variance Components for Risk Perceptions Using Generalizability Theory." Bowling Green State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1498785199689687.

4

Peeters, Michael Joseph. "Using Generalizability Theory to Improve Assessment within Pharmacy Education." University of Toledo / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1571775359957282.

5

Karlsson, Jenny. "Generalizability Theory and a Scale Measuring Emotion Knowledge in Preschool Children." Thesis, Stockholms universitet, Psykologiska institutionen, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-138153.

6

Marcus, Mindy Beth. "Explaining adult crime : the role of Adlerian theory and the generalizability of social control theory /." Digital version accessible at:, 1998. http://wwwlib.umi.com/cr/utexas/main.

7

Wang, Ze. "Estimating reliability under a generalizability theory model for writing scores in C-base." Diss., Columbia, Mo. : University of Missouri-Columbia, 2005. http://hdl.handle.net/10355/4292.

Abstract:

Thesis (M.S.)--University of Missouri-Columbia, 2005.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file viewed on (January 10, 2007) Includes bibliographical references.

8

Moore, Joann Lynn. "Estimating standard errors of estimated variance components in generalizability theory using bootstrap procedures." Diss., University of Iowa, 2010. https://ir.uiowa.edu/etd/860.

Abstract:

This study investigated the extent to which rules proposed by Tong and Brennan (2007) for estimating standard errors of estimated variance components held up across a variety of G theory designs, variance component structures, sample size patterns, and data types. Simulated data was generated for all combinations of conditions, and point estimates, standard error estimates, and coverage for three types of confidence intervals were calculated for each estimated variance component and relative and absolute error variance across a variety of bootstrap procedures for each combination of conditions. It was found that, with some exceptions, Tong and Brennan's (2007) rules produced adequate standard error estimates for normal and polytomous data, while some of the results differed for dichotomous data. Additionally, some refinements to the rules were suggested with respect to nested designs. This study provides support for the use of bootstrap procedures for estimating standard errors of estimated variance components when data are not normally distributed.
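
The bootstrap idea in this abstract can be illustrated with a minimal sketch. It assumes the simplest G-study design, persons crossed with items (p x i), estimates the variance components from the usual ANOVA expected mean squares, and resamples persons only. It does not implement the Tong and Brennan (2007) rules or any bias adjustment; the function names, the simulated data, and the number of bootstrap replications are all hypothetical choices made for illustration.

```python
import random
import statistics


def variance_components_pxi(scores):
    """ANOVA estimates of variance components for a crossed p x i design.

    `scores` is a list of rows, one row per person, one column per item.
    """
    n_p, n_i = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_i)
    p_means = [sum(row) / n_i for row in scores]
    i_means = [sum(row[j] for row in scores) / n_p for j in range(n_i)]
    ms_p = n_i * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_i = n_p * sum((m - grand) ** 2 for m in i_means) / (n_i - 1)
    ss_pi = sum((scores[p][i] - p_means[p] - i_means[i] + grand) ** 2
                for p in range(n_p) for i in range(n_i))
    ms_pi = ss_pi / ((n_p - 1) * (n_i - 1))
    # Estimates can be negative in small samples; they are left untruncated here.
    return {"p": (ms_p - ms_pi) / n_i, "i": (ms_i - ms_pi) / n_p, "pi,e": ms_pi}


def bootstrap_se_persons(scores, n_boot=200, seed=0):
    """Naive bootstrap standard errors, resampling persons with replacement."""
    rng = random.Random(seed)
    draws = {"p": [], "i": [], "pi,e": []}
    for _ in range(n_boot):
        sample = [rng.choice(scores) for _ in scores]
        for name, value in variance_components_pxi(sample).items():
            draws[name].append(value)
    return {name: statistics.stdev(values) for name, values in draws.items()}


# Hypothetical simulated data: 50 persons x 8 items, normally distributed effects.
rng = random.Random(1)
person = [rng.gauss(0, 1.0) for _ in range(50)]
item = [rng.gauss(0, 0.5) for _ in range(8)]
scores = [[p + i + rng.gauss(0, 1.0) for i in item] for p in person]

estimates = variance_components_pxi(scores)
standard_errors = bootstrap_se_persons(scores)
for name in estimates:
    print(name, round(estimates[name], 3), round(standard_errors[name], 3))
```

Resampling only persons treats items as fixed across replications, so the spread it reports for the item component is likely to be too small; that is one reason the literature discusses different bootstrap procedures for different components.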

9

Sharpnack, Jim D. "An Investigation of the Parenting Stress Index in the Context of Generalizability Theory." DigitalCommons@USU, 1997. https://digitalcommons.usu.edu/etd/6102.

Abstract:

The present study examined the application of generalizability theory (GT) to the Parenting Stress Index (PSI) long and short forms for families having children with disabilities. The purpose of the study was to evaluate the dependability of parenting stress data scores gathered from families having children with disabilities. The data for the present study came from an extant data set collected by the Early Intervention Research Institute (EIRI; Contract #800-85-0173) at Utah State University. The EIRI studies represented attempts to assess the benefits and costs of conducting early intervention programs. The EIRI data were recoded at the item level for the Psychometrics Project, which established norms, reliability, and validity information on self-report, family-functioning measures gathered from families having children with disabilities. The GT study results suggested that the items facet made a large contribution, indicating that there may not be any established trends in item responses. An explanation for the items facet indicates that the PSI forms provide an accurate measure of overall parental stress. According to the times facet results, the effects of time are minimal except for the increase from occasion one to occasion two. Classical reliability theory (CRT) and GT analyses provide contradictory results, probably due to GT's multiple error source analyses compared to CRT's examination of a single error source in one analysis. GT study analyses indicate that the highest g and phi coefficients are produced with the highest number of administrations and items. However, administering the highest number of administrations and items would be impractical within any setting. The original number of items from the Parent Domain, Child Domain, and short PSI total score should be administered twice to increase the dependability of scores and still fall within practical limitations.
A researcher and/or practitioner may want information to decide what form, long or short, to choose. If the PSI is to be used as a quick screening tool or as one test in a complete assessment, the short form may be of more use. If the PSI is to be used as a primary source of information about parent and child interactive systems, the long PSI version would be recommended.
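
The trade-off this abstract describes, where g and phi coefficients rise as items and administrations are added, is the kind of calculation a decision (D) study performs. The sketch below assumes a fully crossed persons x items x occasions design and uses the standard formulas for relative and absolute error variance. The variance components are hypothetical numbers chosen only for illustration and are not taken from the thesis.

```python
def d_study(var, n_items, n_occasions):
    """Return (g, phi) for a crossed persons x items x occasions design.

    `var` maps effect names to estimated variance components:
    p, i, o, pi, po, io and the residual "pio,e".
    """
    n_io = n_items * n_occasions
    # Relative error: only effects that interact with persons.
    rel_error = var["pi"] / n_items + var["po"] / n_occasions + var["pio,e"] / n_io
    # Absolute error: adds the main effects of the facets and their interaction.
    abs_error = rel_error + var["i"] / n_items + var["o"] / n_occasions + var["io"] / n_io
    g = var["p"] / (var["p"] + rel_error)
    phi = var["p"] / (var["p"] + abs_error)
    return g, phi


# Hypothetical variance components (illustration only).
components = {"p": 0.30, "i": 0.25, "o": 0.02,
              "pi": 0.15, "po": 0.05, "io": 0.01, "pio,e": 0.40}

for n_occasions in (1, 2, 3):
    for n_items in (12, 36):
        g, phi = d_study(components, n_items, n_occasions)
        print(f"items={n_items:2d} occasions={n_occasions} g={g:.3f} phi={phi:.3f}")
```

Because every error term is divided by a sample size, both coefficients can only increase as items or occasions are added, which is why the practical question becomes how many administrations are affordable rather than how many are ideal.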

10

Tanner, Nicholas Andrew. "Generalizability of Universal Screening Measures for Behavioral and Emotional Risk." Diss., The University of Arizona, 2017. http://hdl.handle.net/10150/625352.

Abstract:

Data derived from universal screening procedures are increasingly utilized by schools to identify and provide additional supports to students at risk of behavioral and emotional concerns. As screening has the potential to be resource intensive, effort has been placed on the development of efficient screening procedures, namely brief behavior rating scales. This study utilized classical test theory and generalizability theory to examine the extent to which differences among students, raters, occasions, and screening measures affect the meaningfulness of data derived from universal screening procedures. Teacher pairs from three middle school classrooms completed two brief behavior rating scales during fall and spring screening administrations for all students in their respective classrooms. Correlation coefficients examining interrater reliability, test-retest reliability, and concurrent validity were generally strong. Generalizability analyses indicated that the majority of variance in teacher ratings was attributable to student differences across all score comparisons, but differences between teacher ratings for particular students accounted for relatively large percentages of error variance among student behavior ratings. Although decision studies showed that increasing the number of screening occasions resulted in more generalizable data, increasing the number of raters resulted in more efficient screening procedures.

11

Drost, Ellen Antoinette. "Toward a unified theory of task-oriented and relationship-oriented leader behavior: a multi-country generalizability study." FIU Digital Commons, 2001. http://digitalcommons.fiu.edu/etd/3086.

Abstract:

The theoretical foundation of this study comes from the significant recurrence throughout the leadership literature of two distinct behaviors, task orientation and relationship orientation. Task orientation and relationship orientation are assumed to be generic behaviors, which are universally observed and applied in organizations, even though they may be uniquely enacted in organizations across cultures. The lack of empirical evidence supporting these assumptions provided the impetus to hypothetically develop and empirically confirm the universal application of task orientation and relationship orientation and the generalizability of their measurement in a cross-cultural setting. Task orientation and relationship orientation are operationalized through consideration and initiation of structure, two well-established theoretical leadership constructs. Multiple-group mean and covariance structures (MACS) analyses are used to simultaneously validate the generalizability of the two hypothesized constructs across the 12 cultural groups and to assess whether the similarities and differences discovered are measurement and scaling artifacts or reflect true cross-cultural differences. The data were collected by the author and others as part of a larger international research project. The data are comprised of 2341 managers from 12 countries/regions. The results provide compelling evidence that task orientation and relationship orientation, reliably and validly operationalized through consideration and initiation of structure, are generalizable across the countries/regions sampled. But the results also reveal significant differences in the perception of these behaviors, suggesting that some aspects of task orientation and relationship orientation are strongly affected by cultural influences. These (similarities and) differences reflect directly interpretable, error-free effects among the constructs at the behavioral level. 
Thus, task orientation and relationship orientation can demonstrate different relations among cultures, yet still be defined equivalently across the 11 cultures studied. The differences found in this study are true differences and may contain information about cultural influences characterizing each cultural context (i.e. group). The nature of such influences should be examined before the results can be meaningfully interpreted. To examine the effects of cultural characteristics on the constructs, additional hypotheses on the constructs' latent parameters can be tested across groups. Construct-level tests are illustrated in hypothetical examples in light of the study's results. The study contributes significantly to the theoretical understanding of the nature and generalizability of psychological constructs. The theoretical and practical implications of embedding context into a unified theory of task-oriented and relationship-oriented leader behavior are proposed. Limitations and contributions are also discussed.

12

Raffle, Holly. "Assessment and Reporting of Intercoder Reliability in Published Meta-Analyses Related to Preschool Through Grade 12 Education." Ohio University / OhioLINK, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1156167922.

13

Williamson, J. Austin. "Social support, mood, and relationship satisfaction at the trait and social levels." Diss., University of Iowa, 2015. https://ir.uiowa.edu/etd/1932.

Abstract:

Many social processes influence the amount, quality, and availability of support from an individual's social network. Trait influences are characteristics of the individual that generalize across relationships and affect how much support is received and perceived on average from other people. Social influences comprise characteristics of the individual's social network. They are relationship specific and account for the variability in supportiveness among an individual's providers. Recent studies have taken a multilevel approach to studying social support in order to partition the variance in sets of relationship-specific support measures into trait and social components. These studies have also used multivariate generalizability (G) theory to examine the correlations between social support and other constructs, such as negative mood, at the trait and social level. These multilevel studies have begun to clarify the relative contributions of trait and social influences on social support, but much is yet to be learned about the nature and measurement of social support's trait and social components. One set of aims within this project was to identify characteristics of support recipients and characteristics of support providers that were related to the reception and perception of social support. Another set of aims focused on validating the measurement strategies used by G theory researchers and understanding how the trait and social components of support and mood derived from relationship-specific measures relate to traditional measures of these constructs. My final set of aims involved the application of multilevel analyses of social support and negative mood to three existing theories in the social support literature--the buffering hypothesis, the matching hypothesis, and the platinum rule. The participants in this study comprised two samples--one group of 755 undergraduate psychology students, and one group of 430 community members from across the United States. 
Participants completed measures of their personality traits, recent depressive symptoms, recent experiences of life adversity and perceived control over life adversity. They also reported on three close relationships including support from those relationships, satisfaction with those relationships, and mood experienced when interacting with those three people. Several multilevel analyses were used in the study. Univariate G theory analyses were used to quantify the relative variance in support, mood, and relationship satisfaction attributable to trait and social influences. Multivariate G theory analyses were used to estimate the links between these variables at the trait and social levels of analysis. Mixed effects models were used to identify trait and relationship-specific constructs that might partly constitute the trait and social influences on social support. Multilevel Structural Equation Modeling (SEM) was used to evaluate the validity of several constructs employed in previous multilevel studies on social support. Finally, mixed effects and multivariate G theory analyses were used to test the buffering hypothesis, the matching hypothesis, and the platinum rule. Consistent with previous multilevel studies of social support, recipients who received more support, on average, from their social networks also reported more negative mood when interacting with their providers. After taking those average tendencies into account, the amount of support received from an individual support provider was not associated with negative mood experienced when with that provider. The investigation of the trait influences on social support showed that recipients who were younger, more extraverted, and more open to new experiences tended to receive more social support. Women tended to receive more support than men.
With respect to social influences, romantic partners tended to provide the most support whereas friends and siblings provided significantly less support on average. Women tended to provide more support than men. The validity assessment showed that the social component of support availability was only modestly distinct from the social component of generic relationship satisfaction. The trait component of support availability showed good discriminant validity from relationship satisfaction and good convergent validity with global support availability. The trait component of relationship-specific mood showed moderate convergent validity with general mood. The buffering and matching hypotheses were not supported by my findings. The platinum rule was supported at the trait level in that recipients who reported greater support adequacy, on average, tended to report more positive mood and less negative mood. The platinum rule was also supported at the social level in that recipients tended to report experiencing the most positive mood and least negative mood when interacting with individual providers who tended to supply the most adequate support.

14

Burton, Rachel Clinger. "Oral Retelling as a Measure of Reading Comprehension: The Generalizability of Ratings of Elementary School Students Reading Expository Texts." Diss., 2008. http://contentdm.lib.byu.edu/ETD/image/etd2409.pdf.

15

Fox, Danielle Polizzi. "Testing the generalizability of Sampson and Laub's life-course theory examining the relationship between adult social bonds and drug use among an African American sample /." College Park, Md. : University of Maryland, 2004. http://hdl.handle.net/1903/1461.

Abstract:

Thesis (Ph. D.) -- University of Maryland, College Park, 2004.
Thesis research directed by: Criminology and Criminal Justice. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.

16

Putka, Dan J. "The Variance Architecture Approach to the Study of Constructs in Organizational Contexts." Ohio University / OhioLINK, 2002. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1018372521.

17

Fox, Jesse. "The Development of the Counselor Intuition Scale." Doctoral diss., University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5738.

Abstract:

Intuition is an important aspect of counseling; several revered counselors have either attested to the powers of their intuition or have had such powers attributed to them by their contemporaries. Moreover, many counselors believe that their intuition is more influential in their work with clients than are evidence-based practices (EBPs). However, the academy criticizes intuition for its susceptibility to cognitive errors and its poor performance when compared to statistical methods. In addition, the exact nature of intuition's role in counseling is largely unknown. Therefore, its contribution to client outcomes is equally a mystery, making it difficult for counselors to justify their reliance on its powers. Until this study, counselor intuition has been regarded as a, more or less, phantom construct in need of evidence to even suggest that it does in fact exist. Therefore, the purpose of this study was to develop the Counselor Intuition Scale (CIS). The construction of the CIS began by adapting the methodology of instruments already in existence and whose purpose was to measure interpersonal and emotional sensitivity. The construction of the CIS began by creating a series of 39 video segments (lasting approximately two minutes each) depicting a client discussing a presenting problem. The video segments were then reviewed by two rounds of counseling experts (N = 45) whose intuitive responses to the clients featured in the CIS were used to create the criterion responses of the instrument. The expert responses were analyzed using Q-Methodology, the results of which suggested that the counseling experts approached the clients from a unidimensional perspective, which the researcher named “counselor intuition.” The expert ratings were also analyzed using generalizability theory to assess the consistency of expert responses, the results of which suggested that interrater reliability was excellent, ranging from .88 to .85.
Lastly, the experts identified 263 criterion responses that can be used for the future development of the instrument. The implications of the study's findings, as well as the recommendations for future research are discussed.
Ph.D.
Doctorate
Dean's Office, Education
Education and Human Performance
Education; Counselor Education

18

Zaidi, Nikki. "Hidden Variance in Multiple Mini-Interview Scores." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1427797882.

19

Hall, Ritchie V. II. "The Role of Racial Bias in Family Assessment Measures." University of Cincinnati / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1243369253.

20

Plummer, Kenneth James. "Analysis of the Psychometric Properties of Two Different Concept-Map Assessment Tasks." Diss., 2008. http://contentdm.lib.byu.edu/ETD/image/etd2281.pdf.

21

Van Ingen, Sarah. "Preparing Teachers to Apply Research to Mathematics Teaching: Using Design-Based Research to Define and Assess the Process of Evidence-Based Practice." Scholar Commons, 2013. http://scholarcommons.usf.edu/etd/4799.

Abstract:

Persistent lack of mathematics achievement and disparity in achievement has led to the publication of research findings related to equitable teaching practices. Although the publication of such research provides insights about approaches for potentially increasing equity in mathematics education, teachers must be able to apply what has been learned from these studies to their classroom teaching practices. Despite the widespread expectation that teachers use research-supported teaching strategies to meet the needs of their diverse classrooms, the research-to-practice gap persists. Little research is currently available to guide mathematics teacher educators in how to prepare future teachers to apply research to teaching practices. Inspired by advancements in social work and other health-related fields, this study departed from the standard approach of preparing teachers to utilize specific, research-based teaching strategies to preparing teachers to engage in the meta-process of applying research to practice. This meta-process has been defined by the health-related disciplines as the process of evidence-based practice (EBP). This process is explicated in a conceptual framework that is composed of the following five steps. The practitioner (1) formulates an answerable practice question, (2) searches for the best research evidence, (3) critically appraises the evidence, (4) selects the best intervention for a specific practice context, and (5) evaluates the outcome of the intervention. The purpose of this study was to examine the process of preparing preservice elementary teachers of mathematics to engage in the five-step process of EBP. Because this process, which can be conceptualized as a routine of practice, has not been identified for the field of mathematics education previously, it was examined using a design-based research (DBR) methodological approach.
There were two objectives to the study: (1) to create an empirically tested teaching intervention that mathematics teacher educators can use to prepare preservice teachers to apply research to teaching practice and (2) to create a system of assessment that supports the teaching of this intervention. The study involved five iterations of the DBR process that permitted the intervention to be evaluated and revised after each iteration. Although each iteration is discussed, this study focuses primarily on the process used in the fifth iteration of the DBR process. This iteration took place in the context of a mathematics methods course in a clinically-rich, undergraduate residency program for initial preparation of elementary school teachers. The twelve participants were simultaneously enrolled in the methods course and embedded in co-teaching assignments at an elementary school. The intervention to prepare teachers to engage in EBP included two workshops that were co-facilitated by an education librarian and a mathematics teacher educator and a semester-long Education Research Project. The project required participants to identify a problem of practice related to teaching or learning mathematics, find relevant research to address that problem, create an intervention to apply the research findings to classroom instruction, implement that intervention, and collect data to evaluate the effectiveness of the designed intervention. Instruments used to collect data included: (1) a self-report Information Literacy Questionnaire, (2) a self-report Familiarity with the Process of Evidence-Based Practice in Education Scale, (3) the Education Research Project report, and (4) a standardized performance assessment. The standardized performance assessment was used to assess beginning proficiency with the process of EBP. Generalizability theory was used to evaluate the reliability of the system created for the standardized performance assessment.
The system that included three raters, two tasks, and two scoring occasions was found to be fairly reliable (absolute generalizability coefficient = .81). Results from this study revealed that participants were more successful at creating implementation plans and linking those plans to research than they were at modifying their plans to meet the needs of specific students or evaluating their research implementation. This study contributes to both research and mathematics education communities' understandings about the potential of EBP as a high-leverage routine of practice and the use of generalizability theory in the creation of a reliable assessment to evaluate this routine of practice. This study documents the complexity of the process of linking research to practice and provides an empirically tested conceptual framework for preparing preservice teachers to engage in this complex practice.

22

Ure, Abigail Christine. "The Effect of Raters and Rating Conditions on the Reliability of the Missionary Teaching Assessment." BYU ScholarsArchive, 2010. https://scholarsarchive.byu.edu/etd/2456.

Abstract:

This study investigated how 2 different rating conditions, the controlled rating condition (CRC) and the uncontrolled rating condition (URC), affected rater behavior and the reliability of a performance assessment (PA) known as the Missionary Teaching Assessment (MTA). The CRC gives raters the capability to manipulate (pause, rewind, fast-forward) video recordings of an examinee's performance as they rate while the URC does not give them this capability (i.e., the rater must watch the recording straight through without making any manipulations). Few studies have compared the effect of these two rating conditions on ratings. Ryan et al. (1995) analyzed the impact of the CRC and URC on the accuracy of ratings, but few, if any, have analyzed its impact on reliability. The Missionary Teaching Assessment is a performance assessment used to assess the teaching abilities of missionaries for the Church of Jesus Christ of Latter-day Saints at the Missionary Training Center. In this study, 32 missionaries taught a 10-minute lesson that was recorded and later rated by trained raters based on a rubric containing 5 criteria. Each teaching sample was rated by 4 of 6 raters. Two of the 4 ratings were rated using the CRC and 2 using the URC. Camtasia Studio (2010), a screen capture software, was used to record when raters used any type of manipulation. The recordings were used to analyze if raters manipulated the recordings and if so, when and how frequently. Raters also performed think-alouds following a random sample of the ratings that were performed using the CRC. These data revealed that when raters had access to the CRC they took advantage of it the majority of the time, but they differed in how frequently they manipulated the recordings. The CRC did not add an exorbitant amount of time to the rating process. The reliability of the ratings was analyzed using both generalizability theory (G theory) and many-facets Rasch measurement (MFRM).
Results indicated that, in general, the reliabilities of the ratings obtained under the 2 rating conditions were not statistically significantly different from each other. The implications of these findings are addressed.

23

Kumazawa, Takaaki. "Systematic criterion-referenced test development in an English-language program." Diss., Temple University Libraries, 2011. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/119394.

Abstract:

Educational Administration
Ed.D.
Although classroom assessment is one of the most frequent practices carried out by teachers in all educational programs, limited research has been conducted to investigate the dependability and validity of criterion-referenced tests (CRTs). The main purpose of this study is to develop a criterion-referenced test for first-year Japanese university students in a general English program. To this end, four research questions are formulated: (a) To what extent do the criterion-referenced items function effectively? (b) To what extent do the facets of persons, items, sections, classes, and subtests contribute to the total score variation in two CRT forms? (c) To what extent are two CRT forms dependable when administered as pretests and posttests? (d) To what extent are two CRT forms valid when administered as pretests and posttests? Two CRT forms made up of vocabulary (k = 25), listening (k = 20), and reading (k = 25) subtests were administered to 249 students using a counterbalanced design. Criterion-referenced item analyses showed that most items were working well for criterion-referenced purposes. Both univariate and multivariate generalizability studies indicated that most of the variance was accounted for by the interaction effect, followed by the items effect, and then by the persons effect. FACETS analyses showed the separation for all the facets accounted for in the analyses and showed that item separation was greater than person separation. This indicated that the students' ability estimates were similar due to their having taken a placement test, whose results were used to form proficiency-based classes. Both univariate and multivariate decision studies indicated that the CRT forms were moderately to highly dependable. The content validity of the CRT forms was supported because the test content was strongly linked to what was taught in class. The construct validity was supported mainly because a fair amount of score gain was observed.
This study elucidates how the statistical analyses used in this study can be applied to CRT development, and how CRT development can be carried out as part of curriculum development.
Temple University--Theses

24

Robitzsch, Alexander. "Essays zu methodischen Herausforderungen im Large-Scale Assessment." Doctoral thesis, Humboldt-Universität zu Berlin, Kultur-, Sozial- und Bildungswissenschaftliche Fakultät, 2016. http://dx.doi.org/10.18452/17424.

Abstract:

Mit der wachsenden Verbreitung empirischer Schulleistungsleistungen im Large-Scale Assessment gehen eine Reihe methodischer Herausforderungen einher. Die vorliegende Arbeit untersucht, welche Konsequenzen Modellverletzungen in eindimensionalen Item-Response-Modellen (besonders im Rasch-Modell) besitzen. Insbesondere liegt der Fokus auf vier methodischen Herausforderungen von Modellverletzungen. Erstens, implizieren Positions- und Kontexteffekte, dass gegenüber einem eindimensionalen IRT-Modell Itemschwierigkeiten nicht unabhängig von der Position im Testheft und der Zusammenstellung des Testheftes ausgeprägt sind und Schülerfähigkeiten im Verlauf eines Tests variieren können. Zweitens, verursacht die Vorlage von Items innerhalb von Testlets lokale Abhängigkeiten, wobei unklar ist, ob und wie diese in der Skalierung berücksichtigt werden sollen. Drittens, können Itemschwierigkeiten aufgrund verschiedener Lerngelegenheiten zwischen Schulklassen variieren. Viertens, sind insbesondere in low stakes Tests nicht bearbeitete Items vorzufinden. In der Arbeit wird argumentiert, dass trotz Modellverletzungen nicht zwingend von verzerrten Schätzungen von Itemschwierigkeiten, Personenfähigkeiten und Reliabilitäten ausgegangen werden muss. Außerdem wird hervorgehoben, dass man psychometrisch häufig nicht entscheiden kann und entscheiden sollte, welches IRT-Modell vorzuziehen ist. Dies trifft auch auf die Fragestellung zu, wie nicht bearbeitete Items zu bewerten sind. Ausschließlich Validitätsüberlegungen können dafür Hinweise geben. Modellverletzungen in IRT-Modellen lassen sich konzeptuell plausibel in den Ansatz des Domain Samplings (Item Sampling; Generalisierbarkeitstheorie) einordnen. In dieser Arbeit wird gezeigt, dass die statistische Unsicherheit in der Modellierung von Kompetenzen nicht nur von der Stichprobe der Personen, sondern auch von der Stichprobe der Items und der Wahl statistischer Modelle verursacht wird.
Several methodological challenges emerge in large-scale student assessment studies like PISA and TIMSS. Item response models (IRT models) are essential for scaling student abilities within these studies. This thesis investigates the consequences of several model violations in unidimensional IRT models (especially in the Rasch model). In particular, this thesis focuses on the following four methodological challenges of model violations. First, position effects and contextual effects imply (in comparison to unidimensional IRT models) that item difficulties depend on the item position in a test booklet as well as on the composition of a test booklet. Furthermore, student abilities are allowed to vary among test positions. Second, the administration of items within testlets causes local dependencies, but it is unclear whether and how these dependencies should be taken into account for the scaling of student abilities. Third, item difficulties can vary among different school classes due to different opportunities to learn. Fourth, the number of omitted items is in general non-negligible in low-stakes tests. In this thesis it is argued that estimates of item difficulties, student abilities and reliabilities can be unbiased despite model violations. Furthermore, it is argued that the choice of an IRT model cannot and should not be made (solely) from a psychometric perspective. This also holds true for the problem of how to score omitted items. Only validity considerations provide reasons for choosing an adequate scoring procedure. Model violations in IRT models can be conceptually classified within the approach of domain sampling (item sampling; generalizability theory). In this approach, the existence of latent variables need not be posited. It is argued that statistical uncertainty in modelling competencies does not only depend on the sampling of persons, but also on the sampling of items and on the choice of statistical models.

25

Miller, Tamara B. "Using generalizability theory and error-tolerance ratios to analyze error in change scores." 1999. http://catalog.hathitrust.org/api/volumes/oclc/42020669.html.

Abstract:

Thesis (M.S.)--University of Wisconsin--Madison, 1999.
Typescript. eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references (leaves 123-125).

26

Patterson, Patricia. "An investigation of the dependability of criterion-referenced test scores using generalizability theory." 1985. http://catalog.hathitrust.org/api/volumes/oclc/13118263.html.

Abstract:

Thesis (Ph. D.)--University of Wisconsin--Madison, 1985.
Typescript. Vita. eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references (leaves 112-118).

27

de Vries, Ingrid. "An analysis of test construction procedures and score dependability of a paramedic recertification exam." Thesis, 2012. http://hdl.handle.net/1974/7434.

Abstract:

High-stakes testing is used to provide results that have important consequences, such as certification, licensing, or credentialing. The purpose of this study was to examine aspects of an exam recently written by flight paramedics for recertification and to make recommendations for the development of future exams. In 2008, an unexpectedly high failure rate led to revisions in the exam development process for flight paramedics. Using principles of classical test theory and generalizability theory, I examined the decision consistency and dependability of the examination and found the decision consistency for dichotomous items to be within acceptable limits, yet the dependability was low. Discrimination was strong at the cut-score. An in-depth look into the process used to set the exam, as well as the psychometric properties of the exam and the items, has led to recommendations that will contribute to the future development of dependable exams in the industry that result in more valid interpretations with respect to paramedic competence.
Thesis (Master, Education) -- Queen's University, 2012-09-06

28

"Competency Assessment in Nursing Using Simulation: A Generalizability Study and Scenario Validation Process." Doctoral diss., 2014. http://hdl.handle.net/2286/R.I.25805.

Abstract:

The measurement of competency in nursing is critical to ensure safe and effective care of patients. This study had two purposes. First, the psychometric characteristics of the Nursing Performance Profile (NPP), an instrument used to measure nursing competency, were evaluated using generalizability theory and a sample of 18 nurses in the Measuring Competency with Simulation (MCWS) Phase I dataset. The relative magnitudes of various error sources and their interactions were estimated in a generalizability study involving a fully crossed, three-facet random design with nurse participants as the object of measurement and scenarios, raters, and items as the three facets. A design corresponding to that of the MCWS Phase I data--involving three scenarios, three raters, and 41 items--showed nurse participants contributed the greatest proportion to total variance (50.00%), followed, in decreasing magnitude, by: rater (19.40%), the two-way participant x scenario interaction (12.93%), and the two-way participant x rater interaction (8.62%). The generalizability (G) coefficient was .65 and the dependability coefficient was .50. In decision study designs minimizing number of scenarios, the desired generalizability coefficients of .70 and .80 were reached at three scenarios with five raters, and five scenarios with nine raters, respectively. In designs minimizing number of raters, G coefficients of .72 and .80 were reached at three raters and five scenarios and four raters and nine scenarios, respectively. A dependability coefficient of .71 was attained with six scenarios and nine raters or seven raters and nine scenarios. Achieving high reliability with designs involving fewer raters may be possible with enhanced rater training to decrease variance components for rater main and interaction effects.
The second part of this study involved the design and implementation of a validation process for evidence-based human patient simulation scenarios in assessment of nursing competency. A team of experts validated the new scenario using a modified Delphi technique, involving three rounds of iterative feedback and revisions. In tandem, the psychometric study of the NPP and the development of a validation process for human patient simulation scenarios both advance and encourage best practices for studying the validity of simulation-based assessments.
Dissertation/Thesis
Doctoral Dissertation Educational Psychology 2014
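
As a rough illustration of how the generalizability (G) and dependability coefficients named in this abstract are defined, the following Python sketch computes both for a simplified one-facet persons x raters design. The function name and the variance components are hypothetical and are not taken from the dissertation, which used a three-facet design with scenarios, raters, and items.

```python
def g_and_phi(var_p, var_r, var_pr_e, n_r):
    """Return (G, Phi) for a persons x raters random design.

    var_p    : variance component for persons (object of measurement)
    var_r    : variance component for the rater main effect
    var_pr_e : person x rater interaction confounded with residual error
    n_r      : number of raters averaged over in the decision study
    """
    # Relative error: only effects that change the rank order of persons.
    rel_error = var_pr_e / n_r
    # Absolute error: also includes the rater main effect.
    abs_error = var_r / n_r + var_pr_e / n_r
    g = var_p / (var_p + rel_error)
    phi = var_p / (var_p + abs_error)
    return g, phi


# Hypothetical variance components, chosen only for illustration.
g, phi = g_and_phi(var_p=0.50, var_r=0.20, var_pr_e=0.30, n_r=3)
print(round(g, 3), round(phi, 3))
```

Because the absolute error term adds the rater main effect, the dependability coefficient can never exceed the G coefficient, which is consistent with the ordering of the two values reported above.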

29

Cao, Qian. "Interrater Agreement and Reliability of Observed Behaviors: Comparing Percentage Agreement, Kappa, Correlation Coefficient, ICC and G Theory." Thesis, 2013. http://hdl.handle.net/1969.1/149310.

Abstract:

The study of interrater agreement and interrater reliability attracts extensive attention because the judgments of multiple raters are subjective and may vary individually. To evaluate interrater agreement and interrater reliability, five different methods or indices have been proposed: percentage of agreement, the kappa coefficient, the correlation coefficient, the intraclass correlation coefficient (ICC), and generalizability (G) theory. In this study, we introduce these methods and discuss their advantages and disadvantages for evaluating interrater agreement and reliability. Then we review and rank these five indices by their frequency of use in practice over the past five years. Finally, we illustrate how to use these five methods under different circumstances and provide SPSS and SAS code to analyze interrater agreement and reliability. We apply the methods above to analyze data from the Parent-Child Interaction System of global ratings (PARCHISY) and conclude as follows: (1) ICC is the most often used method to evaluate interrater reliability in the past five years, while generalizability theory is the least often used. The G coefficients provide interrater reliability similar to that of weighted kappa and ICC on most items, based on the criteria. (2) When the reliability itself is high, different methods provide consistent indications of interrater reliability based on different criteria. If the reliability is not consistent among different methods, both ICC and the G coefficient will provide better interrater reliability based on the criteria, and they also provide consistent results.
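
To make the first two indices in this abstract concrete, here is a minimal Python sketch of percentage agreement and unweighted Cohen's kappa for two raters. The thesis itself provides SPSS and SAS code; the function names and the ratings below are hypothetical and serve only as an illustration.

```python
from collections import Counter


def percent_agreement(a, b):
    """Proportion of items on which two raters give the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Unweighted Cohen's kappa: agreement corrected for chance.

    Undefined (division by zero) when chance agreement equals 1,
    i.e. when both raters use a single identical category throughout.
    """
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the product of each rater's marginal proportions.
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / n**2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of eight observations on a 3-point scale.
rater_a = [1, 2, 3, 1, 2, 3, 1, 2]
rater_b = [1, 2, 3, 1, 3, 3, 2, 2]
print(percent_agreement(rater_a, rater_b))  # 0.75
print(round(cohens_kappa(rater_a, rater_b), 3))
```

Kappa comes out lower than raw agreement here because part of the observed agreement is what two raters with these marginal distributions would reach by chance.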

30

Chen, Yi-Yu, and 陳怡玉. "The Method of Reducing the Variance of Person-by-task on Math Performance Assessment—Using Generalizability Theory." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/83242439931570141297.

Abstract:

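Master's degree
國立臺南大學 (National University of Tainan)
測驗統計研究所 (Graduate Institute of Measurement and Statistics)
Academic year 93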
The purpose of this study was to investigate the impact of task stratification and of scaffolding with graphic organizers on person-by-task variances, generalizability coefficients, and indices of dependability. The subjects came from two 4th-grade classes. One was the researcher's class, and the other had a teacher with the same teaching experience as the researcher. Two variables were manipulated in this study: task stratification and the use of graphic organizers as scaffolding. A completely crossed design and a design with tasks nested within strata were compared to examine the effects of task stratification. The researcher's class received instruction with graphic organizers as scaffolding. The other class received general instruction following the teaching guide. During the experiment, the two classes received identical math class periods and homework. After the experiment, the two classes were given the same math performance assessment. The person-by-task variances in the two classes were compared to examine the effect of scaffolding. The main results were as follows: 1. Task stratification could not reduce the p×t variance well, because the stratification variable was inappropriate. 2. Scaffolding with graphic organizers could reduce the p×t variance. 3. Using graphic organizers as scaffolding performed better than task stratification. 4. Considering the effect of task stratification, a single stratified assessment, which used the p×T×R design, had the largest generalizability coefficient. However, the stratified p×(T:S)×R design had a lower generalizability coefficient than the unstratified design. 5. Scaffolding with graphic organizers could raise the generalizability coefficient of the math performance assessment.

31

Haney, Alison M. "Emotion Regulation and Religiosity: A Repeated Measures Approach." Thesis, 2019.

Abstract:

Religious faith has been identified as a protective factor against negative psychological outcomes and is associated with a range of positive mental and physical health outcomes. While religion is thought to confer psychological benefits to believers in part by enhancing emotion regulation abilities and providing faith-based regulatory methods such as religious coping, these associations have not been examined empirically. This may be due to a lack of measures that are appropriate for use in repeated measures contexts, which are needed for accurate assessment of dynamic constructs such as emotions and regulation. This study employed generalizability theory in a sample (N = 146) collected in daily diary format over 21 days to determine the reliability of commonly used measures of religiosity and religious coping at the daily level. Once reliability was established, varying time scales were used in a multilevel modeling framework to examine the associations among intrinsic religiosity, religious coping, positive and negative affect, and difficulties in emotion regulation. Positive religious coping (PRC) measured at baseline, same day, and a 1-day lag was associated with higher levels of daily positive affect, though PRC was also associated with negative affect when measured on the same day. Negative religious coping (NRC) measured at baseline predicted lower levels of daily positive affect and was associated with higher levels of negative affect when measured on the same day and a 1-day lag. NRC was also associated with higher levels of difficulties in emotion regulation at all measurement periods, though PRC and intrinsic religiosity were not significantly associated with emotion regulation difficulties. While not associated with daily positive or negative affect, intrinsic religiosity was found to enhance the effect of positive affect inertia.
These results did not support the conceptualization that religiosity broadly promotes adaptive emotion regulation, but rather that intrinsic religiosity may increase positive affect by amplifying the effects of positive affect inertia. Additional work is needed with increased measurement occasions to fully understand the temporal associations among these constructs.

32

Vašenda, Michal. "Proces pilotní standardizace české verze dotazníku SERVQUAL pro oblast sportovních služeb." Master's thesis, 2011. http://www.nusl.cz/ntk/nusl-313029.

Abstract:

Title: The process of pilot standardization of the Czech version of the SERVQUAL questionnaire for the sports services industry. Objectives: The aim of this paper is to initiate the standardization process of the Czech version of the SERVQUAL questionnaire for fitness and recreational sport and to prepare the ground for its future use in practice. Methods: First, the SERVQUAL questionnaire was translated into Czech and distributed in two fitness centers over a six-month period. Then, using generalizability theory and factor analysis, the reliability and internal structure of the instrument were examined. Results: The paper provides information about the use of this instrument in assessing the service quality of fitness centers in the Czech Republic and proposes recommendations for further modification and use of the questionnaire in the Czech environment. Key words: Service quality, Measurement of service quality, Generalizability theory, Factor analysis. This research was supported by the Grant Agency of Charles University, project no. 267811, Measuring sport services quality in fitness industry.

33

Fortin, Carole. "Développement et validation d’un outil clinique pour l’analyse quantitative de la posture auprès de personnes atteintes d’une scoliose idiopathique." Thèse, 2010. http://hdl.handle.net/1866/4182.

Abstract:

La scoliose idiopathique (SI) est une déformation tridimensionnelle (3D) de la colonne vertébrale et de la cage thoracique à potentiel évolutif pendant la croissance. Cette déformation 3D entraîne des asymétries de la posture. La correction de la posture est un des objectifs du traitement en physiothérapie chez les jeunes atteints d’une SI afin d’éviter la progression de la scoliose, de réduire les déformations morphologiques et leurs impacts sur la qualité de vie. Les outils cliniques actuels ne permettent pas de quantifier globalement les changements de la posture attribuables à la progression de la scoliose ou à l’efficacité des interventions thérapeutiques. L’objectif de cette thèse consiste donc au développement et à la validation d’un nouvel outil clinique permettant l’analyse quantitative de la posture auprès de personnes atteintes d’une SI. Ce projet vise plus spécifiquement à déterminer la fidélité et la validité des indices de posture (IP) de ce nouvel outil clinique et à vérifier leur capacité à détecter des changements entre les positions debout et assise. Suite à une recension de la littérature, 34 IP représentant l’alignement frontal et sagittal des différents segments corporels ont été sélectionnés. L’outil quantitatif clinique d’évaluation de la posture (outil 2D) construit dans ce projet consiste en un logiciel qui permet de calculer les différents IP (mesures angulaires et linéaires). L’interface graphique de cet outil est conviviale et permet de sélectionner interactivement des marqueurs sur les photographies digitales. Afin de vérifier la fidélité et la validité des IP de cet outil, la posture debout de 70 participants âgés entre 10 et 20 ans atteints d'une SI (angle de Cobb: 15º à 60º) a été évaluée à deux occasions par deux physiothérapeutes. Des marqueurs placés sur plusieurs repères anatomiques, ainsi que des points de référence anatomique (yeux, lobes des oreilles, etc.), ont permis de mesurer les IP 2D en utilisant des photographies. 
Ces mêmes marqueurs et points de référence ont également servi au calcul d’IP 3D obtenus par des reconstructions du tronc avec un système de topographie de surface. Les angles de Cobb frontaux et sagittaux et le déjettement C7-S1 ont été mesurés sur des radiographies. La théorie de la généralisabilité a été utilisée pour déterminer la fidélité et l’erreur standard de la mesure (ESM) des IP de l’outil 2D. Des coefficients de Pearson ont servi à déterminer la validité concomitante des IP du tronc de l’outil 2D avec les IP 3D et les mesures radiographiques correspondantes. Cinquante participants ont été également évalués en position assise « membres inférieurs allongés » pour l’étude comparative de la posture debout et assise. Des tests de t pour échantillons appariés ont été utilisés pour détecter les différences entre les positions debout et assise. Nos résultats indiquent un bon niveau de fidélité pour la majorité des IP de l’outil 2D. La corrélation entre les IP 2D et 3D est bonne pour les épaules, les omoplates, le déjettement C7-S1, les angles de taille, la scoliose thoracique et le bassin. Elle est faible à modérée pour la cyphose thoracique, la lordose lombaire et la scoliose thoraco-lombaire ou lombaire. La corrélation entre les IP 2D et les mesures radiographiques est bonne pour le déjettement C7-S1, la scoliose et la cyphose thoracique. L’outil est suffisamment discriminant pour détecter des différences entre la posture debout et assise pour dix des treize IP. Certaines recommandations spécifiques résultents de ce projet : la hauteur de la caméra devrait être ajustée en fonction de la taille des personnes; la formation des juges est importante pour maximiser la précision de la pose des marqueurs; et des marqueurs montés sur des tiges devraient faciliter l’évaluation des courbures vertébrales sagittales. En conclusion, l’outil développé dans le cadre de cette thèse possède de bonnes propriétés psychométriques et permet une évaluation globale de la posture. 
Cet outil devrait contribuer à l’amélioration de la pratique clinique en facilitant l’analyse de la posture debout et assise. Cet outil s’avère une alternative clinique pour suivre l’évolution de la scoliose thoracique et diminuer la fréquence des radiographies au cours du suivi de jeunes atteints d’une SI thoracique. Cet outil pourrait aussi être utile pour vérifier l’efficacité des interventions thérapeutiques sur la posture.
Idiopathic scoliosis (IS) is characterized by three-dimensional (3D) deformity of the spine and rib cage which can increase during growth. The morphologic changes of the trunk result in posture asymmetries. Correction of posture is an important goal of physiotherapy interventions among persons with IS to prevent scoliosis progression, to reduce morphologic deformities and their impact on quality of life. Currently, there are no tools that globally quantify changes in posture that may be attributable to scoliosis progression or to treatment effectiveness, that are usable in a clinical setting. The objective of this thesis was thus to develop and validate a new clinical quantitative posture assessment tool among persons with IS. More specifically, this project aims to determine reliability and concurrent validity of posture indices (PI) of this new tool and to verify their capacity to detect changes between standing and sitting positions. We conducted a literature review and selected 34 PI representing frontal and sagittal alignment of the different body segments. We constructed a software-based quantitative posture assessment tool to calculate different PI (angular and linear measurements). The software has a user-friendly graphical interface and allows calculation of PI from a set of markers selected interactively on digital photographs. For the reliability and validity studies, standing posture of 70 participants aged 10 to 20 years old with IS (Cobb angle: 15º to 60º) was assessed on two occasions by two physiotherapists. Markers placed on several bony landmarks as well as natural reference points (eyes, ear lobe, etc.) were used to measure the PI from photographs with the 2D tool and to calculate 3D PI obtained from trunk reconstructions with a surface topography system. Frontal and sagittal Cobb angles and trunk list were also calculated on radiographs. 
Generalizability theory was used to estimate the reliability and standard error of measurement (SEM) of the PI of the 2D tool. Pearson correlation coefficients served to estimate the concurrent validity of the 2D trunk PI with the corresponding 3D PI and with those obtained from radiographs. Fifty participants were assessed for the comparative study between standing and sitting positions. We compared the average values of each PI in standing and long sitting positions using paired t-tests. Our results show a good level of reliability for the majority of the PI of the 2D tool. Correlation between 2D and 3D PI was good for shoulder, scapula, trunk list, waist angles, thoracic scoliosis and pelvis but fair to moderate for thoracic kyphosis, lumbar lordosis and thoracolumbar or lumbar scoliosis. The correlation between 2D and radiograph measurements was good for trunk list, thoracic scoliosis and thoracic kyphosis. Our tool can detect differences between standing and sitting posture for ten out of thirteen PI. A few recommendations specific to this work are: camera height should be adjusted according to the subject's height; training of judges is important to maximize accuracy in placement of markers; and measurement of sagittal vertebral curves may be facilitated by using markers mounted on pins. In conclusion, the tool developed in this thesis has good psychometric properties to evaluate posture. This tool should contribute to clinical practice by facilitating the analysis of standing and sitting posture. This tool may also be a good alternative to monitor thoracic scoliosis progression in a clinical setting and may contribute to a reduction in the use of x-rays in the follow-up of youths with thoracic IS. It may also be useful to verify the effectiveness of therapeutic interventions on posture.

34

Arsenault, Frédéric. "Validation de la reproductibilité d’outils de mesure de la fraction d’éjection du ventricule gauche en médecine nucléaire." Thèse, 2016. http://hdl.handle.net/1866/16254.

Abstract:

La fraction d’éjection du ventricule gauche est un excellent marqueur de la fonction cardiaque. Plusieurs techniques invasives ou non sont utilisées pour son calcul : l’angiographie, l’échocardiographie, la résonnance magnétique nucléaire cardiaque, le scanner cardiaque, la ventriculographie radioisotopique et l’étude de perfusion myocardique en médecine nucléaire. Plus de 40 ans de publications scientifiques encensent la ventriculographie radioisotopique pour sa rapidité d’exécution, sa disponibilité, son faible coût et sa reproductibilité intra-observateur et inter-observateur. La fraction d’éjection du ventricule gauche a été calculée chez 47 patients à deux reprises, par deux technologues, sur deux acquisitions distinctes selon trois méthodes : manuelle, automatique et semi-automatique. Les méthodes automatique et semi-automatique montrent dans l’ensemble une meilleure reproductibilité, une plus petite erreur standard de mesure et une plus petite différence minimale détectable. La méthode manuelle quant à elle fournit un résultat systématiquement et significativement inférieur aux deux autres méthodes. C’est la seule technique qui a montré une différence significative lors de l’analyse intra-observateur. Son erreur standard de mesure est de 40 à 50 % plus importante qu’avec les autres techniques, tout comme l’est sa différence minimale détectable. Bien que les trois méthodes soient d’excellentes techniques reproductibles pour l’évaluation de la fraction d’éjection du ventricule gauche, les estimations de la fiabilité des méthodes automatique et semi-automatique sont supérieures à celles de la méthode manuelle.
Left ventricular ejection fraction is an excellent indicator of cardiac function. Many invasive and non-invasive techniques can be used for its assessment: angiography, echocardiography, cardiac MRI, computed tomography of the heart, multigated radionuclide angiography and myocardial perfusion imaging. More than 40 years of scientific publications have praised multigated radionuclide angiography for its execution speed, its availability, its low cost and its intrarater and interrater reproducibility. The left ventricular ejection fraction was calculated twice for 47 patients, using two raw data acquisitions, two technologists and three software platforms: one fully manual, one semi-automatic and one fully automatic. In general, the automatic and semi-automatic methods showed greater reproducibility, a smaller standard error of measurement and a smaller minimal detectable change than the manual method, whereas the manual method gave systematically and significantly lower values than the other two methods. It was the only technique that showed a significant intrarater difference, and its standard error of measurement and minimal detectable change were 40% to 50% higher than those of the automatic and semi-automatic methods. Even though all three techniques are excellent and reliable options, reliability coefficient estimates were superior for the automatic and semi-automatic methods as compared to the manual method.
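
The two precision indices compared in this abstract, the standard error of measurement and the minimal detectable change, are often computed from a reliability coefficient with the conventional formulas SEM = SD * sqrt(1 - R) and MDC95 = 1.96 * sqrt(2) * SEM. The Python sketch below assumes those conventional formulas; the abstract does not state which formulas the thesis used, and the function names and input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM from the score standard deviation and a reliability coefficient."""
    return sd * math.sqrt(1.0 - reliability)


def minimal_detectable_change(sem, z=1.96):
    """Smallest change between two measurements that exceeds
    measurement error at the confidence level implied by z."""
    # sqrt(2) accounts for error in both the first and second measurement.
    return z * math.sqrt(2.0) * sem


# Hypothetical values: ejection fraction SD of 10 points, reliability 0.91.
sem = standard_error_of_measurement(sd=10.0, reliability=0.91)
mdc = minimal_detectable_change(sem)
print(round(sem, 2), round(mdc, 2))
```

Since the minimal detectable change is a fixed multiple of the SEM under these formulas, a method with a 40% to 50% larger SEM has a minimal detectable change that is larger by the same proportion, which matches the pattern described above.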
