jnl:hobart2009

Improving the evaluation of therapeutic interventions in multiple sclerosis: The role of new psychometric methods.

Chapter 1

New psychometric methods

Rasch measurement vs Item response theory

Traditional psychometric methods

psychometric: originally from psychology, but now applied more broadly
psychometric methods: methods for developing and evaluating rating scales, and for analyzing their data

Rating scales

for properties that we can't measure directly e.g. happiness, disability (vs weight)
latent traits: “inferred variables”, aka 'property', 'trait', 'concept', 'aspect'
Dichotomous (2 response categories, e.g. RMI) vs polytomous (3 or more)
Direction of scoring can also be opposite: RMI vs MSIS-29

Evaluation of a rating scale

 "extent to which a quantitative conceptualization has been operationalized successfully"

Traditional 3: reliability, validity, responsiveness
Hobart 6:

Data quality
scaling (assumptions)
targeting
reliability
validity
responsiveness

Classical Test Theory (Assumptions x 5) & limitations

Weak true score theory
'weak assumptions lead to weak conclusions'
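
For reference, the core decomposition behind true score theory can be written as below. This is the standard textbook form, not something quoted from the monograph; the symbols X, T and E are my own labels, and the five assumptions are not listed in these notes, so they are not reproduced here.

```latex
% Observed score = true score + random error (standard CTT form; symbols assumed)
X = T + E
% Reliability as the share of observed-score variance due to true scores
r_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
```

T and E are never observed separately, which is one way to read 'weak': the decomposition cannot be checked against the data.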

Limitations of traditional psychometric methods

ordered counts are not interval measures
results for scales are sample dependent (therefore unstable)
results for samples are scale dependent
missing data
SEM (standard error of measurement)
scaling
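
A minimal sketch of the SEM point, reading SEM as the standard error of measurement (my assumption) and using the usual classical formula SEM = SD × sqrt(1 − reliability). The function name and the numbers are hypothetical, chosen only for illustration.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical SEM: one value for the whole sample, derived from SD and reliability."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical scale: total-score SD of 10 points, reliability of 0.91.
sem = standard_error_of_measurement(sd=10.0, reliability=0.91)
print(round(sem, 2))  # about 3 points, applied to every person alike
```

The same SEM is attached to every person regardless of where they sit on the scale, which is presumably why it appears in the list of limitations.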

Chapter 2 New Psychometric Methods 2021-03-13

Overview

Traditionally use Likert summated ratings –> cannot verify data
New: theories can be formally and rigorously tested
1. IRT & Rasch: different origins, perspectives and proponents
2. weak theory –> weak conclusions – we need strong theory (IRT and Rasch)
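
A small sketch of what a summated rating does, assuming a hypothetical four-item scale scored 1 to 4 (item count, scoring and names are mine, not from the source). Two quite different response patterns collapse to the same total, and nothing in the total says whether adding the items was legitimate.

```python
from typing import Sequence


def summated_score(responses: Sequence[int]) -> int:
    """Likert-style total: simply add the item scores."""
    return sum(responses)


# Two hypothetical respondents with very different item patterns.
pattern_a = [1, 1, 4, 4]
pattern_b = [2, 3, 2, 3]

total_a = summated_score(pattern_a)
total_b = summated_score(pattern_b)
print(total_a, total_b)  # identical totals despite different patterns
```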

Mathematical relations between variables and events enable checking, prediction, analysis (refine model, review data)

New methods also concern the relationship between the TRUE and OBSERVABLE score, BUT also

 "focus on the UNOBSERVABLE measurement on the underlying trait and the probability of responding to one of the response categories of a scale item"
 
 Therefore a focus change to ITEM rather than TOTAL SCORE
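
To make the item-level focus concrete, here is a minimal sketch of the dichotomous Rasch model, in which the probability of endorsing an item depends only on the difference between the person location and the item location (both in logits). The names `theta` and `b` are my labels, and the values are hypothetical.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """P(response = 1) for person location theta and item location b (logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# A person located exactly at the item has a 50% chance of endorsing it.
p_equal = rasch_probability(theta=0.0, b=0.0)

# A person one logit above the item is more likely than not to endorse it.
p_above = rasch_probability(theta=1.0, b=0.0)
print(p_equal, round(p_above, 3))
```

Polytomous items (3 or more categories) need an extended form of the model; this sketch covers only the two-category case.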

History

IRT: Louis Thurstone (University of Chicago), Frederic Lord (Educational Testing Service)
Rasch: Georg Rasch, Copenhagen - Poisson; each student as an individual
1. Stability: item locations and person locations could be estimated independently of each other
2. Mathematical model given primacy == 'justification for model selection is theoretical evidence of its suitability'
3. {if data doesn't fit model, change the data, not the model?}
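
The stability point (1) can be illustrated with the dichotomous Rasch model: the log-odds gap between two items comes out the same whichever person is used to compare them. This is a sketch under that model only; the function names and locations are hypothetical.

```python
import math


def log_odds(theta: float, b: float) -> float:
    """Log-odds of endorsing an item under the dichotomous Rasch model."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return math.log(p / (1.0 - p))


def item_gap(theta: float, b_easy: float, b_hard: float) -> float:
    """Difference in log-odds between two items for one person."""
    return log_odds(theta, b_easy) - log_odds(theta, b_hard)


# Hypothetical item locations, compared using three very different persons.
gaps = [item_gap(theta, b_easy=-0.5, b_hard=1.0) for theta in (-2.0, 0.0, 2.0)]
print([round(g, 6) for g in gaps])  # the same gap (b_hard - b_easy) each time
```

The person location cancels out of the comparison, which is the sense in which item locations can be estimated independently of the persons measured.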

Why Rasch is better than IRT

Source

Hobart, J., & Cano, S. (2009). Improving the evaluation of therapeutic interventions in multiple sclerosis: The role of new psychometric methods. Health Technology Assessment, 13(12). https://doi.org/10.3310/hta13120
