Submitted by: Camille Ann Bolos
__________________________________________________________________
Multiple Choice Test Item Analysis
David Hamill and Paul Usala
· What is Item Analysis?
o “The term ‘Item Analysis’ refers to a loosely structured group of statistics that can be computed for each item in a test.” (Murphy & Davidshofer, 1991)
· Why Analyze Test Items?
o To identify:
§ Potential mistakes in the scoring program
§ Ambiguous items
§ Equal distribution of all alternatives
§ Alternatives that don’t work
§ “Tricky” items
§ Problems with time limits
§ Potential organizational policy conflicts
· Assessment System Parameters
o Program Parameters
§ Assessments are used as part of a promotional assessment system
§ Each assessment is part of a comprehensive examination taxonomy
§ Structured method for test development
§ Continuous testing – no piloting of multiple-choice test items
§ Equate each new assessment with previous versions
· Assessment Process
o Develop Assessments
o Administer Assessments to Candidates
o Perform a “Key Clearance”
§ PB (point-biserial) correlations
§ Hi/Low statistics
§ Internal consistency correlations
o Score Assessments
o Equate with current version
· KEY CLEARANCE OVERVIEW
1. Review of PB Data
a. Does a correct response relate to overall test performance?
2. Examine Distractors and Item Difficulty
a. How many candidates responded correctly?
b. Which responses did high and low performers choose?
3. Confirm Decisions with the KR-20 Internal Consistency Estimate of Reliability
a. If the item is removed, will it increase the reliability of the assessment?
KEY CLEARANCE
· Item-Total Correlations (Point Biserial)
o Purpose: to identify items that correlate with overall test performance
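The item-total correlation above can be sketched in a few lines. This is a minimal illustration, not the authors' actual scoring program: it computes the point-biserial correlation between a dichotomous item (1 = correct, 0 = incorrect) and candidates' total scores, which for 0/1 items equals the ordinary Pearson correlation. (In practice the item is often removed from the total first, a "corrected" item-total correlation; that refinement is omitted here.)

```python
import math

def point_biserial(item_scores, total_scores):
    """Correlation between a dichotomous item (0/1) and total test score."""
    n = len(item_scores)
    correct = [t for i, t in zip(item_scores, total_scores) if i == 1]
    p = len(correct) / n                 # proportion answering correctly
    q = 1 - p
    mean_correct = sum(correct) / len(correct)
    mean_all = sum(total_scores) / n
    # Population standard deviation of the total scores
    sd = math.sqrt(sum((t - mean_all) ** 2 for t in total_scores) / n)
    return (mean_correct - mean_all) / sd * math.sqrt(p / q)
```

A low or negative point-biserial flags an item for key clearance review: it may be mis-keyed, ambiguous, or unrelated to what the rest of the test measures.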
· Distractors and Item Difficulty
o Difficulty analysis: to identify the proportion of examinees who choose the correct response option (p-value)
· Distractors and Item Difficulty
o Method of extreme groups: categorizes test takers into groups and compares item responses
o Split test takers into 3 groups based on scores:
§ Upper group (usually top 27%-35%)
§ Middle group
§ Lower group (usually bottom 27%-35%)
· KR-20 Internal Consistency Reliability
o Identifies items that would increase overall test reliability if dropped from the battery