“Education is the most powerful weapon you can use to change the world.” –Nelson Mandela
This past week the boxes of End of Course Assessments were carted into HSHS and took up residence in Test Central, the Small Office Conference Room. Beginning Tuesday, our students will take these high-stakes tests. While sorting and packing the ECAs, I tend to ruminate on how testing has changed education.
No question about it: our current students are the most tested generation ever.
Like most educators, I have some biases about testing, and I am asked about them from time to time, often when I am unprepared for the conversation. If this happens to you, the following three points about assessments in general, and high-stakes testing specifically, may be useful.
1) ECAs Do What They Were Designed to Do
End of Course Assessments do a good job of measuring what they were designed to measure. They are very good indicators of whether students have reached minimum competency in three content areas: English/Language Arts, Biology, and Algebra I. But let’s be clear: this is all they were designed to do. They are not diagnostic in nature, and they are not designed to measure teacher or school effectiveness. Certainly, there are attempts to use these tests for other purposes, but they were designed to measure minimum student achievement in three specific areas.

Sometimes I think we lose track of this fact, and it is helpful to keep their designed purpose in mind.
2) One Problem with Single Assessments: Margin of Error
Single assessments are all imprecise to some degree. Imprecision in assessments can result from many factors: poorly constructed test items, lack of student attention or effort, and mistakes in scoring or grading. The impact of this imprecision on the interpretation of test scores is often surprising to educators and completely startling to non-educators. To illustrate, Robert Marzano uses this equation:
Observed Score = True Score + Error Score
He explains:

This equation indicates that a student’s observed score on an assessment (the final score on the assessment) consists of two components—the student’s true score and the student’s error score. The student’s true score is that which represents the student’s true level of understanding or skill regarding the topic being measured. The error score is the part of an observed score that is due to factors other than the student’s level of understanding or skill.
In other words, error is inherent in the score assigned to a student on every assessment. In his book Formative Assessment and Standards-Based Grading, Robert Marzano provides this chart to show how dramatic the impact of error can be.
Reliability of Assessment | Score Student Receives on the Assessment | Lowest Possible Score | Highest Possible Score | Range
0.85 | 70 | 60 | 80 | 20
0.75 | 70 | 58 | 82 | 24
0.65 | 70 | 56 | 84 | 28
0.55 | 70 | 54 | 86 | 32
0.45 | 70 | 52 | 88 | 36
Consider this: the typical reliability of a state standardized test is 0.85, which is very good. Again, I defer to Marzano’s chart and his own words: “For an assessment with a reliability of 0.85 and an observed score of 70, one would be 95 percent sure the student’s true score is anywhere between a score of 60 and 80.” Reread that line until it is clear, because this is a really important concept about a single-shot assessment.
Another way to look at the same numbers: using our typical in-house grading scale, we could be fairly confident that a student with an observed score of 70 has a true score somewhere between a D- and a B-.
Spend some time considering this chart and the idea of reliability, and I think you will agree that good assessment and grading practice requires multiple assessments to get us closer to identifying a student’s true ability.
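For anyone who wants to see where the ranges in the chart come from, they follow the standard formula from test theory for the standard error of measurement: SEM = SD × √(1 − reliability), with a 95 percent interval of roughly ±2 SEM around the observed score. The short sketch below is my own illustration, not from Marzano’s book, and it assumes a score standard deviation of about 13 points (an assumed figure chosen so the results roughly reproduce the chart).

```python
import math

def true_score_range(observed, reliability, sd=13.0, z=1.96):
    """Approximate 95% confidence interval for a student's true score.

    SEM = sd * sqrt(1 - reliability); interval = observed +/- z * SEM.
    The sd=13.0 default is an assumed score spread, used only for illustration.
    """
    sem = sd * math.sqrt(1.0 - reliability)
    half_width = z * sem
    return observed - half_width, observed + half_width

# An observed score of 70 on tests of varying reliability:
for r in (0.85, 0.75, 0.65, 0.55, 0.45):
    low, high = true_score_range(70, r)
    print(f"reliability {r:.2f}: true score likely between {low:.0f} and {high:.0f}")
```

Notice that the interval widens as reliability drops: the same observed 70 tells us less and less about the student’s true ability.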
3) It’s Not Just Standardized Tests: We Face the Same Problems in Our Classrooms
Bob Marzano doesn’t let us off easily at the school level either. He states that the highest reliability we can expect from an assessment designed by a teacher, school, or district is 0.75. Again, that is very good reliability for a school-based assessment. The research from Marzano Laboratories, however, shows that the typical reliability of a classroom assessment is 0.45. Using the same example from above, this means that with a typical classroom assessment (reliability of 0.45 and a range of 36) we can say with confidence that a student with an observed score of 70 is likely to have a true score between 52 and 88. On a typical grading scale, that falls somewhere between an F and a B+. That’s a bit frightening!
The good news is that our common practice is to gather multiple assessment points rather than rely on a single exam. The more evidence we gather, the better our understanding of a student’s true ability. This is good practice, and it is why multiple forms of assessment are more reliable than any single assessment.
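Classical test theory even lets us put a number on how much multiple assessments help. The Spearman-Brown prediction formula (a standard result from test theory, not something cited in Marzano’s book) estimates the reliability of the average of k comparable assessments from the reliability of a single one. A minimal sketch:

```python
def spearman_brown(reliability, k):
    """Predicted reliability of the average of k parallel assessments,
    each with the given single-assessment reliability (Spearman-Brown)."""
    return k * reliability / (1 + (k - 1) * reliability)

# Start from a typical 0.45-reliability classroom assessment and average more of them:
for k in (1, 2, 4, 8):
    print(f"{k} assessment(s): combined reliability {spearman_brown(0.45, k):.2f}")
```

Under these assumptions, averaging even a handful of 0.45-reliability assessments pushes the combined reliability well past what any single classroom test can achieve, which is exactly the argument for multiple assessment points.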
In Summary
If Marzano is right—and let’s face it, he is perhaps the leading educational researcher of the past two decades—his findings have major implications for our grading practice. But that is a discussion for another day. The topic for today is testing, especially high-stakes testing. Marzano’s research makes a very strong argument for using multiple and varied assessments at the classroom level, and it also reminds us exactly what our state standardized tests do well and what they may or may not tell us about student performance.
Robert Marzano’s argument is neither for nor against state standardized testing. Rather, he encourages us to be informed about the science of testing and to communicate accurate information to people outside of education, especially those making decisions about how test results will be used.
I hope this helps you talk to friends, family, and acquaintances about these high-stakes tests, and I hope this week of giving ECAs doesn’t test your patience too much.
Keep fighting the good fight, HSE.

Phil
I started this week’s memo with a few words from Nelson Mandela, a man who exemplified to the world a life well-lived. So kudos this week go to our Set a Good Example students. I love walking by the display case near the media center and seeing pictures of our SAGE students. It’s a gift to have kids like this at our school. We have the opportunity to teach them and to learn from them.
“There is no passion to be found playing small—in settling for a life that is less than the one you are capable of living.” –Nelson Mandela