Friday, December 6, 2013

Test Central

“Education is the most powerful weapon you can use to change the world.” –Nelson Mandela

This past week the boxes of End of Course Assessments (ECAs) were carted into HSHS and took up residence in Test Central, the Small Office Conference Room.  Beginning this Tuesday, our students will take these high-stakes tests.  While sorting and packing the ECAs, I tend to ruminate on how testing has changed education.

No question about it: Our current students are the most tested generation ever.

Like most educators, I have some biases about testing and am asked about these at times, often when I am unprepared for the conversation.  If this happens to you, the following three points about assessments in general and high-stakes testing specifically may be useful.

1) ECAs Do What They Were Designed to Do

End of Course Assessments do a good job of measuring what they were designed to measure. They are very good indicators of whether or not students have reached minimum competency in three content areas: English/Language Arts, Biology, and Algebra I.  But let’s be clear: This is all they were designed to do.  They are not diagnostic in nature, and they are not designed to measure teacher or school effectiveness.  Certainly, there are attempts to use these tests for other purposes, but they were designed to measure minimum student achievement in three specific areas. 

Sometimes I think we lose track of this fact, and it is helpful to keep their designed purpose in mind.

2) One Problem with Single Assessments: Margin of Error

Single assessments are all imprecise to a certain degree.  Imprecision can result from many factors: poorly constructed test items, a student’s lack of attention or effort, or mistakes in scoring or grading.  The impact of imprecision on the interpretation of test scores is often surprising to educators and completely startling to non-educators.  To illustrate, Robert Marzano uses this equation:

Observed Score = True Score + Error Score

He explains: 

This equation indicates that a student’s observed score on an assessment (the final score on the assessment) consists of two components—the student’s true score and the student’s error score.  The student’s true score is that which represents the student’s true level of understanding or skill regarding the topic being measured.  The error score is the part of an observed score that is due to factors other than the student’s level of understanding or skill.

In other words, error is inherent in the score assigned to a student on every assessment.  In his book Formative Assessment & Standards-Based Grading, Robert Marzano provides this chart to show how dramatic the impact of error can be.

Reliability of Assessment | Score Student Receives on the Assessment | Lowest Possible Score | Highest Possible Score | Range
0.85 | 70 | 60 | 80 | 20
0.75 | 70 | 58 | 82 | 24
0.65 | 70 | 56 | 84 | 28
0.55 | 70 | 54 | 86 | 32
0.45 | 70 | 52 | 88 | 36

Consider this: The typical reliability of a state standardized test is 0.85, which is very good.  Again, I defer to Marzano’s chart and his own words: “For an assessment with a reliability of 0.85 and an observed score of 70, one would be 95 percent sure the student’s true score is anywhere between a score of 60 and 80.”  Reread that line until it is clear because this is a really important concept about a single-shot assessment. 

Another way to look at the same numbers: Using our typical in-house scoring scale, we could be 95 percent sure that a student with an observed score of 70 has a true score somewhere between a D- and a B-.

Spend some time considering this chart and the idea of reliability, and I think you will agree that good assessment and grading practice requires multiple assessments to get us closer to identifying a student’s true ability.
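For anyone who likes to see the idea in action, here is a minimal Python sketch of the equation above (Observed Score = True Score + Error Score).  It is only an illustration, not Marzano’s method: the true score of 70, the error spread of 5 points, and the assumption that error is random and normally distributed are all made up for the example.  It compares how widely single scores scatter with how widely the average of five scores scatters.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_assessments, n_trials, seed=0):
    """Return one averaged observed score per simulated student.

    Each single score follows Observed = True + Error, where Error is drawn
    from a normal distribution (an assumption made only for this sketch).
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(n_trials):
        scores = [true_score + rng.gauss(0, error_sd) for _ in range(n_assessments)]
        averages.append(statistics.mean(scores))
    return averages


# Hypothetical values: a true score of 70 and an error spread of 5 points.
single = simulate_observed_scores(70, 5, n_assessments=1, n_trials=2000)
averaged = simulate_observed_scores(70, 5, n_assessments=5, n_trials=2000)

spread_single = statistics.pstdev(single)
spread_averaged = statistics.pstdev(averaged)

print(f"Spread of single scores:       {spread_single:.2f}")
print(f"Spread of five-score averages: {spread_averaged:.2f}")
```

Under these assumptions the averaged scores cluster more tightly around the true score than the single scores do, which is the argument for multiple assessments in miniature.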

3) It’s Not Just Standardized Tests: We Face the Same Problems in Our Classrooms

Bob Marzano doesn’t let us off easily at the school level either.  He states that the highest reliability we can expect from an assessment designed by a teacher, school, or district is 0.75.  Again, this is very good reliability for a school-based assessment.  Research from Marzano Laboratories, however, indicates that the typical reliability for classroom assessments is 0.45.  Using the same example as above, this means that with the typical classroom assessment (reliability of 0.45, range of 36) we can say with confidence that a student with an observed score of 70 is likely to have a true score between 52 and 88.  On a typical grading scale, this falls somewhere between an F and a B+.  That’s a bit frightening!
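To see how each row of the chart translates into letter grades, here is a small Python sketch.  The grading scale in it is a hypothetical, commonly used plus/minus scale, not necessarily our in-house one, so treat the cutoffs as an assumption.  The chart rows are copied from Marzano’s chart above.

```python
# Hypothetical plus/minus grading scale: (lowest score in band, letter).
# This is an assumed scale for illustration, not an official one.
GRADE_FLOORS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]

# Rows from the chart: (reliability, lowest possible, highest possible)
# for an observed score of 70.
CHART = [
    (0.85, 60, 80),
    (0.75, 58, 82),
    (0.65, 56, 84),
    (0.55, 54, 86),
    (0.45, 52, 88),
]


def letter_grade(score):
    """Return the letter for a score under the assumed scale; below 60 is an F."""
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"


def grade_span(low, high):
    """Return the letter grades at the bottom and top of a score interval."""
    return letter_grade(low), letter_grade(high)


for reliability, low, high in CHART:
    low_grade, high_grade = grade_span(low, high)
    print(f"Reliability {reliability:.2f}: {low}-{high} is {low_grade} to {high_grade}")
```

With this assumed scale, the 0.85 row spans a D- to a B- and the 0.45 row spans an F to a B+, matching the grade ranges described above.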

The good news is that our common practice is to have multiple assessment points, rather than just one exam.  The more evidence we gather, the better our understanding of a student’s true ability.  This is good practice, and it is why multiple forms of assessment are more reliable than any single assessment.
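One rough way to put numbers on this is the Spearman-Brown formula from classical test theory, which predicts the reliability of several comparable assessments combined.  This formula is not taken from Marzano’s chart, and it assumes the assessments measure the same skill equally well, which real classroom assessments only approximate.  The sketch below starts from the typical classroom reliability of 0.45 cited above; the function name is my own.

```python
def combined_reliability(single_reliability, n_assessments):
    """Spearman-Brown prediction for n comparable assessments combined.

    Assumes every assessment has the same reliability and measures the
    same skill, which is an idealization.
    """
    r = single_reliability
    return (n_assessments * r) / (1 + (n_assessments - 1) * r)


# Start from the typical classroom reliability of 0.45 and add assessments.
for n in (1, 2, 4, 8):
    print(f"{n} assessment(s): predicted reliability {combined_reliability(0.45, n):.2f}")
```

Under those assumptions, about four typical classroom assessments taken together would be predicted to reach the 0.75 that Marzano describes as the best we can expect from a single school-designed assessment.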

In Summary

If Marzano is right (and let’s face it, he is perhaps the leading educational researcher of the past two decades), his findings have major implications for our grading practice.  But that is a discussion for another day.  The topic for today is testing, especially high-stakes testing.  Marzano’s research makes a very, very strong argument for using multiple and varied assessments at the classroom level, and it also reminds us exactly what our state standardized tests do well and what they may or may not tell us about student performance.

Robert Marzano’s argument is neither for nor against state standardized testing.  Rather, he encourages us to be informed about the science of testing and to communicate accurate information to people outside of education, especially those making decisions about how test results will be used.

I hope this helps you talk to friends, family, and acquaintances about these high-stakes tests, and I hope this week of giving ECAs doesn’t test your patience too much.

Keep fighting the good fight, HSE.

Phil

I started this week’s memo with a few words from Nelson Mandela, a man who exemplified to the world a life well-lived.  So kudos go this week to our Set a Good Example (SAGE) students.  I love walking by the display case by the media center and seeing pictures of our SAGE students.  It’s a gift to have kids like this at our school.  We have the opportunity to teach them and learn from them.


“There is no passion to be found playing small—in settling for a life that is less than the one you are capable of living.”  --Nelson Mandela
