Engineering Good Math Tests

By Hugh Burkhardt, October 02, 2012

Narrow math tests inevitably drive down real standards because accountability pressures principals and teachers to teach to the test. Conversely, well-engineered tests of the math we actually want people to study and learn raise standards. It may not surprise you that high-performing countries such as Singapore have better mathematics, as defined in the Common Core State Standards, on their tests than the United States does. More surprisingly, good tests are less expensive in real terms.

There are worrying signs that the actual common-core assessments will be too close to "business as usual," albeit computerized. If so, most U.S. students and future citizens will be condemned to further mediocrity in mathematics.

The need for better tests is accepted by business, industry, and government. In 2009, President Barack Obama called on "our nation's governors and state education chiefs to develop standards and assessments that don't simply measure whether students can fill in a bubble on a test, but whether they possess 21st-century skills like problem-solving and critical thinking, entrepreneurship, and creativity."

Since then, the states have led the development of common standards in mathematics that embody this broader vision, and two consortia of states, the Smarter Balanced Assessment Consortium, or SBAC, and the Partnership for Assessment of Readiness for College and Careers, or PARCC, have been funded to develop assessments aligned with the standards. Much progress has been made.

Everyone accepts that, when used as the linchpin of accountability, tests are not "just" measurement, but often direct the efforts of school employees and dominate what is taught in classrooms. The SBAC "content specification" for the common-core math assessment (which I helped write) features problem-solving and modeling with mathematics, reasoning, and critiques of reasoning, alongside the concepts and skills needed to make these possible.

Crucially, it also includes many examples of assessment tasks that show how these principles have been realized in math examinations in the United States and around the world. Examples are harder to misinterpret than descriptions. Teachers, students, and citizens understand that items on the tests represent the types of tasks students must learn to do.

The feedback to SBAC on this content specification has been overwhelmingly positive. So what is the problem?

A strong undertow of fear appears to be pulling the system back to the familiar. This is a test of our courage, a test our tests may fail.

There is growing concern that test implementation will be a third-rate realization of the common core, that the design and "engineering" will not be good enough. The problems seem to be caused by a mixture of fear and lack of experience and by a decisionmaking structure unsuitable for innovation. State assessment directors are fearful of cost and litigation if their well-oiled testing systems, already sometimes controversial, have to change. High-quality examinations that cover the common core and meet international standards are outside their experience and their zone of comfort.

In high-performing countries, mathematics-curriculum experts have final say on the problems and scoring of the examinations. Psychometricians are technical advisers. In the United States, the practice has been turned upside down: Psychometricians too often have the final say on the items in a test, while the mathematics experts play a secondary role. SBAC and PARCC continue this upside-down tradition that values technical measurement above accountability for teaching and learning the core mathematics in the standards.

What is the problem with current tests? Multiple-choice tests and their latest variant, computer-adaptive tests, measure with many very short items. The grain size of these items is much smaller than the basic concepts of mathematics. The items have a very indirect relationship to the targets of instruction: the math in the standards. In mathematical reasoning and problem-solving, the whole is more than the sum of the parts. This is recognized in English/language arts, where we assess substantial pieces of reading and writing.

To find out if students can do mathematics, we need to find how well they can create, critique, and explain substantial chains of reasoning. Multiple-choice tests cannot handle this, nor can their computer-based variants. When you look at the "technology-enhanced items" designed to assess "depth of knowledge," you find that potentially rich tasks have been broken into sequences of short items. This ignores the real target: chains of student reasoning that may take diverse paths and be expressed with words, sketch diagrams, and symbols in diverse ways. Mathematics is not treated as a coherent body of mathematical content and practices, but as fragments indirectly related to the target knowledge. This makes a test that defines the targets of instruction invalid.

It is easy to do better. You ask students to tackle tasks that represent the kinds of performance that you really want them to be able to do, not proxy tasks that are easy to assess. As with writing, you have them scored by trained human beings using specific rubrics for each task that award points for the core elements of performance. You audit the process to ensure reliable scoring. This is the way examinations are run in other advanced countries.

A well-made test matches the depth and balance of the learning targets. This involves selecting an appropriate balance of short items and substantial performance tasks so that teachers who teach to the test, as most teachers will, are led to deliver a balanced curriculum that reflects the standards. This needs a "mathematics board," a body whose members are experts in math education and mathematics. The consortia should establish such panels for task selection and test balancing.

Where will the tasks come from? Designing accessible assessment tasks that demand substantial chains of reasoning is a challenging area of educational design. Test vendors have little experience, and the skills do not come quickly. However, there is a large international literature of well-engineered tasks across this range that can be licensed for use in tests. (Disclosure: A project in which I am involved, the Mathematics Assessment Resource Service, or MARS, is one not-for-profit source.)

And what of cost? Vendors charge a dollar or two for traditional tests, and they only need a class period of testing time. People are rightly concerned at "wasting" teaching and learning time. Yet ask teachers how much time they spend on otherwise unproductive test preparation. Typical responses are that test prep for state tests takes 20 days a year. That's more than 10 percent of teachers' time and, worse, more than 10 percent of the students' learning time. This is the real cost of aiming at a cheap target.

Good tests cost a bit more than computer-based tests. How much depends on how you manage them. One inexpensive model is to make scoring training and actual scoring part of each teacher's job. This is high-quality professional development, showing teachers what is valued in math performance and what other students can do. If this takes two days a year, you are still well ahead on the test-prep clock, with many more days for real teaching and learning than with artificial tests.
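
The time comparison above can be checked with a little arithmetic. The sketch below is only illustrative: it assumes a 180-day school year, a figure that does not appear in the article but is common in U.S. states, and it assumes the two scoring days fully replace the 20 days of test prep. The 20-day and two-day figures come from the text; the variable and function names are hypothetical.

```python
# Assumption (not stated in the article): a 180-day school year.
SCHOOL_YEAR_DAYS = 180

# Figures taken from the article.
TEST_PREP_DAYS = 20  # typical reported test prep for state tests
SCORING_DAYS = 2     # teacher training and scoring for performance tasks


def share_of_year(days: float, year_days: int = SCHOOL_YEAR_DAYS) -> float:
    """Return the fraction of the school year that `days` represents."""
    return days / year_days


prep_share = share_of_year(TEST_PREP_DAYS)
scoring_share = share_of_year(SCORING_DAYS)

# Days returned to teaching if scoring time replaces test-prep time.
days_recovered = TEST_PREP_DAYS - SCORING_DAYS

print(f"Test prep: {prep_share:.1%} of the year")
print(f"Scoring:   {scoring_share:.1%} of the year")
print(f"Days returned to teaching: {days_recovered}")
```

Under that assumed year length, 20 days is about 11 percent of the year, consistent with the "more than 10 percent" claim, while two days of scoring is roughly 1 percent.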

What about test prep for good tests? With "tests worth teaching to," that is something you want. Test tasks are valuable learning experiences. The test itself is not a waste of learning time; it is instead exactly the task for which teaching prepares you.

A version of this article appeared in the October 03, 2012 edition of Education Week as Engineering Good Math Tests.
