By Robert Krampf
Over the past few weeks, I have discovered some major scientific errors in the guidelines that are used to develop questions for the fifth and eighth grade Science FCAT tests.
The Science FCAT is Florida’s high-stakes test that assesses all the science concepts and information that students should have learned by the end of fifth grade. Schools and districts are subject to financial incentives or penalties, depending on their students’ FCAT scores, so this is a very important test.
A few weeks ago, I started developing FCAT practice questions to help students review concepts and prepare for the test. To develop those questions I used the Florida Department of Education’s FCAT 2.0 Science Test Item Specifications. These documents are used as “a resource that defines the content and format of the test and test items for item writers and reviewers.”
I expected the Test Item Specifications to be a tremendous help in writing simulated FCAT questions. What I found was a collection of poorly written examples, multiple-choice questions where one or more of the wrong responses were actually scientifically correct answers, and definitions that ranged from misleading to totally wrong.
I suggest that you read over the entire document, but here are a few of the problems that I found in the FCAT 2.0 Science Test Item Specifications for grade 5.
A glossary of definitions (Appendix C) is provided for test item writers to indicate the level of understanding expected of fifth grade students. Included in that list is the following definition:
Predator—An organism that obtains nutrients from other organisms.
By that definition, cows are predators because they obtain nutrients from plants. The plants are predators too, since they obtain nutrients from decaying remains of other organisms. I have yet to find anyone who thinks that this is a proper definition of a predator. In the same list we find:
Germination—The process by which plants begin to grow from seed to spore or from seed to bud.
There are no plants that grow from seed to spore. The mistakes in these definitions are not technicalities. They are errors that any fourth grade science teacher would catch. How did they make it past scientific review?
Another problem appears in Sample Item 2 for SC.5.N.1.6 (page 32), which assesses the following benchmark:
SC.5.N.1.6: Recognize and explain the difference between personal opinion/interpretation and verified observation.
This sample question offers the following observations, and asks which is scientifically testable.
1. The petals of red roses are softer than the petals of yellow roses.
2. The song of a mockingbird is prettier than the song of a cardinal.
3. Orange blossoms give off a sweeter smell than gardenia flowers.
4. Sunflowers with larger petals attract more bees than sunflowers with smaller petals.
The document indicates that 4 is the correct answer, but answers 1 and 3 are also scientifically testable.
For answer 1, the Sunshine State Standards list texture as a scientifically testable property in the third grade (SC.3.P.8.3), fourth grade (SC.4.P.8.1), and fifth grade (SC.5.P.8.1), so even the State Standards say it is a scientifically correct answer.
For answer 3, smell is a matter of chemistry. Give a decent chemist the chemical makeup of the scent of two different flowers, and she will be able to tell you which smells sweeter without ever smelling them.
While this question has three correct answers, any student who answered 1 or 3 would be graded as getting the question wrong. Why use scientifically correct “wrong” answers instead of using responses that were actually incorrect? Surely someone on the Content Advisory Committee knew enough science to spot this problem.
For this one, you have to go to the document and scroll down to Sample Item 7 for SC.4.E.6.2 (page 42). There is nothing in the drawing or written information that indicates whether the square object is a streak plate (to test streak) or a glass plate (to test hardness). Scratching a glass plate is one of the most common tests for hardness, and it appears as a graphic or photograph in most textbook units on minerals. C would be just as valid an answer as B, but a student who answered C would be graded as giving a wrong answer. This flaw could easily have been avoided by simply not listing hardness as one of the choices.
These are just a few of the problems that I found. I contacted FLDOE’s Test Development Center, and sent them a list of the errors. Their response for error after error was: “This item was reviewed and deemed appropriate by our Content Advisory Committee.”
I asked for contact information of someone from the Content Advisory Committee, so I could find out how these errors made it past scientific review. Steve Ash, Executive Director of the Test Development Center, told me that FLDOE would not give out that information.
Even more troubling was their response to the example questions that had more than one correct answer. In response to the example above for SC.5.N.1.6, Christopher Harvey, the Mathematics and Science Coordinator at the Test Development Center, told me: “We need to keep in mind what level of understanding 5th graders are expected to know according to the benchmarks. We cannot assume they would receive instruction beyond what the benchmark states. Regarding #1 – While I don’t disagree with your science, the benchmarks do not address the hardness or softness of rose petals. We cannot assume that a student who receives instruction on hardness of minerals would make the connection to other materials. The Content Advisory committee felt that students would know what flowers were and would view this statement as subjective. Similarly with option 3, students are not going to know what a gas chromatograph is or how it works. How a gas chromatograph works is far beyond a 5th grade understanding and is not covered by the benchmarks. As you stated most Science Supervisors felt that student would not know this property was scientifically testable. The Content Advisory Committee also felt that 5th graders would view this statement as subjective. We cannot assume that student saw a TV show or read an article.”
In response to my comments on Sample Item 7 for SC.4.E.6.2 (page 42), I was told: “Here again I don’t disagree with your science; however, elementary educators consistently told us that glass plates are not used in elementary classrooms for safety reasons. They did not feel that 5th graders would be familiar with using glass plates to test hardness.”
So according to the Test Development Center, it appears that it is acceptable to use scientifically correct answers for wrong responses on the Science FCAT as long as FLDOE does not expect a fifth grader to be educated enough to realize that the wrong answers are scientifically correct.
I wonder how many students got “wrong” answers on the FCAT because their teachers taught them too much. How many “F” schools would have higher grades if those scientifically correct “wrong” answers were counted as correct answers? How many “B” schools would get the extra funding that “A” schools get, if those scientifically correct “wrong” answers were counted as correct answers?
We may never know the answers to those questions. The Test Item Specifications are the guidelines that are used to write the test questions. If the Science FCAT test is reviewed by the same Content Advisory Committee that reviewed the Test Item Specifications, then it probably has similar errors. But as much as I would love to check the accuracy of the questions from the actual Science FCAT, I can’t. Teachers, scientists, and the general public are not allowed to see actual test questions, even after the tests have been graded and the penalties for those grades have been imposed.