The Test Preparation Study

Thank you to the many teachers who participated in the Test Preparation Study!

The Test Preparation Study was conducted from 2004 to 2008 by researchers at the Ontario Institute for Studies in Education of the University of Toronto, with funding from the Social Sciences and Humanities Research Council of Canada. This is a summary of the results. For more information, please contact the study’s principal investigator, Professor Ruth Childs, at rchilds@oise.utoronto.ca or 416-978-1079.

Why Did We Study Test Preparation and Administration?

What Did We Ask Teachers?

What Did Teachers Tell Us?

Preparing Students for the Test

Administering the Test

The Purposes and Appropriateness of the Test

Teachers’ Roles

What Next?

Why Did We Study Test Preparation and Administration?

Teachers play critically important roles in preparing students for and administering large-scale assessments. Yet, these roles are poorly understood. In this study, we investigated how Grade 3 teachers in Ontario prepare their students for the provincially-mandated assessment of reading, writing and mathematics, and how they administer the assessment.

What Did We Ask Teachers?

We began by interviewing eight teachers from the greater Toronto area. They told us about their practices and how they decided what to do. Based on the dilemmas described by these teachers, we created a set of vignettes. We posed these vignettes, along with questions about teachers’ experiences and beliefs about testing, in an on-line survey and collected responses from 98 teachers across Ontario who had administered the Grade 3 Assessment (now called the Primary Assessment) at least once. We also used this survey in hour-long interviews with 40 Grade 3 teachers. Most (80%) of the teachers were in the greater Toronto area, but we also heard from teachers around the province: London (7%), Ottawa (6%), North Bay and Sudbury (3%), Thunder Bay (2%) and Barrie (1%). They worked for the English Public (79%), English Separate (12%), French Public (2%) and French Separate (5%) school systems. They had extensive experience with the test: Almost half had administered the test more than three times and about a quarter had worked as a test scorer.

What Did Teachers Tell Us?

Preparing Students for the Test

We asked the teachers what activities or approaches they would recommend to a new Grade 3 teacher who asked how to prepare for the test. Almost all (99%) of the teachers recommended teaching students strategies for answering multiple-choice and open-response questions. More than 95% endorsed teaching students how to understand the test instructions, having students work on sample questions, discussing examples of good responses to those questions, and helping students get used to working independently. About 85% recommended administering a mock test and teaching students how to handle feelings of anxiety about the test. Fewer than half of the teachers recommended talking with the students about whether the test was important.

We asked the teachers the reasons for their recommendations. Most teachers felt it was important to teach students how to answer multiple-choice and other types of test questions and how to work independently because these skills would be useful not just on this test, but in later grades. The teachers were more ambivalent about activities that were directly tied to the test, such as discussing sample questions or administering a mock test; they were especially wary of talking about the importance of the test.

Administering the Test

Double Bubbles

We asked the teachers to imagine they were in the following situation: “As you collect the multiple-choice answer sheets, you notice that one student has coloured two bubbles on several questions.” We asked them what they would do. Almost half (46%) said that they would do nothing. Many of these teachers pointed to the instructions. For example, one teacher wrote: “Although this has not happened, I would like to think I would follow appropriate procedures and not say anything.” Other teachers pointed out the importance of preparing the students to avoid careless mistakes: “During test prep, I would have taught them how to do [multiple-choice] questions properly, and if they didn’t, I would have dealt with it at that time, not during the test.”

The remaining teachers (54%) said they would respond to the student’s mistake. A few would point it out, arguing that this would make the test’s results more accurate: “I would ask them to recheck their answers to make sure they have filled in the bubbles correctly [because] correctly filling in a multiple choice answer sheet is not one of the skills that [the test] is assessing.” Others would remind the whole class to check their answer sheets, read the instructions, answer carefully, or do their best work. Of these teachers, some believed it was okay to give these reminders during the testing session and others believed they should wait and remind the students between sessions.

Too Short Answers

We also presented the following vignette: “During the test, you walk around the classroom. You notice that one of the students is answering questions with only a word or two. The instructions say students should explain their reasoning. What would you do?” A third (33%) of the teachers said they would do nothing. A few believed that to point out the problem would unfairly influence the results: “I would leave them alone. I want my students to earn their level on their own. The children and their parents/guardians need to have a realistic view of what their children have learned.” Others would try to avoid the situation by preparing the students: “Any student in my class will have heard over 100 times -- ‘if they give you 10 lines PLEASE do not write two words!!!’ Once the test has begun, I am mute!” Some teachers described vividly how difficult it was for them to follow the instructions: “I wouldn’t walk around because it’s too torturous. I try not to look at what they’re writing.”

Many teachers felt that reminding the student to explain their answer or to fill the space provided for the answer, while contrary to the testing instructions, was not ethically wrong. For example, one teacher explained: “I feel people in the ‘real world’, i.e., outside of the classroom, use tools to help them succeed with a task. When I write a paper, I have someone proof-read it before I hand it in to be judged for a mark. I know my students very well and there are some who need ... extra reminders to complete something to the best of their abilities.” Other teachers said they would remind the whole class: “I would say, ‘Oh, wasn’t everybody working really hard?! Let’s make sure we put down all the ideas that we have in our heads, because, remember, I am not able to mark the test and the people that mark the test has never met you. So, you need to tell them how intelligent you are....’ That is not going to influence the test results.” As one teacher noted, it was important for her to maintain her usual classroom practice: “I feel that it is important to explain to students as a whole to try their best and that means to add as much important detail as they can or are allowed to do for a question. In a Grade 3 setting I try to maintain some of the regular teacher/student interaction that we are all used to without centering out one student or give away an answer.” A few teachers have developed creative nonverbal cues; for example: “I use a lot of humour with my students [through looks], so I would probably give them one of my looks and they would get it.” Another wrote: “[I would] point to the instructions and to the space allotted. The student should be able to understand this prompt.”

The complexity of these decisions is illustrated in the following eloquent answer: “My response to that would depend on who the student was. My expectations for my students are that they do the best that they can. It is my expectation throughout the year. If that student is one who really struggles and that is a typical response to short answer questions, I might make a global reminder to everyone in the class to read instructions carefully and make sure you are explaining your answers in detail, to the best of their ability. If the student is one who normally answers questions in complete sentences and with supporting detail, I might ask the student, ‘how are you doing?’ Students generally reply honestly and you can get a sense as to why their effort is inconsistent with their normal responses. It may be that they had a difficult morning, don’t feel well, are tired or feeling stressed, or simply didn’t read the question carefully. Their response guides mine. Finally, if all is well with the student and there appears to be no reason for the inconsistency, I might gently remind him/her that short answer questions need to include details that explain their reasoning - or I might congratulate another student, saying, ‘I like the way you have a lot of detail in your responses’ (thereby supporting one student and encouraging the others).”

What if you heard...?

We asked teachers what they would do if they found out before the test that it would include something they had not yet taught their students (the example we gave was volume). Many teachers rightly responded that they would be unlikely to find out about the test’s content, especially because EQAO has recently made the rules about opening the test boxes stricter. However, 63% of the teachers responded that if they did find out about the test’s content, they would quickly cover the untaught material. As one teacher explained: “I would teach a mini lesson on volume. I know that may sound bad, but if you want the truth, that is what I would do. Grade 3’s are little and there is a lot of information on the test. I would like to give them as many opportunities for success as possible. I would not give them exact questions that are on the test (although I worked in a school where such nonsense was done). I think that is wrong, but ensuring they have some knowledge of the concepts is for me an acceptable way to deal with the issue.” Several other teachers pointed to a sense of duty to their students: They had promised the students that there would not be anything on the test that they had not been taught and they believed it was very important to honour that commitment to their students. Some mentioned that the test is administered about a month before the end of the school year, making it difficult to cover all the material. Some worried that encountering unfamiliar content would cause the students unnecessary stress. Other teachers focused on general strategies: “I would probably try to find a bit of time to work in some volume lessons, but wouldn’t stress about it. I might work with the students on strategies to use when they encounter a question that they are unsure of and would reinforce that they should at least attempt every question.”

The Purposes and Appropriateness of the Test

In deciding what to do, many teachers focused on what they believed were the most important purposes of the test. We asked all of the teachers questions about their beliefs about the test. We also asked the teachers we interviewed to tell us more about the test’s purposes.

Accountability of the education system was seen as an important intended purpose, although not all the teachers agreed that the test was the best way to hold teachers and schools accountable. One teacher said: “Teachers need to be accountable, but are there other ways of doing this.” Another was more resigned: “The way I understand it, the government needs some way to see the results of the curriculum they designed. It doesn’t serve the students much but still needs to be done so the government can keep schools accountable for what they are doing.” Overall, 78% of the teachers believed the government put too much emphasis on the test. At least one teacher felt the money could be better spent: “I feel it is too much money spent of this with little actual outcome for the student. Send me in an extra body with the money to help the students!” When we asked the teachers whether it was good for Ontario to have a provincially-mandated testing program, almost half (49%) agreed. However, 47% felt that the Grade 3 Assessment should be cancelled.

Many teachers believed the results of the test were being or should be used for planning, resource allocation, and improving practice. The planning could be at the provincial level: “The purpose is to make sure, in general, that students everywhere are where they should be, and then send resources where needed.” The planning could be by the individual teacher: “The most important ....thing is to compare my teaching with the rest of the province...I look for trends... I use it to shape my instruction.” Or by the schools and boards: “For the schools and the school boards - to know what can be improved, what can be done better.” Two-thirds (67%) of the teachers believed that the test was an important source of information for schools and 56% that it could help identify areas where students could improve.

To allocate resources fairly, the test results must be comparable across schools and boards. Only 21% of the teachers believed that the test results could be meaningfully used to compare schools and school boards. As three of the teachers explained: “I think the publications of individual school results in the newspaper is horrible and have no value except to give parents one more thing to point at individual schools and teachers and complain about!,” “I disagree that results are public and are used by some to choose schools or, e.g., to sell houses in one area and not another,” and, translated from French, “I believe that individual results should be reported, but that school results should not be published in an open forum such as the media.” Worries about the fairness of the test for students in all cultural groups and geographic regions (only 8% believed the test was fair for all students) also contributed to the teachers’ skepticism about comparing results across schools. As one teacher asked: “Why are we comparing schools with completely different socio-economic backgrounds? If I teach children who see violence in the home, come to school dirty, hungry with a lack of sleep…” Another provided an example: “I think that the testing should be more appropriate for all students from all backgrounds. For example the first language question dealt with a menu and chili. I had a student who had never been to a restaurant with a menu and had never eaten chili.”

Differences in test administration across schools were also a worry for some teachers. Most teachers (71%) believed that the test should be administered to all students in the same way. For example, one teacher said that teachers should “follow the guidelines so that the test is fair for all schools and students.” Another teacher suggested: “If they are serious about this test and if the real purpose is to get an idea of where the average child is, they should send in test administrators.” However, 39% of the teachers believed that teachers should use their professional judgement when deciding how to administer the test (some of the same teachers supported professional judgement and standardized administration).

For many teachers, comparability of the results across schools was not the most important concern. After all, comparability is little comfort if the scores are not meaningful. One way to understand the decisions some teachers made about how to administer the test is that they were trying to make the test scores more meaningful. By adjusting the test administration, these teachers believed they were correcting flaws in the test itself, such as confusing wording, or in the test instructions, such as inappropriate restrictions on encouraging the students. Some believed the test and administration instructions were not developmentally appropriate for Grade 3 students and so needed to be adjusted. Many teachers who chose not to follow the instructions did so to help students who had weak test-taking skills, or who were nervous or distracted during the test, to “show what they know.” Because they were not helping the students to “show what they don’t know,” they did not consider it cheating. They believed they were increasing the meaningfulness of the test scores. One teacher explained her perspective this way: “If the test is administered the way it is suggested then there is unnecessary stress put on kids and that way kids aren’t allowed to project their true potential.”

That does not mean that the teachers did not believe that cheating was a problem. However, they defined cheating in different ways. One teacher summarized a common view: “In respect to cheating or what constitutes cheating - I think students should be helped to develop their answers but not given the answer.” Another said, “Cheating is telling a child ‘that’s wrong’ and singling out an individual child.” Still others defined cheating simply as not following the instructions.

Teachers’ Roles

Many teachers in this study questioned their role in testing: Should they act as a teacher or a test proctor? Some chose the latter: “As difficult as it is, we have to remove the teacher hat for a little while and wear the hat of a test administrator.” Others saw their role as a teacher as extending into the testing: “I think [teachers] help kids day after day ... whether to support them, whether to help them succeed, whether to take them through, and then we say, ‘Here, you are on your own.’ ... You are not giving them answers, you are supporting them ... I think teachers are trying to help their children to feel at ease.” Another explained, “All year long, I tell my students that I am there to help them and guide them. I don’t give them the answers, but some students need more support than others. Suddenly for EQAO, I am not supposed to give them any support at all…” Fear about the possible consequences of not following the testing instructions is a very real consideration, as well. One teacher put it bluntly: “[T]eachers who strictly adhere to the test guidelines do so perhaps out of fear of job loss.”

What Next?

This short summary cannot do justice to the complexity of teachers’ decision making, but it does point to an important implication: Trying to standardize test preparation and administration by threatening to prosecute teachers for cheating will not work if teachers do not believe they are cheating. Because many of the teachers in our study believed the current instructions were unreasonable given classroom realities, they did not see departing from the instructions as unethical. On the contrary, they saw themselves as correcting flaws in the test and testing instructions. The challenge for test developers is to work with teachers to develop tests and instructions that teachers can support.

We Welcome Your Comments

Contact the study’s principal investigator, Professor Ruth Childs, at rchilds@oise.utoronto.ca or 416-978-1079.