Evaluation

Outcomes of the Pilot Phase

Paired Reading has previously been subject to a great deal of evaluative research. Indeed, it is one of the most thoroughly researched approaches in use in education. Reports on its use throughout the world, including in some extremely impoverished and chaotic contexts, will be found in the literature (see Topping, 1995, 2001). Many of these reports describe the use of Paired Reading with parents rather than peers as tutors.

Here we will confine ourselves to reporting the main evaluation results from the 13 Read On pilot schools, firstly from Phase 1 (Paired Reading) and secondly from Phase 2 (Paired Reading and Thinking). Details will be found in Topping, K. J. (2001), Thinking Reading Writing: A Practical Guide To Paired Learning with Peers, Parents & Volunteers, New York & London: Continuum International.

Phase 1 Teacher Observation Results

At the end of Phase 1, each participating teacher was asked to record their summary observations of child behaviour relevant to the aims of the project. They were asked to comment only on children in their class whose reading they knew before Paired Reading started, and only indicate change if:

  • you have observed it,
  • it is significant,
  • it has definitely occurred since PR started

The response rate was 33 out of 34 possible (97%). One teacher had left the school.

The summary results are displayed in Figure 1 for behaviour in the classroom during Paired Reading, and in Figure 2 for behaviour in other activities in the classroom and outside the classroom within school.

Regarding the former, very few teachers had not observed a positive shift in the majority of their children. Regarding generalisation of positive effects to other subject areas and outside the classroom, the effects are not as strong (as would be expected), but are still very positive. The improvement in motivation during the PR sessions was particularly striking. Especially worthy of note was the improvement in the children's ability to relate to each other: their social competence improved both during PR and beyond it.

Figure 1: Teacher Observations: During The PR Sessions

Figure 2: Teacher Observations: Outside The PR Sessions

Phase 1 Teacher Feedback

At the Phase 2 training, teachers gave their views about the problems and opportunities of Phase 1. A summary was collated by group scribes, and will be found in the Teacher's Voices section of this web site.

Phase 1 Child Feedback

At Langlees Primary School, the tutees and tutors take part in Circle Time activities to help explore how they are feeling about Read On and how it might be improved. These sessions are sometimes captured on video. Some of the thoughts of the tutors and tutees will be found in the Children's Voices section of this web site.

The subjective evaluative responses of children from Bankier Primary School - expressed in their own concepts and language - will be found in the same location.

Circle Time was used in a similar way by Woodside Primary School. At the end of Phase 1, tutors and tutees met together to share their thoughts and feelings about Paired Reading. Comments related to reading skills and personal and social development.

"I like Paired Reading because it helps you get harder reading books" (male tutee). 
"I like Paired Reading because the books help you to get new knowledge" (male tutee). 
"I like Paired Reading because I can get a lot of advice" (male tutee). 
"I like Paired Reading because I like helping the little ones" (male tutor). 
"I like Paired Reading because I think it makes you more mature" (female tutor). 
"I like Paired Reading because it gives you more responsibility" (male tutor).

The tutors spoke a great deal about the enjoyment of being able to help:

"It helps the little ones to read more books". 
"I've learned that little ones are not just little kids, they can be really good". 
"I've learned more about reading skills and how not to just rush through books". 
"I've gained a lot. It gives you more responsibility and I think it's encouraged the tutors to read more and to read a different variety of books - not just what you normally read".

At Holy Cross Primary School, four pairs were interviewed together, and expressed their views confidently:

"Before the project, reading was boring ... now it's fun..." 
"Helping the younger ones was good because we (tutors) got the chance to go back over 'younger' books...we wouldn't normally be able to do this because people would say they were too easy for us or that we would be silly..." 
"It's easier than reading with the teacher". 
"We actually got to talk about the books, too". 
"It makes you shy when you read for the teacher, but not when you read with the P6s (tutors)". 
"I liked the project - it has helped my reading". 
"It makes you feel good, giving the wee ones some help". 
"Paired Reading got us out of doing some history". 
"It would be better if it lasted all year".

At Bervie Primary School, tutors and tutees completed feedback questionnaires after their Phase 1 Paired Reading experiences.

TUTEES

Paired Reading has led to:

Tutees1

(a) Not liking all kinds of reading 
(b) Liking all reading better


Tutees2

(a) Getting better at all kinds of reading 
(b) Getting no better at all kinds of reading


Tutors

(a) Go on peer tutoring as often as now 
(b) Go on tutoring, but not so often 
(c) Go on tutoring, but with a different tutee 
(d) Be tutored yourself, by someone better 
(e) Tutor reading, but in a different way 
(f) Tutor something else, like maths or spelling

 

Phase 1 Reading Test Results: Caveats

The Primary Reading Test used in the pilot is a relatively inexpensive and quickly administered norm-referenced multiple-choice group test of reading comprehension based on isolated incomplete sentences. It has been normed in Scotland as well as elsewhere in the UK, the manual gives evidence that it is adequately reliable and valid, and it has been widely used over many years. However, in part owing to its structure and brevity, it is not very sensitive to complex changes in reading capability. Some of the pilot teachers felt parts of it were rather dated.

Although the test raw scores yielded both reading ages and standardised scores, the reading ages discriminated very poorly at the lower end of the range (a "floor" effect), so standardised scores will be cited here. The Scottish norms consistently yielded lower scores than the English norms (because reading standards are higher in Scotland than England), but the two sets of standardised scores correlated very highly (0.99), and the Scottish scores will be cited here. When interpreting standardised scores, remember that no change in standardised score between testings indicates a normal rate of gain over that time, while gains in standardised scores indicate a greater than normal rate of gain. Also remember that the majority of children in this project were in relatively disadvantaged schools, and their average score before the project started was well below average in many cases - in other words, what was a "normal" rate of gain for most children was not normal for them.

Although the pilot tutors and tutees took the standard version at pre-test, a parallel form at post-test, and the standard version again at follow-up (to reduce content practice effects), the impact on child motivation of being required to take essentially the same test on three occasions within a seven month period is a source of concern. If only a few children in a class are unmotivated to re-take the test, their scores can show a sharp decline, so that on average the whole class appears not to have gained. The scores of some individuals in some schools were extremely erratic from one testing to the next, and pilot schools were counselled not to attach much significance to the results of individual children, and instead to consider the average (or "mean") gains of groups of children. Additionally, where several classes in a school participated, results were often very different between different classes. This might reflect nothing more than differences in the circumstances in which the tests were given. There were a number of classes in which the teachers reported marked observational evidence of improved reading attainment where this was not apparent in the test results.
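The way a few unmotivated children can mask a class gain is easy to check with simple arithmetic. The sketch below uses invented score changes for a hypothetical class of 20 children (these are not pilot data, and the variable names are illustrative only).

```python
from statistics import mean

# Hypothetical changes in standardised score for a class of 20 children.
# These numbers are invented for illustration; they are not pilot data.
motivated = [3] * 17       # 17 children each gain 3 points
unmotivated = [-17] * 3    # 3 children each drop 17 points at re-test

class_mean_change = mean(motivated + unmotivated)
motivated_mean_change = mean(motivated)

print("Mean change, whole class:", class_mean_change)
print("Mean change, motivated children only:", motivated_mean_change)
```

In this made-up class, 17 of the 20 children gained, yet the class mean shows no gain at all, which is why the averages of larger groups are more trustworthy than those of single classes or individuals.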

"Control" groups were available in a minority of schools, were often much smaller than "experimental" groups, and often proved to be of doubtful comparability, in that their mean pre-test scores were often very different from those of the experimentals (this sometimes came as a surprise to the schools).

The pilot project was essentially an action research effort carried forward by practising teachers within their own classrooms. A modest amount of process data on actual implementation was gathered for six schools, and analysis relating implementation integrity to test outcomes is in hand. Some schools were offered and accepted some consultative support from the Centre for Paired Learning, but this was by no means intensive, not least because one intention was to establish whether this type of project could operate successfully without a substantial injection of extra resources.

Finally, although statistical significance is mentioned below, statistical significance is notoriously hard to obtain with small sample sizes, and a change might be educationally significant even if it is not statistically significant.

Phase 1 Reading Test Results

In Phase 1, gains were clearly seen for both tutees and tutors. A striking example of gains for tutors was at Insch school, where the mean gain for tutors in one term (lasting 3 months) equated to 13.56 months of reading age.

At Langlees school, gains for tutees were substantially greater than for their control group (see Figure 3 below).

Figure 3: Pre-Post Gains For Tutees & Tutee Controls

Figure 4: Pre-Post Gains For Tutees, Tutee Controls, & Tutors

In this school, the tutees' average pre-test score was well below that of the controls, i.e. the two groups were not strictly comparable. However, the tutees made substantial gains, while the controls stayed the same, so the tutees almost caught up with the controls.

In two participating classes at Sunnybank school, the picture was similar (see Figure 4).

Again, tutee controls were more able at pre-test than tutees, and therefore not strictly comparable. The Tutee pre-post gain was statistically significant at p<0.005. (In research, statistical analysis allows you to work out how probable it is that the gain happened just by random chance. In this case, the probability that this gain happened just by random chance was less than 5 in one thousand, or 0.5%. Researchers usually consider a probability of p<0.05, or less than 5%, to be enough to consider the gain "statistically significant", but the smaller the probability, the stronger the confidence in the result.) The difference in change between tutees and controls was significant at p<0.010.
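To make the idea of "probability by random chance" concrete, here is a minimal sketch of one way such a probability can be computed. It is not the analysis used in the pilot (the report does not say which test was applied); it is an exact sign-flip permutation test, chosen because it needs only the Python standard library. The gains and the function name `sign_flip_p_value` are hypothetical.

```python
from itertools import product

# Hypothetical pre-to-post gains in standardised score for 10 children.
# Invented for illustration; not pilot data.
gains = [4, 6, -1, 5, 3, 7, 2, 5, 4, 3]

def sign_flip_p_value(gains):
    """One-sided exact sign-flip test for a mean gain greater than zero.

    If the intervention had no effect, each child's change would be as
    likely to be negative as positive. We therefore try every possible
    pattern of signs and count how often the total is at least as large
    as the total actually observed.
    """
    observed = sum(gains)
    at_least_as_large = 0
    total = 0
    for signs in product((1, -1), repeat=len(gains)):
        total += 1
        if sum(s * g for s, g in zip(signs, gains)) >= observed:
            at_least_as_large += 1
    return at_least_as_large / total

p_value = sign_flip_p_value(gains)
print(f"p = {p_value:.4f}")
```

With only 10 children there are 1,024 sign patterns, so every one can be checked. The fewer patterns that match or beat the observed total, the smaller the p value and the less plausible it is that the gain arose by chance.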

The tutor pre-post gain was significant at p<0.014. Reading ages for the tutors indicate average gains of 1.02 years of reading age in one term (lasting 3 months) (p<0.010).

At Bainsford school, control groups were available for both tutees and tutors (see Figure 5; results are given in reading ages on this occasion):

Figure 5: Pre-Post Gains For Tutees & Tutors, & Their Controls

All 13 schools and 34 classes were then considered in aggregate (in 4 classes not all children were involved, often because they were mixed-age classes, so the equivalent of 32 full classes participated).

For Tutees in 16 full classes, 9 classes showed gains which reached statistical significance, 6 classes showed gains which did not reach statistical significance, and one class did not show a gain (remember zero gain on standardised score = a "normal" rate of progress).

For Tutors in 16 full classes, 7 classes showed gains which reached statistical significance, 8 classes showed gains which did not reach statistical significance, and one class did not show a gain.

For tutees the gain in standardised score was from a pre-test average of 90.4 to a post-test average of 94.0 (full data available for n=342 tutees). This gain was highly statistically significant (p<0.001; the probability was so small that, given to only three decimal places as elsewhere in this report, it appears as 0.000). The control children also gained on average, but only from 91.3 to 93.9 (n=61).

For tutors the gain in standardised score was from a pre-test average of 91.6 to a post-test average of 95.4 (full data available for n=344 tutors). This gain was highly statistically significant (p<0.001). The control children also gained on average, from 90.6 to 95.7, but there were only 29 control children in aggregate.
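The aggregate gains can be laid side by side with a few lines of arithmetic. The sketch below uses only the pre-test and post-test means reported above; the dictionary and variable names are illustrative. It simply subtracts means and takes no account of group size or of differences between groups at pre-test, so it is not a substitute for the statistical comparisons.

```python
# Aggregate standardised-score means reported above: (pre-test, post-test).
reported = {
    "tutees": (90.4, 94.0),
    "tutee controls": (91.3, 93.9),
    "tutors": (91.6, 95.4),
    "tutor controls": (90.6, 95.7),
}

# Gain for each group, rounded to one decimal place like the source figures.
gains = {group: round(post - pre, 1) for group, (pre, post) in reported.items()}

# Difference between each experimental gain and its control gain.
tutee_advantage = round(gains["tutees"] - gains["tutee controls"], 1)
tutor_advantage = round(gains["tutors"] - gains["tutor controls"], 1)

for group, gain in gains.items():
    print(f"{group}: {gain:+.1f}")
print(f"tutees minus controls: {tutee_advantage:+.1f}")
print(f"tutors minus controls: {tutor_advantage:+.1f}")
```

On these raw figures the tutees gained about one point more than their controls, while the small tutor control group gained slightly more than the tutors, which fits the description of the control comparisons as patchy.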

Overall, the results of comparisons with control groups were patchy. There were Tutor control groups (CGs) in 4 schools, and Tutee CGs in these and another two schools (totalling 7 classes). All CGs were very small (n=5-13). CG and Experimental groups were rarely equivalent at pre-test. Ten out of 11 CGs showed signs of regression to the mean. For Tutors, in four out of four cases the experimental gain was not significantly different from the control gain. For the tutees, in four out of seven cases the experimental group gained more than the control group, but in only one case did this difference reach statistical significance.

Phase 1 Reading Test Results: Effect of Reading Ability

An analysis was conducted of the relationship between pre-test reading ability and amount of gain in Phase 1. Overall, the least able tutees gained most, and the least able tutors gained most.

Low ability tutors produced tutee gains at least equivalent to those produced by high ability tutors. Low ability tutors themselves gained more than high ability tutors, irrespective of the ability of their tutee. Low ability pairings were good for both tutees and tutors in terms of test outcomes.

Pairing low ability tutees with high ability tutors was good for the tutees but not for the tutors, in terms of test outcomes. Pairing high ability tutees with low ability tutors was good for the tutors but not for the tutees. High ability tutors gained more if paired with a high ability tutee, but this was no better for the tutee.

These findings suggested that exact pair matching was perhaps less important than pre-project ability.

However, the statistical phenomenon of regression to the mean might have influenced these findings (however improbable such artefacts seem to teachers who find it very difficult to raise the test scores of the weakest pupils).
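Regression to the mean can be demonstrated with a short simulation. The sketch below is a toy model under stated assumptions, not a re-analysis of the pilot data: each simulated child has a stable "true" level, each test score is that level plus random measurement noise, and nothing at all changes between the two testings. The means, spreads and seed are arbitrary choices.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed model (illustrative only): each observed score is a stable
# "true" level plus independent measurement noise at each testing.
N_CHILDREN = 10_000
children = []
for _ in range(N_CHILDREN):
    true_level = rng.gauss(100, 15)
    pre = true_level + rng.gauss(0, 8)
    post = true_level + rng.gauss(0, 8)   # no intervention, no real change
    children.append((pre, post))

children.sort(key=lambda c: c[0])          # order by pre-test score
quarter = N_CHILDREN // 4
lowest = children[:quarter]                # weakest quarter at pre-test
highest = children[-quarter:]              # strongest quarter at pre-test

lowest_gain = mean(post - pre for pre, post in lowest)
highest_gain = mean(post - pre for pre, post in highest)

print(f"Mean 'gain' of lowest quarter:  {lowest_gain:+.2f}")
print(f"Mean 'gain' of highest quarter: {highest_gain:+.2f}")
```

Even though no child's true level changes in this model, the children who scored lowest at pre-test appear to gain and those who scored highest appear to lose. This is why apparent extra gains for the least able need cautious interpretation, especially where control groups are small or not comparable.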

Phase 1 Reading Test Results: Effect of Gender

An analysis was conducted of the relationship between gender of the child and amount of gain in Phase 1. Overall, female tutees did better than male tutees, but male tutors did better than female tutors.

Pilot teachers had been encouraged to match children by ability differential, disregarding gender, but nevertheless cross-gender matching proved to be less usual. However, cross-gender matching actually yielded better tutee gains than same-gender matching, and was good for tutors as well.

Male-Male pairs appeared very good for the tutor, but not for the tutee (contrary to previous findings of high gains for both partners in this constellation). Female-Female pairs did least well on aggregate.

Phase 2 Reading Test Results

Not all the Phase 1 pilot schools felt able to mount the extension to Paired Thinking in Phase 2, although the majority did. However, in those schools implementing Paired Thinking, practice varied widely. Some schools continued with three sessions per week, but some only had one session per week, which was unlikely to yield a measurable difference. As noted earlier, the group test of reading comprehension used was expected to show some correlation with increases in thinking skills, but Paired Thinking actually involved less time reading and more time discussing than Paired Reading, so a paradoxical effect was possible. Additionally, some schools ended their Paired Thinking experiment quite early in the summer term, while others continued almost to the end of the academic year. In short, it is far from clear what the Phase 2 Reading Test Results are actually measuring. These difficulties were further compounded when not all of the 13 schools managed to complete the follow-up test before the end of the academic year.

A further complication is that reading test norms typically track the progress of children over a full calendar year, while the school year is shorter, and progress over the school year might well be uneven. For instance, it is possible that most of the growth in reading ability occurs in the Autumn and Spring terms, as the Summer term is often devoted to wider ranging activities and little reading might be done by children during the long summer holiday. If this is the case, to show no change in standardised score (i.e. to sustain a "normal" rate of growth in reading) during the Summer term might be construed as a "good" outcome.

Considering the follow-up results available at the time of writing, it is striking that 70 per cent of the control groups showed a fall in standardised scores from post-test levels. This suggests that to have shown no change at follow-up is indeed a good outcome.

Of those schools which implemented Paired Thinking, for Tutees seven classes showed no gain, two showed gains which did not reach statistical significance, and one showed gains which did reach statistical significance. For Tutors five classes showed no gain, three showed gains which did not reach statistical significance, and one showed gains which did reach statistical significance. This suggests that tutors benefited most consistently from the Paired Thinking phase, in terms of the crude measure of reading comprehension.

Of those schools which did not implement Paired Thinking, for previous Tutees one class showed no gain, one showed gains which did not reach statistical significance, and one showed gains which did reach statistical significance. For previous Tutors one class showed no gain, and two showed gains which did not reach statistical significance. Overall, it was difficult to see consistent differences between PR only and PRT classes.

The Phase 2 test results are reported in greater detail in Topping (2001).

Phase 2 Child Feedback

The tutors and tutees at Langlees Primary School expressed their thoughts separately about the transition to Paired Thinking in two Circle Time sessions, held in May and June.

At the first session, each group split into pairs to discuss how Paired Thinking is different from Paired Reading. The facilitator emphasised that there were no right or wrong answers to this question. When the whole group reconvened, each child was invited to share some of the things they had been discussing with their partner. After this, each child was asked to offer a single word to describe Paired Thinking.

At the second session, each group split into pairs to discuss which they liked best, Paired Reading or Paired Thinking. They were also asked to discuss why they chose the answer they did. The facilitator emphasised that there were no right or wrong answers to these questions. When the whole group reconvened, each child was invited to share some of the things they had been discussing with their partner. After this, each child was asked to offer a single word to describe Paired Thinking.

TUTEE FIRST SESSION

Paired Thinking is different from Paired Reading because....

"Paired Thinking is different from Paired Reading because you get different books." (boy) 
"Paired Thinking is different from Paired Reading because you get new questions and more questions." (girl) 
"Paired Thinking is different from Paired Reading because you get new partners." (boy) 
"Paired Thinking is different from Paired Reading because you are with different teachers all the time." (girl) 
"Paired Thinking is different from Paired Reading because you get to ask your partner questions." (girl) 
"Paired Thinking is different from Paired Reading because you get to do the five finger test." (girl) 
"Paired Thinking is different from Paired Reading because you get a new sheet with answers on it (Tips for Tutors)." (boy) 
"Paired Thinking is different from Paired Reading because you get prompt cards." (girl) 
"Paired Thinking is different from Paired Reading because you get 21 top tips." (girl) 
"Paired Thinking is different from Paired Reading because you get different sheets to fill in." (girl) 
Three children chose to Pass.

Paired Thinking is....

"good", "brilliant", "excellent", "brilliant", "extra-good", "fantabulous", "extra-fantastic", "good", "superb", "triple good", "extra-fabulous", "hard".

TUTEE SECOND SESSION

I like Paired Reading best because ....

"I like Paired Reading best because you don't get questions." (girl) 
"I like Paired Reading best because you don't get as much questions." (boy) 
"I like Paired Reading best." (boy) 
"I like Paired Reading best because you don't get to answer questions and I liked the partner I had for Paired Reading." (boy) 
"I like Paired Reading best because you get less questions." (boy)

I like Paired Thinking best because ....

"I like Paired Thinking best because you get new books and a new partner." (girl) 
"I like Paired Thinking best because you get more questions." (girl) 
"I like Paired Thinking best because you get new partners and new sheets to fill in." (girl) 
"I like Paired Thinking best because you get prompt cards and sheets for your tutor to fill in. You also get a bigger range of books." (girl) 
"I like Paired Thinking best because you get prompt cards and a change of partner." (girl) 
"I like Paired Thinking best because you get harder books and asked more questions than Paired Reading." (girl) 
"I like Paired Thinking best because I got a new partner, prompt sheets and more questions and sometimes, you can laugh at the questions." (girl) 
"I like Paired Thinking best because I got the partner I wanted." (boy)

Paired Thinking is ....

"good", "excellent", "brilliant", "fantastic", "fabulous", "excellent", "good", "great", "fabulous", "excellent", "brilliant", "fabulous", "brilliant".

TUTOR FIRST SESSION

Paired Thinking is different from Paired Reading because ....

Answers included: that Paired Thinking involves more questioning than Paired Reading, there are prompt cards and other new materials such as Tips for Tutors and a revised diary, and that pairs do not do as much actual reading in Paired Thinking as they do during Paired Reading. Problems included the difficulty some tutors and tutees had understanding some of the prompt questions and also that not all these were appropriate for the book being read.

Paired Thinking is ....

Most acknowledged that Paired Thinking was more difficult than Paired Reading for both the tutees and the tutors. Some tutors thought this was an advantage because it 'stretched' all those involved more, while others thought this was a distinct disadvantage!

TUTOR SECOND SESSION

I like Paired Reading best because ....

"I like Paired Reading best because there's not so many questions." (girl)
"I like Paired Reading best because there's some questions, but not so many as Paired Thinking." (girl)
"I like Paired Reading best because Paired Thinking has too many questions and sometimes the tutors don't understand the questions themselves." (girl)
"I like Paired Reading best because there's not as much questions." (boy)
"I like Paired Reading best because my tutee now (for Paired Thinking) is too bossy." (girl)
"I like Paired Reading best because the Paired Thinking questions are a bit hard for the younger ones and sometimes I don't understand the questions." (boy)
"I like Paired Reading best because when you do Paired Thinking you have to interrupt the book you're reading, but you don't have to with Paired Reading." (boy)
"I like Paired Reading best because with Paired Thinking you don't get as far with the book. You can't get onto the next chapter." (boy)
"I like Paired Reading best because when you've got a good book you don't like asking questions because you won't get it finished." (girl)
"I like Paired Reading best because it's much easier for everyone." (boy) 
"I like Paired Reading best because it's helpful." (girl) 
"I like Paired Reading best because there's not as much questions to ask the younger ones." (girl) 
"I like Paired Reading best because you can chat to the child and it's more socialising." (girl)

I like Paired Thinking best because ....

"I like Paired Thinking best because the children don't just pick up any book they think will be easy, because you've got a sheet with questions on that you must ask them." (girl) 
"I like Paired Thinking best because you get a different tutee and you get harder books." (girl) 
"I like Paired Thinking best because it will be better for us when we go to high school and it's better for the little ones too." (boy) 
"I like Paired Thinking best because it encourages both the tutor and the tutee to read more." (boy)

Paired Thinking is ...

"hard", "sociable", "okay", "wonderful", "easy", "more harder", "superb", "okay", "difficult", "hard", "all right", "difficult", "brilliant", "excellent", "tempting", "educational", "a wee bit harder although sometimes it's OK."

Overall, it seems that Paired Thinking was most popular with tutees, but less popular than Paired Reading with the tutors, for whom it involved a lot of strenuous thinking.

Summary & Conclusions

In Phase 1 Paired Reading, teachers reported that most pairs worked well together and adhered to the method in which they had been trained, with evident social benefits. Very few teachers had not observed a positive shift in the majority of their children, especially in reading motivation and including in social competence, and generalisation of positive effects to other subject areas and outside the classroom had been observed. The feedback from the children was also generally very positive.

On a group reading comprehension test, 15 out of 16 classes of tutees showed gains above the "normal" rate of progress, and 15 out of 16 classes of tutors showed gains above the "normal" rate. Average tutee gain in standardised score was from 90.4 to 94.0 (highly statistically significant, p<0.001); average tutor gain in standardised score was from 91.6 to 95.4 (highly statistically significant, p<0.001). Results of comparisons with control groups (which were often very small) were patchy. For Tutors, experimental gain was generally not significantly different from the control gain. For Tutees, in four out of seven cases the experimental group gained more than the control group, but in only one case did this reach statistical significance.

From test results overall, the least able tutees gained most, and the least able tutors gained most. Low ability tutors produced tutee gains at least equivalent to those produced by high ability tutors. Low ability pairings were good for both tutees and tutors in terms of test outcomes. Overall, female tutees did better than male tutees, but male tutors did better than female tutors. However, cross-gender matching yielded better tutee gains than same-gender matching, and was good for tutors as well.

In Phase 2 Paired Reading and Thinking, many schools implemented only one session per week. Test gains were modest, with tutors doing best, but control groups tended to fall. Tutors appeared to have benefited most consistently from the Paired Thinking phase, in terms of the crude measure of reading comprehension. Overall, it was difficult to see consistent differences between PR only and PRT classes. Paired Thinking appeared more popular with tutees, but less popular than Paired Reading with the tutors, for whom it involved a lot of strenuous thinking.

The testing undertaken in Phases 1 and 2 was brief and superficial, and the scores of individual children were often very erratic. There is a need for a more intensive and rigorous study, assaying implementation integrity in detail using a range of ethnographic process measures, and using multiple outcome measures to triangulate assessment of Thinking gains in particular. Such a study is in hand, and will be reported in due course.

References & Further Reading

Topping, K. J. (1988)
The Peer Tutoring Handbook.
Cambridge, MA: Brookline Books

Topping, K. J. (1995)
Paired Reading, Spelling & Writing: The handbook for teachers and parents.
London & New York: Cassell

Topping, K. J. (1998)
The Paired Science Handbook: Parental Involvement and Peer Tutoring in Science.
London: Fulton; Bristol, PA: Taylor & Francis

Topping, K. J. (2001)
Peer Assisted Learning.
Cambridge, MA: Brookline Books

Topping, K. J. (2001)
Thinking Reading Writing: A Practical Guide To Paired Learning with Peers, Parents & Volunteers.
New York & London: Continuum International

Topping, K. J. & Bamford, J. (1998)
Parental Involvement and Peer Tutoring in Mathematics and Science: Developing Paired Maths into Paired Science.
London: Fulton; Bristol, PA: Taylor & Francis

Topping, K. J. & Bamford, J. (1998)
The Paired Maths Handbook: Parental Involvement and Peer Tutoring in Mathematics.
London: Fulton; Bristol, PA: Taylor & Francis

Topping, K. J. & Ehly, S. (eds.) (1998)
Peer Assisted Learning.
Mahwah, NJ & London, UK: Lawrence Erlbaum

Wolfendale, S. W. & Topping, K. J. (eds.) (1996)
Family Involvement in Literacy: Effective partnerships in education.
London & New York: Cassell