Research project

Intervening with Disturbed Adolescents

Serious social, emotional and behavioural disturbance in adolescents is costly - incurring long-term costs to society in service provision and collateral damage as well as personal costs to the young people themselves


Start date

October 2001

The research questions of the project are:

  1. What is the current evidence for the effectiveness of psycho-educational interventions for seriously disturbed adolescents?
  2. To what extent is the practice of school psychologists in this area evidence-based?
  3. How might #1 and #2 be improved?

Essentially, a literature search, critical review and meta-analysis of the effectiveness of interventions with seriously emotionally and behaviourally disturbed adolescents were conducted.

Then a survey of school psychologists in New York State explored the extent to which these professionals were aware of this evidential basis, if and how it informed their practice and recommendations, and how such research and dissemination could be improved to have a greater impact on practice.

Research questions #2 and #3 involve exploration of the varied demographic and socio-cultural contexts within the profession of school (educational) psychology, and of the considerable variation in theoretical and philosophical orientation, attitudes and opinions.

The urgency of providing effective school-based interventions for emotionally disturbed adolescents is similar in the United Kingdom and the United States. School psychologists usually have a pivotal role in finding an appropriate solution for the educational and social needs of these students. The results of the project should help connect research, practice and policy. They suggest directions for future training and staff development.


Serious social, emotional and behavioural disturbance in adolescents is costly - incurring long-term costs to society in service provision and collateral damage as well as personal costs to the young people themselves.

Recent promotion of "evidence-based" professional practice is driven by the goal of increasing the effectiveness of service delivery and consequently indirectly reducing the sundry costs to the various stakeholders, i.e. increasing service cost-effectiveness (e.g., Davies, Nutley, & Smith, 2000; Hammersley, 2001). Arguably the latter is desirable irrespective of the absolute level of resourcing of services, and funding services of unknown effectiveness is pointless and wasteful except on a small scale pilot basis until adequate evidence of effectiveness is available.

For busy professional practitioners, especially those with generic rather than specialist responsibilities, keeping abreast of relevant current effectiveness research is difficult and itself incurs time costs. They risk accessing the research literature through convenience sampling rather than systematically, or (worse) by sampling which is systematically biased by some pressure group or other constituency. They might also have a personal systematic bias to sampling only research that accords with the theoretical perspectives underpinning their pre-service training (which might now be outdated or discredited) or that which utilizes research methodologies with which they are familiar and comfortable.

Consequently, recent years have seen a number of initiatives intended to make systematic and putatively unbiased reviews of research literature on specific aspects of service delivery in the health, education and welfare professions easily accessible to practitioners in a form requiring only low time costs for assimilation. Examples include the Cochrane Collaboration for medical professions, the Campbell Collaboration for the welfare and justice professions, the Evidence for Policy and Practice (EPPI) Centre, the What Works Clearinghouse and the Evidence Network.

However, many aspects of service delivery still await systematic reviews of research literature. Indeed, some aspects may still await any systematic research. Meanwhile, the professional practitioner attempting to deliver a "joined up" service has to go beyond the evidence - using their clinical or professional judgement based upon professional observation and experience in the field and/or the similarly based advice of professional colleagues. However, there are inevitably serious questions about the reliability and internal and external validity of such judgements, which can be readily challenged in court when things turn out badly.

Even in those aspects of service delivery that have been the subject of systematic reviews of the research evidence, further questions remain about the gap between research and practice. To what extent does the evidence reach all practitioners? Even in an era when continuing professional development is increasingly a requirement for renewal of professional practicing certificates, the information overload problem is considerable. Some professionals are likely to manage this better than others, by virtue of: personal predisposition, prioritization and capability; nature, quality and recency of professional training; and ease of access to relevant infrastructure and resources, for example.

Even where the evidence does reach all practitioners, to what extent is it actually operationalised by all practitioners? Or do some practitioners read the latest research, but when pressed by the information-processing or social or political demands of an urgent problem requiring immediate decision-making, still revert to long over-learned reflex responses or the folk wisdom of the profession?

This latter connects with a related issue: what or whose is the definition of the problem, and whose problem is to be solved? Even medical problems are set in a complex human, social and organizational context, and both problems and solutions are socially constructed, whether consciously and explicitly or otherwise.

This in turn connects with a wider issue: what is the definition of evidence? The movement towards evidence-based professional practice was in many ways led by the medical profession, in which logical positivist conceptions of research methodology historically took precedence as medicine strove to establish itself as a "science", and the randomized controlled trial was seen as the methodological "gold standard". However, wider post-modern epistemological perspectives have accepted other ways of knowing, including a variety of non-experimental and qualitative methods (although there is also often confusion between exploratory and confirmatory research).

Amidst the research "paradigm wars", the question of fitness for purpose can sometimes be overlooked. A research methodology which is suitable for large scale field testing of the broad-spectrum effectiveness of a pharmacological product which has already undergone extensive small scale laboratory trials might not be the same as that which is suitable for the formative evaluation of a multiple-component pilot intervention with a highly complex social or educational problem, where the stakeholders are numerous and participant reflexivity is a major and unpredictable variable. Simply transferring historical notions of what is "good" research thoughtlessly from one context to another is clearly unsatisfactory. A case in point was the epic review of research on reading interventions undertaken by the National Reading Panel (2000), later criticized for adopting a narrowly experimental and quasi-experimental definition of "good" research and thereby excluding much of the literature in the field.

This leads to the question of how literature reviews should be conducted. A traditional narrative critical review facilitates the inclusion of research from a wide range of methodological traditions, but has difficulty with synthesizing studies reporting data in idiosyncratic ways. A "best evidence synthesis" (BES) has the advantage of explicating precisely the definition of "good" research espoused by the reviewers, but the disadvantage of narrowness if the reader does not agree with that definition. Paradoxically, BES might have low external validity, since the best studies might tend to differentially report the best organized intervention programs. Meta-analysis has the advantage of using a standardized and replicable procedure to synthesize many studies reporting different types of quantitative data, but only accounts for research quality if blocking by that (precisely defined) variable is reported, and is inapplicable to many studies which do not fully report quantitative data.

This paper explores and exemplifies some of these quandaries. It offers a systematic review of the effectiveness of interventions for adolescents with serious social, emotional and behavioural disturbance. It compares and contrasts the narrative review, best evidence synthesis and meta-analytic approaches to this task. It sets this in the context of utility for informing professional practice, particularly for school (educational) psychologists, who often make key recommendations and decisions for these young people and are certainly subject to the influence of the "evidence-based" movement (Kratochwill & Stoiber, 2002). This profession is currently developing a consensual coding structure for judgements of the quality of evidential support for psycho-educational interventions. This is interesting inter alia for its attempt to combine rigor with an emphasis on research in complex field settings with high external validity, and for its attempt to subsume both preventive and remedial interventions. Similar struggles are being undertaken by colleagues in related health, education and welfare professions (e.g., Chambless & Ollendick, 2001).

For this review, "adolescent" was defined as a student between 11 and 18 years of age. "Serious emotional disturbance" (SED) was defined according to the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) of the American Psychiatric Association (1994), as applied by the researchers in the studies reviewed or assayed as equivalent by the reviewers from information in the original study. The use of such labels is highly contentious, especially outside North America, not least because of their quasi-medical implications and conceptual limitations. However, the DSM-IV definition was used in this review because that is what was most commonly done in the research literature, and there was no other commonly agreed definition.

Computerized bibliographic searches were undertaken to identify relevant studies, in the PsycINFO database of international literature in psychology and related social and behavioural sciences, the British Education Index, the (US) National Library of Medicine's Medline database, the ERIC (Educational Resources Information Center) database, the Current Index to Journals in Education (CIJE) and Resources in Education (RIE).

A pilot trawl to evaluate different search descriptors and combinations led to a systematic scanning of these databases from 1974 to 1999 using the following descriptors and combinations: serious emotional disorder, serious emotional disturbance, disorder, disturbance, emotional, SED, EBD, EB/D and adolescent/adolescence. These were paired with variations of further descriptors: evaluation, intervention, outcome, treatment, school, education. Manual searches of the following journals were also carried out: Behavioural Disorders, British Journal of Educational Psychology, Review of Educational Research, School Psychology Review. Further searches followed up selected citations from retrieved articles. Additionally, searches were made of the World Wide Web (using the same descriptor terms with various search engines and portals), grey literature, conference proceedings, and other professional sources.
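The pairing of descriptors described above can be sketched as follows. This is only an illustration: the quoted `AND` query syntax and the function name `build_queries` are assumptions made here, not the search syntax of any particular database or the procedure actually used in the project, and "adolescent/adolescence" is treated as two separate terms.

```python
from itertools import product

# Descriptors as listed in the text ("adolescent/adolescence" split into two terms).
PRIMARY = [
    "serious emotional disorder", "serious emotional disturbance",
    "disorder", "disturbance", "emotional", "SED", "EBD", "EB/D",
    "adolescent", "adolescence",
]
SECONDARY = ["evaluation", "intervention", "outcome", "treatment", "school", "education"]


def build_queries(primary, secondary):
    """Return one query string per (primary, secondary) pair.

    The quoted-phrase AND syntax is a hypothetical placeholder; real
    bibliographic databases each have their own query language.
    """
    return [f'"{p}" AND "{s}"' for p, s in product(primary, secondary)]


queries = build_queries(PRIMARY, SECONDARY)
print(len(queries))  # 10 primary x 6 secondary = 60 pairings
print(queries[0])    # "serious emotional disorder" AND "evaluation"
```

Enumerating the pairings explicitly makes the search replicable, which is one of the features that distinguishes a systematic review from convenience sampling of the literature.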

The search identified approximately 6,900 items. Almost all publications were in English, though several translations of foreign-language works were included. Subsequently almost 6,700 references were excluded from the onward review process, as they did not contain any relevant outcome information. Group studies with and without control or comparison groups were included together with single-case studies. There were 202 studies identified that reported the outcome of a psycho-educational intervention. All of these were included in a narrative literature review. Only 41 studies contained outcome data that enabled the calculation of effect sizes, and these studies were included in a meta-analysis. This led to a consideration of a best-evidence perspective.
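To illustrate why only a minority of studies "enabled the calculation of effect sizes", the sketch below computes one common effect size index, the standardized mean difference (Cohen's d with a pooled standard deviation). The choice of index is an assumption made here for illustration; the text does not state which index or formulae the review used, and studies without a control or comparison group would need a different calculation. The numbers in the example are invented.

```python
import math


def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference between a treatment and a control group.

    Requires each group's mean, standard deviation and sample size; a study
    that omits any of these does not support this calculation.
    """
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


# Hypothetical study: treatment mean 12, control mean 10, both SD 4, n = 20 each.
d = cohens_d(12.0, 4.0, 20, 10.0, 4.0, 20)
print(d)  # 0.5
```

A study reporting only "most students improved", or a significance level without descriptive statistics, can be discussed in a narrative review but cannot be entered into a calculation of this kind.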

Findings are reported using this classification system of intervention type:

  • Individual Counselling
  • Family Counselling
  • Group Counselling
  • Psychotherapies
  • Cognitive-Behavioural Therapy
  • Behaviour Modification
  • Self-Monitoring & Self-Management
  • Social Skills Training
  • Peer Mediated Interventions
  • Teacher Consultation
  • Wraparound Planning
  • Multi-Systemic Therapy
  • Other (otherwise unclassifiable)

For a traditional narrative literature review, 202 studies were selected as relevant. For a meta-analysis, 41 studies were selected as relevant, yielding 118 effect sizes. For a best-evidence synthesis, only four studies were selected as relevant. Most of the 202 studies reported positive findings, but few of these studies were methodologically adequate. Although the meta-analysis synthesized 118 effect sizes, many of these were derived from various types of data in small-scale studies of an idiosyncratic nature lacking control or comparison groups. Some small scale studies generated many effect sizes owing to the use of multiple outcome measures. Blocking for further analysis was constrained by small numbers of studies in sub-categories. Each of the three approaches to the literature had advantages and disadvantages.
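The point about small studies generating many effect sizes can be made concrete with a small sketch. It compares a mean taken over every effect size with a mean taken over one averaged value per study. Averaging within studies is one common way of limiting the influence of multiply-measured studies; it is shown here as an assumption, not as the procedure used in this review, and the data are invented.

```python
from collections import defaultdict
from statistics import mean


def study_level_means(effect_sizes):
    """Collapse (study_id, effect_size) pairs to one mean effect size per study."""
    by_study = defaultdict(list)
    for study_id, es in effect_sizes:
        by_study[study_id].append(es)
    return {sid: mean(vals) for sid, vals in by_study.items()}


def compare_pooling(effect_sizes):
    """Return (mean over all effect sizes, mean over per-study means)."""
    per_effect_size = mean(es for _, es in effect_sizes)
    per_study = mean(study_level_means(effect_sizes).values())
    return per_effect_size, per_study


# Hypothetical data: study A used four outcome measures, B and C one each.
data = [("A", 1.2), ("A", 1.0), ("A", 1.1), ("A", 1.1), ("B", 0.2), ("C", 0.3)]
per_es, per_study = compare_pooling(data)
print(round(per_es, 2), round(per_study, 2))  # 0.82 0.53
```

In this invented example the single study with four measures pulls the unweighted mean from about 0.53 up to about 0.82, which is why the number of effect sizes per study matters when interpreting a pooled figure.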

Forms of counselling were characterized by low quantity and quality of supporting evidence, as were psychotherapy and teacher consultation. However, cognitive-behavioural therapy and behaviour modification had a better quantity and quality of evidence. Self-monitoring/management, social skills training and peer mediated interventions had the highest quantity and quality of evidence. Wraparound planning had a relatively high quantity of evidence but average quality, while multi-systemic therapy and other miscellaneous interventions had average quantity and quality of evidence.

In the narrative review, the effectiveness of interventions was generally positively correlated with the quantity and quality of evidence supporting them, except that outcomes for wraparound planning were mixed. From the meta-analysis, effect size was very high for self-management/monitoring, a little above average for wraparound planning/multi-systemic therapy, and below average for social skills training and behavioural methods. Other effect sizes were based on small numbers of studies, and significant differences could not be demonstrated.

However, studies in different intervention categories differed in respect of the breadth of outcomes targeted. Thus behaviour modification and self-management studies tended to target short term gain in specific contexts, while wraparound planning tended to target short and long term gain in many life contexts - this aligns with the difference previously discussed between response and recovery. Research tends to focus on the measurable, but what is easily measurable might be relatively trivial. Consequently these effect sizes (and subsequent estimates of cost-effectiveness) should be evaluated in the context of study outcome breadth, or what might be termed "impact footprint". Of course, an array or sequence of a number of highly cost-effective but "small footprint" interventions could be included in a wraparound plan, but there was little evidence of this in the literature.

Beyond this, the interventions were very different in terms of probable delivery costs. Few studies actually included information on costs, which might in any event differ even for the same intervention from locality to locality. Nevertheless, some speculations on probable average relative cost of the interventions can be made to inform further speculations about probable cost-effectiveness.

Regarding effect sizes for setting or context for delivery of the intervention, effectiveness was high for mainstream special education students, slightly below average for residential special school, and very low for psychiatric facilities. The probable cost for these interventions was inversely related to their effectiveness. Of course, it might be assumed that this suggests that less tractable cases are more likely to be treated in more restrictive environments, possibly having previously failed in the less restrictive environment. However, previous research indicates this is not necessarily the case (Safer, 1982; Topping, 1983), not least owing to regional variations in resource availability and decision making. In any event, noting that the mean ES for residential special school is still lower than the mean ES for all categories, it seems clear that placement in a more restrictive (and more costly) environment still does not yield encouraging outcomes.

Translating evidence-based practice from rhetoric into reality is no small challenge (Halladay & Bero, 2000). An obvious action implication of this review (and many others) is that researchers must actively seek to make such systematic reviews accessible to practitioners. Researchers should also triangulate methodologies for systematic reviews, in order to gain the advantages of each and counter-balance the disadvantages of each. In recent years more studies enabling calculation of effect sizes have been published, which is encouraging, but systematic reviews need updating, perhaps at five-yearly intervals.

What of the implications for practitioners? Serious emotional disturbance is a very difficult and costly field in which to make decisions, with implications for pre-service training as well as continuing professional development (Shapiro, 1991). Practitioners cannot wait until the research improves, although they should hope for updated reviews at five-yearly intervals. As the negative effect sizes in this review and the phenomenon of spontaneous remission demonstrate, doing something is not necessarily better than doing nothing, and arguably funding services of unknown effectiveness is pointless and wasteful except on a small scale pilot basis until adequate evidence of effectiveness is available.

Practitioners must actively seek out unbiased systematic reviews, rather than intentionally or unintentionally sample research selectively. They need to consider different indices of effectiveness carefully. Practitioners also need to consider the relative impact footprint of evidence-based interventions, since a small effect in recovery may save more money and human suffering than a large effect in response. They need to consider cost-effectiveness, irrespective of the absolute level of resources available. They must then ensure that the evidence base actually consistently informs their day to day practice.

Practitioners should be cautious in their interpretation of systematic reviews, since publication bias towards the reporting only of significant positive effects may cause over-estimation of the likely effects of interventions when implemented more widely than in research studies (the so-called "file drawer problem"). However, this might apply equally to all types of intervention. Larger effect sizes might be partially a product of type of instrumentation used, the number of instruments used, and the time scale of use - many highly sensitive instruments closely connected to the intervention and deployed frequently over short time scales are arguably intrinsically likely to yield a larger number of higher effect sizes. Practitioners should also remember that zero is data - what evidence is not found in a systematic review may be quite as important as what is found.
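One widely cited way of gauging the "file drawer problem" is Rosenthal's fail-safe N: the number of unpublished null-result studies that would have to exist to bring a combined result down to bare significance. The sketch below is offered as an illustration of the idea only; the text does not say that this statistic was computed in the review, and the z scores are invented.

```python
def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal's fail-safe N for a set of study-level z scores.

    z_alpha is the one-tailed critical value (1.645 for p = .05).
    Returns 0.0 if the combined result is already non-significant.
    """
    k = len(z_scores)
    total_z = sum(z_scores)
    return max(0.0, total_z ** 2 / z_alpha ** 2 - k)


# Hypothetical z scores from four studies of one intervention type.
n_fs = fail_safe_n([2.0, 2.5, 1.8, 2.2])
print(round(n_fs, 1))  # 22.7
```

On these invented figures, roughly 23 unpublished studies averaging a null result would be needed to overturn the combined finding. A small fail-safe N relative to the number of published studies would suggest that a conclusion is fragile.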

Professional practitioners will always have to go beyond the evidence:

  • to adapt to local resource constraints, and
  • to avoid a "one size fits all" approach and choose an intervention from a range of well evidenced interventions which is particularly likely to be effective in the problem context or ecology in hand, and
  • to consider the durability of interventions in field settings when delivered in less than ideal circumstances (often unexplored in the literature), and
  • to bridge the many gaps in the evidence base.


A practitioner summary of this systematic review has been made available in the form of a table (see below). It is anticipated that practitioners recommending interventions where resources are very limited might choose low cost but relatively high effectiveness interventions (such as self-management training or peer mediation), which however might tend to be those with a smaller impact footprint. Where resources are more accessible and cost is less of an issue, practitioners might choose from a wider range of options, including those of higher cost and average cost-effectiveness but larger impact footprint (such as social skills training, wraparound planning and multi-systemic therapy).


Project lead(s)

Professor Keith Topping

External team members

Brian Flynn
Research Director
now based in Hawaii

Professor Marian C. Fish
School Psychology Program
Department of Educational & Community Programs
CUNY/Queens College, School of Education
Powdermaker Hall, Room 052A
Flushing, NY 11367-1597

Dr. Lynne Thies
Past President
New York Association of School Psychologists 
PO Box 60470
Lyell Station
Rochester, NY 14606

Related groups

Education and Society

Project type

Research project