An occasionally irregular blog about orthodontics

Early Class II treatment is effective! A retrospective study…

The effectiveness of early Class II treatment is one of the areas of orthodontic treatment that has been extensively researched, with randomised trials that have then been included in systematic reviews. This large body of research has concluded that there are limited advantages in undertaking early treatment when compared to providing one-phase treatment in adolescence. However, the burden of care is increased. I have blogged about this before, suggesting that this is a “dead question” and that it is time for us to move on and carry out research in other areas. However, a retrospective study which disagrees with the results of the randomised trials has just been published. I thought that I should discuss its findings.

This paper was produced by a highly respected group from San Francisco. Prof Jonathan Sandler and I debated the issue of early treatment with them in 2015 at the AAO meeting in San Francisco. As part of their evidence to support early intervention, they presented data from a retrospective study. I thought that the study design was rather poor and that any paper based on this data would not be published. However, it has recently been published in the Angle Orthodontist.

A retrospective study of Class II mixed-dentition treatment

Oh, Heesoo et al

Angle Orthodontist on line

In the discussion they suggested that the previous randomised trials of early Class II treatment, using functional appliances and headgear, were not applicable to the work of clinicians who use a more comprehensive phase 1 treatment.

They studied the outcome of one particular comprehensive approach (limited fixed appliance therapy with or without headgear and lingual arches) in the mixed dentition (8-11 years) as opposed to treatment provided in one phase that started later (12-15 years). They attempted to study the following questions:

  • How much class II correction was achieved through early treatment?
  • At the end of treatment, what differences were noted in molar relationship and cephalometric measurements between early treatment patients and patients who began treatment in the permanent dentition?
  • How did treatment duration and extraction rates compare between the early and later treated patients?

What did they do?

The sample was drawn retrospectively from three orthodontic practices. The cases were taken from the period between 1976 and 2000 (24 years). They produced a nice flowchart which showed that they initially screened 6881 patients for inclusion in the study. They then excluded 679 early and 355 late treatment patients who had incomplete records or missing charts, or who did not complete treatment, etc. Finally, they excluded any Class I and Class III patients to obtain a final sample of 54 early and 58 later treatment patients. I found that the flow of patients and the inclusion/exclusion process were not too clear. But in effect they analysed only 2.1% (54/2628) of the early and 5% (58/1127) of the later-treatment patients from the initial sample.
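The retention figures quoted above are simple to check. The following is a minimal Python sketch of that arithmetic; the counts are the ones given in this post, while the dictionary and the function name are illustrative choices of mine and not anything taken from the paper.

```python
# Counts as quoted in the post: (patients analysed, patients in the initial group).
groups = {
    "early": (54, 2628),
    "late": (58, 1127),
}


def retained_percent(analysed: int, initial: int) -> float:
    """Percentage of the initial group that reached the final sample."""
    return 100.0 * analysed / initial


for name, (analysed, initial) in groups.items():
    pct = retained_percent(analysed, initial)
    print(f"{name}: {analysed}/{initial} = {pct:.1f}% analysed, {100 - pct:.1f}% lost")
```

Put this way, roughly 95% or more of each initial group does not appear in the final analysis.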

They then selected 51 untreated Class II subjects from the AAO Craniofacial Legacy Growth Foundation collection. They did this by visual comparison of the cephalograms. This was, therefore, a historical sample, and I have previously drawn attention to issues with the use of this type of untreated control.

They then analysed the cephalograms and the study casts of the patients and ran simple t-tests across their data.

What did they find?

The cephalometric data was presented as a table of main variables. There was no difference between the groups at the start of treatment. Importantly, there were no differences in the variables at the end of treatment for the early and later treatment groups.

The study cast data revealed no differences between the groups.

When they looked at the extraction rate, they found that extractions were employed in the permanent dentition in 22 of the 58 (37.9%) late treatment cases, but in only 3 of the 54 (5.6%) early treatment cases. This difference was clinically and statistically significant.
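To get a feel for how large this difference is, the counts can be put into a 2x2 table. The sketch below runs a two-sided Fisher exact test using only the Python standard library. This is my own illustrative choice of test, not necessarily the one the authors used, and the function and variable names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in the first row,
        # holding all the margins of the table fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probs = [prob(x) for x in range(lowest, highest + 1)]
    observed = prob(a)
    # Sum every possible table that is no more likely than the observed one.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


# Counts as quoted in the post.
late_extraction, late_total = 22, 58
early_extraction, early_total = 3, 54

p_value = fisher_exact_two_sided(
    late_extraction, late_total - late_extraction,
    early_extraction, early_total - early_extraction,
)
print(f"late:  {100 * late_extraction / late_total:.1f}% extraction")
print(f"early: {100 * early_extraction / early_total:.1f}% extraction")
print(f"two-sided Fisher exact p = {p_value:.2g}")
```

A small p-value here only says that the two rates differ in this sample. It cannot tell us whether the difference is due to the treatment or to the way the patients were selected.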

I have previously mentioned that an important variable we need to consider in clinical studies is the burden of care to the patients. They measured this as treatment time and number of office visits. I’ve extracted this into the following table:


Treatment   Treatment time (years)   Treatment time plus supervision (years)   Number of office visits
Early       2.6                      5.8                                       53
Late        2.6                      2.6                                       33


They did not carry out a cost-effectiveness analysis, but I would suggest that this is not necessary because of the large differences. It is very clear that the burden of care for early treatment was far in excess of that for treatment in the permanent dentition.
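The size of the difference can be read straight from the table. As a rough sketch, and using only the figures I extracted above (the variable names are mine), the extra burden for the early treatment group works out as follows:

```python
# Figures from the table in this post.
burden = {
    "early": {"treatment_years": 2.6, "total_years": 5.8, "visits": 53},
    "late": {"treatment_years": 2.6, "total_years": 2.6, "visits": 33},
}

early, late = burden["early"], burden["late"]

extra_visits = early["visits"] - late["visits"]
extra_years = early["total_years"] - late["total_years"]
visit_ratio = early["visits"] / late["visits"]

print(f"Extra office visits with early treatment: {extra_visits}")
print(f"Extra years under treatment or supervision: {extra_years:.1f}")
print(f"Early patients attended {visit_ratio:.2f} times as many visits")
```

This is plain subtraction and division and not a formal economic analysis, but it shows why I feel that one is hardly needed.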

I found the discussion rather difficult to follow, but they concluded that:

  1. Early treatment resulted in successful treatment
  2. Early treatment had fewer extractions
  3. Early treatment patients spent less time in phase 2 fixed appliances, but the late treatment patients spent much less time in total treatment

These conclusions were fair, but in their abstract they concluded:

“Early treatment comprehensive mixed dentition treatment was effective modality for early correction of class II malocclusions”.

What did I think?

I will be brief about several issues with this study design and paper. I identified the following important problems.

  • The conclusions in the abstract were not supported by the data.
  • This was a retrospective study, and the authors drew attention to the risk of selection bias in the final sample of subjects. I think it is important to point out that they only included 2% and 5% of the initial sample. This is a massive loss of data, and I feel that it makes any conclusions meaningless.
  • The other main problem with retrospective studies is that there may be differences in the samples, because the treatment was not randomised, but was selected by the operator. As a result, we must assume that there may be substantial differences between the patients in the early treatment and later treatment groups, because the operator chose which treatment each patient received.
  • They also included a historical control group and recent research has suggested that studies that use historical control groups overestimate the effect of treatment. I have discussed this in a previous blog post.

In conclusion, I cannot help feeling that this study suffers from severe methodological and interpretation errors. As a result, it does not change the conclusions of the RCTs and systematic reviews.

You may feel that I’ve been overcritical. But I strongly believe in evidence-based care, and when investigators challenge the results of trials and systematic reviews they need to provide high-quality research. I cannot help thinking that this paper fails to do this.


There Are 11 Comments


  1. Dear Professor O’Brien

    Thank you for posting this. While I haven’t yet read the entire article, there seems to be a glaring omission in both this study design and your critique: not one word, apparently, from either of you regarding the impact of various Class II Tx regimens upon a child’s long-term correction stability, facial esthetics (attractiveness), airway and/or TMJ health. Burden of treatment time, need/no need for later extraction of permanent teeth, and/or possible complexity/length of phase II Tx time seem to be the primary pillars for judging early Cl II Tx success or failure. And while most would agree that these are indeed important success criteria, professor, for accurately judging early Cl II Tx effectiveness, there is ample published research, albeit not necessarily from robust prospective, blinded and longitudinal RCTs, that seems to support the hypothesis that airway health, TMJ health, Tx stability and facial esthetics should be added to the list of long-term pre-Tx objectives and criteria for Tx success/failure. Back in the 1990s I attended a national symposium on the childhood obesity epidemic in America. As an educationally qualified dietitian (M. Sc.), I challenged the then spokesperson from Pepsico, Inc. regarding the high sugar content of the beverage products that they actively marketed to children; her response: “well sugar consumption has not been proven to contribute to childhood obesity”. At that point the moderator interrupted her: “Sometimes it is in the best health interest of children to ACT on the best AVAILABLE evidence, than it would be to WAIT for the best EXPECTED evidence.” Please watch sometime “Sugar: The Bitter Truth” by Robert Lustig at UCSF medical school if you’d like to hear about how the EXPECTED evidence actually turned out regarding how sugar overconsumption impacts de novo lipogenesis.

  2. Alfred C. Griffin Jr. DDS says:

    The inclusion process was so flawed that the results and conclusions are meaningless. For instance, extraction patients will most likely not drop out of treatment at the same rate as non extraction patients. Unfortunately I don’t usually read every article in the Angle Orthodontist completely but rather scan and read the results and conclusions. It would be great if there was an objective numerical grade (1-5) for statistical and procedural validity that could be assigned to each article in the abstract to let us know how much credence the article deserves.

  3. Kevin O'Brien says:

    Yes, that’s a great idea. I do wonder if the journals should do this, as it would reflect the great variability in published research.

  4. If it is true that the incidence of extractions is less in two phase tx, I would like to know if the incidence of second molar impaction is greater.

  5. Fenris Ulfr says:

    The severity of the Class II molar relationship was on average 0.7 mm in the early Tx group, and 1 mm in the late Tx group. These represent very mild class II molar relationships, and a spontaneous improvement in the molar relationship is frequently seen without any early treatment, due to mesial migration of the mandibular molars into the E-space (the UnTx group went from 1mm to 0.6mm). A big red herring is the reported frequency of extractions. Overjet and sagittal correction are not the only reasons for extractions. Crowding, incisor protrusion, lip incompetence, midline discrepancies, vertical control etc. are additional reasons that frequently necessitate premolar extractions. Since this was a retrospective study with a non-randomized sample, the specific distribution of those parameters is unknowable. Also, no mention was made of the ethnic variation in the sample. In terms of all cephalometric measures, the two groups seemed similar at the end of treatment, AND the late Tx group spent less time in total Tx with fewer visits – seems like the conclusion of the study should be that there is no benefit at all to routine early treatment for Class II correction!

  6. Lysle Johnston says:

    One thing seems to be missing in the quest for evidence: is there any received theoretical basis? In other words, is there any reason to pay attention to a given study? In the present case, is there any reason to believe that the treatment can have such a huge effect on the extraction rate? Given the usual anchorage loss, first-premolar extraction will provide about 8 mm of space in the mandible. Based on Moyers’ data for leeway space, a lower lingual should net about 6 mm. Given some sort of gizmo to control (hold/”distalize”) the upper buccal segments to achieve a “Class I” molar relationship, the methods of Oh and co-workers might well be seen conservatively as 60-70% efficient compared to extraction. Although your critique is well-reasoned, it seems to me that the existence of a believable theoretical basis for a reduction in the extraction rate should serve to elevate this paper a bit in terms of the attention it deserves and receives.

    • Kevin O'Brien says:

      Thanks, my main concern with this paper is the data loss from the original sample and the distinct possibility that the two groups are different.
      This approach is very different from that taken by you when you used a discriminant analysis to identify similar cases at the start of treatment.
      I agree that there is a theoretical basis for the differences in extraction rate, but I am not sure that this paper can provide us with sufficient certainty that this is the case, because of the large amount of selection bias.

  7. Fenris Ulfr says:

    I agree with Dr. O’Brien. Although there was a reduction in the extraction rate, can it be attributed to the nature of the intervention between the groups as opposed to differences between the groups? In the absence of previously established parity between the groups and randomization to minimize selection bias, I don’t think one can draw that conclusion.

  8. AndyPearson says:

    Thanks for the precis of the study, Kevin. The hidden assumption in this study seems to be that early treatment will AVOID extns later, which is not what it says but is probably what it wants to say. Again, the old mistake of confusing correlation with causation. Could the early group have been treated later and got the same results with regard to extns? We don’t know, because the initial treatment decisions were not randomised. All in all, not very dramatic conclusions, sort of what you would expect when looking at patients of 3 different clinicians retrospectively.
    The issue of how long treatment takes often seems to get brushed under the rug. I get the impression that some orthodontists think that the longer the better and forget that the patient has to put up with the various appliances for most if not all their childhood. For the patient, orthodontics is not an absolute joy and we should do all we can to avoid treatment that has no advantage over other treatments except being longer.
