A Friday JAMA-lama Ding Dong!

Well ... I wasn't really going to weigh in further past using that recent JAMA article (EDIT:  full text no longer free at JAMA, see link below) as an example of where statistics can lead us astray.  But ... as is probably expected, this study has kicked up some dust in the community.  I'm sure I'll miss a few, but here's the weighing in so far:

That's just the blog posts, not the tweets, FB postings or other social media buzz ... and I'm happy to edit in more, just drop the hint in comments.  And here's the full text of the study:  Effects of Dietary Composition on Energy Expenditure During Weight-Loss Maintenance

The more I look at this study, the angrier I get about the piss-poor state of affairs that is study design and implementation these days.  While I'm keenly aware that it is impossible to control and account for everything in such studies, that's no excuse for not even paying lip service to the factors that should be paramount to the integrity of your data.  There's a saying in statistics that no amount of statistical manipulation can fix bad data.  More succinctly stated:  Garbage In = Garbage Out.  This study, after closer examination to make sure my first impressions were accurate, is GIGO, and GIGO cannot be taken seriously in any discussion of CICO.

Why do I make such a bold GIGO claim?  Well, because the data presented have some glaring holes:  missing information on no-brainer obvious stuff.  Why?  And not only that, there is no mention of measures taken to procure the data, i.e. did they even bother to collect it?  While there are certain aspects of any free-living study that can't ever be controlled completely, there are ways to minimize the influence of these factors on outcomes, and well-designed studies generally outline the specific measures they took to minimize it.

Not so, Ebbeling, et al.  Here's what they did, and if I'm in error in anything I state here it is only because it is not more clearly laid out in their peer-reviewed account.  They took what ended up being 21 obese individuals, determined weight-stable usual intake for 4 weeks, then put them on a reducing diet.  The reducing diet was by % P/C/F  25/45/30.  FWIW, this would qualify as "low fat" and "high protein" compared to the SAD, and perhaps higher in protein but lower in carb than your standard CRD.  After 12 weeks on this diet, subjects had lost an average of 14.3 kg and were then weight stabilized for 4 weeks on this diet.  It would appear that these subjects essentially lost 13.6% of body weight and remained obese in maintenance with intake at around 2600 cal/day.  This was determined in a 4 week post-diet stabilization period.   REE and TDEE were assessed during free living periods at the end of the initial 4 week (pre-diet) weight stable phase and during the last 1 or 2 weeks of each 4 week "maintenance" test diet.
  • MISSING:  What was the REE/TDEE of these subjects in the last two weeks of the weight-reduced stabilization period??
In a cross-over design, they were then assigned to one of three isocaloric diets -- LF, LGI, and LC -- presumably calibrated to maintain current weight.  The Methods would indicate that "controlled feeding" here amounted to providing participants with menus to consume from.  
  • MISSING:  Food logs, dietary recall, etc.  ... Basically ANY sort of indication as to compliance with any of the three regimes.  Worse yet ...
  • MISSING:  Any indication that the researchers made any effort to assess or ensure compliance. Therefore ...
  • MISSING:  Actual intake composition data.  Recent studies have shown just how important this is in making comparisons!
Though we get a litany of data points for various metabolic parameters at those time points where they were reported, 
  • MISSING:  Body weight at each time point.
  • EDIT:  FOUND -- Apologies, would have been nice to have this in a table somewhere, but average weight was reported for each time period and they remained weight stable, on average (~91-91.5 kg).  Still, this compounds the mystery of why there wasn't weight loss on VLC with that increased expenditure.  Also still not getting why they report CI (confidence interval) instead of standard deviation.  I'll have to address the CI sometime, odd ... just odd ... to present that for results (SD is an exact calculated value for a sample, CIs are estimates projected for a population based on samples and assumptions of probabilities, etc.).
This is important, because in most weight stability studies, weight is monitored closely and intake adjusted accordingly.   And we should be suspicious here because ...
  • MISSING:  The usual dropouts over a 3 month period for 1 month each spent on radically different diets.  (The final n=21 resulted from only 1 dropout during this portion of the study, although the "rotten apples" may have been weeded out in the weight loss portion).  Still ...
  • MISSING:  Evidence of maintenance!
  • EDIT:  Found, as stated above, that mean weight remained relatively constant as assessed at the end of each 4 week period.  There does not appear to be any discussion of monitoring stability.
So we have no clue how closely any of these participants adhered to the prescribed diets or whether they maintained, lost or even gained weight during that leg of the study.    GARBAGE IN.

Let's assume, however, that all 21 were perfect subjects, dutifully following menus and weighing daily and self-adjusting intake to remain weight stable for three months.  That is a huh-YOOGE assumption, but let's make it anyway.  You know what else is missing?  Standard deviations for the mean values reported for REE and TEE  (I reserve the right to focus my argument on these outcomes, which are the major buzz generator).   Almost all reports of mean data include a +/- of SD or SE, or give that value in parentheses.  Not so here.
  • MISSING:  Transparent reporting of statistical error/variability.
What is given in the table is the range of values, which can be highly misleading due to outliers, and elsewhere we get the 95% CI ranges, all Mean [95% CI] kcal/day -- LF: –205 [–265 to –144] ; LGI: –166 [–227 to –106] ; VLC: −138 [–198 to –77] . From this, and the blue squares on the plots, it would appear that the margin of error was ~60 kcal/day.  For the 95% CI, margin of error, E = 1.96*SD/sqrt(n) ... working backwards then SD = E*sqrt(n)/1.96 = 60*sqrt(20)/1.96 = 137 kcal/day.   This would only get worse if statistical laws were obeyed for minimum sample size and appropriate distribution used, etc.  Harummph.  In other words, I don't know what kind of statistical magical massage parlor they put this data through, but they need to clean under the baseboards more often cuz it's starting to smell.
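The back-calculation above can be sketched in a few lines. This is only a rough check, assuming the reported interval is a simple normal-theory 95% CI on a mean (E = 1.96*SD/sqrt(n)); the paper's actual mixed-model intervals may have been built differently. The function name is made up for illustration. Note that the arithmetic in the text plugs in 20 under the square root, while the study had n = 21; both are shown.

```python
import math

def sd_from_ci_halfwidth(half_width, n, z=1.96):
    """Invert E = z * SD / sqrt(n) to recover SD from a CI half-width.

    Assumes a plain normal-theory interval on a mean (an assumption,
    not something the paper states).
    """
    return half_width * math.sqrt(n) / z

# LF change in REE was reported as -205 [-265 to -144] kcal/day
half_width_lf = (-144 - (-265)) / 2        # half the width of the interval

sd_as_in_post = sd_from_ci_halfwidth(60, 20)   # the post's arithmetic, sqrt(20)
sd_with_n21 = sd_from_ci_halfwidth(60, 21)     # same formula with n = 21

print(half_width_lf, round(sd_as_in_post), round(sd_with_n21))
```

Either way the implied SD lands around 140 kcal/day, which is roughly twice the 67 kcal/day difference between the LF and VLC means.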

I'm going to pull a Taubes here.  Let's ignore the "Mediterranean style" diet and just put LC vs LF up head to head.  Our null hypothesis would state that changes in REE were equal (i.e. the means were the same), and if the calculated probability of seeing a difference at least this large when the null is true was very low (usually <5%, but sometimes other benchmarks are used), we would reject this null hypothesis, and this would support the contention that the two means were different.  And ha!  I don't care what kind of mumbo jumbo overall P = 0.03 they are reporting, they deal with the mean values in drawing their conclusions.  Thus their stats need to deal with the mean.  Putting the LF & LC data through this online stats calculator, the REE's are not statistically different to P < 0.1, though they are for P < 0.2 -- or better odds than a roll of the die?!
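For anyone who wants to rerun the head-to-head comparison without an online tool, here is a minimal sketch from summary statistics alone. It assumes two independent groups of 21 sharing the back-calculated SD of 137 kcal/day, and it uses a normal approximation for the p-value rather than a t distribution (the standard library has no t distribution). It deliberately ignores the crossover design, so treat it as a reproduction of the quick check, not of the authors' analysis. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def two_sample_z(mean_a, mean_b, sd, n):
    """Two-sided test of mean_a vs mean_b for two independent groups
    of size n with a common SD, using a normal approximation."""
    se_diff = sd * math.sqrt(2 / n)          # standard error of the difference
    z = (mean_a - mean_b) / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return z, p

# Change in REE (kcal/day): VLC -138 vs LF -205, assumed SD 137, n = 21 each
z, p = two_sample_z(-138, -205, sd=137, n=21)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the p-value falls between 0.1 and 0.2.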


I invite checking of my calcs/assumptions/methods here, but this was the quickest way to check this and it looks bad.  Please, tell me it ain't so, and I'll take it all back!

The authors' Comment:
The results of our study challenge the notion that a calorie is a calorie from a metabolic perspective. During isocaloric feeding following weight loss, REE was 67 kcal/d higher with the very low-carbohydrate diet compared with the low-fat diet. TEE differed by approximately 300 kcal/d between these 2 diets, an effect corresponding with the amount of energy typically expended in 1 hour of moderate-intensity physical activity.
  • MISSING:  Statistical support for this misleading assertion.
At best you could hang your hat on a weak trending.  And even if the difference in the means was statistically significant, putting the value of 67 cal/day on it for REE (and 300 for TEE) is not supported by that test alone.  To be fair, absolutes and percents comparing means are used in reporting every day, but at least they pass the stat.sig. bar.   When individual comparisons are taken into account -- with erratic swings in both directions, particularly vs. LGI, it's difficult to support a trending based on carb/fat content at all.

And lastly,
  • MISSING:  Magical weight loss on isocaloric VLC despite this supposed "metabolic advantage" because clearly these subjects were still markedly overweight or even obese.  They would have had to lose weight ...
I'm filing this one in Science Krispies.  It belongs there :-(


First, this should not happen.  The supplemental materials should be referenced within the main text when describing methods, especially when they provide critical information for the reader to assess the significance of the study.  From HERE:
We tracked body weight and monitored adherence at daily (Monday through Friday) visits for lunch or dinner. When necessary, we made adjustments in energy intake to achieve weight loss and stabilization during the run-in phase. We allowed the duration of the run-in phase to vary among participants, to account for individual differences in the rate of weight loss. The energy intake required for weight stabilization at the end of the run-in phase was established as the energy intake for the entire test phase, with no further adjustments regardless of any weight fluctuation with the test diets. Participants completed a diary during daily visits to document deviations from study protocols, such as not eating all study foods or consuming any foods or beverages not consistent with the assigned diet. When participants had challenges in complying with protocols, we initiated behavioral counseling.  We neither prescribed nor restricted aerobic physical activity but asked participants not to take part in a weight training program.
Since this was to ascertain the effectiveness of a maintenance regime, why not report the counseling frequency?  Since compliance was assessed, how about reporting the actual intake or some measure of the degree of this compliance?  This furthers the mystery about the TEE and weight, because if caloric level was held constant, the low carbers should have lost weight.    Eating 1 meal a day at the facility M-F leaves a lot of "free living".  Not saying that's bad, but some mention of self-reported compliance with provided diets would have been nice.

Hunger and Well-being. Prior to breakfast during each hospital admission conducted as part of the test phase, we asked participants to rate level of hunger using a 10-cm visual analog scale, anchored with the descriptors “not at all hungry” and “extremely hungry.” To assess well-being, we asked participants, “How do you feel right now?” Responses were obtained using a visual analog scale, anchored with the descriptors “really terrible” and “really great.” A rating of 10 would represent the highest level of hunger and the best sense of well being on these two scales, respectively.

Are you hungry this morning?  How do you feel right now?  WHY BOTHER!!!


I've seen conflicting reports that they were actually provided their food. The LA Times report calls it "an intensive, seven-month experiment during which 21 overweight men and women had their diets strictly controlled down to each last morsel."

If they are free living, compliance could still be an issue, but if this is actually true, it might explain the lack of food diaries.
CarbSane said…
Thanks Beth, Over on Stephan's blog someone mentioned that they had all meals prepared for them and they picked them up daily. If this really is what happened, I'm dumbfounded why it's not stated in the Methods, since TTBOMK there's no universal definition of controlled feeding to be assumed. Or why, for example, in this study they went to such great lengths to describe how the diets were administered. IMO that was a well written paper, this is not.

We know they were free living for half of each test diet phase, and half of the initial evaluation.
Stephan Guyenet said…
It's in the methods supplement. They were probably extremely limited on space in this paper.
CarbSane said…
Thanks, I edited in an update. I can see space limits, but find a way to refer to supplemental materials in your main text dammit!

Since they collected deviation information, they should have commented on whether any of it appeared to be significant or subjects had difficulty adhering to any diet.

If a higher REE/TEE doesn't produce weight loss on an isocaloric intake, that sounds like a dagger in the heart of metabolic advantage >:)

Haven't really addressed the big thing, which you mentioned in your blog -- duration. If there's a metabolic boost to "switching it up", 4 weeks is probably not long enough to settle down to normal. Too many true metabolic ward studies -- including the maintenance leg of Gray & Kipnis -- showing otherwise to read much more into this.
Unknown said…
Sometimes a cigar is just a cigar
Unknown said…
Your statistics are all wrong. Remember, these are within-person comparisons, so at the very least you need to be using a paired t-test rather than independent-samples t-tests. Paired tests account for the correlation between responses from the same person, which is probably fairly substantial here, and that will make the results more "significant" (i.e., much smaller p-values). The authors used mixed models, which are basically just glorified paired t-tests, but adjusted for their other specified variables.
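To make the paired-versus-independent point concrete, here is a small sketch of how the test statistic changes with the within-person correlation. Everything here is illustrative: the SD of 137 kcal/day is the post's back-calculated figure, the correlation values r are hypothetical (the paper does not report one), and both diets are assumed to share the same SD. The function name is made up.

```python
import math

def paired_t(mean_diff, sd, n, r):
    """t statistic for a within-person difference, assuming both conditions
    have the same SD and the paired responses correlate with coefficient r."""
    sd_diff = sd * math.sqrt(2 * (1 - r))   # SD of the within-person differences
    se = sd_diff / math.sqrt(n)             # standard error of the mean difference
    return mean_diff / se

# 67 kcal/day REE difference, assumed SD 137, n = 21, hypothetical correlations
for r in (0.0, 0.5, 0.8):
    print(f"r = {r:.1f}: t = {paired_t(67, 137, 21, r):.2f}")
```

With r = 0 the statistic matches the independent-samples calculation; as r rises, the same 67 kcal/day difference yields a larger statistic and so a smaller p-value.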

Thus, your claim of "GIGO!" is seriously misplaced. To all appearances (to those of us who understand randomized trials) this is a VERY well-conducted trial.
Unknown said…
Additionally, if you really want the test for comparing LC and LF, then you really only need to look at the "trend test" column of Table 3. It is straightforward to show that that test, with 3 groups assumed to be equally spaced, is equivalent to pairwise comparison between the lowest group (here LC) and the highest group (here LF). My one complaint about this trial (for which I've already written a letter to the editor) is that, according to their specified analysis plan, they should not have even presented those comparisons unless the overall null could be rejected (e.g., for CRP, it wasn't).
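One common way to set up a linear trend across three equally spaced groups uses contrast coefficients (-1, 0, +1). The sketch below only illustrates why the point estimate of such a contrast reduces to the difference between the two outer groups; whether the paper's trend test was built exactly this way is an assumption. The means are the REE changes quoted in the post, and the function name is hypothetical.

```python
def linear_contrast(means, coeffs=(-1, 0, 1)):
    """Weighted sum of group means; with (-1, 0, +1) the middle group drops out."""
    return sum(c * m for c, m in zip(coeffs, means))

# Change in REE (kcal/day), ordered by carbohydrate content: VLC, LGI, LF
ree_change = (-138, -166, -205)

trend_estimate = linear_contrast(ree_change)
outer_difference = ree_change[2] - ree_change[0]   # LF minus VLC
print(trend_estimate, outer_difference)
```

Because the middle coefficient is zero, the LGI mean does not enter the estimate at all.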
ItsTheWooo said…
The usual suspects are pointing out the obese population was "weight stable" which is inconsistent with a higher energy output. Weight stable does not mean body fat stable. Perhaps they were preserving or developing lean tissue and atrophying their fat tissue, leading to stability in kilos, but their actual obesity was still progressively improving? Did they do DEXA scans and compare them pre and post diet? That would be interesting as other studies have found the VLC leads to a more favorable body composition in spite of a lack of weight loss, particularly in women, when compared to a low fat diet.

Anecdote, although a reverse case: when I stopped leptin in May of 2011, I weighed almost 10 pounds less than I do now. I have gained a lot of weight, but all my clothes still fit. While I have obviously gained some body fat, I definitely gained a lot of lean mass too, as evidenced by stronger larger muscles. OMG fatass you gained 10 pounds, but yea, I still wear all my old clothing (it is slightly tighter), it seems much of the weight I gained was lean tissue. I doubt I gained more than 5 pounds of fat given the fact there is not even 1 piece of clothing I cannot wear (there is one pair of jeans which is almost too tight but I can squeeze into them; it should be noted these jeans were tight even last year!)

If at my size (very thin) I can gain pretty much nearly 10 pounds and have a lot of it be shunted into lean tissue, I am pretty sure a bunch of obese people eating VLC diets can lose body fat/retain or gain lean tissue, and weigh exactly the same as another group of fat people who are altering their endocrine dynamic to preserve fat tissue and waste lean from that "insufficient substrate availability in the late postprandial period". Eventually over a longer period of time we would observe weight loss in the VLC group (when lean tissue retention becomes maximal, and fat tissue atrophy leads to loss of scale weight) or alternatively an attenuation of the energy out advantage and no loss of mass.

The higher thyroid and leptin in the LF group suggests fat tissue growth, as a relatively higher T3 and leptin is an expected result of insulin activity and larger fat tissue.
Geoff 99 said…
Here is an oddity. The RQ value for each of the 4 diets compared to the actual macro nutrient ratios consumed. If you use your standard NPRQ conversion tables to determine the percentages of carbohydrate and fat being oxidized for each given RQ, it looks something like this (please excuse my ad hoc rounding):

Diet (RQ) = cho% : fat %
PWLB ( .901) = 67.5 : 32.5
LFD ( .905) = 69 : 31
LGI ( .861) = 54.3 : 45.7
VLC ( .826) = 42 : 58

Using the above numbers subtract the actual fat burnt percentage from the fat supplied in the diet ( all the while assuming weight stability and dietary compliance )

Diet (supplied fat - RQ determined burnt fat) = Result
PWLB (30 - 32.5) = 2.5% Dietary Fat Deficit
LFD (20 - 31) = 11% Dietary Fat Deficit
LGI (40 - 45.7) = 5.7% Dietary Fat Deficit
VLC (60 - 58) = 2% Dietary Fat Surplus

So only the VLC diet is burning less fat than actually consumed in the diet?
Sort of makes you wonder where the metabolic advantage is taking you :-)
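The subtraction above is easy to recheck. This sketch takes the fat-oxidation percentages exactly as listed in the comment (read off NPRQ tables by the commenter, not recomputed here) and the dietary fat percentages as stated, and it inherits the same assumptions of weight stability and compliance. The variable names are made up.

```python
# diet: (fat supplied in the diet, % of energy; fat oxidized per the RQ figures, %)
diets = {
    "PWLB": (30, 32.5),
    "LFD": (20, 31),
    "LGI": (40, 45.7),
    "VLC": (60, 58),
}

# Negative balance: more fat oxidized than supplied (a dietary fat deficit)
fat_balance = {
    name: round(supplied - oxidized, 1)
    for name, (supplied, oxidized) in diets.items()
}

for name, balance in fat_balance.items():
    label = "deficit" if balance < 0 else "surplus"
    print(f"{name}: {abs(balance)}% dietary fat {label}")
```

Only VLC comes out with a positive balance, matching the observation in the comment.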
ProudDaddy said…
Maybe "VERY well-conducted", but certainly for me (and even the average MD, I suspect) maybe not very well reported. For me, it's a perfect example of the desirability of posting the raw data online. This had to be a very expensive study, and as reported it seems to me that it asks more questions than it answers. Or perhaps that's the grant application writer's objective?
ProudDaddy said…
Interesting. Do we know whether they measured body composition following each dietary intervention and just didn't report it (control variable only or somesuch reason)? (Or maybe I overlooked it?)

Does the production of lean body mass, sans water, require a greater energy expenditure than fat mass? If so, do we know how much?
ProudDaddy said…
While you academics tease out the details, allow a layman to present a few big picture thoughts:

1. It might have been better to have only LF and VLC diets for 6 weeks each. Less confusion, less cost.

2. The VLC could just as easily have been labeled a High Protein diet. 150g/day is not much different than that of a lot of bodybuilders. (This relates to Wooo's comments on body composition. And yes I know that body composition change via eucaloric diet composition differences is controversial, but...)

3. How can physical activity be the same and TEE difference from REE be so large? What generates this additional expenditure?

4. Where did the additional TEE or REE go? Body temperature? They didn't think to take it? Futile cycles? Which cycle? Etc.

5. And finally, the big one. To a layman, the scatter plots indicate a number of things. First, as Nigel says, we are all different, and prescribing a maintenance diet based on these results would be ludicrous. Perhaps which diet to try first, maybe. Second, how can the differences be so great and the blue box (CI?) be so small? I'm guessing it relates to getting the same mean if you repeated the test or something, but why not state the obvious - the results are all over the map, and all the statistical mumbo jumbo you throw at it can't change that. The words confounder and measurement error quickly come to mind.
CarbSane said…
See the post I just made. Interesting how the layman and the academic came to the same place, eh?!

In the REE data, the error in the CI is almost the maximum differential between the means (60 vs 67). P = 0.03????????
ProudDaddy, I have to disagree with you on point 1. Maybe it's all the Buddhist reading I've done, but I strongly suspect that the middle way is likely to be the best approach. So studies that only compare VHC to VLC are problematic IMO.
I should probably qualify that "best approach" ... more like a better approach for perhaps a fairly large population. Some folks will do better on VHC or VLC for a variety of reasons.
Anonymous said…
Why did the authors of the study have to supply the participants with fiber (Metamucil) and multivitamins?
CarbSane said…
@euler: This was Don's major quibble. Why low fat necessarily has to be high glycemic index is beyond me -- the Pima and Kitivan would disagree. When you say it's whole grain based with veggies, eh.

Maybe little pharma funded this >:)
CarbSane said…
I won't speak for PD, but I don't think he's saying extremes are the solution, but perhaps that to look at these things, let's do a more extreme comparison free from confounders. Why not compare 25% protein with two extremes between fat & carb (20F/55C v. 65F/5C or whatever) for a slightly longer time. I think this would produce better information all the way around. There are other confounders I have not addressed ... like the relative MUFA contents of the diets.

If you're going to do a 3-way, and attempt to draw inferences for one variable (carb quality/quantity), at the very least control what you can -- that being prescribed protein, fatty acid profile and heck, even fiber.
Galina L. said…
I personally think that the perfect "middle way" is hard to achieve in real life in many things, diet included. Getting everything balanced is a struggle more often than not. I find that eating less by limiting (not eliminating) carbohydrates makes the execution of the diet more doable.
CarbSane said…
I agree. For many the extremes are easier to maintain because if they abstain from X entirely, it's easier than eating just a little X or consuming "moderately of X". Some might get the wrong impression from me that I'm anti-LC. I'm not. It works for me to a point, and from time to time I just find myself eating VLC. I don't find true low fat (<=20% but 25% max, I don't consider 30% LF) workable for me, but VLC can be for long stretches.

That said, I do believe that most people will probably achieve acceptable (if not optimal) results somewhere in the middle. Unless they are able to control their environment and limit socializing it's going to be tough. And I think it's dangerous to forge friendships based on eating style. Cuz "come out" when you change course and then what?
Galina L. said…
There are people who claim it is harder to socialize on LC diets. I have to admit that I am unable to relate to that complaint at all. The reason, probably, is that the main health issue that could affect my socializing is the number of allergies I have (especially to cats, and the worsening of all allergies after alcohol consumption), and in comparison LCarbing is a very small thing. You can always eat some cake or take a couple of cookies, or even forget about your diet for one evening, or bring a LC version of a dessert with you to a party (I opt for that most of the time), but with an allergy to salmon, for example, there is little room for compromise when the main course is the salmon.

All three of us - me, my husband and our son - are prone to allergies. Our family has a more serious challenge than a diet that may affect relationships with other people, especially Russian friends: we had to stop social drinking because it immediately makes our allergies worse. My son, who goes to college, experiences the most social discomfort, even though his friends are glad to have him as a constant designated driver. I am the only LCarber in my family, and as I commented before, my diet made my allergies less of an issue. I am no longer limited to 45 minutes in my best friend's house because she has 3 cats, I can have a glass of wine or two again without wheezing from asthma afterwards, and I can eat salmon again. My son stopped eating gluten, which made his eczema much better, but drinking still produces a huge eczema flare. He even considered LCarbing in order to be able to drink at parties; it would be the most bizarre reason for a diet. He likes mashed potatoes and buckwheat crepes too much for that. He is on vacation at home now, and I have a jar with a mix for the crepes in my fridge all the time.
ProudDaddy said…
Beth, what I was trying to say was that for the same money they could have increased the diet durations to get much better stabilized data, and a crossover design with three changes is not only more expensive but more difficult to analyze wrt carryover effects. Of course, the middle way will probably be the best way for the most people, but I thought studies like this had their real value in learning how things work. The next step would be prescribing the way.

I've read a lot of Alan Watts, etc., but let's not forget that the Buddha's goal was to eliminate ALL desire. And how that relates to anything escapes me right now, but erasing is hard on my tablet!:-)
ProudDaddy said…
Beth, I answered your response before reading Evelyn's. She CAN speak for me (on this matter) and has certainly done a better job than I.

I see Evelyn's new post addresses some, but not all, of my 5 points. Anyone have any ideas about say Issue 4?