Propensity score matching in Python, revisited

Update 8/11/2017: I’ve been working on turning this code into a package people can download and contribute to. Please use the package, linked here, instead of the code I shared in a Jupyter notebook previously.

I can’t believe how many people from all around the world visit my previous blog post on propensity score matching in Python every day. It feels great to know that my code is out there and people are actually using it. However, I realized that the notebook I linked to previously doesn’t contain much and that I wrote heaps more code after posting it. Hence, I’m sharing a more complete notebook with code for different variations on propensity score matching, functions to compute average treatment effects and their standard errors, and checks for balance between matched groups.

Workload and reduced fecundity — try not to work too hard, ladies

Here’s a vaguely misogynistic study for you.  This article, “Women who work or lift a lot may struggle to get pregnant,” discusses the findings from a recent paper in Occupational and Environmental Medicine.  The authors surveyed women trying to conceive in the Nurses’ Health Study 3, a large cohort of predominantly Caucasian nurses.  Their covariates of interest were how many hours per week women worked and how often they lifted more than 25 lbs in a day; the primary outcome was time to conception.  The authors concluded,

Working more than 40 hours a week was linked with taking 20 percent longer to get pregnant compared to women who worked 21 to 40 hours.

Moving or lifting at least 25-pound loads several times a day was also tied to delayed pregnancy, extending the time to conception by about 50 percent.

The unstated interpretation is that women’s bodies can’t handle working a full day or lifting any weight, so women of reproductive age should think twice about what they do for a living.

Here’s the original paper.

One potential issue is that the study is cross-sectional and the authors didn’t actually follow the women from the time of first interview until they got pregnant.  Instead, they used one survey to ask how long the women had been trying to conceive, then used a survival analysis method to estimate the time to pregnancy based on the self-reported times.  This method of analysis is biased: women who had no trouble conceiving are underrepresented in the sample and women who have taken a long time to get pregnant are overrepresented.  Furthermore, we don’t know the true outcomes for these women, only the ones estimated by a parametric model.
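
To make the sampling problem concrete, here is a toy simulation of a one-time survey. It is only a sketch and not the authors' analysis: the exponential distribution, the 6-month average, and the function name `simulate_cross_section` are all my own assumptions for illustration. Women start trying at random times, and the survey only catches those who are still trying on the survey date.

```python
import random
from statistics import mean

def simulate_cross_section(n_women=100_000, mean_months=6.0,
                           window=240.0, seed=0):
    """Compare true times to conception with those seen in a one-time survey.

    Assumptions (illustrative only): times to conception are exponential
    with the given mean, and women start trying uniformly over the window.
    """
    rng = random.Random(seed)
    survey_time = window  # the survey happens at the end of the window
    all_durations, sampled_durations = [], []
    for _ in range(n_women):
        start = rng.uniform(0.0, window)
        duration = rng.expovariate(1.0 / mean_months)
        all_durations.append(duration)
        # A woman is in the survey only if she is still trying at survey time.
        if start + duration > survey_time:
            sampled_durations.append(duration)
    return mean(all_durations), mean(sampled_durations)

population_mean, surveyed_mean = simulate_cross_section()
print(f"average time to conception, all women:      {population_mean:.1f} months")
print(f"average time to conception, surveyed women: {surveyed_mean:.1f} months")
```

Under these assumptions the surveyed women have a noticeably longer average time to conception than the population they came from, simply because long attempts are more likely to overlap the survey date.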

My biggest issue with this study is that they attach any meaning to their findings at all, saying that working more has a “detrimental impact on female nurses’ ability to get pregnant”.  They use duration of pregnancy attempt “as a surrogate for fecundity”.  Fecundity implies some biological ability to reproduce.  However, using time to conception as a proxy for fecundity relies on the assumption that everyone is trying equally hard to get pregnant.  If that were the case, then any variation in time to conception would be due to fecundity.  This isn’t something they checked or measured, and differences in women’s ideas of what “trying to get pregnant” means are probably what’s actually driving the trend the authors reported.
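
Here is a back-of-the-envelope sketch of why effort matters. Everything in it is hypothetical: I assume every woman has the same chance of conceiving per attempt (identical "fecundity"), and the only thing that differs is how many attempts happen per cycle. The numbers and the function name `expected_cycles` are made up for illustration.

```python
def expected_cycles(p_per_attempt, attempts_per_cycle):
    """Expected number of cycles until conception.

    Assumes independent attempts with the same success probability, so the
    number of cycles until conception follows a geometric distribution.
    """
    p_per_cycle = 1.0 - (1.0 - p_per_attempt) ** attempts_per_cycle
    return 1.0 / p_per_cycle

# Same biology (hypothetical 10% chance per attempt), different effort.
same_fecundity = 0.10
for attempts in (4, 2):
    cycles = expected_cycles(same_fecundity, attempts)
    print(f"{attempts} attempts per cycle: {cycles:.1f} cycles on average")
```

With identical per-attempt probability, halving the number of attempts per cycle lengthens the expected time to conception, so a longer time to conception by itself says nothing about biology.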

The Reuters article quoted someone sensible:

“If this effect is real, it is likely due to the fact that these women are having less frequent intercourse due to their work demands,” Lynch, who wasn’t involved in the study, said by email.

Nobody needed to do a study to figure that out.  Anyway, we could come up with all sorts of other plausible explanations for why women who work more are having less frequent intercourse.  If they redid this study on a cohort of women working in tech, I’m sure they’d find a similar relationship between number of hours worked and time to conception.  The point is, working more hours or picking up 25 lb boxes probably has no effect on anyone’s biological capacity to reproduce.  The authors are making a mountain out of a molehill.

I routinely lift 100 lbs over my head, so I guess I’ll really be screwed when I want to have a baby.

An avocado a day

I don’t know about you, but I love avocados. In the last year or so, my diet has evolved to incorporate a great deal of healthy fats in the form of avocados, nuts, seeds, and eggs. The jury is still out as to whether fat is good or bad for you. The Dietary Guidelines Advisory Committee has recently modified its recommendations about fat and cholesterol, calling instead for a reduction in sugar consumption and a move towards healthy fats like those in the Mediterranean diet. This seems to me like a move in the right direction, but there is still some push-back from people (and big players in the food industry) who think fat is bad.

I was pleased to come across this recent paper, which reports a significant beneficial effect of eating an avocado per day. In particular, switching from the “average American diet” to a moderate fat diet that incorporates one avocado per day results in a decrease in LDL cholesterol, the “bad cholesterol”. The most intriguing result is that the decrease in LDL cholesterol is greater in the moderate fat, avocado group than in the moderate fat, no avocado group, presumably due to something unique to avocados. The study looks pretty solid to me:

  • They used a randomized crossover design – each subject gets all three diets, so we get the benefit of bigger treatment groups to estimate the causal effect of each dietary intervention. Randomizing the order in which a person gets the three treatments controls for the possible effect of a particular sequence of treatments.
  • The statistical analyses look acceptable.  They measured the “causal effect” of each diet on baseline characteristics using linear mixed models and non-parametric ANOVA, correcting for multiple tests with Tukey’s method.  If the randomization worked, then all confounding variables are balanced between the groups receiving a particular sequence of diets, making them alike in every way except for the treatment received.  This means the coefficient for the diet in the model is actually estimating what it is supposed to.
  • It’s hard to assess compliance for a dietary intervention. Subjects have to eat the prescribed foods in the right amount.  Without constant supervision it’s hard to know what really happened.  The authors tried to assign subjects meal plans with just enough calories to maintain their weight.  Then they assessed compliance by weighing subjects every day.  This is probably as accurate as they can get without constantly watching the subjects.  Perhaps they could do urine/stool samples to check the amounts of certain nutrients, but that seems like overkill.
  • The sample size is small, n = 40. However, they did a power analysis based on a similar study of a pistachio diet, and in an idealized situation a sample size of 37 would give them 95% power to detect a 10% decrease in LDL-C due to the avocado diet versus the low fat diet.
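
To get a feel for what the power claim in the last bullet implies, here is a rough sketch using a normal approximation for a paired (within-subject) comparison. This is not the authors' calculation: the two-sided 5% significance level is my assumption, and the function names are hypothetical. It asks how large the effect must be, in units of the standard deviation of within-subject differences, for 37 subjects to give 95% power.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def implied_effect_size(n, power=0.95, alpha=0.05):
    """Standardized effect detectable with the given power.

    Normal approximation for a two-sided paired test; alpha is an assumption.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) / sqrt(n)

def approximate_power(n, effect_size, alpha=0.05):
    """Approximate power, ignoring the negligible far rejection tail."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(effect_size * sqrt(n) - z_alpha)

d = implied_effect_size(37)
print(f"implied standardized effect: {d:.2f}")
print(f"power with n = 40 at that effect: {approximate_power(40, d):.3f}")
```

Under these assumptions, the claim amounts to saying that a 10% drop in LDL-C is roughly 0.6 standard deviations of the within-subject differences, which is a fairly large effect. That is why a sample this small can still be adequately powered in a crossover design.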

All diets had a beneficial effect on various subtypes of LDL and HDL cholesterol, but the avocado diet surpassed the other two. The next steps are more studies to look at the individual micronutrients in avocados that are driving this beneficial effect. In the meantime, I plan to continue eating my avocado a day.