How To Measure Small Effects in Your Data

If you make a change to your daily routine or try a new medication, how do you know if it is working?

This was the question Bard sent in for the QS Scientific Advisory Board. His challenge was met by Neil Rubens, Teresa Lunt and David Goldberg. Read Bard’s question and their answers below.

And if you have a question about your self-tracking for our advisors, let me know.


Bard’s Symptom Tracking Experiment:

My purpose is to correlate whether taking a particular medication helps to alleviate specific symptoms.

Medications and symptoms are tracked in a Google doc.

My question is this: imagine two columns, A recording whether the medication was taken (0 or 1) and B measuring some relevant symptom (value x). The median of column B is 99 in rows where column A is 1, and 100 in rows where column A is 0. The standard deviation is 1 (or any SD really, I just wanted to find a formula that could still find a high correlation with such a small deviation, which shouldn’t be hard considering the huge amount of data that I have). This simulates a pill that works, on average, to decrease the symptoms by 1 point. Not a huge change, but extremely consistent, so worth identifying.

I would like a formula that returns the “strength” of the correlation, which in this example is approx. 100%, given a large enough data set. Any help would be greatly appreciated.


Neil Rubens’ Answer:

Hi Bard,

If I understood your questions correctly you may consider using the following two approaches to analyze your data.

1. There are many different ways of measuring dependence between variables (besides correlation); this wiki link on “dependence measurement” should provide a good place to start.

2. You can use a “statistical hypothesis test” to establish whether the difference between treatments is statistically significant (even if this difference is very small), that is, unlikely to have occurred by chance.

I hope this at least partially answers your questions.



David Goldberg and Teresa Lunt’s answer:

Hi Bard,

I’m not sure correlation is the best way to think about this, since one of the variables (the A column in your notation) takes on only two values, either 0 or 1.
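A related point can be sketched on simulated data. This is not Bard's data: the group size of 400, the means of 100 and 99, the SD of 1, and the seed are all assumptions taken from the setup in the question. Under those assumptions the correlation coefficient does not climb toward 100% as data accumulates. It settles near a fixed value (about -0.45 here), while the p-value is what shrinks.

```r
set.seed(1)                      # assumed seed, only for reproducibility
n <- 400                         # hypothetical number of days per group
taken   <- rep(c(0, 1), each = n)            # column A: 0 = no pill, 1 = pill
symptom <- c(rnorm(n, mean = 100, sd = 1),   # column B on days without the pill
             rnorm(n, mean = 99,  sd = 1))   # column B on days with the pill

r <- cor(taken, symptom)         # ordinary correlation, as =correl would compute

# Large-sample value for two equal-sized groups whose means differ by d,
# with within-group SD s:  r = -(d / 2) / sqrt(s^2 + d^2 / 4)
r_theory <- -(1 / 2) / sqrt(1^2 + 1^2 / 4)   # about -0.447

p <- cor.test(taken, symptom)$p.value        # very small despite the modest r
print(c(r = r, r_theory = r_theory, p = p))
```

So a "strength" near 100% is not something the correlation coefficient can report for a one-SD shift; the tests discussed next answer the question of whether the shift is real.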

It might make more sense to consider the two sets of symptom values, S1 for subjects who didn’t take the medication, and S2 for those who did. Then you can use the tests developed for comparing two sets of numbers. Here are three common tests.

1. The t-test.  Using the free ‘R’ statistical package on this data (where x=S1, y=S2) gives:

    > t.test(x,y)
            Welch Two Sample t-test
    data:  x and y
    t = 4.105, df = 797.703, p-value = 4.459e-05
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     0.4905078 1.3894922
    sample estimates:
    mean of x mean of y
       99.795    98.855

This not only says there is a statistically significant difference (p = 0.00004), but also tells you that, with 95% confidence, the difference in means between the two sets is between 0.5 and 1.4. In other words, the mean symptom value in the control (no medication) group is likely to be at least 0.5 higher than in the experimental group.
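For readers starting from a two-column log like Bard's, here is a minimal sketch of how x and y might be built before calling `t.test`. The data frame `diary`, its column names `taken` and `symptom`, and the simulated values are hypothetical stand-ins, not the data behind the output above.

```r
set.seed(2)                        # assumed seed, only for reproducibility
# Hypothetical stand-in for the spreadsheet: one row per day.
diary <- data.frame(
  taken   = rep(c(0, 1), each = 400),
  symptom = c(rnorm(400, mean = 100, sd = 3),
              rnorm(400, mean = 99,  sd = 3))
)

x <- diary$symptom[diary$taken == 0]   # S1: days without the medication
y <- diary$symptom[diary$taken == 1]   # S2: days with the medication

res <- t.test(x, y)                # Welch two-sample t-test (R's default)
ci  <- res$conf.int                # 95% CI for mean(x) - mean(y)
est <- mean(x) - mean(y)           # observed difference in means
cat(sprintf("difference = %.2f, 95%% CI = (%.2f, %.2f), p = %.2g\n",
            est, ci[1], ci[2], res$p.value))
```

The fields `res$conf.int` and `res$p.value` hold the same numbers that the printed report shows, which is handy if the results need to be copied back into a spreadsheet.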

2. Another possibility is the Wilcoxon rank-sum test. If you think the symptom values are nowhere near having a Gaussian distribution, then this would be more appropriate. For your data (again using ‘R’):

    > wilcox.test(x,y)
            Wilcoxon rank sum test with continuity correction
    data:  x and y
    W = 93025.5, p-value = 6.297e-05
    alternative hypothesis: true location shift is not equal to 0

Again the test shows the two sets differ, since p is so small (p = 0.00006). However, you don’t get the confidence interval for the difference of means.
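If an interval from the rank-based approach is still wanted, R's `wilcox.test` accepts `conf.int = TRUE`. What it then reports is an interval for the location shift (roughly, the median of the differences between a value from x and a value from y), which is not the same thing as the difference in means. A minimal sketch on simulated, deliberately skewed data (the exponential shape, sample sizes, and seed are assumptions for illustration only):

```r
set.seed(3)                        # assumed seed, only for reproducibility
# Hypothetical skewed symptom scores; y has the same shape as x, shifted down by 1.
x <- 99 + rexp(400, rate = 1/3)    # days without the medication
y <- 98 + rexp(400, rate = 1/3)    # days with the medication

res   <- wilcox.test(x, y, conf.int = TRUE)
shift <- unname(res$estimate)      # estimated location shift between x and y
ci    <- res$conf.int              # 95% CI for that shift (not for the means)
cat(sprintf("shift = %.2f, 95%% CI = (%.2f, %.2f), p = %.2g\n",
            shift, ci[1], ci[2], res$p.value))
```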

3. If the data aren’t Gaussian and you want the confidence interval for the difference in means, consider using the bootstrap.

    > fn = function()
    +   mean(sample(x, length(x), replace=TRUE)) - mean(sample(y, length(y), replace=TRUE))
    > replicates = replicate(1000, fn())
    > quantile(replicates, c(0.025, 0.975))
         2.5%     97.5%
    0.4674375 1.3500000

This gives a 95% confidence interval similar to the one from the t-test: (0.47, 1.35) vs. (0.49, 1.39).
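The same percentile bootstrap can be wrapped in a small reusable function. This is only a sketch: the name `boot_diff_ci`, its defaults, and the simulated skewed data are assumptions, and setting a seed simply makes the resampling repeatable.

```r
set.seed(4)                        # assumed seed, so the resampling is repeatable
# Hypothetical skewed symptom scores standing in for real data.
x <- 99 + rexp(400, rate = 1/3)    # days without the medication
y <- 98 + rexp(400, rate = 1/3)    # days with the medication

# Percentile bootstrap CI for mean(x) - mean(y); helper name is hypothetical.
boot_diff_ci <- function(x, y, n_rep = 2000, level = 0.95) {
  diffs <- replicate(
    n_rep,
    mean(sample(x, replace = TRUE)) - mean(sample(y, replace = TRUE))
  )
  alpha <- 1 - level
  quantile(diffs, c(alpha / 2, 1 - alpha / 2))
}

ci  <- boot_diff_ci(x, y)
obs <- mean(x) - mean(y)           # observed difference in means
print(ci)
print(obs)
```

Because the interval comes from random resampling, its endpoints change slightly from run to run unless the seed is fixed; more replicates make them steadier.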

Palo Alto Research Center


Thanks to Bard for the question and to Neil, David and Teresa for their answers! Brilliant and experimenting QS readers, please send in your questions and we’ll do our best to find answers for you.


3 Responses to How To Measure Small Effects in Your Data

  1. Bard says:

    Wow! Fantastic guys, thank you so much.
    Now that my head is spinning sufficiently I will get down to working through your solutions and hopefully apply them to my personal biometric data and see if some interesting results come out of it.
    The hard part for me is jamming these formulas into the basic Google spreadsheet that I’m running my metrics through.
    The basic correlation coefficient is what I have used so far, simply because it’s just so easy to type “=correl” :) and set up a grid comparing each correlation across categories (there’s an example of it in my blog post on this site).
    Thanks again, Bard

  2. Jochen says:

    Because all measurements come from the same person, these are not independent observations. Therefore the t-test and the Wilcoxon test are not suitable tests in this case. Unfortunately I do not have a solution for the problem.

  3. Gary Wolf says:

    Jochen – I’m afraid I don’t understand why neither the t-test nor the Wilcoxon test has value in this situation. Could you expand?
