Commit e386f95

Merge pull request #136 from sangyu/v0.4dev
Delta-delta docs amendment
2 parents 1a135ad + ad62a6f commit e386f95

2 files changed

Lines changed: 51 additions & 36 deletions

dabest/_classes.py

Lines changed: 20 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -857,23 +857,31 @@ def _all_plot_groups(self):
857857

858858
class DeltaDelta(object):
859859
"""
860-
A class to compute and store the delta-delta statistics. In a 2-by-2 arrangement where two independent variables, A and B, each have two categorical values, two primary deltas are first calculated with one independent variable and a delta-delta effect size is calculated as a difference between the two primary deltas.
860+
A class to compute and store the delta-delta statistics for experiments with a 2-by-2 arrangement where two independent variables, A and B, each have two categorical values, 1 and 2. The data is divided into two pairs of two groups, and a primary delta is first calculated as the mean difference between each of the pairs:
861861
862862
.. math::
863863
864-
\\hat{\\theta}_{B1} = \\overline{X}_{A2, B1} - \\overline{X}_{A1, B1}
864+
\\Delta_{1} = \\overline{X}_{A_{2}, B_{1}} - \\overline{X}_{A_{1}, B_{1}}
865865
866-
\\hat{\\theta}_{B2} = \\overline{X}_{A2, B2} - \\overline{X}_{A1, B2}
866+
\\Delta_{2} = \\overline{X}_{A_{2}, B_{2}} - \\overline{X}_{A_{1}, B_{2}}
867867
868+
where :math:`\\overline{X}_{A_{i}, B_{j}}` is the mean of the sample with A = i and B = j, and :math:`\\Delta` is the mean difference between two samples.
869+
870+
A delta-delta value is then calculated as the difference between the two primary deltas:
871+
868872
.. math::
869873
870-
\\hat{\\theta}_{\\theta} = \\hat{\\theta}_{B2} - \\hat{\\theta}_{B1}
874+
\\Delta_{\\Delta} = \\Delta_{2} - \\Delta_{1}
871875
872876
and:
873877
878+
and the standard deviation of the delta-delta value is calculated from a pooled variance of the 4 samples:
879+
874880
.. math::
875881
876-
s_{\\theta} = \\frac{(n_{A2, B1}-1)s_{A2, B1}^2+(n_{A1, B1}-1)s_{A1, B1}^2+(n_{A2, B2}-1)s_{A2, B2}^2+(n_{A1, B2}-1)s_{A1, B2}^2}{(n_{A2, B1} - 1) + (n_{A1, B1} - 1) + (n_{A2, B2} - 1) + (n_{A1, B2} - 1)}
882+
s_{\\Delta_{\\Delta}} = \\sqrt{\\frac{(n_{A_{2}, B_{1}}-1)s_{A_{2}, B_{1}}^2+(n_{A_{1}, B_{1}}-1)s_{A_{1}, B_{1}}^2+(n_{A_{2}, B_{2}}-1)s_{A_{2}, B_{2}}^2+(n_{A_{1}, B_{2}}-1)s_{A_{1}, B_{2}}^2}{(n_{A_{2}, B_{1}} - 1) + (n_{A_{1}, B_{1}} - 1) + (n_{A_{2}, B_{2}} - 1) + (n_{A_{1}, B_{2}} - 1)}}
883+
884+
where :math:`s` is the standard deviation and :math:`n` is the sample size.
877885
878886
Example
879887
-------
@@ -887,16 +895,16 @@ class DeltaDelta(object):
887895
>>> y = norm.rvs(loc=3, scale=0.4, size=N*4)
888896
>>> y[N:2*N] = y[N:2*N]+1
889897
>>> y[2*N:3*N] = y[2*N:3*N]-0.5
890-
>>> # Add drug column
898+
>>> # Add a `Treatment` column
891899
>>> t1 = np.repeat('Placebo', N*2).tolist()
892900
>>> t2 = np.repeat('Drug', N*2).tolist()
893901
>>> treatment = t1 + t2
894-
>>> # Add a `rep` column as the first variable for the 2 replicates of experiments done
902+
>>> # Add a `Rep` column as the first variable for the 2 replicates of experiments done
895903
>>> rep = []
896904
>>> for i in range(N*2):
897905
>>>     rep.append('Rep1')
898906
>>>     rep.append('Rep2')
899-
>>> # Add a `genotype` column as the second variable
907+
>>> # Add a `Genotype` column as the second variable
900908
>>> wt = np.repeat('W', N).tolist()
901909
>>> mt = np.repeat('M', N).tolist()
902910
>>> wt2 = np.repeat('W', N).tolist()
@@ -909,10 +917,12 @@ class DeltaDelta(object):
909917
>>> df_delta2 = pd.DataFrame({'ID' : id_col,
910918
>>> 'Rep' : rep,
911919
>>> 'Genotype' : genotype,
912-
>>> 'Drug': treatment,
920+
>>> 'Treatment': treatment,
913921
>>> 'Y' : y
914922
>>> })
915-
923+
>>> unpaired_delta2 = dabest.load(data = df_delta2, x = ["Genotype", "Genotype"], y = "Y", delta2 = True, experiment = "Treatment")
924+
>>> unpaired_delta2.mean_diff.plot()
925+
916926
917927
918928
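
The formulas in the docstring above can be sketched in plain NumPy. This is only an illustration of the arithmetic, not dabest's own implementation: the function name `delta_delta` and its argument names are hypothetical, and the four inputs are assumed to be the raw samples for (A1, B1), (A2, B1), (A1, B2) and (A2, B2).

```python
import numpy as np


def delta_delta(a1_b1, a2_b1, a1_b2, a2_b2):
    """Return (delta_1, delta_2, delta_delta, pooled_sd) for four samples.

    Hypothetical helper that mirrors the docstring formulas; it is not
    part of the dabest API.
    """
    samples = [np.asarray(s, dtype=float) for s in (a1_b1, a2_b1, a1_b2, a2_b2)]
    x11, x21, x12, x22 = samples

    # Primary deltas: mean difference between A2 and A1 within each level of B.
    delta_1 = x21.mean() - x11.mean()
    delta_2 = x22.mean() - x12.mean()

    # Delta-delta: difference between the two primary deltas.
    dd = delta_2 - delta_1

    # Pooled standard deviation: square root of the variance pooled over
    # the four samples, each weighted by its degrees of freedom (n - 1).
    dof = sum(len(s) - 1 for s in samples)
    pooled_var = sum((len(s) - 1) * s.var(ddof=1) for s in samples) / dof
    return delta_1, delta_2, dd, np.sqrt(pooled_var)


d1, d2, dd, sd = delta_delta([1, 2, 3], [2, 3, 4], [1, 2, 3], [4, 5, 6])
```

With these toy inputs every sample has unit variance, so the pooled standard deviation is 1, and the primary deltas are 1 and 3, giving a delta-delta of 2.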

docs/source/deltadelta.rst

Lines changed: 31 additions & 26 deletions
@@ -35,7 +35,7 @@ Effectively, we have 4 groups of subjects for comparison.
3535
<thead>
3636
<tr style="text-align: right;">
3737
<th></th>
38-
<th>Wildtype</th>
38+
<th>Wild type</th>
3939
<th>Mutant</th>
4040
</tr>
4141
</thead>
@@ -60,7 +60,7 @@ Effectively, we have 4 groups of subjects for comparison.
6060
</div>
6161

6262

63-
There are 2 ``Treatment`` conditions, ``Placebo`` (control group) and ``Drug`` (test group). There are 2 ``Genotype`` s: ``W`` (wildtype population) and ``M`` (mutant population). In addition, each experiment was done twice (``Rep1`` and ``Rep2``). We shall do a few analyses to visualise these differences in a simulated dataset.
63+
There are 2 ``Treatment`` conditions, ``Placebo`` (control group) and ``Drug`` (test group). There are 2 ``Genotype``\s: ``W`` (wild type population) and ``M`` (mutant population). In addition, each experiment was done twice (``Rep1`` and ``Rep2``). We shall do a few analyses to visualise these differences in a simulated dataset.
6464

6565
Simulate a dataset
6666
------------------
@@ -83,18 +83,18 @@ Simulate a dataset
8383
y[N:2*N] = y[N:2*N]+1
8484
y[2*N:3*N] = y[2*N:3*N]-0.5
8585
86-
# Add drug column
86+
# Add a `Treatment` column
8787
t1 = np.repeat('Placebo', N*2).tolist()
8888
t2 = np.repeat('Drug', N*2).tolist()
8989
treatment = t1 + t2
9090
91-
# Add a `rep` column as the first variable for the 2 replicates of experiments done
91+
# Add a `Rep` column as the first variable for the 2 replicates of experiments done
9292
rep = []
9393
for i in range(N*2):
9494
    rep.append('Rep1')
9595
    rep.append('Rep2')
9696
97-
# Add a `genotype` column as the second variable
97+
# Add a `Genotype` column as the second variable
9898
wt = np.repeat('W', N).tolist()
9999
mt = np.repeat('M', N).tolist()
100100
wt2 = np.repeat('W', N).tolist()
@@ -112,7 +112,7 @@ Simulate a dataset
112112
df_delta2 = pd.DataFrame({'ID' : id_col,
113113
'Rep' : rep,
114114
'Genotype' : genotype,
115-
'Drug': treatment,
115+
'Treatment': treatment,
116116
'Y' : y
117117
})
118118
@@ -206,8 +206,7 @@ for slopegraphs. We use the ``experiment`` input to specify grouping of the data
206206
.. code-block:: python3
207207
:linenos:
208208
209-
unpaired_delta2 = dabest.load(data = df_delta2, x = ["Genotype", "Genotype"], y = "Y", delta2 = True,
210-
experiment = "Drug")
209+
unpaired_delta2 = dabest.load(data = df_delta2, x = ["Genotype", "Genotype"], y = "Y", delta2 = True, experiment = "Treatment")
211210
212211
The above function creates the following object:
213212

@@ -279,26 +278,31 @@ administered, the mutant phenotype is around 1.23 [95%CI 0.948, 1.52]. This diff
279278
and ``Drug`` group are plotted at the right bottom with a separate y-axis from other bootstrap plots.
280279
This effect size, at about -0.903 [95%CI -1.26, -0.535], is the net effect size of the drug treatment. That is to say that treatment with drug A reduced disease phenotype by 0.903.
281280

281+
Mean difference between mutants and wild types given the placebo treatment is:
282+
282283
.. math::
283284
284-
\hat{\theta}_{P} = \overline{X}_{P, M} - \overline{X}_{P, W}
285+
\Delta_{1} = \overline{X}_{P, M} - \overline{X}_{P, W}
286+
287+
Mean difference between mutants and wild types given the drug treatment is:
285288

286-
\hat{\theta}_{D} = \overline{X}_{D, M} - \overline{X}_{D, W}
287-
288289
.. math::
289290
291+
\Delta_{2} = \overline{X}_{D, M} - \overline{X}_{D, W}
290292
291-
\hat{\theta}_{\theta} = \hat{\theta}_{D} - \hat{\theta}_{P}
293+
The net effect of the drug on mutants is:
292294

293-
and:
294-
295295
.. math::
296296
297-
s_{\theta} = \frac{(n_{P, M}-1)s_{P, M}^2+(n_{P, W}-1)s_{P, W}^2+(n_{D, M}-1)s_{D, M}^2+(n_{D, M}-1)s_{D, M}^2}{(n_{P, M} - 1) + (n_{P, W} - 1) + (n_{D, M} - 1) + (n_{D, M} - 1)}
298297
298+
\Delta_{\Delta} = \Delta_{2} - \Delta_{1}
299+
300+
301+
where :math:`\overline{X}` is the sample mean and :math:`\Delta` is the mean difference.
299302

300303

301-
where :math:`\overline{X}` is the sample mean, :math:`\hat{\theta}` is the mean difference, :math:`s` is the variance and :math:`n` is the sample size.
304+
Specifying Grouping for Comparisons
305+
-----------------------------------
302306

303307

304308
In the example above, we used the convention of "test - control", but you can change the order of the experiment groups as well as the horizontal axis variable by setting ``experiment_label`` and ``x1_level``.
@@ -334,28 +338,29 @@ We produce the following plot:
334338

335339
.. image:: _images/tutorial_108_0.png
336340

337-
We see that the drug had a non-specific effect of -0.321 [95%CI -0.498, -0.131] on wildtype subjects even when they were not sick, and it had a bigger effect of -1.22 [95%CI -1.52, -0.906] in mutant subjects. In this visualisation, we can see the delta-delta value of -0.903 [95%CI -1.21, -0.587] as the net effect of the drug accounting for non-specific actions in healthy individuals.
341+
We see that the drug had a non-specific effect of -0.321 [95%CI -0.498, -0.131] on wild type subjects even when they were not sick, and it had a bigger effect of -1.22 [95%CI -1.52, -0.906] in mutant subjects. In this visualisation, we can see the delta-delta value of -0.903 [95%CI -1.21, -0.587] as the net effect of the drug accounting for non-specific actions in healthy individuals.
338342

339-
.. math::
340-
341-
\hat{\theta}_{W} = \overline{X}_{D, W} - \overline{X}_{P, W}
342343

343-
\hat{\theta}_{W} = \overline{X}_{D, M} - \overline{X}_{P, M}
344+
Mean difference between drug and placebo treatments in wild type subjects is:
344345

345346
.. math::
346347
347-
\hat{\theta}_{\theta} = \hat{\theta}_{M} - \hat{\theta}_{W}
348-
349-
and:
348+
\Delta_{1} = \overline{X}_{D, W} - \overline{X}_{P, W}
349+
350+
Mean difference between drug and placebo treatments in mutant subjects is:
350351

351352
.. math::
352353
353-
s_{\theta} = \frac{(n_{D, W}-1)s_{D, W}^2+(n_{P, W}-1)s_{P, W}^2+(n_{D, M}-1)s_{D, M}^2+(n_{P, M}-1)s_{P, M}^2}{(n_{D, W} - 1) + (n_{P, W} - 1) + (n_{D, M} - 1) + (n_{P, M} - 1)}
354+
\Delta_{2} = \overline{X}_{D, M} - \overline{X}_{P, M}
354355
355356
357+
The net effect of the drug on mutants is:
356358

357-
where :math:`\overline{X}` is the sample mean, :math:`\hat{\theta}` is the mean difference, :math:`s` is the variance and :math:`n` is the sample size.
359+
.. math::
358360
361+
\Delta_{\Delta} = \Delta_{2} - \Delta_{1}
362+
363+
where :math:`\overline{X}` is the sample mean and :math:`\Delta` is the mean difference.
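
As a rough check on these definitions, the deltas can be computed directly from the group means with pandas. The sketch below uses a tiny made-up data frame (the values are illustrative only, not the tutorial's simulated dataset) with the same ``Genotype``, ``Treatment`` and ``Y`` column names. It also shows that grouping by treatment within genotype, or by genotype within treatment, gives the same delta-delta.

```python
import pandas as pd

# Toy data, invented for illustration: two observations per group.
df = pd.DataFrame({
    "Genotype":  ["W", "W", "M", "M", "W", "W", "M", "M"],
    "Treatment": ["Placebo"] * 4 + ["Drug"] * 4,
    "Y":         [3.0, 3.2, 4.0, 4.2, 2.7, 2.9, 2.8, 3.0],
})

# Mean of Y for each (Treatment, Genotype) group.
means = df.groupby(["Treatment", "Genotype"])["Y"].mean()

# Drug minus placebo within each genotype.
delta_1 = means.loc[("Drug", "W")] - means.loc[("Placebo", "W")]
delta_2 = means.loc[("Drug", "M")] - means.loc[("Placebo", "M")]
delta_delta = delta_2 - delta_1

# Alternative grouping: mutant minus wild type within each treatment.
alt_delta_delta = (
    (means.loc[("Drug", "M")] - means.loc[("Drug", "W")])
    - (means.loc[("Placebo", "M")] - means.loc[("Placebo", "W")])
)
```

Both groupings combine the same four group means with the same signs, so ``delta_delta`` and ``alt_delta_delta`` agree up to floating-point rounding; only the primary deltas differ.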
359364

360365

361366
Connection to ANOVA
