Commit 6220c2b

Merge pull request #157 from ACCLAB/refactoring_final_phase
Refactoring final phase
2 parents 65660da + 9450252 commit 6220c2b

11 files changed

Lines changed: 230 additions & 322 deletions

nbs/01-getting_started.ipynb

Lines changed: 14 additions & 19 deletions
@@ -27,16 +27,16 @@
2727
"source": [
2828
"\n",
2929
"\n",
30-
"Python 3.8 is strongly recommended. DABEST has also been tested with Python 3.6 and 3.7.\n",
30+
"Python 3.10 is strongly recommended. DABEST has also been tested with Python 3.6, 3.7 and 3.8.\n",
3131
"\n",
3232
"In addition, the following packages are also required (listed with their minimal versions):\n",
3333
"\n",
34-
"* [numpy 1.22.3](https://www.numpy.org)\n",
34+
"* [numpy 1.22.4](https://www.numpy.org)\n",
3535
"* [scipy 1.9.3](https://www.scipy.org)\n",
36-
"* [matplotlib 3.5.1](https://www.matplotlib.org)\n",
37-
"* [pandas 1.5.0](https://pandas.pydata.org)\n",
36+
"* [matplotlib 3.6.3](https://www.matplotlib.org)\n",
37+
"* [pandas 1.5.3](https://pandas.pydata.org)\n",
3838
"* [seaborn 0.12.2](https://seaborn.pydata.org)\n",
39-
"* [lqrt 0.3](https://github.com/alyakin314/lqrt)\n",
39+
"* [lqrt 0.3.3](https://github.com/alyakin314/lqrt)\n",
4040
"\n",
4141
"To obtain these package dependencies easily, it is highly recommended to download the [Anaconda](https://www.continuum.io/downloads) distribution of Python.\n"
4242
]
@@ -58,8 +58,10 @@
5858
"\n",
5959
"At the command line, run\n",
6060
"\n",
61+
"``` shell\n",
62+
"$ pip install dabest\n",
63+
"```\n",
6164
"\n",
62-
"**$ pip install dabest**\n",
6365
"\n"
6466
]
6567
},
@@ -70,11 +72,12 @@
7072
"source": [
7173
"2. Using Github\n",
7274
"\n",
73-
"Clone the [DABEST-python repo](https://github.com/ACCLAB/DABEST-python) locally (see instructions [here] (https://help.github.com/articles/cloning-a-repository/).\n",
75+
"Clone the [DABEST-python repo](https://github.com/ACCLAB/DABEST-python) locally (see instructions [here](https://help.github.com/articles/cloning-a-repository/)).\n",
7476
"\n",
7577
"Then, navigate to the cloned repo in the command line and run\n",
76-
"\n",
77-
"**$ pip install**"
78+
"```\n",
79+
"$ pip install .\n",
80+
"```"
7881
]
7982
},
8083
{
@@ -90,9 +93,9 @@
9093
"id": "a9f8cb3e",
9194
"metadata": {},
9295
"source": [
93-
"To test DABEST, you will need to install [pytest](https://docs.pytest.org/en/latest/).\n",
96+
"To test DABEST, you will need to install [pytest](https://docs.pytest.org/en/latest/). \n",
9497
"\n",
95-
"Run ``pytest`` in the root directory of the source distribution. This runs the test suite in ``dabest/tests`` folder. The test suite will ensure that the bootstrapping functions and the plotting functions perform as expected.\n",
98+
"Run ``pytest`` in the root directory of the source distribution. This runs the test suite in ``dabest/tests`` folder including also the image-based tests of the ``mpl_image_tests`` sub folder. The test suite will ensure that the bootstrapping functions and the plotting functions perform as expected.\n",
9699
"\n"
97100
]
98101
},
@@ -127,14 +130,6 @@
127130
"source": [
128131
"All contributions are welcome. Please fork the [Github repo](https://github.com/ACCLAB/DABEST-python/) and open a pull request.\n"
129132
]
130-
},
131-
{
132-
"cell_type": "code",
133-
"execution_count": null,
134-
"id": "23a7b823",
135-
"metadata": {},
136-
"outputs": [],
137-
"source": []
138133
}
139134
],
140135
"metadata": {

nbs/02-about.ipynb

Lines changed: 3 additions & 1 deletion
@@ -66,6 +66,7 @@
6666
"\n",
6767
" * Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.\n",
6868
"\n",
69+
"<div style=\"padding: 15px; border: 1px solid transparent; border-color: transparent; margin-bottom: 20px; border-radius: 4px; color: #8a6d3b;; background-color: #fcf8e3; border-color: #faebcc;\">\n",
6970
"NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY\n",
7071
"THIS LICENSE. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND\n",
7172
"CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n",
@@ -77,7 +78,8 @@
7778
"BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER\n",
7879
"IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)\n",
7980
"ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE\n",
80-
"POSSIBILITY OF SUCH DAMAGE.\n"
81+
"POSSIBILITY OF SUCH DAMAGE.\n",
82+
"</div>\n"
8183
]
8284
},
8385
{

nbs/blog/posts/bootstraps/bootstraps.ipynb

Lines changed: 14 additions & 28 deletions
@@ -7,7 +7,7 @@
77
"source": [
88
"# Bootstrap Confidence Intervals\n",
99
"\n",
10-
"> Explaination of the bootstrap method and its application in hypothesis testing using DABEST.\n",
10+
"> Explanation of the bootstrap method and its application in hypothesis testing using **DABEST**.\n",
1111
"\n",
1212
"- order: 3"
1313
]
@@ -17,7 +17,7 @@
1717
"id": "6321ea6f",
1818
"metadata": {},
1919
"source": [
20-
"## Sampling from Populations"
20+
"## Sampling from populations"
2121
]
2222
},
2323
{
@@ -27,7 +27,7 @@
2727
"source": [
2828
"In a typical scientific experiment, we are interested in two populations\n",
2929
"(Control and Test), and whether there is a difference between their means\n",
30-
"$(\\mu_{Test}-\\mu_{Control})$\n"
30+
"$(\\mu_{Test}-\\mu_{Control})$.\n"
3131
]
3232
},
3333
{
@@ -43,7 +43,7 @@
4343
"id": "5573045c",
4444
"metadata": {},
4545
"source": [
46-
"We go about this by collecting observations from the control population, and from the test population."
46+
"We go about this by collecting observations from the control population and from the test population."
4747
]
4848
},
4949
{
@@ -62,7 +62,7 @@
6262
"We can easily compute the mean difference in our observed samples. This is our\n",
6363
"estimate of the population effect size that we are interested in.\n",
6464
"\n",
65-
"**But how do we obtain a measure of precision and confidence about our estimate?\n",
65+
"**But how do we obtain a measure of the precision and confidence about our estimate?\n",
6666
"Can we get a sense of how it relates to the population mean difference?**\n"
6767
]
6868
},
@@ -79,11 +79,11 @@
7979
"id": "fe977cc6",
8080
"metadata": {},
8181
"source": [
82-
"We want to obtain a 95% confidence interval (95% CI) around the our estimate of the mean difference. The 95% indicates that any such confidence interval will capture the population mean difference 95% of the time.\n",
82+
"We want to obtain a 95% confidence interval (95% CI) around our estimate of the mean difference. The 95% indicates that any such confidence interval will capture the population mean difference 95% of the time.\n",
8383
"\n",
84-
"In other words, if we repeated our experiment 100 times, gathering 100 independent sets of observations, and computing a 95% confidence interval for the mean difference each time, 95 of these intervals would capture the population mean difference. That is to say, we can be 95% confident the interval contains the true mean of the population.\n",
84+
"In other words, if we were to repeat our experiment 100 times, gathering 100 independent sets of observations and computing a 95% confidence interval for the mean difference each time, 95 of these intervals would capture the population mean difference. That is to say, we can be 95% confident the interval contains the true mean of the population.\n",
8585
"\n",
86-
"We can calculate the 95% CI of the mean difference with [bootstrap resampling](https://en.wikipedia.org/wiki/Bootstrapping_(statistics))\n"
86+
"We can calculate the 95% CI of the mean difference with [bootstrap resampling](https://en.wikipedia.org/wiki/Bootstrapping_(statistics)).\n"
8787
]
8888
},
8989
{
@@ -99,7 +99,7 @@
9999
"id": "0685adaf",
100100
"metadata": {},
101101
"source": [
102-
"The [`bootstrap`](#1)[1] is a simple but powerful technique. It was [first described] (https://projecteuclid.org/euclid.aos/1176344552) by [Bradley Efron](https://statistics.stanford.edu/people/bradley-efron).\n",
102+
"The [`bootstrap`](#1)[1] is a simple but powerful technique. It was [first described](https://projecteuclid.org/euclid.aos/1176344552) by [Bradley Efron](https://statistics.stanford.edu/people/bradley-efron).\n",
103103
"\n",
104104
"It creates multiple *resamples* (with replacement) from a single set of\n",
105105
"observations, and computes the effect size of interest on each of these\n",
@@ -134,11 +134,7 @@
134134
"the Central Limit Theorem, the resampling distribution of the effect size will\n",
135135
"approach a normality.\n",
136136
"\n",
137-
"2. *Easy construction of the 95% CI from the resampling distribution.* For 1000\n",
138-
"bootstrap resamples of the mean difference, one can use the 25th value and the\n",
139-
"975th value of the ranked differences as boundaries of the 95% confidence\n",
140-
"interval. (This captures the central 95% of the distribution.) Such an interval\n",
141-
"construction is known as a *percentile interval*."
137+
"2. *Easy construction of the 95% CI from the resampling distribution.* In the context of bootstrap resampling or other non-parametric methods, the 2.5th and 97.5th percentiles are often used to define the lower and upper limits, respectively. The use of these percentiles ensures that the resulting interval contains the central 95% of the resampled distribution. Such an interval construction is known as a *percentile interval*."
142138
]
143139
},
144140
{
@@ -156,12 +152,10 @@
156152
"source": [
157153
"While resampling distributions of the difference in means often have a normal\n",
158154
"distribution, it is not uncommon to encounter a skewed distribution. Thus, Efron\n",
159-
"developed the [bias-corrected and accelerated bootstrap]\n",
160-
"(https://en.wikipedia.org/wiki/Bootstrapping_(statistics)#History) (BCa\n",
161-
"bootstrap) to account for the skew, and still obtain the central 95% of the\n",
155+
"developed the [bias-corrected and accelerated bootstrap](https://en.wikipedia.org/wiki/Bootstrapping_(statistics)#History) (BCa bootstrap) to account for the skew, and still obtain the central 95% of the\n",
162156
"distribution.\n",
163157
"\n",
164-
"DABEST applies the BCa correction to the resampling bootstrap distributions of\n",
158+
"**DABEST** applies the BCa correction to the resampling bootstrap distributions of\n",
165159
"the effect size."
166160
]
167161
},
@@ -186,7 +180,7 @@
186180
"id": "fb1a8fa6",
187181
"metadata": {},
188182
"source": [
189-
"The estimation plot produced by DABEST presents the rawdata and the bootstrap\n",
183+
"The estimation plot produced by DABEST presents the raw data and the bootstrap\n",
190184
"confidence interval of the effect size (the difference in means) side-by-side as\n",
191185
"a single integrated plot."
192186
]
@@ -204,7 +198,7 @@
204198
"id": "eaad7dd5",
205199
"metadata": {},
206200
"source": [
207-
"It thus tightly couples visual presentation of the raw data with an indication of the population mean difference, and its confidence interval."
201+
"Thus, it tightly couples a visual presentation of the raw data with an indication of the population mean difference plus its confidence interval."
208202
]
209203
},
210204
{
@@ -215,14 +209,6 @@
215209
"<a id='1'></a>\n",
216210
"`[1]`: The name is derived from the saying \"[pull oneself by one's bootstraps](https://en.wiktionary.org/wiki/pull_oneself_up_by_one%27s_bootstraps)\", often used as an exhortation to achieve success without external help.\n"
217211
]
218-
},
219-
{
220-
"cell_type": "code",
221-
"execution_count": null,
222-
"id": "87e5611b",
223-
"metadata": {},
224-
"outputs": [],
225-
"source": []
226212
}
227213
],
228214
"metadata": {
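
Note on the percentile interval described in the bootstraps.ipynb change above: the 2.5th and 97.5th percentiles of the resampled mean differences bound the central 95% of the resampling distribution. A minimal sketch of that construction follows, assuming NumPy is available. The function name `percentile_ci`, its parameters, and the simulated data are hypothetical illustrations; this is not DABEST's API, and it computes the plain percentile interval rather than the BCa-corrected interval that DABEST applies.

```python
import numpy as np


def percentile_ci(control, test, n_resamples=5000, alpha=0.05, seed=12345):
    """Percentile bootstrap CI for mean(test) - mean(control).

    Hypothetical helper for illustration only; not part of DABEST.
    """
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    test = np.asarray(test, dtype=float)

    # Draw resample indices with replacement, keeping each group's size.
    c_idx = rng.integers(0, control.size, size=(n_resamples, control.size))
    t_idx = rng.integers(0, test.size, size=(n_resamples, test.size))

    # One mean difference per bootstrap resample.
    diffs = test[t_idx].mean(axis=1) - control[c_idx].mean(axis=1)

    # The alpha/2 and 1 - alpha/2 percentiles bound the central interval.
    lower, upper = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper


# Simulated observations (assumed values, for demonstration only).
data_rng = np.random.default_rng(0)
control = data_rng.normal(loc=3.0, scale=0.4, size=30)
test = data_rng.normal(loc=3.5, scale=0.4, size=30)

low, high = percentile_ci(control, test)
observed = test.mean() - control.mean()
print(f"mean difference = {observed:.3f}, 95% CI [{low:.3f}, {high:.3f}]")
```

With `alpha=0.05` the bounds are the 2.5th and 97.5th percentiles, matching the wording in the updated notebook cell. When the resampling distribution is skewed, the BCa correction mentioned in the same notebook adjusts which percentiles are read off; that adjustment is not shown here.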

nbs/blog/posts/robust-beautiful/robust-beautiful.ipynb

Lines changed: 18 additions & 23 deletions
@@ -71,11 +71,11 @@
7171
"the same when visualized with a barplot on the right panel. (You can\n",
7272
"download the [dataset](_static/four_samples.csv) to see for yourself.)\n",
7373
"\n",
74-
"We're not the first ones (see\n",
75-
"[this](https://www.nature.com/articles/nmeth.2837),\n",
76-
"[this](http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128),\n",
74+
"We're not the first ones (see these articles:\n",
75+
"[article 1](https://www.nature.com/articles/nmeth.2837),\n",
76+
"[article 2](http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128),\n",
7777
"or\n",
78-
"[that](https://onlinelibrary.wiley.com/doi/full/10.1111/ejn.13400))\n",
78+
"[article 3](https://onlinelibrary.wiley.com/doi/full/10.1111/ejn.13400))\n",
7979
"to point out the barplot's fatal flaws. Indeed, it is both sobering and\n",
8080
"fascinating to realise that the barplot is a [17th century\n",
8181
"invention](https://en.wikipedia.org/wiki/Bar_chart#History) initially\n",
@@ -118,7 +118,7 @@
118118
"The figure above visualizes the same four samples as a swarmplot (left\n",
119119
"panel) and as a boxplot. If we did not label the x-axis with the sample\n",
120120
"size, it would be impossible to definitively distinguish the sample with\n",
121-
"5 obesrvations from the sample with 50.\n",
121+
"5 observations from the sample with 50.\n",
122122
"\n",
123123
"Even if the world gets rid of barplots and boxplots, the problems\n",
124124
"plaguing statistical practices will remain unsolved. Null-hypothesis\n",
@@ -148,24 +148,21 @@
148148
"id": "a7e3b1ad",
149149
"metadata": {},
150150
"source": [
151-
"hown above is a Gardner-Altman estimation plot. (The plot draws its name from\n",
152-
"[Martin J. Gardner]\n",
153-
"(https://www.independent.co.uk/news/people/obituary-professor-martin-gardner-1470261.html)\n",
151+
"This is a *Gardner-Altman* estimation plot. The plot draws its name from\n",
152+
"[Martin J. Gardner](https://www.independent.co.uk/news/people/obituary-professor-martin-gardner-1470261.html)\n",
154153
"and [Douglas Altman](https://www.bmj.com/content/361/bmj.k2588), who are\n",
155-
"credited with [creating the design]\n",
156-
"(https://www.bmj.com/content/bmj/292/6522/746.full.pdf) in 1986).\n",
154+
"credited with [creating the design](https://www.bmj.com/content/bmj/292/6522/746.full.pdf) in 1986.\n",
157155
"\n",
158156
"This plot has two key features:\n",
159157
"\n",
160-
" 1. It presents all datapoints as a *swarmplot*, which orders each point to\n",
161-
" display the underlying distribution.\n",
158+
" 1. It presents all data points as a swarmplot, ordering each point to display the underlying distribution.\n",
162159
"\n",
163160
" 2. It presents the effect size as a *bootstrap 95% confidence interval* (95% CI)\n",
164-
" on a separate but aligned axes. where the effect size is displayed to the right\n",
165-
" of the war data, and the mean of the test group is aligned with the effect size.\n",
161+
" on a separate but aligned axis. The effect size is displayed to the right of the raw data, and the mean of the test group is aligned with the effect size.\"\n",
162+
"\n",
163+
"<div style=\"padding: 15px; border: 1px solid transparent; border-color: transparent; margin-bottom: 20px; border-radius: 4px; color: #31708f; background-color: #d9edf7; border-color: #bce8f1;\"> Thus, estimation plots are robust, beautiful, and convey important statistical\n",
164+
"information elegantly and efficiently. </div>\n",
166165
"\n",
167-
"*Thus, estimation plots are robust, beautiful, and convey important statistical\n",
168-
"information elegantly and efficiently.*\n",
169166
"\n",
170167
"An estimation plot obtains and displays the 95% CI through nonparametric\n",
171168
"bootstrap resampling. This enables visualization of the confidence interval as\n",
@@ -283,13 +280,11 @@
283280
"id": "b7b643f8",
284281
"metadata": {},
285282
"source": [
286-
"For comparisons between 3 or more groups that typically employ analysis\n",
283+
"For comparisons between three or more groups that typically employ analysis\n",
287284
"of variance (ANOVA) methods, one can use the [Cumming estimation\n",
288285
"plot](https://en.wikipedia.org/wiki/Estimation_statistics#Cumming_plot),\n",
289-
"named after [Geoff\n",
290-
"Cumming](https://www.youtube.com/watch?v=nDN-hcKR7j8), and draws its\n",
291-
"design heavily from his 2012 textbook [Understanding the New\n",
292-
"Statistics](https://www.routledge.com/Understanding-The-New-Statistics-Effect-Sizes-Confidence-Intervals-and/Cumming/p/book/9780415879682).\n",
286+
"named after [Geoff Cumming](https://www.youtube.com/watch?v=nDN-hcKR7j8), and draws its\n",
287+
"design heavily from his 2012 textbook [\"Understanding the New Statistics\"](https://www.routledge.com/Understanding-The-New-Statistics-Effect-Sizes-Confidence-Intervals-and/Cumming/p/book/9780415879682).\n",
293288
"This estimation plot design can be considered a variant of the\n",
294289
"Gardner-Altman plot.\n"
295290
]
@@ -307,8 +302,8 @@
307302
"id": "b443b0a8",
308303
"metadata": {},
309304
"source": [
310-
"The effect size and 95% CIs are still plotted a separate axes, but\n",
311-
"unlike the Gardner-Altman plot, this axes is positioned beneath the raw\n",
305+
"The effect size and 95% CIs are still plotted on a separate axis, but\n",
306+
"unlike the Gardner-Altman plot, this axis is positioned beneath the raw\n",
312307
"data.\n",
313308
"\n",
314309
"Such a design frees up visual space in the upper panel, allowing the\n",
