Commit 3cbd480 (parent a9ceac6)

ENH add magic reset between merged notebooks

14 files changed: 260 additions & 277 deletions

doc/merge_notebooks.py

Lines changed: 17 additions & 3 deletions
@@ -22,9 +22,12 @@ def merge_notebooks(filenames):
             merged.cells = [cell_with_title(title=os.path.basename(fname))
                             ] + nb.cells
         else:
-            # add a Markdown cell with the file name, then all cells
-            merged.cells.extend(
-                [cell_with_title(title=os.path.basename(fname))] + nb.cells)
+            # add a code cell resetting all variables
+            merged.cells.append(cell_with_reset())
+            # add a Markdown cell with the file name
+            merged.cells.append(cell_with_title(title=os.path.basename(fname)))
+            # add all cells from current notebook
+            merged.cells.extend(nb.cells)
 
     if not hasattr(merged.metadata, 'name'):
         merged.metadata.name = ''
@@ -41,6 +44,17 @@ def cell_with_title(title):
     })
 
 
+def cell_with_reset():
+    """Returns a code cell with magic command to reset all variables."""
+    return nbformat.from_dict({
+        'cell_type': 'code',
+        'execution_count': None,
+        'metadata': {'collapsed': False},
+        'outputs': [],
+        'source': '%reset -f',
+    })
+
+
 if __name__ == '__main__':
     notebooks = sys.argv[1:]
     if not notebooks:
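As a side note, the merge order produced by the new `else` branch can be sketched with plain lists standing in for notebook cells. This is a toy illustration of the commit's behavior (hypothetical helper name, not the actual script): every notebook after the first is now preceded by a `%reset -f` cell, so variables from one merged notebook cannot leak into the next.

```python
def sketch_merge_order(notebooks):
    """Toy sketch of the cell order after this commit: every notebook
    except the first is preceded by a '%reset -f' cell and a title cell."""
    merged = []
    for i, cells in enumerate(notebooks):
        if i == 0:
            merged += [f"<title {i}>"] + cells
        else:
            merged += ["%reset -f", f"<title {i}>"] + cells
    return merged

print(sketch_merge_order([["a1", "a2"], ["b1"]]))
# → ['<title 0>', 'a1', 'a2', '%reset -f', '<title 1>', 'b1']
```

In the real script, the cells are `nbformat` node objects built by `cell_with_title` and the new `cell_with_reset`, and the `%reset -f` magic clears the interactive namespace when the merged notebook is executed.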

tutorials/movies_3T/00_load_colab.py

Lines changed: 3 additions & 0 deletions
@@ -48,5 +48,8 @@
 cortex.utils.db = cortex.database.db
 cortex.dataset.braindata.db = cortex.database.db
 
+import sklearn
+sklearn.set_config(assume_finite=True)
+
 ###############################################################################
 # Your Google Colab environment is now set up for the voxelwise tutorials.
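For context on the two lines added above: `sklearn.set_config(assume_finite=True)` turns off scikit-learn's global NaN/inf input validation, trading safety checks for speed across all subsequent estimator calls. A minimal sketch of the switch, assuming scikit-learn is installed:

```python
import sklearn

# Validation is on by default: estimators check inputs for NaN/inf.
print(sklearn.get_config()["assume_finite"])   # False

# Disable the checks globally; faster fits, but invalid inputs go undetected.
sklearn.set_config(assume_finite=True)
print(sklearn.get_config()["assume_finite"])   # True
```

The same effect can be scoped to a block with `sklearn.config_context(assume_finite=True)` when a global change is undesirable.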

tutorials/movies_3T/01_plot_explainable_variance.py

Lines changed: 6 additions & 6 deletions
@@ -15,14 +15,14 @@
 are the same for each repetition of the stimulus. Thus, encoding models will
 predict only the repeatable stimulus-dependent signal.
 
-The stimulus-dependent signal can be estimated by taking the mean of 
+The stimulus-dependent signal can be estimated by taking the mean of
 brain responses over repeats of the same stimulus or experiment. The variance
 of the estimated stimulus-dependent signal, which we call the explainable
 variance, is proportional to the maximum prediction accuracy that can be
-obtained by a voxelwise encoding model in the test set. 
+obtained by a voxelwise encoding model in the test set.
 
 Mathematically, let :math:`y_i, i = 1 \\dots N` be the measured signal in
-a voxel for each of the :math:`N` repetitions of the same stimulus and 
+a voxel for each of the :math:`N` repetitions of the same stimulus and
 :math:`\\bar{y} = \\frac{1}{N}\\sum_{i=1}^Ny_i` the average brain response
 across repetitions. For each repeat, we define the residual timeseries
 between brain response and average brain response as :math:`r_i = y_i - \\bar{y}`.
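The definitions restated in this hunk (y_bar as the mean over repeats, residuals r_i = y_i - y_bar) can be turned into a simple, bias-uncorrected explainable-variance estimate. The following NumPy sketch is only illustrative; it is not the tutorial's own `explainable_variance` helper, which may apply corrections:

```python
import numpy as np

rng = np.random.default_rng(0)
n_repeats, n_times = 10, 200
signal = rng.standard_normal(n_times)                          # stimulus-locked part
y = signal + 0.5 * rng.standard_normal((n_repeats, n_times))   # y_i, one row per repeat

y_bar = y.mean(axis=0)        # average response across repeats
residuals = y - y_bar         # r_i = y_i - y_bar

# explainable variance: share of total variance not accounted for by residual noise
ev = 1.0 - residuals.var(axis=1).mean() / y.var()
print(round(ev, 2))  # near 1 for a strong repeatable signal, near 0 for pure noise
```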
@@ -114,7 +114,7 @@
 plt.show()
 
 ###############################################################################
-# We see that many voxels have low explainable variance. This is 
+# We see that many voxels have low explainable variance. This is
 # expected, since many voxels are not driven by a visual stimulus, and their
 # response changes over repeats of the same stimulus.
 # We also see that some voxels have high explainable variance (around 0.7). The
@@ -150,8 +150,8 @@
 plt.show()
 
 ###############################################################################
-# This figure is a flattened map of the cortical surface. A number of regions of
-# interest (ROIs) have been labeled to ease interpretation. If you have
+# This figure is a flattened map of the cortical surface. A number of regions
+# of interest (ROIs) have been labeled to ease interpretation. If you have
 # never seen such a flatmap, we recommend taking a look at a `pycortex brain
 # viewer <https://www.gallantlab.org/brainviewer/Deniz2019>`_, which displays
 # the brain in 3D. In this viewer, press "I" to inflate the brain, "F" to

tutorials/movies_3T/02_plot_wordnet_model.py

Lines changed: 50 additions & 52 deletions
@@ -21,9 +21,9 @@
 relashionship based on the `Wordnet <https://wordnet.princeton.edu/>`_ dataset.
 
 *Summary:* We first concatenate the features with multiple temporal delays to
-account for the slow hemodynamic response. We then use linear regression to fit 
-a predictive model of brain activity. The linear regression is regularized to 
-improve robustness to correlated features and to improve generalization 
+account for the slow hemodynamic response. We then use linear regression to fit
+a predictive model of brain activity. The linear regression is regularized to
+improve robustness to correlated features and to improve generalization
 performance. The optimal regularization hyperparameter is selected over a
 grid-search with cross-validation. Finally, the model generalization
 performance is evaluated on a held-out test set, comparing the model
@@ -66,9 +66,9 @@
 ###############################################################################
 # If we repeat an experiment multiple times, part of the fMRI responses might
 # change. However the modeling features do not change over the repeats, so the
-# voxelwise encoding model will predict the same signal for each repeat. To have an
-# upper bound of the model prediction accuracy, we keep only the repeatable part of
-# the signal by averaging the test repeats.
+# voxelwise encoding model will predict the same signal for each repeat. To
+# have an upper bound of the model prediction accuracy, we keep only the
+# repeatable part of the signal by averaging the test repeats.
 Y_test = Y_test.mean(0)
 
 print("(n_samples_test, n_voxels) =", Y_test.shape)
@@ -123,15 +123,15 @@
 #
 # Now, let's define the model pipeline.
 #
-# We first center the features, since we will not use an intercept. The
-# mean value in fMRI recording is non-informative, so each run is detrended and
+# We first center the features, since we will not use an intercept. The mean
+# value in fMRI recording is non-informative, so each run is detrended and
 # demeaned independently, and we do not need to predict an intercept value in
 # the linear model.
 #
 # However, we prefer to avoid normalizing by the standard deviation of each
-# feature. If the features are extracted in a consistent way from the
-# stimulus, their relative scale is meaningful. Normalizing them independently
-# from each other would remove this information. Moreover, the wordnet features are
+# feature. If the features are extracted in a consistent way from the stimulus,
+# their relative scale is meaningful. Normalizing them independently from each
+# other would remove this information. Moreover, the wordnet features are
 # one-hot-encoded, which means that each feature is either present (1) or not
 # present (0) in each sample. Normalizing one-hot-encoded features is not
 # recommended, since it would scale disproportionately the infrequent features.
@@ -141,11 +141,11 @@
 
 ###############################################################################
 # Then we concatenate the features with multiple delays to account for the
-# hemodynamic response. Due to neurovascular coupling, the recorded BOLD signal is
-# delayed in time with respect to the stimulus onset. With different delayed versions
-# of the features, the linear regression model will weigh each delayed feature
-# with a different weight to maximize the predictions. With a sample every 2
-# seconds, we typically use 4 delays [1, 2, 3, 4] to cover the
+# hemodynamic response. Due to neurovascular coupling, the recorded BOLD signal
+# is delayed in time with respect to the stimulus onset. With different delayed
+# versions of the features, the linear regression model will weigh each delayed
+# feature with a different weight to maximize the predictions. With a sample
+# every 2 seconds, we typically use 4 delays [1, 2, 3, 4] to cover the
 # hemodynamic response peak. In the next example, we further describe this
 # hemodynamic response estimation.
 from voxelwise_tutorials.delayer import Delayer
@@ -154,12 +154,13 @@
 ###############################################################################
 # Finally, we use a ridge regression model. Ridge regression is a linear
 # regression with L2 regularization. The L2 regularization improves robustness
-# to correlated features and improves generalization performance. However, the L2
-# regularization is controlled by a hyperparameter ``alpha`` that needs to be
-# tuned for each dataset. This regularization hyperparameter is usually selected over a grid
-# search with cross-validation, selecting the hyperparameter that maximizes the
-# predictive performances on the validation set. More details about
-# cross-validation can be found in the `scikit-learn documentation
+# to correlated features and improves generalization performance. However, the
+# L2 regularization is controlled by a hyperparameter ``alpha`` that needs to
+# be tuned for each dataset. This regularization hyperparameter is usually
+# selected over a grid search with cross-validation, selecting the
+# hyperparameter that maximizes the predictive performances on the validation
+# set. More details about cross-validation can be found in the `scikit-learn
+# documentation
 # <https://scikit-learn.org/stable/modules/cross_validation.html>`_.
 #
 # For computational reasons, when the number of features is larger than the
@@ -177,8 +178,8 @@
 # mean score over targets. Here, we want to find a different optimal
 # hyperparameter per target/voxel, so we use the package `himalaya
 # <https://github.com/gallantlab/himalaya>`_ which implements a
-# ``scikit-learn`` compatible estimator ``KernelRidgeCV``, with
-# hyperparameter selection independently on each target.
+# ``scikit-learn`` compatible estimator ``KernelRidgeCV``, with hyperparameter
+# selection independently on each target.
 from himalaya.kernel_ridge import KernelRidgeCV
 
 ###############################################################################
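What per-target hyperparameter selection amounts to can be sketched without `himalaya`, here using closed-form primal ridge and a single validation split instead of cross-validation. This is an illustrative toy, not the library's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_val, n_features, n_targets = 100, 50, 20, 3
X = rng.standard_normal((n_train + n_val, n_features))
Y = X @ rng.standard_normal((n_features, n_targets))
Y += rng.standard_normal(Y.shape)
X_tr, X_va = X[:n_train], X[n_train:]
Y_tr, Y_va = Y[:n_train], Y[n_train:]

alphas = np.logspace(-2, 4, 7)
val_mse = np.empty((len(alphas), n_targets))
for i, alpha in enumerate(alphas):
    # closed-form ridge solution: (X'X + alpha I)^-1 X'Y
    coef = np.linalg.solve(X_tr.T @ X_tr + alpha * np.eye(n_features),
                           X_tr.T @ Y_tr)
    val_mse[i] = ((X_va @ coef - Y_va) ** 2).mean(axis=0)

# one optimal alpha per target, as KernelRidgeCV does per voxel
best_alphas = alphas[val_mse.argmin(axis=0)]
print(best_alphas.shape)  # (3,)
```

`KernelRidgeCV` solves the same problem in the dual (kernel) formulation, which is cheaper when features outnumber samples, and vectorizes the search over all targets at once.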
@@ -266,11 +267,10 @@
 # Plot the model prediction accuracy
 # ----------------------------------
 #
-# To visualize the model prediction accuracy, we can plot it for each voxel on 
-# a flattened surface of the brain. To do so, we use a mapper that is specific 
-# to the each subject's brain.
-# (Check previous example to see how to use the mapper to Freesurfer average
-# surface.)
+# To visualize the model prediction accuracy, we can plot it for each voxel on
+# a flattened surface of the brain. To do so, we use a mapper that is specific
+# to the each subject's brain. (Check previous example to see how to use the
+# mapper to Freesurfer average surface.)
 import matplotlib.pyplot as plt
 from voxelwise_tutorials.viz import plot_flatmap_from_mapper
 
@@ -280,17 +280,16 @@
280280

281281
###############################################################################
282282
# We can see that the "wordnet" features successfully predict part of the
283-
# measured brain activity, with :math:`R^2` scores as high as 0.4. Note that these
284-
# scores are generalization scores, since they are computed on a test set that
285-
# was not used during model fitting.
286-
# Since we fitted a model independently in each voxel, we can inspect the
287-
# generalization performances at the best available spatial resolution:
288-
# individual voxels.
283+
# measured brain activity, with :math:`R^2` scores as high as 0.4. Note that
284+
# these scores are generalization scores, since they are computed on a test set
285+
# that was not used during model fitting. Since we fitted a model independently
286+
# in each voxel, we can inspect the generalization performances at the best
287+
# available spatial resolution: individual voxels.
289288
#
290-
# The best-predicted voxels are located in visual semantic areas like EBA, or FFA.
291-
# This is expected since the wordnet features encode semantic information about
292-
# the visual stimulus. For more discussions about these results, we refer the
293-
# reader to the original publication [1]_.
289+
# The best-predicted voxels are located in visual semantic areas like EBA, or
290+
# FFA. This is expected since the wordnet features encode semantic information
291+
# about the visual stimulus. For more discussions about these results, we refer
292+
# the reader to the original publication [1]_.
294293

295294
###############################################################################
296295
# Plot the selected hyperparameters
@@ -320,8 +319,9 @@
 # that have more predictive power.
 #
 # Since we know the meaning of each feature, we can interpret the large
-# regression coefficients. In the case of wordnet features, we can even build
-# a graph that represents the features that are linked by a semantic relationship.
+# regression coefficients. In the case of wordnet features, we can even build a
+# graph that represents the features that are linked by a semantic
+# relationship.
 
 ###############################################################################
 # We first get the (primal) ridge regression coefficients from the fitted
@@ -367,11 +367,11 @@
 
 ###############################################################################
 # Similarly to [1]_, we correct the coefficients of features linked by a
-# semantic relationship. When building the wordnet features, if a frame was labeled
-# with `wolf`, the authors automatically added the semantically linked categories
-# `canine`, `carnivore`, `placental mammal`, `mamma`, `vertebrate`, `chordate`,
-# `organism`, and `whole`. The authors thus argue that the same correction
-# needs to be done on the coefficients.
+# semantic relationship. When building the wordnet features, if a frame was
+# labeled with `wolf`, the authors automatically added the semantically linked
+# categories `canine`, `carnivore`, `placental mammal`, `mamma`, `vertebrate`,
+# `chordate`, `organism`, and `whole`. The authors thus argue that the same
+# correction needs to be done on the coefficients.
 
 from voxelwise_tutorials.wordnet import load_wordnet
 from voxelwise_tutorials.wordnet import correct_coefficients
@@ -406,10 +406,10 @@
 #
 # In this example, because we use only a single subject and we perform a
 # different voxel selection, our result is slightly different than in [1]_. We
-# also use a different regularization parameter in each voxel, while in [1]_ 
-# all voxels had the same regularization parameter.
-# We do not aim at reproducing exactly the results in [1]_,
-# but we rather describe the general approach.
+# also use a different regularization parameter in each voxel, while in [1]_
+# all voxels had the same regularization parameter. We do not aim at
+# reproducing exactly the results in [1]_, but we rather describe the general
+# approach.
 
 ###############################################################################
 # To project the principal component on the cortical surface, we first need to
@@ -492,5 +492,3 @@
 #
 # .. [2] Saunders, C., Gammerman, A., & Vovk, V. (1998).
 #        Ridge regression learning algorithm in dual variables.
-
-del pipeline, kernel_ridge_cv

tutorials/movies_3T/03_plot_hemodynamic_response.py

Lines changed: 20 additions & 21 deletions
@@ -8,16 +8,17 @@
 example, and further describe the need to delay the features in time to account
 for the delayed BOLD response.
 
-Because of the temporal dynamics of neurovascular coupling, the recorded BOLD signal is delayed in
-time with respect to the stimulus. To account for this lag, we fit encoding
-models on delayed features. In this way, the linear regression model weighs
-each delayed feature separately and recovers the shape of the hemodynamic
-response function in each voxel separately. In turn, this method (also known as
-a Finite Impulse Response model, or FIR) maximizes the model prediction
-accuracy. With a repetition time of 2 seconds, we typically use 4 delays [1, 2,
-3, 4] to cover the peak of the the hemodynamic response function. However, the
-optimal number of delays can vary depending on the experiment and the brain
-area of interest, so you should experiment with different delays.
+Because of the temporal dynamics of neurovascular coupling, the recorded BOLD
+signal is delayed in time with respect to the stimulus. To account for this
+lag, we fit encoding models on delayed features. In this way, the linear
+regression model weighs each delayed feature separately and recovers the shape
+of the hemodynamic response function in each voxel separately. In turn, this
+method (also known as a Finite Impulse Response model, or FIR) maximizes the
+model prediction accuracy. With a repetition time of 2 seconds, we typically
+use 4 delays [1, 2, 3, 4] to cover the peak of the the hemodynamic response
+function. However, the optimal number of delays can vary depending on the
+experiment and the brain area of interest, so you should experiment with
+different delays.
 
 In this example, we show that a model without delays performs far worse than a
 model with delays. We also show how to visualize the estimated hemodynamic
@@ -160,20 +161,20 @@
 ###############################################################################
 # We fit and score the model as the previous one.
 pipeline_no_delay.fit(X_train, Y_train)
-scores_nodelay = pipeline_no_delay.score(X_test, Y_test)
-scores_nodelay = backend.to_numpy(scores_nodelay)
-print("(n_voxels,) =", scores_nodelay.shape)
+scores_no_delay = pipeline_no_delay.score(X_test, Y_test)
+scores_no_delay = backend.to_numpy(scores_no_delay)
+print("(n_voxels,) =", scores_no_delay.shape)
 
 ###############################################################################
-# Then, we plot the comparison of model prediction accuracies with a 2D histogram.
-# All ~70k voxels are represented in this histogram, where the diagonal
-# corresponds to identical prediction accuracy for both models. A distibution deviating
-# from the diagonal means that one model has better prediction accuracy
-# than the other.
+# Then, we plot the comparison of model prediction accuracies with a 2D
+# histogram. All ~70k voxels are represented in this histogram, where the
+# diagonal corresponds to identical prediction accuracy for both models. A
+# distibution deviating from the diagonal means that one model has better
+# prediction accuracy than the other.
 import matplotlib.pyplot as plt
 from voxelwise_tutorials.viz import plot_hist2d
 
-ax = plot_hist2d(scores_nodelay, scores)
+ax = plot_hist2d(scores_no_delay, scores)
 ax.set(
     title='Generalization R2 scores',
     xlabel='model without delays',
@@ -251,5 +252,3 @@
 # We see that the hemodynamic response function (HRF) is captured in the model
 # weights. Note that in this dataset, the brain responses are recorded every
 # two seconds.
-
-del pipeline, pipeline_no_delay
