|
3 | 3 | Fit a ridge model with motion energy features |
4 | 4 | ============================================= |
5 | 5 |
|
6 | | -In this second example, we model the fMRI responses with motion-energy features |
7 | | -extracted from the movie stimulus. The model is still a regularized linear |
8 | | -regression model. |
| 6 | +In this example, we model the fMRI responses with motion-energy features |
| 7 | +extracted from the movie stimulus. The model is a regularized linear regression |
| 8 | +model. |
9 | 9 |
|
10 | 10 | This tutorial reproduces part of the analysis described in Nishimoto et al.
11 | 11 | (2011) [1]_. See this publication for more details about the experiment and
12 | 12 | the motion-energy features, as well as additional results and discussion.
13 | 13 |
|
14 | | -Motion-energy features result from filtering a video stimulus with |
15 | | -spatio-temporal Gabor filters. A pyramid of filters is used to compute the |
16 | | -motion-energy features at multiple spatial and temporal scales. |
17 | | -
|
18 | | -As in the previous example, we first concatenate the features with multiple |
19 | | -delays, to account for the hemodynamic response. A linear regression model |
20 | | -then weights each delayed feature with a different weight, to build a |
21 | | -predictive model of BOLD activity. |
22 | | -Again, the linear regression is regularized to improve robustness to correlated |
23 | | -features and to improve generalization. The optimal regularization |
24 | | -hyperparameter is selected over a grid-search with cross-validation. |
25 | | -Finally, the model generalization performance is evaluated on a held-out test |
26 | | -set, comparing the model predictions with the corresponding ground-truth fMRI |
27 | | -responses. |
28 | | -
|
29 | | -The ridge model is fitted with the package |
30 | | -`himalaya <https://github.com/gallantlab/himalaya>`_. |
| 14 | +*Motion-energy features:* Motion-energy features result from filtering a video |
| 15 | +stimulus with spatio-temporal Gabor filters. A pyramid of filters is used to |
| 16 | +compute the motion-energy features at multiple spatial and temporal scales. |
| 17 | +
|
| 18 | +*Summary:* As in the previous example, we first concatenate the features with |
| 19 | +multiple delays, to account for the slow hemodynamic response. A linear |
| 20 | +regression model then weights each delayed feature with a different weight, to |
| 21 | +build a predictive model of BOLD activity. Again, the linear regression is |
| 22 | +regularized to improve robustness to correlated features and to improve |
| 23 | +generalization. The optimal regularization hyperparameter is selected |
| 24 | +independently for each voxel over a grid-search with cross-validation. Finally,
| 25 | +the model generalization performance is evaluated on a held-out test set, |
| 26 | +comparing the model predictions with the ground-truth fMRI responses. |
31 | 27 | """ |
32 | 28 | ############################################################################### |
33 | | - |
34 | | -# path of the data directory |
| 29 | +# Path of the data directory |
35 | 30 | import os |
36 | 31 | from voxelwise_tutorials.io import get_data_home |
37 | 32 | directory = os.path.join(get_data_home(), "vim-4") |
38 | 33 | print(directory) |
39 | 34 |
|
| 35 | +############################################################################### |
| 36 | + |
40 | 37 | # Modify to use another subject
41 | 38 | subject = "S01" |
42 | 39 |
|
|
52 | 49 | file_name = os.path.join(directory, "responses", f"{subject}_responses.hdf") |
53 | 50 | Y_train = load_hdf5_array(file_name, key="Y_train") |
54 | 51 | Y_test = load_hdf5_array(file_name, key="Y_test") |
55 | | -run_onsets = load_hdf5_array(file_name, key="run_onsets") |
56 | 52 |
|
57 | | -# We average the test repeats, since we cannot model the non-repeatable part of |
58 | | -# fMRI responses. It means that the prediction :math:`R^2` scores will be |
59 | | -# relative to the explainable variance. |
| 53 | +print("(n_samples_train, n_voxels) =", Y_train.shape) |
| 54 | +print("(n_repeats, n_samples_test, n_voxels) =", Y_test.shape) |
| 55 | + |
| 56 | +############################################################################### |
| 57 | +# We average the test repeats, to reduce the non-repeatable part of fMRI
| 58 | +# responses.
60 | 59 | Y_test = Y_test.mean(0) |
61 | 60 |
|
62 | | -# We remove NaN values present on non-cortical voxels. |
| 61 | +print("(n_samples_test, n_voxels) =", Y_test.shape) |
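On a toy array (shapes made up for illustration), averaging over the first axis collapses the repeats dimension exactly as `Y_test.mean(0)` does above:

```python
import numpy as np

# toy example: 10 repeats, 5 time samples, 3 voxels (illustrative shapes)
toy = np.random.randn(10, 5, 3)
averaged = toy.mean(0)  # average over the repeats axis
print(averaged.shape)  # (5, 3)
```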
| 62 | + |
| 63 | +############################################################################### |
| 64 | +# We fill potential NaN (not-a-number) values with zeros. |
63 | 65 | Y_train = np.nan_to_num(Y_train) |
64 | 66 | Y_test = np.nan_to_num(Y_test) |
65 | 67 |
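`np.nan_to_num` replaces each NaN with zero (and infinities with large finite values), leaving finite entries untouched:

```python
import numpy as np

arr = np.array([1.0, np.nan, 3.0])
cleaned = np.nan_to_num(arr)
print(cleaned)  # [1. 0. 3.] -- the NaN is replaced by zero
```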
|
66 | 68 | ############################################################################### |
67 | | -# Then we load the "motion-energy" features, that we will |
68 | | -# use for the linear regression model. |
| 69 | +# Then we load the precomputed "motion-energy" features. |
69 | 70 |
|
70 | 71 | feature_space = "motion_energy" |
71 | 72 | file_name = os.path.join(directory, "features", f"{feature_space}.hdf") |
72 | 73 | X_train = load_hdf5_array(file_name, key="X_train") |
73 | 74 | X_test = load_hdf5_array(file_name, key="X_test") |
74 | 75 |
|
75 | | -# We use single precision float to speed up model fitting on GPU. |
76 | | -X_train = X_train.astype("float32") |
77 | | -X_test = X_test.astype("float32") |
| 76 | +print("(n_samples_train, n_features) =", X_train.shape) |
| 77 | +print("(n_samples_test, n_features) =", X_test.shape) |
78 | 78 |
|
79 | 79 | ############################################################################### |
80 | 80 | # Define the cross-validation scheme |
|
88 | 88 |
|
89 | 89 | # index of the first sample of each run
90 | 90 | run_onsets = load_hdf5_array(file_name, key="run_onsets") |
| 91 | +print(run_onsets) |
91 | 92 |
|
92 | | -# define a cross-validation splitter, compatible with ``scikit-learn``` API |
| 93 | +############################################################################### |
| 94 | +# We define a cross-validation splitter, compatible with ``scikit-learn`` API. |
93 | 95 | n_samples_train = X_train.shape[0] |
94 | 96 | cv = generate_leave_one_run_out(n_samples_train, run_onsets) |
95 | 97 | cv = check_cv(cv) # copy the cross-validation splitter into a reusable list |
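To make the splitting scheme concrete, here is a simplified reimplementation of a leave-one-run-out splitter (a sketch for intuition only, not the actual code of `generate_leave_one_run_out`):

```python
import numpy as np

def leave_one_run_out(n_samples, run_onsets):
    """Yield (train, val) index arrays, holding out one run at a time.
    Simplified sketch of a leave-one-run-out cross-validation splitter."""
    onsets = np.append(run_onsets, n_samples)  # add the end boundary
    for ii in range(len(run_onsets)):
        val = np.arange(onsets[ii], onsets[ii + 1])
        train = np.setdiff1d(np.arange(n_samples), val)
        yield train, val

# toy example: 10 samples split into 3 runs of unequal length
splits = list(leave_one_run_out(10, [0, 4, 7]))
for train, val in splits:
    print(val)
```

Keeping whole runs together in each validation fold avoids leaking the strong temporal autocorrelation of fMRI responses between train and validation sets.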
|
108 | 110 | from himalaya.backend import set_backend |
109 | 111 | backend = set_backend("torch_cuda", on_error="warn") |
110 | 112 |
|
| 113 | +X_train = X_train.astype("float32") |
| 114 | +X_test = X_test.astype("float32") |
| 115 | + |
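The pipeline concatenates copies of the features shifted by multiple delays to account for the hemodynamic response. A minimal numpy sketch of such a Delayer-style transform, assuming zero-padding at the start (function name made up for this sketch):

```python
import numpy as np

def make_delayed(X, delays):
    """Concatenate copies of X shifted by each delay (in samples),
    zero-padding at the start. Sketch of a Delayer-style transform."""
    n_samples, n_features = X.shape
    out = np.zeros((n_samples, n_features * len(delays)), dtype=X.dtype)
    for ii, delay in enumerate(delays):
        out[delay:, ii * n_features:(ii + 1) * n_features] = X[:n_samples - delay]
    return out

X_toy = np.arange(12, dtype="float32").reshape(6, 2)
X_delayed = make_delayed(X_toy, delays=[1, 2, 3, 4])
print(X_delayed.shape)  # (6, 8): n_features is multiplied by the number of delays
```

The linear model then learns a separate weight for each delayed copy, effectively fitting a voxel-specific hemodynamic response per feature.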
111 | 116 | alphas = np.logspace(1, 20, 20) |
112 | 117 |
|
113 | 118 | pipeline_motion_energy = make_pipeline( |
|
135 | 140 | scores_motion_energy = pipeline_motion_energy.score(X_test, Y_test) |
136 | 141 | scores_motion_energy = backend.to_numpy(scores_motion_energy) |
137 | 142 |
|
| 143 | +print("(n_voxels,) =", scores_motion_energy.shape) |
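The `score` method returns one generalization score per voxel; conceptually, an :math:`R^2` computed independently per column, as in this sketch (assuming that is the scorer used, which the tutorial text suggests):

```python
import numpy as np

def r2_per_voxel(Y_true, Y_pred):
    """Coefficient of determination, computed independently for each
    column (voxel). Illustrative sketch of a per-voxel R^2 score."""
    ss_res = ((Y_true - Y_pred) ** 2).sum(0)          # residual sum of squares
    ss_tot = ((Y_true - Y_true.mean(0)) ** 2).sum(0)  # total sum of squares
    return 1 - ss_res / ss_tot

rng = np.random.RandomState(0)
Y_true = rng.randn(20, 3)
print(r2_per_voxel(Y_true, Y_true))  # perfect predictions give [1. 1. 1.]
```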
| 144 | + |
138 | 145 | ############################################################################### |
139 | 146 | # Plot the model performances |
140 | 147 | # --------------------------- |
|
150 | 157 |
|
151 | 158 | ############################################################################### |
152 | 159 | # The motion-energy features lead to large generalization scores in the |
153 | | -# early visual cortex (V1, V2? V3, ...). For more discussions about these |
| 160 | +# early visual cortex (V1, V2, V3, ...). For more discussions about these |
154 | 161 | # results, we refer the reader to the original publication [1]_. |
155 | 162 |
|
156 | 163 | ############################################################################### |
157 | 164 | # Compare with the wordnet model |
158 | 165 | # ------------------------------ |
159 | 166 | # |
160 | | -# It is interesting to compare the performances of this motion-energy model |
161 | | -# to the performances of the wordnet model fitted in the previous example. |
162 | | -# To compare them, we first need to fit again the semantic wordnet model. |
| 167 | +# Interestingly, the motion-energy model performs well in different brain |
| 168 | +# regions than the semantic "wordnet" model fitted in the previous example. To |
| 169 | +# compare the two models, we first need to fit again the wordnet model. |
163 | 170 |
|
164 | 171 | feature_space = "wordnet" |
165 | 172 | file_name = os.path.join(directory, "features", f"{feature_space}.hdf") |
166 | 173 | X_train = load_hdf5_array(file_name, key="X_train") |
167 | 174 | X_test = load_hdf5_array(file_name, key="X_test") |
168 | 175 |
|
169 | | -# We use single precision float to speed up model fitting on GPU. |
170 | 176 | X_train = X_train.astype("float32") |
171 | 177 | X_test = X_test.astype("float32") |
172 | 178 |
|
173 | | -# We can create an unfitted copy of the pipeline with the `clone` function. |
| 179 | +############################################################################### |
| 180 | +# We can create an unfitted copy of the pipeline with the ``clone`` function. |
174 | 181 | from sklearn.base import clone |
175 | 182 | pipeline_wordnet = clone(pipeline_motion_energy) |
176 | 183 | pipeline_wordnet |
|
202 | 209 | ############################################################################### |
203 | 210 | # Interestingly, the well predicted voxels are different in the two models. |
204 | 211 | # To further describe these differences, we can plot both performances on the |
205 | | -# same flatmap. |
| 212 | +# same flatmap, using a 2D colormap. |
206 | 213 |
|
207 | 214 | from voxelwise_tutorials.viz import plot_2d_flatmap_from_mapper |
208 | 215 |
|
|
214 | 221 | plt.show() |
215 | 222 |
|
216 | 223 | ############################################################################### |
217 | | -# The blue regions are well predicted by the motion-energy features, |
218 | | -# the orange regions are well predicted by the wordnet features, |
219 | | -# and the white regions are well predicted by both feature spaces. |
| 224 | +# The blue regions are well predicted by the motion-energy features, the orange |
| 225 | +# regions are well predicted by the wordnet features, and the white regions are |
| 226 | +# well predicted by both feature spaces. |
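Conceptually, a 2D colormap assigns each voxel a color from its *pair* of scores. An illustrative mapping (not the actual colormap used by the tutorial helper) interpolates between black, blue, orange and white corners:

```python
import numpy as np

def two_d_colors(s1, s2, c10=(0.0, 0.3, 1.0), c01=(1.0, 0.5, 0.0)):
    """Map two score arrays in [0, 1] to RGB colors by bilinear
    interpolation: black (both low), blue (s1 high), orange (s2 high),
    white (both high). Illustrative sketch only."""
    s1 = np.clip(np.asarray(s1, dtype=float), 0, 1)[..., None]
    s2 = np.clip(np.asarray(s2, dtype=float), 0, 1)[..., None]
    black, white = np.zeros(3), np.ones(3)
    c10, c01 = np.asarray(c10), np.asarray(c01)
    return ((1 - s1) * (1 - s2) * black + s1 * (1 - s2) * c10
            + (1 - s1) * s2 * c01 + s1 * s2 * white)

print(two_d_colors(1.0, 1.0))  # both scores high -> white: [1. 1. 1.]
```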
220 | 227 | # |
221 | 228 | # Interestingly, a large part of the visual semantic areas are not only well |
222 | | -# predicted by the wordnet features, but also by the motion-energy features, |
223 | | -# as indicated by the white color. Since these two features spaces encode |
224 | | -# quite different information, two interpretations are possible. |
225 | | -# In the first one, the two feature spaces encode complementary information, |
226 | | -# and could be used jointly to further increase the generalization |
227 | | -# performances. In the second interpretation, both feature spaces encode the |
228 | | -# same information, because of spurious correlation in the stimulus. For |
229 | | -# example, all faces in the stimulus might be located in the same part of the |
230 | | -# visual field, thus a motion-energy feature at this location might contain |
231 | | -# all the necessary information to predict the presence of a face, without |
232 | | -# specifically encoding for the semantic of faces. |
| 229 | +# predicted by the wordnet features, but also by the motion-energy features, as |
| 230 | +# indicated by the white color. Since these two feature spaces encode quite
| 231 | +# different information, two interpretations are possible. In the first |
| 232 | +# interpretation, the two feature spaces encode complementary information, and |
| 233 | +# could be used jointly to further increase the generalization performances. In |
| 234 | +# the second interpretation, both feature spaces encode the same information, |
| 235 | +# because of spurious correlation in the stimulus. For example, all faces in |
| 236 | +# the stimulus might be located in the same part of the visual field, thus a |
| 237 | +# motion-energy feature at this location might contain all the necessary |
| 238 | +# information to predict the presence of a face, without specifically encoding |
| 239 | +# for the semantics of faces.
233 | 240 | # |
234 | 241 | # To better disentangle the two feature spaces, we developed a joint model |
235 | 242 | # called `banded ridge regression` [2]_, which fits multiple feature spaces |
236 | | -# simultaneously with optimal regularization for each feature space. |
237 | | -# This model is described in the next example. |
| 243 | +# simultaneously with optimal regularization for each feature space. This model |
| 244 | +# is described in the next example. |
238 | 245 |
|
239 | 246 | ############################################################################### |
240 | 247 | # References |
|