Skip to content

Commit 903dbe6

Browse files
committed
MNT,DOC clean up notebooks and improve FIR notebook
1 parent 7622600 commit 903dbe6

8 files changed

Lines changed: 291 additions & 57 deletions

File tree

tutorials/Makefile

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,8 +12,9 @@ help:
1212
@echo " merge-notebooks Merge notebook files into single files for Colab"
1313
@echo " build Build the jupyter book"
1414
@echo " push-pages Push built book to GitHub Pages"
15+
@echo " preview Preview the book locally"
1516

16-
.PHONY: help Makefile clean merge-notebooks build push-pages
17+
.PHONY: help Makefile clean merge-notebooks build push-pages preview
1718

1819
clean:
1920
rm -rf $(BUILDDIR)/

tutorials/notebooks/shortclips/05_fit_wordnet_model.ipynb

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -122,14 +122,14 @@
122122
"outputs": [],
123123
"source": [
124124
"from scipy.stats import zscore\n",
125+
"from voxelwise_tutorials.utils import zscore_runs\n",
125126
"\n",
126127
"# indice of first sample of each run\n",
127128
"run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
128129
"print(run_onsets)\n",
129130
"\n",
130131
"# zscore each training run separately\n",
131-
"Y_train = np.split(Y_train, run_onsets[1:])\n",
132-
"Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
132+
"Y_train = zscore_runs(Y_train, run_onsets)\n",
133133
"# zscore each test run separately\n",
134134
"Y_test = zscore(Y_test, axis=1)"
135135
]
@@ -156,7 +156,6 @@
156156
"source": [
157157
"Y_test = Y_test.mean(0)\n",
158158
"# We need to zscore the test data again, because we took the mean across repetitions.\n",
159-
"# This averaging step makes the standard deviation approximately equal to 1/sqrt(n_repeats)\n",
160159
"Y_test = zscore(Y_test, axis=0)\n",
161160
"\n",
162161
"print(\"(n_samples_test, n_voxels) =\", Y_test.shape)"

tutorials/notebooks/shortclips/06_visualize_hemodynamic_response.ipynb

Lines changed: 61 additions & 47 deletions
Original file line numberDiff line numberDiff line change
@@ -84,6 +84,7 @@
8484
"import numpy as np\n",
8585
"from scipy.stats import zscore\n",
8686
"from voxelwise_tutorials.io import load_hdf5_array\n",
87+
"from voxelwise_tutorials.utils import zscore_runs\n",
8788
"\n",
8889
"file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
8990
"Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
@@ -96,8 +97,7 @@
9697
"run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
9798
"\n",
9899
"# zscore each training run separately\n",
99-
"Y_train = np.split(Y_train, run_onsets[1:])\n",
100-
"Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
100+
"Y_train = zscore_runs(Y_train, run_onsets)\n",
101101
"# zscore each test run separately\n",
102102
"Y_test = zscore(Y_test, axis=1)"
103103
]
@@ -287,14 +287,16 @@
287287
"cell_type": "markdown",
288288
"metadata": {},
289289
"source": [
290-
"## Intermission: understanding delays\n",
290+
"## Understanding delays\n",
291291
"\n",
292292
"To have an intuitive understanding of what we accomplish by delaying the\n",
293293
"features before model fitting, we will simulate one voxel and a single\n",
294294
"feature. We will then create a ``Delayer`` object (which was used in the\n",
295-
"previous pipeline) and visualize its effect on our single feature. Let's\n",
296-
"start by simulating the data.\n",
297-
"\n"
295+
"previous pipeline) and visualize its effect on our single feature. \n",
296+
"\n",
297+
"Let's start by simulating the data. We assume a simple scenario in which an event in\n",
298+
"our experiment occurred at $t = 20$ seconds and lasted for 10 seconds. For each timepoint, our simulated feature\n",
299+
"is a simple variable that indicates whether the event occurred or not."
298300
]
299301
},
300302
{
@@ -305,71 +307,83 @@
305307
},
306308
"outputs": [],
307309
"source": [
308-
"# number of total trs\n",
309-
"n_trs = 50\n",
310-
"# repetition time for the simulated data\n",
311-
"TR = 2.0\n",
312-
"rng = np.random.RandomState(42)\n",
313-
"y = rng.randn(n_trs)\n",
314-
"x = np.zeros(n_trs)\n",
315-
"# add some arbitrary value to our feature\n",
316-
"x[15:20] = 0.5\n",
317-
"x += rng.randn(n_trs) * 0.1 # add some noise\n",
310+
"from voxelwise_tutorials.delays_toy import create_voxel_data\n",
318311
"\n",
319-
"# create a delayer object and delay the features\n",
320-
"delayer = Delayer(delays=[0, 1, 2, 3, 4])\n",
321-
"x_delayed = delayer.fit_transform(x[:, None])"
312+
"# simulate an activation pulse at 20 s for 10 s of duration\n",
313+
"simulated_X, simulated_Y, times = create_voxel_data(onset=20, duration=10)"
314+
]
315+
},
316+
{
317+
"cell_type": "markdown",
318+
"metadata": {},
319+
"source": [
320+
"We next plot the simulated data. In this toy example, we assumed a \"canonical\" \n",
321+
"hemodynamic response function (HRF) (a double gamma function). This is an idealized\n",
322+
"HRF that is often used in the literature to model the BOLD response. In practice, \n",
323+
"however, the HRF can vary significantly across brain areas.\n",
324+
"\n",
325+
"Because of the HRF, notice that even though the event occurred at $t = 20$ seconds, \n",
326+
"the BOLD response is delayed in time. "
327+
]
328+
},
329+
{
330+
"cell_type": "code",
331+
"execution_count": null,
332+
"metadata": {},
333+
"outputs": [],
334+
"source": [
335+
"import matplotlib.pyplot as plt\n",
336+
"from voxelwise_tutorials.delays_toy import plot_delays_toy\n",
337+
"\n",
338+
"plot_delays_toy(simulated_X, simulated_Y, times)\n",
339+
"plt.show()"
322340
]
323341
},
324342
{
325343
"cell_type": "markdown",
326344
"metadata": {},
327345
"source": [
328-
"In the next cell we are plotting six lines. The subplot at the top shows the\n",
329-
"simulated BOLD response, while the other subplots show the simulated feature\n",
330-
"at different delays. The effect of the delayer is clear: it creates multiple\n",
346+
"We next create a `Delayer` object and use it to delay the simulated feature. \n",
347+
"The effect of the delayer is clear: it creates multiple\n",
331348
"copies of the original feature shifted forward in time by how many samples we\n",
332349
"requested (in this case, from 0 to 4 samples, which correspond to 0, 2, 4, 6,\n",
333350
"and 8 s in time with a 2 s TR).\n",
334351
"\n",
335352
"When these delayed features are used to fit a voxelwise encoding model, the\n",
336353
"brain response $y$ at time $t$ is simultaneously modeled by the\n",
337-
"feature $x$ at times $t-0, t-2, t-4, t-6, t-8$. In the remaining\n",
338-
"of this example we will see that this method improves model prediction\n",
339-
"accuracy and it allows to account for the underlying shape of the hemodynamic\n",
340-
"response function.\n",
341-
"\n"
354+
"feature $x$ at times $t-0, t-2, t-4, t-6, t-8$. For example, the time sample highlighted\n",
355+
"in the plot below ($t = 30$ seconds) is modeled by the features at \n",
356+
"$t = 30, 28, 26, 24, 22$ seconds."
342357
]
343358
},
344359
{
345360
"cell_type": "code",
346361
"execution_count": null,
347-
"metadata": {
348-
"collapsed": false
349-
},
362+
"metadata": {},
350363
"outputs": [],
351364
"source": [
352-
"import matplotlib.pyplot as plt\n",
365+
"# create a delayer object and delay the features\n",
366+
"delayer = Delayer(delays=[0, 1, 2, 3, 4])\n",
367+
"simulated_X_delayed = delayer.fit_transform(simulated_X[:, None])\n",
353368
"\n",
354-
"fig, axs = plt.subplots(6, 1, figsize=(6, 6), constrained_layout=True, sharex=True)\n",
355-
"times = np.arange(n_trs) * TR\n",
356-
"\n",
357-
"axs[0].plot(times, y, color=\"r\")\n",
358-
"axs[0].set_title(\"BOLD response\")\n",
359-
"for i, (ax, xx) in enumerate(zip(axs.flat[1:], x_delayed.T)):\n",
360-
" ax.plot(times, xx, color=\"k\")\n",
361-
" ax.set_title(\n",
362-
" \"$x(t - {0:.0f})$ (feature delayed by {1} sample{2})\".format(\n",
363-
" i * TR, i, \"\" if i == 1 else \"s\"\n",
364-
" )\n",
365-
" )\n",
366-
"for ax in axs.flat:\n",
367-
" ax.axvline(40, color=\"gray\")\n",
368-
" ax.set_yticks([])\n",
369-
"_ = axs[-1].set_xlabel(\"Time [s]\")\n",
369+
"# plot the simulated data and highlight t = 30\n",
370+
"plot_delays_toy(simulated_X_delayed, simulated_Y, times, highlight=30)\n",
370371
"plt.show()"
371372
]
372373
},
374+
{
375+
"cell_type": "markdown",
376+
"metadata": {},
377+
"source": [
378+
"This simple example shows how the delayed features take into account of the HRF. \n",
379+
"This approach is often referred to as a \"finite impulse response\" (FIR) model.\n",
380+
"By delaying the features, the regression model learns the weights for each voxel \n",
381+
"separately. Therefore, the FIR approach is able to adapt to the shape of the HRF in each \n",
382+
"voxel, without assuming a fixed canonical HRF shape. \n",
383+
"As we will see in the remaining of this notebook, this approach improves model \n",
384+
"prediction accuracy significantly."
385+
]
386+
},
373387
{
374388
"cell_type": "markdown",
375389
"metadata": {},

tutorials/notebooks/shortclips/08_fit_motion_energy_model.ipynb

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -91,6 +91,7 @@
9191
"import numpy as np\n",
9292
"from scipy.stats import zscore\n",
9393
"from voxelwise_tutorials.io import load_hdf5_array\n",
94+
"from voxelwise_tutorials.utils import zscore_runs\n",
9495
"\n",
9596
"file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
9697
"Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
@@ -103,8 +104,7 @@
103104
"run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
104105
"\n",
105106
"# zscore each training run separately\n",
106-
"Y_train = np.split(Y_train, run_onsets[1:])\n",
107-
"Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
107+
"Y_train = zscore_runs(Y_train, run_onsets)\n",
108108
"# zscore each test run separately\n",
109109
"Y_test = zscore(Y_test, axis=1)"
110110
]
@@ -486,7 +486,7 @@
486486
"semantic information.\n",
487487
"\n",
488488
"To better disentangle the two feature spaces, we developed a joint model\n",
489-
"called `banded ridge regression` {cite}`nunez2019,dupre2022`, which fits multiple feature spaces\n",
489+
"called **banded ridge regression** {cite}`nunez2019,dupre2022`, which fits multiple feature spaces\n",
490490
"simultaneously with optimal regularization for each feature space. This model\n",
491491
"is described in the next example.\n",
492492
"\n"

tutorials/notebooks/shortclips/09_fit_banded_ridge_model.ipynb

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -77,6 +77,7 @@
7777
"import numpy as np\n",
7878
"from scipy.stats import zscore\n",
7979
"from voxelwise_tutorials.io import load_hdf5_array\n",
80+
"from voxelwise_tutorials.utils import zscore_runs\n",
8081
"\n",
8182
"file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
8283
"Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
@@ -89,8 +90,7 @@
8990
"run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
9091
"\n",
9192
"# zscore each training run separately\n",
92-
"Y_train = np.split(Y_train, run_onsets[1:])\n",
93-
"Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
93+
"Y_train = zscore_runs(Y_train, run_onsets)\n",
9494
"# zscore each test run separately\n",
9595
"Y_test = zscore(Y_test, axis=1)"
9696
]

voxelwise_tutorials/delays_toy.py

Lines changed: 154 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,154 @@
1+
import matplotlib.pyplot as plt
2+
import numpy as np
3+
from scipy.stats import gamma
4+
5+
6+
def simulate_bold(stimulus_array, TR=2.0, duration=32.0):
7+
"""Simulate BOLD signal by convolving a stimulus time series with a canonical HRF.
8+
9+
Parameters:
10+
-----------
11+
stimulus_array : array-like
12+
Binary array representing stimulus onset (1) and absence (0)
13+
TR : float, optional
14+
Repetition time of the scanner in seconds (default: 2.0s)
15+
duration : float, optional
16+
Total duration to model the HRF in seconds (default: 32.0s)
17+
18+
Returns:
19+
--------
20+
bold_signal : ndarray
21+
The simulated BOLD signal after convolution
22+
"""
23+
# Create time array based on TR
24+
t = np.arange(0, duration, TR)
25+
26+
# Define canonical double-gamma HRF
27+
# Parameters for the canonical HRF (based on SPM defaults)
28+
peak_delay = 6.0 # delay of peak in seconds
29+
undershoot_delay = 16.0 # delay of undershoot
30+
peak_disp = 1.0 # dispersion of peak
31+
undershoot_disp = 1.0 # dispersion of undershoot
32+
peak_amp = 1.0 # amplitude of peak
33+
undershoot_amp = 0.1 # amplitude of undershoot
34+
35+
# Compute positive and negative gamma functions
36+
pos_gamma = peak_amp * gamma.pdf(t, peak_delay / peak_disp, scale=peak_disp)
37+
neg_gamma = undershoot_amp * gamma.pdf(
38+
t, undershoot_delay / undershoot_disp, scale=undershoot_disp
39+
)
40+
41+
# Canonical HRF = positive gamma - negative gamma
42+
hrf = pos_gamma - neg_gamma
43+
44+
# Normalize HRF to have sum = 1
45+
hrf = hrf / np.sum(hrf)
46+
47+
# Perform convolution
48+
bold_signal = np.convolve(stimulus_array, hrf, mode="full")
49+
50+
# Return only the part of the signal that matches the length of the input
51+
return bold_signal[: len(stimulus_array)]
52+
53+
54+
def create_voxel_data(n_trs=50, TR=2.0, onset=30, duration=10, random_seed=42):
55+
"""Create a toy dataset with a single voxel and a single feature.
56+
57+
Parameters
58+
----------
59+
n_trs : int
60+
Number of time points (TRs).
61+
TR : float
62+
Repetition time in seconds.
63+
activation_t : float
64+
Time point of activation onset in seconds.
65+
activation_duration : float
66+
Duration of activation in seconds.
67+
random_seed : int
68+
Seed for random number generation.
69+
70+
Returns
71+
-------
72+
X : array of shape (n_trs,)
73+
The generated feature.
74+
Y : array of shape (n_trs,)
75+
The generated voxel data.
76+
times : array of shape (n_trs,)
77+
The time points corresponding to the voxel data.
78+
"""
79+
if onset > n_trs * TR:
80+
raise ValueError("onset must be less than n_trs * TR")
81+
if duration > n_trs * TR:
82+
raise ValueError("duration must be less than n_trs * TR")
83+
if duration < 0:
84+
raise ValueError("duration must be greater than 0")
85+
if onset < 0:
86+
raise ValueError("onset must be greater than 0")
87+
rng = np.random.RandomState(random_seed)
88+
# figure out slices of activation
89+
activation = slice(int(onset / TR), int((onset + duration) / TR))
90+
X = np.zeros(n_trs)
91+
# add some arbitrary value to our feature
92+
X[activation] = 1
93+
Y = simulate_bold(X, TR=TR)
94+
# add some noise
95+
Y += rng.randn(n_trs) * 0.1
96+
times = np.arange(n_trs) * TR
97+
return X, Y, times
98+
99+
100+
def plot_delays_toy(X_delayed, Y, times, highlight=None):
101+
"""Creates a figure showing a BOLD response and delayed versions of a stimulus.
102+
103+
Parameters
104+
----------
105+
X_delayed : ndarray, shape (n_timepoints, n_delays)
106+
The delayed stimulus, where each column corresponds to a different delay.
107+
Y : ndarray, shape (n_timepoints,)
108+
The BOLD response time series.
109+
times : ndarray, shape (n_timepoints,)
110+
Time points in seconds.
111+
highlight : float or None, optional
112+
Time point to highlight in the plot (default: 30 seconds).
113+
114+
Returns
115+
-------
116+
axs : ndarray
117+
Array of matplotlib axes objects containing the plots.
118+
"""
119+
if X_delayed.ndim == 1:
120+
X_delayed = X_delayed[:, np.newaxis]
121+
n_delays = X_delayed.shape[1]
122+
n_rows = n_delays + 1
123+
TR = times[1] - times[0]
124+
125+
fig, axs = plt.subplots(
126+
n_rows,
127+
1,
128+
figsize=(6, n_rows),
129+
constrained_layout=True,
130+
sharex=True,
131+
sharey=True,
132+
)
133+
axs[0].plot(times, Y, color="r")
134+
axs[0].set_title("BOLD response")
135+
if len(axs) == 2:
136+
axs[1].set_title("Feature")
137+
axs[1].plot(times, X_delayed, color="k")
138+
else:
139+
for i, (ax, xx) in enumerate(zip(axs.flat[1:], X_delayed.T)):
140+
ax.plot(times, xx, color="k")
141+
ax.set_title(
142+
"$x(t - {0:.0f})$ (feature delayed by {1} sample{2})".format(
143+
i * TR, i, "" if i == 1 else "s"
144+
)
145+
)
146+
if highlight is not None:
147+
for ax in axs.flat:
148+
ax.axvline(highlight, color="gray")
149+
ax.set_yticks([])
150+
_ = axs[-1].set_xlabel("Time [s]")
151+
# add more margin at the top of the y axis
152+
ylim = axs[0].get_ylim()
153+
axs[0].set_ylim(ylim[0], ylim[1] * 1.2)
154+
return axs

0 commit comments

Comments
 (0)