Commit b9e6eda
Optimize and clean up internal replicator purge checkpoints
Previously, the internal replicator created twice the number of checkpoints
needed to replicate purges between nodes. An internal replication job first
pulls the purges from the target to the source, then pushes the purges from the
source to the target, then finally pushes the document updates to the target.
During the pull operation, for example, from node `B` (the target) to node
`A` (the source), it creates an `A->B` checkpoint on node `B` (the target).
Then, during the push from `A` to `B`, it creates an `A->B` checkpoint on node
`A` (the source). As a result, after the job finishes there are two checkpoints:
an `A->B` one on `A`, and an `A->B` one on `B`. It may look something like this:
```
 [node A]                       [node B]
           <-------pull------    (A->B)
  (A->B)   --------push------>
```
When the internal replication job runs on node B and _pushes_ purges to node A,
it will create a `B->A` checkpoint on B. After this instant, there will be two
checkpoints on B for replicating purges from B to A: one is `A->B`, from the
first job, and another `B->A`, from the second job. Both of the checkpoints
essentially checkpoint the same thing. It may look like this after both
replication jobs finish:
```
 [node A]                       [node B]
           <-------pull------    (A->B)   JOB1
  (A->B)   --------push------>
  (B->A)   --------pull------>
           <-------push------    (B->A)   JOB2
```
On B, the checkpoints `A->B` and `B->A` could record different purge
sequences. If one is lower than the other, the lower one could delay the
compactor from cleaning up purge infos. This also makes it harder to reason
about the replication process, since we have an `A->B` checkpoint on `B`, but
it's for sending changes _from_ B _to_ A, not from A to B as its name would
suggest.
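The effect on compaction can be illustrated with a minimal Python sketch. This is not CouchDB code: it assumes the compactor can only drop purge infos up to the lowest purge sequence recorded by any checkpoint on the node, and the function name and sequence values are made up for illustration.

```python
def oldest_required_purge_seq(checkpoints):
    """Return the lowest purge sequence recorded by any checkpoint.

    `checkpoints` maps a checkpoint name to the purge sequence it recorded.
    Assumption: purge infos above this value cannot be cleaned up yet.
    """
    return min(checkpoints.values())


# Hypothetical state on node B after both jobs have run: two checkpoints
# cover the same B-to-A direction, and one of them lags behind.
checkpoints_on_b = {"A->B": 10, "B->A": 50}

# The lagging duplicate decides how far cleanup can go.
limit = oldest_required_purge_seq(checkpoints_on_b)
print(limit)
```

In this model the stale `A->B` entry holds the limit at its own sequence, even though the `B->A` checkpoint has already progressed further.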
To fix this, make sure to use a single checkpoint per direction of replication.
So, when changes are pulled from B to A, the checkpoint is now `B->A`, and when
changes are pushed from B to A the checkpoint is also `B->A`.
It should look something like this:
```
 [node A]                       [node B]
           <-------pull------             JOB1
  (A->B)   --------push------>
           --------pull------>
           <-------push------    (B->A)   JOB2
```
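The naming rule can be sketched as a small Python model. It is only an illustration, not the mem3 implementation: the helper names are hypothetical, and it assumes (as the diagrams suggest) that a checkpoint is stored on the node the purges are read from.

```python
from collections import defaultdict


def checkpoint_name(purges_from, purges_to):
    """Name a checkpoint after the direction the purges flow in."""
    return f"{purges_from}->{purges_to}"


def run_job(source, target, store):
    """Model one internal replication job running on `source`.

    `store` maps a node name to the set of checkpoint names kept on it.
    Assumption: the checkpoint lives on the node the purges come from.
    """
    # Pull: purges flow from target to source.
    store[target].add(checkpoint_name(target, source))
    # Push: purges flow from source to target.
    store[source].add(checkpoint_name(source, target))


store = defaultdict(set)
run_job("A", "B", store)  # JOB1 runs on node A
run_job("B", "A", store)  # JOB2 runs on node B

for node in sorted(store):
    print(node, sorted(store[node]))
```

Because the name depends only on the direction of flow, the pull in one job and the push in the other resolve to the same checkpoint, so each node ends up with a single entry per direction.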
Since after this change we'll have some deprecated purge checkpoints to clean
up, it's probably also a good time to introduce purge checkpoint cleanup. We
have this for indexes but we didn't have it for the internal replicator. That
meant that on shard map reconfigurations, or node membership changes, users
would have to manually hunt down local (un-clustered) stale purge checkpoints
and remove them. Now this happens automatically when we compact, and before we
replicate between nodes.
1 parent d512917 commit b9e6eda
4 files changed
Lines changed: 263 additions & 54 deletions
File tree
- src
- couch/src
- mem3
- src
- test/eunit