one out of every three rows, but it usually gets close. For the sake
of example, let's say that this process ends up owning rows 2 and 5.
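
The row-ownership rule can be sketched in Python; the MD5-based hash and the `owns_row` helper below are illustrative assumptions, not Swift's actual implementation:

```python
import hashlib

def owns_row(object_name, node_index, node_count):
    # Hash the object name and take it modulo the number of nodes;
    # exactly one node "owns" each row, and each node ends up owning
    # roughly one out of every node_count rows.
    digest = hashlib.md5(object_name.encode('utf-8')).hexdigest()
    return int(digest, 16) % node_count == node_index

# Example: with 3 replicas, node 0 owns whichever subset of the seven
# rows its hashes select -- close to, but not exactly, one in three.
rows = ['object-%d' % rid for rid in range(7)]
owned = [rid for rid, name in enumerate(rows) if owns_row(name, 0, 3)]
```

Since every row hashes to exactly one node, the three replicas between them cover the whole table each cycle.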

Once it's finished trying to sync those rows, it updates SP1 to be the
biggest row-id that it's seen, which is 6 in this example. ::
On the next run, the container-sync starts off looking at rows with
ids between SP1 and SP2. This time, there are a bunch of them. The
sync process tries to sync all of them. If it succeeds, it will set
SP2 to equal SP1. If it fails, it will set SP2 to the first object
that failed, and will continue to try the remaining objects up to SP1.
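
A minimal sketch of that SP2 bookkeeping, assuming rows are addressed by consecutive integer ids and ``sync_one`` is a hypothetical per-row sync call (real container-sync reads rows from the container database):

```python
def sync_pass(sp1, sp2, sync_one):
    # Try every row id in the window (SP2, SP1]. If everything syncs,
    # SP2 catches up to SP1; otherwise SP2 is parked at the first
    # failure so that row falls back into the window on the next run.
    # Treating the window as exclusive of SP2 is an assumption here;
    # the important property is that the first failed row is retried.
    first_failed = None
    for row_id in range(sp2 + 1, sp1 + 1):
        if not sync_one(row_id) and first_failed is None:
            first_failed = row_id
    if first_failed is None:
        return sp1                  # total success: SP2 = SP1
    return first_failed - 1         # retry from the first failure
```

For the example above, a pass with ``sp2 = -1`` and ``sp1 = 6`` attempts rows 0 through 6; if only row 3 fails, the pass returns 2, so row 3 is back in the window next time.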
Under normal circumstances, the container-sync processes for the other
replicas will have already taken care of synchronizing all rows between
SP1 and SP2, so this is a set of quick checks. However, if one of the
sync processes failed for some reason, then this is a vital fallback to
make sure all the objects in the container get synchronized. Without
this seemingly-redundant work, any container-sync failure results in
unsynchronized objects. Note that the container sync will persistently
retry syncing any faulty object until it succeeds, logging each failure
along the way.
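
The retry-with-logging behavior might look roughly like this across runs; ``run_once``, the ``failing`` set, and the logger name are all hypothetical:

```python
import logging

log = logging.getLogger('container-sync')

def run_once(row_ids, failing, synced):
    # One sync pass: rows already synced are quick checks, rows that
    # fail are logged and left pending so the next run retries them.
    for row_id in row_ids:
        if row_id in synced:
            continue
        if row_id in failing:
            log.warning('sync of row %d failed; will retry', row_id)
        else:
            synced.add(row_id)

# A transient fault on row 3 clears before the second run, so the
# persistent retry eventually syncs everything.
synced = set()
run_once(range(7), failing={3}, synced=synced)
run_once(range(7), failing=set(), synced=synced)
```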
Once it's done with the fallback rows, and assuming no faults occurred,
SP2 is advanced to SP1. ::