Commit a00b0db

ttaylorr authored and gitster committed
repack: introduce --write-midx=incremental
Expose the incremental MIDX repacking mode (implemented in an earlier
commit) via a new --write-midx=incremental option for `git repack`.

Add "incremental" as a recognized argument to the --write-midx
OPT_CALLBACK, mapping it to REPACK_WRITE_MIDX_INCREMENTAL. When this mode
is active and --geometric is in use, set the midx_layer_threshold on the
pack geometry so that only packs in sufficiently large tip layers are
considered for repacking.

Two new configuration options control the compaction behavior:

  - repack.midxSplitFactor (default: 2): the factor used in the
    geometric merging condition for MIDX layers.

  - repack.midxNewLayerThreshold (default: 8): the minimum number of
    packs in the tip MIDX layer before its packs are considered as
    candidates for geometric repacking.

Add tests exercising the new mode across a variety of scenarios
including basic geometric violations, multi-round chain integrity,
branching and merging histories, cross-layer object uniqueness, and
threshold-based compaction.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 parent e96ff36 commit a00b0db

11 files changed

Lines changed: 593 additions & 21 deletions

Documentation/config/repack.adoc

Lines changed: 18 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -46,3 +46,21 @@ repack.midxMustContainCruft::
4646
`--write-midx`. When false, cruft packs are only included in the MIDX
4747
when necessary (e.g., because they might be required to form a
4848
reachability closure with MIDX bitmaps). Defaults to true.
49+
50+
repack.midxSplitFactor::
51+
The factor used in the geometric merging condition when
52+
compacting incremental MIDX layers during `git repack` when
53+
invoked with the `--write-midx=incremental` option.
54+
+
55+
Adjacent layers are merged when the accumulated object count of the
56+
newer layer exceeds `1/<N>` of the object count of the next deeper
57+
layer. Defaults to 2.
58+
59+
repack.midxNewLayerThreshold::
60+
The minimum number of packs in the tip MIDX layer before those
61+
packs are considered as candidates for geometric repacking
62+
during `git repack --write-midx=incremental`.
63+
+
64+
When the tip layer has fewer packs than this threshold, those packs are
65+
excluded from the geometric repack entirely, and are thus left
66+
unmodified. Defaults to 8.
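
The two conditions documented in this hunk can be sketched in isolation. The helpers below are hypothetical and are not functions in Git; they only restate the documented rules under the assumption that layers are listed from the tip (index 0) down to the base, and that "exceeds `1/<N>`" is evaluated as `accumulated * N > deeper` to stay in integer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of Git): given per-layer object counts
 * ordered from the tip (index 0) to the base, return how many layers,
 * starting at the tip, would be folded into a single layer.
 *
 * A return value of 1 means the tip layer is left as-is.
 */
size_t layers_to_merge(const uint64_t *objects, size_t nr,
		       unsigned split_factor)
{
	uint64_t accumulated;
	size_t merged = 1;

	assert(split_factor > 0);
	if (!nr)
		return 0;

	accumulated = objects[0];
	while (merged < nr) {
		/* "exceeds 1/<N> of the deeper layer" <=> acc * N > deeper */
		if (accumulated * split_factor <= objects[merged])
			break;
		accumulated += objects[merged];
		merged++;
	}
	return merged;
}

/*
 * Hypothetical helper: the tip layer's packs take part in the geometric
 * repack only once the layer holds at least `threshold` packs.
 */
int tip_packs_are_candidates(size_t tip_pack_nr, size_t threshold)
{
	return tip_pack_nr >= threshold;
}
```

With layer counts of 100, 150, and 1000 objects and a factor of 2, the tip merges into the second layer (100 exceeds 75), but the combined 250 objects do not exceed half of 1000, so the base layer stays untouched.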

Documentation/git-repack.adoc

Lines changed: 36 additions & 3 deletions
@@ -11,7 +11,7 @@ SYNOPSIS
1111
[verse]
1212
'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m]
1313
[--window=<n>] [--depth=<n>] [--threads=<n>] [--keep-pack=<pack-name>]
14-
[--write-midx] [--name-hash-version=<n>] [--path-walk]
14+
[--write-midx[=<mode>]] [--name-hash-version=<n>] [--path-walk]
1515

1616
DESCRIPTION
1717
-----------
@@ -250,9 +250,42 @@ pack as the preferred pack for object selection by the MIDX (see
250250
linkgit:git-multi-pack-index[1]).
251251

252252
-m::
253-
--write-midx::
253+
--write-midx[=<mode>]::
254254
Write a multi-pack index (see linkgit:git-multi-pack-index[1])
255-
containing the non-redundant packs.
255+
containing the non-redundant packs. The following modes are
256+
available:
257+
+
258+
--
259+
`default`;;
260+
Write a single MIDX covering all packs. This is the
261+
default when `--write-midx` is given without an
262+
explicit mode.
263+
264+
`incremental`;;
265+
Write an incremental MIDX chain instead of a single
266+
flat MIDX. This mode requires `--geometric`.
267+
+
268+
The incremental mode maintains a chain of MIDX layers that is compacted
269+
over time using a geometric merging strategy. Each repack creates a new
270+
tip layer containing the newly written pack(s). Adjacent layers are then
271+
merged whenever the newer layer's object count exceeds
272+
`1/repack.midxSplitFactor` of the next deeper layer's count. Layers
273+
that do not meet this condition are retained as-is.
274+
+
275+
The result is that newer (tip) layers tend to contain many small packs
276+
with relatively few objects, while older (deeper) layers contain fewer,
277+
larger packs covering more objects. Because compaction is driven by the
278+
tip of the chain, newer layers are also rewritten more frequently than
279+
older ones, which are only touched when enough objects have accumulated
280+
to justify merging into them. This keeps the total number of layers
281+
logarithmic relative to the total number of objects.
282+
+
283+
Only packs in the tip MIDX layer are considered as candidates for the
284+
geometric repack; packs in deeper layers are left untouched. If the tip
285+
layer contains fewer packs than `repack.midxNewLayerThreshold`, those
286+
packs are excluded from the geometry entirely, and a new layer is
287+
created for any new pack(s) without disturbing the existing chain.
288+
--
256289

257290
--name-hash-version=<n>::
258291
Provide this argument to the underlying `git pack-objects` process.
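
The claim that the chain stays short can be explored with a toy model. The sketch below is not Git's implementation: it assumes every repack adds a tip layer with the same number of objects, ignores packs and thresholds entirely, and uses a hypothetical `simulate_chain()` function and a fixed `MAX_LAYERS` bound chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LAYERS 64

/*
 * Toy model, not Git's implementation: simulate `rounds` repacks that
 * each add a new tip layer holding `objects_per_round` objects, compact
 * the chain after every round, and return the resulting layer count.
 *
 * The chain is capped at MAX_LAYERS; inputs that never trigger a merge
 * (for example a split factor of 1 with equal-sized layers) will trip
 * the assertion once that cap is reached.
 */
size_t simulate_chain(unsigned rounds, uint64_t objects_per_round,
		      unsigned split_factor)
{
	uint64_t layers[MAX_LAYERS]; /* layers[nr - 1] is the tip */
	size_t nr = 0;
	unsigned i;

	assert(split_factor > 0);

	for (i = 0; i < rounds; i++) {
		assert(nr < MAX_LAYERS);
		layers[nr++] = objects_per_round;

		/* Merge the tip downward while it exceeds 1/N of its parent. */
		while (nr > 1 &&
		       layers[nr - 1] * split_factor > layers[nr - 2]) {
			layers[nr - 2] += layers[nr - 1];
			nr--;
		}
	}
	return nr;
}
```

With one object per round and a factor of 2, this model behaves like a binary counter: 7 rounds leave 3 layers, while the 8th round collapses the chain back to a single layer.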

builtin/repack.c

Lines changed: 35 additions & 3 deletions
@@ -33,7 +33,7 @@ static int midx_must_contain_cruft = 1;
3333
static const char *const git_repack_usage[] = {
3434
N_("git repack [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m]\n"
3535
"[--window=<n>] [--depth=<n>] [--threads=<n>] [--keep-pack=<pack-name>]\n"
36-
"[--write-midx] [--name-hash-version=<n>] [--path-walk]"),
36+
"[--write-midx[=<mode>]] [--name-hash-version=<n>] [--path-walk]"),
3737
NULL
3838
};
3939

@@ -42,9 +42,14 @@ static const char incremental_bitmap_conflict_error[] = N_(
4242
"--no-write-bitmap-index or disable the pack.writeBitmaps configuration."
4343
);
4444

45+
#define DEFAULT_MIDX_SPLIT_FACTOR 2
46+
#define DEFAULT_MIDX_NEW_LAYER_THRESHOLD 8
47+
4548
struct repack_config_ctx {
4649
struct pack_objects_args *po_args;
4750
struct pack_objects_args *cruft_po_args;
51+
int midx_split_factor;
52+
int midx_new_layer_threshold;
4853
};
4954

5055
static int repack_config(const char *var, const char *value,
@@ -94,6 +99,16 @@ static int repack_config(const char *var, const char *value,
9499
midx_must_contain_cruft = git_config_bool(var, value);
95100
return 0;
96101
}
102+
if (!strcmp(var, "repack.midxsplitfactor")) {
103+
repack_ctx->midx_split_factor = git_config_int(var, value,
104+
ctx->kvi);
105+
return 0;
106+
}
107+
if (!strcmp(var, "repack.midxnewlayerthreshold")) {
108+
repack_ctx->midx_new_layer_threshold = git_config_int(var, value,
109+
ctx->kvi);
110+
return 0;
111+
}
97112
return git_default_config(var, value, ctx, cb);
98113
}
99114

@@ -109,6 +124,8 @@ static int option_parse_write_midx(const struct option *opt, const char *arg,
109124

110125
if (!arg || !*arg)
111126
*cfg = REPACK_WRITE_MIDX_DEFAULT;
127+
else if (!strcmp(arg, "incremental"))
128+
*cfg = REPACK_WRITE_MIDX_INCREMENTAL;
112129
else
113130
return error(_("unknown value for %s: %s"), opt->long_name, arg);
114131

@@ -223,6 +240,8 @@ int cmd_repack(int argc,
223240
memset(&config_ctx, 0, sizeof(config_ctx));
224241
config_ctx.po_args = &po_args;
225242
config_ctx.cruft_po_args = &cruft_po_args;
243+
config_ctx.midx_split_factor = DEFAULT_MIDX_SPLIT_FACTOR;
244+
config_ctx.midx_new_layer_threshold = DEFAULT_MIDX_NEW_LAYER_THRESHOLD;
226245

227246
repo_config(repo, repack_config, &config_ctx);
228247

@@ -244,6 +263,9 @@ int cmd_repack(int argc,
244263
if (pack_everything & PACK_CRUFT)
245264
pack_everything |= ALL_INTO_ONE;
246265

266+
if (write_midx == REPACK_WRITE_MIDX_INCREMENTAL && !geometry.split_factor)
267+
die(_("--write-midx=incremental requires --geometric"));
268+
247269
if (write_bitmaps < 0) {
248270
if (write_midx == REPACK_WRITE_MIDX_NONE &&
249271
(!(pack_everything & ALL_INTO_ONE) || !is_bare_repository()))
@@ -293,6 +315,10 @@ int cmd_repack(int argc,
293315
if (geometry.split_factor) {
294316
if (pack_everything)
295317
die(_("options '%s' and '%s' cannot be used together"), "--geometric", "-A/-a");
318+
if (write_midx == REPACK_WRITE_MIDX_INCREMENTAL) {
319+
geometry.midx_layer_threshold = config_ctx.midx_new_layer_threshold;
320+
geometry.midx_layer_threshold_set = true;
321+
}
296322
pack_geometry_init(&geometry, &existing, &po_args);
297323
pack_geometry_split(&geometry);
298324
}
@@ -540,6 +566,8 @@ int cmd_repack(int argc,
540566
.show_progress = show_progress,
541567
.write_bitmaps = write_bitmaps > 0,
542568
.midx_must_contain_cruft = midx_must_contain_cruft,
569+
.midx_split_factor = config_ctx.midx_split_factor,
570+
.midx_new_layer_threshold = config_ctx.midx_new_layer_threshold,
543571
.mode = write_midx,
544572
};
545573

@@ -552,11 +580,15 @@ int cmd_repack(int argc,
552580

553581
if (delete_redundant) {
554582
int opts = 0;
555-
existing_packs_remove_redundant(&existing, packdir);
583+
bool wrote_incremental_midx = write_midx == REPACK_WRITE_MIDX_INCREMENTAL;
584+
585+
existing_packs_remove_redundant(&existing, packdir,
586+
wrote_incremental_midx);
556587

557588
if (geometry.split_factor)
558589
pack_geometry_remove_redundant(&geometry, &names,
559-
&existing, packdir);
590+
&existing, packdir,
591+
wrote_incremental_midx);
560592
if (show_progress)
561593
opts |= PRUNE_PACKED_VERBOSE;
562594
prune_packed_objects(opts);

midx.c

Lines changed: 31 additions & 0 deletions
@@ -852,6 +852,37 @@ void clear_midx_file(struct repository *r)
852852
strbuf_release(&midx);
853853
}
854854

855+
void clear_incremental_midx_files(struct repository *r,
856+
const struct strvec *keep_hashes)
857+
{
858+
struct strbuf chain = STRBUF_INIT;
859+
860+
get_midx_chain_filename(r->objects->sources, &chain);
861+
862+
if (r->objects) {
863+
struct odb_source *source = r->objects->sources;
864+
for (source = r->objects->sources; source; source = source->next) {
865+
struct odb_source_files *files = odb_source_files_downcast(source);
866+
if (files->packed->midx)
867+
close_midx(files->packed->midx);
868+
files->packed->midx = NULL;
869+
}
870+
}
871+
872+
if (!keep_hashes && remove_path(chain.buf))
873+
die(_("failed to clear multi-pack-index chain at %s"),
874+
chain.buf);
875+
876+
clear_incremental_midx_files_ext(r->objects->sources, MIDX_EXT_BITMAP,
877+
keep_hashes);
878+
clear_incremental_midx_files_ext(r->objects->sources, MIDX_EXT_REV,
879+
keep_hashes);
880+
clear_incremental_midx_files_ext(r->objects->sources, MIDX_EXT_MIDX,
881+
keep_hashes);
882+
883+
strbuf_release(&chain);
884+
}
885+
855886
static int verify_midx_error;
856887

857888
__attribute__((format (printf, 1, 2)))

midx.h

Lines changed: 3 additions & 0 deletions
@@ -9,6 +9,7 @@ struct repository;
99
struct bitmapped_pack;
1010
struct git_hash_algo;
1111
struct odb_source;
12+
struct strvec;
1213

1314
#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
1415
#define MIDX_VERSION_V1 1
@@ -143,6 +144,8 @@ int write_midx_file_compact(struct odb_source *source,
143144
const char *incremental_base,
144145
unsigned flags);
145146
void clear_midx_file(struct repository *r);
147+
void clear_incremental_midx_files(struct repository *r,
148+
const struct strvec *keep_hashes);
146149
int verify_midx_file(struct odb_source *source, unsigned flags);
147150
int expire_midx_packs(struct odb_source *source, unsigned flags);
148151
int midx_repack(struct odb_source *source, size_t batch_size, unsigned flags);

repack-geometry.c

Lines changed: 8 additions & 5 deletions
@@ -251,7 +251,8 @@ static void remove_redundant_packs(struct packed_git **pack,
251251
uint32_t pack_nr,
252252
struct string_list *names,
253253
struct existing_packs *existing,
254-
const char *packdir)
254+
const char *packdir,
255+
bool wrote_incremental_midx)
255256
{
256257
const struct git_hash_algo *algop = existing->repo->hash_algo;
257258
struct strbuf buf = STRBUF_INIT;
@@ -271,7 +272,8 @@ static void remove_redundant_packs(struct packed_git **pack,
271272
(string_list_has_string(&existing->kept_packs, buf.buf)))
272273
continue;
273274

274-
repack_remove_redundant_pack(existing->repo, packdir, buf.buf);
275+
repack_remove_redundant_pack(existing->repo, packdir, buf.buf,
276+
wrote_incremental_midx);
275277
}
276278

277279
strbuf_release(&buf);
@@ -280,12 +282,13 @@ static void remove_redundant_packs(struct packed_git **pack,
280282
void pack_geometry_remove_redundant(struct pack_geometry *geometry,
281283
struct string_list *names,
282284
struct existing_packs *existing,
283-
const char *packdir)
285+
const char *packdir,
286+
bool wrote_incremental_midx)
284287
{
285288
remove_redundant_packs(geometry->pack, geometry->split,
286-
names, existing, packdir);
289+
names, existing, packdir, wrote_incremental_midx);
287290
remove_redundant_packs(geometry->promisor_pack, geometry->promisor_split,
288-
names, existing, packdir);
291+
names, existing, packdir, wrote_incremental_midx);
289292
}
290293

291294
void pack_geometry_release(struct pack_geometry *geometry)

repack-midx.c

Lines changed: 5 additions & 0 deletions
@@ -894,6 +894,7 @@ static int write_midx_incremental(struct repack_write_midx_opts *opts)
894894
struct midx_compaction_step *steps = NULL;
895895
struct strbuf lock_name = STRBUF_INIT;
896896
struct lock_file lf;
897+
struct strvec keep_hashes = STRVEC_INIT;
897898
size_t steps_nr = 0;
898899
size_t i;
899900
int ret = 0;
@@ -939,11 +940,15 @@ static int write_midx_incremental(struct repack_write_midx_opts *opts)
939940
BUG("missing result for compaction step %"PRIuMAX,
940941
(uintmax_t)i);
941942
fprintf(get_lock_file_fp(&lf), "%s\n", step->csum);
943+
strvec_push(&keep_hashes, step->csum);
942944
}
943945

944946
commit_lock_file(&lf);
945947

948+
clear_incremental_midx_files(opts->existing->repo, &keep_hashes);
949+
946950
done:
951+
strvec_clear(&keep_hashes);
947952
strbuf_release(&lock_name);
948953
for (i = 0; i < steps_nr; i++)
949954
midx_compaction_step_release(&steps[i]);

repack.c

Lines changed: 14 additions & 7 deletions
@@ -55,14 +55,18 @@ void pack_objects_args_release(struct pack_objects_args *args)
5555
}
5656

5757
void repack_remove_redundant_pack(struct repository *repo, const char *dir_name,
58-
const char *base_name)
58+
const char *base_name,
59+
bool wrote_incremental_midx)
5960
{
6061
struct strbuf buf = STRBUF_INIT;
6162
struct odb_source *source = repo->objects->sources;
6263
struct multi_pack_index *m = get_multi_pack_index(source);
6364
strbuf_addf(&buf, "%s.pack", base_name);
64-
if (m && source->local && midx_contains_pack(m, buf.buf))
65+
if (m && source->local && midx_contains_pack(m, buf.buf)) {
6566
clear_midx_file(repo);
67+
if (!wrote_incremental_midx)
68+
clear_incremental_midx_files(repo, NULL);
69+
}
6670
strbuf_insertf(&buf, 0, "%s/", dir_name);
6771
unlink_pack_path(buf.buf, 1);
6872
strbuf_release(&buf);
@@ -252,23 +256,26 @@ void existing_packs_mark_for_deletion(struct existing_packs *existing,
252256

253257
static void remove_redundant_packs_1(struct repository *repo,
254258
struct string_list *packs,
255-
const char *packdir)
259+
const char *packdir,
260+
bool wrote_incremental_midx)
256261
{
257262
struct string_list_item *item;
258263
for_each_string_list_item(item, packs) {
259264
if (!existing_pack_is_marked_for_deletion(item))
260265
continue;
261-
repack_remove_redundant_pack(repo, packdir, item->string);
266+
repack_remove_redundant_pack(repo, packdir, item->string,
267+
wrote_incremental_midx);
262268
}
263269
}
264270

265271
void existing_packs_remove_redundant(struct existing_packs *existing,
266-
const char *packdir)
272+
const char *packdir,
273+
bool wrote_incremental_midx)
267274
{
268275
remove_redundant_packs_1(existing->repo, &existing->non_kept_packs,
269-
packdir);
276+
packdir, wrote_incremental_midx);
270277
remove_redundant_packs_1(existing->repo, &existing->cruft_packs,
271-
packdir);
278+
packdir, wrote_incremental_midx);
272279
}
273280

274281
void existing_packs_release(struct existing_packs *existing)

repack.h

Lines changed: 6 additions & 3 deletions
@@ -34,7 +34,8 @@ void prepare_pack_objects(struct child_process *cmd,
3434
void pack_objects_args_release(struct pack_objects_args *args);
3535

3636
void repack_remove_redundant_pack(struct repository *repo, const char *dir_name,
37-
const char *base_name);
37+
const char *base_name,
38+
bool wrote_incremental_midx);
3839

3940
struct write_pack_opts {
4041
struct pack_objects_args *po_args;
@@ -84,7 +85,8 @@ void existing_packs_retain_cruft(struct existing_packs *existing,
8485
void existing_packs_mark_for_deletion(struct existing_packs *existing,
8586
struct string_list *names);
8687
void existing_packs_remove_redundant(struct existing_packs *existing,
87-
const char *packdir);
88+
const char *packdir,
89+
bool wrote_incremental_midx);
8890
void existing_packs_release(struct existing_packs *existing);
8991

9092
struct generated_pack;
@@ -129,7 +131,8 @@ struct packed_git *pack_geometry_preferred_pack(struct pack_geometry *geometry);
129131
void pack_geometry_remove_redundant(struct pack_geometry *geometry,
130132
struct string_list *names,
131133
struct existing_packs *existing,
132-
const char *packdir);
134+
const char *packdir,
135+
bool wrote_incremental_midx);
133136
void pack_geometry_release(struct pack_geometry *geometry);
134137

135138
struct tempfile;

t/meson.build

Lines changed: 1 addition & 0 deletions
@@ -950,6 +950,7 @@ integration_tests = [
950950
't7702-repack-cyclic-alternate.sh',
951951
't7703-repack-geometric.sh',
952952
't7704-repack-cruft.sh',
953+
't7705-repack-incremental-midx.sh',
953954
't7800-difftool.sh',
954955
't7810-grep.sh',
955956
't7811-grep-open.sh',
