Skip to content

Commit 753f670

Browse files
peffgitster
authored andcommitted
commit: avoid leaking already-saved buffer
When we parse a commit via repo_parse_commit_internal(), if save_commit_buffer is set we'll stuff the buffer of the object contents into a cache, overwriting any previous value. This can result in a leak of that previously cached value, though it's rare in practice. If we have a value in the cache it would have come from a previous parse, and during that parse we'd set the object.parsed flag, causing any subsequent parse attempts to exit without doing any work. But it's possible to "unparse" a commit, which we do when registering a commit graft. And since shallow fetches are implemented using grafts, the leak is triggered in practice by t5539. There are a number of possible ways to address this: 1. the unparsing function could clear the cached commit buffer, too. I think this would work for the case I found, but I'm not sure if there are other ways to end up in the same state (an unparsed commit with an entry in the commit buffer cache). 2. when we parse, we could check the buffer cache and prefer it to reading the contents from the object database. In theory the contents of a particular sha1 are immutable, but the code in question is violating the immutability with grafts. So this approach makes me a bit nervous, although I think it would work in practice (the grafts are applied to what we parse, but we still retain the original contents). 3. We could realize the cache is already populated and discard its contents before overwriting. It's possible some other code could be holding on to a pointer to the old cache entry (and we'd introduce a use-after-free), but I think the risk of that is relatively low. 4. The reverse of (3): when the cache is populated, don't bother saving our new copy. This is perhaps a little weird, since we'll have just populated the commit struct based on a different buffer. But the two buffers should be the same, even in the presence of grafts (as in (2) above). I went with option 4. It addresses the leak directly and doesn't carry any risk of breaking other assumptions. And it's the same technique used by parse_object_buffer() for this situation, though I'm not sure when it would even come up there. The extra safety has been there since bd1e17e (Make "parse_object()" also fill in commit message buffer data., 2005-05-25). This lets us mark t5539 as leak-free. Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent c800963 commit 753f670

File tree

2 files changed

+3
-1
lines changed

2 files changed

+3
-1
lines changed

commit.c

+2-1
Original file line numberDiff line numberDiff line change
@@ -595,7 +595,8 @@ int repo_parse_commit_internal(struct repository *r,
595595
}
596596

597597
ret = parse_commit_buffer(r, item, buffer, size, 0);
598-
if (save_commit_buffer && !ret) {
598+
if (save_commit_buffer && !ret &&
599+
!get_cached_commit_buffer(r, item, NULL)) {
599600
set_commit_buffer(r, item, buffer, size);
600601
return 0;
601602
}

t/t5539-fetch-http-shallow.sh

+1
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@ test_description='fetch/clone from a shallow clone over http'
55
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
66
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
77

8+
TEST_PASSES_SANITIZE_LEAK=true
89
. ./test-lib.sh
910
. "$TEST_DIRECTORY"/lib-httpd.sh
1011
start_httpd

0 commit comments

Comments
 (0)