Skip to content

Commit

Permalink
Bug 1186934 - update jemalloc to upstream HEAD; r=glandium
Browse files Browse the repository at this point in the history
  • Loading branch information
froydnj committed Dec 14, 2015
1 parent 056852f commit 1ef4f45
Show file tree
Hide file tree
Showing 45 changed files with 1,778 additions and 748 deletions.
95 changes: 86 additions & 9 deletions memory/jemalloc/src/ChangeLog
Original file line number Diff line number Diff line change
Expand Up @@ -4,24 +4,101 @@ brevity. Much more detail can be found in the git revision history:

https://github.com/jemalloc/jemalloc

* 4.0.1 (XXX)
* 4.0.4 (October 24, 2015)

This bugfix release fixes another xallocx() regression. No other regressions
have come to light in over a month, so this is likely a good starting point
for people who prefer to wait for "dot one" releases with all the major issues
shaken out.

Bug fixes:
- Fix xallocx(..., MALLOCX_ZERO to zero the last full trailing page of large
allocations that have been randomly assigned an offset of 0 when
--enable-cache-oblivious configure option is enabled.

* 4.0.3 (September 24, 2015)

This bugfix release continues the trend of xallocx() and heap profiling fixes.

Bug fixes:
- Fix xallocx(..., MALLOCX_ZERO) to zero all trailing bytes of large
allocations when --enable-cache-oblivious configure option is enabled.
- Fix xallocx(..., MALLOCX_ZERO) to zero trailing bytes of huge allocations
when resizing from/to a size class that is not a multiple of the chunk size.
- Fix prof_tctx_dump_iter() to filter out nodes that were created after heap
profile dumping started.
- Work around a potentially bad thread-specific data initialization
interaction with NPTL (glibc's pthreads implementation).

* 4.0.2 (September 21, 2015)

This bugfix release addresses a few bugs specific to heap profiling.

Bug fixes:
- Fix ixallocx_prof_sample() to never modify nor create sampled small
allocations. xallocx() is in general incapable of moving small allocations,
so this fix removes buggy code without loss of generality.
- Fix irallocx_prof_sample() to always allocate large regions, even when
alignment is non-zero.
- Fix prof_alloc_rollback() to read tdata from thread-specific data rather
than dereferencing a potentially invalid tctx.

* 4.0.1 (September 15, 2015)

This is a bugfix release that is somewhat high risk due to the amount of
refactoring required to address deep xallocx() problems. As a side effect of
these fixes, xallocx() now tries harder to partially fulfill requests for
optional extra space. Note that a couple of minor heap profiling
optimizations are included, but these are better thought of as performance
fixes that were integral to disovering most of the other bugs.

Optimizations:
- Avoid a chunk metadata read in arena_prof_tctx_set(), since it is in the
fast path when heap profiling is enabled. Additionally, split a special
case out into arena_prof_tctx_reset(), which also avoids chunk metadata
reads.
- Optimize irallocx_prof() to optimistically update the sampler state. The
prior implementation appears to have been a holdover from when
rallocx()/xallocx() functionality was combined as rallocm().

Bug fixes:
- Fix TLS configuration such that it is enabled by default for platforms on
which it works correctly.
- Fix arenas_cache_cleanup() and arena_get_hard() to handle
allocation/deallocation within the application's thread-specific data
cleanup functions even after arenas_cache is torn down.
- Don't bitshift by negative amounts when encoding/decoding run sizes in chunk
header maps. This affected systems with page sizes greater than 8 KiB.
- Rename index_t to szind_t to avoid an existing type on Solaris.
- Add JEMALLOC_CXX_THROW to the memalign() function prototype, in order to
match glibc and avoid compilation errors when including both
jemalloc/jemalloc.h and malloc.h in C++ code.
- Fix xallocx() bugs related to size+extra exceeding HUGE_MAXCLASS.
- Fix chunk purge hook calls for in-place huge shrinking reallocation to
specify the old chunk size rather than the new chunk size. This bug caused
no correctness issues for the default chunk purge function, but was
visible to custom functions set via the "arena.<i>.chunk_hooks" mallctl.
- Fix TLS configuration such that it is enabled by default for platforms on
which it works correctly.
- Fix heap profiling bugs:
+ Fix heap profiling to distinguish among otherwise identical sample sites
with interposed resets (triggered via the "prof.reset" mallctl). This bug
could cause data structure corruption that would most likely result in a
segfault.
+ Fix irealloc_prof() to prof_alloc_rollback() on OOM.
+ Make one call to prof_active_get_unlocked() per allocation event, and use
the result throughout the relevant functions that handle an allocation
event. Also add a missing check in prof_realloc(). These fixes protect
allocation events against concurrent prof_active changes.
+ Fix ixallocx_prof() to pass usize_max and zero to ixallocx_prof_sample()
in the correct order.
+ Fix prof_realloc() to call prof_free_sampled_object() after calling
prof_malloc_sample_object(). Prior to this fix, if tctx and old_tctx were
the same, the tctx could have been prematurely destroyed.
- Fix portability bugs:
+ Don't bitshift by negative amounts when encoding/decoding run sizes in
chunk header maps. This affected systems with page sizes greater than 8
KiB.
+ Rename index_t to szind_t to avoid an existing type on Solaris.
+ Add JEMALLOC_CXX_THROW to the memalign() function prototype, in order to
match glibc and avoid compilation errors when including both
jemalloc/jemalloc.h and malloc.h in C++ code.
+ Don't assume that /bin/sh is appropriate when running size_classes.sh
during configuration.
+ Consider __sparcv9 a synonym for __sparc64__ when defining LG_QUANTUM.
+ Link tests to librt if it contains clock_gettime(2).

* 4.0.0 (August 17, 2015)

Expand Down
18 changes: 10 additions & 8 deletions memory/jemalloc/src/Makefile.in
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ CFLAGS := @CFLAGS@
LDFLAGS := @LDFLAGS@
EXTRA_LDFLAGS := @EXTRA_LDFLAGS@
LIBS := @LIBS@
TESTLIBS := @TESTLIBS@
RPATH_EXTRA := @RPATH_EXTRA@
SO := @so@
IMPORTLIB := @importlib@
Expand Down Expand Up @@ -265,15 +266,15 @@ $(STATIC_LIBS):

$(objroot)test/unit/%$(EXE): $(objroot)test/unit/%.$(O) $(TESTS_UNIT_LINK_OBJS) $(C_JET_OBJS) $(C_TESTLIB_UNIT_OBJS)
@mkdir -p $(@D)
$(CC) $(LDTARGET) $(filter %.$(O),$^) $(call RPATH,$(objroot)lib) $(LDFLAGS) $(filter-out -lm,$(LIBS)) -lm $(EXTRA_LDFLAGS)
$(CC) $(LDTARGET) $(filter %.$(O),$^) $(call RPATH,$(objroot)lib) $(LDFLAGS) $(filter-out -lm,$(LIBS)) -lm $(TESTLIBS) $(EXTRA_LDFLAGS)

$(objroot)test/integration/%$(EXE): $(objroot)test/integration/%.$(O) $(C_TESTLIB_INTEGRATION_OBJS) $(C_UTIL_INTEGRATION_OBJS) $(objroot)lib/$(LIBJEMALLOC).$(IMPORTLIB)
@mkdir -p $(@D)
$(CC) $(LDTARGET) $(filter %.$(O),$^) $(call RPATH,$(objroot)lib) $(objroot)lib/$(LIBJEMALLOC).$(IMPORTLIB) $(LDFLAGS) $(filter-out -lm,$(filter -lpthread,$(LIBS))) -lm $(EXTRA_LDFLAGS)
$(CC) $(LDTARGET) $(filter %.$(O),$^) $(call RPATH,$(objroot)lib) $(objroot)lib/$(LIBJEMALLOC).$(IMPORTLIB) $(LDFLAGS) $(filter-out -lm,$(filter -lpthread,$(LIBS))) -lm $(TESTLIBS) $(EXTRA_LDFLAGS)

$(objroot)test/stress/%$(EXE): $(objroot)test/stress/%.$(O) $(C_JET_OBJS) $(C_TESTLIB_STRESS_OBJS) $(objroot)lib/$(LIBJEMALLOC).$(IMPORTLIB)
@mkdir -p $(@D)
$(CC) $(LDTARGET) $(filter %.$(O),$^) $(call RPATH,$(objroot)lib) $(objroot)lib/$(LIBJEMALLOC).$(IMPORTLIB) $(LDFLAGS) $(filter-out -lm,$(LIBS)) -lm $(EXTRA_LDFLAGS)
$(CC) $(LDTARGET) $(filter %.$(O),$^) $(call RPATH,$(objroot)lib) $(objroot)lib/$(LIBJEMALLOC).$(IMPORTLIB) $(LDFLAGS) $(filter-out -lm,$(LIBS)) -lm $(TESTLIBS) $(EXTRA_LDFLAGS)

build_lib_shared: $(DSOS)
build_lib_static: $(STATIC_LIBS)
Expand Down Expand Up @@ -343,22 +344,23 @@ check_unit_dir:
@mkdir -p $(objroot)test/unit
check_integration_dir:
@mkdir -p $(objroot)test/integration
check_stress_dir:
stress_dir:
@mkdir -p $(objroot)test/stress
check_dir: check_unit_dir check_integration_dir check_stress_dir
check_dir: check_unit_dir check_integration_dir

check_unit: tests_unit check_unit_dir
$(SHELL) $(objroot)test/test.sh $(TESTS_UNIT:$(srcroot)%.c=$(objroot)%)
check_integration_prof: tests_integration check_integration_dir
ifeq ($(enable_prof), 1)
$(MALLOC_CONF)="prof:true" $(SHELL) $(objroot)test/test.sh $(TESTS_INTEGRATION:$(srcroot)%.c=$(objroot)%)
$(MALLOC_CONF)="prof:true,prof_active:false" $(SHELL) $(objroot)test/test.sh $(TESTS_INTEGRATION:$(srcroot)%.c=$(objroot)%)
endif
check_integration: tests_integration check_integration_dir
$(SHELL) $(objroot)test/test.sh $(TESTS_INTEGRATION:$(srcroot)%.c=$(objroot)%)
check_stress: tests_stress check_stress_dir
stress: tests_stress stress_dir
$(SHELL) $(objroot)test/test.sh $(TESTS_STRESS:$(srcroot)%.c=$(objroot)%)
check: tests check_dir check_integration_prof
$(SHELL) $(objroot)test/test.sh $(TESTS:$(srcroot)%.c=$(objroot)%)
$(SHELL) $(objroot)test/test.sh $(TESTS_UNIT:$(srcroot)%.c=$(objroot)%) $(TESTS_INTEGRATION:$(srcroot)%.c=$(objroot)%)

ifeq ($(enable_code_coverage), 1)
coverage_unit: check_unit
Expand All @@ -372,7 +374,7 @@ coverage_integration: check_integration
$(SHELL) $(srcroot)coverage.sh $(srcroot)test/src integration $(C_TESTLIB_INTEGRATION_OBJS)
$(SHELL) $(srcroot)coverage.sh $(srcroot)test/integration integration $(TESTS_INTEGRATION_OBJS)

coverage_stress: check_stress
coverage_stress: stress
$(SHELL) $(srcroot)coverage.sh $(srcroot)src pic $(C_PIC_OBJS)
$(SHELL) $(srcroot)coverage.sh $(srcroot)src jet $(C_JET_OBJS)
$(SHELL) $(srcroot)coverage.sh $(srcroot)test/src stress $(C_TESTLIB_STRESS_OBJS)
Expand Down
2 changes: 1 addition & 1 deletion memory/jemalloc/src/VERSION
Original file line number Diff line number Diff line change
@@ -1 +1 @@
4.0.0-12-ged4883285e111b426e5769b24dad164ebacaa5b9
4.0.4-12-g3a92319ddc5610b755f755cbbbd12791ca9d0c3d
54 changes: 43 additions & 11 deletions memory/jemalloc/src/bin/jeprof.in
Original file line number Diff line number Diff line change
Expand Up @@ -1160,8 +1160,21 @@ sub PrintSymbolizedProfile {
}
print '---', "\n";

$PROFILE_PAGE =~ m,[^/]+$,; # matches everything after the last slash
my $profile_marker = $&;
my $profile_marker;
if ($main::profile_type eq 'heap') {
$HEAP_PAGE =~ m,[^/]+$,; # matches everything after the last slash
$profile_marker = $&;
} elsif ($main::profile_type eq 'growth') {
$GROWTH_PAGE =~ m,[^/]+$,; # matches everything after the last slash
$profile_marker = $&;
} elsif ($main::profile_type eq 'contention') {
$CONTENTION_PAGE =~ m,[^/]+$,; # matches everything after the last slash
$profile_marker = $&;
} else { # elsif ($main::profile_type eq 'cpu')
$PROFILE_PAGE =~ m,[^/]+$,; # matches everything after the last slash
$profile_marker = $&;
}

print '--- ', $profile_marker, "\n";
if (defined($main::collected_profile)) {
# if used with remote fetch, simply dump the collected profile to output.
Expand All @@ -1171,6 +1184,12 @@ sub PrintSymbolizedProfile {
}
close(SRC);
} else {
# --raw/http: For everything to work correctly for non-remote profiles, we
# would need to extend PrintProfileData() to handle all possible profile
# types, re-enable the code that is currently disabled in ReadCPUProfile()
# and FixCallerAddresses(), and remove the remote profile dumping code in
# the block above.
die "--raw/http: jeprof can only dump remote profiles for --raw\n";
# dump a cpu-format profile to standard out
PrintProfileData($profile);
}
Expand Down Expand Up @@ -3427,12 +3446,22 @@ sub FetchDynamicProfile {
}
$url .= sprintf("seconds=%d", $main::opt_seconds);
$fetch_timeout = $main::opt_seconds * 1.01 + 60;
# Set $profile_type for consumption by PrintSymbolizedProfile.
$main::profile_type = 'cpu';
} else {
# For non-CPU profiles, we add a type-extension to
# the target profile file name.
my $suffix = $path;
$suffix =~ s,/,.,g;
$profile_file .= $suffix;
# Set $profile_type for consumption by PrintSymbolizedProfile.
if ($path =~ m/$HEAP_PAGE/) {
$main::profile_type = 'heap';
} elsif ($path =~ m/$GROWTH_PAGE/) {
$main::profile_type = 'growth';
} elsif ($path =~ m/$CONTENTION_PAGE/) {
$main::profile_type = 'contention';
}
}

my $profile_dir = $ENV{"JEPROF_TMPDIR"} || ($ENV{HOME} . "/jeprof");
Expand Down Expand Up @@ -3730,6 +3759,8 @@ sub ReadProfile {
my $symbol_marker = $&;
$PROFILE_PAGE =~ m,[^/]+$,; # matches everything after the last slash
my $profile_marker = $&;
$HEAP_PAGE =~ m,[^/]+$,; # matches everything after the last slash
my $heap_marker = $&;

# Look at first line to see if it is a heap or a CPU profile.
# CPU profile may start with no header at all, and just binary data
Expand All @@ -3756,7 +3787,13 @@ sub ReadProfile {
$header = ReadProfileHeader(*PROFILE) || "";
}

if ($header =~ m/^--- *($heap_marker|$growth_marker)/o) {
# Skip "--- ..." line for profile types that have their own headers.
$header = ReadProfileHeader(*PROFILE) || "";
}

$main::profile_type = '';

if ($header =~ m/^heap profile:.*$growth_marker/o) {
$main::profile_type = 'growth';
$result = ReadHeapProfile($prog, *PROFILE, $header);
Expand Down Expand Up @@ -3808,9 +3845,9 @@ sub ReadProfile {
# independent implementation.
sub FixCallerAddresses {
my $stack = shift;
if ($main::use_symbolized_profile) {
return $stack;
} else {
# --raw/http: Always subtract one from pc's, because PrintSymbolizedProfile()
# dumps unadjusted profiles.
{
$stack =~ /(\s)/;
my $delimiter = $1;
my @addrs = split(' ', $stack);
Expand Down Expand Up @@ -3878,12 +3915,7 @@ sub ReadCPUProfile {
for (my $j = 0; $j < $d; $j++) {
my $pc = $slots->get($i+$j);
# Subtract one from caller pc so we map back to call instr.
# However, don't do this if we're reading a symbolized profile
# file, in which case the subtract-one was done when the file
# was written.
if ($j > 0 && !$main::use_symbolized_profile) {
$pc--;
}
$pc--;
$pc = sprintf("%0*x", $address_length, $pc);
$pcs->{$pc} = 1;
push @k, $pc;
Expand Down
File renamed without changes.
File renamed without changes.
File renamed without changes.
Loading

0 comments on commit 1ef4f45

Please sign in to comment.