Update jemalloc to 4.2.0.
jasone committed May 13, 2016
1 parent b1f46f7 commit 1f0a49e
Showing 53 changed files with 4,020 additions and 2,306 deletions.
48 changes: 46 additions & 2 deletions contrib/jemalloc/ChangeLog
@@ -4,6 +4,50 @@ brevity. Much more detail can be found in the git revision history:

https://github.com/jemalloc/jemalloc

* 4.2.0 (May 12, 2016)

New features:
- Add the arena.<i>.reset mallctl, which makes it possible to discard all of
an arena's allocations in a single operation. (@jasone)
- Add the stats.retained and stats.arenas.<i>.retained statistics. (@jasone)
- Add the --with-version configure option. (@jasone)
- Support --with-lg-page values larger than actual page size. (@jasone)

Optimizations:
- Use pairing heaps rather than red-black trees for various hot data
structures. (@djwatson, @jasone)
- Streamline fast paths of rtree operations. (@jasone)
- Optimize the fast paths of calloc() and [m,d,sd]allocx(). (@jasone)
- Decommit unused virtual memory if the OS does not overcommit. (@jasone)
- Specify MAP_NORESERVE on Linux if [heuristic] overcommit is active, in order
to avoid unfortunate interactions during fork(2). (@jasone)

Bug fixes:
- Fix chunk accounting related to triggering gdump profiles. (@jasone)
- Link against librt for clock_gettime(2) if glibc < 2.17. (@jasone)
- Scale leak report summary according to sampling probability. (@jasone)
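
For readers unfamiliar with the pairing heaps mentioned under Optimizations above, here is a minimal sketch of the data structure in C. It is not jemalloc's implementation: the names `ph_node`, `ph_meld`, `ph_insert`, `ph_merge_pairs`, and `ph_remove_min` are hypothetical, and the integer key is an assumption standing in for whatever ordering the allocator actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative min-heap node (hypothetical layout, not jemalloc's). */
typedef struct ph_node {
    int key;
    struct ph_node *child;   /* leftmost child */
    struct ph_node *sibling; /* next sibling to the right */
} ph_node;

/* Meld two heaps: the root with the larger key becomes the
 * leftmost child of the root with the smaller key. */
ph_node *ph_meld(ph_node *a, ph_node *b) {
    if (a == NULL) return b;
    if (b == NULL) return a;
    if (b->key < a->key) {
        ph_node *tmp = a;
        a = b;
        b = tmp;
    }
    b->sibling = a->child;
    a->child = b;
    return a;
}

/* Insertion is a meld with a single-node heap. */
ph_node *ph_insert(ph_node *root, ph_node *n, int key) {
    n->key = key;
    n->child = NULL;
    n->sibling = NULL;
    return ph_meld(root, n);
}

/* Two-pass merge of a sibling list: meld adjacent pairs left to right,
 * then fold the results together right to left (via recursion). */
ph_node *ph_merge_pairs(ph_node *first) {
    if (first == NULL || first->sibling == NULL) return first;
    ph_node *second = first->sibling;
    ph_node *rest = second->sibling;
    first->sibling = NULL;
    second->sibling = NULL;
    return ph_meld(ph_meld(first, second), ph_merge_pairs(rest));
}

/* Detach the minimum node into *min and return the new root. */
ph_node *ph_remove_min(ph_node *root, ph_node **min) {
    *min = root;
    if (root == NULL) return NULL;
    ph_node *new_root = ph_merge_pairs(root->child);
    root->child = NULL;
    return new_root;
}
```

In this sketch insertion costs a single comparison and two pointer writes, and all restructuring is deferred to `ph_remove_min`. The recursive merge is kept for brevity; a production version would likely iterate to bound stack depth.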

* 4.1.1 (May 3, 2016)

This bugfix release resolves a variety of mostly minor issues, though the
bitmap fix is critical for 64-bit Windows.

Bug fixes:
- Fix the linear scan version of bitmap_sfu() to shift by the proper amount
even when sizeof(long) is not the same as sizeof(void *), as on 64-bit
Windows. (@jasone)
- Fix hashing functions to avoid unaligned memory accesses (and resulting
crashes). This is relevant at least to some ARM-based platforms.
(@rkmisra)
- Fix fork()-related lock rank ordering reversals. These reversals were
unlikely to cause deadlocks in practice except when heap profiling was
enabled and active. (@jasone)
- Fix various chunk leaks in OOM code paths. (@jasone)
- Fix malloc_stats_print() to print opt.narenas correctly. (@jasone)
- Fix MSVC-specific build/test issues. (@rustyx, @yuslepukhin)
- Fix a variety of test failures that were due to test fragility rather than
core bugs. (@jasone)

* 4.1.0 (February 28, 2016)

This release is primarily about optimizations, but it also incorporates a lot
@@ -59,14 +103,14 @@ brevity. Much more detail can be found in the git revision history:
Bug fixes:
- Fix stats.cactive accounting regression. (@rustyx, @jasone)
- Handle unaligned keys in hash(). This caused problems for some ARM systems.
(@jasone, Christopher Ferris)
(@jasone, @cferris1000)
- Refactor arenas array. In addition to fixing a fork-related deadlock, this
makes arena lookups faster and simpler. (@jasone)
- Move retained memory allocation out of the default chunk allocation
function, to a location that gets executed even if the application installs
a custom chunk allocation function. This resolves a virtual memory leak.
(@buchgr)
- Fix a potential tsd cleanup leak. (Christopher Ferris, @jasone)
- Fix a potential tsd cleanup leak. (@cferris1000, @jasone)
- Fix run quantization. In practice this bug had no impact unless
applications requested memory with alignment exceeding one page.
(@jasone, @djwatson)
57 changes: 30 additions & 27 deletions contrib/jemalloc/FREEBSD-diffs
@@ -1,5 +1,5 @@
diff --git a/doc/jemalloc.xml.in b/doc/jemalloc.xml.in
index bc5dbd1..ba182da 100644
index c4a44e3..4626e9b 100644
--- a/doc/jemalloc.xml.in
+++ b/doc/jemalloc.xml.in
@@ -53,11 +53,23 @@
@@ -27,7 +27,7 @@ index bc5dbd1..ba182da 100644
<refsect2>
<title>Standard API</title>
<funcprototype>
@@ -2905,4 +2917,18 @@ malloc_conf = "lg_chunk:24";]]></programlisting></para>
@@ -2961,4 +2973,18 @@ malloc_conf = "lg_chunk:24";]]></programlisting></para>
<para>The <function>posix_memalign<parameter/></function> function conforms
to IEEE Std 1003.1-2001 (&ldquo;POSIX.1&rdquo;).</para>
</refsect1>
@@ -47,7 +47,7 @@ index bc5dbd1..ba182da 100644
+ </refsect1>
</refentry>
diff --git a/include/jemalloc/internal/jemalloc_internal.h.in b/include/jemalloc/internal/jemalloc_internal.h.in
index 3f54391..d240256 100644
index 51bf897..7de22ea 100644
--- a/include/jemalloc/internal/jemalloc_internal.h.in
+++ b/include/jemalloc/internal/jemalloc_internal.h.in
@@ -8,6 +8,9 @@
@@ -90,10 +90,10 @@ index 2b8ca5d..42d97f2 100644
#ifdef _WIN32
# include <windows.h>
diff --git a/include/jemalloc/internal/mutex.h b/include/jemalloc/internal/mutex.h
index f051f29..561378f 100644
index 5221799..60ab041 100644
--- a/include/jemalloc/internal/mutex.h
+++ b/include/jemalloc/internal/mutex.h
@@ -47,15 +47,13 @@ struct malloc_mutex_s {
@@ -52,9 +52,6 @@ struct malloc_mutex_s {

#ifdef JEMALLOC_LAZY_LOCK
extern bool isthreaded;
@@ -102,19 +102,20 @@ index f051f29..561378f 100644
-# define isthreaded true
#endif

bool malloc_mutex_init(malloc_mutex_t *mutex);
void malloc_mutex_prefork(malloc_mutex_t *mutex);
void malloc_mutex_postfork_parent(malloc_mutex_t *mutex);
void malloc_mutex_postfork_child(malloc_mutex_t *mutex);
bool malloc_mutex_init(malloc_mutex_t *mutex, const char *name,
@@ -62,6 +59,7 @@ bool malloc_mutex_init(malloc_mutex_t *mutex, const char *name,
void malloc_mutex_prefork(tsdn_t *tsdn, malloc_mutex_t *mutex);
void malloc_mutex_postfork_parent(tsdn_t *tsdn, malloc_mutex_t *mutex);
void malloc_mutex_postfork_child(tsdn_t *tsdn, malloc_mutex_t *mutex);
+bool malloc_mutex_first_thread(void);
bool mutex_boot(void);
bool malloc_mutex_boot(void);

#endif /* JEMALLOC_H_EXTERNS */
diff --git a/include/jemalloc/internal/private_symbols.txt b/include/jemalloc/internal/private_symbols.txt
index 5880996..6e94e03 100644
index f2b6a55..69369c9 100644
--- a/include/jemalloc/internal/private_symbols.txt
+++ b/include/jemalloc/internal/private_symbols.txt
@@ -296,7 +296,6 @@ iralloct_realign
@@ -311,7 +311,6 @@ iralloct_realign
isalloc
isdalloct
isqalloc
@@ -124,10 +125,10 @@ index 5880996..6e94e03 100644
jemalloc_postfork_child
diff --git a/include/jemalloc/jemalloc_FreeBSD.h b/include/jemalloc/jemalloc_FreeBSD.h
new file mode 100644
index 0000000..433dab5
index 0000000..c58a8f3
--- /dev/null
+++ b/include/jemalloc/jemalloc_FreeBSD.h
@@ -0,0 +1,160 @@
@@ -0,0 +1,162 @@
+/*
+ * Override settings that were generated in jemalloc_defs.h as necessary.
+ */
@@ -138,6 +139,8 @@ index 0000000..433dab5
+#define JEMALLOC_DEBUG
+#endif
+
+#undef JEMALLOC_DSS
+
+/*
+ * The following are architecture-dependent, so conditionally define them for
+ * each supported architecture.
@@ -300,7 +303,7 @@ index f943891..47d032c 100755
+#include "jemalloc_FreeBSD.h"
EOF
diff --git a/src/jemalloc.c b/src/jemalloc.c
index 0735376..a34b85c 100644
index 40eb2ea..666c49d 100644
--- a/src/jemalloc.c
+++ b/src/jemalloc.c
@@ -4,6 +4,10 @@
@@ -314,7 +317,7 @@ index 0735376..a34b85c 100644
/* Runtime configuration options. */
const char *je_malloc_conf JEMALLOC_ATTR(weak);
bool opt_abort =
@@ -2611,6 +2615,107 @@ je_malloc_usable_size(JEMALLOC_USABLE_SIZE_CONST void *ptr)
@@ -2673,6 +2677,107 @@ je_malloc_usable_size(JEMALLOC_USABLE_SIZE_CONST void *ptr)
*/
/******************************************************************************/
/*
@@ -341,7 +344,7 @@ index 0735376..a34b85c 100644
+ if (p == NULL)
+ return (ALLOCM_ERR_OOM);
+ if (rsize != NULL)
+ *rsize = isalloc(p, config_prof);
+ *rsize = isalloc(tsdn_fetch(), p, config_prof);
+ *ptr = p;
+ return (ALLOCM_SUCCESS);
+}
@@ -370,7 +373,7 @@ index 0735376..a34b85c 100644
+ } else
+ ret = ALLOCM_ERR_OOM;
+ if (rsize != NULL)
+ *rsize = isalloc(*ptr, config_prof);
+ *rsize = isalloc(tsdn_fetch(), *ptr, config_prof);
+ }
+ return (ret);
+}
@@ -422,8 +425,8 @@ index 0735376..a34b85c 100644
* The following functions are used by threading libraries for protection of
* malloc during fork().
*/
@@ -2717,4 +2822,11 @@ jemalloc_postfork_child(void)
ctl_postfork_child();
@@ -2814,4 +2919,11 @@ jemalloc_postfork_child(void)
ctl_postfork_child(tsd_tsdn(tsd));
}

+void
@@ -435,7 +438,7 @@ index 0735376..a34b85c 100644
+
/******************************************************************************/
diff --git a/src/mutex.c b/src/mutex.c
index 2d47af9..934d5aa 100644
index a1fac34..a24e420 100644
--- a/src/mutex.c
+++ b/src/mutex.c
@@ -66,6 +66,17 @@ pthread_create(pthread_t *__restrict thread,
@@ -456,22 +459,22 @@ index 2d47af9..934d5aa 100644
#endif

bool
@@ -137,7 +148,7 @@ malloc_mutex_postfork_child(malloc_mutex_t *mutex)
@@ -140,7 +151,7 @@ malloc_mutex_postfork_child(tsdn_t *tsdn, malloc_mutex_t *mutex)
}

bool
-mutex_boot(void)
-malloc_mutex_boot(void)
+malloc_mutex_first_thread(void)
{

#ifdef JEMALLOC_MUTEX_INIT_CB
@@ -151,3 +162,14 @@ mutex_boot(void)
@@ -154,3 +165,14 @@ malloc_mutex_boot(void)
#endif
return (false);
}
+
+bool
+mutex_boot(void)
+malloc_mutex_boot(void)
+{
+
+#ifndef JEMALLOC_MUTEX_INIT_CB
@@ -481,10 +484,10 @@ index 2d47af9..934d5aa 100644
+#endif
+}
diff --git a/src/util.c b/src/util.c
index 02673c7..116e981 100644
index a1c4a2a..04f9153 100644
--- a/src/util.c
+++ b/src/util.c
@@ -66,6 +66,22 @@ wrtmessage(void *cbopaque, const char *s)
@@ -67,6 +67,22 @@ wrtmessage(void *cbopaque, const char *s)

JEMALLOC_EXPORT void (*je_malloc_message)(void *, const char *s);

2 changes: 1 addition & 1 deletion contrib/jemalloc/VERSION
@@ -1 +1 @@
4.1.0-1-g994da4232621dd1210fcf39bdf0d6454cefda473
4.2.0-1-gdc7ff6306d7a15b53479e2fb8e5546404b82e6fc
53 changes: 45 additions & 8 deletions contrib/jemalloc/doc/jemalloc.3
@@ -2,12 +2,12 @@
.\" Title: JEMALLOC
.\" Author: Jason Evans
.\" Generator: DocBook XSL Stylesheets v1.76.1 <http://docbook.sf.net/>
.\" Date: 02/28/2016
.\" Date: 05/12/2016
.\" Manual: User Manual
.\" Source: jemalloc 4.1.0-1-g994da4232621dd1210fcf39bdf0d6454cefda473
.\" Source: jemalloc 4.2.0-1-gdc7ff6306d7a15b53479e2fb8e5546404b82e6fc
.\" Language: English
.\"
.TH "JEMALLOC" "3" "02/28/2016" "jemalloc 4.1.0-1-g994da4232621" "User Manual"
.TH "JEMALLOC" "3" "05/12/2016" "jemalloc 4.2.0-1-gdc7ff6306d7a" "User Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
@@ -31,7 +31,7 @@
jemalloc \- general purpose memory allocation functions
.SH "LIBRARY"
.PP
This manual describes jemalloc 4\&.1\&.0\-1\-g994da4232621dd1210fcf39bdf0d6454cefda473\&. More information can be found at the
This manual describes jemalloc 4\&.2\&.0\-1\-gdc7ff6306d7a15b53479e2fb8e5546404b82e6fc\&. More information can be found at the
\m[blue]\fBjemalloc website\fR\m[]\&\s-2\u[1]\d\s+2\&.
.PP
The following configuration options are enabled in libc\*(Aqs built\-in jemalloc:
@@ -461,7 +461,8 @@ Memory is conceptually broken into equal\-sized chunks, where the chunk size is
Small objects are managed in groups by page runs\&. Each run maintains a bitmap to track which regions are in use\&. Allocation requests that are no more than half the quantum (8 or 16, depending on architecture) are rounded up to the nearest power of two that is at least
sizeof(\fBdouble\fR)\&. All other object size classes are multiples of the quantum, spaced such that there are four size classes for each doubling in size, which limits internal fragmentation to approximately 20% for all but the smallest size classes\&. Small size classes are smaller than four times the page size, large size classes are smaller than the chunk size (see the
"opt\&.lg_chunk"
option), and huge size classes extend from the chunk size up to one size class less than the full address space size\&.
option), and huge size classes extend from the chunk size up to the largest size class that does not exceed
\fBPTRDIFF_MAX\fR\&.
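
The spacing rule in the paragraph above (four size classes per doubling) can be sketched as a rounding function in C. This is a simplified model built from the manual text, not code taken from jemalloc: it assumes a 16-byte quantum and an 8-byte `sizeof(double)`, ignores the upper bound near `PTRDIFF_MAX`, and the names `floor_log2` and `size_class_round_up` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LG_QUANTUM  4 /* assumption: 16-byte quantum */
#define LG_TINY_MIN 3 /* assumption: sizeof(double) == 8 */
#define LG_GROUP    2 /* four size classes per doubling */

/* floor(log2(x)) for x > 0. */
unsigned floor_log2(size_t x) {
    unsigned lg = 0;
    while (x >>= 1)
        lg++;
    return lg;
}

/* Round a request up to the size class that would serve it in this model.
 * Not safe for sizes near SIZE_MAX, where (size << 1) would overflow. */
size_t size_class_round_up(size_t size) {
    /* Requests of at most half the quantum share the smallest class. */
    if (size <= ((size_t)1 << LG_TINY_MIN))
        return (size_t)1 << LG_TINY_MIN;

    /* ceil(log2(size)): the doubling interval the request falls into. */
    unsigned lg_ceil = floor_log2((size << 1) - 1);

    /* Spacing is one quantum for small requests, otherwise a quarter of
     * the lower bound of the doubling interval. */
    unsigned lg_delta = (lg_ceil < LG_GROUP + LG_QUANTUM + 1)
        ? LG_QUANTUM
        : lg_ceil - LG_GROUP - 1;

    size_t mask = ((size_t)1 << lg_delta) - 1;
    return (size + mask) & ~mask;
}
```

Under these assumptions a 129-byte request lands in the 160-byte class, wasting 31 of 160 bytes (about 19%), which is in line with the "approximately 20%" internal fragmentation bound quoted in the text.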
.PP
Allocations are packed tightly together, which can be an issue for multi\-threaded applications\&. If you need to assure that allocations do not suffer from cacheline sharing, round your allocation requests up to the nearest multiple of the cacheline size, or specify cacheline alignment when allocating\&.
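
The advice above about avoiding cacheline sharing can be sketched as follows. The 64-byte cacheline size is an assumption (it varies by CPU), and the helper names `cacheline_ceiling` and `cacheline_alloc` are hypothetical; only `posix_memalign` is a standard interface.

```c
#define _POSIX_C_SOURCE 200112L /* for posix_memalign() */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHELINE 64 /* assumption: 64-byte cachelines */

/* Round size up to the nearest multiple of the cacheline size. */
size_t cacheline_ceiling(size_t size) {
    return (size + CACHELINE - 1) & ~((size_t)CACHELINE - 1);
}

/* Allocate a block that both starts on a cacheline boundary and occupies
 * whole cachelines, so no other allocation shares a line with it.
 * Returns NULL on failure; release the block with free(). */
void *cacheline_alloc(size_t size) {
    void *p = NULL;
    if (posix_memalign(&p, CACHELINE, cacheline_ceiling(size)) != 0)
        return NULL;
    return p;
}
```

Rounding the size and aligning the start address are combined here so that the block neither begins nor ends partway through a line; the manual text presents them as alternatives, and which is sufficient depends on how neighbouring allocations are used.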
.PP
@@ -518,6 +519,8 @@ l r l
^ r l
^ r l
^ r l
^ r l
^ r l
^ r l.
T{
Small
@@ -645,6 +648,16 @@ T}
T}:T{
\&.\&.\&.
T}
:T{
512 PiB
T}:T{
[2560 PiB, 3 EiB, 3584 PiB, 4 EiB]
T}
:T{
1 EiB
T}:T{
[5 EiB, 6 EiB, 7 EiB]
T}
.TE
.sp 1
.SH "MALLCTL NAMESPACE"
@@ -841,7 +854,7 @@ function\&. If
is specified during configuration, this has the potential to cause deadlock for a multi\-threaded process that exits while one or more threads are executing in the memory allocation functions\&. Furthermore,
\fBatexit\fR\fB\fR
may allocate memory during application initialization and then deadlock internally when jemalloc in turn calls
\fBatexit\fR\fB\fR, so this option is not univerally usable (though the application can register its own
\fBatexit\fR\fB\fR, so this option is not universally usable (though the application can register its own
\fBatexit\fR\fB\fR
function with equivalent functionality)\&. Therefore, this option should only be used with care; it is primarily intended as a performance tuning aid during application development\&. This option is disabled by default\&.
.RE
@@ -1007,7 +1020,7 @@ is controlled by the
option\&. Note that
\fBatexit\fR\fB\fR
may allocate memory during application initialization and then deadlock internally when jemalloc in turn calls
\fBatexit\fR\fB\fR, so this option is not univerally usable (though the application can register its own
\fBatexit\fR\fB\fR, so this option is not universally usable (though the application can register its own
\fBatexit\fR\fB\fR
function with equivalent functionality)\&. This option is disabled by default\&.
.RE
@@ -1113,6 +1126,14 @@ Trigger decay\-based purging of unused dirty pages for arena <i>, or for all are
for details\&.
.RE
.PP
"arena\&.<i>\&.reset" (\fBvoid\fR) \-\-
.RS 4
Discard all of the arena\*(Aqs extant allocations\&. This interface can only be used with arenas created via
"arenas\&.extend"\&. None of the arena\*(Aqs discarded/cached allocations may accessed afterward\&. As part of this requirement, all thread caches which were used to allocate/deallocate in conjunction with the arena must be flushed beforehand\&. This interface cannot be used if running inside Valgrind, nor if the
quarantine
size is non\-zero\&.
.RE
.PP
"arena\&.<i>\&.dss" (\fBconst char *\fR) rw
.RS 4
Set the precedence of dss allocation as related to mmap allocation for arena <i>, or for all arenas if <i> equals
@@ -1503,7 +1524,7 @@ Get the current sample rate (see
.PP
"prof\&.interval" (\fBuint64_t\fR) r\- [\fB\-\-enable\-prof\fR]
.RS 4
Average number of bytes allocated between inverval\-based profile dumps\&. See the
Average number of bytes allocated between interval\-based profile dumps\&. See the
"opt\&.lg_prof_interval"
option for additional information\&.
.RE
@@ -1547,6 +1568,15 @@ Total number of bytes in active chunks mapped by the allocator\&. This is a mult
"stats\&.resident"\&.
.RE
.PP
"stats\&.retained" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR]
.RS 4
Total number of bytes in virtual memory mappings that were retained rather than being returned to the operating system via e\&.g\&.
\fBmunmap\fR(2)\&. Retained virtual memory is typically untouched, decommitted, or purged, so it has no strongly associated physical memory (see
chunk hooks
for details)\&. Retained memory is excluded from mapped memory statistics, e\&.g\&.
"stats\&.mapped"\&.
.RE
.PP
"stats\&.arenas\&.<i>\&.dss" (\fBconst char *\fR) r\-
.RS 4
dss (\fBsbrk\fR(2)) allocation precedence as related to
@@ -1592,6 +1622,13 @@ or similar has not been called\&.
Number of mapped bytes\&.
.RE
.PP
"stats\&.arenas\&.<i>\&.retained" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR]
.RS 4
Number of retained bytes\&. See
"stats\&.retained"
for details\&.
.RE
.PP
"stats\&.arenas\&.<i>\&.metadata\&.mapped" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR]
.RS 4
Number of mapped bytes in arena chunk headers, which track the states of the non\-metadata pages\&.