documentation update for new features/stats.
dormando committed Jan 9, 2015
1 parent dbb62be commit 09e15d5
Showing 4 changed files with 101 additions and 15 deletions.
39 changes: 39 additions & 0 deletions doc/new_lru.txt
@@ -0,0 +1,39 @@
In versions new enough to have the `-o lru_maintainer` option, a new LRU
mechanic is available.

Previously: Each slab class has an independent doubly-linked list comprising
its LRU. Items are pulled from the bottom and either reclaimed or evicted as
needed.

Now, enabling `-o lru_maintainer` changes all of the behavior below:

* LRU's are now split between HOT, WARM, and COLD LRU's. New items enter the
HOT LRU.
* LRU updates only happen as items reach the bottom of an LRU. An item active
in HOT stays in HOT; an item active in WARM stays in WARM; an item active in
COLD moves to WARM.
* HOT and WARM are each capped at 32% of the memory available for that slab
class. COLD is uncapped (by default, as of this writing).
* Items flow from HOT/WARM into COLD.
* A background thread exists which shuffles items between/within the LRU's as
limits are reached.
* The background thread can also control the lru_crawler, if enabled.

The primary goal is to better protect active items from "scanning". Items
which are never hit again will flow from HOT, through COLD, and out the
bottom. Items occasionally active (reaching COLD, but being hit before
eviction), move to WARM. There they can stay relatively protected.
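
The flow described above can be sketched as a small decision function. This is
an illustration only, not memcached's code: the enum, `tail_destination`, and
`seg_cap_bytes` are hypothetical names, and the sketch assumes a tail item is
examined only when its LRU is over its cap (HOT/WARM) or memory is needed
(COLD).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; these are not memcached identifiers. */
enum lru_seg { SEG_HOT, SEG_WARM, SEG_COLD, SEG_OUT };

/* Byte cap for a HOT or WARM LRU: `pct` percent of the slab class's memory. */
uint64_t seg_cap_bytes(uint64_t class_bytes, unsigned pct)
{
    assert(pct <= 100);
    return (class_bytes * pct) / 100;
}

/*
 * Where does the item at the bottom of `cur` go next?
 * `active` means the item was hit since it was last examined.
 * Assumption: this is only called when the item has to be dealt with,
 * i.e. HOT/WARM is over its cap, or COLD must give up memory.
 */
enum lru_seg tail_destination(enum lru_seg cur, bool active)
{
    switch (cur) {
    case SEG_HOT:
    case SEG_WARM:
        /* Active items are bumped within their LRU; the rest flow to COLD. */
        return active ? cur : SEG_COLD;
    case SEG_COLD:
        /* Active items are rescued into WARM; the rest leave the cache. */
        return active ? SEG_WARM : SEG_OUT;
    default:
        return SEG_OUT;
    }
}
```

For example, `tail_destination(SEG_COLD, true)` yields `SEG_WARM`, which is the
"occasionally active" case, and `seg_cap_bytes(1000, 32)` yields 320.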

A secondary goal is to improve latency. The LRU locks are no longer used on
item reads, only during sets and from the background thread. Also the
background thread is likely to find expired items and release them back to the
slab class asynchronously, which speeds up new allocations.

It is recommended to use this feature with the LRU crawler as well:
`memcached -o lru_maintainer,lru_crawler`. Slab classes are then automatically
scanned for items with expired TTLs. If your items are always set to never
expire, you can safely omit the lru_crawler option.

An extra option: `-o expirezero_does_not_evict` (when used with
lru_maintainer) makes items with an expiration time of 0 unevictable. Use it
with caution, as unevictable items crowd out memory available for other items.
30 changes: 28 additions & 2 deletions doc/protocol.txt
@@ -583,7 +583,14 @@ integers separated by a colon (treat this as a floating point number).
| slabs_moved | 64u | Total slab pages moved |
| crawler_reclaimed | 64u | Total items freed by LRU Crawler |
| lrutail_reflocked | 64u | Times LRU tail was found with active ref. |
| | | Items moved to head to avoid OOM errors. |
| | | Items can be evicted to avoid OOM errors. |
| moves_to_cold | 64u | Items moved from HOT/WARM to COLD LRU's |
| moves_to_warm | 64u | Items moved from COLD to WARM LRU |
| moves_within_lru | 64u | Items reshuffled within HOT or WARM LRU's |
| direct_reclaims | 64u | Times worker threads had to directly |
| | | reclaim or evict items. |
| lru_maintainer_juggles |
| | 64u | Number of times the LRU bg thread woke up |
|-----------------------+---------+-------------------------------------------|

Settings statistics
@@ -629,7 +636,14 @@ other stats command.
| hash_algorithm | char | Hash table algorithm in use |
| lru_crawler | bool | Whether the LRU crawler is enabled |
| lru_crawler_sleep | 32 | Microseconds to sleep between LRU crawls |
| lru_crawler_tocrawl| 32u | Max items to crawl per slab per run |
| lru_crawler_tocrawl |
| | 32u | Max items to crawl per slab per run |
| lru_maintainer_thread |
| | bool | Split LRU mode and background threads |
| hot_lru_pct | 32 | Pct of slab memory reserved for HOT LRU |
| warm_lru_pct | 32 | Pct of slab memory reserved for WARM LRU |
| expirezero_does_not_evict |
| | bool | If yes, items with 0 exptime cannot be evicted |
|-------------------+----------+----------------------------------------------|


@@ -657,6 +671,10 @@ Name Meaning
------------------------------
number Number of items presently stored in this class. Expired
items are not automatically excluded.
number_hot Number of items presently stored in the HOT LRU.
number_warm Number of items presently stored in the WARM LRU.
number_cold Number of items presently stored in the COLD LRU.
number_noexp Number of items presently stored in the NOEXP class.
age Age of the oldest item in the LRU.
evicted Number of times an item had to be evicted from the LRU
before it expired.
@@ -679,6 +697,14 @@ expired_unfetched Number of expired items reclaimed from the LRU which
evicted_unfetched Number of valid items evicted from the LRU which were
never touched after being set.
crawler_reclaimed Number of items freed by the LRU Crawler.
lrutail_reflocked Number of items found to be refcount locked in the
LRU tail.
moves_to_cold Number of items moved from HOT or WARM into COLD.
moves_to_warm Number of items moved from COLD to WARM.
moves_within_lru Number of times active items were bumped within
HOT or WARM.
direct_reclaims Number of times worker threads had to directly pull LRU
tails to find memory for a new item.

Note this will only display information about slabs which exist, so an empty
cache will return an empty set.
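
A client that wants the per-class counters above has to split each reply line
of the form "STAT items:<slabclass>:<stat> <value>\r\n". A minimal sketch of
that parsing follows; `struct item_stat`, `parse_item_stat`, and `item_stat_is`
are hypothetical helpers, not part of memcached or any client library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed "stats items" line. Hypothetical type for illustration. */
struct item_stat {
    unsigned int slab_class;
    char name[64];
    unsigned long long value;
};

/*
 * Parse one reply line such as "STAT items:5:number_hot 12\r\n".
 * Returns 1 on success and 0 for anything else (including "END").
 * On failure the contents of `out` are unspecified.
 */
int parse_item_stat(const char *line, struct item_stat *out)
{
    assert(line != NULL && out != NULL);
    /* %63[^ ] reads the stat name up to the space before the value. */
    int n = sscanf(line, "STAT items:%u:%63[^ ] %llu",
                   &out->slab_class, out->name, &out->value);
    return n == 3;
}

/* True if `s` carries the stat called `name`, e.g. "moves_to_cold". */
int item_stat_is(const struct item_stat *s, const char *name)
{
    return strcmp(s->name, name) == 0;
}
```

A caller would feed each line read from the socket to `parse_item_stat` until
it sees "END", and pick out counters such as `number_hot` or `direct_reclaims`
with `item_stat_is`.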
38 changes: 26 additions & 12 deletions doc/threads.txt
@@ -1,6 +1,3 @@
WARNING: This document is currently a stub. It is incomplete, but provided to
give a vague overview of how threads are implemented.

Multithreading in memcached *was* originally simple:

- One listener thread
@@ -14,9 +11,6 @@ and modifications happen under central locks.

THIS HAS CHANGED!

I do need to flesh this out more, and it'll need a lot more tuning, but it has
changed in the following ways:

- A secondary small hash table of locks is used to lock an item by its hash
value. This prevents multiple threads from acting on the same item at the
same time.
@@ -25,22 +19,42 @@ changed in the following ways:
thread may read or write against a particular hash table bucket.
- atomic refcounts per item are used to manage garbage collection and
mutability.
- A central lock is still held around any "item modifications" - any change to
any item flags on any item, the LRU state, or refcount incrementing are
still centrally locked.

- When pulling an item off of the LRU tail for eviction or re-allocation, the
system must attempt to lock the item's bucket, which is done with a trylock
to avoid deadlocks. If a bucket is in use (and not by that thread) it will
walk up the LRU a little in an attempt to fetch a non-busy item.

Since I'm sick of hearing it:
- Each LRU (and sub-LRU's in newer modes) has an independent lock.

- Raw accesses to the slab class are protected by a global slabs_lock. This
is a short lock which covers pushing and popping free memory.

- item_lock must be held while modifying an item.
- slabs_lock must be held while modifying the ITEM_SLABBED flag bit within an item.
- ITEM_LINKED must not be set before an item has a key copied into it.
- items without ITEM_SLABBED set cannot have their memory zeroed out.

LOCK ORDERS:

(incomplete as of writing, sorry):

item_lock -> lru_lock -> slabs_lock

lru_lock -> item_trylock

Various stats_locks should never have other locks as dependencies.

Various locks exist for background threads. They can be used to pause the
thread execution or update settings while the threads are idle. Code holding
them may take item or LRU locks.
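
The `lru_lock -> item_trylock` order and the "walk up the LRU a little" rule
can be sketched as below. This is not memcached's implementation: all names
are hypothetical, `TAIL_TRIES` is an assumed search depth, and a C11
`atomic_flag` stands in for the real bucket mutex so the example needs no
thread library. It also does not handle the case where the bucket is already
held by the calling thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TAIL_TRIES 5  /* assumed search depth, not a memcached constant */

/* Stand-in for a hash bucket lock (hypothetical type). */
struct bucket_lock { atomic_flag held; };

struct item {
    struct item *prev;         /* next item toward the LRU head */
    struct bucket_lock *lock;  /* lock covering this item's hash bucket */
};

/* Non-blocking acquire: returns true only if the lock was free. */
bool bucket_trylock(struct bucket_lock *l)
{
    /* test_and_set returns the previous state; false means we now own it. */
    return !atomic_flag_test_and_set(&l->held);
}

void bucket_unlock(struct bucket_lock *l)
{
    atomic_flag_clear(&l->held);
}

/*
 * Caller is assumed to hold the LRU lock already. Blocking on a bucket lock
 * here would invert the item_lock -> lru_lock order, so only trylock is
 * used, and busy items are skipped by walking toward the head.
 * Returns an item with its bucket lock held, or NULL if none was found.
 */
struct item *pull_tail_sketch(struct item *tail)
{
    struct item *it = tail;
    for (int tries = 0; it != NULL && tries < TAIL_TRIES; tries++) {
        assert(it->lock != NULL);
        if (bucket_trylock(it->lock))
            return it;   /* caller must bucket_unlock() when done */
        it = it->prev;   /* busy: try the next item up the LRU */
    }
    return NULL;
}
```

Bounding the walk keeps the time spent under the LRU lock short; a caller that
gets NULL back has to decide for itself how to proceed.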

A low priority issue:

- If you remove the per-thread stats lock, CPU usage drops by less than one
percentage point, and scalability does not improve.
- In my testing, the remaining global STATS_LOCK calls never seem to collide.

Yes, more stats can be moved to threads, and those locks can actually be
removed entirely on x86-64 systems. However my tests haven't shown that as
beneficial so far, so I've prioritized other work. Apologies for the rant but
it's a common question.
beneficial so far, so I've prioritized other work.
9 changes: 8 additions & 1 deletion memcached.c
@@ -4833,14 +4833,21 @@ static void usage(void) {
" restart.\n"
" - tail_repair_time: Time in seconds that indicates how long to wait before\n"
" forcefully taking over the LRU tail item whose refcount has leaked.\n"
" The default is 3 hours.\n"
" Disabled by default; dangerous option.\n"
" - hash_algorithm: The hash table algorithm\n"
" default is jenkins hash. options: jenkins, murmur3\n"
" - lru_crawler: Enable LRU Crawler background thread\n"
" - lru_crawler_sleep: Microseconds to sleep between items\n"
" default is 100.\n"
" - lru_crawler_tocrawl: Max items to crawl per slab per run\n"
" default is 0 (unlimited)\n"
" - lru_maintainer: Enable new LRU system + background thread\n"
" - hot_lru_pct: Pct of slab memory to reserve for hot lru.\n"
" (requires lru_maintainer)\n"
" - warm_lru_pct: Pct of slab memory to reserve for warm lru.\n"
" (requires lru_maintainer)\n"
" - expirezero_does_not_evict: Items set to never expire cannot be evicted.\n"
" (requires lru_maintainer)\n"
);
return;
}