pondering: dcnn pondering experiment
tree search is heavily influenced by initial priors so we can't just reuse
the tree and inject dcnn priors later on. but if we can guess opponent's
next move then we can do some useful work at pondering time: evaluate dcnn
and run tree search ahead of time. later on when play command arrives we
just promote the node and reuse the tree if it matches our guess. if it
doesn't match we discard search results and start from scratch.
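the reuse-or-discard decision above can be sketched as a toy lookup. this is only an illustration, not pachi's code: `struct ponder_state`, `reusable_playouts()` and the integer move coordinates are hypothetical names made up for the example. the real decision is taken in `tree_promote_at()`, which checks the `TREE_HINT_DCNN` hint on the node.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GUESSES 8

/* Hypothetical stand-in for what pondering leaves behind: the opponent
 * moves that were guessed (dcnn evaluated and searched ahead of time)
 * and how much search work sits under each of them. */
struct ponder_state {
	int guesses[MAX_GUESSES];   /* guessed opponent moves (toy coordinates) */
	int nguesses;
	int playouts[MAX_GUESSES];  /* playouts accumulated under each guess */
};

/* Returns the playouts that can be reused when the opponent actually
 * plays `move`, or 0 if the move was not guessed, in which case the
 * search results are discarded and search starts from scratch. */
static int
reusable_playouts(const struct ponder_state *p, int move)
{
	assert(p != NULL && p->nguesses <= MAX_GUESSES);
	for (int i = 0; i < p->nguesses; i++)
		if (p->guesses[i] == move)
			return p->playouts[i];  /* guess matched: promote node, reuse tree */
	return 0;                               /* guess missed: nothing to reuse */
}
```

a hit keeps everything searched under the guessed move, a miss keeps nothing, which is why the number of guesses matters.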

so after genmove is over, reset the tree, get dcnn priors for our move and
likely opponent moves (guess from dcnn priors and genmove search results)
and start pondering as usual.

this makes sense when using cpu for dcnn evaluation, which is relatively
slow. with a gpu you could evaluate dcnn for every node even during tree
search, so there would be no need to go through such restrictions.

number of guesses can be tweaked by 2 uct params:
    dcnn_pondering_prior:  number of guesses from prior best moves (default: 5)
    dcnn_pondering_mcts :  number of guesses from genmove search   (default: 3)

for slow games it makes sense to increase this: we spend more time before
actual search starts but there's more chance we guess right, so that
pondering will be useful. if we guess wrong search results will be
discarded and pondering will not be useful for this move. for fast games
try decreasing it.
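a rough sketch of how the two guess lists combine, assuming moves are plain ints and using a hypothetical `PASS` marker for "no more moves". `add_guesses()` and `collect_guesses()` are made-up names for this example, while the commit itself does this with `mq_add()` / `mq_nodup()` in `uct_expand_next_best_moves()`. a move that shows up in both lists is only counted once, so the total can be lower than the sum of both params.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_GUESSES 32
#define PASS (-1)   /* hypothetical marker: no more moves in the list */

/* Append up to `n` moves from `src` to `out`, stopping at PASS and
 * skipping moves that are already present. Returns the new count. */
static int
add_guesses(int *out, int count, const int *src, int n)
{
	for (int i = 0; i < n && src[i] != PASS; i++) {
		bool dup = false;
		for (int j = 0; j < count; j++)
			if (out[j] == src[i]) { dup = true; break; }
		if (!dup && count < MAX_GUESSES)
			out[count++] = src[i];
	}
	assert(count <= MAX_GUESSES);
	return count;
}

/* Guesses from prior best moves first, then from genmove search results. */
static int
collect_guesses(int *out, const int *prior_best, int n_prior,
		const int *mcts_best, int n_mcts)
{
	int count = add_guesses(out, 0, prior_best, n_prior);
	return add_guesses(out, count, mcts_best, n_mcts);
}
```

each extra guess costs one more dcnn evaluation before pondering search can start, which is the tradeoff described above.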
lemonsqueeze committed Jan 31, 2019
1 parent 71e69ba commit e2aa881
Showing 11 changed files with 206 additions and 54 deletions.
8 changes: 8 additions & 0 deletions HACKING
@@ -181,6 +181,14 @@ support the undo command. The final_status_list command requires engine
support.


DCNN
====

dcnn.c has the code that prepares the input planes from board state and
feeds that to caffe for dcnn evaluation (caffe.cpp). If you want to use
a network with different inputs this is the place to accommodate it.


General Pattern Matcher
=======================

16 changes: 8 additions & 8 deletions README.md
@@ -92,19 +92,16 @@ overridden at runtime by setting `DATA_DIR` environment variable.

## DCNN support

Pachi can use a neural network as source of good moves to consider (priors).
Pachi can use a neural network as source of good moves to consider.
With dcnn support Pachi can play at dan level strength on modest hardware.
For large number of playouts this makes it about 1 stone stronger, and
tends to make the games more pretty. A raw dcnn engine is available for
pure dcnn play (not recommended for actual games, pachi won't know when to
pass or resign !).

Currently dcnn is used only for root node, and pondering and dcnn can't be
used together (you should get a warning on startup).

To build Pachi with DCNN support:
- Install [Caffe](http://caffe.berkeleyvision.org)
CPU only build is fine, no need for GPU, cuda or the other optional dependencies.
CPU-only build is fine, no need for GPU, cuda or the other optional dependencies.
You need openblas for good performance.
- Edit Makefile, set DCNN=1, point it to where caffe is installed and build.

@@ -114,9 +111,12 @@ Detlef Schmicker's 54% dcnn can be found at:

More information about this dcnn [here](http://computer-go.org/pipermail/computer-go/2015-December/008324.html).

If you want to use a network with different inputs you'll have to tweak
dcnn.c to accommodate it. Pachi will check for `golast19.prototxt` and
`golast.trained` files on startup.
Pachi will look for `golast19.prototxt` and `golast.trained` files on startup.

Although it was trained on 19x19, it can be used on other board sizes as well since
it's fully convolutional (right now Pachi will use it all the way down to 13x13).
Currently dcnn is used only for the root node. dcnn + pondering now works
(see `dcnn_pondering_prior` and `dcnn_pondering_mcts` uct params to tweak it).


## How to run
19 changes: 17 additions & 2 deletions dcnn.c
@@ -15,6 +15,8 @@ static bool dcnn_required = false;
void disable_dcnn(void) { dcnn_enabled = false; }
void require_dcnn(void) { dcnn_required = true; }

static void detlef54_dcnn_eval(struct board *b, enum stone color, float result[]);

static bool
dcnn_supported_board_size(struct board *b)
{
@@ -37,11 +39,24 @@ dcnn_init(struct board *b)
if (dcnn_required && !caffe_ready()) die("dcnn required, aborting.\n");
}

void
dcnn_evaluate_quiet(struct board *b, enum stone color, float result[])
{
detlef54_dcnn_eval(b, color, result);
}

void
dcnn_evaluate(struct board *b, enum stone color, float result[])
{
double time_start = time_now();
detlef54_dcnn_eval(b, color, result);
if (DEBUGL(2)) fprintf(stderr, "dcnn in %.2fs\n", time_now() - time_start);
}

static void
detlef54_dcnn_eval(struct board *b, enum stone color, float result[])
{
assert(dcnn_supported_board_size(b));
double time_start = time_now();

int size = real_board_size(b);
int dsize = 13 * size * size;
@@ -77,7 +92,6 @@ dcnn_evaluate(struct board *b, enum stone color, float result[])

caffe_get_data(data, result, 13, size);
free(data);
if (DEBUGL(2)) fprintf(stderr, "dcnn in %.2fs\n", time_now() - time_start);
}


@@ -104,3 +118,4 @@ print_dcnn_best_moves(struct board *b, coord_t *best_c, float *best_r, int nbest
fprintf(stderr, "%-3i ", (int)(best_r[i] * 100));
fprintf(stderr, "]\n");
}

1 change: 1 addition & 0 deletions dcnn.h
@@ -11,6 +11,7 @@ void require_dcnn(void);
void disable_dcnn(void);

void dcnn_evaluate(struct board *b, enum stone color, float result[]);
void dcnn_evaluate_quiet(struct board *b, enum stone color, float result[]);
bool using_dcnn(struct board *b);
void dcnn_init(struct board *b);
void get_dcnn_best_moves(struct board *b, float *r, coord_t *best_c, float *best_r, int nbest);
11 changes: 8 additions & 3 deletions uct/internal.h
@@ -58,13 +58,17 @@ struct uct {
TM_TREEVL, /* Tree parallelization with virtual loss. */
} thread_model;
int virtual_loss;
bool pondering_opt; /* User wants pondering */
bool pondering; /* Actually pondering now */
bool slave; /* Act as slave in distributed engine. */
int max_slaves; /* Optional, -1 if not set */
int slave_index; /* 0..max_slaves-1, or -1 if not set */
enum stone my_color;

bool pondering_opt; /* User wants pondering */
bool pondering; /* Actually pondering now */
int dcnn_pondering_prior; /* Prior next move guesses */
int dcnn_pondering_mcts; /* Genmove next move guesses */
coord_t dcnn_pondering_mcts_c[20];

int fuseki_end;
int yose_start;

@@ -123,12 +127,13 @@ struct uct {
/* Saved dead groups, for final_status_list dead */
struct move_queue dead_groups;
int pass_moveno;

/* Timing */
double mcts_time_start;

/* Game state - maintained by setup_state(), reset_state(). */
struct tree *t;
bool tree_ready;
};

#define UDEBUGL(n) DEBUGL_(u->debug_level, n)
14 changes: 9 additions & 5 deletions uct/prior.c
@@ -21,7 +21,7 @@

#define PRIOR_BEST_N 20

static void
void
get_node_prior_best_moves(struct tree_node *parent, coord_t *best_c, float *best_r, int nbest)
{
for (int i = 0; i < nbest; i++) {
@@ -126,11 +126,13 @@ uct_prior_dcnn(struct uct *u, struct tree_node *node, struct prior_map *map)
{
#ifdef DCNN
float r[19 * 19];
coord_t best_c[DCNN_BEST_N];
coord_t best_c[DCNN_BEST_N];
float best_r[DCNN_BEST_N];
dcnn_evaluate(map->b, map->to_play, r);
if (!node->parent) dcnn_evaluate(map->b, map->to_play, r);
else dcnn_evaluate_quiet(map->b, map->to_play, r);
get_dcnn_best_moves(map->b, r, best_c, best_r, DCNN_BEST_N);
if (UDEBUGL(2))

if (UDEBUGL(2) && !node->parent)
print_dcnn_best_moves(map->b, best_c, best_r, DCNN_BEST_N);

foreach_free_point(map->b) {
@@ -144,6 +146,8 @@ uct_prior_dcnn(struct uct *u, struct tree_node *node, struct prior_map *map)
assert(val >= 0.0 && val <= 1.0);
add_prior_value(map, c, 1, sqrt(val) * u->prior->dcnn_eqex);
} foreach_free_point_end;

node->hints |= TREE_HINT_DCNN;
#endif
}

@@ -295,7 +299,7 @@ uct_prior(struct uct *u, struct tree_node *node, struct prior_map *map)
if (u->prior->even_eqex) uct_prior_even(u, node, map);

/* Use dcnn for root priors */
if (u->prior->dcnn_eqex && !node->parent) uct_prior_dcnn(u, node, map);
if (u->prior->dcnn_eqex && !u->tree_ready) uct_prior_dcnn(u, node, map);

if (u->prior->pattern_eqex) uct_prior_pattern(u, node, map);
else { /* Fallback to old prior features if patterns are off. */
1 change: 1 addition & 0 deletions uct/prior.h
@@ -58,5 +58,6 @@ add_prior_value(struct prior_map *map, coord_t c, floating_t value, int playouts

/* Display node's priors best moves */
void print_node_prior_best_moves(struct board *b, struct tree_node *parent);
void get_node_prior_best_moves(struct tree_node *parent, coord_t *best_c, float *best_r, int nbest);

#endif
80 changes: 77 additions & 3 deletions uct/search.c
@@ -25,6 +25,7 @@
#include "uct/uct.h"
#include "uct/walk.h"
#include "uct/prior.h"
#include "dcnn.h"


/* Default time settings for the UCT engine. In distributed mode, slaves are
@@ -93,6 +94,7 @@ static pthread_cond_t finish_cond = PTHREAD_COND_INITIALIZER;
static volatile int finish_thread;
static pthread_mutex_t finish_serializer = PTHREAD_MUTEX_INITIALIZER;

static void uct_expand_next_best_moves(struct uct *u, struct tree *t, struct board *b, enum stone color);
static void *spawn_logger(void *ctx_);

static void *
@@ -114,22 +116,27 @@ spawn_worker(void *ctx_)
}
}

/* Expand root node (dcnn). Other threads wait till it's ready. */
/* Expand root node (dcnn). Other threads wait till it's ready.
* For dcnn pondering we also need dcnn values for opponent's best moves. */
struct tree *t = ctx->t;
struct tree_node *n = t->root;
if (!ctx->tid) {
enum stone player_color = ctx->color;
enum stone node_color = stone_other(player_color);
assert(node_color == t->root_color);

if (tree_leaf_node(n) && !__sync_lock_test_and_set(&n->is_expanded, 1))
if (tree_leaf_node(n) && !__sync_lock_test_and_set(&n->is_expanded, 1)) {
tree_expand_node(t, n, ctx->b, player_color, u, 1);
if (u->pondering && using_dcnn(ctx->b))
uct_expand_next_best_moves(u, t, ctx->b, player_color);
}
else if (DEBUGL(2)) { /* Show previously computed priors */
print_joseki_moves(joseki_dict, ctx->b, ctx->color);
print_node_prior_best_moves(ctx->b, n);
}
u->tree_ready = true;
}
else while (tree_leaf_node(n))
else while (!u->tree_ready)
usleep(100 * 1000);

/* Run */
@@ -173,6 +180,8 @@ spawn_thread_manager(void *ctx_)
t->root = tree_garbage_collect(t, t->root);
}

u->tree_ready = false;

/* Logging thread for pondering */
if (u->pondering)
pthread_create(&threads[u->threads], NULL, spawn_logger, mctx);
@@ -255,6 +264,71 @@ spawn_logger(void *ctx_)
return NULL;
}

/* Expand next move node (dcnn pondering) */
static void
uct_expand_next_move(struct uct *u, struct tree *t, struct board *board, enum stone color, coord_t c)
{
struct tree_node *n = tree_get_node(t->root, c);
assert(n && tree_leaf_node(n) && !n->is_expanded);

struct board b;
board_copy(&b, board);

struct move m = { .coord = c, .color = color };
int res = board_play(&b, &m);
if (res < 0) goto done;

if (!__sync_lock_test_and_set(&n->is_expanded, 1))
tree_expand_node(t, n, &b, stone_other(color), u, -1);

done: board_done_noalloc(&b);
}

/* For pondering with dcnn we need dcnn values for next move as well before
* search starts. Can't evaluate all of them, so guess from prior best moves +
* genmove's best moves for opponent. If we guess right all is well. If we
* guess wrong pondering will not be useful for this move, search results
* will be discarded. */
static void
uct_expand_next_best_moves(struct uct *u, struct tree *t, struct board *b, enum stone color)
{
assert(using_dcnn(b));
struct move_queue q = { .moves = 0 };

{ /* Prior best moves (dcnn policy mostly) */
int nbest = u->dcnn_pondering_prior;
float best_r[nbest];
coord_t best_c[nbest];
get_node_prior_best_moves(t->root, best_c, best_r, nbest);
assert(t->root->hints & TREE_HINT_DCNN);

for (int i = 0; i < nbest && !is_pass(best_c[i]); i++)
mq_add(&q, best_c[i], 0);
}

{ /* Opponent best moves from genmove search */
int nbest = u->dcnn_pondering_mcts;
coord_t *best_c = u->dcnn_pondering_mcts_c;
for (int i = 0; i < nbest && !is_pass(best_c[i]); i++) {
mq_add(&q, best_c[i], 0);
mq_nodup(&q);
}
}

if (DEBUGL(2)) { /* Show guesses. */
fprintf(stderr, "dcnn eval %s ", stone2str(color));
for (unsigned int i = 0; i < q.moves; i++)
fprintf(stderr, "%s ", coord2sstr(q.move[i], b));
fflush(stderr);
}

for (unsigned int i = 0; i < q.moves && !uct_halt; i++) { /* Don't hang if genmove comes in. */
uct_expand_next_move(u, t, b, color, q.move[i]);
if (DEBUGL(2)) { fprintf(stderr, "."); fflush(stderr); }
}
if (DEBUGL(2)) fprintf(stderr, "\n");
}


/*** THREAD MANAGER end */

9 changes: 8 additions & 1 deletion uct/tree.c
@@ -18,6 +18,7 @@
#include "uct/prior.h"
#include "uct/tree.h"
#include "uct/slave.h"
#include "dcnn.h"


/* Allocate tree node(s). The returned nodes are initialized with zeroes.
@@ -795,13 +796,19 @@ tree_promote_node(struct tree *tree, struct tree_node **node)
}

bool
tree_promote_at(struct tree *t, struct board *b, coord_t c)
tree_promote_at(struct tree *t, struct board *b, coord_t c, int *reason)
{
*reason = 0;
tree_fix_symmetry(t, b, c);

struct tree_node *n = tree_get_node(t->root, c);
if (!n) return false;

if (using_dcnn(b) && !(n->hints & TREE_HINT_DCNN)) {
*reason = TREE_HINT_DCNN;
return false; /* No dcnn priors, can't reuse ... */
}

tree_promote_node(t, &n);
return true;
}
3 changes: 2 additions & 1 deletion uct/tree.h
@@ -86,6 +86,7 @@ struct tree_node {
unsigned char d;

#define TREE_HINT_INVALID 1 // don't go to this node, invalid move
#define TREE_HINT_DCNN 2 // node has dcnn priors
unsigned char hints;

/* In case multiple threads walk the tree, is_expanded is set
@@ -161,7 +162,7 @@ struct tree_node *tree_get_node(struct tree_node *parent, coord_t c);
struct tree_node *tree_get_node2(struct tree *tree, struct tree_node *parent, coord_t c, bool create);
struct tree_node *tree_garbage_collect(struct tree *tree, struct tree_node *node);
void tree_promote_node(struct tree *tree, struct tree_node **node);
bool tree_promote_at(struct tree *tree, struct board *b, coord_t c);
bool tree_promote_at(struct tree *tree, struct board *b, coord_t c, int *reason);

void tree_expand_node(struct tree *tree, struct tree_node *node, struct board *b, enum stone color, struct uct *u, int parity);
struct tree_node *tree_lnode_for_node(struct tree *tree, struct tree_node *ni, struct tree_node *lni, int tenuki_d);