The Lean Theorem Proving Environment

Setup instructions

Make sure to install elan: https://www.github.com/kha/elan

Set up `lean-tpe` repository

git clone [email protected]:jesse-michael-han/lean-tpe-public $LEAN_TPE_DIR

cd $LEAN_TPE_DIR

leanpkg configure

leanproject get-mathlib-cache

bash ./_target/deps/mathlib/scripts/mk_all.sh # make a file that imports all of mathlib

leanpkg build

Evaluation instructions

The Lean scripts will dump JSON messages to a specified file, with a new message on each line. Each message encodes the history of a single proof search.

Run some toy examples

WARNING: The error messages of the Lean scripts are currently extremely unhelpful. Make sure arguments are passed in correct order.

Check that single-threaded baseline evaluation works

# greedy strategy

lean --run ./src/evaluation/greedy/baseline.lean ./test/dummy_decls2.names ./test/dummy_out_greedy_baseline.log 50 5 300 > ./test/dummy_out_greedy_baseline.out 2>&1

# this should be 31
cat ./test/dummy_out_greedy_baseline.log | grep | wc -l

# this should be 0 (TODO(jesse): allow optional environment-setting)
cat ./test/dummy_out_greedy_baseline.log | grep '"success":true' | wc -l

# bfs strategy

lean --run ./src/evaluation/bfs/baseline.lean ./test/dummy_decls2.names ./test/dummy_out_bfs_baseline.log 50 25 50 5 300 > ./test/dummy_out_bfs_baseline.out 2>&1

# this should be 31
cat ./test/dummy_out_bfs_baseline.log | grep | wc -l

# this should be 0 (TODO(jesse): allow optional environment-setting)
cat ./test/dummy_out_bfs_baseline.log | grep '"success":true' | wc -l

Check that single-threaded GPT-f evaluation works

# greedy strategy

# args: decl_names logfile max_tokens temp top_p n best_of fuel engine_id api_key
lean --run ./src/evaluation/greedy/gptf.lean ./test/dummy_decls2.names ./test/dummy_out_greedy_gptf.log 256 0.7 1.0 10 10 50 formal-lean-wm-to-tt-m1-m2-v4-c4 $OPENAI_API_KEY 5 300 > ./test/dummy_out_greedy_gptf.out 2>&1

# should be 31
cat ./test/dummy_out_greedy_gptf.log | grep | wc -l

# should be ~20
cat ./test/dummy_out_greedy_gptf.log | grep '"success":true' | wc -l

# bfs strategy

# args: decl_names logfile max_tokens temp top_p n best_of fuel max_width max_depth engine_id api_key
lean --run ./src/evaluation/bfs/gptf.lean ./test/dummy_decls2.names ./test/dummy_out_bfs_gptf.log 256 0.7 1.0 10 10 50 10 25 formal-lean-wm-to-tt-m1-m2-v4-c4 $OPENAI_API_KEY 5 300 > ./test/dummy_out_bfs_gptf.out 2>&1

# should be 31
cat ./test/dummy_out_bfs_gptf.log | grep | wc -l

# should be ~24
cat ./test/dummy_out_bfs_gptf.log | grep '"success":true' | wc -l

Check that the Python script wrapper works

The Python script wraps the Lean scripts and runs them in parallel, sharding a list of theorems.

usage: parallel_eval.py [-h] [--max_tokens MAX_TOKENS]
                        [--temperature TEMPERATURE] [--top_p TOP_P] [--n N]
                        [--best_of BEST_OF] [--fuel FUEL]
                        [--engine_id ENGINE_ID] [--api_key API_KEY]
                        [--nbest NBEST] [--beam BEAM]
                        [--model_path MODEL_PATH] [--data_path DATA_PATH]
                        [--entry_pt ENTRY_PT] [--max_width MAX_WIDTH]
                        [--max_depth MAX_DEPTH] [--lean_threads LEAN_THREADS]
                        [--lean_timeout LEAN_TIMEOUT] [--api API]
                        [--search_strategy SEARCH_STRATEGY]
                        [--tac_timeout TAC_TIMEOUT]
                        [--global_timeout GLOBAL_TIMEOUT]
                        n_workers decls_per_shard decls_file dest_dir

positional arguments:
  n_workers
  decls_per_shard
  decls_file
  dest_dir

optional arguments:
  -h, --help            show this help message and exit
  --max_tokens MAX_TOKENS
  --temperature TEMPERATURE
  --top_p TOP_P
  --n N
  --best_of BEST_OF
  --fuel FUEL
  --engine_id ENGINE_ID
  --api_key API_KEY
  --nbest NBEST
  --beam BEAM
  --model_path MODEL_PATH
  --data_path DATA_PATH
  --entry_pt ENTRY_PT
  --max_width MAX_WIDTH
                        maximum size of search queue for BFS
  --max_depth MAX_DEPTH
                        maximum distance of search node from root before the
                        search queue rejects it
  --lean_threads LEAN_THREADS
                        number of threads per Lean process
  --lean_timeout LEAN_TIMEOUT
                        deterministic timeout for Lean process in millions of
                        allocations. Interactive default is one. Default is
                        unbounded (none).
  --api API             gptf|baseline|fairseq
  --search_strategy SEARCH_STRATEGY
                        greedy|bfs
  --tac_timeout TAC_TIMEOUT
                        tactic execution timeout (s)
  --global_timeout GLOBAL_TIMEOUT
                        proof search timeout (s)

Check that the baseline version works. Inspect some of the output files.

python ./scripts/parallel_eval.py 4 8 ./test/dummy_decls2.names ./test_parallel/baseline/ --fuel 50 --api baseline --search_strategy greedy --tac_timeout 5 --global_timeout 300

Check that the greedy GPT-f version works. Inspect some of the output files.

python ./scripts/parallel_eval.py 4 8 ./test/dummy_decls2.names ./test_parallel/gptf_greedy/ --max_tokens 256 --temperature 0.7 --top_p 1.0 --n 10 --best_of 10 --fuel 50 --engine_id formal-lean-wm-to-tt-m1-m2-v4-c4 --api_key $OPENAI_API_KEY --api gptf --search_strategy greedy --tac_timeout 5 --global_timeout 300

Check that the BFS GPT-f version works. Inspect some of the output files.

python ./scripts/parallel_eval.py 4 8 ./test/dummy_decls2.names ./test_parallel/gptf_bfs/ --max_tokens 256 --temperature 0.7 --top_p 1.0 --n 10 --best_of 10 --fuel 50 --max_width 10 --max_depth 50 --engine_id formal-lean-wm-to-tt-m1-m2-v4-c4 --api_key $OPENAI_API_KEY --api gptf --search_strategy bfs --tac_timeout 5 --global_timeout 300

Data processing

Removing non-theorems

lean --run ./src/tools/filter_defs.lean $ORIGINAL_NAMES_FILE $NEW_NAMES_FILE

Shuffling names files

python ./scripts/shuffle_lines.py $NAMES_FILE $SHUFFLED_NAMES_FILE # optional seed -- seed 12387

Guidelines for contributing

Ensure that the following invariants hold before making a PR or pushing to master:

leanpkg build does not raise any errors.
single-threaded evaluation does not fail (check the logs)
your API key is not in the source code

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
data		data
scripts		scripts
src		src
test		test
.gitignore		.gitignore
LICENSE		LICENSE
README.org		README.org
leanpkg.toml		leanpkg.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

The Lean Theorem Proving Environment

Setup instructions

Set up `lean-tpe` repository

Evaluation instructions

Run some toy examples

Check that single-threaded baseline evaluation works

Check that single-threaded GPT-f evaluation works

Check that the Python script wrapper works

Data processing

Removing non-theorems

Shuffling names files

Guidelines for contributing

About

Releases

Packages

Languages

License

jesse-michael-han/lean-tpe-public

Folders and files

Latest commit

History

Repository files navigation

The Lean Theorem Proving Environment

Setup instructions

Set up lean-tpe repository

Evaluation instructions

Run some toy examples

Check that single-threaded baseline evaluation works

Check that single-threaded GPT-f evaluation works

Check that the Python script wrapper works

Data processing

Removing non-theorems

Shuffling names files

Guidelines for contributing

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Set up `lean-tpe` repository

Packages