GitHub - hzm8341/taichi: Productive programming language for portable, high-performance, sparse & differentiable computing

Docs | Tutorial | DiffTaichi | Examples | Contribute | Forum


# Python 3.6/3.7 needed

# CPU only. No GPU/CUDA needed. (Linux, OS X and Windows)
python3 -m pip install taichi-nightly

# With GPU (CUDA 10.0) support (Linux only)
python3 -m pip install taichi-nightly-cuda-10-0

# With GPU (CUDA 10.1) support (Linux only)
python3 -m pip install taichi-nightly-cuda-10-1
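The choice among the three packages above can be sketched as a small helper. This is only an illustration of the rules stated in the install comments: `pick_taichi_package` and `python_is_supported` are hypothetical names, not part of Taichi, and the CUDA version is assumed to be passed in by the caller rather than detected.

```python
import sys


def pick_taichi_package(system, cuda_version=None):
    """Return the pip package name for a platform (hypothetical helper).

    system: a value in the style of platform.system(), e.g. "Linux".
    cuda_version: "10.0", "10.1", or None for a CPU-only install.
    """
    if cuda_version is None:
        # The CPU-only build is listed for Linux, OS X and Windows.
        return "taichi-nightly"
    if system != "Linux":
        raise ValueError("CUDA builds are listed for Linux only")
    packages = {
        "10.0": "taichi-nightly-cuda-10-0",
        "10.1": "taichi-nightly-cuda-10-1",
    }
    if cuda_version not in packages:
        raise ValueError("no nightly build listed for CUDA " + cuda_version)
    return packages[cuda_version]


def python_is_supported(version_info=sys.version_info):
    """True for Python 3.6 or 3.7, the versions the install notes ask for."""
    return tuple(version_info[:2]) in ((3, 6), (3, 7))
```

For example, `pick_taichi_package("Linux", "10.1")` gives the name to pass to `python3 -m pip install`.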

Contribution Guidelines

Build and PyPI status by platform: Linux (CUDA), OS X (10.14+), Windows

Short-term goals

(Done) Fully implement the LLVM backend to replace the legacy source-to-source C++/CUDA backends (By Dec 2019)
- The only missing features compared to the old source-to-source backends:
  - Vectorization on CPUs. Since most users who want performance are using GPUs (CUDA), this has low priority.
  - Automatic shared memory utilization. Postponed until Feb/March 2020.
(Done) Redesign & reimplement (GPU) memory allocator (by the end of Jan 2020)
(WIP) Tune the performance of the LLVM backend to match that of the legacy source-to-source backends (Hopefully by mid Feb, 2020. Current progress: setting up/tuning for final benchmarks)

Updates

(Feb 16, 2020) v0.5.1 released
- Keyboard and mouse events supported in the GUI system. Check out mpm128.py for an interactive demo! (by Yubin Peng [archibate] and Ye Kuang [k-ye])
- Basic algebraic simplification passes (by Mingkuan Xu [xumingkuan])
- (For developers) ti (ti.exe) command supported on Windows after setting %PATH% correctly (by Mingkuan Xu [xumingkuan])
- General power operator x ** y now supported in Taichi kernels (by Yubin Peng [archibate])
- .dense(...).pointer() now abbreviated as .pointer(...). pointer now stands for a dense pointer array. This leads to cleaner code and better performance. (by Kenneth Lozes [KLozes])
- (Advanced struct-fors only) for i in X now iterates over all child instances of X instead of X itself. You can ignore this if X is always a leaf node such as ti.f32/i32/Vector/Matrix.
- Fixed race conditions in the CUDA random number generator
(Feb 14, 2020) v0.5.0 released with a new Apple Metal GPU backend for Mac OS X users! (by Ye Kuang [k-ye])
- Just initialize your program with ti.init(..., arch=ti.metal) and run Taichi on your Mac GPUs!
- A few takeaways if you do want to use the Metal backend:
  - For now, the Metal backend only supports dense SNodes and 32-bit data types. It doesn't support ti.random() or print().
  - Pre-2015 models may show undefined behavior under certain conditions (e.g. read-after-write). In our tests, the memory order on a single GPU thread appeared to become inconsistent on these models.
  - The [] operator in Python is slow in the current implementation. If you need to do a large number of reads, consider dumping all the data to a numpy array via to_numpy() as a workaround. For writes, consider first generating the data into a numpy array, then copying that to the Taichi variables as a whole.
  - Do NOT expect a performance boost yet; we are still profiling and tuning the new backend. (So far we have seen a big performance improvement only on a 2015 MBP 13-inch model.)
(Feb 12, 2020) v0.4.6 released.
- (For compiler developers) An error will be raised when TAICHI_REPO_DIR is not a valid path (by Yubin Peng [archibate])
- Fixed a CUDA backend deadlock bug
- Added test selectors ti.require() and ti.archs_excluding() (by Ye Kuang [k-ye])
- ti.init(**kwargs) now takes a parameter debug=True/False, which turns on debug mode if true
- ... or use TI_DEBUG=1 to turn on debug mode non-intrusively
- Fixed ti.profiler_clear
- Added GUI.line(begin, end, color, radius) and ti.rgb_to_hex
- Renamed ti.trace (Matrix trace) to ti.tr. ti.trace is now for logging with ti.TRACE level
- Fixed return value of ti test_cpp (thanks to Ye Kuang [k-ye])
- Raised the default logging level from ti.TRACE to ti.INFO to make the output quieter
- General performance/compatibility improvements
- Doc updated
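As an illustration of what a helper like ti.rgb_to_hex is for, the sketch below packs an RGB triple into a single 0xRRGGBB integer. It is a stand-in written for this note, not Taichi's implementation: the name `rgb_to_hex_sketch`, the assumption that channels are floats in [0, 1], and the clamping of out-of-range values are all assumptions.

```python
def rgb_to_hex_sketch(rgb):
    """Pack an (r, g, b) triple of floats in [0, 1] into a 0xRRGGBB integer.

    Assumed semantics for illustration only; values outside [0, 1] are clamped.
    """
    def to_byte(x):
        # Scale to 0..255, truncate, and clamp to the valid byte range.
        return min(255, max(0, int(x * 255)))

    r, g, b = (to_byte(c) for c in rgb)
    return (r << 16) | (g << 8) | b
```

Under these assumptions, `rgb_to_hex_sketch((1.0, 0.0, 0.0))` yields `0xFF0000`, which is the kind of integer color value a call such as GUI.line(begin, end, color, radius) would take.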
(Feb 6, 2020) v0.4.5 released.
- ti.init(arch=..., print_ir=..., default_fp=..., default_ip=...) now supported. ti.cfg.xxx is deprecated
- Immediate data layout specification supported after ti.init. No need to wrap data layout definition with @ti.layout anymore (unless you intend to do so)
- ti.is_active, ti.deactivate, SNode.deactivate_all supported in the new LLVM x64/CUDA backend. Example
- Experimental Windows non-UTF-8 path fix (by Yubin Peng [archibate])
- ti.global_var (which duplicates ti.var) is removed
- ti.Matrix.rotation2d(angle) added
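For readers unfamiliar with 2D rotation matrices, here is a plain-Python sketch of the matrix that a function such as ti.Matrix.rotation2d(angle) presumably builds. The counterclockwise convention, the angle in radians, and the helper names `rotation2d_sketch` and `apply_2x2` are assumptions made for illustration; this does not call Taichi.

```python
import math


def rotation2d_sketch(angle):
    """Return the 2x2 counterclockwise rotation matrix for angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s],
            [s, c]]


def apply_2x2(m, v):
    """Multiply a 2x2 matrix (nested lists) by a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]
```

With this convention, rotating the vector (1, 0) by math.pi / 2 gives a vector very close to (0, 1), up to floating-point rounding.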
(Feb 5, 2020) v0.4.4 released.
- For developers: ffi-navigator support [doc]. (by masahi)
- Fixed f64 precision support of sin and cos on CUDA backends (by Kenneth Lozes [KLozes])
- Make Profiler print the arch name in its title (by Ye Kuang [k-ye])
- Tons of invisible contributions by Ye Kuang [k-ye], for the WIP Metal backend
- Profiler now works on CPU devices. To enable it, set ti.cfg.enable_profiler = True. Call ti.profiler_print() to print kernel running times
- General performance improvements
(Feb 3, 2020) v0.4.3 released.
- GUI.circles 2.4x faster
- General performance improvements
(Feb 2, 2020) v0.4.2 released.
- GUI framerates are now more stable
- Optimized OffloadedRangeFor with const bounds. Light-computation programs such as mpm88.py are 30% faster on CUDA due to fewer kernel launches
- Optimized CPU parallel range for performance
(Jan 31, 2020) v0.4.1 released.
- Fixed an autodiff bug introduced in v0.3.24. Please update if you are using Taichi differentiable programming.
- Updated Dockerfile (by Shenghang Tsai [jackalcooper])
- pbf2d.py visualization performance boosted (by Ye Kuang [k-ye])
- Fixed GlobalTemporaryStmt codegen
(Jan 30, 2020) v0.4.0 released.
- Memory allocator redesigned
- Struct-fors with pure dense data structures will be demoted into a range-for, which is faster since no element list generation is needed
- Python 3.5 support is dropped. Please use Python 3.6(pip)/3.7(pip)/3.8(Windows: pip; OS X & Linux: build from source) (by Chujie Zeng [Psycho7])
- ti.deactivate now supported on sparse data structures
- GUI.circles (batched circle drawing) performance improved by 30x
- Minor bug fixes (by Yubin Peng [archibate], Ye Kuang [k-ye])
- Doc updated
Full changelog

Related papers

(SIGGRAPH Asia 2019) High-Performance Computation on Sparse Data Structures [Video] [BibTex]
- by Yuanming Hu, Tzu-Mao Li, Luke Anderson, Jonathan Ragan-Kelley, and Frédo Durand
(ICLR 2020) Differentiable Programming for Physical Simulation [Video] [BibTex] [Code]
- by Yuanming Hu, Luke Anderson, Tzu-Mao Li, Qi Sun, Nathan Carr, Jonathan Ragan-Kelley, and Frédo Durand

