forked from taichi-dev/taichi

Productive programming language for portable, high-performance, sparse & differentiable computing

yuv4r4j/taichi

 
 

Join the chat at https://gitter.im/taichi-dev/Lobby
# Python 3.6/3.7 needed

# CPU only. No GPU/CUDA needed. (Linux, OS X and Windows)
python3 -m pip install taichi-nightly

# With GPU (CUDA 10.0) support (Linux only)
python3 -m pip install taichi-nightly-cuda-10-0

# With GPU (CUDA 10.1) support (Linux only)
python3 -m pip install taichi-nightly-cuda-10-1
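
The package choice in the commands above can be sketched as a small helper. This is only an illustration of the rules stated in the comments (CUDA packages on Linux only, Python 3.6/3.7); the function names `pick_taichi_package` and `python_supported` are hypothetical and not part of Taichi.

```python
import platform
import sys


def pick_taichi_package(system, cuda_version=None):
    """Return the pip package name suggested by the install notes.

    system: an OS name as returned by platform.system(), e.g. "Linux".
    cuda_version: "10.0", "10.1", or None when no CUDA toolkit is used.
    Hypothetical helper for illustration; not a Taichi API.
    """
    cuda_packages = {
        "10.0": "taichi-nightly-cuda-10-0",
        "10.1": "taichi-nightly-cuda-10-1",
    }
    # The CUDA builds are listed as Linux only; everything else gets the CPU build.
    if system == "Linux" and cuda_version in cuda_packages:
        return cuda_packages[cuda_version]
    return "taichi-nightly"


def python_supported(version_info):
    """True if the interpreter is Python 3.6 or 3.7, as the install notes require."""
    return tuple(version_info[:2]) in {(3, 6), (3, 7)}


if __name__ == "__main__":
    package = pick_taichi_package(platform.system())
    print("python3 -m pip install", package)
    if not python_supported(sys.version_info):
        print("Note: the install notes ask for Python 3.6/3.7")
```

Passing a CUDA version on a non-Linux system deliberately falls back to the CPU-only package, since the notes list the CUDA builds as Linux only.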
Build and PyPI status are tracked for Linux (CUDA), OS X (10.14+), and Windows.

Related papers

Short-term goals

  • (Done) Fully implement the LLVM backend to replace the legacy source-to-source C++/CUDA backends (By Dec 2019)
    • The only missing features compared to the old source-to-source backends:
      • Vectorization on CPUs. Since most performance-sensitive users run on GPUs (CUDA), this is low priority.
      • Automatic shared memory utilization. Postponed until Feb/March 2020.
  • (Done) Redesign & reimplement (GPU) memory allocator (by the end of Jan 2020)
  • (WIP) Tune the performance of the LLVM backend to match that of the legacy source-to-source backends (Hopefully by mid Feb, 2020. Current progress: setting up/tuning for final benchmarks)

Updates

  • (Feb 6, 2020) v0.4.5 released.

    • ti.init(arch=..., print_ir=..., default_fp=..., default_ip=...) is now supported. ti.cfg.xxx is deprecated.
    • Data layouts can now be specified immediately after ti.init. Wrapping the data layout definition in @ti.layout is no longer required (though you may still do so).
    • ti.is_active, ti.deactivate, and SNode.deactivate_all are supported in the new LLVM x64/CUDA backend.
    • Experimental Windows non-UTF-8 path fix (by Yubin Peng [archibate])
    • ti.global_var (which duplicates ti.var) is removed
    • ti.Matrix.rotation2d(angle) added
  • (Feb 5, 2020) v0.4.4 released.

    • For developers: ffi-navigator support (by masahi)
    • Fixed f64 precision support of sin and cos on CUDA backends (by Kenneth Lozes [KLozes])
    • Make Profiler print the arch name in its title (by Ye Kuang [k-ye])
    • Tons of invisible contributions by Ye Kuang [k-ye], for the WIP Metal backend
    • The profiler now works on CPU devices. To enable it, set ti.cfg.enable_profiler = True, then call ti.profiler_print() to print kernel running times
    • General performance improvements
  • (Feb 3, 2020) v0.4.3 released.

    • GUI.circles 2.4x faster
    • General performance improvements
  • (Feb 2, 2020) v0.4.2 released.

    • GUI framerates are now more stable
    • Optimized OffloadedRangeFor with const bounds. Light computation programs such as mpm88.py are 30% faster on CUDA due to fewer kernel launches
    • Optimized CPU parallel range-for performance
  • (Jan 31, 2020) v0.4.1 released.

    • Fixed an autodiff bug introduced in v0.3.24. Please update if you are using Taichi differentiable programming.
    • Updated Dockerfile (by Shenghang Tsai [jackalcooper])
    • pbf2d.py visualization performance boosted (by Ye Kuang [k-ye])
    • Fixed GlobalTemporaryStmt codegen
  • (Jan 30, 2020) v0.4.0 released.

    • Memory allocator redesigned
    • Struct-fors with pure dense data structures will be demoted into a range-for, which is faster since no element list generation is needed
    • Python 3.5 support is dropped. Please use Python 3.6 (pip), 3.7 (pip), or 3.8 (Windows: pip; OS X & Linux: build from source) (by Chujie Zeng [Psycho7])
    • ti.deactivate now supported on sparse data structures
    • GUI.circles (batched circle drawing) performance improved by 30x
    • Minor bug fixes (by Yubin Peng [archibate], Ye Kuang [k-ye])
    • Doc updated
  • Full changelog
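
The v0.4.5 entry above adds ti.Matrix.rotation2d(angle). As a rough illustration of what a 2D rotation matrix does, here is a plain-Python sketch that does not use Taichi. It assumes the standard counterclockwise convention with the angle in radians. Whether ti.Matrix.rotation2d follows exactly this convention is an assumption, and the helpers `rotation2d` and `apply` below are hypothetical stand-ins.

```python
import math


def rotation2d(angle):
    """Build a 2x2 counterclockwise rotation matrix for an angle in radians.

    Plain-Python stand-in for illustration; not the Taichi implementation.
    """
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s],
            [s, c]]


def apply(matrix, vector):
    """Multiply a 2x2 matrix (list of rows) by a 2-element vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Rotating the x unit vector by 90 degrees yields approximately the y unit vector.
rotated = apply(rotation2d(math.pi / 2), [1.0, 0.0])
print(rotated)
```

Because of floating-point rounding, the first component comes out as a tiny nonzero number, not exactly 0.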
