Vidyut aims to provide performant and high-quality solutions for the common problems that Sanskrit programmers face. Some of these problems include:
- Word generation, or converting bases and suffixes into complete words. (भू → भवति)
- Word lookup, or mapping a complete word back to its bases and suffixes. (भवति → भू)
- Transliteration, or conversion of Sanskrit text from one script to another. (भू → bhū)
- Metrical analysis, or understanding the meter used by a piece of Sanskrit text.
- Sandhi changes, or applying and undoing the sound changes that occur between pieces of Sanskrit text. (चैव → च एव)
- Segmentation, or splitting a piece of Sanskrit text into distinct words. (भवत्येव → भवति एव)
Vidyut compiles to fast and efficient native code, and it can be bound to other programming languages with minimal work. We provide first-class support for Python and are eager to support other bindings as well.
Vidyut is under active development as part of the Ambuda project.
License: MIT
Vidyut is implemented in Rust, which provides low-level control with high-level ergonomics. For your convenience, we also provide first-class support for Python bindings through the vidyut Python package. This section describes how to use Vidyut either through Rust or through Python.
First, install Rust on your computer by following the instructions here.
Once you've done so, create a new project with cargo new and install Vidyut's packages:
cargo add vidyut-prakriya
cargo add vidyut-kosha
cargo add vidyut-lipi
# ... and so on
You can also install directly from this repository:
cargo add vidyut-prakriya --git https://github.com/ambuda-org/vidyut.git
cargo add vidyut-kosha --git https://github.com/ambuda-org/vidyut.git
cargo add vidyut-lipi --git https://github.com/ambuda-org/vidyut.git
# ... and so on
We recommend using our pre-built linguistic data, which is available as a ZIP file here.
For more information, see our Rust documentation.
First, install Python on your computer. There are many ways to do so, but we recommend installing uv, then running uv init my-project to create a Python project.
Once your setup is ready, you can install the vidyut package:
# With uv
$ uv add vidyut
# With pip
$ pip install vidyut
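As a quick sanity check after installing, the sketch below uses only the Python standard library to confirm that a package named vidyut is visible to the current interpreter. The helper name is_installed is hypothetical and is not part of Vidyut.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level `package` can be found on the import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if is_installed("vidyut"):
        print("vidyut is installed")
    else:
        print("vidyut was not found; try `pip install vidyut`")
```

If the check fails inside a uv project, make sure you are running the script with uv run so that the project's environment is used.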
You can also install directly from this repository. Doing so compiles the repository from scratch and might take several minutes, so we strongly suggest using our latest PyPI release instead.
# Building from scratch is slow, so we pass `--verbose` to monitor its status.
# With uv
$ uv add "git+https://github.com/ambuda-org/vidyut.git#subdirectory=bindings-python" --verbose
# With pip
$ pip install -e "git+https://github.com/ambuda-org/vidyut.git#egg=vidyut&subdirectory=bindings-python" --verbose
We recommend using our pre-built linguistic data, which is available as a ZIP file here.
For more information, see our Python documentation.
Building from source lets you work with Vidyut as a developer and contributor.
(This setup requires cargo. Confirm that you have cargo installed by running cargo --version.)
Once you download the repo, you can run cargo test --all to run unit tests.
$ git clone https://github.com/ambuda-org/vidyut.git
$ cd vidyut
$ cargo test --all
(If you install cargo-nextest, you can also run make test for a nicer testing experience.)
Your first build will likely take a few minutes, but future builds will be much faster.
We recommend using our pre-built linguistic data, which is available as a ZIP file here. Or if you prefer, you can build this data for yourself:
$ cd vidyut-data
$ make create_all_data
Output will be written to data/build/vidyut-latest.
NOTE: this command is resource-intensive and might stall on slower machines.
(This setup requires uv. Confirm that you have uv installed by running uv --version.)
Once you download the repo, you can run make test in the bindings-python directory to run Python-specific unit tests:
$ git clone https://github.com/ambuda-org/vidyut.git
$ cd vidyut/bindings-python
$ make test
make test uses a development build, which compiles more quickly but has worse runtime performance. To create a release build instead, run make release.
Vidyut contains several standard components for common Sanskrit processing tasks. These components work together well, but you can also use them independently depending on your use case.
In Rust, components of this kind are called crates.
vidyut-chandas identifies the meter in some piece of Sanskrit text. This crate is experimental, and while it is useful for common and basic use cases, it is not a state-of-the-art solution.
For details, see the vidyut-chandas README.
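To give a rough sense of what metrical analysis involves, here is a minimal sketch that marks each syllable as light (L) or heavy (G). It is not vidyut-chandas or its API. It assumes the input is in SLP1, where every sound is a single ASCII letter, and it applies only two textbook rules: a syllable is heavy if its vowel is long, or if its vowel is followed by an anusvara, a visarga, or two or more consonants. The function name scan is hypothetical.

```python
# Toy light/heavy scansion over SLP1 text. NOT vidyut-chandas.
SHORT_VOWELS = set("aiufx")
LONG_VOWELS = set("AIUFXeEoO")
VOWELS = SHORT_VOWELS | LONG_VOWELS


def scan(text: str) -> str:
    """Return one 'L' (light) or 'G' (heavy) per syllable in `text`."""
    # Drop spaces and punctuation so that clusters across words are counted.
    sounds = [c for c in text if c.isalpha()]
    pattern = []
    for i, sound in enumerate(sounds):
        if sound not in VOWELS:
            continue
        # Collect every non-vowel sound between this vowel and the next one.
        j = i + 1
        while j < len(sounds) and sounds[j] not in VOWELS:
            j += 1
        following = sounds[i + 1 : j]
        heavy = (
            sound in LONG_VOWELS
            or len(following) >= 2
            or any(s in "MH" for s in following)  # anusvara or visarga
        )
        pattern.append("G" if heavy else "L")
    return "".join(pattern)


print(scan("Darmakzetre kurukzetre"))  # GGGGLGGG
print(scan("Bavati"))  # LLL
```

Identifying a meter would then mean matching such a pattern against known metrical templates, which this sketch does not attempt.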
vidyut-cheda segments Sanskrit expressions into words, then annotates those words with their morphological data. Our segmenter is optimized for real-time and interactive usage: it is fast, low-memory, and capably handles pathological input.
For details, see the vidyut-cheda README.
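The idea of segmentation can be sketched with a tiny word list and a few sandhi rules. The code below is an illustration only and says nothing about how vidyut-cheda works internally. LEXICON, SANDHI, and segment are hypothetical names, the input is romanized, and the three rules cover only the examples used in this README.

```python
# Toy dictionary-driven segmenter. NOT vidyut-cheda.
LEXICON = {"bhavati", "eva", "ca", "iti"}

# (combined sound, end of first word, start of second word)
SANDHI = [("ye", "i", "e"), ("ai", "a", "e"), ("e", "a", "i")]


def segment(text):
    """Return the first split of `text` into LEXICON words, or None."""
    if text in LEXICON:
        return [text]
    for i in range(1, len(text)):
        # Option 1: a plain split with no sound change.
        candidates = [(text[:i], text[i:])]
        # Option 2: undo a sandhi change that begins at position i.
        for combined, left, right in SANDHI:
            if text.startswith(combined, i):
                rest = right + text[i + len(combined):]
                candidates.append((text[:i] + left, rest))
        for first, rest in candidates:
            if first in LEXICON:
                tail = segment(rest)  # `rest` is always shorter than `text`
                if tail is not None:
                    return [first] + tail
    return None


print(segment("bhavatyeva"))  # ['bhavati', 'eva']
print(segment("caiva"))       # ['ca', 'eva']
```

A real segmenter also has to rank competing splits, which is where morphological data and statistics come in.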
vidyut-kosha defines a key-value store that can compactly map tens of millions of Sanskrit words to their inflectional data. Depending on the application, storage costs can be as low as 1 byte per word. This storage efficiency comes at the cost of increased lookup time, but in practice, we have found that this increase is negligible and well worth the efficiency gains elsewhere.
For details, see the vidyut-kosha README.
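One general reason a word store can be compact is that inflected forms share long prefixes. The sketch below illustrates that idea with a plain trie. It is not the data structure or API of vidyut-kosha, and the stored values are made-up labels used only for illustration.

```python
# Toy prefix-sharing key-value store. NOT vidyut-kosha.
class Trie:
    def __init__(self):
        self.children = {}  # maps one character to a child Trie
        self.value = None   # data stored for the key that ends here

    def insert(self, key, value):
        node = self
        for ch in key:
            node = node.children.setdefault(ch, Trie())
        node.value = value

    def get(self, key):
        node = self
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.value

    def count_nodes(self):
        return 1 + sum(c.count_nodes() for c in self.children.values())


kosha = Trie()
kosha.insert("bhavati", "3rd person singular")
kosha.insert("bhavatas", "3rd person dual")
kosha.insert("bhavanti", "3rd person plural")

print(kosha.get("bhavati"))  # 3rd person singular
print(kosha.get("bhavat"))   # None
print(kosha.count_nodes())   # 13 nodes for 23 characters of keys
```

The three keys contain 23 characters in total but need only 13 nodes, because the shared prefix is stored once.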
vidyut-lipi is a transliteration library for Sanskrit and Pali that also supports many of the scripts used within the Indosphere. Our goal is to provide a standard transliterator that is easy to bind to other programming languages.
For details, see the vidyut-lipi README.
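To show what transliteration has to handle, here is a minimal Devanagari-to-IAST sketch in plain Python. It is not vidyut-lipi or its API, and it covers only a handful of characters. The points it illustrates are the inherent vowel "a" on bare consonants, dependent vowel signs, and the virama that suppresses the inherent vowel. The name to_iast is hypothetical.

```python
# Toy Devanagari -> IAST transliterator. NOT vidyut-lipi.
CONSONANTS = {"क": "k", "च": "c", "त": "t", "भ": "bh",
              "म": "m", "य": "y", "व": "v", "स": "s"}
VOWELS = {"अ": "a", "आ": "ā", "इ": "i", "ई": "ī",
          "उ": "u", "ऊ": "ū", "ए": "e", "ऐ": "ai"}
# Dependent vowel signs, written as escapes because they are combining marks.
VOWEL_SIGNS = {"\u093e": "ā", "\u093f": "i", "\u0940": "ī", "\u0941": "u",
               "\u0942": "ū", "\u0947": "e", "\u0948": "ai"}
VIRAMA = "\u094d"


def to_iast(text: str) -> str:
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])  # explicit vowel sign
                i += 1
            elif nxt == VIRAMA:
                i += 1  # virama: no vowel after this consonant
            else:
                out.append("a")  # inherent vowel
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        else:
            out.append(ch)  # pass unknown characters through unchanged
        i += 1
    return "".join(out)


print(to_iast("भू"))       # bhū
print(to_iast("भवत्येव"))  # bhavatyeva
```

A production transliterator must also deal with anusvara, visarga, conjunct-specific conventions, and many more scripts, none of which this sketch attempts.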
vidyut-prakriya generates Sanskrit words with their prakriyās (derivations) according to the rules of Paninian grammar. Our long-term goal is to provide a complete implementation of the Ashtadhyayi.
For details, see the vidyut-prakriya README.
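A prakriyā can be pictured as a list of rule applications, each one transforming the terms built so far. The sketch below derives भवति from भू in SLP1 ("BU" to "Bavati") and records every step. It is a heavily simplified illustration and not vidyut-prakriya's API: the function name is hypothetical, only roots ending in i or u vowels are handled, many intermediate rules are skipped, and the rule numbers in the comments are the ones commonly cited for these steps.

```python
# Toy derivation with a step log. NOT vidyut-prakriya.
GUNA = {"i": "e", "I": "e", "u": "o", "U": "o"}
VOWELS = set("aAiIuUfFxXeEoO")


def derive_present_3sg(root):
    """Return (word, steps) for a simple present-tense form of `root`."""
    terms = [root]
    steps = [("start", " + ".join(terms))]

    def log(rule):
        steps.append((rule, " + ".join(terms)))

    terms.append("ti")  # 3.4.78: third-person singular ending
    log("3.4.78")
    terms.insert(1, "a")  # 3.1.68: the vikarana "a" between root and ending
    log("3.1.68")
    last = terms[0][-1]
    if last in GUNA:  # 7.3.84: guna of the root's final vowel
        terms[0] = terms[0][:-1] + GUNA[last]
        log("7.3.84")
    if terms[1][0] in VOWELS:  # 6.1.78: e/o become ay/av before a vowel
        if terms[0].endswith("o"):
            terms[0] = terms[0][:-1] + "av"
            log("6.1.78")
        elif terms[0].endswith("e"):
            terms[0] = terms[0][:-1] + "ay"
            log("6.1.78")
    return "".join(terms), steps


word, steps = derive_present_3sg("BU")
print(word)  # Bavati
for rule, state in steps:
    print(rule, state)
```

The final logged state for "BU" is "Bav + a + ti", which joins into "Bavati".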
vidyut-sandhi contains various utilities for working with sandhi changes between words. It is fast, simple, and appropriate for most use cases.
For details, see the vidyut-sandhi README.
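Applying and undoing sandhi can be sketched as a lookup table keyed on the sounds at a word boundary. The code below is a toy model and not vidyut-sandhi's API. RULES, join, and split are hypothetical names, the input is romanized, and each character is treated as one sound, so words that begin or end in a digraph such as "ai" are not handled correctly.

```python
# Toy vowel sandhi table. NOT vidyut-sandhi.
# (final sound of first word, initial sound of second word) -> combined sound
RULES = {
    ("a", "a"): "ā",
    ("a", "i"): "e",
    ("a", "u"): "o",
    ("a", "e"): "ai",
    ("i", "a"): "ya",
    ("i", "e"): "ye",
    ("i", "u"): "yu",
}


def join(first, second):
    """Apply sandhi between two words, or keep them apart if no rule fits."""
    key = (first[-1:], second[:1])
    if key in RULES:
        return first[:-1] + RULES[key] + second[1:]
    return first + " " + second


def split(text):
    """Return every (first, second) pair that some rule could have joined."""
    results = []
    for (left, right), combined in RULES.items():
        start = text.find(combined)
        while start != -1:
            first = text[:start] + left
            second = right + text[start + len(combined):]
            results.append((first, second))
            start = text.find(combined, start + 1)
    return results


print(join("ca", "eva"))       # caiva
print(join("bhavati", "eva"))  # bhavatyeva
print(split("caiva"))          # [('ca', 'eva')]
```

Note that split returns candidates without checking that they are real words. For example, split("bhavatyeva") yields both ("bhavati", "eva") and the spurious ("bhavatya", "iva"), which is why undoing sandhi is usually paired with a dictionary.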
Our Rust documentation is available on docs.rs, and our Python documentation is available on readthedocs.org. You can also build our documentation from scratch:
- (Rust) To view documentation for all crates (including private modules and structs), run make docs from the repository root. This command will generate Rust's standard documentation and open it in your default web browser.
- (Python) To view the latest build of our Python documentation, run make docs from the bindings-python directory. This command will write our Python docs to local HTML files, which you should then open manually.
Thank you for considering a contribution to Vidyut! Vidyut is an ambitious and transformative project, and it can grow only with your help.
For all of the details, see our CONTRIBUTING.md file.
If you're excited about our work on Vidyut, we would love to have you join our community.
- Most of our conversation occurs on Ambuda's Discord server on the #vidyut channel, where you can chat directly with our team and get fast answers to your questions. We also schedule time to spend together virtually, usually weekly.
- Occasional discussion related to Vidyut might also appear on ambuda-discuss or on standard mailing lists like sanskrit-programmers.
- You can also follow along with project announcements on ambuda-announce.
- More technical discussions will appear on our issues page.