⏳ tiktoken

tiktoken is a BPE tokeniser for use with OpenAI's models, forked from the original tiktoken library to provide NPM bindings for Node and other JS runtimes.

import assert from "node:assert";
import { get_encoding, encoding_for_model } from "@dqbd/tiktoken";

// Round-trip check: encode a string to tokens, decode back to bytes,
// then convert the bytes to text.
const enc = get_encoding("gpt2");
assert(
  new TextDecoder().decode(enc.decode(enc.encode("hello world"))) ===
    "hello world"
);

// To get the tokeniser corresponding to a specific model in the OpenAI API:
const modelEnc = encoding_for_model("text-davinci-003");
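A common use of an encoder is counting tokens before sending text to a model. The helpers below are a minimal sketch and not part of this package: `countTokens`, `fitsBudget`, and `fakeEnc` are hypothetical names. They assume only that the encoder exposes `encode(text)` returning an array-like of token ids, as in the example above.

```javascript
// Hypothetical helpers, not part of @dqbd/tiktoken.
// Assumption: `enc.encode(text)` returns an array-like of token ids.
function countTokens(enc, text) {
  return enc.encode(text).length;
}

// True when `text` encodes to at most `maxTokens` tokens.
function fitsBudget(enc, text, maxTokens) {
  return countTokens(enc, text) <= maxTokens;
}

// Stand-in encoder so the sketch runs without the package installed.
// It emits one dummy id per whitespace-separated word, which is NOT how BPE works.
const fakeEnc = {
  encode: (text) =>
    Uint32Array.from(text.split(/\s+/).filter(Boolean), (_, i) => i),
};

console.log(countTokens(fakeEnc, "hello world"));
console.log(fitsBudget(fakeEnc, "hello world", 1));
```

With the real package, an encoder from `get_encoding("gpt2")` or `encoding_for_model(...)` would be passed in place of `fakeEnc`; the counts will then differ from this word-based stand-in.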

The package can be installed from npm:

npm install @dqbd/tiktoken

Acknowledgements