feat(ctranslate): initial infrastructure support (bentoml#694)
* perf: compact and improve speed and agility

Signed-off-by: Aaron <[email protected]>

* --wip--

Signed-off-by: Aaron <[email protected]>

* chore: cleanup infrastructure

Signed-off-by: Aaron <[email protected]>

* chore: update styles notes and autogen mypy configuration

Signed-off-by: Aaron <[email protected]>

---------

Signed-off-by: Aaron <[email protected]>
aarnphm authored Nov 19, 2023
1 parent 93ffb29 commit 206521e
Showing 38 changed files with 506 additions and 641 deletions.
1 change: 1 addition & 0 deletions .gitattributes
@@ -12,6 +12,7 @@ openllm-python/CHANGELOG.md linguist-generated=true

# Others
Formula/openllm.rb linguist-generated=true
+mypy.ini linguist-generated=true

* text=auto eol=lf
# Needed for setuptools-scm-git-archive
File renamed without changes.
6 changes: 6 additions & 0 deletions DEVELOPMENT.md
@@ -205,6 +205,12 @@ See [these docs](/.github/INFRA.md) for more information on OpenLLM's CI/CD workflow.
## Typing
For all internal functions, it is recommended to provide type hints. For all public function definitions, it is recommended to create a stubs file (`.pyi`) that separates the supported external API from the implementation, to increase code visibility. See [openllm-client's `__init__.pyi`](/openllm-client/src/openllm_client/__init__.pyi) for an example.
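
For instance, a paired implementation and stubs file might look like the following minimal sketch (the module and function names are hypothetical, not OpenLLM's actual API):

```python
# _service.py (implementation, hypothetical module)
def generate(prompt, **attrs):
  ...

# _service.pyi (companion stubs file declaring the supported API)
# import typing as t
# def generate(prompt: str, **attrs: t.Any) -> str: ...
```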

For internal helpers or any functions and utilities prefixed with `_`, it is recommended to provide inline annotations instead. See [STYLE.md](./STYLE.md) to learn more about the style and typing philosophy.

If you want to update any mypy configuration, please update [`./tools/update-mypy.py`](./tools/update-mypy.py), as `mypy.ini` is generated.

If you need to update the pyright configuration, please update [`pyrightconfig.json`](./pyrightconfig.json) directly.
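
Since `mypy.ini` is generated, a typical flow is to edit the tool and then rerun it (a sketch, assuming the scripts take no arguments):

```bash
python ./tools/update-mypy.py          # regenerate mypy.ini
python ./tools/update-config-stubs.py  # regenerate the config stubs
```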

## Install from git archive

16 changes: 0 additions & 16 deletions README.md
@@ -503,14 +503,6 @@ openllm start tiiuae/falcon-7b --backend pt
### Quickstart
-> **Note:** FlanT5 requires to install with:
-> ```bash
-> pip install "openllm[flan-t5]"
-> ```
Run the following command to quickly spin up a FlanT5 server:
@@ -869,14 +861,6 @@ TRUST_REMOTE_CODE=True openllm start mosaicml/mpt-7b --backend pt
### Quickstart
-> **Note:** OPT requires to install with:
-> ```bash
-> pip install "openllm[opt]"
-> ```
Run the following command to quickly spin up an OPT server:
69 changes: 58 additions & 11 deletions STYLE.md
@@ -1,4 +1,4 @@
## the coding style.

This documentation serves as a brief discussion of the coding style used for
OpenLLM. As you have noticed, it is different from the conventional
@@ -48,14 +48,16 @@ rather the brevity of expression. (it enables
[expository programming](http://archive.vector.org.uk/art10000980), combining
with prototyping new ideas and logic within model implementations)

## some guidelines.

Though I have stopped using deterministic formatters and linters, I do understand
that people have preferences for using these tools, and they play nicely with IDEs
and editors. As such, I included a [`pyproject.toml`](./pyproject.toml) file
that specifies some configuration to make those tools compliant with
the repository's style. In short, I'm using `ruff` for both linting and formatting,
`mypy` for type checking, and a `pyright`-compatible configuration is provided for
those who wish to use VSCode or the `pyright` LSP.
Since we manage everything via `hatch`, refer back to
[DEVELOPMENT.md](./DEVELOPMENT.md) for more information on this.

Over time, Python has incorporated a lot of features that support this style of
@@ -68,7 +70,7 @@ somewhat, type-safe. Since there is no real type-safety when working with
Python, typing should be a best-effort to make sure we don't introduce too many
bugs.

### naming.

- follow the Python standard for this; I don't have a strong opinion on it. Just
  make sure that it is descriptive, and the abbreviation describes the intent of
@@ -84,7 +86,7 @@ bugs.

_If you have any suggestions, feel free to share them on our Discord server!_

### layout.

- Preferably not a lot of whitespace, but rather flowing. If you can fit
  everything for an `if`, `def`, or `return` within one line, then there's no need
@@ -108,17 +110,18 @@ _If you have any suggestions, feel free to give it on our discord server!_

- With regard to writing operators, try to follow the domain-specific notation.
  E.g., when writing pathlib, don't add spaces, since that is not how you
  write a path in the terminal. `ruff format` will try to accommodate some of
  these changes.

- Avoid trailing whitespace.

- Use array-, PyTorch-, or NumPy-based indexing where possible.

- If you need to export anything, put it in `__all__` or do a lazy export for
  the type checker. See [OpenLLM's `__init__.py`](./openllm-python/src/openllm/__init__.py)
  for an example of how to lazily export a module; a sketch of the pattern
  follows this list.
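
A minimal sketch of that lazy-export pattern via a module-level `__getattr__` ([PEP 562](https://peps.python.org/pep-0562/)); the name-to-submodule mapping is illustrative, not OpenLLM's real layout:

```python
import importlib
import typing as t

_lazy = {'LLM': '._llm', 'Runner': '._runners'}  # hypothetical mapping
__all__ = list(_lazy)

def __getattr__(name: str) -> t.Any:
  if name in _lazy:
    # import the submodule on first attribute access only
    return getattr(importlib.import_module(_lazy[name], __package__), name)
  raise AttributeError(f'module {__name__!r} has no attribute {name!r}')
```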

### misc.

- Import aliases should be concise and descriptive. A convention is to always
  `import typing as t`.
@@ -129,20 +132,64 @@ _If you have any suggestions, feel free to give it on our discord server!_
MDX and will be hosted on GitHub Pages, so stay tuned!
- If anything is not used at runtime, just put it under `t.TYPE_CHECKING`.

### note on codegen.

- We also do some codegen for some of the assignment functions. This logic is
  largely based on the work of [attrs](https://github.com/python-attrs/attrs) to
  ensure fast and isolated codegen in Python. If you need codegen but don't know
  how it works, feel free to mention @aarnphm on Discord! A toy sketch of the
  idea follows.
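
A toy sketch of the codegen idea (build the source for a function, `exec` it in an isolated namespace, then pull the function out), loosely in the spirit of attrs rather than OpenLLM's actual helpers:

```python
import typing as t

def make_init(fields: t.Sequence[str]) -> t.Callable[..., None]:
  # build the source of an __init__ that assigns each field
  args = ', '.join(fields)
  body = '\n'.join(f'  self.{f} = {f}' for f in fields) or '  pass'
  src = f'def __init__(self, {args}):\n{body}\n'
  ns: dict = {}
  exec(compile(src, '<codegen>', 'exec'), ns)  # isolated namespace keeps globals clean
  return ns['__init__']

# usage: Point = type('Point', (), {'__init__': make_init(['x', 'y'])})
```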

### types.

I do believe in static type checking, and oftentimes all of the code in OpenLLM is safely typed.
Types play nicely with static analysis tools, and they are a great way to catch bugs in downstream
applications. In Python, there are two ways of doing static typing:

1. Stubs files (recommended)

If you have seen files that end with `.pyi`, those are stubs files. Stubs files are a great format
for specifying types for an external API, and a great way to separate the implementation from
the API. For example, if you want to specify the type for `openllm_client.Client`, you can create
a stubs file `openllm_client/__init__.pyi` and specify the type there, as sketched below.
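
A heavily trimmed sketch of what such a stubs file could declare (the `Client` signature below is illustrative, not the real API surface):

```python
import typing as t

class Client:
  def __init__(self, address: str, timeout: int = ...) -> None: ...
  def query(self, prompt: str, **attrs: t.Any) -> t.Any: ...
```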

A few examples include the [`openllm.LLM` type definitions](./openllm-python/src/openllm/_llm.pyi) versus
the [actual implementation](./openllm-python/src/openllm/_llm.py).

> Therefore, if you touch any public API, make sure to also add or update the corresponding stubs files.
2. Inline annotations (encouraged, not required)

Inline annotations are great for specifying types for internal functions. For example:
```python
def _resolve_internal_converter(llm: LLM, type_: str) -> Converter: ...
```

This is not always required. If the internal function is expressive enough, and
the variable names are descriptive enough that there is no type ambiguity, then
it is not required to specify the types. For example:
```python
import torch

rms_norm = lambda tensor, eps=1e-6: tensor * torch.rsqrt(tensor.square().mean(dim=-1, keepdim=True) + eps)
```
As you can see, the function computes the RMSNorm of a given torch tensor.

#### note on `TYPE_CHECKING` block.

As you can see, we also incorporate `TYPE_CHECKING` blocks in various places.
This provides some nice inline type checking during development. Usually I think
it is nice to have, but once the files get more and more complex, it is better
to just provide a stubs file instead.
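
For example, a minimal sketch of the `TYPE_CHECKING` pattern (the function is hypothetical):

```python
from __future__ import annotations
import typing as t

if t.TYPE_CHECKING:
  import torch  # visible to the type checker only, never imported at runtime

def to_device(tensor: torch.Tensor, device: str) -> torch.Tensor:
  return tensor.to(device)
```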

## FAQ

### Why not use `black`?

`black` is used on our other projects, but I rather find `black` to be very
verbose, and over time it is annoying to work with too much whitespace.

Personally, I think four-space indentation is a mistake, as in some cases
four-space code is harder to read than two-space code.

### Why not PEP8?

PEP8 is great if you are writing a library such as this, but I'm going to do a lot

@@ -152,7 +199,7 @@ probably not fit here, and want to explore a more expressive style.
### Editor is complaining about the style, what should I do?

I kindly ask you to disable linting for this project 🤗. I will try my best to
accommodate for ruff and yapf, but I don't want to spend too much time on this.
It is pretty straightforward to disable it in your editor with a quick search.

### Might this style put off new contributors?
10 changes: 10 additions & 0 deletions all.sh
@@ -0,0 +1,10 @@
#!/usr/bin/env bash

printf "Running mirror.sh\n"
bash ./tools/mirror.sh
printf "Running update-mypy.py\n"
python ./tools/update-mypy.py
printf "Running update-config-stubs.py\n"
python ./tools/dependencies.py
printf "Running dependencies.py\n"
python ./tools/update-config-stubs.py
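
The script is wired into the default hatch environment's `quality` target below, but it can also be run directly from the repository root:

```bash
bash ./all.sh
```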
99 changes: 43 additions & 56 deletions hatch.toml
@@ -1,66 +1,53 @@
[envs.default]
dependencies = [
  "openllm-core @ {root:uri}/openllm-core",
  "openllm-client @ {root:uri}/openllm-client",
-  "openllm[opt,chatglm,fine-tune] @ {root:uri}/openllm-python",
+  "openllm[chatglm,fine-tune] @ {root:uri}/openllm-python",
  # NOTE: To run all hooks
  "pre-commit",
  # NOTE: towncrier for changelog
  "towncrier",
  # NOTE: Using under ./tools/update-optional-dependencies.py
  "tomlkit",
  # NOTE: For fancy PyPI readme
  "hatch-fancy-pypi-readme",
  # NOTE: For working with shell pipe
  "plumbum",
  # The below sync with mypyc deps and pre-commit mypy
  "types-psutil",
  "types-tabulate",
  "types-PyYAML",
  "types-protobuf",
]
[envs.default.scripts]
changelog = "towncrier build --version main --draft"
-check-stubs = ["./tools/update-config-stubs.py"]
inplace-changelog = "towncrier build --version main --keep"
-quality = [
-  "./tools/dependencies.py",
-  "- ./tools/update-brew-tap.py",
-  "check-stubs",
-  "bash ./tools/mirror.sh",
-  "- pre-commit run --all-files",
-  "- pnpm format",
-]
-setup = [
-  "pre-commit install",
-  "- ln -s .python-version-default .python-version",
-  "curl -fsSL https://raw.githubusercontent.com/clj-kondo/clj-kondo/master/script/install-clj-kondo | bash -",
-]
-tool = ["quality", "bash ./clean.sh", "bash ./compile.sh {args}"]
typing = [
-  "- pre-commit run mypy {args:-a}",
-  "- pre-commit run pyright {args:-a}",
+  "pre-commit install",
+  "- ln -s .python-version-default .python-version",
]
+quality = ["bash ./all.sh", "- pre-commit run --all-files", "- pnpm format"]
+tool = ["quality", "bash ./clean.sh", 'python ./cz.py']
[envs.tests]
dependencies = [
"openllm-core @ {root:uri}/openllm-core",
"openllm-client @ {root:uri}/openllm-client",
"openllm[opt,chatglm,fine-tune] @ {root:uri}/openllm-python",
# NOTE: interact with docker for container tests.
"docker",
# NOTE: Tests strategies with Hypothesis and pytest, and snapshot testing with syrupy
"coverage[toml]>=6.5",
"filelock>=3.7.1",
"pytest",
"pytest-cov",
"pytest-mock",
"pytest-randomly",
"pytest-rerunfailures",
"pytest-asyncio>=0.21.0",
"pytest-xdist[psutil]",
"trustme",
"hypothesis",
"syrupy",
"openllm-core @ {root:uri}/openllm-core",
"openllm-client @ {root:uri}/openllm-client",
"openllm[chatglm,fine-tune] @ {root:uri}/openllm-python",
# NOTE: interact with docker for container tests.
"docker",
# NOTE: Tests strategies with Hypothesis and pytest, and snapshot testing with syrupy
"coverage[toml]>=6.5",
"filelock>=3.7.1",
"pytest",
"pytest-cov",
"pytest-mock",
"pytest-randomly",
"pytest-rerunfailures",
"pytest-asyncio>=0.21.0",
"pytest-xdist[psutil]",
"trustme",
"hypothesis",
"syrupy",
]
skip-install = false
template = "tests"
@@ -91,10 +78,10 @@ clojure = ["bash external/clojure/run-clojure-ui.sh"]
detached = true
[envs.ci.scripts]
client-stubs = "bash openllm-client/generate-grpc-stubs"
compile = "bash ./compile.sh {args}"
compile = "bash ./tools/compile.sh {args}"
recompile = ["bash ./clean.sh", "compile"]
edi = "bash local.sh"
lock = [
  "bash tools/lock-actions.sh",
  "pushd external/clojure && pnpm i --frozen-lockfile",
]
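
Assuming the script names above read correctly from the diff, day-to-day usage stays behind hatch, e.g.:

```bash
hatch run quality    # bash ./all.sh, then pre-commit and pnpm format
hatch run changelog  # draft changelog via towncrier
```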
5 changes: 3 additions & 2 deletions mypy.ini

Some generated files are not rendered by default.

3 changes: 2 additions & 1 deletion openllm-client/src/openllm_client/_utils.py
@@ -2,7 +2,8 @@


def __dir__():
-  return dir(openllm_core.utils)
+  coreutils = set(dir(openllm_core.utils)) | set([it for it in openllm_core.utils._extras if not it.startswith('_')])
+  return sorted(list(coreutils))


def __getattr__(name):
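
The truncated `__getattr__` body above presumably keeps forwarding attribute access to `openllm_core.utils`; a sketch of that pattern, not the verbatim implementation:

```python
import openllm_core.utils

def __getattr__(name):
  try:
    return getattr(openllm_core.utils, name)
  except AttributeError:
    raise AttributeError(f'module {__name__!r} has no attribute {name!r}') from None
```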
2 changes: 1 addition & 1 deletion openllm-client/src/openllm_client/_utils.pyi
@@ -19,6 +19,7 @@ from openllm_core.utils import (
generate_hash_from_file as generate_hash_from_file,
get_debug_mode as get_debug_mode,
get_quiet_mode as get_quiet_mode,
+getenv as getenv,
in_notebook as in_notebook,
lenient_issubclass as lenient_issubclass,
reserve_free_port as reserve_free_port,
@@ -40,7 +41,6 @@ from openllm_core.utils.import_utils import (
is_jupyter_available as is_jupyter_available,
is_jupytext_available as is_jupytext_available,
is_notebook_available as is_notebook_available,
-is_optimum_supports_gptq as is_optimum_supports_gptq,
is_peft_available as is_peft_available,
is_torch_available as is_torch_available,
is_transformers_available as is_transformers_available,
4 changes: 2 additions & 2 deletions openllm-core/src/openllm_core/_typing_compat.py
@@ -30,10 +30,10 @@ def get_literal_args(typ: t.Any) -> tuple[str, ...]:
TupleAny = t.Tuple[t.Any, ...]
At = t.TypeVar('At', bound=attr.AttrsInstance)

-LiteralDtype = t.Literal['float16', 'float32', 'bfloat16']
+LiteralDtype = t.Literal['float16', 'float32', 'bfloat16', 'int8', 'int16']
LiteralSerialisation = t.Literal['safetensors', 'legacy']
LiteralQuantise = t.Literal['int8', 'int4', 'gptq', 'awq', 'squeezellm']
-LiteralBackend = t.Literal['pt', 'vllm', 'ggml', 'mlc']
+LiteralBackend = t.Literal['pt', 'vllm', 'ctranslate', 'ggml', 'mlc']
AdapterType = t.Literal[
'lora', 'adalora', 'adaption_prompt', 'prefix_tuning', 'p_tuning', 'prompt_tuning', 'ia3', 'loha', 'lokr'
]
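
Widening these `Literal` types means anything that enumerates backends picks up `ctranslate` automatically, e.g. via plain `typing.get_args` (the module's own `get_literal_args` helper presumably wraps the same idea):

```python
import typing as t

LiteralBackend = t.Literal['pt', 'vllm', 'ctranslate', 'ggml', 'mlc']
print(t.get_args(LiteralBackend))  # ('pt', 'vllm', 'ctranslate', 'ggml', 'mlc')
```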
@@ -24,7 +24,7 @@ class BaichuanConfig(openllm_core.LLMConfig):
'trust_remote_code': True,
'timeout': 3600000,
'url': 'https://github.com/baichuan-inc/Baichuan-7B',
-'requirements': ['cpm-kernels', 'sentencepiece'],
+'requirements': ['cpm-kernels'],
'architecture': 'BaiChuanForCausalLM',
# NOTE: See the following
# https://huggingface.co/baichuan-inc/Baichuan-13B-Chat/blob/19ef51ba5bad8935b03acd20ff04a269210983bc/modeling_baichuan.py#L555
@@ -30,7 +30,7 @@ class ChatGLMConfig(openllm_core.LLMConfig):
'trust_remote_code': True,
'timeout': 3600000,
'url': 'https://github.com/THUDM/ChatGLM-6B',
-'requirements': ['cpm-kernels', 'sentencepiece'],
+'requirements': ['cpm-kernels'],
'architecture': 'ChatGLMModel',
'default_id': 'thudm/chatglm-6b',
'model_ids': [
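
For context, these metadata dicts drive each model's defaults. A hypothetical config in the same shape (the enclosing attribute name is truncated in the diff above and assumed here to be `__config__`):

```python
import openllm_core

class MyModelConfig(openllm_core.LLMConfig):
  __config__ = {
    'trust_remote_code': False,
    'timeout': 3600000,
    'url': 'https://example.com/my-model',
    'requirements': [],
    'architecture': 'MyModelForCausalLM',
    'default_id': 'org/my-model-7b',
    'model_ids': ['org/my-model-7b'],
  }
```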
