Bump tqdm from 4.66.2 to 4.66.3 #21

Closed · wants to merge 35 commits

Changes from 1 commit

Commits (35)
506d2fe
llama : expose llama_load_model_from_file_gpt4all
cebtenzzre Nov 24, 2023
88289b5
kompute : fix ggml_vk_device leaks
cebtenzzre Jan 31, 2024
3f7c4b9
kompute : fix c++11 compatibility
cebtenzzre Jan 31, 2024
9d5207b
kompute : enable Pascal GPUs
cebtenzzre Jan 31, 2024
b6891bc
llama : wrap llama_new_context_with_model in try/catch
cebtenzzre Feb 1, 2024
b80287e
kompute : add missing call to ggml_backend_kompute_device_unref
cebtenzzre Feb 1, 2024
dc7a50b
kompute : fix ggml_vk_allocate failure control flow
cebtenzzre Feb 1, 2024
c5014f6
kompute : disable GPU offload for Mixtral
cebtenzzre Feb 5, 2024
c76f5c3
kompute : do not list Intel GPUs as they are unsupported (#14)
cebtenzzre Feb 12, 2024
6ff4387
kompute : make partial tensor copies faster by syncing less data (#15)
cebtenzzre Feb 13, 2024
12dcddc
kompute : disable LLAMA_SPLIT_LAYER after ggerganov/llama.cpp#5321
cebtenzzre Feb 21, 2024
82b50e5
kompute : add gemma, phi-2, qwen2, and stablelm to whitelist
cebtenzzre Feb 21, 2024
a76f5f4
kompute : enable GPU support for 10 more model architectures
cebtenzzre Feb 22, 2024
877851b
llama : fix -Wunused-const-variable warning for non-Kompute build
cebtenzzre Feb 22, 2024
729d661
llama : expose model name and architecture via API
cebtenzzre Mar 5, 2024
2b8cb26
kompute : put device with most VRAM first, not least
cebtenzzre May 1, 2024
6e0b5d9
vulkan : make ggml_vk_instance_init static
cebtenzzre Apr 30, 2024
aea0abe
vulkan : don't filter devices by default, don't abort if none
cebtenzzre Apr 30, 2024
535c7b1
vulkan : implement ggml_vk_available_devices
cebtenzzre Apr 30, 2024
2a91dbf
vulkan : guard against multiple initialization
cebtenzzre May 1, 2024
ad1ab57
rocm : symlink source files so CUDA can be built in the same project
cebtenzzre May 2, 2024
09058b1
cuda : implement ggml_cuda_available_devices
cebtenzzre May 6, 2024
b0ccbe1
kompute : update submodule for install fix
cebtenzzre May 8, 2024
74a41c6
kompute : fix leaks in ggml_vk_current_device
cebtenzzre May 13, 2024
f10326c
kompute : fix use-after-free in ggml_vk_get_device
cebtenzzre May 20, 2024
e5c0df7
llama : replace ngl=0 hack with llama_model_using_gpu
cebtenzzre Jun 4, 2024
159235e
llama : use the correct buffer type when we choose not to load on GPU
cebtenzzre Jul 10, 2024
c301b42
kompute : update for leak fixes, cleanup changes, shaderFloat16
cebtenzzre Jul 18, 2024
7d402b3
kompute : plug a few memory leaks
cebtenzzre Jul 18, 2024
48a830c
common : Kompute supports --main-gpu, do not warn
cebtenzzre Jul 18, 2024
6e0ad3c
kompute : fix dangling references in ggml_vk_graph_kompute
cebtenzzre Jul 18, 2024
c3d5264
kompute : avoid freeing device/instance until absolutely necessary
cebtenzzre Jul 18, 2024
561d0ce
kompute : update ggml_vk_supports_op to fix false pos/neg
cebtenzzre Jul 18, 2024
cd13f44
kompute : fix missing unref on allocation failure
cebtenzzre Jul 18, 2024
3d4c558
Bump tqdm from 4.66.2 to 4.66.3
dependabot[bot] Jul 18, 2024
llama : wrap llama_new_context_with_model in try/catch
This fixes a crash where ggml_vk_allocate fails in llama_kv_cache_init,
but the exception is never caught.
cebtenzzre committed Jul 18, 2024
commit b6891bc9b3298cc53f879aa606a0e9bd96135a9c
14 changes: 13 additions & 1 deletion src/llama.cpp
@@ -18993,7 +18993,7 @@ void llama_free_model(struct llama_model * model) {
     delete model;
 }

-struct llama_context * llama_new_context_with_model(
+static struct llama_context * llama_new_context_with_model_internal(
         struct llama_model * model,
         struct llama_context_params params) {

@@ -19394,6 +19394,18 @@ struct llama_context * llama_new_context_with_model(
     return ctx;
 }

+struct llama_context * llama_new_context_with_model(
+        struct llama_model * model,
+        struct llama_context_params params
+) {
+    try {
+        return llama_new_context_with_model_internal(model, params);
+    } catch (const std::exception & err) {
+        LLAMA_LOG_ERROR("%s: failed to init context: %s\n", __func__, err.what());
+        return nullptr;
+    }
+}
+
 void llama_free(struct llama_context * ctx) {
     delete ctx;
 }
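
For context, a minimal caller-side sketch of how this change is meant to be consumed (this is not code from the PR; the model-path handling and the omission of backend setup are illustrative assumptions): after this commit, a failure inside context creation, such as ggml_vk_allocate throwing during llama_kv_cache_init, surfaces as a logged error and a nullptr return instead of an uncaught exception, so the caller checks the return value and cleans up.

// Hypothetical caller sketch (not part of this diff): checks the nullptr
// return that llama_new_context_with_model now produces on failure.
#include "llama.h"
#include <cstdio>

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <model.gguf>\n", argv[0]);
        return 1;
    }

    // backend initialization and parameter tuning omitted for brevity
    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file(argv[1], mparams);
    if (model == nullptr) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    llama_context_params cparams = llama_context_default_params();
    llama_context * ctx = llama_new_context_with_model(model, cparams);
    if (ctx == nullptr) {
        // before this commit, an allocation failure here could terminate the
        // process via an uncaught exception; now it is logged by llama.cpp
        // and the caller can fail gracefully
        fprintf(stderr, "failed to create context\n");
        llama_free_model(model);
        return 1;
    }

    // ... use the context ...

    llama_free(ctx);
    llama_free_model(model);
    return 0;
}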