
std::bad_alloc Exception When Loading Large Model on iOS with MediaPipe #5757

Open
lightScout opened this issue Nov 27, 2024 · 5 comments

Comments

@lightScout

I'm experiencing a std::bad_alloc exception when attempting to load a large model (~2.16 GB) using MediaPipe's LLM inference capabilities on an iPhone 16 Pro. The app crashes during model initialization due to what appears to be a memory allocation issue.

Environment:

Device: iPhone 16 Pro
iOS Version: latest
MediaPipe Version: latest
Xcode Version: 16.1

Steps to Reproduce:

Model Preparation:

Use a large .task model file approximately 2.16 GB in size (e.g., Llama-3.2-1b-q8.task).
The model is downloaded at runtime and stored in the app's Documents directory to avoid bundling it with the app (a minimal download sketch is included after these steps).
Model Initialization Code:

Initialize the model using the following code snippet:

init(model: Model) throws {
    // Configure the engine with the on-disk .task file and a token budget.
    let options = LlmInference.Options(modelPath: model.modelFileURL.path)
    options.maxTokens = 512

    // Creating the engine loads the model weights; the crash occurs during this initialization.
    inference = try LlmInference(options: options)

    // Session-level sampling parameters.
    let sessionOptions = LlmInference.Session.Options()
    sessionOptions.temperature = 0.2
    sessionOptions.randomSeed = 2222
    session = try LlmInference.Session(llmInference: inference, options: sessionOptions)
}
Run the App:

Launch the app on the iPhone 16 Pro.
The app attempts to initialize the model using the above code.
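As noted under Model Preparation, the model is fetched at runtime. The download is handled roughly like this (a minimal sketch, not the exact code from the app; the remote URL is a placeholder):

import Foundation

// Downloads the .task model once and keeps it in the app's Documents directory.
// The remote URL below is a placeholder.
func ensureModelOnDisk() async throws -> URL {
    let fileName = "Llama-3.2-1b-q8.task"
    let documents = try FileManager.default.url(for: .documentDirectory,
                                                in: .userDomainMask,
                                                appropriateFor: nil,
                                                create: true)
    let destination = documents.appendingPathComponent(fileName)

    // Reuse the file if a previous launch already downloaded it.
    if FileManager.default.fileExists(atPath: destination.path) {
        return destination
    }

    // URLSession streams the download to a temporary file, so the ~2 GB model
    // is never held in memory during the transfer.
    let remoteURL = URL(string: "https://example.com/models/\(fileName)")!
    let (tempURL, _) = try await URLSession.shared.download(from: remoteURL)
    try FileManager.default.moveItem(at: tempURL, to: destination)
    return destination
}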
Expected Behavior:

The model should initialize successfully, allowing for on-device inference using MediaPipe's LLM capabilities.
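Once init(model:) succeeds, the intent is to run prompts against the session roughly as follows (a sketch; it assumes the session API from the LLM Inference guide, i.e. addQueryChunk(inputText:) followed by generateResponse()):

// Continuation of the same type that owns `session` from the snippet above.
func respond(to prompt: String) throws -> String {
    try session.addQueryChunk(inputText: prompt)
    return try session.generateResponse()
}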
Actual Behavior:

The app crashes with a std::bad_alloc exception during model initialization.

Here are the relevant logs and error messages:

normalizer.cc(52) LOG(INFO) precompiled_charsmap is empty. use identity normalization.
Initialized TensorFlow Lite runtime.
INFO: Initialized TensorFlow Lite runtime.
Created TensorFlow Lite XNNPACK delegate for CPU.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
XNNPack weight cache: could not load '/private/var/mobile/Containers/Data/Application/9D9A1F20-9F60-412D-8A23-21BC0BF22DF1/tmp/Llama-3.2-1b-q8.task.xnnpack_cache': No such file or directory.
WARNING: XNNPack weight cache: could not load '/private/var/mobile/Containers/Data/Application/9D9A1F20-9F60-412D-8A23-21BC0BF22DF1/tmp/Llama-3.2-1b-q8.task.xnnpack_cache': No such file or directory.
libc++abi: terminating due to uncaught exception of type std::bad_alloc: std::bad_alloc

libsystem_kernel.dylib`__pthread_kill:
    0x1ef0db26c <+0>:  mov    x16, #0x148               ; =328 
    0x1ef0db270 <+4>:  svc    #0x80
->  0x1ef0db274 <+8>:  b.lo   0x1ef0db294               ; <+40>
    0x1ef0db278 <+12>: pacibsp 
    0x1ef0db27c <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1ef0db280 <+20>: mov    x29, sp
    0x1ef0db284 <+24>: bl     0x1ef0d6348               ; cerror_nocancel
    0x1ef0db288 <+28>: mov    sp, x29
    0x1ef0db28c <+32>: ldp    x29, x30, [sp], #0x10
    0x1ef0db290 <+36>: retab  
    0x1ef0db294 <+40>: ret    

Is there a recommended way to load large models using MediaPipe on iOS devices without exceeding memory limits?

Are there any best practices or techniques within MediaPipe or TensorFlow Lite to handle large models efficiently on mobile devices?

Can MediaPipe support loading models in a way that mitigates high memory consumption, such as streaming parts of the model or more efficient memory management during initialization?

@lightScout lightScout added the type:others issues not falling in bug, performance, support, build and install or feature label Nov 27, 2024
@kalyan2789g kalyan2789g added platform:ios MediaPipe IOS issues task:LLM inference Issues related to MediaPipe LLM Inference Gen AI setup labels Nov 28, 2024
@kalyan2789g
Collaborator

@lightScout, the MediaPipe LLM Inference API offers well-defined usage guidelines. To troubleshoot the issue you're encountering, please refer to the official documentation: LLM Inference. If the problem persists after following the guidelines, please let us know, and we'll be happy to assist further.
Thanks,
@kalyan2789g

@kalyan2789g kalyan2789g added the stat:awaiting response Waiting for user response label Nov 28, 2024
@lightScout
Author

@kalyan2789g

Thank you for your prompt response. I appreciate your reference to the official documentation for the MediaPipe LLM Inference API. I have thoroughly reviewed the guidelines provided in the documentation. However, I believe that the guidelines do not fully address the specific issue I am encountering.

As mentioned in my previous message, I am experiencing a std::bad_alloc exception when attempting to load a large .task model (~2.16 GB) using MediaPipe on an iPhone 16 Pro. Despite following the recommended practices in the documentation, the app crashes during model initialisation due to memory allocation issues.

Key Points:

Memory Constraints on iOS Devices:

iOS imposes strict per-app memory limits, which appear to be exceeded when loading a model of this size (a small headroom check is sketched at the end of this comment).
The documentation does not provide guidance on handling models of this size within the memory constraints of mobile devices.

I understand that mobile devices have inherent limitations, but I was hoping to utilise MediaPipe's capabilities for on-device inference with larger models. Given that the official guidelines do not cover this scenario in detail, I kindly request further investigation into this issue.
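For what it's worth, one way to see how close the load gets to the per-app limit is to log the remaining headroom immediately before constructing LlmInference. A minimal sketch, assuming os_proc_available_memory() (iOS 13+) is acceptable here:

import os

// Logs how much memory the OS will still let this process allocate (iOS 13+).
// A ~2.16 GB .task file plus runtime buffers can plausibly exceed this headroom,
// which would surface as std::bad_alloc during initialization.
func logMemoryHeadroomBeforeModelLoad() {
    let availableBytes = os_proc_available_memory()
    let availableGiB = Double(availableBytes) / 1_073_741_824
    print(String(format: "Available app memory before model load: %.2f GiB", availableGiB))
}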

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Waiting for user response label Nov 28, 2024
@kalyan2789g
Collaborator

Hi @lightScout, we are actively working with our internal team to diagnose the root cause of the issue. We will provide a resolution as soon as our investigation is complete.
Thanks,
@kalyan2789g

@schmidt-sebastian
Collaborator

Two thoughts:

@schmidt-sebastian schmidt-sebastian added the stat:awaiting response Waiting for user response label Dec 13, 2024

This issue has been marked stale because it has had no recent activity in the last 7 days. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label Dec 21, 2024