std::bad_alloc Exception When Loading Large Model on iOS with MediaPipe #5757
@lightScout, The MediaPipe LLM Inference API offers well-defined usage guidelines. To troubleshoot the issue you're encountering, please refer to the official documentation: LLM Inference. If the problem persists after following the guidelines, please let us know, and we'll be happy to assist further.
Thank you for your prompt response. I appreciate your reference to the official documentation for the MediaPipe LLM Inference API. I have thoroughly reviewed the guidelines provided in the documentation. However, I believe that the guidelines do not fully address the specific issue I am encountering.

As mentioned in my previous message, I am experiencing a std::bad_alloc exception when attempting to load a large .task model (~2.16 GB) using MediaPipe on an iPhone 16 Pro. Despite following the recommended practices in the documentation, the app crashes during model initialisation due to memory allocation issues.

Key points:
Memory constraints on iOS devices: iOS imposes strict per-app memory limits, which seem to be exceeded when loading large models. I understand that mobile devices have inherent limitations, but I was hoping to utilise MediaPipe's capabilities for on-device inference with larger models.

Given that the official guidelines do not cover this scenario in detail, I kindly request further investigation into this issue.
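For what it's worth, the best I can do on my side today is a pre-flight check so the app fails gracefully instead of crashing. This is a minimal sketch, assuming the model file is already on disk; the hasHeadroom helper is hypothetical (mine, not MediaPipe's), and os_proc_available_memory() is the iOS 13+ call that reports the remaining per-process allowance:

import Foundation
import os

// Hypothetical pre-flight check: only attempt the load when the remaining
// per-process memory allowance comfortably exceeds the model file size.
func hasHeadroom(for modelURL: URL, safetyFactor: Double = 1.5) -> Bool {
    let attrs = try? FileManager.default.attributesOfItem(atPath: modelURL.path)
    guard let bytes = (attrs?[.size] as? NSNumber)?.uint64Value else { return false }
    let available = os_proc_available_memory()  // iOS 13+, bytes still allocatable
    return Double(available) > Double(bytes) * safetyFactor
}

This only avoids the crash, of course; it does not make a ~2.16 GB model fit, and the safety factor is a guess since I don't know the engine's peak allocation during initialisation.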
Hi @lightScout, We are actively working with our internal team to diagnose the root cause of the issue. We will provide a resolution as soon as our investigation is complete.
Two thoughts:
This issue has been marked stale because it has had no activity for 7 days. It will be closed if no further activity occurs. Thank you.
I'm experiencing a std::bad_alloc exception when attempting to load a large model (~2.16 GB) using MediaPipe's LLM inference capabilities on an iPhone 16 Pro. The app crashes during model initialization due to what appears to be a memory allocation issue.
Environment:
Device: iPhone 16 Pro
iOS Version: latest
MediaPipe Version: latest
Xcode Version: 16.1
Steps to Reproduce:
Model Preparation:
Use a large .task model file approximately 2.16 GB in size (e.g., Llama-3.2-1b-q8.task).
The model is downloaded at runtime and stored in the app's documents directory to avoid bundling it with the app (a sketch of this download step follows the reproduction steps below).
Model Initialization Code:
Initialize the model using the following code snippet:
init(model: Model) throws {
    // Point the engine at the downloaded .task file in the documents directory.
    let options = LlmInference.Options(modelPath: model.modelFileURL.path)
    options.maxTokens = 512

    // The std::bad_alloc crash appears to occur during this initialization,
    // while the native engine loads the ~2.16 GB model into memory.
    inference = try LlmInference(options: options)

    // Session configuration (sampling temperature and seed).
    let sessionOptions = LlmInference.Session.Options()
    sessionOptions.temperature = 0.2
    sessionOptions.randomSeed = 2222
    session = try LlmInference.Session(llmInference: inference, options: sessionOptions)
}
Run the App:
Launch the app on the iPhone 16 Pro.
The app attempts to initialize the model using the above code.
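For completeness, the runtime download in step 1 looks roughly like this. It is a sketch: downloadModel is a helper of mine (not a MediaPipe API), the remote URL is supplied elsewhere, and error handling is trimmed to the essentials:

import Foundation

// Hypothetical helper: fetch the .task file once and persist it in the
// app's documents directory so it is not bundled with the app.
func downloadModel(from remoteURL: URL,
                   completion: @escaping (Result<URL, Error>) -> Void) {
    let destination = FileManager.default
        .urls(for: .documentDirectory, in: .userDomainMask)[0]
        .appendingPathComponent(remoteURL.lastPathComponent)

    // Reuse an existing copy instead of re-downloading ~2 GB on every launch.
    if FileManager.default.fileExists(atPath: destination.path) {
        completion(.success(destination))
        return
    }

    URLSession.shared.downloadTask(with: remoteURL) { tempURL, _, error in
        if let error { completion(.failure(error)); return }
        guard let tempURL else { return }
        do {
            try FileManager.default.moveItem(at: tempURL, to: destination)
            completion(.success(destination))
        } catch {
            completion(.failure(error))
        }
    }.resume()
}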
Expected Behavior:
The model should initialize successfully, allowing for on-device inference using MediaPipe's LLM capabilities.
Actual Behavior:
The app crashes with a std::bad_alloc exception during model initialization.
Here are the relevant logs and error messages:
Questions:
Is there a recommended way to load large models using MediaPipe on iOS devices without exceeding memory limits?
Are there any best practices or techniques within MediaPipe or TensorFlow Lite to handle large models efficiently on mobile devices?
Can MediaPipe support loading models in a way that mitigates high memory consumption, such as streaming parts of the model or more efficient memory management during initialization?
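In the meantime, the only mitigation I can apply app-side is to drop the session when iOS signals memory pressure and rebuild it lazily before the next prompt. This is a sketch, assuming the MediaPipeTasksGenAI module name from the iOS setup guide and a hypothetical ModelHolder wrapper of mine; it helps the steady-state footprint but obviously cannot prevent the allocation failure during initialization itself:

import UIKit
import MediaPipeTasksGenAI  // assumed module name for the MediaPipe LLM Inference API

// Hypothetical wrapper: releases the session's native buffers under memory
// pressure; callers recreate the session before the next prompt.
final class ModelHolder {
    let inference: LlmInference
    private(set) var session: LlmInference.Session?
    private var observer: NSObjectProtocol?

    init(inference: LlmInference) {
        self.inference = inference
        observer = NotificationCenter.default.addObserver(
            forName: UIApplication.didReceiveMemoryWarningNotification,
            object: nil,
            queue: .main
        ) { [weak self] _ in
            self?.session = nil  // free per-session allocations
        }
    }

    // Rebuild the session on demand after it was dropped under pressure.
    func makeSessionIfNeeded() throws -> LlmInference.Session {
        if let session { return session }
        let s = try LlmInference.Session(llmInference: inference,
                                         options: LlmInference.Session.Options())
        session = s
        return s
    }

    deinit {
        if let observer { NotificationCenter.default.removeObserver(observer) }
    }
}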