This is a Swift demo app for SAM 2 Core ML models.
SAM 2 (Segment Anything in Images and Videos) is a collection of foundation models from FAIR that aim to solve promptable visual segmentation in images and videos. See the SAM 2 paper for more information.
Download the repo, compile it with Xcode, and run. The app comes with the Large version of the model, but you can replace it with any of the other supported variants.
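As a rough sketch of what swapping the model involves, the snippet below loads a Core ML model bundled with the app. The resource name `SAM2LargeImageEncoder` is a hypothetical placeholder for illustration, not this app's actual file name; match it to the compiled model you bundle.

```swift
import CoreML

// Minimal sketch: load a bundled SAM 2 Core ML model.
// "SAM2LargeImageEncoder" is a hypothetical resource name; replace it with
// the compiled model (.mlmodelc) you actually ship in the app bundle.
func loadSAM2Model() throws -> MLModel {
    let config = MLModelConfiguration()
    config.computeUnits = .all  // let Core ML schedule CPU / GPU / Neural Engine

    guard let url = Bundle.main.url(forResource: "SAM2LargeImageEncoder",
                                    withExtension: "mlmodelc") else {
        throw CocoaError(.fileNoSuchFile)
    }
    return try MLModel(contentsOf: url, configuration: config)
}
```

Assuming the variants share the same input/output interface, dropping a different model file into the bundle and pointing a loader like this at it should be all the swap requires.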
This demo currently supports images; video support is coming later.
- You can select one or more foreground points to choose objects in the image.
- Use a background point to refine your selection by removing areas.
- You can use a box to select an approximate area that contains the object you're interested in. In box selection mode, the points are treated as refinements of the box rather than as independent selections (see the prompt sketch below).
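The sketch below shows one plausible way to pack these prompts for a SAM-style prompt encoder: points are labeled 1 (foreground) or 0 (background), and a box is encoded as two extra points labeled 2 and 3 for its corners, which is the convention the original SAM models use. The struct, function name, coordinate normalization, and Core ML input shapes are illustrative assumptions, not this app's actual API.

```swift
import CoreML

// Hypothetical prompt representation for illustration only.
struct PointPrompt {
    let x: Float   // assumed normalized [0, 1] image coordinates
    let y: Float
    let label: Int32  // 1 = foreground, 0 = background (SAM convention)
}

// Pack clicked points and an optional box into MLMultiArrays.
// Shapes ([1, N, 2] coords, [1, N] labels) are assumptions about the model.
func makePromptInputs(points: [PointPrompt],
                      box: (topLeft: (Float, Float), bottomRight: (Float, Float))?)
    throws -> (coords: MLMultiArray, labels: MLMultiArray) {
    var all = points
    if let box {
        // In box mode the box anchors the selection; points only refine it.
        all.append(PointPrompt(x: box.topLeft.0, y: box.topLeft.1, label: 2))
        all.append(PointPrompt(x: box.bottomRight.0, y: box.bottomRight.1, label: 3))
    }
    let coords = try MLMultiArray(shape: [1, NSNumber(value: all.count), 2],
                                  dataType: .float32)
    let labels = try MLMultiArray(shape: [1, NSNumber(value: all.count)],
                                  dataType: .int32)
    for (i, p) in all.enumerated() {
        coords[[0, NSNumber(value: i), 0]] = NSNumber(value: p.x)
        coords[[0, NSNumber(value: i), 1]] = NSNumber(value: p.y)
        labels[[0, NSNumber(value: i)]] = NSNumber(value: p.label)
    }
    return (coords, labels)
}
```

Arrays like these would then presumably be fed to the prompt encoder / mask decoder stage alongside the image embedding.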
If you want to use a fine-tuned model, you can convert it using this fork of the SAM 2 repo. Please let us know what you use it for!
Feedback, issues, and PRs are welcome! Please feel free to get in touch.
To cite the SAM 2 paper, model, or software, please use the following:
@article{ravi2024sam2,
  title={SAM 2: Segment Anything in Images and Videos},
  author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
  journal={arXiv preprint arXiv:2408.00714},
  url={https://arxiv.org/abs/2408.00714},
  year={2024}
}