We're going to combine Grounding-DINO with efficient SAM variants for faster annotation.
- Installation
- Efficient SAM Series
- Run Grounded-FastSAM Demo
- Run Grounded-MobileSAM Demo
- Run Grounded-LightHQSAM Demo
### Installation

- Install Grounded-SAM

- Install Fast-SAM
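The authoritative setup steps live in each project's README; the snippet below is only a hedged sketch of the usual flow, and the repo URLs and editable installs are assumptions to verify against those READMEs.

```bash
# Hedged installation sketch -- follow each repo's README for the authoritative steps.
git clone https://github.com/IDEA-Research/Grounded-Segment-Anything.git
cd Grounded-Segment-Anything

# SAM and Grounding-DINO ship as subfolders of this repo.
python -m pip install -e segment_anything
python -m pip install -e GroundingDINO

# FastSAM lives in its own repo with its own requirements.
git clone https://github.com/CASIA-IVA-Lab/FastSAM.git
python -m pip install -r FastSAM/requirements.txt
```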
### Efficient SAM Series

Here's the list of Efficient SAM variants:
Title | Description | Links |
---|---|---|
FastSAM | The Fast Segment Anything Model (FastSAM) is a CNN-based Segment Anything Model trained on only 2% of the SA-1B dataset published by the SAM authors. FastSAM achieves performance comparable to SAM at 50× higher run-time speed. | [Github] [Demo] |
MobileSAM | MobileSAM performs on par with the original SAM (at least visually) and keeps exactly the same pipeline as the original SAM, except for a change to the image encoder. Specifically, it replaces the original heavyweight ViT-H encoder (632M parameters) with a much smaller Tiny-ViT (5M parameters). On a single GPU, MobileSAM runs in around 12ms per image: 8ms on the image encoder and 4ms on the mask decoder. | [Github] |
Light-HQSAM | Light HQ-SAM is based on the Tiny-ViT image encoder provided by MobileSAM. It adds a learnable High-Quality Output Token, which is injected into SAM's mask decoder and is responsible for predicting the high-quality mask. Instead of applying it only to the mask-decoder features, the token's features are first fused with ViT features for improved mask details. Refer to Light HQ-SAM vs. MobileSAM for more details. | [Github] |
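All three Grounded-X demos share the same two-stage pattern: Grounding-DINO turns a text prompt into boxes, and an efficient SAM variant turns each box into a mask. Below is a minimal sketch of that pattern using MobileSAM (which mirrors the original SAM predictor API); the config/checkpoint paths and thresholds are placeholders rather than the demo defaults, so treat this as illustrative and not a drop-in copy of the demo scripts.

```python
# Hedged sketch of the shared Grounded + efficient-SAM pattern. Paths,
# thresholds, and checkpoint names are placeholders, not the demo defaults.
import torch
from groundingdino.util.inference import load_model, load_image, predict
from mobile_sam import sam_model_registry, SamPredictor

# Stage 1: Grounding-DINO turns a text prompt into candidate boxes.
dino = load_model(
    "GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py",
    "groundingdino_swint_ogc.pth",
)
image_source, image = load_image("assets/demo2.jpg")
boxes, logits, phrases = predict(
    model=dino, image=image, caption="the running dog",
    box_threshold=0.3, text_threshold=0.25,
)

# Stage 2: an efficient SAM variant (here MobileSAM, which keeps the
# original SAM predictor API) turns each box into a mask.
sam = sam_model_registry["vit_t"](checkpoint="./EfficientSAM/mobile_sam.pt")
predictor = SamPredictor(sam.eval())
predictor.set_image(image_source)

# Grounding-DINO returns normalized (cx, cy, w, h) boxes; convert to
# absolute (x0, y0, x1, y1) pixel coordinates for the SAM predictor.
h, w, _ = image_source.shape
boxes = boxes * torch.tensor([w, h, w, h])
xyxy = torch.cat([boxes[:, :2] - boxes[:, 2:] / 2,
                  boxes[:, :2] + boxes[:, 2:] / 2], dim=1)

masks, scores, _ = predictor.predict(box=xyxy[0].numpy(), multimask_output=False)
```

The per-demo scripts below wrap exactly this flow behind command-line flags.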
### Run Grounded-FastSAM Demo

- First, download the pretrained FastSAM weights here.
- Run the demo with the following script:

```bash
cd Grounded-Segment-Anything

python EfficientSAM/grounded_fast_sam.py --model_path "./FastSAM-x.pt" --img_path "assets/demo4.jpg" --text "the black dog." --output "./output/"
```
- And the results will be saved in `./output/` as:
Note: Due to FastSAM's post-processing, only one box can be annotated at a time. If there are multiple box prompts, we currently save one annotated image per box to `./output`; this will be changed in a future release.
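If you want to run the demo across several prompts while this limitation stands, a small shell loop over the demo's existing flags does the job. This wrapper is not part of the repo, just a convenience sketch:

```bash
# Hypothetical wrapper (not part of the repo): run the FastSAM demo once per
# text prompt, writing each run to its own subdirectory of ./output/.
cd Grounded-Segment-Anything
for prompt in "the black dog" "the white dog"; do
    python EfficientSAM/grounded_fast_sam.py \
        --model_path "./FastSAM-x.pt" \
        --img_path "assets/demo4.jpg" \
        --text "${prompt}." \
        --output "./output/${prompt// /_}/"
done
```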
### Run Grounded-MobileSAM Demo

- First, download the pretrained MobileSAM weights here.
- Run the demo with the following script:

```bash
cd Grounded-Segment-Anything

python EfficientSAM/grounded_mobile_sam.py --MOBILE_SAM_CHECKPOINT_PATH "./EfficientSAM/mobile_sam.pt" --SOURCE_IMAGE_PATH "./assets/demo2.jpg" --CAPTION "the running dog"
```
- And the result will be saved as `./gronded_mobile_sam_anontated_image.jpg`:
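To build an overlay like this annotated image from the masks in the earlier pipeline sketch, the `supervision` annotators can be used roughly as follows. The exact annotator API varies between `supervision` versions, so take this as a sketch rather than the demo's actual drawing code:

```python
# Hedged sketch: overlay predicted masks and boxes on the source image.
# `image_source`, `xyxy`, and `masks` come from the pipeline sketch above.
import cv2
import supervision as sv

detections = sv.Detections(
    xyxy=xyxy[:1].numpy(),    # one box, matching the single predicted mask
    mask=masks.astype(bool),  # (1, H, W) boolean mask array
)
annotated = sv.MaskAnnotator().annotate(scene=image_source.copy(), detections=detections)
annotated = sv.BoxAnnotator().annotate(scene=annotated, detections=detections)
cv2.imwrite("annotated.jpg", cv2.cvtColor(annotated, cv2.COLOR_RGB2BGR))
```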
### Run Grounded-LightHQSAM Demo

- First, download the pretrained Light-HQSAM weights here.
- Run the demo with the following script:

```bash
cd Grounded-Segment-Anything

python EfficientSAM/grounded_light_hqsam.py
```
- And the result will be saved as `./gronded_light_hqsam_anontated_image.jpg`:
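Note that `grounded_light_hqsam.py` takes no command-line flags, so its checkpoint and image paths are set inside the script. If you'd rather wire Light-HQSAM into your own code, HQ-SAM's pip distribution exposes a Tiny-ViT variant that follows the same predictor API; the package name, registry key, and checkpoint filename below are assumptions to verify against the HQ-SAM repo:

```python
# Hedged sketch: loading Light-HQSAM through the `segment-anything-hq`
# package (an assumption; the repo demo instead uses its local
# EfficientSAM helpers). The checkpoint filename is a placeholder.
from segment_anything_hq import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_tiny"](checkpoint="./sam_hq_vit_tiny.pth")
predictor = SamPredictor(sam.eval())
# From here on, predictor.set_image(...) and predictor.predict(box=...)
# behave as in the MobileSAM pipeline sketch above.
```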