Segment and Caption Anything

This repository contains the official implementation of "Segment and Caption Anything".

Project Page, Paper

tl;dr

Despite the absence of semantic labels in the training data, SAM implies high-level semantics sufficient for captioning.
SCA (b) is a lightweight augmentation of SAM (a) with the ability to generate regional captions.
On top of the SAM architecture, we add a fixed pre-trained language model and an optimizable lightweight hybrid feature mixer whose training is cheap and scalable.
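
To make the "fixed vs. optimizable" split above concrete, here is a minimal bookkeeping sketch in plain Python. It is not the repository's code: the `Component` class, the component names, and the sizes (arbitrary units, not measured parameter counts) are all hypothetical, and treating the SAM encoder as frozen is an assumption that the text above does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    num_params: int  # placeholder size in arbitrary units, NOT a measured value
    trainable: bool  # True only for parts that would receive gradient updates


def trainable_fraction(components):
    """Return the share of parameters that an optimizer would update."""
    total = sum(c.num_params for c in components)
    trainable = sum(c.num_params for c in components if c.trainable)
    return trainable / total


# Hypothetical component list; names and sizes are illustrative only.
sca_sketch = [
    # Assumption: the SAM encoder is kept frozen (not stated in the text above).
    Component("sam_image_encoder", 600, trainable=False),
    # The added feature mixer is the optimizable part.
    Component("feature_mixer", 20, trainable=True),
    # The pre-trained language model is fixed.
    Component("language_model", 3000, trainable=False),
]

trainable_names = [c.name for c in sca_sketch if c.trainable]
print("trainable:", trainable_names)
print("trainable fraction:", round(trainable_fraction(sca_sketch), 4))
```

With a split like this, only the mixer's parameters would be handed to the optimizer, which is why training can stay cheap even when the frozen parts are large.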

News

[12/05/2023] Release paper, code v0.0.1, and project page!

Environment Preparation

Please check docs/ENV.md.

Model Zoo

Please check docs/MODEL_ZOO.md.

Gradio Demo

Please check docs/DEMO.md.

Running Training and Inference

Please check docs/USAGE.md.

Experiments and Evaluation

Please check docs/EVAL.md.

Acknowledgement

We deeply appreciate these wonderful open-source projects: transformers, accelerate, deepspeed, detectron2, hydra, timm, gradio.

Citation

If you find this repository useful, please consider giving it a star ⭐ and citing it 🦖:

@misc{xiaoke2023SCA,
  title={{Segment and Caption Anything}},
  author={Huang, Xiaoke and Wang, Jianfeng and Tang, Yansong and Zhang, Zheng and Hu, Han and Lu, Jiwen and Wang, Lijuan and Liu, Zicheng},
  journal={arXiv},
  volume={abs/2312.00869},
  year={2023},
}

About

[CVPR 24] This repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks and a Gradio demo that show how to use the model.
