Personal stuff for robots
https://github.com/baegwangbin/surface_normal_uncertainty
https://colab.research.google.com/drive/1HLjJORchZvzIdl8Mr_mXMYOr7-rhQygW?usp=sharing#scrollTo=TiSitL3XKvJH
Stuff NEEDED and information
EXLLAMA
- DPOpenHermes 7B v2 model
- 35 tokens per second
- 4.5 GB VRAM?
SURFACE NORMAL ESTIMATION
- small pth model
- 0.4 seconds for 1 image?
- 2 GB VRAM? (maybe less)
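
Several of the per-image timings in these notes are guesses (the ones marked with ?). A minimal sketch for pinning them down, assuming each model can be wrapped in a plain Python callable. `seconds_per_call` and `fake_model` are made-up names for illustration, and `fake_model` just sleeps instead of running a real network.

```python
import statistics
import time


def seconds_per_call(fn, *args, warmup=2, runs=10):
    """Return the median wall-clock seconds of fn(*args) over `runs` calls."""
    # Warm-up calls are discarded: first calls often include one-off costs
    # such as lazy initialisation or cache filling.
    for _ in range(warmup):
        fn(*args)

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def fake_model(image):
    """Hypothetical stand-in for a real inference call; sleeps 10 ms."""
    time.sleep(0.01)
    return image


if __name__ == "__main__":
    print(f"{seconds_per_call(fake_model, None):.3f} s per image")
```

For a GPU model in PyTorch, the wrapped callable should call `torch.cuda.synchronize()` before it returns, since CUDA kernels launch asynchronously and the timer would otherwise stop too early.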
OPEN GROUNDING DINO T
- 0.4 seconds for 1 image as well?
- roughly 3.5 GB VRAM?
- 1 GB model file
REP_VIT SAM
- 0.2 seconds for 1 image
- very tiny model
- 2 GB VRAM
CLIP_SEG
- low quality, but zero-shot segmentation
- 0.1 seconds per image
- 2 GB VRAM
LLAMA CPP PYTHON?
- LLaVA model for image-to-text
- decent quality
- image encoding takes a while, then about 30 tokens per second
DEPTH ANYTHING HF
- pretty small model
- accurate
- 0.15 seconds?
PATH ESTIMATION
- self-implemented
- decently accurate
- 0.1 seconds?
probably more but idk
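
To see whether all of this fits on one GPU, the rough numbers above can be added up. A small sketch, using only the estimates from these notes (including the ones marked with ?) and `None` where nothing was written down. The helper names are made up for illustration, and the latency sum assumes the per-image models run one after another on the same frame.

```python
# (name, VRAM in GB or None if not noted, seconds per image or None)
# All figures are the rough estimates from the notes, not measurements.
COMPONENTS = [
    ("EXLLAMA", 4.5, None),
    ("SURFACE NORMAL ESTIMATION", 2.0, 0.4),
    ("OPEN GROUNDING DINO T", 3.5, 0.4),
    ("REP_VIT SAM", 2.0, 0.2),
    ("CLIP_SEG", 2.0, 0.1),
    ("LLAMA CPP PYTHON", None, None),
    ("DEPTH ANYTHING HF", None, 0.15),
    ("PATH ESTIMATION", None, 0.1),
]


def total_vram_gb(components):
    """Sum the VRAM estimates that are known; unknown entries are skipped."""
    return sum(vram for _, vram, _ in components if vram is not None)


def sequential_seconds(components):
    """Sum the per-image times, assuming the models run back to back."""
    return sum(secs for _, _, secs in components if secs is not None)


def missing_vram(components):
    """List the components with no VRAM estimate yet."""
    return [name for name, vram, _ in components if vram is None]


if __name__ == "__main__":
    print(f"VRAM (known parts): {total_vram_gb(COMPONENTS):.1f} GB")
    print(f"Per image, sequential: {sequential_seconds(COMPONENTS):.2f} s")
    print("No VRAM estimate:", ", ".join(missing_vram(COMPONENTS)))
```

With the numbers as written, the known parts add up to about 14 GB of VRAM and 1.35 seconds per image, before counting the three entries that have no VRAM estimate.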