SpaceTimeGPT: A Spatiotemporal Video Captioning Model

Checkpoint: Hugging Face Model Card
Dataset: VATEX
Evaluation: 67.3 CIDEr on the VATEX test set