
[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations

Python 124 9 Updated Apr 9, 2024

[CVPR 2024] SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design

Python 76 5 Updated Jun 14, 2024

A new codebase for popular Scene Graph Generation methods (2020). Visualization & Scene Graph Extraction on custom images/datasets are provided. It's also a PyTorch implementation of paper “Unbiase…

Jupyter Notebook 1,085 228 Updated Oct 25, 2024
Python 81 7 Updated Jun 27, 2022

FACTUAL benchmark dataset, the pre-trained textual scene graph parser trained on FACTUAL.

Python 108 12 Updated Nov 7, 2024

[ICCV 2023] Accurate and Fast Compressed Video Captioning

Python 36 4 Updated Feb 18, 2024

Official PyTorch implementation of the paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).

Jupyter Notebook 196 31 Updated Jun 8, 2022

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 136,610 27,351 Updated Dec 22, 2024

[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning

Jupyter Notebook 66 6 Updated Feb 16, 2024
Python 15 1 Updated Oct 8, 2023

Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

Python 2,805 717 Updated Jul 28, 2022

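A repository for pretraining from scratch and then SFT of a small-parameter Chinese LLaMa2; a single 24G GPU is enough to run it and obtain a chat-llama2 with basic Chinese question-answering ability.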

Python 2,578 316 Updated May 21, 2024

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Python 1,572 128 Updated Jun 17, 2024

This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"

Python 528 61 Updated Dec 6, 2023

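A summary of the knowledge a natural language processing (NLP) engineer needs to build up, including interview questions, fundamentals, and engineering skills, to strengthen core competitiveness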

Python 6,990 1,186 Updated Aug 24, 2022

Official pytorch implementation of the AAAI 2021 paper "Semantic Grouping Network for Video Captioning"

Python 51 14 Updated Jul 9, 2021

Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"

Python 237 35 Updated May 26, 2022

A collection of CVPR 2024 papers and open-source projects

18,584 2,607 Updated Jul 4, 2024

[NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

Jupyter Notebook 581 73 Updated Jul 11, 2023

[CVPR2022] Official code for Hierarchical Modular Network for Video Captioning. Our proposed HMN is implemented with PyTorch.

Python 52 9 Updated Sep 30, 2022

ML models and internal tensors 3D visualizer

Python 1,291 133 Updated Aug 8, 2022

PyTorch implementation of video captioning

Python 402 131 Updated Aug 19, 2019

[arXiv22] Disentangled Representation Learning for Text-Video Retrieval

Python 93 5 Updated Apr 7, 2022
Jupyter Notebook 189 61 Updated Oct 12, 2021

Simple program to learn CNN (LeNet-5) in pure C

C++ 270 183 Updated Apr 27, 2017

COMP9321 Data Services Engineering Lab

Python 10 9 Updated Apr 23, 2018