Pinned repositories (reilxlx)

  1. llava-Qwen2-7B-Instruct-Chinese-CLIP (Public)

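    The llava-Qwen2-7B-Instruct-Chinese-CLIP model improves Chinese text recognition and the ability to interpret the implied meaning of memes, approaching the recognition level of gpt4o and claude-3.5-sonnet.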

    Python · 18 stars · 3 forks

  2. chinese-meme-description-dataset (Public)

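    To improve the performance of small models on image description tasks, this work combines two high-quality Chinese meme datasets and annotates them with six large language models (LLMs): Gemini-1.5-pro, Gemini-1.5-flash, Gemini-1.0-pro-vision, gpt4o, claude-3.5-sonnet, and Yi-Vision. The result is a set of rich, high-quality image-text descriptions.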

    Python · 4 stars · 1 fork

  3. ImageText-Question-answer-pairs-58K-Claude-3.5-Sonnnet (Public)

    21,717 images were randomly selected from the VisualGenome dataset V1.2. Using the Claude-3-opus-20240229 and Claude-3.5-sonnet-20240620 models, a total of 58,312 question-answer pairs were generated,…

    Python

  4. podcast-player (Public)

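    A podcast player built with PyQt5, with audio playback, subtitle display, and real-time translation. It supports three translation services (Google Translate, Gemini, and SiliconCloud), shows bilingual subtitles in real time, and highlights words in sync with the audio. It also provides subtitle caching, playback history, and related features, giving users a smooth podcast-based learning experience.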

    Python · 4 stars

  5. VisualDataset100K (Public)

    VisualDataset100K: A comprehensive image question-answering dataset created using large vision-language models. It includes 100K detailed image descriptions, 100K & 58K Q&A pairs, and datasets for …

    Python · 1 star