Blog post on sharing research datasets on the Hub #1614

Merged
davanstrien merged 27 commits into main from researchers-datasets-sharing-hub on Oct 30, 2023

Conversation

davanstrien
Member

@davanstrien davanstrien commented Oct 26, 2023

Initial blog post on sharing research datasets on the Hub. Possibly to be followed with more domain-specific examples.

@davanstrien davanstrien changed the title from "[WIP] Blog post on sharing research datasets on the Hub" to "Blog post on sharing research datasets on the Hub" on Oct 26, 2023
@davanstrien davanstrien marked this pull request as ready for review October 26, 2023 13:23
Member

@lhoestq lhoestq left a comment

Awesome! It's clear and straight to the point, love it :)

_blog.yml Outdated

- local: researcher-dataset-sharing
  title: "Empowering Open Source Machine Learning through Dataset Sharing on the Hugging Face Hub"
Member Author

Make this a bit more concise: "Are you creating open machine learning datasets? Share them on the Hub."

Member

@pcuenca pcuenca left a comment

Great work!

Collaborator

@severo severo left a comment

It's excellent and will be super useful when trying to get people to publish their datasets to the Hub.


You can learn more about how you can use this tool in this [blog post](https://huggingface.co/blog/scalable-data-inspection).

#### Lilac
Collaborator

Also Nomic, maybe?

@davanstrien
Member Author

I will wait to merge this until Monday. @lhoestq @severo, I can maybe remove the reference to the loading scripts for now?

@severo
Collaborator

severo commented Oct 27, 2023

> remove the reference to the loading scripts

+1

@davanstrien davanstrien removed the request for review from yjernite October 30, 2023 10:38
@davanstrien davanstrien merged commit 39b3c64 into main Oct 30, 2023
1 check passed
@davanstrien davanstrien deleted the researchers-datasets-sharing-hub branch October 30, 2023 10:50
4 participants