Unsupervised learning for protein sequences #190

Open
2 tasks done
agitter opened this issue May 7, 2019 · 0 comments


agitter commented May 7, 2019

Have you checked the list of proposed tips to see if the tip has already been proposed?

  • Yes

Did you add yourself as a contributor by making a pull request if this is your first contribution?

  • Yes, I added myself or am already a contributor

There has been a fair amount of discussion on Twitter over the past few days about how to properly evaluate deep learning models that learn representations of protein sequences. That discussion may provide good examples of how to evaluate models. For reference:

I haven't looked at these papers in particular, but the discussion reminds me of related ones in biochemistry, such as https://doi.org/10.1021/acs.jcim.7b00403. In that domain, evaluation can be misleading when dataset splits do not account for chemical similarity, because test examples that closely resemble training examples make a model look better than it is.
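To make the splitting pitfall concrete for protein sequences, here is a minimal sketch of a similarity-aware split. It is an illustration under stated assumptions, not the method used in any of the papers under discussion: the function names (`kmer_set`, `greedy_clusters`, `cluster_split`), the 3-mer Jaccard similarity, and the 0.5 threshold are all hypothetical choices. In practice one would more likely cluster with a dedicated sequence-identity tool. The idea is only that similar sequences are grouped first and whole groups are then assigned to either train or test.

```python
import random


def kmer_set(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_clusters(seqs, threshold=0.5, k=3):
    """Assign each sequence to the first cluster whose representative is
    at least `threshold` similar; otherwise start a new cluster.
    The threshold is an illustrative assumption, not a recommended value."""
    reps = []    # k-mer sets of the cluster representatives
    labels = []  # cluster index for each input sequence
    for seq in seqs:
        kmers = kmer_set(seq, k)
        for idx, rep in enumerate(reps):
            if jaccard(kmers, rep) >= threshold:
                labels.append(idx)
                break
        else:
            reps.append(kmers)
            labels.append(len(reps) - 1)
    return labels


def cluster_split(seqs, test_fraction=0.2, threshold=0.5, seed=0):
    """Split indices so that no cluster spans both train and test."""
    labels = greedy_clusters(seqs, threshold)
    cluster_ids = sorted(set(labels))
    random.Random(seed).shuffle(cluster_ids)

    # Move whole clusters into the test set until the target size is reached.
    target = test_fraction * len(seqs)
    test_clusters, n_test = set(), 0
    for cid in cluster_ids:
        if n_test >= target:
            break
        test_clusters.add(cid)
        n_test += labels.count(cid)

    train = [i for i, c in enumerate(labels) if c not in test_clusters]
    test = [i for i, c in enumerate(labels) if c in test_clusters]
    return train, test, labels


# Toy sequences (made up for illustration): two near-duplicate pairs and one outlier.
seqs = [
    "MKTAYIAKQRQISFVKSHFSRQ",
    "MKTAYIAKQRQISFVKSHFSRA",
    "GSHMLEDPVAGWERTYIPLKNC",
    "GSHMLEDPVAGWERTYIPLKNA",
    "AAAAAAAAAAWWWWWWWWWW",
]
train_idx, test_idx, labels = cluster_split(seqs, test_fraction=0.4)
```

A purely random split of the same toy data could place one member of a near-duplicate pair in train and the other in test, which is the kind of leakage the linked paper warns about for chemical similarity. Because whole clusters are moved, the realized test fraction is only approximate.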
