data cleaning issues for team dataset #27

Open
mkoo opened this issue Mar 23, 2022 · 0 comments


mkoo commented Mar 23, 2022

When viewing the zipped CSV, you can see these data issues:

  • encoding issues in several geography fields such as locality and state/Province (e.g. Gu��_rico 1)
  • data entry errors in the state/Province field (e.g. A)
  • null values for Order and Family, which could be filled in from the genus and species values
  • open question: how should the species fields be handled for eDNA samples?
  • gaps in the data when basisOfRecord is PreservedSpecimen (no museum GUID supplied in institution_cde, collection_cde, catalog_num)
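As a starting point for discussion, here is a minimal sketch of how the checks above could be flagged row by row in Python. The column names (`locality`, `stateProvince`, `order`, `family`) are assumptions based on the fields named in this issue, not the confirmed CSV header, and `audit_row` / `audit_csv` are hypothetical helper names. It only reports problems; it does not fix them and does not cover the eDNA question.

```python
import csv
import io

# Column names below are assumptions based on the fields named in this issue;
# adjust them to match the actual CSV header.
GEO_FIELDS = ["locality", "stateProvince"]
TAXON_FIELDS = ["order", "family"]
GUID_FIELDS = ["institution_cde", "collection_cde", "catalog_num"]


def audit_row(row):
    """Return a list of issue labels for one CSV row (a dict of strings)."""
    issues = []
    # U+FFFD is the replacement character left behind by a bad decode.
    if any("\ufffd" in (row.get(f) or "") for f in GEO_FIELDS):
        issues.append("encoding")
    # A one-character state/province is treated as a likely data entry error.
    state = (row.get("stateProvince") or "").strip()
    if len(state) == 1:
        issues.append("state_entry_error")
    # Blank order or family.
    if any(not (row.get(f) or "").strip() for f in TAXON_FIELDS):
        issues.append("missing_taxonomy")
    # Preserved specimens should carry all three museum GUID parts.
    if row.get("basisOfRecord") == "PreservedSpecimen" and any(
        not (row.get(f) or "").strip() for f in GUID_FIELDS
    ):
        issues.append("missing_specimen_guid")
    return issues


def audit_csv(text):
    """Map 1-based data row numbers to their issues, skipping clean rows."""
    reader = csv.DictReader(io.StringIO(text))
    report = {}
    for i, row in enumerate(reader, start=1):
        issues = audit_row(row)
        if issues:
            report[i] = issues
    return report
```

A report like this could be reviewed before deciding which fixes belong in the Python step and which belong in the templates.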

I think much of this can be cleaned up in Python before posting to the API, but should any of it be fixed in the templates (i.e. by the user)?
Discussion topic.
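To make the Python-side option concrete, here is one possible sketch for the null Order/Family case. It assumes the rows are already loaded as dicts, that the columns are named `genus`, `order`, and `family` (an assumption, not the confirmed header), and that other rows in the same dataset carry the correct order and family for a given genus. `build_genus_lookup` and `fill_taxonomy` are hypothetical names. A lookup against an external taxonomic authority may well be the better source; this only reuses what is already in the file.

```python
def build_genus_lookup(rows):
    """Collect (order, family) per genus from rows where both are present."""
    lookup = {}
    for row in rows:
        genus = (row.get("genus") or "").strip()
        order = (row.get("order") or "").strip()
        family = (row.get("family") or "").strip()
        if genus and order and family:
            # Keep the first pair seen for each genus.
            lookup.setdefault(genus, (order, family))
    return lookup


def fill_taxonomy(rows):
    """Return copies of rows with blank order/family filled from the lookup."""
    lookup = build_genus_lookup(rows)
    filled = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        genus = (new_row.get("genus") or "").strip()
        if genus in lookup:
            order, family = lookup[genus]
            if not (new_row.get("order") or "").strip():
                new_row["order"] = order
            if not (new_row.get("family") or "").strip():
                new_row["family"] = family
        filled.append(new_row)
    return filled
```

Rows whose genus never appears with a complete order and family are left blank, so they would still need a manual or template-side fix.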
