Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Classes that sha1() should have methods for in digest #23

Open
ThierryO opened this issue Jan 4, 2016 · 19 comments
Open

Classes that sha1() should have methods for in digest #23

ThierryO opened this issue Jan 4, 2016 · 19 comments

Comments

@ThierryO
Copy link
Contributor

ThierryO commented Jan 4, 2016

  • Date
  • complex
  • array
eddelbuettel added a commit that referenced this issue May 1, 2016
@vsimko
Copy link
Contributor

vsimko commented May 1, 2016

Already implement types:

  • complex converted to a 2-column matrix (Re, Im)
  • Date converted to an integer (positive/negative) representing number of seconds since 1970

Any ideas for the array type ?

@eddelbuettel
Copy link
Owner

Same as for the other two? There is as.array() though I am not sure what one usually converts from...

@vsimko
Copy link
Contributor

vsimko commented May 1, 2016

array is a multi-dimensional matrix. One of the problems is to produce different hashes for differently shaped arrays that contain the same data:

# the same content for differenly shaped arrays
content <- 1:8

# creates a cube 2x2x2
a <- array(content, dim = c(2,2,2))
b <- array(content, dim = c(2,4,1))

# structures to be hashed
a_tohash <- list(
    dim = dim(a),
    data = as.numeric(a)
)

b_tohash <- list(
    dim = dim(b),
    data = as.numeric(b)
)

# should produce different hashes
sha1(a_tohash)
sha1(b_tohash)
> sha1(a_tohash)
[1] "96dbde57a89f37cf8fbbf9e706047a3fc822e705"
> sha1(b_tohash)
[1] "d51b50b0a6414a453e909a5d74686156edb099ef"

@vsimko
Copy link
Contributor

vsimko commented May 1, 2016

any other problems that need to be addressed before I create a pull request ?

@eddelbuettel
Copy link
Owner

Ah, yes, of course.

Well a 1:8 sequence with dim(2,2,2) is different from dim(2,4,1) so I think this is good enough.

@eddelbuettel
Copy link
Owner

Unit tests maybe (in the super-simple format we use here) ?

vsimko added a commit to vsimko/digest that referenced this issue May 1, 2016
vsimko added a commit to vsimko/digest that referenced this issue May 1, 2016
@vsimko
Copy link
Contributor

vsimko commented May 1, 2016

Note: the current implementation assumes that: sha1(2+0i) != sha1(2) (conservative approach, but maybe mathematically incorrect)

@vsimko
Copy link
Contributor

vsimko commented May 1, 2016

I've just found a related section in the vignette titled "Creating a sha1 method for other classes": https://github.com/eddelbuettel/digest/blob/master/vignettes/sha1.Rmd#creating-a-sha1-method-for-other-classes

Useful for new contributors.

@vsimko
Copy link
Contributor

vsimko commented May 2, 2016

Any types that should also be implemented ?

@ThierryO
Copy link
Contributor Author

ThierryO commented May 2, 2016

I've started this issue so everyone could add other classes to the wishlist. Keep in mind that @eddelbuettel wants to keep the list of dependencies as minimal as possible. That limits the classes to those available in base R.

@eddelbuettel
Copy link
Owner

Yes, I think anything besides base R is likely to be out of scope. Unless we all agree that it is worth it.

But minimal dependencies are a Good Thing (TM) for a widely-used utility package such as this.

@vsimko
Copy link
Contributor

vsimko commented May 10, 2016

what about zoo class for time series ?

@eddelbuettel
Copy link
Owner

I love zoo and and xts -- but this would get a little out of hand.

We cannot possibly follow up to every other S3 class on CRAN. Base types is good.

@ThierryO
Copy link
Contributor Author

#41 add sha1 methods for the pairlist and name classes. It also adds documentation for the sha1 methods on array, complex and Date

@ThierryO
Copy link
Contributor Author

#54 adds a sha1 method for the "function" class. e.g. sha1(sum)

@eddelbuettel
Copy link
Owner

Thanks for that. Looks like a very clean and straightforward PR. Will merge once Travis completes.

@ThierryO
Copy link
Contributor Author

The current implementation of sha1.function(x) takes environment(x) into account. This can lead to differences in sha1 hash when storing and retrieve the function in an S4 object. I am experiencing that problem now.
A solution would be to update sha1.function(x) so that it ignores the environment. Another option is to use an extra argument which allows the user to should whether the environment should be taken into account. In that case sha1() should gain a ... argument.
I'll make a PR after feedback from @eddelbuettel

@eddelbuettel
Copy link
Owner

From the top of my head I like the extra argument idea better.

@billdenney
Copy link
Contributor

Another pair of classes that should have a method: formula and environment?

Assuming that formula gets a method, I think that ideally there would be an option to ignore the attached environment (equivalent to sha1.function()).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants